Project Information
Team Captain: Ryan Peng
Team Size: 1
Grade Category: 10-12
Subject Area: Engineering/Computer Science
Project Type: Descriptive III
Language: English
Source of Project Idea:
This program was developed to bridge the gap between a text document
written in a language unknown to the user and an instant translator. To
use an instant translator, one must know both the source and target
languages. The source language is the language the original document is
written in (before translation); the target language is the language the
user wants the document to be in after translation. Sometimes the user
does not know the source language, and this program is designed to
determine it. The project was inspired by the nuisance of guessing the
source language of a document written in an unknown language.
Project Summary:
The purpose of this innovation project is to develop a program that
determines the language a piece of text is written in with high accuracy
and speed. The program should take in text, scan through it, and output
the language it believes the text is written in. At the onset of this
project, I hypothesized that it was possible to create such a program
and to make it both quick and accurate. I expected that the program
would take no more than 120 seconds to scan a document and would be at
least 95% accurate in determining its language. To evaluate the Language
Recognition Program’s performance, several hundred excerpts of text,
written in several different languages, were scanned with the program.
The results were recorded and compared against the true languages of the
excerpts. Overall, the Language Recognition Program achieved an accuracy
of 99.5%. The excerpts were chosen from different styles of writing to
test whether the program could handle a variety of topics, and it
handled them well. In conclusion, the Language Recognition Program is
accurate and fast. Two primary methods of language determination, word
recognition and special character recognition, were integrated into a
single algorithm that determines the language of a document efficiently.
The program is a reliable tool that can help people translate and
understand a document written in an unknown language.
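The combination of word recognition and special character recognition
described above can be illustrated with a minimal Python sketch. This is
not the project's actual code: the word lists, the character sets, the
CHAR_WEIGHT value, and the function names score_languages and
detect_language are all assumptions made for illustration, and a real
tool would need far larger tables and more languages.

```python
import re

# Hypothetical, very small word lists (assumption, not the project's data).
COMMON_WORDS = {
    "English": {"the", "and", "is", "of", "to", "in"},
    "French": {"le", "la", "et", "est", "les", "des"},
    "German": {"der", "die", "und", "ist", "das", "nicht"},
    "Spanish": {"el", "la", "y", "es", "los", "que"},
}

# Hypothetical hint characters per language (illustrative only; these
# sets are neither exhaustive nor strictly exclusive to one language).
SPECIAL_CHARS = {
    "English": set(),
    "French": set("çèêëœ"),
    "German": set("äöüß"),
    "Spanish": set("ñ¿¡"),
}

# Assumed weight: one special character counts as much as two known words.
CHAR_WEIGHT = 2


def score_languages(text):
    """Return a score per language from word hits and special-character hits."""
    lowered = text.lower()
    # Runs of letters only (Unicode-aware), so "groß" stays one word.
    words = re.findall(r"[^\W\d_]+", lowered)
    scores = {}
    for language in COMMON_WORDS:
        word_hits = sum(1 for w in words if w in COMMON_WORDS[language])
        char_hits = sum(1 for ch in lowered if ch in SPECIAL_CHARS[language])
        scores[language] = word_hits + CHAR_WEIGHT * char_hits
    return scores


def detect_language(text):
    """Return the highest-scoring language, or None if nothing matched."""
    scores = score_languages(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(detect_language("Der Hund ist nicht groß und die Katze schläft."))
```

Summing the two kinds of evidence into one score is only one possible way
to integrate the methods. Short excerpts with no listed words or
characters return None here instead of a guess.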
Software Tools Used:
Hardware Tools Used:
Special Skills Used:
Awards:
- "Best (1st Place) Intermediate Engineering/Computer Science Project"
  at the Saskatoon Regional Science Fair 2008
- "Winner's Showcase (Top 6 Projects of the Whole Fair)" at the
  Saskatoon Regional Science Fair 2008
Past Virtual Science Fair Websites: