Lab #2: Translationary
Learn to read and parse a .txt file, character-by-character.
Goals
- Use a java.util.Scanner to read and parse input
- Implement the labs.lab2.Translationary interface
Needs
- translationaryWords.txt
- Translationary.jar - UPDATED 8/31/13
Background
**NOTE: BACKGROUND UPDATE ADDED HERE! READ UPDATE FIRST! (8/31/13)**
Reading input in Java is easily done by using a java.util.Scanner. Scanner has a few constructors, but the one that we will be using in this lab is
public Scanner(File source,
String charsetName)
throws FileNotFoundException
The charsetName string defines the character set that we want the scanner to use when converting the bytes from the file into characters. For our purposes of getting ready for the first assignment, we will be using UTF-8. The java.io.File can be created within the constructor of the scanner by pointing to the location/pathname of the downloaded translationaryWords.txt file on your machine (e.g. "C:\\Users\\Activation\\Desktop\\translationaryWords.txt"). Note the double '\': since the backslash is the escape character in Java string literals, we need to escape the escape character so the path points to the correct place on my hard drive.
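As a minimal sketch of the constructor described above (the class name ScannerOpenSketch and the helper open are made up for illustration, and the path is just the example location from the text), opening the file might look like this:

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class ScannerOpenSketch {
    // Hypothetical helper: opens a Scanner that decodes the file's bytes as UTF-8.
    static Scanner open(String pathname) throws FileNotFoundException {
        return new Scanner(new File(pathname), "UTF-8");
    }

    public static void main(String[] args) {
        // Example location only; replace it with wherever you saved the file.
        // Each backslash is doubled because '\' is the escape character.
        String path = "C:\\Users\\Activation\\Desktop\\translationaryWords.txt";
        try (Scanner scanner = open(path)) {
            System.out.println("Has a first line: " + scanner.hasNextLine());
        } catch (FileNotFoundException e) {
            System.out.println("The specified file does not exist.");
        }
    }
}
```

The try-with-resources form closes the scanner (and the file underneath it) automatically; it assumes Java 7 or later.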
A scanner has a bunch of methods that it uses to scan/read the file that was provided. Some methods are really cool and use delimiters and regexes. Unfortunately, we are NOT using those methods for this lab (we'll learn them at a later date). Even if you know how to use them, Professor Ackley wants us to learn how to parse strings character-by-character for Assignment 1. To do this, we need to get a String from translationaryWords.txt. The methods to look at within the Scanner class are hasNextLine() and nextLine(). The former returns a boolean: true if the scanner has another line it can return as a String, or false if it has reached the end of the file. NOTE: unlike C, where scanf signals the end of input by returning the special EOF value, java.util.Scanner.hasNextLine() (and the more generic java.util.Scanner.hasNext()) returns true iff there is another line (or token). In other words, it returns false once the end of the file has been reached.
The Javadoc for public String nextLine() reads: "Advances this scanner past the current line and returns the input that was skipped. This method returns the rest of the current line, excluding any line separator at the end. The position is set to the beginning of the next line.
Since this method continues to search through the input looking for a line separator, it may buffer all of the input searching for the line to skip if no line separators are present."
Once you have a String from the file, make sure to parse it character-by-character. Check out String.charAt().
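To make the hasNextLine() / nextLine() / charAt() combination concrete, here is a small sketch. The class and method names are invented for illustration, and a Scanner over an in-memory String stands in for the file so the example runs anywhere:

```java
import java.util.Scanner;

class LineByLineSketch {
    // Visits every character of every remaining line with String.charAt()
    // and returns how many characters were seen (line separators excluded).
    static int countCharacters(Scanner scanner) {
        int count = 0;
        while (scanner.hasNextLine()) {        // false once the input is exhausted
            String line = scanner.nextLine();  // the line separator is not included
            for (int i = 0; i < line.length(); i++) {
                char c = line.charAt(i);
                System.out.println("char: " + c);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in input; in the lab this Scanner would wrap the .txt file.
        Scanner scanner = new Scanner("ab\ncd");
        System.out.println("total: " + countCharacters(scanner));
    }
}
```

In your own code, the body of the inner loop is where the per-character parsing decisions would go.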
//TODO
Add the Translationary.jar file to the build path of the Java project you are using for labs. The .jar file contains exactly one labs.lab2.Translationary interface, which contains exactly two method declarations:
void readAndParseFile(String fileURL) throws FileNotFoundException;
String translate(String englishWord);
Your job is to write a TranslationaryImpl.java file which implements the labs.lab2.Translationary interface. Check out the Javadoc for the labs.lab2.Translationary interface so you are sure to abide by the contract you're entering into with the interface. Your implementation will have a main method which does the following (you are allowed to copy and paste):
Translationary translationary = new TranslationaryImpl();
try {
    translationary.readAndParseFile("INSERT_LOCATION_OF_YOUR_FILE");
} catch (FileNotFoundException e) {
    System.out.println("The specified file does not exist.");
}
// print all provided translations
for (int i = 0; i < args.length; i++) {
    translationary.translate(args[i]);
}
We will be passing in arguments using the main method's String[] args. This is easily done through Eclipse under "Run" -> "Run Configurations..." -> "Arguments" tab -> "Program arguments:". Here you can enter any number of Strings, separated by spaces, and they will fill the String[] args used by the main method in order.
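If you want to check what actually arrives in String[] args before wiring up your implementation, a throwaway program along these lines may help. The class name ArgsSketch and the helper describe are made up for this illustration:

```java
class ArgsSketch {
    // Lists each argument next to its index, one per line.
    static String describe(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < args.length; i++) {
            sb.append("args[").append(i).append("] = ").append(args[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // With "hey boy girl" as the program arguments, args has three entries.
        System.out.print(describe(args));
    }
}
```

Running it with the program arguments "hey boy" should list args[0] as hey and args[1] as boy.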
To get started, look at the "translationaryWords.txt" file and pick some English words found in there to "translate". I will test your code on the following input:
hey boy girl what the happy computer screen mom dad
backyard sky airplane duh doorknob google door
implementation occupy requirements inbox
calendar broken interface
usage cellular scanner cs science hey university of new mexico
Notice that I will be testing words that are not in the Translationary, so make sure to protect against that.
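One possible way to guard against unknown words is sketched below. It assumes the parsed entries end up in a java.util.Map, which is only one storage choice. The class MissingWordSketch and its methods are hypothetical and are not part of the Translationary interface, whose Javadoc remains the authority on what translate should return.

```java
import java.util.HashMap;
import java.util.Map;

class MissingWordSketch {
    // Assumed storage: English word -> translation.
    private final Map<String, String> entries = new HashMap<String, String>();

    void put(String english, String translation) {
        entries.put(english, translation);
    }

    // Builds the line to print for one word, handling words that were never stored.
    String describe(String englishWord) {
        String translation = entries.get(englishWord); // null when the word is absent
        if (translation == null) {
            return "\"" + englishWord + "\" doesn't exist in the Translationary. Try again.";
        }
        return englishWord + " - Translation: " + translation;
    }

    public static void main(String[] args) {
        MissingWordSketch sketch = new MissingWordSketch();
        sketch.put("hey", "eyhay");
        System.out.println(sketch.describe("hey"));
        System.out.println(sketch.describe("duh"));
    }
}
```

The key point is the null check: Map.get returns null for a missing key, so the lookup result has to be tested before it is used.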
Here is a sample of correct output from the main method:
hey - Translation: eyhay
boy - Translation: oybay
girl - Translation: irlgay
what - Translation: atwhay
the - Translation: ethay
happy - Translation: appyhay
computer - Translation: omputercay
screen - Translation: eenscray
mom - Translation: ommay
dad - Translation: adday
backyard - Translation: ackyardbay
sky - Translation: yskay
airplane - Translation: airplaneway
"duh" doesn't exist in the Translationary. Try again.
doorknob - Translation: oorknobday
google - Translation: ooglegay
door - Translation: oorday
implementation - Translation: implementationway
occupy - Translation: occupyway
requirements - Translation: equirementsray
inbox - Translation: inboxway
calendar - Translation: alendarcay
"broken" doesn't exist in the Translationary. Try again.
interface - Translation: interfaceway
usage - Translation: usageway
cellular - Translation: ellularcay
scanner - Translation: annerscay
cs - Translation: csay
science - Translation: iencescay
hey - Translation: eyhay
university - Translation: universityway
of - Translation: ofway
new - Translation: ewnay
mexico - Translation: exicomay
BACKGROUND UPDATE (8/31/13)
I gave background information on how to complete this lab assignment in a way that differs from what's expected of you in the SiteRiter spec. The spec states that "((C.1.2)) For maximum reusability, the input to lexical analysis is a java.io.Reader, nothing more specific than that."
I was mistaken to lead you towards using a java.util.Scanner object to traverse character-by-character. The Scanner object is designed for more specific parsing using delimiters and regular expressions, which we will not be using in the SiteRiter assignment.
Reading input for this lab is easily done by using a java.io.Reader. Reader is an abstract class and has a few direct subclasses. One that you might be specifically interested in for this assignment is java.io.InputStreamReader. InputStreamReader has a few constructors, but the one that we will be using in this lab is
public InputStreamReader(InputStream in,
String charsetName)
The charsetName string defines the character set that we want the reader to use when converting the bytes from the file into characters. For our purposes of getting ready for the first assignment, we will be using "UTF-8".
java.io.InputStream is also an abstract class which has some subclasses. Check out java.io.FileInputStream. One of its constructors takes a java.io.File object, which I explain how to create above.
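Putting those pieces together, the chain File -> FileInputStream -> InputStreamReader could be built roughly as follows. This is a sketch: the class name ReaderOpenSketch and the helper open are invented here, and the path is only the example location used earlier.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

class ReaderOpenSketch {
    // Hypothetical helper: wraps the file in a Reader that decodes its bytes as UTF-8.
    static Reader open(String pathname) throws IOException {
        return new InputStreamReader(new FileInputStream(new File(pathname)), "UTF-8");
    }

    public static void main(String[] args) {
        // Example location only; replace it with wherever you saved the file.
        String path = "C:\\Users\\Activation\\Desktop\\translationaryWords.txt";
        try (Reader reader = open(path)) {
            System.out.println("First character code: " + reader.read());
        } catch (FileNotFoundException e) {
            System.out.println("The specified file does not exist.");
        } catch (IOException e) {
            System.out.println("Could not read the file: " + e.getMessage());
        }
    }
}
```

Returning the value as a plain Reader, rather than as an InputStreamReader, keeps the rest of the code independent of where the characters come from.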
Once you have your newly instantiated Reader, you can traverse the content of the reader's input stream by using the Reader.read() method. It reads a single character from the underlying input stream and returns it as an int (cast it to char to use it as a character), or -1 if the end of the stream has been reached.
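The read-until-minus-one loop can be sketched like this. The class name ReadLoopSketch and the method readAll are made up for illustration, and a java.io.StringReader stands in for the file-backed reader so the example is self-contained:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadLoopSketch {
    // Pulls characters out of any Reader one at a time until read() reports -1.
    static String readAll(Reader reader) throws IOException {
        StringBuilder seen = new StringBuilder();
        int next;                               // int, so that -1 can signal end of stream
        while ((next = reader.read()) != -1) {
            char c = (char) next;               // safe to cast once we know it is not -1
            seen.append(c);                     // per-character parsing would go here
        }
        return seen.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in input; in the lab this would be the Reader over the .txt file.
        Reader reader = new StringReader("hey\nboy");
        System.out.println(readAll(reader).length()); // prints 7
    }
}
```

Because readAll accepts any Reader, the same loop works unchanged whether the characters come from a file, a String, or some other source.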
Everything under the //TODO section should be followed exactly the same way.
Turn in a single "TranslationaryImpl.java" file using the online interface by 11:59:59 PM of your lab day.