CS351 Spring 2004 Lab 2

Testing and Empirical Validation

Performance of a learning algorithm

  1. Collect a large set of examples
  2. Divide it into two disjoint sets: the training set and the test set
  3. Use the training set to train the algorithm
  4. Use the algorithm to classify the test set
  5. The performance is the percentage of items in the test set that the algorithm classifies correctly
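
The steps above can be sketched in Java roughly as follows. This is only an illustration, not part of the lab: the Example and MajorityClassifier classes, the 80/20 split, and the synthetic data are all assumptions made up for the sketch, and the "learner" simply predicts the most common training label. It also assumes the test set is non-empty.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class HoldoutEvaluation {
    /** A labeled example (hypothetical): one numeric feature and a boolean class label. */
    static class Example {
        final double feature;
        final boolean label;
        Example(double feature, boolean label) { this.feature = feature; this.label = label; }
    }

    /** Hypothetical learner: always predicts the label that was most common in training. */
    static class MajorityClassifier {
        private boolean majority;
        void train(List<Example> training) {
            int positives = 0;
            for (Example e : training) if (e.label) positives++;
            majority = 2 * positives >= training.size();
        }
        boolean classify(Example e) { return majority; }
    }

    /** Steps 2-5: shuffle, split into disjoint sets, train, classify, return % correct on the test set. */
    static double percentCorrect(List<Example> examples, double trainFraction, long seed) {
        List<Example> shuffled = new ArrayList<>(examples);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) (shuffled.size() * trainFraction);
        List<Example> training = shuffled.subList(0, cut);
        List<Example> test = shuffled.subList(cut, shuffled.size());

        MajorityClassifier classifier = new MajorityClassifier();
        classifier.train(training);

        int correct = 0;
        for (Example e : test) if (classifier.classify(e) == e.label) correct++;
        return 100.0 * correct / test.size();
    }

    public static void main(String[] args) {
        // Step 1: collect examples (synthetic here; label is true when feature > 0.3).
        List<Example> examples = new ArrayList<>();
        Random r = new Random(42);
        for (int i = 0; i < 1000; i++) {
            double x = r.nextDouble();
            examples.add(new Example(x, x > 0.3));
        }
        System.out.println(percentCorrect(examples, 0.8, 1L) + "% correct on the test set");
    }
}
```

Calling percentCorrect with increasing training fractions and plotting the results would give the points of a learning curve.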

A useful way of visualizing the data is to plot the size of the training set against the % correct on the test set. This graph is called the learning curve of the algorithm. Example:

Fig 18.9, Artificial Intelligence: A Modern Approach, Stuart Russell and Peter Norvig, used with permission

More than one way of being wrong

The Confusion Matrix
True Positive
False Positive
True Negative
False Negative
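
As a rough sketch of how the four outcomes are tallied for a two-class problem, the class below counts each cell from (predicted, actual) pairs. The class name, the record method, and the sample data are illustrative assumptions, not part of the lab.

```java
public class ConfusionMatrix {
    int truePositives, falsePositives, trueNegatives, falseNegatives;

    /** Record one test item: what the classifier predicted versus the actual label. */
    void record(boolean predicted, boolean actual) {
        if (predicted && actual) truePositives++;          // said yes, was yes
        else if (predicted && !actual) falsePositives++;   // said yes, was no
        else if (!predicted && !actual) trueNegatives++;   // said no, was no
        else falseNegatives++;                             // said no, was yes
    }

    /** Fraction of all recorded items that were classified correctly. */
    double accuracy() {
        int total = truePositives + falsePositives + trueNegatives + falseNegatives;
        return (double) (truePositives + trueNegatives) / total;
    }

    public static void main(String[] args) {
        // Made-up predictions and labels, for illustration only.
        boolean[] predicted = { true, true,  false, false, true };
        boolean[] actual    = { true, false, false, true,  true };
        ConfusionMatrix m = new ConfusionMatrix();
        for (int i = 0; i < predicted.length; i++) m.record(predicted[i], actual[i]);
        System.out.println("TP=" + m.truePositives + " FP=" + m.falsePositives
                + " TN=" + m.trueNegatives + " FN=" + m.falseNegatives);
        System.out.println("accuracy=" + m.accuracy());
    }
}
```

Two classifiers with the same accuracy can have very different false positive and false negative counts, which is why the matrix is worth reporting alongside the single percentage.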



Testing (Correctness)

Unit Testing: Testing the individual components of a software product in isolation
Blackbox Testing: Testing the entire product as a unit, through its external interface only, without reference to its internals

Test Data:
Exhaustive
Edge Case
Code Coverage
Random Cases
Use-oriented data
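
A small sketch of what edge-case and random test data might look like for a unit test. The unit under test (indexOfMax) and the check helper are hypothetical, chosen only to keep the example short; plain exceptions are used instead of a test framework so the file runs on its own.

```java
import java.util.Random;

public class MaxTest {
    /** Hypothetical unit under test: index of the largest element, or -1 if the array is empty. */
    static int indexOfMax(int[] a) {
        if (a.length == 0) return -1;
        int best = 0;
        for (int i = 1; i < a.length; i++) if (a[i] > a[best]) best = i;
        return best;
    }

    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    /** Edge cases: empty input, a single element, ties, extreme values. */
    static void edgeCases() {
        check(indexOfMax(new int[] {}) == -1, "empty array");
        check(indexOfMax(new int[] { 7 }) == 0, "single element");
        check(indexOfMax(new int[] { 3, 3, 3 }) == 0, "ties return the first index");
        check(indexOfMax(new int[] { Integer.MIN_VALUE, Integer.MAX_VALUE }) == 1, "extreme values");
    }

    /** Random cases: the chosen element must be at least as large as every element. */
    static void randomCases(int trials, long seed) {
        Random r = new Random(seed);
        for (int t = 0; t < trials; t++) {
            int[] a = new int[1 + r.nextInt(20)];
            for (int i = 0; i < a.length; i++) a[i] = r.nextInt();
            int m = indexOfMax(a);
            for (int x : a) check(a[m] >= x, "result is not the maximum");
        }
    }

    public static void main(String[] args) {
        edgeCases();
        randomCases(1000, 351L);
        System.out.println("all tests passed");
    }
}
```

Seeding the random generator keeps a failing random case reproducible.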


Testing (Performance)

Your MondoHashTable has certain quantitative performance requirements:
CS351 Project 1 Section 4

Amortized analysis: the time required to perform a sequence of data-structure operations averaged over all the operations performed.
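
To make the idea concrete, here is a small sketch that counts the element copies made while appending to an array that doubles its capacity whenever it fills up. The doubling policy and the class below are assumptions chosen for illustration, not a statement of how MondoHashTable must resize.

```java
public class AmortizedAppend {
    /** Total element copies made by n appends to an array that doubles when full. */
    static long copiesForAppends(int n) {
        int capacity = 1;
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;   // a resize copies every existing element
                capacity *= 2;
            }
            size++;               // the append itself
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 1000000; n *= 10) {
            long copies = copiesForAppends(n);
            System.out.println(n + " appends: " + copies + " copies, "
                    + (double) copies / n + " copies per append");
        }
    }
}
```

An individual append occasionally copies the whole array, but the total number of copies stays below 2n, so the cost averaged over the sequence is constant per append.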

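Checking memory usage of an object in Java:
Runtime rt = Runtime.getRuntime();
System.runFinalization();
System.gc();
// Heap in use = total heap minus the free part of it
long before = rt.totalMemory() - rt.freeMemory();

//Instantiate object to be measured

System.runFinalization();
System.gc();
// Approximate number of bytes taken up by the new object
long used = (rt.totalMemory() - rt.freeMemory()) - before;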

Note that this method is rather crude (and useless in a multithreaded program). It can be used to get approximate memory amounts.

Timing an operation in Java:
long t = System.currentTimeMillis();
// Do operation
t = System.currentTimeMillis() - t;
Note that this method does not have very good resolution, and the operation may have to be repeated a number of times in order to get usable data.
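
One way to handle the repetition is to time a whole batch of runs and divide by the count. The sketch below assumes a helper named averageMillis and a made-up workload (summing square roots); both are illustrative only, and the warm-up pass is an extra precaution rather than something the lab requires.

```java
public class RepeatTimer {
    /** Average wall-clock milliseconds per run of op over the given number of repetitions. */
    static double averageMillis(Runnable op, int repetitions) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < repetitions; i++) {
            op.run();
        }
        long elapsed = System.currentTimeMillis() - start;
        return (double) elapsed / repetitions;
    }

    public static void main(String[] args) {
        // Storing the result keeps the work from being optimized away entirely.
        final double[] sink = new double[1];
        Runnable op = new Runnable() {
            public void run() {
                double sum = 0;
                for (int i = 1; i <= 10000; i++) sum += Math.sqrt(i);
                sink[0] = sum;
            }
        };
        averageMillis(op, 1000);  // warm-up pass, result discarded
        System.out.println(averageMillis(op, 10000) + " ms per operation");
    }
}
```

Choose the repetition count so that the total elapsed time is well above the clock's resolution; a total of a few milliseconds is too short to trust.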







Exercise

The operator + on the class String concatenates two strings. It does so in a relatively inefficient manner. Your task is to show this. More specifically, devise a simple test program that compares the speed of String + with that of the method StringBuffer.append.

Don't get too elaborate. It's generally accepted in the Java community that StringBuffer is substantially faster; your task is to provide some simple evidence to show this.
This shouldn't take much work. Hand in your results by:
Monday Lab:          Wednesday at 5pm
Wednesday Lab:      Friday at 5pm