Chernoff Park Plots
In 1973, Herman Chernoff published "The Use of Faces to
Represent Points in k-Dimensional Space Graphically". Humans
can rapidly recognize and characterize faces, which makes faces a
convenient way to spot patterns in data. Using the familiar
South Park characters as templates makes the data easier to describe
(e.g. there are a lot of pissed-off looking Kyles in the upper
left quadrant) and makes the faces easier to calibrate, because
viewers already know the standard appearance of the characters.
In this particular implementation of Chernoff Face plots,
a single face represents 8 variables. The features are:
- head eccentricity
- inter-eye distance
- eye size
- eyebrow angle
- pupil size
- mouth curvature
- mouth openness
- mouth width
Each feature can vary over a range of approximately 8 states,
or 3 bits of data, so a single face carries roughly 8*3=24 bits.
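As a rough illustration of how a data value might be reduced to one of about 8 states per feature, here is a minimal Python sketch. The names quantize and face_states, the linear binning, and the per-variable (lo, hi) ranges are all assumptions made for this example; they are not taken from the actual implementation, which may map data to features differently.

```python
# Feature names as listed above; the ordering is an arbitrary choice for this sketch.
FEATURES = [
    "head eccentricity",
    "inter-eye distance",
    "eye size",
    "eyebrow angle",
    "pupil size",
    "mouth curvature",
    "mouth openness",
    "mouth width",
]

LEVELS = 8  # approximately 8 distinguishable states per feature (3 bits)


def quantize(value, lo, hi, levels=LEVELS):
    """Map a value in [lo, hi] to an integer state in 0..levels-1 (hypothetical helper)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Clamp so out-of-range data still maps to a valid state.
    clamped = min(max(value, lo), hi)
    fraction = (clamped - lo) / (hi - lo)
    # The top edge (fraction == 1.0) would land on `levels`, so cap it.
    return min(int(fraction * levels), levels - 1)


def face_states(row, ranges):
    """Turn one 8-variable data row into a dict of feature name -> state."""
    if len(row) != len(FEATURES) or len(ranges) != len(FEATURES):
        raise ValueError("expected one value and one (lo, hi) range per feature")
    return {
        name: quantize(value, lo, hi)
        for name, value, (lo, hi) in zip(FEATURES, row, ranges)
    }
```

A drawing routine would then pick, say, one of 8 eyebrow angles from the state stored under "eyebrow angle". Linear binning is only one option; quantile-based bins would spread skewed data more evenly across the 8 states.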
There are currently only two characters to choose from,
which means that only two data sets can be simultaneously
displayed conveniently.
Kenny is not used because he does not have a visible mouth,
which would mean losing the three mouth features, or 3*3=9 bits
of information per face. Though the information density of these
plots is low, the ease of identifying patterns over a multivariate
data set can compensate in many situations.
A random Chernoff Park plot. There are approximately 12 faces,
each conveying 8 features at 3 bits per feature, for a total of
12*8*3=288 bits of information. This plot could be made denser by
reducing face size.
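The bit counts quoted on this page follow from simple arithmetic, which the short Python sketch below reproduces. The function name plot_capacity_bits is hypothetical, and the calculation assumes every feature has exactly 8 equally usable states, which the text only claims approximately.

```python
import math


def plot_capacity_bits(n_faces, n_features=8, states_per_feature=8):
    """Upper bound on the bits a plot can show, assuming independent features."""
    bits_per_feature = math.log2(states_per_feature)  # 8 states -> 3 bits
    return n_faces * n_features * bits_per_feature


# The 12-face plot described in the caption.
print(plot_capacity_bits(12))               # 288.0

# One full face, and the three mouth features a mouthless Kenny would lose.
print(plot_capacity_bits(1))                # 24.0
print(plot_capacity_bits(1, n_features=3))  # 9.0
```

Shrinking the faces raises n_faces for the same plot area, which is the sense in which the plot could be made denser.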