Chernoff Park

Chernoff Park Plots

In 1973, Herman Chernoff published "The Use of Faces to Represent Points in k-Dimensional Space Graphically". Humans recognize and characterize faces rapidly, which makes faces a convenient way to spot patterns in data. Using the familiar South Park characters as templates makes the data easier to describe (e.g. there are a lot of pissed-off looking Kyles in the upper left quadrant), and the faces are easier for the user to calibrate because people already know the standard appearance of the characters.

In this particular implementation of Chernoff Face plots, a single face represents 8 variables. The features are:

head eccentricity
inter-eye distance
eye size
eyebrow angle
pupil size
mouth curvature
mouth openness
mouth width
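
As a rough sketch of how a data set could be mapped onto these eight features, the snippet below rescales each variable to the range 0 to 1 and pairs it with a feature. The feature identifiers, their ordering, and the helper names (normalize_columns, rows_to_faces) are assumptions made for illustration; they are not taken from the actual implementation.

```python
# Hypothetical feature names, in the order listed above.
FEATURES = [
    "head_eccentricity",
    "inter_eye_distance",
    "eye_size",
    "eyebrow_angle",
    "pupil_size",
    "mouth_curvature",
    "mouth_openness",
    "mouth_width",
]


def normalize_columns(rows):
    """Rescale each column of `rows` to [0, 1] with min-max scaling."""
    scaled_columns = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; place it mid-range.
        scaled_columns.append([(v - lo) / span if span else 0.5 for v in col])
    return [list(row) for row in zip(*scaled_columns)]


def rows_to_faces(rows):
    """Turn rows of 8 numbers into dicts of feature name -> value in [0, 1]."""
    for row in rows:
        if len(row) != len(FEATURES):
            raise ValueError("each row needs exactly %d values" % len(FEATURES))
    return [dict(zip(FEATURES, scaled)) for scaled in normalize_columns(rows)]


if __name__ == "__main__":
    data = [
        [0, 1, 2, 3, 4, 5, 6, 7],
        [10, 11, 12, 13, 14, 15, 16, 17],
        [5, 6, 7, 8, 9, 10, 11, 12],
    ]
    for face in rows_to_faces(data):
        print(face)
```

Min-max scaling is only one possible choice here; a real plot might clip outliers first so that a single extreme value does not flatten every other face.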

Each feature can vary over a range of approximately 8 states, or 3 bits of data, so a full face carries about 8*3=24 bits. There are currently only two characters to choose from, which means that only two data sets can conveniently be displayed at the same time. Kenny is not used because he does not have a visible mouth, which would cost the three mouth features, a loss of 9 bits of information per face. Though the information density is low, the ease of identifying patterns in a multivariate data set can compensate in many situations.
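
The bit counts above can be checked with a few lines of arithmetic. This is a minimal sketch: the quantize helper and the constant names are hypothetical, and the figure of 8 states per feature is the approximate value quoted in the text, not a measured one.

```python
import math

LEVELS = 8          # approximate number of distinguishable states per feature
N_FEATURES = 8      # features per face
MOUTH_FEATURES = 3  # mouth curvature, openness, and width


def quantize(value, levels=LEVELS):
    """Map a value in [0, 1] to an integer state in 0..levels-1."""
    clamped = min(max(value, 0.0), 1.0)
    # value == 1.0 would land on `levels`, so cap it at the top state.
    return min(int(clamped * levels), levels - 1)


bits_per_feature = math.log2(LEVELS)                         # 3 bits
bits_per_face = N_FEATURES * bits_per_feature                # 24 bits
bits_lost_without_mouth = MOUTH_FEATURES * bits_per_feature  # 9 bits

if __name__ == "__main__":
    print(bits_per_feature, bits_per_face, bits_lost_without_mouth)
```

A face with no visible mouth would therefore be left with 24 - 9 = 15 bits, under the same assumptions.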

A random Chernoff Park plot. It contains approximately 12 faces, each conveying 8 features at about 3 bits per feature, for a total of 12*8*3=288 bits of information. The plot could be made denser by reducing the face size.
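
To illustrate how the total scales with face size, here is a small sketch. The function names are hypothetical, and max_faces assumes faces are packed on a non-overlapping square grid, which is a simplification of how faces would actually be positioned by the data.

```python
def plot_bits(n_faces, n_features=8, bits_per_feature=3):
    """Total bits shown by a plot of `n_faces` faces."""
    return n_faces * n_features * bits_per_feature


def max_faces(plot_width, plot_height, face_size):
    """Faces that fit on a non-overlapping square grid (an assumed layout)."""
    return (plot_width // face_size) * (plot_height // face_size)


if __name__ == "__main__":
    print(plot_bits(12))  # the plot described in the caption
    # Halving the face size roughly quadruples the number of faces that fit.
    print(max_faces(400, 300, 100), max_faces(400, 300, 50))
```

Under these assumptions the capacity grows with the inverse square of the face size, until the faces become too small for the features to be distinguished.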
