Historical Approaches
Luhn Algorithm (cont.)
5. Similar spellings are consolidated into word types (a rough approximation of a stemmer)
5a. Any two tokens with fewer than seven letter non-matches are considered to be of the same word type:
frequently vs. frequent
10 letters, 8 match, 2 non-match
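The consolidation rule in step 5a can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say exactly how non-matches are counted, so the sketch assumes the words are compared letter by letter from the start, and every letter of either word after the first mismatch counts as a non-match. The function names (`common_prefix_len`, `non_matches`, `same_word_type`) and the `threshold` parameter are hypothetical, chosen for this example.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading letters on which the two words agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def non_matches(a: str, b: str) -> int:
    """Letters left over in both words after the shared prefix.

    Assumption: everything after the first mismatch counts as a non-match.
    """
    p = common_prefix_len(a, b)
    return (len(a) - p) + (len(b) - p)


def same_word_type(a: str, b: str, threshold: int = 7) -> bool:
    """True when the pair has fewer than `threshold` letter non-matches."""
    return non_matches(a.lower(), b.lower()) < threshold


# The slide's example: 8 letters match, 2 do not.
print(non_matches("frequently", "frequent"))
print(same_word_type("frequently", "frequent"))
print(same_word_type("differ", "differentiate"))
```

Under this reading, "frequently" and "frequent" share an 8-letter prefix and leave 2 letters over, so they fall into the same word type, while "differ" and "differentiate" leave 7 letters over and do not. Two short unrelated words can also fall under the threshold with this counting, which is one reason the rule is only a rough approximation of a stemmer.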