<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Text::Soundex - Implementation of the Soundex Algorithm as Described by Knuth</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<link rev="made" href="mailto:" />
</head>
<body style="background-color: white">
<table border="0" width="100%" cellspacing="0" cellpadding="3">
<tr><td class="block" style="background-color: #cccccc" valign="middle">
<big><strong><span class="block">&nbsp;Text::Soundex - Implementation of the Soundex Algorithm as Described by Knuth</span></strong></big>
</td></tr>
</table>
<p><a name="__index__"></a></p>
<ul>
<li><a href="#name">NAME</a></li>
<li><a href="#synopsis">SYNOPSIS</a></li>
<li><a href="#description">DESCRIPTION</a></li>
<li><a href="#examples">EXAMPLES</a></li>
<li><a href="#limitations">LIMITATIONS</a></li>
<li><a href="#author">AUTHOR</a></li>
</ul>
<h1><a name="name">NAME</a></h1>
<p>Text::Soundex - Implementation of the Soundex Algorithm as Described by Knuth</p>
<h1><a name="synopsis">SYNOPSIS</a></h1>
<pre>
 use Text::Soundex;

 $code = soundex $string;   # get soundex code for a string
 @codes = soundex @list;    # get list of codes for list of strings

 # set value to be returned for strings without soundex code
 $soundex_nocode = 'Z000';</pre>
<h1><a name="description">DESCRIPTION</a></h1>
<p>This module implements the soundex algorithm as described by Donald Knuth
in Volume 3 of <strong>The Art of Computer Programming</strong>. The algorithm is
intended to hash words (in particular surnames) into a small space using a
simple model which approximates the sound of the word when spoken by an English
speaker. Each word is reduced to a four-character string, the first
character being an uppercase letter and the remaining three being digits.
</p>
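<p>As a rough illustration of the reduction described above, the following
minimal sketch encodes a single string in plain Perl. The helper name
<code>soundex_sketch</code> is hypothetical and is not part of Text::Soundex, and
the letter-to-digit table is the commonly cited soundex table, assumed here
rather than taken from this page; the module's own code may differ in detail.</p>
<pre>
use strict;
use warnings;

# Hypothetical helper, not part of Text::Soundex: a simplified soundex
# for one string, using the commonly cited letter-to-digit table.
sub soundex_sketch {
    my ($word) = @_;
    my $s = uc $word;
    $s =~ tr/A-Z//cd;                 # keep the letters A-Z only
    return undef if $s eq '';         # nothing left to encode
    my $first = substr($s, 0, 1);     # the code keeps the first letter
    # vowels, H, W and Y map to 0; the other letters map to 1 through 6
    $s =~ tr/AEHIOUWYBFPVCGJKQSXZDTLMNR/00000000111122222222334556/;
    $s =~ tr/0-6//s;                  # squeeze runs of the same digit
    $s = substr($s, 1);               # drop the digit of the first letter
    $s =~ tr/0//d;                    # drop the letters that mapped to 0
    return substr($first . $s . '000', 0, 4);   # pad or cut to 4 characters
}

print join(', ', map { soundex_sketch($_) } qw(Euler Ellery Knuth)), "\n";
# prints: E460, E460, K530</pre>
<p>Squeezing before dropping the first digit means that a repeated leading
sound, as in <code>Lloyd</code>, contributes nothing beyond the first letter.</p>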
<p>If a string has no soundex code representation, the value of
<code>$soundex_nocode</code> is returned. This is initially set to
<code>undef</code>, but many people seem to prefer an <em>unlikely</em> value like
<code>Z000</code> (how unlikely this is depends on the data set being dealt
with). Any value can be assigned to <code>$soundex_nocode</code>.
</p>
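<p>A short sketch of switching the fallback value follows. It assumes that
Text::Soundex is installed, that <code>$soundex_nocode</code> is exported by
default as the synopsis suggests, and that the empty string is one input with
no soundex code; the variable names <code>$default</code> and
<code>$sentinel</code> are only illustrative.</p>
<pre>
use strict;
use warnings;
use Text::Soundex;

# Assumption: an empty string has no soundex code.
my $default = soundex('');        # $soundex_nocode is initially undef

$soundex_nocode = 'Z000';         # pick an unlikely sentinel instead
my $sentinel = soundex('');

print defined($default) ? $default : 'undef', "\n";
print "$sentinel\n";</pre>
<p>A defined sentinel can be convenient when codes are used as hash keys,
because <code>undef</code> as a key triggers a warning under
<code>use warnings</code>.</p>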
<p>In scalar context <code>soundex</code> returns the soundex code of its first
argument. In list context it returns a list in which each element is the
soundex code for the corresponding argument passed to <code>soundex</code>.
For example,
</p>
<pre>
 @codes = soundex qw(Mike Stok);</pre>
<p>leaves <code>@codes</code> containing <code>('M200', 'S320')</code>.
</p>
<h1><a name="examples">EXAMPLES</a></h1>
<p>Knuth's examples of various names and the soundex codes they map to
are listed below:
</p>
<pre>
 Euler, Ellery -&gt; E460
 Gauss, Ghosh -&gt; G200
 Hilbert, Heilbronn -&gt; H416
 Knuth, Kant -&gt; K530
 Lloyd, Ladd -&gt; L300
 Lukasiewicz, Lissajous -&gt; L222</pre>
<p>so:</p>
<pre>
 $code = soundex 'Knuth';            # $code contains 'K530'
 @list = soundex qw(Lloyd Gauss);    # @list contains 'L300', 'G200'</pre>
<h1><a name="limitations">LIMITATIONS</a></h1>
<p>Because the soundex algorithm was originally used a <strong>long</strong> time
ago in the US, it considers only the English alphabet and pronunciation.
</p>
<p>Because it maps a large space (arbitrary-length strings) onto a small
space (a single letter plus three digits), no inference can be made about the
similarity of two strings which end up with the same soundex code. For
example, both <code>Hilbert</code> and <code>Heilbronn</code> end up with a
soundex code of <code>H416</code>.
</p>
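<p>One way to see the effect is to group names by their code. The sketch
below assumes Text::Soundex is installed and uses names from Knuth's examples;
the hash name <code>%bucket</code> is only illustrative. A shared bucket marks
candidates for a closer comparison, not names known to sound alike.</p>
<pre>
use strict;
use warnings;
use Text::Soundex;

# Collect names under their soundex code.
my %bucket;
for my $name (qw(Hilbert Heilbronn Knuth Kant)) {
    push @{ $bucket{ soundex($name) } }, $name;
}

# Each bucket holds every name that hashed to the same code.
for my $code (sort keys %bucket) {
    print "$code: @{ $bucket{$code} }\n";
}
# prints:
# H416: Hilbert Heilbronn
# K530: Knuth Kant</pre>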
<h1><a name="author">AUTHOR</a></h1>
<p>This code was implemented by Mike Stok (<code>stok@cybercom.net</code>) from the
description given by Knuth. Ian Phillipps (<code>ian@pipex.net</code>) and Rich Pinder
(<code>rpinder@hsc.usc.edu</code>) supplied ideas and spotted mistakes.
</p>
</body>
</html>