+ Installing and Registering the Module
- The Soundex system is a method of matching similar-sounding (similar phonemes) names by
- converting them to the same code.
- Soundex is most useful for English names.
- These functions work with Soundex codes:
- soundex(text string1) returns text
-difference(text string1, text string2) returns int
- The soundex function converts a string to its Soundex code. Soundex
- codes consist of four characters.
- The difference function converts two strings to their Soundex codes
- and then reports the number of matching code positions. The result ranges from zero to
- four, zero being no match and four being an exact match. These are some examples:
- SELECT soundex('hello world!');
-SELECT soundex('Anne'), soundex('Ann'), difference('Anne', 'Ann');
-SELECT soundex('Anne'), soundex('Andrew'), difference('Anne', 'Andrew');
-SELECT soundex('Anne'), soundex('Margaret'), difference('Anne', 'Margaret');
-
-CREATE TABLE s (nm text);
-
-INSERT INTO s VALUES ('john');
-INSERT INTO s VALUES ('joan');
-INSERT INTO s VALUES ('wobbly');
-INSERT INTO s VALUES ('jack');
-
-SELECT * FROM s WHERE soundex(nm) = soundex('john');
-
-SELECT * FROM s WHERE difference(s.nm, 'john') > 2;
- For information about the Soundex indexing system see .
+ The fuzzystrmatch module is installed when you install
+ Greenplum Database. Before you can use any of the functions defined in the
+ module, you must register the fuzzystrmatch extension in
+ each database in which you want to use the functions.
+ Refer to Installing Additional Supplied Modules
+ for more information.
-
- Levenshtein Functions
+
+ Module Documentation
- These functions calculate the Levenshtein distance between two strings:
- levenshtein(text source, text target, int ins_cost, int del_cost, int sub_cost) returns int
-levenshtein(text source, text target) returns int
-levenshtein_less_equal(text source, text target, int ins_cost, int del_cost, int sub_cost, int max_d) returns int
-levenshtein_less_equal(text source, text target, int max_d) returns int
- Both the source and target parameters can be any
- non-null string, with a maximum of 255 bytes. The cost parameters
- ins_cost, del_cost, and sub_cost
- specify cost of a character insertion, deletion, or substitution, respectively. You can
- omit the cost parameters, as in the second version of the function; in that case the cost
- parameters default to 1.
-
- levenshtein_less_equal is accelerated version of
- levenshtein function for low values of distance. If actual distance
- is less or equal then max_d, then
- levenshtein_less_equal returns an accurate value of the distance.
- Otherwise, this function returns value which is greater than max_d.
- Examples:
- test=# SELECT levenshtein('GUMBO', 'GAMBOL');
- levenshtein
--------------
- 2
-(1 row)
-
-test=# SELECT levenshtein('GUMBO', 'GAMBOL', 2,1,1);
- levenshtein
--------------
- 3
-(1 row)
-
-test=# SELECT levenshtein_less_equal('extensive', 'exhaustive',2);
- levenshtein_less_equal
-------------------------
- 3
-(1 row)
-
-test=# SELECT levenshtein_less_equal('extensive', 'exhaustive',4);
- levenshtein_less_equal
-------------------------
- 4
-(1 row)
- For information about the Levenshtein algorithm, see .
-
-
-
- Metaphone Functions
-
- Metaphone, like Soundex, is based on the idea of constructing a representative code for
- an input string. Two strings are then deemed similar if they have the same codes. This
- function calculates the metaphone code of an input string
- :metaphone(text source, int max_output_length) returns text
- The source parameter must be a non-null string with a maximum of 255
- characters. The max_output_length parameter sets the maximum length of
- the output metaphone code; if longer, the output is truncated to this length. Example:
- test=# SELECT metaphone('GUMBO', 4);
- metaphone
------------
- KM
-(1 row)
- For information about the Metaphone algorithm, see .
-
-
-
- Double Metaphone Functions
-
- The Double Metaphone system computes two "sounds like" strings for a given input string -
- a "primary" and an "alternate". In most cases they are the same, but for non-English names
- especially they can be a bit different, depending on pronunciation. These functions
- compute the primary and alternate
- codes:dmetaphone(text source) returns text
-dmetaphone_alt(text source) returns text
- There is no length limit on the input strings. Example:
- test=# select dmetaphone('gumbo');
- dmetaphone
-------------
- KMP
-(1 row)
- For information about the Double Metaphone algorithm, see .
-
-
-
- Registering the fuzzystrmatch Functions
-
- Greenplum Database supplies SQL scripts to register the fuzzystrmatch
- module functions.
- Before you can use fuzzystrmatch functions, you must register
- the fuzzystrmatch extension in each database in which you want
- to use the functions:
- psql -d testdb -c "CREATE EXTENSION fuzzystrmatch"
- To remove the functions from a database:
- psql -d testdb -c "DROP EXTENSION fuzzystrmatch"
- When you remove the fuzzystrmatchfunctions from a database,
- UDFs that you created in the database that use the functions will no longer
- work.
+ See fuzzystrmatch in the PostgreSQL
+ documentation for detailed information about the individual functions in
+ this module.