
Double Metaphone

Double Metaphone is an algorithm for matching words that are spelled differently but sound the same. It's useful for avoiding duplicate names in a database. You can find it implemented in C, Ruby, and PHP (see the Wikipedia article). I've written versions in Python and in SQL for MySQL.
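To make the duplicate-name use case concrete, here is a minimal sketch of grouping names by a phonetic key. The toy_key function below is a hypothetical stand-in written only for this illustration. It is not Double Metaphone and it is not the API of metaphone.py; in practice you would pass a real Double Metaphone key function as the key argument instead.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def toy_key(name: str) -> str:
    """Hypothetical stand-in for a phonetic key. NOT Double Metaphone.

    Keeps the first letter, drops later vowels and H, W, Y, and skips
    any letter equal to the last letter already kept.
    """
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    key = [letters[0]]
    for ch in letters[1:]:
        if ch in "AEIOUHWY":
            continue  # ignore letters that rarely change how the toy key should match
        if ch != key[-1]:
            key.append(ch)
    return "".join(key)


def possible_duplicates(
    names: Iterable[str], key: Callable[[str], str] = toy_key
) -> Dict[str, List[str]]:
    """Group names by key and return only the groups with more than one name."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        groups[key(name)].append(name)
    return {k: v for k, v in groups.items() if len(v) > 1}


if __name__ == "__main__":
    print(possible_duplicates(["Smith", "Smyth", "Jones", "Smythe"]))
```

Names that share a key are only candidates for being duplicates; a person or a stricter comparison still has to confirm each match.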
Double Metaphone in Python and MySQL
Notes on my implementation of the Double Metaphone algorithm in Python (and MySQL).
File metaphone.py
My implementation of the Double Metaphone algorithm in Python. Updated December 17, 2007 to fix three bugs. Updated June 25, 2010 to use UTF-8 and to fix several bugs.
File metaphone.sql
This is an implementation in SQL for MySQL 5.0+. It creates a function you can use in your SQL queries, a la: "SELECT Name FROM tblPeople WHERE dm(Name) = dm(@Search)". (Updated Nov 27, 2007 to fix a typo in the 'CC' section. Updated June 1, 2010 to fix a bug in the 'Z' section. Updated June 25, 2010 to fix many bugs; see the file's opening comments.)
The Wikipedia article on Double Metaphone
This is an excellent place to start.
aspell.net
I translated the C source file linked as 'A slightly modified version' on this page.
Apache Commons Codec
There's an implementation of Double Metaphone in the Apache Commons Codec library for Java.
Metaphone 3
There is a third version of the Metaphone algorithm. It is not free, but the price seemed reasonable when I checked, so it might be worth a look if you can use Java or C++ source.
Fuzzy
Fuzzy is a Python package that includes a C implementation of Double Metaphone as well as Soundex and NYSIIS. You might still be interested in my code if you want to dig into how it works, but Fuzzy looks like it would run faster.