Subject: Function for SOUNDEX
From: Andrew Foster <afoster@www.uni-com.co.uk>
Newsgroups: comp.databases.informix
Date: Fri, 06 Jun 1997 18:54:20 +0200

--------------226F3049229E1F439960EF17
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

I worked on a project in Illustra which required functionality similar
to what you are describing.  I ended up adapting a C function as my
solution, and I'm sure you can adapt it into whatever form you require.

#include <ctype.h>   /* isalpha, tolower, toupper */
#include <string.h>  /* strcpy */

char *M_SoundsLike(char *HashStr, char *Data, int NoBytes)
/*
 *************************************************************************
 * M_SoundsLike()                           Author  :  Andrew Foster     *
 *                                          Version :  1.0               *
 * "Computes a value representing sounds"   Date    :  19-09-96          *
 *************************************************************************

    DESCRIPTION
     Computes a hash value which represents the sounds contained in the
     data supplied.  This value can then be used to compare the sound of
     different words, names etc. even when they are slightly mis-spelled.

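    PARAMETERS
     HashStr - A pointer to the location where the hash value is to
            be stored.  The key length is not capped, so allow
            NoBytes + 1 bytes, and never fewer than 6.
     Data - The data to be analysed and for which the hash value is
            generated.
     NoBytes - The number of bytes of Data to process.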

    RETURNS
     A null-terminated string containing the code representing the sound.

    OTHER NOTES
     This was apparently adapted originally from an algorithm in the
     book Knuth, D.E. (1973) The Art of Computer Programming,
     Addison-Wesley Publishing Company, Volume 3: Sorting and Searching,
     p. 392.

     However, I obtained it from the internet, so this may or may not be
     true!

*/
{
   static int   Weight[] =
      {  0,1,2,3,0,1,2,0,0,2,2,4,5,5,0,1,2,6,2,3,0,1,0,2,0,2 };
      /* a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z */
   register char  Character;
   register int  Last=0;
   register int  Count=0;
   char   *HashPtr;

   /* Set up default key, complete with trailing '0's */
   strcpy(HashStr,"Z0000");

   // Initialise the hash pointer
   HashPtr = HashStr;

   /* Advance to the first letter.  If none present, return default key.
      The cast keeps isalpha() safe for bytes with the high bit set. */
   for(Count = 0; (Count < NoBytes) && !isalpha((unsigned char)*Data); Count++)
   {
      // Advance the data pointer
      Data++;
   }

   if (Count < NoBytes)
   {
      // Pull out the first letter, uppercase it, and set up for the
      // main loop
      *HashPtr = toupper((unsigned char)*Data);
      Last = Weight[*HashPtr - 'A'];
      HashPtr++;
      Data++;
      Count++;

      for (; Count < NoBytes; Data++, Count++)
      {
         if (isalpha((unsigned char)*Data))
         {
            Character = tolower((unsigned char)*Data);

            // Fold together adjacent letters sharing the same code
            if (Last != Weight[Character - 'a'])
            {
               Last = Weight[Character - 'a'];

               // Ignore weight==0 letters except as separators
               if (Last != 0)
               {
                  *HashPtr = Weight[Character - 'a'] + '0';
                  HashPtr++;
               }
            }
         }
      }
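      // Terminate the key after the last code written.  When no letter
      // was found this is skipped, so the default key is returned intact.
      *HashPtr = 0x0;
   }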

   return(HashStr);
}
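
To show how the keys might be compared, here is a rough sketch.  It is not
part of the original routine: sound_key() is only a condensed restatement
of the function above (it leaves out the default key, so input with no
letters gives an empty key), and sounds_alike() and the 64-byte buffers
are names and sizes I have made up for the illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Condensed restatement of the routine above (hypothetical name), kept
   here only so the sketch compiles on its own.  key must hold n + 1
   bytes because the key length is not capped. */
char *sound_key(char *key, const char *data, int n)
{
   static const int weight[26] =
      { 0,1,2,3,0,1,2,0,0,2,2,4,5,5,0,1,2,6,2,3,0,1,0,2,0,2 };
   char *out = key;
   int last = 0, i = 0;

   /* Skip anything before the first letter */
   while (i < n && !isalpha((unsigned char)data[i]))
      i++;

   if (i < n)
   {
      /* First letter is kept as-is, in upper case */
      *out = (char)toupper((unsigned char)data[i]);
      last = weight[*out - 'A'];
      out++;

      for (i++; i < n; i++)
      {
         if (isalpha((unsigned char)data[i]))
         {
            int w = weight[tolower((unsigned char)data[i]) - 'a'];

            /* Fold adjacent letters sharing a code; skip weight 0 */
            if (w != last)
            {
               last = w;
               if (w != 0)
                  *out++ = (char)(w + '0');
            }
         }
      }
   }
   *out = '\0';
   return key;
}

/* Returns non-zero when both words produce the same key */
int sounds_alike(const char *a, const char *b)
{
   char key_a[64], key_b[64];

   assert(a != NULL && b != NULL);
   /* Only accept words whose keys are certain to fit the buffers */
   assert(strlen(a) < sizeof key_a && strlen(b) < sizeof key_b);

   sound_key(key_a, a, (int)strlen(a));
   sound_key(key_b, b, (int)strlen(b));
   return strcmp(key_a, key_b) == 0;
}
```

With this, sounds_alike("Sydney", "Sidney") is non-zero, since both words
reduce to the key S35.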

Sameer Bhatia (Contract) wrote:

> Hi Folks,
>
> Oracle has a function SOUNDEX. This function has the ability
> to find words that sound like other words, virtually
> regardless of how either is spelled.
>
> For example,
> consider a table weather, with columns : city, temperature
> and condition.
>
> SELECT * from weather
> WHERE SOUNDEX ( city) = SOUNDEX ('Sidney')
>
> The result will be
>
> CITY             TEMPERATURE              CONDITION
> ---------------------------------------------------
> SYDNEY                    29              SNOW
>
> I am working on porting of an application on Oracle to
> Informix and the current application has quite a few
> instances of SOUNDEX.
>
> Wondering whether a function (stored procedure) can be
> written in Informix which would provide the same
> functionality as SOUNDEX in Oracle.
>
> Any help would be appreciated.
>
> Sameer
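
One thing to watch when porting: the routine above returns keys of varying
length (S35 for Sidney), whereas Oracle's SOUNDEX, as far as I know,
always returns four characters.  If the keys need to line up with that
form, something along these lines could pad or truncate them.  This is
only a sketch: fixed_key() and FIXED_KEY_LEN are made-up names, and it
makes no attempt to match any other differences between the two.

```c
#include <assert.h>
#include <string.h>

#define FIXED_KEY_LEN 4   /* assumed: one letter plus three digits */

/* Copy a variable-length key into Fixed, truncating it or padding it
   with '0' so the result is always FIXED_KEY_LEN characters long.
   Fixed must hold FIXED_KEY_LEN + 1 bytes.  Hypothetical helper, not
   part of the routine above. */
char *fixed_key(char *Fixed, const char *Key)
{
   size_t Len;

   assert(Fixed != NULL && Key != NULL);

   /* Keep at most FIXED_KEY_LEN characters of the original key */
   Len = strlen(Key);
   if (Len > FIXED_KEY_LEN)
      Len = FIXED_KEY_LEN;

   memcpy(Fixed, Key, Len);

   /* Fill whatever is left with '0' and terminate */
   memset(Fixed + Len, '0', FIXED_KEY_LEN - Len);
   Fixed[FIXED_KEY_LEN] = '\0';
   return Fixed;
}
```

For example, fixed_key(buf, "S35") gives S350, and a longer key such as
A22613 is cut down to A226.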

--

Andrew Foster
afoster@www.uni-com.co.uk
---------------------------------------------------------------------
Unique Communique Group
Unit 10, Parkway 4, Fourways, Trafford Park, Manchester, M17 1SN.
UK.
Tel. : +44 (0)161 877 2055                 Fax. : +44 (0)161 877 9649

--------------226F3049229E1F439960EF17--
