XML.org
OASIS Mailing List Archives

Fuzzy matcher implemented in XSLT/XPath

Hi Folks,

A fellow by the name of Roger Cauvin implemented a fuzzy matcher in XSLT/XPath. He posted his code to the xsl-list:

http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/201301/msg00164.html 

I tested his fuzzy matcher program and it is fantastic.

Here's my test. I first created this XML document containing a list of names:

<Names>
    <Name>Smith</Name>
    <Name>Costello</Name>
    <Name>costello</Name>
    <Name>COSTELLO</Name>
    <Name>CoStEllO</Name>
    <Name>Costelo</Name>
    <Name>Costell</Name>
    <Name>Costllo</Name>
</Names>

I want to do a fuzzy match on the name 'Costello'.

I created a stylesheet that imports his fuzzy matcher code, loops through each name, calls his compare-strings named template, and prints the result. The output comes first, followed by my test stylesheet.

If the fuzzy matcher (i.e., compare-strings) returns a score of 0.9 or higher, I consider it a match.

-------------------------------------------
                Output
-------------------------------------------
Searching for matches on: Costello

Results:

Not a match: Smith
Fuzzy match score: 0

Found a match: Costello
Fuzzy match score: 1

Found a match: costello
Fuzzy match score: 1

Found a match: COSTELLO
Fuzzy match score: 1

Found a match: CoStEllO
Fuzzy match score: 1

Found a match: Costelo
Fuzzy match score: 0.9230769230769231

Found a match: Costell
Fuzzy match score: 0.9230769230769231

Not a match: Costllo
Fuzzy match score: 0.7692307692307693
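
A note on the scores: they are what a Sorensen-Dice coefficient over lowercased two-character substrings (bigrams) would give. 'costello' has 7 bigrams (co, os, st, te, el, ll, lo). 'costelo' and 'costell' each have 6 and share all 6 with it, so 2*6/(7+6) = 12/13, about 0.923. 'costllo' has 6 (co, os, st, tl, ll, lo) and shares 5, so 2*5/(7+6) = 10/13, about 0.769. 'smith' shares none, giving 0. That is only my inference from the numbers; I have not confirmed that compare-strings works this way. The stylesheet below is a minimal sketch of that calculation under that assumption. The template names dice-score, bigram-list and count-shared are hypothetical (they are not from Roger's code), and the sketch assumes the names never contain the '|' character, which it uses as a separator.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

    <xsl:output method="text"/>

    <!-- Print each Name with its bigram Dice score against 'Costello' -->
    <xsl:template match="Names">
        <xsl:for-each select="Name">
            <xsl:value-of select="."/>
            <xsl:text>: </xsl:text>
            <xsl:call-template name="dice-score">
                <xsl:with-param name="string1" select="'Costello'"/>
                <xsl:with-param name="string2" select="string(.)"/>
            </xsl:call-template>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>

    <!-- Hypothetical: 2 * shared bigrams / (bigrams in string1 + bigrams in string2) -->
    <xsl:template name="dice-score">
        <xsl:param name="string1"/>
        <xsl:param name="string2"/>
        <!-- Lowercase both strings (ASCII letters only) so the comparison ignores case -->
        <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
        <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
        <xsl:variable name="a" select="translate($string1, $upper, $lower)"/>
        <xsl:variable name="b" select="translate($string2, $upper, $lower)"/>
        <!-- Bigrams of the second string as a delimited list, e.g. |co|os|st| -->
        <xsl:variable name="list">
            <xsl:call-template name="bigram-list">
                <xsl:with-param name="s" select="$b"/>
            </xsl:call-template>
        </xsl:variable>
        <xsl:variable name="shared">
            <xsl:call-template name="count-shared">
                <xsl:with-param name="s" select="$a"/>
                <xsl:with-param name="list" select="string($list)"/>
            </xsl:call-template>
        </xsl:variable>
        <!-- A string of n characters (n >= 1) has n - 1 bigrams -->
        <xsl:variable name="total" select="string-length($a) + string-length($b) - 2"/>
        <xsl:choose>
            <xsl:when test="$total &gt; 0">
                <xsl:value-of select="2 * $shared div $total"/>
            </xsl:when>
            <xsl:otherwise>0</xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <!-- Emit the bigrams of $s separated and surrounded by '|' -->
    <xsl:template name="bigram-list">
        <xsl:param name="s"/>
        <xsl:text>|</xsl:text>
        <xsl:if test="string-length($s) &gt;= 2">
            <xsl:value-of select="substring($s, 1, 2)"/>
            <xsl:call-template name="bigram-list">
                <xsl:with-param name="s" select="substring($s, 2)"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>

    <!-- Walk the bigrams of $s; each one found in $list counts once and is removed from $list -->
    <xsl:template name="count-shared">
        <xsl:param name="s"/>
        <xsl:param name="list"/>
        <xsl:param name="count" select="0"/>
        <xsl:choose>
            <xsl:when test="string-length($s) &lt; 2">
                <xsl:value-of select="$count"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:variable name="key" select="concat('|', substring($s, 1, 2), '|')"/>
                <xsl:variable name="hit" select="contains($list, $key)"/>
                <xsl:variable name="rest">
                    <xsl:choose>
                        <xsl:when test="$hit">
                            <xsl:value-of select="concat(substring-before($list, $key), '|', substring-after($list, $key))"/>
                        </xsl:when>
                        <xsl:otherwise>
                            <xsl:value-of select="$list"/>
                        </xsl:otherwise>
                    </xsl:choose>
                </xsl:variable>
                <xsl:call-template name="count-shared">
                    <xsl:with-param name="s" select="substring($s, 2)"/>
                    <xsl:with-param name="list" select="string($rest)"/>
                    <xsl:with-param name="count" select="$count + number($hit)"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

</xsl:stylesheet>
```

Run against the Names document above, this should score the eight names as 0, 1, 1, 1, 1, 12/13, 12/13 and 10/13. How many decimal digits get printed depends on the XSLT processor.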

------------------------------------------
    test-fuzzy-matcher.xsl
------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                        version="1.0">
    
    <xsl:import href="fuzzy-matcher.xsl"/>
    
    <xsl:output method="text"/>
    
    <xsl:variable name="name-being-searched-for" select="'Costello'" />
    
    <xsl:template match="Names">
        
        <!-- &#10; is a newline; with method="text" every space and line break must be written out -->
        <xsl:text>Searching for matches on: </xsl:text>
        <xsl:value-of select="$name-being-searched-for"/>
        <xsl:text>&#10;&#10;Results:&#10;</xsl:text>
        
        <xsl:for-each select="Name">
            <xsl:variable name="check-this-name" select="." />
            <!-- Score returned by the imported compare-strings template -->
            <xsl:variable name="result-of-fuzzy-match-test">
                <xsl:call-template name="compare-strings">
                    <xsl:with-param name="string1" select="$name-being-searched-for" />
                    <xsl:with-param name="string2" select="$check-this-name" />
                </xsl:call-template>
            </xsl:variable>

            <!-- A score of 0.9 or higher counts as a match -->
            <xsl:choose>
                <xsl:when test="$result-of-fuzzy-match-test &gt;= 0.9">
                    <xsl:text>&#10;Found a match: </xsl:text>
                    <xsl:value-of select="$check-this-name"/>
                    <xsl:text>&#10;Fuzzy match score: </xsl:text>
                    <xsl:value-of select="$result-of-fuzzy-match-test"/>
                    <xsl:text>&#10;</xsl:text>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:text>&#10;Not a match: </xsl:text>
                    <xsl:value-of select="$check-this-name"/>
                    <xsl:text>&#10;Fuzzy match score: </xsl:text>
                    <xsl:value-of select="$result-of-fuzzy-match-test"/>
                    <xsl:text>&#10;</xsl:text>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each>
        
    </xsl:template>
    
</xsl:stylesheet>


Copyright 1993-2007 XML.org. This site is hosted by OASIS