Fuzzy matcher implemented in XSLT/XPath
- From: "Costello, Roger L." <costello@mitre.org>
- To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Thu, 7 Feb 2013 12:58:21 +0000
Hi Folks,
A fellow by the name of Roger Cauvin implemented a fuzzy matcher in XSLT/XPath. He posted his code to the xsl-list:
http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/201301/msg00164.html
I tested his fuzzy matcher program and it is fantastic.
Here's my test. I first created this XML document containing a list of names:
<Names>
  <Name>Smith</Name>
  <Name>Costello</Name>
  <Name>costello</Name>
  <Name>COSTELLO</Name>
  <Name>CoStEllO</Name>
  <Name>Costelo</Name>
  <Name>Costell</Name>
  <Name>Costllo</Name>
</Names>
I want to do a fuzzy match on the name 'Costello'.
I created a stylesheet that imports his fuzzy matcher code, loops through each name, calls his compare-strings named template, and prints the result.
If the fuzzy matcher (i.e., compare-strings) returns a score of 0.9 or higher, I consider it to be a match. The results are shown first; my test stylesheet follows.
-------------------------------------------
Output
-------------------------------------------
Searching for matches on: Costello
Results:
Not a match: Smith
Fuzzy match score: 0
Found a match: Costello
Fuzzy match score: 1
Found a match: costello
Fuzzy match score: 1
Found a match: COSTELLO
Fuzzy match score: 1
Found a match: CoStEllO
Fuzzy match score: 1
Found a match: Costelo
Fuzzy match score: 0.9230769230769231
Found a match: Costell
Fuzzy match score: 0.9230769230769231
Not a match: Costllo
Fuzzy match score: 0.7692307692307693
------------------------------------------
test-fuzzy-matcher.xsl
------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Roger Cauvin's fuzzy matcher; it supplies the compare-strings template -->
  <xsl:import href="fuzzy-matcher.xsl"/>

  <xsl:output method="text"/>

  <xsl:variable name="name-being-searched-for" select="'Costello'"/>

  <xsl:template match="Names">
    <xsl:text>Searching for matches on: </xsl:text>
    <xsl:value-of select="$name-being-searched-for"/>
    <xsl:text>&#10;Results:&#10;</xsl:text>
    <xsl:for-each select="Name">
      <xsl:variable name="check-this-name" select="."/>
      <!-- Score returned by compare-strings for this name -->
      <xsl:variable name="result-of-fuzzy-match-test">
        <xsl:call-template name="compare-strings">
          <xsl:with-param name="string1" select="$name-being-searched-for"/>
          <xsl:with-param name="string2" select="$check-this-name"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:choose>
        <!-- A score of 0.9 or higher counts as a match -->
        <xsl:when test="$result-of-fuzzy-match-test &gt;= 0.9">
          <xsl:text>Found a match: </xsl:text>
          <xsl:value-of select="$check-this-name"/>
          <xsl:text>&#10;Fuzzy match score: </xsl:text>
          <xsl:value-of select="$result-of-fuzzy-match-test"/>
          <xsl:text>&#10;</xsl:text>
        </xsl:when>
        <xsl:otherwise>
          <xsl:text>Not a match: </xsl:text>
          <xsl:value-of select="$check-this-name"/>
          <xsl:text>&#10;Fuzzy match score: </xsl:text>
          <xsl:value-of select="$result-of-fuzzy-match-test"/>
          <xsl:text>&#10;</xsl:text>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
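A possible variation, offered only as a sketch: the search name and the 0.9 cutoff could be exposed as top-level stylesheet parameters so they can be changed without editing the stylesheet. The parameter names search-name and match-threshold and the file name list-fuzzy-matches.xsl are assumptions made for illustration; only compare-strings and its string1/string2 parameters come from the fuzzy matcher as it is called above. This version prints just the names that reach the threshold.
------------------------------------------
list-fuzzy-matches.xsl (hypothetical)
------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:import href="fuzzy-matcher.xsl"/>

  <xsl:output method="text"/>

  <!-- Assumed parameter names; the defaults mirror the test above -->
  <xsl:param name="search-name" select="'Costello'"/>
  <xsl:param name="match-threshold" select="0.9"/>

  <xsl:template match="Names">
    <xsl:for-each select="Name">
      <!-- Score returned by compare-strings for the current name -->
      <xsl:variable name="score">
        <xsl:call-template name="compare-strings">
          <xsl:with-param name="string1" select="$search-name"/>
          <xsl:with-param name="string2" select="."/>
        </xsl:call-template>
      </xsl:variable>
      <!-- Print only the names whose score reaches the threshold -->
      <xsl:if test="$score &gt;= $match-threshold">
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
Going by the scores reported above, lowering match-threshold to 0.75 would also admit Costllo (0.7692307692307693), while the default of 0.9 excludes it. How a parameter value is supplied depends on the XSLT processor being used.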