Hi Folks,
How do you map one XML vocabulary to another? Do you hand-code a bunch of XSLT template rules? Do you create a few million instances of each XML vocabulary and then let machine learning figure out the mapping? Do you write regular expression descriptions of each vocabulary's data and then write code that consumes the regex descriptions and automatically generates XSLT templates? Perhaps something else?
I have two XML vocabularies with closely related data. I want to convert XML instances that conform to vocabulary 1 into equivalent instances that conform to vocabulary 2, and vice versa. The data in the two vocabularies might be structured quite differently.
The two vocabularies deal with magnetic variation. Magnetic variation is the difference between true north and magnetic north.
Here is an XML instance that uses vocabulary 1:
<MAG_VAR>W014404</MAG_VAR>
I want to convert it to vocabulary 2:
<magneticVariation>
    <magneticVariationEWT>West</magneticVariationEWT>
    <magneticVariationValue>14.7</magneticVariationValue>
</magneticVariation>
And vice versa.
Here is the mapping between the two formats:
Vocabulary 1    Vocabulary 2
W               West
E               East
n/a             True
Vocabulary 1 expresses magnetic variation in degrees, minutes, and tenths of minutes. Here is the structure of its magnetic variation data:
xyyyzzw
x = W or E corresponding to West and East, respectively.
yyy = the part of the magnetic variation in degrees.
zz = the part of the magnetic variation in minutes.
w = the part of the magnetic variation in tenths of a minute.
So, the data W014404 means the magnetic variation is West, 014 degrees, 40.4 minutes.
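The xyyyzzw layout can also be written down as a regular expression, which is one way to validate a value before mapping it. The following is only a minimal sketch, assuming an XSLT 2.0 processor such as Saxon. The mode name parse_vocab1 and the output element names (parsed, direction, degrees, minutes) are hypothetical, made up purely for illustration.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

    <!-- Hypothetical helper mode: splits an xyyyzzw value into its parts. -->
    <xsl:template match="MAG_VAR" mode="parse_vocab1">
        <!-- regex is an attribute value template, so literal braces are doubled -->
        <xsl:analyze-string select="normalize-space(.)"
                            regex="^([WE])([0-9]{{3}})([0-9]{{2}})([0-9])$">
            <xsl:matching-substring>
                <parsed>
                    <direction><xsl:value-of select="regex-group(1)"/></direction>
                    <degrees><xsl:value-of select="xs:integer(regex-group(2))"/></degrees>
                    <!-- zz whole minutes plus w tenths of a minute -->
                    <minutes>
                        <xsl:value-of select="xs:integer(regex-group(3)) + xs:integer(regex-group(4)) div 10"/>
                    </minutes>
                </parsed>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:message terminate="yes">Invalid MAG_VAR value: <xsl:value-of select="."/></xsl:message>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet>
```

Because the expression is anchored at both ends, a value with a bad direction letter or the wrong number of digits falls into the non-matching branch and stops the transformation.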
Vocabulary 2 expresses magnetic variation as a decimal value 0 to 180, with one digit to the right of the decimal point.
So, the data West, 14.7 means the magnetic variation is 14.7 degrees West.
Converting W014404 to West, 14.7 requires:
Map the first character of W014404 to West.
Map the second, third, and fourth characters to 14 degrees.
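Map the fifth, sixth, and seventh characters (404, that is, 40.4 minutes) to .7 degrees. This involves some arithmetic: divide the minutes by 60, add the result to the degrees, and round to one decimal place.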
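Worked through for W014404, the arithmetic in that last step looks like this (40.4 minutes converted to degrees, added to the 14 whole degrees, then rounded to one decimal place):

```latex
\[
14 + \frac{40.4}{60} = 14 + 0.6733\ldots = 14.6733\ldots \approx 14.7 \ \mathrm{degrees}
\]
```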
Doing the reverse mapping from West, 14.7 to W014404 requires:
Map West to W.
Map 14 to 014.
Map .7 to 420. Notice the loss of precision (vocabulary 1's original value was 404, not 420).
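The size of that loss can be worked out directly. Going back from 14.7 degrees, the fractional part becomes minutes again, and the difference from the original 40.4 minutes is the round-trip error:

```latex
\[
0.7 \times 60 = 42.0 \ \mathrm{minutes} \;\rightarrow\; \texttt{420},
\qquad
42.0 - 40.4 = 1.6 \ \mathrm{minutes}
\]
```

More generally, rounding to a tenth of a degree can move a value by up to 0.05 degrees, which is 3 minutes.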
One approach to implementing these mappings is to use XSLT: one template that maps vocabulary 1 to vocabulary 2, and a second template that maps vocabulary 2 to vocabulary 1. The templates can be placed in the same XSLT file and distinguished using XSLT modes. To convert vocabulary 1 to vocabulary 2, invoke Saxon on the command line with the -im (initial mode) flag:
java -jar saxon/saxon9he.jar magnetic-variation-vocab1.xml -xsl:transform-magnetic-variation.xsl -o:magnetic-variation-vocab2.xml -im:vocab1_to_vocab2
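For the two mode-specific templates to live in one file, the stylesheet needs a wrapper that declares the xs prefix (used for xs:integer, xs:QName, and so on) and a prefix for any user-defined functions. A minimal sketch of such a wrapper follows. The namespace URI bound to f is a placeholder invented for this example, and the template bodies are reduced to comments.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.org/magvar/functions"
    exclude-result-prefixes="xs f">

    <xsl:output method="xml" indent="yes"/>

    <!-- Selected by running with -im:vocab1_to_vocab2 -->
    <xsl:template match="MAG_VAR" mode="vocab1_to_vocab2">
        <!-- body of the vocabulary 1 to vocabulary 2 mapping goes here -->
    </xsl:template>

    <!-- Selected by running with -im:vocab2_to_vocab1 -->
    <xsl:template match="magneticVariation" mode="vocab2_to_vocab1">
        <!-- body of the vocabulary 2 to vocabulary 1 mapping goes here -->
    </xsl:template>

</xsl:stylesheet>
```

Since neither template is in the default (unnamed) mode, running the stylesheet without -im falls through to the built-in template rules, which simply copy the text content to the output.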
At the bottom of this message is the actual XSLT code to perform the mapping. It works, but I am wondering whether there is a better approach to mapping vocabularies than hand-coding a bunch of template rules. I welcome your thoughts.
Here is XSLT that converts vocabulary 1 to vocabulary 2:
<xsl:template match="MAG_VAR" mode="vocab1_to_vocab2">
    <magneticVariation>
        <!-- normalize-space() tolerates stray whitespace around the value -->
        <xsl:variable name="magVar" select="normalize-space(.)"/>
        <magneticVariationEWT>
            <xsl:choose>
                <xsl:when test="starts-with($magVar, 'W')">West</xsl:when>
                <xsl:when test="starts-with($magVar, 'E')">East</xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="error(xs:QName('MAG_VAR'), 'Invalid first character')"/>
                </xsl:otherwise>
            </xsl:choose>
        </magneticVariationEWT>
        <magneticVariationValue>
            <!-- degrees (yyy) plus tenths of minutes (zzw) div 600 gives decimal degrees -->
            <xsl:variable name="unrounded-value" select="xs:integer(substring($magVar, 2, 3)) + (xs:integer(substring($magVar, 5, 3)) div 600)"/>
            <xsl:variable name="rounded-value" select="round($unrounded-value * 10) div 10"/>
            <!-- '0.0' keeps one digit after the decimal point even for whole degrees (14.0, not 14) -->
            <xsl:value-of select="format-number($rounded-value, '0.0')"/>
        </magneticVariationValue>
    </magneticVariation>
</xsl:template>
Here is XSLT that converts vocabulary 2 to vocabulary 1:
<xsl:template match="magneticVariation" mode="vocab2_to_vocab1">
    <MAG_VAR>
        <xsl:choose>
            <xsl:when test="magneticVariationEWT eq 'West'">W</xsl:when>
            <xsl:when test="magneticVariationEWT eq 'East'">E</xsl:when>
            <xsl:when test="magneticVariationEWT eq 'True'">
                <xsl:value-of select="error(xs:QName('magneticVariation'), 'Vocab1 does not support magVar True')"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="error(xs:QName('magneticVariation'), 'Invalid value for magneticVariationEWT')"/>
            </xsl:otherwise>
        </xsl:choose>
        <!-- whole degrees, zero-padded to three digits (14 becomes 014) -->
        <xsl:variable name="degrees" select="f:make-3-digits(substring-before(magneticVariationValue, '.'))"/>
        <xsl:value-of select="$degrees"/>
        <!-- fractional degrees times 600 gives tenths of minutes (.7 becomes 420) -->
        <xsl:variable name="tenths-degree" select="concat('.', substring-after(magneticVariationValue, '.'))"/>
        <xsl:variable name="minutes" select="xs:decimal($tenths-degree) * 600"/>
        <!-- zero-pad so that, for example, .1 yields 060 rather than 60 -->
        <xsl:value-of select="format-number($minutes, '000')"/>
    </MAG_VAR>
</xsl:template>
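The template above calls f:make-3-digits, whose definition is not included in this message. What follows is only a sketch of what such a function might look like, not necessarily the actual one: it zero-pads a whole number to three digits using format-number. The namespace URI bound to f is a placeholder, and the parameter name is an assumption.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.org/magvar/functions"
    exclude-result-prefixes="xs f">

    <!-- Assumed definition: left-pads a whole number of degrees with zeros,
         so that '14' becomes '014' and '7' becomes '007'. -->
    <xsl:function name="f:make-3-digits" as="xs:string">
        <xsl:param name="value" as="xs:string"/>
        <xsl:sequence select="format-number(xs:integer($value), '000')"/>
    </xsl:function>

</xsl:stylesheet>
```

Note that the cast to xs:integer raises an error for an empty string, which is what substring-before returns when magneticVariationValue has no decimal point. That seems acceptable here, since vocabulary 2 always carries one digit after the point.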
/Roger
_______________________________________________________________________
XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development.
List archive: http://lists.xml.org/archives/xml-dev/