
   Re: [xml-dev] datatype functionality I'd like to see


Hi,

> Perhaps I am uninformed however, can anyone think of any particular
> schema language one can do this in, and if you are the person who
> knows of such a language can you give me an example if possible.
> (not that it's something I need to do, just something I thought
> would be extremely useful to be able to do at some point)

This was one of the features of the datatype library language that I
have been working on [1]. You could do something like (bearing in mind
I don't know how SSNs actually work):

  <define name="Digit">
    <charGroup><range from="0" to="9" /></charGroup>
  </define>

  <datatype name="SSN">
    <parse>
      <group name="state">
        <repeat exactly="3"><ref name="Digit" /></repeat>
      </group>
      <string>-</string>
      <group name="individual">
        <repeat exactly="2"><ref name="Digit" /></repeat>
        <string>-</string>
        <repeat exactly="4"><ref name="Digit" /></repeat>
      </group>
    </parse>
    ...
  </datatype>

and in the rest of the datatype definition you'd work with a tree
containing <state> and <individual> elements. For example, the SSN
123-12-1234 would become:

  <SSN><state>123</state>-<individual>12-1234</individual></SSN>
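To make the tree-building step concrete, here is a minimal Python sketch that hand-codes the same parse. It is not the datatype library implementation, only an illustration: the helper names (take_digits, take_literal, parse_ssn, ssn_to_xml) are hypothetical, and the 3-2-4 digit layout is simply copied from the example above rather than from any knowledge of how SSNs really work.

```python
import xml.etree.ElementTree as ET

DIGITS = "0123456789"  # stands in for the "Digit" charGroup (assumption: ASCII digits only)


def take_digits(text, pos, count):
    """Consume exactly `count` digits starting at `pos`; return (matched, new_pos)."""
    chunk = text[pos:pos + count]
    if len(chunk) != count or any(ch not in DIGITS for ch in chunk):
        raise ValueError(f"expected {count} digits at offset {pos}")
    return chunk, pos + count


def take_literal(text, pos, literal):
    """Consume the fixed string `literal` at `pos`; return (literal, new_pos)."""
    if not text.startswith(literal, pos):
        raise ValueError(f"expected {literal!r} at offset {pos}")
    return literal, pos + len(literal)


def parse_ssn(text):
    """Build the <SSN> tree with <state> and <individual> children."""
    root = ET.Element("SSN")

    # <group name="state">: three digits
    state_text, pos = take_digits(text, 0, 3)
    state = ET.SubElement(root, "state")
    state.text = state_text

    # The "-" outside any group stays as text between the two elements
    separator, pos = take_literal(text, pos, "-")
    state.tail = separator

    # <group name="individual">: two digits, "-", four digits
    first, pos = take_digits(text, pos, 2)
    inner_separator, pos = take_literal(text, pos, "-")
    second, pos = take_digits(text, pos, 4)
    individual = ET.SubElement(root, "individual")
    individual.text = first + inner_separator + second

    if pos != len(text):
        raise ValueError("unexpected trailing characters")
    return root


def ssn_to_xml(text):
    """Serialize the parsed tree to a string."""
    return ET.tostring(parse_ssn(text), encoding="unicode")
```

Calling ssn_to_xml("123-12-1234") yields the tree shown above; a value that does not fit the pattern raises ValueError instead of producing a partial tree.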

At http://www.jenitennison.com/datatypes/#implementation, there's an
implementation that transforms the datatype library syntax into an
XSLT 2.0 stylesheet that contains a bunch of functions for each
datatype. You could probably do something with Schematron such that
you declare the datatypes in the Schematron schema and then use them
in the test expressions, as long as you were happy using an XSLT 2.0
processor, but I haven't pursued that.
  
I'm currently in the process of revising the language I initially came
up with so that (among other changes) you can just use named
subexpressions within a regular expression; something like:

  <datatype name="SSN">
    <format>
      <regex>(?[state][0-9]{3})-(?[individual][0-9]{2}-[0-9]{4})</regex>
    </format>
    ...
  </datatype>

or use other (extensible) methods for expressing the format of a
value, such as BNF or PEGs or whatever the particular datatype library
processor understands, but it's all work in progress...
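The named-subexpression idea can be tried out today in any regex engine that supports named groups. The sketch below uses Python's re module, whose syntax is (?P<name>...) rather than the (?[name]...) notation above; the names SSN_PATTERN and split_ssn are hypothetical, and this only illustrates the idea, not the revised datatype language.

```python
import re

# Same pattern as the <regex> example, rewritten with Python's named-group syntax.
SSN_PATTERN = re.compile(
    r"(?P<state>[0-9]{3})-(?P<individual>[0-9]{2}-[0-9]{4})"
)


def split_ssn(value):
    """Return the named parts of `value` as a dict, or None if it does not match."""
    match = SSN_PATTERN.fullmatch(value)
    if match is None:
        return None
    return match.groupdict()
```

For the value 123-12-1234, split_ssn returns a dict mapping "state" to "123" and "individual" to "12-1234", which corresponds to the <state> and <individual> elements in the earlier tree.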

Cheers,

Jeni

[1] http://www.jenitennison.com/datatypes/

---
Jeni Tennison
http://www.jenitennison.com/





 
