
   Re: [xml-dev] limits of the generic


> Hi Jeni,
> 
> On Tue, 2002-10-01 at 00:52, Jeni Tennison wrote:
> 
> > >> <dt:datatype name="my:UKDate">
> > >> <!-- a date in the format DD/MM/YYYY -->
> > >> 
> > >> <dt:components>
> > >>   <dt:component name="day" select="substring(., 1, 2)" />
> > >>   <dt:component name="month" select="substring(., 4, 2)" />
> > >>   <dt:component name="year" select="substring(., 7, 4)" />
> > >> </dt:components>
> 
> If you'll allow the plug, this looks rather parallel to what I am proposing
> with xvif (see for instance in my strawman [1] the test cases splitting
> dates as a list of tokens [2] or elements [3] which you can also try
> online).

This is true.  I guess if you put it in that light, I can consider it with a 
more friendly eye.  I know that XVIF has been designed from the beginning to 
support generic lexical processing.  I guess that is what Jeni has been 
trying to do as well, but her example looked as if it were couched in terms 
of defining a set of operations and lexical mappings tailored to WXS types.  
Perhaps I was too hasty in that judgment.

So, starting afresh on this idea, and expressing it in XVIF, which has the 
advantage of a handy implementation right now, the basic lexical 
interpretation of dates would look like (partly stolen from Eric's doc):

<dt:components xmlns="http://relaxng.org/ns/structure/1.0"
xmlns:if="http://namespaces.xmlschemata.org/xvif/iframe"
datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
 <dt:lexical-preprocess>
   <if:transform type="http://simonstl.com/ns/fragments/">
    <if:apply>
     <fragmentRules xmlns="http://simonstl.com/ns/fragments/">
      <fragmentRule pattern="^[ \t\n]*([0-9]{4})-([0-9]{2})-([0-9]{2})[ \t\n]*$">
       <applyTo>
        <element localName="date"/>
       </applyTo>
       <produce>
        <element localName="year"/>
        <element localName="month"/>
        <element localName="day"/>
       </produce>
      </fragmentRule>
     </fragmentRules>
    </if:apply>
   </if:transform> 
 </dt:lexical-preprocess>
</dt:components>


So this uses regular fragmentations to turn

<spam>2002-10-01</spam>

into

<spam>
  <year>2002</year>
  <month>10</month>
  <day>01</day>
</spam>

Yes, we could use some XPath mechanism to do this as well, as Eric's straw man 
also demonstrates, but I wanted to use the mechanism that appeals to me best.
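
For readers who want to see the mechanics outside XVIF, here is a rough Python sketch of the same fragmentation step. It assumes the rule's three capture groups map in order onto the produced element names. The helper name `fragment` and the use of ElementTree are purely illustrative and are not part of XVIF or Regular Fragmentations.

```python
import re
import xml.etree.ElementTree as ET

# Same pattern as the fragmentRule above: optional whitespace around YYYY-MM-DD.
DATE_PATTERN = re.compile(r"^[ \t\n]*([0-9]{4})-([0-9]{2})-([0-9]{2})[ \t\n]*$")

# Element names from the <produce> section, in capture-group order.
PRODUCED = ("year", "month", "day")


def fragment(element):
    """Replace the text of `element` with one child per capture group.

    Hypothetical helper: returns True if the pattern matched, otherwise
    False, in which case the element is left untouched.
    """
    match = DATE_PATTERN.match(element.text or "")
    if match is None:
        return False
    element.text = None
    for name, value in zip(PRODUCED, match.groups()):
        child = ET.SubElement(element, name)
        child.text = value
    return True


spam = ET.fromstring("<spam>2002-10-01</spam>")
fragment(spam)
print(ET.tostring(spam, encoding="unicode"))
```

The point is only that the lexical step is a plain regular-expression split; nothing in it knows that the three pieces happen to be a date.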

People can also put whatever they want into the dt:lexical-preprocess element, 
including a pipe that defines some other transforms and perhaps a validation 
step.

So now we have the lexical representation mapped to a set of sub-elements that 
can be very readily manipulated by XPath for value-space declarations.

What we'd need to add is some mechanism for such declarations, mapping them to 
operators.  Maybe:

<dt:components xmlns="http://relaxng.org/ns/structure/1.0"
xmlns:if="http://namespaces.xmlschemata.org/xvif/iframe"
dtl="http://www.w3.org/2001/XMLSchema-datatypes">
 <dt:lexical-preprocess>
   <if:transform type="http://simonstl.com/ns/fragments/">
    <if:apply>
     <fragmentRules xmlns="http://simonstl.com/ns/fragments/">
      <fragmentRule pattern="^[ \t\n]*([0-9]{4})-([0-9]{2})-([0-9]{2})[ \t\n]*$">
       <applyTo>
        <element localName="date"/>
       </applyTo>
       <produce>
        <element localName="year"/>
        <element localName="month"/>
        <element localName="day"/>
       </produce>
      </fragmentRule>
     </fragmentRules>
    </if:apply>
   </if:transform> 
 </dt:lexical-preprocess>
 <dt:operator symbol="=">
   <dt:result>
     <if:transform type="http://www.w3.org/TR/xpath"
       apply="$lhs/year = $rhs/year and $lhs/month = $rhs/month and $lhs/day = $rhs/day"/>
   </dt:result>
 </dt:operator>
 <!-- silly function example -->
 <dt:function name="dtl:date-in-us-format">
   <dt:result>
     <if:transform type="http://www.w3.org/TR/xpath"
       apply="concat(month,'/',day,'/',year)"/>
   </dt:result>
 </dt:function>
</dt:components>
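
To make the intent of the operator and function declarations concrete, here is a small Python sketch of what they would compute over the fragmented elements. It is only an approximation under stated assumptions: `date_equals` and `date_in_us_format` are hypothetical names, ElementTree stands in for an XPath engine, and components are compared as strings, which is how XPath 1.0 compares two node-sets with `=`.

```python
import xml.etree.ElementTree as ET

COMPONENTS = ("year", "month", "day")


def date_equals(lhs, rhs):
    """Rough mirror of the "=" operator: all three components must match.

    A missing component makes the comparison false, as an empty
    node-set would in XPath.
    """
    for name in COMPONENTS:
        left, right = lhs.findtext(name), rhs.findtext(name)
        if left is None or right is None or left != right:
            return False
    return True


def date_in_us_format(date):
    """Rough mirror of concat(month,'/',day,'/',year)."""
    return "/".join(date.findtext(name, default="")
                    for name in ("month", "day", "year"))


a = ET.fromstring("<spam><year>2002</year><month>10</month><day>01</day></spam>")
b = ET.fromstring("<spam><year>2002</year><month>10</month><day>01</day></spam>")
c = ET.fromstring("<spam><year>2002</year><month>10</month><day>02</day></spam>")

print(date_equals(a, b))
print(date_equals(a, c))
print(date_in_us_format(a))
```

Both functions see only the child elements produced by the lexical step, which is the layering being argued for here.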


Very intriguing idea, I guess, after all.  Naturally, optimized 
implementations would not have to use all the above binding info and can just 
jump straight to the optimized code, similar to functions that implement EXSLT 
extensions natively and do not then have to run the exsl:function version, 
though they can always fall back to that for classes they do not support.
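
The fallback arrangement described above might look roughly like this in Python. Everything here is an assumption made for illustration: the `GENERIC` and `NATIVE` tables, the `resolve` and `call` helpers, and the modelling of date components as a plain dict. None of it comes from XVIF or EXSLT.

```python
# Generic definitions, standing in for what the dt:function declarations say.
# Components are modelled as a plain dict (an assumption of this sketch).
GENERIC = {
    "dtl:date-in-us-format":
        lambda c: "/".join((c["month"], c["day"], c["year"])),
}

# Hand-optimized implementations a particular processor happens to ship with.
NATIVE = {}


def resolve(name):
    """Return (kind, implementation), preferring a native implementation."""
    if name in NATIVE:
        return "native", NATIVE[name]
    return "generic", GENERIC[name]


def call(name, components):
    """Invoke whichever implementation resolve() selects."""
    _, impl = resolve(name)
    return impl(components)


date = {"year": "2002", "month": "10", "day": "01"}
print(resolve("dtl:date-in-us-format")[0], call("dtl:date-in-us-format", date))

# Registering a native version changes which code runs, not the result.
NATIVE["dtl:date-in-us-format"] = lambda c: f"{c['month']}/{c['day']}/{c['year']}"
print(resolve("dtl:date-in-us-format")[0], call("dtl:date-in-us-format", date))
```

A processor that knows nothing about a given type still gets correct behavior from the declared definition, just more slowly.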

This answers only one of my complaints: generic dispatch and constraint 
processing.  That is a big bone of contention, so I'd be happy for such a 
solution, but the fact is that it still introduces a lot of complexity, which 
is worrisome.

However, if it is inevitable that the data types juggernaut must have its 
stone of flesh in the end, I would rather a mechanism such as the above 
allowed others to put their own data types on an even keel, and also allowed 
other forms of axiomatic processing besides data types.  I also like that it 
expresses value space operations as simple transforms on the plain lexical 
information, which is an important assertion of layering.


> I am currently trying to define the basic building blocks of xvif, but
> at a later stage, dt:component(s) could be good candidates as shortcuts.
> 
> Xvif doesn't support a test like you've shown, but I am wondering if it
> shouldn't support variables, in which case this kind of test should be
> possible to express...

XVIF would need some additional facilities to support this, including variable 
bindings.  But none of them look like more work than what's already there.


> And of course, as already mentioned, there is no reason why xvif
> couldn't be used either standalone or in host languages other than Relax
> NG, in which case XSLT would be the number one candidate!

Indeed.  As I mentioned to you the other day, I think all XVIF needs is a 
binding for results of the if:pipe in terms of all the XPath data types, not 
just boolean as currently defined.  Validations would then tend to map to 
boolean, transforms to node sets or strings, etc.  Again, this shouldn't be 
too terribly hard compared to what has gone before.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Apache 2.0 API - http://www-106.ibm.com/developerworks/linux/library/l-apache/
Python&XML column: Tour of Python/XML - http://www.xml.com/pub/a/2002/09/18/py.html
Python/Web Services column: xmlrpclib - http://www-106.ibm.com/developerworks/webservices/library/ws-pyth10.html 



Copyright 2001 XML.org. This site is hosted by OASIS