OASIS Mailing List Archives

Re: [xml-dev] are the regular expressions over xml structure?

I'm asking myself the question: is there a do-it-yourself solution to 
this one?

I'm thinking of something like

<xsl:for-each-group select="*"
    group-adjacent="f:matches-group('image, caption*, table')">

where f:matches-group() is a user-written function. Given that 
group-adjacent partitions the input sequence, we could define it to 
return a positive number for a group that matches the pattern and a 
negative number for a single element that does not match the pattern. 
How would one write such a function?
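
To make the key convention concrete, here is a minimal Python sketch of how adjacent equal keys partition a sequence. This is an illustration only, not XSLT: `partition_adjacent` is a hypothetical helper standing in for group-adjacent, and the element names and keys are made-up sample data. It assumes each non-matching element gets its own distinct negative key, so that two adjacent non-matching elements do not merge into one group.

```python
from itertools import groupby


def partition_adjacent(items, keys):
    """Split items into runs of adjacent items that share the same key."""
    paired = zip(items, keys)
    return [[item for item, _ in run]
            for _, run in groupby(paired, key=lambda pair: pair[1])]


# Hypothetical sample data: one matching group (key 1) surrounded by
# non-matching elements, each with its own negative key.
names = ["para", "para", "image", "caption", "table", "para"]
keys = [-1, -2, 1, 1, 1, -3]

groups = partition_adjacent(names, keys)
```

Had both leading `para` elements been given the same key, they would have ended up in one group, which is why the negative keys are assumed to be distinct.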

Here's a sketch of an approach: construct a mapping from distinct 
element names appearing in the grouping population to distinct Unicode 
characters. Then reduce the pattern to a corresponding regex, say IC*T. 
Similarly, construct a string representing the sequence of element names 
in the grouping population, say PPPPPICCCTPICCCT. Use analyze-string to 
find the matching and non-matching substrings. In the analyze-string 
code, construct a list giving the lengths of the successive matching and 
non-matching substrings and use this to construct a mapping from nodes 
in the population to the group that they belong to; this mapping is then 
used to compute the grouping key used by group-adjacent.
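
The approach above might be sketched as follows. This is Python rather than XSLT, with `re.finditer` standing in for analyze-string, so treat it as an outline of the algorithm and not as a stylesheet implementation. The function name `matches_group_keys`, the comma-separated pattern syntax with optional `*`, `+` or `?` suffixes, and the use of Private Use Area characters for the name mapping are all assumptions made for the illustration.

```python
import re


def matches_group_keys(names, pattern):
    """Return one grouping key per element name.

    Elements in the n-th run matching the pattern get key n (positive);
    every other element gets its own distinct negative key.
    """
    # Parse e.g. 'image, caption*, table' into (name, occurrence) pairs.
    items = []
    for token in (t.strip() for t in pattern.split(",")):
        if not token:
            continue
        if token[-1] in "*+?":
            items.append((token[:-1], token[-1]))
        else:
            items.append((token, ""))

    # Map each distinct name to a distinct character. Private Use Area
    # characters are used so that none is a regex metacharacter.
    code = {}
    for name in list(names) + [n for n, _ in items]:
        code.setdefault(name, chr(0xE000 + len(code)))

    regex = re.compile("".join(code[n] + occ for n, occ in items))
    text = "".join(code[n] for n in names)

    keys = []
    pos = group_no = single_no = 0

    def add_singles(count):
        nonlocal single_no
        for _ in range(count):
            single_no += 1
            keys.append(-single_no)

    for m in regex.finditer(text):
        if m.end() == m.start():
            continue  # ignore empty matches (e.g. a pattern like 'caption*')
        add_singles(m.start() - pos)       # non-matching stretch before the match
        group_no += 1
        keys.extend([group_no] * (m.end() - m.start()))
        pos = m.end()
    add_singles(len(text) - pos)           # non-matching tail
    return keys


# The population PPPPPICCCTPICCCT from the text, spelled out as names.
letters = {"P": "para", "I": "image", "C": "caption", "T": "table"}
population = [letters[c] for c in "PPPPPICCCTPICCCT"]
keys = matches_group_keys(population, "image, caption*, table")
```

Because one character stands for one element, the length of each matching or non-matching substring is exactly the number of elements it covers, which is what lets the substring lengths be turned back into per-node keys.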

Perhaps someone with more time than I have could try implementing it! It 
will be a lot easier if you allow yourself to use maps (as implemented 
in Saxon 9.4).

Michael Kay

On 02/07/2012 14:35, Michael Kay wrote:
> On 02/07/2012 13:38, David Lee wrote:
>> I personally would love to see xml level regex in XQuery and XSLT.
>> The examples are quite concise and readable.
>> Just noodling, would it be possible to translate some forms of regex 
>> into XPath ?  Or would XPath (and XSLT match expressions) need deep 
>> enhancements.
> The way I've thought about doing this in the past would be some kind 
> of generalization of group-starting-with and group-ending-with, to 
> something like
> group-matching="image, caption*, table"
> But it's a lot of machinery addressing a rather specialized 
> requirement, so it's hard to see it ever making the cut.
> Michael Kay
> Saxonica
