Re: [xml-dev] are the regular expressions over xml structure?
- From: Michael Kay <mike@saxonica.com>
- To: xml-dev@lists.xml.org
- Date: Mon, 02 Jul 2012 18:42:46 +0100
I'm asking myself the question: is there a do-it-yourself solution to
this one?
I'm thinking of something like
<xsl:for-each-group select="*"
    group-adjacent="f:matches-group('image, caption*, table')">
where f:matches-group() is a user-written function. Given that
group-adjacent partitions the input sequence, we could define it to
return a positive number for a group that matches the pattern and a
negative number for a single element that does not match the pattern.
How would one write such a function?
Here's a sketch of an approach: construct a mapping from distinct
element names appearing in the grouping population to distinct Unicode
characters. Then reduce the pattern to a corresponding regex, say IC*T.
Similarly, construct a string representing the sequence of element names
in the grouping population, say PPPPPICCCTPICCCT. Use analyze-string to
find the matching and non-matching substrings. In the analyze-string
code, construct a list giving the lengths of the successive matching and
non-matching substrings and use this to construct a mapping from nodes
in the population to the group that they belong to; this mapping is then
used to compute the grouping key used by group-adjacent.
Perhaps someone with more time than I have could try implementing it! It
will be a lot easier if you allow yourself to use maps (as implemented
in Saxon 9.4).
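
To make the steps above concrete, here is a minimal sketch of the same
algorithm in Python rather than XSLT, purely as an illustration. The
function name group_keys, the use of Private Use Area characters for
the name mapping, and the supported pattern syntax (names separated by
commas, with the usual *, +, ?, | and parentheses) are all assumptions
of this sketch, not part of any XSLT or Saxon API.

```python
import re

NAME = r"[A-Za-z_][\w.-]*"  # rough approximation of an element name


def group_keys(names, pattern):
    """Return one grouping key per element name in `names`.

    Elements inside a run that matches `pattern` share one positive key;
    every other element gets its own negative key, so adjacent-key
    grouping keeps matched runs together and unmatched elements apart.
    """
    names = list(names)

    # 1. Map each distinct element name to a distinct character.
    #    Private Use Area code points cannot clash with regex metacharacters.
    distinct = dict.fromkeys(names + re.findall(NAME, pattern))
    char_for = {name: chr(0xE000 + i) for i, name in enumerate(distinct)}

    # 2. Reduce the pattern to a regex over those characters
    #    (e.g. "image, caption*, table" becomes the analogue of IC*T).
    regex = re.sub(NAME, lambda m: char_for[m.group()], pattern)
    regex = re.sub(r"[,\s]+", "", regex)

    # 3. Encode the grouping population as a string, one char per element.
    encoded = "".join(char_for[n] for n in names)

    # 4. Walk the matching and non-matching substrings, assigning keys.
    keys, group_no, pos = [], 0, 0
    for m in re.finditer(regex, encoded):
        if m.start() == m.end():
            continue  # ignore empty matches
        for _ in range(m.start() - pos):  # unmatched elements before the match
            group_no += 1
            keys.append(-group_no)
        group_no += 1
        keys.extend([group_no] * (m.end() - m.start()))
        pos = m.end()
    for _ in range(len(encoded) - pos):  # unmatched tail
        group_no += 1
        keys.append(-group_no)
    return keys
```

For the population written above as PPPPPICCCTPICCCT, this yields keys
-1 to -5 for the five leading elements, 6 for the first matched run, -7
for the single element between the runs, and 8 for the second run. In
XSLT the same list of keys would be looked up by node position (or via
a map from node to key) inside the group-adjacent expression.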
Michael Kay
Saxonica
On 02/07/2012 14:35, Michael Kay wrote:
>
>
> On 02/07/2012 13:38, David Lee wrote:
>> I personally would love to see xml level regex in XQuery and XSLT.
>> The examples are quite concise and readable.
>> Just noodling, would it be possible to translate some forms of regex
>> into XPath? Or would XPath (and XSLT match expressions) need deep
>> enhancements?
> The way I've thought about doing this in the past would be some kind
> of generalization of group-starting-with and group-ending-with, to
> something like
>
> group-matching="image, caption*, table"
>
> But it's a lot of machinery addressing a rather specialized
> requirement, so it's hard to see it ever making the cut.
>
> Michael Kay
> Saxonica