AndrewWatt2000@aol.com scripsit:
> So, <xsd:pattern value="\w" /> would match many (unwanted) characters that
> <xsd:pattern value="[A-Za-z0-9_]" /> would reject as non-matching. Correct?
Definitely.
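
To make the difference concrete, here is a rough Python sketch. It assumes the XML Schema definition of \w as "any character except punctuation, separators and 'other' (control, unassigned, etc.)", and approximates that with Unicode general categories, since Python's own \w is defined differently. The helper names are made up for illustration.

```python
import re
import unicodedata

# The explicit ASCII class from the question.
ASCII_WORD = re.compile(r"[A-Za-z0-9_]")


def xsd_word_char(ch: str) -> bool:
    # Approximation of XSD's \w (an assumption, not a validator):
    # accept any character whose general category is not P*, Z* or C*.
    return unicodedata.category(ch)[0] not in "PZC"


def ascii_word_char(ch: str) -> bool:
    # True only for a single character in [A-Za-z0-9_].
    return ASCII_WORD.fullmatch(ch) is not None


# Compare the two notions of "word character" on a few samples.
for ch in ["a", "7", "é", "ß", "Ω", "$"]:
    print(repr(ch), xsd_word_char(ch), ascii_word_char(ch))
```

Under that approximation, accented letters, Greek letters and even currency symbols count as \w characters, while the explicit class accepts only the ASCII letters and digits among the samples.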
> In W3C XML Schema, and therefore in XForms, is it correct that the only way
> to express the notion of an English language / ASCII "word character" in a
> regular expression is using [A-Za-z0-9_]?
Correct.
> Is there any facility to express the notion of, for example, a French word
> character? Or German?
You'd have to concoct a similar character class, and there is always
a measure of controversy about these things. The standard English spellings of
"naïve" and "façade" require letters outside [A-Za-z], and so does
one spelling of "coöperate".
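
As a sketch of what "concocting" such a class might look like, the Python below builds a hypothetical French word-character class. The letter inventory is purely illustrative (it is exactly the sort of list people argue about), and the names FRENCH_EXTRA, FRENCH_WORD and is_word are invented for this example. The same bracket expression could in principle be used as an xsd:pattern value.

```python
import re
import unicodedata

# Hypothetical inventory of accented letters; an assumption for
# illustration, not an authoritative list for French.
FRENCH_EXTRA = "àâæçéèêëîïôœùûüÿÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ"

FRENCH_WORD = re.compile("[A-Za-z0-9_" + FRENCH_EXTRA + "]+")
ASCII_WORD = re.compile(r"[A-Za-z0-9_]+")


def is_word(pattern: re.Pattern, text: str) -> bool:
    # Normalize to NFC so "i" + combining diaeresis is compared
    # as the single precomposed character "ï".
    text = unicodedata.normalize("NFC", text)
    return pattern.fullmatch(text) is not None


for word in ["naïve", "façade", "coöperate"]:
    print(word, is_word(ASCII_WORD, word), is_word(FRENCH_WORD, word))
```

With this particular list, "naïve" and "façade" are accepted by the French class but "coöperate" is not, because "ö" was left out. Whether it belongs is precisely the kind of judgment call involved.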
> Or is the \p{Basic_Latin} the smallest / most precise
> "chunk" of characters that can be used in such a setting?
That certainly doesn't do what you want: it matches any ASCII character
at all (punctuation, spaces and control characters included) and rejects
everything non-ASCII.
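
A small Python sketch of why the block is too coarse. Python's standard re module has no \p{...} escapes, so the Basic Latin block (U+0000 through U+007F) is written out as an explicit range here; in XML Schema the block escape is spelled \p{IsBasicLatin}. The helper name is made up for illustration.

```python
import re

# The Basic Latin block, U+0000..U+007F, spelled as an explicit range
# because Python's re module does not support \p{...} block escapes.
BASIC_LATIN = re.compile(r"[\x00-\x7F]")


def in_basic_latin(ch: str) -> bool:
    # True for any single ASCII character, word-like or not.
    return BASIC_LATIN.fullmatch(ch) is not None


for ch in ["a", "!", " ", "\t", "é"]:
    print(repr(ch), in_basic_latin(ch))
```

The block accepts "!", a space and a tab just as readily as "a", so it is far broader than a word-character class, and it still rejects "é".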
--
We call nothing profound                jcowan@reutershealth.com
that is not wittily expressed.          John Cowan
        --Northrop Frye (improved)      http://www.reutershealth.com