Depends on what you mean by "allowable". If you mean, what characters
will appear in your documents (where you presumably limit the characters
to ASCII characters in some fashion), I believe you're correct. If you
mean what characters an XML Schema-based validator will accept, it would
be all of the Unicode characters that match the NCName production,
regardless of whether they're ASCII or not.
-- Ron
Roger L. Costello wrote:
> Hi Folks,
>
> Suppose that I create XML documents, restricting myself to just using the
> ASCII character set.
>
> And suppose that I declare an element to have the datatype NCName:
>
> <element name="foo" type="NCName"/>
>
> What are the allowable characters for <foo>?
>
> I believe that the answer is: [a-zA-Z_][a-zA-Z0-9.-_]*
>
> Here's how I arrived at my answer:
>
> The production rule for NCName in the Namespaces in XML specification:
>
> NCName ::= (Letter | '_') (NCNameChar)*
>
> NCNameChar ::= Letter | Digit | '.' | '-' | '_' | CombiningChar | Extender
>
> Given that I am just using the ASCII character set,
>
> Letter is a-zA-Z
> Digit is 0 - 9
> CombiningChar and Extender are characters outside the ASCII character
> set (I think)
>
> Do you agree that, given the restriction of using only ASCII characters, the
> set of characters that can be used in <foo> is: [a-zA-Z_][a-zA-Z0-9.-_]*
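
For anyone who wants to check the ASCII-only answer mechanically, here is a minimal Python sketch. It is not part of either message; the names `ASCII_NCNAME`, `AS_WRITTEN`, and `is_ascii_ncname` are made up for illustration. One thing it shows: in most regex dialects, including Python's, `.-_` inside a character class is read as the range from '.' to '_', so the pattern exactly as typed in the question would accept ':' and reject '-'. Putting the hyphen last in the class avoids that.

```python
import re

# ASCII-only subset of NCName: the first character is a letter or '_',
# the rest are letters, digits, '.', '-', or '_'.
# The hyphen comes last in the class so it is taken literally.
ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

# The pattern exactly as typed in the question, kept for comparison.
# Here ".-_" is a range covering U+002E through U+005F.
AS_WRITTEN = re.compile(r"[a-zA-Z_][a-zA-Z0-9.-_]*")


def is_ascii_ncname(value: str) -> bool:
    """Return True if value is an NCName built only from ASCII characters."""
    return ASCII_NCNAME.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["foo", "_foo-bar.1", "1foo", "a:b", "a-b"]:
        print(
            candidate,
            is_ascii_ncname(candidate),
            AS_WRITTEN.fullmatch(candidate) is not None,
        )
```

Note that this only covers the ASCII restriction discussed in the question. As the reply says, a schema validator will also accept non-ASCII names that match the NCName production, which this sketch deliberately rejects.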