Re: [xml-dev] Why is < illegal in an attribute value but the equivalent hex and decimal character entities are legal?

On 17/03/2022 11:25, Roger L Costello wrote:
> So this is perfectly well-formed XML:
>
> <Test foo="&#x3C;x>blah&#x3C;/x>"/>
>
> And the numeric character references will be replaced during the parsing process to yield this:
>
> <Test foo="<x>blah</x>"/>

I'd say that parsing <Test foo="&#x3C;x>blah&#x3C;/x>"/> yields an attribute named "foo" with a value of "<x>blah</x>".

It's not creating an alternate piece of XML that is then parsed again.
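One way to check this is with an off-the-shelf parser. The sketch below uses Python's standard xml.etree.ElementTree purely as an illustration (the choice of parser is my assumption, nothing in the thread depends on it). It shows that the attribute comes back as a plain string, that no child element "x" is created, and that the literal form is rejected, which it could not be if the parser simply substituted the references and parsed the result again.

```python
import xml.etree.ElementTree as ET

# The well-formed document from the thread.
doc = '<Test foo="&#x3C;x>blah&#x3C;/x>"/>'
elem = ET.fromstring(doc)

# The attribute value is a plain string; the references have been resolved.
value = elem.get("foo")
print(repr(value))

# No child elements: the resolved "<x>" was never treated as markup.
child_count = len(elem)
print(child_count)

# The literal form is not well-formed, so it cannot be what the parser
# "really" sees after substitution.
literal = '<Test foo="<x>blah</x>"/>'
try:
    ET.fromstring(literal)
    literal_rejected = False
except ET.ParseError:
    literal_rejected = True
print(literal_rejected)
```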

The sequence is more like:

- Low level XML tokeniser reads attribute name "foo"

- Low level parser checks it's followed by "=" and quotes

- Low level parser reads until it finds end quote, getting "&#x3C;x>blah&#x3C;/x>"

- Internal logic converts "&#x3C;x>blah&#x3C;/x>" to "<x>blah</x>"

- Internal logic creates a data record for an attribute of name "foo" with value "<x>blah</x>" and associates it with the element "Test".
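The steps above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any particular parser is implemented: read_attribute and decode_char_refs are hypothetical names, and the sketch ignores whitespace around "=", named entities such as &lt;, and attribute-value normalisation. The point it tries to make is that the decoded text goes straight into the attribute record and is never scanned for markup again.

```python
import re

# Simplified attribute-name pattern (an assumption; real XML names allow more).
NAME = re.compile(r"[A-Za-z_][\w.\-]*")
# Numeric character references: &#xHHHH; or &#DDDD;
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def decode_char_refs(raw):
    """Replace numeric character references in one pass; the output is not rescanned."""
    def repl(m):
        hex_digits, dec_digits = m.group(1), m.group(2)
        return chr(int(hex_digits, 16)) if hex_digits else chr(int(dec_digits))
    return CHAR_REF.sub(repl, raw)


def read_attribute(src, pos=0):
    """Read one name="value" pair starting at src[pos]; return (name, value, new_pos)."""
    # Tokeniser reads the attribute name.
    m = NAME.match(src, pos)
    if not m:
        raise ValueError("expected attribute name")
    name, pos = m.group(0), m.end()

    # Check it is followed by "=" and an opening quote.
    if src[pos:pos + 1] != "=" or src[pos + 1:pos + 2] not in ('"', "'"):
        raise ValueError('expected = and a quote')
    quote = src[pos + 1]

    # Read until the matching end quote (raises ValueError if there is none).
    end = src.index(quote, pos + 2)
    raw = src[pos + 2:end]
    if "<" in raw:
        # This is the only place a literal "<" is rejected: in the raw text.
        raise ValueError("literal '<' is not allowed in an attribute value")

    # Internal logic converts the references to characters.
    value = decode_char_refs(raw)
    return name, value, end + 1


# Create the attribute record and associate it with the element "Test".
name, value, _ = read_attribute('foo="&#x3C;x>blah&#x3C;/x>"')
element = {"name": "Test", "attributes": {name: value}}
print(element)
```

Note that the check for "<" happens on the raw text between the quotes, before the references are decoded, so the "<" characters produced by decoding are just data.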

Regards,

Pete.
--
---------------------------------------------------------------------
Pete Cordell
Codalogic Ltd
Read & write XML in C++, http://www.xml2cpp.com
---------------------------------------------------------------------

