XML.org
XML start tags are wicked complicated

Hi Folks,

XML start tags have a simple structure, right?

Wrong!

Here are some of the forms a start tag can take (counting empty-element tags, which end in "/>"):

'<' tag-name '>'
'<' tag-name "/>"
'<' tag-name WSP '>'
'<' tag-name WSP "/>"
'<' tag-name WSP attribute-name '=' "value" '>'
'<' tag-name WSP attribute-name WSP '=' "value" '>'
'<' tag-name WSP attribute-name '=' WSP "value" '>'
'<' tag-name WSP attribute-name WSP '=' WSP "value" '>'
'<' tag-name WSP attribute-name WSP '=' WSP "value" WSP '>'
... a lot more ...
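
To make the token sequences above concrete, here is a minimal Python sketch of a tokenizer that splits a start tag into those pieces. The token kind names (LT, NAME, WSP, EQ, VALUE, GT, EMPTY_END) and the helper names are made up for illustration, and the patterns are simplifying assumptions (ASCII names, double-quoted values only), not the full XML grammar.

```python
import re

# Token kinds mirror the notation in the list above. The patterns are
# simplified assumptions: ASCII names and double-quoted values only.
TOKEN_SPEC = [
    ("LT", r"<"),
    ("EMPTY_END", r"/>"),
    ("GT", r">"),
    ("EQ", r"="),
    ("WSP", r"[ \t\r\n]+"),
    ("VALUE", r'"[^<"]*"'),
    ("NAME", r"[A-Za-z_:][A-Za-z0-9_:.\-]*"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC)
)


def tokenize(tag):
    """Split one start tag into (kind, text) pairs; raise on unexpected input."""
    tokens = []
    pos = 0
    while pos < len(tag):
        match = TOKEN_RE.match(tag, pos)
        if match is None:
            raise ValueError(f"unexpected character {tag[pos]!r} at offset {pos}")
        # lastgroup is the name of the one alternative that matched.
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def kinds(tag):
    """Return only the token kinds, which is all the forms above distinguish."""
    return [kind for kind, _ in tokenize(tag)]


if __name__ == "__main__":
    for sample in ['<a>', '<a />', '<a b="1">', '<a b = "1" >']:
        print(sample, "->", " ".join(kinds(sample)))
```

Each sample in the final loop corresponds to one of the forms listed above; the only thing that changes from one form to the next is where the WSP tokens appear.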

Now, let's play parser: we are scanning and encounter these items:
...    '<' 
   ...     tag-name 
      ...      WSP 
         ...       attribute/value pair 
            ...       WSP  

Trouble!

What does the WSP (WSP = whitespace) signify? Does it signify:

(a) Space between the first attribute and a second attribute? E.g. WSP attribute-name '=' "value"
(b) Space just prior to the end angle bracket? I.e., WSP '>'

The only way to know the answer is to look ahead beyond the WSP to see what token comes next. But that means looking two tokens ahead (the WSP itself plus the token after it), and a parser with two-token lookahead is more powerful than a parser with one-token lookahead.
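
The decision described above can be sketched in Python as a small hand-written parser over a list of token kinds. This is an illustration under stated assumptions, not a real XML parser: the token kind names (LT, NAME, WSP, EQ, VALUE, GT, EMPTY_END) and the function parse_start_tag are hypothetical, and the input is assumed to be already tokenized. The point of interest is the loop condition, which checks both the WSP and the token after it.

```python
def parse_start_tag(kinds):
    """Parse a list of token kinds; return (attribute_count, is_empty_element).

    The kinds LT, NAME, WSP, EQ, VALUE, GT and EMPTY_END are hypothetical
    names for the post's '<', tag-name, WSP, '=', "value", '>' and "/>".
    """
    pos = 0

    def peek(offset=0):
        # Look at a token without consuming it; None means end of input.
        index = pos + offset
        return kinds[index] if index < len(kinds) else None

    def expect(kind):
        nonlocal pos
        if peek() != kind:
            raise ValueError(f"expected {kind}, found {peek()} at token {pos}")
        pos += 1

    def skip_optional_wsp():
        nonlocal pos
        if peek() == "WSP":
            pos += 1

    expect("LT")
    expect("NAME")
    attributes = 0
    # The ambiguous spot: a WSP may introduce another attribute (case a) or
    # merely precede the closing bracket (case b). Only the token after the
    # WSP, at offset 1, tells the two cases apart.
    while peek() == "WSP" and peek(1) == "NAME":
        expect("WSP")
        expect("NAME")
        skip_optional_wsp()
        expect("EQ")
        skip_optional_wsp()
        expect("VALUE")
        attributes += 1
    skip_optional_wsp()
    if peek() == "EMPTY_END":
        expect("EMPTY_END")
        is_empty = True
    else:
        expect("GT")
        is_empty = False
    if peek() is not None:
        raise ValueError(f"unexpected {peek()} after end of tag")
    return attributes, is_empty


if __name__ == "__main__":
    # '<' tag-name WSP attribute-name '=' "value" WSP '>'
    print(parse_start_tag(
        ["LT", "NAME", "WSP", "NAME", "EQ", "VALUE", "WSP", "GT"]
    ))
```

In the sample call, the second WSP is followed by GT rather than NAME, so the loop stops and the WSP is treated as case (b).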

So the next time someone tells you that the structure of an XML start tag is simple, tell 'em it ain't so!

/Roger



Copyright 1993-2007 XML.org. This site is hosted by OASIS