1. Define the term “lexical structure”.
2. Is the set of characters of a parsed XML document always identical to the set of characters in which the XML document was written?
3. Can you change the interpretation of a sequence of characters in an XML document?

Scroll down to see the answers …

1. The lexical structure of the XML language is the set of characters that may appear in an XML file and the rules by which they are collected into lexical units, or tokens.
2. Source versus parsed character sets: The set of characters of a parsed XML document is not necessarily the same as the one in which the XML document is written. Example: This XML document:
   <Test>&apos;</Test>
   uses this character set: < T e s t > & a p o ; /
   The set of characters of the parsed XML document is this: < T e s t > ' /
3. The syntax of XML’s escape characters represents a state-independent encoding for a character: the & ; pair changes the interpretation of its embedded string to form a single character. Example: &apos; encodes ' and the encoding is independent of what came before it.
If we go a layer lower, to the Unicode encoding of characters (UTF-8, for example), we find
a state-dependent encoding of bytes: the interpretation of a byte may depend on the byte(s) before it.
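
Answers 2 and 3 can be checked with a short script. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser and UTF-8 as the byte encoding; the variable names are illustrative and not part of the quiz. It compares the characters of the document as written with those delivered by the parser, then shows the same byte taking on a different meaning depending on the byte before it.

```python
import xml.etree.ElementTree as ET

# The document as written, with the apostrophe escaped as &apos;
source = "<Test>&apos;</Test>"

# Character set of the source text.
source_chars = set(source)

# Parsing resolves the escape: the element's text is a single apostrophe.
root = ET.fromstring(source)
parsed_text = root.text

# Character set of the parsed document: the markup characters plus the
# resolved character data (assumption: markup is counted as in the answer).
parsed_chars = set("<Test>") | set(parsed_text) | set("</Test>")

# Characters that occur only in the source, and only in the parsed form.
only_in_source = sorted(source_chars - parsed_chars)
only_in_parsed = sorted(parsed_chars - source_chars)
print(only_in_source)  # ['&', ';', 'a', 'o', 'p']
print(only_in_parsed)  # ["'"]

# One layer lower (UTF-8): the byte 0xA9 has no fixed meaning of its own.
# Its interpretation depends on the byte that precedes it.
copyright_sign = b"\xc2\xa9".decode("utf-8")  # U+00A9
e_acute = b"\xc3\xa9".decode("utf-8")         # U+00E9
print(copyright_sign, e_acute)
```

The escape &apos; always resolves to the same character wherever it appears, while the trailing byte 0xA9 yields two different characters depending on its lead byte.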