Martin Olsson wrote:
>
> --- QUESTION 1
>
> Can *all* attributes be equivalently represented as child tags in
> *all* XML formats? If so, why does DTD define attribute separately?
> In particular, I wonder about the core XML attributes such as
> namespaces etc; are these two lines equivalent (see below)?
>
> <myTag xsl:myPrefix="hello"></myTag>
> <myTag><xsl:myPrefix>hello</xsl:myPrefix></myTag>
Not equivalent at all. Attributes are very different from child elements
in ways that go beyond just the representation and even the obvious
structural differences (e.g., attributes are always unordered). You can
define a grammar that allows both an attribute and a child element with
the same name, if you want to allow both forms in your documents, but
you have to do this explicitly.
>
> --- QUESTION 2
>
> XML files can use different character encodings including UNICODE and
> normal ascii text files. An XML parser must know what encoding is used
> before it starts to process the file, loading a UNICODE file is very
> different from loading a normal text file. The parser can obviously
> not first read the encoding attribute of the XML declaration which is
> the first line of the XML file and then load the file. So is there a
> complete list of possible char encodings that are XML compatible?
> Should the XML parser use a brute force approach and try all of these?
There's a suggested way of handling this in the XML recommendation:
http://www.w3.org/TR/REC-xml/#sec-guessing
Basically, the parser is expected to read the first few bytes of the
document to see whether (1) the character encoding is indicated by
particular byte values (such as a byte order mark), or (2) an XML
declaration is present in one of several possible encoded forms. No
brute-force search through every encoding is needed.
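
A rough sketch of that two-step check in Python follows. The function name
sniff_xml_encoding is my own, and this is a simplification of the approach
described in the recommendation, not a complete implementation: it skips the
EBCDIC case and some unusual byte orders, and falls back to UTF-8 when
nothing else is found.

```python
import codecs
import re

# Step 1a: byte order marks. The 4-byte marks are checked first because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Step 1b: no mark, but the leading "<" (and "?") shows up padded with zero
# bytes in a way that reveals the code unit width and byte order.
_PATTERNS = [
    (b"\x00\x00\x00<", "utf-32-be"),
    (b"<\x00\x00\x00", "utf-32-le"),
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
]

# Step 2: for ASCII-compatible encodings the declaration itself is readable
# as plain bytes, so the encoding name can be pulled out directly.
_DECLARATION = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document from its first bytes."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    for pattern, name in _PATTERNS:
        if data.startswith(pattern):
            return name
    match = _DECLARATION.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # assumed default when there is no mark and no declaration
```

For example, sniff_xml_encoding(b'<?xml version="1.0"
encoding="ISO-8859-1"?><a/>') returns "iso-8859-1", and a document with no
mark and no declaration comes back as "utf-8". A real parser would then
decode the declaration with the guessed encoding and confirm it against the
declared name.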
- Dennis