XML.org: OASIS Mailing List Archives

Re: [xml-dev] [ Revised ] 15 elementary truths about XML

On 01/11/2011 14:33, Costello, Roger L. wrote:
> David Carlisle wrote:
>
>> [XML] Documents consist of characters not bytes
>
> If an XML processor processes characters, not bytes, then what
> software takes the bytes in a file and generates characters?

As I said, the XML processor must be able to handle entities encoded in 
UTF-8 and UTF-16, and those entities will be sequences of bytes, but it may 
handle other encodings as well. In any event, the "XML document"
is the sequence of characters resulting from decoding the entity.
There may never have been an encoded entity at all: for example, if you 
call an XML parser from XSLT (saxon:parse or equivalent), you pass it a 
stream of characters, not a stream of bytes in some encoding.
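
To make the point concrete, here is a minimal sketch in Python (my choice of language for illustration; the thread itself names only XSLT and Java). It assumes the standard library's xml.etree.ElementTree, whose fromstring() accepts either a character string or encoded bytes. The sample document and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# The document as a sequence of characters; no encoding is involved yet.
doc_chars = "<greeting>caf\u00e9</greeting>"

# Two encoded entities carrying the same document as different bytes.
utf8_entity = doc_chars.encode("utf-8")
utf16_entity = doc_chars.encode("utf-16")  # Python's codec prepends a BOM

# The parser is handed characters in the first call and bytes in the others.
from_chars = ET.fromstring(doc_chars)
from_utf8 = ET.fromstring(utf8_entity)
from_utf16 = ET.fromstring(utf16_entity)

# All three routes should yield the same character content.
texts = [from_chars.text, from_utf8.text, from_utf16.text]
same_document = all(t == "caf\u00e9" for t in texts)
different_bytes = utf8_entity != utf16_entity
```

The byte sequences differ, yet the parsed result is the same in each case, which is the sense in which the document is the characters rather than any particular encoding of them.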

>
> Are you saying that an XML processor builds on top of another piece
> of software (which converts bytes to characters)?

Well, the XML processor needs to include both components, but the 
byte-to-character component is only used if the entity is presented as 
a stream of bytes in some encoding.
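
A rough sketch of that two-component arrangement, again in Python and under loud assumptions: decode_entity and parse_entity are hypothetical names invented here, and the decoder only sniffs a byte order mark, whereas a real processor would also honour the encoding declaration and any external encoding information.

```python
import codecs
import xml.etree.ElementTree as ET


def decode_entity(data: bytes) -> str:
    """Hypothetical byte-to-character component (BOM sniffing only)."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The utf-16 codec reads the BOM to pick endianness and drops it.
        return data.decode("utf-16")
    # Otherwise assume UTF-8; utf-8-sig strips a UTF-8 BOM if present.
    return data.decode("utf-8-sig")


def parse_entity(entity):
    """Hypothetical front end: decode only when the input is bytes."""
    if isinstance(entity, bytes):
        text = decode_entity(entity)
    else:
        text = entity  # already characters, so the decoder is skipped
    return ET.fromstring(text)


# Same document, presented as characters and as encoded bytes.
as_chars = parse_entity("<a>\u00e9</a>")
as_utf8 = parse_entity("<a>\u00e9</a>".encode("utf-8"))
as_utf16 = parse_entity("<a>\u00e9</a>".encode("utf-16"))
```

When the caller already holds characters, the decoding step is never reached, which mirrors the case of calling a parser from XSLT.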

>
> How is a character presented to an XML processor if it is not
> presented as one or more bytes?
Whatever. (See the example above.) The "sequence of characters" may of 
course actually be a sequence of bytes in some predetermined encoding 
(such as UTF-16 if it's Java), but that's an implementation detail about 
which you shouldn't care, and it certainly isn't an elementary truth.

>
> Perhaps I should use the word "file" rather than "document"? For
> example:  The contents of an XML file is a sequence of zeros and ones
> called bits.

No! If you want elementary truths about XML, best to stick to the 
vocabulary used in the XML spec. An XML document may not exist in a file 
at all; it may just exist in an in-memory stream/pipe, for example.
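
As an illustration of a document that never touches a file, the following Python sketch parses from an in-memory character stream. It assumes the standard library's io.StringIO and xml.etree.ElementTree.parse, which accepts any object with a read() method; the order document is invented for the example.

```python
import io
import xml.etree.ElementTree as ET

# An in-memory stream of characters; no file and no byte encoding exist.
stream = io.StringIO("<order id='42'><item>widget</item></order>")

# parse() reads from the stream object just as it would from a file.
tree = ET.parse(stream)
root = tree.getroot()

order_id = root.get("id")      # attribute value of the root element
item = root.findtext("item")   # text content of the <item> child
```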
>
> /Roger

David


