Re: [xml-dev] XML Redux

On Tue, 2011-02-15 at 21:21 +0000, Stephen Green wrote:
> I kind of like this idea of a parser 'reverting
> to character data' and what is the alternative
> if you need to remove ampersands from
> some XML (which got in there because no
> code was put in place to stop ampersands
> or less-thans being included in input in a
> web form and the input was subsequently
> turned into XML and stored as text - quite
> a common scenario of course). 

Flog the developer with a wet fish.

We (W3C) had a request to allow NUL in XML 1.1 because of some people
who had garbage in databases like this, and it would be easier for it
"not to be an error."  At some point it's going to be an error.  Do you
want the customer to telephone when they get little rectangles on the
screen, do you want the airline pilot to radio home when the navigation
systems fail?
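
One way to make sure it is an error early, rather than on the customer's screen, is to check input before it ever reaches a database or a document. The following is only a rough sketch: the function name is made up for illustration, and it looks only at the C0 control range (XML 1.0 permits tab, line feed and carriage return there, and nothing else below 0x20). It does not validate UTF-8 or the other excluded code points.

```c
#include <stddef.h>

/* Hypothetical helper, not part of any XML library.
 * Returns the offset of the first byte that XML 1.0 forbids as a C0
 * control character (below 0x20 and not tab, LF or CR), or -1 if the
 * buffer contains none. NUL (0x00) is among the bytes rejected. */
long first_forbidden_control(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D)
            return (long)i;
    }
    return -1;
}
```

A form handler could call this on each field and refuse the input, or log it, when the result is not -1.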

I also see all too often
    printf("<%s><![CDATA[%s]]></%s>\n", elem, theData, elem);
which produces
   <e><![CDATA[Mr. Green]]></e>
without ever checking whether "theData" contains ]]>...
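
If CDATA really is wanted, the usual workaround is to split any "]]>" in the data across two CDATA sections, so the terminator never appears inside one. A minimal sketch, assuming a made-up helper called cdata_wrap() that returns a malloc'd string the caller must free:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: wrap `data` in CDATA, splitting every "]]>" so
 * that "]]" ends one section and ">" begins the next.
 * Returns a newly allocated string, or NULL if malloc fails. */
char *cdata_wrap(const char *data)
{
    static const char cd_open[]  = "<![CDATA[";
    static const char cd_close[] = "]]>";
    static const char cd_split[] = "]]><![CDATA[";
    const char *p, *end;
    size_t k = 0;

    /* Count the terminators so the output size is known up front. */
    for (p = data; (p = strstr(p, "]]>")) != NULL; p += 3)
        k++;

    size_t len = (sizeof cd_open - 1) + strlen(data)
               + k * (sizeof cd_split - 1) + (sizeof cd_close - 1);
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    char *q = out;
    memcpy(q, cd_open, sizeof cd_open - 1);
    q += sizeof cd_open - 1;
    while ((end = strstr(data, "]]>")) != NULL) {
        size_t n = (size_t)(end - data) + 2;   /* up to and including "]]" */
        memcpy(q, data, n);
        q += n;
        memcpy(q, cd_split, sizeof cd_split - 1);
        q += sizeof cd_split - 1;
        data = end + 2;                        /* resume at the ">" */
    }
    strcpy(q, data);
    strcat(q, cd_close);
    return out;
}
```

With this, data of a]]>b comes out as <![CDATA[a]]]]><![CDATA[>b]]>, which a parser reads back as the original five characters.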

The reason for this is the lack of widespread APIs that are easy to use
and that automate the escaping, so that one could simply write
   printf("%s\n", xmlelement(e, theData));
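
To make that concrete, here is one possible shape for such a function. It is a sketch under assumptions, not an existing API: xmlelement() and xml_escape_text() are illustrative names, the element name is trusted and not checked, and only &, < and > in the text are escaped.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of `s` with &, < and >
 * replaced by entity references. Returns NULL if malloc fails. */
char *xml_escape_text(const char *s)
{
    size_t n = 0;
    const char *p;

    for (p = s; *p; p++) {                 /* first pass: measure */
        switch (*p) {
        case '&': n += 5; break;           /* &amp; */
        case '<':                          /* &lt;  */
        case '>': n += 4; break;           /* &gt;  */
        default:  n += 1; break;
        }
    }
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;

    char *q = out;
    for (p = s; *p; p++) {                 /* second pass: copy */
        switch (*p) {
        case '&': memcpy(q, "&amp;", 5); q += 5; break;
        case '<': memcpy(q, "&lt;", 4);  q += 4; break;
        case '>': memcpy(q, "&gt;", 4);  q += 4; break;
        default:  *q++ = *p; break;
        }
    }
    *q = '\0';
    return out;
}

/* Hypothetical xmlelement(): builds "<name>escaped text</name>".
 * The caller frees the result; `name` is assumed to be a valid name. */
char *xmlelement(const char *name, const char *text)
{
    char *esc = xml_escape_text(text);
    if (esc == NULL)
        return NULL;
    size_t len = 2 * strlen(name) + strlen(esc) + 5;   /* "<>" + "</>" */
    char *out = malloc(len + 1);
    if (out != NULL)
        snprintf(out, len + 1, "<%s>%s</%s>", name, esc, name);
    free(esc);
    return out;
}
```

Because the result is heap-allocated here, the one-line printf would leak it. In practice the caller would keep the pointer, print it, and free() it, or the API would write straight to a FILE * instead.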


> It seems
> the problem is that XML parsers fall over
> with such characters 


The theory was always, better that the parser reject it (they don't
"fall over" - it's an intended error condition, not a failure of the
parser) than that the application "fall over"...

>  Treating them as something like
> strings, text or character data which can still
> be extracted seems like a good starting point.

You have to ask what is the consequence of this in the application.

Sometimes it's just fine. For a Web browser it might be OK. For a 'plane
navigation system, or for the pedals and steering wheel in your car,
maybe better to catch the problem in development, maybe better to have
the car detect a problem than go wrong.

For this reason, it's supposed to be the application that says, "carry
on, give me the wrong data" or "do whatever you want to recover, swap
'x' and 'y' coordinates or whatever you happen to find works, I don't
care."

At that point it's not engineering, it's art :-)

Best,

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/


