Re: [xml-dev] Error and Fatal Error

> or it could allow "&" to represent itself if not followed by a name character
 
Maybe that alone would be enough to fix my bugbear.
 
But what did the spec writers expect to happen about the '<' character
appearing in content? More to the point: What did they expect an XML
parser to do about such a character? Did they really expect preparsing
to be necessary as an overhead merely for the purpose of replacing
this character and the ampersand '&' in content?
 
Even aside from what the XML spec writers expected, what did XML
parser writers expect developers to do about these characters in
content being passed to the parser? Did they too expect pre-parsing
just for the purpose of removing/replacing/escaping such characters?
 
Either way seems slightly irrational to me. Which developers in their
right mind would expect to be doing preparsing before sending XML
to a parser? They would surely just expect the parser to be able to
handle these characters. They would surely expect any standards
compliance of that parser (for conforming to the XML specs) to include
being able to gracefully handle these characters. If not, they would
want to see this fixed before they have to insist on fellow developers
knowing what to do about it. They wouldn't expect to have to write a
parser just to be able to send some XML to a 'standard' parser.
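To make the complaint concrete, here is a minimal sketch in Python (the thread itself concerns .NET, so the language choice, the `<note>` element, and the helper names `wrap_text` and `is_well_formed` are all assumptions made for illustration). It shows a conforming parser rejecting raw '&' and '<' in content, and the same text being accepted once it is escaped at the point where the document is built, which is the step a producer is expected to perform.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def wrap_text(text: str) -> str:
    """Wrap arbitrary text in a hypothetical <note> element, escaping & < >."""
    return f"<note>{escape(text)}</note>"


def is_well_formed(doc: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


raw = "AT&T says 1 < 2"

# Naive concatenation: the raw '&' and '<' end up in content unescaped.
unescaped_doc = f"<note>{raw}</note>"

# Escaping while building the document: no separate pre-parse of XML needed.
escaped_doc = wrap_text(raw)

print(is_well_formed(unescaped_doc))
print(is_well_formed(escaped_doc))
print(ET.fromstring(escaped_doc).text)
```

The escaping here is applied to the text before it becomes markup, so it is a string substitution rather than a parse. Once the two have been concatenated without escaping, recovering the intended text generally requires guessing.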
----
Stephen D Green



On 16 July 2011 20:32, Michael Kay <mike@saxonica.com> wrote:

> I think we need a parser which understands the
> slightly erroneous XML and can find any errors in it:
> In short we need a parser which has an API which
> can allow the web developer (in this case with .NET)
> to repair XML.

I find it very useful to have a Java IDE which can turn what I type into syntactically correct Java, especially as it's both configurable and interactive so I can tell it what rules to apply in doing this job. But I wouldn't want the Java compiler itself to silently make assumptions about what I intended when I give it incorrect source code.

Similarly I think there's a distinction between what an XML parser should do, and what a data cleansing tool might attempt in terms of turning garbage into XML.

And I'm not sure that data cleansing tools need to be interoperable, in which case there is no reason for standards organisations to get involved in designing how they should work.

That's not to say, of course, that the spec couldn't be a bit more liberal in its definition of well-formed XML; for example it could allow the whitespace between attributes to be omitted, or it could allow "--" in comments, or it could allow "&" to represent itself if not followed by a name character. But changing the definition of the language is one thing; encouraging parsers to accept things that the language disallows is another.
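As a rough illustration of the last of those relaxations, a data cleansing step (kept separate from the parser) might look like the Python sketch below. The function name `repair_bare_ampersands` is hypothetical, and the pattern is a simplifying assumption: it approximates "name character" with ASCII name-start characters only, and it also leaves '&#' alone so that character references survive.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: ASCII-only approximation of XML name-start characters.
# '#' is excluded as well so references such as &#38; are not touched.
BARE_AMP = re.compile(r"&(?![A-Za-z_:#])")


def repair_bare_ampersands(text: str) -> str:
    """Escape each '&' that cannot be the start of a reference."""
    return BARE_AMP.sub("&amp;", text)


dirty = "<p>fish & chips &amp; peas</p>"
repaired = repair_bare_ampersands(dirty)

print(repaired)
print(ET.fromstring(repaired).text)

# An '&' followed by a name character is left alone, so input such as
# "R&D" remains not well-formed under this rule.
print(repair_bare_ampersands("R&D"))
```

Note that the repair runs before parsing and the parser itself stays strict, which matches the distinction drawn above between a parser and a cleansing tool.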


Michael Kay
Saxonica

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.
