OASIS Mailing List Archives

RE: [xml-dev] Error and Fatal Error

In my experience this is a novel way of attempting to use XML tools, and even if you could get it to 'work' it doesn't actually solve the worst of the problems.   I would not attempt to use a parser in this way, nor expect it to work.  That is, throwing anything I can at a parser and hoping that it will 'catch errors and fix them'.


Even if it *could* 'catch errors' and let me fix them, what about totally valid markup which is not supposed to be markup?  Those are the worst kind of errors.  This is a classic case of 'A fence on the hill or an ambulance down in the valley'.

This leaves you wide open to XML injection attacks.   That's extremely scary and should be addressed with the utmost attention.

Invalid markup is a minor annoyance, a degenerate case of the primary problem: allowing *any markup* to be entered by the user.
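
To make the injection point concrete, here is a minimal sketch in Python (an arbitrary choice; the application under discussion is .NET). The `<order>` record, the `build_naive` helper and the payload are all hypothetical, invented only for illustration. It shows user text that is perfectly well-formed markup passing through the parser without any error while changing what the document says:

```python
import xml.etree.ElementTree as ET

def build_naive(comment: str) -> str:
    # Hypothetical record format; the user text is pasted in with no escaping.
    return f"<order><comment>{comment}</comment><approved>false</approved></order>"

# The user supplies text that is itself well-formed markup.
payload = "hi</comment><approved>true</approved><comment>"

# The parser has nothing to complain about: the result is well-formed.
doc = ET.fromstring(build_naive(payload))

# There are now two <approved> elements, and the injected one comes first.
approved_values = [e.text for e in doc.iter("approved")]
first_approved = doc.find("approved").text
```

Code that reads the first `<approved>` child would see the injected value, and no well-formedness check can flag it.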

To use a Banking Analogy, it's like having an ATM use field validation that only checks the textual format of the number, not whether you have that much money in your account.


The code snippet you (Steve) provided is not the one I'm interested in.   That's the ambulance in the valley.

What I'm interested in is how the XML text is *initially* created.   That is where the bug is.

I think you will find little disagreement that it's extremely difficult to try to fix up errors after they are injected.   In fact I assert it is not possible to reasonably do so at all, with any parser or tool or specification, in this scenario, because the point at which the errors are detected is too far down the processing chain: whether content is valid or invalid has become an application-level, contextual semantic problem, not a syntactic issue.  Even schema validation may not catch XML injection, depending on the schema.     And even if it could, injection should be prevented by properly escaping the content prior to insertion into the XML, not by expecting the parser to take on the entire responsibility of detecting or fixing invalid content.   At least that is my opinion.





David A. Lee




From: Stephen D Green [mailto:stephengreenubl@gmail.com]
Sent: Monday, July 18, 2011 5:07 AM
To: David Lee
Cc: Andrew Welch; David Carlisle; xml-dev@lists.xml.org
Subject: Re: [xml-dev] Error and Fatal Error




There are always bugs. I don't see that as the issue; rather that we always have to write even simple apps such that the bugs do not cause problems for the end user. In this case it means we have to anticipate errors and handle them gracefully. That seems to be emphasised, as would be expected and is only proper, in the XML spec's requirements for the behaviour of a conforming XML parser. At least that would be the intention of the spec. The actual outcome in conforming software would depend on how well the spec does its job (and how well the architects of XML and the XML spec design understand the effect of the spec on implementers, which isn't easy to do and requires feedback at all stages and possibly redesign as part of the spec's maintenance).

So here the spec wants the parser to be useful for what some are calling preparsing - a step where errors are found and the application using the parser gets an opportunity to correct them. This aspect of the parser/spec is what I want to bring to people's attention so the spec can be improved, rather than try to work out why errors happen in the first place. If the spec attempts to allow for the correction of errors it is doing better than just saying 'errors should never happen'.


Stephen D Green

On 18 July 2011 01:34, David Lee <dlee@calldei.com> wrote:

This is starting to sound like a toolkit bug.  And as such probably on the wrong list.

But obviously you have a lot of people's active attention !


If you could post a code snippet, it may answer a thousand questions.

The one I have is


"Does the toolkit generate the bad XML or does the custom code?"


If the toolkit/framework is generating the XML and it passes through unescaped invalid XML markup then the toolkit has a bug and should be reported *to the toolkit developers*.

If custom code is generating bad XML then it needs to be fixed by the custom code developers.


In neither case is the "XML Spec" at fault here. 

Any more than it would be when passing an extra "," to CSV, or a UTF-8 sequence to an ASCII parser, or any of a thousand million zillion examples I could come up with offhand of invalid data given to languages/parsers/tools which expect valid data.
It's pretty clear in XML specs what's valid and what is not.   GIGO and all that ...
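
As a small illustration of the GIGO point, the sketch below (Python with the standard-library `xml.etree.ElementTree`, chosen only for illustration; the sample string is made up) feeds a bare '&' to a parser. The parser reports where it stopped and makes no attempt at repair:

```python
import xml.etree.ElementTree as ET

# Made-up input: a bare '&' in element content is not well-formed XML.
bad = "<name>Marks & Spencer</name>"

rejected = False
position = None
try:
    ET.fromstring(bad)
except ET.ParseError as err:
    # ParseError carries a (line, column) position; nothing is fixed up.
    rejected = True
    position = err.position
```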









David A. Lee




From: Stephen D Green [mailto:stephengreenubl@gmail.com]
Sent: Sunday, July 17, 2011 6:12 PM
To: Andrew Welch
Cc: David Carlisle;
Subject: Re: [xml-dev] Error and Fatal Error


I'll try and have a look at the code again tomorrow at work.


As far as I remember we do not have access to the strings. These are *very* commonly used Ajax controls and they probably bind to a dataset from ASP.NET 'markup'. Not everything in controls like this is available to the developer. If the controls do bind to a dataset (XML a la .NET) then the data is possibly pre-packaged as XML even if it has '&' in the element content. Besides, this is a framework we are talking about so I do not have much say in what other developers working now and in the future on the apps do. One tends to stick with the framework (go with the flow) and understand that others will try and do the same.


Stephen D Green

On 17 July 2011 22:26, David Carlisle <davidc@nag.co.uk> wrote:

On 17/07/2011 22:13, Stephen D Green wrote:

> I don't buy that. And not so easy to replace '<' with
> '&lt;' in just the element content and not the tags.


I thought you indicated that you were taking strings from user-supplied form data and adding them to XML, in which case you need to escape the XML syntax characters before you add them, so there is no element content and no tags to worry about. You don't want to add the strings and then try to parse to find element content afterwards: if adding the content has made the XML non-well-formed, you've already lost.
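
A minimal sketch of escaping before insertion, in Python for illustration only (the thread's application is .NET). The `wrap_comment` helper and the `<comment>` element are hypothetical; `escape` is the standard-library function from `xml.sax.saxutils`:

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

def wrap_comment(user_text: str) -> str:
    # escape() replaces '&', '<' and '>' with entity references, so the
    # user's text can only ever become character data, never tags.
    return f"<comment>{escape(user_text)}</comment>"

xml_text = wrap_comment("a < b && c > d")

# Parsing the result gives back exactly the text the user typed.
roundtrip = ET.fromstring(xml_text).text
```

At the moment of escaping there are no tags in the string at all, which is why there is no need to tell element content from markup. For attribute values, `xml.sax.saxutils.quoteattr` is the counterpart that also handles quote characters.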


On 17 July 2011 22:26, Andrew Welch <andrew.j.welch@gmail.com> wrote:

On 17 July 2011 22:13, Stephen D Green <stephengreenubl@gmail.com> wrote:
> I don't buy that. And not so easy to replace '<' with
> '&lt;' in just the element content and not the tags.

It really is straightforward... if you are adding to an in memory tree
just add the text as-is, if you are adding to serialised xml then just
put text through a serialiser first (by wrapping it in a root node,
serialising it, then substringing it out).
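
Both cases can be sketched roughly as follows, using Python's standard-library ElementTree purely as an example toolkit. The element names and the `escape_via_serialiser` helper are hypothetical, and other toolkits will differ in detail:

```python
import xml.etree.ElementTree as ET

user_text = "Tom & Jerry <b>bold</b>"

# Case 1: in-memory tree. Assign the raw string as-is; the serialiser
# does the escaping when the tree is written out.
item = ET.Element("item")
item.text = user_text
serialised = ET.tostring(item, encoding="unicode")

# Case 2: text destined for already-serialised XML. Wrap it in a throwaway
# root element, serialise, then cut the escaped text back out.
def escape_via_serialiser(text: str) -> str:
    if not text:
        # An empty element serialises as "<w />", so there is nothing to cut.
        return ""
    wrapper = ET.Element("w")
    wrapper.text = text
    out = ET.tostring(wrapper, encoding="unicode")
    return out[len("<w>"):-len("</w>")]

fragment = escape_via_serialiser(user_text)
```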

If that's not what you mean then can you give an example of the problem?

If the user is writing markup then it's back to helping them get it
right first time by parsing in the background and highlighting errors.





Copyright 1993-2007 XML.org. This site is hosted by OASIS