   Re: Whitespace

  • From: digitome@iol.ie (Sean Mc Grath)
  • To: xml-dev@ic.ac.uk
  • Date: Tue, 26 Aug 1997 20:40:04 +0100

>At 05:45 PM 26/08/97 +0100, Sean Mc Grath wrote:
>>It is easy to see what has happened here. The s/w developers have
>>a pattern for matching AREA elements that does not countenance the presence
>>of a CRLF.

[Tim Bray]
>Gimme a break; the software developers in this case have screwed up;
>there is a technical term to describe this behavior: "wrong".  There may
>in fact be productive things to be said about particular application
>profiles for whitespace handling, but this example is a complete
>red herring. 
>

I presented this "red herring" because it was *real*. I could have
contrived a more realistic one:-) This is an example of a *real*
programmer screwing up in a real application.

I am interested in avoiding screwups. Whitespace (WS) is a "happy hunting
ground" for screwups among us normal programmers, who make mistakes day in,
day out.

At least I think it is. Perhaps (hopefully) I'm wrong.

I doubt I will get this right, but I will try to formulate the programming
problem as I see it.

Here goes:-

XML processing applications that read and write XML have to reproduce
white space faithfully to avoid data loss. In the course of XML processing,
actions will regularly be triggered by context, e.g. "element X within
element Y", "first data content chunk below element X", etc.

Take a really simple context, "X followed by Y". In order to reproduce WS
faithfully on output, the simple pattern "XY" must be transformed into
(in rusty Perl)

"(w*)X(w*)Y(w*)"

Where "w" represents the pattern for White Space.
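
To make the pattern concrete, here is a minimal sketch in Python rather
than Perl. It assumes X and Y are the literal empty-element tags <X/> and
<Y/>, and that the action triggered by the context is replacing Y with a
hypothetical <Z/>; neither assumption comes from a real application. The
point is only that every captured WS run has to be written back by hand.

```python
import re

# "w" in the pattern above becomes \s here. <X/> and <Y/> are
# illustrative stand-ins for the two elements, not real markup.
WS_TOLERANT = re.compile(r"(\s*)(<X/>)(\s*)(<Y/>)(\s*)")


def rewrite_y_after_x(text, new_y="<Z/>"):
    """Replace Y with new_y wherever Y follows X, keeping all WS intact."""

    def emit(match):
        lead, x, between, _y, trail = match.groups()
        # Each whitespace group is emitted unchanged, so a CRLF between
        # X and Y survives the rewrite.
        return lead + x + between + new_y + trail

    return WS_TOLERANT.sub(emit, text)


if __name__ == "__main__":
    print(repr(rewrite_y_after_x("<X/>\r\n  <Y/>\n")))
```

Forgetting any one of the three WS groups in emit() either makes the
match fail on a CRLF or silently drops the whitespace on output.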

As the state spaces get more complex (i.e. realistic) doesn't this problem
escalate?
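
As a rough illustration of the growth, the sketch below builds the
WS-tolerant pattern for a flat sequence of elements. The function name
ws_tolerant_pattern and the literal tags are made up for the example. It
only counts the slots for "A followed by B followed by C ..."; it says
nothing about nested contexts, which is where I suspect the real trouble
lies.

```python
import re


def ws_tolerant_pattern(*elements):
    """Build a pattern with a captured WS slot before, between and after
    each of the given literal elements."""
    ws = r"(\s*)"
    body = ws.join("(" + re.escape(element) + ")" for element in elements)
    return re.compile(ws + body + ws)


if __name__ == "__main__":
    for tags in (("<X/>", "<Y/>"), ("<X/>", "<Y/>", "<Z/>")):
        pattern = ws_tolerant_pattern(*tags)
        # n elements need n + 1 whitespace groups, 2n + 1 groups in all.
        print(len(tags), "elements ->", pattern.groups, "groups")
```

Every one of those groups has to be carried through to the output code,
so the bookkeeping grows with each element added to the context.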

Could someone out there who reckons this is easy kindly put
me out of my misery by showing how it can be best handled?



Sean Mc Grath

sean@digitome.com
Digitome Electronic Publishing
http://www.digitome.com


xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)





 
