OASIS Mailing List Archives

 



 


 

   Re: [xml-dev] XML=WAP? And DOA?



From: "Mike Champion" <mc@xegesis.org>

> 1/13/2002 8:10:02 PM, Paul T <pault12@pacbell.net> wrote:

> >Sure. And it is kinda more convenient to use CSV, because 
> >the CSV-based world has developed a sophisticated, convenient, 
> >universal binding mechanism, called "regular expressions".
> 
> Hmmm, that's an interesting way to put it.  What would an XML 
> universal binding mechanism look like .... a clean 
> integration of XPath and DOM (and maybe RELAX-NG)?  A RELAX-NG 
> data binding tool 
> http://www.asahi-net.or.jp/~dp8t-asm/java/tools/Relaxer/  ?

0. I believe that regular expressions, as we know them, are 
also not the best possible binding, because I believe that 
they should not be greedy. I think they are greedy 
for historical reasons and maybe because of the 
'match/split' processing pattern (awk). I think that the 
universe of non-greedy regular expressions is not yet 
explored.
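
To make the greedy / non-greedy difference concrete, here is a small 
sketch in Python (my choice for illustration; the sample line and the 
field names are made up, nothing here comes from a real data set). 
The same pattern is run with a greedy and with a non-greedy quantifier:

```python
import re

# Hypothetical sample line, invented for this illustration.
line = 'name="Paul", list="xml-dev", year="2002"'

# Greedy: .* grabs as much as possible, so the single match runs
# from the first quote on the line to the last one.
greedy = re.findall(r'"(.*)"', line)

# Non-greedy: .*? stops at the earliest closing quote, which gives
# one match per quoted field.
lazy = re.findall(r'"(.*?)"', line)

print(greedy)  # ['Paul", list="xml-dev", year="2002']
print(lazy)    # ['Paul', 'xml-dev', '2002']
```

For binding fields out of a line of text, the non-greedy form is 
usually the one that matches what you meant.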

1. As to Relaxer: code generators are always a pain to maintain 
and to debug. It took a long time to get, for example, 
yacc / lex to output 'everything' (not only C) - and when that 
happened - both yacc and lex became obsolete (JavaCC).

Just look at JavaCC.  It is sooo convenient ... but it generates 
Java only and there is no way I can use JavaCC for, say, 
Perl. 

I hate to say it, but it could be that the simplest 
language-neutral binding mechanism would be: 
"take a subset of XML and everything would immediately 
become simple, otherwise you may keep trying to find 
that black cat in that dark room for ages".
 
> Both Sun and Microsoft seem to be working hard to make it 
> easier to use XML from ordinary programming languages; are 
> either/both at least moving in the direction you want to see?

They are both moving in the usual direction: to lock 
as many developers as they can into the platforms 
that they sell. It would be insane for a big software 
company to provide something that could be easily 
used outside their core platform.  However, that's minor.

The most important thing could be that ... no 
good language-neutral binding is even *possible* 
for XML v 1.0. 

Chunks binding is convenient *only* because it does not 
support complex mixed-content cases + there is some 
'intuitive' whitespace-stripping rule.  Put the complex 
mixed content and complex whitespace processing back 
into the model - and it could be that the *only* 
possible model for XML *is* DOM. Which is hell to 
process without XPath, and even with XPath it is still 
hard.
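
A rough sketch of the mixed-content problem, in Python with the 
standard xml.etree.ElementTree module. The two sample documents are 
invented, and "chunk" is assumed here to mean an element whose 
children hold only plain text. A naive tag-to-text binding works for 
the chunk and silently loses text on the mixed content:

```python
import xml.etree.ElementTree as ET

# A record-like "chunk": every child holds plain text, so it maps
# directly onto a dictionary.
record = ET.fromstring("<person><name>Paul</name><list>xml-dev</list></person>")
as_dict = {child.tag: child.text for child in record}

# Mixed content: text is interleaved with child elements, so the
# same mapping drops all the text around the children.
mixed = ET.fromstring("<p>XML is <em>hard</em> to <b>bind</b>, sometimes.</p>")
lossy = {child.tag: child.text for child in mixed}

def pieces(elem):
    """Yield every text fragment of elem in document order."""
    if elem.text:
        yield elem.text
    for child in elem:
        yield from pieces(child)
        if child.tail:
            yield child.tail

# Getting everything back requires a full tree walk.
full_text = "".join(pieces(mixed))

print(as_dict)    # {'name': 'Paul', 'list': 'xml-dev'}
print(lossy)      # {'em': 'hard', 'b': 'bind'}
print(full_text)  # XML is hard to bind, sometimes.
```

ElementTree is not the W3C DOM, but the tree walk it forces here is 
the same kind of processing.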

It has taken a very long time for SML-dev to figure out 
what could possibly be simplified in the XML model and 
to me the answer is "... we still don't know, but maybe it 
is possible ;-)"

It is interesting to look at YAML (even though I don't like it ;-)

YAML is a 'markup' language that has been designed 
*the other way around*.

They *first* looked at language-neutral data structures and 
*then* they designed a 'markup'. Of course YAML allows 
an easy binding!

If XML had been designed 
'to serialize Fortran arrays', that would have made XML 
binding obvious and straightforward. 

However, I think that there could be some problems 
if you tried to ask IBM lawyers to map their documents 
into Fortran arrays ;-)

<aside>
Also, I think that YAML is too complex, 
because I think they thought that if they published only 
a really simple part of it, nobody would take it seriously ;-)
</aside>

<rant>
I hate to say it, but I think that all that markup stuff 
is actually about placing '\' and ',' symbols on steroids in 
one way or another. Why can't people agree that 
any 'markup' language is :

0. Everything is (unicode) text.

1. Text can have 'groups', separated by 'separators' 
(the fewer, the better, but hard to tell in advance ;-)

2. There should be some way to escape separators 
( \ works just fine, from my point of view ;-)

Isn't that all we need to know about a 'markup language' ;-)?
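
The three rules above fit in a few lines of code. This is only a 
sketch under my own assumptions: one separator (','), one escape 
character ('\'), and a hypothetical helper named split_escaped. It 
says nothing about nesting groups inside groups.

```python
def split_escaped(text, sep=",", esc="\\"):
    """Split text into groups on sep, honoring esc as an escape character."""
    groups, current = [], []
    chars = iter(text)
    for ch in chars:
        if ch == esc:
            # Take the next character literally; a trailing escape
            # with nothing after it is kept as-is.
            current.append(next(chars, esc))
        elif ch == sep:
            # An unescaped separator closes the current group.
            groups.append("".join(current))
            current = []
        else:
            current.append(ch)
    groups.append("".join(current))
    return groups

# Raw string: \, is an escaped separator, \\ is an escaped backslash.
fields = split_escaped(r"Paul T,xml-dev,a\,b,back\\slash")
print(fields)  # ['Paul T', 'xml-dev', 'a,b', 'back\\slash']
```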

Why restrict ourselves to 'one markup language that fits all'?

It is like the early days, before YACC and similar tools, when people 
were trying to invent 'the best possible programming language, 
good for everything'. I think it is now obvious that such a 
programming language does not exist. Why there should be 
a 'markup language that is good for everything' is a question 
I have no answer to. ;-)
</rant>

> >Regular expressions are not blessed by W3C, sure.
> 
> But they are at the heart of RELAX-NG ... I hope we can 
> distinguish "XML" from "the set of all specs that the W3C has 
> put out dealing with XML" or "the picture of XML promulgated 
> by the most visionary W3C working groups." 

I think it's even worse than that. There is also:

"Maybe XML is not a convenient thing to process, 
but because the XML text looks nice, let us migrate 
all the binary formats into some 'industry-blessed' 
vocabularies and then 'industry-blessed' APIs 
would deal with the mess. Still, that would be 
better than a binary / CORBA-driven solution". 

That would also be the right thing to do! 

Just some better unification of industry-specific 
legacy data formats would be better than to 
keep building on formats designed 30+ 
years ago by hardcore COBOL coders, who 
cared about saving bytes, because storage 
was important.

Rgds.Paul.






 


Copyright 2001 XML.org. This site is hosted by OASIS