OASIS Mailing List Archives

Re: XML Blueberry

At 05:16 PM 6/21/01, you wrote:
>Vincent-Olivier Arsenault wrote:
>>This revision is indeed NECESSARY as (I think) XML should have a greater
>>(if not complete) independence from any encoding specification and
>>delegate it (all) to Unicode. Thus, the key requirement to me would be
>>(quoting from the June 20 WD requirement list): "The working group shall
>>consider the issue of future updates to Unicode."
>I think we anticipate a fixed set of rules, very close to the rules
>in the XML 1.0 document, and then: whatever Unicode says is in, is in,
>and whatever is out, is out.
>But in full generality that means each document has to bear the
>version number of Unicode that it assumes (for its names only, of
>course, not for its character content).

But couldn't that be deduced from the binary representation (at the
platform level), so that the parser only has to deal with the Unicode
spec that was "current" at the time the parser was implemented? Why does
the (XML) parser need to know the charset used?

>That means a series of
>updates to parsers are required, and control of the schedule is
>lost from W3C to Unicode.

I don't think it HAS to.

>Is this a Good Thing or a Bad Thing?

How can we determine this? I think the question comes down to "what is XML"... ;-)

>>As for the "they can write Latin markup anyway" argument, I don't see
>>how we could EVER discriminate against ANY cultural particularity (even
>>if it SEEMS obscure to us or to so-called "experts"; let's not repeat the
>>rfc822 mistake) by denying its adherents the ability to create markup in
>>the way they want.
>Just so.
>>The backward-compatibility argument just doesn't hold: I'd be curious to
>>see how (or if) Java parsers (for instance) enforce the restrictions on
>>Unicode as specified in the XML spec. Aren't they just relying on the
>>Java platform to handle encoding?
>Some parsers, at least, have their own tables.
>>Even if they are not, they should,
>That's dangerous: it leads to interop failures.  What if the version of
>Java at the receiving end has slightly different tables from the one
>at the sending end?

That's not XML interop but Unicode interop. Aren't such "recovery"
mechanisms specified in Unicode? And anyway, the problem already exists
with implementations based on the current spec. You said it yourself: some
parsers have tables and some don't. Interop failures are not bad if they
are expected (which is not the case now in XML). I suggest looking at the
OSI layer model for inspiration, and projecting the Unicode-XML
interface and protocols onto it. Think abstraction!
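
To make the table-mismatch scenario concrete, here is a minimal sketch in
Python. The two tables, the helper name make_name_start_checker, and the
sender/receiver split are all hypothetical, and real parsers carry much
larger tables. It only shows that the same name can be accepted at one end
and rejected at the other when the tables differ.

```python
# Sketch only: the tables below are tiny, hypothetical stand-ins for the
# name-start character tables a real parser would carry.

LATIN = [(0x41, 0x5A), (0x61, 0x7A)]   # A-Z, a-z
ETHIOPIC = [(0x1200, 0x137F)]          # Ethiopic block, used for illustration


def make_name_start_checker(ranges):
    """Return a predicate that tests one character against the given table."""
    def is_name_start(ch):
        cp = ord(ch)
        return any(lo <= cp <= hi for lo, hi in ranges)
    return is_name_start


# Hypothetical sender built against a newer table, receiver against an older one.
sender_accepts = make_name_start_checker(LATIN + ETHIOPIC)
receiver_accepts = make_name_start_checker(LATIN)

name = "\u1200"  # a name that starts with an Ethiopic syllable

print(sender_accepts(name[0]))    # True
print(receiver_accepts(name[0]))  # False
```

Whether the second result counts as an XML failure or a Unicode failure is
exactly the layering question raised above.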

>But that point has to do with *implementation*, not specification.

I was just pointing to this as a SYMPTOM of bad design.