OASIS Mailing List Archives


Help: OASIS Mailing Lists Help | MarkMail Help

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]
RE: [xml-dev] C1 characters in XML 1.0 and HTML 4

>Occasionally the internationalization working group in W3C decides to flex its muscles,
>and one instance of this was their insistence that XSLT should not generate HTML
>that contains characters which HTML defines to be illegal.

Seems very reasonable to me. Until Bjoern reminded me, I forgot about the SGML declaration for HTML 4.

>It's probably a mistake that XML allowed these C1 characters, because they are
>nearly always miscoded CP1252 characters. XML 1.1 tried to fix this problem
>but we all know what happened to that.

Yes, indeed. We've tried to avoid the complications of handling XML 1.1 in our tool chain.
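
To make the point about miscoded CP1252 concrete, here is a minimal Python sketch of a check that could run at the start of a pipeline. It assumes that any C1 character (U+0080 to U+009F) in the input is really a CP1252 byte that was decoded as ISO-8859-1, which is the common case described above but not a certainty. The names find_c1 and repair_c1_as_cp1252 are illustrative and do not come from any tool mentioned in this thread.

```python
import re

# C1 control range U+0080..U+009F: legal in XML 1.0 documents,
# but treated as illegal by HTML 4, as discussed in this thread.
C1_PATTERN = re.compile("[\u0080-\u009f]")


def find_c1(text):
    """Return (offset, code point) pairs for every C1 character in text."""
    return [(m.start(), ord(m.group())) for m in C1_PATTERN.finditer(text)]


def repair_c1_as_cp1252(text):
    """Reinterpret each C1 character as the CP1252 byte with the same value.

    Assumption: the C1 character is a miscoded CP1252 byte. Values that
    CP1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) are left as-is.
    """
    def fix(match):
        ch = match.group()
        try:
            return bytes([ord(ch)]).decode("cp1252")
        except UnicodeDecodeError:
            return ch  # no CP1252 mapping; keep the original character

    return C1_PATTERN.sub(fix, text)
```

Running find_c1 on the source text before any transformation reports the offending offsets early; whether to repair automatically or reject the input is a policy decision for the content owner.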

>In the meantime, the result is that you feed a bad character
>into the start of your processing pipeline and you discover
>the problem at the final stage when HTML emerges.

I was just a bit surprised that the error was caught so far down the line.

>The reasoning of course is that the end user shouldn't pay the price
>for the content provider's carelessness.

>This is very different from the culture in W3C which tries
>to improve data quality by insisting that software should
>reject bad data.

I'm usually on the delivery side of things, so I'm always working to understand the content and prevent bad data from getting out there in the first place.

Many thanks, Dr. Kay.

