email@example.com (Mike Champion) writes:
>Sure! The question is how to do something to make our
>lives less unpleasant while The System plods forward.
>Be patient, vote with our feet against crappy software
>that can't handle Unicode decently, or try to hack up
>something in the interim? The whole point of Unicode
>encodings is to map conveniently enterable text onto
>codepoints, and whatever the technical virtues or
>flaws of Tim's strawman proposal, this seems like the
>right layer to address it.
Mapping into code points is very different from mapping by reference to
an external table which modifies those codepoints and introduces new and
>The "W3C" (which after all is a few thousand different
>people with very different ideas) has been wrestling
>with this for years, the trouble is that no very great
>ideas have come up AFAIK. What's your Wart-Off
An XML vocabulary for describing entity names and values. Entity
references stay the same, but the mechanism for describing them evolves.
As with regular entity reference processing, the descriptions come with a
processing model that inserts the entity values during the parse.
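
To make that concrete, here is a rough sketch of what such a vocabulary and its loader might look like. The element and attribute names (entities, entity, name, value) and the load_entities function are made up for illustration; they are not part of any existing specification or of my own tools.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the element and attribute names are
# assumptions chosen for this sketch, not an existing standard.
ENTITY_TABLE = """\
<entities>
  <entity name="nbsp" value="&#160;"/>
  <entity name="eacute" value="&#233;"/>
  <entity name="copy" value="&#169;"/>
</entities>
"""

def load_entities(xml_text):
    """Parse an entity description and return a name -> text mapping."""
    root = ET.fromstring(xml_text)
    # The XML parser has already resolved the numeric character
    # references in the attribute values by the time we read them.
    return {e.get("name"): e.get("value") for e in root.iter("entity")}

entities = load_entities(ENTITY_TABLE)
```

Because the description is itself XML, it can be read with an ordinary parser and needs no DTD machinery.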
I do this already for character entities as a preprocessing function to
avoid various conversion weirdnesses with O'Reilly DocBook, and I'm
integrating that with my Ripper parser. It's hardly rocket science,
though it might take an extra PI or maybe a pointer from the XML
declaration to make it run across multiple systems. Namespace-prefixing
entity references (so that &h:nbsp; could refer to the HTML nbsp entity)
is another option for making this work without too much insanity, though
that certainly feels hackish to me.
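
A minimal sketch of that kind of preprocessing pass, assuming the entity tables are already in memory. This is not the code I actually run, nor anything from Ripper; the table contents, the "h" prefix binding, and the expand_entities name are illustrative only. It rewrites known entity references as numeric character references, handles the prefixed &h:nbsp; form, and leaves XML's five predefined entities and any unknown names for the parser.

```python
import re

# Illustrative tables; a real system would load these from a shared
# description rather than hard-coding them.
DEFAULT_ENTITIES = {"nbsp": 0xA0, "eacute": 0xE9}
PREFIXED_ENTITIES = {"h": {"nbsp": 0xA0, "copy": 0xA9}}

# XML's predefined entities are left alone for the parser to handle.
PREDEFINED = {"lt", "gt", "amp", "apos", "quot"}

# Matches &name; and &prefix:name; but not numeric references like &#160;
ENTITY_REF = re.compile(r"&(?:([A-Za-z_][\w.-]*):)?([A-Za-z_][\w.-]*);")

def expand_entities(text):
    """Replace known entity references with numeric character references."""
    def replace(match):
        prefix, name = match.group(1), match.group(2)
        if prefix is None and name in PREDEFINED:
            return match.group(0)
        if prefix is None:
            table = DEFAULT_ENTITIES
        else:
            table = PREFIXED_ENTITIES.get(prefix, {})
        code_point = table.get(name)
        if code_point is None:
            # Unknown reference: leave it so the parser can report it.
            return match.group(0)
        return "&#%d;" % code_point
    return ENTITY_REF.sub(replace, text)

sample = "caf&eacute;&nbsp;bar &h:copy; 2002 &lt;tag&gt; &unknown;"
expanded = expand_entities(sample)
```

Being a plain text substitution, this sketch does not skip CDATA sections or comments, which a pass integrated into the parser would have to do.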
If people will open the XML box and stop pretending that it's bytes in,
infoset out, with no chance ever for controlling the box, they might
come up with better ways to solve XML problems in the XML space.
To me, Tim's proposal doesn't feel like a way to get around "The
System". It feels like "The System" proposing a dangerous hack to avoid
solving real problems it faces internally.