I posted this first in May of 2001. I think it's worth looking at a
somewhat more heavy-weight approach to defining character entities,
since maintaining and distributing entity libraries is going to be an
important part of the use of XML going forward if we get this right.
So I'd like to at least look at what a real XML document type for
character entity definition might look like. Here's one attempt.
<?xml version='1.0'?>
<characters xmlns="http://www.w3.org/2001/05/Character"
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xhtml:p>A sample of proposed character entity definition file format</xhtml:p>
<character name="copyright" code="169"/>
<character name="registered" code="174">
<xhtml:p>Individual characters can <xhtml:em>also</xhtml:em> have
documentation</xhtml:p>
</character>
<character name="eacute" code="xE9"/>
<character name="bullet" code="x2022"
source="http://www.unicode.org/charts/PDF/U2000.pdf"/>
</characters>
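For anyone who wants to experiment, here is a rough Python sketch of how a consumer might read such a file into a name-to-character table. It assumes that a 'code' value starting with "x" is hexadecimal and anything else is decimal, mirroring XML numeric character references; the format itself does not state this. The names parse_code, load_characters and SAMPLE are mine, purely for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sample document above.
NS = "{http://www.w3.org/2001/05/Character}"

SAMPLE = """<?xml version='1.0'?>
<characters xmlns="http://www.w3.org/2001/05/Character"
            xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <character name="copyright" code="169"/>
  <character name="registered" code="174">
    <xhtml:p>Documentation is simply skipped by this reader</xhtml:p>
  </character>
  <character name="bullet" code="x2022"/>
</characters>"""


def parse_code(code):
    """Assumed convention: leading 'x' means hexadecimal, otherwise decimal."""
    if code[:1] in ("x", "X"):
        return int(code[1:], 16)
    return int(code)


def load_characters(xml_text):
    """Return a dict mapping each character name to its one-character string."""
    root = ET.fromstring(xml_text)
    return {
        el.get("name"): chr(parse_code(el.get("code")))
        for el in root.iter(NS + "character")
    }


table = load_characters(SAMPLE)
```

Only <character> elements in the definition namespace are consulted, so xhtml documentation at any level is ignored, in line with the tentative answer to issue 3 below.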
There's a tentative W3C XML Schema schema document for this
document type attached below. It calls out the following issues:
1) Should there be an <include> element to allow multiple documents to be
combined? (tentative answer - yes)
2) Should this be a full general entity mechanism, with the value
allowed to be any string? (tentative answer - no)
3) Should there be explicit support for documentation, e.g. via <xs:any>?
(tentative answer - yes, XML processors would of course ignore it)
4) Should the 'source' attribute be required? (tentative answer - no)
There's obviously a fifth issue -- how to connect individual instances
to entity libraries? An xml:charents attribute works for me, but
others may have other ideas.
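To make the xml:charents idea a little more concrete, here is a minimal Python sketch of how a processor might pick the library reference off an instance. The element name 'memo', the example.org URL and the helper charents_uri are all hypothetical, and treating the attribute value as a URI for a definition file is my assumption; nothing above pins down its semantics.

```python
import xml.etree.ElementTree as ET

# The 'xml' prefix is bound to this namespace by definition,
# so instances need no declaration for it.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def charents_uri(instance_text):
    """Return the value of xml:charents on the root element, or None if absent."""
    root = ET.fromstring(instance_text)
    return root.get("{%s}charents" % XML_NS)


# Hypothetical instance pointing at a hypothetical entity library.
doc = '<memo xml:charents="http://example.org/chars.xml">Some text</memo>'
uri = charents_uri(doc)
```

Whether the attribute should be honoured only on the root element or inherited by descendants, the way xml:lang is, is one of the things that would have to be decided.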
ht
draft schema for character entity definition document type
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
W3C Fellow 1999--2001, part-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/