OASIS Mailing List Archives

   Re: [xml-dev] Entity resolution vs. URI resolution


/ Edwin Goei <edwingo@sun.com> was heard to say:
| In SAX there is an entity resolver that takes a pair (publicID,
| systemID) as input:
|   InputSource resolveEntity(String publicID, String systemID)
| For other standards (such as W3C XML Schema), there is a need to resolve
| a schema location URI as a single input parameter:
|   InputSource resolveURI(String locationURI)

I think what's significant here is that many XML vocabularies include
the notion of an "include" element: xsl:include, xsl:import,
xsd:include, xsd:import, xi:include, etc. All of these references are
performed with simple URIs (rather than external entities, as we would
have done in the SGML days).

A great many developers are building applications on top of SAX. I'm
of the opinion that the ubiquity of these sorts of URI references (and
the fact that they are in practice performing the job of entity
references) makes them important enough to provide a SAX-level
resolution mechanism.

In other words, I don't think this is a schema application issue or a
stylesheet application issue (although this problem could certainly be
solved in specialized APIs for each of these types of applications).

| One view is that "entity resolution" = "uri resolution" + "publicID
| resolution".  So if only "uri resolution" is desired, pass "null" as the
| publicID.
| Another view is that these are two fundamentally different animals and
| so two different resolvers should be used.

I argued this point at length on the sax-devel list[1][2]. I am
steadfastly of the opinion that entities and "bare URIs" are entirely
different beasts.

An entity is identified by an external identifier[3]. An external
identifier is logically a tuple of (publicId, systemId). A URI
reference is nothing but a URI. I assert that (null, "foo") is not
identical to ("foo").
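One way to picture the distinction is as a difference in type. The following is only a sketch: ExternalId and IdentifierDemo are hypothetical names invented for illustration and appear nowhere in SAX or the XML Recommendation. It models an external identifier as a pair and a bare reference as a plain java.net.URI, so the two never compare equal even when the strings match.

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical value type: an external identifier is a (publicId, systemId) pair.
final class ExternalId {
    final String publicId;  // may be null when only a system identifier is given
    final String systemId;

    ExternalId(String publicId, String systemId) {
        this.publicId = publicId;
        this.systemId = Objects.requireNonNull(systemId);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ExternalId)) {
            return false;  // a bare URI is never an external identifier
        }
        ExternalId other = (ExternalId) o;
        return Objects.equals(publicId, other.publicId)
                && systemId.equals(other.systemId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(publicId, systemId);
    }
}

class IdentifierDemo {
    public static void main(String[] args) {
        ExternalId entity = new ExternalId(null, "foo");
        URI bare = URI.create("foo");

        // Same string, different kind of thing.
        System.out.println(entity.equals(bare));                         // false
        System.out.println(entity.equals(new ExternalId(null, "foo")));  // true
    }
}
```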

Forcing URI resolution through the resolveEntity() method requires
the association of a null public identifier with objects (URI
references) that do not logically have a public identifier.
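To illustrate what that forcing looks like in practice, here is a rough sketch built on the real org.xml.sax.EntityResolver interface. RecordingResolver and its resolveBareUri helper are hypothetical, made up for this example. The point is that the resolver has no way to tell a schema location from an entity that was declared with only a system identifier.

```java
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver that records the arguments it was called with.
class RecordingResolver implements EntityResolver {
    String lastCall;

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        lastCall = "publicId=" + publicId + ", systemId=" + systemId;
        return new InputSource(systemId);
    }

    // Hypothetical helper: what an application must do to push a bare
    // URI (for example a schema location) through the entity resolver.
    InputSource resolveBareUri(String locationURI) {
        return resolveEntity(null, locationURI);  // invented null public identifier
    }

    public static void main(String[] args) {
        RecordingResolver resolver = new RecordingResolver();
        resolver.resolveBareUri("http://example.org/schemas/common.xsd");
        // The resolver sees exactly what it would see for an entity
        // declared with a system identifier and no public identifier.
        System.out.println(resolver.lastCall);
    }
}
```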

I also observe that while every system identifier is a URI (by
definition), it does not follow that every URI is a system identifier.

I maintain that adding a resolveURI(uri, baseURI) method to the SAX
API is "the right thing".
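For concreteness, such a method might look something like the sketch below. This is an assumption about its shape, not an existing SAX interface: UriResolver and BaseRelativeResolver are hypothetical names, and returning an InputSource simply mirrors the signatures quoted earlier. The implementation shown only absolutizes the reference against the base; a real resolver could also consult a catalog at that point.

```java
import java.net.URI;
import org.xml.sax.InputSource;

// Hypothetical interface: NOT part of the SAX API.
interface UriResolver {
    InputSource resolveURI(String uri, String baseURI);
}

// Hypothetical implementation that resolves the reference against its base.
class BaseRelativeResolver implements UriResolver {
    @Override
    public InputSource resolveURI(String uri, String baseURI) {
        // java.net.URI performs standard relative-reference resolution.
        String absolute = URI.create(baseURI).resolve(uri).toString();
        return new InputSource(absolute);
    }

    public static void main(String[] args) {
        InputSource in = new BaseRelativeResolver()
                .resolveURI("common.xsd", "http://example.org/schemas/main.xsd");
        System.out.println(in.getSystemId());
        // prints http://example.org/schemas/common.xsd
    }
}
```

Note that no public identifier appears anywhere in this signature, which is the whole point: the caller is never asked to supply one.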

                                        Be seeing you,

[1] http://sourceforge.net/mailarchive/forum.php?thread_id=40245&forum_id=1472
[2] http://sourceforge.net/mailarchive/forum.php?thread_id=414410&forum_id=1472
[3] http://www.w3.org/TR/REC-xml#NT-ExternalID

Norman.Walsh@Sun.COM   | Success is relative; it is what we make of the
XML Standards Engineer | mess we have made of things.--T. S. Eliot
XML Technology Center  | 
Sun Microsystems, Inc. | 



Copyright 2001 XML.org. This site is hosted by OASIS