   Re: SAX: New Idea for Entity Resolution

  • From: lex@www.copsol.com (Alex Milowski)
  • To: xml-dev@ic.ac.uk
  • Date: Wed, 15 Apr 1998 23:42:38 -0500 (CDT)

> Alex Milowski writes:
>  > In effect, although the above interface is useful, it reduces
>  > interchange in that I can make a document with broken system
>  > identifiers work on my system.  Essentially, I can make an
>  > *invalid* document valid!
> You can do this in any case, though -- you can intercept URIs in the
> system libraries (Java, for example, lets you register your own
> schemes), or you can redirect them with a proxy server.
> With URLs, file:// will almost always break on exchange, as will http:
> system identifiers that refer to hostnames visible only within a
> private network.

Yes, but then if you do this, don't expect it to work elsewhere.  ;-)

Why would you use absolute URLs?  Bad author, bad!  Ok, maybe you would
use them for a standard DTD. ;-)  (This is where I beat the URN drum)

<SGMLRANT type='mild'>
In the SGML world, I could come up with a scheme that made location
orthogonal to my documents.  I *never* put a system identifier in my
documents.  In XML, this is much harder.

Now, if URN support was *standard*, I could at least put a URN in the
place of every system identifier I needed and then my document is
quite portable.  The key phrase here is *standard*.

Of course, we could also fix public identifiers and forget about the
URN stuff.   ...but, then we would have to come up with 
yet-another-resolution-mechanism... which sounds too much like URNs.

> Your other points (which I omitted above) are well taken -- public
> identifiers are a bit of a muddle right now, but since they're in XML
> 1.0, it makes sense to support them.  The interface is not only for
> public identifiers, however -- users can also map remote URIs to
> local/secure equivalents, and they can even screen out certain URIs if
> necessary.  I'd better copyright "XML-Nanny" before someone else
> thinks of it.
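
For the sake of concreteness, here is a rough sketch of the kind of hook
being described above, written against Python's xml.sax binding purely for
illustration.  The NannyResolver name, the LOCAL_COPIES table and the
example.org URL are all made up, and the policy shown (substitute a local
copy, refuse everything else) is only one assumption about what such an
interface might be used for.

```python
import io
import xml.sax
from xml.sax.handler import EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical table: remote system identifier -> text of a trusted local copy.
LOCAL_COPIES = {
    "http://example.org/entities/legal.ent": "All rights reserved.",
}


class NannyResolver(EntityResolver):
    """Remap known system identifiers to local copies; screen out the rest."""

    def resolveEntity(self, publicId, systemId):
        if systemId not in LOCAL_COPIES:
            # Screening: refuse anything we have no local equivalent for.
            raise xml.sax.SAXException("blocked system identifier: %s" % systemId)
        # Remapping: keep the original identifier, but serve local bytes.
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(LOCAL_COPIES[systemId].encode("utf-8")))
        return source


def make_nanny_parser():
    """Return a SAX parser that consults NannyResolver for external entities."""
    parser = xml.sax.make_parser()
    # External general entities are off by default in recent Python versions.
    parser.setFeature(feature_external_ges, True)
    parser.setEntityResolver(NannyResolver())
    return parser
```

Note that a table like this lives on one machine only, which is exactly
why a document that depends on it may not resolve anywhere else.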

Well, a further point I was making off-line is that this kind
of mapping could lead people down the wrong road.  I have run into
so many SGML users over the years that didn't know how to or *couldn't* use
public identifiers without system identifiers.  In an SGML world, I see this
as bad practice.  Likewise, I see mapping system identifiers in XML as bad
practice.

Two general rules I can recommend:

   1. Use an internal resolution system inside your production
      systems.  Locations will change even inside your own system.

   2. Use a fairly static naming system (URN/Public identifier) when
      you exchange documents.
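
A minimal sketch of how the two rules fit together, assuming made-up names
and locations throughout (the public identifier, the URN, both site tables
and the resolve function are hypothetical): the exchanged document carries
only the static name, and each site keeps its own table saying where that
name currently lives.

```python
# Static names that travel with the document (hypothetical examples).
MEMO_PUBLIC_ID = "-//Example//DTD Memo//EN"
MEMO_URN = "urn:x-example:dtd:memo"

# Each site's internal resolution table; locations are free to change.
SITE_A = {
    MEMO_PUBLIC_ID: "file:///opt/dtds/memo.dtd",
    MEMO_URN: "file:///opt/dtds/memo.dtd",
}
SITE_B = {
    MEMO_PUBLIC_ID: "http://intranet.example/dtds/memo.dtd",
    MEMO_URN: "http://intranet.example/dtds/memo.dtd",
}


def resolve(name, site_table):
    """Map a static name to this site's current location for it."""
    try:
        return site_table[name]
    except KeyError:
        # No silent fallback: an unknown name is an error, not a guess.
        raise LookupError("no mapping for %r at this site" % name) from None
```

The same name resolves to different places at the two sites, and moving
the DTD means editing one table entry rather than every document.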

One thing XML has over SGML is that it is tied more closely to a location
mechanism.  If you add in URN ability, there is no issue of "configuring"
your local system to know about mappings--you just do a URN lookup.

(Obviously, URNs can be misconfigured or not available.  Ever had
 problems on the Internet with DNS names?  Same idea, same problem, same
 frustration when it is wrong!)

R. Alexander Milowski     http://www.copsol.com/   alex@copsol.com
Copernican Solutions Incorporated                  (612) 379 - 3608

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)



Copyright 2001 XML.org. This site is hosted by OASIS