----- Original Message -----
From: "Ed Davies" <email@example.com>
Sent: Friday, April 30, 2004 11:58 AM
> As most people on this list will know, OpenOffice.org documents
> are stored as XML within a ZIP format file. The main file
> within the ZIP is called content.xml and starts with:
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE office:document-content
> PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN"
> "office.dtd">
> (line breaks added for mailability).
> The "office.dtd" system identifier here is a relative URI but,
> because the DTD is not in the document ZIP and is probably not
> even in the same directory as the document it is awfully
> confusing to any XML processor.
> My questions:
> 1. Is having a system id which doesn't actually refer to a DTD
> a sign of faulty XML (i.e., not valid or not well formed)?
> 2. Is this true even if the public identifier is OK?
> 3. What is the best way to deal with this case in a program
> using a SAX reader?
That depends on how much your app knows beforehand.
Does it have a registry of well-known DTDs, so that it can use
the public identifier to resolve the external ID?
Also, the XML spec allows the application context
to determine the base URI against which a relative URI is resolved.
> 4. What is the best way to deal with it when using a standalone
> XML tool like an XSLT program?
> 5. Would it help if OOo included standalone="no" in the XML
> declaration? (If the processor isn't validating and knows
> the document is standalone then presumably it doesn't have
> any reason to read the DTD?)
Depends on the processor.