   RE: [xml-dev] External parsed general entities


[Many apologies for the repost, but I forgot to include the RNC file in the 
last post; it is now included below.  I hope I haven't forgotten anything 
else.]

At 2002-08-01 10:38 +1000, Michael Leditschke wrote:
><snip/>
>
> > > > I do *not* use such constructs for sharing information as
> > fragments between
> > > > multiple XML instances; I use XSLT to extract fragments of an
> > XML document,
> > > > thus keeping every external parsed general entity in a single parsing
> > > > context (set of DTD declarations for entities).
>
>Hi Ken. Thought I'd try and confirm I understood what you
>meant and present Rick with a more complete example...
>
>If I understand what you are saying,
>
>1. One or more pieces of markup that need to be shared is placed
>in an external entity - a "fragment" file. This grouping is
>based on version control granularity.

Sorry, Michael, but it seems I steered you wrong from the get-go!  I hope 
to make amends below.

What I meant to convey is:

   (1) - using external parsed general entities is very useful for
         maintaining multiple fragments of a large XML file so that
         each can have its own revision history and you don't need
         to work with everything just to work with something
   (2) - having decided to use an external parsed general entity,
         one's intuition is to share that entity amongst other
         XML instances, but you mustn't do that because of the
         danger of hurting yourself long after you have created
         many XML instances that refer to the entity
   (3) - since sharing entities is bad, use XSLT to share information
         found in XML instances

I'll address your other points and then go through an example.

>2. One or more of these shared pieces are then included
>into a file via the XML 1.0 supplied mechanism - a
>"library" file.

Sure ... but that wasn't what I was trying to get across ... though in one 
case I have exactly that ... an instance named "shared.xml" that has 
slides common to all presentations, which I just use as a library of 
slides.  It happens in my case, though, that this library file isn't big 
enough to warrant using external parsed general entities.  It was just that 
since I had the sharing code working from large presentations, having a 
library presentation to extract from required zero additional code.

>3. To build a course, you then use an XSLT stylesheet
>to pull in the bits you want - a "grouping" file.

Again not quite.  A course is a course and if a new course wants a slide 
from an old course, the old course is referenced as a library as you 
described above.  If a new course doesn't need any slides from any old 
courses, nothing is different, it just happens to not go looking for any 
slides from old courses.

>Have I got the general jist?

You've described something that will work, but it isn't quite the gist of 
what I was trying to convey.  To me the maintenance of the files in 
entities is a separate issue from the sharing of portions of XML documents, 
but many users see the two and conflate them in one thought, and that is 
where the problems lie.

>The neat thing is that anyone can use someone else's library,
>without having to ensure they got all the DTD dependencies
>correct (provided the library and fragment files are web-accessible).

Precisely!  Anyone can use someone else's XML instance regardless of how 
that someone else chooses to maintain their instance using external parsed 
general entities.  One cannot just use someone else's external parsed 
general entity because that someone else might, unwittingly, render your 
XML instance not well-formed because they added a new dependency on an 
entity they have in their XML instance but you don't in your instance.

>The only downside I see is that you have to manually include
>any fragment dependencies within a grouping file, though you
>do have the protection of automatic checking afterwards. Some smart
>XSLT could probably automatically resolve these as well, depending
>on the structure used for the fragment and the grouping files.

True ... but in my case, a presentation is a standalone presentation 
without any knowledge of its content being used by another 
presentation.  The other presentation can point to and extract a slide from 
the first presentation without being impacted by the maintenance of the 
first presentation.

Below is a simple but functional example of the gist of my approach.  You 
can cut and paste each file into your own environment and it should all 
work, as I'm showing a running command line down below.

I create a course "michael1.xml" but the size gets too big or I need to 
maintain the second module separately, so I use an external parsed general 
entity to maintain the second module.  Each one is checked into my CVS and 
lives with its own revision cycle.

I create a new course "michael2.xml" and I need that second module.  The 
intuition is to just go ahead and refer to the external parsed general 
entity "michael1.ent", *but that can fail in the future*.  What if, months 
later, I modify "michael1.ent" to need the entity named "abc", and modify 
"michael1.xml" to declare that entity?  The course "michael1.xml" works 
just fine, but "michael2.xml" would then not even be well-formed and would 
be outright rejected by all applications with an XML processor.
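
As a sketch of that trap, here is a hypothetical version of "michael2.xml" 
written the intuitive way.  It is not one of the files in this post, and 
the entity name "abc" is only the illustrative name used above.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Hypothetical "michael2.xml" written the intuitive (risky) way: it
     shares michael1.ent directly as an external parsed general entity. -->
<!DOCTYPE course
[
<!ENTITY m1 SYSTEM "michael1.ent">
<!ATTLIST module id ID #REQUIRED>
]>
<course>
   <module id="m2-1">
     <title>Michael 2 - module 1</title>
   </module>
   <!-- Fine today.  If michael1.ent is later edited to contain a
        reference to the entity abc, and only michael1.xml gains the
        matching ENTITY declaration, then the reference below pulls in an
        undeclared entity and this document stops being well-formed. -->
   &m1;
</course>
```

Nothing in this file changes on the day it breaks; the failure arrives 
through the maintenance of someone else's entity.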

So, when I create "michael2.xml", I refer to "michael1.xml" as an unparsed 
entity and point to that entity in an attribute named "shared".

Note that with RELAX-NG, I can constrain a module to have either the shared 
attribute or module content but not both ... something you cannot do with a 
DTD.
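
For comparison, the closest DTD approximation might look like the sketch 
below.  These declarations are an illustration only and are not among the 
files in this post; declaring shared as ENTITY is an assumption (the RNC 
file types it as NMTOKEN).

```xml
<!-- Hypothetical DTD approximation of michael.rnc (illustration only). -->
<!ELEMENT course (module+)>
<!-- title must be optional so that a shared module can be empty -->
<!ELEMENT module (title?)>
<!ELEMENT title  (#PCDATA)>
<!-- shared must be optional so that a local module can omit it -->
<!ATTLIST module id     ID     #REQUIRED
                 shared ENTITY #IMPLIED>
```

A validating parser using these declarations accepts a module that has 
both a shared attribute and a title, and also a module that has neither; 
the either/or choice cannot be expressed.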

My course builder "michael.xsl", then, acts differently when the shared 
attribute is present: it finds the module in the XML instance pointed to by 
the shared attribute and just processes that instance's module with the 
same id.  Since I am building my HTML presentation slides from my course 
material, the use of the unique id ensures I don't have any slide naming 
conflicts.

So my three counsels are illustrated:

  (1) - use entities for large and/or fragmented XML file maintenance
  (2) - don't use entities for sharing fragments of XML information
  (3) - use XSLT for sharing fragments of XML information

I'm going to be very sad when entities are thrown out of XML.  To hear many 
on XML-Dev, that seems a fait accompli.  During the derivation of XML from 
SGML I voiced strong opinions about entities, especially the use of PUBLIC 
identifiers with entities, but those making the decisions were not 
swayed.  Not many people were on my side.

Note again that very few tools support entities.  Mike Kay helped me use 
the XP XML processor with the Saxon XSLT processor because the delivered 
Aelfred XML processor cannot handle my XML usage.  I created an acid test 
distillation of my environment and it has been used by a number of 
commercial XML processor manufacturers to debug their XML processors when I 
couldn't use them in my environment.

To this day Xalan is another XML processor that cannot handle my 
environment.  If anyone would like a copy of my acid test environment, 
please just let me know.  Actually, I just added this acid test environment 
to my "Resource library (free developer tools)" link off my home page 
(noted in my trailer) this morning, so anyone can just pull it down.

I hope this is considered useful ... sorry for the length of the post.

......................... Ken


t:\ftemp>type michael.rnc
namespace a = "http://relaxng.org/ns/annotation/1.0"
datatypes c = "http://relaxng.org/ns/compatibility/datatypes/1.0"
datatypes x = "http://www.w3.org/2001/XMLSchema-datatypes"

<a:annotation>
     $Id: michael.rnc,v 1.1 2002/08/01 12:31:58 G. Ken Holman Exp $

     A course of modules, some of which might be shared.
</a:annotation>

start = element course
  {
   element module
    {
     attribute id { c:ID },
     (
      attribute shared { x:NMTOKEN }
      |
      element title { text }
     )
    }+
  }

# end of file

t:\ftemp>type michael1.xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- $Id: michael1.xml,v 1.1 2002/08/01 12:31:58 G. Ken Holman Exp $ -->
<!DOCTYPE course
[
<!ENTITY m1 SYSTEM "michael1.ent">
<!ATTLIST module id ID #REQUIRED>
]>
<course>
   <module id="m1-1">
     <title>Michael 1 - module 1</title>
   </module>
   &m1;
   <module id="m1-3">
     <title>Michael 1 - module 3</title>
   </module>
</course>

t:\ftemp>type michael1.ent
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- $Id: michael1.ent,v 1.1 2002/08/01 12:31:58 G. Ken Holman Exp $ -->
   <module id="m1-2">
     <title>Michael 1 - module 2</title>
   </module>
<!--end of file-->

t:\ftemp>call rnc michael.rnc michael1.xml
No validation errors.

t:\ftemp>type michael2.xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- $Id: michael2.xml,v 1.1 2002/08/01 12:31:58 G. Ken Holman Exp $ -->
<!DOCTYPE course
[
<!ATTLIST module id ID #REQUIRED>
<!NOTATION xml SYSTEM "http://www.w3.org/XML/1998/namespace">
<!ENTITY michael1 SYSTEM "michael1.xml" NDATA xml>
]>
<course>
   <module id="m2-1">
     <title>Michael 2 - module 1</title>
   </module>
   <module id="m1-1" shared="michael1"/>
   <module id="m2-3">
     <title>Michael 2 - module 3</title>
   </module>
</course>

t:\ftemp>call rnc michael.rnc michael2.xml
No validation errors.

t:\ftemp>type michael.xsl
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 version="1.0">

<!--$Id: michael.xsl,v 1.1 2002/08/01 12:31:58 G. Ken Holman Exp $-->

<xsl:output method="text"/>

<xsl:template match="course">
A course:<xsl:apply-templates select="module"/>
</xsl:template>

<xsl:template match="module[@shared]">
   <xsl:variable name="look-id" select="@id"/>
   <xsl:for-each select="document(unparsed-entity-uri(@shared))">
     <xsl:apply-templates select="id($look-id)"/>
   </xsl:for-each>
</xsl:template>

<xsl:template match="module">
   module: <xsl:apply-templates select="title"/>
</xsl:template>

</xsl:stylesheet>

t:\ftemp>xt michael1.xml michael.xsl michael1.txt

t:\ftemp>type michael1.txt

A course:
   module: Michael 1 - module 1
   module: Michael 1 - module 2
   module: Michael 1 - module 3
t:\ftemp>xt michael2.xml michael.xsl michael2.txt

t:\ftemp>type michael2.txt

A course:
   module: Michael 2 - module 1
   module: Michael 1 - module 1
   module: Michael 2 - module 3
t:\ftemp>rem Done!


--
Upcoming hands-on in-depth 3-days XSLT/XPath and/or 2-days XSL-FO:
-                               North America:  Sep 30-Oct  4,2002
-                               Japan:          Oct  7-Oct 11,2002

G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/x/
Box 266, Kars, Ontario CANADA K0A-2E0  +1(613)489-0999 (Fax:-0995)
ISBN 0-13-065196-6                       Definitive XSLT and XPath
ISBN 1-894049-08-X   Practical Transformation Using XSLT and XPath
ISBN 1-894049-07-1                Practical Formatting Using XSLFO
XSL/XML/DSSSL/SGML/OmniMark services, books (electronic, printed),
articles, training (instructor-live,Internet-live,web/CD,licensed)
Next public training:           2002-08-05,26,27,09-30,10-03,07,10





 
