OASIS Mailing List Archives


Re: [xml-dev] Why the double escape for lt ? (that is <!ENTITY lt "&#38;#60;"> )

At 2016-11-04 17:19 -0700, David John Burrowes wrote:
> Your comment was useful, and helped pull my mind out of the rut it was in. Still, it doesn’t really seem to offer a definitive reason why internal and external entities are processed differently.
Because the replacement text of an internal entity is determined only after the entity's literal value has been parsed for any parameter entity references that need to be expanded as part of the replacement text. Since the value is being parsed for parameter entity references anyway, that same parsing step also resolves numeric character references (so that an escaped "%" does not inadvertently reference an undesired parameter entity). The end result of that parsing step becomes the replacement text.

The replacement text of an external entity is exactly what is in the file, and so it does not need to be parsed for references to parameter entities. Without the need to parse the content, the content can be used as is as the replacement text.

The replacement text is then parsed in the context of where it is placed in the stream by the entity reference.
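
To make that construction step concrete, here is a rough Python model of it. This is only a sketch under my own assumptions: the names `replacement_text` and `params` are made up for illustration, and the single regex pass glosses over things a real XML processor has to handle (quote handling, nested parameter entities, name characters outside ASCII).

```python
import re

# Matches a hex character reference, a decimal character reference,
# or a parameter entity reference. General entity references such as
# "&lt;" are deliberately not matched, so they pass through untouched.
_REF = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|%([A-Za-z_:][\w.:-]*);")


def replacement_text(literal, params=None):
    """Model how an internal entity's literal value becomes replacement text.

    `params` is a hypothetical mapping from parameter entity names to
    their (already computed) replacement text.
    """
    params = params or {}

    def expand(match):
        hex_digits, dec_digits, pe_name = match.groups()
        if hex_digits is not None:
            return chr(int(hex_digits, 16))
        if dec_digits is not None:
            return chr(int(dec_digits))
        return params[pe_name]

    # One pass only: text produced by an expansion is not scanned again.
    return _REF.sub(expand, literal)


print(replacement_text("&#38;#60;"))
print(replacement_text("%imgdir;areas.%v;", {"imgdir": "images/", "v": "wmf"}))
print(replacement_text("&#37;imgdir;", {"imgdir": "images/"}))
```

The three calls print `&#60;`, `images/areas.wmf` and `%imgdir;`. The first result is the interesting one: the character reference that survives into the replacement text is only resolved later, when the entity is referenced in the document.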

In my XML work for my training material I compose numerous entities using the values of multiple parameter entities. It was a practice I followed in the SGML days. It ain't pretty but it does what I need it to do. Here is an excerpt:

<!ENTITY % imgdir "images/">
<!ENTITY % b "bmp"><!--bitmap extension "gif" or "bmp"-->
<!ENTITY % v "wmf"><!--vector extension "cgm" or "wmf"-->
<!ENTITY % areas "<!ENTITY areas SYSTEM '%imgdir;areas.%v;' NDATA %v;>">%areas;
<!ENTITY % axes "<!ENTITY axes SYSTEM '%imgdir;axes.%v;' NDATA %v;>">%axes;
<!ENTITY % book1 "<!ENTITY book1 SYSTEM '%imgdir;book1.%b;' NDATA %b;>">%book1;
<!ENTITY % bookalt "<!ENTITY bookalt SYSTEM '%imgdir;bookalt.%b;' NDATA %b;>">%bookalt;

So ... in the entity value of the parameter entity "%areas;" (which in turn declares the general entity "areas") I have the sequence "%imgdir;" which gets expanded because it is a parameter entity reference. But what if I wanted the string "%imgdir;" instead of the parameter entity reference? I need to escape the "%", so I need to use &#x25; or &#37; in order to encode the "%" to be a simple "%" and not the reference. So, internal entity replacement text processing needs to do numeric character reference processing, which is part and parcel of entity processing.
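
If you want to see the escaped "%" behave this way in a real parser, a small check with Python's standard library (which uses expat underneath) could look like the following. The entity name `literal` and the document are made up for the test; this is a sketch, not a statement about how every processor reports things.

```python
import xml.etree.ElementTree as ET

# "&#37;" is resolved while the replacement text is built, leaving a
# literal "%" that is never treated as the start of a parameter entity
# reference.
doc = """<!DOCTYPE doc [
<!ENTITY literal "&#37;imgdir;">
]>
<doc>&literal;</doc>"""

text = ET.fromstring(doc).text
print(text)
```

The text of the element comes back as the plain string `%imgdir;`.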

Therefore, if you want an entity reference injected into your stream, one found in an external entity is coded as you would think, but one found in an internal entity has to be the result of entity reference processing, thus requiring the double escaping. You have to compose the replacement string you then want processed.
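
The double escape from the subject line can be tried the same way. Again a sketch with assumptions of my own: the entity is called `less` rather than `lt` so that it does not collide with the predefined one, `parse_with` is a hypothetical helper, and the parser is the expat-based one in Python's standard library.

```python
import xml.etree.ElementTree as ET

TEMPLATE = """<!DOCTYPE doc [
<!ENTITY less "{value}">
]>
<doc>a &less; b</doc>"""


def parse_with(value):
    """Return the text of <doc>, or None if the parser rejects the document."""
    try:
        return ET.fromstring(TEMPLATE.format(value=value)).text
    except ET.ParseError:
        return None


double = parse_with("&#38;#60;")  # replacement text is "&#60;"
single = parse_with("&#60;")      # replacement text is a bare "<"
print(repr(double), repr(single))
```

With the double escape the replacement text is "&#60;", which is resolved to a data character when "&less;" is referenced, so the element text is "a < b". With the single escape the replacement text is already a bare "<", which is read as the start of markup at the point of reference, and the document is rejected.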

There is a reason. A bit arcane, but it wasn't done frivolously.

I hope this helps.

. . . . . . . Ken

Check our site for free XML, XSLT, XSL-FO and UBL developer resources |
Streaming hands-on XSLT/XPath 2 training @US$45: http://goo.gl/Dd9qBK |
Crane Softwrights Ltd. _ _ _ _ _ _ http://www.CraneSoftwrights.com/x/ |
G Ken Holman _ _ _ _ _ _ _ _ _ _ mailto:gkholman@CraneSoftwrights.com |
Google+ blog _ _ _ _ _ http://plus.google.com/+GKenHolman-Crane/posts |
Legal business disclaimers: _ _ http://www.CraneSoftwrights.com/legal |
