RE: [xml-dev] Parsing without resolving entities

Thanks for the notes on How to solve this issue.  I was really hoping to get a different answer!  :-)  I hadn't considered modifying the entity file or using processing instructions to protect the entities from being resolved.
 
Can anyone address the Why and include the perspective of a parser requirements writer / standards committee member?  To me, this seems like valuable functionality that is lacking from the current tools. 
 
>> Randy
 
-----Original Message-----
From: Michael Kay [mailto:mike@saxonica.com]
Sent: Monday, October 29, 2007 12:49 PM
To: Randy McGarvey; xml-dev@lists.xml.org
Subject: RE: [xml-dev] Parsing without resolving entities

It's a real pain that doesn't have a common solution. I tend to:
 
(a) avoid using entities. Because I only ever use XML via XSLT, processing-instructions are much more manageable.
 
(b) if I do use entities, don't rely on them remaining intact - i.e. there should be no difference in information content between an entity and its expansion (so you can always re-entitize mechanistically if you need to).
 
(c) preprocess, as suggested, to replace the ampersands by something else: for example <?ent mdash?>.
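
For illustration, option (c) might look roughly like the following in Python. This is only a sketch under assumptions: the helper names (`protect_entities`, `restore_entities`, `round_trip`) are made up for the example, the standard-library ElementTree parser stands in for whatever toolchain is actually in use, and the regular expressions handle references in element content only. A reference inside an attribute value cannot be swapped for a processing instruction without breaking well-formedness, and CDATA sections are not treated specially.

```python
import re
import xml.etree.ElementTree as ET

# The five entities XML predefines; a parser resolves these without a DTD,
# so this sketch leaves them alone.
PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# Named references such as &sect; (numeric ones like &#167; do not match).
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")
# The placeholder form suggested in (c): <?ent name?>.
ENTITY_PI = re.compile(r"<\?ent\s+([A-Za-z_][\w.-]*)\s*\?>")


def protect_entities(xml_text):
    """Hypothetical pre-process step: turn &name; into <?ent name?>."""
    def to_pi(match):
        name = match.group(1)
        if name in PREDEFINED:
            return match.group(0)
        return "<?ent %s?>" % name
    return ENTITY_REF.sub(to_pi, xml_text)


def restore_entities(xml_text):
    """Hypothetical post-process step: turn <?ent name?> back into &name;."""
    return ENTITY_PI.sub(r"&\1;", xml_text)


def round_trip(xml_text):
    """Protect, parse, serialize, restore."""
    # insert_pis=True (Python 3.8+) keeps processing instructions in the tree;
    # by default ElementTree drops them.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_pis=True))
    root = ET.fromstring(protect_entities(xml_text), parser=parser)
    return restore_entities(ET.tostring(root, encoding="unicode"))
```

Between the two steps the document contains no undeclared entity references, so any XML tool that preserves processing instructions can work on it.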
 
Michael Kay
http://www.saxonica.com/


From: Randy McGarvey [mailto:rmcgarvey@generalcode.com]
Sent: 29 October 2007 15:04
To: xml-dev@lists.xml.org
Subject: [xml-dev] Parsing without resolving entities

If I have data with character entities such as &sect; or &mdash; in the XML, what is the best way to keep those intact, as is, after a parse? Are there any parsers that have an option not to resolve entities? What is the best way you've found to deal with this issue? Do you escape the ampersands (e.g. &amp;sect;) in a pre-process? Do you address it in an entity handler to re-write the original entity text? This seems like a real pain that must have a common solution.
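
To make the "escape the ampersands in a pre-process" idea concrete, here is a minimal Python sketch. The function names (`escape_refs`, `unescape_refs`) are hypothetical, and the standard-library ElementTree parser is used only as a convenient example. One known weakness, stated up front: text that already contained a literal `&amp;sect;` before the pre-process cannot be told apart afterwards and would be rewritten to `&sect;` as well.

```python
import re
import xml.etree.ElementTree as ET

# A named reference that is not one of XML's five predefined entities.
NAMED_REF = re.compile(r"&(?!(?:amp|lt|gt|apos|quot);)([A-Za-z_][\w.-]*);")
# The escaped form produced by escape_refs (and by the serializer).
ESCAPED_REF = re.compile(r"&amp;(?!(?:amp|lt|gt|apos|quot);)([A-Za-z_][\w.-]*);")


def escape_refs(xml_text):
    """Pre-process: &sect; becomes &amp;sect; so the parser sees plain text."""
    return NAMED_REF.sub(r"&amp;\1;", xml_text)


def unescape_refs(xml_text):
    """Post-process: &amp;sect; in serialized output becomes &sect; again."""
    return ESCAPED_REF.sub(r"&\1;", xml_text)


source = '<p title="a &mdash; b">&sect; 12</p>'
root = ET.fromstring(escape_refs(source))

# Inside the tree the reference is ordinary text, in attributes as well.
text_in_tree = root.text
title_in_tree = root.get("title")

# Serializing escapes the ampersand again; the post-process undoes that.
restored = unescape_refs(ET.tostring(root, encoding="unicode"))
```

Unlike a processing-instruction placeholder, this form also survives inside attribute values, at the cost of the ambiguity noted above.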

Thanks!
>> Randy


Copyright 1993-2007 XML.org. This site is hosted by OASIS