Re: [xml-dev] External subset processing by browsers
- From: George Cristian Bina <george@oxygenxml.com>
- To: Andrew Welch <andrew.j.welch@gmail.com>
- Date: Mon, 08 Dec 2008 13:21:00 +0200
Hi Andrew,
Try setting http://xml.org/sax/features/external-general-entities to
false. See also:
http://xerces.apache.org/xerces2-j/features.html#external-general-entities
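
A minimal sketch of that setting, in case it helps. It assumes the JAXP default parser rather than the explicit Xerces class, and the class and method names (EntityFeatureSketch, newReader, tryParse) are made up for illustration. Whether turning the feature off is enough to avoid the "referenced, but not declared" error when the document has no DOCTYPE at all is not something this sketch confirms, so it only reports what the parser does:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class EntityFeatureSketch {

    static final String EXTERNAL_GENERAL_ENTITIES =
            "http://xml.org/sax/features/external-general-entities";

    // Builds a reader with external general entities switched off.
    static XMLReader newReader() {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(EXTERNAL_GENERAL_ENTITIES, false);
            return reader;
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses the string and reports the outcome instead of throwing.
    static String tryParse(String xml) {
        XMLReader reader = newReader();
        final StringBuilder skipped = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void skippedEntity(String name) {
                // Called only if the parser chooses to skip an entity reference.
                skipped.append(name).append(' ');
            }
        };
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler); // DefaultHandler rethrows fatal errors
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return "parsed; skipped entities: [" + skipped.toString().trim() + "]";
        } catch (SAXException | IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The feed-style input from the original question (no DOCTYPE).
        System.out.println(tryParse("<foo>foo &euro; bar</foo>"));
    }
}
```

If the parser still reports a fatal error for the no-DOCTYPE case, the skippedEntity callback never fires and the message comes back in the "failed: ..." string, which at least makes the behaviour easy to compare across parser versions.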
Best Regards,
George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
Andrew Welch wrote:
> Hi Elliotte,
>
> 2008/12/5 Elliotte Rusty Harold <elharo@metalab.unc.edu>:
>> Firefox. There are two separate issues here:
>>
>> 1. Whether Firefox should read the external DTD subset.
>> 2. How it should treat unrecognized entities when it doesn't read the
>> external subset.
>>
>> Let me check the spec, but my recollection is that if the external DTD
>> subset is not read, unrecognized entities are not a fatal error.
>
> I have a similar issue, for example there are some RSS feeds which
> contain entity references but no doctype:
>
> <foo>foo &euro; bar</foo>
>
> I was trying to handle them by supplying a LexicalHandler (to trap
> and convert them to numeric refs), and setting a few Xerces features,
> but the parser always throws an exception before the startEntity event.
>
> Sample code (using Xerces 2.9.0):
>
> import java.io.StringReader;
> import org.xml.sax.InputSource;
> import org.xml.sax.SAXException;
> import org.xml.sax.XMLReader;
> import org.xml.sax.ext.LexicalHandler;
> import org.xml.sax.helpers.XMLFilterImpl;
> import org.xml.sax.helpers.XMLReaderFactory;
>
> public class Test extends XMLFilterImpl implements LexicalHandler {
>
>     public static void main(String... args) throws Exception {
>         new Test();
>     }
>
>     public Test() throws Exception {
>
>         String xml = "<foo>foo &euro; bar</foo>";
>
>         XMLReader xmlReader =
>             XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
>         xmlReader.setProperty(
>             "http://xml.org/sax/properties/lexical-handler", this);
>         xmlReader.setFeature(
>             "http://apache.org/xml/features/scanner/notify-char-refs", true);
>         xmlReader.setFeature(
>             "http://apache.org/xml/features/validation/unparsed-entity-checking", false);
>         xmlReader.setFeature(
>             "http://xml.org/sax/features/external-parameter-entities", false);
>         xmlReader.setEntityResolver(this);
>         xmlReader.parse(new InputSource(new StringReader(xml)));
>     }
>
>     @Override
>     public void startDocument() throws SAXException {
>         super.startDocument();
>     }
>
>     public void startEntity(String name) throws SAXException {
>         System.out.println("Start ent: " + name);
>     }
>
>     public void endEntity(String name) throws SAXException { }
>     public void startCDATA() throws SAXException { }
>     public void endCDATA() throws SAXException { }
>     public void startDTD(String name, String publicId, String systemId)
>             throws SAXException { }
>     public void endDTD() throws SAXException { }
>     public void comment(char[] ch, int start, int length)
>             throws SAXException { }
> }
>
> The output when running this is:
>
> [Fatal Error] :1:16: The entity "euro" was referenced, but not declared.
> Exception in thread "main" org.xml.sax.SAXParseException: The entity
> "euro" was referenced, but not declared.
> at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
> at Test.<init>(Test.java:37)
>
>
> It would be really nice to handle this non-well-formed input using XML
> tools without resorting to a regex replace across every feed... I'm
> not sure it's possible but the features make it seem like it should be
> - any ideas?
>
>
> thanks