
   Re: [xml-dev] Parsing Kanji (Japanese) characters...


This is not really a problem of parsing Kanji.  0x1b is a control character
(specifically, it's the escape character, ESC).  XML 1.0 doesn't allow any of
the C0 control characters except CR, LF, and HT.  So either someone has
created a document that is not well-formed, or the parser doesn't recognize
the encoding and is reading the bytes as UTF-8, in which case the 0x1b byte
decodes to U+001B, which is not a legal XML character.
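
To make the rule concrete, here is a minimal sketch of a check against the
Char production of XML 1.0. The class and method names (XmlCharCheck,
isXml10Char, firstIllegal) are made up for illustration; they are not part of
SAX or Xerces. It only shows which code points a conforming parser must
reject, not how Xerces implements the check.

```java
class XmlCharCheck {
    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD      // HT, LF, CR
            || (cp >= 0x20 && cp <= 0xD7FF)             // excludes other C0 controls
            || (cp >= 0xE000 && cp <= 0xFFFD)           // excludes surrogates, U+FFFE, U+FFFF
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the char index of the first illegal code point, or -1 if none. */
    static int firstIllegal(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXml10Char(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        // ESC (U+001B) sits at index 3 of this string.
        System.out.println(firstIllegal("abc\u001bdef"));
        // Two Kanji characters: both are legal XML characters.
        System.out.println(firstIllegal("\u6F22\u5B57"));
    }
}
```

Running a check like this over the decoded text tells you which of the two
cases you are in.  If the text really contains U+001B, the document itself is
broken.  If the bytes are in an escape-based encoding such as ISO-2022-JP
(an assumption; the original post doesn't say), the fix is to make the
encoding known to the parser, for example in the XML declaration or with
InputSource.setEncoding().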

Amy!
On Wed, Jun 18, 2003 at 02:02:54PM -0400, nizar.hirani@citicorp.com wrote:
>
>Hi -
>
>Is the SAX Parser able to handle Kanji characters? Any help/pointers are
>appreciated.
>
>I am trying to parse a document in Kanji characters and get the following
>stack trace:
>
>java.rmi.RemoteException: EJB Exception: ; nested exception is:
>com.citicorp.WebTicketRouter.BusinessServices.DealTicketManagementServices.DealTicketManagerServiceException: nested exception is:
>org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x1b) was found in the element content of the document.
>org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x1b) was found in the element content of the document.
>  at weblogic.apache.xerces.framework.XMLParser.reportError(XMLParser.java:1292)
>  at weblogic.apache.xerces.framework.XMLDocumentScanner.reportFatalXMLError(XMLDocumentScanner.java:613)
>  at weblogic.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1336)
>  at weblogic.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:399)
>  at weblogic.apache.xerces.framework.XMLParser.parse(XMLParser.java:1138)
>  at weblogic.apache.xerces.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:203)
>  at weblogic.xml.jaxp.RegistryDocumentBuilder.parse(RegistryDocumentBuilder.java:144)
>  at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:86)
>  at com.citicorp.WebTicketRouter.BusinessServices.DealTicketManagementServices.DealTicketManagerEJB.getDocumentElement(DealTicketManagerEJB.java:1225)
>
>
>
>
>Nizar Hirani
>

-- 
Amelia A. Lewis                    amyzing {at} talsever.com
  Light is the left hand of darkness
  and darkness the right hand of light.
    Two are one, life and death, lying
    together like lovers in kemmer,
      like hands joined together,
      like the end and the way.
        -- Tormer's Lay [Ursula K. Le Guin, "The Left Hand of Darkness"]




 
