Re: [xml-dev] How to handle "newline" characters in an XML parser.
- From: "Redefined Horizons" <redefined.horizons@gmail.com>
- To: "Klotz, Leigh" <Leigh.Klotz@xerox.com>
- Date: Fri, 8 Dec 2006 16:21:00 -0800
I should have clarified some things. :]
My parser is actually for a subset of XML, with no attributes,
comments, or CDATA. I looked at existing XML pull-parsers, but didn't
find one that made use of an event broadcasting mechanism like I
wanted.
The only reason I'm interested in counting lines is for the error
reporting. After reading the messages I think I've got more of a Java
problem than an XML problem. It really doesn't matter what the XML
Specs say in this particular case. What I need to figure out is how to
count the lines in text files created in different operating systems.
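
One way to count lines independently of the operating system that wrote the file is to treat LF, CR LF, and a lone CR each as a single line break. The following is only a minimal sketch of that idea, not the parser's actual code; the class name `LineCounter` and the method `countLineBreaks` are made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper, not part of any existing parser.
final class LineCounter {
    private LineCounter() {}

    /** Counts line breaks, treating LF, CR LF, and a lone CR as one break each. */
    static int countLineBreaks(Reader in) throws IOException {
        int breaks = 0;
        boolean lastWasCr = false;
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\r') {
                // A CR always ends a line, whether or not an LF follows it.
                breaks++;
                lastWasCr = true;
            } else {
                // An LF directly after a CR belongs to the same CR LF break.
                if (c == '\n' && !lastWasCr) {
                    breaks++;
                }
                lastWasCr = false;
            }
        }
        return breaks;
    }
}
```

For error reporting, the line number of a given position would be the number of breaks seen before it, plus one.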
I do appreciate everyone's help and suggestions.
Scott Huey
P.S. - When I get the functional code for my parser online I'll post a
message to this list. Most won't be interested in the parser design,
but a few might be. :]
On 12/5/06, Klotz, Leigh <Leigh.Klotz@xerox.com> wrote:
> Also note that XML 1.1 makes changes in newlines, which you might want
> to read since you're interested in this area.
> Leigh.
>
> -----Original Message-----
> From: Richard Tobin [mailto:richard@inf.ed.ac.uk]
> Sent: Tuesday, December 05, 2006 11:39 AM
> To: xml-dev@lists.xml.org
> Subject: Re: [xml-dev] How to handle "newline" characters in an XML
> parser.
>
> In article <e24752a10612051124j501ffe3i7469c64a94ea4959@mail.gmail.com>
> you write:
>
> >I'm having some trouble figuring out how to handle "newline"
> >characters in XML text files on different platforms. I typically
> >ignore all whitespace in the parser,
>
> To conform to the standard, an XML parser must return all whitespace
> in content to the application, except that line breaks must be
> normalized to a linefeed character. The idea is that XML applications
> don't have to worry about the platform's line-end conventions. Any of
> the following count as a line break: LF, CR LF, and CR not followed by
> LF. So if you get two CRs followed by a LF, you should return two
> LFs. The easiest way to do this is to convert them as you input them,
> before parsing. You can count line numbers at the same time.
>
> -- Richard
>
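
Richard's suggestion in the quoted message (convert line breaks as the input is read, and count lines at the same time) could be sketched roughly as below. This is an illustration under assumptions, not code from anyone in the thread: the class name `NewlineNormalizingReader` and its methods are hypothetical, and it relies only on the standard `java.io.PushbackReader` to look one character ahead after a CR.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;

// Hypothetical wrapper: normalizes line breaks to LF and tracks the line number.
final class NewlineNormalizingReader {
    private final PushbackReader in;
    private int line = 1; // 1-based number of the line currently being read

    NewlineNormalizingReader(Reader source) {
        this.in = new PushbackReader(source, 1);
    }

    /** Returns the next character with CR LF and lone CR mapped to '\n', or -1 at end of input. */
    int read() throws IOException {
        int c = in.read();
        if (c == '\r') {
            int next = in.read();
            // Swallow the LF of a CR LF pair; push anything else back for the next call.
            if (next != '\n' && next != -1) {
                in.unread(next);
            }
            c = '\n';
        }
        if (c == '\n') {
            line++;
        }
        return c;
    }

    /** Line number of the next character to be read. */
    int getLine() {
        return line;
    }
}
```

With this wrapper, two CRs followed by an LF come out as two LFs, and the parser itself only ever sees `'\n'`, so it can report `getLine()` in error messages without knowing which platform produced the file.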