Re: [xml-dev] How to handle "newline" characters in an XML parser.

I should have clarified some things. :]

My parser is actually for a subset of XML, with no attributes,
comments, or CDATA. I looked at existing XML pull-parsers, but didn't
find one that made use of an event broadcasting mechanism like I

The only reason I'm interested in counting lines is for error
reporting. After reading the messages, I think I've got more of a Java
problem than an XML problem. It really doesn't matter what the XML
specs say in this particular case. What I need to figure out is how to
count the lines in text files created on different operating systems.
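
A minimal sketch of platform-independent line counting, assuming the
text is already available as characters. The class and method names
(LineCounter, countLineBreaks) are made up for illustration. It treats
LF, CR LF, and a lone CR each as a single line break, so the line
number for an error report would be one more than the breaks seen
before the error position.

```java
public final class LineCounter {
    private LineCounter() {}

    /**
     * Counts line breaks in the given text. LF, CR LF, and a CR that is
     * not followed by LF each count as exactly one break.
     */
    public static int countLineBreaks(CharSequence text) {
        int breaks = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n') {
                breaks++;
            } else if (c == '\r') {
                breaks++;
                // CR LF is one break: skip the LF so it is not counted twice.
                if (i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
            }
        }
        return breaks;
    }
}
```

For input that arrives through a Reader, the standard
java.io.LineNumberReader recognizes the same three terminators and
tracks the current line number, which may be enough on its own.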

I do appreciate everyone's help and suggestions.

Scott Huey

P.S. - When I get the functional code for my parser online I'll post a
message to this list. Most won't be interested in the parser design,
but a few might be. :]

On 12/5/06, Klotz, Leigh <Leigh.Klotz@xerox.com> wrote:
> Also note that XML 1.1 makes changes in newlines, which you might want
> to read since you're interested in this area.
> Leigh.
> -----Original Message-----
> From: Richard Tobin [mailto:richard@inf.ed.ac.uk]
> Sent: Tuesday, December 05, 2006 11:39 AM
> To: xml-dev@lists.xml.org
> Subject: Re: [xml-dev] How to handle "newline" characters in an XML
> parser.
> In article <e24752a10612051124j501ffe3i7469c64a94ea4959@mail.gmail.com>
> you write:
> >I'm having some trouble figuring out how to handle "newline"
> >characters in XML text files on different platforms. I typically
> >ignore all whitespace in the parser,
> To conform to the standard, an XML parser must return all whitespace
> in content to the application, except that line breaks must be
> normalized to a linefeed character.  The idea is that XML applications
> don't have to worry about the platform's line-end conventions.  Any of
> the following count as a line break: LF, CR LF, and CR not followed by
> LF.  So if you get two CRs followed by a LF, you should return two
> LFs.  The easiest way to do this is to convert them as you input them,
> before parsing.  You can count line numbers at the same time.
> -- Richard
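
The normalization Richard describes could be sketched roughly as
follows. This is an illustration only, covering the XML 1.0 rules
quoted above; the class and method names are hypothetical, and it
works on a whole String rather than a stream. XML 1.1 additionally
treats NEL (U+0085) and LINE SEPARATOR (U+2028) as line ends, which
this sketch does not handle.

```java
public final class NewlineNormalizer {
    private NewlineNormalizer() {}

    /**
     * Returns a copy of raw in which every CR LF pair and every CR not
     * followed by LF is replaced by a single LF. Existing LFs are kept.
     */
    public static String normalize(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\r') {
                out.append('\n');
                // Swallow the LF of a CR LF pair so it yields only one LF.
                if (i + 1 < raw.length() && raw.charAt(i + 1) == '\n') {
                    i++;
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this, two CRs followed by an LF come out as two LFs, matching the
example in the message above. After normalization, counting lines is
just a matter of counting LF characters.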

Copyright 1993-2007 XML.org. This site is hosted by OASIS