OASIS Mailing List Archives


Re: [xml-dev] Achieving interoperability in a world where different OS's represent newline differently

CRLF can also be an issue when translating between different regular
expression (RE) flavors, as happens, e.g., in XSD processors.

4 years ago I had to fix this APAR:

XML Schema regular expressions (SCRE) are implicitly anchored at start and
end, and the XSD processor translates SCRE to PCRE. The fix was to replace
")$" by ")(?!\n)$" so that PCRE does not match the input "9\n" for the SCRE
"9". By default, "$" in PCRE also matches just before a trailing newline;
the "(?!\n)" negative lookahead fixed that.
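
To make the "$" behaviour concrete, here is a minimal sketch. It uses
Python's re module rather than PCRE itself (an assumption made for
convenience; the two share this particular "$" behaviour), and the names
naive, fixed and matches are made up for this illustration:

```python
import re

# Hypothetical translation of the XSD pattern "9" into an anchored
# Perl-style regex, before and after the fix described above.
naive = re.compile(r"^(9)$")
fixed = re.compile(r"^(9)(?!\n)$")


def matches(pattern, text):
    """Return True if the anchored pattern accepts the text."""
    return pattern.search(text) is not None


print(matches(naive, "9"))    # True
print(matches(naive, "9\n"))  # True: "$" also matches before a final newline
print(matches(fixed, "9"))    # True
print(matches(fixed, "9\n"))  # False: the lookahead rejects the trailing "\n"
```

In Python specifically, re.fullmatch or the "\Z" anchor would give the same
result; the lookahead form is shown because it is the fix quoted above.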

Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team & Fixpack team lead
IBM DataPower Gateways
IBM Deutschland Research & Development GmbH
Chairwoman of the Supervisory Board: Martina Koederitz
Management: Dirk Wittkopp
Registered office: Boeblingen
Registration court: Amtsgericht Stuttgart, HRB 243294

  From:       "Costello, Roger L." <costello@mitre.org>
  To:         "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
  Date:       12/13/2014 03:11 PM
  Subject:    [xml-dev] Achieving interoperability in a world where different OS's represent newline differently

Hi Folks,

Each operating system has its own convention for signifying the end of a
line of text:

Unix: the newline is a character, with the value hex A (LF).

MS Windows: the newline is a combination of two characters, with the values
hex D (CR) and hex A (LF), in that order.

Classic Mac OS (through version 9): the newline is a character, with the
value hex D (CR). Mac OS X and later use hex A (LF), as Unix does.

This operating-system dependency of newlines can cause interoperability
problems: the newlines in a string created on a Unix box may not be
interpreted as line breaks by applications running on a Windows box, and
vice versa.
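
A small sketch of how this goes wrong in practice, assuming a program that
splits text on LF only. The sample strings and the naive_lines helper are
invented for this illustration:

```python
# The same two-line text under the three conventions listed above.
unix_text = "alpha\nbeta\n"
windows_text = "alpha\r\nbeta\r\n"
old_mac_text = "alpha\rbeta\r"


def naive_lines(text):
    """Split on LF only, as a Unix-centric program might."""
    return text.split("\n")


print(naive_lines(unix_text))     # ['alpha', 'beta', '']
print(naive_lines(windows_text))  # ['alpha\r', 'beta\r', '']
print(naive_lines(old_mac_text))  # ['alpha\rbeta\r']

# str.splitlines() recognises LF, CRLF and CR alike.
for text in (unix_text, windows_text, old_mac_text):
    print(text.splitlines())      # ['alpha', 'beta'] each time
```

The stray "\r" characters in the second result, and the single unsplit line
in the third, are the kind of mismatch the paragraph above describes.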

Here is how the newline problem is resolved in XML and in JSON:

XML: an XML parser normalizes all literal newlines (CRLF pairs and lone
CRs) to hex A (LF) before passing the content to the application. So it
doesn't matter whether you create your XML document on a Unix box, a
Windows box, or a Macintosh box; all newlines will be represented as hex A
(LF).
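
The normalization can be observed with any conforming parser. The sketch
below uses Python's xml.etree.ElementTree, chosen only for illustration;
the <note> document is made up:

```python
import xml.etree.ElementTree as ET

# One document mixing all three conventions: CRLF, lone CR, and LF.
doc = b"<note>line one\r\nline two\rline three\nline four</note>"
text = ET.fromstring(doc).text
print(repr(text))  # 'line one\nline two\nline three\nline four'

# A CR written as a character reference is not a literal newline in the
# source, so it is not normalized and reaches the application as CR.
escaped = ET.fromstring(b"<note>a&#13;&#10;b</note>").text
print(repr(escaped))  # 'a\r\nb'
```

The second case shows that an author who really needs a CR in the data can
still get one through, by writing it as a character reference.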

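JSON: multi-line strings are not permitted! A literal newline character
may not appear inside a JSON string, so the newline problem does not arise
in the JSON text itself. You can, however, embed within your JSON strings
the \n (LF) or \r\n (CRLF) escape sequences, to instruct processing
applications: "Hey, I would like a newline here." Note that JSON does not
normalize these escapes; whichever convention you write is passed through
to the receiving application.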
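
A minimal sketch of the JSON behaviour, using Python's json module in its
default (strict) mode. The "msg" key and the parses helper are invented for
this illustration:

```python
import json


def parses(text):
    """Return True if text is accepted as JSON by the strict parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# A literal LF inside a JSON string is rejected.
raw = '{"msg": "line one\nline two"}'
print(parses(raw))  # False

# The escaped form is accepted, and the CRLF is passed through unchanged.
escaped = '{"msg": "line one\\r\\nline two"}'
value = json.loads(escaped)["msg"]
print(repr(value))  # 'line one\r\nline two'

# Serializing escapes the newline again.
print(json.dumps({"msg": "a\nb"}))  # {"msg": "a\nb"}
```

Unlike the XML case, the application receives exactly the convention the
sender chose, so any normalization is left to the application.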

That is quite an interesting difference in approach between XML and JSON
for dealing with the newline problem!

Please let me know of any errors or confusion in the above explanation.



