OASIS Mailing List Archives


Re: [xml-dev] XML: why there is no escape (was Re: [xml-dev] What to escape when serializing XML)

Rick Marshall wrote:
> I don't know why, but I'm guessing Dennis Ritchie chose the "\" 
> instead of ESC
Oh, ESC would not be suitable at all: it is a character in the coded 
character set (a character-encoding matter), not a language-level construct.
> Personally I think it would have been better for XML (SGML?) to stick 
> to an existing programming practice (and {} instead of <>) - but the 
> document world has evolved differently from the programming world and I 
> guess we just have to live with the clash of cultures.
Well, SGML allowed you to change < for { but almost no-one did it. This 
is because in programs "<" is very common and "{" is rare, while in 
legal/technical/quality documents "{" is more common than "<".  (But 
sometimes for documents with many "<" people did remap to use "{".)  
Usually when people did use "{", they used it as a shortcut for an open 
tag, not just as a delimiter. Writing
   {p}This is a {list}{item}short{/item}{item}example{/item}{/list}{/p}
is not a great advance in markup, while using short references to cut it 
down to
   This is a {*short *example}
is much more useful.
> Michael Kay wrote:
>>> To escape a character means to do something (typically, to prefix it 
>>> with \ in C-family languages) to allow the character to be used 
>>> literally but without its normal parser treatment.  So \ before a 
>>> newline in a shell script is an escaped character.     
>> Kernighan and Ritchie don't use "escape" as a verb, but they do refer to
>> constructs such as "\n" and "\b" as "escape sequences". So it seems fairly
>> natural that people should use the verb "escape [a character]" to mean
>> "represent [a character] by means of an escape sequence". Representing tab
>> by "\t" doesn't seem very different from representing tab by "&#x9;", so
>> it's natural that the same verb should be used for that too.
I think K&R might have called them escape sequences because particular 
implementations would convert the codes into device-dependent escape 
sequences (e.g. the appropriate ANSI escape sequence or whatever, using 
termcap or printcap or whatever), not because the "r" was being escaped 
by the "\".

But a question like "How do we escape a non-printing character?" should 
have only one answer in XML: "You cannot escape non-printing characters 
(in the sense of adding some prefix that makes them OK); you can only 
represent them using numeric character references (which some people may 
call escape sequences), and some of them, notably NUL (U+0000), you simply 
cannot represent at all unless you embed them as hex- or Base64-encoded 
fragments."
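
To make that concrete, here is a minimal Python sketch of what a serializer might do, not a definitive implementation. It assumes XML 1.0 element content. The helper names (is_xml10_char, text_to_xml), the <data> element and its encoding="base64" attribute are all made up for illustration; only the character ranges come from the Char production in the XML 1.0 Recommendation.

```python
import base64


def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def text_to_xml(text: str) -> str:
    """Serialize text as a hypothetical <data> element (illustrative only)."""
    if not all(is_xml10_char(ord(ch)) for ch in text):
        # No prefix or character reference can make NUL (or the other
        # forbidden characters) legal, so the only option is to carry the
        # bytes in some other encoding, here Base64 over UTF-8.
        payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
        return f'<data encoding="base64">{payload}</data>'

    out = []
    for ch in text:
        if ch in "<&>\t\r":
            # Represent the character with a numeric character reference.
            # Nothing is "escaped" in the prefix sense: the reference simply
            # stands in for the character.
            out.append(f"&#x{ord(ch):X};")
        else:
            out.append(ch)
    return "<data>" + "".join(out) + "</data>"


if __name__ == "__main__":
    print(text_to_xml("a\tb"))     # tab written as a character reference
    print(text_to_xml("a\x00b"))   # NUL forces the Base64 fallback
```

Note that a conforming parser should reject a document containing "&#0;" outright, which is why the second case cannot be handled with a character reference and needs an application-level convention agreed between sender and receiver.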



Copyright 1993-2007 XML.org. This site is hosted by OASIS