OASIS Mailing List Archives

   Re: [xml-dev] MSXML DOM Special Chars Less Than 32

From: "John Cowan" <jcowan@reutershealth.com>
 
> Rick Jelliffe scripsit:
> 
> >    * The RFC on MIME types talks about "textual" data rather than the
> >      text/binary distinction.
> >       So control characters in an ASCII file may be "text" to some
> >       but they are not "textual". 
> 
> The 39 Articles of the Church of England ban "the Romish doctrine of
> Purgatory" and "the sacrifice of the masses", but John Henry Newman
> (back when he was an Anglican, rather than a Roman, Catholic) showed
> that Romish was not the same as Roman, and the masses not the same
> as the Mass, so Anglicans could believe in the Roman doctrine of
> Purgatory, and celebrate the sacrifice of the Mass, without offending
> the Articles in the least.
 
Yes, as the beautiful "Apologia pro vita sua" shows, Newman really tried 
hard to stay. 

But in the case of textual, the definition comes from an RFC rather
than a musty post-counter-reformation don.  

RFC 2046[1]
Multipurpose Internet Mail Extensions
(MIME) Part Two:
Media Types

"3. Overview Of The Initial Top-Level Media Types 
The five discrete top-level media types are: 

text -- textual information. The subtype "plain" in particular indicates plain text containing no formatting commands or directives of any sort. Plain text is intended to be displayed "as-is". No special software is required to get the full meaning of the text, aside from support for the indicated character set. Other subtypes are to be used for enriched text in forms where application software may enhance the appearance of the text, but such software must not be required in order to get the general idea of the content. Possible subtypes of "text" thus include any word processor format that can be read without resorting to software that understands the format. In particular, formats that employ embedded binary formatting information are not considered directly readable. A very simple and portable subtype, "richtext", was defined in RFC 1341, with a further revision in RFC 1896 under the name "enriched"."

Section 4.1.2, about text/*, is interesting too:

"Aside from these conventions, any use of the control characters or DEL in a body must either occur
  (1) because a subtype of text other than "plain" specifically assigns some additional meaning, or
  (2) within the context of a private agreement between the sender and recipient. Such private agreements are discouraged and should be replaced by the other capabilities of this document."
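
To make the rule concrete, here is a rough Python sketch of a check for control characters that the passage leaves without a defined meaning in a US-ASCII text/plain body. It rests on my reading of "these conventions" as the CRLF line break plus the de facto uses of TAB (9) and FF (12); that reading is an assumption, since the quotation above does not spell the conventions out. The function name undefined_controls is made up for illustration.

```python
from typing import List, Tuple

# Assumption: TAB and FF are the control characters with de facto meanings.
DE_FACTO_CONTROLS = {0x09, 0x0C}


def undefined_controls(body: bytes) -> List[Tuple[int, int]]:
    """Return (offset, byte value) for each control character or DEL
    that is not part of a CRLF pair and is not TAB or FF."""
    hits = []
    i = 0
    n = len(body)
    while i < n:
        b = body[i]
        # A CR immediately followed by LF is the canonical line break.
        if b == 0x0D and i + 1 < n and body[i + 1] == 0x0A:
            i += 2
            continue
        # Control characters are 0-31; DEL is 127.
        if (b < 0x20 or b == 0x7F) and b not in DE_FACTO_CONTROLS:
            hits.append((i, b))
        i += 1
    return hits
```

A bare CR or a bare LF is reported too, because only the CRLF combination counts as a line break under this reading. An empty result does not prove a body is "textual"; it only means nothing in it would need a subtype definition or a private agreement to interpret.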

Should XML follow or lead? XML 1.0 was designed with painstaking attention
to fitting in with existing standards and infrastructure, and it has done well because of it.

Cheers
Rick Jelliffe

[1] http://www.nacs.uci.edu/indiv/ehood/MIME/2046/rfc2046.html




 


Copyright 2001 XML.org. This site is hosted by OASIS