OASIS Mailing List Archives

   "Smart ASCII" -> XML for authoring?

The notion of "smart ASCII" as a way of creating structured documents 
whose conventions allow it to be easily transformed to XML hit me 
twice yesterday, once in the day job (a very significant media 
company uses this to achieve interoperability between diverse 
authoring systems) and once when reading 
open&l=136,t=grx,p=mod :
"For the most part, "smart ASCII" is what you have been writing for 
years if you use e-mail and the Usenet. ...Asterisks surround bold 
or heavily emphasized phrases; dashes surround italicized or lightly 
emphasized phrases; underscores introduce Book or Series Titles. ... 
They are all very quick to type.
Anything that looks like a URL is turned into a link automatically. A 
fairly simple special format with curly braces and the ALT text 
before a colon is used to insert images, such as charts and graphs."
There's a script included (a few hundred lines of well-commented 
Python) to do the conversion to XML.
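
To make the idea concrete, here is a minimal sketch of such a converter. It is not the script from the article; the element names (doc, p, strong, em, title, link, image), the exact image syntax {ALT text: file}, and the function names are all assumptions chosen for illustration, based only on the conventions quoted above.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical inline patterns; the article's real conventions may differ.
TOKEN = re.compile(
    r"\{(?P<alt>[^:{}]+):\s*(?P<src>[^{}\s]+)\}"             # {ALT text: file}
    r"|(?P<url>https?://[^\s<>]+[^\s<>.,;:!?)])"             # bare URL
    r"|\*(?P<strong>[^*\n]+)\*"                              # *bold*
    r"|(?<![\w-])-(?P<em>[^\s-](?:[^-\n]*[^\s-])?)-(?![\w-])"  # -italic-
    r"|(?<!\w)_(?P<title>[^_\n]+)_(?!\w)"                    # _Book Title_
)


def inline_to_xml(text):
    """Convert the inline conventions in one run of text to XML markup."""
    out = []
    pos = 0
    for m in TOKEN.finditer(text):
        # Plain text between tokens is escaped so the result stays well-formed.
        out.append(escape(text[pos:m.start()]))
        if m.group("src") is not None:
            out.append("<image src=%s alt=%s/>" % (
                quoteattr(m.group("src")),
                quoteattr(m.group("alt").strip())))
        elif m.group("url") is not None:
            url = m.group("url")
            out.append("<link href=%s>%s</link>" % (quoteattr(url), escape(url)))
        else:
            tag = m.lastgroup  # "strong", "em" or "title"
            out.append("<%s>%s</%s>" % (tag, escape(m.group(tag)), tag))
        pos = m.end()
    out.append(escape(text[pos:]))
    return "".join(out)


def smart_ascii_to_xml(source):
    """Treat blank-line-separated blocks as paragraphs inside a <doc> root."""
    blocks = [b for b in re.split(r"\n\s*\n", source.strip()) if b]
    body = "".join(
        "<p>%s</p>" % inline_to_xml(" ".join(b.split())) for b in blocks)
    return "<doc>%s</doc>" % body


print(inline_to_xml("This is *bold* and -light- text"))
# prints: This is <strong>bold</strong> and <em>light</em> text
```

Even this toy version shows where the "not-so-smart ASCII" problem comes from: hyphenated words and double dashes have to be excluded by hand from the italic rule, and a stray asterisk or underscore silently stays in the output as plain text instead of raising an error.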

I'm of two minds on this ... on the one hand it sounds like a return to 
the Bad Old Days and will require continuing human intervention to 
clean up the inevitable not-so-smart ASCII before it can be converted 
to XML rather than one-time human intervention to teach markup skills 
to the authors.  On the other, it leverages what humans do best -- 
deal with patterns, templates, informal conventions -- and lets 
computers do what they do best -- generate and parse formal syntaxes, 
putting XML further behind the scenes, perhaps where it belongs.

I'd be interested in hearing others' reactions to this IBM 
DeveloperWorks article and about actual experiences in the field.  My 
guess is that it makes a LOT of sense for simple documents (memos, 
weblogs, simple articles) and virtually no sense for serious 
technical documents where the whole point of SGML/XML is to catch the 
structural errors as early and automatically as possible even if this 
requires some pain on the part of the authors.  But how big is the 
middle ground, and when does it pay to make authors switch over to 
XML? In other words, should XML stay in the background, or is it time 
for the end-users to add basic markup knowledge to their repertoire 
of skills?



Copyright 2001 XML.org. This site is hosted by OASIS