OASIS Mailing List Archives

   RE: [xml-dev] "Smart ASCII" -> XML for authoring?


I believe the middle ground is big. We see this in the content management
industry, where you mostly need only semi-structured content.

Our focus is on making it easier for people to author content with XML
tagging.  The approach that we have taken is browser based.  It is much like
working in a word processor, focused on business user authoring.

XML tags define complex content data types.  The main difference between XML
and HTML tags is that XML is used to structure data, while HTML is used to
display data.  HTML is a more loosely defined set of tags, whereas XML
requires a stricter, more structured arrangement of data.  With this
special tagging, content management systems can use the data to perform
operations such as searching and sharing content.

The XML feature of eWebEditPro2 uses XML in semi-structured content.  This
assumes that formatting and other items are still allowed within the XML
tags.  Semi-structured content uses the tags to label certain pieces of data
within the content, which aids presentation, search, and storage.  XML can
tag special data within semi-structured HTML content.

As we will see, because of the mechanisms used in the editor, the user needs
very little expertise in, or even knowledge of, XML and its uses.
eWebEditPro+XML provides a user layer between the XML tags themselves and
the user's actions.  Scripting and commands can work together to control
which tags the user can apply and where.  Most often, users will not realize
that they are working with XML tags; they will think that they are working
within a set of content parameters, definitions, or rules.

The eWebEditPro2 product is a web-based editor that allows the user to
easily edit web page content.  Its interface consists of toolbars, context
menus, dialogs, and other user features.  Developers can customize the
editor by adding commands in scripting, defining the user interface, and
controlling the environment in the web page.
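
To make the idea of semi-structured content concrete, here is a minimal
sketch in Python.  It is not eWebEditPro2 code and does not use any Ektron
API.  The tag names (product, price), the attributes, and the
extract_labeled helper are all made up for illustration.  It only shows how
custom tags can label data inside otherwise ordinary formatted content, and
how a program can pull that data out for search or storage while the
formatting inside the tags stays in place.

```python
import xml.etree.ElementTree as ET

# Hypothetical semi-structured content: ordinary formatting markup (p, b, i)
# mixed with made-up custom tags (product, price) that label pieces of data.
CONTENT = """
<div>
  <p>Our new <product sku="A-100"><b>Acme</b> Widget</product> ships in
  <i>March</i> for <price currency="USD">19.95</price>.</p>
</div>
"""


def extract_labeled(content, tags):
    """Return (tag, text, attributes) for every element whose tag is in `tags`.

    Formatting elements nested inside a labeled element are kept in the
    document; only their text is collected here, with whitespace collapsed.
    """
    root = ET.fromstring(content)
    found = []
    for elem in root.iter():  # document order
        if elem.tag in tags:
            text = " ".join("".join(elem.itertext()).split())
            found.append((elem.tag, text, dict(elem.attrib)))
    return found


for tag, text, attrs in extract_labeled(CONTENT, {"product", "price"}):
    print(tag, text, attrs)
```

A content management system could index the extracted values for search
while still rendering the surrounding markup as a normal page.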

We see the need for several approaches to solving the problem of authoring
content.
We wrote a white paper that focuses on how to apply XML to human-authored
content.  See http://www.ektron.com/whitepaper/xmlarticle.htm

From the white paper by Doug Domeny:
XML is gaining acceptance today, not because it is a great technology
looking for a problem, but because today's problems require its flexibility
and simplicity. XML enables you to create structured and semi-structured
documents that can be transferred and read by people and programs in
multiple formats (for example, pages that can be read on the web, handheld
devices and print). This "multi-use" of content is the driving force behind
the adoption of XML technology.
Today, most of the world's information is locked in paper, unsearchable
documents with proprietary file formats, or web pages where search engines
return too much data and not enough information. Just think about how much
your company has spent to create documents that can't be easily found or
distributed because they are unstructured.
XML lets business users create structured documents that can be leveraged
for multiple purposes in-house and exchanged to people and businesses around
the world. XML breaks new ground by connecting the front office business
users with the back office developers.
Bill Trippe, in his article "Do XML Editors Matter?" (Transform October
2001, page 27), makes this point by saying, "You can view XML as the bridge
between the two worlds of structured (relational) and unstructured
(document) data." He continues, "On one hand, you have a growing need for
content to be tagged at its source and maintained in a structured form. On
the other hand, users are resistant to more complex tools and processes."
Like a telephone line, which carries both voice and data, XML can carry
information suitable for computers and people.  Computer-generated XML is
dynamically created by a program for B2B ecommerce or other server-to-server
transaction. These applications are addressed by XML standards such as ebXML
and SOAP.  Human-authored content uses XML for improved search capabilities,
multi-channeled publication, and syndication. These applications are
addressed by standards such as MathML, NewsML, VoiceXML, and any number of
custom XML dialects.

There are also several multimedia demos at www.ektron.com/xml  showing how a
business user can author content.

William Rogers
CEO/President
Ektron Inc.
603-594-0249 ext 106


 -----Original Message-----
From: 	Mike Champion [mailto:mc@xegesis.org]
Sent:	Saturday, February 02, 2002 1:11 PM
To:	xml-dev@lists.xml.org
Subject:	[xml-dev] "Smart ASCII" -> XML for authoring?

The notion of "smart ASCII" as a way of creating structured documents
whose conventions allow them to be easily transformed to XML hit me
twice yesterday, once in the day job (a very significant media
company uses this to achieve interoperability between diverse
authoring systems) and once when reading
http://www-106.ibm.com/developerworks/xml/library/x-tipt2dw.html?open&l=136,t=grx,p=mod :
"For the most part, "smart ASCII" is what you have been writing for
years if you use e-mail and the Usenet. ...Asterisks surround bold
or heavily emphasized phrases; dashes surround italicized or lightly
emphasized phrases; underscores introduce Book or Series Titles. ...
They are all very quick to type.
Anything that looks like a URL is turned into a link automatically. A
fairly simple special format with curly braces and the ALT text
before a colon is used to insert images, such as charts and graphs."
There's a script included (a few hundred lines of well-commented
Python) to do the conversion to XML.
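
As a rough illustration of the kind of conversion described above, here is
a minimal Python sketch.  It is not the script from the article.  The
element names (doc, p, strong, em, title, link) and the function name are
assumptions chosen for the example, and the curly-brace image syntax is not
handled.  It only covers the asterisk, dash, underscore, and URL
conventions quoted above.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Inline conventions mapped to hypothetical element names.
RULES = [
    (re.compile(r"\*([^*\n]+)\*"), r"<strong>\1</strong>"),           # *bold*
    (re.compile(r"(?<![\w-])-([^-\n]+)-(?![\w-])"), r"<em>\1</em>"),  # -light-
    (re.compile(r"(?<!\w)_([^_\n]+)_(?!\w)"), r"<title>\1</title>"),  # _Title_
]

# Deliberately naive URL pattern; trailing punctuation is not trimmed.
URL = re.compile(r"(https?://[^\s<]+)")


def smart_ascii_to_xml(text):
    """Convert blank-line-separated smart ASCII paragraphs to a small XML doc."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    out = []
    for para in paragraphs:
        converted = []
        # Because URL has a capture group, odd-indexed pieces are the URLs.
        for i, piece in enumerate(URL.split(para)):
            safe = escape(piece)  # escape &, <, > before adding markup
            if i % 2 == 1:
                converted.append(f"<link href={quoteattr(piece)}>{safe}</link>")
            else:
                # Inline rules are applied only outside URLs.
                for pattern, replacement in RULES:
                    safe = pattern.sub(replacement, safe)
                converted.append(safe)
        out.append("<p>" + "".join(converted) + "</p>")
    return "<doc>" + "".join(out) + "</doc>"


sample = "This is *very* important, see -maybe- _Moby Dick_ at http://example.org/x_y_z today."
xml_text = smart_ascii_to_xml(sample)
ET.fromstring(xml_text)  # raises ParseError if the output is not well-formed
print(xml_text)
```

Even this small sketch has to guess: a stray asterisk or a hyphenated
aside can be misread as markup, which is exactly the cleanup cost weighed
below.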

I'm of two minds on this ... on one hand it sounds like a return to
the Bad Old Days and will require continuing human intervention to
clean up the inevitable not-so-smart ASCII before it can be converted
to XML rather than one-time human intervention to teach markup skills
to the authors.  On the other, it leverages what humans do best --
deal with patterns, templates, informal conventions -- and lets
computers do what they do best -- generate and parse formal syntaxes,
putting XML further behind the scenes, perhaps where it belongs.

I'd be interested in hearing others' reactions to this IBM
DeveloperWorks article and about actual experiences in the field.  My
guess is that it makes a LOT of sense for simple documents (memos,
weblogs, simple articles) and virtually no sense for serious
technical documents where the whole point of SGML/XML is to catch the
structural errors as early and automatically as possible even if this
requires some pain on the part of the authors.  But how big is the
middle ground, and when does it pay to make authors switch over to
XML? In other words, should XML stay in the background, or is it time
for the end-users to add basic markup knowledge to their repertoire
of skills?





-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>






 


Copyright 2001 XML.org. This site is hosted by OASIS