OASIS Mailing List Archives

RE: Word processors and semantic content

I'd have to agree.  The USPTO tried some years ago to persuade
applicants to submit patent applications in XML with very little
success.  Those few corporate customers who adopted the tools we gave
them (MS Word with template conversion to XML) produced documents that
were not reliably structured.  While the software itself was
problematic, the bigger issue was that the person creating the document
did not use the styles (structures) appropriately (abstract tagged as
the last claim, for example).  You could argue that it's only a matter
of training the users in the conceptual model of the patent application
contained in the structure of the underlying schema, and then they'd be
able to correctly populate that structure, but I don't think so.  These
were folks who knew all about the structure of a patent application
(professional clerks in very large IP law firms), but had no economic
motivation to be careful with the markup.

As I see it, the conceptual (abstract logical) model of a document (of
any kind) extant in any given culture is vague, but very powerful.
Anyone who uses typewriter/word processor tools has a tacit model that
is based on a "blank page" paradigm that bestows nearly unlimited
freedom of layout.  Think of the difference in appearance between a
formal wedding invitation and a legal brief presented to a court, and
you'll see that a great deal of highly significant information is
conveyed through the layout.  In both cases, the tacit model is
elaborated into more-or-less detailed models that are more-or-less
explicitly specified, either in manuals of etiquette or through long
exposure while studying law.  In the case of patent applications, the
Manual of Patent
Examining Procedure provides a great deal of detail about the content of
an application but usually does not compel specific format or layout
(all 100+ forms are optional).  The manner in which the rules are
expressed is such that a great deal of flexibility is retained by the
applicant while ensuring that the Office gets what it needs to examine
the application in accord with the law.  Creating a successful patent
application is the art of conforming to the rules of the MPEP, correctly
using language to which the courts have assigned specific
interpretations, disclosing the invention to one of ordinary skill in
the art while escaping the attention of competitors, and still
compelling the examiner to allow the application.  How do you create an
authoring tool that enables that process without sacrificing sufficient,
correct structure?

The cost of adding explicit structure (markup) to a document is offset
by the savings achieved with the automatic processing that the markup
enables.  I used to think that, as the WWII-induced mania for
industrializing all aspects of human discourse continued into the 21st
century, the tacit document model and the blank-page paradigm would
evolve into something friendlier to explicit structure, largely because
programming skills are being introduced at earlier and earlier stages
of formal schooling.  I'm not so sure any
more, especially since most of the markup people encounter today is HTML
and other types of primordial ooze conveyed through the WWW.  Things
will be different, but will they be better?

Until the tacit model (and human behavior along with it) changes, I
suspect that the outcome of Microsoft vs. ODF is irrelevant.  At
present, both of them appear to perpetuate rather than change the tacit
model.  Perhaps Google has the best opportunity to do otherwise, but
I've seen nothing yet to suggest that they will.

Bruce B Cox
US Patent & Trademark Office
Manager, Standards Development Division

The opinions expressed in this message are those of the author alone and
do not represent the official views of the US Patent & Trademark Office.

-----Original Message-----
From: Laurens van den Oever [mailto:laurens@xopus.com] 
Sent: Friday, February 08, 2008 7:50 AM
To: xml-dev@lists.xml.org
Subject: Word processors and semantic content

Dear List,

I enjoyed reading Elliotte's Future of XML article [1]. He made some
interesting comments and sharp observations.

But I disagree with one of the key statements in the article. 
I'd like to share my thoughts with the list and learn what you think.

At one point Elliotte says:

 "Traditionally, you see two hard problems in training non-techies to
write for the Web: teaching them semantic markup and showing them how to
use FTP."


 "XML-enabled word processors like OpenOffice and Microsoft Word solve
the first problem."

I don't think the first problem is solved. Word processors aren't going
to magically create semantic markup now that they can dump their
internal models to XML files.

To me the semantic authoring problem is the problem of having
non-technical people create semantic (and structured) content that meets
the requirements set by the use of that content.

If you're creating a plain weblog, a word processor may offer sufficient
semantics. But if you have requirements that impose a structure that is
more complex than HTML with custom tags, for instance nested sections,
or a required element order, the flexibility (which is perceived as
usability) of a word processor does more harm than good IMHO.
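
To make "a required element order" concrete, here is a minimal sketch of
the kind of check such a requirement implies. The element names (document,
title, abstract, section) and the helper check_order are hypothetical,
chosen only for illustration; they are not taken from any real schema or
product, and a real system would more likely express this in a schema
language than in hand-written code.

```python
import xml.etree.ElementTree as ET

# Hypothetical required order for the children of a <document> element.
REQUIRED_ORDER = ["title", "abstract", "section"]


def check_order(xml_text):
    """Return a list of problems; an empty list means the order is respected."""
    root = ET.fromstring(xml_text)
    problems = []
    last_rank = -1  # position in REQUIRED_ORDER of the latest element seen
    for child in root:
        if child.tag not in REQUIRED_ORDER:
            problems.append(f"unexpected element <{child.tag}>")
            continue
        rank = REQUIRED_ORDER.index(child.tag)
        if rank < last_rank:
            # This element belongs earlier than something already seen.
            problems.append(
                f"<{child.tag}> appears after <{REQUIRED_ORDER[last_rank]}>"
            )
        last_rank = max(last_rank, rank)
    return problems


# Illustrative inputs: one in the required order, one with the abstract
# placed after a section.
good = "<document><title/><abstract/><section/><section/></document>"
bad = "<document><title/><section/><abstract/></document>"

print(check_order(good))
print(check_order(bad))
```

A blank-page word processor has no natural place to run a check like this
while the author is typing, which is the gap a structured editor tries to
fill.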

What are your thoughts on this?

Disclaimer: As an XML editor vendor, I'm biased, especially since our
core business is structured editing for non-techies.

[1] http://www.ibm.com/developerworks/library/x-xml2008prevw.html

Laurens van den Oever
Xopus Company

laurens at xopus.com

+31 70 4452345
Waldorpstraat 17G
2521 CA Den Haag
The Netherlands

KvK 27308787

Copyright 1993-2007 XML.org. This site is hosted by OASIS