RE: Percentage of XML documents exclusively processed by machines?

As others have observed, it might depend on the class of documents.  We publish 10,000 -- 15,000 documents a week in XML.  These won't be looked at by a person unless there is a problem (rendering with XSLT, loading to database, complaint from content owner, complaint from value-added reseller), or an exception (mega content, i.e., more than 100,000 paragraphs, more than 100,000 embedded images, sequence listings with 1,000,000 or more sequences).  Resolving the problem, or finding a workaround for the exceptions, invariably leads to studying the instance to determine the root cause.
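A check for the "exception" cases above could be automated along these lines. This is only a minimal sketch: the thresholds come from the figures in this message, but the element names (`p`, `img`, `sequence`) and the function name are hypothetical, not taken from any real schema or system, and it assumes the elements are not namespace-qualified.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Smallest count that makes a document an exception, per the figures above:
# more than 100,000 paragraphs or images, 1,000,000 or more sequences.
# The element names are hypothetical placeholders.
TRIGGERS = {"p": 100_001, "img": 100_001, "sequence": 1_000_000}


def exception_reasons(xml_text, triggers=TRIGGERS):
    """Return the sorted element names whose counts reach their trigger."""
    root = ET.fromstring(xml_text)
    # Count every element in the instance by tag name.
    counts = Counter(el.tag for el in root.iter())
    return sorted(tag for tag, minimum in triggers.items()
                  if counts[tag] >= minimum)
```

An empty result would mean the instance can continue through processing unattended; anything else would route it to a person for a closer look.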

At the other extreme, we've just about completed a system that supports the editing of the Manual of Patent Examining Procedure (MPEP) and other similar documents.  A handful of editors use oXygen linked to a Documentum repository to create and update the content.  Even though oXygen provides a WYSIWYG editor, some of the editors prefer to edit certain types of content in the raw XML, especially the hyperlinks (where the target is always a GUID).  They do so even though it meant learning enough about XML to edit it successfully, and even though they are fully aware of the risks.  On the other hand, they have a law degree in addition to their technical degree, and, after all, XML is not rocket science.

Maybe one could generalize: for any class of documents where the volume of instances is large (RSS feeds, news feeds, various kinds of business transactions), the XML will most likely not be viewed by humans unless there is an anomaly; where the volume of instances is low, the XML is likely to be viewed by humans much more frequently.

But even in the low volume case, the intention is that an instance should be processed without human intervention.  When the MPEP content has been revised, and an editor issues the Publish order, the remainder of the processing (creating PDF, HTML, and loading the search engine) should proceed with no further human intervention.

Even though it's supposed to be hidden in most circumstances, using business terminology for tag names saves significant time and confusion in getting the business logic in the schema structure right.

Bruce B Cox
OCIO/AED/Software Architecture and Engineering Division
571-272-9004


-----Original Message-----
From: Costello, Roger L. [mailto:costello@mitre.org] 
Sent: 2011 December 3, Saturday 08:13
To: xml-dev@lists.xml.org
Subject: Percentage of XML documents exclusively processed by machines?

Hi Folks,

What percentage of XML documents are exclusively processed by machines?

Allow me to explain what I mean by "exclusively processed by machines."

First, consider the opposite - XML documents that are processed by humans. An XML document is received and then displayed directly to a human. Or the XML document is received, transformed to a visually friendly form such as HTML, and then the visually friendly form is displayed to a human.

XML documents that are exclusively processed by machines don't have a human in the loop. An XML document is received and then processed by a machine. No human ever sees the XML.

Are there any statistics on the percentage of XML documents that are exclusively processed by machines?

I'll take a wild guess and say that 99% of all XML documents are exclusively processed by machines. Is that a reasonable estimate?

/Roger

