   RE: [xml-dev] Web Design Principles (was Re: [xml-dev] Generality of HTTP)

  • To: xml-dev@lists.xml.org
  • Subject: RE: [xml-dev] Web Design Principles (was Re: [xml-dev] Generality of HTTP)
  • From: "Bullard, Claude L (Len)" <clbullar@ingr.com>
  • Date: Mon, 28 Jan 2002 16:32:49 -0600

Document Imaging System sites now make reference to fuzzy 
logic technologies that can clean up bad OCR output to some 
acceptable rate, so affordable technology could help out 
here.  One remaining problem is trusting that the fuzzy 
logic has not altered the original.  This is similar 
to the issue of legal document image fidelity: many 
systems don't accept a document as legal if it could have 
been altered by any means, a.k.a. the identity problem.  
The application of a technology can't usually be divorced 
from the content it operates over.   As long as the 
identity requirement doesn't enter in, the process 
can be lightly defined and the fast-fingered typist 
is as good as the OCR, all other things being equal 
(which they aren't, but that's a longer story).

So we come back around to this: simple is OK until you have 
strict requirements and money on the table.  On 
the other hand, any project I've ever worked on that 
relied on volunteer effort had to be simple, or the 
odds of success were very low.

I wonder if the Web Design Principles are different 
with large well-funded organizations in the loop.  
Consider the NASA effect:  when an engineering 
organization transforms into an engineering 
project management organization, does the quality 
of the product change or only the rate of the 


-----Original Message-----
From: Jeff Greif [mailto:jgreif@alumni.princeton.edu]

It's also a question of volume.  A 1% error rate that needs human cleanup is
not a big deal when you only see 100 docs per day, but it mounts up when
there are a million.
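
To put rough numbers on the volume point, here is a minimal sketch. The
function name is hypothetical, and it assumes every erroneous document needs
exactly one human cleanup pass, which the thread does not state.

```python
def docs_needing_cleanup(docs_per_day: int, error_rate: float) -> float:
    """Expected number of documents per day that need human cleanup.

    Assumption (not from the thread): errors are independent and each bad
    document costs one cleanup, so the expected count is volume * rate.
    """
    return docs_per_day * error_rate


if __name__ == "__main__":
    # The two volumes mentioned in the message, at the same 1% error rate.
    for volume in (100, 1_000_000):
        expected = docs_needing_cleanup(volume, 0.01)
        print(f"{volume} docs/day -> about {expected:.0f} cleanups/day")
```

The rate is the same in both cases; only the absolute cleanup workload
changes, from about one document a day to about ten thousand.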

Analogy: A friend is slowly scanning and turning into PDF files all the
reprints and preprints (in planetary science) that he's collected since the
late 1960s.  He runs the scanner more or less continuously while at home,
takes the resulting files along on his laptop when he travels, and does a
sort of desultory fixup of the OCR (since he has the page images as well) as
a lulling airplane activity.  Serious fixup occurs when he actually has to
consult the paper for details.

From: "Paul Prescod" <paul@prescod.net>

> Having computers and humans working together is great. But you seem to
> propose that users should be required to handle the exceptional cases
> that computers handle poorly. I'd suggest instead that the users would
> rather work with programmers (or visual mapping tools) to automate away
> those exceptional cases so that they can be freed up to do creative
> work.

