-----Original Message-----
...
A more practical approach would be for the open
source/open standards community to develop a community converter
application that is bullet-proof (lawyer proof) that takes any word
processing document and converts it into raw text and then produces a
generic XML candidate version which Word users could then approve or
change through the converter application to a candidate version that
they can approve. They can then publish both the Word version for the
community that wishes to use the MS suite and the raw text/XML
version or an Open Office version. It would take a while to get
comfortable with such a wasteful system, but it would at least offer
an alternative to bending the knee to MS and it would help ensure
that XML doesn't become a de facto MS property.
------------------------
The MS patent application can be read narrowly or broadly. I think even the
narrow interpretation would preclude the community converter for word
processing documents.
Here's how the patent application looks from the perspective of a person who
wants to build a community converter to read, parse and do something to an MS
Word document.
The broadest claim in the patent is, as always, the first:
"A computer-readable medium having computer-executable components,
comprising:
a first component for reading a word-processor document stored as a single
XML file;
a second component that utilizes an XSD for interpreting the word-processor
document, and
a third component for performing an action on the word-processor document."
This does not apply only to MS Word documents but to all word-processing
documents. However, Microsoft cannot seriously believe that this claim will
have no prior art unless the term "word processor document" is restricted in
meaning (otherwise a web browser displaying HTML could be prior art).
So what is a word-processor document? Interestingly, the term is not defined
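To see how little it takes to match the three components of claim 1, here is
a minimal sketch in Python. Everything in it is an assumption made for
illustration: the element names (wordDocument, body, p, r, t) are invented,
and the ALLOWED_CHILDREN table is only a stand-in for an XSD, since the
Python standard library has no XSD validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-file document; element names are invented for illustration.
DOC = """<wordDocument>
  <body>
    <p><r bold="true"><t>Hello</t></r><r><t> world</t></r></p>
  </body>
</wordDocument>"""

# Stand-in for an XSD: the child elements each element may contain.
# A real converter would validate against an actual schema file.
ALLOWED_CHILDREN = {
    "wordDocument": {"body"},
    "body": {"p"},
    "p": {"r"},
    "r": {"t"},
    "t": set(),
}


def read_document(xml_text):
    """First component: read the single XML file."""
    return ET.fromstring(xml_text)


def conforms(element):
    """Second component: interpret the document against the schema stand-in."""
    allowed = ALLOWED_CHILDREN.get(element.tag)
    if allowed is None:
        return False
    return all(child.tag in allowed and conforms(child) for child in element)


def extract_text(root):
    """Third component: perform an action (here, recover the raw text)."""
    return "".join(t.text or "" for t in root.iter("t"))


root = read_document(DOC)
if conforms(root):
    print(extract_text(root))
```

A converter of the kind proposed in the original message would need at least
these three steps, which is why even the narrow reading looks like a problem.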
explicitly but, instead, is defined as an example in the definition of
"markup language":
"[0013] The terms "markup language" or "ML" refer to a language for special
codes within a document that specify how parts of the document are to be
interpreted by an application. In a word-processor file, the markup language
specifies how the text is to be formatted or laid out, whereas in an HTML
document, the ML tends to specify the text's structural function (e.g.,
heading, paragraph, etc.)"
This specifically restricts WPML to formatting markup and apparently
restricts the scope of the patent to the display of text rather than to the
structural function of the text (although I'm not sure where they draw the
line between "laid out" and an element's structural function).
Here's what their sample document looks like:
http://v3.espacenet.com/pdfdoc?DB=EPODOC&IDX=EP1376387&QPN=EP1376387&F=128&PGN=33
It's all pretty clear to this point, and if the patent is granted the
community converter would be in breach. But the patent application seems to
encompass more than the display of word-processing documents:
Despite the title, "word processing document stored in a single xml file",
the patent contains the concept of a "hint" which seems to allow some types
of information to be stored outside the "single XML file". The following
paragraph describes a "hint".
[0047] Other information may also be included within the document that is
not needed by the word-processing program. According to one embodiment of
the invention a "hints" element is included that allows external programs to
easily be able to recognize what a particular element is, or how to recreate
the element. For example, a specific number format may be in a list and used
by the external program to recreate the document without knowing the
specifics of the style.
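As a rough illustration of what paragraph [0047] describes, the sketch below
shows an external program using a hint to recreate a list number without
resolving the list style. The markup is hypothetical: the element and
attribute names (listPr, styleRef, hints, numberText, t) are my own
assumptions and are not taken from the patent application or from any
Microsoft schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical paragraph: the list style is referenced by name only, but a
# "hints" element carries the rendered number so outsiders need not resolve it.
PARAGRAPH = """<p>
  <listPr styleRef="List7"/>
  <hints><numberText>3.</numberText></hints>
  <t>Third item</t>
</p>"""


def render_plain(p_xml):
    """Recreate the list item as plain text using only the hint."""
    p = ET.fromstring(p_xml)
    label = p.findtext("hints/numberText", default="")
    text = p.findtext("t", default="")
    return f"{label} {text}".strip()


print(render_plain(PARAGRAPH))
```

The external program never looks up "List7"; the hint alone tells it what the
item is, which is the sense in which a hint carries information about an
element.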
Now, despite the above example of a hint as a mechanism to specify the
format of a number in a list, claim 24 describes a hint as:
"The schema of Claim 23, further comprising a hints element, wherein the
hints element may be used to indicate a meaning for an item."
Note the use of the term "meaning".
Read broadly, Microsoft's intent may be not only to restrict competitors from
displaying word-processor files identically to MS Word, but also to stop
them from using the "hints" to extract meaningful data from the text
elements. So it may be that not only would the community converter be in
breach of the patent, but the community archival and retrieval system may
also be in breach.
Now, I am not a patent lawyer, but it seems to me the prior art would need
to be a processor of a marked up document based on a schema (preferably XSD)
containing formatting instructions surrounding a single tag element
containing the text of the document. Optionally, this document could refer
to additional files containing further information about the elements in the
original document.