- From: "Biron,Paul V" <Paul.V.Biron@kp.ORG>
- To: "'firstname.lastname@example.org'" <email@example.com>
- Date: Wed, 20 Jan 1999 16:30:44 -0800
> From: "Ogievetsky, Nikita" <firstname.lastname@example.org>
> Date: Wed, 13 Jan 1999 12:37:06 -0500
> Subject: RE: XML standards coherency and so forth
> >Andreas Berg wrote:
> > I am searching for a converter from Word documents to XML. Unfortunately,
> > no time to wait for Office 2000..... Is there something like this
> In MS Word, go to the <File>/<Save As> menu and select "Save as HTML".
> It will create a well-formed XML file: HTML with all elements having start
> and end tags.
> (Just remember to exhume the <body> - sorry for bad joke).
> Nikita Ogievetsky.
Actually, it is very easy to create a Word 97 document that is not
well-formed when saved as HTML. Try the following, where *xxx* means "make
xxx bold" and _yyy_ means "make yyy italic":
This is *a test _of the* emergency_ broadcast system
The relevant portion of the HTML produced by Word is:
<P>This is <B>a test <I>of the</B> emergency</I> broadcast
The B and I elements overlap instead of nesting, so the markup is not
well-formed: </B> closes B while I is still open. As far as I can tell, this
happens (or fails, as the case may be) with any combination of format or
font changes. Word 97 also produced several well-formedness violations for
anything more complex than simple nested lists.
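For anyone who wants to check the B/I overlap mechanically, here is a minimal sketch using the XML parser in Python's standard library. The helper name is_well_formed is my own, the trailing "system</P>" is assumed because the fragment quoted above is cut off, and the properly nested variant is just one possible repair, not something Word produces.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Overlapping B and I elements, modelled on the Word 97 output.
# Assumption: the closing "system</P>" is added to complete the fragment.
overlapping = (
    "<P>This is <B>a test <I>of the</B> emergency</I> broadcast system</P>"
)

# Hypothetical repair: close I before B, then reopen I for the remaining text.
nested = (
    "<P>This is <B>a test <I>of the</I></B><I> emergency</I> "
    "broadcast system</P>"
)

print(is_well_formed(overlapping))  # False: </B> arrives while <I> is open
print(is_well_formed(nested))       # True: every element closes in order
```

The parser rejects the first string with a mismatched-tag error, which is the same complaint any conforming XML processor should raise for this markup.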
SGML Business Analyst
Kaiser Permanente, So Cal.