   RE: Word and XML (was: XML standards coherency and so forth)

  • From: "Biron,Paul V" <Paul.V.Biron@kp.ORG>
  • To: "'xml-dev-digest@ic.ac.uk'" <xml-dev-digest@ic.ac.uk>
  • Date: Wed, 20 Jan 1999 16:30:44 -0800

> From: "Ogievetsky, Nikita" <nikita.ogievetsky@csfb.com>
> Date: Wed, 13 Jan 1999 12:37:06 -0500
> Subject: RE: XML standards coherency and so forth
> > Andreas Berg wrote:
> > I am searching for a converter from Word documents to XML. Unfortunately
> > I have no time to wait for Office 2000..... Is there something like this
> > available?
> In the MS Word go to <File>/<Save As> menu, select "Save as HTML
> document".
> It will create a well formed XML file: HTML with all elements having start
> and end tags.
> (Just remember to exhume the <body> - sorry for bad joke).
> Nikita Ogievetsky.
Actually, it is very easy to create a Word 97 document which, when saved
as HTML, is not well-formed.  Try the following, where *xxx* means "make
xxx bold" and _yyy_ means "make yyy italic".

	This is *a test _of the* emergency_ broadcast system

The relevant portion of the HTML produced by Word is:

	<P>This is <B>a test <I>of the</B> emergency</I> broadcast

The B and I elements overlap instead of nesting (</B> closes B while I is
still open), so the markup is not well-formed.  As far as I can tell, the
same thing happens with any overlapping format/font changes.
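
The overlap can be confirmed mechanically by handing the markup to any XML
parser.  The following is a minimal sketch in Python using the standard
library's xml.etree.ElementTree.  It rests on some assumptions: the fragment
is completed with "system</P>" (the excerpt above stops before that point),
the helper name is_well_formed is only illustrative, and the second string is
a hand-written repair for comparison, not something Word produced.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the string parses as a well-formed XML document."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# The fragment from the message, with the closing "system</P>" assumed.
word_output = (
    "<P>This is <B>a test <I>of the</B> emergency</I> broadcast system</P>"
)

# Hand-written repair: close I before B, then reopen I for the remainder.
properly_nested = (
    "<P>This is <B>a test <I>of the</I></B><I> emergency</I> "
    "broadcast system</P>"
)

print(is_well_formed(word_output))
print(is_well_formed(properly_nested))
```

The parser rejects the first string because </B> arrives while I is still the
innermost open element; the second string renders the same way but nests
correctly.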

Word 97 also produced several well-formedness violations for anything more
complex than simple nested lists.

SGML Business Analyst
Kaiser Permanente, So Cal.
