
   Re: [xml-dev] MS office XML


Justin Lipton wrote:

> Does anyone know or have ideas about what XML enabled Office 11 actually
> means?

I got an extended (hours-long) demo of Word & Excel & XDocs from JeanPa 
and a product manager whose name I don't have handy, two or three months 
ago, so things may have changed but here's what I saw:

Both Word & this new XDocs thing can edit arbitrary XML docs per the 
constraints of any old XSD schema.  No DTD support.  There are some of
the usual XML editor goodies such as suggesting what elements can go 
here and picking attributes. They have pretty cool facilities for 
GUIfied schema customization.  Neither of them can help much with mixed 
content, which has always separated the men from the boys in the *ML 
editing sweepstakes.

I'm not sure that either of them is really being positioned as
general-purpose XML content creation facilities up against Arbortext & 
Altova & Corel.  I'm not sure that market is big enough to interest MS 
anyhow.  XDocs is (strictly my opinion) an attempt to build a desktop 
application constructor at a level that is a bit more declarative and 
open than VB, but richer & more interactive than a Web browser.  I'm not 
really convinced yet - I think MS would agree there's still quite a bit 
of product management to do - but it does seem to be a pretty clever 
piece of software.  I'm pretty sure it's safe to interpret the advent of 
XDocs as MSFT's declaration that they're not going to do anything with 
XForms.

What actually turns my crank is that you can save Word docs as XML and
they have their own "WordML" tag set that gets generated.  I took a
close look at this and it's pretty interesting. Very verbose - every 
word on the page gets its own markup.  Suppose you have the word "foo" 
in bold with single-underline, the WordML looks something like

<r>
  <rps>
   <rp class="bold" />
   <rp class="underline" lines="1" />
  </rps>foo</r>

When you get something like a Word table or floating text box the markup 
gets really severely dense and ugly, but I didn't see anything that
seemed egregiously wrong; it's not pretending to do anything more than
capture all the semantics that Word carries around inside, which are 
correspondingly severely dense and ugly.  And HTML tables get pretty 
hideous too.

Why did I like this?  I didn't see anything that I couldn't pick apart 
straightforwardly with Perl, and if someone asked me to write a script 
to pull all the paragraphs out of a Word doc that contain the word "foo" 
in bold, well, you could do that.  Which seems pretty important to me.
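
To make that concrete, here is a rough sketch of such a script, in Python rather than Perl. It leans on the approximate markup shown above, which is only "something like" the real WordML, and it assumes a paragraph element called <p> and a <doc> wrapper, both of which are hypothetical names invented for the illustration, as are the helper function names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. <doc> and <p> are assumed names; the <r>/<rps>/<rp>
# shape follows the approximate snippet in the post, not the real schema.
SAMPLE = """
<doc>
  <p><r><rps><rp class="bold" /><rp class="underline" lines="1" /></rps>foo</r><r> bar</r></p>
  <p><r>plain foo here</r></p>
  <p><r><rps><rp class="bold" /></rps>other</r></p>
</doc>
"""


def run_text(run):
    """Return the text carried by one <r> run, ignoring its <rps> block."""
    parts = [run.text or ""]
    for child in run:
        # Text that follows </rps> is stored as that child's tail.
        parts.append(child.tail or "")
    return "".join(parts)


def is_bold(run):
    """True if the run has an <rp class="bold"/> property."""
    return any(rp.get("class") == "bold" for rp in run.findall("./rps/rp"))


def paragraphs_with_bold_word(root, word):
    """Collect the text of every <p> that has `word` in a bold run."""
    hits = []
    for para in root.iter("p"):
        for run in para.iter("r"):
            if is_bold(run) and word in run_text(run).split():
                hits.append("".join(run_text(r) for r in para.iter("r")))
                break
    return hits


root = ET.fromstring(SAMPLE.strip())
matches = paragraphs_with_bold_word(root, "foo")
print(matches)  # ['foo bar']
```

The point is only that the formatting lives in ordinary elements and attributes, so a generic XML parser plus a few lines of filtering is enough; the real tag names would have to be substituted once the format is published.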

The idea is that you can have a Word document with all that formatting 
and then you can mix that up pretty freely with your own schema stuff, 
and have validation, then you can save it as Word (your markup plus 
Word's) or as pure XML (discards Word's markup, leaving just yours). 
The old Corel WordPerfect SGML editor used to be able to do this too.

WordML and VML (for graphics) and your own schemas all get namespaces 
and they seem to use them sensibly.  JeanPa even talked to me about 
using real HTTP URIs pointing at schemas.microsoft.com and having RDDL 
or equivalent there.  This gave me an opportunity for sarcastic remarks 
about "Imagine that, a URL on microsoft.com that stays stable for more 
than a week..."

Well, whaddaya know:

~/ 513> host schemas.microsoft.com
schemas.microsoft.com has address 207.68.176.124

Anyhow, if they really do something like what they showed me, I'd call 
it a positive step.

Now, why would they do this?  Ask yourself, who is going to be making 
the decision as to whether or not to buy the next Office upgrade?  The 
CIO, right?  Will the CIO care about a better spell-checker or other 
such wordprocessing fluff?  I think not.  Will the CIO like having the 
inventory of Office docs accessible to software for... well, anything? 
I think so. -Tim






 


Copyright 2001 XML.org. This site is hosted by OASIS