Justin Lipton wrote:
> Does anyone know or have ideas about what XML enabled Office 11 actually
I got an extended (hours-long) demo of Word & Excel & XDocs from JeanPa
and a product manager whose name I don't have handy, two or three months
ago, so things may have changed but here's what I saw:
Both Word & this new XDocs thing can edit arbitrary XML docs per the
constraints of any old XSD schema. No DTD support. There are some of
the usual XML editor goodies such as suggesting what elements can go
here and picking attributes. They have pretty cool facilities for
GUIfied schema customization. Neither of them can help much with mixed
content, which has always separated the men from the boys in the *ML
I'm not sure that either of them is really being positioned as
general-purpose XML content creation facilities up against Arbortext &
Altova & Corel. I'm not sure that market is big enough to interest MS
anyhow. XDocs is (strictly my opinion) an attempt to build a desktop
application constructor at a level that is a bit more declarative and
open than VB, but richer & more interactive than a Web browser. I'm not
really convinced yet - I think MS would agree there's still quite a bit
of product management to do - but it does seem to be a pretty clever
piece of software. I'm pretty sure it's safe to interpret the advent of
XDocs as MSFT's declaration that they're not going to do anything with
What actually turns my crank is that you can save Word docs as XML and
they have their own "WordML" tag set that gets generated. I took a
close look at this and it's pretty interesting. Very verbose - every
word on the page gets its own markup. Suppose you have the word "foo"
in bold with single-underline, the WordML looks something like
<rp class="bold" />
<rp class="underline" lines="1" />
When you get something like a Word table or floating text box the markup
gets really severely dense and ugly, but I didn't see anything that
seemed egregiously wrong; it's not pretending to do anything more than
capture all the semantics that Word carries around inside, which are
correspondingly severely dense and ugly. And HTML tables get pretty
Why did I like this? I didn't see anything that I couldn't pick apart
straightforwardly with Perl, and if someone asked me to write a script
to pull all the paragraphs out of a Word doc that contain the word "foo"
in bold, well you could do that. Which seems pretty important to me.
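
A minimal sketch of that kind of script, in Python rather than Perl. The markup is an assumption: only the <rp class="bold" /> element follows the "something like" snippet above, and the doc, p, r and t element names are invented for illustration, so this is not the real WordML tag set.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only <rp class="bold"/> mirrors the snippet above;
# the <doc>, <p>, <r> and <t> element names are invented for this sketch.
SAMPLE = """
<doc>
  <p><r><rp class="bold"/><rp class="underline" lines="1"/><t>foo</t></r><r><t> bar</t></r></p>
  <p><r><t>foo, but not bold</t></r></p>
  <p><r><rp class="bold"/><t>food</t></r></p>
</doc>
"""

def paragraphs_with_bold_word(xml_text, word="foo"):
    """Return the text of each paragraph that has `word` in a bold run."""
    root = ET.fromstring(xml_text)
    hits = []
    for para in root.iter("p"):
        for run in para.iter("r"):
            # A run counts as bold if any of its property elements says so.
            is_bold = any(rp.get("class") == "bold" for rp in run.findall("rp"))
            run_text = "".join(t.text or "" for t in run.findall("t"))
            if is_bold and word in run_text.split():
                # Collect the whole paragraph's text, not just the bold run.
                hits.append("".join(t.text or "" for t in para.iter("t")))
                break  # one hit per paragraph is enough
    return hits

if __name__ == "__main__":
    for text in paragraphs_with_bold_word(SAMPLE):
        print(text)
```

The match is on whole whitespace-separated words, so a bold "food" does not count as a bold "foo".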
The idea is that you can have a Word document with all that formatting
and then you can mix that up pretty freely with your own schema stuff,
and have validation, then you can save it as Word (your markup plus
Word's) or as pure XML (discards Word's markup, leaving just yours).
The old Corel WPerfect SGML editor used to be able to do this too.
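
To make the "pure XML" idea concrete, here is a rough sketch of stripping one vocabulary out of a mixed document. Everything specific in it is an assumption: the two namespace URIs, the memo/to/body and p/r/rp element names, and the guess that discarding Word's markup means unwrapping Word's elements while keeping the text inside them. It also assumes the root element belongs to your own schema. It is not a description of what Word actually does on save.

```python
import xml.etree.ElementTree as ET

# Both namespace URIs are hypothetical placeholders, not Microsoft's.
WORD_NS = "urn:example:wordml"
MIXED = (
    '<m:memo xmlns:m="urn:example:memo" xmlns:w="urn:example:wordml">'
    '<w:p><m:to><w:r><w:rp class="bold"/>Alice</w:r></m:to></w:p>'
    '<w:p><m:body><w:r>Ship it</w:r></m:body></w:p>'
    '</m:memo>'
)

def _content(elem):
    """Flatten elem's content into a list of strings and rebuilt elements."""
    items = [elem.text or ""]
    for child in elem:
        if child.tag.startswith("{" + WORD_NS + "}"):
            items.extend(_content(child))   # unwrap: keep only what is inside
        else:
            items.append(_rebuild(child))   # keep elements from other schemas
        items.append(child.tail or "")
    return items

def _rebuild(elem):
    """Copy elem, splicing the content of any Word-namespace descendants."""
    new = ET.Element(elem.tag, dict(elem.attrib))
    last = None
    for item in _content(elem):
        if isinstance(item, str):
            if last is None:
                new.text = (new.text or "") + item
            else:
                last.tail = (last.tail or "") + item
        else:
            new.append(item)
            last = item
    return new

def pure_xml(xml_text):
    """Return a copy of the document without the Word-namespace elements."""
    return _rebuild(ET.fromstring(xml_text))

if __name__ == "__main__":
    print(ET.tostring(pure_xml(MIXED), encoding="unicode"))
```

Keeping each vocabulary in its own namespace is what makes the split this mechanical: the filter never has to know anything about your schema, only which namespace to throw away.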
WordML and VML (for graphics) and your own schemas all get namespaces
and they seem to use them sensibly. JeanPa even talked to me about
using real HTTP URIs pointing at schemas.microsoft.com and having RDDL
or equivalent there. This gave me an opportunity for sarcastic remarks
about "Imagine that, a URL on microsoft.com that stays stable for more
than a week..."
Well, whaddaya know:
~/ 513> host schemas.microsoft.com
schemas.microsoft.com has address 188.8.131.52
Anyhow, if they really do something like what they showed me, I'd call
it a positive step.
Now, why would they do this? Ask yourself, who is going to be making
the decision as to whether or not to buy the next Office upgrade? The
CIO, right? Will the CIO care about a better spell-checker or other
such wordprocessing fluff? I think not. Will the CIO like having the
inventory of Office docs accessible to software for... well, anything?
I think so. -Tim