
   Re: [xml-dev] [Summary] Media type (MIME) of XML in MS Word? in Notepad?


Hi Roger,

I'd like to add one more piece to this "puzzle".

The effect of putting the XML into a Word document has been described as
putting a "wrapper" around it. As I understand it, this is not
necessarily the case - that would imply that you could somehow easily
extract the XML content. Word does some shredding of its own so that
the Word representation looks the same, but the exact content is not
necessarily preserved. I.e., again as I understand it, so long as it
looks the same, Word is happy, but it may put lots of other stuff in the
text to satisfy its own needs.

Any attempt to print the document will result in a print file (even for
text) that is not necessarily an untransformed version of what you put in.

How do I know this? I've been battling MS programs to print images to
Zebra printers - where the image is a very long string (no end of line) -
and almost every MS program insists on breaking the long string. The
same would clearly apply to XML, where end of line is just (optional and
unnecessary) white space. You can argue about end of line being in
appropriate places, but if the program can't find such a place it will
insert one anyway, and that breaks the XML (and my images :( ).
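
To make the line-breaking problem concrete, here is a minimal Python
sketch. The hard_wrap function and the <payload> document are
hypothetical stand-ins for "a program that insists on breaking long
lines" and "a long string with no end of line"; they are not what any
MS program actually does internally.

```python
import xml.etree.ElementTree as ET

def hard_wrap(text: str, width: int) -> str:
    """Insert a newline every `width` characters, ignoring the content."""
    return "\n".join(text[i:i + width] for i in range(0, len(text), width))

# One long line with no natural break points (hypothetical example data).
doc = "<payload>" + "A" * 25 + "</payload>"

# Breaks that land only inside the character data leave the document
# well-formed, but the text the parser returns is no longer the original.
text_before = ET.fromstring(doc).text
text_after = ET.fromstring(hard_wrap(doc, 17)).text
content_changed = text_before != text_after

# A break that lands inside a tag makes the document ill-formed.
try:
    ET.fromstring(hard_wrap(doc, 10))
    still_parses = True
except ET.ParseError:
    still_parses = False

print(content_changed, still_parses)
```

With a width of 17 the breaks fall in the text and the content changes;
with a width of 10 a break falls inside the closing tag and the parser
rejects the document.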

So:

- A "wrapper" allows extraction of the original source unchanged - zip etc
  is such a "wrapper" (doesn't have to be the original text, but does have
  to be identical XML).
- A "transform" can hold the XML, but not necessarily preserve the XML.
  Attempts to extract may not result in the original XML.

There are any number of mathematical ways to represent this - inverse
functions, etc.
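
The wrapper/transform distinction can be sketched in Python as "does an
inverse function exist?". The zip functions use the standard zipfile
module; reflow and unreflow are hypothetical names for a lossy
re-breaking step and the best available attempt to undo it, and are only
an illustration, not a model of Word's file format.

```python
import io
import zipfile

xml_bytes = b'<?xml version="1.0"?>\n<root>\n  Blah\n</root>\n'

def zip_wrap(data: bytes) -> bytes:
    """Wrapper: store the bytes in an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("doc.xml", data)
    return buf.getvalue()

def zip_unwrap(archive: bytes) -> bytes:
    """Inverse of zip_wrap: read the stored bytes back out."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read("doc.xml")

def reflow(data: bytes, width: int = 10) -> bytes:
    """Transform: drop existing line breaks, then re-break at a fixed width."""
    flat = data.replace(b"\n", b"")
    return b"\n".join(flat[i:i + width] for i in range(0, len(flat), width))

def unreflow(data: bytes) -> bytes:
    """Best-effort undo: the original break positions are already lost."""
    return data.replace(b"\n", b"")

wrapper_round_trips = zip_unwrap(zip_wrap(xml_bytes)) == xml_bytes
transform_round_trips = unreflow(reflow(xml_bytes)) == xml_bytes

print(wrapper_round_trips, transform_round_trips)
```

The zip pair is a function with an inverse, so the original bytes come
back exactly. The reflow step discards information (where the original
line breaks were), so no inverse can recover the input.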

Hope this helps.

Regards

Rick

Costello, Roger L. wrote:

> Hi Folks,
>
> Below I have summarized our discussion on XML and media types (MIME). 
> Please let me know if there are any inaccuracies in the summary. /Roger
>
> *A Summary of XML and Media Types (MIME)*
>
> *What is XML’s MIME Type?*
>
> At this URL is a list of the 350 different MIME types:
>
> http://www.iana.org/assignments/media-types/
>
> In this list you will see two different MIME types for XML:
>
> *application/xml*
>
> *text/xml*
>
> The latter MIME type (*text/xml*) has been deprecated. Thus, the
> official MIME type for XML is:
>
> *application/xml*
>
> Note the format for expressing MIME types – it contains two parts, 
> separated by a slash:
>
> *type*/*subtype*
>
> *The Editor used to Create the XML Determines its MIME Type*
>
> Interestingly, you may have a document which contains XML and yet its 
> MIME type may not be *application/xml*.
>
> For example, take this simple XML:
>
> *<?xml version="1.0"?>*
>
> *<root>*
>
> * Blah*
>
> *</root>*
>
> and put it into *Word* (save it as a .doc file). The MIME type is:
>
> *application/msword*
>
> Conversely, if you put the same XML into *Notepad*, the MIME type is:
>
> *application/xml*
>
> Why is that? Why is it that if you put XML into one editor (*Word*) 
> you get a MIME type that is specific to the editor, whereas if you put 
> XML into another editor (*Notepad*) you get a MIME type that is 
> independent of the editor?
>
> The answer is this: when the XML is put into *Word*, the *Word* 
> application wraps the XML with a bunch of *Word*-specific stuff (the 
> wrapper stuff is not visible). Consequently, the *Word* document isn't 
> something you can feed directly into an XML parser.
>
> Conversely, *Notepad* does not wrap the XML with anything. The 
> document is pure XML, it can be fed directly into an XML parser, and 
> thus it has a MIME type of *application/xml*.
>
> Suppose that you put the XML into *Wordpad*, what is its MIME type?
>
> Answer: it depends on how you save the file. If you save it as a text 
> document (SaveAs = Text Document) then the MIME type will be 
> *application/xml*. If you save it as Rich Text (SaveAs = Rich Text) 
> then the editor will wrap the XML in some stuff, and the MIME type 
> will be different – it will be *text/richtext*.
>
> Lastly, suppose that you put the XML into *Notepad* and then compress 
> it using *Winzip*, what is its MIME type?
>
> Answer: *application/zip*
>
> *Further Information*
>
> Elliotte Rusty Harold has written an excellent article on this subject:
>
> http://www-128.ibm.com/developerworks/xml/library/x-mxd2.html
>
> *Acknowledgements*
>
> I would like to gratefully acknowledge the excellent inputs from these 
> people:
>
> Mitch Amiano
>
> Dave Pawson
>
> Bryan Rasmussen
>
> Henri Sivonen
>

--
Rick Marshall





 


Copyright 2001 XML.org. This site is hosted by OASIS