XML.org: OASIS Mailing List Archives

 


Copying text (curly quotes) from Word into an XML document (UTF-8): what happens?

Hi Folks,

Suppose I use Notepad to compose an XML document. I indicate in the XML
declaration an encoding of UTF-8.

Suppose I copy text from Word into the XML document. The text is
"Hello World", except that the quotes are curly quotes rather than the
straight quotes shown here.

The text appears fine in Notepad. That is, the curly quotes look fine.

I save the Notepad file, and then drag and drop the file into Internet
Explorer, which gives an error: 

     An invalid character was found in text content.
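
One way to reproduce this kind of error outside the browser is sketched below. It assumes (this is not confirmed above) that Notepad saved the file in the ANSI code page, Windows-1252, even though the XML declaration says UTF-8. The element name greeting and the helper parses are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the declaration claims UTF-8, and the text
# uses curly quotes (U+201C and U+201D).
text = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<greeting>\u201cHello World\u201d</greeting>'
)

# The same characters written to disk in two different encodings.
as_utf8 = text.encode("utf-8")      # bytes match the declaration
as_cp1252 = text.encode("cp1252")   # assumed "ANSI" save; bytes do not match

def parses(data: bytes) -> bool:
    """Return True if the bytes parse as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

print(parses(as_utf8))
print(parses(as_cp1252))
```

Under that assumption, the parser is told to expect UTF-8 but meets bytes that are not a legal UTF-8 sequence, so it rejects the document.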

QUESTIONS

1. Is the curly quote a valid UTF-8 character?

2. Word uses Windows-1252 encoding, correct? 

3. The curly quote in Windows-1252 has a specific binary sequence,
correct? 

4. When I copy the curly quote from Word into Notepad, the operating
system does a straight 1-1 copy of the binary sequence, correct?  

5. When I copy the curly quote from Word into Notepad, there is no
conversion or translation of the binary sequence by the operating
system, correct?

6. Assuming the curly quote is a valid UTF-8 character, is the
Windows-1252 curly quote binary sequence the same as the UTF-8 curly
quote binary sequence? 

7. Is the Windows-1252 curly quote binary sequence illegal in UTF-8,
i.e., does the Windows-1252 curly quote binary sequence fail to
correspond to any UTF-8 character?

8. Suppose I save the Word document as XML, and then I open the XML
using Notepad. The curly quotes no longer appear as curly quotes;
instead they appear as a bizarre character.  Why does the curly quote
now look like a bizarre character in Notepad, whereas when I copied the
curly quote from Word to Notepad it looked fine in Notepad?
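
The byte-level side of questions 3, 6, 7 and 8 can be inspected with a short sketch like the following. It only shows what the two encodings produce for the curly quote characters; it says nothing about what Word, Notepad or the clipboard actually do internally. The helper is_valid_utf8 is a made-up name for the illustration, and the last step is just one possible explanation for question 8, assuming the saved XML is UTF-8 and is displayed as Windows-1252.

```python
LEFT_QUOTE = "\u201c"   # left double quotation mark
RIGHT_QUOTE = "\u201d"  # right double quotation mark

# Byte sequences for the same two characters in each encoding.
cp1252_bytes = (LEFT_QUOTE + RIGHT_QUOTE).encode("cp1252")
utf8_bytes = (LEFT_QUOTE + RIGHT_QUOTE).encode("utf-8")

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a legal UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(cp1252_bytes.hex(" "))        # one byte per quote
print(utf8_bytes.hex(" "))          # three bytes per quote
print(is_valid_utf8(cp1252_bytes))  # are the Windows-1252 bytes legal UTF-8?

# Possible explanation for question 8 (an assumption): UTF-8 bytes for
# the left quote, displayed as if they were Windows-1252 text.
misread = LEFT_QUOTE.encode("utf-8").decode("cp1252")
print(misread)
```

The character itself exists in both encodings; what differs is the byte sequence each encoding uses for it.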

/Roger

 




Copyright 1993-2007 XML.org. This site is hosted by OASIS