Best Practice: Use a character or its escaped form?
- From: "Costello, Roger L." <costello@mitre.org>
- To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Thu, 28 Feb 2013 10:29:04 +0000
Hi Folks,
When should Á be used directly, and when should its escaped form (&#xC1;) be used instead?
Which is Best Practice, this:
<Name>Ándre</Name>
or this:
<Name>&#xC1;ndre</Name>
Which do you think is Best Practice?
Scroll down to see the answer ...
The following answer is from Martin J. Dürst. It is fantastic.
In HTML and XML (and many other formats), escapes such as character entity references are what their name says: escape hatches. That means that you should only use them in "emergency situations".
In the example at hand, most people will be able to read Ándre without problems. But &#xC1;ndre requires a table lookup in Unicode or some other mental gymnastics.
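As a small illustration of the point above (a sketch using Python's standard xml.etree.ElementTree parser, not part of the original answer): both spellings yield the same text once parsed, so the choice between them only affects how readable the source is. The variable names are made up for this example, and the escaped form is written as the numeric character reference &#xC1;.

```python
import xml.etree.ElementTree as ET

# The character written directly in the document source.
direct = ET.fromstring("<Name>Ándre</Name>").text

# The same character written as a numeric character reference.
escaped = ET.fromstring("<Name>&#xC1;ndre</Name>").text

# After parsing, an application cannot tell the two forms apart.
same = direct == escaped
print(direct, escaped, same)
```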
The preference for using characters directly, rather than escapes, is formally put down at http://www.w3.org/TR/charmod/#C047.
C047 says:
>>>>>>>>
C047 [I] [C] Escapes SHOULD only be used when the characters to be expressed are not directly representable in the format or the character encoding of the document, or when the visual representation of the character is unclear.
>>>>>>>>
The [I] says that this applies to implementers, the [C] says that this applies to content. The "are not directly representable" would apply if, e.g., your document is encoded in Shift_JIS (which doesn't have 'Á'). The "the visual representation of the character is unclear" applies, e.g., to &#xA0;, because when looking at the source it may be desirable to see that there's a non-breaking space there rather than a plain space. It may also apply if you don't have an editor that can show the character, if you can't input it, or if you are not familiar enough with the character/script to make sure you get the right one. But the former two are rare these days, and the latter is better avoided, because the person inputting/checking may have the same problem when looking at a Unicode table.
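The "not directly representable" case can be sketched in Python, assuming the document is serialized with the built-in str.encode and its "xmlcharrefreplace" error handler. The helper name encode_with_minimal_escapes is hypothetical, chosen only for this example. Characters the target encoding can represent are written directly; only the rest become numeric character references (Python emits the decimal form, so Á becomes &#193;, which is equivalent to &#xC1;).

```python
def encode_with_minimal_escapes(text: str, encoding: str) -> bytes:
    """Encode text, escaping only characters the encoding cannot represent.

    Hypothetical helper for illustration: it relies on Python's
    "xmlcharrefreplace" error handler, which replaces each unencodable
    character with a decimal numeric character reference.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


# UTF-8 can represent every character, so nothing is escaped.
print(encode_with_minimal_escapes("Ándre", "utf-8"))

# Python's shift_jis codec has no mapping for Á, so only that
# character is replaced by a character reference.
print(encode_with_minimal_escapes("Ándre", "shift_jis"))
```

Note that this handler does not escape markup characters such as & or <, so it is only a sketch of the escaping decision, not a complete XML serializer.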