Re: [xml-dev] Best Practice: Use a character or its escaped form?


Use the escaped form (a) when it's easier to generate than the "true" 
character, e.g. to cope with input device limitations, or (b) when you 
don't trust the transmission/storage services you are using to preserve 
the integrity of your character encoding.
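
To illustrate case (b), here is a minimal Python sketch (my own illustration, not something from the thread) that assumes the standard library's xml.etree.ElementTree module. It shows that a parser sees the literal character and the numeric character reference as the same text, and that serializing to an ASCII-only encoding falls back to character references, so the bytes survive a channel that mangles non-ASCII data.

```python
import xml.etree.ElementTree as ET

# The literal character and its numeric character reference
# parse to the same text content.
literal = ET.fromstring("<Name>Ándre</Name>")
escaped = ET.fromstring("<Name>&#xC1;ndre</Name>")
same_text = literal.text == escaped.text

# Serializing to an ASCII-only encoding makes ElementTree write
# non-ASCII characters as numeric character references.
ascii_bytes = ET.tostring(literal, encoding="us-ascii")

# Parsing the ASCII-only bytes recovers the original character.
round_tripped = ET.fromstring(ascii_bytes).text

print(same_text)
print(ascii_bytes)
print(round_tripped)
```

The escaped output is harder for a human to read, which is the trade-off discussed in the quoted message below.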

Michael Kay
Saxonica

On 28/02/2013 10:29, Costello, Roger L. wrote:
> Hi Folks,
>
> When should Á be used versus when should its escaped form be used (&#xC1;)?
>
> Which is Best Practice, this:
>
> 	<Name>Ándre</Name>
>
> or this:
>
> 	<Name>&#xC1;ndre</Name>
>
> Which do you think is Best Practice?
>
> Scroll down to see the answer ...
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> The following answer is from Martin J. Dürst. It is fantastic.
>
> In HTML and XML (and many other formats), escapes such as character entity references are what their name says: escape hatches. That means that you should only use them in "emergency situations".
>
> In the example at hand, most people will be able to read Ándre without problems. But &#xC1;ndre requires table lookup in Unicode or some other mental gymnastics.
>
> The preference for using characters directly, rather than escapes, is formally laid down at http://www.w3.org/TR/charmod/#C047.
>
> C047 says:
>
>   >>>>>>>>
> C047  [I]  [C]  Escapes SHOULD only be used when the characters to be expressed are not directly representable in the format or the character encoding of the document, or when the visual representation of the character is unclear.
>   >>>>>>>>
>
> The [I] says that this applies to implementers; the [C] says that it applies to content. The "are not directly representable" clause would apply if, e.g., your document is encoded in Shift_JIS (which doesn't have 'Á'). The "visual representation of the character is unclear" clause applies, e.g., to &nbsp;, because it may be desirable when looking at the source to see that there's a non-breaking space there rather than a plain space. It may also apply if you don't have an editor that can show the character, if you can't input it, or if you are not familiar enough with the character/script to make sure you get the right one. But the former two are rare these days, and the latter is better avoided, because the person inputting/checking may have the same problem when looking at a Unicode table.
>
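
The "not directly representable" case can be sketched in Python as follows. This is only an illustration under my own assumptions: the helper name encode_with_escapes is hypothetical, and it relies on the standard "xmlcharrefreplace" codec error handler, which replaces only the characters the target encoding cannot represent and leaves everything else as the characters themselves. It does not escape markup characters such as & or <; that is a separate concern.

```python
def encode_with_escapes(text: str, encoding: str) -> bytes:
    """Encode text, escaping only what the encoding cannot represent.

    Hypothetical helper for illustration. Characters missing from the
    target encoding become decimal numeric character references;
    all other characters are encoded directly.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


# Shift_JIS has no 'Á', so only that character is escaped.
sjis = encode_with_escapes("Ándre", "shift_jis")

# UTF-8 and Latin-1 can represent 'Á' directly, so no escape is used.
utf8 = encode_with_escapes("Ándre", "utf-8")
latin1 = encode_with_escapes("Ándre", "latin-1")

print(sjis)
print(utf8)
print(latin1)
```

This matches the spirit of C047: the escape appears only where the document's encoding forces it.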