unicode characters within XML documents

From: Mukul Gandhi <mukulg@softwarebytes.org>
To: XML Developers List <xml-dev@lists.xml.org>
Date: Sun, 16 Jan 2022 12:33:44 +0530

Hi all,

I came across the following XML instance document, provided with the W3C XML Schema test suite:

<doc value="؀؁؂؃؄؅؆؇؈؉؊؋،؍؎؏ؘؙؚؐؑؒؓؔؕؖؗ؛؜؝؞؟ؠءآأؤإئابةتثجحخدذرزسشصضطظعغػؼؽؾؿـفقكلمنهوىيًٌٍَُِّْٕٖٜٟٓٔٗ٘ٙٚٛٝٞ٠١٢٣٤٥٦٧٨٩٪٫٬٭ٮٯٰٱٲٳٴٵٶٷٸٹٺٻټٽپٿڀځڂڃڄڅچڇڈډڊڋڌڍڎڏڐڑڒړڔڕږڗژڙښڛڜڝڞڟڠڡڢڣڤڥڦڧڨکڪګڬڭڮگڰڱڲڳڴڵڶڷڸڹںڻڼڽھڿۀہۂۃۄۅۆۇۈۉۊۋیۍێۏېۑےۓ۔ەۖۗۘۙۚۛۜ۝۞ۣ۟۠ۡۢۤۥۦۧۨ۩۪ۭ۫۬ۮۯ۰۱۲۳۴۵۶۷۸۹ۺۻۼ۽۾ۿ"/>

In the XML document above, the value of the "value" attribute consists of Arabic characters (specified by their Unicode code points). I guess that specifying Unicode characters with the &#x.... notation (as in the example cited above) is a preferred way to write such characters and to transport the related XML documents between software application systems.
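
To make the &#x.... notation concrete, here is a minimal sketch of how a parser treats it, assuming Python's standard xml.etree.ElementTree module. The two short documents are hypothetical stand-ins for the test-suite file (three Arabic letters instead of the full block), and the variable names are illustrative only. It suggests that an application sees the same attribute value whether the characters arrive as numeric character references or as literal characters:

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: one uses numeric character references,
# the other carries the same three Arabic letters literally.
with_refs = '<doc value="&#x0627;&#x0628;&#x062A;"/>'
literal = '<doc value="\u0627\u0628\u062A"/>'

value_from_refs = ET.fromstring(with_refs).get("value")
value_literal = ET.fromstring(literal).get("value")

# The parser resolves the references, so both values are the same string.
same = value_from_refs == value_literal
code_points = [f"U+{ord(ch):04X}" for ch in value_from_refs]

# Serializing to an ASCII-only encoding writes the non-ASCII characters
# back out as numeric character references.
ascii_bytes = ET.tostring(ET.fromstring(literal), encoding="us-ascii")
round_trip = ET.fromstring(ascii_bytes).get("value")

print(same, code_points, round_trip == value_literal)
```

In this sketch the reference form mainly matters for transport: it keeps the serialized bytes within ASCII, while the parsed value is unchanged.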

My questions:

What would end-user applications do with such XML documents? I guess they would most likely render them within a UI (for which the relevant fonts would also be needed), or extract the text content from the XML documents for specific computations (such as string comparison). Am I right on these points?
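
As an illustration of the string-comparison case, here is a small sketch, again assuming Python's standard library (xml.etree.ElementTree and unicodedata). The two inputs are hypothetical: the letter ALEF WITH MADDA ABOVE written once as the single code point U+0622 and once as U+0627 followed by the combining mark U+0653. Applying Unicode normalization before comparing is one possible approach, not something the test suite prescribes:

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical inputs: the same visible letter written two ways.
precomposed = ET.fromstring('<doc value="&#x0622;"/>').get("value")
decomposed = ET.fromstring('<doc value="&#x0627;&#x0653;"/>').get("value")

# The XML parser resolves the references but does not normalize,
# so a plain comparison treats the two values as different strings.
raw_equal = precomposed == decomposed


def nfc(text: str) -> str:
    """Return the NFC (canonically composed) form of text."""
    return unicodedata.normalize("NFC", text)


# After normalization the two spellings compare equal.
normalized_equal = nfc(precomposed) == nfc(decomposed)

print(raw_equal, normalized_equal)
```

Whether normalization is appropriate depends on the application; a schema validator, for example, may need to compare the values exactly as given.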

Any thoughts on this topic would be great.

--

Regards,

Mukul Gandhi

Follow-Ups:
- Re: [xml-dev] unicode characters within XML documents
  - From: Michael Kay <mike@saxonica.com>
- Re: [xml-dev] unicode characters within XML documents
  - From: "Liam R. E. Quin" <liam@fromoldbooks.org>
