OASIS Mailing List Archives


   Re: [xml-dev] Character Entities: An XML Core WG View


The following Perl script reads the standard Unicode database file
'UnicodeData.txt' on standard input and produces a file of entity
declarations, 673922 bytes in size, that declares 13789 entities with
canonical names.

use strict;

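# Read UnicodeData.txt line by line; emit one declaration per named character.
while (<STDIN>) {
     my @fields = split(/;/, $_);
     my $cpoint = $fields[0];
     $cpoint =~ s/^0*//;          # drop leading zeros from the hex code point
     my $name = $fields[1];
     next unless $name;
     next if ($name =~ /</);      # skip <control> and other bracketed placeholder names
     $name =~ s/ /_/g;            # spaces are not allowed in XML names
     print "<!ENTITY $name '&#x$cpoint;'>\n";
}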
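
To make the transformation concrete, here is a minimal sketch of the same logic wrapped in a subroutine and run over three sample lines. The name entity_decl and the sample lines are assumptions for illustration and are not part of the script above. Only the first two semicolon-separated fields (code point and name) matter here.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): turn one line in
# UnicodeData.txt format into an entity declaration. Returns undef
# when the line carries no usable character name.
sub entity_decl {
    my ($line) = @_;
    my @fields = split(/;/, $line);
    my $cpoint = $fields[0];
    my $name   = $fields[1];
    return unless defined $cpoint && $name;
    return if $name =~ /</;      # e.g. <control>
    $cpoint =~ s/^0*//;          # 0041 -> 41
    $name   =~ s/ /_/g;          # spaces become underscores
    return "<!ENTITY $name '&#x$cpoint;'>\n";
}

# Illustrative sample input; the fields after the name are ignored.
my @sample = (
    "0000;<control>;Cc;0;BN;;;;;N;NULL;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "00E9;LATIN SMALL LETTER E WITH ACUTE;Ll;0;L;0065 0301;;;;N;;;00C9;;00C9",
);

for my $line (@sample) {
    my $decl = entity_decl($line);
    print $decl if defined $decl;
}
```

The first sample line is skipped because its name field is a bracketed placeholder. The other two yield declarations for LATIN_CAPITAL_LETTER_A and LATIN_SMALL_LETTER_E_WITH_ACUTE, which a document could then reference as &LATIN_CAPITAL_LETTER_A; once the generated declarations are included in its DTD.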



Copyright 2001 XML.org. This site is hosted by OASIS