
   Re: XML and Internationalization...

  • From: Deke Smith <deke@tallent.com>
  • To: <xml-dev@ic.ac.uk>
  • Date: Mon, 9 Nov 98 12:44:58 -0600

david@megginson.com, david@megginson.com said on 11/9/98 8:54 AM:

>One important point to note is that by itself, the 'xml:lang'
>attribute simply indicates the language of the content and attribute
>values -- it does not suggest that sibling elements with different
>xml:lang values either are or are not equivalents in other languages.
>For example, I could have
>
><itinerary>
>  <city xml:lang="fr">Montr&#233;al</city>
>  <city xml:lang="en">London</city>
>  <city xml:lang="it">Roma</city>
></itinerary>
>


Tony Graham, tgraham@mulberrytech.com said on 11/9/98 10:09 AM:

>While you may find concepts from the TMX work that are useful to you,
>TMX stands for Translation Memory eXchange, and is concerned with
>importing and exporting portions of translation memory -- phrases that
>have been translated once and saved so they don't need to be
>translated again -- between translation tools.  TMX has structures for
>parallel portions of text in multiple languages, but there is no
>concept that these chunks of text can, should, or will string together
>to make a coherent "document", in anybody's sense of the word.  The
>only markup in a TMX document, which is in XML, is concerned with
>delimiting and identifying the parallel chunks of text for the
>purposes of the translation tool: other markup from the source
>document may be saved in the TMX document (with significant XML
>characters escaped with entities) but only as a translation aid for
>those tools that can use it.

I have created phrase "substitution" scripts in Frontier and XML and ran into the same problem. I wanted to be able to "translate" phrases or words for use in multilingual Web sites. The scripts translate only in the roughest sense: "Hello World!" == "¡Hola Mundo!" == "¡Bonjour Monde!".

I created my own translation DTD (I don't know of any simple ones that already exist), and I think it shows, as David pointed out, that XML provides only a framework: the processing program has to supply the additional structure that the DTD cannot express.

Under my dirty little DTD (built by necessity), the "Hello World!" example would be:

<PHRASE ID="Hello World!" xml:lang="en">
     <TRANSLATION xml:lang="fr">
          ¡Bonjour Monde!
     </TRANSLATION>
     <TRANSLATION xml:lang="es">
          ¡Hola Mundo!
     </TRANSLATION>
     <TRANSLATION xml:lang="de">
          Hallo Welt!
     </TRANSLATION>
</PHRASE>

This is a private DTD, so in my little world I know that the ID attribute of the PHRASE element holds the source phrase and that the text of each TRANSLATION element is its equivalent in another language. It would be asking too much of XML to enforce this relationship.
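
To make that division of labor concrete, here is a minimal sketch in Python of the kind of processing program that supplies the missing structure. It is not the Frontier script mentioned above. It assumes the PHRASE elements sit under a hypothetical PHRASES root element, and the function names load_phrases and translate are invented for illustration. Only two of the translations are included to keep it short.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumption: a PHRASES wrapper element (not part of the original DTD)
# holds any number of PHRASE elements.
PHRASES_XML = """
<PHRASES>
  <PHRASE ID="Hello World!" xml:lang="en">
    <TRANSLATION xml:lang="es">
      ¡Hola Mundo!
    </TRANSLATION>
    <TRANSLATION xml:lang="de">
      Hallo Welt!
    </TRANSLATION>
  </PHRASE>
</PHRASES>
"""


def load_phrases(xml_text):
    """Build {source phrase: {language code: translation}}."""
    table = {}
    for phrase in ET.fromstring(xml_text).iter("PHRASE"):
        translations = {}
        for node in phrase.findall("TRANSLATION"):
            # Strip the indentation that surrounds the translated text.
            translations[node.get(XML_LANG)] = (node.text or "").strip()
        # The program, not the DTD, decides that ID is the lookup key.
        table[phrase.get("ID")] = translations
    return table


def translate(table, phrase, lang):
    """Return the stored translation, or the phrase itself if none is known."""
    return table.get(phrase, {}).get(lang, phrase)


table = load_phrases(PHRASES_XML)
print(translate(table, "Hello World!", "es"))
```

Nothing in the markup says that ID is a lookup key or that the TRANSLATION text is its equivalent; that rule lives entirely in load_phrases.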

TMX does provide this sort of function and structure, doesn't it?

Here is how I would translate the previous example in TMX:

<?xml version="1.0"?>
<!DOCTYPE tmx SYSTEM "http://www.lisa.org/tmx/tmx11.dtd">
<tmx version="1.1">
	<header
		creationtool="UserLand Frontier"
		creationtoolversion="5.1.4"
		datatype="PlainText"
		segtype="phrase"
		adminlang="en-us"
		srclang="EN"
		o-tmf="Frontier"
		o-encoding="MACINTOSH">
	</header>
	<body>
		<tu>
			<tuv lang="EN" creationid="BUZU">
				<seg>Hello world!</seg>
			</tuv>
			<tuv lang="FR" creationid="BUZU">
				<seg>¡Bonjour Monde!</seg>
			</tuv>
			<tuv lang="ES" creationid="BUZU">
				<seg>¡Hola Mundo!</seg>
			</tuv>
			<tuv lang="DE" creationid="BUZU">
				<seg>Hallo Welt!</seg>
			</tuv>
		</tu>
	</body>
</tmx>
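
As a rough illustration of the "dictionary" use I have in mind, the following Python sketch reads a cut-down copy of the TMX sample and turns each tu into a lookup entry keyed by the source-language segment. It is only a sketch under assumptions: the function name tmx_to_dictionary is invented here, the DOCTYPE line is left out so nothing has to be fetched, and it relies only on the elements and attributes that appear in the sample (header/srclang, tu, tuv/lang, seg).

```python
import xml.etree.ElementTree as ET

# A shortened copy of the sample above (no DOCTYPE, fewer languages).
TMX_XML = """<tmx version="1.1">
  <header creationtool="UserLand Frontier" creationtoolversion="5.1.4"
          datatype="PlainText" segtype="phrase" adminlang="en-us"
          srclang="EN" o-tmf="Frontier" o-encoding="MACINTOSH"/>
  <body>
    <tu>
      <tuv lang="EN"><seg>Hello world!</seg></tuv>
      <tuv lang="ES"><seg>¡Hola Mundo!</seg></tuv>
      <tuv lang="DE"><seg>Hallo Welt!</seg></tuv>
    </tu>
  </body>
</tmx>"""


def tmx_to_dictionary(xml_text):
    """Return {source segment: {language: segment}} for every <tu>."""
    root = ET.fromstring(xml_text)
    srclang = root.find("header").get("srclang")
    dictionary = {}
    for tu in root.iter("tu"):
        variants = {}
        for tuv in tu.findall("tuv"):
            # itertext() flattens any inline markup inside <seg> to plain text.
            variants[tuv.get("lang")] = "".join(tuv.find("seg").itertext())
        # The variant in the header's srclang becomes the lookup key.
        source = variants.pop(srclang, None)
        if source is not None:
            dictionary[source] = variants
    return dictionary


tmx_dict = tmx_to_dictionary(TMX_XML)
print(tmx_dict["Hello world!"]["ES"])
```

Treating the srclang variant as the key is a choice made by this sketch, not something the TMX file itself requires.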

Here's my question:

As I understand it, TMX is a format for translation "dictionaries" -- or lists of equivalent words, phrases, sentences or paragraphs in different languages. TMX also allows the preservation of formatting within phrases, such as boldface, italic, etc.

I always judge tools by what *I* need from them and that is what I need from TMX. Is it meant to do more than what I have asked it to do? Is this "dictionary" concept something TMX is *meant* for?

I am under the impression that TMX can also have embedded "macros" within phrases. By "macro", I mean processing commands that may be understood only by a specific scripting language. Am I right?

Deke

-----------------------------------------------------------------
Deke Smith
Tallent Communications Group, Brentwood TN
deke@tallent.com, 615-661-9878
-----------------------------------------------------------------
" The best way to predict the future is to invent it. "
       - Alan Kay








 
