   RE: Multi-lingual experiment - a call for action

  • From: "Didier PH Martin" <martind@netfolder.com>
  • To: <laurent@mmania.com>, "Xml-Dev" <xml-dev@xml.org>
  • Date: Mon, 17 Apr 2000 08:13:02 -0400

Hi Laurent,

First and foremost, thanks for your contribution.

Laurent said:
Hmmm... Forgive me if this is a stupid question, but what is the use
of translating the DTD ? Won't you end up with documents that need to
be processed by an XML application specific to each language ? If we
were talking about Schemas, it would make more sense to me - IIRC
Schema validation has provisions for making equivalence between
element names, so that a translated element name could properly map
to its translations.

Didier replies:
It makes sense to have a DTD in each language so that people can experiment
with translating from one language to another. Do not forget that the
experiment is about the result of a database query. In the current debate
about languages, we forgot that there are several kinds of domain languages:
a) to encode knowledge, b) to encode data, c) to encode visual/aural/tactile
objects, and probably other categories too. When XML is used to encode
knowledge, the fact that the document vocabulary is locale dependent is
important. When XML is used to encode data, it is even more important, since
databases are, as a matter of fact, already localized. If you take a close
look at German, French, Swedish databases and so forth, you will notice that
the field and table names are often in German, French, Swedish and so forth.
So, if we infer that a big chunk of XML usage on the web will be for database
services, then the multi-language experiment becomes very important, since it
reflects the reality we will experience.
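
To make the database point concrete, here is a minimal sketch in Python. The
German table and column names (buch, titel, autor, jahr) and the wrapper
element "ergebnis" are invented for illustration, not taken from any real
database. It shows how a query result picks up native-language element names
simply by reusing the column names:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical localized schema: table and column names are German.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buch (titel TEXT, autor TEXT, jahr INTEGER)")
conn.execute(
    "INSERT INTO buch VALUES (?, ?, ?)",
    ("Der Zauberberg", "Thomas Mann", 1924),
)

def result_to_xml(cursor, row_name, root_name):
    """Wrap each result row in <row_name>, one child element per column."""
    columns = [d[0] for d in cursor.description]
    root = ET.Element(root_name)
    for row in cursor:
        record = ET.SubElement(root, row_name)
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = str(value)
    return root

cur = conn.execute("SELECT titel, autor, jahr FROM buch")
root = result_to_xml(cur, "buch", "ergebnis")
xml_out = ET.tostring(root, encoding="unicode")
print(xml_out)
```

Nobody chose a vocabulary here: the document's element names are whatever the
database designer wrote, which is the sense in which the DTD is implicit.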

As secondary experiments, people can do the following:
a) Example: translate an XML document encoded with a French DTD into a new
XML document encoded with a German DTD, for trading. Should I mention that
this will very probably happen, mainly for exchange and trade within the
European Community?
b) Example: translate an XML document encoded with a Swedish DTD into
German, French and English for both rendering languages: WML and HTML. Within
this context, we may use XSLT to transform the Swedish structure into a
multi-lingual presentation format.
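
Experiment (a) could be sketched roughly as below. This uses Python's
standard ElementTree as a stand-in for an XSLT stylesheet, and the catalogue
vocabulary and the French-to-German table are assumptions made up for the
example, not the DTDs of the actual experiment:

```python
import xml.etree.ElementTree as ET

# Hypothetical French -> German vocabulary (element and attribute names).
FR_TO_DE = {
    "catalogue": "katalog",
    "livre": "buch",
    "titre": "titel",
    "auteur": "autor",
    "langue": "sprache",  # attribute name
}

def translate_names(elem, table):
    """Rename elements and attributes in place; text content is untouched."""
    elem.tag = table.get(elem.tag, elem.tag)
    attributes = list(elem.attrib.items())
    elem.attrib.clear()
    for name, value in attributes:
        elem.set(table.get(name, name), value)
    for child in elem:
        translate_names(child, table)
    return elem

french_doc = (
    '<catalogue><livre langue="fr">'
    "<titre>Les Misérables</titre><auteur>Victor Hugo</auteur>"
    "</livre></catalogue>"
)

root = translate_names(ET.fromstring(french_doc), FR_TO_DE)
german_doc = ET.tostring(root, encoding="unicode")
print(german_doc)
```

Only the markup vocabulary changes; the title stays in French, which matches
the idea that the structure, not necessarily the content, is what gets
translated between trading partners.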

Laurent said:
Also, your sample document doesn't provide much "content" to
translate. Given the (implied) context, I'm assuming that the title
should not be translated, unless of course the item concerns a
translated version of the work. I've taken the liberty of adding a
"description" element to work on.

Didier replies:
Sure, I picked a small example so that people do not have a lot to do. Yes,
only the title can be translated, but what matters more is that the
elements and attributes can be translated. For example, Aaron made a Spanish
translation where all elements and attributes are translated into Spanish.
We can then use this document to do some experiments like the ones above. I
am currently creating an XSLT stylesheet to transform the same document
structure, encoded with a Japanese DTD, into an HTML document displaying
English words. The point here is that people can get an example of an
xsl:template matching a Japanese expression and understand that XML can be
applied to their databases, which most of the time are encoded in their
native language.
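
The idea of matching Japanese element names and displaying English words can
be sketched as follows. This is not the stylesheet itself: it is a Python
approximation of the same lookup, and the Japanese element names and the
label table are hypothetical choices made for the example:

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical Japanese vocabulary mapped to English display labels.
LABELS = {"題名": "Title", "著者": "Author"}

japanese_doc = "<本><題名>雪国</題名><著者>川端康成</著者></本>"

def to_html(xml_text):
    """Render each child element as a table row with an English label."""
    book = ET.fromstring(xml_text)
    rows = []
    for field in book:
        label = LABELS.get(field.tag, field.tag)
        value = field.text or ""
        rows.append(f"<tr><th>{escape(label)}</th><td>{escape(value)}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"

html_out = to_html(japanese_doc)
print(html_out)
```

In XSLT, each entry in the label table would correspond to one xsl:template
whose match pattern is the Japanese element name.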

Laurent said:
From what I gather of Appendix B to the XML 1.0 spec, French
"accents" are not allowed in element names - though I'm not quite
sure... For safety's sake I didn't use any, although I find the
restriction irritating (a bad start already !). A quick review of the
XML Schema TR leads me to believe diacritics *can* be used in element

Didier replies:
Please use the accents, since French includes accents. If I show you a
Japanese DTD (unfortunately most mail clients won't be able to decode UTF-8
Japanese characters) you'll notice that the elements are full Japanese words,
_not_ cut-back ones. So please include the accents, so that it is French and
not a language caught between two chairs. If we speak of multi-lingual, let's
be multi-lingual. Anyway, don't bother, I'll add them.
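
On the accents question: the letter classes in Appendix B of XML 1.0 do
include accented Latin letters such as é and è, so they are legal in names.
A quick check with one parser (Python's standard ElementTree, with French
element names invented for the example) looks like this. It shows that this
parser accepts them, not that every tool in a processing chain will:

```python
import xml.etree.ElementTree as ET

# Hypothetical French vocabulary with diacritics in element and attribute names.
doc = (
    '<bibliothèque><référence numéro="1">'
    "<résumé>Un roman.</résumé>"
    "</référence></bibliothèque>"
)

root = ET.fromstring(doc)  # would raise ParseError if the names were illegal
tags = [element.tag for element in root.iter()]
print(tags)
```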

Laurent said:
The most important question, though, is what kind of manipulations
you want to be able to perform on such "multi-lingual documents" ?
Different tasks - retrieving the version appropriate to readers
of different origins, assisting with machine translation, or
providing a "corpus" of translations for literary or academic
purposes, to name but a few examples - might require very different
structures and/or solutions.

Didier replies:
Examples like the ones mentioned above. But I'll write a paper and post
examples that people can use. These documents can represent - in any
experiment - data source results. As we know, especially if you are a
globetrotter, most database field and table names are in the native
language. Therefore, most documents created from data sources will be
encoded with an implicit native-language DTD. If we can demonstrate that
these kinds of documents can be transformed as well as English-based
documents, then maybe XML will be perceived as more useful than originally
expected. Maybe as one of the first language technologies that respects
who people are. When you see a Japanese document encoded with a Japanese
DTD displayed in a browser in English, German or French, you'll _see_ what I
mean. (The same goes for a trade simulation between a French and a German
company, both using their own native language and respecting the native
language of the trading partner. After all, are we in the 21st century? Is
the 20th century over? I know, not until the end of the year :-)).

Didier PH Martin
Email: martind@netfolder.com
Conferences: XML Europe (http://www.gca.org)
Book: XML Professional (http://www.wrox.com)
Column: Style Matters (http://www.xml.com)
Products: http://www.netfolder.com

This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/

