OASIS Mailing List Archives

Re: Usage of Fonts in XML



S. Jyotinarayan said:

[snipped]
>
> I have MS Office 2000 installed on Windows 2000 OS.
> I have  saved the MS Excel file to "Unicode Text".
> When I view the text in Notepad I see that it's delimited by tab.
> When I open it in MS Word or MS Excel it imports the text and displays it in
> Courier and Times New Roman fonts respectively and not the Arlsk font
> that's initially used. Is this normal?

Yes, it's normal. Any time you save a document as plain text, you lose
essentially all of your formatting, including the choice of font.

> What tool do I use to do the conversion from "Unicode Text" to XML?

That's a little like asking what vehicle you should ride to work. <wink/>

> What encoding do you think would best suit my requirement, I am displaying
> text with diacritical marks?

What are your requirements? I do not remember what you said in your previous
posts, and I've already deleted them from my computer. (Short on disk
space.)

> How does this work? Once I get the converted XML file with the diacritical
> marks and view it on an internet browser of a system, will I need to have
> the particular font I used to write the text initially in MS Excel installed
> on the system?

Do you understand the difference between a font and an encoding?

> > I personally don't open binary files from strangers that may contain
> > built-in macros, especially Excel and Word files, which are very popular
> > for spreading viruses. Not that you would be trying to spread a virus, but
> > you never know how good someone else's virus-checking infrastructure is.
>
> What do you suggest is the best way to send over formatted text?

I've seen a lot of people using rich text (RTF) or Adobe Acrobat (PDF). But
that really doesn't answer your present question, I am sure.

[snipped]

I'll try a little dancing in the dark and give you a really simplified
explanation of text, encoding, diacriticals, and fonts. Pardon me if I
become a little pedantic.

Character: a low-level graphic element commonly used to communicate
information in a specific language. Characters (preferably) exist at a level
below that commonly used to convey meaning.

Diacritical marks: character-like marks attached to fundamental characters
for various purposes.

Encoding: an assignment of numbers (code points) to characters and
character-like "things", such as diacritical marks, combination characters,
methods of combining, delimiting whitespace, etc. (This is way too simple an
explanation, but you have to start somewhere.)
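
To make the idea of code points a little more concrete, here is a small
Python sketch (my choice of language for illustration; nothing in your setup
requires it). It shows one accented letter written two ways, once as a single
precomposed character and once as a base letter plus a combining diacritical
mark, and the bytes each form produces under two encoding forms of Unicode.
Notice that nothing in the output names a font.

```python
import unicodedata

# The same visible letter, written two ways.
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
combining = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT


def hex_bytes(data):
    """Format a bytes object as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in data)


for label, text in (("precomposed", precomposed), ("combining", combining)):
    code_points = [f"U+{ord(ch):04X}" for ch in text]
    names = [unicodedata.name(ch) for ch in text]
    print(label, code_points, names)
    print("  UTF-8: ", hex_bytes(text.encode("utf-8")))
    print("  UTF-16:", hex_bytes(text.encode("utf-16-le")))

# Normalization (NFC) folds the combining sequence into the precomposed form.
same_after_nfc = unicodedata.normalize("NFC", combining) == precomposed
print("equal after NFC normalization:", same_after_nfc)
```

The two forms should look the same on screen, given a font that covers them,
but they are different sequences of code points and therefore different
bytes in the file.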

Font (see typeface): a term improperly borrowed in the computer industry to
describe the definition of a set of graphical representations of characters
and character-like "things".

<important>A character encoding does not provide any means of specifying a
font, at least not in our present technology.</important>

Text: data represented as a sequence of characters and character-like
"things". (This is fairly easy to grasp when your native language is
American English. Other languages do not allow as clean a distinction
between text and formatting.)

<important>Although you cannot correctly view a text file without a correct
font, the text is (ideally) independent of the font.</important>

Mark-up: a method or system of using unusual character strings or other
markings to indicate extra information, including (but not limited to)
formatting, keywords for searching, interpretations, annotations,
interpolations, and other ancillary information of various sorts.

XML: the newest and most widely accepted language for defining mark-up systems
using ordinary characters. Although we get lazy and talk about converting a
file to XML, we should rather talk about converting the file to a specific
application of XML. Otherwise, we don't have much idea what the output is
supposed to look like, except that it probably has a lot of tags.
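
As a rough illustration of why the target vocabulary matters, here is a
minimal Python sketch that turns tab-delimited text into XML. The element
names (table, row, cell) and the function name tsv_to_xml are made up for
this example; they are not a standard, and a real conversion would use
whatever vocabulary your application defines.

```python
import csv
import io
import xml.etree.ElementTree as ET


def tsv_to_xml(tsv_text):
    """Convert tab-delimited text to a hypothetical table/row/cell vocabulary."""
    table = ET.Element("table")
    for fields in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        row = ET.SubElement(table, "row")
        for field in fields:
            ET.SubElement(row, "cell").text = field
    # encoding="unicode" returns a str and leaves accented characters as-is.
    return ET.tostring(table, encoding="unicode")


# Sample data with diacritical marks, standing in for the exported file.
sample = "name\tcity\nJos\u00e9\tM\u00e1laga\n"
xml_text = tsv_to_xml(sample)
print(xml_text)
```

To read a real file instead of the sample string, something like
open(path, encoding="utf-16", newline="") should work, assuming the "Unicode
Text" export is UTF-16 with a byte-order mark, which I believe it is but have
not checked against your file. Note that the output carries the characters,
not the font.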

HTH (And I've got to get back to work.)

Joel Rees
rees@mediafusion.co.jp
============================XML as Best Solution===
Media Fusion Co.,Ltd.  株式会社メディアフュージョン
Amagasaki  TEL 81-6-6415-2560    FAX 81-6-6415-2556
    Tokyo TEL 81-3-3516-2566   FAX 81-3-3516-2567
                       http://www.mediafusion.co.jp
---------------------------------------------------
                                         Programmer
===================================================