OASIS Mailing List Archives

Why is text marked up ?

I'm doing some research and thinking and have a serious but seemingly stupid
question.
If you don't want to bother with a possibly hypothetical thread, skip now ...

"Why is text marked up" ? 

I think I know most of the answers, but that's the problem ! I know enough
to not ask, which means I probably don't know.

This question has several variants ...  I'd like to propose what I think I
know and would love those with patience and interest to comment.
First off, I'm only interested (in this discussion) in "Text Markup" ...
that is, NOT data representations that happen to use "markup syntax",
but rather starting with "Plain Text" and adding "Markup" to it for some
purpose.

Historical Manuscripts.
There's a lot of markup used in the Manuscript world (which I am least
familiar with).
In this case the "Text" is not the original; the written word was.
With Historical Manuscripts (like with archeology) there is a strong reason
to separate "the original" from "annotation".
Otherwise in many cases the text would just be rewritten to be clearer.

So I suggest Markup is used in this context:

* To annotate information missing from the transcription but visible in the
written manuscript
* To apply metadata only implicitly known (such as author, where found,
etc.)
* To apply inference from a scholar (e.g. "John" really refers to "John
the Baptist" ...)
* To make the original text easier to understand by today's reader (or a
reader without the visual clues of the original)

Overall I believe this is done to add information either lost by the
transcription, or added by a scholar.
Are there other reasons?
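
As a minimal sketch of that separation between "the original" and
"annotation", the Python below parses a marked-up transcription and pulls the
two apart again. The element and attribute names (line, person, unclear, ref,
reason, source) and the sample sentence are invented for illustration; they
are not taken from any real manuscript encoding scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names, invented for illustration only.
transcription = (
    '<line source="hypothetical manuscript">'
    'Then <person ref="John the Baptist">John</person> went to the '
    '<unclear reason="faded ink">river</unclear>.'
    '</line>'
)

root = ET.fromstring(transcription)

# The transcribed words, with every annotation stripped away.
plain_text = "".join(root.itertext())

# The scholar's additions, kept apart from the transcribed words.
annotations = [
    (el.tag, el.text, dict(el.attrib))
    for el in root.iter()
    if el is not root
]

# Metadata that is only implicitly known (here: where the text came from).
metadata = dict(root.attrib)
```

The point of the sketch is only that the transcribed words survive untouched:
dropping the tags recovers the text, while the inference ("John" is "John the
Baptist") and the note about legibility live in the markup.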

Linguistic Analysis
In this case markup is used on text (either transcribed or original)  to aid
in linguistic analysis.
* To identify meaning (noun, verb, phonemes etc.)
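
A rough illustration of this kind of markup, assuming a made-up tag set in
which a "w" element wraps each word and a "pos" attribute names its
grammatical category (both names are hypothetical):

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical tag set: <w> wraps a word, "pos" names its part of speech.
sentence = ET.fromstring(
    '<s><w pos="noun">Scholars</w> <w pos="verb">annotate</w> '
    '<w pos="noun">manuscripts</w></s>'
)

# Group the words by the category the markup assigns to them.
by_category = defaultdict(list)
for word in sentence.iter("w"):
    by_category[word.get("pos")].append(word.text)

# The sentence itself is still recoverable by ignoring the tags.
original = "".join(sentence.itertext())
```

Here the markup carries nothing a reader of the sentence would see; it exists
so that a program can count, search or compare the categories.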

Publication / Presentation
This is the most common (to me) use of text markup.   Adding markup to
indicate structure and presentation intended for publication of text.   In a
sense this is the reversal of the transcription process of manuscripts (!?)
... To prepare 'plain text' so it can be presented.
* Add structure so text can be segmented (paragraph, chapter, heading etc.)
* Add structure so text can be typographically presented (bold, italic, font
etc. )
 (Skip for a moment the distinction between semantic and presentation markup
... it's coming)
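
To make "preparing plain text so it can be presented" concrete, here is a
small sketch that takes structurally marked-up text and produces one possible
plain-text presentation. The vocabulary (chapter, heading, para, emph) and the
rendering choices (upper-case headings, asterisks for emphasis) are
assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical structural vocabulary: <chapter>, <heading>, <para>, <emph>.
doc = ET.fromstring(
    "<chapter><heading>Why mark up text</heading>"
    "<para>Markup adds <emph>structure</emph> to plain text.</para></chapter>"
)

def inline(elem):
    """Render an element's text, showing <emph> children as *...*."""
    parts = [elem.text or ""]
    for child in elem:
        body = inline(child)
        parts.append(f"*{body}*" if child.tag == "emph" else body)
        parts.append(child.tail or "")
    return "".join(parts)

def render(chapter):
    """One possible presentation: upper-case headings, blank line between blocks."""
    blocks = []
    for block in chapter:
        text = inline(block)
        blocks.append(text.upper() if block.tag == "heading" else text)
    return "\n\n".join(blocks)

rendered = render(doc)
```

A different render function could send the same marked-up text to print or to
the web, which is the sense in which the markup prepares the text for
publication rather than fixing one appearance.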

But now the big question.  

And the ancillary
"What/Who is the intended audience of the markup?"

There are many answers of course.   I'll suggest a few

* To add meaning for *human* readers - which coincidentally is machine
readable
---> Is this ever really done?

* To add meaning solely for computers to aid in searching / repackaging  /
semantic analysis
* To add presentational meaning where the end result is intended to be
'published' in some other media (print, web etc.)

To conclude ... 
I'd love a discussion (if you're willing and happy !) about this.
"Why markup text? " - corollary "Why not just write the text better ?"
"Are the purposes consistent with each other ?"
"Is choice of a consistent markup technology/syntax a good or bad thing ?"
"Do you believe others think the same as you ?"   and "Do they ?"

Thanks for your patience and insight.

David A. Lee

Copyright 1993-2007 XML.org. This site is hosted by OASIS