OASIS Mailing List Archives

Re: [xml-dev] What is @xml:space about?

Yes, in many applications, the TEI gets converted to HTML and the HTML 
spec says that browsers must normalize space. So all looks fine. The 
encoders think they've written excellent TEI. It's just like the 
examples in the TEI spec.

But let's say later, some database extracts names from the TEI. The 
database doesn't know it's supposed to be normalizing and so stores 
leading and trailing space.

The encoders think the database extractor is broken. But the problem was 
actually their assumption that all downstream processors would 
normalize. "Hey, that's the way HTML works. The database extractor 
should work the same way."
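
To make the failure concrete, here is a minimal Python sketch of what such an extractor might do with the persName example quoted below. It uses only the standard library's xml.etree.ElementTree. The helper names raw_text and normalized_text are made up for illustration, the TEI namespace is left out as in the quoted example, and the "normalization" is my assumption of what the encoders have in mind (collapse runs of whitespace, trim both ends), not anything the TEI spec mandates.

```python
import xml.etree.ElementTree as ET

# The persName example from the quoted message, whitespace exactly as encoded.
TEI_SNIPPET = """<persName>
    His Excellency
<forename>Edward</forename>
<surname>Smith</surname>,
    Shire of <placeName>Westerland</placeName>
</persName>"""


def raw_text(elem):
    """What a naive extractor stores: all text nodes concatenated, untouched."""
    return "".join(elem.itertext())


def normalized_text(elem):
    """Assumed intent of the encoder: collapse whitespace runs, trim the ends."""
    return " ".join(raw_text(elem).split())


pers_name = ET.fromstring(TEI_SNIPPET)
raw = raw_text(pers_name)
normalized = normalized_text(pers_name)

print(repr(raw))         # keeps the leading newline, indentation, and trailing newline
print(repr(normalized))  # 'His Excellency Edward Smith, Shire of Westerland'
```

Nothing in the document tells the extractor which of the two functions to call. That choice is exactly the out-of-band information the encoders assumed everyone shared.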


On 7/12/2012 10:34 AM, Mike Sokolov wrote:
> It's not at all apparent to me that whitespace normalization is needed 
> in the TEI example you gave.  I think that would depend on the end 
> format.  For HTML, the spacing is already correct.
> I always think the best practice is for the provider to make the 
> whitespace be as intended in the input, if they care.  Otherwise one 
> must rely on out-of-band information to make a decision.
> -Sokolov
> On 07/12/2012 09:18 AM, John P. McCaskey wrote:
>> On 7/11/2012 11:46 PM, John Cowan wrote:
>>> John P. McCaskey scripsit:
>>>> Is there an established way for an XML document to announce to
>>>> downstream processors what "default" processing -- trim, collapse,
>>>> pre-line, nowrap, etc. -- was assumed in the encoding?
>>> No, there isn't.  What counts as the Right Thing depends on the
>>> consuming application.  The point of xml:space="preserve" is to
>>> persuade the consumer that the producer intends for the whitespace
>>> to convey important information.  The alternative is that the
>>> producer doesn't really care.
>>> So if the producer wants to make sure that whitespace is normalized,
>>> the best approach is to do its own normalization and then add
>>> xml:space="preserve" to prevent the consumer from doing its own thing
>>> with it.
>> This comes up in TEI (www.tei-c.org).
>> Encoding like the following is common both in practice and in the 
>> published specifications.
>> <persName>
>>     His Excellency
>> <forename>Edward</forename>
>> <surname>Smith</surname>,
>>     Shire of <placeName>Westerland</placeName>
>> </persName>
>> Clearly the encoder is expecting that during processing, space will 
>> get collapsed and leading and trailing space will be trimmed. The 
>> presumption is pervasive in TEI encodings. Just as the producer has a 
>> way to tell the consumer, "Please don't mess with spacing," TEI needs 
>> to have a way to say, "Yes, go ahead, please normalize."
>> Should this be part of the TEI spec globally? A parameter set in a 
>> header? Should there be a tei:space that allows more values than 
>> xml:space does, with "normalize" being the default? How would that 
>> interact with xml:space?
>> -- John
