Re: A proposal for simplify XML text editing and subsequent processing

Thanks to Liam, Mike, Rick and John, who contributed to this thread in
response to this proposal. The thread now seems to have run its course,
but just to wrap up:

The principal driver for this proposal was to improve things after bad
personal experiences with whitespace management when collaboratively
working with XSLT and a variation of XMLspec; because whitespace has a
special significance in both XSLT and XMLspec, the limitations of
existing pretty-print tools were compounded. XMLspec and XSLT examples
were therefore instrumental in the testing of an implementation for
this proposal.

Liam rightly picked me up on this:

Me:
>>  To complement this, editors should have a capability to
>> safely strip such indentation characters when encountered (the reverse
>> of what pretty-print formatters achieve) in external sources.

Liam:
>You can only do that in general with a schema (e.g. a DTD or XSD schema)
>of course

Agreed. As this proposal is positioned as an alternative option to the
current insertion of tabs or spaces for formatting, I'll make a brief
comparison (I've tried to be objective, but there's inevitable bias)
between the two methods with regard to maintaining the integrity of
the XML content:

Notes:
1. The term 'Virtual Formatting' is not in general use - I've used it
here to cover the concept of indenting XML without tab or space
characters.
2. Observations on formatting character insertion are general; details
will vary across tools and also depend on settings, but that's also a
major part of the problem.
3. Treatment of line-feed characters is much the same with both
methods (for the present), so, for the purpose of this comparison
'formatting characters' means 'leading tab or space characters
inserted on each line to give indentation consistent with the XML
context to improve readability'.
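
To make note 1 concrete, here is a minimal Python sketch of how an
editor might derive indentation purely for display, leaving the stored
text untouched. The function name display_lines, the sample text and
the simple tag regex are my own assumptions for illustration (comments,
CDATA and attribute values containing '>' are not handled); this is not
the implementation tested for the proposal.

```python
import re

# Matches start, end and self-closing tags. Comments, CDATA sections and
# processing instructions are deliberately ignored in this sketch.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-:]*)[^<>]*?(/?)>")

# Stored text: every line starts at column 0, no formatting characters.
STORED = "<doc>\n<sec>\n<p>text</p>\n</sec>\n</doc>"


def display_lines(stored_text, indent_width=2):
    """Return (indent_columns, line) pairs; the stored text is never modified."""
    depth = 0
    result = []
    for line in stored_text.split("\n"):
        # A line that starts with an end tag is drawn one level shallower.
        level = depth
        leading = TAG.match(line)
        if leading and leading.group(1) == "/":
            level = depth - 1
        result.append((max(level, 0) * indent_width, line))
        # Update the nesting depth from every tag found on this line.
        for m in TAG.finditer(line):
            if m.group(1) == "/":
                depth -= 1
            elif m.group(3) != "/":
                depth += 1
    return result


for columns, line in display_lines(STORED):
    # The padding exists only in the rendered view, not in the document.
    print(" " * columns + line)
```

The indent is returned as a number of columns alongside each unmodified
line, so a renderer can draw the offset without any tab or space
characters ever entering the XML.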

-- Formatting character insertion (The existing method) --

1. Formatting characters are inserted, removed or 'swapped' in XML in
the following cases:
     i.   A document is opened for editing
     ii.  When Pretty-print is manually invoked after altering the XML structure
     iii. Automated formatting character trimming/padding when the
user performs internal clipboard operations
     iv.  When XML is copied in from an external clipboard source
     v.   Deliberate correction of formatting characters by the user
     vi.  Accidental modification of formatting characters by the user
     vii. When tabs are found as padding, but user settings specify
spaces - or vice versa

2. When formatting characters are inserted into XML, it is likely they
will be modified on subsequent edits. It is therefore not a one-off
operation.

3. When formatting characters are modified the following information may be used:
     i.   schema
     ii.  xml:space
     iii. detected mixed content
     iv.  personalised configuration settings in the editor
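
As a rough illustration of points 1 and 2 above, the following Python
sketch uses the standard library's xml.dom.minidom as a stand-in for a
conventional pretty-printer (it is not any particular XML editor, and it
consults no schema). It shows inserted formatting characters becoming
part of the text content of a mixed-content element, and a second pass
changing the document again. SOURCE and the two helper functions are
names I made up for the example.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# A small document with mixed content inside <p>.
SOURCE = "<doc><p>Hello <b>world</b></p></doc>"


def pretty_print(xml_text):
    """Indent with real space characters, as a conventional formatter does."""
    return xml.dom.minidom.parseString(xml_text).toprettyxml(indent="  ")


def text_content(xml_text):
    """Concatenate every text node, i.e. what a downstream processor sees."""
    return "".join(ET.fromstring(xml_text).itertext())


once = pretty_print(SOURCE)
twice = pretty_print(once)

print(repr(text_content(SOURCE)))  # text before formatting
print(repr(text_content(once)))    # text after one formatting pass
print(once == twice)               # is a second pass a no-op?
```

A schema-aware formatter would do better on the mixed content, but the
second comparison is the general point: once the characters are in the
document, each later pass has to reinterpret them.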

-- Virtual formatting (The proposal): --

1. Formatting characters are trimmed from XML in the following cases:
    i.  A document is first opened for editing and formatting
characters are detected
    ii. When XML is copied in from an external clipboard source and
formatting characters are detected

2. XML is only trimmed if it is found to have formatting characters.
It is therefore a one-off operation: once trimmed, it is not modified
again.

3. When formatting characters are trimmed the following information may be used:
     i.   schema
     ii.  xml:space
     iii. detected mixed content
     iv.  personalised configuration settings in the editor
     v.   clues from patterns found in leading whitespace characters
on preceding lines
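
The trimming step can be sketched in Python as below, using only two of
the cues listed in point 3: xml:space and detected mixed content. As
Liam noted, doing this safely in general needs a schema, so treat this
as a heuristic under stated assumptions. The function trim_formatting
and the SAMPLE document are hypothetical names for illustration, not the
tested implementation.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the XML namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

SAMPLE = """<doc>
    <list>
        <item>one</item>
    </list>
    <pre xml:space="preserve">
        <b>keep</b>
    </pre>
</doc>"""


def _is_blank(s):
    """True for a missing text run or one made only of XML whitespace."""
    return s is None or s.strip(" \t\r\n") == ""


def _drop_indent(s):
    """Keep line-feeds; remove only tabs and spaces from a blank run."""
    return None if s is None else re.sub(r"[ \t]+", "", s)


def trim_formatting(elem, preserve=False):
    """Trim indentation in element-only content, honouring xml:space."""
    space = elem.get(XML_SPACE)
    if space == "preserve":
        preserve = True
    elif space == "default":
        preserve = False
    children = list(elem)
    # Element-only content: every text run directly inside elem is blank.
    # Anything else is treated as mixed content and left untouched.
    element_only = (
        bool(children)
        and _is_blank(elem.text)
        and all(_is_blank(child.tail) for child in children)
    )
    if element_only and not preserve:
        elem.text = _drop_indent(elem.text)
        for child in children:
            child.tail = _drop_indent(child.tail)
    for child in children:
        trim_formatting(child, preserve)
    return elem


root = trim_formatting(ET.fromstring(SAMPLE))
print(ET.tostring(root, encoding="unicode"))
```

Running trim_formatting a second time on the result changes nothing,
which is the one-off property claimed in point 2; the whitespace inside
the xml:space="preserve" element is left exactly as found.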


-- Conclusion: --
Virtual formatting is the winner. The probability of maintaining the
integrity of leading tab or space characters *required* within an XML
document is improved significantly by not using them also for XML
formatting.

---------
Phil


Copyright 1993-2007 XML.org. This site is hosted by OASIS