Re: A proposal for simplify XML text editing and subsequent processing
- From: Philip Fearon <pgfearo@googlemail.com>
- To: xml-dev@lists.xml.org
- Date: Tue, 23 Aug 2011 11:50:38 +0100
Thanks to Liam, Mike, Rick and John who contributed to this thread in
response to this proposal, which now seems to have run its course, but
just to wrap up:
The principal driver for this proposal was to improve things after bad
personal experiences with whitespace management when collaboratively
working with XSLT and a variation of XMLspec; because whitespace has a
special significance in both XSLT and XMLspec, the limitations of
existing pretty-print tools were compounded. XMLspec and XSLT examples
were therefore instrumental in the testing of an implementation for
this proposal.
Liam rightly picked me up on this:
Me:
>> To complement this, editors should have a capability to
>> safely strip such indentation characters when encountered (the reverse
>> of what pretty-print formatters achieve) in external sources.
Liam:
>You can only do that in general with a schema (e.g. a DTD or XSD schema)
>of course
Agreed. As this proposal is positioned as an alternative option to the
current insertion of tabs or spaces for formatting, I'll make a brief
comparison (I've tried to be objective, but there's inevitable bias)
between the two methods with regard to maintaining the integrity of
the XML content:
Notes:
1. The term 'Virtual Formatting' is not in general use - I've used it
here to cover the concept of indenting XML without tab or space
characters.
2. Observations on formatting character insertion are general. Details
will vary across tools and also depend on settings, which is itself a
major part of the problem.
3. Treatment of line-feed characters is much the same with both
methods (for the present), so, for the purpose of this comparison
'formatting characters' means 'leading tab or space characters
inserted on each line to give indentation consistent with the XML
context to improve readability'.
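As a rough illustration of note 1, the sketch below computes an
indentation level for each line from the XML context alone, so an
editor could draw the indent without storing any tab or space
characters in the document. It is only a sketch: the names TAG,
virtual_indent_levels and render are hypothetical, the regex-based tag
matching ignores comments, CDATA sections and '>' inside attribute
values, and a real editor would draw the indent as a margin rather than
build a padded string.

```python
import re

# Matches start, end and empty-element tags; skips <?...?> and <!...>.
# group(1) is "/" for an end tag, group(2) is "/" for an empty-element tag.
TAG = re.compile(r"<(/?)[^\s/>!?][^>]*?(/?)>")


def virtual_indent_levels(xml_text):
    """Return one nesting level per line; xml_text itself is not modified."""
    depth = 0
    levels = []
    for line in xml_text.split("\n"):
        # A line that begins with an end tag is drawn one level shallower.
        level = depth - 1 if line.startswith("</") else depth
        levels.append(max(level, 0))
        for match in TAG.finditer(line):
            if match.group(1):
                depth -= 1          # end tag closes a level
            elif not match.group(2):
                depth += 1          # start tag opens a level
    return levels


def render(xml_text, unit="  "):
    """Display-only view: indentation exists in the view, not the source."""
    lines = xml_text.split("\n")
    levels = virtual_indent_levels(xml_text)
    return "\n".join(unit * lvl + line for lvl, line in zip(levels, lines))


source = "<a>\n<b>\n<c/>\n</b>\n</a>"
print(render(source))
```

The stored text (source) keeps no leading whitespace; only the rendered
view is indented.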
-- Formatting character insertion (The existing method) --
1. Formatting characters are inserted, removed or 'swapped' in XML in
the following cases:
i. A document is opened for editing
ii. When Pretty-print is manually invoked after altering the XML structure
iii. Automated formatting character trimming/padding when the
user performs internal clipboard operations
iv. When XML is copied in from an external clipboard source
v. Deliberate correction of formatting characters by the user
vi. Accidental modification of formatting characters by the user
vii. When tabs are found as padding, but user settings specify
spaces - or vice versa
2. When formatting characters are inserted into XML, it is likely they
will be modified on subsequent edits. It is therefore not a one-off
operation.
3. When formatting characters are modified, the following information
may be used:
i. schema
ii. xml:space
iii. detected mixed content
iv. personalised configuration settings in the editor
-- Virtual formatting (The proposal) --
1. Formatting characters are trimmed from XML in the following cases:
i. A document is first opened for editing and formatting
characters are detected
ii. When XML is copied in from an external clipboard source and
formatting characters are detected
2. XML is only trimmed if it is found to have formatting characters.
It is therefore a one-off operation: once trimmed, it is not modified
again.
3. When formatting characters are trimmed, the following information
may be used:
i. schema
ii. xml:space
iii. detected mixed content
iv. personalised configuration settings in the editor
v. clues from patterns found in leading whitespace characters
on preceding lines
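The one-off trim described above might look something like the sketch
below. It uses only two of the listed inputs, xml:space and a crude
mixed-content check, and leaves any text that contains non-whitespace
characters untouched because, as Liam notes, a schema is needed to
decide that case in general. The name trim_formatting and the regexes
are hypothetical, and the tokenizer is deliberately naive (it does not
handle comments, CDATA sections or '>' inside attribute values).

```python
import re

TOKEN = re.compile(r"(<[^>]+>)")                       # naive tag splitter
PRESERVE = re.compile(r"""xml:space\s*=\s*["']preserve["']""")
DEFAULT = re.compile(r"""xml:space\s*=\s*["']default["']""")
LEADING = re.compile(r"\n[ \t]+")                      # indent after a newline


def trim_formatting(xml_text):
    """Strip leading tabs/spaces from whitespace-only text between tags."""
    out = []
    stack = []  # one flag per open element: is xml:space="preserve" active?
    for token in TOKEN.split(xml_text):
        inherited = stack[-1] if stack else False
        if token.startswith("</"):
            if stack:
                stack.pop()
            out.append(token)
        elif token.startswith(("<?", "<!")):
            out.append(token)                          # PI, comment, DOCTYPE
        elif token.startswith("<"):
            preserve = inherited
            if PRESERVE.search(token):
                preserve = True
            elif DEFAULT.search(token):
                preserve = False
            if not token.endswith("/>"):
                stack.append(preserve)
            out.append(token)
        elif inherited or token.strip():
            out.append(token)                          # preserved or real text
        else:
            out.append(LEADING.sub("\n", token))       # formatting only
    return "".join(out)


indented = ('<a>\n  <b>\n    <c xml:space="preserve">\n      <d/>\n'
            '    </c>\n  </b>\n</a>')
trimmed = trim_formatting(indented)
print(trimmed)
```

Running trim_formatting on its own output returns the same text, which
matches the point that trimming is a one-off operation.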
-- Conclusion: --
Virtual formatting is the winner. The probability of maintaining the
integrity of leading tab or space characters *required* within an XML
document is improved significantly by not using them also for XML
formatting.
---------
Phil