
   Re: [xml-dev] genx: canonicalization vs. pretty printing

Elliotte Rusty Harold wrote:

> At 7:21 PM -0800 1/21/04, Tim Bray wrote:
>
>> 1. Any output encoding other than UTF-8
>> 2. Optional escaping of illegal characters
>> 3. Prettyprinting support
>> 4. Various kinds of error workarounds, and turning errorchecking off
>> 5. Writing CDATA sections
>> 6. Writing XML declaration (good idea, but I want the output to be 
>> Canonical XML)
>
>
> I think you're at best at 50% with these goals, maybe less. This is 
> certainly not at 80%.
>
> My experience with JDOM, XOM, and other APIs for doing operations like 
> XInclude that ultimately result in a serialized XML document is that 
> users really, really want pretty-printing a lot of the time. If the 
> API won't do that, they will ignore it. I wouldn't pretty print by 
> default, but I would definitely include options for setting the 
> maximum line length and indent string.

hi,

i think that using something similar to SAX setFeature/setProperty could 
work pretty well for setting optional properties such as indenting. for example:

    private final static String INDENT_PROP =
        "http://xmlpull.org/v1/doc/properties.html#serializer-indentation";

    try {
        serializer.setProperty(INDENT_PROP, "  ");
    } catch (IllegalArgumentException e) {
        // this serializer does not support the optional property
    } catch (IllegalStateException e) {
        // the property cannot be changed at this point
    }

i also like the idea of a self-describing property URI, so you can open it 
in a browser and see a description of the property:

http://xmlpull.org/v1/doc/properties.html#serializer-indentation
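
to show how such an optional property might look from the implementation 
side, here is a minimal sketch. TinyXmlWriter and its methods are 
hypothetical names made up for illustration; this is not the XmlPull or 
genx API, only the setProperty pattern applied to a toy writer that handles 
elements and text (mixed content is not indented sensibly).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy writer, for illustration only (not the XmlPull or genx API).
class TinyXmlWriter {
    static final String INDENT_PROP =
        "http://xmlpull.org/v1/doc/properties.html#serializer-indentation";

    private final StringBuilder out = new StringBuilder();
    private final Deque<String> open = new ArrayDeque<>();
    private String indent = null;          // null: no pretty printing (the default)
    private boolean lastWasEndTag = false;

    // Optional settings go through one generic method, SAX-style.
    void setProperty(String name, Object value) {
        if (!INDENT_PROP.equals(name) || !(value instanceof String)) {
            throw new IllegalArgumentException("unsupported property: " + name);
        }
        if (out.length() > 0) {
            throw new IllegalStateException("output already started");
        }
        indent = (String) value;
    }

    TinyXmlWriter startTag(String name) {
        if (out.length() > 0) newLine(open.size());
        out.append('<').append(name).append('>');
        open.push(name);
        lastWasEndTag = false;
        return this;
    }

    TinyXmlWriter text(String s) {
        out.append(s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;"));
        lastWasEndTag = false;
        return this;
    }

    TinyXmlWriter endTag() {
        String name = open.pop();
        // Only break the line when closing an element that had child elements.
        if (lastWasEndTag) newLine(open.size());
        out.append("</").append(name).append('>');
        lastWasEndTag = true;
        return this;
    }

    // Emits a line break plus indentation, but only if the property was set.
    private void newLine(int depth) {
        if (indent == null) return;
        out.append('\n');
        for (int i = 0; i < depth; i++) out.append(indent);
    }

    @Override public String toString() { return out.toString(); }
}

class TinyXmlWriterDemo {
    public static void main(String[] args) {
        TinyXmlWriter w = new TinyXmlWriter();
        try {
            w.setProperty(TinyXmlWriter.INDENT_PROP, "  ");
        } catch (IllegalArgumentException | IllegalStateException e) {
            // optional property: fall back to unindented output
        }
        w.startTag("a").startTag("b").text("x").endTag().endTag();
        System.out.println(w);
    }
}
```

a caller that never sets the property gets unindented output, so pretty 
printing stays strictly opt-in.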

>
> Furthermore, I would make canonicalization an option if it's included 
> at all. It  imposes a significant performance hit since you have to 
> sort the attributes and namespaces. Worse, it prevents full streaming 
> since the attributes and namespaces have to be buffered before you 
> sort them. But more importantly, probably half the time users don't 
> care about canonicalization. The other half the time they do care. To 
> be more specific they don't want it. It's too damn ugly and the lines 
> are too long to work with. Plus there's no XML declaration, which 
> users like.

i would make it an optional feature too.
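
a rough sketch of why the sorting forces buffering: the start tag cannot be 
written until the last attribute has been seen. StartTagBuffer is a 
hypothetical helper, not part of genx or any real library, and the ordering 
is simplified (namespace URI first, then local name, compared with Java's 
String ordering; namespace declarations and the full Canonical XML escaping 
rules are left out).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, for illustration only (not part of genx or any real library).
class StartTagBuffer {
    private static final class Attr {
        final String nsUri, localName, qName, value;
        Attr(String nsUri, String localName, String qName, String value) {
            this.nsUri = nsUri;
            this.localName = localName;
            this.qName = qName;
            this.value = value;
        }
    }

    private final boolean canonical;
    private final List<Attr> attrs = new ArrayList<>();

    StartTagBuffer(boolean canonical) { this.canonical = canonical; }

    // Attributes are held back here; nothing can be emitted yet if sorting is on.
    StartTagBuffer add(String nsUri, String localName, String qName, String value) {
        attrs.add(new Attr(nsUri, localName, qName, value));
        return this;
    }

    // Called once the start tag is complete; only now can the tag be written.
    String flush(String elementName) {
        if (canonical) {
            // Simplified ordering: namespace URI, then local name.
            attrs.sort(Comparator.comparing((Attr a) -> a.nsUri)
                                 .thenComparing(a -> a.localName));
        }
        StringBuilder sb = new StringBuilder("<").append(elementName);
        for (Attr a : attrs) {
            sb.append(' ').append(a.qName).append("=\"")
              .append(escape(a.value)).append('"');
        }
        attrs.clear();
        return sb.append('>').toString();
    }

    // Minimal attribute escaping; Canonical XML also escapes tab, LF and CR.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }
}

class StartTagBufferDemo {
    public static void main(String[] args) {
        StartTagBuffer tag = new StartTagBuffer(true);
        tag.add("", "z", "z", "1").add("urn:x", "a", "p:a", "2").add("", "b", "b", "3");
        System.out.println(tag.flush("e"));
    }
}
```

the sketch buffers in both modes to keep it short; with the option switched 
off a real writer could emit each attribute as soon as it arrives.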

> I think an XML output library needs to realize that opening files in a 
> plain text editor is still a very important use case a lot of the 
> time. Byte-by-byte comparison and digital signatures usually aren't 
> (and when they are the digital signature library will canonicalize 
> first). Requiring canonical XML and not pretty printing or allowing 
> encoding selection does not give users the simple library that does 
> what they need. It is not sufficiently full featured to hit the 80-20 
> point.
>
maybe. but maybe there should be an API and more than one implementation, 
so users can choose what they need ...

thanks,

alek

-- 
The best way to predict the future is to invent it - Alan Kay





 
