XML.orgXML.org
FOCUS AREAS |XML-DEV |XML.org DAILY NEWSLINK |REGISTRY |RESOURCES |ABOUT
OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]
Re: [xml-dev] Why does XML call them "attributes" and notname-value pairs?

On Sun, 16 Jan 2022 11:04:31 -0800, Dimitre Novatchev wrote:
>> The attribute takes up exactly the same amount of memory as the
>> element.
> 
> Not true!
> 
> I ran a small C# program that shows the sizes of a bare (just created)
> XmlDocument, XmlElement and XmlAttribute.
> 
> Here are the results:
> 
> an XmlDocument: 3176 bytes
> 
> an XmlElement: 656 bytes
> 
> an XmlAttribute: 152 bytes.

This is likely to be true for pretty much any XML tree model 
implementation. I'd be interested to know how one might avoid it.

For an attribute, minimum number of pointers: name, value. Almost 
always implemented: container (parent element). Rarely implemented: 
sibling pointers (it's in some implementations, even though order is 
not significant. Also, here we treat name as a pointer, not as its 
three components or component pointers.

For an element: name, namespace collection pointer, attribute 
collection pointer, parent pointer, and child collection start pointer. 
In implementations optimized for usability rather than for maximal 
memory compactness: last child pointer, previous and next sibling 
pointers. One might maximally compact this by turning it into a 
JSON-like model lacking many of the pointers, but it's still XML, so 
name, pointer to three (two if namespaces are stuffed into the 
attributes collection) collections instead of: name, pointer to one 
value. You can program this as an XML tree, but the tree navigation API 
has to actually fill in all the missing pointers, which is kind of 
achieving a poor programmer's streaming-lite implementation.

This is based on experience with DOM, Axiom, JDOM, DOM4J models (plus a 
couple of proprietary models). It's quite easy to add more pointers 
(because making it easier to move to things is valuable, and that's 
what most of the pointers do), but reducing below named container 
contains value versus named container contains multiple kinds of 
collections seems like a non-starter. Hmm, I guess you *could* treat 
namespaces, attributes, and children as a single collection?

On second thought, if someone is doing this sort of trick to avoid it, 
I don't need to know (unless you want to warn me of a project to avoid).

Also: yes, in performance optimization of tree models 
built-for-the-purpose, minimizing the size of base (no content/empty) 
nodes remains a consideration. Overall, though, implementing in a 
fashion that permits streaming (freeing memory after 'consuming' parts 
of the tree (typically with the whole thing in persistent storage 
somewhere)) is preferred to speed-impacting reduction of internal 
pointers.

Amy!
-- 
Amelia A. Lewis                    amyzing {at} talsever.com
The less I seek my source for some definitive, the closer I am to fine.
                -- Indigo Girls


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]


News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 1993-2007 XML.org. This site is hosted by OASIS