Re: [xml-dev] Memory usage of elements / attributes


I ran a small C# program that shows the sizes of a bare (just created) XmlDocument, XmlElement and XmlAttribute.

Here are the results:

an XmlDocument: 3176 bytes

an XmlElement: 656 bytes

an XmlAttribute: 152 bytes.


I hadn't realised that you could get such high-precision memory instrumentation in C#.

With SaxonCS, on the TinyTree, nodes aren't allocated as individual objects, so we need to do bulk allocation and then compute an average.

I ran this test with SaxonCS:

private void buildDocWithElements(TreeModel model, int count) {
    // Baseline heap size after a forced collection.
    long mem = GC.GetTotalMemory(true);
    // Source document: <doc> containing `count` empty <a/> elements.
    StringBuilder sb = new StringBuilder("<doc>");
    for (int i = 0; i < count; i++) {
        sb.Append("<a/>");
    }
    sb.Append("</doc>");
    Processor proc = new Processor();
    DocumentBuilder db = proc.NewDocumentBuilder();
    db.TreeModel = model;
    XdmNode doc = db.Build(new StringReader(sb.ToString()));
    sb = null;  // let the source text be collected before measuring
    Console.WriteLine("Memory: " + model + " " + count + " elements = " + (GC.GetTotalMemory(true) - mem));
    GC.KeepAlive(doc);  // keep the tree reachable until the measurement is done
}

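private void buildDocWithAttributes(TreeModel model, int count) {
    // Baseline heap size after a forced collection.
    long mem = GC.GetTotalMemory(true);
    // Source document: <doc> containing `count` empty elements, each with one empty attribute.
    StringBuilder sb = new StringBuilder("<doc>");
    for (int i = 0; i < count; i++) {
        sb.Append("<a b=''/>");
    }
    sb.Append("</doc>");
    Processor proc = new Processor();
    DocumentBuilder db = proc.NewDocumentBuilder();
    db.TreeModel = model;
    XdmNode doc = db.Build(new StringReader(sb.ToString()));
    sb = null;  // let the source text be collected before measuring
    Console.WriteLine("Memory: " + model + " " + count + " attributes = " + (GC.GetTotalMemory(true) - mem));
    GC.KeepAlive(doc);  // keep the tree reachable until the measurement is done
}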

[Test]
public void TestMemoryUsed() {
    buildDocWithElements(TreeModel.TinyTree, 10000);
    buildDocWithElements(TreeModel.TinyTree, 20000);
    buildDocWithAttributes(TreeModel.TinyTree, 10000);
    buildDocWithAttributes(TreeModel.TinyTree, 20000);
    buildDocWithElements(TreeModel.LinkedTree, 10000);
    buildDocWithElements(TreeModel.LinkedTree, 20000);
    buildDocWithAttributes(TreeModel.LinkedTree, 10000);
    buildDocWithAttributes(TreeModel.LinkedTree, 20000);
}

It produced this output:

Memory: TinyTree 10000 elements = 800992
Memory: TinyTree 20000 elements = 992680
Memory: TinyTree 10000 attributes = 900744
Memory: TinyTree 20000 attributes = 1720944
Memory: LinkedTree 10000 elements = 2064384
Memory: LinkedTree 20000 elements = 4072008
Memory: LinkedTree 10000 attributes = 4198024
Memory: LinkedTree 20000 attributes = 8316768

But note that when we add 10000 attributes we are also adding 10000 elements.

My conclusions from this:

For the TinyTree:

* the cost for an additional empty element is (992680 - 800992) / 10000, or about 19 bytes
* the cost for an additional empty element plus empty attribute is (1720944 - 900744) / 10000, or about 82 bytes, so the attribute is 63 bytes

For the LinkedTree:

* the cost for an additional empty element is (4072008 - 2064384) / 10000, or about 200 bytes
* the cost for an additional empty element plus empty attribute is (8316768 - 4198024) / 10000, or about 412 bytes, so the attribute is 212 bytes

These are close to what I would predict from the design.
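
For anyone who wants to repeat the arithmetic, here is a minimal sketch of the calculation. The class and method names (MarginalCost, PerNode) are made up for this illustration and are not part of SaxonCS. Taking the difference between the 10000-node and 20000-node runs cancels out any fixed overhead, so what remains is the cost of the extra 10000 nodes.

```csharp
using System;

// Hypothetical helper for illustration only; not part of SaxonCS.
public static class MarginalCost
{
    // Bytes per additional node, from two runs whose node counts differ by extraNodes.
    public static double PerNode(long smallRunBytes, long largeRunBytes, int extraNodes)
        => (largeRunBytes - smallRunBytes) / (double)extraNodes;

    public static void Main()
    {
        // Figures copied from the empty-element / empty-attribute output above.
        double tinyElem       = PerNode(800992, 992680, 10000);
        double tinyElemAttr   = PerNode(900744, 1720944, 10000);
        double linkedElem     = PerNode(2064384, 4072008, 10000);
        double linkedElemAttr = PerNode(4198024, 8316768, 10000);

        // Each attribute run also adds one element per attribute,
        // so the element cost is subtracted to isolate the attribute.
        Console.WriteLine("TinyTree element:     " + Math.Round(tinyElem));
        Console.WriteLine("TinyTree attribute:   " + Math.Round(tinyElemAttr - tinyElem));
        Console.WriteLine("LinkedTree element:   " + Math.Round(linkedElem));
        Console.WriteLine("LinkedTree attribute: " + Math.Round(linkedElemAttr - linkedElem));
    }
}
```

Because this sketch subtracts before rounding, its LinkedTree figures can differ by a byte from the ones quoted above, which were obtained by subtracting already-rounded values.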

Measuring empty elements and attributes is a bit artificial. If we give each element or attribute a single ASCII character as its value, the numbers change to:

Memory: TinyTree 10000 elements = 994320
Memory: TinyTree 20000 elements = 1379024
Memory: TinyTree 10000 attributes = 1176816
Memory: TinyTree 20000 attributes = 2273088
Memory: LinkedTree 10000 elements = 3103296
Memory: LinkedTree 20000 elements = 6148136
Memory: LinkedTree 10000 attributes = 4478456
Memory: LinkedTree 20000 attributes = 8868808

meaning:

For the TinyTree:

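* the cost for an additional single-character element is (1379024 - 994320) / 10000, or about 38 bytes
* the cost for an additional empty element plus single-character attribute is (2273088 - 1176816) / 10000, or about 110 bytes, so the attribute is 110 - 19 = 91 bytes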

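For the LinkedTree:

* the cost for an additional single-character element is (6148136 - 3103296) / 10000, or about 304 bytes
* the cost for an additional empty element plus single-character attribute is (8868808 - 4478456) / 10000, or about 439 bytes, so the attribute is 439 - 200 = 239 bytes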

Note: from the design (not from measurement) the size should be independent of the length of the name, provided the same names are used repeatedly.
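
To illustrate why name length need not matter, here is a minimal sketch of the general technique of name interning: each distinct name is stored once in a shared pool, and every node holds only a small integer code. The class and method names (SimpleNamePool, Allocate, NameOf) are invented for this example; it is an assumption-laden illustration of the idea, not the actual SaxonCS implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of name interning; not Saxon's actual classes.
public sealed class SimpleNamePool
{
    private readonly Dictionary<string, int> codes = new Dictionary<string, int>();
    private readonly List<string> names = new List<string>();

    // Returns the integer code for a name, allocating a new code on first use.
    public int Allocate(string name)
    {
        if (!codes.TryGetValue(name, out int code))
        {
            code = names.Count;
            names.Add(name);
            codes[name] = code;
        }
        return code;
    }

    // Looks up the name that a code stands for.
    public string NameOf(int code) => names[code];

    // Number of distinct names stored in the pool.
    public int Count => names.Count;
}

public static class NamePoolDemo
{
    public static void Main()
    {
        var pool = new SimpleNamePool();
        string longName = "aVeryLongElementNameUsedManyTimes";

        // Each "node" stores only an int, however long the name is.
        int[] nodeNameCodes = new int[10000];
        for (int i = 0; i < nodeNameCodes.Length; i++)
        {
            nodeNameCodes[i] = pool.Allocate(longName);
        }

        Console.WriteLine(pool.Count);                          // 1: the name is stored once
        Console.WriteLine(nodeNameCodes.Length * sizeof(int));  // 40000: bytes of per-node name codes
    }
}
```

With this kind of layout the per-node cost is fixed by the size of the code, and the length of the name is paid for only once per distinct name.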

Michael Kay
Saxonica



