Re: [xml-dev] Memory usage of elements / attributes

I extended the code to also measure the size of a bare XDocument, XElement and XAttribute. Here we see an even greater difference: an XAttribute is more than 5 times smaller than an XElement:

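    // Requires: using System; using System.Xml; using System.Xml.Linq;
    static void Main(string[] args)
    {
        // GetTotalMemory(true) waits for a garbage collection before reading, so
        // each difference approximates the managed bytes retained by the new object.
        var mem1 = GC.GetTotalMemory(true);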
        var doc = new XmlDocument();
        var mem2 = GC.GetTotalMemory(true);
        var elem = doc.CreateElement("x");
        var mem3 = GC.GetTotalMemory(true);
        var attr = doc.CreateAttribute("y");
        var mem4 = GC.GetTotalMemory(true);
        var xDoc = new XDocument();
        var mem5 = GC.GetTotalMemory(true);
        var xElem = new XElement("x");
        var mem6 = GC.GetTotalMemory(true);
        var xAttr = new XAttribute("y", "1");
        var mem7 = GC.GetTotalMemory(true);
        Console.WriteLine($"XmlDocument: {mem2 - mem1} bytes");
        Console.WriteLine($"XmlElement: {mem3 - mem2} bytes");
        Console.WriteLine($"XmlAttribute: {mem4 - mem3} bytes");
        Console.WriteLine($"XDocument: {mem5 - mem4} bytes");
        Console.WriteLine($"XElement: {mem6 - mem5} bytes");
        Console.WriteLine($"XAttribute: {mem7 - mem6} bytes");
    }

Results:

XmlDocument: 3176 bytes
XmlElement: 656 bytes
XmlAttribute: 152 bytes
XDocument: 56 bytes
XElement: 512 bytes
XAttribute: 96 bytes

Cheers,
Dimitre


On Sun, Jan 16, 2022 at 12:30 PM Michael Kay <mike@saxonica.com> wrote:

I ran a small C# program that shows the sizes of a bare (just created) XmlDocument, XmlElement and XmlAttribute.

Here are the results:

an XmlDocument: 3176 bytes

an XmlElement: 656 bytes

an XmlAttribute: 152 bytes.


I hadn't realised that you could get such high-precision memory instrumentation in C#.

With SaxonCS, on the TinyTree, nodes aren't allocated as individual objects, so we need to do bulk allocation and then compute an average.
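
For readers without SaxonCS, the same bulk-allocate-and-average idea can be sketched against System.Xml.Linq. The helper name `AverageBytes` and the object counts are assumptions made for illustration, not part of the original test, and the figures it prints will vary with runtime and platform:

```csharp
using System;
using System.Xml.Linq;

static class BulkMeasure
{
    // Average managed bytes retained per object produced by `factory`.
    public static double AverageBytes(int count, Func<int, object> factory)
    {
        // Allocate the holding array before the baseline so it is not counted.
        var keep = new object[count];
        long before = GC.GetTotalMemory(true);
        for (int i = 0; i < count; i++)
        {
            keep[i] = factory(i);
        }
        long after = GC.GetTotalMemory(true);
        // The objects must stay reachable until after the second reading.
        GC.KeepAlive(keep);
        return (double)(after - before) / count;
    }

    static void Main()
    {
        double perElement = AverageBytes(100000, _ => new XElement("x"));
        double perAttribute = AverageBytes(100000, _ => new XAttribute("y", "1"));
        Console.WriteLine($"XElement: {perElement} bytes on average");
        Console.WriteLine($"XAttribute: {perAttribute} bytes on average");
    }
}
```

Averaging over many objects spreads any one-off costs (type initialisation, shared name tables) across the whole batch, which a single-object measurement cannot do.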

I ran this test with SaxonCS:

    private void buildDocWithElements(TreeModel model, int count) {
        long mem = GC.GetTotalMemory(true);
        StringBuilder sb = new StringBuilder("<doc>");
        for (int i = 0; i < count; i++) {
            sb.Append("<a/>");
        }
        sb.Append("</doc>");
        Processor proc = new Processor();
        DocumentBuilder db = proc.NewDocumentBuilder();
        db.TreeModel = model;
        XdmNode doc = db.Build(new StringReader(sb.ToString()));
        sb = null;
        Console.WriteLine("Memory: " + model + " " + count + " elements = " + (GC.GetTotalMemory(true) - mem));
    }

    private void buildDocWithAttributes(TreeModel model, int count) {
        long mem = GC.GetTotalMemory(true);
        StringBuilder sb = new StringBuilder("<doc>");
        for (int i = 0; i < count; i++) {
            sb.Append("<a b=''/>");
        }
        sb.Append("</doc>");
        Processor proc = new Processor();
        DocumentBuilder db = proc.NewDocumentBuilder();
        db.TreeModel = model;
        XdmNode doc = db.Build(new StringReader(sb.ToString()));
        sb = null;
        Console.WriteLine("Memory: " + model + " " + count + " attributes = " + (GC.GetTotalMemory(true) - mem));
    }

    [Test]
    public void TestMemoryUsed() {
        buildDocWithElements(TreeModel.TinyTree, 10000);
        buildDocWithElements(TreeModel.TinyTree, 20000);
        buildDocWithAttributes(TreeModel.TinyTree, 10000);
        buildDocWithAttributes(TreeModel.TinyTree, 20000);
        buildDocWithElements(TreeModel.LinkedTree, 10000);
        buildDocWithElements(TreeModel.LinkedTree, 20000);
        buildDocWithAttributes(TreeModel.LinkedTree, 10000);
        buildDocWithAttributes(TreeModel.LinkedTree, 20000);
    }

and it produced this output:

Memory: TinyTree 10000 elements = 800992
Memory: TinyTree 20000 elements = 992680
Memory: TinyTree 10000 attributes = 900744
Memory: TinyTree 20000 attributes = 1720944
Memory: LinkedTree 10000 elements = 2064384
Memory: LinkedTree 20000 elements = 4072008
Memory: LinkedTree 10000 attributes = 4198024
Memory: LinkedTree 20000 attributes = 8316768

But note that when we add 10000 attributes we are also adding 10000 elements.

My conclusions from this:

For the TinyTree:

* the cost for an additional empty element is (992680 - 800992) / 10000 = 19 bytes
* the cost for an additional empty element plus empty attribute is (1720944 - 900744) / 10000 = 82 bytes, so the attribute is 63 bytes

For the Linked Tree:

* the cost for an additional empty element is (4072008 - 2064384) / 10000 = 200 bytes
* the cost for an additional empty element plus empty attribute is (8316768 - 4198024) / 10000 = 412 bytes, so the attribute is 212 bytes

These are close to what I would predict from the design.
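
The arithmetic behind those four figures can be reproduced with a few lines of C#. This is only a sketch: the class and method names are made up for illustration, and the inputs are copied from the output above. Taking the difference between the 20000-node and 10000-node runs cancels the fixed overhead (the Processor, the document node and so on), leaving only the marginal cost per node. Unrounded, the results are about 19.2 and 82.0 bytes for the TinyTree and about 200.8 and 411.9 bytes for the LinkedTree, which agrees with the figures quoted to within rounding.

```csharp
using System;

static class MarginalCost
{
    // Extra bytes per node when the node count grows from n1 to n2.
    public static double PerNode(long bytesAtN1, long bytesAtN2, int n1, int n2)
    {
        return (double)(bytesAtN2 - bytesAtN1) / (n2 - n1);
    }

    static void Main()
    {
        // Figures copied from the empty-element / empty-attribute runs.
        double tinyElem = PerNode(800992, 992680, 10000, 20000);
        double tinyElemAttr = PerNode(900744, 1720944, 10000, 20000);
        double linkedElem = PerNode(2064384, 4072008, 10000, 20000);
        double linkedElemAttr = PerNode(4198024, 8316768, 10000, 20000);

        // Each attribute run also adds one element per attribute, so the
        // attribute's own share is the difference between the two costs.
        Console.WriteLine($"TinyTree element: {tinyElem}");
        Console.WriteLine($"TinyTree attribute: {tinyElemAttr - tinyElem}");
        Console.WriteLine($"LinkedTree element: {linkedElem}");
        Console.WriteLine($"LinkedTree attribute: {linkedElemAttr - linkedElem}");
    }
}
```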

Measuring empty elements and attributes is a bit artificial. If we instead give each element and each attribute a single ASCII character as its value, the numbers change to:

Memory: TinyTree 10000 elements = 994320
Memory: TinyTree 20000 elements = 1379024
Memory: TinyTree 10000 attributes = 1176816
Memory: TinyTree 20000 attributes = 2273088
Memory: LinkedTree 10000 elements = 3103296
Memory: LinkedTree 20000 elements = 6148136
Memory: LinkedTree 10000 attributes = 4478456
Memory: LinkedTree 20000 attributes = 8868808

meaning:

For the TinyTree:

* the cost for an additional single-character element is 38 bytes
* the cost for an additional single-character attribute is 110 - 19 = 91 bytes

For the Linked Tree:

* the cost for an additional single-character element is 304 bytes
* the cost for an additional single-character attribute is 439 - 200 = 239 bytes

Note: from the design (not from measurement) the size should be independent of the length of the name, provided the same names are used repeatedly.
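
One common way to get that property is to store each distinct name once in a shared table and have every node hold only a small integer code. The sketch below illustrates that general technique; the `NamePool` class here is hypothetical, written for this example, and is not taken from the SaxonCS source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of name interning; not the SaxonCS implementation.
sealed class NamePool
{
    private readonly Dictionary<string, int> codes = new Dictionary<string, int>();
    private readonly List<string> names = new List<string>();

    // Returns the existing code for a name, or allocates a new one.
    public int Allocate(string name)
    {
        if (!codes.TryGetValue(name, out int code))
        {
            code = names.Count;
            names.Add(name);
            codes.Add(name, code);
        }
        return code;
    }

    public string NameOf(int code) { return names[code]; }

    public int Count { get { return names.Count; } }
}

static class NamePoolDemo
{
    static void Main()
    {
        var pool = new NamePool();
        // Each "node" stores a 4-byte code, however long its name is.
        var nodeNameCodes = new int[10000];
        for (int i = 0; i < nodeNameCodes.Length; i++)
        {
            nodeNameCodes[i] = pool.Allocate("aVeryLongElementNameUsedRepeatedly");
        }
        Console.WriteLine($"{nodeNameCodes.Length} nodes, {pool.Count} distinct name(s) stored");
    }
}
```

With this layout the per-node cost is one integer regardless of name length; the name's characters are paid for once per distinct name.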

Michael Kay
Saxonica



--
Cheers,
Dimitre Novatchev