Re: [xml-dev] XML not ideal for Big Data
- From: David Carver <d_a_carver@yahoo.com>
- To: Jim Tivy <jimt@bluestream.com>
- Date: Thu, 03 Sep 2009 16:23:51 -0400
> Three limitations to processing XML files are:
>
> 1. XML File Size as set by the OS.
> 2. RAM consumption.
> 3. CPU consumption.
>
> Most XML Parsers can be used on big files (100GB) without exceeding these
> limitations. This is because XML Parsers are stream based - reading small
> chunks at a time. If you want to process the XML file, however, you will
> need to use a streaming technology like SAX. Other XML processing
> technologies like many DOM implementations will cause you to exceed RAM.
>
> Choosing a maximum size somewhere between 1MB and 50MB will let you use a
> wide variety of XML and other technologies (like email attachments) more
> freely, making your XML less constrained. Again, it depends on your use
> cases for the XML.
>
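For reference, a minimal sketch of the SAX-style streaming described above,
using only the parser bundled with the JDK. The element name "record" and the
counting logic are assumptions for illustration; the point is that the handler
sees events as they stream past, so nothing close to the whole document is
ever held in memory:

    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    public class RecordCounter extends DefaultHandler {
        long count = 0;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if ("record".equals(qName)) count++;   // "record" is an assumed element name
        }

        public static void main(String[] args) throws Exception {
            RecordCounter handler = new RecordCounter();
            // The parser streams the file and fires events as it goes; memory
            // use stays roughly constant no matter how large the input is.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new java.io.File(args[0]), handler);
            System.out.println("records: " + handler.count);
        }
    }

The same file that would exhaust RAM under a full DOM load goes through this
in a single pass.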
Jim, in many ways you have hit the nail on the head. XML can be
successfully used with large data dumps. The underlying problem is not
necessarily with the data in XML format but with the tools and frameworks
that are used to process it. The knee-jerk reaction of the common
programmer confronted with XML is to try to data bind against it. There
are a wide variety of ways to process XML, and the most common method
normally isn't the correct one for large data stores. XML databases,
streaming, StAX, SAX, etc. are much more efficient approaches than
trying to data bind and store everything in memory (which is typically
the first reaction). When the data binding fails, the programmer
typically blames XML rather than their own choice of technology to
process it.
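To make that concrete, here is a rough sketch of the pull-parsing (StAX) route
using the javax.xml.stream API that ships with the JDK; the element name
"record" is again just an assumed example:

    import java.io.FileInputStream;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;

    public class StaxScan {
        public static void main(String[] args) throws Exception {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            try (FileInputStream in = new FileInputStream(args[0])) {
                XMLStreamReader reader = factory.createXMLStreamReader(in);
                long records = 0;
                // Pull events one at a time; only the current event is held in
                // memory, so a 100GB file is treated no differently than a 1MB one.
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "record".equals(reader.getLocalName())) {
                        records++;
                    }
                }
                reader.close();
                System.out.println("records: " + records);
            }
        }
    }

With pull parsing you stay in control of the loop, which many people find
easier to follow than SAX callbacks; either way, nothing forces the whole tree
into memory the way naive data binding does.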
Dave