Re: [xml-dev] Convert an XML Schema validation task into a form that is suitable for running on Graphics Processing Units (GPUs)?

Liam, although the Nvidia architecture is proprietary, it's available from a lot of vendors. I'd be no more worried about lock-in than when using the Intel architecture. The CUDA libraries have been around for something like 12 or more years and make it fairly easy to use a graphics card for parallel programming. As an aside, I once helped a hardware manufacturer look at using a parallel architecture for processing an XSLT-like language (pre-XSLT). CUDA would be a good match for that kind of task with XSLT, if someone really wants a reason to play around with parallel processing and XML.
Peter Hunsberger


On Tue, Sep 1, 2020 at 10:28 PM Liam R. E. Quin <liam@fromoldbooks.org> wrote:
On Tue, 2020-09-01 at 12:49 +0000, Roger L Costello wrote:
> Hi Folks,
>
> I am reading a book [1] on machine learning and the book says some
> pretty interesting things:
>
> "In the search for more speed, machine learning researchers started
> taking advantage of special hardware found in some computers,
> originally designed to improve graphics performance.

There have been papers from a group at Intel working on speeding up XML
processing using hardware. Oh - while i was writing this i think Tony
Graham mentioned one.

See
https://www3.cs.stonybrook.edu/~mikepo/papers/gnort-regexp.raid09.pdf
for a paper on implementing regular expression matching in GPU
hardware; this is at the core of XSD validation.  The speedup they
report in that paper is 60%, though, which is not huge.
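
To make the "regular expressions are at the core of XSD validation"
point concrete, here is a minimal sketch of checking an element's
children against a content model with a table-driven automaton. The
content model (title, para*, note?) and the names TRANSITIONS, ACCEPTING
and matches_content_model are made up for illustration; they are not
taken from the paper or from any real validator.

```python
# Hypothetical content model: title, para*, note?
# State 0: nothing seen yet
# State 1: title seen, any number of para elements may follow
# State 2: optional trailing note seen, nothing more allowed
TRANSITIONS = {
    (0, "title"): 1,
    (1, "para"): 1,
    (1, "note"): 2,
}
ACCEPTING = {1, 2}


def matches_content_model(children):
    """Return True if the sequence of child element names is accepted."""
    state = 0
    for name in children:
        state = TRANSITIONS.get((state, name))
        if state is None:
            # No transition for this child in the current state.
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    print(matches_content_model(["title", "para", "para", "note"]))
    print(matches_content_model(["para", "title"]))
```

Each check is a loop of table lookups with no shared state, so many
elements could in principle be checked independently of one another.
Whether that pays off on a GPU is a separate question.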


I just validated a 43MByte XML file against its DTD using xmllint; it
took 0.4 seconds. Xerces-C took 0.184 seconds.

The word count program, wc, took 0.015 seconds:

$ time < with-sources.xml wc -l
647399

real    0m0.015s
user    0m0.009s
sys     0m0.006s

so validating with Xerces was about 10 times slower than just counting
the number of newline characters. The "rev" command, however, which
saves each line in a buffer and then reverses it, takes two seconds,
and sed -e s/girl/boy/g takes 0.14 seconds.

So the speed of parsing doesn't seem to be a huge problem. The xmllint
program builds an in-memory tree which probably accounts for the extra
time.

The expat xmlwf command takes about 0.16 seconds on the  same file; the
overhead of validating seems very small.
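
For anyone who wants to try a similar comparison without the original
file, here is a rough sketch that times counting newlines against a
well-formedness parse with Python's expat binding. The synthetic
document from make_document is a stand-in and not with-sources.xml, all
function names are made up for the example, and the numbers it prints
will not match the timings quoted in this message.

```python
import time
import xml.parsers.expat


def make_document(n_lines):
    """Build a synthetic XML document as bytes (hypothetical test data)."""
    body = "".join(f'<item id="{i}">text {i}</item>\n' for i in range(n_lines))
    return ("<root>\n" + body + "</root>\n").encode("utf-8")


def count_newlines(data):
    """Roughly what `wc -l` does: count newline bytes."""
    return data.count(b"\n")


def count_elements(data):
    """Parse with expat (no validation) and count start tags."""
    count = 0

    def start(name, attrs):
        nonlocal count
        count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(data, True)  # True marks this as the final chunk
    return count


def timed(fn, data):
    """Run fn(data) once and return (result, elapsed seconds)."""
    t0 = time.perf_counter()
    result = fn(data)
    return result, time.perf_counter() - t0


if __name__ == "__main__":
    doc = make_document(200_000)
    for fn in (count_newlines, count_elements):
        result, seconds = timed(fn, doc)
        print(f"{fn.__name__}: {result} in {seconds:.3f}s")
```

The Python callback per start tag adds overhead that the C tools above
do not have, so treat the ratio as a rough guide only.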

With the schema instead of a DTD it's only very slightly slower in
xmllint.


For a 60% speedup i'm not sure i'd be very interested, because of
potentially becoming tied to proprietary graphics card software.

--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org


_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php


