On Tue, 2020-09-01 at 12:49 +0000, Roger L Costello wrote:
> Hi Folks,
>
> I am reading a book [1] on machine learning and the book says some
> pretty interesting things:
>
> "In the search for more speed, machine learning researchers started
> taking advantage of special hardware found in some computers,
> originally designed to improve graphics performance.
There have been papers from a group at Intel working on speeding up XML
processing using hardware. Oh - while i was writing this i think Tony
Graham mentioned one.
See
https://www3.cs.stonybrook.edu/~mikepo/papers/gnort-regexp.raid09.pdf
for a paper on implementing regular expression matching in GPU
hardware; this is at the core of XSD validation. The speedup they
report in that paper is 60%, though, which is not huge.
I just validated a 43MByte XML file against its DTD using xmllint; it
took 0.4 seconds. Xerces-C took 0.184 seconds.
The word count program, wc, took 0.015 seconds:
    $ time < with-sources.xml wc -l
    647399

    real    0m0.015s
    user    0m0.009s
    sys     0m0.006s
so validating with Xerces was about 10 times slower than just counting
the number of newline characters. The "rev" command, however, which
saves each line in a buffer and then reverses it, takes two seconds,
and sed -e s/girl/boy/g takes 0.14 seconds.
So the speed of parsing doesn't seem to be a huge problem. The xmllint
program builds an in-memory tree which probably accounts for the extra
time.
The expat xmlwf command takes about 0.16 seconds on the same file; the
overhead of validating seems very small.
With the schema instead of a DTD it's only very slightly slower in
xmllint.
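
For anyone wanting to repeat this kind of comparison, here is a minimal
sketch, not the exact commands used above. It assumes bash (for the
"time" keyword), generates a small stand-in document with an internal
DTD subset, and times each tool that happens to be installed. The names
sample.xml and run_timed are made up for illustration, and the xmllint
flags (--noout --valid) and the bare xmlwf invocation are assumptions
about how such a test might be run. The timings above came from a
43MByte file, so numbers from this small sample will not be comparable.

```shell
#!/bin/bash
# Sketch only: sample.xml and run_timed are illustrative names,
# not taken from the original post.

tmpdir=$(mktemp -d)
sample="$tmpdir/sample.xml"

# Generate a small document with an internal DTD subset so that
# validation has something to check.
{
  printf '<?xml version="1.0"?>\n'
  printf '<!DOCTYPE list [\n'
  printf '<!ELEMENT list (item*)>\n'
  printf '<!ELEMENT item (#PCDATA)>\n'
  printf ']>\n'
  printf '<list>\n'
  i=0
  while [ "$i" -lt 1000 ]; do
    printf '<item>girl %d</item>\n' "$i"
    i=$((i + 1))
  done
  printf '</list>\n'
} > "$sample"

# Baseline figure: number of newline characters in the file.
lines=$(wc -l < "$sample" | tr -d ' ')
echo "lines: $lines"

# Time one command, discarding its normal output; skip it if the
# tool is not installed.
run_timed() {
  label=$1
  shift
  if command -v "$1" > /dev/null 2>&1; then
    echo "== $label"
    time "$@" > /dev/null
  else
    echo "== $label: $1 not found, skipped"
  fi
}

run_timed "count newlines"    wc -l "$sample"
run_timed "reverse each line" rev "$sample"
run_timed "substitute text"   sed -e 's/girl/boy/g' "$sample"
run_timed "well-formedness"   xmlwf "$sample"
run_timed "DTD validation"    xmllint --noout --valid "$sample"

rm -rf "$tmpdir"
```

Schema validation could be timed the same way with xmllint's --schema
option, given an XSD file. To get meaningful numbers, point the script
at a large real document instead of the generated sample.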
For a 60% speedup i'm not sure i'd be very interested, because of
potentially becoming tied to proprietary graphics card software.
--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org