Re: [xml-dev] Convert an XML Schema validation task into a form that is suitable for running on Graphics Processing Units (GPUs)?

Ultimately, yes. Initially, no. 

I expect this is another case where there may be an application-and-data sweet spot where GPU processing comes into its own.

There is a great difference between the problems for which GPUs are optimal and those for which SIMD is the bee's knees. And their algorithms could not be more different.

For GPUs, you must have a large amount of essentially identical data, in 1D, 2D, or 3D, and you want algorithms that eliminate branches and random or non-local lookups.

So instead of
  if (a < 200) a = a + 1; else a = a - 1;

for GPU you have
  int adjustment[2] = { +1, -1 };  /* [0] when a < 200, [1] when a >= 200 */
  int idx = (int)(a >= 200);
  a = a + adjustment[idx];

That is because, on a GPU, both branches of a conditional will be stepped through (the inactive one treated as a NOP), even though they are mutually exclusive.
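
As a self-contained sketch in plain C (the 200 threshold is from the example above; the function and variable names are just mine for illustration):

  #include <stdio.h>

  /* Branchless update: index a two-entry table with the result of the
     comparison instead of branching. adjustment[0] applies when a < 200,
     adjustment[1] when a >= 200. */
  int adjust(int a)
  {
      static const int adjustment[2] = { +1, -1 };
      return a + adjustment[a >= 200];
  }

  int main(void)
  {
      printf("%d %d\n", adjust(150), adjust(250));  /* prints: 151 249 */
      return 0;
  }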

For SIMD, you get a chunk of consecutive data, however much will fit into the widest CPU register (e.g. 8 or 16 bytes), and:
 1. examine it in parallel to see if it has features;
 2. process it in parallel according to those features as much as possible;
 3. flip to non-parallel processing for outliers;
 4. reset to the next useful spot.

So for SIMD you might have:

int4 reg1 = get4ints(input[i]);   /* load four consecutive ints        */
int4 reg2 = make4ints(200);       /* broadcast 200 into all four lanes */

if (allLessThan(reg1, reg2)) {    /* fast path: every lane below 200   */
   put4ints(output[i], incrementInt4(reg1));
   i = i + 4;
}
else {                            /* slow path: four lanes one by one  */
   int a = input[i];
   output[i++] = a < 200 ? a + 1 : a - 1;
   a = input[i];
   output[i++] = a < 200 ? a + 1 : a - 1;
   a = input[i];
   output[i++] = a < 200 ? a + 1 : a - 1;
   a = input[i];
   output[i++] = a < 200 ? a + 1 : a - 1;
}

Now, it is true that both GPU and SIMD benefit from algorithms that fold decision-making into tables. And the two worlds are cross-pollinating. And they are both skewed towards graphics processing, not text.

But getting back to the sweet spot: the big reason why GPU processing may not take off is that it requires someone to pull their finger out and do it, open source, which effectively means someone to pay for it. People should not assume that the people who pioneer a project won't need their babies to grow up and leave home (Stallman, Linus, and Raymond notwithstanding).

After all this time, why is Xerces/Xalan pretty much unoptimized? Fifteen years ago, I pointed out that a trivial SSE2 optimization could speed up the UTF-8 unpacking by four times:
https://web.archive.org/web/20081021084904/http://blogs.oreilly.com/digitalmedia/2005/11/using-c-intrinsic-functions-2.html
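
For flavour, here is a minimal sketch of that kind of trick (the function name and scaffolding are mine, not taken from the linked post): use one SSE2 movemask to test 16 bytes at a time for being pure ASCII, and only fall back to per-byte UTF-8 handling when a high bit is set.

  #include <emmintrin.h>   /* SSE2 intrinsics */
  #include <stddef.h>

  /* Return the length of the leading all-ASCII run of src[0..len).
     Those bytes can be widened with no multi-byte sequence checks;
     per-byte decoding resumes at the first high-bit byte. */
  size_t ascii_run(const unsigned char *src, size_t len)
  {
      size_t i = 0;
      for (; i + 16 <= len; i += 16) {
          __m128i chunk = _mm_loadu_si128((const __m128i *)(src + i));
          /* movemask gathers the top bit of each of the 16 bytes;
             non-zero means at least one non-ASCII byte in the block */
          if (_mm_movemask_epi8(chunk) != 0)
              break;
      }
      while (i < len && src[i] < 0x80)
          i++;
      return i;
  }

A real decoder would then widen the ASCII run with byte unpacks (e.g. _mm_unpacklo_epi8 against a zero register) rather than a scalar loop.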

And we had lots of people making noise about how slow XML parsing was. So why did no one try to optimize libxml2 better, then or since, as far as I can see (i.e. with explicit use of intrinsics; happy to be wrong)? No slight on libxml2 or its developers, btw.

Rick

On Wed, 2 Sep. 2020, 07:37 Tom Hillman, <tom@expertml.com> wrote:
A more pertinent question might be: would the benefit of doing so outweigh the cost?

Under what circumstances would you need schema validation to be so massively parallel as to be worth shunting the task from the CPU to the GPU? Are there delays in doing so, and how do they compare with the time savings?

What about the developers' costs? CPU time is a lot cheaper than programmers' time (a point often made in Steven Pemberton's talks, since it wasn't always so). Perhaps the answer to that is (nearly) none: I would expect a lot of that sort of work to be done by the compiler or equivalent these days...

I don't come from a formal software engineering background, so I don't know the answers, but I suspect the additional development cost might not be worthwhile unless you're doing things at massive scale.

T

_________________
Tomos Hillman
eXpertML Ltd
+44 7793 242058
On 1 Sep 2020, 21:44 +0100, Tony Graham <tgraham@antenna.co.jp>, wrote:
On 01/09/2020 13:49, Roger L Costello wrote:
...
Here's a crazy question: Can the task of validating an XML instance
against an XML Schema be turned into a form that could run with
benefit on GPUs?

There was a paper about parallel processing and parsing at Balisage 2013:

Medforth, Nigel, Dan Lin, Kenneth Herdy, Rob Cameron and Arrvindh
Shriraman. “icXML: Accelerating a Commercial XML Parser Using SIMD and
Multicore Technologies.” Presented at Balisage: The Markup Conference
2013, Montréal, Canada, August 6 - 9, 2013. In Proceedings of Balisage:
The Markup Conference 2013. Balisage Series on Markup Technologies, vol.
10 (2013). https://doi.org/10.4242/BalisageVol10.Cameron01.

https://www.balisage.net/Proceedings//vol10/html/Cameron01/BalisageVol10-Cameron01.html

Their software still seems to be in active development, but I haven't
seen that it has made the leap to GPUs:

https://cs-git-research.cs.surrey.sfu.ca/cameron/parabix-devel/-/wikis/home

Regards,


Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.
----
Skerries, Ireland
tgraham@antenna.co.jp



