RE: Here’s how to eliminate duplicate elements using XSLT

Roger: Just curious, what is the time complexity of that loop in Saxon? Is Saxon able to do it in better than O(n**2)?

Michael Kay responded:

O(n log n)

group-by is implemented using a hash table.

From: Roger L Costello <costello@mitre.org>
Sent: Friday, July 9, 2021 9:47 AM
To: xml-dev@lists.xml.org
Subject: Here’s how to eliminate duplicate elements using XSLT

Hi Folks,

If you are processing XML documents using XSLT, this is a handy thing to know.

I have an XML document that consists of <row> elements:

<Document>

<row>

</row>

<row>

</row>

<row>

</row>

</Document>

I want to eliminate duplicate rows. row[1] and row[3] have the same elements with the same values. They are duplicates. I only want one of them.

I posted a question to the worldwide xsl-list and Michael Kay responded:

In this situation, you can use grouping:

<xsl:for-each-group select="row" group-by="x, y" composite="yes">

<xsl:sequence select="current-group()[1]"/>

</xsl:for-each-group>

Roger: Wow! That simple for-loop is doing an enormous amount of work: grouping elements by their content, iterating over the groups, and selecting the first of each group (thus deleting the duplicates). That is beautiful code. Thank you Michael.
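
For anyone who wants to try this, here is a minimal sketch of a complete stylesheet built around Michael's snippet. It rests on a few assumptions that are not spelled out above: that <Document> is the root element, that each <row> contains exactly one <x> and one <y> child (as the group-by expression suggests), and that an XSLT 3.0 processor is used, since the composite attribute was introduced in XSLT 3.0. The output settings are illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes <Document> is the root and each <row> has <x> and <y> children. -->
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/Document">
    <Document>
      <!-- composite="yes" treats the sequence (x, y) as a single grouping key,
           so two rows land in the same group only if both values match. -->
      <xsl:for-each-group select="row" group-by="x, y" composite="yes">
        <!-- Keep the first row of each group; the remaining rows are the duplicates. -->
        <xsl:sequence select="current-group()[1]"/>
      </xsl:for-each-group>
    </Document>
  </xsl:template>

</xsl:stylesheet>
```

With a hypothetical input whose first and third rows both hold x=1, y=2 and whose second row holds x=3, y=4, this should emit two rows, in order of first appearance: the x=1, y=2 row followed by the x=3, y=4 row. If rows can carry children other than x and y, the group-by expression would need to list them as well.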