Daily Archives: 04/20/2009

Group By and Aggregates in .NET DataTable

Often there is a need to perform group by operations on in-memory .NET data tables. You can’t run a standard SQL statement and LINQ is not available prior version 3.0 of .NET. To compensate for this shortcoming I put together a small function that works in .NET 2.0. As input parameters it accepts source DataTable, column name to group by and a column name to perform aggregate operation on. It returns a grouped data as data table:

Function GroupBy(ByVal i_sGroupByColumn As String, ByVal i_sAggregateColumn As String, ByVal i_dSourceTable As DataTable) As DataTable


        Dim dv As New DataView(i_dSourceTable)

        'getting distinct values for group column
        Dim dtGroup As DataTable = dv.ToTable(True, New String() {i_sGroupByColumn}) 

        'adding column for the row count
        dtGroup.Columns.Add("Count", GetType(Integer)) 

        'looping thru distinct values for the group, counting
        For Each dr As DataRow In dtGroup.Rows
            dr("Count") = i_dSourceTable.Compute("Count(" & i_sAggregateColumn & ")", i_sGroupByColumn & " = '" & dr(i_sGroupByColumn) & "'")
        Next        

        'returning grouped/counted result
        Return dtGroup
End Function

The function first gets distinct values for group-by column, by creating a data view from source data table and using DataView’s “ToTable” method. It then loops thru these distinct values performing aggregate function on the source table using DataTable’s “Compute” method – Count in this case, but it can easily be replaced with other aggregates.