Often there is a need to perform group by operations on in-memory .NET data tables. You can’t run a standard SQL statement and LINQ is not available prior version 3.0 of .NET. To compensate for this shortcoming I put together a small function that works in .NET 2.0. As input parameters it accepts source DataTable, column name to group by and a column name to perform aggregate operation on. It returns a grouped data as data table:
Function GroupBy(ByVal i_sGroupByColumn As String, ByVal i_sAggregateColumn As String, ByVal i_dSourceTable As DataTable) As DataTable
Dim dv As New DataView(i_dSourceTable)
'getting distinct values for group column
Dim dtGroup As DataTable = dv.ToTable(True, New String() {i_sGroupByColumn})
'adding column for the row count
dtGroup.Columns.Add("Count", GetType(Integer))
'looping thru distinct values for the group, counting
For Each dr As DataRow In dtGroup.Rows
dr("Count") = i_dSourceTable.Compute("Count(" & i_sAggregateColumn & ")", i_sGroupByColumn & " = '" & dr(i_sGroupByColumn) & "'")
Next
'returning grouped/counted result
Return dtGroup
End Function
The function first gets distinct values for group-by column, by creating a data view from source data table and using DataView’s “ToTable” method. It then loops thru these distinct values performing aggregate function on the source table using DataTable’s “Compute” method – Count in this case, but it can easily be replaced with other aggregates.