How to change the data layout of a data table with LINQ - c#

Above is a screenshot of one of my DataTables. I am trying to transform this data into the following format so that I can bind it to one of my grids. I have tried LINQ without success.
Could anyone please help me with this? It doesn't necessarily have to be LINQ, but I think it will be easier with LINQ.

Try the following:
var result = dataSet.Tables["reportColumns"].AsEnumerable().GroupBy(x => x.Field<string>("Object"))
.Select(g => new
{
ColumnName = g.Key,
DefaultColumn = g.FirstOrDefault(p => p.Field<string>("Attribute") == "DefaultColumn").Field<string>("Value"),
Label = g.FirstOrDefault(p => p.Field<string>("Attribute") == "Label").Field<string>("Value"),
Type = g.FirstOrDefault(p => p.Field<string>("Attribute") == "Type").Field<string>("Value"),
Standard = g.FirstOrDefault().Field<int>("Standard")
}).ToList();
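One caveat with the query above: `FirstOrDefault(...)` returns null when an object has no row for a given attribute, so the chained `.Field<string>("Value")` call would throw a `NullReferenceException`. A minimal sketch of a more tolerant variant follows, assuming the "Object" and "Attribute" columns never contain nulls. The class name, the helper names, and the sample rows are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ReportColumnPivot
{
    // Returns one dictionary per distinct "Object" value, mapping each
    // "Attribute" name to its "Value". Missing attributes are simply absent.
    // Assumes "Object" and "Attribute" are never null.
    public static Dictionary<string, Dictionary<string, string>> Pivot(DataTable table)
    {
        return table.AsEnumerable()
            .GroupBy(r => r.Field<string>("Object"))
            .ToDictionary(
                g => g.Key,
                g => g.GroupBy(r => r.Field<string>("Attribute"))
                      .ToDictionary(a => a.Key, a => a.First().Field<string>("Value")));
    }

    // Looks up an attribute and falls back to null instead of throwing.
    public static string GetOrNull(Dictionary<string, string> attributes, string name)
    {
        return attributes.TryGetValue(name, out var value) ? value : null;
    }

    // Hypothetical sample data; the real table comes from the question's DataSet.
    public static DataTable BuildSample()
    {
        var table = new DataTable("reportColumns");
        table.Columns.Add("Object", typeof(string));
        table.Columns.Add("Attribute", typeof(string));
        table.Columns.Add("Value", typeof(string));
        table.Rows.Add("Name", "Label", "Customer name");
        table.Rows.Add("Name", "Type", "string");
        table.Rows.Add("Age", "Label", "Customer age");
        return table;
    }
}
```

With this shape, a grid row can be built per object by calling `GetOrNull` for "DefaultColumn", "Label" and "Type", and a missing attribute shows up as an empty cell rather than an exception.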

You can use my ToPivotTable extension:
public static DataTable ToPivotTable<T, TColumn, TRow, TData>(
this IEnumerable<T> source,
Func<T, TColumn> columnSelector,
Expression<Func<T, TRow>> rowSelector,
Func<IEnumerable<T>, TData> dataSelector)
{
DataTable table = new DataTable();
var rowName = ((MemberExpression)rowSelector.Body).Member.Name;
table.Columns.Add(new DataColumn(rowName));
var columns = source.Select(columnSelector).Distinct();
foreach (var column in columns)
table.Columns.Add(new DataColumn(column.ToString()));
var rows = source.GroupBy(rowSelector.Compile())
.Select(rowGroup => new
{
Key = rowGroup.Key,
Values = columns.GroupJoin(
rowGroup,
c => c,
r => columnSelector(r),
(c, columnGroup) => dataSelector(columnGroup))
});
foreach (var row in rows)
{
var dataRow = table.NewRow();
var items = row.Values.Cast<object>().ToList();
items.Insert(0, row.Key);
dataRow.ItemArray = items.ToArray();
table.Rows.Add(dataRow);
}
return table;
}
Create strongly-typed data from your source table:
var data = from r in table.AsEnumerable()
select new {
Object = r.Field<string>("Object"),
Attribute = r.Field<string>("Attribute"),
Value = r.Field<object>("Value")
};
Then convert it to a pivot table:
var pivotTable = data.ToPivotTable(r => r.Attribute,
r => r.Object,
rows => rows.First().Value);
This creates a pivot table with the distinct values of Attribute (i.e. DefaultColumn, Label, Type) as columns. Each row is the group for one Object value, and each cell holds the Value of the first row matching that object group and attribute column.
Or in a single query:
var pivotTable = table.AsEnumerable()
.Select(r => new {
Object = r.Field<string>("Object"),
Attribute = r.Field<string>("Attribute"),
Value = r.Field<object>("Value")
})
.ToPivotTable(r => r.Attribute,
r => r.Object,
rows => rows.First().Value);

Related

How can I improve the performance of the following code?

This code works but takes too much time. Every data table contains thousands of rows, and on each iteration I need to filter data from the other data tables by a column.
for (int i = 0; i < dsResult.Tables[0].Rows.Count; i++)
{
DataTable dtFiltered = dtWorkExp.Clone();
foreach (DataRow drr in dtWorkExp.Rows)
{
if (drr["UserId"].ToString() == dsResult.Tables[0].Rows[i]["Registration NO."].ToString())
{
dtFiltered.ImportRow(drr);
}
}
DataTable dtFilteredAward= dtAwards.Clone();
foreach (DataRow drr in dtAwards.Rows)
{
if (drr["UserId"].ToString() == dsResult.Tables[0].Rows[i]["Registration NO."].ToString())
{
dtFilteredAward.ImportRow(drr);
}
}
DataTable dtFilteredOtherQual = dtOtherQual.Clone();
foreach (DataRow drr in dtOtherQual.Rows)
{
if (drr["UserId"].ToString() == dsResult.Tables[0].Rows[i]["Registration NO."].ToString())
{
dtFilteredOtherQual.ImportRow(drr);
}
}
//Do some operation with filtered Data Tables
}
You can declare lines like this one outside the for loop (clearing the table with Clear() at the start of each iteration so rows do not accumulate):
DataTable dtFiltered = dtWorkExp.Clone();
And instead of accessing dsResult.Tables[0] each time, you can assign it to a variable and use that.
You can also replace the foreach loops with LINQ.
What I would do:
All rows of the main datatable as enumerable:
var rows = dsResult.Tables[0].AsEnumerable();
Get the column you're going to filter with:
var filter = rows.Select(r => r.Field<string>("Registration NO."));
Create a method that accepts that filter, a table to filter and a field to compare.
public static DataTable Filter<T>(EnumerableRowCollection<T> filter, DataTable table, string fieldName)
{
return table.AsEnumerable().Where(r => filter.Contains(r.Field<T>(fieldName))).CopyToDataTable();
}
Finally use the method to filter all tables:
var dtFiltered = Filter<string>(filter, dtWorkExp, "UserId");
var dtFilteredAward = Filter<string>(filter, dtAwards, "UserId");
var dtFilteredOtherQual = Filter<string>(filter, dtOtherQual, "UserId");
All together it would be something like this:
public void YourMethod()
{
var rows = dsResult.Tables[0].AsEnumerable();
var filter = rows.Select(r => r.Field<string>("Registration NO."));
var dtFiltered = Filter<string>(filter, dtWorkExp, "UserId");
var dtFilteredAward = Filter<string>(filter, dtAwards, "UserId");
var dtFilteredOtherQual = Filter<string>(filter, dtOtherQual, "UserId");
}
public static DataTable Filter<T>(EnumerableRowCollection<T> filter, DataTable table, string fieldName)
{
return table.AsEnumerable().Where(r => filter.Contains(r.Field<T>(fieldName))).CopyToDataTable();
}
Put the value of the expression in a variable.
var regNo = dsResult.Tables[0].Rows[i]["Registration NO."].ToString();
Store the column index in a variable. Access by index is faster than access by column name.
int index = dtWorkExp.Columns["UserId"].Ordinal;
Resulting code:
int dtWorkIndex = dtWorkExp.Columns["UserId"].Ordinal;
int dtAwardsIndex = dtAwards.Columns["UserId"].Ordinal;
int dtOtherQualIndex = dtOtherQual.Columns["UserId"].Ordinal;
for (int i = 0; i < dsResult.Tables[0].Rows.Count; i++)
{
var regNo = dsResult.Tables[0].Rows[i]["Registration NO."].ToString();
DataTable dtFiltered = dtWorkExp.Clone();
foreach (DataRow drr in dtWorkExp.Rows)
{
if (drr[dtWorkIndex].ToString() == regNo)
{
dtFiltered.ImportRow(drr);
}
}
...
Of course, the column index can be set as a constant if you know it exactly in advance. Also, if the UserId indexes match in all tables, a single variable is sufficient.
You can also try using the BeginLoadData and EndLoadData methods.
DataTable dtFiltered = dtWorkExp.Clone();
dtFiltered.BeginLoadData();
foreach (DataRow drr in dtWorkExp.Rows)
{
if (drr[dtWorkIndex].ToString() == regNo)
{
dtFiltered.ImportRow(drr);
}
}
dtFiltered.EndLoadData();
But I'm not sure if they make sense together with ImportRow.
Finally, parallelization can help. Each task below writes only to its own cloned table:
for (int i = 0; i < dsResult.Tables[0].Rows.Count; i++)
{
var regNo = ...;
var workTask = Task.Run(() =>
{
DataTable dtFiltered = dtWorkExp.Clone();
foreach (DataRow drr in dtWorkExp.Rows)
{
if (drr[dtWorkIndex].ToString() == regNo)
{
dtFiltered.ImportRow(drr);
}
}
return dtFiltered;
});
var awardTask = Task.Run(() =>
...
var otherQualTask = Task.Run(() =>
...
//Task.WaitAll(workTask, awardTask, otherQualTask);
await Task.WhenAll(workTask, awardTask, otherQualTask);
//Do some operation with filtered Data Tables
}
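Another option, not taken from the answers above, is to stop rescanning each table for every registration number and instead group each table by "UserId" once. The sketch below uses `Enumerable.ToLookup` for that, so each per-registration filter becomes a lookup instead of a full pass. The class and method names are hypothetical.

```csharp
using System;
using System.Data;
using System.Linq;

public static class RowIndex
{
    // Groups the rows of a table by the string form of one column, in a single pass.
    public static ILookup<string, DataRow> ByKey(DataTable table, string columnName)
    {
        int ordinal = table.Columns[columnName].Ordinal;
        return table.AsEnumerable().ToLookup(r => r[ordinal].ToString());
    }

    // Copies the rows for one key into a new table with the same schema.
    // Unlike CopyToDataTable, this also works when there are no matching rows.
    public static DataTable RowsFor(ILookup<string, DataRow> index, DataTable schemaSource, string key)
    {
        DataTable result = schemaSource.Clone();
        foreach (DataRow row in index[key])
        {
            result.ImportRow(row);
        }
        return result;
    }
}
```

In the question's loop this would mean calling `RowIndex.ByKey(dtWorkExp, "UserId")` (and the same for dtAwards and dtOtherQual) once before the for loop, then calling `RowIndex.RowsFor(workIndex, dtWorkExp, regNo)` inside it. A lookup returns an empty sequence for an unknown key, so registration numbers without matches produce an empty table.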

How to get all DataColumns of a DataTable to Lists

I want to get all DataColumns (of double type) of my DataTable in Lists and then create a Dictionary where the Key would be the header of the DataColumn and the Value would be the List with the data of the DataColumn. How can I achieve this with LINQ?
I tried the following lines without success:
// Create Dictionary
Dictionary<string, List<double>> DataDic = new Dictionary<string, List<double>>();
// Create List
List<double> DataList = new List<double>();
// For each DataColumn save it as a List of double
DataList = (from DataColumn dc in dt.Columns select new double()).ToList();
// Add KVP to DataDic
DataDic.Add(column.ColumnName, DataList);
Thanks in advance.
That is pretty straightforward:
// Create Dictionary
var DataDic = dt.Columns.Cast<DataColumn>()
.Where(dc => dc.DataType == typeof(double))
.ToDictionary(dc => dc.ColumnName,
dc => dt.AsEnumerable()
.Select(r => r.Field<double>(dc.ColumnName))
.ToList()
);
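One thing to watch for: `Field<double>` throws an `InvalidCastException` when a cell contains `DBNull`. If the table may contain empty cells, a variant along these lines skips them by reading the column as `double?`. This is a sketch under that assumption; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ColumnLists
{
    // Builds "column name -> list of values" for every column of type double.
    // DBNull cells are skipped: Field<double?> returns null for them instead of throwing.
    public static Dictionary<string, List<double>> DoubleColumns(DataTable dt)
    {
        return dt.Columns.Cast<DataColumn>()
            .Where(dc => dc.DataType == typeof(double))
            .ToDictionary(
                dc => dc.ColumnName,
                dc => dt.AsEnumerable()
                        .Select(r => r.Field<double?>(dc))
                        .Where(v => v.HasValue)
                        .Select(v => v.Value)
                        .ToList());
    }
}
```

Skipping empty cells means the lists for different columns can end up with different lengths, so this only fits cases where row alignment between columns does not matter.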

Use linq to find DataTable(Name) in a DataSet using unique list of Column Names

I got roped into some old code that uses loose (untyped) datasets all over the place.
I'm trying to write a helper method to find the DataTable name using the names of some columns, because the original code has checks for "sometimes we have 2 datatables in a dataset, sometimes 3, sometimes 4" and it's hard to know the order. Basically, the TSQL Select statements run conditionally. (Gaaaaaaaaaaaaaahhh).
Anyway, I wrote the code below, and if I give it 2 column names, it matches on "any" column name, not "all" column names.
It's probably my LINQ skillz (again), and probably a simple fix.
But I've tried to get the syntax sugar down. Below is one of the things I wrote that compiles.
private static void DataTableFindStuff()
{
DataSet ds = new DataSet();
DataTable dt1 = new DataTable("TableOne");
dt1.Columns.Add("Table1Column11");
dt1.Columns.Add("Name");
dt1.Columns.Add("Age");
dt1.Columns.Add("Height");
DataRow row1a = dt1.NewRow();
row1a["Table1Column11"] = "Table1Column11_ValueA";
row1a["Name"] = "Table1_Name_NameA";
row1a["Age"] = "AgeA";
row1a["Height"] = "HeightA";
dt1.Rows.Add(row1a);
DataRow row1b = dt1.NewRow();
row1b["Table1Column11"] = "Table1Column11_ValueB";
row1b["Name"] = "Table1_Name_NameB";
row1b["Age"] = "AgeB";
row1b["Height"] = "HeightB";
dt1.Rows.Add(row1b);
ds.Tables.Add(dt1);
DataTable dt2 = new DataTable("TableTwo");
dt2.Columns.Add("Table2Column21");
dt2.Columns.Add("Name");
dt2.Columns.Add("BirthCity");
dt2.Columns.Add("BirthState");
DataRow row2a = dt2.NewRow();
row2a["Table2Column21"] = "Table2Column1_ValueG";
row2a["Name"] = "Table2_Name_NameG";
row2a["BirthCity"] = "BirthCityA";
row2a["BirthState"] = "BirthStateA";
dt2.Rows.Add(row2a);
DataRow row2b = dt2.NewRow();
row2b["Table2Column21"] = "Table2Column1_ValueH";
row2b["Name"] = "Table2_Name_NameH";
row2b["BirthCity"] = "BirthCityB";
row2b["BirthState"] = "BirthStateB";
dt2.Rows.Add(row2b);
ds.Tables.Add(dt2);
DataTable dt3 = new DataTable("TableThree");
dt3.Columns.Add("Table3Column31");
dt3.Columns.Add("Name");
dt3.Columns.Add("Price");
dt3.Columns.Add("QuantityOnHand");
DataRow row3a = dt3.NewRow();
row3a["Table3Column31"] = "Table3Column31_ValueM";
row3a["Name"] = "Table3_Name_Name00M";
row3a["Price"] = "PriceA";
row3a["QuantityOnHand"] = "QuantityOnHandA";
dt3.Rows.Add(row3a);
DataRow row3b = dt3.NewRow();
row3b["Table3Column31"] = "Table3Column31_ValueN";
row3b["Name"] = "Table3_Name_Name00N";
row3b["Price"] = "PriceB";
row3b["QuantityOnHand"] = "QuantityOnHandB";
dt3.Rows.Add(row3b);
ds.Tables.Add(dt3);
string foundDataTable1Name = FindDataTableName(ds, new List<string> { "Table1Column11", "Name" });
/* foundDataTable1Name should be 'TableOne' */
string foundDataTable2Name = FindDataTableName(ds, new List<string> { "Table2Column21", "Name" });
/* foundDataTable2Name should be 'TableTwo' */
string foundDataTable3Name = FindDataTableName(ds, new List<string> { "Table3Column31", "Name" });
/* foundDataTable3Name should be 'TableThree' */
string foundDataTableThrowsExceptionName = FindDataTableName(ds, new List<string> { "Name" });
/* should throw an exception, as 'Name' is in multiple (distinct) tables */
}
public static string FindDataTableName(DataSet ds, List<string> columnNames)
{
string returnValue = string.Empty;
DataTable foundDataTable = FindDataTable(ds, columnNames);
if (null != foundDataTable)
{
returnValue = foundDataTable.TableName;
}
return returnValue;
}
public static DataTable FindDataTable(DataSet ds, List<string> columnNames)
{
DataTable returnItem = null;
if (null == ds || null == columnNames)
{
return null;
}
List<DataTable> tables =
ds.Tables
.Cast<DataTable>()
.SelectMany
(t => t.Columns.Cast<DataColumn>()
.Where(c => columnNames.Contains(c.ColumnName))
)
.Select(c => c.Table).Distinct().ToList();
if (null != tables)
{
if (tables.Count <= 1)
{
returnItem = tables.FirstOrDefault();
}
else
{
throw new IndexOutOfRangeException(string.Format("FindDataTable found more than one matching Table based on the input column names. ({0})", String.Join(", ", columnNames.ToArray())));
}
}
return returnItem;
}
I tried this too, to no avail (it always has 0 matches):
List<DataTable> tables =
ds.Tables
.Cast<DataTable>()
.Where
(t => t.Columns.Cast<DataColumn>()
.All(c => columnNames.Contains(c.ColumnName))
)
.Distinct().ToList();
It sounds to me like you're trying to check whether all of the columnNames passed to the method are contained in the table's collection of column names. If that's the case, this should do the job:
List<DataTable> tables =
ds.Tables
.Cast<DataTable>()
.Where(dt => !columnNames.Except(dt.Columns.Select(c => c.Name)).Any())
.ToList();
(Below is an append by the asker of the question)
Well, I had to tweak it to make it compile, but you got me there..
Thanks.
Final Answer:
List<DataTable> tables =
ds.Tables.Cast<DataTable>()
.Where
(dt => !columnNames.Except(dt.Columns.Cast<DataColumn>()
.Select(c => c.ColumnName))
.Any()
)
.ToList();
Final Answer (which is not case sensitive):
List<DataTable> tables =
ds.Tables.Cast<DataTable>()
.Where
(dt => !columnNames.Except(dt.Columns.Cast<DataColumn>()
.Select(c => c.ColumnName), StringComparer.OrdinalIgnoreCase)
.Any()
)
.ToList();
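For comparison, the same "all of these columns must exist" check can be written with `DataColumnCollection.Contains(string)`, which avoids building the list of column names for each table. The sketch below wraps it in a helper that also keeps the "more than one match" guard from the question. The class and method names are hypothetical, and the name lookup is, to my understanding, not case-sensitive, so verify that against your data before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataSetSearch
{
    // Returns the single table that contains every requested column,
    // null when no table matches, and throws when several tables match.
    public static DataTable FindSingleTable(DataSet ds, IEnumerable<string> columnNames)
    {
        if (ds == null) throw new ArgumentNullException(nameof(ds));
        if (columnNames == null) throw new ArgumentNullException(nameof(columnNames));

        var names = columnNames.ToList();
        var matches = ds.Tables.Cast<DataTable>()
            .Where(t => names.All(n => t.Columns.Contains(n)))
            .ToList();

        if (matches.Count > 1)
        {
            throw new InvalidOperationException(
                "More than one table matches the column names: " + string.Join(", ", names));
        }
        return matches.FirstOrDefault();
    }
}
```

Called with { "Table1Column11", "Name" } on the question's sample DataSet, this should return TableOne, while { "Name" } alone matches all three tables and therefore throws.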

How to select data in rows using LINQ?

How can I use LINQ to select all the Company Name and Company ID from all the rows? I need something like this pseudo-code:
var typedQry = from b in allData.AsEnumerable()
where b.GetHeader("xxx") == "08/10/09 to 08/26/09"
select CompanyName, CompanyID, ...
The code below selects only one Company Name. Instead, I want Company Name from all the rows:
var typedQry3 = from b in allData.AsEnumerable()
select new { compname0 = b._rows[0][5]};
The data in _rows are Company Name (e.g., allData[0]._rows[0][5], allData[0]._rows[1][5],....), Company ID, and so forth.
However, Company Name, Company ID, and etc. are not defined in the DataProperty class. Their values are inserted into _rows from data files.
Any help is appreciated. Below is some code to help you understand my question.
List<DataProperty> allData = new List<DataProperty>();
The DataProperty class consists of
private readonly Dictionary<string, string> _headers = new Dictionary<string, string>();
private readonly List<string[]> _rows = new List<string[]>();
and these methods (among others):
public string[] GetDataRow(int rowNumber){return _rows[rowNumber];}
public void AddDataRow(string[] row){_rows.Add(row);}
According to your comment, if you need the amount for each row of a given company, you can try this:
var RowList1 = allData.SelectMany(u => u._rows.Select(t => new
{
CompanyName = t[5],
Amount = Convert.ToInt64(t[1]) + Convert.ToInt64(t[2])
}))
.Where(u => u.CompanyName == "XXX")
.OrderBy(u => u.CompanyName)
.ToList();
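And if you need the total of those amounts, you can try this (drop the Where clause to total across all companies):
var SumAmount = allData.SelectMany(u => u._rows.Select(t => new
{
CompanyName = t[5],
Amount = Convert.ToInt64(t[1]) + Convert.ToInt64(t[2])
}))
.Where(u => u.CompanyName == "XXX")
// Sum returns 0 for an empty sequence, so DefaultIfEmpty is not needed
// (it would yield a null element and make the selector throw).
.Sum(u => u.Amount);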
You can write your own customized queries based on these.
You can use this to get all company names:
var AllCompanyNames = allData.SelectMany(u => u._rows.Select(t => t[5])).ToList();
And this, to get more properties:
var Rows = allData.SelectMany(u =>
u._rows.Select(t => new
{
CompanyName = t[5],
Other1 = t[1],
Other2 = t[2]
}))
.ToList();
And this, if you need to filter on a condition:
var FilteredRows = allData.SelectMany(u =>
u._rows.Select(t => new
{
CompanyName = t[5],
Other1 = t[1],
Other2 = t[2]
}))
.Where(u => u.CompanyName == "XXX")
.ToList();
First you can project the rows, and then iterate through them.
This example may help you:
var rows = (from DataRow dRow in dTable.Rows
select new {col1=dRow["dataColumn1"],col2=dRow["dataColumn2"]});
foreach (var row in rows)
{
var value1=row.col1.ToString();
var value2=row.col2.ToString();
}
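The SelectMany queries above read `_rows` directly, which only compiles from inside the DataProperty class because the field is private. A minimal self-contained sketch of the same idea follows, assuming you are able to add a read-only accessor to the class. The `Rows` property and the `CompanyQueries` helper are hypothetical additions, and index 5 for the company name is taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DataProperty
{
    private readonly List<string[]> _rows = new List<string[]>();

    // Hypothetical read-only accessor; the question only shows the private field.
    public IReadOnlyList<string[]> Rows => _rows;

    public void AddDataRow(string[] row) { _rows.Add(row); }
}

public static class CompanyQueries
{
    // Position of the company name inside each row, as stated in the question.
    private const int CompanyNameIndex = 5;

    // Flattens the rows of every DataProperty and picks the company name of each row.
    public static List<string> AllCompanyNames(IEnumerable<DataProperty> allData)
    {
        return allData.SelectMany(d => d.Rows)
                      .Select(row => row[CompanyNameIndex])
                      .ToList();
    }
}
```

The result keeps one entry per row, in file order, so a company that appears in several rows appears several times. Appending `.Distinct()` before `.ToList()` would collapse the duplicates.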

Reversing typeof to use Linq Field<T>

I want to use Linq to dynamically select DataTable columns by ColumnName but to use Field<> I must explicitly cast them or box everything to an object, which is not efficient.
I tried:
string[] colsNames = new[] { "Colum1", "Colum2" };
DataTable dt = StoredProcedure().Tables[0];
var cols = dt.Columns.Cast<DataColumn>().Where(c => colsNames.Contains(c.ColumnName));
if (cols.Any())
{
dt.AsEnumerable().Select(r => string.Join(":", cols.Select(c => r.Field<c.DataType>(c.ColumnName))))
}
but this gives me the error "The type or namespace name 'c' could not be found".
How do I convert typeof(decimal) to Field<decimal>("Column1") for example?
Try this:
DataTable dt = new DataTable();
dt.Columns.Add("id", Type.GetType("System.Int32"));
dt.Columns.Add("Colum1", Type.GetType("System.Int32"));
dt.Columns.Add("Colum2", Type.GetType("System.String"));
dt.Columns.Add("Colum3");
string[] colsNames = new[] { "Colum1", "Colum2" };
var colTypes = dt.Columns.Cast<DataColumn>()
.Where(c => colsNames.Contains(c.ColumnName))
.Select(c => new
{
c.ColumnName,
c.DataType
})
.ToDictionary(key => key.ColumnName, val => val.DataType);
var query = dt.AsEnumerable()
.Where(row => (int)row["id"]==5)
.Select(row => new
{
Colum1 = Convert.ChangeType(row[colsNames[0]], colTypes[colsNames[0]]),
Colum2 = Convert.ChangeType(row[colsNames[1]], colTypes[colsNames[1]])
});
Here is another variant, but it is not very interesting:
//define class
public class myClass
{
public int Column1;
public string Column2;
}
// then
var query = dt.AsEnumerable()
.Select(row => new myClass
{
Column1 = Convert.ToInt32(row[colsNames[0]]),
Column2 = row[colsNames[1]].ToString()
});
There is a third variant: you can create a view or stored procedure in the database and add it to the data context.
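For the specific goal in the question (joining the selected columns of each row into one string), a generic type argument may not be needed at all. The `DataRow` indexer already returns each value as an object of the column's `DataType`, and to my knowledge `Field<T>` reads that same boxed value before casting it, so it does not avoid the boxing either. A minimal sketch under that assumption follows. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Linq;

public static class DynamicColumns
{
    // Joins the values of the selected columns of every row with ':'.
    // Values appear in the table's column order, not in the order of columnNames.
    public static List<string> JoinColumns(DataTable dt, IEnumerable<string> columnNames)
    {
        var wanted = new HashSet<string>(columnNames);
        var cols = dt.Columns.Cast<DataColumn>()
                     .Where(c => wanted.Contains(c.ColumnName))
                     .ToList();

        return dt.AsEnumerable()
                 .Select(r => string.Join(":",
                     cols.Select(c => Convert.ToString(r[c], CultureInfo.InvariantCulture))))
                 .ToList();
    }
}
```

`Convert.ToString` with the invariant culture keeps numeric formatting stable across machines; a `DBNull` cell becomes an empty string.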
