Distinct Enumerable DataTable based on column? - c#

This is what I'm trying to accomplish: I want to select only the distinct values from all rows in Column[0].
Then I want to get all the values from Column[1] and group them under their Column[0] value.
So basically I have a DataTable like this:
Fruit|Apples
Fruit|Pears
Vegetables|Peas
Vegetables|Carrots
So I want to do a foreach over the distinct values: enumerate Fruit once and pick up Apples and Pears, then enumerate Vegetables once and pick up Peas and Carrots.
I'm doing this to create Accordion panes, where I want to group my results under one header. The code below does that, but it creates two Fruit panes because it does not realize it has already gone through Fruit.
foreach (DataRow dtrow in dtTable.Rows)
{
string idRow = dtrow[0].ToString();
AccordionPane currentPane = new AccordionPane();
currentPane.ID = "AccordionPane" + Guid.NewGuid().ToString();
currentPane.HeaderContainer.Controls.Add(new LiteralControl(dtrow[0].ToString()));
foreach(DataRow dtRow2 in dtTable.Rows)
{
if(dtRow2[0].ToString() == idRow)
{
currentPane.ContentContainer.Controls.Add(new LiteralControl(dtRow2[1].ToString()));
}
}
NavigateAccordion.Panes.Add(currentPane);
}

You can accomplish this easily with LINQ:
var groupedRows = from row in dtTable.AsEnumerable()
group row by row[0].ToString() into grouped
select grouped;
foreach (var group in groupedRows)
{
// One pane per distinct value in column 0.
AccordionPane currentPane = new AccordionPane();
currentPane.ID = "AccordionPane" + Guid.NewGuid().ToString();
currentPane.HeaderContainer.Controls.Add(new LiteralControl(group.Key));
foreach (var row in group)
{
currentPane.ContentContainer.Controls.Add(new LiteralControl(row[1].ToString()));
}
NavigateAccordion.Panes.Add(currentPane);
}
Or, if you want to stick with your current non-LINQ approach, skip rows whose header already has a pane:
foreach (DataRow dtrow in dtTable.Rows)
{
bool skip = false;
foreach (AccordionPane pane in NavigateAccordion.Panes)
{
if (((LiteralControl)pane.HeaderContainer.Controls[0]).Text == dtrow[0].ToString())
{
skip = true;
break;
}
}
if (!skip)
{
string idRow = dtrow[0].ToString();
AccordionPane currentPane = new AccordionPane();
currentPane.ID = "AccordionPane" + Guid.NewGuid().ToString();
currentPane.HeaderContainer.Controls.Add(new LiteralControl(dtrow[0].ToString()));
foreach(DataRow dtRow2 in dtTable.Rows)
{
if(dtRow2[0].ToString() == idRow)
{
currentPane.ContentContainer.Controls.Add(new LiteralControl(dtRow2[1].ToString()));
}
}
NavigateAccordion.Panes.Add(currentPane);
}
}
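The grouping step on its own, without any Accordion controls, can be sketched as follows. This is only an illustration: `GroupingSketch` and `GroupByFirstColumn` are hypothetical names, and the sketch assumes the table has at least two columns.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class GroupingSketch
{
    // Hypothetical helper: maps each distinct value of column 0 to the
    // column 1 values that share it, in the order the rows appear.
    public static Dictionary<string, List<string>> GroupByFirstColumn(DataTable table)
    {
        return table.AsEnumerable()
            .GroupBy(row => row[0].ToString())
            .ToDictionary(
                g => g.Key,
                g => g.Select(row => row[1].ToString()).ToList());
    }
}
```

Each dictionary entry then corresponds to one pane: the key is the header text and the list holds the content items.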

Related

Remove all columns from datatable except for 25

I have 500 Columns in my DataTable and I want to remove all of them except for 25 columns.
Is there any way to do this faster to save time and lines of code?
This is what I already tried:
private static void DeleteUselessColumns()
{
//This is example data!
List<DataColumn> dataColumnsToDelete = new List<DataColumn>();
DataTable bigData = new DataTable();
bigData.Columns.Add("Harry");
bigData.Columns.Add("Konstantin");
bigData.Columns.Add("George");
bigData.Columns.Add("Gabriel");
bigData.Columns.Add("Oscar");
bigData.Columns.Add("Muhammad");
bigData.Columns.Add("Emily");
bigData.Columns.Add("Olivia");
bigData.Columns.Add("Isla");
List<string> columnsToKeep = new List<string>();
columnsToKeep.Add("Isla");
columnsToKeep.Add("Oscar");
columnsToKeep.Add("Konstantin");
columnsToKeep.Add("Gabriel");
//This is the code i want to optimize------
foreach (DataColumn column in bigData.Columns)
{
bool keepColumn = false;
foreach (string s in columnsToKeep)
{
if (column.ColumnName.Equals(s))
{
keepColumn = true;
}
}
if (!keepColumn)
{
dataColumnsToDelete.Add(column);
}
}
foreach(DataColumn dataColumn in dataColumnsToDelete)
{
bigData.Columns.Remove(dataColumn);
}
//------------------------
}
var columnsToKeep = new List<string>() { "Isla", "Oscar", "Konstantin", "Gabriel"};
var toRemove = new List<DataColumn>();
foreach(DataColumn column in bigData.Columns)
{
if (!columnsToKeep.Any(name => column.ColumnName == name ))
{
toRemove.Add(column);
}
}
toRemove.ForEach(col => bigData.Columns.Remove(col));
Test1...test9: the same code could be made a loop. There is no need to add the columns to delete to a list; just delete them in the first loop (using an index-based loop, because removing from a collection while a foreach enumerates it typically throws). As for performance, I'm not sure how to improve it.
You could try a DataView that selects the desired columns and then copy it to a table. You will need to experiment.
If they have different names, create an array of strings:
var columns = new string[] { "Harry", "Konstantin","John"};
var columnsToKeep = new string[] { "John", "Konstantin"};
var columnsToDelete = from item in columns
where !columnsToKeep.Contains(item)
select item;
Or, using lambda syntax:
var columnsToDelete = columns
.Where(i => !columnsToKeep.Contains(i))
.ToList();
Result (columnsToDelete):
Harry
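One way to combine these suggestions is to remove the unwanted columns in place, without an intermediate list. The sketch below is a minimal illustration, not a benchmarked solution: `ColumnPruningSketch` and `KeepOnly` are hypothetical names, and name matching is case-sensitive, as in the question's code.

```csharp
using System.Collections.Generic;
using System.Data;

public static class ColumnPruningSketch
{
    // Hypothetical helper: removes every column whose name is not in columnsToKeep.
    public static void KeepOnly(DataTable table, IEnumerable<string> columnsToKeep)
    {
        // A HashSet gives constant-time name checks instead of scanning a list.
        var keep = new HashSet<string>(columnsToKeep);

        // Walk backwards so that removing a column does not shift
        // the indexes of the columns still to be visited.
        for (int i = table.Columns.Count - 1; i >= 0; i--)
        {
            if (!keep.Contains(table.Columns[i].ColumnName))
            {
                table.Columns.RemoveAt(i);
            }
        }
    }
}
```

With the question's data, `ColumnPruningSketch.KeepOnly(bigData, columnsToKeep)` would replace both loops in the marked section.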

How can I improve the performance of the following code?

This code works but takes too much time. Every data table contains thousands of rows, and each time I need to filter data from the other data tables with respect to a column.
for (int i = 0; i < dsResult.Tables[0].Rows.Count; i++)
{
DataTable dtFiltered = dtWorkExp.Clone();
foreach (DataRow drr in dtWorkExp.Rows)
{
if (drr["UserId"].ToString() == dsResult.Tables[0].Rows[i]["Registration NO."].ToString())
{
dtFiltered.ImportRow(drr);
}
}
DataTable dtFilteredAward= dtAwards.Clone();
foreach (DataRow drr in dtAwards.Rows)
{
if (drr["UserId"].ToString() == dsResult.Tables[0].Rows[i]["Registration NO."].ToString())
{
dtFilteredAward.ImportRow(drr);
}
}
DataTable dtFilteredOtherQual = dtOtherQual.Clone();
foreach (DataRow drr in dtOtherQual.Rows)
{
if (drr["UserId"].ToString() == dsResult.Tables[0].Rows[i]["Registration NO."].ToString())
{
dtFilteredOtherQual.ImportRow(drr);
}
}
//Do some operation with filtered Data Tables
}
You can declare lines like this one outside the for loop (clearing the table at the start of each iteration):
DataTable dtFiltered = dtWorkExp.Clone();
And instead of accessing dsResult.Tables[0] each time, you can assign it to a variable and use that.
You can also replace the foreach loops with LINQ.
What I would do:
All rows of the main datatable as enumerable:
var rows = dsResult.Tables[0].AsEnumerable();
Get the column you're going to filter with:
var filter = rows.Select(r => r.Field<string>("Registration NO."));
Create a method that accepts that filter, a table to filter and a field to compare.
public static DataTable Filter<T>(EnumerableRowCollection<T> filter, DataTable table, string fieldName)
{
return table.AsEnumerable().Where(r => filter.Contains(r.Field<T>(fieldName))).CopyToDataTable();
}
Finally use the method to filter all tables:
var dtFiltered = Filter<string>(filter, dtWorkExp, "UserId");
var dtFilteredAward = Filter<string>(filter, dtAwards, "UserId");
var dtFilteredOtherQual = Filter<string>(filter, dtOtherQual, "UserId");
All together, it would be something like this:
public void YourMethod()
{
var rows = dsResult.Tables[0].AsEnumerable();
var filter = rows.Select(r => r.Field<string>("Registration NO."));
var dtFiltered = Filter<string>(filter, dtWorkExp, "UserId");
var dtFilteredAward = Filter<string>(filter, dtAwards, "UserId");
var dtFilteredOtherQual = Filter<string>(filter, dtOtherQual, "UserId");
}
public static DataTable Filter<T>(EnumerableRowCollection<T> filter, DataTable table, string fieldName)
{
return table.AsEnumerable().Where(r => filter.Contains(r.Field<T>(fieldName))).CopyToDataTable();
}
Put the value of the expression in a variable.
var regNo = dsResult.Tables[0].Rows[i]["Registration NO."].ToString();
Put the column index in a variable. Access by index is faster than access by column name.
int index = dtWorkExp.Columns["UserId"].Ordinal;
Result code:
int dtWorkIndex = dtWorkExp.Columns["UserId"].Ordinal;
int dtAwardsIndex = dtAwards.Columns["UserId"].Ordinal;
int dtOtherQualIndex = dtOtherQual.Columns["UserId"].Ordinal;
for (int i = 0; i < dsResult.Tables[0].Rows.Count; i++)
{
var regNo = dsResult.Tables[0].Rows[i]["Registration NO."].ToString();
DataTable dtFiltered = dtWorkExp.Clone();
foreach (DataRow drr in dtWorkExp.Rows)
{
if (drr[dtWorkIndex].ToString() == regNo)
{
dtFiltered.ImportRow(drr);
}
}
...
Of course, the column index can be set as a constant if you know it exactly in advance. Also, if the UserId indexes match in all tables, a single variable is sufficient.
You can also try using the BeginLoadData and EndLoadData methods.
DataTable dtFiltered = dtWorkExp.Clone();
dtFiltered.BeginLoadData();
foreach (DataRow drr in dtWorkExp.Rows)
{
if (drr[dtWorkIndex].ToString() == regNo)
{
dtFiltered.ImportRow(drr);
}
}
dtFiltered.EndLoadData();
But I'm not sure if they make sense together with ImportRow.
Finally, parallelization can help:
for (int i = 0; i < dsResult.Tables[0].Rows.Count; i++)
{
var regNo = ...;
var workTask = Task.Run(() =>
{
DataTable dtFiltered = dtWorkExp.Clone();
foreach (DataRow drr in dtWorkExp.Rows)
{
if (drr[dtWorkIndex].ToString() == regNo)
{
dtFiltered.ImportRow(drr);
}
}
return dtFiltered;
});
var awardTask = Task.Run(() =>
...
var otherQualTask = Task.Run(() =>
...
//Task.WaitAll(workTask, awardTask, otherQualTask);
await Task.WhenAll(workTask, awardTask, otherQualTask);
//Do some operation with filtered Data Tables
}
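Another option, not taken from the answers above, is to index each detail table by UserId once and then look up the matching rows for every registration number, instead of rescanning the whole table each time. The sketch below assumes the keys are compared as strings, as in the question; `LookupFilterSketch`, `IndexBy` and `ToTable` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class LookupFilterSketch
{
    // Groups the rows of a table by the string form of one column, in a single pass.
    public static ILookup<string, DataRow> IndexBy(DataTable table, string columnName)
    {
        return table.AsEnumerable().ToLookup(row => row[columnName].ToString());
    }

    // Builds a table with the same schema that contains only the given rows.
    public static DataTable ToTable(DataTable schemaSource, IEnumerable<DataRow> rows)
    {
        DataTable result = schemaSource.Clone();
        foreach (DataRow row in rows)
        {
            result.ImportRow(row);
        }
        return result;
    }
}
```

Before the for loop you would build `var workByUser = LookupFilterSketch.IndexBy(dtWorkExp, "UserId");` and, inside it, `LookupFilterSketch.ToTable(dtWorkExp, workByUser[regNo])`. A lookup returns an empty sequence for a key it does not contain, so registration numbers without matches simply produce an empty table.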

C# KeyValue foreach column in row

I am trying to work out how I can do this. I have the code below to loop through each column in a DataRow, but how can I access the key AND the value? I want to put them in a dictionary in the class, but I can't seem to get both; the only thing I can get is the value, by calling:
var columnValue = playerDataRow[column];
Here is the full thing:
using (var mysqlConnection = Sirius.GetServer().GetDatabaseManager().GetConnection())
{
mysqlConnection.SetQuery("SELECT * FROM `users` WHERE `auth_ticket` = @authTicket LIMIT 1");
mysqlConnection.AddParameter("authTicket", authTicket);
var playerDataTable = mysqlConnection.GetTable();
foreach (DataRow playerDataRow in playerDataTable.Rows)
{
foreach (DataColumn column in playerDataTable.Columns)
{
var columnValue = playerDataRow[column];
}
}
}
foreach (DataRow playerDataRow in playerDataTable.Rows)
{
var myDic = new Dictionary<string, object>();
foreach (DataColumn column in playerDataTable.Columns)
{
myDic.Add(column.ColumnName, playerDataRow[column]);
}
}
The column's name (column.ColumnName) will be the key and the value will be columnValue.
It looks like you only want one row of output, perhaps for this specific user based on auth_ticket.
Here is an example of how to get all values for this row into a Dictionary of strings (I'm converting all data to strings just for this example):
var htRowValues = new Dictionary<string,string>();
using (var mysqlConnection = Sirius.GetServer().GetDatabaseManager().GetConnection())
{
mysqlConnection.SetQuery("SELECT * FROM `users` WHERE `auth_ticket` = @authTicket LIMIT 1");
mysqlConnection.AddParameter("authTicket", authTicket);
var playerDataTable = mysqlConnection.GetTable();
foreach (DataRow playerDataRow in playerDataTable.Rows)
{
foreach (DataColumn column in playerDataTable.Columns)
{
var columnValue = playerDataRow[column];
htRowValues[column.ColumnName]=System.Convert.ToString(columnValue);
}
}
}
Now you have all column values in the dictionary for this one row of data.
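The same column-name-to-value mapping can also be written with LINQ. This is a minimal sketch that keeps the values as objects rather than converting them to strings; `RowDictionarySketch` and `RowToDictionary` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class RowDictionarySketch
{
    // Hypothetical helper: column name is the key, the row's cell is the value.
    public static Dictionary<string, object> RowToDictionary(DataRow row)
    {
        return row.Table.Columns
            .Cast<DataColumn>() // DataColumnCollection is non-generic, so cast first
            .ToDictionary(column => column.ColumnName, column => row[column]);
    }
}
```

Inside the outer foreach this would be called as `RowDictionarySketch.RowToDictionary(playerDataRow)`.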

Shorten repetitive code

I feel quite dumb asking this but I have two methods which have almost the same code except the naming convention... I want to shorten this to reduce the use of redundant code.
How do I actually shorten this?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.Text.RegularExpressions;
namespace empTRUST
{
class DBDictionary : DBBase
{
public DBDictionary()
: base("Dictionary", "Word")
{
}
public List<DataRow> AngerPercent(string status)
{
List<DataRow> dataList = new List<DataRow>();
var wordPattern = new Regex(@"\w+");
DataRow[] rows = fbTab.Select("Genre = 'Angry'");
foreach (Match match in wordPattern.Matches(status))
foreach (var item in rows)
if (item["Word"].ToString().ToLower() == match.ToString().ToLower())
{
dataList.Add(item);
}
return dataList;
}
public List<DataRow> CaringPercent(string status)
{
List<DataRow> dataList = new List<DataRow>();
var wordPattern = new Regex(@"\w+");
DataRow[] rows = fbTab.Select("Genre = 'Caring'");
foreach (Match match in wordPattern.Matches(status))
foreach (var item in rows)
if (item["Word"].ToString().ToLower() == match.ToString().ToLower())
{
dataList.Add(item);
}
return dataList;
}
}
}
Genre is the only thing that is different so just move it to the list of method arguments:
public List<DataRow> GenrePercent(string status, string genre)
{
List<DataRow> dataList = new List<DataRow>();
var wordPattern = new Regex(@"\w+");
DataRow[] rows = fbTab.Select(String.Format("Genre = '{0}'", genre.Replace("'", "''")));
foreach (Match match in wordPattern.Matches(status))
foreach (var item in rows)
if (item["Word"].ToString().ToLower() == match.ToString().ToLower())
{
dataList.Add(item);
}
return dataList;
}
You can then pass genre name when calling it:
GenrePercent("Status1", "Angry");
GenrePercent("Status2", "Caring");
public List<DataRow> QualifyPercent(string status, string selectQualifier)
{
List<DataRow> dataList = new List<DataRow>();
var wordPattern = new Regex(@"\w+");
DataRow[] rows = fbTab.Select(selectQualifier);
foreach (Match match in wordPattern.Matches(status))
foreach (var item in rows)
if (item["Word"].ToString().ToLower() == match.ToString().ToLower())
{
dataList.Add(item);
}
return dataList;
}
call it like this:
List<DataRow> angerPercent = QualifyPercent(status,"Genre = 'Angry'");
I believe the code could be made even simpler (this is more of a comment than an answer because it has nothing to do with the original question):
public List<DataRow> QualifyPercent(string status, string selectQualifier)
{
var words = status.Split(' ');
return fbTab.Select(selectQualifier)
.Where(row => words.Any(
w => w.ToLower() == row["Word"].ToString().ToLower()))
.ToList();
}
I love LINQ. This can be faster because Any() stops as soon as a match is found, so on average it scans only part of the word list for each row (the worst case is still O(n)). Note that, unlike the nested loops, it returns each matching row at most once.
You already have one parameter, why not change the bit with 'Caring' to be based on a parameter as well?
public List<DataRow> AngerPercent(string status)
...
DataRow[] rows = fbTab.Select("Genre = 'Angry'");
becomes
public List<DataRow> AngerPercent(string status, string query)
...
DataRow[] rows = fbTab.Select("Genre = '" + query + "'");
public List<DataRow> Percent(string status, DataRow[] rows)
{
List<DataRow> dataList = new List<DataRow>();
var wordPattern = new Regex(@"\w+");
foreach (Match match in wordPattern.Matches(status)) {
foreach (var item in rows) {
if (item["Word"].ToString().ToLower() == match.ToString().ToLower()) {
dataList.Add(item);
}
}
}
return dataList;
}
Call like this:
DataRow[] data = fbTab.Select("Genre = 'Angry'");
// DataRow[] data = fbTab.Select("Genre = 'Caring'");
Percent("Status1", data);
Your method shouldn't need to know which data you pass to it: one method, one function (in this case, processing the data it is given).

Rename column and type in Datatable after data is loaded

I am importing data from a CSV file. Sometimes there are column headers and sometimes not, in which case the customer chooses custom columns (from multiple drop-downs).
My problem is that I am able to change the column types and names, but when I import data rows into the cloned table, it adds rows with no data in them. If I rename the columns to their old values, it works. Say column 0 is named "0": if I change that to the name I need, the rows are not filled with data, but if I change it back to "0" they are. Any idea why?
Here is my code:
#region Manipulate headers
DataTable tblCloned = new DataTable();
tblCloned = tblDataTable.Clone();
int i = 0;
foreach (string item in lstRecord)
{
if (item != "Date")
{
var m = tblDataTable.Columns[i].DataType;
tblCloned.Columns[i].DataType = typeof(System.String);
tblCloned.Columns[i].ColumnName = item;
}
else if(item == "Date")
{
//get the proper date format
//FillDateFormatToColumn(tblCloned);
tblCloned.Columns[i].DataType = typeof(DateTime);
tblCloned.Columns[i].ColumnName = item;
}
i++;
}
tblCloned.AcceptChanges();
foreach (DataRow row in tblDataTable.Rows)
{
tblCloned.ImportRow(row);
}
tblCloned.AcceptChanges();
#endregion
In the second foreach loop, where it imports data into the cloned table, it adds empty rows.
After a couple of tries I came up with this solution, which works:
foreach (DataRow row in tblDataTable.Rows)
{
int x = 0;
DataRow dr = tblCloned.NewRow();
foreach (DataColumn dt in tblCloned.Columns)
{
dr[x] = row[x];
x++;
}
tblCloned.Rows.Add(dr);
//tblCloned.ImportRow(row);
}
But I will accept Scottie's answer because it is less code after all.
Instead of
foreach (DataRow row in tblDataTable.Rows)
{
tblCloned.ImportRow(row);
}
try
foreach (DataRow row in tblDataTable.Rows)
{
tblCloned.LoadDataRow(row.ItemArray, true);
}
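Put together, the clone, rename and positional copy could look like the sketch below. It is an illustration only: `RenameCopySketch` and `CloneWithNames` are hypothetical names, the column types are left unchanged, and `newNames` is assumed to contain no duplicates and no more entries than the table has columns.

```csharp
using System.Data;

public static class RenameCopySketch
{
    // Clones the schema, renames the columns in order, then copies values by position.
    public static DataTable CloneWithNames(DataTable source, string[] newNames)
    {
        DataTable cloned = source.Clone();
        for (int i = 0; i < newNames.Length; i++)
        {
            cloned.Columns[i].ColumnName = newNames[i];
        }

        foreach (DataRow row in source.Rows)
        {
            // ItemArray is positional, so the values land in the renamed columns.
            cloned.LoadDataRow(row.ItemArray, true);
        }
        return cloned;
    }
}
```

Because the values are copied by position rather than by column name, the rename does not affect which cell each value ends up in.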
