I have a situation where I need the string columns of my DataRows to be HTML-encoded. Can we update all the cells of a DataRow in one go, using LINQ or some other way?
Basically, I want to avoid loops.
I have a DataTable, oDT.
The DataTable has the following columns: id_season, season, modifiedon (date).
To save this DataTable I have a function:
void SaveTable(DataTable oDT)
{
    // Here I have to update the modifiedon column to DateTime.Now
    foreach (DataRow dr in oDT.Rows)
    {
        dr["modifiedon"] = DateTime.Now;
    }
    // I need to avoid this loop, as the DataTable can have 35000+ records
}
You must update every element, so you are looking at O(n) time regardless.
On my system, updating 35,000 elements takes between 0.01 and 0.03 seconds, 350,000 elements takes 0.1 to 0.3 seconds, and so on. You can get a small performance boost by reading DateTime.Now once, outside the loop, like this:
var now = DateTime.Now;
foreach(DataRow dr in oDT.Rows)
{
dr["ModifiedOn"] = now;
}
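To check these timings on your own machine, a minimal sketch along the following lines may help. The class name ModifiedOnTiming and its two helper methods are made up for illustration; the column names follow the question, and the numbers you get will depend on your hardware.

```csharp
using System;
using System.Data;
using System.Diagnostics;

// Hypothetical helper class, not part of the original answer.
public static class ModifiedOnTiming
{
    // Builds a throwaway table with the three columns from the question.
    public static DataTable BuildTable(int rowCount)
    {
        var table = new DataTable("Seasons");
        table.Columns.Add("id_season", typeof(int));
        table.Columns.Add("season", typeof(string));
        table.Columns.Add("modifiedon", typeof(DateTime));
        for (int i = 0; i < rowCount; i++)
        {
            table.Rows.Add(i, "Season " + i, DateTime.MinValue);
        }
        return table;
    }

    // Stamps every row with the same timestamp and returns the elapsed time.
    public static TimeSpan StampAll(DataTable table, DateTime now)
    {
        var watch = Stopwatch.StartNew();
        foreach (DataRow dr in table.Rows)
        {
            dr["modifiedon"] = now;
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

For example, Console.WriteLine(ModifiedOnTiming.StampAll(ModifiedOnTiming.BuildTable(35000), DateTime.Now)); prints the time taken for 35,000 rows.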
Related
I am writing a C# script that will be run inside another project that makes use of the C# compiler. This engine does not have System.Data.DataSetExtensions referenced nor will it be allowed to do so.
That being said, I still need to take a DataTable of 100,000 rows and break it up into smaller DataTables of 10,000 rows max. Normally I would use DataSetExtensions with something like:
var filteredDataTable = myDataTable.AsEnumerable().Take(10000).CopyToDataTable();
What would be an efficient way to go about this without using DataSetExtensions? Should I be resigned to using a foreach loop and copying over 10,000 rows into new DataTables?
Should I be resigned to using a foreach loop and copying over 10,000
rows into new DataTables?
Yes. You may also consider writing your own extension method to slice the table and reuse it wherever required, something like the one below (an extension method must be static and declared in a static class):
public static DataTable SliceTable(this DataTable dt, int rowsCount, int skipRows = 0)
{
DataTable dtResult = dt.Clone();
for (int i = skipRows; i < dt.Rows.Count && rowsCount > 0; i++)
{
dtResult.ImportRow(dt.Rows[i]);
rowsCount --;
}
return dtResult;
}
Usage:
DataTable myData; // original table
var slice1 = myData.SliceTable(1000); // get slice of first 1000 rows
var slice2 = myData.SliceTable(1000,1000); // get rows from 1001 to 2000
var slice3 = myData.SliceTable(1000,2000); // get rows from 2001 to 3000
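To break the whole table into pieces in a single pass, the same Clone/ImportRow idea can be wrapped in a loop. The sketch below is only one way to do it; DataTableChunker and SplitIntoChunks are hypothetical names, and the code deliberately avoids System.Data.DataSetExtensions.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, shown only to illustrate the approach.
public static class DataTableChunker
{
    // Splits source into tables of at most chunkSize rows each.
    public static List<DataTable> SplitIntoChunks(DataTable source, int chunkSize)
    {
        if (chunkSize <= 0)
        {
            throw new ArgumentOutOfRangeException("chunkSize");
        }

        var chunks = new List<DataTable>();
        DataTable current = null;
        foreach (DataRow row in source.Rows)
        {
            // Start a new chunk when there is none yet or the current one is full.
            if (current == null || current.Rows.Count == chunkSize)
            {
                current = source.Clone(); // copies the schema, not the data
                chunks.Add(current);
            }
            current.ImportRow(row);
        }
        return chunks;
    }
}
```

With a 100,000-row table and a chunkSize of 10000 this should yield ten tables. An empty source table yields an empty list.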
I have a DataTable with at least 36,000 columns and at least 20,000 rows. I have a hardcoded list of column names which I need to find in the DataTable, and I need to perform a calculation on the values of each of those columns,
e.g. the sum of all rows for a particular column.
I was thinking of using a simple foreach to find each column name and the DataTable.Compute method to perform the calculation.
Is there a better way of achieving this? Any help is greatly appreciated.
Thank you.
I am using .Net 4.6 and VS2015
You can compare the performance of DataTable.Compute against a foreach loop.
// Datatable.Compute sample
DataTable table;
table = dataSet.Tables["Orders"];
// Declare an object variable.
object sumObject;
sumObject = table.Compute("Sum(Total)", "EmpID = 5");
Below is a sample using foreach:
// Foreach through DataTable
double sum = 0;
DataTable dt = ds.Tables["Orders"];
foreach (DataRow dr in dt.Rows)
{
sum += System.Convert.ToDouble(dr["Total"]);
}
For better performance, consider using parallel methods.
This link has a sample showing how to use them, and they are worth using wherever possible.
The example there shows how much the speed can improve.
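For the hardcoded list of column names from the question, the Compute approach might be sketched as follows. ColumnSums and SumColumns are hypothetical names. The sketch assumes that the listed columns are numeric and that no column name contains a closing square bracket, which would need escaping in the expression.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, not taken from the original answer.
public static class ColumnSums
{
    // Returns the sum of each listed column that exists in the table.
    public static Dictionary<string, double> SumColumns(DataTable table, IEnumerable<string> columnNames)
    {
        var sums = new Dictionary<string, double>();
        foreach (string name in columnNames)
        {
            // Skip names that are not present instead of throwing.
            if (!table.Columns.Contains(name))
            {
                continue;
            }

            // Brackets allow column names with spaces; an empty filter means all rows.
            object result = table.Compute("Sum([" + name + "])", string.Empty);

            // Compute returns DBNull when there is nothing to aggregate.
            sums[name] = result == DBNull.Value ? 0.0 : Convert.ToDouble(result);
        }
        return sums;
    }
}
```

Looking columns up with Columns.Contains avoids scanning all 36,000 columns for each name. Whether Compute or a hand-written loop is faster for a table of this shape is something to measure rather than assume.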
Is it possible to change order of rows in DataTable so for example the one with current index of 5 moves to place with index of 3, etc.?
I have this legacy, messy code where a dropdown menu gets its values from a DataTable, which gets its values from a database. It is impossible to make changes in the database, since it has too many columns and entries. My original thought was to add a new column in the database and order by its values, but this is going to be hard.
So, since this is only a matter of presentation to the user, I was thinking of just switching the order of the rows in that DataTable. Does someone know the best way to do this in C#?
This is my current code:
DataTable result = flokkurDao.GetMCCHABAKflokka("MSCODE");
foreach (DataRow row in result.Rows)
{
m_cboReasonCode.Properties.Items.Add(row["FLOKKUR"].ToString().Trim() + " - " + row["SKYRING"]);
}
For example I want to push row 2011 - Credit previously issued to the top of the DataTable.
SOLUTION:
For those who might have problems ordering rows in a DataTable while working with obsolete technology that doesn't support LINQ, this might help:
DataRow firstSelectedRow = result.Rows[6];
DataRow firstNewRow = result.NewRow();
firstNewRow.ItemArray = firstSelectedRow.ItemArray; // copy data
result.Rows.Remove(firstSelectedRow);
result.Rows.InsertAt(firstNewRow, 0);
You have to clone the row, remove it, and insert the clone at the new index. This code moves the row at index 6 to the first position in the DataTable.
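The clone, remove and insert steps can be wrapped in a small helper so the indices are not hardcoded. This is a sketch only; DataTableRowMover and MoveRow are hypothetical names. Note that the re-inserted row is a new DataRow, so its RowState becomes Added.

```csharp
using System.Data;

// Hypothetical helper, shown only to illustrate the steps above.
public static class DataTableRowMover
{
    // Moves the row at fromIndex so that it ends up at toIndex.
    public static void MoveRow(DataTable table, int fromIndex, int toIndex)
    {
        DataRow source = table.Rows[fromIndex];

        // A DataRow cannot be re-inserted after removal, so copy its values first.
        DataRow copy = table.NewRow();
        copy.ItemArray = source.ItemArray;

        table.Rows.Remove(source);
        table.Rows.InsertAt(copy, toIndex);
    }
}
```

Calling DataTableRowMover.MoveRow(result, 6, 0) would then do the same as the snippet above.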
If you really want randomness you could use Guid.NewGuid in LINQ's OrderBy:
DataTable result = flokkurDao.GetMCCHABAKflokka("MSCODE");
var randomOrder = result.AsEnumerable().OrderBy(r => Guid.NewGuid());
foreach (DataRow row in randomOrder)
{
// ...
}
If you actually don't want randomness but you want specific values at the top, you can use:
var orderFlokkur2011 = result.AsEnumerable()
.OrderBy(r => r.Field<int>("FLOKKUR") == 2011 ? 0 : 1);
You can use LINQ to order the rows. ColumnName and the other column names below are placeholders, and AsEnumerable() and Field<T>() require System.Data.DataSetExtensions:
DataTable result = flokkurDao.GetMCCHABAKflokka("MSCODE");
foreach (DataRow row in result.AsEnumerable().OrderBy(x => x.Field<string>("ColumnName")))
{
    m_cboReasonCode.Properties.Items.Add(row["FLOKKUR"].ToString().Trim() + " - " + row["SKYRING"]);
}
To order by multiple columns:
result.AsEnumerable().OrderBy(x => x.Field<string>("ColumnName")).ThenBy(x => x.Field<string>("OtherColumnName")).ThenBy(x => x.Field<string>("YetAnotherOne"))
To order by a specific value:
result.AsEnumerable().OrderBy(x => (x.Field<int>("ColumnName") == 2001 || x.Field<int>("ColumnName") == 2002) ? 0 : 1).ThenBy(x => x.Field<int>("ColumnName"))
You can use the above code to "pin" certain rows to the top. If you want more granular control than that, you can use a switch, for example, to map specific values to sort keys of 1, 2, 3, 4 and use a higher number for the rest.
You cannot change the order of the rows or delete a row inside a foreach loop. Instead, create a new DataTable and add the rows to it in random order, keeping track of the rows already inserted so that none is duplicated.
Use a DataView:
DataTable result = flokkurDao.GetMCCHABAKflokka("MSCODE");
DataView view = new DataView(result);
view.Sort = "FLOKKUR";
view.RowFilter = "... you can even apply an in memory filter here ...";
foreach (DataRowView row in view)
{
    ....
}
Every DataTable comes with a built-in view, DefaultView, which you can use. This way you can apply the default sorting and filtering in your data layer.
public DataTable GetMCCHABAKflokka(string tableName, string sort, string filter)
{
var result = GetMCCHABAKflokka(tableName);
result.DefaultView.Sort = sort;
result.DefaultView.RowFilter = filter;
return result;
}
// use like this
foreach (DataRowView row in result.DefaultView)
I wrote the following code to add the rows of an external DataTable to a table in an MS Word document. It works fine, but it takes a lot of time when the number of rows is more than 100. With more than 500 rows it fills the Word table so slowly that it cannot complete the task.
I tried hiding the document and disabling screen updating for it, but that did not fix the slow performance.
//Get the required external data to the DT data table
DataTable DT = XDt.GetData();
Word.Table TB;
int X = 1;
foreach (DataRow Rw in DT.Rows)
{
Word.Row Rn = TB.Rows.Add(TB.Rows[X + 1]);
for(int i=0;i<=DT.Columns.Count-1;i++)
{
Rn.Cells[i+1].Range.Text = Rw[i].ToString();
}
X++;
}
So is there a way to make this process go faster ?
The most efficient way to add a table to Word is to first concatenate the data into a delimited text string, where "\n" must be the symbol for end-of-row (record separator). The end-of-cell (field separator) can be any character you like that does not occur in the content that makes up the table.
Assign this string to a Range object, then use the ConvertToTable() method to create the table.
You're retrieving the last row of the current table for the BeforeRow parameter of TB.Rows.Add. This is significantly slower than simply adding the row. You should replace this:
Word.Row Rn = TB.Rows.Add(TB.Rows[X + 1]);
With this:
Word.Row Rn = TB.Rows.Add();
Parallelization, as suggested in the comments, might help slightly, but I'm afraid it won't do much good, since the code that adds the table runs on the main thread, as mentioned in this link.
EDIT:
If performance is still an issue, I'd look into creating the Word table independently of the Word object model by using OpenXML. It's orders of magnitude faster.
The ConvertToTable method is orders of magnitude faster than adding rows and cells one at a time.
// Assumes reader is an open data reader and application is the Word Application object.
var items = new List<string>();
while (reader.Read())
{
    var values = new object[reader.FieldCount];
    reader.GetValues(values);
    // One tab-separated line per record.
    items.Add(String.Join("\t", values));
}
string data = String.Join("\n", items.ToArray());
var tempDocument = application.Documents.Add();
var range = tempDocument.Range();
range.Text = data;
var tempTable = range.ConvertToTable(Separator: Microsoft.Office.Interop.Word.WdTableFieldSeparator.wdSeparateByTabs,
    NumColumns: reader.FieldCount,
    NumRows: items.Count, // the original passed an undeclared "rows"; the record count is assumed here
    DefaultTableBehavior: WdDefaultTableBehavior.wdWord9TableBehavior,
    AutoFitBehavior: WdAutoFitBehavior.wdAutoFitWindow);
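Since the question starts from a DataTable rather than a data reader, the delimited string could also be built directly from the table. The following is a minimal sketch; WordTableText and ToDelimitedText are hypothetical names, and it assumes that no cell value contains a tab or a newline.

```csharp
using System.Data;
using System.Text;

// Hypothetical helper, not part of the original answers.
public static class WordTableText
{
    // Joins cells with tabs and rows with "\n", ready to assign to Range.Text.
    public static string ToDelimitedText(DataTable table)
    {
        var builder = new StringBuilder();
        for (int r = 0; r < table.Rows.Count; r++)
        {
            if (r > 0)
            {
                builder.Append('\n'); // record separator
            }

            object[] values = table.Rows[r].ItemArray;
            for (int c = 0; c < values.Length; c++)
            {
                if (c > 0)
                {
                    builder.Append('\t'); // field separator
                }
                builder.Append(values[c]); // DBNull becomes an empty string
            }
        }
        return builder.ToString();
    }
}
```

The result would be assigned to range.Text before calling ConvertToTable, with NumRows set to the table's row count and NumColumns to its column count.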
I now have a problem with a very old system of ours. (It is more than 7 years old, and I have no budget or resources to make bigger changes to the structure, so the decision was to improve the old logic as much as we can.)
We have a grid control that we wrote ourselves. Basically it is like a normal ASP.NET grid: you can add, change, and delete elements.
The problem is that the grid has a BindGrid() method where, for further use, the rows of the data source table are copied into a DataRow[]. I need to keep the DataRow[], but I would like to implement the best way to copy the rows from the table into the array.
The current solution:
DataRow[] rows = DataSource.Select("1=1", SortOrderString);
In my experience so far, if I need a specific sort order, this could be the best way. (I'm also interested in whether there is a quicker way.)
BUT there are some simplified pages where the sort order is not needed.
So I could write two methods, one with the sort order and one without.
The real problem is the second one:
DataRow[] rows = DataSource.Select("1=1");
It is very slow. I ran some tests, and it is about 15 times slower than the CopyTo() solution:
DataRow[] rows = new DataRow[DataSource.Rows.Count];
DataSource.Rows.CopyTo(rows,0);
I would like to use the faster way, BUT when I ran the tests some old functions simply crashed. It seems there is another difference, which I only noticed now:
Select() returns the rows as if the row changes had been accepted.
So if I delete a row and do not call AcceptChanges() (I can't do that, unfortunately), then with Select("1=1") the row is still in the DataSource but not in the DataRow[].
With a simple .CopyTo() the row is there, and that is bad news for me.
My questions are:
1) Is Select("1=1") the best way to get the rows with the row changes respected? (I have my doubts, because that part is about 6 years old.)
2) And if it is not, is there a faster way to get the same result as .Select("1=1")?
UPDATE:
Here is the very basic test app that I used for speed testing:
DataTable dt = new DataTable("Test");
dt.Columns.Add("Id", typeof (int));
dt.Columns.Add("Name", typeof(string));
for (int i = 0; i < 10000; i++)
{
DataRow row = dt.NewRow();
row["ID"] = i;
row["Name"] = "Name" + i;
dt.Rows.Add(row);
}
dt.AcceptChanges();
DateTime start = DateTime.Now;
DataRow[] rows = dt.Select();
/*DataRow[] rows = new DataRow[dt.Rows.Count];
dt.Rows.CopyTo(rows,0);*/
Console.WriteLine(DateTime.Now - start);
You can call Select without an argument: DataRow[] allRows = DataSource.Select(); That is certainly more efficient than "1=1", which applies a pointless RowFilter.
Another way is to use LINQ to DataSet to order and filter the DataTable. That isn't more efficient, but it is more readable and maintainable.
I have no example or measurement yet, but it stands to reason that a RowFilter with "1=1" is more expensive than none. Select is implemented this way:
public Select(DataTable table, string filterExpression, string sort, DataViewRowState recordStates)
{
this.table = table;
this.IndexFields = table.ParseSortString(sort);
this.indexDesc = Select.ConvertIndexFieldtoIndexDesc(this.IndexFields);
// following would be omitted if you would use DataSource.Select() without "1=1"
if (filterExpression != null && filterExpression.Length > 0)
{
this.rowFilter = new DataExpression(this.table, filterExpression);
this.expression = this.rowFilter.ExpressionNode;
}
this.recordStates = recordStates;
}
If you also want to select rows whose changes have not been accepted yet, you can use this overload of Select:
DataRow[] allRows = DataSource.Select("", "", DataViewRowState.CurrentRows | DataViewRowState.Deleted);
This selects all rows, including deleted ones, even if AcceptChanges has not been called yet.
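The difference between the approaches can be seen on a small table with one pending delete. The sketch below uses a hypothetical RowStateDemo class. Its last method, which skips deleted rows in a plain loop, is an untested assumption about a possible alternative to Select(), not something from the answer above, and whether it is faster would need measuring.

```csharp
using System.Collections.Generic;
using System.Data;

// Hypothetical demo class for comparing the row-retrieval approaches.
public static class RowStateDemo
{
    // Three accepted rows, one of which is then marked as deleted.
    public static DataTable BuildTableWithOneDeletedRow()
    {
        var dt = new DataTable("Test");
        dt.Columns.Add("Id", typeof(int));
        for (int i = 0; i < 3; i++)
        {
            dt.Rows.Add(i);
        }
        dt.AcceptChanges();
        dt.Rows[0].Delete(); // pending delete, AcceptChanges is not called
        return dt;
    }

    // Select() without arguments returns only the current rows.
    public static DataRow[] ViaSelect(DataTable dt)
    {
        return dt.Select();
    }

    // CopyTo copies every row in the collection, deleted ones included.
    public static DataRow[] ViaCopyTo(DataTable dt)
    {
        var rows = new DataRow[dt.Rows.Count];
        dt.Rows.CopyTo(rows, 0);
        return rows;
    }

    // The overload from the answer: current rows plus deleted rows.
    public static DataRow[] ViaSelectIncludingDeleted(DataTable dt)
    {
        return dt.Select("", "", DataViewRowState.CurrentRows | DataViewRowState.Deleted);
    }

    // Assumption: a plain loop that skips deleted rows.
    public static DataRow[] ViaLoopSkippingDeleted(DataTable dt)
    {
        var rows = new List<DataRow>(dt.Rows.Count);
        foreach (DataRow row in dt.Rows)
        {
            if (row.RowState != DataRowState.Deleted)
            {
                rows.Add(row);
            }
        }
        return rows.ToArray();
    }
}
```

On the demo table, Select() and the loop return two rows, while CopyTo and the overload with DataViewRowState.Deleted return three.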