How can I merge DataTable objects ignoring the first row?
The datatable I need to merge with the one I've got comes from a parsed CSV file and its first row (sometimes) still contains headers, which are obviously not supposed to end up in the resulting table...
The DataTable.Merge method does not seem to offer such an option. What's the best way to do that? Just removing the first row beforehand? But that alters the "original", and what if I want it to stay as it is? Removing the row and re-inserting it after the merge? That smells like "clever coding". Is there really no better way?
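For what it's worth, a minimal sketch of the copy-first idea (the source, target and firstRowIsHeader names are assumptions for illustration; you'd detect the leftover header row however suits your CSV parsing):
DataTable temp = source.Copy();   // work on a copy so the original table is untouched
if (firstRowIsHeader)             // assumed flag: set it however you detect the header row
{
    temp.Rows.RemoveAt(0);        // drop the header row from the copy only
}
target.Merge(temp);               // merge the trimmed copy; "source" stays as it was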
Editing my previous answer:
I wrote code along similar lines and ended up with all rows of dt1 intact and dt2 containing only rows 2 and 3 from dt1:
var dt1 = new DataTable("Test");
dt1.Columns.Add("id", typeof(int));
dt1.Columns.Add("name", typeof(string));
var dt2 = new DataTable("Test");
dt2.Columns.Add("id", typeof(int));
dt2.Columns.Add("name", typeof(string));
dt1.Rows.Add(1, "Apple"); dt1.Rows.Add(2, "Oranges");
dt1.Rows.Add(3, "Grapes");
dt1.AcceptChanges();
dt1.Rows[0].Delete();
dt2.Merge(dt1);
dt2.AcceptChanges();
dt1.RejectChanges();
Let me know if you find it acceptable.
Vijay
You could go through the rows separately and merge them into the destination table, something like this:
public static class DataTableExtensions
{
public static void MergeRange(this DataTable dest, DataTable table, int startIndex, int length)
{
List<string> matchingColumns = new List<string>();
for (int i = 0; i < table.Columns.Count; i++)
{
// Only copy columns with the same name and type
string columnName = table.Columns[i].ColumnName;
if (dest.Columns.Contains(columnName))
{
if (dest.Columns[columnName].DataType == table.Columns[columnName].DataType)
{
matchingColumns.Add(columnName);
}
}
}
for (int i = 0; i < length; i++)
{
int row = i + startIndex;
DataRow destRow = dest.NewRow();
foreach (string column in matchingColumns)
{
destRow[column] = table.Rows[row][column];
}
dest.Rows.Add(destRow);
}
}
}
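A possible call site, assuming the parsed CSV table's first row holds the headers (mainTable and csvTable are illustrative names, not from the question):
// Merge everything except the first (header) row of the CSV table into the main table.
mainTable.MergeRange(csvTable, 1, csvTable.Rows.Count - 1);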
I have a DataTable where each row holds a debit entry and a credit entry on the same line (see the code below).
I want to write a for loop that splits each row so that the debit and the credit each end up on their own separate row.
Here is my unfinished code:
DataTable dt = new DataTable();
dt.Columns.Add("DEBIT", typeof(string));
dt.Columns.Add("CREDIT", typeof(string));
dt.Columns.Add("AMOUNT", typeof(double));
dt.Rows.Add("Debit1", "Credit1", 10);
dt.Rows.Add("Debit2", "Credit2", 8);
dt.Rows.Add("Debit3", "Credit3", 12);
for (int i = 0; i < dt.Rows.Count; i++)
{
    // Each row currently holds its debit and credit on the same line.
    // This loop should split them into three debit lines and three credit lines,
    // i.e. six rows in total.
}
I would much appreciate it if you could help me with this.
Steps:
Start the loop in reverse (so you can easily insert rows).
Create a new row for the credit and fill it with the relevant data.
Remove the credit data from the original row.
Insert the new row at the position directly after the original row.
Something like this should do the trick:
for (int i = dt.Rows.Count - 1; i >= 0; i--)
{
var row = dt.Rows[i];
if (!string.IsNullOrEmpty(row["CREDIT"].ToString()))
{
var creditRow = dt.NewRow();
creditRow["CREDIT"] = row["CREDIT"];
creditRow["AMOUNT"] = row["AMOUNT"];
row["CREDIT"] = string.Empty;
dt.Rows.InsertAt(creditRow, i + 1);
}
}
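As a rough check (this printout is not part of the original answer), dumping the table after the loop should show six rows, with each credit directly after its debit:
foreach (DataRow r in dt.Rows)
{
    // Expected (roughly): Debit1/10, Credit1/10, Debit2/8, Credit2/8, Debit3/12, Credit3/12
    Console.WriteLine("{0,-8} {1,-8} {2}", r["DEBIT"], r["CREDIT"], r["AMOUNT"]);
}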
I have a DataTable that contains 100 rows, and I want to copy a range of rows (the 31st to the 50th) to another DataTable.
I am using the following logic:
DataTable dtNew = table.Clone();
for(int k=30; k < 50 && k < table.Rows.Count; k++)
{
dtNew.ImportRow(table.Rows[k]);
}
Is there any better approach to do this?
Using LINQ you can do something like:
DataTable dtNew = table.Select().Skip(30).Take(20).CopyToDataTable();
Performance-wise, using LINQ won't do any better; it does, however, make the code more readable.
EDIT: Added a guard so CopyToDataTable is not called on an empty selection:
int numOfRowsToSkip = 30;
int numOfRowsToTake = 20;
if (table.Rows.Count > numOfRowsToSkip)
{
    DataTable dtNew = table.Select().Skip(numOfRowsToSkip).Take(numOfRowsToTake).CopyToDataTable();
}
If it's about readability, then a good idea would be to throw this into an extension method.
Without changing your logic:
public static class Utils
{
public static void CopyRows(this DataTable from, DataTable to, int min, int max)
{
for (int i = min; i < max && i < from.Rows.Count; i++)
to.ImportRow(from.Rows[i]);
}
}
Then you can reuse it without any fancy syntax and, if performance is a concern, know that it does exactly what you need:
DataTable dt1 = new DataTable();
DataTable dt2 = new DataTable();
dt1.CopyRows(dt2, 30, 50);
What's the fastest way, in terms of coding speed, to add rows to a DataTable? I don't need to know either the column names or the data types. Is it possible to add rows without previously specifying the number or names of the DataTable's columns?
DataTable t = new DataTable();
t.Rows.Add(value1,
value2,
value3,
...
valueN
);
DataSet ds = new DataSet();
ds.Tables.Add(t);
If the input comes out of a collection, you could loop it to create the DataColumns with the correct type:
var data = new Object[] { "A", 1, 'B', 2.3 };
DataTable t = new DataTable();
// create all DataColumns
for (int i = 0; i < data.Length; i++)
{
t.Columns.Add(new DataColumn("Column " + i, data[i].GetType()));
}
// add the row to the table
t.Rows.Add(data);
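As a quick sanity check (not part of the original answer), the columns created from the sample array should come out as String, Int32, Char and Double:
// Print each generated column's name and inferred type.
foreach (DataColumn c in t.Columns)
{
    Console.WriteLine("{0}: {1}", c.ColumnName, c.DataType);
}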
To answer your first question: no, you have to have columns defined on the table. You can't just say, "Hey, make a column for all these values." Nothing stopping you from creating columns on the fly, though, as Mr. Schmelter says.
Without knowing the rows or columns this is not possible; you first have to add the columns. Once they exist, you can fill the cells like this:
for(int intCount = 0;intCount < dt.Rows.Count; intCount++)
{
for(int intSubCount = 0; intSubCount < dt.Columns.Count; intSubCount++)
{
dt.Rows[intCount][intSubCount] = yourValue; // or assign to something
}
}
I have a list of DataTables like
List<DataTable> a = new List<DataTable>();
I want to make a deep copy of this list (i.e. copying each DataTable). My code currently looks like
List<DataTable> aCopy = new List<DataTable>();
for(int i = 0; i < a.Count; i++) {
aCopy.Add(a[i].Copy());
}
The performance is absolutely terrible, and I am wondering if there is a known way to speed up such a copy?
Edit: do not worry about why I have this or need to do this, just accept that it is part of a legacy code base that I cannot change
If you have to copy a DataTable, it is essentially an O(N) operation. If the DataTable is very large and causes a large amount of allocation, you may be able to speed up the operation by copying a section at a time, but you are essentially bounded by the working set.
You can try the following - it gave me a performance boost, although your mileage might vary! I've adapted it to your example to demonstrate how to copy a datatable using an alternative mechanism - clone the table, then stream the data in. You could easily put this in an extension method.
List<DataTable> aCopy = new List<DataTable>();
for(int i = 0; i < a.Count; i++) {
    DataTable sourceTable = a[i];
    DataTable copyTable = sourceTable.Clone(); // Clones the structure only (no data)
    copyTable.Load(sourceTable.CreateDataReader()); // Stream the rows into the empty copy
    aCopy.Add(copyTable);
}
This was many times faster (around 6 in my use case) than the following:
DataTable copyTable = sourceTable.Clone();
foreach(DataRow dr in sourceTable.Rows)
{
copyTable.ImportRow(dr);
}
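As mentioned above, the Clone-and-Load approach fits naturally into an extension method; a minimal sketch (the DeepCopy name is just illustrative, and it assumes using System.Data):
public static class DataTableCopyExtensions
{
    // Clone the structure, then stream the rows in via a DataTableReader.
    public static DataTable DeepCopy(this DataTable source)
    {
        DataTable copy = source.Clone();
        copy.Load(source.CreateDataReader());
        return copy;
    }
}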
Also, if we look at what DataTable.Copy is doing using ILSpy:
public DataTable Copy()
{
IntPtr intPtr;
Bid.ScopeEnter(out intPtr, "<ds.DataTable.Copy|API> %d#\n", this.ObjectID);
DataTable result;
try
{
DataTable dataTable = this.Clone();
foreach (DataRow row in this.Rows)
{
this.CopyRow(dataTable, row);
}
result = dataTable;
}
finally
{
Bid.ScopeLeave(ref intPtr);
}
return result;
}
internal void CopyRow(DataTable table, DataRow row)
{
int num = -1;
int newRecord = -1;
if (row == null)
{
return;
}
if (row.oldRecord != -1)
{
num = table.recordManager.ImportRecord(row.Table, row.oldRecord);
}
if (row.newRecord != -1)
{
if (row.newRecord != row.oldRecord)
{
newRecord = table.recordManager.ImportRecord(row.Table, row.newRecord);
}
else
{
newRecord = num;
}
}
DataRow dataRow = table.AddRecords(num, newRecord);
if (row.HasErrors)
{
dataRow.RowError = row.RowError;
DataColumn[] columnsInError = row.GetColumnsInError();
for (int i = 0; i < columnsInError.Length; i++)
{
DataColumn column = dataRow.Table.Columns[columnsInError[i].ColumnName];
dataRow.SetColumnError(column, row.GetColumnError(columnsInError[i]));
}
}
}
It's not surprising that the operation will take a long time; not only is it row by row, but it also does additional validation.
You should also specify the capacity of the list; otherwise it will have to grow internally to accommodate the data:
List<DataTable> aCopy = new List<DataTable>(a.Count);
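Putting the capacity hint together with the copy loop (a sketch using the same variable names as above):
List<DataTable> aCopy = new List<DataTable>(a.Count); // pre-size so the list never has to grow
for (int i = 0; i < a.Count; i++)
{
    aCopy.Add(a[i].Copy());
}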
I found the following approach much more efficient than other ways of filtering records (such as LINQ), provided your search criteria are simple:
public static DataTable FilterByEntityID(this DataTable table, int EntityID)
{
table.DefaultView.RowFilter = "EntityId = " + EntityID.ToString();
return table.DefaultView.ToTable();
}
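Usage would look roughly like this (the table name and the ID value are placeholders):
// Returns a new DataTable containing only the rows whose EntityId equals 42.
DataTable filtered = sourceTable.FilterByEntityID(42);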
I have a DataTable, created in the code below, and I need to list the length of every cell in every row of the DataTable. My result must not include "0" values, but my list is 19, 19, 19, 19, 19, 0, 0, 0, 0, 0 ... and so on. Why is that? How can I see the length of my array?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
namespace DataTables
{
class Program
{
static void Main(string[] args)
{
DataTable table = GetTable();
int[] mySortedLists = new int[table.Rows.Count*table.Columns.Count];
foreach (DataColumn dc in table.Columns)
{
foreach (DataRow dr in table.Rows)
{
Console.WriteLine(dr[dc].ToString().Length);
}
Console.WriteLine("\t");
}
Console.WriteLine("--------------------------------------");
for (int i = 0; i < table.Rows.Count; i++)
{
for (int j = 0; j < table.Columns.Count; j++)
{
mySortedLists[i] += table.Rows[i][j].ToString().Length;
}
}
foreach (var mySortedList in mySortedLists)
{
Console.WriteLine(mySortedList.ToString() + "\n");
}
Console.ReadKey();
}
static DataTable GetTable()
{
//
// Here we create a DataTable with four columns.
//
DataTable table = new DataTable();
table.Columns.Add("Dosage", typeof(int));
table.Columns.Add("Drug", typeof(string));
table.Columns.Add("Patient", typeof(string));
table.Columns.Add("Date", typeof(DateTime));
//
// Here we add five DataRows.
//
table.Rows.Add(25, "Indocin", "David", DateTime.Now);
table.Rows.Add(50, "Enebrel", "Sam", DateTime.Now);
table.Rows.Add(10, "Hydralazine", "Christoff", DateTime.Now);
table.Rows.Add(21, "Combivent", "Janet", DateTime.Now);
table.Rows.Add(100, "Dilantin", "Melanie", DateTime.Now);
return table;
}
}
}
Please help me!
You declare mySortedLists to have length table.Rows.Count * table.Columns.Count, but you only use the first table.Rows.Count entries. If you want one length value per row then you probably want:
int[] mySortedLists = new int[table.Rows.Count];
Or, if you want one length value per cell then you either want a two-dimensional array:
int[,] mySortedLists = new int[table.Rows.Count, table.Columns.Count];
...
mySortedLists[i, j] += table.Rows[i][j].ToString().Length;
Or you want to flatten the array indices:
mySortedLists[i * table.Columns.Count + j] += table.Rows[i][j].ToString().Length;
Unless I'm misunderstanding what you're trying to accomplish, your problem is in how you're declaring the mySortedLists array. You want to declare it like this:
int[] mySortedLists = new int[table.Rows.Count];
To see the length of your array, you can use
Console.WriteLine(mySortedLists.Length);
I can't tell if you're trying to get the total length of the cells of each row, or the individual lengths of each cell in each row. Your array is declared with a length that would support the latter scenario, but you're only assigning values to it for the former. Here are two LINQ approaches to store the length values; the first puts the lengths into a jagged array that holds each field's length.
int[][] lengths;
using (DataTable table = GetTable())
{
lengths = (from DataRow row in table.Rows
select
(from DataColumn col in table.Columns
select row[col].ToString().Length).ToArray()).ToArray();
}
foreach (int[] row in lengths)
{
Console.WriteLine(string.Join(",", row));
}
The second starts the same, but performs an aggregation at the end where the first .ToArray() was before, so it gets the total length of each individual row and stores it in an array.
int[] array;
using (DataTable table = GetTable())
{
array = (from DataRow row in table.Rows
select
(from DataColumn col in table.Columns
select row[col].ToString().Length).Sum()).ToArray();
}
foreach (int value in array)
Console.WriteLine(value);