I'm trying to pass an ArrayList into a DataRow object, the idea being to import data into a database from a CSV.
Previously in the file, a Dictionary<string,int> has been created, with the column name as the Key, and the position index as the corresponding value.
I was planning on using this to create a temporary DataTable for each record to aid importing into the DB. My original idea was something along the lines of:
private DataRow ArrayListToDataRow(ArrayList data, Dictionary<string,int> columnPositions)
{
DataTable dt = new DataTable();
DataColumn dc = new DataColumn();
for (i=0;i<=data.Count;i++)
{
dc.ColumnName = columnPositions.Keys[i];
dt.Columns.Add(dc);
dt.Columns[columnPositions.Keys[i]].SetOrdinal(columnPositions(columnPositions.Keys[i]);
}
//TODO Add data to row
}
But of course, the keys aren't indexable.
Does anybody have an idea on how this could be achieved?
Since the size of data should be the same as the size of your columnPositions, you could try using a foreach over your dictionary instead of a for loop.
If you want to access your dictionary values based on a sortable index, you would need to change it to
Dictionary<int, string>
Which seems to make more sense, as you seem to want to read them in that order.
If you cannot change the dictionary, you can do something like this
var orderedPositions = columnPositions.OrderBy(x => x.Value);
foreach(var position in orderedPositions)
{
// do your stuff using position.Key and position.Value
}
.OrderBy comes from LINQ, so you will need to add
using System.Linq;
to your class.
By ordering the columnPositions on their value (the column index) instead of relying on the dictionary's own enumeration order (which is not guaranteed), you can loop through them in the order you presumably want (seeing as you were using a for loop and trying to get the next columnPosition on each iteration).
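Putting the pieces together, a minimal sketch of the original method might look like the following. The class name CsvRowBuilder is made up for illustration, every column is typed as object because the question does not say what the CSV holds, and each stored position is assumed to be a valid index into data.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper class, not part of the original question.
public static class CsvRowBuilder
{
    // Builds a one-row DataTable whose columns follow the positions stored in
    // columnPositions, then copies the values from data into that row.
    public static DataRow ArrayListToDataRow(ArrayList data, Dictionary<string, int> columnPositions)
    {
        DataTable dt = new DataTable();

        // Add the columns in ascending position order so the ordinals match the CSV layout.
        foreach (KeyValuePair<string, int> position in columnPositions.OrderBy(x => x.Value))
        {
            dt.Columns.Add(position.Key, typeof(object));
        }

        DataRow row = dt.NewRow();
        foreach (KeyValuePair<string, int> position in columnPositions)
        {
            // Look the value up by its CSV position and store it under the column name.
            // A null entry is stored as DBNull, which is what DataRow expects.
            row[position.Key] = data[position.Value] ?? DBNull.Value;
        }

        dt.Rows.Add(row);
        return row;
    }
}
```

Because the columns are added in position order, there is no need to call SetOrdinal afterwards.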
Related
I have a datatable imported from a csv. What I'm trying to do is compare all of the rows to each other to find duplicates. In the case of duplicates I am going to add the row # to a list, then write the list to an array and deal with the duplicates after that.
//find duplicate rows and merge them.
foreach (DataRow dr in dt.Rows)
{
//loop again to compare rows
foreach (DataRow dx in dt.Rows)
{
if (dx[0]==dr[0] && dx[1]==dr[1] && dx[2] == dr[2] && dx[3] == dr[3] && dx[4] == dr[4] && dx[5] == dr[5] && dx[7] == dr[7])
{
dupeRows.Add(dx.ToString());
}
}
}
For testing I have added:
listBox1.Items.AddRange(dupeRows.ToArray());
which simply outputs System.Data.DataRow.
How do I store the duplicate row index ids?
The basic problem is that, at the time you decided the row was a duplicate, you saved a string describing the type of the row (which is what DataRow.ToString() returns by default) rather than anything that identifies the row.
Assuming you've read your CSV straight in with some library/driver rather than line by line (which would have been a good time to dedupe) let's use a dictionary to dedupe:
Dictionary<string, DataRow> d = new Dictionary<string, DataRow>();
// DataRowCollection is non-generic, so the loop variable must be typed as DataRow explicitly
foreach (DataRow ro in dataTable.Rows){
//form a key for the dictionary
string key = string.Format("{0}\t{1}\t{2}\t{3}\t{4}\t{5}\t{7}", ro.ItemArray);
d[key] = ro;
}
That's it; at the end of this operation d.Values will be a deduped collection of DataRow (for each key, the last row seen wins). 1000 rows will require 1000 operations, so this will likely be orders of magnitude faster than comparing every row to every other row, which would need a million operations for a thousand rows.
I've used tabs to separate the values when I formed the key, assuming your data contains no tabs. Best reliability will be achieved if you use a character that does not appear in the data.
If you've read your CSV line by line and done a manual string split on comma (i.e. a primitive way of reading a CSV), you could do this operation then instead; after you split you have an array that can be used in place of ro.ItemArray. Process the entire file, creating rows (and adding to the dictionary) only if d.ContainsKey returns false. If the dictionary already contains that row, skip it rather than creating a row.
The output you are seeing (System.Data.DataRow) is expected. DataRow does not override ToString(), so the call falls through to System.Object.ToString(), whose default implementation returns the fully qualified name of the object's type.
I see three solutions here:
If possible, try to read the DataTable into custom objects (like MyDataTable, MyDataRow) so you can create your own ToString() like below:
public class MyDataRow
{
public override string ToString()
{
return "This is my custom data row formatted string";
}
}
In the for loop, when you find a duplicate row, just add the index/id (a sort of primary key) of dx to an array, then use another for loop to retrieve the dupes.
Third is same as mentioned by Caius Jard.
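As a rough sketch of storing row indexes instead of row strings, the helper below walks the table once and records the index of every row whose key columns repeat an earlier row. The names DuplicateFinder and FindDuplicateRowIndexes are made up for illustration, and the tab separator assumes the data itself contains no tabs.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper class, not part of the original question.
public static class DuplicateFinder
{
    // Returns the indexes of rows whose values in keyColumns repeat an earlier row.
    // The first occurrence of each key is not reported, only the later repeats.
    public static List<int> FindDuplicateRowIndexes(DataTable table, params int[] keyColumns)
    {
        var seen = new HashSet<string>();
        var duplicateIndexes = new List<int>();

        for (int i = 0; i < table.Rows.Count; i++)
        {
            DataRow row = table.Rows[i];

            // Build a tab-separated key from the selected columns only.
            var parts = new List<string>();
            foreach (int column in keyColumns)
            {
                parts.Add(Convert.ToString(row[column]));
            }
            string key = string.Join("\t", parts);

            // HashSet.Add returns false when the key was already present.
            if (!seen.Add(key))
            {
                duplicateIndexes.Add(i);
            }
        }

        return duplicateIndexes;
    }
}
```

For the table in the question this could be called as DuplicateFinder.FindDuplicateRowIndexes(dt, 0, 1, 2, 3, 4, 5, 7), and the resulting indexes can then be shown in the list box with ToString() on each int.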
I have a question about assigning array values (double[]) into a column in a DataTable in C#. I derived double[] from a column in my DataTable as shown below.
double[] ToBeChanged = mydatatable.AsEnumerable().Select(s => s.Field<double>("column")).ToArray<double>();
After updating some values in ToBeChanged, I want to get it back to the original column quickly. How can I implement it as simple or as quick as possible?
One approach would be to use the .Zip method:
var pairs = mydatatable.AsEnumerable().Zip(ToBeChanged, (row, value) => (row, value));
foreach (var (row, value) in pairs)
{
row.SetField("column", value);
}
But I wouldn't advise using it in production code, because its correctness depends entirely on the assumption that the rows and the values are in the same order.
Other possible approaches could be:
Update the values within the DataTable.
Pass a DataRow instance to the method which will update the values.
Use a dictionary to associate the updatable values with a row identification value.
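The dictionary approach could be sketched roughly as follows. It assumes the table has an int column that uniquely identifies each row; that key column, the class name ColumnUpdater, and the method name ApplyUpdates are all hypothetical and do not come from the question.

```csharp
using System.Collections.Generic;
using System.Data;

// Hypothetical helper class, not part of the original question.
public static class ColumnUpdater
{
    // Writes each value in updates to valueColumn of the row whose keyColumn
    // matches the dictionary key. Rows with no entry in updates are left unchanged.
    public static void ApplyUpdates(
        DataTable table,
        string keyColumn,
        string valueColumn,
        IDictionary<int, double> updates)
    {
        foreach (DataRow row in table.Rows)
        {
            int key = row.Field<int>(keyColumn);
            if (updates.TryGetValue(key, out double value))
            {
                row.SetField(valueColumn, value);
            }
        }
    }
}
```

Because each value is matched to its row by key, the result no longer depends on the rows and the values being in the same order.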
I have a small problem that I just cannot figure out how to fix.
For my data table, Dictionar.DtDomenii, I need to copy the unique data from my other data table, Dictionar.Dt.
I wrote a query, but when using query.CopyToDataTable() to copy the data into my DtDomenii table, the "CopyToDataTable" function does not show...
Am I doing something wrong? Is there an easier way to copy distinct data (categories from my example) from one data table to another?
PS: I've already read the information from MSDN https://msdn.microsoft.com/en-us/library/bb386921%28v=vs.110%29.aspx
void LoadCategories()
{
var query = (from cat in Dictionar.dt.AsEnumerable()
select new
{
categorie = cat.Field<string>("Categoria")
}).Distinct();
// This statement does not work:
Dictionar.dtDomenii = query.CopyToDataTable();
}
Only collections of DataRows can use the CopyToDataTable method. For example:
DataTable table = new DataTable();
table.AsEnumerable().CopyToDataTable(); // this works
List<DataRow> dataRows = new List<DataRow>();
dataRows.CopyToDataTable(); // this also works
List<string> strings = new List<string>();
strings.CopyToDataTable(); // this does not work
The select new... part of your query is converting the DataRows into objects. You need to convert the objects back into DataRows before you can use the CopyToDataTable method.
You might have better luck doing something like this:
DataTable copy = Dictionar.dt
.AsEnumerable() // now an enumerable of DataRow
.Distinct(DataRowComparer.Default) // remove duplicates by comparing column values (a plain Distinct() compares references); still an enumerable of DataRow
.CopyToDataTable(); // done!
You can also make a complete copy of the table with Dictionar.dt.Copy(), then remove the duplicate rows manually.
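If only the distinct values of a single column are needed, as with the categories in the question, DataView.ToTable may be a simpler route than LINQ. The sketch below is one way to do that; the class and method names are made up for illustration.

```csharp
using System.Data;

// Hypothetical helper class, not part of the original question.
public static class CategoryLoader
{
    // Returns a new table that contains only columnName, with each distinct value once.
    public static DataTable DistinctColumn(DataTable source, string columnName)
    {
        // A fresh DataView is used so that any filter or sort already set on
        // source.DefaultView does not affect the result.
        DataView view = new DataView(source);
        return view.ToTable(true, columnName);
    }
}
```

With the names from the question this might be used as Dictionar.dtDomenii = CategoryLoader.DistinctColumn(Dictionar.dt, "Categoria");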
'cannot implicitly convert type string to data row[]'.
Is it possible to store the string type to data row[]? I need to store the value of the particular column in that particular data row array. Suggest me an answer please.
DataRow[] drprocess = objds.Tables[0].Rows[i]["ProcessName"].ToString();
You have declared a variable of type DataRow[] called drprocess but have not yet created an array of DataRows in which to put any values. Instead you've tried to tell the compiler that the string you're retrieving is actually an array of DataRows, which it isn't.
It's possible that what you want to do is to create your array of DataRows, then create a DataRow object and assign it into the array. However, I'm suspicious that this isn't actually what you're trying to achieve. Note that objds.Tables[0].Rows is already a collection of DataRows. You can actually edit or use this collection yourself if you need.
Or, if you want to create a new collection of process names, you might be better off creating a var processes = new List<string>() and then calling processes.Add(objds.Tables[0].Rows[i]["ProcessName"].ToString()).
It all depends what you want to do with this collection of process names afterwards.
First, a DataRow always belongs to a DataTable. To which table should these new DataRow belong? I will presume objds.Tables[0].
I also assume that you have a string-column and you want to split every field in it to a DataRow[], then we need to know the delimiter.
Presuming it is a comma:
DataRow[] drprocess = objds.Tables[0].Rows[i].Field<string>("ProcessName").Split(',')
.Select(name => {
DataRow row = objds.Tables[0].NewRow();
row.SetField("ProcessName", name);
return row;
})
.ToArray();
I am embarrassed to ask, but what is the best way to add key/value pair data in cache (HttpRuntime.Cache) to a DataTable?
I'm currently dumping the key/value pair data from cache into a HashTable, which becomes the DataSource for a Repeater object. Unfortunately, I cannot sort the data in the HashTable and therefore thought a DataTable (being the DataSource for the Repeater) would solve my dilemma.
If you simply want to copy each key/value pair from the cache to a DataTable:
DataTable table = new DataTable();
table.Columns.Add("key", typeof(string));
table.Columns.Add("value", typeof(string));
foreach (DictionaryEntry entry in HttpRuntime.Cache)
{
table.Rows.Add(entry.Key, entry.Value);
}
This assumes that both keys and values are of type string, but if this is not the case, simply replace the types mentioned in line #2 and #3 in the code.
The newly created DataTable can be bound to a Repeater using code like this:
myRepeater.DataSource = table;
myRepeater.DataBind();
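Since the goal in the question is a sortable data source, the table can be sorted before it is bound. The following is a minimal sketch using a DataView; the class name TableSorter and the method name SortedCopy are made up for illustration, and the column name is assumed to be a plain name with no commas or brackets.

```csharp
using System.Data;

// Hypothetical helper class, not part of the original answer.
public static class TableSorter
{
    // Returns a copy of table with its rows ordered by columnName, ascending.
    public static DataTable SortedCopy(DataTable table, string columnName)
    {
        DataView view = new DataView(table);
        view.Sort = columnName + " ASC";
        return view.ToTable();
    }
}
```

The Repeater could then be bound with myRepeater.DataSource = TableSorter.SortedCopy(table, "key");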
Have you had a look at the SortedList class or the SortedDictionary class?
Represents a collection of key/value pairs that are sorted by key based on the associated IComparer<T> implementation.
Is there some reason you don't want to store the original DataTable in your cache? (Besides the obvious high object weight of it, I mean?) Is it something you can share between all users in a read-only fashion? If so, it's probably a good candidate for caching in some format (maybe IDictionary?).
Or you can create your own class with the properties that will be bound to the Repeater later on, add instances of it to a list of that type, and sort them easily with LINQ.
//your data container class
public class MyClass
{
public string Name { get; set; }
}
//Put all classes into the list before caching it
List<MyClass> source = new List<MyClass>() ;
//use this to sort with any kind of data inside your own defined class
var sortedResult = source.OrderBy(x => x.Name);
This should do the trick:
Sub Main() Handles Me.Load
Dim Hash As New Hashtable
Hash.Add("Tom", "Arnold")
Hash.Add("Sly", "Stallone")
HashToDataTable(Hash)
End Sub
Function HashToDataTable(ByVal Hash As Hashtable) As Data.DataTable
Dim Table As New Data.DataTable()
Table.Columns.Add("Key", GetType(String))
Table.Columns.Add("Value", GetType(Object)) 'You can use any type you want.
For Each Key In Hash.Keys
Table.Rows.Add(Key, Hash(Key))
Next
Return Table
End Function