I have a problem with my code. I want to replace the foreach loops with LINQ. Is there a way to do that? My code is given below.
static public string table2Json(DataSet ds, int table_no)
{
    try
    {
        object[][] tb = new object[ds.Tables[table_no].Rows.Count][];
        int r = 0;
        foreach (DataRow dr in ds.Tables[table_no].Rows)
        {
            tb[r] = new object[ds.Tables[table_no].Columns.Count];
            int col = 0;
            foreach (DataColumn column in ds.Tables[table_no].Columns)
            {
                tb[r][col] = dr[col];
                if ((tb[r][col]).Equals(System.DBNull.Value))
                {
                    tb[r][col] = "";
                }
                col++;
            }
            r++;
        }
        string table = JsonConvert.SerializeObject(tb, Formatting.Indented);
        return table;
    }
    catch (Exception ex)
    {
        tools.log(ex.Message);
        throw ex;
    }
}
This question really asks 3 different things:
how to serialize a DataTable
how to change the DataTable serialization format and finally
how to replace nulls with empty strings, even though an empty string isn't a NULL.
JSON.NET already handles DataSet and DataTable instance serialization with a DataTableConverter whose source can be found here. You could just write :
var str = JsonConvert.SerializeObject(data);
Given this DataTable :
var dataTable=new DataTable();
dataTable.Columns.Add("Name",typeof(string));
dataTable.Columns.Add("SurName",typeof(string));
dataTable.Rows.Add("Moo",null);
dataTable.Rows.Add("AAA","BBB");
You get :
[{"Name":"Moo","SurName":null},{"Name":"AAA","SurName":"BBB"}]
DataTables aren't 2D arrays, and the column names and types matter. Generating a separate row object with named fields is far better than generating an object[] array. It also makes it far easier for clients to handle the JSON string without knowing its schema in advance. With an object[] for each row, the clients have to know in advance what's stored in each position.
If you want to use a different serialization format, you could customize the DataTableConverter. Another option though, is to use DataRow.ItemArray to get the values as an object[] and LINQ to get the rows, eg :
object[][] values = dataTable.Rows.Cast<DataRow>()
    .Select(row => row.ItemArray)
    .ToArray();
Serializing this produces :
[["Moo",null],["AAA","BBB"]]
And there's no way to tell which item is the name and which is the surname any more.
Replacing DBNulls with strings in this last form needs an extra Select() to replace DBNull.Value with "" :
object[][] values = dataTable.Rows.Cast<DataRow>()
    .Select(row => row.ItemArray
        .Select(x => x == DBNull.Value ? "" : x)
        .ToArray())
    .ToArray();
Serializing this produces :
[["Moo",""],["AAA","BBB"]]
That's what was asked, but now we have no way to tell whether the Surname is an empty string, or just doesn't exist.
This may sound strange, but Arabic names may be one long name without surname. Makes things interesting for airlines or travel agents that try to issue tickets (ask me how I know).
We can get rid of ToArray() if we use var :
var values = dataTable.Rows.Cast<DataRow>()
    .Select(row => row.ItemArray
        .Select(x => x == DBNull.Value ? "" : x));
JSON serialization will work the same.
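As a rough sketch, the LINQ pieces above could be collected into one helper. The names DataTableJson and ToJaggedArray are made up for illustration and are not part of Json.NET or System.Data:

```csharp
using System;
using System.Data;
using System.Linq;

public static class DataTableJson
{
    // Hypothetical helper: converts every row to an object[] and
    // replaces DBNull with an empty string, as the question asks.
    public static object[][] ToJaggedArray(DataTable table)
    {
        return table.Rows.Cast<DataRow>()
            .Select(row => row.ItemArray
                // The cast makes both branches of the conditional an object.
                .Select(x => x is DBNull ? (object)"" : x)
                .ToArray())
            .ToArray();
    }
}
```

Assuming Json.NET as in the question, the two loops in table2Json could then be replaced by JsonConvert.SerializeObject(DataTableJson.ToJaggedArray(ds.Tables[table_no]), Formatting.Indented).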
LINQ is not a nice fit for this sort of thing because you are using explicit indexes r and col into multiple "array structures" (and there is no easy/tidy way to achieve multiple, parallel enumeration).
Other issues
tb is repeatedly newed, filled with data and then replaced in the next iteration, so you end up capturing only the last row of input to the JSON string - that's a logical bug and won't work as I think you intend.
The inner foreach loop declares but does not use the iteration variable column - that's not going to break anything but it is redundant.
You will get more mileage out of using JSON.Net properly (or coding the foreach loops as for loops instead if you want to navigate the structures yourself).
Related
I have a file with 2 columns and multiple rows. 1st column is ID, 2nd column is Name. I would like to display a Dropdown where I will show only all the names from this file.
I will only iterate through the collection, so which is the better approach? Is creating objects more readable for other developers? Or is creating new objects too slow to be worth it?
while (!reader.EndOfStream)
{
    var row = reader.ReadLine();
    var values = row.Split(' ');
    list.Add(new Object { Id = int.Parse(values[0]), Name = values[1] });
}
or
while (!reader.EndOfStream)
{
    var row = reader.ReadLine();
    var values = row.Split(' ');
    dict.Add(int.Parse(values[0]), values[1]);
}
Do I lose speed if I create new objects?
You also create new objects, so to speak, when adding to a Dictionary<TKey, TValue>: each entry is a new key-value pair of the chosen types.
As you already mentioned in your question, the decision depends primarily on:
the expected access pattern
performance considerations, which are themselves a function of the access pattern.
If you need a read-only collection to iterate over, use List<T> (or, better, a plain array such as Type[] data if the size of the data is known up front) and just read it where you need it.
If you need key-wise access to your data, use Dictionary<TKey, TValue>.
If you want to only iterate objects, then use List. No need to use Dictionary class at all.
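A minimal sketch of the list-based approach for the dropdown case follows. NamedItem and NameFile are hypothetical names, and the code assumes each line has the form "<id> <name>" with a single space and no spaces inside the name:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for the question's "new Object { ... }".
public class NamedItem
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class NameFile
{
    // Parses lines of the form "<id> <name>" into a list of items.
    public static List<NamedItem> Parse(IEnumerable<string> lines)
    {
        return lines
            .Where(line => !string.IsNullOrWhiteSpace(line)) // skip blank lines
            .Select(line => line.Split(' '))
            .Select(parts => new NamedItem { Id = int.Parse(parts[0]), Name = parts[1] })
            .ToList();
    }
}
```

The names for the dropdown would then be NameFile.Parse(File.ReadLines(path)).Select(item => item.Name), where path is whatever file you are reading.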
I have a datatable imported from a csv. What I'm trying to do is compare all of the rows to each other to find duplicates. In the case of duplicates I am going to add the row # to a list, then write the list to an array and deal with the duplicates after that.
//find duplicate rows and merge them.
foreach (DataRow dr in dt.Rows)
{
    //loop again to compare rows
    foreach (DataRow dx in dt.Rows)
    {
        if (dx[0]==dr[0] && dx[1]==dr[1] && dx[2] == dr[2] && dx[3] == dr[3] && dx[4] == dr[4] && dx[5] == dr[5] && dx[7] == dr[7])
        {
            dupeRows.Add(dx.ToString());
        }
    }
}
for testing I have added:
listBox1.Items.AddRange(dupeRows.ToArray());
which simply outputs System.Data.DataRow.
How do I store the duplicate row index ids?
The basic problem is that, at the time you decided the row was a duplicate, you saved a string describing the type of the row (which is what DataRow.ToString() returns by default).
Assuming you've read your CSV straight in with some library/driver rather than line by line (which would have been a good time to dedupe) let's use a dictionary to dedupe:
Dictionary<string, DataRow> d = new Dictionary<string, DataRow>();
// DataRowCollection is non-generic, so the loop variable must be typed as DataRow
foreach (DataRow ro in dataTable.Rows)
{
    // form a key for the dictionary
    string key = string.Format("{0}\t{1}\t{2}\t{3}\t{4}\t{5}\t{7}", ro.ItemArray);
    d[key] = ro;
}
That's it; at the end of this operation d.Values will be a deduped collection of DataRow. 1000 rows will require 1000 operations, so this will likely be orders of magnitude faster than comparing every row to every other row, which would need a million operations for a thousand rows.
I've used tabs to separate the values when forming the key, assuming your data contains no tabs. Reliability is best if you use a character that does not appear in the data.
If you've read your CSV line by line and done a manual string split on comma (i.e. a primitive way of reading a CSV) you could do this operation then instead; after you split you have an array that can be used in place of ro.ItemArray. Process the entire file, creating rows (and adding to the dictionary) only if d.ContainsKey returns false. If the dictionary already contains that row, skip on rather than creating a row
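If you specifically want the indexes of the duplicate rows, as the question asks, the same key idea can be combined with a HashSet. This is only a sketch: DuplicateFinder and its keyColumns parameter are invented for illustration, and it again assumes the data contains no tabs:

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DuplicateFinder
{
    // Returns the zero-based indexes of rows whose key columns repeat
    // an earlier row. keyColumns is a hypothetical parameter listing
    // the column positions to compare (0 to 5 and 7 in the question).
    public static List<int> FindDuplicateIndexes(DataTable table, params int[] keyColumns)
    {
        var seen = new HashSet<string>();
        var duplicates = new List<int>();
        for (int i = 0; i < table.Rows.Count; i++)
        {
            DataRow row = table.Rows[i];
            // Tab-separated key, assuming the data contains no tabs.
            string key = string.Join("\t", keyColumns.Select(c => row[c]));
            // HashSet.Add returns false when the key was already present.
            if (!seen.Add(key))
            {
                duplicates.Add(i);
            }
        }
        return duplicates;
    }
}
```

For the table in the question the call would be DuplicateFinder.FindDuplicateIndexes(dt, 0, 1, 2, 3, 4, 5, 7). The first occurrence of each row is kept and only the later copies are reported.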
The output you are seeing (System.Data.DataRow) is expected. DataRow does not override ToString(), so the framework calls the base class (System.Object) implementation, which by default returns the fully qualified name of the object's type.
I see three solutions here:
If possible, read the DataTable into custom objects (like MyDataTable, MyDataRow) so that you can write your own ToString(), like below:
public class MyDataRow
{
    public override string ToString()
    {
        return "This is my custom data row formatted string";
    }
}
In the for loop, when you find a duplicated row, just add the index/id (some sort of primary key) of dx to the array, and then use another for loop to retrieve the dupes.
Third is same as mentioned by Caius Jard.
I have a DataTable which I have filtered using LINQ:
result = myDataTable.AsEnumerable().Where(row => row.Field<string>("id").Contains(values));
I have also tried using CopyToDataTable methods like
result.CopyToDataTable()
and
result.CopyToDataTable<DataRow>()
but they didn't work.
How can I convert result to new DataTable?
I have searched many Stack Overflow questions and other tutorials, but I can't find what I want.
UPDATE
I have concatenated a HashSet into comma-separated values. Should I use an array of strings or the HashSet instead?
I suggest you create a DataTable object and import the rows into it by calling the ImportRow() method; that will resolve the issue.
DataTable.ImportRow Method
Example code.
DataTable tblClone = datTab.Clone();
foreach (DataRow datRow in datTab.Rows)
{
    tblClone.ImportRow(datRow);
}
For you it would look like this:
var result = myDataTable.AsEnumerable().Where(row => row.Field<string>("id").Contains(values));
DataTable tblClone = myDataTable.Clone();
foreach (DataRow dr in result)
    tblClone.ImportRow(dr);
.CopyToDataTable<DataRow>() returns a DataTable, it will not modify the variable unless you re-assign it.
result = myDataTable.AsEnumerable().Where(row => row.Field<string>("id").Contains(values));
Then you actually need a DataTable object.
DataTable resultDT = result.CopyToDataTable<DataRow>();
Edit: As Tim pointed out, if no rows are returned by your query, an exception will be thrown "The source contains no DataRows"
You could do something like so;
DataTable resultDT = result.Any() ? result.CopyToDataTable<DataRow>() : myDataTable.Clone();
But that will run the query twice (also as Tim pointed out).
Therefore you could materialize the query into a list (with .ToList()), check the count, and then do your processing. That has a performance cost, since it creates a new List object.
Wrapping the conversion to a DataTable in a try/catch also isn't a good idea. See Pranay's answer for another great way to achieve the final result.
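A minimal sketch of the ToList() variant, so the query runs only once and the empty case is handled. TableFilter, FilterById and wantedIds are hypothetical names, and the filter assumes you keep the ids in a set rather than in a comma-separated string:

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class TableFilter
{
    // Hypothetical helper: keeps the rows whose "id" value is in wantedIds.
    public static DataTable FilterById(DataTable source, ISet<string> wantedIds)
    {
        // Materialize once so the query is not executed twice.
        List<DataRow> rows = source.AsEnumerable()
            .Where(row => wantedIds.Contains(row.Field<string>("id")))
            .ToList();

        // CopyToDataTable throws on an empty sequence, so fall back
        // to an empty table with the same schema.
        return rows.Count > 0 ? rows.CopyToDataTable() : source.Clone();
    }
}
```

Whether the extra list matters depends on how many rows match; for typical in-memory tables the cost is likely small next to copying the rows themselves.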
'cannot implicitly convert type string to data row[]'.
Is it possible to store a string in a DataRow[]? I need to store the value of a particular column in that DataRow array. Please suggest an answer.
DataRow[] drprocess = objds.Tables[0].Rows[i]["ProcessName"].ToString();
You have declared a variable of type DataRow[] called drprocess but have not yet created an array of DataRows in which to put any values. Instead you've tried to tell the compiler that the string you're retrieving is actually a DataRow[], which it isn't.
It's possible that what you want to do is to create your array of DataRows, then create a DataRow object and assign it into the array. However, I'm suspicious that this isn't actually what you're trying to achieve. Note that objds.Tables[0].Rows is already a collection of DataRows. You can actually edit or use this collection yourself if you need.
Or, if you want to create a new collection of process names, you might be better off creating a var processes = new List<string>() and then calling processes.Add(objds.Tables[0].Rows[i]["ProcessName"].ToString()).
It all depends what you want to do with this collection of process names afterwards.
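If a list of names is what you are after, a sketch along these lines may be enough. The ProcessNames class is invented for illustration, and it assumes the table has a "ProcessName" column as in the question:

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ProcessNames
{
    // Collects the "ProcessName" column of every row as strings.
    public static List<string> FromTable(DataTable table)
    {
        return table.AsEnumerable()
            // DBNull.ToString() yields "", so missing values become empty strings.
            .Select(row => row["ProcessName"].ToString() ?? "")
            .ToList();
    }
}
```

If you really do need a DataRow[], objds.Tables[0].Select() returns all rows of the table as an array.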
First, a DataRow always belongs to a DataTable. To which table should these new DataRow belong? I will presume objds.Tables[0].
I also assume that you have a string-column and you want to split every field in it to a DataRow[], then we need to know the delimiter.
Presuming it is a comma:
DataRow[] drprocess = objds.Tables[0].Rows[i].Field<string>("ProcessName").Split(',')
    .Select(name => {
        DataRow row = objds.Tables[0].NewRow();
        row.SetField("ProcessName", name);
        return row;
    })
    .ToArray();
I have an array of custom objects. I'd like to be able to reference this array by a particular data member, for instance myArray["Item1"].
"Item1" is actually the value stored in the Name property of this custom type, and I can write a predicate to match the appropriate array item. However, I am unclear how to let the array know I'd like to use this predicate to find the array item.
I'd like to just use a dictionary, hashtable or NameValuePair for this array and get around this whole problem, but it's generated and it must remain a CustomObj[]. I'm also trying to avoid loading a dictionary from this array, as it's going to happen many times and there could be many objects in it.
For clarification
myArray[5] = new CustomObj() // easy!
myArray["ItemName"] = new CustomObj(); // how to do this?
Can the above be done? I'm really just looking for something similar to how DataTable.Columns["MyColumnName"] works.
Thanks for the advice.
What you really want is an OrderedDictionary. The version that .NET provides in System.Collections.Specialized is not generic - however there is a generic version on CodeProject that you could use. Internally, this is really just a hashtable married to a list ... but it is exposed in a uniform manner.
If you really want to avoid using a dictionary - you're going to have to live with O(n) lookup performance for an item by key. In that case, stick with an array or list and just use the LINQ Where() method to lookup a value. You can use either First() or Single() depending on whether duplicate entries are expected.
var myArrayOfCustom = ...
var item = myArrayOfCustom.Where( x => x.Name == "yourSearchValue" ).First();
It's easy enough to wrap this functionality into a class so that external consumers are not burdened by this knowledge, and can use simple indexers to access the data. You could then add features like memoization if you expect the same values are going to be accessed frequently. In this way you could amortize the cost of building the underlying lookup dictionary over multiple accesses.
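One possible shape for such a wrapper is sketched below. NamedArray is a made-up name, CustomObj is a stand-in for the generated type, and the code assumes names are unique and no array element is null:

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the generated type in the question.
public class CustomObj
{
    public string Name { get; set; }
}

// Wraps the generated CustomObj[] and adds lookup by Name.
public class NamedArray
{
    private readonly CustomObj[] items;
    private Dictionary<string, int> indexByName; // built lazily on first use

    public NamedArray(CustomObj[] items)
    {
        this.items = items;
    }

    public CustomObj this[int index]
    {
        get { return items[index]; }
        set { items[index] = value; indexByName = null; } // invalidate the cache
    }

    public CustomObj this[string name]
    {
        get { return items[IndexOf(name)]; }
        set { items[IndexOf(name)] = value; indexByName = null; }
    }

    private int IndexOf(string name)
    {
        if (indexByName == null)
        {
            indexByName = new Dictionary<string, int>();
            for (int i = 0; i < items.Length; i++)
            {
                // Assumes names are unique; the first occurrence wins.
                if (!indexByName.ContainsKey(items[i].Name))
                {
                    indexByName[items[i].Name] = i;
                }
            }
        }
        return indexByName[name]; // throws KeyNotFoundException if absent
    }
}
```

The underlying array stays a CustomObj[], and the dictionary is only built when a lookup by name actually happens, then reused until an element is replaced through the wrapper. Writes made directly to the array bypass the cache, so the sketch only holds if all changes go through the wrapper.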
If you do not want to use "Dictionary", then you should create a class (say "myArrary") that stores the data and exposes two indexers: one of type "int" for index access and one of type "string" for associative access.
public CustomObj this [string index]
{
    get
    {
        return data[searchIdxByName(index)];
    }
    set
    {
        data[searchIdxByName(index)] = value;
    }
}
First link in google for indexers is: http://www.csharphelp.com/2006/04/c-indexers/
You could use a dictionary for this. It might not be the best solution in the world, but it is the first one I came up with.
Dictionary<string, int> d = new Dictionary<string, int>();
d.Add("cat", 2);
d.Add("dog", 1);
d.Add("llama", 0);
d.Add("iguana", -1);
The ints could be objects, or whatever you like :)
http://dotnetperls.com/dictionary-keys
Perhaps OrderedDictionary is what you're looking for.
You can use a Hashtable:
System.Collections.Hashtable o_Hash_Table = new Hashtable();
o_Hash_Table.Add("Key", "Value");
There is a class in the System.Collections.Generic namespace called Dictionary<K,V> that you should use.
var d = new Dictionary<string, MyObj>();
MyObj o = d["a string variable"];
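Another way would be to code an indexer with a getter and a setter:
public MyObj this[string index]
{
    get
    {
        foreach (var o in My_Enumerable)
        {
            if (o.Name == index)
            {
                return o;
            }
        }
        // No match: every code path must return a value or throw.
        throw new KeyNotFoundException(index);
    }
    set
    {
        // Assumes My_Enumerable is a List<MyObj>. Look up the position first
        // so the list is not modified while it is being enumerated.
        var i = My_Enumerable.FindIndex(o => o.Name == index);
        if (i >= 0)
        {
            My_Enumerable[i] = value;
        }
    }
}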
I hope it helps!
It depends on the collection: some collections allow access by name and some don't. Accessing with strings is only meaningful when the collection stores its items under a name. The column collection identifies columns by their name, which is what lets you select a column by its name. In a normal array this does not work, because items are identified only by their index number.
My best recommendation, if you can't change it to use a dictionary, is to either use a Linq expression:
var item1 = myArray.Where(x => x.Name == "Item1").FirstOrDefault();
or, make an extension method that uses a linq expression:
public static class CustomObjExtensions
{
    public static CustomObj Get(this CustomObj[] Array, string Name)
    {
        // Returns null when no item has the given name.
        return Array.Where(x => x.Name == Name).FirstOrDefault();
    }
}
then in your app:
var item2 = myArray.Get("Item2");
Note however that performance wouldn't be as good as using a dictionary, since behind the scenes .NET will just loop through the list until it finds a match, so if your list isn't going to change frequently, then you could just make a Dictionary instead.
I have two ideas:
1) I'm not sure you're aware but you can copy dictionary objects to an array like so:
Dictionary<string, int> dict = new Dictionary<string, int>();
dict.Add("tesT",40);
int[] myints = new int[dict.Count];
dict.Values.CopyTo(myints, 0);
This might allow you to use a Dictionary for everything while still keeping the output as an array.
2) You could also actually create a DataTable programmatically if that's the exact functionality you want:
DataTable dt = new DataTable();
DataColumn dc1 = new DataColumn("ID", typeof(int));
DataColumn dc2 = new DataColumn("Name", typeof(string));
dt.Columns.Add(dc1);
dt.Columns.Add(dc2);
DataRow row = dt.NewRow();
row["ID"] = 100;
row["Name"] = "Test";
dt.Rows.Add(row);
You could also create this outside of the method so you don't have to make the table over again every time.