Extract column of doubles from a DataTable - c#

Is there an easier way to achieve the following?
var obj = from row in table.AsEnumerable()
select row["DOUBLEVALUE"];
double[] a = Array.ConvertAll<object, double>(obj.ToArray(), o => (double)o);
I'm extracting a column from a DataTable and storing the column in an array of doubles.
Assume that table is a DataTable containing a column called "DOUBLEVALUE" of type typeof(Double).

var obj = (from row in table.AsEnumerable()
select row.Field<double>("DOUBLEVALUE")).ToArray();
The .ToArray() bit is optional, of course; without the ToArray(), you'll get an enumerable sequence of doubles instead of an array - it depends whether you need to read it once or twice. If you have nulls in the data, use <double?> instead.
(note that this needs a reference to System.Data.DataSetExtensions.dll, and a "using System.Data;" statement - this brings the .Field<T>(...) extension methods into play)

double[] a = (from row in table.AsEnumerable()
select Convert.ToDouble( row["DOUBLEVALUE"] )).ToArray();
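If rows may have null values in that column, add where !row.IsNull("DOUBLEVALUE") before the select. A missing value in a DataTable is stored as DBNull.Value rather than null, so comparing row["DOUBLEVALUE"] against null does not filter it out.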
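As a minimal sketch of the null handling discussed in the answers above, the following builds a small demo table (the rows are made up for illustration; only the column name comes from the question) and uses Field<double?> to skip rows whose value is DBNull:

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical demo data; only the column name "DOUBLEVALUE" is from the question.
var table = new DataTable();
table.Columns.Add("DOUBLEVALUE", typeof(double));
table.Rows.Add(1.5);
table.Rows.Add(DBNull.Value); // a missing value is stored as DBNull.Value
table.Rows.Add(2.5);

// Field<double?> maps DBNull to null, so missing values can be filtered out
// before unwrapping the remaining values into a plain double[].
double[] values = table.AsEnumerable()
    .Select(row => row.Field<double?>("DOUBLEVALUE"))
    .Where(v => v.HasValue)
    .Select(v => v.Value)
    .ToArray();

Console.WriteLine(values.Length); // 2
```

Whether to skip missing values or keep them as double? depends on what the consumer of the array expects.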

Related

Linq Distinct not bringing back the correct results

I'm trying to select distinct values from a DataTable using LINQ. The DataTable is populated from an Excel sheet whose columns are dynamic, except that every sheet has a mandatory column named SERIAL NUMBER.
I have a DataTable for demo purpose which consist of 4 serial number as:
12345
12345
98765
98765
When I do
var distinctList = dt.AsEnumerable().Select(a => a).Distinct().ToList();
I get all four records instead of 2.
If I do
var distinctList = dt.AsEnumerable().Select(a => a.Field<string>("SERIAL NUMBER")).Distinct().ToList();
then I get the correct results, but the list only contains that one column from dt and not all the other columns. Can someone tell me where I'm going wrong, please?
The problem is that the Distinct method uses the default equality comparer, which for DataRow compares by reference. To get the desired result, you can use the Distinct overload that accepts an IEqualityComparer<T>, and pass DataRowComparer.Default:
The DataRowComparer<TRow> class is used to compare the values of the DataRow objects and does not compare the object references.
var distinctList = dt.AsEnumerable().Distinct(DataRowComparer.Default).ToList();
For more info, see Comparing DataRows (LINQ to DataSet).
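To illustrate the difference between reference and value comparison, here is a small sketch using the four serial numbers from the question. The MODEL column and its values are hypothetical, added only so the rows have more than one column:

```csharp
using System;
using System.Data;
using System.Linq;

var dt = new DataTable();
dt.Columns.Add("SERIAL NUMBER", typeof(string));
dt.Columns.Add("MODEL", typeof(string)); // hypothetical extra column
dt.Rows.Add("12345", "A");
dt.Rows.Add("12345", "A");
dt.Rows.Add("98765", "B");
dt.Rows.Add("98765", "B");

// Default comparer: every DataRow is a different object, so nothing is removed.
var byReference = dt.AsEnumerable().Distinct().ToList();

// DataRowComparer.Default compares the column values of the rows.
var byValue = dt.AsEnumerable().Distinct(DataRowComparer.Default).ToList();

Console.WriteLine(byReference.Count); // 4
Console.WriteLine(byValue.Count);     // 2
```

Note that DataRowComparer.Default compares all columns, so two rows with the same serial number but different values elsewhere would still count as distinct; grouping by the serial number is the alternative in that case.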
So, you want to group them by Serial Number and retrieve the full DataRow? Assuming that after grouping them we want to retrieve the first item:
var distinctList = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER"))
.Select(a => a.FirstOrDefault()).Distinct().ToList();
EDIT: As requested
var distinctValues = dt.AsEnumerable().Select(a => a.Field<string>("SERIAL NUMBER")).Distinct().ToList();
var duplicateValues = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER")).SelectMany(a => a.Skip(1)).Distinct().ToList();
var duplicatesRemoved = dt.AsEnumerable().Except(duplicateValues);
There is no need to use LINQ for this task. In the ToTable method, the first parameter specifies whether you want distinct records, and the remaining parameters name the columns to include in the result, which are also the columns the rows are made distinct on.
DataTable returnVals = dt.DefaultView.ToTable(true, "ColumnNameOnWhichYouWantDistinctRecords");
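A short sketch of what ToTable returns, assuming a demo table with the question's serial numbers plus a hypothetical MODEL column. It shows that the result contains only the columns that were named:

```csharp
using System;
using System.Data;

var dt = new DataTable();
dt.Columns.Add("SERIAL NUMBER", typeof(string));
dt.Columns.Add("MODEL", typeof(string)); // hypothetical extra column
dt.Rows.Add("12345", "A");
dt.Rows.Add("12345", "B");
dt.Rows.Add("98765", "C");
dt.Rows.Add("98765", "D");

// distinct: true, keeping only the "SERIAL NUMBER" column.
DataTable distinctSerials = dt.DefaultView.ToTable(true, "SERIAL NUMBER");

Console.WriteLine(distinctSerials.Rows.Count);    // 2
Console.WriteLine(distinctSerials.Columns.Count); // 1
```

Because the other columns are dropped, this approach fits when only the distinct key values are needed, not the full rows.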
Using LINQ, a GroupBy would be better suited, by the sounds of it.
var groups = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER")).Select(g => new { Key = g.Key, Items = g });
This yields one group per serial number. The rows in each group share the same serial number but may differ in their other column values.
Try this:
List<string> distinctValues = (from row in dt.AsEnumerable() select row.Field<string>("SERIAL NUMBER")).Distinct().ToList();
This also works for me:
List<string> distinctValues = dt.AsEnumerable().Select(row => row.Field<string>("SERIAL NUMBER")).Distinct().ToList();

How to convert to int and then compare in linq query c#

IEnumerable<classB> list = getItems();
//dt is datatable
list = list.Where(x => Convert.ToInt32( !dt.Columns["Id"]) == (x.Id));
I want to keep only the items in the list whose Id matches a value in the DataTable's Id column; the rest should be removed. I'm not doing it right.
The datatable can have: ID - 1,3,4,5,7
The list can have: ID - 1,2,3,4,5,6,7,8,9,10
I want the output list to have: ID - 1,3,4,5,7
Your code won't work because you're comparing a definition of a column to an integer value. That's not a sensible comparison to make.
What you can do is put all of the values from the data table into a collection that can be effectively searched and then get all of the items in the list that are also in that collection:
var ids = new HashSet<int>(dt.AsEnumerable()
.Select(row => row.Field<int>("Id")));
list = list.Where(x => ids.Contains(x.Id));
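The HashSet approach can be tried end to end with the sample IDs from the question. In this sketch an anonymous type with an Id property stands in for classB, and Enumerable.Range stands in for getItems(); both are assumptions made only to keep the example self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

var dt = new DataTable();
dt.Columns.Add("Id", typeof(int));
foreach (var id in new[] { 1, 3, 4, 5, 7 })
    dt.Rows.Add(id);

// Hypothetical stand-in for getItems(): items with Id 1 through 10.
var items = Enumerable.Range(1, 10).Select(i => new { Id = i });

// Collect the table's Id values once; HashSet lookups are O(1) on average.
var ids = new HashSet<int>(dt.AsEnumerable().Select(row => row.Field<int>("Id")));

var kept = items.Where(x => ids.Contains(x.Id)).ToList();

Console.WriteLine(string.Join(",", kept.Select(x => x.Id))); // 1,3,4,5,7
```

Compared with a List<int>, the HashSet avoids scanning every ID for each item, which matters once the table grows.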
Try this one
var idList = dt.AsEnumerable().Select(d => (int) d["Id"]).ToList();
list = list.Where(x => idList.Contains(x.Id));
You can't do it like that. dt.Columns["Id"] returns the DataColumn, not the value stored in that column for a specific DataRow. You need a join between two LINQ queries: the one you already have, and one that reads the values from the DataTable.
var queryDt = (from dtRow in dt.AsEnumerable()
where !dtRow.IsNull("Id")
select Convert.ToInt32(dtRow["Id"])).ToList();
Now the join:
var qry = from nonNull in queryDt
join existing in list on nonNull equals existing.Id
select existing;

Linq with DataTable .ToList() very slow

facts.UnderlyingDataTable is a DataTable
var queryResults4 = //get all facts
(from f in facts.UnderlyingDataTable.AsEnumerable()
where f.RowState != DataRowState.Deleted &&
FactIDsToSelect.Contains(f.Field<int>("FactID"))
select f);
var queryResults5 = (from f in queryResults4.AsEnumerable()
orderby UF.Rnd.Next()
select f);
return queryResults5.ToList();
The problem is this line: queryResults5.ToList();
It returns a list of DataRows, but it is very slow to do so.
I am happy to return any object that implements IEnumerable. What should I do? It seems the conversion from whatever the var is to List<DataRow> is slow.
Thanks for your time.
First, it is not ToList itself that is slow but the query that gets executed inside it, since a LINQ query is only evaluated when it is enumerated. So maybe your DataTable contains many rows. I also assume that FactIDsToSelect is large, which makes the Contains check for every row slow.
You could use CopyToDataTable to create a new DataTable with the same schema instead of a List, since that is more natural for an IEnumerable<DataRow>. However, as I have mentioned, that would not solve your performance issue.
You could optimize the query with a Join which is much more efficient:
var q = from row in UnderlyingDataTable.AsEnumerable()
where row.RowState != DataRowState.Deleted
join id in FactIDsToSelect
on row.Field<int>("FactID") equals id
select row;
var newTable = q.CopyToDataTable();
See also: Why is LINQ JOIN so much faster than linking with WHERE?
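Please try the following, where id is a single FactID value:
List<DataRow> list = new List<DataRow>(UnderlyingDataTable.Select("FactID = " + id.ToString(), "", DataViewRowState.Unchanged));
The overload of Select that takes a DataViewRowState also takes a sort expression as its second argument; an empty string leaves the rows unsorted. You may need to change the DataViewRowState argument.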

Use LINQ to get datatable row numbers meeting certain conditions

How can I get an array of DataTable row numbers that meet certain criteria? For example, I have a DataTable with a column "DateTime". I want to retrieve the row numbers of the DataTable where "DateTime" equals the variable startTime.
I know how to retrieve the actual row, but not the number of the row in the datatable.
Any help will be appreciated :)
int count = tblData.AsEnumerable()
.Count(row => row.Field<DateTime>("DateTime").Equals(startTime));
or as query:
int count = (from row in tblData.AsEnumerable()
where row.Field<DateTime>("DateTime").Equals(startTime)
select row).Count();
If I am reading the question right, using the overload of Select that allows a second input for the index may work for you. Something like
var indices =
table.AsEnumerable()
.Select((row, index) => new { row, index })
.Where(item => item.row.Field<DateTime?>("DateTime") == startTime)
.Select(item => item.index)
.ToArray();
If that date matches on the first, third, and sixth rows, the array will contain indices 0, 2, and 5. You can, of course, add 1 to each index in the query if you would like row numbers to start at 1. (ex: .Select(item => item.index + 1))
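The indexed Select can be checked against a small demo table. The timestamps below are made up for illustration; only the column name and the startTime variable come from the question:

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical sample data.
var startTime = new DateTime(2020, 1, 1, 8, 0, 0);
var table = new DataTable();
table.Columns.Add("DateTime", typeof(DateTime));
table.Rows.Add(startTime);             // index 0: match
table.Rows.Add(startTime.AddHours(1)); // index 1: no match
table.Rows.Add(startTime);             // index 2: match
table.Rows.Add(DBNull.Value);          // index 3: missing value, no match

int[] indices = table.AsEnumerable()
    .Select((row, index) => new { row, index })
    .Where(item => item.row.Field<DateTime?>("DateTime") == startTime)
    .Select(item => item.index)
    .ToArray();

Console.WriteLine(string.Join(", ", indices)); // 0, 2
```

Using DateTime? rather than DateTime keeps the query from throwing on rows where the column holds DBNull.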
This is not possible. Note that with SQL (I assume you use SQL), the row order returned is not guaranteed. Your rows are ordered physically according to the primary key. So if you want a reliable row identifier, you must use your primary key number/id.

Get an array of IDs(values) from a datatable

I have a DataTable with 50 rows and an ID column. I am trying to get an array that holds only the IDs, like:
string [] IDs = (from row in DataTable.Rows
select row["ID"].toString()).ToArray();
Is there a way to do this? I always get the error "Could not find an implementation of the query...."
Use the DataTableExtensions.AsEnumerable method by adding a reference to System.Data.DataSetExtensions and a using System.Data; directive. Then you should be able to use the following query:
var query = from row in datatable.AsEnumerable()
select row["ID"].ToString();
string[] ids = query.ToArray();
If you really need an array you can use the last line above or enclose the query in parentheses and call ToArray() as you did originally. I'm generally not a fan of the latter approach.
In fluent syntax it would be:
string[] ids = datatable.AsEnumerable()
.Select(row => row["ID"].ToString())
.ToArray();
Is there any way to select the DataTable rows into an array of custom objects, assuming all the columns are going to be the same?
