How to sort a list<datarow>? - c#

I have a datatable which I will convert to a list<datarow>. How do I sort the list based on some datatable column? I think it's something like list = list.Sort(p=>p.Field() but I am not sure about the syntax.
I am interested in using LINQ heavily, so I was wondering if I should convert the datatable to a strongly typed generic list and take it from there instead of using LINQ to DataSet. The only reason I haven't done this so far is performance: the datatable consists of about 20,000 records.

I'd recommend you convert the dataset to a generic list and use LINQ to sort:
var collection =
from c in people
orderby c.Name ascending
select c;
return collection.ToList();
Where people is the generic List. There are other ways to sort using linq.
Are you always going to return 20k records? The way I understand it, there's more overhead with datatables/sets/rows than with generic lists due to all the built-in .NET functions and methods...but I might be wrong.

In general if you want to sort a List<T> you could do something like
list.Sort((a, b) => String.Compare(a.StringValue, b.StringValue));
Obviously this example is sorting string values alphabetically, but you should get the idea of the syntax to use.

You could use something like:-
list.Sort((x,y) => (int)x[i] > (int)y[i] ? 1 : ((int)x[i] < (int)y[i] ? -1 : 0));
Where i is the index of the column you want to compare. In the example above the i column contains integers. If it contained strings you could do something like
list.Sort((x,y) => string.Compare((string)x[i], (string)y[i]));
Note that the Sort method sorts the list in place, it does not create a new list.
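To tie the answers above back to the original List<DataRow> question, here is a minimal sketch of both approaches using the typed Field<T> accessor. The sample table and its "Name" and "Age" columns are made up for illustration, and the case-insensitive comparer is just one possible choice.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataRowSorting
{
    // Hypothetical sample data; the column names are assumptions.
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(int));
        table.Rows.Add("Carol", 25);
        table.Rows.Add("alice", 29);
        table.Rows.Add("Bob", 35);
        return table;
    }

    // LINQ: builds a new sorted list and leaves the table's row order untouched.
    public static List<DataRow> SortedByName(DataTable table)
    {
        return table.AsEnumerable()
                    .OrderBy(r => r.Field<string>("Name"), StringComparer.OrdinalIgnoreCase)
                    .ToList();
    }

    // List<T>.Sort: reorders the list that is passed in, in place.
    public static void SortByAgeInPlace(List<DataRow> rows)
    {
        rows.Sort((a, b) => a.Field<int>("Age").CompareTo(b.Field<int>("Age")));
    }
}
```

SortedByName hands back a fresh list, while SortByAgeInPlace mutates the list passed to it, which is the in-place behaviour noted above.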

Related

C# LINQ Most Efficient Way to Sort Dates DESC with NULL First

I have a dataset containing a nullable datetime field. I want to sort that field descending but with NULL values first.
This code works and returns exactly what I want:
var groupedResult = fullResults.OrderBy(c => c.ClientName).ThenBy(t => t.ContactName).ThenByDescending(d => d.EndDate ?? DateTime.MaxValue);
I'm new to LINQ so what I'm wondering is if there is a more efficient or preferred way of achieving this same result.
In short, likely not. LINQ will put all of the order clauses together and run them as efficiently as possible for you. It will be up to the engine to figure out how to do it efficiently (SQL/In-Memory/etc.).
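One alternative to the DateTime.MaxValue sentinel, sketched here for in-memory collections only: sort on EndDate.HasValue first, so nulls are grouped ahead of real dates, then sort the dates descending. The Contact class is a hypothetical stand-in for the rows in the question, and whether a given LINQ provider translates this to SQL is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; only the three properties used in the question are modelled.
public class Contact
{
    public string ClientName { get; set; }
    public string ContactName { get; set; }
    public DateTime? EndDate { get; set; }
}

public static class NullFirstSort
{
    public static List<Contact> Sort(IEnumerable<Contact> source)
    {
        return source
            .OrderBy(c => c.ClientName)
            .ThenBy(c => c.ContactName)
            .ThenBy(c => c.EndDate.HasValue)      // false (null) sorts before true
            .ThenByDescending(c => c.EndDate)     // newest real date first
            .ToList();
    }
}
```

Unlike the sentinel version, this keeps a null EndDate distinct from a row whose EndDate really is DateTime.MaxValue.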

Get a list of unique strings from duplicate entries

I'm looking for a LINQ function that returns a list of unique strings from a list of objects which contain these strings. The strings of the objects are not unique. Like this:
List before:
name="abc",value=3
name="xyz",value=5
name="abc",value=9
name="hgf",value=0
List this function would return:
"abc","xyz","hgf"
Does such a function even exist? Of course I know how I could implement this manually, but I was curious if LINQ can do this for me.
var foo = list.Select(p => p.name).Distinct().ToList();
You could use the Distinct extension method. So basically you will first project the original objects into a collection of strings and then apply the Distinct method:
string[] result = source.Select(x => x.name).Distinct().ToArray();
(from o in objectList
select o.name).Distinct();
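For completeness, a self-contained sketch of the Select-then-Distinct approach run against the sample data from the question. The Entry class is a hypothetical shape for the objects in the list; its lowercase property names simply mirror the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type matching the name/value pairs in the question.
public class Entry
{
    public string name { get; set; }
    public int value { get; set; }
}

public static class UniqueNames
{
    // Project each object to its name, then drop repeated names.
    public static List<string> Get(IEnumerable<Entry> entries)
    {
        return entries.Select(e => e.name).Distinct().ToList();
    }

    // The sample list from the question.
    public static List<Entry> Sample()
    {
        return new List<Entry>
        {
            new Entry { name = "abc", value = 3 },
            new Entry { name = "xyz", value = 5 },
            new Entry { name = "abc", value = 9 },
            new Entry { name = "hgf", value = 0 },
        };
    }
}
```

Distinct also has an overload taking an IEqualityComparer<T>, for example StringComparer.OrdinalIgnoreCase, if names that differ only by case should count as duplicates.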

Getting distinct and ordered members from a list of strings - linq or hashset for unique which one is faster / better suited

I have a big list of strings (about 5k-20k entries) that I need to order and also to remove duplicates from.
I've done this in 2 ways now, once with a hashset and once solely with linq. Tests with that number of entries did not show a big difference but I'm wondering what way and thus what method would be better suited.
For the ways (myList is of the datatype List<string>):
Linq: I'm using 1 linq statement to order the list and get the distinct values from it.
myList = myList.OrderBy(q => q).Distinct().ToList();
Hashset: I'm using hashset to remove all duplicates and then I'm ordering the list
myList = new HashSet<String>(myList).ToList<String>();
myList = myList.OrderBy(q => q).ToList();
Like I said tests I made were about the same time consumption for both methods but I'm still wondering if one method is better than the other and if so why (the code is for a high performance part and I need to get every millisecond I can out of it).
If you're really concerned about every nanosecond, then
myList = myList.Distinct().OrderBy(q => q).ToList();
might be slightly faster than:
myList = myList.OrderBy(q => q).Distinct().ToList();
if there are a large number of duplicates.
The LINQ method is more readable and will have similar performance to explicitly creating a HashSet<T> as others have said. In fact it may be slightly faster if the original List is already sorted, since the LINQ method will preserve the initial order before sorting, while explicitly creating a HashSet<T> will enumerate in an undefined order.
They are pretty much the same. Distinct also uses a Set<T> to eliminate duplicates. My suggestion is use the Distinct first then sort your items. Also in your second code, ToList<String> call is redundant, you can use OrderBy on HashSet then call ToList.
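A small sketch of the two variants discussed above, both removing duplicates before sorting so that the sort handles fewer elements. The use of StringComparer.Ordinal is an assumption: it is typically cheaper than the culture-sensitive default, but it orders by character code rather than by linguistic rules, so check that this ordering is acceptable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctSorted
{
    // LINQ only: dedupe first, then sort the smaller sequence.
    public static List<string> ViaLinq(IEnumerable<string> items)
    {
        return items.Distinct().OrderBy(s => s, StringComparer.Ordinal).ToList();
    }

    // HashSet for dedupe, then an in-place List<T>.Sort on the unique values.
    public static List<string> ViaHashSet(IEnumerable<string> items)
    {
        var unique = new List<string>(new HashSet<string>(items));
        unique.Sort(StringComparer.Ordinal);
        return unique;
    }
}
```

Both methods return the same list for the same input; which one is faster on real data is best settled by measuring with the actual 5k to 20k entries.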

Sorting a list in .Net by one field, then another

I have list of objects I need to sort based on some of their properties. This works fine to sort it by one field:
reportDataRows.Sort((x, y) => x["Comment1"].CompareTo(y["Comment1"]));
foreach (var row in reportDataRows) {
...
}
I see lots of examples on here that do this with only one field. But how do I sort by one field, then another? Or how about a list of many fields? It seems like using LINQ orderby thenby would be best, but I don't know enough about it to know how use it.
For the parameters, something like this that supports any number of fields to sort by would be nice:
var sortBy = new List<string>(){"Comment1","Time"};
I don't want to be writing code to do this in every one of my apps. I plan on moving this sort code to the class that holds the data so that it can do more advanced things like using a list of parameters and implicitly recognizing that the field is a date and sorting it as a date instead of a string. The reportDataRow object contains fields with this information, so I don't have to do any messy checks to find out if the field is supposed to be a date.
Yes, I think it makes more sense to use OrderBy and ThenBy:
foreach (var row in reportDataRows.OrderBy(x => x["Comment1"]).ThenBy(x => x["Comment2"]))
{
...
}
This assumes the other thing you want to order by is "Comment2".
Try this:
reportDataRows.Sort((x, y) =>
{
var compare = x["Comment1"].CompareTo(y["Comment1"]);
if(compare != 0)
return compare;
return x["Comment2"].CompareTo(y["Comment2"]);
});
You may want to look at this previous answer where I posted an extension method which handles multiple order by's in LINQ. This allows this sort of syntax:
myList.OrderByMany(x => x.Field1, x => x.Field2);
Look at the example for ThenBy on msdn.
If you're comparing your own objects, then you can implement the IComparable interface.
Otherwise, you can use the IComparer interface.
Using LINQ method syntax:
var sortedRows = reportDataRows.OrderBy(r => r["Comment1"])
.ThenBy(r => r["AnotherField"]);
foreach (var row in sortedRows) {
...
}
And even more readable using query comprehension syntax:
var sortedRows = from r in reportDataRows
orderby r["Comment1"], r["Comment2"]
select r;
foreach (var row in sortedRows) {
...
}
You got it. Enumerable.OrderBy().ThenBy() is your ticket. It works exactly like it looks; elements are sorted by each projection, with ties decided by comparing the next projection. You can chain as many ThenBys as you want, and there are also OrderByDescending and ThenByDescending methods that will sort that projection in descending order.
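Because ThenBy calls can be chained, the list-of-field-names idea from the question can be sketched as a loop that keeps appending ThenBy. The row type here, a Dictionary<string, IComparable>, is an assumption made so the example is self-contained; the real reportDataRows type is not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultiFieldSort
{
    // Sorts rows by each named field in turn; later fields only break ties.
    public static List<Dictionary<string, IComparable>> SortBy(
        IEnumerable<Dictionary<string, IComparable>> rows,
        IReadOnlyList<string> fields)
    {
        if (fields.Count == 0)
            return rows.ToList();

        string first = fields[0];
        IOrderedEnumerable<Dictionary<string, IComparable>> ordered =
            rows.OrderBy(r => r[first]);

        foreach (string field in fields.Skip(1))
            ordered = ordered.ThenBy(r => r[field]);

        return ordered.ToList();
    }
}
```

Called as MultiFieldSort.SortBy(rows, new List<string> { "Comment1", "Time" }), this sorts by Comment1 and uses Time to order rows with equal comments. All values in one field need to be of the same comparable type, otherwise CompareTo throws at run time.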
As Albin has pointed out, an OrderBy chain does not touch the original list unless you assign the result of the ordering back to the original variable, like this:
reportDataRows = reportDataRows.OrderBy(x=>x.Comment1).ThenBy(x=>x.Comment2).ToList();
As a rule, OrderBy will perform slightly slower than List.Sort(); the algorithm is designed to work on any IEnumerable series of elements, so in order to sort (which requires knowing every element of the series) it slurps its entire source enumerable into a new array. However, OrderBy has a distinct advantage over Sort in that it is a "stable" sort; elements that are exactly equal to each other will retain their "relative order" in the sorted enumerable (the first of the two that you'd encounter when iterating through the unsorted list will be the first of the two encountered when iterating through the sorted list).
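The stability point can be illustrated with a short sketch. SortByGroup relies on OrderBy keeping equal-keyed items in their original order; StableSortInPlace is a hypothetical helper that gets the same guarantee out of List<T>.Sort by using each item's original index as a tie-breaker.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StableSortDemo
{
    // OrderBy is stable: items with the same Group keep their input order.
    public static List<(string Name, int Group)> SortByGroup(
        IEnumerable<(string Name, int Group)> items)
    {
        return items.OrderBy(i => i.Group).ToList();
    }

    // List<T>.Sort makes no stability promise, so ties are broken by original index.
    public static void StableSortInPlace<T>(List<T> list, Comparison<T> comparison)
    {
        var indexed = list.Select((item, index) => (item, index)).ToList();
        indexed.Sort((x, y) =>
        {
            int c = comparison(x.item, y.item);
            return c != 0 ? c : x.index.CompareTo(y.index);
        });
        for (int i = 0; i < list.Count; i++)
            list[i] = indexed[i].item;
    }
}
```

The index tie-breaker costs an extra temporary list, so it only makes sense when in-place sorting matters and the relative order of equal elements must be preserved.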

How to deal with datatypes returned by LINQ

I'm new to LINQ, I've used LINQ to SQL to link to two tables, it does return data, which is cool. What I'm trying to understand is what datatype is being returned and how do I work with this datatype?
I'm used to dealing with datatables. Are we throwing out datatables (and all the other ADO.NET objects like rows, datasets etc.) now if using LINQ? If so, what are we replacing them with, and how can I use it to do everything I did before with datatables? Also, does it make sense to replace datatables? Was there a deficiency with them?
Here is some code:
protected IEnumerable<string> GetMarketCodes()
{
LINQOmniDataContext db = new LINQOmniDataContext();
var mcodes = from p in db.lkpMarketCodes
orderby 0
select p;
return (IEnumerable<string>) mcodes;
}
This code does currently return data (I can see it in debug), but errors at the "return" line, because apparently my datatype is not IEnumerable<string>, which was my best guess. So, one thing I'd like to understand as well is what datatype my data is being put into and how to return it to the calling function.
It is returning an IQueryable<lkpMarketCode>, assuming that that lkpMarketCode is the type of data in db.lkpMarketCodes. If you want the strings, you need to select p.SomeProperty;, not just select p;.
You shouldn't need to cast (since IQueryable<T> implements IEnumerable<T>); it should also tell you this if you hover on mcodes.
I find it more convenient to return List<>'s so I know what I'm dealing with. So your code would be:
protected List<string> GetMarketCodes()
{
LINQOmniDataContext db = new LINQOmniDataContext();
var mcodes = from p in db.lkpMarketCodes
orderby 0
select p.SomeProperty;
return mcodes.ToList();
}
Having said that, I've hardly used LINQ-to-SQL so there are probably better ways around..
It's returning an IQueryable object.
What does your table look like? I'm guessing the error is because your lkpMarketCodes table is not just one string column. It's returning the whole table.
If you want to return just an IEnumerable of strings, you'll have to return something that looks like this (I'm sure the syntax is a bit off):
var mcodes = from p in db.lkpMarketCodes
orderby 0
select p.StringColumnName;
LINQ to SQL queries return IQueryable<T>, which extends IEnumerable<T>. The reason you are getting an error is that your query is not returning an IQueryable<string>; it's returning an IQueryable<lkpMarketCode>. lkpMarketCode is most likely a class whose instances each represent one row of the table.
LINQ to SQL is an object-relational mapper: it maps columns and rows to fields and objects.
You can do pretty much all the same things that you could in ADO, but it works with objects rather than generic rows, so it's more type safe.
In your example, i'm going to assume that lkpMarketCodes is a table, and that table consists of at least two fields, mcode and description.
If you want to return an IEnumerable<string> of mcode's, you would do something like this:
protected IEnumerable<string> GetMarketCodes()
{
LINQOmniDataContext db = new LINQOmniDataContext();
var mcodes = from p in db.lkpMarketCodes
orderby 0
select p.mcode;
return mcodes;
}
This will return your IEnumerable<string> of codes. One trick you can use to find out types is to simply use the variable after its declaration, then hover your mouse over the variable name and a popup will tell you its type.
That is, if you hover over the return mcodes, it will tell you the type, but it will not tell you the type if you hover over the var mcodes.
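The type relationship described in these answers can be checked without a database. In this sketch an in-memory array wrapped with AsQueryable() stands in for db.lkpMarketCodes, and the MarketCode class with its mcode and description properties is hypothetical. It also orders by mcode instead of the constant 0 used in the question, which is a deliberate change so that the ordering does something.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity standing in for one row of the lkpMarketCodes table.
public class MarketCode
{
    public string mcode { get; set; }
    public string description { get; set; }
}

public static class MarketCodeQueries
{
    public static IEnumerable<string> GetMarketCodes(IQueryable<MarketCode> table)
    {
        // Selecting a string property makes the query an IQueryable<string>.
        IQueryable<string> mcodes = from p in table
                                    orderby p.mcode
                                    select p.mcode;

        // IQueryable<string> already is an IEnumerable<string>, so no cast is needed.
        return mcodes;
    }
}
```

Declaring the local as IQueryable<string> instead of var makes the compiler confirm the element type, which serves the same purpose as the hover trick described above.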
