Using .Select and .Where in a single LINQ statement - c#

I need to gather distinct Ids from a particular table using LINQ. The catch is that I also need a WHERE clause that filters the results based only on the requirements I've set. I'm relatively new to using LINQ this much, but I'm using the following code, more or less:
private void WriteStuff(SqlHelper db, EmployeeHelper emp)
{
String checkFieldChange;
AnIList tableClass = new AnIList(db, (int)emp.PersonId);
var linq = tableClass.Items
.Where(
x => x.UserId == emp.UserId
&& x.Date > DateBeforeChanges
&& x.Date < DateAfterEffective
&& (
(x.Field == Inserted)
|| (x.Field == Deleted))
).OrderByDescending(x => x.Id);
if (linq != null)
{
foreach (TableClassChanges item in linq)
{
AnotherIList payTxn = new AnotherIList(db, item.Id);
checkFieldChange = GetChangeType(item.FieldName);
// Other codes that will retrieve data from each item
// and write it into a text file
}
}
}
I tried to add .Distinct to var linq but it still returns duplicate items (meaning items with the same Id). I've read through a lot of sites and have tried adding a .Select to the query, but then the .Where clause breaks instead. There are other articles where the query differs in the way it retrieves the values and places them in a var. I also tried to use .GroupBy, but I get an "At least one object must implement IComparable" error when using Id as the key.
The query actually works and I'm able to output the data from the columns with the specifications I require, but I just can't seem to make .Distinct work (which is the only thing really missing). I tried to create two vars, with one triggering a distinct call, and then use a nested foreach to ensure the values are unique, but with thousands of records to gather the performance impact is just too much.
I'm unsure as well if I'd have to override or use IEnumerable for my requirement, and thought I'd ask the question around just in case there's an easier way, or if it's possible to have both .Select and .Where working in just one statement?

Did you add the Select() after the Where() or before?
You should add it after, because of the order in which the operators are applied:
1 Take the entire table
2 Filter it accordingly
3 Select only the ID's
4 Make them distinct.
If you do a Select first, the Where clause can only contain the ID attribute because all other attributes have already been edited out.
Update: For clarity, this order of operators should work:
db.Items.Where(x=> x.userid == user_ID).Select(x=>x.Id).Distinct();
You probably want to add a .ToList() at the end, but that's optional :)
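A minimal sketch of that operator order, using LINQ to Objects over an in-memory list. The `Change` class, its properties and the `DistinctIdsFor` helper are hypothetical stand-ins for the question's table rows, not code from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one row of the table in the question.
public class Change
{
    public int Id { get; set; }
    public int UserId { get; set; }
}

public static class DistinctIdsSketch
{
    // Filter first, project to the Id, then de-duplicate the plain ints.
    public static List<int> DistinctIdsFor(IEnumerable<Change> items, int userId)
    {
        return items
            .Where(x => x.UserId == userId) // all columns are still available here
            .Select(x => x.Id)              // from here on only the Id remains
            .Distinct()                     // ints compare by value, so no comparer is needed
            .ToList();
    }
}
```

Because Distinct runs on the projected ints rather than on the row objects, no custom equality logic is required in this variant.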

In order for Enumerable.Distinct to work for your type, you can implement IEquatable<T> and provide suitable definitions for Equals and GetHashCode, otherwise it will use the default implementation: comparing for reference equality (assuming that you are using a reference type).
From the manual:
The Distinct<TSource>(IEnumerable<TSource>) method returns an unordered sequence that contains no duplicate values. It uses the default equality comparer, Default, to compare values.
The default equality comparer, Default, is used to compare values of the types that implement the IEquatable<T> generic interface. To compare a custom data type, you need to implement this interface and provide your own GetHashCode and Equals methods for the type.
In your case it looks like you might just need to compare the IDs, but you may also want to compare other fields too depending on what it means for you that two objects are "the same".
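As a rough sketch of the IEquatable<T> route, assuming that two rows count as "the same" whenever their Ids match: the `TableRow` type and its members below are hypothetical and only illustrate the shape of the implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: two instances are considered equal when their Ids match.
public sealed class TableRow : IEquatable<TableRow>
{
    public int Id { get; set; }
    public string FieldName { get; set; }

    public bool Equals(TableRow other)
    {
        return other != null && Id == other.Id;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as TableRow);
    }

    // Must agree with Equals: equal objects have to return equal hash codes.
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public static class EquatableSketch
{
    // Distinct() picks up the IEquatable<TableRow> implementation via the default comparer.
    public static List<TableRow> DistinctRows(IEnumerable<TableRow> rows)
    {
        return rows.Distinct().ToList();
    }
}
```

If other fields should also take part in the comparison, both Equals and GetHashCode would need to include them.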
You can also consider using DistinctBy from morelinq.
Note that this is LINQ to Objects only, but I assume that's what you are using.
Yet another option is to combine GroupBy and First:
var query = // your query here...
.GroupBy(x => x.Id)
.Select(g => g.First());
This would also work in LINQ to SQL, for example.

Since you are trying to compare two different objects, you will first need to implement the IEqualityComparer<T> interface. Here is example code for a simple console application that uses Distinct with a simple implementation of IEqualityComparer<T>:
class Program
{
static void Main(string[] args)
{
List<Test> testData = new List<Test>()
{
new Test(1,"Test"),
new Test(2, "Test"),
new Test(2, "Test")
};
var result = testData.Where(x => x.Id > 1).Distinct(new MyComparer());
}
}
public class MyComparer : IEqualityComparer<Test>
{
public bool Equals(Test x, Test y)
{
return x.Id == y.Id;
}
public int GetHashCode(Test obj)
{
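// Hash only the Id so that the hash code agrees with Equals above;
// hashing Name as well would let rows with equal Ids slip past Distinct.
return obj.Id.GetHashCode();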
}
}
public class Test
{
public Test(int id, string name)
{
this.id = id;
this.name = name;
}
private int id;
public int Id
{
get { return id; }
set { id = value; }
}
private string name;
public string Name
{
get { return name; }
set { name = value; }
}
}
I hope that helps.

Did you pass an IEqualityComparer<T> to .Distinct()?
Something like this:
internal abstract class BaseComparer<T> : IEqualityComparer<T> {
public bool Equals(T x, T y) {
return GetHashCode(x) == GetHashCode(y);
}
public abstract int GetHashCode(T obj);
}
internal class DetailComparer : BaseComparer<MyClass> {
public override int GetHashCode(MyClass obj) {
return obj.ID.GetHashCode();
}
}
Usage:
list.Distinct(new DetailComparer())

You can easily query with LINQ like this
considering this JSON
{
"items": [
{
"id": "10",
"name": "one"
},
{
"id": "12",
"name": "two"
}
]
}
putting it in a variable called json like this,
JObject json = JObject.Parse("{'items':[{'id':'10','name':'one'},{'id':'12','name':'two'}]}");
you can select all ids from the items where name is "one" using the following LINQ query
var Ids =
from item in json["items"]
where (string)item["name"] == "one"
select item["id"];
Then you will have the result as an IEnumerable<JToken>.

Related

How to select the same list but with an additional variable set?

I have a List<MyObject>
MyObject is as follows.
public class MyObject
{
public bool available;
public bool online;
}
Now when I retrieve this List<MyObject> from another function, only the available field is set. Now I want to set the online field of each MyObject.
What I currently do is
List<MyObject> objectList = getMyObjectList();
objectList.ForEach(x => x.online = IsOnline(x));
Now that the online property is set, I want to filter again using Where to select MyObject that is both available and online.
objectList.Where(x => x.available && x.online);
While I understand that the code above is working and readable, I am curious to know whether there is a LINQ way of selecting the same object but with a variable initialized so I can combine all the three lines to one line. Unfortunately ForEach does not return the list back.
Something like
getMyObjectList().BetterForEach(x => x.online = IsOnline(x)).Where(x => x.available && x.online);
BetterForEach will return x with the previous values set and with the online field set as well.
Is there any function or other way to achieve this using LINQ?
UPDATE
I've removed other fields of MyObject. MyObject does not only contain these fields but many more. I'd rather not create new instances of MyObject.
The simplest solution is probably to make an extension method that is like ForEach but returns the list for chaining:
public static List<T> ForEachThen<T>(this List<T> source, Action<T> action)
{
source.ForEach(action);
return source;
}
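A short sketch of how such a chaining helper could be used for the question's case. The `isOnline` delegate is a hypothetical stand-in for the question's IsOnline method, and `AvailableAndOnline` is an illustrative wrapper only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public bool available;
    public bool online;
}

public static class ChainSketch
{
    // Same idea as ForEach, but hands the list back so calls can be chained.
    public static List<T> ForEachThen<T>(this List<T> source, Action<T> action)
    {
        source.ForEach(action);
        return source;
    }

    // Sets online on every existing instance, then filters; no new MyObject is created.
    public static List<MyObject> AvailableAndOnline(List<MyObject> objects, Func<MyObject, bool> isOnline)
    {
        return objects
            .ForEachThen(x => x.online = isOnline(x))
            .Where(x => x.available && x.online)
            .ToList();
    }
}
```

Note that the original instances are mutated in place, which is what the question asks for but is worth keeping in mind when the list is shared.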
LINQ is for querying data, not for updating it. So none of the options will be very pretty, but there are still some.
You could do this:
var result =
objectList.Select(x =>
{
x.online = IsOnline(x);
return x;
});
However, that is pretty bad practice. This would be better:
var result =
objectList.Select(x => new MyObject
{
available = x.available,
online = IsOnline(x)
});
But this creates a collection of new objects, not related to your original set.
This kind of code generally indicates there might be something wrong with your basic design. Personally, I'd go with something like this (if you can set up a static method to do the work of IsOnline):
public class MyObject
{
public bool Available;
public bool Online { get { return MyObjectHelper.IsOnline(this); } }
}
...
var result = objectList.Where(x => x.Available && x.Online);
Or if you can't set up a static method, maybe this field doesn't need to be in MyObject class at all?
public class MyObject
{
public bool Available;
}
...
var result = objectList.Where(x => x.Available && IsOnline(x));
var selected = getMyObjectList()
.Select(x => new MyObject{available=x.available, online = IsOnline(x)})
.Where(x => x.available && x.online);
assuming you need to access online in the resulting list.
You could also add a method to MyObject like:
public MyObject SetOnline(bool isOnline) {
this.online = isOnline;
return this;
}
and then do:
var selected = getMyObjectList()
.Select(x => x.SetOnline( IsOnline(x) ))
.Where(x => x.available && x.online);

Filter and keep first object of a List of objects with properties that match

I apologize upfront, because I now realize that I worded my example completely wrong. For those who have given responses, I truly appreciate it. Please let me re-attempt to explain with more accurate details. Please edit your responses, and once again, I apologize for not being more exact in my previous posting.
Using an entity framework model class called Staging (which is a representation of my Staging table), I have the following List<Staging>.
List<Staging> data = (from t in database.Stagings select t).ToList();
//check for an empty List...react accordingly...
Here is a quick look at what Staging looks like:
public partial class Staging
{
public int ID { get; set; } //PK
public int RequestID { get; set; } //FK
...
public string Project { get; set; }
...
}
Let us suppose that the query returns 10 records into my data list. Let us also suppose that data[3], data[6], and data[7] each have the same value in data.Project, let's say "Foo". The data.Project value is not known until runtime.
Given this, how would I keep the first occurrence, data[3], and remove data[6] and data[7] from my List<Staging>?
Edit:
I have the following code that works, but is there another way?
HashSet<string> projectValuesFound = new HashSet<string>();
List<Staging> newData = new List<Staging>();
foreach (Staging entry in data)
{
if (!projectValuesFound.Contains(entry.Project))
{
projectValuesFound.Add(entry.Project);
newData.Add(entry);
}
}
You can do this via LINQ and a HashSet<T>:
var found = new HashSet<string>();
var distinctValues = theList.Where(mc => found.Add(mc.Var3));
// If you want to assign back into the List<T> again:
// theList = distinctValues.ToList();
This works because HashSet<T>.Add returns true if the value was not already in the set, and false if it already existed. As such, you'll only get the first "matching" value for Var3.
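Applied to the question's data, the HashSet approach might look like the sketch below. The `Staging` class here is a cut-down version of the question's entity, and `KeepFirstPerProject` is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified version of the Staging entity from the question.
public class Staging
{
    public int ID { get; set; }
    public string Project { get; set; }
}

public static class FirstPerProjectSketch
{
    public static List<Staging> KeepFirstPerProject(IEnumerable<Staging> data)
    {
        var seen = new HashSet<string>();
        // Add returns false for a Project that was already seen, so later duplicates are dropped.
        // ToList() runs the query exactly once; enumerating the lazy query a second time
        // against the same set would yield nothing, because every value is already in it.
        return data.Where(s => seen.Add(s.Project)).ToList();
    }
}
```

Because the predicate has a side effect, it is safest to materialize the result immediately rather than keep the deferred query around.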
var uniques = (from item in theList select item.Var3).Distinct();
That will give you distinct values for all entries.
You could use Linq:
var result = (from my in theList where my.Var3 == "Foo" select my).First();
If you also want to keep the other items, you can use Distinct() instead of First(). To use Distinct(), either MyClass must implement IEquatable<T>, or you must provide an IEqualityComparer<T> as shown in the MSDN link.
The "canonical" way to do it would be to pass appropriately implemented comparer to Distinct:
class Var3Comparer : IEqualityComparer<MyClass> {
public int GetHashCode(MyClass obj) {
return (obj.Var3 ?? string.Empty).GetHashCode();
}
public bool Equals(MyClass x, MyClass y) {
return x.Var3 == y.Var3;
}
}
// ...
var distinct = list.Distinct(new Var3Comparer());
Just beware that while current implementation seems to keep the ordering of the "surviving" elements, the documentation says it "returns an unordered sequence" and is best treated that way.
There is also a Distinct overload that doesn't require a comparer - it just assumes the Default comparer, which in turn, will utilize the IEquatable<T> if implemented by MyClass.

Check IEnumerable<T> for items having duplicate properties

How to check if an IEnumerable has two or more items with the same property value ?
For example a class
public class Item
{
public int Prop1 {get;set;}
public string Prop2 {get;set;}
}
and then a collection of type IEnumerable<Item>
I need to return false if there are items with duplicate values in Prop1.
You want to check only for Prop1 right ?
What about:
IEnumerable<Item> items = ...
var noDistinct = items.GroupBy(x => x.Prop1).All(x => x.Count() == 1);
// it returns true if all items have different Prop1, false otherwise
I think this method will work.
public static bool ContainsDuplicates<T1, T2>(this IEnumerable<T1> source, Func<T1, T2> selector)
{
var d = new HashSet<T2>();
foreach(var t in source)
{
if(!d.Add(selector(t)))
{
return true;
}
}
return false;
}
A short, one-enumeration only solution would be:
public static bool ContainsDuplicates<T>(this IEnumerable<T> list)
=> !list.All(new HashSet<T>().Add);
which could be read as: A list has no duplicates when All items can be Add-ed to a set.
This is conceptually similar to Jake Pearson's solution; however, it leaves out the independent concept of projection. The OP's question would then be solved as:
items.Select(o => o.Prop1).ContainsDuplicates()
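Put together with the question's `Item` class, the set-based check could be sketched as follows. `HasUniqueProp1` is a hypothetical wrapper that returns false when a Prop1 value repeats, which is the behaviour the question asks for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Prop1 { get; set; }
    public string Prop2 { get; set; }
}

public static class DuplicateCheckSketch
{
    // True as soon as one element cannot be added to the set, i.e. it was seen before.
    public static bool ContainsDuplicates<T>(this IEnumerable<T> source)
    {
        var seen = new HashSet<T>();
        return !source.All(seen.Add);
    }

    // False when any Prop1 value occurs more than once.
    public static bool HasUniqueProp1(IEnumerable<Item> items)
    {
        return !items.Select(i => i.Prop1).ContainsDuplicates();
    }
}
```

All stops at the first failed Add, so the sequence is enumerated at most once and only up to the first duplicate.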
bool x = list.Distinct().SequenceEqual(list);
x is true if list has no duplicates, and false if it has any.
Have you tried Enumerable.Distinct(IEnumerable, IEqualityComparer)?
You can select the distinct values from the IEnumerable and then check the count against that of the full collection.
Example:
var distinctItemCount = myEnumerable.Select(m => m.Prop1).Distinct().Count();
if(distinctItemCount < myEnumerable.Count())
{
return false;
}
This could potentially be made more performant, but it's the only correct answer so far.
// Create an enumeration of the distinct values of Prop1
var propertyCollection = objectCollection.Select(o => o.Prop1).Distinct();
// If the property collection has the same number of entries as the object
// collection, then all properties are distinct. Otherwise there are some
// duplicates.
return propertyCollection.Count() == objectCollection.Count();
public static class EnumerableEx
{
public static IEnumerable<T> GetDuplicates<T>(this IEnumerable<T> source)
{
return source.GroupBy(t => t).Where(x => x.Count() > 1).Select(x => x.Key);
}
}
Personally, I like the neatness of extension methods.
If your objects don't require a selector for determining equality, then this works nicely.
We can remove duplicate entries by using .Distinct() in ArrayList.
Example:
I have a createdby column in testtable with 5 duplicate entries. I have to get only one row
ID Createdby
=== ========
1 Reddy
2 Reddy
3 Reddy
4 Reddy
Considering the above table, I need to select only one "Reddy"
DataTable table=new DataTable("MyTable");//Actually I am getting this table data from database
DataColumn col=new DataColumn("Createdby");
var childrows = table.AsEnumerable().Select( row => row.Field<object>(col)).Distinct().ToArray();

How to update an element with a List using LINQ and C#

I have a list of objects and I'd like to update a particular member variable within one of the objects. I understand LINQ is designed for query and not meant to update lists of immutable data. What would be the best way to accomplish this? I do not need to use LINQ for the solution if it is not most efficient.
Would creating an Update extension method work? If so how would I go about doing that?
EXAMPLE:
(from trade in CrudeBalancedList
where trade.Date.Month == monthIndex
select trade).Update(
trade => trade.Buy += optionQty);
Although LINQ is not meant to update lists of immutable data, it is very handy for getting the items that you want to update. I think for you this would be:
(from trade in CrudeBalancedList
where trade.Date.Month == monthIndex
select trade).ToList().ForEach( trade => trade.Buy += optionQty);
I'm not sure if this is the best way, but will allow you to update an element from the list.
The test object:
public class SomeClass {
public int Value { get; set; }
public DateTime Date { get; set; }
}
The extension method:
public static class Extension {
public static void Update<T>(this T item, Action<T> updateAction) {
updateAction(item);
}
}
The test:
public void Test()
{
// test data
List<SomeClass> list = new List<SomeClass>()
{
new SomeClass {Value = 1, Date = DateTime.Now.AddDays(-1)},
new SomeClass {Value = 2, Date = DateTime.Now },
new SomeClass {Value = 3, Date = DateTime.Now.AddDays(1)}
};
// query and update
(from i in list where i.Date.Day.Equals(DateTime.Now.Day) select i).First().Update(v => v.Value += 5);
foreach (SomeClass s in list) {
Console.WriteLine(s.Value);
}
}
So you're expecting to get a single result here. In that case you might consider utilizing the SingleOrDefault method:
var record =
(from trade in CrudeBalancedList
where trade.Date.Month == monthIndex
select trade).SingleOrDefault();
if (record != null)
record.Buy += optionQty;
Note that the SingleOrDefault method expects there to be exactly one or zero value returned (much like a row in a table for some unique primary key). If more than one record is returned, the method will throw an exception.
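The three possible outcomes of SingleOrDefault can be sketched like this. The `Trade` class and the `AddToBuy` helper are hypothetical; only the month and Buy members are taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical trade type with just the members the question uses.
public class Trade
{
    public DateTime Date { get; set; }
    public int Buy { get; set; }
}

public static class SingleUpdateSketch
{
    // Returns true when exactly one trade matched and was updated,
    // false when none matched. When more than one trade matches,
    // SingleOrDefault throws an InvalidOperationException.
    public static bool AddToBuy(IEnumerable<Trade> trades, int month, int qty)
    {
        var record = trades.SingleOrDefault(t => t.Date.Month == month);
        if (record == null)
        {
            return false;
        }
        record.Buy += qty;
        return true;
    }
}
```

If several trades per month are expected, FirstOrDefault or a Where followed by a loop would be the safer choice.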
To create such a method, you would start with its prototype:
public static class UpdateEx {
public static void Update<T>(this IEnumerable<T> items,
Expression<Action> updateAction) {
}
}
That's the easy part.
The hard part will be to compile the Expression<Action> into an SQL update statement. Depending on how much syntax you want to support, such a compiler's complexity can range from trivial to impossible.
For an example of compiling Linq Expressions, see the TableQuery class of the sqlite-net project.

Using Except() on a Generic collection

I have asked this question about using a LINQ method that returns one object (First, Min, Max, etc.) from a generic collection.
I now want to be able to use LINQ's Except() method and I am not sure how to do it. Perhaps the answer is right in front of me, but I think I need help.
I have a generic method that fills in missing dates for a corresponding descriptive field. This method is declared as below:
public IEnumerable<T> FillInMissingDates<T>(IEnumerable<T> collection, string datePropertyName, string descriptionPropertyName)
{
Type type = typeof(T);
PropertyInfo dateProperty = type.GetProperty(datePropertyName);
PropertyInfo descriptionProperty = type.GetProperty(descriptionPropertyName);
...
}
What I want to accomplish is this. datePropertyName is the name of the date property I will use to fill in my date gaps (adding default object instances for the dates not already present in the collection). If I were dealing with a non-generic class, I would do something like this:
foreach (string description in descriptions)
{
var missingDates = allDates.Except(originalData.Where(d => d.Description == description).Select(d => d.TransactionDate).ToList());
...
}
But how can I do the same using the generic method FillInMissingDates with the dateProperty and descriptionProperty properties resolved in runtime?
I think the best way would be to define an interface with all of the properties that you want to use in your method. Have the classes that the method may be used in implement this interface. Then, use a generic method and constrain the generic type to derive from the interface.
This example may not do exactly what you want -- it fills in missing dates for items in the list matching a description, but hopefully it will give you the basic idea.
public interface ITransactable
{
string Description { get; }
DateTime? TransactionDate { get; set; } // setter needed: FillInMissingDates assigns it
}
public class CompletedTransaction : ITransactable
{
...
}
// note conversion to extension method
public static void FillInMissingDates<T>( this IEnumerable<T> collection,
string match,
DateTime defaultDate )
where T : ITransactable
{
foreach (var trans in collection.Where( t => t.Description == match ))
{
if (!trans.TransactionDate.HasValue)
{
trans.TransactionDate = defaultDate;
}
}
}
You'll need to Cast your enumeration to ITransactable before invoking (at least until C# 4.0 comes out).
var list = new List<CompletedTransaction>();
list.Cast<ITransactable>()
.FillInMissingDates("description",DateTime.MinValue);
Alternatively, you could investigate using Dynamic LINQ from the VS2008 Samples collection. This would allow you to specify the name of a property if it's not consistent between classes. You'd probably still need to use reflection to set the property, however.
You could try this approach:
public static IEnumerable<DateTime> FillInMissingDates<T>(this IEnumerable<T> collection,
Func<T, DateTime> dateProperty, Func<T, string> descriptionProperty, string desc)
{
return collection.Except(collection
.Where(d => descriptionProperty(d) == desc))
.Select(d => dateProperty(d));
}
This allows you to do things like:
someCollection.FillInMissingDates(o => o.CreatedDate, o => o.Description, "matching");
Note that you don't necessarily need the Except() call, and just have:
.. Where(d => descriptionProperty(d) != desc)
foreach (string description in descriptions)
{
var missingDates = allDates.Except<YourClass>(originalData.Where(d => d.Description == description).Select(d => d.TransactionDate).ToList());
}
In fact, almost all LINQ extension methods in C# have a generic form (Except and Except<T>).
If you're going to identify the property to be accessed by a string name, then you don't need to use generics. Their only purpose is static type safety. Just use reflection to access the property, and make the method work on a non-generic IEnumerable.
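A rough sketch of that reflection-based, non-generic variant, assuming both named properties exist on every item and have the types DateTime and string. The `MissingDates` helper and its parameters are hypothetical and not part of the original question.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class ReflectionDatesSketch
{
    // Returns the dates in allDates that no item with the given description already has.
    // Assumes every item exposes the two named public properties; a missing property
    // would surface as a NullReferenceException in this sketch.
    public static List<DateTime> MissingDates(
        IEnumerable collection,
        IEnumerable<DateTime> allDates,
        string datePropertyName,
        string descriptionPropertyName,
        string description)
    {
        var present = new List<DateTime>();
        foreach (object item in collection)
        {
            Type type = item.GetType();
            PropertyInfo dateProperty = type.GetProperty(datePropertyName);
            PropertyInfo descriptionProperty = type.GetProperty(descriptionPropertyName);
            if ((string)descriptionProperty.GetValue(item, null) == description)
            {
                present.Add((DateTime)dateProperty.GetValue(item, null));
            }
        }
        // Except works directly here because DateTime compares by value.
        return allDates.Except(present).ToList();
    }
}
```

Looking up the PropertyInfo once per type rather than once per item would be an obvious optimization for large collections.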
Using Except to compare on multiple properties of a custom data class is not supported directly.
You have to use it like this (taken from the MSDN 101 LINQ Samples):
public void Linq53()
{
List<Product> products = GetProductList();
List<Customer> customers = GetCustomerList();
var productFirstChars =
from p in products
select p.ProductName[0];
var customerFirstChars =
from c in customers
select c.CompanyName[0];
var productOnlyFirstChars = productFirstChars.Except(customerFirstChars);
Console.WriteLine("First letters from Product names, but not from Customer names:");
foreach (var ch in productOnlyFirstChars)
{
Console.WriteLine(ch);
}
}
Having the key, you can handle your data accordingly :)
