How to query this in LINQ? - c#

I'm trying to write a query in LINQ and, so far, I'm unable to make it work. If I've managed to ask the most obvious LINQ question in history then I apologize but I really do need some help with this one ...
Here is the gist of what I'm trying to do:
I have a class for a keyword:
class Keyword
{
public string Name {get; set;}
}
I also have a class for a file:
class File
{
public IList<Keyword> Keywords { get; set;}
}
Now, assuming I have a method to do a search for files by keyword(s):
IEnumerable<File> FindByKeywords(IEnumerable<Keyword> keywords)
{
// Let's say that Context.Files is a collection of File objects
// each of which contains a collection of associated keywords
// that may (or may not) match the keywords we get passed as
// a parameter. This is where I need LINQ magic to happen.
return Context.Files; // How do I select the files by the list of keywords?
}
I've seen examples of using Contains on the passed in list of keywords but that only seems to work for instances where the matching property is a scalar. In my case the matching property is another list of keywords.
In other words, this doesn't work:
IEnumerable<File> FindByKeywords(IEnumerable<Keyword> keywords)
{
return Context.Files.Where(x => keywords.Contains(x));
}
Anyone have any ideas? I really just need to find files that contain one or more keywords that match whatever is in the list of keywords passed in as a parameter. It's probably got an obvious solution but I can't see it.
Thanks in advance.

Do you want to find all File objects, for which any of the elements in Keywords are present in the collection of keywords you passed into the method?
Doing anything with the Keyword class inside your query is apparently a no-no. As your error suggests, it can only translate primitive types.
var names = keywords.Select(x => x.Name).ToList();
return Context.Files.Where(x => x.Keywords.Select(y => y.Name).Intersect(names).Any());
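The same idea can also be written with Any and Contains on the list of names, which avoids Intersect. The following is a minimal, self-contained sketch over in-memory collections. The KeywordSearch class and the files parameter are illustrative names that are not part of the original code, and whether a given LINQ provider can translate this expression is an assumption to verify against that provider.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Keyword
{
    public string Name { get; set; }
}

public class File
{
    public IList<Keyword> Keywords { get; set; } = new List<Keyword>();
}

// Hypothetical helper class, introduced only for this illustration.
public static class KeywordSearch
{
    // Returns the files that have at least one keyword whose Name
    // appears among the requested keywords.
    public static IEnumerable<File> FindByKeywords(
        IEnumerable<File> files, IEnumerable<Keyword> keywords)
    {
        // Reduce the keywords to plain strings before building the query,
        // so the query itself only compares strings.
        List<string> names = keywords.Select(k => k.Name).ToList();

        return files.Where(f => f.Keywords.Any(k => names.Contains(k.Name)));
    }
}
```

Calling KeywordSearch.FindByKeywords(Context.Files, keywords) would then stand in for the body of the original method, assuming Context.Files can be enumerated as File objects.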

Maybe since you are having trouble wrapping your head around how to do this with Linq, you should start with something you can do. Some simple loops for example. Once you have that working, you can move on to Linq queries.
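IEnumerable<File> FindByKeywords(IEnumerable<Keyword> keywords)
{
var foundFiles = new List<File>();
foreach (File file in Context.Files)
{
// Track whether this file matched so it is added at most once.
bool matched = false;
foreach (Keyword fileWord in file.Keywords)
{
foreach (Keyword keyword in keywords)
{
// Keyword objects are compared by their Name.
if (fileWord.Name == keyword.Name)
{
matched = true;
break;
}
}
if (matched)
break;
}
if (matched)
foundFiles.Add(file);
}
return foundFiles;
}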

I might build an index first, then query the index:
IDictionary<string, List<File>> BuildIndex(IEnumerable<File> files)
{
var index = new Dictionary<string, List<File>>();
foreach (File file in files)
{
foreach (string keyword in file.Keywords.Select(k => k.Name))
{
if (!index.ContainsKey(keyword))
index[keyword] = new List<File>();
index[keyword].Add(file);
}
}
return index;
}
IEnumerable<File> FindByKeywords(IEnumerable<Keyword> keywords)
{
var index = BuildIndex(Context.Files);
return keywords
.Where(k => index.ContainsKey(k.Name))
.SelectMany(k => index[k.Name])
.Distinct()
.ToList();
}
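As a possible variation on the index idea, LINQ's ToLookup can build the keyword-to-files mapping in one expression. This is only a sketch under assumptions: the KeywordIndex class and its method names are made up for illustration, and the File and Keyword classes are repeated so the snippet stands alone.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Keyword
{
    public string Name { get; set; }
}

public class File
{
    public IList<Keyword> Keywords { get; set; } = new List<Keyword>();
}

// Hypothetical helper class, introduced only for this illustration.
public static class KeywordIndex
{
    // Builds a keyword-name -> files lookup in a single pass over the files.
    public static ILookup<string, File> Build(IEnumerable<File> files)
    {
        return files
            .SelectMany(f => f.Keywords, (f, k) => new { File = f, k.Name })
            .ToLookup(x => x.Name, x => x.File);
    }

    // Indexing a lookup with a missing key yields an empty sequence,
    // so no ContainsKey check is needed before SelectMany.
    public static List<File> Find(
        ILookup<string, File> index, IEnumerable<Keyword> keywords)
    {
        return keywords
            .SelectMany(k => index[k.Name])
            .Distinct()
            .ToList();
    }
}
```

As with the dictionary version, the index only pays off if it is built once and queried many times; rebuilding it on every search costs more than a direct scan.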

As I understand it, you're looking for all the files where any of their keywords is in the collection of keywords.
I'd write it like this:
IEnumerable<File> FilesThatContainsAnyKeyword(
IEnumerable<File> files, // IQueryable<File>?
IEnumerable<Keyword> keywords)
{
var keywordSet = new HashSet<string>(keywords.Select(k => k.Name));
return files.Where(f =>
f.Keywords.Any(k => keywordSet.Contains(k.Name))
);
}
Then call it:
IEnumerable<File> FindByKeywords(IEnumerable<Keyword> keywords)
{
return FilesThatContainsAnyKeyword(Context.Files, keywords);
}
Since your keyword objects cannot be compared for equality directly, you have to compare them by identity (their Name).

Related

Advice on optimizing the following code

I have some data stored in a dictionary where the values are basically lists of objects with a few attributes in them. Right now I'm looping through it as follows to get the data stored in a specific attribute. This data is then added to a drop-down list (Unity UI dropdown).
foreach (KeyValuePair<string, List<NameIdValuePair>> kvp in TeamValuePair)
{
List<NameIdValuePair> list = kvp.Value;
if(kvp.Key == teamNames.options[teamNames.value].text)
{
foreach (var rec in list)
{
screenNamesDropDown.options.Add(new TMP_Dropdown.OptionData { text = rec.ScreenName });
}
}
}
teamNames and screenNamesDropDown are dropdown elements part of my unity UI.
The structure of the NameIdValuePair looks as follows:
public class NameIdValuePair
{
public string ScreenName { get; private set; }
public string ScreenId { get; private set; }
}
I would like to rewrite this piece of code using LINQ so that it's a bit more readable. Since I'm pretty new to LINQ, I'm not sure I'm using the right keywords when searching for suggestions, and so far I haven't found anything helpful.
Thanks
As mentioned before instead of looping a Dictionary - where we already know that the keys are unique - you could simply use Dictionary.TryGetValue
// do this only once!
var key = teamNames.options[teamNames.value].text;
if (TeamValuePair.TryGetValue(key, out var list))
{
foreach(var item in list)
{
screenNamesDropDown.options.Add(new TMP_Dropdown.OptionData(item.ScreenName));
}
}
and then actually the only place where you could use Linq if you really want to would maybe be in
var key = teamNames.options[teamNames.value].text;
if (TeamValuePair.TryGetValue(key, out var list))
{
screenNamesDropDown.options.AddRange(list.Select(item => new TMP_Dropdown.OptionData(item.ScreenName)));
}
Whether this makes it easier to read is questionable, though.
And in general the question would also be if you always want to Add (AddRange) to the screenNamesDropDown.options or if you maybe want to actually replace the options. Then instead of AddRange you could do
screenNamesDropDown.options = list.Select(item => new TMP_Dropdown.OptionData(item.ScreenName)).ToList();
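To try the lookup outside Unity, the dropdown can be replaced by a plain list of strings. The sketch below makes that assumption; TeamScreens and ScreenNamesFor are hypothetical names, and NameIdValuePair is given a constructor here only so that test data can be created.

```csharp
using System.Collections.Generic;
using System.Linq;

public class NameIdValuePair
{
    // Constructor added for this illustration so instances can be created.
    public NameIdValuePair(string screenName, string screenId)
    {
        ScreenName = screenName;
        ScreenId = screenId;
    }

    public string ScreenName { get; private set; }
    public string ScreenId { get; private set; }
}

// Hypothetical helper class, introduced only for this illustration.
public static class TeamScreens
{
    // Returns the screen names for one team, or an empty list when the
    // team is not present in the dictionary.
    public static List<string> ScreenNamesFor(
        IReadOnlyDictionary<string, List<NameIdValuePair>> teams,
        string teamName)
    {
        return teams.TryGetValue(teamName, out List<NameIdValuePair> list)
            ? list.Select(item => item.ScreenName).ToList()
            : new List<string>();
    }
}
```

In the Unity code, each returned name would then be wrapped in a TMP_Dropdown.OptionData, as in the snippets above.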

More efficient method to search a list of strings for certain strings?

I have a list of strings. Neither the number nor the order of these strings is guaranteed. The only certainty is that the list contains at least my three strings of interest, which contain "string1", "string2", and "string3" respectively (the strings may hold more information, but those keywords will definitely be in there). I then want to pass each of those strings to a function.
My current implementation to solve this is as such:
foreach(var item in myList)
{
if (item.Contains("string1"))
{
myFunction1(item);
}
else if (item.Contains("string2"))
{
myFunction2(item);
}
else if (item.Contains("string3"))
{
myFunction3(item);
}
}
Is there a better way to check string lists and apply functions to those items that match some criteria?
One approach is to use Regex for the fixed list of strings, and check which group is present, like this:
// Note the matching groups around each string
var regex = new Regex("(string1)|(string2)|(string3)");
foreach(var item in myList) {
var match = regex.Match(item);
if (!match.Success) {
continue;
}
if (match.Groups[1].Success) {
myFunction1(item);
}
else if (match.Groups[2].Success)
{
myFunction2(item);
}
else if (match.Groups[3].Success)
{
myFunction3(item);
}
}
This way all three matches would be done with a single pass through the target string.
You could reduce some of the duplicated code in the if statements by creating a Dictionary that maps the strings to their respective functions. (This snippet assumes that myList contains string values, but can easily be adapted to a list of any type.)
Dictionary<string, Action<string>> actions = new Dictionary<string, Action<string>>
{
["string1"] = myFunction1,
["string2"] = myFunction2,
["string3"] = myFunction3
};
foreach (var item in myList)
{
foreach (var action in actions)
{
if (item.Contains(action.Key))
{
action.Value(item);
break;
}
}
}
For a list of only three items, this might not be much of an improvement, but if you have a large list of strings/functions to search for it can make your code much shorter. It also means that adding a new string/function pair is a one-line change. The biggest downside is that the foreach loop is a bit more difficult to read.
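One caveat with the dictionary version: the enumeration order of a Dictionary is documented as undefined, so if an item can contain more than one of the search strings, which function runs is not guaranteed to match the original if / else if chain. A possible way around that, sketched here with hypothetical names (StringDispatcher, Dispatch), is an ordered list of rules that are tried in sequence.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, introduced only for this illustration.
public static class StringDispatcher
{
    // Calls the handler of the first rule whose key occurs in each item.
    // Rules are tried in list order, mirroring an if / else if chain.
    public static void Dispatch(
        IEnumerable<string> items,
        IReadOnlyList<(string Key, Action<string> Handler)> rules)
    {
        foreach (string item in items)
        {
            foreach (var (key, handler) in rules)
            {
                if (item.Contains(key))
                {
                    handler(item);
                    break; // first matching rule wins
                }
            }
        }
    }
}
```

The rules can be declared as an array of tuples, for example new (string Key, Action<string> Handler)[] { ("string1", myFunction1), ("string2", myFunction2) }, so adding a pair is still a one-line change.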

Actions on different branching criteria LINQ

I was wondering, is there an effective way to implement branching logic in a collection using LINQ, without iterating the collection more than once.
For example
foreach (string file in Files){
if (file == "file1"){
// do something
}
else if (file == "file2"){
// do something
}
else if (file == "file3"){
// do something
}
}
I have found a solution using lookups, but it only works for if-else cases:
var group = Files.ToLookup(f => f=="file1");
var file1Group= group[true].ToList();
Thanks in advance.
Does this help? Basically you project a new anonymous type with extra properties that you want to filter by, order by, group by or whatever.
using System;
using System.Linq;
namespace Test
{
public class Program
{
static void Main(string[] args)
{
var Files = new string[] { "file1", "file2" };
var withMetaData = Files.Select(z =>
new { file = z, IsFile1 = z == "file1", IsFile2 = z == "file2"});
// You can now OrderBy or GroupBy or whatever you fancy here
foreach (var fileWithMetaData in withMetaData)
{
Console.WriteLine(fileWithMetaData);
}
Console.ReadLine();
}
}
}
If I have a large number of reoccurring actions, which are defined before the loop and have to be reusable (otherwise the if...else as you already have is probably preferable), I've got a tendency to put references to those actions in a dictionary (or other collection). For this particular example (files), that would only make sense if multiple folders are processed with the same filenames, or the 'filex' parts are obtained from a substring, but the concept can be applied to any reoccurring branching logic.
The example below uses lambdas, but any existing method with the same signature can be used
var actions = new Dictionary<string,Action>{
{"file1", () => Console.WriteLine("abc")},
{"file2", () => {var foo ="def"; Console.WriteLine(foo);}},
};
//example call, only to show usage
foreach(var file in new[]{"file1","file1","file2","file1"})
actions[file](); //nb, for the example no check was added. For real use 'TryGetValue' can be used
Of course, this is assuming an action has to be performed on each file, otherwise a group by can be used.
Because each action above takes no input and does nothing specific to the individual file, that example is not very meaningful. A somewhat more sensible version passes the file path to the action:
var actions = new Dictionary<string,Action<string>>{
{"file1", f => Console.WriteLine("abc: {0}", f)},
{"file2", f => {var foo ="def"; Console.WriteLine(foo);}},
};
foreach(var file in new[]{@"c:\temp\file1",@"c:\someotherfolder\file1",@"c:\temp\file2",@"c:\abc\file1"})
actions[Path.GetFileNameWithoutExtension(file)](file); //nb, for the example no check was added. For real use 'TryGetValue' can be used
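The comments above mention TryGetValue for real use. A minimal sketch of that guard could look as follows; FileActions and TryRun are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper class, introduced only for this illustration.
public static class FileActions
{
    // Runs the action registered for the file's name (without extension).
    // Returns false when no action is registered, instead of throwing
    // KeyNotFoundException as the dictionary indexer would.
    public static bool TryRun(
        IDictionary<string, Action<string>> actions, string path)
    {
        string name = Path.GetFileNameWithoutExtension(path);
        if (actions.TryGetValue(name, out Action<string> action))
        {
            action(path);
            return true;
        }
        return false;
    }
}
```

The caller can then decide what an unregistered file name means, for example skipping it or logging it, rather than having the loop abort on the first unknown file.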

Returning a list that is filtered based on another list in C#

I'm quite new to C#, so I'm trying to test a simple string filter but not getting the desired results. This is my test method:
[TestMethod]
public void Test_ExceptFilter()
{
IEnumerable<string> allnames = new List<string>(){"Thumbs.db","~","one","two exe here","three"};
var u = FileLister.GetFilteredList(allnames, new [] { "thumbs","~"}, false);
}
I have a static class called FileLister which has the GetFilteredList method, and the bit that is not working is when I try to return a list that DOES NOT CONTAIN any strings that are in the second IEnumerable.
So in my example here, I expect var u to have "one", "two exe here" and "three" only, but it has all the items from the allnames list!
These are the various ways I have tried, and in all of them the part I'm struggling with is where the boolean includewords is passed in as false:
//Return strings that contain or dont contain any element in the array
public static IEnumerable<string> GetFilteredList(IEnumerable<string> myfiles,
IEnumerable<string> filterbythesewords, bool includewords)
{
if (includewords)
{
return from line in myfiles
where filterbythesewords.Any(item => line.Contains(item))
select line;
}
else
{
return from line in myfiles
where filterbythesewords.Any(item => !line.Contains(item))
select line;
}
}
SECOND TRIAL
//Return strings that contain or dont contain any element in the array
public static IEnumerable<string> GetFilteredList(IEnumerable<string> myfiles,
IEnumerable<string> filterbythesewords, bool includewords)
{
if (includewords)
{
return from line in myfiles
where filterbythesewords.Any(item => line.Contains(item))
select line;
}
else
{
List<string> p = new List<string>();
p.AddRange(myfiles.Except(filterbythesewords));
return p;
}
}
THIRD TRIAL
//Return strings that contain or dont contain any element in the array
public static IEnumerable<string> GetFilteredList(IEnumerable<string> myfiles,
IEnumerable<string> filterbythesewords, bool includewords)
{
if (includewords)
{
return from line in myfiles
where filterbythesewords.Any(item => line.Contains(item))
select line;
}
else
{
return myfiles.Where(file => filterbythesewords.Any(x => !file.ToUpperInvariant().Contains(x.ToUpperInvariant())));
}
}
How can I make this work, please? Ultimately I want to be able to filter filenames from a directory listing based on their file extensions or part of their names.
cheers
The problem is that you're using Any with a negative condition - which means you'll include the value if there are any words that aren't included in the candidate. Instead, you want to think of it as:
Exclude a file if any of the words are included in it
Include a file if all of the words aren't in it.
Include a file if "not any" of the words are in it (i.e. if a positive check isn't true for any of the words)
So you could use:
return myfiles.Where(file => filterbythesewords.All(item => !file.Contains(item)));
or
return myfiles.Where(file => !filterbythesewords.Any(item => file.Contains(item)));
(There's no need for a query expression here - when you're basically just doing a simple filter, query expressions don't help readability.)
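One further detail in the test data: string.Contains is case-sensitive, so the filter word "thumbs" does not match "Thumbs.db" unless case is ignored, as the third trial attempted with ToUpperInvariant. The sketch below is one possible way to combine both fixes, assuming an ordinal case-insensitive comparison is what is wanted. The parameter names are illustrative, and IndexOf with a StringComparison is used because it is available on all .NET versions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FileLister
{
    // Keeps the names that contain any (includeWords == true) or none
    // (includeWords == false) of the filter words, ignoring case.
    public static IEnumerable<string> GetFilteredList(
        IEnumerable<string> names, IEnumerable<string> words, bool includeWords)
    {
        // Materialize the words once so they are not re-enumerated per name.
        List<string> wordList = words.ToList();

        return names.Where(name =>
            wordList.Any(w =>
                name.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0)
            == includeWords);
    }
}
```

With the list from the test method and the filter words "thumbs" and "~", passing false keeps "one", "two exe here" and "three".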

LINQ expression instead of nested foreach loops

I have these two classes:
public class Client{
public List<Address> addressList{get;set;}
}
public class Address{
public string name { get; set; }
}
and I have a List of type Client called testList. It contains n clients and each one of those contains n addresses
List<Client> testList;
How can I do the following using LINQ:
foreach (var element in testList)
{
foreach (var add in element.addressList)
{
Console.WriteLine(add.name);
}
}
Well I wouldn't put the Console.WriteLine in a lambda expression, but you can use SelectMany to avoid the nesting:
foreach (var add in testList.SelectMany(x => x.addressList))
{
Console.WriteLine(add.name);
}
I see little reason to convert the results to a list and then use List<T>.ForEach when there's a perfectly good foreach loop as part of the language. It's not like you naturally have a delegate to apply to each name, e.g. as a method parameter - you're always just writing to the console. See Eric Lippert's blog post on the topic for more thoughts.
(I'd also strongly recommend that you start following .NET naming conventions, but that's a different matter.)
foreach(var a in testList.SelectMany(c => c.addressList))
{
Console.WriteLine(a.name);
}
It will not materialize any new collection.
This may help:
testList.ForEach(i => i.addressList.ForEach(j => Console.WriteLine(j.name)));
foreach(var add in testList.SelectMany(element => element.addressList)){
Console.WriteLine(add.name);
}
testList.SelectMany(c => c.addressList)
.Select(a => a.name)
.ToList()
.ForEach(Console.WriteLine);
Use the ForEach method:
testList.ForEach(tl=>tl.addressList.ForEach(al=>Console.WriteLine(al.name)));
LINQ doesn't include a ForEach function, and they don't intend to, since it goes against the idea of LINQ being functional methods. So you can't do this in a single statement. List<T> has a ForEach method, but I'd recommend not using this for the same reasons that it's not in LINQ.
You can, however, use LINQ to simplify your code, e.g.
foreach (var add in testList.SelectMany(x => x.addressList))
{
Console.WriteLine(add.name);
}
// or
foreach (var name in testList.SelectMany(x => x.addressList).Select(x => x.name))
{
Console.WriteLine(name);
}
