How to minimize loop with if statement

I am trying to minimize this piece of code
public static void UnfavSong(Song song)
{
List<string> favorites = FileManagement.GetFileContent_List(FAVS_FILENAME);
foreach (string s in favorites)
{
Song deser = SongSerializer.Deserialize(s);
if (deser.ID == song.ID)
{
favorites.Remove(s);
break;
}
}
FileManagement.SaveFile(FAVS_FILENAME, favorites);
}
But I feel like the whole foreach part can be made much shorter.
Is there a way in C# to cut this down to the core?

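Using List<T>.RemoveAll with a lambda:
favorites.RemoveAll(s => SongSerializer.Deserialize(s).ID == song.ID);
Btw, modifying a List during its iteration normally throws an InvalidOperationException; your code only gets away with it because it breaks out of the loop right after the Remove. Note that RemoveAll removes every matching entry, not just the first one.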

You can use LINQ's Where() to filter them:
List<string> result = favorites.Where(x => SongSerializer.Deserialize(x).ID != song.ID).ToList();
This gives you a new list with all elements except those whose ID matches song.ID. The original favorites list is left unchanged.
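To make the difference between the two answers concrete, here is a minimal sketch. It assumes a hypothetical entry format ("42|Some Title") and a hypothetical ParseId helper standing in for SongSerializer.Deserialize(s).ID, since the real serializer is not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FavoritesSketch
{
    // Hypothetical stand-in for SongSerializer.Deserialize(s).ID:
    // an entry looks like "42|Some Title" and the ID is the part before '|'.
    public static int ParseId(string entry)
    {
        return int.Parse(entry.Split('|')[0]);
    }

    // In-place removal: mutates the list and returns how many entries were removed.
    public static int RemoveInPlace(List<string> favorites, int songId)
    {
        return favorites.RemoveAll(s => ParseId(s) == songId);
    }

    // Non-mutating alternative: builds a new list without the matching entries.
    public static List<string> WithoutSong(IEnumerable<string> favorites, int songId)
    {
        return favorites.Where(s => ParseId(s) != songId).ToList();
    }
}
```

RemoveAll fits the original method best, because the same list is saved back to the file afterwards; the Where version is the better choice when the original list must stay untouched.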

Related

How to use First() / FirstOrDefault()?

using System;
using System.Linq;
public class Program
{
public static void Main()
{
var flame = new string[]
{
"bad", "word"
}
;
var text = "this contains some bad words";
foreach (string item in text.Split(' '))
{
bool testerino = flame.Any(item.Contains);
if (testerino)
{
Console.WriteLine("1");
}
}
}
}
https://dotnetfiddle.net/Widget/as5iTs
I want Console.WriteLine("1"); to run only once. I tried to use First() and FirstOrDefault(), but I was not able to use them without syntax errors. Why am I using Split? I don't know. It was the only way to get .Contains() running. I received errors using char item in text with Contains().
I don't need to use foreach or even First(); it is just the only way I know so far.
Any help is very much appreciated.

You want something like this:
var anyFlameWords =
text
.Split(' ')
.Any(word => flame.Contains(word));
if (anyFlameWords)
Console.WriteLine("1");
You don't need First/FirstOrDefault, unless you want the first element from the collection, which looking at your existing code is not what you require.

You can break after you act on the first match.
foreach (string item in text.Split(' '))
{
bool testerino = flame.Any(item.Contains);
if (testerino)
{
Console.WriteLine("1");
break;
}
}
But a more concise alternative is just
if (text.Split(' ').Any(f=>flame.Contains(f))) Console.WriteLine("1");
If you are interested to see how IEnumerable<T>.FirstOrDefault could be used here, note that you can pass in a predicate to FirstOrDefault to get the first item in the IEnumerable<T> that matches that predicate (or the default value for T if nothing matches):
var firstMatch = text.Split(' ').FirstOrDefault(w=>flame.Contains(w));
if (firstMatch != null) Console.WriteLine("1");

Just add a break; to your if-case:
foreach (string item in text.Split(' '))
{
bool testerino = flame.Any(item.Contains);
if (testerino)
{
Console.WriteLine("1");
break;
}
}
A break exits the foreach loop immediately.

This linq may help
bool badWordExists = text.Split(' ').Any(s => flame.Contains(s));
if(badWordExists) Console.WriteLine("1");
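One detail worth spelling out: the question's flame.Any(item.Contains) is a substring test (the word "words" contains "word"), while the flame.Contains(word) form used in several answers only matches whole words. The sketch below shows both side by side; the class and method names are made up for illustration.

```csharp
using System;
using System.Linq;

public static class FlameCheck
{
    // Exact match: true if any whole word of the text equals a flagged word.
    public static bool HasExactWord(string text, string[] flame)
    {
        return text.Split(' ').Any(word => flame.Contains(word));
    }

    // Substring match: true if a flagged word occurs anywhere in the text,
    // which is what the question's per-word item.Contains test amounts to.
    public static bool HasSubstring(string text, string[] flame)
    {
        return flame.Any(f => text.Contains(f));
    }
}
```

With the substring version no Split is needed at all, because string.Contains already searches the whole text.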

Appropriate datastructure for key.contains(x) Map/Dictionary

I am somewhat struggling with the terminology and complexity of my explanations here, feel free to edit it.
I have 1,000 to 20,000 objects. Each one can contain several name words (first, second, middle, last, title...), normalized numbers (home, business...), email addresses, or even physical addresses and spouse names.
I want to implement a search that enables users to freely combine word parts and number parts. When I search for "LL 676" I want to find all objects that contain any String with "LL" AND "676".
Currently I am iterating over every object and every objects property, split the searchString on " " and do a stringInstance.Contains(searchword).
This is too slow, so I am looking for a better solution.
What is the appropriate language agnostic data structure for this?
In my case I need it for C#.
Is the following data structure a good solution?
It's based on a HashMap/Dictionary.
At first I create a String that contains all name parts and phone numbers I want to look through, one example would be: "William Bill Henry Gates III 3. +436760000 billgatesstreet 12":
Then I split on " " and for every word x I create all possible substrings y that fulfill x.contains(y). I put each of those substrings into the hashmap/dictionary.
On lookup/search I just need to call the search for every search word and then join the results. Naturally, the lookup speed is blazingly fast (native Hashmap/Dictionary speed).
EDIT: Inserts are very fast as well (insignificant time) now that I use a smarter algorithm to get the substrings.
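The idea described above can be sketched in a few lines. This is a single-threaded illustration under stated assumptions, not the asker's actual code: the SubstringIndex name, the use of HashSet<T>, and the upper-casing for case-insensitive matching are all choices made for this sketch.

```csharp
using System;
using System.Collections.Generic;

public class SubstringIndex<T>
{
    private static readonly char[] Separators = { ' ' };
    private readonly Dictionary<string, HashSet<T>> index = new Dictionary<string, HashSet<T>>();

    // Registers every substring of every space-separated word under the given item.
    public void Add(string keyWords, T item)
    {
        foreach (string word in keyWords.ToUpperInvariant().Split(Separators, StringSplitOptions.RemoveEmptyEntries))
        {
            for (int start = 0; start < word.Length; start++)
            {
                for (int length = 1; length <= word.Length - start; length++)
                {
                    string part = word.Substring(start, length);
                    HashSet<T> items;
                    if (!index.TryGetValue(part, out items))
                    {
                        items = new HashSet<T>();
                        index[part] = items;
                    }
                    items.Add(item);
                }
            }
        }
    }

    // Returns the items that match every search word (boolean AND).
    public HashSet<T> Search(string searchString)
    {
        HashSet<T> result = null;
        foreach (string word in searchString.ToUpperInvariant().Split(Separators, StringSplitOptions.RemoveEmptyEntries))
        {
            HashSet<T> hits;
            if (!index.TryGetValue(word, out hits))
                return new HashSet<T>(); // one word has no match, so the AND fails
            if (result == null)
                result = new HashSet<T>(hits); // copy, so the index itself is never modified
            else
                result.IntersectWith(hits);
        }
        return result ?? new HashSet<T>();
    }
}
```

The trade-off is memory: a word of length n contributes n(n+1)/2 dictionary entries, so lookups are cheap but the index grows quickly with long words.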

It's possible I've misunderstood your algorithm or requirement, but this seems like it could be a potential performance improvement:
foreach (string arg in searchWords)
{
if (String.IsNullOrEmpty(arg))
continue;
tempList = new List<T>();
if (dictionary.ContainsKey(arg))
foreach (T obj in dictionary[arg])
if (list.Contains(obj))
tempList.Add(obj);
list = new List<T>(tempList);
}
The idea is that you do the first search word separately before this, and only put all the subsequent words into the searchWords list.
That should allow you to remove your final foreach loop entirely. Results only stay in your list as long as they keep matching every searchWord, rather than initially having to pile everything that matches a single word in then filter them back out at the end.

In case anyone cares for my solution:
Disclaimer:
This is only a rough draft.
I have only done some synthetic testing, and I have written a lot of it without testing it again. I have revised my code: inserts now take ((n^2)/2)+(n/2) steps per word of length n instead of 2^n-1, which is far faster. Word length is now irrelevant.
namespace MegaHash
{
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
public class GenericConcurrentMegaHash<T>
{
// After doing a bulk add, call AwaitAll() to ensure all data was added!
private ConcurrentBag<Task> bag = new ConcurrentBag<Task>();
private ConcurrentDictionary<string, List<T>> dictionary = new ConcurrentDictionary<string, List<T>>();
// consider changing this to include for example '-'
public char[] splitChars;
public GenericConcurrentMegaHash()
: this(new char[] { ' ' })
{
}
public GenericConcurrentMegaHash(char[] splitChars)
{
this.splitChars = splitChars;
}
public void Add(string keyWords, T o)
{
keyWords = keyWords.ToUpper();
foreach (string keyWord in keyWords.Split(splitChars))
{
if (keyWord == null || keyWord.Length < 1)
continue; // skip empty fragments instead of abandoning the remaining keywords
this.bag.Add(Task.Factory.StartNew(() => { AddInternal(keyWord, o); }));
}
}
public void AwaitAll()
{
lock (this.bag)
{
foreach (Task t in bag)
t.Wait();
this.bag = new ConcurrentBag<Task>();
}
}
private void AddInternal(string key, T y)
{
for (int i = 0; i < key.Length; i++)
{
for (int i2 = 0; i2 < i + 1; i2++)
{
string desire = key.Substring(i2, key.Length - i);
// GetOrAdd is atomic, so two tasks cannot overwrite each other's list for the same key
List<T> l = dictionary.GetOrAdd(desire, _ => new List<T>());
lock (l)
{
if (!l.Contains(y))
l.Add(y);
}
}
}
}
public IList<T> FulltextSearch(string searchString)
{
searchString = searchString.ToUpper();
List<T> list = new List<T>();
string[] searchWords = searchString.Split(splitChars);
foreach (string arg in searchWords)
{
if (arg == null || arg.Length < 1)
continue;
if (dictionary.ContainsKey(arg))
foreach (T obj in dictionary[arg])
if (!list.Contains(obj))
list.Add(obj);
}
List<T> returnList = new List<T>();
foreach (T o in list)
{
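foreach (string arg in searchWords)
{
if (arg.Length < 1)
continue;
// TryGetValue avoids a KeyNotFoundException when a search word was never indexed
List<T> hits;
if (!dictionary.TryGetValue(arg, out hits) || !hits.Contains(o))
goto BREAK;
}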
returnList.Add(o);
BREAK:
continue;
}
return returnList;
}
}
}

C# dedupe List based on split

I'm having a hard time deduping a list based on a specific delimiter.
For example I have 4 strings like below:
apple|pear|fruit|basket
orange|mango|fruit|turtle
purple|red|black|green
hero|thor|ironman|hulk
In this example I want my list to only have unique values in column 3, so it would result in a List that looks like this:
apple|pear|fruit|basket
purple|red|black|green
hero|thor|ironman|hulk
In the above example I would have gotten rid of line 2 because line 1 had the same result in column 3. Any help would be awesome, deduping is tough in C#.
How I'm testing this:
static void Main(string[] args)
{
BeginListSet = new List<string>();
startHashSet();
}
public static List<string> BeginListSet { get; set; }
public static void startHashSet()
{
string[] BeginFileLine = File.ReadAllLines(@"C:\testit.txt");
foreach (string begLine in BeginFileLine)
{
BeginListSet.Add(begLine);
}
}
public static IEnumerable<string> Dedupe(IEnumerable<string> list, char seperator, int keyIndex)
{
var hashset = new HashSet<string>();
foreach (string item in list)
{
var array = item.Split(seperator);
if (hashset.Add(array[keyIndex]))
yield return item;
}
}

Something like this should work for you
static IEnumerable<string> Dedupe(this IEnumerable<string> input, char seperator, int keyIndex)
{
var hashset = new HashSet<string>();
foreach (string item in input)
{
var array = item.Split(seperator);
if (hashset.Add(array[keyIndex]))
yield return item;
}
}
...
var list = new string[]
{
"apple|pear|fruit|basket",
"orange|mango|fruit|turtle",
"purple|red|black|green",
"hero|thor|ironman|hulk"
};
foreach (string item in list.Dedupe('|', 2))
Console.WriteLine(item);
Edit: In the linked question Distinct() with Lambda, Jon Skeet presents the idea in a much better fashion, in the form of a DistinctBy custom method. While similar, his is far more reusable than the idea presented here.
Using his method, you could write
var deduped = list.DistinctBy(item => item.Split('|')[2]);
And you could later reuse the same method to "dedupe" another list of objects of a different type by a key of possibly yet another type.
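For readers without the linked question at hand, a generic key-based dedupe can be sketched as follows. The name DistinctByKey is chosen here for illustration (it avoids clashing with the DistinctBy method that newer .NET versions ship in System.Linq); it is not the code from the linked answer.

```csharp
using System;
using System.Collections.Generic;

public static class DedupeExtensions
{
    // Yields the first item seen for each distinct key, preserving input order.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (TSource item in source)
        {
            // HashSet.Add returns false when the key was already present.
            if (seen.Add(keySelector(item)))
                yield return item;
        }
    }
}
```

A call such as list.DistinctByKey(item => item.Split('|')[2]) would then keep the first line for each value of the third column.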

Try this:
var list = new string[]
{
"apple|pear|fruit|basket",
"orange|mango|fruit|turtle",
"purple|red|black|green",
"hero|thor|ironman|hulk "
};
var dedup = new List<string>();
var filtered = new List<string>();
foreach (var s in list)
{
var filter = s.Split('|')[2];
if (dedup.Contains(filter)) continue;
filtered.Add(s);
dedup.Add(filter);
}
// Console.WriteLine(filtered);

Can you use a HashSet instead? That will eliminate dupes automatically for you as they are added.

Maybe you can sort the |-delimited words alphabetically and then store them in a grid (columns). Then, when you try to insert, just check whether a column already has a word starting with that character.

If LINQ is an option, you can do something like this:
// assume strings is a collection of strings
List<string> list = strings.Select(a => a.Split('|')) // split each line by '|'
.GroupBy(a => a[2]) // group by third column
.Select(a => a.First()) // select first line from each group
.Select(a => string.Join("|", a))
.ToList(); // convert to list of strings
Edit (per Jeff Mercado's comment), this can be simplified further:
List<string> list =
strings.GroupBy(a => a.Split('|')[2]) // group by third column
.Select(a => a.First()) // select first line from each group
.ToList(); // convert to list of strings

Performing a boolean AND string search on sub-collections of a collection (non-LINQ)

I hope the title makes sense.
I have a set of items that I want to search and select a subset of, based on a set of keywords that must all appear at least once in any of the SubItems of the Items. I believe this could easily be achieved using LINQ, but I'm using .NET 2.0 for this project.
The code below should achieve pretty much what I want to do, assuming AllBitsAreSet is implemented, but I'm wondering if I'm missing an alternative, simpler way of doing this?
Since there doesn't appear to be a good way of checking if all the bits in a BitArray are set, besides looping through them all (please tell me if there is!), I'm wondering about "nicer" alternatives. Not necessarily more CPU efficient, because I doubt the below code will be too slow for the data sets I'm working with, but ones with less code.
public List<Item> Search(Item[] items, List<string> keywords)
{
List<Item> results = new List<Item>();
BitArray flags = new BitArray(keywords.Count);
foreach (Item item in items)
{
flags.SetAll(false);
foreach (SubItem subItem in item.SubItems)
{
for (int i = 0; i < keywords.Count; i++)
{
if (subItem.StringValue.IndexOf(keywords[i]) >= 0)
flags[i] = true;
}
}
if (AllBitsAreSet(flags)) results.Add(item);
}
return results;
}
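The AllBitsAreSet helper that the code above assumes could be sketched as a plain loop. This is one possible implementation (the BitArrayHelper class name is made up here), and it uses nothing beyond what .NET 2.0 offers.

```csharp
using System.Collections;

public static class BitArrayHelper
{
    // Returns true only if every bit is set; an empty BitArray counts as "all set".
    public static bool AllBitsAreSet(BitArray flags)
    {
        foreach (bool bit in flags)
        {
            if (!bit)
                return false; // stop at the first cleared bit
        }
        return true;
    }
}
```

Treating an empty BitArray as "all set" means an empty keyword list matches every item, which mirrors the usual "all of nothing is true" convention; whether that is the desired behaviour is a design decision.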

You can use LINQ Bridge to get LINQ support on .NET 2.0 and use the following LINQ query.
items.Where(i =>
keywords.All(k =>
i.SubItems.Any(s =>
s.StringValue.Contains(k))));
You can avoid using the bit set if you swap the two inner loops. The performance impact depends on the number of sub items versus the number of keywords.
foreach (Item item in items)
{
Boolean found = false;
foreach (String keyword in keywords)
{
found = false;
foreach (SubItem subItem in item.SubItems)
{
if (subItem.StringValue.Contains(keyword))
{
found = true;
break;
}
}
if (!found)
{
break;
}
}
if (found)
{
results.Add(item);
}
}

I would write it as follows. Of course this is very similar to Daniel's solution, but I believe it is better.
public List<Item> Search(Item[] items, List<string> keywords)
{
List<Item> results = new List<Item>();
foreach (Item item in items)
if(ContainsAllKeywords(item, keywords))
results.Add(item);
return results;
}
bool ContainsAllKeywords(Item item, List<string> keywords)
{
foreach (string keyword in keywords)
if (!ContainsKey(item.SubItems, keyword))
return false;
return true;
}
bool ContainsKey(IEnumerable<SubItem> subItems, string key)
{
foreach (SubItem subItem in subItems)
if (subItem.StringValue.Contains(key))
return true;
return false;
}
edit: changed == to .Contains() as per comment

LINQ for beginners

I love C#, I love the framework, and I also love to learn as much as possible. Today I began to read articles about LINQ in C# and I couldn't find anything good for a beginner that never worked with SQL in his life.
I found this article very helpful and I understood small parts of it, but I'd like to get more examples.
After reading it a couple of times, I tried to use LINQ in a function of mine, but I failed.
private void Filter(string filename)
{
using (TextWriter writer = File.CreateText(Application.StartupPath + "\\temp\\test.txt"))
{
using(TextReader reader = File.OpenText(filename))
{
string line;
while((line = reader.ReadLine()) != null)
{
string[] items = line.Split('\t');
int myInteger = int.Parse(items[1]);
if (myInteger == 24809) writer.WriteLine(line);
}
}
}
}
This is what I did and it did not work, the result was always false.
private void Filter(string filename)
{
using (TextWriter writer = File.CreateText(Application.StartupPath + "\\temp\\test.txt"))
{
using(TextReader reader = File.OpenText(filename))
{
string line;
while((line = reader.ReadLine()) != null)
{
string[] items = line.Split('\t');
var Linqi = from item in items
where int.Parse(items[1]) == 24809
select true;
if (Linqi == true) writer.WriteLine(line);
}
}
}
}
I'm asking for two things:
What would the function look like using as much LINQ as possible?
A website/book/article about LINQ, but please note I'm very much a beginner in SQL/LINQ.
Thank you in advance!

Well one thing that would make your sample more "LINQy" is an IEnumerable<string> for reading lines from a file. Here's a somewhat simplified version of my LineReader class from MiscUtil:
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
public sealed class LineReader : IEnumerable<string>
{
readonly Func<TextReader> dataSource;
public LineReader(string filename)
: this(() => File.OpenText(filename))
{
}
public LineReader(Func<TextReader> dataSource)
{
this.dataSource = dataSource;
}
public IEnumerator<string> GetEnumerator()
{
using (TextReader reader = dataSource())
{
string line;
while ((line = reader.ReadLine()) != null)
{
yield return line;
}
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
Now you can use that:
var query = from line in new LineReader(filename)
let items = line.Split('\t')
let myInteger = int.Parse(items[1])
where myInteger == 24809
select line;
using (TextWriter writer = File.CreateText(Application.StartupPath
+ "\\temp\\test.txt"))
{
foreach (string line in query)
{
writer.WriteLine(line);
}
}
Note that it would probably be more efficient to not have the let clauses:
var query = from line in new LineReader(filename)
where int.Parse(line.Split('\t')[1]) == 24809
select line;
at which point you could reasonably do it all in "dot notation":
var query = new LineReader(filename)
.Where(line => int.Parse(line.Split('\t')[1]) == 24809);
However, I far prefer the readability of the original query :)

101 LINQ Samples is certainly a good collection of examples. Also LINQPad might be a good way to play around with LINQ.

For a website as a starting point, you can try Hooked on LINQ
Edit:
Original site appears to be dead now (domain is for sale).
Here's the internet archive of the last version: https://web.archive.org/web/20140823041217/http://www.hookedonlinq.com/

If you're after a book, I found LINQ in action from Manning Publications a good place to start.

MSDN LINQ Examples: http://msdn.microsoft.com/en-us/vcsharp/aa336746.aspx

I got a lot out of the following sites when I started:
http://msdn.microsoft.com/en-us/library/bb425822.aspx
http://weblogs.asp.net/scottgu/archive/2007/05/19/using-linq-to-sql-part-1.aspx

To answer the first question, there frankly isn't too much reason to use LINQ the way you suggest in the above function except as an exercise. In fact, it probably just makes the function harder to read.
LINQ is more useful for operating on a collection than on a single element, and I would use it in that way instead. So, here's my attempt at using as much LINQ as possible in the function (I make no claims about efficiency, and I don't suggest reading the whole file into memory like this):
private void Filter(string filename)
{
using (TextWriter writer = File.CreateText(Application.StartupPath + "\\temp\\test.txt"))
{
using(TextReader reader = File.OpenText(filename))
{
List<string> lines = new List<string>();
string line;
while((line = reader.ReadLine()) != null)
lines.Add(line);
var query = from l in lines
let splitLine = l.Split('\t')
where int.Parse(splitLine.Skip(1).First()) == 24809
select l;
foreach(var l in query)
writer.WriteLine(l);
}
}
}

First, I would introduce this method:
private IEnumerable<string> ReadLines(StreamReader reader)
{
while(!reader.EndOfStream)
{
yield return reader.ReadLine();
}
}
Then, I would refactor the main method to use it. I put both using statements above the same block, and also added a range check to ensure items[1] doesn't fail:
private void Filter(string filename)
{
using(var writer = File.CreateText(Application.StartupPath + "\\temp\\test.txt"))
using(var reader = File.OpenText(filename))
{
var matchingLines =
from line in ReadLines(reader)
let items = line.Split('\t')
where items.Length > 1
let myInteger = Int32.Parse(items[1])
where myInteger == 24809
select line; // keep the whole line, as the original loop did
foreach(var match in matchingLines)
{
writer.WriteLine(match);
}
}
}

I found this article crucial for understanding LINQ, which builds on many new constructs introduced in .NET 3.0 and 3.5:
I'll warn you it's a long read, but if you really want to understand what Linq is and does I believe it is essential
http://blogs.msdn.com/ericwhite/pages/FP-Tutorial.aspx
Happy reading

If I was to rewrite your filter function using LINQ where possible, it'd look like this:
private void Filter(string filename)
{
using (TextWriter writer = File.CreateText(Application.StartupPath + "\\temp\\test.txt"))
{
var lines = File.ReadAllLines(filename);
var matches = from line in lines
let items = line.Split('\t')
let myInteger = int.Parse(items[1])
where myInteger == 24809
select line;
foreach (var match in matches)
{
writer.WriteLine(match);
}
}
}

As for Linq books, I would recommend:
[Two book cover images; the titles were not preserved. Image sources: ebookpdf.net and http://www.diesel-ebooks.com/mas_assets/full/0321564189.jpg]
Both are excellent books that drill into Linq in detail.
To add yet another variation to the as-much-linq-as-possible topic, here's my take:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
namespace LinqDemo
{
class Program
{
static void Main()
{
var baseDir = AppDomain.CurrentDomain.BaseDirectory;
File.WriteAllLines(
Path.Combine(baseDir, "out.txt"),
File.ReadAllLines(Path.Combine(baseDir, "in.txt"))
.Select(line => new KeyValuePair<string, string[]>(line, line.Split(','))) // split each line into columns, also carry the original line forward
.Where(info => info.Value.Length > 1) // filter out lines that don't have 2nd column
.Select(info => new KeyValuePair<string, int>(info.Key, int.Parse(info.Value[1]))) // convert 2nd column to int, still carrying the original line forward
.Where(info => info.Value == 24809) // apply the filtering criteria
.Select(info => info.Key) // restore original lines
.ToArray());
}
}
}
Note that I changed your tab-delimited-columns to comma-delimited columns (easier to author in my editor that converts tabs to spaces ;-) ). When this program is run against an input file:
A1,2
B,24809,C
C
E
G,24809
The output will be:
B,24809,C
G,24809
You could improve memory requirements of this solution by replacing "File.ReadAllLines" and "File.WriteAllLines" with Jon Skeet's LineReader (and LineWriter in a similar vein, taking IEnumerable and writing each returned item to the output file as a new line). This would transform the solution above from "get all lines into memory as an array, filter them down, create another array in memory for result and write this result to output file" to "read lines from input file one by one, and if that line meets our criteria, write it to output file immediately" (pipeline approach).
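The pipeline approach can also be sketched with the framework's own lazy File.ReadLines instead of a custom LineReader. The class and method names below are illustrative, and int.TryParse is swapped in for int.Parse as an assumption, so that non-numeric second columns are skipped rather than throwing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class StreamingFilter
{
    // Lazily yields the lines whose second column parses to the wanted value.
    public static IEnumerable<string> MatchingLines(IEnumerable<string> lines, char separator, int wanted)
    {
        return lines
            .Select(line => new { Line = line, Columns = line.Split(separator) })
            .Where(x => x.Columns.Length > 1)
            .Where(x => int.TryParse(x.Columns[1], out int value) && value == wanted)
            .Select(x => x.Line);
    }

    // Reads the input line by line and writes matches as they are found.
    // The input and output paths must refer to different files.
    public static void FilterFile(string inputPath, string outputPath)
    {
        File.WriteAllLines(outputPath, MatchingLines(File.ReadLines(inputPath), ',', 24809));
    }
}
```

Because File.ReadLines enumerates lazily and File.WriteAllLines accepts an IEnumerable<string>, only one line at a time needs to be held in memory.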

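You cannot just check whether Linqi is true. Linqi is an IEnumerable<bool> (in this case), so you have to check something like Linqi.First() == true. Be aware that First() throws an InvalidOperationException when nothing matches.
Here is a small example: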
string[] items = { "12121", "2222", "24809", "23445", "24809" };
var Linqi = from item in items
where Convert.ToInt32(item) == 24809
select true;
if (Linqi.First() == true) Console.WriteLine("Got a true");
You could also iterate over Linqi, and in my example there are 2 items in the collection.
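Since First() throws when the sequence is empty, a safer way to ask "does anything match?" is Any() with a predicate. A minimal sketch, with a made-up helper name:

```csharp
using System;
using System.Linq;

public static class AnyVersusFirst
{
    // True if at least one item converts to the wanted number; false (no exception) if none does.
    public static bool ContainsValue(string[] items, int wanted)
    {
        return items.Any(item => Convert.ToInt32(item) == wanted);
    }
}
```

Any() stops at the first match, so it does not enumerate the rest of the collection.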
