The example below throws an InvalidOperationException ("Collection was modified; enumeration operation may not execute.") when it runs.
var urls = new List<string>();
urls.Add("http://www.google.com");
foreach (string url in urls)
{
// Get all links from the url
List<string> newUrls = GetLinks(url);
urls.AddRange(newUrls); // <-- This is really the problematic row, adding values to the collection I'm looping
}
How can I rewrite this in a better way? I'm guessing a recursive solution?
You can't, basically. What you really want here is a queue:
var urls = new Queue<string>();
urls.Enqueue("http://www.google.com");
while(urls.Count != 0)
{
string url = urls.Dequeue();
// Get all links from the url
List<string> newUrls = GetLinks(url);
foreach (string newUrl in newUrls)
{
urls.Enqueue(newUrl);
}
}
It's slightly ugly due to there not being an AddRange method in Queue<T> but I think it's basically what you want.
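If the inner loop bothers you, it can be hidden behind a small extension method. This is only a sketch: EnqueueRange is a hypothetical helper name of my own, not something Queue<T> provides.

```csharp
using System.Collections.Generic;

public static class QueueExtensions
{
    // Hypothetical helper: enqueues every item of a sequence, in order.
    public static void EnqueueRange<T>(this Queue<T> queue, IEnumerable<T> items)
    {
        foreach (T item in items)
        {
            queue.Enqueue(item);
        }
    }
}
```

With that in scope, the body of the while loop could shrink to urls.EnqueueRange(GetLinks(url));.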
There are three strategies you can use.
Copy the List<> to a second collection (list or array - perhaps use ToArray()). Loop through that second collection, adding urls to the first.
Create a second List<>, and loop through your urls List<> adding new values to the second list. Copy those to the original list when done looping.
Use a for loop instead of a foreach loop. Grab your count up front. List should leave things indexed correctly, so if you add things they will go to the end of the list.
I prefer #3 as it doesn't have any of the overhead associated with #1 or #2. Here is an example:
var urls = new List<string>();
urls.Add("http://www.google.com");
int count = urls.Count;
for (int index = 0; index < count; index++)
{
// Get all links from the url
List<string> newUrls = GetLinks(urls[index]);
urls.AddRange(newUrls);
}
Edit: The last example (#3) assumes that you don't want to process additional URLs as they are found in the loop. If you do want to process additional URLs as they are found, just use urls.Count in the for loop instead of the local count variable as mentioned by configurator in the comments for this answer.
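A rough sketch of that urls.Count variant follows. GetLinks here is a hypothetical stand-in backed by a small in-memory table (the real one is not shown in the question), and the sketch assumes the links never form a cycle; with a cycle, this loop would never finish.

```csharp
using System.Collections.Generic;

public static class IndexLoopSketch
{
    // Hypothetical stand-in for GetLinks: a fixed, cycle-free link table.
    private static readonly Dictionary<string, List<string>> Links =
        new Dictionary<string, List<string>>
        {
            { "a", new List<string> { "b", "c" } },
            { "b", new List<string> { "c" } },
        };

    private static List<string> GetLinks(string url)
    {
        List<string> found;
        return Links.TryGetValue(url, out found) ? found : new List<string>();
    }

    public static List<string> Expand(string start)
    {
        var urls = new List<string> { start };
        // urls.Count is re-read on every pass, so items appended inside
        // the loop are visited as well.
        for (int index = 0; index < urls.Count; index++)
        {
            urls.AddRange(GetLinks(urls[index]));
        }
        return urls;
    }
}
```

Note that "c" ends up in the list twice for the start value "a", because nothing here removes duplicates.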
Use foreach with a lambda, it's more fun!
var urls = new List<string>();
var destUrls = new List<string>();
urls.Add("http://www.google.com");
urls.ForEach(i => destUrls.AddRange(GetLinks(i))); // GetLinks returns a list, so AddRange rather than Add
urls.AddRange(destUrls);
Alternatively, you could treat the collection as a queue:
IList<string> urls = new List<string>();
urls.Add("http://www.google.com");
while (urls.Count > 0)
{
string url = urls[0];
urls.RemoveAt(0);
// Get all links from the url
List<string> newUrls = GetLinks(url);
urls.AddRange(newUrls);
}
I would create two lists, add into the second, and then update the reference like this:
var urls = new List<string>();
urls.Add("http://www.google.com");
// Copy after adding the seed URL so destUrls keeps it
var destUrls = new List<string>(urls);
foreach (string url in urls)
{
// Get all links from the url
List<string> newUrls = GetLinks(url);
destUrls.AddRange(newUrls);
}
urls = destUrls;
Consider using a Queue with while loop (while q.Count > 0, url = q.Dequeue()) instead of iteration.
I assume you want to iterate over the whole list, and each item you add to it? If so I would suggest recursion:
var urls = new List<string>();
var turls = new List<string>();
turls.Add("http://www.google.com");
iterate(turls);
void iterate(List<string> u)
{
foreach(string url in u)
{
List<string> newUrls = GetLinks(url);
urls.AddRange(newUrls);
iterate(newUrls);
}
}
You can probably also create a recursive function, like this (untested):
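IEnumerable<string> GetUrl(string url)
{
// WHERE_I_GET_MY_URLS is a placeholder for the links found at url
foreach(string ret_url in WHERE_I_GET_MY_URLS)
{
yield return ret_url;
// Recurse into each link found (never terminates if the links form a cycle)
foreach(string u in GetUrl(ret_url))
yield return u;
}
}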
List<string> MyEnumerateFunction()
{
return new List<string>(GetUrl("http://www.google.com"));
}
In this case, you will not have to create two lists, since GetUrl does all the work.
But I may have missed the point of your program.
Don't change the collection you're looping through via for each. Just use a while loop on the Count property of the list and access the List items by index. This way, even if you add items, the iteration should pick up the changes.
Edit: Then again, it sort of depends on whether you WANT the new items you added to be picked up by the loop. If not, then this won't help.
Edit 2: I guess the easiest way to do it would be to just change your loop to:
foreach (string url in urls.ToArray())
This will create an Array copy of your list, and it will loop through this instead of the original list. This will have the effect of not looping over your added items.
Jon's approach is right; a queue's the right data structure for this kind of application.
Assuming that you'd eventually like your program to terminate, I'd suggest two other things:
don't use string for your URLs, use System.Uri: it provides a canonical string representation of the URL. This will be useful for the second suggestion, which is...
put the canonical string representation of each URL you process in a Dictionary. Before you enqueue a URL, check to see if it's in the Dictionary first.
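Put together, the two suggestions might look roughly like the sketch below. It is not a definitive implementation: it swaps the Dictionary for a HashSet<string> (only the keys matter here), uses Uri.AbsoluteUri as the canonical string, and takes the link lookup as a caller-supplied delegate named getLinks, which stands in for the question's GetLinks.

```csharp
using System;
using System.Collections.Generic;

public static class CrawlSketch
{
    // getLinks is an assumption: whatever returns the links found at a URL.
    public static List<Uri> Crawl(Uri start, Func<Uri, IEnumerable<Uri>> getLinks)
    {
        // Canonical strings of every URL that has ever been enqueued.
        var seen = new HashSet<string> { start.AbsoluteUri };
        var queue = new Queue<Uri>();
        queue.Enqueue(start);
        var visited = new List<Uri>();

        while (queue.Count > 0)
        {
            Uri current = queue.Dequeue();
            visited.Add(current);
            foreach (Uri link in getLinks(current))
            {
                // HashSet.Add returns false when the value is already present,
                // so each URL is enqueued at most once and the loop terminates.
                if (seen.Add(link.AbsoluteUri))
                {
                    queue.Enqueue(link);
                }
            }
        }
        return visited;
    }
}
```

Because a URL is recorded when it is enqueued rather than when it is processed, two pages that link to each other are each visited exactly once.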
It's hard to make the code better without knowing what GetLinks() does. In any event, this avoids recursion. The standard idiom is you don't alter a collection when you're enumerating over it. While the runtime could have let you do it, the reasoning is that it's a source of error, so better to create a new collection or control the iteration yourself.
create a queue with all urls.
when dequeueing, we're pretty much saying we've processed it, so add it to result.
If GetLinks() returns anything, add those to the queue and process them as well.
public List<string> ExpandLinksOrSomething(List<string> urls)
{
List<string> result = new List<string>();
Queue<string> queue = new Queue<string>(urls);
while (queue.Any())
{
string url = queue.Dequeue();
result.Add(url);
foreach( string newResult in GetLinks(url) )
{
queue.Enqueue(newResult);
}
}
return result;
}
The naive implementation assumes that GetLinks() will not return circular references. e.g. A returns B, and B returns A. This can be fixed by:
List<string> newItems = GetLinks(url).Except(result).ToList();
foreach( string newResult in newItems )
{
queue.Enqueue(newResult);
}
* As others point out using a dictionary may be more efficient depending on how many items you process.
I find it strange that GetLinks() would return a value, and then later resolve that to more URLs. Maybe all you want to do is 1-level expansion. If so, we can get rid of the Queue altogether.
public static List<string> StraightProcess(List<string> urls)
{
List<string> result = new List<string>();
foreach (string url in urls)
{
result.Add(url);
result.AddRange(GetLinks(url));
}
return result;
}
I decided to rewrite it because while other answers used queues, it wasn't apparent that they didn't run forever.
Related
I have a list of strings. Neither the number of nor the order of these strings is guaranteed. The only thing that is certain is that this list WILL at least contain my 3 strings of interest and inside those strings we'll say "string1", "string2", and "string3" will be contained within them respectively (i.e. these strings can contain more information but those keywords will definitely be in there). I then want to use these results in a function.
My current implementation to solve this is as such:
foreach(var item in myList)
{
if (item.Contains("string1"))
{
myFunction1(item);
}
else if (item.Contains("string2"))
{
myFunction2(item);
}
else if (item.Contains("string3"))
{
myFunction3(item);
}
}
Is there a better way to check string lists and apply functions to those items that match some criteria?
One approach is to use Regex for the fixed list of strings, and check which group is present, like this:
// Note the matching groups around each string
var regex = new Regex("(string1)|(string2)|(string3)");
foreach(var item in myList) {
var match = regex.Match(item);
if (!match.Success) {
continue;
}
if (match.Groups[1].Success) {
myFunction1(item);
}
else if (match.Groups[2].Success)
{
myFunction2(item);
}
else if (match.Groups[3].Success)
{
myFunction3(item);
}
}
This way all three matches would be done with a single pass through the target string.
You could reduce some of the duplicated code in the if statements by creating a Dictionary that maps the strings to their respective functions. (This snippet assumes that myList contains string values, but can easily be adapted to a list of any type.)
Dictionary<string, Action<string>> actions = new Dictionary<string, Action<string>>
{
["string1"] = myFunction1,
["string2"] = myFunction2,
["string3"] = myFunction3
};
foreach (var item in myList)
{
foreach (var action in actions)
{
if (item.Contains(action.Key))
{
action.Value(item);
break;
}
}
}
For a list of only three items, this might not be much of an improvement, but if you have a large list of strings/functions to search for it can make your code much shorter. It also means that adding a new string/function pair is a one-line change. The biggest downside is that the foreach loop is a bit more difficult to read.
I have a list of comma-separated strings like below:
List<string> IdList=new List<string>();
and each element of the list is a comma-separated string like
1,2,4,5,6,7,8,10,12,15,16
2,3,5,7,8,9,0,10,16,17
4,5,89,12,13,1,2,3,6,7,10,16
I want to apply an AND operation to this list of strings so I get output like below:
2,5,7,10,16
Is there any efficient way to implement Intersection operation?
You're actually looking for an intersection.
If you don't need the values in numeric order, you could just treat each string as just comma-separated values. Start with the first list, and just intersect each other one appropriately:
HashSet<string> set = new HashSet<string>(list[0].Split(','));
foreach (var item in list.Skip(1))
{
set.IntersectWith(item.Split(','));
}
string result = string.Join(",", set);
Complete sample code:
using System;
using System.Collections.Generic;
using System.Linq;
class Test
{
static void Main()
{
var list = new List<string>
{
"1,2,4,5,6,7,8,10,12,15,16",
"2,3,5,7,8,9,0,10,16,17",
"4,5,89,12,13,1,2,3,6,7,10,16"
};
HashSet<string> set = new HashSet<string>(list[0].Split(','));
foreach (var item in list.Skip(1))
{
set.IntersectWith(item.Split(','));
}
string result = string.Join(",", set);
Console.WriteLine(result);
}
}
Result (order not guaranteed):
2,5,7,10,16
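If numeric order does matter, one option is to parse the values and sort the final set. The following is a minimal sketch under the assumption that every value is a valid integer with no surrounding whitespace; the class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SortedIntersection
{
    public static string Intersect(IList<string> lists)
    {
        if (lists.Count == 0)
        {
            return string.Empty;
        }

        // Start from the first list and narrow it down with each later one.
        var set = new HashSet<int>(lists[0].Split(',').Select(s => int.Parse(s)));
        foreach (string item in lists.Skip(1))
        {
            set.IntersectWith(item.Split(',').Select(s => int.Parse(s)));
        }

        // HashSet has no guaranteed order, so sort explicitly before joining.
        return string.Join(",", set.OrderBy(n => n));
    }
}
```

For the three sample strings above, this yields 2,5,7,10,16 in ascending order.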
I don't know about "less memory utilization", but my first shot at this would be something along these lines (untested, coded in browser, no Visual Studio handy yadda yadda):
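// Requires using System.Linq for Distinct()
Dictionary<string,int> occurrences = new Dictionary<string,int>();
int numberOfLists = YourCollectionOfOuterLists.Count;
foreach (string list in YourCollectionOfOuterLists) {
// Distinct() so a value repeated within one list is only counted once
foreach (string value in list.Split(',').Distinct()) {
int count;
occurrences.TryGetValue(value, out count); // count stays 0 if the key is missing
occurrences[value] = count + 1;
}
}
List<string> output = new List<string>();
foreach (string key in occurrences.Keys) {
// A value is in the intersection only if every list contained it
if (occurrences[key] == numberOfLists) {
output.Add(key);
}
}
return String.Join(",", output);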
It might very well be possible to write the code more tersely, but anything that accomplishes what you seem to be after will still have to perform roughly the same steps: decide which elements exist in all lists (which is slightly non-trivial as the number of lists is unknown), then make a new list out of those values.
If you have access to it, something like Parallel.ForEach() might help cut down on wallclock execution time at least of the second loop (and possibly the first, with proper locking/synchronization in place).
If you are after something other than this, please clarify your question to describe exactly what you want.
I'm not sure about performance but you can use the Aggregate extension method to 'fold intersections'.
var data = new List<string>
{
"1,2,4,5,6,7,8,10,12,15,16",
"2,3,5,7,8,9,0,10,16,17",
"4,5,89,12,13,1,2,3,6,7,10,16",
};
var fold = data.Aggregate(data[0].Split(',').AsEnumerable(), (d1, d2) => d1.Intersect(d2.Split(',')));
Is there a quicker or more efficient way to add Strings to a List than the below example?:
List<String> apptList = new List<String>();
foreach (Appointment appointment in appointments){
String subject = appointment.Subject;
//...(continues for another 10 lines)
//...And then manually adding each String to the List:
apptList.Add(subject);
//...(continues for another 10 lines)
//And then send off List apptList to another method
}
var apptList = appointments.Select(a => a.Subject).ToList();
I'm not sure if I'm getting your code right, but since your Appointment class is already implementing IEnumerable, you should be able to call ToList() to convert it to a list in one shot.
http://msdn.microsoft.com/en-us/library/bb342261.aspx
How about this:
List<string> apptList = appointments.Select(x => x.Subject).ToList();
I have the classic case of trying to remove an item from a collection while enumerating it in a loop:
List<int> myIntCollection = new List<int>();
myIntCollection.Add(42);
myIntCollection.Add(12);
myIntCollection.Add(96);
myIntCollection.Add(25);
foreach (int i in myIntCollection)
{
if (i == 42)
myIntCollection.Remove(96); // The error is here.
if (i == 25)
myIntCollection.Remove(42); // The error is here.
}
At the beginning of the iteration after a change takes place, an InvalidOperationException is thrown, because enumerators don’t like when the underlying collection changes.
I need to make changes to the collection while iterating. There are many patterns that can be used to avoid this, but none of them seems to have a good solution:
Do not delete inside this loop, instead keep a separate “Delete List”, that you process after the main loop.
This is normally a good solution, but in my case, I need the item to be gone instantly, as “waiting” till after the main loop to really delete the item changes the logic flow of my code.
Instead of deleting the item, simply set a flag on the item and mark it as inactive. Then add the functionality of pattern 1 to clean up the list.
This would work for all of my needs, but it means that a lot of code will have to change in order to check the inactive flag every time an item is accessed. This is far too much administration for my liking.
Somehow incorporate the ideas of pattern 2 in a class that derives from List<T>. This Superlist will handle the inactive flag, the deletion of objects after the fact and also will not expose items marked as inactive to enumeration consumers. Basically, it just encapsulates all the ideas of pattern 2 (and subsequently pattern 1).
Does a class like this exist? Does anyone have code for this? Or is there a better way?
I’ve been told that accessing myIntCollection.ToArray() instead of myIntCollection will solve the problem and allow me to delete inside the loop.
This seems like a bad design pattern to me, or maybe it’s fine?
Details:
The list will contain many items and I will be removing only some of them.
Inside the loop, I will be doing all sorts of processes, adding, removing etc., so the solution needs to be fairly generic.
The item that I need to delete may not be the current item in the loop. For example, I may be on item 10 of a 30 item loop and need to remove item 6 or item 26. Walking backwards through the array will no longer work because of this. ;o(
The best solution is usually to use the RemoveAll() method:
myList.RemoveAll(x => x.SomeProp == "SomeValue");
Or, if you need certain elements removed:
MyListType[] elems = new[] { elem1, elem2 };
myList.RemoveAll(x => elems.Contains(x));
This assumes that your loop is solely intended for removal purposes, of course. If you do need to do additional processing, then the best method is usually to use a for or while loop, since then you're not using an enumerator:
for (int i = myList.Count - 1; i >= 0; i--)
{
// Do processing here, then...
if (shouldRemoveCondition)
{
myList.RemoveAt(i);
}
}
Going backwards ensures that you don't skip any elements.
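To see why direction matters, here is a small sketch (the class and method names are illustrative only) that removes even numbers with a forward loop that does not adjust its index, and again with a backward loop.

```csharp
using System.Collections.Generic;

public static class RemovalOrderDemo
{
    // Forward loop with no index adjustment: after RemoveAt(i), the next
    // element slides into slot i and is skipped when i increments.
    public static List<int> RemoveEvensForward(IEnumerable<int> source)
    {
        var list = new List<int>(source);
        for (int i = 0; i < list.Count; i++)
        {
            if (list[i] % 2 == 0)
            {
                list.RemoveAt(i);
            }
        }
        return list;
    }

    // Backward loop: removing at i only shifts elements that were
    // already examined, so nothing is skipped.
    public static List<int> RemoveEvensBackward(IEnumerable<int> source)
    {
        var list = new List<int>(source);
        for (int i = list.Count - 1; i >= 0; i--)
        {
            if (list[i] % 2 == 0)
            {
                list.RemoveAt(i);
            }
        }
        return list;
    }
}
```

Given the input 2, 4, 6, 1, the forward version leaves 4 and 1 behind (the 4 is skipped), while the backward version leaves only 1.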
Response to Edit:
If you're going to have seemingly arbitrary elements removed, the easiest method might be to just keep track of the elements you want to remove, and then remove them all at once after. Something like this:
List<int> toRemove = new List<int>();
foreach (var elem in myList)
{
// Do some stuff
// Check for removal
if (needToRemoveAnElement)
{
toRemove.Add(elem);
}
}
// Remove everything here
myList.RemoveAll(x => toRemove.Contains(x));
If you must both enumerate a List<T> and remove from it then I suggest simply using a while loop instead of a foreach
var index = 0;
while (index < myList.Count) {
if (someCondition(myList[index])) {
myList.RemoveAt(index);
} else {
index++;
}
}
I know this post is old, but I thought I'd share what worked for me.
Create a copy of the list for enumerating, and then in the for each loop, you can process on the copied values, and remove/add/whatever with the source list.
private void ProcessAndRemove(IList<Item> list)
{
foreach (var item in list.ToList())
{
if (item.DeterminingFactor > 10)
{
list.Remove(item);
}
}
}
When you need to iterate through a list and might modify it during the loop then you are better off using a for loop:
for (int i = 0; i < myIntCollection.Count; i++)
{
if (myIntCollection[i] == 42)
{
myIntCollection.RemoveAt(i); // RemoveAt takes an index; Remove(i) would remove the value i instead
i--;
}
}
Of course you must be careful: for example, I decrement i whenever an item is removed, as otherwise we would skip entries (an alternative is to go backwards through the list).
If the loop exists only to remove items, you should just use RemoveAll as dlev has suggested (it is a List<T> method, so it does not require LINQ).
As you enumerate the list, add the items you want to KEEP to a new list. Afterward, assign the new list to myIntCollection:
List<int> myIntCollection=new List<int>();
myIntCollection.Add(42);
List<int> newCollection=new List<int>(myIntCollection.Count);
foreach(int i in myIntCollection)
{
// ShouldDelete is a placeholder for your own removal test
if (!ShouldDelete(i))
newCollection.Add(i);
}
myIntCollection = newCollection;
Let's take your code:
List<int> myIntCollection=new List<int>();
myIntCollection.Add(42);
myIntCollection.Add(12);
myIntCollection.Add(96);
myIntCollection.Add(25);
If you want to change the list while you're in a foreach, you must iterate over a copy by calling .ToList():
foreach(int i in myIntCollection.ToList())
{
if (i == 42)
myIntCollection.Remove(96);
if (i == 25)
myIntCollection.Remove(42);
}
For those it may help, I wrote this Extension method to remove items matching the predicate and return the list of removed items.
public static IList<T> RemoveAllKeepRemoved<T>(this IList<T> source, Predicate<T> predicate)
{
IList<T> removed = new List<T>();
for (int i = source.Count - 1; i >= 0; i--)
{
T item = source[i];
if (predicate(item))
{
removed.Add(item);
source.RemoveAt(i);
}
}
return removed;
}
How about
int[] tmp = new int[myIntCollection.Count];
myIntCollection.CopyTo(tmp);
foreach(int i in tmp)
{
myIntCollection.Remove(42); //The error is no longer here.
}
If you're interested in high performance, you can use two lists. The following minimises garbage collection, maximises memory locality and never actually removes an item from a list, which is very inefficient if it's not the last item.
private void RemoveItems()
{
_newList.Clear();
foreach (var item in _list)
{
item.Process();
if (!item.NeedsRemoving())
_newList.Add(item);
}
var swap = _list;
_list = _newList;
_newList = swap;
}
Just figured I'd share my solution to a similar problem where I needed to remove items from a list while processing them.
So basically a "foreach" that removes each item from the list after it has been processed.
My test:
var list = new List<TempLoopDto>();
list.Add(new TempLoopDto("Test1"));
list.Add(new TempLoopDto("Test2"));
list.Add(new TempLoopDto("Test3"));
list.Add(new TempLoopDto("Test4"));
list.PopForEach((item) =>
{
Console.WriteLine($"Process {item.Name}");
});
Assert.That(list.Count, Is.EqualTo(0));
I solved this with an extension method, "PopForEach", that performs an action on each item and then removes it from the list.
public static class ListExtensions
{
public static void PopForEach<T>(this List<T> list, Action<T> action)
{
var index = 0;
while (index < list.Count) {
action(list[index]);
list.RemoveAt(index);
}
}
}
Hope this is helpful to anyone.
Currently you are using a list. If you could use a dictionary instead, it would be much easier. I'm making some assumptions that you are really using a class instead of just a list of ints. This would work if you had some form of unique key. In the dictionary, object can be any class you have and int would be any unique key.
Dictionary<int, object> myIntCollection = new Dictionary<int, object>();
myIntCollection.Add(42, "");
myIntCollection.Add(12, "");
myIntCollection.Add(96, "");
myIntCollection.Add(25, "");
// Enumerate a snapshot of the keys (ToList() needs using System.Linq);
// removing entries while enumerating Keys directly can throw InvalidOperationException
foreach (int i in myIntCollection.Keys.ToList())
{
//Check to make sure the key wasn't already removed
if (myIntCollection.ContainsKey(i))
{
if (i == 42) //You can test against the key
myIntCollection.Remove(96);
if (myIntCollection[i].Equals(25)) //or you can test against the value
myIntCollection.Remove(42);
}
}
Or you could use
Dictionary<myUniqueClass, bool> myCollection; //Bool is just an empty place holder
The nice thing is that, once you enumerate a snapshot of the keys (Keys.ToList()), you can do anything you want to the underlying dictionary and the loop doesn't care. The snapshot also doesn't update with added or removed entries, which is why the ContainsKey check is needed.
If I have:
List<string> myList1;
List<string> myList2;
myList1 = getMeAList();
// Checked myList1, it contains 4 strings
myList2 = getMeAnotherList();
// Checked myList2, it contains 6 strings
myList1.Concat(myList2);
// Checked mylist1, it contains 4 strings... why?
I ran code similar to this in Visual Studio 2008 and set break points after each execution. After myList1 = getMeAList();, myList1 contains four strings, and I pressed the plus button to make sure they weren't all nulls.
After myList2 = getMeAnotherList();, myList2 contains six strings, and I checked to make sure they weren't null... After myList1.Concat(myList2); myList1 contained only four strings. Why is that?
Concat returns a new sequence without modifying the original list. Try myList1.AddRange(myList2).
Try this:
myList1 = myList1.Concat(myList2).ToList();
Concat returns an IEnumerable<T> that is the two lists put together, it doesn't modify either existing list. Also, since it returns an IEnumerable, if you want to assign it to a variable that is List<T>, you'll have to call ToList() on the IEnumerable<T> that is returned.
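The difference between the two calls can be seen in a small sketch like this one (the class name and the sample values are made up for illustration):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ConcatDemo
{
    // Returns the three counts observed at each step.
    public static int[] Counts()
    {
        var list1 = new List<string> { "a", "b" };
        var list2 = new List<string> { "c" };

        // The returned sequence is discarded, so list1 is untouched.
        list1.Concat(list2);
        int afterConcat = list1.Count;

        // Materializing the sequence gives a new, combined list.
        List<string> combined = list1.Concat(list2).ToList();

        // AddRange is the call that actually mutates list1.
        list1.AddRange(list2);

        return new[] { afterConcat, combined.Count, list1.Count };
    }
}
```

The counts come out as 2, 3 and 3: the bare Concat call changes nothing, while both ToList() on the result and AddRange produce three items.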
targetList = list1.Concat(list2).ToList();
This works fine. As previously said, Concat returns a new sequence, and converting the result to a List does the job.
It's also worth noting that the Concat call itself runs in constant time and constant memory, because it is evaluated lazily; the cost is paid when the result is enumerated (for example, by ToList()).
For example, the following code
long boundary = 60000000;
for (long i = 0; i < boundary; i++)
{
list1.Add(i);
list2.Add(i);
}
var listConcat = list1.Concat(list2);
var list = listConcat.ToList();
list1.AddRange(list2);
gives the following timing/memory metrics:
After lists filled mem used: 1048730 KB
concat two enumerables: 00:00:00.0023309 mem used: 1048730 KB
convert concat to list: 00:00:03.7430633 mem used: 2097307 KB
list1.AddRange(list2) : 00:00:00.8439870 mem used: 2621595 KB
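The near-zero cost of the Concat line comes from deferred execution: nothing is copied until the sequence is enumerated. A small sketch of that behaviour, with illustrative names and values of my own:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredConcatDemo
{
    public static List<int> Run()
    {
        var list1 = new List<int> { 1, 2 };
        var list2 = new List<int> { 3 };

        // No elements are read here; Concat only records its two sources.
        IEnumerable<int> lazy = list1.Concat(list2);

        // This change happens before enumeration, so it is still picked up.
        list1.Add(99);

        // The sources are actually walked here.
        return lazy.ToList();
    }
}
```

Run() returns 1, 2, 99, 3, which shows that the sources are read at ToList() time rather than when Concat is called.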
I know this is old but I came upon this post quickly thinking Concat would be my answer. Union worked great for me. Note, it returns only unique values but knowing that I was getting unique values anyway this solution worked for me.
namespace TestProject
{
public partial class Form1 :Form
{
public Form1()
{
InitializeComponent();
List<string> FirstList = new List<string>();
FirstList.Add("1234");
FirstList.Add("4567");
// In my code, I know I would not have this here but I put it in as a demonstration that it will not be in the secondList twice
FirstList.Add("Three");
List<string> secondList = GetList(FirstList);
foreach (string item in secondList)
Console.WriteLine(item);
}
private List<String> GetList(List<string> SortBy)
{
List<string> list = new List<string>();
list.Add("One");
list.Add("Two");
list.Add("Three");
list = list.Union(SortBy).ToList();
return list;
}
}
}
The output is:
One
Two
Three
1234
4567
Take a look at my implementation. It's safe from null lists.
IList<string> all= new List<string>();
if (letterForm.SecretaryPhone!=null)// first list may be null
all=all.Concat(letterForm.SecretaryPhone).ToList();
if (letterForm.EmployeePhone != null)// second list may be null
all= all.Concat(letterForm.EmployeePhone).ToList();
if (letterForm.DepartmentManagerPhone != null) // this is not a list (it's just a string variable), so wrap it in an array, then concat it
all = all.Concat(new []{letterForm.DepartmentManagerPhone}).ToList();