Looping through associated Lists in a class? - c#

I have a class that has several List<T> objects in it. These Lists are "associated" so that the first items in each are related, and the second ones, and so on (kind of like fields within a single record). I want to loop through the Lists together to alter some of the data simultaneously per "record".
With a foreach loop, I can loop through one List without tracking the record via i or some such. However, I don't know how to simultaneously access the related items in the other Lists. Do I have to count it out using a variable like i, or is there a better way? I'm still pretty new to generics and class-based programming. Am I totally missing a better way to arrange this data?

So this is kind of a fun problem... Note that I suspect some different data modeling might have been able to get around this issue, but if you stored the related items together in a Tuple you could get away from having sync'ed lists... It seems very dangerous to have these sync'ed lists and rely on the fact that they should all correspond at "i" in that any sorting, grouping, or paging (Skip/Take) could break this paradigm.
If you stored them in a List<Tuple<ItemTypeFromList1, ItemTypeFromList2, ... ItemTypeFromListN>>, you could keep the related items together in a single list, iterate over it once, and act on the N items in each tuple as needed.
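A minimal sketch of the tuple idea, assuming three hypothetical fields (a name, an age, and a flag) that would otherwise live in three separate lists:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data: each tuple holds the fields of one "record".
var records = new List<Tuple<string, int, bool>>
{
    Tuple.Create("alice", 30, true),
    Tuple.Create("bob", 25, false),
};

var summaries = new List<string>();
foreach (var record in records)
{
    // Item1..Item3 always belong to the same record, however the list is sorted or paged.
    summaries.Add($"{record.Item1}:{record.Item2}:{record.Item3}");
}

Console.WriteLine(string.Join(", ", summaries));
```

Note that the items of a Tuple are read-only, so altering a field means replacing the tuple at that position with a new one.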

Use a standard for loop and an index (your i) that will allow you to access the same element in each array. There is no better way to do it.

How about collecting all the data for a 'row' in a single class and placing instances of that class in a single list, instead of trying to keep multiple lists in sync?
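As a rough sketch of that suggestion, assuming a hypothetical Customer record with two placeholder fields:

```csharp
using System;
using System.Collections.Generic;

// One list of records replaces several parallel lists.
var customers = new List<Customer>
{
    new Customer { Name = "alice", Balance = 10m },
    new Customer { Name = "bob", Balance = 20m },
};

foreach (var customer in customers)
{
    // All fields of the record are altered together; no index is needed.
    customer.Name = customer.Name.ToUpperInvariant();
    customer.Balance += 5m;
}

Console.WriteLine($"{customers[0].Name} {customers[0].Balance}");

// Hypothetical record type; the property names are placeholders.
class Customer
{
    public string Name { get; set; } = "";
    public decimal Balance { get; set; }
}
```

Because Customer is a class, the foreach variable refers to the stored object, so changes made inside the loop are visible in the list afterwards.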

The easiest way I can think of would be to use a standard for-loop. When the index is important I always prefer for-loops instead of foreach.
for (int i = 0; i < list1.Count; i++)
{
list1[i].someMethod();
list2[i].someMethod();
...
}
I assume all lists are of equal length when they are related as you say.
You might want to look into grouping the related items together in a single class and then have only one list, instead of multiple.

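Try using the following code:
foreach (var i in firstList)
{
// LastIndexOf scans firstList on every pass and returns the wrong position if firstList contains duplicates.
int index = firstList.LastIndexOf(i);
var s1 = secondList[index];
var s2 = thirdList[index];
}
Hope this is the answer you want. :)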

Related

Converting a for loop into Task.Parallel.For

I have a method, bool isExistImage(int i), whose task is to detect an image and return whether it exists or not.
I have a PDF of 100+ pages, which I split, and I pass only the file name to the method. The file names are the page numbers of the main PDF file: 1, 2, 3, ..., 125, ...
After detecting the images, my method correctly saves the list of pages. For that I used this code:
ArrayList array1 = new ArrayList();
for (int i = 1; i < pdf.Length; i++)
{
if(isExistImage(i))
{
array1.Add(i);
}
}
This process runs for more than an hour (because of the internal work in the isExistImage() method). I can assure you that no objects or variables are global outside the method scope.
So, to shorten the time, I used a Parallel.For loop. Here is what I did:
System.Threading.Tasks.Parallel.For(1, pdf.Length, i =>
{
if (isExistImage(i))
array1.Add(i);
});
But this is not working properly. Sometimes the image detection is right, but most of the time it's wrong. When I use the non-parallel for loop, it's always right.
I don't understand what the problem is. What should I apply here? Is there a technique I am missing?
Your problem is that ArrayList (and most other .Net collections) is not thread-safe.
There are several ways to fix this, but I think that in this case, the best option is to use PLINQ:
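// Range takes (start, count), so this covers pages 1 .. pdf.Length - 1, like the original loop.
List<int> pagesWithImages = ParallelEnumerable.Range(1, pdf.Length - 1)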
.Where(i => isExistImage(i))
.ToList();
This will use multiple threads to call the (weirdly named) isExistImage method, which is exactly what you want, and then return a List<int> containing the indexes that matched the condition.
The returned list won't be sorted. If you want that, add AsOrdered() before the Where().
BTW, you really shouldn't be using ArrayList. If you want a list of integers, use List<int>.
ArrayList isn't thread-safe; look into the concurrent collections in System.Collections.Concurrent.
Is isExistImage thread-safe? I.e., are you locking before updating any member variables?
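If you would rather keep Parallel.For, one option is to collect the results in a concurrent collection. The sketch below assumes a stand-in isExistImage (here called IsExistImage, with made-up logic) and an assumed page count in place of pdf.Length; the real detector still has to be thread-safe itself.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the real detector; it must itself be thread-safe.
static bool IsExistImage(int page) => page % 3 == 0;

int pageCount = 20; // assumed value, standing in for pdf.Length

var found = new ConcurrentBag<int>();
Parallel.For(1, pageCount, i =>
{
    if (IsExistImage(i))
    {
        found.Add(i); // safe to call from several threads at once
    }
});

// ConcurrentBag is unordered, so sort before use.
var pagesWithImages = found.OrderBy(p => p).ToList();
Console.WriteLine(string.Join(",", pagesWithImages));
```

The upper bound of Parallel.For is exclusive, so this visits the same pages as the original for loop.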

Which is better way of passing string to List in C# highlighting listbox based on database values

I would appreciate if someone can tell me what is the better way of defining list and passing a string to it
I am not sure which one to use or which one is better from performance point of view
var selection = "28,2,10,30,100,51";
// Option 1
List<string> categories = selection.Split(',').ToList();
// Option 2
List<string> categories = new List<string>(selection.Split(','));
I actually want to highlight the ListBox items based on the database selection.
After creating my list, I loop through the items and use the following code to highlight the selection in the multi-selection list box in ASP.NET:
foreach (ListItem item in lstCatID.Items)
{
if (categories.Contains(item.Value))
item.Selected = true;
}
Is this the best way to do it, or can it be done some other way to improve performance?
ToList internally calls the List constructor taking an argument of type IEnumerable so for both of your cases it would be same.
You should see: Reimplementing LINQ to Objects: Part 20 - ToList (Jon Skeet)
You may be wondering why we even need ToList, if we could just create
a list by calling the constructor directly. The difference is that in
order to call a constructor, you need to specify the element type as
the type argument.
It would be better to time them both using Stopwatch to see the difference. Also, first make sure your code works and only then worry about performance. Performance optimization for this kind of task usually results in negligible improvements.
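A rough way to time the two options with Stopwatch might look like the sketch below. The iteration count is arbitrary, and timings from a loop like this are only indicative.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

var selection = "28,2,10,30,100,51";
const int iterations = 100_000; // arbitrary repeat count, chosen for illustration

var viaToList = new List<string>();
var viaConstructor = new List<string>();

// Time the ToList() variant.
var stopwatch = Stopwatch.StartNew();
for (int n = 0; n < iterations; n++)
{
    viaToList = selection.Split(',').ToList();
}
stopwatch.Stop();
Console.WriteLine($"ToList():         {stopwatch.ElapsedMilliseconds} ms");

// Time the constructor variant.
stopwatch.Restart();
for (int n = 0; n < iterations; n++)
{
    viaConstructor = new List<string>(selection.Split(','));
}
stopwatch.Stop();
Console.WriteLine($"List constructor: {stopwatch.ElapsedMilliseconds} ms");
```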
If you only read the values, try using IEnumerable<string> instead of List<string>; it is a narrower, read-only view of the sequence. With IEnumerable, LINQ operators can defer work until the sequence is actually enumerated. Contains, which you are using here, works on any IEnumerable<string>, so that is probably your best bet. It is also a good choice when you pass a sequence of items between two objects, because the more restrictive type offers the receiver no Add or Remove.
var selection = "28,2,10,30,100,51";
IEnumerable<string> categories = selection.Split(',');
foreach (ListItem item in lstCatID.Items)
{
if (categories.Contains(item.Value))
item.Selected = true;
}

C# preventing Collection Was Modified exception

Is
foreach(T value in new List<T>(oldList) )
dangerous (costly) when oldList contains 1 million objects of type T?
More generally, what is the best way to enumerate over oldList given that elements can be added or removed during the enumeration?
The general rule is, you should not modify the same collection in which you are enumerating. If you want to do something like that, keep another collection which will keep track of which elements to add/remove from the original collection and then after exiting from the loop, perform the add/remove operation on the original collection.
I usually just create a list for all the objects to be removed or added.
Within the foreach I add the items to the appropriate collections, and I modify the original collection after the foreach has completed (looping through the removeItems and addItems collections),
just like this:
var itemsToBeRemoved = new List<T>();
foreach (T item in myHugeList)
{
if (/*<condition>*/)
itemsToBeRemoved.Add(item);
}
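// List<T>.RemoveRange takes an index and a count, so remove the collected items one by one.
foreach (T item in itemsToBeRemoved)
myHugeList.Remove(item);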
You could iterate through the list without using an enumerator, so do something like...
for(int i = 0;i<oldList.Count;i++) {
var value = oldList[i];
...
if(itemRemoveCondition) {
oldList.RemoveAt(i--);
}
}
If you mean that objects can be added or removed from another thread, I would:
1. synchronize the threads
2. in the add/remove threads, create a list of items to be added or deleted
3. then delete these items in a critical section (so it stays small; you don't have to synchronize while adding the items to the delete list)
If you don't want to do that, you can use for instead of foreach. That avoids the exception, but you have to take extra care not to get other kinds of exceptions.
foreach (T value in oldList.ToList()) - give it a try
For me, the first thing is that you should consider using some kind of data paging, because having a list of a million items could be dangerous in itself.
Have you heard about Unit of Work pattern?
You can implement it so you mark objects for create, update or delete, and later, you call "SaveChanges", "Commit" or any other doing the job of "apply changes", and you'll get done.
For example, you iterate over the enumerable (oldList) and you mark them as "delete". Later, you call "SaveChanges" and the more abstract, generic unit of work will iterate over the small, filtered list of objects to work with.
http://martinfowler.com/eaaCatalog/unitOfWork.html
Anyway, avoid lists of a million items. You should work with paged lists of objects.
It will be 'slow' but there is not much more you can do about it, except running it on a background thread. E.g. using a BackgroundWorker.
If your operations on the list only occur on one thread, the correct approach is to put the items to add or remove in separate lists and perform those operations after your iteration has finished.
If you use multiple threads, you will have to look into multithreaded programming and, e.g., use locks or, probably better, a ReaderWriterLock.
UPDATE:
As mentioned in another Stack Overflow question, this is now possible without any effort in .NET 4.0 when using concurrent collections.
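As an illustration of the concurrent-collection route, here is a small sketch using ConcurrentDictionary, whose enumerator tolerates modifications made during the loop. The keys and values are made up for the example.

```csharp
using System;
using System.Collections.Concurrent;

var items = new ConcurrentDictionary<int, string>();
for (int i = 0; i < 10; i++)
{
    items[i] = "item" + i;
}

// Enumerating a ConcurrentDictionary does not throw when it is modified
// during the loop, so entries can be removed in place.
foreach (var pair in items)
{
    if (pair.Key % 2 == 0)
    {
        items.TryRemove(pair.Key, out _);
    }
}

Console.WriteLine(items.Count);
```

The enumeration is not a moment-in-time snapshot, so when other threads modify the dictionary during the loop, the loop may or may not see their changes.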
If you modify a collection inside a foreach loop over that same collection, you will get this error, as in the code below.
List<string> li = new List<string>();
li.Add("bhanu");
li.Add("test");
foreach (string s in li)
{
li.Remove(s);
}
Solution - use For Loop as below.
for (int i = 0; i < li.Count; i++)
{
li.RemoveAt(i);
i--;
}
You can use a flag to redirect modifications to a temporary list while the original is being enumerated.
/// where you are enumerating
isBeingEnumerated = true
foreach(T value in new List<T>(oldList) )
isBeingEnumerated = false
SyncList(oldList with temporaryList)
/// where you are modifying while enumerating
if isBeingEnumerated then
use a temporaryList to make the changes.

Quickly retrieve the subset of properties used in a huge collection in C#

I have a huge Collection (which I can cast as an enumerable using OfType<>()) of objects. Each of these objects has a Category property, which is drawn from a list somewhere else in the application. This Collection can reach sizes of hundreds of items, but it is possible that only, say, 6/30 of the possible Categories are actually used. What is the fastest method to find these 6 Categories? The size of the huge Collection discourages me from just iterating across the entire thing and returning all unique values, so is there a faster method of accomplishing this?
Ideally I'd collect the categories into a List<string>.
If you are using .NET 3.5 then try this:
List<string> categories = collection
.Cast<Foo>()
.Select(foo => foo.Category)
.Distinct()
.ToList();
It should be very fast.
I assume these objects originally came from a database? If so then you might want to ask the database to do the work for you. If there is an index on that column then you will get the result close to instantly without even having to fetch the objects into memory.
"The size of the huge Collection discourages me from just iterating across the entire thing and returning all unique values"
I am afraid in order to find all used categories, you will have to look at each item once, so you can hardly avoid iterating (unless you keep track of the used categories while building your collection).
Check whether Mark Byers's solution is fast enough for you, and only worry about its performance if it isn't.
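Keeping track of the used categories while the collection is built could be sketched as follows. The AddItem helper and the Foo type are assumptions for illustration; the idea only holds if every insertion goes through the helper.

```csharp
using System;
using System.Collections.Generic;

var items = new List<Foo>();
var usedCategories = new HashSet<string>();

// Every insertion goes through this helper so the set stays in step with the list.
void AddItem(Foo item)
{
    items.Add(item);
    usedCategories.Add(item.Category); // HashSet ignores values it already holds
}

AddItem(new Foo { Category = "books" });
AddItem(new Foo { Category = "music" });
AddItem(new Foo { Category = "books" });

var categories = new List<string>(usedCategories);
Console.WriteLine(string.Join(", ", categories));

// Hypothetical item type with only the property that matters here.
class Foo
{
    public string Category { get; set; } = "";
}
```

This sketch does not handle removals; if items can be removed, the set would need a per-category count or a rebuild.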

Hopefully simple question about modifying dictionaries in C#

I have a huge dictionary of blank values in a variable called questions, like so:
struct movieuser {blah blah blah}
Dictionary<movieuser, float> questions = new Dictionary<movieuser, float>();
So I am looping through this dictionary and need to fill in the "answers", like so:
for(var k = questions.Keys.GetEnumerator();k.MoveNext(); )
{
questions[k.Current] = retrieveGuess(k.Current.userID, k.Current.movieID);
}
Now, this doesn't work, because I get an InvalidOperationException from trying to modify the dictionary I am looping through. However, you can see that the code should work fine - since I am not adding or deleting any values, just modifying the value. I understand, however, why it is afraid of my attempting this.
What is the preferred way of doing this? I can't figure out a way to loop through a dictionary WITHOUT using iterators.
I don't really want to create a copy of the whole array, since it is a lot of data and will eat up my RAM like it's still Thanksgiving.
Thanks,
Dave
Matt's answer, getting the keys first, separately is the right way to go. Yes, there'll be some redundancy - but it will work. I'd take a working program which is easy to debug and maintain over an efficient program which either won't work or is hard to maintain any day.
Don't forget that if you make MovieUser a reference type, the array will only be as large as the references to your users, and that's pretty small. A million users will only take up 4MB (or 8MB on x64). How many users have you really got?
Your code should therefore be something like:
IEnumerable<MovieUser> users = RetrieveUsers();
IDictionary<MovieUser, float> questions = new Dictionary<MovieUser, float>();
foreach (MovieUser user in users)
{
questions[user] = RetrieveGuess(user);
}
If you're using .NET 3.5 (and can therefore use LINQ), it's even easier:
IDictionary<MovieUser, float> questions =
RetrieveUsers().ToDictionary(user => user, user => RetrieveGuess(user));
Note that if RetrieveUsers() can stream the list of users from its source (e.g. a file) then it will be efficient anyway, as you never need to know about more than one of them at a time while you're populating the dictionary.
A few comments on the rest of your code:
Code conventions matter. Capitalise the names of your types and methods to fit in with other .NET code.
You're not calling Dispose on the IEnumerator<T> produced by the call to GetEnumerator. If you just use foreach your code will be simpler and safer.
MovieUser should almost certainly be a class. Do you have a genuinely good reason for making it a struct?
Is there any reason you can't just populate the dictionary with both keys and values at the same time?
foreach(var key in someListOfKeys)
{
questions.Add(key, retrieveGuess(key.userID, key.movieID));
}
Store the dictionary keys in a temporary collection, then loop over the temporary collection and use each key as your indexer parameter. This should get you around the exception.
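A minimal sketch of the temporary-key-collection approach, using string keys and a made-up RetrieveGuess in place of the question's movieuser struct and retrieveGuess method:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for retrieveGuess.
static float RetrieveGuess(string user) => user.Length;

var questions = new Dictionary<string, float>
{
    ["alice"] = 0f,
    ["bob"] = 0f,
};

// ToList() copies the keys, so the loop enumerates the copy, not the dictionary.
foreach (var key in questions.Keys.ToList())
{
    questions[key] = RetrieveGuess(key);
}

Console.WriteLine($"{questions["alice"]} {questions["bob"]}");
```

Only the keys are copied, not the values, so the extra memory is one key per entry.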
