C# index out of range, multi thread, Logs


I am trying to read event logs from Windows. To speed this up, I first find the days that have log entries, and then start one thread per day to load that day's entries. In the function work1, the error "Index was outside the bounds of the array" appears. If I do the work on a single thread it works fine, but it is very slow.
I tried the advice from
"Index was outside the bounds of the array while trying to start multiple threads"
but it does not work.
I think the problem is in the IEnumerable: it seems not to be fully loaded by the time the loop starts.
Sorry for my English, I am from Uzbekistan.
var result = from EventLogEntry elog in aLog.Entries
             orderby elog.TimeGenerated
             select elog.TimeGenerated;
DateTime OLDentry = result.First();
DateTime NEWentry = result.Last();
DTN.Add(result.First());
foreach (var dtn in result) {
    if (dtn.Year != DTN.Last().Year |
        dtn.Month != DTN.Last().Month |
        dtn.Day != DTN.Last().Day
    ) {
        DTN.Add(dtn);
    }
}
List<Thread> t = new List<Thread>();
int i = 0;
foreach (DateTime day in DTN) {
    DateTime yad = day;
    var test = from EventLogEntry elog in aLog.Entries
               where (elog.TimeGenerated.Year == day.Year) &&
                     (elog.TimeGenerated.Month == day.Month) &&
                     (elog.TimeGenerated.Day == day.Day)
               select elog;
    var tt2 = test.ToArray();
    t.Add(new Thread(() => work1(tt2)));
    t[i].Start();
    i++;
}
static void work1(IEnumerable<EventLogEntry> b) {
    var z = b;
    for (int i = 0; i < z.Count(); i++) {
        Console.Write(z + "\n");
    }
}

Replace var tt2 = test; with var tt2 = test.ToArray();
The error comes from a mistake you make several times in your code: you enumerate the same data over and over. Calling .Count() enumerates the data again, and in this case the data ends up conflicting with cached values inside the EventLogEntry enumerator.
LINQ does not return a data set. It returns a query. A variable of type IEnumerable<T> may return different values every time you call Count(), First() or Last(). Calling .ToArray() makes C# run the query once and store the result in an array.
You should generally enumerate an IEnumerable<T> only once.
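The difference between a query and a stored result can be seen in a small sketch. It does not use the event log at all; a plain List<int> stands in for the changing source, and the names DeferredDemo, CountLazy and CountSnapshot are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts a lazy query twice, with the source changing in between.
    public static (int Before, int After) CountLazy(List<int> source)
    {
        IEnumerable<int> query = source.Where(n => n % 2 == 0); // nothing runs yet
        int before = query.Count(); // first enumeration
        source.Add(100);            // the source changes between enumerations
        int after = query.Count();  // second enumeration sees the new item
        return (before, after);
    }

    // Same steps, but the query is materialized once with ToArray().
    public static (int Before, int After) CountSnapshot(List<int> source)
    {
        int[] snapshot = source.Where(n => n % 2 == 0).ToArray(); // runs the query once
        int before = snapshot.Length;
        source.Add(100);            // later changes do not affect the array
        int after = snapshot.Length;
        return (before, after);
    }

    public static void Main()
    {
        Console.WriteLine(CountLazy(new List<int> { 1, 2, 3, 4 }));     // (2, 3)
        Console.WriteLine(CountSnapshot(new List<int> { 1, 2, 3, 4 })); // (2, 2)
    }
}
```

With the array, each worker thread gets a fixed snapshot that it can walk with foreach, instead of calling Count() on a live query in every loop iteration.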

Related

Async call to database inside for loop

I am trying to call a database and store the results in record. The stored procedure always returns 4 records, but sometimes I get only 3, and sometimes the count shows 4 but the first record is null. What is wrong with the code?
List<TicketTextOutputRecord> record = new List<TicketTextOutputRecord>();
List<Task> listOfTasks = new List<Task>();
for (int i = 0; i < 2; i++)
{
listOfTasks.Add(Task.Factory.StartNew(() => {
IDataCommand cmd = ds.CreateCommand("DropTicket", "returnTableTypeData",
CommandType.StoredProcedure);
IDataReader reader = cmd.ExecuteReader();
while (reader.Read())
{
TicketTextOutputRecord rec = new TicketTextOutputRecord();
rec.ValidationNumber = (string)reader["ValidationNumber"];
rec.IsSuccess = (bool)reader["IsSuccess"];
rec.Error = (string)reader["Error"];
record.Add(rec);
}
//reader.Close();
//reader.Dispose();
}));
}
Task.WaitAll(listOfTasks.ToArray());
return record;

This sounds like a concurrency error; it is not intended that a connection is accessed concurrently; you are allowed overlapping readers (if MARS is enabled), but the actual access must still not be concurrent in terms of multiple threads trying to do things at the same time. The moment you do that, all behavior is undefined. Frankly, I'd just execute these sequentially, not concurrently. You are allowed to work concurrently if you use completely unrelated connections, note.

I fixed exactly the same error before.
List<T> is not thread-safe.
When items are added concurrently, the list's internal state can be corrupted, so an element can come back as null even though a non-null value was added.
This reproduces the problem:
var list = new List<object>();
var listOfTasks = new List<Task>();
for (var i = 0; i < 10; i++)
{
listOfTasks.Add(Task.Factory.StartNew(() => list.Add(new object())));
}
Task.WaitAll(listOfTasks.ToArray());
Using a thread-safe list will fix the problem. But I'd change the task to return the result rather than adding it to a list. Then use LINQ or Task.WhenAll to get those results.
var listOfTasks = new List<Task<object>>();
for (var i = 0; i < 10; i++)
{
listOfTasks.Add(Task.Factory.StartNew(() => new object()));
}
var list = await Task.WhenAll(listOfTasks.ToArray());
// OR
Task.WaitAll(listOfTasks.ToArray());
var list = listOfTasks.Select(t => t.Result).ToList();
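If the shared collection has to stay, the "thread-safe list" option could look roughly like the sketch below. It shows two variants: ConcurrentBag<T> from System.Collections.Concurrent, and a plain List<T> guarded by a lock. The class and method names (SafeCollect, CollectWithBag, CollectWithLock) are invented for the example, and new object() stands in for the real record.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class SafeCollect
{
    // Variant 1: ConcurrentBag<T> accepts Add calls from several threads at once.
    public static int CollectWithBag(int taskCount)
    {
        var bag = new ConcurrentBag<object>();
        Task[] tasks = Enumerable.Range(0, taskCount)
            .Select(_ => Task.Run(() => bag.Add(new object())))
            .ToArray();
        Task.WaitAll(tasks);
        return bag.Count;
    }

    // Variant 2: a plain List<T>, with every Add serialized by a lock.
    public static int CollectWithLock(int taskCount)
    {
        var list = new List<object>();
        var gate = new object();
        Task[] tasks = Enumerable.Range(0, taskCount)
            .Select(_ => Task.Run(() => { lock (gate) { list.Add(new object()); } }))
            .ToArray();
        Task.WaitAll(tasks);
        return list.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CollectWithBag(10));  // 10
        Console.WriteLine(CollectWithLock(10)); // 10
    }
}
```

Note that ConcurrentBag<T> does not preserve insertion order, so the lock variant may be the better fit when the order of the records matters.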

Iterate through list in paged manner, within another loop

I have three collections. First, a collection of days. Next, a collection of time spans in each day. These time spans are the same for each day. Next, I have a collection of sessions.
There are 4 days. There are 6 time spans. There are 30 sessions.
I need to iterate through each day, assigning all of the time spans to each day the same way for each day. However, I need to assign the sessions to time blocks in sequence. For example, day 1 gets all 6 time spans, but only the first 6 sessions, 1-6. Day 2 gets the same time spans, but gets the next 6 sessions, 7-12.
How can I do this within the same method?
Here's what I have so far, but I'm having trouble wrapping my head around the paged iteration part.
var timeSlots = TimeSlotDataAccess.GetItems(codeCampId);
var assignableSlotCount = timeSlots.Where(t => !t.SpanAllTracks);
// determine how many days the event lasts for
agenda.NumberOfDays = (int)(agenda.CodeCamp.EndDate - agenda.CodeCamp.BeginDate).TotalDays;
// iterate through each day
agenda.EventDays = new List<EventDayInfo>(agenda.NumberOfDays);
var dayCount = 0;
while (dayCount <= agenda.NumberOfDays)
{
var eventDate = agenda.CodeCamp.BeginDate.AddDays(dayCount);
var eventDay = new EventDayInfo()
{
Index = dayCount,
Day = eventDate.Day,
Month = eventDate.Month,
Year = eventDate.Year,
TimeStamp = eventDate
};
// iterate through each timeslot
foreach (var timeSlot in timeSlots)
{
var slot = new AgendaTimeSlotInfo(timeSlot);
// iterate through each session
// first day gets the first set of assignableTimeSlotCount, then the next iteration gets the next set of that count, etc.
slot.Sessions = SessionDataAccess.GetItemsByTimeSlotId(slot.TimeSlotId, codeCampId).ToList();
// iterate through each speaker
foreach (var session in slot.Sessions)
{
session.Speakers=SpeakerDataAccess.GetSpeakersForCollection(session.SessionId, codeCampId);
}
}
agenda.EventDays.Add(eventDay);
dayCount++;
}

I ended up using LINQ in a new method based upon the GetItemsByTimeSlot() method. The new signature and example of getting a matching subset of that collection is below.
Here's how I'm calling it:
slot.Sessions = SessionDataAccess.GetItemsByTimeSlotIdByPage(slot.TimeSlotId,
codeCampId, dayCount + 1, timeSlotCount).ToList();
Here's what it looks like:
public IEnumerable<SessionInfo> GetItemsByTimeSlotIdByPage(int timeSlotId, int codeCampId, int pageNumber, int pageSize)
{
    var items = repo.GetItems(codeCampId).Where(t => t.TimeSlotId == timeSlotId);
    // this is the important part: skip the earlier pages, take one page,
    // and materialize it once so the loop and the caller see the same objects
    var resultSet = items.Skip(pageSize * (pageNumber - 1)).Take(pageSize).ToList();
    foreach (var item in resultSet)
    {
        // set directly here; a Select() whose result is discarded never runs
        item.RegistrantCount = GetRegistrantCount(item.SessionId);
        item.Speakers = speakerRepo.GetSpeakersForCollection(item.SessionId, item.CodeCampId);
    }
    return resultSet;
}
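The Skip/Take arithmetic can be checked in isolation. The following sketch uses plain integers 1 to 30 in place of sessions and a made-up helper named GetPage; it is not the data access code from the answer, only the paging step.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Returns the 1-based page `pageNumber` of `source`, with `pageSize` items per page.
    public static List<T> GetPage<T>(IEnumerable<T> source, int pageNumber, int pageSize)
    {
        return source.Skip(pageSize * (pageNumber - 1)).Take(pageSize).ToList();
    }

    public static void Main()
    {
        List<int> sessions = Enumerable.Range(1, 30).ToList(); // session numbers 1..30
        for (int day = 1; day <= 4; day++)
        {
            List<int> page = GetPage(sessions, day, 6);
            Console.WriteLine("Day " + day + ": " + page.First() + "-" + page.Last());
        }
        // Day 1: 1-6
        // Day 2: 7-12
        // Day 3: 13-18
        // Day 4: 19-24
    }
}
```

With 4 days and 6 sessions per day, sessions 25 to 30 stay unassigned in this sketch. A page past the end of the source simply comes back empty.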

Concurrent foreach iteration of two list strings

Let's say I have two List<string>. These are populated from the results of reading a text file
List owner contains:
cross
jhill
bbroms
List assignee contains:
Chris Cross
Jack Hill
Bryan Broms
During the read from a SQL source (the SQL statement contains a join)... I would perform
if(sqlReader["projects.owner"] == "something in owner list" || sqlReader["assign.assignee"] == "something in assignee list")
{
// add this projects information to the primary results LIST
list_by_owner.Add(sqlReader["projects.owner"],sqlReader["projects.project_date_created"],sqlReader["projects.project_name"],sqlReader["projects.project_status"]);
// if the assignee is not null, add also to the secondary results LIST
// logic to determine if assign.assignee is null goes here
list_by_assignee.Add(sqlReader["assign.assignee"],sqlReader["projects.owner"],sqlReader["projects.project_date_created"],sqlReader["projects.project_name"],sqlReader["projects.project_status"]);
}
I do not want to end up using nested foreach.
The FOR loop would probably suffice. Someone had mentioned ZIP to me but wasn't sure if that would be a preferable route to go in my situation.

One loop to iterate through both lists (assuming both have same count):
for (int i = 0; i < alpha.Count; i++)
{
var itemAlpha = alpha[i]; // <= your object of list alpha
var itemBeta = beta[i]; // <= your object of list beta
// write your code here
}

From what you describe, you don't need to iterate at all.
This is what you need:
http://msdn.microsoft.com/en-us/library/bhkz42b3.aspx
Usage:
if (listAlpha.Contains(resultA) || listBeta.Contains(resultA)) {
// do your operation
}
List iteration happens implicitly inside the Contains method. That is at most 2n comparisons, versus n*n for nested iteration.
If you do need to iterate yourself, you would be better off iterating each list sequentially, one after the other.
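As a rough sketch of the membership check, assuming the owner and assignee values have already been read from the SQL reader as strings: the helper name Matches is hypothetical, and HashSet<string> is swapped in for the lists so that each lookup does not scan the whole collection. A List<string> can be passed as well, since both implement ICollection<string>.

```csharp
using System;
using System.Collections.Generic;

public static class MembershipCheck
{
    // True when the row's owner or its assignee appears in the matching lookup collection.
    public static bool Matches(ICollection<string> owners, ICollection<string> assignees,
                               string owner, string assignee)
    {
        return owners.Contains(owner) || assignees.Contains(assignee);
    }

    public static void Main()
    {
        var owners = new HashSet<string> { "cross", "jhill", "bbroms" };
        var assignees = new HashSet<string> { "Chris Cross", "Jack Hill", "Bryan Broms" };

        Console.WriteLine(Matches(owners, assignees, "jhill", "Someone Else"));  // True
        Console.WriteLine(Matches(owners, assignees, "nobody", "Jack Hill"));    // True
        Console.WriteLine(Matches(owners, assignees, "nobody", "Someone Else")); // False
    }
}
```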

This list is maybe better represented as a List<KeyValuePair<string, string>> which would pair the two list values together in a single list.

There are several options for this. The least "painful" would be plain old for loop:
for (var index = 0; index < alpha.Count; index++)
{
var alphaItem = alpha[index];
var betaItem = beta[index];
// Do something.
}
Another interesting approach is using the indexed LINQ methods (but you need to remember they get evaluated lazily, you have to consume the resulting enumerable), for example:
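alpha.Select((alphaItem, index) =>
{
    var betaItem = beta[index];
    // Do something
    return alphaItem; // a Select lambda has to return a value
}).ToList(); // consume the query, otherwise the lambda never runs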
Or you can enumerate both collection if you use the enumerator directly:
using (var alphaEnumerator = alpha.GetEnumerator())
using (var betaEnumerator = beta.GetEnumerator())
{
while (alphaEnumerator.MoveNext() && betaEnumerator.MoveNext())
{
var alphaItem = alphaEnumerator.Current;
var betaItem = betaEnumerator.Current;
// Do something
}
}

Zip (if you need pairs) or Concat (if you need combined list) are possible options to iterate 2 lists at the same time.
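A minimal sketch of the Zip option, using the sample names from the question. The wrapper Pair and the " -> " output format are only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipDemo
{
    // Walks both sequences in step and combines each pair into one string.
    // Zip stops as soon as the shorter sequence runs out.
    public static List<string> Pair(IEnumerable<string> owners, IEnumerable<string> assignees)
    {
        return owners.Zip(assignees, (owner, assignee) => owner + " -> " + assignee).ToList();
    }

    public static void Main()
    {
        var owners = new List<string> { "cross", "jhill", "bbroms" };
        var assignees = new List<string> { "Chris Cross", "Jack Hill", "Bryan Broms" };

        foreach (string line in Pair(owners, assignees))
        {
            Console.WriteLine(line);
        }
        // cross -> Chris Cross
        // jhill -> Jack Hill
        // bbroms -> Bryan Broms
    }
}
```

Zip only makes sense here if the two lists are positionally aligned, which the question's sample data suggests but does not state.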

I like doing something like this to enumerate over parallel lists:
int alphaCount = alpha.Count ;
int betaCount = beta.Count ;
int i = 0 ;
while ( i < alphaCount && i < betaCount )
{
var a = alpha[i] ;
var b = beta[i] ;
// handle matched alpha/beta pairs
++i ;
}
while ( i < alphaCount )
{
var a = alpha[i] ;
// handle unmatched alphas
++i ;
}
while ( i < betaCount )
{
var b = beta[i] ;
// handle unmatched betas
++i ;
}

How to change each element of a List<long>?

I have a List of different times of day (as ticks). I am trying to get a list of the time remaining from now until each of these times.
List<long> diffliste = new List<long>(m_DummyAtTime);
// 864000000000 ≙ 24h
diffliste.ForEach(item => { item -= now; if (item < 0) item += 864000000000; });
// test, does also not work
// diffliste.ForEach(item => { item -= 500; });
However, the list is not changed. Am I missing something?
(now is DateTime.Now.TimeOfDay.Ticks)

var times = diffliste.Select(ticks => new DateTime(ticks) - DateTime.Now);
Will return a collection of TimeSpans between now and each time.
Without using Linq:
List<TimeSpan> spans = diffliste.ConvertAll(ticks => new DateTime(ticks) - DateTime.Now);
(modified as suggested by Marc)

You are changing a standalone copy in a local variable (well, parameter actually), not the actual value in the list. To do that, perhaps:
for (int i = 0; i < diffliste.Count; i++) {
    long val = diffliste[i]; // copy the value out from the list
    // ... change it
    diffliste[i] = val; // update the value in the list
}
Ultimately, your current code is semantically similar to:
long firstVal = diffliste[0];
firstVal = 42;
which also does not change the first value in the list to 42 (it only changes the local variable).
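Applied to the question's calculation, the index-based loop might look like the sketch below. The helper name ToRemainingInPlace and the small sample values are made up; the 24h constant is the one from the question.

```csharp
using System;
using System.Collections.Generic;

public static class RemainingTicks
{
    public const long TicksPerDay = 864000000000; // 24h in ticks, as in the question

    // Replaces each time of day with the ticks remaining until it, wrapping past midnight.
    public static void ToRemainingInPlace(List<long> times, long now)
    {
        for (int i = 0; i < times.Count; i++)
        {
            long remaining = times[i] - now;               // work on a copy of the element
            if (remaining < 0) remaining += TicksPerDay;   // already passed today: use tomorrow
            times[i] = remaining;                          // write the result back into the list
        }
    }

    public static void Main()
    {
        var times = new List<long> { 100, 50 };

        // ForEach only changes the lambda's own parameter, so the list stays the same.
        times.ForEach(item => { item -= 60; });
        Console.WriteLine(string.Join(", ", times)); // 100, 50

        ToRemainingInPlace(times, 60);
        Console.WriteLine(string.Join(", ", times)); // 40, 863999999990
    }
}
```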

You cannot change the value of an item inside a foreach cycle.
You can do it using a classic for cycle or creating and assigning items to a new list.
for (int i = 0 ; i < diffliste.Count; i++)
{
long value = diffliste[i];
// Do here what you need
diffliste[i] = value;
}

The iteration variable in a foreach loop is read-only, so you cannot assign to it. You either have to create a new list or use a for loop.

Compare adjacent list items

I'm writing a duplicate file detector. To determine if two files are duplicates I calculate a CRC32 checksum. Since this can be an expensive operation, I only want to calculate checksums for files that have another file with matching size. I have sorted my list of files by size, and am looping through to compare each element to the ones above and below it. Unfortunately, there is an issue at the beginning and end since there will be no previous or next file, respectively. I can fix this using if statements, but it feels clunky. Here is my code:
public void GetCRCs(List<DupInfo> dupInfos)
{
var crc = new Crc32();
for (int i = 0; i < dupInfos.Count(); i++)
{
if (dupInfos[i].Size == dupInfos[i - 1].Size || dupInfos[i].Size == dupInfos[i + 1].Size)
{
dupInfos[i].CheckSum = crc.ComputeChecksum(File.ReadAllBytes(dupInfos[i].FullName));
}
}
}
My question is:
How can I compare each entry to its neighbors without the out of bounds error?
Should I be using a loop for this, or is there a better LINQ or other function?
Note: I did not include the rest of my code to avoid clutter. If you want to see it, I can include it.

Compute the Crcs first:
// It is assumed that DupInfo.CheckSum is nullable
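public void GetCRCs(List<DupInfo> dupInfos)
{
    if (dupInfos.Count == 0) return; // nothing to do for an empty list
    var crc = new Crc32(); // same checksum helper as in the question
    dupInfos[0].CheckSum = null;
    for (int i = 1; i < dupInfos.Count; i++)
    {
        dupInfos[i].CheckSum = null;
        if (dupInfos[i].Size == dupInfos[i - 1].Size)
        {
            // the previous file may already have a checksum from the prior iteration
            if (dupInfos[i - 1].CheckSum == null)
                dupInfos[i - 1].CheckSum = crc.ComputeChecksum(File.ReadAllBytes(dupInfos[i - 1].FullName));
            dupInfos[i].CheckSum = crc.ComputeChecksum(File.ReadAllBytes(dupInfos[i].FullName));
        }
    }
}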
After having sorted your files by size and crc, identify duplicates:
public void GetDuplicates(List<DupInfo> dupInfos)
{
    for (int i = dupInfos.Count - 1; i > 0; i--)
    { // loop runs backwards so that list items can be deleted safely
        if (dupInfos[i].Size == dupInfos[i - 1].Size &&
            dupInfos[i].CheckSum != null &&
            dupInfos[i].CheckSum == dupInfos[i - 1].CheckSum)
        { // i is a duplicate of i-1
            ... // your code here
            ... // eventually, dupInfos.RemoveAt(i) ;
        }
    }
}

"I have sorted my list of files by size, and am looping through to compare each element to the ones above and below it."
The next logical step is to actually group your files by size. Comparing consecutive files will not always be sufficient if you have more than two files of the same size. Instead, you will need to compare every file to every other same-sized file.
I suggest taking this approach:
Use LINQ's .GroupBy to group the files by size. Then use .Where to keep only the groups with more than one file.
Within those groups, calculate the CRC32 checksum and add it to a collection of known checksums. Compare with previously calculated checksums. If you need to know which files specifically are duplicates, you could use a dictionary keyed by this checksum (you can achieve this with another GroupBy). Otherwise a simple list will suffice to detect any duplicates.
The code might look something like this:
var filesSetsWithPossibleDupes = files.GroupBy(f => f.Length)
.Where(group => group.Count() > 1);
foreach (var grp in filesSetsWithPossibleDupes)
{
var checksums = new List<CRC32CheckSum>(); //or whatever type
foreach (var file in grp)
{
var currentCheckSum = crc.ComputeChecksum(file);
if (checksums.Contains(currentCheckSum))
{
//Found a duplicate
}
else
{
checksums.Add(currentCheckSum);
}
}
}
Or if you need the specific objects that could be duplicates, the inner foreach loop might look like
var filesSetsWithPossibleDupes = files.GroupBy(f => f.FileSize)
.Where(grp => grp.Count() > 1);
var masterDuplicateDict = new Dictionary<DupStats, IEnumerable<DupInfo>>();
//A dictionary keyed by the basic duplicate stats
//, and whose value is a collection of the possible duplicates
foreach (var grp in filesSetsWithPossibleDupes)
{
var likelyDuplicates = grp.GroupBy(dup => dup.Checksum)
.Where(g => g.Count() > 1);
//Same GroupBy logic, but applied to the checksum (instead of file size)
foreach(var dupGrp in likelyDuplicates)
{
//Create the key for the dictionary (your code is likely different)
var sample = dupGrp.First();
var key = new DupStats() {FileSize = sample.FileSize, Checksum = sample.Checksum};
masterDuplicateDict.Add(key, dupGrp);
}
}
A demo of this idea.

I think the for loop should be: for (int i = 1; i < dupInfos.Count() - 1; i++)
Or group the items by size:
var grps = dupInfos.GroupBy(d => d.Size);
grps.Where(g => g.Count() > 1).ToList().ForEach(g =>
{
    ...
});
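Note that starting at 1 and stopping at Count() - 1 avoids the exception but never examines the first and last items. One way to keep them, sketched here under the assumption that the input is already sorted by size, is to guard each neighbor access separately. The helper name and the use of plain long sizes in place of DupInfo objects are for illustration only.

```csharp
using System;
using System.Collections.Generic;

public static class NeighborCheck
{
    // Returns the indexes whose size equals the previous or the next element's size.
    // Assumes `sizes` is sorted, so equal sizes sit next to each other.
    public static List<int> IndexesWithSameSizeNeighbor(IReadOnlyList<long> sizes)
    {
        var result = new List<int>();
        for (int i = 0; i < sizes.Count; i++)
        {
            // && short-circuits, so the indexer is only reached when the neighbor exists
            bool samePrev = i > 0 && sizes[i] == sizes[i - 1];
            bool sameNext = i < sizes.Count - 1 && sizes[i] == sizes[i + 1];
            if (samePrev || sameNext)
            {
                result.Add(i); // this file is worth checksumming
            }
        }
        return result;
    }

    public static void Main()
    {
        var sizes = new long[] { 12, 13, 14, 14, 15, 15, 18 };
        Console.WriteLine(string.Join(", ", IndexesWithSameSizeNeighbor(sizes))); // 2, 3, 4, 5
    }
}
```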

Can you do a union between your two lists? If you have a list of filenames and do a union it should result in only a list of the overlapping files. I can write out an example if you want but this link should give you the general idea.
https://stackoverflow.com/a/13505715/1856992
Edit: Sorry for some reason I thought you were comparing file name not size.
So here is an actual answer for you.
using System;
using System.Collections.Generic;
using System.Linq;
public class ObjectWithSize
{
public int Size {get; set;}
public ObjectWithSize(int size)
{
Size = size;
}
}
public class Program
{
public static void Main()
{
Console.WriteLine("start");
var list = new List<ObjectWithSize>();
list.Add(new ObjectWithSize(12));
list.Add(new ObjectWithSize(13));
list.Add(new ObjectWithSize(14));
list.Add(new ObjectWithSize(14));
list.Add(new ObjectWithSize(18));
list.Add(new ObjectWithSize(15));
list.Add(new ObjectWithSize(15));
var duplicates = list.GroupBy(x=>x.Size)
.Where(g=>g.Count()>1);
foreach (var dup in duplicates)
foreach (var objWithSize in dup)
Console.WriteLine(objWithSize.Size);
}
}
This will print out
14
14
15
15
Here is a netFiddle for that.
https://dotnetfiddle.net/0ub6Bs
Final note. I actually think your answer looks better and will run faster. This was just an implementation in Linq.
