So I encountered a strange issue today - I had a simple creation of an instance inside the critical section of a lock, and it would throw a null reference exception when I manually dragged the next line to execute. To illustrate:
public class SearchEngineOptimizationParser
{
protected static ConcurrentDictionary<string, SearchEngineOptimizationInfo> _referralInformation = null;
protected static DateTime _lastRecordingDate;
protected static object _lockRecordingObject = new object();
protected static Dictionary<string, string> _searchProviderLookups = null;
static SearchEngineOptimizationParser()
{
_referralInformation = new ConcurrentDictionary<string, SearchEngineOptimizationInfo>();
_lastRecordingDate = DateTime.Now;
_searchProviderLookups = new Dictionary<string, string>();
_searchProviderLookups.Add("google.com", "q");
_searchProviderLookups.Add("yahoo.com", "p");
_searchProviderLookups.Add("bing.com", "q");
}
public SearchEngineOptimizationParser()
{
}
public virtual void ParseReferrer(Uri requestUrl, NameValueCollection serverVariables, ISession session)
{
string corePath = requestUrl.PathAndQuery.SmartSplit('?')[0].ToLower();
string referrer = serverVariables["HTTP_REFERER"];
if (!string.IsNullOrWhiteSpace(referrer))
{
NameValueCollection queryString = HttpUtility.ParseQueryString(referrer);
string dictionaryKey = session.AffiliateID + "|" + corePath;
foreach (var searchProvider in _searchProviderLookups)
{
if (referrer.Contains(searchProvider.Key))
{
if (queryString[searchProvider.Value] != null)
{
string keywords = queryString[searchProvider.Value];
SearchEngineOptimizationInfo info = new SearchEngineOptimizationInfo
{
Count = 1,
CorePath = corePath,
AffiliateId = session.AffiliateID,
Keywords = keywords
};
_referralInformation.AddOrUpdate(dictionaryKey, info, (key, oldValue) =>
{
oldValue.Count++;
return oldValue;
});
break;
}
}
}
}
if (DateTime.Now > _lastRecordingDate.AddHours(1))
{
lock (_lockRecordingObject)
{
if (DateTime.Now > _lastRecordingDate.AddHours(1))
{
SearchEngineKeywordRepository repository = new SearchEngineKeywordRepository();
List<KeyValuePair<string, SearchEngineOptimizationInfo>> currentInfo = _referralInformation.ToList();
Action logData = () =>
{
foreach (var item in currentInfo)
repository.LogKeyword(item.Value);
};
Thread logThread = new Thread(new ThreadStart(logData));
logThread.Start();
_lastRecordingDate = DateTime.Now;
_referralInformation.Clear();
}
}
}
}
EDIT: Updated Real Object
public class SearchEngineKeywordRepository
{
public virtual void LogKeyword(SearchEngineOptimizationInfo keywordInfo)
{
LogSearchEngineKeywords procedure = new LogSearchEngineKeywords();
procedure.Execute(keywordInfo.CorePath, keywordInfo.AffiliateId, keywordInfo.Keywords, keywordInfo.Count);
}
}
The general pattern being that I want to do this 'something' only every hour (in the context of a website application that gets a lot of traffic). I would breakpoint my first if statement, and then step the next line to execute inside the second if statement. When doing so, the act of initializing the SomeObject instance would cause a null reference exception. It had a completely 100% default constructor - I didn't even specify one.
However, when I let the code go through naturally, it would execute without problem. For some reason, it seems that when I skipped over the lock call into the critical section to just test run that code, it caused all kinds of errors.
I'm curious to know why that is; I understand the lock keyword is just syntactic sugar for a Monitor.Enter(o) try / finally block, but it seems that when invoking the constructor, something else was happening.
Anyone have any ideas?
EDIT: I've added the actual code to this. I'm able to reproduce this at will, but I still don't understand why this is happening. I've tried copying this code to another solution and the problem does not seem to occur.
I've tried to reproduce your situation, but as I expected I could not. I've tried both the 2.0 and 4.0 runtime, in 32 and 64 bit mode (debugging sometimes behaves differently under x64).
Is the code shown a simplification? Have you checked all your assumptions? I understand you're skipping 3 lines of code, both the if statements and the lock? In that case, even setting the lock object to null does not cause the exception you describe.
(Having _lockRecordingObject set to null causes an ArgumentNullException when leaving the lock scope)
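For reference, the expansion of lock that the question mentions can be written out by hand. The following is a minimal sketch, assuming the C# 4 (or later) form of the expansion; the names LockExpansionDemo, Gate, RunLocked and ExitOnNullThrows are made up for illustration and are not part of the original code. It also shows the ArgumentNullException that Monitor.Exit raises for a null lock object.

```csharp
using System;
using System.Threading;

public static class LockExpansionDemo
{
    // Hypothetical lock object, standing in for _lockRecordingObject.
    private static readonly object Gate = new object();

    // Hand-written equivalent of: lock (Gate) { return body(); }
    public static T RunLocked<T>(Func<T> body)
    {
        bool lockTaken = false;
        try
        {
            Monitor.Enter(Gate, ref lockTaken);
            return body();
        }
        finally
        {
            // Only release the monitor if it was actually acquired.
            if (lockTaken)
            {
                Monitor.Exit(Gate);
            }
        }
    }

    // Returns true if Monitor.Exit(null) throws ArgumentNullException.
    public static bool ExitOnNullThrows()
    {
        try
        {
            Monitor.Exit(null);
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(RunLocked(() => 42));   // prints 42
        Console.WriteLine(ExitOnNullThrows());    // prints True
    }
}
```

Nothing in this expansion touches the constructor call inside the critical section, which is consistent with the exception not being reproducible from the code shown.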
I've built a program that takes in a list of record data from a file, parses and cleans up each record in a parsing object, and outputs it to an output file.
So far this has worked on a single thread, but considering that records can exceed 1 million in some cases, we want to implement this in a multithreaded context. Multithreading is new to me in .NET, and I've given it a shot, but it's not working. Below I will provide more details and code:
Main Class (simplified):
public class MainClass
{
parseObject[] parseObjects;
Thread[] threads;
List<InputLineItem> inputList = new List<InputLineItem>();
FileUtils fileUtils = new FileUtils();
public GenParseUtilsThreaded(int threadCount)
{
this.threadCount = threadCount;
Init();
}
public void Init()
{
inputList = fileUtils.GetInputList();
parseObjects = new parseObject[threadCount - 1];
threads = new Thread[threadCount - 1];
InitParseObjects();
Parse();
}
private void InitParseObjects()
{
//using a ref of fileUtils to use as my lock expression
parseObjects[0] = new ParseObject(ref fileUtils);
parseObjects[0].InitValues();
for (int i = 1; i < threadCount - 1; i++)
{
parseObjects[i] = new parseObject(ref fileUtils);
parseObjects[i].InitValues();
}
}
private void InitThreads()
{
for (int i = 0; i < threadCount - 1; i++)
{
Thread t = new Thread(new ThreadStart(parseObjects[0].CleanupAndParseInput));
threads[i] = t;
}
}
public void Parse()
{
try
{
InitThreads();
int objectIndex = 0;
foreach (InputLineItem inputLineItem in inputList)
{
parseObjects[0].inputLineItem = inputLineItem;
threads[objectIndex].Start();
objectIndex++;
if (objectIndex == threadCount)
{
objectIndex = 0;
InitThreads(); //do i need to re-init the threads after I've already used them all once?
}
}
}
catch (Exception e)
{
Console.WriteLine("(286) The following error occured: " + e);
}
}
}
}
And my Parse object class (also simplified):
public class ParseObject
{
public ParserLibrary parser { get; set; }
public FileUtils fileUtils { get; set; }
public InputLineItem inputLineItem { get; set; }
public ParseObject( ref FileUtils fileUtils)
{
this.fileUtils = fileUtils;
}
public void InitValues()
{
//relevant config of parser library object occurs here
}
public void CleanupFields()
{
parser.Clean(inputLineItem.nameValue);
inputLineItem.nameValue = GetCleanupUpValueFromParser();
}
private string GetCleanupFieldValue()
{
//code to extract cleanup up value from parses
}
public void CleanupAndParseInput()
{
CleanupFields();
ParseInput();
}
public void ParseInput()
{
try
{
parser.Parse(InputLineItem.NameValue);
}
catch (Exception e)
{
}
try
{
lock (fileUtils)
{
WriteOutputToFile(inputLineItem);
}
}
catch (Exception e)
{
Console.WriteLine("(414) Failed to write to output: " + e);
}
}
public void WriteOutputToFile(InputLineItem inputLineItem)
{
//writes updated value to output file
}
}
The error I get is when trying to run the Parse function, I get this message:
An unhandled exception of type 'System.AccessViolationException' occurred in GenParse.NET.dll
Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
That being said, I feel like there's a whole lot more that I'm doing wrong here aside from what is causing that error.
I also have further questions:
Do I create multiple parse objects and iteratively feed them to each thread as I'm attempting to do, or should I use one Parse object that gets shared or cloned across each thread?
If, outside the thread, I change a value in the object that I'm passing to the thread, will that change reflect in the object passed to the thread? i.e, is the object passed by value or reference?
Is there a more efficient way for each record to be assigned to a thread and its parse object than I am currently doing with the objectIndex iterator?
THANKS!
Do I create multiple parse objects and iteratively feed them to each thread as I'm attempting to do, or should I use one Parse object that gets shared or cloned across each thread?
You initialize each thread with new ThreadStart(parseObjects[0].CleanupAndParseInput), so all threads will share the same parse object. It is a fairly safe bet that the parse objects are not thread-safe, so each thread should have a separate object. Note that this might not be sufficient: if the parse library uses any global fields, it might be non-thread-safe even when using separate objects.
If, outside the thread, I change a value in the object that I'm passing to the thread, will that change reflect in the object passed to the thread? i.e, is the object passed by value or reference?
Class instances are reference types, so the thread receives a reference to the same object, and a change made outside the thread is a change to the object the thread is using. However, changes to an object are not guaranteed to be visible in other threads unless a memory barrier is issued. Most synchronization code (like lock) will issue memory barriers. Keep in mind that any non-atomic operation is unsafe if a field is written and read concurrently.
Is there a more efficient way for each record to be assigned to a thread and its parse object than I am currently doing with the objectIndex iterator?
Using manual threads in this way is very old-school. The modern, easier, and probably faster way is to use a parallel-for loop. This will try to be smart about how many threads it will use and try to adapt chunk sizes to keep the synchronization overhead low.
var items = new List<int>();
ParseObject LocalInit()
{
// Do initialization. This is run once for each thread used
return new ParseObject();
}
ParseObject ThreadMain(int value, ParallelLoopState state, ParseObject threadLocalObject)
{
// Do whatever you need to do
// This is run on multiple threads
return threadLocalObject;
}
void LocalFinally(ParseObject obj)
{
// Do Cleanup for each thread
}
Parallel.ForEach(items, LocalInit, ThreadMain, LocalFinally);
As a final note, I would advise against using multithreading unless you are familiar with the potential dangers and pitfalls it involves, at least for any project where the result is important. There are many ways to screw up and make a program that will work 99.9% of the time, and silently corrupt data the remaining 0.1% of the time.
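To make the thread-local pattern above concrete, here is a small runnable sketch. It assumes a hypothetical FakeParser class standing in for the real (presumably non-thread-safe) parser library, and a hypothetical CleanAll helper; neither comes from the original code. Each worker thread gets its own parser from the localInit delegate, and results are written to distinct array slots so no lock is needed for the output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadLocalParseDemo
{
    // Hypothetical stand-in for the non-thread-safe parser from the question.
    private sealed class FakeParser
    {
        public int Processed;

        public string Clean(string value)
        {
            Processed++;
            return value.Trim().ToUpperInvariant();
        }
    }

    public static string[] CleanAll(IList<string> records, out int processed)
    {
        var output = new string[records.Count];
        int total = 0;

        Parallel.ForEach(
            Enumerable.Range(0, records.Count),
            // localInit: runs once per worker, so parsers are never shared.
            () => new FakeParser(),
            // body: each index is written by exactly one iteration.
            (index, loopState, parser) =>
            {
                output[index] = parser.Clean(records[index]);
                return parser;
            },
            // localFinally: merge the per-worker counters atomically.
            parser => Interlocked.Add(ref total, parser.Processed));

        processed = total;
        return output;
    }

    public static void Main()
    {
        List<string> records = Enumerable.Range(0, 1000)
            .Select(i => " record " + i + " ")
            .ToList();

        int processed;
        string[] cleaned = CleanAll(records, out processed);

        Console.WriteLine(processed);     // prints 1000
        Console.WriteLine(cleaned[999]);  // prints RECORD 999
    }
}
```

Writing the output file would still need to be serialized (or done after the loop from the collected array), since a shared file handle is not safe to write from several threads at once.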
I've got a case that might be useful to analyze and extract some conclusions.
I've got a class that implements ITaskWorker, and each Task can run simultaneously with other Task connected with a scheduling engine.
Suppose Task A runs a job for object A_1 with attributes B1...BN, where for each attribute a command-line process runs and returns results (the task blocks until an answer is received from the command-line process).
This means that for Task B we can schedule the same A_1 with B1...BN attributes.
For the following piece of code and explanation, can you find anything that might result in threads interfering with each other (deadlocks, race conditions, starvation)?
How can I ensure that there isn't a multi threaded issue here?
I think starvation cannot be an issue here, unless there are so many tasks of one type that tasks of other types never get to run (see the explanation of the code below). I don't see a case for deadlock, but there might be a race condition on the mainLocker or connectionLockers data members (because the same variable and collection are used across multiple methods).
There cannot be the same key in the dictionary (I've verified that: [b_i.A_Name + "_" + b_i.B_Name] creates a unique key)
I've got this code in C#. Please notice that mainLocker and connectionLockers are used in several methods like doTaskOfTypeX, so several 'types' of workers might lock them in different parts of the code:
private static object mainLocker = new Object();
private static Dictionary<string, object> connectionLockers = new Dictionary<string,object>();
private doTaskOfTypeA()
{
// ... initialize A from task parameters
var attibutes = getListOfAttribuesByObject(A);
bool localLocalTaken = false;
foreach (B b_i in attibutes)
{
try{
lock (mainLocker)
{
if (!connectionLockers.ContainsKey(b_i.A_Name + "_" + b_i.B_Name))
{
connectionLockers.Add(b_i.A_Name + "_" + b_i.B_Name, new object());
}
}
localLocalTaken = false;
Monitor.Enter(connectionLockers[b_i.A_Name + "_" + b_i.B_Name], ref localLocalTaken);
if (localLocalTaken)
{
var calcObj = callCLIProcess(); // a CMD call is in here
if (calcObj != null)
{
// do things with calcObj
}
else
{
jobResult = new ScheduleTaskResult(ResultTypes.Failed);
}
}
}
catch
{
jobResult = new ScheduleTaskResult(ResultTypes.Failed);
throw;
}
finally
{
if (localLocalTaken)
{
Monitor.Exit(connectionLockers[b_i.A_Name + "_" + b_i.B_Name]);
}
}
}
}
Actually, there is no issue here.
The [// do things with calcObj] notation had a code from an external library that didn't work too well :-)
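Independently of the external library, the per-key lock lookup in the question reads a plain Dictionary outside mainLocker while other threads may be adding to it, and Dictionary is not documented as safe for that. One way to rule this out is sketched below, assuming a ConcurrentDictionary can replace the Dictionary plus mainLocker pair; KeyedLockDemo, GetLocker and CountUnderLock are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class KeyedLockDemo
{
    // One lock object per "A_Name_B_Name" key, created on first use.
    private static readonly ConcurrentDictionary<string, object> Lockers =
        new ConcurrentDictionary<string, object>();

    public static object GetLocker(string aName, string bName)
    {
        // GetOrAdd is safe to call concurrently; all callers that use
        // the same key receive the same stored lock object.
        return Lockers.GetOrAdd(aName + "_" + bName, _ => new object());
    }

    // Increments a counter from many threads, all guarded by one keyed lock.
    public static int CountUnderLock(int iterations)
    {
        int counter = 0;
        Parallel.For(0, iterations, i =>
        {
            lock (GetLocker("A_1", "B1"))
            {
                counter++;
            }
        });
        return counter;
    }

    public static void Main()
    {
        Console.WriteLine(CountUnderLock(10000)); // prints 10000
        Console.WriteLine(ReferenceEquals(
            GetLocker("A_1", "B1"), GetLocker("A_1", "B1"))); // prints True
    }
}
```

Using lock instead of a manual Monitor.Enter/Exit pair also removes the need to track the lock-taken flag by hand.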
It's been a while since I've been this stumped. The crazy thing is, I've done this several times in other areas of my code, so it's almost a complete copy and paste, except this code isn't working properly. So I am somehow missing something extremely obvious.
public class RoomCache
{
private ConcurrentDictionary<string, List<string>> _dicOnlineTraders;
ILoggingService _logService = new LoggingService();
public RoomCache()
{
_dicOnlineTraders = new ConcurrentDictionary<string, List<string>>();
}
public void UpdateTraderCurrentRoom(string sRoom, string sTrader)
{
_dicOnlineTraders.AddOrUpdate(sRoom, new List<string>() { sTrader }, (x, y) => UpdateRoomOnlineTraderList(sTrader, y));
}
private List<string> UpdateRoomOnlineTraderList(string sTrader, List<string> aryTraderList)
{
try
{
if (!aryTraderList.Contains(sTrader))
aryTraderList.Add(sTrader);
return aryTraderList;
}
catch (Exception ex)
{
_logService.LogError(ex);
return aryTraderList;
}
}
}
The above class is instantiated in the Application_Start() global.asax.cs like this:
public static RoomCache RoomCache;
RoomCache = new RoomCache();
So now, between page loads, my dictionary is not keeping the value I add to the list when UpdateRoomOnlineTraderList is called. When I step through, the list is there. The next time I load the page it's gone, and I am 100% sure there is nothing else removing this value from the dictionary.
How is my dictionary not retaining the value between page loads? The key is still retained but the value just vanishes. I'm baffled.
If you're absolutely sure you don't have code elsewhere that's re-initializing your RoomCache or removing the expected data from it, my best guess is that you have two AppDomains running for your IIS application...so you actually have two static RoomCache's in two different AppDomains under one w3wp worker process.
You can check this yourself by printing in the watch or immediate window: AppDomain.CurrentDomain.Id
If the two page loads are in fact happening in different AppDomains, the result will be two different int values.
Generally, if ASP.NET has decided to host two different AppDomains for you, it's in your best interest... So, if you really need information to persist reliably across page loads, you might consider an out-of-process store for your information.
Alternatively, you could use your web.config to insist that ASP.NET limit your application to only one AppDomain. This still won't protect you, though, if ASP.NET decides to recycle your AppDomain between page loads (which may happen at any time).
I don't see the error, given the code provided. However, I have an idea.
If the list is ever returned to a caller, it's possible that the caller could then modify it (clearing it, for example), which would change the list in the collection too, since they're the same list. (Merely setting the caller's own variable to null would not affect the dictionary.)
This would cause that problem if a method like GetOnlineTradersWithSideEffects below existed.
public class RoomCache
{
private ConcurrentDictionary<string, List<string>> _dicOnlineTraders;
ILoggingService _logService = new LoggingService();
private static readonly object SynchronousReadLock = new object();
// This is bad because the reference is passed out to the
// caller and we can't be sure that callers will behave. Any
// modifications to that list will change our list too.
private List<string> GetOnlineTradersWithSideEffects(string sRoom)
{
List<string> theseTraders = null;
_dicOnlineTraders.TryGetValue(sRoom, out theseTraders);
return theseTraders;
}
// A side-effect-free method of returning the list to a caller.
private List<string> GetOnlineTraders(string sRoom)
{
List<string> theseTraders;
if (!_dicOnlineTraders.TryGetValue(sRoom, out theseTraders))
{
// Unknown room: return an empty list rather than passing null
// to the List constructor, which would throw.
return new List<string>();
}
lock (SynchronousReadLock)
{
// Create a new list to return to a caller, that has
// copies of the elements of the list in the dictionary.
var localCopy = new List<string>(theseTraders);
return localCopy;
}
}
public RoomCache()
{
_dicOnlineTraders = new ConcurrentDictionary<string, List<string>>();
}
public void UpdateTraderCurrentRoom(string sRoom, string sTrader)
{
_dicOnlineTraders.AddOrUpdate(sRoom, new List<string>() { sTrader }, (x, y) => UpdateRoomOnlineTraderList(sTrader, y));
}
private List<string> UpdateRoomOnlineTraderList(string sTrader, List<string> aryTraderList)
{
try
{
// Lock here too, when modifying the list so that our reads
// wait for writes and vice-versa.
lock (SynchronousReadLock)
{
if (!aryTraderList.Contains(sTrader))
aryTraderList.Add(sTrader);
return aryTraderList;
}
}
catch (Exception ex)
{
_logService.LogError(ex);
return aryTraderList;
}
}
}
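The difference between nulling a reference, mutating a shared list, and mutating a copy can be checked in isolation. This is a minimal sketch with a plain Dictionary and made-up names (AliasingDemo, Run, "lobby"); it is not the RoomCache code itself.

```csharp
using System;
using System.Collections.Generic;

public static class AliasingDemo
{
    // Returns the room's list size after each of three caller actions.
    public static int[] Run()
    {
        var rooms = new Dictionary<string, List<string>>();
        rooms["lobby"] = new List<string> { "alice" };

        // 1. Setting the caller's variable to null only changes that variable.
        List<string> leaked = rooms["lobby"];
        leaked = null;
        int afterNull = rooms["lobby"].Count;   // still 1

        // 2. Mutating through the leaked reference changes the stored list.
        leaked = rooms["lobby"];
        leaked.Clear();
        int afterClear = rooms["lobby"].Count;  // now 0

        // 3. Mutating a defensive copy leaves the stored list alone.
        rooms["lobby"].Add("bob");
        var copy = new List<string>(rooms["lobby"]);
        copy.Clear();
        int afterCopyClear = rooms["lobby"].Count; // still 1

        return new[] { afterNull, afterClear, afterCopyClear };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run())); // prints 1,0,1
    }
}
```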
I can't explain an issue I've run across. Basically I get a different answer if I use lambda syntax in a foreach loop than if I use it in a for loop. In the code below I register a delegate in a "dispatcher" class. I then later wrap the delegate on the way out in another delegate and return a list of these wrapped delegates. I then execute them. The expected output of executing the wrapped function list is 1,2. However I don't see that when I combine a lambda and a foreach loop.
This is not the code that is causing the problem, but the simplest case I could make to reproduce it. I would prefer not to discuss use cases of this; I'm more curious as to why I get behavior I'm not expecting. If I use the foreach loop below with the lambda syntax, it fails. If I use the new Action() syntax and a foreach, it works; if I use the lambda syntax in a for loop, it works. Can anyone explain what is going on here? This has me really stumped.
public class Holder
{
public Holder(int ID, Dispatcher disp)
{
this.ID = ID;
disp.Register(Something);
}
public int ID { get; set; }
private void Something(int test) { Console.WriteLine(ID.ToString()); }
}
public class Dispatcher
{
List<Action<int>> m_Holder = new List<Action<int>>();
public void Register(Action<int> func)
{
m_Holder.Add(func);
}
public List<Action<int>> ReturnWrappedList()
{
List<Action<int>> temp = new List<Action<int>>();
//for (int i = 0; i < m_Holder.Count; i++) //Works - gives 1, 2
//{
// var action = m_Holder[i];
// temp.Add(p => action(p));
//}
foreach (var action in m_Holder)
{
temp.Add(p => action(p)); //Fails - gives 2,2
//temp.Add(new Action<int>(action)); Works - gives 1,2
}
return temp;
}
}
class Program
{
static void Main(string[] args)
{
var disp = new Dispatcher();
var hold1 = new Holder(1, disp);
var hold2 = new Holder(2, disp);
disp.ReturnWrappedList().ForEach(p => p(1));
}
}
This is the infamous "closing over the loop variable" gotcha.
See Eric Lippert's "Closing over the loop variable considered harmful" (and part two).
Have you tried:
foreach (var action in m_Holder)
{
var a = action;
temp.Add(p => a(p));
}
This is the classic issue of a captured closure with a scope that isn't what you expect. In the foreach, the loop variable action is declared once, outside the loop body, so every lambda captures the same variable and sees the last value of the loop when it finally executes. In the for case, you declare action inside the loop body, so each closure is over a fresh local variable per iteration.
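The effect is easy to reproduce without the dispatcher. The sketch below uses a for loop counter, which is a single shared variable in every version of C#; the foreach loop variable behaved the same way until C# 5, where it became a fresh variable per iteration. ClosureDemo and Run are hypothetical names for this illustration only.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureDemo
{
    public static string Run()
    {
        var shared = new List<Func<int>>();
        var copied = new List<Func<int>>();

        for (int i = 1; i <= 2; i++)
        {
            // Captures the loop variable itself: one variable for all lambdas.
            shared.Add(() => i);

            // Captures a fresh local that is created on every iteration.
            int local = i;
            copied.Add(() => local);
        }

        // By now the loop has finished and i is 3.
        string sharedResult = shared[0]() + "," + shared[1]();
        string copiedResult = copied[0]() + "," + copied[1]();
        return sharedResult + "|" + copiedResult;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 3,3|1,2
    }
}
```

The `var a = action;` fix shown earlier is the same local-copy technique applied to the foreach loop.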
I get a null exception if I try to pass a null parameter to a delegate during an invoke. Here's what the code looks like:
public void RequestPhoto()
{
WCF.Service.BeginGetUserPhoto(Contact.UserID,
new AsyncCallback(RequestPhotoCB), null);
}
public void RequestPhotoCB(IAsyncResult result)
{
var photo = WCF.Service.EndGetUserPhoto(result);
UpdatePhoto(photo);
}
public delegate void UpdatePhotoDelegate(Binary photo);
public void UpdatePhoto(Binary photo)
{
if (InvokeRequired)
{
var d = new UpdatePhotoDelegate(UpdatePhoto);
Invoke(d, new object[] { photo });
}
else
{
var ms = new MemoryStream(photo.ToArray());
var bmp = new Bitmap(ms);
pbPhoto.BackgroundImage = bmp;
}
}
The problem is with the line:
Invoke(d, new object[] { photo });
If the variable "photo" is null. What is the correct way to pass a null parameter during an invoke?
Thanks!
Just for the benefit of others: you can pass null arguments to delegates, as long as the parameter type accepts null (a reference type or a nullable value type). In your case, IAsyncResult will allow it.
As for the debugging, the exception is reported on Invoke because you are debugging on a given Thread A, while the exception occurs on Thread B. You can breakpoint multiple threads. Breakpoint the Thread B code and you will see the exception closer to or on the source.
Notice though that your debugger will jump around if multiple threads are running code at the same time. Debugging in multiple threads is always at least a little tricky, but satisfying when you solve the problems.
You could also further improve your answer code to check the null before it checks the InvokeRequired, as this is thread-independent to your logic (your code checks it just prior to use, after Invoking). This will save pushing the Invoke onto the message pump (assuming WinForms).
OK, I figured it out. The problem was NOT with passing the null parameter to the delegate like I thought. The problem was that the delegate, when it executed, caused a null reference exception on the line:
var ms = new MemoryStream(photo.ToArray());
I didn't realize the problem was there because it was crashing on the Invoke line.
So I changed the UpdatePhoto method as follows:
public void UpdatePhoto(Binary photo)
{
if (InvokeRequired)
{
var d = new UpdatePhotoDelegate(UpdatePhoto);
Invoke(d, new object[] { photo});
}
else
{
if (photo != null)
{
var ms = new MemoryStream(photo.ToArray());
var bmp = new Bitmap(ms);
pbPhoto.BackgroundImage = bmp;
}
}
}
And all is well!
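The same conclusion can be checked outside WinForms. This is a minimal console sketch, assuming byte[] as a stand-in for the Binary type and using made-up names (NullArgumentDemo, DescribeDelegate, Describe): passing null through a delegate is fine, and only code that dereferences the argument without a guard would fail.

```csharp
using System;

public static class NullArgumentDemo
{
    // Hypothetical delegate; byte[] stands in for the Binary photo type.
    public delegate string DescribeDelegate(byte[] photo);

    public static string Describe(byte[] photo)
    {
        // Guard before dereferencing, as in the fixed UpdatePhoto.
        if (photo == null)
        {
            return "no photo";
        }
        return photo.Length + " bytes";
    }

    public static void Main()
    {
        var d = new DescribeDelegate(Describe);

        // Direct invocation with a null argument is allowed.
        Console.WriteLine(d(null));                               // prints no photo

        // So is late-bound invocation with an argument array holding null.
        Console.WriteLine(d.DynamicInvoke(new object[] { null })); // prints no photo

        Console.WriteLine(d(new byte[] { 1, 2, 3 }));             // prints 3 bytes
    }
}
```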