How to store information on disk frequently without introducing delays? - C#

I really don't want to introduce any delays in my high-frequency trading software, yet at the same time I need to store thousands of log lines every second. A 1 ms delay would be huge; I can only accept a 0.01-0.05 ms delay.
*Now* I just allocate 500 MB in memory at start-up, store logs there, and when the application finishes I write this log to disk.
However, I now realize that I want more logs, and I want them during application execution. So I want to store logs while the application is running (probably once per minute or once per 10 minutes). How slow is StreamWriter.WriteLine? Would it be slower than just adding to a preallocated collection?
Should I use StreamWriter.WriteLine directly (is it synchronous or asynchronous, and does the AutoFlush option affect performance)? I could also use a BlockingCollection to add items to the log and then use a dedicated thread to process that blocking collection and store the logs on disk.

Don't
Reinvent the wheel
Do
Use a logging framework
Properly configure loggers and levels for each logger
Use synchronous logging for memory targets (it's simple and fast, but events may not be persisted to disk if the process dies) and asynchronous logging for I/O targets (it is difficult to get right, slow, and harder to test)
If you haven't done so already, check out log4net and NLog; they are a good place to start.

You could probably store your logs in a circular buffer and spawn a separate thread of execution that just writes the data from that in-memory buffer to disk.
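A rough sketch of that producer/consumer idea (the class name, queue capacity, and file handling are illustrative, not from the question): the trading threads only do an in-memory Add, and a single dedicated background thread owns the StreamWriter and does all the disk I/O.
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// Sketch only: hot path enqueues; one background thread does all disk I/O.
sealed class AsyncFileLogger : IDisposable
{
    private readonly BlockingCollection<string> _queue =
        new BlockingCollection<string>(1000000);              // bounded so memory can't grow forever
    private readonly StreamWriter _writer;
    private readonly Thread _writerThread;

    public AsyncFileLogger(string path)
    {
        _writer = new StreamWriter(path) { AutoFlush = false };   // let the stream batch writes
        _writerThread = new Thread(WriteLoop) { IsBackground = true };
        _writerThread.Start();
    }

    // Called from the latency-sensitive threads: no disk I/O here.
    public void Log(string line)
    {
        _queue.Add(line);
    }

    private void WriteLoop()
    {
        foreach (string line in _queue.GetConsumingEnumerable())
            _writer.WriteLine(line);                          // the slow part stays on this thread
        _writer.Flush();
    }

    public void Dispose()
    {
        _queue.CompleteAdding();                              // let the writer drain what is left
        _writerThread.Join();
        _writer.Dispose();
    }
}
Note that with a bounded queue, Add can block if the writer thread falls behind; TryAdd with a drop policy is the alternative when losing a log line is preferable to stalling the hot path.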

Use log4net as Andre Calil suggests. It logs to SQL, disks and whatnot and is extremely customizable. It can seem a bit complicated at first, but it is worth the effort.
What you need is probably the RollingFileAppender. log4net is available on NuGet, but you should read the documentation on the log4net site. Start by looking at the appender config.
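To give a feel for it, here is a minimal programmatic RollingFileAppender setup (the file name, size limits, and pattern are placeholders; the XML config described in the log4net docs expresses the same thing):
using log4net;
using log4net.Appender;
using log4net.Config;
using log4net.Layout;

static class LogSetup
{
    public static void Configure()
    {
        var layout = new PatternLayout("%date [%thread] %-5level %logger - %message%newline");
        layout.ActivateOptions();

        var appender = new RollingFileAppender
        {
            File = "logs/app.log",                      // illustrative path
            AppendToFile = true,
            RollingStyle = RollingFileAppender.RollingMode.Size,
            MaximumFileSize = "10MB",
            MaxSizeRollBackups = 10,
            StaticLogFileName = true,
            Layout = layout
        };
        appender.ActivateOptions();

        BasicConfigurator.Configure(appender);          // register the appender with log4net
    }
}
// Usage: LogSetup.Configure(); then
// var log = LogManager.GetLogger(typeof(Program)); log.Info("started");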


ETW Logging - TraceEventSession overwrites file

I have an IIS application which uses TraceEventSession to capture ETW messages and forward them to a log file:
TraceEventSession _etwSession = new TraceEventSession(
    "MyEtwLog", @"C:\Logs\MyEtwLog.etl") { BufferSizeMB = 100 };
_etwSession.EnableProvider(
    TraceEventProviders.GetEventSourceGuidFromName("MyEtwLog"),
    TraceEventLevel.Always);
It is working fine, but for some reason every time I restart the app it overwrites the log file instead of appending to it. Any idea what I'm missing?
Thanks in advance
Disclaimer: I contribute to TraceEvent internally at Microsoft
You can append to an ETW file using EVENT_TRACE_FILE_MODE_APPEND, which TraceEvent doesn't support today. We could potentially add it, but I can see more problems with exposing the API than not.
TL;DR: each file contains a full ETW session along with its metadata, and each session can have different options (clock resolution, for example), which can cause subtle timestamp inaccuracies that go under your radar and cause you grief at some point.
Here's what I recommend. I've put in a sleep, but you'd add some logic to decide when to rotate the files (like keeping track of when your IIS instance recycles).
var _etwSession = new TraceEventSession("MyEtwLog", @"C:\Logs\MyEtwLog." + MyTimestamp + ".etl");
_etwSession.EnableProvider(new Guid("MyGuid"), TraceEventLevel.Always);
Thread.Sleep(1000 * 60);
_etwSession.SetFileName(@"C:\Logs\MyEtwLog" + timestamp + ".etl");
A little background:
ETW files are binary consumer data (your log messages) plus metadata provided by the ETW subsystem that every log message gets for free, like ThreadID, the logical processor number, whether it came from kernel or user mode, and lastly but most importantly the timestamp, which is actually a value dependent on the processor frequency.
In addition to the above, ETW "rundown" data (think of it as the state of the operating system) is also flushed to the file at the beginning and end of the session.
And despite the fact that most consumers think ETW logs are just plain logs, they are not quite. They are intimately tied to the time the trace was taken (remember, ETW was mostly used for perf analysis by the Windows kernel team in the beginning). Things have improved in this regard recently, such that the file can be treated fully independently, and for your purpose it may very well be.
But I can imagine many cases where appending to the same file is not a good idea.
Oh, and another big one: ETW files are read sequentially from beginning to end every single time. That is, the file will keep growing and you can't start reading from the middle, at least not in a supported way :-)
And lastly, you still wouldn't want to append: imagine you write a log file foo.etl, then you go buy a spanking new processor and append to that log file; all the timestamps collected in your previous session will be off by some amount.
I do not think you are missing anything. There is no way to append to an existing file with TraceEventSession. Merging different files together with TraceEventSession.Merge is possible, but it will only yield correct results if they are from the same machine and it was not restarted in the meantime.
In the ETW API there is a way to append with the EVENT_TRACE_FILE_MODE_APPEND option if you want to do it outside of TraceEventSession, but it has limitations.

How to prevent NHibernate long-running process from locking up web site?

I have an NHibernate MVC application that is using ReadCommitted Isolation.
On the site, there is a certain process that the user can initiate which, depending on the input, may take several minutes. Because the session is per request, it stays open for that entire time.
But while that runs, no other user can access the site (they can try, but their request won't go through until the long-running operation is finished).
What's more, I also need a console app that performs this same long-running function while connecting to the same database, and it causes the same issue.
I'm not sure what part of my setup is wrong, any feedback would be appreciated.
NHibernate is set up with fluent configuration and StructureMap.
Isolation level is set as ReadCommitted.
The session factory lifecycle is HybridLifecycle (which on the web should be session-per-request, but in the console app would be ThreadLocal).
It sounds like your requests are waiting on database locks. Your options are really:
Break the long running process into a series of smaller transactions.
Use ReadUncommitted isolation level most of the time (this is appropriate in a lot of use cases).
Judicious use of Snapshot isolation level (Assuming you're using MS-SQL 2005 or later).
(N.B. I'm assuming the long-running function does a lot of reads/writes and the requests being blocked are primarily doing reads.)
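If it helps, the isolation level can also be chosen per transaction rather than globally; a minimal sketch (the class name and the work inside are placeholders):
using System.Data;
using NHibernate;

// Sketch: run just the long-running, read-heavy work at a looser isolation level,
// leaving the application-wide ReadCommitted default untouched.
public static class LongRunningJob
{
    public static void Run(ISessionFactory sessionFactory)
    {
        using (ISession session = sessionFactory.OpenSession())
        using (ITransaction tx = session.BeginTransaction(IsolationLevel.ReadUncommitted))
        {
            // ... the queries that make up the long-running process ...
            tx.Commit();
        }
    }
}
(For IsolationLevel.Snapshot, ALLOW_SNAPSHOT_ISOLATION has to be enabled on the SQL Server database first.)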
As has been suggested, breaking your process down into multiple smaller transactions will probably be the solution.
I would suggest looking at something like Rhino Service Bus or NServiceBus (my preference is Rhino Service Bus - I find it much simpler to work with personally). What that allows you to do is break the functionality down into small chunks while maintaining the transactional nature. Essentially, with a service bus you send a message to initiate a piece of work; the piece of work is enlisted in a distributed transaction along with receiving the message, so if something goes wrong the message will not just disappear and leave your system in a potentially inconsistent state.
Depending on what you need to do, you could send an initial message to start the processing, and then after each step send a new message to initiate the next step. This can really help to break the transactions down into much smaller pieces of work (and simplify the code). The two service buses I mentioned (there is also MassTransit) also have things like retries and error handling built in, so that if something goes wrong the message ends up in an error queue; you can investigate what went wrong, hopefully fix it, and reprocess the message, thus ensuring your system remains consistent.
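As a very rough illustration of the message-per-step idea (the message types, handler shape, and SendLocal calls below are assumptions written against the older synchronous NServiceBus handler API; Rhino Service Bus has an analogous consumer interface):
using NServiceBus;

// Hypothetical messages: each one represents one small, independently
// transactional step of the formerly long-running process.
public class StartLongProcess : ICommand { public int JobId { get; set; } }
public class ProcessNextChunk : ICommand { public int JobId { get; set; } public int Chunk { get; set; } }

public class LongProcessHandler :
    IHandleMessages<StartLongProcess>,
    IHandleMessages<ProcessNextChunk>
{
    public IBus Bus { get; set; }   // injected by the bus

    public void Handle(StartLongProcess message)
    {
        Bus.SendLocal(new ProcessNextChunk { JobId = message.JobId, Chunk = 0 });
    }

    public void Handle(ProcessNextChunk message)
    {
        // Do one small chunk of work here, inside this message's own transaction.
        bool moreToDo = DoChunk(message.JobId, message.Chunk);
        if (moreToDo)
            Bus.SendLocal(new ProcessNextChunk { JobId = message.JobId, Chunk = message.Chunk + 1 });
    }

    private bool DoChunk(int jobId, int chunk)
    {
        // ... NHibernate work for this chunk, in its own short transaction ...
        return false;
    }
}
A failed chunk then only rolls back that one message's transaction and lands in the error queue for a retry, rather than undoing the whole run.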
Of course whether this is necessary depends on the requirements of your system :)
Another, but more complex solution would be:
You build a background robot application which runs on one of the machines
this background worker robot can receive "worker jobs" (the ones initiated by the user)
then the robot processes the jobs step by step in the background
Pitfalls are:
- you have to program this robot to be very stable
- you need to watch the robot somehow
Sure, this involves more work; on the flip side you will have the option to integrate more job types, enabling your system to process different things in the background.
I think the design of your application or its SQL statements has a problem; unless you are Facebook, I don't think any process should take this much time. It is better to review your design and find where the bottleneck is, instead of trying to make this long-running process work.
Also, sometimes an ORM is not a good fit for every scenario; did you try using stored procedures?

Which is the most performant method of writing logs for an application?

I am working on a client-server application in which multiple clients and a server communicate over sockets for financial transactions, where performance is very critical. Currently I am using StreamWriter from the System.IO namespace to write logs to a file. For a single transaction I need to call the StreamWriter method 50 times to log different values, and for more than 50,000 transactions the time taken by this logging becomes very significant.
How can I reduce the time the application spends logging? Do I need to choose some other approach, or another class instead of StreamWriter? What would be the best way to do logging in less time?
If performance is key, then I would consider looking at Event Tracing for Windows (AKA ETW).
.NET 4.5 and the introduction of the EventSource class have made ETW much easier to implement than in the past.
Vance Morrison's Weblog has some good articles on this topic.
For an overview of the architecture see Improve Debugging And Performance Tuning With ETW.
There is also the Semantic Logging Application Block from Microsoft's patterns & practices team, which makes it easier to incorporate EventSource functionality and manage logging behavior.
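A minimal EventSource sketch (the source name, event ids, and payloads here are illustrative) is all it takes to start emitting ETW events that tools such as PerfView or the Semantic Logging Application Block can consume:
using System.Diagnostics.Tracing;   // .NET 4.5+

// Illustrative event source: writing an event is a fast, non-blocking call,
// and the events go nowhere unless an ETW session (or listener) is enabled.
[EventSource(Name = "MyCompany-Transactions")]
public sealed class TransactionEventSource : EventSource
{
    public static readonly TransactionEventSource Log = new TransactionEventSource();

    [Event(1, Level = EventLevel.Informational)]
    public void TransactionStarted(long transactionId)
    {
        WriteEvent(1, transactionId);
    }

    [Event(2, Level = EventLevel.Informational)]
    public void TransactionCompleted(long transactionId, long elapsedTicks)
    {
        WriteEvent(2, transactionId, elapsedTicks);
    }
}
// Usage on the hot path: TransactionEventSource.Log.TransactionStarted(id);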
I suggest you try log4net; you can configure where to log (file, database, XML), when (batch, transaction, ...), and easily switch tracing levels (debug, info, warning, ...).
Writing a log system from scratch is not worth it.
I suggest logging to a database (high performance, maybe embedded SQLite/SQL CE). Bonus: you can structure and query your log entries.
To reduce the time taken by logging I suggest:
Ensure that the data being logged requires minimal conversion/formatting
Create or use a logging library that:
When called, puts the logging data (along with the time, thread id, and other tags to be logged) in a buffer.
Periodically flushes the buffered data to disk (i.e. immediately when the buffered data is big enough to fill at least one physical block in the log file, or immediately when the system becomes idle, or periodically every x seconds).
Locks the log file for exclusive write access, so you can view it while the software is running but other processes can't lock it under your feet.
Uses a separate thread to handle the flushing, i.e. doesn't slow down your worker threads.
If you have many server processes, consider using IPC to send the log data to a single point, to minimise the number of active files being written and the number of buffers in use (you may have to test to see if this is worth it, and you may have to add tags to show the source of each entry).
Do scheduled/idle time backups of the logs to prevent them getting too big.
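A rough sketch of the buffer-and-flush idea described in the list above (the flush interval, timestamp format, and exclusive-write file mode are illustrative choices):
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// Sketch: callers append to an in-memory buffer; a timer flushes it to a file
// opened so that other processes can read the log but not lock it for writing.
sealed class BufferedLog : IDisposable
{
    private readonly object _sync = new object();
    private readonly object _flushLock = new object();
    private List<string> _buffer = new List<string>();
    private readonly StreamWriter _writer;
    private readonly Timer _flushTimer;

    public BufferedLog(string path)
    {
        _writer = new StreamWriter(
            new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read));
        _flushTimer = new Timer(_ => Flush(), null, 1000, 1000);   // flush roughly once a second
    }

    public void Write(string message)
    {
        // Cheap: format and append in memory; no I/O on the caller's thread.
        string line = string.Format("{0:O} [{1}] {2}",
            DateTime.UtcNow, Thread.CurrentThread.ManagedThreadId, message);
        lock (_sync) _buffer.Add(line);
    }

    private void Flush()
    {
        List<string> pending;
        lock (_sync)
        {
            if (_buffer.Count == 0) return;
            pending = _buffer;
            _buffer = new List<string>();
        }
        lock (_flushLock)                      // timer callbacks may overlap; serialize file access
        {
            foreach (string line in pending) _writer.WriteLine(line);
            _writer.Flush();
        }
    }

    public void Dispose()
    {
        _flushTimer.Dispose();
        Flush();
        lock (_flushLock) _writer.Dispose();
    }
}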
Cache the values before writing them to disk.
Only commit the log when you have finished your transaction.
Something like this:
StringBuilder builder = new StringBuilder();
// do transaction step 1
builder.Append("Transaction step 1" + Environment.NewLine); // or however you add a line to your log
// step 2
builder.Append("Transaction step 2" + Environment.NewLine);
//...
// step 50
builder.Append("Transaction step 50" + Environment.NewLine);
// now write to the file
File.WriteAllText(@"C:\log.txt", builder.ToString());
You can add in some handling if there is an error in any step to still write the log.
You could also use an open-source tool like log4net: http://logging.apache.org/log4net/.
One way to improve the performance of your application and still write out all those log messages is to queue them up in an MSMQ queue and have a Windows service application process the queued log messages when the server is not under load. You could have the queue on a completely separate server too.
Implementation-wise, you can set up a WCF service that uses MSMQ for processing your log messages. That makes it easier than having to set up a Windows service yourself.
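For the MSMQ half of that, the raw System.Messaging version is only a few lines (the queue path is an assumption; the WCF variant exposes the same queue behind a netMsmqBinding endpoint):
using System.Messaging;   // reference System.Messaging.dll

// Sketch: the application fires log entries into a local private queue and carries on;
// a separate Windows service (or WCF/MSMQ endpoint) drains the queue and writes the file/DB.
static class QueuedLog
{
    private const string QueuePath = @".\private$\applog";   // assumed queue name
    private static readonly MessageQueue Queue = CreateQueue();

    private static MessageQueue CreateQueue()
    {
        if (!MessageQueue.Exists(QueuePath))
            MessageQueue.Create(QueuePath);
        return new MessageQueue(QueuePath);
    }

    public static void Write(string message)
    {
        Queue.Send(message);       // returns as soon as the message is queued
    }
}
// In the service that drains the queue, roughly:
//   var msg = queue.Receive();   // blocks until a message arrives
//   msg.Formatter = new XmlMessageFormatter(new[] { typeof(string) });
//   File.AppendAllText("app.log", (string)msg.Body + Environment.NewLine);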

Filewatcher for the whole computer (alternative?)

I want to write an application that gets events on every file change on the whole computer (to synchronize between file locations/permissions and my application's database).
I was thinking of using the .NET FileSystemWatcher class, but after some tests I found the following limitations:
1) The FileSystemWatcher has a buffer (http://msdn.microsoft.com/en-us/library/system.io.filesystemwatcher(v=vs.90).aspx):
If there are many changes in a short time, the buffer can overflow.
This causes the component to lose track of changes in the directory,
and it will only provide blanket notification. Increasing the size of
the buffer with the InternalBufferSize property is expensive, as it
comes from non-paged memory that cannot be swapped out to disk, so
keep the buffer as small yet large enough to not miss any file change
events. To avoid a buffer overflow, use the NotifyFilter and
IncludeSubdirectories properties so you can filter out unwanted change
notifications.
So across the whole computer I can get a large number of events (at peak) that I need to handle. Even if inside the event handler I only add the event info to a queue, I can still miss events (a sketch of what I mean by a minimal handler is below, after the list).
2) FileSystemWatcher has memory leaks:
http://connect.microsoft.com/VisualStudio/feedback/details/654232/filesystemwatcher-memory-leak
I checked it myself and it's true; after a few days my process memory grows from 20 MB to 250 MB.
3) Microsoft says that we should use FileSystemWatcher for specific directories (I don't know why):
Use FileSystemWatcher to watch for changes in a specified directory.
So for these reasons I need an alternative solution for my application. I know that I could write a driver, but I would prefer a .NET solution (based on the Win32 API, of course).
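For reference, this is the kind of minimal setup I mean (the root path, buffer size, and filters are just placeholders); even with the buffer at its 64 KB maximum and handlers that only enqueue, a machine-wide watch can still overflow:
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

class WatcherExample
{
    static readonly ConcurrentQueue<string> Pending = new ConcurrentQueue<string>();

    static void Main()
    {
        var watcher = new FileSystemWatcher(@"C:\")          // one watcher per root you care about
        {
            IncludeSubdirectories = true,
            InternalBufferSize = 64 * 1024,                  // 64 KB is the documented maximum
            NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite | NotifyFilters.Security
        };

        // Handlers do the absolute minimum so the internal buffer drains quickly.
        watcher.Created += (s, e) => Pending.Enqueue("created:" + e.FullPath);
        watcher.Changed += (s, e) => Pending.Enqueue("changed:" + e.FullPath);
        watcher.Deleted += (s, e) => Pending.Enqueue("deleted:" + e.FullPath);
        watcher.Error   += (s, e) => Pending.Enqueue("OVERFLOW");   // buffer overflowed, rescan needed

        watcher.EnableRaisingEvents = true;

        // A separate thread drains the queue and syncs with the database.
        new Thread(() =>
        {
            string item;
            while (true)
                if (Pending.TryDequeue(out item)) Process(item);
                else Thread.Sleep(10);
        }) { IsBackground = true }.Start();

        Console.ReadLine();
    }

    static void Process(string item) { /* update file locations/permissions in the database */ }
}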
Thank you for your help,
Omri
Adding monitoring (especially synchronous notifications) will slow down the system. You can probably make use of our CallbackFilter product, which provides a driver and a handy .NET API for tracking file changes. CallbackFilter also supports asynchronous notifications, which are faster. Discounted and free licenses are possible.
Try doing this through WMI imo - the following link is relevant: http://www.codeproject.com/Articles/42212/WMI-and-File-System-Monitoring
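For what it's worth, the core of that WMI approach is an event query over __InstanceOperationEvent for CIM_DataFile, roughly like this (the polling interval and the watched folder are assumptions; at machine-wide scope the event volume and the WITHIN latency are the trade-offs to test):
using System;
using System.Management;   // reference System.Management.dll

class WmiFileWatch
{
    static void Main()
    {
        // Poll every 5 seconds for create/modify/delete of files under C:\Data (assumed path).
        var query = new WqlEventQuery(
            "SELECT * FROM __InstanceOperationEvent WITHIN 5 " +
            "WHERE TargetInstance ISA 'CIM_DataFile' " +
            "AND TargetInstance.Drive = 'C:' AND TargetInstance.Path = '\\\\Data\\\\'");

        using (var watcher = new ManagementEventWatcher(query))
        {
            watcher.EventArrived += (s, e) =>
            {
                var file = (ManagementBaseObject)e.NewEvent["TargetInstance"];
                // The event class name tells you whether it was a creation, modification or deletion.
                Console.WriteLine("{0}: {1}", e.NewEvent.ClassPath.ClassName, file["Name"]);
            };
            watcher.Start();
            Console.ReadLine();
            watcher.Stop();
        }
    }
}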

TextWriterTraceListener on background thread

I've got a 3rd party component which uses the TraceSwitch functionality to allow me to output some traces of what's going on inside. Unfortunately, running the switches in verbose mode, with a TextWriterTraceListener as the consumer (outputs to file) slows down the application too much.
It isn't critical that the traced data is written immediately, so is there a way to get the data written on a lower priority thread? Perhaps a Task?
EDIT
Upon further investigation, it seems merely turning on the switches without attaching the listener causes the slowdown. I'm going to get a hold of the component provider.
Would still be interesting to hear an answer though.
Write your own TraceListener subclass. In it, put all the trace strings into a List<string>, and when the count gets high enough, write the list out to a file and clear it to start again. Flush the list on Dispose().
This can then easily be extended to use the thread pool to queue a task that does the actual write.
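A sketch of that listener (the flush threshold and file handling are arbitrary choices):
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

// Buffers trace output in memory and hands the actual file write to the
// thread pool once enough lines have accumulated.
public class BufferedFileTraceListener : TraceListener
{
    private readonly object _sync = new object();
    private readonly object _fileSync = new object();
    private List<string> _lines = new List<string>();
    private readonly string _path;
    private const int FlushCount = 1000;                 // arbitrary threshold

    public BufferedFileTraceListener(string path) { _path = path; }

    public override void Write(string message)     { Buffer(message); }
    public override void WriteLine(string message) { Buffer(message + Environment.NewLine); }

    private void Buffer(string text)
    {
        List<string> toWrite = null;
        lock (_sync)
        {
            _lines.Add(text);
            if (_lines.Count >= FlushCount)
            {
                toWrite = _lines;
                _lines = new List<string>();
            }
        }
        if (toWrite != null)
            Task.Run(() => AppendToFile(toWrite));       // the actual I/O happens off the tracing thread
    }

    private void AppendToFile(List<string> lines)
    {
        lock (_fileSync)                                 // serialize concurrent flushes
            File.AppendAllText(_path, string.Concat(lines));
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            List<string> remaining;
            lock (_sync) { remaining = _lines; _lines = new List<string>(); }
            AppendToFile(remaining);                     // final flush on Dispose()
        }
        base.Dispose(disposing);
    }
}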
This doesn't guarantee that you will improve performance. It will only help if you are sure it is the IO that is slowing things down.
