ETW Logging - TraceEventSession overwrites file - C#

I have an IIS application which uses TraceEventSession to capture ETW messages and forward them to a log file:
    TraceEventSession _etwSession = new TraceEventSession(
        "MyEtwLog", @"C:\Logs\MyEtwLog.etl") { BufferSizeMB = 100 };
    _etwSession.EnableProvider(
        TraceEventProviders.GetEventSourceGuidFromName("MyEtwLog"),
        TraceEventLevel.Always);
It is working fine, but for some reason every time I restart the app it overwrites the log file instead of appending to it. Any idea what I'm missing?
Thanks in advance

Disclaimer: I contribute to TraceEvent internally at Microsoft
You can append to an ETW file using EVENT_TRACE_FILE_MODE_APPEND, which TraceEvent doesn't support today. We could potentially add it, but I can see more problems with exposing that API than not.
TL;DR: each file contains a full ETW session with its metadata, and each session can have different options (clock resolution, for example), which can cause subtle timestamp inaccuracies that may go under your radar and cause you grief at some point.
Here's what I recommend. I've put in a sleep, but you'd replace it with logic that decides when to rotate the files (like tracking when your IIS instance recycles).
    var _etwSession = new TraceEventSession(
        "MyEtwLog", @"C:\Logs\MyEtwLog." + timestamp + ".etl");
    _etwSession.EnableProvider(new Guid("MyGuid"), TraceEventLevel.Always);
    Thread.Sleep(1000 * 60);
    // Rotate: the session keeps running but starts writing to a new file.
    _etwSession.SetFileName(@"C:\Logs\MyEtwLog." + newTimestamp + ".etl");
A little background:
ETW files contain binary consumer data (your log messages), plus metadata provided by the ETW subsystem that every log message gets for free: ThreadID, logical processor number, whether it came from kernel or user mode, and, last but most importantly, the timestamp, which is actually a value dependent on the processor frequency.
In addition to the above, ETW "rundown" (think of it as a snapshot of operating-system state) is also flushed to the file at the beginning and end of the session.
And despite the fact that most consumers think ETW logs are plain logs of a sort, they are sort of not. They are intimately tied to the time the trace was taken (remember, ETW was mostly used for perf analysis by the Windows kernel team in the beginning). Things have improved in this regard recently, such that the file can be treated fully independently, and for your purpose it may very well be.
But I can imagine many cases where appending to the same file is not a good idea.
Oh, and another big one: ETW files are read sequentially from beginning to end every single time. That is, the file will keep growing, and you can't read from the middle of it, at least not in a supported way :-)
And lastly, you still won't want to append: imagine you write a log file foo.etl, then you buy a spanking new processor and append to the same file; all the timestamps collected in your previous session will be off by some amount.

I do not think you are missing anything. There is no way to append to an existing file with TraceEventSession. Merging different files together with TraceEventSession.Merge is possible, but will only yield correct results if they are from the same machine and it was not restarted in the meantime.
In the ETW API there is a way to append with the EVENT_TRACE_FILE_MODE_APPEND option if you want to do it outside of TraceEventSession, but it has limitations.
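For reference, merging with TraceEvent looks roughly like this (a sketch; the file names are made up):

    // Merge ETL files from the same machine (and same boot session) into one.
    TraceEventSession.Merge(
        new[] { @"C:\Logs\MyEtwLog.1.etl", @"C:\Logs\MyEtwLog.2.etl" },
        @"C:\Logs\MyEtwLog.merged.etl");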

Which is the most performant method of writing logs for an application?

I am working on a client-server application in which multiple clients and a server communicate over sockets for financial transactions, where performance is very critical. Currently I am using StreamWriter from the System.IO namespace to write logs to a file. For a single transaction I need to call the StreamWriter 50 times to log different values, and with more than 50,000 transactions the time taken by this logging becomes very significant.
How can I reduce the time taken by the application to do logging? Do I need to choose some other approach, or another class instead of StreamWriter? What would be the best way to do logging in less time?
If performance is key, then I would consider looking at Event Tracing for Windows (AKA ETW).
With .NET 4.5 and the introduction of the EventSource class, ETW has become much easier to implement than in the past.
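To give a flavour of it, a minimal EventSource looks something like this (a sketch; the provider name and event are made up):

    using System.Diagnostics.Tracing;

    [EventSource(Name = "MyCompany-MyApp")] // hypothetical provider name
    public sealed class AppEventSource : EventSource
    {
        public static readonly AppEventSource Log = new AppEventSource();

        [Event(1, Level = EventLevel.Informational)]
        public void TransactionStarted(int transactionId)
        {
            WriteEvent(1, transactionId); // event ID must match the attribute
        }
    }

    // Usage on the hot path: AppEventSource.Log.TransactionStarted(42);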
Vance Morrison's Weblog has some good articles on this topic.
For an overview of the architecture see Improve Debugging And Performance Tuning With ETW.
There is also the Semantic Logging Application Block from Microsoft's patterns & practices team that makes it easier to incorporate the EventSource functionality and to manage logging behavior.
I suggest you try log4net: you can configure where it logs (file, database, XML) and when (batch, transaction, ...), and easily switch the tracing level (debug, info, warning, ...).
Writing a log system from scratch is not worth it.
I suggest logging to a database (high performance, maybe embedded SQLite/SQL CE). Bonus: you can structure and query your log entries.
To reduce the time taken by logging I suggest the following (a sketch follows the list):
- Ensure that the data being logged requires minimal conversion/formatting.
- Create or use a logging library that:
  - when called, puts the logging data (along with the time, thread ID, and other tags to be logged) in a buffer;
  - periodically flushes the buffered data to disk (immediately when the buffered data is big enough to fill at least one physical block in the log file, when the system becomes idle, or periodically every x seconds);
  - locks the log file for exclusive write access, so you can view it while the software is running but other processes can't lock it under your feet;
  - uses a separate thread to handle the flushing, i.e. doesn't slow down your worker threads.
- If you have many server processes, consider using IPC to send the log data to a single point, to minimise the number of active files being written and the number of buffers in use (you may have to test to see if this is worth it, and you may have to add tags to show the source of each entry).
- Do scheduled/idle-time backups of the logs to prevent them getting too big.
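A minimal sketch of such a logger (all names are illustrative, and the flush interval and block size are assumptions you would tune):

    using System;
    using System.IO;
    using System.Text;
    using System.Threading;

    public sealed class BufferedFileLogger : IDisposable
    {
        private readonly object _gate = new object();
        private readonly StringBuilder _buffer = new StringBuilder();
        private readonly StreamWriter _writer;
        private readonly Timer _flushTimer;
        private const int FlushThreshold = 4096; // roughly one physical block

        public BufferedFileLogger(string path)
        {
            // FileShare.Read: you can view the log while it runs, but no
            // other process can open it for writing.
            _writer = new StreamWriter(new FileStream(
                path, FileMode.Append, FileAccess.Write, FileShare.Read));
            // Flushing happens on a timer (thread-pool) thread, so worker
            // threads never touch the disk.
            _flushTimer = new Timer(_ => Flush(), null, 1000, 1000);
        }

        public void Log(string message)
        {
            lock (_gate)
            {
                _buffer.Append(DateTime.UtcNow.ToString("o"))
                       .Append(" [").Append(Thread.CurrentThread.ManagedThreadId)
                       .Append("] ").AppendLine(message);
                if (_buffer.Length >= FlushThreshold) Flush();
            }
        }

        private void Flush()
        {
            lock (_gate)
            {
                if (_buffer.Length == 0) return;
                _writer.Write(_buffer.ToString());
                _writer.Flush();
                _buffer.Clear();
            }
        }

        public void Dispose()
        {
            _flushTimer.Dispose();
            Flush();
            _writer.Dispose();
        }
    }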
Cache the values before writing them to disk.
Only commit the log when you have finished your transaction.
Something like this:
    StringBuilder builder = new StringBuilder();

    // do transaction step 1
    builder.AppendLine("Transaction step 1"); // or however you add a line to your log
    // step 2
    builder.AppendLine("Transaction step 2");
    // ...
    // step 50
    builder.AppendLine("Transaction step 50");

    // now write the whole transaction to the file in one call
    File.AppendAllText(@"C:\log.txt", builder.ToString());
You can add some error handling so the log is still written if any step fails.
You could also use an open-source tool like log4net: http://logging.apache.org/log4net/.
One way to improve the performance of your application while still writing out all those log messages is to queue them in an MSMQ queue, and have a Windows service process the queued log messages when the server is not under load. You could have the queue on a completely separate server too.
Implementation-wise, you can set up a WCF service that uses MSMQ for processing your log messages. That makes it easier than having to set up a Windows service.
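Sending to the queue is cheap on the hot path. A rough sketch with System.Messaging (the queue path is made up, and error handling is omitted):

    using System.Messaging; // reference System.Messaging.dll

    const string QueuePath = @".\Private$\AppLog"; // hypothetical local private queue

    if (!MessageQueue.Exists(QueuePath))
        MessageQueue.Create(QueuePath);

    using (var queue = new MessageQueue(QueuePath))
    {
        // Returns as soon as the message is handed to MSMQ; the Windows
        // service on the other side does the slow file/database work.
        queue.Send("2017-01-01T12:00:00Z INFO transaction 42 completed");
    }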

How to store information on disk frequently without introducing delays?

I really don't want to introduce any delays in my high-frequency trading software, and at the same time I need to store thousands of lines of logs every second. A 1 ms delay would be huge; I can only accept a 0.01-0.05 ms delay.
*Now* I just allocate 500 MB in memory at start-up, store logs there, and when the application finishes I write the log to disk.
However, I've now realized that I want more logs, and I want them during application execution. So I now want to store logs while the application runs (probably once per minute or once per 10 minutes). How slow is StreamWriter.WriteLine? Would it be slower than just adding to a preallocated collection?
Should I use StreamWriter.WriteLine directly (is it synchronous or asynchronous, and does the AutoFlush option affect performance)? I could also use a BlockingCollection to add items to the log and then use a dedicated thread to process this blocking collection and store the logs on disk.
Don't:
- reinvent the wheel.
Do:
- use a logging framework;
- properly configure the loggers and levels for each logger;
- use sync logging for in-memory loggers (simple and fast, but events may not be persisted to the drive) and async logging for I/O loggers (difficult to get right, slower, harder to test).
If you haven't done so, check out log4net and NLog; they are a good place to start.
You could probably store your logs in a circular buffer and spawn a separate thread that just writes data from that in-memory buffer to disk.
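A bounded BlockingCollection, which the question already mentions, is a simple stand-in for that circular buffer (it blocks producers rather than overwriting when full). A minimal sketch, with the capacity and file name as illustrative values:

    using System.Collections.Concurrent;
    using System.IO;
    using System.Threading;

    // Bounded, so a stalled disk produces back-pressure instead of
    // unbounded memory growth.
    var queue = new BlockingCollection<string>(boundedCapacity: 1 << 20);

    var writerThread = new Thread(() =>
    {
        using (var log = new StreamWriter(@"C:\Logs\app.log", true))
        {
            foreach (var line in queue.GetConsumingEnumerable())
                log.WriteLine(line); // StreamWriter buffers internally
        }
    });
    writerThread.Start();

    // Hot path: an in-memory enqueue, typically well inside the
    // 0.01-0.05 ms budget from the question.
    queue.Add("fill #12345 price=101.25");

    // On shutdown: drain what's left, then let the thread exit.
    queue.CompleteAdding();
    writerThread.Join();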
Use log4net as Andre Calil suggests. It logs to SQL, disk and whatnot, and is extremely customizable. It can seem a bit complicated at first, but it is worth the effort.
What you need is probably the RollingFileAppender. log4net is on NuGet, but you should read the documentation at the log4net site. Start by looking at the appender config.
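For reference, a RollingFileAppender configuration looks roughly like this (values are illustrative; see the log4net documentation for the full set of options):

    <log4net>
      <appender name="RollingFile" type="log4net.Appender.RollingFileAppender">
        <file value="logs\app.log" />
        <appendToFile value="true" />
        <rollingStyle value="Size" />
        <maxSizeRollBackups value="10" />
        <maximumFileSize value="10MB" />
        <staticLogFileName value="true" />
        <layout type="log4net.Layout.PatternLayout">
          <conversionPattern value="%date [%thread] %-5level %logger - %message%newline" />
        </layout>
      </appender>
      <root>
        <level value="INFO" />
        <appender-ref ref="RollingFile" />
      </root>
    </log4net>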

Filewatcher for the whole computer (alternative?)

I want to write an application that gets events on every file change on the whole computer (to synchronize between file locations/permissions and my application's database).
I was thinking of using the .NET FileSystemWatcher class, but after some tests I found the following limitations:
1) FileSystemWatcher has a buffer (http://msdn.microsoft.com/en-us/library/system.io.filesystemwatcher(v=vs.90).aspx):
If there are many changes in a short time, the buffer can overflow.
This causes the component to lose track of changes in the directory,
and it will only provide blanket notification. Increasing the size of
the buffer with the InternalBufferSize property is expensive, as it
comes from non-paged memory that cannot be swapped out to disk, so
keep the buffer as small yet large enough to not miss any file change
events. To avoid a buffer overflow, use the NotifyFilter and
IncludeSubdirectories properties so you can filter out unwanted change
notifications.
So across the whole computer I can get a large number of events (at peak) that I need to handle. Even if, inside the event handler, I only add the event info to a queue, I can still miss events.
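For illustration, this is the kind of setup I mean, with the handlers doing nothing but enqueueing (a minimal sketch; in reality there would be one watcher per drive):

    using System.Collections.Concurrent;
    using System.IO;

    var pending = new BlockingCollection<string>();

    var watcher = new FileSystemWatcher(@"C:\")
    {
        IncludeSubdirectories = true,
        // Watch only what is needed; fewer notification types mean
        // fewer entries competing for the buffer.
        NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite
                     | NotifyFilters.Security,
        // 64 KB is the maximum the buffer can be raised to.
        InternalBufferSize = 64 * 1024
    };

    watcher.Created += (s, e) => pending.Add(e.FullPath);
    watcher.Changed += (s, e) => pending.Add(e.FullPath);
    watcher.Error += (s, e) => pending.Add("!overflow"); // buffer overflowed: rescan
    watcher.EnableRaisingEvents = true;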
2) FileSystemWatcher has memory leaks:
http://connect.microsoft.com/VisualStudio/feedback/details/654232/filesystemwatcher-memory-leak
I checked it myself and it's true: after a few days my process memory grows from 20 MB to 250 MB.
3) Microsoft says that we should use FileSystemWatcher for specific folders (I don't know why):
Use FileSystemWatcher to watch for changes in a specified directory.
So for these reasons I need an alternative solution for my application. I know that I could write a driver, but I'd prefer a .NET solution (based on the Win32 API, of course).
Thank you for your help,
Omri
Monitoring the whole file system (especially with synchronous notifications) will slow the system down. You could make use of our CallbackFilter product, which provides a driver and a handy .NET API for tracking file changes. CallbackFilter also supports asynchronous notifications, which are faster. Discounted and free licenses are possible.
Try doing this through WMI, IMO; the following link is relevant: http://www.codeproject.com/Articles/42212/WMI-and-File-System-Monitoring
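Roughly, the approach from that article looks like this (a sketch; the WITHIN polling interval and the drive filter are illustrative, and an unfiltered whole-machine query will be expensive):

    using System;
    using System.Management; // reference System.Management.dll

    var query = new WqlEventQuery(
        "SELECT * FROM __InstanceModificationEvent WITHIN 5 " +
        "WHERE TargetInstance ISA 'CIM_DataFile' " +
        "AND TargetInstance.Drive = 'C:'");

    var watcher = new ManagementEventWatcher(query);
    watcher.EventArrived += (s, e) =>
    {
        var file = (ManagementBaseObject)e.NewEvent["TargetInstance"];
        Console.WriteLine("Changed: " + file["Name"]);
    };
    watcher.Start(); // remember to Stop() and Dispose() on shutdown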

Logging to files or to event viewer?

I was wondering what the 'correct' way to log information messages is: to files, or to a special log in the Event Viewer?
I like logging to files, since I can use a rolling flat-file listener and see a fresh new log each day; plus, in the Event Viewer I can only see one message at a time, whereas in a file I can scan through the day much more easily. My colleague argues that files just take up space, and he likes having his warnings, errors and information messages all in one place. What do you think? Is there a preferred way? If so, why?
Also, are there any concurrency issues with either method? I have read that EntLib is thread-safe and wraps calls in a Monitor.Enter behind the scenes if the listener is not thread-safe, but I want to make sure (we're just using Logger.Write). We are using EntLib 3.1.
Thank you in advance.
Here's the rule of thumb that I use when logging messages.
EventLog (if you have access of course)
- We always log Unhandled Exceptions
- In most cases we log Errors or Fatals
- In some cases we log Warnings
- In some very rare cases we log Information
- We will never log useless general messages like: "I'm here, blah, blah, blah"
Log File
- General rule: we log everything, but we can choose the level or filter to turn down the volume of messages being logged
The EventLog is always a good option because it's bound to WMI. This way, products like OpenView and the like can monitor and alert ops if something goes haywire. However, keep the messages to a minimum: it's slow, it's size-limited on a per-message basis, and it has an entry limit, so you can easily fill up the EventLog quite quickly and your application has to handle the dreaded "EventLog is Full" exception :)
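For what it's worth, writing an entry is the easy part; it's registering the source that needs admin rights, so that is normally done once at install time. A sketch (the source name and event ID are made up):

    using System.Diagnostics;

    const string Source = "MyApp"; // hypothetical event source

    // Needs admin rights; normally done by the installer, not at run time.
    if (!EventLog.SourceExists(Source))
        EventLog.CreateEventSource(Source, "Application");

    try
    {
        EventLog.WriteEntry(Source, "Payment service unreachable",
                            EventLogEntryType.Error, 1001);
    }
    catch (System.ComponentModel.Win32Exception)
    {
        // e.g. the dreaded "EventLog is Full" case: fall back to the file log
    }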
Hope this helps...
There is no 'correct' way. It depends on your requirements.
You 'like' looking at flat files, but how many (thousands of) lines can you really read every day?
What you seem to need is a plan (policy), and that ought to involve some tooling. Ask yourself: how quickly will you notice an anomaly in the logs? And the absence of something normal?
The event log is a bit more work/overhead, but it can easily be monitored remotely (across multiple servers) by some tool. If you are using (only) manual inspection, don't bother.
In enterprise applications there are different types of logs, such as:
- Activity logs - technical logs that instrument a process and are useful in debugging
- Audit logs - logs used for auditing purposes. The availability of such logs is a legal requirement in some cases.
What to store where:
Audit logs, or any logs with sensitive information, should go to a database where they can be stored securely.
For activity logs my preference is files. But we should also have different log levels, such as Error, Info and Verbose, which should be configurable. This makes it possible to save the space and time required for logging when it is not needed.
You should write to the event log only when you are not able to write to a file.
Consider asking your customer admins or technical support people where they want the logs to be placed.
As to being thread-safe, yes, EntLib is thread-safe.
I would recommend the Event Viewer, but in cases where you don't have admin rights or access to the Event Viewer, logging to normal files would be the better option.
I prefer logging to a database; that way I can profile my logs, generate statistics and trends on the errors occurring, and fix the most frequent ones.
For external customers I use a web service, called asynchronously, to report the error. (I swallow any exceptions in it so logging errors wouldn't affect the client; not that I've had any, using log4net and L4NDash.)

Looking for solution ideas on how to update files in real time that may be locked by other software

I'm interested in getting solution ideas for a problem we have.
Background:
We have software tools that run on laptops and flash data onto hardware components. This software reads in a series of data files in order to do the programming on the hardware. It's in a manufacturing environment and is running continuously throughout the day.
Problem:
Currently, there's a central repository that the software connects to in order to read the data files. The software reads the files and retains a lock on them throughout the entire flashing process. This runs all throughout the day on different hardware components, so it's feasible that these files could be "locked" for most of the day.
There are new requirements stating that the data files the software reads need to be updated in real time, with minimal impact to the end user who is doing the flashing. We will be writing the service that drops the files out there in real time.
The software is developed by a third party vendor and is not modifiable by us. However, it expects a location to look for the data files, so everything up until the point of flashing is our process that we're free to change.
Question:
What approach would you take to solve this from a solution programming standpoint? We're not sure how to drop files out there in real time given the locks that will be present on them throughout the day. We'll settle for an "as soon as possible" solution if that is significantly easier.
The only way out of this conundrum seems to be the introduction of an extra file repository, along with a service-like piece of logic in charge of keeping these repositories synchronized.
In other words, the file upload takes place in one of the repositories (call it the "input repository"), and the flashing process uses the other repository (call it the "output repository"). The synchronization logic continuously polls the input repository for new files (based on file timestamp or other criteria...) and, when it finds such new files, attempts to copy them to the output repository; each copy either takes place instantly, when the flashing logic hasn't locked the corresponding file in the output repository, or is deferred until the file gets unlocked.
Note: during the file copy, the synchronization logic can/should lock the file, temporarily preventing it from being overwritten by new uploads but ensuring the full integrity of the copied file. The difference from the existing system is that this lock is held for a much shorter amount of time.
The drawback of this scheme is the full duplication of the repository, which could be a problem if the repository is very big. However, there don't appear to be many alternatives, since we do not have control over the flashing process.
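A sketch of that synchronization logic (the paths and polling interval are made up, and error handling is trimmed):

    using System;
    using System.IO;
    using System.Threading;

    const string InputRepo = @"C:\InputRepo";   // where new uploads land
    const string OutputRepo = @"C:\OutputRepo"; // where the flashing tool reads

    while (true)
    {
        foreach (var file in Directory.GetFiles(InputRepo))
        {
            var target = Path.Combine(OutputRepo, Path.GetFileName(file));
            // Copy only files that are new or changed since the last pass.
            if (!File.Exists(target) ||
                File.GetLastWriteTimeUtc(file) > File.GetLastWriteTimeUtc(target))
            {
                try
                {
                    File.Copy(file, target, true);
                }
                catch (IOException)
                {
                    // Target still locked by the flashing tool: the copy is
                    // deferred, and the next pass of the loop retries it.
                }
            }
        }
        Thread.Sleep(1000); // polling interval
    }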
"As soon as possible" is your only option. You can't update a file that's locked, that's the whole point of a lock.
Edit:
Would it be possible to put the new file in a different location and then tell the 3rd party service to look in that location the next time it needs the file?
