I'm using timestamps to temporally order concurrent changes in my program, and require that each timestamp of a change be unique. However, I've discovered that simply calling DateTime.Now is insufficient, as it will often return the same value if called in quick succession.
I have some thoughts, but nothing strikes me as the "best" solution to this. Is there a method I can write that will guarantee each successive call produces a unique DateTime?
Should I perhaps be using a different type for this, maybe a long int? DateTime has the obvious advantage of being easily interpretable as a real time, unlike, say, an incremental counter.
Update: Here's what I ended up coding as a simple compromise solution that still allows me to use DateTime as my temporal key, while ensuring uniqueness each time the method is called:
private static long _lastTime; // records the 64-bit tick value of the last time
private static object _timeLock = new object();
internal static DateTime GetCurrentTime() {
lock ( _timeLock ) { // prevent concurrent access to ensure uniqueness
DateTime result = DateTime.UtcNow;
if ( result.Ticks <= _lastTime )
result = new DateTime( _lastTime + 1 );
_lastTime = result.Ticks;
return result;
}
}
Because each tick value is only one 10-millionth of a second, this method only introduces noticeable clock skew when called on the order of 10 million times per second (which, by the way, it is efficient enough to execute at), meaning it's perfectly acceptable for my purposes.
Here is some test code:
DateTime start = DateTime.UtcNow;
DateTime prev = Kernel.GetCurrentTime();
Debug.WriteLine( "Start time : " + start.TimeOfDay );
Debug.WriteLine( "Start value: " + prev.TimeOfDay );
for ( int i = 0; i < 10000000; i++ ) {
var now = Kernel.GetCurrentTime();
Debug.Assert( now > prev ); // no failures here!
prev = now;
}
DateTime end = DateTime.UtcNow;
Debug.WriteLine( "End time: " + end.TimeOfDay );
Debug.WriteLine( "End value: " + prev.TimeOfDay );
Debug.WriteLine( "Skew: " + ( prev - end ) );
Debug.WriteLine( "GetCurrentTime test completed in: " + ( end - start ) );
...and the results:
Start time: 15:44:07.3405024
Start value: 15:44:07.3405024
End time: 15:44:07.8355307
End value: 15:44:08.3417124
Skew: 00:00:00.5061817
GetCurrentTime test completed in: 00:00:00.4950283
So in other words, in half a second it generated 10 million unique timestamps, and the final result was only pushed ahead by half a second. In real-world applications the skew would be unnoticeable.
One way to get a strictly ascending sequence of timestamps with no duplicates is the following code.
Compared to the other answers here this one has the following benefits:
The values track closely with actual real-time values (except in extreme circumstances with very high request rates when they would get slightly ahead of real-time).
It's lock free and should perform better that the solutions using lock statements.
It guarantees ascending order (simply appending a looping a counter does not).
public class HiResDateTime
{
private static long lastTimeStamp = DateTime.UtcNow.Ticks;
public static long UtcNowTicks
{
get
{
long original, newValue;
do
{
original = lastTimeStamp;
long now = DateTime.UtcNow.Ticks;
newValue = Math.Max(now, original + 1);
} while (Interlocked.CompareExchange
(ref lastTimeStamp, newValue, original) != original);
return newValue;
}
}
}
Also note the comment below that original = Interlocked.Read(ref lastTimestamp); should be used since 64-bit read operations are not atomic on 32-bit systems.
Er, the answer to your question is that "you can't," since if two operations occur at the same time (which they will in multi-core processors), they will have the same timestamp, no matter what precision you manage to gather.
That said, it sounds like what you want is some kind of auto-incrementing thread-safe counter. To implement this (presumably as a global service, perhaps in a static class), you would use the Interlocked.Increment method, and if you decided you needed more than int.MaxValue possible versions, also Interlocked.Read.
DateTime.Now is only updated every 10-15ms.
Not a dupe per se, but this thread has some ideas on reducing duplicates/providing better timing resolution:
How to get timestamp of tick precision in .NET / C#?
That being said: timestamps are horrible keys for information; if things happen that fast you may want an index/counter that keeps the discrete order of items as they occur. There is no ambiguity there.
I find that the most foolproof way is to combine a timestamp and an atomic counter. You already know the problem with the poor resolution of a timestamp. Using an atomic counter by itself also has the simple problem of requiring its state be stored if you are going to stop and start the application (otherwise the counter starts back at 0, causing duplicates).
If you were just going for a unique id, it would be as simple as concatenating the timestamp and counter value with a delimiter between. But because you want the values to always be in order, that will not suffice. Basically all you need to do is use the atomic counter value to add addition fixed width precision to your timestamp. I am a Java developer so I will not be able to provide C# sample code just yet, but the problem is the same in both domains. So just follow these general steps:
You will need a method to provide you with counter values cycling from 0-99999. 100000 is the maximum number of values possible when concatenating a millisecond precision timestamp with a fixed width value in a 64 bit long. So you are basically assuming that you will never need more than 100000 ids within a single timestamp resolution (15ms or so). A static method, using the Interlocked class to provide atomic incrementing and resetting to 0 is the ideal way.
Now to generate your id you simply concatenate your timestamp with your counter value padded to 5 characters. So if your timestamp was 13023991070123 and your counter was at 234 the id would be 1302399107012300234.
This strategy will work as long as you are not needing ids faster than 6666 per ms (assuming 15ms is your most granular resolution) and will always work without having to save any state across restarts of your application.
It can't be guaranteed to be unique, but is perhaps using ticks is granular enough?
A single tick represents one hundred
nanoseconds or one ten-millionth of a
second. There are 10,000 ticks in a
millisecond.
Not sure what you're trying to do entirely but maybe look into using Queues to handle sequentially process records.
Related
I have a game file with millions of events, file size can be > 10gb
Each line is a game action, like:
player 1, action=kill, timestamp=xxxx(ms granularity)
player 1, action=jump, timestamp=xxxx
player 2, action=fire, timestamp=xxxx
Each action is unique and finite for this data set.
I want to perform analysis on this file, like the total number of events per second, while tracking the individual number of actions in that second.
My plan in semi pseudocode:
lastReadGameEventTime = DateTime.MinValue;
while(line=getNextLine() != null)
{
parse_values(lastReadGameEventTime, out var timestamp, out var action);
if(timestamp == MinValue)
{
lastReadGameEventTime = timestamp;
}
else if(timestamp.subtract(lastReadGameEventTime).TotalSeconds > 1)
{
notify_points_for_this_second(datapoints);
datapoints = new T();
}
if(!datapoints.TryGetValue(action, out var act))
act = new Dictionary<string,int>();
act[action] = 0;
else
act[action]++;
}
lastReadGameEventTime = parse_time(line)
My worry is that this is too naive. I was thinking maybe count the entire minute and get the average per second. But of course I will miss game event spikes.
And if I want to calculate a 5 day average, it will further degrade the result set.
Any clever ideas?
You're asking several different questions here. All are related. Your requirements aren't real detailed, but I think I can point you in the right direction. I'm going to assume that all you want is number of events per second, for some period in the past. So all we need is some way to hold an integer (count of events) for every second during that period.
There are 86,400 seconds in a day. Let's say you want 10 days worth of information. You can build a circular buffer of size 864,000 to hold 10 days' worth of counts:
const int SecondsPerDay = 86400;
const int TenDays = 10 * SecondsPerDay;
int[] TenDaysEvents = new int[TenDays];
So you always have the last 10 days' of counts.
Assuming you have an event handler that reads your socket data and passes the information to a function, you can easily keep your data updated:
DateTime lastEventTime = DateTime.MinValue;
int lastTimeIndex = 0;
void ProcessReceivedEvent(string event)
{
// here, parse the event string to get the DateTime
DateTime eventTime = GetEventDate(event);
if (lastEventTime == DateTime.MinValue)
{
lastTimeIndex = 0;
}
else if (eventTime != lastEventTime)
{
// get number of seconds since last event
var elapsedTime = eventTime - lastEventTime;
var elapsedSeconds = (int)elapsedTime.TotalSeconds;
// For each of those seconds, set the number of events to 0
for (int i = 1; i <= elapsedSeconds; ++i)
{
lastTimeIndex = (lastTimeIndex + 1) % TenDays; // wrap around if we get past the end
TenDaysEvents[lastTimeIndex] = 0;
}
}
// Now increment the count for the current time index
++TenDaysEvents[lastTimeIndex];
}
This keeps the last 10 days in memory at all times, and is easy to update. Reporting is a bit more difficult because the start might be in the middle of the array. That is, if the current index is 469301, then the starting time is at 469302. It's a circular buffer. The naive way to report on this would be to copy the circular buffer to another array or list, with the starting point at position 0 in the new collection, and then report on that. Or, you could write a custom enumerator that counts back from the current position and starts there. That wouldn't be especially difficult to create.
The beauty of the above is that your array remains static. You allocate it once, and just re-use it. You might want to add an extra 60 entries, though, so that there's some "buffer" between the current time and the time from 10 days ago. That will prevent the data for 10 days ago from being changed during a query. Add an extra 300 items, to give yourself a 5-minute buffer.
Another option is to create a linked list of entries. Again, one per second. With that, you add items to the end of the list, and remove older items from the front. Whenever an event comes in for a new second, add the event entry to the end of the list, and then remove entries that are more than 10 days (or whatever your threshold is) from the front of the list. You can still use LINQ to report on things, as recommended in another answer.
You could use a hybrid, too. As each second goes by, write a record to the database, and keep the last minute, or hour, or whatever in memory. That way, you have up-to-the-second data available in memory for quick reports and real-time updates, but you can also use the database to report on any period since you first started collecting data.
Whatever you decide, you probably should keep some kind of database, because you can't guarantee that your system won't go down. In fact, you can pretty much guarantee that your system will go down at some point. It's no fun losing data, or having to scan through terabytes of log data to re-build the data that you've collected over time.
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Interlocked.CompareExchange<Int> using GreaterThan or LessThan instead of equality
I know that Interlocked.CompareExchange exchanges values only if the value and the comparand are equal,
How to exchange them if they are not equal to achieve something like this ?
if (Interlocked.CompareExchange(ref count, count + 1, max) != max)
// i want it to increment as long as it is not equal to max
{
//count should equal to count + 1
}
A more efficient (less bus locking and less reads) and simplified implementation of what Marc posted:
static int InterlockedIncrementAndClamp(ref int count, int max)
{
int oldval = Volatile.Read(ref count), val = ~oldval;
while(oldval != max && oldval != val)
{
val = oldval;
oldval = Interlocked.CompareExchange(ref count, oldval + 1, oldval);
}
return oldval + 1;
}
If you have really high contention, we might be able to improve scalability further by reducing the common case to a single atomic increment instruction: same overhead as CompareExchange, but no chance of a loop.
static int InterlockedIncrementAndClamp(ref int count, int max, int drift)
{
int v = Interlocked.Increment(ref count);
while(v > (max + drift))
{
// try to adjust value.
v = Interlocked.CompareExchange(ref count, max, v);
}
return Math.Min(v, max);
}
Here we allow count to go up to drift values beyond max. But we still return only up to max. This allows us to fold the entire op into a single atomic increment in most cases, which will allow maximum scalability. We only need more than one op if we go above our drift value, which you can probably make large enough to make very rare.
In response to Marc's worries about Interlocked and non-Interlocked memory access working together:
Regarding specifically volatile vs Interlocked: volatile is just a normal memory op, but one that isn't optimized away and one that isn't reordered with regards to other memory ops. This specific issue doesn't revolve around either of those specific properties, so really we're talking non-Interlocked vs Interlocked interoperability.
The .NET memory model guarantees reads and writes of basic integer types (up to the machine's native word size) and references are atomic. The Interlocked methods are also atomic. Because .NET only has one definition of "atomic", they don't need to explicitly special-case saying they're compatible with each-other.
One thing Volatile.Read does not guarantee is visibility: you'll always get a load instruction, but the CPU might read an old value from its local cache instead of a new value just put in memory by a different CPU. x86 doesn't need to worry about this in most cases (special instructions like MOVNTPS being the exception), but it's a very possible thing with others architectures.
To summarize, this describes two problems that could affect the Volatile.Read: first, we could be running on a 16-bit CPU, in which case reading an int is not going to be atomic and what we read might not be the value someone else is writing. Second, even if it is atomic, we might be reading an old value due to visibility.
But affecting Volatile.Read does not mean they affect the algorithm as a whole, which is perfectly secure from these.
The first case would only bite us if you're writing to count concurrently in a non-atomic way. This is because what could end up happening is (write A[0]; CAS A[0:1]; write A[1]). Because all of our writes to count happen in the guaranteed-atomic CAS, this isn't a problem. When we're just reading, if we read a wrong value, it'll be caught in the upcoming CAS.
If you think about it, the second case is actually just a specialization of the normal case where the value changes between read and write -- the read just happens before we ask for it. In this case the first Interlocked.CompareExchange call would report a different value than what Volatile.Read gave, and you'd start looping until it succeeded.
If you'd like, you can think of the Volatile.Read as a pure optimization for cases of low contention. We could init oldval with 0 and it would still work just fine. Using Volatile.Read gives it a high chance to only perform one CAS (which, as instructions go, is quite expensive especially in multi-CPU configurations) instead of two.
But yes, as Marc says -- sometimes locks are just simpler!
There isn't a "compare if not equal", however: you can test the value yourself first, and then only do the update if you don't get a thread race; this often means you may need to loop if the second test fails. In pseudo-code:
bool retry;
do {
retry = false;
// get current value
var val = Interlocked.CompareExchange(ref field, 0, 0);
if(val != max) { // if not maxed
// increment; if the value isn't what it was above: redo from start
retry = Interlocked.CompareExchange(ref field, val + 1, val) != val;
}
} while (retry);
But frankly, a lock would be simpler.
Ok, So, I just started screwing around with threading, now it's taking a bit of time to wrap my head around the concepts so i wrote a pretty simple test to see how much faster if faster at all printing out 20000 lines would be (and i figured it would be faster since i have a quad core processor?)
so first i wrote this, (this is how i would normally do the following):
System.DateTime startdate = DateTime.Now;
for (int i = 0; i < 10000; ++i)
{
Console.WriteLine("Producing " + i);
Console.WriteLine("\t\t\t\tConsuming " + i);
}
System.DateTime endtime = DateTime.Now;
Console.WriteLine(a.startdate.Second + ":" + a.startdate.Millisecond + " to " + endtime.Second + ":" + endtime.Millisecond);
And then with threading:
public class Test
{
static ProducerConsumer queue;
public System.DateTime startdate = DateTime.Now;
static void Main()
{
queue = new ProducerConsumer();
new Thread(new ThreadStart(ConsumerJob)).Start();
for (int i = 0; i < 10000; i++)
{
Console.WriteLine("Producing {0}", i);
queue.Produce(i);
}
Test a = new Test();
}
static void ConsumerJob()
{
Test a = new Test();
for (int i = 0; i < 10000; i++)
{
object o = queue.Consume();
Console.WriteLine("\t\t\t\tConsuming {0}", o);
}
System.DateTime endtime = DateTime.Now;
Console.WriteLine(a.startdate.Second + ":" + a.startdate.Millisecond + " to " + endtime.Second + ":" + endtime.Millisecond);
}
}
public class ProducerConsumer
{
readonly object listLock = new object();
Queue queue = new Queue();
public void Produce(object o)
{
lock (listLock)
{
queue.Enqueue(o);
Monitor.Pulse(listLock);
}
}
public object Consume()
{
lock (listLock)
{
while (queue.Count == 0)
{
Monitor.Wait(listLock);
}
return queue.Dequeue();
}
}
}
Now, For some reason i assumed this would be faster, but after testing it 15 times, the median of the results is ... a few milliseconds different in favor of non threading
Then i figured hey ... maybe i should try it on a million Console.WriteLine's, but the results were similar
am i doing something wrong ?
Writing to the console is internally synchronized. It is not parallel. It also causes cross-process communication.
In short: It is the worst possible benchmark I can think of ;-)
Try benchmarking something real, something that you actually would want to speed up. It needs to be CPU bound and not internally synchronized.
As far as I can see you have only got one thread servicing the queue, so why would this be any quicker?
I have an example for why your expectation of a big speedup through multi-threading is wrong:
Assume you want to upload 100 pictures. The single threaded variant loads the first, uploads it, loads the second, uploads it, etc.
The limiting part here is the bandwidth of your internet connection (assuming that every upload uses up all the upload bandwidth you have).
What happens if you create 100 threads to upload 1 picture only? Well, each thread reads its picture (this is the part that speeds things up a little, because reading the pictures is done in parallel instead of one after the other).
As the currently active thread uses 100% of the internet upload bandwidth to upload its picture, no other thread can upload a single byte when it is not active. As the amount of bytes that needs to be transmitted, the time that 100 threads need to upload one picture each is the same time that one thread needs to upload 100 pictures one after the other.
You only get a speedup if uploading pictures was limited to lets say 50% of the available bandwidth. Then, 100 threads would be done in 50% of the time it would take one thread to upload 100 pictures.
"For some reason i assumed this would be faster"
If you don't know why you assumed it would be faster, why are you surprised that it's not? Simply starting up new threads is never guaranteed to make any operation run faster. There has to be some inefficiency in the original algorithm that a new thread can reduce (and that is sufficient to overcome the extra overhead of creating the thread).
All the advice given by others is good advice, especially the mention of the fact that the console is serialized, as well as the fact that adding threads does not guarantee speedup.
What I want to point out and what it seems the others missed is that in your original scenario you are printing everything in the main thread, while in the second scenario you are merely delegating the entire printing task to the secondary worker. This cannot be any faster than your original scenario because you simply traded one worker for another.
A scenario where you might see speedup is this one:
for(int i = 0; i < largeNumber; i++)
{
// embarrassingly parallel task that takes some time to process
}
and then replacing that with:
int i = 0;
Parallel.For(i, largeNumber,
o =>
{
// embarrassingly parallel task that takes some time to process
});
This will split the loop among the workers such that each worker processes a smaller chunk of the original data. If the task does not need synchronization you should see the expected speedup.
Cool test.
One thing to have in mind when dealing with threads is bottlenecks. Consider this:
You have a Restaurant. Your kitchen can make a new order every 10
minutes (your chef has a bladder problem so he's always in the
bathroom, but is your girlfriend's cousin), so he produces 6 orders an
hour.
You currently employ only one waiter, which can attend tables
immediately (he's probably on E, but you don't care as long as the
service is good).
During the first week of business everything is fine: you get
customers every ten minutes. Customers still wait for exactly ten
minutes for their meal, but that's fine.
However, after that week, you are getting as much as 2 costumers every
ten minutes, and they have to wait as much as 20 minutes to get their
meal. They start complaining and making noises. And god, you have
noise. So what do you do?
Waiters are cheap, so you hire two more. Will the wait time change?
Not at all... waiters will get the order faster, sure (attend two
customers in parallel), but still some customers wait 20 minutes for
the chef to complete their orders.You need another chef, but as you
search, you discover they are lacking! Every one of them is on TV
doing some crazy reality show (except for your girlfriend's cousin who
actually, you discover, is a former drug dealer).
In your case, waiters are the threads making calls to Console.WriteLine; But your chef is the Console itself. It can only service so much calls a second. Adding some threads might make things a bit faster, but the gains should be minimal.
You have multiple sources, but only 1 output. It that case multi-threading will not speed it up. It's like having a road where 4 lanes that merge into 1 lane. Having 4 lanes will move traffic faster, but at the end it will slow back down when it merges into 1 lane.
I have built an application that is used to simulate the number of products that a company can produce in different "modes" per month. This simulation is used to aid in finding the optimal series of modes to run in for a month to best meet the projected sales forecast for the month. This application has been working well, until recently when the plant was modified to run in additional modes. It is now possible to run in 16 modes. For a month with 22 work days this yields 9,364,199,760 possible combinations. This is up from 8 modes in the past that would have yielded a mere 1,560,780 possible combinations. The PC that runs this application is on the old side and cannot handle the number of calculations before an out of memory exception is thrown. In fact the entire application cannot support more than 15 modes because it uses integers to track the number of modes and it exceeds the upper limit for an integer. Baring that issue, I need to do what I can to reduce the memory utilization of the application and optimize this to run as efficiently as possible even if it cannot achieve the stated goal of 16 modes. I was considering writing the data to disk rather than storing the list in memory, but before I take on that overhead, I would like to get people’s opinion on the method to see if there is any room for optimization there.
EDIT
Based on a suggestion by few to consider something more academic then merely calculating every possible answer, listed below is a brief explanation of how the optimal run (combination of modes) is chosen.
Currently the computer determines every possible way that the plant can run for the number of work days that month. For example 3 Modes for a max of 2 work days would result in the combinations (where the number represents the mode chosen) of (1,1), (1,2), (1,3), (2,2), (2,3), (3,3) For each mode a product produces at a different rate of production, for example in mode 1, product x may produce at 50 units per hour where product y produces at 30 units per hour and product z produces at 0 units per hour. Each combination is then multiplied by work hours and production rates. The run that produces numbers that most closely match the forecasted value for each product for the month is chosen. However, because some months the plant does not meet the forecasted value for a product, the algorithm increases the priority of a product for the next month to ensure that at the end of the year the product has met the forecasted value. Since warehouse space is tight, it is important that products not overproduce too much either.
Thank you
private List<List<int>> _modeIterations = new List<List<int>>();
private void CalculateCombinations(int modes, int workDays, string combinationValues)
{
List<int> _tempList = new List<int>();
if (modes == 1)
{
combinationValues += Convert.ToString(workDays);
string[] _combinations = combinationValues.Split(',');
foreach (string _number in _combinations)
{
_tempList.Add(Convert.ToInt32(_number));
}
_modeIterations.Add(_tempList);
}
else
{
for (int i = workDays + 1; --i >= 0; )
{
CalculateCombinations(modes - 1, workDays - i, combinationValues + i + ",");
}
}
}
This kind of optimization problem is difficult but extremely well-studied. You should probably read up in the literature on it rather than trying to re-invent the wheel. The keywords you want to look for are "operations research" and "combinatorial optimization problem".
It is well-known in the study of optimization problems that finding the optimal solution to a problem is almost always computationally infeasible as the problem grows large, as you have discovered for yourself. However, it is frequently the case that finding a solution guaranteed to be within a certain percentage of the optimal solution is feasible. You should probably concentrate on finding approximate solutions. After all, your sales targets are already just educated guesses, therefore finding the optimal solution is already going to be impossible; you haven't got complete information.)
What I would do is start by reading the wikipedia page on the Knapsack Problem:
http://en.wikipedia.org/wiki/Knapsack_problem
This is the problem of "I've got a whole bunch of items of different values and different weights, I can carry 50 pounds in my knapsack, what is the largest possible value I can carry while meeting my weight goal?"
This isn't exactly your problem, but clearly it is related -- you've got a certain amount of "value" to maximize, and a limited number of slots to pack that value into. If you can start to understand how people find near-optimal solutions to the knapsack problem, you can apply that to your specific problem.
You could process the permutation as soon as you have generated it, instead of collecting them all in a list first:
public delegate void Processor(List<int> args);
private void CalculateCombinations(int modes, int workDays, string combinationValues, Processor processor)
{
if (modes == 1)
{
List<int> _tempList = new List<int>();
combinationValues += Convert.ToString(workDays);
string[] _combinations = combinationValues.Split(',');
foreach (string _number in _combinations)
{
_tempList.Add(Convert.ToInt32(_number));
}
processor.Invoke(_tempList);
}
else
{
for (int i = workDays + 1; --i >= 0; )
{
CalculateCombinations(modes - 1, workDays - i, combinationValues + i + ",", processor);
}
}
}
I am assuming here, that your current pattern of work is something along the lines
CalculateCombinations(initial_value_1, initial_value_2, initial_value_3);
foreach( List<int> list in _modeIterations ) {
... process the list ...
}
With the direct-process-approach, this would be
private void ProcessPermutation(List<int> args)
{
... process ...
}
... somewhere else ...
CalculateCombinations(initial_value_1, initial_value_2, initial_value_3, ProcessPermutation);
I would also suggest, that you try to prune the search tree as early as possible; if you can already tell, that certain combinations of the arguments will never yield something, which can be processed, you should catch those already during generation, and avoid the recursion alltogether, if this is possible.
In new versions of C#, generation of the combinations using an iterator (?) function might be usable to retain the original structure of your code. I haven't really used this feature (yield) as of yet, so I cannot comment on it.
The problem lies more in the Brute Force approach that in the code itself. It's possible that brute force might be the only way to approach the problem but I doubt it. Chess, for example, is unresolvable by Brute Force but computers play at it quite well using heuristics to discard the less promising approaches and focusing on good ones. Maybe you should take a similar approach.
On the other hand we need to know how each "mode" is evaluated in order to suggest any heuristics. In your code you're only computing all possible combinations which, anyway, will not scale if the modes go up to 32... even if you store it on disk.
if (modes == 1)
{
List<int> _tempList = new List<int>();
combinationValues += Convert.ToString(workDays);
string[] _combinations = combinationValues.Split(',');
foreach (string _number in _combinations)
{
_tempList.Add(Convert.ToInt32(_number));
}
processor.Invoke(_tempList);
}
Everything in this block of code is executed over and over again, so no line in that code should make use of memory without freeing it. The most obvious place to avoid memory craziness is to write out combinationValues to disk as it is processed (i.e. use a FileStream, not a string). I think that in general, doing string concatenation the way you are doing here is bad, since every concatenation results in memory sadness. At least use a stringbuilder (See back to basics , which discusses the same issue in terms of C). There may be other places with issues, though. The simplest way to figure out why you are getting an out of memory error may be to use a memory profiler (Download Link from download.microsoft.com).
By the way, my tendency with code like this is to have a global List object that is Clear()ed rather than having a temporary one that is created over and over again.
I would replace the List objects with my own class that uses preallocated arrays to hold the ints. I'm not really sure about this right now, but I believe that each integer in a List is boxed, which means much more memory is used than with a simple array of ints.
Edit: On the other hand it seems I am mistaken: Which one is more efficient : List<int> or int[]
I have a windows service that serves messages of some virtual queue via a WCF service interface.
I wanted to expose two performance counters -
The number of items on the queue
The number of items removed from the queue per second
The first one works fine, the second one always shows as 0 in PerfMon.exe, despite the RawValue appearing to be correct.
I'm creating the counters as such -
internal const string PERF_COUNTERS_CATEGORY = "HRG.Test.GDSSimulator";
internal const string PERF_COUNTER_ITEMSINQUEUE_COUNTER = "# Messages on queue";
internal const string PERF_COUNTER_PNR_PER_SECOND_COUNTER = "# Messages read / sec";
if (!PerformanceCounterCategory.Exists(PERF_COUNTERS_CATEGORY))
{
System.Diagnostics.Trace.WriteLine("Creating performance counter category: " + PERF_COUNTERS_CATEGORY);
CounterCreationDataCollection counters = new CounterCreationDataCollection();
CounterCreationData numberOfMessagesCounter = new CounterCreationData();
numberOfMessagesCounter.CounterHelp = "This counter provides the number of messages exist in each simulated queue";
numberOfMessagesCounter.CounterName = PERF_COUNTER_ITEMSINQUEUE_COUNTER;
numberOfMessagesCounter.CounterType = PerformanceCounterType.NumberOfItems32;
counters.Add(numberOfMessagesCounter);
CounterCreationData messagesPerSecondCounter= new CounterCreationData();
messagesPerSecondCounter.CounterHelp = "This counter provides the number of messages read from the queue per second";
messagesPerSecondCounter.CounterName = PERF_COUNTER_PNR_PER_SECOND_COUNTER;
messagesPerSecondCounter.CounterType = PerformanceCounterType.RateOfCountsPerSecond32;
counters.Add(messagesPerSecondCounter);
PerformanceCounterCategory.Create(PERF_COUNTERS_CATEGORY, "HRG Queue Simulator performance counters", PerformanceCounterCategoryType.MultiInstance,counters);
}
Then, on each service call, I increment the relevant counter, for the per/sec counter this currently looks like this -
messagesPerSecCounter = new PerformanceCounter();
messagesPerSecCounter.CategoryName = QueueSimulator.PERF_COUNTERS_CATEGORY;
messagesPerSecCounter.CounterName = QueueSimulator.PERF_COUNTER_PNR_PER_SECOND_COUNTER;
messagesPerSecCounter.MachineName = ".";
messagesPerSecCounter.InstanceName = this.ToString().ToLower();
messagesPerSecCounter.ReadOnly = false;
messagesPerSecCounter.Increment();
As mentioned - if I put a breakpoint after the call to increment I can see the RawValue constantly increasing, in consistence with the calls to the service (fairly frequently, more than once a second, I would think)
But the performance counter itself stays on 0.
The performance counter providing the count of items on the 'queue', which is implemented in the same way (although I assign the RawValue, rather than call Increment) works just fine.
What am I missing?
I also initially had problems with this counter. MSDN has a full working example that helped me a lot:
http://msdn.microsoft.com/en-us/library/4bcx21aa.aspx
As their example was fairly long winded, I boiled it down to a single method to demonstrate the bare essentials. When run, I see the expected value of 10 counts per second in PerfMon.
public static void Test()
{
var ccdc = new CounterCreationDataCollection();
// add the counter
const string counterName = "RateOfCountsPerSecond64Sample";
var rateOfCounts64 = new CounterCreationData
{
CounterType = PerformanceCounterType.RateOfCountsPerSecond64,
CounterName = counterName
};
ccdc.Add(rateOfCounts64);
// ensure category exists
const string categoryName = "RateOfCountsPerSecond64SampleCategory";
if (PerformanceCounterCategory.Exists(categoryName))
{
PerformanceCounterCategory.Delete(categoryName);
}
PerformanceCounterCategory.Create(categoryName, "",
PerformanceCounterCategoryType.SingleInstance, ccdc);
// create the counter
var pc = new PerformanceCounter(categoryName, counterName, false);
// send some sample data - roughly ten counts per second
while (true)
{
pc.IncrementBy(10);
System.Threading.Thread.Sleep(1000);
}
}
I hope this helps someone.
When you are working with Average type Performance Counters there are two components - a numerator and a denominator. Because you are working with an average the counter is calculated as 'x instances per y instances'. In your case you are working out 'number items' per 'number of seconds'. In other words, you need to count both how many items you take out of the queue and how many seconds they take to be removed.
The Average type Performance Counters actually create two counters - a numerator component called {name} and a denominator component called {name}Base. If you go to the Performance Counter snap-in you can view all the categories and counters; you can check the name of the Base counter. When the queue processing process is started, you should
begin a stopwatch
remove item(s) from the queue
stop the stopwatch
increment the {name} counter by the number of items removed from the queue
increment the {name}Base counter by the number of ticks on the stopwatch
The counter is supposed to automatically know to divide the first counter by the second to give the average rate. Check CodeProject for a good example of how this works.
It's quite likely that you don't want this type of counter. These Average counters are used to determine how many instances happen per second of operation; e.g. the average number of seconds it takes to complete an order, or to do some complex transaction or process. What you may want is an average number of instances in 'real' time as opposed to processing time.
Consider if you had 1 item in your queue, and it took 1ms to remove, that's a rate of 1000 items per second. But after one second you have only removed 1 item (because that's all there is) and so you are processing 1 item per second in 'real' time. Similarly, if there are a million items in the queue but you've only processed one because your server is busy doing some other work, do you want to see 1000 items / second theoretical or 1 item / second real?
If you want this 'real' figure, as opposed to the theoretical throughput figure, then this scenario isn't really suited to performance counters - instead you need to know a start time, and end time, and a number of items processed. It can't really be done with a simple 'counter'. Instead you would store a system startup time somewhere, and calculate (number of items) / (now - startup time).
I had the same problem. In my testing, I believe I was seeing the issue was some combination of multi instance and rate of count per second. If I used single instance or a number of items counter it worked. Something about that combination of multi instance and rate per second caused it to be always zero.
As Performance counter of type RateOfCountsPerSecond64 has always the value 0 mentions, a reboot may do the trick. Worked for me anyway.
Another thing that worked for me was this initializing the counter in a block like this:
counter.BeginInit();
counter.RawValue = 0;
counter.EndInit();
I think you need some way to persist the counter. It appears to me that each time the service call is initiated then the counter is recreated.
So you could save the counter to a DB, flat file, or perhaps even a session variable if you wanted it unique to a user.