I'm developing an application in C# that requires many function calls (100-1000 per second) to happen at very specific times. However, there are extremely tight specs on the latency of this application, and so due to the latency increase associated with garbage collection, it's not feasible for me to use DateTime or Timer objects. Is there some way I can access the system time as a primitive type, without having to create DateTime objects?
TL;DR: Is there an analogue for Java's System.currentTimeMillis() for C#?
What makes you think DateTime allocates objects? It's a value type. No need for a heap allocation, and thus no need for garbage collection. (As TomTom says, if you have hard latency requirements, you'll need a real-time operating system etc. If you just have "low" latency requirements, that's a different matter.)
You should be able to use DateTime.Now or DateTime.UtcNow without any issues - UtcNow is faster, as it doesn't perform any time zone conversions.
As an example, I just timed 100 million calls to DateTime.UtcNow followed by reading the Hour property, and on my laptop that takes about 3.5 seconds. Using the Ticks property (which doesn't involve as much computation) takes about 1.2 seconds. Without using any property, it only takes 1 second.
So basically if you're only performing 1000 calls per second, it's going to be irrelevant.
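For reference, a minimal sketch of that kind of micro-benchmark (the iteration count and the use of Stopwatch are my choices, not a canonical test):

using System;
using System.Diagnostics;

class UtcNowBenchmark
{
    static void Main()
    {
        const int iterations = 100_000_000;
        long sink = 0;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            // Reading Ticks is cheap; reading Hour would add a little arithmetic on top.
            sink += DateTime.UtcNow.Ticks;
        }
        sw.Stop();

        Console.WriteLine($"{iterations:N0} calls took {sw.ElapsedMilliseconds} ms (sink={sink})");
    }
}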
Consider not using Windows. Simple as that. Not even "not using C#" but not using Windows.
However, there are extremely tight specs on the latency of this application,
There are special real-time operating systems that are built for exactly this.
Is there an analogue for Java's System.currentTimeMillis() for C#?
Yes, but that still will not help.
The best you CAN do is high precision multimedia timers, which work like a charm but also have no real-time guarantees. The language is not the problem - your OS of choice is unsuitable for the task at hand.
GC is totally not an issue if you program smart. Objects are not an issue; using a concurrent GC and avoiding EXCESSIVE creation of objects helps a lot. You dramatize a problem here that is not there to start with.
There is a kernel API that can handle very low, millisecond-level precision and can be accessed from C#:
http://www.codeproject.com/Articles/98346/Microsecond-and-Millisecond-NET-Timer
The real problem is you must reconfigure the kernel to fire that interrupt at short notice, or you are at the mercy of the scheduler, which does not have such a low resolution.
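For illustration, here is a minimal sketch of raising the timer resolution from C# via the winmm multimedia timer API (timeBeginPeriod/timeEndPeriod are documented Win32 calls; the 1 ms value and the demo loop are just examples, and as said above this still gives no real-time guarantee):

using System;
using System.Runtime.InteropServices;
using System.Threading;

static class HighResolutionTimerDemo
{
    // Win32 multimedia timer API: requests a finer system timer resolution.
    [DllImport("winmm.dll")]
    static extern uint timeBeginPeriod(uint uMilliseconds);

    [DllImport("winmm.dll")]
    static extern uint timeEndPeriod(uint uMilliseconds);

    static void Main()
    {
        timeBeginPeriod(1);          // ask for ~1 ms timer resolution (best effort, not guaranteed)
        try
        {
            for (int i = 0; i < 10; i++)
            {
                Thread.Sleep(1);     // with the raised resolution this sleeps much closer to 1 ms
                Console.WriteLine(DateTime.UtcNow.Ticks);
            }
        }
        finally
        {
            timeEndPeriod(1);        // always restore the previous resolution
        }
    }
}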
Related
Let's say I have a device running Windows CE and there are two options: using native C++, or using the .NET Compact Framework with C# to build the application.
I have to establish a connection with an external computer and send out status messages exactly every 0.5 seconds, with only a +/- 10 millisecond error tolerance.
I know you might say that in practice there are too many factors to know the answer, but let's assume that this has been tested with a C++ program and works, and I wanted to make an equivalent program using C#. The only factor being changed would be the language/framework. Would this be possible, or would the +/- 10 ms error tolerance be too strict to achieve due to C# being a slower, garbage-collected language?
The 10 ms requirement would be achievable in C#, but could never be guaranteed. It might be met most of the time, but you can all but guarantee that a GC will happen at an inopportune time and your managed thread will get suspended. You will miss that 10 ms window.
But why does the solution have to be one or the other? I don't know much about your overall app requirements, but given similar requirements, my inclination would be to create a small piece in C (not C++ because you want very fine control over memory allocation and deallocation) for the time sensitive piece, probably as a service since services are easy in CE, and then create any UI, business logic, etc. in C#. Get the real-time nature of the OS for your tiny time-sensitive routine and the huge benefits of managed code for the rest of the app. Doing UI in C or C++ any more is just silly.
If your program does not need much memory and you can avoid long GC pauses then go with C#. It is infinitely easier to work with C# than C++. Raw C# speed is also pretty good. In some cases C# is even faster than C++.
The only thing that you get from C++ is predictability. There is no garbage collection that can surprise you. That is if you manage to avoid memory corruption, duplicate deallocation, pointer screwups, unallocated memory references, etc. etc.
What happens when you call DateTime.Now?
I followed the property code in Reflector and it appears to add the time zone offset of the current locale to UtcNow. Following UtcNow led me, turn by turn, finally to a Win32 API call.
I reflected on it and asked a related question but haven't received a satisfactory response yet. From the links in the comments on that question, I infer that there is a hardware unit that keeps time. But I also want to know what unit it keeps time in and whether or not it uses the CPU to convert time into a human-readable unit. This will shed some light on whether the retrieval of date and time information is I/O-bound or compute-bound.
You are deeply in undocumented territory with this question. Time is provided by the kernel: the underlying native API call is NtQuerySystemTime(). This does get tinkered with across Windows versions - Windows 8 especially heavily altered the underlying implementation, with visible side-effects.
It is I/O bound in nature: time is maintained by the RTC (Real Time Clock) which used to be a dedicated chip but nowadays is integrated in the chipset. But there is very strong evidence that it isn't I/O bound in practice. Time updates in sync with the clock interrupt so very likely the interrupt handler reads the RTC and you get a copy of the value. Something you can see when you tinker with timeBeginPeriod().
And you can see when you profile it that it only takes ~7 nanoseconds on Windows 10 - entirely too fast to be I/O bound.
You seem to be concerned with blocking. There are two cases where you'd want to avoid that.
On the UI thread it's about latency. It does not matter what you do (IO or CPU), it can't take long. Otherwise it freezes the UI thread. UtcNow is super fast so it's not a concern.
Sometimes, non-blocking IO is used as a way to scale throughput as more load is added. Here, the only reason is to save threads, because each thread consumes a lot of resources. Since there is no async way to call UtcNow the question is moot. You just have to call it as is.
Since time on Windows usually advances at 60 Hz, I'd assume that a call to UtcNow reads from an in-memory variable that is written to at 60 Hz. That makes it CPU-bound. But it does not matter either way.
.NET relies on the API. MSDN has this to say about the API:
https://msdn.microsoft.com/de-de/library/windows/desktop/ms724961(v=vs.85).aspx
When the system first starts, it sets the system time to a value based on the real-time clock of the computer and then regularly updates the time [...] GetSystemTime copies the time to a SYSTEMTIME [...]
I have found no reliable sources to back up my claim that it is stored as a SYSTEMTIME structure, updated therein, and just copied into the receiving buffer of GetSystemTime when called. The smallest logical unit is 100 ns from the NtQuerySystemTime system call, but we end up with 1 millisecond in the CLR's DateTime object. Resolution is not always the same.
We might be able to figure that out for Mono on Linux, but hardly for Windows given that the API code itself is not public. So here is an assumption: Current time is a variable in the kernel address space. It will be updated by the OS (frequently by the system clock timer interrupt, less frequently maybe from a network source -- the documentation mentions that callers may not rely on monotonic behavior, as a network sync can correct the current time backwards). The OS will synchronize access to prevent concurrent writing but otherwise it will not be an I/O-expensive operation.
On recent computers, the timer interval is no longer fixed, and can be controlled by the BIOS and OS. Applications can even request lower or higher clock rates: https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted
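To make the chain concrete, here is a small sketch (my own illustration, not the CLR's actual code path) that reads the system time directly via P/Invoke; GetSystemTimePreciseAsFileTime only exists on Windows 8 and later:

using System;
using System.Runtime.InteropServices;

static class SystemTimeDemo
{
    // The coarse, tick-interrupt-resolution call that DateTime.UtcNow ultimately depends on.
    [DllImport("kernel32.dll")]
    static extern void GetSystemTimeAsFileTime(out long fileTime);

    // Higher-resolution variant, available on Windows 8 and later.
    [DllImport("kernel32.dll")]
    static extern void GetSystemTimePreciseAsFileTime(out long fileTime);

    static void Main()
    {
        GetSystemTimeAsFileTime(out long coarse);
        GetSystemTimePreciseAsFileTime(out long precise);

        // FILETIME counts 100 ns units, the same unit DateTime.Ticks is built on.
        Console.WriteLine(DateTime.FromFileTimeUtc(coarse).ToString("o"));
        Console.WriteLine(DateTime.FromFileTimeUtc(precise).ToString("o"));
    }
}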
Are there any tips, tricks or techniques to prevent or minimize slowdowns or temporary freezes of an app because of the .NET GC?
Maybe something along the lines of:
Try to use structs if you can, unless the data is too large or will be mostly used inside other classes, etc.
The description of your App does not fit the usual meaning of "realtime". Realtime is commonly used for software that has a max latency in milliseconds or less.
You have a requirement of responsiveness to the user, meaning you could probably tolerate an incidental delay of 500 ms or more. 100 ms won't be noticed.
Luckily for you, the GC won't cause delays that long. And if it did, you could use the server or background version of the GC, but I know little about the details.
But if your "user experience" does suffer, it probably won't be the GC.
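If you do want to nudge the GC toward fewer blocking pauses, one option on newer .NET versions (details vary by runtime, so treat this as a hedged sketch) is the latency mode setting:

using System;
using System.Runtime;

class LowLatencySection
{
    static void Main()
    {
        // Ask the runtime to avoid blocking Gen 2 collections where it can.
        // This is a hint, not a guarantee, and it trades memory for fewer pauses.
        GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;

        Console.WriteLine($"GC latency mode: {GCSettings.LatencyMode}, server GC: {GCSettings.IsServerGC}");
    }
}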
IMHO, if the performance of your application is being affected noticeably by the GC, something is wrong. The GC is designed to work without intervention and without significantly affecting your application. In other words, you shouldn't have to code with the details of the GC in mind.
I would examine the structure of your application and see where the bottlenecks are, maybe using a profiler. Maybe there are places where you could reduce the number of objects that are being created and destroyed.
If parts of your application really need to be real-time, perhaps they should be written in another language that is designed for that sort of thing.
Another trick is to use GC.RegisterForFullGCNotification on the back end.
Say you have a load-balancing server and N app servers. When the load balancer receives notification of a possible full GC on one of the servers, it forwards requests to the other servers for some time, so the SLA will not be affected by GC (which is especially useful on x64 boxes where more than 4 GB can be addressed).
Updated
No, unfortunately I don't have code, but there is a very simple example at MSDN.com with dummy methods like RedirectRequests and AcceptRequests, which can be found here: Garbage Collection Notifications
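A condensed sketch of that pattern (the thresholds are arbitrary, and RedirectRequests/AcceptRequests are the same kind of placeholder methods as in the MSDN sample):

using System;
using System.Threading;

class GcNotificationDemo
{
    static void Main()
    {
        // Ask the runtime to notify us when a full (Gen 2) collection is approaching.
        GC.RegisterForFullGCNotification(10, 10);   // thresholds 1..99, arbitrary here

        var watcher = new Thread(() =>
        {
            while (true)
            {
                if (GC.WaitForFullGCApproach() == GCNotificationStatus.Succeeded)
                    RedirectRequests();     // placeholder: tell the load balancer to drain this node

                if (GC.WaitForFullGCComplete() == GCNotificationStatus.Succeeded)
                    AcceptRequests();       // placeholder: resume taking traffic
            }
        });
        watcher.IsBackground = true;
        watcher.Start();
    }

    static void RedirectRequests() { /* hypothetical */ }
    static void AcceptRequests() { /* hypothetical */ }
}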
I have been given the task of re-writing some libraries written in C# so that there are no allocations once startup is completed.
I just got to one project that does some DB queries over an OdbcConnection every 30 seconds. I've always just used .ExecuteReader() which creates an OdbcDataReader. Is there any pattern (like the SocketAsyncEventArgs socket pattern) that lets you re-use your own OdbcDataReader? Or some other clever way to avoid allocations?
I haven't bothered to learn LINQ since all the dbs at work are Oracle based and the last I checked, there was no official Linq To Oracle provider. But if there's a way to do this in Linq, I could use one of the third-party ones.
Update:
I don't think I clearly specified the reasons for the no-alloc requirement. We have one critical thread running and it is very important that it not freeze. This is for a near realtime trading application, and we do see up to a 100 ms freeze for some Gen 2 collections. (I've also heard of games being written the same way in C#). There is one background thread that does some compliance checking and runs every 30 seconds. It does a db query right now. The query is quite slow (approx 500 ms to return with all the data), but that is okay because it doesn't interfere with the critical thread. Except if the worker thread is allocating memory, it will cause GCs which freeze all threads.
I've been told that all the libraries (including this one) cannot allocate memory after startup. Whether I agree with that or not, that's the requirement from the people who sign the checks :).
Now, clearly there are ways that I could get the data into this process without allocations. I could set up another process and connect it to this one using a socket. The new .NET 3.5 sockets were specifically optimized not to allocate at all, using the new SocketAsyncEventArgs pattern. (In fact, we are using them to connect to several systems and never see any GCs from them.) Then have a pre-allocated byte array that reads from the socket and go through the data, allocating no strings along the way. (I'm not familiar with other forms of IPC in .NET so I'm not sure if the memory mapped files and named pipes allocate or not).
But if there's a faster way to get this no-alloc query done without going through all that hassle, I'd prefer it.
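For context, here is a stripped-down sketch of the SocketAsyncEventArgs pattern I mean, with a single preallocated receive buffer (the endpoint, buffer size and class shape are illustrative, not our actual code):

using System;
using System.Net;
using System.Net.Sockets;

class PreallocatedReceiver
{
    // All long-lived objects are created once, up front.
    readonly Socket _socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    readonly byte[] _buffer = new byte[64 * 1024];
    readonly SocketAsyncEventArgs _recvArgs = new SocketAsyncEventArgs();

    public void Start(EndPoint endpoint)
    {
        _socket.Connect(endpoint);                  // e.g. new IPEndPoint(IPAddress.Loopback, 9000)
        _recvArgs.SetBuffer(_buffer, 0, _buffer.Length);
        _recvArgs.Completed += OnReceiveCompleted;
        PostReceive();
    }

    void PostReceive()
    {
        // ReceiveAsync reuses the same args/buffer, so the steady state allocates nothing.
        if (!_socket.ReceiveAsync(_recvArgs))
            OnReceiveCompleted(_socket, _recvArgs); // completed synchronously
    }

    void OnReceiveCompleted(object sender, SocketAsyncEventArgs e)
    {
        // Parse e.Buffer[e.Offset .. e.Offset + e.BytesTransferred) in place, without creating strings.
        PostReceive();
    }
}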
You cannot reuse IDataReader (or OdbcDataReader or SqlDataReader or any equivalent class). They are designed to be used with a single query only. These objects encapsulate a single record set, so once you've obtained and iterated it, it has no meaning anymore.
Creating a data reader is an incredibly cheap operation anyway, vanishingly small in contrast to the cost of actually executing the query. I cannot see a logical reason for this "no allocations" requirement.
I'd go so far as to say that it's very nearly impossible to rewrite a library so as to allocate no memory. Even something as simple as boxing an integer or using a string variable is going to allocate some memory. Even if it were somehow possible to reuse the reader (which it isn't, as I explained), it would still have to issue the query to the database again, which would require memory allocations in the form of preparing the query, sending it over the network, retrieving the results again, etc.
Avoiding memory allocations is simply not a practical goal. Better to perhaps avoid specific types of memory allocations if and when you determine that some specific operation is using up too much memory.
For such a requirement, are you sure that a high-level language like C# is the right choice?
You cannot say whether the .NET library functions you are using are internally allocating memory or not. The standard doesn't guarantee that, so if they are not using allocations in the current version of .NET framework, they may start doing so later.
I suggest you profile the application to determine where the time and/or memory are being spent. Don't guess - you will only guess wrong.
Given a case where I have an object that may be in one or more true/false states, I've always been a little fuzzy on why programmers frequently use flags+bitmasks instead of just using several boolean values.
It's all over the .NET framework. Not sure if this is the best example, but the .NET framework has the following:
public enum AnchorStyles
{
None = 0,
Top = 1,
Bottom = 2,
Left = 4,
Right = 8
}
So given an anchor style, we can use bitmasks to figure out which of the states are selected. However, it seems like you could accomplish the same thing with an AnchorStyle class/struct with bool properties defined for each possible value, or an array of individual enum values.
Of course the main reason for my question is that I'm wondering if I should follow a similar practice with my own code.
So, why use this approach?
Less memory consumption? (it doesn't seem like it would consume less than an array/struct of bools)
Better stack/heap performance than a struct or array?
Faster compare operations? Faster value addition/removal?
More convenient for the developer who wrote it?
It was traditionally a way of reducing memory usage. So, yes, it's quite obsolete in C# :-)
As a programming technique, it may be obsolete in today's systems, and you'd be quite alright to use an array of bools, but...
It is fast to compare values stored as a bitmask. Use the AND and OR logic operators and compare the resulting 2 ints.
It uses considerably less memory. Putting all 4 of your example values in a bitmask would use half a byte. Using an array of bools would most likely take a few bytes of overhead for the array object plus a byte for each bool. If you have to store a million values, you'll see exactly why the bitmask version is superior.
It is easier to manage: you only have to deal with a single integer value, whereas an array of bools would be stored quite differently in, say, a database.
And, because of the memory layout, it is much faster in every respect than an array. It's essentially as fast as operating on a single 32-bit integer, which is as fast as you can get for operations on data.
Easy to set multiple flags in any order.
Easy to save and retrieve a series of bits like 0101011 from the database.
Among other things, it's easier to add new bit meanings to a bitfield than to add new boolean values to a class. It's also easier to copy a bitfield from one instance to another than a series of booleans.
It can also make methods clearer. Imagine a method with 10 bools vs. 1 bitmask.
Actually, it can have better performance, mainly if your enum derives from byte.
In that extreme case, the whole set of flags is represented by a single byte, covering every combination - up to 256 of them. Representing the same flags as separate booleans would take a byte per flag, so eight bytes at minimum.
But, even then, I don't think that is the real reason. The reason I prefer those is the power C# gives me to handle those enums. I can add several values with a single expression. I can remove them also. I can even compare several values at once with a single expression using the enum. With booleans, code can become, let's say, more verbose.
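For example, roughly what that looks like with the AnchorStyles enum quoted in the question (a small illustrative snippet, not framework code):

// Combine several flags in one expression.
AnchorStyles anchors = AnchorStyles.Top | AnchorStyles.Left;

// Add another flag later.
anchors |= AnchorStyles.Right;

// Remove a flag.
anchors &= ~AnchorStyles.Top;

// Test several flags at once in a single comparison.
bool dockedBothSides = (anchors & (AnchorStyles.Left | AnchorStyles.Right))
                       == (AnchorStyles.Left | AnchorStyles.Right);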
From a domain model perspective, it just models reality better in some situations. If you have three booleans like AccountIsInDefault, IsPreferredCustomer and RequiresSalesTaxState, then it doesn't make sense to combine them into a single Flags-decorated enumeration, because they are not three distinct values for the same domain model element.
But if you have a set of booleans like:
[Flags]
enum AccountStatus { AccountIsInDefault = 1, AccountOverdue = 2, AccountFrozen = 4 }
or
[Flags]
enum CargoState { ExceedsWeightLimit = 1, ContainsDangerousCargo = 2, IsFlammableCargo = 4, ContainsRadioactive = 8 }
Then it is useful to be able to store the total state of the account (or the cargo) in ONE variable... one that represents ONE domain element whose value can represent any possible combination of states.
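A quick illustrative snippet using the CargoState enum above:

// The whole cargo state lives in ONE variable.
CargoState state = CargoState.ExceedsWeightLimit | CargoState.ContainsDangerousCargo;

// Later: is there anything flammable on board?
bool flammable = (state & CargoState.IsFlammableCargo) != 0;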
Raymond Chen has a blog post on this subject.
Sure, bitfields save data memory, but you have to balance it against the cost in code size, debuggability, and reduced multithreading.
As others have said, its time is largely past. It's tempting to still do it, because bit fiddling is fun and cool-looking, but it's no longer more efficient, it has serious drawbacks in terms of maintenance, it doesn't play nicely with databases, and unless you're working in an embedded world, you have enough memory.
I would suggest never using enum flags unless you are dealing with some pretty serious memory limitations (not likely). You should always write code optimized for maintenance.
Having several boolean properties makes it easier to read and understand the code, change the values, and provide IntelliSense comments, not to mention reducing the likelihood of bugs. If necessary, you can always use an enum flag field internally; just make sure you expose the setting/getting of the values through boolean properties.
Space efficiency - 1 bit
Time efficiency - bit comparisons are handled quickly by hardware.
Language independence - where the data may be handled by a number of different programs you don't need to worry about the implementation of booleans across different languages/platforms.
Most of the time, these are not worth the tradeoff in terms of maintenance. However, there are times when it is useful:
Network protocols - there will be a big saving in reduced size of messages
Legacy software - once I had to add some information for tracing into some legacy software.
Cost to modify the header: millions of dollars and years of effort.
Cost to shoehorn the information into 2 bytes in the header that weren't being used: 0.
Of course, there was the additional cost in the code that accessed and manipulated this information, but that was done through functions anyway, so once you had the accessors defined it was no less maintainable than using booleans.
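As an illustration only (the field names and bit layout are invented, not the actual legacy format), the accessor approach looks roughly like this, so callers never touch the raw bits:

// Two spare bytes in an existing message header, exposed through properties.
struct TraceHeaderBits
{
    ushort _bits;                               // the 2 previously unused header bytes

    public bool TracingEnabled
    {
        get => (_bits & 0x0001) != 0;
        set => _bits = value ? (ushort)(_bits | 0x0001) : (ushort)(_bits & ~0x0001);
    }

    public byte TraceLevel                      // 3 bits: values 0..7
    {
        get => (byte)((_bits >> 1) & 0x07);
        set => _bits = (ushort)((_bits & ~(0x07 << 1)) | ((value & 0x07) << 1));
    }

    public ushort RawValue => _bits;            // what actually goes on the wire
}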
I have seen answers like "time efficiency" and "compatibility". Those are the reasons, but I do not think it has been explained why these are sometimes necessary in times like ours. From all the answers, and from experience chatting with other engineers, I have seen it pictured as some sort of quirky old-time way of doing things that should just die because newer ways of doing things are better.
Yes, in very rare cases you may want to do it the "old way" for performance's sake, like if you have the classic million-iteration loop. But I say that is the wrong way of framing things.
While it is true that you should NOT care at all and use whatever the C# language throws at you as the new right-way™ to do things (enforced by some fancy AI code analysis slapping you whenever you do not meet its code style), you should understand deeply that low-level strategies aren't there randomly. Even more, in many cases they are the only way to solve things when you have no help from a fancy framework. Your OS, drivers, and even .NET itself (especially the garbage collector) are built using bitfields and transactional instructions. Your CPU instruction set itself is a very complex bitfield, so JIT compilers encode their output using complex bit processing and a few hardcoded bitfields so that the CPU can execute it correctly.
When we talk about performance, these things have a much larger impact than people imagine - today more than ever, especially once you start considering multiple cores.
When multicore systems started to become more common, all CPU manufacturers started to mitigate the issues of SMP by adding dedicated transactional memory-access instructions. While these were made specifically to mitigate the near-impossible task of making multiple CPUs cooperate at kernel level without a huge drop in performance, they also provide additional benefits, like an OS-independent way to speed up the low-level parts of most programs. Basically, your program can use CPU-assisted instructions to perform memory changes on integer-sized memory locations - that is, a read-modify-write where the "modify" part can be anything you want, but the most common patterns are a combination of set/clear/increment.
Usually the CPU simply monitors whether any other CPU is accessing the same address location, and if contention happens it stops the operation from being committed to memory and signals the event to the application within the same instruction. This seems a trivial task, but superscalar CPUs (each core has multiple ALUs allowing instruction parallelism), multi-level caches (some private to each core, some shared across a cluster of CPUs) and Non-Uniform Memory Access systems (check the Threadripper CPUs) make things difficult to keep coherent. Luckily the smartest people in the world work to boost performance and keep all these things happening correctly. Today's CPUs have a large number of transistors dedicated to this task, so that caches and our read-modify-write transactions work correctly.
C# allows you to use the most common transactional memory-access patterns through the Interlocked class (it is only a limited set - for example, a very useful clear-mask-and-increment is missing - but you can always use CompareExchange instead, which gets very close to the same performance).
To achieve the same result using an array of booleans you must use some sort of lock, and in case of contention the lock performs several orders of magnitude worse than the atomic instructions.
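Here is a minimal sketch of that lock-free set/clear pattern built on Interlocked.CompareExchange (the class and mask values are just an example):

using System.Threading;

static class AtomicFlags
{
    static int _flags;                      // shared bitfield, touched only via Interlocked

    public static void Set(int mask)
    {
        int current, desired;
        do
        {
            current = Volatile.Read(ref _flags);
            desired = current | mask;       // the "modify" step of read-modify-write
        }
        // Retry if another core changed _flags between the read and the exchange.
        while (Interlocked.CompareExchange(ref _flags, desired, current) != current);
    }

    public static void Clear(int mask)
    {
        int current, desired;
        do
        {
            current = Volatile.Read(ref _flags);
            desired = current & ~mask;
        }
        while (Interlocked.CompareExchange(ref _flags, desired, current) != current);
    }

    public static bool IsSet(int mask) => (Volatile.Read(ref _flags) & mask) == mask;
}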
Here are some examples of highly appreciated HW-assisted transactional access using bitfields, which would require a completely different strategy without them. Of course, these are outside the scope of C#:
Assume a DMA peripheral that has a set of DMA channels, say 20 (but any number up to the maximum number of bits in the interlocked integer will do). Any peripheral's interrupt might execute at any time, including your beloved OS and from any core of your 32-core latest-gen CPU, and when it wants a DMA channel you want to allocate one (assign it to the peripheral) and use it. A bitfield covers all those requirements and uses just a dozen instructions to perform the allocation, all inlineable within the requesting code. Basically you cannot go faster than this, and your code is just a few functions; we delegate the hard part to the HW to solve the problem. Constraints: bitfield only.
Assume a peripheral that needs some working space in normal RAM to do its job. For example, assume a high-speed I/O peripheral that uses scatter-gather DMA; in short, it uses fixed-size blocks of RAM populated with a description of the next transfer (the descriptor is itself made of bitfields), chained one to another to create a FIFO queue of transfers in RAM. The application prepares the descriptors first and then chains them onto the tail of the current transfers without ever pausing the controller (not even disabling interrupts). The allocation/deallocation of such descriptors can be done using a bitfield and transactional instructions, so when it is shared between different CPUs, and between the driver interrupt and the kernel, everything still works without conflicts. One usage case: the kernel allocates descriptors atomically, without stopping or disabling interrupts and without additional locks (the bitfield itself is the lock), and the interrupt deallocates them when the transfer completes.
Most older strategies were to preallocate the resources and force the application to free them after use.
If you ever need multitasking on steroids, C# allows you to use Threads + Interlocked, but lately C# also introduced lightweight Tasks - guess how they are made? Transactional memory access using the Interlocked class. So you likely do not need to reinvent the wheel; the low-level part is already covered and well engineered.
So the idea is: let smart people (not me, I am a common developer like you) solve the hard part for you, and just enjoy a general-purpose computing platform like C#. If you still see some remnants of these parts, it is because someone may still need to interface with worlds outside .NET and access some driver or system call, for example requiring you to know how to build a descriptor and put each bit in the right place. Do not be mad at those people; they made our jobs possible.
In short: Interlocked + bitfields. Incredibly powerful; don't use it.
It is for speed and efficiency. Essentially all you are working with is a single int.
if ((flags & AnchorStyles.Top) == AnchorStyles.Top)
{
    //Do stuff
}