Any way to make this LINQ faster? - c#

I have a LINQ expression that's slowing down my application.
I'm drawing a control, but to do this, I need to know the max width of the text that will appear in my column.
The way I'm doing that is this:
return Items.Max(w => TextRenderer.MeasureText(
    (w.RenatlUnit == null) ? "" : w.RenatlUnit.UnitNumber, this.Font).Width) + 2;
However, this iterates over ~1000 Items, and takes around 20% of the CPU time that is used in my drawing method. To make it worse, there are two other columns that this must be done with, so this LINQ statement on all the items/columns takes ~75-85% of the CPU time.
TextRenderer is from System.Windows.Forms, and because I'm not using a monospaced font, MeasureText is needed to figure out the pixel width of a string.
How might I make this faster?

I don't believe your problem lies in the speed of LINQ; it lies in the fact that you're calling MeasureText over 1000 times. I would imagine that taking your logic out of a LINQ query and putting it into an ordinary foreach loop would yield similar run times.
A better idea is probably to employ a little sanity checking around what you're doing. Assuming reasonable inputs (and disregarding the possibility of line breaks), you really only need to measure the strings that are, say, within 10% or so of the longest string in terms of character count, then take the maximum of those widths. In other words, there's no point in measuring the string "foo" if the longest string is "paleontology". No font has widths THAT variable.

It's the MeasureText method that takes time, so the only way to increase the speed is to do less work.
You can cache the results of the calls to MeasureText in a dictionary; that way you don't have to re-measure strings that have already been measured.
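For example, a minimal sketch of such a cache, assuming it lives in the control alongside the original query (the field and method names are illustrative):

private readonly Dictionary<string, int> _widthCache = new Dictionary<string, int>();

private int MeasureWidth(string text)
{
    int width;
    if (!_widthCache.TryGetValue(text, out width))
    {
        width = TextRenderer.MeasureText(text, this.Font).Width;
        _widthCache[text] = width;
    }
    return width;
}

The original expression then becomes Items.Max(w => MeasureWidth(w.RenatlUnit == null ? "" : w.RenatlUnit.UnitNumber)) + 2, so repeated strings cost only a dictionary lookup. Remember to clear the cache if the font changes.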
Alternatively, you can calculate the values once and keep them alongside the data to display; whenever the data changes, you recalculate the values. That way you don't have to measure the strings every time the control is drawn.

Step 0: Profile. Assuming you find that most of the execution time is indeed in MeasureText, then you can try the following to reduce the number of calls:
Compute the rendered lengths of all individual characters. Since it sounds like you're rendering numbers, this should be a small set.
Estimate each string's length as numstr.Select(digitChar => digitLengthDict[digitChar]).Sum().
Take the strings with the top N estimated lengths, and measure only those.
To avoid even most of the cost of the lookup+sum, first filter to include only those strings within 90% of the maximum string length, as suggested in another answer.
e.g. Something like...
// somewhere else, during initialization - do only once.
var digitLengthDict = possibleChars.ToDictionary(c => c, c => TextRenderer.MeasureText(c.ToString(), this.Font).Width);
//...
var relevantStringArray = Items.Where(w => w.RenatlUnit != null).Select(w => w.RenatlUnit.UnitNumber).ToArray();
double minStrLen = 0.9 * relevantStringArray.Max(str => str.Length);
return (
    from numstr in relevantStringArray
    where numstr.Length >= minStrLen
    orderby numstr.Select(digitChar => digitLengthDict[digitChar]).Sum() descending
    select TextRenderer.MeasureText(numstr, this.Font).Width
).Take(10).Max() + 2;
If we knew more about the distribution of the strings, that would help.
Also, MeasureText isn't magic; it's quite possible you can duplicate its functionality entirely quite easily for a limited set of inputs. For instance, it would not surprise me to learn that the measured length of a string is precisely equal to the sum of the lengths of all characters in the string, minus the kerning overhang of all character bigrams in the string. If your strings then consist of, say, the digits 0-9, '+', '-', ',', '.', and a terminator symbol, then a lookup table of 14 character widths and 15*15-1 kerning corrections might be enough to precisely emulate MeasureText at far greater speed, and without much complexity.
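A speculative sketch of that estimate (charWidths and kerningOverhangs are hypothetical lookup tables you would populate by measurement; there is no guarantee this reproduces MeasureText exactly):

static int EstimateWidth(string s, Dictionary<char, int> charWidths,
                         Dictionary<string, int> kerningOverhangs)
{
    int width = 0;
    for (int i = 0; i < s.Length; i++)
    {
        width += charWidths[s[i]];   // per-character width
        if (i > 0)
        {
            int overhang;
            // bigram key such as "12"; the tables must cover the whole input alphabet
            if (kerningOverhangs.TryGetValue(s.Substring(i - 1, 2), out overhang))
                width -= overhang;
        }
    }
    return width;
}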
Finally, the best solution is to not solve the problem at all - perhaps you can rearchitect the application to not require such a precise number - if a simpler estimate were to suffice, you could avoid MeasureText almost completely.

Unfortunately, it doesn't look like LINQ is your problem. If you ran a for loop and did this same calculation, the amount of time would be the same order of magnitude.
Have you considered running this calculation on multiple threads? It would work nicely with Parallel LINQ.
Edit: It seems Parallel LINQ won't work because MeasureText is a GDI function and the calls will simply be marshaled back to the UI thread (thanks to @Adam Robinson for correcting me).

My guess is the issue is not the LINQ expression but calling MeasureText several thousand times.
I think you could work around the non-monospaced-font issue by breaking the problem into four parts:
1. Find the digit that is widest when rendered.
2. Find the apartment unit number with the most digits.
3. Create a string consisting of the digit from #1 repeated to the length from #2.
4. Pass the string created in #3 to MeasureText and use that as your basis.
This won't yield a perfect solution, but it will ensure that you reserve at least enough space for your items and avoid the pitfall of calling MeasureText far too many times.
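A rough sketch of those four steps (assuming digit-only unit numbers; the code is illustrative):

// 1. the digit that renders widest in this font
char widestDigit = "0123456789"
    .OrderByDescending(c => TextRenderer.MeasureText(c.ToString(), this.Font).Width)
    .First();
// 2. the most digits in any unit number
int maxDigits = Items.Max(w => (w.RenatlUnit == null || w.RenatlUnit.UnitNumber == null)
    ? 0 : w.RenatlUnit.UnitNumber.Length);
// 3. a worst-case string of that digit and length; 4. measure it once
string sample = new string(widestDigit, maxDigits);
return TextRenderer.MeasureText(sample, this.Font).Width + 2;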

If you can't figure out how to make MeasureText faster, you could precalculate the widths of all the characters in your font, size, and style, and estimate the width of a string from those, although the kerning of character pairs means the result would only be an estimate and not precise.

You might want to consider, as an approximation, taking the length of the longest string and then finding the width of a string of that length made of '0's (or whatever the widest digit is, I can't remember). That should be a much faster method, but it would only be an approximation, and probably wider than necessary.
var longest = Items.Max(w => (w.RenatlUnit == null || w.RenatlUnit.UnitNumber == null)
    ? 0
    : w.RenatlUnit.UnitNumber.Length);
if (longest == 0)
{
    return 2;
}
return TextRenderer.MeasureText(new String('0', longest), this.Font).Width + 2;

Related

Pure Speed for Lookup Single Value Type c#?

.NET 4.5.1
I have a "bunch" of Int16 values that fit in a range from -4 to 32760. The numbers in the range are not consecutive, but they are ordered from -4 to 32760. In other words, the numbers from 16-302 are not in the "bunch", but numbers 303-400 are in there, number 2102 is not there, etc.
What is the all-out fastest way to determine if a particular value (e.g. 18400) is in the "bunch"? Right now it is in an Int16[] and the LINQ Contains method is used to determine if a value is in the array, but if anyone can say why/how a different structure would deliver a single value faster, I would appreciate it. Speed is the key for this lookup (the "bunch" is a static property on a static class).
Sample code that works
Int16[] someShorts = new[] { (short)4, (short)5, (short)6 };
var isInIt = someShorts.Contains((short)4);
I am not sure if that is the most performant thing that can be done.
Thanks.
It sounds like you really want BitArray - just offset the value by 4 so you've got a range of [0, 32764] and you should be fine.
That will allocate an array which is effectively 4K in size (32764 / 8), with one bit per value in the array. It will handle finding the relevant element in the array, and applying bit masking. (I don't know whether it uses a byte[] internally or something else.)
This is a potentially less compact representation than storing ranges, but the only cost involved in getting/setting a bit will be computing an index (basically a shift), getting the relevant bit of memory to the CPU, and then bit masking. It takes 1/8th the size of a bool[], making your CPU cache usage more efficient.
Of course, if this is really a performance bottleneck for you, you should compare both this solution and a bool[] approach in your real application - microbenchmarks aren't nearly as important here as how your real app behaves.
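A sketch of the BitArray approach (BitArray lives in System.Collections; someShorts is the array from the question):

const int Offset = 4;                           // maps [-4, 32760] onto [0, 32764]
var present = new BitArray(32760 + Offset + 1);

foreach (short s in someShorts)                 // build once, up front
    present[s + Offset] = true;

bool isInIt = present[18400 + Offset];          // O(1) membership test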
Make one bool for each possible value:
var isPresentItems = new bool[32760-(-4)+1];
Set the corresponding element to true if the given item is present in the set. Lookup is easy:
var isPresent = isPresentItems[myIndex];
Can't be done any faster. The bools will fit into L1 or L2 cache.
I advise against using BitArray because it stores multiple values per byte. This means that each access is slower. Bit-arithmetic is required.
And if you want insane speed, don't make LINQ call a delegate once for each item. LINQ is not the first choice for performance-critical code. Many indirections that stall the CPU.
If you want to optimize for lookup time, pick a data structure with O(1) (constant-time) lookups. You have several choices since you only care about set membership, and not sorting or ordering.
A HashSet<Int16> will give this to you, as will a BitArray indexed on max - min + 1. The absolute fastest ad-hoc solution would probably be a simple array indexed on max - min + 1, as @usr suggests. Any of these should be plenty "fast enough". The HashSet<Int16> will probably use the most memory, as the size of the internal hash table is an implementation detail. BitArray would be the most space-efficient of these options.
If you only have a single lookup, then memory should not be a concern, and I suggest first going with a HashSet<Int16>. That solution is easy to reason about and deal with in a bug-free manner, as you don't have to worry about staying within array boundaries; you can simply check set.Contains(n). This is particularly useful if your value range might change in the future. You can fall back to one of the other solutions if you need to optimize further for speed or performance.
One option is to use a HashSet<Int16>; finding whether a value is in it is an O(1) operation.
The code example:
HashSet<Int16> evenNumbers = new HashSet<Int16>();
for (Int16 i = 0; i < 20; i++)
{
    evenNumbers.Add(i);
}
if (evenNumbers.Contains(0))
{
    /////
}
Because the numbers are sorted, I would loop through the list one time and generate a list of Range objects that have a start and end number. That list would be much smaller than having a list or dictionary of thousands of numbers.
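A sketch of that idea (the Range struct and method names are illustrative; assumes distinct, ascending values): build the ranges in one pass over the sorted values, then binary-search them on lookup.

struct Range { public short Start; public short End; }

static List<Range> BuildRanges(short[] sortedValues)
{
    var ranges = new List<Range>();
    foreach (short v in sortedValues)
    {
        if (ranges.Count > 0 && ranges[ranges.Count - 1].End + 1 == v)
        {
            var last = ranges[ranges.Count - 1];
            last.End = v;                                  // extend the current run
            ranges[ranges.Count - 1] = last;
        }
        else
        {
            ranges.Add(new Range { Start = v, End = v });  // start a new run
        }
    }
    return ranges;
}

static bool InRanges(List<Range> ranges, short value)
{
    int lo = 0, hi = ranges.Count - 1;
    while (lo <= hi)                                       // binary search over ranges
    {
        int mid = (lo + hi) / 2;
        if (value < ranges[mid].Start) hi = mid - 1;
        else if (value > ranges[mid].End) lo = mid + 1;
        else return true;
    }
    return false;
}

Lookup is O(log r) where r is the number of ranges, so this trades a little speed against the flat-array approaches for a much smaller memory footprint.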
If your "bunch" of numbers can be identified as a series of intervals, I suggest you use Interval Trees. An interval tree allows dynamic insertion/deletions and also searching if a an interval intersects any interval in the tree is O(log(n)) where n is the number of intervals in the tree. In your case the number of intervals would be way less than the number of ints and the search is much faster.

Dijkstra algorithm expanded with extra limit variable

I am having trouble implementing this in my current pathfinding algorithm.
I currently have Dijkstra written and working as it should, but I need to go a step further and add a limit (range). An example explains it best:
Let's say I have a range of 80. I want to go from A to E. My current algorithm works as it should, so it results in A->B->E.
However, I need to travel only on edges whose weight is not more than the range, 80, which means that A->B->E is no longer an option, but A->C->D->B->E is (considering that the range/limit resets at every stop).
So far, I have implemented a bool named Possible, which returns, for a single leg of the path (e.g. A->B), whether it is possible given my limit/range.
My main problem is that I do not know where/how to start. My only idea was to see where Possible is false (A->B on the total route A->B->E) and run the algorithm from A to E again, excluding the B stop/vertex.
Is this a good approach? As far as I understand it, that would double my big-O complexity.
I see two ways of doing this:
Create a new graph G' that contains only the edges with weight <= 80, and look for the shortest path there. The reduction takes O(V+E) time and O(V+E) additional memory.
You can change Dijkstra's algorithm to ignore edges with weight > 80: just skip those edges when assigning values to neighbor vertices. The complexity and memory usage stay the same in this case (see the sketch below).
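A sketch of that second option, assuming an adjacency-list graph and the .NET 6 PriorityQueue (both assumptions; the skip is the only change from the ordinary algorithm):

// dist[v] = shortest distance from source using only edges with weight <= limit
static int[] DijkstraWithLimit(List<(int To, int Weight)>[] graph, int source, int limit)
{
    var dist = new int[graph.Length];
    for (int i = 0; i < dist.Length; i++) dist[i] = int.MaxValue;
    dist[source] = 0;

    var queue = new PriorityQueue<int, int>();
    queue.Enqueue(source, 0);

    while (queue.TryDequeue(out int u, out int d))
    {
        if (d > dist[u]) continue;            // stale queue entry
        foreach (var (v, w) in graph[u])
        {
            if (w > limit) continue;          // the modification: skip out-of-range edges
            if (dist[u] + w < dist[v])
            {
                dist[v] = dist[u] + w;
                queue.Enqueue(v, dist[v]);
            }
        }
    }
    return dist;
}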
Create a temporary version of your graph, and set all weights above the threshold to infinity. Then run the ordinary Dijkstra algorithm on it.
Complexity will increase or not, depending on your version of the algorithm:
if you have O(V^2) then it will increase to O(E + V^2)
if you have the O(ElogV) version then it will increase to O(E + ElogV)
if you have the O(E + VlogV) version it will remain the same
As ArsenMkrt noted, you could also remove these edges outright, which makes even more sense but makes the complexity a bit worse. Modifying the algorithm to just skip those edges, as he suggested in his answer, seems to be the best option though.

MonoTouch on iPad: How to make text search faster?

I need to do a text search based on user input in a relatively large list (about 37K lines, 50 to 100 chars each). The search is done after each character is entered, and the results are shown in a UITableView. This is my current code:
if (input.Any(x => Char.IsUpper(x)))
    return _list.Where(x => x.Desc.Contains(input));
else
    return _list.Where(x => x.Desc.ToLower().Contains(input));
It performs okay on a MacBook running the simulator, but it is too slow on the iPad.
One interesting thing I observed is that it takes longer and longer as the input grows. For example, take "examin" as input: it takes about 1 second after entering 'e', 2 seconds after 'x', 5 seconds after 'a', but 28 seconds after 'm', and so on. Why is that?
I hope there is a simple way to improve it.
Always take care to avoid memory allocations in time-sensitive code.
For example, we often write code that allocates strings without realizing it, e.g.
x => x.Desc.ToLower().Contains(input)
That will allocate a new string just to return from ToLower. From your description this will occur many times. You can easily avoid it by using:
x => x.Desc.IndexOf(input, StringComparison.OrdinalIgnoreCase) != -1
note: just select the StringComparison.*IgnoreCase value that matches your needs.
Also, LINQ is nice, but it hides allocations in many cases - maybe not in yours, but measuring is key to making things faster. In this case, using another algorithm (like the one suggested in another answer) could give you much better results (but keep the allocations in mind ;-)
UPDATE:
Mono's Contains(string) will call, after a few checks, the following:
CultureInfo.CurrentCulture.CompareInfo.IndexOf (this, value, 0, length, CompareOptions.Ordinal);
which, given your ToLower usage, means StringComparison.OrdinalIgnoreCase is the perfect (i.e. identical) match for your existing code (it did not do any culture-specific comparison anyway).
Generally I've found that contains operations are not preferable for search, so I'd recommend you take a look at the Mastering Core Data session video (login required) on the WWDC 2010 page (around the 10 min mark). Apple knows that 'contains' is terrible with SQLite on mobile devices; you can essentially do what Apple does to sort of "hack" FTS on the version of SQLite they ship.
Essentially they do prefix matching by creating a table like:
[[ pk_id || input || normalized_input ]]
Where input and normalized_input are both indexed explicitly. Then they prefix match against the normalized value. So for instance if a user is searching for 'snuggles' and so far they've typed in 'snu' the prefix matching query would look like:
normalized_input >= 'snu' and normalized_input < 'snv'
Not sure if this translates given your use case, but I thought it was worth mentioning. Hope it's helpful!
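For what it's worth, computing that exclusive upper bound is mechanical; a sketch (ignoring the char.MaxValue edge case):

static string PrefixUpperBound(string prefix)
{
    // increment the last character: "snu" -> "snv"
    char last = prefix[prefix.Length - 1];
    return prefix.Substring(0, prefix.Length - 1) + (char)(last + 1);
}

// used as: normalized_input >= prefix AND normalized_input < PrefixUpperBound(prefix)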
You need to use a trie. See http://en.wikipedia.org/wiki/Trie
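A minimal trie sketch (illustrative, not production code). Note that a trie matches prefixes; to approximate a Contains-style search you would insert every word (or every suffix) of each line:

class TrieNode
{
    public Dictionary<char, TrieNode> Children = new Dictionary<char, TrieNode>();
    public bool IsWord;
}

class Trie
{
    private readonly TrieNode _root = new TrieNode();

    public void Add(string word)
    {
        var node = _root;
        foreach (char c in word)
        {
            TrieNode next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new TrieNode();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsWord = true;
    }

    // true if any stored word starts with the given prefix
    public bool StartsWith(string prefix)
    {
        var node = _root;
        foreach (char c in prefix)
        {
            if (!node.Children.TryGetValue(c, out node))
                return false;
        }
        return true;
    }
}

Lookup cost then depends only on the length of the typed prefix, not on the size of the list, which also fixes the slowdown as the input grows.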

HLSL Computation - process pixels in order?

Imagine I want to, say, compute the first one million terms of the Fibonacci sequence using the GPU. (I realize this will exceed the precision limit of a 32-bit data type - just used as an example)
Given a GPU with 40 shaders/stream processors, and cheating by using a reference book, I can break the million terms into 40 blocks of 25,000 terms each and seed each shader with its two start values:
unit 0: 1,1 (which then calculates 2,3,5,8,blah blah blah)
unit 1: 25,000th term
unit 2: 50,000th term
...
How, if possible, could I go about ensuring that pixels are processed in order? If the first few pixels in the input texture have values (with RGBA for simplicity)
0,0,0,1 // initial condition
0,0,0,1 // initial condition
0,0,0,2
0,0,0,3
0,0,0,5
...
How can I ensure that I don't try to calculate the 5th term before the first four are ready?
I realize this could be done in multiple passes by setting a "ready" bit whenever a value is calculated, but that seems incredibly inefficient and sort of eliminates the benefit of performing this type of calculation on the GPU.
OpenCL/CUDA/etc probably provide nice ways to do this, but I'm trying (for my own edification) to get this to work with XNA/HLSL.
Links or examples are appreciated.
Update/Simplification
Is it possible to write a shader that uses values from one pixel to influence the values from a neighboring pixel?
You cannot determine the order in which the pixels are processed. If you could, it would break the massive pixel throughput of the shader pipelines. What you can do is calculate the Fibonacci sequence using the non-recursive formula.
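For reference, a sketch of that closed-form (Binet) approach in C#, which lets every term be computed independently; double precision only yields exact integers up to roughly the 70th term, in line with the precision caveat in the question:

static long Fibonacci(int n)
{
    double sqrt5 = Math.Sqrt(5.0);
    double phi = (1.0 + sqrt5) / 2.0;                  // golden ratio
    // the (1 - phi)^n term vanishes quickly, so rounding suffices
    return (long)Math.Round(Math.Pow(phi, n) / sqrt5);
}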
In your question, you are actually trying to serialize the shader units to run one after another. You can use the CPU right away and it will be much faster.
By the way, multiple passes aren't as slow as you might think, but they won't help you in your case. You cannot really calculate any next value without knowing the previous ones, thus killing any parallelization.

DB2 ZOS String Comparison Problem

I am comparing some CHAR data in a WHERE clause in my SQL, like this:
where PRI_CODE < PriCode
The problem I am having is when the CHAR values are of different lengths.
So if PRI_CODE = '0800' and PriCode = '20' it is returning true instead of false.
It looks like it is comparing it like this
'08' < '20'
instead of like
'0800' < '20'
Does a CHAR comparison start from the left and proceed until one or the other value ends?
If so, how do I fix this?
My values can contain letters, so converting to numeric is not an option.
It's not comparing '08' with '20'; it is, as you expect, comparing '0800' with '20'.
What you don't seem to expect, however, is that '0800' (the string) is indeed less than '20' (the string).
If converting it to numerics for a numeric comparison is out of the question, you could use the following DB2 function:
right ('0000000000'||val,10)
which will give you val padded on the left with zeroes to a size of 10 (ideal for a CHAR(10), for example). That will at least guarantee that the fields are the same size and the comparison will work for your particular case. But I urge you to rethink how you're doing things: per-row functions rarely scale well, performance-wise.
If you're using z/OS, you should have a few DBAs just lying around on the computer room floor waiting for work - you can probably ask one of them for advice more tailored to your specific application :-)
One thing that comes to mind is the use of an insert/update trigger and a secondary column PRI_CODE_PADDED to hold the PRI_CODE column fully padded out (using the same method as above). Then make sure your PriCode variable is similarly formatted before executing the select ... where PRI_CODE_PADDED < PriCode.
Incurring that cost at insert/update time will amortise it over all the selects you're likely to do (which, because they're no longer using per-row functions, will be blindingly fast), giving you better overall performance (assuming your database isn't one of those incredibly rare beasts that are written more than read, of course).
