Is there a public implementation of the Rope data structure in C#?
For what it's worth, here is an immutable Java implementation. You could probably convert it to C# in less than an hour.
I'm not aware of a Rope implementation (though there probably is one!), but if you're only after doing concatenation, StringBuilder will do the job.
The BigList<T> class from Wintellect Power Collections (a C# data structure library) is somewhat similar to a rope:
http://docs.pushtechnology.com/docs/4.5.7/dotnet/externalclient/html/class_wintellect_1_1_power_collections_1_1_big_list_3_01_t_01_4.html
I measured its performance and it performs pretty well in "start of string inserts":
const int InsertCount = 150000;
var startTime = DateTime.Now;
var ropeOfChars = new BigList<char>();
for (int i = 0; i < InsertCount; i++)
{
ropeOfChars.Insert(0, (char)('a' + (i % 10)));
}
Console.WriteLine("Rope<char> time: {0}", DateTime.Now - startTime);
startTime = DateTime.Now;
var stringBuilder = new StringBuilder();
for (int i = 0; i < InsertCount; i++)
{
stringBuilder.Insert(0, (char)('a' + (i % 10)));
}
Console.WriteLine("StringBuilder time: {0}", DateTime.Now - startTime);
Results:
Rope<char> time: 00:00:00.0468740
StringBuilder time: 00:00:05.1471300
But it does not perform well in "middle of string inserts":
const int InsertCount = 150000;
var startTime = DateTime.Now;
var ropeOfChars = new BigList<char>();
for (int i = 0; i < InsertCount; i++)
{
ropeOfChars.Insert(ropeOfChars.Count / 2, (char)('a' + (i % 10)));
}
Console.WriteLine("Rope<char> time: {0}", DateTime.Now - startTime);
startTime = DateTime.Now;
var stringBuilder = new StringBuilder();
for (int i = 0; i < InsertCount; i++)
{
stringBuilder.Insert(stringBuilder.Length / 2, (char)('a' + (i % 10)));
}
Console.WriteLine("StringBuilder time: {0}", DateTime.Now - startTime);
Results:
Rope<char> time: 00:00:15.0229452
StringBuilder time: 00:00:04.7812553
I am not sure if this is a bug or an inefficient implementation, but a "rope of chars" is expected to be faster than StringBuilder in C#.
You can install Power Collections from NuGet:
Install-Package XAct.Wintellect.PowerCollections
Here is a public implementation of Ropes in C#, based on the immutable Java implementation listed above. Note that you won't get the same polymorphism benefits as the Java version, because strings can't be inherited from and CharSequence doesn't exist natively in C#.
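To make the idea concrete, here is a minimal sketch of an immutable rope. It is not the linked implementation: the class name `Rope` and its members are made up for illustration, and the tree is never rebalanced, so many repeated inserts would make it deep and slow.

```csharp
using System;
using System.Text;

// Minimal immutable rope sketch (hypothetical API, no rebalancing).
// Leaves hold strings; inner nodes hold two children and a cached length.
public abstract class Rope
{
    public abstract int Length { get; }
    public abstract char this[int index] { get; }
    public abstract Rope Substring(int start, int length);
    public abstract void AppendTo(StringBuilder sb);

    public static Rope FromString(string s) => new Leaf(s);

    // Concatenation copies no characters; it only allocates one node.
    public static Rope Concat(Rope left, Rope right) => new Node(left, right);

    // Insert = split around the index, then concatenate three pieces.
    public Rope Insert(int index, string text) =>
        Concat(Concat(Substring(0, index), FromString(text)),
               Substring(index, Length - index));

    public override string ToString()
    {
        var sb = new StringBuilder(Length);
        AppendTo(sb);
        return sb.ToString();
    }

    private sealed class Leaf : Rope
    {
        private readonly string _text;
        public Leaf(string text) { _text = text; }
        public override int Length => _text.Length;
        public override char this[int index] => _text[index];
        public override Rope Substring(int start, int length) =>
            new Leaf(_text.Substring(start, length));
        public override void AppendTo(StringBuilder sb) => sb.Append(_text);
    }

    private sealed class Node : Rope
    {
        private readonly Rope _left, _right;
        private readonly int _length;
        public Node(Rope left, Rope right)
        {
            _left = left;
            _right = right;
            _length = left.Length + right.Length;
        }
        public override int Length => _length;
        public override char this[int index] =>
            index < _left.Length ? _left[index] : _right[index - _left.Length];
        public override Rope Substring(int start, int length)
        {
            // Entirely inside one child: delegate without creating a node.
            if (start + length <= _left.Length) return _left.Substring(start, length);
            if (start >= _left.Length) return _right.Substring(start - _left.Length, length);
            // Otherwise the range straddles both children.
            int leftPart = _left.Length - start;
            return new Node(_left.Substring(start, leftPart),
                            _right.Substring(0, length - leftPart));
        }
        public override void AppendTo(StringBuilder sb)
        {
            _left.AppendTo(sb);
            _right.AppendTo(sb);
        }
    }
}
```

For example, `Rope.FromString("Hello, world").Insert(5, " there").ToString()` yields "Hello there, world". A production rope would also cap leaf sizes and rebalance the tree.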
I am new to C# and am trying a demo program. In this program my intended output is:
Id 1 2 3 4 5 6 7 8 9
Roll # 1 2 3 4 5 6 7 8 9
and this is what I have tried :
static void Main(string[] args)
{
StringBuilder sb = new StringBuilder();
sb.Append("Id ");
for (int i = 0; i < 10; i++)
{
sb.Append(i+" ");
}
sb.AppendLine();
sb.Append("Roll# ");
for (int i = 0; i < 10; i++)
{
sb.Append(i + " ");
}
Console.WriteLine(sb);
}
Although it gives me the desired output, I have to iterate through the for loop twice. Is there any way to get the same output while iterating only once, using some string formatting in C#?
This can be done without explicit looping, using Enumerable.Range to "generate a sequence of integral numbers within a specified range", along with string.Join() to concatenate the previously created range with the string " " :
// using System.Linq;
string range = string.Join(" ", Enumerable.Range(1, 10)); // "1 2 3 4 5 6 7 8 9 10"
sb.AppendLine($"Id {range}");
sb.AppendLine($"Roll# {range}");
If you really want to use a for loop to build your sequence, you can build your own Range method such as :
public static IEnumerable<int> Range(int min, int max)
{
if (min > max)
{
throw new ArgumentException("The min value can't be greater than the max");
}
for (int i = min; i <= max; i++)
{
yield return i;
}
}
And then Join like previously :
var range = string.Join(" ", Range(1, 10));
sb.AppendLine($"Id {range}");
sb.AppendLine($"Roll# {range}");
Or build an array/List/whatever collection and then use string.Join() :
var arr = new int [10];
for (int i = 1; i <= 10; i++)
{
arr[i - 1] = i;
}
string range = string.Join(" ", arr);
sb.AppendLine($"Id {range}");
sb.AppendLine($"Roll# {range}");
Or directly build a string in the loop :
var sbRange = new StringBuilder();
for (int i = 1; i <= 10; i++)
{
sbRange.Append($"{i} ");
}
// You can use a string and trim it (there is a space in excess at the end)
string range = sbRange.ToString().Trim();
sb.AppendLine($"Id {range}");
sb.AppendLine($"Roll# {range}");
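The first approach can be wrapped into a small reusable helper. This is only a sketch: the class name `RowFormatter` and the parameters `labels`, `first` and `count` are invented for illustration and do not come from the question.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RowFormatter
{
    // Builds one line per label, each followed by the same run of numbers.
    // `first` and `count` are illustrative parameters, not from the question.
    public static string BuildRows(string[] labels, int first, int count)
    {
        // Enumerable.Range takes (start, count), not (min, max).
        string range = string.Join(" ", Enumerable.Range(first, count));

        var sb = new StringBuilder();
        foreach (string label in labels)
        {
            sb.Append(label).Append(' ').AppendLine(range);
        }
        return sb.ToString();
    }
}
```

Calling `Console.Write(RowFormatter.BuildRows(new[] { "Id", "Roll#" }, 1, 9));` builds the number sequence once and reuses it for both lines.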
Instead of 1, use 2 StringBuilder instances:
StringBuilder sb1 = new StringBuilder();
StringBuilder sb2 = new StringBuilder();
sb1.Append("Id ");
sb2.Append("Roll# ");
for (int i = 0; i < 10; i++)
{
sb1.Append(i + " ");
sb2.Append(i + " ");
}
Console.WriteLine(sb1);
Console.WriteLine(sb2);
This will always require at least 3 loops:
One for the creation for the array.
One for each WriteLine.
At best you can have somebody else's code do the looping for you.
Unless you are interested in pulling stunts like manually inserting the newline into a really long string, there is no way to save even a single loop. But such a thing is just unreliable and should not be attempted.
It honestly sounds a lot like a Speed Question, and for those we have the speed rant. You should read it either way, but can skip part 1.
The only improvement I can think of is building those strings with a StringBuilder. String concatenation in loops can be a bit troublesome. But on this scale it works either way.
I am working with annuities and have the following methods in my code:
public static double NumPMTsRemaining( double CurBalance, double ContractRate, double Pmt)
{
double rt = PeriodicRate(ContractRate);
return -1 * Math.Log(1 - (CurBalance * (rt) / Pmt)) / Math.Log(1 + (rt));
}
public static double MonthlyPMT(double OrigBalance, double ContractRate, int Term)
{
double rt = PeriodicRate(ContractRate);
if (ContractRate > 0)
return (OrigBalance * rt * Math.Pow(1 + rt, Term)) / (Math.Pow(1 + rt, Term) - 1);
else return OrigBalance / Term;
}
I use the former method to determine whether the payment for a loan will ensure the loan pays off within its remaining life. I use the latter method to determine whether a payment is quoted for a payment period other than monthly, and then replace it with a monthly payment if so. Upon reflection, I can use the latter method for both tasks.
With that in mind, I was wondering if anyone knew off the top of their head if Math.Pow is faster/more efficient than/relative to Math.Log?
I assume that Math.Pow is the better choice, but would appreciate a bit of input.
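The two formulas are inverses of each other, which is why one method can serve both tasks. The sketch below checks this with a round trip. It rests on an assumption: the question never shows `PeriodicRate`, so here it is taken to convert an annual percentage rate into a monthly fraction by dividing by 1200.

```csharp
using System;

public static class Annuity
{
    // ASSUMPTION: PeriodicRate is not shown in the question. Here it converts
    // an annual percentage rate (e.g. 2.74) into a monthly fraction.
    public static double PeriodicRate(double contractRate) => contractRate / 1200.0;

    public static double MonthlyPMT(double origBalance, double contractRate, int term)
    {
        if (contractRate <= 0) return origBalance / term;
        double rt = PeriodicRate(contractRate);
        double growth = Math.Pow(1 + rt, term); // computed once instead of twice
        return origBalance * rt * growth / (growth - 1);
    }

    public static double NumPMTsRemaining(double curBalance, double contractRate, double pmt)
    {
        double rt = PeriodicRate(contractRate);
        return -Math.Log(1 - curBalance * rt / pmt) / Math.Log(1 + rt);
    }
}
```

Feeding the payment from `MonthlyPMT(20203.66, 2.74, 72)` back into `NumPMTsRemaining` should return approximately 72, up to floating-point error. Note that caching the `Math.Pow` result halves the number of `Pow` calls regardless of which function turns out to be faster.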
I have built a benchmark as recommended by @Mangist. The code is posted below. I was surprised by the response from @CodesInChaos. I, of course, did some research and realized I could improve a large amount of my code. I will post a link to an interesting StackOverflow article I found in this regard. A number of people had worked out improvements on Math.Pow due to the aforementioned fact.
Thank you again for the suggestions and information.
int term = 72;
double contractRate = 2.74 / 1200;
double balance = 20203.66;
double pmt = 304.96;
double logarithm = 0;
double power = 0;
DateTime BeginLog = DateTime.UtcNow;
for (int i = 0; i < 100000000; i++)
{
logarithm=(-1*Math.Log(1-(balance*contractRate/pmt))/Math.Log(1+contractRate));
}
DateTime EndLog = DateTime.UtcNow;
Console.WriteLine("Elapsed time= " + (EndLog - BeginLog));
Console.ReadLine();
DateTime BeginPow = DateTime.UtcNow;
for (int i = 0; i < 100000000; i++)
{
power = (balance * contractRate * Math.Pow(1 + contractRate, term)) / (Math.Pow(1 + contractRate, term) - 1);
}
DateTime EndPow = DateTime.UtcNow;
Console.WriteLine("Elapsed time= " + (EndPow - BeginPow));
Console.ReadLine();
The results of the benchmark were
Elapsed time for the logarithm 00:00:04.9274927
Elapsed time for the power 00:00:11.6981697
I also alluded to some additional StackOverflow discussions which shed light on the comment by @CodesInChaos.
How is Math.Pow() implemented in .NET Framework?
Let me add a head to head comparison between a suggestion on the above link and the Math.Pow function. I benchmarked Math.Pow(x,y) against Math.Exp(y*Math.Log(x)) with the following code:
DateTime PowBeginTime = DateTime.UtcNow;
for (int i = 0; i < 250000000; i++)
{
Math.Pow(1 + contractRate, term);
}
DateTime PowEndTime = DateTime.UtcNow;
Console.WriteLine("Elapsed time= " + (PowEndTime - PowBeginTime));
Console.ReadLine();
DateTime HighSchoolBeginTime = DateTime.UtcNow;
for (int i = 0; i < 250000000; i++)
{
Math.Exp(term * Math.Log(1 + contractRate));
}
DateTime HighSchoolEndTime = DateTime.UtcNow;
Console.WriteLine("Elapsed time= " + (HighSchoolEndTime - HighSchoolBeginTime));
Console.ReadLine();
The results were:
Math.Pow(x,y) 00:00:19.9469945
Math.Exp(y*Math.Log(x)) 00:00:18.3478346
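The loops above discard their results and time with DateTime.UtcNow. A slightly more careful harness might look like the sketch below, which uses Stopwatch, performs a warm-up call, and accumulates the results so the work cannot be optimized away. The names `PowBenchmark` and `Time` are made up for illustration, and the delegate call adds the same small overhead to both candidates.

```csharp
using System;
using System.Diagnostics;

public static class PowBenchmark
{
    // Times `iterations` calls of `body`. One warm-up call runs first so that
    // JIT compilation is not included in the measurement. The accumulated sum
    // is returned through `sink` so the calls have an observable result.
    public static TimeSpan Time(Func<double> body, int iterations, out double sink)
    {
        body(); // warm-up

        double acc = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            acc += body();
        }
        sw.Stop();

        sink = acc;
        return sw.Elapsed;
    }
}
```

A usage example with the values from the question: `PowBenchmark.Time(() => Math.Pow(1 + contractRate, term), 250000000, out double sum)`, and likewise for `Math.Exp(term * Math.Log(1 + contractRate))`. Run a Release build without a debugger attached, and repeat the measurement a few times before drawing conclusions.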
I am trying to compare performance between parallel streams in Java 8 and PLINQ (C#/.Net 4.5.1).
Here is the result I get on my machine (Dell Precision M4700; Intel(R) Core(TM) i7-3740QM CPU @ 2.70GHz, 4 cores, 8 logical processors; 16.0 GB RAM; Microsoft Windows 7 Enterprise, version 6.1.7601 Service Pack 1, build 7601):
C# .Net 4.5.1 (X64-release)
Serial:
470.7784, 491.4226, 502.4643, 481.7507, 464.1156, 463.0088, 546.149, 481.2942, 502.414, 483.1166
Average: 490.6373
Parallel:
158.6935, 133.4113, 217.4304, 182.3404, 184.188, 128.5767, 160.352, 277.2829, 127.6818, 213.6832
Average: 180.5496
Java 8 (X64)
Serial:
471.911822, 333.843924, 324.914299, 325.215631, 325.208402, 324.872828, 324.888046, 325.53066, 325.765791, 325.935861
Average:326.241715
Parallel:
212.09323, 73.969783, 68.015431, 66.246628, 66.15912, 66.185373, 80.120837, 75.813539, 70.085948, 66.360769
Average:70.3286
It looks like PLINQ does not scale across the CPU cores. I am wondering if I miss something.
Here is the code for C#:
class Program
{
static void Main(string[] args)
{
var NUMBER_OF_RUNS = 10;
var size = 10000000;
var vals = new double[size];
var rnd = new Random();
for (int i = 0; i < size; i++)
{
vals[i] = rnd.NextDouble();
}
var avg = 0.0;
Console.WriteLine("Serial:");
for (int i = 0; i < NUMBER_OF_RUNS; i++)
{
var watch = Stopwatch.StartNew();
var res = vals.Select(v => Math.Sin(v)).ToArray();
var elapsed = watch.Elapsed.TotalMilliseconds;
Console.Write(elapsed + ", ");
if (i > 0)
avg += elapsed;
}
Console.Write("\nAverage: " + (avg / (NUMBER_OF_RUNS - 1)));
avg = 0.0;
Console.WriteLine("\n\nParallel:");
for (int i = 0; i < NUMBER_OF_RUNS; i++)
{
var watch = Stopwatch.StartNew();
var res = vals.AsParallel().Select(v => Math.Sin(v)).ToArray();
var elapsed = watch.Elapsed.TotalMilliseconds;
Console.Write(elapsed + ", ");
if (i > 0)
avg += elapsed;
}
Console.Write("\nAverage: " + (avg / (NUMBER_OF_RUNS - 1)));
}
}
Here is the code for Java:
import java.util.Arrays;
import java.util.Random;
import java.util.stream.DoubleStream;
public class Main {
private static final Random rand = new Random();
private static final int MIN = 1;
private static final int MAX = 140;
private static final int POPULATION_SIZE = 10_000_000;
public static final int NUMBER_OF_RUNS = 10;
public static void main(String[] args) throws InterruptedException {
Random rnd = new Random();
double[] vals1 = DoubleStream.generate(rnd::nextDouble).limit(POPULATION_SIZE).toArray();
double avg = 0.0;
System.out.println("Serial:");
for (int i = 0; i < NUMBER_OF_RUNS; i++)
{
long start = System.nanoTime();
double[] res = Arrays.stream(vals1).map(Math::sin).toArray();
double duration = (System.nanoTime() - start) / 1_000_000.0;
System.out.print(duration + ", " );
if (i > 0)
avg += duration;
}
System.out.println("\nAverage:" + (avg / (NUMBER_OF_RUNS - 1)));
avg = 0.0;
System.out.println("\n\nParallel:");
for (int i = 0; i < NUMBER_OF_RUNS; i++)
{
long start = System.nanoTime();
double[] res = Arrays.stream(vals1).parallel().map(Math::sin).toArray();
double duration = (System.nanoTime() - start) / 1_000_000.0;
System.out.print(duration + ", " );
if (i > 0)
avg += duration;
}
System.out.println("\nAverage:" + (avg / (NUMBER_OF_RUNS - 1)));
}
}
Both runtimes make a decision about how many threads to use in order to complete the parallel operation. That is a non-trivial task that can take many factors into account, including the degree to which the task is CPU bound, the estimated time to complete the task, etc.
Each runtime makes different decisions about how many threads to use to resolve the request. Neither decision is obviously right or wrong in terms of system-wide scheduling, but the Java strategy performs better on this benchmark (and leaves fewer CPU resources available for other tasks on the system).
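If you want to take that decision into your own hands, one option is to split the index range yourself and cap the degree of parallelism explicitly. The following is a sketch, not a claim about what PLINQ does internally: `ParallelSin`, `Compute` and `maxThreads` are illustrative names. Writing into a preallocated array by index also keeps the output in input order.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ParallelSin
{
    // Computes Math.Sin for every element using at most `maxThreads` workers.
    public static double[] Compute(double[] vals, int maxThreads)
    {
        var res = new double[vals.Length];
        if (vals.Length == 0) return res; // Partitioner.Create rejects empty ranges

        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        // Each worker receives a contiguous [Item1, Item2) index range, so the
        // delegate is invoked once per chunk rather than once per element.
        Parallel.ForEach(Partitioner.Create(0, vals.Length), options, range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                res[i] = Math.Sin(vals[i]);
            }
        });
        return res;
    }
}
```

A call such as `ParallelSin.Compute(vals, Environment.ProcessorCount)` lets you compare different thread counts on the same machine.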
Out of curiosity, is there a faster/more efficient way to parse a dynamic list of ints from a string?
Currently I have this, and it works absolutely fine; I was just thinking there might be a better way as this seems a little overly complex for something so simple.
public static void Send(string providerIDList)
{
String[] providerIDArray = providerIDList.Split('|');
var providerIDs = new List<int>();
for (int counter = 0; counter < providerIDArray.Count(); counter++)
{
providerIDs.Add(int.Parse(providerIDArray[counter].ToString()));
}
//do some stuff with the parsed list of int
}
Edit: Perhaps I should have asked for a simpler way to parse my list out of the string. But since the original question did ask for a faster and more efficient way, the chosen answer will reflect that.
There's definitely a better way. Use LINQ:
var providerIDs = providerIDList.Split('|')
.Select(x => int.Parse(x))
.ToList();
Or using a method group conversion instead of a lambda expression:
var providerIDs = providerIDList.Split('|')
.Select(int.Parse)
.ToList();
This is not the most efficient way it can be done, but it's quite possibly the simplest. It's about as efficient as your approach - though that could be made slightly more efficient fairly easily, e.g. giving the List an initial capacity.
The difference in performance is likely to be irrelevant, so I'd stick with this simple code until you've got evidence that it's a bottleneck.
Note that if you don't need a List<int> - if you just need something you can iterate over once - you can kill the ToList call and use providerIDs as an IEnumerable<int>.
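The "initial capacity" tweak mentioned above could look like the following sketch. The name `ParseIds` is made up for illustration, and the explicit invariant culture is an added assumption rather than something the question asked for.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ProviderIdParser
{
    // Same behavior as the explicit loop in the question, but the list is
    // created with the final size so it never has to grow while filling.
    public static List<int> ParseIds(string providerIDList)
    {
        string[] parts = providerIDList.Split('|');
        var ids = new List<int>(parts.Length);
        foreach (string part in parts)
        {
            ids.Add(int.Parse(part, CultureInfo.InvariantCulture));
        }
        return ids;
    }
}
```

For example, `ProviderIdParser.ParseIds("1|22|333")` returns a list containing 1, 22 and 333.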
EDIT: If we're in the efficiency business, then here's an adaptation of the ForEachChar method, to avoid using int.Parse:
public static List<int> ForEachCharManualParse(string s, char delim)
{
List<int> result = new List<int>();
int tmp = 0;
foreach(char x in s)
{
if(x == delim)
{
result.Add(tmp);
tmp = 0;
}
else if (x >= '0' && x <= '9')
{
tmp = tmp * 10 + x - '0';
}
else
{
throw new ArgumentException("Invalid input: " + s);
}
}
result.Add(tmp);
return result;
}
Notes:
This will add zeroes for any consecutive delimiters, or a delimiter at the start or end
It doesn't handle negative numbers
It doesn't check for overflow
As noted in comments, using a switch statement instead of the x >= '0' && x <= '9' can improve the performance further (by about 10-15%)
If none of those are a problem for you, it's about 7x faster than ForEachChar on my machine:
ListSize 1000 : StringLen 10434
ForEachChar1000 Time : 00:00:02.1536651
ForEachCharManualParse1000 Time : 00:00:00.2760543
ListSize 100000 : StringLen 1048421
ForEachChar100000 Time : 00:00:02.2169482
ForEachCharManualParse100000 Time : 00:00:00.3087568
ListSize 10000000 : StringLen 104829611
ForEachChar10000000 Time : 00:00:22.0803706
ForEachCharManualParse10000000 Time : 00:00:03.1206769
The limitations can be worked around, but I haven't bothered... let me know if they're significant concerns for you.
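One possible way to lift those limitations is sketched below. It is a hypothetical variant, not the code that was benchmarked above: it skips empty segments, accepts a leading '-', and throws OverflowException when a value does not fit in an int. The extra checks will cost some of the speed advantage.

```csharp
using System;
using System.Collections.Generic;

public static class DelimitedIntParser
{
    // Hypothetical stricter variant of ForEachCharManualParse.
    public static List<int> ParseStrict(string s, char delim)
    {
        var result = new List<int>();
        int tmp = 0;            // accumulated as a negative value so int.MinValue fits
        bool negative = false;
        bool hasDigits = false;

        // Loop one past the end and treat that position as a final delimiter.
        for (int i = 0; i <= s.Length; i++)
        {
            char x = i == s.Length ? delim : s[i];

            if (x == delim)
            {
                if (hasDigits)
                {
                    // checked(-tmp) throws for 2147483648 and larger.
                    result.Add(negative ? tmp : checked(-tmp));
                }
                else if (negative)
                {
                    throw new FormatException("Sign without digits: " + s);
                }
                // An empty segment is skipped instead of producing a zero.
                tmp = 0;
                negative = false;
                hasDigits = false;
            }
            else if (x == '-' && !negative && !hasDigits)
            {
                negative = true;
            }
            else if (x >= '0' && x <= '9')
            {
                tmp = checked(tmp * 10 - (x - '0'));
                hasDigits = true;
            }
            else
            {
                throw new ArgumentException("Invalid input: " + s);
            }
        }
        return result;
    }
}
```

For instance, `DelimitedIntParser.ParseStrict("12|-7||2147483647|", '|')` returns 12, -7 and 2147483647, while "2147483648" throws an OverflowException.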
I don't like any of the answers so far. So, to actually answer the question the OP posed (the fastest/most efficient alternative to String.Split with int.Parse), I wrote and tested some code.
Using Mono on an Intel 3770k.
I found that using String.Split + Enumerable.Select is not the fastest (though maybe the prettiest) solution. In fact, it's among the slowest.
Here are some benchmark results:
ListSize 1000 : StringLen 10468
SplitForEach1000 Time : 00:00:02.8704048
SplitSelect1000 Time : 00:00:02.9134658
ForEachChar1000 Time : 00:00:01.8254438
SplitParallelSelectr1000 Time : 00:00:07.5421146
ForParallelForEachChar1000 Time : 00:00:05.3534218
ListSize 100000 : StringLen 1048233
SplitForEach100000 Time : 00:00:01.9500846
SplitSelect100000 Time : 00:00:02.2662606
ForEachChar100000 Time : 00:00:01.2554577
SplitParallelSelectr100000 Time : 00:00:02.6509969
ForParallelForEachChar100000 Time : 00:00:01.5842131
ListSize 10000000 : StringLen 104824707
SplitForEach10000000 Time : 00:00:18.2658261
SplitSelect10000000 Time : 00:00:20.6043874
ForEachChar10000000 Time : 00:00:10.0555613
SplitParallelSelectr10000000 Time : 00:00:18.1908017
ForParallelForEachChar10000000 Time : 00:00:08.6756213
Here's the code to get the benchmark results
using System;
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Diagnostics;
namespace FastStringSplit
{
class MainClass
{
public static void Main (string[] args)
{
Random rnd = new Random();
char delim = ':';
int[] sizes = new int[]{1000, 100000, 10000000 };
int[] iters = new int[]{10000, 100, 10};
Stopwatch sw;
List<int> list, result = new List<int>();
string str;
for(int s=0; s<sizes.Length; s++) {
list = new List<int>(sizes[s]);
for(int i=0; i<sizes[s]; i++)
list.Add (rnd.Next());
str = string.Join(":", list);
Console.WriteLine(string.Format("\nListSize {0} : StringLen {1}", sizes[s], str.Length));
////
sw = new Stopwatch();
for(int i=0; i<iters[s]; i++) {
sw.Start();
result = SplitForEach(str, delim);
sw.Stop();
}
Console.WriteLine("SplitForEach" + result.Count + " Time : " + sw.Elapsed.ToString());
////
sw = new Stopwatch();
for(int i=0; i<iters[s]; i++) {
sw.Start();
result = SplitSelect(str, delim);
sw.Stop();
}
Console.WriteLine("SplitSelect" + result.Count + " Time : " + sw.Elapsed.ToString());
////
sw = new Stopwatch();
for(int i=0; i<iters[s]; i++) {
sw.Start();
result = ForEachChar(str, delim);
sw.Stop();
}
Console.WriteLine("ForEachChar" + result.Count + " Time : " + sw.Elapsed.ToString());
////
sw = new Stopwatch();
for(int i=0; i<iters[s]; i++) {
sw.Start();
result = SplitParallelSelect(str, delim);
sw.Stop();
}
Console.WriteLine("SplitParallelSelectr" + result.Count + " Time : " + sw.Elapsed.ToString());
////
sw = new Stopwatch();
for(int i=0; i<iters[s]; i++) {
sw.Start();
result = ForParallelForEachChar(str, delim);
sw.Stop();
}
Console.WriteLine("ForParallelForEachChar" + result.Count + " Time : " + sw.Elapsed.ToString());
}
}
public static List<int> SplitForEach(string s, char delim) {
List<int> result = new List<int>();
foreach(string x in s.Split(delim))
result.Add(int.Parse (x));
return result;
}
public static List<int> SplitSelect(string s, char delim) {
return s.Split(delim)
.Select(int.Parse)
.ToList();
}
public static List<int> ForEachChar(string s, char delim) {
List<int> result = new List<int>();
int start = 0;
int end = 0;
foreach(char x in s) {
if(x == delim || end == s.Length - 1) {
if(end == s.Length - 1)
end++;
result.Add(int.Parse (s.Substring(start, end-start)));
start = end + 1;
}
end++;
}
return result;
}
public static List<int> SplitParallelSelect(string s, char delim) {
return s.Split(delim)
.AsParallel()
.Select(int.Parse)
.ToList();
}
public static int NumOfThreads = Environment.ProcessorCount > 2 ? Environment.ProcessorCount : 2;
public static List<int> ForParallelForEachChar(string s, char delim) {
int chunkSize = (s.Length / NumOfThreads) + 1;
ConcurrentBag<int> result = new ConcurrentBag<int>();
int[] chunks = new int[NumOfThreads+1];
Task[] tasks = new Task[NumOfThreads];
for(int x=0; x<NumOfThreads; x++) {
int next = chunks[x] + chunkSize;
while(next < s.Length) {
if(s[next] == delim)
break;
next++;
}
//Console.WriteLine(next);
chunks[x+1] = Math.Min(next, s.Length);
tasks[x] = Task.Factory.StartNew((o) => {
int chunkId = (int)o;
int start = chunks[chunkId];
int end = chunks[chunkId + 1];
if(start >= s.Length)
return;
if(s[start] == delim)
start++;
//Console.WriteLine(string.Format("{0} {1}", start, end));
for(int i = start; i<end; i++) {
if(s[i] == delim || i == end-1) {
if(i == end-1)
i++;
result.Add(int.Parse (s.Substring(start, i-start)));
start = i + 1;
}
}
}, x);
}
Task.WaitAll(tasks);
return result.ToList();
}
}
}
Here's the function I recommend
public static List<int> ForEachChar(string s, char delim) {
List<int> result = new List<int>();
int start = 0;
int end = 0;
foreach(char x in s) {
if(x == delim || end == s.Length - 1) {
if(end == s.Length - 1)
end++;
result.Add(int.Parse (s.Substring(start, end-start)));
start = end + 1;
}
end++;
}
return result;
}
Why is it faster?
It doesn't split the string into an array first. It does the splitting and parsing at the same time so there is no added overhead of iterating over the string to split it and then iterating over the array to parse it.
I also threw in a parallelized version using tasks, but it is only faster for very large strings.
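The same single-pass idea can be taken a step further by not allocating a substring per number. The sketch below was not part of the benchmark above and assumes a runtime where `int.Parse` accepts a `ReadOnlySpan<char>` (.NET Core 2.1 or later); the name `ParseSpans` is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SpanParser
{
    // Finds each delimiter with IndexOf and parses the slice in place,
    // so no intermediate string or array is allocated.
    public static List<int> ParseSpans(string s, char delim)
    {
        var result = new List<int>();
        int start = 0;
        while (start <= s.Length)
        {
            int end = s.IndexOf(delim, start);
            if (end < 0) end = s.Length; // last segment runs to the end

            result.Add(int.Parse(s.AsSpan(start, end - start),
                                 NumberStyles.Integer,
                                 CultureInfo.InvariantCulture));
            start = end + 1;
        }
        return result;
    }
}
```

Like Split followed by int.Parse, this throws a FormatException for empty segments, for example on a trailing delimiter or an empty input string.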
This appears cleaner:
var providerIDs = providerIDList.Split('|').Select(x => int.Parse(x)).ToList();
If you really want to know the most efficient way, use unsafe code: get a char pointer from the string, iterate over all chars by incrementing the pointer, buffer the chars read until the next '|', and convert the buffered chars to an Int32. If you want it to be really fast, do the conversion manually: begin with the last char, subtract the value of the '0' char, multiply by 10, 100, 1000, ... according to the iteration variable, then add the result to the sum variable. I don't have time to write the code, but hopefully you get the idea.
Executive Summary: Reed's answer below is the fastest if you want to stay in C#. If you're willing to marshal to C++ (which I am), that's a faster solution.
I have two 55mb ushort arrays in C#. I am combining them using the following loop:
float b = (float)number / 100.0f;
for (int i = 0; i < length; i++)
{
image.DataArray[i] =
(ushort)(mUIHandler.image1.DataArray[i] +
(ushort)(b * (float)mUIHandler.image2.DataArray[i]));
}
This code, according to adding DateTime.Now calls before and afterwards, takes 3.5 seconds to run. How can I make it faster?
EDIT: Here is some code that, I think, shows the root of the problem. When the following code is run in a brand new WPF application, I get these timing results:
Time elapsed: 00:00:00.4749156 //arrays added directly
Time elapsed: 00:00:00.5907879 //arrays contained in another class
Time elapsed: 00:00:02.8856150 //arrays accessed via accessor methods
So when arrays are walked directly, the time is much faster than if the arrays are inside of another object or container. This code shows that somehow, I'm using an accessor method, rather than accessing the arrays directly. Even so, the fastest I seem to be able to get is half a second. When I run the second listing of code in C++ with icc, I get:
Run time for pointer walk: 0.0743338
In this case, then, C++ is 7x faster (using icc, not sure if the same performance can be obtained with msvc-- I'm not as familiar with optimizations there). Is there any way to get C# near that level of C++ performance, or should I just have C# call my C++ routine?
Listing 1, C# code:
public class ArrayHolder
{
int length;
public ushort[] output;
public ushort[] input1;
public ushort[] input2;
public ArrayHolder(int inLength)
{
length = inLength;
output = new ushort[length];
input1 = new ushort[length];
input2 = new ushort[length];
}
public ushort[] getOutput() { return output; }
public ushort[] getInput1() { return input1; }
public ushort[] getInput2() { return input2; }
}
/// <summary>
/// Interaction logic for MainWindow.xaml
/// </summary>
public partial class MainWindow : Window
{
public MainWindow()
{
InitializeComponent();
Random random = new Random();
int length = 55 * 1024 * 1024;
ushort[] output = new ushort[length];
ushort[] input1 = new ushort[length];
ushort[] input2 = new ushort[length];
ArrayHolder theArrayHolder = new ArrayHolder(length);
for (int i = 0; i < length; i++)
{
output[i] = (ushort)random.Next(0, 16384);
input1[i] = (ushort)random.Next(0, 16384);
input2[i] = (ushort)random.Next(0, 16384);
theArrayHolder.getOutput()[i] = output[i];
theArrayHolder.getInput1()[i] = input1[i];
theArrayHolder.getInput2()[i] = input2[i];
}
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
int number = 44;
float b = (float)number / 100.0f;
for (int i = 0; i < length; i++)
{
output[i] =
(ushort)(input1[i] +
(ushort)(b * (float)input2[i]));
}
stopwatch.Stop();
Console.WriteLine("Time elapsed: {0}",
stopwatch.Elapsed);
stopwatch.Reset();
stopwatch.Start();
for (int i = 0; i < length; i++)
{
theArrayHolder.output[i] =
(ushort)(theArrayHolder.input1[i] +
(ushort)(b * (float)theArrayHolder.input2[i]));
}
stopwatch.Stop();
Console.WriteLine("Time elapsed: {0}",
stopwatch.Elapsed);
stopwatch.Reset();
stopwatch.Start();
for (int i = 0; i < length; i++)
{
theArrayHolder.getOutput()[i] =
(ushort)(theArrayHolder.getInput1()[i] +
(ushort)(b * (float)theArrayHolder.getInput2()[i]));
}
stopwatch.Stop();
Console.WriteLine("Time elapsed: {0}",
stopwatch.Elapsed);
}
}
Listing 2, C++ equivalent:
// looptiming.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <stdlib.h>
#include <windows.h>
#include <stdio.h>
#include <iostream>
int _tmain(int argc, _TCHAR* argv[])
{
int length = 55*1024*1024;
unsigned short* output = new unsigned short[length];
unsigned short* input1 = new unsigned short[length];
unsigned short* input2 = new unsigned short[length];
unsigned short* outPtr = output;
unsigned short* in1Ptr = input1;
unsigned short* in2Ptr = input2;
int i;
const int max = 16384;
for (i = 0; i < length; ++i, ++outPtr, ++in1Ptr, ++in2Ptr){
*outPtr = rand()%max;
*in1Ptr = rand()%max;
*in2Ptr = rand()%max;
}
LARGE_INTEGER ticksPerSecond;
LARGE_INTEGER tick1, tick2; // A point in time
LARGE_INTEGER time; // For converting tick into real time
QueryPerformanceCounter(&tick1);
outPtr = output;
in1Ptr = input1;
in2Ptr = input2;
int number = 44;
float b = (float)number/100.0f;
for (i = 0; i < length; ++i, ++outPtr, ++in1Ptr, ++in2Ptr){
*outPtr = *in1Ptr + (unsigned short)((float)*in2Ptr * b);
}
QueryPerformanceCounter(&tick2);
QueryPerformanceFrequency(&ticksPerSecond);
time.QuadPart = tick2.QuadPart - tick1.QuadPart;
std::cout << "Run time for pointer walk: " << (double)time.QuadPart/(double)ticksPerSecond.QuadPart << std::endl;
return 0;
}
EDIT 2: Enabling /QxHost in the second example drops the time down to 0.0662714 seconds. Modifying the first loop as @Reed suggested gets me down to
Time elapsed: 00:00:00.3835017
So, still not fast enough for a slider. That time is via the code:
stopwatch.Start();
Parallel.ForEach(Partitioner.Create(0, length),
(range) =>
{
for (int i = range.Item1; i < range.Item2; i++)
{
output[i] =
(ushort)(input1[i] +
(ushort)(b * (float)input2[i]));
}
});
stopwatch.Stop();
EDIT 3: As per @Eric Lippert's suggestion, I've rerun the code in C# in release, and, rather than use an attached debugger, just print the results to a dialog. They are:
Simple arrays: ~0.273s
Contained arrays: ~0.330s
Accessor arrays: ~0.345s
Parallel arrays: ~0.190s
(these numbers come from a 5 run average)
So the parallel solution is definitely faster than the 3.5 seconds I was getting before, but it still falls well short of the 0.074 seconds achieved by the icc-compiled C++ version. It seems, therefore, that the fastest solution is to compile in release and then marshal to an icc-compiled C++ executable, which makes using a slider possible here.
EDIT 4: Three more suggestions from @Eric Lippert: change the bound of the for loop from length to array.Length, use doubles, and try unsafe code.
For those three, the timing is now:
length: ~0.274s
doubles, not floats: ~0.290s
unsafe: ~0.376s
So far, the parallel solution is the big winner. Although if I could add these via a shader, maybe I could see some kind of speedup there...
Here's the additional code:
stopwatch.Reset();
stopwatch.Start();
double b2 = ((double)number) / 100.0;
for (int i = 0; i < output.Length; ++i)
{
output[i] =
(ushort)(input1[i] +
(ushort)(b2 * (double)input2[i]));
}
stopwatch.Stop();
DoubleArrayLabel.Content += "\t" + stopwatch.Elapsed.Seconds + "." + stopwatch.Elapsed.Milliseconds;
stopwatch.Reset();
stopwatch.Start();
for (int i = 0; i < output.Length; ++i)
{
output[i] =
(ushort)(input1[i] +
(ushort)(b * input2[i]));
}
stopwatch.Stop();
LengthArrayLabel.Content += "\t" + stopwatch.Elapsed.Seconds + "." + stopwatch.Elapsed.Milliseconds;
Console.WriteLine("Time elapsed: {0}",
stopwatch.Elapsed);
stopwatch.Reset();
stopwatch.Start();
unsafe
{
fixed (ushort* outPtr = output, in1Ptr = input1, in2Ptr = input2){
ushort* outP = outPtr;
ushort* in1P = in1Ptr;
ushort* in2P = in2Ptr;
for (int i = 0; i < output.Length; ++i, ++outP, ++in1P, ++in2P)
{
*outP = (ushort)(*in1P + b * (float)*in2P);
}
}
}
stopwatch.Stop();
UnsafeArrayLabel.Content += "\t" + stopwatch.Elapsed.Seconds + "." + stopwatch.Elapsed.Milliseconds;
Console.WriteLine("Time elapsed: {0}",
stopwatch.Elapsed);
This should be perfectly parallelizable. However, given the small amount of work being done per element, you'll need to handle this with extra care.
The proper way to do this (in .NET 4) would be to use Parallel.ForEach in conjunction with a Partitioner:
float b = (float)number / 100.0f;
Parallel.ForEach(Partitioner.Create(0, length),
(range) =>
{
for (int i = range.Item1; i < range.Item2; i++)
{
image.DataArray[i] =
(ushort)(mUIHandler.image1.DataArray[i] +
(ushort)(b * (float)mUIHandler.image2.DataArray[i]));
}
});
This will efficiently partition the work across available processing cores in your system, and should provide a decent speedup if you have multiple cores.
That being said, this will, at best, only speed up this operation by the number of cores in your system. If you need to speed it up more, you'll likely need to revert to a mix of parallelization and unsafe code. At that point, it might be worth thinking about alternatives to trying to present this in real time.
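As a self-contained illustration of the partitioned loop, the sketch below takes the arrays as plain parameters, so the loop body indexes them directly instead of going through property or accessor chains on every element (the question's own timings suggest that matters). The names `ImageBlend` and `Blend` are invented for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ImageBlend
{
    // output[i] = input1[i] + (number / 100) * input2[i], computed in chunks.
    public static void Blend(ushort[] input1, ushort[] input2, ushort[] output, int number)
    {
        if (input1.Length != output.Length || input2.Length != output.Length)
        {
            throw new ArgumentException("All arrays must have the same length.");
        }
        if (output.Length == 0) return; // Partitioner.Create rejects empty ranges

        float b = number / 100.0f;

        // One delegate call per index range keeps per-element overhead low.
        Parallel.ForEach(Partitioner.Create(0, output.Length), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                output[i] = (ushort)(input1[i] + (ushort)(b * input2[i]));
            }
        });
    }
}
```

In the question's setting this would be called as `ImageBlend.Blend(mUIHandler.image1.DataArray, mUIHandler.image2.DataArray, image.DataArray, number)`, assuming those properties return the underlying arrays.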
Assuming you have a lot of these (and you're using .NET 4), you can attempt to parallelize the operation:
Parallel.For(0, length, i=>
{
image.DataArray[i] =
(ushort)(mUIHandler.image1.DataArray[i] +
(ushort)(b * (float)mUIHandler.image2.DataArray[i]));
});
Of course that is all going to depend on whether or not parallelization of this would be worth it. That statement looks fairly computationally short; accessing indices by number is pretty fast as is. You might get gains because this loop is being run so many times with that much data.