Suppose a polymer has N monomers in its chain. I want to simulate its movement using the bead-spring model. There is no periodic boundary condition applied, because points are generated such that they never cross the boundary.
So, I wrote the following program.
[Plot: Polymer chain simulation with the Monte Carlo method]
I am using 1 million steps. The energy is not fluctuating as expected; after several thousand steps the curve goes totally flat. The X-axis is steps, the Y-axis is total energy.
Can anyone check the source code and tell me what I should change?
N.B. I am especially concerned with the function that calculates the total energy of the polymer; probably the algorithm is incorrect.
public double GetTotalPotential()
{
    double totalBeadPotential = 0.0;
    double totalSpringPotential = 0.0;

    // calculate total bead-energy
    for (int i = 0; i < beadsList.Count; i++)
    {
        Bead item_i = beadsList[i];
        Bead item_i_plus_1 = null;
        try
        {
            item_i_plus_1 = beadsList[i + 1];
            if (i != beadsList.Count - 1)
            {
                // calculate total spring energy.
                totalSpringPotential += item_i.GetHarmonicPotential(item_i_plus_1);
            }
        }
        catch { }

        for (int j = 0; j < beadsList.Count; j++)
        {
            if (i != j)
            {
                Bead item_j = beadsList[j];
                totalBeadPotential += item_i.GetPairPotential(item_j);
                //Console.Write(totalBeadPotential + "\n");
                //Thread.Sleep(100);
            }
        }
    }
    return totalBeadPotential + totalSpringPotential;
}
The problem with this application is that the simulation (Simulation.SimulateMotion) runs on a separate thread, in parallel with the draw timer (SimulationGuiForm.timer1_Tick), and the two share the same state (polymerChain) without any synchronization or signaling. As a result, some mutations of polymerChain are skipped completely (never drawn), and once the simulation finishes (long before the drawing does) timer1_Tick keeps redrawing the same polymerChain. You can easily check that by adding a counter to Simulation and incrementing it in SimulateMotion:
public class Simulation
{
    public static int Simulations = 0; // counter

    public static void SimulateMotion(PolymerChain polymerChain, int iterations)
    {
        Random random = new Random();
        for (int i = 0; i < iterations; i++)
        {
            Simulations++; // bump the counter
            // rest of the code
            // ...
And checking it in timer1_Tick:
private void timer1_Tick(object sender, EventArgs e)
{
    // ...
    // previous code
    if (Simulation.Simulations == totalIterations)
    {
        // breakpoint or Console.WriteLine() ...
        // will be hit as soon as "the curve goes totally flat"
    }
    DrawZGraph();
}
You need to rewrite your application in such a way that SimulateMotion either stores iterations in some collection which is consumed by timer1_Tick (basically implementing the producer-consumer pattern; for example, you can try using BlockingCollection, as I do in the pull request) or performs its actions only when the current state has been rendered.
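To illustrate the first option, here is a minimal sketch of the producer-consumer idea using BlockingCollection. The PolymerChain.Clone() method and the currentChain field are assumptions made for the example; the point is that the simulation thread only produces snapshots and the UI timer only consumes them:

using System.Collections.Concurrent;

public class Simulation
{
    // bounded queue so the producer can't run arbitrarily far ahead of the renderer
    public static BlockingCollection<PolymerChain> Snapshots =
        new BlockingCollection<PolymerChain>(boundedCapacity: 100);

    public static void SimulateMotion(PolymerChain polymerChain, int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            // ... mutate polymerChain as before ...

            // publish a snapshot; Add blocks when the queue is full
            Snapshots.Add(polymerChain.Clone()); // Clone() is a hypothetical deep copy
        }
        Snapshots.CompleteAdding(); // signal the consumer that no more states are coming
    }
}

// In the form:
private void timer1_Tick(object sender, EventArgs e)
{
    // TryTake consumes exactly one pending state per tick, so no mutation is skipped
    if (Simulation.Snapshots.TryTake(out PolymerChain snapshot))
    {
        currentChain = snapshot; // hypothetical field holding the state to draw
        DrawZGraph();
    }
}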
I'm stumped on this right now, and I don't know why it's happening. In this class, I have a function that loops over the digits before the decimal point to cull extra zeroes off the end, and then does the same for the digits after the decimal point. The second part (removing the trailing zeroes after the decimal point) works fine, but the first part (removing the leading zeroes before the decimal point) doesn't.
I have to call the function multiple times for it to cull all of the extra zeroes from that end of the number, and I don't know why. I suspect it might have something to do with checking whether a byte is == 0, but I'm not sure yet.
FYI: The function is called in the constructor at the very end.
public class GiantNumber
{
    //constructor and some other functions...

    public bool IsNegative { get; }
    public bool HasDecimal { get; }
    public List<byte> NumbersBeforeDecimal;
    public List<byte> NumbersAfterDecimal;

    public void CullZeroes()
    {
        for (int i = 0; i < NumbersBeforeDecimal.Count; i++)
        {
            byte number = NumbersBeforeDecimal[i];
            if (number == 0)
            {
                NumbersBeforeDecimal.RemoveAt(i);
            }
            else
            {
                break;
            }
        }
        for (int i = NumbersAfterDecimal.Count - 1; i >= 0; i--)
        {
            byte number = NumbersAfterDecimal[i];
            if (number == 0)
            {
                NumbersAfterDecimal.RemoveAt(i);
            }
            else
            {
                break;
            }
        }
    }
}
You should not do it this way at all. RemoveAt is O(n), which means that removing 10M digits would take roughly half an hour. That is about half an hour more than is really needed, since it can be done in milliseconds. You can remove the whole range in one step:
int j = 0;
while (j < NumbersBeforeDecimal.Count
       && NumbersBeforeDecimal[j] == 0)
{
    ++j;
}
NumbersBeforeDecimal.RemoveRange(0, j);
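The same single-pass idea applies to the trailing zeroes after the decimal point. A sketch of the counterpart, assuming the same NumbersAfterDecimal list from the question:

int k = NumbersAfterDecimal.Count;
while (k > 0 && NumbersAfterDecimal[k - 1] == 0)
{
    --k;
}
// remove everything from the first trailing zero to the end in one O(n) step
NumbersAfterDecimal.RemoveRange(k, NumbersAfterDecimal.Count - k);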
I figured this out with the help of Ňɏssa Pøngjǣrdenlarp; since I'm deleting the element at the current index and then iterating on to the next one, I'm skipping an element every time I remove one. If I leave the iterator's increment blank:
for (int i = 0; i < NumbersBeforeDecimal.Count; )
{
    //stuff
}
It won't skip over elements! This can be a better idea than iterating backwards when you want simpler code (in my case it's a lot better).
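Filled in, the corrected loop would look something like this (a sketch that mirrors the original body; because the loop breaks at the first non-zero digit, the index never needs to advance):

for (int i = 0; i < NumbersBeforeDecimal.Count; )
{
    if (NumbersBeforeDecimal[i] == 0)
    {
        // removing shifts the next element into slot i, so don't increment
        NumbersBeforeDecimal.RemoveAt(i);
    }
    else
    {
        break; // first non-zero digit reached; leading zeroes are gone
    }
}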
I created a function for finding prime numbers, but the process takes a long time and uses a lot of memory. I need to optimize my code by making it more time- and memory-efficient.
The function is split into two parts:
The first part iterates over the odd numbers; the second part is the isSimple method, which checks whether an odd number is prime.
I made some progress by moving the Math.Sqrt(N) call outside of the for loop, but I'm not sure what to do next.
Any suggestions are welcome.
Program:
class Program
{
    static void Main(string[] args)
    {
        // consider only odd numbers
        for (int i = 10001; i <= 90000; i += 2)
        {
            if (isSimple(i))
            {
                Console.Write(i.ToString() + "\n");
            }
        }
    }

    // The method of finding primes
    private static bool isSimple(int N)
    {
        double koren = Math.Sqrt(N);
        // to verify whether N is prime, it is enough to check
        // divisibility by the numbers up to its square root
        for (int i = 2; i <= koren; i++)
        {
            if (N % i == 0)
                return false;
        }
        return true;
    }
}
You are checking all possible divisors with your for (int i = 2; i <= koren; i++). That wastes time. You can halve the time taken by checking only odd divisors: you know that the given number N is odd, so no even number can be a divisor. Try for (int i = 3; i <= koren; i += 2) instead.
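Applied to the method from the question, the suggestion would look like this (a sketch; the name isSimple and the koren variable are kept from the original code):

// checks whether an odd N > 2 is prime by testing only odd divisors
private static bool isSimple(int N)
{
    double koren = Math.Sqrt(N);
    for (int i = 3; i <= koren; i += 2) // skip even divisors entirely
    {
        if (N % i == 0)
            return false;
    }
    return true;
}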
I've run into something really strange while working my way through some practice problems on dotnetfiddle. I have a program that applies a mathematical sequence (a different calculation at each step, depending on whether the current value is even or odd):
using System;

public class Program
{
    public static void Main()
    {
        int ceiling = 1000000;
        int maxMoves = 0;
        int maxStart = 0;
        int testNumber;
        for (int i = 1; i <= ceiling; i++)
        {
            testNumber = i;
            int moves = 1;
            while (testNumber != 1)
            {
                if (testNumber % 2 == 0)
                {
                    testNumber = testNumber / 2;
                    moves++;
                }
                else
                {
                    testNumber = (3 * testNumber) + 1;
                    moves++;
                }
            }
            if (moves > maxMoves)
            {
                maxMoves = moves;
                maxStart = i;
            }
        }
        Console.WriteLine(maxStart);
        Console.WriteLine(maxMoves);
    }
}
As written, the execution time limit gets exceeded. However, if I change the declaration of testNumber to a long instead of an int, the program runs:
int maxMoves = 0;
int maxStart = 0;
long testNumber; // changed from int
Why would making this change, which requires converting i from an int to a long on each iteration of the for loop (at testNumber = i), be faster than leaving it as an int? Is performing the mathematical operations faster on a long value?
The reason seems to be an overflow. If you run that code enclosed in a

checked
{
    // your code
}

block, you get an OverflowException when testNumber is an int.
The reason is that eventually 3*testNumber+1 exceeds the boundary of an int. In an unchecked context this does not throw an exception, but leads to negative values for testNumber.
At this point your sequence (I think it's Collatz, right?) does not work anymore, and the calculation takes (probably infinitely) longer, because you never reach 1 (or at least it takes a whole lot more iterations to reach 1).
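A minimal sketch demonstrating the effect; the starting value 113383 is an assumption (it is commonly cited as a Collatz seed whose trajectory exceeds int.MaxValue), but any sufficiently large seed would do:

using System;

public class OverflowDemo
{
    public static void Main()
    {
        int testNumber = 113383; // trajectory is believed to exceed int.MaxValue
        try
        {
            checked // make the overflow throw instead of wrapping silently
            {
                while (testNumber != 1)
                {
                    testNumber = (testNumber % 2 == 0)
                        ? testNumber / 2
                        : (3 * testNumber) + 1;
                }
            }
            Console.WriteLine("Reached 1 without overflow.");
        }
        catch (OverflowException)
        {
            // in an unchecked context this same step would silently produce a
            // negative value, and the loop could cycle without ever reaching 1
            Console.WriteLine("3*n+1 overflowed the int range.");
        }
    }
}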
So I am looking at this question, and the general consensus is that the uint-cast version is more efficient than the range check against 0. Since the code is also in Microsoft's implementation of List<T>, I assume it is a real optimization. However, I have failed to produce a code sample that results in better performance for the uint version. I have tried different tests, and either something is missing or some other part of my code is dwarfing the time for the checks. My last attempt looks like this:
class TestType
{
    public TestType(int size)
    {
        MaxSize = size;
        Random rand = new Random(100);
        for (int i = 0; i < MaxIterations; i++)
        {
            indexes[i] = rand.Next(0, MaxSize);
        }
    }

    public const int MaxIterations = 10000000;
    private int MaxSize;
    private int[] indexes = new int[MaxIterations];

    public void Test()
    {
        var timer = new Stopwatch();
        int inRange = 0;
        int outOfRange = 0;

        timer.Start();
        for (int i = 0; i < MaxIterations; i++)
        {
            int x = indexes[i];
            if (x < 0 || x > MaxSize)
            {
                throw new Exception();
            }
            inRange += indexes[x];
        }
        timer.Stop();
        Console.WriteLine("Comparison 1: " + inRange + "/" + outOfRange + ", elapsed: " + timer.ElapsedMilliseconds + "ms");

        inRange = 0;
        outOfRange = 0;
        timer.Reset();
        timer.Start();
        for (int i = 0; i < MaxIterations; i++)
        {
            int x = indexes[i];
            if ((uint)x > (uint)MaxSize)
            {
                throw new Exception();
            }
            inRange += indexes[x];
        }
        timer.Stop();
        Console.WriteLine("Comparison 2: " + inRange + "/" + outOfRange + ", elapsed: " + timer.ElapsedMilliseconds + "ms");
    }
}
class Program
{
    static void Main()
    {
        TestType t = new TestType(TestType.MaxIterations);
        t.Test();
        TestType t2 = new TestType(TestType.MaxIterations);
        t2.Test();
        TestType t3 = new TestType(TestType.MaxIterations);
        t3.Test();
    }
}
The code is a bit of a mess because I tried many things to make the uint check perform faster, like moving the compared variable into a field of the class, generating random index accesses, and so on, but in every case the result seems to be the same for both versions. So: is this change applicable on modern x86 processors, and can someone demonstrate it somehow?
Note that I am not asking for someone to fix my sample or explain what is wrong with it. I just want to see a case where the optimization does work.
if (x < 0 || x > MaxSize)
The comparison is performed by the CMP processor instruction (Compare). You'll want to take a look at Agner Fog's instruction tables document (PDF); it lists the cost of instructions. Find your processor in the list, then locate the CMP instruction.
For mine, Haswell, CMP takes 1 cycle of latency and 0.25 cycles of throughput.
A fractional cost like that could use an explanation. Haswell has 4 integer execution units that can execute instructions at the same time; when a program contains enough integer operations, like CMP, without interdependencies, they can all execute at the same time, in effect making the program 4 times faster. You don't always manage to keep all 4 of them busy with your code; that is actually pretty rare. But you do keep 2 of them busy in this case. In other words, two comparisons take just as long as a single one: 1 cycle.
There are other factors at play that make the execution time identical. One thing that helps is that the processor can predict the branch very well; it can speculatively execute x > MaxSize in spite of the short-circuit evaluation, and it will in fact end up using the result, since the branch is never taken.
And the true bottleneck in this code is the array indexing; accessing memory is one of the slowest things the processor can do. So the "fast" version of the code isn't faster, even though it provides more opportunity for the processor to execute instructions concurrently. That isn't much of an opportunity today anyway; a processor has too many execution units to keep busy, which is otherwise the feature that makes HyperThreading work. In both cases the processor bogs down at the same rate.
On my machine, I have to write code that occupies more than 4 engines to make it slower. Silly code like this:
if (x < 0 || x > MaxSize || x > 10000000 || x > 20000000 || x > 3000000)
{
    outOfRange++;
}
else
{
    inRange++;
}
Using 5 compares, now I can see a difference: 61 vs 47 msec. Or in other words, this is a way to count the number of integer engines in the processor. Hehe :)
So this is a micro-optimization that probably used to pay off a decade ago. It doesn't anymore. Scratch it off your list of things to worry about :)
I would suggest writing code that does not throw an exception when the index is out of range. Exceptions are incredibly expensive and can completely throw off your benchmark results.
The code below runs a timed-average benchmark of 1,000 iterations, each performing 1,000,000 operations.
using System;
using System.Diagnostics;

namespace BenchTest
{
    class Program
    {
        const int LoopCount = 1000000;
        const int AverageCount = 1000;

        static void Main(string[] args)
        {
            Console.WriteLine("Starting Benchmark");
            RunTest();
            Console.WriteLine("Finished Benchmark");
            Console.Write("Press any key to exit...");
            Console.ReadKey();
        }

        static void RunTest()
        {
            int cursorRow = Console.CursorTop; int cursorCol = Console.CursorLeft;
            long totalTime1 = 0; long totalTime2 = 0;
            long invalidOperationCount1 = 0; long invalidOperationCount2 = 0;
            for (int i = 0; i < AverageCount; i++)
            {
                Console.SetCursorPosition(cursorCol, cursorRow);
                Console.WriteLine("Running iteration: {0}/{1}", i + 1, AverageCount);
                int[] indexArgs = RandomFill(LoopCount, int.MinValue, int.MaxValue);
                int[] sizeArgs = RandomFill(LoopCount, 0, int.MaxValue);
                totalTime1 += RunLoop(TestMethod1, indexArgs, sizeArgs, ref invalidOperationCount1);
                totalTime2 += RunLoop(TestMethod2, indexArgs, sizeArgs, ref invalidOperationCount2);
            }
            PrintResult("Test 1", TimeSpan.FromTicks(totalTime1 / AverageCount), invalidOperationCount1);
            PrintResult("Test 2", TimeSpan.FromTicks(totalTime2 / AverageCount), invalidOperationCount2);
        }

        static void PrintResult(string testName, TimeSpan averageTime, long invalidOperationCount)
        {
            Console.WriteLine(testName);
            Console.WriteLine("  Average Time: {0}", averageTime);
            Console.WriteLine("  Invalid Operations: {0} ({1})", invalidOperationCount, (invalidOperationCount / (double)(AverageCount * LoopCount)).ToString("P3"));
        }

        static long RunLoop(Func<int, int, int> testMethod, int[] indexArgs, int[] sizeArgs, ref long invalidOperationCount)
        {
            Stopwatch sw = new Stopwatch();
            Console.Write("Running {0} sub-iterations", LoopCount);
            sw.Start();
            long startTickCount = sw.ElapsedTicks;
            for (int i = 0; i < LoopCount; i++)
            {
                invalidOperationCount += testMethod(indexArgs[i], sizeArgs[i]);
            }
            sw.Stop();
            long stopTickCount = sw.ElapsedTicks;
            long elapsedTickCount = stopTickCount - startTickCount;
            Console.WriteLine(" - Time Taken: {0}", new TimeSpan(elapsedTickCount));
            return elapsedTickCount;
        }

        static int[] RandomFill(int size, int minValue, int maxValue)
        {
            int[] randomArray = new int[size];
            Random rng = new Random();
            for (int i = 0; i < size; i++)
            {
                randomArray[i] = rng.Next(minValue, maxValue);
            }
            return randomArray;
        }

        static int TestMethod1(int index, int size)
        {
            return (index < 0 || index >= size) ? 1 : 0;
        }

        static int TestMethod2(int index, int size)
        {
            return ((uint)(index) >= (uint)(size)) ? 1 : 0;
        }
    }
}
You aren't comparing like with like.
The code you were talking about not only saved one branch by using the optimisation, but also 4 bytes of CIL in a small method.
In a small method 4 bytes can be the difference in being inlined and not being inlined.
And if the method calling that method is also written to be small, then that can mean two (or more) method calls are jitted as one piece of inline code.
And because it is then inline and available for analysis by the jitter, maybe some of it is optimised further again.
The real difference is not between index < 0 || index >= _size and (uint)index >= (uint)_size, but between code that has repeated efforts to minimise the method body size and code that does not. Look for example at how another method is used to throw the exception if necessary, further shaving off a couple of bytes of CIL.
(And no, that's not to say that I think all methods should be written like that, but there certainly can be performance differences when one does).
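To illustrate the kind of size-shaving being described, here is a hypothetical indexer written in that style. TinyList and ThrowHelper.ThrowIndexOutOfRange are assumptions modelled on the pattern, not the actual List<T> source:

// a hypothetical, size-conscious list-like class in the style described above
public class TinyList<T>
{
    private T[] _items = new T[16];
    private int _size;

    public T this[int index]
    {
        get
        {
            // one unsigned compare replaces two signed compares, and the
            // throw lives in a separate method, keeping this body tiny
            if ((uint)index >= (uint)_size)
                ThrowHelper.ThrowIndexOutOfRange();
            return _items[index];
        }
    }
}

internal static class ThrowHelper
{
    // never inlined: the exception path stays out of the caller's CIL,
    // so the hot method remains an attractive inlining candidate
    [System.Runtime.CompilerServices.MethodImpl(
        System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
    internal static void ThrowIndexOutOfRange() =>
        throw new System.ArgumentOutOfRangeException("index");
}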