Pitfalls in C# for a new user. (FWHM calculation)

Pitfalls in C# for a new user. (FWHM calculation) - c#

This is my idea to program a simple math module (function) that can be called from another main program. It calculates the FWHM(full width at half the max) of a curve. Since this is my first try at Visual Studio and C#. I would like to know few basic programming structures I should learn in C# coming from a Mathematica background.
Is double fwhm(double[] data, int c) indicate the input arguments
to this function fwhm should be a double data array and an Integer
value? Did I get this right?
I find it difficult to express complex mathematical equations (line 32/33) to express them in parenthesis and divide one by another, whats the right method to do that?
How can I perform Mathematical functions on elements of an Array like division and store the results in the same Array?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace DEV_2
{
class fwhm
{
static double fwhm(double[] data, int c) // data as 2d data and c is integer
{
double[] datax;
double[] datay;
int L;
int Mag = 4;
double PP = 2.2;
int CI;
int k;
double Interp;
double Tlead;
double Ttrail;
double fwhm;
L = datay.Length;
// Create datax as index for the number of elemts in data from 1-Length(data).
for (int i = 1; i <= data.Length; i++)
{
datax[i] = (i + 1);
}
//Find max in datay and divide all elements by maxValue.
var m = datay.Length; // Find length of datay
Array.ForEach(datay, (x) => {datay[m++] = x / datay.Max();}); // Divide all elements of datay by max(datay)
double maxValue = datay.Max();
CI = datay.ToList().IndexOf(maxValue); // Push that index to CI
// Start to search lead
int k = 2;
while (Math.Sign(datay[k]) == Math.Sign(datay[k-1]-0.5))
{
k=k+1;
}
Interp = (0.5-datay[k-1])/(datay[k]-datay[k-1]);
Tlead = datax[k-1]+Interp*(datax[k]-datax[k-1]);
CI = CI+1;
// Start search for the trail
while (Math.Sign(datay[k]-0.5) == Math.Sign(datay[k-1]-0.5) && (k<=L-1))
{
k=k+1;
}
if (k != L)
{
Interp = (0.5-datay[k-1])/(datay[k]-datay[k-1]);
Ttrail = datax[k-1] + Interp*(datax[k]-datax[k-1]);
fwhm =((Ttrail-Tlead)*PP)/Mag;
}
}//end main
}//end class
}//end namespace

There are plenty of pitfalls in C#, but working through problems is a great way to find and learn them!
Yes, when passing parameters to a method the correct syntax is MethodName(varType varName) seperated by a comma for multiple parameters. Some pitfalls arise here with differences in passing Value types and Reference types. If you're interested here is some reading on the subject.
Edit: As pointed out in the comments you should write code as best as possible to require as few comments as possible (thus paragraph between #3 and #4), however if you need to do very specific and slightly complex math then you should comment to clarify what is occuring.
If you mean difficulties understanding, make sure you comment your code properly. If you mean difficulties writing it, you can create variables to simplify reading your code (but generally unnecessary) or look up functions or libraries to help you, this is a bit open ended question if you have a particular functionality you are looking for perhaps we could be of more help.
You can access your array via indexes such as array[i] will get the ith index. Following this you can manipulate the data that said index is pointing to in any way you wish, array[i] = (array[i]/24)^3 or array[i] = doMath(array[i])
A couple things you can do if you like to clean a little, but they are preference based, is not declare int CI; int k; in your code before you initialize them with int k = 2;, there is no need (although you can if it helps you). The other thing is to correctly name your variables, common practice is a more descriptive camelCase naming, so perhaps instead of int CI = datay.ToList().IndexOf(maxValue); you coud use int indexMaxValueYData = datay.ToList().IndexOf(maxValue);
As per your comment question "What would this method return?" The method will return a double, as declared above. returnType methodName(parameters) However you need to add that in your code, as of now I see no return line. Such as return doubleVar; where doubleVar is a variable of type double.

Related

Reducing a BigInteger value in C#

I'm somewhat new to working with BigIntegers and have tried some stuff to get this system working, but feel a little stuck at the moment and would really appreciate a nudge in the right direction or a solution.
I'm currently working on a system which reduces BigInteger values down to a more readable form, and this is working fine with my current implementation, but I would like to further expand on it to get decimals implemented.
To better give a picture of what I'm attempting, I'll break it down.
In this context, we have a method which is taking a BigInteger, and returning it as a string:
public static string ShortenBigInt (BigInteger moneyValue)
With this in mind, when a number such as 10,000 is passed to this method, 10k will be returned. Same for 1,000,000 which will return 1M.
This is done by doing:
for(int i = 0; i < prefixes.Length; i++)
{
if(!(moneyValue >= BigInteger.Pow(10, 3*i)))
{
moneyValue = moneyValue / BigInteger.Pow(10, 3*(i-1));
return moneyValue + prefixes[i-1];
}
}
This system is working by grabbing a string from an array of prefixes and reducing numbers down to their simplest forms and combining the two and returning it when inside that prefix range.
So with that context, the question I have is:
How might I go about returning this in the same way, where passing 100,000 would return 100k, but also doing something like 1,111,111 would return 1.11M?
Currently, passing 1,111,111M returns 1M, but I would like that additional .11 tagged on. No more than 2 decimals.
My original thought was to convert the big integer into a string, then chunk out the first few characters into a new string and parse a decimal in there, but since prefixes don't change until values reach their 1000th mark, it's harder to tell when to place the decimal place.
My next thought was using BigInteger.Log to reduce the value down into a decimal friendly number and do a simple division to get the value in its decimal form, but doing this didn't seem to work with my implementation.
This system should work for the following prefixes, dynamically:
k, M, B, T, qd, Qn, sx, Sp,
O, N, de, Ud, DD, tdD, qdD, QnD,
sxD, SpD, OcD, NvD, Vgn, UVg, DVg,
TVg, qtV, QnV, SeV, SPG, OVG, NVG,
TGN, UTG, DTG, tsTG, qtTG, QnTG, ssTG,
SpTG, OcTG, NoTG, QdDR, uQDR, dQDR, tQDR,
qdQDR, QnQDR, sxQDR, SpQDR, OQDDr, NQDDr,
qQGNT, uQGNT, dQGNT, tQGNT, qdQGNT, QnQGNT,
sxQGNT, SpQGNT, OQQGNT, NQQGNT, SXGNTL
Would anyone happen to know how to do something like this? Any language is fine, C# is preferable, but I'm all good with translating. Thank you in advance!

formatting it manually could work a bit like this:
(prefixes as a string which is an char[])
public static string ShortenBigInt(BigInteger moneyValue)
{
string prefixes = " kMGTP";
double m2 = (double)moneyValue;
for (int i = 1; i < prefixes.Length; i++)
{
var step = Math.Pow(10, 3 * i);
if (m2 / step < 1000)
{
return String.Format("{0:F2}", (m2/step)) + prefixes[i];
}
}
return "err";
}

Although Falco's answer does work, it doesn't work for what was requested. This was the solution I was looking for and received some help from a friend on it. This solution will go until there are no more prefixes left in your string array of prefixes. If you do run out of bounds, the exception will be thrown and handled by returning "Infinity".
This solution is better due to the fact there is no crunch down to doubles/decimals within this process. This solution does not have a number cap, only limit is the amount of prefixes you make/provide.
public static string ShortenBigInt(BigInteger moneyValue)
{
if (moneyValue < 1000)
return "" + moneyValue;
try
{
string moneyAsString = moneyValue.ToString();
string prefix = prefixes[(moneyAsString.Length - 1) / 3];
BigInteger chopAmmount = (moneyAsString.Length - 1) % 3 + 1;
int insertPoint = (int)chopAmmount;
chopAmmount += 2;
moneyAsString = moneyAsString.Remove(Math.Min(moneyAsString.Length - 1, (int)chopAmmount));
moneyAsString = moneyAsString.Insert(insertPoint, ".");
return moneyAsString + " " + prefix;
}
catch (Exception exceptionToBeThrown)
{
return "Infinity";
}
}

Force C# SolverFoundation SimplexSolver to find an int solution

It is stated here that SimplexSolver "Defines a branch-and-bound search for optimizing mixed integer problems." which should mean that it finds an integer solution for a given task but it finds a precise solution with a double values.
Is there a way to force it to find an integer solution or i should implement my own branch-and-bound on top of given double solutions?

Is there a way to force it to find an integer solution or i should implement my own branch-and-bound on top of given double solutions?
No need to implement B&B algorithm, just declare your variables as integers and SimplexSolver should be able to solve it and provide integer optimal solution. See example here. Relevant snippet below:
SimplexSolver solver = new SimplexSolver();
// ...
for (int i = 0; i < 5; i++) {
solver.AddVariable(string.Format("project{0}", i),
out chooseProjectX[i]);
solver.SetBounds(chooseProjectX[i], 0, 1);
solver.SetIntegrality(chooseProjectX[i], true);
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
solver.SetCoefficient(profit, chooseProjectX[i],
estimatedProfitOfProjectX[i]);
solver.SetCoefficient(expenditure, chooseProjectX[i],
capitalRequiredForProjectX[i]);
}

Check how many instances of strings in a list

Well I enquired about checking if certain keywords can be found in an list and if they are all there the question is correct. Found here: Check if the string contains all inputs on the list
What I would like to also know is how many of the words are in the list, then divide it and get a percentage, so the user knows how accurately they answered each question.
public String KeyWords_Found()
{
int Return_Value = 0;
foreach (String s in KeyWords)
{
if (textBox1.Text.Contains(s))
{
Return_Value++;
}
}
int Holder = Return_Value / KeyWords.Count;
int Fixed = Holder * 100;
return Fixed + "%";
}
So what I want that code it do is check for all instances of keywords listed into the list KeyWords. Then get the percentage by dividing by the total amount of keywords and multiplying by 100. But it says that both values are 0 and i cant divide by 0. I'm not sure why they would be zero. Confused! Help!

You should first check, if KeyWords is empty or not
public String KeyWords_Found()
{
if (KeyWords.Count == 0)
return "0%";
// rest of the code
}
Alternatively you could use Linq instead of writing your own method:
int nOfOccurences = KeyWords.Where(k => textBox1.Text.Contains(k)).Count();
make sure you are using System.Linq; for that to work.
You'll still need to check for KeyWords.Count == 0 and compute the percentage yourself, though.

You should use floating point maths instead of integer maths in your calculations.
int i=100;
int a=51;
(i/a)==0 //true, integer division sucks for calculating percentages
((double)i/a)==0 //false, actually equals ~1.96

Standard deviation of generic list? [duplicate]

This question already has answers here:
How do I determine the standard deviation (stddev) of a set of values?
(12 answers)
Standard Deviation in LINQ
(8 answers)
Closed 9 years ago.
I need to calculate the standard deviation of a generic list. I will try to include my code. Its a generic list with data in it. The data is mostly floats and ints. Here is my code that is relative to it without getting into to much detail:
namespace ValveTesterInterface
{
public class ValveDataResults
{
private List<ValveData> m_ValveResults;
public ValveDataResults()
{
if (m_ValveResults == null)
{
m_ValveResults = new List<ValveData>();
}
}
public void AddValveData(ValveData valve)
{
m_ValveResults.Add(valve);
}
Here is the function where the standard deviation needs to be calculated:
public float LatchStdev()
{
float sumOfSqrs = 0;
float meanValue = 0;
foreach (ValveData value in m_ValveResults)
{
meanValue += value.LatchTime;
}
meanValue = (meanValue / m_ValveResults.Count) * 0.02f;
for (int i = 0; i <= m_ValveResults.Count; i++)
{
sumOfSqrs += Math.Pow((m_ValveResults - meanValue), 2);
}
return Math.Sqrt(sumOfSqrs /(m_ValveResults.Count - 1));
}
}
}
Ignore whats inside the LatchStdev() function because I'm sure its not right. Its just my poor attempt to calculate the st dev. I know how to do it of a list of doubles, however not of a list of generic data list. If someone had experience in this, please help.

The example above is slightly incorrect and could have a divide by zero error if your population set is 1. The following code is somewhat simpler and gives the "population standard deviation" result. (http://en.wikipedia.org/wiki/Standard_deviation)
using System;
using System.Linq;
using System.Collections.Generic;
public static class Extend
{
public static double StandardDeviation(this IEnumerable<double> values)
{
double avg = values.Average();
return Math.Sqrt(values.Average(v=>Math.Pow(v-avg,2)));
}
}

This article should help you. It creates a function that computes the deviation of a sequence of double values. All you have to do is supply a sequence of appropriate data elements.
The resulting function is:
private double CalculateStandardDeviation(IEnumerable<double> values)
{
double standardDeviation = 0;
if (values.Any())
{
// Compute the average.
double avg = values.Average();
// Perform the Sum of (value-avg)_2_2.
double sum = values.Sum(d => Math.Pow(d - avg, 2));
// Put it all together.
standardDeviation = Math.Sqrt((sum) / (values.Count()-1));
}
return standardDeviation;
}
This is easy enough to adapt for any generic type, so long as we provide a selector for the value being computed. LINQ is great for that, the Select funciton allows you to project from your generic list of custom types a sequence of numeric values for which to compute the standard deviation:
List<ValveData> list = ...
var result = list.Select( v => (double)v.SomeField )
.CalculateStdDev();

Even though the accepted answer seems mathematically correct, it is wrong from the programming perspective - it enumerates the same sequence 4 times. This might be ok if the underlying object is a list or an array, but if the input is a filtered/aggregated/etc linq expression, or if the data is coming directly from the database or network stream, this would cause much lower performance.
I would highly recommend not to reinvent the wheel and use one of the better open source math libraries Math.NET. We have been using that lib in our company and are very happy with the performance.
PM> Install-Package MathNet.Numerics
var populationStdDev = new List<double>(1d, 2d, 3d, 4d, 5d).PopulationStandardDeviation();
var sampleStdDev = new List<double>(2d, 3d, 4d).StandardDeviation();
See http://numerics.mathdotnet.com/docs/DescriptiveStatistics.html for more information.
Lastly, for those who want to get the fastest possible result and sacrifice some precision, read "one-pass" algorithm https://en.wikipedia.org/wiki/Standard_deviation#Rapid_calculation_methods

I see what you're doing, and I use something similar. It seems to me you're not going far enough. I tend to encapsulate all data processing into a single class, that way I can cache the values that are calculated until the list changes.
for instance:
public class StatProcessor{
private list<double> _data; //this holds the current data
private _avg; //we cache average here
private _avgValid; //a flag to say weather we need to calculate the average or not
private _calcAvg(); //calculate the average of the list and cache in _avg, and set _avgValid
public double average{
get{
if(!_avgValid) //if we dont HAVE to calculate the average, skip it
_calcAvg(); //if we do, go ahead, cache it, then set the flag.
return _avg; //now _avg is garunteed to be good, so return it.
}
}
...more stuff
Add(){
//add stuff to the list here, and reset the flag
}
}
You'll notice that using this method, only the first request for average actually computes the average. After that, as long as we don't add (or remove, or modify at all, but those arnt shown) anything from the list, we can get the average for basically nothing.
Additionally, since the average is used in the algorithm for the standard deviation, computing the standard deviation first will give us the average for free, and computing the average first will give us a little performance boost in the standard devation calculation, assuming we remember to check the flag.
Furthermore! places like the average function, where you're looping through every value already anyway, is a great time to cache things like the minimum and maximum values. Of course, requests for this information need to first check whether theyve been cached, and that can cause a relative slowdown compared to just finding the max using the list, since it does all the extra work setting up all the concerned caches, not just the one your accessing.

Dynamic compilation for performance

I have an idea of how I can improve the performance with dynamic code generation, but I'm not sure which is the best way to approach this problem.
Suppose I have a class
class Calculator
{
int Value1;
int Value2;
//..........
int ValueN;
void DoCalc()
{
if (Value1 > 0)
{
DoValue1RelatedStuff();
}
if (Value2 > 0)
{
DoValue2RelatedStuff();
}
//....
//....
//....
if (ValueN > 0)
{
DoValueNRelatedStuff();
}
}
}
The DoCalc method is at the lowest level and it is called many times during calculation. Another important aspect is that ValueN are only set at the beginning and do not change during calculation. So many of the ifs in the DoCalc method are unnecessary, as many of ValueN are 0. So I was hoping that dynamic code generation could help to improve performance.
For instance if I create a method
void DoCalc_Specific()
{
const Value1 = 0;
const Value2 = 0;
const ValueN = 1;
if (Value1 > 0)
{
DoValue1RelatedStuff();
}
if (Value2 > 0)
{
DoValue2RelatedStuff();
}
....
....
....
if (ValueN > 0)
{
DoValueNRelatedStuff();
}
}
and compile it with optimizations switched on the C# compiler is smart enough to only keep the necessary stuff. So I would like to create such method at run time based on the values of ValueN and use the generated method during calculations.
I guess that I could use expression trees for that, but expression trees works only with simple lambda functions, so I cannot use things like if, while etc. inside the function body. So in this case I need to change this method in an appropriate way.
Another possibility is to create the necessary code as a string and compile it dynamically. But it would be much better for me if I could take the existing method and modify it accordingly.
There's also Reflection.Emit, but I don't want to stick with it as it would be very difficult to maintain.
BTW. I'm not restricted to C#. So I'm open to suggestions of programming languages that are best suited for this kind of problem. Except for LISP for a couple of reasons.
One important clarification. DoValue1RelatedStuff() is not a method call in my algorithm. It's just some formula-based calculation and it's pretty fast. I should have written it like this
if (Value1 > 0)
{
// Do Value1 Related Stuff
}
I have run some performance tests and I can see that with two ifs when one is disabled the optimized method is about 2 times faster than with the redundant if.
Here's the code I used for testing:
public class Program
{
static void Main(string[] args)
{
int x = 0, y = 2;
var if_st = DateTime.Now.Ticks;
for (var i = 0; i < 10000000; i++)
{
WithIf(x, y);
}
var if_et = DateTime.Now.Ticks - if_st;
Console.WriteLine(if_et.ToString());
var noif_st = DateTime.Now.Ticks;
for (var i = 0; i < 10000000; i++)
{
Without(x, y);
}
var noif_et = DateTime.Now.Ticks - noif_st;
Console.WriteLine(noif_et.ToString());
Console.ReadLine();
}
static double WithIf(int x, int y)
{
var result = 0.0;
for (var i = 0; i < 100; i++)
{
if (x > 0)
{
result += x * 0.01;
}
if (y > 0)
{
result += y * 0.01;
}
}
return result;
}
static double Without(int x, int y)
{
var result = 0.0;
for (var i = 0; i < 100; i++)
{
result += y * 0.01;
}
return result;
}
}

I would usually not even think about such an optimization. How much work does DoValueXRelatedStuff() do? More than 10 to 50 processor cycles? Yes? That means you are going to build quite a complex system to save less then 10% execution time (and this seems quite optimistic to me). This can easily go down to less then 1%.
Is there no room for other optimizations? Better algorithms? An do you really need to eliminate single branches taking only a single processor cycle (if the branch prediction is correct)? Yes? Shouldn't you think about writing your code in assembler or something else more machine specific instead of using .NET?
Could you give the order of N, the complexity of a typical method, and the ratio of expressions usually evaluating to true?

It would surprise me to find a scenario where the overhead of evaluating the if statements is worth the effort to dynamically emit code.
Modern CPU's support branch prediction and branch predication, which makes the overhead for branches in small segments of code approach zero.
Have you tried to benchmark two hand-coded versions of the code, one that has all the if-statements in place but provides zero values for most, and one that removes all of those same if branches?

If you are really into code optimisation - before you do anything - run the profiler! It will show you where the bottleneck is and which areas are worth optimising.
Also - if the language choice is not limited (except for LISP) then nothing will beat assembler in terms of performance ;)
I remember achieving some performance magic by rewriting some inner functions (like the one you have) using assembler.

Before you do anything, do you actually have a problem?
i.e. does it run long enough to bother you?
If so, find out what is actually taking time, not what you guess. This is the quick, dirty, and highly effective method I use to see where time goes.
Now, you are talking about interpreting versus compiling. Interpreted code is typically 1-2 orders of magnitude slower than compiled code. The reason is that interpreters are continually figuring out what to do next, and then forgetting, while compiled code just knows.
If you are in this situation, then it may make sense to pay the price of translating so as to get the speed of compiled code.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Pitfalls in C# for a new user. (FWHM calculation) - c#

Related

Reducing a BigInteger value in C#

Force C# SolverFoundation SimplexSolver to find an int solution

Check how many instances of strings in a list

Standard deviation of generic list? [duplicate]

Dynamic compilation for performance

Categories

Resources