How can I calculate a fixed payment amount for a loan term that has two different interest rates based on how long the loan has been open?
This gets a little ugly, so please bear with me.
Define:
g1 = Initial monthly rate (for 3% annual, g1 = 0.03/12).
g2 = Second monthly rate.
T1 = Term for the initial rate (T1 = 3 for 3 months).
T2 = Term for the subsequent rate.
u1 = 1 / (1 + g1)
u2 = 1 / (1 + g2)
Then the payment per unit of loan amount (multiply by the principal) is:
payment = g1 * g2 / (g1 * u1^T1 * (1 - u2^T2) + g2 * (1 - u1^T1))
Of course, I may have made a mistake, but that seems right.
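For what it's worth, here is a minimal C# sketch of that formula (my own translation, not tested against a real amortization system; the Pow helper is mine, since decimal has no Math.Pow):

using System;

class TwoRatePayment
{
    // Fixed payment for a loan at monthly rate g1 for the first T1 months
    // and g2 for the remaining T2 months, per the formula above.
    static decimal Payment(decimal principal, decimal g1, decimal g2, int T1, int T2)
    {
        decimal u1 = 1m / (1m + g1);
        decimal u2 = 1m / (1m + g2);
        decimal u1T1 = Pow(u1, T1);
        return principal * g1 * g2 / (g1 * u1T1 * (1m - Pow(u2, T2)) + g2 * (1m - u1T1));
    }

    // decimal exponentiation by repeated multiplication (no Math.Pow for decimal).
    static decimal Pow(decimal x, int n)
    {
        decimal result = 1m;
        for (int i = 0; i < n; i++) result *= x;
        return result;
    }

    static void Main()
    {
        // Example: $10,000 at 3% annual for 3 months, then 6% for 57 months.
        Console.WriteLine(Payment(10000m, 0.03m / 12m, 0.06m / 12m, 3, 57));
    }
}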
This is a pretty complicated calculation that is usually part of a company's intellectual property, so I doubt anyone is going to post code. I've been down this road, and it requires a huge amount of testing depending on how far you decide to go with it.
When performing the calculations in code it is critical that you use a data type such as decimal instead of a floating-point type like double. Decimal was explicitly created for this kind of money calculation; floating-point types will cause many rounding errors, making the calculated values off by unacceptable amounts.
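A quick illustration of the kind of error involved (the classic 0.1 + 0.2 case):

double d = 0.1 + 0.2;
Console.WriteLine(d == 0.3);    // False: d is actually 0.30000000000000004
decimal m = 0.1m + 0.2m;
Console.WriteLine(m == 0.3m);   // True: decimal represents 0.3 exactly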
Next, the mortgage calculators that you find online vary widely in quality. When testing your method it will be useful to see what the online calculators come up with, but by no means consider them more accurate than yours. Generally they are good for checking that you are in the right ballpark, but they could be off by as much as 0.1% per year of the loan term.
Final note
Consider purchasing a library from a company like Math Corp instead of rolling your own. I'm pretty sure it'll be accurate AND much cheaper than the dev / qa time to get yours right.
Loan contracts are very complex. If you don't want to dive into the complexity you have to make some simplifying assumptions. Here are some of the variables you need to consider:
What is the base rate? Does the loan float over Prime? Libor? CMT?
What is the margin above the base rate?
How often does the base rate reset?
What happens if the reset date falls on a holiday? A weekend?
Are there ceilings or floors on the base rate?
Is there an initial period during which the base rate is fixed before the first reset? How long is that period?
Is there an initial discount on the margin that is later adjusted (a teaser rate)?
What's the term of the mortgage?
Is it a negative-amortization mortgage? What's the stop period on the negative-amortizing payments?
Is it a fully-amortizing mortgage?
Is it a balloon mortgage?
Is the interest simple interest or compounded interest? If the latter, what's the compounding frequency?
As you can see, you haven't specified enough about the problem you are trying to solve to even begin to come up with a solution.
If you're not a domain expert on ARMs or financial products in general I strongly encourage you to find someone who is.
The pmt function is based on this math:
Payment = (Loan Amount at current time) * (current rate) / (1 - (1 / (1 + current rate)^(num periods remaining)))
Figuring out the loan amount at the current time (i.e. after five years of making a payment at a different rate) is the tough part.
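One way to get it is a sketch like the following (my own, not the internals of any particular pmt implementation; double is used for brevity, though the earlier advice about decimal for money still applies). The balance after k payments of M at monthly rate r has a closed form, and the new payment then follows from the formula above:

// Remaining balance after k payments, closed-form annuity balance:
// B(k) = P * (1 + r)^k - M * ((1 + r)^k - 1) / r
static double RemainingBalance(double principal, double r, double payment, int k)
{
    double growth = Math.Pow(1 + r, k);
    return principal * growth - payment * (growth - 1) / r;
}

// New fixed payment for the remaining term at the new rate.
static double Pmt(double balance, double r, int periodsRemaining)
{
    return balance * r / (1 - Math.Pow(1 + r, -periodsRemaining));
}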
Here is a real-world question: if I have a product A with a unit price of $25.8848 and I ordered 77 units of it, how should I calculate the subtotal and total?
I am actually getting confused about rounding issues. The code I have written is this:
Math.Round(item.Price, 2) * item.Qty
This should give me the correct subtotal, and summing the other items will give me the correct total. Is the above code correct in terms of rounding? Or should it be done like this:
Math.Round(item.Price * item.Qty, 2)
I just want to know how the subtotal price rounding is done in real world.
Rounding should be applied per line item, so the second method you posted is correct: each line item subtotal should be rounded. Here is an example; although it is coded in Ruby, you can get the idea:
http://makandracards.com/makandra/1505-invoices-how-to-properly-round-and-calculate-totals
Either way is correct as far as C# is concerned. This is really a question about your company's accounting and business rules. Ideally this would be defined in the requirements document when the application was designed.
That being said, I would usually expect the second option to be correct: apply the rounding after the price is multiplied by the quantity.
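A sketch of the per-line approach (item and order are placeholder names; note that Math.Round defaults to banker's rounding, MidpointRounding.ToEven, while invoices usually expect half-up rounding):

decimal total = 0m;
foreach (var item in order.Items)
{
    // Round each line item's extended price, then sum the rounded subtotals.
    decimal subtotal = Math.Round(item.Price * item.Qty, 2,
                                  MidpointRounding.AwayFromZero);
    total += subtotal;
}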
I need to calculate PI with a predefined precision using this formula:

pi = sqrt(6 * (1/1^2 + 1/2^2 + 1/3^2 + ...))

So I ended up with this solution.
private static double CalculatePIWithPrecision(int precision)
{
    if (precision == 0)
    {
        return PI_ZERO_PRECISION;   // predefined constant elsewhere in the class
    }

    double sum = 0;
    double numberOfSumElements = Math.Pow(10, precision + 2);
    for (double i = 1; i < numberOfSumElements; i++)
    {
        sum += 1 / (i * i);
    }
    double pi = Math.Sqrt(sum * 6);
    return pi;
}
This works correctly, but I've run into an efficiency problem: it is very slow for precision values of 8 and higher.
Is there a better (and faster!) way to calculate PI using that formula?
double numberOfSumElements = Math.Pow(10, precision + 2);
I'm going to talk about this strictly in practical software engineering terms, avoiding getting lost in the formal math. Just practical tips that any software engineer should know.
First observe the complexity of your code: how long it takes to execute is strictly determined by this expression. You've written an exponential algorithm; the value you calculate goes up very rapidly as precision increases. You quoted the number where it gets uncomfortable: a precision of 8 produces 10^10, a loop that makes ten billion calculations. That's when computers start to take seconds to produce a result, no matter how fast they are.
Exponential algorithms are bad; they perform very poorly. You can only do worse with one that has factorial complexity, O(n!), which goes up even faster and is, unfortunately, the complexity of many real-world problems.
Now, is that expression actually accurate? You can check it with an "elbow test", using a practical back-of-the-envelope example. Let's pick a precision of 5 digits as a target and write it out:
1.0000 + 0.2500 + 0.1111 + 0.0625 + 0.0400 + 0.0278 + ... = 1.6433
You can tell that the additions rapidly get smaller, it converges quickly. You can reason out that, once the next number you add gets small enough then it does very little to make the result more accurate. Let's say that when the next number is less than 0.00001 then it's time to stop trying to improve the result.
So you'll stop at 1 / (n * n) = 0.00001 => n * n = 100000 => n = sqrt(100000) => n ~= 316
Your expression says to stop at 10^(5+2) = 10,000,000
You can tell that you are way off, looping entirely too often and not improving the accuracy of the result with the last 9.999 million iterations.
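A sketch of the same loop with the elbow test built in; keep in mind this is the back-of-the-envelope heuristic above, not a rigorous error bound, and (as explained next) double still caps the achievable precision:

private static double CalculatePiWithTolerance(int precision)
{
    double tolerance = Math.Pow(10, -precision);
    double sum = 0;
    for (double i = 1; ; i++)
    {
        double term = 1 / (i * i);
        if (term < tolerance)
        {
            break;   // the elbow: further terms no longer visibly improve the sum
        }
        sum += term;
    }
    return Math.Sqrt(sum * 6);
}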
Time to talk about the real problem. Too bad you didn't explain how you arrived at such a drastically wrong algorithm, but surely you discovered when testing your code that it just was not very good at calculating a more precise value for pi, and figured that by iterating more often you'd get a better result.
Do note that in this elbow test it is also very important that you are able to calculate the additions with sufficient precision. I intentionally rounded the numbers, as though they were calculated on a machine capable of performing additions with 5 digits of precision. Whatever you do, the result can never be more precise than 5 digits.
You are using the double type in your code. Directly supported by the processor, it does not have infinite precision. The one and only rule you ever need to keep in mind is that calculations with double are never more precise than 15 digits. Also memorize the rule for float, it is never more precise than 7 digits.
So no matter what value you pass for precision, the result can never be more precise than 15 digits. That is not useful at all; you already have the value of pi accurate to 15 digits: it is Math.PI.
The one thing you need to do to fix this is to use a type that has more precision than double. In fact, it needs to be a type with arbitrary precision, at least as accurate as the precision value you pass. Such a type does not exist in the .NET Framework. Finding a library that provides one is a common question at SO.
We build software that audits fees charged by banks to merchants that accept credit and debit cards. Our customers want us to tell them if the card processor is overcharging them. Per-transaction credit card fees are calculated like this:
fee = fixed + variable*transaction_price
A "fee scheme" is the pair of (fixed, variable) used by a group of credit cards, e.g. "MasterCard business debit gold cards issued by First National Bank of Hollywood". We believe there are fewer than 10 different fee schemes in use at any time, but we aren't getting a complete nor current list of fee schemes from our partners. (yes, I know that some "fee schemes" are more complicated than the equation above because of caps and other gotchas, but our transactions are known to have only a + bx schemes in use).
Here's the problem we're trying to solve: we want to use per-transaction data about fees to derive the fee schemes in use. Then we can compare that list to the fee schemes that each customer should be using according to their bank.
The data we get about each transaction is a data tuple: (card_id, transaction_price, fee).
transaction_price and fee are in integer cents. The bank rolls over fractional cents for each transaction until the cumulative amount is greater than one cent, and then a "rounding cent" is attached to the fees of that transaction. We cannot predict which transaction the "rounding cent" will be attached to.
card_id identifies a group of cards that share the same fee scheme. In a typical day of 10,000 transactions, there may be several hundred unique card_id's. Multiple card_id's will share a fee scheme.
The data we get looks like this, and what we want to figure out is the last two columns.
card_id   transaction_price   fee                      fixed   variable
========================================================================
12345     200                 22                       ?       ?
67890     300                 21                       ?       ?
56789     150                 8                        ?       ?
34567     150                 8                        ?       ?
34567     150                 9  <- "rounding cent"    ?       ?
34567     150                 8                        ?       ?
The end result we want is a short list like this with 10 or fewer entries showing the fee schemes that best fit our data. Like this:
fee_scheme_id   fixed   variable
================================
1               22      0
2               21      0
3               ?       ?
4               ?       ?
...
The average fee is about 8 cents. This means the rounding cents have a huge impact, and deriving the schemes requires a lot of data.
The average transaction is 125 cents. Transaction prices are always on 5-cent boundaries.
We want a short list of fee schemes that "fit" 98%+ of the 3,000+ transactions each customer gets each day. If that's not enough data to achieve 98% confidence, we can use multiple days' of data.
Because of the rounding cents applied somewhat arbitrarily to each transaction, this isn't a simple algebra problem. Instead, it's a kind of statistical clustering exercise that I'm not sure how to solve.
Any suggestions for how to approach this problem? The implementation can be in C# or T-SQL, whichever makes the most sense given the algorithm.
Hough transform
Consider your problem in image terms: If you would plot your input data on a diagram of price vs. fee, each scheme's entries would form a straight line (with rounding cents being noise). Consider the density map of your plot as an image, and the task is reduced to finding straight lines in an image. Which is just the job of the Hough transform.
You would essentially approach this by plotting one line for each transaction into a diagram of possible fixed fee versus possible variable fee, adding the values of lines where they cross. At the points of real fee schemes, many lines will intersect and form a large local maximum. By detecting this maximum, you find your fee scheme, and even a degree of importance for the fee scheme.
This approach will surely work, but might take some time depending on the resolution you want to achieve. If computation time proves to be an issue, remember that a Voronoi diagram of a coarse Hough space can be used as a classifier - and once you have classified your points into fee schemes, simple linear regression solves your problem.
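To make that concrete, here is a rough sketch (my own, not the answerer's code) of a coarse Hough-style accumulator over (fixed, variable) space; the grid bounds, step size, and sample data are illustrative assumptions:

using System;
using System.Collections.Generic;

class FeeSchemeHough
{
    static void Main()
    {
        // Sample (transaction_price, fee) pairs in cents; assumption, not real data.
        var transactions = new List<(int Price, int Fee)>
        {
            (200, 22), (300, 21), (150, 8), (150, 8), (150, 9), (150, 8),
        };

        const int maxFixedCents = 50;       // candidate fixed fees: 0..49 cents
        const int variableSteps = 500;      // candidate variable rates
        const double variableStep = 0.0005; // 0.05% per step, up to 25%

        var accumulator = new int[maxFixedCents, variableSteps];

        // Each transaction votes, for every candidate variable rate, for the
        // fixed fee that would explain it: fixed = fee - variable * price.
        foreach (var (price, fee) in transactions)
        {
            for (int v = 0; v < variableSteps; v++)
            {
                int fixedFee = (int)Math.Round(fee - v * variableStep * price);
                if (fixedFee >= 0 && fixedFee < maxFixedCents)
                {
                    accumulator[fixedFee, v]++;
                }
            }
        }

        // The strongest peak is the best-supported fee scheme; to extract
        // several schemes, remove the peak's votes and repeat.
        int bestF = 0, bestV = 0;
        for (int f = 0; f < maxFixedCents; f++)
            for (int v = 0; v < variableSteps; v++)
                if (accumulator[f, v] > accumulator[bestF, bestV])
                    (bestF, bestV) = (f, v);

        Console.WriteLine($"fixed = {bestF}c, variable = {bestV * variableStep:P2}");
    }
}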
Considering that the storage required by a processing query is of the same order of magnitude as a day's worth of transaction data, I assume that such storage is not a problem, so:
First pass: Group the transactions for each card_id by transaction_price, keeping card_id, transaction_price, and the average fee. This can easily be done in SQL. This assumes there are no outliers - but you can catch those after this stage if required. The resulting number of rows is guaranteed to be no higher than the number of raw data points.
Second pass: Per group, walk these new data points (with a cursor or in C#) and calculate the average value of b. Again, any outliers can be caught after this stage if desired.
Third pass: Per group, calculate the average value of a, now that b is known. This is basic SQL. Outliers can be caught as always.
If you decide to do the second step in a cursor you can stuff all that into a stored procedure.
Different card_id groups that use the same fee scheme can now be coalesced into fee schemes by rounding a and b to a sane precision and grouping again.
The Hough transform is the most general answer, though I don't know how one would implement it in SQL (rather than pulling the data out and processing it in a general purpose language of your choice).
Alas, the naive version is known to be slow if you have a lot of input data (1000 points is kinda medium sized) and if you want high precision results (scales as size_of_the_input / (rho_precision * theta_precision)).
There is a faster approach based on 2^n-trees, but there are few implementations out on the web to just plug in. (I recently did one in C++ as a testbed for a project I'm involved in. Maybe I'll clean it up and post it somewhere.)
If there is some additional order to the data you may be able to do better (i.e. do the line segments form a piecewise function?).
Naive Hough transform
Define an accumulator in (theta, rho) space spanning [-pi, pi) and [0, max(hypotenuse(x, y))] as a 2D array.

For each point in the input data:
    For each bin in theta:
        find the distance rho of the altitude from the origin to a line
        through (x, y) making angle theta with the horizontal:
            rho = x * cos(theta) + y * sin(theta)
        and increment the bin (theta, rho) in the accumulator.

Find the maximum bin in the accumulator; it represents the most line-like structure in the data.

if (theta != 0) { a = rho / sin(theta); b = -1 / tan(theta); }
Reliably getting multiple lines out of a single pass takes a little more bookkeeping, but it is not significantly harder.
You can improve the result a little by smoothing the data near the candidate peaks and fitting to get sub-bin precision, which should be faster than using smaller bins and should pick up the effect of the "rounding" cents fairly smoothly.
You're looking at the rounding cent as a significant source of noise in your calculations, so I'd focus on minimizing the noise due to that issue. The easiest way to do this IMO is to increase the sample size.
Instead of viewing your data as thousands of y = mx + b (+ rounding) equations, group your data into larger subsets:

If you combine X transactions with the same card_id and look at them as (sum of X fees) = (variable rate) * (sum of X transaction prices) + X * (fixed fee) (+ rounding), the rounding noise will largely fall by the wayside.
Get enough groups of size 'X' and you should be able to come up with a pretty close representation of the real numbers.
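A sketch of that idea (placeholder names and sample data; assumes the grouped card_ids share one scheme): aggregate per card_id so the rounding cents average out, then least-squares fit fee_sum = fixed * count + variable * price_sum across the groups:

using System;
using System.Collections.Generic;
using System.Linq;

class FeeSchemeFit
{
    record Txn(string CardId, int Price, int Fee);   // names are assumptions

    static void Main()
    {
        var txns = new List<Txn>
        {
            new("34567", 150, 8), new("34567", 150, 9), new("34567", 150, 8),
            new("56789", 250, 11), new("56789", 250, 10), new("56789", 250, 11),
        };

        // Aggregate per card_id: rounding cents largely cancel in the sums.
        var g = txns.GroupBy(t => t.CardId)
                    .Select(grp => (n: (double)grp.Count(),
                                    x: grp.Sum(t => (double)t.Price),
                                    y: grp.Sum(t => (double)t.Fee)))
                    .ToList();

        // Least squares for y = fixed*n + variable*x over the groups.
        double Snn = g.Sum(p => p.n * p.n), Snx = g.Sum(p => p.n * p.x);
        double Sxx = g.Sum(p => p.x * p.x);
        double Sny = g.Sum(p => p.n * p.y), Sxy = g.Sum(p => p.x * p.y);
        double det = Snn * Sxx - Snx * Snx;
        double fixedFee = (Sny * Sxx - Sxy * Snx) / det;
        double variable = (Snn * Sxy - Snx * Sny) / det;

        Console.WriteLine($"fixed = {fixedFee:F2}c, variable = {variable:P2}");
    }
}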
I am wondering what's the best type for a price field in SQL Server for a shop-like structure?
Looking at this overview we have data types called money, smallmoney, then we have decimal/numeric and lastly float and real.
Name, memory/disk-usage and value ranges:
Money: 8 bytes (values: -922,337,203,685,477.5808 to +922,337,203,685,477.5807)
Smallmoney: 4 bytes (values: -214,748.3648 to +214,748.3647)
Decimal: 9 [default, min. 5] bytes (values: -10^38 +1 to 10^38 -1 )
Float: 8 bytes (values: -1.79E+308 to 1.79E+308 )
Real: 4 bytes (values: -3.40E+38 to 3.40E+38 )
Is it really wise to store price values in those types? What about eg. INT?
Int: 4 bytes (values: -2,147,483,648 to 2,147,483,647)
Let's say a shop uses dollars: they have cents, but I don't see prices like $49.2142342, so using a lot of decimals to show cents seems a waste of SQL bandwidth. Secondly, most shops wouldn't show any prices near 200,000,000 (not in normal web shops at least, unless someone is trying to sell me a famous tower in Paris).
So why not go for an int?
An int is fast, it's only 4 bytes, and you can easily make decimals by saving values in cents instead of dollars and then dividing when you present the values.
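A small sketch of what that looks like on the C# side (the variable names are just illustrative):

int priceInCents = 4995;                 // as stored in the INT column
int qty = 3;
int lineTotalCents = priceInCents * qty; // exact integer arithmetic throughout
decimal display = lineTotalCents / 100m; // 149.85m: divide only for presentation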
The other approach would be to use smallmoney, which is 4 bytes too, but this will require the math part of the CPU to do the calculation, whereas int uses plain integer arithmetic; on the downside you will need to divide every single result.
Are there any currency-related problems with regional settings when using smallmoney/money fields? What do these map to in C#/.NET?
Any pros/cons? Should I go for integer prices, smallmoney, or something else?
What does your experience tell?
If you're absolutely sure your numbers will always stay within the range of smallmoney, use that and you can save a few bytes. Otherwise, I would use money. But remember, storage is cheap these days. The extra 4 bytes over 100 million records is still less than half a GB. As @marc_s points out, however, using smallmoney if you can will reduce the memory footprint of SQL Server.
Long story short, if you can get away with smallmoney, do. If you think you might go over the max, use money.
But do not use a floating-point type or you will get rounding issues and will start losing or gaining random cents, unless you deal with them properly.
My argument against using int: Why reinvent the wheel by storing an int and then having to remember to divide by 100 (10000) to retrieve the value and multiply back when you go to store the value. My understanding is the money types use an int or long as the underlying storage type anyway.
As far as the corresponding data type in .NET, it will be decimal (which will also avoid rounding issues in your C# code).
Use the Money datatype if you are storing money (unless modelling huge amounts of money like the national debt) - it avoids precision/rounding issues.
The Many Benefits of Money…Data Type!
USE NUMERIC / DECIMAL. Avoid MONEY / SMALLMONEY. Here's an example of why. Sooner or later the MONEY / SMALLMONEY types will likely let you down due to rounding errors. The money types are completely redundant and achieve nothing useful - a currency amount being just another decimal number like any other.
Lastly, the MONEY / SMALLMONEY types are proprietary to Microsoft. NUMERIC / DECIMAL are part of the SQL standard. They are used, recognised and understood by more people and are supported by most DBMSs and other software.
Personally, I'd use smallmoney or money to store shop prices.
Using int adds complexity elsewhere.
And 200 million is a perfectly valid price in Korean won or Indonesian rupiah, too...
The SQL data types money and smallmoney both resolve to the C# decimal type:
http://msdn.microsoft.com/en-us/library/system.data.sqltypes.sqlmoney(v=VS.71).aspx
So I'm thinking that you might as well go for decimal. Personally, I've been using double all my life working in the financial industry and haven't experienced performance issues. Actually, I've found that for certain calculations, having a larger data type allows for a higher degree of accuracy.
I would go for the money datatype. Individually you may not exceed the value in smallmoney, but it would be easy for multiple items to exceed it.
In my pawnshop app, the pawnshop operators lend from $5.00 to $10,000.00. When they calculate the loan amount they round it to the nearest dollar in order to avoid dealing with cents (the same applies to interest payments). When the loan amount is above $50.00 they round it to the nearest $5.00 (i.e. $50, $55, $60, ...), again to avoid running out of dollar bills. Therefore, I use DECIMAL(7,2) for transaction.calculated_loan_amount and DECIMAL(5,0) for transaction.loan_amount.
The app calculates the loan amount to the penny and places that amount in loan_amount where it gets rounded to the nearest dollar when below $50 or to the nearest $5.00 when greater.
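A sketch of that rounding rule as I read it (not the app's actual code):

// Below $50, round to the nearest dollar; at $50 and above, to the nearest $5.
static decimal RoundLoanAmount(decimal calculated)
{
    return calculated < 50m
        ? Math.Round(calculated, 0, MidpointRounding.AwayFromZero)
        : Math.Round(calculated / 5m, 0, MidpointRounding.AwayFromZero) * 5m;
}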
I am doing some population modeling (for fun, mostly to play with the concepts of carrying capacity and the logistic function). The model works with multiple planets (about 100,000 of them right now). When the population reaches carrying capacity on one planet, the inhabitants start branching out to nearby planets, and so on.
Problem: 100,000+ planets can house a LOT of people. More than a C# Decimal can handle. Since I'm doing averages and other stuff with these numbers, I need the capability to work with floating points (or I'd just use a BigInt library).
Does anyone know of a BigFloatingPoint class (or whatever) I can use? Google is being very unhelpful today. I could probably write a class that would work well enough, but I'd rather use something pre-existing, if such a thing exists.
Use units of megapeople to achieve more headroom.
Also, Decimal lets you have 100,000 planets each with 100000000000000 times the population of the Earth, if my arithmetic is right. Are you sure that's not enough?
Even if each planet has 100 billion people, the total is still only 1E16. This is well within the limit of a signed 64-bit integer (2^63 - 1 is 9,223,372,036,854,775,807, which is almost 1E19).
You could go with ninety thousand billion (9E13) people per planet, with 100,000 planets, before you got close to the limit...
As to fractions, averages and such, can't you convert to a float or double when you do any such calculations?
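For example, a sketch that keeps the exact totals in System.Numerics.BigInteger and drops to double only for derived statistics (planets and Population are placeholder names):

using System.Numerics;

BigInteger total = BigInteger.Zero;
foreach (var planet in planets)
{
    total += planet.Population;   // exact, arbitrary-size integer total
}
double average = (double)total / planets.Count;  // double is plenty for an average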
Do you really need 28 digit precision? Could you use floating point for some calculations?
(double to be exact: ±5.0e−324 to ±1.7e308)