I have a list of GPS locations in a MySQL database. The user enters a GPS coordinate in the application and should get back the nearest stored GPS coordinate.
I don't mind whether the distance calculation is based on "crow's flight" or anything else, but it should be fast enough to search thousands of GPS locations.
I would prefer a solution in C#; otherwise I will try to understand the logic and apply it myself.
There's one question on MySQL lat/long distance search: "Need help optimizing a lat/Lon geo search for mysql".
For C# distance calculation, most sites use the Haversine formula. Here's a C# implementation: http://www.storm-consultancy.com/blog/development/code-snippets/the-haversine-formula-in-c-and-sql/ (the page also has an MS SQL implementation).
/// <summary>
/// Returns the distance in miles or kilometers of any two
/// latitude / longitude points.
/// </summary>
/// <param name="pos1">Location 1</param>
/// <param name="pos2">Location 2</param>
/// <param name="unit">Miles or Kilometers</param>
/// <returns>Distance in the requested unit</returns>
public double HaversineDistance(LatLng pos1, LatLng pos2, DistanceUnit unit)
{
    double R = (unit == DistanceUnit.Miles) ? 3960 : 6371;
    var lat = (pos2.Latitude - pos1.Latitude).ToRadians();
    var lng = (pos2.Longitude - pos1.Longitude).ToRadians();
    var h1 = Math.Sin(lat / 2) * Math.Sin(lat / 2) +
             Math.Cos(pos1.Latitude.ToRadians()) * Math.Cos(pos2.Latitude.ToRadians()) *
             Math.Sin(lng / 2) * Math.Sin(lng / 2);
    var h2 = 2 * Math.Asin(Math.Min(1, Math.Sqrt(h1)));
    return R * h2;
}
public enum DistanceUnit { Miles, Kilometers };
For most queries you are probably OK splitting the work between C# and SQL:
first use MySQL to select "close" lat/lng points, e.g. where lat and lng are both within 1.0 degrees of your target,
then use C# to calculate a more accurate distance and to select "the best".
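The two-step split described above can be sketched as follows. This is only an illustration in Python, not the answerer's code: the function names `haversine_km` and `nearest` and the `box_deg` parameter are made up for the example, and in a real application the box filter would be the SQL WHERE clause rather than a list comprehension. The box test here also ignores wrap-around at the ±180 degree meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # same kilometre radius as the C# code above


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def nearest(target, candidates, box_deg=1.0):
    """Return the (lat, lon) candidate closest to target, or None if the box is empty."""
    lat0, lon0 = target
    # Step 1: cheap box filter (this is the part the database would do).
    close = [(lat, lon) for lat, lon in candidates
             if abs(lat - lat0) <= box_deg and abs(lon - lon0) <= box_deg]
    # Step 2: exact distance only for the points that survived the filter.
    return min(close, key=lambda p: haversine_km(lat0, lon0, p[0], p[1]), default=None)
```

If the box comes back empty, one option is to retry with a larger `box_deg` until at least one candidate is found.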
If you were using MS SQL 2008 then I'd recommend using the MS SQL geography types as these have built-in optimised indexing and calculation features - I see that MySQL also has some extensions - http://dev.mysql.com/tech-resources/articles/4.1/gis-with-mysql.html - but I've no experience with these.
What you're trying to do is called a nearest-neighbor search and there are many good data structures which can speed up these sorts of searches. You may want to look into kd-trees, for example, as they can give expected sublinear time (O(√ n) in two dimensions) queries for the point in a data set nearest to some arbitrary test point. They're also surprisingly easy to implement if you're comfortable writing a modified binary search tree.
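As a rough illustration of the kd-tree idea, here is a minimal 2-d tree in Python. It is a sketch under simplifying assumptions: points are `(lat, lon)` tuples, the tree is built once from a static list, and "nearest" means planar Euclidean distance on the raw degrees, which is only an approximation of great-circle distance. The names `build_kdtree` and `nearest_neighbor` are invented for the example.

```python
import math


def build_kdtree(points, depth=0):
    """Build a 2-d tree as nested tuples: (point, left_subtree, right_subtree)."""
    if not points:
        return None
    axis = depth % 2  # alternate between the first and second coordinate
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def nearest_neighbor(node, target, depth=0, best=None):
    """Return the stored point closest to target by planar (Euclidean) distance."""
    if node is None:
        return best
    point, left, right = node
    if best is None or math.dist(point, target) < math.dist(best, target):
        best = point
    axis = depth % 2
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest_neighbor(near, target, depth + 1, best)
    # Only descend into the far side if the splitting line is closer than the best so far.
    if abs(diff) < math.dist(best, target):
        best = nearest_neighbor(far, target, depth + 1, best)
    return best
```

The pruning test in the last step is what lets a query skip most of the tree; without it the search degenerates into a linear scan.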
Note that Euclidean geometry (a^2+b^2=c^2) is not exact on a sphere, but for small regions of the Earth it may be a good enough approximation.
Otherwise: http://en.wikipedia.org/wiki/Great-circle_distance
If you have coordinate data stored in a database, you might want to query the database directly, especially if there is a large amount of data. However, you need specific database support for that (normal indexes do not help). I know MSSQL supports geography data. I have not tested MySQL, but its online documentation seems to suggest there is similar support. Once you have a spatially aware database, you get your results with a simple query.
I have this class that generates synthetic looking (stock) data and it works fine. However, I want to modify it so that NewPrice generates smooth trending data for say n-bars.
I know that if I reduce the volatility, I get smoother prices. However, I am not sure how to guarantee that the data goes into alternating persistent trends, up and then down. Something that looks like a sine wave, but with stock-like prices, i.e. no negative prices.
Price = Trend + Previous Price + Random Component. I am missing the trend component in the implementation below.
Any suggestions?
class SyntheticData
{
    public static double previous = 1.0;
    public static double NewPrice(double volatility, double rnd)
    {
        var change_percent = 2 * volatility * rnd;
        if (change_percent > volatility)
            change_percent -= (2 * volatility);
        var change_amount = previous * change_percent;
        var new_price = previous + change_amount;
        previous = new_price;
        return new_price;
    }
}
SyntheticData.previous = 100.0;
var price = SyntheticData.NewPrice(.03, rnd.NextDouble());
Exponential smoothing or exponential moving average will create the type of data you want. Ideally, you would have existing stock price data that represents the type of time series that you want to generate. You fit an exponential smoothing model to your data. This will determine a number of parameters for that model. You can then use the model and its parameters to generate similar time series with the same kind of trends, and you can control the volatility (standard deviation) of the random variable associated with the model.
As an example of what you can do, in the image below the blue and yellow parts are from real data, and the green part is synthetic data generated with a model that was fit to the real data.
Time series forecasting is a large topic, and I do not know how familiar you are with it. See Time Series Analysis; it covers a large range of time series and provides clear presentations and examples in Excel. See exponential smoothing for more theoretical background.
Here is a specific example of how such a time series can be generated. I chose one of the 30 exponential smoothing models, one that has additive trend and volatility, and no seasonal component. The equations for generating the time series are:
The time index is t, an integer. The values of the time series are y_t. l_t and b_t are respectively the offset and slope components of the time series. Alpha and beta are parameters, and l_{-1} and b_{-1} are initial values of the offset and slope components. e_t is the value of a random variable that follows some distribution, e.g. normal. Alpha and beta must satisfy the relations below for stability of the time series.
To generate different time series you choose values for alpha, beta, l_{-1}, b_{-1}, and the standard deviation of e_t assuming a normal law, and calculate the successive values of y_t. I have done this in Excel for several combinations of values. Here are several time series generated with this model. Sigma is the standard deviation (volatility) of e_t.
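The generation procedure described above can be sketched in a few lines of Python. The recurrences in the docstring are an assumption: they are the commonly published state-space form of the additive-trend, no-seasonality model, and may differ in notation from the equations the answer showed. The function name and the `level0`/`slope0` arguments (the initial offset and slope) are made up for the example.

```python
import random


def generate_series(n, alpha, beta, level0, slope0, sigma, seed=None):
    """Simulate n values of an additive-trend exponential smoothing model.

    Assumed recurrences, with e_t drawn from Normal(0, sigma):
        y_t = l_{t-1} + b_{t-1} + e_t
        l_t = l_{t-1} + b_{t-1} + alpha * e_t
        b_t = b_{t-1} + beta * e_t
    """
    rng = random.Random(seed)
    level, slope = level0, slope0
    series = []
    for _ in range(n):
        e = rng.gauss(0.0, sigma)
        series.append(level + slope + e)
        # Update both state components from the same random shock.
        level, slope = level + slope + alpha * e, slope + beta * e
    return series
```

For example, `generate_series(250, 0.5, 0.1, 100.0, 0.2, 1.0, seed=42)` gives one reproducible path. Because the model is additive, nothing prevents values from going negative; one possible workaround is to simulate the logarithm of the price and exponentiate the result.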
Here are the equations for the 30 models. N means no trend / seasonal component. A means additive component. M means multiplicative component. The d subscript indicates a variant that is damped. You can get all of the details from the references above.
Something like this is what I was looking for:
public static double[] Sine(int n)
{
const int FS = 64; // sampling rate
return MathNet.Numerics.Generate.Sinusoidal(n, FS, 1.0, 20.0);
}
Although it is not intuitive for someone who wants to work with prices and time-based periodicity rather than with mathematical functions.
https://numerics.mathdotnet.com/Generate.html
I can use the following SQL to calculate the distance between a fixed location and the location against the venues in the database.
SELECT Location.STDistance(geography::Point(51, -2, 4326)) * 0.00062137119 FROM Venues
Please note the distance returned is in miles and the Location field is a geography type.
I was wondering what is the equivalent of this in .NET which would return the same values. This method would have the following signature:
public static double Distance(double location1Latitude, double location1Longitude, double location2Latitude, double location2Longitude) {
return ...;
}
I know I could call the database method in .NET but I don't wish to do this. I'm hoping there is a formula to calculate the distance. Thanks
I believe you can simply add Microsoft.SqlServer.Types.dll as a reference and then use the SqlGeography type like any other .NET type, including calling its STDistance method.
You would need to compute the Geographical distance to compute the distance manually. There are many different techniques and formulas to do this, each with different underlying assumptions (ie: a spherical earth, ellipsoidal earth, etc).
A common option is the haversine formula, with a C# implementation available here.
This is very well explained here.
In short: with EF5 (more specifically, with .NET 4.5) Microsoft included the type DbGeography. Let's say you already have a bunch of lat/long values; you can then easily create a DbGeography object using a helper like:
public static DbGeography CreatePoint(double latitude, double longitude)
{
var text = string.Format(CultureInfo.InvariantCulture.NumberFormat,
"POINT({0} {1})", longitude, latitude);
// 4326 is most common coordinate system used by GPS/Maps
return DbGeography.PointFromText(text, 4326);
}
Once you have two or more points (DbGeography), you have everything you need to calculate the distance between them:
var distance = point1.Distance(point2);
How can I store a rectangle consisting of 2 points, NorthEast and SouthWest, where each point is a latitude/longitude coordinate?
And also a circle consisting of a center (lat-lng) and a radius (int/float value)?
What is the best way to store these and later query whether a lat-lng is within the bounds of any circle or rectangle?
Also, can I store an array of those, say 10 rectangles and 5 circles, in a single record?
Can I use NHibernate to ease the pain?
Sorry if this seems noobish; I have never done anything with spatial data and I don't even have a clue where to start.
Any samples and pointers are helpful !
Thanks in advance.
Here's how I would approach this problem using TSQL.
For a rectangle, the simplest method is to extrapolate the extra 2 points by using the relevant coordinates from the original points. e.g.
NorthEast (lat1, lon1) NorthWest* (lat1, lon2)
SouthEast* (lat2, lon1) SouthWest (lat2, lon2)
*New point
That doesn't give you a true rectangle (in a mathematical sense), but it's a common method in GIS (it's how geohashes are formed). What you get is a rough rectangle whose size varies with the distance from the equator. If you need an exact rectangle of a certain height/width, you should look into using the Haversine formula to calculate the remaining 2 points; that will take into account bearing and great-circle distance.
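For the "exact size" case, the calculation usually meant here is the spherical destination-point formula: given a start point, an initial bearing and a distance, it returns the point you arrive at. The Python below is a sketch of that formula on a spherical Earth of radius 6371 km; the function name `destination` is invented for the example, and an ellipsoidal model would give slightly different results.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical-Earth assumption


def destination(lat, lon, bearing_deg, distance_km):
    """Point reached from (lat, lon) after distance_km along an initial bearing."""
    phi1, lmb1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lmb2 = lmb1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    lon2 = (math.degrees(lmb2) + 540) % 360 - 180  # normalise to [-180, 180)
    return math.degrees(phi2), lon2
```

Starting from one known corner, calling this with bearings of 90 and 180 degrees and the desired width and height gives the neighbouring corners of a rectangle with fixed side lengths.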
http://www.movable-type.co.uk/scripts/latlong.html
To store the rectangle, I'd create a SQL table with a GEOGRAPHY type column. This will allow you to assign additional attributes (e.g. name), and a spatial index on that column will make future queries much faster.
CREATE TABLE dbo.geographies
(
NAME VARCHAR(50)
,GEOG GEOGRAPHY
)
INSERT INTO dbo.geographies (NAME, GEOG)
VALUES ('Rectangle', geography::STPolyFromText('POLYGON((lon1 lat1, lon2 lat1, lon2 lat2, lon1 lat2, lon1 lat1))', 4326))
Note that both the first point and the last point are the same, this is required to 'close' the polygon, and the final number denotes the SRID, or coordinate system, in this case WGS84. You can reference this page: http://msdn.microsoft.com/en-us/library/bb933971
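If the corners come from application code, the WKT string can be assembled programmatically. The helper below is a hypothetical Python sketch (the name `rectangle_wkt` is not from any library) that derives the two extra corners and closes the ring, using the same longitude-first ordering and the same vertex order as the INSERT above. That order runs counter-clockwise, which, as far as I know, is what SQL Server's geography type expects for an exterior ring.

```python
def rectangle_wkt(ne, sw):
    """Build a closed POLYGON WKT string from (lat, lon) NorthEast and SouthWest corners."""
    lat1, lon1 = ne
    lat2, lon2 = sw
    # NE -> NW -> SW -> SE -> back to NE; WKT lists longitude before latitude.
    ring = [(lon1, lat1), (lon2, lat1), (lon2, lat2), (lon1, lat2), (lon1, lat1)]
    return "POLYGON((" + ", ".join(f"{lon!r} {lat!r}" for lon, lat in ring) + "))"
```

The resulting string would be passed to the database as a parameter rather than concatenated into the SQL text.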
As to the circle, it's simple to store a point and then use the radius to apply a buffer around the point:
INSERT INTO dbo.geographies (NAME, GEOG)
VALUES ('Circle with Radius', geography::STPointFromText('POINT(lon lat)', 4326).STBuffer([radius]))
Note that the buffer takes its input in meters so you may need to apply a conversion, more notes on this page: http://msdn.microsoft.com/en-us/library/bb933979
Now the fun part: it's quite easy to check for intersection on a point using the STIntersects method.
http://msdn.microsoft.com/en-us/library/bb933962.aspx
DECLARE @point GEOGRAPHY = geography::STPointFromText('POINT(lon lat)', 4326)
SELECT * FROM dbo.geographies
WHERE @point.STIntersects(GEOG) = 1
The code sample takes a point and returns a list of all the geographies that the point is found within. It's important that the SRIDs of the new point and the geographies in the table match; otherwise you'll get zero matches (and probably pound your head against a wall for a while until you realize your mistake, at least, that's what I do).
As to integrating this with C#, I'm not sure how much help I can be, but it shouldn't be too much of a challenge to return the SQLGeography type
http://msdn.microsoft.com/en-us/library/microsoft.sqlserver.types.sqlgeography.aspx
Hopefully this at least points you in the right direction.
I am currently working on an application that will retrieve other users' locations based on distance.
I have a database that store all the user location information in latitude and longitude.
Since the calculation of distance between these two pairs of latitude and longitude is quite complicated, I need a function to handle it.
from a in db.Location.Where(a => (calDistance(lat, longi, Double.Parse(a.latitude), Double.Parse(a.longitude)))<Math.Abs(distance) )) {...}
However, I got the following error: LINQ to Entities does not recognize the method and this method cannot be translated into a store expression.
I don't know how to translate it into a store expression, and the calculation also needs the math library.
Is there any way to let the LINQ expression call my own function?
Maybe there are other ways to achieve my goal, can anyone help?
LINQ to Entities won't allow you to call an arbitrary function; it doesn't even allow ToString().
This is not a LINQ thing; it's a LINQ to Entities restriction.
You could put your code into the database as a stored procedure or function and call it using ExecuteStoreQuery.
See "Does Entity Framework Code First support stored procedures?"
I don't really know LINQ, but assuming that you can only send simple constraints in the query, I would construct a method that basically does the inverse of calDistance - take a coordinate and a distance and convert it into a bounding box with a minimum longitude, maximum longitude, minimum latitude, and maximum latitude.
You should be able to construct a simple query that will serve your purposes with those constraints.
something like (using Java here):
public double[] getCoordinateBounds(double distance, double longitude, double latitude) {
double[] bounds = new double[4];
bounds[0] = longitude - distanceToLongitudePoints * (distance);
bounds[1] = longitude + distanceToLongitudePoints * (distance);
bounds[2] = latitude - distanceToLatitudePoints * (distance);
bounds[3] = latitude + distanceToLatitudePoints * (distance);
return bounds;
}
Then you could construct a query.
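double[] bounds = getCoordinateBounds(distance, longi, lat);
// Copy to locals first; this assumes latitude/longitude are numeric columns.
double minLon = bounds[0], maxLon = bounds[1], minLat = bounds[2], maxLat = bounds[3];
var nearbyUserLocations = from a in db.Location
where a.longitude > minLon && a.longitude < maxLon
&& a.latitude > minLat && a.latitude < maxLat
select a;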
This would give you a box of points rather than a radius of points, but it would be few enough points that you could then process them and throw out the ones outside your radius. Or you might decide that a box is good enough for your purposes.
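The Java snippet above leaves `distanceToLongitudePoints` and `distanceToLatitudePoints` undefined. One way to fill them in, sketched here in Python under a spherical-Earth assumption (radius 6371 km, distance in kilometres), is to convert the distance to degrees of latitude and then widen the longitude range by 1/cos(latitude). The function name is made up, and the box only approximately encloses the search circle, so an exact distance check on the results is still needed.

```python
import math

KM_PER_DEGREE = math.pi * 6371.0 / 180.0  # length of one degree of latitude on a sphere


def coordinate_bounds(distance_km, longitude, latitude):
    """Return (min_lon, max_lon, min_lat, max_lat) for a box around the search circle."""
    dlat = distance_km / KM_PER_DEGREE
    # A degree of longitude shrinks with cos(latitude); guard against the poles.
    cos_lat = max(math.cos(math.radians(latitude)), 1e-12)
    dlon = distance_km / (KM_PER_DEGREE * cos_lat)
    return (longitude - dlon, longitude + dlon, latitude - dlat, latitude + dlat)
```

The tuple uses the same ordering as the `bounds` array in the Java version, so the four values map directly onto the query constraints.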
The problem you see is that the LINQ provider is trying to translate your user-defined function into T-SQL and it cannot. One (albeit nasty) option is to retrieve all of your locations and then calculate from that result set.
var locations = db.Location.ToList();
locations = locations.Where(a => calDistance(lat, longi, Double.Parse(a.latitude), Double.Parse(a.longitude)) < Math.Abs(distance)).ToList();
I have a location (latitude & longitude). How can I get a list of zipcodes that are either partially or fully within the 10 mile radius of my location?
The solution could be a call to a well known web service (google maps, bing maps, etc...) or a local database solution (the client has sql server 2005) or an algorithm.
I have seen the somewhat similar question, but all the answers there pretty much pertain to using SQL Server 2008 geography functionality which is unavailable to me.
Start with a zip code database that contains zipcodes and their corresponding latitude and longitude coordinates:
http://www.zipcodedownload.com/Products/Product/Z5Commercial/Standard/Overview/
To get the distance between latitude and longitude, you will need a good distance formula. This site has a couple variations:
http://www.meridianworlddata.com/distance-calculation/
The "Great Circle Distance" formula is a little extreme. This one works well enough from my experience:
sqrt(x * x + y * y)
where x = 69.1 * (lat2 - lat1)
and y = 69.1 * (lon2 - lon1) * cos(lat1/57.3)
Your SQL Query will then look something like this:
select zd.ZipCode
from ZipData zd
where
sqrt(
square(69.1 * (zd.Latitude - @Latitude)) +
square(69.1 * (zd.Longitude - @Longitude) * cos(@Latitude/57.3))
) < @Distance
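For comparison, the same flat-earth approximation can be written outside the database. This Python sketch mirrors the formula above, where 69.1 is roughly the number of miles per degree and 57.3 the number of degrees per radian. The function names and the `(zip_code, lat, lon)` row layout are assumptions made for the example.

```python
import math


def approx_miles(lat1, lon1, lat2, lon2):
    """Flat-earth distance in miles between two points given in degrees."""
    x = 69.1 * (lat2 - lat1)
    y = 69.1 * (lon2 - lon1) * math.cos(lat1 / 57.3)
    return math.sqrt(x * x + y * y)


def zips_within(zip_rows, lat, lon, miles):
    """Keep the zip codes whose centroid lies within the given radius."""
    return [z for z, zlat, zlon in zip_rows
            if approx_miles(lat, lon, zlat, zlon) < miles]
```

At a 10 mile radius the error from ignoring the Earth's curvature is small, which is why this shortcut tends to be acceptable for this kind of search.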
Good luck!
Firstly, you'll need a database of all zipcodes and their corresponding latitudes and longitudes. In Australia, there are only a few thousand of these (and the information is easily available), however I assume it's probably a more difficult task in the US.
Secondly, given you know where you are, and you know the radius you are looking for, you can look up all zipcodes that fall within that radius. Something simple written in PHP would be as follows: (apologies it's not in C#)
function distanceFromTo($latitude1,$longitude1,$latitude2,$longitude2,$km){
$latitude1 = deg2rad($latitude1);
$longitude1 = deg2rad($longitude1);
$latitude2 = deg2rad($latitude2);
$longitude2 = deg2rad($longitude2);
$delta_latitude = $latitude2 - $latitude1;
$delta_longitude = $longitude2 - $longitude1;
$temp = pow(sin($delta_latitude/2.0),2) + cos($latitude1) * cos($latitude2) * pow(sin($delta_longitude/2.0),2);
$earth_radius = 3956;
$distance = $earth_radius * 2 * atan2(sqrt($temp),sqrt(1-$temp));
if ($km)
$distance = $distance * 1.609344;
return $distance;
}
Most searches work with centroids. In order to handle zipcodes that are only partially within the 10 miles, you are going to have to buy a database of zipcode polygons (*). Then implement an algorithm which checks for zipcodes with vertices within your 10 mile radius. To be done properly, you would use the Haversine formula for the distance measurement. With some clever data structures, you can significantly reduce the search space. Similarly, searches can be greatly sped up by storing and initially comparing against zipcode extents (North, West, East, South).
(*) Note: Technically zipcodes are NOT polygons! I know we all think of them like that, but really they are collections of data points (street addresses) and this is how the USPS really uses them. This means zipcodes can include other zipcodes; zipcodes can be made of multiple "polygons"; and zipcodes can overlap other zipcodes. Most of these situations should not be a problem, but you will have to handle zipcodes that can be defined as multiple polygons.
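The extent comparison mentioned above can be sketched as a simple box-overlap test that runs before any polygon work. This Python example is illustrative only: the `(north, west, east, south)` tuple layout, the function name and the 69.1 miles-per-degree constant (taken from the approximation earlier in this thread) are assumptions. It can return false positives, so zipcodes that pass still need the exact polygon check.

```python
import math

MILES_PER_DEGREE = 69.1  # approximate length of one degree of latitude


def extent_may_intersect(extent, lat, lon, radius_miles):
    """Cheap test: can a zipcode's (north, west, east, south) extent touch the circle?"""
    north, west, east, south = extent
    dlat = radius_miles / MILES_PER_DEGREE
    cos_lat = max(math.cos(math.radians(lat)), 1e-12)  # guard against the poles
    dlon = radius_miles / (MILES_PER_DEGREE * cos_lat)
    # Two boxes overlap unless one lies entirely to one side of the other.
    return not (south > lat + dlat or north < lat - dlat or
                west > lon + dlon or east < lon - dlon)
```

A zipcode made of several polygons would store one extent per polygon, or a single extent covering all of them.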