TSM - Spatial Data in the SQL Server

Diana Muntea - Software Developer @ Yardi România

Spatial data is used to represent information about the location and shape of geometric objects. These objects can be the center point locations or more complex structures: roads, rivers, cities or countries.

Beginning with 2008, the variety of SQL Server products from Microsoft offers support for geospatial data. This allows storing of spatial data types within tables as points, lines, and polygons. It also offers a large variety of functions for managing the data, as well as spatial indexes to run queries more efficiently.

In the SQL Server, spatial data can be of two types:

Both types of data are implemented using .NET common language runtime (CLR).

SRID - Spatial Reference Identifier - corresponds to a spatial reference system based on the specific ellipsoid used for either flat-earth mapping or round-earth mapping. The identifier is defined by the European Petroleum Survey Group (EPSG) standard. A column may contain objects with different SRID, but we cannot perform operations between objects with different SRID (not based on the same unit of measurement, datum, and projection). The most common measurement unit is the meter or the square meter. For geometrical data, the implicit value for SRID is zero and for the geographical ones, it is 4326 (it is also used by Google Maps API).

Available objects for the geometrical and geographical data types

Picture MSDN

SQL Server has several functions and methods that allow us to manage spatial data types: for importing data objects (STGeomFromText, STGeomFromWKB), for making different types of operations (STContains, STOverlaps, STUnion, STIntersection) or for making different measurements (STArea, STDistance), including methods to help us identify the nearest neighbor (STDistance(\@me)). Starting with SQL Server 2012, FullGlobe data type is defined: it represents a polygon that covers the entire globe. This polygon has an area, but it has no boundaries.

Examples

Date Geometrice

CREATE TABLE myTable (
    id int IDENTITY (1,1),
    geometryData geometry, 
GO

INSERT INTO myTable (geometryData)
VALUES (geometry::STGeomFromText('LINESTRING (100 100, 20 180, 180 180)', 0));

INSERT INTO myTable (geometryData)
VALUES (geometry::STGeomFromText('POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))', 0));
GO

SELECT @geom1 = geometryData FROM myTable WHERE id = 1;
SELECT @geom2 = geometryData FROM myTable WHERE id = 2;
SELECT @result = @geom1.STIntersection(@geom2);

Geographical Data Type

CREATE TABLE myTable (
    id int IDENTITY (1,1),
    geographyData geography, 
GO

INSERT INTO myTable (geographyData)
   VALUES (geography::STPolyFromText('POLYGON((-73.9998722076416 40.726185523600634,-74.00708198547363 40.73860807461818,-73.99824142456055 40.7466717351717,-73.97326469421387 40.74628158055554,-73.97309303283691 40.7269010214160, -73.9998722076416 40.726185523600634))', 4326));

What type of data should I choose for my application? (Geometry vs Geography)?

The type of data you choose depends on the application and its purpose. From the point of view of data storing, there is no difference between the two types of spatial data. But if we check performance levels, geometrical data queries are much faster. In the end, the most important argument is functionality. If we have an application for measuring the distance between different locations, or other operations where we need to take into account the shape of the Earth, we will need to use geographical data. In other cases, for example if we only need to visualize different polygons, geometrical data might be enough.

Applications

Radius search

Let's assume that we have a collection of points, determined by latitude and longitude, representing different locations. This type of search implies drawing a circle, determined by a center and a radius measured using a certain measurement unit (meters). In this case we can only use geographical data given that the search criterion is the distance between two points.

The optimum option would be saving the points in three columns: latitude, longitude, and geographical point. This way, before applying spatial filtering, we can filter the data by using the bounding box of the circle.

 geoPoint = geography::STGeomFromText('POINT (-96.8501 32.7639)', 4326)

SELECT * from myTable
WHERE latitude < 32.7871617669569 
AND latitude > 32.7500254131114 
AND longitude < -96.8143320623701 
AND longitude > -96.8584966119462 
AND geoPoint.STDistance(
  geography::STGeomFromText(
  'POINT(-96.836414337158146 32.768593590034193)',
   4326)) <= 2067

References

https://msdn.microsoft.com/en-us/library/bb933790.aspx

http://en.wikibooks.org/wiki/Geospatial_Data_in_SQL_Server

http://www.dotnetsolutions.co.uk/working-with-spatial-data-in-sql-server-2008/

https://devjef.wordpress.com/2013/01/12/geometry-vs-geography/