Before PostGIS – 2000
Adding spatial capabilities to PostgreSQL is a project that seems redundant – after all, PostgreSQL already has “geometric types”. Unfortunately the native PostgreSQL geometric types, while interesting, are too limited for GIS data and analysis. They were built for academic research purposes and are more suitable for computer graphics than GIS use.
In 2000 and 2001, Refractions was doing data management and some systems work for a British Columbia government ministry. Most things could be done easily enough with file-based storage, but a couple of data sets changed often enough that managing the files was a pain – a versioned database would be so much easier, but how to store the spatial data?
The OpenGIS “Simple Features for SQL” document seemed to provide some basic recipes: normalize the geometries into the relational model; or, create a “geometry” object. The path of least resistance was to simply use the relational model, so we implemented that, and then wrote some scripts to load data.
Unfortunately, the overhead involved in tearing geometries apart for relational storage, and then putting them together again when extracting them was very high – the system was slow. So, in 2001, we turned our attention to building a geometry object instead.
Early PostGIS – May 2001
Government contracting has a certain rhythm, and the rhythm of the British Columbia government is dictated by the March 31 fiscal year-end. Prior to the year-end, consultants are very busy, and after the year-end, very idle. In 2001, we spent our after-fiscal idle time on the problem of building a spatial database in PostgreSQL using a custom geometry type of our own invention.
We were lucky to be using PostgreSQL at this point, because the ability to add “custom types” is a core part of the PostgreSQL design, with both documentation and examples available in the contrib directory of the source code.
The performance of the first implementation of a PostGIS geometry type was better than we ever could have hoped. It was 100-times faster than the relational model to load and unload, and about 10-times faster than using the generic BLOB (binary large object) sub-system.
Now we needed a spatial index to allow fast queries of subsets of large tables. Fortunately, again, PostgreSQL already had examples available in the contrib directory of R-Tree bindings to the GiST indexing subsystem. With the index in place, we were able to demonstrate millisecond access to features in multi-million record spatial tables.
We now had the basic components of a spatial data storage and retrieval system:
- spatial objects and standards-based text representations (OpenGIS “well-known text”) for loading and unloading;
- some basic introspection functions like Length() and Area();
- a simple JDBC extension component for input/output in Java; and,
- a spatial index for fast random access of the data.
These components made up the first release of PostGIS, version 0.1, which was made public on May 31, 2001.
- 0.1 – May 31, 2001
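To make the first two components concrete, here is a minimal Python sketch of reading “well-known text” and computing a length and an area from it. It is an illustration only, not the PostGIS implementation (which is written in C inside PostgreSQL): the function names parse_wkt, wkt_length and wkt_area are hypothetical, and the parser assumes 2-d coordinates and handles only POINT, LINESTRING and single-ring POLYGON.

```python
import math
import re


def parse_wkt(wkt):
    """Parse a tiny subset of well-known text into (kind, [(x, y), ...])."""
    match = re.fullmatch(
        r"\s*(POINT|LINESTRING|POLYGON)\s*\((.*)\)\s*",
        wkt,
        re.IGNORECASE | re.DOTALL,
    )
    if match is None:
        raise ValueError(f"unsupported WKT: {wkt!r}")
    kind = match.group(1).upper()
    body = match.group(2).strip()
    if kind == "POLYGON":
        # Assumption: one outer ring only; polygons with holes are rejected.
        if not (body.startswith("(") and body.endswith(")")):
            raise ValueError("polygon ring must be parenthesized")
        body = body[1:-1]
    if "(" in body or ")" in body:
        raise ValueError("nested or multi-ring geometries are not supported")
    coords = []
    for pair in body.split(","):
        values = [float(v) for v in pair.split()]
        if len(values) != 2:
            raise ValueError("only 2-d coordinates are supported")
        coords.append((values[0], values[1]))
    return kind, coords


def wkt_length(wkt):
    """Sum of segment lengths of a LINESTRING."""
    kind, coords = parse_wkt(wkt)
    if kind != "LINESTRING":
        raise ValueError("length is only defined here for LINESTRING")
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(coords, coords[1:])
    )


def wkt_area(wkt):
    """Area of a closed single-ring POLYGON via the shoelace formula."""
    kind, coords = parse_wkt(wkt)
    if kind != "POLYGON":
        raise ValueError("area is only defined here for POLYGON")
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(coords, coords[1:])
    )
    return abs(twice_area) / 2.0
```

For example, wkt_length("LINESTRING(0 0, 3 4)") returns 5.0, and wkt_area("POLYGON((0 0, 4 0, 4 3, 0 3, 0 0))") returns 12.0.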
Released June 20, 2001, only three weeks after the initial cut, the 0.2 release primarily added documentation, so that users could more easily figure out how to build and use the software.
- 0.2 – June 20, 2001
The 0.2 release included a few functions for creating OpenGIS “well-known binary” formats as output from PostGIS. Frank Warmerdam noticed that these functions had exactly the same purpose as standard functions in the OpenGIS specification, but had different names. He suggested that PostGIS should follow the specification more exactly, and for 0.5, the old function names and all new functions were harmonized with the OpenGIS specification.
- 0.5 – July 21, 2001
In addition to the Mapserver support, the 0.5 release added a large number of the functions required by the OpenGIS specification. In fact, so much functionality was added that we decided to skip versions 0.3 and 0.4!
The 0.6 series of releases was focussed mainly on adding new functions in line with the OpenGIS specifications.
While the 0.6 series was coming out, a parallel development was occurring that would have a big effect on PostGIS. The British Columbia government had let a contract to develop an open source Java library of topological predicates, based on the same OpenGIS standards as PostGIS. The functions being added through 0.6 were the “low hanging fruit”; the really hard functions were the predicates and operators: Touches(), Intersects(), Contains(), Buffer(), Union() and Difference(). The first indication that this “JTS Topology Suite” was bearing fruit came in December 2001.
In 2002, the PostgreSQL team released version 7.2, which substantially changed the GiST index bindings used by previous PostGIS versions and forced us to release a new PostGIS series to support 7.2. The 0.7 series also included support for coordinate reference system transformations, and a more thorough approach to figuring out spatial index selectivity.
- 0.7.0 – May 5, 2002
- 0.7.1 – May 14, 2002
- 0.7.2 – September 4, 2002
- 0.7.3 – September 5, 2002
- 0.7.4 – February 13, 2003
While the 0.7 series was ticking along, a major development was underway that would finally bring PostGIS from the category of “useful but limited” to “fully functional spatial database”. The JTS Topology Suite in Java had been released, and included a complete set of topology operators, templated on the OpenGIS specifications – the trouble was, the code was all in Java! Refractions teamed up with Vivid Solutions (the JTS company) and the University of Victoria, hiring a computer science graduate student to port JTS from Java to C++. The ported version of JTS would come to be named “GEOS” – Geometry Engine, Open Source.
The first version of GEOS was released on November 6, 2003, and we rushed to bring out a PostGIS version that included all the great new functionality the library made available. The 0.8 series was the first that could be made fully “Simple Features for SQL” compliant, using the functionality from the GEOS library to support the last set of difficult functions in the specification.
Refractions had been using PostGIS for more and more production data management work on our consulting projects, such as the Digital Roads Atlas (DRA) and the Corporate Watershed Base (CWB). The CWB in particular had almost 20 million segments in the database, and access speed was becoming an issue.
The original PostGIS geometry database structure was formed without too much regard for how much space it used. The header included a number of integers that were used to store small or binary values, and even some un-used integers for “future use”. In addition, the structure allocated space for 3-dimensional data, even when the input data was 2-dimensional. For small objects, like 2-d points, the overhead of the un-used z-value and the headers added over 200% to the size needed to store just the ordinates of the point.
The slowest part of a modern computer is the disk storage sub-system, so any system that writes and reads from disk needs to move as little data as possible through it. To speed up PostGIS, we needed to make the data structures smaller, so a new “light-weight geometry” experiment was added during 0.8. The light-weight geometries dispensed with much of the original header, only using single bits for booleans, and avoiding storing z- and m-values when they weren't needed. In addition, the indexes were re-done to use bounding boxes defined with 32-bit floats instead of 64-bit doubles. The experimental system was part of PostGIS as an option, but was not made the default geometry implementation until the 1.0 series.
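The arithmetic behind that size reduction can be sketched in a few lines of Python. Both layouts below are hypothetical and chosen only to illustrate the idea: the “heavy” point assumes eight 32-bit header integers plus x, y and z doubles, and the “light” point assumes a single flag byte followed by only the ordinates actually present. Neither is the real PostGIS on-disk format, and the names HEAVY_POINT, pack_light_point and overhead_percent are made up for this example.

```python
import struct

# Hypothetical "heavy" layout: eight 32-bit header integers, then x, y, z
# doubles, with z stored even for 2-d data. "<" disables alignment padding.
HEAVY_POINT = struct.Struct("<8i3d")

# Hypothetical "light" layout: booleans become single bits in one flag byte.
HAS_Z = 0b01
HAS_M = 0b10


def pack_light_point(x, y, z=None, m=None):
    """Pack a point as one flag byte plus only the ordinates that exist."""
    flags = (HAS_Z if z is not None else 0) | (HAS_M if m is not None else 0)
    ordinates = [x, y] + [v for v in (z, m) if v is not None]
    return struct.pack(f"<B{len(ordinates)}d", flags, *ordinates)


def overhead_percent(total_bytes, payload_bytes):
    """Extra bytes beyond the payload, as a percentage of the payload."""
    return 100.0 * (total_bytes - payload_bytes) / payload_bytes


payload_size = struct.calcsize("<2d")          # x and y as doubles: 16 bytes
heavy_size = HEAVY_POINT.size                  # 32 + 24 = 56 bytes
light_size = len(pack_light_point(1.0, 2.0))   # 1 + 16 = 17 bytes

# Bounding boxes: four 64-bit doubles versus four 32-bit floats.
bbox_double_size = struct.calcsize("<4d")      # 32 bytes
bbox_float_size = struct.calcsize("<4f")       # 16 bytes
```

Under these assumed layouts, overhead_percent(heavy_size, payload_size) is 250.0 while overhead_percent(light_size, payload_size) is 6.25, and the float bounding box takes half the space of the double one.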
As the light-weight experiment proceeded off to the side, the production line of PostGIS moved into the 0.9 series. The 0.9 series was primarily built to work with the new PostgreSQL 8.0 release. PostgreSQL 8.0 allowed native Windows support, which PostGIS also took advantage of. The estimation facilities built in 0.6 were finally integrated into the main PostgreSQL estimation and planning system in 0.9.
Almost two years after the light-weight geometries were started as an experiment, the 1.0 series of PostGIS moved them into position as the default geometry storage system, greatly improving the performance of the system for very large databases.
- 1.0.0 – April 20, 2005
- 1.0.1 – May 24, 2005
- 1.0.2 – July 5, 2005
- 1.0.3 – August 7, 2005
- 1.0.4 – September 9, 2005
- 1.0.5 – November 25, 2005
- 1.0.6 – December 12, 2005
With version 1.1, PostGIS made a major improvement to the build system, allowing PostGIS to be built alongside already-installed binary packages in Linux distributions. Serious GIS functions, like polygon building and line building, were added, and performance continued to be a major focus of enhancement.
- 1.1.0 – December 12, 2005
- 1.1.1 – January 22, 2006
- 1.1.2 – April 5, 2006
- 1.1.3 – June 30, 2006
- 1.1.4 – September 27, 2006
- 1.1.5 – October 13, 2006
- 1.1.6 – November 6, 2006
- 1.1.7 – January 31, 2007
As 2006 drew to a close, PostGIS had not seen a major change in a couple of years. The 1.2 series, with the 1.3 series close behind, would make some larger changes, moving from using the OpenGIS specification as the primary design guide to using the ISO SQL/MM specification instead. The SQL/MM document includes a much larger number of spatial objects, adding various Curves to the suite of standard types. The 1.2 series was the first to include support for Curves.
The 1.3 series continued the process of tracking more closely to SQL/MM, moving all the function signatures in PostGIS to match the SQL/MM standard, with “ST_” prefixes for all function names, and a number of new functions defined in SQL/MM.
To make PostGIS easier for new users to get started with, 1.3 also made the use of spatial indexes implicit in a number of common functions, like ST_Intersects(), and added some new functions, like ST_DWithin(), that allowed distance tests to use indexes implicitly.
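The idea behind an index-assisted distance test is a two-step “filter, then refine” pattern, sketched below in Python. This is an illustration of the general technique, not the PostGIS code: the function names are hypothetical, features are plain points, and a linear scan over bounding boxes stands in for the GiST index that a real database would use to find candidates quickly.

```python
import math


def bbox_candidates(features, query, distance):
    """Cheap filter: keep features inside the query's expanded bounding box.

    In a spatial database this step is answered by the spatial index;
    here a linear scan stands in for it.
    """
    qx, qy = query
    return {
        name: (x, y)
        for name, (x, y) in features.items()
        if abs(x - qx) <= distance and abs(y - qy) <= distance
    }


def dwithin(features, query, distance):
    """Exact refine: run the true distance test only on the candidates."""
    qx, qy = query
    candidates = bbox_candidates(features, query, distance)
    return sorted(
        name
        for name, (x, y) in candidates.items()
        if math.hypot(x - qx, y - qy) <= distance
    )


features = {"a": (1.0, 1.0), "b": (3.0, 3.0), "c": (10.0, 0.0)}
candidates = bbox_candidates(features, (0.0, 0.0), 4.0)
matches = dwithin(features, (0.0, 0.0), 4.0)
```

Here the box filter keeps "a" and "b" and discards "c" without any distance arithmetic; the exact test then rejects "b", which sits in the corner of the box at a distance of about 4.24, leaving only "a". Making the filter step implicit means a user gets this behaviour without having to write the bounding-box condition by hand.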
Here are some of the PostGIS projects that Refractions Research has completed for clients over the years.