Manipulating PostGIS layers from GRASS

GRASS users realize, right from the start, that GRASS maintains its own internal format for vector and raster layers. Whenever you want to use GRASS modules on GIS layers from some other source you first must import the data layers into GRASS, then run your analysis, and if necessary, export back to a required format.

But suppose you need to manipulate attributes of a vector layer that’s already stored in a PostGIS spatial database. A typical example: I have a GIS layer of rain gauges, stored in PostGIS. I need to add a column to the layer’s attribute table for elevation, and then get the elevation for each gauge from a GRASS based DEM (Digital Elevation Model) raster.

There are three approaches. For this post I’ll call them:

  1. Import-Export: The usual, long but straight-forward way
  2. Duplicate table: Also somewhat long, but saves having two layers in PostGIS
  3. Hop Skip and Jump: This suggestion I owe to Jonah Duckles. A bit harder to follow, but it avoids any duplication of layers or attribute tables.

The conventional Import-Export method involves importing the point vector from PostGIS into GRASS, adding the required attribute column and updating its values in GRASS, then exporting back out to PostGIS. But you end up with a second PostGIS vector layer, equivalent to the first with the addition of the required elevation column. Now you must use PostgreSQL tools to drop the original table and rename the new one to the original name.

Briefly…

(we’ll assume a PostGIS vector called gauges, and the GRASS elevation raster called dem)

# Set up the data source  details

export HOST=<Your PostGIS hostname>
export DB=<Your PostGIS database name>
export USER=<Your PostGIS administrator name>
export PASS=<Your PostGIS admin password>
export DSN="PG:host=${HOST} dbname=${DB} user=${USER} password=${PASS}"
v.in.ogr dsn=${DSN} layer=gauges out=gauges_tmp

# We now have a duplicate of the original PostGIS vector stored totally (geometry and attributes) in GRASS

# Add the new column for elevation values

v.db.addcol gauges_tmp col="elevation double precision"

# check attribute table

v.info -c gauges

# Now get the elevation values from the raster

v.what.rast vect=gauges_tmp rast=dem col=elevation

# check elevation values:

v.db.select gauges_tmp col=gauge_id,elevation

If everything looks good, export back to PostGIS:

v.out.ogr in=gauges_tmp dsn=$DSN type=point format=PostgreSQL

Now you should have in PostGIS two point layers, with the same features, and the gauges_tmp table should have the additional elevation column. What’s left is to drop (or rename) the original table and rename the new one to the original name. (Double check that the new gauges_tmp table contains all the features before continuing!)

psql -h ${HOST} -d ${DB} -U ${USER} -c "ALTER TABLE gauges RENAME TO gauges_orig"
psql -h ${HOST} -d ${DB} -U ${USER} -c "ALTER TABLE gauges_tmp RENAME TO gauges"

In PostGIS spatial tables also appear in the geometry_columns table so be sure to rename the gauges_tmp table to exactly the same name as the original.

Now on to the second Duplicate table method.
This way involves setting the database connection in GRASS to store attribute tables straight in PostgreSQL, then import as before the original gauges layer. This will create a GRASS vector in which the geometry is in GRASS (as always) but the attributes are kept right in PostgreSQL. Then run the v.what.rast command, and the new elevation values are updated right into the new PostGIS table. The final step is to copy those elevation values from this new attribute table to the original gauges layer. Here are the steps:

# First check current database connection and take note of the settings. We’ll need these parameters to revert to this database connection after the procedure.

db.connect -p >> connection.txt
db.connect driv=pg database="host=${HOST},dbname=${DB}"
db.login user=${USER} password=${PASS}

# and import the original gauges just like before

v.in.ogr dsn=${DSN} layer=gauges out=gauges_tmp

# This creates a GRASS vector, but the attribute table is now stored right in your PostGIS database. We’re ready to add the required elevation column and get values from the GRASS dem.

v.db.addcol gauges_tmp col="elevation double precision"
v.what.rast vect=gauges_tmp rast=dem col=elevation

All that remains now is to run two PostgreSQL commands: add an elevation column to the original table and update its values from the gauges_tmp table:

psql -h ${HOST} -d ${DB} -U ${USER} -c "ALTER TABLE gauges ADD COLUMN elevation float"
psql -h ${HOST} -d ${DB} -U ${USER} -c "UPDATE gauges SET elevation=gauges_tmp.elevation FROM gauges_tmp WHERE gauges_tmp.gauge_id=gauges.gauge_id"

The “WHERE” clause insures that the correct elevation values are entered into the permanent gauges layer based on the gauge_id column. Finish by removing the gauges_tmp layer (Its accompanying database attribute table will also be dropped).

g.remove vect=gauge_tmp

And don’t forget to reset the database connection back to the original values (We saved the parameters into the file connection.txt):

cat connection.txt
db.connect driv=... database=...

On to the Hop Skip and Jump method.

First I should note that this trick works only for point vectors. If you need to update attribute values for PostGIS based line or area features, this won’t work. Use one of the first two ways.

The process involves three routines that can be chained into a one-line command. First we take advantage of the GRASS v.external module to have access to externally stored vectors. This modules gives us a read-only representation in GRASS of an externally stored vector. Then, with the v.out.ascii command we get a list of the X-Y coordinates for all the points in this vector. These coordinates are piped as input to the r.what module which outputs the raster value at each of the input point locations. The output from this raster command is then fed into a psql statement to update the original database directly. No interim tables are needed at all. Here we go:

v.external dsn=${DSN} layer=gauges out=gauges_tmp

# Now the one-liner

v.out.ascii in=gauges_tmp fs=" " columns="gauge_id" | \
r.what in=dem fs=" " | \
sed '/\*/d' | \
awk '{print "UPDATE gauges SET elevation="$5 " WHERE gauge_id="$4}' >> update_gauges.sql

Here’s the “play by play”:

As I mentioned, v.out.ascii creates a list of X-Y coordinates, including the GRASS cat values, and also the gauge_id, added by the columns= parameter. This list is input to r.what which just adds at the end of each line the raster value at that X-Y location. But r.what will output an asterisk ‘*’ when no value is available (i.e for points outside the region of the raster). We don’t want these lines in the output, so the sed command deletes them.

The resulting list gets piped thru the awk command to create a list of PostgreSQL UPDATE commands, where the elevation values are taken from column 5 in the input, and the gauge_id is from column 4. (Columns 1,2 are the coords, column 3 the cat value. Thus column 4 is the gauge_id and column 5 the raster value).

# Check the results carefully for any anomolies…

less update_gauges.sql

# If all is OK, just run this file through psql:

psql -h ${HOST} -d ${DB} -U ${USER} -c "ALTER TABLE gauges ADD COLUMN elevation float"
psql -h ${HOST} -d ${DB} -U ${USER} -f update_gauges.sql

Quick and elegant. If you’re confident that the above string of commands creates correct sql UPDATE statements, you can even do away with the update_gauges.sql file and pipe the list straight into psql.

3 thoughts on “Manipulating PostGIS layers from GRASS

  1. I’m not clear how this is supposed to work. If I execute “export HOST= localhost” from the command console, I just get this:
    “‘export’ is not recognized as an internal or external
    command,
    operable program or batch file.”

    Am I missing something?

  2. Hello Aren:

    Aren Cambre :

    I’m not clear how this is supposed to work. If I execute “export HOST= localhost” from the command console, I just get this:
    “‘export’ is not recognized as an internal or external
    command,
    operable program or batch file.”

    Am I missing something?

    I guess you’re using Windows. The equivalent to the linux ‘export’ command should be ‘set’ to set environment variables.
    Alternatively, you could just string all the connection parameters onto the command line after the “dsn=” option.
    Regards,
    Micha

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>