Tag Archives: Python

Adding state outlines to PRISM figures (without using Basemap)

I was showing my earlier PRISM/matplotlib examples to a student and their immediate response was to ask how hard it would be to add state outlines to the map without using Basemap.  A quick Google/stackoverflow search suggested that it shouldn’t be too hard (link, link, link, link), as long as the state outlines were in the same coordinate system/projection as the raster data.

The first thing was to read in the PRISM array and compute the extent from the HDR file in order to set the imshow extent.

import numpy as np

def read_prism_hdr(hdr_path):
    """Read an ESRI BIL HDR file"""
    with open(hdr_path, 'r') as input_f:
        header_list = input_f.readlines()
    return dict(item.strip().split() for item in header_list)

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
header_dict = read_prism_hdr(prism_path.replace('.bil', '.hdr'))

prism_nodata = float(header_dict['NODATA'])
prism_rows = int(header_dict['NROWS'])
prism_cols = int(header_dict['NCOLS'])

# Assume the cells are square and have the same cellsize in each direction
prism_cs = float(header_dict['XDIM'])
# prism_cs = float(header_dict['YDIM'])

# Build extent as xmin, xmax, ymin, ymax
# ULXMAP and ULYMAP are upper left cell center
prism_extent = [
    float(header_dict['ULXMAP']) - 0.5 * prism_cs,
    float(header_dict['ULXMAP']) - 0.5 * prism_cs + prism_cols * prism_cs,
    float(header_dict['ULYMAP']) + 0.5 * prism_cs - prism_rows * prism_cs,
    float(header_dict['ULYMAP']) + 0.5 * prism_cs
]

# Read in the array
prism_array = np.fromfile(prism_path, dtype=np.float32)
prism_array = prism_array.reshape(prism_rows, prism_cols)
prism_array[prism_array == prism_nodata] = np.nan
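The one-line dict conversion in read_prism_hdr() just splits each "KEY value" line, and the extent math follows directly from the parsed values. A quick sanity check on made-up header text (not real PRISM metadata):

```python
# The HDR parser boils down to splitting each "KEY value" line into a dict.
# The header text below is made up for illustration, not real PRISM metadata.
sample_hdr = """NROWS 3
NCOLS 4
NODATA -9999
ULXMAP -125.0
XDIM 0.5
"""
hdr = dict(line.strip().split() for line in sample_hdr.splitlines())

# Values come back as strings, so cast before doing any math
cs = float(hdr['XDIM'])
xmin = float(hdr['ULXMAP']) - 0.5 * cs
xmax = xmin + int(hdr['NCOLS']) * cs
```

Since ULXMAP is the upper-left cell center, subtracting half a cellsize gives the left edge, and the full width is just NCOLS cellsizes from there.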

My first approach was basically copied from this script. I used the 1:20,000,000 state shapefile from the US Census Cartographic Boundaries website. I installed gdal, fiona, descartes, and shapely using the wheel files built by Christoph Gohlke, since none of the modules would install with conda (on my Windows computer). I had to use GDAL 1.11.4 instead of 2.0.2; otherwise I got an error when reading the shapefile with fiona.

import os
from collections import defaultdict

from descartes import PolygonPatch
import fiona
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
from shapely.geometry import MultiPolygon, Polygon, shape

# Read the polygon geometries from the shapefile
shp_path = 'cb_2014_us_state_20m.shp'
mp = MultiPolygon(
    [shape(pol['geometry']) for pol in fiona.open(shp_path, 'r')])

fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(prism_array, cmap='viridis', extent=prism_extent)
patches = [PolygonPatch(p, fc='none', ec='0.1', zorder=1) for p in mp.geoms]
ax.add_collection(PatchCollection(patches, match_original=True))
plt.grid()
plt.show()

[Figure: prism_missing_patches]

The problem with this combination of approach and dataset is that the state features are multi-polygons, but only the first polygon is read when a feature is a multi-polygon. As you can see, some of the states made up of more than one polygon are missing (California, Michigan, Florida, etc.).

The following approach checks whether each feature is a polygon or a multi-polygon. If it is a multi-polygon, each polygon geometry is unpacked from it individually.

geom_list = []
with fiona.open(shp_path, 'r') as cell_f:
    for feature in cell_f:
        ftr_geom = shape(feature['geometry'])

        if ftr_geom.is_empty:
            continue
        elif ftr_geom.geom_type == 'MultiPolygon':
            # Unpack multi-polygons to lists of polygons
            geom_list.extend([g for g in ftr_geom.geoms if not g.is_empty])
        elif ftr_geom.geom_type == 'Polygon':
            geom_list.append(ftr_geom)
        else:
            print('Invalid geometry type')
            continue
mp = MultiPolygon(geom_list)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(prism_array, cmap='viridis', extent=prism_extent)

patches = [PolygonPatch(p, fc='none', ec='0.1', zorder=1) for p in mp.geoms]
ax.add_collection(PatchCollection(patches, match_original=True))

plt.grid()
plt.show()

[Figure: prism_all_patches]

I will definitely try this again and see if there is a simpler approach or one that relies on fewer modules.
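One way to drop the descartes dependency would be to build matplotlib Path/PathPatch objects directly from the polygon coordinate rings. This is only a sketch, assuming each ring arrives as a closed sequence of (x, y) pairs (the form shapely's exterior and interiors coords take); the rings below are made up for illustration.

```python
import numpy as np
from matplotlib.patches import PathPatch
from matplotlib.path import Path

def rings_to_patch(rings, **kwargs):
    """Build a PathPatch from closed coordinate rings
    (exterior ring first, then any interior holes)."""
    vertices = []
    codes = []
    for ring in rings:
        ring = np.asarray(ring, dtype=float)
        vertices.append(ring)
        # MOVETO the first vertex, LINETO the middle ones,
        # CLOSEPOLY to finish the ring
        codes += ([Path.MOVETO]
                  + [Path.LINETO] * (len(ring) - 2)
                  + [Path.CLOSEPOLY])
    return PathPatch(Path(np.concatenate(vertices), codes), **kwargs)

# Made-up square with a square hole, both as closed rings
exterior = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
hole = [(1, 1), (1, 2), (2, 2), (2, 1), (1, 1)]
patch = rings_to_patch([exterior, hole], fc='none', ec='0.1')
```

For a shapely polygon p, the rings would be `[p.exterior.coords] + [r.coords for r in p.interiors]`, so the same patches could be built without descartes at all.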

Reading PRISM BIL arrays without using GDAL

Since it isn’t always easy to get GDAL working with Conda, I decided to try reading the PRISM BIL file using built-in Python modules and NumPy.

The first example I found (Plotting USGS BIL DEMs in Python) provided 95% of the steps needed to get this working using the struct module. I only needed to change the datatype from a short to a float32. I decided to hardcode the width, height, datatype, number of bands, and nodata value for now, but all of these could/should be read from the HDR file.

import struct

import numpy as np

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_cols = 1405
prism_rows = 621
prism_nodata = -9999

with open(prism_path, 'rb') as bil_f:
    bil_data = bil_f.read()

# Unpack binary data into a flat tuple z
s = '<%df' % (int(prism_cols * prism_rows),)
z = struct.unpack(s, bil_data)

prism_array = np.zeros((prism_rows, prism_cols), dtype=np.float32)
for r in range(0, prism_rows):
    for c in range(0, prism_cols):
        prism_array[r, c] = z[prism_cols * r + c]
        # prism_array.itemset((r, c), z[prism_cols * r + c])

prism_array[prism_array == prism_nodata] = np.nan
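The format string deserves a quick sanity check: '<' requests little-endian byte order, 'f' is a 4-byte float, and the leading count repeats the format (e.g. '<4f' for four floats). Round-tripping a small made-up tuple of float32-representable values confirms the packing:

```python
import struct

# '<' = little-endian, 'f' = 4-byte float; '<%df' % n means n floats
values = (1.5, -2.25, 0.0, 4.75)  # all exactly representable as float32
fmt = '<%df' % len(values)
packed = struct.pack(fmt, *values)    # 4 bytes per value
unpacked = struct.unpack(fmt, packed)
```

For the full PRISM grid, the count is simply NROWS * NCOLS, which is exactly how the `s` format string above is built.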

I then realized that NumPy should be able to easily (and quickly) read a simple single band binary array file using fromfile().

import numpy as np

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_cols = 1405
prism_rows = 621
prism_nodata = -9999

prism_array = np.fromfile(prism_path, dtype=np.float32)
prism_array = prism_array.reshape(prism_rows, prism_cols)
prism_array[prism_array == prism_nodata] = np.nan
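A quick way to convince yourself that fromfile() handles the BIL layout is to round-trip a small made-up float32 grid through tofile(); a single-band BIL file is just the raw row-major array bytes with no embedded header.

```python
import os
import tempfile

import numpy as np

rows, cols, nodata = 2, 3, -9999
src = np.array([[10.5, nodata, 12.0],
                [13.5, 14.0, nodata]], dtype=np.float32)

# Write the raw bytes, read them back, and re-apply the shape and nodata
bil_path = os.path.join(tempfile.mkdtemp(), 'sample.bil')
src.tofile(bil_path)

arr = np.fromfile(bil_path, dtype=np.float32).reshape(rows, cols)
arr[arr == nodata] = np.nan
```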

Might as well read the width, height, and nodata values from the HDR file, since it is available. You could also get the datatype.

import numpy as np

def read_prism_hdr(hdr_path):
    """Read an ESRI BIL HDR file"""
    with open(hdr_path, 'r') as input_f:
        header_list = input_f.readlines()
    return dict(item.strip().split() for item in header_list)

def read_prism_bil(bil_path):
    """Read an array from ESRI BIL raster file"""
    hdr_dict = read_prism_hdr(bil_path.replace('.bil', '.hdr'))
    # For now, only use NROWS, NCOLS, and NODATA
    # Eventually use NBANDS, BYTEORDER, LAYOUT, PIXELTYPE, NBITS

    prism_array = np.fromfile(bil_path, dtype=np.float32)
    prism_array = prism_array.reshape(
        int(hdr_dict['NROWS']), int(hdr_dict['NCOLS']))
    prism_array[prism_array == float(hdr_dict['NODATA'])] = np.nan
    return prism_array

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_array = read_prism_bil(prism_path)
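Without a PRISM download handy, the pair of functions can be exercised end-to-end on a tiny synthetic .bil/.hdr pair (all values made up; the functions are repeated here so the snippet stands alone):

```python
import os
import tempfile

import numpy as np

def read_prism_hdr(hdr_path):
    """Read an ESRI BIL HDR file"""
    with open(hdr_path, 'r') as input_f:
        header_list = input_f.readlines()
    return dict(item.strip().split() for item in header_list)

def read_prism_bil(bil_path):
    """Read an array from an ESRI BIL raster file"""
    hdr_dict = read_prism_hdr(bil_path.replace('.bil', '.hdr'))
    prism_array = np.fromfile(bil_path, dtype=np.float32)
    prism_array = prism_array.reshape(
        int(hdr_dict['NROWS']), int(hdr_dict['NCOLS']))
    prism_array[prism_array == float(hdr_dict['NODATA'])] = np.nan
    return prism_array

# Write a made-up 2x3 grid and a matching minimal header
work_dir = tempfile.mkdtemp()
bil_path = os.path.join(work_dir, 'sample.bil')
np.array([[1.0, 2.0, -9999.0],
          [4.0, 5.0, 6.0]], dtype=np.float32).tofile(bil_path)
with open(os.path.join(work_dir, 'sample.hdr'), 'w') as f:
    f.write('NROWS 2\nNCOLS 3\nNODATA -9999\n')

arr = read_prism_bil(bil_path)
```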

Plotting PRISM BIL arrays using GDAL & Matplotlib

Now that the PRISM data is being released in a BIL format, I decided I should update my old post on plotting PRISM arrays. First, I tried using scipy.ndimage.imread() and matplotlib.pyplot.imread(), but neither worked with the BIL format. GDAL can handle it though.

[Figure: prism_tmean_normal]

import matplotlib.pyplot as plt
import numpy as np
from osgeo import gdal

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_nodata = -9999

prism_ds = gdal.Open(prism_path)
prism_band = prism_ds.GetRasterBand(1)
prism_array = prism_band.ReadAsArray().astype(np.float32)
# prism_nodata = prism_band.GetNoDataValue()
prism_array[prism_array == prism_nodata] = np.nan
prism_ds = None

plt.imshow(prism_array, cmap='viridis')
plt.show()

I really like the new Viridis colormap!

 

Using ArcPy with Anaconda

Recently I have had really good luck using Anaconda to manage Python modules, but I didn’t know how to use the ArcGIS 10.2 ArcPy module with it. ArcGIS installs its own Python installation (C:\Python27\ArcGIS10.2) and I was only able to import arcpy when using this copy of Python.

To make this work, I started with this really helpful stackoverflow post.  Basically, I uninstalled all existing ArcGIS installations and python modules, installed ArcGIS, then installed Anaconda (or Miniconda) using all the default settings/options.  I clicked the button to register Anaconda as the default Python installation, but I’m not sure this actually worked.

Unlike the stackoverflow post, I then added C:\Anaconda and C:\Anaconda\bin to the system path, but I removed C:\Python27\ArcGIS10.2. For ArcGIS 10.2 (32-bit), the file in C:\Python27\ArcGIS10.2\Lib\site-packages that you need to copy to C:\Anaconda\Lib\site-packages is called desktop10.2.pth. For ArcGIS 10.2 (64-bit), the corresponding file in C:\Python27\ArcGISx6410.2\Lib\site-packages is called DTBGGP64.pth.
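The .pth mechanism itself is plain Python: any .pth file in a site directory lists extra directories, one per line, and the site module appends each existing one to sys.path — which is all the ArcGIS .pth files do for the arcpy packages. A small demonstration using made-up temporary directories:

```python
import os
import site
import sys
import tempfile

# site.addsitedir() reads every .pth file in a directory and appends
# each existing path listed inside it to sys.path
site_dir = tempfile.mkdtemp()
extra_dir = tempfile.mkdtemp()
with open(os.path.join(site_dir, 'demo.pth'), 'w') as f:
    f.write(extra_dir + '\n')

site.addsitedir(site_dir)
```

Copying the ArcGIS .pth file into Anaconda's site-packages works because Anaconda's own site initialization processes it the same way at startup.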

Finally, the only way I could get the Anaconda Python to be used when double-clicking a .py file was to modify the registry. I didn’t try it, but you could probably change these values from the command prompt as outlined in the Python documentation. I changed the following keys to point to the Anaconda folder:

HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Python.File\shell\Edit with IDLE
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Python.File\shell\open
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Python.CompiledFile\shell\open\command

After that I could import ArcPy from the Anaconda Python without error.