Tag Archives: PRISM

Adding state outlines to PRISM figures (without using Basemap)

I was showing my earlier PRISM/matplotlib examples to a student and their immediate response was to ask how hard it would be to add state outlines to the map without using Basemap.  A quick Google/stackoverflow search suggested that it shouldn’t be too hard (link, link, link, link), as long as the state outlines were in the same coordinate system/projection as the raster data.

The first thing was to read in the PRISM array and compute the extent from the HDR file in order to set the imshow extent.

import numpy as np

def read_prism_hdr(hdr_path):
    """Read an ESRI BIL HDR file"""
    with open(hdr_path, 'r') as input_f:
    header_list = input_f.readlines()
    return dict(item.strip().split() for item in header_list)

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
header_dict = read_prism_hdr(prism_path.replace('.bil', '.hdr'))

prism_nodata = float(header_dict['NODATA'])
prism_rows = int(header_dict['NROWS'])
prism_cols = int(header_dict['NCOLS'])

# Assume the cells are square and have the same cellsize in each direction
prism_cs = float(header_dict['XDIM'])
# prism_cs = float(header_dict['YDIM'])

# Build extent as xmin, xmax, ymin, ymax
# ULXMAP and ULYMAP are upper left cell center
prism_extent = [
    float(header_dict['ULXMAP']) - 0.5 * prism_cs,
    float(header_dict['ULXMAP']) - 0.5 * prism_cs + prism_cols * prism_cs,
    float(header_dict['ULYMAP']) + 0.5 * prism_cs - prism_rows * prism_cs,
    float(header_dict['ULYMAP']) + 0.5 * prism_cs
]

# Read in the array
prism_array = np.fromfile(prism_path, dtype=np.float32)
prism_array = prism_array.reshape(prism_rows, prism_cols)
prism_array[prism_array == prism_nodata] = np.nan

My first approach was basically copied from this script.  I used the 1:20,000,000 state shapefile from the US Census Cartographic Boundaries website.   I installed gdal, fiona, descartes, and shapely using the wheel files built by Christoph Gohlke, since none of the modules would install with conda (on my windows computer). I had to use GDAL 1.11.4 instead of 2.0.2, otherwise I got an error when reading the shapefile with fiona.

import os
from collections import defaultdict

from descartes import PolygonPatch
import fiona
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
from shapely.geometry import MultiPolygon, Polygon, shape

# Read the polygon geometries from the shapefile
shp_path = 'cb_2014_us_state_20m.shp'
mp = MultiPolygon(
    [shape(pol['geometry']) for pol in fiona.open(shp_path, 'r')])

fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(prism_array, cmap='viridis', extent=prism_extent)
patches = [PolygonPatch(p, fc='none', ec='0.1', zorder=1) for p in mp]
ax.add_collection(PatchCollection(patches, match_original=True))
plt.grid()
plt.show()

prism_missing_patches

The problem with this combination of approach and dataset, are that the state features are multi-polygons, but only the first polygon is read if the feature is a multi-polygon.  As you can see, some of the state with more than one polygon are missing (California, Michigan, Florida, etc.).

The following approach will check whether the feature is a polygon or multi-polygon. If it is a multi-polygon, it will read each polygon geometry from the multi-polygon.

geom_list = []
with fiona.open(shp_path, 'r') as cell_f:
    for feature in cell_f:
        ftr_geom = shape(feature['geometry'])

        if ftr_geom.is_empty:
            continue
        elif ftr_geom.geom_type == 'MultiPolygon':
            # Unpack multi-polygons to lists of polygons
            geom_list.extend([g for g in ftr_geom if not g.is_empty])
        elif ftr_geom.geom_type == 'Polygon':
            geom_list.append(ftr_geom)
        else:
            print('Invalid geometry type')
            continue
mp = MultiPolygon(geom_list)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(prism_array, cmap='viridis', extent=prism_extent)

patches = [PolygonPatch(p, fc='none', ec='0.1', zorder=1) for p in mp]
ax.add_collection(PatchCollection(patches, match_original=True))

plt.grid()
plt.show()

prism_all_patches

I will definitely try this again and see if there is a simpler approach or one that relies on fewer modules.

Reading PRISM BIL arrays without using GDAL

Since it isn’t always easy to get GDAL working with Conda, I decided to try reading the PRISM BIL file using built in Python modules and NumPy.

The first example I found (Plotting USGS BIL DEMs in Python) provided 95% of the steps needed to get this working using the struct module.  I only needed to change the datatype to a float32/short.  I decided to hardcode the width, height, datatype, number of bands, and nodata value for now, but all of these could/should be read from the HDR file.

import struct

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_cols = 1405
prism_rows = 621
prism_nodata = -9999

bil_f = open(bil_path, 'rb')
bil_data = bil_f.read()
bil_f.close()

# Unpack binary data into a flat tuple z
s = '<%df' % (int(prism_cols * prism_rows),)
z = struct.unpack(s, bil_data)

prism_array = np.zeros((prism_rows, prism_cols), dtype=np.float32)
for r in range(0, prism_rows):
    for c in range(0, prism_cols):
        prism_array[r][c] = float(z[((prism_cols) * r) + c])
        # prism_array.itemset((r, c), float(z[((prism_cols) * r) + c]))

prism_array[prism_array == prism_nodata] = np.nan

I then realized that NumPy should be able to easily (and quickly) read a simple single band binary array file using fromfile().

import numpy as np

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_cols = 1405
prism_rows = 621
prism_nodata = -9999

prism_array = np.fromfile(bil_path, dtype=np.float32)
prism_array = prism_array.reshape(prism_rows, prism_cols)
prism_array[prism_array == prism_nodata] = np.nan

Might as well read the width, height, and nodata values from the HDF file, since it is avaiable. You could get the datatype also.

import numpy as np

def read_prism_hdr(hdr_path):
    """Read an ESRI BIL HDR file"""
    with open(hdr_path, 'r') as input_f:
        header_list = input_f.readlines()
    return dict(item.strip().split() for item in header_list)

def read_prism_bil(bil_path):
    """Read an array from ESRI BIL raster file"""
    hdr_dict = read_prism_hdr(bil_path.replace('.bil', '.hdr'))
    # For now, only use NROWS, NCOLS, and NODATA
    # Eventually use NBANDS, BYTEORDER, LAYOUT, PIXELTYPE, NBITS

    prism_array = np.fromfile(bil_path, dtype=np.float32)
    prism_array = prism_array.reshape(
        int(hdr_dict['NROWS']), int(hdr_dict['NCOLS']))
    prism_array[prism_array == float(hdr_dict['NODATA'])] = np.nan
    return prism_array

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_array = read_prism_bil(prism_path)

Plotting PRISM BIL arrays using GDAL & Matplotlib

Now that the PRISM data is being released in a BIL format, I decided I should update my old post on plotting PRISM arrays. First, I tried using ndimage.imread() and matplotlib.imread(), but neither worked with the BIL format. GDAL can handle it though.

prism_tmean_normal

import matplotlib.pyplot as plt
from osgeo import gdal

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_nodata = -9999

prism_ds = gdal.Open(prism_path)
prism_band = prism_ds.GetRasterBand(1)
prism_array = prism_band.ReadAsArray().astype(np.float32)
# prism_nodata = prism_band.GetNoDataValue()
prism_array[prism_array == prism_nodata] = np.nan
prism_ds = None

plt.imshow(prism_array, cmap='viridis')
plt.show()

I really like the new Viridis colormap!

 

Plotting PRISM ASCII arrays using Basemap

prism_projected
Plotting NumPy arrays directly using Matplotlib was simple, but what I really wanted though was to show the data in a projected coordinate system.  With a little bit of searching it seemed like the Python basemap module should be able to handle this task pretty easily.  The examples on the basemap website  were really good but the example on Peak5398’s Introduction to Python site also really helped with some of the details.

Just like last time, the first step was reading in the PRISM ASCII raster to a NumPy array.  The only difference is that I read the data into a masked array.

import numpy as np

# Read in PRISM header data
prism_name = 'us_tmax_2011.14.asc'
with open(prism_name, 'r') as prism_f:
    prism_header = prism_f.readlines()[:6]

# Pull out nodata value from ASCII raster header
prism_header = [item.strip().split()[-1] for item in prism_header]
prism_cols = int(prism_header[0])
prism_rows = int(prism_header[1])
prism_xll = float(prism_header[2])
prism_yll = float(prism_header[3])
prism_cs = float(prism_header[4])
prism_nodata = float(prism_header[5])

# Readin the PRISM array
prism_array = np.loadtxt(prism_name, dtype=np.float, skiprows=6)

# Build a masked array
prism_ma = np.ma.masked_equal(prism_array, prism_nodata)

# PRISM data is stored as an integer but scaled by 100
prism_ma *= 0.01

# Calculate the extent of the PRISM array
# XMin, XMax, YMin, YMax
prism_extent = [
    prism_xll, prism_xll + prism_cols * prism_cs,
    prism_yll, prism_yll + prism_rows * prism_cs]

I choose to use an Albers Equal Area projection, but it seems really simple to set just about any projection.  I used a masked array so that I could use the maskoceans function to remove the PRISM values over the ocean.  Just setting these cells to NaN didn’t work.

Overall it seems really easy to plot array data in a projected coordinate system.

import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap, maskoceans

# Build arrays of the lat/lon points for each cell
lons = np.arange(prism_extent[0], prism_extent[1], prism_cs)
lats = np.arange(prism_extent[3], prism_extent[2], -prism_cs)
lons, lats = np.meshgrid(lons, lats)

# Just the contingous United States
m = Basemap(
    width=5000000, height=3400000, projection='aea', resolution='l',
    lat_1=29.5, lat_2=45.5, lat_0=38.5, lon_0=-96.)

# Build the figure and ax objects
fig, ax = plt.subplots()
ax.set_title('PRISM 2011 Tmax')

# Built-in map boundaries
m.drawcoastlines()
m.drawlsmask(land_color='white', ocean_color='0.8', lakes=False)
m.drawcountries()
m.drawstates()

# Set ocean pixels to nodata
prism_ma_land = maskoceans(lons, lats, prism_ma)

# Draw PRISM array directly to map
im = m.pcolormesh(lons, lats, prism_ma_land, cmap=plt.cm.jet, latlon=True)

# Build the colorbar on the right
cbar = m.colorbar(im, 'right', size='5%', pad='5%')
# Build the colorbar on the bottom
# cbar = m.colorbar(im, 'bottom', size='5%', pad='8%')
cbar.set_label('Degrees C')

# Meridians &amp;amp; Parallels
m.drawmeridians(np.arange(0, 360, 10), labels=[False,False,False,True])
m.drawparallels(np.arange(-90, 90, 10), labels=[True,False,False,False])

# Save and show
plt.show()

 

Plotting PRISM ASCII arrays using Matplotlib imshow

I was trying to display an old PRISM ASCII raster file, but didn’t want to have to load ArcGIS and then import the raster from the ASCII file.  The new PRISM arrays are in a BIL format, which is much more convenient for use in ArcGIS, but requires using something like GDAL to read into NumPy arrays.

Reading the ASCII raster format directly into a NumPy array is really simple though.  The format of the ASCII raster is defined as:

NCOLS xxx
NROWS xxx
XLLCENTER xxx | XLLCORNER xxx
YLLCENTER xxx | YLLCORNER xxx
CELLSIZE xxx
NODATA_VALUE xxx
row 1
row 2
...
row n

The only other issue with the old PRISM ASCII rasters is that they are stored as integers but scaled by 100.

import numpy as np

# Read in PRISM header data
with open('us_tmax_2011.14.asc', 'r') as prism_f:
    prism_header = prism_f.readlines()[:6]

# Read the PRISM ASCII raster header
prism_header = [item.strip().split()[-1] for item in prism_header]
prism_cols = int(prism_header[0])
prism_rows = int(prism_header[1])
prism_xll = float(prism_header[2])
prism_yll = float(prism_header[3])
prism_cs = float(prism_header[4])
prism_nodata = float(prism_header[5])

# Read in the PRISM array
prism_array = np.loadtxt(prism_path, dtype=np.float, skiprows=6)

# Set the nodata values to nan
prism_array[prism_array == prism_nodata] = np.nan

# PRISM data is stored as an integer but scaled by 100
prism_array *= 0.01

The next step was using the imshow function in matplotlib to display the array.  I thought a colorbar and grid lines would look nice also.

import matplotlib.pyplot as plt

# Plot PRISM array again
fig, ax = plt.subplots()
ax.set_title('PRISM 2011 Precipitation')

# Get the img object in order to pass it to the colorbar function
img_plot = ax.imshow(prism_array, cmap='jet')

# Place a colorbar next to the map
cbar = fig.colorbar(img_plot)

ax.grid(True)
plt.show()

 

 

prism_geographic_a

The figure clearly shows the data, but I thought it would be nice if the x and y ticks where in degrees instead of the row and column indices.  It would also be nice to have the colorbar match up with the y-axis of the plot and have a label.

I tried setting the ticks manually for awhile before I realized you could just pass an extent object to imshow.  It is pretty easy to manually set the colorbar parameters as well, but I’m sure there is an automated way to do this.

# Calculate the extent of the PRISM array
# Extent is Xmin, Xmax, Ymin, Ymax
prism_extent = [
    prism_xll, prism_xll + prism_cols * prism_cs,
    prism_yll, prism_yll + prism_rows * prism_cs]

img_plot = ax.imshow(prism_array, extent=prism_extent)

# Place a colorbar next to the map
cbar = plt.colorbar(img_plot, orientation='vertical', shrink=0.5, aspect=14)
cbar.set_label('Degrees C')

prism_geographic_b

If you want the colorbar to be on the bottom, that is a pretty easy fix as well.

cbar = plt.colorbar(img_plot, orientation='horizontal')
cbar.set_label('Degrees C')

prism_geographic_c