Creating a figure of a legend (without a plot)

I was trying to create a figure of a legend (without a plot) and it took me way too long to figure it out. What I wanted was something like this:

[Figure: legend.png — the desired patch legend]

I initially thought this would be really easy using the approach in this stack overflow post:

import matplotlib.pyplot as plt

labels = ['Label 1', 'Label 2', 'Label 3', 'Label 4']
colors = ['blue', 'green', 'red', 'orange']  # any four colors work here

fig = plt.figure()
fig_legend = plt.figure(figsize=(2, 1.25))
ax = fig.add_subplot(111)
lines = ax.plot(range(2), range(2), range(2), range(2), range(2), range(2), range(2), range(2))
fig_legend.legend(lines, labels, loc='center', frameon=False)
plt.show()

[Figure: legend.png — legend built from lines]

This worked well enough for making lines, but what I really wanted were the boxes/patches, like you would get with a bar chart or histogram.  I tried switching to plotting bars:

fig = plt.figure()
fig_legend = plt.figure(figsize=(2, 1.25))
ax = fig.add_subplot(111)
bars = ax.bar(range(4), range(4), color=colors, label=labels)
fig_legend.legend(bars.get_children(), labels, loc='center', frameon=False)

but this resulted in the error "RuntimeError: Can not put single artist in more than one figure".

I tried many different combinations of ax.get_legend_handles_labels(), bars.get_children(), plotting each bar separately, and plotting histograms, but kept getting the same error.

Finally, after far too long, I realized I didn't need to plot anything to get a patch object and could create the patches manually.  The following code was used to generate the figure at the top.

import matplotlib.patches as mpatches

fig = plt.figure(figsize=(2, 1.25))
patches = [
    mpatches.Patch(color=color, label=label)
    for label, color in zip(labels, colors)]
fig.legend(patches, labels, loc='center', frameon=False)
plt.show()
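To write the legend out as its own image file (a sketch; the filename just mirrors the figure shown at the top), savefig with bbox_inches='tight' trims most of the surrounding whitespace:

# Save the legend-only figure to disk
fig.savefig('legend.png', bbox_inches='tight')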

Adding state outlines to PRISM figures (without using Basemap)

I was showing my earlier PRISM/matplotlib examples to a student and their immediate response was to ask how hard it would be to add state outlines to the map without using Basemap.  A quick Google/stackoverflow search suggested that it shouldn’t be too hard (link, link, link, link), as long as the state outlines were in the same coordinate system/projection as the raster data.

The first thing was to read in the PRISM array and compute the extent from the HDR file in order to set the imshow extent.

import numpy as np

def read_prism_hdr(hdr_path):
    """Read an ESRI BIL HDR file"""
    with open(hdr_path, 'r') as input_f:
        header_list = input_f.readlines()
    return dict(item.strip().split() for item in header_list)

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
header_dict = read_prism_hdr(prism_path.replace('.bil', '.hdr'))

prism_nodata = float(header_dict['NODATA'])
prism_rows = int(header_dict['NROWS'])
prism_cols = int(header_dict['NCOLS'])

# Assume the cells are square and have the same cellsize in each direction
prism_cs = float(header_dict['XDIM'])
# prism_cs = float(header_dict['YDIM'])

# Build extent as xmin, xmax, ymin, ymax
# ULXMAP and ULYMAP are upper left cell center
prism_extent = [
    float(header_dict['ULXMAP']) - 0.5 * prism_cs,
    float(header_dict['ULXMAP']) - 0.5 * prism_cs + prism_cols * prism_cs,
    float(header_dict['ULYMAP']) + 0.5 * prism_cs - prism_rows * prism_cs,
    float(header_dict['ULYMAP']) + 0.5 * prism_cs
]

# Read in the array
prism_array = np.fromfile(prism_path, dtype=np.float32)
prism_array = prism_array.reshape(prism_rows, prism_cols)
prism_array[prism_array == prism_nodata] = np.nan

My first approach was basically copied from this script. I used the 1:20,000,000 state shapefile from the US Census Cartographic Boundaries website. I installed gdal, fiona, descartes, and shapely using the wheel files built by Christoph Gohlke, since none of the modules would install with conda (on my Windows computer). I had to use GDAL 1.11.4 instead of 2.0.2; otherwise I got an error when reading the shapefile with fiona.

import os
from collections import defaultdict

from descartes import PolygonPatch
import fiona
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
from shapely.geometry import MultiPolygon, Polygon, shape

# Read the polygon geometries from the shapefile
shp_path = 'cb_2014_us_state_20m.shp'
mp = MultiPolygon(
    [shape(pol['geometry']) for pol in fiona.open(shp_path, 'r')])

fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(prism_array, cmap='viridis', extent=prism_extent)
patches = [PolygonPatch(p, fc='none', ec='0.1', zorder=1) for p in mp.geoms]
ax.add_collection(PatchCollection(patches, match_original=True))
plt.grid()
plt.show()

[Figure: prism_missing_patches — several states are missing]

The problem with this combination of approach and dataset is that the state features are multi-polygons, but only the first polygon is read when a feature is a multi-polygon.  As you can see, some of the states with more than one polygon are missing (California, Michigan, Florida, etc.).

The following approach will check whether the feature is a polygon or multi-polygon. If it is a multi-polygon, it will read each polygon geometry from the multi-polygon.

geom_list = []
with fiona.open(shp_path, 'r') as cell_f:
    for feature in cell_f:
        ftr_geom = shape(feature['geometry'])

        if ftr_geom.is_empty:
            continue
        elif ftr_geom.geom_type == 'MultiPolygon':
            # Unpack multi-polygons to lists of polygons
            geom_list.extend([g for g in ftr_geom.geoms if not g.is_empty])
        elif ftr_geom.geom_type == 'Polygon':
            geom_list.append(ftr_geom)
        else:
            print('Invalid geometry type')
            continue
mp = MultiPolygon(geom_list)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(prism_array, cmap='viridis', extent=prism_extent)

patches = [PolygonPatch(p, fc='none', ec='0.1', zorder=1) for p in mp.geoms]
ax.add_collection(PatchCollection(patches, match_original=True))

plt.grid()
plt.show()

[Figure: prism_all_patches — all state outlines drawn]

I will definitely try this again and see if there is a simpler approach or one that relies on fewer modules.

Reading PRISM BIL arrays without using GDAL

Since it isn’t always easy to get GDAL working with Conda, I decided to try reading the PRISM BIL file using built-in Python modules and NumPy.

The first example I found (Plotting USGS BIL DEMs in Python) provided 95% of the steps needed to get this working using the struct module. I only needed to change the datatype (from the DEM's short/int16 to PRISM's float32). I decided to hardcode the width, height, datatype, number of bands, and nodata value for now, but all of these could/should be read from the HDR file.

import struct

import numpy as np

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_cols = 1405
prism_rows = 621
prism_nodata = -9999

bil_f = open(prism_path, 'rb')
bil_data = bil_f.read()
bil_f.close()

# Unpack binary data into a flat tuple z
s = '<%df' % (int(prism_cols * prism_rows),)
z = struct.unpack(s, bil_data)

prism_array = np.zeros((prism_rows, prism_cols), dtype=np.float32)
for r in range(0, prism_rows):
    for c in range(0, prism_cols):
        prism_array[r][c] = float(z[((prism_cols) * r) + c])
        # prism_array.itemset((r, c), float(z[((prism_cols) * r) + c]))

prism_array[prism_array == prism_nodata] = np.nan

I then realized that NumPy should be able to easily (and quickly) read a simple single band binary array file using fromfile().

import numpy as np

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_cols = 1405
prism_rows = 621
prism_nodata = -9999

prism_array = np.fromfile(prism_path, dtype=np.float32)
prism_array = prism_array.reshape(prism_rows, prism_cols)
prism_array[prism_array == prism_nodata] = np.nan

Might as well read the width, height, and nodata values from the HDR file, since it is available. You could get the datatype also.

import numpy as np

def read_prism_hdr(hdr_path):
    """Read an ESRI BIL HDR file"""
    with open(hdr_path, 'r') as input_f:
        header_list = input_f.readlines()
    return dict(item.strip().split() for item in header_list)

def read_prism_bil(bil_path):
    """Read an array from ESRI BIL raster file"""
    hdr_dict = read_prism_hdr(bil_path.replace('.bil', '.hdr'))
    # For now, only use NROWS, NCOLS, and NODATA
    # Eventually use NBANDS, BYTEORDER, LAYOUT, PIXELTYPE, NBITS

    prism_array = np.fromfile(bil_path, dtype=np.float32)
    prism_array = prism_array.reshape(
        int(hdr_dict['NROWS']), int(hdr_dict['NCOLS']))
    prism_array[prism_array == float(hdr_dict['NODATA'])] = np.nan
    return prism_array

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_array = read_prism_bil(prism_path)
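For now read_prism_bil() hardcodes float32, but as a sketch of the eventual extension, the PIXELTYPE and NBITS keys could be mapped to a NumPy dtype. The helper below is hypothetical; the key names and values follow the ESRI BIL header convention, and the defaults are assumptions since a given HDR file may not include every key.

def read_prism_dtype(hdr_dict):
    """Map ESRI BIL PIXELTYPE/NBITS header values to a NumPy dtype (sketch)"""
    pixel_type = hdr_dict.get('PIXELTYPE', 'FLOAT').upper()
    nbits = int(hdr_dict.get('NBITS', 32))
    if pixel_type == 'FLOAT':
        return np.dtype('float{}'.format(nbits))
    elif pixel_type == 'SIGNEDINT':
        return np.dtype('int{}'.format(nbits))
    else:
        return np.dtype('uint{}'.format(nbits))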

Plotting PRISM BIL arrays using GDAL & Matplotlib

Now that the PRISM data is being released in a BIL format, I decided I should update my old post on plotting PRISM arrays. First, I tried using ndimage.imread() and matplotlib.imread(), but neither worked with the BIL format. GDAL can handle it though.

[Figure: prism_tmean_normal]

import matplotlib.pyplot as plt
import numpy as np
from osgeo import gdal

prism_path = 'PRISM_tmean_30yr_normal_4kmM2_annual_bil.bil'
prism_nodata = -9999

prism_ds = gdal.Open(prism_path)
prism_band = prism_ds.GetRasterBand(1)
prism_array = prism_band.ReadAsArray().astype(np.float32)
# prism_nodata = prism_band.GetNoDataValue()
prism_array[prism_array == prism_nodata] = np.nan
prism_ds = None

plt.imshow(prism_array, cmap='viridis')
plt.show()

I really like the new Viridis colormap!


Reversing a matplotlib colormap

I wanted to test out the new Matplotlib colormap, viridis:

[Figure: the Matplotlib viridis colormap]

Initially, I couldn’t find an easy way to use it since it isn’t in the current version (1.4.3).  I found the underlying data list in "_cm_listed.py" in the Matplotlib GitHub repository, listed as "_viridis_data", and built the colormap from the data list:

from matplotlib.colors import LinearSegmentedColormap

# cm_data holds the "_viridis_data" list copied from _cm_listed.py
viridis_cm = LinearSegmentedColormap.from_list('viridis', cm_data)

Here is the figure I was working on with the viridis colormap:
[Figure: figure2_viridis]

Looking at the figure, what I really wanted was the reverse of the colormap. Since I wasn’t using one of the listed Matplotlib colormaps, I couldn’t use the "_r" trick to reverse it (i.e. to invert "jet" use "jet_r" instead). I looked around for a while for a simple method to reverse the colormap but didn’t find any really helpful stackoverflow posts. Finally, after digging around in the Matplotlib help, I noticed the revcmap function. From there it was simply a matter of calling the function on the extracted colormap data.

import matplotlib.cm
import matplotlib.colors

viridis_r_cm = matplotlib.colors.LinearSegmentedColormap(
    'viridis_r', matplotlib.cm.revcmap(viridis_cm._segmentdata))

The figure still doesn’t look great but it is the reverse:
[Figure: figure2_viridis_r]
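In hindsight, since the colormap was built from a plain list of RGB values, the reversed colormap can also be built by just reversing that list (a sketch, assuming the same cm_data list from above):

# Reversing the data list reverses the colormap
viridis_r_cm = LinearSegmentedColormap.from_list('viridis_r', cm_data[::-1])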

Wrap integer values to fixed range

I had a project where I wanted to select a window of days centered on a specific day of year (DOY), but the window had to also be defined by DOY. For example, for DOY 2 (Jan. 2nd), a 7 day window (3 before and 3 after) would go from DOY 364 (Dec. 30th, non-leap years) to DOY 5 (Jan. 5th). This could probably be done using Python datetime objects or some built-in Python module, but I also needed to implement this in JavaScript and on Google Earth Engine, so I thought it would be fun to come up with a generic algorithm.

It seemed like the modulo operator (%) would be a good place to start. For the range 0 to n-1, x % n for x >= 0 will be the correctly wrapped value.

x_list = list(range(8, 23))
print(x_list)
[8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]
print([x % 10 for x in x_list])
[8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]

In Python, the modulo operator gets the sign of the output from the divisor. So for negative values the modulo will still correctly wrap the value.

x_list = list(range(-12, 3))
print(x_list)
[-12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2]
print([x % 10 for x in x_list])
[8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]

This doesn’t work in JavaScript though, or in any language where the sign of the output comes from the dividend. To show this in Python, you can use math.fmod(), which gets the sign from the dividend.

import math
x_list = list(range(-12, 3))
print(x_list)
[-12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2]
print([int(math.fmod(x, 10)) for x in x_list])
[-2, -1, 0, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2]

The divisor-based modulo operator can be implemented from the dividend-based one as:
((x % n) + n) % n

import math
x_list = list(range(-12, 3))
print(x_list)
[-12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2]
print([int(math.fmod(math.fmod(x,10) + 10, 10)) for x in x_list])
[8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]

This will wrap values to the range 0 to n-1, but what about an arbitrary range a to b? For this I decided to just shift the range down to start at zero, wrap the values, then shift them back up to a to b.
(x - a) % (b - a + 1) + a

The following Python example will wrap the values to the range 2-7. To do the same thing in Earth Engine or JavaScript just means including the extra operations from above.

a = 2
b = 7
x_list = list(range(-5, 12))
print(x_list)
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print([((x - a) % (b - a + 1) + a) for x in x_list])
[7, 2, 3, 4, 5, 6, 7, 2, 3, 4, 5, 6, 7, 2, 3, 4, 5]
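Putting the two pieces together, here is a small helper (a sketch) that wraps any integer into the inclusive range a to b. It uses math.fmod so the same logic ports directly to dividend-sign languages like JavaScript:

import math

def wrap(x, a, b):
    """Wrap integer x into the inclusive range [a, b]"""
    n = b - a + 1
    # The double fmod handles negative dividends correctly even when
    # the modulo takes its sign from the dividend
    return int(math.fmod(math.fmod(x - a, n) + n, n)) + a

print([wrap(x, 2, 7) for x in range(-5, 12)])
# [7, 2, 3, 4, 5, 6, 7, 2, 3, 4, 5, 6, 7, 2, 3, 4, 5]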

Binary erosion without using SciPy ndimage

One of the best parts about Python is all of the great add-on modules that allow you to do very complex operations in just a few lines of code.  That being said, I also greatly enjoy when I can’t use a specific module or function and I am forced to really learn an algorithm in order to recreate it.  I recently needed to do a binary erosion of a NumPy array without using the binary_erosion() function in the SciPy ndimage package (the ArcGIS Python install only includes NumPy and matplotlib).

The following is an example of binary erosion using the SciPy ndimage function:

>>> import numpy as np
>>> from scipy import ndimage
>>> input_array = np.ones((7,7), dtype=int)
>>> input_array[3,0] = 0
>>> input_array[1,3:5] = 0
>>> input_array
array([[1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 0, 0, 1, 1],
       [1, 1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1]])
>>> structure = np.ones((3,3), dtype=int)
>>> ndimage.binary_erosion(input_array, structure)
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 1, 0],
       [0, 0, 1, 1, 1, 1, 0],
       [0, 1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0, 0]])

I searched stackoverflow for a while, but most of the responses were to use ndimage.  Since the algorithm itself is quite simple (set the output pixel to the minimum value of all the pixels in the input pixel’s structure/neighborhood), I decided to start from scratch instead of borrowing from stackoverflow like I usually do.  Initially, I was going to iterate over every cell in the input array, select the subset using the structure element as a mask, and then calculate the minimum.  The drawback to this approach is that I would need quite a bit of logic to handle the edge cases, because the structure element would also need to be sliced.  Instead, I decided to just pad the input array with zeros so the structure element would fit without modification.

>>> pad_array = np.zeros((9,9), dtype=int)
>>> pad_array
array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0]])
>>> pad_array[1:-1,1:-1] = input_array
>>> pad_array
array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 1, 1, 0],
       [0, 1, 1, 1, 0, 0, 1, 1, 0],
       [0, 1, 1, 1, 1, 1, 1, 1, 0],
       [0, 0, 1, 1, 1, 1, 1, 1, 0],
       [0, 1, 1, 1, 1, 1, 1, 1, 0],
       [0, 1, 1, 1, 1, 1, 1, 1, 0],
       [0, 1, 1, 1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0]])

Padding with zeros also made it trivial to propagate the edge cells into the array like occurs with ndimage.binary_erosion().  One potential downside of this approach is that I am creating an additional copy of the input array in order to pad it.
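As an aside, NumPy's pad() function can build the same zero-padded array in a single call, which avoids the manual zeros()/slice assignment (a sketch):

# Pad the input array with a 1-cell border of zeros
pad_array = np.pad(input_array, 1, mode='constant', constant_values=0)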

Below is the numpy binary erosion function. I am sure that this is a very inefficient approach, but it worked!

def np_binary_erosion(input_array, structure=np.ones((3, 3)).astype(bool)):
    '''NumPy binary erosion function

    No error checking on input array (type)
    No error checking on structure element (# of dimensions, shape, type, etc.)

    Args:
    input_array: Binary NumPy array to be eroded. Non-zero (True) elements
        form the subset to be eroded
    structure: Structuring element used for the erosion. Non-zero elements
        are considered True. If no structuring element is provided, an
        element is generated with a square connectivity equal to two
        (square, not cross).
    Returns:
        binary_erosion: Erosion of the input by the structuring element
    '''
    rows, cols = input_array.shape

    ## Pad the input array with extra cells around the edge
    ## so that the structuring element will fit without wrapping.
    ## A 3x3 structure will need 1 additional cell around the edge
    ## A 5x5 structure will need 2 additional cells around the edge
    pad_shape = (
        rows + structure.shape[0] - 1,
        cols + structure.shape[1] - 1)
    input_pad_array = np.zeros(pad_shape).astype(bool)
    input_pad_array[1:rows + 1, 1:cols + 1] = input_array
    binary_erosion = np.zeros(pad_shape).astype(bool)

    ## Cast structure element to boolean
    struc_mask = structure.astype(bool)

    ## Iterate over each cell
    ## Note: the slice size and +1 offsets assume the default 3x3 structure
    for row in range(rows):
        for col in range(cols):
            ## The value of the output pixel is the minimum value of all the
            ## pixels in the input pixel's neighborhood.
            binary_erosion[row + 1, col + 1] = np.min(
                input_pad_array[row:row + 3, col:col + 3][struc_mask])
    return binary_erosion[1:rows + 1, 1:cols + 1]

And the result using the function:

>>> np_binary_erosion(input_array, structure).astype(int)
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 1, 0],
       [0, 0, 1, 1, 1, 1, 0],
       [0, 1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0, 0]])
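And a quick check (a sketch) that it matches the ndimage result from the top of the post:

>>> from scipy import ndimage
>>> expected = ndimage.binary_erosion(input_array, structure)
>>> np.array_equal(np_binary_erosion(input_array, structure), expected)
True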

Using ArcPy with Anaconda

Recently I have had really good luck using Anaconda to manage Python modules, but I didn’t know how to use the ArcGIS 10.2 ArcPy module with it.  ArcGIS installs its own Python installation (C:\Python27\ArcGIS10.2) and I was only able to import arcpy when using this copy of Python.

To make this work, I started with this really helpful stackoverflow post.  Basically, I uninstalled all existing ArcGIS installations and python modules, installed ArcGIS, then installed Anaconda (or Miniconda) using all the default settings/options.  I clicked the button to register Anaconda as the default Python installation, but I’m not sure this actually worked.

Unlike the stackoverflow post, I then added C:\Anaconda and C:\Anaconda\bin to the system path, but I removed C:\Python27\ArcGIS10.2.  For ArcGIS 10.2 (32bit), the file in C:\Python27\ArcGIS10.2\Lib\site-packages that you need to copy to C:\Anaconda\Lib\site-packages is called desktop10.2.pth.  For ArcGIS 10.2 (64bit), the file in C:\Python27\ArcGISx6410.2\Lib\site-packages that you need to copy to C:\Anaconda\Lib\site-packages is called DTBGGP64.pth.

Finally, the only way I could get the Anaconda Python to be used when double-clicking on a .py file was to modify the registry.  I didn’t try it, but you could probably change these values from the command prompt as outlined in the Python documentation.  I changed the following keys to point to the Anaconda folder:

HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Python.File\shell\Edit with IDLE
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Python.File\shell\open
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Python.CompiledFile\shell\open\command

After that I could import ArcPy from the Anaconda Python without error.
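A quick way to confirm the wiring (a sketch) is to check which interpreter is running and then import arcpy:

import sys
print(sys.executable)  # should point at the Anaconda installation

import arcpy
print(arcpy.GetInstallInfo()['Version'])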

Use NumPy & Avoid For Loops

I was given a Python script by a student for simulating the random motion of particles.  It was really good for a first attempt but he had ignored my pleas to avoid nested for loops and to use NumPy arrays and functions instead.  The following is a comparison of the original script and my two attempts to improve it.

This is the figure that the scripts output.  The student wanted to show the final position of all the particles and the trajectory of the first (red) and last (green) particle.  For this test I tracked 100 particles through 100,000 steps.

[Figure: 3D_100_Particles_100_Steps_v0]

The following is a simplified version of the original script.  It takes 128 seconds to execute.  For each particle and time step, a random direction and distance are applied.

import math
import random
from time import perf_counter as clock  # time.clock was removed in Python 3.8

import matplotlib.pyplot as pp
from matplotlib import rcParams
from mpl_toolkits.mplot3d import Axes3D as pp3d  # needed to register the 3d projection
import numpy as np

## Input parameters
particles = 100
steps = 100000

calc_time = clock()

## Initialize all particles at zero
points = np.zeros((particles,3))

## For the first and last points, save the full path history
## Need one extra step value to store starting point
pointsR = np.zeros((steps+1,3))
pointsG = np.zeros((steps+1,3))

## Loop through each particle
for j in range(0,particles):

    ## Loop through each step
    for i in range(1,steps+1):
        ## For the angles, use the python uniform random distribution
        theta = random.uniform(0, 2 * math.pi)
        phi = random.uniform(0, 2 * math.pi)
        ## For the distance, use the python gaussian random distribution
        r = random.gauss(1.0, 0.01)

        ## Apply the motion to each particle for each step
        points[j,0] += r * np.sin(phi) * np.cos(theta)
        points[j,1] += r * np.sin(phi) * np.sin(theta)
        points[j,2] += r * np.cos(phi)

        ## Save the full trajectory for the first and last particle
        if j == 0:
            pointsR[i,0] = points[j,0]
            pointsR[i,1] = points[j,1]
            pointsR[i,2] = points[j,2]
        if j == particles - 1:
            pointsG[i,0] = points[j,0]
            pointsG[i,1] = points[j,1]
            pointsG[i,2] = points[j,2]

## Plot the final point locations
fig = pp.figure()
rcParams.update({'figure.autolayout': True}) # adjusts layout of graph
ax = fig.add_subplot(111, projection='3d')
ax.scatter(points[:,0], points[:,1], points[:,2], c='b')
ax.plot(pointsR[:,0], pointsR[:,1], pointsR[:,2], '-', c='r')
ax.plot(pointsG[:,0], pointsG[:,1], pointsG[:,2], '-', c='g')
print('Calc Time: {0}'.format(clock()-calc_time))
pp.show() 


As a first step, I thought that all the particles could be processed at once, instead of separately.  For each time step, I built an array of the random motions for all the particles and then applied these all at once.  This version takes 6.4 seconds to execute.

## Initialize all particles at zero
points = np.zeros((particles,3))

## For the first and last points, save the full path history
## Need one extra step value to store starting point
pointsR = np.zeros((steps+1,3))
pointsG = np.zeros((steps+1,3))

## Loop through each step
for i in range(steps):
    ## For the angles, use the NumPy uniform random distribution function
    theta = np.random.uniform(0, 2 * math.pi, particles)
    phi = np.random.uniform(0, 2 * math.pi, particles)

    ## For the distance, use the NumPy normal (Gaussian) random distribution
    r = np.random.normal(1.0, 0.01, particles)

    ## Apply the motion to each axis for all particles
    points[:,0] += r * np.sin(phi) * np.cos(theta)
    points[:,1] += r * np.sin(phi) * np.sin(theta)
    points[:,2] += r * np.cos(phi)

    ## Save the full trajectory of the first and last point
    ## Particles started at 0 so add 1 to index
    pointsR[i+1,:] = points[0,:]
    pointsG[i+1,:] = points[-1,:]


It also seemed like this should be possible without using any for loops.  To do that I built an array of the random motions for all particles and steps, and then used the NumPy cumulative sum function to apply the motions across the “time steps” axis of the array.  This approach was faster for some combinations of particles and steps.  With the 100 particles and 100,000 steps this takes 2.3 seconds to execute.  The main downside to this approach is that it can use a lot of memory since all the motions are stored in memory at once.

## Initialize all particles and steps at zero
points = np.zeros((3,steps,particles))

## Pre-calculate sin/cos for all particles and steps
## For the angles, use the NumPy uniform random distribution function
phi = np.random.uniform(0, 2 * math.pi, (steps,particles))
theta = np.random.uniform(0, 2 * math.pi, (steps,particles))

## For the distance, use the NumPy normal (Gaussian) random distribution
r = np.random.normal(1.0, 0.01, (steps,particles))

## Pre calculate the x,y,z motion for each particle and step
points[0,:,:] = r * np.sin(phi) * np.cos(theta)
points[1,:,:] = r * np.sin(phi) * np.sin(theta)
points[2,:,:] = r * np.cos(phi)

## For all particles, sum the motions for all steps at once
np.cumsum(points, axis=1, out=points)

## Save the full trajectory of the first and last point
## This doesn't include the first point at the origin though
## Note: the axes are transposed relative to the earlier versions, so
## plotting would use pointsR[0,:], pointsR[1,:], pointsR[2,:]
pointsR = points[:,:,0]
pointsG = points[:,:,-1]


Plotting PRISM ASCII arrays using Basemap

[Figure: prism_projected]
Plotting NumPy arrays directly using Matplotlib was simple, but what I really wanted was to show the data in a projected coordinate system.  With a little bit of searching, it seemed like the Python basemap module should be able to handle this task pretty easily.  The examples on the basemap website were really good, but the example on Peak5398’s Introduction to Python site also really helped with some of the details.

Just like last time, the first step was reading in the PRISM ASCII raster to a NumPy array.  The only difference is that I read the data into a masked array.

import numpy as np

# Read in PRISM header data
prism_name = 'us_tmax_2011.14.asc'
with open(prism_name, 'r') as prism_f:
    prism_header = prism_f.readlines()[:6]

# Parse the value from each header line (the last token)
prism_header = [item.strip().split()[-1] for item in prism_header]
prism_cols = int(prism_header[0])
prism_rows = int(prism_header[1])
prism_xll = float(prism_header[2])
prism_yll = float(prism_header[3])
prism_cs = float(prism_header[4])
prism_nodata = float(prism_header[5])

# Read in the PRISM array
prism_array = np.loadtxt(prism_name, dtype=float, skiprows=6)

# Build a masked array
prism_ma = np.ma.masked_equal(prism_array, prism_nodata)

# PRISM data is stored as an integer but scaled by 100
prism_ma *= 0.01

# Calculate the extent of the PRISM array
# XMin, XMax, YMin, YMax
prism_extent = [
    prism_xll, prism_xll + prism_cols * prism_cs,
    prism_yll, prism_yll + prism_rows * prism_cs]

I chose to use an Albers Equal Area projection, but it seems really simple to set just about any projection.  I used a masked array so that I could use the maskoceans function to remove the PRISM values over the ocean.  Just setting these cells to NaN didn’t work.

Overall it seems really easy to plot array data in a projected coordinate system.

import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap, maskoceans

# Build arrays of the lat/lon points for each cell
lons = np.arange(prism_extent[0], prism_extent[1], prism_cs)
lats = np.arange(prism_extent[3], prism_extent[2], -prism_cs)
lons, lats = np.meshgrid(lons, lats)

# Just the contiguous United States
m = Basemap(
    width=5000000, height=3400000, projection='aea', resolution='l',
    lat_1=29.5, lat_2=45.5, lat_0=38.5, lon_0=-96.)

# Build the figure and ax objects
fig, ax = plt.subplots()
ax.set_title('PRISM 2011 Tmax')

# Built-in map boundaries
m.drawcoastlines()
m.drawlsmask(land_color='white', ocean_color='0.8', lakes=False)
m.drawcountries()
m.drawstates()

# Set ocean pixels to nodata
prism_ma_land = maskoceans(lons, lats, prism_ma)

# Draw PRISM array directly to map
im = m.pcolormesh(lons, lats, prism_ma_land, cmap=plt.cm.jet, latlon=True)

# Build the colorbar on the right
cbar = m.colorbar(im, 'right', size='5%', pad='5%')
# Build the colorbar on the bottom
# cbar = m.colorbar(im, 'bottom', size='5%', pad='8%')
cbar.set_label('Degrees C')

# Meridians & Parallels
m.drawmeridians(np.arange(0, 360, 10), labels=[False,False,False,True])
m.drawparallels(np.arange(-90, 90, 10), labels=[True,False,False,False])

# Save and/or show
# plt.savefig('prism_projected.png')
plt.show()