Polygons#
In addition to points, lines, areas, rasters, and trimeshes, Datashader can quickly render large collections of polygons (filled polylines). Datashader’s polygon support depends on data structures provided by the spatialpandas library. SpatialPandas extends Pandas and Parquet to support efficient storage and manipulation of “ragged” (variable length) data like polygons. Instructions for installing spatialpandas are given at the bottom of this page.
SpatialPandas Data Structures#
Pandas supports custom column types using an “ExtensionArray” interface. SpatialPandas provides two Pandas ExtensionArrays that support polygons:
spatialpandas.geometry.PolygonArray: Each row in the column is a singlePolygoninstance. As with shapely and geopandas, each Polygon may contain zero or more holes.spatialpandas.geometry.MultiPolygonArray: Each row in the column is aMultiPolygoninstance, each of which can store one or more polygons, with each polygon containing zero or more holes.
Datashader assumes that the vertices of the outer filled polygon will be listed as x1, y1, x2, y2, etc. in counter-clockwise (CCW) order around the polygon edge, while the holes will be in clockwise (CW) order. All polygons (both filled and holes) must be “closed”, with the first vertex of each polygon repeated as the last vertex.
graph TD;
subgraph SP[ ]
A[Polygon]:::spatialpandas -->|used in| E[PolygonArray]:::spatialpandas
B[MultiPolygon]:::spatialpandas -->|used in| F[MultiPolygonArray]:::spatialpandas
E -->|used in| G[GeoDataFrame or<div></div> GeoSeries]:::spatialpandas
spatialpandas(SpatialPandas):::spatialpandas_title
F -->|used in| G
end
subgraph SH[ ]
shapely(Shapely):::shapely_title
H[Polygon]:::shapely -->|converts to| A
I[MultiPolygon]:::shapely -->|converts to| B
end
subgraph GP[ ]
geopandas(GeoPandas):::geopandas_title
J[GeoDataFrame or<div></div> GeoSeries]:::geopandas
end
subgraph PD[ ]
pandas(Pandas):::pandas_title
K[DataFrame or<div></div> Series]:::pandas
end
subgraph SPD[ ]
spatialpandas.dask(SpatialPandas.Dask):::dask_title
L[DaskGeoDataFrame or<div></div> DaskGeoSeries]:::dask
end
M(Datashader):::datashader_title
F -->|usable in| K
E -->|usable in| K
G -->|converts to| L
G <-->|converts to| J
G -->|usable by| M
L -->|usable by| M
J -->|usable by| M
K -->|usable by| M
classDef spatialpandas fill:#4e79a7,stroke:#000,stroke-width:0px,color:black;
classDef shapely fill:#f28e2b,stroke:#000,stroke-width:0px,color:black;
classDef geopandas fill:#59a14f,stroke:#000,stroke-width:0px,color:black;
classDef dask fill:#76b7b2,stroke:#000,stroke-width:0px,color:black;
classDef datashader fill:#b07aa1,stroke:#000,stroke-width:0px,color:black;
classDef pandas fill:#edc948,stroke:#000,stroke-width:0px,color:black;
classDef spatialpandas_title fill:#fff,stroke:#4e79a7,stroke-width:7px,color:black,fill-opacity:.9,font-weight: bold;
classDef shapely_title fill:#fff,stroke:#f28e2b,stroke-width:7px,color:black,fill-opacity:.9,font-weight: bold;
classDef geopandas_title fill:#fff,stroke:#59a14f,stroke-width:7px,color:black,fill-opacity:.9,font-weight: bold;
classDef dask_title fill:#fff,stroke:#76b7b2,stroke-width:7px,color:black,fill-opacity:.9,font-weight: bold;
classDef pandas_title fill:#fff,stroke:#edc948,stroke-width:7px,color:black,fill-opacity:.9,font-weight: bold;
classDef datashader_title fill:#fff,stroke:#b07aa1,stroke-width:7px,color:black,fill-opacity:.9,font-weight: bold;
classDef subgraph_style fill:#fff;
style SP fill:grey,stroke:#fff,stroke-width:0px
style SH fill:grey,stroke:#fff,stroke-width:0px
style GP fill:grey,stroke:#fff,stroke-width:0px
style PD fill:grey,stroke:#fff,stroke-width:0px
style SPD fill:grey,stroke:#fff,stroke-width:0px
Importing Required Libraries#
import pandas as pd
import numpy as np
import dask.dataframe as dd
import colorcet as cc
import datashader as ds
import datashader.transfer_functions as tf
import spatialpandas as sp
import spatialpandas.geometry
import spatialpandas.dask
Simple Example#
Here is a simple example of a two-element MultiPolygonArray. The first element specifies two filled polygons, the first with two holes. The second element contains one filled polygon with one hole.
polygons = sp.geometry.MultiPolygonArray([
# First Element
[[[0, 0, 1, 0, 2, 2, -1, 4, 0, 0], # Filled quadrilateral (CCW order)
[0.5, 1, 1, 2, 1.5, 1.5, 0.5, 1], # Triangular hole (CW order)
[0, 2, 0, 2.5, 0.5, 2.5, 0.5, 2, 0, 2]], # Rectangular hole (CW order)
[[-0.5, 3, 1.5, 3, 1.5, 4, -0.5, 3]],], # Filled triangle
# Second Element
[[[1.25, 0, 1.25, 2, 4, 2, 4, 0, 1.25, 0], # Filled rectangle (CCW order)
[1.5, 0.25, 3.75, 0.25, 3.75, 1.75, 1.5, 1.75, 1.5, 0.25]],]]) # Rectangular hole (CW order)
The filled quadrilateral starts at (x,y) (0,0) and goes to (1,0), then (2,2), then (-1,4), and back to (0,0); others similarly go left to right (but drawing in either CCW or CW order around the edge of the polygon depending on whether they are filled or holes).
Since a MultiPolygonArray is a pandas/dask extension array, it can be added as a column to a DataFrame. For convenience, you can define your DataFrame as sp.GeoDataFrame instead of pd.DataFrame, which will automatically include support for polygon columns:
df = sp.GeoDataFrame({'polygons': polygons, 'v': range(1, len(polygons)+1)})
df
| polygons | v | |
|---|---|---|
| 0 | MultiPolygon([[[0.0, 0.0, 1.0, 0.0, 2.0, 2.0, ... | 1 |
| 1 | MultiPolygon([[[1.25, 0.0, 1.25, 2.0, 4.0, 2.0... | 2 |
Rasterizing Polygons#
Polygons are rasterized to a Canvas by Datashader using the Canvas.polygons method. This method works like the existing glyph methods (.points, .line, etc), except it does not have x and y arguments. Instead, it has a single geometry argument that should be passed the name of a PolygonArray or MultiPolygonArray column in the supplied DataFrame. For comparison we’ll also show the polygon outlines rendered with Canvas.line (which also supports geometry columns):
%%time
cvs = ds.Canvas()
agg = cvs.polygons(df, geometry='polygons', agg=ds.sum('v'))
filled = tf.shade(agg)
float(agg.min()), float(agg.max())
CPU times: user 1.88 s, sys: 231 ms, total: 2.11 s
Wall time: 2.91 s
(1.0, 3.0)
%%time
cvs = ds.Canvas()
agg = cvs.line(df, geometry='polygons', agg=ds.sum('v'), line_width=4)
unfilled = tf.shade(agg)
float(agg.min()), float(agg.max())
CPU times: user 1.68 s, sys: 69.8 ms, total: 1.75 s
Wall time: 2.44 s
(0.0032808413085092525, 3.0)
tf.Images(filled, unfilled)
Here as you can see each polygon is filled or outlined with the indicated value from the v column for that polygon specification. The sum aggregator for the filled polygons specifies that the rendered colors should indicate the sum of the v values of polygons that overlap that pixel, and so the first element (v=1) has an overlapping area with value 2, and the second element has a value of 2 except in overlap areas where it gets a value of 3. Each plot is normalized separately, so the filled plot uses the colormap for the range 1 (light blue) to 3 (dark blue), while the outlined plot ranges from 1 to 2.
You can use polygons from within interactive plotting programs with axes to see the underlying values by hovering, which helps for comparing Datashader’s aggregation-based approach (right) to standard polygon rendering (left):
import holoviews as hv
from holoviews.operation.datashader import rasterize
from holoviews.streams import PlotSize
PlotSize.scale=2 # Sharper plots on Retina displays
hv.extension("bokeh")
hvpolys = hv.Polygons(df, vdims=['v']).opts(color='v', tools=['hover'])
hvpolys + rasterize(hvpolys, aggregator=ds.sum('v')).opts(tools=['hover'])