tiling#
Building Tilesets using Datashader#
Datashader provides render_tiles, a utility function for creating tilesets from arbitrary Datashader pipelines.
from datashader.tiles import render_tiles
A couple of notes about the tiling process:
By default, render_tiles uses a simple Web Mercator Tiling Scheme (EPSG:3857).
Call render_tiles with the following arguments:
extent_of_area_i_want_to_tile = (-500000, -500000, 500000, 500000)  # xmin, ymin, xmax, ymax
render_tiles(extent_of_area_i_want_to_tile,
             range(6),  # zoom levels 0 through 5
             output_path='example_tileset_output_directory',
             load_data_func=function_which_returns_dataframe,
             rasterize_func=function_which_creates_xarray_aggregate,
             shader_func=function_which_renders_aggregate_to_datashader_image,
             post_render_func=function_which_post_processes_image)
Data representing x / y coordinates is assumed to be in meters (m), based on the Web Mercator coordinate system.
The tiling extent is subdivided into supertiles, generally of size 4096 x 4096.
The load_data_func returns a dataframe-like object and contains your data-access-specific code.
The rasterize_func returns an xr.DataArray and contains your xarray-specific code.
The shader_func returns a ds.Image and contains your Datashader-specific code.
The post_render_func is called once for each final tile (default 256 x 256) and contains PIL (Python Imaging Library) specific code. This is the hook for adding additional filters, text, watermarks, etc. to output tiles.
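To make the supertile and tile sizes above concrete, the following sketch splits one square supertile into final tiles. The function name split_supertile is hypothetical, the sizes are the ones quoted in the notes above, and the slicing is only an assumption about how such a subdivision could work, not a description of Datashader's internals.

```python
def split_supertile(supertile_size=4096, tile_size=256):
    """Yield (row, col, pixel_box) for each tile inside one square supertile."""
    if supertile_size % tile_size != 0:
        raise ValueError("supertile_size must be a multiple of tile_size")
    tiles_per_side = supertile_size // tile_size
    for row in range(tiles_per_side):
        for col in range(tiles_per_side):
            left = col * tile_size
            upper = row * tile_size
            # (left, upper, right, lower) is the box layout PIL's Image.crop expects
            yield row, col, (left, upper, left + tile_size, upper + tile_size)


tiles = list(split_supertile())
print(len(tiles))  # 16 x 16 tiles per 4096 x 4096 supertile
```

Under these assumptions, a post_render_func would run 256 times for every full-size supertile, so it is worth keeping that function cheap.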
Creating Tile Component Functions#
Create load_data_func#
Accepts x_range and y_range arguments, which correspond to the ranges of the supertile being rendered.
Returns a dataframe-like object (pd.DataFrame / dask.DataFrame).
This example load_data_func creates a pandas dataframe with x and y fields sampled from a Wald distribution.
import pandas as pd
import numpy as np

df = None  # module-level cache so the data is generated only once

def load_data_func(x_range, y_range):
    global df
    if df is None:
        xoffsets = [-1, 1, -1, 1]
        yoffsets = [-1, 1, 1, -1]
        xs = np.concatenate([np.random.wald(10000000, 10000000, size=10000000) * offset for offset in xoffsets])
        ys = np.concatenate([np.random.wald(10000000, 10000000, size=10000000) * offset for offset in yoffsets])
        df = pd.DataFrame(dict(x=xs, y=ys))
    # keep only the rows that fall inside the requested supertile ranges
    return df.loc[df['x'].between(*x_range) & df['y'].between(*y_range)]
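Because render_tiles assumes coordinates in Web Mercator meters, data stored as longitude / latitude has to be converted before it is returned, and load_data_func is one plausible place to do that. The sketch below uses the standard spherical Web Mercator formulas. The function name lnglat_to_web_mercator is hypothetical and is not part of the example above.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def lnglat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude / latitude in degrees to Web Mercator meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# The eastern edge of the projection is about 20,037,508 m from the origin.
x_edge, y_equator = lnglat_to_web_mercator(180.0, 0.0)
print(round(x_edge, 2))
```

The formula is undefined at the poles, which is why Web Mercator tilesets are usually clipped to roughly 85.05 degrees of latitude.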
Create rasterize_func#
Accepts df, x_range, y_range, height, and width arguments, which correspond to the data, ranges, and plot dimensions of the supertile being rendered.
Returns an xr.DataArray object representing the aggregate.
import datashader as ds

def rasterize_func(df, x_range, y_range, height, width):
    # aggregate
    cvs = ds.Canvas(x_range=x_range, y_range=y_range,
                    plot_height=height, plot_width=width)
    agg = cvs.points(df, 'x', 'y')
    return agg
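Conceptually, the aggregate returned here is a grid of per-pixel point counts. The pure-Python sketch below illustrates that idea on a tiny grid. The function name count_points is hypothetical, and the binning is a simplification: it is not Datashader's implementation and makes no claim about how Datashader treats points that lie exactly on the upper edges.

```python
def count_points(points, x_range, y_range, width, height):
    """Count points per pixel; returns a height x width grid of ints."""
    xmin, xmax = x_range
    ymin, ymax = y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if not (xmin <= x < xmax and ymin <= y < ymax):
            continue  # point lies outside the canvas
        col = int((x - xmin) / (xmax - xmin) * width)
        row = int((y - ymin) / (ymax - ymin) * height)
        grid[row][col] += 1
    return grid


grid = count_points([(0.1, 0.1), (0.2, 0.2), (0.9, 0.9)],
                    x_range=(0, 1), y_range=(0, 1), width=2, height=2)
print(grid)  # [[2, 0], [0, 1]]
```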
Create shader_func#
Accepts agg (xr.DataArray) and span (tuple(min, max)) arguments. The span argument can be used to control color mapping / auto-ranging across supertiles.
Returns a ds.Image object representing the shaded image.
import datashader.transfer_functions as tf
from datashader.colors import viridis

def shader_func(agg, span=None):
    img = tf.shade(agg, cmap=reversed(viridis), span=span, how='log')
    img = tf.set_background(img, 'black')
    return img
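The reason a shared span matters can be shown with a small numeric sketch. If every supertile auto-ranged on its own data, the same count would land at different positions in the colormap from one supertile to the next. The function log_normalize below is hypothetical and only approximates the idea of log scaling against a fixed range. It is not Datashader's exact formula, and the value 2914 is borrowed from the level 0 log output shown later on this page.

```python
import math


def log_normalize(count, span):
    """Map a count to [0, 1] on a log scale, relative to a fixed (min, max) span."""
    lo, hi = span
    if hi <= lo:
        return 0.0
    clipped = min(max(count, lo), hi)
    return math.log1p(clipped - lo) / math.log1p(hi - lo)


shared_span = (0, 2914)
shared = log_normalize(100, shared_span)  # same position in every supertile
local = log_normalize(100, (0, 100))      # per-supertile auto-range saturates
print(round(shared, 3), local)
```

With the shared span, a count of 100 sits partway up the colormap everywhere. With a local range whose maximum happens to be 100, the same count is drawn at the top of the colormap, which would show up as visible seams between supertiles.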
Create post_render_func#
Accepts img and extras arguments, which correspond to the output PIL.Image before it is written to disk (or S3) and additional image properties.
Returns the image (PIL.Image).
This is a good place to run any non-Datashader-specific logic on each output tile.
from PIL import ImageDraw

def post_render_func(img, **kwargs):
    # label each tile with its x / y / z address
    info = "x={},y={},z={}".format(kwargs['x'], kwargs['y'], kwargs['z'])
    draw = ImageDraw.Draw(img)
    draw.text((5, 5), info, fill='rgb(255, 255, 255)')
    return img
Render tiles to local filesystem#
full_extent_of_data = (-500000, -500000, 500000, 500000)
output_path = 'tiles_output_directory/wald_tiles'
results = render_tiles(full_extent_of_data,
                       range(3),
                       load_data_func=load_data_func,
                       rasterize_func=rasterize_func,
                       shader_func=shader_func,
                       post_render_func=post_render_func,
                       output_path=output_path)
calculating statistics for level 0
rendering 1 supertiles for zoom level 0 with span=(np.uint32(0), np.uint32(2914))
calculating statistics for level 1
rendering 1 supertiles for zoom level 1 with span=(np.uint32(0), np.uint32(770))
calculating statistics for level 2
rendering 1 supertiles for zoom level 2 with span=(np.uint32(0), np.uint32(215))
Preview the tileset using Bokeh#
Browse to the tile output directory and start an HTTP server on port 8080, the port used in the tile URL below:
$> cd tiles_output_directory/wald_tiles
$> python -m http.server 8080
Press CTRL-C to stop the server.
Build a Bokeh figure (bokeh.plotting.figure) that displays the tiles:
from bokeh.plotting import figure
from bokeh.models.tiles import WMTSTileSource
from bokeh.io import show
from bokeh.io import output_notebook

output_notebook()

xmin, ymin, xmax, ymax = full_extent_of_data

p = figure(width=800, height=800,
           x_range=(int(-20e6), int(20e6)),
           y_range=(int(-20e6), int(20e6)),
           tools="pan,wheel_zoom,reset")
p.background_fill_color = 'black'
p.grid.grid_line_alpha = 0
p.axis.visible = False
p.add_tile(WMTSTileSource(url="http://localhost:8080/{Z}/{X}/{Y}.png"),
           render_parents=False)
show(p)
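The {Z}/{X}/{Y} placeholders in the tile URL address one tile per zoom level, column, and row. As a rough illustration, the sketch below maps a Web Mercator coordinate to a tile address. It assumes the common XYZ convention, in which rows are counted from the top of the map. That convention is an assumption made here and is not stated in the text above, and the names tile_index and tile_path are hypothetical.

```python
import math

ORIGIN_SHIFT = math.pi * 6378137.0  # half the Web Mercator world width, in meters


def tile_index(x, y, zoom):
    """Return (tx, ty) of the tile containing point (x, y) at the given zoom."""
    n = 2 ** zoom                      # tiles per side at this zoom level
    tile_span = 2 * ORIGIN_SHIFT / n   # tile width in meters
    tx = int((x + ORIGIN_SHIFT) // tile_span)
    ty = int((ORIGIN_SHIFT - y) // tile_span)  # rows counted from the top
    # clamp so points on the far edges still map to a valid tile
    return min(max(tx, 0), n - 1), min(max(ty, 0), n - 1)


def tile_path(x, y, zoom):
    """Relative path of the tile image, matching the {Z}/{X}/{Y}.png pattern."""
    tx, ty = tile_index(x, y, zoom)
    return f"{zoom}/{tx}/{ty}.png"


print(tile_path(-250000.0, 250000.0, 2))  # 2/1/1.png
print(sum(4 ** z for z in range(3)))      # 21 tiles cover the whole world for zoom 0-2
```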
Render tiles to Amazon Simple Storage Service (S3)#
To render tiles directly to S3, you only need to use the s3:// protocol in your output_path argument.
Requires AWS Access / Secret Keys with appropriate IAM permissions for uploading to S3.
Requires the extra boto3 dependency:
conda install boto3
Configuring credentials#
boto3 looks for credentials by searching through a list of possible locations and stops as soon as it finds them. The order in which boto3 searches is:
1. ~~Passing credentials as parameters in the boto3.client() method~~
2. ~~Passing credentials as parameters when creating a Session object~~
3. Environment variables
4. Shared credential file (~/.aws/credentials)
5. AWS config file (~/.aws/config)
6. Assume Role provider
7. Boto2 config file (/etc/boto.cfg and ~/.boto)
8. Instance metadata service on an Amazon EC2 instance that has an IAM role configured
Datashader’s render_tiles function supports only the credential search locations that are not struck through above.
NOTE: all tiles written to S3 are marked with public-read ACL settings.
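Before starting a long render, it can be useful to check which of the file-based and environment-based locations listed above are actually populated. The sketch below only tests for presence. It does not reproduce boto3's resolution logic, does not validate the credentials, and skips the Assume Role, Boto2, and instance metadata locations. The function name present_credential_sources is hypothetical. AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the standard AWS environment variable names.

```python
import os
from pathlib import Path


def present_credential_sources(environ=None, home=None):
    """List which of three common credential locations appear to be populated."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    found = []
    if "AWS_ACCESS_KEY_ID" in environ and "AWS_SECRET_ACCESS_KEY" in environ:
        found.append("environment variables")
    if (home / ".aws" / "credentials").is_file():
        found.append("shared credential file")
    if (home / ".aws" / "config").is_file():
        found.append("AWS config file")
    return found


print(present_credential_sources())
```

An empty list suggests that none of these three locations is configured on the current machine.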
Setup tile bucket using AWS CLI#
$> aws s3 mb s3://datashader-tiles-testing/
full_extent_of_data = (int(-20e6), int(-20e6), int(20e6), int(20e6))
output_path = 's3://datashader-tiles-testing/wald_tiles/'
try:
    results = render_tiles(full_extent_of_data,
                           range(3),
                           load_data_func=load_data_func,
                           rasterize_func=rasterize_func,
                           shader_func=shader_func,
                           post_render_func=post_render_func,
                           output_path=output_path)
except ImportError:
    print('you must install boto3 to save tiles to Amazon S3')
calculating statistics for level 0
rendering 1 supertiles for zoom level 0 with span=(np.uint32(0), np.uint32(2914))
you must install boto3 to save tiles to Amazon S3
Preview S3 Tiles#
xmin, ymin, xmax, ymax = full_extent_of_data

p = figure(width=800, height=800,
           x_range=(int(-20e6), int(20e6)),
           y_range=(int(-20e6), int(20e6)),
           tools="pan,wheel_zoom,reset")
p.axis.visible = False
p.background_fill_color = 'black'
p.grid.grid_line_alpha = 0
p.add_tile(WMTSTileSource(url="https://datashader-tiles-testing.s3.amazonaws.com/wald_tiles/{Z}/{X}/{Y}.png"),
           render_parents=False)
show(p)
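The tile URL used above follows from the s3:// output_path: the bucket name becomes the host name and the key prefix becomes the URL path. The helper below sketches that mapping. The name s3_tile_url_template is hypothetical, and the URL form assumes the bucket is reachable through the global s3.amazonaws.com endpoint, as in the example above. Other regions or endpoint styles may need a different host.

```python
from urllib.parse import urlparse


def s3_tile_url_template(output_path):
    """Turn an s3://bucket/prefix output path into an HTTPS tile URL template."""
    parsed = urlparse(output_path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("expected a path of the form s3://bucket/prefix")
    bucket = parsed.netloc
    prefix = parsed.path.strip("/")
    base = f"https://{bucket}.s3.amazonaws.com"
    if prefix:
        base = f"{base}/{prefix}"
    return base + "/{Z}/{X}/{Y}.png"


print(s3_tile_url_template("s3://datashader-tiles-testing/wald_tiles/"))
```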