tiling#

Building Tilesets using Datashader#

Datashader provides render_tiles, a utility function for creating tilesets from arbitrary datashader pipelines.

from datashader.tiles import render_tiles

A couple of notes about the tiling process:

  • By default, render_tiles uses a simple Web Mercator tiling scheme (EPSG:3857)

  • Call render_tiles with the following arguments:

extent_of_area_i_want_to_tile = (-500000, -500000, 500000, 500000)  # xmin, ymin, xmax, ymax
render_tiles(extent_of_area_i_want_to_tile,
             tile_levels=range(6),
             output_path='example_tileset_output_directory',
             load_data_func=function_which_returns_dataframe,
             rasterize_func=function_which_creates_xarray_aggregate,
             shader_func=function_which_renders_aggregate_to_datashader_image,
             post_render_func=function_which_post_processes_image)
  • data representing x / y coordinates is assumed to be in meters (m), based on the Web Mercator coordinate system.

  • the tiling extent is subdivided into supertiles, generally of size 4096 x 4096 (see the sketch after this list)

  • the load_data_func returns a dataframe-like object and contains your data-access-specific code.

  • the rasterize_func returns an xr.DataArray and contains your xarray-specific code.

  • the shader_func returns a ds.Image and contains your datashader-specific code.

  • the post_render_func is called once for each final tile (default 256 x 256) and contains PIL (Python Imaging Library) specific code. This is the hook for adding additional filters, text, watermarks, etc. to output tiles.
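To make the zoom levels and supertiles concrete, the short sketch below works through the standard Web Mercator pyramid arithmetic (zoom level z contains 2**z x 2**z tiles of 256 x 256 pixels). It is illustrative only, not part of the datashader API.

# Standard Web Mercator pyramid arithmetic (illustrative sketch, not datashader API)
TILE_SIZE = 256        # final tile size in pixels
SUPERTILE_SIZE = 4096  # default supertile size in pixels

for z in range(6):
    tiles_per_axis = 2 ** z                        # zoom level z has 2**z x 2**z tiles
    supertile_span = SUPERTILE_SIZE // TILE_SIZE   # a full supertile covers 16 x 16 tiles
    print(f"zoom {z}: {tiles_per_axis ** 2} tiles total, "
          f"up to {min(tiles_per_axis, supertile_span) ** 2} tiles per supertile")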

Creating Tile Component Functions#

Create load_data_func#

  • accepts x_range and y_range arguments which correspond to the ranges of the supertile being rendered.

  • returns a dataframe-like object (pd.DataFrame / dask.DataFrame)

  • this example load_data_func creates a pandas dataframe with x and y fields sampled from a Wald distribution

import pandas as pd
import numpy as np

df = None
def load_data_func(x_range, y_range):
    # Generate the synthetic dataset once and cache it in a module-level global,
    # so repeated supertile requests don't regenerate 40 million points
    global df
    if df is None:
        xoffsets = [-1, 1, -1, 1]
        yoffsets = [-1, 1, 1, -1]
        xs = np.concatenate([np.random.wald(10000000, 10000000, size=10000000) * offset for offset in xoffsets])
        ys = np.concatenate([np.random.wald(10000000, 10000000, size=10000000) * offset for offset in yoffsets])
        df = pd.DataFrame(dict(x=xs, y=ys))

    # Return only the points that fall inside the requested supertile extent
    return df.loc[df['x'].between(*x_range) & df['y'].between(*y_range)]
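Since render_tiles also accepts dask dataframes, the same hook can load data lazily from disk instead of generating it in memory. A minimal sketch, assuming a hypothetical Parquet file data.parq with x and y columns already in Web Mercator meters:

import dask.dataframe as dd

def load_data_func_dask(x_range, y_range):
    # 'data.parq' is a placeholder path; substitute your own data source
    ddf = dd.read_parquet('data.parq')
    # Filter down to the extent of the supertile being rendered
    return ddf[ddf['x'].between(*x_range) & ddf['y'].between(*y_range)]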

Create rasterize_func#

  • accepts df, x_range, y_range, height, width arguments which correspond to the data, ranges, and plot dimensions of the supertile being rendered.

  • returns an xr.DataArray object representing the aggregate.

import datashader as ds

def rasterize_func(df, x_range, y_range, height, width):
    # aggregate
    cvs = ds.Canvas(x_range=x_range, y_range=y_range,
                    plot_height=height, plot_width=width)
    agg = cvs.points(df, 'x', 'y')
    return agg
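Canvas.points defaults to a simple count reduction; any other datashader reduction works here as well. A sketch assuming the dataframe carries a hypothetical value column to average per pixel:

def rasterize_mean_func(df, x_range, y_range, height, width):
    cvs = ds.Canvas(x_range=x_range, y_range=y_range,
                    plot_height=height, plot_width=width)
    # Aggregate the mean of a hypothetical 'value' column instead of counting points
    return cvs.points(df, 'x', 'y', agg=ds.mean('value'))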

Create shader_func#

  • accepts agg (xr.DataArray), span (tuple(min, max)). The span argument can be used to control color mapping / auto-ranging across supertiles.

  • returns a ds.Image object representing the shaded image.

import datashader.transfer_functions as tf
from datashader.colors import viridis

def shader_func(agg, span=None):
    img = tf.shade(agg, cmap=reversed(viridis), span=span, how='log')
    img = tf.set_background(img, 'black')
    return img
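render_tiles computes a span for each zoom level (the "calculating statistics for level N" messages in the output further down) and passes it to shader_func, so all supertiles at a given level share one color range. Swapping the palette only means changing cmap; a sketch using datashader's inferno palette instead of reversed viridis:

from datashader.colors import inferno

def shader_func_inferno(agg, span=None):
    # Same structure as shader_func above, with a different palette; span is
    # still passed through so colors stay comparable across supertiles
    img = tf.shade(agg, cmap=inferno, span=span, how='log')
    return tf.set_background(img, 'black')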

Create post_render_func#

  • accepts img plus extra keyword arguments, which correspond to the output PIL.Image before it is written to disk (or S3) and additional tile properties (such as the tile's x, y, and z indices).

  • returns image (PIL.Image)

  • this is a good place to run any non-datashader-specific logic on each output tile.

from PIL import ImageDraw

def post_render_func(img, **kwargs):
    info = "x={},y={},z={}".format(kwargs['x'], kwargs['y'], kwargs['z'])
    draw = ImageDraw.Draw(img)
    draw.text((5, 5), info, fill='rgb(255, 255, 255)')
    return img
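Any PIL operation fits in this hook. As another (hypothetical) example, the variant below outlines each tile so tile boundaries are easy to spot while debugging a tileset:

def post_render_border_func(img, **kwargs):
    draw = ImageDraw.Draw(img)
    # Draw a 1-pixel red outline around the edge of the tile
    draw.rectangle([0, 0, img.width - 1, img.height - 1], outline='rgb(255, 0, 0)')
    return img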

Render tiles to local filesystem#

full_extent_of_data = (-500000, -500000, 500000, 500000)
output_path = 'tiles_output_directory/wald_tiles'
results = render_tiles(full_extent_of_data,
                       range(3),
                       load_data_func=load_data_func,
                       rasterize_func=rasterize_func,
                       shader_func=shader_func,
                       post_render_func=post_render_func,
                       output_path=output_path)
calculating statistics for level 0
rendering 1 supertiles for zoom level 0 with span=(np.uint32(0), np.uint32(2914))
calculating statistics for level 1
rendering 1 supertiles for zoom level 1 with span=(np.uint32(0), np.uint32(770))
calculating statistics for level 2
rendering 1 supertiles for zoom level 2 with span=(np.uint32(0), np.uint32(215))

Preview the tileset using Bokeh#

  • Browse to the tile output directory and start an http server:

$> cd tiles_output_directory/wald_tiles
$> python -m http.server 8080
  • build a bokeh figure (bokeh.plotting.figure) pointing at the local tile server

from bokeh.plotting import figure
from bokeh.models.tiles import WMTSTileSource
from bokeh.io import show
from bokeh.io import output_notebook

output_notebook()

xmin, ymin, xmax, ymax = full_extent_of_data

p = figure(width=800, height=800,
           x_range=(int(-20e6), int(20e6)),
           y_range=(int(-20e6), int(20e6)),
           tools="pan,wheel_zoom,reset")

p.background_fill_color = 'black'
p.grid.grid_line_alpha = 0
p.axis.visible = False
p.add_tile(WMTSTileSource(url="http://localhost:8080/{Z}/{X}/{Y}.png"),
           render_parents=False)
show(p)

Render tiles to Amazon Simple Storage Service (S3)#

To render tiles directly to S3, use the s3:// protocol in your output_path argument.

  • Requires AWS Access / Secret Keys with appropriate IAM permissions for uploading to S3.

  • Requires the extra boto3 dependency:

conda install boto3

Configuring credentials#

Boto3 looks for credentials by searching a list of possible locations and stopping as soon as it finds them. The search order is:

  1. ~~Passing credentials as parameters in the boto.client() method~~

  2. ~~Passing credentials as parameters when creating a Session object~~

  3. Environment variables

  4. Shared credential file (~/.aws/credentials)

  5. AWS config file (~/.aws/config)

  6. Assume Role provider

  7. Boto2 config file (/etc/boto.cfg and ~/.boto)

  8. Instance metadata service on an Amazon EC2 instance that has an IAM role configured.

  • Datashader’s render_tiles function supports only the credential locations that are not struck through above; in particular, it does not accept credentials passed as parameters (see the environment-variable sketch below).

  • NOTE: all tiles written to S3 are marked with public-read ACL settings.
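For example, one supported route is to provide credentials through environment variables before calling render_tiles. A minimal sketch with placeholder values (substitute your own keys and region):

import os

# Placeholder credentials; boto3 automatically picks up these standard variables
os.environ['AWS_ACCESS_KEY_ID'] = 'YOUR_ACCESS_KEY_ID'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'YOUR_SECRET_ACCESS_KEY'
os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'  # example region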

Setup tile bucket using AWS CLI#

$> aws s3 mb s3://datashader-tiles-testing/

full_extent_of_data = (int(-20e6), int(-20e6), int(20e6), int(20e6))
output_path = 's3://datashader-tiles-testing/wald_tiles/'
try:
    results = render_tiles(full_extent_of_data,
                           range(3),
                           load_data_func=load_data_func,
                           rasterize_func=rasterize_func,
                           shader_func=shader_func,
                           post_render_func=post_render_func,
                           output_path=output_path)
except ImportError:
    print('you must install boto3 to save tiles to Amazon S3')
calculating statistics for level 0
rendering 1 supertiles for zoom level 0 with span=(np.uint32(0), np.uint32(2914))
you must install boto3 to save tiles to Amazon S3

Preview S3 Tiles#

xmin, ymin, xmax, ymax = full_extent_of_data

p = figure(width=800, height=800,
           x_range=(int(-20e6), int(20e6)),
           y_range=(int(-20e6), int(20e6)),
           tools="pan,wheel_zoom,reset")
p.axis.visible = False
p.background_fill_color = 'black'
p.grid.grid_line_alpha = 0
p.add_tile(WMTSTileSource(url="https://datashader-tiles-testing.s3.amazonaws.com/wald_tiles/{Z}/{X}/{Y}.png"),
           render_parents=False)
show(p)