3 Interactivity

The previous notebook showed all the steps required to get a Datashader rendering of your dataset, yielding raster images displayed using Jupyter's "rich display" support. However, these bare images do not show the data ranges or axis labels, making them difficult to interpret. Moreover, they are only static images, and datasets often need to be explored at multiple scales, which is much easier to do in an interactive program.

To get axes and interactivity, the images generated by Datashader need to be embedded into a plot using an external library like Matplotlib or Bokeh. As we illustrate below, the most convenient way to make Datashader plots using these libraries is via the HoloViews high-level data-science API. Plotly also includes Datashader support, and native Datashader support for Matplotlib has been sketched but is not yet released.

In this notebook, we will first look at Datashader's native Bokeh support, because it uses the same API introduced in the previous examples. We'll start with the same example from the previous notebook:

In [1]:
import pandas as pd
import numpy as np
import datashader as ds
import datashader.transfer_functions as tf
from collections import OrderedDict as odict


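num = 100000  # points drawn for each distribution

# One DataFrame per category: Gaussian x/y coordinates, a constant value, and a label
dists = {cat: pd.DataFrame(odict([('x', np.random.normal(x, s, num)),
                                  ('y', np.random.normal(y, s, num)),
                                  ('val', val),
                                  ('cat', cat)]))
         for x,  y,  s,  val, cat in
         [(  2,  2, 0.03, 10, "d1"),
          (  2, -2, 0.10, 20, "d2"),
          ( -2, -2, 0.50, 30, "d3"),
          ( -2,  2, 1.00, 40, "d4"),
          (  0,  0, 3.00, 50, "d5")] }

df = pd.concat(dists, ignore_index=True)
df["cat"] = df["cat"].astype("category")  # ds.count_cat needs a categorical column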

Bokeh provides interactive plotting in a web browser. To make an interactive Datashader plot when working with Bokeh directly, we'll first need to write a "callback" that wraps up the plotting steps shown in the previous notebook. Here, the callback is a function that renders an image of the dataframe above for a given data range and pixel size:

In [2]:
def image_callback(x_range, y_range, w, h, name=None):
    cvs = ds.Canvas(plot_width=w, plot_height=h, x_range=x_range, y_range=y_range)
    agg = cvs.points(df, 'x', 'y', ds.count_cat('cat'))
    img = tf.shade(agg)
    return tf.dynspread(img, threshold=0.50, name=name)

As you can see, this callback is a function that lets us generate a Datashader image covering any range of data space that we want to examine:

In [3]:
tf.Images(image_callback(None,        None,      300, 300, name="Original"),
          image_callback((  0, 4  ), (  0, 4  ), 300, 300, name="Zoom 1"),
          image_callback((1.9, 2.1), (1.9, 2.1), 300, 300, name="Zoom 2"))

[Output: three images titled "Original", "Zoom 1", and "Zoom 2"]
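To make concrete why zooming reveals structure that is hidden at the full scale, here is a plain-Python sketch of the aggregation step alone: it counts points per pixel of a grid covering the requested ranges and ignores everything outside them. This is not Datashader's implementation; `bin_points`, `occupied`, and the sample data are hypothetical names used only for illustration.

```python
import random

def bin_points(points, x_range, y_range, w, h):
    """Count points per pixel of a w x h grid spanning x_range and y_range.

    Hypothetical helper for illustration; not part of Datashader.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    grid = [[0] * w for _ in range(h)]       # grid[row][col], row 0 = lowest y
    for x, y in points:
        if not (x0 <= x < x1 and y0 <= y < y1):
            continue                         # outside the requested window
        # Map data coordinates to pixel indices, clamped to the last pixel
        col = min(int((x - x0) / (x1 - x0) * w), w - 1)
        row = min(int((y - y0) / (y1 - y0) * h), h - 1)
        grid[row][col] += 1
    return grid

def occupied(grid):
    """Number of pixels that received at least one point."""
    return sum(1 for row in grid for count in row if count)

random.seed(0)
# A tight cluster around (2, 2), similar in spirit to "d1" above
points = [(random.gauss(2, 0.03), random.gauss(2, 0.03)) for _ in range(10000)]

wide = bin_points(points, (-5, 5), (-5, 5), 300, 300)
zoom = bin_points(points, (1.9, 2.1), (1.9, 2.1), 300, 300)

# Zoomed out, the cluster collapses into a few pixels;
# zoomed in, the same points spread over many more.
print(occupied(wide), occupied(zoom))
```

The same points fill far more pixels in the zoomed grid, which is why a dense cluster can look like a single dot until the range is narrowed.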

You can now see that the single apparent "red dot" from the original image is actually a large collection of overlapping points (100,000, to be exact). However, you can also see that it would be awkward to explore a dataset using static images in this way, having to guess at numerical ranges as in the code above. Instead, let's make an interactive Bokeh plot using a convenience utility from Datashader called InteractiveImage:

In [4]:
from datashader.bokeh_ext import InteractiveImage
import bokeh.plotting as bp

bp.output_notebook()  # load BokehJS so the plot renders inline in the notebook
p = bp.figure(tools='pan,wheel_zoom,reset', x_range=(-5,5), y_range=(-5,5), plot_width=500, plot_height=500)

InteractiveImage(p, image_callback)
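Roughly speaking, each pan or zoom in the plot produces new axis ranges, and the callback is invoked again with those ranges and the plot's pixel size. The sketch below mimics only that range arithmetic in plain Python. `zoom_range` and the stand-in `fake_callback` are hypothetical names, and this is not how `InteractiveImage` is implemented internally.

```python
def zoom_range(rng, factor, anchor=None):
    """Return a new (lo, hi) range scaled by `factor` around `anchor`.

    factor < 1 zooms in, factor > 1 zooms out. Hypothetical helper.
    """
    lo, hi = rng
    if anchor is None:
        anchor = (lo + hi) / 2           # default: zoom around the center
    return (anchor - (anchor - lo) * factor,
            anchor + (hi - anchor) * factor)

def fake_callback(x_range, y_range, w, h, name=None):
    # Stand-in for image_callback: just report what would be rendered.
    return {"x_range": x_range, "y_range": y_range, "size": (w, h)}

x_range, y_range = (-5, 5), (-5, 5)      # initial ranges of the figure
for _ in range(3):                       # three zoom-in steps toward (2, 2)
    x_range = zoom_range(x_range, 0.5, anchor=2)
    y_range = zoom_range(y_range, 0.5, anchor=2)
    view = fake_callback(x_range, y_range, 500, 500)

print(view["x_range"])  # (1.125, 2.375)
```

Because the callback takes the ranges as parameters, the same rendering code serves every zoom level. Only the arguments change between calls.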