API

Entry Points

Canvas

Canvas ([plot_width, plot_height, x_range, …]) An abstract canvas representing the space in which to bin.
Canvas.line (source, x, y[, agg]) Compute a reduction by pixel, mapping data to pixels as a line.
Canvas.points (source, x, y[, agg]) Compute a reduction by pixel, mapping data to pixels as points.
Canvas.raster (source[, layer, …]) Sample a raster dataset by canvas size and bounds.
Canvas.trimesh (vertices, simplices[, mesh, …]) Compute a reduction by pixel, mapping data to pixels as a triangle.
Canvas.validate () Check that parameter settings are valid for this object.

Pipeline

Pipeline (df, glyph[, agg, transform_fn, …]) A datashading pipeline callback.

Edge Bundling

directly_connect_edges alias of datashader.bundling.connect_edges
hammer_bundle (**params) Iteratively group edges and return as paths suitable for datashading.

Glyphs

Point

Point (x, y) A point, with center at x and y.
Point.inputs
Point.validate (in_dshape)

Line

Line (x, y) A line, with vertices defined by x and y.
Line.inputs
Line.validate (in_dshape)

Reductions

any ([column]) Whether any elements in column map to each bin.
count ([column]) Count elements in each bin.
count_cat ([column]) Count of all elements in column, grouped by category.
first ([column]) First value encountered in column.
last ([column]) Last value encountered in column.
m2 ([column]) Sum of square differences from the mean of all elements in column.
max ([column]) Maximum value of all elements in column.
mean ([column]) Mean of all elements in column.
min ([column]) Minimum value of all elements in column.
mode ([column]) Mode (most common value) of all the values encountered in column.
std ([column]) Standard deviation of all elements in column.
sum ([column]) Sum of all elements in column.
summary (**kwargs) A collection of named reductions.
var ([column]) Variance of all elements in column.

Transfer Functions

Image

Image (data[, coords, dims, name, attrs, …])
Attributes:
Image.to_bytesio ([format, origin])
Image.to_pil ([origin])

Images

Images (*images) A list of HTML-representable objects to display in a table.
Images.cols (n) Set the number of columns to use in the HTML table.

Other

dynspread (img[, threshold, max_px, shape, …]) Spread pixels in an image dynamically based on the image density.
set_background (img[, color, name]) Return a new image, with the background set to color.
shade (agg[, cmap, color_key, how, alpha, …]) Convert a DataArray to an image by choosing an RGBA pixel color for each value.
spread (img[, px, shape, how, mask, name]) Spread pixels in an image.
stack (*imgs, **kwargs) Combine images together, overlaying later images onto earlier ones.

Definitions

class datashader.Canvas(plot_width=600, plot_height=600, x_range=None, y_range=None, x_axis_type='linear', y_axis_type='linear')

An abstract canvas representing the space in which to bin.

Parameters:
plot_width, plot_height : int, optional

Width and height of the output aggregate in pixels.

x_range, y_range : tuple, optional

A tuple giving the inclusive bounds [min, max] along the axis.

x_axis_type, y_axis_type : str, optional

The type of the axis. Valid options are 'linear' [default], and 'log' .
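
As a rough illustration of how the two axis types differ, the sketch below maps a data value to a pixel index along one axis. It is not datashader's implementation: the helper name to_pixel is hypothetical, and the use of a base-10 logarithm for 'log' axes is an assumption made for this example.

```python
import math


def to_pixel(value, axis_range, n_pixels, axis_type="linear"):
    """Map a data value to a pixel index in [0, n_pixels - 1].

    Hypothetical helper for illustration only; not part of datashader.
    """
    lo, hi = axis_range
    if axis_type == "log":
        # Assumed base-10 log scaling; all values must be positive.
        value, lo, hi = math.log10(value), math.log10(lo), math.log10(hi)
    elif axis_type != "linear":
        raise ValueError("axis_type must be 'linear' or 'log'")
    if not lo <= value <= hi:
        return None  # outside the inclusive [min, max] range
    index = int((value - lo) / (hi - lo) * n_pixels)
    # Keep a value equal to max inside the last pixel.
    return min(index, n_pixels - 1)


print(to_pixel(50, (0, 100), 600))          # halfway along a linear axis
print(to_pixel(10, (1, 100), 600, "log"))   # halfway along a log axis
```

On the log axis the value 10 lands in the same pixel as 50 does on the linear axis, because 10 is the midpoint of [1, 100] in log space.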

Methods

line (source, x, y[, agg]) Compute a reduction by pixel, mapping data to pixels as a line.
points (source, x, y[, agg]) Compute a reduction by pixel, mapping data to pixels as points.
raster (source[, layer, upsample_method, …]) Sample a raster dataset by canvas size and bounds.
trimesh (vertices, simplices[, mesh, agg, …]) Compute a reduction by pixel, mapping data to pixels as a triangle.
validate () Check that parameter settings are valid for this object.
class datashader.Pipeline(df, glyph, agg=<datashader.reductions.count object>, transform_fn=<function identity>, color_fn=<function shade>, spread_fn=<function dynspread>, width_scale=1.0, height_scale=1.0)

A datashading pipeline callback.

Given a declarative specification, creates a callable with the following signature:

callback(x_range, y_range, width, height)

where x_range and y_range form the bounding box on the viewport, and width and height specify the output image dimensions.

Parameters:
df : pandas.DataFrame, dask.DataFrame
glyph : Glyph

The glyph to bin by.

agg : Reduction, optional

The reduction to compute per-pixel. Default is count().

transform_fn : callable, optional

A callable that takes the computed aggregate as an argument, and returns another aggregate. This can be used to do preprocessing before passing to the color_fn function.

color_fn : callable, optional

A callable that takes the output of transform_fn, and returns an Image object. Default is shade.

spread_fn : callable, optional

A callable that takes the output of color_fn, and returns another Image object. Default is dynspread.

height_scale : float, optional

Factor by which to scale the provided height.

width_scale : float, optional

Factor by which to scale the provided width.

Methods

__call__ ([x_range, y_range, width, height]) Compute an image from the specified pipeline.
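
The staging described for Pipeline can be pictured as plain function composition. The sketch below is not the Pipeline class itself: make_callback and aggregate_fn are hypothetical names, with aggregate_fn standing in for the binning step that Pipeline performs from df, glyph, and agg. It also assumes the scale factors are applied to the requested size before aggregating.

```python
def make_callback(aggregate_fn, transform_fn, color_fn, spread_fn,
                  width_scale=1.0, height_scale=1.0):
    """Build callback(x_range, y_range, width, height) from pipeline stages.

    aggregate_fn is a hypothetical stand-in for the binning step.
    """
    def callback(x_range, y_range, width, height):
        # Scale the requested output size before aggregating (assumed order).
        w = int(width * width_scale)
        h = int(height * height_scale)
        agg = aggregate_fn(x_range, y_range, w, h)
        # Each stage consumes the previous stage's output.
        return spread_fn(color_fn(transform_fn(agg)))
    return callback


# Stand-in stages that only record the order in which they ran.
callback = make_callback(
    aggregate_fn=lambda xr, yr, w, h: {"shape": (h, w), "stages": ["agg"]},
    transform_fn=lambda agg: {**agg, "stages": agg["stages"] + ["transform"]},
    color_fn=lambda agg: {**agg, "stages": agg["stages"] + ["color"]},
    spread_fn=lambda img: {**img, "stages": img["stages"] + ["spread"]},
    width_scale=0.5,
)
result = callback((0, 1), (0, 1), 800, 600)
print(result)
```

A callback of this shape is convenient for interactive viewers, which can call it again with a new viewport whenever the user pans or zooms.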
datashader.bundling.directly_connect_edges

alias of datashader.bundling.connect_edges

class datashader.bundling.hammer_bundle(**params)

Iteratively group edges and return as paths suitable for datashading.

Breaks each edge into a path with multiple line segments, and iteratively curves this path to bundle edges into groups.

Methods

__call__ (nodes, edges, **params)
debug (msg, *args, **kw) Print msg merged with args as a debugging statement.
defaults () Return {parameter_name:parameter.default} for all non-constant Parameters.
force_new_dynamic_value
get_param_values
get_value_generator
inspect_value
instance
message (msg, *args, **kw) Print msg merged with args as a message.
params ([parameter_name]) Return the Parameters of this class as the
pprint ([imports, prefix, unknown_value, …]) Same as Parameterized.pprint, except that X.classname(Y
print_param_defaults () Print the default values of all cls’s Parameters.
print_param_values () Print the values of all this object’s Parameters.
script_repr ([imports, prefix]) Same as Parameterized.script_repr, except that X.classname(Y
set_default (param_name, value) Set the default value of param_name.
set_dynamic_time_fn
set_param
state_pop () Restore the most recently saved state.
state_push () Save this instance’s state.
verbose (msg, *args, **kw) Print msg merged with args as a verbose message.
warning (msg, *args, **kw) Print msg merged with args as a warning, unless module variable warnings_as_exceptions is True, then raise an Exception containing the arguments.
param String x ( allow_None=False, basestring=<class ‘str’>, constant=False, default=x, instantiate=False, pickle_default_value=True, precedence=None, readonly=False )
Column name for each node’s x coordinate.
param String y ( allow_None=False, basestring=<class ‘str’>, constant=False, default=y, instantiate=False, pickle_default_value=True, precedence=None, readonly=False )
Column name for each node’s y coordinate.
param String source ( allow_None=False, basestring=<class ‘str’>, constant=False, default=source, instantiate=False, pickle_default_value=True, precedence=None, readonly=False )
Column name for each edge’s source.
param String target ( allow_None=False, basestring=<class ‘str’>, constant=False, default=target, instantiate=False, pickle_default_value=True, precedence=None, readonly=False )
Column name for each edge’s target.
param String weight ( allow_None=True, basestring=<class ‘str’>, constant=False, default=weight, instantiate=False, pickle_default_value=True, precedence=None, readonly=False )
Column name for each edge weight. If None, weights are ignored.
param Boolean include_edge_id ( allow_None=False, bounds=(0, 1), constant=False, default=False, instantiate=False, pickle_default_value=True, precedence=None, readonly=False )
Include edge IDs in bundled dataframe
param Number initial_bandwidth ( allow_None=False, bounds=(0.0, None), constant=False, default=0.05, inclusive_bounds=(True, True), instantiate=False, pickle_default_value=True, precedence=None, readonly=False, softbounds=None, time_dependent=False, time_fn=<Time Time00001> )
Initial value of the bandwidth….
param Number decay ( allow_None=False, bounds=(0.0, 1.0), constant=False, default=0.7, inclusive_bounds=(True, True), instantiate=False, pickle_default_value=True, precedence=None, readonly=False, softbounds=None, time_dependent=False, time_fn=<Time Time00001> )
Rate of decay in the bandwidth value, with 1.0 indicating no decay.
param Integer iterations ( allow_None=False, bounds=(1, None), constant=False, default=4, inclusive_bounds=(True, True), instantiate=False, pickle_default_value=True, precedence=None, readonly=False, softbounds=None, time_dependent=False, time_fn=<Time Time00001> )
Number of passes for the smoothing algorithm
param Integer batch_size ( allow_None=False, bounds=(1, None), constant=False, default=20000, inclusive_bounds=(True, True), instantiate=False, pickle_default_value=True, precedence=None, readonly=False, softbounds=None, time_dependent=False, time_fn=<Time Time00001> )
Number of edges to process together
param Number tension ( allow_None=False, bounds=(0, None), constant=False, default=0.3, inclusive_bounds=(True, True), instantiate=False, pickle_default_value=True, precedence=-0.5, readonly=False, softbounds=None, time_dependent=False, time_fn=<Time Time00001> )
Exponential smoothing factor to use when smoothing
param Integer accuracy ( allow_None=False, bounds=(1, None), constant=False, default=500, inclusive_bounds=(True, True), instantiate=False, pickle_default_value=True, precedence=-0.5, readonly=False, softbounds=None, time_dependent=False, time_fn=<Time Time00001> )
Number of entries in table for…
param Integer advect_iterations ( allow_None=False, bounds=(0, None), constant=False, default=50, inclusive_bounds=(True, True), instantiate=False, pickle_default_value=True, precedence=-0.5, readonly=False, softbounds=None, time_dependent=False, time_fn=<Time Time00001> )
Number of iterations to move edges along gradients
param Number min_segment_length ( allow_None=False, bounds=(0, None), constant=False, default=0.008, inclusive_bounds=(True, True), instantiate=False, pickle_default_value=True, precedence=-0.5, readonly=False, softbounds=None, time_dependent=False, time_fn=<Time Time00001> )
Minimum length (in data space?) for an edge segment
param Number max_segment_length ( allow_None=False, bounds=(0, None), constant=False, default=0.016, inclusive_bounds=(True, True), instantiate=False, pickle_default_value=True, precedence=-0.5, readonly=False, softbounds=None, time_dependent=False, time_fn=<Time Time00001> )
Maximum length (in data space?) for an edge segment
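
To make the interaction of initial_bandwidth, decay, and iterations concrete, the sketch below lists one bandwidth per pass. It assumes a simple geometric schedule in which the bandwidth is multiplied by decay once per iteration; the schedule hammer_bundle actually uses may differ, and bandwidth_schedule is a hypothetical helper name.

```python
def bandwidth_schedule(initial_bandwidth=0.05, decay=0.7, iterations=4):
    """Return one bandwidth per pass under an assumed geometric decay."""
    return [initial_bandwidth * decay ** i for i in range(iterations)]


default_schedule = bandwidth_schedule()
constant_schedule = bandwidth_schedule(decay=1.0)  # 1.0 means no decay
print(default_schedule)
print(constant_schedule)
```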
class datashader.glyphs.Point(x, y)

A point, with center at x and y.

Points map each record to a single bin. Points falling exactly on the upper bounds are treated as a special case, mapping into the previous bin rather than being cropped off.
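
The binning rule above, including the special case for the upper bounds, can be sketched in plain Python. This is an illustration of the described behaviour, not datashader's code; count_points is a hypothetical helper that counts points on a small grid.

```python
def count_points(points, x_range, y_range, width, height):
    """Count points per bin on a height x width grid (illustration only)."""
    (x0, x1), (y0, y1) = x_range, y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # outside the canvas: cropped off
        col = int((x - x0) / (x1 - x0) * width)
        row = int((y - y0) / (y1 - y0) * height)
        # A point exactly on an upper bound maps into the previous bin.
        col = min(col, width - 1)
        row = min(row, height - 1)
        grid[row][col] += 1
    return grid


grid = count_points([(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (2.0, 0.5)],
                    x_range=(0, 1), y_range=(0, 1), width=2, height=2)
print(grid)
```

Here the point (1.0, 1.0) sits on both upper bounds and is counted in the last bin, while (2.0, 0.5) lies outside the x range and is dropped.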

Parameters:
x, y : str

Column names for the x and y coordinates of each point.

Attributes:
inputs

Methods

validate
class datashader.glyphs.Line(x, y)

A line, with vertices defined by x and y.

Parameters:
x, y : str

Column names for the x and y coordinates of each vertex.

Attributes:
inputs

Methods

validate
class datashader.reductions.any(column=None)

Whether any elements in column map to each bin.

Parameters:
column : str, optional

If provided, elements in column that are NaN are skipped.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.count(column=None)

Count elements in each bin.

Parameters:
column : str, optional

If provided, only counts elements in column that are not NaN . Otherwise, counts every element.
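
The difference between the two modes can be shown for a single bin. In this sketch the list of dicts is an assumed stand-in for the records that map to one bin, and count_bin is a hypothetical helper, not the reduction itself.

```python
import math


def count_bin(records, column=None):
    """Count the records that fall in one bin.

    records is a list of dicts standing in for the rows mapped to a
    single bin; this layout is an assumption made for the example.
    """
    if column is None:
        return len(records)  # count every element
    return sum(1 for r in records if not math.isnan(r[column]))


rows = [{"v": 1.0}, {"v": float("nan")}, {"v": 3.0}]
print(count_bin(rows))        # every row
print(count_bin(rows, "v"))   # only rows whose "v" is not NaN
```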

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.count_cat(column=None)

Count of all elements in column, grouped by category.

Parameters:
column : str

Name of the column to aggregate over. Column data type must be categorical. The resulting aggregate has an outer dimension along the categories present.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.first(column=None)

First value encountered in column.

Useful for categorical data where an actual value must always be returned, not an average or other numerical calculation.

Currently only supported for rasters, externally to this class.

Parameters:
column : str

Name of the column to aggregate over. If the data type is floating point, NaN values in the column are skipped.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.last(column=None)

Last value encountered in column.

Useful for categorical data where an actual value must always be returned, not an average or other numerical calculation.

Currently only supported for rasters, externally to this class.

Parameters:
column : str

Name of the column to aggregate over. If the data type is floating point, NaN values in the column are skipped.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.m2(column=None)

Sum of square differences from the mean of all elements in column.

Intermediate value for computing var and std , not intended to be used on its own.
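
The relationship between m2, var, and std can be worked through for the values in one bin. The sketch assumes a population variance (m2 divided by the number of non-NaN values); the helper m2_of is hypothetical and is not the reduction class.

```python
import math


def m2_of(values):
    """Return (sum of squared differences from the mean, count), skipping NaN."""
    data = [v for v in values if not math.isnan(v)]
    mean = sum(data) / len(data)
    return sum((v - mean) ** 2 for v in data), len(data)


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0, float("nan")]
total, n = m2_of(values)
var = total / n        # assumes a population variance (divide by n)
std = math.sqrt(var)
print(total, var, std)
```

The eight non-NaN values have mean 5, so the squared differences sum to 32, giving a variance of 4 and a standard deviation of 2.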

Parameters:
column : str

Name of the column to aggregate over. Column data type must be numeric. NaN values in the column are skipped.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.max(column=None)

Maximum value of all elements in column.

Parameters:
column : str

Name of the column to aggregate over. Column data type must be numeric. NaN values in the column are skipped.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.mean(column=None)

Mean of all elements in column.

Parameters:
column : str

Name of the column to aggregate over. Column data type must be numeric. NaN values in the column are skipped.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.min(column=None)

Minimum value of all elements in column.

Parameters:
column : str

Name of the column to aggregate over. Column data type must be numeric. NaN values in the column are skipped.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.mode(column=None)

Mode (most common value) of all the values encountered in column.

Useful for categorical data where an actual value must always be returned, not an average or other numerical calculation.

Currently only supported for rasters, externally to this class. Implementing it for other glyph types would be difficult due to potentially unbounded data storage requirements to store indefinite point or line data per pixel.

Parameters:
column : str

Name of the column to aggregate over. If the data type is floating point, NaN values in the column are skipped.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.std(column=None)

Standard deviation of all elements in column.

Parameters:
column : str

Name of the column to aggregate over. Column data type must be numeric. NaN values in the column are skipped.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.sum(column=None)

Sum of all elements in column.

Parameters:
column : str

Name of the column to aggregate over. Column data type must be numeric. NaN values in the column are skipped.

Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.summary(**kwargs)

A collection of named reductions.

Computes all aggregates simultaneously; the output is stored as an xarray.Dataset.

Examples

A reduction for computing the mean of column “a”, and the sum of column “b” for each bin, all in a single pass.

>>> import datashader as ds
>>> red = ds.summary(mean_a=ds.mean('a'), sum_b=ds.sum('b'))
Attributes:
inputs

Methods

out_dshape
validate
class datashader.reductions.var(column=None)

Variance of all elements in column.

Parameters:
column : str

Name of the column to aggregate over. Column data type must be numeric. NaN values in the column are skipped.

Attributes:
inputs

Methods

out_dshape
validate
datashader.transfer_functions.stack(*imgs, **kwargs)

Combine images together, overlaying later images onto earlier ones.

Parameters:
imgs : iterable of Image

The images to combine.

how : str, optional

The compositing operator to combine pixels. Default is ‘over’ .
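
For intuition about the default 'over' operator, the sketch below composites single RGBA pixels using the standard source-over formula with straight (non-premultiplied) alpha and channels as floats in 0..1. This pixel representation is an assumption chosen for readability, and over and stack_pixels are hypothetical helpers rather than datashader's routines.

```python
def over(src, dst):
    """Composite one RGBA pixel over another (straight alpha, floats in 0..1)."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # both pixels fully transparent

    def blend(s, d):
        # Weight each channel by its alpha, then undo the premultiplication.
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)


def stack_pixels(*pixels):
    """Overlay later pixels onto earlier ones."""
    result = pixels[0]
    for pixel in pixels[1:]:
        result = over(pixel, result)
    return result


red = (1.0, 0.0, 0.0, 1.0)
half_blue = (0.0, 0.0, 1.0, 0.5)
print(stack_pixels(red, half_blue))   # half-transparent blue on top of red
print(stack_pixels(half_blue, red))   # opaque red on top hides the blue
```

Order matters: the last argument ends up on top, which mirrors the "later images onto earlier ones" wording.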

datashader.transfer_functions.shade(agg, cmap=['lightblue', 'darkblue'], color_key=['#e41a1c', '#377eb8', '#4daf4a', '#984ea3', '#ff7f00', '#ffff33', '#a65628', '#f781bf', '#999999', '#66c2a5', '#fc8d62', '#8da0cb', '#a6d854', '#ffd92f', '#e5c494', '#ffffb3', '#fb8072', '#fdb462', '#fccde5', '#d9d9d9', '#ccebc5', '#ffed6f'], how='eq_hist', alpha=255, min_alpha=40, span=None, name=None)

Convert a DataArray to an image by choosing an RGBA pixel color for each value.

Requires a DataArray with a single data dimension, here called the “value”, indexed using either 2D or 3D coordinates.

For a DataArray with 2D coordinates, the RGB channels are computed from the values by interpolated lookup into the given colormap cmap. The A channel is then set to the given fixed alpha value for all non-zero values, and to zero for all zero values.

DataArrays with 3D coordinates are expected to contain values distributed over different categories that are indexed by the additional coordinate. Such an array would reduce to the 2D-coordinate case if collapsed across the categories (e.g. if one did aggc.sum(dim='cat') for a categorical dimension cat ). The RGB channels for the uncollapsed, 3D case are computed by averaging the colors in the provided color_key (with one color per category), weighted by the array’s value for that category. The A channel is then computed from the array’s total value collapsed across all categories at that location, ranging from the specified min_alpha to the maximum alpha value (255).
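
The weighted color averaging for the categorical case can be sketched for one pixel. The dict layouts for the per-category values and for color_key, the rounding to integers, and the helper name mix_colors are all assumptions made for this example; the alpha computation is left out.

```python
def mix_colors(counts, color_key):
    """Average category colors, weighted by each category's value.

    counts maps category name to that pixel's value; color_key maps
    category name to an (r, g, b) tuple. Both layouts are assumptions.
    """
    total = sum(counts.values())
    if total == 0:
        return None  # empty pixel: nothing to color
    return tuple(
        round(sum(counts[c] * color_key[c][i] for c in counts) / total)
        for i in range(3)
    )


color_key = {"a": (255, 0, 0), "b": (0, 0, 255)}
mixed = mix_colors({"a": 3, "b": 1}, color_key)
print(mixed)
```

With three elements of category "a" and one of category "b", the pixel comes out three quarters of the way toward the color for "a".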

Parameters:
agg : DataArray
cmap : list of colors or matplotlib.colors.Colormap, optional

The colormap to use for 2D agg arrays. Can be either a list of colors (specified by name, RGBA hexcode, or as a tuple of (red, green, blue) values) or a matplotlib colormap object. Default is ["lightblue", "darkblue"].

color_key : dict or iterable

The colors to use for a 3D (categorical) agg array. Can be either a dict mapping from field name to colors, or an iterable of colors in the same order as the record fields, and including at least that many distinct colors.

how : str or callable, optional

The interpolation method to use, for the cmap of a 2D DataArray or the alpha channel of a 3D DataArray. Valid strings are ‘eq_hist’ [default], ‘cbrt’ (cube root), ‘log’ (logarithmic), and ‘linear’. Callables take 2 arguments - a 2-dimensional array of magnitudes at each pixel, and a boolean mask array indicating missingness. They should return a numeric array of the same shape, with NaN values where the mask was True.

alpha : int, optional

Value between 0 - 255 representing the alpha value to use for colormapped pixels that contain data (i.e. non-NaN values). Regardless of this value, NaN values are set to be fully transparent when doing colormapping.

min_alpha : float, optional

The minimum alpha value to use for non-empty pixels when doing colormapping, in [0, 255]. Use a higher value to avoid undersaturation, i.e. poorly visible low-value datapoints, at the expense of the overall dynamic range.

span : list of min-max range, optional

Min and max data values to use for colormap interpolation, when wishing to override autoranging.

name : string name, optional

Optional string name to give to the Image object to return, to label results for display.

datashader.transfer_functions.set_background(img, color=None, name=None)

Return a new image, with the background set to color.

Parameters:
img : Image
color : color name or tuple, optional

The background color. Can be specified either by name, hexcode, or as a tuple of (red, green, blue) values.

datashader.transfer_functions.spread(img, px=1, shape='circle', how='over', mask=None, name=None)

Spread pixels in an image.

Spreading expands each pixel a certain number of pixels on all sides according to a given shape, merging pixels using a specified compositing operator. This can be useful to make sparse plots more visible.

Parameters:
img : Image
px : int, optional

Number of pixels to spread on all sides

shape : str, optional

The shape to spread by. Options are ‘circle’ [default] or ‘square’.

how : str, optional

The name of the compositing operator to use when combining pixels.

mask : ndarray, shape (M, M), optional

The mask to spread over. If provided, this mask is used instead of generating one based on px and shape. Must be a square array with odd dimensions. Pixels are spread from the center of the mask to locations where the mask is True.

name : string name, optional

Optional string name to give to the Image object to return, to label results for display.
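
The effect of an explicit mask can be illustrated on a small boolean grid. This is a simplified sketch, not datashader's implementation: it works on nested lists of booleans, combines overlapping pixels with a logical OR in place of a compositing operator, and spread_bool is a hypothetical helper name.

```python
def spread_bool(grid, mask):
    """Spread each True cell of grid to the offsets where mask is True."""
    size = len(mask)
    if size % 2 == 0 or any(len(row) != size for row in mask):
        raise ValueError("mask must be square with odd dimensions")
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            # Stamp the mask, centered on this cell, onto the output.
            for mr in range(size):
                for mc in range(size):
                    rr, cc = r + mr - half, c + mc - half
                    if mask[mr][mc] and 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = True
    return out


plus = [[False, True, False],
        [True, True, True],
        [False, True, False]]
grid = [[False] * 5 for _ in range(5)]
grid[2][2] = True
spread_grid = spread_bool(grid, plus)
for row in spread_grid:
    print("".join("#" if cell else "." for cell in row))
```

The single pixel in the middle grows into a plus shape, matching the True entries of the mask around its center.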

datashader.transfer_functions.dynspread(img, threshold=0.5, max_px=3, shape='circle', how='over', name=None)

Spread pixels in an image dynamically based on the image density.

Spreading expands each pixel a certain number of pixels on all sides according to a given shape, merging pixels using a specified compositing operator. This can be useful to make sparse plots more visible. Dynamic spreading determines how many pixels to spread based on a density heuristic. Spreading starts at 1 pixel, and stops when the fraction of adjacent non-empty pixels reaches the specified threshold, or the max_px is reached, whichever comes first.

Parameters:
img : Image
threshold : float, optional

A tuning parameter in [0, 1], with higher values giving more spreading.

max_px : int, optional

Maximum number of pixels to spread on all sides.

shape : str, optional

The shape to spread by. Options are ‘circle’ [default] or ‘square’.

how : str, optional

The name of the compositing operator to use when combining pixels.