Chaco Fundamentals

While you can go a long way using the high-level APIs provided by the Plot class, to use Chaco to its full potential it helps to have a deeper understanding of the building blocks of the library.

Foundations

Chaco relies on a number of underlying libraries for core capabilities. Understanding these libraries is important for writing heavily interactive visualizations, as well as new tools, overlays and plot renderers.

Traits

Chaco is written using Traits which not only provides basic type-checking of data, but also attribute change notification which is key to the interactive updating of Chaco.

You should be comfortable with Traits when writing any significant Chaco-based application or when extending Chaco’s capabilities.

TraitsUI

TraitsUI is a rapid GUI application development library built on top of Traits. While it is not required for the development of applications that use Chaco, it is the most natural way of providing other UI elements for your users to interact with your visualizations; or alternatively the most natural environment for writing applications which embed Chaco plots. Chaco also provides TraitsUI views for configuring some Chaco objects.

You will need at least a basic familiarity with TraitsUI for most Chaco use cases.

Kiva

Kiva provides an abstracted 2D drawing API and a number of back-end implementations. This permits writing of drawing code that can be used almost un-modified when rendering to a screen, to a vector document format (such as PDF or PostScript), or to a raster file format (such as PNG).

The core object for Kiva code is the “graphics context” which represents a 2D drawing surface and the current state of the drawing environment. When Chaco classes need to do any drawing they will usually be supplied with an appropriate graphics context to render into, but very occasionally (such as when saving a plot out to a file) you may want to create your own context to draw into.

If you want to write new Chaco overlays or plot renderers you should develop an understanding of Kiva.

Enable

Enable provides interactivity and layout on top of the Kiva drawing library. Enable is the library that handles the interface between Chaco plots and the OS/windowing system, as well as the basic hierarchical layout and layering of visible components of a plot.

Enable provides two base classes that are at the root of much of the Chaco code:

Component

An object that occupies a rectangular region of the window and knows how to draw itself and dispatch user interactions. The Container class is a subclass of Component that handles hierarchical layout and event dispatch.

The drawing of Component objects is split across a number of layers, so that overlays, underlays, borders and background can be rendered in a coherent manner.

BaseTool

An object that handles a particular type of user interaction (eg. mouse events or key presses). Each tool is a state machine and so the interactions can vary depending on the state that the tool is in (eg. “normal” vs. “dragging” state for a tool that handles moving interactions).

An understanding of Enable is important for writing new interactive tools, as well as for understanding and controlling how components are layered and layed out.

Chaco Object Model

The Plot class provides a fairly simple API for creating a plot in an application, but beneath that lies a set of classes that handle converting the numbers in the numpy arrays into pixels on the screen. Between the ArrayPlotData and the Plot are a series of classes which hold state for various operations and transformations.

Data Flow

The data flow between these classes can generally be sumarised as follows:

digraph dataflow { rankdir=LR; node [shape=plaintext]; "Plot Data" -> "Data sources" -> "Ranges" -> "Mappers" -> "Renderers" -> "Plot"; "Data sources" -> "Renderers"; "Mappers" -> "Axes and Grids"; "Pan and Zoom" -> "Ranges"; }
Data sources

These hold individual data sets from the plot data (ie. something that looks like a single NumPy array) and update when the data changes.

Examples: ArrayDataSource, ImageData, GridDataSource.

Ranges

These hold a range of displayed data values and can be updated either by changes to the data or changes in the state of pan or zoom tools.

Examples: DataRange1D, DataRange2D.

Mappers

These are responsible for mapping data values to screen (or color) values.

Examples: LinearMapper, LogMapper, GridMapper.

Renderers

These are the objects responsible for rendering plot data, such as line plots or scatter plots. They need to be update either when the data they are displaying changes, or the mapping from data space to screen space changes.

Examples: LinePlot, ScatterPlot, CMapImagePlot, TextPlot1D.

Axes and Grids

These are the objects responsible for drawing axes ticks and grid lines, and need to know the mapping between data space and screen space. Axes and Grids are examples of Overlays (although they are technically underlays).

Examples: PlotAxis, LabelAxis, PlotGrid.

Pan and Zoom

These are pan and zoom commands that come from user interactions, such as via a pan or zoom operation, from resizing the plot window, or from other application-based setting of the range of values to display. Pan and zoom are commonly initated via Tools.

Examples: PanTool, ZoomTool.

Data Flow Examples

Consider the following example:

def create_plot():
    t = np.linspace(0, 2*np.pi, 100)
    amplitude1 = 2*np.sin(t)
    amplitude2 = np.cos(2*t)
    plot_data = ArrayPlotData(
        t=t,
        amplitude1=amplitude1,
        amplitude2=amplitude2,
    )
    plot = Plot(plot_data)
    plot.plot(('t', 'amplitude1'), type='line')
    plot.plot(('t', 'amplitude2'), type='scatter')
    return plot

This sets up a number of objects and connects them together, so that data flows roughly as follows:

digraph dataflow { subgraph cluster_level { node [shape=plaintext]; style=invis; "Data source" -> "Range" -> "Mapper" -> "Underlay" -> "Renderer" [style=invis]; } node [shape=rectangle]; subgraph index { color=white; "ArrayDataSource: time" -> "Range1D: index" -> "LinearMapper: index" -> "PlotAxis: index"; } subgraph value { color=white; "ArrayDataSource: amplitude1" -> "Range1D: value"; "ArrayDataSource: amplitude2" -> "Range1D: value"; "Range1D: value" -> "LinearMapper: value" -> "PlotAxis: value"; } {rank = same; "Data source"; "ArrayDataSource: time"; "ArrayDataSource: amplitude1"; "ArrayDataSource: amplitude2"} {rank = same; "Range"; "Range1D: index"; "Range1D: value"} {rank = same; "Mapper"; "LinearMapper: index"; "LinearMapper: value"} {rank = same; "Underlay"; "PlotAxis: index"; "PlotAxis: value"} {rank = same; "Renderer"; "LinePlot"; "ScatterPlot"} "ArrayPlotData" -> "ArrayDataSource: time"; "ArrayPlotData" -> "ArrayDataSource: amplitude1"; "ArrayPlotData" -> "ArrayDataSource: amplitude2"; "ArrayDataSource: time" -> "LinePlot"; "ArrayDataSource: time" -> "ScatterPlot"; "ArrayDataSource: amplitude1" -> "LinePlot"; "ArrayDataSource: amplitude2" -> "ScatterPlot"; "LinearMapper: value" -> "LinePlot"; "LinearMapper: value" -> "ScatterPlot"; "LinearMapper: index" -> "LinePlot"; "LinearMapper: index" -> "ScatterPlot"; "PlotAxis: index" -> "Plot"; "PlotAxis: value" -> "Plot"; "LinePlot" -> "Plot"; "ScatterPlot" -> "Plot"; }

Updates to the data stored in the array plot data object trigger updates through the pathways indicated, first updating the data sources for each array, upon which the data ranges depend. In turn the mappers update their state when the data ranges update, and the underlays and plot renderers update their state based on changes to the mappers and, for the renderers, on the changes to the data sources. Finally the changes to the state of the components are flagged in the Enable drawing system, which will then schedule the plot for re-drawing during the GUI event loop’s next paint event.

Notice also how this diagram shows that mappers and ranges are shared between renderers and underlays that share the same physical space. Plots which don’t share the same screen space shouldn’t share mappers, but can share data and/or ranges.

For example, here are two plots which share the same array plot data:

def create_plot():
    t = np.linspace(0, 2*np.pi, 100)
    amplitude1 = 2*np.sin(t)
    amplitude2 = np.cos(2*t)
    plot_data = ArrayPlotData(
        t=t,
        amplitude1=amplitude1,
        amplitude2=amplitude2,
    )
    plot_1 = Plot(plot_data)
    plot_1.plot(('t', 'amplitude1'), type='line')
    plot_2 = Plot(plot_data)
    plot_2.plot(('t', 'amplitude2'), type='scatter')
    container = HPlotContainer(plot_1, plot2)

Which gives rise to the following data flow diagram:

digraph dataflow { subgraph cluster_level { node [shape=plaintext]; style=invis; "Data source" -> "Range" -> "Mapper" -> "Underlay" -> "Renderer" [style=invis]; } node [shape=rectangle]; subgraph index_1 { color=white; "Range1D: index 1" -> "LinearMapper: index 1" -> "PlotAxis: index 1"; } subgraph value_1 { color=white; "Range1D: value 1" -> "LinearMapper: value 1" -> "PlotAxis: value 1"; } subgraph index_2 { color=white; "Range1D: index 2" -> "LinearMapper: index 2" -> "PlotAxis: index 2"; } subgraph value_2 { color=white; "Range1D: value 2" -> "LinearMapper: value 2" -> "PlotAxis: value 2"; } {rank = same; "Data source"; "ArrayDataSource: time"; "ArrayDataSource: amplitude1"; "ArrayDataSource: amplitude2"} {rank = same; "Range"; "Range1D: index 1"; "Range1D: value 1"; "Range1D: index 2"; "Range1D: value 2"} {rank = same; "Mapper"; "LinearMapper: index 1"; "LinearMapper: value 1"; "LinearMapper: index 2"; "LinearMapper: value 2"} {rank = same; "Underlay"; "PlotAxis: index 1"; "PlotAxis: value 1"; "PlotAxis: index 2"; "PlotAxis: value 2"} {rank = same; "Renderer"; "LinePlot"; "ScatterPlot"} "ArrayPlotData" -> "ArrayDataSource: time"; "ArrayPlotData" -> "ArrayDataSource: amplitude1"; "ArrayPlotData" -> "ArrayDataSource: amplitude2"; "ArrayDataSource: time" -> "Range1D: index 1" "ArrayDataSource: time" -> "Range1D: index 2" "ArrayDataSource: time" -> "LinePlot"; "ArrayDataSource: time" -> "ScatterPlot"; "ArrayDataSource: amplitude1" -> "Range1D: value 1"; "ArrayDataSource: amplitude1" -> "Range1D: value 2"; "ArrayDataSource: amplitude1" -> "LinePlot"; "ArrayDataSource: amplitude2" -> "Range1D: value 1"; "ArrayDataSource: amplitude2" -> "Range1D: value 2" "ArrayDataSource: amplitude2" -> "ScatterPlot"; "LinearMapper: value 1" -> "LinePlot"; "LinearMapper: value 2" -> "ScatterPlot"; "LinearMapper: index 1" -> "LinePlot"; "LinearMapper: index 2" -> "ScatterPlot"; "PlotAxis: index 1" -> "Plot 1"; "PlotAxis: value 1" -> "Plot 1"; "PlotAxis: index 2" -> "Plot 2"; "PlotAxis: value 2" -> "Plot 2"; "LinePlot" -> "Plot 1"; "ScatterPlot" -> "Plot 2"; }

In contrast to the previous example the ranges and mappers are not related in any way between the two plots. This means that changes to the visible region in data space for one plot will not affect the other, and because the values span a different range initially they will have different value scales.

It is common to want to share one or both of the ranges between plots to keep the axes synchronized in data space.

For example, here are two plots which share the same data ranges:

def create_plot():
    t = np.linspace(0, 2*np.pi, 100)
    amplitude1 = 2*np.sin(t)
    amplitude2 = np.cos(2*t)
    plot_data = ArrayPlotData(
        t=t,
        amplitude1=amplitude1,
        amplitude2=amplitude2,
    )
    plot_1 = Plot(plot_data)
    plot_1.plot(('t', 'amplitude1'), type='line')
    plot_2 = Plot(plot_data)
    plot_2.plot(('t', 'amplitude2'), type='scatter')
    plot_2.index_range = plot_1.index_range
    plot_2.value_range = plot_1.value_range
    container = HPlotContainer(plot_1, plot2)

Which gives rise to the following data flow diagram:

digraph dataflow { subgraph cluster_level { node [shape=plaintext]; style=invis; "Data source" -> "Range" -> "Mapper" -> "Underlay" -> "Renderer" [style=invis]; } node [shape=rectangle]; subgraph index_1 { color=white; "Range1D: index" -> "LinearMapper: index 1" -> "PlotAxis: index 1"; } subgraph value_1 { color=white; "Range1D: value" -> "LinearMapper: value 1" -> "PlotAxis: value 1"; } subgraph index_2 { color=white; "Range1D: index" -> "LinearMapper: index 2" -> "PlotAxis: index 2"; } subgraph value_2 { color=white; "Range1D: value" -> "LinearMapper: value 2" -> "PlotAxis: value 2"; } {rank = same; "Data source"; "ArrayDataSource: time"; "ArrayDataSource: amplitude1"; "ArrayDataSource: amplitude2"} {rank = same; "Range"; "Range1D: index"; "Range1D: value"} {rank = same; "Mapper"; "LinearMapper: index 1"; "LinearMapper: value 1"; "LinearMapper: index 2"; "LinearMapper: value 2"} {rank = same; "Underlay"; "PlotAxis: index 1"; "PlotAxis: value 1"; "PlotAxis: index 2"; "PlotAxis: value 2"} {rank = same; "Renderer"; "LinePlot"; "ScatterPlot"} "ArrayPlotData" -> "ArrayDataSource: time"; "ArrayPlotData" -> "ArrayDataSource: amplitude1"; "ArrayPlotData" -> "ArrayDataSource: amplitude2"; "ArrayDataSource: time" -> "Range1D: index"; "ArrayDataSource: time" -> "LinePlot"; "ArrayDataSource: time" -> "ScatterPlot"; "ArrayDataSource: amplitude1" -> "Range1D: value"; "ArrayDataSource: amplitude1" -> "LinePlot"; "ArrayDataSource: amplitude2" -> "Range1D: value"; "ArrayDataSource: amplitude2" -> "ScatterPlot"; "LinearMapper: value 1" -> "LinePlot"; "LinearMapper: value 2" -> "ScatterPlot"; "LinearMapper: index 1" -> "LinePlot"; "LinearMapper: index 2" -> "ScatterPlot"; "PlotAxis: index 1" -> "Plot 1"; "PlotAxis: value 1" -> "Plot 1"; "PlotAxis: index 2" -> "Plot 2"; "PlotAxis: value 2" -> "Plot 2"; "LinePlot" -> "Plot 1"; "ScatterPlot" -> "Plot 2"; }

Here any change to the range will automatically update the mappers of both, so the visible ranges will match. However since the screen space of the two plots is different, we don’t want to share mappers (mappers can only be shared when the plots are contained in an OverlayPlotContainer or a subclass such as DataView or Plot)

Data Sources

At its core, Chaco is about visualizing interactive data. As such, Chaco has a standard API for representing data: all of these classes implement the AbstractDataSource API. This class has methods for getting and setting the data that is provided by the data source, as well as basic information about the data’s size and (for numerical data) the numerical bounds of the values. A data source can also hold a dictionary of arbitrary additional metadata.

The workhorse data source is the ArrayDataSource which holds a single NumPy of array of numerical data and which covers almost all common use cases. In most cases where you need to work with an ArrayDataSource you call set_data() to change the stored data, listen to the data_changed event trait for when the data changes and call get_data() to get the current value of the data.

Some users of a data source only care about the range of values that are contained in that data. In this case the data source API provides a bounds_changed trait that indicates that the maximum or minimum value of the data has changed, and those values can be efficiently retrieved via the get_bounds() trait.

Similarly there is a metadata_changed event trait that is fired when the metadata dictionary is replaced or modified.

A common use case for alternative data sources is to render a computed function (such as a curve that has been fit to the data) dynamically rather than having to sample a fixed set of points. This can be done by supplying the plot data with an FunctionDataSource and plotting that:

def create_plot():
    t = np.linspace(0, 2*np.pi, 100)
    amplitude = 2*np.sin(t) + numpy.random.normal(scale=0.1)
    plot_data = ArrayPlotData(t=t, amplitude=amplitude)
    plot = Plot(plot_data)
    plot.plot(('t', 'amplitude'), type='scatter')

    def f(low, high):
        return 2*np.sin(np.linspace(low, high, 100))

    data_source = FunctionDataSource(
        func=f, data_range=plot.index_range
    )
    plot_data.set_data('f', data_source)
    plot.plot(('t', 'f'), type='line')

    return plot

Mappers

Data as provided by the AbstractDataSource is not suitable for display; it needs to be mapped to an appropriate value for rendering into a graphics context. The most obvious mapping transforms data values into Enable’s drawing coordinates (often simply referred to as “screen” coordinates, whether or not they are actually rendered to a screen). However similar transformations need to be performed to map numerical data to color values for displaying on colormapped plots. There are two hierarchies of classes that perform these transformations.

The abstract base class for mapping data is the AbstractMapper and this class specifies methods map_screen() for mapping data values to screen values, map_data() for mapping screen values back to data values, and map_data_array() for mapping a collection of screen values to data values. Perhaps most importantly, the mapper fires the updated event.

Chaco provides a number of sub-classes of the base class for various use-cases. The most commonly used is the LinearMapper which provides a one dimensional linear transformation between data space and screen space, but there is also LogMapper which provides one dimensional logarithmic transformation, and GridMapper which provides a mapping frrom a two dimensional data source to a point in screen (x, y) coordinates using a combination of two one dimensional mappers.

For mapping of values to colors, there is the AbstractColormap class and the two sub-classes ColorMapper and DiscreteColorMapper. These have the same base API as AbstractMapper but also provide some specialized methods for converting to integer RGB values efficiently. Chaco provides a large number of default color maps suitable for various visualization types.

Ranges

A common problem to many data mappers is that the range of data values may change dynamically, and when data changes it is desirable to have the mapper automatically update itself to ensure that the full range of data values is mapped to the screen. This functionality is broken out into subclasses of the AbstractDataRange class.

These classes track a collection of AbstractDataSource instances via their sources trait, and when the bounds of any of those data sources change then the range adjusts its upper and lower bound appropriately. Data mappers then listen to the values of the upper and lower bounds of the range and use that to adjust the transformation that they apply. The actual values of the upper and lower bounds in data space coordinates are provided by the low and high traits.

However there are situations where the behaviour of the range should change, for example after a pan or zoom operation the value of the bounds should remain fixed to whatever values the user panned or zoomed to even if the underlying data changes. For these purposes, code interacting with a data range can set the low_setting and high_setting traits either to an absolute numerical value in the data space, or to a number of other values, such as auto or track that determine the behaviour when data changes.

The most commonly used subclass is DataRange1D which has a number of additional affordances to facilitate pleasant appearing plots, such as the ability to add some padding above and below the data via the margin trait, or even to supply a custom padding calculation function.

It is worthwhile noting that data ranges can be shared between mappers, and this permits linking of axes bounds or color maps ranges across different plots.

Axes and Grids

Axes and grids are auxilliary objects that draw plot decorations. They are underlays (and so inherit from AbstractOverlay) and are usually drawn into the underlay layer of a Plot but they are also able to be used as stand-alone components if needed (for example to create multi-axis plots).

These objects present numerous options for their styling, but perhaps more importantly allow control over the algorithm to used for determining where tick marks and grid lines should be drawn. Both classes have a tick_generator trait which takes an instance of an AbstractTickGenerator which has a single method get_ticks() that returns the tick positions for the current data and screen space bounds.

There are several standard tick generators available for use, but in the absence of anything else the DefaultTickGenerator is used, which tries to generate genererally pleasing ticks at round numbers for both linear and logarithmic mappings. The MinorTickGenerator is similar, but generates generate denser ticks that are suitable for use as a minor scale. The ShowAllTickGenerator simply shows ticks at a list of supplied data values, giving complete control at the expense of not being able to dynamically adapt to changes from panning and zooming.

For more complex tick generation, such as time axes where the “natural” tick spacings, positions and even label formatting can change as you zoom through different levels, the ScalesTickGenerator allows the user to specify a multi-leveled ScaleSystem. In particular this system provides the CalendarScaleSystem which by default correct ticks axes with time values ranging from microseconds through to years.

For example, you can create an hours, minutes, seconds time axis (ignoring higher level calendar constructs) for a plot as follows:

from chaco.scales.api import (
    CalendarScaleSystem, HMSScales, ScalesTickGenerator
)

def create_plot():
    t = np.linspace(0, 3600, 36001)
    a = np.sin(2*pi*60*t)
    plot_data = ArrayPlotData(t=t, a=a)
    plot = Plot(plot_data)
    plot.plot(('t', 'a'), type='line')
    plot.index_axis.tick_generator = ScalesTickGenerator(
        scale=CalendarScaleSystem(*HMSScales)
    )

    return plot

Plot Renderers

The core of the Chaco plotting library are the plot renderers which are responsible for drawing the markings that represent the data, all of which implement the AbstractPlotRenderer API. This ABC is a subclass of PlotComponent, and so all plot renderers are expected to implement the key parts of the Enable drawing API. Most specialized plot renderers expect a render() method that performs actual drawing of the plot into a provided Kiva graphics context.

Most plot renders have the notion of “index” and “value” data that they are plotting. Each item in the index has a corresponding value, so if a function were being plotted the index are points in the domain and the values are points in the range. For plot renderers the index usually provides a location at which the value should be rendered, and the value provides a position offset or color value. Importantly, the index and value are not directly linked to horizontal or vertical screen space.

Different subclasses of the abstract plot renderer implement common conventions for handling index and value representation. For example:

BaseXYPlot

This class handles plots like line plots and bar plots where the index gives offsets along one axis and the values are along the other axis.

Base1DPlot

This class handles plots where the index gives the offset along one axis, and the values are displayed by markings at or near those points.

Base2DPlot

This class handles plots like contour and image plots where the index lies on a regular 2D grid and values are displayed by markings at or near those points.

There are a number of other plot types that handle special cases like candle plots.

Plot renderers have mappers for each of their data dimensions, but they also express convenience APIs mapping data values to and from screen (x, y) values using the methods map_data() and map_screen(). These are usually simple wrappers around the appropriate mapper calls of the same name.

Plot renderers also have to provide information for tools that want to interact with the values on the plot. They are expected to provide a map_index() method which handles mapping a screen point to an index item (ie. an integer index into the index data source).

Tools

Up to this point, all the classes discussed are dynamic in the sense that if the underlying data changes then the visualization will update appropriately. However it is often the case that you want to add other interactions to a visualization. The most common of these is the ability to pan or zoom the plot to focus on particular details, but there number of ways that you might want a user to interact with the visualization is potentially vast. As a result one of the most common ways to customize a visualization is by writing new tools.

Tools are technically a feature of Enable, rather than Chaco, and as a result there are a number of tools and base classes there that can be used as the foundation or inspiration for custom interactions. For example, the following Enable tools may be of use:

enable.tools.move_tool.MoveTool

A tool which changes the screen location of a component by dragging with the mouse. This can be useful for allowing the user to move plot decorations such as legends around the plot.

enable.tools.resize_tool.ResizeTool

A tool which changes the screen size of a component by dragging edges or corners.

enable.tools.hover_tool.HoverTool

A tool which calls a callback when the mouse hs not moved significantly for a period of time.

enable.tools.button_tool.ButtonTool

A tool that makes a component act like a button, with a enable.tools.button_tool.ButtonTool.clicked trait that you can react to via the usual Traits mechanisms.

enable.tools.pyface.context_menu_tool.ContextMenuTool

A tool which displays a context menu at the point where the use right-clicks, using Pyface’s menu and action classes.

enable.tools.traits_tool.TraitsTool

A tool which opens a TraitsUI dialog when a component is double-clicked.

enable.tools.base_drop_tool.BaseDropTool

A base tool which responds to operating system drag and drop. Must be subclassed to implement methods that indicate whether a type of object can be dropped, and what to do if they are dropped.

enable.tools.value_drag_tool.ValueDragTool

A base tool which changes a numeric value as the user drags the mouse. Must be subclassed to provide methods to get and set the value. There is a subclass enable.tools.value_drag_tool.AttributeDragTool which sets the values of attributes on an object as the mouse moves, which is a common use case.

Overlays and Underlays

In some instances you want to render additional decorations that are independent of the plot type. In a similar fashion to the Tool classes auxilliary renderers can be attached to plots as “overlays” (and using the same mechanism, just rendering into a different layer, as “underlays”). Common use cases for overlays include cursor lines, selection regions, hover text, legends and other annotations. Overlays are frequently designed to work together with a particular Tool or class of tools, but can frequently be used independently if desired.

Overlays and underlays which need to render relative to points in data space will frequently want to make use of the plot mappers to know where in screen space to perform their drawing operations..