Introduction

This tutorial introduces the basics of Traits and gives you an idea of what capabilities the Traits library provides. It assumes that you are comfortable with the Python programming language, object-oriented programming, and the core tools of the Python scientific ecosystem, such as NumPy.

Beyond covering the basics, the tutorial also explains why you might want to use Traits in your own code.

An interactive version of this tutorial can be accessed using the ETS Demo application.

Introduction to Traits

To make the tutorial more practical, let’s imagine that you are writing some tools to organize and process image data, perhaps to help you develop a machine learning algorithm, or to document your research. This image data might come from a camera, a scanning electron microscope, or from some more exotic source. No matter the source, the image will have data associated with it, such as who produced the image, the equipment that was used, the date and time that the image was taken, and so forth.

In this tutorial, we’ll consider an example where we have a collection of greyscale SEM images, each of them stored in a file, along with other information, such as the sample that was imaged, the size of the scanned region, and the operator who took the image.

We would like to read in the image and provide a number of standard analysis steps to the user, as well as making the raw image data available as a NumPy array.

The sample code here shows how you might create a class in standard object-oriented Python with NumPy for analysis and Pillow for image loading. Experiment with the code in the example to see how you can use it.

In the subsequent steps of the tutorial, we’ll look at how we can use Traits to simplify and improve the code here.

import datetime
import os

import numpy as np
from PIL import Image as PILImage


class Image:
    """ An SEM image stored in a file. """

    def __init__(self, filename, sample_id, date_acquired, operator,
                 scan_size=(1e-5, 1e-5)):
        # initialize the primary attributes
        self.filename = filename
        self.sample_id = sample_id
        self.operator = operator
        self.date_acquired = date_acquired
        self.scan_size = scan_size

        # useful secondary attributes
        self.scan_width, self.scan_height = self.scan_size

    def read_image(self):
        """ Read the image from disk. """
        pil_image = PILImage.open(self.filename).convert("L")
        self.image = np.array(pil_image)

        # compute some extra secondary attributes from the image
        if self.image.size > 0:
            self.pixel_area = (
                self.scan_height * self.scan_width / self.image.size
            )
        else:
            self.pixel_area = 0

    def histogram(self):
        """ Compute the normalized histogram of the image. """
        hist, bins = np.histogram(
            self.image,
            bins=256,
            range=(0, 256),
            density=True,
        )
        return hist

    def threshold(self, low=0, high=255):
        """ Compute a threshold mask for the array. """
        return (self.image >= low) & (self.image <= high)


# ---------------------------------------------------------------------------
# Demo code
# ---------------------------------------------------------------------------

this_dir = os.path.dirname(__file__)
image_dir = os.path.join(this_dir, "images")
filename = os.path.join(image_dir, "sample_0001.png")

# load the image
image = Image(
    filename=filename,
    operator="Hannes",
    sample_id="0001",
    date_acquired=datetime.datetime.today(),
    scan_size=(1e-5, 1e-5),
)

# read the image from disk
image.read_image()

# perform some sample computations
print(
    "The most common intensity of {} is {}".format(
        image.sample_id,
        image.histogram().argmax(),
    )
)
pixel_count = image.threshold(low=200).sum()
print(
    "The area with intensity of at least 200 is {:0.3f} µm²".format(
        pixel_count * image.pixel_area * 1e12
    )
)

Validation

A common issue faced by scientists is ensuring that the data that is entered matches the sort of data that is expected. For example, we expect that the filename is a string (or perhaps a pathlib Path, for advanced users), the operator name is a string, the acquisition date is a datetime.date object, and so on. Many languages allow (or even require) you to specify these data types as part of your program, but even then these data types tend to reflect what the computer stores in memory, and not what the data actually represents.

For example, not only should the file be a string, but we would like to validate that the string is in fact a valid path name, and ideally that the file can actually be found. A sample ID may be expected to follow some pattern based on lab protocols. The image data is expected to be a 2D array of values rather than just any NumPy array. And so on.
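
To see why this matters, here is a rough sketch of what checks of this kind look like when written by hand in plain Python. The CheckedImage class and the four-digit sample ID pattern are assumptions made up for this illustration; they are not part of the tutorial's example code.

```python
import re


class CheckedImage:
    """ Hand-written validation, without Traits (illustrative only). """

    # Hypothetical lab protocol: sample IDs are exactly four digits.
    _sample_id_pattern = re.compile(r"\d{4}")

    def __init__(self, sample_id, operator):
        # these assignments go through the property setters below
        self.sample_id = sample_id
        self.operator = operator

    @property
    def sample_id(self):
        return self._sample_id

    @sample_id.setter
    def sample_id(self, value):
        if not isinstance(value, str):
            raise TypeError("sample_id must be a string")
        if not self._sample_id_pattern.fullmatch(value):
            raise ValueError("sample_id must be four digits, e.g. '0001'")
        self._sample_id = value

    @property
    def operator(self):
        return self._operator

    @operator.setter
    def operator(self, value):
        if not isinstance(value, str):
            raise TypeError("operator must be a string")
        self._operator = value


image = CheckedImage(sample_id="0001", operator="Hannes")

try:
    image.sample_id = "sample-1"
except ValueError as exc:
    print("Rejected:", exc)
```

Every validated attribute needs its own getter and setter, which is exactly the kind of repetition that a declarative approach can remove.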

Traits provides a way to ensure that values are validated when assigned in your classes: as part of your class definition you _declare_ what you expect your data to be. To do this, the first thing you want to do is inherit from the base HasTraits class which enables all of the validation machinery:

from traits.api import HasTraits

class Image(HasTraits):
    ''' An SEM image stored in a file. '''

Having done this, we can declare the types of data that we expect for the attributes of an Image class. For example, we expect that the operator name should be a string, so we can use the standard Traits Str trait type:

from traits.api import HasTraits, Str

class Image(HasTraits):
    ''' An SEM image stored in a file. '''

    operator = Str()

Now, if we try to assign any other value to the operator attribute, we find that the class will raise a TraitError:

>>> image = Image()
>>> image.operator = 3
TraitError: The 'operator' trait of an Image instance must be a
string, but a value of 3 <class 'int'> was specified.

Traits has trait types corresponding to all the basic Python data types: Int, Float, Complex, Bool, and Str. It also has trait types for the standard containers: List, Dict, Set and Tuple. There is an Instance trait type for values which are instances of a Python class. Traits also provides a rich set of trait types that cover many common data types, for example:

  • we can use a Date trait type for the date_acquired trait.

  • we can specify that the scan size is not just a tuple, but a pair of floating point values by specifying Tuple(Float, Float).

  • we can use a File trait for the filename, and we can require that the path refer to an existing file by using File(exists=True).

  • we can specify that the image data is a 2D NumPy array of unsigned 8-bit integers with Array(shape=(None, None), dtype='uint8').

Everything else in the class can remain unchanged, and it will still work as expected. However, just as with regular Python classes, we need to remember to call super() in the __init__ method:

def __init__(self, filename, sample_id, date_acquired, operator,
             scan_size=(1e-5, 1e-5)):
    super().__init__()

    # initialize the primary attributes
    ...

When we talk about an attribute which is declared by a trait type, it is common to call it a _trait_ rather than an attribute.

Traits and Static Types

The first version of Traits was written over 15 years ago. In the last 5 years or so, Python has started to gain the ability to perform static type checking using tools like mypy and certain integrated development environments. The dataclasses module introduced in Python 3.7 can do similar sorts of type declaration for classes. Advanced Python users may already be aware of, and using, these tools.

As we will see, the capabilities of Traits go well beyond those of these type checking systems. However, if you have the traits-stubs package installed, most of your trait type declarations will be recognised and can be used with these new Python type systems.

Exercise

The example code hasn’t declared trait types for all the attributes used by the class. Declare trait types for scan_width, scan_height and pixel_area.

Initialization

If you have done any significant amount of object-oriented Python programming, you may have noticed that your __init__ methods often have a lot of boilerplate. In our original example, the code copies all of the __init__ arguments to corresponding attributes before doing any real work:

def __init__(self, filename, sample_id, date_acquired, operator,
             scan_size=(1e-5, 1e-5)):
    # initialize the primary attributes
    self.filename = filename
    self.sample_id = sample_id
    self.operator = operator
    self.date_acquired = date_acquired
    self.scan_size = scan_size

Traits lets you avoid this boilerplate by defining a default __init__ method that accepts keyword arguments that correspond to the declared traits. The Traits version of the Image class could potentially skip the __init__ method entirely:

class Image(HasTraits):
    filename = File(exists=True)
    sample_id = Str()
    date_acquired = Date()
    operator = Str()
    scan_size = Tuple(Float, Float)


# this works!
image = Image(
    filename=filename,
    operator="Hannes",
    sample_id="0001",
    date_acquired=datetime.date.today(),
    scan_size=(1e-5, 1e-5),
)

Default Values

There are a couple of complications in the example that we need to take into account. The first is what happens if a user forgets to provide an initial value:

>>> image = Image(
...     filename=filename,
...     sample_id="0001",
...     date_acquired=datetime.date.today(),
...     scan_size=(1e-5, 1e-5),
... )
>>> image.operator
""

As this example shows, the operator is given a default value of the empty string "". In fact every trait type comes with a default value. For numeric trait types, like Int and Float, the default is 0. For Str trait types it is the empty string, for Bool traits it is False, and so on.

However, that might not be what you want as your default value. For example, you might want to instead flag that the operator has not been provided with the string "N/A" for “not available”. Most trait types allow you to specify a default value as part of the declaration. So we could say:

operator = Str("N/A")

and now if we omit operator from the arguments, we get:

>>> image.operator
"N/A"

Dynamic Defaults

The second complication comes from more complex initial values. For example, we could declare some arbitrary fixed date as the default value for date_acquired:

date_acquired = Date(datetime.date(2020, 1, 1))

But it would be better if we could set it to a dynamic value. For example, a reasonable default would be today’s date. You can provide this sort of dynamically declared default by using a specially-named method which has the pattern _<trait-name>_default and which returns the default value. So we could write:

def _date_acquired_default(self):
    return datetime.date.today()

Dynamic defaults are best used for values which don’t depend on other traits. For example, it might be tempting to have the image trait have a dynamic default which loads in the data. As we will see, this is almost always better handled by Traits observation and/or properties, which are discussed in subsequent sections of the tutorial.

The traits_init Method

Although you aren’t required to write an __init__ method in a HasTraits subclass, you can always choose to do so. If you do, you must call super() to ensure that Traits has a chance to set up its machinery. In our example the __init__ method is also used to set up some auxiliary values. This doesn’t have to change:

def __init__(self, **traits):
    super().__init__(**traits)

    # useful secondary attributes
    self.scan_width, self.scan_height = self.scan_size

However, Traits offers a slightly more convenient way of doing this sort of post-initialization setup of state: you can define a traits_init method which the HasTraits class ensures is called as part of the main initialization process. When it has been called, all initial values will have been set:

def traits_init(self):
    # useful secondary attributes
    self.scan_width, self.scan_height = self.scan_size

Exercise

In our original example, the scan_size attribute had a default value of (1e-5, 1e-5). Modify the code in the example so that the trait is initialized to this default using a dynamic default method.

Observation

In our code so far, related values can get out of sync with one another. For example, if we change the filename after we have read the image into memory, then the data in memory still refers to the old image. It would be nice if we could automatically re-load the image if the filename changes. Traits allows you to do this.

The Observe Decorator

We want to have the read_image method run whenever the filename trait changes. We can do this by adding an observe decorator to the method:

class Image(HasTraits):
    ...

    @observe('filename')
    def read_image(self, event):
        ...

The observer passes an event object to the function which contains information about what changed, such as the old value of the trait, but we don’t need that information to react to the change, so it is ignored in the body of the function.

For most traits, the observer will run only when the trait’s value actually changes, not just when the value is set. So if you do:

>>> image.filename = "sample_0001.png"
>>> image.filename = "sample_0001.png"

then the observer will only be run once.

Observing Multiple Traits

If you look at the computation of pixel_area in the original code, it looks like this:

self.pixel_area = self.scan_height * self.scan_width / self.image.size

It depends on the scan_width, scan_height and the image, so we would like to listen to changes to all of these. We could write three @observe functions, one for each trait, but the content would be the same for each. A better way to do this is to have the observer listen to all the traits at once:

class Image(HasTraits):
    ...

    @observe('scan_width, scan_height, image')
    def update_pixel_area(self, event):
        if self.image.size > 0:
            self.pixel_area = (
                self.scan_height * self.scan_width / self.image.size
            )
        else:
            self.pixel_area = 0

Dynamic Observers

Sometimes you want to be able to observe changes to traits from a different object or piece of code. The observe method on a HasTraits subclass allows you to dynamically specify a function to be called if the value of a trait changes:

image = Image(
    filename="sample_0001.png",
    sample_id="0001",
)

def print_filename_changed(event):
    print("Filename changed")

image.observe(print_filename_changed, 'filename')

# will print "Filename changed" to the screen
image.filename = "sample_0002.png"

Dynamic observers can also be disconnected using the same method, by adding the argument remove=True:

image.observe(print_filename_changed, 'filename', remove=True)

# nothing will print
image.filename = "sample_0003.png"

Exercise

Currently scan_height and scan_width are set from the parts of the scan_size trait as part of the traits_init method. Remove the traits_init method and have scan_height and scan_width update whenever scan_size changes.

Property Traits

The Image class has three traits which are closely related: scan_size, scan_width and scan_height. We would ideally like to keep all of these synchronized. This can be done with trait observation, as shown in the previous section, but this sort of pattern is common enough that Traits has some in-built helpers.

Instead of declaring that scan_width = Float() we could instead declare it to be a Property trait type. Property traits are similar to @property decorators in standard Python in that rather than storing a value, they compute a derived value via a “getter”, and optionally store values via a “setter”. Traits uses specially named methods of the form _get_<property> and _set_<property> for these “getters” and “setters.” If there is a “getter” but no “setter” then the property is read-only.

Additionally, we need to know when the value of the property might change, and so we need to declare what traits to observe to know when the property might change. What all this means is that we can define scan_width as a property by:

class Image(HasTraits):
    ...

    scan_width = Property(Float, depends_on='scan_size')

    def _get_scan_width(self):
        return self.scan_size[0]

    def _set_scan_width(self, value):
        self.scan_size = (value, self.scan_height)

Traits will then take care of hooking up all the required observers to make everything work as expected. The Property can also be observed if desired.

Simple Property traits like this are computed “lazily”: the value is only calculated when you ask for it.

Cached Properties

It would be quite easy to turn the histogram function into a read-only property. This might look something like this:

class Image(HasTraits):
    ...

    histogram = Property(Array, depends_on='image')

    def _get_histogram(self):
        hist, bins = np.histogram(
            self.image,
            bins=256,
            range=(0, 256),
            density=True,
        )
        return hist

This works, but it has the downside that the histogram is re-computed every time the property is accessed. For small images, this is probably OK, but for larger images, or if you are working with many images at once, this may impact performance. In these cases, you can specify that the property should cache the returned value and use that value until one of the traits the property depends on changes:

class Image(HasTraits):
    ...

    histogram = Property(Array, depends_on='image')

    @cached_property
    def _get_histogram(self):
        hist, bins = np.histogram(
            self.image,
            bins=256,
            range=(0, 256),
            density=True,
        )
        return hist

This has the trade-off that the result of the computation is stored in memory. In this case the stored histogram is small (256 values), so it is unlikely to cause problems, but you probably wouldn’t want to do this with a multi-gigabyte array.

Exercise

Make pixel_area a read-only property.

Documentation

Another advantage of using Traits is that your code becomes clearer and easier for yourself and other people to work with. If you look at the original version of the image class, it isn’t clear what attributes are available on the class and, worse yet, it isn’t clear when those attributes are available.

Self-Documenting Code

By using Traits, all your attributes are declared up-front, so anyone reading your class knows exactly what your class is providing:

class Image(HasTraits):
    filename = File(exists=True)
    sample_id = Str()
    operator = Str("N/A")
    date_acquired = Date()
    scan_size = Tuple(Float, Float)
    scan_width = Property(Float, depends_on='scan_size')
    scan_height = Property(Float, depends_on='scan_size')
    image = Array(shape=(None, None), dtype='uint8')
    pixel_area = Property(Float, depends_on='scan_height,scan_width,image')
    histogram = Property(Array, depends_on='image')

This goes a long way towards the ideal of “self-documenting code.” It is common in production-quality Traits code to also document each trait with a special #: comment so that auto-documentation tools like Sphinx can generate API documentation for you:

class Image(HasTraits):

    #: The filename of the image.
    filename = File(exists=True)

    #: The ID of the sample that is being imaged.
    sample_id = Str()

    #: The name of the operator who acquired the image.
    operator = Str("N/A")

    ...

HasStrictTraits

One common class of errors in Python is typos when accessing an attribute of an object. For example, if we typed:

>>> image.fileanme = "sample_0002.png"

Python will not raise an error, but the code will not have the effect that the user expects. Some development tools can help you detect these sorts of errors, but most of these are not available when writing code interactively, such as in a Jupyter notebook.

Traits provides a special subclass of HasTraits called HasStrictTraits which restricts the allowed attributes to only the traits on the class. If we use:

class Image(HasStrictTraits):
    ...

then if we type:

>>> image.fileanme = "sample_0002.png"
TraitError: Cannot set the undefined 'fileanme' attribute of a 'Image'
object.

we get an immediate error which flags the problem.

Visualization

Traits allows you to instantly create a graphical user interface for your HasTraits classes. If you have TraitsUI and a suitable GUI toolkit (such as PyQt5 or PySide2) installed in your Python environment then you can create a dialog view of an instance of your class by simply doing:

>>> image.configure_traits()

This gives you a default UI out of the box with no further effort, but it usually is not what you would want to provide for your users.

With a little more effort using the features of TraitsUI, you can design a dialog which is more pleasing:

from traitsui.api import Item, View

class Image(HasStrictTraits):
    ...

    view = View(
        Item('filename'),
        Item('sample_id', label='Sample ID'),
        Item('operator'),
        Item('date_acquired'),
        Item('scan_width', label='Width (m):'),
        Item('scan_height', label='Height (m):')
    )

TraitsUI can be used as the building block for complete scientific applications, including 2D and 3D plotting.