Introduction¶
This tutorial introduces the basics of Traits, gives you an idea of what the library can do, and explains why you might want to use Traits in your own code. It assumes that you are comfortable with the Python programming language, object-oriented programming, and the core tools of the Python scientific ecosystem, such as NumPy.
An interactive version of this tutorial can be accessed using the ETS Demo application.
Introduction to Traits¶
To make the tutorial more practical, let’s imagine that you are writing some tools to organize and process image data, perhaps to help you develop a machine learning algorithm, or to document your research. This image data might come from a camera, a scanning electron microscope, or from some more exotic source. No matter the source, the image will have data associated with it, such as who produced the image, the equipment that was used, the date and time that the image was taken, and so forth.
In this tutorial, we’ll consider an example where we have a collection of greyscale SEM images, each of them stored in a file, along with other information, such as the sample that was imaged, the size of the scanned region, and the operator who took the image.
We would like to read in the image and provide a number of standard analysis steps to the user, as well as making the raw image data available as a NumPy array.
The sample code here shows how you might create a class in standard object-oriented Python with NumPy for analysis and Pillow for image loading. Experiment with the code in the example to see how you can use it.
In the subsequent steps of the tutorial, we’ll look at how we can use Traits to simplify and improve the code here.
import datetime
import os
import numpy as np
from PIL import Image as PILImage
class Image:
    """ An SEM image stored in a file. """

    def __init__(self, filename, sample_id, date_acquired, operator,
                 scan_size=(1e-5, 1e-5)):
        # initialize the primary attributes
        self.filename = filename
        self.sample_id = sample_id
        self.operator = operator
        self.date_acquired = date_acquired
        self.scan_size = scan_size

        # useful secondary attributes
        self.scan_width, self.scan_height = self.scan_size

    def read_image(self):
        """ Read the image from disk. """
        pil_image = PILImage.open(self.filename).convert("L")
        self.image = np.array(pil_image)

        # compute some extra secondary attributes from the image
        if self.image.size > 0:
            self.pixel_area = (
                self.scan_height * self.scan_width / self.image.size
            )
        else:
            self.pixel_area = 0

    def histogram(self):
        """ Compute the normalized histogram of the image. """
        hist, bins = np.histogram(
            self.image,
            bins=256,
            range=(0, 256),
            density=True,
        )
        return hist

    def threshold(self, low=0, high=255):
        """ Compute a threshold mask for the array. """
        return (self.image >= low) & (self.image <= high)
# ---------------------------------------------------------------------------
# Demo code
# ---------------------------------------------------------------------------
this_dir = os.path.dirname(__file__)
image_dir = os.path.join(this_dir, "images")
filename = os.path.join(image_dir, "sample_0001.png")
# load the image
image = Image(
    filename=filename,
    operator="Hannes",
    sample_id="0001",
    date_acquired=datetime.datetime.today(),
    scan_size=(1e-5, 1e-5),
)

# read the image from disk
image.read_image()

# perform some sample computations
# (the argmax of the histogram is the most frequently occurring intensity)
print(
    "The most common intensity in {} is {}".format(
        image.sample_id,
        image.histogram().argmax(),
    )
)
pixel_count = image.threshold(low=200).sum()
print(
    "The area with intensity of at least 200 is {:0.3f} µm²".format(
        pixel_count * image.pixel_area * 1e12
    )
)
Validation¶
A common issue faced by scientists is ensuring that the data that is entered matches the sort of data that is expected. For example, we expect that the filename is a string (or perhaps a pathlib Path, for advanced users), the operator name is a string, the acquisition date is a datetime.date object, and so on. Many languages allow (or even require) you to specify these data types as part of your program, but even then these data types tend to reflect what the computer stores in memory, and not what the data actually represents.
For example, not only should the file be a string, but we would like to validate that the string is in fact a valid path name, and ideally that the file can actually be found. A sample ID may be expected to follow some pattern based on lab protocols. The image data is expected to be a 2D array of values rather than just any NumPy array. And so on.
Traits provides a way to ensure that values are validated when assigned in your classes: as part of your class definition you _declare_ what you expect your data to be. To do this, the first thing you want to do is inherit from the base HasTraits class, which enables all of the validation machinery:
from traits.api import HasTraits

class Image(HasTraits):
    ''' An SEM image stored in a file. '''
Having done this, we can declare the types of data that we expect for the attributes of an Image class. For example, we expect that the operator name should be a string, so we can use the standard Traits Str trait type:
from traits.api import HasTraits, Str

class Image(HasTraits):
    ''' An SEM image stored in a file. '''

    operator = Str()
Now, if we try to assign any other value to the operator attribute, we find that the class will raise a TraitError:
>>> image = Image()
>>> image.operator = 3
TraitError: The 'operator' trait of an Image instance must be a string, but a value of 3 <class 'int'> was specified.
Traits has trait types corresponding to all the basic Python data types: Int, Float, Complex, Bool, and Str. It also has trait types for the standard containers: List, Dict, Set and Tuple. There is an Instance trait type for values which are instances of a Python class. Traits also provides a rich set of trait types that cover many common data types, for example:
- we can use a Date trait type for the date_acquired.
- we can specify that the scan size is not just a tuple, but a pair of floating point values by specifying Tuple(Float, Float).
- we can use a File trait for the filename, and we can require that the path refer to an existing file by using File(exists=True).
- we can specify that the image data is a 2D NumPy array of unsigned integers with Array(shape=(None, None), dtype='uint8').
Everything else can remain unchanged in the class, and it will still work as expected. However, just as with regular Python classes, we need to remember to call super() in the __init__ method:
def __init__(self, filename, sample_id, date_acquired, operator,
             scan_size=(1e-5, 1e-5)):
    super().__init__()

    # initialize the primary attributes
    ...
When we talk about an attribute which is declared by a trait type, it is common to call it a _trait_ rather than an attribute.
Traits and Static Types¶
The first version of Traits was written over 15 years ago. In the last 5 years or so, Python has started to gain the ability to perform static type checking using tools like MyPy and certain integrated development environments. The dataclasses module introduced in recent Python versions can do similar sorts of type declaration for classes. Advanced Python users may already be aware of, and using, these tools.
As we will see, the capabilities of Traits are much greater than these type checking systems. However, if you have the traits-stubs package installed, most of your trait type declarations will be recognised and can be used with these new Python type systems.
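To illustrate the difference, here is a small sketch using the standard-library dataclasses module. The ImageRecord class is hypothetical and exists only for this comparison. The annotations describe the expected types, but Python does not check them when the program runs, so a static checker has to be run separately to catch the mistake.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    """ A hypothetical record, used only for this comparison. """

    operator: str
    sample_id: str


# No error is raised here at runtime, even though operator is annotated
# as str; a static checker such as mypy would be needed to flag this call.
record = ImageRecord(operator=3, sample_id="0001")
print(type(record.operator).__name__)
```

The equivalent assignment on a HasTraits class with operator = Str() raises a TraitError as soon as it is executed.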
Exercise¶
The example code hasn’t declared trait types for all the attributes used by the class. Declare trait types for scan_width, scan_height and pixel_area.
Initialization¶
If you have done any significant amount of object-oriented Python programming, you may have noticed that your __init__ methods often have a lot of boilerplate. In our original example, the code copies all of the __init__ arguments to corresponding attributes before doing any real work:
def __init__(self, filename, sample_id, date_acquired, operator,
             scan_size=(1e-5, 1e-5)):
    # initialize the primary attributes
    self.filename = filename
    self.sample_id = sample_id
    self.operator = operator
    self.date_acquired = date_acquired
    self.scan_size = scan_size
Traits lets you avoid this boilerplate by defining a default __init__ method that accepts keyword arguments that correspond to the declared traits. The Traits version of the Image class could potentially skip the __init__ method entirely:
class Image(HasTraits):
    filename = File(exists=True)
    sample_id = Str()
    date_acquired = Date()
    operator = Str()
    scan_size = Tuple(Float, Float)

# this works!
image = Image(
    filename=filename,
    operator="Hannes",
    sample_id="0001",
    date_acquired=datetime.date.today(),
    scan_size=(1e-5, 1e-5),
)
Default Values¶
There are a couple of complications in the example that we need to take into account. The first is what happens if a user forgets to provide an initial value:
>>> image = Image(
...     filename=filename,
...     sample_id="0001",
...     date_acquired=datetime.date.today(),
...     scan_size=(1e-5, 1e-5),
... )
>>> image.operator
""
As this example shows, the operator gets given a default value of the empty string "". In fact every trait type comes with a default value. For numeric trait types, like Int and Float, the default is 0. For Str trait types it is the empty string, for Bool traits it is False, and so on.
However, that might not be what you want as your default value. For example, you might want to instead flag that the operator has not been provided with the string "N/A" for “not available”. Most trait types allow you to specify a default value as part of the declaration. So we could say:
operator = Str("N/A")
and now if we omit operator from the arguments, we get:
>>> image.operator
"N/A"
Dynamic Defaults¶
The second complication comes from more complex initial values. For example, we could declare some arbitrary fixed date as the default value for date_acquired:
date_acquired = Date(datetime.date(2020, 1, 1))
But it would be better if we could set it to a dynamic value. For example, a reasonable default would be today’s date. You can provide this sort of dynamically declared default by using a specially-named method which has the pattern _<trait-name>_default and which returns the default value. So we could write:
def _date_acquired_default(self):
    return datetime.date.today()
Dynamic defaults are best used for values which don’t depend on other traits. For example, it might be tempting to have the image trait have a dynamic default which loads in the data. As we will see, this is almost always better handled by Traits observation and/or properties, which are discussed in subsequent sections of the tutorial.
The traits_init Method¶
Although you aren’t required to write an __init__ method in a HasTraits subclass, you can always choose to do so. If you do, you must call super() to ensure that Traits has a chance to set up its machinery. In our example the __init__ method is also used to set up some auxiliary values. This doesn’t have to change:
def __init__(self, **traits):
    super().__init__(**traits)

    # useful secondary attributes
    self.scan_width, self.scan_height = self.scan_size
However, Traits offers a slightly more convenient way of doing this sort of post-initialization setup of state: you can define a traits_init method which the HasTraits class ensures is called as part of the main initialization process. When it has been called, all initial values will have been set:
def traits_init(self):
    # useful secondary attributes
    self.scan_width, self.scan_height = self.scan_size
Exercise¶
In our original example, the scan_size attribute had a default value of (1e-5, 1e-5). Modify the code in the example so that the trait is initialized to this default using a dynamic default method.
Observation¶
In our code so far, certain related values can get out of sync with one another. For example, if we change the filename after we have read the image into memory, then the data in memory still refers to the old image. It would be nice if we could automatically re-load the image if the filename changes. Traits allows you to do this.
The Observe Decorator¶
We want to have the read_image method run whenever the filename trait changes. We can do this by adding an observe decorator to the method:
class Image(HasTraits):
    ...

    @observe('filename')
    def read_image(self, event):
        ...
The observer passes an event object to the function which contains information about what changed, such as the old value of the trait, but we don’t need that information to react to the change, so it is ignored in the body of the function.
For most traits, the observer will run only when the trait’s value actually changes, not just when the value is set. So if you do:
>>> image.filename = "sample_0001.png"
>>> image.filename = "sample_0001.png"
then the observer will only be run once.
Observing Multiple Traits¶
If you look at the computation of pixel_area in the original code, it looks like this:
self.pixel_area = self.scan_height * self.scan_width / self.image.size
It depends on the scan_width, scan_height and the image, so we would like to listen to changes to all of these. We could write three @observe functions, one for each trait, but the content would be the same for each. A better way to do this is to have the observer listen to all the traits at once:
class Image(HasTraits):
    ...

    @observe('scan_width, scan_height, image')
    def update_pixel_area(self, event):
        if self.image.size > 0:
            self.pixel_area = (
                self.scan_height * self.scan_width / self.image.size
            )
        else:
            self.pixel_area = 0
Dynamic Observers¶
Sometimes you want to be able to observe changes to traits from a different object or piece of code. The observe method on a HasTraits subclass allows you to dynamically specify a function to be called if the value of a trait changes:
image = Image(
    filename="sample_0001.png",
    sample_id="0001",
)

def print_filename_changed(event):
    print("Filename changed")

image.observe(print_filename_changed, 'filename')

# will print "Filename changed" to the screen
image.filename = "sample_0002.png"
Dynamic observers can also be disconnected using the same method, by adding the argument remove=True:
image.observe(print_filename_changed, 'filename', remove=True)

# nothing will print
image.filename = "sample_0003.png"
Exercise¶
Currently scan_height and scan_width are set from the parts of the scan_size trait as part of the traits_init method. Remove the traits_init method and have the scan_height and scan_width traits update whenever scan_size changes.
Property Traits¶
The Image class has three traits which are closely related: scan_size, scan_width and scan_height. We would ideally like to keep all of these synchronized. This can be done with trait observation, as shown in the previous section, but this sort of pattern is common enough that Traits has some in-built helpers.
Instead of declaring that scan_width = Float() we could instead declare it to be a Property trait type. Property traits are similar to @property decorators in standard Python in that rather than storing a value, they compute a derived value via a “getter”, and optionally store values via a “setter”. Traits uses specially named methods of the form _get_<property> and _set_<property> for these “getters” and “setters.” If there is a “getter” but no “setter” then the property is read-only.
Additionally, we need to know when the value of the property might change, and so we need to declare what traits to observe to know when the property might change. What all this means is that we can define scan_width as a property by:
class Image(HasTraits):
    ...

    scan_width = Property(Float, depends_on='scan_size')

    def _get_scan_width(self):
        return self.scan_size[0]

    def _set_scan_width(self, value):
        self.scan_size = (value, self.scan_height)
Traits will then take care of hooking up all the required observers to make everything work as expected; and the Property can also be observed if desired.
Simple Property traits like this are computed “lazily”: the value is only calculated when you ask for it.
Cached Properties¶
It would be quite easy to turn the histogram function into a read-only property. This might look something like this:
class Image(HasTraits):
    ...

    histogram = Property(Array, depends_on='image')

    def _get_histogram(self):
        hist, bins = np.histogram(
            self.image,
            bins=256,
            range=(0, 256),
            density=True,
        )
        return hist
This works, but it has the downside that the histogram is re-computed every time the property is accessed. For small images, this is probably OK, but for larger images, or if you are working with many images at once, this may impact performance. In these cases, you can specify that the property should cache the returned value and use that value until the trait(s) the property depends on change:
class Image(HasTraits):
    ...

    histogram = Property(Array, depends_on='image')

    @cached_property
    def _get_histogram(self):
        hist, bins = np.histogram(
            self.image,
            bins=256,
            range=(0, 256),
            density=True,
        )
        return hist
This has the trade-off that the result of the computation is being stored in memory, but in this case the memory is only a few hundred bytes, and so is unlikely to cause problems; but you probably wouldn’t want to do this with a multi-gigabyte array.
Exercise¶
Make pixel_area a read-only property.
Documentation¶
Another advantage of using Traits is that your code becomes clearer and easier for yourself and other people to work with. If you look at the original version of the image class, it isn’t clear what attributes are available on the class and, worse yet, it isn’t clear when those attributes are available.
Self-Documenting Code¶
By using Traits, all your attributes are declared up-front, so anyone reading your class knows exactly what your class is providing:
class Image(HasTraits):
    filename = File(exists=True)
    sample_id = Str()
    operator = Str("N/A")
    date_acquired = Date()
    scan_size = Tuple(Float, Float)
    scan_width = Property(Float, depends_on='scan_size')
    scan_height = Property(Float, depends_on='scan_size')
    image = Array(shape=(None, None), dtype='uint8')
    pixel_area = Property(Float, depends_on='scan_height,scan_width,image')
    histogram = Property(Array, depends_on='image')
This goes a long way towards the ideal of “self-documenting code.” It is common in production-quality Traits code to also document each trait with a special #: comment so that auto-documentation tools like Sphinx can generate API documentation for you:
class Image(HasTraits):

    #: The filename of the image.
    filename = File(exists=True)

    #: The ID of the sample that is being imaged.
    sample_id = Str()

    #: The name of the operator who acquired the image.
    operator = Str("N/A")

    ...
HasStrictTraits¶
One common class of errors in Python is typos when accessing an attribute of a class. For example, if we typed:
>>> image.fileanme = "sample_0002.png"
Python will not throw an error, but the code will not have the effect that the user expects. Some development tools can help you detect these sorts of errors, but most of these are not available when writing code interactively, such as in a Jupyter notebook.
Traits provides a special subclass of HasTraits called HasStrictTraits which restricts the allowed attributes to only the traits on the class. If we use:
class Image(HasStrictTraits):
    ...
then if we type:
>>> image.fileanme = "sample_0002.png"
TraitError: Cannot set the undefined 'fileanme' attribute of a 'Image' object.
we get an immediate error which flags the problem.
Visualization¶
Traits allows you to instantly create a graphical user interface for your HasTraits classes. If you have TraitsUI and a suitable GUI toolkit (such as PyQt5 or PySide2) installed in your Python environment then you can create a dialog view of an instance of your class by simply doing:
>>> image.configure_traits()
This gives you a default UI out of the box with no further effort, but it usually is not what you would want to provide for your users.
With a little more effort using the features of TraitsUI, you can design a dialog which is more pleasing:
from traitsui.api import Item, View

class Image(HasStrictTraits):
    ...

    view = View(
        Item('filename'),
        Item('sample_id', label='Sample ID'),
        Item('operator'),
        Item('date_acquired'),
        Item('scan_width', label='Width (m):'),
        Item('scan_height', label='Height (m):')
    )
TraitsUI can be used as the building block for complete scientific applications, including 2D and 3D plotting.