API Overview

This section introduces the main EDM API, without going into too many details.

Basic concepts

At the highest level, EDM manipulates two concepts:

  • environments, which are self-contained and consist of an installed language runtime (say, Python) and a set of packages
  • packages within an environment

To manipulate environments, one uses the EnvironmentsManager class; to manipulate packages within an environment, one uses the PackagesManager class.

Managing environments

A given EnvironmentsManager instance manipulates environments defined in a root directory: this root directory contains both the environments and the metadata used by EDM to track various details about those environments.

The recommended way to create an EnvironmentsManager instance is through the EnvironmentsManager.from_settings factory. Settings instances track the various configuration attributes, and can be created in memory without referring to any stored settings:

from edm.api import EnvironmentsManager, Settings

settings = Settings(root_directory="/tmp")
environments_manager = EnvironmentsManager.from_settings(settings)

Once created, an EnvironmentsManager instance can be used to query environments, e.g.:

for env in environments_manager:
    print("Environment {} installed in {}".format(env.name, env.prefix))

Session objects

Many operations around package management involve network I/O. At the lowest level, network I/O is handled through a Session instance, which is essentially a thin wrapper around the requests library.

The simplest way to create a session object is through the Session.from_settings and Session.authenticated_from_settings factories:

settings = Settings(auth=("nono", "le petit robot"))
with Session.authenticated_from_settings(settings) as session:
    # session is now authenticated against the remote server
    ...

In most cases, you will want to use an authenticated session.

Managing packages

To manage packages within an environment, one needs a PackagesManager instance. As with every other API so far, the simplest way to create one is through a PackagesManager factory, here from_session_and_environment_name:

settings = Settings(auth=...)
# This assumes an environment "test-env" has already been created
with Session.authenticated_from_settings(settings) as session:
    pkgs_manager = PackagesManager.from_session_and_environment_name(
        settings, session, "test-env"
    )
    pkgs_manager.install(["numpy"])

Command contexts

Command contexts are used in EDM to decouple computation from destructive operations, and to give the end user more information without complicating the EDM API too much.

Most high-level operations around package management such as install, remove, or update can be decomposed as follows:

  1. convert the given set of requirements into a problem the dependency solver understands
  2. solve the corresponding problem
  3. apply the solution to the configured environment

Steps 1 and 2 are non-destructive, while step 3 is destructive. Moreover, once a solution has been calculated by the solver, EDM has more information that may be used to inform the end user. That is why most operations in the PackagesManager have a corresponding _command method, which returns a command context instead of applying the operation directly.

A command context has two roles:

  • define an execute method that actually applies the operation the command was created from (such as install for install_command)
  • define some command-specific attributes with detailed information about the operation applied by execute

Example:

settings = Settings()

# This assumes an environment "test-env" has already been created
with Session.authenticated_from_settings(settings) as session:
    pkgs_manager = PackagesManager.from_session_and_environment_name(
        settings, session, "test-env"
    )
    command = pkgs_manager.install_command(["numpy"])
    print("packages to be installed:")
    for package in command.installed:
        print("{} {}".format(package.name, package.version))

Command contexts’ attributes are documented in the reference.
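The split between computing a plan and applying it can be illustrated with a small, self-contained sketch. It does not use EDM: InstallCommand, plan_install, the dictionaries standing in for an environment and a package index, and the version strings are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class InstallCommand:
    """Hypothetical command context: holds a computed plan, applies it on demand."""

    environment: dict  # maps package name -> installed version
    installed: list = field(default_factory=list)  # (name, version) pairs to add

    def execute(self):
        # The only destructive step: mutate the environment.
        for name, version in self.installed:
            self.environment[name] = version


def plan_install(environment, index, requirements):
    """Non-destructive: work out what would be installed, without installing."""
    to_install = [
        (name, index[name])
        for name in requirements
        if environment.get(name) != index[name]
    ]
    return InstallCommand(environment, to_install)


# Illustrative data only.
environment = {"six": "1.16.0"}
index = {"numpy": "1.26.4", "six": "1.16.0"}

command = plan_install(environment, index, ["numpy", "six"])
print("packages to be installed:", command.installed)
print("before execute:", environment)
command.execute()
print("after execute:", environment)
```

Because the plan is available before execute is called, a caller can inspect it, show it to the user, or discard it without touching the environment.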

Messaging

EDM uses a messaging bus to track progress during long-running operations, operations requiring network I/O, and so on. The messaging infrastructure is implemented in edm.messaging (the public API is exposed through edm.api) and implements a simple topic-based publish/subscribe (pub/sub) protocol (see e.g. Wikipedia for an overview).

Publisher instances receive events and dispatch them to registered handlers. One generally sends events through emitters, which encompass a predefined set of events.
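For readers unfamiliar with topic-based pub/sub, the idea can be sketched in a few lines of plain Python. TinyPublisher and its publish method are hypothetical and are not EDM's Publisher; the sketch only shows how handlers registered for one topic are isolated from events sent on another.

```python
class TinyPublisher:
    """Hypothetical stand-in for a topic-based publisher (not EDM's Publisher)."""

    def __init__(self):
        self._handlers = {}  # maps topic -> list of handler callables

    def subscribe(self, handler, topic):
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, event):
        # Only handlers registered for this topic receive the event.
        for handler in self._handlers.get(topic, ()):
            handler(event)


received = []
publisher = TinyPublisher()
publisher.subscribe(received.append, "downloads")

publisher.publish("downloads", "started")
publisher.publish("installs", "nobody subscribed to this topic")
publisher.publish("downloads", "finished")

print(received)
```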

Progress

As an example, a common use for messaging is sending events to notify of the progress of a long-running task which consists of a known number of steps. This is done through the progress_emitter:

import time

from edm.api import PrintDotHandler, Publisher, progress_emitter


TOPIC = "example"

publisher = Publisher()
handler = PrintDotHandler("STARTING\n", "\nDONE")

publisher.subscribe(handler, TOPIC)

with progress_emitter(TOPIC, publisher, length=10) as emitter:
    for i in emitter:
        time.sleep(0.1)

This will print:

STARTING
..........
DONE

That example uses a pre-defined handler factory PrintDotHandler to display a message before the operation, a . at each step, and another message at the end.

To write custom handlers for events sent through a progress_emitter, you need to implement the IProgressHandler interface.

Note

Publisher instances use weakref to keep track of registered handlers. Thus you need to keep a reference to the handlers while listening to events.
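The consequence of weak references can be seen with the standard library alone. The sketch below assumes a weakref.WeakSet as the registry, which is only a guess at how such tracking might be done; Handler is a hypothetical placeholder class.

```python
import gc
import weakref


class Handler:
    """Hypothetical handler; any weak-referenceable object would do."""


registered = weakref.WeakSet()

kept = Handler()
registered.add(kept)       # a strong reference lives on in `kept`
registered.add(Handler())  # no other reference: eligible for collection

gc.collect()  # make collection explicit on interpreters without refcounting
print(len(registered))
```

The handler created inline disappears from the registry as soon as it is garbage collected, so it would never see an event. Binding the handler to a name that outlives the subscription avoids this.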

Start/stop

Another common messaging pattern is to signal the beginning and the end of a task. This is useful when it is difficult to track the progress of a task, or when its duration cannot be known beforehand. The function start_end_emitter can be used to send these events to any handler implementing the IStartEndSequenceHandler interface:

import time

from edm.api import StartEndHandler, Publisher, start_end_emitter


TOPIC = "example"

publisher = Publisher()
handler = StartEndHandler("STARTING... ", "DONE")

publisher.subscribe(handler, TOPIC)

with start_end_emitter(TOPIC, publisher):
    time.sleep(1.5)

This will print:

STARTING... DONE
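The start/end pattern maps naturally onto a context manager. The following sketch uses only contextlib and is not EDM's implementation: start_end is a hypothetical helper, and writing the end message even when the block raises is a design choice made here, not documented EDM behaviour.

```python
import contextlib
import io


@contextlib.contextmanager
def start_end(stream, start_message, end_message):
    """Hypothetical helper: write one message before the block, one after."""
    stream.write(start_message)
    try:
        yield
    finally:
        # Runs even if the block raises, so the end message is always written.
        stream.write(end_message)


output = io.StringIO()
with start_end(output, "STARTING... ", "DONE"):
    pass  # the long-running task would go here

print(output.getvalue())
```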

Embedding EDM

EDM is designed both as a CLI and as a library, so that package management can be embedded in a larger application (e.g. Canopy). In those situations, it is important that the embedded EDM does not conflict with the standalone CLI tool when both are installed. This section describes the mechanism that allows standalone and embedded EDM to coexist.

Embedded vs standalone behaviour

In standalone mode, running edm from the CLI works as follows:

  • look for the file ~/.edm.yaml, unless the config path is overridden through one of the documented flags or environment variables.
  • use ~/.edm as the root directory, unless it is overridden through one of the documented flags or environment variables.

In embedded mode, the behaviour is:

  • look for the file edm.cfg in sys.exec_prefix. If it exists, EDM is assumed to be embedded, otherwise it is assumed to be in standalone mode.
  • this file is expected to define an entry point. That entry point is called at runtime and is expected to return a properly configured EDM Settings instance.
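The detection rule described in the first bullet can be sketched as follows. detect_mode is a hypothetical helper written for this illustration, not an EDM function; it only checks for the presence of the file and does not read it.

```python
import os
import sys


def detect_mode(prefix=sys.exec_prefix):
    """Return "embedded" if an edm.cfg file exists in prefix, else "standalone"."""
    config_path = os.path.join(prefix, "edm.cfg")
    return "embedded" if os.path.isfile(config_path) else "standalone"


print(detect_mode())
```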

The edm.cfg file is a simple JSON document:

{
    "module": "my_package.edm_embedding",
    "function": "edm_settings_factory"
}

This will cause EDM to call the entry point as follows (error handling omitted):

import importlib

m = importlib.import_module("my_package.edm_embedding")
settings = m.edm_settings_factory()
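A version of this lookup with basic error handling might look like the sketch below. load_entry_point and its error messages are hypothetical and are not part of EDM. To stay runnable without EDM installed, the demonstration points the configuration at os.getcwd rather than at a real settings factory.

```python
import importlib
import json
import os
import tempfile


def load_entry_point(config_path):
    """Read a JSON config with "module" and "function" keys, return the callable."""
    with open(config_path) as fp:
        config = json.load(fp)

    try:
        module = importlib.import_module(config["module"])
        factory = getattr(module, config["function"])
    except (KeyError, ImportError, AttributeError) as exc:
        raise RuntimeError(
            "invalid embedding configuration {!r}: {!r}".format(config_path, exc)
        ) from exc

    if not callable(factory):
        raise RuntimeError(
            "entry point {!r} is not callable".format(config["function"])
        )
    return factory


# Demonstration with a stand-in entry point (os.getcwd), not an EDM factory.
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "edm.cfg")
    with open(path, "w") as fp:
        json.dump({"module": "os", "function": "getcwd"}, fp)
    factory = load_entry_point(path)

print(factory is os.getcwd)
```

In a real embedding, the returned callable would be invoked with no arguments and expected to return a Settings instance, as in the two-line form above.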