S3 Data Connector

Required package

To use the S3 data connector, install the Boto3 package:

$ edm install boto3

S3 Connection Object

class edge.api.S3Connection(name, title, bucket, endpoint, access_key, secret_key, _verify=None)

Represents a connection to a remote S3 bucket.

Use the conn.root attribute to access the contents as a file system.

Example

# List files and folders
[1]: conn.root.list()
['myfile.jpg', 'mysubfolder']

# Open a file
[2]: img = conn.root.open('myfile.jpg')

# Open a "folder" in S3
[3]: myfolder = conn.root.open('mysubfolder')

# List files in that subfolder
[4]: myfolder.list()
['example.jpg']

# Download a file
[5]: myfolder.download('example.jpg', localpath='example_downloaded.jpg')

Properties

Methods

Reference

kind

Connection kind.

uid

Unique ID of the connection.

name

Connection name, for internal use.

title

Title for display.

bucket

Underlying S3 bucket.

root

Access an S3 bucket as a file system.

iter_prefixes()

Search for key prefixes in the bucket.

Key prefixes can be used to organise the keys in a bucket in a structure that resembles a folder. The result contains strings between an optional prefix and the first occurrence of the delimiter character, which is assumed to be "/". For example, if the bucket contains two keys named "folder1/myfile.txt" and "folder2/subfolder/myfile2.txt", iter_prefixes() finds the prefixes "folder1/" and "folder2/", while iter_prefixes(prefix='folder2/') finds "folder2/subfolder/".

Parameters:

  • prefix (str, optional) — Base prefix to use for searching other prefixes. Defaults to an empty prefix.

Returns:

     iterator(str)

         An iterator yielding the prefixes found.
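The prefix search described above can be illustrated without a live bucket. The following is a minimal sketch in plain Python, not the connector's implementation: the helper name find_prefixes and the in-memory key list are assumptions made for illustration only.

```python
def find_prefixes(keys, prefix="", delimiter="/"):
    """Return the distinct prefixes found directly below `prefix`.

    Hypothetical helper: it mimics the documented behaviour of
    iter_prefixes() on an in-memory list of keys.
    """
    found = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if not sep:
            continue  # no delimiter after the prefix: a plain key, not a "folder"
        candidate = prefix + head + delimiter
        if candidate not in found:
            found.append(candidate)
    return found


keys = ["folder1/myfile.txt", "folder2/subfolder/myfile2.txt"]
top_level = find_prefixes(keys)
nested = find_prefixes(keys, prefix="folder2/")
```

Keys that contain no delimiter after the base prefix are skipped, because they correspond to files rather than folders.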

list_prefixes()

Search for key prefixes in the bucket.

Key prefixes can be used to organise the keys in a bucket in a structure that resembles a folder. The result contains strings between an optional prefix and the first occurrence of the delimiter character, which is assumed to be "/". For example, if the bucket contains two keys named "folder1/myfile.txt" and "folder2/subfolder/myfile2.txt", list_prefixes() finds the prefixes "folder1/" and "folder2/", while list_prefixes(prefix='folder2/') finds "folder2/subfolder/".

Parameters:

  • prefix (str, optional) — Base prefix to use for searching other prefixes. Defaults to an empty prefix.

Returns:

     list(str)

         A list of the prefixes found.

iter_keys()

Search for keys in the bucket.

Parameters:

  • prefix (str, optional) — Limit the search to keys starting with a given prefix.

Returns:

     iterator(str)

         An iterator yielding the keys found in the bucket.

list_keys()

Search for keys in the bucket.

Parameters:

  • prefix (str, optional) — Limit the search to keys starting with a given prefix.

Returns:

     list(str)

         A list of the keys found in the bucket.
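As a sketch of how the key listing might be used, the helper below counts keys by file extension. It assumes only that the connection exposes list_keys(prefix=...) as documented above. The names count_extensions and FakeConnection are hypothetical, and FakeConnection exists only so the example runs without a real bucket.

```python
from collections import Counter
import posixpath


def count_extensions(conn, prefix=""):
    """Count the keys below `prefix` by lower-cased file extension.

    `conn` is assumed to expose list_keys(prefix=...) as documented above.
    """
    return Counter(
        posixpath.splitext(key)[1].lower() for key in conn.list_keys(prefix=prefix)
    )


class FakeConnection:
    """Stand-in for S3Connection so the sketch runs offline (not part of edge.api)."""

    def __init__(self, keys):
        self._keys = list(keys)

    def list_keys(self, prefix=""):
        return [k for k in self._keys if k.startswith(prefix)]


conn = FakeConnection(["folder1/a.jpg", "folder1/b.JPG", "folder2/data.csv"])
counts = count_extensions(conn, prefix="folder1/")
```

For very large buckets, iter_keys() may be the better choice since it does not need to hold every key in memory at once.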

test()

Test the connection to the external filestore.

Parameters:

  • timeout (int, optional) — The time in seconds to wait before reporting a connection as failed. If not specified, it defaults to a short timeout.

Returns:

     ConnectionStatus

         The connection status: one of VALID_CONNECTION, BAD_CREDENTIALS, or NO_CONNECTION.
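The three statuses above could be turned into user-facing hints along the following lines. This is only a sketch. The import path and exact type of ConnectionStatus are not shown in this reference, so the enum below is a local stand-in, and describe_status and its messages are hypothetical.

```python
from enum import Enum


class ConnectionStatus(Enum):
    """Local stand-in; the real ConnectionStatus type is not shown in this reference."""

    VALID_CONNECTION = "valid"
    BAD_CREDENTIALS = "bad_credentials"
    NO_CONNECTION = "no_connection"


def describe_status(status):
    """Map a connection status (enum member or plain name) to a short hint."""
    name = getattr(status, "name", str(status))
    messages = {
        "VALID_CONNECTION": "Connection OK.",
        "BAD_CREDENTIALS": "Check access_key and secret_key.",
        "NO_CONNECTION": "Check the endpoint and network access.",
    }
    return messages.get(name, "Unknown status: " + name)


# With a real connection this might be used as (assumed call form):
#     print(describe_status(conn.test(timeout=5)))
hint = describe_status(ConnectionStatus.BAD_CREDENTIALS)
```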

open()

Retrieve data from an S3 object as an appropriate Python object.

Parameters:

  • key (str) — The S3 bucket key from which to retrieve data.
  • open_with (str, optional) — Opener to use, for example "pandas" or "imageio". Using "file" will give a file-like object with access to the raw bytes. If not specified, an appropriate opener will be selected for you.

Returns:

     Any

         The most appropriate type found for the underlying data. If no specific type is found, a generic File is returned.

Raises:

  • ValueError — Raised if requested type is not known.
  • NotFoundError — Raised when the given key is not found in the bucket.
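One way to handle an unknown opener is to fall back to the raw-bytes "file" opener when ValueError is raised. The sketch below assumes open() accepts open_with as a keyword argument. The names open_with_fallback and FakeConnection are hypothetical, and the fake class only imitates the documented ValueError so the example runs offline.

```python
import io


def open_with_fallback(conn, key, preferred):
    """Try the `preferred` opener first; fall back to raw bytes via "file"."""
    try:
        return conn.open(key, open_with=preferred)
    except ValueError:
        # Documented above: ValueError is raised if the requested type is not known.
        return conn.open(key, open_with="file")


class FakeConnection:
    """Stand-in for S3Connection so the sketch runs offline (not part of edge.api)."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def open(self, key, open_with=None):
        if open_with not in (None, "file"):
            raise ValueError("unknown opener: %r" % (open_with,))
        return io.BytesIO(self._objects[key])


conn = FakeConnection({"notes.txt": b"hello"})
handle = open_with_fallback(conn, "notes.txt", preferred="no-such-opener")
data = handle.read()
```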

to_dict()

Translate the connection into a dictionary, for serialization.

Returns:

     Dict

         Connection contents in dictionary form.

S3 Folder Object

class edge.api.S3Folder(_conn, _prefix)

Represents a "folder" in an S3 bucket.

You can use this object to explore the contents of an S3 bucket interactively. S3 prefixes are mapped to folders, S3 keys are mapped to files.

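The constructor is private. Obtain folder objects from the connection's root attribute, or by calling open() on another folder, rather than instantiating this class directly.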

Example

[1]: s3conn.root.list_folders()
['folder1', 'folder2']

[2]: folder = s3conn.root.open('folder1')

[3]: folder.list_files()
['myimg.jpg']

[4]: img = folder.open('myimg.jpg')

Methods

Reference

list_folders()

List all folder names in this folder.

Returns:

     list(str)

         List of folder names.

list_files()

List all file names in this folder.

Parameters:

  • show_all (bool) — Also show files hidden by default. Currently this means file names which end with the string $folder$.

Returns:

     list(str)

         List of file names.

list()

List all file and folder names in this folder.

Parameters:

  • show_all (bool) — Also show files hidden by default. Currently this means file names which end with the string $folder$.

Returns:

     list(str)

         List of file and folder names.
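The effect of show_all can be pictured as a simple filter over names. The sketch below is plain Python and not the connector's code: visible_names and HIDDEN_SUFFIX are hypothetical names, and the only rule applied is the one stated above (names ending in $folder$ are hidden by default).

```python
HIDDEN_SUFFIX = "$folder$"  # marker suffix mentioned in the show_all description


def visible_names(names, show_all=False):
    """Drop names ending with the marker suffix unless show_all is True."""
    if show_all:
        return list(names)
    return [name for name in names if not name.endswith(HIDDEN_SUFFIX)]


names = ["myimg.jpg", "folder1_$folder$"]
default_view = visible_names(names)
full_view = visible_names(names, show_all=True)
```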

open()

Retrieve an S3 object as an appropriate Python object.

Parameters:

  • name (str) — The name of the file or folder to open.
  • open_with (str, optional) — Opener to use, for example "pandas" or "imageio". Using "file" will give a file-like object with access to the raw bytes. If not specified, an appropriate opener will be selected for you.

Returns:

     Any

         The most appropriate type found for the underlying data. If no specific type is found, a generic File is returned.

Raises:

  • ValueError — Raised if requested type is not known.
  • NotFoundError — Raised when the given key is not found in the bucket.
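Combining list_files(), list_folders() and open() gives a recursive walk over a folder tree. The sketch below assumes those three methods behave as documented above and that folder names are returned without a trailing slash, as in the example. The names walk and FakeFolder are hypothetical, and the fake class only stands in for S3Folder so the code runs offline.

```python
def walk(folder, path=""):
    """Yield the slash-joined path of every file below `folder`."""
    for name in folder.list_files():
        yield path + name
    for name in folder.list_folders():
        # Opening a folder name is assumed to return another folder object.
        yield from walk(folder.open(name), path + name + "/")


class FakeFolder:
    """Stand-in for S3Folder backed by a nested dict (not part of edge.api)."""

    def __init__(self, tree):
        self._tree = tree

    def list_files(self):
        return [n for n, v in self._tree.items() if not isinstance(v, dict)]

    def list_folders(self):
        return [n for n, v in self._tree.items() if isinstance(v, dict)]

    def open(self, name):
        value = self._tree[name]
        return FakeFolder(value) if isinstance(value, dict) else value


root = FakeFolder(
    {"readme.txt": b"", "folder1": {"myimg.jpg": b"", "deep": {"x.csv": b""}}}
)
paths = list(walk(root))
```

Each level of the walk issues its own listing calls, so deep hierarchies may be slow against a real bucket.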

download()

Download to a local file.

Currently, only files are supported.

Parameters:

  • name (str) — Name of the remote object.
  • localpath (str, optional) — Download to this path locally. Defaults to the same basename in the current working directory.
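Since download() handles one file at a time, fetching a whole folder means looping over list_files(). The following sketch assumes the documented download(name, localpath=...) call form. The names download_all and FakeFolder are hypothetical, and the fake class writes local bytes in place of a real S3 transfer.

```python
import os
import tempfile


def download_all(folder, dest_dir):
    """Download every file directly inside `folder` into `dest_dir`."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for name in folder.list_files():
        target = os.path.join(dest_dir, name)
        folder.download(name, localpath=target)
        saved.append(target)
    return saved


class FakeFolder:
    """Stand-in for S3Folder so the sketch runs offline (not part of edge.api)."""

    def __init__(self, files):
        self._files = dict(files)

    def list_files(self):
        return list(self._files)

    def download(self, name, localpath=None):
        with open(localpath or name, "wb") as fh:
            fh.write(self._files[name])


with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "out")
    saved = download_all(FakeFolder({"example.jpg": b"\xff\xd8"}), out_dir)
    downloaded_names = sorted(os.listdir(out_dir))
```

Subfolders are not followed here, which matches the note above that only files are currently supported.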

S3 File Object

class edge.api.S3File(bucket_object, context=None)

File-like object providing remote access to an S3 object for S3Connector.

Construct a new File object.

Parameters:

  • bucket_object (S3.Object) — The object in a bucket this file represents.
  • context (SSLContext) — Optional SSL context for S3 access, used in testing to disable SSL. Do not specify unless you have a strong need.

Properties

Methods

Reference

basename

Name of the File object.

size

Size of the File object.

readable()

Return whether object was opened for reading.

If False, read() will raise OSError.

seekable()

Return whether object supports random access.

If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().

writable()

Return whether object was opened for writing.

If False, write() will raise OSError.

tell()

Return current stream position.

seek()

Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:

  • 0 — start of stream (the default); offset should be zero or positive
  • 1 — current stream position; offset may be negative
  • 2 — end of stream; offset is usually negative

Returns:

         The new absolute position.
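The whence values above follow the standard Python file-object convention, so they can be tried on any seekable binary stream. The sketch below uses io.BytesIO as a stand-in for an S3File. The helper name read_tail is hypothetical, and it assumes the file object also provides read().

```python
import io


def read_tail(fileobj, nbytes):
    """Return the last `nbytes` bytes of a seekable binary file-like object."""
    if not fileobj.seekable():
        raise OSError("object does not support random access")
    size = fileobj.seek(0, io.SEEK_END)  # whence=2: jump to the end, returns the size
    fileobj.seek(max(size - nbytes, 0), io.SEEK_SET)  # whence=0: offset from the start
    return fileobj.read(nbytes)


stream = io.BytesIO(b"0123456789")  # stand-in for an S3File
tail = read_tail(stream, 3)
position = stream.tell()
```

Seeking to the end first is a convenient way to learn the stream length, because seek() returns the new absolute position.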