S3 Data Connector
Required package
To use the S3 data connector, install the boto3 package:
$ edm install boto3
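Before creating a connection, it can be useful to confirm that boto3 is importable in the active environment. The following is a minimal sketch using only the standard library; the helper name has_package is illustrative and is not part of the connector's API.

```python
import importlib.util


def has_package(name: str) -> bool:
    """Return True if the named top-level package can be imported."""
    return importlib.util.find_spec(name) is not None


if not has_package("boto3"):
    # Same install command as shown above.
    print("boto3 is not installed; run: edm install boto3")
```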
S3 Connection Object
class edge.api.S3Connection(name, title, bucket, endpoint, access_key, secret_key, _verify=None)
Represents a connection to a remote S3 bucket.
Use the conn.root attribute to access the contents as a file system.
Example
# List files and folders
[1]: conn.root.list()
['myfile.jpg', 'mysubfolder']
# Open a file
[2]: img = conn.root.open('myfile.jpg')
# Open a "folder" in S3
[3]: myfolder = conn.root.open('mysubfolder')
# List files in that subfolder
[4]: myfolder.list()
['example.jpg']
# Download a file
[5]: myfolder.download('example.jpg', localpath='example_downloaded.jpg')
Properties
S3Connection.kind
S3Connection.uid
S3Connection.name
S3Connection.title
S3Connection.bucket
S3Connection.root
Methods
S3Connection.iter_prefixes(prefix=None)
S3Connection.list_prefixes(prefix=None)
S3Connection.iter_keys(prefix=None)
S3Connection.list_keys(prefix=None)
S3Connection.test(timeout=None)
S3Connection.open(key, open_with=None)
S3Connection.to_dict()
Reference
kind
Connection kind.
uid
Unique ID of the connection.
name
Connection name, for internal use.
title
Title for display.
bucket
Underlying S3 bucket.
root
Access an S3 bucket as a file system.
iter_prefixes()
Search for key prefixes in the bucket.
Key prefixes can be used to organise the keys in a bucket in a structure that resembles a folder hierarchy. Each result is a string that starts with the optional base prefix and ends at the first occurrence of the delimiter character after it; the delimiter is assumed to be "/". For example, if the bucket contains two keys named "folder1/myfile.txt" and "folder2/subfolder/myfile2.txt", iter_prefixes() finds the prefixes "folder1/" and "folder2/", while iter_prefixes(prefix='folder2/') finds "folder2/subfolder/".
Parameters:
- prefix (str, optional) — Base prefix to use for searching other prefixes. Defaults to an empty prefix.
Returns:
iterator(str)
An iterator yielding the prefixes found.
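The prefix semantics described above can be modelled locally on a plain list of keys. This is only a sketch of the documented behaviour, not the connector's implementation: the function name sketch_iter_prefixes, the delimiter argument, and the sorted output order are all assumptions made for illustration.

```python
from typing import Iterable, Iterator


def sketch_iter_prefixes(
    keys: Iterable[str], prefix: str = "", delimiter: str = "/"
) -> Iterator[str]:
    """Yield each distinct prefix found one level below `prefix`."""
    seen = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if not sep:
            continue  # no delimiter after the base prefix: a plain key
        found = prefix + head + delimiter
        if found not in seen:
            seen.add(found)
            yield found


# The two keys from the example in the text.
keys = ["folder1/myfile.txt", "folder2/subfolder/myfile2.txt"]
top_level = list(sketch_iter_prefixes(keys))
nested = list(sketch_iter_prefixes(keys, prefix="folder2/"))
```

With these inputs, top_level holds "folder1/" and "folder2/", and nested holds "folder2/subfolder/", matching the example in the description.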
list_prefixes()
Search for key prefixes in the bucket and return them as a list.
Key prefixes can be used to organise the keys in a bucket in a structure that resembles a folder hierarchy. Each result is a string that starts with the optional base prefix and ends at the first occurrence of the delimiter character after it; the delimiter is assumed to be "/". For example, if the bucket contains two keys named "folder1/myfile.txt" and "folder2/subfolder/myfile2.txt", list_prefixes() finds the prefixes "folder1/" and "folder2/", while list_prefixes(prefix='folder2/') finds "folder2/subfolder/".
Parameters:
- prefix (str, optional) — Base prefix to use for searching other prefixes. Defaults to an empty prefix.
Returns:
list(str)
A list of the prefixes found.
iter_keys()
Search for keys in the bucket.
Parameters:
- prefix (str, optional) — Limit the search to keys starting with a given prefix.
Returns:
iterator(str)
An iterator yielding the keys found in the bucket.
list_keys()
Search for keys in the bucket.
Parameters:
- prefix (str, optional) — Limit the search to keys starting with a given prefix.
Returns:
list(str)
A list of the keys found in the bucket.
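The practical difference between the iterator and list forms is when the work happens. The sketch below models both on an in-memory list of keys; the function names and the sample keys are hypothetical, and the real methods query the bucket rather than a local list.

```python
from typing import Iterable, Iterator, List


def sketch_iter_keys(keys: Iterable[str], prefix: str = "") -> Iterator[str]:
    """Lazily yield the keys that start with `prefix`."""
    return (key for key in keys if key.startswith(prefix))


def sketch_list_keys(keys: Iterable[str], prefix: str = "") -> List[str]:
    """Eagerly collect the matching keys into a list."""
    return list(sketch_iter_keys(keys, prefix))


bucket_keys = ["folder1/myfile.txt", "folder2/subfolder/myfile2.txt", "readme.txt"]

# The iterator form lets the caller stop after the first match.
first_match = next(sketch_iter_keys(bucket_keys, prefix="folder"))

# The list form returns every match at once.
all_matches = sketch_list_keys(bucket_keys, prefix="folder2/")
```

The iterator form is likely the better fit for large buckets where only the first few keys are needed, while the list form is convenient for interactive use.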
test()
Test the connection to the external filestore.
Parameters:
- timeout (int, optional) — The time in seconds to wait before reporting a connection as failed. If not specified, it defaults to a short timeout.
Returns:
ConnectionStatus
The connection status: one of VALID_CONNECTION, BAD_CREDENTIALS, or NO_CONNECTION.
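A caller will usually want to branch on the returned status. The sketch below uses a stand-in enum, because this document does not show where ConnectionStatus is imported from or what its member values are; only the three member names are taken from the text above, and the hint messages are illustrative.

```python
from enum import Enum


class ConnectionStatus(Enum):
    """Stand-in for the real status type; only the member names are documented."""

    VALID_CONNECTION = "valid"
    BAD_CREDENTIALS = "bad_credentials"
    NO_CONNECTION = "no_connection"


def describe(status: ConnectionStatus) -> str:
    """Map a status to a short hint for the user."""
    hints = {
        ConnectionStatus.VALID_CONNECTION: "Connection is usable.",
        ConnectionStatus.BAD_CREDENTIALS: "Check the access key and secret key.",
        ConnectionStatus.NO_CONNECTION: "Check the endpoint and network access.",
    }
    return hints[status]


message = describe(ConnectionStatus.BAD_CREDENTIALS)
```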
open()
Retrieve data from an S3 object as an appropriate Python object.
Parameters:
- key (str) — The S3 bucket key from which to retrieve data.
- open_with (str, optional) — Opener to use, for example "pandas" or "imageio". Using "file" will give a file-like object with access to the raw bytes. If not specified, an appropriate opener will be selected for you.
Returns:
Any
The most appropriate type found for the underlying data. If no specific type is found, a generic File is returned.
Raises:
ValueError
Raised if the requested type is not known.
NotFoundError
Raised when the given key is not found in the bucket.
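The two documented errors call for different handling: a missing key is often recoverable, whereas an unknown opener is a programming mistake. The sketch below shows one way to structure that. NotFoundError and FakeConnection are local stand-ins written for this example (the import path of the real exception is not shown in this document), and the order in which the fake checks the opener and the key is an assumption.

```python
class NotFoundError(Exception):
    """Stand-in for the connector's NotFoundError."""


class FakeConnection:
    """In-memory stand-in that mimics the documented open() errors."""

    _openers = {"file"}

    def __init__(self, objects):
        self._objects = dict(objects)

    def open(self, key, open_with=None):
        if open_with is not None and open_with not in self._openers:
            raise ValueError(f"unknown opener: {open_with!r}")
        if key not in self._objects:
            raise NotFoundError(key)
        return self._objects[key]


def open_or_default(conn, key, default=None, open_with=None):
    """Return the opened object, or `default` if the key is missing.

    ValueError is deliberately not caught, so a bad opener name surfaces.
    """
    try:
        return conn.open(key, open_with=open_with)
    except NotFoundError:
        return default


conn = FakeConnection({"myfile.txt": b"hello"})
present = open_or_default(conn, "myfile.txt", open_with="file")
missing = open_or_default(conn, "nope.txt", default=b"")
```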
to_dict()
Convert the connection to a dictionary, for serialization.
Returns:
Dict
Connection contents in dictionary form.
S3 Folder Object
class edge.api.S3Folder(_conn, _prefix)
Represents a "folder" in an S3 bucket.
You can use this object to explore the contents of an S3 bucket interactively. S3 prefixes are mapped to folders, S3 keys are mapped to files.
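The constructor is private; do not instantiate this class directly. Obtain folders from the root attribute of a connection, or by opening a folder by name from another folder.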
Example
[1]: s3conn.root.list_folders()
['folder1', 'folder2']
[2]: folder = s3conn.root.open('folder1')
[3]: folder.list_files()
['myimg.jpg']
[4]: img = folder.open('myimg.jpg')
Methods
S3Folder.list_folders()
S3Folder.list_files(show_all=False)
S3Folder.list(show_all=False)
S3Folder.open(name, open_with=None)
S3Folder.download(name, localpath=None)
Reference
list_folders()
List all folder names in this folder.
Returns:
list(str)
List of folder names.
list_files()
List all file names in this folder.
Parameters:
- show_all (bool, optional) — Also show files hidden by default. Currently this means file names which end with the string $folder$.
Returns:
list(str)
List of file names.
list()
List all file and folder names in this folder.
Parameters:
- show_all (bool, optional) — Also show files hidden by default. Currently this means file names which end with the string $folder$.
Returns:
list(str)
List of file and folder names.
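The effect of show_all can be illustrated with a small filter over a list of names. This is a sketch of the documented rule only (names ending in $folder$ are hidden by default); the helper visible_names and the sample names are hypothetical.

```python
from typing import Iterable, List

HIDDEN_SUFFIX = "$folder$"


def visible_names(names: Iterable[str], show_all: bool = False) -> List[str]:
    """Drop names ending in "$folder$" unless show_all is True."""
    return [n for n in names if show_all or not n.endswith(HIDDEN_SUFFIX)]


names = ["myimg.jpg", "folder1_$folder$", "notes.txt"]
default_view = visible_names(names)
full_view = visible_names(names, show_all=True)
```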
open()
Retrieve an S3 object as an appropriate Python object.
Parameters:
- name (str) — The name of the file or folder to open.
- open_with (str, optional) — Opener to use, for example "pandas" or "imageio". Using "file" will give a file-like object with access to the raw bytes. If not specified, an appropriate opener will be selected for you.
Returns:
Any
The most appropriate type found for the underlying data. If no specific type is found, a generic File is returned.
Raises:
ValueError
Raised if the requested type is not known.
NotFoundError
Raised when the given key is not found in the bucket.
download()
Download to a local file.
Currently, only files are supported.
Parameters:
- name (str) — Name of the remote object.
- localpath (str, optional) — Download to this path locally. Defaults to the same basename in the current working directory.
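The default for localpath can be pictured as follows. This sketch only mirrors the rule stated above (same basename, current working directory); resolve_localpath is a hypothetical helper, and the real method may normalise paths differently.

```python
import os
import posixpath
from typing import Optional


def resolve_localpath(name: str, localpath: Optional[str] = None) -> str:
    """Return the explicit path if given, else <cwd>/<basename of name>."""
    if localpath is not None:
        return localpath
    # Remote names use "/" as the separator, so take the basename with posixpath.
    return os.path.join(os.getcwd(), posixpath.basename(name))


default_target = resolve_localpath("example.jpg")
explicit_target = resolve_localpath("example.jpg", localpath="example_downloaded.jpg")
```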
S3 File Object
class edge.api.S3File(bucket_object, context=None)
File-like object providing remote access to an S3 object for S3Connector.
Construct a new File object.
Parameters:
- bucket_object (S3.Object) — The object in a bucket this file represents.
- context (SSLContext) — Optional SSL Context for S3 accesses - used in testing to disable SSL. Do not specify unless you have a strong need.
Properties
Methods
Reference
basename
Name of the File object.
size
Size of the File object.
readable()
Return whether object was opened for reading.
If False, read() will raise OSError.
seekable()
Return whether object supports random access.
If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().
writable()
Return whether object was opened for writing.
If False, write() will raise OSError.
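These capability checks make it possible to fail early with a clear message instead of hitting an OSError partway through an operation. The sketch below uses io.BytesIO as a stand-in for an S3File, on the assumption that S3File follows the standard Python file-object protocol described here; read_last_bytes is a hypothetical helper.

```python
import io


def read_last_bytes(f, n: int) -> bytes:
    """Read the final n bytes of a readable, seekable binary stream."""
    if not f.readable():
        raise OSError("stream is not readable")
    if not f.seekable():
        raise OSError("stream does not support random access")
    size = f.seek(0, io.SEEK_END)  # seek() returns the new absolute position
    f.seek(max(size - n, 0))
    return f.read()


last = read_last_bytes(io.BytesIO(b"0123456789"), 3)
```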
tell()
Return current stream position.
seek()
Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:
- 0 — start of stream (the default); offset should be zero or positive
- 1 — current stream position; offset may be negative
- 2 — end of stream; offset is usually negative
Returns:
The new absolute position.
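The three whence values can be tried out on any standard file-like object. The sketch below uses io.BytesIO as a stand-in for an S3File, assuming the same semantics as described above; the io.SEEK_SET, io.SEEK_CUR and io.SEEK_END constants are the standard library names for 0, 1 and 2.

```python
import io

stream = io.BytesIO(b"0123456789")  # stand-in for a file opened for reading

pos_start = stream.seek(3, io.SEEK_SET)     # whence=0: 3 bytes from the start
pos_current = stream.seek(-2, io.SEEK_CUR)  # whence=1: 2 bytes back from position 3
pos_end = stream.seek(-4, io.SEEK_END)      # whence=2: 4 bytes before the end
tail = stream.read()                        # read from the new position to the end
```

Each call returns the new absolute position, so the three variables above hold offsets measured from the start of the stream regardless of which whence value was used.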