fs.remote

Utilities for interfacing with remote filesystems

This module provides reusable utility functions that can be used to construct FS subclasses interfacing with a remote filesystem. These include:

  • RemoteFileBuffer: a file-like object that locally buffers the contents of
    a remote file, writing them back on flush() or close().
  • ConnectionManagerFS: a WrapFS subclass that tracks the connection state
    of a remote FS, and allows client code to wait for a connection to be re-established.
  • CacheFS: a WrapFS subclass that caches file and directory meta-data in
    memory, to speed access to a remote FS.

class fs.remote.CacheFS(*args, **kwds)

Simple FS wrapper to cache meta-data of a remote filesystem.

This FS wrapper implements a simplistic cache that can help speed up access to a remote filesystem. File and directory meta-data is cached, but the actual file contents are not.

CacheFS constructor (inherited from CacheFSMixin).

The optional keyword argument ‘cache_timeout’ specifies the cache timeout in seconds. The default timeout is 1 second. To prevent cache entries from ever timing out, set it to None.

The optional keyword argument ‘max_cache_size’ specifies the maximum number of entries to keep in the cache. To allow the cache to grow without bound, set it to None. The default is 1000.

class fs.remote.CacheFSMixin(*args, **kwds)

Simple FS mixin to cache meta-data of a remote filesystem.

This FS mixin implements a simplistic cache that can help speed up access to a remote filesystem. File and directory meta-data is cached but the actual file contents are not.

If you want to add caching to an existing FS object, use the CacheFS class instead; it’s an easy-to-use wrapper rather than a mixin. This mixin class is provided for FS implementors who want to use caching internally in their own classes.

FYI, the implementation of CacheFS is this:

class CacheFS(CacheFSMixin, WrapFS):
    pass

CacheFSMixin constructor.

The optional keyword argument ‘cache_timeout’ specifies the cache timeout in seconds. The default timeout is 1 second. To prevent cache entries from ever timing out, set it to None.

The optional keyword argument ‘max_cache_size’ specifies the maximum number of entries to keep in the cache. To allow the cache to grow without bound, set it to None. The default is 1000.
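
To make the meaning of the two keyword arguments concrete, here is a minimal, self-contained sketch of a cache with a timeout and a size bound. It is not the CacheFSMixin implementation: the class name TimedCache, its get/set methods, the injectable clock, and the oldest-first eviction policy are all assumptions made for illustration.

```python
import time
from collections import OrderedDict


class TimedCache:
    """Illustrative meta-data cache; NOT the CacheFSMixin implementation."""

    def __init__(self, cache_timeout=1, max_cache_size=1000, clock=time.monotonic):
        self.cache_timeout = cache_timeout    # seconds, or None to never expire
        self.max_cache_size = max_cache_size  # entry count, or None for unbounded
        self._clock = clock                   # injectable clock (assumption, eases testing)
        self._entries = OrderedDict()         # path -> (stored_at, info)

    def set(self, path, info):
        # Re-inserting a path moves it to the "newest" end of the ordering.
        self._entries.pop(path, None)
        self._entries[path] = (self._clock(), info)
        if self.max_cache_size is not None:
            while len(self._entries) > self.max_cache_size:
                # Assumed policy: evict the oldest entry first.
                self._entries.popitem(last=False)

    def get(self, path):
        try:
            stored_at, info = self._entries[path]
        except KeyError:
            return None
        expired = (
            self.cache_timeout is not None
            and self._clock() - stored_at > self.cache_timeout
        )
        if expired:
            del self._entries[path]
            return None
        return info
```

With cache_timeout=None an entry is served until it is evicted; with max_cache_size=None nothing is ever evicted for size reasons.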

class fs.remote.CachedInfo(info={}, has_full_info=True, has_full_children=False)

Info objects stored in cache for CacheFS.

class fs.remote.ConnectionManagerFS(wrapped_fs, poll_interval=None, connected=True)

FS wrapper providing simple connection management of a remote FS.

The ConnectionManagerFS class is designed to wrap a remote FS object and provide some convenience methods for dealing with its remote connection state.

The boolean attribute ‘connected’ indicates whether the remote filesystem has an active connection, and is initially True. If any of the remote filesystem methods raises a RemoteConnectionError, ‘connected’ will switch to False and remain so until a successful remote method call.

Application code can use the method ‘wait_for_connection’ to block until the connection is re-established. Currently this reconnection is checked by a simple polling loop; eventually more sophisticated operating-system integration may be added.
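
The polling approach described above can be sketched as follows. This is a standalone illustration, not the library's code: the function poll_until_connected, its probe callable, and its timeout parameter are hypothetical, and RemoteConnectionError is redefined here only as a stand-in so the example runs on its own.

```python
import time


class RemoteConnectionError(Exception):
    """Stand-in for the library's exception of the same name."""


def poll_until_connected(probe, poll_interval=0.01, timeout=None):
    """Call probe() until it stops raising RemoteConnectionError.

    Returns True once probe() succeeds, or False if `timeout` seconds
    elapse first. With timeout=None the loop polls indefinitely.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            probe()
            return True
        except RemoteConnectionError:
            if deadline is not None and time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)  # back off before the next attempt
```

A probe would typically be a cheap remote call, such as checking that the root directory exists.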

Since some remote FS classes can raise RemoteConnectionError during initialization, this class makes use of lazy initialization. The remote FS can be specified as an FS instance, an FS subclass, or a (class,args) or (class,args,kwds) tuple. For example:

>>> fs = ConnectionManagerFS(MyRemoteFS("http://www.example.com/"))
Traceback (most recent call last):
    ...
RemoteConnectionError: couldn't connect to "http://www.example.com/"
>>> fs = ConnectionManagerFS((MyRemoteFS,["http://www.example.com/"]))
>>> fs.connected
False
>>>
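
One way to picture the lazy initialization shown above is the sketch below. It assumes a hypothetical LazyWrapper class with a generic call() method; the real ConnectionManagerFS exposes the wrapped filesystem's own methods and may resolve its argument differently. RemoteConnectionError is again a local stand-in.

```python
class RemoteConnectionError(Exception):
    """Stand-in for the library's exception of the same name."""


class LazyWrapper:
    """Illustrative only: defers construction of the wrapped object."""

    def __init__(self, spec):
        self._spec = spec      # instance, class, (class, args) or (class, args, kwds)
        self._fs = None
        self.connected = True
        try:
            self._resolve()
        except RemoteConnectionError:
            # Construction failed; remember that and retry on first use.
            self.connected = False

    def _resolve(self):
        if self._fs is None:
            spec = self._spec
            if isinstance(spec, tuple):
                cls, args = spec[0], spec[1]
                kwds = spec[2] if len(spec) > 2 else {}
                self._fs = cls(*args, **kwds)
            elif isinstance(spec, type):
                self._fs = spec()
            else:
                self._fs = spec  # already an instance
        return self._fs

    def call(self, name, *args, **kwds):
        # Forward a method call, tracking connection state along the way.
        try:
            result = getattr(self._resolve(), name)(*args, **kwds)
        except RemoteConnectionError:
            self.connected = False
            raise
        self.connected = True
        return result
```

Because the tuple form carries the class and its arguments rather than an instance, a failed connection at construction time is recorded in connected instead of propagating to the caller.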
class fs.remote.RemoteFileBuffer(fs, path, mode, rfile=None, write_on_flush=True)

File-like object providing buffer for local file operations.

Instances of this class manage a local tempfile buffer corresponding to the contents of a remote file. All reads and writes happen locally, with the content being copied to the remote file only on flush() or close(). Writes to the remote file are performed using the setcontents() method on the owning FS object.

The intended use-case is for a remote filesystem (e.g. S3FS) to return instances of this class from its open() method, and to provide the file-uploading logic in its setcontents() method, as in the following pseudo-code:

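def open(self, path, mode="r"):
    # Fetch a readable handle on the remote file, then buffer it locally.
    rf = self._get_remote_file(path)
    return RemoteFileBuffer(self, path, mode, rf)

def setcontents(self, path, file):
    # Called by RemoteFileBuffer on flush() or close() to upload the buffer.
    self._put_remote_file(path, file)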

The contents of the remote file are read into the buffer on-demand.

RemoteFileBuffer constructor.

The owning filesystem, path and mode must be provided. If the optional argument ‘rfile’ is provided, it must be a read()-able object or a string containing the initial file contents.
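
The buffer-locally, write-back-on-close pattern can be tried out with the self-contained sketch below. It is a simplification under stated assumptions, not RemoteFileBuffer itself: DictFS and LocalBuffer are hypothetical names, the buffer lives in memory rather than in a tempfile, the remote contents are loaded eagerly rather than on demand, and the mode argument is accepted but ignored.

```python
import io


class DictFS:
    """Hypothetical in-memory stand-in for a remote filesystem."""

    def __init__(self):
        self.remote = {}   # path -> bytes held "remotely"
        self.uploads = 0   # number of simulated uploads

    def setcontents(self, path, file):
        # The upload logic lives here, as in the pseudo-code above.
        self.remote[path] = file.read()
        self.uploads += 1

    def open(self, path, mode="r"):
        # `mode` is ignored in this sketch.
        return LocalBuffer(self, path, self.remote.get(path, b""))


class LocalBuffer:
    """Reads and writes hit a local buffer; uploads happen on flush()/close()."""

    def __init__(self, fs, path, initial=b""):
        self._fs = fs
        self._path = path
        self._buf = io.BytesIO(initial)
        self._dirty = False
        self.closed = False

    def read(self, size=-1):
        return self._buf.read(size)

    def write(self, data):
        self._dirty = True
        return self._buf.write(data)

    def seek(self, offset, whence=0):
        return self._buf.seek(offset, whence)

    def flush(self):
        # Upload only if something changed since the last upload.
        if self._dirty:
            self._fs.setcontents(self._path, io.BytesIO(self._buf.getvalue()))
            self._dirty = False

    def close(self):
        if not self.closed:
            self.flush()
            self._buf.close()
            self.closed = True
```

Writing to a LocalBuffer leaves DictFS.remote untouched until flush() or close() is called, which mirrors the behaviour described for RemoteFileBuffer.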