From 7f2caadecdee9fa9424245877ab4b9faae76b997 Mon Sep 17 00:00:00 2001
From: "Sean C. Gillies"
Date: Fri, 18 May 2018 15:10:25 -0600
Subject: [PATCH] Add an advanced dataset topic

---
 docs/topics/datasets.rst | 49 ++++++++++++++++++++++++++++++++++++++++
 docs/topics/index.rst    |  1 +
 2 files changed, 50 insertions(+)
 create mode 100644 docs/topics/datasets.rst

diff --git a/docs/topics/datasets.rst b/docs/topics/datasets.rst
new file mode 100644
index 00000000..d3211525
--- /dev/null
+++ b/docs/topics/datasets.rst
@@ -0,0 +1,49 @@
+Advanced Datasets
+=================
+
+The analogy of Python file objects influences the design of Rasterio dataset
+objects. Datasets of a few different kinds exist and the canonical way to
+obtain one is to call ``rasterio.open`` with a path-like object or URI-like
+identifier, a mode (such as "r" or "w"), and other keyword arguments.
+
+Dataset Identifiers
+-------------------
+
+Datasets in a computer's filesystem are identified by paths, "file" URLs,
+or instances of ``pathlib.Path``. The following are equivalent.
+
+* ``'/path/to/file.tif'``
+* ``'file:///path/to/file.tif'``
+* ``pathlib.Path('/path/to/file.tif')``
+
+Datasets within a local zip file are identified using the "zip" scheme from
+`Apache Commons VFS `__.
+
+* ``'zip:///path/to/file.zip!/folder/file.tif'``
+* ``'zip+file:///path/to/file.zip!/folder/file.tif'``
+
+Note that ``!`` is the separator between the path of the archive file and the
+path within the archive file. Also note that this kind of identifier can't be expressed using
+pathlib.
+
+Similarly, variables of a netCDF dataset can be accessed using "netcdf" scheme
+identifiers.
+
+``'netcdf:/path/to/file.nc:variable'``
+
+Datasets on the web are identified by "http" or "https" URLs such as
+
+* ``'https://example.com/file.tif'``
+* ``'https://landsat-pds.s3.amazonaws.com/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'``
+
+Datasets within a zip file on the web
+are identified using a "zip+https" scheme and paths separated by ``!`` as above.
+For example:
+
+``'zip+https://example.com/file.tif&p=x&q=y!/folder/file.tif'``
+
+Datasets on AWS S3 may be identified using "s3" scheme identifiers.
+
+``'s3://landsat-pds/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'``
+
+Resources in other cloud storage systems will be similarly supported.
diff --git a/docs/topics/index.rst b/docs/topics/index.rst
index 4a6b0ff0..29856788 100644
--- a/docs/topics/index.rst
+++ b/docs/topics/index.rst
@@ -6,6 +6,7 @@ Advanced Topics
 
    color
    concurrency
+   datasets
    errors
    features
    fillnodata
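
The identifier forms documented in this patch can be illustrated with a small standard-library sketch. It only shows how the scheme and the ``!`` separator divide an identifier into an archive path and a path within the archive. ``split_identifier`` is a hypothetical helper written for this illustration; it is not Rasterio's own parser and makes no claim about how ``rasterio.open`` handles identifiers internally.

```python
from pathlib import Path, PurePath


def split_identifier(identifier):
    """Split a dataset identifier into (scheme, archive, path).

    Hypothetical helper for illustration only; not part of Rasterio.
    """
    # pathlib objects are treated like plain filesystem paths.
    if isinstance(identifier, PurePath):
        identifier = identifier.as_posix()

    scheme, sep, rest = identifier.partition("://")
    if not sep:
        # No "scheme://" prefix: a plain filesystem path.
        return "file", None, identifier
    if scheme == "file":
        # "file" URLs are equivalent to plain paths.
        return "file", None, rest
    if scheme.split("+")[0] == "zip":
        # "!" separates the archive path from the path inside the archive.
        archive, _, inner = rest.partition("!")
        return scheme, archive, inner
    # Other schemes (for example "https" or "s3") are returned unsplit.
    return scheme, None, rest


for ident in (
    "/path/to/file.tif",
    "file:///path/to/file.tif",
    Path("/path/to/file.tif"),
    "zip:///path/to/file.zip!/folder/file.tif",
):
    print(split_identifier(ident))
```

Under these assumptions the first three identifiers yield the same result, which mirrors the equivalence stated in the "Dataset Identifiers" section, and the "zip" identifier is divided at ``!``.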