rasterio/examples/introduction.ipynb
2014-09-13 21:22:25 -07:00

{
"metadata": {
"name": "",
"signature": "sha256:5a6908bb26106597e34dd231b1a5f453aaa6e8a3e4c9298d8c3baaf3c3e0c4a1"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# An introduction to Rasterio\n",
"\n",
"The smallest interesting problems [1] addressed by Rasterio are reading raster data from files as [NumPy](http://www.numpy.org/) arrays and writing such arrays back to files. In between, you can use the world of scientific Python software to analyze and process the data. Rasterio also provides a few operations of its own, which are described in the next notebooks in this series.\n",
"\n",
"This notebook demonstrates the basics of reading and writing raster data with Rasterio.\n",
"\n",
"## Overview of a dataset\n",
"\n",
"A raster dataset consists of one or more dense (as opposed to sparse) 2-D arrays of scalar values. An RGB TIFF image file is a good example of a raster dataset. It has 3 bands (or channels \u2013 we'll call them bands here), and each band has a number of rows (its `height`), a number of columns (its `width`), and a uniform data type (unsigned 8-bit integers, 64-bit floats, etc.). Geospatially referenced datasets also possess a mapping from image to world coordinates (a `transform`) in a specific coordinate reference system (`crs`). This metadata about a dataset is readily accessible using Rasterio.\n",
"\n",
"The Scientific Python community often imports numpy as `np`. Do this and also import rasterio."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"import rasterio"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Many of Rasterio's tests use a small 3-band GeoTIFF file named \"RGB.byte.tif\". Open it with the function `rasterio.open()`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"src = rasterio.open('../tests/data/RGB.byte.tif')"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 10
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This function returns a dataset object. It has many of the same properties as a Python file object."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"src.name"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 11,
"text": [
"'../tests/data/RGB.byte.tif'"
]
}
],
"prompt_number": 11
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"src.mode"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 12,
"text": [
"'r'"
]
}
],
"prompt_number": 12
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"src.closed"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 13,
"text": [
"False"
]
}
],
"prompt_number": 13
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A raster dataset has additional structure. A description of it is available as a whole from the dataset's `meta` property, or item by item from individual properties such as `crs`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"src.meta"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 14,
"text": [
"{'affine': Affine(300.0379266750948, 0.0, 101985.0,\n",
" 0.0, -300.041782729805, 2826915.0),\n",
" 'count': 3,\n",
" 'crs': {'init': u'epsg:32618'},\n",
" 'driver': u'GTiff',\n",
" 'dtype': 'uint8',\n",
" 'height': 718,\n",
" 'nodata': 0.0,\n",
" 'transform': (101985.0,\n",
" 300.0379266750948,\n",
" 0.0,\n",
" 2826915.0,\n",
" 0.0,\n",
" -300.041782729805),\n",
" 'width': 791}"
]
}
],
"prompt_number": 14
},
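{
"cell_type": "markdown",
"metadata": {},
"source": [
"The six numbers in `affine` define the mapping from pixel (column, row) coordinates to world coordinates in the dataset's `crs`. As a rough sketch of that arithmetic, the cell below applies the coefficients copied from the output above in plain Python. The helper `pixel_to_world` is a hypothetical name used only for this illustration and is not part of Rasterio."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Coefficients copied from the 'affine' entry of src.meta above.\n",
"a, b, c = 300.0379266750948, 0.0, 101985.0\n",
"d, e, f = 0.0, -300.041782729805, 2826915.0\n",
"\n",
"def pixel_to_world(col, row):\n",
"    # Hypothetical helper: x = a*col + b*row + c, y = d*col + e*row + f\n",
"    return (a * col + b * row + c, d * col + e * row + f)\n",
"\n",
"# World coordinates of the upper-left and lower-right corners\n",
"# of the 791-column by 718-row raster.\n",
"upper_left = pixel_to_world(0, 0)\n",
"lower_right = pixel_to_world(791, 718)\n",
"upper_left, lower_right"
],
"language": "python",
"metadata": {},
"outputs": []
},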
{
"cell_type": "code",
"collapsed": false,
"input": [
"src.crs"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 15,
"text": [
"{'init': u'epsg:32618'}"
]
}
],
"prompt_number": 15
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To close an opened dataset, use its `close()` method."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"src.close()\n",
"src.closed"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 16,
"text": [
"True"
]
}
],
"prompt_number": 16
},
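{
"cell_type": "markdown",
"metadata": {},
"source": [
"Like Python file objects, dataset objects can also be used as context managers, so a `with` statement closes the dataset for you when the block ends. The cell below is a minimal sketch, assuming the same test file path as above. The name `dataset` is chosen here only to avoid rebinding `src`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# The dataset is open inside the block and closed on exit.\n",
"with rasterio.open('../tests/data/RGB.byte.tif') as dataset:\n",
"    open_inside = not dataset.closed\n",
"\n",
"open_inside, dataset.closed"
],
"language": "python",
"metadata": {},
"outputs": []
},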
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can't read from or write to a closed dataset, but you can continue to access its properties."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"src.driver"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 23,
"text": [
"u'GTiff'"
]
}
],
"prompt_number": 23
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dataset layout\n",
"\n",
"Three properties of a Rasterio dataset tell you a lot about it in NumPy terms. The `shape` of a dataset is a `(height, width)` tuple and is exactly the shape of the 2-D NumPy array that would be read from any one of its bands. The testing dataset has 718 rows and 791 columns."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"src.shape"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 26,
"text": [
"(718, 791)"
]
}
],
"prompt_number": 26
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `count` of bands in the dataset is 3."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"src.count"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 27,
"text": [
"3"
]
}
],
"prompt_number": 27
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"All three of its bands contain 8-bit unsigned integers."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"src.dtypes"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 28,
"text": [
"['uint8', 'uint8', 'uint8']"
]
}
],
"prompt_number": 28
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"NumPy concepts are the model here. If you wanted to create a 3-D NumPy array into which the testing data file's bands would fit without any resampling, you would use the following Python code."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"dest = np.empty((src.count,) + src.shape, dtype='uint8')\n",
"dest"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 25,
"text": [
"array([[[0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" ..., \n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0]],\n",
"\n",
" [[0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" ..., \n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0]],\n",
"\n",
" [[0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" ..., \n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0]]], dtype=uint8)"
]
}
],
"prompt_number": 25
},
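{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch of how such an array gets filled, the cell below reopens the test file (`src` was closed earlier) and reads every band at once. It assumes a Rasterio version whose dataset objects provide a `read()` method. Called without arguments, `read()` should return a 3-D array shaped `(count, height, width)`, the same shape as `dest`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Reopen the file because src was closed above.\n",
"with rasterio.open('../tests/data/RGB.byte.tif') as dataset:\n",
"    data = dataset.read()  # all bands as one 3-D array\n",
"\n",
"# Compare against the shape and type of the preallocated array.\n",
"data.shape == dest.shape, data.dtype == dest.dtype"
],
"language": "python",
"metadata": {},
"outputs": []
},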
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## References"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[1]: Mike Bostock's words from his FOSS4G keynote, 2014-09-10"
]
}
],
"metadata": {}
}
]
}