GeoTIFFs to reV HDF5 Files#
Prerequisites:
Required: None
Recommended: Working with GeoTIFFs
Introduction#
In the previous tutorial, we demonstrated how to use reVX's Geotiff handler to manage GeoTIFF files.
In this tutorial, we will go over getting TIFF files into a reV-ready format using the LayeredH5 handler.
We’ll cover the following steps:
Creating a Layered HDF5 file using a GeoTIFF template
Writing layers to the Layered HDF5 file
Extracting layers from the Layered HDF5 file
All of the above from the command line
Let’s get started!
Downloading the data#
Before we dive into the code, we first have to download a few sample TIFFs from the Siting Lab to use as examples of adding data to a layered HDF5 file. In particular, we will be using data from [GDS24a], [GDS24b], [GDS24c], and [GDS24d].
If you have already downloaded the data, you can skip this step (just make sure path variables below are set correctly). We’ll start by defining the local file path destination:
AIRPORT_HELIPORT_SETBACKS = "airport_heliport_setbacks.tif"
NEXRAD_GREEN_LOS = "NEXRAD_green_los.tif"
SETBACKS_PIPELINE_REFERENCE = "setbacks_pipeline_reference.tif"
SETBACKS_STRUCTURE_115HH_170RD = "setbacks_structure_115hh_170rd.tif"
SETBACKS_STRUCTURE_REFERENCE = "setbacks_structure_reference.tif"
Let’s also define the URL for each of these files:
FILE_URLS = {
    AIRPORT_HELIPORT_SETBACKS: "https://data.openei.org/files/6120/airport_heliport_setbacks.tif",
    NEXRAD_GREEN_LOS: "https://data.openei.org/files/6121/nexrad_4km.tif",
    SETBACKS_PIPELINE_REFERENCE: "https://data.openei.org/files/6125/setbacks_pipeline_115hh_170rd_extrapolated.tif",
    SETBACKS_STRUCTURE_115HH_170RD: "https://data.openei.org/files/6132/setbacks_structure_115hh_170rd_extrapolated.tif",
    SETBACKS_STRUCTURE_REFERENCE: "https://data.openei.org/files/6132/setbacks_structure_115hh_170rd.tif"
}
Next, we can use a Siting Lab utility function to download the data. This function uses urllib (which is part of the Python standard library) under the hood.
We set crop=True to crop the data immediately after downloading it, which makes it easier to work with. If you have a machine with sufficiently large memory (32GB+), or you are downloading the file for analysis purposes, you should set crop=False.
from multiprocessing.pool import ThreadPool

def download(local_filepath):
    url = FILE_URLS[local_filepath]
    download_tiff_file(url, local_filepath, crop=True)

with ThreadPool(len(FILE_URLS)) as p:
    p.map(download, FILE_URLS)
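Once the downloads finish, a quick standard-library check (a minimal sketch, not part of the original workflow) confirms that every file landed on disk:
import os

# Confirm that each expected file was downloaded
for local_filepath in FILE_URLS:
    assert os.path.exists(local_filepath), f"Missing download: {local_filepath}"
print("All files downloaded.")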
Working with Layered HDF5 files#
In this section, we will outline some basic workflows using the LayeredH5
class.
Creating the Layered HDF5 file from TIFF#
First, we will initialize the LayeredH5
object.
If the HDF5 file does not yet exist, we create it with the .create_new()
method.
When creating a new HDF5 file, a template filepath must be specified. The template file is used to define the properties of the HDF5 file, including:
The profile information
Coordinate reference system and projection
The geographic extent and spatial resolution
All files that are subsequently added to the HDF5 file will be transformed to match these template properties before being written.
H5_PATH = "example.h5"
# Initialize layered h5 object
h5 = LayeredH5(H5_PATH, template_file=NEXRAD_GREEN_LOS)
# If file doesn't exist, create new h5
h5.create_new()
Inspecting the HDF5 file (using the layers
property), we see that the first two layers are latitude and longitude arrays. These give the coordinate location of each grid cell (pixel) defined by the template file.
# Use the layers property to see the layers in the H5 file
h5.layers
['latitude', 'longitude']
Metadata about the HDF5 file can be retrieved using the .profile
and .shape
properties:
print(f"H5 profile: {h5.profile}")
print(f"shape: {h5.shape}")
H5 profile: {'driver': 'GTiff', 'dtype': 'uint8', 'nodata': 255.0, 'width': 2000, 'height': 2000, 'count': 1, 'crs': '+init=epsg:5070', 'transform': (90.0, 0.0, 1829980.2632930684, 0.0, -90.0, 2297068.2309463923), 'blockxsize': 256, 'blockysize': 256, 'tiled': True, 'compress': 'lzma', 'interleave': 'band'}
shape: (2000, 2000)
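Since the profile is a plain dictionary, individual entries can be pulled out directly. For example, we can grab the coordinate reference system and the pixel size (the first element of the affine transform shown above):
# Pull individual entries out of the rasterio-style profile dictionary
print(f"CRS: {h5.profile['crs']}")
print(f"Pixel size: {h5.profile['transform'][0]} meters")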
Writing layers to the HDF5 file#
Once the HDF5 file is created (or if it already exists), we can write NumPy arrays and GeoTIFF files into it using the .write_layer_to_h5()
and .write_geotiff_to_h5()
methods, respectively:
# Adding numpy arrays
with Geotiff(NEXRAD_GREEN_LOS) as geo:
    h5.write_layer_to_h5(
        values=geo.values,
        layer_name="nexrad_green_los",
        profile=geo.profile,
        description="NEXRAD Line of sight"
    )
# Adding a geotiff file directly
h5.write_geotiff_to_h5(
    geotiff=AIRPORT_HELIPORT_SETBACKS,
    layer_name="airport_heliport_setbacks",
    description="Setbacks from airports and heliports",
    replace=False
)
Now we can check to see what layers are currently in the HDF5 file:
# Checking current layers in the h5
h5.layers
['airport_heliport_setbacks', 'latitude', 'longitude', 'nexrad_green_los']
We can also add multiple GeoTIFFs into the h5 using the .layers_to_h5()
method. This method accepts two types of inputs:
A list of GeoTIFF filepaths. In this case, the layer name in the HDF5 file will be the stem of the filename.
A dictionary mapping layer names to GeoTIFF filepaths.
Optionally, you can include a dictionary mapping layer names to layer descriptions (in text format) using the descriptions
argument.
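For reference, here is a minimal sketch of the dictionary form (not run in this tutorial; the layer names are purely illustrative):
# Sketch only: the dictionary form lets you choose the layer names explicitly
layer_map = {
    "pipeline_setbacks_reference": SETBACKS_PIPELINE_REFERENCE,
    "structure_setbacks_reference": SETBACKS_STRUCTURE_REFERENCE,
}
# Uncomment to actually write these layers to the HDF5 file:
# h5.layers_to_h5(layers=layer_map, replace=False)
Below, we use the list form to add the remaining three setback files: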
file_list = [
    SETBACKS_PIPELINE_REFERENCE,
    SETBACKS_STRUCTURE_115HH_170RD,
    SETBACKS_STRUCTURE_REFERENCE,
]
print(f"Adding {len(file_list)} file(s) to the h5...")
for fn in file_list:
    print(fn.split(".")[0])

h5.layers_to_h5(
    layers=file_list,
    replace=False,
    descriptions={
        "setbacks_pipeline_reference": "This dataset represents wind energy "
        "setback requirements from oil and gas pipelines. A setback "
        "requirement is a minimum distance from a pipeline that an energy "
        "project may be developed. As of April 2022, no ordinances were "
        "discovered for any counties. Such ordinances are likely to arise as "
        "regulations continue to expand. Therefore, this dataset applies a "
        "median setback equivalent to 1.1 times the turbine tip-height, "
        "sourced from trends in other infrastructure. The turbine parameters "
        "used were a hub-height of 115 meters and a rotor diameter of 170 "
        "meters, as obtained from the Annual Technology Baseline (ATB) 2022."
    }
)
Adding 3 file(s) to the h5...
setbacks_pipeline_reference
setbacks_structure_115hh_170rd
setbacks_structure_reference
We can check that our layers have indeed been added to the HDF5 file:
# Checking current layers in the h5
h5.layers
['airport_heliport_setbacks',
'latitude',
'longitude',
'nexrad_green_los',
'setbacks_pipeline_reference',
'setbacks_structure_115hh_170rd',
'setbacks_structure_reference']
We can also check that our description for setbacks_pipeline_reference
has been properly added:
import h5py

with h5py.File(H5_PATH) as h5_fh:
    print(h5_fh["setbacks_pipeline_reference"].attrs["description"])
This dataset represents wind energy setback requirements from oil and gas pipelines. A setback requirement is a minimum distance from a pipeline that an energy project may be developed. As of April 2022, no ordinances were discovered for any counties. Such ordinances are likely to arise as regulations continue to expand. Therefore, this dataset applies a median setback equivalent to 1.1 times the turbine tip-height, sourced from trends in other infrastructure. The turbine parameters used were a hub-height of 115 meters and a rotor diameter of 170 meters, as obtained from the Annual Technology Baseline (ATB) 2022.
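The raw layer data can be read back the same way. A minimal check with h5py confirms the array lines up with the 2000 x 2000 template grid (with a leading band dimension):
# Read a layer's raw array back to confirm its shape and dtype
with h5py.File(H5_PATH) as h5_fh:
    dset = h5_fh["setbacks_pipeline_reference"]
    print(f"shape: {dset.shape}, dtype: {dset.dtype}")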
Extracting Layers from the HDF5 file#
Layers in the HDF5 file can also be extracted to GeoTIFFs. Simply call .layer_to_geotiff()
to extract a single layer:
# Extracting a single layer
layer = "airport_heliport_setbacks"
out_filepath = "airport_heliport_setbacks_h5_extract.tif"
h5.layer_to_geotiff(layer=layer, geotiff=out_filepath)
Alternatively, you can call .extract_layers()
to extract multiple layers. This method requires you to pass a dictionary mapping layer names to output filepaths:
# Extracting multiple layers
layers = {
    "nexrad_green_los": "nexrad_green_los_h5_extract.tif",
    "setbacks_pipeline_reference": "setbacks_pipeline_reference_h5_extract.tif"
}
h5.extract_layers(layers)
All the layers in the HDF5 can be extracted using the .extract_all_layers()
method. To use it, simply pass an output directory where the extracted files should be written:
h5.extract_all_layers(out_dir=".")
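As a quick round-trip check (a minimal sketch using the Geotiff handler from the previous tutorial), we can compare an extracted layer against the original download. Because NEXRAD_green_los.tif served as the template, no reprojection was applied when it was written to the HDF5 file, so the values are expected to match:
import numpy as np

# Compare the original download against the layer extracted from the H5 file
with Geotiff(NEXRAD_GREEN_LOS) as original:
    original_values = original.values
with Geotiff("nexrad_green_los_h5_extract.tif") as extracted:
    extracted_values = extracted.values
print(np.array_equal(original_values, extracted_values))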
Layered HDF5 file via CLI#
Alternatively, the command line can be used to create, add to, and extract layers from the layered HDF5 file.
Adding GeoTIFFs to the Layered HDF5 file#
First, we need to construct a JSON config file that maps layer names to GeoTIFF filepaths.
This JSON configuration file can optionally contain layer descriptions as a dictionary. For example, suppose we create a layers.json
file with the following content:
{
    "layers": {
        "nexrad_green_los": "nexrad_green_los.tif",
        "setbacks_pipeline_reference": "./setbacks_pipeline_reference.tif"
    },
    "descriptions": {
        "setbacks_pipeline_reference": "This dataset represents wind energy setback requirements from oil and gas pipelines."
    }
}
Then we can run the following command:
$ reVX exclusions -h5 example_cli.h5 layers-to-h5 --layers layers.json
We can check whether the write was successful using the h5ls
command (you may have to run conda install h5py
first):
$ h5ls example_cli.h5
latitude Dataset {2000, 2000}
longitude Dataset {2000, 2000}
nexrad_green_los Dataset {1, 2000, 2000}
setbacks_pipeline_reference Dataset {1, 2000, 2000}
Extracting GeoTIFF layers from the HDF5 file#
To extract layers from the h5 file, we pass a list of layers to extract as well as the desired output directory as arguments to the command:
$ mkdir data
$ reVX exclusions -h5 example_cli.h5 layers-from-h5 -l nexrad_green_los,setbacks_pipeline_reference -o ./data
We can check whether the extraction was successful by listing the contents of the output directory:
$ ls data
nexrad_green_los.tif setbacks_pipeline_reference.tif
Conclusion#
In this tutorial, we have walked through the basic steps to create and add to a Layered HDF5 file. This type of file is used primarily as the exclusions layer input for reV supply curve aggregation. You should now be able to:
Create a layered HDF5 file
Add layers to the HDF5 file
Extract layers from the HDF5 file