Mode
The mode represents the task that you assign to the code.
The following schema describes GDL’s modes and concepts.
Data Tiling
# Creating the patches from the raw data
(geo_deep_env) $ python GDL.py mode=tiling
The data tiling
preparation phase creates chips
(or patches) that will be used for either training, validation or testing with the dataloader.
For this tiling step, GDL requires a csv as input with a list of rasters and labels to be
used in the subsequent training phase.
This csv must have been specified as a path
in the raw_data_csv
from General Parameters.
The other parameters will be found in Defaults Parameters under tiling
and
this configuration file looks like:
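The exact YAML layout of the configuration file isn't reproduced here, but the parameter names documented below can be sketched as a Python mapping, roughly as Hydra would load it. The values shown are illustrative only, except where the descriptions below state a default:

```python
# Illustrative sketch of the tiling configuration once loaded.
# Key names come from the parameters documented below; values marked
# "example" are assumptions, not authoritative defaults.
tiling = {
    "tiling_data_dir": "/path/to/patches",          # example path
    "train_val_percent": {"trn": 0.7, "val": 0.3},  # example split
    "patch_size": 512,               # documented default
    "min_annot_perc": 0,             # documented default (keep all patches)
    "continuous_values": True,       # documented default
    "save_preview_labels": False,    # example value
    "multiprocessing": False,        # documented default
    "clahe_clip_limit": 0,           # example value (integer expected)
    "write_dest_raster": False,      # documented default
    "write_mode": "raise_exists",    # documented default
}
```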
tiling_data_dir
(str)This directory defines where output patches will be written; the value is read from tiling_data_dir in General Parameters.
train_val_percent
(dict)This parameter defines the proportion of patches assigned to each dataset. For example,
{'trn':0.7, 'val':0.3}
means around 30% of patches will belong to validation.
patch_size
(int)Size of an individual patch. For example, a raster of 1024 x 1024 pixels will output 4 patches if patch_size is 512. The value for this parameter should remain relatively stable, as varying patch sizes has little impact on model performance. Tiling is mostly aimed at making it possible to fill a batch with at least 4 patch pairs from different AOIs without busting a machine’s memory during training. Defaults to 512.
min_annot_perc
(int)Minimum annotated percent: discards a patch pair (imagery & ground truth) if the non-background area (i.e. area covered with classes of interest) on a given ground truth patch is lower than this minimum. Defaults to 0 (keep all patches). This parameter is a data balancing tool for undersampling. It is easy to implement and use, but may not be the perfect solution for all data balancing problems. For more information on the pros and cons of undersampling, oversampling and other class balancing strategies, see Buda et al., 2018 and Longadge & Dongre, 2013.
continuous_values
(bool)If True, the tiling script will ensure all pixel values in the rasterized ground truth are continuous, starting at 1 (0 being background). In most cases, this parameter has no impact, as values may already be continuous. However, it becomes useful to set to True when filtering polygons from a ground truth file using an attribute field and attribute values (see the dataset section). For example, filtering values [2,4] from a given attribute field will create rasterized ground truth patches with these same discontinuous values, unless this parameter is True. If you choose to set it to False, errors may occur in metrics calculation, training and outputted values at inference. We strongly recommend keeping the default True value.
save_preview_labels
(bool)If True, a .png copy of rasterized ground truth patches will be written for quick visualization. A colormap is used to map the actual values in the .geotiff ground truth patch (usually very close to 0, thus hard to visualize with a standard image viewer). However, the conversion from .geotiff to .png discards the georeferencing information. To locate a particular patch, it is recommended to open the .geotiff version of the ground truth patch in GIS software like QGIS.
multiprocessing
(bool)If True, the tiling script uses Python’s multiprocessing capability to process each AOI in parallel. This greatly accelerates the tiling process. For testing or debugging purposes, or for small datasets, we recommend keeping the default value False.
clahe_clip_limit
(int)Our team’s empirical tests have shown that, in most satellite imagery with a right-skewed histogram (e.g. most WorldView imagery), histogram equalization with the CLAHE algorithm improves model performance and the quality of subsequent extractions. After comparing Kornia’s and scikit-image’s implementations on 3 RGB images of varying sizes, the geo-deep-learning team favored Kornia’s CLAHE.
write_dest_raster
(bool)If True, the destination raster will be written in the AOI’s root directory (see the AOI class docstrings). Defaults to False. When the requested bands don’t require a VRT to be created, no destination raster is written even if True, since the destination raster would be identical to the source raster. If a VRT is required but this parameter is False, no destination raster is written to disk. This feature is currently implemented mostly for debugging and demoing purposes.
write_mode
(str)Defines behavior in case patches already exist in the destination folder for a particular dataset. Available modes are “raise_exists” (by default, tiling will raise an error if patches already exist) and “append” (tiling will skip AOIs for which all patches already exist). This feature applies to the first step of tiling only; it does not apply to the second step (filtering, sorting among trn/val and burning vector ground truth patches).
Note
Kornia expects “clip_limit” as a float with a default value of 40. Because scikit-image’s implementation expects this parameter to be between 0 and 1, geo-deep-learning forces the user to input an integer as “clip_limit”. This is meant to reduce potential confusion with scikit-image’s expected value.
Training
# Training the neural network
(geo_deep_env) $ python GDL.py mode=train
Training, along with the validation and testing phases, is where the neural network learns from the data prepared in
the tiling mode. The crux of the learning process is the training phase.
During training, the data are separated into three datasets for training, validation and testing. The samples labeled “trn”
as per above are used to train the neural network. The samples labeled “val” are used to estimate the training
error (i.e. loss) on a set of sub-images not used for training. After every epoch and at the end of all epochs,
the model with the lowest error on validation data is loaded and used on the samples labeled “tst”, if they exist.
The result on those “tst” images is used to estimate the accuracy of the model, since those images were
seen during neither training nor validation.
For all these steps, the parameters can be found in Defaults Parameters under training,
and this configuration file looks like:
num_gpus
(int)Number of GPUs used for training. The value does not matter if PyTorch is installed CPU-only.
batch_size
(int)Number of training tiles in one forward/backward pass.
eval_batch_size
(int)Number of validation tiles in one forward/backward pass.
batch_metrics
(int)Compute metrics every n batches. If set to 1, will calculate metrics for every batch during validation. Calculating metrics is time-consuming, therefore it is not always required to calculate it on every batch, for every epoch.
lr
(float)Learning rate at first epoch.
max_epochs
(int)Maximum number of epoch for one training session.
min_epochs
(int)Minimum number of epoch for one training session.
num_workers
(int, optional)Number of workers assigned to the dataloader. If not provided, it will be deduced from the number of GPUs (num_workers = 4 * num_GPU).
mode
(str)‘min’ or ‘max’; whether the chosen loss metric should be minimized or maximized.
max_used_ram
(int, optional)Used to determine whether the process can use a given GPU. If a GPU is already used by another process, the training can still be pushed to this GPU as long as max_used_ram is not exceeded.
max_used_perc
(int, optional)Value between 0 and 100. Used to determine whether the process can use a given GPU. If a GPU is already used by another process, the training can still be pushed to this GPU as long as max_used_perc is not exceeded.
state_dict_path
(str, optional)Path to a pretrained model (.pth.tar).
state_dict_strict_load
(bool, optional)Defines whether to strictly enforce that the keys in the checkpoint’s state_dict match the keys returned by PyTorch’s state_dict() function. Defaults to True.
compute_sampler_weights
(bool, optional)If True, estimates sample weights by class for unbalanced datasets, using scikit-learn.
Inference
# Inference on the data
(geo_deep_env) $ python GDL.py mode=inference
The inference phase is the last one; it uses a trained model to predict on new input data without
ground truth. This final step assigns to every pixel in the original image the value of its most probable class,
with a certain level of confidence. Like the other two modes, the parameters
will be found in Defaults Parameters under inference,
and this configuration file looks like
(for binary inference):
root_dir
(str)Directory where outputs and downloads will be written by default, if checkpoint_dir or output_path are omitted.
raw_data_csv
(str)Points to a csv containing paths to imagery for inference. If a ground truth is present in the 2nd column, it will be ignored.
input_stac_item
(str)A path or URL pointing directly to a STAC item. See the STAC item example for the SpaceNet test data, also contained in the test data.
state_dict_path
(str)Path to checkpoint containing trained weights for a given neural network architecture.
output_path
(str, optional)Complete path, including parent directories and full name with extension, where the output inference should be saved. Defaults to root_dir/{aoi.aoi_id}_pred.tif (see AOI documentation). The output_path parameter should only be used if a single inference is being performed; otherwise, it is recommended to set root_dir and use the default output name.
checkpoint_dir
(str)Directory in which to save the checkpoint file if state_dict_path points to a URL.
chunk_size
(int)Size of chunks (in pixels) to read and use for inference iterations over the input imagery. The input patch is square; therefore, if set to 512, it will generate 512 x 512 patches.
max_pix_per_mb_gpu
(int)If chunk_size is omitted, this defines a maximum number of pixels per MB of GPU RAM that should be considered. For example, if the GPU has 1000 MB of RAM and this parameter is set to 10, chunk_size will be set to sqrt(1000 * 10) = 100. Defaults to 25. This feature is based on a rule of thumb and assumes some prior empirical testing; it is a work in progress.
prep_data_only
(bool)If True, the inference script will exit after preparation of the input data. If the checkpoint path is a URL, the checkpoint will be downloaded; if the imagery points to URLs, it will be downloaded; and if the input model expects imagery with histogram equalization, this enhancement is applied and the equalized images are saved to disk.
gpu
(int)Number of GPUs to use at inference.
max_used_perc
(int)If a GPU’s usage exceeds this percentage, it will be ignored. For example, if a process is already running on GPU:0 before your script starts and that process exceeds this threshold, GPU:0 will be ignored and the script will try to push your new process onto another GPU.
max_used_ram
(int)If the RAM usage of a detected GPU exceeds this percentage, that GPU will be ignored.
ras2vec
(bool)If True, a polygonized version of the inference (.gpkg) will be created with rasterio tools.
Note
The current implementation doesn’t support more than 1 GPU at inference.