2.a Aim of the AquaMaps algorithm
Species distributions represent a combination of historical and ecological factors. Modelling distributions based on environmental data only, will show suitable habitats rather than actual distribution which must also take into account physical barriers to dispersal. Species modelling is also dependent on the sampling density and evenness.
The AquaMaps modelling concept combines spatial and ecological parameters with an option for expert adjustments, and permits use of large occurrence databases (GBIF, FishBase) by filtering taxon names through a validator. The power of AquaMaps is particularly in predictions of occurrence at a large scale and using sparse distribution records.
2.b Why is raquamaps needed?
By implementing AquaMaps in R we will make AquaMaps more versatile and compatible with other R modelling approaches including opening for other algorithms than the native AquaMaps algorithm, but particularly providing more speed and user programming options for AquaMaps.
AquaMaps modelling is an important tool particularly for large scale modelling with few data points and can be used with alternative environmental layers such as IPCC projections.
2.c Main idea behind raquamaps
The raquamaps package implements the AquaMaps algorithm
used to build an environmental envelope for each species, and uses web
services and data harvesting tools implemented in R
(raquamaps, rgbif) to map species
distributions and build environmental layers. Thus a complete R package
is created, which can be built into or be used from other applications,
or run as a server application or as an independent desktop
application.
2.d Approach
Components
AquaMaps includes the following components:
- A global indexed grid of half or quarter degree cells based on the C-squares system of hierarchical numerical identifiers, providing environmental parameters (e.g., temperature, precipitation, salinity, etc.) for each cell
- Occurrence frequency for species in each cell
- Known environmental envelope for each species calculated on the basis of occurrence data (environmental values in cells of occurrence) or using expert information.
In the raquamaps package these components are provided
in the follow way:
A reference dataset is provided called
aquatic_hcafwherehcafis an acronym for “half degree cell authority file”. This dataset contains data for a large number of biological climate environment parameters, both categorical and numerical. For example, categorical data including grid cell identifiers (such as CSquareCodes and LOICZID), country, cell type and EEZ is included. More than 80 different numerical layers are also included, providing information on annual mean temperatures, precipitation levels, depth, elevation, ice concentration, primary productivity etc.For occurrence frequency, up-to-date data about known occurrences can be fetched from GBIF (using the
rgbifpackage) and/or from Fishbase (using therfishbasepackage). The package documentation shows how to calculate the frequency of species per grid cell and functions are provided that allow for batch calculations. Usage is documented in the package vignettes.The species preferences for environmental envelopes can be calculated on the basis of the occurrence data through the use of functions provided in the package. The results in terms of environmental envelopes can later be adjusted to cater for expert tuning of parameters.
Calculating probability of occurrence
For each taxon analysed, an envelope is calculated for each environmental parameter, which provides the upper and lower tolerance limits for the taxon. The probability of occurrence is then calculated for each parameter in each cell and may range from 0 to 1. The individual parameter probabilities are then multiplied to give the prediction of occurrence in the cell.
Accounting for physical barriers to dispersal
AquaMaps can show actual distribution (point data), projected distribution using the probabilities of occurrence (suitable habitat), or modelled actual distribution using a combination of probabilities and a system of bounding boxes or constraining shapes constraining the distribution to known areas of occurrence. The bounding boxes may be based on Fishery zones (FAO areas), watersheds or other geographical structures that limit taxon distribution in addition to ecological niche, or expert data.
Predictive analytics
By adjusting the environmental data to IPCC models AquaMaps can also present projected distributions in the future according to global or regional warming scenarios.
Displaying results
The tool can also be used for species richness maps, either using raw occurrence data or models showing constrained or absolute suitable habitats.
Because raquamaps has probabilities of occurrence as end
product, the graphic system and the underlying statistics tool in
raquamaps can also be used to display results obtained with
other modeling tools.
Technical information
The present release of rAquaMaps is available online at https://raquamaps.github.io, where installation instructions and documentation is published. The source code is available at https://github.com/raquamaps.
Most recent changes as of Oct 1 2015 were:
The algorithm is now designed to allow for changing the model used, for example to use an additive model instead of the multiplicative model and to change the bioclimate variables used in the calculations
Results can be displayed as 5-step chloropeth maps using various class intervals (where the default is Fisher-Jenks class intervals, but it is also possible to discretize the maps using other types of class intervals, such as quantiles, equal length class intervals etc). The usage of 5 stepped class intervals is inspired from best practice for chloropeth maps as outlines in Slocum.
Maps are provided in static (ggplot) format and dynamic (web format using leaflet JS maps). The web maps allows for zooming and looking at raster data details.
Automatic builds are integrated through Travis-CI verifies that the package can be built properly and complies with CRAN rules for R-packages