This document describes how to use the aquamapsdata R
data package to access curated data through a static database assembled
from data sourced from https://aquamaps.org.
Immediately after installing the package, a run-once action is
needed to download and locally create the SQLite database
containing all the AquaMaps data. A small, minified (< 1 MB)
variant of the database is included in the package to simplify package
development; it can be activated with
default_db("extdata").
Approximately 10 GB of local disk space is needed when downloading the database file. The download is around 2 GB compressed, so a fast Internet connection is recommended for this initial step.
# install aquamapsdata from GitHub using devtools
install.packages("devtools")
library("devtools")
install_github("raquamaps/aquamapsdata", dependencies = TRUE)
# initial run-once step required to install remote db locally
library(aquamapsdata)
download_db(force = TRUE)
default_db("sqlite")

Once the database is available locally, it can be queried using a couple of different functions.
Please remember to begin your session by activating the connection to the downloaded database:
library(aquamapsdata)
default_db("sqlite")

This package provides data that can be queried with tidyverse tools
such as dplyr.
It also requires some spatial tools (sp and
raster) to be installed.
library(aquamapsdata)
library(dplyr)
# This vignette is built using a minified offline db bundled into the package,
# so vignettes can be built in the cloud without requiring a
# full install and download of the database (time saver)
invisible(default_db("extdata"))
# NB: normally download_db() would be used first, followed by
# default_db("sqlite")

Taxonomy can be searched and queried using fuzzy and exact name searches, returning keys with the internal identifiers used in the database.
These keys act as species lists that can be used to retrieve other information, such as environmental envelopes.
# fuzzy search allows full text search operators AND, OR, NOT and +
# see https://www.sqlitetutorial.net/sqlite-full-text-search/
am_search_fuzzy(search_term = "trevally") |> pull(key)
## [1] "Fis-29757"
# exact search without parameters returns all results
nrow(am_search_exact())
## [1] 1
# exact search giving NULL params shows examples of existing values
# here we see what combinations are present in the dataset for
# angling, diving, dangerous, highseas, deepwater organisms
am_search_exact(
  angling = NULL, diving = NULL, dangerous = NULL,
  deepwater = NULL, highseas = NULL, m_invertebrates = NULL)
## # A tibble: 1 × 6
## deepwater angling diving dangerous m_invertebrates highseas
## <int> <int> <int> <int> <int> <int>
## 1 0 1 1 0 0 0
# exact search without NULL params, specifying values
hits <-
  am_search_exact(angling = 1, diving = 1, dangerous = 0)
# display results
display <-
  hits |> mutate(binomen = paste(Genus, Species)) |>
  select(SpeciesID, binomen, SpecCode, FBname)
knitr::kable(display)

| SpeciesID | binomen | SpecCode | FBname |
|---|---|---|---|
| Fis-29757 | Caranx bucculentus | 1896 | Bluespotted trevally |
With a species identifier, the probability of occurrence within the known native distribution of a species can either be retrieved in raster format or displayed on a map.
Here we display the computer-generated native map for the Bluespotted trevally:
library(leaflet)
library(raster)
library(aquamapsdata)
# get the identifier for the species
key <- am_search_fuzzy("bluespotted")$key
ras <- am_raster(key)
# show the native habitat map
am_map_leaflet(ras, title = "Bluespotted trevally") |>
  leaflet::fitBounds(lng1 = 100, lat1 = -46, lng2 = 172, lat2 = -2)

Source: Kaschner, K., K. Kesner-Reyes, C. Garilao, J. Segschneider, J. Rius-Barile, T. Rees, and R. Froese. 2019. AquaMaps: Predicted range maps for aquatic species. World wide web electronic publication, www.aquamaps.org, Version 10/2019.
Content from AquaMaps as provided in this R package is licensed under
a Creative Commons Attribution-NonCommercial 3.0 Unported License
(http://creativecommons.org/licenses/by-nc/3.0/).
Attribution with citation information and copyright disclaimer should
be included. There is a function called am_citation() which
provides this information in “text” or “md” format (suitable for use
inside Rmarkdown documents).
# include a citation in text format
am_citation()
## [1] "Kaschner, K., K. Kesner-Reyes, C. Garilao, J. Segschneider, J. Rius-Barile, T. Rees, and R. Froese. 2019. AquaMaps: Predicted range maps for aquatic species. World wide web electronic publication, www.aquamaps.org, Version 10/2019.\nContent from AquaMaps as provided in this R package is licensed under a Creative Commons Attribution-NonCommercial 3.0 UnportedLicense, please see http://creativecommons.org/licenses/by-nc/3.0/"
We can also display a map for several identifiers, for example those associated with the genus “Caranx”. In that case an aggregation function such as “count” should be provided.
keys <- am_search_exact(Genus = "Caranx")$SpeciesID
ras <- am_raster(keys, fun = "count")
am_map_leaflet(ras, title = "Caranx") |>
leaflet::fitBounds(lng1 = 100, lat1 = -46, lng2 = 172, lat2 = -2)
am_citation("md")

For biodiversity maps within a specific bounding box, see the
function am_csc_from_extent, which provides biodiversity
map data for all species, optionally filtered by a user-defined
probability threshold (0.5 is the default used at aquamaps.org;
see the help for the functions am_species_in_csc and
am_species_per_csc).
The examples below show how the data in the database can be queried using the functions provided in the package.
Likely occurrences for a species are provided for locations in a half-degree
cell grid. Individual cells have associated characteristics, stored in the
Half-degree Cell Authority File (HCAF),
with data available through the am_hcaf() function.
Information about this table is available in the help, with fields
explained in the am_meta dataset.
A subset of HCAF records can be retrieved based on a criterion or field in that table.
## [1] "ID" "CsquareCode" "LOICZID" "NLimit"
## [5] "Slimit" "WLimit" "ELimit" "CenterLat"
## [9] "CenterLong" "CellArea" "OceanArea" "PWater"
## [13] "ClimZoneCode" "FAOAreaM" "FAOAreaIn" "CountryMain"
## [17] "CountrySecond" "CountryThird" "CountrySubMain" "CountrySubSecond"
## [21] "CountrySubThird" "EEZ" "LME" "LMEBorder"
## [25] "MEOW" "OceanBasin" "IslandsNo" "Area0_20"
## [29] "Area20_40" "Area40_60" "Area60_80" "Area80_100"
## [33] "AreaBelow100" "ElevationMin" "ElevationMax" "ElevationMean"
## [37] "ElevationSD" "DepthMin" "DepthMax" "DepthMean"
## [41] "DepthSD" "SSTAnMean" "SBTAnMean" "SalinityMean"
## [45] "SalinityBMean" "PrimProdMean" "IceConAnn" "OxyMean"
## [49] "OxyBMean" "LandDist" "Shelf" "Slope"
## [53] "Abyssal" "TidalRange" "Coral" "Estuary"
## [57] "Seamount" "MPA"
# compute mean depth across all cells
am_hcaf() |>
  summarize(depth = mean(DepthMean, na.rm = TRUE)) |>
  collect() |>
  pull(depth)
## [1] 438.2737
# cells with a mean depth greater than 4000
deepwater <-
  am_hcaf() |> filter(DepthMean > 4000) |> pull(CsquareCode)
# some of the on-average deepest locations
deepwater
## [1] "3017:488:1" "3110:205:4"
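
The identifiers printed above encode the position of each half-degree cell. As a reading aid, the sketch below decodes such a code into the cell's center coordinates. It assumes the standard c-squares digit layout (global quadrant, tens of degrees, then an intermediate quadrant with unit degrees, then a half-degree quadrant); csquare_center() is a hypothetical helper written for this document, not a function of the aquamapsdata package, and it should be checked against the c-squares specification before being relied on.

```r
# Hypothetical helper (not part of aquamapsdata): decode a half-degree
# c-square code such as "3414:227:3" into the cell's center coordinates.
# Assumes the standard c-squares layout; only 0.5 degree codes are handled.
csquare_center <- function(code) {
  parts <- strsplit(code, ":", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 3, nchar(parts) == c(4, 3, 1))
  digit <- function(s, i, j = i) as.integer(substr(s, i, j))

  # first digit: global quadrant (1 = NE, 3 = SE, 5 = SW, 7 = NW)
  quadrant <- digit(parts[1], 1)
  lat_sign <- if (quadrant %in% c(1, 7)) 1 else -1
  lon_sign <- if (quadrant %in% c(1, 3)) 1 else -1

  # whole degrees (absolute values): tens from part 1, units from part 2
  lat <- 10 * digit(parts[1], 2) + digit(parts[2], 2)
  lon <- 10 * digit(parts[1], 3, 4) + digit(parts[2], 3)

  # last digit: which half-degree quadrant of the 1 degree square
  # (3 and 4 = higher absolute latitude, 2 and 4 = higher absolute longitude)
  half <- digit(parts[3], 1)
  lat_half <- if (half %in% c(3, 4)) 0.5 else 0
  lon_half <- if (half %in% c(2, 4)) 0.5 else 0

  c(CenterLat  = lat_sign * (lat + lat_half + 0.25),
    CenterLong = lon_sign * (lon + lon_half + 0.25))
}

# the example code quoted in the am_meta description of CsquareCode
csquare_center("3414:227:3")
```

The decoded values can be compared with the CenterLat and CenterLong fields returned by am_hcaf() for the same CsquareCode.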
The cell location identifier CsquareCode can be used to look up which species are likely to occur there.
# species likely to occur in deepwater location(s)
deepwater_species <- am_species_in_csc(deepwater, min_prob = 0.5)
deepwater_species
## # A tibble: 1 × 2
## # Groups: SpeciesID [1]
## SpeciesID n
## <chr> <int>
## 1 Fis-29757 1
key <- deepwater_species$SpeciesID
am_search_exact(SpeciesID = key)
## # A tibble: 1 × 25
## SpeciesID SpecCode Genus Species FBname OccurRecs OccurCells StockDefs Kingdom
## <chr> <int> <chr> <chr> <chr> <int> <int> <chr> <chr>
## 1 Fis-29757 1896 Cara… buccul… Blues… 314 295 Southwes… Animal…
## # ℹ 16 more variables: Phylum <chr>, Class <chr>, Order <chr>, Family <chr>,
## # deepwater <int>, angling <int>, diving <int>, dangerous <int>,
## # m_invertebrates <int>, highseas <int>, invasive <int>, resilience <chr>,
## # iucn_id <int>, iucn_code <chr>, iucn_version <chr>, provider <chr>
The am_hspen() function provides the input data used to generate a
species’ environmental envelopes, along with the envelopes themselves.
Information about this table is available in the help with fields
explained in the am_meta dataset.
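
To make the four envelope values stored per parameter (Min, PrefMin, PrefMax, Max) more concrete, the sketch below turns them into a suitability score between 0 and 1: 0 outside the Min to Max range, 1 within the preferred range, and linear in between. It assumes the trapezoidal response curve described in the AquaMaps documentation; envelope_prob() is a hypothetical helper written for this document, not a function of the aquamapsdata package. The envelope values used are the temperature fields of Fis-29757 as printed in the glimpse() output in this section, and the test temperatures are made up.

```r
# Hypothetical helper (not part of aquamapsdata): trapezoidal response for
# one environmental parameter. Assumes lo < pref_lo <= pref_hi < hi;
# degenerate envelopes (e.g. lo == pref_lo) would need special handling.
envelope_prob <- function(x, lo, pref_lo, pref_hi, hi) {
  rising  <- (x - lo) / (pref_lo - lo)   # ramps from 0 at lo to 1 at pref_lo
  falling <- (hi - x) / (hi - pref_hi)   # ramps from 1 at pref_hi to 0 at hi
  pmax(0, pmin(1, rising, falling))      # clamp the lower ramp to [0, 1]
}

# illustrative temperatures in degrees Celsius (made-up inputs)
temps <- c(20, 24.905, 27, 30.55, 35)

# TempMin, TempPrefMin, TempPrefMax, TempMax for Fis-29757
envelope_prob(temps, lo = 23.97, pref_lo = 25.84, pref_hi = 28.45, hi = 32.65)
```

Values halfway up either ramp score 0.5, values inside the preferred range score 1, and values outside the tolerated range score 0.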
HSPEN data can be queried, for example, by taxonomy for a single species or for higher taxa associated with that species, or by any other relevant species identifiers.
# use one or more keys for species
key <- am_species_in_csc(deepwater, min_prob = 0.5)$SpeciesID
am_hspen() |> filter(SpeciesID %in% key) |> head(1) |> collect() |> glimpse()
## Rows: 1
## Columns: 56
## $ SpeciesID <chr> "Fis-29757"
## $ Speccode <int> 1896
## $ LifeStage <chr> "adults"
## $ FAOAreas <chr> "57, 61, 71"
## $ FAOComplete <int> NA
## $ NMostLat <dbl> -7
## $ SMostLat <dbl> -23
## $ WMostLong <dbl> NA
## $ EMostLong <dbl> NA
## $ DepthYN <int> 1
## $ DepthMin <int> 7
## $ DepthPrefMin <int> 12
## $ DepthPrefMax <int> 36
## $ DepthMax <int> 63
## $ MeanDepth <int> 0
## $ Pelagic <int> 0
## $ TempYN <int> 1
## $ TempMin <dbl> 23.97
## $ TempPrefMin <dbl> 25.84
## $ TempPrefMax <dbl> 28.45
## $ TempMax <dbl> 32.65
## $ SalinityYN <int> 1
## $ SalinityMin <dbl> 29.93
## $ SalinityPrefMin <dbl> 33.82
## $ SalinityPrefMax <dbl> 35.1
## $ SalinityMax <dbl> 35.61
## $ PrimProdYN <int> 1
## $ PrimProdMin <dbl> 1.48
## $ PrimProdPrefMin <dbl> 3.67
## $ PrimProdPrefMax <dbl> 12.88
## $ PrimProdMax <dbl> 21.71
## $ IceConYN <int> 1
## $ IceConMin <dbl> -1
## $ IceConPrefMin <dbl> 0
## $ IceConPrefMax <dbl> 0
## $ IceConMax <dbl> 0
## $ OxyYN <int> 0
## $ OxyMin <dbl> 100.38
## $ OxyPrefMin <dbl> 196.51
## $ OxyPrefMax <dbl> 209.1
## $ OxyMax <dbl> 213.7
## $ LandDistYN <int> 0
## $ LandDistMin <dbl> 3
## $ LandDistPrefMin <dbl> 16
## $ LandDistPrefMax <dbl> 210
## $ LandDistMax <dbl> 421
## $ Remark <chr> NA
## $ DateCreated <chr> "2019-06-24 00:00:00"
## $ DateModified <chr> NA
## $ expert_id <int> NA
## $ DateExpert <chr> NA
## $ Layer <chr> "s"
## $ Rank <int> 1
## $ MapOpt <int> 1
## $ ExtnRuleYN <int> 1
## $ Reviewed <int> NA
# for higher taxa - find the keys associated with higher taxa
# am_search_exact(Family = am_search_exact(SpeciesID = key)$Family)

At the beginning of the vignette, the taxonomy name search functions were illustrated with some examples. These allow the user to get a list of available mapped species based on a certain criterion (e.g. taxonomic group), so that biodiversity (or species richness) for that group can be mapped.
We can also use a function to list species occurring in an area, that is, a location identified by one or more CsquareCode identifiers. Another function allows querying for species diversity or richness across a set of cells. Both functions allow a probability threshold to be specified.
A location can be determined based on a user-defined bounding box or extent (given by four coordinates). Using am_hcaf(), a set of cells can also be determined based on other criteria, for example to retrieve the CsquareCode identifiers of cells that belong to a specific LME.
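
To illustrate how a bounding box relates to the half-degree grid, the sketch below enumerates the cell centers that fall inside an extent using base R only, so it runs without the database. It assumes the box edges are aligned to whole or half degrees and uses the same four coordinates as the am_csc_from_extent() call in this section, read as west, east, south, north (an assumption based on that call). The count is for the full grid; the package function may return fewer cells, for example when cells are missing from the database in use.

```r
# Bounding box (assumed order: west, east, south, north)
west <- 100; east <- 120; south <- -22; north <- -7
res <- 0.5  # half-degree grid resolution

# centers of all half-degree cells inside the box
centers <- expand.grid(
  CenterLong = seq(west + res / 2, east - res / 2, by = res),
  CenterLat  = seq(south + res / 2, north - res / 2, by = res)
)

nrow(centers)   # 40 columns x 30 rows of cells
head(centers, 3)
```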
# get cell identifiers for a bounding box or extent
csc <- am_csc_from_extent(100, 120, -22, -7)$CsquareCode
# within this area, the following species appear, each in n cells
am_species_in_csc(csc)
## # A tibble: 1 × 2
## # Groups: SpeciesID [1]
## SpeciesID n
## <chr> <int>
## 1 Fis-29757 86
# in each cell location, the following number of distinct species are likely to occur,
# a measure of species diversity or "richness"
am_species_per_csc(csc, min_prob = 0.8)
## # A tibble: 53 × 2
## CsquareCode n_species
## <chr> <int>
## 1 3010:475:2 1
## 2 3011:486:4 1
## 3 3011:487:3 1
## 4 3011:487:4 1
## 5 3011:488:3 1
## 6 3011:488:4 1
## 7 3011:489:3 1
## 8 3011:489:4 1
## 9 3011:496:2 1
## 10 3011:497:1 1
## # ℹ 43 more rows
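
The two summaries used above can be pictured as two groupings of the same species-by-cell table after applying the probability threshold. The sketch below reproduces that idea on a tiny made-up table in base R. The species identifiers, the probabilities, the column name Probability and the use of >= for the threshold are all assumptions for illustration, not details taken from the package.

```r
# Made-up stand-in for a species-by-cell probability table
native <- data.frame(
  SpeciesID   = c("Sp-A", "Sp-A", "Sp-B", "Sp-B", "Sp-C"),
  CsquareCode = c("3010:475:2", "3011:486:4", "3010:475:2",
                  "3011:486:4", "3010:475:2"),
  Probability = c(0.90, 0.40, 0.85, 0.95, 0.60),
  stringsAsFactors = FALSE
)

min_prob <- 0.8
likely <- native[native$Probability >= min_prob, ]

# number of cells in which each species is likely to occur
cells_per_species <- vapply(
  split(likely$CsquareCode, likely$SpeciesID),
  function(cells) length(unique(cells)), integer(1)
)

# number of distinct likely species per cell ("richness")
species_per_cell <- vapply(
  split(likely$SpeciesID, likely$CsquareCode),
  function(species) length(unique(species)), integer(1)
)

cells_per_species
species_per_cell
```

Raising min_prob shrinks both summaries, which is why the threshold should be chosen deliberately and reported alongside any richness figures.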
Please note that the database file differs from the complete version available online at https://aquamaps.org in the following respects:
The database is incomplete in terms of species mapped (26,399 / 33,518). It is based on AquaMaps’ conservative rule of generating envelopes and predictions only for species with >= 10 ‘good cells’, and it excludes records of data-poor species (i.e. endemic and/or rare species). Please contact the AquaMaps team directly if you want access to the complete dataset.
The map data provided (hcaf_species_native table) give computer-generated predictions. Please contact the AquaMaps team directly if you need to access the latest reviewed/improved species maps.
Map data showing future species distributions for 2050 and 2100 (under different RCP scenarios) are excluded. Please contact the AquaMaps team directly if you are interested in these datasets.
We strongly encourage partnering with the AquaMaps team for larger research projects or publications that would make intensive use of AquaMaps to ensure that you have access to the latest version and/or reviewed maps, that the limitations of the data set are clearly understood and addressed, and that critical maps and/or unlikely results are recognized as such and double-checked for correctness prior to drawing conclusions and/or subsequent publication.
The AquaMaps team can be contacted through Rainer Froese (rfroese@geomar.de) or Kristin Kaschner (Kristin.Kaschner@biologie.uni-freiburg.de).
The dplyr package can be used to query the various
tables available in the database.
Here is a description of tables and fields which are included.
knitr::kable(am_meta)

| table | field | description | type_mysql | type |
|---|---|---|---|---|
| hcaf_r | ID | Unique HCAF ID, for internal use only. | int | int |
| hcaf_r | CsquareCode | A unique identifier for every half-degree cell in the global map based on the c-square method - a hierarchical cell labelling system developed at CSIRO Oceans and Atmosphere (then CSIRO Marine Research). Example: 3414:227:3 | varchar | chr |
| hcaf_r | LOICZID | LOICZ ID numbers are long integers from 1 to 259200. They begin with the cell centered at 89.75 degrees N latitude and 179.75 degrees W latitude and proceed from West to East. When a full circle of 720 cells is completed, the numbering steps one cell south along the -180 meridian and continues sequentially west to east. | int | int |
| hcaf_r | NLimit | Northern boundary of cell in decimal degrees latitude (positive in N hemisphere, negative in S hemisphere). Points falling on this line are considered inside the cell in the S hemisphere (exception: cells adjoining the equator, i.e., where N_limit = 0). Also (polar case), points on this line are “inside” in the N hemisphere when N_limit = 90. | double | dbl |
| hcaf_r | Slimit | Southern boundary of cell in decimal degrees latitude (positive in N hemisphere, negative in S hemisphere). Points falling on this line are considered inside the cell in the N hemisphere. Also (polar case), points on this line are “inside” in the S hemisphere when S_limit = -90. | double | dbl |
| hcaf_r | WLimit | Western boundary of cell in decimal degrees longitude (positive in E hemisphere, negative in W hemisphere). Points falling on this line are considered inside the cell in the E hemisphere. Also (boundary case), points on this line are “inside” in the W hemisphere when W_limit = -180. | double | dbl |
| hcaf_r | ELimit | Eastern boundary of cell in decimal degrees longitude (positive in E hemisphere, negative in W hemisphere). Points falling on this line are considered inside the cell in the W hemisphere (exception: cells adjoining the Greenwich Meridian, i.e., where E_limit = 0). Also (boundary case), points on this line are “inside” in the E hemisphere when E_limit = 180. | double | dbl |
| hcaf_r | CenterLat | The center point of the cell in decimal degrees latitude. | double | dbl |
| hcaf_r | CenterLong | The center point of the cell in decimal degrees longitude. | double | dbl |
| hcaf_r | CellArea | The total area inside the cell in square kilometers, using WGS84 and Miller cylindrical projection (KGS description). | double | dbl |
| hcaf_r | OceanArea | The area in the cell that is normally covered by sea water or permanent ice, in square kilometers (KGS description). | double | dbl |
| hcaf_r | PWater | Proportion of water in each cell. | double | dbl |
| hcaf_r | ClimZoneCode | Climate zone to which the cell belongs based on climate zone shape file in SAU database. | varchar | chr |
| hcaf_r | FAOAreaM | Code number of FAO statistical area to which the cell belongs, for all oceanic and coastal cells. | int | int |
| hcaf_r | FAOAreaIn | Code number of FAO statistical area to which the cell belongs, for all inland and coastal cells. | int | int |
| hcaf_r | CountryMain | UN code number of country, island or area to which the largest land area of the cell belongs, for all inland and coastal cells. | varchar | chr |
| hcaf_r | CountrySecond | UN code number of country, island or area to which the second largest land area of the cell belongs, for all inland and coastal cells. | varchar | chr |
| hcaf_r | CountryThird | UN code number of country, island or area to which the third largest land area of the cell belongs, for all inland and coastal cells. | varchar | chr |
| hcaf_r | CountrySubMain | ISO code number of state, province, region to which the largest land area of the cell belongs. | varchar | chr |
| hcaf_r | CountrySubSecond | ISO code number of state, province, region to which the second largest land area of the cell belongs. | varchar | chr |
| hcaf_r | CountrySubThird | ISO code number of state, province, region to which the third largest land area of the cell belongs. | varchar | chr |
| hcaf_r | EEZ | Code number of country, island or area to which the EEZ area in the cell belongs, for all coastal and oceanic cells. | int | int |
| hcaf_r | LME | Code number of the large marine ecosystem to which the cell belongs, as given by NOAA (http://www.lme.noaa.gov), for all coastal and oceanic cells. | int | int |
| hcaf_r | LMEBorder | Tags whether or not cell lies along the border of an LME. 0=No, 1=Yes | tinyint | int |
| hcaf_r | MEOW | 5-digit code (ECO_Code) referring to the marine ecoregion the cell belongs to, as assigned by MEOW, a biogeographic classification of the world’s coasts and shelves. | int | int |
| hcaf_r | OceanBasin | Major ocean basins of world with north and south sub-basins separated by latitudinal data from literature. | int | int |
| hcaf_r | IslandsNo | Number of coastal or oceanic islands contained in cell, as provided by the World Vector Shoreline database. | int | int |
| hcaf_r | Area0_20 | Area in cell from 0-20 m depth, in square kilometers, as provided by Smith and Sandwell: Bathymetry and Elevation (currently only 70 N to 70 S). | double | dbl |
| hcaf_r | Area20_40 | Area in cell from 20-40 m depth, in square kilometers, as provided by Smith and Sandwell: Bathymetry and Elevation. | double | dbl |
| hcaf_r | Area40_60 | Area in cell from 40-60 m depth, in square kilometers, as provided by Smith and Sandwell: Bathymetry and Elevation. | double | dbl |
| hcaf_r | Area60_80 | Area in cell 60-80 m depth, in square kilometers, as provided by Smith and Sandwell: Bathymetry and Elevation. | double | dbl |
| hcaf_r | Area80_100 | Area in cell from 80-100 m depth, in square kilometers, as provided by Smith and Sandwell: Bathymetry and Elevation. | double | dbl |
| hcaf_r | AreaBelow100 | Area in cell below 100 m depth, in square kilometers, as provided by Smith and Sandwell: Bathymetry and Elevation. | double | dbl |
| hcaf_r | ElevationMin | Minimum elevation above sea level in meters, as provided by ETOPO2. | double | dbl |
| hcaf_r | ElevationMax | Maximum elevation above sea level in cell in meters, as provided by ETOPO2. | double | dbl |
| hcaf_r | ElevationMean | Mean elevation above sea level in meters, as provided by ETOPO2. | double | dbl |
| hcaf_r | ElevationSD | Standard deviation of mean elevation above sea level in meters, as provided by ETOPO2. | double | dbl |
| hcaf_r | DepthMin | Minimum ETOPO 2min bathymetry (negative) elevation in 30min cell. | double | dbl |
| hcaf_r | DepthMax | Maximum ETOPO 2min bathymetry (negative) elevation in 30min cell. | double | dbl |
| hcaf_r | DepthMean | Mean ETOPO 2min bathymetry (negative) elevation in 30min cell. | double | dbl |
| hcaf_r | DepthSD | Standard deviation of mean bottom depth below sea level in meters, as provided by ETOPO2. | double | dbl |
| hcaf_r | SSTAnMean | Mean annual sea surface temperature in degree Celsius (2000-2014), as derived from Bio-ORACLE, for all coastal and oceanic cells, from 90 N to 78.5 S. | double | dbl |
| hcaf_r | SBTAnMean | Mean annual sea bottom temperature in degree Celsius, as derived from Bio-ORACLE (2000-2014), for all coastal and oceanic cells, from 90 N to 78.5 S. | double | dbl |
| hcaf_r | SalinityMean | Mean annual surface salinity in practical salinity scale (PSS), as derived from Bio-ORACLE (2000-2014), for all coastal and oceanic cells, from 90 N to 78.5 S. | double | dbl |
| hcaf_r | SalinityBMean | Mean annual bottom salinity in practical salinity scale (PSS), as derived from Bio-ORACLE (2000-2014), for all coastal and oceanic cells, from 90 N to 78.5 S. | double | dbl |
| hcaf_r | PrimProdMean | Proportion of annual surface primary production in a cell in mgC·m-3·day-1, for all coastal and oceanic cells, from 90 N to 78.5 S. | double | dbl |
| hcaf_r | IceConAnn | Mean annual sea ice concentration in percent (or fraction from 0-1), as derived from Bio-ORACLE (2000-2014), for all coastal and oceanic cells, from 90 N to 78.5 S. | double | dbl |
| hcaf_r | OxyMean | Mean annual dissolved molecular oxygen at the surface, in millimole per cubic meter, as derived from Bio-ORACLE (2000-2014), for all coastal and oceanic cells, from 90 N to 78.5 S. | double | dbl |
| hcaf_r | OxyBMean | Mean annual dissolved molecular oxygen at the bottom, in millimole per cubic meter, as derived from Bio-ORACLE (2000-2014), for all coastal and oceanic cells, from 90 N to 78.5 S. | double | dbl |
| hcaf_r | LandDist | Distance (km) to the nearest coastal cell (water cells only). | int | int |
| hcaf_r | Shelf | The water area of the cell that lies within the shelf zone (0 - 200m depth); based on min/max elevation and proportion in depth zone. | double | dbl |
| hcaf_r | Slope | The water area of the cell that lies within the slope zone (>200 - 4000m depth); based on min/max elevation and proportion in depth zone. | double | dbl |
| hcaf_r | Abyssal | The water area of the cell that lies within the abyssal zone (> 4000m depth); based on min/max elevation and proportion in depth zone. | double | dbl |
| hcaf_r | TidalRange | Extent of tides in scaled discrete classes as provided by the original LOICZ Database, for all coastal and oceanic cells. | int | int |
| hcaf_r | Coral | Proportion of whole (even non-water) cell covered by coral WCMC pixelclassify - NOT corrected to 284,300 sq km globally World Atlas of Coral Reefs UNEP WCMC 2001. | double | dbl |
| hcaf_r | Estuary | Area covered by estuaries in the cell. | double | dbl |
| hcaf_r | Seamount | Number of known seamounts attributed to the cell. | int | int |
| hcaf_r | MPA | Proportion of cell covered by a Marine Protected Area. | double | dbl |
| speciesoccursum_r | SpeciesID | AquaMaps’ unique identifier for a valid species used by the Catalogue of Life Annual Checklist (www.catalogueoflife.org). Example for the whale shark: Fis-30583 | varchar | chr |
| speciesoccursum_r | SpecCode | Species identifier used in FishBase or SeaLifeBase | int | int |
| speciesoccursum_r | Genus | Genus name of the species | varchar | chr |
| speciesoccursum_r | Species | Specific epithet of the species | varchar | chr |
| speciesoccursum_r | FBname | Common name suggested by FishBase or SeaLifeBase | varchar | chr |
| speciesoccursum_r | OccurRecs | Number of point records used to generate good cells | int | int |
| speciesoccursum_r | OccurCells | Number of good cells used to generate species envelope | int | int |
| speciesoccursum_r | StockDefs | Distribution of the species as recorded in FishBase or SeaLifeBase | longtext | chr |
| speciesoccursum_r | Kingdom | Kingdom to which the species belongs | varchar | chr |
| speciesoccursum_r | Phylum | Phylum to which the species belongs | varchar | chr |
| speciesoccursum_r | Class | Class to which the species belongs | varchar | chr |
| speciesoccursum_r | Order | Order to which the species belongs | varchar | chr |
| speciesoccursum_r | Family | Family to which the species belongs | varchar | chr |
| speciesoccursum_r | deepwater | Does the species occur in the deep-sea (i.e. tagged bathypelagic or bathydemersal in FishBase or SeaLifeBase)? 0=No, 1=Yes | tinyint | int |
| speciesoccursum_r | angling | Is the species a sport fish (i.e. tagged as a GameFish in FishBase)? 0=No, 1=Yes | tinyint | int |
| speciesoccursum_r | diving | Is the species found on a dive (i.e. where DepthPrefMin in HSPEN < 20 meters)? 0=No, 1=Yes | tinyint | int |
| speciesoccursum_r | dangerous | Is the species dangerous (i.e. tagged as ‘traumatogenic or venomous’ in FishBase or SeaLifeBase)? 0=No, 1=Yes | tinyint | int |
| speciesoccursum_r | m_invertebrates | Is the species a marine invertebrate? 0=No, 1=Yes | tinyint | int |
| speciesoccursum_r | highseas | Is the species an open ocean fish species (i.e. tagged as pelagic-oceanic in FishBase)? 0=No, 1=Yes | tinyint | int |
| speciesoccursum_r | invasive | Is the species recorded to be invasive (i.e. in FishBase or SeaLifeBase)? 0=No, 1=Yes | tinyint | int |
| speciesoccursum_r | resilience | Resilience of the species (i.e. as recorded in FishBase/SeaLifeBase) | varchar | chr |
| speciesoccursum_r | iucn_id | IUCN species identifier | int | int |
| speciesoccursum_r | iucn_code | IUCN Red list classification assigned to the species | varchar | chr |
| speciesoccursum_r | iucn_version | IUCN version | varchar | chr |
| speciesoccursum_r | provider | FishBase (FB) or SeaLifeBase (SLB)? | varchar | chr |
| occurrencecells_r | RecordID | Unique occurrencecells ID, for internal use only. | int | int |
| occurrencecells_r | CsquareCode | A unique identifier for every half-degree cell in the global map based on the c-square method - a hierarchical cell labelling system developed at CSIRO Oceans and Atmosphere (then CSIRO Marine Research). Example: 3414:227:3 | varchar | chr |
| occurrencecells_r | SpeciesID | AquaMaps’ unique identifier for a valid species used by the Catalogue of Life Annual Checklist (www.catalogueoflife.org). Example for the whale shark: Fis-30583 | varchar | chr |
| occurrencecells_r | SpecCode | Species identifier used in FishBase/SeaLifeBase. | int | int |
| occurrencecells_r | GoodCell | Is the cell a good cell (following the AquaMaps’ definition of a good cell, i.e. the cell falls inside the known bounding box and/or FAO areas where the species is reported to occur)? 0=No, 1=Yes | tinyint | int |
| occurrencecells_r | InFAOArea | Does the cell occur within the FAO areas where the species is reported to occur? 0=No, 1=Yes | tinyint | int |
| occurrencecells_r | InBoundBox | Does the cell occur within the bounding box where the species is reported to occur? 0=No, 1=Yes | tinyint | int |
| occurrencecells_r | GBIF_YN | Is the cell partially/completely based on GBIF point data? null=No, 1=Yes | tinyint | int |
| occurrencecells_r | OBIS_YN | Is the cell partially/completely based on OBIS point data? null=No, 1=Yes | tinyint | int |
| occurrencecells_r | FBSLB_YN | Is the cell partially/completely based on FishBase/SeaLifeBase occurrence records? null=No, 1=Yes | tinyint | int |
| occurrencecells_r | CountryPoint_YN | Is the cell partially/completely based on FishBase/SeaLifeBase country records? null=No, 1=Yes | tinyint | int |
| occurrencecells_r | AWI_YN | Is the cell partially/completely based on AWI point data? null=No, 1=Yes | tinyint | int |
| occurrencecells_r | IATTC_YN | Is the cell partially/completely based on IATTC point data? null=No, 1=Yes | tinyint | int |
| occurrencecells_r | UWA_YN | Is the cell partially/completely based on UWA point data? null=No, 1=Yes | tinyint | int |
| occurrencecells_r | CenterLat | The center point of the cell in decimal degrees latitude. Example: 89.75 | double | dbl |
| occurrencecells_r | CenterLong | NA | NA | dbl |
| occurrencecells_r | FAOAreaM | FAO area to which the cell belongs. | tinyint | int |
| hspen_r | SpeciesID | AquaMaps’ unique identifier for a valid species used by the Catalogue of Life Annual Checklist (www.catalogueoflife.org). Example for the whale shark: Fis-30583 | varchar | chr |
| hspen_r | Speccode | Species identifier used in FishBase or SeaLifeBase. | int | int |
| hspen_r | LifeStage | Life stage of the species. Currently all envelopes refer to adult environmental preferences. | varchar | chr |
| hspen_r | FAOAreas | Comma-delimited string containing the FAO area codes where native occurrence of the species has been reported in the literature. Example: 5, 7, 18, 27, 37 | varchar | chr |
| hspen_r | FAOComplete | Are the FAO areas listed in FAOAreas complete for this species? 0=No, 1=Yes | tinyint | int |
| hspen_r | NMostLat | Northern-most latitude of distributional range of this species, in decimal degrees. Example: 55.5 | double | dbl |
| hspen_r | SMostLat | Southern-most latitude of distributional range of this species, in decimal degrees. Example: -15 | double | dbl |
| hspen_r | WMostLong | Western-most longitude of distributional range of this species in decimal degrees. Example: -130 | double | dbl |
| hspen_r | EMostLong | Eastern-most longitude of distributional range of this species in decimal degrees. Example: -80 | double | dbl |
| hspen_r | DepthYN | Is the depth parameter used in computing map data? 0=No, 1=Yes | tinyint | int |
| hspen_r | DepthMin | Minimum depth where the species has been found (in meters). Example: 20 | int | int |
| hspen_r | DepthPrefMin | Minimum depth PREFERRED by the species (in meters). Example: 30 | int | int |
| hspen_r | DepthPrefMax | Maximum depth PREFERRED by the species (in meters). Example: 60 | int | int |
| hspen_r | DepthMax | Maximum depth range where this species has been found (in meters). Example: 120 | int | int |
| hspen_r | MeanDepth | Is mean depth used to fit the depth envelope? By default, marine mammals use mean depth. 0=No, 1=Yes | tinyint | int |
| hspen_r | Pelagic | Does the species occur in the water column well above and largely independent of the bottom? 0=No, 1=Yes | tinyint | int |
| hspen_r | TempYN | Is the temperature parameter used in computing map data? 0=No, 1=Yes | tinyint | int |
| hspen_r | TempMin | Minimum temperature tolerated by the species (in deg C). Example: 16 | double | dbl |
| hspen_r | TempPrefMin | Minimum temperature PREFERRED by the species (in deg C). Example: 20.0 | double | dbl |
| hspen_r | TempPrefMax | Maximum temperature PREFERRED by the species (in deg C). Example: 27.0 | double | dbl |
| hspen_r | TempMax | Maximum temperature tolerated by the species (in deg C). Example: 31 | double | dbl |
| hspen_r | SalinityYN | Is the salinity parameter used in generating map data? 0=No, 1=Yes | tinyint | int |
| hspen_r | SalinityMin | Minimum salinity tolerated by the species (in psu). Example: 20 | double | dbl |
| hspen_r | SalinityPrefMin | Minimum salinity PREFERRED by the species (in psu). Example: 33.4 | double | dbl |
| hspen_r | SalinityPrefMax | Maximum salinity PREFERRED by the species (in psu). Example: 35.7 | double | dbl |
| hspen_r | SalinityMax | Maximum salinity tolerated by the species (in psu). Example: 38 | double | dbl |
| hspen_r | PrimProdYN | Is the primary production parameter used in computing map data? 0=No, 1=Yes | tinyint | int |
| hspen_r | PrimProdMin | Minimum amount of primary production tolerated by the species (in mgC·m-3·day-1). Example: 0 | double | dbl |
| hspen_r | PrimProdPrefMin | Minimum amount of primary production PREFERRED by the species (in mgC·m-3·day-1). Example: 579 | double | dbl |
| hspen_r | PrimProdPrefMax | Maximum amount of primary production PREFERRED by the species (in mgC·m-3·day-1). Example: 1754 | double | dbl |
| hspen_r | PrimProdMax | Maximum amount of primary production tolerated by the species (in mgC·m-3·day-1). Example: 2935 | double | dbl |
| hspen_r | IceConYN | Is the ice concentration parameter used in computing map data? 0=No, 1=Yes | tinyint | int |
| hspen_r | IceConMin | Minimum sea ice concentration tolerated by the species (0-1 fraction). | double | dbl |
| hspen_r | IceConPrefMin | Minimum sea ice concentration PREFERRED by the species (0-1 fraction). | double | dbl |
| hspen_r | IceConPrefMax | Maximum sea ice concentration PREFERRED by the species (0-1 fraction). | double | dbl |
| hspen_r | IceConMax | Maximum sea ice concentration tolerated by the species (0-1 fraction). | double | dbl |
| hspen_r | OxyYN | Is the dissolved bottom oxygen parameter used in computing map data? 0=No, 1=Yes | tinyint | int |
| hspen_r | OxyMin | Minimum dissolved bottom oxygen tolerated by the species (in mmol·m-3). Example: 1.33 | double | dbl |
| hspen_r | OxyPrefMin | Minimum dissolved bottom oxygen PREFERRED by the species (in mmol·m-3). Example: 231.42 | double | dbl |
| hspen_r | OxyPrefMax | Maximum dissolved bottom oxygen PREFERRED by the species (in mmol·m-3). Example: 327.77 | double | dbl |
| hspen_r | OxyMax | Maximum dissolved bottom oxygen tolerated by the species (in mmol·m-3). Example: 408.99 | double | dbl |
| hspen_r | LandDistYN | Is the distance to land parameter used in computing map data? 0=No, 1=Yes | tinyint | int |
| hspen_r | LandDistMin | Minimum distance to land tolerated by the species (in km). Example: 20 | double | dbl |
| hspen_r | LandDistPrefMin | Minimum distance to land PREFERRED by the species (in km). Example: 33 | double | dbl |
| hspen_r | LandDistPrefMax | Maximum distance to land PREFERRED by the species (in km). Example: 35 | double | dbl |
| hspen_r | LandDistMax | Maximum distance to land tolerated by the species (in km). Example: 38 | double | dbl |
| hspen_r | Remark | Text field to accommodate any remarks relevant to this record. | longtext | chr |
| hspen_r | DateCreated | Date and time when this record was first created. Example: 2019-06-24 00:00:00 | datetime | chr |
| hspen_r | DateModified | Date and time when this record was last modified. If the record has not been modified, field is empty. Example: 2019-08-19 00:00:00 | datetime | chr |
| hspen_r | expert_id | ID of the expert who last reviewed the envelope. | int | int |
| hspen_r | DateExpert | Date and time when this record was last edited by an expert. Example: 2019-08-29 00:00:00 | datetime | chr |
| hspen_r | Layer | Indicates whether the temperature and salinity parameters are based on bottom (=b) or surface (=s) values of half-degree cells used to compute the envelope. | char | chr |
| hspen_r | Rank | Internal code for basis of computation for environmental envelope (1 = with >10 good cells; 2 = with 3-9 good cells only; 3 = restricted range, one known point, new species). | tinyint | int |
| hspen_r | MapOpt | Indicates how native map (predicted probabilities) is plotted: 1 = area covered by both species’ bounding box and FAO areas, 2 = area covered by species’ FAO areas only, 3 = area covered by species’ bounding box only. | tinyint | int |
| hspen_r | ExtnRuleYN | Was the FAO extension rule applied in the generation of the species envelope? 0=No, 1=Yes, null | tinyint | int |
| hspen_r | Reviewed | Is this a reviewed envelope? 0=No, 1=Yes, null | tinyint | int |
| hcaf_species_native | SpeciesID | AquaMaps’ unique identifier for a valid species used by the Catalogue of Life Annual Checklist (www.catalogueoflife.org). Example for the whale shark: Fis-30583 | varchar | chr |
| hcaf_species_native | CsquareCode | A unique identifier for every half-degree cell in the global map based on the c-square method - a hierarchical cell labelling system developed at CSIRO Oceans and Atmosphere (then CSIRO Marine Research). Example: 3414:227:3 | varchar | chr |
| hcaf_species_native | CenterLat | The center point of the cell in decimal degrees latitude. Example: 89.75 | double | dbl |
| hcaf_species_native | CenterLong | The center point of the cell in decimal degrees longitude. Example: -179.75 | double | dbl |
| hcaf_species_native | Probability | Overall probability of occurrence of the species in the cell (ranging from 0.01 to 1). Example: 0.71 | float | dbl |
| hcaf_species_native | FAOAreaYN | Does this cell fall within an FAO area where the species is known to occur (endemic/native)? 0=No, 1=Yes | tinyint | int |
| hcaf_species_native | BoundBoxYN | Does this cell fall within the geographical bounding box known for the species? 0=No, 1=Yes | tinyint | int |
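
The four envelope fields listed for each parameter in hspen_r (Min, PrefMin, PrefMax, Max) can be read as the corners of a suitability curve. The sketch below assumes a trapezoidal reading: 0 outside the tolerated range, 1 inside the preferred range, and linear in between. That reading is an assumption, as the field descriptions do not state how the values are combined. The function name `envelope_suitability` and its arguments are hypothetical and not part of aquamapsdata. The temperature values are the examples given in the table.

```r
# Hypothetical helper: trapezoidal suitability for one environmental parameter.
# lo/hi are the tolerated limits, pref_lo/pref_hi the preferred limits.
envelope_suitability <- function(x, lo, pref_lo, pref_hi, hi) {
  out <- numeric(length(x))                      # 0 outside [lo, hi]
  rising <- x >= lo & x < pref_lo                # between Min and PrefMin
  out[rising] <- (x[rising] - lo) / (pref_lo - lo)
  out[x >= pref_lo & x <= pref_hi] <- 1          # inside the preferred range
  falling <- x > pref_hi & x <= hi               # between PrefMax and Max
  out[falling] <- (hi - x[falling]) / (hi - pref_hi)
  out
}

# Example values from the table: TempMin 16, TempPrefMin 20,
# TempPrefMax 27, TempMax 31 (deg C).
temps <- c(10, 16, 18, 20, 25, 29, 31, 35)
suit <- envelope_suitability(temps, lo = 16, pref_lo = 20, pref_hi = 27, hi = 31)
print(data.frame(temp_c = temps, suitability = suit))
```

The Probability column in hcaf_species_native already holds precomputed per-cell values, so a helper like this is only useful for building intuition about what an envelope row expresses.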
This section describes how the data was prepared for use in this R package. It is aimed less at package users than at readers who want to understand the data preparation steps.
Data preparation consists of moving the relevant parts of the source data from its primary source into the local SQLite3 database that the package uses.
The source data lives in a MySQL/MariaDB database. If it is made available as a backup of a raw datadir or, preferably, as a data dump, it can be loaded into a local MariaDB database engine.
With docker-compose this can be done in one step, using
the command docker-compose up -d and this
docker-compose.yml file:
volumes:
  db:
services:
  db:
    image: mariadb:latest
    ports:
      - "3306:3306"
    environment:
      - MYSQL_ROOT_PASSWORD=your_root_db_password
      - MYSQL_DATABASE=aquamapsdb
      - MYSQL_USER=your_db_user
      - MYSQL_PASSWORD=your_db_password
    volumes:
      - db:/var/lib/mysql
      - ./aquamaps.sql:/docker-entrypoint-initdb.d/aquamaps.sql:ro

After this step, the data is available to access locally through
aquamapsdata::con_am().
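
Independently of the package, it can be useful to check that the container actually loaded the dump. The sketch below assumes the DBI and RMariaDB packages are installed and that the database runs with the settings from the compose file (port 3306, database aquamapsdb). The helper name `connect_source` is hypothetical and is not how con_am() is implemented.

```r
# Hypothetical helper: open a DBI connection to the local MariaDB container.
# Defaults mirror the compose file; the credentials are placeholders.
connect_source <- function(user, password,
                           host = "127.0.0.1", port = 3306,
                           dbname = "aquamapsdb") {
  DBI::dbConnect(
    RMariaDB::MariaDB(),
    host = host, port = port,
    username = user, password = password,
    dbname = dbname
  )
}

# Usage (requires the container to be up, so it is left commented out):
# con <- connect_source("your_db_user", "your_db_password")
# DBI::dbListTables(con)   # lists the tables restored from the dump
# DBI::dbDisconnect(con)
```

Scripts in /docker-entrypoint-initdb.d only run when the data volume is empty, so if no tables show up, the named volume db may need to be removed before restarting the container.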
The “metadata” for table and field names and their descriptions is
provided through aquamapsdata::am_meta, which is prepared by
the script data-raw/am-meta.R. This metadata is used in the package
documentation, and the am_search_exact() function allows for
taxonomic searches using some of those fields.
A set of functions then allows for syncing the data into an SQLite3 database with full text search support, which gets indexed.
Relevant steps are:
db_sync() from source connection to target db
am_create_fts()
am_create_indexes()
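
The three steps above can be sketched with plain DBI and RSQLite calls. This is a minimal illustration under stated assumptions, not the package's implementation. An in-memory SQLite database stands in for the MariaDB source. The table name `taxa`, the FTS table `fts` and its columns, the index name, and the second SpeciesID are all made up for the example. Only the whale shark identifier Fis-30583 comes from the field descriptions.

```r
library(DBI)

# Stand-in for the source database (MariaDB in the real workflow).
src <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(src, "taxa", data.frame(
  SpeciesID = c("Fis-30583", "Fis-00001"),  # second ID is invented
  Genus     = c("Rhincodon", "Caranx"),
  Species   = c("typus", "ignobilis"),
  FBname    = c("Whale shark", "Giant trevally"),
  stringsAsFactors = FALSE
))

# Step 1 (sync): copy every source table into the target SQLite database.
tgt <- dbConnect(RSQLite::SQLite(), ":memory:")
for (tbl in dbListTables(src)) {
  dbWriteTable(tgt, tbl, dbReadTable(src, tbl))
}

# Step 2 (full text search): build an FTS5 table of searchable terms per key.
dbExecute(tgt, "CREATE VIRTUAL TABLE fts USING fts5(species_id, terms)")
dbExecute(tgt, "
  INSERT INTO fts (species_id, terms)
  SELECT SpeciesID, Genus || ' ' || Species || ' ' || FBname FROM taxa
")

# Step 3 (indexes): index the key column used for exact lookups.
dbExecute(tgt, "CREATE INDEX idx_taxa_speciesid ON taxa (SpeciesID)")

# FTS5 queries can combine terms with AND, OR and NOT.
hits_and <- dbGetQuery(tgt, "SELECT species_id FROM fts WHERE fts MATCH 'whale AND shark'")
hits_not <- dbGetQuery(tgt, "SELECT species_id FROM fts WHERE fts MATCH 'caranx NOT shark'")
print(hits_and)
print(hits_not)

dbDisconnect(src)
dbDisconnect(tgt)
```

Keeping the searchable text in a separate FTS5 table means free-text lookups only need to return a key, which can then be used against the regular, indexed tables.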
The function am_search_exact() accepts many parameters,
which can be combined to query the taxonomy in a single call.
The am_search_fuzzy() function is quick and supports FTS5 search
syntax (search terms can be quoted and combined with AND, OR,
NOT).
These functions return search results containing keys or identifiers
that can be used to retrieve map data in raster format through
am_raster(). From such a raster, a leaflet map can be
created with am_map_leaflet().
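
A possible end-to-end session, from search to map, is sketched below. It runs against the small bundled database, and only if aquamapsdata is installed. Several details are assumptions about the package API rather than confirmed signatures: the argument name `Genus`, the `key` column in the fuzzy search result, and the `title` argument of am_map_leaflet().

```r
keys <- character(0)

if (requireNamespace("aquamapsdata", quietly = TRUE)) {
  library(aquamapsdata)
  invisible(default_db("extdata"))   # bundled minified database

  # Fuzzy search with FTS5 operators (assumed to return a `key` column).
  fuzzy <- am_search_fuzzy("trevally OR shark")
  keys <- as.character(fuzzy$key)

  # Exact search on a taxonomic field (argument name is an assumption).
  exact <- am_search_exact(Genus = "Caranx")

  # Retrieve a raster for the first key, if any, and build a leaflet map.
  if (length(keys) > 0) {
    ras <- am_raster(keys[1])
    map <- am_map_leaflet(ras, title = keys[1])
    # print(map) displays the interactive map in an interactive session
  }
}
```

With the full database, the same calls apply after download_db() and default_db("sqlite").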