Where should you plant mangroves? And which species will thrive at a given site?
This document walks through how we built the species suitability maps for Grenada’s Mangrove Innovation Hub. We combined field data from survey plots across Grenada with landscape-scale environmental data to model where each of the three mangrove species (Red, Black, and White) is likely to establish and how well it might grow.
This work was developed by Gaea Conservation Network as part of “Developing a standardized approach for mangrove conservation and restoration in Grenada, Carriacou and Petite Martinique,” with support from the Grenada Fund for Conservation and KIDO Foundation, funded by GEF-SGP Grenada. Field data from baseline sites was also collected under “Restoring climate refuges for migratory birds and building community engagement with nature,” funded by Environment and Climate Change Canada (ECCC).
The goal here is transparency. We want practitioners, researchers, and anyone curious to understand the decisions we made, why we made them, and what the models actually tell us. If you just want to use the map, head to [Explore the Map]. If you want to understand how we got there, read on.
Mangrove species don’t establish randomly across the landscape. They respond to environmental conditions — how wet the soil is, how often it floods, how much freshwater versus saltwater influence there is. To predict where each species is likely to thrive, we need to capture these conditions across the entire landscape.
We used five environmental variables derived from publicly available spatial data.
Topographic Wetness Index (TWI) measures how likely water is to accumulate at a given location based on the shape of the land. Higher values mean wetter conditions — areas where groundwater is more accessible and soils stay saturated longer.
Flow Accumulation (FA) tracks how surface water moves across the landscape. Higher values indicate areas where water collects from upslope — channels, drainage paths, and low points that receive runoff.
Elevation integrates multiple environmental factors at once. In coastal wetlands, elevation determines how often a site floods with tidal water, how salty the soil becomes, and how well it drains. It’s not that mangroves respond to elevation directly — they respond to the conditions that elevation creates.
Distance to Coast captures marine influence. Sites closer to the coast experience more saltwater exposure, tidal energy, and wave action.
Distance to River captures freshwater influence. Sites near rivers receive more freshwater input, sediment, and nutrients — conditions that favor different species assemblages than purely marine sites.
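For readers who want the arithmetic behind TWI, the conventional definition is ln(a / tan β), where a is the upslope contributing area per unit contour width and β is the local slope. The sketch below evaluates that standard formula for three hypothetical cells. The input values are invented for illustration, and the exact variant computed by a given GIS tool may differ.

```r
# Standard TWI definition: ln(a / tan(beta)).
# All inputs are hypothetical values, not taken from the Grenada DEM.
twi <- function(catchment_area, slope_deg) {
  slope_rad <- slope_deg * pi / 180
  log(catchment_area / tan(slope_rad))
}

cells <- data.frame(
  site = c("ridge", "mid-slope", "valley floor"),
  catchment_area = c(10, 200, 5000),  # m^2 per metre of contour width
  slope_deg = c(25, 10, 1)
)

# Larger contributing area and gentler slope both push TWI upward
cells$twi <- twi(cells$catchment_area, cells$slope_deg)
print(cells)
```

Flat, low-lying cells that collect water from a large upslope area get the highest values, which is why TWI tends to pick out valley floors and coastal flats.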
The foundation for this analysis is a 5-meter resolution Digital Elevation Model (DEM) collected through a LiDAR survey in 2017 as part of the Regional Disaster Vulnerability Reduction Project (DVRP), funded by the World Bank and Climate Investment Fund. This survey, conducted by Fugro, covered Grenada, Carriacou, and Petite Martinique.
From the DEM, we derived two hydrological layers in SAGA GIS: Topographic Wetness Index and Flow Accumulation. These capture how water moves across and accumulates on the landscape based on terrain shape.
Distance to Coast was calculated from administrative boundary shapefiles for Grenada. Distance to River was calculated from a river network layer also produced through the DVRP project.
Soil type: We considered including soil characteristics, since they influence drainage and root conditions. However, the available soil data had substantial gaps and was not digitized for Carriacou and Petite Martinique, so we excluded it from the final models.
Land use/land cover: Although land cover data was available from the DVRP project, we chose not to include it as a predictor for two reasons. First, land cover reflects human decisions (where people have built or cleared) rather than underlying environmental suitability. A site classified as “developed” might still have the right conditions for mangroves if it were restored. Second, the land cover raster has classification artifacts where adjacent pixels can have vastly different values, making it unreliable for modeling at fine scales. For the same reason, we did not rely on it to filter the final map; we restricted the map with a coastal buffer mask instead.
Environmental variables often have skewed distributions or extreme values that can cause problems in modeling. Flow accumulation, for example, ranges from near zero in most places to very high values along drainage channels. Raw elevation values span from sea level to mountain peaks. To make these variables suitable for modeling, we applied two steps.
Transformation: We applied the inverse hyperbolic sine (asinh) transformation to all continuous variables. This works like a log transformation — compressing extreme values while preserving the relative ordering — but unlike log, it handles zeros and negative values naturally. This is important for variables like elevation where values near zero are meaningful.
Scaling: After transformation, we standardized each variable using the mean and standard deviation calculated from the full landscape raster. This puts all variables on a common scale, which makes it easier to compare their effects in models. A one-unit change in scaled TWI represents the same relative shift as a one-unit change in scaled elevation.
Note: We calculated scaling statistics from the landscape rasters, not just from our survey points. This ensures that the survey data used to fit the models and the rasters used for prediction are on the same scale, so a given scaled value means the same thing in both.
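The two steps can be sketched in a few lines of base R. This is a minimal illustration on simulated values standing in for a flow accumulation layer, not the project's actual pipeline; the object names (`landscape_fa`, `scale_fa`, `plot_fa`) are hypothetical.

```r
# Simulated stand-in for a landscape raster: mostly small values, a few very large ones
set.seed(42)
landscape_fa <- c(rexp(9990, rate = 1), runif(10, 1e4, 1e5))

# Step 1: asinh transform. asinh(x) = log(x + sqrt(x^2 + 1)),
# which is defined at zero and for negative values
fa_asinh <- asinh(landscape_fa)

# Step 2: standardize using the landscape-wide mean and standard deviation
fa_mean <- mean(fa_asinh)
fa_sd <- sd(fa_asinh)
scale_fa <- function(x) (asinh(x) - fa_mean) / fa_sd

# Survey-plot values are scaled with the same landscape statistics
plot_fa <- c(0, 2.5, 15000)
scale_fa(plot_fa)
```

Storing the mean and standard deviation once and reusing them for both survey points and rasters is what keeps the two on the same scale.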
Here’s what each variable looks like across the landscape. Toggle between tabs to compare.
After transformation and scaling, here’s how values are distributed across the landscape. These histograms show the range of conditions the models will encounter when predicting across Grenada.
Before modeling, we checked whether our environmental variables were measuring distinct information or duplicating each other.
Correlations above 0.7 would indicate redundancy — variables so similar they’d compete in the model. Here, most correlations are moderate or weak, meaning each variable contributes something distinct. The strongest relationship is between Elevation and Distance to Coast (r = 0.58), which makes intuitive sense — lower areas tend to be closer to the coast — but they’re not so correlated that we need to drop one.
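A redundancy check of this kind can be sketched as follows. The data are simulated, the column names are placeholders, and we assume Pearson correlation here, which the text above does not specify.

```r
# Simulated predictors; Elevation and DistanceToCoast are built to be moderately correlated
set.seed(1)
n <- 500
elevation <- rnorm(n)
env <- data.frame(
  Elevation = elevation,
  DistanceToCoast = 0.6 * elevation + rnorm(n, sd = 0.8),
  TWI = rnorm(n),
  FA = rnorm(n),
  DistanceToRiver = rnorm(n)
)

# Pairwise correlation matrix (Pearson is an assumption)
cor_mat <- cor(env, method = "pearson")

# List each pair once (upper triangle) whose absolute correlation exceeds 0.7
high <- which(abs(cor_mat) > 0.7 & upper.tri(cor_mat), arr.ind = TRUE)
flagged <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]]
)

round(cor_mat, 2)
flagged  # zero rows means no pair crosses the redundancy cutoff
```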
The environmental layers tell us about conditions across the landscape. But to build models that predict where mangroves occur and how well they grow, we need field observations — actual measurements of which species are present and how they’re performing at real sites.
Field data comes from two projects. Under the ECCC-funded project “Restoring climate refuges for migratory birds and building community engagement with nature,” we established 15 transects with 45 plots across four reference sites: Levera, Westerhall, Dover, and David Bay. Under the GEF-SGP project, we added surveys at additional sites including Beausejour and Lans Aux Epines.
At each plot, we recorded:
These measurements let us ask two questions: Where does each species occur? And where it occurs, how well does it grow?
Species distribution models need contrast — not just locations where a species is present, but also locations where it could occur but doesn’t. Without this, the model can’t distinguish suitable from unsuitable conditions.
Our survey plots are all in mangrove habitat. To provide contrast, we generated 200 random background points distributed across Grenada’s landscape. These represent “available” environmental conditions — places where we extracted the same environmental variables but did not observe mangroves.
Adding background points lets the presence models learn what distinguishes mangrove sites from the broader landscape, rather than just describing variation within mangrove areas.
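To make the idea concrete, here is a minimal sketch of drawing background points and stacking them with presence plots. The landscape grid, the coordinates, and the plot values are all invented; in practice the points would be sampled from the real rasters with a spatial package.

```r
set.seed(2024)

# Hypothetical landscape: a grid of cells with coordinates and one scaled predictor
landscape <- expand.grid(x = seq(0, 990, by = 10), y = seq(0, 990, by = 10))
landscape$Elevation <- rnorm(nrow(landscape))

# Draw 200 random cells as background points and code the species as not observed (0)
background <- landscape[sample(nrow(landscape), 200), ]
background$RedPresent <- 0L
rownames(background) <- NULL

# Hypothetical survey plots where Red mangrove was recorded (1)
plots <- data.frame(
  x = c(15, 25, 35),
  y = c(5, 5, 15),
  Elevation = c(-1.2, -0.9, -1.1),
  RedPresent = 1L
)

# Combined table used to fit a presence model
model_data <- rbind(plots, background)
table(model_data$RedPresent)
```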
# Assumes sf, dplyr, tidyr, stringr, and ggplot2 are already attached
# Predictor column names, labelled with the matching raster layer names
plotVariables <- VegDataCompleteBinomial %>%
  st_drop_geometry() %>%
  dplyr::select(FA:DistanceToRiver) %>%
  colnames()
names(plotVariables) <- names(RasterList)

# For each predictor, emit a tab heading and a Present vs. Background density plot
for (i in seq_along(plotVariables)) {
  cat("\n\n###", names(plotVariables)[i], "\n\n")
  print(
    VegDataCompleteBinomial %>%
      pivot_longer(cols = RedPresent:WhitePresent, names_to = "Species", values_to = "Status") %>%
      mutate(Species = str_trim(str_remove_all(Species, "Present"))) %>%
      mutate(Status = ifelse(Status == 0, "Background", "Present")) %>%
      ggplot(aes(x = !!sym(plotVariables[[i]]), fill = Status)) +
      geom_density(alpha = 0.45, color = NA) +
      theme_minimal() +
      scale_fill_viridis_d() +
      facet_grid(~Species) +
      labs(
        title = paste("Distribution of", names(plotVariables)[i]),
        x = "Value",
        y = "Density"
      ) +
      theme(legend.position = "bottom")
  )
  cat("\n\n")
}
We want to answer two distinct questions: Where can each species establish? And where it establishes, how well does it grow? These are different processes — so we modeled them separately.
Presence models predict the probability that a species occurs at a given location. We used logistic regression (binomial distribution) with our survey plots plus background points. The outcome is binary — present or absent — and the model estimates how environmental conditions shift the odds of occurrence.
Height models predict how tall trees grow at locations where a species is already present. We used gamma regression, which handles the right-skewed, strictly positive distribution typical of size measurements. These models only use plots where the species was actually found — we can’t measure tree height where there are no trees.
This separation matters for restoration planning. A site might have high probability of establishment but poor growth potential, or vice versa. The two models capture different aspects of suitability.
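The two-model structure can be sketched with base R's `glm()`. Everything here is simulated, and the log link for the gamma model is our assumption for the sketch, since the text above does not state which link was used.

```r
set.seed(7)
n <- 300
dat <- data.frame(Elevation = rnorm(n), TWI = rnorm(n))

# Simulated presence: more likely at low elevation and high wetness
dat$Present <- rbinom(n, 1, plogis(-1 - 1.5 * dat$Elevation + 1.0 * dat$TWI))

# Simulated height: strictly positive and right-skewed
dat$Height <- rgamma(n, shape = 5, rate = 5 / exp(1.5 - 0.2 * dat$Elevation))

# Presence model: logistic regression on all rows (presences plus background)
presence_fit <- glm(Present ~ Elevation + TWI, family = binomial, data = dat)

# Height model: gamma regression on rows where the species is present
# (log link is an assumption of this sketch)
height_fit <- glm(Height ~ Elevation + TWI,
                  family = Gamma(link = "log"),
                  data = subset(dat, Present == 1))

# Predictions for one hypothetical site
new_site <- data.frame(Elevation = -1, TWI = 1)
predict(presence_fit, new_site, type = "response")  # probability of occurrence
predict(height_fit, new_site, type = "response")    # expected height where present
```

Because the two models are fitted separately, a site can score high on one prediction and low on the other.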
We could have used tree-based models like gradient boosting machines (GBMs), which often perform better for prediction and can handle missing data natively — meaning we could have included the incomplete soil layer. But we chose logistic and gamma regression because we wanted explainable models.
With regression, we can say exactly how each environmental variable influences the outcome: a one-unit increase in TWI shifts the odds of occurrence by this much. We can see whether elevation has a positive or negative effect, and whether that effect changes when combined with other variables. This transparency matters when the goal is to understand what drives mangrove distribution, not just predict it.
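As a worked example of that interpretation, the snippet below converts a logistic regression coefficient into an odds ratio and then into a change in probability. The coefficient and the baseline probability are hypothetical numbers chosen for illustration, not estimates from our models.

```r
# Hypothetical coefficient for scaled TWI in a presence model
beta_twi <- 0.8

# A one-unit increase in scaled TWI multiplies the odds of occurrence by exp(beta)
odds_ratio <- exp(beta_twi)

# The change in probability depends on where you start
baseline_p <- 0.20
baseline_odds <- baseline_p / (1 - baseline_p)
new_odds <- baseline_odds * odds_ratio
new_p <- new_odds / (1 + new_odds)

round(c(odds_ratio = odds_ratio, baseline_p = baseline_p, new_p = new_p), 3)
```

The odds ratio is constant across sites, but the same coefficient moves probability more in the middle of the range than near 0 or 1.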
For restoration planning, knowing why a site is suitable is as valuable as knowing that it’s suitable.
Environmental variables can influence mangrove distribution in simple or complex ways. Elevation alone might explain where Black mangroves occur. Or maybe elevation matters differently depending on how wet the soil is — an interaction effect.
We tested this systematically by fitting models of increasing complexity:
For each species and model type (presence or height), we compared all combinations using AIC (Akaike Information Criterion). Lower AIC indicates better model performance, balancing fit against complexity. Models within 2 AIC units of the best are considered essentially equivalent.
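The comparison logic can be sketched on simulated data as follows. The four candidate formulas are placeholders rather than the full set we tested, and the data are generated so that Elevation and TWI both matter.

```r
set.seed(11)
n <- 250
dat <- data.frame(Elevation = rnorm(n), TWI = rnorm(n), FA = rnorm(n))
dat$Present <- rbinom(n, 1, plogis(-0.5 - 1.2 * dat$Elevation + 0.8 * dat$TWI))

# Candidate models of increasing complexity (illustrative subset)
candidates <- list(
  "Elevation" = Present ~ Elevation,
  "Elevation + TWI" = Present ~ Elevation + TWI,
  "Elevation * TWI" = Present ~ Elevation * TWI,
  "Elevation + TWI + FA" = Present ~ Elevation + TWI + FA
)
fits <- lapply(candidates, glm, family = binomial, data = dat)

# AIC for each model, difference from the best, and the 2-unit equivalence rule
comparison <- data.frame(
  Model = names(fits),
  AIC = unname(sapply(fits, AIC))
)
comparison$DeltaAIC <- comparison$AIC - min(comparison$AIC)
comparison$Equivalent <- comparison$DeltaAIC <= 2
comparison[order(comparison$DeltaAIC), ]
```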
Not all models converged successfully. We excluded models with convergence errors, complete separation, or other fitting issues.
These issues typically occurred with more complex models or those including distance variables, which had limited variation across our study sites.
We tested 4,398 model combinations across three species and two response types (presence and height). Of these, 137 fitted successfully. The table below shows all of these models, with the difference in AIC from the best model for each species and response type.
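One way to screen fits automatically is to capture warnings and errors from `glm()` and record a status for each model. The helper below is a hypothetical sketch, not the function we used; the name `fit_safely` and every status label other than "Success" are invented for illustration.

```r
# Fit a GLM and classify the outcome instead of stopping on problems
fit_safely <- function(formula, data, family = binomial) {
  warnings_seen <- character(0)
  fit <- withCallingHandlers(
    tryCatch(glm(formula, family = family, data = data), error = function(e) e),
    warning = function(w) {
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  status <- if (inherits(fit, "error")) {
    "Error"
  } else if (!fit$converged) {
    "Not converged"
  } else if (any(grepl("fitted probabilities numerically 0 or 1", warnings_seen))) {
    "Separation"
  } else if (length(warnings_seen) > 0) {
    "Warning"
  } else {
    "Success"
  }
  list(fit = fit, status = status)
}

# Two toy datasets: one with overlapping classes, one perfectly separated by x
x <- seq(-2, 2, length.out = 40)
overlapping <- data.frame(x = x, y = as.integer(x + rep(c(-1.5, 1.5), 20) > 0))
separated <- data.frame(x = x, y = as.integer(x > 0))

fit_safely(y ~ x, overlapping)$status
fit_safely(y ~ x, separated)$status
```

Only models with a clean status would then be carried forward into the AIC comparison.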
A few patterns stand out:
What this means: We can predict where mangroves establish with reasonable confidence, but predicting how tall they grow is much harder. Growth performance appears to be driven by factors we didn’t measure — genetics, microsite conditions, disturbance history, or fine-scale soil variation.
# Build the model comparison table from all successfully fitted models
SaveModelsAll %>%
  ungroup() %>%
  # Keep fits with a finite, non-negative pseudo-R2 and no fitting problems
  filter(!is.infinite(PseudoR2), PseudoR2 >= 0, ModelStatus == "Success") %>%
  # Delta AIC/BIC are measured against the best model within each species and response type
  group_by(Species, ModelType) %>%
  mutate(DeltaAIC = AIC - min(AIC, na.rm = TRUE),
         DeltaBIC = BIC - min(BIC, na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(
    Response = case_when(
      ModelType == "Binomial" ~ "Presence",
      ModelType == "Gamma" ~ "Height"
    ),
    PseudoR2 = scales::percent(round(PseudoR2, 3)),
    DeltaAIC = round(DeltaAIC, 2),
    DeltaBIC = round(DeltaBIC, 2)
  ) %>%
  dplyr::select(Species, Response, Predictors = ModelName, PseudoR2, DeltaAIC, DeltaBIC) %>%
  DT::datatable(
    caption = "Model comparison results for all species and response types. Delta AIC/BIC show difference from the best model within each species-response combination.",
    filter = "top",
    options = list(
      pageLength = 15,
      autoWidth = TRUE
    )
  )
These plots show how each environmental variable influences the outcome. For presence models, positive values mean the variable increases the probability of finding that species; negative values mean it decreases. For height models, positive values mean taller trees; negative values mean shorter trees.
The error bars show 90% confidence intervals. When the interval doesn’t cross zero (the dashed line), the effect is statistically significant at the 10% level, meaning the data are hard to reconcile with no effect at all. Grey points indicate non-significant effects, where we can’t rule out that the true effect is zero.
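The interval logic can be illustrated with a few made-up coefficients. We assume normal-approximation (Wald) intervals here, which may differ from how the plotted intervals were computed; the estimates and standard errors are hypothetical.

```r
# Hypothetical coefficient estimates and standard errors
coefs <- data.frame(
  term = c("Elevation", "TWI", "FA"),
  estimate = c(-1.40, 0.60, 0.10),
  std_error = c(0.35, 0.30, 0.25)
)

# A 90% two-sided interval uses the 95th percentile of the standard normal
z <- qnorm(0.95)
coefs$lower <- coefs$estimate - z * coefs$std_error
coefs$upper <- coefs$estimate + z * coefs$std_error

# An effect is flagged when its interval lies entirely on one side of zero
coefs$significant <- coefs$lower > 0 | coefs$upper < 0
coefs
```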
A few patterns are consistent across species:
# Print each coefficient plot under its own tab heading
for (i in seq_along(AllCoefficientPlots)) {
  cat("### ", names(AllCoefficientPlots)[i], "\n\n")
  print(AllCoefficientPlots[[i]])
  cat("\n\n")
}
The models give us predicted probabilities for every location across the landscape. Since the height models explained very little variation — under 2% for Red mangrove and only 10% for White mangrove — we focus here on presence predictions only. These tell us where each species is likely to establish based on environmental conditions.
Presence Probability shows the likelihood of successful species establishment. Higher values (yellow) indicate areas where the model predicts a species is more likely to occur. Lower values (purple) indicate less suitable conditions. These predictions are based on the top-performing presence models for each species.
The presence models output probabilities — a site might have a 0.65 probability of Red mangrove occurrence. To create a map of “suitable” versus “unsuitable” areas, we need to choose a cutoff: above what probability do we call a site suitable?
This decision involves a tradeoff. A low threshold (say, 0.1) will capture most of the places where mangroves actually occur, but it will also flag many unsuitable sites as “suitable” — false positives. A high threshold (say, 0.5) will be more conservative, only flagging sites with strong predicted suitability, but it might miss genuinely suitable areas — false negatives.
Two metrics help us evaluate this tradeoff:
We tested six threshold strategies, ranging from statistically-derived cutoffs (Youden, Closest to Top Left) to fixed values (Conservative at 0.5, Moderate at 0.3, Liberal at 0.1). The plot below shows how balanced accuracy changes across thresholds for each species.
For all three species, lower thresholds produced higher balanced accuracy. This happens partly because our data has many more absence locations than presence locations.
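Balanced accuracy is the mean of sensitivity (the share of presences called suitable) and specificity (the share of background points called unsuitable). The sketch below computes all three at the fixed thresholds on simulated predictions; the data and the helper name `threshold_metrics` are hypothetical.

```r
set.seed(3)

# Simulated data: 40 presences and 200 background points with predicted probabilities
observed <- c(rep(1, 40), rep(0, 200))
predicted <- c(rbeta(40, 3, 3), rbeta(200, 1, 8))

# Sensitivity, specificity, and balanced accuracy at a single threshold
threshold_metrics <- function(observed, predicted, threshold) {
  called_suitable <- predicted > threshold
  sensitivity <- sum(called_suitable & observed == 1) / sum(observed == 1)
  specificity <- sum(!called_suitable & observed == 0) / sum(observed == 0)
  data.frame(
    Threshold = threshold,
    Sensitivity = sensitivity,
    Specificity = specificity,
    BalancedAccuracy = (sensitivity + specificity) / 2
  )
}

results <- do.call(
  rbind,
  lapply(c(0.1, 0.3, 0.5), function(t) threshold_metrics(observed, predicted, t))
)
results
```

Raising the threshold can only lower sensitivity and raise specificity, which is the tradeoff described above.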
The thresholds we used:
- Red mangrove: 0.5
- Black mangrove: 0.5
- White mangrove: 0.5
These are the Youden-optimized thresholds that maximize balanced accuracy for each species. A site is classified as “suitable” if its predicted probability exceeds this value.
This isn’t a technical decision — it’s a values decision. Different thresholds reflect different priorities, and there’s no single right answer. The plot below shows six threshold strategies for each species. Moving left (lower thresholds) catches more potential sites but includes more that won’t pan out. Moving right (higher thresholds) is more selective but misses more opportunities. The y-axis shows balanced accuracy — how well the threshold performs at correctly identifying both suitable and unsuitable sites.
Here’s what each strategy means:
Liberal (0.1) sets a low bar. If the model sees any reasonable signal, it flags the site. This catches nearly everything that could work, but you’ll need to verify sites in the field. Use this when you’re screening a large area and don’t want to miss anything.
Prevalence sets the threshold based on how common mangroves are in our survey data. It’s a data-driven approach that tends to be inclusive. Results are similar to Liberal.
Youden finds the threshold that best balances catching good sites and avoiding bad ones: it maximizes Youden’s J statistic (sensitivity + specificity − 1). For all three species, this lands around 0.1–0.2 and achieves the highest balanced accuracy. This is a good default if you don’t have strong preferences.
Closest to Top Left is similar to Youden. It finds the threshold closest to the top-left corner of the ROC curve, which represents “perfect” classification. Results are nearly identical.
Moderate (0.3) is a fixed middle-ground threshold. It’s more selective than Youden but doesn’t require as much certainty as Conservative. Balanced accuracy drops somewhat.
Conservative (0.5) sets a high bar. The model needs to be confident before flagging a site. You’ll miss some opportunities, but the sites you do identify have stronger predicted suitability. Use this when resources are limited and you want to invest in sure things.
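The three data-driven strategies can be sketched in base R as follows. The predictions are simulated, and we assume the Prevalence threshold is simply the proportion of presences in the data, which is one common definition.

```r
set.seed(5)

# Simulated data: 40 presences and 200 background points with predicted probabilities
observed <- c(rep(1, 40), rep(0, 200))
predicted <- c(rbeta(40, 3, 3), rbeta(200, 1, 8))

# Evaluate every observed probability as a candidate threshold
candidates <- sort(unique(predicted))
sens <- sapply(candidates, function(t) mean(predicted[observed == 1] > t))
spec <- sapply(candidates, function(t) mean(predicted[observed == 0] <= t))

# Youden: maximize J = sensitivity + specificity - 1
youden <- candidates[which.max(sens + spec - 1)]

# Closest to Top Left: minimize distance to the ROC corner (sensitivity = 1, specificity = 1)
closest_top_left <- candidates[which.min((1 - sens)^2 + (1 - spec)^2)]

# Prevalence: proportion of presences (assumed definition)
prevalence <- mean(observed)

round(c(Liberal = 0.1, Prevalence = prevalence, Youden = youden,
        ClosestTopLeft = closest_top_left, Moderate = 0.3, Conservative = 0.5), 3)
```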
For the maps below, we used three thresholds that span this range: Liberal (0.1) for the inclusive view, Youden for the balanced view, and Conservative (0.5) for the selective view. You can toggle between them to see how the mapped suitability changes.
AllThresholdResults %>%
  mutate(Method = factor(Method, levels = c("Liberal", "Prevalence", "Youden",
                                            "ClosestTopLeft", "Moderate", "Conservative"))) %>%
  ggplot(aes(x = Threshold, y = BalancedAccuracy, color = Species)) +
  geom_line(aes(group = Species), alpha = 0.5) +
  geom_point(aes(shape = Method), size = 4) +
  facet_wrap(~ Species) +
  scale_color_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  scale_shape_manual(values = c("Liberal" = 16, "Prevalence" = 17, "Youden" = 8,
                                "ClosestTopLeft" = 15, "Moderate" = 18, "Conservative" = 4)) +
  theme_minimal() +
  labs(title = "Threshold Strategies by Species",
       x = "Threshold",
       y = "Balanced Accuracy") +
  theme(legend.position = "bottom") +
  guides(color = guide_legend(nrow = 1), shape = guide_legend(nrow = 2))
The models predict based on environmental conditions alone. They identify areas with the right combination of wetness, water accumulation, and elevation — but those conditions exist beyond coastal mangrove habitat. Inland wetlands and river valleys show up as “suitable” because they share similar hydrological characteristics.
To create the final map, we applied two steps:
Step 1: Classification. We converted the continuous probability predictions into suitable/unsuitable classifications using the thresholds that maximized balanced accuracy for each species. For each pixel, we asked: does the predicted probability exceed the threshold? If yes, mark it suitable for that species. We then combined the three species layers into a single map showing all possible combinations — Red only, Black only, White only, Red + Black, Red + White, Black + White, or all three. This lets restoration practitioners see at a glance which species are appropriate for different locations.
Step 2: Coastal masking. We considered using land cover to remove unsuitable areas, but it proved unreliable — adjacent pixels often have vastly different classifications due to raster artifacts. Instead, we created a mask using a 1000-meter inward buffer from Grenada’s national administrative boundary (CDEMA, 2024). This buffer distance is consistent with approaches used in other mangrove mapping studies. The mask removes inland areas while retaining the coastal fringe where mangroves can realistically establish.
The final map shows species suitability only within this coastal zone.
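The two steps can be sketched on a handful of pixels. The probabilities, the placeholder thresholds, and the `dist_to_coast_m` column are all hypothetical; the real workflow operates on rasters and uses a buffer polygon rather than a distance column.

```r
# Hypothetical predicted probabilities and distance to coast for five pixels
pixels <- data.frame(
  Red = c(0.62, 0.05, 0.55, 0.55, 0.70),
  Black = c(0.10, 0.58, 0.61, 0.52, 0.66),
  White = c(0.08, 0.12, 0.07, 0.57, 0.60),
  dist_to_coast_m = c(120, 300, 850, 400, 2400)
)

# Placeholder thresholds for illustration only
thresholds <- c(Red = 0.5, Black = 0.5, White = 0.5)
species <- names(thresholds)

# Step 1: classify each species, then label the combination per pixel
suitable <- sapply(species, function(s) pixels[[s]] > thresholds[[s]])
pixels$Combination <- apply(suitable, 1, function(row) {
  if (any(row)) paste(species[row], collapse = " + ") else "None"
})

# Step 2: coastal mask. Drop pixels more than 1000 m from the coast
pixels$Combination[pixels$dist_to_coast_m > 1000] <- NA

pixels[, c("dist_to_coast_m", "Combination")]
```

The last pixel is suitable for all three species on environmental grounds but is masked out because it lies beyond the coastal zone.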
The interactive map below shows species combination suitability across Grenada’s coastal zone using the three threshold strategies described above. You can toggle between Liberal, Youden, and Conservative to see how the mapped suitability changes depending on your priorities.
The map also shows our survey sites — polygons indicating vegetation type mapped in the field — and individual survey plots where we collected the data used to build these models. Comparing the model predictions to actual vegetation helps validate the approach and identify areas of agreement or uncertainty.
# Show in the RMarkdown
webMap
These models identify where environmental conditions are suitable for mangrove establishment across Grenada’s coastal zone. The key findings:
Elevation matters most. Across all three species, elevation emerged as the strongest predictor — not because mangroves respond to height directly, but because elevation integrates tidal flooding, salinity, and drainage. Lower coastal areas have conditions mangroves need.
TWI adds predictive power. The Topographic Wetness Index — a measure of groundwater accessibility — helps distinguish suitable from unsuitable sites, especially for Red mangrove. Wetter sites are more likely to support mangroves.
Establishment is predictable; growth is not. We can predict where mangroves are likely to establish with reasonable confidence (45–59% of variation explained). But predicting how tall they’ll grow is much harder. Growth performance depends on factors we didn’t measure — genetics, microsite conditions, disturbance history.
The map is a starting point, not an answer. The species combination map shows where environmental conditions could support each species. It doesn’t account for land ownership, accessibility, community priorities, or site-specific conditions that only field visits can reveal. Use it to narrow down candidate sites, then ground-truth before planting.
For practitioners ready to explore restoration sites, visit the interactive map to find suitable locations across Grenada, Carriacou, and Petite Martinique.