Introduction

Where should you plant mangroves? And which species will thrive at a given site?

This document walks through how we built the species suitability maps for Grenada’s Mangrove Innovation Hub. We used field data from survey plots across Grenada combined with landscape-scale environmental data to model where each of the three mangrove species (Red, Black, and White) is likely to establish and how well it might grow.

This work was developed by Gaea Conservation Network as part of “Developing a standardized approach for mangrove conservation and restoration in Grenada, Carriacou and Petite Martinique,” with support from the Grenada Fund for Conservation and KIDO Foundation, funded by GEF-SGP Grenada. Field data from baseline sites was also collected under “Restoring climate refuges for migratory birds and building community engagement with nature,” funded by Environment and Climate Change Canada (ECCC).

The goal here is transparency. We want practitioners, researchers, and anyone curious to understand the decisions we made, why we made them, and what the models actually tell us. If you just want to use the map, head to [Explore the Map]. If you want to understand how we got there, read on.

The Environmental Data

Mangrove species don’t establish randomly across the landscape. They respond to environmental conditions — how wet the soil is, how often it floods, how much freshwater versus saltwater influence there is. To predict where each species is likely to thrive, we need to capture these conditions across the entire landscape.

We used five environmental variables derived from publicly available spatial data.

The Variables

  • Topographic Wetness Index (TWI) measures how likely water is to accumulate at a given location based on the shape of the land. Higher values mean wetter conditions — areas where groundwater is more accessible and soils stay saturated longer.

  • Flow Accumulation (FA) tracks how surface water moves across the landscape. Higher values indicate areas where water collects from upslope — channels, drainage paths, and low points that receive runoff.

  • Elevation integrates multiple environmental factors at once. In coastal wetlands, elevation determines how often a site floods with tidal water, how salty the soil becomes, and how well it drains. It’s not that mangroves respond to elevation directly — they respond to the conditions that elevation creates.

  • Distance to Coast captures marine influence. Sites closer to the coast experience more saltwater exposure, tidal energy, and wave action.

  • Distance to River captures freshwater influence. Sites near rivers receive more freshwater input, sediment, and nutrients — conditions that favor different species assemblages than purely marine sites.

Data Sources

The foundation for this analysis is a 5-meter resolution Digital Elevation Model (DEM) collected through LiDAR survey in 2017 as part of the Regional Disaster Vulnerability Reduction Project (DVRP), funded by the World Bank and Climate Investment Fund. This survey, conducted by Fugro, covered Grenada, Carriacou, and Petite Martinique.

From the DEM, we derived two hydrological layers in SAGA GIS: Topographic Wetness Index and Flow Accumulation. These capture how water moves across and accumulates on the landscape based on terrain shape.

Distance to Coast was calculated from administrative boundary shapefiles for Grenada. Distance to River was calculated from a river network layer also produced through the DVRP project.

Variables We Excluded

  • Soil type: We considered including soil characteristics, since they influence drainage and root conditions. However, the available soil data had substantial gaps and was not digitized for Carriacou and Petite Martinique, so we excluded it from the final models.

  • Land use/land cover: Although land cover data was available from the DVRP project, we chose not to include it as a predictor for two reasons. First, land cover reflects human decisions — where people have built or cleared — rather than underlying environmental suitability. A site classified as “developed” might still have the right conditions for mangroves if it were restored. Second, the land cover raster has classification artifacts where adjacent pixels can have vastly different values, making it unreliable for modeling at fine scales. Instead, we use land cover as a post-processing filter to mask out clearly unsuitable areas like buildings and roads.

Processing Decisions

Environmental variables often have skewed distributions or extreme values that can cause problems in modeling. Flow accumulation, for example, ranges from near zero in most places to very high values along drainage channels. Raw elevation values span from sea level to mountain peaks. To make these variables suitable for modeling, we applied two steps.

  • Transformation: We applied the inverse hyperbolic sine (asinh) transformation to all continuous variables. This works like a log transformation — compressing extreme values while preserving the relative ordering — but unlike log, it handles zeros and negative values naturally. This is important for variables like elevation where values near zero are meaningful.

  • Scaling: After transformation, we standardized each variable using the mean and standard deviation calculated from the full landscape raster. This puts all variables on a common scale, which makes it easier to compare their effects in models. A one-unit change in scaled TWI represents the same relative shift as a one-unit change in scaled elevation.

Note: We calculated scaling statistics from the landscape rasters, not just from our survey points. This keeps the survey data and the prediction rasters on the same scale, so a given scaled value means the same thing when the models are fitted and when they predict across the full landscape.
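The two processing steps can be sketched in a few lines of base R. This is a minimal illustration on made-up numbers, not the project code: the vectors `landscape_fa` and `plot_fa` are hypothetical stand-ins for raster cell values and values extracted at survey plots.

```r
# Hypothetical values standing in for raster cells (not the project's data)
landscape_fa <- c(0, 1, 3, 10, 50, 400, 12000)  # flow accumulation across the landscape
plot_fa      <- c(3, 50, 400)                   # values extracted at survey plots

# Step 1: inverse hyperbolic sine; defined at zero and for negative values
landscape_t <- asinh(landscape_fa)
plot_t      <- asinh(plot_fa)

# Step 2: standardize with the LANDSCAPE mean and SD, not the plot mean and SD
mu    <- mean(landscape_t)
sigma <- sd(landscape_t)

landscape_scaled <- (landscape_t - mu) / sigma
plot_scaled      <- (plot_t - mu) / sigma

asinh(0)  # 0, whereas log(0) is -Inf
```

Because the plots and the landscape share `mu` and `sigma`, a plot that sits on a given cell gets exactly the scaled value the model will see for that cell at prediction time.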

Environmental Rasters

Here’s what each variable looks like across the landscape. Toggle between tabs to compare.

Scaled Flow Accumulation

Scaled Topographic Wetness Index

Scaled Elevation

Scaled Distance to Coast

Scaled Distance to River

Variable Distributions

After transformation and scaling, here’s how values are distributed across the landscape. These histograms show the range of conditions the models will encounter when predicting across Grenada.

Scaled Flow Accumulation

Scaled Topographic Wetness Index

Scaled Elevation

Scaled Distance to Coast

Scaled Distance to River

Correlation Matrix

Before modeling, we checked whether our environmental variables were measuring distinct information or duplicating each other.

Correlations above 0.7 would indicate redundancy — variables so similar they’d compete in the model. Here, most correlations are moderate or weak, meaning each variable contributes something distinct. The strongest relationship is between Elevation and Distance to Coast (r = 0.58), which makes intuitive sense — lower areas tend to be closer to the coast — but they’re not so correlated that we need to drop one.
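A redundancy check of this kind might look like the sketch below. The data are simulated, the variable names are illustrative, and Pearson correlation is an assumption (the text reports r values but does not name the method).

```r
set.seed(1)
n <- 500

# Simulated, already-scaled predictors (hypothetical; not the project's rasters)
elevation  <- rnorm(n)
dist_coast <- 0.6 * elevation + rnorm(n, sd = 0.8)  # moderately related to elevation
twi        <- rnorm(n)
env <- data.frame(elevation, dist_coast, twi)

# Pairwise Pearson correlations, ignoring missing values pair by pair
cor_mat <- cor(env, use = "pairwise.complete.obs")

# Flag each pair once (upper triangle) when |r| exceeds the 0.7 redundancy cutoff
high <- which(abs(cor_mat) > 0.7 & upper.tri(cor_mat), arr.ind = TRUE)
flagged <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]],
  r    = cor_mat[high]
)

round(cor_mat, 2)
flagged  # an empty table means no pair crossed the cutoff
```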

The Field Data

The environmental layers tell us about conditions across the landscape. But to build models that predict where mangroves occur and how well they grow, we need field observations — actual measurements of which species are present and how they’re performing at real sites.

Survey Plots

Field data comes from two projects. Under the ECCC-funded project “Restoring climate refuges for migratory birds and building community engagement with nature,” we established 15 transects with 45 plots across four reference sites: Levera, Westerhall, Dover, and David Bay. Under the GEF-SGP project, we added surveys at additional sites including Beausejour and Lans Aux Epines.

At each plot, we recorded:

  • Presence of each species — whether Red, Black, or White mangroves occurred
  • Basal area — calculated from tree counts in diameter size classes, giving a measure of stand density
  • Tree height — recorded for each species present

These measurements let us ask two questions: Where does each species occur? And where it occurs, how well does it grow?

Background Points

Species distribution models need contrast — not just locations where a species is present, but also locations where it could occur but doesn’t. Without this, the model can’t distinguish suitable from unsuitable conditions.

Our survey plots are all in mangrove habitat. To provide contrast, we generated 200 random background points distributed across Grenada’s landscape. These represent “available” environmental conditions — places where we extracted the same environmental variables but did not observe mangroves.

Adding background points lets the presence models learn what distinguishes mangrove sites from the broader landscape, rather than just describing variation within mangrove areas.
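As a rough sketch of how presence records and background points can be stacked into one modeling table: the data frames `landscape` and `plots` below are simulated placeholders, and sampling rows of a cell table stands in for sampling random locations from the rasters.

```r
set.seed(42)

# Hypothetical landscape table: one row per raster cell
landscape <- data.frame(
  cell      = 1:5000,
  TWI       = rnorm(5000),
  Elevation = rnorm(5000)
)

# Hypothetical survey plots with a presence flag for one species
plots <- data.frame(
  TWI        = rnorm(45, mean = 1),
  Elevation  = rnorm(45, mean = -1),
  RedPresent = rbinom(45, 1, 0.6)
)

# Draw 200 random background cells and code them as 0 (species not observed)
bg_idx     <- sample(nrow(landscape), size = 200)
background <- landscape[bg_idx, c("TWI", "Elevation")]
background$RedPresent <- 0L

# Stack plots and background into one table for the presence model
model_data <- rbind(plots, background)
table(model_data$RedPresent)
```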

# Assumes the tidyverse and sf packages were loaded in an earlier chunk.
# Predictor column names, labelled with the display names used for the rasters
plotVariables <- VegDataCompleteBinomial %>%
  st_drop_geometry() %>%
  dplyr::select(FA:DistanceToRiver) %>%
  colnames()

names(plotVariables) <- names(RasterList)

# One tab per variable: density of each predictor at presence vs. background points
for (i in seq_along(plotVariables)) {
  cat("\n\n###", names(plotVariables)[i], "\n\n")

  print(
    VegDataCompleteBinomial %>%
      pivot_longer(cols = RedPresent:WhitePresent, names_to = "Species", values_to = "Status") %>%
      mutate(
        Species = str_trim(str_remove_all(Species, "Present")),
        Status = ifelse(Status == 0, "Background", "Present")
      ) %>%
      ggplot(aes(x = !!sym(plotVariables[i]), fill = Status)) +
      geom_density(alpha = 0.45, color = NA) +
      theme_minimal() +
      scale_fill_viridis_d() +
      facet_grid(~Species) +
      labs(
        title = paste("Distribution of", names(plotVariables)[i]),
        x = "Value",
        y = "Density"
      ) +
      theme(legend.position = "bottom")
  )

  cat("\n\n")
}

Scaled Flow Accumulation

Scaled Topographic Wetness Index

Scaled Elevation

Scaled Distance to Coast

Scaled Distance to River

The Modeling Approach

We want to answer two distinct questions: Where can each species establish? And where it establishes, how well does it grow? These are different processes — so we modeled them separately.

Two Questions, Two Models

  • Presence models predict the probability that a species occurs at a given location. We used logistic regression (binomial distribution) with our survey plots plus background points. The outcome is binary — present or absent — and the model estimates how environmental conditions shift the odds of occurrence.

  • Height models predict how tall trees grow at locations where a species is already present. We used gamma regression, which handles the right-skewed, strictly positive distribution typical of size measurements. These models only use plots where the species was actually found — we can’t measure tree height where there are no trees.

This separation matters for restoration planning. A site might have high probability of establishment but poor growth potential, or vice versa. The two models capture different aspects of suitability.
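To make the two-model structure concrete, here is a minimal sketch on simulated data. The column names, the effect sizes, and the log link for the gamma model are assumptions made for illustration; the text specifies the families but not the link.

```r
set.seed(7)
n <- 300

# Simulated, already-scaled predictors (hypothetical)
dat <- data.frame(TWI = rnorm(n), Elevation = rnorm(n))

# Presence: more likely at wet, low sites
dat$Present <- rbinom(n, 1, plogis(-0.5 + 1.2 * dat$TWI - 1.5 * dat$Elevation))

# Height: only defined where the species is present; right-skewed and positive
dat$Height <- ifelse(
  dat$Present == 1,
  rgamma(n, shape = 4, rate = 4 / exp(1.5 + 0.1 * dat$TWI)),
  NA
)

# Model 1: where does the species occur? (all rows, binary outcome)
presence_fit <- glm(Present ~ TWI + Elevation, family = binomial, data = dat)

# Model 2: how tall does it grow? (presence rows only, positive outcome)
height_fit <- glm(Height ~ TWI + Elevation,
                  family = Gamma(link = "log"),
                  data = subset(dat, Present == 1))

# The two models answer different questions about the same candidate site
new_site <- data.frame(TWI = 1, Elevation = -1)
p_present <- predict(presence_fit, new_site, type = "response")
exp_height <- predict(height_fit, new_site, type = "response")
```

Here `p_present` is a probability between 0 and 1, while `exp_height` is an expected height on the original measurement scale, conditional on the species being present.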

Why Logistic Regression?

We could have used tree-based models like gradient boosting machines (GBMs), which often perform better for prediction and can handle missing data natively — meaning we could have included the incomplete soil layer. But we chose logistic and gamma regression because we wanted explainable models.

With regression, we can say exactly how each environmental variable influences the outcome: a one-unit increase in TWI shifts the odds of occurrence by this much. We can see whether elevation has a positive or negative effect, and whether that effect changes when combined with other variables. This transparency matters when the goal is to understand what drives mangrove distribution, not just predict it.

For restoration planning, knowing why a site is suitable is as valuable as knowing that it’s suitable.
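The "shifts the odds by this much" reading comes from exponentiating a logistic regression coefficient. A small sketch on simulated data (the numbers are illustrative, not results from this project):

```r
set.seed(11)
n <- 400

# Simulated scaled TWI and presence (hypothetical)
TWI     <- rnorm(n)
Present <- rbinom(n, 1, plogis(-1 + 0.8 * TWI))

fit <- glm(Present ~ TWI, family = binomial)

# exp(coefficient) is the multiplicative change in odds per one-unit increase
odds_ratio <- exp(coef(fit)["TWI"])

# Same quantity computed directly from predicted probabilities at TWI = 0 and 1
p0 <- predict(fit, data.frame(TWI = 0), type = "response")
p1 <- predict(fit, data.frame(TWI = 1), type = "response")
odds_ratio_check <- (p1 / (1 - p1)) / (p0 / (1 - p0))
```

`odds_ratio` and `odds_ratio_check` agree, which is why a single coefficient is enough to describe the effect: the odds are multiplied by the same factor for every one-unit step in the predictor, wherever on the scale that step starts.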

Testing Model Complexity

Environmental variables can influence mangrove distribution in simple or complex ways. Elevation alone might explain where Black mangroves occur. Or maybe elevation matters differently depending on how wet the soil is — an interaction effect.

We tested this systematically by fitting models of increasing complexity:

  • Single effects — each variable alone
  • Additive effects — variables combined without interactions
  • Two-way interactions — pairs of variables that modify each other’s effects

For each species and model type (presence or height), we compared all combinations using AIC (Akaike Information Criterion). Lower AIC indicates better model performance, balancing fit against complexity. Models within 2 AIC units of the best are considered essentially equivalent.
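The comparison loop can be sketched as follows. The candidate formulas and the simulated data are illustrative only; the actual analysis enumerated far more combinations.

```r
set.seed(3)
n <- 300

# Simulated, already-scaled predictors and a presence outcome (hypothetical)
dat <- data.frame(FA = rnorm(n), TWI = rnorm(n), Elevation = rnorm(n))
dat$Present <- rbinom(n, 1, plogis(-0.5 + 1.0 * dat$TWI - 1.2 * dat$Elevation))

# A few candidates of increasing complexity
candidates <- c(
  single      = "Present ~ Elevation",
  additive    = "Present ~ TWI + Elevation",
  additive3   = "Present ~ FA + TWI + Elevation",
  interaction = "Present ~ TWI * Elevation"
)

# Fit every candidate with the same family and data
fits <- lapply(candidates, function(f) {
  glm(as.formula(f), family = binomial, data = dat)
})

# Rank by AIC; Delta AIC is the distance from the best (lowest) AIC
comparison <- data.frame(model = names(fits), AIC = sapply(fits, AIC))
comparison$DeltaAIC   <- comparison$AIC - min(comparison$AIC)
comparison            <- comparison[order(comparison$DeltaAIC), ]
comparison$Equivalent <- comparison$DeltaAIC <= 2  # the 2-unit rule of thumb

comparison
```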

Models We Excluded

Not all models converged successfully. We excluded models that:

  • Had NA coefficients (failed to estimate)
  • Showed signs of complete separation (coefficients > 10, standard errors > 100, or warnings about fitted probabilities of exactly 0 or 1)
  • Produced negative or infinite pseudo-R² values
  • Failed with convergence errors

These issues typically occurred with more complex models or those including distance variables, which had limited variation across our study sites. The table below shows only the models that passed these checks; its Delta AIC column shows how far each model falls from the best model for its species and response type.
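A screening function along these lines could apply the exclusion rules to a fitted model. It is a sketch under several assumptions: `passes_checks` is a hypothetical helper, the coefficient rule is applied to absolute values, and pseudo-R² is taken to be the deviance-based form (1 minus residual deviance over null deviance), which the text does not specify. It does not capture warning messages, which would need `withCallingHandlers()` or similar.

```r
# Hypothetical helper: TRUE if a fitted glm passes the exclusion rules
passes_checks <- function(fit, max_coef = 10, max_se = 100) {
  est <- coef(fit)
  if (anyNA(est)) return(FALSE)        # coefficients that failed to estimate
  if (!fit$converged) return(FALSE)    # IRLS did not converge

  # Very large estimates or standard errors suggest complete separation
  se <- sqrt(diag(vcov(fit)))
  if (any(abs(est) > max_coef) || any(se > max_se)) return(FALSE)

  # Deviance-based pseudo-R2 (an assumed definition)
  pseudo_r2 <- 1 - fit$deviance / fit$null.deviance
  is.finite(pseudo_r2) && pseudo_r2 >= 0
}

# A perfectly separated toy data set: every x below 0 is absent, above 0 present
x <- c(-3, -2, -1, 1, 2, 3)
y <- c(0, 0, 0, 1, 1, 1)
separated <- suppressWarnings(glm(y ~ x, family = binomial))

# A well-behaved model for contrast
set.seed(5)
x2 <- rnorm(200)
y2 <- rbinom(200, 1, plogis(0.8 * x2))
ok <- glm(y2 ~ x2, family = binomial)

passes_checks(separated)  # FALSE
passes_checks(ok)         # TRUE
```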

Converged Models

We tested 4,398 model combinations across three species and two response types (presence and height). Of these, 137 converged successfully — those without convergence errors, complete separation, or other fitting issues. The table below shows all converged models.

A few patterns stand out:

  • Presence models perform well. For all three species, the top presence models explain 45–59% of the variation in occurrence (pseudo-R²). Red mangrove presence is best predicted by the combination of Flow Accumulation, TWI, and Elevation (pseudo-R² = 58.5%), with interaction terms adding marginal improvement. Black mangrove shows a similar pattern with the same additive model performing best (pseudo-R² = 45.3%). White mangrove presence is driven primarily by TWI and Elevation (pseudo-R² = 55.3%), with Flow Accumulation adding little explanatory power.
  • Height models mostly fail. For Red mangrove, the top height models explain almost nothing, with pseudo-R² values under 2%. Distance to Coast appears in these models, but they are essentially no better than predicting the mean height for every tree. White mangrove height fares slightly better (pseudo-R² = 10.4%) with a Distance to Coast × Flow Accumulation interaction, but this is still weak. Black mangrove height models are absent from the top models entirely: none met our quality criteria.

What this means: We can predict where mangroves establish with reasonable confidence, but predicting how tall they grow is much harder. Growth performance appears to be driven by factors we didn’t measure — genetics, microsite conditions, disturbance history, or fine-scale soil variation.

SaveModelsAll %>% 
  ungroup() %>% 
  filter(is.finite(PseudoR2), PseudoR2 >= 0, ModelStatus == "Success") %>%
  group_by(Species, ModelType) %>%
  mutate(DeltaAIC = AIC - min(AIC, na.rm = TRUE),
         DeltaBIC = BIC - min(BIC, na.rm = TRUE)) %>% 
  ungroup() %>%
  mutate(
    Response = case_when(
      ModelType == "Binomial" ~ "Presence",
      ModelType == "Gamma" ~ "Height"
    ),
    PseudoR2 = scales::percent(round(PseudoR2, 3)),
    DeltaAIC = round(DeltaAIC, 2),
    DeltaBIC = round(DeltaBIC, 2)
  ) %>%
  dplyr::select(Species, Response, Predictors = ModelName, PseudoR2, DeltaAIC, DeltaBIC) %>%
  DT::datatable(
    caption = "Model comparison results for all species and response types. Delta AIC/BIC show difference from the best model within each species-response combination.",
    filter = "top",
    options = list(
      pageLength = 15,
      autoWidth = TRUE
    )
  )

Model Coefficients

These plots show how each environmental variable influences the outcome. For presence models, positive values mean the variable increases the probability of finding that species; negative values mean it decreases. For height models, positive values mean taller trees; negative values mean shorter trees.

The error bars show 90% confidence intervals. When the interval doesn’t cross zero (the dashed line), the effect is statistically significant at the 10% level, so we’re reasonably confident the variable has a real influence. Grey points indicate non-significant effects where we can’t rule out that the true effect is zero.
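For readers who want to reproduce this kind of interval, one common construction is the Wald interval: estimate plus or minus a normal quantile times the standard error. Whether the plots use Wald or profile-likelihood intervals is not stated, so treat the sketch below (on simulated data) as one plausible version.

```r
set.seed(21)
n <- 300

# Simulated, already-scaled predictors; only TWI has a real effect (hypothetical)
dat <- data.frame(TWI = rnorm(n), FA = rnorm(n))
dat$Present <- rbinom(n, 1, plogis(-0.5 + 1.0 * dat$TWI))

fit <- glm(Present ~ TWI + FA, family = binomial, data = dat)

est <- coef(summary(fit))
z   <- qnorm(0.95)  # two-sided 90% interval leaves 5% in each tail

ci <- data.frame(
  term     = rownames(est),
  estimate = est[, "Estimate"],
  lower    = est[, "Estimate"] - z * est[, "Std. Error"],
  upper    = est[, "Estimate"] + z * est[, "Std. Error"]
)

# "Significant" here means the 90% interval excludes zero
ci$significant <- ci$lower > 0 | ci$upper < 0
ci
```

The same intervals are returned by `confint.default(fit, level = 0.90)`.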

A few patterns are consistent across species:

  • TWI has a positive effect on presence for all three species. Wetter sites — where groundwater is more accessible — are more likely to have mangroves. This makes ecological sense: mangroves need saturated soils.
  • Elevation has a negative effect on presence for all three species. Lower sites are more likely to have mangroves. Remember, elevation is a proxy for tidal flooding, salinity, and drainage. Lower areas flood more frequently and retain more salt water — conditions mangroves tolerate better than upland vegetation.
  • Flow Accumulation has a weaker, negative effect where it appears. Sites with very high surface water flow may be too dynamic for mangrove establishment, or this variable may be capturing drainage channels rather than suitable wetland habitat.
  • Height models tell a different story. Red mangrove height shows no significant relationship with Distance to Coast: the confidence interval spans zero. White mangrove height does respond to Distance to Coast and its interaction with Flow Accumulation, but even this model explains relatively little variation. Growth performance appears driven by factors we didn’t measure.

# Print each coefficient plot under its own tab heading
for (i in seq_along(AllCoefficientPlots)) {
  cat("### ", names(AllCoefficientPlots)[i], "\n\n")
  print(AllCoefficientPlots[[i]])
  cat("\n\n")
}

Black Binomial Coefficients

Red Binomial Coefficients

White Binomial Coefficients

From Models to Maps

The models give us predicted probabilities for every location across the landscape. Since the height models explained very little variation — under 2% for Red mangrove and only 10% for White mangrove — we focus here on presence predictions only. These tell us where each species is likely to establish based on environmental conditions.

Presence Probability shows the likelihood of successful species establishment. Higher values (yellow) indicate areas where the model predicts a species is more likely to occur. Lower values (purple) indicate less suitable conditions. These predictions are based on the top-performing presence models for each species.

Maps

Red Mangrove

Black Mangrove

White Mangrove

Threshold Selection

The presence models output probabilities — a site might have a 0.65 probability of Red mangrove occurrence. To create a map of “suitable” versus “unsuitable” areas, we need to choose a cutoff: above what probability do we call a site suitable?

This decision involves a tradeoff. A low threshold (say, 0.1) will capture most of the places where mangroves actually occur, but it will also flag many unsuitable sites as “suitable” — false positives. A high threshold (say, 0.5) will be more conservative, only flagging sites with strong predicted suitability, but it might miss genuinely suitable areas — false negatives.

Two metrics, plus a summary that combines them, help us evaluate this tradeoff:

  • Sensitivity — of all the places where mangroves actually occur, what proportion did the model correctly identify as suitable? High sensitivity means we’re not missing good sites.
  • Specificity — of all the places where mangroves don’t occur, what proportion did the model correctly identify as unsuitable? High specificity means we’re not flagging bad sites as good.
  • Balanced accuracy is the average of these two, giving equal weight to both types of errors.
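These definitions translate directly into code. The sketch below uses ten made-up sites; `classify_metrics` is a hypothetical helper, and a site counts as "suitable" when its probability exceeds the threshold.

```r
# Hypothetical helper: metrics for one threshold
classify_metrics <- function(observed, predicted_prob, threshold) {
  predicted <- predicted_prob > threshold

  tp <- sum(predicted & observed == 1)    # present, flagged suitable
  fn <- sum(!predicted & observed == 1)   # present, missed
  tn <- sum(!predicted & observed == 0)   # absent, correctly rejected
  fp <- sum(predicted & observed == 0)    # absent, wrongly flagged

  sensitivity <- tp / (tp + fn)
  specificity <- tn / (tn + fp)

  c(sensitivity = sensitivity,
    specificity = specificity,
    balanced_accuracy = (sensitivity + specificity) / 2)
}

# Made-up data: 3 presence sites followed by 7 absence sites
observed <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
prob     <- c(0.70, 0.35, 0.15, 0.40, 0.20, 0.08, 0.05, 0.04, 0.02, 0.01)

low  <- classify_metrics(observed, prob, threshold = 0.1)
high <- classify_metrics(observed, prob, threshold = 0.5)

round(low, 3)   # sensitivity 1.000, specificity 0.714, balanced accuracy 0.857
round(high, 3)  # sensitivity 0.333, specificity 1.000, balanced accuracy 0.667
```

In this toy example the low threshold finds every presence site at the cost of two false positives, while the high threshold produces no false positives but misses two of the three presence sites.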

We tested six threshold strategies, ranging from data-derived cutoffs (Prevalence, Youden, Closest to Top Left) to fixed values (Liberal at 0.1, Moderate at 0.3, Conservative at 0.5). The plot below shows how balanced accuracy changes across thresholds for each species.

For all three species, lower thresholds produced higher balanced accuracy. This happens partly because our data has many more absence locations than presence locations.

The Youden-optimized thresholds, which maximize balanced accuracy, land around 0.1–0.2 for all three species. A site is classified as “suitable” if its predicted probability exceeds the threshold for that species.

Understanding the Threshold Tradeoff

Choosing a threshold isn’t only a technical decision; it’s also a values decision. Different thresholds reflect different priorities, and there’s no single right answer. The plot below shows six threshold strategies for each species. Moving left (lower thresholds) catches more potential sites but includes more that won’t pan out. Moving right (higher thresholds) is more selective but misses more opportunities. The y-axis shows balanced accuracy: how well the threshold performs at correctly identifying both suitable and unsuitable sites.

Here’s what each strategy means:

  • Liberal (0.1) sets a low bar. If the model sees any reasonable signal, it flags the site. This catches nearly everything that could work, but you’ll need to verify sites in the field. Use this when you’re screening a large area and don’t want to miss anything.

  • Prevalence sets the threshold based on how common mangroves are in our survey data. It’s a data-driven approach that tends to be inclusive. Results are similar to Liberal.

  • Youden finds the threshold that best balances catching good sites and avoiding bad ones — it maximizes the statistical tradeoff. For all three species, this lands around 0.1–0.2 and achieves the highest balanced accuracy. This is a good default if you don’t have strong preferences.

  • Closest to Top Left is similar to Youden — it finds the threshold closest to “perfect” classification. Results are nearly identical.

  • Moderate (0.3) is a fixed middle-ground threshold. It’s more selective than Youden but doesn’t require as much certainty as Conservative. Balanced accuracy drops somewhat.

  • Conservative (0.5) sets a high bar. The model needs to be confident before flagging a site. You’ll miss some opportunities, but the sites you do identify have stronger predicted suitability. Use this when resources are limited and you want to invest in sure things.

For the maps below, we used three thresholds that span this range: Liberal (0.1) for the inclusive view, Youden for the balanced view, and Conservative (0.5) for the selective view. You can toggle between them to see how the mapped suitability changes.
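The six strategies can be computed from observed outcomes and predicted probabilities as in the sketch below. The probabilities are simulated and the search over candidate thresholds is a simple brute-force version written for illustration; it is not the project's code.

```r
set.seed(9)

# Simulated outcomes and predicted probabilities: 40 presences, 200 backgrounds
observed <- c(rep(1, 40), rep(0, 200))
prob     <- c(rbeta(40, 2, 4), rbeta(200, 1, 12))

# Evaluate every observed probability as a candidate cutoff
thresholds <- sort(unique(prob))
sens <- sapply(thresholds, function(t) mean(prob[observed == 1] > t))
spec <- sapply(thresholds, function(t) mean(prob[observed == 0] <= t))

youden_j      <- sens + spec - 1                       # Youden's J statistic
dist_top_left <- sqrt((1 - sens)^2 + (1 - spec)^2)     # distance to the perfect ROC corner

strategies <- c(
  Liberal        = 0.1,
  Prevalence     = mean(observed),                     # share of presences in the data
  Youden         = thresholds[which.max(youden_j)],
  ClosestTopLeft = thresholds[which.min(dist_top_left)],
  Moderate       = 0.3,
  Conservative   = 0.5
)

# Balanced accuracy achieved by each strategy
balanced_accuracy <- sapply(strategies, function(t) {
  (mean(prob[observed == 1] > t) + mean(prob[observed == 0] <= t)) / 2
})

round(cbind(threshold = strategies, balanced_accuracy), 3)
```

Because balanced accuracy equals (J + 1) / 2, the Youden threshold is by construction the one with the highest balanced accuracy among the six.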

AllThresholdResults %>%
  mutate(Method = factor(Method, levels = c("Liberal", "Prevalence", "Youden", 
                                             "ClosestTopLeft", "Moderate", "Conservative"))) %>%
  ggplot(aes(x = Threshold, y = BalancedAccuracy, color = Species)) +
  geom_line(aes(group = Species), alpha = 0.5) +
  geom_point(aes(shape = Method), size = 4) +
  facet_wrap(~ Species) +
  scale_color_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  scale_shape_manual(values = c("Liberal" = 16, "Prevalence" = 17, "Youden" = 8,
                                "ClosestTopLeft" = 15, "Moderate" = 18, "Conservative" = 4)) +
  theme_minimal() +
  labs(title = "Threshold Strategies by Species",
       x = "Threshold", 
       y = "Balanced Accuracy") +
  theme(legend.position = "bottom") +
  guides(color = guide_legend(nrow = 1), shape = guide_legend(nrow = 2))

Constraining to Coastal Zones

The models predict based on environmental conditions alone. They identify areas with the right combination of wetness, water accumulation, and elevation — but those conditions exist beyond coastal mangrove habitat. Inland wetlands and river valleys show up as “suitable” because they share similar hydrological characteristics.

To create the final map, we applied two steps:

  • Step 1: Classification. We converted the continuous probability predictions into suitable/unsuitable classifications, once for each of the three mapped threshold strategies (Liberal, Youden, and Conservative). For each pixel, we asked: does the predicted probability exceed the threshold? If yes, mark it suitable for that species. We then combined the three species layers into a single map showing all possible combinations: Red only, Black only, White only, Red + Black, Red + White, Black + White, or all three. This lets restoration practitioners see at a glance which species are appropriate for different locations.

  • Step 2: Coastal masking. We considered using land cover to remove unsuitable areas, but it proved unreliable — adjacent pixels often have vastly different classifications due to raster artifacts. Instead, we created a mask using a 1000-meter inward buffer from Grenada’s national administrative boundary (CDEMA, 2024). This buffer distance is consistent with approaches used in other mangrove mapping studies. The mask removes inland areas while retaining the coastal fringe where mangroves can realistically establish.

The final map shows species suitability only within this coastal zone.
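The two steps can be illustrated on a handful of cells. In the sketch below, plain vectors stand in for raster layers, and the probabilities, the 0.2 thresholds, and the distances are all made up; the real workflow operates on the prediction rasters and a buffered boundary polygon.

```r
# Hypothetical predicted probabilities for six cells
red   <- c(0.60, 0.05, 0.30, 0.70, 0.02, 0.55)
black <- c(0.10, 0.40, 0.35, 0.10, 0.03, 0.60)
white <- c(0.05, 0.02, 0.45, 0.80, 0.01, 0.04)

# Illustrative thresholds, not the fitted values
threshold <- c(Red = 0.2, Black = 0.2, White = 0.2)

# Hypothetical distance (m) of each cell inland from the national boundary
dist_to_boundary_m <- c(120, 300, 950, 400, 50, 2500)

# Step 1: classify each species, then label each cell with its combination
suitable <- cbind(
  Red   = red   > threshold["Red"],
  Black = black > threshold["Black"],
  White = white > threshold["White"]
)

combo <- apply(suitable, 1, function(row) {
  if (!any(row)) "None" else paste(names(row)[row], collapse = " + ")
})

# Step 2: keep only cells inside the 1000 m coastal zone
in_coastal_zone <- dist_to_boundary_m <= 1000
combo[!in_coastal_zone] <- NA

combo
```

The last cell would be labelled "Red + Black" on environmental grounds alone, but it sits 2500 m inland and is masked out.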

Final Maps

The interactive map below shows species combination suitability across Grenada’s coastal zone using the three threshold strategies described above. You can toggle between Liberal, Youden, and Conservative to see how the mapped suitability changes depending on your priorities.

The map also shows our survey sites — polygons indicating vegetation type mapped in the field — and individual survey plots where we collected the data used to build these models. Comparing the model predictions to actual vegetation helps validate the approach and identify areas of agreement or uncertainty.

# Show in the RMarkdown
webMap

Conclusion

These models identify where environmental conditions are suitable for mangrove establishment across Grenada’s coastal zone. The key findings:

  • Elevation matters most. Across all three species, elevation emerged as the strongest predictor — not because mangroves respond to height directly, but because elevation integrates tidal flooding, salinity, and drainage. Lower coastal areas have conditions mangroves need.

  • TWI adds predictive power. The Topographic Wetness Index — a measure of groundwater accessibility — helps distinguish suitable from unsuitable sites, especially for Red mangrove. Wetter sites are more likely to support mangroves.

  • Establishment is predictable; growth is not. We can predict where mangroves are likely to establish with reasonable confidence (45–59% of variation explained). But predicting how tall they’ll grow is much harder. Growth performance depends on factors we didn’t measure — genetics, microsite conditions, disturbance history.

  • The map is a starting point, not an answer. The species combination map shows where environmental conditions could support each species. It doesn’t account for land ownership, accessibility, community priorities, or site-specific conditions that only field visits can reveal. Use it to narrow down candidate sites, then ground-truth before planting.

For practitioners ready to explore restoration sites, visit the interactive map to find suitable locations across Grenada, Carriacou, and Petite Martinique.