LGEO2185: RF classification with tidymodels using GEE-derived predictors

Author

Kristof Van Oost & Antoine Stevens

Session learning objectives

By the end of this session, students should be able to:

  1. Understand the main building blocks of an ML implementation. This tutorial is based on the tidymodels packages: rsample, recipes, parsnip, workflows, tune, and yardstick.
  2. Build a reproducible classification pipeline (data split → recipe → model → workflow → cross-validation → tuning → final fit).
  3. Use k-fold cross-validation to estimate generalization performance and understand why it beats a single train/test split for model selection.
  4. Tune Random Forest hyperparameters (mtry, min_n, trees) and select a final model using a clear metric.
  5. Interpret the fitted model using variable importance and connect it to geography/remote sensing reasoning (what drivers make sense, and why).
  6. Understand the potential and limitations of decision trees and Random Forest approaches.

Introduction: Decision trees and Random Forest

A decision tree is a rule-based model that partitions the predictor space into rectangular regions using simple threshold rules. In this example, we model landslide occurrence (lslpts) using terrain predictors from the spDataLarge dataset. We fit a small tree (only two decision levels) so the rules remain easy to interpret. We then visualize the tree rules as threshold lines in a scatterplot.

data("lsl", "study_mask", package = "spDataLarge")
ta <- terra::rast(system.file("raster/ta.tif", package = "spDataLarge"))

library(rpart)
library(rpart.plot)
library(ggplot2)
library(patchwork)
library(png)
library(grid)

set.seed(1)

tree2 <- rpart(
  lslpts ~ slope + cplan + cprof + elev + log10_carea,
  data = lsl,
  method = "class",
  control = rpart.control(
    maxdepth = 2,
    cp = 0.0,
    minsplit = 20
  )
)

tree2
n= 350 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 350 175 FALSE (0.5000000 0.5000000)  
  2) slope< 34.66101 84  15 FALSE (0.8214286 0.1785714)  
    4) cplan>=-0.09504572 77  10 FALSE (0.8701299 0.1298701) *
    5) cplan< -0.09504572 7   2 TRUE (0.2857143 0.7142857) *
  3) slope>=34.66101 266 106 TRUE (0.3984962 0.6015038)  
    6) cplan>=-0.02002188 177  84 FALSE (0.5254237 0.4745763) *
    7) cplan< -0.02002188 89  13 TRUE (0.1460674 0.8539326) *
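The "threshold lines in a scatterplot" visualization promised in the introduction could be sketched as follows (a sketch, assuming `lsl` and `tree2` from the chunks above; the threshold values are copied from the printed tree):

```r
library(ggplot2)

# First split: slope at 34.66; second-level splits on cplan,
# each valid only on one side of the slope threshold
ggplot(lsl, aes(x = slope, y = cplan, colour = lslpts)) +
  geom_point(alpha = 0.6) +
  geom_vline(xintercept = 34.66101, linetype = "dashed") +
  annotate("segment", x = -Inf, xend = 34.66101,
           y = -0.09504572, yend = -0.09504572, linetype = "dotted") +
  annotate("segment", x = 34.66101, xend = Inf,
           y = -0.02002188, yend = -0.02002188, linetype = "dotted") +
  labs(colour = "Landslide") +
  theme_minimal()
```

Note how each second-level threshold applies only within one branch of the first split: that is what makes the partitions rectangular rather than global.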

To understand why decision trees behave differently than linear models, it helps to look at the geometry of the decision boundary in a 2-predictor space (X1–X2).

  • Linear models separate classes with (at most) a single straight line in predictor space.
  • Decision trees split the space using axis-aligned thresholds (vertical/horizontal cuts), creating rectangular regions.
  • As trees grow deeper, they can approximate curved/diagonal boundaries, but they do so via increasingly fine “steps”.

The figure below illustrates these ideas in four panels:

  1. Linear model: one straight boundary
  2. Same linear boundary, with illustrative tree-style axis splits overlaid
  3. Tree decision regions (rectangular partitions)
  4. A tiny tree with only two criteria (two splits)

Top-left: a linear decision boundary (one straight line). Top-right: the same linear boundary, with tree-style axis-aligned split lines overlaid (vertical/horizontal thresholds). Bottom-left: the resulting tree decision regions (rectangular partitions). Bottom-right: a minimal tree with only two split criteria (very interpretable, but limited flexibility). This illustrates the key geometric difference: linear models are global and smooth, while trees build step-wise, rectangular approximations by repeatedly applying simple threshold rules.

Note that decision trees can be applied to both classification and regression problems. In the regression case, the predicted value is the average of the response values within the partition.
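The regression case is easy to verify on a toy dataset (a quick sketch using only the rpart package loaded above): every terminal node predicts the mean response of its partition.

```r
library(rpart)

set.seed(1)
toy <- data.frame(x = runif(200))
toy$y <- ifelse(toy$x < 0.5, 1, 3) + rnorm(200, sd = 0.1)

# One-split regression tree ("anova" = regression mode)
fit <- rpart(y ~ x, data = toy, method = "anova",
             control = rpart.control(maxdepth = 1))

preds <- predict(fit, toy)

# Group the observations by their leaf prediction: each leaf's
# predicted value equals the mean of y within that partition
grp <- split(toy$y, preds)
sapply(names(grp), function(p) as.numeric(p) - mean(grp[[p]]))
# differences are ~0 for every leaf
```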

Decision trees are unstable (high variance): they are sensitive to small changes in the data. This is illustrated below, where we fit a tree on each of 30 random train/test splits. The figure shows that the training AUROC (see Session 4) is stable, while the test AUROC varies considerably.

library(rsample)
library(pROC)
library(purrr)

set.seed(1)

splits <- replicate(30, initial_split(lsl, prop = 0.75), simplify = FALSE)

aucs <- map_dfr(splits, function(spl) {
  train_dat <- training(spl)
  test_dat  <- testing(spl)
  m <- rpart(lslpts ~ slope + cplan + cprof + elev + log10_carea,
             data = train_dat,
             method = "class")
  prob_train <- predict(m, train_dat, type = "prob")[,2]
  prob_test  <- predict(m, test_dat, type = "prob")[,2]
  tibble(
    auc_train = as.numeric(auc(roc(train_dat$lslpts, prob_train, quiet = TRUE))),
    auc_test  = as.numeric(auc(roc(test_dat$lslpts, prob_test, quiet = TRUE)))
  )
})

ggplot(aucs) +
  geom_density(aes(x = auc_train, color = "Train")) +
  geom_density(aes(x = auc_test, color = "Test")) +
  theme_minimal()

Note

Advantages and Limitations of Decision Trees

Decision trees, whether used for regression or classification, offer several important advantages compared to more classical modelling approaches such as linear regression or logistic regression.

Advantages

  • Interpretability: trees are extremely easy to explain. In many cases, they are even more intuitive than linear models because they rely on simple if–then rules.
  • Human-like reasoning: the hierarchical structure of trees often resembles human decision-making: we tend to evaluate one criterion, then refine based on another.
  • Graphical representation: trees can be visualised directly as diagrams, making them accessible even to non-experts, especially when the tree is small and shallow.
  • Handling of qualitative predictors: trees naturally incorporate categorical variables without requiring dummy variables or contrast coding.

Limitations

  • Predictive performance: a single decision tree often does not achieve the same predictive accuracy as more flexible or regularised statistical learning methods.
  • Instability (non-robustness): trees can be sensitive to small changes in the data; a slight modification of the training dataset may result in a substantially different tree structure.

From Single Trees to Tree Ensembles: Many of these limitations can be addressed by combining multiple trees into an ensemble. Methods such as:

  • Bagging
  • Random Forests
  • Boosting

aggregate large numbers of trees to improve predictive performance and stability. A natural way to reduce the variance and increase the test-set accuracy of a statistical learning method is to take many training sets from the population, build a separate prediction model on each training set, and then average the resulting predictions. This is called bagging, and it is particularly useful for decision trees. Combining hundreds or even thousands of trees into a single prediction model has been shown to substantially improve accuracy.

Bagging improves accuracy, but at a cost: the ensemble is no longer a single, simple, interpretable tree. The gain in accuracy comes at the expense of interpretability.

Random Forests provide a further improvement over bagged trees: at each split, only a random sample of the predictors is allowed to be considered. This may seem counter-intuitive, but it decorrelates the trees and results in more robust predictions.
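The bagging idea can be sketched by hand in a few lines (illustration only, assuming the `lsl` data and rpart from above; a Random Forest would additionally sample predictors at each split, which we leave to ranger below):

```r
library(rpart)

set.seed(1)
B <- 100

# Fit B trees, each on a bootstrap resample of the data
prob_mat <- sapply(seq_len(B), function(b) {
  boot <- lsl[sample(nrow(lsl), replace = TRUE), ]
  m <- rpart(lslpts ~ slope + cplan + cprof + elev + log10_carea,
             data = boot, method = "class")
  predict(m, lsl, type = "prob")[, 2]
})

# The bagged prediction is the average over the B trees
bagged_prob <- rowMeans(prob_mat)
```

Averaging the B tree predictions reduces the variance of any single tree, which is exactly the instability problem illustrated in the AUROC figure above.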

Here we build both a logistic regression model and an RF model for our landslide problem using the tidymodels approach (cf. Session 4):

data("lsl", package = "spDataLarge")

library(tidymodels)
library(dplyr)
library(ggplot2)
library(patchwork)

set.seed(1)

# Ensure outcome is a factor
lsl <- lsl %>% mutate(lslpts = as.factor(lslpts))

# --- 1) Train/test split ---
lsl_split <- initial_split(lsl, prop = 0.75, strata = lslpts)
lsl_train <- training(lsl_split)
lsl_test  <- testing(lsl_split)

# --- 2) 5-fold CV on training set ---
lsl_folds <- vfold_cv(lsl_train, v = 5, strata = lslpts)

# --- 3) A common recipe (same preprocessing for both models) ---
lsl_recipe <- recipe(lslpts ~ slope + cplan + cprof + elev + log10_carea,
                     data = lsl_train) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_nzv(all_predictors())

# --- 4) Model specs ---
glm_spec <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification")

rf_spec <- rand_forest(
  trees = 500,
  mtry  = tune(),     # we'll set a simple default below (no full tuning yet)
  min_n = 5
) %>%
  set_engine("ranger", importance = "impurity") %>%
  set_mode("classification")

# A simple non-tuned choice for mtry (sqrt(p) is common)
p <- 5
rf_spec <- finalize_model(rf_spec, tibble(mtry = floor(sqrt(p))))

# --- 5) Workflows ---
glm_wf <- workflow() %>%
  add_recipe(lsl_recipe) %>%
  add_model(glm_spec)

rf_wf <- workflow() %>%
  add_recipe(lsl_recipe) %>%
  add_model(rf_spec)

Now let’s have a look at the hold-out test performance, using the same split for both models:

mets <- metric_set(accuracy, roc_auc)
glm_last <- last_fit(glm_wf, split = lsl_split, metrics = mets)
rf_last  <- last_fit(rf_wf,  split = lsl_split, metrics = mets)

test_comp <- bind_rows(
  glm_last %>% collect_metrics() %>% mutate(model = "GLM (logistic)"),
  rf_last  %>% collect_metrics() %>% mutate(model = "Random Forest")
) %>%
  select(model, .metric, .estimate)

test_comp
# A tibble: 4 × 3
  model          .metric  .estimate
  <chr>          <chr>        <dbl>
1 GLM (logistic) accuracy     0.727
2 GLM (logistic) roc_auc      0.767
3 Random Forest  accuracy     0.773
4 Random Forest  roc_auc      0.839

Interpretation

Logistic regression fits a smooth, global relationship (linear in the predictors after transformation).

Random Forest combines many trees, capturing nonlinearities and interactions automatically.

In many geoscience susceptibility problems, RF typically achieves higher AUROC, especially when relationships are nonlinear.

Note

Note that we did not fine-tune the (hyper)parameters of the RF model yet. For example, the number of predictors sampled at each split (mtry), the number of trees in the forest (trees), and the minimum number of observations per node (min_n) are typical parameters that can be tuned. We will develop this further in the machine learning workflow below.

RF tutorial: land cover mapping

As we discussed in Session 4, tidymodels is a coherent “grammar” for machine learning in R: rather than every model/package having its own interface, tidymodels standardizes the steps you do again and again, i.e. splitting, pre-processing, training, tuning, and evaluation.

This module follows the workflow described in Rebecca Barter’s tidymodels tour (Bauer (2023)) and adapts it to spatial training data. We will re-use the training data and predictors we created in Google Earth Engine (Session Demo GEE).

# Core spatial + ML workflow packages
library(terra)

library(tidymodels)  # loads rsample, recipes, parsnip, yardstick, workflows, etc.
library(tune)
library(workflows)

# Optional but often useful
library(dplyr)
library(ggplot2)

# For variable importance with ranger models
library(vip)

To keep things reproducible we set a seed:

set.seed(42)

1. Load data (your spatial training set)

We will use a shapefile exported from GEE with training polygons/points and predictors for a small study site near the town of Ilebo in the DR Congo. Put the data (shapefile and GeoTIFF files) into a subfolder /GEE_Exports_Annual. The shapefile contains small polygons that were digitized in Google Earth on high-resolution satellite imagery, where an operator labelled each polygon with one of five possible land use classes:

# path to the shapefile
shp_path <- "GEE_Exports_S2_Ilebo/ilebo_polygon_band_values.shp"
# load shapefile (SpatVector)
kasai_observations <- terra::vect(shp_path)
names(kasai_observations)
 [1] "date"  "B11"   "B12"   "year"  "B8A"   "NDVI"  "B1"    "B2"    "B3"   
[10] "site"  "B4"    "B5"    "B6"    "B7"    "B8"    "B9"    "yymm"  "BSI"  
[19] "EVI"   "class"
kasai_observations
 class       : SpatVector 
 geometry    : polygons 
 dimensions  : 1318, 20  (geometries, attributes)
 extent      : 20.55695, 20.72394, -4.424681, -4.233725  (xmin, xmax, ymin, ymax)
 source      : ilebo_polygon_band_values.shp
 coord. ref. : lon/lat WGS 84 (EPSG:4326) 
 names       :      date   B11   B12  year   B8A  NDVI    B1    B2    B3  site
 type        :     <num> <num> <num> <int> <num> <num> <num> <num> <num> <chr>
 values      : 1.588e+12  2428  1891  2020  2746  4020 721.4 792.4  1022 Ilebo
               1.588e+12  2480  2098  2020  2432  2793 818.1 840.4  1084 Ilebo
               1.588e+12  2831  2283  2020  2990  3709 733.6 857.5  1156 Ilebo
 (and 10 more)
              
              
              
              

A quick visualization of the data we are going to work with:

We convert the SpatVector to a data.frame because tidymodels works on tabular data. We typically:

  • keep the geometry in a separate object
  • model on the attribute table

kasai_df <- as.data.frame(kasai_observations)
kasai_df$class <- as.factor(kasai_df$class) #for modelling later outcome should be a factor
# quick check
#glimpse(kasai_df)

Note: if you later want spatial CV, you will need coordinates/geometry somewhere (even if you don’t model on them). We keep kasai_observations for that purpose.
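One option for spatial CV later on is the spatialsample package (a sketch, not run here; it assumes the sf and spatialsample packages are installed, and that `sf::st_as_sf()` is used to convert the SpatVector):

```r
library(sf)
library(spatialsample)

# Convert the SpatVector (which still carries geometry) to an sf object
kasai_sf <- sf::st_as_sf(kasai_observations)

# Cluster the observations into 5 spatially coherent folds,
# so that nearby (correlated) polygons end up in the same fold
spatial_folds <- spatial_clustering_cv(kasai_sf, v = 5)
spatial_folds
```

Spatially clustered folds give a more honest performance estimate when observations close in space are not independent.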


2. EDA (exploration + quality checks)

2.1 Inspect columns + class label

The observed landcover classes in the database are given in the column class. Let’s explore what we have:

# What are the unique classes?
table(kasai_df$class, useNA = "ifany")

built_up cropland   forest savannah    water 
     212      361      246      265      234 

2.2 Missing values and obvious issues

# Count NAs per column (largest first)
na_counts <- sort(colSums(is.na(kasai_df)), decreasing = TRUE)
head(na_counts, 20)
 date   B11   B12  year   B8A  NDVI    B1    B2    B3  site    B4    B5    B6 
    0     0     0     0     0     0     0     0     0     0     0     0     0 
   B7    B8    B9  yymm   BSI   EVI class 
    0     0     0     0     0     0     0 

It looks like our database is fully populated.

2.3 Quick sanity plot

# Example: NDVI
if ("NDVI" %in% names(kasai_df)) {
  ggplot(kasai_df, aes(x = NDVI, fill = class)) +
    geom_histogram(bins = 30, alpha = 0.7) +
    theme_minimal()
}

What do we see here? Built-up NDVI is mostly under 0.5, consistent with mixed pixels that aggregate houses, trees and gardens. Note the overlap between cropland and savannah: this points to the need to look at multiple predictors.

# Example: BSI
if ("BSI" %in% names(kasai_df)) {
  ggplot(kasai_df, aes(x = BSI, fill = class)) +
    geom_histogram(bins = 30, alpha = 0.7) +
    theme_minimal()
}


Note

Now we are getting to the core. tidymodels standardizes the steps you do again and again: splitting, pre-processing, training, tuning, and evaluation.

A helpful overview (the numbers refer to the sections below):

  • 3 rsample: how you create training/test sets and resamples (folds)
  • 4 recipes: how you define preprocessing steps
  • 5 parsnip: a consistent interface to many model engines (e.g., ranger)
  • 6 workflows: bundle recipe + model into a single object you can fit
  • 7–9 tune: systematic tuning across resamples
  • 10–11 yardstick: model performance metrics

3. Split into training and testing

Use a stratified split to preserve class proportions in both sets using initial_split from the rsample package.

kasai_split <- initial_split(kasai_df, prop = 0.8, strata = class)

kasai_train <- training(kasai_split)
kasai_test  <- testing(kasai_split)

nrow(kasai_train); nrow(kasai_test)
[1] 1052
[1] 266

4. Recipe (preprocessing)

A recipe is trained only on the training set and then applied to test/validation data.

4.1 Choose predictors

  • Exclude identifiers (system:index, date, etc.)
  • Exclude geometry fields if present (but we dealt with this before)
  • Keep only numeric predictors (for RF it is fine to mix factors too, but let’s keep it simple for now)
# drop  non-predictor fields (this is why we explored the dataset above)
drop_cols <- c("date", "yymm", "site", "year")

kasai_train2 <- kasai_train %>% select(-any_of(drop_cols))

kasai_test2 <- kasai_test %>% select(-any_of(drop_cols))

#glimpse(kasai_train2)

Note: this could also have been done in step 3, just before the initial split.

4.2 Build the recipe

Note: Random Forest does not require normalized predictors in most cases. Still, we often:

  • remove near-zero variance predictors
  • handle missing values
  • encode categorical predictors, if any

rf_recipe <- recipe(class ~ ., data = kasai_train2) %>%
  step_zv(all_predictors()) %>%           # remove zero-variance predictors
  step_nzv(all_predictors()) %>%          # remove near-zero variance predictors
  step_impute_median(all_numeric_predictors())  # simple imputation for NA, not relevant here

The last part of the recipe, step_impute_median(all_numeric_predictors()), is just for illustration as we have no missing values. This step replaces missing values (NA) in numeric predictor variables with the median value, computed from the training data only.
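You can see this "train-only" behaviour by preparing the recipe yourself (a sketch using the objects defined above; workflows/tune normally do this for you behind the scenes):

```r
# prep() estimates the steps (e.g., the imputation medians) on the training data...
rf_recipe_prepped <- prep(rf_recipe, training = kasai_train2)

# ...and bake() applies those frozen estimates to any data set
baked_train <- bake(rf_recipe_prepped, new_data = NULL)        # processed training set
baked_test  <- bake(rf_recipe_prepped, new_data = kasai_test2) # uses *training* medians
```

This is why a recipe never "sees" the test set during estimation: test data only passes through bake().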


5. Model specification

In tidymodels, you specify what kind of model (random forest classifier in our case) and then choose an engine (ranger) that actually fits it.

We’ll tune three parameters (see introduction session to Machine Learning):

  • mtry: number of predictors sampled at each split
  • min_n: minimum node size
  • trees: number of trees
rf_model <- rand_forest(
  mode  = "classification",
  mtry  = tune(),
  min_n = tune(),
  trees = 550
) %>%
  set_engine("ranger", importance = "impurity")  # enables variable importance

rf_model
Random Forest Model Specification (classification)

Main Arguments:
  mtry = tune()
  trees = 550
  min_n = tune()

Engine-Specific Arguments:
  importance = impurity

Computational engine: ranger 

6. Workflow (recipe + model)

A workflow glues preprocessing + model together into a single object you can resample, tune, and fit.

rf_workflow <- workflow() %>%
  add_recipe(rf_recipe) %>%
  add_model(rf_model)

rf_workflow
══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: rand_forest()

── Preprocessor ────────────────────────────────────────────────────────────────
3 Recipe Steps

• step_zv()
• step_nzv()
• step_impute_median()

── Model ───────────────────────────────────────────────────────────────────────
Random Forest Model Specification (classification)

Main Arguments:
  mtry = tune()
  trees = 550
  min_n = tune()

Engine-Specific Arguments:
  importance = impurity

Computational engine: ranger 

7. Cross-validation

Cross-validation gives a more stable estimate of performance than a single split (and is the standard approach for tuning).

set.seed(42)
kasai_folds <- vfold_cv(kasai_train2, v = 5, strata = class)

kasai_folds
#  5-fold cross-validation using stratification 
# A tibble: 5 × 2
  splits            id   
  <list>            <chr>
1 <split [839/213]> Fold1
2 <split [840/212]> Fold2
3 <split [842/210]> Fold3
4 <split [843/209]> Fold4
5 <split [844/208]> Fold5

8. Tuning grid

At this step, we do not fit just one Random Forest model. Instead, we test several versions of the model with different settings. This is called hyperparameter tuning.

With the ranger engine, the most important parameters are usually:

  • mtry: the number of predictor variables randomly considered at each split
  • min_n: the minimum number of samples allowed in a terminal node
  • trees: the number of trees grown in the forest

The model is trained repeatedly on resampled training data, and its predictive performance is evaluated for each parameter combination. The goal is to identify the settings that give the best balance between accuracy and generalization. After tuning, the best parameter combination is selected and used to fit the final Random Forest model. We start with a modest grid to speed up the calculations. Note that 2 parameters with 4 levels each gives 16 combinations.

Note

This is a limited analysis to speed up the calculations; under normal circumstances you would test many more combinations (e.g., include the number of trees in the tuning and increase the number of levels).

rf_grid <- grid_regular(
  mtry(range = c(2, 14)),   # adjust based on number of predictors
  min_n(range = c(2, 20)),
  levels = 4
)

rf_grid
# A tibble: 16 × 2
    mtry min_n
   <int> <int>
 1     2     2
 2     6     2
 3    10     2
 4    14     2
 5     2     8
 6     6     8
 7    10     8
 8    14     8
 9     2    14
10     6    14
11    10    14
12    14    14
13     2    20
14     6    20
15    10    20
16    14    20

9. Tune the model (fit across folds)

We select metrics that are appropriate for classification.

  • F1 metric (f_meas)

The F1 score is the harmonic mean of precision and recall. An introduction to classification performance metrics can be found here.
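For a single class with precision P and recall R, F1 = 2·P·R/(P + R); the macro-averaged version reported below averages the per-class F1 scores. A quick check in base R:

```r
precision <- 0.80
recall    <- 0.60

# Harmonic mean penalizes imbalance between precision and recall
f1 <- 2 * precision * recall / (precision + recall)
f1
# 0.6857143 -- closer to the lower of the two values
```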

rf_metrics <- metric_set(f_meas)

set.seed(42)
rf_tune <- tune_grid(
  rf_workflow,
  resamples = kasai_folds,
  grid      = rf_grid,
  metrics   = rf_metrics,
  control   = control_grid(save_pred = TRUE)
)

rf_tune
# Tuning results
# 5-fold cross-validation using stratification 
# A tibble: 5 × 5
  splits            id    .metrics          .notes           .predictions
  <list>            <chr> <list>            <list>           <list>      
1 <split [839/213]> Fold1 <tibble [16 × 6]> <tibble [0 × 4]> <tibble>    
2 <split [840/212]> Fold2 <tibble [16 × 6]> <tibble [0 × 4]> <tibble>    
3 <split [842/210]> Fold3 <tibble [16 × 6]> <tibble [0 × 4]> <tibble>    
4 <split [843/209]> Fold4 <tibble [16 × 6]> <tibble [0 × 4]> <tibble>    
5 <split [844/208]> Fold5 <tibble [16 × 6]> <tibble [0 × 4]> <tibble>    

9.1 Visualize tuning results

autoplot(rf_tune)

9.2 Select the best model

best_params <- select_best(rf_tune, metric = "f_meas")
best_params
# A tibble: 1 × 3
   mtry min_n .config         
  <int> <int> <chr>           
1     2     8 pre0_mod02_post0

10. Finalize workflow + evaluate on the test set

rf_final_workflow <- finalize_workflow(rf_workflow, best_params)

test_model_fit <- last_fit(rf_final_workflow, split = kasai_split, metrics = rf_metrics)

test_model_fit
# Resampling results
# Manual resampling 
# A tibble: 1 × 6
  splits             id               .metrics .notes   .predictions .workflow 
  <list>             <chr>            <list>   <list>   <list>       <list>    
1 <split [1052/266]> train/test split <tibble> <tibble> <tibble>     <workflow>

10.1 Test-set metrics

collect_metrics(test_model_fit)
# A tibble: 1 × 4
  .metric .estimator .estimate .config        
  <chr>   <chr>          <dbl> <chr>          
1 f_meas  macro          0.887 pre0_mod0_post0

10.2 Confusion matrix

test_preds <- collect_predictions(test_model_fit)

test_preds %>%
  conf_mat(truth = class, estimate = .pred_class)
          Truth
Prediction built_up cropland forest savannah water
  built_up       37        2      0        0     0
  cropland        5       62      0        7     0
  forest          0        0     48        3     2
  savannah        1        9      2       43     1
  water           0        0      0        0    44

What can we learn from this confusion matrix?


11. Fit final model on all data (for deployment)

Once you’re satisfied, refit on all labeled data (train + test combined) before deploying for mapping. Think about this: what is the difference between tuning, performance reporting and inference?

final_model_fit <- fit(rf_final_workflow, data = bind_rows(kasai_train2, kasai_test2))

final_model_fit
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: rand_forest()

── Preprocessor ────────────────────────────────────────────────────────────────
3 Recipe Steps

• step_zv()
• step_nzv()
• step_impute_median()

── Model ───────────────────────────────────────────────────────────────────────
Ranger result

Call:
 ranger::ranger(x = maybe_data_frame(x), y = y, mtry = min_cols(~2L,      x), num.trees = ~550, min.node.size = min_rows(~8L, x), importance = ~"impurity",      num.threads = 1, verbose = FALSE, seed = sample.int(10^5,          1), probability = TRUE) 

Type:                             Probability estimation 
Number of trees:                  550 
Sample size:                      1318 
Number of independent variables:  15 
Mtry:                             2 
Target node size:                 8 
Variable importance mode:         impurity 
Splitrule:                        gini 
OOB prediction error (Brier s.):  0.08495267 

12. Variable importance (interpretation)

Variable importance is one of the simplest ways to connect “black-box” models back to domain reasoning, and it is a key interpretation step in machine learning.

# Extract the fitted parsnip model
rf_engine_fit <- final_model_fit  %>% extract_fit_parsnip()

# Plot variable importance
vip::vip(rf_engine_fit, num_features = 20)

  • Which predictors make sense for separating your classes?

  • Are any “top predictors” suspicious ?

  • Do the top predictors match your conceptual understanding of the landscape?


13. Reduced RF model

This is all fine, but the final model uses all 15 predictors, so at the inference stage we need spatial data for every one of them. It is therefore good to check whether you really need ALL these predictors. This can be done with advanced procedures that evaluate the performance of models with a reduced number of predictors (e.g. Recursive Feature Elimination); here we simply select the 5 best predictors manually from the variable importance analysis above. Here is a link to a feature selection method

## ---- Reduced RF model evaluated like Section 10 (using last_fit) ----

# 0) Manually choose predictors from VIP plot
top_vars <- c("B12", "B11", "BSI", "B5", "B4")
kasai_train2_reduced <- kasai_train2 %>%
  dplyr::select(class, all_of(top_vars))

# 1) Reduced recipe
rf_recipe_reduced <- recipe(class ~ ., data = kasai_train2_reduced) %>%
  step_zv(all_predictors()) %>%
  step_nzv(all_predictors()) %>%
  step_impute_median(all_numeric_predictors())

# 2) Reduced workflow (reuse tuned hyperparameters)
rf_workflow_reduced <- workflow() %>%
  add_recipe(rf_recipe_reduced) %>%
  add_model(rf_model) 

rf_final_workflow_reduced <- finalize_workflow(rf_workflow_reduced, best_params)

# 3) Evaluate on test set with last_fit (same approach as Section 10)
test_model_reduced_fit <- last_fit(rf_final_workflow_reduced, split = kasai_split, metrics = rf_metrics)

# 4) Confusion matrix + metrics (same as Section 10)
test_preds_reduced <- collect_predictions(test_model_reduced_fit)

test_preds_reduced %>%
  conf_mat(truth = class, estimate = .pred_class)
          Truth
Prediction built_up cropland forest savannah water
  built_up       38        3      0        0     0
  cropland        4       62      0        8     0
  forest          0        0     48        1     3
  savannah        1        8      2       44     0
  water           0        0      0        0    44
collect_metrics(test_model_reduced_fit)
# A tibble: 1 × 4
  .metric .estimator .estimate .config        
  <chr>   <chr>          <dbl> <chr>          
1 f_meas  macro          0.894 pre0_mod0_post0
final_reduced_model_fit <- fit(
  rf_final_workflow_reduced,
  data = bind_rows(
    kasai_train2 %>% dplyr::select(class, all_of(top_vars)),
    kasai_test2  %>% dplyr::select(class, all_of(top_vars))
  )
)

metrics_full    <- collect_metrics(test_model_fit) %>% mutate(model = "Full RF")
metrics_reduced <- collect_metrics(test_model_reduced_fit) %>% mutate(model = "Reduced RF")

model_comparison <- bind_rows(metrics_full, metrics_reduced) %>%
  select(model, .metric, .estimate) %>%
  tidyr::pivot_wider(names_from = model, values_from = .estimate)

model_comparison
# A tibble: 1 × 3
  .metric `Full RF` `Reduced RF`
  <chr>       <dbl>        <dbl>
1 f_meas      0.887        0.894

14. Inference: mapping predictions with terra

Nice, inference is the fun part 😄. Since our final reduced RF is a tidymodels workflow fit (final_reduced_model_fit), the simplest, robust approach with terra is:

  • load the raster stack
  • make sure layer names match your top_vars predictor names
  • convert raster values → data.frame
  • predict() with the workflow (class + probs)
  • write results back to rasters and export GeoTIFFs

Here’s an inference block:

# 1) Load predictor raster stack (must contain layers named like top_vars)
r <- rast("GEE_Exports_S2_Ilebo/ilebo_s2_2025_int_30m.tif")
# Keep only the predictors used by the reduced model
r_red <- r[[top_vars]]

# 2) Predict land-cover classes with the tidymodels fitted workflow

r_class <- terra::predict(
  r_red,
  model = final_reduced_model_fit,
  type = "class",
  na.rm = TRUE,
  filename = "GEE_Exports_S2_Ilebo/ilebo_s2_LC.tif",
  overwrite = TRUE
)

cat_tab <- levels(r_class)[[1]]
lab_col <- names(cat_tab)[2]
labs <- as.character(cat_tab[[lab_col]])

lc_cols <- c(
  built_up = "#e31a1c",
  cropland = "#FFD92F",
  forest   = "#1b9e77",
  savannah = "#66a61e",
  water    = "#1f78b4"
)

cols <- unname(lc_cols[labs])
cols[is.na(cols)] <- "grey70"

plot(r_class, col = cols, main = "Predicted land cover (Reduced RF, 2025)")

Note

Recipes prepare data, workflows train models, tuning chooses parameters, last_fit() evaluates once, and final fit() is what we deploy.

| Stage | Object name | Package | What it represents | Used for |
|---|---|---|---|---|
| Data split | kasai_split | rsample | Definition of training vs test data | Reproducible splitting |
| | kasai_train | rsample | Training data table | Model learning |
| | kasai_test | rsample | Test data table | Final evaluation |
| | kasai_folds | rsample | Cross-validation folds | Hyperparameter tuning |
| Preprocessing | rf_recipe | recipes | Full preprocessing pipeline | Feature preparation |
| | rf_recipe_reduced | recipes | Reduced-predictor pipeline | Feature selection |
| Model spec | rf_model | parsnip | Random Forest definition + tunable parameters | Model type only (no data) |
| Workflow | rf_workflow | workflows | Recipe + model (full) | Tuning, fitting |
| | rf_workflow_reduced | workflows | Recipe + model (reduced) | Reduced modelling |
| Tuning | rf_tune | tune | CV performance for many parameter sets | Model selection |
| | best_params | tune | Best hyperparameter combination | Freezing complexity |
| Finalized workflow | rf_final_workflow | workflows | Tuned & locked full workflow | Ready for fitting |
| | rf_final_workflow_reduced | workflows | Tuned & locked reduced workflow | Ready for fitting |
| Test evaluation | test_model_fit | yardstick | Test-set performance (full) | Reporting only |
| | test_model_reduced_fit | yardstick | Test-set performance (reduced) | Model comparison |
| Deployment | final_model_fit | parsnip | Full model trained on all data | Inference / mapping |
| | final_reduced_model_fit | parsnip | Reduced model trained on all data | Inference / mapping |
| Prediction output | r_class | terra | Raster of predicted classes | Mapping / analysis |

15. Neural networks

This session also introduces land-cover classification with neural networks as an alternative to Random Forest.

library(tidymodels)
library(nnet)

nn_recipe <- recipe(class ~ ., data = kasai_train2) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_numeric_predictors())

nn_model <- mlp(
  mode = "classification",
  hidden_units = 10,
  penalty = 0.001,
  epochs = 200
) %>%
  set_engine("nnet", trace = FALSE)

nn_workflow <- workflow() %>%
  add_recipe(nn_recipe) %>%
  add_model(nn_model)

nn_final_workflow <- nn_workflow

nn_test_model_fit <- last_fit(nn_final_workflow, split = kasai_split, metrics = rf_metrics)
nn_test_model_fit
# Resampling results
# Manual resampling 
# A tibble: 1 × 6
  splits             id               .metrics .notes   .predictions .workflow 
  <list>             <chr>            <list>   <list>   <list>       <list>    
1 <split [1052/266]> train/test split <tibble> <tibble> <tibble>     <workflow>
nn_test_preds <- collect_predictions(nn_test_model_fit)

conf_mat(nn_test_preds, truth = class, estimate = .pred_class)
          Truth
Prediction built_up cropland forest savannah water
  built_up       37        2      0        0     0
  cropland        5       63      0        9     0
  forest          0        0     44        2     0
  savannah        1        8      4       42     1
  water           0        0      2        0    46
collect_metrics(nn_test_model_fit)
# A tibble: 1 × 4
  .metric .estimator .estimate .config        
  <chr>   <chr>          <dbl> <chr>          
1 f_meas  macro          0.880 pre0_mod0_post0
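The macro-averaged F1 reported here (0.880) is the unweighted mean of the five per-class F1 scores. As a sanity check, it can be recomputed from the printed confusion matrix with base R only:

```r
# Confusion matrix copied from conf_mat() above: rows = prediction, cols = truth
cm <- matrix(
  c(37,  2,  0,  0,  0,
     5, 63,  0,  9,  0,
     0,  0, 44,  2,  0,
     1,  8,  4, 42,  1,
     0,  0,  2,  0, 46),
  nrow = 5, byrow = TRUE
)

tp        <- diag(cm)           # true positives, one per class
precision <- tp / rowSums(cm)   # correct among pixels predicted as that class
recall    <- tp / colSums(cm)   # correct among pixels truly of that class
f1        <- 2 * precision * recall / (precision + recall)

mean(f1)  # macro F1: equal weight per class, regardless of class size
#> [1] 0.8796338
```

yardstick's f_meas with estimator = "macro" performs the same unweighted average, which is why rare classes pull the score down as much as common ones.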
nn_final_model_fit <- fit(nn_final_workflow, data = bind_rows(kasai_train2, kasai_test2))
nn_final_model_fit
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: mlp()

── Preprocessor ────────────────────────────────────────────────────────────────
2 Recipe Steps

• step_zv()
• step_normalize()

── Model ───────────────────────────────────────────────────────────────────────
a 15-10-5 network with 215 weights
inputs: B11 B12 B8A NDVI B1 B2 B3 B4 B5 B6 B7 B8 B9 BSI EVI 
output(s): ..y 
options were - softmax modelling  decay=0.001
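nnet prints the fitted architecture as "a 15-10-5 network with 215 weights": 15 inputs (the bands and indices), 10 hidden units, 5 output classes. Because every layer is fully connected and each non-input neuron carries a bias term, the weight count can be verified directly:

```r
# Weights in a one-hidden-layer MLP: each hidden neuron has one weight per
# input plus a bias; each output neuron has one weight per hidden unit plus a bias
n_in <- 15; n_hidden <- 10; n_out <- 5
(n_in + 1) * n_hidden + (n_hidden + 1) * n_out
#> [1] 215
```

The same formula accounts for the 110 weights of the tuned 15-5-5 network fitted later: (15 + 1) * 5 + (5 + 1) * 5 = 110.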

Now we mirror the RF implementation: define the MLP with tunable hyperparameters, tune it with cross-validation, and evaluate its performance on the test set.

library(tidymodels)
library(nnet)

nn_recipe <- recipe(class ~ ., data = kasai_train2) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_numeric_predictors())

# Same model, but with the three key hyperparameters marked for tuning
nn_model <- mlp(
  mode         = "classification",
  hidden_units = tune(),  # neurons in the hidden layer
  penalty      = tune(),  # weight decay (L2 regularization)
  epochs       = tune()   # number of training iterations
) %>%
  set_engine("nnet", trace = FALSE)

nn_workflow <- workflow() %>%
  add_recipe(nn_recipe) %>%
  add_model(nn_model)

# Regular grid with 3 levels per parameter: 3 x 3 x 3 = 27 candidates
nn_grid <- grid_regular(
  hidden_units(range = c(5, 20)),
  penalty(range = c(-4, -2)),
  epochs(range = c(200, 500)),
  levels = 3
)
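A detail that often trips people up: dials' penalty() is parameterized on the log10 scale, so range = c(-4, -2) requests penalties between 1e-4 and 1e-2, not between -4 and -2. With three levels, the candidate values are:

```r
# penalty() places grid points evenly in log10 space, then back-transforms with 10^x
10^seq(-4, -2, length.out = 3)
#> [1] 1e-04 1e-03 1e-02

# hidden_units() and epochs() are on the natural scale; with 3 levels each,
# the full regular grid has 3 * 3 * 3 = 27 rows
3 * 3 * 3
#> [1] 27
```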

set.seed(42)
nn_tune <- tune_grid(
  nn_workflow,
  resamples = kasai_folds,
  grid      = nn_grid,
  metrics   = rf_metrics,
  control   = control_grid(save_pred = TRUE)
)

collect_metrics(nn_tune)
# A tibble: 27 × 9
   hidden_units penalty epochs .metric .estimator  mean     n std_err .config   
          <int>   <dbl>  <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>     
 1            5  0.0001    200 f_meas  macro      0.881     5 0.00670 pre0_mod0…
 2            5  0.0001    350 f_meas  macro      0.884     5 0.00744 pre0_mod0…
 3            5  0.0001    500 f_meas  macro      0.881     5 0.0127  pre0_mod0…
 4            5  0.001     200 f_meas  macro      0.886     5 0.00952 pre0_mod0…
 5            5  0.001     350 f_meas  macro      0.881     5 0.00714 pre0_mod0…
 6            5  0.001     500 f_meas  macro      0.872     5 0.0105  pre0_mod0…
 7            5  0.01      200 f_meas  macro      0.900     5 0.00744 pre0_mod0…
 8            5  0.01      350 f_meas  macro      0.897     5 0.0101  pre0_mod0…
 9            5  0.01      500 f_meas  macro      0.905     5 0.0127  pre0_mod0…
10           12  0.0001    200 f_meas  macro      0.870     5 0.0126  pre0_mod1…
# ℹ 17 more rows
show_best(nn_tune, metric = "f_meas", n = 10)
# A tibble: 10 × 9
   hidden_units penalty epochs .metric .estimator  mean     n std_err .config   
          <int>   <dbl>  <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>     
 1            5   0.01     500 f_meas  macro      0.905     5 0.0127  pre0_mod0…
 2            5   0.01     200 f_meas  macro      0.900     5 0.00744 pre0_mod0…
 3           20   0.01     500 f_meas  macro      0.899     5 0.00436 pre0_mod2…
 4            5   0.01     350 f_meas  macro      0.897     5 0.0101  pre0_mod0…
 5           20   0.01     350 f_meas  macro      0.897     5 0.00853 pre0_mod2…
 6           20   0.01     200 f_meas  macro      0.895     5 0.00722 pre0_mod2…
 7           20   0.001    200 f_meas  macro      0.893     5 0.0110  pre0_mod2…
 8           20   0.001    500 f_meas  macro      0.892     5 0.00480 pre0_mod2…
 9           12   0.01     200 f_meas  macro      0.892     5 0.00998 pre0_mod1…
10           12   0.01     350 f_meas  macro      0.890     5 0.0151  pre0_mod1…
best_nn_params <- select_best(nn_tune, metric = "f_meas")
best_nn_params
# A tibble: 1 × 4
  hidden_units penalty epochs .config         
         <int>   <dbl>  <int> <chr>           
1            5    0.01    500 pre0_mod09_post0
nn_final_workflow <- finalize_workflow(nn_workflow, best_nn_params)

test_model_nn_fit <- last_fit(
  nn_final_workflow,
  split   = kasai_split,
  metrics = rf_metrics
)

test_model_nn_fit
# Resampling results
# Manual resampling 
# A tibble: 1 × 6
  splits             id               .metrics .notes   .predictions .workflow 
  <list>             <chr>            <list>   <list>   <list>       <list>    
1 <split [1052/266]> train/test split <tibble> <tibble> <tibble>     <workflow>
nn_test_preds <- collect_predictions(test_model_nn_fit)

nn_test_preds %>%
  conf_mat(truth = class, estimate = .pred_class)
          Truth
Prediction built_up cropland forest savannah water
  built_up       39        1      0        0     0
  cropland        3       66      0       12     0
  forest          0        0     49        1     0
  savannah        1        6      1       40     0
  water           0        0      0        0    47
collect_metrics(test_model_nn_fit)
# A tibble: 1 × 4
  .metric .estimator .estimate .config        
  <chr>   <chr>          <dbl> <chr>          
1 f_meas  macro          0.914 pre0_mod0_post0
metrics_rf <- collect_metrics(test_model_fit) %>%
  mutate(model = "Random Forest")

metrics_nn <- collect_metrics(test_model_nn_fit) %>%
  mutate(model = "Neural Network")

model_comparison_rf_nn <- bind_rows(metrics_rf, metrics_nn) %>%
  dplyr::select(model, .metric, .estimate) %>%
  tidyr::pivot_wider(
    names_from  = model,
    values_from = .estimate
  )

model_comparison_rf_nn
# A tibble: 1 × 3
  .metric `Random Forest` `Neural Network`
  <chr>             <dbl>            <dbl>
1 f_meas            0.887            0.914
final_nn_model_fit <- fit(
  nn_final_workflow,
  data = bind_rows(kasai_train2, kasai_test2)
)

final_nn_model_fit
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: mlp()

── Preprocessor ────────────────────────────────────────────────────────────────
2 Recipe Steps

• step_zv()
• step_normalize()

── Model ───────────────────────────────────────────────────────────────────────
a 15-5-5 network with 110 weights
inputs: B11 B12 B8A NDVI B1 B2 B3 B4 B5 B6 B7 B8 B9 BSI EVI 
output(s): ..y 
options were - softmax modelling  decay=0.01
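With the final workflow trained on all labelled pixels, the last step mirrors the RF pipeline (the r_class raster in the summary table): predict the class of every pixel in the predictor stack. A minimal sketch, assuming a terra SpatRaster named kasai_stack (hypothetical name) whose layer names match the 15 predictors listed above:

```r
# Wrapper so terra::predict() can call the tidymodels workflow:
# terra hands `fun` each chunk of raster cells as a data frame
predict_class <- function(model, data, ...) {
  predict(model, new_data = data)$.pred_class
}

# `kasai_stack`: hypothetical SpatRaster with layers
# B1..B9, B11, B12, B8A, NDVI, EVI, BSI
# nn_class <- terra::predict(kasai_stack, final_nn_model_fit,
#                            fun = predict_class, na.rm = TRUE)
# plot(nn_class)
```

Because the recipe normalizes predictors using training-set statistics stored in the workflow, the raster values are transformed consistently at prediction time.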

References (key resources)

  • Rebecca Barter — “Tidymodels: tidy machine learning in R.”
  • tidymodels.org — Random forest workflow notes (case study).
  • Heather Kropp — Environmental Data Science, Ch. 7 (Land-cover classification; RF + Neural Network).

References

Bauer, Paul C. 2023. “ML Modelling with Tidymodels.” 2023. https://bookdown.org/paul/ai_ml_for_social_scientists/06_01_ml_with_tidymodels.html.