Marine Geospatial Ecology Lab

Predictive Modeling Workflow

In MGET 0.8a42, we introduced a simplified workflow for predictive modeling, shown here for generalized linear models (GLMs).

The new workflow is the same for all model types (GLM, GAM, etc.) and consists of three steps:

  • Fit the model to training data in a table. This works the same as before. The model formula specifies which fields of the input table will be used as the response variable and the predictor variables.

  • Evaluate the predictive power of the fitted model by performing predictions on test data in a table and comparing the predicted values of the response variable to the observed values. This uses a new tool, Predict <Model Type> From Table, which replaces the Plot ROC of Binary Classification Model and the Plot Performance of Binary Classification Model tools and adds new functionality. The new tool generates ROC and other plots for binary models, as well as new summary statistics, including Cohen’s Kappa (for classification models) and Kendall’s tau and % Variance Explained (for regression models). The new tool can also optionally store the predicted values back to the table, so you can perform your own comparison of the predicted and observed values.
  • Create a raster for the response variable by performing predictions on rasters representing the predictor variables. This works the same as before, but new parameters let you specify constant values for predictor variables that do not vary spatially and therefore have no rasters representing them. In previous versions of MGET, such variables could only be supplied by generating a constant raster with the necessary value; that awkward design has been eliminated.
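
If you store the predicted values back to the table in the second step, your own comparison of predicted and observed values might look like the following minimal Python sketch. It does not call MGET; it assumes you have already read two columns out of the table into plain lists. The names `observed`, `predicted_prob`, and the 0.5 cutoff are hypothetical, chosen only for illustration, and the function computes Cohen's kappa from its standard definition rather than reproducing the tool's own implementation.

```python
def cohens_kappa(observed, predicted):
    """Cohen's kappa for two equal-length sequences of class labels.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from the marginal
    frequencies of each label.
    """
    observed = list(observed)
    predicted = list(predicted)
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    n = len(observed)
    if n == 0:
        raise ValueError("at least one record is required")

    # Proportion of records where the prediction matches the observation.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n

    # Chance agreement: sum over labels of the product of marginal rates.
    labels = set(observed) | set(predicted)
    p_e = sum((observed.count(lab) / n) * (predicted.count(lab) / n)
              for lab in labels)

    if p_e == 1.0:
        # Only one label occurs in both columns; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical values read from the test table: the observed binary
# response and the predicted probability stored by the prediction step.
observed = [1, 1, 1, 0, 0, 0, 1, 0]
predicted_prob = [0.9, 0.8, 0.4, 0.2, 0.1, 0.7, 0.6, 0.3]

# Assumed cutoff for turning probabilities into presence/absence classes.
cutoff = 0.5
predicted = [1 if p >= cutoff else 0 for p in predicted_prob]

kappa = cohens_kappa(observed, predicted)
print(f"Cohen's kappa: {kappa:.3f}")
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance. Because the result depends on the cutoff, it is worth trying more than one value when the model predicts probabilities.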