How to plot AUC ROC for different caret training models?

Here’s a reprex

library(caret)
library(dplyr)

set.seed(88, sample.kind = "Rounding")

mtcars <- mtcars %>%
  mutate(am = as.factor(am))

# Hold out roughly 20% of the rows, stratified on am, as the test set
test_index <- createDataPartition(mtcars$am, times = 1, p = 0.2, list = FALSE)

train_cars <- mtcars[-test_index,]

test_cars <- mtcars[test_index,]

set.seed(88, sample.kind = "Rounding")
cars_nb <- train(am ~ mpg + cyl,
                data = train_cars, method = "nb", 
                trControl = trainControl(method = "cv", number = 10, savePredictions = "final"))

cars_glm <- train(am ~ mpg + cyl,
                 data = train_cars, method = "glm", 
                 trControl = trainControl(method = "cv", number = 10, savePredictions = "final"))

My question is: how would I plot the ROC curves (with their AUC) for both models on a single plot, so that the two models can be compared visually?

I assume that you want to show the ROC curves on the test set, unlike the question linked in the comments (ROC curve from training data in caret), which uses the training data.

The first thing to do will be to extract predictions on the test data (newdata=test_cars), in the form of probabilities (type="prob"):

predictions_nb <- predict(cars_nb, newdata=test_cars, type="prob")
predictions_glm <- predict(cars_glm, newdata=test_cars, type="prob")

This gives us a data.frame with one column per class, holding the probability of belonging to class 0 and to class 1. Let’s keep the probability of class 1 only:

predictions_nb <- predict(cars_nb, newdata=test_cars, type="prob")[,"1"]
predictions_glm <- predict(cars_glm, newdata=test_cars, type="prob")[,"1"]
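
The column is selected by name because the class labels "0" and "1" are not syntactic R names. As a minimal sketch, assuming the shape described above, here is a made-up data.frame (the numbers are hypothetical, not model output) and what the `[, "1"]` indexing returns:

```r
# Hypothetical stand-in for the data.frame returned by predict(..., type = "prob");
# the numbers are invented for illustration only.
probs <- data.frame("0" = c(0.9, 0.3, 0.6),
                    "1" = c(0.1, 0.7, 0.4),
                    check.names = FALSE)  # keep the non-syntactic names "0" and "1"

# Each row holds the two class probabilities, so every row sums to 1.
row_totals <- rowSums(probs)

# Indexing by column name returns a plain numeric vector,
# which is the predictor format passed on to the ROC step.
p_class1 <- probs[, "1"]
print(p_class1)
```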

Next I’ll use the pROC package to create the ROC curves for the test data (disclaimer: I am the author of this package. There are other ways to achieve the result, but this is the one I am most familiar with):

library(pROC)
roc_nb <- roc(test_cars$am, predictions_nb)
roc_glm <- roc(test_cars$am, predictions_glm)
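
On a test set this small, it may be safer not to rely on roc() detecting by itself which class is the control and in which direction to compare. The sketch below sets the `levels` and `direction` arguments of pROC explicitly, on invented toy data (the vectors `truth` and `prob1` are hypothetical, not taken from mtcars):

```r
library(pROC)

# Toy data, invented for illustration: true classes and predicted P(class = "1").
truth <- factor(c(0, 0, 0, 1, 1, 1))
prob1 <- c(0.1, 0.2, 0.6, 0.4, 0.8, 0.9)

# levels = c(control, case); direction = "<" states that controls are
# expected to have lower predictor values than cases.
roc_toy <- roc(truth, prob1, levels = c("0", "1"), direction = "<")

auc_toy <- as.numeric(auc(roc_toy))
print(auc_toy)
```

With the objects from this answer, the equivalent call would be roc(test_cars$am, predictions_nb, levels = c("0", "1"), direction = "<").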

Finally, you can plot the curves. To draw two curves with the pROC package, use the lines function to add the second ROC curve to the existing plot:

plot(roc_nb, col="green")
lines(roc_glm, col="blue")

To make it more readable you can add a legend:

legend("bottomright", col=c("green", "blue"), legend=c("NB", "GLM"), lty=1)

And with the AUC:

legend_nb <- sprintf("NB (AUC: %.2f)", auc(roc_nb))
legend_glm <- sprintf("GLM (AUC: %.2f)", auc(roc_glm))
legend("bottomright",
       col=c("green", "blue"), lty=1,
       legend=c(legend_nb, legend_glm))
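
For intuition about the number printed in the legend: the AUC equals the probability that a randomly chosen observation of class 1 receives a higher score than a randomly chosen observation of class 0, with ties counting one half. Below is a base R sketch of that rank-based calculation on invented toy data; `auc_by_rank` is a hypothetical helper written for this illustration, not part of pROC:

```r
# Hypothetical helper: AUC via the rank-sum (Mann-Whitney) identity.
auc_by_rank <- function(truth, score, positive = "1") {
  is_pos <- truth == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(score)  # average ranks, so tied scores count one half
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data, invented for illustration only.
truth <- factor(c(0, 0, 0, 1, 1, 1))
score <- c(0.1, 0.2, 0.6, 0.4, 0.8, 0.9)

auc_toy <- auc_by_rank(truth, score)
print(auc_toy)
```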

(Figure: ROC curves)
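
If you prefer ggplot2 over base graphics, pROC also provides ggroc(), which accepts a named list of ROC curves and draws them on one plot. This is a sketch on invented toy data (the vectors and the "Model A"/"Model B" labels are hypothetical); with the objects from this answer the list would hold roc_nb and roc_glm instead:

```r
library(pROC)
library(ggplot2)

# Toy data, invented for illustration; in the answer's setting these would be
# test_cars$am and the two prediction vectors.
truth  <- factor(c(0, 0, 0, 1, 1, 1, 0, 1))
prob_a <- c(0.1, 0.2, 0.6, 0.4, 0.8, 0.9, 0.3, 0.7)
prob_b <- c(0.3, 0.1, 0.5, 0.6, 0.4, 0.8, 0.7, 0.9)

roc_a <- roc(truth, prob_a, levels = c("0", "1"), direction = "<")
roc_b <- roc(truth, prob_b, levels = c("0", "1"), direction = "<")

# The list names become the legend entries; the AUC is embedded in each label.
curves <- list(roc_a, roc_b)
names(curves) <- c(sprintf("Model A (AUC: %.2f)", auc(roc_a)),
                   sprintf("Model B (AUC: %.2f)", auc(roc_b)))

p <- ggroc(curves) + labs(colour = "Model")
print(p)
```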
