Title: pre: An R package for Deriving Prediction Rule Ensembles
Author: Marjolein Fokkema
Affiliation: Leiden University
Abstract:
Many statistical methods provide a trade-off between accuracy and
interpretability. For example, single classification trees may be easy to
interpret, but provide lower predictive accuracy than other methods. Tree
ensembles such as random forests, on the other hand, provide better predictive
accuracy, but are difficult to interpret. Prediction rule ensembles (PREs) aim
to reconcile accuracy and interpretability, as they consist of only a small set
of prediction rules. In turn, these prediction rules can be depicted as very
simple decision trees, which are easy to interpret and apply. Friedman and
Popescu (2008) developed a popular method for deriving PREs, which first
generates a large initial ensemble of prediction rules from the nodes of CART
trees and then selects a sparse final ensemble by regularized regression of the outcome
variable on the prediction rules. The R package pre is a completely R-based
implementation of the method, with some additional improvements. For example, it
uses a tree induction algorithm with unbiased variable selection for deriving
prediction rules. In the current presentation, I will show the functionality of
the package with some illustrative examples based on psychological research data.
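
The two-stage idea described above (generate many candidate rules, then keep a
sparse subset through penalized regression) can be sketched in a few lines.
This is only a toy illustration under strong simplifying assumptions and is not
the pre package's API or algorithm: it is written in standard-library Python,
the data are made up, the candidate rules are single-variable thresholds
instead of nodes of CART or unbiased trees, and the helper names make_rules,
rule_matrix and lasso are hypothetical.

```python
# Toy sketch of a prediction rule ensemble (assumptions: made-up data,
# threshold rules instead of tree nodes, lasso fitted by coordinate descent).

def make_rules(X):
    """Stage 1 (simplified): one candidate rule 'x_j > t' per midpoint threshold."""
    rules = []
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            rules.append((j, (lo + hi) / 2))
    return rules

def rule_matrix(X, rules):
    """Encode every rule as a 0/1 indicator column."""
    return [[1.0 if row[j] > t else 0.0 for (j, t) in rules] for row in X]

def soft_threshold(value, lam):
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso(Z, y, lam, sweeps=1000):
    """Stage 2: minimize (1/2n)*RSS + lam*sum(|coef|) by coordinate descent."""
    n, p = len(Z), len(Z[0])
    col_means = [sum(row[k] for row in Z) / n for k in range(p)]
    y_mean = sum(y) / n
    Zc = [[row[k] - col_means[k] for k in range(p)] for row in Z]
    resid = [yi - y_mean for yi in y]  # residuals of the centered problem
    coefs = [0.0] * p
    for _ in range(sweeps):
        for k in range(p):
            col = [Zc[i][k] for i in range(n)]
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue
            rho = sum(col[i] * resid[i] for i in range(n)) / n + scale * coefs[k]
            new = soft_threshold(rho, lam) / scale
            delta = new - coefs[k]
            if delta != 0.0:
                resid = [resid[i] - delta * col[i] for i in range(n)]
                coefs[k] = new
    intercept = y_mean - sum(c * m for c, m in zip(coefs, col_means))
    return intercept, coefs

# Made-up data: the outcome is 2 if x0 > 4 plus 1 if x1 > 5.
X = [[1, 5], [2, 3], [3, 8], [4, 1], [5, 7], [6, 2], [7, 9], [8, 4]]
y = [0, 0, 1, 0, 3, 2, 3, 2]

rules = make_rules(X)
intercept, coefs = lasso(rule_matrix(X, rules), y, lam=0.1)
selected = {rule: c for rule, c in zip(rules, coefs) if c != 0.0}

print(f"{len(rules)} candidate rules, {len(selected)} selected")
for (j, t), c in selected.items():
    print(f"rule x{j} > {t}: coefficient {c:.3f}")
```

The lasso penalty shrinks the coefficients of most candidate rules to exactly
zero, so the final ensemble contains only a few rules, each of which can be
read as a very simple decision tree.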