The Pareto Frontier

The Pareto frontier shows the complete set of optimal trade-offs between predictive performance and model complexity.

What Is Pareto Optimality?

A model configuration is Pareto optimal when you cannot improve one objective (accuracy or simplicity) without degrading the other.

In Avenue's Pareto frontier chart:

  • X-axis: Model complexity (estimated number of factor tables after consolidation)
  • Y-axis: Predictive performance (cross-validation score—higher is better)

Each point represents a complete GBM configuration with specific hyperparameters. Every point on the frontier is Pareto optimal, so the "best" choice depends on your requirements.
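As an illustration of the definition above, the following minimal sketch extracts the frontier from a list of (complexity, score) pairs. It assumes the chart's convention that fewer tables and a higher score are both better. The function name, the `Point` alias, and the sample values are hypothetical and are not part of Avenue.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (complexity, score); higher score is better


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates.

    A point dominates another if it is no more complex and scores at least
    as well, and is strictly better on at least one of the two objectives.
    """
    # Sort by complexity ascending; for equal complexity, best score first.
    ordered = sorted(set(points), key=lambda p: (p[0], -p[1]))
    frontier: List[Point] = []
    best_score = float("-inf")
    for complexity, score in ordered:
        # Keep a point only if it beats every point that is no more complex.
        if score > best_score:
            frontier.append((complexity, score))
            best_score = score
    return frontier


# Hypothetical configurations: (factor tables, cross-validation score).
configs = [(3, 0.70), (5, 0.78), (5, 0.74), (8, 0.77), (9, 0.81), (12, 0.81)]
print(pareto_frontier(configs))  # [(3, 0.7), (5, 0.78), (9, 0.81)]
```

Here (8, 0.77) is dropped because (5, 0.78) is both simpler and more accurate, and (12, 0.81) is dropped because (9, 0.81) reaches the same score with fewer tables.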

Interpreting Frontier Patterns

Elbow: A sharp bend beyond which small accuracy gains require large complexity increases. The point at the bend is often a practical choice.

Plateau: Performance flattens beyond a certain complexity, so additional tables yield diminishing returns.

Cliff: Accuracy drops sharply below a certain complexity, which indicates the minimum complexity your problem requires.
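One common heuristic for locating an elbow numerically is to pick the frontier point that lies farthest above the straight line joining the two ends of the frontier, after rescaling both axes to the same range. The sketch below uses that heuristic. It is not necessarily how Avenue identifies an elbow, and the function name and sample frontier are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (complexity, score), sorted by complexity


def find_elbow(frontier: List[Point]) -> Point:
    """Return the interior point farthest above the chord between the endpoints."""
    if len(frontier) < 3:
        raise ValueError("need at least three frontier points")
    (x0, y0), (x1, y1) = frontier[0], frontier[-1]
    x_span, y_span = x1 - x0, y1 - y0
    if x_span <= 0 or y_span <= 0:
        raise ValueError("frontier must increase in both complexity and score")

    def height_above_chord(p: Point) -> float:
        # Rescale both axes to [0, 1] so table counts and score units are
        # comparable; the chord then runs along the diagonal y = x.
        nx = (p[0] - x0) / x_span
        ny = (p[1] - y0) / y_span
        return ny - nx

    return max(frontier[1:-1], key=height_above_chord)


# Hypothetical frontier: steep gains up to 6 tables, then a plateau.
frontier = [(2, 0.60), (4, 0.75), (6, 0.80), (10, 0.82), (16, 0.83)]
print(find_elbow(frontier))  # (6, 0.8)
```

Rescaling matters because the two axes use unrelated units. Without it, whichever axis has the larger numeric range would dominate the result.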

Selecting a Model

Left side (fewer tables):

  • Simpler models, easier regulatory review
  • Better for understanding key drivers

Right side (more tables):

  • Maximum accuracy, more interactions captured
  • Still fully transparent and explainable

Middle:

  • Balanced accuracy and simplicity

The model selection guide provides specific recommendations by use case.
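As one way to make a "balanced" choice concrete, the sketch below picks the simplest frontier point whose score is within a chosen tolerance of the best score. This rule and the tolerance values are illustrative assumptions, not recommendations from the model selection guide, and the function name is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (complexity, score); higher score is better


def simplest_within_tolerance(frontier: List[Point], tolerance: float) -> Point:
    """Return the least complex point scoring within `tolerance` of the best."""
    if not frontier:
        raise ValueError("frontier is empty")
    best_score = max(score for _, score in frontier)
    # Keep every point that gives up at most `tolerance` in score.
    eligible = [p for p in frontier if p[1] >= best_score - tolerance]
    # Among those, prefer the one with the fewest factor tables.
    return min(eligible, key=lambda p: p[0])


# Hypothetical frontier: (factor tables, cross-validation score).
frontier = [(2, 0.60), (4, 0.75), (6, 0.80), (10, 0.82), (16, 0.83)]
print(simplest_within_tolerance(frontier, 0.02))  # (10, 0.82)
print(simplest_within_tolerance(frontier, 0.05))  # (6, 0.8)
```

A tolerance of zero returns the most accurate point on the right side of the frontier. Larger tolerances move the choice toward the left side.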

Complexity and Performance Metrics

Complexity: Median number of factor tables across cross-validation folds (after ANOVA-style consolidation)
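The complexity value described above can be sketched as follows, assuming the consolidated table count for each fold is already available. The per-fold counts and the function name are hypothetical.

```python
from statistics import median
from typing import Sequence


def complexity_from_folds(table_counts: Sequence[int]) -> float:
    """Median number of factor tables across cross-validation folds."""
    return median(table_counts)


# Hypothetical consolidated table counts from five folds.
fold_table_counts = [7, 8, 7, 9, 7]
print(complexity_from_folds(fold_table_counts))  # 7
```

With an even number of folds the median is the average of the two middle counts, so it can be fractional: `complexity_from_folds([6, 7, 8, 9])` gives 7.5. This may be why the chart describes the x-axis as an estimated number of tables.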

Performance (depends on objective):

  • Poisson/Gamma/Tweedie: Negative log-likelihood
  • Binary: Log-loss or AUC
  • Regression/Huber: RMSE or MAE
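For reference, the sketch below gives textbook definitions of several of the metrics listed above, using only the Python standard library. The helper names are illustrative and are not Avenue's API. Each function returns a loss, so a lower value indicates a better fit, whereas AUC (not shown) is higher-is-better.

```python
import math
from typing import Sequence


def log_loss(y_true: Sequence[int], p_pred: Sequence[float], eps: float = 1e-15) -> float:
    """Mean binary log-loss; probabilities are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(y - f) for y, f in zip(y_true, y_pred)) / len(y_true)


def poisson_nll(y_true: Sequence[int], mu_pred: Sequence[float]) -> float:
    """Mean Poisson negative log-likelihood for counts y and predicted means mu > 0."""
    # -log P(y | mu) = mu - y * log(mu) + log(y!), and log(y!) = lgamma(y + 1).
    return sum(
        mu - y * math.log(mu) + math.lgamma(y + 1) for y, mu in zip(y_true, mu_pred)
    ) / len(y_true)


# Small hypothetical check values.
print(rmse([1, 2, 3], [1, 2, 5]))
print(log_loss([1, 0], [0.5, 0.5]))
```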

Next Steps