
GBM to Factor Table Conversion

Avenue trains gradient boosting machines (GBMs) that can be represented exactly as factor tables, with no loss of information.

Background: GLM Factor Tables

Traditional GLMs with a log link produce factor tables by exponentiating their coefficients. For example:

Age Range    Factor
16-25        1.45
26-35        1.05
36-50        0.95
51-65        1.10
65+          1.25

Premium = Base Rate × Age Factor × Territory Factor × Vehicle Factor

This format is standard for regulatory filings in insurance.
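As a minimal sketch of how such a table arises, assume a log-link GLM with one coefficient per age band. The coefficient values below are hypothetical, chosen as the logs of the factors in the table above, and the `premium` helper is an illustrative name rather than part of any Avenue API.

```python
import math

# Hypothetical log-scale coefficients for a log-link GLM (illustration only);
# each is the natural log of the corresponding factor in the age table.
age_coefficients = {
    "16-25": math.log(1.45),
    "26-35": math.log(1.05),
    "36-50": math.log(0.95),
    "51-65": math.log(1.10),
    "65+": math.log(1.25),
}

# Exponentiating a coefficient turns it into a multiplicative factor.
age_factors = {
    band: round(math.exp(coef), 2) for band, coef in age_coefficients.items()
}

def premium(base_rate, *factors):
    """Multiply a base rate by any number of rating factors."""
    result = base_rate
    for factor in factors:
        result *= factor
    return result

for band, factor in age_factors.items():
    print(f"{band:<8}{factor:.2f}")
```

Because the coefficients add on the log scale, the exponentiated factors multiply on the premium scale, which is why the premium formula above is a product.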

The Challenge

GBMs are tree ensembles, not linear models. Previous approaches to making GBMs transparent each fall short:

  • SHAP values: Explain predictions but don't provide the underlying model
  • Linear approximation: Loses predictive accuracy
  • Distillation: Creates a new simpler model, not the original GBM

Avenue's Approach

Avenue relies on a mathematical proof that any GBM can be represented exactly as a sum of factor tables on the scale of the link function (under a log link, the sum becomes a product of factors). The key insight:

  1. Each decision tree partitions the feature space into regions (leaf nodes)
  2. Each leaf node maps to a factor contribution
  3. Tree paths define which features determine each factor
  4. Multiple trees' contributions sum together
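The first three points can be illustrated with a small sketch. It is not Avenue's implementation: the tuple-based tree representation and the `leaf_rows` helper are assumptions made for this example. Each leaf becomes one row, keyed by the split conditions on its path, and the features on that path determine which table the row belongs to.

```python
# Hypothetical tree (illustration only): a leaf is a float; an internal node is
# (feature, threshold, left, right), going left when x[feature] < threshold.
tree = ("age", 30.0, ("vehicle", 0.5, 0.10, 0.25), -0.05)

def leaf_rows(node, path=()):
    """Yield (conditions, value) for every leaf; conditions are the splits on its path."""
    if not isinstance(node, tuple):
        yield path, node
        return
    feature, threshold, left, right = node
    yield from leaf_rows(left, path + ((feature, "<", threshold),))
    yield from leaf_rows(right, path + ((feature, ">=", threshold),))

rows = list(leaf_rows(tree))
for conditions, value in rows:
    # The features on the path decide whether the row is a main effect
    # (one feature) or an interaction (several features).
    features = sorted({feature for feature, _, _ in conditions})
    print(features, conditions, value)
```

In this toy tree, the two leaves under the `vehicle` split depend on both `age` and `vehicle`, while the remaining leaf depends on `age` alone.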

By training GBMs with L0-like regularization penalties (both ensemble-wide and tree-wise), Avenue encourages sparsity in feature interactions, producing models that convert to a manageable number of interpretable factor tables.

Conversion Process

  1. Tree Decomposition: Each tree is decomposed into factor contributions based on internal node values
  2. Path Aggregation: Contributions with identical feature combinations are combined
  3. Table Consolidation: Related factors merge into coherent tables
  4. Validation: Predictions from tables match the GBM to machine precision
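The sketch below illustrates steps 2 and 4 under simplifying assumptions: trees are grouped by the set of features they split on, each group is tabulated on the grid formed by its split thresholds, and the table predictions are checked against the ensemble. The tree encoding and all function names are hypothetical, and the sketch tabulates whole trees rather than decomposing them by internal node values.

```python
from bisect import bisect_right
from collections import defaultdict
from itertools import product
import math

# Hypothetical trees: a leaf is a float; an internal node is
# (feature, threshold, left, right), going left when x[feature] < threshold.
trees = [
    ("age", 30.0, 0.2, -0.1),
    ("age", 50.0, ("vehicle", 0.5, 0.05, 0.15), -0.05),
    ("vehicle", 0.5, ("age", 30.0, 0.1, 0.0), 0.02),
]

def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if x[feature] < threshold else right
    return node

def collect_cuts(node, cuts):
    """Record every split threshold in the tree, keyed by feature."""
    if isinstance(node, tuple):
        feature, threshold, left, right = node
        cuts[feature].add(threshold)
        collect_cuts(left, cuts)
        collect_cuts(right, cuts)

def build_tables(trees):
    # Path aggregation: group trees by the exact set of features they split on.
    groups = defaultdict(list)
    for tree in trees:
        cuts = defaultdict(set)
        collect_cuts(tree, cuts)
        groups[frozenset(cuts)].append(tree)
    tables = []
    for group in groups.values():
        cuts = defaultdict(set)
        for tree in group:
            collect_cuts(tree, cuts)
        feats = sorted(cuts)
        edges = {f: sorted(cuts[f]) for f in feats}
        # One representative point per bin: below the first edge, then each edge.
        reps = {f: [edges[f][0] - 1.0] + edges[f] for f in feats}
        cells = {}
        for idx in product(*(range(len(reps[f])) for f in feats)):
            x = {f: reps[f][i] for f, i in zip(feats, idx)}
            cells[idx] = sum(predict_tree(t, x) for t in group)
        tables.append((feats, edges, cells))
    return tables

def predict_tables(tables, x):
    """Look up one cell per table and sum the contributions."""
    total = 0.0
    for feats, edges, cells in tables:
        idx = tuple(bisect_right(edges[f], x[f]) for f in feats)
        total += cells[idx]
    return total

def predict_gbm(trees, x):
    return sum(predict_tree(t, x) for t in trees)

# Validation: table predictions agree with the ensemble up to rounding error.
tables = build_tables(trees)
for age, vehicle in product([18, 30, 45, 50, 70], [0, 1]):
    x = {"age": age, "vehicle": vehicle}
    assert math.isclose(predict_gbm(trees, x), predict_tables(tables, x), abs_tol=1e-12)
```

The three toy trees collapse into two tables: one on `age` alone and one on `age` and `vehicle` together. The comparison uses a tolerance because summing the same numbers in a different order can change the last floating-point digit.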

Factor Table Types

Main Effects

Single-feature tables:

Credit Score:
600-650: 1.50
651-700: 1.20
701-750: 0.90
751+: 0.70

Interactions

Feature combination tables:

Age × Vehicle Type:
Age      Sedan   SUV
18-30    1.00    1.30
31-50    0.90    1.10
51+      0.80    1.00

Consolidation Options

When exporting, you can choose between two consolidation strategies:

ANOVA-style: Main effects and interactions in separate tables. Clearer feature attribution—you can see which effects are main vs. interaction. Best for understanding model structure.

Full consolidation: Main effects merged into interaction tables where applicable. Fewer total tables, potentially simpler implementation.

Both produce mathematically equivalent predictions. See the export guide for details.
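To see why the two strategies agree, consider this minimal sketch on the multiplicative scale. The main-effect values are hypothetical, the interaction values are taken from the Age × Vehicle Type table above, and the function names are illustrative rather than part of Avenue's export API.

```python
import math

# Hypothetical main-effect tables (illustration only).
age_main = {"18-30": 1.20, "31-50": 1.00, "51+": 0.90}
vehicle_main = {"Sedan": 1.00, "SUV": 1.15}

# Interaction table, using the Age × Vehicle Type values shown earlier.
age_x_vehicle = {
    ("18-30", "Sedan"): 1.00, ("18-30", "SUV"): 1.30,
    ("31-50", "Sedan"): 0.90, ("31-50", "SUV"): 1.10,
    ("51+", "Sedan"): 0.80, ("51+", "SUV"): 1.00,
}

def anova_predict(base, age, vehicle):
    """ANOVA-style: separate main-effect and interaction lookups."""
    return base * age_main[age] * vehicle_main[vehicle] * age_x_vehicle[(age, vehicle)]

# Full consolidation: fold both main effects into the interaction table.
consolidated = {
    (age, vehicle): age_main[age] * vehicle_main[vehicle] * factor
    for (age, vehicle), factor in age_x_vehicle.items()
}

def consolidated_predict(base, age, vehicle):
    """Full consolidation: a single lookup in the merged table."""
    return base * consolidated[(age, vehicle)]

# Both strategies give the same prediction for every cell.
for age, vehicle in age_x_vehicle:
    assert math.isclose(
        anova_predict(500.0, age, vehicle),
        consolidated_predict(500.0, age, vehicle),
        rel_tol=1e-12,
    )
```

The consolidated version needs one table instead of three, at the cost of no longer showing how much of each cell comes from a main effect and how much from the interaction.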

Example Calculation

For a 35-year-old with an SUV in an urban territory (multiplicative model with log link):

Factor Lookup:

  • Age (31-35): 0.94
  • Vehicle (SUV): 1.15
  • Territory (Urban): 1.25
  • Age×Vehicle: 1.02

Calculation:

Base Rate: $500
Premium = $500 × 0.94 × 1.15 × 1.25 × 1.02 = $689.14

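For additive models (identity link), contributions are summed instead of multiplied. With a logit link, contributions are summed on the log-odds scale and the inverse link is applied to the total.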
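The lookup and multiplication above can be reproduced in a few lines. This is a sketch with the factor values copied from the example; the dictionary keys are illustrative names only. It also shows that multiplying factors is the same as summing their logs and exponentiating, which is how a log-link model relates to an additive one.

```python
import math

base_rate = 500.0

# Factor values copied from the lookup in the example above.
factors = {
    "age_31_35": 0.94,
    "vehicle_suv": 1.15,
    "territory_urban": 1.25,
    "age_x_vehicle": 1.02,
}

# Multiplicative form: base rate times the product of all factors.
premium = base_rate * math.prod(factors.values())

# Equivalent additive form on the log scale: sum the logs, then exponentiate.
log_premium = math.log(base_rate) + sum(math.log(f) for f in factors.values())

print(f"Premium: ${premium:.2f}")
print(f"From log scale: ${math.exp(log_premium):.2f}")
```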

Automatic Feature Engineering

Compared to manual GLM development, Avenue's GBM automatically:

  • Discovers optimal binning boundaries (data-driven, not arbitrary)
  • Captures non-linearities without manual polynomial/spline specification
  • Identifies significant interactions
  • Performs feature selection through L0-like regularization

Mathematical Exactness

Avenue guarantees:

  • Predictions from factor tables match the GBM exactly (to machine precision)
  • No approximation or distillation
  • Complete transparency—the full model is disclosed

Next Steps