GBM to Factor Table Conversion
Avenue trains gradient boosting machines (GBMs) that can be exactly represented as factor tables—with zero information loss.
Background: GLM Factor Tables
Traditional GLMs with a log link produce factor tables by exponentiating the fitted coefficients. For example:
| Age Range | Factor |
|---|---|
| 16-25 | 1.45 |
| 26-35 | 1.05 |
| 36-50 | 0.95 |
| 51-65 | 1.10 |
| 65+ | 1.25 |
Premium = Base Rate × Age Factor × Territory Factor × Vehicle Factor
This format is standard for regulatory filings in insurance.
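As a minimal sketch of how such a table arises, assuming a log-link GLM: each coefficient is exponentiated to give a multiplicative factor, and the premium is the base rate times one factor per rating variable. The coefficient values and the `premium` helper are hypothetical, chosen only so the factors round to the age table above.

```python
import math

# Hypothetical log-link GLM coefficients for the age bands (illustrative values only).
age_coefficients = {
    "16-25": 0.3716,
    "26-35": 0.0488,
    "36-50": -0.0513,
    "51-65": 0.0953,
    "65+": 0.2231,
}

# With a log link, exp(coefficient) is the multiplicative factor for that band.
age_factors = {band: math.exp(beta) for band, beta in age_coefficients.items()}


def premium(base_rate, *factors):
    """Multiply the base rate by one factor per rating variable."""
    return base_rate * math.prod(factors)


for band, factor in age_factors.items():
    print(f"{band}: {factor:.2f}")
```

Calling `premium(500, age_factors["16-25"], 1.0, 1.0)` would apply only the age factor, with neutral territory and vehicle factors.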
The Challenge
GBMs are tree ensembles, not linear models. Previous approaches to making GBMs transparent each fall short:
- SHAP values: Explain predictions but don't provide the underlying model
- Linear approximation: Loses predictive accuracy
- Distillation: Creates a new simpler model, not the original GBM
Avenue's Approach
Avenue relies on a mathematical proof that any GBM can be exactly represented as a sum of factor tables (under a log link, the sum on the log scale becomes a product of factors). The key insight:
- Each decision tree partitions the feature space into regions (leaf nodes)
- Each leaf node maps to a factor contribution
- Tree paths define which features determine each factor
- Multiple trees' contributions sum together
By training GBMs with L0-like regularization penalties (both ensemble-wide and tree-wise), Avenue encourages sparsity in feature interactions, producing models that convert to a manageable number of interpretable factor tables.
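The simplest case of this idea can be sketched with depth-1 trees on a single feature. This is an illustration of the principle only, not Avenue's conversion algorithm; the `stumps` data and all function names are hypothetical. The ensemble output is constant between consecutive split thresholds, so one table row per band reproduces it.

```python
import bisect
import math

# Hypothetical ensemble of depth-1 trees (stumps) on a single feature, "age".
# Each stump is (threshold, value if age < threshold, value if age >= threshold),
# with leaf values on the log scale.
stumps = [
    (26, 0.30, -0.05),
    (51, -0.02, 0.10),
    (26, 0.05, 0.00),
]


def ensemble_predict(age):
    """Sum the leaf value that each stump selects for this age."""
    return sum(left if age < t else right for t, left, right in stumps)


# The distinct thresholds cut the feature into bands.
cuts = sorted({t for t, _, _ in stumps})


def band_representative(i):
    """Return one age that lies inside band i."""
    return cuts[i - 1] if i > 0 else cuts[0] - 1


# One summed contribution per band: this is the factor table on the log scale.
table = [ensemble_predict(band_representative(i)) for i in range(len(cuts) + 1)]


def table_predict(age):
    """Look up the band containing `age` and return its contribution."""
    return table[bisect.bisect_right(cuts, age)]


# Exponentiating gives multiplicative factors under a log link.
factors = [math.exp(v) for v in table]
```

Trees that split on two or more features would produce interaction tables in the same way, with one cell per combination of bands.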
Conversion Process
- Tree Decomposition: Each tree is decomposed into factor contributions based on internal node values
- Path Aggregation: Contributions with identical feature combinations are combined
- Table Consolidation: Related factors merge into coherent tables
- Validation: Predictions from tables match the GBM to machine precision
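The path aggregation and validation steps can be sketched as follows, assuming tree decomposition has already produced a list of terms, each tagged with the features it depends on. The `terms` structure, band labels, and function names are hypothetical; Avenue's internal representation is not described here.

```python
import math
from collections import defaultdict

# Hypothetical output of tree decomposition: (features, {cell key: contribution}).
# Terms on the same features are assumed to list them in the same order.
terms = [
    (("age",), {("young",): 0.20, ("old",): -0.10}),
    (("vehicle",), {("sedan",): 0.00, ("suv",): 0.15}),
    (("age",), {("young",): 0.05, ("old",): 0.02}),
    (("age", "vehicle"), {
        ("young", "sedan"): 0.00, ("young", "suv"): 0.04,
        ("old", "sedan"): -0.01, ("old", "suv"): 0.00,
    }),
]


def predict_from_terms(row):
    """Reference prediction: sum every term's contribution for this row."""
    return sum(cells[tuple(row[f] for f in feats)] for feats, cells in terms)


def aggregate(terms):
    """Sum terms that share a feature combination into one table each."""
    tables = defaultdict(lambda: defaultdict(float))
    for feats, cells in terms:
        for key, value in cells.items():
            tables[feats][key] += value
    return {feats: dict(cells) for feats, cells in tables.items()}


tables = aggregate(terms)


def predict_from_tables(row):
    """Prediction from the aggregated tables."""
    return sum(cells[tuple(row[f] for f in feats)] for feats, cells in tables.items())


# Validation: the tables must agree with the original terms on every cell.
rows = [{"age": a, "vehicle": v} for a in ("young", "old") for v in ("sedan", "suv")]
for row in rows:
    assert math.isclose(predict_from_terms(row), predict_from_tables(row), abs_tol=1e-12)
```

Here the two `age` terms collapse into a single table, so four terms become three tables. The comparison uses a tolerance rather than strict equality because aggregation changes the order of floating-point additions.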
Factor Table Types
Main Effects
Single-feature tables:
Credit Score:
600-650: 1.50
651-700: 1.20
701-750: 0.90
751+: 0.70
Interactions
Feature combination tables:
Age × Vehicle Type:
| Age | Sedan | SUV |
|---|---|---|
| 18-30 | 1.00 | 1.30 |
| 31-50 | 0.90 | 1.10 |
| 51+ | 0.80 | 1.00 |
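A rating engine might implement lookups against both table types roughly as below. This is a sketch using the example values above and assuming integer-valued scores and ages; the function names are hypothetical and not part of any Avenue export format.

```python
import bisect

# Main effect: lower bound of each credit-score band and its factor.
credit_lower = [600, 651, 701, 751]
credit_factor = [1.50, 1.20, 0.90, 0.70]


def credit_score_factor(score):
    """Return the factor for the band containing `score`."""
    i = bisect.bisect_right(credit_lower, score) - 1
    if i < 0:
        raise ValueError(f"score {score} is below the lowest band")
    return credit_factor[i]


# Interaction: age-band lower bounds, then one factor per vehicle type.
age_lower = [18, 31, 51]
age_vehicle_factor = [
    {"Sedan": 1.00, "SUV": 1.30},  # 18-30
    {"Sedan": 0.90, "SUV": 1.10},  # 31-50
    {"Sedan": 0.80, "SUV": 1.00},  # 51+
]


def age_vehicle(age, vehicle):
    """Return the factor for the (age band, vehicle type) cell."""
    i = bisect.bisect_right(age_lower, age) - 1
    if i < 0:
        raise ValueError(f"age {age} is below the lowest band")
    return age_vehicle_factor[i][vehicle]
```

Storing only the lower bound of each band keeps the lookup to a single binary search and makes gaps or overlaps between bands impossible by construction.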
Consolidation Options
When exporting, you can choose between two consolidation strategies:
ANOVA-style: Main effects and interactions in separate tables. Clearer feature attribution—you can see which effects are main vs. interaction. Best for understanding model structure.
Full consolidation: Main effects merged into interaction tables where applicable. Fewer total tables, potentially simpler implementation.
Both produce mathematically equivalent predictions. See the export guide for details.
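The equivalence of the two strategies can be illustrated with a small multiplicative example. The main-effect values below are hypothetical, and the merge shown is one straightforward way to consolidate, not necessarily how Avenue's exporter does it.

```python
import math

# Hypothetical ANOVA-style export: two main-effect tables and one interaction table.
age_main = {"18-30": 1.20, "31-50": 1.00, "51+": 0.95}
vehicle_main = {"Sedan": 1.00, "SUV": 1.15}
age_vehicle = {
    ("18-30", "Sedan"): 1.00, ("18-30", "SUV"): 1.30,
    ("31-50", "Sedan"): 0.90, ("31-50", "SUV"): 1.10,
    ("51+", "Sedan"): 0.80, ("51+", "SUV"): 1.00,
}


def anova_prediction(age, vehicle):
    """Three lookups: two main effects and the interaction."""
    return age_main[age] * vehicle_main[vehicle] * age_vehicle[(age, vehicle)]


# Full consolidation: fold both main effects into the interaction table.
consolidated = {
    (age, vehicle): age_main[age] * vehicle_main[vehicle] * factor
    for (age, vehicle), factor in age_vehicle.items()
}


def consolidated_prediction(age, vehicle):
    """One lookup in the merged table."""
    return consolidated[(age, vehicle)]


# Both strategies give the same factor in every cell.
for age, vehicle in age_vehicle:
    assert math.isclose(anova_prediction(age, vehicle), consolidated_prediction(age, vehicle))
```

The consolidated form replaces three tables with one, at the cost of no longer showing how much of each cell comes from the main effects.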
Example Calculation
For a 35-year-old with an SUV in an urban territory (multiplicative model with log link):
Factor Lookup:
- Age (31-35): 0.94
- Vehicle (SUV): 1.15
- Territory (Urban): 1.25
- Age×Vehicle: 1.02
Calculation:
Base Rate: $500
Premium = $500 × 0.94 × 1.15 × 1.25 × 1.02 = $689.14
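For additive models (identity or logit link), contributions are summed instead of multiplied. With a logit link, the sum is on the log-odds scale and is converted to a probability by the inverse link.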
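The calculation can be reproduced in a few lines. The multiplicative part uses the factors from the lookup above; the additive part is a sketch with hypothetical log-odds contributions, included only to show where the inverse link is applied.

```python
import math

base_rate = 500.0
factors = {
    "Age (31-35)": 0.94,
    "Vehicle (SUV)": 1.15,
    "Territory (Urban)": 1.25,
    "Age x Vehicle": 1.02,
}

# Multiplicative model (log link): multiply the base rate by every factor.
premium = base_rate * math.prod(factors.values())
print(f"Premium: ${premium:.2f}")

# The same result on the log scale: contributions are summed, then exponentiated.
log_premium = math.log(base_rate) + sum(math.log(f) for f in factors.values())
assert math.isclose(math.exp(log_premium), premium)

# Additive model with a logit link (hypothetical contributions on the log-odds scale).
intercept = -2.0
contributions = [0.30, -0.10, 0.25]
log_odds = intercept + sum(contributions)
probability = 1.0 / (1.0 + math.exp(-log_odds))
print(f"Probability: {probability:.4f}")
```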
Automatic Feature Engineering
Compared to manual GLM development, Avenue's GBM automatically:
- Discovers optimal binning boundaries (data-driven, not arbitrary)
- Captures non-linearities without manual polynomial/spline specification
- Identifies significant interactions
- Performs feature selection through L0-like regularization
Mathematical Exactness
Avenue guarantees:
- Predictions from factor tables match the GBM exactly (to machine precision)
- No approximation or distillation
- Complete transparency—the full model is disclosed