Predictive Data Outputs
All data that Predictable generates as part of the training and scoring process.
Model Train Subset
The training population of customers used to train a specific client model.
| COLUMN NAME | TYPE | DEFINITION |
|---|---|---|
| CUSTOMER_ID | VARCHAR | unique customer identifier |
Model Summary
Summary statistics for trained models. Area under the ROC curve (AUC) is the overall summary statistic.
| COLUMN NAME | TYPE | DEFINITION |
|---|---|---|
| MODEL_VERSION | FLOAT | version that model was trained on |
| TIMESTAMP | NUMBER | unix timestamp in seconds of when model was trained |
| TRAIN_ROC_AUC | FLOAT | summary statistic that evaluates overall predictive power on training set |
| TEST_ROC_AUC | FLOAT | summary statistic that evaluates overall predictive power on test set |
| TRUE_NEGATIVES | FLOAT | percentage of accurate negative predictions |
| FALSE_NEGATIVES | FLOAT | percentage of inaccurate negative predictions |
| TRUE_POSITIVES | FLOAT | percentage of accurate positive predictions |
| FALSE_POSITIVES | FLOAT | percentage of inaccurate positive predictions |
| NUMBER_OF_FEATURES | INT | total number of features processed in the model |
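
The four confusion-matrix columns and the two AUC columns can be combined into a few derived checks. The sketch below is illustrative only: it assumes the four percentages are shares of all predictions (so they sum to roughly 100), which the table does not state explicitly, and the function names are hypothetical rather than part of Predictable.

```python
def confusion_metrics(tn, fn, tp, fp):
    """Derive accuracy, precision, and recall from the four confusion-matrix values.

    Assumption: all four values are expressed on the same scale
    (for example, percentages of all predictions).
    """
    total = tn + fn + tp + fp
    if total <= 0:
        raise ValueError("confusion-matrix values must sum to a positive number")
    predicted_positive = tp + fp
    actual_positive = tp + fn
    return {
        "accuracy": (tp + tn) / total,
        # None marks a ratio that is undefined because its denominator is zero.
        "precision": tp / predicted_positive if predicted_positive else None,
        "recall": tp / actual_positive if actual_positive else None,
    }


def auc_gap(train_roc_auc, test_roc_auc):
    """Difference between training and test AUC; a large positive gap can hint at overfitting."""
    return train_roc_auc - test_roc_auc


# Hypothetical values, not real model output.
metrics = confusion_metrics(tn=60.0, fn=10.0, tp=25.0, fp=5.0)
gap = auc_gap(train_roc_auc=0.91, test_roc_auc=0.84)
```

Because the ratios only compare the four values with each other, the same code works whether they are stored as percentages or as fractions.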
Feature Importance
Lists all features in the model. A higher FEATURE_VALUES entry indicates a larger impact on the prediction.
| COLUMN NAME | TYPE | DEFINITION |
|---|---|---|
| MODEL_VERSION | FLOAT | version that model was trained on |
| TIMESTAMP | NUMBER | unix timestamp in seconds of when model was trained |
| FEATURE_NAMES | VARCHAR | name of the feature (variable) included in the model |
| FEATURE_VALUES | FLOAT | normalized score (0-100) of impact feature had on predicted outcome |
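
One way to read this table is to rank the features of the most recent training run by FEATURE_VALUES. The following is a minimal sketch that works on rows already fetched into Python dictionaries. It assumes every row written by one training run shares the same TIMESTAMP, which the table does not guarantee, and the feature names in the sample data are invented.

```python
def top_features(rows, n=5):
    """Return (name, value) pairs for the newest training run, highest impact first."""
    if not rows:
        return []
    # Assumption: rows from a single training run share one TIMESTAMP.
    newest = max(row["TIMESTAMP"] for row in rows)
    current = [row for row in rows if row["TIMESTAMP"] == newest]
    current.sort(key=lambda row: row["FEATURE_VALUES"], reverse=True)
    return [(row["FEATURE_NAMES"], row["FEATURE_VALUES"]) for row in current[:n]]


# Hypothetical rows shaped like the Feature Importance table.
feature_rows = [
    {"MODEL_VERSION": 1.0, "TIMESTAMP": 1690000000, "FEATURE_NAMES": "days_since_last_order", "FEATURE_VALUES": 70.0},
    {"MODEL_VERSION": 1.1, "TIMESTAMP": 1700000000, "FEATURE_NAMES": "days_since_last_order", "FEATURE_VALUES": 100.0},
    {"MODEL_VERSION": 1.1, "TIMESTAMP": 1700000000, "FEATURE_NAMES": "total_orders", "FEATURE_VALUES": 42.5},
    {"MODEL_VERSION": 1.1, "TIMESTAMP": 1700000000, "FEATURE_NAMES": "avg_order_value", "FEATURE_VALUES": 63.0},
]

ranked = top_features(feature_rows, n=2)
```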
Product Matrix
Output of the product recommendation model.
| COLUMN NAME | TYPE | DEFINITION |
|---|---|---|
| MODEL_VERSION | FLOAT | version that product recommendation model was trained on |
| TIMESTAMP | NUMBER | unix timestamp in seconds of when model was trained |
| PROD_OBS | VARCHAR | product that was purchased |
| PROD_REC | VARCHAR | product that is recommended |
| SCORE | FLOAT | metric that indicates the strength of the relationship between the two products; higher is better |
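
Since a higher SCORE means a stronger relationship, recommendations for a purchased product can be picked by sorting on SCORE. This is a sketch under the assumption that the rows have already been loaded as dictionaries keyed by column name; the product names and the helper `top_recommendations` are hypothetical.

```python
def top_recommendations(rows, prod_obs, k=3):
    """Return up to k PROD_REC values for one purchased product, strongest first."""
    matches = [row for row in rows if row["PROD_OBS"] == prod_obs]
    matches.sort(key=lambda row: row["SCORE"], reverse=True)
    return [row["PROD_REC"] for row in matches[:k]]


# Hypothetical rows shaped like the Product Matrix table.
product_rows = [
    {"PROD_OBS": "tent", "PROD_REC": "sleeping bag", "SCORE": 0.82},
    {"PROD_OBS": "tent", "PROD_REC": "camp stove", "SCORE": 0.64},
    {"PROD_OBS": "tent", "PROD_REC": "lantern", "SCORE": 0.71},
    {"PROD_OBS": "boots", "PROD_REC": "wool socks", "SCORE": 0.90},
]

tent_recs = top_recommendations(product_rows, "tent", k=2)
```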
Model Scores
Output of the churn, next purchase, and propensity models.
| COLUMN NAME | TYPE | DEFINITION |
|---|---|---|
| CUSTOMER_ID | STRING | the unique customer identifier, the key to join back to customer tables. |
| MODEL_VERSION | FLOAT | software version the model was trained on for auditing purposes. |
| TIMESTAMP | NUMBER | unix timestamp in seconds of when scoring occurred. Used to get the latest score for a given customer, and to filter out older scores for customers that have not been rescored. |
| SCORE | NUMBER | 0-100 normalized output of scoring. |
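
The TIMESTAMP column is what makes it possible to keep only the latest score per customer and to drop stale ones. The sketch below shows that logic on in-memory rows. The sample data, the `latest_scores` helper, and the cutoff date are assumptions for illustration; in practice the same filter would more likely be written as a query against the table.

```python
from datetime import datetime, timezone


def latest_scores(rows, min_timestamp=None):
    """Keep the most recent row per CUSTOMER_ID, optionally dropping scores older than min_timestamp."""
    latest = {}
    for row in rows:
        if min_timestamp is not None and row["TIMESTAMP"] < min_timestamp:
            continue  # stale score: the customer has not been rescored since the cutoff
        current = latest.get(row["CUSTOMER_ID"])
        if current is None or row["TIMESTAMP"] > current["TIMESTAMP"]:
            latest[row["CUSTOMER_ID"]] = row
    return latest


# Hypothetical rows shaped like the Model Scores table.
score_rows = [
    {"CUSTOMER_ID": "c1", "MODEL_VERSION": 1.0, "TIMESTAMP": 1700000000, "SCORE": 40},
    {"CUSTOMER_ID": "c1", "MODEL_VERSION": 1.0, "TIMESTAMP": 1700086400, "SCORE": 55},
    {"CUSTOMER_ID": "c2", "MODEL_VERSION": 1.0, "TIMESTAMP": 1690000000, "SCORE": 80},
]

# TIMESTAMP is in unix seconds, so a calendar cutoff is converted the same way.
cutoff = int(datetime(2023, 10, 1, tzinfo=timezone.utc).timestamp())

all_latest = latest_scores(score_rows)
recent_only = latest_scores(score_rows, min_timestamp=cutoff)
```

Here `all_latest` keeps one row for each of the two customers, while `recent_only` drops the customer whose only score predates the cutoff.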
Audiences
Predictable returns audiences in addition to scores. Audiences are exposed as views, each containing a single column:
| COLUMN NAME | TYPE | DEFINITION |
|---|---|---|
| CUSTOMER_ID | STRING | the unique customer identifier, the key to join back to customer tables. |
The audiences are of two categories:
- Standard audiences. These are high/med/low groupings for each model output, provided as a convenient predefined way to work with scores.
- Smart audiences. These are based on customer attributes, scores, or some combination thereof.
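
To make the idea of a standard audience concrete, the sketch below groups 0-100 scores into high/med/low buckets. The cutoffs (33 and 66) are placeholders chosen for illustration; the thresholds Predictable actually uses are not stated here, and the function names are hypothetical.

```python
def score_bucket(score, low_cutoff=33, high_cutoff=66):
    """Map a 0-100 score to 'low', 'med', or 'high' using placeholder cutoffs."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= high_cutoff:
        return "high"
    if score >= low_cutoff:
        return "med"
    return "low"


def build_audiences(scores_by_customer):
    """Group customer identifiers by bucket, mirroring single-column audience views."""
    audiences = {"high": [], "med": [], "low": []}
    for customer_id, score in scores_by_customer.items():
        audiences[score_bucket(score)].append(customer_id)
    return audiences


# Hypothetical customers and scores.
audiences = build_audiences({"c1": 12, "c2": 50, "c3": 91, "c4": 66})
```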
