Predictive Data Outputs

All data that Predictable generates as part of the training and scoring process.

Model Train Subset

The training population of customers used to train a specific client model.

COLUMN NAME	TYPE	DEFINITION
CUSTOMER_ID	VARCHAR	unique customer identifier

Summary statistics for trained models. Area Under the ROC Curve (AUC) is the overall summary statistic.

COLUMN NAME	TYPE	DEFINITION
MODEL_VERSION	FLOAT	version that model was trained on
TIMESTAMP	NUMBER	unix timestamp in seconds of when model was trained
TRAIN_ROC_AUC	FLOAT	summary statistic that evaluates overall predictive power on training set
TEST_ROC_AUC	FLOAT	summary statistic that evaluates overall predictive power on test set
TRUE_NEGATIVES	FLOAT	percentage of accurate negative predictions
FALSE_NEGATIVES	FLOAT	percentage of inaccurate negative predictions
TRUE_POSITIVES	FLOAT	percentage of accurate positive predictions
FALSE_POSITIVES	FLOAT	percentage of inaccurate positive predictions
NUMBER_OF_FEATURES	INT	total number of features processed in the model
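
As a rough illustration of how a row of this table might be read, the sketch below converts TIMESTAMP to a UTC datetime and compares TRAIN_ROC_AUC with TEST_ROC_AUC. The function name, the sample values, and the 0.05 gap threshold are assumptions made for this example and are not defined by Predictable; a large gap between training and test AUC is only a general hint of overfitting.

```python
from datetime import datetime, timezone


def summarize_model_row(row, max_gap=0.05):
    """Summarize one model-statistics row (hypothetical helper).

    max_gap is an illustrative threshold, not a Predictable default.
    """
    # TIMESTAMP is documented as a unix timestamp in seconds.
    trained_at = datetime.fromtimestamp(row["TIMESTAMP"], tz=timezone.utc)
    # A training AUC far above the test AUC can indicate overfitting.
    gap = row["TRAIN_ROC_AUC"] - row["TEST_ROC_AUC"]
    return {
        "trained_at": trained_at,
        "auc_gap": gap,
        "possible_overfit": gap > max_gap,
    }


# Made-up sample row using the documented column names.
row = {
    "MODEL_VERSION": 1.2,
    "TIMESTAMP": 1700000000,
    "TRAIN_ROC_AUC": 0.91,
    "TEST_ROC_AUC": 0.84,
}
summary = summarize_model_row(row)
print(summary["trained_at"].isoformat(), summary["possible_overfit"])
```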

Lists all features in the model. A higher FEATURE_VALUES score indicates a larger impact on the prediction.

COLUMN NAME	TYPE	DEFINITION
MODEL_VERSION	FLOAT	version that model was trained on
TIMESTAMP	NUMBER	unix timestamp in seconds of when model was trained
FEATURE_NAMES	VARCHAR	name of the feature (variable) included in the model
FEATURE_VALUES	FLOAT	normalized score (0-100) of impact feature had on predicted outcome
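
A minimal sketch of how the feature rows could be ranked by impact, assuming they have already been loaded as dictionaries keyed by the column names above. The helper name and the feature names in the sample data are hypothetical.

```python
def top_features(rows, n=3):
    """Return the n rows with the highest FEATURE_VALUES (hypothetical helper)."""
    return sorted(rows, key=lambda r: r["FEATURE_VALUES"], reverse=True)[:n]


# Made-up rows; real feature names depend on the client model.
feature_rows = [
    {"FEATURE_NAMES": "days_since_last_order", "FEATURE_VALUES": 100.0},
    {"FEATURE_NAMES": "email_opens_30d", "FEATURE_VALUES": 12.0},
    {"FEATURE_NAMES": "total_orders", "FEATURE_VALUES": 62.5},
    {"FEATURE_NAMES": "avg_order_value", "FEATURE_VALUES": 48.3},
]

for r in top_features(feature_rows, n=2):
    print(r["FEATURE_NAMES"], r["FEATURE_VALUES"])
```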

Output of the product recommendation model.

COLUMN NAME	TYPE	DEFINITION
MODEL_VERSION	FLOAT	version that product recommendation model was trained on
TIMESTAMP	NUMBER	unix timestamp in seconds of when model was trained
PROD_OBS	VARCHAR	product that was purchased
PROD_REC	VARCHAR	product that is recommended
SCORE	FLOAT	metric that indicates the strength of the relationship between the purchased and recommended products; higher is better
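
One way to use this output is to keep the highest-scoring PROD_REC for each PROD_OBS. The sketch below assumes the rows are already loaded as dictionaries; the helper name, product names, and score values are invented for illustration.

```python
def best_recommendations(rows):
    """Map each PROD_OBS to its highest-scoring PROD_REC (hypothetical helper)."""
    best = {}
    for r in rows:
        current = best.get(r["PROD_OBS"])
        # Keep the row with the largest SCORE for each purchased product.
        if current is None or r["SCORE"] > current["SCORE"]:
            best[r["PROD_OBS"]] = r
    return {obs: r["PROD_REC"] for obs, r in best.items()}


# Made-up sample rows using the documented column names.
rec_rows = [
    {"PROD_OBS": "shoes", "PROD_REC": "laces", "SCORE": 0.3},
    {"PROD_OBS": "shoes", "PROD_REC": "socks", "SCORE": 0.8},
    {"PROD_OBS": "tent", "PROD_REC": "sleeping bag", "SCORE": 0.6},
]
recommendations = best_recommendations(rec_rows)
print(recommendations)
```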

Output of the churn, next purchase, and propensity models.

COLUMN NAME	TYPE	DEFINITION
CUSTOMER_ID	STRING	the unique customer identifier, the key to join back to customer tables.
MODEL_VERSION	FLOAT	software version the model was trained on for auditing purposes.
TIMESTAMP	NUMBER	unix timestamp in seconds of when scoring occurred. Used to get the latest score for a given customer, and to filter out older scores for customers that have not been rescored.
SCORE	NUMBER	0-100 normalized output of scoring.
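
The TIMESTAMP usage described above (latest score per customer, with stale scores filtered out) can be sketched as follows. This example uses an in-memory SQLite database purely so that it is runnable; the table name `scores`, the helper name, and the sample rows are assumptions, and the SQL dialect of the actual warehouse may differ.

```python
import sqlite3


def latest_scores(conn, min_timestamp=0):
    """Return the most recent score per customer (hypothetical helper).

    Customers whose latest score is older than min_timestamp are dropped.
    """
    query = """
        SELECT s.CUSTOMER_ID, s.SCORE, s."TIMESTAMP"
        FROM scores AS s
        JOIN (
            SELECT CUSTOMER_ID, MAX("TIMESTAMP") AS latest_ts
            FROM scores
            GROUP BY CUSTOMER_ID
        ) AS m
          ON s.CUSTOMER_ID = m.CUSTOMER_ID AND s."TIMESTAMP" = m.latest_ts
        WHERE m.latest_ts >= ?
        ORDER BY s.CUSTOMER_ID
    """
    return conn.execute(query, (min_timestamp,)).fetchall()


# Hypothetical table mirroring the documented columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE scores (CUSTOMER_ID TEXT, MODEL_VERSION REAL, '
    '"TIMESTAMP" INTEGER, SCORE INTEGER)'
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?)",
    [
        ("c1", 1.0, 1700000000, 40),  # older score for c1
        ("c1", 1.0, 1700086400, 55),  # c1 rescored one day later
        ("c2", 1.0, 1690000000, 80),  # c2 has not been rescored since
    ],
)

latest = latest_scores(conn)
recent = latest_scores(conn, min_timestamp=1700000000)
print(latest)
print(recent)
```

Joining on the per-customer maximum timestamp keeps the sketch portable across SQL engines; a window function such as ROW_NUMBER would be an equally reasonable choice where it is available.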

Predictable returns audiences in addition to scores. Audiences are exposed as views, each containing a single column:

COLUMN NAME	TYPE	DEFINITION
CUSTOMER_ID	STRING	the unique customer identifier, the key to join back to customer tables.

The audiences are of two categories:

Standard audiences. These are high/med/low groupings for each model output, provided as a convenient, predefined way to work with scores.
Smart audiences. These are based on customer attributes, scores, or some combination thereof.
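
To make the idea of a high/med/low grouping concrete, the sketch below buckets a 0-100 score into three groups. The cut-off values (33 and 66) and the function name are assumptions chosen for illustration only; the thresholds Predictable actually uses for standard audiences are not stated here.

```python
def standard_audience(score, low_max=33, med_max=66):
    """Bucket a 0-100 score into low/med/high (hypothetical thresholds)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= low_max:
        return "low"
    if score <= med_max:
        return "med"
    return "high"


# Made-up customer scores.
customer_scores = {"c1": 10, "c2": 50, "c3": 90}
audiences = {cid: standard_audience(s) for cid, s in customer_scores.items()}
print(audiences)
```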