Generalized Data Inputs
Our models expect data in the following formats for ingestion. The required and optional data sources are as follows:
Data Source | Data Description | Optional/Required |
---|---|---|
Conversion Data | A table where each row represents a conversion event for a customer | Required |
Outbound Events | A table where each row is an outbound event originating from the marketer. These data are usually messaging data (e.g., email, SMS) | Optional |
Inbound Events | A table where each row is an inbound event collected by the marketer about a customer engagement. These data are usually website activity data | Optional |
Customer Attributes | A table where each row represents a customer and each dimension is an attribute describing the customer. These could be loyalty status or demographics data. | Optional |
Conversion Events
All conversion events with their relevant associated data. This table should be at the transaction level: one row per transaction. For transactions with multiple items purchased, pass the product SKUs as an array in the product (OBJECT) column.
Metric | Type | Description | Optional/Required |
---|---|---|---|
CUSTOMER_ID | VARCHAR | Customer identifier from client | Required |
MAGNITUDE | FLOAT | Magnitude of customer activity (e.g., spend, donation, duration). This value should be > 0. | Required |
OBJECT | STRING | What the customer spent money on (e.g., product SKU) | Required |
TS | INT | Timestamp of conversion event | Required |
EVENT_ATTRIBUTE | VARCHAR | Other data describing the event (e.g., retail vs. e-commerce, discount code) | Optional |
OBJECT_ATTRIBUTE | VARCHAR | Other data describing the object of conversion | Optional |
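
As a rough illustration of the Conversion Events schema, the sketch below checks one row before it is sent. The column names come from the table above; the function name `validate_conversion_row`, the sample values, the use of epoch seconds for TS, and the reading of OBJECT as the column that holds the SKU array are all assumptions made for the example, not a documented validation routine.

```python
from typing import Any, Dict, List

# Required columns as listed in the Conversion Events table.
REQUIRED_CONVERSION_COLUMNS = ("CUSTOMER_ID", "MAGNITUDE", "OBJECT", "TS")


def validate_conversion_row(row: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one conversion row (empty if none)."""
    problems = []
    for column in REQUIRED_CONVERSION_COLUMNS:
        if row.get(column) is None:
            problems.append(f"missing required column {column}")

    # MAGNITUDE should be a number greater than 0.
    magnitude = row.get("MAGNITUDE")
    if magnitude is not None:
        if not isinstance(magnitude, (int, float)) or magnitude <= 0:
            problems.append("MAGNITUDE must be a number greater than 0")

    # TS is typed INT in the table; the unit is not specified there.
    ts = row.get("TS")
    if ts is not None and not isinstance(ts, int):
        problems.append("TS must be an integer timestamp")
    return problems


# Hypothetical transaction with two items: SKUs passed as an array in OBJECT.
example_row = {
    "CUSTOMER_ID": "c-001",
    "MAGNITUDE": 59.98,
    "OBJECT": ["SKU-123", "SKU-456"],
    "TS": 1700000000,  # assumed to be epoch seconds
    "EVENT_ATTRIBUTE": "e-commerce",
}

print(validate_conversion_row(example_row))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a row at once.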
Engagement Events
This table should contain one engagement event per row, per customer. Engagements in this table should be limited to actions taken by a customer: include clicks, pageviews, scrolls, bounces, impressions, etc., but do not include opens, sends, or deliveries. This data source is typically e-mail click engagement data or website engagement data. While the structure remains the same across engagement sources, each source should be sent to Predictable as a separate table (e.g., one table for e-mail events and a separate table for website events). This data is optional, but providing it will maximize the predictive power of the models.
Metric | Type | Description | Optional/Required |
---|---|---|---|
CUSTOMER_ID | VARCHAR | Customer identifier from client | Required |
TS | INT | Timestamp of engagement event | Required |
LABEL | STRING | Description of event (e.g., name of campaign, page URL) | Optional |
TYPE | STRING | Describe the type of event (e.g., email, sms, view, scroll, unsubscribe, etc.) | Optional |
EVENT_ATTRIBUTE | VARCHAR | Other data describing the event (e.g., utm_source, utm_medium, utm_campaign, campaign_type, etc.) | Optional |
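
The rule that this table holds only customer actions can be sketched as a simple filter. The helper name `keep_customer_actions` and the exact TYPE strings in `EXCLUDED_TYPES` are assumptions for illustration; the actual labels depend on how your messaging or web platform names its events.

```python
from typing import Any, Dict, List

# Event types the description says to leave out because they are not
# actions taken by the customer. These string values are assumed labels.
EXCLUDED_TYPES = {"open", "send", "delivered"}


def keep_customer_actions(events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop events whose TYPE matches an excluded, non-customer action."""
    return [
        event
        for event in events
        if str(event.get("TYPE", "")).lower() not in EXCLUDED_TYPES
    ]


# Hypothetical e-mail engagement rows.
sample_events = [
    {"CUSTOMER_ID": "c-001", "TS": 1700000000, "TYPE": "click", "LABEL": "spring_sale"},
    {"CUSTOMER_ID": "c-001", "TS": 1700000050, "TYPE": "Open", "LABEL": "spring_sale"},
    {"CUSTOMER_ID": "c-002", "TS": 1700000100},  # TYPE is optional, so the row is kept
]

print(keep_customer_actions(sample_events))
```

Because TYPE is optional, rows without it pass through unchanged; filter those upstream if their origin is unknown.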
Customer Attributes
A table where each row represents a customer and each dimension is an attribute describing the customer. This table should contain only one row per customer.
Metric | Type | Description | Optional/Required |
---|---|---|---|
CUSTOMER_ID | VARCHAR | Customer identifier from client | Optional |
ATTRIBUTE | VARCHAR | Description of attribute (e.g., customer type, loyalty status, demographics, 3rd party data) | Optional |
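
The one-row-per-customer requirement can be checked before upload with a short sketch like the following. The function name `duplicated_customer_ids` and the sample rows are hypothetical; only the column names come from the table above.

```python
from collections import Counter
from typing import Any, Dict, List


def duplicated_customer_ids(rows: List[Dict[str, Any]]) -> List[str]:
    """Return CUSTOMER_ID values that appear on more than one row, sorted."""
    counts = Counter(
        row["CUSTOMER_ID"] for row in rows if row.get("CUSTOMER_ID") is not None
    )
    return sorted(cid for cid, n in counts.items() if n > 1)


# Hypothetical attribute rows; customer "c-001" appears twice.
sample_rows = [
    {"CUSTOMER_ID": "c-001", "ATTRIBUTE": "gold"},
    {"CUSTOMER_ID": "c-002", "ATTRIBUTE": "silver"},
    {"CUSTOMER_ID": "c-001", "ATTRIBUTE": "basic"},
]

print(duplicated_customer_ids(sample_rows))
```

An empty result means the table already has at most one row per customer.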