Not All Predictive Models are Created Equal

Two key ingredients are required for efficient, highly predictive models – data and analytics. While the data is crucial, the algorithms and analytics behind the predictive models are the engines that do most of the heavy lifting and differentiate good predictions from great predictions.

At Lattice, we take a three-step approach to machine learning to optimize for the best models for predictive marketing and sales. Here’s an overview of our approach.

Feature Selection

Statistical models perform best when they are built on a well-chosen set of attributes (or “features”). Lattice has therefore developed sophisticated techniques for selecting the right mix of internal and external data attributes for inclusion in each model.

It is typical to have thousands of candidate attributes that could potentially be included in the pattern matching algorithms. To start, we apply various statistical techniques to determine which attributes should be retained and which should be discarded. Our models also create derived attributes, which transform the raw data in a native attribute into a form that is more meaningful in a predictive model. For example, the founding date of a company is used as the basis for a derived attribute called “company age” that is likely far more predictive than the founding date itself.
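
As a rough illustration of the “company age” example, here is a minimal Python sketch of deriving an attribute from a raw one. The account records, field names, and the `company_age` helper are hypothetical and chosen only for illustration; they are not Lattice's actual implementation.

```python
from datetime import date


def company_age(founding_date: date, as_of: date) -> int:
    """Return the whole number of years between founding_date and as_of."""
    years = as_of.year - founding_date.year
    # Subtract a year if this year's anniversary has not happened yet.
    if (as_of.month, as_of.day) < (founding_date.month, founding_date.day):
        years -= 1
    return years


# Hypothetical account records holding the raw "founded" attribute.
accounts = [
    {"name": "Acme", "founded": date(1999, 6, 1)},
    {"name": "Initech", "founded": date(2012, 11, 15)},
]

# Add the derived attribute alongside the raw one.
as_of = date(2014, 3, 19)
for account in accounts:
    account["company_age"] = company_age(account["founded"], as_of)
```

A model can then use `company_age` directly as a numeric feature, which is easier to compare across accounts than a raw date.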

Data Normalization

While data represents a key input into any predictive algorithm, data can take many shapes and forms. Some attributes like “number of email opens” or “annual spend” are relatively straightforward to mine, whereas attributes like “job title” or “geography” need preprocessing before they can truly shine.

Lattice uses unsupervised machine learning techniques to segment attributes as well as data normalization algorithms to ensure uniformity across attribute values.
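
To make the idea of normalization concrete, the sketch below shows two simple preprocessing steps in Python: mapping free-text job titles to a coarse seniority bucket, and rescaling a numeric attribute such as annual spend to a common range. Note that it swaps in a hand-written keyword lookup where the text describes unsupervised learning, and that the `TITLE_LEVELS` mapping and function names are assumptions made for illustration only.

```python
import re

# Hypothetical keyword-to-bucket mapping, checked in order of seniority.
TITLE_LEVELS = [
    ("c-level", ("chief", "ceo", "cfo", "cto", "cmo")),
    ("vp", ("vp", "vice president")),
    ("director", ("director",)),
    ("manager", ("manager",)),
]


def normalize_title(raw_title: str) -> str:
    """Map a free-text job title to a coarse seniority bucket."""
    # Lowercase, drop punctuation (so "V.P." becomes "vp"), collapse spaces.
    cleaned = re.sub(r"[^a-z\s-]", "", raw_title.lower())
    padded = " " + " ".join(cleaned.split()) + " "
    for level, keywords in TITLE_LEVELS:
        # Match whole words only, so "director" does not match "cto".
        if any(f" {keyword} " in padded for keyword in keywords):
            return level
    return "other"


def min_max_scale(values):
    """Rescale numeric values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


titles = ["V.P., Marketing", "Sr. Director of Sales", "Chief Marketing Officer"]
buckets = [normalize_title(t) for t in titles]
scaled_spend = min_max_scale([10, 20, 30])
```

After these steps, differently worded titles fall into the same category and numeric attributes share a scale, which is the kind of uniformity across attribute values described above.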

Model Execution

The real value of machine learning comes out when the models are finally selected and launched. Lattice has revolutionized the use of machine learning by making it simple enough for a business user to build and launch, but sophisticated enough to provide enterprise-grade insights.

Lattice has gathered years of learning through customer engagements to create pre-built models that already incorporate the optimal predictive algorithms for the use case at hand. For example, a cross-sell or upsell propensity model may require a dramatically different type of algorithm than a lead scoring or win-back model.

Here is a sample of the techniques Lattice incorporates into its predictive models.

Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable. It can be resource-intensive, consuming a great deal of memory on a large data set, but it is very stable and works particularly well when you have continuous features or attributes such as revenue data.

Decision trees are very powerful algorithms that help identify the best predictors. They are intuitive to analyze and usually produce great results when applied to a mixture of categorical attributes (e.g., SIC code, industry vertical, location) and numerical attributes.

Random forests are one of the techniques behind the recommendation engine in Netflix and also a popular technique in the Hadoop framework. The main idea is to build a forest of many decision trees over different variations of the same data set and take weighted averages of the results. This technique is very powerful because it can effectively identify patterns across a large, noisy data set. It is very computationally expensive, but it can easily be run in parallel.

Neural networks are a composition of neurons combined together to describe a data set. While machine-intensive, they are very powerful when you try to describe events that are non-linear (for instance, a sales campaign that spans multiple market segments). Neural networks are typically used to identify very complex patterns.

K-means clustering can be very useful for prospecting. Take, for example, the numerous existing customers and potential prospects in your CRM software. Clustering allows you to find similarities between accounts and rank them according to the degree of similarity.

Naïve Bayes is a probabilistic classifier. It is very useful in identifying patterns and behaviors of an account for cross-sell and upsell purposes. For example, if an account bought products A and B, the probability of it buying product C may be very high.
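
To illustrate the Naïve Bayes example above (an account bought A and B, so how likely is it to buy C?), here is a minimal sketch of a Bernoulli naive Bayes estimate in plain Python. The purchase history is made-up toy data and the function name is hypothetical; it shows the general technique under those assumptions, not the model Lattice ships.

```python
def naive_bayes_probability(history, features, target, observed):
    """Estimate P(target purchased | observed purchases) with naive Bayes.

    history  -- list of sets, each holding the products one account bought
    features -- products used as predictors
    target   -- product whose purchase we want to predict
    observed -- set of predictor products the account in question has bought
    """
    scores = {}
    for label in (True, False):
        # Accounts that did (label=True) or did not (label=False) buy target.
        group = [h for h in history if (target in h) == label]
        # Class prior with add-one (Laplace) smoothing.
        score = (len(group) + 1) / (len(history) + 2)
        for feature in features:
            bought = sum(1 for h in group if feature in h)
            p_feature = (bought + 1) / (len(group) + 2)
            # "Naive" step: treat each feature as independent given the class.
            score *= p_feature if feature in observed else (1 - p_feature)
        scores[label] = score
    # Normalize the two class scores into a probability.
    return scores[True] / (scores[True] + scores[False])


# Toy purchase history: one set of purchased products per account.
history = [
    {"A", "B", "C"}, {"A", "B", "C"}, {"A", "B", "C"}, {"A", "C"},
    {"A", "B"}, {"B"}, {"A"}, set(),
]

p_with_a_and_b = naive_bayes_probability(history, ["A", "B"], "C", {"A", "B"})
p_with_neither = naive_bayes_probability(history, ["A", "B"], "C", set())
```

On this toy data, the estimate for an account that bought both A and B works out to 20/29 (about 0.69), compared with 2/11 (about 0.18) for an account that bought neither, so the first account would rank higher for a cross-sell of C.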

It is not enough simply to have a broad, high-quality set of buying signals or a single great predictive algorithm. By selecting the right attributes, normalizing the data, and then applying the most effective, proven predictive model to it, machine learning can produce highly predictive results, with insights and recommendations that directly drive business improvements.

Image Credit(s): Carl Bottiger

Written by

Rob Bois
March 19, 2014