Advanced Market Mix Modeling Techniques

Published Date: 02 Nov 2017

3rd IIMA International Conference On

Advanced Data Analysis, Business Analytics and Intelligence

April 13-14 2013

[1]: Manager, AbsolutData Intelligent Analytics [Email: [email protected]]

[2]: Consultant, AbsolutData Intelligent Analytics [Email: [email protected]]

[3]: Analyst, AbsolutData Intelligent Analytics [Email: [email protected]]

ABSTRACT

Marketing effectiveness analysis is used globally by many companies to answer questions related to the four P's of marketing, to make informed business decisions, and to optimize marketing efforts in times of increasing cost pressure and diminishing budgets. Regression modeling through Ordinary Least Squares (OLS) has been the traditional technique of choice so far.

In this paper we study the practical limitations and statistical challenges of OLS and explore advanced market mix modeling techniques, namely pooled regression, fixed effects modeling, Hierarchical Bayesian regression and Proc Mixed in SAS, which can overcome these challenges.

From our study we have designed a framework, Absolutdata Market Mix Modeling Choice of Technique (ADT MMM COT), that helps marketers and data analysts choose the most suitable regression technique for a given business case. The framework evaluates and compares the techniques on several business and statistical parameters: data availability, business questions to be answered, business assumptions, cost and effort estimation, and other advantages and limitations. We conclude that when results are desired at an aggregate level and time-series cross-sectional data is available, pooled regression and fixed effects modeling work better. Advanced multilevel modelling techniques such as Hierarchical Bayesian regression or Proc Mixed in SAS can be applied to get results at deeper granularity within limited time and effort for thorough decision-making.

Key Words: Regression Modeling, Multilevel modeling, Bayesian Methods, Marketing models, Data analysis in Retailing

INTRODUCTION

Globally, many companies use Marketing Effectiveness (ME) analytics to improve business decisions about the performance of marketing mix elements (primarily the 4Ps: Product, Price, Place and Promotion). Gauging the performance of, and returns from, marketing elements is increasingly important in times of limited marketing budgets and competitive profit margins. Marketing Effectiveness analytics is typically used to answer business questions such as the following, across diverse SKUs, brands, regions, retailers and so on:

What are key Volume growth/decline drivers? (Key driver analysis)

What is the sensitivity of volume to Price changes? (Price Elasticity)

Which marketing elements of the competition impact the brand's volume?

How responsive is volume to media and promotions, and what are their Returns On Investment (ROI)?

What incrementality do new products drive for the brand?

In the current knowledge environment, the most common technique of choice is linear regression estimated by Ordinary Least Squares (OLS). Easily understood and widely applied, OLS has traditionally been used to answer marketing effectiveness questions. However, it has limitations and poses challenges that can be classified as follows:

Business challenges: OLS allows us to model only one group at a time, because a model can have only one dependent variable. This is a limitation when results are desired at disaggregated levels for many groups, for example a portfolio of brands, SKUs, retailers or regions. For instance, if the objective is to find the price elasticity and media effectiveness for 10 Cadbury brands (Cadbury Dairy Milk, Eclairs, Silk, Gems etc.), we would need to build 10 different models and reconcile them manually to get an overall picture for Cadbury, which is time-consuming and can become extremely cumbersome.

Statistical limitations: OLS rests on assumptions such as normality and the absence of multicollinearity. However, data available in real-life scenarios rarely fulfils the textbook assumptions. For example, promotions are often supported by increased distribution and media presence, which results in multicollinearity among marketing elements. In the example above, some groups may also be new launches with too few data points, which can render the modelling process ineffective: a robust model needs sufficient data points, roughly two years of weekly data.
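The multicollinearity concern can be checked before a model is built. The following is a minimal sketch, assuming NumPy is available and using synthetic data; the variable names (`promo`, `distribution`, `media`) are hypothetical and not taken from the study. It computes the variance inflation factor (VIF) of each predictor, where a high value indicates that the predictor is largely explained by the others.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_obs x n_predictors)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()           # R-squared of predictor j on the rest
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Synthetic weekly data (104 weeks): distribution support moves with promotions
rng = np.random.default_rng(0)
promo = rng.normal(size=104)
distribution = 0.9 * promo + 0.1 * rng.normal(size=104)
media = rng.normal(size=104)                        # unrelated to the other two

vifs = vif(np.column_stack([promo, distribution, media]))
print(dict(zip(["promo", "distribution", "media"], vifs.round(1))))
```

In this synthetic case the promotion and distribution variables show very large VIFs while media stays close to 1. A VIF above about 10 is a common rule of thumb for problematic collinearity, not a threshold stated in this paper.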

The objective of this paper is to evaluate existing advanced regression techniques used for marketing effectiveness models, namely pooled regression, fixed effects modeling, multilevel modeling through Proc Mixed in SAS and Hierarchical Bayesian regression, with respect to technical and business parameters. The intent is to develop guidelines on the choice of technique for marketing mix models and on the application of linear data transformations, such as standardization versus scaling, where they are needed in light of business and statistical requirements.

Methodology

The basis for the study was established by referring to technical papers, research by other authors, textbooks on statistics and regression such as Wooldridge (2009), Gujarati and Sangeetha (2007) and Park (2009), and other secondary sources. Together with Absolutdata's internal knowledge base on market mix modeling, this provided the background, helped us understand current practice in market mix modelling, and gave us insight into what needs to be explored further.

The OLS technique was evaluated for its applications and limitations, to help identify market mix modeling techniques better suited to restrictive situations. The parameters used to compare the techniques were classified into two categories:

Business logic, which means capturing the real business environment in the model. This includes:

having the relevant variables in the model with the correct intuitive signs

coefficient and elasticity values that make sense and fall within benchmark ranges

the ability of the technique to answer the desired business questions with the available data at the desired level of aggregation

Statistical robustness, which is determined by model fit, R-squared and MAPE (mean absolute percentage error) measures.
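The two fit measures named above can be computed directly from actual and fitted volumes. The following is a minimal sketch using only the Python standard library; the sample numbers are made up for illustration and do not come from the validation cases.

```python
def r_squared(actual, predicted):
    """Share of the variance in actual values explained by the fitted values."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no zero actuals)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical weekly volumes and model fits
actual_volume = [100.0, 120.0, 90.0, 110.0]
fitted_volume = [98.0, 123.0, 93.0, 108.0]

print(round(r_squared(actual_volume, fitted_volume), 3))
print(round(mape(actual_volume, fitted_volume), 2))
```

A higher R-squared and a lower MAPE both indicate a closer fit. Neither measure says anything about whether the coefficients make business sense, which is why the two evaluation categories are assessed together.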

Each technique was tested on data from leading CPG companies, either in live projects or in validation case studies, by comparing its results with those of the earlier OLS method. The purpose of the validation exercise was to substantiate our findings with numbers.

Because business today is a mixture of art and science, care was taken to balance the two evaluation categories. The client's (brand team's) inputs were taken into consideration while finalizing the models, as the brand team understands its business best and lives with it day in and day out.

Since the independent variables in a model vary in character and scale, we also explored the scope for linear data transformations such as scaling and standardization to derive more meaningful results from the regression process.

Introduction to different techniques

Regression techniques for market mix modeling have long been used to answer the basic questions around effectiveness and optimization. While the end use of all of them is the same, namely to answer the business questions, they differ in how they can best be used to save time and resources and to build statistically robust models while overcoming the challenges described above. The results of the research are summarized here, starting with a brief introduction to each technique in this section, followed in the next section by a tabular comparison on parameters such as data requirements, business questions answered, advantages and limitations. The techniques explored are listed below:

Pooled Regression - Pooled regression is usually carried out on panel data (time-series cross-sectional data), that is, data with observations over time for several different groups or 'cross-sections'. Time-series regression models typically need a sufficient history of data, preferably 2 years, to yield robust results. If less than 2 years of data is available, but for multiple groups such as stores or similar products, a "pooled" model can still be built by combining time-series observations across the groups to get results at the aggregated level (Joseph Joy 2010).
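The stacking step can be sketched as follows. This is an illustration under stated assumptions rather than the setup used in the study: the data is synthetic, NumPy is assumed to be available, and a log-log specification (log volume on log price) is chosen only so that the slope can be read as a price elasticity.

```python
import numpy as np

rng = np.random.default_rng(42)
n_groups, n_weeks = 10, 50                   # 10 groups with 50 weeks each
true_intercept, true_elasticity = 5.0, -1.5  # shared by all groups (pooling assumption)

log_price_blocks, log_volume_blocks = [], []
for _ in range(n_groups):
    log_price = rng.normal(loc=1.0, scale=0.2, size=n_weeks)
    noise = rng.normal(scale=0.1, size=n_weeks)
    log_price_blocks.append(log_price)
    log_volume_blocks.append(true_intercept + true_elasticity * log_price + noise)

# Stack the groups one below the other to form a single data set
x = np.concatenate(log_price_blocks)
y = np.concatenate(log_volume_blocks)

# Ordinary least squares on the stacked data: one intercept, one slope
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
pooled_intercept, pooled_elasticity = coef

print(X.shape[0], round(float(pooled_elasticity), 2))
```

The model is estimated on 500 rows even though no single group has more than 50, and it returns one set of coefficients for all groups, which is why the results are available only at the aggregated level.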

Fixed effects - The fixed effects model adds group-level dummy variables to pooled OLS. The groups that are pooled together are assumed to come from the same population with similar responsiveness, so the coefficients are the same for all groups and the results are reported only at the aggregated level. The intercept term, however, differs by group and is represented by the group-level dummy variable. This technique is used when similarly performing groups from the same population differ in size, some being small scale and some large scale.
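The dummy-variable construction can be sketched as below, again on synthetic data with NumPy and an assumed log-log form; the group intercepts and the slope are hypothetical values chosen for the illustration. One dummy is created for every group except a base group, so each dummy coefficient measures that group's intercept relative to the base.

```python
import numpy as np

rng = np.random.default_rng(7)
n_groups, n_weeks = 4, 50
true_slope = -1.5                                   # common responsiveness
group_intercepts = np.array([3.0, 5.0, 7.0, 9.0])   # different base volume levels

group_id = np.repeat(np.arange(n_groups), n_weeks)
n_obs = n_groups * n_weeks
log_price = rng.normal(loc=1.0, scale=0.2, size=n_obs)
log_volume = (group_intercepts[group_id] + true_slope * log_price
              + rng.normal(scale=0.1, size=n_obs))

# One dummy column per group except group 0, which is the base group
dummies = (group_id[:, None] == np.arange(1, n_groups)[None, :]).astype(float)

# Pooled OLS with a common slope and group-specific intercept shifts
X = np.column_stack([np.ones(n_obs), log_price, dummies])
coef, *_ = np.linalg.lstsq(X, log_volume, rcond=None)
base_intercept, slope = coef[0], coef[1]
intercept_shifts = coef[2:]                         # relative to the base group

print(round(float(slope), 2), intercept_shifts.round(2))
```

The slope is still a single aggregated estimate, while the intercept shifts absorb the differences in average volume between the groups.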

Proc Mixed in SAS - Proc Mixed fits a mixed model, that is, a model containing both fixed effects and random effects. It is used when results are wanted for each brand, SKU, retailer or region. In SAS, the regression equation is specified in the MODEL statement, with the dependent variable first, followed by an equal sign and the set of independent variables. In a mixed linear model the data is permitted to exhibit both correlation and non-constant variability (Knowledge Base - SAS).
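For reference, the mixed linear model can be written in its standard matrix form. The notation below is the conventional one and is an assumption here, since the paper does not define symbols: $\mathbf{X}$ holds the fixed-effect variables, $\mathbf{Z}$ the random-effect design, and $\boldsymbol{\gamma}$ and $\boldsymbol{\varepsilon}$ are taken to be independent.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \boldsymbol{\varepsilon}, \\
\boldsymbol{\gamma} &\sim N(\mathbf{0}, \mathbf{G}), \qquad
\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \mathbf{R}), \\
\operatorname{Var}(\mathbf{y}) &= \mathbf{Z}\mathbf{G}\mathbf{Z}^{\top} + \mathbf{R}.
\end{aligned}
```

The covariance matrices $\mathbf{G}$ and $\mathbf{R}$ are what allow the observations to be correlated and to have non-constant variance. In a market mix setting, group-specific deviations in the intercept or in a price coefficient would sit in $\boldsymbol{\gamma}$.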

Hierarchical Bayesian Regression (HB Reg) - HB Reg estimates a model for hierarchical or multilevel data using Bayesian methods (Bayes' theorem), with estimation based on Markov chain Monte Carlo (MCMC) algorithms. Bayesian inference is a method of inference in which Bayes' rule updates the probability of an event as additional evidence is learned. While estimating the results for one group, the method therefore borrows information from the other groups as well. It can thus estimate group parameters even when there are more parameters than observations per group (Sawtooth 2003).
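The idea of borrowing information can be illustrated with a much simpler calculation than full HB estimation. The sketch below is not MCMC and is not the algorithm used by the HB Reg software: it assumes a normal-normal model in which the population-level mean and variance of the coefficient are already known, and all numbers and SKU names are hypothetical. Under those assumptions the posterior mean of a group coefficient is a precision-weighted average of the group's own estimate and the population mean.

```python
def shrink(estimate, sampling_var, prior_mean, prior_var):
    """Posterior mean of a group coefficient under a normal-normal model."""
    precision_data = 1.0 / sampling_var   # weight on the group's own estimate
    precision_prior = 1.0 / prior_var     # weight on the population-level mean
    return (precision_data * estimate + precision_prior * prior_mean) / (
        precision_data + precision_prior)

# Hypothetical population-level price elasticity across the portfolio
prior_mean, prior_var = -1.5, 0.25

# (own estimate, sampling variance): the new launch has very few data points
groups = {
    "established_sku": (-1.2, 0.01),
    "new_launch_sku": (0.4, 1.0),
}

posterior = {name: shrink(est, var, prior_mean, prior_var)
             for name, (est, var) in groups.items()}
for name, value in posterior.items():
    print(name, round(value, 3))
```

The well-measured SKU keeps an estimate close to its own data, while the noisy, wrong-signed estimate of the new launch is pulled strongly towards the portfolio level. In a full hierarchical Bayesian model the population-level mean and variance are themselves estimated from all groups, typically by MCMC.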

Key Results

This section highlights the summary findings for each technique, with evaluation across various parameters. Table 1 (ADT MMM COT) gives a general framework for selecting the most suitable technique depending on the business case.

The framework is called ADT MMM COT, which stands for Absolutdata Market Mix Modeling Choice of Technique. The table is intended as a key reference for marketers and data analysts in choosing the technique that best fits their business objective: it offers guidelines on the choice of technique, points of differentiation, and the advantages and limitations of each technique explored. The evaluation also takes into account the cost and effort constraints faced in a normal business environment.

Table 1: ADT MMM COT

| Technique | Algorithm | Data Requirements and Structuring | Business Questions Answered | Business Assumptions | Advantages | Limitations |
|---|---|---|---|---|---|---|
| Simple linear regression | OLS: minimizing the sum of squared errors | Time series data at aggregated level; preferably 104 observations; no data transformation | Effectiveness of the 4 P's of marketing at aggregated* level | No business assumptions required | Simplest; easy to use; flexibility in choosing variables; cost effective (free platforms) | Time consuming for a portfolio of models; not very robust; adequate time series data required |
| Pooled | OLS | Panel data; stacked one below the other; no data transformation | Effectiveness of the 4 P's of marketing at aggregated* level | All granular levels behave similarly | Limited time series data can be combined across granular levels; easy and time saving; free platforms | Same variables across granularities; does not give results at the granular level |
| Fixed Effects | Pooled OLS + dummy intercepts at each granular level | Panel data; stacked one below the other; no data transformation | Effectiveness of the 4 P's of marketing at aggregated* level | All granular levels behave similarly; difference in average levels of volume | Same as pooled; can differentiate each group for difference in volume levels | Same as pooled regression |
| HB Reg | Probabilistic and iterative in nature; based on Bayes' theorem and MCMC algorithm | Panel data; standardized data works better | Effectiveness of the 4 P's of marketing at granular* level | Granular levels coming from similar population | Advanced; portfolio modeling; can model heterogeneous groups; captures time variation well | Same variables across groups; not a user-friendly interface; high costs involved (purchase of HB Reg software from Sawtooth) |
| Proc Mixed | Maximum Likelihood and Restricted Maximum Likelihood estimation | Panel data; standardized data works better | Effectiveness of the 4 P's of marketing at granular* level | Granular levels coming from similar population | Popular; user-friendly interface in SAS; portfolio modeling; simultaneous data structuring and consolidation | Same variables across groups; complexity increases with number of granular groups; costly |

Note: * Aggregated level: denotes aggregated result for overall brand/ country/ retailer - higher level of modeling;

* Granular level: denotes results for multiple SKU/ brand/ region/ retailer/ store - deeper level of modeling

The most basic and easiest technique for market mix modelling is simple linear regression through OLS, which can be run on free or widely available platforms such as Excel and R. It requires minimal resources and time and is best suited to the analysis of a single brand, SKU, region or other group when roughly 2 years of weekly time series data is available. It can answer the traditional questions around the 4 P's of marketing, such as key brand drivers, responsiveness to price, media and promotion effectiveness, incrementality from new product launches and return on investment.

However, simple OLS becomes challenging when data availability over time is an issue, as is often the case in developing countries where data collection and storage methods and IT infrastructure are less advanced. To overcome this difficulty, we can pool the data across SKUs, regions, stores or other granular groups by stacking them one below the other, and run a pooled regression model to deliver insights at the brand, country, retailer or other aggregated level. For example, if we have only 50 data points for each of 10 granular groups, pooling gives 500 data points, which is an adequate number on which to run a model. The only assumption is that the pooled groups are similar and come from the same population; for example, retailers from developing countries should not be pooled with retailers from developed countries. If the granular groups are similar in responsiveness or behaviour but differ in their average base level of volume, fixed effects modelling should be used instead of pooled regression. Fixed effects modelling handles this by incorporating a dummy variable for each group except one, which is taken as the base group.

The above techniques are limited to the analysis of a single model at the aggregated level. They do not handle a portfolio of models, such as responsiveness to price across the different SKUs of a brand, or media and promotion effectiveness across retail stores all over the country. In these cases we would have to build multiple models and then study each of them individually to build the overall story, which can strain the time and resources available in an organisation. The remedy is to use more advanced market mix modelling techniques such as Hierarchical Bayesian regression and multilevel modelling through Proc Mixed in SAS. These techniques are ideal when the analysis is to be done at a much deeper granular level and a whole portfolio is to be modelled. The portfolio in question could consist of several SKUs, regions, states, geographies, retailers or brands. For instance, for a CPG company in the cream cheese category, the objective was to design the pricing and promotion strategy for the whole portfolio, which comprised over 50 SKUs. The technique of choice was HB regression, as it drastically reduced the number of models to be built and delivered results at the desired level of granularity. It also tackled the problem of multicollinearity among the pricing and promotion variables.

These advanced market mix modelling techniques offer large savings in time and resources. However, the high one-time cost of buying the software needed to run them (HB Reg from Sawtooth and the Proc Mixed package from SAS) could prove to be a roadblock, and small firms without the financial muscle may find the purchase cost-prohibitive. Given that these techniques model the entire portfolio together and provide insights at a deeper granular level while saving time and effort, their advantages can outweigh the one-time costs in the long run.

However, modeling at the portfolio level, that is, modeling the entire range together, is not entirely straightforward, because the data sometimes needs to be transformed to bring diverse variables from different categories onto the same scale. For instance, in a portfolio of 10 brands, some may be large brands with volumes sold in tonnes and some may be small brands with volumes sold in kilograms only. To prevent the larger groups from dominating the smaller ones, we standardize or scale the variables.
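A group-wise transformation of this kind can be sketched with the Python standard library. Standardization here means a z-score within each group. The paper does not define "scaling", so dividing by the group mean is only one possible reading and is an assumption of this sketch; the brand names and volumes are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Z-score within one group: mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def scale_by_mean(values):
    """One possible scaling: index each value to the group mean (assumed definition)."""
    m = mean(values)
    return [v / m for v in values]

# Hypothetical weekly volumes, both in kg: one large brand and one small brand
volume = {
    "large_brand": [52000.0, 48000.0, 50000.0, 54000.0, 46000.0],
    "small_brand": [40.0, 55.0, 50.0, 60.0, 45.0],
}

standardized = {group: standardize(v) for group, v in volume.items()}
scaled = {group: scale_by_mean(v) for group, v in volume.items()}

for group in volume:
    print(group, [round(z, 2) for z in standardized[group]])
```

After either transformation the two brands are on comparable scales. Standardization also equalizes the spread within each group, whereas mean scaling preserves each group's relative variability.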

Conclusion

From our analysis we developed guidelines for selecting a technique for marketing models depending on the business case, helping both marketers and decision analysts make better-informed decisions.

This research also yielded application-oriented details about the techniques evaluated. For instance, when time-series data is too limited for OLS, the scarcity of data points can be compensated for with cross-sectional data, using pooled regression to get quick directional insights for business decision-making. Advanced multilevel modelling techniques such as Hierarchical Bayesian regression or Proc Mixed in SAS can be applied to get results at deeper granularity within limited time and effort for thorough decision-making. For example, in one of our validated cases, differences in marketing levers had to be understood for multiple SKUs across various retailers. Multilevel modelling was observed to be significantly less effort-intensive than the numerous traditional OLS models that would otherwise have been developed. The manual effort of coordinating business implications across multiple models could also be handled dynamically by the multilevel algorithms. The difference in scale of variables that may arise in multilevel modeling is best countered by group-wise (granularity-wise) standardization of the data, in which the variables are transformed to have a mean of zero and a standard deviation of 1, or by scaling. In the cases that were validated, standardization worked much better than scaling.

Application of these techniques should be weighed in light of business objectives, data availability, cost consideration of software, technical know-how and effort estimation.

Acknowledgement

This white paper would not have seen the light of day without the small but valuable contributions of all our colleagues at Absolutdata.

One name that stands out is Mr Abhik Pal, Manager, Absolutdata, who provided expert guidance at every stage in preparing the structure and in compiling and editing the content of the white paper.

We would also like to thank Mr Sundar Ramaswamy, COO, Absolutdata, for giving us the opportunity to write this paper and for encouraging us through his full cooperation in providing all the necessary resources.

Key References

Gujarati D.N. and Sangeetha, 2007, Basic Econometrics, Tata McGraw-Hill, New York

Joseph Joy 2010, "Metriscient," accessible online at http://www.metriscient.com/pooledreg.htm

Knowledge Base - SAS http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_mixed_sect001.htm

Park Hun Myoung. 2009, "Linear Regression Models for Panel Data Using SAS, Stata, LIMDEP, and SPSS," Working Paper- The University Information Technology Services (UITS) Center for Statistical and Mathematical Computing, Indiana University

Sawtooth Software (2003),"Technical Paper Series, HB-Reg: Hierarchical Bayes Regression Analysis (v.3)," Technical Paper accessible from www.sawtoothsoftware.com

Singer J.D.,1998, "Using SAS Proc Mixed to fit multilevel models, Hierarchical models, and Individual Growth models," Journal of Educational and Behavioral Statistics Winter 1998, vol 24, No 4 pp. 323-355

Wooldridge J.M., 2009, "Introductory Econometrics: A Modern Approach," South Western Cengage Learning, Mason City
