A general framework for causal classification

Jiuyong Li, Weijia Zhang, Lin Liu, Kui Yu, Thuc Duy Le, Jixue Liu

March 2021

Abstract

In many applications, there is a need to predict the effect of an intervention on different individuals from data. For example, which customers are persuadable by a product promotion? Which groups would benefit from a new policy? These are typical causal classification questions involving the effect, or the change in outcomes, made by an intervention. The questions cannot be answered with traditional classification methods, as they only deal with static outcomes. In marketing research, these questions are often answered with uplift modelling, using experimental data. Some machine learning methods have been proposed for heterogeneous causal effect estimation using either experimental or observational data. In principle, these methods can be used for causal classification, but the limited number of causal heterogeneity modelling methods, mainly tree-based, is inadequate for various real-world applications. In this paper, we propose a general framework for causal classification, as a generalisation of both uplift modelling and causal heterogeneity modelling. When developing the framework, we have identified the conditions under which causal classification in both observational and experimental data can be resolved by a naive solution using off-the-shelf classification methods, which supports flexible implementations for various applications. This result not only enables a practical way to solve the causal classification problem by using any existing classification method in the proposed framework, but also makes it possible to cross-use the methods developed in both the uplift modelling and causal heterogeneity modelling areas when the conditions are satisfied. Experiments have shown that our framework with off-the-shelf classification methods is as competitive as the tailor-designed uplift modelling and heterogeneous causal effect modelling methods.

Type

Journal article

Publication

International Journal of Data Science and Analytics. 11:127-139