May 23, 2022

Microsoft DoWhy – The Most Successful Open-Source Causal Library

The DoWhy library was developed by Microsoft researchers in 2018 with the goal of helping data scientists better understand and deploy causal inference. DoWhy uses two of the most popular scientific frameworks for causal inference — graphical causal models (GCMs) and potential outcomes — and combines them in one library.

The causal inference community has embraced DoWhy, which has been downloaded over 1 million times to date. It is widely used in production scenarios across industry and academia, and Microsoft itself uses the library to power its own causal analyses, such as estimating who benefits most from messages in order to avoid overcommunicating to large groups.

The Evolution of DoWhy to PyWhy

The research team at Microsoft has rightly recognized that making causality a pillar of a business’s data science practice requires a broad, collaborative effort. To further its goal of creating a greater community around causality, Microsoft has collaborated with Amazon Web Services (AWS) to shift DoWhy into an independent, open-source governance model, known as PyWhy.

'The mission of PyWhy is to build an open-source ecosystem for causal machine learning that advances the state of the art and makes it available to practitioners and researchers. In PyWhy, we will build and host interoperable libraries, tools, and other resources spanning a variety of causal tasks and applications, connected through a common API on foundational causal operations and a focus on the end-to-end analysis process.'

Emre Kiciman and Amit Sharma, Microsoft Research
www.microsoft.com/en-us/research/blog/dowhy-evolves-to-independent-pywhy-model-to-help-causal-inference-grow/

The contribution of AWS to PyWhy is the result of years of Amazon research on GCMs, and complements DoWhy’s existing feature set. GCMs are a formal framework developed by Turing Award winner Judea Pearl to model cause-effect relationships between variables in a system. Causal models are an excellent way to visually represent causal relationships, providing a common language for business units and data scientists.
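
To make the idea of a graphical causal model concrete, here is a minimal sketch in plain Python. It does not use the DoWhy or PyWhy API; the three-variable graph (Z causes X and Y, X causes Y), the coefficients, and the function names are all assumptions chosen for illustration. It shows why the graph matters: a naive correlation between X and Y overstates the effect of X because Z influences both, while simulating an intervention on X recovers the effect built into the model.

```python
import random


def sample(n, seed=0, do_x=None):
    """Draw n samples from a toy causal model with edges Z -> X, Z -> Y, X -> Y.

    If do_x is given, X is set to that value regardless of Z, which
    mimics an intervention (the incoming edge Z -> X is cut).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)  # common cause (confounder)
        x = (z + rng.gauss(0, 1)) if do_x is None else do_x
        y = 2.0 * x + 3.0 * z + rng.gauss(0, 1)  # assumed true effect of X on Y is 2.0
        rows.append((z, x, y))
    return rows


def mean(values):
    return sum(values) / len(values)


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var = sum((a - mx) ** 2 for a in xs)
    return cov / var


# Observational data: X and Y are both driven by Z, so the slope is biased.
obs = sample(20000, seed=0)
naive_effect = slope([x for _, x, _ in obs], [y for _, _, y in obs])

# Interventional data: compare the average outcome under do(X=1) and do(X=0).
y_do1 = mean([y for _, _, y in sample(20000, seed=1, do_x=1.0)])
y_do0 = mean([y for _, _, y in sample(20000, seed=2, do_x=0.0)])
causal_effect = y_do1 - y_do0

print(f"naive slope from observational data: {naive_effect:.2f}")
print(f"effect under intervention:           {causal_effect:.2f}")
```

In this toy model the naive slope lands well above the true effect of 2.0, while the interventional contrast stays close to it. In practice an intervention usually cannot be simulated directly, which is why libraries in this space use the graph to decide which variables to adjust for when only observational data is available.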

The Future is Bright for Causal AI

Amazon and Microsoft are each committing research effort to Causal AI, a strong endorsement of the technology and its business impact. Additionally, the collaboration between two companies that usually compete intensely with one another is a clear indicator that causality is one of the next big things in AI.

To learn more about the PyWhy library, visit pywhy.org/dowhy/
or browse the source code on github.com/py-why.

Geminos has directly embedded the PyWhy library into its Causeway platform, providing a visual interface for users to build and analyze causal models.