Patterns from Static: Philosophy and the Question Concerning Statistics

Preface

We shall not cease from exploration

And the end of all our exploring

Will be to arrive where we started

And know the place for the first time.

— T.S. Eliot, Little Gidding

I began writing this book in 2018. At that time, data science was a relatively new field, and data science degree programs were just beginning to emerge. Pioneering programs at the University of California, Berkeley and New York University appeared in the 2010s; the Statistics and Data Science major at the University of Colorado Boulder (CU Boulder), my home institution at the time of this writing, launched in 2018, followed by the MS in Data Science in fall 2021.

At CU Boulder, as at most universities, statistics is one of three foundational pillars of data science programs, alongside computing and domain knowledge. Thus, the explosion of data science meant an explosion in the engagement with, and application of, statistical methods. But using statistics is not the same as using statistics correctly. The rise of data science has coincided with a rise in misinterpretations and misuses of statistics. One goal of this book is to help statistics and data science students engage with statistical concepts and reasoning on a deeper level, so that they can distinguish between correct and incorrect uses of statistics.

That “deeper level” is not always present in statistics education. Statistics courses often cover methods and “recipes” for producing inferences. Some courses are mathematically rigorous, which is better but still not sufficient. In my view, it is much less common for statistics courses to engage with the philosophical, conceptual, and inferential underpinnings threaded through the discipline. I believe it is this kind of engagement that makes one a stronger statistical thinker and a better user of statistics.

As a pillar of data science, statistics is of instrumental value. Statistics helps practitioners, such as scientists and domain experts, answer research and business questions. Now, in the age of AI, where methods and “recipes” can be automated, it’s worth considering whether statistics has some deeper, intrinsic value. Why study statistics if computers can deploy methods on our behalf? I believe that there is still value in studying statistics. Statistics is about inference. And to infer is deeply human. As a human endeavor, inference is messy. Philosophers and statisticians are in deep disagreement about the nature of inference, about how we go from “what?” to “why?” (Pearl & Mackenzie, 2018). Deeper study of, and engagement with, philosophical questions may help us gain clarity on the nature of inference. It may help us “arrive where we started” in our statistics journeys, and “know the place for the first time.”

Pearl, J., & Mackenzie, D. (2018). The book of why: The new science of cause and effect. Penguin Books.