Patterns from Static: Philosophy and the Question Concerning Statistics
Preface
We shall not cease from exploration
And the end of all our exploring
Will be to arrive where we started
And know the place for the first time.
— T.S. Eliot, Little Gidding
Statistics, data science, and AI
I began writing this book in 2018. At that time, data science was a relatively new field and data science degree programs were just beginning to emerge. Pioneering data science programs at the University of California, Berkeley and New York University appeared in the 2010s; the BA in Statistics and Data Science at the University of Colorado Boulder (CU Boulder), my home institution at the time of this writing, launched in 2018, followed by the MS in Data Science in fall 2021.
At CU Boulder, as at most universities, statistics is one of three foundational pillars of data science programs, alongside computing and domain knowledge. Thus, the explosion of data science meant an explosion in the engagement with, and application of, statistical methods. But use of statistics does not equate to correct use of statistics. The rise of data science correlated with a rise in various misinterpretations and misuses of statistics. One goal of this book is to encourage students to engage with statistical concepts and reasoning on a philosophical level, to distinguish between correct and incorrect uses of statistics. Another goal is more ambitious: to encourage those who are not students of statistics to see the discipline, not as a branch of math or a technical field, but as conceptually rich and inherently interesting. Thus, this book may also be of interest to anyone curious about philosophy or science.
In my experience, this conceptual approach is novel. Statistics courses often present statistics as a sub-discipline of math and teach statistical methods as “recipes” for producing inferences. It is much less common for statistics courses to engage with the philosophical, conceptual, and inferential underpinnings clearly threaded through the discipline. I believe it is this kind of engagement that makes one a stronger statistical thinker and user of statistics.
As a pillar of data science, statistics is of instrumental value. Statistics helps practitioners (scientists, entrepreneurs, domain experts) answer research and business questions. Now, in the age of AI, where methods and “recipes” can be automated, it’s worth considering whether statistics has some deeper, perhaps intrinsic, value. Why study statistics if computers can deploy methods on our behalf? I believe that there is still value in studying statistics. Statistics is about inference. And to infer is deeply human. As a human endeavor, inference is messy. Philosophers and statisticians are in deep disagreement about the nature of inference, about how we go from “what?” to “why?” (Pearl & Mackenzie, 2018). Deeper study of, and engagement with, philosophical questions may help us gain clarity on the nature of inference. For those of us with some familiarity with statistics, this deeper engagement may help us “arrive where we started” in our statistics journeys, and “know the place for the first time.”
How I used AI in this book
The majority of this book was written—and parts posted to the web in various places—before AI became widely available. My view is that writing is a human activity, and one that makes one’s mind and one’s views sharper. To hand that activity over to a computer is to miss the point of what writing can and ought to be. Maybe one day that view will seem quaint.
Nothing in this book was written by AI. With that said, I did use AI (primarily ChatGPT) in the following ways:
To explore some possible citations for an argument or description that I was giving (e.g., “I think person X used the following analogy, but I can’t remember the citation. Help?”)
To generate BibTeX code for citations.
To test some of my ideas and arguments in chapter 3 and chapter 4; to generate one example in chapter 3; and for proofreading. For example, on a few occasions, I gave ChatGPT my account of someone’s position and asked whether it was a fair characterization. I was hesitant to do any of these things with AI. But, unfortunately, in 2024, my primary conversation partner (Ian) finished his PhD and moved back east, and very few colleagues or other students were available to discuss some of these niche topics. With that said, the sycophantic nature of AI meant that I did not always gain from these exchanges.
To help convert my original writing format (LaTeX) to a web book. ChatGPT helped me write documents to use Pandoc to convert LaTeX to Markdown, and ultimately to a Quarto web book. These are tasks that have no real value to me beyond their finished product, and I happily surrendered them to AI!
To help with website copy. I am not in marketing. I had a set of ideas about how this book fits into the new AI landscape. I shared those ideas with ChatGPT to help move away from my more academic tone and toward something with more of a hook.