Machine Learning Toolbox for Social Scientists: Applied Predictive Analytics with R

About this book

The final version is forthcoming from Chapman & Hall/CRC on September 22, 2023. The draft online version, which is still in progress and has not had a final copyedit, can be accessed below. The book covers predictive methods along with complementary statistical “tools” that make it self-contained. This website is where I plan to include R code, supplementary applications, errata, and various new chapters.


A draft version: Toolbox

Who is this book for?

“Causal inference” is the traditional framework for most statistics courses in social science and business fields, especially in Economics and Finance. As I tried to look at “prediction” from an economist’s perspective, the book became a “toolbox” that lets many social science and business students follow and understand predictive methods beyond standard machine learning “code” applications. The book offers a new organization that helps students and faculty make a smooth transition from “Inferential Statistics” to novel “prediction” methods. This transition starts with the first few sections, which offer a window where traditional training in inferential statistics meets data analytics focused on prediction.

This book is targeted at motivated students and researchers who have a background in inferential statistics using parametric models. It is applied because I skip many theoretical proofs and justifications that can easily be found elsewhere. I do not assume previous experience with R, but I do assume some familiarity with coding.

Contributions from the community are more than welcome! If you notice that something is missing or find an issue in the book (e.g., typos or problems with the material), please don’t hesitate to reach out.

R Bootcamp for Data Analytics

R Bootcamp is an online book that covers the basics of learning R for Data Science. It is a companion to Toolbox. It also has an R package, RBootcamp, containing interactive exercises for each chapter.