References

Ahamada, Ibrahim, and Emmanuel Flachaire. 2011. Non-Parametric Econometrics. Oxford University Press.

Alpaydin, Ethem. 2014. Introduction to Machine Learning. 3rd ed. Cambridge, MA: MIT Press.

Atkinson, Elizabeth J., and Terry M. Therneau. 2022. “An Introduction to Recursive Partitioning Using the RPART Routines.” https://cran.r-project.org/web/packages/rpart/vignettes/longintro.pdf.

Atkinson, Elizabeth J., Terry M. Therneau, and Mayo Foundation. 2000. “An Introduction to Recursive Partitioning Using the RPART Routines.” https://www.mayo.edu/research/documents/rpartminipdf/doc-10027257.

Baumer, Matthew. 2015. “K Nearest Neighbors.” https://rpubs.com/mbaumer/knn.

Bergstra, James, and Yoshua Bengio. 2012. “Random Search for Hyper-Parameter Optimization.” Journal of Machine Learning Research 13: 281–305. https://jmlr.csail.mit.edu/papers/volume13/bergstra12a/bergstra12a.pdf.

Brabec, Jan, and Lukás Machlica. 2018. “Bad Practices in Evaluation Methodology Relevant to Class-Imbalanced Problems.” CoRR abs/1812.01388. https://arxiv.org/pdf/1812.01388.pdf.

Breiman, Leo. 2001a. “Random Forests.” Machine Learning 45: 5–32. https://link.springer.com/article/10.1023/A:1010933404324.

———. 2001b. “The Two Cultures.” Statistical Science 16 (3): 199–231.

Breiman, Leo, and Adele Cutler. 2004. “Random Forests.” https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm.

Brownlee, Jason. 2017. “What Is the Difference Between Test and Validation Datasets?” https://machinelearningmastery.com/difference-test-validation-datasets/.

Brunton, Steve. 2020a. “Principal Component Analysis (PCA).” https://www.youtube.com/watch?v=fkf4IBRSeEc.

———. 2020b. “Singular Value Decomposition (SVD): Mathematical Overview.” https://www.youtube.com/watch?v=nbBvuuNVfco.

Chalk, Alan. 2016. “Rpart Complexity Parameter Confusion.” Cross Validated. https://stats.stackexchange.com/q/223211.

Charpentier, Arthur. 2015. “On Some Alternatives to Regression Models.” Freakonometrics. https://freakonometrics.hypotheses.org/19424.

———. 2016. “Regression with Splines: Should We Care about Non-Significant Components?” Freakonometrics. https://freakonometrics.hypotheses.org/47681.

———. 2018a. “Classification from Scratch, Boosting 11/8.” Freakonometrics. https://freakonometrics.hypotheses.org/52782.

———. 2018b. “Classification from Scratch, Trees 9/8.” Freakonometrics. https://freakonometrics.hypotheses.org/52776.

Chawla, N. V., K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer. 2002. “SMOTE: Synthetic Minority over-Sampling Technique.” Journal of Artificial Intelligence Research 16: 321–57. https://jair.org/index.php/jair/article/view/10302.

Cock, Dean De. 2011. “Ames, Iowa: Alternative to the Boston Housing Data as an End of Semester Regression Project.” Journal of Statistics Education 19 (3). http://jse.amstat.org/v19n3/decock.pdf.

Dalpiaz, David. 2017. “KNN for Classification Using Knn3().” https://daviddalpiaz.github.io/stat432sp18/supp/knn_class_r.html.

Dickenson-Jones, Giles. 2019. “7 Reasons for Policy Professionals to Get into R Programming in 2019.” http://gilesd-j.com/2019/01/07/7-reasons-for-policy-professionals-to-get-pumped-about-r-programming-in-2019/.

DU, Kroll. 2020. “PCA: Eigenvectors of Opposite Sign and Not Being Able to Compute Eigenvectors with ‘Solve’ in R.” Cross Validated. https://stats.stackexchange.com/q/154716.

Erichson, N. Benjamin, Sergey Voronin, Steven L. Brunton, and J. Nathan Kutz. 2019. “Randomized Matrix Decompositions Using R.” Journal of Statistical Software 89 (11): 1–48. https://doi.org/10.18637/jss.v089.i11.

Fawcett, Tom. 2006. “An Introduction to ROC Analysis.” Pattern Recognition Letters 27 (8): 861–74. https://www.sciencedirect.com/science/article/pii/S016786550500303X.

Fox, John, and Sanford Weisberg. 2018. Nonparametric Regression in R: An Appendix to an R Companion to Applied Regression. 3rd ed. SAGE Publishing. https://socialsciences.mcmaster.ca/jfox/Books/Companion/appendices/Appendix-Nonparametric-Regression.pdf.

FRED. 2015. “FRED.” Federal Reserve Bank of St. Louis. https://fred.stlouisfed.org/.

Freund, Yoav, and Robert E. Schapire. 1997. “A Decision-Theoretic Generalization of on-Line Learning and an Application to Boosting.” Journal of Computer and System Sciences 55 (1): 119–39. https://doi.org/10.1006/jcss.1997.1504.

García-Portugués, Eduardo. 2022a. Lab Notes for Statistics for Social Sciences II: Multivariate Techniques. University of Madrid. https://bookdown.org/egarpor/SSS2-UC3M/logreg-deviance.html.

———. 2022b. Notes for Predictive Modeling. https://bookdown.org/egarpor/PM-UC3M/.

Gulzar. 2018. Cross Validated. https://stats.stackexchange.com/q/376191.

Hastie, Trevor, Junyang Qian, and Kenneth Tay. 2021. “An Introduction to Glmnet.” https://glmnet.stanford.edu/articles/glmnet.html.

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer. https://hastie.su.domains/ElemStatLearn/.

Hippel, Paul Von. 2015. “Linear Vs. Logistic Probability Models: Which Is Better, and When?” Statistical Horizons. https://statisticalhorizons.com/linear-vs-logistic/.

Irizarry, Rafael A. 2022. Data Analysis and Prediction Algorithms with R. Bookdown. https://rafalab.github.io/dsbook/.

ISLR. 2021a. “Carseats: Sales of Child Car Seats.” ISLR. https://rdrr.io/cran/ISLR/man/Carseats.html.

———. 2021b. “Hitters: Baseball Data.” ISLR. https://rdrr.io/cran/ISLR/man/Hitters.html.

Kaggle. 2018. “Credit Card Fraud Detection.” Kaggle. https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud.

Kim, Seongho. 2015. “Ppcor: An R Package for a Fast Calculation to Semi-Partial Correlation Coefficients.” Commun Stat Appl Methods 22 (6): 665–74. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4681537/pdf/nihms740182.pdf.

Kohavi, Ronny, and Barry Becker. 1996. “Adult Data Set.” University of California, Irvine, School of Information & Computer Sciences. https://archive.ics.uci.edu/ml/datasets/Adult.

Kranz, Sebastian. 2022. “Mozart’s Emotions and Creativity: Statistical Analysis of Composers’ Letters and Output.” Ulm University. https://skranz.github.io/.

Kuhn, Max. 2019. The Caret Package. Bookdown. https://topepo.github.io/caret/index.html.

Kutz, Nathan, Steven Brunton, Bingni Brunton, and Joshua Proctor. 2016. Dynamic Mode Decomposition, Data-Driven Modeling of Complex Systems. Society for Industrial & Applied Mathematics. http://orlandi.ing.univaq.it/pub/Kutz%20J.%20Dynamic%20Mode%20Decomposition.%20Data-Driven%20Modeling%20of%20Complex%20Systems%202016.pdf.

Leathwick, J. R., J. Elith, and T. Hastie. 2006. “Comparative Performance of Generalized Additive Models and Multivariate Adaptive Regression Splines for Statistical Modelling of Species Distributions.” Ecological Modelling 199 (2): 188–96. https://www.sciencedirect.com/science/article/pii/S0304380006002572.

Lepidopterist. 2015. “What Is the Intuitive (Geometric?) Meaning of Minimizing the Log Determinant of a Matrix?” Cross Validated. https://stats.stackexchange.com/q/151315.

Paluszyńska, Aleksandra. 2017. “Understanding Random Forests with randomForestExplainer.” https://htmlpreview.github.io/?https://github.com/geneticsMiNIng/BlackBoxOpener/master/randomForestExplainer/inst/doc/randomForestExplainer.html.

Pearl, Judea, and Dana Mackenzie. 2018. The Book of Why. 1st ed. Basic Books.

Pfann, Gerard A., Peter C. Schotman, and Rolf Tschernig. 1996. “Nonlinear Interest Rate Dynamics and Implications for the Term Structure.” Journal of Econometrics 74 (1): 149–76. https://doi.org/10.1016/0304-4076(95)01754-2.

Powell, Victor. 2017. “Principal Component Analysis.” https://setosa.io/ev/principal-component-analysis/.

Rajter, M. 2019. “In Memory of Monty Hall.” https://theressomethingaboutr.wordpress.com/2019/02/12/in-memory-of-monty-hall/.

“Research Portal on Machine Learning for Social and Health Policies.” 2022. Halifax, Nova Scotia: MLPortal. https://sites.google.com/view/mlportal/home.

Ridgeway, Greg. 2020. “Generalized Boosted Models: A Guide to the Gbm Package.” https://cran.r-project.org/web/packages/gbm/vignettes/gbm.pdf.

Rubin, Paul. 2015. “OLS Oddities.” https://orinanobworld.blogspot.com/2015/10/ols-oddities.html.

Schäfer, Juliane, and Korbinian Strimmer. 2005. “A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics.” Statistical Applications in Genetic & Molecular Biology 4 (32). https://pubmed.ncbi.nlm.nih.gov/16646851/.

Schapire, Robert E. 1990. “The Strength of Weak Learnability.” Machine Learning 5: 197–227. https://web.archive.org/web/20121010030839/http://www.cs.princeton.edu/~schapire/papers/strengthofweak.pdf.

Strang, Gilbert. 2016. Introduction to Linear Algebra, Fifth Edition. Wellesley-Cambridge Press. https://math.mit.edu/~gs/linearalgebra/ila0601.pdf.

Svetunkov, Ivan. 2019. “How Confident Are You? Assessing the Uncertainty in Forecasting.” https://forecasting.svetunkov.ru/en/2019/10/18/how-confident-are-you-assessing-the-uncertainty-in-forecasting/.

Tape, Thomas G. n.d. “Interpreting Diagnostic Tests.” University of Nebraska Medical Center. http://gim.unmc.edu/dxtests/Default.htm.

Tay, Kenneth. 2019. “Visualizing the Relationship Between Multiple Variables.” https://statisticaloddsandends.wordpress.com/2019/08/24/visualizing-the-relationship-between-multiple-variables/.

———. 2020. “What Is the DeLong Test for Comparing AUCs?” https://statisticaloddsandends.wordpress.com/2020/06/07/what-is-the-delong-test-for-comparing-aucs/.

UCLA. 2021. “R Library Introduction to Bootstrapping.” UCLA Advanced Research Computing Statistical Methods & Data Analytics. https://stats.oarc.ucla.edu/r/library/r-library-introduction-to-bootstrapping/.

user17762. 2020. “How to Intuitively Understand Eigenvalue and Eigenvector?” Mathematics Stack Exchange. https://math.stackexchange.com/q/243553.

Wieringen, Wessel van. 2019. “The Generalized Ridge Estimator of the Inverse Covariance Matrix.” Journal of Computational & Graphical Statistics 28 (4): 932–42. https://www.tandfonline.com/doi/epub/10.1080/10618600.2019.1604374?needAccess=true.

———. 2021. “Lecture Notes on Ridge Regression.” https://arxiv.org/pdf/1509.09169.pdf.

Wieringen, Wessel van, and Carel F. W. Peeters. 2019. “Ridge Estimation of Inverse Covariance Matrices from High-Dimensional Data.” Computational Statistics & Data Analysis 103: 284–303. https://www.sciencedirect.com/science/article/abs/pii/S0167947316301141.

Wilks. 2019. “Don’t Know Why Eigen() Gives a Vectors of Wrong Sign and the Loading Matrix Is Just Vector.” Stack Overflow. https://stackoverflow.com/questions/55076133/dont-know-why-eigen-gives-a-vectors-of-wrong-sign-and-the-loading-matrix-is-j.

Zhang, Teng. 2012. “Robust Subspace Recovery by Geodesically Convex Optimization.” arXiv:1206.1386v2 60 (11). https://arxiv.org/pdf/1206.1386v2.pdf.