Further reading

Below is a list of highly recommended books that either partially overlap with the content of this book or serve as a natural next step after you finish reading it. All of them are available for free online.

  • The R Cookbook by Long & Teetor (2019) contains tons of examples of how to perform common tasks in R.
  • R for Data Science by Wickham & Grolemund (2017) is similar in scope to Chapters 2-6 of this book, but with less focus on statistics and greater focus on tidyverse functions.
  • Advanced R by Wickham (2019) deals with advanced R topics, delving further into object-oriented programming, functions, and increasing the performance of your code.
  • R Packages by Wickham & Bryan describes how to create your own R packages.
  • ggplot2: Elegant Graphics for Data Analysis by Wickham, Navarro & Lin Pedersen is an in-depth treatise on ggplot2.
  • Fundamentals of Data Visualization by Wilke (2019) is a software-agnostic text on data visualisation, with tons of useful advice.
  • R Markdown: The Definitive Guide by Xie et al. (2018) describes how to use R Markdown for reports, presentations, dashboards, and more.
  • An Introduction to Statistical Learning with Applications in R by James et al. (2013) provides an introduction to methods for regression and classification, with examples in R (but not using caret).
  • Hands-On Machine Learning with R by Boehmke & Greenwell (2019) covers a large number of machine learning methods.
  • Forecasting: Principles and Practice by Hyndman & Athanasopoulos (2018) deals with forecasting and time series models in R.
  • Deep Learning with R by Chollet & Allaire (2018) delves into neural networks and deep learning, including computer vision and generative models.

Online resources


Agresti, A. (2013). Categorical Data Analysis. Wiley.

Bates, D., Mächler, M., Bolker, B., Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67, 1.

Boehmke, B., Greenwell, B. (2019). Hands-On Machine Learning with R. CRC Press.

Box, G.E., Cox, D.R. (1964). An analysis of transformations. Journal of the Royal Statistical Society: Series B (Methodological), 26(2), 211-243.

Breiman, L., Friedman, J., Stone, C.J., Olshen, R.A. (1984). Classification and Regression Trees. CRC Press.

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.

Brown, L.D., Cai, T.T., DasGupta, A. (2001). Interval estimation for a binomial proportion. Statistical Science, 16(2), 101-117.

Buolamwini, J., Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. Proceedings of Machine Learning Research, 81, 1-15.

Cameron, A.C., Trivedi, P.K. (1990). Regression-based tests for overdispersion in the Poisson model. Journal of Econometrics, 46(3), 347-364.

Casella, G., Berger, R.L. (2002). Statistical Inference. Brooks/Cole.

Charytanowicz, M., Niewczas, J., Kulczycki, P., Kowalski, P.A., Lukasik, S., Zak, S. (2010). A complete gradient clustering algorithm for features analysis of X-ray images. In: Pietka, E., Kawa, J. (eds.), Information Technologies in Biomedicine. Springer-Verlag, Berlin-Heidelberg, 15-24.

Chollet, F., Allaire, J.J. (2018). Deep Learning with R. Manning.

Committee on Professional Ethics of the American Statistical Association. (2018). Ethical Guidelines for Statistical Practice.

Cook, R.D., Weisberg, S. (1982). Residuals and Influence in Regression. Chapman and Hall.

Cortez, P., Cerdeira, A., Almeida, F., Matos, T., Reis, J. (2009). Modeling wine preferences by data mining from physicochemical properties. Decision Support Systems, 47(4), 547-553.

Costello, A.B., Osborne, J. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research, and Evaluation, 10(1), 7.

Cox, D.R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34(2), 187-202.

Dastin, J. (2018). Amazon scraps secret AI recruiting tool that showed bias against women. Reuters.

Davison, A.C., Hinkley, D.V. (1997). Bootstrap Methods and their Application. Cambridge University Press.

Eck, K., Hultman, L. (2007). One-sided violence against civilians in war: Insights from new fatality data. Journal of Peace Research, 44(2), 233-246.

Eddelbuettel, D., Balamuta, J.J. (2018). Extending R with C++: a brief introduction to Rcpp. The American Statistician, 72(1), 28-36.

Efron, B. (1983). Estimating the error rate of a prediction rule: improvement on cross-validation. Journal of the American Statistical Association, 78(382), 316-331.

Elston, D.A., Moss, R., Boulinier, T., Arrowsmith, C., Lambin, X. (2001). Analysis of aggregation, a worked example: numbers of ticks on red grouse chicks. Parasitology, 122(05), 563-569.

Fleming, G., Bruce, P.C. (2021). Responsible Data Science: Transparency and Fairness in Algorithms. Wiley.

Franks, B. (Ed.) (2020). 97 Things About Ethics Everyone in Data Science Should Know. O’Reilly Media.

Friedman, J.H. (2002). Stochastic gradient boosting. Computational Statistics and Data Analysis, 38(4), 367-378.

Gao, L.L., Bien, J., Witten, D. (2020). Selective inference for hierarchical clustering. Pre-print, arXiv:2012.02936.

Groll, A., Tutz, G. (2014). Variable selection for generalized linear mixed models by L1-penalized estimation. Statistics and Computing, 24(2), 137-154.

Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer Science & Business Media.

Hartigan, J.A., Wong, M.A. (1979). Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(1), 100-108.

Henderson, H.V., Velleman, P.F. (1981). Building multiple regression models interactively. Biometrics, 37, 391-411.

Herr, D.G. (1986). On the history of ANOVA in unbalanced, factorial designs: the first 30 years. The American Statistician, 40(4), 265-270.

Hoerl, A.E., Kennard, R.W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67.

Hyndman, R.J., Athanasopoulos, G. (2018). Forecasting: Principles and Practice. OTexts.

James, G., Witten, D., Hastie, T., Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer.

Kuznetsova, A., Brockhoff, P.B., Christensen, R.H. (2017). lmerTest package: tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1-26.

Liero, H., Zwanzig, S. (2012). Introduction to the Theory of Statistical Inference. CRC Press.

Long, J.D., Teetor, P. (2019). The R Cookbook. O’Reilly Media.

Moen, A., Lind, A.L., Thulin, M., Kamali-Moghaddam, M., Roe, C., Gjerstad, J., Gordh, T. (2016). Inflammatory serum protein profiling of patients with lumbar radicular pain one year after disc herniation. International Journal of Inflammation, 2016, Article ID 3874964.

Pettersson, T., Högbladh, S., Öberg, M. (2019). Organized violence, 1989-2018 and peace agreements. Journal of Peace Research, 56(4), 589-603.

Picard, R.R., Cook, R.D. (1984). Cross-validation of regression models. Journal of the American Statistical Association, 79(387), 575-583.

Recht, B., Roelofs, R., Schmidt, L., Shankar, V. (2019). Do ImageNet classifiers generalize to ImageNet? arXiv preprint arXiv:1902.10811.

Schoenfeld, D. (1982). Partial residuals for the proportional hazards regression model. Biometrika, 69(1), 239-241.

Scrucca, L., Fop, M., Murphy, T.B., Raftery, A.E. (2016). mclust 5: clustering, classification and density estimation using Gaussian finite mixture models. The R Journal, 8(1), 289.

Smith, G. (2018). Step away from stepwise. Journal of Big Data, 5(1), 32.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288.

Tibshirani, R., Walther, G., Hastie, T. (2001). Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 411-423.

Thulin, M. (2014a). The cost of using exact confidence intervals for a binomial proportion. Electronic Journal of Statistics, 8, 817-840.

Thulin, M. (2014b). On Confidence Intervals and Two-Sided Hypothesis Testing. PhD thesis. Department of Mathematics, Uppsala University.

Thulin, M. (2014c). Decision-theoretic justifications for Bayesian hypothesis testing using credible sets. Journal of Statistical Planning and Inference, 146, 133-138.

Thulin, M. (2016). Two‐sample tests and one‐way MANOVA for multivariate biomarker data with nondetects. Statistics in Medicine, 35(20), 3623-3644.

Thulin, M., Zwanzig, S. (2017). Exact confidence intervals and hypothesis tests for parameters of discrete distributions. Bernoulli, 23(1), 479-502.

Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica, 26, 24-36.

Wasserstein, R.L., Lazar, N.A. (2016). The ASA statement on p-values: context, process, and purpose. The American Statistician, 70(2), 129-133.

Wei, L.J. (1992). The accelerated failure time model: a useful alternative to the Cox regression model in survival analysis. Statistics in Medicine, 11(14‐15), 1871-1879.

Wickham, H. (2019). Advanced R. CRC Press.

Wickham, H., Bryan, J. (forthcoming). R Packages.

Wickham, H., Grolemund, G. (2017). R for Data Science. O’Reilly Media.

Wickham, H., Navarro, D., Lin Pedersen, T. (forthcoming). ggplot2: Elegant Graphics for Data Analysis.

Wilke, C.O. (2019). Fundamentals of Data Visualization. O’Reilly Media.

Xie, Y., Allaire, J.J., Grolemund, G. (2018). R Markdown: The Definitive Guide. Chapman & Hall.

Zeileis, A., Hothorn, T., Hornik, K. (2008). Model-based recursive partitioning. Journal of Computational and Graphical Statistics, 17(2), 492-514.

Zhang, D., Fan, C., Zhang, J., Zhang, C.-H. (2009). Nonparametric methods for measurements below detection limit. Statistics in Medicine, 28, 700-715.

Zhang, Y., Yang, Y. (2015). Cross-validation for selecting a model selection procedure. Journal of Econometrics, 187(1), 95-112.

Zou, H., Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Methodological), 67(2), 301-320.