A Practical Guide to Data Analysis and R Programming
This guide supports transparent and reproducible data analysis with the R programming language. It emphasizes statistical reasoning, interpretation, and scientific clarity, while also encouraging readers to build programming skills along the way.
Perspective:
Think of the process as two sides of the same coin:
- One side focuses on examining data carefully, applying appropriate statistical methods, and communicating findings clearly and reproducibly.
- The other side involves learning and practicing programming with R, using code as the instrument that makes rigorous analysis possible.
Together, these two aspects reinforce each other: statistical insight gives meaning to the code, while coding provides the precision and reproducibility that modern data analysis requires.
Intended Audience
This material is designed for learners and practitioners who:
- have prior exposure to statistics or quantitative reasoning
- are new to R or transitioning from other analysis tools
- value reproducibility and transparency in data analysis
- work in science, education, health, or applied research
Getting Started
All workflows in this guide begin with a reproducible computing environment. Installation instructions for Windows 11 are provided separately to keep this guide focused on analysis concepts.
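As a small illustration of what a reproducible starting point can look like once R is installed, the sketch below records the R version and fixes the random seed using base R only. The seed value 2026 and the variable names are arbitrary choices made for this example, not settings prescribed by this guide.

```r
# Record the R version so results can be tied to the software that produced them.
cat("R version:", R.version.string, "\n")

# Fix the random number seed so simulated or resampled results repeat exactly.
# The value 2026 is arbitrary; any fixed integer works.
set.seed(2026)
first_draw <- rnorm(3)

set.seed(2026)
second_draw <- rnorm(3)

# Same seed, same draws: this comparison should be TRUE.
print(identical(first_draw, second_draw))
```

Calling sessionInfo() at the end of a script is a common complement, since it also lists the attached packages and their versions.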
Learning Resources
After setting up your computing environment, it’s helpful to begin with a structured introduction to R programming. The free ebooks listed below introduce the fundamentals of R programming, helping beginners understand basic concepts before moving on to analysis.
R Programming – Introduction (Wikibooks contributors, n.d.)
Hands-On Programming with R (Grolemund, 2014)
R for Data Science (Wickham et al., 2023)
Descriptive Statistics
Descriptive statistics summarize and describe the main features of a dataset. These measures help you understand the data's central tendency, variability, and overall distribution without making predictions. Key measures include the mean, median, standard deviation, percentiles, skewness, and kurtosis.
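The measures named above can be sketched in base R as follows. The temperature values are invented purely for illustration and are not the Dumaguete City readings. Base R has no built-in skewness or kurtosis function, so this sketch assumes simple moment-based definitions (dividing by n); add-on packages may use slightly different formulas.

```r
# Hypothetical temperatures in degrees Celsius (invented for illustration;
# these are NOT the Dumaguete City readings mentioned in this guide).
temps <- c(24.1, 23.8, 23.5, 23.9, 25.2, 27.0,
           28.6, 29.8, 30.4, 30.1, 28.7, 26.3)

n <- length(temps)
m <- mean(temps)
s <- sd(temps)  # sample standard deviation (n - 1 denominator)

# Moment-based skewness and kurtosis, computed by hand with an n denominator.
dev <- temps - m
m2 <- mean(dev^2)
m3 <- mean(dev^3)
m4 <- mean(dev^4)
skewness <- m3 / m2^1.5  # 0 for a perfectly symmetric sample
kurtosis <- m4 / m2^2    # 3 for a normal distribution

# Collect everything in one named vector; quantile() contributes
# the 25th and 75th percentiles under the names "25%" and "75%".
summary_stats <- c(
  n = n, mean = m, median = median(temps), sd = s,
  quantile(temps, probs = c(0.25, 0.75)),
  skewness = skewness, kurtosis = kurtosis
)
print(round(summary_stats, 3))
```

For a quick first look, summary(temps) reports the minimum, quartiles, median, mean, and maximum in a single call.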
If you want to see a practical example using actual temperature data (Dumaguete City, January 1, 2026), including computed statistics and raw data, you can explore the sample page below.
Scope of this Guide
Subsequent sections will focus on:
- exploratory statistics
- data visualization for interpretation
- statistical modeling using native R tools
- project-based and reproducible workflows
Note: This site is still under construction. More sections will be added over time.
References
- Grolemund, G. (2014). Hands-On Programming with R. RStudio Education. https://rstudio-education.github.io/hopr/index.html
- Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for Data Science. https://r4ds.hadley.nz/
- Wikibooks contributors. (n.d.). R Programming/Introduction. In Wikibooks. https://en.wikibooks.org/wiki/R_Programming/Introduction