df |>
  dplyr::filter(age > 18) |>
  dplyr::mutate(age_group = ifelse(age < 30, "young", "old")) |>
  dplyr::group_by(age_group) |>
  dplyr::summarise(mean_income = mean(income))
Script Conventions
Conventions I try to follow
Scripts
- one script = one purpose
- keep scripts short
Example layout of the script folder:
R/
├── 01_data_processing.R # cleans and processes the data
├── 02_linear_models.R # runs the linear models
├── utilities.R # functions used across multiple analyses
└── functions_data_processing.R # functions specific to data processing
Functions
- any repeated logic gets moved to functions
  - if repeated across different tasks -> utilities.R
  - if task-specific -> functions_data_processing.R
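As a rough illustration of moving repeated logic into a function, here is a minimal sketch in base R. The helper name standardise_column() and the example data are hypothetical and only stand in for whatever logic actually repeats in a project.

```r
# Hypothetical helper that might live in utilities.R:
# centre and scale a numeric vector, ignoring missing values.
standardise_column <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Made-up example data, for illustration only
df_example <- data.frame(
  age = c(20, 30, 40),
  income = c(100, 200, 300)
)

# The same logic is reused rather than copy-pasted for each column
df_example$age_z <- standardise_column(df_example$age)
df_example$income_z <- standardise_column(df_example$income)
```

If the helper were only ever needed during data processing, it would go in the task-specific functions file instead.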
Style conventions
- use pipes: |>
- one step per line
Naming Conventions
- use snake_case for everything
- objects start with a noun (e.g., df_* or mod_*)
- functions start with a verb (e.g., clean_, fit_)
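The naming rules above might look like this in practice. This is only a sketch: the function bodies and the data are made up for illustration, and only the naming pattern is the point.

```r
# Objects start with a noun prefix (df_ for data frames, mod_ for models)
df_raw <- data.frame(
  age = c(17, 25, 42, 58),
  income = c(0, 30000, 52000, 61000)
)

# Functions start with a verb; these bodies are hypothetical placeholders
clean_data <- function(df) {
  # keep adults only
  df[df$age > 18, , drop = FALSE]
}

fit_linear_model <- function(df) {
  lm(income ~ age, data = df)
}

df_clean <- clean_data(df_raw)
mod_income <- fit_linear_model(df_clean)
```

With the prefixes in place, the kind of object is visible from its name alone when scanning the environment or a script.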
Ordering and Flow
If structure begins to emerge (e.g., the scripts have to run in a fixed order), use the targets package to remove manual ordering and execution:
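# _targets.R
library(targets)
# get_data(), clean_data(), fit_linear_model() and create_plot() must be
# defined or sourced before this list.
# Each target names its inputs, so targets works out the run order itself.
list(
  tar_target(raw, get_data()),
  tar_target(cleaned, clean_data(raw)),
  tar_target(model, fit_linear_model(cleaned)),
  tar_target(plot, create_plot(model))
)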