3 Getting started
3.1 Software
This pipeline was developed on R version 4.5.1 “Great Square Root” and RStudio version 2025.05.1 “Mariposa Orchid”. You will need to have R installed and it is strongly recommended to also install RStudio.
3.2 Download the repository code
Go to the GitHub repository for ResPrj_HIA_Indonesia_pm25_mortality_1998_2020. Download the repository by clicking the green < > Code button, then “Download ZIP”. Extract the downloaded zip file to an appropriate location on your computer.
Alternatively, if you are comfortable with using Git, you may clone the repository.
3.3 Download the input data
All datasets are publicly available from the following sources:
- Modelled mortality data: Institute for Health Metrics and Evaluation (IHME) via the GBD Results Tool
- Modelled PM2.5 exposure data: Atmospheric Composition Analysis Group (ACAG) as part of the SatPM2.5 (Satellite-derived PM2.5) dataset (older versions archived)
- Geographical boundaries: Database of Global Administrative Areas
Please note the licensing and conditions of use for each dataset before using them in your work.
The specific dataset versions used in the development of the pipeline are available in the Cloud CARDAT repository to approved researchers.
- IHME data for Southeast Asia, as of 2025-06-03
- ACAG SatPM2.5 Global V5.GL.02
- GADM v4.0.4
3.3.1 Using the GBD Results Tool
You must register to search and download data from IHME’s GBD Results Tool. To retrieve the required variables for this pipeline, use the following search parameters:
- GBD Estimate: Cause of death or injury
- Measure: Deaths
- Metric: Number, Rate
- Cause: All causes
- Location: Select all under Southeast Asia (including sub-national areas for Indonesia)
- Age: Select all
- Sex: Male, Female
- Total percentage change: Off
- Year: All years from 1998 to 2021
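Once the file is downloaded, it can be worth confirming that it covers the search parameters above before pointing the pipeline at it. The following is a minimal sketch, not part of the pipeline: the helper name check_gbd_extract is hypothetical, and the column names metric_name, sex_name and year are an assumption about the layout of the exported CSV, so adjust them to match your file.

```r
# Hypothetical helper, not part of the pipeline.
# Assumption: the export has columns named metric_name, sex_name and year.
# Returns, for each search parameter, the requested values absent from the data.
check_gbd_extract <- function(gbd,
                              metrics = c("Number", "Rate"),
                              sexes = c("Male", "Female"),
                              years = 1998:2021) {
  list(
    missing_metrics = setdiff(metrics, unique(gbd$metric_name)),
    missing_sexes   = setdiff(sexes, unique(gbd$sex_name)),
    missing_years   = setdiff(years, unique(gbd$year))
  )
}
```

For example, after reading the download with data.table::fread() into an object called gbd, check_gbd_extract(gbd) returns three empty vectors if nothing is missing.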
3.4 Run the pipeline
3.4.1 Open the project
Open up the R project (ResPrj_Indonesia_HIA_pm25_mortality_1998_2020.Rproj) in RStudio. You should be able to see the project name in the upper-right corner of the RStudio window. If you see “Project: (None)” instead, you have not opened the project file correctly and the targets code will not work.
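The same check can be made from the Console by confirming that the working directory is the repository root. This is a minimal sketch, assuming that _targets.R, config.R and main.R all sit at the top level of the repository; the helper name check_project_root is hypothetical.

```r
# Hypothetical helper: reports whether the expected pipeline files are present
# in a directory (by default, the current working directory).
# Assumption: _targets.R, config.R and main.R are at the repository root.
check_project_root <- function(dir = getwd(),
                               expected = c("_targets.R", "config.R", "main.R")) {
  present <- file.exists(file.path(dir, expected))
  if (!all(present)) {
    message("Not found in ", dir, ": ",
            paste(expected[!present], collapse = ", "))
  }
  invisible(all(present))
}

# Prints a message naming any file that could not be found.
check_project_root()
```

If files are reported as not found, close RStudio and reopen it by opening the .Rproj file.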
3.4.2 Install the required packages
RStudio should prompt you to install the libraries required if not already installed.
If you are not prompted automatically, you can install the targets package by running the command install.packages("targets") in the Console. The pipeline also uses the tarchetypes, sf, terra, data.table and tmap packages. All of these can be installed together with:
install.packages(c("targets", "tarchetypes", "sf", "terra", "data.table", "tmap"))
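The iomlifetR package is also required to calculate HIA life tables and other metrics, but it is not available through CRAN. It can be installed from GitHub using the devtools package (install devtools first with install.packages("devtools") if you do not have it) by running the following: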
devtools::install_github("richardbroome2002/iomlifetR", build_vignettes = TRUE)
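To see which of the packages named above are still missing, a short check such as the following can help. It is a sketch only: the helper name missing_packages is hypothetical, and requireNamespace() loads each package's namespace as a side effect of testing for it.

```r
# Packages named in this section: iomlifetR comes from GitHub, the rest from CRAN.
cran_pkgs   <- c("targets", "tarchetypes", "sf", "terra", "data.table", "tmap")
github_pkgs <- c("iomlifetR")

# Hypothetical helper: returns the packages in `pkgs` that cannot be loaded
# from the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

missing_packages(cran_pkgs)    # install these with install.packages()
missing_packages(github_pkgs)  # install these with devtools::install_github()
```

Both calls return an empty character vector once everything is installed.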
3.4.3 Set the inputs
The pipeline is defined in the _targets.R file, while global objects are defined in config.R. Helper code designed to run interactively (line-by-line) is provided in main.R and can be used to examine, run and visualise the pipeline and its outputs.
You must define the paths to the input files in config.R. All of the following variables must be provided and point to the corresponding input data files you have downloaded:
- indir.geography and infile.geography (GADM data)
- indir.mort and infile.mort (IHME data)
- indir.pm25 and infile.pm25 (ACAG data)
The mapping of province names is included in this repository under the metadata/ directory. Ensure the infile.locname_map variable is correctly defined for this mapping file.
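As an illustration, the input variables in config.R might be set and checked as follows. Every path and file name here is a placeholder rather than a value from the repository, the helper missing_inputs is hypothetical, and the sketch assumes that each infile.* variable holds a file name inside the matching indir.* directory. Check config.R itself for how the two are actually combined.

```r
# Placeholder values only: replace each with the location of your own download.
indir.geography  <- "path/to/gadm"
infile.geography <- "gadm_placeholder.gpkg"
indir.mort       <- "path/to/ihme"
infile.mort      <- "ihme_placeholder.csv"
indir.pm25       <- "path/to/acag"
infile.pm25      <- "acag_placeholder.nc"

# Hypothetical helper: returns the names of any inputs that do not exist on disk.
missing_inputs <- function(paths) {
  names(paths)[!file.exists(paths)]
}

# Assumption: each infile.* is a file name inside the matching indir.* directory.
input_paths <- c(
  geography = file.path(indir.geography, infile.geography),
  mortality = file.path(indir.mort, infile.mort),
  pm25      = file.path(indir.pm25, infile.pm25)
)

# Names of the inputs that still need their paths corrected.
missing_inputs(input_paths)
```

An empty result means every input file was found. The same check can be applied to infile.locname_map.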
3.4.4 Run targets
Open the main.R file. This code should be run interactively, line-by-line. You can run code one line at a time in RStudio by clicking the button “Run” at the top-right of the code panel (known as the Source panel). Alternatively you may use the shortcut Ctrl + Enter on a Windows machine.
Starting from the top, run the code and examine the output.
- library(targets): Loads the targets package.
- tar_manifest() and tar_visnetwork(targets_only = T): Check that the targets of the pipeline have been defined correctly and that dependencies are linked correctly.
- tar_make(): Run all invalidated targets. Progress is shown in the console.
- tar_read(data_attributable_number): Read the output of the specified target (may be a data.frame, file path, etc.).
- browseURL(tar_read(report)[1]): Open the rendered report in a browser.
- tar_meta(): View the target metadata, including warnings and errors, number of seconds to run, time the target was last run, and size of the target.
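The metadata table returned by tar_meta() can be long, so it may help to narrow it to the targets that need attention. The following is a minimal sketch, not part of main.R: the helper name flag_problem_targets is hypothetical, and it relies on the name, seconds, warnings and error columns of the metadata, where warnings and error are NA for targets that ran cleanly.

```r
# Hypothetical helper: keeps only the rows of a targets metadata table that
# recorded a warning or an error on their last run.
flag_problem_targets <- function(meta) {
  has_problem <- !is.na(meta$error) | !is.na(meta$warnings)
  meta[has_problem, c("name", "seconds", "warnings", "error"), drop = FALSE]
}
```

After tar_make() has finished, flag_problem_targets(tar_meta()) returns one row per problem target, or zero rows if the run was clean.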
You can also open the output files from the pipeline which are located in the data_derived/ (output data files), figs_and_tabs/ (figures and tables output) and report/ (rendered Rmarkdown documents) directories.