Datasources for reuse

This package has been created to help NHS, Public Health and related analysts/data scientists learn to use R. It contains free datasets (just one at the moment), help files explaining their structure, and vignette examples of their use. We encourage contributions to the package, both to expand the set of training material and to give newer R/GitHub users an opportunity to make a first contribution. Please add relevant free, open source datasets that you think may benefit the NHS-R community.

Installation instructions

This package is available on CRAN, or the development version can be installed from source via this GitHub repository. To build the package from source you will need Rtools installed, and the remotes package.
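
The two installation routes can be sketched as follows. The package name is taken from the Code of Conduct note in this README; the GitHub path nhs-r-community/NHSRdatasets is an assumption and should be checked against the URL of this repository.

```r
# Released version from CRAN
install.packages("NHSRdatasets")

# Development version from source.
# Needs the remotes package, plus Rtools to build the package.
install.packages("remotes")
# The repository path below is an assumption; substitute this repository's path.
remotes::install_github("nhs-r-community/NHSRdatasets")
```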



Please contribute to this repository, and please cite it when you use it in training or publications.

To contribute, please:

  • Fork the repository.
  • Add your dataset in the data folder, in .rda format. The best way to do this is with the usethis package with “gzip” compression: usethis::use_data(data, compress="gzip")
  • Please add a minimal R function to act as a help file. You can use the LOS_model as a guide.
  • Please add a vignette demonstrating how the data has been/can be used.
  • Create a pull request, detailing your additions, and we will review it before merging.
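
As a rough sketch of the dataset and help-file steps above: the object name example_los and its columns are hypothetical, invented purely for illustration, and the values are simulated. Inside the package project you would call usethis::use_data() as described; the base R save() call here only illustrates the resulting gzip-compressed .rda file so the example can be run outside a package.

```r
# Hypothetical synthetic dataset; names and values are illustrative only.
set.seed(42)
example_los <- data.frame(
  id           = 1:10,
  organisation = rep(c("Trust1", "Trust2"), each = 5),
  age          = sample(18:90, 10, replace = TRUE),
  los          = sample(1:14, 10, replace = TRUE)
)

# Inside the package project, run from the repository root:
# usethis::use_data(example_los, compress = "gzip")

# Outside a package, base R can write a gzip-compressed .rda file:
rda_path <- file.path(tempdir(), "example_los.rda")
save(example_los, file = rda_path, compress = "gzip")

# Reload into a separate environment to confirm the round trip.
check_env <- new.env()
load(rda_path, envir = check_env)
stopifnot(identical(check_env$example_los, example_los))

# Minimal roxygen2 help file, which would live in e.g. R/example_los.R
# (hypothetical file name). The quoted name on the last line tells
# roxygen2 which dataset the comments document.

#' Example length of stay data (hypothetical, simulated)
#'
#' @format A data frame with 10 rows and 4 variables:
#' \describe{
#'   \item{id}{Row identifier}
#'   \item{organisation}{Simulated organisation label}
#'   \item{age}{Simulated age in years}
#'   \item{los}{Simulated length of stay in days}
#' }
#' @source Simulated for illustration; not real patient data.
"example_los"
```

After adding the help file, running devtools::document() in the package project regenerates the .Rd files from the roxygen2 comments, assuming the package uses roxygen2 for its documentation.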

When contributing a dataset, the contributor certifies that:

  • They are the data owner, or are authorised to republish the dataset in question.
  • The dataset does not contain real patient-level data.
  • Where based on patient data, the contributor takes full responsibility for sharing the data and certifies that it has been processed, anonymised, aggregated or otherwise protected in accordance with all legal requirements under the General Data Protection Regulation (GDPR), or other relevant legislation.

Please note that the ‘NHSRdatasets’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.