Starting a new project

R users

We recommend using RStudio when starting an R project. It is the most widely used integrated development environment (IDE) among R users and developers.

Work within an RStudio project

  • File > New Project > New Directory > New Project : creates a .Rproj file located at the project root
    │
    ├── [my_project]
    │   └── my_project.Rproj
    │
    ├── [another_project]
    │   └── another_project.Rproj
    │
    ├── [again_another_project]
    │   └── again_another_project.Rproj
    │
  • Name your project without spaces or accented characters

  • Within your project, always use relative paths from the project root to ensure portability across users (see the here R package).
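
As a minimal sketch of the relative-path advice above: the file name raw-data/measurements.csv is hypothetical and only serves the illustration. The here package is used only if it is installed.

```r
# Fragile: an absolute path that only exists on one machine (hypothetical).
# data <- read.csv("C:/Users/me/Documents/my_project/raw-data/measurements.csv")

# Portable: a path relative to the project root, built with base R.
data_path <- file.path("raw-data", "measurements.csv")

# With the here package, the path is resolved from the project root,
# whatever the current working directory is.
if (requireNamespace("here", quietly = TRUE)) {
  data_path <- here::here("raw-data", "measurements.csv")
}

print(data_path)
```

Either way, the script never mentions a location that is specific to one computer.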

Start version control with git

How to use git

The files you need from the very beginning

  • DESCRIPTION & NAMESPACE files : these files contain the metadata of your project (or package). You can generate them with the following command in the R console:
usethis::create_package("mypackage")

To edit the DESCRIPTION file, see the documentation section. The NAMESPACE file is generated automatically by the devtools::document() function.

  • LICENSE : your project needs a license, namely a text file named LICENSE that states under which license your code is released. To choose one, refer to https://choosealicense.com. For example, GPL-2 (https://choosealicense.com/licenses/gpl-2.0/) is a widely used free software license.

  • README : a text file that tells your reader (i) what the objectives of your project are, (ii) how to use it, and (iii) how to install it. Update the README as the project develops.
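
To give an idea of what the metadata in a DESCRIPTION file looks like, the sketch below writes a minimal one with base R's write.dcf() into a temporary directory. The field values are placeholders chosen for the illustration. In practice usethis::create_package() generates a more complete file for you.

```r
# Placeholder metadata; a real DESCRIPTION has more fields (authors, license, ...).
desc <- data.frame(
  Package = "mypackage",
  Title = "What the Project Does",
  Version = "0.0.0.9000",
  stringsAsFactors = FALSE
)

# Write to a temporary directory so that no real project file is overwritten.
desc_file <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, file = desc_file)

# Read the file back as a character matrix, one column per field.
desc_read <- read.dcf(desc_file)
print(desc_read)
```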

How to organize your project directories

  • The idea is to sort the files you produce into different subdirectories of the project root. Each subdirectory holds one type of file. How many subdirectories you need (i.e., how your files are classified) is up to you and depends on the project you are developing.

  • As an example, a data analysis project may have the following 5 subdirectories :

    • raw-data : contains the raw data used for the analysis, and should remain untouched.
    • transformed-data : useful if you need to alter or transform the data and then store the result.
    • analysis : stores the R scripts that run the analysis and produce results.
    • outputs : may contain analysis results, like figures or tables.
    • R : contains only your own custom R functions, which are called by the analysis scripts.
  .
  ├── my_project.Rproj
  ├── [raw-data]
  ├── [transformed-data]
  ├── [analysis]
  ├── [R]
  ├── [outputs]
  ├── README
  ├── DESCRIPTION
  ├── NAMESPACE

  • This structure should be adapted to each project's complexity and needs. Nevertheless, remember that you are trying to facilitate the reuse of your code, so keep your organization simple and readable.
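
The example layout above can be created from the R console with base R. This is only a sketch: a temporary directory stands in for the project root (an assumption made so that the code can be run anywhere), and the folder names are the ones from the example.

```r
# Stand-in for the project root; in a real project this would be
# the folder that contains the .Rproj file.
project_root <- file.path(tempdir(), "my_project")

# The five example folders from the data analysis layout.
sub_dirs <- c("raw-data", "transformed-data", "analysis", "R", "outputs")

# Create each folder; recursive = TRUE also creates the root if it is missing.
for (d in sub_dirs) {
  dir.create(file.path(project_root, d), recursive = TRUE, showWarnings = FALSE)
}

# List what was created, relative to the project root.
created <- list.dirs(project_root, full.names = FALSE, recursive = FALSE)
print(created)
```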

Other good organisation practices

  • Write each of your custom R functions in its own .R file
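
As an illustration, here is what such a file could contain. The function name standardize and the file name R/standardize.R are hypothetical.

```r
# Content of a hypothetical file R/standardize.R: one function, one file.

# Centre a numeric vector on its mean and scale it by its standard deviation.
standardize <- function(x) {
  (x - mean(x)) / sd(x)
}

# An analysis script could then load the function with, for example,
# source(file.path("R", "standardize.R")) or devtools::load_all(),
# and call it:
z <- standardize(c(1, 2, 3))
print(z)
```

Naming the file after the function makes it easy to find where each function is defined.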

Here is a list of useful R packages when developing R projects and/or packages

  • devtools : a package that provides functions to ease the development of R packages, by executing common tasks like documentation, testing, etc. e.g.:
    • devtools::document() : generates the NAMESPACE file and documentation files for your functions
    • devtools::load_all() : loads your code and functions in the R environment, so that you can test them immediately
    • devtools::test() : runs the tests you have written for your functions
  • usethis : a workflow automation package for setting up and developing R projects, e.g.:
    • usethis::use_git() : initializes a git repository in your project
    • usethis::use_testthat() : initializes the testthat package in your project
    • usethis::use_package("package") : adds a package to the DESCRIPTION file as a dependency of your package
    • usethis::use_r() : creates a new R script in the R/ directory
    • usethis::use_test("myTest") : creates a new test script tests/testthat/test-myTest.R
  • here : a package that provides functions to easily refer to files in your project, e.g.:
    • here::here() : returns the path to the project root
    • here::here("R", "my_function.R") : returns the path to the my_function.R file in the R directory
  • testthat : a package that provides functions to write and run tests for your functions. Check the official documentation and the dedicated section to see how to write tests. Once you have set up testing with usethis::use_testthat(), you can write tests in the tests/testthat/ directory:
    • usethis::use_test("my_function") : creates a new test script tests/testthat/test-my_function.R
    • devtools::test() : to run the tests.
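
As a rough sketch of what a test script may look like: the function my_function below is hypothetical and is defined inline only to keep the example self-contained. In a real project it would live in its own file under R/ and the test would go in tests/testthat/test-my_function.R.

```r
# Hypothetical function under test.
my_function <- function(x) {
  x + 1
}

# The requireNamespace() guard only keeps this sketch runnable where
# testthat is not installed; a real test file would not need it.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("my_function adds one to its input", {
    testthat::expect_equal(my_function(1), 2)
    testthat::expect_equal(my_function(c(1, 2)), c(2, 3))
  })
}
```

Running devtools::test() from the project root then executes every test script found in tests/testthat/.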