Starting a new project
R users
We recommend using RStudio when starting an R project. It is the most widely used integrated development environment (IDE) among R users and developers.
Work within an RStudio project
- File > New Project > New Directory > New Project: creates a .Rproj file located at the project root
│
├── [my_project]
│ └── my_project.Rproj
│
├── [another_project]
│ └── another_project.Rproj
│
├── [again_another_project]
│ └── again_another_project.Rproj
│
- Name your project without blank spaces or accents.
- Within your project, always use relative paths from the project root to ensure portability across users (see the here R package).
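As a minimal sketch of the relative-path advice above: the folder and file names (raw-data, measurements.csv) and the absolute path in the comment are hypothetical. The here call is guarded so the snippet still runs when that package is not installed.

```r
# Fragile: an absolute path that only exists on one machine (hypothetical example)
# dat <- read.csv("C:/Users/alice/Documents/my_project/raw-data/measurements.csv")

# Portable: a path written relative to the project root.
# file.path() joins its arguments with "/" on every platform.
data_path <- file.path("raw-data", "measurements.csv")
print(data_path)

# here::here() builds the same path starting from the project root it detects
# (for example, the folder that holds the .Rproj file).
if (requireNamespace("here", quietly = TRUE)) {
  print(here::here("raw-data", "measurements.csv"))
}
```

The relative form keeps working when the project folder is moved or cloned by another user, which is the point of the recommendation.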
Start version control with git
The files you need from the very beginning
DESCRIPTION & NAMESPACE files: these files contain the metadata of your project (or package). You can generate them with the following command in the R console:
usethis::create_package("mypackage")
To edit the DESCRIPTION file, see the documentation section. The NAMESPACE file is automatically generated by the devtools::document() function.
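To give an idea of what the DESCRIPTION file holds, here is a minimal sketch that writes one by hand with base R and reads it back. All field values are hypothetical placeholders, and the file is written to a temporary folder; in practice the file generated by usethis::create_package() would be edited instead.

```r
# Write a minimal DESCRIPTION into a temporary directory (all values are hypothetical).
pkg_dir <- file.path(tempdir(), "mypackage")
dir.create(pkg_dir, showWarnings = FALSE)

fields <- data.frame(
  Package     = "mypackage",
  Title       = "What the Project Does in One Line",
  Version     = "0.0.0.9000",
  Description = "A longer description of the project, in one paragraph.",
  License     = "GPL-2",
  stringsAsFactors = FALSE
)
desc_file <- file.path(pkg_dir, "DESCRIPTION")
write.dcf(fields, file = desc_file)

# Read it back: read.dcf() returns a character matrix with one column per field.
desc <- read.dcf(desc_file)
print(desc[1, "Package"])
print(desc[1, "Version"])
```

DESCRIPTION uses the "field: value" format shown here, which is why it can be read and written with read.dcf() and write.dcf().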
LICENSE: you need a license for your project, namely a text file named LICENSE that states the license under which your code is released. To choose one, refer to https://choosealicense.com. For example, GPL-2 (https://choosealicense.com/licenses/gpl-2.0/) is a widely used free software license.
README: the README is a text file that tells your readers (i) what the objectives of your project are, (ii) how to use it, and (iii) how to install it. Update the README as the project develops.
How to organize your project repositories
The idea is to sort the files you are going to produce into different sub-repositories (sub-directories) under the project root. Each sub-repository receives a certain type of file. How many sub-repos you need (i.e., how your files are classified) is up to you and depends on the specific project you are developing.
As an example, for a data analysis project you may have the following 5 sub-repos:
- raw-data: contains the raw data used for the analysis, and should remain untouched.
- transformed-data: this sub-repo is useful if you need to alter or transform the data, and then store it.
- analysis: used to store the R scripts that run the analysis and produce results.
- outputs: may contain analysis results, like figures or tables.
- R: contains only your own custom R functions, which are called by the analysis scripts.
.
├── my_project.Rproj
├── [raw-data]
├── [transformed-data]
├── [analysis]
├── [R]
├── [outputs]
├── README
├── DESCRIPTION
├── NAMESPACE
- This structure should of course be adapted to each project's complexity and needs. Nevertheless, remember that you are trying to facilitate the reuse of your code, so keep your organization simple and readable.
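The example layout above can be created from the R console. The sketch below uses base R only and, as an assumption for illustration, builds the folders inside a temporary directory standing in for the project root; in a real project you would create them directly at the root.

```r
# Create the example layout inside a temporary folder standing in for the project root.
project_root <- file.path(tempdir(), "my_project")
sub_dirs <- c("raw-data", "transformed-data", "analysis", "R", "outputs")

for (d in sub_dirs) {
  # recursive = TRUE also creates project_root itself on the first pass
  dir.create(file.path(project_root, d), recursive = TRUE, showWarnings = FALSE)
}

# List what was created
print(sort(list.dirs(project_root, full.names = FALSE, recursive = FALSE)))
```

Adding or removing names in sub_dirs is enough to adapt the layout to a different project.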
Some other good general organisation practices
- Write each of your custom R functions in its own .R file
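Reading the advice above as "one function per file", a minimal sketch could look like this. The two conversion helpers and the temporary folder are hypothetical and only stand in for your own functions and your project's R sub-repository.

```r
# Temporary folder standing in for the R/ sub-repository of a project
r_dir <- file.path(tempdir(), "my_project_functions", "R")
dir.create(r_dir, recursive = TRUE, showWarnings = FALSE)

# Two hypothetical helpers, each saved in a file named after the function
writeLines("celsius_to_kelvin <- function(x) x + 273.15",
           file.path(r_dir, "celsius_to_kelvin.R"))
writeLines("kelvin_to_celsius <- function(x) x - 273.15",
           file.path(r_dir, "kelvin_to_celsius.R"))

# An analysis script can then load every function file in one loop
for (f in list.files(r_dir, pattern = "\\.R$", full.names = TRUE)) {
  source(f)
}

print(celsius_to_kelvin(0))
```

Naming each file after the function it contains makes a function easy to find and keeps version control changes small.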
Here is a list of useful R packages when developing R projects and/or packages
devtools: a package that provides functions to ease the development of R packages by executing common tasks like documentation, testing, etc. For example:
- devtools::document(): generates the NAMESPACE file and documentation files for your functions
- devtools::load_all(): loads your code and functions in the R environment, so that you can test them immediately
- devtools::test(): runs the tests you have written for your functions
usethis: a workflow automation package for setting up and developing R projects. For example:
- usethis::use_git(): initializes a git repository in your project
- usethis::use_testthat(): sets up the testthat package in your project
- usethis::use_package("package"): adds a package to the DESCRIPTION file as a dependency of your package
- usethis::use_r(): creates a new R script in the R/ sub-repository
- usethis::use_test("myTest"): creates a new test script tests/testthat/test-myTest.R
here: a package that provides functions to easily refer to files in your project. For example:
- here::here(): returns the path to the project root
- here::here("R", "my_function.R"): returns the path to the my_function.R file in the R sub-repository
testthat: a package that provides functions to write and run tests for your functions. Check the official documentation and the dedicated section to see how to write tests. Once you have initiated your tests with usethis::use_testthat(), you can write tests in the tests/testthat/ sub-repository:
- usethis::use_test("my_function"): creates a new test script tests/testthat/test-my_function.R
- devtools::test(): runs the tests
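As an illustration of what a test script may contain, here is a minimal sketch. The function add_one() and the file names in the comments are hypothetical; the snippet assumes the testthat package is installed and defines the function inline so that it can be run on its own.

```r
library(testthat)

# Hypothetical function that would normally live in R/add_one.R
add_one <- function(x) {
  x + 1
}

# Content that would go in tests/testthat/test-add_one.R
test_that("add_one() adds one to each element", {
  expect_equal(add_one(1), 2)
  expect_equal(add_one(c(0, 10)), c(1, 11))
  # Non-numeric input is expected to raise an error
  expect_error(add_one("a"))
})
```

Inside a package, only the test_that() block goes in the test file; the function itself is made available by devtools::load_all() or devtools::test().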