Tips and tricks for decision modeling in R

Introduction

R has a steep learning curve, and practice makes perfect. Still, a little help can make your daily coding more enjoyable. In this document we present (and keep collecting) coding tips and tricks that help you navigate R more easily. Enjoy!

Organizing your files

Use projects

RStudio projects make it straightforward to keep the files of each project together, each project with its own working directory. When you use a project, all its files live in the same directory and you will struggle less with setting your working directory. Sharing your work with colleagues also becomes much easier.

File -> New Project. Here you can create a new project or use an existing folder in case you have already started working on the research. More about RStudio projects: https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects

Use R Markdown

With R Markdown you can combine R code and text to create fully reproducible documentation of your work. More about R Markdown: https://rmarkdown.rstudio.com/lesson-1.html

Set up R and RStudio

We recommend using R with the free version of RStudio. The following tips make the graphical interface even more useful. They are inspired by the official RStudio support website, where you can find more: https://support.rstudio.com/hc/en-us/articles/200549016-Customizing-RStudio

Always update R

Soft-wrap R source

Wrap lines of R source code that exceed the width of the editor onto the next line. Note that this does not insert a line break at the point of wrapping; it simply displays the code on multiple lines in the editor. Go to Tools -> Global Options -> Code -> Editing and check the box for Soft-wrap R source files.

Set margin

Transparent and readable code starts with lines of limited width. Go to Tools -> Global Options -> Code -> Display, check the box for Show margin, and set the margin column to 80.

Rainbow parentheses

We know, this sounds too good to be true! Recent versions of RStudio (1.4 and later) can color your parentheses (and brackets and braces) based on their level of nesting. This is an extremely helpful feature for keeping nested code readable and spotting mismatched brackets. Go to Tools -> Global Options -> Code -> Display and check the box for Rainbow parentheses. More: https://blog.rstudio.com/2020/11/04/rstudio-1-4-preview-rainbow-parentheses/

Modifying code

Adjust multiple lines at once

While coding, you will often decide to rename something in several places. Hold the Option key (Alt on Windows) while dragging the mouse: the cursor turns into a plus sign and you can modify many lines of code at once.

The steps of using the option

Run code line for line

To run your code line by line, place your cursor on the line of code and press “Command” + “Enter” on a Mac or “Ctrl” + “Enter” on Windows.

Class

Many R objects have a class attribute, a character vector giving the names of the classes from which the object inherits.

If the object does not have a class attribute, it has an implicit class, notably “matrix”, “array”, “function” or “numeric” or the result of typeof(x) (which is similar to mode(x)), but for type “language” and mode “call”, where the following extra classes exist for the corresponding function calls: if, while, for, =, <-, (, {, call.

The function class() returns the vector of names of the classes an object inherits from. Correspondingly, class<- sets the classes an object inherits from.

v_vector <- c(1, 2, 3, 4)  # initiate a vector
class(v_vector)

## [1] "numeric"

v_vector_transposed <- t(v_vector)
class(v_vector_transposed)

## [1] "matrix" "array"

unclass(v_vector_transposed)

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
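As a small sketch of setting a class with class<-, the example below attaches an extra class to a plain list and checks it with inherits(). The object name l_params and the class name "model_params" are hypothetical and chosen only for illustration.

```r
# Hypothetical parameter list (names are illustrative only)
l_params <- list(n_cycles = 10, p_sick = 0.05)
class(l_params)                     # implicit class of a list: "list"

# Attach an additional class; the object now inherits from both
class(l_params) <- c("model_params", class(l_params))
class(l_params)

# inherits() checks membership without comparing the full class vector
inherits(l_params, "model_params")  # TRUE
inherits(l_params, "matrix")        # FALSE
```

Prepending the new class to the existing class vector, rather than overwriting it, keeps the object usable by functions that expect a list.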

Check dimensions

Our code uses a lot of matrix multiplication, for which the dimensions of vectors, matrices and arrays matter: the inner dimensions have to match for the multiplication to be possible. You can use the dim() function to retrieve or set the dimensions of an object.

# Remember we had the vector of length 4 
v_vector

## [1] 1 2 3 4

dim(v_vector) # NULL: a plain vector has no dim attribute; %*% treats it as a row or column vector as needed

## NULL

# the transposed one became a matrix
v_vector_transposed

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4

dim(v_vector_transposed)

## [1] 1 4

# Matrix multiplication of a column vector (4 x 1) with a matrix of (1 x 4) results in a 4 x 4 matrix
m_matrix <- v_vector %*% v_vector_transposed
dim(m_matrix)

## [1] 4 4

m_matrix

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]    2    4    6    8
## [3,]    3    6    9   12
## [4,]    4    8   12   16
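The inner-dimension rule can also be checked in code before multiplying. The sketch below uses two hypothetical matrices, m_A and m_B, and captures the error R raises when the dimensions do not conform.

```r
m_A <- matrix(1:6,  nrow = 2, ncol = 3)  # 2 x 3
m_B <- matrix(1:12, nrow = 3, ncol = 4)  # 3 x 4

# Inner dimensions: columns of the left matrix, rows of the right matrix
ncol(m_A) == nrow(m_B)                   # TRUE, so m_A %*% m_B is possible
dim(m_A %*% m_B)                         # outer dimensions: 2 x 4

# Reversing the order does not conform (4 columns versus 2 rows);
# tryCatch() captures the error message instead of stopping the script
res <- tryCatch(m_B %*% m_A, error = function(e) conditionMessage(e))
res
```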

List elements to environment

Lists are useful data structures for storing information of different data types and sizes. Sometimes, however, you would like to move all the elements of a list into a specific environment, for example your global environment. You can do so with the list2env() function from base R.

NOTE: entries in an environment must have unique names, but this is not the case for a list. A list can have duplicated names; in that case, the last element with that name is the one that is used.

# Make a list
l_list <- list(gender = c("Male", "Female", "Male"),
               length = c(182, 175, 178),
               weight = c(80, 65, 75))

# While in the list I have to use the $ to index the values to calculate the BMI
l_list$weight / (l_list$length/100)^2

## [1] 24.15167 21.22449 23.67125

# Move elements of the list to the global environment
ls()

## [1] "l_list"              "m_matrix"            "v_vector"           
## [4] "v_vector_transposed"

list2env(l_list, globalenv())

## <environment: R_GlobalEnv>

ls()

## [1] "gender"              "l_list"              "length"             
## [4] "m_matrix"            "v_vector"            "v_vector_transposed"
## [7] "weight"

# after running the code I can use the element names to calculate BMI
weight / (length/100)^2

## [1] 24.15167 21.22449 23.67125
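The note on duplicated names can be illustrated with a short sketch. It assumes a hypothetical list l_dup and copies it into a fresh environment (e_person, also a made-up name) rather than the global one, so nothing in your workspace is overwritten.

```r
# Hypothetical list with a duplicated name "weight"
l_dup <- list(weight = c(80, 65),
              length = c(182, 175),
              weight = c(81, 66))

# Copy the elements into a new environment instead of the global one
e_person <- list2env(l_dup, envir = new.env())

sort(ls(e_person))               # each name appears once
get("weight", envir = e_person)  # the last "weight" element of the list
```

Using a separate environment is a cautious choice when the list contains names such as length that also exist as base R functions.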

with() in functions

The with() function evaluates an R expression in an environment constructed from data. For example, when I want to use a list inside a function, I prefer to refer to the elements of the list by name, without needing the $ sign.

calc_BMI <- function(my_list) {
  with(my_list, {
    BMI <- weight / (length / 100)^2  # weight in kg, length in cm
    BMI <- round(BMI)
    BMI                               # value of the with() expression
  })  # close with()
}     # close function
  
# Run the function
calc_BMI(my_list = l_list)

## [1] 24 21 24
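with() works the same way on a data frame, where the columns become available by name. A minimal sketch, assuming a hypothetical data frame df_people with the same values as the list above:

```r
df_people <- data.frame(weight = c(80, 65, 75),
                        length = c(182, 175, 178))

# Columns are available by name inside with()
v_BMI <- with(df_people, weight / (length / 100)^2)
round(v_BMI, 1)
```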

Testing

browser() breakpoints

The R function browser() halts execution and invokes an environment browser when it is called. You can put browser() anywhere in your code to stop at that point in the code for debugging.

The browser() statement has to be part of your code, so, like any other code change, you need to re-run (or re-source) the function before it becomes active. You also need to remember to remove the browser() statement once you are done debugging.

For example, in our calc_BMI function we can pause after calculating the BMI and before rounding it. This way we can inspect the values before their decimals are removed.

calc_BMI <- function(my_list) {
  with(my_list, {
    BMI <- weight / (length / 100)^2
    # Add the browser() statement: execution pauses here
    browser()

    BMI <- round(BMI)
    BMI
  })  # close with()
}     # close function
  
  

calc_BMI(my_list = l_list)

More about debugging in RStudio: https://support.rstudio.com/hc/en-us/articles/205612627-Debugging-with-RStudio
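If you only want to pause when something looks wrong, browser() accepts an expr argument and is only invoked when that expression evaluates to TRUE. The sketch below is one possible variant, not part of the original function; the names calc_BMI_checked and l_people are hypothetical.

```r
calc_BMI_checked <- function(my_list) {
  BMI <- with(my_list, weight / (length / 100)^2)

  # Pause only when a BMI value is missing; with clean input the
  # function runs through without stopping
  browser(expr = any(is.na(BMI)))

  round(BMI)
}

l_people <- list(length = c(182, 175, 178),
                 weight = c(80, 65, 75))
calc_BMI_checked(l_people)
```

Because the condition is FALSE for this input, the call returns the rounded BMI values directly; adding an NA to weight would open the browser at that line in an interactive session.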

Profiling

With the profvis() function from the package of the same name you can run an R expression under a profiler, which gives you insight into the time and memory required to run it. This helps to identify bottlenecks in your code when computation time becomes an issue.

library("profvis")

sample_individuals <- function(n_i){
 my_data <- data.frame(
   ID = 1:n_i, 
   age = runif(n_i, min = 30, max = 70) ,
   weight = rnorm(n_i, mean = 75, sd = 7) ,
   length = runif(n_i, min = 120, max = 201) 
 )
 
  plot(age ~ weight, data = my_data)
  m <- lm(age ~ weight, data = my_data)
  abline(m, col = "red")
}

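# Pass the expression and the sampling interval as separate arguments;
# inside a single { } block, "interval = 0.001" would only be an assignment
profvis(
  expr     = sample_individuals(n_i = 1000),
  interval = 0.001  # seconds; the profvis documentation cautions that values below 0.005 may not give accurate timings
)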

Cheat sheets

You are near the end of our summary of useful tips and tricks. However, there is much more to discover, and RStudio has developed great cheat sheets that we would like to recommend: https://www.rstudio.com/resources/cheatsheets/

Contribute

Do you have tips or tricks that made your life easier? Please help us improve this document by sending them to us via any of the contact details below. Thank you!

Contact

Website: http://darthworkgroup.com
Email: info@darthworkgroup.com
GitHub: https://github.com/DARTH-git
Twitter: https://twitter.com/DARTHworkgroup