A few quick general R tips

FAQs
Posted Thursday, October 10, 2024 at 3:30 PM

Hi everyone!

As I’ve been looking through your exercises, I’ve noticed a few little R issues that might sometimes be tripping you up. They’re super minor, but fixing them can make life easier:

Talking about packages and functions

You’ve probably noticed that on the course website here, I put package names in {}s, like {ggplot2} or {gghalves}. This is a normal convention in the R world—people generally either put package names in {braces} or in bold.

When writing about functions, I typically format them as code, followed by empty open and closed parentheses, like geom_point(). That signals that it’s actual runnable code and not the name of a package. Remember that you can format things as inline code by using single backticks.

For instance, this:

You can use `geom_point()` from {ggplot2} to make scatterplots.

will knit into this:

  • You can use geom_point() from {ggplot2} to make scatterplots.

Nicer {ggplot2} documentation

The ggplot documentation within R (i.e. in the help panel) is good, but I find that it’s nicer to use the documentation website. It’s the same exact content, but the website version shows plots for each of the examples. Scroll down to the examples section for geom_point(), for instance.

The tidyverse package shortcut

When you run library(tidyverse), R shows a message that says it has loaded 9 packages:

── Attaching core tidyverse packages ────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     

The tidyverse package developers have found that those 9 are some of the most common packages that people use, so they created library(tidyverse) as a shortcut for loading them all at the same time. Alternatively, you could start your documents like this:

library(ggplot2)
library(dplyr)
library(readr)
library(tidyr)
library(stringr)
library(forcats)
# etc.

But that’s a lot of typing. It’s easier to just do library(tidyverse).

If you load the tidyverse package, you don’t need to load those 9 individual packages. Doing this is entirely redundant:

library(tidyverse)
library(ggplot2)  # This is already loaded bc tidyverse
library(dplyr)  # This is also already loaded bc tidyverse

I’ve tried to point out in your exercises if/when you do this.
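
If you’re ever unsure whether a package is already loaded, you can check R’s search path. This is just a minimal sketch using base R: `is_attached()` is a made-up helper name for illustration (it isn’t part of {tidyverse}), and it uses {stats}, which R attaches by default, as a stand-in for an already-loaded package.

```r
# Hypothetical helper: is a package currently attached with library()?
is_attached <- function(pkg) {
  paste0("package:", pkg) %in% search()
}

# {stats} is attached by default in a normal R session
is_attached("stats")

# Calling library() again on an attached package changes nothing
before <- search()
library(stats)
identical(before, search())
```

After running library(tidyverse), is_attached("dplyr") should come back TRUE in the same way, which is why a separate library(dplyr) adds nothing.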

Chunk labels

Labeling your R chunks is a good thing to do, since it helps with document navigation and is generally good practice. If you’re using chunk labels, make sure you don’t use spaces in them. R will still knit a document with spaceful names, but it converts the spaces to underscores before doing it. So instead of naming chunks like #| label: My Neat Chunk, use something like #| label: my-neat-chunk or #| label: my_neat_chunk:

```{r}
#| label: my-neat-chunk

```
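
If you already have a descriptive title in mind, one way to turn it into a space-free label is sketched below. `clean_label()` is a hypothetical helper written for this example, not a {knitr} or Quarto function, and lowercase-with-hyphens is just one reasonable convention.

```r
# Hypothetical helper: turn any text into a space-free chunk label
clean_label <- function(label) {
  label <- tolower(trimws(label))
  label <- gsub("[^a-z0-9]+", "-", label)  # runs of anything else become one hyphen
  gsub("^-+|-+$", "", label)               # drop leading/trailing hyphens
}

clean_label("My Neat Chunk")
#> [1] "my-neat-chunk"
```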

Code style

Unlike other programming languages (grumbles at Python), R is fairly forgiving with the style of the code you write. You can have extra spaces, you can omit spaces, you can indent things however you want, etc.
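
As a small illustration of how forgiving R is, the two versions below do exactly the same thing. The object names are made up for this example; the only difference is spacing.

```r
# Cramped and inconsistently spaced, but R runs it without complaint
messy<-c( 1,2 ,3 )
total_messy<-sum(messy )

# Same code with consistent spacing
tidy <- c(1, 2, 3)
total_tidy <- sum(tidy)

identical(total_messy, total_tidy)
#> [1] TRUE
```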

But if you want to write readable code (you do!) and write code that others can work with (you do!), you should follow some basic style guidelines. I’ve summarized a few of the most important ones here.

I’m never going to grade you on any of this, by the way! These are just a set of best practices that you should get into the habit of using.