Quick note about calculating the mean of a column with dplyr in R. It’s surprisingly easy to screw up, and the culprit is reusing the original column’s name for the column that stores the new calculation.

Start with a simple data frame.

library(tidyverse)

df <- data.frame(
  'books_read' = c(1,2,3,4,5,6),
  'intelligence' = c(4,5,6,7,8,8)
)

df
##   books_read intelligence
## 1          1            4
## 2          2            5
## 3          3            6
## 4          4            7
## 5          5            8
## 6          6            8

I want to calculate the mean and standard deviation of the “books read” column. If I calculate the mean and place it into a new column that has the same name as the original variable, then the standard deviation command doesn’t work: it returns NA.

library(tidyverse)
df %>%
  summarise(
    books_read = mean(books_read), # this line is the problem
    sd_books_read = sd(books_read)
  )
##   books_read sd_books_read
## 1        3.5            NA
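Here is a small sketch of why the NA shows up, assuming a reasonably recent dplyr. Expressions inside summarise() are evaluated in order, so once books_read has been overwritten with its mean, later expressions see that single value rather than the original six. The column name n_values_seen is made up for illustration.

```r
library(dplyr)

df <- data.frame(books_read = c(1, 2, 3, 4, 5, 6))

# The sample standard deviation of a single value is undefined, so sd() gives NA
single_sd <- sd(3.5)
single_sd
# NA

# After the first expression, books_read refers to the summary (one value),
# not to the original column of six values
res <- df %>%
  summarise(
    books_read = mean(books_read),
    n_values_seen = length(books_read) # hypothetical helper column: 1, not 6
  )
res
```

So sd(books_read) in the failing call is really sd(3.5), which is NA.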

Instead, I need to give the new “mean books read” column a different name.

library(tidyverse)
df %>%
  summarise(
    mean_books_read = mean(books_read), # new name, so books_read is still the original column
    sd_books_read = sd(books_read)
  )
##   mean_books_read sd_books_read
## 1             3.5      1.870829
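Another way to sidestep the naming problem, as a sketch that assumes dplyr 1.0.0 or later, is to let across() build the output names. With a named list of functions, the default names follow the pattern column_function, so nothing overwrites books_read.

```r
library(dplyr)

df <- data.frame(
  books_read = c(1, 2, 3, 4, 5, 6),
  intelligence = c(4, 5, 6, 7, 8, 8)
)

# across() applies each function to books_read and names the results
# books_read_mean and books_read_sd
res_across <- df %>%
  summarise(across(books_read, list(mean = mean, sd = sd)))
res_across
```

This trades the hand-picked names (mean_books_read) for generated ones (books_read_mean), which may or may not suit the rest of an analysis.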

Bo²m =)