How to create a Pivot Table in R

Why I make this?

Pivot Tables are beautiful. Even Steve Jobs thought so. But they’re a pain to create in R.

Not anymore!

How to do this?

If you’ve got data that looks like this –

S.No. UID District Block Age Gender Social Category
1 239140001 HOOGHLY ARAMBAGH 7 GIRL GENERAL
2 239140002 HOWRAH SHYAMPUR-I 2 BOY GENERAL
3 239140003 KOLKATA WARD 72 7 BOY SC
4 239140004 HOOGHLY BALAGARH 3 GIRL ST
5 239140005 BIRBHUM DUBRAJPUR 5 BOY SC
6 239140006 KOLKATA WARD 101 8 TRANS OBC

and you want a pivot table that looks like this –

District BOY GIRL TRANS
BIRBHUM 23 19 0
HOOGHLY 36 49 2
HOWRAH 13 19 4
KOLKATA 83 69 1

This is what you need to do –

1
2
as.data.frame(table(df[,c(3,6)])) %>% 
  pivot_wider(names_from="Gender", values_from="Freq")

That’s it!!! And it’s a million times faster than running creating an empty data frame and then populating it with for loops.

You can now create a pivot table for any two pairs by simply switching the columns!


Here’s what’s happening in the function –

  1. You segment the original data frame (df) with just the two columns that you need, i.e., District and Gender. You can do it by referring the column numbers directly (df[,c(3,6)]) or by running a which call using the column name (df[,c(which(colnames(df)=="District"),which(colnames(df)=="Gender"))]).
  2. Next, you run the table function on it get the counts for each Gender against each District. We wrap this inside a as.data.frame function because that’s just easier.
  3. Now we use the %>% (pipe) operator to run the pivot_wider function on the data frame we’ve just created.
  4. We tell pivot_wider to take the column names from the column called “Gender” and the values for that [District,Gender] entity from the “Freq” column. Note that the “Freq” column is automatically created when we run the table function to get the counts.

Note: Make sure you’ve run library(dplyr) and library(tidyr) first so you can use the pipe operator and the pivot_wider function.

So elegant, isn’t it?

The pivot_wider function is helpful when you want to turn your rows into columns. In this example, we get a lot of rows from running the table function. (The # of rows in Table = # of unique Districts x # of unique Genders)

A sister function is the pivot_longer function, which is useful for the opposite case – when you have a lot of columns that you want to turn into rows.