Week 9: Grouping, Summarising, and Tidying Data
Master grouped operations, reshape data layouts, and combine datasets.
Reshaping Data: `pivot_longer()` and `pivot_wider()`
Pivoting changes the layout of your data frame between "wide" and "long" formats.
`pivot_longer()`: From Wide to Long
Use `pivot_longer()` when some of your column names are actually values of a variable. It "lengthens" data by increasing the number of rows and decreasing the number of columns.
Key arguments:
- `cols`: The columns to pivot into longer format (gather).
- `names_to`: Name of the new column that will contain the original column names.
- `values_to`: Name of the new column that will contain the values from the original pivoted columns.
library(tidyr)  # provides pivot_longer(), pivot_wider(), and re-exports the %>% pipe

# Example wide data: one row per student, one column per test
wide_data <- data.frame(
  student = c("Alice", "Bob"),
  test1_score = c(85, 90),
  test2_score = c(88, 85)
)
print(wide_data)

# Pivot test scores into longer format: one row per student-test pair
long_data <- wide_data %>%
  pivot_longer(
    cols = c(test1_score, test2_score), # Or cols = ends_with("_score")
    names_to = "test_type",
    values_to = "score"
  )
print(long_data)
# Output:
# # A tibble: 4 × 3
#   student test_type   score
#   <chr>   <chr>       <dbl>
# 1 Alice   test1_score    85
# 2 Alice   test2_score    88
# 3 Bob     test1_score    90
# 4 Bob     test2_score    85
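The `cols` argument accepts tidyselect helpers as well as explicit column names, as the comment in the example hints. The sketch below is a minimal illustration of two alternative selections, assuming the same toy `wide_data` as above; the object names `long_by_helper` and `long_by_exclusion` are made up for this example.

```r
library(tidyr)

# Same toy data as in the example above
wide_data <- data.frame(
  student = c("Alice", "Bob"),
  test1_score = c(85, 90),
  test2_score = c(88, 85)
)

# Select the score columns by their shared suffix
long_by_helper <- pivot_longer(
  wide_data,
  cols = ends_with("_score"),
  names_to = "test_type",
  values_to = "score"
)

# Select "every column except student"
long_by_exclusion <- pivot_longer(
  wide_data,
  cols = -student,
  names_to = "test_type",
  values_to = "score"
)

print(long_by_helper)
print(long_by_exclusion)
```

Both calls pivot the same two columns here, so they should give the same long table as the explicit `c(test1_score, test2_score)` version. Selecting by pattern or by exclusion tends to be more convenient when there are many columns to pivot.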
`pivot_wider()`: From Long to Wide
Use `pivot_wider()` when an observation is scattered across multiple rows. It "widens" data by increasing columns and decreasing rows.
Key arguments:
- `names_from`: The column whose values will become the new column names.
- `values_from`: The column whose values will fill the new columns.
# Using long_data from above
wide_again <- long_data %>%
  pivot_wider(
    names_from = test_type,
    values_from = score
  )
print(wide_again)
# Output (similar to original wide_data):
# # A tibble: 2 × 3
#   student test1_score test2_score
#   <chr>         <dbl>       <dbl>
# 1 Alice            85          88
# 2 Bob              90          85
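Real long data is not always complete. The sketch below uses a hypothetical data frame, `incomplete_long`, in which Bob has no row for the second test, to show what `pivot_wider()` does in that case. The `values_fill` argument is not covered above; it is included here only as one possible way to handle the gap.

```r
library(tidyr)

# Hypothetical long data: Bob has no row for test2_score
incomplete_long <- data.frame(
  student = c("Alice", "Alice", "Bob"),
  test_type = c("test1_score", "test2_score", "test1_score"),
  score = c(85, 88, 90)
)

# By default, a missing student/test combination becomes NA
wide_with_na <- pivot_wider(
  incomplete_long,
  names_from = test_type,
  values_from = score
)
print(wide_with_na)

# values_fill supplies a replacement for those missing cells
wide_filled <- pivot_wider(
  incomplete_long,
  names_from = test_type,
  values_from = score,
  values_fill = 0
)
print(wide_filled)
```

Whether filling with 0 is appropriate depends on the data: for test scores, `NA` ("not taken") is usually the more honest value, so treat the filled version as an illustration of the argument rather than a recommendation.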
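Pivoting matters because different tools expect different layouts. Grouped summaries and plotting packages such as ggplot2 generally work best with long data (one row per observation), while wide data is often easier to read in a table or to compare column by column.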
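As one illustration of how layout affects analysis, the sketch below pivots the toy scores to long format and then computes a per-student mean with dplyr's `group_by()` and `summarise()`. It assumes the same `wide_data` as in the earlier examples; `student_means` and `mean_score` are names chosen for this example.

```r
library(dplyr)
library(tidyr)

# Same toy data as in the earlier examples
wide_data <- data.frame(
  student = c("Alice", "Bob"),
  test1_score = c(85, 90),
  test2_score = c(88, 85)
)

student_means <- wide_data %>%
  # Long format: one row per student-test pair
  pivot_longer(
    cols = ends_with("_score"),
    names_to = "test_type",
    values_to = "score"
  ) %>%
  # With all scores in a single column, one grouped summary covers every test
  group_by(student) %>%
  summarise(mean_score = mean(score), .groups = "drop")

print(student_means)
# Alice: (85 + 88) / 2 = 86.5
# Bob:   (90 + 85) / 2 = 87.5
```

In the wide layout the same calculation would need each test column named explicitly, which does not scale as more tests are added.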