For this Tardy Tuesday session we looked at the Tidy Tuesday data challenge for 2024-06-18, which involved working out the date on which each public holiday in the USA (a rare thing) is expected to fall each year.
Brendan led and ‘scribed’ the session.
Analysis
We used the tidytuesdayR package to load the data, then pushed the resulting data frames into the global environment using list2env.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.0 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
--- Compiling #TidyTuesday Information for 2024-06-18 ----
--- There are 2 files available ---
--- Starting Download ---
Downloading file 1 of 2: `federal_holidays.csv`
Downloading file 2 of 2: `proposed_federal_holidays.csv`
--- Download complete ---
<environment: R_GlobalEnv>
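The loading call itself does not appear in the output above. A minimal sketch of what such a step might look like, assuming `tidytuesdayR::tt_load()` is called with the date from the download log; `load_holidays` is a hypothetical helper name, and the exact call used in the session may have differed:

```r
# A sketch only: the exact call used in the session is not shown.
# Requires the tidytuesdayR package and an internet connection.
load_holidays <- function(date = "2024-06-18") {
  tt <- tidytuesdayR::tt_load(date)
  # Copy each data frame into the global environment under its own name
  list2env(
    list(
      federal_holidays = tt$federal_holidays,
      proposed_federal_holidays = tt$proposed_federal_holidays
    ),
    envir = .GlobalEnv
  )
}

# Usage (downloads the two CSV files):
# load_holidays()
```

list2env returns the target environment, which would explain the `<environment: R_GlobalEnv>` line printed at the end of the output.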
Our main dataset looked as follows:
federal_holidays
# A tibble: 11 × 6
   date  date_definition official_name year_established date_established details
   <chr> <chr>           <chr>                    <dbl> <date>           <chr>
 1 Janu… fixed date      New Year's D…             1870 1870-06-28       "Celeb…
 2 Janu… 3rd monday      Birthday of …             1983 1983-11-02       "Honor…
 3 Febr… 3rd monday      Washington's…             1879 NA               "Honor…
 4 May … last monday     Memorial Day              1868 NA               "Honor…
 5 June… fixed date      Juneteenth N…             2021 2021-06-17       "Comme…
 6 July… fixed date      Independence…             1870 NA               "Celeb…
 7 Sept… 1st monday      Labor Day                 1894 NA               "Honor…
 8 Octo… 2nd monday      Columbus Day              1968 NA               "Honor…
 9 Nove… fixed date      Veterans Day              1938 NA               "Honor…
10 Nove… 4th thursday    Thanksgiving…             1941 NA               "Tradi…
11 Dece… fixed date      Christmas Day             1870 NA               "The m…
We were interested in the ‘roaming holidays’: those where the date column contains a range of dates and the date_definition column gives the rule used to pick the specific date in a given year.
We decided to try to solve the problem manually for MLK Day, which should fall on the third Monday in January.
# find 3rd monday of january 202x
wday("2024-06-17")
[1] 2
date_range <- "January 15–21"
year <- 2024
# find monday (2) in date range
start_date <- "January 15 2024"
end_date <- "January 21 2024"
mdy(start_date)
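One way the manual search could be finished is to list every day in the range and keep the one whose weekday number is 2. This is a minimal sketch, assuming lubridate is available; `find_weekday_in_range` is a hypothetical helper name rather than anything written in the session, and `week_start = 7` is set explicitly so that Sunday is 1 and Monday is 2 regardless of local options:

```r
library(lubridate)

# Hypothetical helper: return the date(s) between start_date and end_date
# (inclusive, given as "Month day year" strings) that fall on target_wday.
find_weekday_in_range <- function(start_date, end_date, target_wday = 2) {
  # Every calendar day in the range, as Date objects
  candidates <- seq(mdy(start_date), mdy(end_date), by = "day")
  # Keep only the days matching the requested weekday (2 = Monday)
  candidates[wday(candidates, week_start = 7) == target_wday]
}

find_weekday_in_range("January 15 2024", "January 21 2024")
```

For 2024 this should pick out 15 January. Because a seven-day range contains each weekday exactly once, the same helper should work for the other ‘roaming holidays’ once their date ranges are split into start and end strings.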