2020: Day 11

base R
tidyverse
matrix
loops
Author

Ella Kaye

Published

December 11, 2020

Setup

The original challenge

My data

Part 1

My code for Day 11 runs a little slow (about 10 seconds for Part 1 and 80 seconds for Part 2), so for the sake of being able to rebuild this page quickly as I keep updating it working through the challenges, I will demonstrate this code with the test input provided as an example.

library(dplyr)
library(stringr)
library(tidyr)

First we read in the data and convert it to a matrix (using the datapasta package for the test input):

layout <- tibble::tribble(
  ~X1,
  "L.LL.LL.LL",
  "LLLLLLL.LL",
  "L.L.L..L..",
  "LLLL.LL.LL",
  "L.LL.LL.LL",
  "L.LLLLL.LL",
  "..L.L.....",
  "LLLLLLLLLL",
  "L.LLLLLL.L",
  "L.LLLLL.LL"
  )
# get number of columns for matrix
num_col <- layout %>%
  mutate(length = str_length(X1)) %>%
  slice(1) %>%
  pull(length)

# split layout into characters and turn to vector
layout_vec <- layout %>%
  mutate(X1 = strsplit(X1, split = character(0), fixed = TRUE)) %>%
  pull(X1) %>%
  unlist()

# organise into matrix
initial_layout <- matrix(layout_vec, ncol = num_col, byrow = TRUE)

Next, we write a helper function that, given a matrix and row and column indices, returns a vector of the adjacent seats. We need to take care when indexing into the matrix, so we treat all corner and edge cases separately. Fiddly, but gets the job done.

get_adj <- function(mat, i,j) {
  
  nr <- nrow(mat)
  nc <- ncol(mat)
  
  # corner cases
  if (i == 1 & j == 1) {adj <- c(mat[1,2], mat[2,1:2])}
  else if (i == 1 & j == nc) {adj <- c(mat[1,(nc-1)], mat[2,(nc-1):nc])}
  else if (i == nr & j == 1) {adj <- c(mat[nr,2], mat[nr-1,1:2])}
  else if (i == nr & j == nc) {adj <- c(mat[nr-1, (nc-1):nc], mat[nr, nc-1])}  
  
  # edge cases
  else if (i == 1) {adj <- c(mat[1, c(j-1,j+1)], mat[2, (j-1):(j+1)])}
  else if (i == nr) {adj <- c(mat[nr, c(j-1,j+1)], mat[nr-1, (j-1):(j+1)])}
  else if (j == 1) {adj <- c(mat[c(i-1, i+1), 1], mat[(i-1):(i+1), 2])}
  else if (j == nc) {adj <- c(mat[c(i-1, i+1), nc], mat[(i-1):(i+1), nc-1])}
  
  # inside cases
  else {adj <- c(mat[i-1,(j-1):(j+1)], mat[i,c(j-1,j+1)], mat[i+1,(j-1):(j+1)])}
  
  adj
}

Once we have a vector of surrounding seats, we can apply the rules in the problem to determine whether a given seat needs to change state. The needs_changing helper function does that. It’s overkill at this point to give options to specify the function for finding the vector of seats to check, and the maximum number of occupied seats people can tolerate around them, but (spolier alert) I put in these options when working on the challenge in Part 2.

needs_changing <- 
  function(mat, i,j, get_surround = get_adj, max_occupied = 4) {
  
  surround <- get_surround(mat, i,j)
  n_occupied <- sum(surround == "#")
  
  if ((mat[i,j] == "L") & (n_occupied == 0)) return(TRUE)
  
  else if ((mat[i,j] == "#") & (n_occupied >= max_occupied)) {
    return(TRUE)
  }
  
  else return(FALSE)
}

Since floor spaces don’t change, we only need to consider seats. We save the indices of the seats into a data frame, so we can vectorise over it using tidyverse functions. However, when we’ve determined the seats that need changing, using our needs_changing function, we need to convert those indices from a data.frame into a matrix, in order to index into the layout matrix appropriately and make the changes.

seats <- which(initial_layout != ".", arr.ind = TRUE)

seats_df <- as.data.frame(seats) %>%
  rename(i = row, 
         j = col)
layout <- initial_layout
iters <- 0

# loop until there are no further changes
repeat {
  
  change <- 0
  
  seats_to_change <- 
    seats_df %>%
    rowwise() %>%
    mutate(change_seat = needs_changing(layout,i,j))   
  
  change <- sum(seats_to_change$change_seat)
  
  if (change == 0) break
  
  indices_to_change <- 
    seats_to_change %>%
    filter(change_seat) %>%
    select(i,j) %>%
    as.matrix()  

  layout[indices_to_change] <- 
    setdiff(c("L", "#"),  layout[indices_to_change])
  
  iters <- iters + 1
}

part_1_iters <- iters
sum(layout== "#")
[1] 37

On the test set, this takes 5 iterations. On the full data set, my answer is 2316, and it took 107 iterations.

Part 2

Now, people look to the first seat they can see in each direction, and will change from occupied to unoccupied if five or more of them are occupied.

The plan is to write a function that extracts full vectors from a given seat to the edge of the layout matrix in each of the eight directions, then finds the first seat in each of those directions, and finally collects those into a vector of the seats under consideration when determining if a change is needed. Then I can reuse the loop from Part 1, just changing the arguments in the calls to needs_changing.

Here’s a helper function to get the first seat in a vector looking in one direction:

get_first_seat_from_vec <- function(vec) {
  
  if (any(vec %in% c("#", "L"))) {
    return(vec[min(which(vec != "."))])
  }
  
  return(NA)
}

Now, if I thought getting adjacent seats to a given seat in Part 1 was fiddly, it’s nothing on getting a vector from a given seat to the edge of the matrix. There are many cases to consider to make we we don’t go out of bounds. In the diagonal directions, first we get a matrix of the indices of the matrix we need, then subset into the matrix accordingly.

# takes a layout matrix (elements ".", "#", "L")
# returns vector with first "L" or "#" encountered in each direction
get_first_seat <- function(mat, i,j) {
  
  nr <- nrow(mat)
  nc <- ncol(mat)
  
  # North
  if (i == 1) N <- NA
  if (i > 1) N <- mat[(i-1):1,j]
  
  # South
  if (i == nr) S <- NA
  if (i < nr) S <- mat[(i+1):nr,j]
  
  # East
  if (j == nc) E <- NA
  if (j < nc) E <- mat[i, (j+1):nc]
  
  # West
  if (j == 1) W <- NA
  if (j > 1) W <- mat[i, (j-1):1]
  
  # how far in each direction to edge of matrix
  to_N <- i - 1
  to_S <- nr - i
  to_E <- nc - j
  to_W <- j - 1
  
  # North-West
  NW_length <- min(to_N, to_W)
  
  if (i == 1 | j == 1) NW <- NA
  else {
    mat_index <- 
      matrix(c((i-1):(i-NW_length), (j-1):(j-NW_length)), ncol = 2)
    NW <- mat[mat_index]
  }
  
  # North-East
  NE_length <- min(to_N, to_E)
  
  if (i == 1 | j == nc) NE <- NA
  else {
    mat_index <- 
      matrix(c((i-1):(i-NE_length), (j+1):(j+NE_length)), ncol = 2)
    NE <- mat[mat_index]
  }
  
  # South-East
  SE_length <- min(to_S, to_E)
  
  if (i == nr | j == nc) SE <- NA
  else {
    mat_index <- 
      matrix(c((i+1):(i+SE_length), (j+1):(j+SE_length)), ncol = 2)
    SE <- mat[mat_index]
  }
  
  # South-West
  SW_length <- min(to_S, to_W)
  
  if (i == nr | j == 1) SW <- NA
  else {
    mat_index <- 
      matrix(c((i+1):(i+SW_length), (j-1):(j-SW_length)), ncol = 2)
    SW <- mat[mat_index]
  }

  # vectors from mat[i,j] to the edge in each direction 
  all_vecs <- 
    (list(N = N, S = S, E = E, W = W, NW = NW, NE = NE, SE = SE, SW = SW))
  
  # the first seat in each direction, collapsed to a vector
  first_seats <- purrr::map_chr(all_vecs, get_first_seat_from_vec)
  
  # remove NAs from list and return
  # (these occur either when starting on an edge, 
  # or when there are no seats in a given direction)
  return(first_seats[!is.na(first_seats)])

}
layout <- initial_layout
iters <- 0

# loop until there are no further changes
repeat {
  
  change <- 0
  
  seats_to_change <- 
    seats_df %>%
    rowwise() %>%
    mutate(change_seat = needs_changing(layout,i,j, get_first_seat, 5))   
  
  change <- sum(seats_to_change$change_seat)
  
  if (change == 0) break
  
  indices_to_change <- 
    seats_to_change %>%
    filter(change_seat) %>%
    select(i,j) %>%
    as.matrix()  

  layout[indices_to_change] <- 
    setdiff(c("L", "#"),  layout[indices_to_change])
  
  iters <- iters + 1
}

part_2_iters <- iters
sum(layout== "#")
[1] 26

On the test set, this takes 6 iterations. On the full data set, my answer is 2128, and it took 87 iterations. Given this is fewer iterations than in Part 1, it must be my code for getting the first seat that’s slowing things down.

I am unsatisfied both by how many lines of code this has taken as well as the time taken to run. The introduction to Advent of Code says that each challenge has a solution that will complete in at most 15 seconds on ten year old hardware. So clearly there’s a better way of doing this. Perhaps something to revisit in the future.

Session info

Toggle
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16)
 os       macOS Sonoma 14.0
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/London
 date     2023-11-06
 pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
 quarto   1.4.466 @ /usr/local/bin/quarto

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 dplyr       * 1.1.2   2023-04-20 [1] CRAN (R 4.3.0)
 sessioninfo * 1.2.2   2021-12-06 [1] CRAN (R 4.3.0)
 stringr     * 1.5.0   2022-12-02 [1] CRAN (R 4.3.0)
 tidyr       * 1.3.0   2023-01-24 [1] CRAN (R 4.3.0)

 [1] /Users/ellakaye/Library/R/arm64/4.3/library
 [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library

──────────────────────────────────────────────────────────────────────────────