Ella Kaye


December 10, 2023


The original challenge

My data

Part 1

Toggle the code
# other options: aoc_input_data_frame(), aoc_input_matrix()
input <- aoc_input_data_frame(10, 2023)
# A tibble: 6 × 1
1 .-FL-L777F-F.7-LJF-7.F|7F7.FF---7-F-L|-F-JJ-LL.J-F7.L-LJFF-LFFJ7.FF-F|FFFJ77-…
2 --|J.F-7-|7F|7-|FF-J-7L|JL--.LF|.7J||F77|7.FL-7-77|--L|.F7FLJ--F--7.LL-7|LFL7…
3 |.F-7-L..J-JJL7.||..L|--JF7...-FLJ7|L|J|777.|--J|LJF7--LLJ77.|7L77|F||7|-|L|J…
4 7J.-J7.FFLF--.-FJ7|--J7--F|77L-|J7|..F.L7-7.J.|J|.F-F7JL-7.F7FLL|-|-JL-JFFF7J…
5 JJ.FL7|77.L7|-.FJ-|-F.|7J|.FJLFJ7-|.7LJF|.7JJ.L.LFJFL|LLFJ7|.L7||FJ7|7|7L7J|.…
6 L7-|JFF77J--|.L|--JFF7LJF--7|LL-|.LF|LF-7.LL77|FLJ-JFF7.-|JF7JL7-7.LJF|7J.L.F…
The crux of the puzzle

Find the furthest point in a loop from the start point. This is equivalent to finding the largest component in a graph, and half the number of its nodes.

First, my solution in full, then explanation below.

Toggle the code

# table with one row per grid element, with its row and col positions
# and an identifier for the node it will represent in a graph.
grid <- input |> 
  adventdrob::grid_tidy(X1) |> 
  mutate(node = row_number()) 

# find all nodes adjacent to each other in the grid
grid_adj <- adventdrob::adjacent_join(grid, grid)

# group values according to where the pipe is open
open_S <- c("|", "F", "7") 
open_N <- c("|", "L", "J") 
open_W <- c("-", "J", "7") 
open_E <- c("-", "L", "F") 

# find the nodes that are connected
grid_adj_connected <- grid_adj |> 
  mutate(is_connected = case_when(
    value %in% open_S & col2 == col & row2 > row & value2 %in% open_N ~ TRUE, 
    value %in% open_N & col2 == col & row2 < row & value2 %in% open_S  ~ TRUE, 
    value %in% open_W & col2 < col & row2 == row & value2 %in% open_E ~ TRUE, 
    value %in% open_E & col2 > col & row2 == row & value2 %in% open_W  ~ TRUE, 
    .default = FALSE

# get nodes, edges and convert into graph
grid_nodes <- grid |> 

grid_edges <- grid_adj_connected |> 
  filter(is_connected) |> 
  select(from = node, to = node2)

pipe_graph <- tbl_graph(nodes = grid_nodes, edges = grid_edges) 
# not called grid_graph as that's the name of a function in adventdrob

# the length of the longest path is the number of nodes in the largest component
longest_path <- pipe_graph |> 
  activate(nodes) |> 
  mutate(group = group_components()) |> 
  as_tibble() |> 
  count(group) |> 
  slice_max(n) |> 

# add 1 to the longest path to account for "S" (and make it a loop)
# furthest point is half-way round.
(longest_path + 1)/2
[1] 6831

I spent a lot of time on Day 3 going over David Robinson’s solution to that puzzle, using his adventdrob. That paid off today, as I was able to use the grid_tidy() and adjacent_join() functions to get a tidy table with a row for each pair of adjacent points. For example, in grid_adj, we can see all the points adjacent to the start point, "S":

Toggle the code
grid_adj |> 
  filter(value == "S")
# A tibble: 4 × 8
    row value   col  node  row2  col2 value2 node2
  <int> <chr> <int> <int> <dbl> <dbl> <chr>  <int>
1    17 S        38  2278    16    38 J       2138
2    17 S        38  2278    17    37 F       2277
3    17 S        38  2278    17    39 -       2279
4    17 S        38  2278    18    38 F       2418

We see that "S" is in row 17, column 38, and we can also see the values (value2) to its north, south, east and west. On the basis of this, "S" could be any of "J", "-" or "L". In example 1 we’re shown that that "S" is "F", and in example 2, "S" also has to be "F". It really bothered me as I first worked through this that there was more than one possibility in the full input, and that I had to find these by inspection. When I first ran my code, I replaced each of these in turn for "S" and got the same answer for each, which was the correct solution. It wasn’t clear to me why all of the three possibilities would close the loop. It took me a while to see why it didn’t matter what "S" is, and it only took a tiny tweak to my previous final line above to get the final line about to account for that.

Going back a bit to understand why, it was obvious to me when I first read the puzzle that we could represent the grid as a graph and that the one large, continuous loop we are looking for is equivalent to finding the largest connected component of that graph. Given the set-up of the problem, if we take the largest loop and remove the start point, we’ll still have the longest path, which will still be the largest connected component of the graph. So, instead of changing the "S" into each of its three possible values, running the rest of the code and dividing the number of nodes in the largest component by 2 (as I originally did), we can instead ignore the "S", still find the number of nodes in the largest component, add 1 to it to account for "S", then divide that by 2. Then we also get rid of the bother of figuring out what "S" could be.

So, that’s what we’re doing towards the end of the solution, from the creation of longest_path onwards. But before we can get that, we need to back up a bit and figure out how to get from grid_adj to a graph. We’re using the tidygraph package for that, which can create a graph from a pair of data frames, one containing information about the nodes, and one containing information about the edges.

The graph_nodes data frame is easy – it’s just the node identifies we gave each value at the start.

To find the edges, we need to look at each adjacent pair and figure out if they contain values that connect. To do that, we first group the pipe according to where they are open to connect, for example, the pipes "|", "F" and "7" are all open to the south, so they get grouped into open_S, and we make equivalent groups for the other three directions. Then, for any value that’s a pipe in open_S, if it has directly to its south a pipe that is open to the north, those two pipes are connected. We can figure similarly for the other groups of pipes. This is what we’re doing in grid_adj_connected. The graph_edges table is then just that, filtered on the pairs of nodes that are connected.

Part 2

The crux of the puzzle

Count the number of points inside the loop.

Not attempted (yet).

When I first read the Part 2 description, it looked hideous and I had no idea to proceed, and I wasn’t prepared to try and grind it out. I did assume that there must be some smart algorithm or formula that would be applicable, though. Now that it’s later in the day and others have had a go and shared hints, I’ve learnt about the existence of the Shoelace Formula and I may come back at some point and try to apply that here.

Session info

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31)
 os       macOS Sonoma 14.1
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/London
 date     2023-12-10
 pandoc   3.1.1 @ /Applications/ (via rmarkdown)
 quarto   1.4.523 @ /usr/local/bin/quarto

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version    date (UTC) lib source
 adventdrob  * 0.0.1      2023-11-19 [1] Github (dgrtwo/adventdrob@491050f)
 aochelpers  * 2023-12-06 [1] local
 dplyr       * 1.1.4      2023-11-17 [1] CRAN (R 4.3.1)
 forcats     * 1.0.0      2023-01-29 [1] CRAN (R 4.3.0)
 ggplot2     * 3.4.4      2023-10-12 [1] CRAN (R 4.3.1)
 lubridate   * 1.9.3      2023-09-27 [1] CRAN (R 4.3.1)
 purrr       * 1.0.2      2023-08-10 [1] CRAN (R 4.3.0)
 readr       * 2.1.4      2023-02-10 [1] CRAN (R 4.3.0)
 sessioninfo * 1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
 stringr     * 1.5.1      2023-11-14 [1] CRAN (R 4.3.1)
 tibble      * 3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
 tidygraph   * 1.2.3      2023-02-01 [1] CRAN (R 4.3.0)
 tidyr       * 1.3.0      2023-01-24 [1] CRAN (R 4.3.0)
 tidyverse   * 2.0.0      2023-02-22 [1] CRAN (R 4.3.0)

 [1] /Users/ellakaye/Library/R/arm64/4.3/library
 [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library