Find the furthest point in a loop from the start point. This is equivalent to finding the largest component in a graph, and half the number of its nodes.

First, my solution in full, then explanation below.

Toggle the code

library(tidyverse)library(tidygraph)library(adventdrob)# table with one row per grid element, with its row and col positions# and an identifier for the node it will represent in a graph.grid <- input |> adventdrob::grid_tidy(X1) |>mutate(node =row_number()) # find all nodes adjacent to each other in the gridgrid_adj <- adventdrob::adjacent_join(grid, grid)# group values according to where the pipe is openopen_S <-c("|", "F", "7") open_N <-c("|", "L", "J") open_W <-c("-", "J", "7") open_E <-c("-", "L", "F") # find the nodes that are connectedgrid_adj_connected <- grid_adj |>mutate(is_connected =case_when( value %in% open_S & col2 == col & row2 > row & value2 %in% open_N ~TRUE, value %in% open_N & col2 == col & row2 < row & value2 %in% open_S ~TRUE, value %in% open_W & col2 < col & row2 == row & value2 %in% open_E ~TRUE, value %in% open_E & col2 > col & row2 == row & value2 %in% open_W ~TRUE, .default =FALSE ))# get nodes, edges and convert into graphgrid_nodes <- grid |>select(node)grid_edges <- grid_adj_connected |>filter(is_connected) |>select(from = node, to = node2)pipe_graph <-tbl_graph(nodes = grid_nodes, edges = grid_edges) # not called grid_graph as that's the name of a function in adventdrob# the length of the longest path is the number of nodes in the largest componentlongest_path <- pipe_graph |>activate(nodes) |>mutate(group =group_components()) |>as_tibble() |>count(group) |>slice_max(n) |>pull(n)# add 1 to the longest path to account for "S" (and make it a loop)# furthest point is half-way round.(longest_path +1)/2

[1] 6831

I spent a lot of time on Day 3 going over David Robinson’s solution to that puzzle, using his adventdrob. That paid off today, as I was able to use the grid_tidy() and adjacent_join() functions to get a tidy table with a row for each pair of adjacent points. For example, in grid_adj, we can see all the points adjacent to the start point, "S":

Toggle the code

grid_adj |>filter(value =="S")

# A tibble: 4 × 8
row value col node row2 col2 value2 node2
<int> <chr> <int> <int> <dbl> <dbl> <chr> <int>
1 17 S 38 2278 16 38 J 2138
2 17 S 38 2278 17 37 F 2277
3 17 S 38 2278 17 39 - 2279
4 17 S 38 2278 18 38 F 2418

We see that "S" is in row 17, column 38, and we can also see the values (value2) to its north, south, east and west. On the basis of this, "S" could be any of "J", "-" or "L". In example 1 we’re shown that that "S" is "F", and in example 2, "S" also has to be "F". It really bothered me as I first worked through this that there was more than one possibility in the full input, and that I had to find these by inspection. When I first ran my code, I replaced each of these in turn for "S" and got the same answer for each, which was the correct solution. It wasn’t clear to me why all of the three possibilities would close the loop. It took me a while to see why it didn’t matter what "S" is, and it only took a tiny tweak to my previous final line above to get the final line about to account for that.

Going back a bit to understand why, it was obvious to me when I first read the puzzle that we could represent the grid as a graph and that the one large, continuous loop we are looking for is equivalent to finding the largest connected component of that graph. Given the set-up of the problem, if we take the largest loop and remove the start point, we’ll still have the longest path, which will still be the largest connected component of the graph. So, instead of changing the "S" into each of its three possible values, running the rest of the code and dividing the number of nodes in the largest component by 2 (as I originally did), we can instead ignore the "S", still find the number of nodes in the largest component, add 1 to it to account for "S", then divide that by 2. Then we also get rid of the bother of figuring out what "S" could be.

So, that’s what we’re doing towards the end of the solution, from the creation of longest_path onwards. But before we can get that, we need to back up a bit and figure out how to get from grid_adj to a graph. We’re using the tidygraph package for that, which can create a graph from a pair of data frames, one containing information about the nodes, and one containing information about the edges.

The graph_nodes data frame is easy – it’s just the node identifies we gave each value at the start.

To find the edges, we need to look at each adjacent pair and figure out if they contain values that connect. To do that, we first group the pipe according to where they are open to connect, for example, the pipes "|", "F" and "7" are all open to the south, so they get grouped into open_S, and we make equivalent groups for the other three directions. Then, for any value that’s a pipe in open_S, if it has directly to its south a pipe that is open to the north, those two pipes are connected. We can figure similarly for the other groups of pipes. This is what we’re doing in grid_adj_connected. The graph_edges table is then just that, filtered on the pairs of nodes that are connected.

Part 2

The crux of the puzzle

Count the number of points inside the loop.

Not attempted (yet).

When I first read the Part 2 description, it looked hideous and I had no idea to proceed, and I wasn’t prepared to try and grind it out. I did assume that there must be some smart algorithm or formula that would be applicable, though. Now that it’s later in the day and others have had a go and shared hints, I’ve learnt about the existence of the Shoelace Formula and I may come back at some point and try to apply that here.

---title: "2023: Day 10"date: 2023-12-10author: - name: Ella Kayecategories: [grids, graphs, tidyverse, tidygraph, adventdrob, ⭐]draft: false---## Setup[The original challenge](https://adventofcode.com/2023/day/10)[My data](input){target="_blank"}## Part 1```{r}#| echo: falseOK <-"2023"<3000# Will only evaluate next code block if an actual year has been substituted for the placeholder.``````{r}#| eval: !expr OKlibrary(aochelpers)# other options: aoc_input_data_frame(), aoc_input_matrix()input <-aoc_input_data_frame(10, 2023)head(input)```::: {.callout-note collapse="false" icon="false"}## The crux of the puzzleFind the furthest point in a loop from the start point.This is equivalent to finding the largest component in a graph,and half the number of its nodes.:::First, my solution in full, then explanation below.```{r}library(tidyverse)library(tidygraph)library(adventdrob)# table with one row per grid element, with its row and col positions# and an identifier for the node it will represent in a graph.grid <- input |> adventdrob::grid_tidy(X1) |>mutate(node =row_number()) # find all nodes adjacent to each other in the gridgrid_adj <- adventdrob::adjacent_join(grid, grid)# group values according to where the pipe is openopen_S <-c("|", "F", "7") open_N <-c("|", "L", "J") open_W <-c("-", "J", "7") open_E <-c("-", "L", "F") # find the nodes that are connectedgrid_adj_connected <- grid_adj |>mutate(is_connected =case_when( value %in% open_S & col2 == col & row2 > row & value2 %in% open_N ~TRUE, value %in% open_N & col2 == col & row2 < row & value2 %in% open_S ~TRUE, value %in% open_W & col2 < col & row2 == row & value2 %in% open_E ~TRUE, value %in% open_E & col2 > col & row2 == row & value2 %in% open_W ~TRUE, .default =FALSE ))# get nodes, edges and convert into graphgrid_nodes <- grid |>select(node)grid_edges <- grid_adj_connected |>filter(is_connected) |>select(from = node, to = node2)pipe_graph <-tbl_graph(nodes = grid_nodes, edges = grid_edges) # not called grid_graph as that's the name of a function in adventdrob# the length of the longest path is the number of nodes in the largest componentlongest_path <- pipe_graph |>activate(nodes) |>mutate(group =group_components()) |>as_tibble() |>count(group) |>slice_max(n) |>pull(n)# add 1 to the longest path to account for "S" (and make it a loop)# furthest point is half-way round.(longest_path +1)/2```I spent a lot of time on [Day 3](../3/index.qmd) going over David Robinson's solution to that puzzle, using his [**adventdrob**](https://github.com/dgrtwo/adventdrob).That paid off today, as I was able to use the `grid_tidy()` and `adjacent_join()` functionsto get a tidy table with a row for each pair of adjacent points. For example, in `grid_adj`, we can see all the points adjacent to the start point, `"S"`:```{r}grid_adj |>filter(value =="S")```We see that `"S"` is in row 17, column 38, and we can also see the values (`value2`) to its north, south, east and west. On the basis of this, `"S"` could be any of `"J"`, `"-"` or `"L"`. In example 1 we're shown that that `"S"` is `"F"`, and in example 2, `"S"` also has to be `"F"`.It really bothered me as I first worked through this that there was more than one possibility in the full input, and that I had to find these by inspection.When I first ran my code, I replaced each of these in turn for `"S"` and got the same answer for each, which was the correct solution. It wasn't clear to me why all of the three possibilities would close the loop.It took me a while to see why it didn't matter what `"S"` is, and it only took a tiny tweak to my previous final line above to get the final line about to account for that. Going back a bit to understand why, it was obvious to me when I first read the puzzle that we could represent the grid as a graph and that the one large, continuous loop we are looking for is equivalent to finding the largest connected component of that graph.Given the set-up of the problem, if we take the largest loop and remove the start point, we'll still have the longest path, which will still be the largest connected component of the graph. So, instead of changing the `"S"` into each of its three possible values, running the rest of the code and dividing the number of nodes in the largest component by 2 (as I originally did),we can instead ignore the `"S"`, still find the number of nodes in the largest component, add 1 to it to account for `"S"`, then divide that by 2. Then we also get rid of the bother of figuring out what `"S"` could be.So, that's what we're doing towards the end of the solution, from the creation of `longest_path` onwards.But before we can get that, we need to back up a bit and figure out how to get from `grid_adj` to a graph. We're using the [**tidygraph**](https://tidygraph.data-imaginist.com) package for that, which can create a graph from a pair of data frames, one containing information about the nodes, and one containing information about the edges.The `graph_nodes` data frame is easy -- it's just the node identifies we gave each value at the start.To find the edges, we need to look at each adjacent pair and figure out if they contain values that connect. To do that, we first group the pipe according to where they are open to connect, for example, the pipes `"|"`, `"F"` and `"7"` are all open to the south, so they get grouped into `open_S`, and we make equivalent groups for the other three directions. Then, for any value that's a pipe in `open_S`, if it has directly to its south a pipe that is open to the north, those two pipes are connected. We can figure similarly for the other groups of pipes.This is what we're doing in `grid_adj_connected`. The `graph_edges` table is then just that, filtered on the pairs of nodes that are connected.## Part 2::: {.callout-note collapse="false" icon="false"}## The crux of the puzzleCount the number of points inside the loop.:::Not attempted (yet). When I first read the Part 2 description, it looked hideous and I had no idea to proceed,and I wasn't prepared to try and grind it out.I did assume that there must be some smart algorithm or formula that would be applicable, though.Now that it's later in the day and others have had a go and shared hints, I've learnt about the existence of the [Shoelace Formula](https://en.wikipedia.org/wiki/Shoelace_formula) and I may come back at some point and try to apply that here.##### Session info {.appendix}<details><summary>Toggle</summary>```{r}#| echo: falselibrary(sessioninfo)# save the session info as an objectpkg_session <-session_info(pkgs ="attached")# get the quarto versionquarto_version <-system("quarto --version", intern =TRUE)# inject the quarto infopkg_session$platform$quarto <-paste(system("quarto --version", intern =TRUE), "@", quarto::quarto_path() )# print it outpkg_session```</details>