Navigate a network following left/right instructions and count steps until we reach the destination.

Different approach to the write up today. For each part, I’ll put my full solution first then break it down afterwards.

Toggle the code

# wrangle data from inputinstructions <- input[1] |>strsplit("") |>unlist()n_instructions <-length(instructions)nodes <-tail(input, -2)# wrangle nodes into a matrixmatches <-gregexpr(pattern ="\\w{3}", text = nodes)list_rows <-regmatches(nodes, matches)network <-do.call(rbind, list_rows)colnames(network) <-c("node", "L", "R")rownames(network) <- network[,1]# iterate over the network until we reach the destinationloc <-"AAA"steps <-0while (loc !="ZZZ") { ii <- (steps %% n_instructions) +1## instruction index i <- instructions[ii] ## instruction value, "L" or "R" loc <- network[loc, i] steps <- steps +1}steps

[1] 19631

When I saw the word ‘network’ in the puzzle, at first I thought this would be some kind of graph theory problem, but it’s not. It’s got much more the flavour of Day 8, 2020.

I decided to go from a pure base R solution today, including dealing with regex.

The wrangling of the elements of nodes into a suitable data structure turned out to be the most challenging part of today for me. We want it in a format with three columns, one for each group of three letters, and n_instructions rows.

My first challenge was working in base R with regex to get the three groups of three letters out of each string. The regex for three word characters is "\w{3}".^{1} To understand what gregexpr() does, I think it’s easiest to look at an example:

Toggle the code

# matches <- gregexpr(pattern = "\\w{3}", text = nodes)# nodes[1] is "DRM = (DLQ, BGR)"matches[1]

This output is showing us that gregexpr(pattern = "\\w{3}", text = nodes) has found three instances of a match to "\\w{3}", starting at characters 1, 8 and 13 of the input string, and that each of those instances is matching three characters. Since all elements of nodes have the same regex pattern, all elements of matches are the same.

We use regmatches in conjunction with the output of gregexprto extract the matches from the elements of nodes, returning a list:

Now we have a list where each element is a vector of length three, The first, second and third elements of which should go into the first second and third columns respectively of our data structure.

One way to do this is to row-bind the elements of the list together. This is a job for do.call(), which constructs and executes a function call from a name or a function and a list of arguments to be passed to it. So, for example, do.call(rbind, list(1:2, 3:4, 5:6)) is equivalent to rbind(1:2, 3:4, 5:6). For our input, this gives the desired matrix. We also set descriptive column names and set the rownames as the names of the nodes, which makes it easy to extract values with [].

After initiating the loop, the last detail of interest is the use of modular arithmetic with %% to repeat the series of instructions as necessary.

Part 2

The crux of the puzzle

As above, but with several possible starts and finishes. Find when all paths land on a destination node simultaneously.

Toggle the code

all_nodes <- network[,1]1start_nodes <- all_nodes[grep("..A", all_nodes)]end_nodes <- all_nodes[grep("..Z", all_nodes)]2steps_from_start <-function(start_loc) { loc <- start_loc steps <-0# needs just to end in Zwhile (!(loc %in% end_nodes)) { ii <- (steps %% n_instructions) +1## instruction index i <- instructions[ii] ## instruction value, "L" or "R" loc <- network[loc, i] steps <- steps +1 } steps}all_paths <-sapply(start_nodes, steps_from_start)# function for greatest common denominator# applies Euclid's algorithmgcd <-function(x, y) {while (y !=0) { t <- y y <- x %% y x <- t } x}# function for lowest common multiplelcm <-function(x, y) { x * y /gcd(x, y)}# paths all reach a destination simultaneously at the lcm of their steps3options(digits =14)Reduce(lcm, all_paths)

1

grep(pattern, x) returns the indices of matches to pattern in x

2

It’s not best practice to have a function look up objects in the environment, e.g. to rely on previously defined locations and network rather than pass them as arguments to the function, but I’m never going to use this function again, so I can get away with it.

3

The answer is large, so need to increase the number of digits printed to display it all in a format that can be pasted into the Advent of Code answer submission box. 4, lcm() only takes two arguments. Using Reduce() successively applies it to each element is all_paths.

[1] 21003205388413

It’s not obviously from the text for Part 2 that the lowest common multiple of the part lengths is what we need. It’s not necessarily the case from the description that this would in fact be the solution. The example suggests it though: one path takes 2 steps to reach a node, the other takes 3 steps, so they both reach an end node together after 6 steps. Both example paths also loop: the first through 11B and 11Z, the second through 22B, 22C and 22Z.

Since I had committed myself to solving this puzzle purely in base R, I wrote my own functions for lcm and gcd.^{2}. I found the formula for the most efficent caluclation of the lowest common multiple on Stack Overflow, which in turn took me to Wikipedia for details on how to implement the Euclidean algorithm to find the greatest common divisor.

Session info

Toggle

─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 4.3.2 (2023-10-31)
os macOS Sonoma 14.1
system aarch64, darwin20
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/London
date 2023-12-28
pandoc 3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
quarto 1.4.526 @ /usr/local/bin/quarto
─ Packages ───────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
aochelpers * 0.1.0.9000 2023-12-24 [1] local
sessioninfo * 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
[1] /Users/ellakaye/Library/R/arm64/4.3/library
[2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
──────────────────────────────────────────────────────────────────────────────

Footnotes

"\w" is a word character, i.e. any letter, digit or an underscore. The "{3}" indicates we want exactly three of these. When using this in R, we need to add an extra escape, "\\w{3}".↩︎

There are already versions of these in the pracma package↩︎

Source Code

---title: "2023: Day 8"date: 2023-12-8author: - name: Ella Kayecategories: [base R, regex, loops, ⭐⭐]draft: false---## Setup[The original challenge](https://adventofcode.com/2023/day/8)[My data](input){target="_blank"}## Part 1```{r}#| echo: falseOK <-"2023"<3000# Will only evaluate next code block if an actual year has been substituted for the placeholder.``````{r}#| eval: !expr OKlibrary(aochelpers)input <-aoc_input_vector(8, 2023)head(input)```::: {.callout-note collapse="false" icon="false"}## The crux of the puzzleNavigate a network following left/right instructions and count steps until we reach the destination.:::Different approach to the write up today. For each part, I'll put my full solution first then break it down afterwards.```{r}# wrangle data from inputinstructions <- input[1] |>strsplit("") |>unlist()n_instructions <-length(instructions)nodes <-tail(input, -2)# wrangle nodes into a matrixmatches <-gregexpr(pattern ="\\w{3}", text = nodes)list_rows <-regmatches(nodes, matches)network <-do.call(rbind, list_rows)colnames(network) <-c("node", "L", "R")rownames(network) <- network[,1]# iterate over the network until we reach the destinationloc <-"AAA"steps <-0while (loc !="ZZZ") { ii <- (steps %% n_instructions) +1## instruction index i <- instructions[ii] ## instruction value, "L" or "R" loc <- network[loc, i] steps <- steps +1}steps```When I saw the word 'network' in the puzzle, at first I thought this would be some kind of graph theory problem, but it's not. It's got much more the flavour of [Day 8, 2020](../../../2020/day/8/index.qmd){target="_blank"}.I decided to go from a pure base R solution today, including dealing with regex.The wrangling of the elements of `nodes` into a suitable data structure turned out to be the most challenging part of today for me.We want it in a format with three columns, one for each group of three letters, and `n_instructions` rows.My first challenge was working in base R with regex to get the three groups of three letters out of each string.The regex for three word characters is `"\w{3}"`.^[`"\w"` is a word character, i.e. any letter, digit or an underscore. The `"{3}"` indicates we want exactly three of these. When using this in R, we need to add an extra escape, `"\\w{3}"`.] To understand what `gregexpr()` does, I think it's easiest to look at an example:```{r}# matches <- gregexpr(pattern = "\\w{3}", text = nodes)# nodes[1] is "DRM = (DLQ, BGR)"matches[1]```This output is showing us that `gregexpr(pattern = "\\w{3}", text = nodes)` has found three instances of a match to `"\\w{3}"`, starting at characters 1, 8 and 13 of the input string,and that each of those instances is matching three characters.Since all elements of `nodes` have the same regex pattern, all elements of `matches` are the same. We use `regmatches` in conjunction with the output of `gregexpr`to extract the matches from the elements of `nodes`, returning a list:```{r}# list_rows <- regmatches(nodes, matches)list_rows |>head(3)```Now we have a list where each element is a vector of length three,The first, second and third elements of which should go into the first second and third columns respectively of our data structure.One way to do this is to row-bind the elements of the list together.This is a job for `do.call()`, which constructs and executes a function call from a name or a function and a list of arguments to be passed to it. So, for example, `do.call(rbind, list(1:2, 3:4, 5:6))` is equivalent to `rbind(1:2, 3:4, 5:6)`. For our input, this gives the desired matrix. We also set descriptive column names and set the rownames as the names of the nodes, which makes it easy to extract values with `[]`.```{r}# network <- do.call(rbind, list_rows)# colnames(network) <- c("node", "L", "R")# rownames(network) <- network[,1]head(network)```After initiating the loop, the last detail of interest is the use of modular arithmetic with `%%` to repeat the series of instructions as necessary.## Part 2::: {.callout-note collapse="false" icon="false"}## The crux of the puzzleAs above, but with several possible starts and finishes. Find when all paths land on a destination node simultaneously.:::```{r}all_nodes <- network[,1]start_nodes <- all_nodes[grep("..A", all_nodes)] # <1>end_nodes <- all_nodes[grep("..Z", all_nodes)] # <1>steps_from_start <-function(start_loc) { # <2> loc <- start_loc steps <-0# needs just to end in Zwhile (!(loc %in% end_nodes)) { ii <- (steps %% n_instructions) +1## instruction index i <- instructions[ii] ## instruction value, "L" or "R" loc <- network[loc, i] steps <- steps +1 } steps}all_paths <-sapply(start_nodes, steps_from_start)# function for greatest common denominator# applies Euclid's algorithmgcd <-function(x, y) {while (y !=0) { t <- y y <- x %% y x <- t } x}# function for lowest common multiplelcm <-function(x, y) { x * y /gcd(x, y)}# paths all reach a destination simultaneously at the lcm of their stepsoptions(digits =14) # <3>Reduce(lcm, all_paths) # <4>```1. `grep(pattern, x)` returns the indices of matches to `pattern` in `x`2. It's not best practice to have a function look up objects in the environment, e.g. to rely on previously defined `locations` and `network` rather than pass them as arguments to the function, but I'm never going to use this function again, so I can get away with it.3. The answer is large, so need to increase the number of digits printed to display it all in a format that can be pasted into the Advent of Code answer submission box.4, `lcm()` only takes two arguments. Using `Reduce()` successively applies it to each element is `all_paths`.It's not obviously from the text for Part 2 that the lowest common multiple of the part lengths is what we need. It's not necessarily the case from the description that this would in fact be the solution.The example suggests it though: one path takes 2 steps to reach a node, the other takes 3 steps, so they both reach an end node together after 6 steps. Both example paths also loop: the first through `11B` and `11Z`, the second through `22B`, `22C` and `22Z`. Since I had committed myself to solving this puzzle purely in base R,I wrote my own functions for `lcm` and `gcd`.^[There are already versions of these in the **pracma** package]. I found the formula for the most efficent caluclation of the lowest common multiple on [Stack Overflow](https://stackoverflow.com/questions/3154454/what-is-the-most-efficient-way-to-calculate-the-least-common-multiple-of-two-int), which in turn took me to [Wikipedia](https://en.wikipedia.org/wiki/Euclidean_algorithm) for details on how to implement the Euclidean algorithm to find the greatest common divisor.##### Session info {.appendix}<details><summary>Toggle</summary>```{r}#| echo: falselibrary(sessioninfo)# save the session info as an objectpkg_session <-session_info(pkgs ="attached")# get the quarto versionquarto_version <-system("quarto --version", intern =TRUE)# inject the quarto infopkg_session$platform$quarto <-paste(system("quarto --version", intern =TRUE), "@", quarto::quarto_path() )# print it outpkg_session```</details>