base R
regex
loops
⭐⭐
Author

Ella Kaye

Published

December 8, 2023

Setup

The original challenge

My data

Part 1

Toggle the code
library(aochelpers)
input <- aoc_input_vector(8, 2023)
head(input)
[1] "LRRLRRRLRLRLLRRLRLRLLRLRLRLLLLRRRLLRRRLRRRLRRRLLRLLRLRRLRLRLRRRLLLLRRLRLRRLRRLLRRRLRRLRLRRLRRLRRLRRLRLLRRLRRLLLLRLRLRRLLRRLLRRLRLLRLRRLRRLRRLRRRLRRLLLRRLRRRLRLRRRLLRLRRLRRRLRRLLRRRLRRLRLLRRLLRRLRRLRRRLRRLLRRLRRRLRLRLRLRLRLRRLRRLLRRRLRLRRLRRRLRLRLRLRLRLRRRLRRLRRRLLRRLRLLRRRLRRLRLLLLRRRLRRLRRRR"
[2] ""                                                                                                                                                                                                                                                                                                     
[3] "DRM = (DLQ, BGR)"                                                                                                                                                                                                                                                                                     
[4] "PKD = (TNC, DKH)"                                                                                                                                                                                                                                                                                     
[5] "FSM = (LKS, KPG)"                                                                                                                                                                                                                                                                                     
[6] "NDS = (KGD, HNX)"                                                                                                                                                                                                                                                                                     
The crux of the puzzle

Navigate a network following left/right instructions and count steps until we reach the destination.

Different approach to the write up today. For each part, I’ll put my full solution first then break it down afterwards.

Toggle the code
# wrangle data from input
instructions <- input[1] |> strsplit("") |> unlist()
n_instructions <- length(instructions)
nodes <- tail(input, -2)

# wrangle nodes into a matrix
matches <- gregexpr(pattern = "\\w{3}", text = nodes)
list_rows <- regmatches(nodes, matches)
network <- do.call(rbind, list_rows)
colnames(network) <- c("node", "L", "R")
rownames(network) <- network[,1]

# iterate over the network until we reach the destination
loc <- "AAA"
steps <- 0

while (loc != "ZZZ") {
  ii <- (steps %% n_instructions) + 1 ## instruction index
  i <- instructions[ii] ## instruction value, "L" or "R"
  loc <- network[loc, i]
  steps <- steps + 1
}
steps
[1] 19631

When I saw the word ‘network’ in the puzzle, at first I thought this would be some kind of graph theory problem, but it’s not. It’s got much more the flavour of Day 8, 2020.

I decided to go from a pure base R solution today, including dealing with regex.

The wrangling of the elements of nodes into a suitable data structure turned out to be the most challenging part of today for me. We want it in a format with three columns, one for each group of three letters, and n_instructions rows.

My first challenge was working in base R with regex to get the three groups of three letters out of each string. The regex for three word characters is "\w{3}".1 To understand what gregexpr() does, I think it’s easiest to look at an example:

Toggle the code
# matches <- gregexpr(pattern = "\\w{3}", text = nodes)
# nodes[1] is "DRM = (DLQ, BGR)"
matches[1]
[[1]]
[1]  1  8 13
attr(,"match.length")
[1] 3 3 3
attr(,"index.type")
[1] "chars"
attr(,"useBytes")
[1] TRUE

This output is showing us that gregexpr(pattern = "\\w{3}", text = nodes) has found three instances of a match to "\\w{3}", starting at characters 1, 8 and 13 of the input string, and that each of those instances is matching three characters. Since all elements of nodes have the same regex pattern, all elements of matches are the same.

We use regmatches in conjunction with the output of gregexprto extract the matches from the elements of nodes, returning a list:

Toggle the code
# list_rows <- regmatches(nodes, matches)
list_rows |> head(3)
[[1]]
[1] "DRM" "DLQ" "BGR"

[[2]]
[1] "PKD" "TNC" "DKH"

[[3]]
[1] "FSM" "LKS" "KPG"

Now we have a list where each element is a vector of length three, The first, second and third elements of which should go into the first second and third columns respectively of our data structure.

One way to do this is to row-bind the elements of the list together. This is a job for do.call(), which constructs and executes a function call from a name or a function and a list of arguments to be passed to it. So, for example, do.call(rbind, list(1:2, 3:4, 5:6)) is equivalent to rbind(1:2, 3:4, 5:6). For our input, this gives the desired matrix. We also set descriptive column names and set the rownames as the names of the nodes, which makes it easy to extract values with [].

Toggle the code
# network <- do.call(rbind, list_rows)
# colnames(network) <- c("node", "L", "R")
# rownames(network) <- network[,1]
head(network)
    node  L     R    
DRM "DRM" "DLQ" "BGR"
PKD "PKD" "TNC" "DKH"
FSM "FSM" "LKS" "KPG"
NDS "NDS" "KGD" "HNX"
KQQ "KQQ" "DPF" "GKD"
SBX "SBX" "DDL" "MGH"

After initiating the loop, the last detail of interest is the use of modular arithmetic with %% to repeat the series of instructions as necessary.

Part 2

The crux of the puzzle

As above, but with several possible starts and finishes. Find when all paths land on a destination node simultaneously.

Toggle the code
all_nodes <- network[,1]
1start_nodes <- all_nodes[grep("..A", all_nodes)]
end_nodes <- all_nodes[grep("..Z", all_nodes)]

2steps_from_start <- function(start_loc) {
  
  loc <- start_loc
  steps <- 0
  
  # needs just to end in Z
  while (!(loc %in% end_nodes)) {
    ii <- (steps %% n_instructions) + 1 ## instruction index
    i <- instructions[ii] ## instruction value, "L" or "R"
    loc <- network[loc, i]
    steps <- steps + 1
  }
  steps
}

all_paths <- sapply(start_nodes, steps_from_start)

# function for greatest common denominator
# applies Euclid's algorithm
gcd <- function(x, y) {
  while (y != 0) {
    t <- y
    y <- x %% y
    x <- t
  }
  x
}

# function for lowest common multiple
lcm <- function(x, y) {
  x * y / gcd(x, y)
}

# paths all reach a destination simultaneously at the lcm of their steps
3options(digits = 14)
Reduce(lcm, all_paths)
1
grep(pattern, x) returns the indices of matches to pattern in x
2
It’s not best practice to have a function look up objects in the environment, e.g. to rely on previously defined locations and network rather than pass them as arguments to the function, but I’m never going to use this function again, so I can get away with it.
3
The answer is large, so need to increase the number of digits printed to display it all in a format that can be pasted into the Advent of Code answer submission box. 4, lcm() only takes two arguments. Using Reduce() successively applies it to each element is all_paths.
[1] 21003205388413

It’s not obviously from the text for Part 2 that the lowest common multiple of the part lengths is what we need. It’s not necessarily the case from the description that this would in fact be the solution. The example suggests it though: one path takes 2 steps to reach a node, the other takes 3 steps, so they both reach an end node together after 6 steps. Both example paths also loop: the first through 11B and 11Z, the second through 22B, 22C and 22Z.

Since I had committed myself to solving this puzzle purely in base R, I wrote my own functions for lcm and gcd.2. I found the formula for the most efficent caluclation of the lowest common multiple on Stack Overflow, which in turn took me to Wikipedia for details on how to implement the Euclidean algorithm to find the greatest common divisor.

Session info

Toggle
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31)
 os       macOS Sonoma 14.1
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/London
 date     2023-12-28
 pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
 quarto   1.4.526 @ /usr/local/bin/quarto

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version    date (UTC) lib source
 aochelpers  * 0.1.0.9000 2023-12-24 [1] local
 sessioninfo * 1.2.2      2021-12-06 [1] CRAN (R 4.3.0)

 [1] /Users/ellakaye/Library/R/arm64/4.3/library
 [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library

──────────────────────────────────────────────────────────────────────────────

Footnotes

  1. "\w" is a word character, i.e. any letter, digit or an underscore. The "{3}" indicates we want exactly three of these. When using this in R, we need to add an extra escape, "\\w{3}".↩︎

  2. There are already versions of these in the pracma package↩︎