2023: Day 1

base R

strings

regex

Copilot

ChatGPT

⭐⭐

Author

Ella Kaye

Published

December 1, 2023

Setup

The original challenge

My data

Part 1

I solved this puzzle in both R and, stretching myself, C.

R

library(aochelpers)
input <- aoc_input_vector(1, 2023)
head(input)

[1] "rhqrpdxsqhgxzknr2foursnrcfthree"                
[2] "2bmckl"                                         
[3] "four95qvkvveight5"                              
[4] "2tqbxgrrpmxqfglsqjkqthree6nhjvbxpflhr1eightwohr"
[5] "7two68"                                         
[6] "nine7twoslseven4sfoursix"

The crux of the puzzle

Combine the first and last digits that appear in each element of input to a single two-digit number, then sum them for all elements.

I found this straightforward, similar to Day 4, 2022.

get_value <- function(x) {
  x <- strsplit(x, "") |> unlist()
  x_nums <- x[x %in% 1:9]
  paste0(head(x_nums, 1), tail(x_nums, 1)) |> 
    as.numeric()
}

sapply(input, get_value) |> sum()

[1] 55208

C

Before Advent of Code started this year, I thought if there was one challenge I’d be able to solve in C, it would probably be Part 1 of Day 1. When the puzzle was released, it did indeed seem doable, though there was still plenty I had to figure out for the first time. This was also an opportunity to get an assist from LLMs, for the parts I was unfamiliar with.

The overall strategy was the same as in the R solution: write a get_value() function, then, in main(), loop over the input, calling it on each line.

First, here’s my get_value() function. In C, strings are arrays of characters, and the function loops over the array in both directions: left-to-right to find the first occurrence of a digit, then right-to-left to find the last. Each loop breaks once a digit has been found, after updating value accordingly.

int get_value(char input[]) {
  
  // get the length of the string
  int length = strlen(input);
  
  // to store the value
  int value = 0;
  
  // find the first digit 
  for (int i = 0; i < length; i++) {
    if (input[i] > '0' && input[i] <= '9') {

      // convert to int and update value
      value = (input[i] - '0')*10;
      break;
      
    }
  }
  
  // find the last digit, starting at the last character (index length - 1)
  for (int i = length - 1; i >= 0; i--) {
    if (input[i] > '0' && input[i] <= '9') {

      value += (input[i] - '0');
      break;
      
    }
  }
  
  return value;
}

As for the role of LLMs in this, I came up with the overall strategy, and Copilot suggested the code if (input[i] > '0' && input[i] <= '9') and also value = (input[i] - '0') (I needed to amend it with the multiplication by 10). I hadn’t come across character literals before, so I asked ChatGPT to explain the code value = input[i] - '0', which was really helpful.¹
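
To make the character arithmetic concrete, here is a minimal sketch. The helper name digit_value() is hypothetical and not part of the script; it only isolates the subtraction of '0' and, unlike the condition above, it also accepts '0' itself.

```c
// Hypothetical helper for illustration: returns the numeric value of a
// digit character, or -1 if c is not one of '0' to '9'.
int digit_value(char c) {
  if (c >= '0' && c <= '9') {
    // The C standard guarantees that '0' to '9' have consecutive codes,
    // so the distance from '0' is the value of the digit.
    return c - '0';
  }
  return -1;
}
```

With this, digit_value('7') is 7, and (digit_value('7') * 10) + digit_value('2') is 72, which is the same arithmetic that get_value() performs on the first and last digits.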

For main(), new to me was reading in input from a file. I tried various comments to encourage Copilot to show me the way, but it didn’t oblige. I found an example I could adapt on w3schools. One part not covered there is how to ensure that the array in which the line of input is stored is an appropriate length, so I used R to find the length of the longest string in my input², and set the array size to one larger than that.

Here’s my main():

int main(void) {
  
  FILE *fptr;
  
  // Open a file in read mode
  fptr = fopen("input", "r");
  
  // Stop early if the file could not be opened
  if (fptr == NULL) {
    perror("input");
    return 1;
  }
  
  // Buffer to hold one line of the file
  char input_line[50];
  
  // Set up accumulator
  int total = 0;
  
  // Read one line at a time into input_line and add its value to the total
  while (fgets(input_line, sizeof input_line, fptr)) {
    int value = get_value(input_line);
    total += value;
  }
  
  // Close the file
  fclose(fptr);
  
  printf("%d\n", total);
  
  return 0;
}
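
The length of the longest line could also be found in C rather than R. The following is a minimal sketch under that assumption; the helper name longest_line() is hypothetical and not part of the script. It reads the file one character at a time, so it needs no line buffer of its own.

```c
#include <stdio.h>

// Hypothetical helper: length of the longest line in an open file,
// not counting the newline character.
size_t longest_line(FILE *fp) {
  size_t longest = 0;
  size_t current = 0;
  int c;  // int rather than char, so EOF can be told apart from real characters

  while ((c = fgetc(fp)) != EOF) {
    if (c == '\n') {
      if (current > longest) longest = current;
      current = 0;
    } else {
      current++;
    }
  }

  // Account for a final line that has no trailing newline
  if (current > longest) longest = current;

  return longest;
}
```

One detail worth keeping in mind when sizing the buffer: fgets() stores at most one character fewer than the size it is given, keeps the trailing newline, and always appends a terminating '\0'. A buffer that holds the longest line in a single call therefore needs the longest length plus two. If the buffer is only one larger than the longest line, the leftover newline is picked up by the next fgets() call as a line of its own, which get_value() scores as 0, so the total is unaffected.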

Putting this all together, including the appropriate header files and function declarations, we get a final script.

I compiled it with clang -o script script.c and ran it with ./script and was delighted to see the same total print in the console that I’d found with R.³
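
The off-by-one slip described in the footnote can be reproduced in isolation. This is a sketch only: the two function names are hypothetical, and each one pulls the second loop of get_value() out on its own, once with the mistaken condition and once with the corrected one.

```c
#include <string.h>

// Mistaken version: with the condition i > 0, index 0 is never inspected.
int last_digit_buggy(const char input[]) {
  for (int i = (int) strlen(input) - 1; i > 0; i--) {
    if (input[i] > '0' && input[i] <= '9') return input[i] - '0';
  }
  return 0;  // no digit found
}

// Corrected version: i >= 0 also checks the first character.
int last_digit_fixed(const char input[]) {
  for (int i = (int) strlen(input) - 1; i >= 0; i--) {
    if (input[i] > '0' && input[i] <= '9') return input[i] - '0';
  }
  return 0;  // no digit found
}
```

For the second line of the input shown earlier, "2bmckl", the mistaken loop finds no digit at all, so get_value() would return 20 rather than 22. For a line such as "7two68", both versions agree, which is why the mistake can stay hidden on inputs where the last digit is never the first character.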

Part 2

The crux of the puzzle

As in Part 1, except this time the ‘digit’ can also be spelled out as a word, e.g. “six”.

Back to R. Part 2, on the other hand, was not straightforward at all, especially for a Day 1 puzzle! At first I was pretty stumped, though after some experimentation and an insight, I was able to solve it without recourse to LLMs.

Like Part 1, I wrote a function that works on one line of input, then applied it to all of them. Unlike Part 1, it doesn’t suit splitting up the string into characters – instead we need to bring on the regex!

My first thought was str_extract_all() from the stringr package, but that couldn’t handle the overlapping words: its matches don’t overlap, so in "eightwothree" it found "eight" and "three", but not "two", whose "t" had already been used up by "eight". str_extract() is good for finding the first match for a digit/word though.

The breakthrough insight that got me to my solution is that finding the last digit/word is equivalent to finding the first digit/word of the string in reverse. So first, I wrote a quick function to reverse a string:

string_reverse <- function(x) {
  strsplit(x, "") |> 
    unlist() |> 
    rev() |> 
    paste0(collapse = "")
}

Next, we need to build the regex for matching a digit/word, both forwards and in reverse. We create nums first, then collapse it and add the regex for a digit, because we need nums later to match() a word to its corresponding value.

nums <- c("one", "two", "three", "four", "five", "six", "seven", "eight", "nine")
nums_pattern <- paste(nums, collapse = "|")
nums_digit_pattern <- paste0(nums_pattern, "|\\d")

nums_pattern_rev <- string_reverse(nums_pattern)
nums_digit_pattern_rev <- paste0(nums_pattern_rev, "|\\d")

To get the last digit, we can then use str_extract() on a reversed input string, though we need to reverse the output back again to turn it into a number-word:⁴

library(stringr)
get_last_digit <- function(x) {
  x |> 
    string_reverse() |> 
    str_extract(nums_digit_pattern_rev) |> 
    string_reverse() 
}

We also need to convert the string to a numeric, for either a spelled out word or a character digit:

convert_to_digit <- function(x) {
  if (nchar(x) == 1) {
    x <- as.numeric(x)
  } else {
    x <- match(x, nums)
  }
  x
}

Putting this all together:

get_value2 <- function(x) {

  first <- str_extract(x, nums_digit_pattern)
  
  last <- get_last_digit(x)
  
  first_digit <- convert_to_digit(first)
  last_digit <- convert_to_digit(last)
  
  10*first_digit + last_digit
}

sapply(input, get_value2) |> sum()

[1] 54578

That was quite a lot for Day 1! Perhaps I’ll come back another time and attempt a solution to Part 2 in C, but given how tricky that was in R, I expect it would be quite a challenge for me.
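
For what it is worth, a C version of Part 2 would not have to go through regex or string reversal. The following is a minimal sketch of one possible approach, not the method used above: it tests every position of the line against the digits and the nine number words, which sidesteps the overlap problem because each starting position is checked independently. The names NUMS, value_at() and get_value2() are hypothetical, and digits are assumed to be 1 to 9, as in the Part 1 C code.

```c
#include <string.h>

// Number words; index + 1 gives the value (mirrors nums in the R code)
static const char *NUMS[] = {"one", "two", "three", "four", "five",
                             "six", "seven", "eight", "nine"};

// Hypothetical helper: value (1 to 9) of the digit or number word that
// starts at s, or 0 if neither starts there.
int value_at(const char *s) {
  if (*s > '0' && *s <= '9') {
    return *s - '0';
  }
  for (int n = 0; n < 9; n++) {
    // strncmp stops at the end of s, so short tails are safe to test
    if (strncmp(s, NUMS[n], strlen(NUMS[n])) == 0) {
      return n + 1;
    }
  }
  return 0;
}

// Sketch of a Part 2 counterpart to get_value(): every position is
// tested, so overlapping words such as "eightwo" are each seen.
int get_value2(const char input[]) {
  int first = 0;
  int last = 0;
  for (size_t i = 0; input[i] != '\0'; i++) {
    int v = value_at(&input[i]);
    if (v > 0) {
      if (first == 0) first = v;  // remember only the first match
      last = v;                   // keep overwriting until the end
    }
  }
  return 10 * first + last;
}
```

On the overlapping example from earlier, get_value2("eightwothree") gives 83. Swapping get_value() for get_value2() in the main() loop would be the only other change needed, assuming the same input file and buffer.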

Session info

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31)
 os       macOS Sonoma 14.1
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/London
 date     2023-12-06
 pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
 quarto   1.4.515 @ /usr/local/bin/quarto

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version    date (UTC) lib source
 aochelpers  * 0.1.0.9000 2023-12-06 [1] local
 sessioninfo * 1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
 stringr     * 1.5.1      2023-11-14 [1] CRAN (R 4.3.1)

 [1] /Users/ellakaye/Library/R/arm64/4.3/library
 [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library

──────────────────────────────────────────────────────────────────────────────

Footnotes

¹ The key parts of the ChatGPT explanation for me were:

In C, the code value = (input[i] - '0') is typically used to convert a character representing a digit into its corresponding numeric value.
- '0': This represents the character literal for the digit 0. In the ASCII character set, the digits 0 to 9 are represented consecutively. Therefore, subtracting the ASCII value of '0' from the ASCII value of a digit character gives the numeric value of that digit.
- (input[i] - '0'): By subtracting the ASCII value of '0' from the ASCII value of the character at input[i], you get the numeric value of the digit represented by that character. This is a common technique used for converting digit characters to their corresponding integer values.

² input |> sapply(nchar) |> max()

³ Actually, at first I got the wrong answer because, in the second loop of get_value(), I’d used i > 0 instead of i >= 0 (damn the difference in indexing between R and C!). This mistake is only a problem when the line of input contains just one digit, which appears at the beginning of the string (which hadn’t been an issue when testing this on the example input). I don’t know any debugging techniques in C, but I threw in some printf statements and, fortuitously, the second line of my input is this problem case, so I spotted it. I’m not sure I’d have been so lucky in catching my error if the first problem case hadn’t appeared until several hundred lines into my one-thousand-line input.

⁴ In retrospect, I could have had a fully base R solution if instead of str_extract() I’d used
```
matches <- regexpr(nums_digit_pattern_rev, string_reverse(x))
regmatches(string_reverse(x), m = matches)
```
but thank goodness for stringr with its much more intuitive function names and approach!