
Advent of Code 2020 Day 4

Part 1

In Day 4’s challenge, we are once again brought back to working with strings. In this problem, we are asked to verify the validity of passports based on the presence of required fields. These fields are: byr (birth year), iyr (issue year), eyr (expiration year), hgt (height), hcl (hair color), ecl (eye color), pid (passport ID), and cid (country ID).

To be valid, a passport must contain either all eight fields, or be missing only the optional cid field. The problem does not seem very complex. Let’s look at the input data:

ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929

hcl:#ae17e1 iyr:2013
eyr:2024
ecl:brn pid:760753108 byr:1931
hgt:179cm

hcl:#cfa07d eyr:2025 pid:166559648
iyr:2011 ecl:brn hgt:59in

One possible difficulty here is making sure that all the data for a single passport is correctly taken into account despite being spread over multiple lines. For this bit, I have to admit that I cheated a little: I used a Vim macro to put the data for each passport on a single line.

If you don’t know Vim (you really should), here’s what you might have done to get all the data for a passport in a single variable:

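# lines: the input lines, without their trailing newline characters
passports = []
passport = ""
for line in lines:
    if line:
        # Keep a space between fields coming from different lines
        passport += line + " "
    else:
        # A blank line marks the end of the current passport
        passports.append(passport.strip())
        passport = ""
# The last passport is not followed by a blank line
if passport:
    passports.append(passport.strip())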
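
As an alternative to the loop above, here is a minimal sketch that splits the raw text on blank lines instead. It assumes the whole puzzle input is available as a single string with Unix line endings; the names `split_passports` and `sample` are introduced here for illustration only.

```python
def split_passports(raw_text):
    """Return one space-separated string per passport."""
    passports = []
    # Passports are separated by blank lines, i.e. two consecutive newlines.
    for block in raw_text.strip().split("\n\n"):
        # str.split() with no argument splits on any run of whitespace
        # (spaces and newlines), so all fields end up on a single line.
        passports.append(" ".join(block.split()))
    return passports


# First two passports of the sample input shown earlier.
sample = """ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929
"""

for passport in split_passports(sample):
    print(passport)
```

Each printed line holds every field of one passport, which is the same shape the loop above produces.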

Now, there are (as usual) multiple ways to solve this first part. I’m going to use maps (dictionaries, in Python) since we haven’t used them before. We will thus represent a passport with a map, where the keys are the field names and the values are the corresponding field values. With maps, the solution becomes quite clear: a passport is valid if the set of keys contains exactly 8 elements, or contains 7 elements and cid is not one of them.

If the objective is to count how many passports are valid, the implementation of the solution could be something like this:

valid_passports = 0
for passport in passports:
    passport_dict = {}
    # split() without an argument also copes with repeated spaces
    for key_value in passport.split():
        key, value = key_value.split(":")
        passport_dict[key] = value
    # Testing membership on a dictionary looks at its keys
    if len(passport_dict) == 8 or (len(passport_dict) == 7 and "cid" not in passport_dict):
        valid_passports += 1
print(valid_passports)
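
The same rule can also be phrased with sets: a passport is valid when the seven mandatory keys are all present, whatever happens with cid. The sketch below is one possible way to write that, not the approach used above. `REQUIRED_FIELDS`, `parse_passport` and `has_required_fields` are names made up for this example, and it assumes each passport is a single space-separated string.

```python
# The seven mandatory fields; cid is deliberately left out.
REQUIRED_FIELDS = {"byr", "iyr", "eyr", "hgt", "hcl", "ecl", "pid"}


def parse_passport(passport):
    """Turn 'key:value key:value ...' into a dictionary."""
    return dict(field.split(":", 1) for field in passport.split())


def has_required_fields(passport_dict):
    # issubset() accepts any iterable; iterating a dict yields its keys.
    return REQUIRED_FIELDS.issubset(passport_dict)


# The four passports of the sample input, one per line.
passports = [
    "ecl:gry pid:860033327 eyr:2020 hcl:#fffffd byr:1937 iyr:2017 cid:147 hgt:183cm",
    "iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884 hcl:#cfa07d byr:1929",
    "hcl:#ae17e1 iyr:2013 eyr:2024 ecl:brn pid:760753108 byr:1931 hgt:179cm",
    "hcl:#cfa07d eyr:2025 pid:166559648 iyr:2011 ecl:brn hgt:59in",
]

valid_passports = sum(has_required_fields(parse_passport(p)) for p in passports)
print(valid_passports)  # 2: the second lacks hgt, the fourth lacks byr
```

Unlike the length-based test, this version does not depend on the number of keys, so an unexpected extra field would not change the result.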

Part 2

The second part now asks us to also validate the value of each field. The rules are as follows:

  • byr (Birth Year) - four digits; at least 1920 and at most 2002.
  • iyr (Issue Year) - four digits; at least 2010 and at most 2020.
  • eyr (Expiration Year) - four digits; at least 2020 and at most 2030.
  • hgt (Height) - a number followed by either cm or in:
    • If cm, the number must be at least 150 and at most 193.
    • If in, the number must be at least 59 and at most 76.
  • hcl (Hair Color) - a # followed by exactly six characters 0-9 or a-f.
  • ecl (Eye Color) - exactly one of: amb blu brn gry grn hzl oth.
  • pid (Passport ID) - a nine-digit number, including leading zeroes.
  • cid (Country ID) - ignored, missing or not.

And the question, once again, is to count the number of valid passports.

Now that we have three days of data parsing behind us, this problem should not be too difficult. The only tricky parts could be validating the hcl field, which is written in hex format, or the pid field, which is a nine-digit value. Thankfully, regular expressions are here to help. I think this problem is also a nice opportunity to practice a bit of problem decomposition with functions. For example, we can call a function on each passport to check its validity. This function itself will call 7 other functions, one for each rule we need to check. Then, all we have to do is correctly parse the useful information in each value and verify its validity.

import re   # Used for regex

def check_passport_validity(passport):
    # passport is a dictionary mapping each field name to its value
    if len(passport) == 8 or (len(passport) == 7 and "cid" not in passport):
        return (check_birth_year(passport["byr"]) and check_issue_year(passport["iyr"]) and
                check_expiration_year(passport["eyr"]) and check_height(passport["hgt"]) and
                check_hair_color(passport["hcl"]) and check_eye_color(passport["ecl"]) and
                check_passport_id(passport["pid"]))
    # At least one required field is missing
    return False

def check_birth_year(byr):
    # Check the four-digit format first so that int() cannot fail
    return re.search("^[0-9]{4}$", byr) is not None and 1920 <= int(byr) <= 2002

def check_issue_year(iyr):
    return re.search("^[0-9]{4}$", iyr) is not None and 2010 <= int(iyr) <= 2020

def check_expiration_year(eyr):
    return re.search("^[0-9]{4}$", eyr) is not None and 2020 <= int(eyr) <= 2030

def check_height(hgt):
    # A number followed by a two-letter unit, e.g. "183cm" or "59in"
    match = re.search("^([0-9]+)(cm|in)$", hgt)
    if match is None:
        return False
    height, unit = int(match.group(1)), match.group(2)
    if unit == "cm":
        return 150 <= height <= 193
    else:
        return 59 <= height <= 76

def check_hair_color(hcl):
    return re.search("^#[0-9a-f]{6}$", hcl) is not None

def check_eye_color(ecl):
    return ecl in ('amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth')

def check_passport_id(pid):
    return re.search("^[0-9]{9}$", pid) is not None
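
For comparison, the same rules can be written as a table that maps each field to a small check. This is only a sketch of an alternative layout, not the decomposition used above: `RULES`, `year_between`, `valid_height`, `is_valid` and `parse` are all hypothetical names, and the third example passport is invented to show a failing height.

```python
import re

EYE_COLORS = {"amb", "blu", "brn", "gry", "grn", "hzl", "oth"}


def year_between(low, high):
    """Build a check for a four-digit year between low and high (inclusive)."""
    def check(value):
        return re.fullmatch(r"[0-9]{4}", value) is not None and low <= int(value) <= high
    return check


def valid_height(value):
    match = re.fullmatch(r"([0-9]+)(cm|in)", value)
    if match is None:
        return False
    number, unit = int(match.group(1)), match.group(2)
    if unit == "cm":
        return 150 <= number <= 193
    return 59 <= number <= 76


# One entry per rule; cid has no entry, so it is ignored, missing or not.
RULES = {
    "byr": year_between(1920, 2002),
    "iyr": year_between(2010, 2020),
    "eyr": year_between(2020, 2030),
    "hgt": valid_height,
    "hcl": lambda v: re.fullmatch(r"#[0-9a-f]{6}", v) is not None,
    "ecl": lambda v: v in EYE_COLORS,
    "pid": lambda v: re.fullmatch(r"[0-9]{9}", v) is not None,
}


def is_valid(passport):
    # Every required field must be present and pass its own check.
    return all(field in passport and rule(passport[field])
               for field, rule in RULES.items())


def parse(passport):
    return dict(field.split(":", 1) for field in passport.split())


examples = [
    # From the sample input: all fields present and well-formed.
    "ecl:gry pid:860033327 eyr:2020 hcl:#fffffd byr:1937 iyr:2017 cid:147 hgt:183cm",
    # From the sample input: byr is missing.
    "hcl:#cfa07d eyr:2025 pid:166559648 iyr:2011 ecl:brn hgt:59in",
    # Invented for this example: 183in is outside the allowed range.
    "ecl:gry pid:860033327 eyr:2020 hcl:#fffffd byr:1937 iyr:2017 hgt:183in",
]
print([is_valid(parse(p)) for p in examples])  # [True, False, False]
```

The table keeps the presence check and the value checks in one place, at the cost of hiding each rule behind a lambda instead of a named function.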

Concepts and difficulties

Day 4 was not necessarily difficult; it is really a matter of taking things slowly, bit by bit. This is why I tried to decompose Part 2 into a lot of very small functions that each do just one thing. The difficulties then lie in the data parsing phase (we have to juggle strings and integers) and in the definition of the regular expressions.
