Reducing manual mind tasks with code


The brain is terrible at methodical, repetitive tasks. We all know that. Especially when the task is something like checking that a bunch of numbers from one file are not present in another.

By a bunch, I mean 700+ - which I had to check while reviewing a PR at work.

Problem

Check that 728 specific numbers were removed from a huge (17k+) audit file generated by Jenkins. (And no, GitHub refused to show the diff as well - but I learnt something there, see the PS.)

Ways to solve it

Slightly suboptimal method: Pull down the PR branch and check that each number has been deleted - MANUALLY

That is a lot of brain power spent on something that does not need it.

Less-mundane-brain-work method: Let a script take care of all of this comparison!

import json


def find_dupe_nums(nums):
    """Return every number that appears more than once in nums."""
    seen_nums = set()
    dupe_nums = []

    for num in nums:
        if num in seen_nums:
            dupe_nums.append(num)
        else:
            seen_nums.add(num)

    return dupe_nums


def check_nums_removed():
    # The keys of the audit JSON are the numbers still present in the file.
    # JSON keys are always strings, so they compare directly with the text tokens.
    with open("audit_json_file.json") as json_file:
        all_nums = set(json.load(json_file))

    # The text file holds the numbers to remove, separated by whitespace.
    with open("nums_should_be_removed.txt") as text_file:
        nums_should_be_removed = text_file.read().split()

    duplicates = find_dupe_nums(nums_should_be_removed)

    # Any number that is still a key in the JSON was not removed.
    missed_to_be_removed_nums = [
        num for num in nums_should_be_removed if num in all_nums
    ]
    count_missed = len(missed_to_be_removed_nums)

    print(f"these are duplicates {duplicates}")

    if count_missed == 0:
        print("Yay! All removed!")
        return

    print(f"count missed {count_missed} {missed_to_be_removed_nums}")


if __name__ == "__main__":
    check_nums_removed()

This compares a huge JSON file to a text file - if a number from the text file is still present as a key in the JSON, we mark it as missed_to_be_removed, count it and move on. It also reports any numbers that are listed more than once in the text file.
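The same comparison can also be written with set operations, which makes it easy to try on a tiny example before pointing it at the real files. This is only a sketch: the function name check_removed and the sample data are made up for illustration, and it assumes the JSON is a single object whose keys are the numbers.

```python
from collections import Counter


def check_removed(audit_data, nums_to_remove):
    """Return (still_present, duplicates) for the numbers that should be gone.

    audit_data: dict parsed from the audit JSON (keys are the numbers, as strings).
    nums_to_remove: list of number strings read from the text file.
    """
    # Numbers listed more than once in the text file, each reported once.
    counts = Counter(nums_to_remove)
    duplicates = sorted(num for num, count in counts.items() if count > 1)

    # Set intersection: numbers that should be gone but are still JSON keys.
    still_present = sorted(set(nums_to_remove) & set(audit_data))

    return still_present, duplicates


# Hypothetical sample data, not taken from the real audit file.
sample_audit = {"10001": {}, "10002": {}, "10003": {}}
sample_to_remove = ["10002", "20001", "20001", "20002"]

still_present, duplicates = check_removed(sample_audit, sample_to_remove)
print(still_present)  # ['10002']
print(duplicates)     # ['20001']
```

Unlike the script, this reports each missed number only once even if it is listed several times in the text file.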

It’s a pretty simple script with a small scope, and the code was written quickly (it could be cleaner for sure), but it really helped me validate my team member’s PR quickly and efficiently.

Doing things like this brings a unique happiness to everyday work, keeps the mind sharp, and helps teammates along the way :)

PS: I learnt that if GitHub can’t render a diff for a huge change, appending .diff to the PR URL shows the raw diff for the whole PR - e.g., https://github.com/WomenWhoCode/hacktoberfest22-pythonic-stories/pull/14.diff. I saw there’s a .patch too, but more on that another day…
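The .diff trick can be scripted too. Here is a rough sketch, assuming a public repository (a private one would need authentication, which this does not handle). The helper names pr_diff_url, fetch_pr_diff and removed_lines are my own, not anything GitHub provides, and the parsing assumes the usual unified diff layout where each file starts with a "diff --git" line and each hunk with "@@".

```python
import urllib.request


def pr_diff_url(pr_url):
    """Build the raw-diff URL for a GitHub pull request URL."""
    return pr_url.rstrip("/") + ".diff"


def fetch_pr_diff(pr_url):
    """Download the diff text. Needs network access and a public repository."""
    with urllib.request.urlopen(pr_diff_url(pr_url)) as response:
        return response.read().decode("utf-8")


def removed_lines(diff_text):
    """Return the content of every line the diff removes (leading '-' stripped)."""
    removed = []
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            # A new file section starts; its '---' header is not a removal.
            in_hunk = False
        elif line.startswith("@@"):
            # Hunk header: the lines after it are context, additions or removals.
            in_hunk = True
        elif in_hunk and line.startswith("-"):
            removed.append(line[1:])
    return removed


# Hypothetical diff text for illustration, not from a real PR.
sample_diff = """diff --git a/audit.json b/audit.json
index 111..222 100644
--- a/audit.json
+++ b/audit.json
@@ -1,4 +1,3 @@
 {
-  "10002": {},
   "10003": {}
 }
"""

print(pr_diff_url("https://github.com/WomenWhoCode/hacktoberfest22-pythonic-stories/pull/14"))
print(removed_lines(sample_diff))  # ['  "10002": {},']
```

For a real PR, removed_lines(fetch_pr_diff(pr_url)) would give the removed lines, which could then be checked against the list of numbers.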