Archivist — a personal project to use AI and agents to make sense of what my folks left behind

Six years ago Mom passed, presumably joining Dad wherever he went 30 years earlier. With that, we four kids inherited eight boxes of medical records, tax forms, old issues of Consumer Reports, recipe clippings, and the real treasure: documents and pictures tracing through four generations back to imperial Russia, specifically the Pale of Settlement. We preserved the oldest stuff in a museum-ish way, then set it all aside until this past fall.

Part of deciding to leave BenchSci was taking on these physical things: conserving them, making sense of them, and turning them into something that will still be there when my kids, and my nieces and nephews, have kids of their own and actually start caring about where they came from. By best estimate that’s 2,500 objects, and many tell a story. (Just not the blurry vacation photos.)

Conserving is about keeping the physical thing in as good a shape as possible for as long as possible. Digitizing is about tracking the artifacts, finding a way to make them easy for my family to see, maybe even studying the history. For my sins, it seemed like classifying, clustering, contextualizing, and digitally restoring these objects would be a good way to get some seat time engineering with agents. It’s also a good way to explore dedicated and multimodal models, MCPs, my own agents, and maybe even some custom stuff to assemble a history from these pieces. The shorter path might have been to just do this by hand, but hey, do it the hard way, am I right?

This has been my project for the past three months, and the next few posts are therapy: a way to share the hardest parts of doing this as a hobby (digitizing, efficient token use, the intricacies of debugging GCP oddities, etc.).