Project Luna

A lot can happen over the course of 3 months. My original 7-year hardware freeze, starting June 1st, 2025, quickly evolved into a 9-year plan due to an attempt to achieve “high availability” in the homelab. Some aspects of this were already in place: two separate network-attached storage (NAS) virtual appliances, for instance. But a combination of wanting to test the VMware Cloud Foundation stack (VMware is now part of Broadcom) and comparing vSAN to Ceph cluster storage (more to come on this in future blog posts) in pursuit of high availability of VMs led to yet more changes at the hardware level. As of this blog post on September 1st, 2025, three months since the last post, the homelab looks a lot different, but I think it is much better suited for the long haul.

The idea of waiting 9 years (to roughly coincide with my 50th birthday) is to stretch the original 7 years out. The core hardware and software stack should, for the most part, remain in a supported state through those next 9 years. That meant pushing the envelope to get hardware that is as new as reasonable (still in the grey area between prosumer and enterprise) while still optimizing cost and energy consumption along the way. No jet-engine rack servers or enterprise network switches for me. This hardware is running about 10 feet from my office desk, so it has to be nearly silent. The constant whine of fans and huge energy bills are non-starters.

But the combination of “high availability” as a general design goal alongside maintainability led to an interesting path. All of my designs had been, with a few exceptions, focused on my own individual ability to maintain a working homelab. But what if the homelab were designed so that maintenance and troubleshooting tasks could be performed by a spouse or kids? “The Internet is down again” is something any homelabber sharing a network connection with others in a household has dreaded. What broke this time? The problem is often relatively easy to fix for the more technically focused. Wouldn’t it be great to give other members of one’s household the same level of self-service troubleshooting that already exists for the standard consumer router (turn it off and back on)? That’s the genesis of Project Luna: a moonshot to see if I could get a highly available and easy-to-troubleshoot homelab that I could essentially hand off to others for day-to-day maintenance. Not the experimental bits, just the resources needed by others in the household (Internet, Plex, etc.).

Project Luna is a redesign of the homelab, segmenting the “home” and “lab” sections of the hardware and networking. When an experiment goes awry, it shouldn’t take down Plex. When the Internet connection needs to be reset, it shouldn’t involve a reboot of the virtual machine host running TrueNAS. When equipment needs to be power cycled, it shouldn’t involve tracing cables or pulling plugs out of the backs of devices. When a piece of equipment fails (permanently), there should be a backup system ready to take its place right away.

The next post will outline the redundant network and router design, with a focus on energy efficiency and failure domains.
