Home Server Upgrades
I have had 2 servers at home running all of my infrastructure, both of which were either free or budget builds and should charitably be described as ancient at this point. While I definitely learn plenty wrangling this stuff, they're not formally a homelab, and over time they've gone from just a file server for backups and media centralization to things that are more critical to the daily function of the house, namely Home Assistant for automation. So I need these to be reliable and relatively maintenance-free, because I have plenty to do without also playing tech support or chasing flakiness in the home automation for the residents of the house who will be annoyed if it's anything other than 100% transparent to them.
- HP DL360 G7 with a single E5620, 32GB RAM, and 4x300G SAS SSDs in a hardware RAID. It runs a number of my Docker containers for things I'd potentially want to expose to the outside world (Minecraft servers, Home Assistant, etc.). Ubuntu 22.04 LTS.
- This isn't very power-efficient, and the RAID card is on its way out: the original cache card already failed once and was replaced with a spare, that spare has long since disabled its cache because it doesn't like the battery, and it has glitches where it freaks out and declares the file system read-only, which makes me think it's going to fail entirely. Every time that happens, I have to manually go into the RAID card's setup and tell it to boot normally and recover things, which meant pulling the system out of the rack and removing the GPU so it would use the onboard video (I never could get the GPU to put out a console display), all of which is a serious PITA. Since bypassing the RAID card to move to JBOD/software RAID, or replacing an actually failed card, was going to require a rebuild anyway, I figured I was better off just building a new system.
- Dell PowerEdge T110 II (tower) with a single E3-1230 V2, 16GB of RAM, and 5x3TB storage in ZFS. Runs TrueNAS, serves as the primary file server, and also has a Plex jail since that's where all of the media lives. I've been running this since it was FreeNAS (version 10, I think), and there are some older posts on the blog about previous iterations of this build.
- TrueNAS Core (the version I'm running) is based on FreeBSD 13 and is basically at end of support. There are multiple point releases of FreeBSD 13 that never got integrated, and there are no plans for anything based on 14.
- TrueNAS really wants you to move to Scale (Linux-based), and they've formally dropped support for jails and VMs running on Core, so I can't even update to the (likely final) maintenance release of Core without risking breaking something.
- There's an upgrade path from Core to Scale, but it requires hard drives; there's no USB live-boot support anymore. I have no SATA ports available on this system, and it's more than a decade old, so it's not like it even has USB 3 or anything else that could be used for more drives.
The goal was to replace the above with some combination of 1-2 systems with similar power usage (CPU TDP, etc.) while taking advantage of the performance-per-watt improvements much newer chips would provide, preferably in similar form factors and allowing me to reuse the various disks involved. I run Folding at Home on my stuff when it's idle, and the heat generated ultimately feeds my heat pump water heater, especially in the winter, so there's some incentive to have hardware that is still efficient under heavy load. I'd decided I wanted to move away from appliance OSes in favor of regular Linux + OpenZFS, because I don't really need the UI for management and didn't want to end up in the same situation I was in with TrueNAS again later. I considered either consolidating everything onto one larger rackmount system, or keeping 2 systems and using a purpose-built NAS case and motherboard (lots of SATA ports) to replace the Dell tower. Moving up to a 2RU system would mean finding a different switch that wasn't rack-mounted, because I only have a 4RU rack, and I got analysis paralysis on the DIY NAS route due to the sheer number of options.
The forcing function that got me to settle on a plan was a Dell R330 with 4x4TB hard drives that came available free as in beer, plus the discovery that Dell still makes tower servers that are newer-generation versions of what I have; I found a T340 on eBay for under $400. The T340 has 8x3.5" hot-swap SATA/SAS bays, plus 3 more 5.25" bays and 2 onboard SATA ports. So after some discussion about the merits of various ZFS/RAID configurations, I ended up with the following:
- Dell R330, E3-1230 v5, 32GB RAM, PERC H330 (in JBOD mode)
- 4x200GB SAS SSDs (from the HP) in 2 ZFS mirror vdevs (basically RAID10) for storage
- adapters to use them in the existing 3.5" bay/caddy
- Linux software RAID1 for OS made up of
- 1x256GB SATA SSD in the optical drive bay via an adapter
- 1x256GB NVMe SSD in an "external" USB3 case plugged into the internal USB3 port
- Dell T340, E-2244G, 32GB RAM, PERC H330 (in JBOD mode)
- 2x200GB SAS SSDs (spares for the HP) in Linux software RAID for the OS, same 2.5" -> 3.5" caddy adapters
- ZFS storage pool consisting of the following mirror vdevs (see the zpool sketch after this list)
- 2x4TB SATA
- 2x4TB SATA
- 2x3TB SATA
- 2x3TB SATA (via an HDD cage in the 5.25" bays and the onboard SATA ports)
- eBay Nvidia Quadro P2200
- for FaH and Plex transcoding (replaced a Quadro K2000 that had been in the HP and died)
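For reference, here's roughly what creating the T340 pool as striped mirrors looks like. This is a sketch rather than a transcript of what I ran: the pool name and device IDs are placeholders, and the real ones come from /dev/disk/by-id (more on why that matters later).

    # Sketch only: pool name and disk IDs are placeholders; ashift=12 assumes 4K-sector drives
    zpool create -o ashift=12 tank \
      mirror /dev/disk/by-id/ata-4TB_DISK_A /dev/disk/by-id/ata-4TB_DISK_B \
      mirror /dev/disk/by-id/ata-4TB_DISK_C /dev/disk/by-id/ata-4TB_DISK_D \
      mirror /dev/disk/by-id/ata-3TB_DISK_E /dev/disk/by-id/ata-3TB_DISK_F \
      mirror /dev/disk/by-id/ata-3TB_DISK_G /dev/disk/by-id/ata-3TB_DISK_H
    zpool status tank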
Software considerations
I ended up using Debian 12 on both. I've been fairly happy with Ubuntu (that HP ran 18.04 LTS and seamlessly upgraded to 20.04 and 22.04), but they seem to be really focused on their paid support models: each new LTS release includes more nagging about the updates you could be getting but aren't, and with the latest LTS (24.04) they're pretty aggressively pushing Snap instead of apt for package management. There's a limit to how much change that's different largely for the sake of being different I'm interested in being forced to learn just to have the stuff I use at home Just Work, especially in the age of AI Slop overtaking all the good documentation and search results. Yes, I have Opinions about systemd (more on that below), but for ease of use and compatibility reasons, I'd rather live with the enemy I know than Prove a Point by finding a Linux flavor that doesn't use it and dealing with always having the "weird" OS anytime I need to do or install something.
As much as possible* lives in Docker. I was already running Home Assistant and Minecraft servers (Java and Bedrock) in Docker on the existing system. Plex was previously running in a FreeBSD jail on the TrueNAS box, so I built a container to do that on the new box. Both hosts run Watchtower to keep the containers automagically up to date (a minimal sketch of that is below, after the list of exceptions). For the most part, migration meant deploying the appropriate software or containers on the new system and then either copying over the directories from the old system or doing a backup/restore: Minecraft was the former, Home Assistant and Unifi were the latter, and Plex appears to store most of the relevant account settings online now, so it was just as easy to log in to the clean install, re-add my libraries, and let it rescan everything.
*with a couple of exceptions:
- Unifi - conflicting info about whether their Docker container is really supported and which one to use
- Folding at Home - the FAH GPU container has gone unsupported; with no updates for multiple years, newer projects' work units fail due to missing/outdated libraries, especially for GPU work. Plus there's now a completely new major rev of FAH (v8) with no current Docker container. Also, I'm starting to think that GPU + Docker isn't worth the hassle.
- Plex - I thought I was going to have to run this outside of a container; see the issues below.
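For what it's worth, the Watchtower piece mentioned above is about as simple as containers get. The standard invocation is roughly the following (the image is the upstream containrrr/watchtower one; adjust to taste):

    # Watchtower needs the Docker socket so it can pull updated images and restart other containers
    docker run -d --name watchtower --restart unless-stopped \
      -v /var/run/docker.sock:/var/run/docker.sock \
      containrrr/watchtower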
Issues
Software RAID for OS
OpenZFS is pretty well supported on Linux, but the prevailing wisdom is that you don't really want to boot your OS off of it, hence the Linux software RAID for the OS disks. Debian's installer, however, doesn't know how to set up that software RAID in a guided partitioning run, and you end up with it failing to write GRUB to the disk. I started with instructions that told me to disable EFI boot, which worked on the R330 because its OS disks weren't hanging off the PERC, so it didn't matter whether the installer could see anything behind it. The T340's OS disks are SAS, which do hang off the PERC. I was convinced I had some sort of hardware problem, and lost several days troubleshooting because neither the installer nor an actual install of Debian would see the disks behind the card, despite recognizing the PERC in logs, lspci, etc. It turns out that even though both cards are H330s running the same (most current) firmware, there's something newer about the one in the T340 that makes it not work unless you're booting EFI. Once I found the equivalent instructions for EFI, things worked pretty well, though I have noticed that the GRUB script that copies the current boot image to the second disk is a bit fragile, especially if the disks ever change designations.
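The general idea behind keeping the second disk bootable under EFI is something like the sketch below. This is not my actual script: the device names are examples (and, as noted later, they move around on this box), and it assumes each disk has its own EFI system partition as partition 1.

    # Assumes /dev/sda1 and /dev/sdb1 are the two EFI system partitions (example names)
    mount /dev/sdb1 /mnt
    rsync -a --delete /boot/efi/ /mnt/     # mirror the live ESP onto the second disk
    umount /mnt
    # register a fallback boot entry pointing at the copy on the second disk
    efibootmgr -c -d /dev/sdb -p 1 -L "debian-fallback" -l '\EFI\debian\grubx64.efi'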
SMB/CIFS
Between Windows defaulting to using your Microsoft login for network shares (which, among other things, is almost guaranteed to be too long or contain characters that are illegal for an actual Unix user account) and Windows disabling guest logins by default, I fought with smb.conf for entirely too long before I had a working network share. There's lots of conflicting info about how to do this, and the config that TrueNAS's web UI generates doesn't seem to translate directly.
This is what I finally found that made guest access work, though note that there's also a command you have to run on the Windows side. I still don't entirely understand why, but force user/group nobody within each share's configuration stanza doesn't seem to work.
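A sketch of the kind of guest-share config this ends up looking like is below; the share name and path are examples, not my actual config. The Windows-side command I mentioned is presumably the one that re-enables insecure guest logons for the SMB client (something like Set-SmbClientConfiguration -EnableInsecureGuestLogons $true in an elevated PowerShell), but treat that as my reading of it rather than gospel.

    # /etc/samba/smb.conf (sketch; share name and path are examples)
    [global]
       map to guest = Bad User
       guest account = nobody
       server min protocol = SMB2

    [media]
       path = /tank/media
       browseable = yes
       read only = yes
       guest ok = yes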
Plex/FAH GPU Funtimes
I originally tried to run Plex in Docker, including exposing the GPU to it for transcoding. Once I actually got that working, not only did Plex not show hardware transcoding as an option, it appears that Docker (or FAH) expects exclusive use of the GPU: the GPU instance for Folding at Home, which is installed on the OS itself, failed and declared the GPU unavailable. As soon as I stopped Docker and restarted the FAH client, everything was fine.
My next try was installing Plex directly on the OS, thinking that maybe Docker was complicating things and the OS should be able to handle two programs sharing the GPU. That also didn't get me hardware transcoding in Plex, and it broke FAH's GPU access in the same way. What I ended up doing was disabling the nvidia-container-runtime in Docker, starting the Plex container back up that way, and uninstalling the OS copy of Plex. Longer-term, I had to uninstall nvidia-container-runtime altogether and redeploy a vanilla Plex container, because there was still some sort of conflict stealing FAH's access to the GPU; things would work for a while, then fail before the work unit (WU) could actually complete.
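"Disabling the nvidia-container-runtime" here boils down to making sure Docker isn't using the NVIDIA runtime by default. Roughly, and hedged since your /etc/docker/daemon.json may differ:

    docker info | grep -i runtime     # lists available runtimes and the current default
    # edit /etc/docker/daemon.json to drop any "default-runtime": "nvidia" / nvidia runtimes entry, then:
    systemctl restart docker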
There is still some sort of race condition in service startup order that results in FAH starting before the GPU drivers are fully initialized, so it doesn't find the GPU and the service has to be restarted a few minutes after the system reboots.
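One possible way to paper over this (a sketch, untested here; the unit name depends on which FAH version/package is installed, so check systemctl for the real one) would be a systemd drop-in that makes the FAH unit wait until the driver can actually see the card:

    # /etc/systemd/system/fah-client.service.d/wait-for-gpu.conf  (unit name is an assumption)
    [Service]
    # don't start folding until nvidia-smi can enumerate the GPU
    ExecStartPre=/bin/sh -c 'until nvidia-smi -L >/dev/null 2>&1; do sleep 5; done'
    TimeoutStartSec=300

Followed by a systemctl daemon-reload, of course.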
Kernel and Virtualization
I was chasing some other random failures with the GPU and noticed DMAR/IOMMU errors in the kernel logs; one of the first search results implied this is an issue with the 6.x kernel that can result in things like filesystem corruption if you don't disable VT-d (Intel's IOMMU support for PCI passthrough). Since this was roughly coincident with a new crop of ZFS errors that appeared after I fixed the thermal issues below, and I'm not using that kind of virtualization anyway, I figured I'd better disable it. Dell's BIOS on the T340 doesn't have a separate option for VT-d, only enable/disable for Virtualization Technology as a whole, but it does have another setting for x2APIC mode, which is only necessary on machines with a lot of CPUs. That's disabled by default but was enabled on my machine (I thought the BIOS had already been reset to defaults, but apparently not). I tried disabling just that, but the same errors showed up pretty quickly, so I ended up disabling virtualization in the BIOS entirely. Docker still works, and everything else seems happy with it disabled: several days of stability and no kernel messages.
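For reference, the same knob exists on the OS side if your BIOS is less blunt about it: the Intel IOMMU can be turned off with a kernel parameter instead. I went the BIOS route, but the kernel-command-line version would look roughly like this, followed by update-grub and a reboot:

    # /etc/default/grub -- disable the Intel IOMMU from the kernel side
    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=off"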
Diskname juggling
I have 10 disks in the T340. For whatever reason, every time I reboot the thing, they play three-card monte with their /dev/sdX names. That makes it kinda hard to write a sane smartd.conf, especially since 2 of those disks are SAS (aka SCSI) and the rest are SATA, and the test schedule is a little different for the 2 OS SSDs vs. the aging spinning rust. I ended up having to use /dev/disk/by-id (sketch below). The GRUB copy script I'm using references a specific partition on each of the two /dev/sdX devices, so I'm going to have to spend some time thinking about the best way to convert that to by-id references too. In the meantime, every time I see that the kernel or a kernel module (ZFS, the Nvidia drivers) got an update, I check where the OS disks have migrated to and update the script with the device name du jour.
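A sketch of what such a smartd.conf can look like is below. The IDs are placeholders (yours come from ls -l /dev/disk/by-id) and the schedule is made up for illustration: weekly short tests on the SAS OS SSDs, nightly short plus monthly long tests on the spinning rust.

    # /etc/smartd.conf (sketch; device IDs are placeholders)
    # SAS OS SSDs: monitor everything, short self-test Sundays at 02:00
    /dev/disk/by-id/scsi-OS_SSD_1 -a -s S/../../7/02
    /dev/disk/by-id/scsi-OS_SSD_2 -a -s S/../../7/02
    # SATA data disks: short test nightly at 02:00, long test on the 1st at 03:00
    /dev/disk/by-id/ata-DATA_DISK_1 -a -s (S/../.././02|L/../01/./03)
    /dev/disk/by-id/ata-DATA_DISK_2 -a -s (S/../.././02|L/../01/./03)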
Logging
At the risk of this devolving into a rant: systemd continues to make its users suffer from its dev team's "not invented here" syndrome. Put another way, it keeps violating the principle of least astonishment for anyone at all familiar with UNIX/Linux from the last couple of decades, thanks to its insistence on taking well-understood, documented, functional, and modular things and (often poorly) re-implementing them inside the ever-growing monolith that is systemd. My current sticking point is the transition from regular old text logs to binary files and journalctl, which I hadn't run into on other Debian variants (RasPi, etc.), so this was my first exposure. I've mostly figured it out at this point, so it's probably not worth the hassle of reconfiguring everything to use regular syslog, but fail2ban breaks because it can't find the usual auth.log, and the devs don't seem inclined to handle that case on install. So you have to manually add "sshd_backend = systemd" to the [DEFAULT] section of /etc/fail2ban/paths-debian.conf.
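That is, added to the file's existing [DEFAULT] section:

    # /etc/fail2ban/paths-debian.conf -- point the sshd jail at the journal
    [DEFAULT]
    sshd_backend = systemd

Followed by restarting fail2ban.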
Thermal Management
The T340 is kind of a stupid design in a couple of key ways:
First, it only has one fan for everything, unless you count the power supply fans. Pictures and more details here. There's no forced airflow past the card slots and no real way to add any, so all they really get is the slight negative pressure the PSU fans create. The HDD cage I added to the 5.25" bays has a fan on the front, so there's a little air blowing toward the cards now, but it's pretty anemic.
Second, both of the full-length PCIe slots are adjacent, with an x4 and an x8 slot below them. The PERC H330 lives in the top slot, and I put my GPU in the second slot. The H330 runs kinda hot anyway and only has a passive heatsink for cooling, but the real issue is that in this arrangement its bottom plate sits way too close to the GPU's heat center, and I managed to trigger a thermal shutdown on the PERC, which is something you never, ever want to do unless you enjoy calming down several extremely angry file systems after they unceremoniously lose access to most or all of their disks at the same time.
I thought about swapping the GPU and PERC between slots, but that would just block off the heatsink's airflow with the GPU and cook both cards. The current workaround is a 70mm fan on top of the PERC's heatsink, plus a PCIe extension cable so I can move the GPU elsewhere. Since I don't need its display outputs (the onboard graphics still work if I actually need a console), it doesn't have to sit in a slot at all and can go pretty much wherever it fits, as ghetto as that's likely to be. For its part, the GPU doesn't seem to notice that it's not in direct airflow; it still reports the same mid-60s C temps under full load.
Also worth noting: because this sits on a shelf about 1.7m off the floor, adjacent to where the other server, my switch, and the UPS all hang in a vertical rack, the intake air is a little too warm. A small fan blowing cool air off the ground toward the front of the chassis dropped the intake temperature about 5-6 degrees C (from 35 to 29).
To Do List
I have achieved minimum viable product after what ended up being about 2 weeks of work. There are still a few near-term quality-of-life improvements I need to work on.
- Backups - on the old setup, I had cron jobs doing a set of backups/zips/copies to a CIFS mount so that important files on the 1RU system were getting backed up elsewhere. On the new setup it can probably just be ZFS snapshots that I push around (rough sketch after this list), but I have to figure all of that out. I also need to decide what to do about offsite backup for the stuff that isn't already covered in other ways, now that snapshots give me a more straightforward way to do that.
- Login MOTD - Ubuntu has a pretty nice set of info it presents on login, including system load, updates pending, whether a restart is needed, etc, and I'd like to replicate that on my Debian boxes. Looks like this covers it.
- Unattended upgrades - apply the security updates that are pending without me remembering to log in periodically to do it.
- Alerts - I'm doing SMART monitoring and ZFS scrubs and such, but right now it's all very much like Milhouse watching Bart's factory: "I saw the whole thing. First it started falling over, then it fell over." I need to give it a way to demand attention when monitoring detects a problem, rather than depending on me logging in to check a half-dozen things every so often.
- Monitoring - one of the things that I do miss about TrueNAS's nice webUI is the pretty pre-built charts on most things - network IO, temps, memory, CPU, storage performance, that worked well to see how things behave normally and troubleshoot when they don't. Rolling my own means right now I have... CLI, and CLI, and also some CLI. Probably this means InfluxDB or Prometheus and Grafana, or maybe Netdata. It'd be nice to integrate it into my existing HomeAssistant instance, but so far most of what I've found that watches hardware on HomeAssistant is assuming you want to monitor the hardware that HA is actually running on when you deploy it as an appliance, not to monitor a couple of systems, one of which may be coincidentally running an HA container.
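For the backups item above, the rough shape of the snapshot-push idea is below. It's a sketch, not a working plan yet: the dataset names, destination host, and snapshot names are all made up.

    # take a recursive snapshot of the datasets that matter (names are examples)
    zfs snapshot -r tank/important@2025-01-01
    # first full replication to another box/pool
    zfs send -R tank/important@2025-01-01 | ssh backupbox zfs receive -F backup/important
    # later runs only ship the delta between two snapshots
    zfs snapshot -r tank/important@2025-02-01
    zfs send -R -i tank/important@2025-01-01 tank/important@2025-02-01 | ssh backupbox zfs receive backup/important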
References
Some of the most helpful stuff is linked inline above; here's additional material I used that isn't already linked elsewhere:
ZFS
For what appears to be primarily license-incompatibility reasons, you don't get ZFS on Linux without a bit of work. It's not difficult, but it's nice to have some good recipes (a condensed version follows the links below). I also didn't completely understand ZFS when I initially set it up on the last machine, so I did some things wrong in a way that wasn't easy to fix without vacating and rebuilding. This was an opportunity to do it more "right" this time, especially regarding hierarchical datasets to enable easier snapshotting, so I needed some more background reading.
- https://wiki.debian.org/ZFS
- https://openzfs.github.io/openzfs-docs/Getting%20Started/Debian/index.html
- https://klarasystems.com/articles/choosing-the-right-zfs-pool-layout/
- https://arstechnica.com/information-technology/2020/05/zfs-101-understanding-zfs-storage-and-performance/
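The condensed version of the Debian recipe, from memory, is roughly the following; check the wiki and OpenZFS docs above for current details (including whether you want the backports version).

    # ZFS lives in the contrib component (add contrib to /etc/apt/sources.list first)
    apt update
    apt install linux-headers-amd64 zfs-dkms zfsutils-linux
    modprobe zfs
    # hierarchical datasets so snapshots/backups can target just what matters (names are examples)
    zfs create tank/docker
    zfs create tank/media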
LACP
I don't have any 10GE on my switch yet, and while I could add a 2-port expansion card, it's probably overkill for my application right now. Since the T340 has 2 onboard and 2 card-based GE ports, I did a 4x1GE LAG instead (config sketch after the links).
- https://www.server-world.info/en/note?os=Debian_12&p=bonding&f=1
- https://blog.bella.network/lacp-with-ios-and-debian/
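A sketch of what the 4x1GE 802.3ad bond looks like in /etc/network/interfaces; it needs the ifenslave package, and the interface names and addresses here are examples, not mine.

    # 4x1GE LACP bond; requires the ifenslave package and a matching LAG on the switch
    auto bond0
    iface bond0 inet static
        address 192.0.2.10/24
        gateway 192.0.2.1
        bond-slaves eno1 eno2 enp4s0f0 enp4s0f1
        bond-mode 802.3ad
        bond-miimon 100
        bond-lacp-rate fast
        bond-xmit-hash-policy layer3+4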
Docker
I created multiple datasets in my ZFS pools for ease of management and backups, and one of them is specifically for Docker, so I wanted to move Docker's storage from the OS drives to the right ZFS dataset (sketch below). https://forums.docker.com/t/change-the-default-docker-storage-location/140455
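The short version, with an example path rather than my actual layout: stop Docker, copy /var/lib/docker onto the dataset, point data-root at it, and start Docker back up.

    systemctl stop docker
    rsync -a /var/lib/docker/ /tank/docker/
    # /etc/docker/daemon.json should then contain:  { "data-root": "/tank/docker" }
    systemctl start docker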
SMART
I needed something to base my config on for making sure SMART is monitoring everything and running periodic tests. https://dan.langille.org/2018/11/04/using-smartd-to-automatically-run-tests-on-your-drives/