In a slight change of theme for this blog, I want to give you a look inside my personal homelab environment, talk through some of the design decisions I've made, and cover some of the improvements I'm looking at making in the future.
The reason I'm writing this post now is that my normal daily driver laptop (a 13" MacBook Pro) is currently in for repair, the battery and keyboard having decided to give up the ghost on a relatively new machine. I've fallen back to my old Windows 7 laptop (yes, I'll update it to Windows 10 soon!), which I'm using alongside my Windows desktop, the machine I use when I'm playing games or want a bigger screen. Being back on Windows has given me a chance to start fixing some of the issues I've had with my home network, especially since moving ISP recently.
Most of this started up again as a result of us moving our home ISP away from TalkTalk and over to NowTV. While I personally have very little good to say about TalkTalk, the modem routers they shipped were actually okay, specifically in that they accepted some users would be total novices who'd probably never load the admin panel, whereas others would want to be able to customize things.
Unfortunately the NowTV box lacks a lot of that customization, so the main reason for improving the homelab setup has been to disable more and more chunks of the router itself and move them to my homelab, where I can customize them to do what I want.
The major change I've made is in the way DNS works. On the TalkTalk box I had it set up to route through PiHole, whereas the NowTV box restricts your ability to customize the DNS server addresses that are distributed through DHCP.
My original DNS Setup looked something like this.
The diagram is not the best, but hopefully you get the idea. There were two routes for connecting to PiHole: either through my personal VPN server or through my home network.
In order to make sure only my own known devices were able to resolve DNS, the firewall on my remote homelab server was configured to only permit TalkTalk IPs or my personal VPN server through on port 53 (the DNS port), so this naturally broke anyway when I changed over to the new ISP.
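As a rough illustration of that allowlist rule (not the actual firewall config), the check boils down to "is the source address inside one of the permitted ranges, and is it aimed at port 53?". The ranges below are documentation-only placeholders standing in for the ISP block and the VPN server, and `is_allowed` is a hypothetical helper name.

```python
import ipaddress

# Hypothetical allowlist: documentation-only ranges standing in for the
# home ISP's address block and the personal VPN server's address.
ALLOWED_SOURCES = [
    ipaddress.ip_network("198.51.100.0/24"),  # stand-in for the home ISP range
    ipaddress.ip_network("203.0.113.10/32"),  # stand-in for the VPN server
]
DNS_PORT = 53


def is_allowed(source_ip: str, dest_port: int) -> bool:
    """Return True if a packet from source_ip to dest_port passes the DNS rule."""
    if dest_port != DNS_PORT:
        return False  # this sketch only models the port 53 rule
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_SOURCES)


print(is_allowed("198.51.100.42", 53))  # inside the stand-in ISP range
print(is_allowed("192.0.2.7", 53))      # an address from a different ISP
```

It also shows why the rule broke on the ISP change: the home connection's new external address simply falls outside every range in the list.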
There were other limitations too. The data PiHole could actually gather was limited: it would show either the external IP address of my home network or the external IP address of my VPN, so you couldn't get any more granular detail. That was something I wanted to improve on so I could identify any "weird" requests at non-standard times.
The other part of this diagram that I did not make clear is the VM sat on the homelab server in the home network. This runs Windows Server 2012 and acts as an internal DNS server for the home network, forwarding everything on to the remote server and failing over to 18.104.22.168 in the event connectivity is broken. This server does a load of other stuff, but in the scope of DNS that's all we actually care about.
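The forward-then-fail-over behavior can be sketched as "try each forwarder in order and use the first one that answers". This is a simplified model rather than how Windows DNS implements it internally; `pick_forwarder`, the reachability callback and both addresses are assumptions made up for the example.

```python
from typing import Callable, Optional, Sequence


def pick_forwarder(
    forwarders: Sequence[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first reachable forwarder, trying them in listed order."""
    for address in forwarders:
        if is_reachable(address):
            return address
    return None  # nothing answered, so the lookup would fail


# Hypothetical addresses: the remote PiHole first, a public resolver second.
FORWARDERS = ["203.0.113.53", "198.51.100.53"]


def remote_link_down(address: str) -> bool:
    """Simulate the link to the remote server being broken."""
    return address != "203.0.113.53"


print(pick_forwarder(FORWARDERS, remote_link_down))  # 198.51.100.53
```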
My goal now was to get more granular reporting of data and to be able to route DNS requests through PiHole again. There were a few other things I needed to change to facilitate this, and some other changes I was making anyway.
Windows Server Upgrade
One thing I had started working on a few months back was retiring my Windows Server 2012 boxes. Currently there are two on my home network: one running as the primary domain controller for the network (originally in a standalone configuration), and another that runs a file share set up to centralize all of our personal files across the network.
The original Windows Servers were set up when I was a student and use an academic license that is suitable for homelabs and the like. The setup only runs for me and my family and, to be honest, is overkill for what we really need, but as with all good homelab projects it's grown and morphed over time into more than it originally was.
My plan then was to migrate to Windows Server 2019, as it's the latest edition of Windows Server and removes the really, really bad "Metro style" that Windows Server 2012 had (something that makes it painful to use over RDP because the hosts aren't all that powerful). This work had already started, and before my recent upgrades my homelab server looked something like this in terms of logical servers:
DC-01 (running Windows Server 2012 R2)
- Active Directory Domain Services
- Windows Server Update Service (WSUS)
- DNS Resolution
- Windows Deployment Services (never configured as far as I can recall)
- Network Policy & Access Services (also not something I can recall explicitly configuring, though it might be part of another service)
- IIS (part of WSUS, I believe)
- File & Storage Services (part of the AD DS setup)
MASS-STORAGE-01 (also running Windows Server 2012 R2)
- File & Storage Services
DC-02
- Active Directory Domain Services
- File & Storage Services (again, part of the AD DS setup)
This was my first stepping stone, to make sure I had at least 2 actual domain controllers running, and that if I lost one I wouldn't be really screwed trying to re-create user accounts and all of that horrible stuff!
A later addition to DC-02, made after the initial build-out but before my recent major re-jigging of the home network, was DNS support. That was a bit painful to get syncing with DC-01, but it was critical given the long-term intention is to retire DC-01 from the network entirely. It was up and running before the most recent alterations to the network, meaning DC-01 and DC-02 both resolve the same addresses and forward any records they don't know to PiHole on the remote server, as per the diagram above.
The First Problem - DHCP & DNS
So, the first issue I had to solve was that the new router from the new ISP was dishing out its own DNS servers to all DHCP clients, and for the most part all our devices connect over DHCP and get dynamic addresses.
I was also aware that the issues we were having on the old router seemed to be DHCP related: after a few weeks, the DHCP service on the TalkTalk routers just seemed to give up trying and would cause issues. What this meant was my earlier diagram had turned into something like this.
Now, this may not look too bad; it's fewer hops, after all. But it does break a few things in my current setup, and some fairly important ones at that.
The first issue is that I can no longer resolve any of my internal DNS records. I have set up <mydomain>.local to operate within the house, and DC-01, DC-02 and MASS-STORAGE-01, along with some other odds and sods, use that. This meant the end-user PCs, which were formerly "domain joined" and on "domain networks", had a bit of trouble working out where they were on the network and whether they were domain joined or not. That had a knock-on effect: PCs were no longer able to pull updates from the WSUS server, which holds all of the update files (when you've got really rubbish internet, you want to trim down the updates you pull in; I'll write another post another day on why I went with WSUS and some of the things I do with it).
The next issue is that it skips PiHole in its entirety. While I knew going into this that I would need to adjust the OPNsense firewall on the server to allow Sky IPs through, it proved pretty pointless to do so given I couldn't actually set the DNS servers that the router's DHCP server would dish out.
Finally, it highlighted an issue I had semi-forgotten about: with the old routers being a bit rubbish, people had been setting their own addresses in their IP configuration. The move to the new router wasn't a major issue in itself, but things broke because people were either pointing at one of the Windows Servers, which then pointed at PiHole and got bounced off my firewall, or pointing directly at PiHole and also bouncing off. A solution was needed to standardize this and stop the mess from getting worse!
Roll Your Own DHCP
I decided the easiest solution was to roll my own DHCP server. There were a few options I considered, and much of this ties into my longer-term approach to my home network. However, I also wanted something that could fix the shorter-term problem without either totally screwing me in a few months if I do more upgrade work, or leaving me stuck with the non-PiHole DNS solution.
One solution I seriously considered was rolling my own pfSense or OPNsense router on the homelab server. I've done it before on the homelab, where it's configured to run a guest network (albeit one that's currently unplugged because I forgot the SSID password and couldn't be bothered to break back in). It's also the same solution running in the Hetzner environment with the remote homelab server, and something I've done for commercial projects before.
The issue with rolling my own router is that the server is nowhere near where the modem connection comes into the property, and it currently gets its networking through powerline plugs. These work fine for now but wouldn't work for what I'd need, and there's just no way I'm knocking down half the walls in my house to re-route it, or getting Openreach in to do the same. If and when we get FTTP connectivity I might reconsider, either putting a small server down there, even if it only runs the router, or just buying a decent router instead of the crappy ones the ISPs give you.
Another solution I considered but quickly discarded was running my own DHCP server on a CentOS or similar container within the homelab environment. While I've certainly got the capacity to run this, it didn't sit right with me to run DHCP standalone on a container. I've had trouble working out where the hell the DHCP servers lived on networks I've inherited before, and on one of them found three different conflicting DHCP servers. In that case moving to OPNsense was already planned and worked well, but in this scenario it's a longer-term aspiration that just wasn't achievable in the couple of days I gave myself to do the upgrades.
The solution I decided to go with was to add the DHCP feature to DC-02 on Windows Server 2019. I have dabbled once or twice with Windows Server's DHCP functionality, so I had a vague idea of how it worked, but it wasn't something I'd used in anger. Given I was going to need to run Windows Server for other things anyway, it felt like the right thing to do. Plus, given DNS was managed in Windows Server, my hope was that DHCP would link back into it so I could dynamically resolve PCs internally as I needed.
Put simply, I installed the DHCP Server as a new Windows feature and ran through the very quick setup to authorize it under my super user account (I have a "standard" account that I use for day-to-day work like typing this, and an "SU" or super user account that gives me admin access to the servers). Then all I needed to do was add an IPv4 DHCP scope to match my current DHCP allocation.
My router originally issued IPs between 192.168.1.2 and 192.168.1.200, so that anything over 200 could be assigned as a static resource, such as the domain controllers and other infrastructure I didn't want to spend an age hunting down in the future. My original intention was to issue reservations for the domain controllers, the mass-storage server and other things like PiHole once it was set up locally (spoilers, I know!). However, I seem to get some weird behavior when using reservations for anything other than simply reserving an existing DHCP lease. So within the address pool I have added a small range of excluded, non-assignable IP addresses, which will continue to be used for the DCs, as they don't appear to like getting a lease from their own DHCP server. Everything else will now get a dynamic lease, especially where DNS is dynamically updated.
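To make the scope-plus-exclusion idea concrete, here is a small sketch that works out which addresses stay assignable. The scope bounds come from the text above; the exclusion range is purely hypothetical (the real one isn't given), and `assignable_addresses` is an illustrative helper rather than anything Windows Server exposes.

```python
import ipaddress
from typing import List, Sequence, Tuple


def assignable_addresses(
    start: str,
    end: str,
    exclusions: Sequence[Tuple[str, str]],
) -> List[ipaddress.IPv4Address]:
    """List addresses in [start, end] that fall outside every exclusion range."""
    first = int(ipaddress.IPv4Address(start))
    last = int(ipaddress.IPv4Address(end))
    excluded = set()
    for range_start, range_end in exclusions:
        low = int(ipaddress.IPv4Address(range_start))
        high = int(ipaddress.IPv4Address(range_end))
        excluded.update(range(low, high + 1))
    return [
        ipaddress.IPv4Address(value)
        for value in range(first, last + 1)
        if value not in excluded
    ]


SCOPE_START = "192.168.1.2"    # from the post
SCOPE_END = "192.168.1.200"    # from the post
# Hypothetical exclusion range kept back for statically addressed servers.
EXCLUSIONS = [("192.168.1.10", "192.168.1.19")]

pool = assignable_addresses(SCOPE_START, SCOPE_END, EXCLUSIONS)
print(len(pool))  # 199 addresses in scope minus 10 excluded = 189
```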
So that's the DHCP server set up, with DNS set to point back to the Windows Server domain controllers and all of the DHCP clients set to point to them and pull DHCP from DC-02.
There are some limitations to note, though. At the moment I've not got a failover configuration set up for DHCP. Normally I'd set DC-01 and DC-02 up as a failover pair but, as I'll elaborate on later, DC-01 is being "sunsetted" and I don't want to commission new functionality on it. I also want to end up building a "new" DC-01 on Windows Server 2019 rather than creating DC-03 and having to rename it later.
I also haven't set anything up for IPv6. That's still handled by the ISP-issued router, and as my IPv6 knowledge isn't so great I've not messed with it at all. So far, though, DHCP is working well, and it's been interesting to see what's actually connected to the network.
I will also note at this point that I had forgotten PiHole supports running DHCP itself. I would likely have rejected that idea anyway, as I want to limit how critical the PiHole server is to my infrastructure. Right now, if that container died it would be a pain, but I could fairly quickly re-route all the DNS forwarding to hit Cloudflare instead of PiHole.
The PiHole Install!
The next job was to install PiHole locally, something I'd decided to do a few months back but never got around to. One of the issues I noted earlier was that the data I was getting back from PiHole was fairly coarse: it wasn't telling me which specific clients were making the requests. That interested me, as I wanted to be able to identify anything a bit weird and know which laptop, desktop, phone, tablet or other device was causing it. I already knew it was going to move into my home network, and this was the pattern I had in mind:
So I set about setting it up like this, and unfortunately hit a few annoying snags. The original design was a Proxmox container running CentOS 7 (mainly because I had the 7 template downloaded but not 8... I know, it's bad!) that would pull an IP from DHCP, which I would then add as a reservation to ensure it was always assigned that IP. I tried a few different settings but could never get CentOS 7 to pull from DHCP, so I bit the bullet, downloaded CentOS 8 and tried with that.
As I'm sure most of you can imagine, it didn't go any better, and I ended up having to give it a static IP and add it as an exception in DHCP. I will need to go back in the future and give it a more sane IP, because it's sat too low in the DHCP range for my liking and I stand no chance of guessing it in the future! I then installed PiHole using the super easy install script, did a bit of config to make sure it pointed to Cloudflare, and away we went.
At this point I jumped back onto DC-02 and using the Windows DNS console pointed the forwarder at my internal PiHole IP Address and sat back and watched...
That Moment I realised I screwed up...
Checking some news websites to trigger ad requests, I was pleased not to see any ads. A total success, time to shut the laptop down and go relax. But alas, checking the PiHole dashboard showed me something I was not expecting to see, but really should have been.
Can you guess it? It's really obvious if you've seen the diagram. Let me show you.
I had (rather stupidly) not realized that I had simply moved the problem back... Instead of seeing the hostname of my router's external IP address, I was now seeing the IP addresses of DC-01 and DC-02, as they were the servers actually making the requests to PiHole...
My (Again rather stupid) assumption was that as the request was simply being forwarded that it would be "On behalf of the client" and PiHole would have visibility on the client making the request, rather than it being that of DC-01 or DC-02.
This was thankfully a fairly easy fix. I logged back into DC-02 and updated the DHCP settings to hand out PiHole as the primary DNS server. I had originally also added DC-02 and DC-01 as additional servers in case PiHole went down. However, I decided that if I lost PiHole I'd probably lose the DCs as well, and it also meant adverts couldn't be reliably blocked, as clients would fail over to DC-02, which would then resolve the request through Cloudflare, which would obviously work.
I then needed to update the DNS settings in PiHole so that instead of routing directly out to Cloudflare, it would route to the DC-01 and DC-02 servers, which would then route the request on from there. Finally I needed to make the updates in DC-01 and DC-02 to route exclusively to Cloudflare rather than to PiHole.
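The difference between the two layouts comes down to which address each resolver sees as the requester: a forwarder re-issues the query itself, so the next hop only ever sees the forwarder. A toy model of that, with a made-up client label and a hypothetical `seen_sources` helper (only DC-02 is shown for brevity):

```python
from typing import Dict, List


def seen_sources(chain: List[str], client: str) -> Dict[str, str]:
    """Map each resolver in the chain to whoever it sees as the requester.

    Each hop only sees the machine that sent the query directly to it.
    """
    seen = {}
    previous = client
    for hop in chain:
        seen[hop] = previous
        previous = hop
    return seen


CLIENT = "laptop (192.168.1.50)"  # hypothetical client on the home network

# First attempt: client -> domain controller -> PiHole -> Cloudflare
first_attempt = seen_sources(["DC-02", "PiHole", "Cloudflare"], CLIENT)
# Revised layout: client -> PiHole -> domain controller -> Cloudflare
revised = seen_sources(["PiHole", "DC-02", "Cloudflare"], CLIENT)

print(first_attempt["PiHole"])  # DC-02
print(revised["PiHole"])        # laptop (192.168.1.50)
```

With PiHole as the first hop it logs the real client, while the domain controllers still get a chance to answer internal names before anything goes out to Cloudflare.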
This new setup seems to be working quite nicely, and would look something like this on my continued poorly drawn diagram:
In terms of DNS, this has solved the majority of my initial problems and the PiHole is now up and running.
There have been some snags that I'm looking at longer term, the main one being that PiHole doesn't appear to be able to resolve hostnames from the IP addresses of the hosts, despite being configured in a way that should work. I need to do some more digging to work out quite what's not playing ball!
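For context on that snag, turning an IP address back into a hostname is normally done with a reverse (PTR) lookup, so one thing to check is whether the upstream DNS server actually answers the PTR name for a client. This is a debugging aid rather than a diagnosis of the problem here; the sketch only builds the name that would be queried, and the client address is made up.

```python
import ipaddress


def ptr_name(ip: str) -> str:
    """Return the reverse-lookup (PTR) name for an IP address."""
    return ipaddress.ip_address(ip).reverse_pointer


# Hypothetical client address inside the home DHCP range.
print(ptr_name("192.168.1.50"))  # 50.1.168.192.in-addr.arpa
```

If a PTR query for that name against the domain controllers comes back empty, the gap is likely in the reverse lookup zone rather than in PiHole itself.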
The next major improvement planned is the wireless access point on the network. The original router from TalkTalk had a fairly respectable wireless access point, but unfortunately the NowTV replacement is somewhat lacking, and in the rooms a bit further away from the router it's quite a struggle. I've got a Cisco Meraki router that was flashed with OpenWrt and is currently configured for my guest network; this will be re-purposed to take on the entire home network's wireless needs.
As alluded to earlier, I also want to retire the current DC-01 and rebuild it as a new Windows Server 2019 instance. Linked to this, I will need to build a new dedicated Windows Server 2019 instance to move WSUS to, as it is not designed to run on the same host as a domain controller and has caused some resource issues I had not expected.
To make the above happen I will also be upgrading the hardware of my homelab server. I purchased some additional RAM some time ago but never got around to shutting the server down to install it, so that will help it cope with the additional servers and any other lab servers I might need going forward.
Another upgrade I've been putting off is the storage. Currently I've got a number of 2.5" HDDs running in a RAID (5, I think). However, the RAID controller doesn't seem to support expanding the total capacity of the array without potentially destroying the array entirely, something I would rather not do given it holds important data that I'd rather not be losing. A few months ago I purchased an HP MicroServer that has a few 3.5" HDD bays in it. The move to 3.5" drives was another thing I had wanted to do, given they tend to be significantly cheaper, and currently my homelab is made up of a number of old PlayStation or laptop hard drives that were removed when upgrading to SSDs. I may upgrade all of the 2.5" drives to SSDs eventually, but that's not likely to happen in the near future.
The plan for the HP server is to bring it online as a storage-only server. That will allow me to migrate all of the current content on to it and then expand the array on the main homelab server. I need to check the exact specs of the HP server, as ideally I would move all of the "core" infrastructure (DC-01, DC-02, MASS-STORAGE-01 and PiHole) on to it and only turn on the larger server when I need to run shorter-term projects, but I'm not convinced I'll have capacity on the MicroServer either way.
Longer term, I'm also looking at re-purposing my old tower PC to run as a server. I've currently got a rack-mount server that's not really ideal for the setup I've got, and I'd like to retire it fully in favor of a tower server that can actually sit nicely under my desk rather than perched on a set of drawers!
So that about sums up my current homelab. I'd love to know what you guys think, and if you've got any suggestions please give me a shout! I'm @Wild1145 over on Twitter, so give me a yell there with any thoughts or suggestions.