diff --git a/content/homelab-early-2021.rst b/content/homelab-early-2021.rst
new file mode 100644
index 0000000000000000000000000000000000000000..d649d651469f038d580b8790e6322ae2a705714e
--- /dev/null
+++ b/content/homelab-early-2021.rst
@@ -0,0 +1,133 @@
+shore.co.il infrastructure - May 2021 edition
+=============================================
+
+:date: 2021-05-08
+:summary: Description of the shore.co.il infrastructure as it exists in May
+    2021.
+
+Hardware
+--------
+
+The hardware I'm using consists of:
+
+* Netgate SG-2440 running OpenBSD.
+* Linksys EA6350 running OpenWrt.
+* ASrock N3150-NUC running Debian.
+* An online.net (now Scaleway) Dedibox running Debian.
+* An ADSL modem provided by my ISP.
+* A purpose-built PC in the living room running Debian.
+* APC UPS (I don't remember the exact model and I don't feel like going to
+  check).
+
+The OpenBSD box is ``ns1.shore.co.il``. It's the router for the home network,
+the primary DNS server for the ``shore.co.il`` zone, the DNS resolver for the
+network and the DHCP server, and it runs HAProxy. The local Debian box has a
+0.5TB drive and hosts the LDAP server, the mail server, Nextcloud and GitLab. I
+store all of my private information on it, encrypted. I take off-site backups
+about every 2 weeks and store them at my mom's house (also encrypted). The
+living room PC runs Kodi, Transmission and my podcast downloader. It has a
+magnetic 8TB drive that holds music, movies and other such media. The OpenWrt
+box is the wireless access point. Lastly, the Dedibox is a bare metal instance
+which obviously has a faster internet connection than I have locally. It runs
+this blog, a few other web sites, the container registry and the secondary DNS
+server for ``shore.co.il``, and most GitLab CI jobs run on it. It also runs my
+workbench (a container that has all of the tools I need and use), which benefits
+from the faster internet connection, faster drives, abundant memory and beefy
+CPU. The data drive is also encrypted, but not backed up (everything on it can
+be recreated in less than a day).
+
+Deployments
+-----------
+
+Initial setup is done using Ansible. Services on the different Debian boxes run
+in Docker containers and are deployed using GitLab. The other OSes are
+maintained using just Ansible. There's no redundancy since it would cost more
+than I would like to spend. There's no infrastructure-as-code (no Terraform)
+since the only thing I could code is the single Dedibox instance (and it wasn't
+supported by the Scaleway provider last time I checked). All of the Debian
+instances run a GitLab Runner in a Docker container with access to the
+``dockerd`` socket, so the runners can create containers, run jobs in them and
+build and push images. I know this is a security risk, but since only I use this
+GitLab instance I'm not worried about it. The deployments are pretty consistent:
+projects have a ``docker-compose.yaml`` file and the GitLab Runner runs
+``docker-compose build``, ``docker-compose pull`` and ``docker-compose up`` to
+deploy services. All of the code is in the `Shore group in my GitLab instance
+<https://git.shore.co.il/shore>`_. The templates for the CI pipelines that I use
+are also in `my GitLab instance <https://git.shore.co.il/shore/ci-templates>`_.
+
+Security
+--------
+
+Most services are only available over SSL (apart from services that don't
+support it like DNS, DHCP, etc.). I'm using Let's Encrypt to issue globally
+valid certificates. In my home network this presented a problem. I wanted to
+have multiple hosts using SSL behind a single IP address without relying on the
+router to decrypt the traffic. I wanted the traffic to remain encrypted until
+it reaches the host and the certificates to be globally valid. For that I run
+HAProxy, which uses SNI to identify the requested host and forwards the traffic
+accordingly.
+
+The SSH servers all have rate limits and only allow public key authentication. I
+rely on SSH keys to authenticate and log in instead of LDAP. I prefer it since
+I'm not locked out entirely if the LDAP server is down. Furthermore, the only
+services that use the LDAP server are on the same host and connect to it via a
+Unix socket, further securing the access. The LDAP server is not available on
+the network.
+
+For regular tasks like renewing the SSL certificates or updating the hosts I
+have written Ansible playbooks. I routinely rotate the SSL keys and also the DH
+parameters. I run the playbooks manually from my laptop since I don't want the
+hosts to apply these changes automatically themselves. Also, rebooting the NUC
+and Dedibox requires a manual step to unlock the encrypted drives. I can update
+all of the hosts, rebuild all of the container images and deploy in about 2
+hours with very little interaction, and I do so every few weeks.
+
+Changes from the previous iteration
+-----------------------------------
+
+There are a few changes in the infrastructure since the last version. First,
+I used to have an EC2 instance instead of the Dedibox. The Dedibox costs more,
+but the time saved by the beefier instance and the CI improvements, and being
+able to run VMs on it, are worth it.
+
+I used to use Ansible for everything, including deploying services, which I now
+do with Docker Compose and GitLab runners. The development workflow with Docker
+is easier and faster than using Ansible along with Vagrant and Molecule. I can
+honestly say that I'm surprised more people don't do this. It's really
+easy, simple, secure and reliable. I find that this approach is useful for
+simple setups like mine (or dev or QA environments), especially since the same
+Docker Compose setup can be used for local development.
+
+The addition of the Nextcloud and GitLab services has made me entirely
+self-hosted. I use online.net and a DNS registrar but apart from that I don't
+rely on any external service and I hold all of my private data (except that most
+of the emails I send end up on Google's servers anyway). My sites are indexed by
+Google, Bing and Yandex but I'm not sure what I can do about that.
+
+Future improvements
+-------------------
+
+The UPS isn't supported by NUT or anything else available on Linux or OpenBSD. I
+will replace it sometime in the future with one that is, so that I can trigger a
+clean shutdown when there's a power outage and the battery is running low. Also,
+I would like to replace the NUC with a newer and faster one.
+
+I have external monitoring on services but I've yet to set up internal log
+aggregation or metrics collection. I plan on setting up an EFK stack (I have
+some POC code lying around but I need to update it and bring it in line with
+the rest of the infrastructure). I also want to investigate Sensu for running
+checks locally (a Nagios replacement), and I have my eye on Testinfra for the
+host checks.
+
+I'm using Z-Push along with Nextcloud for ActiveSync but it doesn't work with my
+phone so I want to evaluate SOGo as a replacement. I want to try the
+Dropbear-initramfs integration so I can unlock encrypted drives remotely over
+SSH. I want to replace the Docker-based workbench with the toolbox project.
+
+I've avoided using a VPN so far and I don't want to go down that route. But I've
+been in very closed networks so I want to set up a WebSocket proxy to my SSH
+server on the Dedibox so I can connect over port 443 and tunnel out from such
+networks.
+
+Lastly, I see XMPP, Matrix or Mastodon (or maybe 2 of them) in the future for
+secure and self-hosted chatting with friends.