diff --git a/content/db_resources.rst b/content/db_resources.rst
new file mode 100644
index 0000000000000000000000000000000000000000..77b4b761469d34105769e936d2c0b9422807c0bc
--- /dev/null
+++ b/content/db_resources.rst
@@ -0,0 +1,73 @@
+Database resource handling
+##########################
+
+:date: 2021-05-08
+:summary: Handling and versioning of database resources
+
+A few years ago I was working at a company where part of the product was an
+online encyclopedia. The content was curated by a team in the company using a
+homegrown CMS. Deployments of it were done by dumping the content of a few
+database tables, zipping them up, uploading the archive to S3, and then, in
+the production database, dropping those tables and restoring them from the
+contents of the zip file in S3.
+
+This was a Laravel app and deployments were done using Ansible. We had Ansible
+code for that, but the database dump and the upload to S3 were manual, and the
+configuration lived in the repo where the Ansible roles were maintained, not
+the repo where the app was developed.
+
+I know most people would jump up and say that this causes downtime that can be
+avoided, but this was the way it was done when I joined. It avoided the need
+to care about database migrations for those tables. And given that this
+company was based in Israel and we worked on Sundays, nobody cared that the
+entire product (not just the encyclopedia) was down for ~10 minutes on
+Sundays. As a side note, you'd be surprised at how many Israeli startups do
+weekly deployments on Sundays.
+
+This caused a few issues for the company. First of all, developers had
+individual instances on Azure that were initially set up by the Ops team. At
+the time we had Azure credits, but those were going to run out in a few months
+and we wanted to move developers to work entirely on their local machines.
+
+Another issue was that the process of updating the resources was involved and
+required somebody from the Ops team. It meant that once a breaking change was
+merged to the master branch of the application, tests would break for all
+developers until the database resources were updated.
+
+Yet another downside was automated testing. Previous attempts at a CI pipeline
+required sharing the database between all of the jobs, the database had to be
+maintained by the Ops team, and jobs failed whenever a breaking change was
+deployed.
+
+The solution I came up with was extending the Laravel app to do most of the
+work. First I added a new config array for the version of the resources we
+currently want. IIRC we settled on a format of year plus week number. Then I
+wrote a new Artisan command to download the zip file, drop the tables, and
+restore them from the zip file. Laravel offers a way to save what it calls
+system variables in the database, so the currently deployed version was stored
+there, and the command could skip the whole process when the current and
+wanted versions matched, making it usually faster and idempotent.
+
+With the command in place, it was added to all of the relevant places.
+Developers maintained a short script that ran all of the tasks needed for
+updating the app (like database migrations, SCSS builds, and Gulp builds), so
+the command was added to that script. On the deployment side, a few Ansible
+tasks were replaced with a single task that runs the command.
+
+The developers maintaining the CMS added the functionality to upload a
+database dump of those tables, exposed it in the admin UI, and the loop was
+complete.
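+
+I no longer have that code in front of me, but a minimal sketch of the Artisan
+command could look roughly like the following. The ``resources.version`` and
+``resources.tables`` config keys, the ``db:resources`` command name, the
+``system_variables`` table, and the S3 path layout are all invented for
+illustration, and the plain query-builder bookkeeping here stands in for the
+system variables mechanism mentioned above.
+
+.. code-block:: php
+
+    <?php
+
+    namespace App\Console\Commands;
+
+    use Illuminate\Console\Command;
+    use Illuminate\Support\Facades\DB;
+    use Illuminate\Support\Facades\Schema;
+    use Illuminate\Support\Facades\Storage;
+
+    class RefreshDbResources extends Command
+    {
+        protected $signature = 'db:resources';
+        protected $description = 'Sync the encyclopedia tables to the configured version';
+
+        public function handle()
+        {
+            // e.g. '2016-23' for year plus week number (hypothetical format)
+            $wanted = config('resources.version');
+            $current = DB::table('system_variables')
+                ->where('name', 'db_resources_version')
+                ->value('value');
+
+            if ($current === $wanted) {
+                $this->info("Resources already at {$wanted}, nothing to do.");
+                return;
+            }
+
+            // Fetch the dump for the wanted version from S3 and unpack it
+            $zipPath = storage_path("app/resources-{$wanted}.zip");
+            $contents = Storage::disk('s3')->get("db-resources/{$wanted}.zip");
+            file_put_contents($zipPath, $contents);
+
+            $zip = new \ZipArchive();
+            $zip->open($zipPath);
+            $zip->extractTo(storage_path('app/resources'));
+            $zip->close();
+
+            // Drop the content tables and restore them from the extracted dump
+            // (assumed to contain a single SQL file named after the version)
+            foreach (config('resources.tables') as $table) {
+                Schema::dropIfExists($table);
+            }
+            DB::unprepared(file_get_contents(storage_path("app/resources/{$wanted}.sql")));
+
+            // Record the version we just restored so the next run can skip all of this
+            DB::table('system_variables')->where('name', 'db_resources_version')->delete();
+            DB::table('system_variables')->insert([
+                'name' => 'db_resources_version',
+                'value' => $wanted,
+            ]);
+
+            $this->info("Restored database resources {$wanted}.");
+        }
+    }
+
+With a command like this in place, the developer script, the Ansible task, and
+the CI job all boil down to running ``php artisan db:resources``.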
+
+The benefit of this approach was the developer workflow. Instead of working
+with the Ops team to update the resources, developers would just update the
+config to point to the new version. It allowed downgrades in cases where we
+wanted to revert to an older version. It made the PRs much clearer to
+everybody. The Dev team was self-sufficient, and the Ops team had one less
+thing to worry about. We were one step closer to having developers work on
+their local machines (using Docker and Docker Compose) instead of those
+instances in Azure. We were using Bitbucket, which had just launched the open
+beta of Bitbucket Pipelines, and I was able to set up automated testing that
+was completely independent for each run. Overall, a lot of upsides and time
+saved for about 200 lines of PHP code (it took me around 2 days since it was
+the first time I had written more than 2 lines of PHP, and the code review had
+a lot of issues for me to address).