Development on a Budget
Published: April 20, 2020

Recently I volunteered to create a website for an organization with which I'm affiliated. The organization provides a number of communal educational opportunities. Most of the lectures given are recorded, and we wanted to create a site to host all of them. The goal was to enable those in attendance to review previous material, as well as to let those who were unable to attend catch up on what they missed. The spur for this was one particular series that is going to be ongoing, with a new segment released every day. The lectures got underway, and before I volunteered to make the website, they were being distributed via email and WhatsApp.
I blogged previously about 'blogging on a budget'. That concept was inspired by two different things. I had read about "Code Golf"; while I found it very interesting and some of the solutions very creative, I could never really get into doing it myself. I've done code katas and other coding challenges before, but had a hard time sustaining momentum, as they all felt very contrived and repetitive. There are a number of blogs that I follow regularly, one of them being Troy Hunt's. He had an article about the costs involved in running his wildly popular website HaveIBeenPwned. This really piqued my interest, and while I'm nowhere close to his volume in terms of data or traffic, I appreciated the idea of seeing how far one could stretch a dime when building a website. The blog this article is on was my first attempt at putting these concepts into practice. Since this new website was being done on a volunteer basis, I thought it would be a good second.
After I had completed the initial launch of the website (done in roughly 1.5 weeks in my free time), some friends asked how it was done. I figured documenting it on my blog would be a great start, so here we go…
As discussed in the post about re-doing my blog on the cheap, GitHub Pages has been a very solid platform for me. It is easy to set up, SSL comes along for free, and it serves sites very quickly. I didn't want to use Jekyll this time, as I had run into too many curve-balls with it previously. I decided to use Aurelia v2 for the front end. It's a framework I've used in the past and really enjoyed. It has a very small following, but it is clean, mostly straightforward, and fast. I have used Angular exclusively for my day job for the last several years and wanted to kick the tires on something else. I've tried a few times, but never really been able to wrap my head around React. I've played with Vue before, but there seems to be a bit of turmoil in that space as of late, and I really like that Aurelia and Angular are more full-fledged frameworks. I don't have to go find a routing library, or any of the other pieces of a typical front-end code-base that aren't purely components.
With those decisions in place, I needed to figure out more of the backend. The original audio files I was given were sometimes mp3s, sometimes mpegs, usually ~30 min to 1 hr long, and anywhere from 20-50MB each. I wanted to be able to upload the files, create something that would automatically shrink them (saving end-users bandwidth), add metadata tags to them (for a nicer experience when opening the files in an audio player application), and update the website to reference them. Short-term nice-to-haves also included automated email/text and potentially WhatsApp notifications. Slightly longer-term goals included creating and updating an RSS/podcast feed (or feeds). I also wanted some insight into how much use the application was actually getting.
Given the above, GitHub Pages would serve my static files, but I would need something to store, manipulate, and retrieve the audio files. For that I decided to use Azure. I'll discuss the specifics of this decision later in the article.
The first thing I did was buy a domain. To be clear, you don't need to buy a domain, especially if you host via GitHub Pages, but I enjoy giving sites a brand of their own. There are a number of vendors that provide this service. I knew that I was going to handle hosting through GitHub, so I was solely looking for a cheap place to buy a domain. The vendor I found referenced in a few places as a good, cheap choice was NameCheap. I've used Google Domains in the past for this, but when pricing out the domains that interested me, NameCheap was consistently cheaper.
Once this was complete, I did a number of quick website-y initial setup things to get the ball rolling.
- I set up a repository on GitHub for the project.
- I enabled GitHub Pages.
- I created the CNAME file for the site to enable using my custom domain name.
- I then added the DNS entries in NameCheap to point my domain at GitHub's servers.
- I enabled HTTPS on the website in GitHub.

As I've documented these steps in my article on refreshing this blog, and they are well covered in the GitHub Pages documentation, I won't go into more detail on them here.
At this point, one could argue that I developed things out of order. Usually when working with clients, I try to drive the conversation towards determining what the minimum viable product is and how we can get there in the shortest amount of time. This lets you get to market quickly (or at a minimum in front of potential users quickly), receive feedback, and iterate on the product to suit the end users' needs. Seeing as the audio files were being recorded and delivered to me daily, the MVP-driver in me would say to build the front-end web app to expose those files to consumers quickly and then iterate from there. That is NOT what I did, though. While I'm in favor of not prematurely optimizing applications, I'm also not a proponent of ignoring those concerns completely and kicking them down the road to bite me later. Balancing these competing concerns is something every dev has to grapple with on a regular basis. In this instance, the calculation was pretty clear to me. A number of people using the site were going to be doing so from a mobile device, and they were going to want to keep on top of the material, as new content was released daily. The size of each file, on average, was 40MB. Looking at one month's worth of lectures, that would come out to 1.2GB of content, which could be quite the whopper for someone's data plan. What could be done?
I'm not an expert with digital audio (I've only dabbled here and there), but I have used a free product called Audacity to perform some basic file manipulation in the past, so I downloaded it and got to work seeing what I could do about the file size. In pretty quick order I saw that, because the files were only someone speaking, I could convert them from stereo to mono and reduce the bitrate quite a bit, producing a file roughly 1/4 of the size with no noticeable reduction in sound quality. Once I realized this, and the impact it would have on the end user, I was hooked on trying to find a way to automate this process.
While it seemed as though there were a few libraries out there that should be able to accomplish the above audio file manipulation, I wasn’t able to figure them out in the time I had allotted for this task. I did find many references to an executable called FFMPEG. I downloaded it and fired up Visual Studio to see if I could get it to run as a process and do the dirty work for me. After a number of iterations and A LOT of bingle-ing, I found the correct incantations. At this point, I could have left it as such and run this program daily to shrink the file sizes before uploading them to the cloud. I really wanted this process to be almost entirely automated though and eventually wanted to allow the lecturers to upload their lectures directly through the website. I decided to push my luck, take things a step further, and put this process into an Azure Function.
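The post doesn't show the exact incantation, so here is a minimal sketch of the approach, assuming a C# wrapper around an ffmpeg binary on the path; the file names, bitrate, and metadata values are illustrative:

```csharp
using System;
using System.Diagnostics;

public static class AudioShrinker
{
    // Shrink a spoken-word recording: downmix to mono, lower the bitrate,
    // and stamp basic metadata tags. Bitrate and tag values are illustrative.
    public static void Shrink(string inputPath, string outputPath)
    {
        var args = string.Join(" ",
            $"-i \"{inputPath}\"",
            "-ac 1",                              // stereo -> mono
            "-b:a 48k",                           // cap the audio bitrate
            "-metadata title=\"Lecture 1\"",      // illustrative tag values
            "-metadata artist=\"Lecturer Name\"",
            "-y",                                 // overwrite the output if it exists
            $"\"{outputPath}\"");

        var psi = new ProcessStartInfo("ffmpeg", args)
        {
            RedirectStandardError = true,         // ffmpeg writes progress to stderr
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(psi))
        {
            var log = process.StandardError.ReadToEnd();
            process.WaitForExit();

            if (process.ExitCode != 0)
                throw new Exception($"ffmpeg exited with {process.ExitCode}: {log}");
        }
    }
}
```

The `-ac 1` flag does the stereo-to-mono downmix and `-b:a` caps the audio bitrate, which is where the bulk of the size reduction described above comes from.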
Why Azure Functions? I've utilized Azure Functions in the past for a number of different things and have found them a pleasure to work with. Since the focus of this article is doing things on the cheap, let's talk about pricing. To be very honest, I don't fully understand their pricing structure, but I do know that I've created functions in the past on the "consumption plan". This is where they calculate the 'compute' your function uses into some measurement and then charge you per unit of that measurement. You get quite a number of those units free each month, to the extent that I have yet to reach the point where I'm being charged for the functions I've created. Azure DevOps has integrations with Azure Functions and GitHub as well. This allowed me to set up an automated build when I check code in to GitHub, which then automatically deploys the changes, upon a successful build, to the Azure Function…all for free.
There are also two, free tools Microsoft has produced that make this experience even better. Azure Storage Emulator and Microsoft Azure Storage Explorer. These two tools allow you to create an Azure Function entirely locally on your machine. The function I had envisioned was one that would listen to blob storage. Anytime a new file got dropped into the container, my function would fire, manipulate the audio file to reduce its size, add metadata to it and then upload it again into another blob storage container which would be exposed to the end-users through the website.
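Here is a minimal sketch of how such a blob-triggered function can be declared, assuming the in-process C# model; the container names and function name are assumptions, and the actual audio work is elided:

```csharp
using System.IO;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class ProcessNewLecture
{
    // Fires whenever a new blob lands in the 'raw-audio' container and
    // writes the processed result to 'public-audio'. Both container
    // names (and the function name) are illustrative.
    [FunctionName("ProcessNewLecture")]
    public static void Run(
        [BlobTrigger("raw-audio/{name}")] Stream input,
        [Blob("public-audio/{name}", FileAccess.Write)] Stream output,
        string name,
        ILogger log)
    {
        log.LogInformation($"Processing {name} ({input.Length} bytes)");

        // The real function would shell out to ffmpeg here (see the
        // earlier sketch) to downmix, re-encode, and tag the file.
        // For this sketch, the bytes are just copied through unchanged.
        input.CopyTo(output);
    }
}
```

The Storage Emulator stands in for the real storage account locally, so dropping a file into the emulated `raw-audio` container with Storage Explorer is enough to exercise the whole trigger end to end before anything is deployed.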
While I was in Azure-land, I set up one other, very small function. It is an HTTP trigger function, which means it is exposed via a URL. It is a GET function that takes a string and returns an audio file (a rough sketch is included after the list below). If you are familiar with Azure Storage, you might wonder why I would structure things this way. If you are not familiar with Azure Storage, the reason to question this decision is that Azure Storage has different levels of exposure you can specify for your blobs. I could have just exposed all of them publicly and had the website reference those URLs directly. There are a handful of reasons why I chose the route I did:
- I wanted to see every time someone requested one of the files. Application Insights is another free offering on Azure and it is integrated very nicely with Azure Functions.
- I wanted to de-couple the front-end request from the backend response. If I directly exposed the URLs, future migrations would be more difficult, as I would have to maintain those URLs and their relationships to the UI. My initial implementation had the GET request string strongly resemble the Azure Blob Storage URL, but I could see, as the website grows, the desire to move to something a little more dynamic than that.
- Not all of the files that were uploaded on the backend should be exposed publicly, and these access settings apply to the container as a whole, not to each individual file.
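One way to wire that up in the in-process C# model, binding the blob directly off a query-string parameter, is sketched below; the query parameter, container name, and function name are all assumptions:

```csharp
using System.IO;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

public static class GetLecture
{
    // GET /api/GetLecture?file=<name>
    // Maps a request string to a blob in the processed-audio container.
    [FunctionName("GetLecture")]
    public static IActionResult Run(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get")] HttpRequest req,
        [Blob("public-audio/{Query.file}", FileAccess.Read)] Stream blob,
        ILogger log)
    {
        // Every request flows through here, so it shows up in
        // Application Insights automatically.
        log.LogInformation($"Lecture requested: {req.Query["file"]}");

        if (blob == null)
            return new NotFoundResult();

        // Stream the audio back with a content type the browser's
        // audio player can consume directly.
        return new FileStreamResult(blob, "audio/mpeg");
    }
}
```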
Once all of this was done, I felt as though I had an “MVP” back-end and wanted to do the same for the front-end.
As the initial thrust of this website was the daily lecture series, I focused on creating (literally) a single-page app that could display the information about the series reasonably well and play an audio file. I started with the 'cli' setup. Created one component. Added some style. And…that was it. It was ~500 lines of code total, half of them being css. I added another site entry into my Google Analytics account and copied the relevant js code into the index.ejs file so that I would get additional insight into my users: what types of devices they visit the site on and where they are accessing it from around the globe. The only other, semi-interesting, decision I made at this point was to put all of the data for the lectures into a static json file in GitHub. I didn't want to pay for a database of any kind and didn't really want to go through the development overhead of needing to configure/code it all. In retrospect, it probably would have been slightly better to put that behind an Azure Function as well, but…I didn't :).
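For illustration only (the post doesn't show the actual schema), a static data file like that might be shaped something like this, with every field name being hypothetical:

```json
[
  {
    "title": "Lecture 1",
    "speaker": "Lecturer Name",
    "date": "2020-04-01",
    "file": "lecture-001"
  },
  {
    "title": "Lecture 2",
    "speaker": "Lecturer Name",
    "date": "2020-04-02",
    "file": "lecture-002"
  }
]
```

The idea is that the front end loads this file at startup and passes something like the `file` value to the audio GET function described earlier.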
Now it was time to launch. As mentioned above, all of this was accomplished in ~1.5 weeks of free time. The site was launched and started getting use. So far, the site is costing me ~40¢ a month to run. Within a few days, I got a few feature requests, and I had a queue of clean-up/enhancement items I wanted to tackle for the ongoing maintainability and extension of the site. I was pretty happy with the turnaround time and running cost, but there was still work to be done.
Post-launch, there were a number of things I wanted to tackle in short order to reduce the daily work of maintaining the site. The top two were:
- Automate posting new data to GitHub.
- Automate emailing individuals to let them know when new lectures are available.
As this post is getting a bit lengthy, and since there were a few other iterations I made in short order after launch, I'll save the details of those for another post!