Site Monitoring, Plausibly

I’m on my annual sojourn to Squam this week, listening to the lake and the loons. I’ve got a few personal projects I’m hoping to work on while here (in addition to the socializing and swimming and such). For instance, I’ve been meaning to move away from Google Analytics for monitoring site traffic. No diss on the quality of the tool, but philosophically I’ve been increasingly uncomfortable with handing my site’s viewership information over to Google for their own purposes. I do still want some level of monitoring, though: site stats aren’t strictly necessary, but I’m enough of a stats geek that I like to occasionally see which pages are getting views, which links are getting clicked, and so on.

There are a couple of different open source solutions out there that serve this purpose. Matomo is a popular one (it was previously named Piwik and has been around for quite some time), but it seemed excessive for what I wanted. Personally, I ended up landing on Plausible. It’s fairly lightweight, fast, GDPR compliant, doesn’t collect PII, and I can self-host it. Setup was a little bit of a pain, but not an insurmountable one. To be clear: installation itself is easy, since it’s just a Docker container, but the container only speaks plain HTTP, not HTTPS. That meant I needed to set up a reverse proxy to serve it over a secure connection, or else browsers’ mixed-content protections would block the script (they don’t like loading non-secure resources on a secure page). But hey, good opportunity to learn something new!
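In case it helps anyone attempting the same thing, here’s roughly what the reverse-proxy half looks like. Treat it as a sketch rather than my exact config: it assumes Apache on a Debian/Ubuntu box, a hypothetical stats.nadreck.me subdomain, and a Plausible container listening on localhost port 8000 (check your own Docker setup for the real port).

    # Enable Apache's proxy modules (Debian/Ubuntu module names).
    sudo a2enmod proxy proxy_http

    # Minimal vhost that hands the stats subdomain off to the Plausible container.
    sudo tee /etc/apache2/sites-available/stats.conf > /dev/null <<'EOF'
    <VirtualHost *:80>
        ServerName stats.nadreck.me
        ProxyPreserveHost On
        # Leave the ACME challenge path to Apache so cert issuance still works.
        ProxyPass /.well-known/acme-challenge/ !
        ProxyPass / http://127.0.0.1:8000/
        ProxyPassReverse / http://127.0.0.1:8000/
    </VirtualHost>
    EOF

    sudo a2ensite stats.conf
    sudo systemctl reload apache2

    # Then let certbot bolt HTTPS onto it (more on certbot below).
    sudo certbot --apache -d stats.nadreck.me

Once the proxy answers over HTTPS, the tracking script stops tripping mixed-content protections in the browser.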

Certbot Gotchas

As a project this week, I’ve been doing some backend maintenance for my web hosting, which includes getting everything set up with SSL certs through Let’s Encrypt. (The writing is on the wall: most sites that can switch to HTTPS should switch to HTTPS. Not just for the added security for you and your visitors, but also because browsers and search engines are starting to warn users when a site is NOT secure.)

Thankfully, the days of having to pay an arm and a leg for a cert have passed. Let’s Encrypt is a free, non-profit service (run in partnership with other orgs including the EFF, Mozilla, and Akamai, to name a few) that issues short-term (90-day) SSL certificates for your site at no cost. (Larger organizations may still want to throw down on longer-term certs, but for personal use this is great.)

Using the certs is pretty straightforward: they’ve created a tool that runs on your web server called certbot, which streamlines issuance and also monitors your certificates and automatically renews them when they’re close to expiration. Installation is just as easy: certbot is available via the usual package managers (apt and the like), so chances are good that whatever OS your server runs can install it without fuss.
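For reference, the whole dance on a Debian or Ubuntu box running Apache looks roughly like this (package names differ slightly on other distros, so treat it as a sketch):

    # Install certbot plus the Apache plugin (Debian/Ubuntu package names).
    sudo apt install certbot python3-certbot-apache

    # Request a certificate and let certbot edit the Apache config for you.
    sudo certbot --apache -d example.com -d www.example.com

    # Renewal is handled automatically (systemd timer or cron job); this just
    # confirms the renewal process works without touching the real cert.
    sudo certbot renew --dry-run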

That said, there are still a few gotchas that I felt got glossed over in the docs I was reading and following, so here are a few notes:

  • Be explicit about your domains: certbot can only issue wildcard certificates (i.e. *.nadreck.me covering any subdomain like images.nadreck.me) through a DNS challenge, which the standard Apache setup doesn’t use. Instead, list every domain you want on the shared certificate, and that includes both the www and non-www versions. So, for example, you might do something like sudo certbot --apache -d nadreck.me -d www.nadreck.me. If you don’t include both, anyone visiting your site via the address you left out may get the site blocked by their browser and a warning about something potentially fraudulent.
  • If you already generated a certificate for a domain and need to update it (maybe you added a subdomain, or forgot to add www), the command you’re looking for is --expand. (I would have thought “update”, but no.) Note that when expanding a certificate, you need to re-list every domain you want included, not just the one you’re adding. So, using nadreck.me as an example again, if I wanted to add images.nadreck.me to the existing cert, I’d run sudo certbot --expand -d nadreck.me -d www.nadreck.me -d images.nadreck.me. (There’s a short command sketch after this list.)
  • Keep it separated: the certs are free, so there’s no need to overload one cert with a ton of domains. While it makes a certain amount of sense to bundle a domain and its subdomains together, there’s no need to make one cert for all your sites. criticalgames.com shares a cert with nadreck.criticalgames.com, but not with nabilmaynard.com, if that makes sense.
  • You can’t preemptively add sites to a cert. Certbot/Let’s Encrypt performs a challenge-response check as part of the process to verify you actually control the site you’re requesting a cert for (it places a token on your server and then fetches it over the web), so if the site or subdomain isn’t actually set up yet, the challenge will fail and the cert won’t be issued. If you wanted to add, say, files.nadreck.me to your certificate, you’d need to set up that subdomain first, then expand your certificate. (The site can be empty, but the URL needs to resolve and land somewhere real.)
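To make those last couple of points easier to act on, here’s the short version as commands (domains are placeholders; the first command tells you what you’re actually working with):

    # List existing certificates and the domains each one covers.
    sudo certbot certificates

    # Expand an existing cert to cover a new subdomain; every domain you still
    # want on the cert has to be re-listed, not just the new one.
    sudo certbot --expand -d nadreck.me -d www.nadreck.me -d images.nadreck.me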

Anyway, hope that helps! The process really is pretty straightforward, and I recommend it to anyone maintaining a website these days.

Two Content Columns, No Sidebar

As I’m sure you’ve noticed, there is no sidebar on this site. (The closest we come is the widget space in the footer.) Instead, we’ve got an extra-wide window with two columns of content. This was something that grew out of sitting in the Post Formats session at WordCamp Portland 2011 (liveblog transcripts found here and here). Basically, post formats allow you to format different types of posts in different ways (similar to how Tumblr works).

If you’re already sorting content by type, why not take it a step further and sort content within the page layout as well? For me, it made the most sense to split my content into long-form and short-form sections. That way, no matter how many links or tweets I post, longer articles still get the time and attention I’d like to afford them, even though they’re published less frequently.

The process of doing this wasn’t too bad in execution, though I did end up spending a long time exploring the WP_Query entry on the WordPress Codex, since I’ve not done much query tweaking in the past. Basically, I tweaked the CSS of the page to be wider, tweaked the div this template wraps the sidebar in to be wider, then commented out the sidebar itself. Then I made two queries, one for each column. The second column simply pulls the last 20 posts in the “aside”, “status”, or “link” formats (basically all posts that should never be more than, say, a short paragraph). The first column pulls the last 10 posts that AREN’T in “aside”, “status”, or “link”. Querying by exclusion was necessary because “standard” posts have no post-format slug to query against. Simple, eh?

Twitter Archiving on WordPress

Or: How I Learned to Stop Worrying and Love the Yahoo Pipes.

There are a few different services that allow for backing up your Twitter feed. You may or may not be aware, but it’s actually rather difficult to back up and archive your tweets once you’ve passed a certain threshold in number and age (the magic number currently being 3,200 tweets). If, by some miracle, you manage to get a more complete archive (I signed up with BackupMyTweets a while back, and as near as I can tell they managed to go all the way back), there is then the task of figuring out what to DO with those archives.

Personally, I wanted to put them into a WordPress install, and then use a plugin to keep it up to date going forward, because I’m a fan of a consolidated media identity (come to one place, which I manage, and get all the data you want or need). The problem was that while BackupMyTweets had all my tweets backed up, their download options left something to be desired (PDF, CSV, XML, and JSON, none of which are formats that can be easily imported into WP). I could have used a different service, like TweetBackup, but they were limited by the 3,200-tweet cap, so it wouldn’t be all of my tweets. If I was going to bother doing this consolidation, I wanted to do it ONCE, and I wanted it to be as complete as possible.

I spent some time researching this problem, and wasn’t really happy with any of the solutions. I’m not really a programmer, so the notion of writing a Perl or Python script to parse the archive’s XML format into what WordPress needs seemed daunting and unreasonable. Ultimately, I discovered a really simple and easy solution: Yahoo Pipes. If you haven’t played with this service before, I highly recommend it. It’s not doing anything a good programmer (or even scripter) couldn’t do, but it takes a lot of the pain out of the process and gives you a visual way to track all the transformations and parsing you’re applying. Case in point: I’ve put together a CSV-to-RSS converter that takes the Twitter CSV archive from BackupMyTweets and parses it into an RSS feed that I could then import into WordPress. The end result: a blog with ~4,200 one-line posts.

A few caveats:

  • If you are going to use this method, be sure to set the default category to “tweets” (or wherever else you plan to put them) BEFORE you run the importer.
  • You may need to break your RSS feed into multiple files, as there is a database timeout that you might run into otherwise.
  • Titles on tweets are kind of silly. I recommend using a theme that supports the “status” post format and removes the titles for status posts.

If you want to check out the pipe I made, it can be found here. It’s pretty simple: pull from a CSV file stashed on a site, map the columns to the correct fields in a “Create RSS” widget, do something to solve the “what should the title be on a tweet” question (I used a truncated version of the tweet), and output the result. A rough shell equivalent is sketched below.
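For the curious, here’s roughly what that transformation looks like without Pipes. Big caveats: this assumes a simplified export (a header row, then one tweet per line as date,text with no embedded commas or quotes), and the file names and column layout are made up for illustration rather than taken from BackupMyTweets’ actual format.

    #!/usr/bin/env bash
    # Rough CSV-to-RSS sketch: wrap each tweet row in an RSS <item> so the
    # WordPress RSS importer can pull it in. Assumes a simplified "date,text" CSV.
    in_csv="tweets.csv"
    out_rss="tweets.rss"

    {
      echo '<?xml version="1.0" encoding="UTF-8"?>'
      echo '<rss version="2.0"><channel><title>Tweet archive</title><link>https://example.com/</link><description>Imported tweets</description>'
      awk -F',' 'NR > 1 {
          date  = $1
          text  = $2
          title = substr(text, 1, 60)   # truncated tweet doubles as the title
          # Escape the characters XML cares about (ampersands first).
          gsub(/&/, "\\&amp;", text);  gsub(/</, "\\&lt;", text);  gsub(/>/, "\\&gt;", text)
          gsub(/&/, "\\&amp;", title); gsub(/</, "\\&lt;", title); gsub(/>/, "\\&gt;", title)
          printf "<item><title>%s</title><description>%s</description><pubDate>%s</pubDate></item>\n", title, text, date
      }' "$in_csv"
      echo '</channel></rss>'
    } > "$out_rss"

You may need to massage the date column into the RFC-822 format RSS expects, and splitting the CSV into chunks before converting is an easy way to dodge the import timeout mentioned in the caveats above.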