Josh Betz

Engineer, Automattician, Wisconsin Badger

Automate git bisect

I occasionally use git bisect to figure out where I’ve introduced test failures. If you’re not familiar, given known good and bad commits, along with a test command, git bisect does a binary search over your repo’s history to determine where a bug or test failure was introduced. This is especially useful on large repos with a test suite that takes a minute or two (or more) to run.

Most of the time, I just need to check the last handful of commits. To that end, I’ve written a bash function that assumes the current HEAD is bad, 10 commits prior is good, and automatically runs a test command that you define.

gbi() {
	# Assume the current HEAD is bad and the commit 10 back is good.
	git bisect start HEAD HEAD~10
	# Run the given test command at each step of the binary search.
	git bisect run "$@"
}

This won’t cover every case, but should help automate this fairly verbose process most of the time.
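
For the cases where 10 commits isn’t far enough back, one option is to pass the distance as an argument. This is only a minimal sketch, not something from my dotfiles: the name gbn and the leading count argument are hypothetical.

```shell
# Hypothetical variant of gbi: the first argument is how many commits back
# the known-good commit is; everything after it is the test command.
gbn() {
	local back="$1"
	shift
	# Mark HEAD as bad and HEAD~<back> as good in a single step.
	git bisect start HEAD "HEAD~${back}" || return
	# git bisect run treats exit status 0 as good, 125 as "skip this commit",
	# and any other status from 1 to 127 as bad.
	git bisect run "$@"
}
```

For example, `gbn 25 make test` would search the last 25 commits, assuming your project has a `make test` target that exits non-zero on failure. When you’re done looking at the result, `git bisect reset` takes you back to where you started.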

Astrophysics for People in a Hurry

As grown-ups, dare we admit to ourselves that we, too, have a collective immaturity of view? Dare we admit that our thoughts and behaviors spring from a belief that the world revolves around us? Apparently not. Yet evidence abounds. Part the curtains of society’s racial, ethnic, religious, national, and cultural conflicts, and you find the human ego turning the knobs and pulling the levers.

— Neil deGrasse Tyson

From Astrophysics for People in a Hurry.

Nginx Cache WordPress

There are tons of caching options for WordPress. Some of the popular plugins are Batcache, WP Super Cache, and W3 Total Cache. There are also services like Cloudflare and Fastly. On VIP Go, we use Varnish instances distributed across the world and route traffic to the nearest server over our Anycast network.

I’ve used almost everything on my blog over the years, but this time I wanted to keep it simple. Since I already use Nginx for SSL termination and proxying to a Docker container, I decided it would be easiest to cache the HTML there.

https://gist.github.com/joshbetz/63f08fc2f37e5e267d14f219d0f5b4ed

The configuration is pretty basic:
  • 404s are cached for 30 seconds
  • 301s (permanent redirects) are cached for 24 hours
  • Everything else is cached for 5 minutes

Pages are only cached once they’ve been accessed 3 times, to avoid filling the cache unnecessarily. If you’re logged in or have a cookie that looks like it could be valid, you bypass the cache completely to avoid caching logged-in data.

One of my favorite parts is that I can serve stale content if there’s a server error. So if I break the Docker container, ideally people won’t notice 🙂

The goal is to prevent my tiny DigitalOcean VM from being overwhelmed if lots of people visit at once. The short cache TTL means I don’t really need to worry about purging anything, which further simplifies this. Ultimately this is easier to maintain than any of the other things I’ve tried and just as effective. The full configuration is on GitHub.
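
As a rough illustration of the rules described above, here is how they might map onto Nginx’s proxy cache directives. This is a sketch rather than a copy of the gist: the cache path, zone name, sizes, cookie patterns, and upstream address are all assumptions, and the SSL setup is left out.

```nginx
# Sketch only: path, zone name, sizes, cookie patterns, and upstream are assumptions.
proxy_cache_path /var/cache/nginx/wordpress levels=1:2 keys_zone=wordpress:10m max_size=256m inactive=60m;

# Skip the cache when a cookie suggests a logged-in or commenting visitor.
map $http_cookie $skip_cache {
	default 0;
	~*wordpress_logged_in 1;
	~*comment_author 1;
	~*wp-postpass 1;
}

server {
	listen 80;  # SSL termination omitted from this sketch

	location / {
		proxy_pass http://127.0.0.1:8080;  # hypothetical Docker container address
		proxy_set_header Host $host;

		proxy_cache wordpress;
		proxy_cache_valid 404 30s;       # 404s for 30 seconds
		proxy_cache_valid 301 24h;       # permanent redirects for 24 hours
		proxy_cache_valid any 5m;        # everything else for 5 minutes
		proxy_cache_min_uses 3;          # only cache after 3 requests

		# Neither read from nor write to the cache for likely logged-in users.
		proxy_cache_bypass $skip_cache;
		proxy_no_cache $skip_cache;

		# Serve stale content if the backend errors out or times out.
		proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
	}
}
```

The `proxy_cache_path` and `map` blocks belong in the `http` context, while the rest sits in whichever `server` or `location` block proxies to WordPress.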

Biased Algorithms

Biased algorithms and their effects are something I’ve been interested in exploring recently. It’s not a problem with Mathematics or Computer Science per se — humans with implicit bias come to false conclusions all the time. We’re the source of these problematic algorithms after all. The problem is that these bad assumptions can be deployed on a massive scale and aren’t questioned because we think of the math as infallible.

A recent episode of 99% Invisible, The Age of the Algorithm, discusses this topic and gives some examples of where it is having real, negative effects today.

Most recidivism algorithms look at a few types of data — including a person’s record of arrests and convictions and their responses to a questionnaire — then they generate a score. But the questions, about things like whether one grew up in a high-crime neighborhood or have a family member in prison, are in many cases “basically proxies for race and class,” explains O’Neil.

Essentially, any time you use historical data that was affected by a bias to influence the future, you risk perpetuating that bias.

If you’re interested, Cathy O’Neil also wrote a book called Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.

Restless Dream

One of my New Year’s resolutions was to play more music. I decided it might also be fun to share some of it. I actually learned this song a couple of years ago, but it’s one of my favorites. Naturally, it’s a Jack’s Mannequin song: Restless Dream.