5 Years

This week is five years since I started working at Automattic. In some ways, it doesn’t seem like it could possibly be that long. We’re growing fast — over 80% of the company started after me. At the same time, my job has changed a lot and when I think of everything I’ve worked on, I don’t know how it all fits into five years.

500 Automatticians at the 2017 Grand Meetup

In a few weeks I leave for a three-month paid sabbatical, a perk everyone is eligible for after five years. I’ve been getting advice from colleagues. Unplug. Take time for yourself. Travel. I’m excited for the opportunity.

People are often surprised to hear that I’ve worked at the same company for this long. Apparently five years is a long time in tech. Or maybe it’s because I’m a millennial. Regardless, they usually want to know why I’ve stayed for so long. The sabbatical is a big, exciting thing to talk about, but the day-to-day flexibility is the real reason. We have an open vacation policy, I get to set my own hours, and I can work from anywhere with an internet connection. The list goes on. This can be an adjustment for most people, especially not having a central office to work in, but once you’re used to it, the alternative sounds worse.

Lots of people tell me they’ve tried working “remote” and they don’t understand how I do it, or how we run a company of over 700 people this way. I’ve learned this usually means they’ve had an opportunity to work from home a few days a week. It sounds like the same thing, but it’s fundamentally different from what Automattic does. The main difference is communication. If you work from home for a company that has an office, you’re naturally worried about missing out on what’s happening in that office. At a fully distributed company, all communication is online, so there are no office meetings that you never hear about. We have a saying, “P2 or it didn’t happen”, which means if you don’t post notes to an internal blog, you can’t expect anybody to know about it. And everyone does post, because it’s the primary way we communicate.

I’m excited to take a break and unplug, but I’m equally excited to come back this fall because things are moving fast around here and there are some exciting changes ahead.

If you’re interested, we’re hiring at Automattic and on VIP 🙂

Automate git bisect

I occasionally use git bisect to figure out where I’ve introduced test failures. If you’re not familiar: given known good and bad commits and a test command, git bisect does a binary search over the commits between them to find where a bug or test failure was introduced. This is especially useful on large repos with a test suite that takes a minute or two (or more) to run.

Most of the time, I just need to check the last handful of commits. To that end, I’ve written a bash function that assumes the current HEAD is bad, 10 commits prior is good, and automatically runs a test command that you define.

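gbi() {
	# Usage: gbi <test command...>
	git bisect start
	git bisect bad            # the current HEAD is assumed bad
	git bisect good HEAD~10   # 10 commits back is assumed good
	git bisect run "$@"       # runs the test command at each step; exit 0 = good
}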

This won’t cover every case, but should help automate this fairly verbose process most of the time.
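
One case it doesn’t cover is when the last good commit is more than 10 commits back. A minimal sketch of a variant that takes the depth as its first argument might look like the following. The name gbn and the depth argument are my own assumptions for illustration, not part of the function above. It uses the `git bisect start <bad> <good>` form to mark both endpoints at once, and it resets when done, so you need to read the first bad commit from the output.

```shell
# Hypothetical variant of gbi: the first argument is how many commits back
# the known-good commit is; the remaining arguments are the test command.
gbn() {
	local depth="$1"
	shift
	# "git bisect start <bad> <good>" marks both endpoints in one step.
	git bisect start HEAD "HEAD~${depth}" || return 1
	git bisect run "$@"
	# Return to the commit that was checked out before bisecting.
	git bisect reset
}
```

For example, `gbn 25 make test` would search the last 25 commits, assuming `make test` is how your test suite runs.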

Astrophysics for People in a Hurry

As grown-ups, dare we admit to ourselves that we, too, have a collective immaturity of view? Dare we admit that our thoughts and behaviors spring from a belief that the world revolves around us? Apparently not. Yet evidence abounds. Part the curtains of society’s racial, ethnic, religious, national, and cultural conflicts, and you find the human ego turning the knobs and pulling the levers.

— Neil deGrasse Tyson

From Astrophysics for People in a Hurry.

Nginx Cache WordPress

There are tons of caching options for WordPress. Some of the popular plugins are Batcache, WP Super Cache, and W3 Total Cache. There are also services like Cloudflare and Fastly. On VIP Go, we use Varnish instances distributed across the world and route traffic to the nearest server over our Anycast network.

I’ve used almost everything on my blog over the years, but this time I wanted to keep it simple. Since I already use Nginx for SSL termination and proxying to a Docker container, I decided it would be easiest to cache the HTML there.

https://gist.github.com/joshbetz/63f08fc2f37e5e267d14f219d0f5b4ed

The configuration is pretty basic:

  • 404s are cached for 30 seconds
  • 301s (permanent redirects) are cached for 24 hours
  • Everything else is cached for 5 minutes

Pages are only cached once they’ve been accessed 3 times, to avoid filling the cache unnecessarily. If you’re logged in or have a cookie that looks like it could be valid, you bypass the cache completely so that logged-in data is never cached. One of my favorite parts is that I can serve stale content if there’s a server error. So if I break the Docker container, ideally people won’t notice 🙂

Ultimately this is easier to maintain than any of the other things I’ve tried and just as effective.

The full configuration is on GitHub.

Biased Algorithms

Biased algorithms and their effects are something I’ve been interested in exploring recently. It’s not a problem with Mathematics or Computer Science per se — humans with implicit bias come to false conclusions all the time. We’re the source of these problematic algorithms after all. The problem is that these bad assumptions can be deployed on a massive scale and aren’t questioned because we think of the math as infallible.

A recent episode of 99% Invisible, The Age of the Algorithm, discusses this topic and gives some examples of where it is having real, negative effects today.

Most recidivism algorithms look at a few types of data — including a person’s record of arrests and convictions and their responses to a questionnaire — then they generate a score. But the questions, about things like whether one grew up in a high-crime neighborhood or have a family member in prison, are in many cases “basically proxies for race and class,” explains O’Neil.

Essentially, any time you use historical data that was affected by a bias to influence the future, you risk perpetuating that bias.

If you’re interested, Cathy O’Neil also wrote a book called Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.