Glean vs. the universe

In which developer Chris Moore persuasively argues that a recent Glean glitch was caused by… the sun?

3 min read · Published: 4 Feb 2022
Author: Chris Moore

The incident

One day, while checking logs, I noticed that a user was having trouble uploading some changes they made to an event.

Each audio snippet in an event has some corresponding metadata and one of the fields in this metadata is the “duration” of the audio. However, this user had one bit of metadata with a “deration” field instead, so the server was rejecting it as invalid event JSON.

I combed through the history of the codebase, and “deration” appeared precisely zero times. So what happened here?

Bit flips

At the lowest level, all data is stored in binary. Each bit represents some power of 2, so a bit changing unexpectedly can result in a discrepancy of exactly that power of 2. A well-known suspected case is the 2003 election in Schaerbeek, Belgium, where a candidate received 4096 (or 2^12) more votes than was considered possible. A bit flip is suspected to have been the culprit.
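To make the arithmetic concrete, here is a minimal Python sketch of a single bit flip. The `flip_bit` helper and the tally of 514 are made up for illustration; they are not the real Schaerbeek figures or anything from the voting system.

```python
def flip_bit(value: int, position: int) -> int:
    """Return value with the bit at `position` inverted (0 = least significant bit)."""
    return value ^ (1 << position)


# Hypothetical tally, chosen only for illustration.
true_votes = 514
corrupted_votes = flip_bit(true_votes, 12)

# Bit 12 was 0 in the original tally, so flipping it adds 2**12.
print(corrupted_votes - true_votes)  # 4096
```

Flipping the same bit a second time restores the original value, which is why a lone flip shows up as a clean power-of-2 discrepancy.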

Another example is when a Super Mario 64 speedrunner inexplicably teleported upwards during a run. The jump was later replicated by flipping a single bit in the value that stores Mario’s height.

In our case, a particularly eagle-eyed developer noticed that in Unicode, “e” and “u” are represented in binary as “1100101” and “1110101” respectively. So all it would take is one bit in the user’s local storage to flip from a 1 to a 0 to go from “u” to “e”.
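This is easy to check for yourself. The sketch below is only an illustration (the `differing_bits` helper is a hypothetical name, not something from our codebase): it compares the two code points and then flips the one bit that differs.

```python
def differing_bits(a: str, b: str) -> list[int]:
    """Return the bit positions (0 = least significant) where two characters differ."""
    diff = ord(a) ^ ord(b)
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


print(format(ord("u"), "07b"))  # 1110101
print(format(ord("e"), "07b"))  # 1100101
print(differing_bits("u", "e"))  # [4]

# Clearing bit 4 of "u" (worth 16) gives "e", turning the field name into the bad one.
flipped = chr(ord("u") ^ (1 << 4))
print("duration".replace("u", flipped, 1))  # deration
```

The two letters differ in exactly one position, so a single flipped bit is enough to account for the typo.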

This is about as far as we can get with cold, hard facts. I cannot prove that a bit definitely flipped in this user’s storage. They could’ve manually gone and messed with their IndexedDB instance for all we know. But let’s assume the bit did flip for the sake of this tech blog!

Any number of software or hardware issues could cause a bit to flip: integer overflow errors, power surges, the list goes on. Considering that, at the time of writing, this is the only time we’ve seen this happen, we’ll rule out issues in our codebase (how convenient!). As for potential hardware issues, we can’t really say for sure. However, there is one potential cause that is significantly more interesting (to me at least, and hopefully to you too)!

Cosmic rays

Space is full of particles, all whizzing around from stars and galaxies. Cosmic rays in particular are high-energy particles (mostly protons) that come from the sun as well as from sources beyond the solar system. These particles very rarely make it to the surface of the earth; instead, they collide with particles in the earth’s atmosphere. These collisions produce showers of secondary particles, and it’s these particles that are more likely to get up to mischief.

The particles in these cosmic ray showers are still quite energetic, enough to potentially ionise atoms they come into contact with. If this were to happen in a memory chip or storage device, the small resultant charge could be enough to flip a bit.

So it’s entirely possible (although impossible to prove) that the sun is directly responsible for this Glean user having been unable to upload their event.

Case closed

We fixed the issue by adding some custom code to clean up the affected user’s IndexedDB instance, and removed it again once it had run in production. Therefore, by ironclad deductive reasoning with only a few assumptions, it can be concluded that we are, in fact, more powerful than the sun.

Take on the universe with Team Glean

When we’re not fighting against the cosmos, we develop software that helps learners achieve their full potential.

To learn more about our mission (and to see our current vacancies), follow the link below!
