Tab Completion

I'm Tab Atkins Jr, and I wear many hats. I work for Google on the Chrome browser as a Web Standards Hacker. I'm also a member of the CSS Working Group, and am either a member or contributor to several other working groups in the W3C. You can contact me here.
Listing of All Posts

If I Had A Time Machine

Last updated:

An incomplete and growing list of mistakes that I'd fix if I had a time machine:

Languages

Make all languages be written vertically, and stack their lines left-to-right. Everyone being on one writing direction is A+, and vertical writing gives more time for the ink to dry before your hand has to touch it when you're left-handed.

Figure out a universal script that everyone can use. Doesn't have to be too complicated; tons of languages with different sounds use the Latin alphabet fine. I'm kinda in love with the Korean script, tho - it's alphabetic, but is written like it's ideographic, with the letters arranged into syllable-blocks in a standard way. Something like that would be good.

Putting everyone on a single language would be good too, but probably really hard to maintain. Just making sure everyone's writing in a mutually intelligible way is a good start.

Numbers

Switch everyone to base 6 counting. It's got some great reasons behind it!

  1. Each of your hands is a single base-6 digit (can represent the values 0-5), so you could count to 35 instead of just 10.
  2. It's got good divisibility - /2 and /3 (and /6 of course) are just "check the last digit", /4 and /9 are "check the last two digits", /5 is "sum the digits and check if they're /5" (same as /9 in base 10), /7 is "alternately add/subtract the digits and check if they're /7" (same as /11 in base 10).
  3. The multiplication table is really trivial, almost insultingly so. Similar to the divisibility, multiplying by 2 and 3 are now super easy (like multiplying by 2 and 5 are in base 10).
* | 1| 2| 3| 4| 5| 10|
----------------------
 1| 1| 2| 3| 4| 5| 10|
 2| 2| 4|10|12|14| 20|
 3| 3|10|13|20|23| 30|
 4| 4|12|20|24|32| 40|
 5| 5|14|23|32|41| 50|
10|10|20|30|40|50|100|

Numbers in base 6 are about 50% longer than in base 10, which isn't a huge loss. We can compensate for it by making sure the names for the digits are single-syllable, as that boosts your ability to remember strings of digits. Grouping digits into sets of 3 would also feel more "natural" - we might even be able to go up to sets of 6 instead.

I guess when talking about computers, we'd generally use octal (like we use hexadecimal in base 10) - 012345TE. Octal is a lot easier to learn than hex, too.

I Understand Platonic Solids A Little Better Now

Last updated:

So I was watching this experimental gameplay video for Miegakure, a 4d indie video game. It's like Fez, which was a 2d platformer in a 3d world that involved "rotating" things in 3d to solve puzzles, except lifted one level, so it's a 3d game in a 4d world.

Anyway, it mentioned the 24-cell, a weird 4d shape that I've always wanted to understand better, so I started on a Wikipedia binge. This immediately led me back to the platonic solids, when I realized something.

It always struck me as weird that there are an infinite number of regular 2d polygons, one for every number of sides - equilateral triangle, square, regular pentagon, regular hexagon, etc - but there were only five regular 3d polyhedrons - with dice names, the d4, d6, d8, d12, and d20. Going from infinite to finite is always interesting, and I didn't get it before, but I got to thinking about the patterns in the shapes, and suddenly it clicked - there just isn't enough space for any more regular polyhedrons!

See, the dice shapes can be split into two lines. The d4, d6, and d12 form the "increasing polygon side-count" line, while the d4, d8, and d20 form the "increasing number of triangles" line. (Yes, the d4 is the base of both.)

The "increasing polygon side-count" line goes from triangle (3) sides, to square (4) sides, to pentagon (5) sides. As you increase the number of sides, the angles get larger, so you have lay the shapes out "flatter" to get them to fit together snugly, reducing the curvature and increasing the size of the die. But what happens when you fit hexagons together? They tile the plane. Put hexagons together snugly, and they perfectly fill up a flat 2d plane - 120° × 3 = 360°. You can't curve them at all, there's not enough space. So there's no way to make a polyhedron with a regular hexagon - it can't curve around to form a ball!

The same argument happens with "increasing numbers of triangles". It's easy to fit together three triangles - you get the d4. Spread them out a little and you make space for a fourth, getting the d8. A little more flattening makes room for a fifth, making the d20. You can still squeeze in one more triangle, putting six together snugly, but only by flattening them into a plane again. It's the same problem as the hexagon tiling - 60° × 6 = 360° again. The triangles don't curve around at all, so you once again can't form a ball!

7 (whether 7-sided shapes, or 7 triangles) is of course right out. You can't fit those together no matter how you try.

I presume this is why there are only three regular solids in 5+ dimension: the simplex (analogue of the d4), the measure complex (analogue of the d6), and the cross complex (analogue of the d8). In higher dimensions the volumes take up more "room", so you just can't fit as many together, and aren't able to go up to the "5" shapes (analogues of the five-sided or 5-triangled shapes).

This still doesn't answer what the fuck is up with four dimensions, tho. 4d has six regular solids - 5 analogues of the 3d solids, and 1 special. There's the three made out of increasing-size dice: the 5-cell, made out of d4s; the 8-cell, made out of d6s; and the 120-cell, made out of d12s. There's the three made out of tetrahedrons: the 5-cell, where 4 meet at each vertex; the 16-cell, where 5 meet at each vertex; and the 600-cell, where 6 meet at each vertex.

Then there's the 24-cell, made out of d8s. What the fuck. Geometry is weird.

My Password Strategy, and So Can You!

Last updated:

Everyone knows they're supposed to use strong, random passwords, and use different passwords on every single site. For most of us, tho, that's impossible - we just can't remember that many. Some of us give up and just remember one strong password (there goes your security when something is breached) or use a password locker like 1Password (hope you're backing up appropriately, and have access to the vault from every single place you'll need a password).

None of these are good strategies. A few years ago I thought about this a bit, and put together a tool to help me handle all this properly and safely. I've evangelized it a bit informally, but more people really need to know about this. So, here goes:

  1. Memorize one good, strong password. Make it long and random, as secure as you can possibly make it and still be sure to remember it. This is the only random password you'll be required to remember ever again.
  2. Whenever you need a password for something, visit https://tabatkins.github.io/password/. Put your master password from Step 1 into the master password slot, and enter a memorable "site tag" for that slot. This does not need to be secure in any way, so focus on making it as easy to remember as possible, like the domain name of the site.
  3. Hit the "Long" button and copy the generated password out. DONE.
  4. If you're using a terrible website like a bank that applies password limits, the "Short" button usually works - it gives you a 12-char password. If you need more control, the "More Options" section lets you customize the password thoroughly, which should satisfy whatever idiotic demands they call for. Try to avoid using this if possible, just because it means more memorization.
  5. Store the site tag (and the custom options, if necessary) somewhere accessible. I just use a Google Doc, which I can access from anywhere on multiple devices. This is not secure information, so don't worry about it being exposed - as long as your Master Password is good, you're safe.

And that's it! This method has several benefits over a traditional password locker:

  • Same amount of memorization - one master password.
  • No need to "back up" anything - you can probably remember the site tag anyway, and if you do record them somewhere it doesn't have to be securely stored.
  • Accessible from anywhere - as long as you can touch the internet, you can reach this - no need for a browser extension or a separate program that won't be installed on public computers.
  • Works offline - the site is totally self-contained in a single file, so you can save it to your device and use it locally without any internet connection at all.
  • Totally independent - nothing can stop working because some company was acquired or went out of business. If you save a local version of the file or host it on your own site, even me removing my site won't stop you.
  • No chance of losing passwords - no chance that your password file can be "corrupted" and impossible to decrypt, because there is no password file - it's just a hash function run on your two inputs.

If you're paranoid, feel free to audit the code on your own - it's unobfuscated HTML and JS that makes zero network calls and saves no information. The entire operation is done locally, my version of the file is served over HTTPS, and you can run a local version if you're really paranoid about code changes.

I've been using this for years, and it changed my life around passwords.

Compactly Encoding All Primes via Gaps

Last updated:

(As usual for these mathy posts, inspiration was an XKCD forums thread. Join us!)

Let's say you wanted to store a list of every single prime. Cool! Problem is, primes start to get fairly big after a while. You don't want to use any particular fixed-width binary encoding (it'll waste a lot of space on small ones, and be unable to store large ones), and even with a variable-width encoding you're wasting a lot of space. Primes tend to be really close to each other, so by listing each one explicitly you're repeating most of the data!

There's a lot of ways to throw compression at this problem, but an easy one to talk about is to instead store a list of the gaps between primes. For example, there's a 2-gap after 3, since the next prime is 5; there's a 4-gap after 7, since the next prime is 11. Primes actually cluster pretty closely together - there's an infinite number of "twin primes" with a 2-gap, and the average gap at any point is equal to the density of the primes, which is the natural log (in other words, it grows very slowly). So the gaps are much smaller than the primes, especially as numbers get bigger - for example, per this MathWorld article, the maximal gap between primes 18 digits or less is only 1220 (and the average gap is around 40 at that point).

So now we have to talk about encoding those gaps, in some reasonably efficient way. I've come up with a scheme that I think works pretty well, derived from UTF-8:

  1. 2 and 3 are assumed. This will start listing the gaps from 3 onwards.
  2. Because all the gaps are even, we'll actually be encoding the halfGap.
  3. Find the halfGap's binary expansion.
  4. Determine how many output bytes you'll need by dividing the number of bits halfGap uses by 7. (If the answer is > 8, this is an exceptional case. Skip to step 7 for these cases.)
  5. In the first output byte, set the high (bytes - 1) bits to 1, then the next bit to 0. In the remaining bits of this byte, and the remaining output bytes, write the halfGap directly.
  6. Occasionally, whenever you feel is appropriate, encode an "i-frame": write out a byte of all 1s, then encode the next prime into however many subsequent bytes it takes. Use only the lower 7 bits of each byte, leaving the high bit 0. When finished, write out another all-1s byte.
  7. For exceptional cases (gap would need more than 8 bytes to encode), write out the prime as an i-frame instead.

Judicious use of "i-frames" (video encoding terminology!) means you don't have to process the entire preceding list to find a given prime, just back to the most recent i-frame. It also gives you a resync point, in case you lose track of where you are in the stream for whatever reason. (UTF-8 resyncs with every character, but it pays for that with lower efficiency.)

This encodes every gap with just 1/7 overhead. The average gap fits into a single byte until around e^2^8, or approximately 10^111; into two bytes until around e^2^15; three bytes until e^2^22; etc. At about e^2^57 (approximately 10^10^17) it finally reaches the point where the average gap is "exceptional" and requires an i-frame, and you have to pay a 2-byte tax for each one. Luckily, at that point each prime takes up about 10 petabytes of storage just to write out, so the 2 extra bytes aren't very noticeable. ^_^

Mobile Viewports WTF

Last updated:

We're refreshing the W3C stylesheet right now, and part of that is requiring a good <meta name=viewport> value so the specs are readable on phones.

We've come up with a pretty decent one so far: width=device-width, initial-scale=1, shrink-to-fit=no. This seems to work well, but I've found it still has a frustrating weakness:

If some element is wider than the initial viewport size, then the size of the viewport that position:fixed elements use is increased to be the widest element's width, with the same aspect ratio as the original viewport size.

Example: say your phone's natural viewport is 320px wide and 640px tall. If there's an element in your page that's 400px wide, the fixpos viewport will end up being 400x800 instead, so anything you fixpos to the bottom will be 160px below the bottom edge of the screen. :(

Does anyone know a way to fix this?