Pronouncing Base 6 Numbers

Last updated: Monday, November 8 2021

What's the best way to pronounce heximal numbers?

One could just read them out like they're decimal: "21" being pronounced "twenty one", etc. But that leads to some confusion when mixing heximal and decimal numbers in conversation, and more generally just confuses people who are currently used to only using decimal numbers. It also makes phrases like "base ten" confusing/ambiguous - which base is that 10 in???

Instead I'll propose two simple possibilities, both of which I like for different reasons. One sticks close to decimal while remaining distinguishable; the other is much more foreign but has fun and interesting numerical and syntactic qualities.

How To Pronounce It Like Decimal

(Several aspects of this are inspired by jan Misali's method, but I don't like all the details of their proposal.)

All the individual digits are pronounced normally, as in decimal English: zero, one, two, three, four, five.

All the "teens" (two-digit numbers with a 1 in the higher digit) are also pronounced like in decimal English: 10 is "six", 11 is "seven", 12 is "eight", 13 is "nine", 14 is "ten", and 15 is "eleven".

Higher numbers are pronounced akin to how decimal does it, just with a different suffix. Rather than appending -ty, append -sy: 20 is "twosy", 21 is "twosy one", 32 is "threesy two", etc., and likewise with "foursy" and "fivesy". (Note: English phonotactics dictates that all of these S's become voiced, so they're pronounced as "z".)

You can, when it helps play along with a greater pattern, pronounce the 1X numbers as "onesy-X", starting with 10 as "onesy" instead of "six", up to 15 as "onesy five" instead of "eleven". 20 can also be pronounced "dozen", when desired.

After two digits, use "nif" to indicate the third digit: 123 is "one nif twosy three".

Always group nifs into two-digit pairs: 1234 is "eight nif threesy four".

Groups of four digits then form a "kilo": 1,0000 is "one kilo", 2,3451 is "two kilo, threesy four nif, fivesy one".

Every group of four past that gets named using the standard SI prefixes: 12₆ digits is "mega", 20₆ is "giga", 24₆ is "tera", etc. I presume we'd have some Latin cognates to draw on for very large numbers, like in decimal; those are similarly never actually used, so I'm not going to care about them.
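As a rough sketch of the rules above (not a definitive implementation), here's how they might look in Python. The names (`say_heximal`, `say_pair`, and so on) are made up for illustration. It takes the heximal digits as a string, leaves out the commas in the spoken form, always says a leading "one" (so 100,0000 comes out as "one nif kilo"), and only knows the prefixes up to "tera".

```python
# Hypothetical helper names; the word lists come straight from the rules above.
DIGITS = ["zero", "one", "two", "three", "four", "five"]
TEENS = ["six", "seven", "eight", "nine", "ten", "eleven"]   # 10..15 in heximal
SIXES = {2: "twosy", 3: "threesy", 4: "foursy", 5: "fivesy"}
GROUPS = ["", "kilo", "mega", "giga", "tera"]                # one per 4 digits


def say_pair(pair: str) -> str:
    """Pronounce a two-digit heximal pair such as '21' or '05'."""
    high, low = int(pair[0]), int(pair[1])
    if high == 0:
        return DIGITS[low]
    if high == 1:
        return TEENS[low]
    return SIXES[high] if low == 0 else f"{SIXES[high]} {DIGITS[low]}"


def say_group(group: str) -> str:
    """Pronounce a four-digit group: high pair, 'nif', then low pair."""
    high, low = group[:2], group[2:]
    words = []
    if high != "00":
        words += [say_pair(high), "nif"]
    if low != "00":
        words.append(say_pair(low))
    return " ".join(words)


def say_heximal(digits: str) -> str:
    """Pronounce a heximal numeral given as a string like '2,3451'."""
    digits = digits.replace(",", "")
    if not digits or any(d not in "012345" for d in digits):
        raise ValueError("expected a string of heximal digits (0-5)")
    digits = digits.lstrip("0")
    if not digits:
        return "zero"
    # Pad on the left so the string splits evenly into groups of four.
    digits = digits.zfill(-(-len(digits) // 4) * 4)
    groups = [digits[i:i + 4] for i in range(0, len(digits), 4)]
    if len(groups) > len(GROUPS):
        raise ValueError("number too large for this sketch")
    parts = []
    for index, group in enumerate(groups):
        if group == "0000":
            continue
        prefix = GROUPS[len(groups) - 1 - index]
        parts.append(f"{say_group(group)} {prefix}".strip())
    return " ".join(parts)


print(say_heximal("1234"))    # eight nif threesy four
print(say_heximal("2,3451"))  # two kilo threesy four nif fivesy one
```

The optional "onesy" and "dozen" variants are left out to keep the sketch short.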

Justification

Keeping the single digits same as in decimal is obvious; all bases do that already.

Pronouncing 10₆ as "six" isn't too unusual, from what I understand of other base-pronunciation schemes. Having a unique word for your 10 value helps in talking about it, so you don't constantly have to mutter "base six" after every mention of "ten".

For the rest of the 1X values, continuing to use the decimal names makes them simple and easy to remember. Also, the 1X's don't yet have the ability to do the unifying suffix pattern that the higher values do (which I'll get to in a sec), so they've gotta be weird somehow anyway; this is why decimal has the unique "X-teen" variant (shortened from "X and ten"). As a bonus, decimal has unique non-systematic words for all the values we need; it doesn't start systematizing them (which would feel weird carrying over into heximal) until thirteen.

For 2X and after, I use a naming scheme distinguishable from but obviously inspired by decimal. 20⏨ in decimal was originally "two tens"; English shortened the "tens" to a -ty suffix (and lightly modified "two" and "three" to sound better with the suffix). If we follow the same pattern in heximal, 20₆ is "two sixes", shortening to "twosy".

An accidental nice feature of this over decimal is that it lacks the "thirteen/thirty"/etc confusion that decimal has; a final "n" is hard to hear in some circumstances. In heximal, "nine" (13₆) and "threesy" (30₆) are totally different-sounding.

For higher values, "nif" was chosen because it's what jan Misali used; it's the word for six sixes in the Ndom language, which is the largest heximal-native language in the world. It's pronounceable and easy to say and understand among other digits, too.

After that we come to a decision point. Do we start the googology (naming of large numbers) with 4-digit groupings, because 6⁴ (1296⏨) ≈ 10³ (1000⏨)? Or do we go for the rounder-in-heximal 6-digit grouping, so your first grouping is at 10¹⁰ (in heximal), the next at 10²⁰, etc?

For now, I lean toward the four-digits solution. Six heximal digits counts up to about 50,000⏨, which just feels a bit too large to be a useful point to round to. Six digits are also too large, imo, to be the grouping size for using separators when writing out large numbers, whereas four digits feels fine: 100,0000 for "nif kilo", versus "1,000000". If we tried to add separators at a smaller value, the only reasonable spot is at 3 digits "1,000,000", but then that conflicts with the pronunciation grouping of heximal digits into niftimal pairs. (Separating every two is just overkill: 1,00,00,00.)

So four digits is the googology point, where we start naming much larger groups and just repeatedly reuse the preceding digit groupings. Using decimal's thousand/million/billion etc is a bit misleading (the values are pretty different) and would feel weirdly out-of-place (I'm using different names for everything else). Plus I've never liked their naming, at least in the short scale the US and most of the world now uses, where "million" (referring to "one") is 10^(3×2), "billion" (referring to "two") is 10^(3×3), etc. (In the long scale, at least, a million is 10^(6×1), a billion is 10^(6×2), etc, which makes sense.)

So I figured, let's just use the metric system's naming. It's both familiar and slightly foreign, and means that when you actually use the heximal metric system, it all just works automatically.

Pronouncing Digit Pairs Specially Instead

(This system was suggested to me by @berenryan, apparently cribbed from their old notes talking to fellow conlang nerds.)

The previous method has the nice benefit of being reasonably familiar; it requires a minimum of teaching (just teaching what "nif", "kilo", etc mean) and you can even skip that and just read out digit strings.

However, it just inherits the English naming system with minimal tweaks. If you're willing to get a little weird and deal with the fact that nobody will understand your numbers (but have the satisfaction that in a theoretical world where everyone learned it as kids, it would be better than the first method), there are some much better possibilities.

First, the digits. 0 is "pa" (/pɑ/ like in "papa"), 1 is "be" (/be/ like "bay"), 2 is "ti" (/ti/ like "tea"), 3 is "do" (/do/ like "d'oh!"), 4 is "ku" (/ku/ like "coup"), and 5 is "gr" (/gɹ/ like "grrrr" but not drawn out). Note that all of the vowels are Spanish-style, not English-style.

When speaking numbers larger than a single digit, group them into pairs and pronounce each pair as follows:

	+0	+1	+2	+3	+4	+5
00	pama	befa	tiva	dona	kusa	gaza
10	peŋe	bime	tofe	duve	kane	gese
20	pizi	boŋi	tumi	dafi	kevi	gini
30	poso	buzo	taŋo	demo	kifo	govo
40	punu	basu	tezu	diŋu	komu	gufu
50	pavr	benr	tisr	dozr	kuŋr	gamr

(ŋ is pronounced "ng")

So the basic patterns:

the first letter of each digit-pair word uses the consonant from the one's digit's name: anything with a 0 ones digit starts with a "p" because 0 is "pa", etc.
the last letter of each digit-pair word uses the vowel from the ten's digit's name; all the 1_ values end in an "e" because 1 is "be", etc.
the second and third letters cycle through five vowels (a, e, i, o, u) and seven consonants (m, f, v, n, s, z, ŋ), respectively, each advancing one step with every successive value.
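Those three patterns are enough to regenerate the whole table. Here's a minimal sketch in Python, reading "cycle" as "indexed by the pair's value mod 5 and mod 7" (which matches every cell of the table); the names `pair_name` and `table_rows` are just for illustration.

```python
ONSETS = "pbtdkg"        # consonants of the digit names pa, be, ti, do, ku, gr
FINALS = "aeiour"        # vowels (or "r") of those same digit names
VOWELS = "aeiou"         # second letter: pair value mod 5
CONSONANTS = "mfvnszŋ"   # third letter: pair value mod 7


def pair_name(value: int) -> str:
    """Name of a digit pair, given its value 0..35 (00..55 in heximal)."""
    if not 0 <= value < 36:
        raise ValueError("a digit pair has a value from 0 to 35")
    sixes, ones = divmod(value, 6)
    return (ONSETS[ones] + VOWELS[value % 5]
            + CONSONANTS[value % 7] + FINALS[sixes])


def table_rows() -> list[list[str]]:
    """The 6x6 table: one row per sixes digit, one column per ones digit."""
    return [[pair_name(6 * sixes + ones) for ones in range(6)]
            for sixes in range(6)]


for sixes, row in enumerate(table_rows()):
    print(f"{sixes}0", *row, sep="\t")
```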

Larger numbers work the same as in the previous system; use "nif" after two digits, "kilo" after four, etc. "Nif" can usually be omitted in the middle of a number, since the numbers are so short to say anyway. You can still use it if the last two digits would be "00", rather than saying "pama", if you'd like.

So, 1,2345 is "befa kilo, dafi gufu". "1300" is "duve nif", or "duve pama".
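Putting it together for whole numbers, a sketch might look like the following. This is my own reading of the rules, not a spec: the function name `say_in_pairs` and its `zero_pair` parameter are hypothetical, "nif" is always dropped between two spoken pairs, a leading 00 pair inside a group is skipped, commas are left out, and only prefixes up to "tera" are known.

```python
ONSETS, VOWELS, CONSONANTS, FINALS = "pbtdkg", "aeiou", "mfvnszŋ", "aeiour"
GROUPS = ["", "kilo", "mega", "giga", "tera"]   # one prefix per 4 digits


def pair_name(pair: str) -> str:
    """Name of a two-digit heximal pair such as '23'."""
    sixes, ones = int(pair[0]), int(pair[1])
    value = 6 * sixes + ones
    return (ONSETS[ones] + VOWELS[value % 5]
            + CONSONANTS[value % 7] + FINALS[sixes])


def say_in_pairs(digits: str, zero_pair: str = "nif") -> str:
    """Pronounce a heximal numeral; zero_pair is said for a trailing 00."""
    digits = digits.replace(",", "")
    if not digits or any(d not in "012345" for d in digits):
        raise ValueError("expected a string of heximal digits (0-5)")
    digits = digits.lstrip("0")
    if not digits:
        return pair_name("00")
    # Pad on the left so the string splits evenly into groups of four.
    digits = digits.zfill(-(-len(digits) // 4) * 4)
    groups = [digits[i:i + 4] for i in range(0, len(digits), 4)]
    if len(groups) > len(GROUPS):
        raise ValueError("number too large for this sketch")
    words = []
    for index, group in enumerate(groups):
        if group == "0000":
            continue
        high, low = group[:2], group[2:]
        if high != "00":
            words.append(pair_name(high))
        # Assumption: a 00 low pair is said as "nif" (or "pama" if requested).
        words.append(zero_pair if low == "00" else pair_name(low))
        prefix = GROUPS[len(groups) - 1 - index]
        if prefix:
            words.append(prefix)
    return " ".join(words)


print(say_in_pairs("1,2345"))                  # befa kilo dafi gufu
print(say_in_pairs("1300"))                    # duve nif
print(say_in_pairs("1300", zero_pair="pama"))  # duve pama
```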

Justification

This is actually a pretty brilliant setup for several reasons.

First, the digit names are obviously chosen to just combine the Nth first letter with the Nth last letter; they also (aside from gr/gaza) are the first syllable of the 0_ numbers, since the second and last letters cycle thru the same vowels for the first five values.
Second, the sounds are well-chosen to be regular but distinguishable. The first consonants are all plosives, alternating between unvoiced and voiced; the second consonants are a collection of nasals and fricatives. Choosing consonants this way gives each word a sharp, distinguishable start that won't blend into other words, and then softens into a blending sound in the middle that carries stress well.
Third, the words all have significant redundancy, which helps with hearing. It's common for invented systems like this to be minimally redundant; it would have been easy to just produce two-letter words by, say, combining the first letter of the ten's digit's name and the last letter of the one's digit's name. They'd all be unique, so that technically works, right? In real conversation tho, where people might mumble, or the room might be noisy, or a radio might be crackling, distinguishing between some of those sounds can be quite difficult; this is why military jargon pronounces 5 and 9 as "fiver" and "niner", because "five" and "nine" sound remarkably similar over a low-bandwidth radio and the addition of the "er" sound forces us to emphasize the "v" versus "n" and makes them more distinct.
In here, the first and fourth letters cover 6×6 unique combinations, for the 36 possible values, but the second and third letters cover 5×7, or 35, possible values, meaning they're also almost completely unique across the set of numbers. (Only 0₆ and 55₆, pama and gamr, share their center letters.) As well, the two sets tile their combinations in different ways, so if two sounds are kinda similar they won't show up together multiple times. If someone is shouting across the room and you're not sure if they said "pizi" or "bizi", well, only one of those is the name of a number. If you can correctly hear any two letters in the word, you can almost always tell which number it was; you've got to fuck up really bad to mishear it.
(Of the six possible pairs of letters you could hear, three of the pairs uniquely identify a number in every instance; one (the center two letters) has a single collision, between 0₆ (pama) and 55₆ (gamr); and two (the first and second letter, or the second and fourth letter) have six collisions each, as each 0X₆/5X₆ or X0₆/X5₆ number collides, respectively.)
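Those collision counts are easy to check by brute force. The sketch below rebuilds the 36 names from the table's patterns and, for each pair of letter positions, counts how many two-letter combinations are shared by more than one name. The helper names (`pair_name`, `collisions`) are made up for this example.

```python
from collections import Counter
from itertools import combinations

ONSETS, VOWELS, CONSONANTS, FINALS = "pbtdkg", "aeiou", "mfvnszŋ", "aeiour"


def pair_name(value: int) -> str:
    """Name of the digit pair with value 0..35, per the table's patterns."""
    sixes, ones = divmod(value, 6)
    return (ONSETS[ones] + VOWELS[value % 5]
            + CONSONANTS[value % 7] + FINALS[sixes])


NAMES = [pair_name(value) for value in range(36)]


def collisions(i: int, j: int) -> int:
    """How many (letter i, letter j) combinations occur in more than one name."""
    seen = Counter((name[i], name[j]) for name in NAMES)
    return sum(1 for count in seen.values() if count > 1)


for i, j in combinations(range(4), 2):
    print(f"letters {i + 1} and {j + 1}: {collisions(i, j)} collision(s)")
```

Run as written, this should report six collisions for letters 1 and 2, six for letters 2 and 4, one for letters 2 and 3, and none for the other three pairs.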
Fourth, it puts divisibility right into the name of the number: numbers divisible by 2 start with "p", "t", or "k"; numbers divisible by 3 start with "p" or "d"; divisible by 5 has an "a" in the second position; divisible by 7 (11₆) has an m in the third position.
This scales above two-digit numbers, too. In decimal, you can tell if a number is divisible by 3 or 9 by adding all the digits; if the result is divisible by 3 or 9, so was the original number. In heximal this trick works for 5, and the naming system actually does half the work for you already - since the second letter tells the value of the pair mod 5, you can easily tell that, say, a four-digit number whose first pair has an "e" and second pair has a "u" (like demo kusa, 3304) is divisible by five (since they're +1 and +4 above a multiple of five, respectively). Any pair with an "a" can be skipped, since it's divisible by 5 already and won't affect the result.
Similarly, there's a trick for divisibility by seven (11₆) - add together digit pairs, and if the result is divisible by seven, the original number was. Again, the naming system does half our work for us, since the third letter tells the value of the pair mod 7, so we can just look at those letters and quickly tell that, say, 12331 (befa dafi buzo) is divisible by 7, since "f" means it's +1 above a multiple of seven and "z" means +5, so 1+1+5 = 7! (And it's divisible by five: e+a+u = 1+0+4 = 5!)
(These tricks work in any base, for the value one below and one above the base. So in base 10, they're for the values 9⏨ and 11⏨.)
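Here's a small sketch of that letter arithmetic, assuming the spoken form is given as space-separated pair words with no "nif" or "kilo" in between. The second letters are summed mod 5 and the third letters mod 7, which works because 36 (one pair's worth) is 35 + 1, so it leaves remainder 1 mod both 5 and 7. The function names are hypothetical; `spoken_value` is only there to cross-check against ordinary arithmetic.

```python
ONSETS, VOWELS, CONSONANTS, FINALS = "pbtdkg", "aeiou", "mfvnszŋ", "aeiour"


def residues(words: str) -> tuple[int, int]:
    """(value mod 5, value mod 7), read off the 2nd and 3rd letters only."""
    pairs = words.split()
    mod5 = sum(VOWELS.index(word[1]) for word in pairs) % 5
    mod7 = sum(CONSONANTS.index(word[2]) for word in pairs) % 7
    return mod5, mod7


def spoken_value(words: str) -> int:
    """Decode the number from the 1st (ones) and 4th (sixes) letters."""
    value = 0
    for word in words.split():
        pair = 6 * FINALS.index(word[3]) + ONSETS.index(word[0])
        value = value * 36 + pair
    return value


for words in ("demo kusa", "befa dafi buzo"):
    n = spoken_value(words)
    print(words, n, residues(words), (n % 5, n % 7))
# demo kusa 760 (0, 4) (0, 4)
# befa dafi buzo 1855 (0, 0) (0, 0)
```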
So we get super-easy divisibility by 2, 3, 5, and 7 all built into the naming system. This isn't just a party-trick, either - it's a built-in mental math check. Multiplying, say, basu (41) by kevi (24), you should get 1504, or gese kusa. You know right away that the last pair of the result must start with a "k" (because b×k, or 1×4, is 4 or k), check. Then the whole number needs to be divisible by 5 (because a×e, or 0×1, is 0), and yup, e+u is 5. Finally, it needs to be 1 more than a multiple of 7 (because s×v, or 4×2, is 8, reducing to 1), and yup, s+s is 4+4, or 1 mod 7.
These phonic additions and multiplications would be drilled into you as a child learning arithmetic in elementary school, becoming utterly intuitive. And from them, you can tell that your answer is very likely to be correct, since you got the correct offset from multiples of 2, 3, 5, and 7.
(In fact, a given combination of remainders mod 2, 3, 5, and 7 is unique within each span of 210⏨ numbers, so if you pass those checks, you know you're either exactly right, or you're off by some multiple of 210⏨ (550₆). And 550₆ is just short of 1000₆; in other words, you'd have to be off by nearly six nif to still get the checks to pass.)
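The same check can be sketched with plain integers, using Python's `int(text, 6)` to parse heximal. The helper names (`residues`, `product_passes`, `to_heximal`) are invented for this example, and it works on numeric remainders rather than on the letters themselves.

```python
MODULI = (2, 3, 5, 7)


def residues(n: int) -> tuple[int, ...]:
    """Remainders of n mod 2, 3, 5, and 7."""
    return tuple(n % m for m in MODULI)


def product_passes(a: int, b: int, claimed: int) -> bool:
    """True if `claimed` has the remainders that a*b must have."""
    return all((a % m) * (b % m) % m == claimed % m for m in MODULI)


def to_heximal(n: int) -> str:
    """Write a non-negative integer in base 6."""
    digits = ""
    while n:
        n, digit = divmod(n, 6)
        digits = str(digit) + digits
    return digits or "0"


a, b = int("41", 6), int("24", 6)          # basu and kevi
print(to_heximal(a * b))                   # 1504, i.e. gese kusa
print(product_passes(a, b, a * b))         # True
print(product_passes(a, b, a * b + 1))     # False: a small slip is caught
print(product_passes(a, b, a * b + 210))   # True: off by 550 in heximal
# Every one of the 210 remainder combinations occurs exactly once.
print(len({residues(n) for n in range(210)}))  # 210
```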

Fifth, some studies have shown an inverse correlation between the number of syllables in the names for numbers and the amount of numbers a person can remember at once. That is, in languages where digits are shorter to say (like Chinese, where they're all single-syllable), people can remember longer numbers than in languages with longer digit names (like English, with "seven" being two-syllable and several of the other digits "full" syllables stuffed with sounds, and almost all of the names for the tens digits being two or three syllables). In this system, each digit is a single syllable; all two-digit numbers are two syllables; writing out a pronounced number requires exactly twice as many letters as writing it out in digits. That's about as small as you can get things!
Sixth, having a dedicated and very short system for pronouncing all the two-digit values is convenient for "senary compression" (or as we'd call it natively, "niftimal"), the act of grouping heximal digits into pairs and treating them as units. Since base 6 is a little on the small side, this can be useful when talking about or remembering long numbers (this is the same reason octal and hexadecimal are useful, as compression methods for reading/discussing long binary numbers).

I'm pretty sure it's a marvelous way to fuck things up in a child's brain. We already have to deal with language irregularities. It's not "nine-ten-two" it's "ninety-two". In French, it's not "neuf-dix-deux" it's "quatre-vingt-douze" which is "four-twenty-twelve". Try to gain a child's trust in calculus with that... it selects only the ones focusing on the graphical writing of numbers, and leaves all others on the side of the road. So I suggest you take these fancy ideas about numbers and languages and put them in a bin, and then go to golf, or billiard, or bowling. Hell, even play violin.

Congrats on missing the entirety of the point! This is about a world in which we're already using base 6, not imposing a bizarre base-36 naming system on top of the existing base-10 numbering system.

The creativity of Tab's system should be applauded. Tab's post is more inventive than many rounds of golf.

Personally I think "twosy" sounds nicer when voiced to (whatever appropriate spelling of) "doozy", with the added feature of gesturing back at "dozen" without quite going there.

+1. Let's take it a step farther and just change two to doo

wow that digit pairs system is cool! im learning it for fun now :D

a note i hope is interesting, for using an arithmetic trick i've been figuring out along with heximal (sorry if it's too long or uninteresting, feel free to delete it if so):

i've found it useful to extend it for "balanced" form, allowing negative digits - the units consonant in the first part get "r" appended if negated, and the sixes vowel in the fourth part gets "l" prepended similarly

seems pronounceable, and opens up what i call "clock form" usage (so named because 1:55 = "five to two") where you temporarily reduce to a -3,-2,…,2,3 digit set (note: 3 is only legal when followed by a negative or 0 remainder to the right, and -3 for positive, because rounding), which is a fairly simple process

then you can add with the advantages of a balanced base, but another advantage that most carries that do occur can get temporarily "soaked up" by ± 4,5

and also multiplication gets simplified as usual as well - you only need to learn 3 multiplication rules (2×2, 2×3, 3×3), plus multiplication by 0,1,-1, to multiply in clock form! and with waaay fewer carries throughout, and most carries tend to cancel rather than cascade, and can often be soaked up

for long division, it also halves the length of worst-case results, as the digit cycle hits a negated repetition and just follows the same pattern but negated

eg, in decimal, normally 1/7 = 0.'142857, now it's 0.'143000 - 0.'000143, which i write 0.'143'! and in general, 0.'blah' = blah/(10^x + 1), mirroring the usual blah/(10^x - 1); 143/1001 = 1/7, correctly

(to prove its utility, that's just how i remember sevenths now, converting it on the fly)

then you can just convert back to strict form (0,1,…,5) at the end when you're done :D effectively you frontloaded any carries you might do, which imo is way easier than actually doing carries, and thus made common operations significantly simpler

and both clock form and strict form have unique number representation (except for 0.'9 equivalents), so as long as you're working in one or the other you can still compare numbers for free like usual

oh, also you can use leading negatives for cases where you want to keep the positive direction consistent (eg, temperature)

this way, if you had -(54) degrees, you'd be able to instead use -102 as your canonical form - you can see at a glance it's exactly 100 less than 2 degrees, and your intuition for the difference between 2 and 4 degrees works identically for the difference between -102 and -104

this would be very handy if your absolute zero were a round number! (sorry i've been reading a few of your blog posts, just happened across them and enjoying them lots! can't really help but think about how this might work with degrees temp)

I like the idea of speaking digit pairs. However, I was wondering why it is done in a big-endian way? Is it hard to remember that benr is larger than gese?
