Roll.js, an Exact Dice-Simulation Library

Are you the sort of person who likes to play around with RPG or boardgame mechanics? Have you ever looked at some dice-based mechanic and wondered just what the outcome chances really were, but instead just gave up, rolled a couple of times, and relied on vibes? Have you tried using AnyDice.com but gave up when you saw you'd have to learn a funky DSL to do anything non-trivial? Do you know JavaScript?

If you answered yes to some of those questions, I've got a new library for you! Roll.js is an exact-results dice-simulation library - it doesn't do simulations of the "repeat 10k times and report the average", it tracks every outcome with its precise chance of occurring (up to floating-point accuracy, at least).
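
To give a flavor of what "exact results" means, here's a sketch of the underlying idea (just an illustration, not Roll.js's actual internals): a roll is a mapping from each possible outcome to its exact probability, and combining rolls combines those mappings directly instead of sampling.

// Sketch only: a distribution is a Map from outcome to probability.
function die(n) {
  const dist = new Map();
  for (let i = 1; i <= n; i++) dist.set(i, 1/n);
  return dist;
}
// Summing two dice multiplies probabilities along every pair of outcomes.
function sumDist(a, b) {
  const dist = new Map();
  for (const [x, px] of a)
    for (const [y, py] of b)
      dist.set(x + y, (dist.get(x + y) || 0) + px * py);
  return dist;
}
sumDist(die(6), die(6)).get(7); // ≈ 0.1667, i.e. exactly 6/36 (up to floating-point)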

The README explains the library itself, and there's a playground tool that'll let you write code against it immediately and even share the results with others! There are several bits of example code in the README, and several more in the playground (click the ? in the upper-right).

For example, a simple "d20 with advantage" roll is:

Roll.d20.advantage()

(playground link)

Want to know the average damage of a greatsword after the Great Weapon Fighting style is applied (re-roll 1s and 2s, a single time)?

Roll.nd(2, 6).replace(x=> x <= 2, Roll.d6).sum();

(playground link)

Wanna build a d5 out of a d6 by rerolling 6s until they stop coming up, as long as it takes?

// "reroll()" calls map on its returned Roll 
// results again, until things stabilize.
Roll.d6.reroll({
  map: x=> (x == 6) ? Roll.d6 : x
});

(playground link)

Wanna do something complicated, like figure out what the chances are of dying/stabilizing/reviving from a D&D death save?

Roll.d20.reroll({
	summarize(roll, oldSummary={}) {
		// 10+ is a success, below 10 is a failure,
		// a natural 1 counts as two failures,
		// and a natural 20 is an instant revival.
		return {
			successes:(roll>=10?1:0) + (oldSummary.successes || 0),
			failures:(roll<10?1:0) + (roll==1?1:0) + (oldSummary.failures || 0),
			nat20: (roll==20),
			toString() { return `${this.successes}/${this.failures}/${this.nat20}`; },
		}
	},
	map(summary) {
		// Returning a plain value ends the rerolling;
		// returning a Roll keeps it going.
		if(summary.nat20) return "revive";
		if(summary.successes >= 3) return "stabilize";
		if(summary.failures >= 3) return "die";
		return Roll.d20;
	}
})

(playground link)


Point is, this library can do a lot of stuff, pretty easily, and you can use the JS you already know to do arbitrarily complicated stuff with it. I originally wrote it because I was fed up with rewriting simulation code every time my brother and I were thinking about D&D homebrew; in particular, some code I wrote to test out an alternate death-save mechanic was getting too complicated. I figured I could just do it right, once, and (after several days of swearing at infinite-looping reroll code) I was correct!

At the moment the convenience functions (like .advantage()) are pretty biased towards D&D 5e usage, but I'm happy to add more for other dice systems. If you have any you'd like to see, let me know in a comment!

How To: Correctly Test for Python's Version

Python has four obvious built-in ways to retrieve its current version. If your software has a minimum version requirement, you'll use one of these methods to test whether the Python currently running your code is a high-enough version to do so successfully.

Three of these ways are worthless, actively-harmful traps. If you read nothing else, read this: use sys.version_info.

The four ways are:

  • sys.version_info is the good way, returning a tuple of numbers, one for each component of the version
  • sys.version is a terrible way, returning a string containing the full version identifier
  • platform.python_version() is another terrible way, also returning a string
  • platform.python_version_tuple() is a final and especially terrible way, returning A TUPLE OF GODDAM STRINGS FOR THE COMPONENTS

Deets

sys.version_info returns a tuple of numeric components. Well, it returns a namedtuple, but same deal. If I convert it to a plain tuple, on my system I get:

(3, 7, 12, 'final', 0)

Tuples in Python have nice sorting behavior; they'll compare against other tuples element-wise, with missing items considered as coming before present items. This means that if I write sys.version_info >= (3, 7), it'll be true - the 3 and 7 match, then the remaining components in sys.version_info mean it's slightly greater than the (3, 7) literal.

Importantly, because this uses element-wise comparison and the elements are (mostly) numbers, it won't fail for stupid reasons, like version 3.10 coming out. If you compare (3, 10) >= (3, 7) it correctly returns True, because 10 is greater than 7.
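
For example:

import sys

sys.version_info >= (3, 7)  # True on my 3.7.12 install
(3, 10) >= (3, 7)           # True - components compare as numbers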


platform.python_version_tuple() also returns a tuple of components, but each component is a string. Currently on my system it returns ('3', '7', '12').

If you compare it with a literal tuple, like platform.python_version_tuple() >= ('3', '7') (note that you have to write the literal with strings, or else it'll error), it'll appear to work - you'll get back True. If the function returns ('3', '6') or ('3', '8') it also appears to do the right thing, returning False and True respectively. Bump it to ('3', '9') and it's still True, as expected. But advance to ('3', '10') and it suddenly returns False, indicating that version 3.10 is less than version 3.7.

This is because string-wise comparison is not the same as numerical comparison. Strings compare letter-by-letter, and "1" (the first letter of "10") is indeed less than "7", and so fails the >= check.
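
You can check this directly:

('3', '10') >= ('3', '7')  # False! - compared string-wise
'10' < '7'                 # True - '1' sorts before '7'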

This is a very easy mistake for end-users to make. It's an unforgivable mistake for Python, the language, to make. Returning a tuple, indicating it's been parsed, but filling that tuple with strings when they know the components are numeric and will be used as if they were numbers, is just absolute clown-shoes behavior.


sys.version and platform.python_version() just straight up return strings. Once again, naive comparison seems to work: on my system sys.version returns '3.7.12 (default, Nov 11 2021, 11:50:43) \n[GCC 10.3.0]', and if you run sys.version >= '3.7', it returns True as expected.

Both of these fail when the version is 3.10, in the exact same way as platform.python_version_tuple().
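
For instance, on a hypothetical 3.10 install (version string abbreviated):

'3.10.4 (default, ...)' >= '3.7'  # False - '1' loses to '7' again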

Conclusion

Always, always, ALWAYS use sys.version_info to get the current Python version, for any purpose. Anything else will break your code for stupid reasons.

Imposing Names On The Earth

I recently used What3Words to give a friend my house's location, since our entire neighborhood is only distinguished by the second address line, and Google Maps doesn't actually know any of them. W3W can redirect to Maps with a lat/long which works correctly for directions.

Afterwards, I spent part of the weekend thinking, as one does, about the tradeoffs of the two major "make every part of Earth addressable" efforts: What3Words and Google's PlusCodes.

The two made several very interesting tradeoff decisions in opposite directions, and I'm not sure which I like better!

What3Words

W3W divides the world up into ~3m squares, and assigns each a 3-word name. For example, I am at this very moment sitting at my office desk, located in the square named ///jukebox.slang.leaned. You can pop that triplet at the end of "what3words.com/" and find my precise location.

The assignment algorithm is proprietary (that's how they make money), but it's designed to (a) make more populated areas use shorter words, and (b) make nearby areas have no relationship to each other.

That is, there is absolutely no relationship between the name of one square and the name of the next square over; they're effectively random.

The intentional benefit of this is that if you get the name slightly wrong, you'll get the address WAY wrong, and will hopefully notice that the directions to your friend's house shouldn't be directing you to another country. No "near misses" just putting you on a nearby street.

Obvious downside is the same, tho - errors will get the address way wrong, rather than nearly correct. It also means you can't generalize to larger areas; there's no concept of resolution larger (or smaller) than the 3m base square.

So you can't give an approximate location, only a fairly precise one. That said, on the scale that most ordinary people want, that's fine; we already have meaningful spatial demarcations that serve that purpose well (town/city names, plus maybe directional words or neighborhoods).

While the first version of W3W specifically used English words, they've since produced a large number of localized versions, so people can share addresses in their own language. They all use the same grid, so you can provide your location in multiple languages if you're expecting to serve a multilingual community.

PlusCodes

They're literally just a more usable way to express lat/long coordinates. Rather than base-10 (useful for math) they use base-20 (useful for memorization/communication), and interweave the digits so you get more precise in both dimensions as you go along.

That is, the first and second chars of a plus code are the high-order lat and long, naming a 20 degree square slice of the Earth. The third and fourth narrow that down to a 1 degree square (roughly 60 miles). Fifth and sixth cut it to a .05 degree square (roughly 3 miles). Etc. This goes on for 10 digits, at which point they switch to a different scheme that goes digit-by-digit, but the details don't super matter.

Between the 8th and 9th digit you use a + to break it up visually, like 6PH57VP3+PR6. You can also drop the first four digits if you have a city name or similar, like "7VP3+PR6 Singapore", giving you an easier-to-remember 7-digit name for an approximately 3m square area.
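
That's enough detail to sketch the scheme in code. This is a simplified, unofficial illustration - the real Open Location Code library also handles input clamping, the post-10-digit grid refinement, short-code recovery, and so on:

const ALPHABET = "23456789CFGHJMPQRVWX"; // base 20, no vowels

function encode(lat, lng) {
  // Shift into positive ranges: lat 0-180, lng 0-360.
  lat += 90;
  lng += 180;
  let code = "";
  let res = 20; // degrees covered by the first digit pair
  while (code.length < 11) { // 10 digits plus the "+"
    code += ALPHABET[Math.floor(lat / res)];
    code += ALPHABET[Math.floor(lng / res)];
    lat %= res;
    lng %= res;
    res /= 20;
    if (code.length == 8) code += "+";
  }
  return code;
}

encode(1.2868125, 103.8545625); // "6PH57VP3+PR" - the 10-digit Singapore code above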

Benefits here are

  • easy to compute, obviously, since it's just lat/long expressed in base 20 and interwoven. The encoding/decoding algos can be done from scratch in a hundred lines of code or less (the sketch above is most of it). No need to contract out decoding to a 3rd-party company.
  • the precision can be scaled up or down trivially by adding or removing digits. (If you remove digits before the +, you replace them with 0, which isn't otherwise a valid digit in a pluscode.) So you can indicate large areas (so long as they're reasonably approximated by the slice boundaries), or even much smaller, more precise areas.
  • if you forget, or just leave off, the last bit of the code, you'll still end up close by. Like, a neighborhood can generally be located without any digits past the +.

Big downside is that getting anything but the very end wrong can make the address quite wrong without it necessarily looking wrong. In other words, errors in the pluscode translate to vastly different scales of errors in the final location, depending on where the error occurs. The most obvious errors (pointing to an entirely wrong part of the world) generally aren't even possible since those digits are usually omitted in favor of specifying a city instead, so your errors will often still give a location that looks reasonable at first glance, but might be next door, next neighborhood over, or next city over.

Another downside is that the code spends the same amount of entropy on each location no matter how populated or middle-of-the-ocean it is. The one nod to this is being able to shorten the standard code from 11 digits to 7+city name (effectively giving human-friendly names to the 4-digit-prefix blocks), but past that there's no distinction between "populated city" and "wilderness", a distinction W3W does make.

Both Are Cool

So there are several tradeoffs on each side, several of which place the two on directly opposed sides. I dunno which I like best! Maybe both are great! I think that maybe they're best for different use-cases.

I think W3W does end up slightly better for human-relevant locations, such as in cities. The "any error is obviously way wrong" behavior seems like a benefit to me over the "error ranges from very slightly to very extremely wrong, depending on where it occurs" behavior of PlusCodes. And I like that populated areas are favored with shorter, more common, easier-to-spell words. (Compare my location I stated above, ///jukebox.slang.leaned, with an ocean square just a mile or so off the coast near me, ///virtuously.daylight.facsimiles.)

Also, when you are naming a larger area, like your house (rather than a very specific area like your front door) you have a goodly number of choices to pick from, and can choose one that's particularly memorable or appropriate. For example, I have about 20 squares covering my house, and chose a nice tech-related one (which I won't share, since my address isn't easily accessible public knowledge and I'd prefer to keep it that way if I can) that happens to land roughly in the middle of my living room.

On the other hand, scientific use-cases seem to be better served by PlusCodes. The variable slice sizes you can name mean you can address many different use-cases, like naming ranges or geographic areas, as well as very precise locations, and can get more precise than the 3m square too, just by appending more digits. The names also all look generically similar, rather than having the potential for humor, which can be both good and bad in professional settings. (The 20-character set used by PlusCodes was chosen both to reduce the chance of mistyping/mishearing them, and to minimize the chance of addresses being real words, rude or otherwise. For example, it's missing all the vowels.)

Anyway I just wanted to get some of these thoughts out, and to interest more people in either/both of these systems if they weren't already aware. If you read to this point, success!

Holistic Review of TC39 "Dataflow" Proposals

There are currently five "linear dataflow" proposals bouncing around TC39: the pipe operator `|>`, the bind-this `::` operator, the partial function application `foo~()` operator, the Extension operator `::`/`:`, and the pipe()/flow() functions. @js-choi put together a wonderful diagram comparing and contrasting their functionality overlaps, along with an accompanying blogpost going over things in more detail.

The major roadblock for the Pipe operator right now is a general unease from the committee about this zoo of similar proposals. Quite reasonably, they don't want to commit to several things that have significant overlap, but neither do they want to commit to a subset of them that ends up missing important use-cases covered by the rejected proposals.

My personal thoughts on the matter are that we need the Pipe operator, could use PFA and the pipe()/flow() methods, and don't need the other two (but do need a small thing similar to bind-this but not currently covered by it).

Pipe Operator

As the diagram helpfully illustrates, the pipe operator nearly subsumes every other proposal. In some cases (notably, anything involving this), pipe can be more verbose than the other proposals, but generally is still better than the status quo today. We've reached Stage 2 with Pipe for a reason, and when we resolve this "why am I holding so many limes??" problem with the surplus of related dataflow proposals, I don't think we'll have issues proceeding the rest of the way.

Partial Function Application

This can be thought of as a slightly shorter way to write an arrow function, when all it does is immediately call another function. This is useful on its own, of course - fn~(arg0, ?, arg2, ?) is imo easier to read than (a,b)=>fn(arg0, a, arg2, b).

It has some nice features, tho. For one, the non-placeholder args are fully evaluated immediately, rather than deferred until call-time. That is, fn~(expensiveOp(), ?) is not actually equal to x=>fn(expensiveOp(), x), which would re-run expensiveOp() every time the resulting function is called, but rather is equal to (a=>x=>fn(a, x))(expensiveOp()), pre-calculating expensiveOp() and reusing its result each time.

For two, it makes it very easy and simple to bind a method to its object. arr.map(obj.method) doesn't actually work, for two reasons: one, the method loses its this binding; two, it's called with three arguments in this circumstance. arr.map(obj.method~(?)), on the other hand, works perfectly - it retains its this binding and calls the method with a single argument, the first one passed, as if you'd written arr.map(x=>obj.method(x)).
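
To make that concrete, here's the failure in today's JS (obj is just a made-up example object):

const obj = {
  factor: 2,
  scale(x) { return x * this.factor; },
};
[1, 2, 3].map(obj.scale);         // broken - `this` is no longer obj
["1", "2", "3"].map(parseInt);    // [1, NaN, NaN] - map passes (elem, index, array)
[1, 2, 3].map(x => obj.scale(x)); // [2, 4, 6] - what obj.scale~(?) would give you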

Pipe can't meaningfully reproduce any of these. In some cases you can do some shenanigans to achieve the same thing, but it's always substantially more verbose and confusing. I think PFA has a useful role to play here.

Function.pipe() and Function.flow()

This is actually four methods on the Function object: pipe() and flow(), plus their pipeAsync() and flowAsync() variants.

These are nice, simple utility methods, for piping a value between unary functions, or composing a function out of unary functions that'll eventually pipe a value thru. (That is, pipe(val, fn1, fn2, fn3) and flow(fn1, fn2, fn3)(val) are identical, and both equal to fn3(fn2(fn1(val))).) The *Async() versions just handle async functions, automatically awaiting the return value of each function before calling the next, and finally returning a promise; pipeAsync(val, fn1, fn2, fn3) is equivalent to fn1(val).then(fn2).then(fn3).

These seem like reasonably useful utility functions. They're pretty trivial to define yourself, but as a built-in they have the benefit of letting TypeScript give them built-in typing information, which is useful since TS's current type-inference is actually too weak to type them fully correctly. Low-cost with moderate value, I'm fine with them going in.
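
As evidence of the "trivial to define yourself" point, here's a userland sketch of the behavior described above (not the proposal's exact spec text):

const pipe = (val, ...fns) => fns.reduce((acc, fn) => fn(acc), val);
const flow = (...fns) => val => pipe(val, ...fns);
const pipeAsync = (val, ...fns) =>
  fns.reduce((p, fn) => p.then(fn), Promise.resolve(val));
const flowAsync = (...fns) => val => pipeAsync(val, ...fns);

pipe(1, x => x + 1, x => x * 2); // 4
flow(x => x + 1, x => x * 2)(1); // also 4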

Bind-this Operator

This covers three distinct use-cases:

  • hard-binding a method to its object: obj::obj.meth (same as obj.meth~() in PFA, above, or obj.meth.bind(obj) in today's JS)
  • reusing a method from one object on another object: newObj::obj.meth (same as obj.meth.bind(newObj) in today's JS)
  • calling a "method-like" function on another object, effectively extending the set of methods without touching the object's prototype: import {meth} from "module.js"; obj::meth(arg) (same as meth.call(obj, arg) in today's JS)

The first use-case is very reasonable, but I think PFA does it better. In particular, you don't have to repeat the obj expression, which might actually be a long or expensive operation, like (await fetch(...)).json(). In PFA that's (await fetch(...)).json~(), but in bind-this you have to either do the fetch twice, like (await fetch(...))::(await fetch(...)).json() or use a temp var.

The second is just... rare. The vast, vast majority of methods are not written in a sufficiently generic fashion to be usable on objects of a different class. Most web platform builtins aren't, either, like the Map methods (they manipulate a hidden MapData attached to the object, rather than using some hookable API you can sub in other objects for). The main exception here is array-likes, because the Array methods are intentionally written to only depend on the .length property and the existence of number-named properties. The better solution for that, tho, is the iterator helpers being defined in another proposal, which work for all iterators.

An additional issue here is that it extracts the method directly from the object, meaning if you want a method from a particular class, you have to either construct an instance of the class, or explicitly invoke the prototype: obj::(new OtherObj()).meth or obj::OtherObj.prototype.meth. You usually don't notice this as a problem, since again Array is the main thing you want to use here, and [] is a short, cheap way to construct an Array instance. For anything other than Array, tho, this is a problem.

The third is where things get problematic. The problem here is that it requires you to, up-front, decide whether a given function that takes an Array (or whatever) is going to be written in this pseudo-method style, which is only convenient to call with this syntax, or as a normal function that takes the Array (or whatever) as an argument, which is only convenient to call as a normal function. This forks the ecosystem of module functions; authors have to decide up front how they want their library to be used, and users need to know which calling convention each module expects, so they know whether to write mod.func(arg0, arg1) or arg0::mod.func(arg1). Pipe already solves this without forking - just write a normal function, and call it pipe-wise if you want: mod.func(arg0, arg1) or arg0 |> mod.func(##, arg1), whichever makes more sense for your code at that point. No need for library authors to make any decision, they can just write their functions normally, and authors can treat all libraries the same.

So overall, bind-this does three things. One is done better by PFA, one is rare and mostly done better by the iterator helpers proposal, and one shouldn't be done imo.

Extensions

The Extensions proposal plays in a similar space to bind-this. It has four major use-cases:

  • call a class's method on a different object: obj::Class:method(arg0) (same as Class.prototype.method.call(obj, arg0) in today's JS)
  • call a non-method function (usually, a function in a module) on a different object, effectively adding methods to an object without touching the object's prototype: obj::mod:func(arg0) (same as mod.func(obj, arg0) in today's JS, or obj |> mod.func(##, arg0) in pipe)
  • calling a "method-like" function on another object, effectively adding methods to an object without touching the object's prototype: import {meth} from "mod.js"; obj::meth()
  • use a method descriptor on an object (again, usually from a module), effectively adding getters and setters to the object without having to touch the prototype: obj::fakeGetter or obj::fakeSetter = 1.

The first covers the same ground as the second use-case from Bind-this; aside from Array, this is just very rarely useful. It does at least solve the problem that bind-this has for extracting methods from anything other than Array, since it specially handles classes themselves. But still, rarely actually useful.

The second use-case is reasonable, and avoids the problems that bind-this has; you can just write an ordinary function, with the only constraint being that it takes its "most significant" argument first. No significant ecosystem forking. However, it does require that you keep the function stored in something; you have to call it like import * as mod from "mod.js"; obj::mod:meth(arg), not import {meth} from "mod.js"; obj::meth(arg);. In addition, it's fully redundant with pipe; obj |> mod.meth(##, arg) works just as well with the same definitions, and is more flexible since you can sub in different args if that's what your code wants in a particular instance, or import the "method" directly rather than having to keep it stored in a namespacing object.

The third and fourth use-cases are where it gets weird. This is in fact the main use-case for Extensions afaict, which it was built around.

The third brings back the ecosystem-forking concerns of bind-this, plus some new issues where you actually have a brand new namespace to implement the custom callable. This is very weird and has gotten pushback from the committee, rightfully so imo. It's even worse than bind-this, since you can't even use .bind()/etc to call it "normally", due to the namespace stuff.

The fourth is something genuinely new, and without too much weirdness - it just uses descriptor objects, aka objects with .get() and/or .set() methods. This is why the second use-case requires the function to be on an object, fwiw, since the two syntaxes overlap/conflict if you could just use the function directly (it might have those methods for unrelated reasons, and we really don't like duck-typing things if we can avoid it; the way await fucks up classes that happen to have a .then() method is bad enough already). It's still weird, tho - we don't expose authors to descriptors generally. There's ecosystem forking, since any getter/setter could instead be written as a normal function - obj::size vs size(obj) or obj |> size(##) - and users need to know how the author decided to write things before they can use it, but it's more minor since it's limited to things that can be expressed as a getter or setter.

A nice thing is that it works with assignment syntax - obj::foo = 1 works (assuming foo is like {set(val){...}}). However, it works exactly like getters/setters, which is a problem, because a major use-case for customizing setting (working with functional data structures) isn't workable with the standard getter/setter operation. In obj.foo.bar.baz = 1, if all of those are getter/setter pairs, then JS will call the "foo" and "bar" getters and the "baz" setter. That's fine as long as the assignment is mutating, but a functional data structure needs it to then call the "bar" and "foo" setters as well, as each setter will return a fresh object rather than mutating the existing one. It would be very sad to do more work in the getter/setter space without making this work "correctly" for these use-cases.
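
To make the functional-data-structure case concrete, in plain today's JS (made-up objects):

// Mutating assignment: only the innermost write has to do anything.
const obj = { foo: { bar: { baz: 0 } } };
obj.foo.bar.baz = 1;

// Functional update: every enclosing level gets re-created too, which is
// exactly the step that standard getter/setter semantics give you no hook for.
const updated = {
  ...obj,
  foo: { ...obj.foo, bar: { ...obj.foo.bar, baz: 1 } },
};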

Overall, Extensions does some interesting things, but mostly overlaps heavily with bind-this in both functionality and problems. I don't see it as justifying its weight, at least in its current form.

Nota Bene: Reliable Method Calling

A small but important use-case of bind-this/extensions that I didn't actually cover, but which was brought up to me in response to this essay, is reliably calling methods in the face of possibly hostile/bad code. For example, there's an Object.prototype.toString() method that's often used as a simple form of brand-checking. If a class provides its own toString, tho, it overrides this. Alternately, a class can have a null prototype (via extends null), and thus not have access to the Object.prototype methods at all. So today, if you want to reliably call Object.prototype.toString on a value, you have to do it manually/verbosely, as Object.prototype.toString.call(obj), which is admittedly pretty bad. This sort of thing is in fact very common in utility/library code; see this review of a bunch of JS showing how common `.call()` usage is.

Solving this well doesn't require the full power of bind-this or Extensions; you can get away with any "uncurry this"-type operator. That is, you could have an operator (or unforgeable function via built-in modules) that automatically turns a method into an ordinary function, taking its this as its first param instead, like:

const toString = ::Object.prototype.toString;
const brand = toString(obj);

const slice = ::Array.prototype.slice;
const skipFirst = slice(arrayLike, 1);

// or as a function
import {uncurryThis} from std;
const slice = uncurryThis(Array.prototype.slice);
const skipFirst = slice(arrayLike, 1);
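
For what it's worth, the function form is already definable in today's JS (a sketch; not necessarily how a built-in std version would be specified):

const uncurryThis = fn => (thisArg, ...args) => fn.apply(thisArg, args);

const slice = uncurryThis(Array.prototype.slice);
slice({length: 2, 0: "a", 1: "b"}, 1); // ["b"]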

Alternately, we could present it as a call operator (similar to PFA's ~() operator) that changes a this-using function to instead take it as its first argument, like:

const toString = Object.prototype.toString;
const brand = toString::(obj);

const slice = Array.prototype.slice;
const skipFirst = slice::(arrayLike, 1);

// more convenient for doing multiple extractions, too
const {slice, splice, map, flatMap} = Array.prototype;
const mapped = map::(arrayLike, x=>x+1);

Either of these solves this specific, reasonable use-case well, and does so in a way that does not overlap with the other dataflow proposals. In particular, they work great with the pipe operator, like arrayLike |> slice::(##, 1), to achieve the normal "looks like methods" syntax that pipe brings to other functions.

They also avoid the ecosystem-forking concerns that the larger bind-this and Extensions proposals have; these are a slightly less convenient way to call a function, useful because they serve a specific purpose, but not suitable for general authoring style. That is, no one would write a library with a bunch of this-using functions, intending for them to be called this way. meth::(obj, arg) is just less convenient than writing a normal function and doing meth(obj, arg).

Between the two, I slightly prefer the call operator. It's easier to use when extracting methods en-masse from a prototype, and I suspect it's somewhat more performant, since it doesn't create any new function objects, just provides an alternate way to call a method with a specific this value, and so should be similar in cost to a normal method call. It's also more amenable to ad-hoc method extraction to call on a different object; Array.prototype.slice::(arrayLike, 1) vs (::Array.prototype.slice)(arrayLike, 1). I also think the operator precedence is a little clearer with the call-operator; there's no need to carefully engineer precedence to cover the correct amount of expression forward, you just call whatever's on the LHS with the RHS arglist, using identical precedence to a normal call expr.

Final Conclusion

Pipe is a general-purpose operator that, in a lightweight, minimally magic, and easily understandable fashion, handles most dataflow-linearization issues with aplomb. PFA covers a valuable area that pipe doesn't handle very well, plus some additional space that pipe doesn't cover at all. Bind-this and Extensions both cover space that pipe can handle, albeit usually with more verbosity, but bring their own significant new issues, and mostly are covering space that doesn't need solving, imo. The pipe()/flow() methods seem useful and unproblematic. I think there's a useful space left for a limited "uncurry-this" operator.

MyPy and Dict-Like get() Methods

I'm hacking on something for my `kdl-py` project, and realized I wanted a .get() method to retrieve child nodes by name, similar to a dict. I had some trouble working out how to set it up to type properly in MyPy, so here's a short summary of my results, which I'm quite happy with now.

(Disclaimer: while working on this, I was laboring under the misapprehension that dict.get("foo") raised an exception if "foo" wasn't in the dict, but that's only the behavior for dict["foo"]! dict.get("foo") will instead always return the default value if the key is missing, which just defaults to None, which is a much simpler behavior, ugh.)

So, the problem I was having is that I wanted an optional argument (the default value, to be returned if the node name couldn't be found), and I wanted to tell whether that argument was passed at all. Any value is valid for the default, so I can't rely on a standard sentinel value, like None.

One way to do this is with kwargs shenanigans (leaving the argument out of the arglist, using a **kwargs arg instead, and just checking if it shows up in there), but that's awkward at the best of times, and doesn't let you typecheck well (can't indicate that the call might return the default value, type-wise).

The usual way to do this in JS, which doesn't have a kwargs equivalent, is instead to set the default value to some unique object value that's not exposed to the outside, and see if it's still equal to that value. Since the outside world doesn't have access to that value, you can be sure that if you see it, the argument wasn't passed at all.

This is how I ended up going. Here's the final code:

import typing as t

# Private sentinel class. Outside code can't get at an instance of this,
# so if the default is still a _MISSING, no default was passed at all.
class _MISSING:
  pass

T = t.TypeVar('T')

class NodeList:
  # Called without a default: always returns a Node (or throws).
  @t.overload
  def get(self, key: str) -> Node:
    ...

  # Called with a default: returns a Node, or whatever the default was.
  @t.overload
  def get(
    self,
    key: str,
    default: t.Union[T, _MISSING] = _MISSING(),
  ) -> t.Union[Node, T]:
    ...

  def get(
    self,
    key: str,
    default: t.Union[T, _MISSING] = _MISSING(),
  ) -> t.Union[Node, T]:
    if self.data.has(key):
      return self.data.get(key)
    if isinstance(default, _MISSING):
      # No default was passed, so a missing key is an error.
      raise KeyError(f"No node with name '{key}' found.")
    else:
      return default

Boom, there you go. Now if you call nl.get("foo"), the return type is definitely Node, so you don't have to do a None check to satisfy MyPy (it'll just throw if you screw up), but it'll correctly type as "Node or whatever your default is" when you do pass a default value.
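
For example, assuming nl is a NodeList (the reveal_type() calls are mypy-only, for checking the inference):

node = nl.get("foo")
reveal_type(node)      # mypy: Node - no Optional-check needed

maybe = nl.get("foo", None)
reveal_type(maybe)     # mypy: Union[Node, None]

fallback = nl.get("foo", "not found")
reveal_type(fallback)  # mypy: Union[Node, str]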