MyPy and Dict-Like get() Methods


I'm hacking on something for my `kdl-py` project, and realized I wanted a .get() method to retrieve child nodes by name, similar to a dict. I had some trouble working out how to set it up to type properly in MyPy, so here's a short summary of my results, which I'm quite happy with now.

(Disclaimer: while working on this, I was laboring under the misapprehension that dict.get("foo") raises an exception if "foo" isn't in the dict, but that's only the behavior of dict["foo"]! dict.get("foo") instead always returns the default value if the key is missing, and that default is just None unless you pass one. That's a much simpler behavior, ugh.)
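For reference, here's a quick sketch of that difference using a plain dict. The `nodes` name and its contents are made up for the example.

```python
nodes = {"foo": 1}

# Subscript access raises KeyError for a missing key.
try:
    nodes["bar"]
    raised = False
except KeyError:
    raised = True

# .get() never raises; it returns the default, which is None unless you pass one.
missing_no_default = nodes.get("bar")
missing_with_default = nodes.get("bar", 0)
```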

So, the problem I was having is that I wanted an optional argument (the default value, to be returned if the node name couldn't be found), and I wanted to tell whether that argument was passed at all. Any value is valid for the default, so I can't rely on a standard sentinel value, like None.

One way to do this is with kwargs shenanigans (leaving the argument out of the arglist, using a **kwargs arg instead, and just checking if it shows up in there), but that's awkward at the best of times, and doesn't let you typecheck well (can't indicate that the call might return the default value, type-wise).
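For illustration, the kwargs version might look roughly like this. It's a sketch only: `get` is written as a free function and `data` is a hypothetical dict standing in for the node storage.

```python
def get(data: dict, key: str, **kwargs: object) -> object:
    """Hypothetical stand-in for the method; `data` plays the role of the node storage."""
    if key in data:
        return data[key]
    # The default counts as "passed" only if the caller actually supplied it.
    if "default" in kwargs:
        return kwargs["default"]
    raise KeyError(f"No node with name '{key}' found.")
```

Note that the return type has to be something vague like `object`, and a misspelled keyword is silently swallowed by `**kwargs` rather than rejected.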

The usual way to do this in JS, which doesn't have a kwargs equivalent, is instead to set the default value to some unique object value that's not exposed to the outside, and see if it's still equal to that value. Since the outside world doesn't have access to that value, you can be sure that if you see it, the argument wasn't passed at all.

This is the route I ended up taking. Here's the final code:

import typing as t

class _MISSING:
  pass

T = t.TypeVar('T')

class NodeList:
  @t.overload
  def get(self, key: str) -> Node:
    ...

  @t.overload
  def get(
    self, 
    key: str, 
    default: t.Union[T, _MISSING] = _MISSING(),
  ) -> t.Union[Node, T]:
    ...

  def get(
    self,
    key: str,
    default: t.Union[T, _MISSING] = _MISSING(),
  ) -> t.Union[Node, T]:
    if key in self.data:  # assuming self.data is a dict of name -> Node
      return self.data[key]
    if isinstance(default, _MISSING):
      raise KeyError(f"No node with name '{key}' found.")
    else:
      return default

Boom, there you go. Now if you call nl.get("foo"), the return type is definitely Node, so you don't have to do a None check to satisfy MyPy (it'll just raise a KeyError if you screw up), but it'll correctly type as "Node or whatever your default is" when you do pass a default value.
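If you want something you can paste into a file and run, here's a self-contained sketch of the same idea. `Node` and the dict-backed `data` attribute are stand-ins I've assumed for the example, not the real kdl-py classes, and the second overload here takes a required `default: T`, which is one common way to write it.

```python
import typing as t

class _MISSING:
    pass

T = t.TypeVar("T")

class Node:
    """Stand-in for the real node class (an assumption for this sketch)."""
    def __init__(self, name: str) -> None:
        self.name = name

class NodeList:
    def __init__(self, nodes: t.Iterable[Node]) -> None:
        # Assumed storage: a plain dict from node name to node.
        self.data: t.Dict[str, Node] = {n.name: n for n in nodes}

    @t.overload
    def get(self, key: str) -> Node: ...

    @t.overload
    def get(self, key: str, default: T) -> t.Union[Node, T]: ...

    def get(
        self,
        key: str,
        default: t.Union[T, _MISSING] = _MISSING(),
    ) -> t.Union[Node, T]:
        if key in self.data:
            return self.data[key]
        # No default supplied: behave like dict[key] and raise.
        if isinstance(default, _MISSING):
            raise KeyError(f"No node with name '{key}' found.")
        return default

nl = NodeList([Node("foo")])
found = nl.get("foo")           # a checker should see this as Node
fallback = nl.get("bar", None)  # a checker should see this as Node or None
```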




#2 - FeRD (Frank Dana):

Well, darn. So much for "you can use any of the block-level elements except headings" in the comment MarkDown. (I know, I know, \`\`\`{lang} is a GitHub-Flavored extension, I shouldn't have expected it to work...)


#3 - FeRD (Frank Dana):

Let's try that comment again, without ruining the code blocks:

This works, and is in fact close to the Python norms.

The "typical" way to do this in Python (and actually documented... somewhere) is to use a specific object as the default, just as you did. But since the stdlib both predates and doesn't concern itself with type-checking, usually it's just done with an instance of object. You'll see things like this scattered around the Python stdlib modules:

```
_sentinel = object()

def get(self, key, default=_sentinel):
    # ...
    if default == _sentinel:
        raise KeyError()
```

That won't make MyPy happy, but my bigger issue with it has always been how badly it self-documents:

```
>>> help(NodeList.get)

get(self, key, default=<object object at 0x7f82437787b0>)
```

Ugh.

Your version comes out somewhat better, as it at least displays the class name:

```
>>> help(NodeList.get)

get(self, key: str, default: Union[~T, __main__._MISSING] = <__main__._MISSING object at 0x7f48107c6ae0>) -> Union[__main__.Node, ~T]
```

You can improve on that somewhat by giving _MISSING a __repr__, which will be used:

```
class _MISSING:
    def __repr__(self):
        return "_MISSING"
```

```
>>> help(NodeList.get)

get(self, key: str, default: Union[~T, __main__._MISSING] = _MISSING) -> Union[__main__.Node, ~T]
```
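A quick way to check this without reading help() output is `inspect.signature`, which renders defaults with their repr. This is a small sketch; the free function `get` is a hypothetical stand-in for `NodeList.get`.

```python
import inspect

class _MISSING:
    def __repr__(self) -> str:
        return "_MISSING"

def get(key, default=_MISSING()):
    """Hypothetical free function standing in for NodeList.get."""
    if isinstance(default, _MISSING):
        raise KeyError(key)
    return default

# The default is rendered via its repr, so the custom __repr__ shows up here.
signature_text = str(inspect.signature(get))
```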

I've long advocated for using bare classes themselves (not instances of a class) as sentinel defaults. Types are objects in Python, they can be compared with ==, and they self-document even without the __repr__:

```
class _MISSING:
    pass

class NodeList:
    # ...
    def get(self, key, default=_MISSING):
        # ...
        if default == _MISSING:
            raise KeyError()
```

```
>>> help(NodeList.get)

get(self, key, default=<class '__main__._MISSING'>)
```

...But I confess I'm not sure how to make that palatable to MyPy.
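One possible alternative, which is not the bare-class approach but a different technique: PEP 484 describes using a single-member Enum as a sentinel, since type checkers can narrow on an `is` check against the member. Here's a rough sketch under that assumption. The `_Missing` enum, the string "nodes", and the dict-backed `data` are all made up for illustration.

```python
import enum
import typing as t

T = t.TypeVar("T")

class _Missing(enum.Enum):
    # A single-member enum: the member itself is the sentinel.
    MISSING = enum.auto()

class NodeList:
    def __init__(self, data: t.Dict[str, str]) -> None:
        # Assumed storage for this sketch: node name -> node (plain strings here).
        self.data = data

    @t.overload
    def get(self, key: str) -> str: ...

    @t.overload
    def get(self, key: str, default: T) -> t.Union[str, T]: ...

    def get(
        self,
        key: str,
        default: t.Union[T, _Missing] = _Missing.MISSING,
    ) -> t.Union[str, T]:
        if key in self.data:
            return self.data[key]
        # Identity check against the enum member; checkers can narrow on this.
        if default is _Missing.MISSING:
            raise KeyError(key)
        return default

nl = NodeList({"foo": "foo-node"})
```

The default then shows up in help() as the enum member's repr, which at least names the sentinel.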


Lol, sorry, the Markdown impl I wrote here doesn't recognize backtick blocks at all, just tilde blocks (which were what was supported by some previous Markdown parser I'd used). I really need to update it to support them, since backtick blocks are massively more common (even if CommonMark specifies that both should work).
