Holistic Review of TC39 "Dataflow" Proposals

Last updated:

There are currently five "linear dataflow" proposals bouncing around TC39: the pipe operator `|>`, the bind-this `::` operator, the partial function application `foo~()` operator, the Extensions operator `::`/`:`, and the pipe()/flow() functions. @js-choi put together a wonderful diagram comparing and contrasting their functionality overlaps, along with an accompanying blogpost going over things in more detail.

The major roadblock for the Pipe operator right now is a general unease from the committee about this zoo of similar proposals. Quite reasonably, they don't want to commit to several things that have significant overlap, but neither do they want to commit to a subset of them that ends up missing important use-cases covered by the rejected proposals.

My personal thoughts on the matter are that we need the Pipe operator, could use PFA and the pipe()/flow() methods, and don't need the other two (but do need a small thing similar to bind-this but not currently covered by it).

Pipe Operator

As the diagram helpfully illustrates, the pipe operator nearly subsumes every other proposal. In some cases (notably, anything involving this), pipe can be more verbose than the other proposals, but generally is still better than the status quo today. We've reached Stage 2 with Pipe for a reason, and when we resolve this "why am I holding so many limes??" problem with the surplus of related dataflow proposals, I don't think we'll have issues proceeding the rest of the way.

Partial Function Application

This can be thought of as a slightly shorter way to write an arrow function, when all it does is immediately call another function. This is useful on its own, of course - fn~(arg0, ?, arg2, ?) is imo easier to read than (a,b)=>fn(arg0, a, arg2, b).

It has some nice features, tho. For one, the non-placeholder args are fully evaluated immediately, rather than deferred until call-time. That is, fn~(expensiveOp(), ?) is not actually equal to x=>fn(expensiveOp(), x), which'll call expensiveOp() every time fn() is called, but rather is equal to (a=>x=>fn(a, x))(expensiveOp()), pre-calculating expensiveOp() and reusing its result each time.
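
The timing difference can be checked in today's JS, since PFA syntax itself can't be run yet. A minimal sketch; fn, expensiveOp, and the call counter are made-up names for illustration:

```javascript
// Hypothetical stand-ins: count how often the "expensive" work runs.
let calls = 0;
const expensiveOp = () => { calls += 1; return 10; };
const fn = (a, b) => a + b;

// Plain arrow function: expensiveOp() is re-run on every call.
const lazy = x => fn(expensiveOp(), x);
lazy(1);
lazy(2);
const lazyCalls = calls; // 2

// The desugaring described above: expensiveOp() runs once, up front.
calls = 0;
const eager = (a => x => fn(a, x))(expensiveOp());
const eagerResults = [eager(1), eager(2)]; // [11, 12]
const eagerCalls = calls; // 1
```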

For two, it makes it very easy and simple to bind a method to its object. arr.map(obj.method) doesn't actually work, for two reasons: one, the method loses its this binding; two, it's called with three arguments in this circumstance. arr.map(obj.method~(?)), on the other hand, works perfectly - it retains its this binding and calls the method with a single argument, the first one passed, as if you'd written arr.map(x=>obj.method(x)).
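
Both failure modes are easy to reproduce in today's JS. A small sketch, where the Scaler class is a hypothetical example (class bodies are strict mode, so a detached method sees this as undefined):

```javascript
// Hypothetical class for illustration.
class Scaler {
  constructor(factor) { this.factor = factor; }
  scale(x) { return x * this.factor; }
}
const obj = new Scaler(2);
const arr = [1, 2, 3];

// Problem one: the bare method reference loses its `this` binding.
let lostThis = false;
try {
  arr.map(obj.scale);
} catch (e) {
  lostThis = e instanceof TypeError;
}

// Problem two: map() passes (element, index, array), so parseInt
// receives the index as its radix.
const parsedBare = ["1", "2", "3"].map(parseInt); // [1, NaN, NaN]

// The arrow-function wrapper fixes both.
const scaled = arr.map(x => obj.scale(x)); // [2, 4, 6]
const parsedWrapped = ["1", "2", "3"].map(s => parseInt(s)); // [1, 2, 3]
```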

Pipe can't meaningfully reproduce any of these. In some cases you can do some shenanigans to achieve the same thing, but it's always substantially more verbose and confusing. I think PFA has a useful role to play here.

Function.pipe() and Function.flow()

This is actually four methods on the Function object: pipe() and flow(), along with pipeAsync() and flowAsync().

These are nice, simple utility methods, for piping a value between unary functions, or composing a function out of unary functions that'll eventually pipe a value thru. (That is, pipe(val, fn1, fn2, fn3) and flow(fn1, fn2, fn3)(val) are identical, and both equal to fn3(fn2(fn1(val))).) The *Async() versions just handle async functions, automatically awaiting the return value of each function before calling the next, and finally returning a promise; pipeAsync(val, fn1, fn2, fn3) is roughly equivalent to fn1(val).then(fn2).then(fn3), assuming fn1 returns a promise.
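
To show how little machinery is involved, here's a userland sketch of all four. This is my own approximation, not the proposal's spec text, and inc/double are throwaway helpers:

```javascript
// Userland sketches, not the proposal's actual definitions.
const pipe = (val, ...fns) => fns.reduce((acc, fn) => fn(acc), val);
const flow = (...fns) => val => pipe(val, ...fns);

// Async variants: await each result before calling the next function.
const pipeAsync = async (val, ...fns) => {
  let acc = val;
  for (const fn of fns) acc = await fn(acc);
  return acc;
};
const flowAsync = (...fns) => val => pipeAsync(val, ...fns);

const inc = x => x + 1;
const double = x => x * 2;

const piped = pipe(3, inc, double);   // double(inc(3)) === 8
const flowed = flow(inc, double)(3);  // same result
const pending = pipeAsync(3, async x => x + 1, double); // a promise for 8
```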

These seem like reasonably useful utility functions. They're pretty trivial to define yourself, but as a built-in they have the benefit of letting Typescript give them built-in typing information, which is useful since TS's current type-inference is actually too weak to type them fully correctly. Low-cost with moderate value, I'm fine with them going in.

Bind-this Operator

This covers three distinct use-cases:

  • hard-binding a method to its object: obj::obj.meth (same as obj.meth~() in PFA, above, or obj.meth.bind(obj) in today's JS)
  • reusing a method from one object on another object: newObj::obj.meth (same as obj.meth.bind(newObj) in today's JS)
  • calling a "method-like" function on another object, effectively extending the set of methods without touching the object's prototype: import {meth} from "module.js"; obj::meth(arg) (same as meth.call(obj, arg) in today's JS)
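
For reference, the three "today's JS" spellings side by side. The objects and the greet function are hypothetical, just enough to make the calls observable:

```javascript
// Hypothetical objects for illustration.
const obj = { name: "obj", meth() { return this.name; } };
const newObj = { name: "newObj" };

// 1. Hard-bind a method to its own object.
const bound = obj.meth.bind(obj);
const r1 = bound(); // "obj"

// 2. Reuse a method from one object on another.
const rebound = obj.meth.bind(newObj);
const r2 = rebound(); // "newObj"

// 3. Call a free-standing "method-like" function on an object.
function greet(greeting) { return `${greeting}, ${this.name}`; }
const r3 = greet.call(obj, "hi"); // "hi, obj"
```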

The first use-case is very reasonable, but I think PFA does it better. In particular, you don't have to repeat the obj expression, which might actually be a long or expensive operation, like (await fetch(...)).json(). In PFA that's (await fetch(...)).json~(), but in bind-this you have to either do the fetch twice, like (await fetch(...))::(await fetch(...)).json(), or use a temp var.

The second is just... rare. The vast, vast majority of methods are not written in a sufficiently generic fashion to be usable on objects of a different class. Most web platform builtins aren't, either, like the Map methods (they manipulate a hidden MapData attached to the object, rather than using some hookable API you can sub in other objects for). The main exception here is array-likes, because the Array methods are intentionally written to only depend on the .length property and the existence of number-named properties. The better solution for that, tho, is the iterator helpers that are being defined in another proposal, which work for all iterators.
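
The contrast between generic Array methods and non-generic Map methods can be seen directly. A quick sketch with a made-up array-like:

```javascript
// A plain array-like: just .length and number-named properties.
const arrayLike = { length: 3, 0: "a", 1: "b", 2: "c" };

// Array methods are generic, so borrowing them works.
const upper = Array.prototype.map.call(arrayLike, s => s.toUpperCase());
// ["A", "B", "C"]

// Map methods require the internal map data, so borrowing fails.
let mapBorrowFailed = false;
try {
  Map.prototype.get.call({ key: "value" }, "key");
} catch (e) {
  mapBorrowFailed = e instanceof TypeError;
}
```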

An additional issue here is that it extracts the method directly from the object, meaning if you want a method from a particular class, you have to either construct an instance of the class, or explicitly invoke the prototype: obj::(new OtherObj()).meth or obj::OtherObj.prototype.meth. You usually don't notice this as a problem, since again Array is the main thing you want to use here, and [] is a short, cheap way to construct an Array instance. For anything other than Array, tho, this is a problem.

The third is where things get problematic. The problem here is that it requires you to, up-front, decide whether a given function that takes an Array (or whatever) is going to be written in this pseudo-method style, which is only convenient to call with this syntax, or as a normal function that takes the Array (or whatever) as an argument, which is only convenient to call as a normal function. This forks the ecosystem of module functions; authors have to decide up front how they want their library to be used, and users need to know which calling convention each module expects, so they know whether to write mod.func(arg0, arg1) or arg0::mod.func(arg1). Pipe already solves this without forking - just write a normal function, and call it pipe-wise if you want: mod.func(arg0, arg1) or arg0 |> mod.func(##, arg1), whichever makes more sense for your code at that point. No need for library authors to make any decision; they can just write their functions normally, and users can treat all libraries the same.
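
The fork is visible even in today's JS. A sketch with two hypothetical libraries that ship the same helper in the two styles (the "use strict" directive is only there to make the mismatch fail deterministically):

```javascript
// Two hypothetical libraries offering the same "last element" helper.

// Library A: pseudo-method style, expects the array as `this`.
function lastThis() {
  "use strict";
  return this[this.length - 1];
}

// Library B: a normal function, takes the array as an argument.
const lastArg = arr => arr[arr.length - 1];

const data = [1, 2, 3];

// Each helper is only convenient in its own calling convention.
const a = lastThis.call(data); // data::lastThis() under bind-this
const b = lastArg(data);       // data |> lastArg(##) under pipe

// Using the wrong convention fails outright.
let mismatchFailed = false;
try {
  lastThis(data); // `this` is undefined in strict mode
} catch (e) {
  mismatchFailed = e instanceof TypeError;
}
```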

So overall, bind-this does three things. One is done better by PFA, one is rare and mostly done better by the iterator helpers proposal, and one shouldn't be done imo.

Extensions

The Extensions proposal plays in a similar space to bind-this. It has four major use-cases:

  • call a class's method on a different object: obj::Class:method(arg0) (same as Class.prototype.method.call(obj, arg0) in today's JS)
  • call a non-method function (usually, a function in a module) on a different object, effectively adding methods to an object without touching the object's prototype: obj::mod:func(arg0) (same as mod.func(obj, arg0) in today's JS, or obj |> mod.func(##, arg0) in pipe)
  • call a "method-like" function on another object, effectively adding methods to an object without touching the object's prototype: import {meth} from "mod.js"; obj::meth()
  • use a method descriptor on an object (again, usually from a module), effectively adding getters and setters to the object without having to touch the prototype: obj::fakeGetter or obj::fakeSetter = 1.
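
Here's roughly what three of these look like in today's JS. This is a sketch under assumptions: mod and size are hypothetical, and the descriptor case assumes the getter is invoked with obj as its this, which is my reading rather than something spelled out here. The "method-like" function case is skipped because it relies on the proposal's own namespace.

```javascript
const arrayLike = { length: 3, 0: "a", 1: "b", 2: "c" };

// obj::Class:method(arg0): borrow a class's method.
const hasB = Array.prototype.includes.call(arrayLike, "b"); // true

// obj::mod:func(arg0): an ordinary function that takes obj first.
// `mod` is a hypothetical stand-in for `import * as mod from "mod.js"`.
const mod = { first: (list, n) => Array.prototype.slice.call(list, 0, n) };
const firstTwo = mod.first(arrayLike, 2); // ["a", "b"]

// obj::size: a descriptor-style object with get().
// ASSUMPTION: the getter is invoked with obj as `this`.
const size = { get() { return this.length; } };
const n = size.get.call(arrayLike); // 3
```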

The first covers the same ground as the second use-case from Bind-this; aside from Array, this is just very rarely useful. It does at least solve the problem that bind-this has for extracting methods from anything other than Array, since it specially handles classes themselves. But still, rarely actually useful.

The second use-case is reasonable, and avoids the problems that bind-this has; you can just write an ordinary function, with the only constraint being that it takes its "most significant" argument first. No significant ecosystem forking. However, it does require that you keep the function stored in something; you have to call it like import * as mod from "mod.js"; obj::mod:meth(arg), not import {meth} from "mod.js"; obj::meth(arg);. In addition, it's fully redundant with pipe; obj |> mod.meth(##, arg) works just as well with the same definitions, and is more flexible since you can sub in different args if that's what your code wants in a particular instance, or import the "method" directly rather than having to keep it stored in a namespacing object.

The third and fourth use-cases are where it gets weird. These are in fact the main use-cases for Extensions afaict, the ones it was built around.

The third brings back the ecosystem-forking concerns of bind-this, plus some new issues where you actually have a brand new namespace to implement the custom callable. This is very weird and has gotten pushback from the committee, rightfully so imo. It's even worse than bind-this, since you can't even use .bind()/etc to call it "normally", due to the namespace stuff.

The fourth is something genuinely new, and without too much weirdness - it just uses descriptor objects, aka objects with .get() and/or .set() methods. This is why the second use-case requires the function to be on an object, fwiw, since the two syntaxes overlap/conflict if you could just use the function directly (it might have those methods for unrelated reasons, and we really don't like duck-typing things if we can avoid it; the way await fucks up classes that happen to have a .then() method is bad enough already). It's still weird, tho - we don't expose authors to descriptors generally. There's ecosystem forking, since any getter/setter could instead be written as a normal function - obj::size vs size(obj) or obj |> size(##) - and users need to know how the author decided to write things before they can use it, but it's more minor since it's limited to things that can be expressed as a getter or setter.

A nice thing is that it works with assignment syntax - obj::foo = 1 works (assuming foo is like {set(val){...}}). However, it works exactly like getters/setters, which is a problem, because a major use-case for customizing setting (working with functional data structures) isn't workable with the standard getter/setter operation. In obj.foo.bar.baz = 1, if all of those are getter/setter pairs, then JS will call the "foo" and "bar" getters and the "baz" setter. That's fine as long as the assignment is mutating, but a functional data structure needs it to then call the "bar" and "foo" setters as well, as each setter will return a fresh object rather than mutating the existing one. It would be very sad to do more work in the getter/setter space without making this work "correctly" for these use-cases.
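
You can watch this happen with ordinary accessors. A sketch using logging getter/setter pairs (the accessor helper is made up for the demo), followed by what a non-mutating update has to do by hand today:

```javascript
// Record which accessors a nested assignment actually triggers.
const log = [];
const accessor = (name, target) => ({
  get() { log.push(`get ${name}`); return target; },
  set(v) { log.push(`set ${name}`); },
  enumerable: true,
});
const bar = Object.defineProperty({}, "baz", accessor("baz", 0));
const foo = Object.defineProperty({}, "bar", accessor("bar", bar));
const obj = Object.defineProperty({}, "foo", accessor("foo", foo));

obj.foo.bar.baz = 1;
// log is now ["get foo", "get bar", "set baz"]:
// the "bar" and "foo" setters never run.

// A persistent (non-mutating) update has to rebuild every level by hand.
const state = { foo: { bar: { baz: 0 } } };
const next = {
  ...state,
  foo: { ...state.foo, bar: { ...state.foo.bar, baz: 1 } },
};
```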

Overall, Extensions does some interesting things, but mostly overlaps heavily with bind-this in both functionality and problems. I don't see it as justifying its weight, at least in its current form.

Nota Bene: Reliable Method Calling

A small but important use-case of bind-this/extensions that I didn't actually cover, but which was brought up to me in response to this essay, is that of reliably calling methods in the face of possible hostile/bad code. For example, there's an Object.prototype.toString() method that's often used as a simple form of brand-checking. If a class provides its own toString, tho, it overrides this. Alternately, a class can have a null prototype (via extends null), and thus not have access to the Object.prototype methods at all. So today, if you want to reliably call Object.prototype.toString on a value, you have to do it manually/verbosely, as Object.prototype.toString.call(obj), which is admittedly pretty bad. This sort of thing is in fact very common in utility/library code; see this review of a bunch of JS showing how common `.call()` usage is.
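
A short sketch of the problem in today's JS. Sneaky is a hypothetical class, and Object.create(null) stands in for the null-prototype case:

```javascript
// A class that overrides toString, defeating the naive brand check.
class Sneaky { toString() { return "[object Array]"; } }
const naive = new Sneaky().toString(); // "[object Array]", a lie

// An object with a null prototype has no toString at all.
const bare = Object.create(null);
const hasToString = "toString" in bare; // false

// Calling the original method explicitly works for all of them.
const objToString = Object.prototype.toString;
const brands = [
  objToString.call(new Sneaky()), // "[object Object]"
  objToString.call(bare),         // "[object Object]"
  objToString.call([]),           // "[object Array]"
];
```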

Solving this well doesn't require the full power of bind-this or Extensions; you can get away with any "uncurry this"-type operator. That is, you could have an operator (or unforgeable function via built-in modules) that automatically turns a method into an ordinary function, taking its this as its first param instead, like:

const toString = ::Object.prototype.toString;
const brand = toString(obj);

const slice = ::Array.prototype.slice;
const skipFirst = slice(arrayLike, 1);

// or as a function
import {uncurryThis} from std;
const slice = uncurryThis(Array.prototype.slice);
const skipFirst = slice(arrayLike, 1);
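
The function form can be approximated in userland today. A sketch only: unlike a real built-in it isn't unforgeable, since it trusts whatever Reflect.apply is at the time it's defined.

```javascript
// Userland approximation of a hypothetical uncurryThis built-in.
// Capture Reflect.apply up front so later tampering can't affect it.
const { apply } = Reflect;
const uncurryThis = fn => (thisArg, ...args) => apply(fn, thisArg, args);

const toString = uncurryThis(Object.prototype.toString);
const brand = toString([]); // "[object Array]"

const slice = uncurryThis(Array.prototype.slice);
const arrayLike = { length: 3, 0: "a", 1: "b", 2: "c" };
const skipFirst = slice(arrayLike, 1); // ["b", "c"]
```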

Alternately, we could present it as a call operator (similar to PFA's ~() operator) that changes a this-using function to instead take it as its first argument, like:

const toString = Object.prototype.toString;
const brand = toString::(obj);

const slice = Array.prototype.slice;
const skipFirst = slice::(arrayLike, 1);

// more convenient for doing multiple extractions, too
const {slice, splice, map, flatMap} = Array.prototype;
const mapped = map::(arrayLike, x=>x+1);
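
The closest things to this call operator in today's JS are .call() and Reflect.apply(). A sketch of both, plus why .call() alone isn't quite "reliable"; the victim function is a hypothetical example:

```javascript
const { slice, map } = Array.prototype;
const arrayLike = { length: 3, 0: 1, 1: 2, 2: 3 };

// slice::(arrayLike, 1) would correspond to:
const skipFirst = slice.call(arrayLike, 1); // [2, 3]
// map::(arrayLike, x=>x+1) would correspond to:
const mapped = Reflect.apply(map, arrayLike, [x => x + 1]); // [2, 3, 4]

// .call is an ordinary property lookup on the function, so it can be
// shadowed; Reflect.apply doesn't consult the function's own properties.
function victim() { return this.length; }
victim.call = () => "hijacked";
const viaCall = victim.call(arrayLike);                  // "hijacked"
const viaReflect = Reflect.apply(victim, arrayLike, []); // 3
```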

Either of these solves this specific, reasonable use-case well, and does so in a way that does not overlap with the other dataflow proposals. In particular, they work great with the pipe operator, like arrayLike |> slice::(##, 1), to achieve the normal "looks like methods" syntax that pipe brings to other functions.

They also avoid the ecosystem-forking concerns that the larger bind-this and Extensions proposals have; these are a slightly less convenient way to call a function, useful because they serve a specific purpose, but not suitable for general authoring style. That is, no one would write a library with a bunch of this-using functions, intending for them to be called this way. meth::(obj, arg) is just less convenient than writing a normal function and doing meth(obj, arg).

Between the two, I slightly prefer the call operator. It's easier to use when extracting methods en-masse from a prototype, and I suspect it's somewhat more performant, since it doesn't create any new function objects, just provides an alternate way to call a method with a specific this value, and so should be similar in cost to a normal method call. It's also more amenable to ad-hoc method extraction to call on a different object; Array.prototype.slice::(arrayLike, 1) vs (::Array.prototype.slice)(arrayLike, 1). I also think the operator precedence is a little clearer with the call-operator; there's no need to carefully engineer precedence to cover the correct amount of expression forward, you just call whatever's on the LHS with the RHS arglist, using identical precedence to a normal call expr.

Final Conclusion

Pipe is a general-purpose operator that, in a lightweight, minimally magic, and easily understandable fashion, handles most dataflow-linearization issues with aplomb. PFA covers a valuable area that pipe doesn't handle very well, plus some additional space that pipe doesn't cover at all. Bind-this and Extensions both cover space that pipe can handle, albeit usually with more verbosity, but bring their own significant new issues, and mostly are covering space that doesn't need solving, imo. The pipe()/flow() methods seem useful and unproblematic. I think there's a useful space left for a limited "uncurry-this" operator.
