Against Innovation Tokens

The “innovation token” model for selecting technologies is bad, and here’s why.

Updated 2024-07-04: After some discussion, added an epilogue going into more detail about the value of the distinction between the two types of tokens.

In 2015, Dan McKinley laid out a model for software teams selecting technologies. He proposed that each team have a limited supply of “innovation tokens”, and, when selecting a technology, they can choose boring ones for free but “innovative” ones cost a token. This implies that we all know which technologies are innovative, and we assume that they are inherently costly, so we want to restrict their supply.

That model has become popular to the point that it is now part of the vernacular. In many discussions, it is accepted as received wisdom, or even common sense.

In this post I aim to show you that despite being superficially helpful, this model is wrong, and in fact, may be counterproductive. I believe it is an attractive nuisance in computer programming discourse.

In fairness to Mr. McKinley, the model he described in this post is:

nearly a decade old at this point, and
much more nuanced in its description of the problem with “innovation” than the subsequent memetic mutation of the concept.

While I will be referencing McKinley’s post, and I do take some issue with it, I am reacting more strongly to the life of its own that this idea has taken on once it escaped its original context. There are a zillion worse posts rehashing this concept, on blogs and LinkedIn, but I won’t be linking to them because the goal is not to call anybody out.

To some extent I am re-raising McKinley’s own caveats and reinforcing them. So I may be arguing with a strawman, but it’s a strawman I have seen deployed with some regularity over the years.

To reduce it to its core, this strawman is “don’t use new or interesting technology, and if you have to, only use a little bit”.

Within the broader culture of programmers, an “innovation token” has become a shorthand to smear any technology perceived — almost always based on vibes, not data — as risky, and the adoption of novel approaches as pretentious and unserious. Speaking of programmer culture though, I do have to acknowledge there is also a pervasive tendency for us to get distracted by novelty and waste time on puzzles rather than problem-solving, so I understand where the reactionary attitude represented by the concept of an innovation token comes from.

But it is reactionary.

At its worst, it borders on anti-intellectualism. I have heard it used on more than one occasion as a thought-terminating cliche to discard a potentially promising new tool. But before I get into that, let me try to give a sympathetic summary of the idea, because the model is not entirely bad.

It has been popular for a long time because it does work okay as an heuristic.

The real problem that McKinley is describing is operational overhead. When programmers make a technology selection, we are often considering how difficult it will make the programming. Innovative technology selections are, by definition, less mature.

That lack of maturity — particularly in the open source world — often means that the project is in a part of its lifecycle where it is concerned with development affordances more than operational ones. Therefore, the stereotypical innovative project, even one which might legitimately be a big improvement to development velocity, will create more operational overhead. That operational overhead creates a hidden cost for the operations team later on.

This is a point I emphatically agree with. When selecting a technology, you should consider its ease of operation more than its ease of development. If your team is successful, they will be operating and maintaining it far longer than they are initially integrating and deploying it.

Furthermore, some operational overhead is inevitable. You will need to hire people to mitigate it. More popular, more mature projects will have a bigger talent pool to hire from, so your training costs will be lower, and those training costs are part of your operational cost too.

Rationing innovation tokens therefore can work as a reasonable heuristic, or proxy metric, for avoiding a mess of complex operational problems associated with dependencies that are expensive to operate and hard to hire for.

There are some minor issues I want to point out before getting to the overarching one.

“has a lot of operational overhead” is a stereotype of a new technology, not an inherent property. If you want to reject a technology on the basis of being too high-overhead, at least look into its actual overhead a little bit. Sometimes, especially in 2024 as opposed to 2015, the point of a new, shiny piece of tech is to address operational issues that the more boring, older one had.
“hard to learn” is also a stereotype; if “newer” meant “harder” then we would all be using troff rather than Google Docs. Actually ask if the innovativeness is making things harder or easier; don’t assume.
You are going to have to train people on your stack no matter what. If a technology is adding a lot of value, it’s absolutely worth hiring for general ability and making a plan to teach people about it. You are going to have to do this with the core technology of your product anyway.

As I said, though, these are minor issues. The big problem with modeling operational overhead as an “innovation token” is that an even bigger concern than selecting an innovative tool is selecting too many tools.

The impulse to select more tools and make your operational environment more complex can be made worse by trying to avoid innovative tools. The important thing is not “less innovation”, but more consistency. To illustrate this, let’s do a simple thought experiment.

Let’s say you’re going to make a web app. There’s a tool in Haskell that you really like for a critical part of your app’s problem domain. You don’t want to spend more than one innovation token though, and everything in Haskell is inherently innovative, so you write a little service that just does that one part and you write the rest of your app in Ruby, calling into that service whenever you need to use that thing. This will appropriately restrict your “innovation token” expenditure.

Does doing this actually reduce your operational overhead, though?

First, you will have to find a team that likes both Ruby and Haskell and sees no problem using both. If you are not familiar with the cultural proclivities of these languages, suffice it to say that this is unlikely. Hiring for Haskell programmers is hard because there are fewer of them than Ruby programmers, but hiring for polyglot Haskell/Ruby programmers who are happy to do either is going to be really hard.

Since you will need to find different people to write in the different languages, even in the best case scenario, you will have two teams: the Haskell team and the Ruby team. Even if you are incredibly disciplined about inter-service responsibilities, there will be some areas where duplication of code is necessary across those services. Disagreements will arise and every one of these disagreements will be a source of social friction and software defects.

Then, you need to set up separate CI pipelines for each language, separate deployment systems, and of course, separate databases. Right away you are effectively doubling your workload.

In the worse, and unfortunately more likely scenario, there will be enormous infighting between these two teams. Operational incidents will be more difficult to manage because rather than learning the Haskell tools for operational visibility and disseminating that institutional knowledge amongst your team, you will be half-learning the lessons from two separate ecosystems and attempting to integrate them. Every on-call engineer will be frantically trying to learn a language ecosystem they don’t use regularly, or you will double the size of your on-call rotation. The Ruby team may start to resent the Haskell team for getting to exclusively work on the fun parts of the problem while they are doing things that look more like rote grunt work.

A better way to think about the problem of managing operational overhead is, rather than “innovation tokens”, consider “boundary tokens”.

That is to say, rather than evaluating the general sense of weird vibes from your architecture, consider the consistency of that architecture. If you’re using Haskell, use Haskell. You should be all-in on Haskell web frameworks, Haskell ORMs, Haskell OAuth integrations, and so on.¹ To cross the boundary out of Haskell, you need to spend a boundary token, and you shouldn’t have many of those.

I submit that the increased operational overhead that you might experience with an all-Haskell tool selection will be dwarfed by the savings that you get by having a team that is aligned with each other, that can communicate easily, and that can share programs with each other without needing to first strategize about a channel for the two pieces of work to establish bidirectional communication. The ability to simply call a function when you need to call it is very powerful, and extremely underrated.

Consistency ought to apply at each layer of the stack; it is perhaps most obvious with programming languages, but it is true of web frameworks, test frameworks, cryptographic libraries, you name it. Make a choice and stick with it, because every deviation from that choice carries a significant cost. Moreover this cost is a hidden cost, in the same way that the operational downsides of an “innovative” tool that hasn’t seen much production use might be hidden.

Discarding a more standard tool in favor of a tool more consistent with your architecture extends even to fairly uncontroversial, ubiquitous tools. For example, one of my favorite architectural patterns is to forego the use of the venerable — and very boring – Cron, the UNIX task-scheduler. Instead of Cron, it can make a lot of sense to have hand-written bespoke code for scheduling tasks within the application. Within the “innovation tokens” model, this is a very silly waste of a token!

Just use Cron! Everybody knows how to use Cron!

Except… does everybody know how to use Cron? Here are some questions to consider, if you’re about to roll out a big dependency on Cron:

How do you write a unit test for a scheduling rule with Cron?
Can you even remember how to write a cron rule that runs at the times you want?
How do you inject secrets and configuration variables into the distinct and somewhat idiosyncratic runtime execution environment of Cron?
How do you know that you did that variable-injection properly until the job actually runs, possibly in the middle of the night?
How do you deploy your monitoring and error-logging frameworks to observe your scripts run under Cron?

Granted, this architectural choice is less controversial than it once was. Cron used to be ambiently available on whatever servers you happened to be running. As container-based deployments have increased in popularity, this sense that Cron is just kinda around has gone away, and if you need to run a container that just runs Cron, much of the jankiness of its deployment is a lot more immediately visible.

There is friction at the boundary between things. That friction is a cost, but sometimes it’s a cost worth paying.

If there’s a really good library in Haskell and a really good library in Ruby and you really do want to use them both, maybe it makes sense to actually have multiple services. As your team gets larger and more mature, the need to bring in more tools, and the ability to handle the associated overhead, will only increase over time. But the place that the cost comes in the most is at the boundary between tools, not in the operational deficiencies of any one particular tool.

Even in a bog-standard web application with the most boring, least innovative tech stack imaginable (PHP, MySQL, HTML, CSS, JavaScript), many of the annoying points of friction are where different, inconsistent technologies make contact. If you are a programmer working on the web yourself, consider your own impression of the level of controversy of these technologies:

CSS frameworks that attempt to bridge the gap between CSS and JavaScript, like Bootstrap or Tailwind.
ORMs that attempt to bridge the gap between SQL and your backend language of choice
RPC systems that attempt to connect disparate services together using simple abstractions.

Consider that there are far more complex technical tools in terms of required skills to implement them, like computer vision or physics simulation, tools which are also pretty widely used, which consistently generate lower levels of controversy. People do have strong feelings about these things as well, of course, and it’s hard to find things to link to that show “this isn’t controversial”, but, like, search your feelings, you know it to be true.

You can see the benefits of the boundary token approach in programming language design. Many of the most influential and best-loved programming languages had an impact not by bundling together lots of tools, but by making everything into one thing:

LISP: everything is a list
Smalltalk: everything is an object
ML: everything is an algebraic data type
Forth: everything is a stack

There is a tremendous power in thinking about everything as a single kind of thing, because then you don’t have to juggle lots of different ideas about different kinds of things; you can just think about your problem.

When people complain about programming languages, they’re often complaining about how many different kinds of thing they have to remember in order to use it.

If you keep your boundary-token budget small, and allow your developers to accomplish as much as possible while staying within a solution space delineated by a single, clean cognitive boundary, I promise you can innovate as much as you want and your operational costs will remain manageable.

Epilogue

In subsequent Mastodon discussion of this post on with Matt Campbell and Meejah, I realized that I may not have made it entirely clear why I feel the distinction between “boundary” and “innovation” tokens is important. I do say above that the “innovation token” model can be a useful heuristic, so why bother with a new, but slightly different heuristic? Especially since most experienced engineers - indeed, McKinley himself - would budget “innovation” quite similarly to “boundaries”, and might even consider the use of more “innovative” Haskell tools in my hypothetical scenario to not even be an expenditure of innovation tokens at all.

To answer that, I need to highlight the purpose of having heuristics like this in the first place. These are vague, nebulous guidelines, not hard and fast rules. I cannot give you a token calculator to plug your technical decisions into. The purpose of either token heuristic is to facilitate discussions among a team.

With a team of skilled and experienced engineers, the distinction is meaningless. Senior and staff engineers (at least, the ones who deserve their level) will intuit the goals behind “innovation tokens” and inherently consider things like operational overhead anyway. In practice, a high-performing, well-aligned team discussing innovation tokens and one discussing boundary tokens will look functionally indistinguishable.

The distinction starts to be important when you have management pressures, nervous executives, inexperienced engineers, a fresh team without existing consensus about core technology choices, and so on. That is to say, most teams that exist in the messy, perpetually in medias res world of the software industry.

If you are just getting started on a project and you have a bunch of competent but disagreeable engineers, the words “innovation” and “boundaries” function very differently.

If you ask, “is this an innovation” about a particular technical tool, you are asking your interlocutor to pull in a bunch of their skills and experience to subjectively evaluate the relative industry-wide, or maybe company-wide, or maybe team-wide² newness of the thing being discussed. The discussion of whether it counts as boring or innovative is immediately fraught with a ton of subjective, difficult-to-quantify information about costs of hiring, difficulty of learning, and your impression of the feelings of hundreds or thousands of people outside of your team. And, yes, ultimately you do need to have an estimate of all that stuff, but starting your “is it OK to use this” conversation by simultaneously arguing about all those subjective judgments is setting yourself up for failure.

Instead, if you ask “does this introduce a boundary between two different technologies with different conceptual models”, while that is not a perfectly objective question, it is much easier for your team to answer, with much crisper intermediary factual questions. What are the two technologies? What are the models? How much do they differ? You can just hash out the answers to each one within the team directly, rather than needing to sift through the last few years of Stack Overflow developer surveys to determine relative adoption or popularity of technologies in the world at large.

Restricting your supply of either boundary or innovation tokens is a good idea, but achieving unanimity within your team about what your boundaries are is always going to be easier than deciding what your innovations are.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics like “how can we make our architecture more consistent”.

I gave a talk about this once, a very long time ago, where Haskell was Python. ↩
It’s not clear, that’s a big part of the problem. ↩

A Grand Unified Theory of the AI Hype Cycle

I’m sorry, but as an AI language model, I cannot repeat history exactly. However, I can rhyme with it.

ai llm hype history semasiology Wednesday May 22, 2024

The Cycle

The history of AI goes in cycles, each of which looks at least a little bit like this:

Scientists do some basic research and develop a promising novel mechanism, N. One important detail is that N has a specific name; it may or may not be carried out under the general umbrella of “AI research” but it is not itself “AI”. N always has a few properties, but the most common and salient one is that it initially tends to require about 3x the specifications of the average computer available to the market at the time; i.e., it requires three times as much RAM, CPU, and secondary storage as is shipped in the average computer.
Research and development efforts begin to get funded on the hypothetical potential of N. Because N is so resource intensive, this funding is used to purchase more computing capacity (RAM, CPU, storage) for the researchers, which leads to immediate results, as the technology was previously resource constrained.
Initial successes in the refinement of N hint at truly revolutionary possibilities for its deployment. These revolutionary possibilities include a dimension of cognition that has not previously been machine-automated.
Leaders in the field of this new development — specifically leaders, like lab administrators, corporate executives, and so on, as opposed to practitioners like engineers and scientists — recognize the sales potential of referring to this newly-“thinking” machine as “Artificial Intelligence”, often speculating about science-fictional levels of societal upheaval (specifically in a period of 5-20 years), now that the “hard problem” of machine cognition has been solved by N.
Other technology leaders, in related fields, also recognize the sales potential and begin adopting elements of the novel mechanism to combine with their own areas of interest, also referring to their projects as “AI” in order to access the pool of cash that has become available to that label. In the course of doing so, they incorporate N in increasingly unreasonable ways.
The scope of “AI” balloons to include pretty much all of computing technology. Some things that do not even include N start getting labeled this way.
There’s a massive economic boom within the field of “AI”, where “the field of AI” means any software development that is plausibly adjacent to N in any pitch deck or grant proposal.
Roughly 3 years pass, while those who control the flow of money gradually become skeptical of the overblown claims that recede into the indeterminate future, where N precipitates a robot apocalypse somewhere between 5 and 20 years away. Crucially, because of the aforementioned resource-intensiveness, the gold owners skepticism grows slowly over this period, because their own personal computers or the ones they have access to do not have the requisite resources to actually run the technology in question and it is challenging for them to observe its performance directly. Public critics begin to appear.
Competent practitioners — not leaders — who have been successfully using N in research or industry quietly stop calling their tools “AI”, or at least stop emphasizing the “artificial intelligence” aspect of them, and start getting funding under other auspices. Whatever N does that isn’t “thinking” starts getting applied more seriously as its limitations are better understood. Users begin using more specific terms to describe the things they want, rather than calling everything “AI”.
Thanks to the relentless march of Moore’s law, the specs of the average computer improve. The CPU, RAM, and disk resources required to actually run the software locally come down in price, and everyone upgrades to a new computer that can actually run the new stuff.
The investors and grant funders update their personal computers, and they start personally running the software they’ve been investing in. Products with long development cycles are finally released to customers as well, but they are disappointing. The investors quietly get mad. They’re not going to publicly trash their own investments, but they stop loudly boosting them and they stop writing checks. They pivot to biotech for a while.
The field of “AI” becomes increasingly desperate, as it becomes the label applied to uses of N which are not productive, since the productive uses are marketed under their application rather than their mechanism. Funders lose their patience, the polarity of the “AI” money magnet rapidly reverses. Here, the AI winter is finally upon us.
The remaining AI researchers who still have funding via mechanisms less vulnerable to hype, who are genuinely thinking about automating aspects of cognition rather than simply N, quietly move on to the next impediment to a truly thinking machine, and in the course of doing so, they discover a new novel mechanism, M. Go to step 1, with M as the new N, and our current N as a thing that is now “not AI”, called by its own, more precise name.

The History

A non-exhaustive list of previous values of N have been:

Neural networks and symbolic reasoning in the 1950s.
Theorem provers in the 1960s.
Expert systems in the 1980s.
Fuzzy logic and hidden Markov models in the 1990s.
Deep learning in the 2010s.

Each of these cycles has been larger and lasted longer than the last, and I want to be clear: each cycle has produced genuinely useful technology. It’s just that each follows the progress of a sigmoid curve that everyone mistakes for an exponential one. There is an initial burst of rapid improvement, followed by gradual improvement, followed by a plateau. Initial promises imply or even state outright “if we pour more {compute, RAM, training data, money} into this, we’ll get improvements forever!” The reality is always that these strategies inevitably have a limit, usually one that does not take too long to find.

Where Are We Now?

So where are we in the current hype cycle?

Here’s a Computerphile video which explains some recent research into LLM performance. I’d highly encourage you to have a look at the paper itself, particularly Figure 2, “Log-linear relationships between concept frequency and CLIP zero-shot performance”.
Here’s a series of posts by Simon Willison explaining the trajectory of the practicality of actually-useful LLMs on personal devices. He hasn’t written much about it recently because it is now fairly pedestrian for an AI-using software developer to have a bunch of local models, and although we haven’t quite broken through the price floor of the gear-acquisition-syndrome prosumer market in terms of the requirements of doing so, we are getting close.
The Rabbit R1 and Humane AI Pin were both released; were they disappointments to their customers and investors? I think we all know how that went at this point.
I hear Karius just raised a series C, and they’re an “emerging unicorn”.
It does appear that we are all still resolutely calling these things “AI” for now, though, much as I wish, as a semasiology enthusiast, that we would be more precise.

Some Qualifications

History does not repeat itself, but it does rhyme. This hype cycle is unlike any that have come before in various ways. There is more money involved now. It’s much more commercial; I had to phrase things above in very general ways because many previous hype waves have been based on research funding, some really being exclusively a phenomenon at one department in DARPA, and not, like, the entire economy.

I cannot tell you when the current mania will end and this bubble will burst. If I could, you’d be reading this in my $100,000 per month subscribers-only trading strategy newsletter and not a public blog. What I can tell you is that computers cannot think, and that the problems of the current instantation of the nebulously defined field of “AI” will not all be solved within “5 to 20 years”.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. Special thanks also to Ben Chatterton for a brief pre-publication review; any errors remain my own. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics like “what are we doing that history will condemn us for”. Or, you know, Python programming.

How To PyCon

Since I am headed to PyCon tomorrow, let’s talk about conference tips.

pycon Wednesday May 15, 2024

These tips are not the “right” way to do PyCon, but they are suggestions based on how I try to do PyCon. Consider them reminders to myself, an experienced long-time attendee, which you are welcome to overhear.

See Some Talks

The hallway track is awesome. But the best version of the hallway track is not just bumping into people and chatting; it’s the version where you’ve all recently seen the same thing, and thereby have a shared context of something to react to. If you aren’t going to talks, you aren’t going to get a good hallway track.. Therefore: choose talks that interest you, attend them and pay close attention, then find people to talk to about them.

Given that you will want to see some of the talks, make sure that you have the schedule downloaded and available offline on your mobile device, or printed out on a piece of paper.

Make a list of the talks you think you want to see, but have that schedule with you in case you want to call an audible in the middle of the conference, switching to a different talk you didn’t notice based on some of those “hallway track” conversations.

Participate In Open Spaces

The name “hallway track” itself is antiquated, in a way which is relevant and important to modern conferences. It used to be that conferences were exclusively oriented around their scheduled talks; it was called the “hallway” track because the way to access it was to linger in the hallways, outside the official structure of the conference, and just talk to people.

But however, at PyCon and many other conferences, this unofficial track is now much more of an integrated, official part of the program. In particular, open spaces are not only a more official hallway track, they are considerably better than the historical “hallway” experience, because these ad-hoc gatherings can be convened with a prepared topic and potentially a loose structure to facilitate productive discussion.

With open spaces, sessions can have an agenda and so conversations are easier to start. Rooms are provided, which is more useful than you might think; literally hanging out in a hallway is actually surprisingly disruptive to speakers and attendees at talks; us nerds tend to get pretty loud and can be quite audible even through a slightly-cracked door, so avail yourself of these rooms and don’t be a disruptive jerk outside somebody’s talk.

Consult the open space board, and put up your own proposed sessions. Post them as early as you can, to maximize the chance that they will get noticed. Post them on social media, using the conference's official hashtag, and ask any interested folks you bump into help boost it.¹

Remember that open spaces are not talks. If you want to give a mini-lecture on a topic and you can find interested folks you could do that, but the format lends itself to more peer-to-peer, roundtable-style interactions. Among other things, this means that, unlike proposing a talk, where you should be an expert on the topic that you are proposing, you can suggest open spaces where you are curious — but ignorant — about something, in the hopes that some experts will show up and you can listen to their discussion.

Be prepared for this to fail; there’s a lot going on and it’s always possible that nobody will notice your session. Again, maximize your chances by posting it as early as you can and promoting it, but be prepared to just have a free 30 minutes to check your email. Sometimes that’s just how it goes. The corollary here is not to always balance attending others’ spaces with proposing your own. After all if someone else proposed it you know at least one other person is gonna be there.

Take Care of Your Body

Conferences can be surprisingly high-intensity physical activities. It’s not a marathon, but you will be walking quickly from one end of a large convention center to another, potentially somewhat anxiously.

Hydrate, hydrate, hydrate. Bring a water bottle, and have it with you at all times. It might be helpful to set repeating timers on your phone to drink water, since it can be easy to forget in the middle of engaging conversations. If you take advantage of the hallway track as much as you should, you will talk more than you expect; talking expels water from your body. All that aforementioned walking might make you sweat a bit more than you realize.

Hydrate.

More generally, pay attention to what you are eating and drinking. Conference food isn’t always the best, and in a new city you might be tempted to load up on big meals and junk food. You should enjoy yourself and experience the local cuisine, but do it intentionally. While you enjoy the local fare, do so in whatever moderation works best for you. Similarly for boozy night-time socializing. Nothing stings quite as much as missing a morning of talks because you’ve got a hangover or a migraine.

This is worth emphasizing because in the enthusiasm of an exciting conference experience, it’s easy to lose track and overdo it.

Meet Both New And Old Friends: Plan Your Socializing

A lot of the advice above is mostly for first-time or new-ish conferencegoers, but this one might be more useful for the old heads. As we build up a long-time clique of conference friends, it’s easy to get a bit insular and lose out on one of the bits of magic of such an event: meeting new folks and hearing new perspectives.

While open spaces can address this a little bit, there's one additional thing I've started doing in the last few years: dinners are for old friends, but lunches are for new ones. At least half of the days I'm there, I try to go to a new table with new folks that I haven't seen before, and strike up a conversation. I even have a little canned icebreaker prompt, which I would suggest to others as well, because it’s worked pretty nicely in past years: “what is one fun thing you have done with Python recently?”².

Given that I have a pretty big crowd of old friends at these things, I actually tend to avoid old friends at lunch, since it’s so easy to get into multi-hour conversations, and meeting new folks in a big group can be intimidating. Lunches are the time I carve out to try and meet new folks.

I’ll See You There

I hope some of these tips were helpful, and I am looking forward to seeing some of you at PyCon US 2024!

In PyCon2024's case, #PyConUS on Mastodon is probably the way to go. Note, also, that it is #PyConUS and not #pyconus, which is much less legible for users of screen-readers. ↩
Obviously that is specific to this conference. At the O’Reilly Software Architecture conference, my prompt was “What is software architecture?” which had some really fascinating answers. ↩

Hope

Your words are doing something. Do you know what that something is?

politics posting sprachspiel Tuesday May 07, 2024

Humans are pattern-matching machines. As a species, it is our superpower. To summarize the core of my own epistemic philosophy, here is a brief list of the activities in the core main-loop of a human being:

stuff happens to us
we look for patterns in the stuff
we weave those patterns into narratives
we turn the narratives into models of the world
we predict what will happen based on those models
we do stuff based on those predictions
based on the stuff we did, more stuff happens to us; return to step 1

While this ability lets humans do lots of great stuff, like math and physics and agriculture and so on, we can just as easily build bad stories and bad models. We can easily trick ourselves into thinking that our predictive abilities are more powerful than they are.

The existence of magic-seeming levels of prediction in fields like chemistry and physics and statistics, in addition to the practical usefulness of rough estimates and heuristics in daily life, itself easily creates a misleading pattern. “I see all these patterns and make all these predictions and I’m right a lot of the time, so if I just kind of wing it and predict some more stuff, I’ll also be right about that stuff.”

This leaves us very vulnerable to things like mean world syndrome. Mean world syndrome itself is specifically about danger, but I believe it is a manifestation of an even broader phenomenon which I would term “the apophenia of despair”.

Confirmation bias is an inherent part of human cognition, but the internet has turbocharged it. Humans have immediate access to more information than we ever had in the past. In order to cope with that information, we have also built ways to filter that information. Even disregarding things like algorithmic engagement maximization and social media filter bubbles, the simple fact that when you search for things, you are a lot more likely to find the thing that you’re searching for than to find arguments refuting it, can provide a very strong sense that you’re right about whatever you’re researching.

All of this is to say: if you decide that something in the world is getting worse, you can very easily convince yourself that it is getting much, much worse, very rapidly. Especially because there are things which are, unambiguously, getting worse.

However, Pollyanna-ism is just the same phenomenon in reverse and I don’t want to engage in that. The ice sheets really are melting, globally, fascism really is on the rise. I am not here to deny reality or to cherry pick a bunch of statistics to lull people into complacency.

I believe that while dwelling on a negative reality is bad, I also believe that in the face of constant denial, it is sometimes necessary to simply emphasize those realities, however unpleasant they may be. Distinguishing between unhelpful rumination on negativity and repetition of an unfortunate but important truth to correct popular perception is subjective and subtle, but the difference is nevertheless important.

As our ability to acquire information about things getting worse has grown, our ability to affect those things has not. Knowledge is not power; power is power, and most of us don’t have a lot of it, so we need to be strategic in the way that we deploy our limited political capital and personal energy.

Overexposure to negative news can cause symptoms of depression; depressed people have reduced executive function and find it harder to do stuff. One of the most effective interventions against this general feeling of malaise? Hope.. Not “hope” in the sense of wishing. As this article in the American Psychological Association’s “Monitor on Psychology” puts it:

“We often use the word ‘hope’ in place of wishing, like you hope it rains today or you hope someone’s well,” said Chan Hellman, PhD, a professor of psychology and founding director of the Hope Research Center at the University of Oklahoma. “But wishing is passive toward a goal, and hope is about taking action toward it.”

Here, finally, I can get around to my point.

If you have an audience, and you have some negative thoughts about some social trend, talking about it in a way which is vague and non-actionable is potentially quite harmful. If you are doing this, you are engaged in the political project of robbing a large number of people of hope. You are saying that the best should have less conviction, while the worst will surely remain just as full of passionate intensity.

I do not mean to say that it is unacceptable to ever publicly share thoughts of sadness, or even hopelessness. If everyone in public is forced to always put on a plastic smile and pretend that everything is going to be okay if we have grit and determination, then we have an Instagram culture of fake highlight reels where anyone having their own struggles with hopelessness will just feel even worse in their isolation. I certainly posted my way through my fair share of pretty bleak mental health issues during the worst of the pandemic.

But we should recognize that while sadness is a feeling, hopelessness is a problem, a bad reaction to that feeling, one that needs to be addressed if we are going to collectively dig ourselves out of the problem that creates the sadness in the first place. We may not be able to conjure hope all the time, but we should always be trying.

When we try to address these feelings, as I said earlier, Pollyanna-ism doesn’t help. The antidote to hopelessness is not optimism, but curiosity. If you have a strong thought like “people these days just don’t care about other people¹”, yelling “YES THEY DO” at yourself (or worse, your audience) is unlikely to make much of a change, and certainly not likely to be convincing to an audience. Instead, you could ask yourself some questions, and use them for a jumping-off point for some research:

Why do I think this — is the problem in my perception, or in the world?
If there is a problem in my perception, is this a common misperception? If it’s common, what is leading to it being common? If it’s unique to me, what sort of work do I need to do to correct it?
If the problem is real, what are its causes? Is there anything that I, or my audience, could do to address those causes?

The answers to these questions also inform step 6 of the process I outlined above: the doing stuff part of the process.

At some level, all communication is persuasive communication. Everything you say that another person might hear, everything you say that a person might hear, is part of a sprachspiel where you are attempting to achieve something. There is always an implied call to action; even “do nothing, accept the status quo” is itself an action. My call to action right now is to ask you to never make your call to action “you should feel bad, and you should feel bad about feeling bad”. When you communicate in public, your words have power.

Use that power for good.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! Special thanks also to Cassandra Granade, who provided some editorial feedback on this post; any errors, of course, remain my own.

I should also note that vague sentiments of this form, “things used to be better, now they’re getting worse”, are at their core a reactionary yearning for a prelapsarian past, which is both not a good look and also often wrong in a very common way. Complaining about how “people” are getting worse is a very short journey away from complaining about kids these days, which has a long and embarrassing history of being comprehensively incorrect in every era. ↩

Software Needs To Be More Expensive

Software, like coffee, is too artificially cheap, and we need to make it more expensive. I have one suggestion for how to do that.

software supported open-source Saturday March 30, 2024

The Cost of Coffee

One of the ideas that James Hoffmann — probably the most influential… influencer in the coffee industry — works hard to popularize is that “coffee needs to be more expensive”.

The coffee industry is famously exploitative. Despite relatively thin margins for independent café owners¹, there are no shortage of horrific stories about labor exploitation and even slavery in the coffee supply chain.

To summarize a point that Mr. Hoffman has made over a quite long series of videos and interviews², some of this can be fixed by regulatory efforts. Enforcement of supply chain policies both by manufacturers and governments can help spot and avoid this type of exploitation. Some of it can be fixed by discernment on the part of consumers. You can try to buy fair-trade coffee, avoid brands that you know have problematic supply-chain histories.

Ultimately, though, even if there is perfect, universal, zero-cost enforcement of supply chain integrity… consumers still have to be willing to, you know, pay more for the coffee. It costs more to pay wages than to have slaves.

The Price of Software

The problem with the coffee supply chain deserves your attention in its own right. I don’t mean to claim that the problems of open source maintainers are as severe as those of literal child slaves. But the principle is the same.

Every tech company uses huge amounts of open source software, which they get for free.

I do not want to argue that this is straightforwardly exploitation. There is a complex bargain here for the open source maintainers: if you create open source software, you can get a job more easily. If you create open source infrastructure, you can make choices about the architecture of your projects which are more long-term sustainable from a technology perspective, but would be harder to justify on a shorter-term commercial development schedule. You can collaborate with a wider group across the industry. You can build your personal brand.

But, in light of the recent xz Utils / SSH backdoor scandal, it is clear that while the bargain may not be entirely one-sided, it is not symmetrical, and significant bad consequences may result, both for the maintainers themselves and for society.

To fix this problem, open source software needs to get more expensive.

A big part of the appeal of open source is its implicit permission structure, which derives both from its zero up-front cost and its zero marginal cost.

The zero up-front cost means that you can just get it to try it out. In many companies, individual software developers do not have the authority to write a purchase order, or even a corporate credit card for small expenses.

If you are a software engineer and you need a new development tool or a new library that you want to purchase for work, it can be a maze of bureaucratic confusion in order to get that approved. It might be possible, but you are likely to get strange looks, and someone, probably your manager, is quite likely to say “isn’t there a free option for this?” At worst, it might just be impossible.

This makes sense. Dealing with purchase orders and reimbursement requests is annoying, and it only feels worth the overhead if you’re dealing with a large enough block of functionality that it is worth it for an entire team, or better yet an org, to adopt. This means that most of the purchasing is done by management types or “architects”, who are empowered to make decisions for larger groups.

When individual engineers need to solve a problem, they look at open source libraries and tools specifically because it’s quick and easy to incorporate them in a pull request, where a proprietary solution might be tedious and expensive.

That’s assuming that a proprietary solution to your problem even exists. In the infrastructure sector of the software economy, free options from your operating system provider (Apple, Microsoft, maybe Amazon if you’re in the cloud) and open source developers, small commercial options have been marginalized or outright destroyed by zero-cost options, for this reason.

If the zero up-front cost is a paperwork-reduction benefit, then the zero marginal cost is almost a requirement. One of the perennial complaints of open source maintainers is that companies take our stuff, build it into a product, and then make a zillion dollars and give us nothing. It seems fair that they’d give us some kind of royalty, right? Some tiny fraction of that windfall? But once you realize that individual developers don’t have the authority to put $50 on a corporate card to buy a tool, they super don’t have the authority to make a technical decision that encumbers the intellectual property of their entire core product to give some fraction of the company’s revenue away to a third party. Structurally, there’s no way that this will ever happen.

Despite these impediments, keeping those dependencies maintained does cost money.

Some Solutions Already Exist

There are various official channels developing to help support the maintenance of critical infrastructure. If you work at a big company, you should probably have a corporate Tidelift subscription. Maybe ask your employer about that.

But, as they will readily admit there are a LOT of projects that even Tidelift cannot cover, with no official commercial support, and no practical way to offer it in the short term. Individual maintainers, like yours truly, trying to figure out how to maintain their projects, either by making a living from them or incorporating them into our jobs somehow. People with a Ko-Fi or a Patreon, or maybe just an Amazon wish-list to let you say “thanks” for occasional maintenance work.

Most importantly, there’s no path for them to transition to actually making a living from their maintenance work. For most maintainers, Tidelift pays a sub-hobbyist amount of money, and even setting it up (and GitHub Sponsors, etc) is a huge hassle. So even making the transition from “no income” to “a little bit of side-hustle income” may be prohibitively bureaucratic.

Let’s take myself as an example. If you’re a developer who is nominally profiting from my infrastructure work in your own career, there is a very strong chance that you are also a contributor to the open source commons, and perhaps you’ve even contributed more to that commons than I have, contributed more to my own career success than I have to yours. I can ask you to pay me³, but really you shouldn’t be paying me, your employer should.

What To Do Now: Make It Easy To Just Pay Money

So if we just need to give open source maintainers more money, and it’s really the employers who ought to be giving it, then what can we do?

Let’s not make it complicated. Employers should just give maintainers money. Let’s call it the “JGMM” benefit.

Specifically, every employer of software engineers should immediately institute the following benefits program: each software engineer should have a monthly discretionary budget of $50 to distribute to whatever open source dependency developers they want, in whatever way they see fit. Venmo, Patreon, PayPal, Kickstarter, GitHub Sponsors, whatever, it doesn’t matter. Put it on a corp card, put the project name on the line item, and forget about it. It’s only for open source maintenance, but it’s a small enough amount that you don’t need intense levels of approval-gating process. You can do it on the honor system.

This preserves zero up-front cost. To start using a dependency, you still just use it⁴. It also preserves zero marginal cost: your developers choose which dependencies to support based on perceived need and popularity. It’s a fixed overhead which doesn’t scale with revenue or with profit, just headcount.

Because the whole point here is to match the easy, implicit, no-process, no-controls way in which dependencies can be added in most companies. It should be easier to pay these small tips than it is to use the software in the first place.

This sub-1% overhead to your staffing costs will massively de-risk the open source projects you use. By leaving the discretion up to your engineers, you will end up supporting those projects which are really struggling and which your executives won’t even hear about until they end up on the news. Some of it will go to projects that you don’t use, things that your engineers find fascinating and want to use one day but don’t yet depend upon, but that’s fine too. Consider it an extremely cheap, efficient R&D expense.

A lot of the options for developers to support open source infrastructure are already tax-deductible, so if they contribute to something like one of the PSF’s fiscal sponsorees, it’s probably even more tax-advantaged than a regular business expense.

I also strongly suspect that if you’re one of the first employers to do this, you can get a round of really positive PR out of the tech press, and attract engineers, so, the race is on. I don’t really count as the “tech press” but nevertheless drop me a line to let me know if your company ends up doing this so I can shout you out.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics such as “How do I figure out which open source projects to give money to?”.

I don’t have time to get into the margins for Starbucks and friends, their relationship with labor, economies of scale, etc. ↩
While this is a theme that pervades much of his work, the only place I can easily find where he says it in so many words is on a podcast that sometimes also promotes right-wing weirdos and pseudo-scientific quacks spreading misinformation about autism and ADHD. So, I obviously don’t want to link to them; you’ll have to take my word for it. ↩
and I will, since as I just recently wrote about, I need to make sure that people are at least aware of the option ↩
Pending whatever legal approval program you have in place to vet the license. You do have a nice streamlined legal approvals process, right? You’re not just putting WTFPL software into production, are you? ↩