The Dimensionality Problem
There is an apparent contradiction at the heart of expertise. Expert judgement is learnable, in the sense that people demonstrably acquire it over time. It is also non-transmissible, in the sense that no expert can transfer their judgement to another person through explanation. If it can be learned, why can it not be taught?
The resolution lies in a distinction between two fundamentally different modes of learning. The first is instruction: the transfer of explicit models, rules, and relationships from one person to another through language. The second is calibration: the development of internal models through repeated exposure to feedback in a specific environment. Judgement is learnable through calibration. It is not transmissible through instruction. These are different processes operating on different substrates, and conflating them is the source of the apparent contradiction.
To see why, we need to be precise about what “high-dimensional” means when applied to expert knowledge, because the concept is doing all the real work.
Consider a simple decision: should I cross the road? A rule-based encoding of this decision might operate on three variables: is a car visible, how fast is it moving, and how far away is it. These three dimensions are sufficient to produce a reasonable crossing decision most of the time. You could write this as an explicit rule, transmit it through language, and a person who had never crossed a road could apply it successfully in straightforward cases.
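The three-variable rule really is simple enough to state in code. A minimal sketch, in which the specific thresholds and the time-to-cross figure are invented for illustration rather than taken from any real crossing guideline:

```python
def safe_to_cross(car_visible: bool, speed_mps: float, distance_m: float) -> bool:
    """Rule-based crossing decision using only three explicit variables.

    Assumptions (illustrative only): the pedestrian needs about 5 seconds
    to cross, and applies a 1.5x safety margin on the car's arrival time.
    """
    if not car_visible:
        return True
    crossing_time_s = 5.0
    time_to_arrival_s = distance_m / max(speed_mps, 0.1)  # avoid div-by-zero
    return time_to_arrival_s > crossing_time_s * 1.5

print(safe_to_cross(False, 0.0, 0.0))   # no car visible: True
print(safe_to_cross(True, 15.0, 30.0))  # fast, close car: False
```

The point of the sketch is its transmissibility: the entire rule fits in a sentence, can be handed to a novice, and works in the straightforward cases.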
Now consider the actual model that an experienced pedestrian uses. They are integrating: the car’s speed, its acceleration (is it slowing down?), the road surface (wet or dry, affecting stopping distance), the driver’s apparent attentiveness (are they looking at their phone?), the car’s trajectory (drifting within the lane?), the presence of other cars that might obscure the driver’s view, the width of the road, their own walking speed today (are they carrying something heavy, are they injured?), the behaviour of other pedestrians (are they crossing confidently or hesitating?), the sound of the engine (accelerating or decelerating, even before the speed change is visible), the type of vehicle (a truck has different stopping characteristics than a bicycle), the time of day (affecting driver fatigue and visibility), and dozens of other variables they could not enumerate if asked.
This is perhaps thirty to fifty dimensions of input, processed simultaneously, producing a crossing decision in under a second. The experienced pedestrian is not consciously evaluating each variable. They are running a pattern-matching model that was calibrated over thousands of crossings, each of which provided feedback (safe crossing, near miss, honked at, had to run). The model works. It produces better decisions than the three-variable rule. It cannot be articulated.
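The calibration process itself can be caricatured in code. The sketch below is a toy, assuming a linear score adjusted by perceptron-style updates; real perceptual learning is nothing so tidy, but it shows the shape of the claim: a forty-dimensional model tuned entirely by outcome feedback, with no rule ever stated:

```python
import random

def calibrate(n_crossings: int = 5000, n_dims: int = 40, seed: int = 0) -> float:
    """Toy calibration: a weight vector tuned only by feedback from
    repeated crossings, never by explicit instruction.

    The environment's 'true' danger mapping is hidden; the learner sees
    only outcomes (safe crossing vs. near miss) and nudges its weights.
    Returns accuracy on 1,000 fresh situations after calibration.
    """
    rng = random.Random(seed)
    true_w = [rng.uniform(-1, 1) for _ in range(n_dims)]  # the environment
    learned_w = [0.0] * n_dims                            # the novice

    def danger(w, x):
        return sum(wi * xi for wi, xi in zip(w, x)) > 0

    for _ in range(n_crossings):
        x = [rng.uniform(-1, 1) for _ in range(n_dims)]   # one situation
        outcome = danger(true_w, x)                       # what actually happened
        if danger(learned_w, x) != outcome:               # near miss, or needless wait
            sign = 1.0 if outcome else -1.0
            learned_w = [wi + 0.1 * sign * xi for wi, xi in zip(learned_w, x)]

    test = [[rng.uniform(-1, 1) for _ in range(n_dims)] for _ in range(1000)]
    return sum(danger(learned_w, x) == danger(true_w, x) for x in test) / 1000
```

After thousands of exposures the learned weights approximate the environment's mapping, yet at no point in the loop is that mapping written down as rules; the only channel is feedback.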
Why Language Fails as a Transmission Channel
Language is a serial, low-bandwidth channel. It transmits one proposition at a time, sequentially. Each proposition can relate a small number of variables: “if X and Y, then Z.” Complex conditionals can extend this to perhaps five or six variables before the sentence becomes unparseable: “if X and Y but not Z, unless W and V, then Q.”
The expert’s model does not operate through conditionals of this form. It operates through a continuous, nonlinear mapping from a high-dimensional input space to an output. The interaction effects between variables are where the real information lives. The road being wet matters differently depending on the car’s speed, which matters differently depending on the driver’s attentiveness, which matters differently depending on the time of day. These are not additive effects that can be listed sequentially. They are multiplicative interactions across many variables simultaneously.
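The difference between additive and interacting effects can be made concrete with numbers. A hypothetical two-variable risk function with a single interaction term (all coefficients invented for illustration):

```python
def risk(speed: float, wet: float) -> float:
    """Hypothetical crossing risk. The speed*wet term means the two
    inputs interact multiplicatively rather than adding independently."""
    return 0.3 * speed + 0.2 * wet + 0.5 * speed * wet

# The effect of a wet road depends on the car's speed:
slow_dry, slow_wet = risk(0.2, 0.0), risk(0.2, 1.0)
fast_dry, fast_wet = risk(1.0, 0.0), risk(1.0, 1.0)
print(round(slow_wet - slow_dry, 2))  # 0.3  (wet adds little at low speed)
print(round(fast_wet - fast_dry, 2))  # 0.7  (wet adds much more at high speed)
```

With even one interaction term, the wet-road rule cannot be stated without reference to speed; with dozens of interacting variables, there is no sequential list of sentences that reproduces the mapping.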