PROTOCOLS / 01 · ANATOMY · EN

MIDI COMPLETE

the protocol that survived forty years, from the byte to MIDI 2.0
MIDI is the oldest digital protocol still in daily use in music. Forty years after its publication, its core hasn't changed: the same bytes as in 1983 still travel between a keyboard and a computer.

This sheet takes the protocol apart layer by layer — from the status byte to the Universal MIDI Packet of MIDI 2.0 — and shows why a deliberately simple standard outlived every hardware revolution. The final section pulls MIDI out of music: the same protocol drives lights, motors, LEDs, entire installations.
/ 00

WHAT IT IS

a communication protocol, not an audio signal
MIDI is not an audio format. No sound travels through a MIDI cable. It is a communication protocol — a format for messages and the rules of their exchange — of the same kind as USB or HTTP: what travels on the wire are addressed bytes, read by the machines that receive them.

The mental shift happens here. Between a MIDI keyboard and a computer, no sound travels. What travels is the order: play note 60 at intensity 100, raise the volume to 127, select program 4. The receiving machine interprets the order and produces the sound itself. A MIDI cable is closer to a telegraph wire than to a speaker cable.

What can be transmitted through those orders covers anything that controls a sound machine: triggering and releasing notes, continuously modifying a parameter — vibrato, filter, volume — synchronizing tempos, selecting sounds, sending fine settings. Anything that can be commanded can be commanded in MIDI; the resulting sound comes out elsewhere.

The scope quickly exceeds music. The protocol describes addressed messages and exchange rules — it never says the target must be an instrument. Synthesizers and digital audio workstations were the historical target, but the same protocol today drives lighting installations, video sequences, robotic arms, generative works (§12). Anything that responds to a message can be commanded in MIDI.
midi is a protocol, not a signal. what circulates is orders.
Diagram
Audio vs MIDI: two ways to get sound. Two parallel chains. Top: an audio source sends a varying electrical voltage through a cable to a speaker that converts it into an acoustic wave. Bottom: a keyboard or DAW sends MIDI orders shown in hex (90 3C 64 means play middle C at velocity 100; B0 07 7F means set volume to maximum) to a synth that turns them into sound. Both produce sound at the end, but only the audio cable carries the sound itself, in electrical form. Audio cable: sound is carried. MIDI cable: it is produced on arrival.
/ 01

GENESIS

1983 — an agreement between competitors
In 1981, Dave Smith, founder of Sequential Circuits, presented to the Audio Engineering Society the idea of a universal interface between synthesizers — the Universal Synthesizer Interface. The problem was concrete: at the time, getting two machines from different brands to talk required proprietary tinkering. Smith joined forces with Ikutaro Kakehashi, founder of Roland, and the idea became a shared standard. The MIDI 1.0 specification was published in August 1983.

The first public demonstration took place at the Winter NAMM of January 1983: a Sequential Prophet-600 and a Roland Jupiter-6 — two competing instruments — exchanged notes over a single cable. The bet fits in one sentence: rival manufacturers agree on an open, royalty-free standard that anyone implements without paying anyone.

That economic decision, as much as the technical one, explains the longevity. MIDI costs nothing, stays deliberately minimal, and settles for being good enough. Smith and Kakehashi received a Technical Grammy in 2013 for a standard that had, in the meantime, equipped nearly all recorded music.
a royalty-free standard, agreed on by competing manufacturers — that economic bet is what kept it alive for forty years.
Diagram
A timeline of MIDI history from 1981 to 2020 in seven milestones. 1981: proposal by Dave Smith of Sequential; 1983: MIDI 1.0, first demo at NAMM (highlighted); 1988: SMF, the .mid file; 1991: General MIDI and MSC; 1999: USB-MIDI; 2018: MPE; 2020: MIDI 2.0, dialogue (highlighted). Forty years with backward compatibility never broken: a Note On from 1983 still plays today.
/ 02

THE MIDI BUS

a transmission lane, not a wire — and one that carries sixteen channels
When a MIDI keyboard sends a message to a computer, the order travels over a bus. The bus is the transmission lane — the physical connection shared by every message moving between machines. On that same lane circulate up to sixteen distinct logical channels: sixteen tags that each message carries in its status byte (§3), letting messages share the connection without mixing. A bus is not a channel — it is the whole that carries sixteen of them.

The bus of 1983 takes the shape of a five-pin DIN cable, unidirectional: an OUT connector on the sender side, an IN connector on the receiver side. To link several instruments to the same sequencer, each machine has a third THRU connector that re-emits bit-for-bit what it received. This is the daisy chain mechanism: sender → IN of A → THRU of A → IN of B → THRU of B → … At every link, electronic latency accumulates, typically 0.3 ms per THRU. Beyond sixteen instruments, several necessarily receive on the same channel and lose independent addressing. A DIN bus thus saturates at around sixteen instruments — exactly the number of channels it carries.

With the rise of software sequencers and large studio setups, multi-port interfaces appear. A single box — MOTU MIDI Express, Emagic Unitor, Steinberg MIDEX — offers eight, sixteen, even thirty-two independent MIDI connectors, linked to the computer by a single proprietary cable. Each port is a separate bus, carrying its own sixteen channels. The topology shifts from chain to star. Eight ports = 8 × 16 = 128 independent channels. The sixteen-channel constraint recedes without disappearing: it remains true per bus, but the whole system counts several.

USB MIDI, standardized in 1999 and broadly deployed from the 2000s onward, abolishes the underlying problem. Each device connected to the computer via USB carries its own complete MIDI bus, with its own sixteen channels. No chaining, no accumulated latency, no forced sharing. A contemporary setup — one computer, four USB controllers, two USB synths — totals six independent buses, that is 6 × 16 = 96 channels, without a single traditional MIDI cable.

You still regularly hear the phrase "MIDI is limited to sixteen channels." It describes the limit of a bus, never of a system. The sixteen-channel cell has become a base unit: what passed for a ceiling in 1983 now serves as a counting unit. This confusion is the most widespread blind spot about the protocol, and the reason MIDI 2.0 raises the limit to 256 channels, organized as sixteen groups of sixteen (§11), rather than stacking layers on top of sixteen.
one usb cable carries a complete bus of 16 channels per device. the 1983 shortage no longer exists.
Diagram
Evolution of the MIDI bus: daisy chain, star, USB. Three topologies stacked. 1983: a daisy chain of instruments behind a master, sharing one bus of 16 channels, saturating at 16 instruments. 1990s: a star around a multi-port interface, each port a separate bus of 16 channels. Today: USB, each device carrying its own complete 16-channel bus, no chaining. 16 channels = the base unit of a system, no longer its ceiling.
The three MIDI DIN sockets: IN receives, THRU copies IN, OUT emits what the device generates. A MIDI device with three sockets. IN (blue) receives the incoming stream. THRU (green) is an exact copy of IN, used to relay to the next device. OUT (gold) emits what the device itself generates from its engine. The pitfall: THRU copies IN, not OUT. To relay what you receive, use THRU, not OUT.
/ 03

THE BYTE

the atomic unit of the protocol
MIDI is an asynchronous serial protocol: it transmits bytes one after another, on a single wire, with no shared clock. Everything else is built on a single distinction, carried by the most significant bit of each byte. When that bit is 1 — a value from 0x80 to 0xFF — the byte is a status byte: it announces the nature of the message. When the bit is 0 — from 0x00 to 0x7F — the byte is a data byte: it carries a value.

This one-bit boundary propagates through all of MIDI. A data byte has only seven bits: hence the 0 to 127 found everywhere — note number, velocity, controller value, program number. The eighth bit never encodes data; it separates an instruction from a value. That is the price of a stream that decodes unambiguously, byte by byte, even when its start was missed.
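As a minimal sketch of that one-bit test, the Python below classifies a byte and resynchronizes a stream whose start was missed. The function names `is_status` and `resync` are illustrative, not part of any MIDI library.

```python
def is_status(byte: int) -> bool:
    """True when the most significant bit is set (0x80 to 0xFF)."""
    return (byte & 0x80) != 0


def resync(stream: bytes) -> bytes:
    """Drop leading data bytes so that decoding starts on a status byte."""
    for index, byte in enumerate(stream):
        if is_status(byte):
            return stream[index:]
    return b""  # no status byte seen yet: nothing can be decoded


# A listener that joined mid-message sees two orphan data bytes first.
late = bytes([0x3C, 0x64, 0x90, 0x3E, 0x40])
print(resync(late).hex(" "))
```

The orphan bytes are simply discarded; decoding picks up at the next status byte, which is the property the paragraph above describes.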

The status byte reads in two nibbles. The four high bits name the message type; the four low bits name the channel, from 0 to 15. Seven types address a channel this way — the Channel Voice messages:

- Note Off — 0x8n — releases a note: note number, then release velocity.
- Note On — 0x9n — triggers a note: note number, then velocity.
- Polyphonic Key Pressure — 0xAn — pressure applied to one specific note after the attack.
- Control Change — 0xBn — a controller changes: controller number, then value (§7).
- Program Change — 0xCn — changes the sound: a single byte, the program number.
- Channel Pressure — 0xDn — the channel's global pressure: a single byte.
- Pitch Bend — 0xEn — pitch flexes: two data bytes.

A complete message is therefore not one byte but a small group: the status byte followed by zero, one or two data bytes depending on the type. A Note On takes three bytes — status, note, velocity; a Program Change takes two; some system messages, only one. The byte is the unit the protocol reads; the message is what it composes.
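The two-nibble reading and the message lengths above can be sketched as a small lookup. The table and the function name `describe_status` are hypothetical; the values are those listed in this section.

```python
# High nibble -> (message name, number of data bytes that follow the status).
CHANNEL_VOICE = {
    0x8: ("Note Off", 2),
    0x9: ("Note On", 2),
    0xA: ("Polyphonic Key Pressure", 2),
    0xB: ("Control Change", 2),
    0xC: ("Program Change", 1),
    0xD: ("Channel Pressure", 1),
    0xE: ("Pitch Bend", 2),
}


def describe_status(status: int) -> tuple[str, int, int]:
    """Split a Channel Voice status byte into (name, channel 0-15, data byte count)."""
    high, low = status >> 4, status & 0x0F
    if high not in CHANNEL_VOICE:
        raise ValueError(f"0x{status:02X} is not a Channel Voice status byte")
    name, data_bytes = CHANNEL_VOICE[high]
    return name, low, data_bytes


print(describe_status(0x90))  # Note On, channel 0 on the wire, 2 data bytes
print(describe_status(0xC3))  # Program Change, channel 3 on the wire, 1 data byte
```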

The note number spans 0 to 127 in semitones. 60 is middle C — its octave name (C3, C4 or C5) varies by manufacturer, but the number does not. Velocity, also seven bits, encodes how fast the key was struck, so most often the intensity of the attack. A convention handles one detail of economy: a Note On with velocity 0 counts as a Note Off. The equivalence lets notes chain without re-sending the status byte each time — the mechanism behind running status (§5).

Pitch bend is the exception to the seven-bit grid. It assembles two data bytes — low then high — into a fourteen-bit value: 16384 steps, centered on 8192, the rest point. The added resolution is no luxury: the ear immediately hears the staircase of a too-coarsely quantized glissando. It is the only Channel Voice value natively on fourteen bits — everywhere else, MIDI 1.0 lives with its 128 steps. That ceiling holds for forty years; it is the one MIDI 2.0 will lift by taking controllers to 32 bits (§11).
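A short sketch of the fourteen-bit assembly, low byte first. The helper names are illustrative only.

```python
def pitch_bend_bytes(value: int, channel: int = 0) -> bytes:
    """Encode a 14-bit pitch bend value (0-16383) as status, LSB, MSB."""
    if not 0 <= value <= 16383:
        raise ValueError("pitch bend value must fit in 14 bits")
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15 on the wire")
    return bytes([0xE0 | channel, value & 0x7F, value >> 7])


def pitch_bend_value(lsb: int, msb: int) -> int:
    """Rebuild the 14-bit value from its two 7-bit data bytes."""
    return (msb << 7) | lsb


print(pitch_bend_bytes(8192).hex(" "))  # the rest point
print(pitch_bend_value(0x7F, 0x7F))     # the top of the range
```

Each data byte keeps its high bit clear, so the fourteen bits are split seven and seven rather than eight and six.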
everything in MIDI 1.0 fits in seven bits. zero to 127, for a note as for a volume.
Diagram
The MIDI byte dissected: a three-byte Note On message. A Note On message is three bytes: status 0x90, note 60, velocity 100. The status byte is split into eight bits 1001 0000: the high nibble codes the type (1001, Note On), the low nibble codes the channel (0000, displayed as channel 1). The high bit set to 1 marks a status byte. The eighth bit never encodes data; it separates an instruction from a value. The corresponding note is middle C: note 60, shown as C3 in Ableton (C4 in scientific convention).
/ 04

THE MESSAGES

seven Channel Voice types — one pitfall per type
Channel Voice messages form the main family of MIDI: everything that travels on a channel, as opposed to System messages (§6). Their status byte always follows the same structure: a high nibble that designates the type (one of seven values, from 0x8n to 0xEn) and a low nibble that carries the channel number (§3). The seven types cover everything an instrument can receive as an order — trigger a note, release it, express pressure, select a sound, shift pitch, transmit a controller value. This is the complete vocabulary of MIDI 1.0 at the scale of a channel.

Note On (0x9n) triggers a note. Three bytes: status, note number (0-127 — note 60 = middle C, §3), and velocity (0-127 — the speed at which the key is pressed, which translates on most instruments into an initial volume and timbral character). It is the most frequent message of the protocol: a musical performance emits hundreds per minute.

Note Off (0x8n) releases a note. Three bytes: status, note number, release velocity (the speed of release). In practice, few instruments act on release velocity, and support among DAWs is uneven. It is encoded in the message but often remains a reception blind spot.

And here appears the most practically loaded pitfall of the family. A historical convention turns a Note On with velocity 0 into an implicit Note Off. When a sequencer plays a sequence of successive notes, it saves one status byte per released note by sending « Note On vel=0 » instead of « Note Off vel=X »; the receiver interprets vel=0 as a release request. This trick is what makes running status (§5) efficient over a melody: a long sequence contains only repeated « Note On »s, with vel>0 to trigger and vel=0 to release. Consequence: when observing MIDI traffic in practice, you see many Note On vel=0 and very few actual 0x8n. And consequence for anyone coding in Max for Live or Max/MSP: a patch watching only for 0x8n will miss half the releases. The right reflex is to treat « Note On vel=0 » as a Note Off in its own right.
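The reflex described above can be sketched in a few lines of Python (the same logic carries over to a Max patch). The function name `note_event` and the tuple layout are assumptions made for the example.

```python
def note_event(status: int, note: int, velocity: int):
    """Classify a three-byte message as a note start or a note release.

    Returns ("on" | "off", channel, note, velocity), or None when the
    message is not a note message at all.
    """
    kind, channel = status & 0xF0, status & 0x0F
    if kind == 0x90 and velocity > 0:
        return ("on", channel, note, velocity)
    if kind == 0x80 or (kind == 0x90 and velocity == 0):
        # A Note On with velocity 0 is a release in its own right.
        return ("off", channel, note, velocity)
    return None


print(note_event(0x90, 60, 100))  # a real attack
print(note_event(0x90, 60, 0))    # a disguised release
print(note_event(0x80, 60, 64))   # an explicit release
```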

Poly Aftertouch (0xAn), also called Poly Key Pressure, transmits the pressure applied to an already-pressed key, note by note. Three bytes: status, note number, pressure value (0-127). It is the rarest message in practice. Implementing Poly AT requires a pressure sensor under each key — a hardware cost that has only become widespread on MPE controllers (Roli Seaboard, LinnStrument, Lightpad) and a few high-end keyboards (Native Instruments Kontrol S88, Polyend Tracker+). On a standard keyboard, Channel Aftertouch (0xDn) substitutes for it. On the receiving side, many instruments ignore Poly AT even when they receive it: they don't know how to map "pressure on this specific note" to a per-note sound parameter.

Control Change (0xBn) is the catch-all of the protocol: 128 controller addresses, two data bytes (CC number 0-127, value 0-127). 1 = mod wheel, 7 = channel volume, 10 = pan, 64 = sustain pedal, etc. CCs carry all continuous controllers (modulation, pan, sustain, expression…) and several structural commands (Bank Select, channel mode, reset). The full detail — the 0-127 map, MSB/LSB pairs for 14-bit precision, channel mode, RPN/NRPN — is the subject of §7.

Program Change (0xCn) selects a sound. It is the shortest of the seven: two bytes — status, program number (0-127). No additional parameter. A library of 128 sounds (often called patches or presets) was plenty in 1983. Today, this range is extended with Bank Select (CC 0 for bank MSB, CC 32 for LSB, sent just before the Program Change) — theoretically 128 × 128 × 128 = over two million combinations, in practice limited by what each device actually exposes.

Second classical pitfall: off-by-one. The protocol numbers programs from 0 to 127, but most manuals and user interfaces display them from 1 to 128. Selecting "program 1" in a DAW actually sends Program Change 0 on the wire. When a sequencer doesn't find the right sound, this offset is almost always to blame.
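Both points, the Bank Select sequence and the display offset, can be sketched together. The function name `select_sound` and the choice to pass the bank as a single 14-bit number are assumptions for illustration; how a given device maps banks is up to its manufacturer.

```python
def select_sound(channel: int, bank: int, displayed_program: int) -> bytes:
    """Build Bank Select MSB (CC 0), Bank Select LSB (CC 32), then Program Change.

    `displayed_program` is the 1-128 number shown in most interfaces;
    the wire carries 0-127, hence the subtraction below.
    """
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15 on the wire")
    if not 0 <= bank <= 16383:
        raise ValueError("bank must fit in 14 bits")
    if not 1 <= displayed_program <= 128:
        raise ValueError("displayed program must be 1-128")
    cc_status = 0xB0 | channel
    return bytes([
        cc_status, 0, bank >> 7,       # CC 0: bank MSB
        cc_status, 32, bank & 0x7F,    # CC 32: bank LSB
        0xC0 | channel, displayed_program - 1,
    ])


# "Program 1" on screen leaves as program 0 on the wire.
print(select_sound(0, 0, 1).hex(" "))
```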

Channel Aftertouch (0xDn), also called Channel Pressure, transmits the pressure applied to the keyboard as a whole. Two bytes: status + value (0-127). Unlike Poly AT, it does not individualize notes: finger pressure on any held key produces a single value that applies to all notes currently playing on the channel. It is the standard aftertouch on the vast majority of master keyboards because it requires only one sensor for the entire keyboard mechanism. Typically mapped to vibrato, filter opening, or volume.

Pitch Bend (0xEn) continuously shifts the pitch of all notes currently playing on the channel. Three bytes: status, value LSB, value MSB. Combined, the two bytes form a 14-bit value (16,384 positions) centered on 8,192, which means "no shift". This high resolution comes from a musical need: a pitch bend wheel must be able to interpolate smoothly, without audible stairs. The default range is ±2 semitones, configurable via RPN 0 (Registered Parameter Number, §7) up to ±24 semitones on most instruments. The pitfall: it is easily forgotten that center is 8192, not 0. At the default range, a patch sending 0 as "neutral value" puts pitch two semitones (one whole tone) below.
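A worked conversion makes the centering concrete. This is a sketch under the linear-mapping assumption that most instruments follow; `bend_to_semitones` is a hypothetical helper, and `bend_range` stands for whatever range the receiver has been set to.

```python
def bend_to_semitones(value: int, bend_range: float = 2.0) -> float:
    """Map a 14-bit pitch bend value to a shift in semitones.

    8192 is the rest point; 0 reaches the full downward range.
    """
    if not 0 <= value <= 16383:
        raise ValueError("pitch bend value must fit in 14 bits")
    return (value - 8192) / 8192 * bend_range


print(bend_to_semitones(8192))   # rest point: no shift
print(bend_to_semitones(0))      # full downward bend at the default range
print(bend_to_semitones(16383))  # just short of the full upward range
```

Because the range has 8192 steps below center and only 8191 above, the maximum value lands one step short of the full upward range.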

Seven messages, seven status bytes, two usage families: notes (On, Off, Poly AT) and parameters (CC, Program Change, Channel AT, Pitch Bend). This is the entirety of the MIDI 1.0 Channel Voice vocabulary. Everything else — clock, transport, SysEx — is System (§6), addressed off-channel. Knowing the table of seven and their pitfalls is the foundation that lets you read a MIDI stream without confusion.
seven types, seven pitfalls. a note on with velocity 0 equals note off — the most famous one hides six more.
Diagram
The seven Channel Voice messages of MIDI 1.0. Table of the seven Channel Voice message types: type name, status byte (hex), number of bytes, role, and a typical pitfall for each type.
TYPE | STATUS | BYTES | ROLE | PITFALL
Note On | 0x9n | 3 | trigger a note | velocity 0 = disguised Note Off
Note Off | 0x8n | 3 | release a note | release velocity rarely read
Poly AT | 0xAn | 3 | per-note pressure | rare, replaced by Channel AT
Control Change | 0xBn | 3 | continuous controllers | 128 addresses → §7
Program Change | 0xCn | 2 | select a sound | off-by-one 0-127 ↔ 1-128
Channel AT | 0xDn | 2 | channel-wide pressure | doesn't distinguish notes
Pitch Bend | 0xEn | 3 | continuous pitch shift | center = 8192, not 0
These seven cover everything that runs on a channel. Clock, transport and SysEx are System (§6).
The life of a MIDI note: Note On triggers it, it sounds, Note Off ends it. Duration is the interval. A time axis. At the start, NOTE ON 0x90 with velocity 100 triggers the note. The note stays active (middle C sounds) for a duration. Then NOTE OFF 0x80 with its release velocity ends it. The duration equals the interval between On and Off; it is never transmitted as such. A missing Note Off leaves a stuck (hanging) note.
Aftertouch: channel aftertouch sends one pressure value for the whole channel, poly aftertouch sends one per key. Left, channel aftertouch (0xDn): three pressed keys all converge to a single pressure gauge, one value for the whole channel. Right, poly aftertouch (0xAn): each pressed key has its own gauge at a different level, one value per key, independent. Channel uses one global sensor (common); poly uses one sensor per key (rare, costly, very expressive). Same idea, very different feel.
/ 05

THE CHANNELS

the addressing model and its limits
A Channel Voice status byte (§3) holds the channel in its low nibble: four bits, so sixteen values, 0 to 15. On screen those channels show as 1 to 16; in the byte they are 0 to 15. Sixteen logical addresses on a single cable — that is the whole of MIDI 1.0 addressing.

A channel is not a physical wire. It is a tag carried by every message, which devices read to decide whether an order concerns them. Several instruments chained together receive the same stream; each reacts only to messages on its channel and ignores the rest. The separation is logical, not physical.

How a device listens to those channels depends on its reception mode. Two axes combine. The first, Omni, decides whether it listens to all channels indiscriminately (Omni On) or only to its own (Omni Off). The second, Poly versus Mono, decides whether it plays several notes at once (Poly) or only one (Mono). The four combinations give the spec's four historical modes — Omni On/Poly, Omni On/Mono, Omni Off/Poly, Omni Off/Mono — selected by the Channel Mode messages seen in §7 (CC 124 to 127).

This is where the two families sharing the 0xBn status part ways. Below CC 120 they are Channel Voice — continuous controllers acting on the sound. From CC 120 up they are Channel Mode — orders acting on how the channel listens. Same status byte, two roles, separated by a threshold in the controller number.
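The threshold can be sketched as a single comparison. The function name `cc_family` is illustrative; the mode names in the table are the ones the MIDI 1.0 specification assigns to controllers 120 to 127.

```python
# Channel Mode messages: the top eight controller numbers.
CHANNEL_MODE = {
    120: "All Sound Off",
    121: "Reset All Controllers",
    122: "Local Control",
    123: "All Notes Off",
    124: "Omni Mode Off",
    125: "Omni Mode On",
    126: "Mono Mode On",
    127: "Poly Mode On",
}


def cc_family(controller: int) -> str:
    """Tell whether a 0xBn message is Channel Voice or Channel Mode."""
    if not 0 <= controller <= 127:
        raise ValueError("controller number must be 0-127")
    return "Channel Mode" if controller >= 120 else "Channel Voice"


for number in (7, 64, 119, 120, 124):
    print(number, cc_family(number), CHANNEL_MODE.get(number, ""))
```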

The channel also carries an economy. Since the status byte does not change while messages of the same type chain on the same channel, the spec allows omitting it: this is running status. After a first Note On, a series of note/velocity pairs suffices — each new pair is read as a Note On until a different status arrives. Coupled with the "Note On of velocity 0 equals Note Off" convention (§3), running status markedly cuts the throughput: a dense note stream fits in two bytes per event instead of three.
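On the sending side, running status amounts to dropping a status byte that repeats the previous one. A minimal sketch, assuming the input holds only Channel Voice messages (running status does not apply to System messages); `encode_running` is a hypothetical name.

```python
def encode_running(messages: list[bytes]) -> bytes:
    """Serialize Channel Voice messages, omitting repeated status bytes."""
    out = bytearray()
    last_status = None
    for message in messages:
        status = message[0]
        if not 0x80 <= status <= 0xEF:
            raise ValueError("this sketch handles Channel Voice messages only")
        if status != last_status:
            out.append(status)
            last_status = status
        out.extend(message[1:])
    return bytes(out)


three_notes = [bytes([0x90, 0x3C, 0x40]),
               bytes([0x90, 0x3E, 0x40]),
               bytes([0x90, 0x40, 0x40])]
packed = encode_running(three_notes)
print(len(packed), packed.hex(" "))
```

Three Note Ons that would take nine bytes in full form leave as seven; releasing the same notes with velocity 0 would keep the 0x90 status alive and cost two bytes each.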

Sixteen channels enable multitimbrality: a single device playing several distinct sounds, one per channel — a piano on channel 1, a bass on 2, drums on 10. A multitimbral module is, from this angle, sixteen instruments in one box, addressed by a single cable.

But sixteen is a low ceiling. As soon as per-note expression is wanted — a pitch bend specific to each finger on a keyboard, say — the model chokes: pitch bend, controllers and pressure apply to the whole channel, hence to all its notes at once. Working around that limit first produced MPE, which hijacks the channels by dedicating one per note (§10), then MIDI 2.0, which lifts the ceiling by moving to 256 channels and making controllers per-note native (§11). The sixteen channels of 1983 didn't vanish; they became the base cell of a far wider addressing scheme.
a channel is not a wire. it is a tag each message carries, and each device reads to know whether the order is for it.
Diagram
One MIDI bus, sixteen channels, system messages. A large gold-bordered circle represents the MIDI bus. Inside, sixteen smaller circles numbered 1 to 16 represent the channels. A violet circle at the center labeled SYSTEM MESSAGES represents the family of messages that bear no channel address. The bus is the shared physical envelope. Channels are labels; system messages carry none.
Running status: the repeated status byte omitted. Three chained Note On messages. Full form, nine bytes: 90 3C 40 90 3E 40 90 40 40, status 0x90 repeated each time. Running status, seven bytes: 90 3C 40 3E 40 40 40, the status byte omitted on the second and third notes since type and channel do not change.
Multitimbrality: one MIDI port carries 16 channels, a multitimbral synth gives each a different sound. One MIDI port (gold, 16 channels) connects by a single cable to a multitimbral synth. Inside, each channel gets a different sound: channel 1 piano, channel 2 bass, channel 3 strings, channel 10 drums, and so on up to channel 16 lead. One player drives a whole band.
/ 06

THE SYSTEM MESSAGES

a class of messages addressed to no channel
When you press PLAY in Ableton and the external synth starts at the exact instant, when a sampler stays synchronized throughout the session, when a firmware updates over a MIDI cable, when a bank of patches is saved from an old DX7 to disk — all of that runs on System messages. This is the second super-family of MIDI, alongside Channel Voice (§4). The defining feature: these messages address the system as a whole, never a particular instrument.

Technically, they are identified by their status byte. As soon as the high nibble reaches 0xF — values 0xF0 to 0xFF — the channel convention disappears. The low nibble, which in Channel Voice carried the channel number 0-15 (§3), instead becomes a sub-type. That is why a System message has no channel: the slot is taken by something else. Sixteen statuses available, divided across three distinct families.
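A sketch of that reading: once the high nibble is 0xF, the low nibble is returned as a sub-type rather than a channel. The function name `system_family` is hypothetical, and grouping 0xF7 with SysEx follows this sheet's own status map rather than a formal rule.

```python
def system_family(status: int) -> tuple[str, int]:
    """Return (family, sub-type) for a status byte in 0xF0-0xFF."""
    if not 0xF0 <= status <= 0xFF:
        raise ValueError("not a System status byte")
    sub_type = status & 0x0F  # no longer a channel number
    if status in (0xF0, 0xF7):
        return "System Exclusive", sub_type
    if status < 0xF8:
        return "System Common", sub_type
    return "System Real-Time", sub_type


for status in (0xF0, 0xF2, 0xF7, 0xF8, 0xFA):
    print(f"0x{status:02X}", system_family(status))
```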

System Common (0xF1 to 0xF6) groups messages that address the whole system, but punctually rather than continuously. Four practical cases used in production:

- MIDI Time Code Quarter Frame (0xF1) — a component of an SMPTE clock transmitted over MIDI. Synchronizes tape, video sequencer, or DAW session to another system's time code (§9 for the detail).
- Song Position Pointer (0xF2) — indicates where you are in the current sequence, expressed in sixteenth notes from the start. Useful for a sampler to know where to resume after a user jump on the timeline.
- Song Select (0xF3) — selects a song by number (0-127) among those stored in a sequencer. Heavily used in live setups with hardware sequencers.
- Tune Request (0xF6) — requests all analog synths on the bus to retune. Inherited from a time when VCOs drifted with temperature.

Values 0xF4 and 0xF5 are reserved by the specification — never emitted in practice. 0xF7 has a special use (end of SysEx, see below).
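As an example of a System Common message with data, Song Position Pointer packs its sixteenth-note count into two 7-bit bytes, low byte first. The helper name `song_position` and the 4/4 bar arithmetic in the usage line are assumptions for illustration.

```python
def song_position(sixteenths: int) -> bytes:
    """Build a Song Position Pointer message: 0xF2, LSB, MSB."""
    if not 0 <= sixteenths <= 16383:
        raise ValueError("position must fit in 14 bits")
    return bytes([0xF2, sixteenths & 0x7F, sixteenths >> 7])


# Start of bar 9 in 4/4: eight full bars of sixteen sixteenth notes each.
print(song_position(8 * 16).hex(" "))
```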

System Real-Time (0xF8 to 0xFF) is the protocol's clockwork. Six single-byte messages (status, no data), because they must traverse the bus with negligible latency:

- Timing Clock (0xF8) — 24 pulses per quarter note. This is the master clock that keeps the tempo synchronized across all devices on the bus.
- Start (0xFA) — starts the sequence from the beginning.
- Continue (0xFB) — resumes the sequence where it stopped.
- Stop (0xFC) — stops the sequence.
- Active Sensing (0xFE) — optional heartbeat sent at intervals of no more than 300 ms, letting the receiver detect a cable disconnect. Rarely used in practice.
- Reset (0xFF) — resets everything to power-on state.

Typical real-world case: when you press PLAY in Ableton Live configured as transport master, a Start (0xFA) goes out first, then a continuous stream of Clock (0xF8) at 24 ppqn to the slave devices. The downstream synth receives the Start and begins at the exact instant, then each pulse keeps tempo aligned with the DAW.
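The 24 pulses per quarter note translate directly into a tick interval. A small worked sketch; the function names are illustrative, and real senders schedule these bytes against a clock rather than building them in one buffer.

```python
def clock_interval_ms(bpm: float) -> float:
    """Milliseconds between two Timing Clock (0xF8) bytes at a given tempo."""
    if bpm <= 0:
        raise ValueError("tempo must be positive")
    return 60_000 / (bpm * 24)


def start_then_clocks(ticks: int) -> bytes:
    """Bytes a transport master emits: one Start, then `ticks` Clock bytes."""
    return bytes([0xFA]) + bytes([0xF8]) * ticks


print(round(clock_interval_ms(120), 3))  # interval in ms at 120 BPM
print(start_then_clocks(24).hex(" "))    # Start followed by one quarter note of clocks
```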

THE TRAP — absolute priority. Real-Time messages can insert themselves in the middle of any other message. If a three-byte Note On is in the middle of being transmitted and a Clock arrives because you are on a beat, the Clock passes between two bytes of the Note On. The receiver must know that any status from 0xF8 to 0xFF is to be processed immediately, without interrupting the parsing of the message in progress. A naive parser that treats every byte with the high bit set as the start of a new message will abandon the half-read Note On when the Clock arrives, then misread the data bytes that follow. This is one of the subtleties that makes writing a robust MIDI parser trickier than it first looks.
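A minimal parser sketch showing the rule: Real-Time bytes are emitted at once and leave the state of the message in progress untouched. The name `parse` and the tuple output are assumptions; the sketch deliberately skips System Common and SysEx so that the Real-Time logic stays visible.

```python
# Data bytes expected after each Channel Voice status, keyed by high nibble.
DATA_LEN = {0x8: 2, 0x9: 2, 0xA: 2, 0xB: 2, 0xC: 1, 0xD: 1, 0xE: 2}


def parse(stream: bytes) -> list[tuple[int, ...]]:
    """Decode Channel Voice messages, tolerating interleaved Real-Time bytes."""
    events: list[tuple[int, ...]] = []
    status = None
    data: list[int] = []
    for byte in stream:
        if byte >= 0xF8:
            events.append((byte,))  # Real-Time: handle now, keep parser state
            continue
        if byte & 0x80:
            # New status; this sketch ignores 0xF0-0xF7 and their data.
            status = byte if byte < 0xF0 else None
            data = []
            continue
        if status is None:
            continue  # orphan data byte
        data.append(byte)
        if len(data) == DATA_LEN[status >> 4]:
            events.append((status, *data))
            data = []  # status is kept, which also gives running status
    return events


# A Clock lands between the note number and the velocity of a Note On.
print(parse(bytes([0x90, 0x3C, 0xF8, 0x64])))
```

The Clock is reported first because it was handled the moment it arrived; the Note On completes afterwards with both of its data bytes intact.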

System Exclusive (SysEx) starts with 0xF0 and ends with 0xF7. Between them, anything goes. It is the protocol's escape hatch for whatever fits no other category: patch editing on older synths (DX7, Juno-106, Wavestation, MicroKorg), patch banks saved to disk, firmware updates, proprietary requests, internal communication between modules of the same manufacturer.

The minimal structure of a SysEx message:

- 0xF0 — start
- 1 or 3 bytes of manufacturer ID: a single byte, or 0x00 followed by two more (Roland = 0x41, Yamaha = 0x43, Sequential = 0x01, Moog = 0x04, Korg = 0x42, etc., assigned by the MIDI Manufacturers Association)
- N bytes of payload, each ≤ 0x7F to avoid being mistaken for a status
- 0xF7 — end
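The four-part structure above can be sketched as a function that checks the delimiters and separates the manufacturer ID from the payload. `split_sysex` is a hypothetical helper, and the sample frames are made-up byte sequences rather than real device dumps.

```python
def split_sysex(frame: bytes) -> tuple[bytes, bytes]:
    """Split a complete SysEx frame into (manufacturer ID, payload)."""
    if len(frame) < 3 or frame[0] != 0xF0 or frame[-1] != 0xF7:
        raise ValueError("a SysEx frame starts with 0xF0 and ends with 0xF7")
    body = frame[1:-1]
    if any(byte > 0x7F for byte in body):
        raise ValueError("bytes between the delimiters must be <= 0x7F")
    # A first ID byte of 0x00 announces a three-byte manufacturer ID.
    id_length = 3 if body[0] == 0x00 else 1
    if len(body) < id_length:
        raise ValueError("truncated manufacturer ID")
    return body[:id_length], body[id_length:]


maker, payload = split_sysex(bytes([0xF0, 0x41, 0x10, 0x42, 0xF7]))
print(maker.hex(), payload.hex(" "))
```

The payload is handed back untouched: interpreting it is the manufacturer's business, which is exactly what the sketch cannot do.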

The payload is not standardized. Each manufacturer invents its own internal protocol, and a Korg patch dump has no chance of being interpreted by a Yamaha. It is the wild west: SysEx interoperability is zero by construction. The Universal SysEx spec (IDs 0x7E for non-real-time, 0x7F for real-time) adds a standardized layer on top: MTC Full Frame, GM Mode On/Off, Master Volume, Master Balance. But these Universal standards remain the exception, and the bulk of SysEx in circulation is proprietary.

THE SECOND TRAP — 0xF7 has a double duty. It is the end-of-SysEx marker, but it also sits in the System Common range as a status of its own, with no data and no meaning outside a SysEx frame. In practice, an isolated 0xF7 is never emitted by a device: it always serves to close an open SysEx frame. This ambiguity is one of the scars of a status table drawn up in 1983 with little room to spare.

System = three families, sixteen statuses, one common feature: no notion of channel. Common for the punctual, Real-Time for the real-time clockwork, Exclusive for the unlimited manufacturer territory. It is the structural complement of the seven Channel Voice messages (§4). Together — seven Channel Voice + sixteen System — they form the entirety of the MIDI 1.0 vocabulary, down to the last byte.
when the status reaches 0xf, the low nibble stops being a channel and becomes a sub-type. three families live there — common, real-time, sysex.
Diagram
Map of MIDI System status bytes 0xF0 to 0xFF. A grid of 16 cells covering the status bytes from 0xF0 to 0xFF. These bytes have no channel nibble: their low half encodes a sub-type. Color groups: violet for SysEx delimiters (0xF0 SysEx start and 0xF7 EOX end), blue for System Common (0xF1 MTC Quarter Frame, 0xF2 Song Position, 0xF3 Song Select, 0xF6 Tune Request), red for System Real-Time (0xF8 Clock, 0xFA Start, 0xFB Continue, 0xFC Stop, 0xFE Active Sense, 0xFF Reset). Gray cells (0xF4, 0xF5, 0xF9, 0xFD) are reserved/undefined. 0xF = no channel: these messages speak to the system, not to an instrument.
Anatomy of a SysEx message: F0, manufacturer ID, payload, F7. A SysEx message has four parts: F0 starts it, a manufacturer ID follows (Roland 0x41, Yamaha 0x43, Korg 0x42, Universal 0x7E or 0x7F), then a proprietary payload of free length, and F7 ends it (EOX). The length is free, terminated only by F7. Between F0 and F7 the manufacturer does what it wants: patch dumps, settings, firmware.
MIDI clock ticks 24 times per quarter note via F8 messages. A timeline shows ticks. Every tick is an F8 Clock message; there are 24 ticks per quarter note. Gold marks fall on each beat. Twenty-four ticks make one quarter note: that is the shared tempo. Start (FA), Stop (FC), Continue (FB) drive the transport.
/ 07

FOCUS · CONTROL CHANGES

128 controllers, half of them standardized
The Control Change is the 0xBn message: a status byte, then two data bytes — the controller number (0 to 127) and its value (0 to 127). Behind this single form sits the widest and most disparate zone of MIDI 1.0: 128 addresses carrying everything that is neither a note, nor a program change, nor a system message. Modulation, volume, pan, pedals, bank selection, fine synthesis settings — all travel through the same format, told apart only by their number.
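The three-byte form can be sketched in a few lines of Python. The helper name `control_change` is an assumption made for this example, and channels are numbered 1 to 16 as on a front panel, so the low nibble is the channel minus one.

```python
def control_change(channel: int, controller: int, value: int) -> bytes:
    """Build a Control Change message: status 0xBn, controller, value."""
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1..16")
    if not 0 <= controller <= 127 or not 0 <= value <= 127:
        raise ValueError("controller and value must be 0..127")
    status = 0xB0 | (channel - 1)  # high nibble = type, low nibble = channel
    return bytes([status, controller, value])


# CC 7 (channel volume) set to 100 on channel 1
print(control_change(1, 7, 100).hex(" "))  # b0 07 64
```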

The map of the 128 numbers is not uniform. Part of it is standardized by the MIDI specification: a device receiving CC 7 knows it means volume, wherever it comes from. Another part is left open, at the manufacturers' discretion. That open half is the source of both the richness and the disorder of the ecosystem: two synthesizers can assign the same number to two unrelated parameters.

A few numbers anchor everyday use:

- CC 1 — the modulation wheel. The default expressive controller, almost always wired to vibrato or filter opening.
- CC 7 — channel volume. The overall level, the one a console fader touches.
- CC 10 — pan. Position in the stereo field, 0 left, 64 center, 127 right.
- CC 11 — expression. A second, relative volume, meant for dynamics inside a phrase without touching the CC 7 master level.
- CC 64 — the sustain pedal. Values 64 to 127 mean the pedal is down; 0 to 63, up.

The seven-bit problem (§3) bites here too: 128 steps rarely suffice for a filter sweep or a slow fade, where the staircase is audible. The spec therefore provides a 14-bit resolution mechanism by pairing. CC 0 to 31 carry the high seven bits (MSB); CC 32 to 63 carry the matching low seven bits (LSB), each at its partner's number plus 32. Sending CC 1 then CC 33 composes a 14-bit modulation value: 16384 steps instead of 128. In practice few devices implement the LSB, and the CC 32 to 63 range often stays unused.
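A minimal sketch of the pairing arithmetic, assuming hypothetical helper names (`split14`, `join14`, `modulation_14bit`): the 14-bit value is cut into two 7-bit halves, and the LSB travels on the controller number plus 32.

```python
def split14(value: int) -> tuple:
    """Split a 14-bit value into (MSB, LSB), each 7 bits."""
    if not 0 <= value <= 0x3FFF:
        raise ValueError("value must be 0..16383")
    return value >> 7, value & 0x7F


def join14(msb: int, lsb: int) -> int:
    """Recombine two 7-bit halves: (MSB x 128) + LSB."""
    return (msb << 7) | lsb


def modulation_14bit(channel: int, value: int) -> bytes:
    """Send modulation as CC 1 (MSB) followed by CC 33 (LSB)."""
    msb, lsb = split14(value)
    status = 0xB0 | (channel - 1)
    return bytes([status, 1, msb, status, 33, lsb])


print(split14(8192))    # (64, 0): the exact midpoint
print(join14(127, 127))  # 16383: the top of the range
```

A receiver that ignores the LSB still gets a usable 7-bit value from the MSB alone, which is why the pairing degrades gracefully.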

One pair is almost always used: Bank Select. CC 0 and CC 32 select a sound bank before a Program Change (§4) picks the program within it. Without the pair, Program Change reaches only the first 128 sounds; with it, the space opens to 16384 banks of 128 programs each.
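The three-message sequence can be sketched as follows. The function `select_sound` is hypothetical, and real devices differ in which bank numbers they accept and whether they read the MSB, the LSB or both, so the manual remains the reference.

```python
def select_sound(channel: int, bank: int, program: int) -> bytes:
    """Bank Select MSB (CC 0), Bank Select LSB (CC 32), then Program Change."""
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1..16")
    if not 0 <= bank <= 0x3FFF or not 0 <= program <= 127:
        raise ValueError("bank must be 0..16383, program 0..127")
    cc = 0xB0 | (channel - 1)  # Control Change status
    pc = 0xC0 | (channel - 1)  # Program Change status (one data byte only)
    return bytes([cc, 0, bank >> 7, cc, 32, bank & 0x7F, pc, program])


# Bank 129 (MSB 1, LSB 1), program 4, on channel 1
print(select_sound(1, 129, 4).hex(" "))  # b0 00 01 b0 20 01 c0 04
print(16384 * 128)                       # 2097152 addressable sounds
```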

The top of the map is reserved. CC 120 to 127 are not continuous controllers but Channel Mode messages — orders aimed at the channel itself:

- CC 120 — All Sound Off: cuts all sounding notes immediately.
- CC 121 — Reset All Controllers: returns controllers to their default values.
- CC 122 — Local Control: connects or disconnects the keyboard from its own internal sound engine.
- CC 123 — All Notes Off: releases every held note.
- CC 124 to 127 — the Omni and Mono/Poly modes, which define how the channel listens (§5).

Then comes the most indirect mechanism: Registered and Non-Registered Parameter Numbers. Rather than spending one controller number per parameter, RPN and NRPN use a handful as an addressing scheme. A parameter is selected first — CC 101 and 100 for an RPN, a standardized parameter such as pitch bend sensitivity; CC 99 and 98 for an NRPN, a manufacturer-specific one — then its value is edited through Data Entry, CC 6 and CC 38. A protocol inside the protocol, opening a near-unlimited address space at the cost of a multi-message sequence.
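As a sketch of that multi-message sequence, the Python below sets pitch bend sensitivity, which is RPN 0 (MSB 0, LSB 0). The function name is an assumption. The closing "null RPN" (127, 127) is a common precaution so that later Data Entry messages do not edit the parameter by accident.

```python
def set_pitch_bend_range(channel: int, semitones: int, cents: int = 0) -> list:
    """Return the Control Change messages that set RPN 0 (pitch bend sensitivity)."""
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1..16")
    if not 0 <= semitones <= 127 or not 0 <= cents <= 127:
        raise ValueError("semitones and cents must be 0..127")
    status = 0xB0 | (channel - 1)
    return [
        (status, 101, 0),        # RPN MSB = 0
        (status, 100, 0),        # RPN LSB = 0 -> pitch bend sensitivity
        (status, 6, semitones),  # Data Entry MSB: semitones
        (status, 38, cents),     # Data Entry LSB: cents
        (status, 101, 127),      # null RPN: deselect the parameter
        (status, 100, 127),
    ]


for message in set_pitch_bend_range(1, 12):
    print(bytes(message).hex(" "))
```

Six messages to change one parameter: that is the cost of the address space the paragraph describes.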

This dual nature — one half standardized, one half open — is the whole story of Control Changes. It explains why the same wheel movement can drive a filter on one synth and nothing at all on another. The spec fixed the essentials and left the rest open; forty years of use have filled the blanks with conventions that sometimes contradict each other.
a control change is an address and a value. what the address means never travels with it.
Diagram
The map of the 128 Control Change numbers. A Control Change message is a status byte (0xB0: type and channel), a controller number (here 01, modulation) and a value from 0 to 127 (here 64). The 128 numbers split into zones: 0-31 high byte and 32-63 low byte (paired for 14-bit resolution, 16384 steps), 64-119 switches and continuous controllers, 120-127 channel mode. Standard CCs: 1 modulation, 7 volume, 10 pan, 11 expression, 64 sustain. Half the numbers are standardized; the other half are left to manufacturers.
MSB and LSB: two 7-bit CCs for 14-bit resolution. CC n is the MSB (coarse, 7 bits, e.g. CC 0). CC n+32 is the LSB (fine, 7 bits, e.g. CC 32). Combined as (MSB × 128) + LSB, they form a 14-bit value: 16384 steps instead of 128, allowing fine moves without the staircase.
Selecting a sound: Bank Select + Program Change. Three steps in sequence. Step 1: CC 0, Bank Select MSB (coarse bank). Step 2: CC 32, Bank Select LSB (fine bank). Step 3: Program Change, the sound (0-127). The bank is MSB × 128 + LSB; the Program Change picks the sound within it. Program Change alone reaches 128 sounds; sending Bank Select first unlocks 16384 banks × 128, over two million addressable sounds.
/ 08

TRANSPORT

five physical supports carry one protocol
When you plug in a MIDI cable, it often feels like there is "the" MIDI cable. In reality, five physical supports today carry the same protocol, and the choice between them changes everything — audible latency, maximum distance, stage reliability, plug-and-play. The MIDI bus (§2) is logical and invariant; it is its material incarnations that vary at the low level. Knowing the five and their quirks saves hours hunting why a cable "doesn't work".

Important note — this is about physical transport, the material supports that carry MIDI bytes. Not to be confused with sequencer transport (Play, Stop, Record, Locate, Rewind), which is a family of commands at the message level, covered in §9 (MMC). Same word, two completely distinct worlds.

DIN-5 (since 1983) is the original standard. A round 5-pin connector, three pins used: pins 4 and 5 for data, pin 2 for shield ground. Inside every receiver, an opto-isolator (an infrared LED facing a phototransistor) decouples sender and receiver: the signal crosses as light, the two devices share no electrical ground, and no ground loop is possible. This opto-isolation is what made the protocol robust on stage for forty years. Rate: 31,250 baud (31.25 kbps), chosen in 1983 to be achievable with affordable electronics of the time. Per-byte latency: 0.32 ms (one byte = 10 bits with start and stop bits, that is 320 µs at 31.25 kbps), so a three-byte Note On takes about 1 ms. Max specified length: 15 m before signal degradation, often less in practice with average cables. Unidirectional connectors (separate IN, OUT, THRU; see §2 on daisy chain), bulky to carry but nearly indestructible in use. Most stage keyboards and high-end rack synths still feature DIN-5 precisely for this robustness.
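The latency figures follow directly from the baud rate. A small sketch of the arithmetic, with constant and function names chosen for the example:

```python
BAUD = 31_250        # bits per second on a DIN-5 or TRS link
BITS_PER_BYTE = 10   # 1 start bit + 8 data bits + 1 stop bit


def wire_time_ms(n_bytes: int) -> float:
    """Time needed to serialize n_bytes on the wire, in milliseconds."""
    return n_bytes * BITS_PER_BYTE / BAUD * 1000


def max_messages_per_second(message_bytes: int = 3) -> int:
    """Upper bound on back-to-back messages of a given size."""
    return BAUD // (BITS_PER_BYTE * message_bytes)


print(round(wire_time_ms(1), 3))   # one byte
print(round(wire_time_ms(3), 3))   # a full Note On
print(max_messages_per_second())   # three-byte messages per second
```

A ten-note chord sent as ten full Note On messages therefore spreads over roughly 10 ms on a serial link, before any processing delay in the receiver.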

TRS 3.5 mm (since 2014-2015) appeared with the miniaturization of controllers — Korg Volca, Arturia BeatStep, Make Noise 0-Coast, Teenage Engineering OP-1. The stereo mini-jack (Tip / Ring / Sleeve, three pins) replaces DIN-5 when bulk becomes prohibitive. Same opto-isolation, same 31,250 baud rate, same latency. Small, ubiquitous in audio, cheap.

THE TRS PITFALL — two incompatible wiring schemes have circulated since the start, with no official standard for years:
- Type A: Tip = data, Ring = +5V — officially standardized by the MIDI Manufacturers Association in 2018; adopted by Korg, Make Noise, Novation (after 2018), Empress
- Type B: Tip = +5V, Ring = data — historical wiring used before 2018 by Novation, Arturia, Akai

Plugging a standard TRS cable between a Type A device and a Type B device gives complete silence. The opto-isolation protects against any hardware damage, but no communication passes. Solutions: use a type-specific TRS-DIN adapter, a reversible (crossed) cable like those Befaco makes, or check the specs before buying. Much pre-2018 gear remains Type B without firmware updates — always check the manual.

USB-MIDI (since 1999) has become the de facto standard for connecting computer and MIDI hardware. The USB Audio Class 1.0 specification includes MIDI from 1999; the Class Compliant designation guarantees no proprietary driver is needed — the OS recognizes the device on plug-in across Windows, macOS, Linux, iOS, Android. Typical USB latency: 1 to 3 milliseconds, musically negligible. Rate: far above strict MIDI needs, which lets multiple buses multiplex on a single cable — one USB connector can virtually carry several independent MIDI ports (§2 on buses). Max passive cable distance: 5 m; beyond that, active hub or extender required. USB-powered controllers possible without external supply — practical for mobile setups.

BLE-MIDI (since 2014) transports MIDI over Bluetooth Low Energy. Specification published by Apple in 2014, with native support in iOS 8 and OS X Yosemite, then in Windows later. On Android, arrival has been later and fragmented depending on manufacturer. No cable, maximum mobility, pairing in seconds.

THE BLE PITFALL — variable, unpredictable latency. Unlike wired supports, where latency is constant and low, BLE-MIDI typically has 5 to 15 ms of latency under ideal conditions, and can climb to 30 ms or more under interference (2.4 GHz Wi-Fi sharing the band, other Bluetooth devices, distance, obstacles). This variability rules BLE-MIDI out wherever timing is strict: triggering a sampler in real time, driving a precise sequencer, or any setup demanding sub-millisecond synchronization. Acceptable for slow controllers (expression pedals, fader controllers, study keyboards), unacceptable for live performance requiring strict rhythmic precision.

RTP-MIDI / AppleMIDI (since 2005) transports MIDI over an IP network. Appeared in Mac OS X Tiger in 2005 under the name AppleMIDI, later standardized as RTP-MIDI under IETF RFC 4695 and 6295. Uses UDP, by default on port 5004 for session control and port 5005 for the MIDI data. On local LAN, latency of 1 to 5 ms, comparable to USB. Over Wi-Fi or remote WAN, dependent on network quality. Distance theoretically unlimited as long as the machines can reach each other over IP (same network, or via VPN). Lets a Mac connect to another via Ethernet, Wi-Fi, or Internet: practical for distributing a setup across several machines, or for remote control from another location. On Windows, Tobias Erichsen's rtpMIDI driver is the equivalent, free and widely used in practice. Configuration: automatic Bonjour discovery between Apple machines, explicit IP entry otherwise. Less plug-and-play than USB, but it opens up setup architectures impossible otherwise.

Five supports, one protocol. What travels on the cable — the MIDI bytes themselves (§3, §4, §6) — is identical everywhere, down to the byte. But the choice of support determines all the rest: audible latency (from 0.3 ms for DIN-5 to 30 ms for BLE under interference), reach (15 m for DIN-5, unlimited for RTP-MIDI on LAN), stage reliability (DIN-5 is bulletproof, BLE can drop at the slightest interference), plug-and-play experience (USB without config, BLE with pairing, RTP with network config). The right reflex: choose the support for the use case, not the reverse — DIN-5 for robust stage, USB for studio, TRS for small controllers (checking A versus B), BLE for non-strict mobility, RTP for distributed setups.
five physical supports carry the same protocol. what changes: latency, distance, pitfall.
Diagram
Five supports, one protocol: the physical layers. Comparison of the five physical supports carrying MIDI, by connector, rate, latency, distance and pitfall. What travels (the bytes) is identical everywhere.
- DIN-5: 5-pin 180° connector; 31250 baud; 0.3 ms per byte; 15 m max; bulky.
- TRS 3.5 mm: mini-jack; 31250 baud; 0.3 ms per byte; 1-2 m; Type A vs Type B incompatible.
- USB-MIDI: USB A/B/C; very high rate; 1-3 ms; 5 m; active hub needed beyond.
- BLE-MIDI: Bluetooth LE; 2.4 GHz radio; 5-30 ms; 10 m; variable latency.
- RTP-MIDI: RJ45 or Wi-Fi; LAN speeds; 1-5 ms; unlimited distance; network configuration required.
TRS mini-jack: Type A vs Type B. Two wirings share one connector. Type A (gold; 2018+; Korg, Make Noise, Novation): data on Tip, +5V on Ring. Type B (red; pre-2018; Arturia, Akai, old Novation): data on Ring, +5V on Tip. Plugging A into B gives total silence: no damage, no signal.
Transport latency: wired vs wireless. Horizontal bars on an axis from 0 to 25 ms. DIN-5 about 1 ms and USB-MIDI about 2 ms (green, short, stable). RTP-MIDI about 4 ms (gold) with some jitter. BLE-MIDI about 15 ms (red) with a large dashed jitter range. Wired is short and stable; wireless BLE is long and variable, and the jitter is what desyncs a tight performance.
/ 09

THE LAYERS PUT ON TOP

seven conventions added on top of MIDI 1.0 to make it interoperable
MIDI 1.0 (§2 to §8) carries bytes between machines. But once the byte arrives: what does it mean? Program Change 0 — what sound plays exactly? How do you align a sequencer to a video tape down to a quarter-frame? How do you archive a session to replay tomorrow on another machine? MIDI 1.0 answers none of these questions. Seven conventions were put on top, at different periods, by the MIDI Manufacturers Association (the standards body) or by competing makers (Roland, Yamaha). None is mandatory — but without them, MIDI remains a dumb transport protocol, unable to guarantee that a file produced here will play there, that an external clock will sync the session, that a remote tape machine will obey the PLAY button.

GENERAL MIDI — GM (1991, MMA): the minimum contract for sound interoperability. 128 standardized programs numbered 0 to 127: Program 0 = Acoustic Grand Piano, Program 24 = Acoustic Guitar (Nylon), Program 56 = Trumpet, Program 80 = Lead Square Wave, etc. Channel 10 reserved for drum kits with note-to-percussion mapping (note 36 = Bass Drum 1, note 38 = Acoustic Snare, note 49 = Crash Cymbal 1). Minimum guaranteed polyphony: 24 voices. Mandatory support for All Notes Off (CC 123, §7) for clean reset. GM plugs into MIDI 1.0 via simple Program Change (§4) — no new mechanism, just a shared dictionary.

THE GM PITFALL, the most famous one — it is a minimum contract. GM fixes NAMES and POSITIONS, never the sonic rendering. Program 0 = "Acoustic Grand Piano" but the piano of a Roland Sound Canvas SC-55 (1991) sounds radically different from that of a Yamaha MU100 (1997), and both have nothing to do with a modern FluidSynth running on a free SoundFont bank. GM guarantees "it plays a piano on channel 1", not "it sounds the same". This is the classic misunderstanding with .mid files that have circulated on the Internet for twenty-five years.

GENERAL MIDI 2 — GM2 (1999, MMA): clean extension of GM by the same authority. 256 programs instead of 128, accessible via Bank Select (CC 0 coarse + CC 32 fine, §7) followed by Program Change. Minimum polyphony raised to 32 voices. Additional standardized CCs: Filter Cutoff (CC 74), Resonance (CC 71), Attack (CC 73), Release (CC 72), Vibrato Rate/Depth/Delay. A standardized set of RPNs (§7). Additional drum kits. Backward compatible: a GM file plays correctly on GM2 hardware. In practice: barely used. The majority of .mid files in circulation are GM; GM2 stays confined to DAWs and workstations that implement it out of thoroughness.

GS (Roland, 1991) and XG (Yamaha, 1994): the proprietary supersets. Appeared almost simultaneously with GM, in the commercial war of consumer workstations of the 90s (the famous sound modules SC-55, SC-88, MU80, MU100). Each extends GM with its own extended banks, its own CCs, its own SysEx — and they are not compatible with each other. A .mid file tagged GS will play correctly on a Roland Sound Canvas but lose all its flourishes on a Yamaha MU. Inverse for XG.

GS extends GM via Roland-specific Bank Select MSB combined with Roland SysEx (manufacturer ID 0x41, §6). XG does the same with its own Bank Select values and Yamaha SysEx (0x43). In practice today: historical relic. GS and XG survive via emulators (Roland SC-VA, Yamaha SYXG50) and GS/XG SoundFont collections. No new hardware implements them. Good historical knowledge to understand a 1996 .mid; no productive use today.

MTC — MIDI TIME CODE (1987, MMA): SMPTE timecode encapsulated in MIDI. SMPTE is the standardized video/film timecode (hours:minutes:seconds:frames), traditionally transmitted over a dedicated audio cable (LTC, Linear Time Code) between analog tape machines and mixing desks. MTC carries it on the MIDI bus via two mechanisms: Quarter-Frame Messages (status 0xF1, §6) emitted 4 times per SMPTE frame for continuous playback tracking, and Full Frame Messages (Universal SysEx) for one-shot jumps/locate. Lets you sync a MIDI sequencer to a video tape, a multitrack tape machine, or today a DAW to an external timecode supplier. Practical accuracy: 1/4 frame (~10 ms at 25 fps, finer at 30 fps). Still very much used in audio-video post-production and professional studio sync.
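A sketch of the Quarter-Frame mechanism, under stated assumptions: each 0xF1 message carries one data byte whose high three bits number the piece (0 to 7) and whose low four bits carry a nibble of the timecode, with the frame-rate code packed next to the top bit of the hours in piece 7. Function names and the `RATE_CODES` table are illustrative.

```python
# Frame-rate codes carried in piece 7 (29.97 stands for 30 drop-frame).
RATE_CODES = {24: 0, 25: 1, 29.97: 2, 30: 3}


def quarter_frames(hours: int, minutes: int, seconds: int, frames: int, fps=25) -> list:
    """Return the eight 0xF1 messages that spell out one SMPTE timecode."""
    rate = RATE_CODES[fps]
    nibbles = [
        frames & 0x0F, (frames >> 4) & 0x01,             # pieces 0-1: frames
        seconds & 0x0F, (seconds >> 4) & 0x03,           # pieces 2-3: seconds
        minutes & 0x0F, (minutes >> 4) & 0x03,           # pieces 4-5: minutes
        hours & 0x0F, ((hours >> 4) & 0x01) | (rate << 1),  # pieces 6-7: hours + rate
    ]
    return [bytes([0xF1, (piece << 4) | nibble]) for piece, nibble in enumerate(nibbles)]


def quarter_frame_interval_ms(fps: float) -> float:
    """Spacing between two Quarter-Frame messages at a given frame rate."""
    return 1000 / fps / 4


for message in quarter_frames(1, 2, 3, 4, fps=25):
    print(message.hex(" "))
print(quarter_frame_interval_ms(25))  # 10.0
```

Since eight pieces are needed and four are sent per frame, a receiver reads a complete timecode every two frames. That is why Full Frame messages exist for jumps.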

MMC — MIDI MACHINE CONTROL (1992, MMA): transport control in the sequencer sense. Here "transport" means session playback commands (Play, Stop, Record, Locate, Rewind, Fast Forward) — NOT TO BE CONFUSED with the physical transport of §8 (DIN, USB, BLE, RTP) which means the material supports carrying bytes. Same word, two completely distinct worlds. Encoded as Universal Real-Time SysEx (F0 7F <ID> 06 <command> F7), so in the Universal SysEx reservoir (§6). Lets a master machine (DAW) control the transport of a slave (multitrack tape, another DAW, hardware sequencer). Still very much used to pilot Pro Tools, Logic, Cubase from outside, or to sync several DAWs together.
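The frame F0 7F <ID> 06 <command> F7 can be sketched as below. The command codes listed are the common transport ones; the table and the function name `mmc` are assumptions for the example, and device ID 0x7F addresses all devices.

```python
# A few common MMC command codes (not the full command set).
MMC_COMMANDS = {
    "stop": 0x01,
    "play": 0x02,
    "fast_forward": 0x04,
    "rewind": 0x05,
    "record_strobe": 0x06,
}


def mmc(command: str, device_id: int = 0x7F) -> bytes:
    """Build an MMC command as a Universal Real-Time SysEx frame."""
    if not 0 <= device_id <= 0x7F:
        raise ValueError("device ID must fit in 7 bits")
    # 0x7F = Universal Real-Time, 0x06 = MMC command sub-ID
    return bytes([0xF0, 0x7F, device_id, 0x06, MMC_COMMANDS[command], 0xF7])


print(mmc("play").hex(" "))  # f0 7f 7f 06 02 f7
print(mmc("stop").hex(" "))  # f0 7f 7f 06 01 f7
```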

SMF — STANDARD MIDI FILE (1988, MMA): the disk storage format. Extension .mid (sometimes .smf). Lets you archive a complete MIDI performance — all notes, all CCs, tempo, time-signature changes, track names — in a transportable, exchangeable, replayable file. Three formats coexist in the spec:

- Format 0: all voices merged into a single track. Light, simple, but definitive loss of logical separation. Used for distribution (karaoke, old ringtones, .mid files on the web).
- Format 1: multiple named tracks, a dedicated tempo track. Standard DAW format for exports/imports. This is what you get when exporting a .mid from Logic, Cubase, Ableton, Reaper.
- Format 2: multiple independent sessions in a single file. Conceptually useful for song suites. In practice almost never used — most readers and DAWs ignore or reject this format.

THE SMF PITFALL: converting format 0 → format 1 does not recover the lost separation. The merge is definitive. That's why you generally export to format 1 and reserve format 0 only for final distribution. SMF also stores meta-events (non-sonic text): track names, copyright, lyrics, bar markers, time signature, key signature. These meta-events are NEVER transmitted on a real MIDI cable — they exist only in the .mid file, as archival metadata.
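To see which of the three formats a .mid file uses, it is enough to read its 14-byte header chunk: the tag "MThd", a 32-bit length (6), then three big-endian 16-bit fields for format, track count and time division. A minimal sketch, with a hypothetical function name and a header built by hand instead of a real file:

```python
import struct


def parse_smf_header(data: bytes) -> dict:
    """Decode the MThd chunk at the start of a Standard MIDI File."""
    if len(data) < 14:
        raise ValueError("too short for an MThd chunk")
    chunk_id, length, fmt, tracks, division = struct.unpack(">4sIHHH", data[:14])
    if chunk_id != b"MThd" or length < 6:
        raise ValueError("not a Standard MIDI File header")
    if fmt not in (0, 1, 2):
        raise ValueError(f"unknown SMF format {fmt}")
    # Top bit clear: division is ticks per quarter note.
    # Top bit set: SMPTE-based timing, left undecoded in this sketch.
    ticks = division if not division & 0x8000 else None
    return {"format": fmt, "tracks": tracks, "ticks_per_quarter": ticks}


# A hand-built header: format 1, 3 tracks, 480 ticks per quarter note
header = b"MThd" + struct.pack(">IHHH", 6, 1, 3, 480)
print(parse_smf_header(header))
```

With a real file, the same function would be called on the first 14 bytes read in binary mode.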

Seven layers, seven answers to seven needs MIDI 1.0 left open. GM, GM2, GS, XG say what sound plays. MTC says what time it is. MMC says how to remotely pilot the transport. SMF says how to archive. None was mandatory at the design of MIDI 1.0 in 1983; all became unavoidable in production. When you load a GM .mid into a DAW, sync a session on an incoming SMPTE timecode, control a tape machine via remote PLAY/STOP — you use these layers without always knowing it. They have stood for thirty years precisely because they answer needs that MIDI 1.0 alone does not cover.
MIDI 1.0 carries the bytes. these layers add meaning — what sound, what time, what file.
Diagram
Seven layers put on top of MIDI 1.0. Each row gives name, year, author, contribution and mechanism. MIDI 1.0 carries the bytes; these layers add meaning: what sound, what time, what file.
- GM: 1991, MMA; 128 standard sounds, channel 10 drums; via Program Change.
- GM2: 1999, MMA; 256 sounds, polyphony 32, standard CCs; via Bank Select + Program Change.
- GS: 1991, Roland; Roland proprietary extension; via SysEx 0x41.
- XG: 1994, Yamaha; Yamaha proprietary extension; via SysEx 0x43.
- MTC: 1987, MMA; SMPTE over MIDI (h:m:s:f); via Quarter-Frame 0xF1.
- MMC: 1992, MMA; sequencer transport (Play/Stop); via Universal SysEx.
- SMF: 1988, MMA; .mid file (formats 0/1/2); file only.
The stack: what builds on MIDI 1.0. At the base, MIDI 1.0 (gold): notes, CC, SysEx, 16 channels. Above it, General MIDI: the minimum sound contract. Above that, three competing extensions: GS (Roland), XG (Yamaha), GM2 (MMA). To the side, a dashed box holds MTC, MMC and SMF, for sync and storage, which do not touch the sounds.
Standard MIDI File: formats 0 / 1 / 2. Three panels. Format 0 (green): one track, everything merged into a single stream. Format 1 (gold): several synchronous named tracks sharing one clock, the everyday format. Format 2 (blue): several independent patterns.
/ 10

MPE

one channel per note — expression becomes polyphonic
In MIDI 1.0, Pitch Bend (status 0xE, §4) and Channel Aftertouch (0xD) act on the entire channel. When you play a three-note chord on one channel and a pitch bend arrives, all three notes bend together, by the same amount. Same for pressure: Channel Aftertouch applies a single value to all active notes on the channel. This is a structural limit, not a flaw: the channel is the addressing unit for continuous expression (§5). For a piano or an organ, it doesn't matter — those instruments have no per-note continuous expression. But for anyone wanting to play like an expressive acoustic instrument — bending a single string of a guitar chord, adding vibrato to a single voice of a choir — MIDI 1.0 cannot follow. Historically you cheated: a channel assigned by hand per note ("mono per channel" mode), tedious setups, no standardization.

MPE (MIDI Polyphonic Expression), standardized by the MIDI Manufacturers Association in January 2018 (spec CA-034), solves the problem with a simple idea: route each played note to a different channel, in rotation. One note = one channel. As soon as each note lives on its own channel, Pitch Bend, Channel Aftertouch and CCs become per-note again — since they act on a channel that holds only one note. The per-channel expression of MIDI 1.0 mechanically becomes per-note expression. No new message type was invented: MPE is a convention for using the existing channels, not an extension of the protocol. That's what makes it backward-compatible with all MIDI 1.0 transport (§2 to §8).

The MPE zone. MPE organizes the 16 channels into zones. A zone has a Master Channel and Member Channels. The Lower Zone uses channel 1 as master and channels 2 onward as members (up to 15 members, channels 2 to 16). The Upper Zone uses channel 16 as master and channels 15 and below as members. Both zones can coexist (master 1 + master 16, members split in the middle), allowing two controllers or two independent sounds on the same port. The Master Channel carries global messages applying to all notes in the zone: sustain (CC 64), overall pitch bend, global modulation. Member Channels each carry one note with its individual expression.

The handshake — MCM. So the receiver knows it is receiving MPE and how many member channels the zone uses, the sender emits an MPE Configuration Message (MCM): a specific RPN (RPN 6, §7) sent on the zone's master channel, whose Data Entry value is the number of member channels. Sent on channel 1, RPN 6 = 7 means "Lower Zone with 7 members" (channels 2 to 8); sent on channel 16, the same value configures an Upper Zone on channels 9 to 15. RPN 6 = 0 disables the zone. It is the handshake that tells the synth "prepare to receive one note per channel". Without a correctly received and interpreted MCM, the synth treats the stream as ordinary multi-channel.
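The MCM is an ordinary RPN sequence, so it can be sketched with three Control Change messages on the master channel. The function name and the `zone` argument are assumptions for the example. The RPN number is sent here as CC 101 then CC 100, following the order used in §7; devices select the parameter once both have arrived.

```python
def mpe_configuration(member_channels: int, zone: str = "lower") -> bytes:
    """Build an MPE Configuration Message: RPN 6 on the zone's master channel."""
    if zone not in ("lower", "upper"):
        raise ValueError("zone must be 'lower' or 'upper'")
    if not 0 <= member_channels <= 15:
        raise ValueError("a zone has 0..15 member channels")
    master = 1 if zone == "lower" else 16
    status = 0xB0 | (master - 1)
    return bytes([
        status, 101, 0,               # RPN MSB = 0
        status, 100, 6,               # RPN LSB = 6 -> MPE configuration
        status, 6, member_channels,   # Data Entry MSB = number of members
    ])


print(mpe_configuration(7).hex(" "))           # Lower Zone, channels 2 to 8
print(mpe_configuration(0, "upper").hex(" "))  # disable the Upper Zone
```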

The five expression dimensions. MPE captures five per-note dimensions, with the terminology popularized by Roli widely adopted:

- Strike — Note On velocity, initial press speed. An instant, captured at attack.
- Press — the Channel Aftertouch of the note's channel, so the pressure held after attack. Continuous.
- Slide — CC 74 (brightness / timbre), typically the vertical finger movement on the surface. Continuous.
- Glide — the Pitch Bend of the note's channel, typically the horizontal finger movement. Continuous.
- Lift — Note Off velocity (release velocity, §4), release speed. An instant, captured at release.

Three are continuous (Press, Slide, Glide) — they evolve as long as the note lasts. Two are instants (Strike at attack, Lift at release). Together they turn each note into a complete expressive gesture, from touch to release.

The controllers. MPE hardware preceded then accompanied standardization. The Roli Seaboard (RISE, GRAND, BLOCK) offers a continuous silicone surface, without discrete keys, where you slide and press. The LinnStrument (Roger Linn, creator of the MPC) is a grid of pressure- and slide-sensitive pads in a fourths layout. The Haken Continuum and its smaller sibling the ContinuuMini, a high-end continuous surface, pioneered per-note expression before MPE even existed. The Expressive E Osmose blends a familiar key-bed with augmented continuous control. Eigenharp, Joué Play, Sensel Morph explore other surfaces. On the sound side, compatible synths and plugins multiplied: Equator and Equator2 (Roli), Pigments (Arturia), Diva and Repro (u-he), Serum 2, Bitwig's native instruments, and those of Ableton Live and Logic.

Compatibility and the central pitfall. On the DAW side, native support arrived gradually: Bitwig very early, then Logic Pro 10.5 (2020), Ableton Live 11 (2021), Cubase 11. THE PITFALL: channel allocation is not magic. A zone configured with 7 member channels plays 7 truly independent notes. The 8th simultaneous note must recycle an already-occupied channel — and then inherits the expression of the note that was there, or overwrites its own. Result: bends or pressures that "jump" from one note to another unpredictably as soon as you exceed the zone's polyphony. More subtle still: the sending controller and the receiving synth must be configured on the same zone and the same number of members. The MCM is supposed to announce this, but not every device reconfigures automatically. A mismatch and it plays, but expression goes haywire — a member channel on the sender side lands on a channel interpreted differently on the receiver side.
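The recycling problem is easy to reproduce with a toy allocator. The class below is a sketch, not how any particular controller allocates: it hands out member channels in rotation, prefers a free one, and steals the next channel in line when the zone is full. All names are hypothetical.

```python
class MpeAllocator:
    """Assign each new note to a member channel, in rotation."""

    def __init__(self, members: int = 7, first_member: int = 2):
        self.channels = list(range(first_member, first_member + members))
        self.next_index = 0
        self.active = {}  # channel -> note currently held on it

    def note_on(self, note: int) -> tuple:
        """Return (channel, stolen_note); stolen_note is None if the channel was free."""
        count = len(self.channels)
        for offset in range(count):
            index = (self.next_index + offset) % count
            if self.channels[index] not in self.active:
                break  # found a free channel
        else:
            index = self.next_index  # zone full: recycle the next one in line
        channel = self.channels[index]
        stolen = self.active.get(channel)
        self.active[channel] = note
        self.next_index = (index + 1) % count
        return channel, stolen

    def note_off(self, note: int):
        """Free the channel holding this note; return it, or None if not found."""
        for channel, held in list(self.active.items()):
            if held == note:
                del self.active[channel]
                return channel
        return None


zone = MpeAllocator(members=7)
print([zone.note_on(n)[0] for n in range(60, 67)])  # seven notes, seven channels
print(zone.note_on(67))  # eighth note: a channel is recycled
```

The eighth note lands on channel 2 while note 60 is still sounding there, so any bend or pressure meant for the new note also moves the old one.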

MPE is not universal. A synth without MPE support — most pre-2018 hardware, and plenty of entry-level gear even today — receives the MPE stream as ordinary multi-channel. Two possible behaviors, both wrong: either it ignores the member channels and plays everything on channel 1 (per-note expression lost, but it plays), or it treats each channel as a separate multitimbral voice and over-consumes its polyphony (each note takes a voice slot on a different channel). Before plugging a Seaboard into a hardware synth, check it advertises MPE support — otherwise plan a single-channel fallback.

MPE reinvents nothing: it cleverly repurposes MIDI 1.0's channel addressing to obtain per-note expression, without touching the transport protocol. One note, one channel, five dimensions. The keyboard stops being an on/off switch and becomes a continuous instrument, where each finger sculpts its own voice from touch to release. It is the last great idea built on MIDI 1.0 — and the conceptual bridge to MIDI 2.0 (§11), which will raise per-note expression to a native property of the protocol, no longer needing the channel trick.
MIDI 1.0: pitch bend per channel. MPE: pitch bend per note. the instrument stops being on/off.
Diagram
Poly mode lets a channel play several notes at once; Mono mode plays one note at a time. This is a channel mode, not polyphony.Left, Poly mode: three pressed keys all feed one channel — many notes at once. Right, Mono mode: a single note feeds the channel — one note at a time. This is a channel mode set by CC 124-127, not a voice count. MPE rests on it: each member channel runs in Mono so one note carries all its expression.POLY MODE vs MONO MODE — NOT POLYPHONYPOLY MODEcanal 1one channel, many notes at onceMONO MODEcanal 1one channel, one note at a timea channel mode (CC 124-127), not a voice count. MPE rests on this:each member channel runs in Mono — one note, all its expression.
In MIDI 1.0 pitch bend applies to the whole channel; all notes of a chord bend together.A box labeled CHANNEL 1 contains three notes DO, MI, SOL. A single Pitch Bend message 0xE1 enters the channel from the left. Three identical blue upward arrows above the box show all three notes bend by the same amount — impossible to bend a single note of the chord in MIDI 1.0.MIDI 1.0 — PITCH BEND ACTS ON THE WHOLE CHANNELall bend identicallyCHANNEL 1PITCH BEND 0xE1DOMISOLpitch bend acts on the channel. all three notes bend together —impossible to bend a single note of the chord.
MPE: ONE CHANNEL PER NOTE, INDEPENDENT EXPRESSION. MPE routes each note to its own channel so expression becomes per-note. A master channel (ch1) carries the zone-wide messages, such as sustain and all-notes-off. Seven member channels, ch2 to ch8, follow. Four notes are assigned in rotation: note 1 on ch2, note 2 on ch3, note 3 on ch4, note 4 on ch5. Three arrows rise from each note: glide (pitch bend), press, slide (timbre). Pitch, pressure and timbre become independent per note.
THE FIVE EXPRESSION DIMENSIONS, PER NOTE
DIMENSION | GESTURE | MIDI SOURCE | TYPE
STRIKE | press speed | Note On velocity | instant
PRESS | held pressure | Channel Aftertouch | continuous
SLIDE | vertical slide | CC 74 (timbre) | continuous
GLIDE | horizontal slide | Pitch Bend | continuous
LIFT | release speed | Note Off velocity | instant
Three are continuous (press, slide, glide): they evolve for as long as the note lasts. Two are instants: strike at the attack, lift at the release.
/ 11

MIDI 2.0

the protocol learns to converse — resolution, per-note expression, 256 channels
MIDI 1.0 (§2 to §10) is a monologue. The sender emits its bytes, the receiver takes them, and no one ever replies. A keyboard knows nothing about the synth plugged in front of it — how many sounds it has, which CCs it understands, what its presets are called. Resolution is 7-bit: 128 steps for velocity, 128 for each CC. Expression is per-channel (§5), hence the MPE trick (§10) to make it per-note. Sixteen channels, full stop. These limits date from 1983 — a brilliant compromise for the electronics of the time, but a compromise. Forty years later we still hit them: audible stepping on a slowly swept filter, no way to automatically display a synth's patch names in the DAW, per-note expression cobbled together.

MIDI 2.0, adopted in January 2020 by the MIDI Manufacturers Association and its Japanese counterpart, AMEI, throws nothing away. It is a backward-compatible extension: a MIDI 2.0 device can speak MIDI 1.0 to an old synth, and two 2.0 devices together unlock the new features. The philosophy fits in one word: dialogue. Where MIDI 1.0 sent blind, MIDI 2.0 makes devices introduce themselves, negotiate and reply. Everything else (resolution, channels, per-note expression) flows from this new ability to converse.

MIDI-CI: the cornerstone. MIDI Capability Inquiry is the handshake. When two devices connect, one asks the other: "do you speak MIDI 2.0? which profiles do you support? which parameters do you expose?". MIDI-CI travels inside Universal SysEx messages (§6), so it is compatible with any existing MIDI 1.0 transport. Everything depends on this handshake: without MIDI-CI, no MIDI 2.0. And crucially, the conversation always starts in MIDI 1.0. If one side doesn't reply or doesn't understand, you simply stay in 1.0. That is graceful degradation, the safety net that guarantees nothing ever breaks.
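
The fallback rule can be written as a tiny decision function. This is only a sketch of the logic described above, not of the wire format: DiscoveryReply and choose_protocol are hypothetical names, and a real implementation would parse the SysEx reply and apply a timeout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiscoveryReply:
    """Hypothetical summary of what a peer answered to a Discovery inquiry."""
    supports_midi2: bool
    profiles: tuple = ()


def choose_protocol(reply: Optional[DiscoveryReply]) -> str:
    """The link starts in MIDI 1.0 and only moves up if the peer answers."""
    if reply is None:
        # Timeout, or an old device that silently ignores the inquiry.
        return "MIDI 1.0"
    if not reply.supports_midi2:
        return "MIDI 1.0"
    return "MIDI 2.0"


old_synth = choose_protocol(None)
new_pair = choose_protocol(DiscoveryReply(supports_midi2=True))
```

The default branch is MIDI 1.0, so silence and refusal both lead to the same safe state.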

The three pillars of MIDI-CI. MIDI-CI covers three distinct domains:

- Profiles: a device announces it implements a known profile: "I am a drawbar organ", "I am a General MIDI 2 device". The profile fixes the meaning of the controllers: the receiver then knows that such-and-such CC drives such-and-such drawbar, without manual configuration. Plug in and play, with the right mappings from the start.
- Property Exchange: devices exchange structured metadata, in JSON, via SysEx. Patch names, preset lists, controller state, configuration. This is what lets a DAW automatically display the sound names of a hardware synth, instead of an anonymous "Program 0, Program 1…" list.
- Protocol Negotiation: the mechanism that decides which protocol, 1.0 or 2.0, the link uses, and switches from one to the other. Later revisions of MIDI-CI (version 1.2, 2023) deprecate this mechanism and move the choice of protocol to UMP endpoint discovery, so check which revision a device implements.
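
To make Property Exchange concrete, here is how a host might turn a JSON reply into patch labels. The JSON shape, the field names "program" and "title" and the patch names are all invented for this sketch; the real resource schemas are defined by the Property Exchange specifications.

```python
import json

# Hypothetical reply body: the field names and values are made up for this sketch.
reply_body = """
[
  {"program": 0, "title": "Warm Pad"},
  {"program": 1, "title": "Glass Keys"}
]
"""


def patch_names(body: str) -> dict:
    """Map program numbers to the names announced by the device."""
    return {entry["program"]: entry["title"] for entry in json.loads(body)}


names = patch_names(reply_body)


def label(program: int) -> str:
    # Fall back to the anonymous MIDI 1.0 style label when no name is known.
    return names.get(program, f"Program {program}")
```

A program the device did not describe still gets the old "Program N" label, which is what a MIDI 1.0 host would have shown for every entry.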

UMP and Groups. MIDI 2.0 introduces a new container: the Universal MIDI Packet (UMP). Where MIDI 1.0 sent a variable-length byte stream (§3), the UMP is a fixed-size packet of 32, 64, 96 or 128 bits, depending on the message type. A single UMP stream carries MIDI 1.0 and MIDI 2.0 messages alike. UMP organizes everything into Groups: 16 Groups, each holding 16 channels. Sixteen times sixteen: 256 channels per connection. Each Group can run in 1.0 or 2.0 protocol independently. The old "channel 10 = drums" becomes "Group 0, channel 10": addressing gains a dimension.
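
A sketch of the packing helps to see the Group and the fixed sizes. It assumes the layout I understand the UMP specification to use: the top 4 bits are the message type (0x2 for a MIDI 1.0 channel voice message in 32 bits, 0x4 for a MIDI 2.0 channel voice message in 64 bits), followed by 4 bits of Group. Function names are hypothetical, and Group and channel are zero-based here, so "channel 10" is index 9.

```python
def ump_midi1_note_on(group, channel, note, velocity7):
    """32-bit UMP, message type 0x2: a MIDI 1.0 Note On wrapped with a Group."""
    status = 0x90 | channel
    return (0x2 << 28) | (group << 24) | (status << 16) | (note << 8) | velocity7


def ump_midi2_note_on(group, channel, note, velocity16):
    """64-bit UMP, message type 0x4: a MIDI 2.0 Note On as two 32-bit words."""
    status = 0x90 | channel
    # Low byte of word 0 is the attribute type, left at 0 (no attribute).
    word0 = (0x4 << 28) | (group << 24) | (status << 16) | (note << 8)
    # Word 1: 16-bit velocity in the high half, attribute data (0) in the low half.
    word1 = velocity16 << 16
    return word0, word1


def flat_channel(group, channel):
    """Position of (Group, channel) among the 256 channels of one connection."""
    return group * 16 + channel


kick_1_0 = ump_midi1_note_on(group=0, channel=9, note=36, velocity7=100)
kick_2_0 = ump_midi2_note_on(group=0, channel=9, note=36, velocity16=0xC800)
last = flat_channel(15, 15)   # 255, the last of the 256 channels
```

The same drum hit fits in one word under protocol 1.0 and in two words under 2.0; only the message type nibble and the width of the velocity field change.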

Resolution explodes. Velocity goes from 7-bit (128 values) to 16-bit (65,536 values). Control Changes go from 7-bit to 32-bit. Pitch bend goes from 14-bit to 32-bit. Concretely: a slowly swept filter no longer exposes its individual steps, and the movement becomes continuous again, as on an analog instrument. A musician's nuance on an expressive controller is no longer coarsely quantized. It is the end of zipper noise, the zip-fastener sound heard on very slow MIDI 1.0 automation.
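
When a 7-bit value has to travel on a wider scale, a plain left shift is not enough: 127 would never reach the maximum. The sketch below uses a "min, center, max" bit-repeat scheme modeled on what I understand the MIDI 2.0 translation rules to recommend; treat the algorithm as an assumption and check it against the specification before relying on it. The function name scale_up is hypothetical.

```python
def scale_up(value, src_bits, dst_bits):
    """Widen a value so that 0 stays 0, the center stays centered, max reaches max."""
    scale_bits = dst_bits - src_bits
    shifted = value << scale_bits
    center = 1 << (src_bits - 1)
    if value <= center:
        # Lower half and center: a plain shift keeps 0 and the midpoint exact.
        return shifted
    # Upper half: repeat the bits below the top bit to fill the new low bits.
    repeat_bits = src_bits - 1
    repeat = value & ((1 << repeat_bits) - 1)
    if scale_bits > repeat_bits:
        repeat <<= scale_bits - repeat_bits
    else:
        repeat >>= repeat_bits - scale_bits
    while repeat:
        shifted |= repeat
        repeat >>= repeat_bits
    return shifted


velocity16 = scale_up(127, 7, 16)      # 0xFFFF
cc32_center = scale_up(64, 7, 32)      # 0x80000000
bend32_max = scale_up(16383, 14, 32)   # 0xFFFFFFFF
```

Widening does not create detail that was never measured: a 7-bit source still produces only 128 distinct values. The smooth sweep comes from controllers that sense at the higher resolution in the first place.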

Per-note expression becomes native. MIDI 2.0 builds in directly what MPE cobbled together with the channel trick (§10). Per-Note Pitch Bend, Per-Note Controllers, per-note articulation: each note carries its own expression without sacrificing a channel. Where MPE configured a zone of 15 member channels for 15 independent notes at most, MIDI 2.0 gives per-note expression to every note on a single channel. The trick becomes unnecessary — per-note expression is now a native property of the protocol. Registered Controllers (heirs of RPN, §7) and Assignable Controllers (heirs of NRPN) also gain resolution and proper addressing.
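
Here is what "every note on a single channel" looks like at the packet level. The sketch assumes the MIDI 2.0 channel voice layout as I understand it: message type 0x4, status nibble 0x6 for Per-Note Pitch Bend, the note number in the third byte, and a 32-bit bend value in the second word with 0x80000000 as center. The function name is hypothetical.

```python
PER_NOTE_PITCH_BEND = 0x6   # status nibble, assumed from the MIDI 2.0 channel voice table
CENTER = 0x80000000         # no bend on the unsigned 32-bit scale


def ump_per_note_pitch_bend(group, channel, note, bend32):
    """64-bit UMP that bends one note only, identified by its note number."""
    word0 = (
        (0x4 << 28)
        | (group << 24)
        | (PER_NOTE_PITCH_BEND << 20)
        | (channel << 16)
        | (note << 8)
    )
    return word0, bend32


# A C major chord held on one channel: only the E is bent upward.
bends = {60: CENTER, 64: 0x90000000, 67: CENTER}
packets = [ump_per_note_pitch_bend(0, 0, note, bends[note]) for note in (60, 64, 67)]
```

All three packets carry channel 0; the note number inside the message does the addressing that MPE had to delegate to separate channels.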

THE PITFALL: MIDI 2.0 is not a cable. There is no "MIDI 2.0 cable". The historical DIN-5 (§8) stays MIDI 1.0: each cable is unidirectional and slow, whereas MIDI-CI needs a two-way link. MIDI 2.0 is carried mainly over USB-MIDI 2.0 (natively full-duplex) and over networks. Buying a device labeled "MIDI 2.0" is not enough: the whole chain (the OS, the driver, the DAW, the other device) must support MIDI 2.0 end to end. If one link doesn't follow, MIDI-CI gracefully drops everyone back to MIDI 1.0. It is intended, it is the safety net, but it means you can believe you're using MIDI 2.0 while actually still in 1.0.

Slow adoption, and bidirectionality. The spec came out in 2020; real support arrives in trickles. macOS integrated UMP into CoreMIDI as early as Big Sur (2020). Linux gained MIDI 2.0 support in ALSA with kernel 6.5 (2023). Microsoft is rolling out Windows MIDI Services, its MIDI 2.0 stack, gradually from 2024-2025. On the hardware side, a few keyboards and controllers (the Roland A-88MKII among the pioneers) expose MIDI-CI, but many "MIDI 2.0-ready" products in practice implement only MIDI-CI and Profiles, without the full 32-bit resolution. Check what is actually implemented, not the logo on the box. Another mental shift: MIDI 2.0 is bidirectional. MIDI 1.0 over DIN was unidirectional, with one cable per direction (§2). MIDI-CI assumes a return channel. Over USB it is native and transparent; but the topology inherited from DIN, "I send, full stop", no longer holds.

MIDI 2.0 does not replace MIDI 1.0: it wraps and extends it. The same heritage — notes, channels, CCs, SysEx — but with dialogue added, resolution added, native per-note expression, 256 channels. Backward compatibility guarantees that none of the ten previous organs becomes obsolete: a Note On stays a Note On. What changes is that two devices can finally talk to each other instead of emitting into the void. After forty years of monologue, MIDI learns conversation. One last territory remains: what musicians do with the protocol when they bend its rules — §12.
MIDI 1.0 speaks. MIDI 2.0 converses. same heritage, two devices that finally reply to each other.
Diagram
MIDI 1.0 SPEAKS · MIDI 2.0 CONVERSES. MIDI 1.0 is a one-way monologue, MIDI 2.0 is a two-way dialogue via MIDI-CI. Left, MIDI 1.0 (monologue): a SENDER box with a single arrow to a RECEIVER box, no reply possible. Right, MIDI 2.0 (dialogue): device A and device B joined by a double arrow; MIDI-CI lets them ask "do you speak 2.0?" and negotiate. One sends and the other takes it, versus both introduce themselves and negotiate.
RESOLUTION: 7-BIT vs 32-BIT. Left: a coarse staircase standing for 7-bit resolution (128 steps, audible stepping). Right: a smooth diagonal ramp standing for 32-bit resolution (about 4 billion steps, continuous motion). A slowly swept filter steps audibly in 7-bit and moves continuously in 32-bit: the end of zipper noise.
UMP: 16 GROUPS × 16 CHANNELS = 256 CHANNELS. One full-duplex UMP connection holds 16 Groups in a row, G0 to G15. A zoom on Group 0 shows its 16 channels, numbered 1 to 16. Each Group holds 16 channels, in protocol 1.0 or 2.0. Addressing gains a dimension: Group plus channel. Per-note expression is native, so no channel is sacrificed as in MPE.
MIDI-CI: THE HANDSHAKE SEQUENCE. A sequence diagram between device A and device B. Step 1: A sends Discovery ("who are you?"). Step 2: B replies with its profiles and properties. Step 3: they negotiate the protocol. Result: if both are OK they switch to MIDI 2.0, otherwise they stay in 1.0, gracefully. It all starts in MIDI 1.0, the dialogue engages only if both reply, and nothing breaks.
/ 12

DIVERSIONS

what artists do with MIDI when they step outside music
The whole card has shown how MIDI works. This last section shows what artists do with it when they step outside music. The mechanism is simple: MIDI is a protocol of discrete, timed messages (a note on, a note off, a control change) carrying values from 0 to 127 (§3, §4). This very simplicity makes it easy to divert toward almost any target: anything that obeys an event ("go now") or a value ("set this parameter to that") can be driven by MIDI. Light, motors, video, networks, installations. Music is just one of its uses: historically the first, not the only one. It is the freest territory of the protocol.

MIDI Show Control (MSC), standardized by the MMA in 1991, is the first official diversion. It was designed to drive an entire show (lighting, machinery, video projection, sound, pyrotechnics) from a single control room. It is encoded as Universal Real-Time SysEx (§6), with commands like GO, STOP, RESUME, TIMED_GO, LOAD. A lighting console (ETC Eos, ChamSys MagicQ, grandMA), a media server and a machinery system all receive their cues from one stream. MSC has run big shows for thirty years: Broadway, Cirque du Soleil, Disney and Universal parks, where hundreds of light, sound and motion cues fire to the quarter-second.
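
An MSC cue is short enough to build by hand. The sketch assumes the framing as I read it in the MSC specification: F0 7F, a device ID, the sub-ID 02, a command format, a command, optional data, F7, with the cue number sent as ASCII characters. The command values and the lighting format 0x01 should be checked against the specification; msc_message is a hypothetical helper, and TIMED_GO is left out because it needs extra timecode bytes.

```python
MSC_COMMANDS = {"GO": 0x01, "STOP": 0x02, "RESUME": 0x03, "LOAD": 0x05}
LIGHTING = 0x01   # command format: general lighting (assumed value)


def msc_message(device_id, command, cue="", command_format=LIGHTING):
    """Build one MSC message as a Universal Real-Time SysEx byte string."""
    header = bytes([0xF0, 0x7F, device_id, 0x02, command_format, MSC_COMMANDS[command]])
    cue_bytes = cue.encode("ascii")   # digits and "." travel as plain ASCII
    return header + cue_bytes + bytes([0xF7])


go_cue_12 = msc_message(device_id=0x01, command="GO", cue="12")
stop_all = msc_message(device_id=0x01, command="STOP")
```

go_cue_12 is nine bytes long: the six-byte header, the two ASCII digits of the cue, and the closing F7.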

MIDI to light (DMX512). DMX512 is the standard stage-lighting protocol: 512 channels per universe, each 0 to 255, to drive fixtures, dimmers, fog machines, moving heads. MIDI connects to it via MIDI-to-DMX bridges (Enttec DMX USB Pro, dedicated modules) or software (QLC+, Lightjams, ChamSys MagicQ, Chataigne). A note triggers a cue, a CC drives an intensity or a fade. It is the historical bridge between the sound desk and the lighting desk: the electronic pad triggering a flash, the Ableton Live set driving all the lighting via a Max for Live device or a show-control plugin. The musician becomes their own lighting designer.
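
The core of any MIDI-to-DMX bridge is a range conversion and a patch table. This sketch keeps the universe in memory only; sending it to a fixture would need an interface library that is not shown. The patch CC_TO_DMX_CHANNEL and the function names are hypothetical.

```python
def cc_to_dmx(value7):
    """Stretch 0..127 onto 0..255 by repeating the top bit at the bottom."""
    return (value7 << 1) | (value7 >> 6)


universe = bytearray(512)   # one DMX universe, every channel at 0

# Hypothetical patch: which CC number drives which DMX channel (1-based).
CC_TO_DMX_CHANNEL = {1: 1, 7: 2}


def handle_cc(cc_number, value7):
    """Write the scaled CC value into the patched DMX channel, if any."""
    dmx_channel = CC_TO_DMX_CHANNEL.get(cc_number)
    if dmx_channel is not None:
        universe[dmx_channel - 1] = cc_to_dmx(value7)


handle_cc(7, 127)   # DMX channel 2 goes to 255, full intensity
handle_cc(1, 64)    # DMX channel 1 goes to 129, about half
levels = len({cc_to_dmx(v) for v in range(128)})   # 128 of the 256 DMX levels
```

Only 128 of the 256 DMX levels are ever reached, which is why a slow fade driven by a single CC looks stepped.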

MIDI to visuals. VJing and real-time video art rely heavily on MIDI for triggering and control. Resolume (VJ), TouchDesigner (Derivative — node-based environment for installations, data viz and projection mapping), Max/Jitter (Cycling ’74 — the video and matrix side of Max: real-time, generative, OpenGL image processing), MadMapper (surface mapping), Notch, VDMX, Millumin, Isadora (stage performance) all receive MIDI natively: a note launches a video clip, a CC drives an effect, a fader controls the layer mix. TouchDesigner converts incoming MIDI into a CHOP, then routes it to any visual parameter. The music controller becomes an image controller — the same APC40, Launchpad or Push drives samples or projections interchangeably.

MIDI to the creative network (OSC). When MIDI hits its resolution or structure limits, you relay it to OSC (Open Sound Control): a richer network protocol — hierarchical addresses, floating-point values, high resolution, UDP/TCP transport. MIDI triggers, OSC distributes. Creative-coding environments bridge the two: Max/MSP (Cycling '74) with its midiin, notein, ctlin objects; Pure Data (the open-source equivalent); Processing (TheMidiBus library); openFrameworks; vvvv; and above all Chataigne, a free multi-protocol orchestrator (MIDI, OSC, DMX, Art-Net, NDI, HTTP) that has become central to interactive installations. MIDI comes in, is translated, routed, and goes back out as OSC toward a cluster of machines, a game engine (Unity, Unreal Engine), a render server.
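
The relay step can be sketched with the standard library alone. An OSC message is an address string, a type-tag string, then the arguments, each padded to a multiple of four bytes; floats are 32-bit big-endian. The address pattern /midi/cc/N is a made-up example, not a convention of any of the tools named above, and sending the bytes over UDP is left out.

```python
import struct


def osc_string(text):
    """ASCII string with a null terminator, padded to a multiple of 4 bytes."""
    data = text.encode("ascii")
    return data + b"\x00" * (4 - len(data) % 4)


def osc_float_message(address, value):
    """One OSC message carrying a single float32 argument."""
    return osc_string(address) + osc_string(",f") + struct.pack(">f", value)


def cc_to_osc(cc_number, value7):
    # Normalize the 7-bit CC to 0.0..1.0 before it leaves the MIDI world.
    return osc_float_message(f"/midi/cc/{cc_number}", value7 / 127)


full_fader = osc_float_message("/fader/1", 1.0)
timbre = cc_to_osc(74, 64)
```

Once the value is a float on the OSC side, downstream machines can interpolate or smooth it without being tied to the 128 steps of the source CC.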

Hardware hacking. MIDI is the makers' favorite language because it is trivial to emit from a microcontroller. Arduino (with the MIDI Library or a MIDI shield), and above all Teensy (PJRC), which announces itself natively as a USB-MIDI Class Compliant device — no driver, the OS recognizes it on plug-in (§8) — are the basic tools. You wire any sensor to them: accelerometer, flex sensor, photoresistor, ultrasonic distance, capacitive surface, potentiometer. The physical gesture becomes a note or a CC. Bela offers an ultra-low-latency real-time platform for the augmented instrument; a Raspberry Pi runs Pure Data or RNBO (Max's C++ export, embeddable on Pi, VST, web). The DIY controller, the augmented instrument, the bespoke gestural interface: all go through MIDI because every music application already understands it, with no configuration.
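
The sensor-to-CC step is mostly scaling and filtering. The sketch is in Python for consistency with the other examples; on an Arduino or Teensy the same logic would be written in C++ with the board's MIDI library. It assumes a 10-bit analog reading (0 to 1023), and the function names are hypothetical.

```python
def adc_to_cc(raw10):
    """Reduce a 10-bit sensor reading (0..1023) to a 7-bit CC value (0..127)."""
    return raw10 >> 3


def cc_stream(readings, cc_number=1, channel=0):
    """Yield a raw Control Change message only when the 7-bit value changes."""
    last = None
    for raw in readings:
        value = adc_to_cc(raw)
        if value != last:
            last = value
            yield bytes([0xB0 | channel, cc_number, value])


# Seven sensor readings collapse into four messages.
messages = list(cc_stream([0, 3, 7, 8, 512, 515, 1023]))
```

Sending only on change keeps a jittery sensor from flooding the link with repeated identical values.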

MIDI to the physical world. At the end of the chain, actuators: solenoids that strike, stepper motors, servos, relays that switch. A note becomes a hammer strike, a CC an axis position. This is robotic music and kinetic installation: modern player pianos (Yamaha Disklavier), the automaton ensembles of the Logos Foundation (Godfried-Willem Raes), the musical machines of Felix Thorn (Felix's Machines), countless MIDI-driven sound sculptures. A note equals an actuator. The protocol designed for synthesizers now drives objects that strike, turn, light up, blow.

MIDI as material. The protocol itself becomes the subject of the work. Generativity: algorithms — Markov chains, cellular automata, L-systems — produce MIDI streams no human would play. Live coding: Sonic Pi, TidalCycles, Orca (Hundred Rabbits — an esoteric sequencer that drives MIDI, OSC and UDP from a grid of characters) generate MIDI live, keystroke by keystroke. Sonification: data — weather, stock prices, environmental sensors, network traffic — becomes MIDI, hence sound or image. And the aesthetic diversion of the protocol itself: MIDI feedback loops, deliberate glitch, stream saturation as raw material.
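
A first-order Markov chain is the smallest of these generators. The transition table below is an arbitrary example chosen for this sketch, and the output is a list of raw Note On messages rather than a live stream; timing and Note Offs are left out.

```python
import random

# Arbitrary example table: from each note, the notes that may follow it.
TRANSITIONS = {
    60: [62, 64, 67],
    62: [60, 64],
    64: [62, 65, 67],
    65: [64, 67],
    67: [60, 64, 65],
}


def markov_notes(start, length, seed=0):
    """Random walk over TRANSITIONS, reproducible for a given seed."""
    rng = random.Random(seed)
    note = start
    for _ in range(length):
        yield note
        note = rng.choice(TRANSITIONS[note])


def as_note_ons(notes, velocity=100, channel=0):
    """Turn note numbers into raw 3-byte Note On messages."""
    return [bytes([0x90 | channel, n, velocity]) for n in notes]


stream = as_note_ons(markov_notes(60, 16))
```

Changing the seed changes the melody while the table keeps its harmonic character, which is the usual appeal of the technique.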

Three field pitfalls recur whenever you divert MIDI outside music. First, bandwidth and resolution, inherited from 1983: the 31,250 baud link (§8) saturates quickly under dense control, and 7-bit resolution (128 steps, §3 and §11) is coarse for a slow lighting fade or fine motor movement. Hence the field rule: MIDI stays the trigger ("go now"), and for the heavy, continuous load you relay to OSC or Art-Net downstream. Second, the rule that a Note On with velocity 0 equals a Note Off (§4) resurfaces cruelly: a solenoid meant to strike that receives a velocity of 0 does nothing, because the message is read as a release. It is the classic beginner pitfall in MIDI hardware: the actuator "never responds" while the message arrives just fine. Finally, latency and jitter: for a sound-video-light installation locked to the frame, the jitter of a wireless transport (BLE, §8) desyncs everything. Use wired links (DIN, USB) for anything critical, never wireless.
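
Two of these pitfalls reduce to a few lines of arithmetic and one guard. The sketch assumes the usual serial framing of 10 bits per byte (start bit, 8 data bits, stop bit) and ignores running status; is_strike and safe_strike_velocity are hypothetical helper names.

```python
BAUD = 31250
BITS_PER_BYTE = 10   # 1 start bit + 8 data bits + 1 stop bit

bytes_per_second = BAUD // BITS_PER_BYTE               # 3125
ms_per_3_byte_message = 3 * 1000 / bytes_per_second    # 0.96 ms
messages_per_second = bytes_per_second // 3            # 1041


def is_strike(message):
    """True only for a Note On that really means 'strike'."""
    status, note, velocity = message
    # A Note On with velocity 0 must be treated as a Note Off.
    return (status & 0xF0) == 0x90 and velocity > 0


def safe_strike_velocity(velocity):
    """Clamp to 1..127 so a very soft strike never collapses into a release."""
    return max(1, min(127, velocity))


silent = is_strike(bytes([0x90, 60, 0]))   # False: read as a release
soft = is_strike(bytes([0x90, 60, safe_strike_velocity(0)]))   # True
```

About a thousand 3-byte messages per second is the ceiling for the whole DIN link, shared by every channel and every controller on it.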

MIDI was conceived in 1983 to make synthesizers talk to each other. Forty years later, it drives fixtures, motors, images, networks, sculptures — because its grammar of discrete, timed messages is simple enough to apply to almost anything that obeys an event or a value. That is the paradox of its longevity: its very limits — modest bandwidth, low resolution, elementary messages — made it a universal triggering language, indestructible and ubiquitous. Music was only the beginning. Anything that lights up, turns, strikes or displays can listen to MIDI — and artists, everywhere, make it say what it was never designed to say.
MIDI drives anything that obeys an event or a value. music is just one of its uses.
Diagram
MIDI: A TRIGGERING HUB FOR ANYTHING. A central MIDI hub (note or CC) branches to five targets: LIGHT via DMX512, VISUALS via TouchDesigner and Jitter, PHYSICAL via Arduino and solenoids, NETWORK via OSC, Max and Pd, and SOUND via a synth (the usual use, shown in grey). One message, note or CC, drives different worlds. Music is just one of them.
SIX DIVERSIONS OF MIDI BEYOND MUSIC
DOMAIN | BRIDGE / TOOL | MIDI DRIVES | PITFALL
SHOW CONTROL | MIDI Show Control (1991) | light, machinery, video cues | SysEx = latency
LIGHT | MIDI→DMX512 bridge | intensity, cues, fades | 7-bit coarse
VISUALS | TouchDesigner, Resolume | clips, VJ params, mapping | jitter = desync
NETWORK | OSC · Max, Pd, Chataigne | routes, fans out |
PHYSICAL | Arduino, Teensy, solenoid | motors, relays, actuators | note-on vel 0 = none
GENERATIVE | Orca, Sonic Pi, live coding | algorithmic stream | 31250 saturates
MIDI stays the trigger ("go now"). Heavy continuous load relays to OSC / Art-Net.
DIVISION OF LABOR: MIDI TRIGGERS, OSC CARRIES. A source (controller or sequencer) sends a thin MIDI arrow (light, "go now") to a relay (Max, Pd, Chataigne). The relay sends a thick OSC / Art-Net arrow (dense, high-resolution) to the heavy loads: dense lighting, multi-parameter video, machine cluster. MIDI says WHEN to fire; OSC and Art-Net carry HOW MUCH, continuously and at high resolution. Each does what it is good at.