The pipeline learns to watch television

In Report 1 we built a machine that takes a scene and returns its O*NET vector: the specific federal occupation codes it depicts, and the tasks, work activities, and skills inside them. That version read words. It worked on transcripts.

But most of what a job looks like never makes it into dialogue. A doctor does not say "I am now bag-valve-masking the patient." She just does it, on camera, while saying something else entirely. So we taught the pipeline to watch: it now samples still frames from a clip, hands them to Claude alongside the audio transcript, and grounds each extracted task in either a line of dialogue or a thing visible in the frame.
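As a rough illustration of the sampling step: the report does not say how frames are chosen, so the sketch below simply assumes evenly spaced frames taken at the midpoint of equal-length segments. The function name `sample_timestamps` and its parameters are hypothetical, not part of the pipeline.

```python
def sample_timestamps(duration_s: float, n_frames: int) -> list[float]:
    """Return n_frames timestamps (in seconds), one at the midpoint of each
    equal-length segment of the clip. Evenly spaced sampling is an assumption
    made for this sketch, not the pipeline's documented strategy."""
    if duration_s <= 0 or n_frames <= 0:
        return []
    step = duration_s / n_frames
    return [round(step * (i + 0.5), 3) for i in range(n_frames)]


# A 60-second clip sampled four times.
print(sample_timestamps(60.0, 4))  # [7.5, 22.5, 37.5, 52.5]
```

Midpoints avoid the first and last instants of a clip, which are often title cards or fades; a real sampler might instead key on shot changes.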

Then we pointed it at the most obvious target available, the television doctor, and asked a question you can only answer with thirty years of footage: has the job changed? We took six clips from ER (which premiered in 1994) and six from The Pitt (2025), both minute-to-minute emergency-medicine dramas, and ran them through the same O*NET machine.

One frame in, four tasks out

Here is the method, made concrete. The model sees a frame, identifies the occupations present, and returns the specific O*NET profile items it can justify. Items tagged visual are ones it drew from the image rather than the soundtrack: the part a transcript-only reading would have missed.

Frame → tasks: ER, 1995
[Image: a gowned, goggled physician leans over a trauma patient in the ER]
ER · "Carter Running a Trauma" · scene 12

Occupations detected: Emergency Medicine Physicians, Paramedics. Items returned:

task: Perform emergency resuscitations on patients. [visual]
"gloved hands holding a bag-valve mask over the patient's face"
task: Stabilize patients in critical condition. [visual]
"gown and gloves, applying a bag-valve mask to an unresponsive patient"
dwa: Implement advanced life support techniques. [visual]
"a gloved hand pressing a bag-valve mask resuscitator over a patient's face"
dwa: Treat medical emergencies. [visual]
"protective gown, goggles, gloves; performing airway management on a bleeding patient"

Still: Warner Bros. Television / ER (NBC). Low-resolution frame reproduced for non-commercial research commentary. visual marks evidence taken from the image, not the dialogue.

The same kind of frame, thirty years later, in a show built around a single emergency department shift:

Frame → tasks: The Pitt, 2025
[Image: a modern trauma bay, with an elderly patient on a gurney with EKG leads, attended by a gloved team]
The Pitt · "Broken Pacemaker" · scene 18

Occupations detected: Emergency Medicine Physicians, Registered Nurses, Paramedics. Items returned:

task: Stabilize patients in critical condition. [visual]
"patient on gurney with leads attached, multiple staff attending, oxygen administered"
task: Prepare patients for and assist with examinations or treatments. [visual]
"personnel in gloves attending to a patient on a gurney, adjusting oxygen/airway"
dwa: Treat medical emergencies. [visual]
"emergency team surrounding patient on gurney with monitoring equipment"
task: Monitor all aspects of patient care. [visual]
"multiple healthcare workers surrounding the patient with monitoring equipment visible"

Still: Warner Bros. Television / The Pitt (HBO Max). Low-resolution frame reproduced for non-commercial research commentary.

Across all twelve clips, 16% of the evidence the model returned was visual: grounded in the frame, invisible to the transcript. Crash carts, defibrillator paddles, IV lines, the choreography of a trauma bay. That is the part of the job television shows you instead of telling you, and it is exactly the part the old pipeline was blind to.
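A share like this is a simple ratio once every returned item records where its evidence came from. The sketch below assumes a hypothetical `Evidence` record with a `source` field; the schema and the sample records are illustrative, not the pipeline's actual output format or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One grounded item returned for a clip (hypothetical schema)."""
    item: str    # O*NET task or detailed work activity statement
    kind: str    # "task" or "dwa"
    source: str  # "visual" (from the frame) or "dialogue" (from the transcript)


def visual_share(evidence: list[Evidence]) -> float:
    """Fraction of evidence records whose source is the frame."""
    if not evidence:
        return 0.0
    visual = sum(1 for e in evidence if e.source == "visual")
    return visual / len(evidence)


# Toy records: the item names appear in the frames shown earlier, but the mix
# of sources here is invented for illustration and is not the study's data.
sample = [
    Evidence("Perform emergency resuscitations on patients.", "task", "visual"),
    Evidence("Stabilize patients in critical condition.", "task", "dialogue"),
    Evidence("Treat medical emergencies.", "dwa", "dialogue"),
    Evidence("Monitor all aspects of patient care.", "task", "dialogue"),
]
print(f"{visual_share(sample):.0%}")  # 25%
```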

The job barely moved, and moved completely

Measured by raw skill emphasis, the two eras are almost the same job. The cosine similarity between ER's and The Pitt's O*NET skill vectors is 0.91, where 1.0 would mean the two shows weight skills in identical proportions. Registered Nurses and Emergency Medicine Physicians are the top two occupations in both, by a wide margin. It is still, unmistakably, the emergency room.
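For readers who want the comparison spelled out, here is a minimal sketch of cosine similarity over two skill vectors stored as dictionaries. The skill weights below are invented toy values, and the dictionary representation is an assumption; the report does not describe how its vectors are stored.

```python
import math


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors keyed by skill name."""
    # Skills missing from one vector count as weight 0 and add nothing.
    dot = sum(weight * b.get(skill, 0.0) for skill, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for an empty vector
    return dot / (norm_a * norm_b)


# Toy weights, invented for illustration; they are not the study's vectors.
era_a = {"Critical Thinking": 3.0, "Active Listening": 4.0}
era_b = {"Critical Thinking": 4.0, "Active Listening": 3.0}
print(round(cosine_similarity(era_a, era_b), 2))  # 0.96
```

Because cosine similarity ignores vector length, two eras with very different amounts of footage can still score close to 1.0 if they emphasize skills in the same proportions.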

What changed is the edges of the portrayal, and they changed in two clear directions.

Figure: Share of evidence by SOC major group, ER versus The Pitt
Where the portrayal sits in the federal occupation taxonomy. Healthcare (SOC 29) dominates both, but the surrounding cast of occupations is almost entirely different.

First, the camera moved into the bay and stayed there. Healthcare occupations rose from 82% of all evidence in ER to 91% in The Pitt. ER's clips spend real time on the civic apparatus around the medicine: firefighters wheeling in the wounded, a hospital administrator, attendings teaching residents, even a news crew and a 911 dispatcher. The Pitt strips almost all of it away and holds on the resuscitation.
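Shares like these can be computed by rolling each occupation code up to its SOC major group, which is the two digits before the hyphen. The sketch below assumes each piece of evidence carries one SOC code; the list of codes is a toy example, not the clips' actual tallies.

```python
from collections import Counter


def major_group(soc_code: str) -> str:
    """SOC codes look like '29-1141'; the first two digits are the major group."""
    return soc_code.split("-", 1)[0]


def share_by_major_group(codes: list[str]) -> dict[str, float]:
    """Map each major group to its fraction of all evidence records."""
    counts = Counter(major_group(code) for code in codes)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()} if total else {}


# Toy evidence list (one code per record), invented for illustration:
# 29-1214 Emergency Medicine Physicians, 29-1141 Registered Nurses,
# 33-2011 Firefighters.
codes = ["29-1214", "29-1141", "29-1141", "33-2011"]
print(share_by_major_group(codes))  # {'29': 0.75, '33': 0.25}
```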

Second, a whole new kind of work appears. An entire branch of the occupation taxonomy that is absent from the ER clips, Community & Social Service, shows up in The Pitt at 8% of all evidence, driven by Mental Health and Substance Abuse Social Workers. The tasks that appear only in 2025 are telling: substance-abuse counseling, mental-health assessment, educating patients about their illness and community resources, care coordination, discharge planning. The tasks that appear only in ER are the opposite flavor: hands-on CPR, fire-and-rescue radio traffic, facility supervision, a press briefing.
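Finding what appears in only one era is a set difference over the items each era returned. In the sketch below the labels are shorthand borrowed from this section rather than official O*NET statements, and `compare_eras` is a hypothetical helper, not part of the pipeline.

```python
def compare_eras(era_a: set[str], era_b: set[str]) -> dict[str, list[str]]:
    """Split two sets of items into those unique to each era and those shared."""
    return {
        "only_a": sorted(era_a - era_b),
        "only_b": sorted(era_b - era_a),
        "shared": sorted(era_a & era_b),
    }


# Shorthand labels for illustration; each set is a small subset, not the
# full list of items returned for either show.
er = {"Treat medical emergencies", "Hands-on CPR", "Press briefing"}
pitt = {"Treat medical emergencies", "Discharge planning", "Substance-abuse counseling"}

result = compare_eras(er, pitt)
print(result["only_a"])  # ['Hands-on CPR', 'Press briefing']
print(result["only_b"])  # ['Discharge planning', 'Substance-abuse counseling']
print(result["shared"])  # ['Treat medical emergencies']
```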

So the one-line reading: the 1990s show framed the emergency room as a hub in a wider civic system of fire, police, press, hospital management, and teaching. The 2025 show reframes the same job as medicine plus behavioral health, addiction, and the psychosocial work of holding a patient together. The doctor will see you differently now.

A quick aside on what this might say about work, not just television. It's tempting to read the drift as a change in what we find interesting about medicine itself. In 1994, the technology was the spectacle: the paddles, the monitors, the crash cart. The sheer procedural novelty of a trauma bay was the draw, the thing that signaled this is what a doctor does. Thirty years on, all of that is ambient; we take it for granted that the machines work. What's left to dramatize, and what now reads as the essence of the job, is the psychological and the interpersonal: breaking the bad news, steadying the frightened patient, the social worker's caseload, the judgment call. The equipment became furniture and the relationships became the story. And that's the part worth dwelling on: as technology quietly absorbs the technical core of a profession, the thing we come to see as the real job (the part with the status, the drama, the value) migrates to whatever the machines can't touch.

What this is and isn't

This is a pilot, and an honest one. It is six clips per era, drawn from whatever is officially posted online, not a scene-matched sample. ER's longer trauma clips and The Pitt's shorter ones each carry their own quirks; some of the specific percentages would move with a larger, hand-matched corpus. The direction of the finding is the interesting part, not the third decimal place. Automatic transcription and visual extraction add their own noise.

But the method clearly works. The pipeline now reads a moving image and returns the federal taxonomy of work it depicts, frame by frame, task by task. Television is just the first thing we pointed it at. Again.