How the atlas was grounded in recent AI labor research
The atlas is model-generated, but it is not meant to stand alone. This page explains the external research layer behind it: which public papers were reviewed, what kind of evidence each one contributes, which cross-paper claims recur, and where those claims can be attached back to occupations without pretending they are the same thing as the atlas score.
Papers reviewed: 5
Labs represented: 2
Recurring themes: 4
Occupation seeds: 5
How this research layer was assembled
This is not a paper dump. The goal was to build a small, legible evidence layer for the atlas: enough external grounding to interpret the map better, without blurring together evidence that means very different things.
Step 1
Start with a narrow public source set
The research layer uses a deliberately small set of recent OpenAI and Anthropic papers that say something concrete about workplace use, task capability, or labor-market effects.
Step 2
Separate evidence by what it actually measures
Some papers show observed usage, some benchmark frontier capability, and some frame workforce or policy implications. Keeping those buckets separate avoids false certainty; a typed sketch of that separation follows these steps.
Step 3
Pull out claims that can travel to occupations
For each paper, the useful outputs are recurring claims, caveats, and any occupations or occupation families the paper names directly enough to support a specific note.
Step 4
Use papers as interpretation, not hidden scoring
The papers help explain, challenge, or sharpen atlas estimates. They do not quietly overwrite the atlas score unless the scoring method itself changes.
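To make step 2 concrete, here is a minimal TypeScript sketch of how the three evidence buckets could be kept apart in the data itself. Everything in it, from the type names to the example record, is illustrative rather than the atlas's actual schema.

```ts
// Hypothetical sketch: keeping the three evidence buckets distinct at
// the type level so they cannot be silently blended. Names are
// illustrative, not the atlas's real data model.

type EvidenceKind = "observed-usage" | "benchmark-capability" | "policy-framing";

interface PaperClaim {
  paper: string;       // e.g. "GDPval (OpenAI, 2025)"
  kind: EvidenceKind;  // what the claim actually measures
  statement: string;   // short paraphrase, not a direct quote
  locator: string;     // section and pages, e.g. "Section 3.1, pp. 4-5"
  caveats: string[];   // limits the paper itself states
}

// Claims of different kinds can sit side by side without ever being
// averaged into a single exposure number.
const exampleClaim: PaperClaim = {
  paper: "Which Economic Tasks are Performed with AI? (Anthropic, 2025)",
  kind: "observed-usage",
  statement: "Software development and writing dominate observed Claude usage.",
  locator: "Abstract, pp. 1-2",
  caveats: ["Platform-specific usage; capability may exceed adoption."],
};
```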
How to read the papers against the atlas
The papers and the atlas answer related but different questions. The useful move is to let each source do the job it is actually good at.
What the papers contribute
They contribute observed product use, benchmark tasks built from expert work, and policy context. That is stronger evidence than the atlas can offer for what frontier models are doing now, or what they are plausibly close to doing soon.
What the atlas contributes
The atlas covers the full BLS occupation taxonomy in one place, keeps labor-market context like pay and projected growth attached to each occupation, and lets readers compare several AI lenses across the whole U.S. job structure.
What to keep separate
Observed usage, benchmark capability, and workforce-policy proposals answer different questions. This page keeps them separate instead of pretending they collapse into one universal exposure score.
Four claims that recur across the source set
Observed use is still concentrated in digital, language-heavy work
Across Anthropic's task-level usage papers and OpenAI's narrative blueprint, the strongest early signal is still software, writing, analysis, and other screen-based knowledge work rather than the whole labor market at once.
Augmentation currently edges pure automation in product usage
The observed usage studies consistently show people using AI to draft, iterate, explain, and review more often than to fully hand over end-to-end work, even if API usage skews more automated.
Capability is ahead of adoption
The Anthropic labor note and OpenAI's GDPval benchmark point the same way: current models can do more than current usage implies, but that gap is mediated by tools, workflow, regulation, verification, and organizational change.
Exposure is uneven across occupations and worker groups
Physical-world jobs remain relatively insulated in current usage studies, while more educated and higher-paid white-collar roles show more observed exposure and more evidence of workflow change.
What each paper contributes
AI at Work: OpenAI's Workforce Blueprint
October 2025
What this source measures
Combines product-usage observations, early market interpretation, and workforce-transition proposals.
How it informs the atlas
Useful as a policy and timing lens for interpreting replacement scores cautiously and for explaining why current usage can lag capability.
It is not an occupation-by-occupation empirical exposure table, so it should not be treated as a direct labeling source for the full atlas.
Near-term use looks more collaborative than fully substitutive
OpenAI frames current workplace use as decision support, writing, research, and streamlining of routine work, and argues the observed pattern is still more enabling than replacing.
Foreword, pp. 2-4.
Workplace adoption often starts bottom-up
The paper says employees often begin using ChatGPT before formal enterprise deployment, with writing, market research, and data analysis showing up early across business functions.
Foreword, pp. 2-3.
Capability is moving faster than labor-market measurement
OpenAI links its GDPval benchmark to a claim that GPT-5-level systems already match or exceed professionals on about half of the benchmarked economically valuable tasks.
Foreword, p. 3.
Labor Market Impacts of AI: A New Measure and Early Evidence
March 5, 2026
What this source measures
Introduces an "observed exposure" measure that combines theoretical LLM feasibility with real usage weighted toward work-related and more automated use; a toy sketch of that blend appears just below.
How it informs the atlas
This is the closest external analogue to our replacement metric because it explicitly tries to bridge theoretical exposure and observed use.
It is still built from Claude-centered usage and a custom weighting scheme, so it is informative context rather than a drop-in target variable for our own labels.
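The following toy sketch shows only the general shape of such a blend: feasibility anchored against usage that is discounted toward work-like, automated interactions. The Anthropic note defines its measure precisely and differently; every field and weight below is invented for illustration.

```ts
// Toy illustration only; all weights are invented. The real "observed
// exposure" measure is defined in the Anthropic note itself.

interface TaskSignals {
  feasibility: number;  // 0-1: could a current LLM plausibly do the task?
  usageShare: number;   // 0-1: share of observed conversations touching it
  workRelated: number;  // 0-1: fraction of that usage that is work-like
  automated: number;    // 0-1: fraction that is delegation, not iteration
}

function observedExposureSketch(t: TaskSignals): number {
  // Discount raw usage toward work-related, automated interactions,
  // then anchor against theoretical feasibility. Weights are arbitrary.
  const weightedUsage = t.usageShare * (0.5 * t.workRelated + 0.5 * t.automated);
  return 0.5 * t.feasibility + 0.5 * weightedUsage;
}
```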
Current deployment still trails theoretical capability
Anthropic argues actual task coverage remains only a fraction of what current models could theoretically do, which is a strong warning against equating capability with labor displacement.
Key findings, p. 2.
Higher observed exposure lines up with weaker projected growth
The note reports that occupations with higher observed exposure are projected by BLS to grow less through 2034.
Key findings, p. 2.
Early labor effects are subtle rather than dramatic
Anthropic reports no broad unemployment spike for exposed workers since late 2022, but it does see suggestive evidence that hiring of younger workers has slowed in exposed occupations.
Key findings, p. 2.
GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks
October 2025
What this source measures
Benchmarks frontier models on expert-authored, economically valuable tasks from predominantly digital occupations.
How it informs the atlas
Strong input for our digital-adjacency and augmentation reasoning because it measures capability on serious real-world deliverables.
Because GDPval is only 44 occupations and intentionally digital, it cannot stand in for exposure across manual, care, or field-heavy occupations.
The benchmark targets high-value digital work, not the full labor market
GDPval covers 44 occupations across the top 9 GDP sectors and deliberately focuses on predominantly digital roles.
Abstract and Section 2.1, pp. 1-3.
The task set is grounded in real expert work product
Tasks are built from work contributed by experienced practitioners and are evaluated with human expert pairwise comparisons rather than only automatic grading.
Abstract and Sections 2.2-2.5, pp. 1-4.
Frontier models are approaching expert-quality output on this narrow slice
OpenAI reports that the strongest frontier systems are approaching industry experts on the GDPval gold subset, which raises the upper bound on near-term exposure for digital occupations.
Section 3.1, pp. 4-5.
Which Economic Tasks are Performed with AI? Evidence from Millions of Claude Conversations
February 2025
What this source measures
Maps millions of Claude conversations onto O*NET tasks to show where AI is already being used in the economy.
How it informs the atlas
One of the best external anchors for our augmentation and physical-world insulation metrics.
It is platform-specific usage evidence, so it can understate occupations where capability exists but product adoption, regulation, or workflow integration lag.
Observed use is concentrated in software and writing
Anthropic finds software development and writing tasks together account for nearly half of total observed Claude usage.
Abstract, pp. 1-2.
Adoption is broad but shallow across many occupations
The paper reports that about 36% of occupations show AI use in at least a quarter of their tasks, but only a small share show deep task penetration.
Abstract and contributions, pp. 1-3.
Augmentation edges automation in observed product use
Anthropic estimates 57% of usage is augmentative and 43% is more automation-like, while occupations involving physical manipulation show minimal current use.
Abstract and Section 1 contributions, pp. 1-3.
The Anthropic Economic Index Report: Economic Primitives
January 15, 2026
What this source measures
Adds task complexity, autonomy, success rate, and work-versus-coursework distinctions to Claude usage analysis.
How it informs the atlas
Best source in this set for occupation-specific nuance beyond a single headline score, especially around what kind of work remains after AI takes on some tasks.
It still reflects Claude usage and success rather than a cross-provider equilibrium, so it should complement rather than override our ensemble estimates.
Success rates meaningfully change occupational exposure
When Anthropic weights tasks by both importance and Claude success rate, some occupations such as data entry keyers and database architects show large swaths of work within reach; a weighted-average sketch follows this paper's findings.
Introduction, pp. 3-4.
Observed use remains mixed between collaboration and delegation
Anthropic reports augmentation again exceeds automation on Claude.ai, even while automated use remains stronger in first-party API traffic.
Chapter 1 overview, pp. 4-5.
Task removal can imply deskilling or upskilling depending on the occupation
The report uses travel agents and property managers to show that removing AI-covered tasks can either hollow out the most complex work or strip away bookkeeping-heavy work and leave more strategic responsibilities.
Introduction, pp. 3-4.
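A hedged sketch of the importance-and-success weighting idea named in this paper's findings: occupation-level coverage as an importance-weighted average of per-task success. The report's actual estimator may differ; the interface and numbers below are invented for illustration.

```ts
// Sketch of importance-weighted task coverage. Invented names and
// numbers; the report's own estimator may be more involved.

interface ScoredTask {
  importance: number;   // task importance weight (O*NET-style)
  successRate: number;  // 0-1: observed model success on the task
}

function coverageSketch(tasks: ScoredTask[]): number {
  const totalImportance = tasks.reduce((sum, t) => sum + t.importance, 0);
  if (totalImportance === 0) return 0;
  const weighted = tasks.reduce((sum, t) => sum + t.importance * t.successRate, 0);
  return weighted / totalImportance;
}

// Two tasks: one critical and mostly within reach, one minor and not.
coverageSketch([
  { importance: 4, successRate: 0.9 },
  { importance: 1, successRate: 0.2 },
]); // ≈ 0.76: most importance-weighted work is within reach
```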
How to attach paper evidence to occupations
A good next layer is a short paper-backed note inside each occupation detail view: evidence that sits beside the atlas estimate to explain it, pressure-test it, or add nuance that a single score cannot carry on its own.
Step 1
Start with explicit mentions, not fuzzy semantic matching
Seed the feature only with occupations or occupation families the papers name directly. That keeps the first pass auditable and avoids inventing authority where the papers were actually more general.
Step 2
Store statement, evidence type, and locator separately
Each note should keep a short paraphrased statement, the paper citation, a page locator, and a tag for whether it reflects observed usage, benchmark capability, or policy framing; a schema sketch follows these steps.
Step 3
Attach notes as secondary evidence, not as score overrides
A citation should sit beside the atlas estimate to explain or challenge it. It should not silently rewrite a score unless the scoring method itself has changed to incorporate that evidence.
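Here is an illustrative TypeScript schema for such a note, with the fields from step 2 plus the occupation it attaches to. All names are hypothetical, not the atlas's real data model; the point is that rendering puts notes beside the score without ever mutating it.

```ts
// Hypothetical note schema: the note explains the atlas estimate but
// has no code path that can change it.

type EvidenceKind = "observed-usage" | "benchmark-capability" | "policy-framing";

interface OccupationNote {
  socCode: string;    // occupation the paper names directly, e.g. "43-9021" (data entry keyers)
  statement: string;  // short paraphrased claim
  kind: EvidenceKind; // what the evidence actually measures
  citation: string;   // paper title and year
  locator: string;    // page locator, e.g. "Introduction, pp. 3-4"
}

// Notes sit beside the atlas estimate; they never mutate the score.
function describeOccupation(atlasScore: number, notes: OccupationNote[]): string[] {
  const lines = [`Atlas estimate: ${atlasScore.toFixed(2)}`];
  for (const n of notes) {
    lines.push(`[${n.kind}] ${n.statement} (${n.citation}, ${n.locator})`);
  }
  return lines;
}
```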
Occupation families with the clearest paper trail
These are strong first candidates because the papers either name the occupations directly or describe a narrow enough occupation family to support an auditable note.
Observed Claude usage is especially concentrated in software development, writing, and analytical work, so these occupations are good candidates for paper-backed exposure notes.
These are direct occupation families where our high digital-adjacency and replacement scores can be paired with external observed-usage evidence.
Occupations requiring physical manipulation of the environment show minimal current Claude usage, making them strong examples for the physical-world insulation metric.
This gives the atlas a concrete external citation for why some low-replacement cells stay relatively green even when they are economically large.
When Anthropic factors in task success rates, data entry keyers and database architects are examples where Claude appears capable across a large share of the job.
These are unusually clean occupation-level hooks for the detail view because the report names them directly and says something more specific than a generic exposure score.
Anthropic uses travel agents as an example where AI-covered tasks may remove more complex planning work and leave more routine ticketing and payment work behind.
This is exactly the kind of nuance a single replacement score cannot carry on its own.
Anthropic uses property managers as the opposite case, where removing bookkeeping-heavy tasks can leave more negotiation and stakeholder management work behind.
This supports adding citation-backed notes that distinguish augmentation from deskilling even within occupations that look similarly exposed on a single color scale.