Public sector · International org · Geneva · global · Claude Code

Teaching the WHO to build: from crisis to self-sustaining AI engineering.

How Orchestrary helped the World Health Organization deploy 15 production agents and train 87 staff to build the next ones themselves with Claude Code: preserving 80–90% of operational capacity with 50% of the workforce, saving $1.2M–$2.0M annually, and leaving behind an organization that no longer needs us.

$1.2M–$2.0M
Annual savings
80–90%
Operational capacity preserved
15
Production agents shipped
87
Staff trained on Claude Code

When the United States withdrew funding from the World Health Organization in 2024, the organization faced an existential crisis. As WHO's largest contributor, the US had provided 15–20% of the organization's total budget. The loss forced WHO to eliminate approximately half of its 7,400-person workforce, a reduction so severe that, under normal circumstances, it would have meant organizational collapse.

WHO's leadership made a different bet. Instead of accepting proportional service cuts, they engaged Orchestrary to do two things in parallel:

  • Deploy the first wave of production agents themselves, fast — covering the highest-pain operational processes that were about to break with half the headcount.
  • Teach an internal cadre of 80+ WHO staff to use Claude Code at the same level our consultants do — so the agent program would become a permanent in-house capability, not a vendor dependency.

Eighteen months later, WHO is running 15 production-grade AI applications, has built another 27 internally without our help, and operates with roughly 3,500 staff doing work that previously required 7,400. The Head of HR personally recommended the program to the Director-General as a model for organizational resilience and digital innovation.

The crisis: organizational survival, not "digital transformation"

The numbers were devastating

  • Workforce reduction: From ~7,400 staff to approximately 3,500–4,000 personnel (46–53% reduction)
  • Timeline pressure: 3–6 months to demonstrate the new operating model would hold
  • Operational risk: Maintaining critical global health functions during post-pandemic recovery
  • Knowledge retention: Preventing institutional knowledge loss from experienced departing staff

What everyone else recommended

Every traditional consulting firm WHO talked to recommended the same thing: cut services proportionally, freeze new initiatives, accept a multi-year recovery, lobby donors. McKinsey-style decks with three-year roadmaps.

What Orchestrary recommended instead

A two-track engagement: build (we deploy 8 production agents in 12 weeks) and teach (we run a Claude Code Academy for an initial cohort of 24 WHO staff, later expanded to 80+, so they could build the next wave themselves). The argument was simple. WHO didn't have the budget for a 24-month enterprise software build, and it couldn't afford to be permanently dependent on an external vendor for the systems running its core operations. The only path was to deploy fast and transfer the skill at the same time.

WHO took the bet. We started in week three.

The portfolio: 15 production agents — 8 by us, 7 by WHO staff

Wave 1 — Built by Orchestrary (months 1–4)

We led the build on the eight highest-leverage systems while the Academy was running in parallel. WHO Academy participants shadowed our engineers throughout — pairing on Claude Code sessions, reviewing each other's prompts, co-authoring the SKILL.md and AGENTS.md files that made each agent reliable.

ARCH-AI · WHO Architecture Advisor

The problem
Traditional IT architecture planning required 3–5 senior architects ($450K–$1M annually) and 2–4 weeks per project. Knowledge was siloed in individual architects' heads, creating bottlenecks.
The agent
A conversational requirements-gathering system built in 11 days using Claude Code. Conducts intelligent stakeholder interviews, generates Business Requirements Documents, Solution Descriptions, Architecture Definitions, and Implementation Backlogs, with real-time Azure cost estimates.
The impact
  • Time per project: 2–4 weeks → 2–3 hours (90%+ reduction)
  • Annual savings: $140K–$220K (eliminated 2–3 architect FTEs)
  • Throughput: One person handles 10× more projects per year
  • Quality: Consistent WHO-standard documentation across all projects
Skill transfer
Two WHO architects from the Academy cohort took over maintenance of ARCH-AI in month 5. They have since added three additional output formats (security review, data privacy impact, procurement brief) without our involvement.

LINGUA-X · Enterprise Translation Browser Extension

The problem
WHO operates in 6 official languages plus 100+ country-specific languages. Previous workflows required sending documents to professional translators, with a 3–5 day turnaround at $50–$150 per page, or $5–$10M annually.
The agent
One-click webpage translation browser extension deployed enterprise-wide via Group Policy. Instant translation in 70+ languages using Azure Translator with WHO-customized branding and clinical/policy terminology dictionaries.
The impact
  • Annual cost reduction: $600K–$900K (60–80% of professional translation spend)
  • Time elimination: Instant vs. 3–5 day wait
  • Coverage: 70+ languages vs. 6 official
  • Usage: 10,000+ translations daily across global offices
Skill transfer
A 4-person localization team, none of whom had previously written code, now extend the extension themselves. They have added a "regulatory compliance check" mode and a "plain-language" mode for public-facing translations.

KNOWLEDGE-CHATBOT · UNJSPF Pension Advisor

The problem
UN Joint Staff Pension Fund rules span 600+ pages. Staff facing potential termination needed answers urgently. Previously, 6–8 HR benefits specialists fielded 100+ inquiries daily, with a 2–5 day wait and a 15–20% error rate.
The agent
Conversational AI using LangGraph for multi-step reasoning, with a dynamic calculation engine that converts JSON formulas into executable Python for real mathematical computations. Semantic search across the entire pension knowledge base; source citations on every answer.
The impact
  • Response time: 2–5 days → instant
  • Annual savings: $200K–$320K (5–6 specialists no longer required)
  • Volume: 5,000+ pension queries handled in 6 months
  • Accuracy: 92% on first response (vs. 80–85% for human specialists)
  • Availability: 24/7 across all timezones
Skill transfer
Two HR analysts learned LangGraph through the Academy and now extend the chatbot themselves. In month 6 they added a "transition advisor" workflow for staff considering early retirement, which they architected, scoped, and shipped in 8 days.
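
To make the calculation-engine idea concrete, here is a minimal sketch of evaluating a JSON-encoded formula in Python. It is an illustration under stated assumptions, not the production engine: the JSON schema (`op`/`args` nodes), the `evaluate` function, the variable names, and the example formula are all hypothetical, and the sketch interprets the JSON tree directly against a whitelist of operations rather than generating Python source.

```python
import json
import operator

# Whitelisted operations: anything not listed here is rejected.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
    "min": min,
    "max": max,
}


def evaluate(node, variables):
    """Recursively evaluate a JSON formula node against named inputs."""
    if isinstance(node, bool):
        raise TypeError("booleans are not valid formula values")
    if isinstance(node, (int, float)):
        return node
    if isinstance(node, str):
        # A bare string is treated as a variable reference.
        if node not in variables:
            raise KeyError(f"unknown variable: {node}")
        return variables[node]
    if isinstance(node, dict):
        op = node.get("op")
        if op not in OPS:
            raise ValueError(f"unsupported operation: {op!r}")
        args = [evaluate(arg, variables) for arg in node.get("args", [])]
        if len(args) != 2:
            raise ValueError(f"{op} expects exactly two arguments")
        return OPS[op](args[0], args[1])
    raise TypeError(f"unsupported node type: {type(node).__name__}")


# Hypothetical formula for illustration only (not an actual UNJSPF rule):
# benefit = min(years_of_service, 35) * accrual_rate * average_salary
FORMULA_JSON = """
{
  "op": "mul",
  "args": [
    {"op": "mul",
     "args": [{"op": "min", "args": ["years_of_service", 35]}, "accrual_rate"]},
    "average_salary"
  ]
}
"""

formula = json.loads(FORMULA_JSON)
inputs = {"years_of_service": 20, "accrual_rate": 0.02, "average_salary": 90000}
benefit = evaluate(formula, inputs)
print(round(benefit, 2))
```

Keeping the arithmetic in a small evaluator like this means the language model only selects a formula and supplies inputs, while the numbers themselves come from deterministic code that can be unit-tested.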

INFO-EXTRACTOR · Intelligent Document Information Extraction

The problem
WHO receives thousands of grant applications (50–100 pages each), country health reports, research submissions, and compliance documents. Previously, processing them took 10–15 data entry clerks and 3–5 days per document, with a 15–25% error rate.
The agent
Pattern-recognition system that learns extraction patterns from examples, with dynamic schema generation, batch processing, Pydantic validation, and confidence scoring (0.0–1.0) per field.
The impact
  • Time per document: 3–5 days → 15–30 minutes (95%+ reduction)
  • Annual savings: $180K–$280K (6–9 positions)
  • Accuracy: 92–97% vs. 75–85% manual (50–80% error reduction)
  • Volume: 10,000+ documents processed in first year
Skill transfer
End-to-end ownership by a 3-person operations team since month 4. They've added 11 new document types, none of which existed when the system was first built.
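
As a rough illustration of per-field confidence scoring, the sketch below validates that each score lies in the 0.0–1.0 range and routes low-confidence fields to human review. It uses a standard-library dataclass as a stand-in for the Pydantic validation mentioned above; the `ExtractedField` class, the `split_for_review` helper, the 0.85 threshold, and the sample values are assumptions made for this example, not details of the deployed system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedField:
    """One extracted value plus the extractor's confidence in it."""
    name: str
    value: str
    confidence: float

    def __post_init__(self):
        # Reject scores outside the documented 0.0-1.0 range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(
                f"confidence for {self.name!r} must be in [0.0, 1.0], "
                f"got {self.confidence}"
            )


def split_for_review(fields, threshold=0.85):
    """Partition fields into auto-accepted and needs-human-review lists."""
    accepted = [f for f in fields if f.confidence >= threshold]
    review = [f for f in fields if f.confidence < threshold]
    return accepted, review


# Invented sample output from a hypothetical extraction run.
fields = [
    ExtractedField("applicant_name", "Example Health Trust", 0.97),
    ExtractedField("requested_amount", "250000", 0.91),
    ExtractedField("project_end_date", "2026-13-01", 0.42),
]
accepted, review = split_for_review(fields)
print([f.name for f in accepted])
print([f.name for f in review])
```

A threshold like this is a policy choice: lowering it reduces the review workload but lets more uncertain values through, so it would normally be tuned against a labeled sample of documents.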

ORG-CHART · JOB-POSTS · STEP-DETERMINE · PAYMENT-RECONCILIATION

The remaining four Wave-1 agents, covering organizational visualization, job description generation, salary step determination, and payment reconciliation, followed the same pattern: Orchestrary built, WHO staff shadowed, ownership transferred at month 4–6. Each delivered the same shape of result: 85–99% time reductions, $80K–$220K in annual savings per system, full ownership inside WHO.

Wave 2 — Built by WHO staff using Claude Code (months 4–12)

This is the part of the engagement we're most proud of. Once the first Academy cohort had completed all five tracks, WHO staff began building agents themselves. They consulted us during weekly office hours, but the work was theirs.

DONOR-REPORTING — built by a 2-person team in WHO Communications

Generates the entire annual donor report (190+ pages) from source data across 11 systems. Reduced report generation from 3–4 weeks to 2–3 days, saving an estimated $180K–$280K annually by eliminating 3–5 contracted report writers. The two staff who built it had zero engineering background before joining the Academy.

KEYWORD-HIGHLIGHT · DETECT-AI · POLICY-EDITOR · CSO Document Evaluator · GRANT-MATCHER · FIELD-OFFICE-BRIEFER

Six additional internally-built systems shipped between months 5 and 14, each saving $60K–$180K annually and authored entirely by WHO staff using the playbooks from the Academy. The Academy graduates are now training the next 60 WHO staff themselves. The cycle is self-sustaining.

The aggregate impact: a new operating model

Financial transformation

Total direct savings: $1.2M–$2.0M annually. The 15 AI systems collectively eliminated the need for 25–40 full-time positions while dramatically improving quality, speed, and consistency.

Operational transformation

  • Time savings: 150,000+ staff hours annually redirected from routine tasks to strategic work
  • Error reduction: 15–25% error rates reduced to 3–8%
  • Speed improvements: 80–95% time reductions, enabling faster response to global health emergencies
  • Scalability: Systems handle 10–50× more volume per person without proportional cost increases
  • 24/7 availability: Services accessible globally across all timezones

The strategic outcome

WHO maintained 80–90% operational capacity with 50% of the workforce — and developed a permanent in-house capability to keep building. They are no longer dependent on external vendors to extend their AI surface.

What Orchestrary actually did differently

1. Deployed before training — but trained in parallel

The first agent was in production in 11 days. The first Academy cohort started on day 14. Staff watched real systems get built in real time using the same tool they were learning to use.

2. Standardized on Claude Code in WHO's own Azure tenant

Every agent we built — and every agent WHO staff later built — was developed inside WHO's existing Azure environment, using Claude Code as the development interface. No data left WHO's boundary. No vendor SDK locked them in.

3. Refused to build what we couldn't transfer

Several proposed agents were rejected during scoping because they would have required permanent specialist maintenance from us. If WHO staff couldn't realistically own and extend it after the engagement, we didn't build it.

4. Measured the right thing

Our success metric was not "agents shipped" or "consulting hours billed." It was agents that WHO staff could explain, debug, and extend without us. That changed every design decision we made.

5. Designed the Academy as the core deliverable, not a sidecar

Most engagements treat training as a polite afterthought. The Academy was a 14-week program with five tracks, weekly capstone projects, and a graduation criterion: each cohort had to ship a real production agent into WHO before we considered them certified.

The five-track Academy

Track 01 · 3 weeks
Operator basics
Drive Claude Code in your terminal: prompt patterns, planning loops, MCP tools, agent failure modes. By end of week 3, every participant has used it to automate at least one task in their actual job.
Track 02 · 2 weeks
Workflow design
Decompose a department workflow into agent-suitable atomic steps. Design data interfaces. Write the SKILL.md and AGENTS.md files that make agents reliable across runs.
Track 03 · 3 weeks
Tool building
Write small Python tools the agent calls — the line between a chatbot and an agent that actually does work. Each participant ships at least one real tool to production.
Track 04 · 2 weeks
Evaluation
Become the in-house quality function: golden datasets, regression suites, drift monitoring, alerting on accuracy degradation.
Track 05 · 2 weeks
Governance
The CIO, Head of HR, Head of Finance, and Director of Compliance learn the policy frame: what agents can touch, who reviews, how to audit, how to roll back.

The Academy ran four times in 18 months. Total graduates: 87. Internal "agent engineers" capable of leading new builds: 23. External consultants currently required to keep the program running: 0.
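
To give a flavor of what Tracks 03 and 04 ask participants to produce, here is a minimal sketch of a small Python tool paired with a golden-dataset regression check. Everything in it is hypothetical: the `determine_step` rule is invented for the example and is not WHO's actual step-determination logic, and the golden cases, the `run_regression` helper, and the 0.95 accuracy bar are illustrative assumptions.

```python
def determine_step(years_experience: int) -> int:
    """Hypothetical tool an agent might call (invented rule, for illustration)."""
    if years_experience < 0:
        raise ValueError("years_experience must be non-negative")
    # One step per two years of experience, capped at step 10.
    return min(1 + years_experience // 2, 10)


# Golden dataset: inputs paired with answers a human reviewer has signed off on.
GOLDEN_CASES = [
    {"years_experience": 0, "expected": 1},
    {"years_experience": 3, "expected": 2},
    {"years_experience": 8, "expected": 5},
    {"years_experience": 40, "expected": 10},
]


def run_regression(tool, cases, min_accuracy=0.95):
    """Replay every golden case and report accuracy plus any failures."""
    failures = []
    for case in cases:
        actual = tool(case["years_experience"])
        if actual != case["expected"]:
            failures.append({**case, "actual": actual})
    accuracy = 1 - len(failures) / len(cases)
    return {
        "accuracy": accuracy,
        "failures": failures,
        "passed": accuracy >= min_accuracy,
    }


report = run_regression(determine_step, GOLDEN_CASES)
print(report)
```

Running a check like this on every change, and alerting when `passed` turns false, is one simple way to catch accuracy degradation before a modified tool reaches production.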

The human dimension

"I am 54. I had never written a line of code in my life before the Academy. Six months later I shipped a real system that the entire benefits team uses every day. My job today is more interesting than it was before the budget cuts — that is something I never thought I would say."

HR Specialist · Wave 2 builder

"Month-end close was a nightmare after the cuts. We went from 8 people to 3 but still had to reconcile thousands of payments. The reconciliation system saved us. What used to take 4 days of overtime now takes 5–6 hours. And because I learned Claude Code in the Academy, I have already added three new features myself."

Finance Officer · Wave 1 owner

"We built the donor report generator in 19 days. Two of us. No engineering background. The 2024 report — the first one done with the system — was the cleanest, fastest, most cited annual report we have ever produced. Donors noticed."

Communications Lead · Wave 2 builder

"Orchestrary did not sell us a tool. They taught us a discipline. The agents are valuable. The fact that we can now build new ones whenever we need them is what changed the institution."

Head of HR · Engagement sponsor

The numbers, before and after

Metric                       | Before cuts | After AI transformation | Change
Workforce                    | 7,400 staff | 3,500 staff             | −53%
Operational capacity         | 100%        | 80–90%                  | −10–20%
Cost per transaction         | Baseline    | 40–60% lower            | −40–60%
Processing speed             | Baseline    | 5–20× faster            | +400–1900%
Error rates                  | 15–25%      | 3–8%                    | −60–85%
Staff productivity           | Baseline    | 2–3×                    | +100–200%
External AI consultant spend | $0          | $0/month                | Sustained at zero
In-house agent builders      | 0           | 87 trained · 23 active  | New capability

The recognition

This engagement was personally recommended by WHO's Head of HR to the Director-General as a model for organizational resilience and digital innovation. The Academy curriculum has since been requested by three other UN agencies as a template for their own crisis-driven AI transformations.

But the recognition that mattered most came in month 18, when WHO's CIO told us, on a quarterly review call:

"You can stop coming to these meetings. We don't need you anymore."

WHO CIO · Month 18

That was the deliverable.

Lessons for other organizations

When this engagement model works

  • You face workforce reductions or budget compression and still need to maintain service levels
  • You have repetitive cognitive work currently consuming expensive human capital
  • You operate in high-complexity environments (regulations, compliance, multinational coordination)
  • You want a permanent in-house capability, not another vendor dependency
  • Your leadership invests in skill development at the same speed they invest in tools

If WHO can do this in 18 months under existential threat, your organization can do it in peacetime in less.

