Glue, governance and the humdrum of AI

11 May 2026 to 17 May 2026

This week reminded anyone paying attention that models are theatre; the outcome is decided by data, integration and the nitty‑gritty of making systems run. The real stories are engineering habits, new attack surfaces and the legal‑and‑reputational questions that follow when AI leaves the lab.

engineering gravity: data, labels and the dirty work

The practical bottleneck is not another model size announcement; it is data. People who actually build systems spend their time wrestling with messy inputs (cleaning, linking, versioning and arguing about what counts as 'good'), and those choices determine whether a model is useful or just confidently wrong.

That truth shows up in domain specifics. Robotics failures usually trace back to annotation choices (identity across viewpoints, temporal action boundaries, failure‑mode tags), not flaky control logic. Equally, apps that look straightforward on a product page get complicated fast when MLS feeds, payment reconciliation and state compliance rules enter the brief. And clinical deployments expose the same gap: accuracy and hallucination are only the start; provenance, consent, auditability and legal redress are the governance work that keeps patients safe.

Ask vendors for concrete references: which MLS schemas, reconciliation flows and audit histories do they support?
Treat annotations and label schemas as engineering assets: version them, test inter‑annotator agreement, and record failure tags.
Budget for governance: independent audits, operational benchmarks and legal frameworks where human safety or liability exists.
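
One way to make "test inter‑annotator agreement" concrete is Cohen's kappa, which corrects the raw agreement between two annotators for the agreement expected by chance. The sketch below is a minimal illustration, not a reference implementation; the function name and the example labels ('grasp', 'drop') are assumptions invented for this example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal frequencies, summed over all labels.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    all_labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[k] * freq_b[k] for k in all_labels) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four robot episodes from two annotators.
annotator_1 = ["grasp", "grasp", "drop", "drop"]
annotator_2 = ["grasp", "drop", "drop", "drop"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Tracking a figure like this per label‑schema version gives a rough signal of whether a schema change made the labels easier or harder to apply consistently.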

deployments look nothing like demos

Claims about mass deployment are about scale, not a single showcase. A British robotics firm promising 1,000-2,000 humanoids across a supplier's factories by 2032 is notable precisely because it forces questions most PR skirts: maintenance, integration with existing automation, uptime, spares, safety certification and who actually pays for servicing.

Conferences that gather builders matter because they expose those trade‑offs. Events aimed at physical AI trade the benchmark theatre for conversations about latency, sensing, battery life and wiring: the unglamorous engineering that decides whether a demo survives repeated shifts and regulatory scrutiny.

developer practice over product theatre

Small rituals and better runtimes move things faster than glossy roadshows. A weekly, no‑slides 'Build Club' (live coding, sharing screens and fixing bugs in public) transfers tacit knowledge in a way slides never do. The trick is preserving what's built, curating topics and avoiding the tendency to normalise hacky shortcuts; done well it accelerates competence without adding ceremony.

Tooling follows habit. Self‑hosted runtimes that manage state, tool integration and execution controls are practical steps toward agents that do background work, but they demand ops discipline: storage, updates, sandboxing and monitoring. Similarly, voice APIs are shifting usage patterns from typing to speaking, which is a product design problem as much as a modelling one; transcription errors, latency and privacy rules change how you build interactions, not just which model you call.

If you copy a Build Club, define preservation, IP and curation rules up front.
Treat self‑hosted agent runtimes as infrastructure projects: plan for incident response and safe execution, not just feature checklists.
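
As a rough illustration of what "execution controls" can mean at the smallest scale, the sketch below runs agent tools only from a fixed allowlist, enforces a timeout, and returns a structured record that monitoring could consume. The tool names, the `ALLOWED_TOOLS` table and the `run_tool` helper are hypothetical, invented for this example. It is not a sandbox: real isolation needs OS‑level or container‑level controls on top.

```python
import subprocess
import sys

# Hypothetical allowlist: tool name -> fixed argument vector.
# The agent may only pick a name; it never supplies the command itself.
ALLOWED_TOOLS = {
    "hello": [sys.executable, "-c", "print('hello')"],
    "sleepy": [sys.executable, "-c", "import time; time.sleep(30)"],
}


def run_tool(name, timeout_s=5.0):
    """Run an allowlisted tool with a timeout and return a log-friendly record."""
    if name not in ALLOWED_TOOLS:
        # Unknown tools are refused rather than executed.
        return {"tool": name, "status": "denied", "output": ""}
    try:
        proc = subprocess.run(
            ALLOWED_TOOLS[name],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed if it runs past this limit
        )
    except subprocess.TimeoutExpired:
        return {"tool": name, "status": "timeout", "output": ""}
    status = "ok" if proc.returncode == 0 else "failed"
    return {"tool": name, "status": status, "output": proc.stdout.strip()}
```

Calling `run_tool("hello")` yields a record with status `ok`, an unlisted name is `denied`, and `run_tool("sleepy", timeout_s=0.5)` comes back as `timeout`. Those three outcomes are the kind of events an incident‑response plan would want logged and alerted on.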

trust, keys and the new attack surface

Attempts to fix trust problems can create fresh vulnerabilities. Federated unlearning, which promises deletion without centralising data, adds complexity: tracking parameter influence and verifying erasure become new protocols to manipulate. Without adversarial testing and independent verification, unlearning risks swapping one privacy headache for a broader attack surface that can be exploited to force deletions or leak information.

Security incidents and reputational disputes underscore the point. A recent npm supply‑chain attack prompted a high‑level account of key rotation and containment and a macOS update deadline; the response was sensible but light on forensic detail. Meanwhile a courtroom scuffle over credibility between industry figures showed that governance will not only be technical: reputations, motives and competing narratives now play a central role in how institutions and contracts are judged. Add to that the legal exposure of single‑operator services, such as a one‑person AI divorce assistant funded with seed capital, and you have a reminder that liability and trust will follow deployments into the real world.
