Hello from The AI Night,

Today in AI:

  • OpenAI Launches Guaranteed Capacity For Reserved Compute

  • Google DeepMind AI Solves Nine Open Problems

  • OpenAI Launches Locked-Screen Computer Use For Codex

Here's the deal: OpenAI launched Guaranteed Capacity, a commitment-based offering that lets customers reserve long-term access to its compute for production systems, customer-facing apps, and AI agents. Announced May 19, it sells reserved capacity rather than a new model or feature.

The Breakdown:

  • Customers pick one-, two-, or three-year commitments, with discounts that increase at higher annual spend levels.

  • Reserved capacity can be drawn down across OpenAI's full product and model portfolio.

  • The program targets production workloads, customer-facing applications, and agents running on OpenAI.

  • Sam Altman framed it as a "big win-win" and said OpenAI will reserve enough compute for ChatGPT and Codex.

  • The offering runs until OpenAI's current allocation sells out, with plans to bring it back later.

The bigger picture: AWS built a hundred-billion-dollar business selling reserved compute. OpenAI is copying that playbook right before an IPO. Lock in multi-year deals, book revenue upfront, show Wall Street predictable growth. The discount rewards loyalty. The lock-in prevents switching. This is not an AI company shipping a product. It is a cloud vendor learning its first enterprise sales trick.

Here's the deal: Google DeepMind built AlphaProof Nexus, a framework pairing frontier LLMs with the Lean proof assistant to autonomously search for formal proofs. Its most capable agent resolved 9 of 353 open Erdős problems and proved 44 of 492 OEIS conjectures, each solve costing a few hundred dollars.

The Breakdown:

  • Uses Gemini 3.1 Pro for proving and Gemini 3.0 Flash for rating sketches, with RL-based AlphaProof as a focused tool.

  • Two solved Erdős problems had been open for 56 years.

  • It solved a 15-year-old question on Hilbert functions in algebraic geometry.

  • Lean's compiler verifies every step, so the logic needs no expert review.

  • A simpler basic agent replicated all nine Erdős solves, though at higher cost on the hardest ones.

  • All Lean proofs are published on GitHub.

The bigger picture: The real signal is the drift from specialized trained systems toward simple agentic loops. DeepMind admits the basic agent's success surprised them, since stripped-down loops underperformed on benchmarks months earlier. That points to a frontier where compiler feedback, not a bigger model, grounds reasoning. For builders, formal verification flips hallucination from a fatal flaw into a cheap filter: a proof either compiles or it does not. 
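
For readers who have not used Lean, here is a minimal sketch of what "compiles or it does not" looks like in practice. It is a toy Lean 4 illustration, not taken from the AlphaProof Nexus repository; the theorem names are made up for this example, and `Nat.add_comm` is a lemma from Lean's core library.

```lean
-- A true statement with a valid proof: Lean accepts it.
theorem two_plus_two : 2 + 2 = 4 := rfl

-- A general statement proved by citing a core library lemma.
theorem swap_sum (a b : Nat) : a + b = b + a := Nat.add_comm a b

-- A "hallucinated" claim. Uncommenting this line makes the file fail
-- to compile, so the bad proof is filtered out automatically.
-- theorem bad_claim : 2 + 2 = 5 := rfl
```

An agent can generate thousands of candidates like these and keep only the ones the checker accepts, which is why a wrong guess costs compute rather than credibility.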

Here's the deal: OpenAI extended Codex's Computer Use feature so it no longer needs an unlocked machine. From a phone, Codex can now operate apps on your Mac even when the screen is off and locked.

The Breakdown:

  • Users trigger tasks from their phone while the Mac screen stays off and locked.

  • The capability extends Computer Use, which is currently macOS-only and unavailable in the EEA, UK, and Switzerland at launch.

  • Setup requires the Computer Use plugin plus Screen Recording and Accessibility permissions.

  • Codex asks before using each app, with an "Always allow" option for trusted apps.

  • It cannot automate terminal apps or Codex itself, and cannot authenticate as an administrator.

The bigger picture: Every desktop AI agent assumes you are watching. That assumption just broke. A tool that operates your locked Mac from your phone is not an assistant. It is an employee with a key to the office. OpenAI solved the convenience problem. Whether anyone has solved the trust problem of an unattended agent running through your apps is a question security teams have not started asking yet.

Want to get the most out of ChatGPT?

ChatGPT is a superpower if you know how to use it correctly.

Discover how HubSpot's guide to AI can elevate both your productivity and creativity to get more things done.

Learn to automate tasks, enhance decision-making, and foster innovation with the power of AI.

What else you need to know:

Lee Robinson argued that engineers should think more deeply about code, not less, warning that AI-generated code is becoming a production liability when developers don't understand the systems they ship.

Grok shipped Build 0.1.219 with bug fixes including prompt-caching usage limit corrections, terminal rendering fixes for kitty and VTE-based terminals, and clickable multi-line markdown links across word-wrap.

Fujitsu developed self-evolving multi-AI agent technology that learns from execution results and human feedback, automatically enhancing its Takane LLM and lifting domain accuracy by an average 28 points.

Google's Liz Reid said AI Mode now expands across countries and languages within months rather than years, crediting multilingual model architecture and existing Search ranking systems for location-aware grounding.

Novata launched Risk Atlas, an AI-powered risk monitoring tool that normalizes data across portfolios and supply chains, tracking five categories: reputational, cyber, geopolitical, physical climate, and transition risk.

That’s it for today’s edition of The AI Night.

Our goal is to cut through the noise, surface what actually changed, and explain why it matters.

2 ways to support us:

  1. Forward this to your AI-curious friend → https://www.theainight.com

  2. Sponsor The AI Night and reach 500+ AI builders daily → passionfroot.me/theainight
