June 9, 2026
I took a couple of weeks off, so we're playing catch-up. My youngest daughter graduated from high school last week, and between that, the after-prom party she threw at my house, and her graduation party (also at my house), there wasn't a lot of time left for keeping up with serverless, AI, and cloud. So this one covers about three weeks of news, and it's a long one. Apologies in advance.
In this issue, Anthropic ships two major models, DynamoDB gets "extended" to run locally on Postgres, and Aurora DSQL adds JSONB support. Plus, we've got plenty of awesome content from the cloud, serverless, and AI communities.
Let's start with the money, because it's the reason for everything else. Anthropic raised a boatload of it: $65 billion in a Series H at a $965 billion post-money valuation. That kind of capital buys a lot of compute, and the spending showed up almost immediately in the product line.
First came Claude Opus 4.8, which introduced dynamic workflows in Claude Code as a research preview, better coding and browser-automation numbers, and effort control settings, all at the same price as Opus 4.7. Then, before anyone had a chance to settle in, Anthropic announced Claude Fable 5 and Claude Mythos 5, the first generation of Mythos-class models built for autonomous, professional work. Fable 5 is the one you can actually use, and Mythos 5 remains the locked-down sibling. If you want a second opinion before you commit, Claire Vo's review of Fable 5 puts it through three real-world scenarios and is honest about where it falls down.
AWS, predictably, did not want to be left out. Claude Opus 4.8 landed on AWS through Bedrock and Claude Platform, and then Fable 5 showed up as the first generally available Mythos-class model on AWS too, with a longer writeup on the AWS blog covering the built-in safeguards for autonomous operation. Anthropic wasn't the only model vendor getting the Bedrock treatment, either. OpenAI's GPT-5.5, GPT-5.4, and Codex are now generally available on Bedrock with pay-per-token pricing matching OpenAI's direct rates, inference staying inside your chosen region, and the usual KMS, VPC, and CloudTrail story for compliance.
To make all of this easier to work with, Bedrock also shipped a redesigned console optimized for the OpenAI- and Anthropic-compatible APIs and built around the bedrock-mantle endpoint, with project-based organization, side-by-side comparisons, and prefilled code snippets (there's a hands-on writeup on the AWS blog). They rounded it out with request-level usage attribution so you can tag individual inference calls by team or environment, CloudWatch metrics for the mantle endpoint, and expanded Service Quotas support. The cost attribution piece is the one I'd pay attention to. Once you've got three model families running through one endpoint, knowing which team is spending what stops being optional.
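Here's a rough idea of what tagged usage lets you do once you have it. This is a minimal sketch in plain Python, and the record shape (team, env, input_tokens, output_tokens) is entirely hypothetical. It is not the format Bedrock emits, just an illustration of rolling usage up by tag.

```python
from collections import defaultdict

# Hypothetical usage records. The real attribution data will look different.
records = [
    {"team": "search", "env": "prod", "input_tokens": 1200, "output_tokens": 300},
    {"team": "search", "env": "dev", "input_tokens": 400, "output_tokens": 100},
    {"team": "billing", "env": "prod", "input_tokens": 900, "output_tokens": 600},
]


def tokens_by_tag(rows, tag):
    """Sum input and output tokens for each distinct value of the given tag."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[tag]] += row["input_tokens"] + row["output_tokens"]
    return dict(totals)


print(tokens_by_tag(records, "team"))  # {'search': 2000, 'billing': 1500}
print(tokens_by_tag(records, "env"))   # {'prod': 3000, 'dev': 500}
```

The same grouping works for any tag you attach, which is the point: the roll-up is trivial once the tags exist on every call.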
The agent side of Bedrock kept pace. AgentCore Runtime added interactive shells via a new InvokeAgentRuntimeCommandShell API, giving you WebSocket terminal access into a running agent's microVM to inspect files, run commands, or debug state without losing session context. AgentCore Identity now lets you bring your own secrets through AWS Secrets Manager, and Step Functions added an AgentCore-powered agentic reasoning step so you can drop a reasoning task into a state machine without bolting on extra infrastructure. The AWS MCP Server picked up cross-account and cross-role access too, so a coding agent can finally hop between accounts and roles in a single session instead of stopping, swapping credentials, and starting over. Anyone who's managed agents across more than one account knows exactly how annoying that loop was.
The most interesting database news of the bunch didn't get a flashy launch event. AWS released ExtendDB 0.1, an open source adapter that implements the DynamoDB API on top of pluggable storage backends, with PostgreSQL as the first reference implementation. That means you can write code against DynamoDB programming patterns and run it locally, in CI, or on-prem against Postgres. I've been wanting something like this for years. DynamoDB Local has always been a reasonable stand-in, but a pluggable adapter that lets you point real DynamoDB access patterns at a Postgres backend opens up a lot of testing and migration scenarios that used to be a pain. It's 0.1, so temper your expectations, but the direction is genuinely useful.
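If the "pluggable storage backend" idea feels abstract, here's a toy version of the pattern. To be clear, this is not ExtendDB's code or API. Every class and method name below is made up for illustration, and the in-memory backend stands in for where a Postgres implementation would go.

```python
import json
from typing import Optional, Protocol


class StorageBackend(Protocol):
    """Hypothetical storage interface. A Postgres backend would implement the same two methods."""

    def put(self, table: str, key: str, value: str) -> None: ...

    def get(self, table: str, key: str) -> Optional[str]: ...


class InMemoryBackend:
    """Stand-in backend so the sketch runs without a database."""

    def __init__(self) -> None:
        self._rows = {}

    def put(self, table: str, key: str, value: str) -> None:
        self._rows[(table, key)] = value

    def get(self, table: str, key: str) -> Optional[str]:
        return self._rows.get((table, key))


class DynamoStyleAdapter:
    """Accepts DynamoDB-shaped requests and delegates storage to a backend."""

    def __init__(self, backend: StorageBackend, key_attrs: dict) -> None:
        self._backend = backend
        self._key_attrs = key_attrs  # table name -> partition key attribute name

    def _key_string(self, table: str, attrs: dict) -> str:
        # Serialize the typed key value, e.g. {"S": "user#1"}, deterministically.
        return json.dumps(attrs[self._key_attrs[table]], sort_keys=True)

    def put_item(self, TableName: str, Item: dict) -> dict:
        key = self._key_string(TableName, Item)
        self._backend.put(TableName, key, json.dumps(Item))
        return {}

    def get_item(self, TableName: str, Key: dict) -> dict:
        raw = self._backend.get(TableName, self._key_string(TableName, Key))
        # Like DynamoDB, omit "Item" from the response when nothing is found.
        return {"Item": json.loads(raw)} if raw is not None else {}


db = DynamoStyleAdapter(InMemoryBackend(), {"users": "pk"})
db.put_item(TableName="users", Item={"pk": {"S": "user#1"}, "name": {"S": "Ada"}})
print(db.get_item(TableName="users", Key={"pk": {"S": "user#1"}}))
```

The calling code only ever sees DynamoDB-shaped requests and responses, so swapping the backend doesn't touch application code. That's the property that makes local, CI, and on-prem runs interesting.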
Aurora DSQL stayed busy, picking up JSONB support with compression on by default, so you can store semi-structured config and API parameters next to your relational data and let DSQL compress the larger payloads for you. Over in search, the next generation of Amazon OpenSearch Serverless went GA, and the headline feature is scale-to-zero. There's a proper deep-dive on the AWS blog that leans into the agentic AI angle with instant resource creation and Vercel and Kiro integrations, and OpenSearch Serverless also added Agentic Search on top. Scale-to-zero is the big one for me. Vector and search backends that scale to zero change the math on a whole category of side projects and low-traffic workloads that previously couldn't justify the always-on cost.
A small but welcome bit of housekeeping: AWS is standardizing retry behavior across all SDKs and tools. The change splits backoff into two strategies, a fast 50ms for transient errors and a slower 1000ms for throttling, which is a more sensible default than treating every failure the same way. It becomes the default in November 2026, but you can opt in today with AWS_NEW_RETRIES_2026=true. If you've ever hand-tuned retry configs to stop hammering a throttled service, this is the kind of quiet fix that saves you from rediscovering the same lesson on the next project.
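To make the two-strategy split concrete, here's a minimal sketch. Only the 50ms and 1000ms base delays come from the announcement. The exponential growth, the full jitter, the 20-second cap, and the function names are all my assumptions, so don't read this as the SDKs' actual algorithm.

```python
import random

# Base delays from the announcement. Everything else here is an assumption.
BASE_DELAY_MS = {"transient": 50, "throttling": 1000}
MAX_DELAY_MS = 20_000  # hypothetical cap, not from the announcement


def backoff_ceiling_ms(error_kind: str, attempt: int) -> int:
    """Upper bound on the delay before retry number `attempt` (1-based)."""
    base = BASE_DELAY_MS[error_kind]
    return min(MAX_DELAY_MS, base * 2 ** (attempt - 1))


def backoff_delay_ms(error_kind: str, attempt: int, rng=random) -> float:
    """Pick a random delay between 0 and the ceiling (full jitter)."""
    return rng.uniform(0, backoff_ceiling_ms(error_kind, attempt))


for attempt in range(1, 5):
    print(
        attempt,
        backoff_ceiling_ms("transient", attempt),
        backoff_ceiling_ms("throttling", attempt),
    )
```

Under these assumptions the ceilings for the first four retries are 50, 100, 200, and 400ms for transient errors and 1000, 2000, 4000, and 8000ms for throttling. A throttled service gets real breathing room while a blip gets retried almost immediately.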
There was plenty more from AWS over the past few weeks. FinOps Agent went into preview, answering cost questions and surfacing optimization opportunities out of Cost Optimization Hub and Compute Optimizer. Cognito added multi-Region replication as an add-on for Essentials and Plus tier user pools, syncing identities to a standby Region so you can redirect traffic during a regional disruption. And AWS named four new Heroes for May 2026, with serverless and AI/ML leaders from Italy, Canada, and Argentina. Congratulations to all of them. The community is better for the work you do.
One last thing from me. I pushed an update to data-api-client, my DocumentClient-style wrapper for the Amazon Aurora Serverless Data API. If you're working with the Data API and want the familiar parameter-mapping ergonomics instead of the raw request format, give it a look.
Vector Storage Costs: S3, OpenSearch, pgvector, Pinecone by Darryl Ruggles
Darryl built a full cost model and benchmark harness comparing S3 Vectors, OpenSearch Serverless NextGen, Aurora pgvector, and Pinecone, including how the May 2026 scale-to-zero launch shifts the comparison. There's a calculator to find the crossover point for your own workload shape, which is exactly the kind of thing you want before you pick a vector store and regret it later.
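For a feel of what a crossover calculation looks like, here's a deliberately tiny sketch. It is not Darryl's model. It compares one flat monthly price against a pay-per-query price plus storage, and every number in it is a placeholder, not a real AWS or Pinecone rate.

```python
def monthly_cost_always_on(fixed_usd: float) -> float:
    """An always-on backend costs the same no matter how little you use it."""
    return fixed_usd


def monthly_cost_scale_to_zero(queries: int, usd_per_1k_queries: float, storage_usd: float) -> float:
    """A scale-to-zero backend pays for storage plus whatever queries actually run."""
    return storage_usd + queries / 1000 * usd_per_1k_queries


def crossover_queries(fixed_usd: float, usd_per_1k_queries: float, storage_usd: float) -> float:
    """Monthly query volume at which both options cost the same."""
    return (fixed_usd - storage_usd) / usd_per_1k_queries * 1000


# Placeholder prices, chosen only to keep the arithmetic easy to follow.
FIXED_USD = 350.0
USD_PER_1K_QUERIES = 0.5
STORAGE_USD = 10.0

breakeven = crossover_queries(FIXED_USD, USD_PER_1K_QUERIES, STORAGE_USD)
print(f"Break-even at about {breakeven:,.0f} queries per month")
```

Below the break-even volume the scale-to-zero option wins, and above it the flat price does. A real comparison has more terms (indexing, data transfer, minimum capacity), which is why a proper calculator like Darryl's is worth using.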
AI Changed How We Build. Our Tools Didn't. by Ran Isenberg
Ran walks through the gap between AI-driven development and the tooling we still use to manage it. IDEs, GitHub, Jira, and sprint planning were all built for a world where humans wrote the code, and they haven't caught up to one where agents write and engineers mostly review. He's got a companion piece on adapting the engineer's job that gets into burnout risk and rising token costs too.
AI enthusiasts are in a race against time, AI skeptics are in a race against entropy by Charity Majors
Charity uses Fin's productivity gains as a case study and lands on a point that's easy to lose in the hype: the wins came from engineering discipline and fast feedback loops, not from AI being magic. If you're trying to bridge the gap between the true believers and the people rolling their eyes on your team, this is a good framework.
Running an AI-native engineering org
Anthropic's engineering team shares how their process changed once every commit became Claude-assisted, including the move from six-month roadmaps to far more fluid planning. The bit about going past "who changed this" to "what information do I actually need" is the part worth sitting with.
Lessons from building Claude Code: How we use skills
The Claude Code team breaks down nine skill types they use internally, from library reference to verification to scaffolding, plus the practices that make a skill actually work. If you're building anything with skills, this is required reading.
Using Claude Code: The unreasonable effectiveness of HTML by Thariq Shihipar
Anthropic makes the case that HTML beats Markdown for AI output because of its density and interactivity, with examples spanning richer docs, code reviews, and throwaway custom editors. I said it last issue and I'll say it again: I'm sold on the HTML move.
Your Agent Loops are Hungrier Than You Think by Michael Walmsley
Michael lays out why agentic loops burn tokens quadratically: every turn replays the full conversation history, so turn 20 is paying for turns 1 through 19. He backs it with real token counts from actual scenarios, and if you've been surprised by an agent bill, this explains where it went.
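The arithmetic behind that claim fits in a few lines. This sketch assumes every turn adds the same number of tokens and that nothing is cached or summarized, which is a simplification for illustration and not Michael's exact model. The function name is hypothetical.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens across a loop that replays the full history on every turn."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn  # this turn's new content joins the history
        total += history            # the whole history is sent as input again
    return total


print(cumulative_input_tokens(20, 1000))  # 210000
print(cumulative_input_tokens(40, 1000))  # 820000
```

Doubling the number of turns from 20 to 40 roughly quadruples the input tokens, because the total grows as turns × (turns + 1) / 2 times the per-turn size.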
The Claude Cowork product guide
Anthropic's guide to Claude Cowork, their desktop knowledge-work agent, covers local file access, Slack and Google Drive integration, when to reach for it over other Claude tools, and seven worked examples. A useful orientation if you're trying to figure out where Cowork fits.
Codex is becoming a productivity tool for everyone
OpenAI shared usage data putting Codex at 5 million weekly active users, with knowledge workers growing three times faster than developers. The use cases have spread well past code into reports, spreadsheets, presentations, and analysis. The line between "coding tool" and "work tool" keeps getting blurrier.
AI Memory Systems Explained: From Retrieval to Durable, Context-Aware Agents by Jeremy Daly
This is mine. It's a deep architectural walkthrough of how to move from basic RAG to a production-grade memory system, covering five memory types (policy, preference, fact, episodic, trace), how their storage patterns differ, hybrid retrieval, and why you need a memory manager controlling what gets stored and retrieved while keeping governance and privacy intact. If memory has been the fuzzy part of your agent design, this should sharpen it up.
Building TypeScript agents with Strands | Serverless Office Hours
Erik walks through the Strands Agents TypeScript SDK for building agents on AWS, including agents that run in Node.js and the browser, connecting multiple model providers, and orchestrating multi-agent workflows.
Building with Claude: Lessons from real projects | Serverless Office Hours
Ran Isenberg joins Julian Wood to talk through practical Claude Code workflows in serverless development: custom skills, configuration strategies, and context management. Worth watching if you're still figuring out how these tools fit your process.
AI-assisted development in practice | Serverless Office Hours
Darryl Ruggles builds a full serverless blogging platform with AI coding tools and is honest about what works (MCP servers for Terraform and AWS docs), what breaks, and how to keep security and best practices intact when you let AI write your infrastructure.
Serverless Craic Ep86 AI and Software Development - the Real Problem
The Serverless Edge crew makes the case that AI amplifies both good and bad engineering practices, with a discussion that wanders through platform engineering, cognitive load, and socio-technical systems.
Serverless CrAIc Ep85 Why Team Topologies Matters More Than Ever in the AI Era
The crew asks whether AI agents count as team members and what that does to cognitive load, working through how organizational frameworks bend when code generation speeds up but human collaboration stays the bottleneck.
AWS Bites #154: S3 Files
Eoin and Luciano dig into S3 Files, explaining why S3 was never really a file system (no atomic renames, expensive listings, immutable objects) and how this service bridges the gap, with benchmark data and a frank look at the 60-second write-back delay and eventual consistency.
Claude Fable 5 review: what the new Mythos model gets right (and very wrong)
Claire Vo reviews Anthropic's first generally available Mythos-class model and the launches around it, including Managed Agents and safety classifiers, testing it on product specs and multi-agent orchestration. A grounded look at a model that's getting a lot of breathless coverage everywhere else.
A rational conversation on where AI is actually going | Benedict Evans
Benedict Evans argues foundation models won't hold lasting pricing power and that value moves up the stack, with distribution becoming the real moat now that software is cheap to build. A nice counterweight to the model-vendor news in this issue.
The AI paradox: More automation, more humans, more work | Dan Shipper
Dan Shipper draws on running Every to argue that work is moving inside AI agents, that SaaS is thriving rather than dying because agents drive more usage, and that roles like PM are getting more leverage from AI tooling.
DynamoSQL™ — ANSI SQL for Amazon DynamoDB
DynamoSQL is a SQL query engine for DynamoDB that supports JOINs, CTEs, aggregations, and subqueries, with no pipelines or ETL required. It's in beta with early access through AWS Marketplace and offers MCP integration for AI applications.
I Built pretext-pdf: Serverless PDFs Without Chromium by Himanshu Jain
Himanshu built pretext-pdf, a Node.js library that generates PDFs from JSON without Chromium, aimed at structured documents like invoices and reports with 40-100ms generation times. If you've ever wrestled a headless Chromium into a Lambda just to make a PDF, this is a lighter path.
Introducing Open-Source Skills for AWS SDK Best Practices by David Yaffe
AWS released open-source skills for their Agent Toolkit to improve how AI coding agents generate SDK code, currently for Swift, JavaScript v3, and Python (Boto3), targeting common mistakes like wrong API names, bad parameter types, and missed paginators.
That $65 billion raise is the smoke rising from the Anthropic and OpenAI IPO talk, with their valuations looking shakier than the headlines suggest once you do the math on token economics. Burning compute to win benchmarks is one thing. Making the unit economics work when customers actually use the product is another, and that's where the recent billing changes come in. Anthropic pulling claude -p out of what your Max subscription covers and the GitHub Copilot billing changes are already having a real effect on how people use these tools. The tokenmaxing that let everyone ship slop faster is getting expensive, and maybe that's (kind of) a good thing.
It forces discipline, which is the thread running through several pieces in this issue. Charity Majors makes the case that the AI productivity wins came from engineering discipline and tight feedback loops, not magic. Ran Isenberg points out that our tools were built for humans writing code and are straining under agents doing it. Both are circling the same idea: the teams that come out ahead won't be the ones with the most tokens, they'll be the ones with the most discipline. If that discipline doesn't show up, we're all in trouble.
The model you use this year will be obsolete by next. The patterns you build around storage, cost, and testing will outlast all of them.
See you next week,
Jeremy
I hope you enjoyed this newsletter. We're always looking for ideas and feedback to make it better and more inclusive, so please feel free to reach out to me via Bluesky, LinkedIn, X, or email.
Jeremy is the founder of Ampt, a Cloud & AI consultant, and an AWS Serverless Hero who has a soft spot for helping people solve problems using the cloud. You can find him ranting about serverless, cloud, and AI on Bluesky, LinkedIn, X, and at conferences around the world.
Off-by-none is committed to celebrating the diversity of the serverless community and recognizing the people who make it awesome. If you know of someone doing amazing things with serverless, please nominate them to be a Serverless Star ⭐️!