June 23, 2026
In our previous issue, the US government ordered Anthropic to pull Fable 5 and Mythos 5, AWS WAF started charging AI bots for content, and Bedrock added Grok, Gemma, and a pair of GPTs. This week, AWS Summit NYC unsurprisingly goes heavy on AI agents, Lambda picks up MicroVMs for isolated sandboxes, and AWS Blocks leans into IfC. Plus, we've got lots of amazing cloud, serverless, and AI content from the community.
Infrastructure FROM Code is back, baby! AWS announced AWS Blocks, an open-source TypeScript framework for composing application backends without wrestling with the underlying infrastructure tooling, and if the concept sounds familiar, it should. It's very close to the ideas we built into Ampt, and the thinking behind it is still as powerful as ever. You write application code, Blocks infers the services it needs, runs locally with built-in auth, and deploys to production AWS with no code changes. Seeing AWS adopt and lean into IfC development this directly is amazing to watch. It's early and still in preview, but it looks very promising and could be a really nice companion for lots of different projects.
Most of what follows came out of the AWS Summit in New York, and the recap is the fastest way to see the full slate in one place. The headliners are worth taking one at a time.
The one I keep coming back to is Lambda MicroVMs, a new compute primitive built on Firecracker that gives you VM-level isolation with near-instant launch and resume. You get stateful sessions that persist memory and disk for up to eight hours, full lifecycle control to launch, suspend, resume, and terminate, and support for HTTP/2, gRPC, and WebSockets. The AWS blog walks through the mechanics: you supply a Dockerfile and a code artifact, Lambda builds a Firecracker snapshot with your app already initialized, and each user session gets its own environment with up to 16 vCPUs and 32 GB of memory. The pricing isn't great (you still pay for wall time 🤦), but this is one of the most interesting things to happen to Lambda in a while.
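To make the lifecycle piece concrete, here is a minimal sketch of the four operations (launch, suspend, resume, terminate) modeled as a state machine. This is not the Lambda API: the state names, the `nextState` helper, and the transition table itself are my assumptions about how those operations would reasonably relate to each other.

```typescript
// Hypothetical model of a MicroVM session lifecycle; not the Lambda API.
type SessionState = "new" | "running" | "suspended" | "terminated";
type LifecycleAction = "launch" | "suspend" | "resume" | "terminate";

// Assumed transition table: which action is allowed from which state.
const transitions: Record<
  SessionState,
  Partial<Record<LifecycleAction, SessionState>>
> = {
  new: { launch: "running" },
  running: { suspend: "suspended", terminate: "terminated" },
  suspended: { resume: "running", terminate: "terminated" },
  terminated: {}, // assumed final: nothing is allowed after terminate
};

// Return the state that follows `action`, or throw if the action is not
// allowed from the current state.
function nextState(state: SessionState, action: LifecycleAction): SessionState {
  const target = transitions[state][action];
  if (target === undefined) {
    throw new Error(`Cannot ${action} a session that is ${state}`);
  }
  return target;
}
```

Keeping the transitions in a table makes the illegal moves (resuming a terminated session, suspending one that never launched) fail loudly instead of silently.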
Amazon Bedrock Managed Knowledge Base is now generally available, a fully managed take on RAG with six native connectors (S3, SharePoint, Confluence, Google Drive, OneDrive, and a web crawler), managed vector storage, hybrid search, and an agentic retriever that can break multi-hop queries into step-by-step plans. The launch post has the details, including the Smart Parsing pass for content optimization. If you've been hand-rolling RAG plumbing, this collapses a lot of it into a managed service.
Bedrock AgentCore also had a big week. The AgentCore harness is now GA, a config-based path to deploying agents with managed runtime, memory strategies, multi-model support across Bedrock, OpenAI, and Gemini, and Step Functions integration, with an export path to custom code when you outgrow it (AWS frames it as going from idea to production agent in two API calls). Web Search shipped as a GA feature in US East (N. Virginia), giving agents real-time web access through Amazon's own index without bolting on an external provider. There are two solid reads on it, one from Channy and one on the ML blog. Just watch out for that $7-per-1,000-queries pricing. 😬 Guardrails in policy went GA for evaluating agent actions and blocking things like prompt injection, with policies written in natural language or code. And AgentCore Memory added cross-account access, which sounds dull until you're trying to share memory in a multi-tenant setup. If you want a bundled view, the broader knowledge and continuous learning post ties the memory, web search, and paid-content pieces together.
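For a sense of what that Web Search rate means in practice, here is a back-of-the-envelope sketch. It assumes a flat, linear $7 per 1,000 queries with no free tier or volume discount, and the workload numbers are made up for illustration.

```typescript
// Assumed flat rate taken from the $7-per-1,000-queries figure; no free tier
// or volume discount is modeled because none is stated here.
const USD_PER_1000_QUERIES = 7;

// Estimate the Web Search bill in US dollars for a given number of queries.
function webSearchCostUsd(queries: number): number {
  if (!Number.isInteger(queries) || queries < 0) {
    throw new RangeError("queries must be a non-negative integer");
  }
  return (queries / 1000) * USD_PER_1000_QUERIES;
}

// Hypothetical workload: 500 agent sessions a day, 4 searches each, 30 days.
const monthlyQueries = 500 * 4 * 30; // 60,000 queries
const monthlyCost = webSearchCostUsd(monthlyQueries); // 420 (USD)
```

An agent that searches in a loop multiplies that per-session number quickly, which is why the rate deserves a look before you turn it on.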
Adjacent to that, Bedrock Guardrails picked up a new API aimed at agentic workflows. It runs in detect-only mode and returns numeric severity and confidence scores, so you set your own thresholds for blocking, retrying, or just logging at each step. It also hooks into agent frameworks through lifecycle hooks without making you stand up guardrail resources first. Sandeep Singh's walkthrough of the InvokeGuardrailChecks API is a useful guide if you want to dig deeper.
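Here is a rough sketch of what "set your own thresholds" could look like on the calling side. Everything in it is an assumption for illustration: the `CheckResult` shape, the 0-to-1 score scale, and the choice to multiply severity by confidence are mine, and the real InvokeGuardrailChecks response may look quite different.

```typescript
// Hypothetical shapes; the real InvokeGuardrailChecks response may differ.
interface CheckResult {
  check: string; // illustrative name, e.g. "prompt_injection"
  severity: number; // assumed normalized to 0..1
  confidence: number; // assumed normalized to 0..1
}

type Decision = "block" | "retry" | "log" | "allow";

interface Thresholds {
  block: number;
  retry: number;
  log: number;
}

// Combine severity and confidence into one score (an assumed heuristic),
// then apply the thresholds from strictest to most lenient.
function decide(result: CheckResult, t: Thresholds): Decision {
  const score = result.severity * result.confidence;
  if (score >= t.block) return "block";
  if (score >= t.retry) return "retry";
  if (score >= t.log) return "log";
  return "allow";
}

// When a step returns several checks, the strictest decision wins.
const rank: Record<Decision, number> = { allow: 0, log: 1, retry: 2, block: 3 };

function decideAll(results: CheckResult[], t: Thresholds): Decision {
  return results
    .map((r) => decide(r, t))
    .reduce<Decision>((worst, d) => (rank[d] > rank[worst] ? d : worst), "allow");
}
```

Because the service only detects, the policy lives in your code, so a low-stakes summarization step and a step that executes tool calls can pass different `Thresholds` to the same function.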
On the storage and data side, S3 Vectors got two upgrades. It now returns up to 10,000 results per query instead of 100, with pagination so you can start processing the first page right away, and it cut query charges by up to 80% on large indexes (10M+ vectors), automatically across regions. S3 also added annotations, up to 1 GB of mutable metadata per object that surfaces automatically as queryable Iceberg tables, built for agents that need to understand data context without a human in the loop. That same theme runs through AWS Context, which maps enterprise data relationships into knowledge graphs agents can query, extending the same technology already powering QuickSight.
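The pagination change is the part that affects how you write the calling code. Here is a generic sketch of consuming a paged query so the first page can be processed before the rest are fetched. The `Page` shape, the `nextToken` field, and `fakeQuery` are hypothetical stand-ins, not the S3 Vectors SDK.

```typescript
// Hypothetical page shape and fetch function; not the S3 Vectors SDK.
interface Page<T> {
  items: T[];
  nextToken: string | undefined; // undefined when there are no more pages
}

type FetchPage<T> = (token?: string) => Promise<Page<T>>;

// Yield results one page at a time so callers can act on the first page
// before later pages have been requested.
async function* paginate<T>(fetchPage: FetchPage<T>): AsyncGenerator<T[]> {
  let token: string | undefined = undefined;
  do {
    const page: Page<T> = await fetchPage(token);
    yield page.items;
    token = page.nextToken;
  } while (token !== undefined);
}

// In-memory stand-in for a query backend that returns `pageSize` results
// per page and uses the next start offset as its token.
function fakeQuery(all: string[], pageSize: number): FetchPage<string> {
  return async (token) => {
    const start = token === undefined ? 0 : Number(token);
    const end = start + pageSize;
    return {
      items: all.slice(start, end),
      nextToken: end < all.length ? String(end) : undefined,
    };
  };
}
```

A caller would loop with `for await (const page of paginate(fetchPage))` and can `break` early once it has enough results, which avoids paying for pages it never reads.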
For the boring-but-useful column, Amazon ECS added faster service auto scaling with 20-second high-resolution metrics across Fargate and EC2. Channy's breakdown shows scale-out dropping from over six minutes to under 90 seconds, and you can swap awkward step-scaling policies for target tracking. If you've been overprovisioning to cover slow reactions, this is your chance to right-size.
Two more preview tools from AWS push on the generated code angle. AWS DevOps Agent added release management, which runs readiness reviews, validates infrastructure against Well-Architected practices, and generates and runs tests in isolated environments before production (the blog has the workflow). And AWS Transform shipped continuous modernization, autonomously scanning repos to find and prioritize tech debt and opening remediation PRs, with GitHub, GitLab, and Bitbucket support (the AWS post covers end-of-life dependency detection and the Security Agent tie-in). Both are pointed squarely at the flood of AI-generated code that still needs reviewing and maintaining.
Outside of AWS, Anthropic shipped a batch of updates. Claude Design now stays on brand with design-system imports from GitHub or design files, bidirectional sync with Code, and direct canvas editing. Claude Code picked up artifacts in beta for Team and Enterprise, generating shareable visual pages like incident timelines and PR walkthroughs from your codebase and conversation context. And Claude rolled out centrally managed authorization for MCP connectors using the Enterprise-Managed Authorization extension, so admins can shorten access-token lifetimes and a deprovisioned user's connector access expires fast instead of lingering. Elsewhere, Cloudflare introduced temporary accounts for AI agents that let an agent deploy a Worker with wrangler deploy --temporary and no signup, live for 60 minutes and claimable afterward, and Vercel Functions can now run up to 30 minutes for Pro and Enterprise teams on Node.js and Python, aimed at LLM reasoning, AI streaming, and document processing.
I built an event-driven order system with both ECS and Lambda. Here's why. by Suleiman Abdulkadir
Nice walkthrough of mixing ECS and Lambda instead of forcing everything into one compute model. The saga pattern with EventBridge is probably how I'd build it too, though fifteen services for an order system is a lot of surface area to operate.
IaC Isn't Dying. AI Makes it More Important - DevOps.com by Jonah Kowall
The argument that IaC becomes your system of record for non-deterministic AI output is exactly right. If agents are generating infrastructure, you need something deterministic to reconcile against, and as of right now, that's still some form of IaC.
Why I Ripped AgentCore and Strands Out of Production by Anderson Carvalho
This is the anti-framework story I keep seeing lately: the agent SDKs pile on abstraction you don't need until you suddenly do. Swapping back to Lambda-per-customer with direct Bedrock calls won't fit everyone, but matching complexity to the actual workload is the right instinct.
What's new in Strands Agents | Serverless Office Hours
Julian Wood and the team run through the Strands updates, and Evals 1.0 is the one worth your time. Pre-production testing is the agent gap nobody has filled well yet.
The Great AI Reality Check Has Begun
The core point holds: generating code was never the hard part of software engineering, and 2025 drove that home. The "Doorman Fallacy" applies perfectly here, because the gap between code generation and shipping real systems gets really wide without the right people guiding it.
MIT Just Revealed the AI Bubble's Fatal Flaw
The title oversells it, but the breakdown of who actually has the compute and data to compete is a useful gut check. Worth a watch if you want a clearer read on the economics underneath all the model announcements. AI will remain incredibly useful, but I don't think there's a moat that will sustain these valuations.
HazyBeacon Abuses AWS Lambda Function URLs for Stealthy Command-and-Control Operations
HazyBeacon uses stolen IAM credentials to stand up Lambda Function URLs as command-and-control channels that blend right into trusted AWS traffic. The takeaway is about identity governance and egress monitoring, since the technique leans on credential theft rather than any flaw in Lambda itself.
June 29 - July 2, 2026 - AI Engineer World's Fair 2026: San Francisco 🗣️ (I'll be there!)
For as long as we've been shipping to the cloud, there's been a wall between the code that does the work and the code that describes where it runs. You write a function, then you go write the CloudFormation, the Terraform, the CDK stack, or the SAM template that tells the cloud how to host it. Two artifacts, two mental models, kept in sync by hand and by hope. Infrastructure as Code was a real step forward because it made that second artifact deterministic and reviewable, and I'm not here to argue against determinism. You want a system of record that says exactly what's running and why.
What AWS Blocks gets right is that the determinism doesn't have to live in a separate file. It can sit right next to the code that uses it. That's the same instinct behind the annotations Wing was doing, and the approaches others like Encore and Nitric have taken, where you declare what you need inside your application code and let the framework work out the provisioning. Watching AWS lean into that idea, smartly using TypeScript's integrated type safety, with a clean local-to-production story, is a good sign for where this is heading.
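If the idea is new to you, here is a toy sketch of what "declare what you need inside your application code" means mechanically. Every name in it is hypothetical; this is not the AWS Blocks, Ampt, Wing, Encore, or Nitric API, just the smallest version of the pattern I could write down.

```typescript
// Toy illustration of infrastructure-from-code; all names are hypothetical.
type ResourceKind = "bucket" | "queue" | "table";

interface ResourceSpec {
  kind: ResourceKind;
  name: string;
}

class App {
  private readonly declared = new Map<string, ResourceSpec>();

  // Application code calls this where it needs a resource; the framework
  // records the declaration instead of requiring a separate template.
  use(kind: ResourceKind, name: string): ResourceSpec {
    const key = `${kind}:${name}`;
    let spec = this.declared.get(key);
    if (spec === undefined) {
      spec = { kind, name };
      this.declared.set(key, spec);
    }
    return spec;
  }

  // Derive a provisioning plan from what the code declared. Sorting keeps
  // the output stable, so the plan stays deterministic and reviewable.
  plan(): ResourceSpec[] {
    return Array.from(this.declared.values()).sort((a, b) =>
      `${a.kind}:${a.name}`.localeCompare(`${b.kind}:${b.name}`)
    );
  }
}

const app = new App();
app.use("bucket", "uploads");
app.use("queue", "jobs");
app.use("bucket", "uploads"); // declaring the same resource twice is a no-op
```

Calling `app.plan()` here yields two entries, the `uploads` bucket and the `jobs` queue. The determinism still exists; it is just derived from the application code rather than maintained in a second file.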
Blocks doesn't go as far as I'd like, and that gap is exactly the part I've spent the last five-plus years on at Ampt. Mapping code to generated infrastructure is the easy half. The harder and more valuable half is making the infrastructure itself adaptable, a living thing that responds as the code and its usage patterns change, rather than a fixed target you have to keep reconciling and reshaping by hand. Blocks lines up closely with what we call Productized Patterns at Ampt, and that adaptability is the direction I keep wanting more of.
This matters more now than it did two years ago, because rapid code generation is the new normal. When code gets written this fast and a lot of it is throwaway, the old contract is backwards. Asking freshly generated code to also provision the infrastructure to run itself puts the burden in the wrong place. Flip it around: let the code be produced, and let the infrastructure figure out the best way to run it. That's a far better loop for testing and prototyping, and you can take the next step to harden it without locking yourself into something as rigid as a hand-maintained IaC stack.
The conversation being back on the table, with AWS in it, is the real story here. Blocks is early and it's still in preview, but the principle underneath it is the one worth betting on: code you can throw away cheaply, and infrastructure that adapts to keep up. If the codegen tools are going to keep producing at this pace, that's the model that lets you move fast without leaving a pile of brittle stacks behind you.
See you next week,
Jeremy
I hope you enjoyed this newsletter. We're always looking for ideas and feedback to make it better and more inclusive, so please feel free to reach out to me via Bluesky, LinkedIn, X, or email.
Jeremy is the founder of Ampt, a Cloud & AI consultant, and an AWS Serverless Hero who has a soft spot for helping people solve problems using the cloud. You can find him ranting about serverless, cloud, and AI on Bluesky, LinkedIn, X, and at conferences around the world.
Off-by-none is committed to celebrating the diversity of the serverless community and recognizing the people who make it awesome. If you know of someone doing amazing things with serverless, please nominate them to be a Serverless Star ⭐️!