Off-by-none: Issue #365

May 12, 2026

Valkey 9 Unlocks Hybrid Search on ElastiCache πŸ”

In our previous issue, Amazon Bedrock crossed the final frontier of hosted frontier models, AI agents started buying domain names, and Amazon Q Developer got a one-way ticket to the AWS graveyard. This week, Claude Platform sets up shop on AWS, ElastiCache learns to do full-text and hybrid search, and Ampt rolls out Node.js 24 as the default runtime. Plus, we've got plenty of awesome cloud, serverless, and AI content from the community.

News & Announcements

Anthropic and AWS got even closer this week. Anthropic introduced the Claude Platform on AWS, which sits alongside Claude on Bedrock as a second, distinct way to use Claude inside your AWS account. The split is worth understanding: Claude Platform is Anthropic-operated with data processed outside AWS, while Claude on Bedrock keeps data inside the AWS boundary. Claude Platform on AWS is now generally available across 18 regions with direct access to Anthropic's APIs, console, Managed Agents, web search, and prompt caching, all billed through AWS Marketplace. AWS has its own post on the launch explaining the IAM and Marketplace plumbing. The short version: enterprises that want full Anthropic-native features without leaving their AWS account just got a much cleaner deployment path.

AgentCore also had a heck of a week. AgentCore Runtime now supports bring-your-own file system from S3 and EFS, letting you mount durable storage directly at agent runtime paths instead of bolting on file access through tools. AgentCore Memory now supports metadata for long-term memory with up to ten indexed keys that can be set manually or inferred by an LLM, making retrieval over long-term memory actually targetable instead of a vector similarity guessing game. And in the "what could possibly go wrong" category, Bedrock AgentCore Payments launched in preview (read the official announcement blog), built with Coinbase and Stripe and using the x402 protocol to let agents pay for APIs, MCP servers, and web content in stablecoins. So agents now have file systems, memory with metadata, and a wallet. πŸ”₯
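To make the metadata-for-memory idea concrete, here's a minimal, purely illustrative Python sketch of the retrieval pattern it enables: filter long-term memory records on indexed metadata keys first, then rank the survivors by vector similarity. This is not the AgentCore API — the class and function names are hypothetical, and only the ten-key limit comes from the announcement.

```python
# Conceptual sketch (NOT the AgentCore API): memory records carry indexed
# metadata, and retrieval filters on metadata *before* vector similarity,
# so lookups are targeted rather than a pure similarity guessing game.
from dataclasses import dataclass, field
from math import sqrt

MAX_METADATA_KEYS = 10  # mirrors the ten-indexed-keys limit in the announcement

@dataclass
class MemoryRecord:
    text: str
    embedding: list[float]
    metadata: dict[str, str] = field(default_factory=dict)

    def __post_init__(self):
        if len(self.metadata) > MAX_METADATA_KEYS:
            raise ValueError("too many metadata keys")

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(records, query_vec, filters=None, k=3):
    # Metadata filter first: only records matching every key/value survive.
    if filters:
        records = [r for r in records
                   if all(r.metadata.get(key) == val for key, val in filters.items())]
    # Then rank the survivors by vector similarity.
    return sorted(records, key=lambda r: cosine(r.embedding, query_vec), reverse=True)[:k]
```

The design point is the ordering: an exact metadata match (set manually or inferred by an LLM) narrows the candidate set before any fuzzy ranking happens, which is what makes retrieval "targetable."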

On the agent tooling side, AWS announced the Agent Toolkit for AWS, a managed suite of pre-validated skills for AI coding agents covering application development, data analytics, and AgentCore, with IAM guardrails baked in. Also, the AWS MCP Server is generally available, now with IAM context keys, a sandboxed Python execution tool, and better token efficiency. AWS is trying really hard to be the default platform for AI coding agents. Giving devs an opinionated, authenticated entry point seems like the smart play, but AWS doesn't have the same head start they did with serverless.

It was a big week for ElastiCache as well. Valkey turned two, with Docker pulls up 17x year over year and adoption across the major clouds, which is a pretty good trajectory given that it started as a Redis fork barely 24 months ago. They also announced the release of Valkey 9.0 for Amazon ElastiCache, which brings built-in search, hash field expiration, and multi-database support in cluster mode. The headline features got their own announcements: ElastiCache now supports real-time full-text, exact-match, and numeric range search, hybrid search combining vector similarity and full-text, and real-time aggregations, all at microsecond latency and across all regions at no extra cost. Chaitanya Nuthalapati has a walkthrough of building search and recommendation engines on top of it with full code, and there's a separate post on the aggregations specifically. ElastiCache is turning into a serious AI workload backend, and it might also be the serverless full-text search service we've been waiting for.
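If "hybrid search" is a new term for you, here's what it means in plain Python rather than Valkey commands: blend a lexical (full-text) score with a vector-similarity score per document and rank by the weighted sum. This is a toy sketch of the concept, not how Valkey implements it — the scoring functions and the `alpha` weighting are illustrative assumptions.

```python
# Toy hybrid search: weighted blend of a lexical score and a vector score.
from math import sqrt

def lexical_score(query: str, doc: str) -> float:
    # Fraction of query terms present in the document (a crude full-text score).
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query, query_vec, docs, alpha=0.5, k=2):
    # alpha=1.0 is pure full-text search, alpha=0.0 is pure vector search.
    scored = [(alpha * lexical_score(query, text)
               + (1 - alpha) * cosine(query_vec, vec), text)
              for text, vec in docs]
    return [text for score, text in sorted(scored, reverse=True)[:k]]
```

The appeal of doing this inside the cache is that you keep one ranked result set without round-tripping between a search cluster and a vector store.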

For AWS SAM users, two nice quality-of-life updates: SAM now natively supports WebSocket APIs for API Gateway, auto-generating routes, integrations, and IAM permissions from your template, and SAM CLI 1.159.0 added BuildKit support for Lambda container images, bringing multi-stage builds, better caching, cross-architecture builds, and Docker secrets to the workflow. It seems like these updates should have shipped years ago, but I'm glad to see them land.

In other Anthropic news, Claude Code got agent view, a centralized UI for managing multiple coding sessions in parallel without juggling terminal tabs. If you've been doing this manually with tmux and worktrees, this is going to save you some major pain. Anthropic also rolled out Claude integrations across Excel, PowerPoint, Word, and Outlook, with Excel, PowerPoint, and Word now GA and Outlook in public beta. Context follows you across apps, and enterprises get OpenTelemetry logging and Analytics API access for governance. And Claude Managed Agents picked up "dreaming," outcomes, and multiagent orchestration, with outcomes being a rubric-based eval system showing up to 10-point improvements on hard tasks. Netflix and Wisedocs are already shipping with it.

Ampt now supports Node.js 24 as the default runtime, bringing Web Streams, URLPattern, iterator helpers, and a pile of features that used to require third-party npm packages.

Finally, Cloudflare is laying off over 1,100 employees, which they're framing as a reorganization for the AI era rather than cost-cutting. The severance package is genuinely good (full base pay through end of 2026 and accelerated equity vesting), but the framing is doing a lot of work. "Reorganization for the AI era" is becoming the corporate euphemism of the decade.

Reads

Notes from Code with Claude 2026 by Chris Ebert
Chris pulls together the announcements that mattered from Code with Claude 2026: the SpaceX compute deal, Multiagent Orchestration, and Dreaming inside Managed Agents. The context window observations are the most useful part for anyone actually shipping agents right now.

AWS Lambda Is Dead. The $0.20 Was Never the Price
The author migrated 47 Lambda functions to Cloudflare Workers and dropped their monthly bill from $8,362 to $1,790, with most of the savings coming from the orchestration tax (API Gateway, CloudWatch, NAT, egress) rather than Lambda itself. He's right that the bundle is where the real money goes, and the August 2025 INIT billing change is worth knowing about. But the workloads he's describing (HTTP APIs, webhooks, auth, edge functions waiting on a database) were never the shape Lambda was built for. Lambda's actual sweet spot is async event-driven work that needs to fan out to thousands of concurrent executions for seconds at a time, not synchronous request/response paths burning wall clock waiting on Postgres. High-volume systems need to be designed for the runtime you're putting them on. Putting a sync API behind API Gateway and a NAT'd Lambda and then complaining about the bundle is a design problem dressed up as a pricing problem. Workers is a better fit for that workload, and he should use it. Just don't declare the tool dead because it was the wrong one for the job.
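The two monthly totals are worth a quick sanity check. Only the before/after figures come from the post; the claim that the "orchestration tax" dominates is the author's, not a breakdown we can verify from the outside.

```python
# Sanity math on the article's numbers: $8,362/mo before, $1,790/mo after.
before, after = 8362, 1790
savings = before - after        # dollars saved per month
pct = savings / before          # fraction of the bill eliminated
```

That works out to $6,572 a month, roughly a 79% reduction, which tells you the bulk of the original bill was never Lambda compute in the first place.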

Rethinking Distributed Systems for Serverless Performance and Reliability by Aaron Davidson, Roland FΓ€ustlin, and Zach Williams
Databricks walks through how their serverless Spark platform works, including Spark Connect that decouples apps from clusters, a Serverless Gateway that does the routing, and an autoscaler that earns its name. Using serverless to take 4-5 hour jobs down to 20 minutes is the kind of number that makes the architectural decisions worth reading about.

Podcasts, Videos, and more

How serverless experts build with AI today | Serverless Office Hours
Mark Sailes joins Julian Wood to share how serverless experts built Study from Experts, a focused video learning platform for AWS professionals.

Beyond the Basics: Production Serverless Patterns for Extreme Scale β€’ Janak Agarwal β€’ GOTO 2025
Janak digs into Lambda patterns that actually hold up under load, with two grounded examples: rapid scale-out for spiky traffic and real-time financial analytics built on Step Functions Distributed Map. This is the kind of content that should be louder than the "Lambda is dead" takes, because it shows what the architecture is genuinely good at.

Spec-driven development: The AI engineering workflow at Notion | Ryan Nystrom
Claire Vo interviews Ryan Nystrom about how Notion engineers use their internal Boxy system to @mention Codex from comments and get full PRs with screenshots in 20 minutes. The conversation covers practical workflows including configuring subagents, MCP integrations, and the shift toward spec-first development where AI handles implementation.

Final Thoughts πŸ€”

Look at what AWS shipped this week and squint a little. Claude Platform on AWS, Agent Toolkit, and AWS MCP Server GA, plus AgentCore gets durable file systems, metadata for long-term memory, and payments with stablecoin rails. AWS is staking out the substrate layer for the agentic era, and the feature list isn't random.

The bet is straightforward. If agents need compute, identity, storage, memory, payment, and an authenticated way to call services, AWS already has four of those and is shipping the other two as fast as they can write their press releases. The pitch to enterprises is: your agents already run on AWS, your data already lives on AWS, your IAM already governs everything, so why would you run the agent loop anywhere else?

It's a credible play. But the serverless comparison I mentioned earlier is the one worth thinking about. AWS had a multi-year head start with Lambda, and the platform shape was so unfamiliar that competitors took years to even define the category. Agents don't have that property. Cloudflare, Vercel, Modal, Fly, and a dozen smaller platforms are already shipping agent primitives. The Anthropic-AWS deal is notable, but Anthropic will sell its service to anyone willing to buy. Model providers are commodity inputs now. The differentiation has to come from somewhere else.

The substrate fight will be won on governance, observability, and cost controls, not raw capability. Every platform is going to give agents file systems and wallets and OS-level actions. The platform that wins is the one where, when an agent does something dumb or expensive at 3 a.m., you can see exactly what happened, who authorized it, what it cost, and how to stop it from happening again. AWS has decades of muscle memory on that exact problem, which is their edge.

If you're building on any of these primitives, the planning question is no longer "can the agent do this?" It's "when this agent does something I didn't expect, what's my blast radius, and how fast can I close it?" Build for that and the rest takes care of itself.

See you next week,
Jeremy


I hope you enjoyed this newsletter. We're always looking for ideas and feedback to make it better and more inclusive, so please feel free to reach out to me via Bluesky, LinkedIn, X, or email.

Previous Issue

Issue #364 (May 5, 2026)

Sign up for the Newsletter

Stay up to date on using serverless to build modern applications in the cloud. Get insights from experts, product releases, industry happenings, tutorials and much more, every week!


This Week's Top Links

We share a lot of links each week. Check out the Most Popular links from this week's issue as chosen by our email subscribers.


This Week's Sponsor

Check out all of our amazing sponsors and find out how you can help spread the #serverless word by sponsoring an issue.


About the Author

Jeremy is the founder of Ampt, a Cloud & AI consultant, and an AWS Serverless Hero who has a soft spot for helping people solve problems using the cloud. You can find him ranting about serverless, cloud, and AI on Bluesky, LinkedIn, X, and at conferences around the world.


Nominate a Serverless Star

Off-by-none is committed to celebrating the diversity of the serverless community and recognizing the people who make it awesome. If you know of someone doing amazing things with serverless, please nominate them to be a Serverless Star ⭐️!