March 17, 2026
In our previous issue, AWS helped your AI agents go rogue, Anthropic put AI code review SaaS on notice, and GPT-5.4 arrived just in time for a government contract. This week, S3 turns 20, AWS makes agents more stateful and observable, and Claude expands to a 1M token context window. Plus, we've got plenty of awesome content from the cloud, serverless, and AI communities.
Amazon S3 just turned 20 years old. Two decades of storing everything, and somehow still not quite old enough to legally drink in the US. S3 has quietly become one of the most important primitives in all of computing, and AWS is still evolving it. This week, Amazon S3 introduces account regional namespaces for general purpose buckets. No more global bucket name collisions. It took 20 years, but we finally get namespacing that aligns with how people actually build multi-account systems. Progress.
AWS also continues to round out its AgentCore story with three meaningful updates. The runtime now supports stateful MCP server features, memory now supports streaming notifications for long-term memory updates, and the runtime now supports the AG-UI protocol. Taken together, this is AWS pushing toward agents that are more stateful, observable, and interactive instead of just stateless prompt loops duct-taped together.
And speaking of observability, Amazon Bedrock now supports visibility into first token latency and quota consumption. First token latency is one of the most important user experience metrics in AI systems, and having it exposed natively is a big step toward operating these workloads like production systems rather than relying on fragile manual instrumentation.
On the developer tooling side, AWS CDK Mixins is now generally available, making it easier to compose reusable infrastructure patterns without copy-pasting stacks. It doesn't mean I like CDK now, but this should help cut down on all that spaghetti infra code. And AWS Lambda Managed Instances now supports Rust, which feels like a natural fit for high-performance, long-running serverless workloads.
AWS also announced a deeper collaboration with NVIDIA to accelerate AI workloads from pilot to production. Not surprising, but another signal that the real battle isn't just model quality anymore. It's who can operationalize AI at scale.
Over at Anthropic, the context window race continues. 1M context is now generally available for Opus 4.6 and Sonnet 4.6. That's not just a spec bump, it fundamentally changes how you think about retrieval, memory, and application design. Alongside that, Claude is getting more practical: it can now build interactive visuals directly in your conversation and is continuing to expand its specialized tools for Excel and PowerPoint. Less "chatbot," more "useful coworker." They also announced The Anthropic Institute, which looks like a broader push into research, policy, and long-term AI impact.
Google officially closed its acquisition of Wiz, locking in one of the biggest cloud security deals ever. As AI workloads expand, security is quickly becoming a first-class concern once again, especially in multi-tenant, multi-cloud environments.
And finally, Cloudflare keeps shipping practical improvements at the edge. They showed how RFC 9457-compliant error responses can reduce agent token costs by up to 98%, which is the kind of optimization that actually matters at scale. And AI Security for Apps is now generally available, continuing their push to make the edge a viable place to run and secure AI-powered applications.
AI Didn't Wait for Security. Now What?
Ran Isenberg breaks down why blocking AI tools fails and walks through a framework for governing AI adoption with centralized brokers, sandboxed environments, and skill catalogs. The AWS Kiro outage serves as the cautionary tale for what happens when AI inherits elevated permissions without proper isolation.
Operationalizing Agentic AI Part 1: A Stakeholder's Guide
Nav Bhasin from AWS Generative AI Innovation Center outlines four criteria for identifying agent-appropriate work: clear boundaries, judgment across tools, measurable success, and safe failure modes. Part one of a series aimed at helping enterprises move from AI investment to actual execution.
Agentic AI in the Enterprise Part 2: Guidance by Persona
Part 2 of AWS's enterprise agentic AI series provides role-specific guidance for P&L owners, CTOs, and CISOs. Nav Bhasin walks through the practical concerns each persona should focus on, from KPIs and architecture decisions to security models and compliance frameworks. Surprisingly concrete for an AWS blog post.
I'm Building Agents That Run While I Sleep
Abhishek Ray built an open-source Claude Code skill that uses headless browser agents to verify AI-generated code against acceptance criteria you write upfront. The system includes pre-flight checks (pure bash, no LLM) and a planner that figures out how to test your specs with Playwright.
What AWS Actually Shipped in the Last 12 Months (Non-AI Edition)
Tobias Schmidt takes a comprehensive look at AWS releases over the past 12 months, filtering out the AI hype to focus on serverless and infrastructure improvements.
Skills vs MCP – Why Your AI Needs an Orchestration Layer
Alex Moening explores how MCP's tool proliferation degrades AI performance through context overload. The data shows 20-50% accuracy drops as context grows, and just 7 MCP servers consumed 34% of Claude Code's token budget. He proposes a skills layer that routes intents to workflows, loading tools only when needed.
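The routing idea is simple enough to sketch. Everything below is hypothetical (the `SkillRouter` name and API are mine, not Moening's), but it captures the core move: map intents to workflows and defer tool loading until dispatch, so unused tool schemas never touch the model's context.

```python
from typing import Callable, Dict


class SkillRouter:
    """Route an intent to a workflow, loading its tools only on demand.

    Instead of preloading every MCP tool schema into the model's
    context window, each skill registers a loader that is only
    called when that intent is actually dispatched.
    """

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[[], dict]] = {}

    def register(self, intent: str, tool_loader: Callable[[], dict]) -> None:
        self._skills[intent] = tool_loader

    def dispatch(self, intent: str) -> dict:
        if intent not in self._skills:
            raise KeyError(f"no skill registered for intent: {intent}")
        return self._skills[intent]()  # tools materialize here, not at startup


router = SkillRouter()
router.register("summarize_invoice",
                lambda: {"tools": ["pdf_reader", "ocr"]})
print(router.dispatch("summarize_invoice"))
```

The payoff is that context cost scales with the task at hand rather than with the total number of connected servers.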
Slop Creep: The Great Enshittification of Software
Boris Tane introduces "slop creep" as the gradual degradation of codebases through individually reasonable but collectively destructive decisions. He explores how AI coding agents accelerate this problem by lacking holistic system understanding, and proposes tight planning with code snippets rather than autonomous implementation.
Serverless CrAIc Ep82: AI Is Changing Software Engineering – Why Your North Star Matters
The Serverless Craic team covers the North Star framework, the difference between leading and lagging metrics, and how AI tools are removing friction from development cycles. They discuss the increasingly popular view that product-oriented engineering practices matter even more when you can ship features in hours instead of months.
Serverless resilience: A practitioner's guide | Serverless Office Hours
Ben Freiberg and Marco Jahn walk through battle-tested patterns for resilient serverless systems, covering cell-based architectures, deployment trade-offs, and multi-region failover strategies. The focus is on moving beyond platform defaults to implement proven architectural strategies that minimize downtime when failures occur.
Claude Code in Action
Anthropic has released a comprehensive training course covering Claude Code fundamentals, from architecture and tool integration to MCP server extensions. The course targets developers looking to incorporate AI assistance into their existing workflows, with modules on context management and version control integration.
This week on How I AI: From Figma to Claude Code and back & From journalist to iOS developer
Gui Seiz and Alex Kern from Figma show how to pull a live interface from production, staging, or localhost into Figma, turn it into editable design frames, explore variations collaboratively, and push changes back into code using Claude Code and MCPs.
Chaos Engineering for AWS Lambda: failure-lambda 1.0
Gunnar Grosch announces failure-lambda 1.0, a chaos engineering tool rewritten in TypeScript with AWS SDK v3. The new version adds timeout and corruption failure modes, plus a Lambda Layer that enables fault injection across any managed runtime without code modifications.
FieldTrip ā Search every field across your schemas
From the creator of EventCatalog, David Boyne introduces FieldTrip, an open-source CLI tool that discovers and indexes schema files (OpenAPI, AsyncAPI, Protobuf, Avro, JSON Schema) in your codebase. It provides three visualization modes: a searchable table, a heatmap showing which properties appear in which schemas, and a force-directed graph of schema relationships.
March 26, 2026 - AI Codecon: Software Craftsmanship in the Age of AI by O'Reilly Media, Inc.
Somehow all of my "free time" lately has turned out to be anything but. It's been busy… but also incredibly productive. The kind of productive where ideas are actually turning into things, not just scribbled in a note on my reMarkable waiting for the right moment.
I've also got a new blog post dropping tomorrow: "The Convergence Problem: Rethinking the 2028 Global Intelligence Crisis." It digs into the idea that AI is making automation trivial, but without what I'm calling productive imperfection, we risk optimizing everything into sameness. When everything is equally good, differentiation starts to disappear, and that has some pretty interesting implications for how we build products, companies, and markets.
On the experimentation side, I've been continuing my deep dive on memory systems for agents, and this is where things are getting really interesting. When agents only remember what they need, when they need it, their behavior becomes surprisingly constrained and repeatable. Not brittle. Not random. Actually predictable in a way that feels much closer to designing systems than prompting models.
It's not about larger context windows; it's about controlling what not to remember, when to retrieve, and how to evolve state over time. That shift changes everything. I'll be writing a lot more about this soon.
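As a toy illustration of that shift (entirely hypothetical, not any real memory system), the key decisions live in two tiny functions: what gets stored at all, and what gets retrieved for a given task.

```python
from dataclasses import dataclass, field


@dataclass
class ScopedMemory:
    """Toy sketch: store only tagged facts, retrieve only on demand.

    The interesting behavior comes from what is *rejected* at write
    time and *filtered* at read time, not from how much is kept.
    """
    entries: list = field(default_factory=list)

    def remember(self, fact: str, tags: set[str]) -> None:
        # Deciding what NOT to remember: untagged facts are dropped.
        if tags:
            self.entries.append((fact, tags))

    def retrieve(self, tag: str) -> list[str]:
        # Retrieval is scoped to the task's tag, never "everything."
        return [fact for fact, tags in self.entries if tag in tags]


mem = ScopedMemory()
mem.remember("customer prefers email", {"contact"})
mem.remember("random chatter", set())  # discarded at write time
print(mem.retrieve("contact"))         # ['customer prefers email']
```

Real agent memory is obviously far more involved, but even this caricature shows why constrained recall makes behavior more repeatable: the model only ever sees the slice of state the current task asks for.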
See you next week,
Jeremy
I hope you enjoyed this newsletter. We're always looking for ideas and feedback to make it better and more inclusive, so please feel free to reach out to me via Bluesky, LinkedIn, X, or email.
Stay up to date on using serverless to build modern applications in the cloud. Get insights from experts, product releases, industry happenings, tutorials and much more, every week!
We share a lot of links each week. Check out the Most Popular links from this week's issue as chosen by our email subscribers.
Check out all of our amazing sponsors and find out how you can help spread the #serverless word by sponsoring an issue.
Jeremy is the founder of Ampt, a Cloud & AI consultant, and an AWS Serverless Hero who has a soft spot for helping people solve problems using the cloud. You can find him ranting about serverless, cloud, and AI on Bluesky, LinkedIn, X, and at conferences around the world.
Off-by-none is committed to celebrating the diversity of the serverless community and recognizing the people who make it awesome. If you know of someone doing amazing things with serverless, please nominate them to be a Serverless Star ⭐️!