How does reinforcement learning tuning work?

RLAIF (reinforcement learning from AI feedback) uses AI models as judges instead of humans to provide feedback during training. Where traditional RLHF relies on human raters, an LLM evaluates each response and assigns a score against criteria such as helpfulness or accuracy. This makes it possible to fine-tune models for domain-specific tasks without expensive human labellers. AWS describes how Nova models use contextual feedback for alignment at scale.
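The core loop can be sketched as a reward function: the judge model receives the prompt, the candidate response, and the scoring criteria, and its numeric score becomes the reward used for RL updates. This is a minimal sketch, not any vendor's implementation; `stub_judge`, `build_judge_prompt`, and `ai_feedback_reward` are hypothetical names, and a real system would call an actual LLM API where the stub stands in.

```python
def build_judge_prompt(prompt: str, response: str, criteria: list[str]) -> str:
    """Assemble the evaluation prompt sent to the AI judge."""
    crit = ", ".join(criteria)
    return (
        f"Rate the response on {crit} from 1 to 10. "
        f"Reply with a single integer.\n\n"
        f"Prompt: {prompt}\nResponse: {response}\nScore:"
    )

def ai_feedback_reward(prompt, response, judge,
                       criteria=("helpfulness", "accuracy")):
    """Query the judge and map its 1-10 score to a reward in [0, 1]."""
    raw = judge(build_judge_prompt(prompt, response, list(criteria)))
    try:
        score = int(raw.strip())
    except ValueError:
        return 0.0  # unparseable judge output yields no reward
    return max(1, min(10, score)) / 10.0  # clamp, then normalize

# Stub standing in for a real LLM API call (hypothetical).
def stub_judge(judge_prompt: str) -> str:
    return "8"

reward = ai_feedback_reward("Explain RLAIF.", "RLAIF uses an AI judge...",
                            stub_judge)
print(reward)  # 0.8
```

In practice the reward from the judge feeds a policy-optimization step (e.g. PPO), exactly where a human preference score would sit in RLHF.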