#AI Agents
50 articles with this tag
Databricks Genie Tames Wild Maintenance Reports
Databricks Genie AI agents are transforming solar and wind maintenance by turning unstructured PDF reports into a queryable data layer for advanced analytics.

AI Agents Get Dumber With More Context, Expert Warns
Nupur Sharma of Qodo explains how too much context can hinder AI agents, leading to the 'lost in the middle' problem, and discusses solutions like context engines and hybrid orchestration.

Cloudflare's Sunil Pai & Matt Carrie on Eval++ Compute Primitive
Cloudflare's Sunil Pai & Matt Carrie unveil Eval++, a new compute primitive for building durable, scalable, and low-latency AI agents.

OpenAI's Lee Spacagna on Operationalizing AI Workflows
Lee Spacagna from OpenAI demonstrates how AI agents can be built and operationalized to automate tasks and multiply workforce impact in financial services.

AI Evals: Broken But Essential, Use Them Anyway
Ara Khan of Cline argues that AI evaluations, though flawed, are crucial, outlining common pitfalls and a process for iterative improvement that emphasizes honesty and nuanced assessment.

CrewAI: Taming AI Agent Costs
CrewAI outlines strategies to combat rising AI agent costs by optimizing token spend through orchestration and infrastructure controls.

AI Agents Running Businesses: Andon Labs on Project Vend
Andon Labs' Lukas Petersson and Axel Backlund discuss Project Vend, an experiment using AI agents to run a simulated vending business, exploring LLM capabilities and challenges.

Benchmarking AI Agents: Snorkel AI's Vincent Chen Explains
Vincent Chen from Snorkel AI explores the art and science of benchmarking AI agents, detailing the complexities and methodologies involved in evaluation.

GitHub Universe Enters the Agentic Era
GitHub Universe 2026 gears up for the agentic era, focusing on practical AI integration for developers.

Conductor CEO on AI Agents and Workflow Optimization
Conductor CEO Charlie Holtz discusses how his team orchestrates AI agents, emphasizing "slot-free zones," strategic model selection, and the iterative process of building effective AI workflows.

Endava Bets on AI Agents for Software Delivery
Endava is revolutionizing software delivery by embedding OpenAI's AI agents across its entire workflow, transforming how enterprises build and deploy technology.

Claude Code Benchmarking: Semantic Search vs. Grep
Turbopuffer's Kuba Rogut benchmarks semantic code retrieval on Claude Code, revealing how semantic search enhances AI agent precision and efficiency compared to grep.

Lassie Secures $35M Series A
a16z leads $35 million Series A for Lassie, an AI company automating administrative tasks for small businesses, starting with dental practices.

Nvidia's RTX Spark: AI Agents and the Future of PCs
Nvidia's new RTX Spark chip aims to redefine PCs by enabling on-device AI agents for complex tasks, promising a new era of computing power and creative potential.

Steven Willmott on Spec-Driven Testing for AI Agents
Steven Willmott of SafeIntelligence discusses spec-driven testing for AI agents, emphasizing the need for clear specifications beyond traditional datasets to ensure robustness and safety.

Sakana AI: Finance Agents Take Shape
Sakana AI is deploying AI agents to revolutionize financial operations, with engineers focusing on practical integration and enterprise-grade reliability.

Nick Nisi on Building Better AI Agents
Nick Nisi of WorkOS discusses how to build better AI agents by focusing on measurement, enforcement, and learning from failures.

Claude Code's Latest Updates
Claude Code rolls out Opus 4.8 as default, introduces dynamic workflows, security plugins, and performance enhancements for developers.

Google DeepMind Explains AI Agent Building Struggles
Philipp Schmid from Google DeepMind explains the core challenges senior engineers face when building AI agents, contrasting traditional engineering with agentic development.

Neo4j's Zach Blumenfeld on AI Agents and Decision Traces
Neo4j's Zach Blumenfeld explains why AI agents need decision traces and how context graphs, powered by Neo4j, can provide the necessary memory and reasoning capabilities for more accurate and accountable AI.

Claude's Corner: Salus (YC W2026), The Bouncer Your AI Agents Desperately Need
AI agents are confidently doing the wrong thing at scale. Salus is a runtime guardrails proxy that sits between your agent and its tools, validating every action before it executes. Here's what they built, how it works, and whether you could clone it.

Agent vs. Traditional Observability: Braintrust's Phil Hetzel Explains
Phil Hetzel of Braintrust discusses the fundamental differences between traditional observability and the specialized needs of AI agent evaluation.

OpenAI Agents SDK: Building with Model-Native Harnesses
OpenAI's latest Build Hour session dives into the updated Agents SDK, showcasing new features like a Codex-style harness and enhanced sandboxing capabilities.

Enterprise AI Agents: The Scale-Up Playbook
Enterprise leaders are finding success in scaling AI agents by embedding governance, orchestrating complex workflows, and empowering their workforce.

Anthropic Debuts Claude Opus 4.8
Anthropic unveils Claude Opus 4.8, boosting AI performance with new features like 'effort control' and 'dynamic workflows' for complex coding.

Neo4j: Context Graphs for AI Agents
Neo4j experts Andreas Kollegger and Zaid Zaim discuss how context graphs enhance AI agents for explainable and decision-aware operations.

AI Agents: Building Enterprise Guardians
Onyx Security CEO Maxim Bar Kogan discusses the critical need for AI agent security and governance in enterprises, highlighting the risks and solutions.

Databricks Genie Sparks Media Personalization
Databricks Genie uses AI to let media execs ask complex questions of their data in natural language, speeding up personalization and product development.

Angus McLean on Bounded Autonomy in AI
Angus J. McLean of Oliver discusses 'Bounded Autonomy' in AI, exploring the shift to agentic processes in advertising and offering practical advice for building AI agents.

Databricks Genie: Partner AI Solutions Emerge
Databricks partners are launching industry-specific conversational AI solutions built on Databricks Genie, democratizing data access and accelerating AI-driven decisions.

Rust: The Ideal Language for Vibe-Coding?
Daniel Szoke from Sentry argues that Rust's strict constraints make it ideal for AI agentic coding, turning compile errors into valuable debugging feedback.

AI Agents Are Rewriting Commerce
AI agents are rewriting the rules of commerce, forcing brands to adapt or risk becoming invisible to consumers and their digital assistants.

Warp Bets on Open Source with GPT-5.5
Warp is betting on OpenAI's GPT-5.5 to power its open-source development strategy, using AI agents for coding and humans for oversight.

Robinhood Embraces AI Agents for Trading, Spending
Robinhood launches Agentic Trading and Agentic Credit Card, allowing AI agents to manage investments and spending with user-controlled safety features.

AI's Boring Revenue Play: Compliance
AI is transforming compliance from a costly, manual burden into a strategic revenue driver, leveraging advanced technology to navigate complex regulations.

Stop Babysitting AI Agents: Build a Context Engine
Brandon Walsenuk from Unblocked discusses the critical need for context engines to empower AI agents, moving beyond simple data access to true understanding and autonomous operation.

The 4 Types of AI Agent Memory Explained
IBM Master Inventor Martin Keen details the four essential memory types AI agents need: working, semantic, procedural, and episodic.

Does GenAI Belong to Data Scientists?
Phil Hetzel of Braintrust discusses the evolving role of data scientists in Generative AI agent development, arguing for a collaborative, multidisciplinary approach.

DeepMind's Scale: How Agents Run at Google
Google DeepMind's KP Sawhney and Ian Ballantyne reveal how they run AI agents at scale, discussing the architecture, tools, and challenges involved in managing complex automated tasks.

AI's Paradox: More Automation, More Work
Dan Shipper, CEO of EVERY, discusses the AI paradox: how increased automation may lead to more human work, and how AI agents will reshape workflows by becoming integrated into our daily tools.

Rachel Nabors: The Infinite Canvas of the Web Agent
Rachel Nabors discusses how AI agents can leverage the web's 'infinite canvas' using MCP tools and WebMCP, transforming browser interactions.

Databricks Adds OpenTelemetry Tracing
Databricks integrates OpenTelemetry tracing directly into Unity Catalog, offering governed, cost-effective observability for AI agents and simplifying telemetry pipelines.

Sally Ann O'Malley on OpenClaw in Containers
Sally Ann O'Malley from Red Hat discusses how OpenClaw agents can be containerized for reproducible, secure, and portable AI development from local machines to Kubernetes.

AI Agents Supercharge GPU Kernel Development
LinkedIn is leveraging AI agents to automate complex GPU kernel engineering for its Liger Kernel project, accelerating AI model performance.

LinkedIn's AI Memory Platform
LinkedIn's Cognitive Memory Agent (CMA) provides AI agents with context and memory for personalized, adaptive experiences, starting with its Hiring Assistant.

Daytona's Ivan Burazin on AI Agents, Growth, and Agent Cloud
Daytona CEO Ivan Burazin discusses the company's 74% MoM growth, 850K daily runs, and the critical need for specialized compute infrastructure for AI agents.

ClickUp's 22% cut comes with $1M salary bands. Evans calls it the 100x org.
ClickUp CEO Zeb Evans announced a 22% headcount cut and the introduction of $1M cash salary bands, framed not as cost-cutting but as a restructure around what he calls the 100x organization. His diagnosis of how AI is reshaping engineering org design is sharper than most peer SaaS messaging.

Liam Hampton on VS Code AI Agents
Liam Hampton from Microsoft explores how VS Code is becoming a central hub for AI agents, demonstrating customization options and practical workflows for local, background, and cloud agents.

Microsoft's small AI agents get smarter
Microsoft Research unveils MagenticLite, an AI system using smaller models for efficient browser and file system tasks, pushing agentic AI capabilities on user hardware.

CQ Exchange: Shared AI Agent Knowledge
Mozilla.ai launches CQ Exchange, a centralized platform for AI agents to share and access experience-driven knowledge, moving beyond local storage.