#Machine Learning
50 articles with this tag

IBM's Tejas Kumar on 'AI Harnesses'
IBM's Tejas Kumar explains the concept of AI harnesses, detailing their types (Eval and Agent) and key components like tools, models, context management, and guardrails.

Lawrence Jones on Fighting AI with AI
Lawrence Jones of incident.io discusses how AI can be used to debug and manage complex AI systems, highlighting the importance of structured data and automated analysis pipelines.

Unified Embodied AI: Pelican-Unified 1.0
Pelican-Unified 1.0, the first unified embodied foundation model, achieves SOTA performance by integrating VLM, reasoning, and generation, proving unification enhances rather than compromises specialist strengths.

PipelineIQ: Databricks' AI for Sales Action
Databricks' PipelineIQ uses AI to turn messy CRM data into actionable sales guidance, focusing on 'Next Best Actions' rather than flawed forecasting.

Supabase's Pedro Rodrigues on AI Agents and Context
Pedro Rodrigues from Supabase discusses how 'Skills' and the MCP framework improve AI agent context and performance. Learn key principles for building effective product skills.

Anthropic, Gates Foundation Ink $200M AI Deal
Anthropic and the Gates Foundation are launching a $200 million, four-year AI partnership to improve global health, education, and economic mobility using Claude.

From Record Keeper to Intelligence Hub
The traditional system of record is giving way to a system of intelligence, powered by AI agents that orchestrate data and actions.

OpenAI Build Hour Dives into GPT-Realtime-2 Capabilities
OpenAI's Build Hour showcased GPT-Realtime-2, detailing advancements in voice AI for real-time translation, speech-to-text, and conversational agents, with demos and customer spotlights.

Joanna Stern Embeds AI in Her Life
Tech journalist Joanna Stern discusses integrating AI into everyday life with Atlassian's Sherif Mansour, highlighting human skills like curiosity and articulation.

Data Quality: The AI Strategy
NYU Langone Health demonstrates how prioritizing data quality at the source is the cornerstone of any successful AI strategy, driving real-world value in healthcare.

Claroty's AI library decodes industrial devices
Claroty's new AI library uses a multi-agent system on Databricks to solve the critical identity crisis in industrial devices, improving security accuracy.

Databricks Unifies Clinical Data
Databricks' new open-source Site Feasibility Workbench brings clinical trial intelligence onto its Lakehouse, tackling data silos and improving auditability.

Hugging Face: Agents Train Models with New Skills
Merve Noyan from Hugging Face explains how agents can now train models and utilize new skills to interact with the Hugging Face Hub, enhancing AI development workflows.

Microsoft's GridSFM: AI for the Power Grid
Microsoft's new GridSFM AI model drastically speeds up power grid analysis, promising efficiency gains and cost savings.

Building an AI Chess Coach: Take Take Take
Anant Dole and Asbjorn Steinskog discuss building an AI chess coach, the limitations of LLMs in chess, and their eval framework.

Sports Data Gets Smarter
Databricks aims to unify sports data with its Lakehouse platform, turning player tracking and biomechanical data into actionable insights for competitive advantage.

NVIDIA Touts Codex GPT-5.5 Gains
NVIDIA is integrating OpenAI's Codex, powered by GPT-5.5 and running on its own hardware, to accelerate complex engineering and research tasks.

Codex AI Automates Complex Computer Tasks
Codex AI demonstrates advanced capabilities, automating complex tasks across applications by interacting with computer interfaces.

OpenAI's "Parameter Golf" Reveals AI's Role
OpenAI's "Parameter Golf" competition revealed how AI coding agents are transforming machine learning research, pushing innovation under tight constraints.

DataMaster: Autonomous Data Engineering
DataMaster pioneers autonomous data engineering, unlocking significant ML gains by optimizing data pipelines rather than algorithms, as shown on MLE-Bench Lite and PostTrainBench.

Beyond Benchmarks: A New Intelligence Metric
A new Generalized Turing Test framework formalizes intelligence via indistinguishability, offering a dataset-agnostic and empirically validated hierarchy of AI capabilities.

Healthcare AI's Trust Deficit
Achieving trustworthy AI in healthcare demands a robust data foundation, prioritizing transparency, human oversight, and built-in governance over mere algorithmic advancements.

Vincent Koc on Adaptive AI Evaluation
Vincent Koc of Comet ML discusses the limitations of static AI evaluation and the shift towards adaptive, intent-based methods for measuring AI agents.

Microsoft's MatterSim accelerates material discovery
Microsoft's MatterSim AI platform achieves experimental validation, faster simulations, and introduces a powerful multi-task model for advanced material discovery.

AI Archives: Water Data Gets Searchable
Databricks uses multimodal AI to turn Sudan's scanned water archives into a searchable database for critical groundwater discovery.

MLX Genmedia: Prince Canuma on On-Device AI
Prince Canuma of MLX Genmedia discusses the power of on-device AI, showcasing how MLX enables efficient deployment of AI models on Apple Silicon devices for vision and audio tasks.

SAP, Snowflake Streamline AI with Zero-Copy Data
SAP and Snowflake launch zero-copy integration, enabling enterprises to unify critical business data for advanced AI applications without duplication.

Predictive vs. Generative AI: Key Differences Explained
IBM's Martin Keen clarifies the distinction between predictive AI (forecasting outcomes) and generative AI (creating new content), outlining their core mechanics and use cases.

Predictive Quality: Beyond Defect Detection
Manufacturers are moving beyond catching defects to predicting them with Databricks Genie, transforming quality control and reducing scrap.

MemAlign MLflow Bridges AI Judge Gap
Databricks' MemAlign framework in MLflow significantly improves AI judges' accuracy in evaluating machine learning code, bridging the gap with human experts.

Superhuman Hits 200K QPS With Databricks
Superhuman and Databricks engineers collaborated to build an AI inference platform serving over 200K QPS with sub-second latency.

Gosset AI: Drug Discovery Precision Leap
Gosset AI platform outperforms frontier LLMs in niche drug discovery by 3.2x, demonstrating the power of curated data over generic web search for R&D.

LLMs Slash Neural Architecture Search Costs
Delta-Code Generation uses LLMs to produce compact architecture refinements, dramatically cutting costs and improving NAS efficiency.

AI Validates Physical Simulations
AI CFD Scientist introduces vision-based validation for computational fluid dynamics, achieving autonomous discovery and ensuring physical realism where prior AI agents failed.

Transformers Conquer Computer Vision
Isaac Robinson from Roboflow explains how Transformers, once confined to NLP, have revolutionized computer vision, surpassing CNNs through massive pre-training and architectural innovation.

HR's AI Overload Solution
HR is drowning in capacity gaps. A phased AI adoption strategy, powered by Databricks and MathCo, offers a path to transformation.

Black Forest Labs: FLUX and the Future of Visual AI
Stephen Batifol of Black Forest Labs discusses FLUX, the company's visual AI model, and the future of generative AI with a focus on real-time generation and world models.

Databricks Adds Real-Time Data to AI Agents
Databricks' MCP Marketplace now equips AI agents with live external data, enabling more sophisticated, real-time decision-making.

Claude's Corner: Synthetic Sciences — AI Co-Scientists Running Research End-to-End
Synthetic Sciences (YC W2026) built an AI platform that runs the full research loop (literature reviews, GPU training, experiment analysis, and LaTeX paper drafts) while scientists sleep. Here's what they built, how it works, and whether you can replicate it.

Telecom Churn Models Miss the Mark
Telecom churn prediction models often identify customers too late for effective intervention, creating a 'Velocity Problem' that Databricks Genie aims to solve.

GitHub Cuts Agentic Workflow Costs
GitHub implements new strategies to cut token costs in its automated agentic workflows by enhancing logging and optimizing tool usage.

Snowflake: Data Foundation Fuels AI in Healthcare
Snowflake highlights how robust data foundations are crucial for unlocking AI's potential in healthcare and public sectors, enabling accuracy, governance, and operational improvements.

Matt Pocock: Engineering Fundamentals Still Crucial in AI
Matt Pocock, author of 'AI Hero', emphasizes that engineering fundamentals are more crucial than ever for building robust AI systems.

TestGorilla Bets on AI for Fairer Hiring
TestGorilla is harnessing AI and skills-based assessments on Snowflake to revolutionize recruitment, aiming for fairer evaluations amidst an AI-driven hiring market.

OpenAI's New Voice API Models
OpenAI introduces GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper to its API, enhancing voice intelligence for developers.

Reviewing AI's Code Contributions
AI-generated code is flooding pull requests. Learn how to spot the hidden technical debt and subtle bugs these agents introduce.

Data Bottleneck Slows AI Security Detection
Data access issues are slowing down AI-powered security threat detection, a problem Databricks aims to solve with its new AI agent.

Raindrop: Mastering Agent Observability
Raindrop's Danny Gollapalli and Ben Hylak discuss agent observability, the limitations of traditional testing, and the importance of signals for building reliable AI.

Databricks Unveils AI Upskilling Subscription
Databricks launches Databricks Academy Pro, an all-in-one learning subscription to address the enterprise AI skills gap with unlimited training options.

AI Designs Its Own Chips with Ricursive
Ricursive Intelligence's Anna Goldie and Azalia Mirhoseini discuss how AI is revolutionizing chip design, enabling faster and more efficient creation of specialized silicon.