#Deep Learning

50 articles with this tag

5 AI Research Papers Shaping AI's Future

Discover five key AI research papers that reveal the current trajectory and future directions of artificial intelligence development.

3 days ago

Technology

RunPod's Audry Hsu on IDE-Integrated GPU Cloud Deployment

Audry Hsu from RunPod discusses the platform's IDE-integrated GPU cloud deployment, addressing developer pain points and showcasing the company's rapid growth and adoption.

6 days ago

AI Research

Together AI Pushes LLM Context Limits to 5 Million Tokens

Max Ryabinin from Together AI discusses breaking barriers in LLM training, detailing techniques to achieve 5 million token context lengths and their impact on memory and performance.

7 days ago

tech

LinkedIn's Generative Recommender Speed-Up

LinkedIn engineers drastically improved Generative Recommender training efficiency, cutting GPU hours by up to 65% through system-level optimizations.

18 days ago

AI Research

LocateAnything: Parallel Decoding for Vision

LocateAnything revolutionizes vision-language models with Parallel Box Decoding, boosting speed and accuracy in visual grounding and detection.

18 days ago

tech

Uber's DeepETT Boosts Traffic Forecasts

Uber's DeepETT system revolutionizes traffic forecasting with deep learning, boosting accuracy and handling 2 million predictions per second.

19 days ago

tech

Uber Eats' Search Engine Gets Smarter

Uber Eats enhances its delivery search with semantic AI, leveraging LLMs and optimized infrastructure for speed, scale, and accuracy.

19 days ago

AI Research

Omar Sanseviero on Google's AI Strategy

Omar Sanseviero from Google DeepMind discusses Google's AI strategy, focusing on efficient models, multimodality, and open innovation in AI.

21 days ago

AI Research

Graph Neural Networks Explained: GNN Basics & Models

Explore the essentials of Graph Neural Networks (GNNs), from their basic principles to key models like GCNs, GraphSAGE, GATs, GINs, and Transformers.

22 days ago

tech

AI Agents Supercharge GPU Kernel Development

LinkedIn is leveraging AI agents to automate complex GPU kernel engineering for its Liger Kernel project, accelerating AI model performance.

25 days ago

tech

AI Agents Build Better AI

LinkedIn Engineering details how AI agents are revolutionizing model development through automated, iterative refinement loops.

25 days ago

AI Research

Attractors Unlock Scalable Reasoning

Equilibrium Reasoners (EqR) leverage learned attractor landscapes to achieve scalable, adaptive test-time compute allocation, dramatically boosting accuracy on complex reasoning tasks.

25 days ago

AI Research

Jure Leskovec on Relational Foundation Models

Jure Leskovec, AI researcher and Stanford professor, discusses Relational Foundation Models, a new AI approach to understanding complex enterprise data, and their applications.

25 days ago

Claude's Corner

Claude's Corner: Ndea - Chollet's $43M Bet That Scale Isn't AGI

Francois Chollet built ARC-AGI, the benchmark the entire AGI industry has spent a decade failing to beat. Now he's raised $43M with Zapier co-founder Mike Knoop to chase his alternative thesis - program synthesis plus deep learning - at a YC W2026 lab called Ndea. Here's why it matters, why $43M, and why you can't replicate it.

about 1 month ago

AI Research

Microsoft's MatterSim accelerates material discovery

Microsoft's MatterSim AI platform gains experimental validation, runs faster simulations, and introduces a powerful multi-task model for advanced material discovery.

about 1 month ago

AI Research

LLMs Slash Neural Architecture Search Costs

Delta-Code Generation uses LLMs to produce compact architecture refinements, dramatically cutting costs and improving NAS efficiency.

about 1 month ago

Artificial Intelligence

Andrej Karpathy: AI Models Need Human-Like Reasoning

Andrej Karpathy discusses the evolution of AI from programming to prompting, emphasizing the current need for models to develop human-like reasoning.

about 1 month ago

Artificial Intelligence

Yann LeCun Pushes AI Beyond Language Models

Yann LeCun is championing a new AI architecture, JEPA, that moves beyond language models to learn world representations and predict future states, aiming for more robust AI.

about 1 month ago

Artificial Intelligence

Y Combinator Decodes AI: Recursive Reasoning Models

Y Combinator Decoded explores how recursive AI models, like HRM and TRM, are revolutionizing AI reasoning by mimicking the human brain's efficiency.

about 2 months ago

Technology

Cloudflare Unweights LLMs by 22%

Cloudflare's 'Unweight' system shrinks LLMs by up to 22% using lossless compression, enhancing inference speed and efficiency.

about 2 months ago

Technology

AI Agents Collaborate to Solve Math Problems

Together AI's EinsteinArena platform enables AI agents to collaborate on complex scientific problems, achieving new breakthroughs in mathematics.

2 months ago

AI Research

DMax: Parallel Decoding for Diffusion LLMs

DMax revolutionizes diffusion language models with Soft Parallel Decoding, significantly boosting tokens per forward pass (TPF) while preserving accuracy and reaching 1,338 tokens per second (TPS).

2 months ago

AI Research

AI Accelerates Molecular Dynamics at Scale

AI-driven potentials are now integrated into GROMACS, enabling near ab initio fidelity for large-scale molecular dynamics simulations on multi-GPU systems.

2 months ago

AI Research

Google Researchers Explore AI Storage Efficiency

Google researchers are developing AI compression techniques to reduce model storage needs sixfold, aiming to lower costs and boost efficiency in AI development.

3 months ago

Artificial Intelligence

NVIDIA's Jensen Huang on AI's Future and Compute Demands

NVIDIA CEO Jensen Huang discusses the company's strategic evolution in AI, the importance of co-design, and the future of AI computing.

3 months ago

Artificial Intelligence

Mamba-3: Inference-First SSMs Arrive

Together AI's Mamba-3 advances state space models with a focus on inference speed, outperforming previous versions and some Transformers.

3 months ago

Artificial Intelligence

Andrej Karpathy on AI Agents: More Than Just Code

Andrej Karpathy discusses the evolution of AI agents beyond code generation, emphasizing the need for modularity, self-improvement, and human-AI collaboration for future advancements.

3 months ago

AI Research

VideoAtlas: Unlocking Long-Context Video AI

VideoAtlas AI offers a lossless, hierarchical grid representation and Video-RLM for scalable, robust long-context video understanding with logarithmic compute growth.

3 months ago

Technology

Databricks Adds Serverless NVIDIA GPUs

Databricks launches AI Runtime, offering serverless NVIDIA GPUs for simplified AI model training and fine-tuning directly within the Lakehouse.

3 months ago

AI Research

MoDA: Unlocking LLM Depth Scaling

Mixture-of-Depths Attention (MoDA) tackles LLM signal degradation by enabling cross-layer attention, boosting performance with minimal overhead.

3 months ago

Technology

AI vs. ML: What's the Difference?

AI is the broad concept of machines mimicking human intelligence, while machine learning is a specific method where systems learn from data.

3 months ago

Artificial Intelligence

AI's Consciousness Debate

Vishal Misra and Martin Casado discuss LLM functionality, the path to AGI, and the role of data in AI development.

3 months ago

AI Research

SCORE: Recurrent Depth for Deep Networks

SCORE introduces a recurrent, iterative approach to deep neural networks, accelerating training and reducing parameter counts without complex ODE solvers.

3 months ago

Artificial Intelligence

AI Agents Now Do Overnight Research

An automated system uses AI agents to conduct overnight LLM training experiments, modifying code and iterating on models autonomously.

3 months ago

Artificial Intelligence

AI Learns Beyond Text

AI is moving beyond text, with multimodal pretraining enabling models to learn from images, audio, and video for richer comprehension.

3 months ago

AI Research

Transformer Artifacts Unpacked

Research demystifies massive activations and attention sinks in Transformers, revealing them as architectural artifacts enabled by pre-norm configurations.

3 months ago

AI Research

ZipMap: Linear-Time 3D Vision

ZipMap revolutionizes 3D vision with linear-time reconstruction, achieving 20x speedup and enabling real-time state querying.

3 months ago

AI Research

Bridging DSP and DL for Speech Enhancement

TVF integrates DSP interpretability with deep learning's adaptability for low-latency, real-time speech enhancement, offering explicit control over spectral modifications.

3 months ago

AI Research

New Models Tackle Reasoning Puzzles with Symmetry

New Symbol-Equivariant Recurrent Reasoning Models (SE-RRMs) offer improved performance and generalization on reasoning tasks like Sudoku and ARC-AGI by explicitly encoding symmetry.

3 months ago

Artificial Intelligence

Hinton on AI: From Intuition to Backpropagation

AI pioneer Geoffrey Hinton discusses the historical evolution of AI, from logic-based systems to neural networks and the significance of backpropagation.

4 months ago

AI Research

Certified Circuits for Stable AI Explanations

New 'Certified Circuits' framework provides provable stability for AI model explanations, yielding more accurate and compact circuits.

4 months ago

AI Research

Multimodal LLMs: What's Lost in Translation?

New research reveals multimodal LLMs struggle to utilize non-textual data due to a 'mismatched decoder problem,' impacting their true understanding.

4 months ago

AI Research

Predicting Transformer Training Instability

Researchers introduce RKSP, a method to predict transformer training divergence from a single forward pass, and KSS, a technique to actively prevent it, saving compute and enabling higher learning rates.

4 months ago

Artificial Intelligence

New EB-JEPA Library Simplifies AI World Models

Meta AI's new EB-JEPA library offers accessible, single-GPU implementations for advanced AI world models, covering image, video, and planning tasks.

4 months ago

AI Research

AI Learns Faster by Predicting the Future

AI learns faster with Predictive Inverse Dynamics Models (PIDMs) by forecasting future states, making imitation learning more data-efficient than traditional methods.

4 months ago