Machine Learning

Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use

May 10, 2025

LLMs have made impressive gains in complex reasoning, primarily through innovations in architecture, scale, and training approaches like RL. RL…

ZeroSearch from Alibaba Uses Reinforcement Learning and Simulated Documents to Teach LLMs Retrieval Without Real-Time Search

May 10, 2025

Large language models are now central to various applications, from coding to academic tutoring and automated assistants. However, a critical…

Huawei Introduces Pangu Ultra MoE: A 718B-Parameter Sparse Language Model Trained Efficiently on Ascend NPUs Using Simulation-Driven Architecture and System-Level Optimization

May 10, 2025

Sparse large language models (LLMs) based on the Mixture of Experts (MoE) framework have gained traction for their ability to…

A Coding Guide to Unlock mem0 Memory for Anthropic Claude Bot: Enabling Context-Rich Conversations

May 10, 2025

In this tutorial, we walk you through setting up a fully functional bot in Google Colab that leverages Anthropic’s Claude…

Enterprise AI Without GPU Burn: Salesforce’s xGen-small Optimizes for Context, Cost, and Privacy

May 10, 2025

Language processing in enterprise environments faces critical challenges as business workflows increasingly depend on synthesising information from diverse sources, including…

Matrix3D: Large Photogrammetry Model All-in-One

May 9, 2025

We present Matrix3D, a unified model that performs several photogrammetry subtasks, including pose estimation, depth prediction, and novel view synthesis…

Elevate marketing intelligence with Amazon Bedrock and LLMs for content creation, sentiment analysis, and campaign performance evaluation

May 9, 2025

In the media and entertainment industry, understanding and predicting the effectiveness of marketing campaigns is crucial for success. Marketing campaigns…

ServiceNow AI Released Apriel-Nemotron-15b-Thinker: A Compact Yet Powerful Reasoning Model Optimized for Enterprise-Scale Deployment and Efficiency

May 9, 2025

AI models today are expected to handle complex tasks such as solving mathematical problems, interpreting logical statements, and assisting with…

Google Redefines Computer Science R&D: A Hybrid Research Model that Merges Innovation with Scalable Engineering

May 9, 2025

Computer science research has evolved into a multidisciplinary effort involving logic, engineering, and data-driven experimentation. With computing systems now deeply…

AI That Teaches Itself: Tsinghua University’s ‘Absolute Zero’ Trains LLMs With Zero External Data

May 9, 2025

LLMs have improved their reasoning capabilities through Reinforcement Learning with Verifiable Rewards (RLVR), which relies on outcome-based feedback rather…

Meta AI Open-Sources LlamaFirewall: A Security Guardrail Tool to Help Build Secure AI Agents

May 9, 2025

As AI agents become more autonomous—capable of writing production code, managing workflows, and interacting with untrusted data sources—their exposure to…

OpenAI Releases Reinforcement Fine-Tuning (RFT) on o4-mini: A Step Forward in Custom Model Optimization

May 9, 2025

OpenAI has launched Reinforcement Fine-Tuning (RFT) on its o4-mini reasoning model, introducing a powerful new technique for tailoring foundation models…

Ming-Lite-Uni: An Open-Source AI Framework Designed to Unify Text and Vision through an Autoregressive Multimodal Structure

May 9, 2025

Multimodal AI is rapidly evolving toward systems that can understand, generate, and respond using multiple data types within a single…

Multimodal LLMs Without Compromise: Researchers from UCLA, UW–Madison, and Adobe Introduce X-Fusion to Add Vision to Frozen Language Models Without Losing Language Capabilities

May 8, 2025

LLMs have made significant strides in language-related tasks such as conversational AI, reasoning, and code generation. However, human communication extends…

How Deutsche Bahn redefines forecasting using Chronos models – Now available on Amazon Bedrock Marketplace

May 8, 2025

This post is co-written with Kilian Zimmerer and Daniel Ringler from Deutsche Bahn. Every day, Deutsche Bahn (DB) moves over…

Google Launches Gemini 2.5 Pro I/O: Outperforms GPT-4 in Coding, Supports Native Video Understanding and Leads WebDev Arena

May 8, 2025

Just ahead of its annual I/O developer conference, Google has released an early preview of Gemini 2.5 Pro (I/O Edition)—a…

Hugging Face Releases nanoVLM: A Pure PyTorch Library to Train a Vision-Language Model from Scratch in 750 Lines of Code

May 8, 2025

In a notable step toward democratizing vision-language model development, Hugging Face has released nanoVLM, a compact and educational PyTorch-based framework…

NVIDIA Open-Sources Open Code Reasoning Models (32B, 14B, 7B)

May 8, 2025

NVIDIA continues to push the boundaries of open AI development by open-sourcing its Open Code Reasoning (OCR) model suite —…

Researchers from Fudan University Introduce Lorsa: A Sparse Attention Mechanism That Recovers Atomic Attention Units Hidden in Transformer Superposition

May 7, 2025

Large Language Models (LLMs) have gained significant attention in recent years, yet understanding their internal mechanisms remains challenging. When examining…

Is Automated Hallucination Detection in LLMs Feasible? A Theoretical and Empirical Investigation

May 7, 2025

Recent advancements in LLMs have significantly improved natural language understanding, reasoning, and generation. These models now excel at diverse tasks…