When we began studying jailbreak evaluations, we found a fascinating paper claiming that you could jailbreak frontier LLMs simply by…
This paper was accepted at the 2nd AI for Math Workshop at ICML 2025. We introduce Boolformer, a Transformer-based model…
In the manufacturing world, valuable insights from service reports often remain underutilized in document storage systems. This post explores how…
This post was written with Zach Heath of Kyruus Health. When health plan members need care, they shouldn’t need a…
Extracting meaningful insights from unstructured data presents significant challenges for many organizations. Meeting recordings, customer interactions, and interviews contain invaluable…
Aligned representations across languages are a desired property in multilingual large language models (mLLMs), as alignment can improve performance in…
The ever-increasing parameter counts of deep learning models necessitate effective compression techniques for deployment on resource-constrained devices. This paper explores…
Every data selection method inherently has a target. In practice, these targets often emerge implicitly through benchmark-driven iteration: researchers develop…
Successful generative AI software as a service (SaaS) systems require a balance between service scalability and cost management. This becomes…
AI-powered speech solutions are transforming contact centers by enabling natural conversations between customers and AI agents, shortening wait times, and…
Generative AI is transforming how businesses deliver personalized experiences across industries, including travel and hospitality. Travel agents are enhancing their…
Amazon Bedrock offers model customization capabilities for customers to tailor versions of foundation models (FMs) to their specific needs through…
Vector embeddings have become essential for modern Retrieval Augmented Generation (RAG) applications, but organizations face significant cost challenges as they…
Evaluating the performance of large language models (LLMs) goes beyond statistical metrics like perplexity or bilingual evaluation understudy (BLEU) scores.…
We introduce two multilingual, multimodal foundation language models that power Apple Intelligence features across Apple devices and services: (i) a…
Organizations are adopting large language models (LLMs), such as DeepSeek R1, to transform business processes, enhance customer experiences, and drive…
This post was written with Ilan Geller, Kamal Mannar, Debasmita Ghosh, and Nakul Aggarwal of Accenture. Video highlights offer a…
AI agents will change how we all work and live. Our AWS CEO, Matt Garman, shared a vision of a…
This post is co-written with Mark Berkeland, Oscar Rodriguez, and Marina Gerzon from Vonage. Voice-based technologies are transforming the way…
This is a guest post co-written with Rahul Ghosh, Sandeep Kumar Veerlapati, Rahmat Khan, and Mudit Chopra from PayU. PayU…