Machine Learning

How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark

July 22, 2025

When we began studying jailbreak evaluations, we found a fascinating paper claiming that you could jailbreak frontier LLMs simply by…

Boolformer: Symbolic Regression of Logic Functions with Transformers

July 22, 2025

This paper was accepted at the 2nd AI for Math Workshop at ICML 2025. We introduce Boolformer, a Transformer-based model…

Machine Learning

Use generative AI in Amazon Bedrock for enhanced recommendation generation in equipment maintenance

July 22, 2025

In the manufacturing world, valuable insights from service reports often remain underutilized in document storage systems. This post explores how…

Machine Learning

Kyruus builds a generative AI provider matching solution on AWS

July 22, 2025

This post was written with Zach Heath of Kyruus Health. When health plan members need care, they shouldn’t need a…

Machine Learning

Build an AI-powered automated summarization system with Amazon Bedrock and Amazon Transcribe using Terraform

July 22, 2025

Extracting meaningful insights from unstructured data presents significant challenges for many organizations. Meeting recordings, customer interactions, and interviews contain invaluable…

Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models

July 22, 2025

Aligned representations across languages are a desired property in multilingual large language models (mLLMs), as alignment can improve performance in…

On Information Geometry and Iterative Optimization in Model Compression: Operator Factorization

July 22, 2025

The ever-increasing parameter counts of deep learning models necessitate effective compression techniques for deployment on resource-constrained devices. This paper explores…

Language Models Improve When Pretraining Data Matches Target Tasks

July 18, 2025

Every data selection method inherently has a target. In practice, these targets often emerge implicitly through benchmark-driven iteration: researchers develop…

Machine Learning

Manage multi-tenant Amazon Bedrock costs using application inference profiles

July 18, 2025

Successful generative AI software as a service (SaaS) systems require a balance between service scalability and cost management. This becomes…

Machine Learning

Deploy a full stack voice AI agent with Amazon Nova Sonic

July 18, 2025

AI-powered speech solutions are transforming contact centers by enabling natural conversations between customers and AI agents, shortening wait times, and…

Machine Learning

Build real-time travel recommendations using AI agents on Amazon Bedrock

July 18, 2025

Generative AI is transforming how businesses deliver personalized experiences across industries, including travel and hospitality. Travel agents are enhancing their…

Machine Learning

Implementing on-demand deployment with customized Amazon Nova models on Amazon Bedrock

July 17, 2025

Amazon Bedrock offers model customization capabilities for customers to tailor versions of foundation models (FMs) to their specific needs through…

Machine Learning

Building cost-effective RAG applications with Amazon Bedrock Knowledge Bases and Amazon S3 Vectors

July 17, 2025

Vector embeddings have become essential for modern Retrieval Augmented Generation (RAG) applications, but organizations face significant cost challenges as they…

Machine Learning

Evaluating generative AI models with Amazon Nova LLM-as-a-Judge on Amazon SageMaker AI

July 17, 2025

Evaluating the performance of large language models (LLMs) goes beyond statistical metrics like perplexity or bilingual evaluation understudy (BLEU) scores.…

Apple Intelligence Foundation Language Models Tech Report 2025

July 17, 2025

We introduce two multilingual, multimodal foundation language models that power Apple Intelligence features across Apple devices and services: (i) a…

Machine Learning

Building enterprise-scale RAG applications with Amazon S3 Vectors and DeepSeek R1 on Amazon SageMaker AI

July 17, 2025

Organizations are adopting large language models (LLMs), such as DeepSeek R1, to transform business processes, enhance customer experiences, and drive…