Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      CodeSOD: A Unique Way to Primary Key

      July 22, 2025

      BrowserStack launches Figma plugin for detecting accessibility issues in design phase

      July 22, 2025

      Parasoft brings agentic AI to service virtualization in latest release

      July 22, 2025

      Node.js vs. Python for Backend: 7 Reasons C-Level Leaders Choose Node.js Talent

      July 21, 2025

      The best CRM software with email marketing in 2025: Expert tested and reviewed

      July 22, 2025

      This multi-port car charger can power 4 gadgets at once – and it’s surprisingly cheap

      July 22, 2025

      I’m a wearables editor and here are the 7 Pixel Watch 4 rumors I’m most curious about

      July 22, 2025

      8 ways I quickly leveled up my Linux skills – and you can too

      July 22, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      The Intersection of Agile and Accessibility – A Series on Designing for Everyone

      July 22, 2025
      Recent

      The Intersection of Agile and Accessibility – A Series on Designing for Everyone

      July 22, 2025

      Zero Trust & Cybersecurity Mesh: Your Org’s Survival Guide

      July 22, 2025

      Execute Ping Commands and Get Back Structured Data in PHP

      July 22, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      A Tomb Raider composer has been jailed — His legacy overshadowed by $75k+ in loan fraud

      July 22, 2025
      Recent

      A Tomb Raider composer has been jailed — His legacy overshadowed by $75k+ in loan fraud

      July 22, 2025

      “I don’t think I changed his mind” — NVIDIA CEO comments on H20 AI GPU sales resuming in China following a meeting with President Trump

      July 22, 2025

      Galaxy Z Fold 7 review: Six years later — Samsung finally cracks the foldable code

      July 22, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Machine Learning»Deploy conversational agents with Vonage and Amazon Nova Sonic

    Deploy conversational agents with Vonage and Amazon Nova Sonic

    July 16, 2025

    This post is co-written with Mark Berkeland, Oscar Rodriguez and Marina Gerzon from Vonage.

    Voice-based technologies are transforming the way businesses engage with customers across customer support, virtual assistants, and intelligent agents. However, creating real-time, expressive, and highly responsive voice interfaces still requires navigating a complex stack of communication protocols, AI models, and media infrastructure. To simplify this process, Vonage has integrated Amazon Nova Sonic, our speech-to-speech foundation model (FM), with the Vonage Voice API, part of their Communications Platform as a Service (CPaaS) offering.

    With this integration, developers can deploy AI voice agents to enable more human-like voice conversations over phone calls, SIP connections, WebRTC, and mobile apps. The solution makes it straightforward to bring intelligent, real-time conversations into workflows for a variety of use cases, such as a small auto repair shop using voice AI to book appointments and track down parts, a global retail brand handling a high volume of customer service calls, or a developer building a scalable voice interface.

    In this post, we explore how developers can integrate Amazon Nova Sonic with the Vonage communications service to build responsive, natural-sounding voice experiences in real time. By combining the Vonage Voice API with the low-latency and expressive speech capabilities of Amazon Nova Sonic, businesses can deploy AI voice agents that deliver more human-like interactions than traditional voice interfaces. These agents can be used as customer support, virtual assistants, and more.

    Amazon Nova Sonic for real-time conversational AI

    Amazon Nova Sonic is a speech-to-speech FM designed to build real-time conversational AI applications in Amazon Bedrock, with industry-leading price-performance and low latency. Its architecture unifies speech understanding and generation into a single model, to enable more human-like voice conversations in AI applications. The model can understand speech in different speaking styles and generate speech in expressive voices, including both masculine-sounding and feminine-sounding voices. Amazon Nova Sonic can adapt the intonation, prosody, and style of the generated speech response to align with the context and content of the speech input and gracefully handle interruptions. Additionally, Amazon Nova Sonic allows for function calling and knowledge grounding with enterprise data using Retrieval Augmented Generation (RAG).

    Vonage Voice APIs, powered by AI

    Vonage, an AWS partner, provides a developer-friendly platform for building voice, messaging, video, and authentication experiences. With its wide-ranging Voice APIs, Vonage offers WebRTC support, multi-channel communication tools, standard phone call integrations, in-app softphones, front-ending contact centers, and voice-over-browser functionality. The software also offers essential building blocks such as inbound and outbound voice call handling, voicemail support, and programmable logic for call routing and queuing. Vonage’s solution builder and SDKs allow for fast, low-code integration, while its interoperability with business applications and productivity tools enables teams to embed communication directly into their existing workflows.

    Solution overview

    Vonage collaborated with Amazon Nova Sonic to build low-latency, voice-first applications that can understand and respond like a human agent over standard telephony or WebRTC channels. This new tool can connect inbound and outbound Vonage calls directly to Amazon Nova Sonic for conversational AI processing, using expressive, real-time speech synthesis to deliver fluid, natural interactions. Amazon Nova Sonic’s integration into Vonage Voice API seamlessly manages audio buffering, custom media infrastructure, and protocol translation, so teams can focus on building engaging experiences.

    With built-in conversation control logic and noise cancellation, Vonage’s integration with Amazon Nova Sonic makes it straightforward for businesses to rapidly build and deploy responsive AI voice agents. These agents can handle real-time voice conversations and scale voice interactions without relying on traditional contact centers.

    Vonage is making this integration available as a GitHub repository for developers to deploy and customize to their needs.

    “As an AWS Amazon Partner Network (APN) member, Vonage has a long history of working closely with the AWS innovation team to create new solutions to benefit enterprise customers,” said Christophe Van de Weyer, President and Head of Business Unit API for Vonage. “This latest collaboration with AWS enables organizations to transform how they engage with customers by adopting generative AI solutions that create added value for internal and external communication. By combining Vonage’s communications APIs with AWS’s advanced AI, this new voice AI agent technology enables businesses to streamline the adoption of intelligent agents, accelerate the modernization of legacy voice systems, and provide a robust service to deliver exceptional customer experiences with measurable improvements in satisfaction and operational efficiency.”

    The following video showcases a demo of Diana, an AI voice agent built using Vonage’s integration with Amazon Nova Sonic.

    The following architecture diagram provides an overview of Amazon Nova Sonic deployed as a voice agent in the Vonage Voice API framework on AWS.

    Architectural Diagram of the Solution

    The solution routes different types of incoming calls to Amazon Nova Sonic over a WebSocket connection. The architectural components include (left to right):

    • Calls – Incoming voice connections that can come from global phone numbers, SIP connections with contact centers or business systems, or WebRTC connections from web browsers and mobile apps.
    • Vonage Voice API – Provides programmatic control over these types of calls and voice connections, allowing them to be integrated with AI systems, routed elsewhere, or given speech and other treatments. Because Amazon Nova Sonic is a full speech-to-speech AI service, the real-time voice streams are connected directly, unlike other AI integrations that might use text-based integration.
    • Amazon Nova Sonic connector – A Vonage integration that connects calls to Amazon Nova Sonic over a WebSocket connection, providing low-latency, real-time, bi-directional voice streaming directly with Amazon Nova Sonic. The connector also manages voice isolation to better handle noisy environments, conversational elements like “barge in” where the caller interrupts the conversation, and fallback options if needed.
    • Amazon Nova Sonic – Part of the Amazon Nova family of FMs available in Amazon Bedrock. Amazon Nova Sonic unifies speech understanding and generation into a single model, streamlining development and reducing complexity when building conversational applications.
    • Retrieval Augmented Generation (RAG) – Tools within Amazon Bedrock that optimize the output of an underlying large language model (LLM). Amazon Nova Sonic can reference enterprise-authorized knowledge sources. Attribution and source visibility can be configured based on customer requirements.
    • Customizable prompt – Provided to the AI model and allows the voice agent’s personality and conversational capabilities to be defined and the right knowledge base to be used.
    • User context – Maintained by Amazon Nova Sonic throughout interaction sequences to allow a natural continuous conversation. Personally identifiable information (PII) is processed in real time and not retained by Amazon Nova Sonic. AWS safeguards your data through comprehensive security controls, encryption at rest and in transit, and compliance certifications, while also giving you the flexibility to configure additional logging, security, and compliance measures through AWS services.

    These components work together to create a flexible, intelligent voice agent service that can dynamically adapt to different communication scenarios and business use cases with different knowledge bases and prompts.

    Example use cases

    The following are just a few of the high-impact ways businesses are already using this integration to transform voice interactions:

    • Customer support automation – Deploy voice agents that answer inbound customer queries, take appointments, and escalate calls only when necessary.
    • Proactive outbound calling – Generate dynamic, expressive outbound messages like reminders, confirmations, or follow-ups with voicemail fallback.
    • Multilingual voice assistants – Build voice experiences that seamlessly switch between English and Spanish depending on the caller, enabled by Vonage’s language detection and multilingual synthesis with Amazon Nova Sonic.

    Conclusion

    By combining Amazon Nova Sonic with Vonage’s flexible communication infrastructure, developers can build intelligent, responsive AI voice agents. With this solution, you can provide proactive voice engagement, create multilingual assistants, handle customer support, and more. This integration makes voice-first AI applications more accessible and scalable than ever.

    To start building with Amazon Nova Sonic, visit the Amazon Bedrock console. For Vonage integration, explore the Vonage API Developer Portal or use the Vonage Solution Builder to configure your voice agent in minutes.

    To learn more about Amazon Nova Sonic, check out the AWS News Blog, Amazon Nova Sonic product page, or Amazon Bedrock User Guide.


    About the authors

    Divyesha Malhotra is a Senior Product Manager Technical Intern on the AGI Nova Sonic team. She leads the customer adoption and integrations of cutting-edge speech-to-speech foundation models for next-generation voice-based technologies.

    Mark Berkeland is a Senior Solutions Engineer in the API Business Unit at Vonage. He designs and implements technical solutions including demos and proofs of concept to help customers bring voice and messaging applications to life. With a professional programming career that began in 1979, his experience ranges from FORTRAN on punched cards to modern cloud-native stacks like React Native, combining deep technical expertise with a passion for making complex ideas accessible.

    Oscar Rodriguez is Senior Director of Global Partner Solutions in the API Business Unit at Vonage, where he leads strategic initiatives to empower partners through scalable communications solutions. He brings deep technical expertise and a practical understanding of real-world application development with over 20 years experience in web technologies and the last 10 in CPaaS.

    Marina Gerzon is a Partner Solutions Architect at Vonage with over 20 years of experience in real-time communications, specializing in Video and Voice over IP solutions. Known for her ability to bridge technical depth with business impact, her work spans Telecom, Education, Healthcare, Fintech, and Insurance industries, where she has consistently delivered enterprise-grade SaaS and PaaS architectures tailored to complex business needs.

    Source: Read More 

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleCVE-2025-7673 – Zyxel zhttpd Web Server Buffer Overflow Vulnerability
    Next Article Enabling customers to deliver production-ready AI agents at scale

    Related Posts

    Machine Learning

    How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark

    July 22, 2025
    Machine Learning

    Boolformer: Symbolic Regression of Logic Functions with Transformers

    July 22, 2025
    Leave A Reply Cancel Reply

    For security, use of Google's reCAPTCHA service is required which is subject to the Google Privacy Policy and Terms of Use.

    Continue Reading

    Palo Alto Networks PAN-OS Vulnerability Enables Admin to Execute Root User Actions

    Security

    Bottles: Un Appello alla Comunità per il Sostegno e la Crescita

    Linux

    KLayout is a GDS and OASIS file viewer and editor

    Linux

    Apple Fined €150 Million by French Regulator Over Discriminatory ATT Consent Practices

    Development

    Highlights

    Data Management & Optimization [SUBSCRIBER]

    April 3, 2025

    <p>This module focuses on intermediate data management techniques and optimization strategies to ensure your SwiftUI…

    ncgopher is a gopher, gemini and finger client

    June 2, 2025

    CVE-2025-7939 – A vulnerability was found in jerryshensjf JPACooki

    July 21, 2025

    CVE-2025-45753 – Vtiger CRM PHP Code Execution Vulnerability

    May 21, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.