    Postgres RAG Stack: Embedding, Chunking & Vector Search

    July 17, 2025

    This is Part 2 of a three-part series (links at the bottom). The companion GitHub repo is linked in the References section below.

    Postgres RAG Stack brings together Postgres, pgVector, and TypeScript to power fast, semantic search. In Part One, we covered the theory behind semantic search: how embeddings convert meaning into vectors, how vector databases and indexes enable fast similarity search, and how RAG combines retrieval with language models for grounded, accurate responses. In this guide, you’ll scaffold your project, set up Docker with pgVector, and build ingestion and query scripts for embedding and chunking your data.

    Now we will begin setting up the foundation for our RAG application:

    • Next.js 15 project scaffold with environment files and directory layout
    • PostgreSQL 17 + pgvector in a Docker container
    • content_chunks table with an HNSW index
    • An ingestion script that chunks any text corpus and stores embeddings
    • Commands to validate cosine search plus troubleshooting tips

    Create the project directory

    mkdir rag-chatbot-demo && cd rag-chatbot-demo

    Scaffold a new Next.js app (optional)

    If you plan to build the chatbot UI (covered in Part 3), scaffold the Next.js app now so it doesn't conflict with files added later. Skip this step if you only need the RAG basics:

    npx create-next-app@latest . \
      --typescript \
      --app \
      --tailwind \
      --eslint \
      --import-alias "@/*"

    Set up folder structure

    mkdir -p scripts postgres input

    Create docker-compose.yml

    # ./docker-compose.yml
    services:
      db:
        image: pgvector/pgvector:pg17
        container_name: rag-chatbot-demo
        environment:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: password
          POSTGRES_DB: ragchatbot_db
        ports:
          - '5432:5432'
        volumes:
          - ./pgdata:/var/lib/postgresql/data
          - ./postgres/schema.sql:/docker-entrypoint-initdb.d/schema.sql
    

    Create schema

    Add a SQL file that Docker runs automatically on first boot:

    -- ./postgres/schema.sql
    
    -- Enable pgvector extension
    CREATE EXTENSION IF NOT EXISTS vector;
    
    CREATE TABLE content_chunks (
      id         bigserial PRIMARY KEY,
      content    text,
      embedding  vector(1536),      -- OpenAI text-embedding-3-small
      source     text,              -- optional file name, URL, etc.
      added_at   timestamptz DEFAULT now()
    );
    
    -- High-recall ANN index for cosine similarity
    -- Note: building the index before bulk inserts slows ingestion;
    -- for a large corpus, create the index after loading the data
    
    CREATE INDEX ON content_chunks
    USING hnsw (embedding vector_cosine_ops);
    

    Launch Docker Compose

    After creating the schema file, start the container:

    docker compose up -d
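
    Before moving on, it's worth confirming that the container is up and that the init script installed the vector extension and created the table. A couple of sanity checks, using the service name from the compose file above:

    # Check the container is running
    docker compose ps
    
    # Confirm the vector extension and the content_chunks table exist
    docker compose exec db psql -U postgres -d ragchatbot_db -c '\dx'
    docker compose exec db psql -U postgres -d ragchatbot_db -c '\d content_chunks'

    Keep in mind that Docker only runs schema.sql on the first boot of an empty data directory. If you change the schema later, stop the container and remove the local ./pgdata folder (docker compose down, then rm -rf ./pgdata) before starting it again.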

    Create your .env file

    In the project root, add:

    # .env
    DATABASE_URL=postgresql://postgres:password@localhost:5432/ragchatbot_db
    # Get your key from https://platform.openai.com/account/api-keys
    OPENAI_API_KEY=your-openai-key-here

    Preparing the Embedding Pipeline

    Sample data file

    Create input/data.txt with sample documentation or FAQs. The full sample file is available in the demo repo (linked in the References).

    # AcmeCorp Subscription Guide
    ## How do I renew my plan?
    Log in to your dashboard, select Billing → Renew, and confirm payment. Your new cycle starts immediately.
    
    ## How can I cancel my subscription?
    Navigate to Billing → Cancel Plan. Your access remains active until the end of the current billing period.
    
    ## Do you offer student discounts?
    Yes. Email support@acmecorp.com with proof of enrollment to receive a 25% discount code.
    
    ---
    # Troubleshooting Connectivity
    
    ## The app cannot reach the server
    Check your internet connection and verify the service URL in Settings → API Host.
    

    Install dependencies

    npm install pg langchain ai @ai-sdk/openai dotenv

    Dependencies:

    • ai: toolkit for AI models and streaming responses
    • @ai-sdk/openai: OpenAI adapter for embeddings and chat
    • pg: PostgreSQL client for Node.js
    • langchain: splits documents into semantically meaningful chunks
    • dotenv: loads environment variables from .env
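
    The scripts below are run with tsx, which executes TypeScript directly without a build step. npx will fetch it on demand, but you can optionally pin it (along with TypeScript and the Postgres type definitions) as dev dependencies; this is not required by the original setup:

    npm install -D tsx typescript @types/pg @types/node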

    Create scripts/embed.ts

    This script reads text, splits it into chunks, generates embeddings via OpenAI, and stores them in content_chunks:

    // scripts/embed.ts
    import 'dotenv/config';
    import fs from 'node:fs';
    import path from 'node:path';
    import { Pool } from 'pg';
    import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
    import { openai } from '@ai-sdk/openai'; // OpenAI adapter
    import { embedMany } from 'ai'; // generic AI interface
    
    const BATCH_SIZE = 50;
    const MAX_CHUNK_LENGTH = 512; // max characters per chunk
    const pool = new Pool({
        connectionString: process.env.DATABASE_URL
    });
    
    /**
     * Ingest a plain-text Q&A file. The file is split on blank lines so each
     * question stays together with its answer; any block longer than
     * MAX_CHUNK_LENGTH is further split by RecursiveCharacterTextSplitter.
     */
    async function ingest(file: string) {
        const raw = fs.readFileSync(file, 'utf8');
        console.log(`Loaded ${file}`);
    
        // Split on blank lines into question/answer blocks, drop empty entries
        const lines = raw
            .split(/\r?\n\s*\r?\n/)
            .map((l) => l.trim())
            .filter(Boolean);
    
        // Prepare overflow splitter for any long lines
        const overflowSplitter = new RecursiveCharacterTextSplitter({
            chunkSize: MAX_CHUNK_LENGTH,
            chunkOverlap: 50,
        });
    
        // Build final list of chunks
        const chunks: string[] = [];
        for (const line of lines) {
            if (line.length <= MAX_CHUNK_LENGTH) {
                chunks.push(line);
            } else {
                // Further split long lines into smaller chunks if needed
                const sub = await overflowSplitter.splitText(line);
                chunks.push(...sub);
            }
        }
    
        console.log(`Processing ${chunks.length} chunks in batches of ${BATCH_SIZE}`);
    
        // Process chunks in batches using embedMany
        for (let i = 0; i < chunks.length; i += BATCH_SIZE) {
            const batch = chunks.slice(i, i + BATCH_SIZE);
            console.log(`Processing batch ${Math.floor(i / BATCH_SIZE) + 1}/${Math.ceil(chunks.length / BATCH_SIZE)}`);
    
            // Embed the entire batch at once
            const {
                embeddings
            } = await embedMany({
                model: openai.embedding('text-embedding-3-small'),
                values: batch,
            });
    
            // Insert all embeddings from this batch into the database
            for (let j = 0; j < batch.length; j++) {
                const chunk = batch[j];
                const embedding = embeddings[j];
                const vectorString = `[${embedding.join(',')}]`;
                console.log(`Inserting chunk ${i + j + 1}/${chunks.length}: ${chunk.slice(0, 60)}...`);
                await pool.query('INSERT INTO content_chunks (content, embedding, source) VALUES ($1,$2,$3)', [chunk, vectorString, path.basename(file)]);
            }
        }
    }
    
    async function main() {
        console.log('Starting embedding ingestion…');
        await ingest('./input/data.txt');
        await pool.end();
    }
    
    main().catch((err) => {
        console.error('Ingestion error:', err);
        process.exit(1);
    });
    

    Run the embedding script

    npx tsx scripts/embed.ts
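
    If the script finishes without errors, you can confirm the chunks actually landed in Postgres with a quick count (a sanity check using the same psql access as before; the exact number depends on your input file):

    docker compose exec db psql -U postgres -d ragchatbot_db \
      -c 'SELECT count(*) AS chunks, source FROM content_chunks GROUP BY source;'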

    Testing Retrieval Functionality

    Create scripts/query.ts to embed a query, fetch the top-N chunks, and print them:

    /* scripts/query.ts */
    import 'dotenv/config';
    import { Pool } from 'pg';
    import { openai } from '@ai-sdk/openai';
    import { embed } from 'ai';
    
    const pool = new Pool({ connectionString: process.env.DATABASE_URL });
    const TOP_N = 5; // number of chunks to retrieve
    
    async function query(question: string) {
      console.log(`Embedding query: "${question}"`);
      const { embedding: qVec } = await embed({
        model: openai.embedding('text-embedding-3-small'),
        value: question,
      });
      const qVecString = `[${qVec.join(',')}]`;
    
      console.log(`Fetching top ${TOP_N} similar chunks from database...`);
      const { rows } = await pool.query<{ content: string; source: string; score: number; }>(
        `SELECT content, source,
                1 - (embedding <=> $1) AS score
           FROM content_chunks
       ORDER BY embedding <=> $1
          LIMIT $2`,
        [qVecString, TOP_N]
      );
    
      console.log('Results:');
      rows.forEach((row, i) => {
        console.log(`\n#${i + 1} (score: ${row.score.toFixed(3)}, source: ${row.source})`);
        console.log(row.content);
      });
    
      await pool.end();
    }
    
    (async () => {
      try {
        await query('How can I change my billing information?');
      } catch (err) {
        console.error('Error testing retrieval:', err);
      }
    })();
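
    If relevant chunks seem to be missing from the results, pgvector's hnsw.ef_search setting (default 40) controls how many candidates the HNSW index examines per query; raising it trades a little speed for better recall. A minimal sketch of an optional tweak, not part of the original script, that you could add to query.ts around the SELECT:

      // Optional: widen the HNSW search (pgvector's default ef_search is 40).
      // SET applies per connection, so run it on the same client as the SELECT.
      const client = await pool.connect();
      try {
        await client.query('SET hnsw.ef_search = 100');
        // ...run the similarity SELECT with client.query(...) instead of pool.query(...)
      } finally {
        client.release();
      }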
    

    Run the query script

    npx tsx scripts/query.ts

    Output:

    Embedding query: "How can I change my billing information?"
    Fetching top 5 similar chunks from database...
    Results:
    #1 (score: 0.774, source: data.txt)
    How do I update my billing information?
    Navigate to Billing → Payment Methods and click “Edit” next to your stored card.
    
    #2 (score: 0.512, source: data.txt)
    How do I change my account password?
    Go to Profile → Security, enter your current password, then choose a new one.
    
    #3 (score: 0.417, source: data.txt)
    How do I delete my account?
    Please contact support to request account deletion; it cannot be undone.
    

    🔎 score reflects cosine similarity: 1.0 is a perfect match, and values closer to 0 indicate less similar content.
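
    To see the operator behind that score, you can run the distance calculation directly in psql. This toy example uses hand-made 3-dimensional vectors purely for illustration (the real table stores 1536-dimensional embeddings):

    -- <=> is cosine distance; 1 - distance is the similarity shown above
    SELECT 1 - ('[1,0,0]'::vector <=> '[1,0,0]'::vector) AS identical,   -- 1.0
           1 - ('[1,0,0]'::vector <=> '[0,1,0]'::vector) AS orthogonal;  -- 0.0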

    Conclusion

    At this point, we’ve built a vector search backend: scaffolded a Next.js project, spun up Postgres with pgvector, created a schema optimized for similarity search, and built a TypeScript pipeline to embed and store content. We validated our setup with real cosine-similarity queries. In Part 3, we’ll build a user-friendly chatbot interface powered by GPT-4 and streaming responses using the ai SDK.

    References

    • Part 1: Vector Search Embeddings and RAG
    • Part 3: Coming soon
    • Repo: https://github.com/aberhamm/rag-chatbot-demo
