Replacing Elasticsearch with PostgreSQL Alone: Hybrid Search with VectorChord-BM25 and RRF

If you've ever stood up a separate Elasticsearch cluster just to add a search feature, you know how heavy that operational burden can be. Schema synchronization, indexing lag, transaction consistency issues between two systems... honestly, more than once I've wondered why we even need Elasticsearch sitting alongside PostgreSQL when the data is already there. One thing teams that have removed Elasticsearch consistently mention is that "sync bugs disappeared" — all those hours spent checking the indexing pipeline status after every deployment, and debugging which side was correct whenever a transaction mismatch occurred.

In March 2025, the tensorchord team released the vchord_bm25 extension written in Rust, offering a fairly realistic alternative to this pain. It's not just full-text search (BM25) that came along — combined with VectorChord's vector search, it's now possible to complete hybrid search entirely within PostgreSQL.

This article covers how VectorChord-BM25 works, and how to implement hybrid search on a single PostgreSQL stack by combining BM25 keyword search and vector similarity search with RRF (Reciprocal Rank Fusion). We'll walk through working SQL and Python code, a pros-and-cons analysis, and common pitfalls to avoid when getting started. Code examples are based on an English corpus. If you're curious about pg_tokenizer configuration for Korean text, check the upcoming topics preview at the end of this article.

Core Concepts

Why BM25 Is Better Than TF-IDF

BM25 (Best Match 25) is the standard in information retrieval — good enough for Elasticsearch to adopt it as its default ranking algorithm. Like TF-IDF, it combines term frequency (TF) and inverse document frequency (IDF), but there's a critical difference.

Just because a word appears 100 times in a long document doesn't necessarily mean it's more relevant than a word appearing 10 times in a short document. BM25 addresses this with parameters k1 (term frequency saturation) and b (document length normalization strength). A higher k1 increases the impact of term repetition, and a b closer to 1 strengthens document length normalization.

BM25(d, q) = Σ IDF(t) × [TF(t,d) × (k1 + 1)] / [TF(t,d) + k1 × (1 - b + b × |d|/avgdl)]

At first glance this formula looks complicated and I assumed it was basically the same as TF-IDF, but in corpora with many long documents, the difference is quite noticeable in practice.

How VectorChord-BM25 Brings BM25 into PostgreSQL

vchord_bm25 is a PostgreSQL extension that provides several key components.

Component	Role
`bm25vector`	A dedicated type for storing sparse vectors generated by the tokenizer
BM25 index	A custom index in the form `USING bm25 (embedding bm25_ops)`
`<&>` operator	A distance operator that returns BM25 scores (negative values — lower means more relevant)
Block WeakAnd	An early termination algorithm that efficiently retrieves only the top K results

A quick note on the bm25vector type: it is different from pgvector's vector (dense vector). It is a sparse vector where each dimension corresponds to a vocabulary ID and most values are zero. Only the word indices that appear in the document have values, with everything else being 0 — so even with a vocabulary size in the tens of thousands, the actual storage space is much smaller.

There's one design point worth knowing. The reason to_bm25query takes an index name as its first argument is that the BM25 index itself encodes which tokenizer to use. This enforces that the same tokenizer is always used both at document indexing time and at query conversion time. Not knowing this makes it easy to fall into mistake #2 described later.

What is Block WeakAnd? Rather than scanning the entire document corpus, it maintains a top-K candidate set and skips blocks of documents that are guaranteed to score lower than the current threshold. Elasticsearch uses this algorithm internally as well. It allows accurate top results to be retrieved quickly without scanning the full index.

Why Vector Search Alone Is Not Enough

Vector similarity search (Semantic Search) excels at intent-based queries like "warm jacket for winter." But for queries where the exact spelling matters — such as "iPhone 16 Pro 256GB," a specific function name, or an API endpoint — it can surprisingly underperform. "iPhone 16 Pro" and "Galaxy S25 Ultra" can be semantically close in embedding space.

Conversely, BM25 alone for a query like "recommend a good laptop" will find documents containing the words "good" or "recommend," but may miss documents that actually contain laptop reviews. The two approaches are structured to complement each other's weaknesses.

RRF — A Way to Combine Two Scores on Different Scales

BM25 scores and cosine similarities have different units, so directly summing them causes one to overwhelm the other. RRF (Reciprocal Rank Fusion) sidesteps this problem by using only ranks instead of absolute scores.

RRF_score(d) = Σ 1 / (k + rank(d))

k=60 is the default value; this constant buffers the score difference between high and low ranks. For example, the difference between rank 1 (1/61 ≈ 0.016) and rank 10 (1/70 ≈ 0.014) in BM25 is small, and a document ranked 1st in vector search but 5th in BM25 will still have a fairly high combined score.

Why k=60? It's the experimentally validated value from the original paper by Cormack et al. (SIGIR 2009). It works well across most domains, but running A/B tests to find the optimal value for your use case is recommended.

Practical Application

Basic Setup: Creating a BM25 Index

There's an all-in-one Docker image you can spin up immediately, so getting started isn't difficult. tensorchord/vchord-suite:pg18-latest includes VectorChord, BM25, and pg_tokenizer all in one.

sql

-- Install required extensions
CREATE EXTENSION IF NOT EXISTS pg_tokenizer CASCADE;
CREATE EXTENSION IF NOT EXISTS vchord_bm25 CASCADE;
 
-- Create a BERT-based tokenizer (for English)
SELECT create_tokenizer('bert_tok', $$ model = "bert_base_uncased" $$);
 
-- Create the documents table
CREATE TABLE documents (
  id        SERIAL PRIMARY KEY,
  body      TEXT,
  embedding bm25vector  -- dedicated type for storing sparse vectors
);
 
-- Tokenize raw text and store as bm25vector
INSERT INTO documents (body, embedding)
SELECT body, tokenize(body, 'bert_tok')::bm25vector
FROM raw_texts;
 
-- Create BM25 index
CREATE INDEX doc_bm25_idx ON documents USING bm25 (embedding bm25_ops);

After creating the index, searching looks like this:

sql

-- BM25 search: scores are negative, so ORDER BY ASC = highest relevance first
SELECT
  id,
  body,
  embedding <&> to_bm25query('doc_bm25_idx', tokenize('database search', 'bert_tok')) AS score
FROM documents
ORDER BY score ASC
LIMIT 10;

Code Element	Description
`<&>`	BM25 distance operator. Returns negative values — lower means more relevant
`to_bm25query`	Converts a search query into a BM25 query vector. The index name as first argument determines the tokenizer
`tokenize(...)`	Converts the query string using the same tokenizer as the index

RRF Combination Query: Merging BM25 and Vector Search

This is the pattern you'll use most often in production. It assumes embeddings generated by pgvector or VectorChord are already present in a vec_embedding column.

There's one important implementation point. When extracting the top 20 by score inside a CTE, placing ROW_NUMBER() OVER (ORDER BY score) and LIMIT 20 at the same level won't behave as expected. PostgreSQL returns arbitrary rows when it encounters a LIMIT without an ORDER BY, and the window function will then rank only those arbitrary rows. This goes unnoticed in small-scale tests but becomes the cause of non-reproducible RRF results as data grows. To get the actual top 20, you must sort and limit in a subquery first, then apply ranking in the outer query.

sql

WITH bm25_results AS (
  -- Extract actual top 20 in subquery first, then assign ranks
  SELECT id, ROW_NUMBER() OVER (ORDER BY score) AS rank
  FROM (
    SELECT id,
           embedding <&> to_bm25query('doc_bm25_idx', tokenize(:query, 'bert_tok')) AS score
    FROM documents
    ORDER BY score
    LIMIT 20
  ) top_bm25
),
vector_results AS (
  -- Extract top 20 by vector similarity, assign ranks
  SELECT id, ROW_NUMBER() OVER (ORDER BY score) AS rank
  FROM (
    SELECT id,
           vec_embedding <=> :query_vector AS score
    FROM documents
    ORDER BY score
    LIMIT 20
  ) top_vec
),
rrf AS (
  -- FULL OUTER JOIN to include documents that appear in only one set
  SELECT
    COALESCE(b.id, v.id) AS id,
    COALESCE(1.0 / (60 + b.rank), 0) +
    COALESCE(1.0 / (60 + v.rank), 0) AS rrf_score
  FROM bm25_results b
  FULL OUTER JOIN vector_results v ON b.id = v.id
)
SELECT d.id, d.body, r.rrf_score
FROM rrf r
JOIN documents d ON d.id = r.id
ORDER BY r.rrf_score DESC
LIMIT 10;

The key is using FULL OUTER JOIN. If a document didn't make the BM25 top 20 but ranked 1st in vector search, it gets included rather than discarded. COALESCE(..., 0) handles cases where a document appears in only one result set by treating its score from the missing method as 0.

Applying to a RAG Pipeline

In LLM-based RAG systems, retrieval quality directly impacts generation quality. What surprised me most when I first added hybrid search to this pipeline was how often vector-only search fetched irrelevant documents for queries where exact spelling matters — things like product version numbers or function names. Queries like "v2.3.1 migration issue" would return results mixed with "v2.2 upgrade guide" or "v3.0 changelog." After adding RRF, search results for these cases noticeably improved.

python

import asyncpg  # Python async PostgreSQL driver
import numpy as np
# The output dimension of the embed function must match the dimension of the vec_embedding column
from your_embedding_model import embed
 
async def hybrid_search(query: str, top_k: int = 10) -> list[dict]:
    query_vector = embed(query)
 
    # In production, using asyncpg.create_pool() is recommended
    conn = await asyncpg.connect(DATABASE_URL)
 
    results = await conn.fetch("""
        WITH bm25_results AS (
            SELECT id, ROW_NUMBER() OVER (ORDER BY score) AS rank
            FROM (
                SELECT id,
                       embedding <&> to_bm25query('doc_bm25_idx', tokenize($1, 'bert_tok')) AS score
                FROM documents
                ORDER BY score
                LIMIT 20
            ) top_bm25
        ),
        vector_results AS (
            SELECT id, ROW_NUMBER() OVER (ORDER BY score) AS rank
            FROM (
                SELECT id,
                       vec_embedding <=> $2 AS score
                FROM documents
                ORDER BY score
                LIMIT 20
            ) top_vec
        ),
        rrf AS (
            SELECT COALESCE(b.id, v.id) AS id,
                   COALESCE(1.0 / (60 + b.rank), 0) +
                   COALESCE(1.0 / (60 + v.rank), 0) AS rrf_score
            FROM bm25_results b
            FULL OUTER JOIN vector_results v ON b.id = v.id
        )
        SELECT d.id, d.body, r.rrf_score
        FROM rrf r JOIN documents d ON d.id = r.id
        ORDER BY r.rrf_score DESC
        LIMIT $3
    """, query, query_vector.tolist(), top_k)
 
    await conn.close()
    return [dict(r) for r in results]

When you feed the context retrieved this way into an LLM prompt, you can feel the difference in search quality compared to vector-only search — especially for queries involving proper nouns or version numbers.

Pros and Cons Analysis

Advantages

Item	Details
Operational simplicity	Unified management of search + vector + relational data in a single PostgreSQL instance. No Elasticsearch sync pipeline needed
ACID guarantees	Transactional consistency by default. Elasticsearch's near-realtime indexing can introduce brief reflection delays
BM25 performance	~3x higher QPS than Elasticsearch based on Block WeakAnd (official benchmark figures measured under specific hardware and data conditions)
Vector indexing speed	VectorChord completes a 100-million-vector index in under 20 minutes (official benchmark; reduces indexing time by tens of hours compared to pgvector's HNSW index)
SQL-friendly	Naturally combines with existing JOINs, filters, and aggregations. No separate query DSL required
Cost reduction	No need to operate a separate search engine cluster

What is near-realtime? Elasticsearch has a default refresh interval of 1 second before a document becomes searchable after indexing. During this brief window, newly inserted data may not appear in search results. With PostgreSQL + vchord_bm25, data is searchable immediately upon transaction commit.

Disadvantages and Caveats

Item	Details	Mitigation
Language support	Primarily validated for English. Non-English languages like Korean require separate tokenizer configuration	Explore multilingual BPE models in `pg_tokenizer`. Morpheme-analyzer-based options also exist
Maturity	Released March 2025; production case studies still accumulating	For high-criticality services, run in parallel with Elasticsearch and gradually shift traffic
BM25 semantic limitations	Cannot handle synonyms or contextual nuance	Always pair with vector search. For precision-critical cases, consider adding a cross-encoder reranker after RRF
Resource overhead	Maintaining both BM25 and vector indexes simultaneously increases storage and memory	Analyze actual query patterns and selectively configure indexes with low usage frequency
Large-scale limits	Scenarios requiring distributed indexing at hundreds of TB scale favor dedicated search engines	For workloads exceeding single-node limits, consider partitioning or dedicated distributed solutions in parallel

Operations tip — bm25_limit parameter: For queries that exceed the default result count limit, check the bm25_catalog.bm25_limit GUC parameter and adjust as needed.

The Most Common Mistakes in Practice

Forgetting that BM25 scores are negative and querying with ORDER BY score DESC — The <&> operator returns negative values, so ORDER BY score ASC is highest-relevance order. This is especially easy to mix up when using BM25 standalone without RRF.
Using a different tokenizer for the query versus the index — This is exactly why to_bm25query takes an index name as its first argument. The index itself encodes the tokenizer, so using the same index name guarantees the same tokenizer is used at both indexing time and search time. If you pass a different tokenizer name to tokenize() and connect it to to_bm25query, the scores become meaningless.
Setting the LIMIT for RRF too small — If the LIMIT for BM25 and vector search individually is set too small, good results may already be cut off by the time RRF merges them. It is generally safe to retrieve 3–5 times the final desired result count at each stage.

Closing Thoughts

By combining VectorChord-BM25 with RRF, you can complete hybrid search within a single PostgreSQL stack without Elasticsearch. Schema sync pipelines, transaction inconsistency debugging, operational overhead bouncing between two systems — the core value of this combination is reducing all of that down to a single PostgreSQL instance.

Of course, if you need distributed search at hundreds of terabytes scale, or genuinely require Elasticsearch's mature ecosystem, a simple replacement is not feasible. But many production services operate complex dual systems without ever reaching that scale. For those teams, this combination is well worth evaluating.

Here are 3 steps to get started right now:

Set up the environment with Docker — Run docker run --rm -e POSTGRES_PASSWORD=password -p 5432:5432 tensorchord/vchord-suite:pg18-latest to immediately spin up a PostgreSQL instance that includes VectorChord + BM25 + pg_tokenizer.
Experiment by adding a BM25 index to an existing text column — If you have a PostgreSQL table already in production, you can add an embedding bm25vector column, populate it with tokenize(body, 'bert_tok')::bm25vector, create a BM25 index, and directly compare results against your existing LIKE or tsvector searches. However, running a bulk update after ALTER TABLE ... ADD COLUMN on a production DB can cause table locks, so it's recommended to validate on a test DB first.
Replace the RRF query in your RAG pipeline — Swap the vector-only RAG retriever with the RRF query from the example above, and you can observe the change in search quality for queries involving proper nouns, version numbers, and code-related terms.

References

#PostgreSQL#BM25#하이브리드검색#VectorChord#RRF#벡터검색#pgvector#RAG#Elasticsearch#전문검색

Replacing Elasticsearch with PostgreSQL Alone: Hybrid Search with VectorChord-BM25 and RRF | DEV BAK - 기술블로그

Database

Replacing Elasticsearch with PostgreSQL Alone: Hybrid Search with VectorChord-BM25 and RRF

Core Concepts

Why BM25 Is Better Than TF-IDF

BM25(d, q) = Σ IDF(t) × [TF(t,d) × (k1 + 1)] / [TF(t,d) + k1 × (1 - b + b × |d|/avgdl)]

At first glance this formula looks complicated and I assumed it was basically the same as TF-IDF, but in corpora with many long documents, the difference is quite noticeable in practice.

How VectorChord-BM25 Brings BM25 into PostgreSQL

vchord_bm25 is a PostgreSQL extension that provides several key components.

Component	Role
`bm25vector`	A dedicated type for storing sparse vectors generated by the tokenizer
BM25 index	A custom index in the form `USING bm25 (embedding bm25_ops)`
`<&>` operator	A distance operator that returns BM25 scores (negative values — lower means more relevant)
Block WeakAnd	An early termination algorithm that efficiently retrieves only the top K results

What is Block WeakAnd? Rather than scanning the entire document corpus, it maintains a top-K candidate set and skips blocks of documents that are guaranteed to score lower than the current threshold. Elasticsearch uses this algorithm internally as well. It allows accurate top results to be retrieved quickly without scanning the full index.

Why Vector Search Alone Is Not Enough

RRF — A Way to Combine Two Scores on Different Scales

RRF_score(d) = Σ 1 / (k + rank(d))

Why k=60? It's the experimentally validated value from the original paper by Cormack et al. (SIGIR 2009). It works well across most domains, but running A/B tests to find the optimal value for your use case is recommended.

Practical Application

Basic Setup: Creating a BM25 Index

There's an all-in-one Docker image you can spin up immediately, so getting started isn't difficult. tensorchord/vchord-suite:pg18-latest includes VectorChord, BM25, and pg_tokenizer all in one.

sql

-- Install required extensions
CREATE EXTENSION IF NOT EXISTS pg_tokenizer CASCADE;
CREATE EXTENSION IF NOT EXISTS vchord_bm25 CASCADE;
 
-- Create a BERT-based tokenizer (for English)
SELECT create_tokenizer('bert_tok', $$ model = "bert_base_uncased" $$);
 
-- Create the documents table
CREATE TABLE documents (
  id        SERIAL PRIMARY KEY,
  body      TEXT,
  embedding bm25vector  -- dedicated type for storing sparse vectors
);
 
-- Tokenize raw text and store as bm25vector
INSERT INTO documents (body, embedding)
SELECT body, tokenize(body, 'bert_tok')::bm25vector
FROM raw_texts;
 
-- Create BM25 index
CREATE INDEX doc_bm25_idx ON documents USING bm25 (embedding bm25_ops);

After creating the index, searching looks like this:

sql

-- BM25 search: scores are negative, so ORDER BY ASC = highest relevance first
SELECT
  id,
  body,
  embedding <&> to_bm25query('doc_bm25_idx', tokenize('database search', 'bert_tok')) AS score
FROM documents
ORDER BY score ASC
LIMIT 10;

Code Element	Description
`<&>`	BM25 distance operator. Returns negative values — lower means more relevant
`to_bm25query`	Converts a search query into a BM25 query vector. The index name as first argument determines the tokenizer
`tokenize(...)`	Converts the query string using the same tokenizer as the index

RRF Combination Query: Merging BM25 and Vector Search

This is the pattern you'll use most often in production. It assumes embeddings generated by pgvector or VectorChord are already present in a vec_embedding column.

sql

WITH bm25_results AS (
  -- Extract actual top 20 in subquery first, then assign ranks
  SELECT id, ROW_NUMBER() OVER (ORDER BY score) AS rank
  FROM (
    SELECT id,
           embedding <&> to_bm25query('doc_bm25_idx', tokenize(:query, 'bert_tok')) AS score
    FROM documents
    ORDER BY score
    LIMIT 20
  ) top_bm25
),
vector_results AS (
  -- Extract top 20 by vector similarity, assign ranks
  SELECT id, ROW_NUMBER() OVER (ORDER BY score) AS rank
  FROM (
    SELECT id,
           vec_embedding <=> :query_vector AS score
    FROM documents
    ORDER BY score
    LIMIT 20
  ) top_vec
),
rrf AS (
  -- FULL OUTER JOIN to include documents that appear in only one set
  SELECT
    COALESCE(b.id, v.id) AS id,
    COALESCE(1.0 / (60 + b.rank), 0) +
    COALESCE(1.0 / (60 + v.rank), 0) AS rrf_score
  FROM bm25_results b
  FULL OUTER JOIN vector_results v ON b.id = v.id
)
SELECT d.id, d.body, r.rrf_score
FROM rrf r
JOIN documents d ON d.id = r.id
ORDER BY r.rrf_score DESC
LIMIT 10;

Applying to a RAG Pipeline

python

import asyncpg  # Python async PostgreSQL driver
import numpy as np
# The output dimension of the embed function must match the dimension of the vec_embedding column
from your_embedding_model import embed
 
async def hybrid_search(query: str, top_k: int = 10) -> list[dict]:
    query_vector = embed(query)
 
    # In production, using asyncpg.create_pool() is recommended
    conn = await asyncpg.connect(DATABASE_URL)
 
    results = await conn.fetch("""
        WITH bm25_results AS (
            SELECT id, ROW_NUMBER() OVER (ORDER BY score) AS rank
            FROM (
                SELECT id,
                       embedding <&> to_bm25query('doc_bm25_idx', tokenize($1, 'bert_tok')) AS score
                FROM documents
                ORDER BY score
                LIMIT 20
            ) top_bm25
        ),
        vector_results AS (
            SELECT id, ROW_NUMBER() OVER (ORDER BY score) AS rank
            FROM (
                SELECT id,
                       vec_embedding <=> $2 AS score
                FROM documents
                ORDER BY score
                LIMIT 20
            ) top_vec
        ),
        rrf AS (
            SELECT COALESCE(b.id, v.id) AS id,
                   COALESCE(1.0 / (60 + b.rank), 0) +
                   COALESCE(1.0 / (60 + v.rank), 0) AS rrf_score
            FROM bm25_results b
            FULL OUTER JOIN vector_results v ON b.id = v.id
        )
        SELECT d.id, d.body, r.rrf_score
        FROM rrf r JOIN documents d ON d.id = r.id
        ORDER BY r.rrf_score DESC
        LIMIT $3
    """, query, query_vector.tolist(), top_k)
 
    await conn.close()
    return [dict(r) for r in results]

Pros and Cons Analysis

Advantages

Item	Details
Operational simplicity	Unified management of search + vector + relational data in a single PostgreSQL instance. No Elasticsearch sync pipeline needed
ACID guarantees	Transactional consistency by default. Elasticsearch's near-realtime indexing can introduce brief reflection delays
BM25 performance	~3x higher QPS than Elasticsearch based on Block WeakAnd (official benchmark figures measured under specific hardware and data conditions)
Vector indexing speed	VectorChord completes a 100-million-vector index in under 20 minutes (official benchmark; reduces indexing time by tens of hours compared to pgvector's HNSW index)
SQL-friendly	Naturally combines with existing JOINs, filters, and aggregations. No separate query DSL required
Cost reduction	No need to operate a separate search engine cluster

What is near-realtime? Elasticsearch has a default refresh interval of 1 second before a document becomes searchable after indexing. During this brief window, newly inserted data may not appear in search results. With PostgreSQL + vchord_bm25, data is searchable immediately upon transaction commit.

Disadvantages and Caveats

Item	Details	Mitigation
Language support	Primarily validated for English. Non-English languages like Korean require separate tokenizer configuration	Explore multilingual BPE models in `pg_tokenizer`. Morpheme-analyzer-based options also exist
Maturity	Released March 2025; production case studies still accumulating	For high-criticality services, run in parallel with Elasticsearch and gradually shift traffic
BM25 semantic limitations	Cannot handle synonyms or contextual nuance	Always pair with vector search. For precision-critical cases, consider adding a cross-encoder reranker after RRF
Resource overhead	Maintaining both BM25 and vector indexes simultaneously increases storage and memory	Analyze actual query patterns and selectively configure indexes with low usage frequency
Large-scale limits	Scenarios requiring distributed indexing at hundreds of TB scale favor dedicated search engines	For workloads exceeding single-node limits, consider partitioning or dedicated distributed solutions in parallel

Operations tip — bm25_limit parameter: For queries that exceed the default result count limit, check the bm25_catalog.bm25_limit GUC parameter and adjust as needed.

The Most Common Mistakes in Practice

Forgetting that BM25 scores are negative and querying with ORDER BY score DESC — The <&> operator returns negative values, so ORDER BY score ASC is highest-relevance order. This is especially easy to mix up when using BM25 standalone without RRF.
Using a different tokenizer for the query versus the index — This is exactly why to_bm25query takes an index name as its first argument. The index itself encodes the tokenizer, so using the same index name guarantees the same tokenizer is used at both indexing time and search time. If you pass a different tokenizer name to tokenize() and connect it to to_bm25query, the scores become meaningless.
Setting the LIMIT for RRF too small — If the LIMIT for BM25 and vector search individually is set too small, good results may already be cut off by the time RRF merges them. It is generally safe to retrieve 3–5 times the final desired result count at each stage.

Closing Thoughts

Here are 3 steps to get started right now:

Set up the environment with Docker — Run docker run --rm -e POSTGRES_PASSWORD=password -p 5432:5432 tensorchord/vchord-suite:pg18-latest to immediately spin up a PostgreSQL instance that includes VectorChord + BM25 + pg_tokenizer.
Experiment by adding a BM25 index to an existing text column — If you have a PostgreSQL table already in production, you can add an embedding bm25vector column, populate it with tokenize(body, 'bert_tok')::bm25vector, create a BM25 index, and directly compare results against your existing LIKE or tsvector searches. However, running a bulk update after ALTER TABLE ... ADD COLUMN on a production DB can cause table locks, so it's recommended to validate on a test DB first.
Replace the RRF query in your RAG pipeline — Swap the vector-only RAG retriever with the RRF query from the example above, and you can observe the change in search quality for queries involving proper nouns, version numbers, and code-related terms.

References

#PostgreSQL#BM25#하이브리드검색#VectorChord#RRF#벡터검색#pgvector#RAG#Elasticsearch#전문검색

Core Concepts

Why BM25 Is Better Than TF-IDF

How VectorChord-BM25 Brings BM25 into PostgreSQL

Why Vector Search Alone Is Not Enough

RRF — A Way to Combine Two Scores on Different Scales

Practical Application

Basic Setup: Creating a BM25 Index

RRF Combination Query: Merging BM25 and Vector Search

Applying to a RAG Pipeline

Pros and Cons Analysis

Advantages

Disadvantages and Caveats

The Most Common Mistakes in Practice

Closing Thoughts

References

Core Concepts

Why BM25 Is Better Than TF-IDF

How VectorChord-BM25 Brings BM25 into PostgreSQL

Why Vector Search Alone Is Not Enough

RRF — A Way to Combine Two Scores on Different Scales

Practical Application

Basic Setup: Creating a BM25 Index

RRF Combination Query: Merging BM25 and Vector Search

Applying to a RAG Pipeline

Pros and Cons Analysis

Advantages

Disadvantages and Caveats

The Most Common Mistakes in Practice

Closing Thoughts

References

Recommended Posts

Up to 10 Million Vectors with PostgreSQL Alone — Eliminating the Vector DB with pgvector·pgrag·pgai

Tens of Millions of Vectors in RAG on a Single PostgreSQL with pgvectorscale's StreamingDiskANN and Vector Compression — 28x Faster Than Pinecone

PgBouncer vs Supavisor: PostgreSQL Connection Pooling — What Should You Use in the Serverless Era?

When Does pgvector Hit Its Performance Limits — pgvector vs VectorChord Benchmark Comparison

How PostgreSQL 18's `uuidv7()` and `uuid_extract_timestamp()` Speed Up Time-Series Queries by 44% Without `created_at`

ULID vs UUIDv7: The Standard, Encoding, and DB Compatibility Differences That Determine Your Primary Key Choice