
Create a Semantic Search API with FastAPI, Sentence-BERT, and PostgreSQL pgvector


In this post, I’ll walk you through how I built a semantic similarity search API using FastAPI, Sentence-BERT (SBERT), and PostgreSQL with the pgvector extension.
This project began as a proof of concept (POC) to explore how seamlessly deep learning–based text embeddings can be integrated into a traditional relational database for fast, contextual search — without introducing an external vector database or adding unnecessary complexity to the existing tech stack. The full code is in the repo kiransabne04/fastapi-sbert-pgvector-similarity: a FastAPI-based text similarity and semantic search API using Sentence-BERT (all-MiniLM-L6-v2) with PostgreSQL + pgvector for vector storage and similarity matching.

Traditional keyword search only matches exact terms.
Semantic search, on the other hand, understands context — for example, the phrases:

“How do I reset my password?”
and
“Forgot my login credentials.”

mean the same thing, even though they use completely different words.

That’s what Sentence-BERT (SBERT) enables — it transforms sentences into numerical embeddings that capture semantic meaning.
Once we have those embeddings, we can use vector similarity (like cosine similarity) to find text with similar meaning.
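To make that concrete, here is a minimal sketch of cosine similarity using hand-made 3-d vectors in place of real 384-d SBERT embeddings (the toy values are illustrative only):

```python
import math

def cosine_similarity(a, b):
    # Normalized dot product: 1.0 means same direction, 0.0 means unrelated
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d vectors standing in for 384-d SBERT embeddings
print(cosine_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0 (identical)
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0 (orthogonal)
```

Two sentences with similar meaning get embeddings that point in similar directions, so their cosine similarity is close to 1.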

High-Level Overview

Here’s the high-level flow of the system we built:

  1. FastAPI serves REST endpoints for inserting and searching text.

  2. Sentence-BERT generates a 384-dimensional vector for each text.

  3. PostgreSQL with pgvector stores these embeddings and performs fast similarity queries using vector math.

Tech Stack

| Component | Description |
| --- | --- |
| FastAPI | Web framework for the API |
| Sentence-Transformers | Generates text embeddings (SBERT) |
| PostgreSQL 15 + pgvector | Stores embeddings and runs similarity search |
| Uvicorn | ASGI server for FastAPI |
| Docker Compose | Spins up Postgres with vector support |

Model used:

all-MiniLM-L6-v2 — lightweight, accurate, and great for quick experimentation.

Here’s how the key pieces fit together.

Sentence-BERT Embeddings

Using Hugging Face’s sentence-transformers library, each text is transformed into a 384-dimensional embedding vector:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

text = "A forgotten attic filled with old memories"
embedding = model.encode(text)
print(len(embedding))  # 384

PostgreSQL + pgvector

To store and search these embeddings efficiently, we enable the pgvector extension.

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    description TEXT NOT NULL,
    embedding VECTOR(384),
    created_at TIMESTAMP DEFAULT NOW()
);
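pgvector accepts vectors as a text literal like `[0.1,0.2,0.3]`, so when inserting from plain Python (without a pgvector client adapter), one option is to format the embedding yourself. This helper is my own sketch, not part of the project's code:

```python
def to_vector_literal(embedding):
    # Format a list of floats as a pgvector text literal, e.g. "[0.1,0.2,0.3]"
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"

# Passed as a query parameter and cast with %s::vector in the INSERT statement
print(to_vector_literal([0.1, 0.2, 0.3]))  # [0.1,0.2,0.3]
```

Alternatively, the `pgvector` Python package can register an adapter so lists and NumPy arrays are handled automatically.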

We then create a vector index to speed up similarity queries:

CREATE INDEX items_embedding_idx
ON items
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
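The `lists = 100` value isn't magic: pgvector's documentation suggests roughly `rows / 1000` lists for tables up to about a million rows, and `sqrt(rows)` beyond that. A small helper capturing that rule of thumb (the function name is my own):

```python
import math

def recommended_lists(n_rows):
    # pgvector guideline: rows/1000 up to 1M rows, sqrt(rows) above that
    if n_rows <= 1_000_000:
        return max(1, n_rows // 1000)
    return int(math.sqrt(n_rows))

print(recommended_lists(100_000))  # 100
```

At query time, `SET ivfflat.probes = N` trades speed for recall by scanning more lists.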

FastAPI Endpoints

/insert_item

Accepts multiple text items and inserts their embeddings into PostgreSQL.

@app.post('/insert_item')
def insert_items(request: ItemRequest):
    embeddings = [embedder.encode(i.description).tolist() for i in request.item_requests]
    with conn.cursor() as cur:
        for item, emb in zip(request.item_requests, embeddings):
            vec = "[" + ",".join(str(x) for x in emb) + "]"  # pgvector text literal
            cur.execute("INSERT INTO items (title, description, embedding) VALUES (%s, %s, %s::vector)",
                        (item.title, item.description, vec))

/find_similar

Finds the top-N most similar descriptions by comparing embeddings with pgvector’s cosine distance operator <=> (similarity = 1 - distance). pgvector also provides <-> for L2 distance and <#> for negative inner product, if you want to explore other metrics.

SELECT title, description, 1 - (embedding <=> %s::vector) AS similarity
FROM items
ORDER BY embedding <=> %s::vector
LIMIT 5;

Ordering by the raw distance (ascending) rather than the derived similarity alias lets PostgreSQL use the ivfflat index.
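To see what this query computes without a database, here is an in-memory equivalent with toy 3-d vectors (illustrative values only; real embeddings are 384-d):

```python
import math

def cosine_distance(a, b):
    # pgvector's <=> operator: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

items = [
    ("A vintage bookstore", [0.9, 0.1, 0.0]),
    ("A forgotten attic",   [0.7, 0.3, 0.1]),
    ("A busy airport",      [0.0, 0.1, 0.9]),
]
query = [1.0, 0.1, 0.0]

# Equivalent of: SELECT ... ORDER BY distance ASC LIMIT 5
results = sorted(
    ((title, 1.0 - cosine_distance(query, vec)) for title, vec in items),
    key=lambda r: r[1], reverse=True,
)[:5]
print(results[0][0])  # A vintage bookstore
```

The database does the same math, but with the ivfflat index it only scans a fraction of the rows.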

The result: a list of text items with semantic similarity scores.

Example in Action

Input Text

“The smell of aged paper and leather in a quiet bookstore.”

Output

{
  "similar_description": [
    {
      "title": "A vintage bookstore",
      "description_text": "The bookstore smelled of aged paper and leather...",
      "similarity": 0.86,
      "similarity_percent": 86.42
    },
    {
      "title": "A forgotten attic",
      "description_text": "The air in the attic hung heavy with the scent of forgotten things...",
      "similarity": 0.74,
      "similarity_percent": 74.10
    }
  ]
}

That’s semantic similarity in action. You can improve results further with preprocessing steps such as text cleaning and chunking, and by tuning the model and index for your requirements.

Alternatives to SBERT

A few other embedding models worth considering:

  1. intfloat/e5-large-v2

  2. nomic-ai/nomic-embed-text-v1

  3. OpenAI Embeddings (text-embedding-3-large)

  4. Cohere Embeddings (embed-multilingual-v3.0)

  5. Sentence-T5 or Universal Sentence Encoder (USE)

Thoughts:

This little POC, with proper architecture and implementation, has worked wonders for one of our use cases. With just a few hundred lines of Python and SQL, you can build a real semantic search engine — no external AI infrastructure required.

If you’re exploring NLP, information retrieval, or vector databases, this is one of the best starting points you can build on.