Building an AI-Powered Resume Assistant Using Langflow, Astra DB, and OpenAI

1. Introduction

“You can’t fix a broken hiring pipeline with a prettier job board.”

I’ve seen it firsthand: companies drowning in thousands of resumes, relying on brittle keyword filters and outdated screening heuristics. In my own projects, especially when working with early-stage startups or talent platforms, I kept running into the same bottleneck—resume review. It’s manual. It’s subjective. And honestly, it’s incredibly inefficient.

That’s where LLMs changed the game for me.

Instead of throwing more people at the problem or trying to optimize regex filters, I decided to build a smart assistant that could actually understand resumes, extract meaningful context, and provide constructive, personalized feedback. Not just generic summaries—but actionable suggestions tailored to real job descriptions.

Here’s what I’ll walk you through in this guide:
How I built an AI Resume Assistant that lets users upload a resume and get back a rewritten version—optimized for clarity, tone, and alignment with the target role. It delivers detailed feedback, identifies weaknesses, and suggests improvements—all using LLMs and vector search under the hood.

The stack I went with is pretty lean and production-ready:

  • Langflow for building modular LLM pipelines (no more brittle scripts)
  • Astra DB for high-performance vector search (indexed resumes, job role embeddings)
  • OpenAI to power the core resume evaluation logic

By the end of this guide, you’ll not only have the blueprint—I’ll show you the real wiring. Everything from preprocessing PDFs to chunking strategies to prompt tuning. Let’s build something useful.


2. Architecture Overview

“If you can’t sketch it, you probably don’t understand it.”

When I first started piecing this together, I needed a clear way to visualize the data flow. I didn’t want a black-box LLM app. I wanted full control over each step—especially resume parsing, retrieval logic, and prompt dynamics.

Here’s a quick look at the architecture I used (I’ll include the code and flow exports later, no worries):

End-to-End Flow:

  1. Upload Resume (PDF / DOCX)
    → Custom preprocessing node in Langflow handles clean text extraction
  2. Text Chunking + Embedding
    → Processed text is chunked (sentence-aware) and embedded using OpenAI’s text-embedding-3-small
  3. Semantic Indexing with Astra DB
    → Vector storage + metadata (like candidate name, section type, etc.)
  4. Prompt Generation for Feedback
    → Langflow pipeline pulls semantically relevant chunks, injects them into a custom prompt template
  5. OpenAI LLM (GPT-4 or GPT-3.5)
    → Generates actionable, section-wise resume feedback or full rewrites
  6. Return Enhanced Resume + Recommendations


Why Langflow?

Langflow gave me exactly what I needed: visual control over chaining LLM blocks. I could create custom parsing, chunking, embedding, and response formatting flows—all with traceability. It was like having a no-code canvas that didn’t compromise on depth.

Why Astra DB?

Astra DB was the easiest way I found to implement a scalable vector store with built-in support for metadata filtering. I didn’t want to spin up and maintain a separate Pinecone or Faiss setup—especially when Astra gave me TTLs, region options, and a GraphQL API out of the box.

Why OpenAI?

No surprises here. I went with OpenAI’s gpt-4 and gpt-3.5-turbo because of their performance in reasoning and rewriting tasks. But I’ll show you how I made them less verbose and more targeted for resume work using modular prompt engineering.

Local vs Cloud: What I Learned

When building locally, I used Dockerized Langflow + Astra’s dev token to prototype quickly. But if you’re planning to scale this (e.g., embed this assistant inside a job board), here’s what I recommend:

Scenario                        | Local                 | Cloud
Fast prototyping                | ✅                    | Slower to iterate
Secure credentials              | Manual .env handling  | ✅ platform secrets UI
Scaling to thousands of resumes | ❌                    | ✅
Observability / Logs            | Partial               | ✅ via Langfuse / cloud dashboards

Personally, I deployed the app on Render (for frontend) and Fly.io (for API logic) with Astra DB in the background—this stack worked really well for smaller batch jobs and async evaluations.


Step 1: Setting Up the Environment

“Before you automate anything, get your environment stable. Debugging a resume pipeline is hard enough—don’t make it worse with a broken setup.”

a. Repo Initialization

Personally, I like keeping things clean and modular from day one. When I started working on this Resume Assistant project, I structured my repo like this:

resume-ai-assistant/
├── langflow_projects/     # Langflow .json flows
├── resume_inputs/         # Uploaded PDFs/DOCs
├── scripts/               # Custom chunking, parsing utilities
├── prompts/               # Dynamic prompt templates
├── app/                   # Frontend / Streamlit / FastAPI app
├── .env                   # API keys (not checked into git)
├── requirements.txt
└── README.md

This folder structure made it easier to debug specific modules and switch between Langflow and script-based workflows when needed.

Tool Versions That Worked Best for Me

Here’s what I used while building this. I strongly recommend pinning exact versions—you’d be surprised how a small SDK change can silently break things, especially in Langflow chains. You can copy these straight into your requirements.txt to get started quickly:

# requirements.txt
langflow==0.3.12
openai==1.14.3
astrapy==0.6.0
python-dotenv==1.0.1
pdfplumber==0.10.3
tiktoken==0.5.1

If you’re working in a pyproject.toml/Poetry setup, pin the same versions there under [tool.poetry.dependencies].

b. API Keys & Environment Config

You don’t want to hardcode keys, especially if you’re testing Langflow chains on multiple machines or pushing to GitHub. Here’s how I did it:

  1. Created a .env file at the root of my project (never committed this—obviously).
  2. Used python-dotenv to load keys in my custom scripts.
  3. Passed secrets to Langflow via environment variables when running Docker or dev server.

# .env
OPENAI_API_KEY=sk-************
ASTRA_DB_APPLICATION_TOKEN=astradb-************
ASTRA_DB_ID=your-db-id
ASTRA_DB_REGION=us-east1
ASTRA_DB_KEYSPACE=resume_ai

In your scripts or Langflow config blocks, just read them like this:

from dotenv import load_dotenv
import os

load_dotenv()

openai_key = os.getenv("OPENAI_API_KEY")
astra_token = os.getenv("ASTRA_DB_APPLICATION_TOKEN")

Now, if you’re running Langflow via Docker (which I highly recommend once your flow gets complex), just pass your .env to the container at run time. A minimal example (the image name here is a placeholder for whatever Langflow image or locally built tag you actually run):
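
# placeholder image name; swap in your actual Langflow image or locally built tag
docker run -p 7860:7860 --env-file .env my-langflow-image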

Pro tip: If you’re using a cloud platform like Render or Railway for deployment, you can load these keys directly into their secrets UI—no .env file required.


Step 2: Data Pipeline with Langflow

“Garbage in, garbage out” hits harder when your LLM is summarizing half-broken text chunks from a scanned PDF.

Getting clean, semantically rich input from resumes is one of the most overlooked steps in most AI apps I’ve seen. When I first started working on this, I underestimated how messy resume data could be—and how much it could break the downstream output quality. What helped me was building a modular pre-processing and embedding flow in Langflow where I could tweak each node without rewriting scripts every time.

Let me show you exactly how I did it.

a. Resume Input Handling (PDF/Text Parser Block)

Most resumes come in PDF format, and believe me, that’s where the fun begins.

Personally, I tested a few different PDF parsers before settling on pdfplumber. It gave me the cleanest output for most structured resumes. Here’s a minimal wrapper I used:

# scripts/pdf_reader.py
import pdfplumber

def extract_text_from_pdf(file_path):
    with pdfplumber.open(file_path) as pdf:
        # Join non-empty pages; extract_text() returns None for image-only pages
        return "\n".join(page.extract_text() for page in pdf.pages if page.extract_text())

You can integrate this logic into a custom node in Langflow using the “Python Function” block. I created one called PDFParserNode, which simply takes a file input and returns clean text output to the chunker.

When it goes wrong:

Scanned PDFs or image-based resumes will fail silently. When I hit these, I added a fallback using Tesseract OCR (but only for edge cases since it’s compute-heavy). You could also flag bad inputs with a Langflow logic node and return a user prompt: “Upload a non-scanned version.”
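
Here’s roughly what that fallback looked like. This is a sketch, assuming pdf2image and pytesseract are installed (plus the Poppler and Tesseract binaries on the host); it’s not the exact node I shipped:

# scripts/ocr_fallback.py
from pdf2image import convert_from_path
import pytesseract

from pdf_reader import extract_text_from_pdf  # the pdfplumber helper above

def ocr_pdf(file_path):
    # Rasterize each page at 300 DPI, then OCR it; compute-heavy, so fallback only
    pages = convert_from_path(file_path, dpi=300)
    return "\n".join(pytesseract.image_to_string(page) for page in pages)

def extract_text_with_fallback(file_path):
    text = extract_text_from_pdf(file_path)
    if len(text.strip()) < 50:  # heuristic: almost no text usually means a scanned PDF
        text = ocr_pdf(file_path)
    return text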

b. Chunking Strategy for Vector Storage

You might be thinking: “Why not just split the text every 500 tokens and be done with it?”
Trust me—I tried. It failed.

Naive chunking breaks context in the worst places: right in the middle of a project description or mid-sentence in a skills list. That results in garbage embeddings, and you’ll notice your retrieval relevance nosedive.

What worked best for me was sentence-aware chunking with overlap, especially for resumes with detailed experience sections.

Here’s a snippet I used:

from nltk import sent_tokenize
from tiktoken import get_encoding

# Requires the punkt tokenizer data: nltk.download("punkt")

def chunk_text(text, max_tokens=300):
    tokenizer = get_encoding("cl100k_base")
    sentences = sent_tokenize(text)

    # Greedily pack whole sentences into chunks of at most max_tokens
    chunks, current_chunk = [], ""
    for sentence in sentences:
        if len(tokenizer.encode(current_chunk + " " + sentence)) < max_tokens:
            current_chunk += " " + sentence
        else:
            chunks.append(current_chunk.strip())
            current_chunk = sentence
    if current_chunk:
        chunks.append(current_chunk.strip())

    # Add overlap by prepending the previous chunk to each one,
    # so context that straddles a boundary survives the split
    overlapped_chunks = []
    for i in range(len(chunks)):
        start = max(0, i - 1)
        overlapped_chunks.append(" ".join(chunks[start:i + 1]))

    return overlapped_chunks

Inside Langflow, I wrapped this in a script node and chained it to the embedding step. In the Langflow UI, the path was simply parser → chunker → embedder.

c. Vector Embedding via OpenAI

Here’s the deal: text-embedding-3-small is cheaper and faster, and for resume chunks under 400 tokens, it works just fine. I used this over ada-002 in my later iterations to cut cost while maintaining retrieval quality.

In Langflow, I used the OpenAI Embedding node like this:

  • Input: Chunked text
  • Model: text-embedding-3-small
  • Output: 1536-dim vector
  • Metadata: chunk_id, section_label, resume_id (you’ll thank yourself later for this)

If you’re not using Langflow’s built-in embedding, here’s a quick code equivalent:

from openai import OpenAI
import os

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def get_embedding(text):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

d. Astra DB Setup

You might be wondering: “Why Astra over Pinecone or Weaviate?”
For me, it came down to simplicity: Astra gave me an out-of-the-box vector store with metadata filtering and no infra headache.

Here’s the minimal setup I used to push resume chunks into Astra:

from astrapy.db import AstraDB, AstraDBCollection
import os

db = AstraDB(
    token=os.getenv("ASTRA_DB_APPLICATION_TOKEN"),
    api_endpoint=f"https://{os.getenv('ASTRA_DB_ID')}-{os.getenv('ASTRA_DB_REGION')}.apps.astra.datastax.com",
    namespace=os.getenv("ASTRA_DB_KEYSPACE"),
)

# The collection needs to exist with a vector dimension of 1536
# (create it once via the Astra UI or db.create_collection)
collection = AstraDBCollection(collection_name="resume_chunks", astra_db=db)

def insert_chunk(chunk_text, embedding, metadata):
    collection.insert_one({
        "text": chunk_text,
        "$vector": embedding,  # the Data API stores vectors under "$vector"
        "metadata": metadata
    })

Schema Design That Worked for Me:

Each vector in Astra included:

  • chunk_id: For traceability in Langflow
  • resume_id: To group chunks per candidate
  • section: e.g., “experience”, “skills”, etc. (used in reranking prompts)

Astra handles filtering beautifully. For example, you can retrieve only "experience" sections during prompt composition—super useful when generating feedback just for work history.
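
Here’s a rough sketch of that filter in code. It assumes a newer astrapy collection API that exposes vector_find (check the method name against the version you’ve pinned), and the field names match the insert_chunk() schema above:

def fetch_experience_chunks(query_embedding, resume_id, top_k=6):
    # Vector search restricted to one resume's "experience" chunks
    hits = collection.vector_find(
        query_embedding,
        limit=top_k,
        filter={"metadata.resume_id": resume_id, "metadata.section": "experience"},
    )
    return [hit["text"] for hit in hits]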


Step 3: Retrieval-Augmented Feedback Pipeline

“A model without context is just guessing.”
When I first wired up GPT-4 for resume feedback, the responses sounded polished… but wildly off-base. Why? Because it didn’t know the resume. It was hallucinating a context that didn’t exist.

That’s when I knew RAG (Retrieval-Augmented Generation) was the only way forward.

a. Langflow: Creating the RAG Chain

Let’s walk through how I built a minimal but powerful RAG setup inside Langflow. Here’s what the core flow looked like:

  • Input Node: Takes parsed resume text from previous stage
  • Vector Retrieval Node: Queries Astra DB for the most relevant chunks
  • Prompt Assembly Node: Injects retrieved context into prompt
  • LLM Node: Uses GPT-4-turbo (or GPT-3.5-turbo for cheaper runs)
  • Output Formatter Node: Returns structured, readable feedback

Astra Vector Search Setup

In Langflow’s vector search block, I configured:

  • Collection: resume_chunks
  • Query vector: Embedded resume summary
  • Metadata filters (optional): e.g., fetch only “experience” sections
  • Top-k: I found 5–8 hits to be the sweet spot

This might surprise you: GPT-4 gives much better feedback when you feed it just the right amount of context — not too much, not too little.

b. Prompt Engineering: Dynamic & Modular

Let me tell you something I learned the hard way — one-size-fits-all prompts just don’t work for resumes. You need to structure prompts differently for different feedback types: clarity, phrasing, alignment, tone, or even keyword density for ATS systems.

Here’s a base template I built:

resume_prompt = f"""
You're a professional resume reviewer helping candidates improve their resumes for tech roles.
Below is a candidate's resume content:

{text}

Based on this, provide detailed and constructive feedback in the following format:

1. Clarity:
2. Phrasing:
3. Alignment with job roles:
4. Missing critical sections:
5. ATS readability suggestions:

Avoid generic advice. Be precise and specific.
"""

Injecting Few-shot Examples

To improve consistency, I added a few-shot block to the prompt like this:

example = """
Example Resume:
"Worked on several software projects."

Feedback:
1. Clarity: Unclear which projects were worked on.
2. Phrasing: "Several" is vague — replace with numbers or names.
3. Alignment: No mention of outcomes or technologies.

---

"""
resume_prompt = example + resume_prompt

Langflow supports templated inputs, so I plugged this into a “Prompt Template” block where I could modify different instruction flavors dynamically.

Personally, this made my testing and iteration so much faster. I could tune how strict or lenient the tone of feedback was just by adjusting the few-shot layer.

c. Langflow Chain Design (Visual + Flow)

Here’s how the Langflow chain is laid out, node by node; the .json export lives under langflow_projects/ in the repo structure from Step 1.

[Input Node]
     ↓
[Embedding Node] → [Astra Vector Search Node]
     ↓                        ↓
     └───────────┬────────────┘
                 ↓
     [Prompt Builder Node]
                 ↓
         [LLM (GPT-4)]
                 ↓
   [Response Formatter Node]
                 ↓
     [Output to UI / API]

Node-by-Node Breakdown:

  • Input Node: Accepts plain text resume input (from PDF parser flow)
  • Embedding Node: Re-embeds resume summary for similarity search
  • Astra Vector Search Node: Fetches top relevant chunks
  • Prompt Builder Node: Dynamically formats the instruction + context
  • LLM Node: Runs OpenAI GPT-4-turbo with system-level instructions
  • Output Formatter: Cleans up feedback into JSON or markdown for display

You might be wondering: “Why not just use the full resume as context?”
I tried that. The feedback turned into a generic checklist. When I used only relevant sections via Astra retrieval, the feedback quality improved drastically.
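
For reference, here’s what the same chain looks like as plain script code. It’s a sketch stitched together from the earlier snippets: get_embedding() and fetch_experience_chunks() come from Step 2 and the Astra section, and resume_prompt_template is the earlier prompt rewritten as a plain string used with .format():

from openai import OpenAI
import os

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def get_feedback(resume_summary, resume_id):
    # 1. Embed the resume summary and retrieve only the relevant chunks from Astra
    query_vec = get_embedding(resume_summary)
    context = "\n\n".join(fetch_experience_chunks(query_vec, resume_id))

    # 2. Inject the retrieved context into the prompt template
    prompt = resume_prompt_template.format(text=context)

    # 3. Generate section-wise feedback with GPT-4-turbo
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": "You are a precise, constructive resume reviewer."},
            {"role": "user", "content": prompt},
        ],
        temperature=0.3,
    )
    return response.choices[0].message.content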


Step 4: Interactive Interface (Optional but Worth Every Bit)

“You can have the smartest model in the world, but if no one can use it — what’s the point?”

Early on, I built this system purely CLI-first. Functional? Yes. Usable? Not unless you were me. So I started layering on interactivity.

This is optional, of course. But if you’re deploying this for a team — or making it public-facing — a clean, fast frontend is where it starts to shine.

a. Streamlit UI (Fastest Way to Get Started)

Streamlit was my go-to for the MVP. Here’s the stripped-down version I started with:

# streamlit_app.py
import streamlit as st
from my_rag_pipeline import get_resume_feedback

st.title("LLM-Powered Resume Assistant")

uploaded_file = st.file_uploader("Upload your resume", type=["pdf", "txt"])

if uploaded_file:
    resume_text = extract_text(uploaded_file)  # Your parser here
    feedback, score, rewritten = get_resume_feedback(resume_text)

    st.subheader("Score")
    st.write(score)

    st.subheader("Rewritten Resume")
    st.text_area("Improved Version", rewritten, height=300)

    st.subheader("Detailed Feedback")
    st.write(feedback)

    # Optional: download_pdf() (defined in the next section) renders the PDF
    # server-side; swap in st.download_button for a true browser download
    if st.button("Download PDF"):
        download_pdf(rewritten, filename="updated_resume.pdf")

The get_resume_feedback() method is just a wrapper around the Langflow chain you already built — nothing new, just hooked into a friendly interface.
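
If it helps, here’s a sketch of what that wrapper can look like. It assumes you exported the flow as JSON and that Langflow’s load_flow_from_json returns a callable chain; the file path and the JSON keys in the parsing step are placeholders for whatever your output formatter node emits:

# my_rag_pipeline.py (sketch)
import json
from langflow import load_flow_from_json

# Hypothetical path to the exported flow
flow = load_flow_from_json("langflow_projects/resume_feedback.json")

def get_resume_feedback(resume_text):
    raw = flow(resume_text)
    # My formatter node emits JSON with these three keys; adjust to your flow
    parsed = json.loads(raw)
    return parsed["feedback"], parsed["score"], parsed["rewritten_resume"]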

b. Optional Features That Took It to the Next Level

Once I had the basics working, I added these:

  • PDF Download: I used pdfkit with a Jinja2 HTML template:

import pdfkit
from jinja2 import Template

def download_pdf(resume_text, filename):
    # pdfkit needs the wkhtmltopdf binary installed on the host
    template = Template(open("template.html").read())
    html = template.render(resume=resume_text)
    pdfkit.from_string(html, filename)

  • Authentication: For public demos, I wired in Auth0. Firebase Auth also works well, especially if you’re going mobile-first.
  • Langchain + FastAPI (for APIs): For production use, I wrapped the Langflow logic into a FastAPI backend and connected it to a Langchain UI dashboard. Way more extensible long-term (a minimal sketch follows).
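
Here’s that FastAPI layer in sketch form; the endpoint path and request model are mine, and it reuses the same get_resume_feedback() wrapper the Streamlit app calls:

# app/api.py
from fastapi import FastAPI
from pydantic import BaseModel

from my_rag_pipeline import get_resume_feedback  # same wrapper the Streamlit app uses

app = FastAPI()

class ResumeRequest(BaseModel):
    resume_text: str

@app.post("/feedback")
def feedback(req: ResumeRequest):
    feedback_text, score, rewritten = get_resume_feedback(req.resume_text)
    return {"score": score, "feedback": feedback_text, "rewritten_resume": rewritten}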

Step 5: Testing & Evaluation

“Models don’t fail silently — they fail confidently.”
That’s what makes evaluation so tricky.

Here’s how I personally test the system in a loop.

a. Feedback Quality Evaluation

No rocket science here — just a simple truth: you need some gold-standard references.

Here’s how I do it:

  • Take 10+ hand-reviewed resumes (annotated with great vs bad phrasing)
  • Feed them into the pipeline
  • Compare: Does the feedback flag the same issues a real reviewer would?

I’ve used OpenAI’s own gpt-4 as a meta-evaluator too — but keep it consistent. Don’t mix models unless you want hallucination compounding.
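
A rough sketch of that meta-evaluation loop, with a made-up 1–5 rubric (the scale and prompt wording are mine, not a standard):

from openai import OpenAI
import os

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def grade_feedback(resume_text, generated_feedback, gold_feedback):
    # Ask GPT-4 to judge how well the generated feedback matches the human reviewer's
    prompt = f"""Compare the two feedback sets for the resume below.
Score 1-5 for how well the generated feedback covers the issues the human reviewer flagged.
Return only the number.

Resume:
{resume_text}

Human reviewer feedback:
{gold_feedback}

Generated feedback:
{generated_feedback}
"""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    # Assumes the model obeys "return only the number"; guard this in real use
    return int(response.choices[0].message.content.strip())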

b. LLM Hallucination Control

This might surprise you: hallucination isn’t just a model issue — it’s often a prompting issue.

Here’s what’s worked best for me:

  • Ground every prompt with explicit context, especially:
    • The job title
    • The resume section (e.g., “Work Experience”)
    • The feedback scope (grammar? alignment? ATS-fit?)
  • Use memoryless chains: Stateless prompts reduce drift
  • Guard with structure: Ask for output in bullet format or JSON
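
Here’s the kind of guard I mean: a sketch that asks for JSON and rejects anything that doesn’t parse (the schema keys are illustrative, not fixed):

import json

FEEDBACK_SCHEMA_HINT = (
    'Respond with JSON only: {"clarity": [...], "phrasing": [...], "ats": [...]}'
)

def parse_or_retry(raw_response, retry_fn, max_retries=2):
    # Structured output doubles as a hallucination check: if the model drifts
    # from the schema, re-ask instead of trusting free text
    for _ in range(max_retries + 1):
        try:
            return json.loads(raw_response)
        except json.JSONDecodeError:
            raw_response = retry_fn(FEEDBACK_SCHEMA_HINT)
    raise ValueError("Model never returned valid JSON feedback")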

c. Logging & Traceability

If you’re not tracking what your model does — you’re flying blind.

Langflow gives you basic node-level logs, which is a good start.

But I took it further with Langfuse — it’s an observability layer for LLMs. Here’s how I use it:

  • Log input resume text
  • Log intermediate chunks fetched from Astra DB
  • Log final prompt & LLM response
  • Tag any user feedback or scoring

from langfuse import Langfuse

langfuse = Langfuse(public_key="...", secret_key="...")

langfuse.trace(
    name="resume_feedback",
    input=resume_text,
    output=llm_response,
    metadata={"job_title": "Data Scientist"}
)

That wraps up interface and evaluation.

You might be wondering: Is all this effort worth it for just a resume assistant?

In my experience — yes. The modularity you build here applies to dozens of other use cases. Anywhere user input needs to be interpreted, critiqued, and rewritten — this architecture shines.


Step 6: Deployment (Optional, but Seriously Powerful)

“A prototype that never leaves your laptop is just a fancy script.”

When I first containerized the whole thing and spun it up across environments — that’s when it started to feel real.

If you’re just running locally, skip this. But if you’re planning to share your Langflow app with stakeholders, or make it part of a resume evaluation product, here’s how I did it.

a. Langflow in Docker (My Production Setup)

Langflow’s local setup is easy enough, but for production I wrapped everything into Docker. Here’s the base Dockerfile I used:

# Dockerfile
FROM python:3.10-slim

WORKDIR /app

COPY . .

RUN pip install -r requirements.txt

CMD ["langflow", "run", "--host", "0.0.0.0", "--port", "7860"]

Then, build and run:

docker build -t resume-rag-app .
docker run -p 7860:7860 --env-file .env resume-rag-app

This let me deploy it on Fly.io in minutes.

b. Astra DB: Scaling with Region Awareness

Astra DB handled scale far better than I initially expected — especially on vector queries. That said, here’s one thing I had to deal with: latency across regions.

So if your frontend’s in the EU but your Astra DB is in us-east1, expect a ~300ms RTT. For low-latency feedback, either:

  • Spin up Astra in the same region as your backend (e.g., eu-west-1)
  • Or, if you’re using a multi-tenant backend, cache embeddings locally before querying Astra
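
The caching I’m describing is nothing fancy; an in-process sketch like this covers the common case (a real multi-tenant deployment would want Redis or similar):

import hashlib

_embedding_cache = {}

def get_embedding_cached(text):
    # Hash the chunk text so repeated uploads of identical content
    # don't re-embed (and re-pay) before hitting Astra
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _embedding_cache:
        _embedding_cache[key] = get_embedding(text)  # OpenAI helper from Step 2
    return _embedding_cache[key]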

Throughput was never the bottleneck — but metadata query design was. Filtering on tags like candidate_id and section made everything faster, especially when pulling resume sections separately.

c. Frontend + Backend Hosting

Here’s the combo I eventually settled on:

Component      | Hosting              | Why it worked
Langflow UI    | Render               | Simple deployments, good for MVP demos
API backend    | Fly.io / AWS Lambda  | Fly.io gave me a persistent container for LLM chaining; Lambda worked better for stateless prompt eval
Resume Parser  | Cloudflare Workers   | Blazing fast for simple PDF → text preprocessing
Auth           | Auth0                | Clean integration with both frontend and backend routes

And yes, I did try pushing everything into a monorepo. Don’t — keep Langflow separate from your core logic unless you’re customizing nodes deeply.


Step 7: Final Thoughts — What I Learned From Shipping It

Let me be blunt — it wasn’t all smooth sailing. But that’s the good stuff. That’s what makes the build better next time.

What Worked Surprisingly Well

  • Langflow’s Visual Node Interface
    It made experimenting with different prompt styles and chain flows so much faster. Honestly, I underestimated how helpful visual chaining could be at the prototyping stage.
  • Astra DB at Scale
    Embeddings persisted reliably, vector search latency stayed predictable. Even with 100K+ vector entries, it held up better than some local FAISS deployments I tested.

What Broke (and Burned Time)

  • Resume Parsing Inconsistencies
    This one bit me hard. Scanned PDFs? Forget it. Even standard PDFs can throw weird line breaks and unicode junk. I ended up needing fallback heuristics using PyMuPDF after pdfplumber failed on some docs.
  • LLM Feedback Verbosity
    GPT-4 loves to talk. Sometimes too much. I had to start wrapping responses in truncation logic, or use stricter prompt formatting like:
“Give feedback in bullet points only. Max 5 bullets. No preamble or summary.”

What I’d Optimize Next Time

  • Chunking Strategy
    I’d lean more heavily into semantic chunking. Even with sentence-boundary-aware chunking, sometimes the embeddings lacked context. One thing I’m experimenting with is hybrid chunking: semantic units + overlap + role-aware chunk tags.
  • Metadata Tagging
    More structured tags = smarter search. I’d formalize tags like:
    • role=Software Engineer
    • section=Experience
    • skill=Python
    So that feedback could be filtered within specific resume dimensions.
  • Multi-modal Inputs
    If I had more time, I’d add OCR + layout detection (like LayoutParser) to handle image-based resumes. Some of the best designer resumes I saw were all graphics — zero text.

Final Word

You don’t need to deploy everything to feel the value. Even if you stop at local testing with Langflow and a CLI script, this RAG pipeline can actually improve resumes in the wild — I’ve seen it.

But once you ship it, you’ll uncover patterns, edge cases, and user behavior you never anticipated. And that’s where this thing really evolves from a project… into a product.
