Thrive Career Wellness:
Automated Outplacement Platform
Engineering an agentic system to mitigate corporate liability by maximizing interview conversion rates for offboarded employees.
Executive Summary
01 // The Liability Problem
Client: Thrive Career Wellness (Outplacement Provider)
The Context: Companies conducting layoffs face high legal risks. To mitigate wrongful termination lawsuits, they must demonstrate they provided meaningful support to help ex-employees land new roles.
The Constraint: We cannot use generic "resume builders." The system needs to map a candidate's specific Impact Stories to open market opportunities with high precision. Furthermore, due to strict enterprise data agreements, we have Zero-Trust PII constraints: no candidate names or contact info may touch the public LLM layer.
02 // The 3-Workflow Architecture
To match the nuance of a human career coach, I architected three distinct graph workflows. This ensures separation of concerns between Strategy, Execution, and Validation.
A. The Resume Graph (Supervisor-Worker-Editor)
Uses a Supervisor (DeepSeek-R1) to plan the pivot strategy, a Writer to draft the content, and an Editor to enforce the "Liability Shield," rejecting any hallucinated skills.
import os
import json
import re
from typing import TypedDict, Optional, List, Any
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.messages import HumanMessage
from langgraph.graph import StateGraph, END
from schemas import CandidateProfile, ResumeStrategy, JobMatchReport
load_dotenv()
# =============================================================================
# 1. SHARED STATE DEFINITIONS
# =============================================================================
class AgentState(TypedDict):
"""Represents the state for the resume generation workflow."""
profile: CandidateProfile
target_role: str
strategy: ResumeStrategy
final_draft: str
match_report: JobMatchReport
    editor_feedback: Optional[str]  # None once the editor approves the draft
class ApplicationState(TypedDict):
"""Represents the state for the cover letter tailoring workflow."""
job_title: str
job_description: str
candidate_profile: str
cover_letter: str
status: str
class JobSearchState(TypedDict):
"""Represents the state for the job market search and ranking workflow."""
profile: CandidateProfile
raw_jobs: List[dict]
ranked_jobs: List[dict]
search_criteria: str
# =============================================================================
# 2. MODEL FACTORY
# =============================================================================
# DeepSeek Reasoner for high-level strategy (Supervisor/Profiler)
supervisor_llm = ChatOpenAI(
model="deepseek-reasoner",
base_url="https://api.deepseek.com",
api_key=os.getenv("DEEPSEEK_API_KEY")
)
# DeepSeek Chat for content generation and scrubbing (Writer/Editor/Matcher)
worker_llm = ChatOpenAI(
model="deepseek-chat",
base_url="https://api.deepseek.com",
api_key=os.getenv("DEEPSEEK_API_KEY"),
temperature=0.7
)
# =============================================================================
# 3. HELPER FUNCTIONS
# =============================================================================
def get_anonymized_profile(profile: CandidateProfile) -> str:
"""
Redacts Personally Identifiable Information (PII) from the profile.
"""
safe_profile = profile.model_copy(deep=True)
safe_profile.full_name = "[CANDIDATE_NAME]"
safe_profile.contact_email = "[CONTACT_EMAIL]"
safe_profile.phone = "[PHONE_NUMBER]"
safe_profile.linkedin = "[LINKEDIN_URL]"
return safe_profile.model_dump_json()
def extract_json_from_text(text: str) -> Optional[Any]:
"""
Extracts and parses JSON content from a string.
"""
try:
match = re.search(r"```json\n(.*?)\n```", text, re.DOTALL)
if match: return json.loads(match.group(1))
match = re.search(r"(\{.*\})", text, re.DOTALL)
if match: return json.loads(match.group(1))
return None
except (json.JSONDecodeError, AttributeError):
return None
def sanitize_latex(latex_code: str) -> str:
"""
Cleans and validates LaTeX source code generated by the LLM.
"""
# 1. Clean Markdown and extra whitespace
clean = latex_code.replace("```latex", "").replace("```", "").strip()
# 2. ESCAPE DOLLAR SIGNS: Critical for currency in text
clean = re.sub(r'(?<!\\)\$([0-9])', r'\\$\1', clean)
    # 3. FORCE ITEM MAPPING: Converts raw \item bullets to the template's \resumeItem command
clean = re.sub(r'\\item\s+\\textbf\{(.*?)\}:\s+(.*)', r'\\resumeItem{\1}{\2}', clean)
# 4. REMOVE EMPTY STRUCTURES: Prevents Tectonic from hanging
clean = re.sub(r"\\resumeItemListStart\s*\\resumeItemListEnd", "", clean)
clean = re.sub(r"\\resumeSubHeadingListStart\s*\\resumeSubHeadingListEnd", "", clean)
return clean
# =============================================================================
# 4. WORKFLOW 1: RESUME GENERATION
# =============================================================================
def supervisor_node(state: AgentState):
"""
Analyzes candidate profile against target role to define a pivot strategy.
"""
prompt = ChatPromptTemplate.from_messages([
("system", """You are a Career Strategy Architect.
Analyze the profile and generate a strategy.
Return ONLY valid JSON matching the schema:
{{
"reasoning_summary": "Pivot justification",
"gaps_identified": ["skill1", "skill2"],
"instruction_to_writer": "How to tailor bullets",
"next_action": "delegate_to_writer"
}}"""),
("user", "Target: {target_role}\nProfile: {profile}")
])
chain = prompt | supervisor_llm
response = chain.invoke({
"target_role": state['target_role'],
"profile": get_anonymized_profile(state['profile'])
})
    data = extract_json_from_text(response.content)
    if not data:
        raise ValueError("Supervisor returned no parseable strategy JSON.")
    # Normalize key drift occasionally produced by the reasoning model
    if "strategy_overview" in data:
        data["reasoning_summary"] = data.pop("strategy_overview")
    data.setdefault("gaps_identified", [])
    return {"strategy": ResumeStrategy(**data)}
def writer_node(state: AgentState):
"""
Generates tailored LaTeX resume content based on the defined strategy.
"""
strategy = state['strategy']
    # Include editor feedback if this is a retry loop
feedback_context = ""
if state.get("editor_feedback"):
feedback_context = f"\nCRITICAL FEEDBACK FROM PREVIOUS DRAFT: {state['editor_feedback']}\nFix these issues immediately."
latex_template = r"""
\documentclass[letterpaper,10pt]{article}
\usepackage{latexsym, fullpage, titlesec, marvosym, verbatim, enumitem, hyperref, fancyhdr, times, xcolor}
\pagestyle{fancy} \fancyhf{} \fancyfoot{} \renewcommand{\headrulewidth}{0pt} \renewcommand{\footrulewidth}{0pt}
\addtolength{\oddsidemargin}{-0.55in} \addtolength{\evensidemargin}{-0.55in} \addtolength{\textwidth}{1.1in}
\addtolength{\topmargin}{-0.6in} \addtolength{\textheight}{1.2in}
\titleformat{\section}{\vspace{-8pt}\scshape\raggedright\large}{}{0em}{}[\color{black}\titlerule \vspace{-4pt}]
\newcommand{\resumeItem}[2]{\item\small{\textbf{#1}{: #2 \vspace{-2pt}}}}
\newcommand{\resumeSubheading}[4]{\vspace{-2pt}\item[]\begin{tabular*}{0.98\textwidth}{l@{\extracolsep{\fill}}r}\hspace{-10pt}\textbf{#1} & #2 \\ \hspace{-10pt}\textit{\small#3} & \textit{\small #4} \end{tabular*}\vspace{-6pt}}
\newcommand{\resumeSubHeadingListStart}{\begin{itemize}[leftmargin=*]} \newcommand{\resumeSubHeadingListEnd}{\end{itemize}}
\newcommand{\resumeItemListStart}{\begin{itemize}} \newcommand{\resumeItemListEnd}{\end{itemize}\vspace{-6pt}}
\begin{document}
\begin{center}\huge \textbf{[CANDIDATE_NAME]} \\ \vspace{4pt} \small [CONTACT_EMAIL] $\vert$ [PHONE_NUMBER] $\vert$ [LINKEDIN_URL] \end{center}
\vspace{-20pt}
% CONTENT_START
\end{document}
"""
system_instruction = """
You are an Expert LaTeX Resume Writer.
1. STRICTLY follow the Action-Context-Result (ACR) framework for bullets.
2. DO NOT use standard '\\item'. YOU MUST USE '\\resumeItem{{Heading}}{{Content}}'.
    3. ESCAPE ALL CURRENCY SYMBOLS. Write '\\$5M', never '$5M'.
    4. DO NOT invent commands. Use ONLY the commands provided in the template.
    5. Output the FULL LaTeX code starting from \\documentclass.
"""
prompt = ChatPromptTemplate.from_messages([
("system", system_instruction),
("user", "STRATEGY: {instruction}\nDATA: {profile_data}\nFEEDBACK: {feedback}\nTEMPLATE: {template}")
])
chain = prompt | worker_llm
response = chain.invoke({
"instruction": strategy.instruction_to_writer,
"profile_data": get_anonymized_profile(state['profile']),
"feedback": feedback_context,
"template": latex_template
})
return {"final_draft": sanitize_latex(response.content)}
def editor_node(state: AgentState):
"""
Fact-checks the resume draft against the source profile.
Decides whether to APPROVE the draft or REJECT it for hallucination.
"""
prompt = ChatPromptTemplate.from_messages([
("system", """
You are a Strict Background Checker.
Compare the Draft against the Source Profile.
Rules:
1. If the Draft contains skills or metrics NOT in Source, REJECT.
2. If the Draft has broken LaTeX syntax, REJECT.
3. If the Draft is faithful, APPROVE.
Return JSON:
{{
"status": "APPROVE" | "REJECT",
"feedback": "Specific instructions on what to fix (if REJECT)",
"corrected_latex": "Optional minor fixes"
}}
"""),
("user", "SOURCE: {profile}\nDRAFT: {draft}")
])
chain = prompt | worker_llm
response = chain.invoke({
"profile": get_anonymized_profile(state['profile']),
"draft": state['final_draft']
})
data = extract_json_from_text(response.content)
if data:
# If the editor wants to reject, we pass the feedback back to the graph state
if data.get("status") == "REJECT":
return {
"editor_feedback": data.get("feedback", "General hallucination detected."),
"final_draft": state['final_draft'] # Keep old draft to show history if needed
}
# If approved, check if there are minor auto-fixes
if data.get("corrected_latex"):
return {
"editor_feedback": None, # Clear feedback
"final_draft": sanitize_latex(data['corrected_latex'])
}
# Default Approve
return {"editor_feedback": None, "final_draft": state['final_draft']}
def should_continue(state: AgentState):
"""
Conditional Edge Logic:
If editor_feedback is present, loop back to 'writer'.
Else, go to END.
"""
if state.get("editor_feedback"):
return "writer"
return END
def build_graph():
"""Compiles the primary resume generation state machine."""
workflow = StateGraph(AgentState)
# Add Nodes
workflow.add_node("supervisor", supervisor_node)
workflow.add_node("writer", writer_node)
workflow.add_node("editor", editor_node)
# Set Entry
workflow.set_entry_point("supervisor")
# Standard Edges
workflow.add_edge("supervisor", "writer")
workflow.add_edge("writer", "editor")
# Conditional Edge (The Loop)
workflow.add_conditional_edges(
"editor",
should_continue,
{
"writer": "writer", # Loop back if rejected
END: END # Finish if approved
}
)
return workflow.compile()
# =============================================================================
# 5. WORKFLOW 2: APPLICATION TAILORING
# =============================================================================
def cover_letter_node(state: ApplicationState):
"""Generates a contextual cover letter (Now PII-Safe)."""
raw_profile = CandidateProfile.model_validate_json(state['candidate_profile'])
safe_profile_json = get_anonymized_profile(raw_profile)
prompt = f"""
Write a punchy, 3-paragraph cover letter for the role: {state['job_title']}.
JOB DESCRIPTION: {state['job_description']}
CANDIDATE PROFILE: {safe_profile_json}
Use [CANDIDATE_NAME] and [CONTACT_EMAIL] placeholders.
"""
msg = worker_llm.invoke([HumanMessage(content=prompt)])
return {"cover_letter": msg.content}
def cover_letter_scrubber_node(state: ApplicationState):
"""Verifies cover letter accuracy (Now PII-Safe)."""
raw_profile = CandidateProfile.model_validate_json(state['candidate_profile'])
safe_profile_json = get_anonymized_profile(raw_profile)
prompt = f"""
Compare the following cover letter against the profile.
Delete any skills or achievements that are not directly supported by the profile.
PROFILE: {safe_profile_json}
LETTER: {state['cover_letter']}
"""
msg = worker_llm.invoke([HumanMessage(content=prompt)])
return {"cover_letter": msg.content}
def build_application_graph():
"""Compiles the application packet state machine."""
workflow = StateGraph(ApplicationState)
workflow.add_node("writer", cover_letter_node)
workflow.add_node("scrubber", cover_letter_scrubber_node)
workflow.set_entry_point("writer")
workflow.add_edge("writer", "scrubber")
workflow.add_edge("scrubber", END)
return workflow.compile()
# =============================================================================
# 6. WORKFLOW 3: JOB DISCOVERY
# =============================================================================
def profiler_node(state: JobSearchState):
"""Identifies cross-sector skill clusters to broaden search parameters."""
prompt = f"Identify core skill clusters for a cross-sector pivot. PROFILE: {get_anonymized_profile(state['profile'])}"
response = supervisor_llm.invoke([HumanMessage(content=prompt)])
return {"search_criteria": response.content}
def matcher_node(state: JobSearchState):
"""Ranks and selects the top 10 matches from a raw vector search pool."""
job_list_str = "\n".join([f"ID: {j['id']} | Title: {j['title']} | Desc: {j.get('description', '')[:200]}" for j in state['raw_jobs']])
prompt = f"""
Select the top 10 Job IDs based on the strategy.
STRATEGY: {state['search_criteria']}
POOL: {job_list_str}
Return ONLY a JSON list of integers.
"""
response = worker_llm.invoke([HumanMessage(content=prompt)])
try:
raw_data = extract_json_from_text(response.content)
if isinstance(raw_data, dict):
for key in ["ids", "selected_ids", "matches"]:
if key in raw_data and isinstance(raw_data[key], list):
raw_data = raw_data[key]
break
ai_ids = [str(x) for x in raw_data] if isinstance(raw_data, list) else []
final_jobs = [j for j in state['raw_jobs'] if str(j['id']) in ai_ids]
if not final_jobs:
final_jobs = state['raw_jobs'][:10]
return {"ranked_jobs": final_jobs}
except Exception:
return {"ranked_jobs": state['raw_jobs'][:10]}
def build_job_search_graph():
"""Compiles the market discovery state machine."""
workflow = StateGraph(JobSearchState)
workflow.add_node("profiler", profiler_node)
workflow.add_node("matcher", matcher_node)
workflow.set_entry_point("profiler")
workflow.add_edge("profiler", "matcher")
workflow.add_edge("matcher", END)
return workflow.compile()
B. The Application Graph (Writer-Scrubber)
Generates tailored cover letters while strictly adhering to PII protocols. The Scrubber node verifies that no hallucinated contact details or unsupported claims make it into the final PDF packet.
C. The Job Search Graph (Profiler-Matcher)
Instead of simple keyword matching, the Profiler analyzes the candidate's transferable skills (e.g., "Project Management" -> "Product Owner") and the Matcher ranks vector search results based on that strategic pivot.
03 // Synthetic Market Simulation
We could not train on real client data due to privacy laws. To validate the system's ability to handle diverse roles (from Junior DevOps to Staff Product Managers), I built a Synthetic Data Engine.
This async pipeline generates thousands of realistic "Impact Stories" and "Job Descriptions," allowing us to stress-test the Agents' reasoning capabilities before a single real user logs in.
"""
Candidate Persona Generator
Utilizes asynchronous concurrency to generate detailed, story-based
professional profiles for system testing and database seeding.
"""
import os
import json
import random
import asyncio
from typing import Optional
from openai import AsyncOpenAI
from dotenv import load_dotenv
from tqdm.asyncio import tqdm
load_dotenv()
# --- CONFIGURATION ---
NUM_TO_GENERATE = 30
MAX_CONCURRENT_REQUESTS = 10 # Lowered concurrency to handle larger JSON payloads
OUTPUT_FILE = "candidates_database.json"
client = AsyncOpenAI(
base_url="https://api.deepseek.com",
api_key=os.getenv("DEEPSEEK_API_KEY")
)
# --- SEED DATA ---
ROLES = ["Backend Engineer", "Data Scientist", "Product Manager", "DevOps Engineer"]
SENIORITY = ["Junior", "Senior", "Staff"]
async def generate_single_profile(profile_id: int, semaphore: asyncio.Semaphore) -> Optional[dict]:
"""
Generates a single comprehensive candidate profile via asynchronous LLM call.
Args:
profile_id: Unique integer ID for the candidate.
semaphore: Concurrency controller to prevent API rate-limiting.
Returns:
A validated dictionary representing a candidate profile or None on failure.
"""
async with semaphore:
role = random.choice(ROLES)
level = random.choice(SENIORITY)
prompt = f"""
Generate a detailed Resume Profile for a {level} {role} in JSON format.
CRITICAL RULE: In the 'achievements' and 'description_bullets' fields,
DO NOT provide short bullets. Provide a 5-sentence story describing:
1. A specific technical crisis or project goal.
2. The architecture/tools the candidate chose to solve it.
3. The measurable result (e.g., latency reduced by 50%, $2M saved).
JSON SCHEMA:
{{
"full_name": "Name",
"contact_email": "email",
"phone": "phone",
"linkedin": "url",
"summary": "Summary",
"skills": ["Python", "SQL", "Docker", "AWS"],
"experience_history": [
{{ "company": "Co", "role": "Role", "achievements": ["5-sentence impact story..."] }}
],
"projects": [
{{ "title": "Name", "description_bullets": ["4-sentence narrative..."] }}
],
"education": [{{ "school": "Uni", "degree": "Degree", "year": "2024" }}]
}}
"""
try:
response = await client.chat.completions.create(
model="deepseek-chat",
messages=[{"role": "user", "content": prompt}],
temperature=0.8
)
# Extract and clean JSON content from markdown wrappers
content = response.choices[0].message.content.replace("```json", "").replace("```", "").strip()
data = json.loads(content)
# Inject metadata for system consistency
data['id'] = profile_id
data['level'] = level
return data
        except Exception as e:  # Exception already covers json.JSONDecodeError
            print(f"❌ Failed to generate profile {profile_id}: {e}")
return None
async def main():
"""
Orchestrates the parallel generation of candidate personas.
"""
print(f"π Generating {NUM_TO_GENERATE} Narratively-Dense Candidates...")
sem = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)
# Initialize task list
tasks = [generate_single_profile(i, sem) for i in range(NUM_TO_GENERATE)]
# Execute with visual progress tracking
results = await tqdm.gather(*tasks, desc="Building Personas")
# Filter out failed requests and save to local JSON
valid_profiles = [p for p in results if p is not None]
with open(OUTPUT_FILE, "w") as f:
json.dump(valid_profiles, f, indent=2)
print(f"\nβ
Success: Saved {len(valid_profiles)} narrative profiles to {OUTPUT_FILE}.")
if __name__ == "__main__":
asyncio.run(main())
"""
Job Database Generator
Orchestrates high-concurrency asynchronous API calls to DeepSeek
to build a diverse dataset of job descriptions.
"""
import os
import json
import random
import asyncio
from typing import Optional
from openai import AsyncOpenAI
from dotenv import load_dotenv
from tqdm.asyncio import tqdm
load_dotenv()
# --- CONFIGURATION ---
NUM_TO_GENERATE = 1000
MAX_CONCURRENT_REQUESTS = 50 # Limits simultaneous API hits to prevent rate limiting
OUTPUT_FILE = "jobs_database.json"
client = AsyncOpenAI(
base_url="https://api.deepseek.com",
api_key=os.getenv("DEEPSEEK_API_KEY")
)
# --- SEED DATA ---
SECTORS = ["FinTech", "Healthcare", "E-commerce", "Cybersecurity", "Green Energy",
"Gaming", "Logistics", "EdTech", "LegalTech", "AgriTech"]
ROLES = ["Machine Learning Engineer", "Backend Developer", "Data Scientist",
"DevOps Engineer", "Full Stack Developer", "Product Manager",
"Security Analyst", "Cloud Architect"]
SENIORITY = ["Junior", "Mid-Level", "Senior", "Staff", "Lead"]
async def generate_single_jd(job_id: int, semaphore: asyncio.Semaphore) -> Optional[dict]:
"""
Generates a single job description using an asynchronous API call.
Args:
job_id: Unique identifier for the job record.
semaphore: Controller to limit the number of concurrent tasks.
Returns:
A dictionary containing job metadata or None if the request fails.
"""
async with semaphore:
sector = random.choice(SECTORS)
role = random.choice(ROLES)
level = random.choice(SENIORITY)
prompt = (
f"Write a brief Job Description for a {level} {role} at a {sector} company. "
"Format: Plain text. Include 1. Role Summary, 2. Key Responsibilities, 3. Tech Stack. "
"Keep it under 100 words."
)
try:
response = await client.chat.completions.create(
model="deepseek-chat",
messages=[{"role": "user", "content": prompt}],
temperature=0.9
)
content = response.choices[0].message.content
return {
"id": job_id,
"title": f"{level} {role}",
"sector": sector,
"description": content
}
except Exception as e:
print(f"β Error on ID {job_id}: {e}")
return None
async def main():
"""
Main orchestrator to schedule and execute batch job generation.
"""
print(f"π Starting Async Generation: {NUM_TO_GENERATE} records...")
sem = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)
# Schedule all tasks into the event loop
tasks = [generate_single_jd(i, sem) for i in range(NUM_TO_GENERATE)]
# Execute tasks in parallel with a progress bar
results = await tqdm.gather(*tasks, desc="Generating Jobs")
# Filter failures and write to local storage
valid_jobs = [job for job in results if job is not None]
with open(OUTPUT_FILE, "w") as f:
json.dump(valid_jobs, f, indent=2)
print(f"\nβ
Success: Saved {len(valid_jobs)} jobs to {OUTPUT_FILE}")
if __name__ == "__main__":
asyncio.run(main())
04 // Pydantic Typing & PII Protection
In enterprise software, unstructured JSON is a liability. I enforced strict Pydantic Schemas for every data exchange. This guarantees that the AI cannot output malformed data that would break the downstream PDF compiler.
Crucially, the `CandidateProfile` schema includes fields for PII (`contact_email`, `phone`) that are programmatically redacted before entering the Agent Graph, ensuring compliance with GDPR/CCPA.
"""
Data schemas for the Thrive Wellness Career Platform.
Defines Pydantic models for candidate profiles, AI agent strategies,
and job matching analytics to ensure strict type validation across the system.
"""
from pydantic import BaseModel, Field
from typing import List, Optional, Union
# --- 1. CANDIDATE PROFILE COMPONENTS ---
class Experience(BaseModel):
"""Represents a single professional role in a candidate's work history."""
company: str
location: str
role: str
duration: str
achievements: List[str]
class Project(BaseModel):
"""Represents a technical or professional project in a candidate's portfolio."""
title: str
tech_stack: str
date: str
description_bullets: List[str]
class Education(BaseModel):
"""Represents a formal academic degree or certification."""
school: str
location: str
degree: str
year: str
class CandidateProfile(BaseModel):
"""The central data model for a candidate's full professional identity."""
full_name: str
contact_email: str
phone: str
linkedin: str
summary: str
skills: List[str]
experience_history: List[Experience]
projects: List[Project]
education: List[Education]
target_role: Optional[str] = None
job_description: Optional[str] = None
# --- 2. AI AGENT STRATEGY MODELS ---
class ResumeStrategy(BaseModel):
"""Output schema for the Supervisor Agent's strategic planning phase."""
reasoning_summary: str = Field(
description="The internal logic explaining the pivot strategy."
)
gaps_identified: List[Union[str, dict]] = Field(
description="List of detected skill deficiencies or narrative weaknesses."
)
instruction_to_writer: str = Field(
description="Technical directives for the Writer Agent to follow."
)
next_action: str = Field(
description="Determines the next node in the graph (e.g., 'delegate_to_writer')."
)
# --- 3. ANALYTICS & SCORING MODELS ---
class JobMatchReport(BaseModel):
"""Schema for ATS-style alignment analysis against a specific job role."""
score: int = Field(
description="Numerical match assessment ranging from 0 to 100."
)
missing_keywords: List[str] = Field(
description="Specific technical or soft skills missing from the resume."
)
hiring_manager_tip: str = Field(
description="Actionable advice to make the candidate more competitive."
)
05 // Production Roadmap
The current architecture is an MVP. The roadmap to scale this to 100k+ users involves four critical infrastructure upgrades:
We will replace manual review with CI/CD for Agents. Using a "Golden Dataset" of perfect resumes approved by human recruiters, we will run `deepeval` on every model update. If the "Hallucination Score" exceeds 0.1%, the deployment automatically rolls back.
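A minimal sketch of that gate, assuming deepeval's HallucinationMetric and LLMTestCase APIs; the golden-dataset path and the generate_resume callable are placeholders for the real harness:
import json
from deepeval.metrics import HallucinationMetric
from deepeval.test_case import LLMTestCase
HALLUCINATION_BUDGET = 0.001  # 0.1% of golden cases may fail before we roll back
def regression_gate(generate_resume, golden_path: str = "golden_resumes.json") -> bool:
    """Returns True if the candidate model may be promoted."""
    with open(golden_path) as f:
        golden = json.load(f)  # [{"profile": "...", "approved_resume": "..."}, ...]
    metric = HallucinationMetric(threshold=0.5)
    failures = 0
    for case in golden:
        draft = generate_resume(case["profile"])  # candidate model under test
        metric.measure(LLMTestCase(
            input=case["profile"],
            actual_output=draft,
            context=[case["profile"]],  # ground truth the draft must not exceed
        ))
        if not metric.is_successful():
            failures += 1
    return failures / len(golden) <= HALLUCINATION_BUDGET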
To reduce inference costs while maintaining reasoning quality, we will use our Synthetic Data Engine to create a training dataset. We will distill the complex reasoning patterns of DeepSeek-R1 (Teacher) into a smaller, quantized Llama-3-8B (Student) model, hosted on our own vLLM clusters.
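A sketch of the harvesting step, reusing the DeepSeek endpoint from the generators above. The prompt shape and the fixed "Product Owner" target are illustrative, not the final training recipe; the reasoning_content field is DeepSeek's documented channel for the reasoner's chain of thought.
import json
import os
from openai import OpenAI
teacher = OpenAI(base_url="https://api.deepseek.com",
                 api_key=os.getenv("DEEPSEEK_API_KEY"))
def harvest_trace(profile_json: str, target_role: str) -> dict:
    """One chat-format SFT record pairing a pivot prompt with the teacher's answer."""
    prompt = (f"Plan a resume pivot strategy.\n"
              f"Target: {target_role}\nProfile: {profile_json}")
    resp = teacher.chat.completions.create(
        model="deepseek-reasoner",
        messages=[{"role": "user", "content": prompt}],
    )
    msg = resp.choices[0].message
    # Keep the chain of thought so the Student learns the reasoning, not just answers;
    # fall back gracefully if the field is absent.
    thought = getattr(msg, "reasoning_content", "") or ""
    return {"messages": [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": f"{thought}\n\n{msg.content}".strip()},
    ]}
if __name__ == "__main__":
    with open("candidates_database.json") as f:
        profiles = json.load(f)
    with open("distillation_set.jsonl", "w") as out:
        for p in profiles:
            record = harvest_trace(json.dumps(p), target_role="Product Owner")
            out.write(json.dumps(record) + "\n")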
We will implement an A/B testing framework where a "Challenger" model runs in shadow mode on 5% of traffic. We measure success not by latency, but by the business KPI: "Did the user get an interview?"
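A sketch of the deterministic shadow router, with the champion and challenger models passed in as plain callables standing in for the real serving clients:
import hashlib
SHADOW_TRAFFIC_PCT = 5  # challenger shadows 5% of traffic; its output is never served
def bucket(user_id: str) -> str:
    """Deterministic hash bucketing so each user stays in one cohort across sessions."""
    h = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "challenger_shadow" if h < SHADOW_TRAFFIC_PCT else "champion"
def generate_with_shadow(user_id: str, payload: dict,
                         champion, challenger, log_outcome) -> str:
    """Always serve the champion; run the challenger off the hot path for comparison."""
    result = champion(payload)
    if bucket(user_id) == "challenger_shadow":
        # Logged outcomes are later joined with the interview KPI, not latency.
        log_outcome(user_id, challenger(payload))
    return result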
We will operationalize PII redaction by deploying Microsoft Presidio as a sidecar proxy. This ensures that even if a developer accidentally logs a payload, names and emails are tokenized at the network edge before they ever hit the database logs or the model.
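A sketch of the redaction hook such a sidecar would expose, assuming the presidio-analyzer and presidio-anonymizer packages; the entity list is trimmed to the fields we redact today:
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()
def redact_for_logs(payload: str) -> str:
    """Tokenizes names, emails, and phone numbers before anything is persisted."""
    findings = analyzer.analyze(
        text=payload,
        entities=["PERSON", "EMAIL_ADDRESS", "PHONE_NUMBER"],
        language="en",
    )
    return anonymizer.anonymize(text=payload, analyzer_results=findings).text
# redact_for_logs("Reach Jane Doe at jane@x.com")
# -> "Reach <PERSON> at <EMAIL_ADDRESS>"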
"""
Thrive Wellness Career Transition Platform - Application Entry Point
------------------------------------------------------------------
This module serves as the frontend interface for the multi-agent career
transition system. It handles:
1. User input and profile management via Streamlit sidebar.
2. Local job database loading and vector search indexing.
3. Orchestration of Agent Graphs (Resume, Application, Search).
4. Local LaTeX compilation and PDF rendering.
5. PII Restoration and data privacy enforcement.
"""
import streamlit as st
import subprocess
import os
import json
import base64
import re
import urllib.parse
import chromadb
from chromadb.utils import embedding_functions
from dotenv import load_dotenv
load_dotenv()
from schemas import CandidateProfile, Experience, Education, Project
from agent_graph import build_graph, build_application_graph, build_job_search_graph
# --- PAGE CONFIGURATION ---
st.set_page_config(page_title="Thrive Career Wellness", page_icon="🌿", layout="wide")
# Initialize session state counters for dynamic UI elements
if 'exp_count' not in st.session_state: st.session_state.exp_count = 1
if 'proj_count' not in st.session_state: st.session_state.proj_count = 1
# --- DATA LOADING ---
@st.cache_data
def load_jobs():
"""
Loads the job market database from local storage.
Returns:
list: A list of job dictionaries. Returns a default placeholder if
the database file is missing.
"""
try:
with open("jobs_database.json", "r") as f:
return json.load(f)
except FileNotFoundError:
return [{"id": 0, "title": "Default", "sector": "Tech", "description": "N/A"}]
@st.cache_data
def load_candidates():
"""
Loads pre-defined candidate personas for demonstration purposes.
Returns:
list: A list of candidate profile dictionaries.
"""
try:
with open("candidates_database.json", "r") as f:
return json.load(f)
except FileNotFoundError:
return []
jobs_db = load_jobs()
candidates_db = load_candidates()
# --- VECTOR DATABASE ---
@st.cache_resource
class JobBoard:
"""
Manages vector search operations for job market discovery.
Attributes:
client: The ChromaDB persistent client.
collection: The specific document collection for job embeddings.
"""
def __init__(self):
self.client = chromadb.PersistentClient(path="./chroma_db")
self.collection = self.client.get_or_create_collection(
name="job_market",
embedding_function=embedding_functions.DefaultEmbeddingFunction()
)
if self.collection.count() == 0:
self._index_jobs()
def _index_jobs(self):
"""
Populates the vector database with job descriptions from the local store.
"""
ids = [str(j['id']) for j in jobs_db]
docs = [f"{j['title']} - {j['description']}" for j in jobs_db]
metadatas = [{"title": j['title'], "sector": j.get('sector', 'General'), "id": j['id']} for j in jobs_db]
self.collection.add(documents=docs, metadatas=metadatas, ids=ids)
def recommend_jobs(self, resume_text, top_k=50):
"""
Performs semantic similarity search against the job database.
Args:
resume_text (str): The candidate's resume content or summary.
top_k (int): Number of results to retrieve.
Returns:
list: A list of job dictionaries matching the semantic query.
"""
results = self.collection.query(query_texts=[resume_text], n_results=top_k)
jobs = []
if results['ids']:
for i in range(len(results['ids'][0])):
meta = results['metadatas'][0][i]
jobs.append({
"id": meta['id'],
"title": meta['title'],
"sector": meta['sector'],
"description": results['documents'][0][i]
})
return jobs
# --- LOCAL LATEX COMPILER ---
def compile_latex_local(latex_code):
"""
Compiles LaTeX source into PDF using the local Tectonic engine.
Args:
latex_code (str): The raw LaTeX source string.
Returns:
tuple: (bytes, None) on success, or (None, str) containing stderr on failure.
"""
with open("resume.tex", "w", encoding="utf-8") as f:
f.write(latex_code)
try:
res = subprocess.run(
[os.path.join(os.getcwd(), "tectonic.exe"), "resume.tex"],
capture_output=True,
text=True
)
if res.returncode == 0:
with open("resume.pdf", "rb") as f:
return f.read(), None
else:
return None, res.stderr
except Exception as e:
return None, str(e)
# --- SIDEBAR: PROFILE MANAGEMENT ---
with st.sidebar:
st.header("1. Member Profile")
persona_names = ["New Blank Profile"] + [f"{p.get('level', 'N/A')} - {p['full_name']}" for p in candidates_db]
selected_persona = st.selectbox("Select Candidate:", persona_names)
if selected_persona == "New Blank Profile":
p_data = {"full_name": "", "contact_email": "", "phone": "", "linkedin": "",
"experience_history": [], "projects": [],
"education": [{"school": "", "degree": "", "year": ""}]}
st.session_state.exp_count = 1
st.session_state.proj_count = 1
else:
p_data = candidates_db[persona_names.index(selected_persona) - 1]
st.session_state.exp_count = max(1, len(p_data.get('experience_history', [])))
st.session_state.proj_count = max(1, len(p_data.get('projects', [])))
full_name = st.text_input("Full Name", p_data['full_name'], key=f"fn_{selected_persona}")
email = st.text_input("Email", p_data['contact_email'], key=f"em_{selected_persona}")
phone = st.text_input("Phone", p_data['phone'], key=f"ph_{selected_persona}")
linkedin = st.text_input("LinkedIn", p_data.get('linkedin', ""), key=f"li_{selected_persona}")
with st.expander("πΌ Work History", expanded=True):
for i in range(st.session_state.exp_count):
st.markdown(f"**Job #{i+1}**")
d_exp = p_data['experience_history'][i] if i < len(p_data['experience_history']) else {"company": "", "role": "", "achievements": [""]}
st.text_input("Company", key=f"comp_{selected_persona}_{i}", value=d_exp['company'])
st.text_input("Role", key=f"role_{selected_persona}_{i}", value=d_exp['role'])
st.text_area("Impact Story", key=f"ach_{selected_persona}_{i}", value="\n".join(d_exp['achievements']))
if st.button("β Add Job"):
st.session_state.exp_count += 1
st.rerun()
with st.expander("π Projects", expanded=False):
for j in range(st.session_state.proj_count):
st.markdown(f"**Project #{j+1}**")
d_pj = p_data['projects'][j] if j < len(p_data['projects']) else {"title": "", "description_bullets": [""]}
st.text_input("Title", key=f"ptit_{selected_persona}_{j}", value=d_pj['title'])
st.text_area("Story", key=f"pdesc_{selected_persona}_{j}", value="\n".join(d_pj['description_bullets']))
if st.button("β Add Project"):
st.session_state.proj_count += 1
st.rerun()
with st.expander("π Education", expanded=False):
d_edu = p_data['education'][0] if p_data.get('education') else {"school": "", "degree": "", "year": ""}
school = st.text_input("University", d_edu['school'], key=f"sch_{selected_persona}")
degree = st.text_input("Degree", d_edu['degree'], key=f"deg_{selected_persona}")
grad_year = st.text_input("Year", d_edu['year'], key=f"yr_{selected_persona}")
st.header("2. Target Role")
sects = sorted(list(set(j['sector'] for j in jobs_db)))
sel_sect = st.selectbox("Industry Sector", sects)
filt_jobs = [j for j in jobs_db if j['sector'] == sel_sect]
sel_job_t = st.selectbox("Role Template", [f"{j['title']} (ID: {j['id']})" for j in filt_jobs])
sel_job_obj = next(j for j in filt_jobs if f"{j['title']} (ID: {j['id']})" == sel_job_t)
target_role = st.text_input("Target Role Name", sel_job_obj['title'])
job_desc = st.text_area("Job Description", sel_job_obj['description'], height=150)
    submit = st.button("🚀 Generate Strategic Resume", type="primary")
# --- EXECUTION LAYER ---
if submit:
# Collect UI Data
exps = []
for i in range(st.session_state.exp_count):
k_comp, k_role, k_ach = f"comp_{selected_persona}_{i}", f"role_{selected_persona}_{i}", f"ach_{selected_persona}_{i}"
c = st.session_state[k_comp] if k_comp in st.session_state else p_data['experience_history'][i]['company']
r = st.session_state[k_role] if k_role in st.session_state else p_data['experience_history'][i]['role']
a = st.session_state[k_ach] if k_ach in st.session_state else "\n".join(p_data['experience_history'][i]['achievements'])
exps.append(Experience(company=c, location="N/A", role=r, duration="N/A", achievements=[a]))
projs = []
for j in range(st.session_state.proj_count):
k_tit, k_desc = f"ptit_{selected_persona}_{j}", f"pdesc_{selected_persona}_{j}"
t = st.session_state[k_tit] if k_tit in st.session_state else p_data['projects'][j]['title']
d = st.session_state[k_desc] if k_desc in st.session_state else "\n".join(p_data['projects'][j]['description_bullets'])
projs.append(Project(title=t, tech_stack="N/A", date="N/A", description_bullets=[d]))
k_sch, k_deg, k_yr = f"sch_{selected_persona}", f"deg_{selected_persona}", f"yr_{selected_persona}"
edu_school = st.session_state[k_sch] if k_sch in st.session_state else p_data['education'][0]['school']
edu_degree = st.session_state[k_deg] if k_deg in st.session_state else p_data['education'][0]['degree']
edu_year = st.session_state[k_yr] if k_yr in st.session_state else p_data['education'][0]['year']
profile = CandidateProfile(
full_name=full_name, contact_email=email, phone=phone, linkedin=linkedin,
summary=f"Strategic {target_role} transition.", skills=[],
experience_history=exps, projects=projs,
education=[Education(school=edu_school, location="N/A", degree=edu_degree, year=edu_year)],
target_role=target_role, job_description=job_desc
)
st.session_state.profile = profile
# Orchestrate Multi-Agent Workflow
    with st.status("🤖 AI Agents Initializing...", expanded=True) as status:
st.write("π§ Reasoning through career pivot strategy...")
app = build_graph()
st.session_state.result = app.invoke({"profile": profile, "target_role": target_role})
st.write("π― Mapping broad market opportunities...")
board = JobBoard()
raw = board.recommend_jobs(target_role, top_k=50)
search_agent = build_job_search_graph()
matches = search_agent.invoke({"profile": profile, "raw_jobs": raw, "ranked_jobs": []})
st.session_state.ranked_results = matches['ranked_jobs']
        status.update(label="✅ Ready!", state="complete")
st.rerun()
# --- PRESENTATION LAYER ---
if 'result' in st.session_state and st.session_state.result:
res = st.session_state.result
prof = st.session_state.profile
st.divider()
st.header("π Strategic Career Assets")
with st.expander("π§ View AI Reasoning & Pivot Strategy", expanded=True):
st.write(res['strategy'].reasoning_summary)
# PII RESTORATION: Double-pass loop to handle AI-escaped placeholders
raw_latex = res['final_draft']
pii_map = {
"NAME": prof.full_name,
"EMAIL": prof.contact_email,
"NUMBER": prof.phone,
"URL": prof.linkedin
}
clean_latex = raw_latex
for key, value in pii_map.items():
# Pass 1: Standard Placeholders
clean_latex = clean_latex.replace(f"[CANDIDATE_{key}]", value)\
.replace(f"[CONTACT_{key}]", value)\
.replace(f"[PHONE_{key}]", value)\
.replace(f"[LINKEDIN_{key}]", value)
# Pass 2: Escaped Placeholders (Latex safe)
clean_latex = clean_latex.replace(f"[CANDIDATE\\_{key}]", value)\
.replace(f"[CONTACT\\_{key}]", value)\
.replace(f"[PHONE\\_{key}]", value)\
.replace(f"[LINKEDIN\\_{key}]", value)
c1, c2 = st.columns(2)
with c1:
st.subheader("LaTeX Source Editor")
edited = st.text_area("Source Code", value=clean_latex, height=600, key="latex_editor")
with c2:
st.subheader("Resume Preview")
if st.button("π Compile PDF Document"):
with st.spinner("Generating PDF..."):
pdf_bytes, error_log = compile_latex_local(edited)
if pdf_bytes:
b64 = base64.b64encode(pdf_bytes).decode()
st.markdown(f'<iframe src="data:application/pdf;base64,{b64}" width="100%" height="600"></iframe>', unsafe_allow_html=True)
else:
st.error("LaTeX Compilation Failed")
with st.expander("π View Compiler Error Log"):
st.code(error_log)
if 'ranked_results' in st.session_state:
st.divider()
st.header("π― Thrive Job Match (Top 10)")
m1, m2 = st.columns(2)
for i, job in enumerate(st.session_state.ranked_results):
job_id = job['id']
with (m1 if i % 2 == 0 else m2):
with st.container(border=True):
st.subheader(job['title'])
st.caption(f"Sector: {job['sector']} | ID: {job_id}")
with st.expander("π Full Responsibilities & Details"):
st.write(job['description'])
ce, cp = st.columns([1, 1.2])
with ce:
cl_base = st.session_state.get(f"packet_{job_id}", f"Hi, I'm {prof.full_name}, I'm writing to express interest in the {job['title']} role...")
cl_clean = cl_base.replace("[CANDIDATE_NAME]", prof.full_name)\
.replace("[CONTACT_EMAIL]", prof.contact_email)
subj = urllib.parse.quote(f"Interest in {job['title']}")
mailto = f"mailto:hiring@thrive.com?subject={subj}&body={urllib.parse.quote(cl_clean)}"
                    st.link_button("📧 Email Manager", mailto, use_container_width=True)
with cp:
if st.button(f"π€ Prep Cover Letter", key=f"prep_{job_id}", use_container_width=True):
with st.spinner("Tailoring cover letter..."):
                            # 🛡️ PII GUARD: Create anonymized profile for the agent
safe_prof = prof.model_copy()
safe_prof.full_name = "[CANDIDATE_NAME]"
safe_prof.contact_email = "[CONTACT_EMAIL]"
safe_prof.phone = "[PHONE_NUMBER]"
safe_prof.linkedin = "[LINKEDIN_URL]"
app_res = build_application_graph().invoke({
"job_title": job['title'],
"job_description": job['description'],
"candidate_profile": safe_prof.model_dump_json(), # Safe JSON
"cover_letter": "", "status": "pending"
})
st.session_state[f"packet_{job_id}"] = app_res['cover_letter']
st.rerun()
if f"packet_{job_id}" in st.session_state:
st.info("π Tailored Application Ready")
final_cl = st.session_state[f"packet_{job_id}"]
for key, value in pii_map.items():
final_cl = final_cl.replace(f"[CANDIDATE_{key}]", value)\
.replace(f"[CONTACT_{key}]", value)
st.text_area("Cover Letter", value=final_cl, height=200, key=f"text_{job_id}")
06 // Unit Economics: The "Staff Engineer" Pivot
The difference between a hobby project and a sustainable business is cost structure. Scaling to 100k users on a raw API wrapper (Scenario A) creates a dangerous variable-cost liability. Distillation (Scenario B) converts it into a predictable fixed cost.
The Variable Cost Trap
Passing every prompt to DeepSeek/OpenAI.
- Cost per Resume: $0.02
- 100k Users/Mo: $2,000/mo
- Risk: Uncapped
The Fixed Cost Win
Self-hosting Llama-3 (Student) on vLLM.
- Training Cost: $60 (One-time)
- Hosting (T4 GPU): $250/mo (Flat)
- 100k Users/Mo: $0.0025 each
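The crossover is easy to sanity-check: at $0.02 per resume, the flat $250/mo GPU pays for itself once volume passes 12,500 resumes per month. A quick check of the arithmetic, using the figures above:
# Back-of-envelope comparison of the two scenarios.
api_cost_per_resume = 0.02     # Scenario A: variable API spend per resume
gpu_fixed_monthly = 250.0      # Scenario B: flat T4 hosting
users_per_month = 100_000
scenario_a_total = users_per_month * api_cost_per_resume   # $2,000/mo, uncapped
scenario_b_per_user = gpu_fixed_monthly / users_per_month  # $0.0025, falls with scale
breakeven = gpu_fixed_monthly / api_cost_per_resume        # 12,500 resumes/mo
print(f"A: ${scenario_a_total:,.0f}/mo | B: ${scenario_b_per_user:.4f}/user | "
      f"break-even at {breakeven:,.0f} resumes/mo")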
Tech Stack
- > Orchestration: LangGraph (Multi-Agent State Machine)
- > Validation: Pydantic (Strict Schemas)
- > Simulation: Asyncio + Faker (Synthetic Data)
- > Vector Search: ChromaDB (Prototype) -> Pinecone (Prod)
- > Inference: vLLM / Ray Serve (Planned)