LLM Integration¶
Kairos uses LiteLLM as a unified LLM client, supporting Claude, GPT, Gemini, and 100+ other models through a single interface. Three core functions: skill extraction, JD analysis, and resume tailoring.
Architecture¶
graph TB
subgraph "kairos.core (interfaces)"
T0[SkillExtractorService]
T1[JdAnalysisService]
T2[ResumeTailorService]
end
subgraph "kairos.llm (implementations)"
SE[SkillExtractor]
A[Analyzer]
T[Tailor]
P0[Skill Extraction Prompts]
P1[JD Analysis Prompts]
P2[Resume Tailoring Prompts]
SDK[LiteLLM acompletion]
end
subgraph "LLM Providers"
Claude[Claude API]
GPT[OpenAI API]
Gemini[Gemini API]
Other[...]
end
SE --> T0
A --> T1
T --> T2
SE --> P0
A --> P1
T --> P2
SE --> SDK
A --> SDK
T --> SDK
SDK --> Claude
SDK --> GPT
SDK --> Gemini
SDK --> Other
Multi-Provider Support¶
LiteLLM provides a unified interface — switch models by changing a string:
from litellm import acompletion
# Claude
response = await acompletion(model="claude-sonnet-4-20250514", messages=[...])
# GPT
response = await acompletion(model="gpt-4o", messages=[...])
# Gemini
response = await acompletion(model="gemini/gemini-2.0-flash", messages=[...])
Fallback Chains¶
Configure fallback models in case primary provider is unavailable:
Provider API Keys¶
Each provider needs its own API key, configured via environment variables:
Only the key for the configured model (and fallbacks) is required.
Skill Extraction¶
Runs once during resume import to build the user's Skill Profile.
Prompt Strategy¶
Temperature: 0.0 (deterministic extraction)
The prompt instructs the LLM to: 1. Analyze each work experience and project section from the resume 2. Extract every technical skill mentioned (languages, frameworks, databases, tools, concepts) 3. Estimate proficiency (1-10) based on context clues: duration, depth of usage, role seniority 4. Estimate years of experience per skill 5. Cite evidence — which project/role demonstrates each skill
Output is parsed as a list[SkillEntry] via tool use with Pydantic schema. Users can review and adjust proficiency ratings after generation.
JD Analysis¶
Prompt Strategy¶
Temperature: 0.0 (deterministic extraction)
The prompt instructs the LLM to: 1. Extract structured data from the JD text 2. Identify required vs preferred skills with estimated proficiency level needed 3. Extract ATS keywords 4. Compare against the user's Skill Profile (proficiency-aware, not just binary) 5. Compute a match score (0.0 - 1.0) with per-skill breakdown and reasoning
Output is parsed as a JdAnalysis Pydantic model via structured output.
Resume Tailoring¶
Anti-Hallucination Rules (enforced in system prompt)¶
- Only rephrase, reorder, or emphasize existing content
- Never invent skills, experiences, achievements, or metrics
- Never change company names, dates, job titles, education details
- Only modify sections marked as tailorable
- Preserve the original format (LaTeX or Markdown)
Temperature: 0.3 (controlled creativity)
Validation¶
After LLM response, a validation step checks: - No new company names or job titles appeared - No new degree or institution names - Dates haven't changed - Section types match what was sent
If validation fails, retry with stricter prompt or flag to user.
Output Flow¶
flowchart TD
Input["Base Resume Sections\n+ JD Analysis"] --> LLM[LLM via LiteLLM]
LLM --> Raw[Raw LLM Response]
Raw --> Parse[Parse modified sections]
Parse --> Validate[Validate: no fabricated content]
Validate --> |Pass| Diff[Generate diff]
Validate --> |Fail| Retry[Retry with stricter prompt]
Diff --> User[Show to user for approval]
Structured Output Pattern¶
All LLM calls use tool use for structured output (works across providers):
from litellm import acompletion
# LiteLLM 统一接口,tool use 跨 provider 通用
response = await acompletion(
model=settings.llm_model,
messages=[...],
tools=[{
"type": "function",
"function": {
"name": "output",
"description": "Structured analysis output",
"parameters": JdAnalysis.model_json_schema(),
},
}],
tool_choice={"type": "function", "function": {"name": "output"}},
)
Cost¶
| Operation | Tokens (approx) | Cost at Sonnet | Frequency |
|---|---|---|---|
| Skill Extraction | 1,000-2,000 | ~$0.01-0.02 | Once per resume import |
| JD Analysis | 500-1,500 | ~$0.003-0.01 | Per job |
| Resume Tailoring | 1,000-3,000 | ~$0.01-0.03 | Per job |
LiteLLM provides built-in cost tracking via response._hidden_params["response_cost"].
Model Configuration¶
Default: claude-sonnet-4-20250514
[llm]
model = "claude-sonnet-4-20250514"
fallback_models = ["gpt-4o"]
temperature_extraction = 0.0
temperature_tailoring = 0.3
Configurable via environment variable KAIROS_LLM_MODEL or config file. Any LiteLLM-supported model string works.