Skip to content

LLM Integration

Kairos uses LiteLLM as a unified LLM client, supporting Claude, GPT, Gemini, and 100+ other models through a single interface. Three core functions: skill extraction, JD analysis, and resume tailoring.

Architecture

graph TB
    subgraph "kairos.core (interfaces)"
        T0[SkillExtractorService]
        T1[JdAnalysisService]
        T2[ResumeTailorService]
    end

    subgraph "kairos.llm (implementations)"
        SE[SkillExtractor]
        A[Analyzer]
        T[Tailor]
        P0[Skill Extraction Prompts]
        P1[JD Analysis Prompts]
        P2[Resume Tailoring Prompts]
        SDK[LiteLLM acompletion]
    end

    subgraph "LLM Providers"
        Claude[Claude API]
        GPT[OpenAI API]
        Gemini[Gemini API]
        Other[...]
    end

    SE --> T0
    A --> T1
    T --> T2
    SE --> P0
    A --> P1
    T --> P2
    SE --> SDK
    A --> SDK
    T --> SDK
    SDK --> Claude
    SDK --> GPT
    SDK --> Gemini
    SDK --> Other

Multi-Provider Support

LiteLLM provides a unified interface — switch models by changing a string:

from litellm import acompletion

# Claude
response = await acompletion(model="claude-sonnet-4-20250514", messages=[...])

# GPT
response = await acompletion(model="gpt-4o", messages=[...])

# Gemini
response = await acompletion(model="gemini/gemini-2.0-flash", messages=[...])

Fallback Chains

Configure fallback models in case primary provider is unavailable:

[llm]
model = "claude-sonnet-4-20250514"
fallback_models = ["gpt-4o", "gemini/gemini-2.0-flash"]

Provider API Keys

Each provider needs its own API key, configured via environment variables:

ANTHROPIC_API_KEY=sk-ant-...      # Claude
OPENAI_API_KEY=sk-...             # GPT
GEMINI_API_KEY=...                # Gemini

Only the key for the configured model (and fallbacks) is required.

Skill Extraction

Runs once during resume import to build the user's Skill Profile.

Prompt Strategy

Temperature: 0.0 (deterministic extraction)

The prompt instructs the LLM to: 1. Analyze each work experience and project section from the resume 2. Extract every technical skill mentioned (languages, frameworks, databases, tools, concepts) 3. Estimate proficiency (1-10) based on context clues: duration, depth of usage, role seniority 4. Estimate years of experience per skill 5. Cite evidence — which project/role demonstrates each skill

Output is parsed as a list[SkillEntry] via tool use with Pydantic schema. Users can review and adjust proficiency ratings after generation.

JD Analysis

Prompt Strategy

Temperature: 0.0 (deterministic extraction)

The prompt instructs the LLM to: 1. Extract structured data from the JD text 2. Identify required vs preferred skills with estimated proficiency level needed 3. Extract ATS keywords 4. Compare against the user's Skill Profile (proficiency-aware, not just binary) 5. Compute a match score (0.0 - 1.0) with per-skill breakdown and reasoning

Output is parsed as a JdAnalysis Pydantic model via structured output.

Resume Tailoring

Anti-Hallucination Rules (enforced in system prompt)

  1. Only rephrase, reorder, or emphasize existing content
  2. Never invent skills, experiences, achievements, or metrics
  3. Never change company names, dates, job titles, education details
  4. Only modify sections marked as tailorable
  5. Preserve the original format (LaTeX or Markdown)

Temperature: 0.3 (controlled creativity)

Validation

After LLM response, a validation step checks: - No new company names or job titles appeared - No new degree or institution names - Dates haven't changed - Section types match what was sent

If validation fails, retry with stricter prompt or flag to user.

Output Flow

flowchart TD
    Input["Base Resume Sections\n+ JD Analysis"] --> LLM[LLM via LiteLLM]
    LLM --> Raw[Raw LLM Response]
    Raw --> Parse[Parse modified sections]
    Parse --> Validate[Validate: no fabricated content]
    Validate --> |Pass| Diff[Generate diff]
    Validate --> |Fail| Retry[Retry with stricter prompt]
    Diff --> User[Show to user for approval]

Structured Output Pattern

All LLM calls use tool use for structured output (works across providers):

from litellm import acompletion

# LiteLLM 统一接口,tool use 跨 provider 通用
response = await acompletion(
    model=settings.llm_model,
    messages=[...],
    tools=[{
        "type": "function",
        "function": {
            "name": "output",
            "description": "Structured analysis output",
            "parameters": JdAnalysis.model_json_schema(),
        },
    }],
    tool_choice={"type": "function", "function": {"name": "output"}},
)

Cost

Operation Tokens (approx) Cost at Sonnet Frequency
Skill Extraction 1,000-2,000 ~$0.01-0.02 Once per resume import
JD Analysis 500-1,500 ~$0.003-0.01 Per job
Resume Tailoring 1,000-3,000 ~$0.01-0.03 Per job

LiteLLM provides built-in cost tracking via response._hidden_params["response_cost"].

Model Configuration

Default: claude-sonnet-4-20250514

[llm]
model = "claude-sonnet-4-20250514"
fallback_models = ["gpt-4o"]
temperature_extraction = 0.0
temperature_tailoring = 0.3

Configurable via environment variable KAIROS_LLM_MODEL or config file. Any LiteLLM-supported model string works.