AlphaLens — Project Documentation

AlphaLens

AI-Powered Multi-Agent Equity Research Platform

"One engineer, six AI agents, and 90 seconds — replacing what an entire equity research team takes eight hours to produce."

Student	Rishi Sehgal
Email	sehgal.r@northeastern.edu
Date	April 2026
Core Components	Prompt Engineering & Retrieval-Augmented Generation (RAG)
Application Type	Research Synthesis Tool
Repository	github.com/rishisehgal/alphalens
Web Page	rishisehgal.github.io/AlphaLens
Live App	rishisehgal-alphalens.streamlit.app
Video Demo	loom.com/share/36213fda55af4fd2829f19d726ef1957

LangGraph Google Gemini 2.5 Flash ChromaDB SEC EDGAR Streamlit

1. Abstract3
2. Problem Statement & Motivation3
3. System Architecture4
4. The Six Agents5
5. Core Component: Prompt Engineering7
6. Core Component: Retrieval-Augmented Generation8
7. Data Sources & Integration10
8. Frontend & User Experience10
9. Evaluation Framework & Performance Metrics11
10. Challenges & Solutions12
11. Ethical Considerations13
12. Future Improvements14
13. Conclusion14
14. References15

1. Abstract

AlphaLens is a multi-agent AI system that automates professional equity research for retail investors. Given a stock ticker symbol, the system executes a pipeline of six specialized AI agents — each replicating a distinct analyst role — and produces a structured investment research report with citations, risk flags, valuation estimates, and a cross-verification layer, all within 90 seconds and at zero monetary cost using free-tier APIs.

The system implements two of the assignment's core components: Prompt Engineering, through carefully structured, context-aware prompts that drive each agent's reasoning and produce reliable structured JSON output; and Retrieval-Augmented Generation (RAG), through a full pipeline that ingests SEC 10-K and 10-Q filings, chunks and embeds them with Google's Gemini Embedding model, stores vectors in ChromaDB, and retrieves relevant excerpts to ground all narrative claims in primary source material.

The system is built on LangGraph for orchestration, Google Gemini 2.5 Flash for all language model calls, ChromaDB for vector storage, and Streamlit for the frontend. Data is sourced from SEC EDGAR, Alpha Vantage, yfinance, and the Federal Reserve's FRED API — all publicly available with no licensing cost.

2. Problem Statement & Motivation

Equity research is one of the most information-dense tasks in finance. A professional analyst preparing a research report on a single company will spend 8 or more hours reading annual filings (often 200+ pages), cross-checking financial statements, building valuation models, assessing macroeconomic context, and writing a structured report with citations. This workflow is effectively inaccessible to retail investors.

Existing solutions fall into two categories: (1) static financial data dashboards that show numbers but provide no interpretation, and (2) general-purpose LLM chatbots that can discuss finance but hallucinate figures and lack grounding in primary source documents. Neither solves the core problem.

AlphaLens addresses this gap by combining structured data APIs with a RAG pipeline over actual SEC filings, coordinated through a multi-agent state machine that mirrors the professional research workflow. The result is a system that not only produces a human-readable report but also explicitly measures confidence and flags divergences between management narrative and financial data.

Assignment Alignment

AlphaLens implements Prompt Engineering (systematic prompting strategies, context management, structured output, graceful error handling) and Retrieval-Augmented Generation (knowledge base construction, vector storage and retrieval, document chunking, ranking and filtering). It falls into the Research Synthesis Tool application type.

3. System Architecture

3.1 High-Level Architecture

The system is organized into four layers: a data layer (API clients for external data sources), an agent layer (six specialized LangGraph nodes), an orchestration layer (LangGraph StateGraph managing parallel and sequential execution), and a presentation layer (Streamlit frontend with custom HTML/CSS components and Plotly charts).

┌─────────────────────────────────────────────────────────────────────┐ │ ALPHALENS SYSTEM ARCHITECTURE │ └─────────────────────────────────────────────────────────────────────┘ ┌──── PRESENTATION LAYER ─────────────────────────────────────────────┐ │ Streamlit App (app.py) · Custom HTML/CSS · Plotly Charts │ │ Agent Progress Animation · Report Viewer · Follow-up Chat │ └───────────────────────────────┬─────────────────────────────────────┘ │ ticker input ┌──── ORCHESTRATION LAYER ───────▼─────────────────────────────────────┐ │ LangGraph StateGraph · AlphaLensState (TypedDict) │ │ │ │ START ──► data_fusion ──┬──► rag_citation ──► risk_scanner ──┐ │ │ └──► quant_analysis ───────────────────┤ │ │ ▼ │ │ END ◄── report_synthesis ◄── verify │ └──────────────────────────────────────────────────────────────────────┘ ┌──── AGENT LAYER ──────────────────────────────────────────────────────┐ │ data_fusion rag_citation quant_analysis risk_scanner │ │ verify report_synthesis │ │ All agents: Google Gemini 2.5 Flash · Structured JSON prompting │ └──────────────────────────────────────────────────────────────────────┘ ┌──── DATA LAYER ────────────────────────────────────────────────────────┐ │ SEC EDGAR (10-K / 10-Q filings, no key) │ │ Alpha Vantage (income stmt, balance sheet, cash flows) │ │ yfinance (live price, technicals, fallback fundamentals) │ │ FRED (macro: fed funds rate, GDP, CPI, unemployment) │ │ ChromaDB (local vector store, cosine similarity, 3072-dim) │ └────────────────────────────────────────────────────────────────────────┘

3.2 LangGraph State Machine

All agents share a single AlphaLensState TypedDict that is passed through the graph. The TypedDict uses Annotated fields with custom reducer functions (_merge_metadata and operator.add) on fields written by parallel branches, preventing race conditions when both rag_citation and quant_analysis attempt to update metadata and error_log simultaneously.

class AlphaLensState(TypedDict):
    ticker:         str
    financial_data: dict           # Agent 1 output — structured financials
    rag_chunks:     list[dict]     # Agent 2 output — filing excerpts + metadata
    quant_results:  dict           # Agent 3 output — DCF, RSI, MACD, earnings
    risk_flags:     list[dict]     # Agent 4 output — severity-classified flags
    verification:   dict           # Agent 5 output — divergences, confidence
    report:         dict           # Agent 6 output — full structured report
    metadata:       Annotated[dict, _merge_metadata]   # parallel-safe
    error_log:      Annotated[list, operator.add]       # parallel-safe
    chat_history:   list[dict]

3.3 Execution Flow

The pipeline has three execution phases:

Sequential start: data_fusion runs first, acquiring all structured financial data and SEC filing URLs. No other agent can start until this completes, as all downstream agents depend on its output.
Parallel middle: rag_citation and quant_analysis execute in parallel. RAG Citation downloads and processes the SEC filing while Quant Analysis runs the DCF model and technical indicators. LangGraph's add_edge([A, B], C) syntax creates a fan-in at risk_scanner that waits for both branches.
Sequential tail: risk_scanner → verify → report_synthesis run sequentially, each building on all prior outputs.

4. The Six Agents

Each agent was designed to mirror a specific role in a professional equity research team.

Agent 1

Data Fusion Agent

Replaces: Junior Analyst

Makes parallel API calls to SEC EDGAR, Alpha Vantage, yfinance, and FRED. Normalizes all data into a unified financial_data dict. Implements graceful fallback: if Alpha Vantage is rate-limited, yfinance provides backup fundamentals with a reduced confidence flag. Never crashes — always returns a sources_status dict documenting what succeeded and what failed.

Agent 2

RAG Citation Agent

Replaces: Research Associate

Downloads the 10-K/10-Q HTML from EDGAR, parses it into sections (MD&A, Risk Factors, Financial Statements, Notes), chunks each section into 500-token overlapping windows, embeds them via Gemini Embedding, stores vectors in ChromaDB, and retrieves the top-k most relevant chunks per query dimension. Returns chunks with full citation metadata: section name, estimated page number, and source file.

Agent 3

Quant Analysis Agent

Replaces: Quantitative Analyst

Runs three quantitative models: (1) a 3-stage DCF with bear/base/bull scenarios and WACC estimation, always reported at MEDIUM confidence to communicate model uncertainty; (2) technical indicators — RSI(14), MACD(12/26/9) — computed on 6 months of daily price data from yfinance; (3) earnings surprise analysis comparing actual EPS to consensus estimate.

Agent 4

Risk Scanner Agent

Replaces: Risk / Compliance Officer

Implements a two-stage risk detection pipeline: first, a fast keyword/regex pattern match identifies candidate risk phrases (going concern, material weakness, auditor change, revenue recognition change, related-party, covenant violations, customer concentration). Second, Gemini classifies each candidate by severity (HIGH/MEDIUM/LOW) and generates a description with a citation link. The two-stage approach avoids expensive LLM calls on every chunk.

Agent 5 — Crown Jewel

Verification Agent

Replaces: Senior Analyst

Cross-references all outputs from Agents 1–4. Extracts quantitative claims from financial data, extracts narrative claims from management's filing text, and prompts Gemini to identify divergences between them — cases where management's forward-looking statements are inconsistent with the actual historical numbers. Outputs a divergences list and per-section confidence_scores (HIGH/MEDIUM/LOW) that drive the overall report confidence rating.

Agent 6

Report Synthesis Agent

Replaces: Publishing Editor

Generates five report sections iteratively — Executive Summary, Financial Health, Risk Flags, Valuation, Verification Verdict — using one focused Gemini call per section rather than a single large call. This produces more coherent output and keeps each prompt within safe token limits. Each section includes inline citations and a confidence badge. The overall report confidence is computed as the minimum confidence across all sections, adjusted by the verification divergence count.

5. Core Component: Prompt Engineering

5.1 Prompting Strategies

AlphaLens uses four distinct prompting strategies across its agents, each chosen for the specific requirements of that agent's task:

Strategy	Agent(s)	Purpose
Structured JSON Output	Verification, Risk Scanner	Prompts specify the exact JSON schema the model must return, including field names and allowed enum values (HIGH/MEDIUM/LOW). This enables programmatic parsing of LLM output without additional extraction logic.
Iterative Section Prompting	Report Synthesis	Each of the five report sections receives its own focused prompt. Separating concerns produces more coherent output and prevents the model from conflating content across sections, as would occur with a single large prompt.
Grounded Evidence Prompting	Verification, RAG Citation	The prompt explicitly provides both the quantitative data and the relevant filing excerpts, then instructs the model to cross-reference them. The model is prohibited from making claims not supported by the provided context, reducing hallucination.
Two-Stage Classification	Risk Scanner	Pattern matching identifies candidate risks first (fast, deterministic). Only confirmed candidates are passed to the LLM for severity classification. This reduces API calls by 60–80% compared to passing every chunk to the LLM.

5.2 Context Management

Each agent's prompt is assembled programmatically from modular sub-sections rather than using static template strings. The Verification Agent's _build_prompt() function, for example, composes four independently rendered blocks — a financial data summary, numbered filing excerpts, quant results, and risk flag counts — and measures the total character length before sending. If the assembled prompt exceeds a safe threshold, filing excerpts are truncated to stay within the model's output token budget.

The follow-up chat feature uses a conversation history injection pattern: the full pipeline state (financial metrics, report sections, risk flags, RAG chunks, and verification divergences) is injected as the first user turn, followed by a model acknowledgment turn, and then the actual conversation history. This gives the model the full research context on every turn without re-running the pipeline.

5.3 Error Recovery & Edge Case Handling

LLM outputs are inherently non-deterministic. AlphaLens implements a multi-layer recovery strategy for every structured output call:

Code fence stripping: Gemini frequently wraps JSON in ```json markdown blocks. The parser strips these before attempting to parse.
Trailing content trimming: The parser finds the last valid closing bracket in the response and truncates anything after it, recovering from partial truncation.
Schema validation: After parsing, required fields are validated and filled with safe defaults if missing.
Safe fallback objects: If all recovery attempts fail, the agent returns a pre-defined safe default (e.g., empty divergences list, MEDIUM confidence) rather than crashing the pipeline.
Retry with backoff: 429 rate-limit errors trigger an exponential backoff retry, up to 3 attempts.

6. Core Component: Retrieval-Augmented Generation

The RAG pipeline transforms raw SEC filing HTML into a searchable vector knowledge base, retrieves the most relevant excerpts for each analysis dimension, and injects those excerpts as grounding context into downstream LLM prompts. This is what distinguishes AlphaLens from a system that merely summarizes financial numbers — the narrative analysis is grounded in primary source documents.

6.1 Document Ingestion

The EdgarClient queries the SEC EDGAR full-text search API to locate 10-K and 10-Q filings for a given ticker. It retrieves the filing index, extracts the primary document URL, and downloads the HTML. The EDGAR API enforces a rate limit of 10 requests per second; the system uses a token bucket rate limiter (implemented in src/utils/rate_limiter.py) to stay compliant.

The FilingParser (in src/data/filing_parser.py) uses BeautifulSoup to parse the downloaded HTML and extract named sections. It identifies section boundaries using SEC standard item markers (Item 1A. Risk Factors, Item 7. Management's Discussion and Analysis, Item 8. Financial Statements, etc.) and stores each section with its estimated page number and character offset. This metadata is preserved through the entire pipeline for citation generation.

6.2 Chunking Strategy

Text is split using LangChain's RecursiveCharacterTextSplitter, configured with token-accurate counting via the tiktoken library using the cl100k_base tokenizer. This is important because the embedding model has a fixed input token limit — character-based splitting would produce chunks of inconsistent actual token length.

Parameter	Value	Rationale
Chunk Size	500 tokens	Balances specificity (small enough for precise retrieval) with context (large enough to contain complete sentences and their surrounding context)
Chunk Overlap	100 tokens	Prevents key information that spans a chunk boundary from being lost; 20% overlap is standard for financial text
Splitter Hierarchy	Paragraph → sentence → word	RecursiveCharacterTextSplitter tries to break at natural boundaries before resorting to mid-sentence breaks

Each chunk is stored with a metadata dict containing: ticker, filing_type (10-K or 10-Q), filing_date, section_name, chunk_index (within its section), estimated_page, and source_file. This metadata enables citation generation — when the report references a claim, it can link directly to the section and approximate page number in the original filing.

6.3 Embedding & Vector Storage

Chunks are embedded using Google's gemini-embedding-001 model, which produces 3072-dimensional dense vectors. Two distinct task types are used to optimize retrieval quality:

RETRIEVAL_DOCUMENT — used when embedding filing chunks for storage. Optimizes the representation for being retrieved.
RETRIEVAL_QUERY — used when embedding the query string at retrieval time. Optimizes for matching against stored document vectors.

Vectors are stored in ChromaDB using a PersistentClient backed by a local .chroma/ directory. Each collection is namespaced by ticker symbol and filing date, so re-running the same ticker reuses cached embeddings rather than re-computing them (a significant cost and latency saving). The similarity metric is cosine distance, which is standard for dense text embeddings.

Batching is applied during embedding: chunks are grouped into batches of 20 (the API's limit) with exponential backoff on rate-limit errors. A per-call rate limiter ensures the system stays within the free-tier limits.

6.4 Retrieval & Ranking

At retrieval time, the FilingRetriever executes multiple focused queries rather than a single broad query. The query set covers distinct analysis dimensions:

"revenue growth drivers and business performance"
"risk factors competition regulatory litigation"
"management discussion outlook guidance"
"financial statements cash flow debt"
"going concern material weakness audit"

Each query retrieves top-k = 5 chunks. Results are de-duplicated by chunk ID and re-ranked by cosine similarity score. The final retrieval set (up to 25 unique chunks) is passed to the Risk Scanner and Verification agents as grounding context.

Retrieved chunks are also stored in AlphaLensState["rag_chunks"] and injected into the follow-up chat context, enabling the chat widget to answer qualitative questions (e.g., "what drove revenue growth?") from the actual filing text rather than the LLM's training data.

7. Data Sources & Integration

Source	Data Provided	Rate Limit	Auth	Fallback
SEC EDGAR	10-K / 10-Q filing HTML, accession numbers, filing dates	10 req/sec	None required	Skip RAG; report as "Filing unavailable"
Alpha Vantage	Income statement, balance sheet, cash flow (annual + quarterly)	25 req/day	Free API key	yfinance fundamentals at reduced confidence
yfinance	Live price, 52-week high/low, P/E, beta, historical OHLCV	~2 req/sec	None required	Partial data with sources_status flag
FRED (Federal Reserve)	Fed Funds Rate, GDP, CPI, Unemployment Rate	120 req/min	Free API key	Omit macro section
Google Gemini 2.5 Flash	LLM inference, embeddings	15 RPM (free tier)	API key	Exponential backoff, up to 3 retries
ChromaDB	Local vector storage and retrieval	Local I/O	None	Re-embed on corruption

The system tracks the status of every data source in a sources_status dict within the pipeline metadata. This dict is displayed in the sidebar so users can immediately understand if any data source was degraded or unavailable. Confidence scores are adjusted accordingly.

8. Frontend & User Experience

The frontend is built in Streamlit but uses no default Streamlit styling. All visible UI elements are rendered as custom HTML via st.markdown(..., unsafe_allow_html=True). This was a deliberate design choice to make the application look like a professional financial product — the default Streamlit look is generic and unsuitable for an equity research platform.

The UI system is built around a dark color palette (#0A0A0F primary background, #3B82F6 accent blue) and uses Inter as the primary typeface. Four Plotly charts are rendered with custom dark-theme templates matching the application's color system:

RSI Gauge — Semicircular gauge with oversold (green), neutral (blue), and overbought (red) zones.
DCF Sensitivity Heatmap — Fair value under a grid of growth rate × discount rate assumptions, colored green-amber-red.
Price + MACD Chart — Six months of daily price data with MACD histogram subplot.
Earnings Surprise Bar — Horizontal bar showing actual vs. estimated EPS, colored by beat/miss direction.

Agent execution progress is displayed using a custom animated HTML component with CSS pulse animation for running agents. The sidebar shows pipeline metadata: total execution time, per-agent latencies, data source status, and overall confidence bars per report section. A follow-up chat widget at the bottom of the report allows the user to ask questions grounded in the pipeline output.

9. Evaluation Framework & Performance Metrics

A custom evaluation framework is implemented in src/eval/. The framework defines five metrics derived from the RAG and LLM evaluation literature, applied to a golden test set of three tickers: AAPL, NVDA, and MSFT.

≥0.60

Retrieval Precision @ 5

Fraction of top-5 chunks containing relevant keywords

≥0.80

Retrieval Recall @ 5

Fraction of required sections (MD&A, Risk Factors) covered

≥0.90

Faithfulness Score

Monetary claims in report grounded in retrieved chunks

±15%

Numerical Accuracy

Revenue, EPS, margins vs. ground truth within tolerance

1.00

Earnings Surprise Accuracy

Beat / miss direction always correct

70 / 70

Unit Test Coverage

All 70 pytest tests passing across 4 test files

Metric Definitions

Metric	Definition	Threshold
Retrieval Precision@k	Among the top-k retrieved chunks, the fraction that contain at least one keyword from the relevant keyword set for the query	≥ 0.60
Retrieval Recall@k	Among the required source sections (MD&A, Risk Factors, Financial Statements), the fraction that appear in the top-k results	≥ 0.80
Faithfulness	For each monetary claim (dollar amount) in the report text, whether that figure appears (within 5%) in at least one retrieved chunk. Measured as fraction of claims grounded.	≥ 0.90
Numerical Accuracy	For each key financial metric (revenue, net income, gross margin, EPS, D/E ratio), whether the pipeline's value matches the golden ground truth within ±15%	All within ±15%
Earnings Surprise Accuracy	Whether the beat/miss direction (actual EPS vs. consensus estimate) is correct. This is a binary check — correct direction or not.	1.00 (exact)

Note on Metric Measurement

The metrics above represent design targets defined in the evaluation framework. Full automated measurement requires a complete pipeline run per golden ticker. Due to API rate limits encountered during development, manual inspection of pipeline outputs confirmed retrieval quality and numerical accuracy are within target ranges for AAPL and NVDA. The evaluation runner (python -m src.eval.runner --tickers AAPL NVDA MSFT) is included and functional for automated measurement with sufficient API quota.

10. Challenges & Solutions

Challenge	Root Cause	Solution Implemented
LangGraph node name conflicts with state keys	Newer LangGraph versions raise a ValueError if a node name matches a TypedDict key. The node named `"verification"` conflicted with the state field `verification`.	Renamed the node to `"verify"`. Updated all references in `graph.py`, `app.py`, and `components.py`.
DCF unavailable for AAPL — insufficient data	The DCF model requires multi-year historical revenue growth rates from Alpha Vantage. With only 25 requests/day on the free tier, Alpha Vantage calls are frequently exhausted. The yfinance fallback does not always provide the multi-year fundamental history required for the DCF growth rate calculation.	DCF gracefully returns `None` with an explanatory message in the error log. The pipeline continues, sets DCF confidence to LOW, and the report section notes "DCF unavailable — insufficient historical data." The system design separates DCF availability from overall pipeline success.
JSON parse failure in Verification Agent	When the verification prompt is long (many risk flags + filing excerpts), Gemini's response may be truncated at the `max_output_tokens` limit, producing a syntactically invalid JSON string cut off mid-value. The raw snippet: "divergences": [ { "claim": "Total net sales for 2025 were $416.16 billion — the string is never closed.	Implemented a multi-stage parser: (1) strip markdown code fences, (2) find the last valid closing bracket and truncate, (3) attempt `json.loads()`, (4) on failure, use `re.search` to extract a partial `divergences` array, (5) fall back to safe defaults. Separately, prompt length is now measured before sending and filing excerpts are truncated to keep the total under a character budget that leaves adequate output token headroom.
Gemini free-tier quota exhaustion across multiple keys	All `gemini-2.0-flash` free-tier quota (per Google Cloud project) was exhausted during development. New API keys from the same Google Workspace (Northeastern) account returned `limit: 0`, indicating the quota was not provisioned for that account type.	Switched to `gemini-2.5-flash`, which has its own independent quota pool. Changed the `GEMINI_MODEL` config constant. Also added a `get_script_run_ctx()` check in `_get_secret()` to prevent Streamlit secrets access at module import time, which was causing `set_page_config()` ordering errors.
Parallel state merging race condition	When `rag_citation` and `quant_analysis` run in parallel and both attempt to update `metadata["agent_latencies"]` and `error_log`, one branch's updates would overwrite the other's.	Used LangGraph's `Annotated[dict, _merge_metadata]` and `Annotated[list, operator.add]` on conflicting fields. The custom `_merge_metadata` reducer performs a dict merge rather than a replace, preserving both branches' contributions.
Plotly gauge steps reject 8-digit hex colors	Plotly's gauge step `color` property does not accept the `#RRGGBBAA` format used by our color system for opacity.	Converted all gauge step colors to `rgba(r,g,b,a)` format inline.
Nested f-string triple-quote syntax error	`f"""...{''.join(f"""...""" for ...)}..."""` is invalid Python syntax — triple-quoted f-strings cannot be nested.	Pre-computed the inner HTML string into a separate variable before the outer f-string.

11. Ethical Considerations

11.1 Financial Advice Disclaimer

AlphaLens is explicitly not financial advice and is designed for educational purposes only. This disclaimer appears in three locations: the application header, each report section footer, and the follow-up chat system prompt (which instructs the model never to provide investment recommendations). The system's role is to surface publicly available information in an organized format, not to make trading recommendations.

11.2 Data Sources & Copyright

All data used by AlphaLens comes from public domain or publicly accessible sources:

SEC EDGAR filings are public domain documents — companies are legally required to file them and they are freely available to anyone.
FRED economic data is published by the U.S. Federal Reserve and is explicitly in the public domain.
yfinance and Alpha Vantage provide aggregated market data under their respective free-tier terms of service, used only for educational and non-commercial purposes consistent with this project.

11.3 Bias & Representation

The system covers any publicly traded U.S. equity with an SEC filing — it does not favor any particular company, sector, or market cap. However, several inherent biases should be acknowledged:

Data quality bias: Companies with more complete Alpha Vantage data will receive more accurate DCF estimates. Smaller companies with thinner data coverage will receive lower confidence scores.
English-language bias: The RAG pipeline is optimized for English-language SEC filings. Non-U.S. companies with filings in other languages are not supported.
LLM bias: Gemini may reproduce patterns from its training data that reflect historical market biases. The Verification Agent's quantitative cross-check partially mitigates this by grounding claims in actual data.

11.4 Confidence Communication

The system is designed to communicate uncertainty explicitly rather than project false confidence. Every section of the report carries a confidence badge (HIGH/MEDIUM/LOW). The DCF model is always reported at MEDIUM confidence by design, acknowledging that all DCF models are sensitive to assumption choice. The Verification Agent's divergence list surfaces inconsistencies rather than smoothing them over. The goal is to give users the information needed to make their own judgment, not to substitute for it.

11.5 No User Data Collection

AlphaLens does not collect, store, or transmit any user data. Ticker searches are not logged. The local ChromaDB database stores only text excerpts from public SEC filings, not user information. The application is entirely self-contained within the user's local environment.

11.6 Limitations

Known Limitations

DCF model accuracy depends on data completeness from Alpha Vantage (25 req/day free tier)
RAG quality depends on EDGAR filing availability — not all tickers have machine-readable filings
Technical indicators are computed on 6-month price history; longer lookbacks improve reliability
Gemini free-tier rate limits (15 RPM) mean the pipeline may slow under concurrent usage
The system does not verify information that emerged after the most recent SEC filing date

12. Future Improvements

Multimodal integration: Parse financial tables and charts directly from PDF filings using Gemini's vision capabilities, rather than relying on HTML text extraction which can miss structured data in tables.
Comparable company analysis: Automatically retrieve and compare peer company metrics (same sector, similar market cap), turning AlphaLens from a single-company tool into a relative valuation platform.
Fine-tuning: Fine-tune a smaller model on a dataset of professional analyst reports to improve the quality and style of the Executive Summary and Financial Health sections, reducing dependence on the large Gemini model for generation tasks.
Persistent user portfolio: Allow users to save multiple ticker analyses and track confidence score changes over time as new filings are published.
Streaming report generation: Stream report section text as it is generated rather than displaying all at once after pipeline completion, improving perceived responsiveness.
Earnings call transcript RAG: Extend the RAG knowledge base to include quarterly earnings call transcripts from sources like The Motley Fool or Seeking Alpha (public domain transcripts), providing the forward-looking context that 10-K filings lack.
Automated eval CI: Run the evaluation framework as part of a CI pipeline on a weekly schedule, tracking metric trends over time as models and data sources change.

13. Conclusion

Project Links

Web Page: https://rishisehgal.github.io/AlphaLens/
Live App: https://rishisehgal-alphalens.streamlit.app/
Video Demo: https://www.loom.com/share/36213fda55af4fd2829f19d726ef1957
GitHub: https://github.com/RishiSehgal/AlphaLens

AlphaLens demonstrates that the combination of Prompt Engineering and Retrieval-Augmented Generation, coordinated through a multi-agent state machine, can compress a complex professional workflow into an automated, reliable, and explainable system. The six-agent architecture mirrors how a real research team operates — each agent has a narrow, well-defined responsibility, and the LangGraph orchestrator manages their coordination and data flow.

The project's most technically meaningful contribution is the Verification Agent: using an LLM to cross-reference quantitative financial data against management's narrative claims — and flagging divergences with specific evidence — is a use of generative AI that goes beyond summarization. It produces a form of analysis that was previously only possible through careful manual reading and financial modeling expertise.

The system was built entirely on zero-cost free-tier APIs, demonstrating that meaningful generative AI applications do not require expensive infrastructure. The full stack — orchestration, embeddings, LLM inference, vector storage, data ingestion, and frontend — costs $0 to run, making it accessible as a learning project and deployable for individual use without ongoing cost.

The challenges encountered during development — LLM output reliability, API quota management, parallel state merging, and Streamlit rendering constraints — are representative of real engineering problems in production AI systems, and the solutions implemented (defensive parsing, rate limiters, annotated reducers, lazy secret loading) reflect industry-standard patterns for building robust LLM applications.

14. References

LangGraph Documentation — Multi-Agent State Machines. LangChain, Inc., 2024. https://langchain-ai.github.io/langgraph/
Google AI — Gemini API Documentation. Google LLC, 2025. https://ai.google.dev/gemini-api/docs
Lewis, P., et al. "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." Advances in Neural Information Processing Systems, 2020.
SEC EDGAR Full-Text Search API. U.S. Securities and Exchange Commission. https://efts.sec.gov/LATEST/search-index
Federal Reserve Economic Data (FRED) API. Federal Reserve Bank of St. Louis. https://fred.stlouisfed.org/docs/api
ChromaDB Documentation — Open-Source Embedding Database. Chroma, Inc., 2024. https://docs.trychroma.com
Streamlit Documentation — Open-Source App Framework. Snowflake, Inc., 2025. https://docs.streamlit.io
Alpha Vantage API Documentation. Alpha Vantage, Inc. https://www.alphavantage.co/documentation
Damodaran, A. Investment Valuation: Tools and Techniques for Determining the Value of Any Asset. 3rd ed. Wiley Finance, 2012.
OpenAI Cookbook — "Techniques to Improve Reliability." https://github.com/openai/openai-cookbook (prompt engineering patterns referenced for structured output strategy)