Implementation Plan: Weekly Investment Decision System
This document captures implementation details that support docs/spec.md.
1. Execution Model
- Scheduled run via event service cron.
- On-demand run via CLI command.
- Advisory-only output; a human decision is required for any trade.
2. Data Inputs
2.1 Market Data
Provider: Finnhub (free tier, 60 req/min)
- Index performance: SPY, QQQ, DIA, IWM (day change via quote endpoint).
- Sector ETF performance: XLK, XLF, XLE, XLV, XLY, XLP, XLI, XLU, XLRE, XLB.
- Ticker quotes: price, day change, 52-week range, P/E (via quote + basic financials endpoints).
- Earnings calendar dates (via earnings calendar endpoint).
- Ticker news headlines with titles, publishers, and links (via company news endpoint).
- Dividend yield, ex-dates, and payout ratio (via basic financials endpoint).
- Insider transactions (via stock insider transactions endpoint).
- Institutional ownership changes (via institutional ownership endpoint).
Note: Candle/historical data requires paid tier. Period performance in market overview uses day change from quote endpoint.
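Staying under the 60 req/min free-tier limit needs some client-side throttling. A minimal sketch of a sliding-window limiter follows; the class name, the injectable `clock`/`sleep` parameters, and the choice of a sliding window are illustrative assumptions, not part of the existing adapter.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any `window_s`-second window (hypothetical helper)."""

    def __init__(
        self,
        max_calls: int = 60,
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.window_s = window_s
        self._clock = clock
        self._sleep = sleep
        self._calls: Deque[float] = deque()

    def acquire(self) -> float:
        """Block until a call is allowed; return the seconds waited."""
        now = self._clock()
        # Drop timestamps that have already left the window.
        while self._calls and now - self._calls[0] >= self.window_s:
            self._calls.popleft()
        waited = 0.0
        if len(self._calls) >= self.max_calls:
            # Wait until the oldest recorded call falls out of the window.
            waited = self.window_s - (now - self._calls[0])
            self._sleep(waited)
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
        return waited
```

Calling `acquire()` before each provider request would keep a single process within the limit; several processes sharing one API key would need a shared limiter instead.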
2.2 Brokerage Data
Provider: SnapTrade (CLI today, SDK candidate)
- Accounts and balances.
- Positions across accounts (including cost basis and holding period per lot).
- Order history.
- Dividend history and upcoming ex-dates.
- Realized gains/losses (for tax-lot awareness).
2.3 Macro/Economic Data
Provider: FRED API + derived market-implied signals
FRED indicators (initial):
- FEDFUNDS, DGS2, DGS10, CPIAUCSL, PCEPI, UNRATE, ICSA
Derived signals:
- VIX trend
- Sector rotation
- Short/medium momentum snapshots
2.4 Web Search
Provider: Brave Search API (free tier, 2000 queries/month)
- Per-symbol news search: "{symbol} stock news latest"
- Per-symbol analyst coverage: "{symbol} analyst rating outlook"
- Results include title, description, URL, published date, and source domain.
- Enriches rubric evaluation prompts in weekly review (appended to {news_context}).
- Also available as standalone CLI: bof search web, bof search symbol
2.5 SEC Filings
Provider: SEC EDGAR APIs (rate limited + required User-Agent)
Initial filing scope:
- 10-K, 10-Q, 8-K, 13F, Form 4
Adapter target:
- src/bo_finance/libs/sec_data.py
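To illustrate the User-Agent requirement, the sketch below builds (but does not send) a request for a company's filing index. The function name and the contact-email check are assumptions for illustration. The URL pattern reflects the public EDGAR submissions endpoint as generally documented, not anything specified in this plan.

```python
import urllib.request

# Submissions endpoint pattern: the CIK is zero-padded to 10 digits.
EDGAR_SUBMISSIONS_URL = "https://data.sec.gov/submissions/CIK{cik:010d}.json"


def build_edgar_request(cik: int, user_agent: str) -> urllib.request.Request:
    """Build a request for a company's filing index (hypothetical helper).

    The check for a contact email is an illustrative guard, on the assumption
    that the declared User-Agent should identify the caller with contact details.
    """
    if "@" not in user_agent:
        raise ValueError("User-Agent should include a contact email")
    return urllib.request.Request(
        EDGAR_SUBMISSIONS_URL.format(cik=cik),
        headers={"User-Agent": user_agent},
    )


# Example: Apple Inc. (CIK 320193). The request is only constructed here;
# sending it would go through urllib.request.urlopen plus a rate limiter.
req = build_edgar_request(320193, "bo-finance admin@example.com")
```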
2.6 Portfolio Targets
Primary tracking source: local YAML file
- File: investments/allocation.yaml
- Defines target asset allocation by sector (e.g., Tech: 0.40, Healthcare: 0.20).
- Defines max single-position concentration (e.g., 0.10).
- Defines target cash reserve (e.g., 0.05).
- Defines position sizing tiers by conviction level (full, half, starter).
2.7 Thesis Definitions
Primary tracking source: local YAML files
- Directory: investments/theses/
- One thesis file per symbol
- Contains conditions for add/trim/exit + disconfirming signals
Watchlist transition:
- Existing watchlist.txt is transitional input and will be replaced by thesis files.
2.8 Investment Policy
Primary tracking source: local YAML file
- File: investments/policy.yaml
- Defines investment philosophy, risk tolerance, time horizon.
- Defines behavioral constraints (e.g., "no shorting", "no options", "DCA into conviction positions").
- Defines sector convictions and preferences.
- Loaded into every agent session as foundational context — constrains what the agent should recommend.
- Equivalent to an Investment Policy Statement (IPS) in traditional advisory.
3. Storage Architecture
3.1 Directory Separation
The investments/ directory currently mixes human-edited configuration (inputs) with generated artifacts (outputs). These concerns are separated as follows:
investments/ — human-edited, version-controlled inputs:
- theses/ — per-symbol thesis YAML files.
- scoring/ — rubric definition YAML files.
- allocation.yaml — target allocation policy.
- risk-policy.yaml — risk thresholds and breach rules.
- alerts.yaml — user-defined alert rules.
- policy.yaml — investment policy statement (values, beliefs, constraints).
- watchlist.txt — transitional, replaced by thesis files.
artifacts/ — generated outputs, gitignored:
- reports/weekly/YYYY-MM-DD.json — structured weekly report.
- reports/weekly/YYYY-MM-DD.md — rendered markdown report.
- events/ — JSONL audit logs from event service.
- memory/ — agent long-term memory (observations, patterns, session summaries).
- bof.db — SQLite database for time-series and accumulated state.
The artifacts/ directory is gitignored because it contains machine-generated data specific to each user's brokerage accounts.
3.2 SQLite Database
Provider: Python stdlib sqlite3 (no new dependency).
The database stores accumulated state that needs querying across time — data that doesn't fit naturally into per-run JSON files or human-edited YAML.
Tables:
- portfolio_snapshots: periodic position snapshots for TWR calculation. Columns: date, symbol, quantity, market_value, cost_basis, sector.
- dividends: dividend events. Columns: date, symbol, amount, type (qualified/ordinary).
- transactions: order fills for realized gain/loss tracking. Columns: date, symbol, action, quantity, price, lot_date (for ST/LT classification).
- rubric_scores: historical scores for trend analysis. Columns: run_id, date, symbol, metric, score, rationale.
- alerts: active alert rules and trigger history. Columns: id, symbol, condition, created_at, last_triggered_at, cooldown_seconds.
- performance_cache: precomputed performance metrics. Columns: date, period, portfolio_return, benchmark_return, alpha, sharpe, max_drawdown.
Design principles:
- Write-ahead logging (WAL mode) for safe concurrent reads during event service runs.
- Schema migrations via numbered SQL files in src/bo_finance/migrations/.
- Database is recreatable from brokerage API history — it's a cache of computed state, not a source of truth.
- All queries go through a DatabaseService in src/bo_finance/services/database.py, not raw SQL in callers.
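The principles above could be sketched roughly as follows. The column names come from the table list, but the method names, the inline schema (the plan calls for numbered migration files instead), and the `(date, symbol)` primary key are assumptions made to keep the example self-contained.

```python
import os
import sqlite3
import tempfile


class DatabaseService:
    """Minimal sketch of a query facade; callers never issue raw SQL."""

    def __init__(self, path: str) -> None:
        self.conn = sqlite3.connect(path)
        # WAL lets readers proceed while a writer is active.
        self.conn.execute("PRAGMA journal_mode=WAL")
        # Inline schema for illustration only (assumed primary key).
        self.conn.execute(
            """CREATE TABLE IF NOT EXISTS portfolio_snapshots (
                   date TEXT NOT NULL, symbol TEXT NOT NULL,
                   quantity REAL, market_value REAL, cost_basis REAL, sector TEXT,
                   PRIMARY KEY (date, symbol))"""
        )

    def journal_mode(self) -> str:
        return self.conn.execute("PRAGMA journal_mode").fetchone()[0]

    def insert_snapshot(self, date: str, symbol: str, quantity: float,
                        market_value: float, cost_basis: float, sector: str) -> None:
        # The connection context manager commits on success, rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO portfolio_snapshots VALUES (?, ?, ?, ?, ?, ?)",
                (date, symbol, quantity, market_value, cost_basis, sector),
            )

    def total_value(self, date: str) -> float:
        row = self.conn.execute(
            "SELECT COALESCE(SUM(market_value), 0.0) FROM portfolio_snapshots WHERE date = ?",
            (date,),
        ).fetchone()
        return row[0]


# Usage with a throwaway file (WAL mode needs an on-disk database).
db = DatabaseService(os.path.join(tempfile.mkdtemp(), "bof.db"))
db.insert_snapshot("2024-01-05", "AAPL", 10, 1850.0, 1500.0, "Tech")
db.insert_snapshot("2024-01-05", "XOM", 5, 500.0, 450.0, "Energy")
```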
3.3 Output Artifacts
Per run:
1. artifacts/reports/weekly/YYYY-MM-DD.json
2. artifacts/reports/weekly/YYYY-MM-DD.md
3. Event/audit stream entries (artifacts/events/ JSONL sink)
4. Portfolio snapshot rows inserted into artifacts/bof.db
3.4 JSON Report Contract
Top-level keys. Full structure defined by Pydantic models in src/bo_finance/models/report.py.
{
"metadata": { "run_id": "uuid", "date": "YYYY-MM-DD", "trigger": "scheduled|manual" },
"decisions": [],
"performance": { "portfolio_return_pct": 0.0, "benchmark_return_pct": 0.0, "alpha": 0.0, "sharpe_ratio": 0.0, "max_drawdown_pct": 0.0, "period": "ytd" },
"income": { "dividends_received": 0.0, "upcoming_ex_dates": [], "portfolio_yield": 0.0 },
"allocation": { "targets": {}, "actual": {}, "drift": {}, "rebalance_actions": [] },
"tax": { "realized_gains_st": 0.0, "realized_gains_lt": 0.0, "unrealized_losses_harvestable": [] },
"risk_snapshot": { "concentration": {}, "sector_exposure": {}, "volatility": {}, "beta": 0.0, "correlation_flags": [], "breaches": [] },
"action_items": []
}
3.5 Markdown Report Sections
- Executive summary — agent-provided narrative when available, rule-based template fallback.
- Performance vs. benchmark — portfolio return vs. SPY/QQQ, alpha, Sharpe, max drawdown.
- Position/thesis decisions — one row per tracked symbol.
- Allocation drift — target vs. actual sector weights, rebalancing suggestions.
- Income summary — dividends received, upcoming ex-dates, portfolio yield.
- Tax snapshot — YTD realized gains (ST/LT), harvestable unrealized losses.
- Evidence table — columns: source, link, timestamp, extracted claim, decision it supports.
- Portfolio risk snapshot and breaches — concentration, beta, correlation flags, threshold violations.
- Insider & institutional signals — notable insider buys/sells, ownership changes for held symbols.
- Ranked action items — ordered by priority score descending.
3.6 Evidence Record Structure
Each evidence entry attached to a decision:
{
"source_type": "sec_filing|news|market_data|macro|thesis",
"source_url": "https://...",
"retrieved_at": "ISO-8601",
"claim": "extracted or summarized assertion",
"attribution": "10-Q filing dated ..."
}
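For illustration, the record above could be mirrored by a small typed structure. The project's models are described elsewhere in this plan as Pydantic; this sketch uses a stdlib dataclass as a stand-in, and the validation rules (closed set of source types, ISO-8601 parse check) are assumptions.

```python
from dataclasses import asdict, dataclass
from datetime import datetime

SOURCE_TYPES = {"sec_filing", "news", "market_data", "macro", "thesis"}


@dataclass(frozen=True)
class Evidence:
    """Stand-in for the evidence record (hypothetical; not the real model)."""

    source_type: str
    source_url: str
    retrieved_at: str  # ISO-8601 timestamp
    claim: str
    attribution: str

    def __post_init__(self) -> None:
        if self.source_type not in SOURCE_TYPES:
            raise ValueError(f"unknown source_type: {self.source_type!r}")
        # Raises ValueError when the timestamp is not parseable ISO-8601.
        datetime.fromisoformat(self.retrieved_at)


ev = Evidence(
    source_type="sec_filing",
    source_url="https://example.com/filing",
    retrieved_at="2024-05-01T12:00:00+00:00",
    claim="extracted or summarized assertion",
    attribution="10-Q filing dated ...",
)
record = asdict(ev)  # plain dict, ready for JSON serialization
```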
4. Recommendation and Risk Logic
Recommendation enum:
- BUY_MORE, HOLD, TRIM, EXIT
4.1 Confidence Scoring
Each recommendation carries a confidence value (0.0–1.0) derived from rubric-based composite scoring:
- Rubric dimensions: thesis alignment (0.40), news sentiment (0.30), valuation signal (0.30).
- Agent-evaluated: the CLI agent reads rubric prompts with gathered data and assigns integer scores per dimension.
- Composite: each score is normalized to 0.0–1.0, then combined via weighted average.
- Confidence: fraction of total rubric weight that was actually scored (supports partial evaluation).
- Action thresholds (on the composite score): >= 0.70 BUY_MORE, >= 0.40 HOLD, >= 0.20 TRIM, < 0.20 EXIT.
Rubric definitions are YAML files in investments/scoring/ with versioned scales, anchor descriptions, and evaluation prompt templates. Composite math is deterministic; agent judgment enters only through the per-rubric score assignment.
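The composite math described above can be sketched as follows. The weights and thresholds are taken from this section. The 1 to 5 integer scale and the renormalization over only the scored dimensions are assumptions, since the actual scales live in the rubric YAML files.

```python
from typing import Dict, Optional, Tuple

WEIGHTS = {"thesis_alignment": 0.40, "news_sentiment": 0.30, "valuation_signal": 0.30}
SCALE_MIN, SCALE_MAX = 1, 5  # assumed rubric scale; real scales are defined per rubric


def action_for(composite: float) -> str:
    if composite >= 0.70:
        return "BUY_MORE"
    if composite >= 0.40:
        return "HOLD"
    if composite >= 0.20:
        return "TRIM"
    return "EXIT"


def score_symbol(scores: Dict[str, int]) -> Tuple[Optional[float], float, Optional[str]]:
    """Return (composite, confidence, action) for the dimensions actually scored."""
    scored = {k: v for k, v in scores.items() if k in WEIGHTS}
    weight_scored = sum(WEIGHTS[k] for k in scored)
    # Confidence: fraction of total rubric weight that received a score.
    confidence = weight_scored / sum(WEIGHTS.values())
    if not scored:
        return None, 0.0, None
    # Normalize each integer score to 0.0-1.0, then take the weighted average
    # over the scored dimensions only (assumption for partial evaluation).
    composite = sum(
        WEIGHTS[k] * (v - SCALE_MIN) / (SCALE_MAX - SCALE_MIN) for k, v in scored.items()
    ) / weight_scored
    return composite, confidence, action_for(composite)


# Partial evaluation: valuation_signal has not been scored yet.
composite, confidence, action = score_symbol({"thesis_alignment": 4, "news_sentiment": 3})
```

In this example the composite is roughly 0.64 with confidence 0.70, which maps to HOLD under the thresholds listed above.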
4.2 Action Ranking
Action items are priority-ranked by:
1. Risk breach severity (critical breaches surface first).
2. Confidence level (higher confidence = higher priority).
3. Recommendation urgency (EXIT > TRIM > BUY_MORE > HOLD).
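One way to read these three criteria is as a lexicographic sort key, sketched below. The item shape, the `breach` field, and the severity labels are hypothetical; the plan does not say whether the criteria are applied as strict tie-breakers or blended into a single score.

```python
from typing import Dict, List

URGENCY = {"EXIT": 3, "TRIM": 2, "BUY_MORE": 1, "HOLD": 0}
SEVERITY = {"critical": 2, "warning": 1, None: 0}  # hypothetical severity levels


def rank_actions(items: List[Dict]) -> List[Dict]:
    """Sort by breach severity, then confidence, then urgency (all descending)."""
    return sorted(
        items,
        key=lambda item: (
            -SEVERITY[item.get("breach")],
            -item["confidence"],
            -URGENCY[item["recommendation"]],
        ),
    )


ranked = rank_actions([
    {"symbol": "AAA", "recommendation": "HOLD", "confidence": 0.9},
    {"symbol": "BBB", "recommendation": "TRIM", "confidence": 0.6, "breach": "critical"},
    {"symbol": "CCC", "recommendation": "EXIT", "confidence": 0.9},
    {"symbol": "DDD", "recommendation": "BUY_MORE", "confidence": 0.8, "breach": "warning"},
])
order = [item["symbol"] for item in ranked]
```

Here the critical breach surfaces first despite its lower confidence, and urgency only separates AAA and CCC, which tie on the first two criteria.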
4.3 Risk Dimensions
Quantitative metrics:
- Position concentration: single name %, top-5 %, HHI index.
- Sector concentration: actual vs. target allocation drift.
- Portfolio beta: weighted-average beta to SPY.
- Correlation flags: pairs of holdings with rolling correlation > threshold.
- Max drawdown: worst peak-to-trough over trailing window.
- Sharpe ratio: excess return per unit of volatility.
Qualitative signals:
- Insider transaction direction (net buy/sell) for held symbols.
- Institutional ownership trend (increasing/decreasing).
Risk policy:
- Config-driven from investments/risk-policy.yaml
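Three of the quantitative metrics are simple enough to sketch with the standard library. The function names are illustrative, weights are assumed to be fractions summing to 1, and the Sharpe ratio here is per-period (not annualized) using the sample standard deviation.

```python
import statistics
from typing import Sequence


def hhi(weights: Sequence[float]) -> float:
    """Herfindahl-Hirschman index: sum of squared position weights (0-1 scale)."""
    return sum(w * w for w in weights)


def max_drawdown(values: Sequence[float]) -> float:
    """Worst peak-to-trough decline over the series, as a negative fraction."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = min(worst, v / peak - 1.0)
    return worst


def sharpe(returns: Sequence[float], risk_free: float = 0.0) -> float:
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    sd = statistics.stdev(excess)
    return statistics.mean(excess) / sd if sd else 0.0


concentration = hhi([0.5, 0.5])              # two equal positions
drawdown = max_drawdown([100, 120, 90, 110])  # peak 120 to trough 90
ratio = sharpe([0.01, 0.03])
```

An HHI of 1.0 means a single position; lower values indicate a more diversified book.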
4.4 Agent-External Intelligence Model
Key assumption: The BOF CLI does not call LLM APIs. All AI reasoning happens in the calling agent session. The CLI is a structured data tool; the agent interprets, evaluates, and writes back conclusions.
The recommendation enum and confidence score are always produced by rule-based logic over structured inputs. Agent judgment enters only through:
- Score assignment via bof review score (integer per rubric dimension).
- Evidence recording via bof review evidence (structured claims with attribution).
- Narrative annotation via bof review annotate (prose sections like executive summary).
Design implications:
1. No LLM SDK dependency in the BOF codebase.
2. Any agent can drive the workflow: Claude Code, an Agent SDK app, or a human.
3. Without agent evaluation, the CLI still produces a data-complete draft report with empty scores and template narratives.
4. CLI outputs are designed for agent consumption: structured JSON, rendered prompts with context, and clean text sections from SEC filings.
5. Service Mode and Orchestration
5.1 Event backbone
Existing async pub/sub service (bof service run) with AsyncEventBus, pluggable EventSource producers, and EventSink consumers.
Event flow (CLI-driven, current):
1. cron.tick trigger → projected workflow.request
2. Weekly review handler executes pipeline (data gathering + draft report)
3. Emits workflow.started, workflow.step, workflow.completed|workflow.failed
5.2 Agent orchestrator
The service mode evolves from "event bus with sinks" to "autonomous agent orchestrator with human-in-the-loop."
When a trigger fires (cron tick, Telegram message, alert), the service:
- Assembles context: trigger payload, investment policy (investments/policy.yaml), relevant long-term memory entries, current portfolio state summary.
- Spawns a headless agent session (e.g., claude -p "..." or Agent SDK subprocess) with the assembled context as system prompt and bof CLI as available tools.
- The agent executes the workflow autonomously, calling bof commands, reading output, reasoning, and writing back scores/evidence/narrative.
- Captures agent output: report artifacts to disk, conversation summary to Telegram (if triggered by Telegram), events to audit log.
- Updates long-term memory with session outcomes (observations, decisions made, thesis changes noted).
Agent invocation contract:
- Input: structured prompt with context + tool access to bof CLI commands.
- Output: completed workflow artifacts (report JSON/MD) + optional conversation reply (for Telegram).
- Timeout: configurable per workflow type (e.g., 10 minutes for weekly review).
- Failure mode: on timeout or error, emit workflow.failed event, notify via Telegram if available.
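The invocation contract could be sketched as a subprocess call with a timeout. The function name and the returned dict shape are assumptions, as is the convention of passing the prompt as the final argument (modeled on the `claude -p "..."` example). The demo uses the Python interpreter as a stand-in agent so it runs without any agent binary installed.

```python
import shlex
import subprocess
import sys
from typing import Dict


def run_agent(agent_command: str, prompt: str, timeout_s: float = 600.0) -> Dict[str, str]:
    """Run a headless agent command and map the outcome to a workflow event."""
    argv = shlex.split(agent_command) + [prompt]
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising on timeout.
        return {"event": "workflow.failed", "reason": "timeout", "output": ""}
    except OSError as exc:
        return {"event": "workflow.failed", "reason": f"spawn error: {exc}", "output": ""}
    if proc.returncode != 0:
        return {"event": "workflow.failed", "reason": f"exit code {proc.returncode}",
                "output": proc.stderr}
    return {"event": "workflow.completed", "reason": "", "output": proc.stdout}


# Stand-in "agent": echoes the prompt in upper case.
fake_agent = shlex.join([sys.executable, "-c", "import sys; print(sys.argv[1].upper())"])
ok = run_agent(fake_agent, "run weekly review", timeout_s=30)

# Stand-in "agent" that hangs, to exercise the timeout path.
slow_agent = shlex.join([sys.executable, "-c", "import time; time.sleep(5)"])
timed_out = run_agent(slow_agent, "ignored", timeout_s=0.5)
```

The caller would then emit the returned event on the bus and, where Telegram is configured, forward failures as notifications.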
5.3 Telegram bidirectional interface
Current state: receive-only (long-polling). Needs: send capability and conversational routing.
Proactive messages (service → human):
- Weekly review summary after cron-triggered review completes.
- Alert notifications (price, drift, earnings, ex-date).
- Errors or failures that need human attention.
Reactive messages (human → service → agent → human):
- Human sends query: "how is NVDA doing?", "what's my allocation drift?", "run a review now"
- Service spawns agent session with query as input + portfolio context.
- Agent uses bof commands to gather data, composes a reply.
- Service sends reply back to the Telegram chat.
Implementation:
- TelegramClient — new class for sending messages (sendMessage, sendPhoto for charts, callback ACK).
- Conversation state tracked per chat_id — enables multi-turn exchanges within a session.
- Agent session receives Telegram thread context as short-term memory.
5.4 Memory system
Short-term memory
- Scope: per agent session, ephemeral.
- Content: conversation thread (for Telegram multi-turn), current review state, data already fetched.
- Storage: passed as prompt context to the agent session. Not persisted after session ends.
Long-term memory
- Scope: across sessions, durable.
- Content: past review outcomes, thesis evolution, cross-week observations, patterns ("AAPL has beaten earnings 4 quarters in a row"), what worked and what didn't.
- Storage: artifacts/memory/, structured files (YAML or JSON). Agent reads relevant entries at session start; writes new observations at session end via bof memory write --type observation --content "...".
- Retrieval: agent can query with bof memory search --query "AAPL earnings" to find relevant past observations.
- Decay: observations older than a configurable window (e.g., 6 months) are surfaced with lower relevance or archived.
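The decay rule leaves the exact weighting open. One possible shape, sketched here purely as an assumption, keeps full relevance inside the window and then halves it for every additional window of age; an archive cutoff could be layered on top by dropping entries below some minimum weight.

```python
from datetime import date, timedelta


def relevance(observed: date, today: date, window_days: int = 180) -> float:
    """Relevance weight in (0, 1]: 1.0 inside the window, halving per extra window.

    The half-life form is an illustrative choice, not something the plan specifies.
    """
    age = (today - observed).days
    if age <= window_days:
        return 1.0
    return 0.5 ** ((age - window_days) / window_days)


observed = date(2024, 1, 1)
fresh = relevance(observed, observed + timedelta(days=60))
stale = relevance(observed, observed + timedelta(days=360))  # one window past the cutoff
```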
Values and beliefs (investment policy)
- Scope: rarely changes, human-edited.
- Content: investment philosophy, risk tolerance, time horizon, sector convictions, behavioral guardrails.
- Storage: investments/policy.yaml (version-controlled, human-authored).
- Usage: loaded into every agent session as foundational context. Constrains recommendations: the agent should not suggest actions that violate the policy without explicitly flagging the conflict.
5.5 Service CLI surface
# Existing
bof service run --telegram --cron weekly_review=1w --event-log artifacts/events.jsonl
# New flags
#   --agent-command: headless agent binary + flags
#   --agent-timeout: seconds per agent session
#   --memory-dir:    long-term memory directory
bof service run \
  --agent-command "claude -p" \
  --agent-timeout 600 \
  --memory-dir artifacts/memory
# Memory management
bof memory list [--type observation|pattern]
bof memory search --query "..."
bof memory write --type observation --content "..."
bof memory forget ID
6. Phased Delivery
Phase 1: Weekly Backbone (DONE)
Delivered: typed brokerage outputs via SnapTrade SDK, weekly orchestrator with on-demand CLI, report artifact persistence (JSON + MD), baseline ingestion (portfolio, market, watchlist, news), macro ingestion via FRED.
Phase 2a: Scoring Rubrics (DONE)
Delivered: rubric YAML schema + loader/validator, three initial rubrics (thesis_alignment, news_sentiment, valuation_signal), composite scoring with weighted average and action thresholds, bof review rubric/score/finalize CLI commands, rubric prompt rendering in --dry-run mode.
Phase 2b: Thesis System + SEC Filings + Agent Write-Back (TODO, next up)
Scope:
1. Thesis YAML schema + loader/validator.
2. bof thesis command surface: bof thesis list, bof thesis show SYMBOL, bof thesis check SYMBOL (outputs thesis conditions vs. current market/position state — quantitative conditions evaluated by CLI, qualitative conditions presented as context for the agent).
3. Migration from watchlist to thesis files.
4. Thesis context feeds into thesis_alignment rubric prompts.
5. SEC filing adapter per §2.5: bof sec filings SYMBOL (filing index from EDGAR), bof sec read SYMBOL --type 10-K (fetch + clean filing into readable text sections: risk factors, MD&A, guidance — deterministic HTML cleaning, no LLM).
6. Agent write-back commands: bof review evidence SYMBOL --source-type TYPE --claim "..." --source-url URL --attribution "..." (record structured evidence), bof review annotate --run-id ID --field FIELD --value "..." (write narrative sections back into report).
Exit criteria:
1. Every recommendation maps to thesis conditions + evidence.
2. SEC filings are fetchable and readable via CLI; agent can extract and record evidence from them.
3. Agent can write evidence and narrative back into the report via CLI commands.
Phase 3: Risk Engine + Portfolio Analytics
Scope:
1. Implement risk metrics per §4.3 (concentration, beta, correlation, Sharpe, max drawdown).
2. Breach detection with severity per investments/risk-policy.yaml.
3. Risk-informed action generation.
4. Insider transaction and institutional ownership signals per §2.1.
Exit criteria:
1. Report includes quantitative risk scores, qualitative signals, breaches, and mitigations.
Phase 4: Decision Layer
Scope:
1. Combine thesis + risk + constraints into final decisions.
2. Confidence scoring per §4.1, action ranking per §4.2.
3. JSON output conforms to §3.4, markdown output includes all §3.5 sections delivered so far.
Exit criteria:
1. Every decision carries confidence score and linked evidence.
2. Action items are priority-ranked.
3. Report is directly actionable as weekly review checklist.
Phase 5: Storage Migration + Performance, Income & Tax Tracking
Scope:
1. SQLite database bootstrap per §3.2: schema creation, DatabaseService, migration runner.
2. Artifacts directory setup per §3.1: move outputs from investments/reports/ to artifacts/, gitignore, update path constants.
3. Performance measurement: TWR, benchmark comparison (vs. SPY), alpha.
4. Dividend tracking: yield, ex-dates, income received, yield-on-cost.
5. Cost basis enrichment: holding period (ST/LT), realized gains/losses.
6. Tax-loss harvesting signals: unrealized losses with ST/LT flag and wash-sale window check.
7. Historical trend analysis: cross-report comparison over time.
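For the performance measurement item, time-weighted return (TWR) chains sub-period growth factors so that deposits and withdrawals do not distort the result. A minimal sketch, assuming each sub-period is given as a (begin value after any cash flow, end value before the next cash flow) pair; how those pairs are derived from portfolio_snapshots is left open here.

```python
from typing import Sequence, Tuple


def time_weighted_return(periods: Sequence[Tuple[float, float]]) -> float:
    """Chain-link sub-period returns: prod(end / begin) - 1."""
    growth = 1.0
    for begin, end in periods:
        if begin <= 0:
            raise ValueError("sub-period must start with a positive value")
        growth *= end / begin
    return growth - 1.0


# 100 grows to 110, then a 100 deposit brings the base to 210, which grows to 231.
twr = time_weighted_return([(100.0, 110.0), (210.0, 231.0)])
```

Both sub-periods returned 10%, so the chained result is about 21% regardless of the size of the deposit.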
Phase 6: Allocation & Rebalancing
Scope:
1. Target allocation per §2.6, drift computation, rebalancing suggestions.
2. Tax-aware trade sizing (prefer selling LT gains, harvest ST losses).
3. Position sizing by conviction tier from thesis definitions.
4. Cash reserve tracking: actual vs. target dry powder.
Phase 7: Alerts & Threshold Monitoring
Scope:
1. Portfolio-level event predicates wired into existing event service.
2. Price, drift, earnings proximity, and dividend ex-date alerts.
3. Alert rules defined in investments/alerts.yaml.
4. CLI: bof alert list/add/remove.
Phase 8: Service Mode: Agent Orchestration + Telegram + Memory
Scope:
1. Agent orchestrator: spawn headless agent sessions from event handlers (configurable agent command, timeout, output capture).
2. Telegram send capability: TelegramClient for sendMessage, formatted replies, callback ACK.
3. Telegram conversational routing: human query → agent session → reply. Multi-turn state per chat_id.
4. Investment policy: investments/policy.yaml schema, loaded as agent context.
5. Long-term memory: artifacts/memory/ store, bof memory list/search/write/forget CLI commands.
6. Cron → agent: cron tick triggers full agent-driven weekly review without human intervention.
7. Proactive notifications: review summaries and alert messages sent to Telegram after completion.
Exit criteria:
1. bof service run with cron trigger spawns a headless agent that completes a weekly review end-to-end.
2. Telegram bot receives a human query, spawns an agent, and sends back a composed reply.
3. Agent sessions load investment policy and relevant long-term memory as foundational context.
4. Agent sessions write observations to long-term memory after completing a workflow.
7. Test Infrastructure (Cross-Cutting)
- Fixture layering: each phase builds on prior fixtures.
- Recorded responses: all external API calls captured as fixtures. No live calls in CI.
- Schema validation: shared validator enforces §3.4 contract.
- Injectable dependencies: run_id and timestamps are injectable so tests control non-deterministic inputs.
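A shared validator for the §3.4 contract might start as a top-level key check like the one below. The function name and error strings are hypothetical, and a real implementation would more likely delegate to the Pydantic models; this stdlib version only shows the shape of the check.

```python
from typing import Dict, List

REQUIRED_KEYS = {
    "metadata", "decisions", "performance", "income",
    "allocation", "tax", "risk_snapshot", "action_items",
}


def validate_report(report: Dict) -> List[str]:
    """Return a list of contract violations; an empty list means the report passes."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - set(report))]
    if "metadata" in report:
        trigger = report["metadata"].get("trigger")
        if trigger not in {"scheduled", "manual"}:
            problems.append(f"invalid trigger: {trigger!r}")
    for key in ("decisions", "action_items"):
        if key in report and not isinstance(report[key], list):
            problems.append(f"{key} must be a list")
    return problems


# run_id and date are passed in explicitly, so a test can pin them.
sample = {
    "metadata": {"run_id": "test-run", "date": "2024-01-05", "trigger": "manual"},
    "decisions": [], "performance": {}, "income": {}, "allocation": {},
    "tax": {}, "risk_snapshot": {}, "action_items": [],
}
problems = validate_report(sample)
```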
8. Security and Traceability
- Secrets never logged.
- Redaction applied to serialized event payloads.
- Recommendations carry source/timestamp provenance.
- All workflow interactions and agent write-backs are auditable.
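Redaction of serialized event payloads could work along these lines. The list of sensitive key fragments and the `[REDACTED]` placeholder are assumptions; key-name matching alone will not catch secrets embedded in free-text values.

```python
from typing import Any

SENSITIVE_FRAGMENTS = ("token", "secret", "password", "api_key", "authorization")


def redact(payload: Any) -> Any:
    """Return a copy of the payload with values under sensitive-looking keys masked."""
    if isinstance(payload, dict):
        return {
            key: "[REDACTED]"
            if any(fragment in str(key).lower() for fragment in SENSITIVE_FRAGMENTS)
            else redact(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact(item) for item in payload]
    return payload


event = {
    "type": "workflow.step",
    "provider": {"name": "finnhub", "API_KEY": "abc123"},
    "headers": [{"Authorization": "Bearer xyz"}, {"Accept": "application/json"}],
}
safe_event = redact(event)
```

Because the function returns a new structure, the original payload stays usable by the handler while only the redacted copy reaches the JSONL sink.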