Implementation Plan: Weekly Investment Decision System

This document captures implementation details that support docs/spec.md.

1. Execution Model

  1. Scheduled run via event service cron.
  2. On-demand run via CLI command.
  3. Advisory-only output, human decision required for any trade.

2. Data Inputs

2.1 Market Data

Provider: Finnhub (free tier, 60 req/min)

  • Index performance: SPY, QQQ, DIA, IWM (day change via quote endpoint).
  • Sector ETF performance: XLK, XLF, XLE, XLV, XLY, XLP, XLI, XLU, XLRE, XLB.
  • Ticker quotes: price, day change, 52-week range, P/E (via quote + basic financials endpoints).
  • Earnings calendar dates (via earnings calendar endpoint).
  • Ticker news headlines with titles, publishers, and links (via company news endpoint).
  • Dividend yield, ex-dates, and payout ratio (via basic financials endpoint).
  • Insider transactions (via stock insider transactions endpoint).
  • Institutional ownership changes (via institutional ownership endpoint).

Note: Candle/historical data requires a paid tier, so period performance in the market overview uses the day change from the quote endpoint.
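The 60 req/min free-tier limit suggests throttling calls on the client side. The following is a minimal sketch of a sliding-window limiter, not the project's actual implementation: the class name `SlidingWindowLimiter` and its parameters are hypothetical, and the clock and sleep functions are injectable so tests need no real waiting.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls per `period` seconds (hypothetical helper)."""

    def __init__(
        self,
        max_calls: int = 60,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls: Deque[float] = deque()

    def acquire(self) -> float:
        """Block until a call is allowed; return the number of seconds waited."""
        now = self._clock()
        # Drop timestamps that have already left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        waited = 0.0
        if len(self._calls) >= self.max_calls:
            # Wait until the oldest call in the window expires.
            waited = self.period - (now - self._calls[0])
            self._sleep(waited)
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
        return waited
```

A caller would invoke `limiter.acquire()` immediately before each provider request.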

2.2 Brokerage Data

Provider: SnapTrade (CLI today, SDK candidate)

  • Accounts and balances.
  • Positions across accounts (including cost basis and holding period per lot).
  • Order history.
  • Dividend history and upcoming ex-dates.
  • Realized gains/losses (for tax-lot awareness).

2.3 Macro/Economic Data

Provider: FRED API + derived market-implied signals

FRED indicators (initial): FEDFUNDS, DGS2, DGS10, CPIAUCSL, PCEPI, UNRATE, ICSA

Derived signals:

  • VIX trend
  • Sector rotation
  • Short/medium momentum snapshots

2.4 Web Search

Provider: Brave Search API (free tier, 2000 queries/month)

  • Per-symbol news search: "{symbol} stock news latest"
  • Per-symbol analyst coverage: "{symbol} analyst rating outlook"
  • Results include title, description, URL, published date, source domain
  • Enriches rubric evaluation prompts in weekly review (appended to {news_context})
  • Also available as standalone CLI: bof search web, bof search symbol

2.5 SEC Filings

Provider: SEC EDGAR APIs (rate limited + required User-Agent)

Initial filing scope: 10-K, 10-Q, 8-K, 13F, Form 4

Adapter target: src/bo_finance/libs/sec_data.py
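As a rough illustration of what the adapter might do, the sketch below builds a request carrying the required User-Agent header and filters a parsed filing index down to the forms in scope. The function names are hypothetical. The `data.sec.gov/submissions` URL pattern and the parallel-array layout under `filings.recent` reflect the EDGAR JSON API as understood here and should be checked against SEC documentation. EDGAR form codes can differ from display names (for example `4` for Form 4 and `13F-HR` for 13F holdings reports), so the scope is passed in rather than hard-coded.

```python
from typing import Any, Dict, Iterable, List
from urllib.request import Request


def build_submissions_request(cik: int, user_agent: str) -> Request:
    """Build (but do not send) a request for a company's filing index."""
    # Assumption: CIKs are zero-padded to 10 digits in the submissions URL.
    url = f"https://data.sec.gov/submissions/CIK{cik:010d}.json"
    return Request(url, headers={"User-Agent": user_agent})


def filter_recent_filings(
    payload: Dict[str, Any], forms: Iterable[str]
) -> List[Dict[str, str]]:
    """Keep only filings whose form code is in `forms`."""
    wanted = set(forms)
    recent = payload["filings"]["recent"]
    # The index stores one list per column; zip them back into rows.
    rows = zip(recent["form"], recent["filingDate"], recent["accessionNumber"])
    return [
        {"form": form, "filing_date": filed, "accession_number": accession}
        for form, filed, accession in rows
        if form in wanted
    ]
```

Sending the request (for example with `urllib.request.urlopen`) and pacing calls to respect the rate limit are left to the caller.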

2.6 Portfolio Targets

Primary tracking source: local YAML file

  • File: investments/allocation.yaml
  • Defines target asset allocation by sector (e.g., Tech: 0.40, Healthcare: 0.20).
  • Defines max single-position concentration (e.g., 0.10).
  • Defines target cash reserve (e.g., 0.05).
  • Defines position sizing tiers by conviction level (full, half, starter).
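To make the target fields concrete, here is a small sketch of how sector drift and a single-position concentration check could be computed from them. The function names and the dict-based inputs are assumptions for illustration; the real loader and schema for allocation.yaml may differ.

```python
from typing import Dict, List


def allocation_drift(
    targets: Dict[str, float], sector_values: Dict[str, float]
) -> Dict[str, float]:
    """Return actual weight minus target weight per sector."""
    total = sum(sector_values.values())
    if total <= 0:
        raise ValueError("portfolio value must be positive")
    sectors = sorted(set(targets) | set(sector_values))
    return {
        sector: sector_values.get(sector, 0.0) / total - targets.get(sector, 0.0)
        for sector in sectors
    }


def over_concentrated(
    position_values: Dict[str, float], max_weight: float = 0.10
) -> List[str]:
    """Return symbols whose portfolio weight exceeds `max_weight`."""
    total = sum(position_values.values())
    if total <= 0:
        raise ValueError("portfolio value must be positive")
    return sorted(s for s, v in position_values.items() if v / total > max_weight)
```

A positive drift means the sector is overweight relative to its target; sectors held without a target are treated as having a target of zero.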

2.7 Thesis Definitions

Primary tracking source: local YAML files

  • Directory: investments/theses/
  • One thesis file per symbol
  • Contains conditions for add/trim/exit + disconfirming signals

Watchlist transition: the existing watchlist.txt is a transitional input and will be replaced by thesis files.

2.8 Investment Policy

Primary tracking source: local YAML file

  • File: investments/policy.yaml
  • Defines investment philosophy, risk tolerance, time horizon.
  • Defines behavioral constraints (e.g., "no shorting", "no options", "DCA into conviction positions").
  • Defines sector convictions and preferences.
  • Loaded into every agent session as foundational context — constrains what the agent should recommend.
  • Equivalent to an Investment Policy Statement (IPS) in traditional advisory.

3. Storage Architecture

3.1 Directory Separation

The investments/ directory currently mixes human-edited configuration (inputs) with generated artifacts (outputs). These concerns are separated as follows:

investments/ (human-edited, version-controlled inputs):

  • theses/: per-symbol thesis YAML files.
  • scoring/: rubric definition YAML files.
  • allocation.yaml: target allocation policy.
  • risk-policy.yaml: risk thresholds and breach rules.
  • alerts.yaml: user-defined alert rules.
  • policy.yaml: investment policy statement (values, beliefs, constraints).
  • watchlist.txt: transitional, replaced by thesis files.

artifacts/ (generated outputs, gitignored):

  • reports/weekly/YYYY-MM-DD.json: structured weekly report.
  • reports/weekly/YYYY-MM-DD.md: rendered markdown report.
  • events/: JSONL audit logs from event service.
  • memory/: agent long-term memory (observations, patterns, session summaries).
  • bof.db: SQLite database for time-series and accumulated state.

The artifacts/ directory is gitignored because it contains machine-generated data specific to each user's brokerage accounts.

3.2 SQLite Database

Provider: Python stdlib sqlite3 (no new dependency).

The database stores accumulated state that needs querying across time — data that doesn't fit naturally into per-run JSON files or human-edited YAML.

Tables:

  • portfolio_snapshots — periodic position snapshots for TWR calculation. Columns: date, symbol, quantity, market_value, cost_basis, sector.
  • dividends — dividend events. Columns: date, symbol, amount, type (qualified/ordinary).
  • transactions — order fills for realized gain/loss tracking. Columns: date, symbol, action, quantity, price, lot_date (for ST/LT classification).
  • rubric_scores — historical scores for trend analysis. Columns: run_id, date, symbol, metric, score, rationale.
  • alerts — active alert rules and trigger history. Columns: id, symbol, condition, created_at, last_triggered_at, cooldown_seconds.
  • performance_cache — precomputed performance metrics. Columns: date, period, portfolio_return, benchmark_return, alpha, sharpe, max_drawdown.

Design principles:

  • Write-ahead logging (WAL mode) for safe concurrent reads during event service runs.
  • Schema migrations via numbered SQL files in src/bo_finance/migrations/.
  • Database is recreatable from brokerage API history: it is a cache of computed state, not a source of truth.
  • All queries go through a DatabaseService in src/bo_finance/services/database.py, not raw SQL in callers.
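A minimal sketch of the bootstrap these principles imply, using only the stdlib sqlite3 module. The helper names, column types, and the `(date, symbol)` primary key are assumptions (a per-account key may be needed if the same symbol is held in several accounts); only the table and column names come from the list above.

```python
import sqlite3
from typing import Any, Dict

# Assumed DDL for one table; in the plan this would live in a numbered migration file.
SCHEMA = """
CREATE TABLE IF NOT EXISTS portfolio_snapshots (
    date         TEXT NOT NULL,
    symbol       TEXT NOT NULL,
    quantity     REAL NOT NULL,
    market_value REAL NOT NULL,
    cost_basis   REAL,
    sector       TEXT,
    PRIMARY KEY (date, symbol)
);
"""


def open_database(path: str) -> sqlite3.Connection:
    """Open the database in WAL mode and make sure the schema exists."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.executescript(SCHEMA)
    return conn


def insert_snapshot(conn: sqlite3.Connection, row: Dict[str, Any]) -> None:
    """Insert or overwrite one snapshot row inside a transaction."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO portfolio_snapshots "
            "(date, symbol, quantity, market_value, cost_basis, sector) "
            "VALUES (:date, :symbol, :quantity, :market_value, :cost_basis, :sector)",
            row,
        )
```

Using `INSERT OR REPLACE` keeps re-runs for the same date idempotent, which fits the "recreatable cache" principle.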

3.3 Output Artifacts

Per run:

  1. artifacts/reports/weekly/YYYY-MM-DD.json
  2. artifacts/reports/weekly/YYYY-MM-DD.md
  3. Event/audit stream entries (artifacts/events/ JSONL sink)
  4. Portfolio snapshot rows inserted into artifacts/bof.db

3.4 JSON Report Contract

The top-level keys are listed here; the full structure is defined by Pydantic models in src/bo_finance/models/report.py.

{
  "metadata": { "run_id": "uuid", "date": "YYYY-MM-DD", "trigger": "scheduled|manual" },
  "decisions": [],
  "performance": { "portfolio_return_pct": 0.0, "benchmark_return_pct": 0.0, "alpha": 0.0, "sharpe_ratio": 0.0, "max_drawdown_pct": 0.0, "period": "ytd" },
  "income": { "dividends_received": 0.0, "upcoming_ex_dates": [], "portfolio_yield": 0.0 },
  "allocation": { "targets": {}, "actual": {}, "drift": {}, "rebalance_actions": [] },
  "tax": { "realized_gains_st": 0.0, "realized_gains_lt": 0.0, "unrealized_losses_harvestable": [] },
  "risk_snapshot": { "concentration": {}, "sector_exposure": {}, "volatility": {}, "beta": 0.0, "correlation_flags": [], "breaches": [] },
  "action_items": []
}
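The contract above can be spot-checked without any dependencies. The sketch below is a stdlib-only stand-in for the Pydantic models, covering just the top-level keys and the trigger value; `validate_report` and the constant names are hypothetical.

```python
from typing import Any, Dict, List

# Expected container type for each top-level key of the report.
TOP_LEVEL_TYPES = {
    "metadata": dict,
    "decisions": list,
    "performance": dict,
    "income": dict,
    "allocation": dict,
    "tax": dict,
    "risk_snapshot": dict,
    "action_items": list,
}
ALLOWED_TRIGGERS = {"scheduled", "manual"}


def validate_report(report: Dict[str, Any]) -> List[str]:
    """Return a list of contract violations; an empty list means the shape is valid."""
    errors: List[str] = []
    for key, expected_type in TOP_LEVEL_TYPES.items():
        if key not in report:
            errors.append(f"missing key: {key}")
        elif not isinstance(report[key], expected_type):
            errors.append(f"{key}: expected {expected_type.__name__}")
    metadata = report.get("metadata")
    if isinstance(metadata, dict):
        trigger = metadata.get("trigger")
        if trigger not in ALLOWED_TRIGGERS:
            errors.append(f"metadata.trigger: unexpected value {trigger!r}")
    return errors
```

Returning a list of errors rather than raising on the first one makes it easier to report every violation in a single test failure.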

3.5 Markdown Report Sections

  1. Executive summary — agent-provided narrative when available, rule-based template fallback.
  2. Performance vs. benchmark — portfolio return vs. SPY/QQQ, alpha, Sharpe, max drawdown.
  3. Position/thesis decisions — one row per tracked symbol.
  4. Allocation drift — target vs. actual sector weights, rebalancing suggestions.
  5. Income summary — dividends received, upcoming ex-dates, portfolio yield.
  6. Tax snapshot — YTD realized gains (ST/LT), harvestable unrealized losses.
  7. Evidence table — columns: source, link, timestamp, extracted claim, decision it supports.
  8. Portfolio risk snapshot and breaches — concentration, beta, correlation flags, threshold violations.
  9. Insider & institutional signals — notable insider buys/sells, ownership changes for held symbols.
  10. Ranked action items — ordered by priority score descending.

3.6 Evidence Record Structure

Each evidence entry attached to a decision:

{
  "source_type": "sec_filing|news|market_data|macro|thesis",
  "source_url": "https://...",
  "retrieved_at": "ISO-8601",
  "claim": "extracted or summarized assertion",
  "attribution": "10-Q filing dated ..."
}
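One possible in-memory shape for this record is sketched below as a frozen dataclass with basic validation. The class name and the validation rules are assumptions; the field names and the allowed `source_type` values are taken from the structure above.

```python
from dataclasses import asdict, dataclass
from datetime import datetime

SOURCE_TYPES = {"sec_filing", "news", "market_data", "macro", "thesis"}


@dataclass(frozen=True)
class EvidenceRecord:
    source_type: str
    source_url: str
    retrieved_at: str  # ISO-8601 timestamp
    claim: str
    attribution: str

    def __post_init__(self) -> None:
        if self.source_type not in SOURCE_TYPES:
            raise ValueError(f"unknown source_type: {self.source_type!r}")
        # fromisoformat raises ValueError on malformed timestamps.
        datetime.fromisoformat(self.retrieved_at)
        if not self.claim.strip():
            raise ValueError("claim must not be empty")


record = EvidenceRecord(
    source_type="sec_filing",
    source_url="https://example.com/filing",  # placeholder URL
    retrieved_at="2024-01-05T12:00:00+00:00",
    claim="extracted or summarized assertion",
    attribution="10-Q filing dated ...",
)
as_json_ready = asdict(record)  # plain dict, ready for json.dumps
```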

4. Recommendation and Risk Logic

Recommendation enum: BUY_MORE, HOLD, TRIM, EXIT

4.1 Confidence Scoring

Each recommendation carries a confidence value (0.0–1.0) derived from rubric-based composite scoring:

  • Rubric dimensions: thesis alignment (0.40), news sentiment (0.30), valuation signal (0.30).
  • Agent-evaluated: the CLI agent reads rubric prompts with gathered data and assigns integer scores per dimension.
  • Composite: each score is normalized to 0.0–1.0, then combined via weighted average.
  • Confidence: fraction of total rubric weight that was actually scored (supports partial evaluation).
  • Action thresholds (applied to the composite score): >= 0.70 BUY_MORE, >= 0.40 HOLD, >= 0.20 TRIM, < 0.20 EXIT.

Rubric definitions are YAML files in investments/scoring/ with versioned scales, anchor descriptions, and evaluation prompt templates. Composite math is deterministic; agent judgment enters only through the per-rubric score assignment.
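The deterministic composite math can be sketched as follows. Two points are assumptions rather than facts from this plan: the integer scale is taken to be 1 to 5 (the real scales are defined per rubric in YAML), and the action thresholds are applied to the composite rather than to the confidence value. Function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

WEIGHTS = {"thesis_alignment": 0.40, "news_sentiment": 0.30, "valuation_signal": 0.30}
SCALE_MIN, SCALE_MAX = 1, 5  # assumed integer scale; real scales come from rubric YAML


def composite_and_confidence(scores: Dict[str, int]) -> Tuple[Optional[float], float]:
    """Return (composite, confidence) for the rubric dimensions that were scored."""
    weighted_sum = 0.0
    scored_weight = 0.0
    for name, raw in scores.items():
        weight = WEIGHTS[name]
        normalized = (raw - SCALE_MIN) / (SCALE_MAX - SCALE_MIN)  # map to 0.0-1.0
        weighted_sum += weight * normalized
        scored_weight += weight
    if scored_weight == 0:
        return None, 0.0  # nothing scored: no composite, zero confidence
    # Weighted average over scored dimensions; confidence is the scored weight share.
    return weighted_sum / scored_weight, scored_weight / sum(WEIGHTS.values())


def action_for(composite: float) -> str:
    """Map a composite score to the recommendation enum."""
    if composite >= 0.70:
        return "BUY_MORE"
    if composite >= 0.40:
        return "HOLD"
    if composite >= 0.20:
        return "TRIM"
    return "EXIT"
```

For example, scoring only thesis alignment at 5 and news sentiment at 3 yields a composite of about 0.79 with a confidence of 0.70, because 70% of the rubric weight was evaluated.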

4.2 Action Ranking

Action items are priority-ranked by:

  1. Risk breach severity (critical breaches surface first).
  2. Confidence level (higher confidence = higher priority).
  3. Recommendation urgency (EXIT > TRIM > BUY_MORE > HOLD).
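One way to read the ranking rules is as a lexicographic sort key, sketched below. This is an interpretation: the plan could equally intend a single weighted priority score. The `ActionItem` fields and the severity levels other than "critical" are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical severity levels; only "critical" is named in the plan.
SEVERITY_RANK = {None: 0, "info": 1, "warning": 2, "critical": 3}
URGENCY_RANK = {"HOLD": 0, "BUY_MORE": 1, "TRIM": 2, "EXIT": 3}


@dataclass
class ActionItem:
    symbol: str
    recommendation: str
    confidence: float
    breach_severity: Optional[str] = None


def rank_actions(items: List[ActionItem]) -> List[ActionItem]:
    """Sort by breach severity, then confidence, then urgency (all descending)."""
    return sorted(
        items,
        key=lambda item: (
            SEVERITY_RANK[item.breach_severity],
            item.confidence,
            URGENCY_RANK[item.recommendation],
        ),
        reverse=True,
    )
```

With this key, urgency only breaks ties between items that share both breach severity and confidence.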

4.3 Risk Dimensions

Quantitative metrics:

  • Position concentration: single name %, top-5 %, HHI.
  • Sector concentration: actual vs. target allocation drift.
  • Portfolio beta: weighted-average beta to SPY.
  • Correlation flags: pairs of holdings with rolling correlation > threshold.
  • Max drawdown: worst peak-to-trough over trailing window.
  • Sharpe ratio: excess return per unit of volatility.

Qualitative signals:

  • Insider transaction direction (net buy/sell) for held symbols.
  • Institutional ownership trend (increasing/decreasing).

Risk policy: config-driven from investments/risk-policy.yaml
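Three of the quantitative metrics above have compact standard definitions, sketched here with stdlib-only Python. The function names, the weekly default of 52 periods per year, and the choice to report a zero-volatility Sharpe ratio as 0.0 are assumptions for illustration.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def hhi(weights: Sequence[float]) -> float:
    """Herfindahl-Hirschman index: sum of squared portfolio weights (0-1 scale)."""
    return sum(w * w for w in weights)


def max_drawdown(values: Sequence[float]) -> float:
    """Worst peak-to-trough decline as a negative fraction (0.0 if none)."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


def sharpe_ratio(
    returns: Sequence[float],
    risk_free_per_period: float = 0.0,
    periods_per_year: int = 52,
) -> float:
    """Annualized mean excess return per unit of volatility (needs >= 2 returns)."""
    excess = [r - risk_free_per_period for r in returns]
    volatility = stdev(excess)
    if volatility == 0:
        return 0.0  # undefined in theory; reported as 0.0 in this sketch
    return mean(excess) / volatility * math.sqrt(periods_per_year)
```

`max_drawdown` assumes strictly positive portfolio values, and `hhi` assumes the weights sum to 1.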

4.4 Agent-External Intelligence Model

Key assumption: The BOF CLI does not call LLM APIs. All AI reasoning happens in the calling agent session. The CLI is a structured data tool; the agent interprets, evaluates, and writes back conclusions.

The recommendation enum and confidence score are always produced by rule-based logic over structured inputs. Agent judgment enters only through:

  • Score assignment via bof review score (integer per rubric dimension).
  • Evidence recording via bof review evidence (structured claims with attribution).
  • Narrative annotation via bof review annotate (prose sections like executive summary).

Design implications:

  1. No LLM SDK dependency in the BOF codebase.
  2. Any agent can drive the workflow: Claude Code, an Agent SDK app, or a human.
  3. Without agent evaluation, the CLI still produces a data-complete draft report with empty scores and template narratives.
  4. CLI outputs are designed for agent consumption: structured JSON, rendered prompts with context, and clean text sections from SEC filings.

5. Service Mode and Orchestration

5.1 Event backbone

Existing async pub/sub service (bof service run) with AsyncEventBus, pluggable EventSource producers, and EventSink consumers.

Event flow (CLI-driven, current):

  1. cron.tick trigger → projected workflow.request
  2. Weekly review handler executes pipeline (data gathering + draft report)
  3. Emits workflow.started, workflow.step, workflow.completed|workflow.failed

5.2 Agent orchestrator

The service mode evolves from "event bus with sinks" to "autonomous agent orchestrator with human-in-the-loop."

When a trigger fires (cron tick, Telegram message, alert), the service:

  1. Assembles context: trigger payload, investment policy (investments/policy.yaml), relevant long-term memory entries, current portfolio state summary.
  2. Spawns a headless agent session (e.g., claude -p "..." or Agent SDK subprocess) with the assembled context as system prompt and bof CLI as available tools.
  3. The agent executes the workflow autonomously — calling bof commands, reading output, reasoning, writing back scores/evidence/narrative.
  4. Captures agent output: report artifacts to disk, conversation summary to Telegram (if triggered by Telegram), events to audit log.
  5. Updates long-term memory with session outcomes (observations, decisions made, thesis changes noted).

Agent invocation contract:

  • Input: structured prompt with context + tool access to bof CLI commands.
  • Output: completed workflow artifacts (report JSON/MD) + optional conversation reply (for Telegram).
  • Timeout: configurable per workflow type (e.g., 10 minutes for weekly review).
  • Failure mode: on timeout or error, emit workflow.failed event, notify via Telegram if available.
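The invocation contract can be sketched with the stdlib subprocess module. This is an illustrative sketch, assuming a POSIX-style command string such as the plan's `claude -p` example; `run_agent` and `AgentResult` are hypothetical names, and mapping a failed result to a workflow.failed event is left to the caller.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass
class AgentResult:
    status: str  # "completed" or "failed"
    output: str
    reason: str = ""


def run_agent(agent_command: str, prompt: str, timeout_seconds: float) -> AgentResult:
    """Run a headless agent command with the prompt as its final argument."""
    argv = shlex.split(agent_command) + [prompt]
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising on timeout.
        return AgentResult("failed", "", "timeout")
    except OSError as exc:
        return AgentResult("failed", "", f"spawn error: {exc}")
    if proc.returncode != 0:
        return AgentResult("failed", proc.stdout, f"exit code {proc.returncode}")
    return AgentResult("completed", proc.stdout)
```

Passing the prompt as a separate argv element avoids shell quoting issues, since no shell is involved.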

5.3 Telegram bidirectional interface

Current state: receive-only (long-polling). Needs: send capability and conversational routing.

Proactive messages (service → human):

  • Weekly review summary after cron-triggered review completes.
  • Alert notifications (price, drift, earnings, ex-date).
  • Errors or failures that need human attention.

Reactive messages (human → service → agent → human):

  • Human sends query: "how is NVDA doing?", "what's my allocation drift?", "run a review now"
  • Service spawns agent session with query as input + portfolio context.
  • Agent uses bof commands to gather data, composes a reply.
  • Service sends reply back to the Telegram chat.

Implementation:

  • TelegramClient: new class for sending messages (sendMessage, sendPhoto for charts, callback ACK).
  • Conversation state tracked per chat_id, which enables multi-turn exchanges within a session.
  • Agent session receives Telegram thread context as short-term memory.
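The send path could start from something like the sketch below, which only builds the HTTP request for the Bot API `sendMessage` method and does not send it. The helper name is hypothetical, and a real TelegramClient would add error handling, retries, and message-length limits.

```python
import json
from urllib.request import Request


def build_send_message_request(token: str, chat_id: int, text: str) -> Request:
    """Build (but do not send) a Telegram Bot API sendMessage request."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Because the bot token is embedded in the URL, the URL itself should be treated as a secret and kept out of logs.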

5.4 Memory system

Short-term memory

  • Scope: per agent session, ephemeral.
  • Content: conversation thread (for Telegram multi-turn), current review state, data already fetched.
  • Storage: passed as prompt context to the agent session. Not persisted after session ends.

Long-term memory

  • Scope: across sessions, durable.
  • Content: past review outcomes, thesis evolution, cross-week observations, patterns ("AAPL has beaten earnings 4 quarters in a row"), what worked and what didn't.
  • Storage: artifacts/memory/ — structured files (YAML or JSON). Agent reads relevant entries at session start; writes new observations at session end via bof memory write --type observation --content "...".
  • Retrieval: agent can query with bof memory search --query "AAPL earnings" to find relevant past observations.
  • Decay: observations older than a configurable window (e.g., 6 months) are surfaced with lower relevance or archived.

Values and beliefs (investment policy)

  • Scope: rarely changes, human-edited.
  • Content: investment philosophy, risk tolerance, time horizon, sector convictions, behavioral guardrails.
  • Storage: investments/policy.yaml — version-controlled, human-authored.
  • Usage: loaded into every agent session as foundational context. Constrains recommendations — the agent should not suggest actions that violate the policy without explicitly flagging the conflict.

5.5 Service CLI surface

# Existing
bof service run --telegram --cron weekly_review=1w --event-log artifacts/events.jsonl

# New flags:
#   --agent-command  headless agent binary + flags
#   --agent-timeout  seconds per agent session
#   --memory-dir     long-term memory directory
bof service run \
  --agent-command "claude -p" \
  --agent-timeout 600 \
  --memory-dir artifacts/memory

# Memory management
bof memory list [--type observation|pattern]
bof memory search --query "..."
bof memory write --type observation --content "..."
bof memory forget ID

6. Phased Delivery

Phase 1: Weekly Backbone (DONE)

Delivered: typed brokerage outputs via SnapTrade SDK, weekly orchestrator with on-demand CLI, report artifact persistence (JSON + MD), baseline ingestion (portfolio, market, watchlist, news), macro ingestion via FRED.

Phase 2a: Scoring Rubrics (DONE)

Delivered: rubric YAML schema + loader/validator, three initial rubrics (thesis_alignment, news_sentiment, valuation_signal), composite scoring with weighted average and action thresholds, bof review rubric/score/finalize CLI commands, rubric prompt rendering in --dry-run mode.

Phase 2b: Thesis System + SEC Filings + Agent Write-Back (TODO — next up)

Scope:

  1. Thesis YAML schema + loader/validator.
  2. bof thesis command surface: bof thesis list, bof thesis show SYMBOL, bof thesis check SYMBOL (outputs thesis conditions vs. current market/position state; quantitative conditions are evaluated by the CLI, qualitative conditions are presented as context for the agent).
  3. Migration from watchlist to thesis files.
  4. Thesis context feeds into thesis_alignment rubric prompts.
  5. SEC filing adapter per §2.5: bof sec filings SYMBOL (filing index from EDGAR), bof sec read SYMBOL --type 10-K (fetch + clean filing into readable text sections: risk factors, MD&A, guidance; deterministic HTML cleaning, no LLM).
  6. Agent write-back commands: bof review evidence SYMBOL --source-type TYPE --claim "..." --source-url URL --attribution "..." (record structured evidence), bof review annotate --run-id ID --field FIELD --value "..." (write narrative sections back into report).

Exit criteria:

  1. Every recommendation maps to thesis conditions + evidence.
  2. SEC filings are fetchable and readable via CLI; agent can extract and record evidence from them.
  3. Agent can write evidence and narrative back into the report via CLI commands.

Phase 3: Risk Engine + Portfolio Analytics

Scope:

  1. Implement risk metrics per §4.3 (concentration, beta, correlation, Sharpe, max drawdown).
  2. Breach detection with severity per investments/risk-policy.yaml.
  3. Risk-informed action generation.
  4. Insider transaction and institutional ownership signals per §2.1.

Exit criteria:

  1. Report includes quantitative risk scores, qualitative signals, breaches, and mitigations.

Phase 4: Decision Layer

Scope:

  1. Combine thesis + risk + constraints into final decisions.
  2. Confidence scoring per §4.1, action ranking per §4.2.
  3. JSON output conforms to §3.4, markdown output includes all §3.5 sections delivered so far.

Exit criteria:

  1. Every decision carries confidence score and linked evidence.
  2. Action items are priority-ranked.
  3. Report is directly actionable as weekly review checklist.

Phase 5: Storage Migration + Performance, Income & Tax Tracking

Scope:

  1. SQLite database bootstrap per §3.2: schema creation, DatabaseService, migration runner.
  2. Artifacts directory setup per §3.1: move outputs from investments/reports/ to artifacts/, gitignore, update path constants.
  3. Performance measurement: TWR, benchmark comparison (vs. SPY), alpha.
  4. Dividend tracking: yield, ex-dates, income received, yield-on-cost.
  5. Cost basis enrichment: holding period (ST/LT), realized gains/losses.
  6. Tax-loss harvesting signals: unrealized losses with ST/LT flag and wash-sale window check.
  7. Historical trend analysis: cross-report comparison over time.
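Two items in this scope reduce to short calculations, sketched below under stated assumptions. The time-weighted return (TWR) chain-links sub-period returns and assumes each external cash flow lands at the start of its sub-period. The wash-sale check uses the 30-days-before-or-after window and expects the caller to pass only purchases of the same security other than the lot being sold. Function names and input shapes are hypothetical.

```python
from datetime import date
from typing import Iterable, Sequence, Tuple


def time_weighted_return(periods: Sequence[Tuple[float, float, float]]) -> float:
    """Chain-link sub-period returns.

    Each period is (start_value, net_cash_flow_at_start, end_value).
    """
    growth = 1.0
    for start_value, cash_flow, end_value in periods:
        base = start_value + cash_flow
        if base <= 0:
            raise ValueError("sub-period base value must be positive")
        growth *= end_value / base  # sub-period growth factor, flow excluded
    return growth - 1.0


def in_wash_sale_window(
    sale_date: date, other_purchase_dates: Iterable[date], window_days: int = 30
) -> bool:
    """True if any other purchase falls within `window_days` of the sale."""
    return any(
        abs((purchase - sale_date).days) <= window_days
        for purchase in other_purchase_dates
    )
```

Because each sub-period return is measured on the post-flow base, deposits and withdrawals do not distort the result, which is what makes TWR suitable for comparison against a benchmark such as SPY.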

Phase 6: Allocation & Rebalancing

Scope:

  1. Target allocation per §2.6, drift computation, rebalancing suggestions.
  2. Tax-aware trade sizing (prefer selling LT gains, harvest ST losses).
  3. Position sizing by conviction tier from thesis definitions.
  4. Cash reserve tracking: actual vs. target dry powder.

Phase 7: Alerts & Threshold Monitoring

Scope:

  1. Portfolio-level event predicates wired into existing event service.
  2. Price, drift, earnings proximity, and dividend ex-date alerts.
  3. Alert rules defined in investments/alerts.yaml.
  4. CLI: bof alert list/add/remove.

Phase 8: Service Mode — Agent Orchestration + Telegram + Memory

Scope:

  1. Agent orchestrator: spawn headless agent sessions from event handlers (configurable agent command, timeout, output capture).
  2. Telegram send capability: TelegramClient for sendMessage, formatted replies, callback ACK.
  3. Telegram conversational routing: human query → agent session → reply. Multi-turn state per chat_id.
  4. Investment policy: investments/policy.yaml schema, loaded as agent context.
  5. Long-term memory: artifacts/memory/ store, bof memory list/search/write/forget CLI commands.
  6. Cron → agent: cron tick triggers full agent-driven weekly review without human intervention.
  7. Proactive notifications: review summaries and alert messages sent to Telegram after completion.

Exit criteria:

  1. bof service run with cron trigger spawns a headless agent that completes a weekly review end-to-end.
  2. Telegram bot receives a human query, spawns an agent, and sends back a composed reply.
  3. Agent sessions load investment policy and relevant long-term memory as foundational context.
  4. Agent sessions write observations to long-term memory after completing a workflow.

7. Test Infrastructure (Cross-Cutting)

  1. Fixture layering: each phase builds on prior fixtures.
  2. Recorded responses: all external API calls captured as fixtures. No live calls in CI.
  3. Schema validation: shared validator enforces §3.4 contract.
  4. Injectable dependencies: run_id and timestamps are injectable so tests control non-deterministic inputs.

8. Security and Traceability

  1. Secrets never logged.
  2. Redaction applied to serialized event payloads.
  3. Recommendations carry source/timestamp provenance.
  4. All workflow interactions and agent write-backs are auditable.
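Item 2 above (redaction of serialized event payloads) could look roughly like the recursive sketch below. The key-name heuristics and the `[REDACTED]` marker are assumptions; the actual redaction rules are not specified in this plan.

```python
from typing import Any

# Assumed substrings that mark a key as sensitive (case-insensitive).
SENSITIVE_KEY_PARTS = ("token", "secret", "password", "api_key", "authorization")
REDACTED = "[REDACTED]"


def _is_sensitive(key: Any) -> bool:
    lowered = str(key).lower()
    return any(part in lowered for part in SENSITIVE_KEY_PARTS)


def redact(payload: Any) -> Any:
    """Return a copy of `payload` with values under sensitive keys masked."""
    if isinstance(payload, dict):
        return {
            key: REDACTED if _is_sensitive(key) else redact(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact(item) for item in payload]
    return payload
```

Applying `redact` just before serialization leaves the in-memory event untouched, because the function builds new containers instead of mutating its input.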