Implementation Plan: Weekly Investment Decision System

This document captures implementation details that support docs/spec.md.

1. Execution Model

Scheduled run via event service cron.
On-demand run via CLI command.
Advisory-only output; a human decision is required for any trade.

2. Data Inputs

2.1 Market Data

Provider: Finnhub (free tier, 60 req/min)

Index performance: SPY, QQQ, DIA, IWM (day change via quote endpoint).
Sector ETF performance: XLK, XLF, XLE, XLV, XLY, XLP, XLI, XLU, XLRE, XLB.
Ticker quotes: price, day change, 52-week range, P/E (via quote + basic financials endpoints).
Earnings calendar dates (via earnings calendar endpoint).
Ticker news headlines with titles, publishers, and links (via company news endpoint).
Dividend yield, ex-dates, and payout ratio (via basic financials endpoint).
Insider transactions (via stock insider transactions endpoint).
Institutional ownership changes (via institutional ownership endpoint).

Note: Candle/historical data requires a paid tier, so period performance in the market overview falls back to the day change from the quote endpoint.

2.2 Brokerage Data

Provider: SnapTrade (CLI today, SDK candidate)

Accounts and balances.
Positions across accounts (including cost basis and holding period per lot).
Order history.
Dividend history and upcoming ex-dates.
Realized gains/losses (for tax-lot awareness).

2.3 Macro/Economic Data

Provider: FRED API + derived market-implied signals

FRED indicators (initial): FEDFUNDS, DGS2, DGS10, CPIAUCSL, PCEPI, UNRATE, ICSA

Derived signals:
- VIX trend
- Sector rotation
- Short/medium momentum snapshots

2.4 Web Search

Provider: Brave Search API (free tier, 2000 queries/month)

Per-symbol news search: "{symbol} stock news latest"
Per-symbol analyst coverage: "{symbol} analyst rating outlook"
Results include title, description, URL, published date, source domain
Enriches rubric evaluation prompts in weekly review (appended to {news_context})
Also available as standalone CLI: bof search web, bof search symbol

2.5 SEC Filings

Provider: SEC EDGAR APIs (rate limited + required User-Agent)

Initial filing scope: 10-K, 10-Q, 8-K, 13F, Form 4

Adapter target: src/bo_finance/libs/sec_data.py
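The two constraints named for EDGAR (rate limiting and a required User-Agent) can be handled in a thin request layer. The following is a minimal sketch, not the contents of sec_data.py: the Throttle class, the build_edgar_request function, the contact string, and the 0.1-second minimum interval are all assumptions made for illustration, and the code only builds the request object rather than sending it.

```python
import time
import urllib.request


class Throttle:
    """Enforces a minimum gap between calls (hypothetical helper)."""

    def __init__(self, min_interval_s=0.1, clock=time.monotonic, sleep=time.sleep):
        self._min_interval_s = min_interval_s
        self._clock = clock      # injectable so tests need no real waiting
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        now = self._clock()
        if self._last_call is not None:
            remaining = self._min_interval_s - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


def build_edgar_request(url, user_agent, throttle):
    """Return a urllib Request carrying the User-Agent header, after throttling."""
    if not user_agent.strip():
        raise ValueError("EDGAR requests need a descriptive User-Agent")
    throttle.wait()
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Usage: the URL and contact string are placeholders for illustration.
throttle = Throttle(min_interval_s=0.1)
request = build_edgar_request(
    "https://data.sec.gov/submissions/CIK0000320193.json",
    "bo-finance-example contact@example.com",
    throttle,
)
print(request.get_header("User-agent"))
```

Keeping the clock and sleep functions injectable means the throttle can be tested without real delays, which fits the recorded-fixture approach used for other external APIs.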

2.6 Portfolio Targets

Primary tracking source: local YAML file

File: investments/allocation.yaml
Defines target asset allocation by sector (e.g., Tech: 0.40, Healthcare: 0.20).
Defines max single-position concentration (e.g., 0.10).
Defines target cash reserve (e.g., 0.05).
Defines position sizing tiers by conviction level (full, half, starter).
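To make the role of these targets concrete, here is a small sketch of how sector drift and the single-position limit could be computed from them. The dict layout stands in for the parsed allocation.yaml and is hypothetical, as are the function names and the sample positions; only the example numbers (0.40, 0.20, 0.10, 0.05) come from the text above.

```python
def sector_weights(positions, cash):
    """Weight of each sector (and cash) as a fraction of total portfolio value."""
    total = cash + sum(value for _, _, value in positions)
    weights = {"Cash": cash / total}
    for _, sector, value in positions:
        weights[sector] = weights.get(sector, 0.0) + value / total
    return weights


def drift(targets, actual):
    """Actual minus target per sector; positive means overweight."""
    sectors = sorted(set(targets) | set(actual))
    return {s: round(actual.get(s, 0.0) - targets.get(s, 0.0), 4) for s in sectors}


def oversized_positions(positions, cash, max_position):
    """Symbols whose portfolio weight exceeds the single-position limit."""
    total = cash + sum(value for _, _, value in positions)
    return [symbol for symbol, _, value in positions if value / total > max_position]


# Hypothetical stand-in for the parsed allocation.yaml.
allocation = {
    "sectors": {"Tech": 0.40, "Healthcare": 0.20},
    "max_position": 0.10,
    "cash_reserve": 0.05,
}
# Hypothetical holdings as (symbol, sector, market_value).
positions = [
    ("NVDA", "Tech", 4700.0),
    ("JNJ", "Healthcare", 1500.0),
    ("XOM", "Energy", 500.0),
]
cash = 3300.0

actual = sector_weights(positions, cash)
targets = dict(allocation["sectors"], Cash=allocation["cash_reserve"])
sector_drift = drift(targets, actual)
too_large = oversized_positions(positions, cash, allocation["max_position"])
print(sector_drift)
print(too_large)
```

Sectors that appear in holdings but not in the targets (Energy here) are treated as having a target of zero, which is one possible convention rather than a stated rule.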

2.7 Thesis Definitions

Primary tracking source: local YAML files

Directory: investments/theses/
One thesis file per symbol
Contains conditions for add/trim/exit + disconfirming signals

Watchlist transition: the existing watchlist.txt is transitional input and will be replaced by thesis files.

2.8 Investment Policy

Primary tracking source: local YAML file

File: investments/policy.yaml
Defines investment philosophy, risk tolerance, time horizon.
Defines behavioral constraints (e.g., "no shorting", "no options", "DCA into conviction positions").
Defines sector convictions and preferences.
Loaded into every agent session as foundational context — constrains what the agent should recommend.
Equivalent to an Investment Policy Statement (IPS) in traditional advisory.

3. Storage Architecture

3.1 Directory Separation

The investments/ directory currently mixes human-edited configuration (inputs) with generated artifacts (outputs). These concerns are separated as follows:

investments/ (human-edited, version-controlled inputs):
- theses/: per-symbol thesis YAML files.
- scoring/: rubric definition YAML files.
- allocation.yaml: target allocation policy.
- risk-policy.yaml: risk thresholds and breach rules.
- alerts.yaml: user-defined alert rules.
- policy.yaml: investment policy statement (values, beliefs, constraints).
- watchlist.txt: transitional, replaced by thesis files.

artifacts/ (generated outputs, gitignored):
- reports/weekly/YYYY-MM-DD.json: structured weekly report.
- reports/weekly/YYYY-MM-DD.md: rendered markdown report.
- events/: JSONL audit logs from event service.
- memory/: agent long-term memory (observations, patterns, session summaries).
- bof.db: SQLite database for time-series and accumulated state.

The artifacts/ directory is gitignored because it contains machine-generated data specific to each user's brokerage accounts.

3.2 SQLite Database

Provider: Python stdlib sqlite3 (no new dependency).

The database stores accumulated state that needs querying across time — data that doesn't fit naturally into per-run JSON files or human-edited YAML.

Tables:

portfolio_snapshots — periodic position snapshots for TWR calculation. Columns: date, symbol, quantity, market_value, cost_basis, sector.
dividends — dividend events. Columns: date, symbol, amount, type (qualified/ordinary).
transactions — order fills for realized gain/loss tracking. Columns: date, symbol, action, quantity, price, lot_date (for ST/LT classification).
rubric_scores — historical scores for trend analysis. Columns: run_id, date, symbol, metric, score, rationale.
alerts — active alert rules and trigger history. Columns: id, symbol, condition, created_at, last_triggered_at, cooldown_seconds.
performance_cache — precomputed performance metrics. Columns: date, period, portfolio_return, benchmark_return, alpha, sharpe, max_drawdown.

Design principles:
- Write-ahead logging (WAL mode) for safe concurrent reads during event service runs.
- Schema migrations via numbered SQL files in src/bo_finance/migrations/.
- Database is recreatable from brokerage API history; it is a cache of computed state, not a source of truth.
- All queries go through a DatabaseService in src/bo_finance/services/database.py, not raw SQL in callers.
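As an illustration of these principles, the sketch below opens a database in WAL mode and routes reads and writes through a small service class. It is a minimal sketch, not the actual DatabaseService: the method names, the column types, and the (date, symbol) primary key are assumptions, and only the portfolio_snapshots columns are taken from the table list above.

```python
import os
import sqlite3
import tempfile

# Column names follow the portfolio_snapshots description; types and the
# primary key are assumptions for this example.
SCHEMA = """
CREATE TABLE IF NOT EXISTS portfolio_snapshots (
    date TEXT NOT NULL,
    symbol TEXT NOT NULL,
    quantity REAL NOT NULL,
    market_value REAL NOT NULL,
    cost_basis REAL,
    sector TEXT,
    PRIMARY KEY (date, symbol)
);
"""


class DatabaseService:
    """Hypothetical wrapper so callers never issue raw SQL themselves."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        # WAL lets readers proceed while a writer is active.
        self._conn.execute("PRAGMA journal_mode=WAL")
        self._conn.executescript(SCHEMA)

    def journal_mode(self):
        return self._conn.execute("PRAGMA journal_mode").fetchone()[0]

    def record_snapshot(self, date, symbol, quantity, market_value, cost_basis, sector):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO portfolio_snapshots VALUES (?, ?, ?, ?, ?, ?)",
                (date, symbol, quantity, market_value, cost_basis, sector),
            )

    def total_value_by_date(self):
        query = (
            "SELECT date, SUM(market_value) FROM portfolio_snapshots "
            "GROUP BY date ORDER BY date"
        )
        return self._conn.execute(query).fetchall()

    def close(self):
        self._conn.close()


# Usage with a throwaway file (WAL mode needs an on-disk database).
with tempfile.TemporaryDirectory() as tmp:
    db = DatabaseService(os.path.join(tmp, "bof.db"))
    mode = db.journal_mode()
    db.record_snapshot("2024-01-05", "NVDA", 10, 4700.0, 3000.0, "Tech")
    db.record_snapshot("2024-01-05", "JNJ", 10, 1500.0, 1400.0, "Healthcare")
    totals = db.total_value_by_date()
    db.close()
print(mode, totals)
```

Using INSERT OR REPLACE keeps a re-run for the same date idempotent, which suits a database that is meant to be recreatable from brokerage history.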

3.3 Output Artifacts

Per run:
1. artifacts/reports/weekly/YYYY-MM-DD.json
2. artifacts/reports/weekly/YYYY-MM-DD.md
3. Event/audit stream entries (artifacts/events/ JSONL sink)
4. Portfolio snapshot rows inserted into artifacts/bof.db

3.4 JSON Report Contract

The top-level keys are listed here with placeholder values. The full structure is defined by Pydantic models in src/bo_finance/models/report.py.

{
  "metadata": { "run_id": "uuid", "date": "YYYY-MM-DD", "trigger": "scheduled|manual" },
  "decisions": [],
  "performance": { "portfolio_return_pct": 0.0, "benchmark_return_pct": 0.0, "alpha": 0.0, "sharpe_ratio": 0.0, "max_drawdown_pct": 0.0, "period": "ytd" },
  "income": { "dividends_received": 0.0, "upcoming_ex_dates": [], "portfolio_yield": 0.0 },
  "allocation": { "targets": {}, "actual": {}, "drift": {}, "rebalance_actions": [] },
  "tax": { "realized_gains_st": 0.0, "realized_gains_lt": 0.0, "unrealized_losses_harvestable": [] },
  "risk_snapshot": { "concentration": {}, "sector_exposure": {}, "volatility": {}, "beta": 0.0, "correlation_flags": [], "breaches": [] },
  "action_items": []
}
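A contract like this can be checked cheaply before a report is written. The sketch below is a stdlib-only stand-in for that idea and is not the Pydantic models referenced above: the validate_report function and its messages are hypothetical, and it checks only the presence and container type of the top-level keys plus the two allowed trigger values.

```python
# Top-level keys and container types taken from the JSON contract above.
REQUIRED_KEYS = {
    "metadata": dict,
    "decisions": list,
    "performance": dict,
    "income": dict,
    "allocation": dict,
    "tax": dict,
    "risk_snapshot": dict,
    "action_items": list,
}
ALLOWED_TRIGGERS = {"scheduled", "manual"}


def validate_report(report):
    """Return a list of human-readable problems; an empty list means it passed."""
    problems = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in report:
            problems.append(f"missing key: {key}")
        elif not isinstance(report[key], expected):
            problems.append(f"{key} should be a {expected.__name__}")
    metadata = report.get("metadata")
    if isinstance(metadata, dict) and metadata.get("trigger") not in ALLOWED_TRIGGERS:
        problems.append("metadata.trigger must be 'scheduled' or 'manual'")
    return problems


# Usage: a draft that lacks the tax block and uses an unknown trigger.
draft = {
    "metadata": {"run_id": "example", "date": "2024-01-05", "trigger": "cron"},
    "decisions": [],
    "performance": {},
    "income": {},
    "allocation": {},
    "risk_snapshot": {},
    "action_items": [],
}
problems = validate_report(draft)
print(problems)
```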

3.5 Markdown Report Sections

Executive summary — agent-provided narrative when available, rule-based template fallback.
Performance vs. benchmark — portfolio return vs. SPY/QQQ, alpha, Sharpe, max drawdown.
Position/thesis decisions — one row per tracked symbol.
Allocation drift — target vs. actual sector weights, rebalancing suggestions.
Income summary — dividends received, upcoming ex-dates, portfolio yield.
Tax snapshot — YTD realized gains (ST/LT), harvestable unrealized losses.
Evidence table — columns: source, link, timestamp, extracted claim, decision it supports.
Portfolio risk snapshot and breaches — concentration, beta, correlation flags, threshold violations.
Insider & institutional signals — notable insider buys/sells, ownership changes for held symbols.
Ranked action items — ordered by priority score descending.

3.6 Evidence Record Structure

Each evidence entry attached to a decision:

{
  "source_type": "sec_filing|news|market_data|macro|thesis",
  "source_url": "https://...",
  "retrieved_at": "ISO-8601",
  "claim": "extracted or summarized assertion",
  "attribution": "10-Q filing dated ..."
}

4. Recommendation and Risk Logic

Recommendation enum: BUY_MORE, HOLD, TRIM, EXIT

4.1 Confidence Scoring

Each recommendation carries a confidence value (0.0–1.0) derived from rubric-based composite scoring:

Rubric dimensions: thesis alignment (0.40), news sentiment (0.30), valuation signal (0.30).
Agent-evaluated: the CLI agent reads rubric prompts with gathered data and assigns integer scores per dimension.
Composite: each score is normalized to 0.0–1.0, then combined via weighted average.
Confidence: fraction of total rubric weight that was actually scored (supports partial evaluation).
Action thresholds (applied to the composite score): >= 0.70 BUY_MORE, >= 0.40 HOLD, >= 0.20 TRIM, < 0.20 EXIT.

Rubric definitions are YAML files in investments/scoring/ with versioned scales, anchor descriptions, and evaluation prompt templates. Composite math is deterministic; agent judgment enters only through the per-rubric score assignment.
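The composite math described above can be sketched as follows. Several details are assumptions made for the example rather than facts from the rubric files: the 1 to 5 integer scale, the choice to renormalize the weighted average over only the dimensions that were scored, and the reading that the action thresholds apply to the composite score. The weights and threshold values are the ones listed above.

```python
# Weights from the rubric dimensions listed above.
WEIGHTS = {"thesis_alignment": 0.40, "news_sentiment": 0.30, "valuation_signal": 0.30}
# Assumption: integer scores on a 1..5 scale (the real scales live in the rubric YAML).
SCALE_MIN, SCALE_MAX = 1, 5
# Thresholds from the text, checked from highest to lowest.
THRESHOLDS = [(0.70, "BUY_MORE"), (0.40, "HOLD"), (0.20, "TRIM")]


def evaluate(scores):
    """Return (composite, confidence, action) for a dict of dimension -> score.

    Dimensions that are missing or None count as unscored.
    """
    scored = {k: v for k, v in scores.items() if k in WEIGHTS and v is not None}
    scored_weight = sum(WEIGHTS[k] for k in scored)
    confidence = scored_weight / sum(WEIGHTS.values())
    if not scored:
        return 0.0, 0.0, None  # nothing scored, so no recommendation

    def normalize(score):
        return (score - SCALE_MIN) / (SCALE_MAX - SCALE_MIN)

    # Assumption: average over the weight that was actually scored.
    composite = sum(WEIGHTS[k] * normalize(v) for k, v in scored.items()) / scored_weight
    action = "EXIT"
    for floor, label in THRESHOLDS:
        if composite >= floor:
            action = label
            break
    return composite, confidence, action


# Usage: a fully scored symbol and a partially scored one.
full = evaluate({"thesis_alignment": 5, "news_sentiment": 3, "valuation_signal": 4})
partial = evaluate({"thesis_alignment": 3, "news_sentiment": None})
print(full, partial)
```

With the full example, the normalized scores are 1.0, 0.5, and 0.75, so the composite is 0.40 × 1.0 + 0.30 × 0.5 + 0.30 × 0.75 = 0.775. In the partial example only 0.40 of the rubric weight is scored, which is what the confidence value reports.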

4.2 Action Ranking

Action items are priority-ranked by:
1. Risk breach severity (critical breaches surface first).
2. Confidence level (higher confidence = higher priority).
3. Recommendation urgency (EXIT > TRIM > BUY_MORE > HOLD).
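One way to read these three criteria is as a lexicographic sort key, where each later criterion only breaks ties in the earlier ones. That reading is an assumption (a weighted priority score would be another option), as are the severity labels and the item fields used in this sketch.

```python
# Assumption: breach severities are labelled "critical" and "warning"; None means no breach.
SEVERITY_RANK = {"critical": 0, "warning": 1, None: 2}
# Urgency order from the text: EXIT > TRIM > BUY_MORE > HOLD.
URGENCY_RANK = {"EXIT": 0, "TRIM": 1, "BUY_MORE": 2, "HOLD": 3}


def rank_action_items(items):
    """Sort by breach severity, then confidence (descending), then urgency."""
    return sorted(
        items,
        key=lambda item: (
            SEVERITY_RANK[item.get("breach_severity")],
            -item["confidence"],
            URGENCY_RANK[item["recommendation"]],
        ),
    )


# Usage with hypothetical action items.
items = [
    {"symbol": "AAA", "recommendation": "HOLD", "confidence": 0.9, "breach_severity": None},
    {"symbol": "BBB", "recommendation": "TRIM", "confidence": 0.5, "breach_severity": "critical"},
    {"symbol": "CCC", "recommendation": "EXIT", "confidence": 0.9, "breach_severity": None},
    {"symbol": "DDD", "recommendation": "BUY_MORE", "confidence": 0.6, "breach_severity": "warning"},
]
ranked_symbols = [item["symbol"] for item in rank_action_items(items)]
print(ranked_symbols)
```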

4.3 Risk Dimensions

Quantitative metrics:
- Position concentration: single name %, top-5 %, HHI (Herfindahl-Hirschman Index).
- Sector concentration: actual vs. target allocation drift.
- Portfolio beta: weighted-average beta to SPY.
- Correlation flags: pairs of holdings with rolling correlation > threshold.
- Max drawdown: worst peak-to-trough over trailing window.
- Sharpe ratio: excess return per unit of volatility.

Qualitative signals:
- Insider transaction direction (net buy/sell) for held symbols.
- Institutional ownership trend (increasing/decreasing).

Risk policy: config-driven from investments/risk-policy.yaml.
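Three of the quantitative metrics have short closed-form definitions, sketched below under stated assumptions: weights are fractions that sum to 1 (so HHI falls between 0 and 1 rather than on the 0 to 10,000 scale sometimes used), per-symbol betas are supplied by the caller, and the function names and sample numbers are illustrative only.

```python
def hhi(weights):
    """Herfindahl-Hirschman Index: sum of squared position weights."""
    return sum(w * w for w in weights.values())


def top_n_share(weights, n=5):
    """Combined weight of the n largest positions."""
    return sum(sorted(weights.values(), reverse=True)[:n])


def portfolio_beta(weights, betas):
    """Weighted-average beta across holdings."""
    return sum(w * betas[symbol] for symbol, w in weights.items())


def max_drawdown(values):
    """Worst peak-to-trough decline as a negative fraction (0.0 if none)."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


# Usage with hypothetical holdings and a hypothetical value series.
weights = {"AAA": 0.5, "BBB": 0.3, "CCC": 0.2}
betas = {"AAA": 1.2, "BBB": 0.8, "CCC": 1.0}
series = [100.0, 120.0, 90.0, 110.0, 80.0]

print(hhi(weights), top_n_share(weights, n=2))
print(portfolio_beta(weights, betas), max_drawdown(series))
```

In the sample series the peak is 120 and the later low is 80, so the maximum drawdown is 80 / 120 − 1, roughly −33%.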

4.4 Agent-External Intelligence Model

Key assumption: The BOF CLI does not call LLM APIs. All AI reasoning happens in the calling agent session. The CLI is a structured data tool; the agent interprets, evaluates, and writes back conclusions.

The recommendation enum and confidence score are always produced by rule-based logic over structured inputs. Agent judgment enters only through:
- Score assignment via bof review score (integer per rubric dimension).
- Evidence recording via bof review evidence (structured claims with attribution).
- Narrative annotation via bof review annotate (prose sections like executive summary).

Design implications:
1. No LLM SDK dependency in the BOF codebase.
2. Any agent can drive the workflow: Claude Code, Agent SDK app, or a human.
3. Without agent evaluation, the CLI still produces a data-complete draft report with empty scores and template narratives.
4. CLI outputs are designed for agent consumption: structured JSON, rendered prompts with context, and clean text sections from SEC filings.

5. Service Mode and Orchestration

5.1 Event backbone

Existing async pub/sub service (bof service run) with AsyncEventBus, pluggable EventSource producers, and EventSink consumers.

Event flow (CLI-driven, current):
1. cron.tick trigger → projected workflow.request
2. Weekly review handler executes pipeline (data gathering + draft report)
3. Emits workflow.started, workflow.step, workflow.completed|workflow.failed

5.2 Agent orchestrator

The service mode evolves from "event bus with sinks" to "autonomous agent orchestrator with human-in-the-loop."

When a trigger fires (cron tick, Telegram message, alert), the service:

Assembles context: trigger payload, investment policy (investments/policy.yaml), relevant long-term memory entries, current portfolio state summary.
Spawns a headless agent session (e.g., claude -p "..." or Agent SDK subprocess) with the assembled context as system prompt and bof CLI as available tools.
The agent executes the workflow autonomously — calling bof commands, reading output, reasoning, writing back scores/evidence/narrative.
Captures agent output: report artifacts to disk, conversation summary to Telegram (if triggered by Telegram), events to audit log.
Updates long-term memory with session outcomes (observations, decisions made, thesis changes noted).

Agent invocation contract:
- Input: structured prompt with context + tool access to bof CLI commands.
- Output: completed workflow artifacts (report JSON/MD) + optional conversation reply (for Telegram).
- Timeout: configurable per workflow type (e.g., 10 minutes for weekly review).
- Failure mode: on timeout or error, emit workflow.failed event, notify via Telegram if available.
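The timeout and failure-mode parts of this contract map naturally onto a subprocess call. The sketch below is a minimal illustration under assumptions: run_agent and its return shape are hypothetical, the prompt is passed as the final command-line argument, and the Python interpreter stands in for the real agent binary so the example runs anywhere.

```python
import shlex
import subprocess
import sys


def run_agent(agent_command, prompt, timeout_s):
    """Run a headless agent command and map the outcome to a workflow event."""
    argv = shlex.split(agent_command) + [prompt]
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising on timeout.
        return {"event": "workflow.failed", "reason": "timeout", "output": ""}
    if proc.returncode != 0:
        reason = f"exit code {proc.returncode}"
        return {"event": "workflow.failed", "reason": reason, "output": proc.stderr}
    return {"event": "workflow.completed", "reason": None, "output": proc.stdout}


# Stand-in "agents": the Python interpreter echoing or sleeping.
python = shlex.quote(sys.executable)
echo_agent = python + " -c " + shlex.quote("import sys; print(sys.argv[1].upper())")
slow_agent = python + " -c " + shlex.quote("import time; time.sleep(5)")

ok = run_agent(echo_agent, "run weekly review", timeout_s=30)
timed_out = run_agent(slow_agent, "run weekly review", timeout_s=0.5)
print(ok["event"], ok["output"].strip())
print(timed_out["event"], timed_out["reason"])
```

Returning a plain dict keeps the orchestrator free to decide how the outcome is emitted on the event bus or forwarded to Telegram.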

5.3 Telegram bidirectional interface

Current state: receive-only (long-polling). Needs: send capability and conversational routing.

Proactive messages (service → human):
- Weekly review summary after cron-triggered review completes.
- Alert notifications (price, drift, earnings, ex-date).
- Errors or failures that need human attention.

Reactive messages (human → service → agent → human):
- Human sends query: "how is NVDA doing?", "what's my allocation drift?", "run a review now"
- Service spawns agent session with query as input + portfolio context.
- Agent uses bof commands to gather data, composes a reply.
- Service sends reply back to the Telegram chat.

Implementation:
- TelegramClient: new class for sending messages (sendMessage, sendPhoto for charts, callback ACK).
- Conversation state tracked per chat_id, enabling multi-turn exchanges within a session.
- Agent session receives Telegram thread context as short-term memory.

5.4 Memory system

Short-term memory

Scope: per agent session, ephemeral.
Content: conversation thread (for Telegram multi-turn), current review state, data already fetched.
Storage: passed as prompt context to the agent session. Not persisted after session ends.

Long-term memory

Scope: across sessions, durable.
Content: past review outcomes, thesis evolution, cross-week observations, patterns ("AAPL has beaten earnings 4 quarters in a row"), what worked and what didn't.
Storage: artifacts/memory/ — structured files (YAML or JSON). Agent reads relevant entries at session start; writes new observations at session end via bof memory write --type observation --content "...".
Retrieval: agent can query with bof memory search --query "AAPL earnings" to find relevant past observations.
Decay: observations older than a configurable window (e.g., 6 months) are surfaced with lower relevance or archived.
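The decay rule leaves the exact weighting open. As one possible interpretation, the sketch below keeps full relevance inside the window and then halves it for every additional window of age. The function name, the halving schedule, and the 183-day default are assumptions for illustration, not a description of how bof memory search ranks results.

```python
from datetime import date


def decayed_relevance(base_score, created, today, window_days=183):
    """Scale a relevance score down once an observation is older than the window."""
    age_days = (today - created).days
    if age_days <= window_days:
        return base_score
    # Halve the score for each full window beyond the first.
    return base_score * 0.5 ** ((age_days - window_days) / window_days)


# Usage: a recent observation and one about a year old.
recent = decayed_relevance(0.8, date(2024, 1, 1), date(2024, 3, 1))
stale = decayed_relevance(0.8, date(2023, 1, 1), date(2024, 1, 2))
print(recent, stale)
```

An archive step could be layered on top by dropping entries whose decayed score falls below a configurable floor.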

Values and beliefs (investment policy)

Scope: rarely changes, human-edited.
Content: investment philosophy, risk tolerance, time horizon, sector convictions, behavioral guardrails.
Storage: investments/policy.yaml — version-controlled, human-authored.
Usage: loaded into every agent session as foundational context. Constrains recommendations — the agent should not suggest actions that violate the policy without explicitly flagging the conflict.

5.5 Service CLI surface

# Existing
bof service run --telegram --cron weekly_review=1w --event-log artifacts/events.jsonl

# New flags
#   --agent-command  headless agent binary + flags
#   --agent-timeout  seconds per agent session
#   --memory-dir     long-term memory directory
bof service run \
  --agent-command "claude -p" \
  --agent-timeout 600 \
  --memory-dir artifacts/memory

# Memory management
bof memory list [--type observation|pattern]
bof memory search --query "..."
bof memory write --type observation --content "..."
bof memory forget ID

6. Phased Delivery

Phase 1: Weekly Backbone (DONE)

Delivered: typed brokerage outputs via SnapTrade SDK, weekly orchestrator with on-demand CLI, report artifact persistence (JSON + MD), baseline ingestion (portfolio, market, watchlist, news), macro ingestion via FRED.

Phase 2a: Scoring Rubrics (DONE)

Delivered: rubric YAML schema + loader/validator, three initial rubrics (thesis_alignment, news_sentiment, valuation_signal), composite scoring with weighted average and action thresholds, bof review rubric/score/finalize CLI commands, rubric prompt rendering in --dry-run mode.

Phase 2b: Thesis System + SEC Filings + Agent Write-Back (TODO, next up)

Scope:
1. Thesis YAML schema + loader/validator.
2. bof thesis command surface: bof thesis list, bof thesis show SYMBOL, bof thesis check SYMBOL (outputs thesis conditions vs. current market/position state; quantitative conditions are evaluated by the CLI, qualitative conditions are presented as context for the agent).
3. Migration from watchlist to thesis files.
4. Thesis context feeds into thesis_alignment rubric prompts.
5. SEC filing adapter per §2.5: bof sec filings SYMBOL (filing index from EDGAR), bof sec read SYMBOL --type 10-K (fetch + clean filing into readable text sections: risk factors, MD&A, guidance; deterministic HTML cleaning, no LLM).
6. Agent write-back commands: bof review evidence SYMBOL --source-type TYPE --claim "..." --source-url URL --attribution "..." (record structured evidence), bof review annotate --run-id ID --field FIELD --value "..." (write narrative sections back into report).

Exit criteria:
1. Every recommendation maps to thesis conditions + evidence.
2. SEC filings are fetchable and readable via CLI; agent can extract and record evidence from them.
3. Agent can write evidence and narrative back into the report via CLI commands.

Phase 3: Risk Engine + Portfolio Analytics

Scope:
1. Implement risk metrics per §4.3 (concentration, beta, correlation, Sharpe, max drawdown).
2. Breach detection with severity per investments/risk-policy.yaml.
3. Risk-informed action generation.
4. Insider transaction and institutional ownership signals per §2.1.

Exit criteria:
1. Report includes quantitative risk scores, qualitative signals, breaches, and mitigations.

Phase 4: Decision Layer

Scope:
1. Combine thesis + risk + constraints into final decisions.
2. Confidence scoring per §4.1, action ranking per §4.2.
3. JSON output conforms to §3.4, markdown output includes all §3.5 sections delivered so far.

Exit criteria:
1. Every decision carries confidence score and linked evidence.
2. Action items are priority-ranked.
3. Report is directly actionable as weekly review checklist.

Phase 5: Storage Migration + Performance, Income & Tax Tracking

Scope:
1. SQLite database bootstrap per §3.2: schema creation, DatabaseService, migration runner.
2. Artifacts directory setup per §3.1: move outputs from investments/reports/ to artifacts/, gitignore, update path constants.
3. Performance measurement: TWR, benchmark comparison (vs. SPY), alpha.
4. Dividend tracking: yield, ex-dates, income received, yield-on-cost.
5. Cost basis enrichment: holding period (ST/LT), realized gains/losses.
6. Tax-loss harvesting signals: unrealized losses with ST/LT flag and wash-sale window check.
7. Historical trend analysis: cross-report comparison over time.
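For the performance-measurement item, time-weighted return (TWR) chains the growth of each sub-period so that deposits and withdrawals do not distort the result. The sketch below is a minimal version under one stated assumption: each external cash flow arrives at the end of its sub-period, just before that period's closing valuation. The function name and the sample numbers are illustrative.

```python
def time_weighted_return(values, flows):
    """Chain-link sub-period returns into a time-weighted return.

    values: portfolio value at each snapshot, including any flow on that date.
    flows:  net external cash flow arriving at each snapshot after the first
            (positive for deposits, negative for withdrawals).
    """
    if len(flows) != len(values) - 1:
        raise ValueError("need exactly one flow per sub-period")
    growth = 1.0
    for start, end, flow in zip(values, values[1:], flows):
        # Remove the flow so only investment performance remains.
        growth *= (end - flow) / start
    return growth - 1.0


# Usage: three sub-periods with a 500 deposit at the end of the second.
values = [1000.0, 1100.0, 1700.0, 1870.0]
flows = [0.0, 500.0, 0.0]
twr = time_weighted_return(values, flows)
print(round(twr, 4))
```

Here the sub-period growth factors are 1.10, 1200 / 1100, and 1.10, which multiply to 1.32, so the TWR is 32% even though the account value rose by 87% once the deposit is included.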

Phase 6: Allocation & Rebalancing

Scope:
1. Target allocation per §2.6, drift computation, rebalancing suggestions.
2. Tax-aware trade sizing (prefer selling LT gains, harvest ST losses).
3. Position sizing by conviction tier from thesis definitions.
4. Cash reserve tracking: actual vs. target dry powder.

Phase 7: Alerts & Threshold Monitoring

Scope:
1. Portfolio-level event predicates wired into existing event service.
2. Price, drift, earnings proximity, and dividend ex-date alerts.
3. Alert rules defined in investments/alerts.yaml.
4. CLI: bof alert list/add/remove.
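The alerts table in §3.2 carries last_triggered_at and cooldown_seconds, which suggests a guard against repeated notifications. A minimal sketch of such a check follows; the function name and the rule that an alert may fire again once the full cooldown has elapsed are assumptions.

```python
from datetime import datetime


def should_fire(condition_met, now, last_triggered_at, cooldown_seconds):
    """Fire only when the condition holds and the cooldown has elapsed."""
    if not condition_met:
        return False
    if last_triggered_at is None:
        return True  # never fired before
    return (now - last_triggered_at).total_seconds() >= cooldown_seconds


# Usage: an alert that last fired 30 minutes ago.
now = datetime(2024, 1, 8, 9, 0)
last = datetime(2024, 1, 8, 8, 30)
print(should_fire(True, now, last, cooldown_seconds=3600))
print(should_fire(True, now, last, cooldown_seconds=1800))
```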

Phase 8: Service Mode: Agent Orchestration + Telegram + Memory

Scope:
1. Agent orchestrator: spawn headless agent sessions from event handlers (configurable agent command, timeout, output capture).
2. Telegram send capability: TelegramClient for sendMessage, formatted replies, callback ACK.
3. Telegram conversational routing: human query → agent session → reply. Multi-turn state per chat_id.
4. Investment policy: investments/policy.yaml schema, loaded as agent context.
5. Long-term memory: artifacts/memory/ store, bof memory list/search/write/forget CLI commands.
6. Cron → agent: cron tick triggers full agent-driven weekly review without human intervention.
7. Proactive notifications: review summaries and alert messages sent to Telegram after completion.

Exit criteria:
1. bof service run with cron trigger spawns a headless agent that completes a weekly review end-to-end.
2. Telegram bot receives a human query, spawns an agent, and sends back a composed reply.
3. Agent sessions load investment policy and relevant long-term memory as foundational context.
4. Agent sessions write observations to long-term memory after completing a workflow.

7. Test Infrastructure (Cross-Cutting)

Fixture layering: each phase builds on prior fixtures.
Recorded responses: all external API calls captured as fixtures. No live calls in CI.
Schema validation: shared validator enforces §3.4 contract.
Injectable dependencies: run_id and timestamps are injectable so tests control non-deterministic inputs.
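The injectable-dependencies point can be illustrated with the report metadata from §3.4. In this sketch the function name and parameter names are hypothetical; the idea is only that the run ID and clock default to real sources in production and are replaced with fixed values in tests.

```python
import uuid
from datetime import datetime, timezone


def build_metadata(
    trigger,
    run_id_factory=lambda: str(uuid.uuid4()),
    now=lambda: datetime.now(timezone.utc),
):
    """Build the report metadata block; both sources of randomness are injectable."""
    return {
        "run_id": run_id_factory(),
        "date": now().date().isoformat(),
        "trigger": trigger,
    }


# Usage in a test: fixed run ID and fixed clock give a fully deterministic result.
metadata = build_metadata(
    "manual",
    run_id_factory=lambda: "run-0001",
    now=lambda: datetime(2024, 1, 5, 12, 0, tzinfo=timezone.utc),
)
print(metadata)
```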

8. Security and Traceability

Secrets never logged.
Redaction applied to serialized event payloads.
Recommendations carry source/timestamp provenance.
All workflow interactions and agent write-backs are auditable.
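Redaction of serialized event payloads could look like the sketch below. It is an illustration under assumptions: the list of sensitive key fragments, the "[REDACTED]" marker, and the redact function are hypothetical, and matching is done on key names only, so secrets embedded in free-text values would need separate handling.

```python
import json

# Assumption: keys containing any of these fragments hold secrets.
SENSITIVE_FRAGMENTS = ("token", "secret", "password", "api_key", "authorization")


def redact(payload):
    """Return a copy of the payload with sensitive values masked, recursively."""
    if isinstance(payload, dict):
        return {
            key: "[REDACTED]"
            if any(fragment in key.lower() for fragment in SENSITIVE_FRAGMENTS)
            else redact(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact(item) for item in payload]
    return payload


# Usage: redact before writing an event to the JSONL audit log.
event = {
    "type": "workflow.started",
    "payload": {
        "symbol": "NVDA",
        "finnhub_api_key": "abc123",
        "headers": [{"Authorization": "Bearer xyz"}],
    },
}
line = json.dumps(redact(event))
print(line)
```

Because redact returns a new structure, the original event stays intact for in-process use while only the masked copy reaches the log.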