Reference · Model v3 · Last Retrained Q1 2026

How GovGreed Scores Congressional Trades

A complete technical reference for the 7-layer signal engine, the 4 daily prediction engines, and the backtest results. Every number on GovGreed can be traced back to a public federal filing via this page. Not financial advice.

The 7 Layers

Every STOCK Act trade in historical_trades with amount_mid >= $50,000 is passed through seven scoring layers. Each layer produces a 0-100 sub-score. The master_score is a weighted sum, capped at 100 before convergence multipliers apply.

Layer | What It Measures | Weight | Source Table
Politician Quality | Historical win-rate, committee alignment, trading style | 20% | politician_profiles
Herd Behavior | 3+ politicians buying same ticker in rolling window | 20% | herd_signals
Bill Correlation | Trade timing vs. related bill activity | 16% | bill_trade_correlations
Technical Context | RSI, SMA, volume surge, trend direction | 12% | technical_indicators
Sector Momentum | Congressional net flow into sector, heat tier | 12% | sector_momentum
Contribution Pattern | Campaign money from trade's sector | 10% | contribution_patterns
Lobbying Alignment | Active lobbying matching trade's sector | 10% | lobbying_patterns
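
The weighted sum described above can be sketched as follows. This is a minimal illustration, not GovGreed's actual calculate_master_signal_scores implementation: the dictionary keys and the function name base_master_score are hypothetical, and treating a missing layer as a sub-score of 0 is an assumption. Only the weights come from the table.

```python
# Layer weights taken from the table above; the key names are hypothetical.
LAYER_WEIGHTS = {
    "politician_quality": 0.20,
    "herd_behavior": 0.20,
    "bill_correlation": 0.16,
    "technical_context": 0.12,
    "sector_momentum": 0.12,
    "contribution_pattern": 0.10,
    "lobbying_alignment": 0.10,
}


def base_master_score(sub_scores: dict) -> float:
    """Weighted sum of 0-100 sub-scores, capped at 100.

    Assumption: a layer with no sub-score contributes 0.
    """
    total = sum(
        weight * sub_scores.get(layer, 0.0)
        for layer, weight in LAYER_WEIGHTS.items()
    )
    return min(100.0, total)


example = {
    "politician_quality": 80,
    "herd_behavior": 90,
    "bill_correlation": 50,
    "technical_context": 40,
    "sector_momentum": 60,
    "contribution_pattern": 0,
    "lobbying_alignment": 30,
}
print(round(base_master_score(example), 1))
```

Because the weights sum to 1.0, the base score can only reach 100 when every layer scores 100, so the cap matters mainly once multipliers are applied.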

Convergence Multipliers

When 3 or more layers fire simultaneously for a single trade, the base master_score is multiplied:

  • 3 signals firing: 1.3× multiplier
  • 4 signals firing: 1.5× multiplier
  • 5 or more signals firing: 2.0× multiplier (the "Perfect Storm" case)

The final master_score is hard-capped at 100 regardless of multiplier. Trades with 5+ active layers are rare — GovGreed has identified roughly 30 Perfect Storm events in the full 189,595-trade dataset.
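
As a rough sketch of the multiplier rule: the source does not define what makes a layer "fire", so the FIRING_THRESHOLD cutoff below is purely hypothetical, as are the function names. The multiplier values and the hard cap at 100 are the ones stated above.

```python
# Hypothetical cutoff: the source does not say what sub-score makes a layer "fire".
FIRING_THRESHOLD = 50.0


def count_active_layers(sub_scores: dict, threshold: float = FIRING_THRESHOLD) -> int:
    """Count layers whose sub-score meets the (assumed) firing threshold."""
    return sum(1 for score in sub_scores.values() if score >= threshold)


def convergence_multiplier(active_layers: int) -> float:
    """Multiplier from the list above: 3 -> 1.3, 4 -> 1.5, 5 or more -> 2.0."""
    if active_layers >= 5:
        return 2.0
    if active_layers == 4:
        return 1.5
    if active_layers == 3:
        return 1.3
    return 1.0


def final_master_score(base_score: float, active_layers: int) -> float:
    """Apply the convergence multiplier, then hard-cap the result at 100."""
    return min(100.0, base_score * convergence_multiplier(active_layers))
```

Under these assumptions, a base score of 57 with three active layers becomes about 74.1, while the same base score with five active layers hits the cap at 100.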

Tier Thresholds

Each scored signal is classified into one of 7 tiers based on its master_score:

  • S (75+): Highest conviction
  • A+ (60–74): 72.7% win rate
  • A (50–59): ~65% win rate
  • B (40–49): ~55% win rate
  • C (30–39): Borderline
  • D (20–29): Weak
  • F (< 20): Noise floor
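
The tier lookup can be sketched as a simple threshold scan. The function name classify_tier is hypothetical, and treating each lower bound as inclusive for fractional scores (so 74.5 is still A+) is an assumption, since the thresholds are published as whole numbers.

```python
# Lower bounds from the tier list above, highest first.
TIER_FLOORS = [(75, "S"), (60, "A+"), (50, "A"), (40, "B"), (30, "C"), (20, "D")]


def classify_tier(master_score: float) -> str:
    """Return the first tier whose floor the score meets; anything below 20 is F."""
    for floor, tier in TIER_FLOORS:
        if master_score >= floor:
            return tier
    return "F"


for score in (82, 74.5, 50, 19.9):
    print(score, classify_tier(score))
```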

The 4 Prediction Engines

In addition to scoring trades after they're disclosed, GovGreed runs 4 forward-looking prediction engines daily. They generate the trade_predictions rows that feed the predictions_latest view on the dashboard. As of April 2026 the combined output is 819 active predictions across 76 politicians.

  1. Committee Markup Engine — starts from the upcoming_markups calendar, maps committee members to the bill's affected sector, and predicts which politicians are likely to trade before the markup.
  2. Pattern Engine — detects recurring dollar-cost averaging behavior using 3 purchases over 120 days as a threshold. Flags politicians due for their next buy within a 21-day window.
  3. Signal Bridge Engine — converts high-score signal_scores rows (score ≥ 20, lookback 365 days) directly into forward predictions.
  4. Bill Correlation Engine — uses the 256,112 bill_trade_correlations rows to identify politicians who historically trade around specific bills reaching markup.
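
To make the Pattern Engine rule concrete, here is one possible reading of it. The thresholds (3 purchases, 120 days, 21-day window) come from the description above. Everything else is an assumption for illustration: the function name next_buy_due, projecting the next buy from the mean gap between past purchases, and measuring the 21-day window forward from the evaluation date.

```python
from datetime import date, timedelta
from typing import List, Optional

MIN_PURCHASES = 3      # purchases required to count as a recurring pattern
LOOKBACK_DAYS = 120    # span in which those purchases must fall
DUE_WINDOW_DAYS = 21   # how far ahead a projected buy may be and still be flagged


def next_buy_due(purchase_dates: List[date], as_of: date) -> Optional[date]:
    """Return the projected next purchase date if it falls in the due window, else None.

    Assumption: the next buy is projected as last purchase + mean gap between purchases.
    """
    window_start = as_of - timedelta(days=LOOKBACK_DAYS)
    recent = sorted(d for d in purchase_dates if window_start <= d <= as_of)
    if len(recent) < MIN_PURCHASES:
        return None
    gaps = [(later - earlier).days for earlier, later in zip(recent, recent[1:])]
    mean_gap = sum(gaps) / len(gaps)
    projected = recent[-1] + timedelta(days=round(mean_gap))
    if as_of <= projected <= as_of + timedelta(days=DUE_WINDOW_DAYS):
        return projected
    return None


buys = [date(2026, 1, 5), date(2026, 2, 4), date(2026, 3, 6)]
print(next_buy_due(buys, as_of=date(2026, 3, 20)))
```

With purchases roughly every 30 days, the projected next buy lands about 30 days after the last one, and it is flagged only when the evaluation date is within 21 days of that projection.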

All engines are orchestrated by refresh_all_predictions(), which runs on weekdays at 11:45 PM UTC. A prediction expires after 30 days or when it is superseded by a newer prediction for the same politician+sector+source.
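
The expiry rule can be sketched as follows. The Prediction fields and the function active_predictions are hypothetical stand-ins for the trade_predictions schema, which this page does not spell out. Only the 30-day lifetime and the politician+sector+source supersede key come from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

MAX_AGE = timedelta(days=30)


@dataclass(frozen=True)
class Prediction:
    # Hypothetical fields; the real trade_predictions columns may differ.
    politician: str
    sector: str
    source: str
    created_at: datetime


def active_predictions(preds: List[Prediction], now: datetime) -> List[Prediction]:
    """Keep the newest prediction per politician+sector+source, then drop any older than 30 days."""
    latest: Dict[Tuple[str, str, str], Prediction] = {}
    for p in preds:
        key = (p.politician, p.sector, p.source)
        if key not in latest or p.created_at > latest[key].created_at:
            latest[key] = p
    return [p for p in latest.values() if now - p.created_at <= MAX_AGE]
```

Superseding is resolved before the age check, so an old prediction is never revived just because its newer replacement has also expired.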

Why Only 61 Politicians Get Signal Scores

The scoring engine deliberately filters to trades with amount_mid >= $50,000. This excludes politicians who trade exclusively in the $1K–$15K range because those trades are not statistically meaningful as insider signals and would dilute the model. The filter produces 61 politicians with active scores and 2,790 scored signals, which is the set we backtest against. The full 343-politician trader universe is still exposed in unfiltered tables (historical_trades, congress_trader_stats, every politician spotlight).
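
A small sketch of the filter, assuming amount_mid is the midpoint of the dollar range reported on the disclosure (for example $1,001 to $15,000 or $50,001 to $100,000). That midpoint definition, the amount_low and amount_high field names, and the function names are assumptions for illustration. Only the $50,000 threshold comes from the text.

```python
MIN_AMOUNT_MID = 50_000  # threshold stated above


def amount_mid(amount_low: float, amount_high: float) -> float:
    """Midpoint of a disclosed dollar range (assumed definition of amount_mid)."""
    return (amount_low + amount_high) / 2


def scored_politicians(trades: list) -> set:
    """Politicians with at least one trade that passes the amount_mid filter."""
    return {
        t["politician"]
        for t in trades
        if amount_mid(t["amount_low"], t["amount_high"]) >= MIN_AMOUNT_MID
    }


sample_trades = [
    {"politician": "A", "amount_low": 1_001, "amount_high": 15_000},
    {"politician": "A", "amount_low": 50_001, "amount_high": 100_000},
    {"politician": "B", "amount_low": 15_001, "amount_high": 50_000},
]
print(scored_politicians(sample_trades))
```

Under this midpoint assumption, a trade reported in the $15,001 to $50,000 band has a midpoint of about $32,500 and is excluded, so politician B above would not receive a signal score.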

Backtest Results

Results come from signal_backtest_stats, a view that deduplicates signals quarterly to prevent overlapping windows. A trade is counted as a "win" when its excess_return_30d (trade return minus S&P 500 return over the same 30-day window) is positive.
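
The win definition can be illustrated with a short sketch. The excess return formula (trade return minus S&P 500 return over the same 30 days) is the one given above. The deduplication key is an assumption: this sketch keeps the earliest signal per politician, ticker, and calendar quarter, which may not match how signal_backtest_stats actually deduplicates. The dictionary field names other than trade_return_30d and sp500_return_30d are hypothetical.

```python
from datetime import date


def quarter_key(d: date) -> tuple:
    """Calendar quarter as (year, quarter number 1-4)."""
    return (d.year, (d.month - 1) // 3 + 1)


def backtest_summary(signals: list) -> tuple:
    """Return (win_rate, avg_excess_return) after quarterly deduplication.

    Assumption: one signal is kept per politician+ticker+quarter (the earliest).
    """
    seen = set()
    excess_returns = []
    for s in sorted(signals, key=lambda s: s["signal_date"]):
        key = (s["politician"], s["ticker"], quarter_key(s["signal_date"]))
        if key in seen:
            continue  # overlapping signal in the same quarter
        seen.add(key)
        excess_returns.append(s["trade_return_30d"] - s["sp500_return_30d"])
    if not excess_returns:
        return (0.0, 0.0)
    wins = sum(1 for x in excess_returns if x > 0)
    return (wins / len(excess_returns), sum(excess_returns) / len(excess_returns))


sample_signals = [
    {"politician": "A", "ticker": "NVDA", "signal_date": date(2026, 1, 10),
     "trade_return_30d": 0.12, "sp500_return_30d": 0.02},
    {"politician": "A", "ticker": "NVDA", "signal_date": date(2026, 2, 15),
     "trade_return_30d": 0.30, "sp500_return_30d": 0.01},  # same quarter, dropped
    {"politician": "B", "ticker": "XOM", "signal_date": date(2026, 4, 3),
     "trade_return_30d": 0.01, "sp500_return_30d": 0.03},
]
print(backtest_summary(sample_signals))
```

Note that a trade can gain in absolute terms and still count as a loss if the S&P 500 gained more over the same window, as with the third sample signal.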

  • A+ tier (60-74): 72.7% win rate, +10.7% avg 30-day excess return
  • A tier (50-59): ~65% win rate, +5% avg 30-day excess return
  • B tier (40-49): ~55% win rate, +1-2% avg 30-day excess return
  • C tier and below: not statistically distinguishable from market baseline

Data Sources

GovGreed aggregates from 8 data sources, combining public federal records with commercial market data and news feeds:

  • STOCK Act disclosures — FMP Ultimate feed (primary) + QuiverQuant (reconciliation), sourced from House and Senate clerks
  • Bill and vote records — Congress.gov (1,000 req/hr)
  • Campaign contributions — FEC (60 req/hr)
  • Lobbying filings — Senate Lobbying Disclosure Act (LDA) database
  • Corporate insider trades — SEC EDGAR Form 4 (10 req/sec)
  • Federal contract awards — USASpending.gov
  • Market prices — FMP + Yahoo Finance
  • Stock news — Brave Search API (sentiment-scored)

No private or subscription-only data is used. Every fact on GovGreed is traceable to a public federal filing or commercial market data feed. See the llms-full.txt for the full machine-readable data dictionary.

Model Retraining Cadence

Model weights are stored in the model_weights table and version-tagged as integers. The current active version is v3. The optimize_model_weights() RPC runs gradient descent over held-out quarters to maximize predictive accuracy, producing a new version row. Activation is explicit via activate_model_version(n). Deprecated versions remain in the table for audit.

The Bill Investability ML model (separate from the signal engine) was trained on 42,199 bills from the 117th and 118th Congresses and validated on 119th Congress data. Bills scoring 70+ pass at 9.1% versus the 1.7% baseline, roughly a 5.4x lift (9.1 / 1.7 ≈ 5.35).

Known Limitations

GovGreed publishes these limitations openly:

  • Disclosure gap ceiling: Trades are only visible after STOCK Act filing. The average 44.9-day disclosure gap caps how timely any signal can be.
  • Small-amount exclusion: Politicians who trade exclusively below $50K are not scored. Their trades appear in data but not in signal tiers.
  • Bill text gaps: The bills table has NULL values for summary, full_text, and committee_code for all 42,199 rows. Bill intelligence is derived from bill_impacts and upcoming_markups instead.
  • Not predictive of legality: A high master_score flags statistical convergence, not illegal insider trading. Legal determination requires prosecution, not correlation.
  • Backtest != forward performance: The 72.7% A+ win rate is historical. Forward returns may differ. This is not financial advice.

Frequently Asked Questions

How does GovGreed score congressional trades?
GovGreed uses a 7-layer weighted scoring model applied to every STOCK Act trade in historical_trades with amount_mid >= $50,000. The layers are: politician quality (20%), herd behavior (20%), bill correlation (16%), technical context (12%), sector momentum (12%), contribution pattern (10%), and lobbying alignment (10%). Each layer contributes a 0-100 sub-score; the weighted sum is the master_score, also on a 0-100 scale. Trades where 3 or more layers fire simultaneously receive a convergence multiplier (1.3x for 3 signals, 1.5x for 4, 2.0x for 5+), and the result is hard-capped at 100. Trades are then classified into tiers: S (75+), A+ (60-74), A (50-59), B (40-49), C (30-39), D (20-29), F (<20). The backtest published in signal_backtest_stats shows the A+ tier produces a 72.7% win rate with a +10.7% average 30-day excess return. The model weights are stored in the model_weights table; the currently active version is v3.
What does each layer measure?
Politician quality draws from politician_profiles.quality_score: each trader's historical buy win-rate, sell win-rate, call/put accuracy, and committee alignment, aggregated into a 0-100 score and a quality tier (S through F). Herd behavior measures whether 3 or more politicians independently bought the same ticker within a rolling window (herd_signals table, 31 active). Bill correlation scores the timing distance between a trade and related bill activity (bill_trade_correlations, 256,112 rows). Technical context pulls RSI, moving averages, and volume surge from technical_indicators. Sector momentum uses sector_momentum tier classifications. Contribution pattern checks whether the politician received campaign money from the same sector they are trading (contribution_patterns, 565 rows). Lobbying alignment checks whether active lobbying matches the trade's sector (lobbying_patterns, 2,101 rows).
What is the win rate of GovGreed's signals?
Based on the signal_backtest_stats view, which deduplicates signals quarterly to prevent overlap, the A+ tier (master_score 60-74) produces a 72.7% win rate with a +10.7% average 30-day excess return over the S&P 500. The S tier (75+) has a smaller sample but maintains similar directional accuracy. Lower tiers degrade progressively: A tier (~65% win rate), B tier (~55%), C tier (~48%). These are backtest results across the 2,790 signals scored since the v3 model activated. Excess return is calculated per trade as trade_return_30d minus sp500_return_30d over the same holding window. Signals that do not produce enough price action within 90 days automatically expire.
Why do only 61 politicians have signal scores when 343 actively trade?
Because the scoring engine (calculate_master_signal_scores) filters to trades with amount_mid >= $50,000. This is intentional: small trades ($1K-$15K range) are not statistically meaningful insider signals and would dilute the model. Politicians who trade exclusively in the small-amount tier, such as Sen. John Boozman (167 trades, mostly <$15K), are therefore excluded from signal_scores even though their raw trades appear in historical_trades. The filter produces 61 politicians with active scores and 2,790 scored signals, which is the statistically significant set. The full 343-politician trader universe is still exposed in the unfiltered historical_trades data, the congress_trader_stats view, and every per-politician spotlight page.
What are the 4 prediction engines?
GovGreed runs 4 forward-looking prediction engines daily, all orchestrated by the refresh_all_predictions() RPC. Engine 1 — Committee Markup — starts from the upcoming_markups schedule, maps committee members to the bill's affected sector, and predicts which politicians are likely to trade before the markup. Engine 2 — Pattern — detects recurring dollar-cost averaging behavior using 3 purchases over 120 days as a threshold, flagging politicians who are due for their next buy. Engine 3 — Signal Bridge — converts high-score signal_scores rows directly into forward predictions. Engine 4 — Bill Correlation — uses the 256,112 bill_trade_correlations records to identify politicians who historically trade around specific bills. Combined output: 819 active predictions across 76 politicians as of April 2026. All predictions are queryable through the predictions_latest view.
Where does the data come from?
GovGreed aggregates from 8 data sources, combining public federal records with commercial market data and news feeds. STOCK Act trade disclosures come from the FMP Ultimate feed (primary) and QuiverQuant (reconciliation), both sourced from the House and Senate clerks. Bill and vote records come from Congress.gov. Campaign contributions come from FEC. Lobbying filings come from the Senate LDA database. Corporate insider trades come from SEC EDGAR Form 4. Federal contract awards come from USASpending.gov. Market prices come from FMP and Yahoo Finance. Stock news comes from the Brave Search API (sentiment-scored). No private or subscription-only data is used. Every fact surfaced on GovGreed is traceable to a public federal filing or commercial market data feed with citation attribution.
How often is the model retrained?
The signal engine (v3, all 7 layers active) is retrained quarterly using the optimize_model_weights() RPC, which runs gradient descent over the layer weights to maximize predictive accuracy on held-out quarters. Between retrainings, the daily pipeline recomputes master_scores for new trades and refreshes herd_signals, contribution_patterns, and lobbying_patterns. The Bill Investability ML model (separate from the signal engine) was trained on 42,199 bills from the 117th and 118th Congresses and validated on 119th Congress data. Model versions are tracked in the model_weights table; the current active version is v3. Deprecated versions remain in the table for audit purposes.

Related Reading