Week 2 Live — 26 May 2026 · Sprint window: Fri 30 May → Mon 2 Jun · First prediction due this weekend.  Check back over the weekend for the latest weekly setup and any changes.

Design Thinking 3 · Weekly Prediction Exercise

Read the Market.
Predict. Learn. Improve.

Each week your team studies real market data, queries four AI models, and submits a prediction. The following Monday, you score your results and update your reasoning.

3
Primary predictions
11
S&P Sectors
4
AI models
10
Sprint weeks
🔥

Why This Project Will Change How You See Yourself

Read This First

Most students your age are studying markets in textbooks. You are about to do it live, with real data, real AI tools, and real accountability.

This is not a simulation. Every week you make a real prediction about real financial markets, record it with a timestamp, and find out exactly how right or wrong you were. By the time this trimester ends, you will have built something that most people in their twenties — including many finance graduates — simply do not have: a documented, evidence-based track record of market analysis.

What you build here will follow you into interviews, into master's programmes, and into the first years of your career. The question is how seriously you take it.

Wherever You Are Headed — This Project Gives You a Story

💼
Internship Hunt

Going for an Internship?

Internship interviewers — at tech companies, banks, consultancies, startups — are looking for one thing above all else: evidence that you can think under uncertainty and communicate clearly. Most candidates talk about their GPA and their club memberships.

You walk in and say: "Every week for ten weeks I made a real prediction about the S&P 500, the Nasdaq, and the Russell 2000 using a structured evidence framework that combined seasonal data, macro indicators, chart analysis, and four AI models. I then scored my own calibration — not just whether I was right, but whether my confidence was appropriate. Here is what I learned about the difference between a confident AI answer and a carefully reasoned human judgment."

That is not a student project. That is a demonstration of analytical maturity. It is rare at any age. At twenty years old it is exceptional.

🎓
Masters Application

Targeting a Master's Programme?

Top master's programmes — in data science, finance, business analytics, AI, or management — ask you to write a statement of purpose. Most applicants describe their coursework and grades. A few describe a project that genuinely challenged them.

Your statement of purpose says: "As a capstone project during my undergraduate studies, I led the Technical Agent role in a ten-week Agile sprint team that built a real-time market intelligence system. Each sprint cycle began with data collection at market close, moved through three structured analytical agents, was synthesised by four AI models, and culminated in a human-reviewed prediction with a documented confidence level. We tracked calibration — not just accuracy — and used the retrospective to improve the reasoning process each week."

Programmes in finance, data science, or AI will read that and see someone who already understands human-AI collaboration, evidence-based reasoning, and iterative improvement. That is exactly what they want in their cohort.

🚀
Entering the Workforce

Heading Straight into a Career?

The working world is full of people who can follow instructions. It is genuinely short of people who can look at complex, contradictory information and say clearly: "Here is what I think, here is why, here is what would change my mind, and here is my confidence level." That skill is valuable in any industry.

In a job interview you say: "I have experience working with real financial data, using multiple AI tools, and synthesising conflicting information into a structured recommendation. More importantly, I have ten weeks of documented evidence showing how I improved at it — where I was overconfident, where I was right to be uncertain, and how I changed my process based on what the data showed me."

Financial literacy is not just for finance jobs. Understanding how markets, interest rates, and economic signals interact is relevant to product management, operations, strategy, marketing, and technology. The person who reads the room — and reads the market — has a different kind of intelligence.

The Compounding Effect — Why Starting at Twenty Changes Everything

Year 1

You learn to read a chart, understand why interest rates move markets, and think in probabilities rather than certainties. You develop a vocabulary — EMA (Exponential Moving Average — a weighted average that reacts faster to recent price changes), yield, regime, calibration — that most of your peers do not have.

Year 3

You are making better financial decisions in your own life. You understand what happens to your savings when the Fed raises rates. You read economic news and understand what it actually means. This is the beginning of genuine financial autonomy.

Year 10

The habits you built — evidence before confidence, uncertainty when signals conflict, continuous improvement through retrospectives — apply to every major decision you will make. Career moves. Business decisions. Investment choices. The discipline is the asset.

Year 20

Someone who learned to think this way at twenty and kept practising has a twenty-year head start on clear reasoning under uncertainty. That is the real compound interest — not money, but the quality of judgment that money follows.

Warren Buffett started investing at eleven years old and has said his only regret was not starting earlier. He did not say that because early investing guarantees wealth. He said it because starting early means your reasoning compounds. Every mistake you analyse and improve on adds to a mental model that becomes more valuable with each decade. You are starting this now. At twenty.

Financial Literacy Is a Superpower Most People Never Acquire

Most people — including many highly educated professionals — go through their entire lives without understanding how financial markets work. They do not know what the yield curve means, why the Fed's words move stock prices, or how oil connects to inflation expectations. This is not their fault. Nobody taught them. They were never asked to engage with real data in a structured, accountable way.

The Financially Illiterate Person
  • Hears "the Fed raised rates" and has no idea what it means for their savings or investments
  • Makes big financial decisions based on tips from friends or confident-sounding strangers
  • Confuses price going up with value being created
  • Is overconfident when things are going well and panics when they are not
You — After Ten Sprints
  • Understand why the 10-year yield affects every asset class and can explain it simply to anyone
  • Have a framework for evaluating any claim: what is the evidence, what contradicts it, what would change the conclusion
  • Know the difference between a high-confidence call and a coin flip, and can articulate why
  • Have a documented track record of being wrong, learning from it, and improving

What You Are Actually Building Over Ten Weeks

📊
An Analytical Framework

A repeatable process for reading complex information and forming a structured, evidence-backed view

🤖
AI Fluency

You will know how to use AI tools effectively — and crucially, when not to trust them. That distinction is increasingly rare and valuable

🐙
A GitHub Portfolio

Ten weeks of disciplined, documented work. Commits. Evidence. Analysis. Retrospectives. A real portfolio that any employer or admissions committee can inspect

🧠
Calibrated Confidence

You will learn the difference between feeling sure and being right. That is one of the most powerful cognitive skills a person can develop

🗣️
Communication Under Pressure

Every Monday you present to your peers and defend your reasoning. Week after week. That repetition builds a kind of confidence that classroom exercises never do

🌍
Global Awareness

You will understand why what happens in Washington affects markets in Singapore. That macro awareness changes how you read the news, forever

"I do not care whether you predict the market correctly. I care whether you learn to think."

Markets are one of the most honest feedback systems that exist. You make a call. You record it. The outcome is public, objective, and undeniable. There is no partial credit for almost being right. There is also no penalty for being wrong with good reasoning — only for being overconfident without evidence.

That feedback loop — predict, observe, score, improve — is exactly what separates professionals who grow from those who stagnate. You are practising it at twenty. The students who take this seriously and put real energy into it will look back at this trimester and recognise it as one of the formative experiences of their intellectual development.

The question is not whether this is a valuable project. It is. The question is whether you are going to treat it as one.

— Professor Dr. Tan, CP3405 Design Thinking 3

🎯 The Challenge — Specifically for You

If you are going for an internship this year:

Treat every sprint presentation as a rehearsal for explaining your work to a hiring manager. Practise saying: "I predicted X because of Y and Z. I was wrong because I underestimated W. Here is what I changed." That sentence, said confidently and clearly, is worth more than any certification.

If you are applying for a master's programme:

Keep a personal reflection journal alongside your GitHub work. Write two sentences after every sprint: what surprised you this week, and what it changed in how you think. Ten weeks of genuine reflection is a personal statement that writes itself.

If you are heading into your first job:

Make your GitHub repository public and link it in your resume. A real, documented, ten-week project with consistent commits, evidence, and retrospectives tells any employer that you are someone who finishes things, works in a team, and improves over time. That is exactly what they are trying to assess in every interview.

If you are not yet sure where you are headed:

That is fine. The skill you are building — structured reasoning under uncertainty — is useful everywhere. The worst outcome of this project is that you spend ten weeks thinking rigorously about complex real-world problems and learn to communicate your thinking clearly. There is no profession where that is a disadvantage.

📌

Your Week 2 Mission — What Happens This Week

26 May – 2 June 2026

US markets close this Friday 30 May at 4:00 PM Eastern Time. That is when your first live sprint begins. Over the weekend, your team collects data, runs the agents, queries four AI models, and commits a prediction to GitHub — before Monday's class.

Monday 2 June is your first sprint review. You arrive with everything already done. You present for 20–25 minutes. The class sees your SPX, NDX, and IWM predictions — and all 11 sector ETF calls — on the prediction board.

TODAY — MONDAY 26 MAY (Class)
  • Read this entire website before class
  • Form your group and agree on roles (R1–R10)
  • Create your GitHub repository — README + evidence folder
  • Create free accounts: Claude, ChatGPT, Gemini, DeepSeek
  • Bookmark the three data source URLs
TUE–FRI 27–30 MAY (In-week)
  • R3: Open the Almanac PDF — read May vital stats and the Memorial Day week day notes
  • R4: Check CME FedWatch, TradingEconomics calendar, 10-year yield
  • R5: Open ProRealTime or TradingView — study SPX, NDX, IWM charts
  • R9: GitHub repo clean, README written, evidence folder structure set up
WEEKEND 31 MAY – 1 JUN (Sprint Work)
  • Saturday: Pull Finviz (1W) + Yahoo Sectors (5D) — screenshot both
  • Write three agent outputs — Almanac, Macro, Technical — before opening any AI
  • Query all 4 LLMs with the shared prompt — save every raw response
  • Complete the Multi-LLM comparison table
  • R7 leads the Human Score discussion — write the override paragraph
  • Commit prediction file to GitHub before Sunday night
  • Submit predictions to the Prediction Board (link at bottom of this site)
MONDAY 2 JUN (Sprint Review — Class)
  • Everything is already prepared. Walk in ready to present.
  • 20–25 minute presentation — every role holder speaks
  • Show GitHub repo live — commits, evidence folder, release tag
  • R7 presents the Human Score — the thing no AI raised
  • R10 updates calibration after Friday close is recorded
⚠️ Key Market Events You Must Know This Week (27–30 May 2026)
Mon 26
US markets CLOSED — Memorial Day. 4-day trading week ahead.
Tue 27
Consumer Confidence (10AM ET). Earnings: AutoZone (AZO), Zscaler (ZS).
Wed 28
New Home Sales. Earnings: Marvell (MRVL), Salesforce (CRM), Snowflake (SNOW) — major tech read-through for NDX.
Thu 29
PCE INFLATION (Personal Consumption Expenditures — the Fed's preferred inflation measure) + GDP 2nd estimate. THE swing event of the week. Hot PCE (≥2.8%) = yields spike, stocks sell off. Cool PCE (≤2.5%) = yields ease, rally continues. Low-volume holiday week amplifies the reaction.
Fri 30
Chicago PMI. Month-end rebalancing. 4PM ET: markets close → your prediction window locks → pull Finviz and record actuals over the weekend.
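
The PCE thresholds in Thursday's entry can be written down as a small rule before the number is released, so your team agrees on the scenario in advance. This is a minimal Python sketch: the function name and the "in between" label are assumptions for illustration, since the notes above give no scenario for a print between 2.5% and 2.8%.

```python
def classify_pce(pce_pct: float) -> str:
    """Map a PCE print (in percent) to the scenario named in the event notes."""
    if pce_pct >= 2.8:
        return "hot: yields spike, stocks sell off"
    if pce_pct <= 2.5:
        return "cool: yields ease, rally continues"
    # The notes give no scenario between 2.5% and 2.8%; this label is an assumption.
    return "in between: no scenario given"


print(classify_pce(2.9))
print(classify_pce(2.6))
```

Writing the rule first keeps you from reinterpreting the thresholds after the market has already reacted.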

The single most important thing to do before Saturday: Read the exemplary solution page at exemplary-solution.html — it shows you exactly what a complete, high-quality submission looks like for this exact week. Study every section. Your submission does not need to reach the same conclusions, but it must show the same structure.

🎯

What This Exercise Is About

You are technology students, not finance students. You do not need to become a trader. You need to become a disciplined thinker who can gather evidence, reason under uncertainty, and explain a conclusion clearly. The market is the practice arena. The real skill is the process.

📋

Saturday: Set the Prediction

US markets close on Friday at 4 PM ET. The moment that happens, your sprint begins. Pull the data, run the agents, query all four AI models, and commit your SPX / NDX / IWM predictions to GitHub — with % range, direction, and confidence level — before Saturday ends.
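
One way to keep the three required parts of each prediction (% range, direction, confidence level) together is a small record that you serialise into the file you commit. This is a sketch only: the class name, field names, direction labels, and JSON layout are assumptions, not a format required by the course, and the example values are purely illustrative.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Prediction:
    index: str        # "SPX", "NDX" or "IWM"
    direction: str    # e.g. "up", "down" or "uncertain" (labels are assumed)
    low_pct: float    # lower end of the predicted weekly % change
    high_pct: float   # upper end of the predicted weekly % change
    confidence: str   # e.g. "Low", "Medium" or "High" (labels are assumed)

    def __post_init__(self):
        # A range whose low end exceeds its high end is almost certainly a typo.
        if self.low_pct > self.high_pct:
            raise ValueError("low_pct must not exceed high_pct")


def to_json(predictions):
    """Serialise a list of predictions for committing to the repository."""
    return json.dumps([asdict(p) for p in predictions], indent=2)


example = [Prediction("SPX", "up", 0.2, 1.0, "Medium")]  # illustrative values only
print(to_json(example))
```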

📈

Following Friday: Record Actuals

After US markets close the following Friday, record the actual % change for SPX, NDX, and IWM over the weekend. Do not change your Saturday prediction. The GitHub commit timestamp is the immutable evidence of what you said before the week played out.

🔁

Monday Class: Present & Score

You arrive at Monday's class with your prediction already filed and your actuals already recorded. You present: what you predicted for SPX/NDX/IWM, what actually happened, your calibration score, and what you are changing. No last-minute preparation on Monday morning.

⚠️ Important framing: This is not a competition to predict correctly. It is an exercise in calibration — stating your confidence accurately. A cautious, well-reasoned "uncertain" is worth more than an overconfident wrong prediction. You are marked on reasoning quality, not market accuracy.
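
This page does not spell out how calibration is scored. One widely used measure is the Brier score: the mean squared gap between the probability you stated and what actually happened (1 if the call came true, 0 if not). The sketch below assumes you express confidence as a probability between 0 and 1, which is an assumption about the workflow rather than something stated here. Lower scores are better.

```python
def brier_score(forecasts):
    """forecasts: list of (stated_probability, happened) pairs."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    total = sum((p - (1.0 if happened else 0.0)) ** 2 for p, happened in forecasts)
    return total / len(forecasts)


# Two teams, each right once and wrong once (illustrative numbers).
overconfident = [(0.9, False), (0.9, True)]
cautious = [(0.55, False), (0.55, True)]

print(brier_score(overconfident))
print(brier_score(cautious))
```

Both teams have the same hit rate, yet the cautious team scores better because its stated confidence was closer to its actual record. That is the sense in which a well-reasoned "uncertain" beats an overconfident miss.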

📊

The Nine Assets You Track

Reference

These are the instruments your team makes weekly predictions about. Each one measures a different part of the global economy. Together, they tell a story about where money is moving and why.

🇺🇸
Index

S&P 500

A basket of the 500 largest US companies. Think Apple, Microsoft, Amazon, Google, Tesla. It is the most widely watched barometer of the US stock market. When people say "the market went up today," they almost always mean the S&P 500.

Why it matters: It is the benchmark everything else is measured against. If the S&P 500 is rising, the broad economy is usually healthy. If it falls, investors are worried.

💻
Index

Nasdaq 100

The 100 largest non-financial companies listed on the Nasdaq exchange. This is heavily weighted toward technology: semiconductors, cloud software, AI, biotech. It moves faster and more dramatically than the S&P 500.

Why it matters: Compare Nasdaq vs S&P 500 performance each week. If Nasdaq leads, the market is favouring growth and tech. If Nasdaq lags, investors may be rotating into safer, older industries.

🏘️
Index

Russell 2000

2,000 small US companies. Unlike the S&P 500 giants that earn globally, most Russell 2000 companies sell almost entirely within the US and borrow heavily. This makes them very sensitive to US interest rates and domestic economic conditions.

Why it matters: When Russell 2000 underperforms the S&P 500, it often signals that investors fear higher rates or a slowing US economy. It is one of the best early warning signals.

🥇
Commodity

Gold

A physical metal that has been used as money and a store of value for thousands of years. Gold does not pay dividends or interest. People buy it when they are afraid: afraid of inflation, afraid of economic collapse, afraid of currency debasement.

Why it matters: Gold rising while stocks fall = fear trade. Gold rising while stocks also rise = inflation concern. Gold falling while stocks rise = confidence, risk-on mood. It is the market's fear gauge alongside VIX (CBOE Volatility Index — measures expected 30-day market volatility, known as the Fear Index).

🛢️
Commodity

Crude Oil (WTI)

West Texas Intermediate crude oil is the North American benchmark price for a barrel of oil. Oil touches almost every product and service in the economy: transport, manufacturing, heating, plastics. A major oil price move creates ripples everywhere.

Why it matters: Rising oil = higher inflation expectations, pressure on consumer spending, boost for energy stocks. Falling oil = potential deflation, relief for consumers, pressure on energy sector. Geopolitical events (wars, sanctions) can move oil dramatically in a single day.

📐
Rate

10-Year Treasury Yield

The interest rate the US government pays to borrow money for 10 years. This is the most important single number in global finance. Mortgages, corporate loans, and valuations of all stocks are mathematically linked to this rate.

Why it matters: Rising yield → borrowing costs go up → companies worth less → stock valuations fall → pressure on Russell 2000 and growth stocks especially. Falling yield → the opposite. Think of yield as gravity: the higher it is, the harder it is for asset prices to stay elevated.
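
The "gravity" idea can be made concrete with basic discounting: a future cash flow is worth less today when the rate used to discount it is higher, and the effect grows with how far away the cash flow is. The sketch below is a simplified illustration with assumed numbers (a single cash flow, discounted at the yield itself), not a valuation model.

```python
def present_value(cash_flow, annual_rate, years):
    """Value today of a single cash flow received `years` from now."""
    return cash_flow / (1.0 + annual_rate) ** years


def pct_change_in_value(years, old_rate, new_rate, cash_flow=100.0):
    """Percent change in present value when the discount rate moves."""
    old = present_value(cash_flow, old_rate, years)
    new = present_value(cash_flow, new_rate, years)
    return (new - old) / old * 100.0


# Rate rises from 4% to 5% (illustrative): compare near and distant cash flows.
near = pct_change_in_value(2, 0.04, 0.05)
far = pct_change_in_value(15, 0.04, 0.05)
print(f"2-year cash flow:  {near:.1f}%")
print(f"15-year cash flow: {far:.1f}%")
```

The distant cash flow loses far more value than the near one, which is why growth stocks, whose expected earnings lie further in the future, feel rising yields most.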

🏛️
Fixed Income

US Treasury Bonds (TLT)

When yields rise, bond prices fall — and vice versa. Bond prices and yields always move in opposite directions. The TLT ETF (Exchange-Traded Fund — a basket of securities traded like a single stock on an exchange) tracks long-dated US Treasury bonds (20+ years). When investors are scared, they often flee into bonds — "flight to safety."

Why it matters: Watch whether stocks and bonds are moving together or apart. Stocks up + bonds up = unusual, often a "Goldilocks" moment. Stocks down + bonds up = classic fear trade. Stocks down + bonds also down = something unusual (inflation shock, fiscal worry).
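
The three stock/bond combinations above can be checked mechanically from two weekly % changes. A minimal sketch follows; the function name is hypothetical, and the fourth combination (stocks up, bonds down) is not described in the text, so it is returned as uncovered rather than guessed.

```python
def stock_bond_regime(stocks_pct, bonds_pct):
    """Label the week from the weekly % change in stocks and in bond prices."""
    if stocks_pct > 0 and bonds_pct > 0:
        return "Goldilocks (unusual)"
    if stocks_pct < 0 and bonds_pct > 0:
        return "classic fear trade"
    if stocks_pct < 0 and bonds_pct < 0:
        return "something unusual (inflation shock, fiscal worry)"
    # Stocks up with bonds down, or any flat reading, is not covered by the text.
    return "not covered by the three cases"


print(stock_bond_regime(-1.2, 0.8))
```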

Volatility

VIX — The Fear Index

The CBOE Volatility Index measures how much the options market expects the S&P 500 to move over the next 30 days. It is calculated from the prices of options contracts — essentially, how much investors are paying for insurance against a market drop.

Why it matters: VIX below 15 = calm. VIX 15–25 = moderate concern. VIX above 30 = fear. VIX above 40 = panic (happened during COVID crash, 2008 crisis). A rising VIX usually means falling stocks. Watch for VIX spikes as early warnings.
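
The VIX bands can be turned into a lookup so every team member labels the same reading the same way. This is a sketch with a hypothetical function name. Note that the bands above leave readings between 25 and 30 unlabelled, and the code keeps that gap visible instead of inventing a label.

```python
def vix_band(vix):
    """Return the mood label for a VIX reading, using the bands listed above."""
    if vix < 15:
        return "calm"
    if vix <= 25:
        return "moderate concern"
    if vix > 40:
        return "panic"
    if vix > 30:
        return "fear"
    # Readings above 25 and up to 30 have no label in the bands above.
    return "unlabelled (25-30)"


for reading in (12, 20, 27, 35, 45):
    print(reading, vix_band(reading))
```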

Crypto

Bitcoin

The largest cryptocurrency by market cap. Bitcoin trades 24/7 — unlike stocks — and is highly sensitive to risk appetite, liquidity conditions, and regulatory news. In recent years it has increasingly correlated with Nasdaq during risk-off events.

Why it matters: Bitcoin often moves first. When risk appetite improves, Bitcoin can surge before stocks do. When fear hits, Bitcoin can crash faster than any stock index. It is a useful leading indicator of risk appetite, though very noisy.

How These Assets Talk to Each Other

The key cross-asset relationships to watch every week:

🔗 Yields ↑ → Stocks (especially Nasdaq & Russell) ↓ — higher borrowing costs hurt growth companies most
🔗 Gold ↑ + Stocks ↓ — fear trade, money fleeing risk
🔗 Oil ↑ → Energy sector ↑, Consumer Discretionary ↓ — oil is both a threat and a sector opportunity
🔗 VIX ↑ → S&P ↓ — almost always true in the short term
🔗 Bitcoin ↑ with Nasdaq ↑ — risk-on mood, growth appetite
🔗 Russell 2000 lagging S&P 500 — warning sign: broad rally may be narrowing

🗂️

The 11 S&P 500 Sectors

Reference

The S&P 500 is divided into 11 sectors. Each week, watch which sectors are leading (buying) and which are lagging (selling). The pattern tells you what kind of week it was — risk-on or risk-off, growth or defensive.

💡
Technology XLK
Software, chips, hardware. Apple, NVIDIA, Microsoft.
↑ = risk-on, growth appetite
💊
Healthcare XLV
Pharma, hospitals, biotech. J&J, UnitedHealth.
↑ = defensive positioning
💳
Financials XLF
Banks, insurance, asset managers. JPMorgan, Goldman.
↑ with yields = banks benefit from rates
Energy XLE
Oil & gas companies. ExxonMobil, Chevron.
↑ follows oil price rise
🛒
Consumer Discretionary XLY
Things you want (not need). Amazon, Tesla, Nike.
↑ = consumer confidence strong
🧴
Consumer Staples XLP
Things you always buy. Coca-Cola, P&G, Walmart.
↑ = defensive, fear of slowdown
🏭
Industrials XLI
Manufacturing, transport, defence. Boeing, Caterpillar.
↑ = economic growth optimism
🔧
Materials XLB
Mining, chemicals, steel. Copper, lithium producers.
↑ = global growth, infrastructure demand
🏠
Real Estate XLRE
Property trusts. Heavily rate-sensitive.
↑ when yields fall, ↓ when yields rise
🔌
Utilities XLU
Power, water, gas utilities. Very stable, bond-like.
↑ = defensive, safety seeking
📡
Communications XLC
Telecom, media, social. Meta, Alphabet, Netflix.
↑ = growth/ad-revenue confidence

Rotation pattern to memorise: When money moves from Technology → Utilities/Staples, that is defensive rotation (investors getting scared). When money moves from Staples/Utilities → Technology/Financials, that is risk-on rotation (investors getting confident). Sector leadership is one of the most reliable signals of market mood.
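
A rough way to read the rotation from your Yahoo Sectors screenshot is to compare the average weekly change of the defensive pair (Utilities, Staples) with the risk-on pair (Technology, Financials). Comparing simple averages is an assumption made for this sketch, not a rule from the course, and the input numbers are illustrative.

```python
from statistics import mean

DEFENSIVE = ("XLU", "XLP")   # Utilities, Consumer Staples
RISK_ON = ("XLK", "XLF")     # Technology, Financials


def rotation_signal(weekly_pct):
    """weekly_pct: dict mapping sector ETF ticker to its weekly % change."""
    defensive = mean(weekly_pct[t] for t in DEFENSIVE)
    risk_on = mean(weekly_pct[t] for t in RISK_ON)
    if defensive > risk_on:
        return "defensive rotation"
    if risk_on > defensive:
        return "risk-on rotation"
    return "no clear rotation"


week = {"XLU": 1.4, "XLP": 0.9, "XLK": -1.1, "XLF": -0.3}  # illustrative values
print(rotation_signal(week))
```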

🔗

Your Three Bookmarks

Tools

You do not need to visit ten websites. Bookmark these three. Together they give you every number on the weekly scorecard in under 5 minutes over the weekend.

Bookmark #1 — Primary

Finviz Futures Performance

finviz.com/futures_performance.ashx

One page. All macro assets (S&P, Nasdaq, Russell, Gold, Oil, 10-Year, Bonds, Bitcoin, VIX). Switch from "1D" to "1W" with one click to see the full weekly % change. No login. No paywall.

★ Use this first over the weekend
Bookmark #2 — Sectors

Yahoo Finance Sectors

finance.yahoo.com/sectors/

All 11 S&P sectors with weekly % change as a colour-coded heatmap. Change the time period to "5D" to see the prior week's performance. Green = up, Red = down.

Free · No login needed
Bookmark #3 — Context

TradingEconomics Calendar

tradingeconomics.com/calendar

The week-ahead macro calendar. Shows CPI (Consumer Price Index — measures the change in prices paid by consumers), jobs data, Fed speeches, earnings dates, and every major economic event that could move markets. Feed this into your Macro Agent and your prediction reasoning.

Free · No login needed

The weekend workflow: Open Tab 1 (Finviz, set to 1W) → screenshot → open Tab 2 (Yahoo Sectors, set to 5D) → screenshot → open Tab 3 (TradingEconomics, look at the coming week) → note key events. Total time: 5 minutes. Commit both screenshots to your GitHub evidence folder labelled with the date.

📰

Macro Agent — Your Seven Locked Tools

R4 Reference

Use only these seven sources. If every student uses different sources, your outputs cannot be compared. These seven cover everything the Macro Agent needs. They are all free, all require no login, and all update in real time. Do not supplement with social media, Reddit, paid terminals, or opinion blogs.

🏦
CME FedWatch Tool
cmegroup.com/markets/interest-rates/cme-fedwatch-tool.html

Shows the market's implied probability of each Fed rate outcome at every upcoming FOMC meeting. The most important single macro signal in the world.

What to record: Current Fed rate. Probability of hold/cut/hike at the next FOMC meeting. Did the probability shift this week vs last week? Which direction?
Key insight: A shift from 96% hold to 85% hold is a big deal even if the rate does not change. The market repriced expectations — that moves assets.
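
The week-over-week comparison can be computed directly from two FedWatch readings. In this sketch the dictionary keys and the rule for labelling direction (dovish when the cut probability gained more than the hike probability, hawkish in the opposite case) are assumptions. The example reuses the 96% to 85% hold shift from above; how the remainder splits between cut and hike is invented for illustration.

```python
def fedwatch_shift(last_week, this_week):
    """Each argument: dict with 'hold', 'cut' and 'hike' probabilities in percent."""
    cut_move = this_week["cut"] - last_week["cut"]
    hike_move = this_week["hike"] - last_week["hike"]
    if cut_move > hike_move:
        direction = "shifted dovish"
    elif hike_move > cut_move:
        direction = "shifted hawkish"
    else:
        direction = "unchanged"
    return {
        "hold_change_pts": this_week["hold"] - last_week["hold"],
        "direction": direction,
    }


last_week = {"hold": 96, "cut": 4, "hike": 0}    # split beyond 'hold' is illustrative
this_week = {"hold": 85, "cut": 15, "hike": 0}   # split beyond 'hold' is illustrative
print(fedwatch_shift(last_week, this_week))
```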
📅
TradingEconomics Calendar
tradingeconomics.com/calendar

The week-ahead economic calendar. Every data release — CPI, PCE, jobs, GDP, retail sales, Fed speeches — with date, time, previous value, and market consensus forecast.

What to record: List every data release in the coming week. Highlight the 1–2 most market-moving events. Note what consensus expects vs last reading.
Key insight: Markets move on surprises. If consensus expects CPI at 2.8% and it prints 3.2%, that is the shock — not the number itself.
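
The surprise is simply the actual print minus the consensus forecast. A minimal sketch using the CPI numbers from the example above (the function name is hypothetical):

```python
def surprise(actual, consensus):
    """Surprise in percentage points: positive means hotter than expected."""
    return actual - consensus


cpi_surprise = surprise(actual=3.2, consensus=2.8)
print(f"CPI surprise: {cpi_surprise:+.1f} percentage points")
```

Record the consensus figure before the release so the comparison cannot be adjusted afterwards.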
📊
Finviz Futures Performance
finviz.com/futures_performance.ashx

One page. All macro assets — indices, bonds, dollar, oil, gold, bitcoin. Switch to 1W for weekly view. Your primary data source for recording last week's actual % changes.

What to record: Weekly % change for oil (CL), dollar (DX), gold (GC), 10-year (ZN), 30-year (ZB). Screenshot in 1W mode immediately after Friday US close.
Key insight: Dollar and oil moving in the same direction is unusual and often signals a geopolitical driver rather than a macroeconomic one.
📈
US Treasury Yield Curve
home.treasury.gov/resource-center/data-chart-center/interest-rates

Official US government source for the 2-year, 10-year, and 30-year Treasury yields. Updated daily. The 10-year is the most important single rate in global finance.

What to record: 2-year yield, 10-year yield, 30-year yield. Is the curve normal (10yr > 2yr) or inverted (2yr > 10yr)? Inverted curves have historically preceded recessions.
Key insight: 10-year at 4.60%+ = gravity on stock valuations. Every 0.25% rise in the 10-year is felt most by growth stocks and REITs.
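
The curve shape follows from the spread between the 10-year and 2-year yields. In the sketch below, the 0.10-point band used to call the curve "flat" is an assumption, since the text defines only normal and inverted, and the yields passed in are illustrative.

```python
def curve_shape(two_year, ten_year, flat_band=0.10):
    """Classify the yield curve from the 2-year and 10-year yields (in percent)."""
    spread = ten_year - two_year
    # flat_band is an assumed tolerance; the source does not define "flat".
    if abs(spread) <= flat_band:
        return "flat"
    return "normal" if spread > 0 else "inverted"


print(curve_shape(two_year=3.90, ten_year=4.60))  # illustrative yields
print(curve_shape(two_year=4.80, ten_year=4.30))  # illustrative yields
```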
📰
Reuters / AP News
reuters.com  |  apnews.com

Wire service news — confirmed facts, official statements, central bank announcements, geopolitical developments. These are primary sources. Use them for confirmed events only.

What to record: Any confirmed event that moved markets this week (central bank decision, geopolitical development, major earnings surprise, policy change). Label as confirmed fact.
Key rule: If you cannot find it on Reuters or AP, do not include it as a fact in the Macro Agent output. Separate confirmed news from rumour.
📋
Schwab Weekly Outlook
schwab.com/learn/story/weekly-traders-outlook

Published every Friday, written by professional market strategists. Gives a structured summary of the week that was plus the week-ahead setup. Free, no paywall.

What to use it for: Cross-check your own macro reading. Do the professionals agree with your assessment? If the Schwab strategist says "moderately bearish" and your team says "bullish," what are you seeing differently?
Key rule: This is a cross-check, not a substitute. Do not copy Schwab's view as your own analysis.
📆
Earnings Whispers
earningswhispers.com

The most complete free earnings calendar. Shows every company reporting that week, consensus EPS estimates, and the "whisper number" (what traders actually expect vs what analysts publish).

What to record: Name the 3–5 biggest earnings releases of the coming week. State which sector they affect (e.g. NVDA = XLK, JPMorgan = XLF). These are your sector-level catalysts.
Key insight: One unexpected earnings miss from a major company can drag its entire sector down even if the broad market is flat.

Macro Agent Output Template — Fill This Every Week

Every field maps directly to one of the seven tools above. No field should take more than 2 minutes to complete. Total time for the Macro Agent: 15–20 minutes.

Macro Agent Output Template — Week of [DATE] — Source: R4

FED & RATES (CME FedWatch + Treasury.gov):
 · Current Fed rate: [X.XX–X.XX%]
 · Next FOMC date: [date]. Hold probability: [XX%]. Cut probability: [XX%]. Direction vs last week: [unchanged / shifted dovish / shifted hawkish]
 · 2-year yield: [X.XX%]   10-year yield: [X.XX%]   30-year yield: [X.XX%]
 · Yield curve: [normal / inverted / flat]. 10-year direction this week: [rising / falling / flat]
 · [Implication: rising 10-year = headwind for growth stocks / falling 10-year = relief for REITs and tech]

COMMODITIES & DOLLAR (Finviz Futures 1W):
 · WTI Crude Oil: [close price], weekly change: [+/−X.X%], direction: [rising / falling / flat]
 · Gold: [close price], weekly change: [+/−X.X%], direction: [rising / falling]
 · DXY (Dollar): [level], direction: [strengthening / weakening]
 · [Cross-asset implication: oil rising + yields rising = stagflation risk. Oil falling = inflation relief.]

WEEK-AHEAD CALENDAR (TradingEconomics):
 · [Day, Date]: [Event name] — Expected: [X.X%], Previous: [X.X%] — IMPORTANCE: [High/Med/Low]
 · [Day, Date]: [Event name] — Expected: [...], Previous: [...] — IMPORTANCE: [...]
 · [Repeat for each major event this week. Maximum 5 entries. Rank by market-moving potential.]

KEY EARNINGS THIS WEEK (Earnings Whispers):
 · [Company (TICKER)] — Day — Sector: [XLX] — What to watch: [one sentence]
 · [2–3 entries maximum. Only include earnings that could move an entire sector.]

CONFIRMED NEWS EVENTS (Reuters / AP):
 · [Event description] — Source: Reuters/AP — Date — Market implication: [one sentence]
 · [Confirmed facts only. No rumours. No social media. If you cannot cite Reuters or AP, leave it out.]

MACRO BIAS: [Hawkish / Dovish / Neutral / Binary-risk]
PRIMARY DRIVER THIS WEEK: [The single most important macro event — in one sentence]
CONFIDENCE: [Low / Medium / High] — [Brief justification]
INVALIDATION: [What macro event would reverse your bias completely]
Sources accessed: [date]. All data from the seven approved tools only.

The three-level distinction your Macro Agent must always make:
Level 1 — Confirmed fact: "The Fed held rates at 3.50–3.75% on April 29." (Reuters confirmed)
Level 2 — Market expectation: "FedWatch shows 96% probability of hold at June 17 meeting." (CME data)
Level 3 — Your interpretation: "Given Warsh's hawkish track record, we believe tone risks are skewed to the upside for yields." (your analysis)

Label all three differently in your output. Never present an interpretation as a fact.
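
One way to make the labelling hard to forget is to attach a level to every statement and refuse a "confirmed fact" that does not cite Reuters or AP. This sketch uses hypothetical class and field names, and the bracketed output format is an assumption, not the required template.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    FACT = "Confirmed fact"
    EXPECTATION = "Market expectation"
    INTERPRETATION = "Your interpretation"


@dataclass
class Statement:
    level: Level
    text: str
    source: str

    def __post_init__(self):
        # Mirrors the key rule above: facts must be citable to Reuters or AP.
        if self.level is Level.FACT and self.source not in ("Reuters", "AP"):
            raise ValueError("a confirmed fact must cite Reuters or AP")

    def render(self):
        return f"[{self.level.value}] {self.text} ({self.source})"


line = Statement(Level.EXPECTATION, "FedWatch shows 96% probability of hold.", "CME data")
print(line.render())
```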

📆

Almanac Reference Card — Annual Seasonal Patterns

R3 Reference · 2026

Use the PDF for week-specific day notes. Use this card for monthly stats and sector seasonality. The Almanac PDF is your authoritative source — open it every weekend and look up the current month's daily calendar notes. This reference card gives you the key statistics without having to re-read the book every week. It does not replace the PDF. The PDF contains day-specific notes (e.g. "Memorial Day week Dow down 17 of last 29") that are not reproduced here.

2026 Context — The 4-Year Presidential Cycle
Q1 2026
Modest gains expected. Market still in recovery mode from 2025.
Q2–Q3 2026 ← WE ARE HERE
THE WEAK SPOT. Dow avg −2.0%, S&P −2.5%, Nasdaq −6.6% historically.
Q4 2026
THE SWEET SPOT begins. Historical rally as midterm results known.
Q1–Q2 2027
Pre-election year — historically the best two quarters of the 4-year cycle.

Monthly Vital Statistics — S&P 500 (1950–2025)

Every sprint, check which month you are in and read the corresponding row. The Midterm Year column is especially important in 2026 — it overrides the normal average for the year.

Month | S&P Rank | S&P Avg % | Nasdaq Avg % | Russell 2K Avg % | S&P Midterm Year Avg | Key Pattern Note
January | #5 | +1.1% | +1.5% | +1.2% | varies | January Barometer — as January goes, so goes the year (historically 75% accurate)
February | #11 | −0.02% | +0.3% | +0.5% | −1.2% | Weakest month. Post-January hangover often.
March | #6 | +1.0% | +1.2% | +1.5% | +0.5% | Best Six Months period (Nov–Apr) ending soon. Q1 tax-related buying.
April | #2 | +1.4% | +1.6% | +1.8% | +0.8% | 2nd best month historically. Last month of Best Six Months. Strong Q1 earnings season.
May ← Current | #8 | +0.3% | +1.1% | +1.3% | −0.7% | Worst Six Months begins. "Sell in May" pattern. Midterm year historically negative. Check PDF for week-specific notes.
June | #9 | +0.2% | +1.0% | +0.8% | −2.1% | Summer doldrums begin. Week after June Triple-Witching Dow down 28 of 34! Nasdaq Best 8 Months ends.
July | #4 | +1.3% | +1.8% | +1.5% | −1.0% | Best month of Q3 historically. First month of quarter effect. But midterm year drags it down.
August | #10 | +0.02% | +0.5% | +0.3% | −1.8% | Summer doldrums peak. Very low volume. Small moves can look larger than they are.
September ⚠️ | #12 | −0.7% | −0.8% | −1.0% | −2.5% | Worst month of the year. Midterm year makes it worse. Historically the highest crash risk month.
October | #7 | +0.9% | +1.0% | +1.2% | +2.8% | Bear-killer month. Midterm bottoms historically occur in October. Q4 Sweet Spot starts. 10 of last 16 bear markets bottomed in midterm October.
November ★ | #1 | +1.9% | +2.5% | +3.1% | +4.2% | Best month of the year. Best Six Months begins. Post-midterm election rally. Santa Claus Rally setup.
December ★ | #3 | +1.4% | +1.8% | +2.2% | +3.8% | Santa Claus Rally (last 5 + first 2 trading days). Year-end window dressing. 3rd best month.

Sector Seasonality Quick Reference (Active Signals for May–July 2026)

ETF / Sector | Signal | Window (B = early, M = mid, E = late in the month) | 25-yr Avg | What it means for your prediction
XLK (Tech) | LONG | Mar M → Jul M | +10.9% | Still in seasonal long window. Support your tech bullish call with this until mid-July.
XLU (Utilities) | LONG | Mar M → Oct B | +9.3% | Seasonal long BUT yields at 4.60% override this. Contradiction to document in your output.
XLF (Financials) | SHORT | May B → Jul B | −6.3% | Active seasonal headwind. Supports bearish call on financials through July.
Gold/XAU | SHORT | May M → Jun E | −6.8% | Seasonal headwind BUT gold near all-time highs. Contradiction: seasonal says down, price says strong. Document both.
XLB (Materials) | SHORT | May M → Oct M | −5.1% | 6-month seasonal short window active. Consistent with global growth slowdown concern.
XOI/XLE (Oil) | SHORT | Jun B → Aug E | −5.7% | Begins in early June. Not yet active in May, but starting next sprint.
XLV (Healthcare) | LONG | Oct B → May B | +8.7% | Window technically ended early May. Healthcare is now seasonally neutral, not a strong call either way.
XLK (Tech) | LONG | Aug M → Jan M | +9.7% | Second seasonal long window for tech begins August. Double bullish seasonal support for XLK in 2026.

The Almanac PDF is still your authoritative source for day-specific notes. This reference card gives you the monthly stats and sector seasonality. For notes like "Day after Memorial Day Dow down 8 of last 10" or "First Trading Day in May S&P up 19 of last 28" — those are in the daily calendar pages of the PDF and are not reproduced here. Open the PDF every weekend and read the calendar entries for the coming week.

📅

The Almanac Agent — How to Use It

Step-by-Step Guide

What is the Stock Trader's Almanac? Published every year since 1968, it is a data book of historical stock market patterns. It tells you what the market has done in each month, each week, and around specific dates over the past 50–75 years. It does not predict the future — it shows you what has historically happened and how often. Your job as the Almanac Agent is to read the relevant patterns for the coming week and turn them into a structured hypothesis — not a trading instruction.

⚠️ 2026 CONTEXT

Critical: 2026 is a Midterm Election Year — This Changes Everything

Before reading any monthly pattern, you must understand the 4-Year Presidential Cycle. It is the most important macro seasonal context in the entire Almanac.

Q2–Q3 2026 — The Weak Spot

The Almanac calls Q2–Q3 of every midterm year the "Weak Spot." Historical averages: Dow −2.0%, S&P −2.5%, Nasdaq −6.6% across these two quarters. 10 of the last 16 bear markets bottomed in a midterm year. We are now in this window.

May Midterm Year Stats

In midterm election years specifically, May ranks #8 Dow, #9 S&P, #6 Nasdaq — below average performance. Midterm year May average: Dow −0.6%, S&P −0.7%, Nasdaq −0.8%. This is a meaningful headwind vs. the normal May average.

Q4 2026 — The Sweet Spot Ahead

The good news: after the midterm weakness, Q4 2026 historically begins the "Sweet Spot" — the best period of the 4-year cycle. The Almanac's 2026 outlook: net year gain of 4–8% but front-loaded with pain in Q2–Q3.

How this affects your weekly Almanac output: When you read a bullish seasonal pattern for a specific week, you must weigh it against the broader midterm year headwind. A bullish Memorial Day week pattern is less convincing in a midterm year with elevated yields. Always state both the pattern and the cycle context.

May Data

May Vital Statistics — What the Almanac Actually Says

This is the raw data from the Almanac (page 65). Here is how to read every row.

Metric | DJIA | S&P 500 | Nasdaq | Russell 1K | Russell 2K | What it means for your output
Rank (best month) | #9 | #8 | #5 | #6 | #4 | May is mid-tier to below average for Dow/S&P, better for Nasdaq and small caps. Not a strong seasonal month.
Up / Down (75 yrs) | 41 / 34 | 46 / 29 | 33 / 21 | 32 / 14 | 29 / 17 | S&P up 61% of Mays. Slight bullish lean historically but not dominant. Russell 2K up 63%: small caps tend to do better in May.
Avg % Change | −0.02% | +0.3% | +1.1% | +0.9% | +1.3% | Nasdaq and small caps historically do much better in May than the Dow. Tech/growth seasonal tilt in May.
Midterm Yr Avg % | −0.6% | −0.7% | −0.8% | +0.1% | −1.0% | ⚠️ This row matters most for 2026. In midterm years specifically, May is negative for all indices except Russell 1K. This is your key 2026 caveat.
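
The percentages quoted in the Up / Down row are simple ratios, and it is worth recomputing them yourself before you copy them into your output. A minimal sketch, using only the counts from the table above (the dictionary and function names are illustrative, not part of any course tool):

```python
# Up/Down counts for May, copied from the Vital Statistics table above.
MAY_UP_DOWN = {
    "DJIA": (41, 34),
    "S&P 500": (46, 29),
    "Nasdaq": (33, 21),
    "Russell 1K": (32, 14),
    "Russell 2K": (29, 17),
}

def up_rate(up: int, down: int) -> float:
    """Share of years the index closed the month higher, as a percentage."""
    return 100.0 * up / (up + down)

for index, (up, down) in MAY_UP_DOWN.items():
    years = up + down
    print(f"{index}: up {up} of {years} Mays ({up_rate(up, down):.0f}%)")
```

Note that only the DJIA and S&P 500 counts add up to 75 years; the Nasdaq and Russell counts cover shorter histories, so state the number of years next to each percentage.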

Key Week-Level Patterns for Late May (Memorial Day Week)

Memorial Day Week — Bearish Lean

Dow down 17 of the last 29 Memorial Day weeks (about 59%). This is a moderate bearish tendency, not a strong one. An earlier bullish run of 12 straight years (1984–1995) has given way to a poor recent record.

Day After Memorial Day — Mixed

Dow up 23 of last 39 — marginally bullish. But down 8 of the last 10 years. The recent trend has turned bearish for the Tuesday re-open after the long weekend.

Friday Before Memorial Day — Flat

Dow mixed, up 13 down 13, average −0.05%. Light volume, traders leaving early. Not a directional signal — just expect thin, directionless trading.

Week After Options Expiration — Bullish

S&P up 30 of last 45, avg +0.40%. This is actually one of the stronger week-level patterns in May. We are currently in this window after the May options expiration.

Almanac note directly relevant to Week 2: "Better to reposition in May than to sell in May and go away." The Almanac is nuanced — it does not say May is catastrophic, just below average. The real "Worst Six Months" warning applies to the full May–October period, not necessarily every single week.

Sectors

Sector Seasonality — What the Almanac Says About Sectors in May–June

The Almanac has a dedicated Sector Seasonality table (page 94). These are average returns when you enter/exit sectors at historically optimal times. Here are the patterns most relevant to late May / early June:

Sector | Seasonal Type | Start | End | 25-yr Avg Return | What this means for your prediction
Banking (BKX) | SHORT (bearish) | May (early) | July (early) | −6.3% | Financials historically weak from early May through early July. Almanac says avoid/underweight banks in this window.
Gold & Silver (XAU) | SHORT (bearish) | May (mid) | June (late) | −6.8% | Gold seasonally weak mid-May through late June. Note: this is the seasonal pattern; actual gold is near all-time highs, which is a contradiction to document.
Materials (S5MATR) | SHORT (bearish) | May (mid) | Oct (mid) | −5.1% | Materials sector historically one of the weakest in the May–October worst six months period.
Oil (XOI) | SHORT (bearish) | June (early) | Aug (late) | −5.7% | Oil sector seasonal weakness begins in early June. Relevant for Week 3+ predictions.
Info Tech (S5INFT) | LONG (bullish) | March (mid) | July (mid) | +10.9% | Technology still in its seasonal long window through July. This is a bullish sector signal for tech/Nasdaq through June.
Utilities (UTY) | LONG (bullish) | March (mid) | Oct (early) | +9.3% | Utilities in a long seasonal window, but also being crushed by rising yields. This is a contradiction to document: seasonal says bullish, rates say bearish.
Healthcare (S5HLTH) | LONG (bullish) | Oct (early) | May (early) | +8.7% | Healthcare seasonal window technically ends in early May. You are at the tail end of a bullish period; note this in your output.

Key sector insight from Almanac for late May / early June 2026: The pattern strongly supports avoiding Banking and Materials, and still holding Technology (seasonal long window through July). The Gold/Silver seasonal short is notable given gold is near all-time highs — this contradiction between the seasonal pattern and actual price action is exactly the kind of nuance that belongs in your Almanac Agent output.
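
To check which seasonal windows are active on a given date, you can encode the table above and test the date against each window. The sketch below is one possible way to do it. The day numbers assigned to "early", "mid" and "late" are an assumption made for illustration; the Almanac itself does not define them this way.

```python
from datetime import date

# Assumed day-of-month for the "early / mid / late" labels (illustrative only).
PART_TO_DAY = {"early": 1, "mid": 15, "late": 28}

# (sector, signal, start month, start part, end month, end part), from the table above.
WINDOWS = [
    ("Banking (BKX)", "SHORT", 5, "early", 7, "early"),
    ("Gold & Silver (XAU)", "SHORT", 5, "mid", 6, "late"),
    ("Materials (S5MATR)", "SHORT", 5, "mid", 10, "mid"),
    ("Oil (XOI)", "SHORT", 6, "early", 8, "late"),
    ("Info Tech (S5INFT)", "LONG", 3, "mid", 7, "mid"),
    ("Utilities (UTY)", "LONG", 3, "mid", 10, "early"),
    ("Healthcare (S5HLTH)", "LONG", 10, "early", 5, "early"),
]

def active_signals(day: date) -> list[tuple[str, str]]:
    """Return (sector, signal) pairs whose seasonal window contains `day`."""
    key = (day.month, day.day)
    active = []
    for sector, signal, m1, p1, m2, p2 in WINDOWS:
        start = (m1, PART_TO_DAY[p1])
        end = (m2, PART_TO_DAY[p2])
        if start <= end:
            inside = start <= key <= end
        else:
            # Window wraps over the year end (e.g. October to May).
            inside = key >= start or key <= end
        if inside:
            active.append((sector, signal))
    return active

for sector, signal in active_signals(date(2026, 5, 26)):
    print(f"{sector}: seasonal {signal} window active")
```

For 26 May this lists Banking, Gold & Silver, Materials, Info Tech and Utilities, and leaves out Oil (not yet started) and Healthcare (already ended), which matches the commentary in the table.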

★ HOW TO

How to Write Your Almanac Agent Output — Step by Step

Step 1 — Identify the relevant pages

Over the weekend (after US Friday market close), look up three things in the Almanac: (1) the Monthly Almanac page for the current month, (2) the relevant day-specific notes for the coming week, (3) the Sector Seasonality table on page 94. You do not need to read the whole book — just these targeted lookups.

Step 2 — Record the four key facts

Fact to record | Where to find it | Example for Week 2
Monthly rank and average % change | May Vital Statistics table, first two rows | May ranks #8 S&P, avg +0.3% (below-average month). Midterm year avg −0.7%.
Up/Down record for the month | May Vital Statistics, Up/Down row | S&P up 46, down 29 of 75 years (61% up). Slight bullish lean.
Specific week/day pattern | Calendar day notes in the Almanac daily pages | Memorial Day week: Dow down 17 of last 29. Day after: down 8 of last 10.
Relevant sector seasonalities | Sector Seasonality table, page 94 | Banking seasonal short began early May. Tech still in seasonal long window through July.

Step 3 — Assess pattern strength

Not all patterns are equal. Use this simple framework:

STRONG pattern
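Occurs 70%+ of the time across 20+ years. Example: "Week after June Triple-Witching, Dow down 28 of 34 (82%)." A pattern such as "Week after options expiration, S&P up 30 of 45 (67%)" falls just short of this threshold and counts as MODERATE.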
MODERATE pattern
Occurs 55–70% of the time. Useful context but not dominant. Example: "S&P up 61% of Mays historically."
WEAK / USE WITH CAUTION
Occurs <55% of the time, or recent years conflict with historical record. Example: "Memorial Day week up 12 straight (1984–1995) but down 17 of last 29."
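
The three bands above can be written as a small function, which makes it harder to mislabel a pattern by eye. This is a sketch under stated assumptions: the function name is hypothetical, `total` (the number of observations) stands in for "years", and a pattern with a 70%+ hit rate but fewer than 20 observations is treated as MODERATE, which the framework does not spell out.

```python
def pattern_strength(hits: int, total: int, recent_conflict: bool = False) -> str:
    """Classify a seasonal pattern using the thresholds in the framework above.

    hits / total    -> how often the pattern occurred (e.g. 30 of 45)
    recent_conflict -> True when recent years contradict the long-run record
    """
    rate = hits / total
    if recent_conflict or rate < 0.55:
        return "WEAK"
    if rate >= 0.70 and total >= 20:
        return "STRONG"
    # Assumption: a high hit rate on a short history is still only MODERATE.
    return "MODERATE"

print(pattern_strength(30, 45))   # week after options expiration, S&P up 30 of 45
print(pattern_strength(46, 75))   # S&P up 46 of 75 Mays
print(pattern_strength(13, 26))   # Friday before Memorial Day, up 13 down 13
```

Set `recent_conflict=True` yourself when the recent record contradicts the long-run one; that judgment cannot be computed from a single hit count.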

Step 4 — Write the structured Almanac Agent output

This is what you paste into the synthesis prompt. Use this exact structure every week:

Almanac Agent Output Template: Week of 26 May 2026
MONTH: May 2026
CYCLE CONTEXT: Midterm election year (2026). Q2–Q3 is the "Weak Spot" of the 4-Year Cycle. Almanac forecasts tougher trading through Q3 before a Q4 "Sweet Spot" rally.

MONTHLY STATS:
 - S&P 500: ranks #8 of 12 months. Up 61% of the time. Avg +0.3% normally.
 - Midterm year May avg: −0.7% for S&P. This is the active context for 2026.
 - Nasdaq historically stronger in May: avg +1.1%, ranks #5.
 - Russell 2000: avg +1.3%, ranks #4 — best of the indices in normal Mays.

SPECIFIC WEEK PATTERN (Memorial Day Week, 26–30 May):
 - Memorial Day week: Dow down 17 of last 29. Bearish lean. Pattern strength: MODERATE.
 - Day after Memorial Day (Tue 27 May): Dow down 8 of last 10. Recent trend bearish.
 - Week after options expiration: S&P up 30 of 45, avg +0.40%. Bullish lean. Pattern strength: MODERATE.
 - These two patterns contradict each other. Net: mixed / slight bearish lean.

SECTOR SIGNALS:
 - Banking: seasonal SHORT window (May–July). Avoid Financials seasonally.
 - Technology: still in seasonal LONG window (March–July). Supports tech/Nasdaq.
 - Gold/Silver: seasonal SHORT begins mid-May. Contradicts current elevated gold price.
 - Materials: seasonal SHORT (May–October). Consistent with last week's poor sector performance.

ALMANAC SEASONAL BIAS: Cautiously bearish-neutral. Midterm year cycle and Memorial Day week pattern are headwinds. Week-after-expiration pattern is a mild tailwind. Technology seasonal long is the one clear positive.

PATTERN CONFIDENCE: LOW–MEDIUM. Two contradicting week-level patterns. Midterm year context dominates.

ALMANAC THESIS: "Seasonality suggests cautious/bearish-neutral for late May in a midterm year. Technology is the seasonal bright spot. Banking and Materials face seasonal headwinds. The Memorial Day week bearish tendency adds to the macro pressure from elevated yields. However, the week-after-options-expiration bullish pattern provides some offset. Confidence is low given conflicting signals."

Source: Stock Trader's Almanac 2026, pp. 65–66 (May Vital Statistics), p. 94 (Sector Seasonality), pp. 10–11 (2026 Outlook). Accessed: 24 May 2026.

The most important rule for the Almanac Agent: Never write "the Almanac says the market will go up this week." Always write "the seasonal pattern suggests X, with Y confidence, because Z — and this conflicts/aligns with the macro/technical evidence as follows." The Almanac gives you a probability context, not a certainty. Your job is to translate a historical pattern into a structured hypothesis with an honest confidence level.

📋 Quick Reference — Where to Look in the Almanac Each Week

What you need | Almanac page / section | What to read
Monthly overview and key bullet points | Monthly Almanac page (e.g. p.65 for May) | Read the bullet points at the top. These are the key historical footnotes. Copy the most relevant 2–3 into your output.
Monthly vital statistics table | Same page, bottom half | Read Rank, Up/Down count, Avg % Change, and Midterm Year Avg. Record all five indices.
Specific day/week notes | Daily calendar pages for that week | Look for any bold italicised notes in the calendar entries, e.g. "Memorial Day Week Dow Down 17 of Last 29."
Sector seasonality | p. 94 (Sector Seasonality table) | Look for any sector whose seasonal period starts or ends in the current month. Record Long/Short, start, end, and avg return.
4-Year Cycle context | pp. 10–11 (2026 Outlook) | Read once at the start of the trimester. Reference it every week as standing context. Do not re-read every week; just paste the key sentence into your output.
📈

The Technical Agent — Reading Charts

Education + How-To

You are technology students — charts are just data visualisation. Technical analysis is simply the practice of reading a price chart to understand whether buyers or sellers are currently in control, and where the price is likely to find support or resistance. You do not need to be a trader to do this. You need to observe, measure, and report. This section teaches you the four tools you will use every week: the 8 EMA, the 21 EMA, trendlines, and support/resistance levels.

Basics

What You Are Actually Looking At — Reading a Price Chart

The X axis

Time. Left is older data, right is the most recent. Each bar or candle = one day (on a daily chart) or one week (on a weekly chart). You will use daily charts for weekly predictions.

The Y axis

Price. Higher = more expensive. The current price is at the far right edge of the chart. The chart history shows you whether price has been trending up, down, or sideways.

Green vs Red bars/candles

Green (or white) = price closed higher than it opened that day. Red (or black) = price closed lower. A series of green candles = buyers in control. A series of red candles = sellers in control.

Volume bars (bottom)

The bars at the bottom of most charts show trading volume — how many shares traded that day. High volume on an up day = conviction. High volume on a down day = selling pressure. Low volume = indecision.

Where to get charts: Use ProRealTime (already in your toolkit) or TradingView.com (free, no login for basic charts). Search for SPX (S&P 500), NDX (Nasdaq 100), RUT (Russell 2000). Set the chart to Daily timeframe and look at 3–6 months of history.

Tool 1

The 8 EMA — Your Short-Term Momentum Signal

What is an EMA?

EMA stands for Exponential Moving Average. It calculates the average closing price over the last N days, but gives more weight to the most recent days. This makes it react faster to new price moves than a simple average.

The 8 EMA is the average of the last 8 trading days, weighted toward the most recent. On a chart it appears as a smooth curved line that follows price closely.
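
If you want to see where the number on your chart comes from, the EMA can be computed in a few lines. The sketch below uses the common recursive form with smoothing factor 2 / (N + 1), in which older closes never drop out entirely but their weight decays quickly. The closing prices are hypothetical, and the choice to seed the series with the first close is an assumption: ProRealTime and TradingView may seed differently, so the first few values can differ slightly from your chart.

```python
def ema(closes: list[float], period: int) -> list[float]:
    """Exponential moving average with smoothing factor 2 / (period + 1).

    Seeded with the first close (an assumption; platforms may seed with a
    simple average instead, which changes the earliest values).
    """
    alpha = 2.0 / (period + 1)
    values = [closes[0]]
    for price in closes[1:]:
        # New EMA = weight on today's close + remaining weight on yesterday's EMA.
        values.append(alpha * price + (1 - alpha) * values[-1])
    return values

# Hypothetical daily closes, for illustration only.
closes = [7400.0, 7410.0, 7395.0, 7420.0, 7440.0, 7435.0, 7460.0, 7473.0]
ema8 = ema(closes, 8)
print(f"Last close {closes[-1]:.0f}, 8 EMA {ema8[-1]:.1f}")
```

Calling `ema(closes, 21)` on the same data gives the slower line used in Tool 2. In practice you need several months of closes before either value settles close to what the charting platform shows.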

How to add it in ProRealTime / TradingView

  1. Open your chart for SPX (S&P 500)
  2. Click "Indicators" or the + icon
  3. Search for "EMA" or "Exponential Moving Average"
  4. Set the period to 8. Set colour to orange.
  5. Repeat and add a second EMA with period 21. Set colour to blue.
[Diagram: 8 EMA, how to read it (simplified chart). The 8 EMA is the orange line. Price ABOVE the 8 EMA = bullish momentum (a bounce off the EMA). Price BELOW the 8 EMA = caution (a break below).]
✅ BULLISH signals
  • Price is above the 8 EMA → buyers in control
  • Price pulls back to 8 EMA and bounces off it → healthy trend
  • 8 EMA is sloping upward → momentum building
⚠️ WARNING signals
  • Price breaks below the 8 EMA → short-term momentum failing
  • 8 EMA is flattening → momentum stalling
  • Price closes below 8 EMA for multiple days → watch the 21 EMA next
Tool 2

The 21 EMA — Your Medium-Term Trend Confirmation

The 21 EMA is the average of the last 21 trading days (approximately one month of trading). It moves more slowly than the 8 EMA. Think of it as the difference between your daily mood (8 EMA) and your general personality over the last month (21 EMA).

[Diagram: 8 EMA vs 21 EMA, the four conditions. Zone 1 Bullish: 8 EMA above 21 EMA, both rising. Zone 2 Caution: EMAs compressing, 8 EMA crossing the 21 EMA (crossover). Zone 3 Bearish: 8 EMA below 21 EMA, both falling. Zone 4 Recovery: 8 EMA crossing back above the 21 EMA.]
Zone 1 — Bullish Trend ✅

8 EMA above 21 EMA, both sloping up, price above both. Full momentum. This is where you want to see S&P 500 and Nasdaq during a bull run.

Output: "Technical: Bullish. 8/21 EMA aligned up."
Zone 2 — Compression ⚠️

The two EMAs are converging or crossing. This is a transition zone — the trend may be changing. Do not make a high-confidence call here.

Output: "Technical: Uncertain. EMAs compressing."
Zone 3 — Bearish Trend ❌

8 EMA below 21 EMA, both sloping down. Sellers in control. Price is likely to face resistance at both EMAs on any bounce.

Output: "Technical: Bearish. 8 EMA below 21 EMA."
Zone 4 — Recovery 🔄

8 EMA crossing back above 21 EMA from below. Potential trend reversal. Wait for confirmation — one cross does not confirm a new trend.

Output: "Technical: Recovering. Watch for follow-through."
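
The four zones can be told apart from two consecutive readings of each EMA (yesterday and today). The function below is a sketch, not a definitive rule set: the name, the 0.1% compression threshold, and the decision to treat any mixed reading as Zone 2 are all assumptions made for illustration.

```python
def ema_zone(ema8_prev: float, ema8_now: float,
             ema21_prev: float, ema21_now: float,
             compress_pct: float = 0.1) -> str:
    """Map two consecutive 8/21 EMA readings to one of the four zones.

    compress_pct is an assumed threshold: EMAs closer than this percentage
    of the 21 EMA are treated as compressing.
    """
    gap_pct = abs(ema8_now - ema21_now) / ema21_now * 100
    crossed_up = ema8_prev <= ema21_prev and ema8_now > ema21_now
    crossed_down = ema8_prev > ema21_prev and ema8_now <= ema21_now
    rising = ema8_now > ema8_prev and ema21_now > ema21_prev
    falling = ema8_now < ema8_prev and ema21_now < ema21_prev

    if crossed_up:
        return "Zone 4: Recovery"
    if crossed_down or gap_pct < compress_pct:
        return "Zone 2: Compression"
    if ema8_now > ema21_now and rising:
        return "Zone 1: Bullish"
    if ema8_now < ema21_now and falling:
        return "Zone 3: Bearish"
    # Assumption: mixed readings (e.g. 8 above 21 but not both rising)
    # are reported as uncertain rather than forced into a trend zone.
    return "Zone 2: Compression"

print(ema_zone(7410.0, 7420.0, 7375.0, 7380.0))
```

Remember that a Zone 4 reading is only a potential reversal; one cross does not confirm a new trend.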
Tool 3

Trendlines — Drawing the Channel Price is Moving In

A trendline is a straight line you draw on the chart connecting a series of highs or lows. It shows you the direction and angle of the trend. Think of it as drawing the "floor" (uptrend line) or "ceiling" (downtrend line) of price movement.

[Diagram: uptrend line vs downtrend line. UPTREND LINE (support): connect the LOWS (Low 1, Low 2, Low 3). The line acts as a floor; price stays ABOVE the line, and a break BELOW is a warning. DOWNTREND LINE (resistance): connect the HIGHS (High 1, High 2, High 3). The line acts as a ceiling; price stays BELOW the line.]

How to draw an uptrend line

  1. Identify at least two low points (price bounced up from these)
  2. Connect them with a straight line, extending it to the right
  3. If a third low touches the line: the trendline is confirmed
  4. If price closes below this line: potential trend break — flag it
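
The steps above amount to fitting a straight line through two lows and reading off its value on a later day. A minimal sketch: the bar indices (trading days counted from the first low) are hypothetical, and the lows reuse the illustrative levels from the Technical Agent output template on this page.

```python
def trendline_value(x1: int, y1: float, x2: int, y2: float, x: int) -> float:
    """Price of the straight line through (x1, y1) and (x2, y2) at bar index x."""
    slope = (y2 - y1) / (x2 - x1)
    return y1 + slope * (x - x1)

def is_confirmed_break(close: float, line_price: float) -> bool:
    """A break counts only when the daily CLOSE is below the line."""
    return close < line_price

# Low 1 at bar 0 (~7,050) and Low 2 at bar 20 (~7,200); project to bar 40.
support = trendline_value(0, 7050.0, 20, 7200.0, 40)
print(f"Trendline support at bar 40: ~{support:,.0f}")
print("Break confirmed:", is_confirmed_break(7473.0, support))
```

If a third low lands close to the projected value, that is the confirmation described in step 3. Only compare the closing price with the line, not the intraday low.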

What to report in your Technical Agent output

  • Is price in an uptrend, downtrend, or sideways channel?
  • Where is the trendline currently? (e.g. "support at ~7,350")
  • Is price approaching, bouncing off, or breaking the trendline?
  • Has there been a confirmed break? (needs a closing price, not just an intraday dip)
Tool 4

Support & Resistance — The Price Levels That Matter

Support is a price level where buying has historically been strong enough to stop price from falling further. Resistance is a price level where selling has historically been strong enough to stop price from rising further. These levels matter because many market participants are watching the same levels simultaneously.

[Diagram: support becoming resistance, and vice versa. Price bounces at support, then breaks through resistance: old resistance becomes new support. Price breaks below support: that level now acts as resistance.]
How to identify key levels
  • Look for price levels where the chart has bounced multiple times
  • Round numbers (e.g. 7,000; 7,500) often act as psychological levels
  • Recent highs and recent lows are the most important levels to track
The flip rule

When price breaks through a resistance level, that level often flips and becomes the new support. When price breaks below a support level, that level often flips and becomes the new resistance. This is one of the most reliable patterns in charts.

For your weekly output

State 2–3 specific price levels for S&P 500 and Nasdaq. For each level say: is it support or resistance? How many times has price touched it? Is price currently approaching it, above it, or below it?
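
The three questions for each level (support or resistance, number of touches, where price is now) can be answered mechanically once you have the daily highs and lows. In this sketch the function name, the 10-point tolerance and the definition of a "touch" are assumptions, and the bars are hypothetical.

```python
def describe_level(level: float, bars: list[tuple[float, float]],
                   last_close: float, tolerance: float = 10.0) -> str:
    """Summarise one price level for the weekly output.

    bars: (high, low) per day. A "touch" is assumed to mean that the day's
    high or low came within `tolerance` points of the level.
    """
    touches = sum(
        1 for high, low in bars
        if abs(high - level) <= tolerance or abs(low - level) <= tolerance
    )
    # A level above the last close is overhead resistance; below it is support.
    role = "support" if last_close > level else "resistance"
    diff = last_close - level
    return f"{level:,.0f} | {role} | touches: {touches} | last close is {diff:+,.0f} pts vs level"

# Hypothetical (high, low) bars, for illustration only.
bars = [(7498.0, 7450.0), (7505.0, 7460.0), (7470.0, 7420.0)]
print(describe_level(7500.0, bars, last_close=7473.0))
```

Run it once per level you plan to report, then check the result against the chart: a tolerance that is too wide will count touches that a human reader would not.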

★ OUTPUT

Technical Agent Output Template — What to Write Every Week

This is the structured output you paste into the LLM synthesis prompt. Use this format every single week — consistency makes comparison across weeks meaningful.

Technical Agent Output: Week of 26 May 2026 (S&P 500 example)
INSTRUMENT: S&P 500 (SPX), Daily Chart
LAST CLOSE: 7,473 (Fri 23 May 2026)

8 EMA vs PRICE:
 - Price is ABOVE the 8 EMA. Momentum intact short-term.
 - 8 EMA estimated at ~7,420. Price is ~53 points above it — healthy gap.

8 EMA vs 21 EMA:
 - 8 EMA is ABOVE 21 EMA. Trend structure bullish.
 - 21 EMA estimated at ~7,380. Gap between 8 and 21 EMA = ~40 pts, not compressing.
 - EMA condition: Zone 1 (Bullish) — both rising, price above both.

TRENDLINE:
 - Uptrend line drawn from March 2026 lows, connecting lows at ~7,050, ~7,200, ~7,350.
 - Current trendline support: approximately 7,330–7,360 on the coming week.
 - Price is above the trendline. No break detected.

KEY LEVELS:
 - Resistance 1: 7,500 (round number, prior intraweek high). Not yet broken.
 - Resistance 2: 7,550 (prior weekly closing high from April). Would be new high.
 - Support 1: 7,350 (confluence of 21 EMA + trendline). Key level to hold.
 - Support 2: 7,200 (major prior breakout level, would flip back to support).

BREADTH NOTE:
 - Only 57% of S&P 500 stocks above their 200-day MA. Narrow rally — caution.
 - Russell 2000 outperformed this week (+2.7%). Broadening signal, but needs follow-through.

TECHNICAL BIAS: Neutral-Bullish, with caution on breadth.
CONFIDENCE: Medium. Structure intact but narrow breadth and proximity to resistance reduce conviction.
INVALIDATION: Close below 7,350 (trendline + 21 EMA confluence). That would shift bias to Bearish.
WATCH THIS WEEK: Can price break and hold above 7,500? Does Russell 2000 confirm above 2,900?

The one rule for Technical Agent: Never say "the chart looks good." State specific levels, specific EMA positions, and a specific invalidation condition. "Bullish while above 7,350" is a complete technical statement. "Looks bullish" tells the synthesis agent nothing.

🧠

The Human Score — Where Your Team's Thinking Matters Most

THE DIFFERENTIATOR
The Core Problem With AI-Only Outputs

If every team follows the same prompt template and pastes results from the same four AI models, every team's prediction will look almost identical. There is no differentiation. No thinking. No grade separation. Just formatted AI output.

The Human Score is the section where your team disagrees with, adjusts, or overrides the AI consensus using your own reasoning. This is where marks are earned. This is what makes your team's prediction yours.

Why

Why AI Cannot Replace Your Judgment

🤖 What AI does
  • Averages patterns from past data it was trained on
  • Produces plausible-sounding text based on the prompt it received
  • Has a knowledge cutoff — it cannot know what happened last week unless you tell it
  • Tends toward the "safe" answer — moderate confidence, neutral to slight direction
🧠 What you can do that AI cannot
  • Weigh a specific geopolitical event (e.g. Iran deal) that happened this week
  • Notice that the chart pattern is unusually clear or unusually ambiguous right now
  • Decide that one piece of evidence is more important than others in this specific context
  • Disagree with all four AIs and explain precisely why — and be right
⚡ What happens when AI is wrong
  • All four models may miss a regime change because it does not match past patterns
  • All four models may anchor on the most recent data in your prompt
  • All four models may be confidently wrong in the same direction
  • The team that spotted the divergence and overrode the AI consensus — and was right — learns the most
🧠 FRAMEWORK

The Human Score — A Five-Dimension Judgment Framework

After completing the multi-LLM comparison table, your team independently scores each dimension on a scale of −2 to +2. These scores reflect your human judgment — informed by the AI outputs but not determined by them. Then you write a one-sentence justification for any score where you differ significantly from the AI consensus.

Dimension | −2 | −1 | 0 | +1 | +2 | Your Score | AI Consensus | Difference?
Macro / News Weight (How strongly does the macro environment drive direction this week?) | Strongly bearish macro | Mildly bearish | Neutral / mixed | Mildly bullish | Strongly bullish macro | _________ | _________ | _________
Technical Structure (What does the chart tell you independent of any AI?) | Broken down, below both EMAs | Weak, compressing | Mixed signals | Above both EMAs, intact | Clear uptrend, all signals aligned | _________ | _________ | _________
Almanac Seasonal Weight (How much should seasonality influence this week's call?) | Strong bearish seasonal | Mild bearish seasonal | Weak or conflicting pattern | Mild bullish seasonal | Strong bullish seasonal | _________ | _________ | _________
AI Model Agreement Quality (How much do you trust the AI consensus this week?) | All models agree but their evidence seems wrong to you | Models agree but you are sceptical | Models split, you cannot break the tie | Models agree and you agree | Models agree, evidence very clear | _________ | _________ | _________
Wild Card / Human Observation (Is there something you noticed that no AI picked up?) | Strong bearish factor AI missed | Minor bearish factor AI missed | Nothing extra to add | Minor bullish factor AI missed | Strong bullish factor AI missed | _________ | _________ | _________
Human Score Total = sum of 5 dimensions
+6 to +10 → Team calls BULLISH override
+2 to +5  → Team leans Neutral-Bullish
−1 to +1  → Team calls NEUTRAL / UNCERTAIN
−5 to −2 → Team leans Neutral-Bearish
−10 to −6 → Team calls BEARISH override
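
The score bands above translate directly into a lookup. The sketch below is illustrative only (the function name and dictionary keys are hypothetical); it sums the five dimension scores and returns the matching call, rejecting any input that is not five scores in the −2 to +2 range.

```python
def human_call(scores: dict[str, int]) -> tuple[int, str]:
    """Sum five dimension scores (each -2..+2) and map the total to a call."""
    if len(scores) != 5 or any(not -2 <= s <= 2 for s in scores.values()):
        raise ValueError("expected five scores, each between -2 and +2")
    total = sum(scores.values())
    if total >= 6:
        call = "BULLISH override"
    elif total >= 2:
        call = "Neutral-Bullish"
    elif total >= -1:
        call = "NEUTRAL / UNCERTAIN"
    elif total >= -5:
        call = "Neutral-Bearish"
    else:
        call = "BEARISH override"
    return total, call

# Example scores for illustration.
week = {"macro": -1, "technical": 1, "almanac": -1, "ai_quality": 0, "wild_card": -1}
total, call = human_call(week)
print(f"Human Score total {total:+d}: {call}")
```

The total only gives you the direction of the call. You still have to write the justification for every dimension where you differ from the AI consensus.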
When your score differs from the AI consensus

If your Human Score points in a different direction than the AI consensus, you must write one paragraph explaining: what you saw that the AI did not weight correctly, and why you are making a different call. This paragraph is what separates your team's analysis from every other team's.

★ EXAMPLE

Worked Example — Human Score for Week 2 (26 May 2026)

The AI consensus (3 of 4 models) said Neutral-Bullish, S&P +0.3% to +0.9%. Here is how one team applied human judgment to arrive at a different final call.

Dimension | AI Said | Team Score | Team Reasoning
Macro / News Weight | +1 (mildly bullish, oil drop) | −1 | The AI focused on the oil price drop as good news. Our team noted that the Iran deal is NOT confirmed: if it falls through, oil spikes $6+ in a single day. That binary risk means the macro environment is more fragile than the AI's +1 implies. We scored it −1 for unresolved binary risk.
Technical Structure | +1 (price above EMAs) | +1 | Agreed. EMAs intact, trendline holding. No disagreement here. But we noted the 57% breadth figure and flagged it as a warning, consistent with DeepSeek's concern, which the other three models underweighted.
Almanac Seasonal Weight | 0 (mixed signals) | −1 | The AI called it mixed/neutral. But we specifically looked at the midterm year row in the May Vital Statistics: −0.7% average for S&P in midterm Mays. Memorial Day week down 17 of last 29. The AI did not give the midterm context enough weight. We scored it −1.
AI Model Agreement Quality | +1 (3 of 4 agree) | 0 | 3 of 4 models agreed but their % estimates ranged from −0.5% to +1.0%, a 1.5 percentage point spread. That is wide. Agreement on direction is not the same as agreement on magnitude. We reduced this to 0 rather than +1.
Wild Card / Human Observation | 0 (nothing flagged) | −1 | This is the key human observation no AI raised: The Friday PCE (Personal Consumption Expenditures, the Fed's preferred inflation measure) print releases into a short 4-day holiday week with lower volume. Low-volume markets can exaggerate moves in both directions. A hot PCE in a thin market could cause an outsized selloff. None of the four AIs mentioned the volume-thinness amplification effect. We scored it −1.
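
The AI Model Agreement Quality row turns on a simple distinction: how many models share a direction versus how far apart their numbers are. A small sketch of that check follows. The per-model values are hypothetical; only the lowest and highest (−0.5% and +1.0%) come from the worked example, and the model labels are placeholders.

```python
# Hypothetical per-model S&P 500 estimates (%). Only the minimum and maximum
# match the range quoted in the worked example; the middle values are invented.
estimates = {"Model A": -0.5, "Model B": 0.4, "Model C": 0.6, "Model D": 1.0}

def agreement_summary(estimates: dict[str, float]) -> tuple[int, float]:
    """Return (models sharing the majority direction, spread in percentage points)."""
    ups = sum(1 for v in estimates.values() if v > 0)
    downs = sum(1 for v in estimates.values() if v < 0)
    spread = max(estimates.values()) - min(estimates.values())
    return max(ups, downs), spread

majority, spread = agreement_summary(estimates)
print(f"{majority} of {len(estimates)} models agree on direction; spread = {spread:.1f} pts")
```

A high majority count with a wide spread is the situation the team describes: agreement on direction, disagreement on magnitude.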
Team Alpha — Human Score Summary — Week 2

HUMAN SCORE TOTAL: −2 (Macro −1 + Technical +1 + Almanac −1 + AI Quality 0 + Wild Card −1)

HUMAN CALL: Neutral-to-Cautious. We are slightly more cautious than the AI consensus of Neutral-Bullish.

OVERRIDE PARAGRAPH: The AI models collectively underweighted three factors our team considers significant this week: (1) the Iran deal remains unconfirmed, making the oil-driven macro relief fragile and binary; (2) the midterm year seasonal context in the Almanac specifically shows negative May averages in years like this one; (3) the PCE data releases on Friday into a shortened, lower-volume holiday week — a condition that historically amplifies price moves. For these reasons, our team adjusts the final regime call from Neutral-Bullish (AI consensus) to Neutral-Cautious, and reduces our S&P 500 % range to −0.2% to +0.6% vs. the AI consensus of +0.3% to +0.9%.

Note: This override may be wrong. If PCE comes in cool and Iran confirms a deal, the AI consensus will have been right. We document our reasoning so we can learn from the outcome either way.

Why this team scores higher than a team that just copies the AI output: They showed independent reasoning on three specific dimensions. They identified a factor (low-volume holiday week amplification of PCE) that none of the four AI models raised. They quantified their disagreement (+/−) and explained it. They adjusted the final prediction specifically — not just the words, but the actual % range. And they acknowledged they might be wrong while explaining why they made the call anyway. That is what thinking looks like.

What happens when your human override is wrong? You still score well — because the scoring system rewards calibration and reasoning quality, not accuracy. A team that said "we are cautious because of X, Y, Z" and was wrong about direction but right to be cautious (the week was flat) scores well. A team that confidently said "bullish" with no caveats and was wrong scores poorly. Being wrong with good reasoning is better than being right by accident.

🧠 Human Score Checklist: Complete Over the Weekend (after US Friday market close), Before Submitting

Did you read the AI outputs before scoring, not after?
Did you score all 5 dimensions independently as a team?
Is there any dimension where you disagree with the AI? (There should usually be at least one.)
Did you write a specific justification for any override or disagreement?
Is your final prediction number different from the AI average if your Human Score differs?
Did you include the Wild Card — something you noticed that the AIs did not raise?
👥

Scrum Roles — Ten Students, Ten Responsibilities

Agile Structure

Each group has approximately 10 students. Every student has a named role. No role is ownerless. Your role determines what you produce, what you present on Monday, and what evidence you commit to GitHub. Roles are fixed for the trimester but responsibilities are shared — everyone understands every agent.

All Roles

The Ten Roles — Who Does What

# | Role | Scrum Category | Weekly Deliverable | Presents On Monday | GitHub Evidence
R1 | Product Owner | Scrum Core | Sprint goal statement. Acceptance criteria for the week's prediction brief. Backlog priority decisions. | Opens the presentation. States the sprint goal and what "done" means this week. | sprint_goal_WXX.md; acceptance_criteria.md
R2 | Scrum Master | Scrum Core | Facilitated stand-up notes (3× per week). Impediment log. Sprint retrospective summary. | Presents retrospective: what worked, what failed, what changes next sprint. | standup_WXX.md; retrospective_WXX.md
R3 | Almanac Agent Lead | Agent | Completed Almanac Agent output using the template on this site. Monthly stats, week pattern, sector seasonality, cycle context, bias statement. | Presents Almanac output. Explains which seasonal pattern was most relevant and why. | almanac_agent_WXX.md
R4 | Macro / News Agent Lead | Agent | FedWatch direction, rates, dollar, oil, economic calendar summary, key news events ranked by market impact. | Presents macro output. States the single most important macro driver for the coming week. | macro_agent_WXX.md
R5 | Technical Agent Lead | Agent | Chart analysis for S&P 500, Nasdaq, Russell 2000. 8/21 EMA condition, trendline status, key support/resistance levels, technical bias. | Presents charts live. Shows EMA positions, trendline, and key levels with annotations. | technical_agent_WXX.md; charts/ folder with screenshots
R6 | LLM Synthesis Operator | Synthesis | All four LLMs queried with an identical prompt. Comparison table completed. Raw responses stored in GitHub. | Presents the comparison table. Highlights where models agreed and where they diverged. | synthesis_claude_WXX.txt; synthesis_chatgpt_WXX.txt; synthesis_gemini_WXX.txt; synthesis_deepseek_WXX.txt; llm_comparison_WXX.md
R7 | Human Score Analyst | Synthesis | Five-dimension Human Score table. Override paragraph where team judgment differs from AI consensus. Final adjusted prediction with confidence band. | Presents the Human Score and the team's final call. Must explain any override of the AI consensus. This is the most important individual presentation slot. | human_score_WXX.md
R8 | Data & Evidence Lead | Quality | All data sourced and cited with access date. Screenshots of Finviz and Yahoo Finance taken Monday and Friday. Evidence folder organised and complete. | Presents last week's actuals vs. this week's predictions. Shows the data trail clearly. | actuals_WXX.md; evidence/ folder; finviz_1W_WXX.png; yahoo_sectors_5D_WXX.png
R9 | GitHub & Integration Lead | Quality | All files committed. Branches used for draft work. Pull request merged by Saturday evening SGT. README (a text file in a GitHub repository that explains the project, how to use it, and its current status) updated. Release tag created. | Shows GitHub commit history live. Confirms all evidence is in the repository and accessible. | README.md; all commits + PRs; release tag vWXX
R10 | QA (Quality Assurance: the process of checking that outputs meet defined standards) & Learning Log Lead | Quality | Calibration score for last week calculated. LLM horse race updated. Learning log entry: what the team believed, what happened, what changes next week. | Closes the presentation. Announces calibration score, LLM horse race standing, and one thing the team will do differently next sprint. | calibration_log.md; llm_horserace.md; learning_log_WXX.md

Role Deep-Dives — What Each Person Actually Does

🎯
R1 · Scrum Core

Product Owner

You define what "done" looks like for each sprint. Before Monday class you write a one-paragraph sprint goal: what is the team trying to learn this week? What question is the prediction trying to answer? You also maintain the backlog — a list of things the team could improve — and you prioritise it.

Key question to answer every sprint: "What is the most important thing we can improve in our prediction workflow this week?"

🔄
R2 · Scrum Core

Scrum Master

You run the stand-ups (done / doing / blocked — maximum 10 minutes each). You remove blockers — if someone cannot get data or cannot access a tool, you solve it. You do not do their work, you unblock it. You write the retrospective: what went well, what went badly, what changes next sprint.

Critical rule: Stand-up is about blockers, not status reports. If nobody is blocked, it should take 3 minutes.

📅
R3 · Agent

Almanac Agent Lead

Every Saturday (over the weekend, after the US Friday close) you open the Almanac, look up the current month's vital statistics, read the day-specific notes for the coming week, and check the sector seasonality table. You write the structured Almanac Agent output using the template on this site. You do not just copy the Almanac; you interpret it in the context of the current macro environment.

Key output sentence: "Seasonality suggests _____, with _____ confidence, because _____, but this conflicts/aligns with current conditions because _____."

📰
R4 · Agent

Macro / News Agent Lead

Over the weekend (after US Friday market close) you check CME FedWatch, the 10-year yield level, DXY (US Dollar Index — measures the dollar against a basket of major currencies), oil price, and the week-ahead economic calendar on TradingEconomics. You identify the single most market-moving event of the coming week and explain whether it is likely to be a bullish, bearish, or binary-risk catalyst. You rank your evidence by confidence.

Key discipline: Separate confirmed facts (rate is 3.75%) from expectations (market prices in 96% hold) from opinions (analysts think…). Label all three differently.

📈
R5 · Agent

Technical Agent Lead

Over the weekend (after US Friday market close) you open ProRealTime or TradingView, pull up S&P 500, Nasdaq, and Russell 2000 daily charts, and read the 8/21 EMA condition, trendline status, and key support/resistance levels. You annotate screenshots and commit them to GitHub. You use the Technical Agent output template on this site.

Non-negotiable: State a specific invalidation level every week. "Bullish while above 7,350" is complete. "Looks bullish" is not acceptable.
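If you want to sanity-check the chart reading against numbers, the sketch below computes the 8 and 21 EMAs from a list of daily closes and tests the same conditions you state in your output. It assumes the standard smoothing factor 2 / (period + 1) and seeds the series with the first close; ProRealTime and TradingView may seed differently, so early values can differ slightly. The function names and the sample closes are illustrative only.

```python
def ema(values, period):
    """Exponential moving average with smoothing 2 / (period + 1).

    Assumption: the series is seeded with the first value.
    """
    alpha = 2 / (period + 1)
    out = [values[0]]
    for price in values[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


def ema_condition(closes, invalidation):
    """Summarise the 8/21 EMA condition for a list of daily closes."""
    ema8 = ema(closes, 8)[-1]
    ema21 = ema(closes, 21)[-1]
    last = closes[-1]
    return {
        "price_above_ema8": last > ema8,
        "ema8_above_ema21": ema8 > ema21,
        "above_invalidation": last > invalidation,
    }


# Hypothetical closes: a steady uptrend, not real market data.
closes = [7300 + 5 * i for i in range(40)]
print(ema_condition(closes, invalidation=7350))
```

The chart remains the primary artefact; a check like this only confirms that what you point at on screen matches the underlying closes.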

🤖
R6 · Synthesis

LLM Synthesis Operator

You run the synthesis workflow. Once R3, R4, and R5 have their outputs ready, you paste them into the shared prompt template and query all four AI models (Claude, ChatGPT, Gemini, DeepSeek). You save every raw response as a text file in GitHub. You complete the comparison table. You hand the table to R7 before they write the Human Score.

Quality check: Were the prompts identical for all four models? If not, the comparison is invalid. Do not change a word between models.

🧠
R7 · Synthesis

Human Score Analyst

This is the most intellectually demanding role. After seeing the LLM comparison table, you lead the team discussion on the five Human Score dimensions. You write the override paragraph if your team's judgment differs from the AI consensus. You own the final prediction number. Your job is to think — not to summarise what the AI said.

The grade question: "What did our team see that none of the four AIs raised?" If you cannot answer this, your Human Score is incomplete.

🗂️
R8 · Quality

Data & Evidence Lead

You take the Finviz (1W) and Yahoo Finance Sectors (5D) screenshots over the weekend, as soon as the US Friday close is final. You record the Friday close actuals. You maintain the evidence folder in GitHub with consistent file naming. You are the team's data quality officer: if a number cannot be traced to a source with a date, it does not go in the brief.

File naming discipline: Every evidence file includes the week number and date. Example: finviz_1W_2026-W22_Mon.png
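One way to keep names consistent is to generate them rather than type them. The helper below is a sketch that reproduces the example pattern above; it assumes the WXX part is the ISO calendar week of the screenshot date, which you should confirm against your team's convention. The function name and parameters are hypothetical.

```python
from datetime import date


def evidence_filename(source, view, day, ext="png"):
    """Build a name like finviz_1W_2026-W22_Mon.png.

    Assumption: WXX is the ISO calendar week of `day`.
    """
    iso_year, iso_week, _ = day.isocalendar()
    return f"{source}_{view}_{iso_year}-W{iso_week:02d}_{day:%a}.{ext}"


print(evidence_filename("finviz", "1W", date(2026, 5, 25)))
# -> finviz_1W_2026-W22_Mon.png
```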

🐙
R9 · Quality

GitHub & Integration Lead

You own the repository. You create branches for draft work (e.g. sprint/W22-technical), review pull requests before merging, keep the README current, and create a release tag (vW22) over the weekend before class. If the repo is messy, your score suffers. If someone cannot find the evidence, that is your problem to fix.

Before Monday class: Is the README up to date? Are all files merged to main? Is the release tag created? If not, fix it before walking into class.

R10 · Quality

QA & Learning Log Lead

Over the weekend (after US Friday market close) you score last week's prediction using the calibration scoring table on this site. You update the LLM horse race tracker (which model was closest to the actual S&P move?). You write the learning log entry: what did the team predict, what actually happened, what was confusing, and what changes next sprint.

The most important line in the learning log: "We were wrong about _____ because we underestimated _____. Next sprint we will _____."

Structure

Practical A (3 Groups) and Practical B (2 Groups) — Same Standard, Same Expectations

Practical A — 3 Groups

Three groups present back-to-back. Each group has a 20–25 minute slot. With three groups that is 60–75 minutes of presentations. The remaining 45–60 minutes is used for cross-group discussion, sprint planning, GitHub work, and agent preparation. Every role holder speaks — no silent passengers.

Practical B — 2 Groups

Two groups present. Each group has a 20–25 minute slot. With two groups that is 40–50 minutes of presentations. The remaining 70–80 minutes allows for deep cross-group discussion — groups interrogate each other's SPX/NDX/IWM predictions, Human Score reasoning, and what the AI models got right or wrong.

Cross-group challenge rule: After each presentation, any student from another group can ask one question. The question must be about the reasoning, not the outcome. "Why did you score Technical +1 when breadth was only 57%?" is a good question. "Were you right?" is not — you find that out on Friday.

🎤

The 20–25 Minute Sprint Presentation

Every Monday

This is not a slide deck presentation. You present live evidence — your GitHub repository, your charts, your comparison table, your data. You can use slides to organise your points but the primary artefact is what is in GitHub. If it is not committed, it does not exist.

⏱️ TIMING

The 22-Minute Presentation — Slot by Slot (+ 3 min buffer/questions)

0:00–1:30
90 sec
R1 — Product Owner
Sprint Goal & Last Week's Score

State this sprint's goal in one sentence. Then immediately state last week's calibration score (R10 hands you the number). Example: "Our goal this week is to improve our macro agent's treatment of binary risk events. Last week we scored +4 on calibration."

1:30–3:00
1 min
R8 — Data & Evidence Lead
Last Week: Prediction vs Actual

Show the actuals table from Finviz. What did the team predict last week? What actually happened? Which assets were you right on? Which were you wrong on? State the direction accuracy (not %, just up/down/flat) for all 9 assets.

3:00–5:00
90 sec
R3 — Almanac Agent Lead
Seasonal Context for This Week

Read out your Almanac Agent output from GitHub. State: current month rank, the most relevant week pattern, and the midterm year cycle context. State your seasonal bias and confidence. Maximum 3 bullet points. No unnecessary detail.

5:00–7:00
90 sec
R4 — Macro / News Agent Lead
The Macro Picture This Week

State: current Fed rate, FedWatch probability, 10-year yield level, oil direction, and the single most important calendar event this week. Explain in one sentence why that event matters. State your macro bias (hawkish/dovish/neutral/binary).

7:00–9:30
2 min
R5 — Technical Agent Lead
Chart Reading Live

Show the annotated S&P 500 chart live on screen. Point to: the 8 EMA, the 21 EMA, the trendline, and the two key levels. State the EMA zone (1–4). State the technical bias and the specific invalidation level. This is a live chart walk — not a slide. You must be able to point at the screen.

9:30–11:30
90 sec
R6 — LLM Synthesis Operator
What the Four AIs Said

Show the comparison table. State each model's regime call in one word. Identify the point of maximum agreement (all four say X) and the point of maximum divergence (models split). Do not read out full AI responses — just the comparison table. State which model's reasoning your team found most credible, and why.

11:30–15:00
2.5 min ★
R7 — Human Score Analyst ★ MOST IMPORTANT SLOT
Our Team's Thinking — The Override

Show the Human Score table. State your total score and what it means. Then — and this is the most important part — explain the Wild Card: the one thing your team observed that none of the four AIs raised. State whether your final call differs from the AI consensus. If it does, explain precisely why. If it does not, explain why you agreed. Your team's final prediction % range goes here.

This is where every team will be different. This is what earns your grade. If your Human Score section sounds like a summary of the AI outputs, you have not done this correctly.

15:00–17:00
1 min
R9 — GitHub & Integration Lead
Repository Evidence Check

Open GitHub live. Show: this week's commits, the evidence folder, the release tag, and the README. This takes 60 seconds. If the repo is clean and organised, it takes 30 seconds. If it is messy, that is visible to everyone in the room.

17:00–19:00
1 min
R10 — QA & Learning Log Lead
Calibration Score + LLM Horse Race + One Change

State last week's calibration score. Update the class on the LLM horse race (which model has been most accurate across all sprints so far). State one specific change the team commits to making in next week's workflow. Close with: "Our prediction for this week is filed in GitHub as [filename]."

R2 — Scrum Master note: After the presentation, during the open discussion period, the Scrum Master facilitates a 2-minute retrospective check with the class: "What question does our prediction leave unanswered?" This surfaces blind spots before Friday.

Rules

Presentation Rules — Non-Negotiable

❌ You may NOT
  • Read an AI output verbatim as your own analysis
  • Present something that is not committed to GitHub
  • Say "the market might go up or down" without a stated confidence
  • Have only one or two students speak while others stand silently
  • Present without showing the actual chart (R5)
✅ You MUST
  • Have every role holder speak during their designated slot
  • State a specific prediction with a % range and confidence level
  • State a specific invalidation condition
  • Show live GitHub as evidence of the week's work
  • Finish within the 22-minute window (3 minutes for questions from other groups)
⭐ What earns the highest marks
  • R7 identifies a genuine insight the AI models did not raise
  • The team's final prediction differs from the AI consensus with clear justification
  • The retrospective shows the team actually changed something from last sprint
  • Calibration improves week-on-week (team gets better at knowing when to be confident)
  • GitHub history shows a consistent, disciplined weekly cadence
Definition of Done — Every Sprint
Prediction filed in GitHub by Saturday evening SGT
All four LLM raw responses stored in evidence folder
Human Score table completed with override paragraph
Annotated chart screenshot in GitHub
Friday actuals committed after market close
Calibration score updated in learning log
README reflects current sprint status
Release tag vWXX created before class
🗓️

The Weekly Sprint Rhythm

Saturday → Friday → Monday

⚡ The sprint starts on Saturday morning SGT, not Monday. US markets close at 4:00 PM Eastern Time on Friday, which is 4:00 AM Saturday in Singapore while the US is on daylight saving time (5:00 AM during US standard time). The moment that happens, last week's data is final and your sprint begins. You have the whole weekend to build a quality analysis. Do not arrive at Monday's class still building your prediction.
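The conversion from the US close to Singapore time can be checked with Python's standard zoneinfo module. This is a minimal sketch; the function name is hypothetical, and the two dates are simply Fridays chosen to show the daylight-saving and standard-time cases.

```python
from datetime import date, datetime, time
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")
SINGAPORE = ZoneInfo("Asia/Singapore")


def us_close_in_sgt(trading_day):
    """Return the 4:00 PM New York close of `trading_day` in Singapore time."""
    close_et = datetime.combine(trading_day, time(16, 0), tzinfo=NEW_YORK)
    return close_et.astimezone(SINGAPORE)


# A Friday during US daylight saving time.
print(us_close_in_sgt(date(2026, 6, 5)).strftime("%a %d %b %H:%M"))
# -> Sat 06 Jun 04:00

# A Friday during US standard time.
print(us_close_in_sgt(date(2026, 1, 9)).strftime("%a %d %b %H:%M"))
# -> Sat 10 Jan 05:00
```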

Time Zone Quick Reference
Friday 4:00 PM ET
= Saturday 4:00 AM SGT
US markets close → sprint begins
Saturday or Sunday
Data pull, agents, LLMs
Build your analysis
Sunday evening SGT at the latest
Prediction committed to GitHub
Timestamp = locked prediction
Mon–Fri (next week)
Market trades, watch it play out
Do not change your prediction
Friday 4:00 PM ET
= Saturday 4:00 AM SGT
Record actuals → sprint ends
Following Monday class
Present, score, learn
Everything already prepared
SAT
Weekend Sprint — After US Friday Market Close

Step 1 — Pull the data (R8)

  • Open Finviz (finviz.com/futures_performance.ashx) — set to 1W. Screenshot immediately.
  • Open Yahoo Finance Sectors (finance.yahoo.com/sectors) — set to 5D. Screenshot immediately.
  • Record closing values and weekly % change for: SPX, NDX, IWM, Gold, Oil, 10-Year Yield, Bonds, VIX, Bitcoin.
  • Record all 11 sector % changes. Note the top 3 leaders and bottom 3 laggards.
  • Commit both screenshots to GitHub: evidence/finviz_1W_YYYY-WXX_Sat.png and evidence/yahoo_sectors_5D_YYYY-WXX_Sat.png
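Ranking the sectors by hand is error-prone, so here is a small sketch that picks the leaders and laggards from a table of weekly % changes. The figures are the nine sectors listed in the Week 1 Sector Scorecard on this page; the function name is hypothetical.

```python
def leaders_and_laggards(sector_changes, n=3):
    """Return (top n, bottom n) sectors by weekly % change, worst laggard first."""
    ranked = sorted(sector_changes.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n], ranked[-n:][::-1]


# Weekly % changes from the Week 1 Sector Scorecard on this page.
week1 = {
    "Energy": 5.7,
    "Healthcare": 0.9,
    "Staples": 0.7,
    "Technology": 0.1,
    "Financials": -0.3,
    "Utilities": -2.7,
    "Discretionary": -2.4,
    "Real Estate": -2.6,
    "Materials": -3.0,
}

leaders, laggards = leaders_and_laggards(week1)
print("Leaders:", leaders)
print("Laggards:", laggards)
```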
SAT
Saturday Morning SGT — Three Agent Outputs (R3, R4, R5)

Step 2 — Build the three agents in parallel

  • R3 Almanac Agent: Look up this week's month page, day notes, and sector seasonality table. Write the structured Almanac output using the template on this site.
  • R4 Macro/News Agent: Check CME FedWatch, 10-year yield, oil, and TradingEconomics calendar for the coming week. Write the macro output.
  • R5 Technical Agent: Open ProRealTime or TradingView. Pull up SPX, NDX, and IWM daily charts. Read the 8/21 EMA condition, trendline, and key levels. Annotate screenshots. Write technical output.
  • All three outputs committed to GitHub before running the LLM synthesis.
SAT
Weekend — LLM Synthesis (R6)

Step 3 — Query all four AI models

  • Paste the three agent outputs into the shared prompt template. Do not change a word between models.
  • Query Claude, ChatGPT, Gemini, and DeepSeek in sequence. Save each raw response.
  • Complete the Multi-LLM Comparison Table. Identify agreement and divergence.
  • Commit all four raw responses: synthesis_claude_YYYY-WXX.txt etc.
SAT
Weekend — Human Score & Final Prediction (R7)

Step 4 — Apply human judgment and file the prediction

  • R7 leads the team discussion on the five Human Score dimensions.
  • Identify any override — what does your team see that the AIs did not weight correctly?
  • Write the final prediction for SPX, NDX, and IWM — each with direction (Up/Down/Flat), a % range, and a confidence level.
  • Commit prediction_YYYY-WXX_teamname.md to GitHub before midnight SGT Saturday. This timestamp is your locked prediction. It cannot be changed after this point.
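A prediction is only scoreable if all three parts are present. The sketch below shows one possible way to validate a prediction before it is committed. The class and field names are hypothetical, and the three confidence levels are taken from the scoring table on this page; if your team uses intermediate levels, extend the set.

```python
from dataclasses import dataclass

DIRECTIONS = {"Up", "Down", "Flat"}
CONFIDENCE_LEVELS = {"Low", "Medium", "High"}


@dataclass(frozen=True)
class Prediction:
    asset: str
    direction: str
    low_pct: float
    high_pct: float
    confidence: str

    def __post_init__(self):
        # Reject incomplete or inconsistent predictions up front.
        if self.direction not in DIRECTIONS:
            raise ValueError(f"unknown direction: {self.direction}")
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"unknown confidence: {self.confidence}")
        if self.low_pct > self.high_pct:
            raise ValueError("low end of range exceeds high end")

    def line(self):
        """One-line summary suitable for the prediction file."""
        return (
            f"{self.asset}: {self.direction} "
            f"{self.low_pct:+.1f}% to {self.high_pct:+.1f}% "
            f"(confidence: {self.confidence})"
        )


print(Prediction("SPX", "Up", 0.3, 0.8, "Medium").line())
# -> SPX: Up +0.3% to +0.8% (confidence: Medium)
```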
MON–FRI
Monday–Friday — The Market Trades Your Prediction

Step 5 — Watch and log, do not change

  • Do not modify the prediction file. The commit timestamp is evidence. Changing it after the fact invalidates your calibration score.
  • Mid-week (Wednesday): R8 does an optional check. If a major surprise event moved markets more than 2% in a day, log it in the learning log as a mid-sprint note. This is not a prediction change — it is an observation.
  • R5 may note mid-week if a key technical level was broken — again, logged as observation only.
WKD2
Following weekend after US Friday close — Record Actuals (R8, R10)

Step 6 — Lock the actuals and score

  • Open Finviz (1W) and Yahoo Sectors (5D) — these now show the completed week.
  • Record actual % change for SPX, NDX, and IWM — these are the three scored assets.
  • Also record Gold, Oil, Yield, VIX, Bitcoin, and sector leaders/laggards for context.
  • R10 calculates the calibration score for each of the three primary predictions (see Scoring section).
  • R10 updates the LLM horse race: which model predicted SPX direction most accurately this week?
  • Commit actuals: actuals_YYYY-WXX.md — alongside the original prediction file. Both files together = the evidence of prediction vs outcome.
WKD2
Weekend — Prepare for Monday (R1, R2, R7, R10)

Step 7 — Build the Monday presentation

  • R10 writes the learning log entry: what the team predicted, what happened, what was surprising, what changes next sprint.
  • R7 prepares the Human Score explanation — why the team agreed or disagreed with the AI consensus, and how that played out.
  • R2 prepares the retrospective: what Scrum practice worked, what failed, what one thing changes next sprint.
  • R9 creates the release tag vWXX and confirms the repo is clean before Monday class.
  • Every role holder reviews their slot in the 20–25 minute presentation structure.
MON
Monday Class — Present, Score, Learn (All Roles)

Step 8 — 20–25 minute sprint review presentation

  • All preparation is complete before you walk in. There is no last-minute scramble.
  • Every role holder presents their designated slot (see Presentation section for full timing).
  • SPX, NDX, and IWM predictions vs actuals are stated clearly and scored publicly.
  • The Human Score Analyst (R7) presents the team's reasoning and any override — this is the most important slot.
  • After the presentation: next sprint's goal is agreed, backlog is updated, and the new prediction window begins the following Friday at 4PM ET (= Saturday 4:00 AM SGT).
🤖

Multi-LLM Synthesis

Core Process

Over the weekend (after US Friday market close), all four AI models receive the same structured prompt built from your three agent outputs. You compare their predictions side-by-side before writing your team's consensus.

Claude: claude.ai
ChatGPT: chat.openai.com
Gemini: gemini.google.com
DeepSeek: chat.deepseek.com

Shared Prompt Template (paste this into each AI chat window; identical for all four)

You are a market intelligence synthesis assistant. Given the evidence below, produce a weekly market regime recommendation.

ALMANAC EVIDENCE:
[Paste your Almanac Agent output here — seasonal bias, confidence, caveats]

MACRO / NEWS EVIDENCE:
[Paste your Macro/News Agent output here — FedWatch, rates, dollar, oil, calendar events]

TECHNICAL EVIDENCE:
[Paste your Technical Agent output here — EMA signals, trendlines, key levels]

REQUIRED OUTPUT — respond in exactly this structure:
1. Weekly Regime: [Bullish / Bearish / Neutral / Uncertain]
2. Confidence Score: [Low / Medium / High] + brief justification
3. Key Supporting Evidence: (3 points max)
4. Key Contradictions: (2 points max)
5. Invalidation Conditions: what would change this view
6. Predicted % move — SPX (S&P 500): [+X.X% to +X.X%] — direction + range
   Predicted % move — NDX (Nasdaq 100): [+X.X% to +X.X%] — direction + range
   Predicted % move — IWM (Russell 2000): [+X.X% to +X.X%] — direction + range
7. Plain-English brief: 2–3 sentences a non-expert can understand
8. Disclaimer: remind the reader this is not financial advice

Rule: Do not change the prompt between models. Do not add extra context for one model that others do not get. Fair comparison requires identical inputs. Store all four raw responses in GitHub named: synthesis_claude_YYYY-WXX.txt, synthesis_chatgpt_YYYY-WXX.txt, etc.
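One way to make the "identical inputs" rule checkable is to build the prompt once and record a short hash next to each saved response. The sketch below assumes an abbreviated stand-in for the template above and hypothetical helper names; the week label format follows the file names in the rule.

```python
import hashlib

MODELS = ["claude", "chatgpt", "gemini", "deepseek"]

# Abbreviated stand-in for the shared prompt template on this page.
TEMPLATE = (
    "ALMANAC EVIDENCE:\n{almanac}\n\n"
    "MACRO / NEWS EVIDENCE:\n{macro}\n\n"
    "TECHNICAL EVIDENCE:\n{technical}\n"
)


def build_prompt(almanac, macro, technical):
    """Fill the template once so every model receives the same string."""
    return TEMPLATE.format(almanac=almanac, macro=macro, technical=technical)


def fingerprint(text):
    """Short SHA-256 digest used to confirm two prompts are identical."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


def response_filename(model, week_label):
    """File name for a model's raw response, e.g. synthesis_claude_2026-W22.txt."""
    return f"synthesis_{model}_{week_label}.txt"


prompt = build_prompt("seasonal bias ...", "macro summary ...", "technical summary ...")
for model in MODELS:
    # The same fingerprint on every line shows the prompt did not change.
    print(response_filename(model, "2026-W22"), fingerprint(prompt))
```

If a fingerprint differs for one model, the prompt was edited between queries and that week's comparison should be rerun.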

Multi-LLM Comparison Table — Fill This Every Sprint

After querying all four models, fill this table before writing your consensus:

Dimension | Claude | ChatGPT | Gemini | DeepSeek
Weekly Regime | fill in | fill in | fill in | fill in
Confidence Score | Low/Med/High | fill in | fill in | fill in
SPX % estimate | e.g. +0.5% to +1.2% | fill in | fill in | fill in
NDX % estimate | e.g. +0.8% to +1.5% | fill in | fill in | fill in
IWM % estimate | e.g. −0.5% to +0.5% | fill in | fill in | fill in
Top supporting reason | key phrase | fill in | fill in | fill in
Top contradiction cited | key phrase | fill in | fill in | fill in
Invalidation condition | what would change it | fill in | fill in | fill in
Tone / caveat language | cautious/assertive | fill in | fill in | fill in

Consensus protocol: Where ≥3 models agree → high-confidence core of your brief. Where models diverge → document as a contradiction and include in your watchlist. Your final prediction must state which view was chosen and why you weighted certain models more or less for that specific week.
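The "≥3 models agree" rule can be sketched as a simple vote count. The example below compares regime labels as exact strings, which is an assumption of this sketch; your team may reasonably decide that close labels count as agreement. The function name and the sample calls are hypothetical.

```python
from collections import Counter


def consensus(regimes, threshold=3):
    """Return the core call if at least `threshold` models share it.

    `regimes` maps model name to its regime call. When no call reaches
    the threshold, every model goes on the watchlist as a divergence.
    """
    call, votes = Counter(regimes.values()).most_common(1)[0]
    if votes < threshold:
        return {"core": None, "watchlist": sorted(regimes)}
    dissenters = sorted(m for m, r in regimes.items() if r != call)
    return {"core": call, "watchlist": dissenters}


# Hypothetical calls, not taken from any real model response.
calls = {
    "Claude": "Bullish",
    "ChatGPT": "Bullish",
    "Gemini": "Bullish",
    "DeepSeek": "Neutral",
}
print(consensus(calls))
# -> {'core': 'Bullish', 'watchlist': ['DeepSeek']}
```

The vote count only identifies the core and the dissenters; the explanation of why you weighted one model more than another still has to be written by the team.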

🏆

Calibration Scoring — SPX, NDX & IWM

How You're Graded

The three scored instruments are SPX (S&P 500), NDX (Nasdaq 100), and IWM (Russell 2000). You make a prediction for each one every week: direction (Up/Down/Flat) and a % range. You are not scored on whether you predicted correctly — you are scored on how well your stated confidence matched your outcome. This is called calibration. A cautious, evidenced "uncertain" is worth more than an overconfident wrong call.

Why These Three

Why SPX, NDX, and IWM — and What Comparing Them Tells You

SPX — S&P 500

The benchmark. 500 large US companies. The most widely followed index in the world. Your SPX prediction is the headline call.

Ticker: SPX or ^GSPC
ETF: SPY
NDX — Nasdaq 100

Tech-heavy. If NDX outperforms SPX, the market is in a growth/tech-led regime. If NDX lags, the rally is rotating away from tech.

Ticker: NDX or ^NDX
ETF: QQQ
IWM — Russell 2000

Small caps. Most sensitive to US interest rates and domestic economic conditions. IWM lagging SPX is an early warning sign.

Ticker: RUT or ^RUT
ETF: IWM

The relationship between the three is the insight. If SPX is up but IWM is down, the rally is narrow and rate-sensitive, which is a warning. If NDX leads SPX by 2 percentage points or more in a week, tech is driving everything, so check yields. If IWM leads the other two, risk appetite is broad and healthy. Predicting all three forces you to think about which kind of week it will be, not just which direction.
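The three rules of thumb above can be written as a small classifier. This is only a sketch of the reasoning in this section, with a hypothetical function name; inputs are weekly % changes, and the example uses the Week 1 actuals reported on this page.

```python
def week_type(spx, ndx, iwm):
    """Label a week from the weekly % changes of SPX, NDX and IWM."""
    notes = []
    if spx > 0 and iwm < 0:
        notes.append("narrow, rate-sensitive rally")
    if ndx - spx >= 2.0:
        notes.append("tech-led: check yields")
    if iwm > spx and iwm > ndx:
        notes.append("broad risk appetite")
    return notes


# Week 1 actuals from this page: SPX +1.0%, NDX +0.7%, IWM +2.7%.
print(week_type(1.0, 0.7, 2.7))
# -> ['broad risk appetite']
```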

Stated Confidence | Direction | Outcome | Score | Reason
High | Up or Down | ✅ Correct | +3 pts | Well-evidenced, committed, and right
Medium | Up or Down | ✅ Correct | +2 pts | Good reasoning, measured confidence
Low / Uncertain | Up or Down | ✅ Correct | +1 pt | Honest about limits, got lucky (acceptable)
High | Up or Down | ❌ Wrong | −2 pts | Overconfidence penalty (worst outcome)
Medium | Up or Down | ❌ Wrong | 0 pts | Tried, wrong, not overconfident (neutral)
Low / Uncertain | Any | ❌ Wrong | +1 pt | Honest uncertainty (always rewarded)
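The scoring table translates directly into a lookup. The sketch below uses "Low" to stand for "Low / Uncertain" and takes a simple correct/wrong flag for direction; the function names and the sample week are hypothetical.

```python
# Points from the scoring table: (stated confidence, direction correct?) -> score.
SCORES = {
    ("High", True): 3,
    ("Medium", True): 2,
    ("Low", True): 1,
    ("High", False): -2,
    ("Medium", False): 0,
    ("Low", False): 1,
}


def calibration_score(confidence, correct):
    """Score one prediction; 'Low' stands for 'Low / Uncertain'."""
    return SCORES[(confidence, correct)]


def weekly_total(calls):
    """Sum the scores for a week's (confidence, correct) pairs."""
    return sum(calibration_score(conf, ok) for conf, ok in calls)


# Hypothetical week: SPX Medium and right, NDX High and wrong, IWM Low and wrong.
week = [("Medium", True), ("High", False), ("Low", False)]
print(weekly_total(week))
# -> 1
```

Note how the single overconfident miss cancels the Medium hit: that is the overconfidence penalty at work.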

LLM Horse Race — Tracked Across the Trimester

Each week you record which AI model was closest to the actual S&P 500 % move. Over 10 weeks, a leaderboard emerges. By Week 10, you will have real data to answer: which AI model is most calibrated for weekly market regime prediction? This is valuable AI literacy — based on evidence, not opinion.
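"Closest" needs a definition before you can track it. The sketch below takes the midpoint of each model's predicted range and measures its distance from the actual move; that definition is an assumption, and your team may prefer another (for example, whether the actual fell inside the range). The ranges are the S&P estimates from the worked example on this page; the actual move is hypothetical.

```python
def closest_model(estimates, actual):
    """Return the model whose range midpoint is nearest the actual % move.

    `estimates` maps model name to (low_pct, high_pct). Ties go to the
    model listed first.
    """
    def miss(bounds):
        low, high = bounds
        return abs((low + high) / 2 - actual)

    return min(estimates, key=lambda model: miss(estimates[model]))


# S&P % estimates from the worked example comparison table on this page.
ranges = {
    "Claude": (0.3, 0.9),
    "ChatGPT": (-0.5, 0.8),
    "Gemini": (0.2, 1.0),
    "DeepSeek": (-1.0, 1.0),
}

# Hypothetical actual weekly move of +0.2%.
print(closest_model(ranges, actual=0.2))
# -> ChatGPT
```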

Upset of the week recognition: When the actual move defied all four AI models, that week's richest learning is: why were the models all wrong? The team that best explains the miss scores highest for that week, regardless of their prediction accuracy.

📌

This Week's Setup

Week 2 · 26 May 2026

This section is your team's working sprint board. Use the worked example below as your reference standard for what a complete submission looks like. Your team owns this workflow — read it, follow it, and improve on it every sprint.

Week 1 Actuals

What Actually Happened: 19–23 May 2026

This is your evidence base for the Week 2 prediction. These are the Friday 23 May closing figures. Source: Finviz 1W view + Yahoo Finance.

Asset | Fri 23 May Close | Weekly % Change | Signal Reading
S&P 500 | 7,473 | +1.0% | Recovered after Mon–Tue selloff. Closed near highs. Breadth narrow: only 57% of stocks above 200-day MA.
Nasdaq 100 | ~21,100 | +0.7% | Lagged S&P. Nvidia earnings beat helped Wed recovery but tech breadth flat all week.
Russell 2000 | 2,869 | +2.7% | Best performer this week. Small caps led, which is unusual when yields are elevated. Driven by Iran deal optimism and oil drop.
Gold | 4,523 | −0.4% | Slight pullback. Still elevated near all-time highs. Safe haven demand eased slightly as stocks recovered.
Crude Oil (WTI) | ~$61–62 | −5.7% | Big drop. Iran–US nuclear deal draft talk + Trump comments drove oil sharply lower. Key macro event of the week.
10-Year Yield | 4.60% | +4.5% | Hit a ONE-YEAR HIGH of 4.60% on Monday. Eased slightly by Friday as oil dropped. Elevated yields = ongoing pressure on growth stocks.
US Bonds (TLT) | ~$83–84 | −2.6% | Fell as yields rose. Bonds and stocks sold off together Mon–Tue, which is unusual and signals fiscal/inflation concern.
VIX | 16.70 | +1.2% | Elevated but not panicked. Stayed below 20. Market cautious but not fearful. Dropped from intraweek highs as stocks recovered.
Bitcoin | 75,400 | −2.5% | Pulled back with risk-off sentiment early week. Did not recover fully even as stocks bounced. Risk appetite for crypto muted.

Week 1 Sector Scorecard (Yahoo Finance 5D view, Fri 23 May)

⚡ Energy: +5.7%
Led all sectors. Oil drop paradox: energy stocks rose on volume/deal hopes.
🏥 Healthcare: +0.9%
Defensive outperform. Steady amid macro noise.
🧴 Staples: +0.7%
Defensive bid. Rotation into safety early week.
💡 Technology: +0.1%
Barely positive. Nvidia beat helped but rate pressure capped gains.
💳 Financials: −0.3%
Slight drag despite rising yields. Credit concern offset rate benefit.
🔌 Utilities: −2.7%
Worst hit by rising yields. Bond-like sector punished.
🛒 Discretionary: −2.4%
Consumer spending fears. Rate-sensitive.
🏠 Real Estate: −2.6%
Crushed by 4.60% yield. REITs (Real Estate Investment Trusts: companies that own income-producing real estate) = most rate-sensitive sector.
🔧 Materials: −3.0%
Worst sector. Global growth demand worry + strong dollar pressure.

Week 1 Story in one sentence: The 10-year yield hitting a 1-year high of 4.60% was the defining event — it crushed Real Estate, Utilities, and Discretionary; oil's sharp drop on Iran deal talk saved Russell 2000 and gave Energy an odd boost; stocks recovered Thursday–Friday but breadth remained narrow.

Macro Context

Fed, Rates & Key Events Heading into Week 2

CME FedWatch
~96% Hold at June 17 meeting
Current Fed rate: 3.50–3.75%. Market has almost entirely priced out any cut for June. First cut not expected until September at earliest. Tone: Hawkish hold.
10-Year Yield
4.60% — 1-year high
This is the key risk for Week 2. If yields stay elevated or push higher, growth stocks and rate-sensitive sectors remain under pressure. Watch for any move above 4.65%.
Oil / Iran
WTI ~$61 — fell sharply
Iran–US nuclear deal talks drove oil down ~6% this week. If a deal is confirmed: oil stays low, inflation expectations ease, yields could fall — bullish for stocks. If talks collapse: oil spikes, yields rise, stocks fall.
Almanac — Late May
Seasonally Positive
May has historically pushed higher 65–85% of the time. Pre-holiday weekend effect (US Memorial Day Monday 26 May = markets closed) tends to see light-volume drift upward on Thursday–Friday.

Week 2 Calendar Watch (27–30 May): US markets CLOSED Monday 26 May (Memorial Day). Key events: Consumer Confidence (Tue), GDP (Gross Domestic Product — the total value of goods and services produced) revision (Thu), PCE inflation — the Fed's preferred measure — (Fri). PCE on Friday is the single most important data point of the week. A hot PCE = yields rise, stocks struggle. A cool PCE = yields fall, stocks rally.

★ WORKED EXAMPLE

Complete Model Submission — What a Strong Week 2 Prediction Looks Like

This is a full example of what Prof. Dr. Tan expects to see from every team on Monday 26 May. Read it carefully. Your submission does not need to reach the same conclusion — but it must show the same structure, evidence discipline, and honest confidence calibration.

Step 1 — Three Agent Outputs (built before querying any AI)

🗓️ Almanac Agent

Late May (Week 21–22) historically bullish. S&P up 65–80% of the time. Pre-Memorial Day weekend typically sees light-volume drift upward as sellers step aside. Confidence in pattern: Medium — macro environment (elevated yields) is unusual vs. historical baseline. Caveat: seasonal tailwind exists but is not strong enough to override a macro shock like a PCE beat.

📰 Macro / News Agent

Fed: 96% hold probability at June 17 FOMC (Federal Open Market Committee — the Fed body that sets interest rates). Rate: 3.50–3.75% unchanged. 10-year yield at 4.60% — 1-year high, eased Friday as oil fell. Oil: WTI ~$61, down ~6% on Iran deal hopes — no confirmation yet. Key event this week: PCE inflation Friday. If PCE ≥ 2.8% YoY: yields likely spike, stocks sell off. If PCE ≤ 2.5%: yields ease, stocks rally. Also: Consumer Confidence Tue, GDP revision Thu. Macro driver: Oil/Iran geopolitics. Bias: Cautiously neutral — PCE is a binary risk event.

📈 Technical Agent

S&P 500: Price above 8 EMA ✓. 8 EMA above 21 EMA ✓ — momentum intact. However breadth weak: only 57% of S&P stocks above 200-day MA — rally is narrow. Russell 2000 outperformed but needs follow-through above 2,900 to confirm. Key level to watch: S&P 7,500 resistance. Invalidation: Close below 7,350 = trend reversal warning. VIX at 16.7 — calm but not complacent. Technical bias: Neutral-bullish while above 7,350.

Step 2 — Multi-LLM Comparison Table

All four models received the identical prompt with the three agent outputs above pasted in. Responses recorded Saturday 24 May 2026.

| Dimension | Claude | ChatGPT | Gemini | DeepSeek |
|---|---|---|---|---|
| Weekly Regime | Neutral-Bullish | Neutral | Neutral-Bullish | Uncertain |
| S&P % estimate | +0.3% to +0.9% | −0.5% to +0.8% | +0.2% to +1.0% | −1.0% to +1.0% |
| Confidence | Medium | Low–Medium | Medium | Low |
| Top supporting reason | Seasonal tailwind + oil drop reduces inflation pressure + technical momentum intact | Technical trend above EMAs, pre-holiday volume typically thin and upward-drifting | Oil decline = easing inflation expectations = yields may ease = supports equities | Breadth weakness and elevated yields make directional call unreliable this week |
| Top contradiction | PCE on Friday is binary: a hot print reverses everything | 10-year at 4.60% is gravity. Stocks climbing against this is unsustainable if yields rise further | Iran deal not confirmed. If talks collapse, oil spikes, yields follow, stocks sell off | Stocks and bonds both fell Mon–Tue: simultaneous selloff signals deeper concern than a normal correction |
| Invalidation condition | PCE ≥ 2.8% or Iran talks collapse causing oil spike above $68 | 10-year yield breaks above 4.70%; VIX closes above 22 | Any confirmed geopolitical escalation in Middle East beyond current Iran talks | Any single macro print that causes 10-year to spike above 4.75% |
| Tone | Measured, evidence-based, acknowledged PCE risk clearly | Cautious, emphasised wide uncertainty band | Slightly more optimistic, leaned on oil narrative | Most conservative: flagged structural concerns about narrow breadth |
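
One way to turn the comparison table into a starting point for the consensus brief is to tally the regime labels and find the S&P range that every model shares. The sketch below is only an illustration under those assumptions; the dictionary layout and the `summarise` helper are hypothetical, and the team's judgment still decides the final call.

```python
from collections import Counter

# Regime and S&P range (in percent) copied from the comparison table.
responses = {
    "Claude":   {"regime": "Neutral-Bullish", "low": 0.3,  "high": 0.9},
    "ChatGPT":  {"regime": "Neutral",         "low": -0.5, "high": 0.8},
    "Gemini":   {"regime": "Neutral-Bullish", "low": 0.2,  "high": 1.0},
    "DeepSeek": {"regime": "Uncertain",       "low": -1.0, "high": 1.0},
}


def summarise(responses):
    """Return the most common regime, the models that differ from it,
    and the overlap of all S&P ranges (None if they do not overlap)."""
    counts = Counter(r["regime"] for r in responses.values())
    top_regime, votes = counts.most_common(1)[0]
    dissenters = sorted(m for m, r in responses.items() if r["regime"] != top_regime)
    overlap_low = max(r["low"] for r in responses.values())
    overlap_high = min(r["high"] for r in responses.values())
    overlap = (overlap_low, overlap_high) if overlap_low <= overlap_high else None
    return {"regime": top_regime, "votes": votes,
            "dissenters": dissenters, "overlap": overlap}


summary = summarise(responses)
print(summary)
```

Run on the four rows above, the shared S&P range comes out as +0.3% to +0.8%. The tally only counts exact label matches, so a "Neutral" answer with an upside skew still needs a human read.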

Step 3 — Team Consensus Brief & Final Prediction

Team Alpha — Week 2 Consensus Brief — Filed: Mon 26 May 2026, 09:00 SGT

REGIME: Neutral-Bullish with elevated uncertainty. Claude and Gemini are Neutral-Bullish and ChatGPT is Neutral with a slight upside skew, so three of four models lean cautiously positive; DeepSeek is the outlier, flagging a structural breadth concern.

PREDICTIONS (trading days 27–30 May 2026):

S&P 500 ......... UP    +0.3% to +0.8%    Confidence: MEDIUM
Nasdaq 100 ...... UP    +0.2% to +0.7%    Confidence: LOW–MEDIUM
Russell 2000 .... FLAT   −0.5% to +0.8%   Confidence: LOW
Gold ............ DOWN   −0.5% to −1.5%   Confidence: MEDIUM
Crude Oil (WTI) . FLAT   −1% to +2%       Confidence: LOW (Iran binary)
10-Year Yield ... FLAT   4.50% to 4.65%   Confidence: MEDIUM
US Bonds (TLT) .. FLAT   slight recovery   Confidence: LOW
VIX ............. DOWN   toward 15–16     Confidence: MEDIUM
Bitcoin ......... FLAT   −2% to +2%       Confidence: LOW

LEADING SECTOR: Energy — Iran deal uncertainty keeps energy stocks in focus; oil volatility = sector volatility but energy companies benefit from geopolitical premium even if oil price is falling.

LAGGING SECTOR: Real Estate — 10-year yield at 4.60% is direct headwind for REITs. No near-term catalyst for relief unless PCE comes in cold.

KEY EVIDENCE (3 points):
1. Oil dropped 6% on Iran deal hopes — reduces inflation pressure, gives Fed room to hold without hiking, mild tailwind for stocks
2. Seasonal bias: late May pre-holiday is historically positive (65–80% of years) with light selling pressure
3. Technical: S&P above 8 EMA and 21 EMA — momentum structure still intact despite narrow breadth

KEY CONTRADICTION (why we kept confidence MEDIUM not HIGH):
PCE inflation data is released on Friday. This is the Fed's preferred inflation measure. Claude and ChatGPT both flagged it: a hot print at or above 2.8% YoY would likely push yields above 4.65%, invalidate our bullish lean, and cause a Friday selloff. DeepSeek's concern about narrow breadth is valid: when only 57% of S&P stocks are above their 200-day MA, the index can look healthy while a large share of its stocks are struggling. This is a structural warning we are monitoring but not acting on this week.

INVALIDATION CONDITIONS:
Our bullish lean is WRONG if: (a) PCE Friday ≥ 2.8% YoY, (b) 10-year yield breaks and holds above 4.70%, (c) Iran talks collapse causing oil spike above $68, or (d) VIX closes above 22 on any day this week.

PLAIN ENGLISH: We think the market will drift slightly higher this week, mainly because oil falling is good news for inflation and the pre-holiday week tends to be quiet and positive. However, Friday's inflation data is a wildcard that could reverse everything, so we are not very confident. If you hear bad inflation news on Friday, our prediction is probably wrong.

⚠️ This is an educational exercise and decision-support prototype. Nothing in this brief constitutes financial advice. All AI model outputs were reviewed and verified by the team before publication.

Why this submission scores well: It shows all three agent outputs before touching any AI. It queried all four models with identical prompts and filled the comparison table. It did not just copy one model's answer — it identified where models agreed (3 of 4 lean bullish) and where one diverged (DeepSeek flagged breadth concern) and explained why. Confidence was kept at MEDIUM because of the known PCE risk on Friday — not HIGH, which would have been overconfident. Invalidation conditions are specific and testable. The plain-English summary is readable by a non-expert. The disclaimer is present.
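
"Specific and testable" means the invalidation conditions can be written as plain checks against the week's data. The sketch below shows one way to do that for the four conditions in the example brief. It is an illustration, not the grading method: the function name is hypothetical, "breaks and holds" is read as two consecutive closes above 4.70% (the brief does not define "holds"), and condition (c) checks only the oil price, since whether talks collapsed is a news judgment.

```python
def check_invalidation(pce_yoy, ten_year_closes, wti_high, vix_closes):
    """Return the invalidation conditions from the brief that were hit.

    pce_yoy         -- Friday PCE print, percent year over year
    ten_year_closes -- daily 10-year yield closes for the week, percent
    wti_high        -- highest WTI price seen during the week, USD
    vix_closes      -- daily VIX closes for the week
    """
    hit = []
    if pce_yoy >= 2.8:
        hit.append("(a) PCE >= 2.8% YoY")
    # Assumption: "holds" means two consecutive closes above the level.
    pairs = zip(ten_year_closes, ten_year_closes[1:])
    if any(a > 4.70 and b > 4.70 for a, b in pairs):
        hit.append("(b) 10-year yield above 4.70%")
    # Price check only; the "talks collapse" part needs a human read.
    if wti_high > 68:
        hit.append("(c) WTI above $68")
    if any(v > 22 for v in vix_closes):
        hit.append("(d) VIX close above 22")
    return hit


# Hypothetical week for illustration; these are not actual market values.
quiet_week = check_invalidation(
    pce_yoy=2.6,
    ten_year_closes=[4.58, 4.60, 4.62, 4.59],
    wti_high=63.0,
    vix_closes=[16.7, 16.1, 15.8, 15.9],
)
print(quiet_week or "no invalidation condition hit")
```

An empty list means the bullish lean survived the week on its own terms; any entry names the exact condition to discuss when scoring.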

📋 Prof. Dr. Tan's Instructions for Week 2

  1. US markets are closed Monday 26 May (Memorial Day). Your prediction covers the 4-day week: Tue 27 – Fri 30 May.
  2. Build your three agent outputs before opening any AI tool. Write them in your own words first.
  3. Query all four LLMs. Save the raw responses in GitHub as: evidence/synthesis_claude_2026-W22.txt etc.
  4. Complete the comparison table. Identify where models agree and where they diverge.
  5. Special this week: Predict both a leading sector AND a lagging sector with reasons.
  6. Commit your prediction file to GitHub by the weekend before class. File name: prediction_2026-W22_[teamname].md
  7. On Friday 30 May after market close, record actuals and commit. Do not change your Monday prediction.
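
The file names in steps 3 and 6 follow a year-week pattern, so they can be generated rather than typed. The helper below is a small sketch, assuming that "W22" refers to ISO week numbering and that team and model names are lower-cased in file names; the function names are hypothetical.

```python
from datetime import date


def week_tag(day):
    """ISO year-week tag for a date, for example 2026-W22."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year}-W{iso_week:02d}"


def evidence_path(model, day):
    """Path for one model's raw response, as in step 3."""
    return f"evidence/synthesis_{model.lower()}_{week_tag(day)}.txt"


def prediction_filename(team, day):
    """File name for the team's prediction, as in step 6."""
    return f"prediction_{week_tag(day)}_{team.lower()}.md"


filing_day = date(2026, 5, 26)
for model in ["Claude", "ChatGPT", "Gemini", "DeepSeek"]:
    print(evidence_path(model, filing_day))
print(prediction_filename("Alpha", filing_day))
```

Generating the names from the date avoids a mistyped week number breaking the track record later in the trimester.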
🛠

The Product — IRIS 2.0 Build Target

Sprint 5–10 Goal

From Week 5 onward, your team builds toward a live, deployable product. The weekly prediction workflow you practise in Weeks 2–4 becomes the engine inside this product. By Week 10, you demo a working application on Hugging Face that any user can open in a browser and use.

Layer 1 — Market Regime

SPX, NDX, and IWM weekly predictions. 9 macro assets tracked. Multi-LLM synthesis panel. Human Score layer. Plain-English weekly brief with disclaimer.

Layer 2 — Sector Analysis

Direction calls for all 11 XL-prefixed sector ETFs (Exchange-Traded Funds, such as XLK) displayed as a colour-coded heatmap. Rotation signal chart. Almanac seasonal badges. Top and bottom sector picks with reasoning.

Layer 3 — Stock Spotlight

User enters any ticker (NVDA, GOOGL, AAPL, etc.). Live 6-month chart via yfinance. 8 EMA (Exponential Moving Average) + 21 EMA auto-calculated. EMA zone detected. 4-LLM synthesis for that stock. Sector linkage shown.

Approved Technology Stack
| Tool | Purpose | Access |
|---|---|---|
| Python + Flask | Backend web framework: handles routes, data, LLM (Large Language Model) calls | pip install flask |
| yfinance | Free market data: any ticker, live OHLCV (Open, High, Low, Close, Volume) history, no API (Application Programming Interface) key needed | pip install yfinance |
| HTML/CSS/JS | Frontend dashboard: Flask serves Jinja templates | No install needed |
| Anthropic API | Live Claude calls for LLM synthesis | console.anthropic.com |
| OpenAI API | Live ChatGPT calls for LLM synthesis | platform.openai.com |
| Gemini API | Live Gemini calls for LLM synthesis | ai.google.dev |
| DeepSeek API | Live DeepSeek calls for LLM synthesis | platform.deepseek.com |
| Hugging Face Spaces | Free hosting: deploy as Docker Space, get a public URL | huggingface.co/spaces |
| GitHub | Source control: the HF Space can be kept in sync with the repo (for example via a GitHub Action) | github.com |

10-Week Build Roadmap

W2–W4: Manual workflow (agents, LLM synthesis, prediction filing, GitHub discipline). No product code yet. Master the thinking process first.
W5: Hugging Face Space created. Flask app scaffolded. Static HTML dashboard deployed. SPX/NDX/IWM data loading from yfinance. Basic chart visible. Public URL working.
W6: Dashboard v1, Layer 1 live. Market regime tab shows actuals + predictions. EMA zone detection. Multi-LLM comparison panel. Human Score panel.
W7: Sector layer, Layer 2 live. 11-ETF heatmap. Rotation signal chart. Sector predictions via UI. Almanac seasonal badges auto-displayed.
W8: Stock Spotlight, Layer 3 live. Ticker input. yfinance pulls data. 8 EMA + 21 EMA calculated and charted. EMA zone auto-detected. 4-LLM synthesis per stock. Sector linkage shown.
W9: UX (User Experience) polish and user testing. Non-expert tests the dashboard. Confusing labels fixed. Prediction history tab shows 8-sprint track record. Calibration chart live. LLM horse race visible.
W10: Final demo day. All three layers live on Hugging Face. 10-sprint track record visible. LLM horse race results shown. GitHub release vW10 tagged. Public URL shared.

The five things that separate a great product from a good one:

1. The 10-sprint track record — a running chart of prediction vs actual with calibration scores. Nobody has built this before.
2. The LLM horse race — which AI model was most accurate over 10 weeks? Real data, real answer.
3. Human Score as a visible layer — not hidden inside the brief. Scored, displayed, comparable to the AI consensus.
4. Plain-English brief quality — test it on someone who has never read a financial report. If they understand it, you succeeded.
5. Stock Spotlight sector linkage — when a user enters NVDA, show them it is in XLK and XLK's seasonal window is bullish. That connection is the value-add over a plain stock chart.
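
The sector linkage in point 5 is essentially a lookup from ticker to sector ETF, followed by a lookup of that ETF's seasonal label. The sketch below shows the idea with a handful of entries. The dictionaries and the `sector_linkage` function are hypothetical, and the seasonal labels are placeholders rather than real almanac readings.

```python
# Ticker -> sector ETF. Only a few illustrative entries; a real product
# would need a complete, maintained mapping.
SECTOR_ETF = {
    "NVDA": "XLK",
    "AAPL": "XLK",
    "GOOGL": "XLC",
}

# Placeholder seasonal labels for illustration only. These are NOT real
# almanac readings; the Almanac Agent would supply them each week.
SEASONAL_WINDOW = {
    "XLK": "bullish",
    "XLC": "neutral",
}


def sector_linkage(ticker):
    """Return a one-line sector note for a ticker, or a fallback message."""
    symbol = ticker.strip().upper()
    etf = SECTOR_ETF.get(symbol)
    if etf is None:
        return f"{symbol}: sector not mapped yet"
    window = SEASONAL_WINDOW.get(etf, "unknown")
    return f"{symbol} is in {etf}; {etf} seasonal window: {window}"


print(sector_linkage("nvda"))
print(sector_linkage("TSLA"))
```

Returning a fallback message instead of raising an error keeps the Stock Spotlight usable when a user enters a ticker the team has not mapped.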

🏆

Live Prediction Board — All Groups

Competitive

This is where the competition happens. Every group files their prediction over the weekend — SPX, NDX, IWM directions and % ranges, plus all 11 sector ETF calls. All predictions are visible to all groups the moment they are filed. By Monday morning, everyone knows what everyone else predicted. That is the pressure that makes you think harder.

📋 How to use the Prediction Board

01  Select your group from the dropdown in the board header
02  Click "File Prediction", then enter all SPX/NDX/IWM predictions and all 11 sector ETF directions
03  Add your Human Score total and the one insight your team saw that no AI raised
04  Hit "Live Board" to see what all other groups predicted, before Monday class
05  After Friday close: enter actuals in "Record Actuals" to unlock the leaderboard scores
06  Check the Leaderboard for calibration scores, which are scored on confidence accuracy, not just direction
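
The leaderboard formula is not spelled out here, so the following is only one plausible way to score "confidence accuracy": a Brier score, which penalises a confident wrong call more than a cautious wrong call. The mapping from confidence labels to probabilities is an assumption made for illustration, as are the names used; the board's actual scoring may differ.

```python
# Assumed mapping from confidence label to the probability that the
# direction call is right. Illustrative numbers, not official values.
# In-between labels such as "LOW-MEDIUM" would need their own entries.
CONFIDENCE_PROB = {"LOW": 0.55, "MEDIUM": 0.70, "HIGH": 0.85}


def brier_score(calls):
    """Mean squared gap between stated probability and outcome.

    calls -- list of (confidence_label, was_correct) pairs.
    0.0 is perfect; always saying 50% scores 0.25; lower is better.
    """
    if not calls:
        raise ValueError("need at least one scored call")
    total = 0.0
    for label, was_correct in calls:
        p = CONFIDENCE_PROB[label.upper()]
        outcome = 1.0 if was_correct else 0.0
        total += (p - outcome) ** 2
    return total / len(calls)


# Hypothetical results for one team: three calls, one of them wrong.
week = [("MEDIUM", True), ("LOW", False), ("MEDIUM", True)]
print(round(brier_score(week), 4))
```

Under this rule a wrong HIGH call costs far more than a wrong LOW call, which is the behaviour the board describes: direction alone is not enough, the stated confidence has to match the hit rate.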

Predictions are shared — that is the point. Once you file a prediction, every other group can see it. This creates genuine competitive pressure and makes Monday's class discussion much richer. You will want to come in having thought carefully about where you agree and disagree with other groups — and why.