DSC NYU Datathon — The HVAC Margin Rescue Challenge

DSC NYU Datathon | 3-Day Challenge | v0 Required

1. The Problem

You’re the CFO of a $50M/year commercial HVAC contractor.

Last quarter’s results:

  • Bid margin: 15.2%
  • Realized margin: 6.8%

This wasn’t bad luck. This is the pattern. By the time your PM realizes margin is gone, there’s no runway to recover.

Your mission: Build an AI agent using v0 that autonomously analyzes a portfolio of HVAC projects, detects margin erosion, explains root causes, and delivers specific recovery actions — without being asked.

2. The Dataset

405 commercial HVAC projects | $6.4B total portfolio | 1.46M+ records

The dataset spans projects from 2018–2024 across Healthcare, Commercial Office, K-12 Education, Data Center, and Multifamily Residential sectors. Use the *_all.csv files — these are the working dataset.

2.1 Core Files
  • contracts_all.csv — Base contract info (project ID, value, GC, dates) | 405 rows
  • sov_all.csv — Schedule of Values — 15 line items per project | 6,075 rows
  • sov_budget_all.csv — Bid-time cost estimates per SOV line | 6,075 rows
  • labor_logs_all.csv — Daily crew time entries with role, hours, rate | 1,202,039 rows
  • material_deliveries_all.csv — Material receipts linked to SOV lines | 22,438 rows
  • billing_history_all.csv — Pay application history | 6,479 rows
  • billing_line_items_all.csv — Line-level billing detail per application | 90,112 rows
  • change_orders_all.csv — Change orders (approved, pending, rejected) | 4,255 rows
  • rfis_all.csv — Requests for information | 22,065 rows
  • field_notes_all.csv — Unstructured daily field reports (messy) | 103,676 rows

Synthetic Data: Google Drive

3. Data Quality Heads-Up

This is real-world-style data — it is intentionally messy. Before querying, expect to handle:

  • Role name inconsistencies in labor_logs_all.csv — e.g. "JM Pipefitter", "Journeyman P.F.", "Pipefitter JM" all refer to the same trade
  • Mixed date formats across files — some dates are YYYY-MM-DD, others are not

There are additional data quality issues beyond these two. Finding and handling them is part of the challenge.

Your agent must reason through the noise itself, not wait for someone else to clean it up.
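As a starting point for the two issues named above, here is a minimal sketch of role normalization and tolerant date parsing. The alias tables and the list of candidate date formats are assumptions for illustration; the real variants have to be discovered by profiling the files.

```python
import re
from datetime import date, datetime
from typing import Optional

# Hypothetical alias tables: extend these after profiling labor_logs_all.csv.
_LEVEL_ALIASES = {"jm": "journeyman", "journeyman": "journeyman"}
_TRADE_ALIASES = {"pf": "pipefitter", "pipefitter": "pipefitter"}

# Assumed candidate formats; the brief only guarantees that YYYY-MM-DD is one of them.
_DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%m-%d-%Y")


def normalize_role(raw: str) -> str:
    """Map role spellings such as 'JM Pipefitter' and 'Journeyman P.F.' to one label."""
    # Drop dots so "P.F." becomes "pf", then split on whitespace and separators.
    tokens = re.split(r"[\s\-_/]+", raw.replace(".", "").lower().strip())
    level = next((_LEVEL_ALIASES[t] for t in tokens if t in _LEVEL_ALIASES), None)
    trade = next((_TRADE_ALIASES[t] for t in tokens if t in _TRADE_ALIASES), None)
    if level and trade:
        return f"{level} {trade}"
    # Fallback: cleaned but unmapped, so unknown roles stay visible for review.
    return " ".join(t for t in tokens if t)


def parse_date(raw: str) -> Optional[date]:
    """Try each candidate format in order; return None when nothing matches."""
    text = raw.strip()
    for fmt in _DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None
```

Returning None for unparseable dates, rather than raising, lets the pipeline count and report bad rows. Note that a value like 03/04/2021 is ambiguous between month-first and day-first; this sketch assumes month-first.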

4. Portfolio Composition

The dataset covers 405 projects across seven year cohorts. Your agent should analyze the full portfolio; the signal is somewhere in there.

4.1 Cohorts
  • PRJ-2018-xxx — 80 projects | Years active: 2018–2020
  • PRJ-2019-xxx — 80 projects | Years active: 2019–2021
  • PRJ-2020-xxx — 80 projects | Years active: 2020–2022
  • PRJ-2021-xxx — 80 projects | Years active: 2021–2023
  • PRJ-2022-xxx — 60 projects | Years active: 2022–2024
  • PRJ-2023-xxx — 20 projects | Years active: 2023–2025
  • PRJ-2024-xxx — 5 projects | Years active: 2024–2026

Project types span Healthcare, Commercial Office, K-12 Education, Data Center, and Multifamily Residential across contract values from ~$2M to ~$45M.

The portfolio contains projects with severe margin erosion — your agent should find them.

5. What You’re Building

An agentic system — not a dashboard. The distinction matters:

  • A dashboard shows data when a human looks at it
  • An agent acts: it ingests the portfolio, reasons across tables, surfaces problems unprompted, and delivers specific recovery actions

5.1 Required Capabilities
5.1.1 Autonomous Portfolio Scan

The agent independently ingests all project data, computes margin health across the portfolio, and flags at-risk projects without being prompted for each one.
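A portfolio scan of this kind could be sketched as follows. The margin definition (contract value minus actual cost, divided by contract value), the ProjectTotals type, and the 5-point erosion threshold are assumptions for illustration; the 15.2% default is the portfolio bid margin from section 1, and a per-project bid margin would be more precise where the budget files allow it.

```python
from dataclasses import dataclass


@dataclass
class ProjectTotals:
    """Hypothetical per-project rollup produced by an aggregation step."""
    project_id: str
    contract_value: float  # dollars
    actual_cost: float     # dollars, labor plus material to date


def realized_margin(p: ProjectTotals) -> float:
    """Margin as a fraction of contract value (assumed definition)."""
    return (p.contract_value - p.actual_cost) / p.contract_value


def flag_at_risk(projects, bid_margin=0.152, max_erosion=0.05):
    """Return (project_id, margin) pairs whose erosion exceeds the threshold, worst first."""
    flagged = []
    for p in projects:
        margin = realized_margin(p)
        if bid_margin - margin > max_erosion:
            flagged.append((p.project_id, margin))
    return sorted(flagged, key=lambda pair: pair[1])


portfolio = [
    # Contract and actual cost taken from the sample output in section 7.
    ProjectTotals("PRJ-2021-260", 2_608_000, 4_991_000),
    # Made-up healthy project, for contrast only.
    ProjectTotals("PRJ-EXAMPLE-001", 10_000_000, 8_600_000),
]
flagged = flag_at_risk(portfolio)
```

Run over every project rather than on request, this is the step that makes the scan unprompted; the flagged list then feeds the root cause stage.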

5.1.2 Root Cause Reasoning

For flagged projects, the agent drills into the data — cross-referencing labor logs, field notes, change orders, and billing — to explain why margin is eroding, not just that it is.

5.1.3 Proactive Recommendations

The agent delivers specific, dollar-quantified actions: which change orders to submit, what to bill, where labor is bleeding, which field note signals indicate uncaptured scope. Generic “investigate further” outputs will score poorly.

5.1.4 Interface

Use v0 to build a UI that surfaces agent outputs. The interface should feel like a CFO briefing, not a data table — executive-readable in 30 seconds, with the ability to drill down.

6. Time Management

  • Day 1 — Data ingestion, aggregation pipeline, agent scaffolding
  • Day 2 — Agent reasoning loops, root cause logic, recommendation engine
  • Day 3 — v0 UI, polish, deployment, demo prep

A working agent with one sharp insight beats a broken complex one.

7. What Good Output Looks Like

A strong agent surfaces findings unprompted. Here is the kind of output that scores well:

⚠️ CRITICAL — PRJ-2021-260 | Nashville Mixed-Income Housing
Contract: $2,608,000 | Actual Cost: $4,991,000 | Realized Margin: -91%

Root causes:
  • Labor: $3,819K actual vs $807K estimated — 4.7× overrun. Crew ramped
    to 12–18 workers/day through peak phase; estimate assumed 5–8.
  • Material: $1,172K actual vs $355K estimated — 3.3× overrun.
    Late-stage delivery clustering suggests expediting and substitutions.
  • Billing is 99.4% complete — no recovery possible through billing alone.

Recovery actions:
  1. Audit 9 approved COs for unexecuted scope — if any work was performed
     without documented contract relief, submit supplemental CO immediately.
  2. Review field notes for references to owner-directed work outside
     original scope (labor logs show 3 crew expansions with no CO trigger).
  3. Engage GC on retention release: $259K held. Release accelerates
     cash recovery on a completed project.

This is agent output. A table showing -91% with a red cell is a dashboard.

8. Domain Reference

  • SOV — Schedule of Values — contract breakdown by work type
  • Burden rate — Labor overhead multiplier (taxes, insurance, benefits)
  • Earned value — Budget × % complete
  • Retention — Payment held until completion (typically 10%)
  • Budget coverage — Estimated budget as % of contract value — healthy projects run 88–110%
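The last three terms reduce to one-line calculations. The sketch below assumes percentages are passed as fractions (0.40 for 40%) and that retention is withheld at a flat rate from the amount billed; neither convention is specified in the brief, and the function names are illustrative.

```python
def earned_value(budget: float, pct_complete: float) -> float:
    """Earned value = budget x % complete, with pct_complete as a fraction."""
    return budget * pct_complete


def retention_held(billed_to_date: float, retention_rate: float = 0.10) -> float:
    """Retention withheld so far, assuming a flat rate on billed amounts."""
    return billed_to_date * retention_rate


def coverage_is_healthy(estimated_budget: float, contract_value: float,
                        low: float = 0.88, high: float = 1.10) -> bool:
    """True when budget coverage falls inside the 88-110% band quoted above."""
    return low <= estimated_budget / contract_value <= high
```

For example, a line with a $1,000,000 budget that is 40% complete has earned about $400,000, and a $5M contract estimated at $4.5M (90% coverage) sits inside the healthy band.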

Good luck. Time starts now.

Submission Requirements

Everything your team must submit and how to satisfy deployment, v0 usage, and technical expectations for the HVAC Margin Rescue Challenge.

1. Deliverables

  1. Working agent — GitHub repo or deployed URL
  2. Demo video (3 min) — Show the agent autonomously finding problems and recommending actions
  3. v0 proof — Project link or prompt history
  4. Technical summary (1 page) — Architecture, agent design, AI approach, key decisions

2. Deployment (Mandatory)

  • Agent must be live and publicly accessible
  • Core features must function at demo time
  • No localhost-only or slide-only demos

Deploying on Vercel is strongly encouraged, but not required.

If it’s not deployed, it will not be judged.

3. Required Use of v0

All teams must meaningfully use v0 in their development workflow.

3.1 Option A — Build in v0 Chat

Proof required: live v0 project link

3.2 Option B — v0 in IDE

Use v0 inside your IDE.

Deploy anywhere (Vercel strongly encouraged).

Proof required: prompt history or chat logs from v0 integration

3.3 Option C — Hybrid

Proof required: specific components or flows generated by v0, before/after iteration examples, explanation of how v0 influenced architecture or UX

4. Proof of v0 Usage

Teams must provide the required proof for the workflow option they choose. Submissions that fail to clearly demonstrate meaningful use of v0 may be disqualified.

5. Key Formulas

Labor Cost = (hours_st + hours_ot × 1.5) × hourly_rate × burden_multiplier
Variance = Actual Cost - Budget
Billing Gap = % Complete - % Billed
Budget Coverage = Estimated Budget / Contract Value
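The first three formulas translate directly into code. This is a minimal sketch; the parameter names follow the formulas above, and the actual CSV column names may differ.

```python
def labor_cost(hours_st: float, hours_ot: float,
               hourly_rate: float, burden_multiplier: float) -> float:
    """Straight time plus overtime at 1.5x, priced at the burdened rate."""
    return (hours_st + hours_ot * 1.5) * hourly_rate * burden_multiplier


def variance(actual_cost: float, budget: float) -> float:
    """Positive values mean the line is over budget."""
    return actual_cost - budget


def billing_gap(pct_complete: float, pct_billed: float) -> float:
    """Positive values mean work is complete but not yet billed."""
    return pct_complete - pct_billed


# One entry: 8 straight hours and 2 overtime hours at $50/hour with a 1.4 burden
# multiplier (illustrative numbers) -> (8 + 3) x 50 x 1.4, about $770.
entry_cost = labor_cost(8, 2, 50.0, 1.4)

# Labor line from the sample output: $3,819K actual against $807K estimated.
labor_overrun = variance(3_819_000, 807_000)
```

Keeping the sign conventions explicit matters here: a positive variance is a cost overrun, while a positive billing gap is money the contractor has earned but not yet invoiced.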

6. Working with 1M+ Records

The labor_logs_all.csv file alone has 1.2M rows. Recommended approaches:

  • Pre-aggregate on the server (sum costs by sov_line_id and project_id) before passing to the agent
  • Use DuckDB, SQLite, or a simple aggregation step to reduce to ~6K summary rows
  • The LLM context window can’t hold raw labor logs — aggregate first, then reason
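The pre-aggregation step might look like the sketch below, which uses Python's built-in sqlite3 module. The table name and column names are assumptions based on the formulas in section 5, and the three sample rows are made up; in practice the CSV would be loaded into the table first.

```python
import sqlite3

# Roll daily entries up to one row per project and SOV line.
AGG_SQL = """
SELECT project_id,
       sov_line_id,
       SUM((hours_st + hours_ot * 1.5) * hourly_rate * burden_multiplier) AS labor_cost,
       SUM(hours_st + hours_ot) AS total_hours
FROM labor_logs
GROUP BY project_id, sov_line_id
ORDER BY project_id, sov_line_id
"""


def summarize_labor(conn: sqlite3.Connection):
    """Return (project_id, sov_line_id, labor_cost, total_hours) tuples."""
    return conn.execute(AGG_SQL).fetchall()


def demo():
    """Build a tiny in-memory table and aggregate it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE labor_logs (project_id TEXT, sov_line_id TEXT, "
        "hours_st REAL, hours_ot REAL, hourly_rate REAL, burden_multiplier REAL)"
    )
    rows = [
        ("PRJ-X-001", "SOV-01", 8, 0, 50, 1.0),
        ("PRJ-X-001", "SOV-01", 8, 2, 50, 1.0),
        ("PRJ-X-001", "SOV-02", 4, 0, 60, 1.0),
    ]
    conn.executemany("INSERT INTO labor_logs VALUES (?, ?, ?, ?, ?, ?)", rows)
    summary = summarize_labor(conn)
    conn.close()
    return summary
```

With 405 projects and 15 SOV lines each, the grouped result has at most 6,075 rows, which is small enough to hand to the agent or to join against sov_budget_all.csv.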

7. Stack Suggestions

v0 + Next.js + Shadcn/ui + Recharts + Claude/OpenAI API + DuckDB (for aggregation)

Prizes

6 non-cash prizes

  • First Place: $500 Vercel v0 Credit | 1 winner
  • Second Place: $300 Vercel v0 Credit | 1 winner
  • Third Place: $100 Vercel v0 Credit | 1 winner
  • Best UI/UX: to be announced | 1 winner
  • Most Creative: to be announced | 1 winner
  • Technical Excellence: to be announced | 1 winner

Judges

  • Mitchell Itkin | Pulse Foundry AI
  • Mark Bakshiyev | Pulse Foundry AI
  • DJ Lee | Pulse Foundry AI

Judging Criteria

  • Agent Quality (40 points)
    Finds the right at-risk projects and applies correct margin logic.
    Goes beyond retrieval: shows reasoning across tables and signals.
  • Recommendations (30 points)
    Actions are specific and dollar-quantified where possible.
    A CFO could realistically act on or delegate from the output.
  • Implementation (20 points)
    Built with v0; handles large record volumes (e.g. aggregation before LLM).
    Deployed and demonstrably working at submission time.
  • Business Insight (10 points)
    Explains why margin erodes, not only that it did.
    Shows forward-looking or diagnostic insight, not only static reporting.

Questions? Email the hackathon manager
