AI/ML

Director's Cut

Adaptive film that reshapes itself around your emotions in real time — Gemini 3 NYC Hackathon

Problem

Films are fixed. Your emotional response to them is not. Director's Cut explores what happens when you remove that constraint: a mystery film plays while a Director Agent watches how you react and chooses the next story branch, scene visuals, and narration from your real-time emotion history. Every viewing is different because every viewer is different.

Approach

The browser captures webcam frames every 10 seconds and sends them to Gemini 2.5 Flash via the Gemini Live API for real-time emotion detection. Emotion readings accumulate in a rolling window of 8. At each story decision point, a Director Agent (LlamaIndex + Gemini 2.5 Flash) reads the accumulated emotion history and selects the next branch from the story graph. Scene visuals are generated via Veo, narration via TTS, and all assets stream back to the React frontend over WebSocket. The FastAPI backend (Railway) handles WebSocket sessions, frame relay, and the LlamaIndex agent loop.
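Emotion readings, branch decisions, and generated assets all travel over the same WebSocket session, so it helps to picture them as one tagged message type. The sketch below is a minimal illustration only: the `SessionMessage` class, its field names, and the four message kinds are assumptions made for this example, not the project's actual wire protocol.

```python
import json
from dataclasses import asdict, dataclass
from typing import Literal

# Hypothetical message kinds; the real protocol is not documented in this write-up.
MessageKind = Literal["emotion", "branch", "video", "narration"]


@dataclass
class SessionMessage:
    """One JSON message exchanged between frontend and backend (assumed shape)."""

    session_id: str
    kind: MessageKind
    payload: dict

    def to_json(self) -> str:
        # Serialize to the text frame that would be sent over the WebSocket.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "SessionMessage":
        # Parse a received text frame back into a typed message.
        data = json.loads(raw)
        return cls(
            session_id=data["session_id"],
            kind=data["kind"],
            payload=data["payload"],
        )


reading = SessionMessage("demo-session", "emotion", {"label": "fear"})
restored = SessionMessage.from_json(reading.to_json())
```

A single envelope with a `kind` field keeps the frontend's receive handler to one switch statement, whichever asset arrives next.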

Architecture

Director's Cut — system diagram

Webcam (browser) → Gemini Live (emoti… → WebSocket (Railway) → Emotion Accumulato… → Director Agent (Ll… → Veo / TTS (scene g… → React UI (Netlify)

Key Technical Decisions

01

Rolling window of 8 emotion readings

A single frame is noisy — lighting, blinks, or momentary distraction produce false readings. Accumulating 8 readings before making a branch decision smooths over noise while still being responsive to genuine sustained emotional states. The window resets on story reset.
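A rolling window of this kind can be sketched with a bounded deque. The window size of 8 and the reset behaviour come from the description above; the `EmotionWindow` class, its method names, and the majority-vote `dominant()` summary are assumptions added for illustration (in the project, the Director Agent reads the full history rather than a single vote).

```python
from collections import Counter, deque
from typing import List, Optional

WINDOW_SIZE = 8  # window length stated in the write-up


class EmotionWindow:
    """Keeps only the most recent emotion labels (hypothetical helper)."""

    def __init__(self, size: int = WINDOW_SIZE) -> None:
        # deque(maxlen=...) silently drops the oldest item once full.
        self._readings = deque(maxlen=size)

    def add(self, label: str) -> None:
        self._readings.append(label)

    def reset(self) -> None:
        # Called when the story resets, so old reactions do not leak over.
        self._readings.clear()

    def history(self) -> List[str]:
        return list(self._readings)

    def dominant(self) -> Optional[str]:
        # Assumed summary: the most frequent label in the window.
        if not self._readings:
            return None
        return Counter(self._readings).most_common(1)[0][0]
```

With one reading every 10 seconds, a full window covers roughly the last 80 seconds of viewing, so a single blink or glance away is outvoted by the surrounding readings.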

02

LlamaIndex agent over direct prompt chain

The Director Agent needs to reason over the full emotion history, map it to the story graph state, and select the optimal branch considering both emotional fit and narrative coherence. LlamaIndex's agent abstraction with structured tool calls made this reasoning transparent and debuggable compared to a single long prompt.
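One way to keep such reasoning inspectable is to give the agent a small deterministic helper it can call as a tool, and let the model weigh narrative coherence on top of the result. The sketch below is plain Python and deliberately does not use LlamaIndex's API; the story graph, branch names, emotion labels, and the `rank_branches` function are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical story graph: node -> {branch id: emotions that branch suits}.
STORY_GRAPH: Dict[str, Dict[str, Set[str]]] = {
    "study": {
        "hidden_letter": {"curiosity", "neutral"},
        "power_cut": {"boredom", "calm"},
        "false_alarm": {"fear", "anxiety"},
    },
}


def rank_branches(node: str, history: List[str]) -> List[Tuple[str, int]]:
    """Score each outgoing branch by how many readings in the history match it."""
    counts = Counter(history)
    scores = {
        branch: sum(counts[emotion] for emotion in emotions)
        for branch, emotions in STORY_GRAPH[node].items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_branches("study", ["fear"] * 5 + ["neutral"] * 3)
```

Because the helper returns every branch with its score, a log of tool calls shows why a branch won, which is the kind of transparency a single long prompt does not provide.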

03

Gemini Live API called directly from the browser

The browser sends frames to Gemini Live itself rather than routing them through the backend, which cuts round-trip latency and avoids streaming raw video to the server. When the Gemini Live API is unavailable, a backend frame relay takes over as a fallback, so the system degrades gracefully.

Results

  • Fully adaptive narrative — no two runs of the film follow the same branch sequence
  • Real-time emotion detection via the Gemini Live API, sampling one webcam frame every 10 seconds
  • Director Agent correctly maps sustained emotion states to appropriate story branches
  • Built and demoed at Gemini 3 NYC Hackathon

Tech Stack

Gemini 2.5 Flash, LlamaIndex, FastAPI, React, WebSocket, Veo, TTS, Railway, Netlify
