aistudio

An AI video editor that critiques its own cuts.

stack: next.js · gemini · claude · ffmpeg · replicate musicgen

role: sole builder · live on railway

// the problem

My parents run a silverware workshop and a calligraphy studio. They make beautiful things and have no way to show them — Instagram wants vertical video with music, and they are not going to learn CapCut. Neither are most craftspeople. The tools assume an editor; my parents needed a tool that assumes a phone, a pile of clips, and one sentence about what they want.

// what I built

A Hebrew-first wizard: upload clips, answer four questions, and pick one of four points of view: lively, beautiful, product showcase, or "I'll say what to do." Behind it sits a pipeline. Gemini analyzes every clip frame by frame and tags each moment with a quality rating and a description of what's actually in it. Claude Sonnet plans the edit against the chosen POV. Claude Haiku writes a music prompt, MusicGen generates a track, and ffmpeg detects its beats from raw PCM peaks so cuts can land on them. Then ffmpeg assembles the final vertical video. The render runs as a background job, so you can close the phone and come back.
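The beat step can be sketched as a peak scan over decoded audio. This is a minimal illustration, not the production detector: it assumes ffmpeg has already decoded the track to 16-bit mono PCM (for example with `-f s16le -ac 1`), and the function name `detectBeats`, the window size, the threshold, and the minimum gap are all hypothetical choices.

```typescript
// Hypothetical sketch: find beat times (in seconds) from raw 16-bit mono PCM.
// A window counts as a beat when its peak amplitude is well above the
// average window peak and enough time has passed since the previous beat.
export function detectBeats(
  samples: Int16Array,
  sampleRate: number,
  windowMs = 50, // assumed analysis window
  threshold = 1.5, // assumed multiple of the mean peak
  minGapMs = 250, // assumed minimum spacing between beats
): number[] {
  const windowSize = Math.max(1, Math.floor((sampleRate * windowMs) / 1000));

  // Peak absolute amplitude of each full window.
  const peaks: number[] = [];
  for (let start = 0; start + windowSize <= samples.length; start += windowSize) {
    let peak = 0;
    for (let i = start; i < start + windowSize; i++) {
      peak = Math.max(peak, Math.abs(samples[i]));
    }
    peaks.push(peak);
  }
  if (peaks.length === 0) return [];

  const mean = peaks.reduce((sum, p) => sum + p, 0) / peaks.length;

  const beats: number[] = [];
  let lastBeat = -Infinity;
  peaks.forEach((peak, w) => {
    const t = (w * windowSize) / sampleRate; // start time of this window
    if (peak > mean * threshold && (t - lastBeat) * 1000 >= minGapMs) {
      beats.push(t);
      lastBeat = t;
    }
  });
  return beats;
}
```

A planner could then snap each cut boundary to the nearest returned timestamp. A fixed global threshold is the simplest option; a real track with quiet and loud sections would likely need a local average instead.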

// how the self-verification works

One-pass AI editors trust the model. I don't. Three layers check the work.

First, deterministic blocking: the analyzer tags moments as great, good, or skip, and any planned cut whose time range overlaps a skip moment is rejected after planning. The rejection is interval math, not a hope that the model read the description. This rule exists because a kid's middle finger survived three consecutive QA cycles while the planner politely ignored the warning text.
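The interval check is small enough to show. This is a sketch under assumptions: the `Moment` and `Cut` shapes, the per-clip lookup, and the name `rejectCutsOverSkips` are hypothetical, and ranges are treated as half-open, so a cut that starts exactly where a skip moment ends is allowed.

```typescript
type Quality = "great" | "good" | "skip";

// Hypothetical shapes; times are seconds within the source clip.
interface Moment {
  start: number;
  end: number;
  quality: Quality;
}

interface Cut {
  clipId: string;
  start: number;
  end: number;
}

// Two half-open ranges [start, end) overlap when each starts before the other ends.
function overlaps(
  a: { start: number; end: number },
  b: { start: number; end: number },
): boolean {
  return a.start < b.end && b.start < a.end;
}

// Split planned cuts into accepted and rejected, with no model involved.
export function rejectCutsOverSkips(
  cuts: Cut[],
  momentsByClip: Record<string, Moment[]>,
): { accepted: Cut[]; rejected: Cut[] } {
  const accepted: Cut[] = [];
  const rejected: Cut[] = [];
  for (const cut of cuts) {
    const skips = (momentsByClip[cut.clipId] ?? []).filter(
      (m) => m.quality === "skip",
    );
    if (skips.some((m) => overlaps(cut, m))) {
      rejected.push(cut);
    } else {
      accepted.push(cut);
    }
  }
  return { accepted, rejected };
}
```

Because the check runs on numbers rather than on prose, a planner that ignores a warning in the description still cannot get a skip moment into the final edit.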

Second, structural verification: after the render, the verifier checks duration, shot count, and music presence. If the numbers are clean it accepts with zero LLM calls — two model passes always describe the same video differently, and that noise once sent the loop chasing phantom failures.
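A structural check of this kind might look like the sketch below. The field names, the `verifyStructure` function, and the half-second tolerance are assumptions for illustration; the values would come from probing the rendered file and from the plan.

```typescript
// Hypothetical measurements taken from the rendered file.
interface RenderProbe {
  durationSec: number;
  shotCount: number;
  hasAudio: boolean;
}

// Hypothetical summary of what the plan promised.
interface PlanSummary {
  targetDurationSec: number;
  plannedShots: number;
}

// Returns a list of problems. An empty list means the render is accepted
// without calling any model.
export function verifyStructure(
  probe: RenderProbe,
  plan: PlanSummary,
  toleranceSec = 0.5, // assumed tolerance
): string[] {
  const problems: string[] = [];
  if (Math.abs(probe.durationSec - plan.targetDurationSec) > toleranceSec) {
    problems.push(
      `duration ${probe.durationSec}s, expected ${plan.targetDurationSec}s`,
    );
  }
  if (probe.shotCount !== plan.plannedShots) {
    problems.push(
      `shot count ${probe.shotCount}, expected ${plan.plannedShots}`,
    );
  }
  if (!probe.hasAudio) {
    problems.push("no music track in the output");
  }
  return problems;
}
```

Only a non-empty result would need to escalate to a model-based review, which keeps run-to-run description noise out of the accept path.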

Third, an autonomous QA agent: it uploads test clips to the live app like a real user, polls the job, downloads the result, samples frames at 1 fps, and asks Gemini to compare what was promised against what rendered. In one run it caught the plan claiming "both kids smiling bright" when only one was, and a "peace sign" that was actually hair movement.

// what broke

Gemini returned 503 "high demand" on every video upload for two weeks. It wasn't demand — a raw HTTP probe showed it was an account-level quota gate dressed up as a server error. The workaround ships still frames as inline images instead of video, which hits a different quota pool entirely.

A 4K iPhone clip OOM-killed the production container, because req.formData() buffers the whole file in RAM and Railway's box has 1 GB. The error surfaced on the client as "undefined is not an object", which is not what "out of memory" looks like. Uploads now stream to disk chunk by chunk.
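Chunked writing can be sketched as follows. The sketch assumes the client sends the file as the raw request body and that the body is available as an async-iterable stream of byte chunks (in recent Node versions a web ReadableStream can be iterated this way, though a type cast may be needed). The name `streamToDisk` and the size cap are hypothetical.

```typescript
import { open } from "node:fs/promises";

// Hypothetical helper: write an upload to disk one chunk at a time, so
// memory use stays near the size of a single chunk instead of the whole file.
export async function streamToDisk(
  chunks: AsyncIterable<Uint8Array>,
  destPath: string,
  maxBytes: number, // assumed cap, so an oversized upload fails early
): Promise<number> {
  const handle = await open(destPath, "w");
  let written = 0;
  try {
    for await (const chunk of chunks) {
      written += chunk.byteLength;
      if (written > maxBytes) {
        throw new Error(`upload exceeds ${maxBytes} bytes`);
      }
      // Awaiting each write means the next chunk is not pulled until
      // this one has been handed to the file system.
      await handle.write(chunk);
    }
  } finally {
    await handle.close();
  }
  return written;
}
```

A route handler could pass the request body as `chunks` and return the byte count, rejecting the request if the cap is exceeded.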

And the classic: audio plays, video freezes at second two. iPhones record variable frame rate, and concatenating segments that preserve it leaves timestamp cliffs at every boundary. Every segment is now forced to a constant 30 fps before assembly.
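One way to force a constant frame rate per segment is ffmpeg's `fps` filter, which duplicates or drops frames to reach the target rate. The helper below only builds an argument list. Its name, the libx264 and yuv420p choices, and dropping audio at this stage are assumptions, not the production command.

```typescript
// Hypothetical helper: build ffmpeg arguments that cut one segment and
// re-encode it at a constant frame rate, so concatenated segments share
// a uniform timestamp cadence.
export function cfrSegmentArgs(
  input: string,
  startSec: number,
  durationSec: number,
  output: string,
  fps = 30,
): string[] {
  return [
    "-y", // overwrite the output file if it exists
    "-ss", String(startSec), // seek in the input before decoding
    "-i", input,
    "-t", String(durationSec), // limit the output duration
    "-vf", `fps=${fps}`, // resample to a constant frame rate
    "-c:v", "libx264", // assumed codec
    "-pix_fmt", "yuv420p", // assumed pixel format for broad player support
    "-an", // assumed: music is added later, at assembly
    output,
  ];
}
```

The resulting array can be passed to a process spawner as the arguments to `ffmpeg`. Re-encoding every segment costs render time, but it removes the variable-rate timestamps before concatenation.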

// where it is now

Live at aistudio.edenhadad.com, in alpha with my family — the people it was built for. The QA agent runs full cycles against production and writes honest reports. Some of them are still painful to read. That's the point.

// © eden_hadad · edenhadad.com