AI Engineering6 min readAugust 12, 2025

Prompt Engineering Practices That Survive Production, Not Just Demos

A prompt that works in a playground and a prompt that survives real user input, edge cases, and six months of drift are different disciplines entirely.

Tanjil Ahmed

Lead Software Engineer · Notionhive

Prompt engineering gets dismissed as trivial by people who've only written prompts for a demo, and taken very seriously by anyone who's had a production prompt silently degrade after a model version upgrade. The difference between the two is a set of practices that has nothing to do with clever wording.

Version and test prompts like code — a prompt change is a deploy, and it needs a regression suite of real examples.
Structured output (JSON mode, function calling) beats parsing free text every time it's available.
Keep a golden set of tricky real user inputs and re-run it against every model or prompt change.
Separate the system prompt's instructions from the user's untrusted input explicitly — prompt injection starts where that line blurs.

The teams that get burned are the ones treating the prompt as a one-time creative writing exercise instead of a living piece of the system that needs the same regression discipline as any other code path touching production data.

A prompt without a regression test is a function without a test suite — it works until the day it quietly doesn't.