Guardrails are the layer of filters and validators around an LLM that blocks unsafe, off-topic, or malformed output before it reaches the user. They include content-moderation models, format validators (regex, JSON Schema), topic filters, and prompt-injection detectors.
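As a minimal sketch of an application-level format validator, assuming the model has been prompted to return JSON (the schema and the PII regex here are hypothetical, not from any particular library):

```python
import json
import re
from jsonschema import validate, ValidationError

# Hypothetical schema for the model's structured output.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["answer", "confidence"],
}

def validate_output(raw: str) -> dict:
    """Reject malformed or policy-violating LLM output before it reaches the user."""
    try:
        data = json.loads(raw)
        validate(instance=data, schema=RESPONSE_SCHEMA)
    except (json.JSONDecodeError, ValidationError) as exc:
        raise ValueError(f"guardrail rejected output: {exc}") from exc
    # Toy content filter: block anything resembling an email address (PII).
    if re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", data["answer"]):
        raise ValueError("guardrail rejected output: PII detected")
    return data
```

In practice each check (parse, schema, content) would return a structured verdict rather than raising, so the application can decide whether to retry the generation or fall back.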
Most production LLM apps combine provider-level safety (OpenAI's moderation endpoint, Anthropic's Constitutional AI training) with application-level guardrails (NeMo Guardrails, the Guardrails AI library, Lakera). The trade-off is always added latency and false-positive rate versus robustness.
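To make the latency side of that trade-off concrete, here is a sketch of a provider-level check using OpenAI's moderation endpoint via the official `openai` Python SDK, with the round-trip time measured (the `moderate` helper and its logging are illustrative, not part of the SDK):

```python
import time
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def moderate(text: str) -> bool:
    """Return True if the moderation endpoint flags the text; log the added latency."""
    start = time.perf_counter()
    result = client.moderations.create(input=text).results[0]
    latency_ms = (time.perf_counter() - start) * 1000
    print(f"moderation latency: {latency_ms:.0f} ms, flagged={result.flagged}")
    return result.flagged
```

Every such network round trip sits on the critical path of the response, which is why many apps run guardrail checks concurrently with generation or only on a sampled fraction of traffic.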