## Introduction
Every developer integrating LLMs hits the same wall: the model returns plain text, and you need structured data. The industry converged on JSON Schema, Pydantic models, and function calling to solve this. But what if forcing JSON on LLMs is the wrong abstraction entirely?
This article introduces a different paradigm — one where structured output is not just machine-parseable, but also human-renderable. Where the LLM doesn't return a JSON blob you have to build UI for, but a document that is the UI.
## The JSON Schema Era: How We Got Here

### From Regex Parsing to Pydantic — A Brief History
In 2022, getting structured data from an LLM meant prompt engineering and prayer. You'd write "respond in JSON format" and hope for the best. When the model inevitably returned `Sure! Here's the JSON:` before the actual payload, you'd write regex to extract it.
Then came the tooling wave:
- Instructor (2023) wrapped Pydantic models around LLM calls, giving you type-safe extraction with automatic retries when validation failed.
- Outlines and Guidance (2023-2024) took it further with constrained decoding — forcing the model's token generation to follow a grammar, guaranteeing valid JSON at the token level.
- OpenAI Structured Outputs (2024) made it a first-class API feature. Define a JSON Schema, get conforming output. No parsing, no retries.
By 2025, the ecosystem settled: if you want structured data from an LLM, you define a schema and the model fills it in. Problem solved.
Or is it?
### Why Every LLM Provider Shipped JSON Mode
The answer is simple: developers asked for it. When you're building a pipeline — LLM output feeds into a database, triggers a function, or populates a UI — you need predictable structure. JSON is universal. Every language parses it. Every database stores it.
But this framing treats LLM output as data transfer — a serialization problem between the model and your backend. That's one valid use case. It's not the only one.
## The Hidden Costs of Forcing JSON on LLMs

### 34% More Tokens — The JSON Tax You're Already Paying
JSON is verbose by design. Curly braces, quoted keys, colons, commas — all structural tokens that carry no semantic content. Research and benchmarks consistently show that Markdown is 34-38% more token-efficient than JSON for equivalent data.
Here's the same information in both formats:
JSON (87 tokens):
```json
{
  "patient": {
    "name": "Jane Doe",
    "date_of_birth": "1990-03-15",
    "chief_complaint": "Persistent headache for 3 days",
    "vitals": {
      "blood_pressure": "120/80",
      "heart_rate": 72,
      "temperature": 98.6
    },
    "assessment": "Tension-type headache. No red flags.",
    "plan": "OTC analgesics, follow up in 1 week if no improvement"
  }
}
```

Markdown with MDMA (fewer tokens, and the user can actually read it):
# Patient Intake
```mdma
id: patient-intake
type: form
fields:
  - name: patient_name
    type: text
    label: Patient Name
    required: true
    sensitive: true
  - name: dob
    type: date
    label: Date of Birth
    sensitive: true
  - name: chief_complaint
    type: textarea
    label: Chief Complaint
    required: true
  - name: assessment
    type: textarea
    label: Assessment
  - name: plan
    type: textarea
    label: Plan
onSubmit: submit-intake
```

The JSON version gives you data. The MDMA version gives you data and an interactive form the user fills out — without building a single line of frontend code.
At scale, the token difference compounds. If your application makes 10,000 LLM calls per day, switching from JSON to Markdown-based structured output can cut your token costs by a third.
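To make the arithmetic concrete, here is a quick sketch. The numbers are hypothetical placeholders (500 output tokens per call, a flat $10 per million tokens), not measured values; only the 34% reduction figure comes from the claim above.

```typescript
// Illustrative daily output-token cost; inputs are hypothetical.
function dailyOutputTokenCost(
  callsPerDay: number,
  tokensPerCall: number,
  pricePerMillionTokens: number,
): number {
  return (callsPerDay * tokensPerCall * pricePerMillionTokens) / 1_000_000;
}

const jsonTokens = 500;                          // hypothetical average per call
const mdTokens = Math.round(jsonTokens * 0.66);  // ~34% fewer tokens

const jsonCost = dailyOutputTokenCost(10_000, jsonTokens, 10); // 50 ($/day)
const mdCost = dailyOutputTokenCost(10_000, mdTokens, 10);     // 33 ($/day)
```

Under these assumptions, 10,000 calls per day drop from $50 to $33 in output tokens alone.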
### 10-15% Reasoning Degradation in JSON Mode
This one is less discussed but well-documented. When you force an LLM into JSON mode, you're constraining its generation process. The model can no longer "think out loud" — it must produce valid JSON from the first token.
Multiple benchmarks show 10-15% performance degradation on reasoning tasks when using constrained JSON output compared to free-form generation. The model is spending capacity on structural compliance instead of problem-solving.
With Markdown, the LLM writes in the format it was trained on. Markdown is the lingua franca of the internet — it appears massively in training data. The model doesn't fight the format; it flows with it.
### Schema Boilerplate That Scales with Complexity
A simple JSON Schema for a contact form is manageable. A schema for a multi-step KYC workflow with conditional fields, nested objects, and validation rules? You're looking at hundreds of lines of schema definition before writing any application logic.
Here's what a KYC form schema looks like in Pydantic:
```python
from datetime import date
from typing import Literal

from pydantic import BaseModel, EmailStr, Field


class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str = Field(pattern=r'^\d{5}(-\d{4})?$')


class KYCApplication(BaseModel):
    full_name: str = Field(min_length=2, max_length=100)
    date_of_birth: date
    ssn: str = Field(pattern=r'^\d{3}-\d{2}-\d{4}$')
    email: EmailStr
    phone: str
    address: Address
    employment_status: Literal['employed', 'self-employed', 'unemployed', 'retired']
    annual_income: float = Field(ge=0)
    source_of_funds: str
    is_pep: bool  # Politically Exposed Person
    documents_provided: list[str]
```

Then you need a separate UI component to render this form, validation logic in the frontend, and a mapping between the schema and the rendered fields.
With MDMA, the schema, the UI definition, and the validation rules are one thing:
```mdma
id: kyc-form
type: form
fields:
  - name: full_name
    type: text
    label: Full Legal Name
    required: true
    sensitive: true
  - name: date_of_birth
    type: date
    label: Date of Birth
    required: true
    sensitive: true
  - name: ssn
    type: text
    label: Social Security Number
    required: true
    sensitive: true
    validation:
      pattern: "^\\d{3}-\\d{2}-\\d{4}$"
      message: "Format: XXX-XX-XXXX"
  - name: email
    type: email
    label: Email Address
    required: true
    sensitive: true
  - name: employment_status
    type: select
    label: Employment Status
    required: true
    options:
      - { label: Employed, value: employed }
      - { label: Self-Employed, value: self-employed }
      - { label: Unemployed, value: unemployed }
      - { label: Retired, value: retired }
  - name: annual_income
    type: number
    label: Annual Income
    required: true
    validation:
      min: 0
  - name: source_of_funds
    type: textarea
    label: Source of Funds
    required: true
  - name: is_pep
    type: checkbox
    label: Politically Exposed Person (PEP)
onSubmit: submit-kyc
```

One definition. The LLM generates it. The renderer displays it. The runtime validates and collects the data. No Pydantic model. No separate frontend component. No mapping layer.
## Function Calling vs Structured Output vs Generative UI — When to Use Which
The industry treats these as competing approaches. They're not — they solve different problems.
### Function Calling: Great for Actions, Wrong for Data Collection
Function calling (tool use) excels when you want the LLM to do something: search a database, call an API, send a message. The model decides which function to invoke and with what parameters.
But function calling is awkward for collecting data from users. The model calls a `collect_user_info` function, your backend receives JSON, and then... you still need to build UI to display and edit that data. Function calling is a model-to-system interface, not a model-to-user interface.
### JSON Structured Output: Reliable but Invisible to Users
Structured output via JSON Schema gives you validated, typed data. It's the right choice when the output feeds directly into a pipeline — no human in the loop.
But the moment a human needs to see, review, or modify that output, you're building custom UI. Every new schema means a new form component. Every schema change means a frontend update.
### Generative UI: The Missing Third Option
What if the LLM's output was the interface?
This is the generative UI paradigm: instead of the model returning data that you render, the model returns a renderable document that collects, displays, and processes data.
MDMA implements this with extended Markdown. The LLM writes standard Markdown (headings, paragraphs, lists) interspersed with YAML-defined interactive components. A single renderer handles every document. No per-schema UI work.
Use function calling when the LLM needs to take action on behalf of the user. Use JSON structured output when the data flows into a machine-only pipeline. Use generative UI (MDMA) when a human needs to interact with the output.
## What If the Structured Output Was Also the UI?

### The False Choice Between Machine-Parseable and Human-Readable
The current paradigm forces a choice: either the LLM returns data (JSON, machine-parseable, invisible to users) or text (Markdown, human-readable, unstructured). You pick one and build infrastructure for the other.
MDMA rejects this trade-off. An MDMA document is:
- Human-readable: it's Markdown. Headings, paragraphs, and lists render as expected.
- Machine-parseable: `mdma` blocks parse into typed AST nodes with known schemas.
- Interactive: forms collect input, buttons trigger actions, approval gates enforce workflows.
- Auditable: every interaction is logged with tamper-evident hash chaining.
The same document serves the user, the system, and compliance — without conversion layers.
### Markdown as a Structured Format — Why LLMs Already Prefer It
LLMs don't "prefer" JSON. They were trained on the internet, and the internet runs on Markdown. README files, documentation, forum posts, chat messages — all Markdown or Markdown-adjacent.
When you ask an LLM to generate Markdown, you're asking it to work in its native medium. When you ask it to generate JSON, you're asking it to switch to a format optimized for machines, not for language models.
MDMA leans into this. The LLM writes Markdown as usual, and when it needs to express structure — a form, a table, a workflow gate — it drops into an `mdma` YAML block. The transition is natural:
Based on your description, this sounds like a P2 incident.
Let me set up the triage workflow.
## Incident Assessment
```mdma
id: incident-form
type: form
fields:
  - name: incident_title
    type: text
    label: Incident Title
    required: true
  - name: severity
    type: select
    label: Severity Level
    required: true
    options:
      - { label: "P1 - Critical", value: P1 }
      - { label: "P2 - High", value: P2 }
      - { label: "P3 - Medium", value: P3 }
      - { label: "P4 - Low", value: P4 }
  - name: affected_systems
    type: text
    label: Affected Systems
    required: true
  - name: customer_impact
    type: textarea
    label: Customer Impact
    required: true
onSubmit: submit-incident
```

Once submitted, I'll route this to the on-call team and
create the response checklist.

Natural language and structured components coexist in one document. The model doesn't need to choose between explaining context and collecting data — it does both.
## Extended Markdown: Forms, Tables, and Approval Gates from LLM Output

### How MDMA Extends Markdown with Interactive Components

MDMA (Markdown Document with Mounted Applications) adds nine component types to standard Markdown via fenced code blocks with the `mdma` language tag:
| Component | Purpose | Example Use Case |
|---|---|---|
| `form` | Structured data collection | Patient intake, bug reports, KYC |
| `table` | Tabular data with sorting/filtering | Search results, audit logs |
| `approval-gate` | Multi-step approval workflow | Manager sign-off, compliance review |
| `tasklist` | Checklist with completion tracking | Pre-deploy checklist, triage steps |
| `button` | Clickable action with confirmation | Notify Slack, deploy to production |
| `callout` | Highlighted message (info/warning/error) | SLA warnings, compliance notices |
| `chart` | Data visualization | Revenue trends, error rates |
| `webhook` | HTTP request trigger | Slack notifications, API calls |
| `thinking` | Collapsed AI reasoning | Debug transparency |
Components communicate through bindings — `{{field_name}}` expressions that resolve at runtime. When a user fills a form field, bound components update automatically.
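As a mental model, binding resolution amounts to template substitution over the collected field values. The sketch below is illustrative only — `resolveBindings` is a hypothetical helper, not the MDMA runtime's actual API:

```typescript
// Hypothetical sketch of {{field_name}} binding resolution. The real
// runtime resolves bindings against its document store; here we
// substitute from a plain record of field values.
function resolveBindings(
  template: string,
  fields: Record<string, string>,
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    name in fields ? fields[name] : match, // leave unknown bindings untouched
  );
}

const description =
  "Approve deployment of {{change_type}} change with {{risk_level}} risk level.";
resolveBindings(description, { change_type: "database", risk_level: "high" });
// → "Approve deployment of database change with high risk level."
```

Leaving unknown bindings untouched (rather than erasing them) keeps a half-filled form renderable while the user is still typing.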
### Code Example: From JSON Schema to a Single Prompt
The JSON Schema approach requires three artifacts:
1. Schema definition (Pydantic/JSON Schema)
2. LLM integration (function calling or structured output)
3. Frontend component (React form mapped to schema fields)
The MDMA approach requires one:
```typescript
import { buildSystemPrompt } from '@mobile-reality/mdma-prompt-pack';

const systemPrompt = buildSystemPrompt({
  customPrompt: `You are a customer support assistant. When a user
reports an issue, generate an MDMA form to collect structured
details, followed by a tasklist for resolution steps.`,
});

// Send to any LLM — OpenAI, Claude, Gemini, local models
const response = await llm.chat({
  system: systemPrompt,
  messages: conversation,
});

// The response contains Markdown with ```mdma blocks.
// Parse it into an interactive document:
import { unified } from 'unified';
import remarkParse from 'remark-parse';
import { remarkMdma } from '@mobile-reality/mdma-parser';

const processor = unified().use(remarkParse).use(remarkMdma);
const ast = await processor.run(processor.parse(response));
```

The LLM decides what fields to include based on the conversation. No predefined schema. The form structure emerges from context.
### What the User Actually Sees vs What the System Parses
When the LLM generates this response:
I've reviewed your request. Here's the change management form:
```mdma
id: change-request
type: form
fields:
  - name: change_title
    type: text
    label: Change Title
    required: true
  - name: risk_level
    type: select
    label: Risk Level
    options:
      - { label: Low, value: low }
      - { label: Medium, value: medium }
      - { label: High, value: high }
  - name: rollback_plan
    type: textarea
    label: Rollback Plan
    required: true
onSubmit: submit-change
```
```mdma
id: tech-lead-approval
type: approval-gate
title: Tech Lead Approval
requiredApprovers: 1
allowedRoles:
  - tech-lead
  - engineering-manager
onApprove: proceed-to-deploy
onDeny: return-to-author
requireReason: true
```

The user sees: a paragraph of text, a rendered form with input fields and dropdowns, and an approval gate with Approve/Deny buttons.
The system sees: typed AST nodes, validated component schemas, dispatchable actions, and a binding map connecting the form to the approval gate.
The audit log sees: every field change, every button click, every approval decision — timestamped, actor-tagged, and hash-chained.
One response. Three consumers. Zero custom UI code.
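Hash chaining is what makes the audit log tamper-evident: each entry's hash covers its payload plus the previous entry's hash, so rewriting any earlier entry invalidates every link after it. A minimal sketch of the technique (illustrative only — the entry fields and hashing scheme here are assumptions, not MDMA's actual log format):

```typescript
import { createHash } from 'node:crypto';

interface AuditEntry {
  timestamp: string;
  actor: string;
  action: string;
  prevHash: string;
  hash: string;
}

// Append an entry whose hash commits to the previous entry's hash.
function appendEntry(
  log: AuditEntry[],
  actor: string,
  action: string,
  timestamp: string,
): AuditEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : 'GENESIS';
  const hash = createHash('sha256')
    .update(`${timestamp}|${actor}|${action}|${prevHash}`)
    .digest('hex');
  return [...log, { timestamp, actor, action, prevHash, hash }];
}

// Recompute every hash; any edit to an earlier entry breaks the chain.
function verifyChain(log: AuditEntry[]): boolean {
  return log.every((entry, i) => {
    const expectedPrev = i === 0 ? 'GENESIS' : log[i - 1].hash;
    const recomputed = createHash('sha256')
      .update(`${entry.timestamp}|${entry.actor}|${entry.action}|${expectedPrev}`)
      .digest('hex');
    return entry.prevHash === expectedPrev && entry.hash === recomputed;
  });
}
```

Tampering with any recorded field, even in the oldest entry, makes `verifyChain` return false without needing to trust the log's storage layer.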
## MDMA vs Instructor vs Outlines vs Guidance

### Feature Comparison at a Glance
| Feature | Instructor | Outlines | Guidance | MDMA |
|---|---|---|---|---|
| Output format | JSON (Pydantic) | JSON/regex/grammar | JSON/structured text | Markdown + YAML components |
| Schema definition | Python classes | JSON Schema/regex | Template programs | YAML in Markdown |
| Requires custom UI | Yes | Yes | Yes | No — renderer included |
| Human-readable output | No (raw JSON) | No | Partially | Yes — it's Markdown |
| Approval workflows | Not built-in | Not built-in | Not built-in | First-class approval-gate |
| Audit trail | Not built-in | Not built-in | Not built-in | Hash-chained event log |
| PII handling | Not built-in | Not built-in | Not built-in | Automatic detection & redaction |
| Token efficiency | JSON overhead | JSON overhead | Varies | 34-38% fewer tokens |
| LLM compatibility | OpenAI, Anthropic, others | Local models (HF) | Local models | Any LLM (prompt-based) |
| Validation | Pydantic validators | Grammar constraints | Template constraints | YAML schema + field validation |
### When You Still Need JSON (and When You Don't)
Use Instructor/Outlines/Guidance when:
- Output feeds directly into a machine pipeline (no human interaction)
- You need guaranteed JSON for database insertion or API calls
- You're doing bulk extraction from documents (NER, classification)
- Latency is critical and you need constrained decoding
Use MDMA when:
- A human needs to see, review, or interact with the output
- You need approval workflows or multi-step processes
- Compliance requires audit trails (HIPAA, SOX, MiFID)
- You want the LLM to dynamically decide what data to collect
- You're building a chatbot that needs to do more than return text
The key distinction: Instructor and friends solve LLM → machine communication. MDMA solves LLM → human → machine communication.
## Building Human-in-the-Loop Workflows Without a Separate Frontend

### Approval Gates as First-Class Output
Most human-in-the-loop implementations follow this pattern:
1. LLM produces JSON
2. Backend stores the JSON
3. Frontend renders a custom approval UI
4. User approves/denies
5. Backend processes the decision
That's five steps across three systems. With MDMA, it's one:
The deployment package is ready for review.
```mdma
id: deploy-approval
type: approval-gate
title: Production Deployment Approval
description: >
  Deploying v2.4.1 to production. Changes include
  the new payment processing module and updated rate limits.
requiredApprovers: 2
allowedRoles:
  - tech-lead
  - engineering-manager
  - director
onApprove: execute-deployment
onDeny: return-to-author
requireReason: true
```

The LLM generates the approval gate. The renderer displays it. The runtime enforces role-based access, collects approvals, and fires the appropriate action. The event log records who approved, when, and why — with hash-chained integrity.
No custom UI. No separate approval system. No integration work.
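The semantics the runtime enforces can be pictured as a small state function: ignore votes from disallowed roles, resolve immediately on any denial, and approve once enough distinct approvers have signed off. This is a hypothetical sketch of the idea — `gateStatus` and its types are illustrative, not the runtime's API:

```typescript
// Illustrative approval-gate state machine (not the MDMA runtime API).
type Decision = 'approve' | 'deny';

interface GateConfig {
  requiredApprovers: number;
  allowedRoles: string[];
}

interface Vote {
  actor: string;
  role: string;
  decision: Decision;
  reason: string; // requireReason: true means this is always collected
}

function gateStatus(
  config: GateConfig,
  votes: Vote[],
): 'pending' | 'approved' | 'denied' {
  // Role-based access: votes from disallowed roles are ignored.
  const valid = votes.filter((v) => config.allowedRoles.includes(v.role));
  // A single denial resolves the gate.
  if (valid.some((v) => v.decision === 'deny')) return 'denied';
  // Count distinct approvers so one person cannot approve twice.
  const approvers = new Set(
    valid.filter((v) => v.decision === 'approve').map((v) => v.actor),
  );
  return approvers.size >= config.requiredApprovers ? 'approved' : 'pending';
}
```

Counting distinct actors rather than raw votes is what makes `requiredApprovers: 2` mean two people, not two clicks.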
### Multi-Step Forms Driven by Conversation Context
MDMA components communicate through bindings, enabling multi-step workflows within a single document:
## Step 1: Risk Assessment
```mdma
id: risk-form
type: form
fields:
  - name: change_type
    type: select
    label: Change Type
    required: true
    options:
      - { label: Infrastructure, value: infrastructure }
      - { label: Application, value: application }
      - { label: Database, value: database }
  - name: risk_level
    type: select
    label: Risk Level
    required: true
    options:
      - { label: Low, value: low }
      - { label: Medium, value: medium }
      - { label: High, value: high }
onSubmit: assess-risk
```
## Step 2: Pre-deployment Checklist
```mdma
id: pre-deploy-checklist
type: tasklist
items:
  - id: tests-pass
    text: All tests pass in CI
    required: true
  - id: code-review
    text: Code reviewed by 2 engineers
    required: true
  - id: staging-verified
    text: Changes verified on staging
    required: true
  - id: rollback-ready
    text: Rollback plan documented
    required: true
onComplete: checklist-done
```
## Step 3: Approval
```mdma
id: change-approval
type: approval-gate
title: Change Approval
description: >
  Approve deployment of {{change_type}} change
  with {{risk_level}} risk level.
requiredApprovers: 1
allowedRoles:
  - tech-lead
  - engineering-manager
onApprove: execute-change
onDeny: reject-change
requireReason: true
```

Notice: `{{change_type}}` and `{{risk_level}}` in the approval gate description are bindings to the form fields above. When the user fills the form, the approval gate description updates in real time.
The entire workflow — assessment, checklist, approval — lives in one document generated by one LLM response. No orchestration service. No state machine definition. No workflow engine configuration.
## Getting Started with MDMA

### Installation and First Render
```bash
npm install @mobile-reality/mdma-parser \
  @mobile-reality/mdma-runtime \
  @mobile-reality/mdma-renderer-react \
  @mobile-reality/mdma-prompt-pack
```

Parse and render an MDMA document in under 20 lines:
```tsx
import { unified } from 'unified';
import remarkParse from 'remark-parse';
import { remarkMdma } from '@mobile-reality/mdma-parser';
import { createDocumentStore } from '@mobile-reality/mdma-runtime';
import { MdmaProvider, MdmaDocument } from '@mobile-reality/mdma-renderer-react';

// Parse
const processor = unified().use(remarkParse).use(remarkMdma);
const ast = await processor.run(processor.parse(markdownString));

// Create runtime
const store = createDocumentStore(ast, {
  documentId: 'my-doc',
  sessionId: crypto.randomUUID(),
  environment: 'production',
});

// Render
function App() {
  return (
    <MdmaProvider store={store} ast={ast}>
      <MdmaDocument />
    </MdmaProvider>
  );
}
```

### Plugging into Your Existing LLM Pipeline
MDMA works with any LLM. Use the prompt pack to give the model MDMA authoring capabilities:
```typescript
import { buildSystemPrompt } from '@mobile-reality/mdma-prompt-pack';

const systemPrompt = buildSystemPrompt({
  customPrompt: `You are an HR onboarding assistant. When a new
employee needs to be onboarded, generate MDMA forms to collect
their information, a tasklist for IT setup, and an approval
gate for their manager.`,
});

// Works with OpenAI
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: 'New hire: Senior Engineer starting March 1st' },
  ],
});

// Works with Anthropic
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  system: systemPrompt,
  messages: [
    { role: 'user', content: 'New hire: Senior Engineer starting March 1st' },
  ],
});

// Parse the response — same code regardless of LLM provider
const ast = await processor.run(processor.parse(response.content));
const store = createDocumentStore(ast, { /* config */ });
```

The LLM generates the document structure dynamically based on conversation context. Different conversations produce different forms, different checklists, different approval flows — all handled by the same renderer.
## Conclusion
JSON Schema solved the structured LLM output problem for machine-to-machine communication. But the moment a human enters the loop — reviewing data, filling forms, approving decisions — JSON becomes the wrong abstraction. You end up building custom UI for every schema, managing state across frontend and backend, and bolting on audit trails as an afterthought.
MDMA takes a different approach: let the LLM write what it's best at (Markdown), extend it with interactive components (YAML blocks), and handle everything else — rendering, validation, state management, audit logging — in the runtime.
The result: structured output that humans can actually use, without the JSON tax.
