## Introduction
Every developer integrating LLMs hits the same wall: the model returns plain text, and you need structured data. The industry converged on JSON Schema, Pydantic models, and function calling to solve this. But what if forcing JSON on LLMs is the wrong abstraction entirely?
This article introduces a different paradigm — one where structured output is not just machine-parseable, but also human-renderable. Where the LLM doesn't return a JSON blob you have to build UI for, but a document that is the UI.
## The JSON Schema Era: How We Got Here

### From Regex Parsing to Pydantic — A Brief History
In 2022, getting structured data from an LLM meant prompt engineering and prayer. You'd write "respond in JSON format" and hope for the best. When the model inevitably returned Sure! Here's the JSON: before the actual payload, you'd write regex to extract it.
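That extraction step can be sketched in a few lines. This is an illustrative reconstruction of the pattern, not code from any particular library, and the regex is deliberately as brittle as the real ones were:

```typescript
// Naive 2022-style extraction: find the first {...} span in the reply and
// hope it parses. Brittle by design -- which is exactly the point.
function extractJson(reply: string): unknown {
  const match = reply.match(/\{[\s\S]*\}/); // grab the outermost-looking braces
  if (!match) throw new Error('No JSON found in model reply');
  return JSON.parse(match[0]);
}

const reply = 'Sure! Here\'s the JSON: {"severity": "P2", "system": "billing"}';
const data = extractJson(reply) as { severity: string; system: string };
// data.severity is 'P2', data.system is 'billing'
```

It breaks the moment the model emits a stray brace in prose, nests a code fence, or truncates mid-object, which is why the tooling wave below happened.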
Then came the tooling wave:
- Instructor (2023) wrapped Pydantic models around LLM calls, giving you type-safe extraction with automatic retries when validation failed.
- Outlines and Guidance (2023-2024) took it further with constrained decoding — forcing the model's token generation to follow a grammar, guaranteeing valid JSON at the token level.
- OpenAI Structured Outputs (2024) made it a first-class API feature. Define a JSON Schema, get conforming output. No parsing, no retries.
By 2025, the ecosystem settled: if you want structured data from an LLM, you define a schema and the model fills it in. Problem solved.
Or is it?
### Why Every LLM Provider Shipped JSON Mode
The answer is simple: developers asked for it. When you're building a pipeline — LLM output feeds into a database, triggers a function, or populates a UI — you need predictable structure. JSON is universal. Every language parses it. Every database stores it.
But this framing treats LLM output as data transfer — a serialization problem between the model and your backend. That's one valid use case. It's not the only one.
## The Hidden Costs of Forcing JSON on LLMs

### 34% More Tokens — The JSON Tax You're Already Paying
JSON is verbose by design. Curly braces, quoted keys, colons, commas — all structural tokens that carry no semantic content. Research and benchmarks consistently show that Markdown is 34-38% more token-efficient than JSON for equivalent data.
Here's the same information in both formats:
JSON (87 tokens):
```json
{
  "patient": {
    "name": "Jane Doe",
    "date_of_birth": "1990-03-15",
    "chief_complaint": "Persistent headache for 3 days",
    "vitals": {
      "blood_pressure": "120/80",
      "heart_rate": 72,
      "temperature": 98.6
    },
    "assessment": "Tension-type headache. No red flags.",
    "plan": "OTC analgesics, follow up in 1 week if no improvement"
  }
}
```
Markdown with MDMA (fewer tokens, and the user can actually read it):
````markdown
# Patient Intake

```mdma
id: patient-intake
type: form
fields:
  - name: patient_name
    type: text
    label: Patient Name
    required: true
    sensitive: true
  - name: dob
    type: date
    label: Date of Birth
    sensitive: true
  - name: chief_complaint
    type: textarea
    label: Chief Complaint
    required: true
  - name: assessment
    type: textarea
    label: Assessment
  - name: plan
    type: textarea
    label: Plan
onSubmit: submit-intake
```
````
The JSON version gives you data. The MDMA version gives you data and an interactive form the user fills out — without building a single line of frontend code.
At scale, the token difference compounds. If your application makes 10,000 LLM calls per day, switching from JSON to Markdown-based structured output can cut your token costs by a third.
### 10-15% Reasoning Degradation in JSON Mode
This one is less discussed but well-documented. When you force an LLM into JSON mode, you're constraining its generation process. The model can no longer "think out loud" — it must produce valid JSON from the first token.
Multiple benchmarks show 10-15% performance degradation on reasoning tasks when using constrained JSON output compared to free-form generation. The model is spending capacity on structural compliance instead of problem-solving.
With Markdown, the LLM writes in the format it was trained on. Markdown is the lingua franca of the internet — it appears massively in training data. The model doesn't fight the format; it flows with it.
### Schema Boilerplate That Scales with Complexity
A simple JSON Schema for a contact form is manageable. A schema for a multi-step KYC workflow with conditional fields, nested objects, and validation rules? You're looking at hundreds of lines of schema definition before writing any application logic.
Here's what a KYC form schema looks like in Pydantic:
```python
from datetime import date
from typing import Literal

from pydantic import BaseModel, EmailStr, Field


class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str = Field(pattern=r'^\d{5}(-\d{4})?$')


class KYCApplication(BaseModel):
    full_name: str = Field(min_length=2, max_length=100)
    date_of_birth: date
    ssn: str = Field(pattern=r'^\d{3}-\d{2}-\d{4}$')
    email: EmailStr
    phone: str
    address: Address
    employment_status: Literal['employed', 'self-employed', 'unemployed', 'retired']
    annual_income: float = Field(ge=0)
    source_of_funds: str
    is_pep: bool  # Politically Exposed Person
    documents_provided: list[str]
```
Then you need a separate UI component to render this form, validation logic in the frontend, and mapping between the schema and the rendered fields.
With MDMA, the schema, the UI definition, and the validation rules are one thing:
```mdma
id: kyc-form
type: form
fields:
  - name: full_name
    type: text
    label: Full Legal Name
    required: true
    sensitive: true
  - name: date_of_birth
    type: date
    label: Date of Birth
    required: true
    sensitive: true
  - name: ssn
    type: text
    label: Social Security Number
    required: true
    sensitive: true
    validation:
      pattern: "^\\d{3}-\\d{2}-\\d{4}$"
      message: "Format: XXX-XX-XXXX"
  - name: email
    type: email
    label: Email Address
    required: true
    sensitive: true
  - name: employment_status
    type: select
    label: Employment Status
    required: true
    options:
      - { label: Employed, value: employed }
      - { label: Self-Employed, value: self-employed }
      - { label: Unemployed, value: unemployed }
      - { label: Retired, value: retired }
  - name: annual_income
    type: number
    label: Annual Income
    required: true
    validation:
      min: 0
  - name: source_of_funds
    type: textarea
    label: Source of Funds
    required: true
  - name: is_pep
    type: checkbox
    label: Politically Exposed Person (PEP)
onSubmit: submit-kyc
```
One definition. The LLM generates it. The renderer displays it. The runtime validates and collects the data. No Pydantic model. No separate frontend component. No mapping layer.
## Function Calling vs Structured Output vs Generative UI — When to Use Which
The industry treats these as competing approaches. They're not — they solve different problems.
### Function Calling: Great for Actions, Wrong for Data Collection
Function calling (tool use) excels when you want the LLM to do something: search a database, call an API, send a message. The model decides which function to invoke and with what parameters.
But function calling is awkward for collecting data from users. The model calls a `collect_user_info` function, your backend receives JSON, and then... you still need to build UI to display and edit that data. Function calling is a model-to-system interface, not a model-to-user interface.
### JSON Structured Output: Reliable but Invisible to Users
Structured output via JSON Schema gives you validated, typed data. It's the right choice when the output feeds directly into a pipeline — no human in the loop.
But the moment a human needs to see, review, or modify that output, you're building custom UI. Every new schema means a new form component. Every schema change means a frontend update.
### Generative UI: The Missing Third Option
What if the LLM's output was the interface?
This is the generative UI paradigm: instead of the model returning data that you render, the model returns a renderable document that collects, displays, and processes data.
MDMA implements this with extended Markdown. The LLM writes standard Markdown (headings, paragraphs, lists) interspersed with YAML-defined interactive components. A single renderer handles every document. No per-schema UI work.
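The "single renderer" idea reduces to one dispatch over component types. The sketch below is illustrative: the node shapes are simplified stand-ins, not the real parser's AST types, and the string output stands in for actual UI rendering:

```typescript
// Illustrative dispatch at the heart of a single renderer: one function
// handles every component type, so a new schema needs no new UI code.
// These node shapes are simplified assumptions, not MDMA's real AST types.
type ComponentNode =
  | { type: 'form'; id: string }
  | { type: 'table'; id: string }
  | { type: 'approval-gate'; id: string };

function renderComponent(node: ComponentNode): string {
  switch (node.type) {
    case 'form':
      return `[form ${node.id}: input fields + submit]`;
    case 'table':
      return `[table ${node.id}: sortable rows]`;
    case 'approval-gate':
      return `[gate ${node.id}: approve/deny buttons]`;
  }
}

const out = renderComponent({ type: 'form', id: 'patient-intake' });
// out is '[form patient-intake: input fields + submit]'
```

The key property is that the switch is closed over component types, not over schemas: a thousand different forms all hit the same `form` branch.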
Use function calling when the LLM needs to take action on behalf of the user. Use JSON structured output when the data flows into a machine-only pipeline. Use generative UI (MDMA) when a human needs to interact with the output.
## What If the Structured Output Was Also the UI?

### The False Choice Between Machine-Parseable and Human-Readable
The current paradigm forces a choice: either the LLM returns data (JSON, machine-parseable, invisible to users) or text (Markdown, human-readable, unstructured). You pick one and build infrastructure for the other.
MDMA rejects this trade-off. An MDMA document is:
- Human-readable: it's Markdown. Headings, paragraphs, and lists render as expected.
- Machine-parseable: `mdma` blocks parse into typed AST nodes with known schemas.
- Interactive: forms collect input, buttons trigger actions, approval gates enforce workflows.
- Auditable: every interaction is logged with tamper-evident hash chaining.
The same document serves the user, the system, and compliance — without conversion layers.
### Markdown as a Structured Format — Why LLMs Already Prefer It
LLMs don't "prefer" JSON. They were trained on the internet, and the internet runs on Markdown. README files, documentation, forum posts, chat messages — all Markdown or Markdown-adjacent.
When you ask an LLM to generate Markdown, you're asking it to work in its native medium. When you ask it to generate JSON, you're asking it to switch to a format optimized for machines, not for language models.
MDMA leans into this. The LLM writes Markdown as usual, and when it needs to express structure — a form, a table, a workflow gate — it drops into an `mdma` YAML block. The transition is natural:
````markdown
Based on your description, this sounds like a P2 incident. Let me set up the triage workflow.

## Incident Assessment

```mdma
id: incident-form
type: form
fields:
  - name: incident_title
    type: text
    label: Incident Title
    required: true
  - name: severity
    type: select
    label: Severity Level
    required: true
    options:
      - { label: "P1 - Critical", value: P1 }
      - { label: "P2 - High", value: P2 }
      - { label: "P3 - Medium", value: P3 }
      - { label: "P4 - Low", value: P4 }
  - name: affected_systems
    type: text
    label: Affected Systems
    required: true
  - name: customer_impact
    type: textarea
    label: Customer Impact
    required: true
onSubmit: submit-incident
```

Once submitted, I'll route this to the on-call team and create the response checklist.
````
Natural language and structured components coexist in one document. The model doesn't need to choose between explaining context and collecting data — it does both.
## Extended Markdown: Forms, Tables, and Approval Gates from LLM Output

### How MDMA Extends Markdown with Interactive Components

MDMA (Markdown Document with Mounted Applications) adds nine component types to standard Markdown via fenced code blocks with the `mdma` language tag:
| Component | Purpose | Example Use Case |
|---|---|---|
| `form` | Structured data collection | Patient intake, bug reports, KYC |
| `table` | Tabular data with sorting/filtering | Search results, audit logs |
| `approval-gate` | Multi-step approval workflow | Manager sign-off, compliance review |
| `tasklist` | Checklist with completion tracking | Pre-deploy checklist, triage steps |
| `button` | Clickable action with confirmation | Notify Slack, deploy to production |
| `callout` | Highlighted message (info/warning/error) | SLA warnings, compliance notices |
| `chart` | Data visualization | Revenue trends, error rates |
| `webhook` | HTTP request trigger | Slack notifications, API calls |
| `thinking` | Collapsed AI reasoning | Debug transparency |
Components communicate through bindings — `{{field_name}}` expressions that resolve at runtime. When a user fills a form field, bound components update automatically.
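The resolution step can be sketched in a few lines. This is an illustrative implementation, assuming simple string substitution; the actual MDMA runtime may resolve bindings differently:

```typescript
// Illustrative binding resolution: replace {{field_name}} placeholders with
// current form values, leaving unresolved bindings untouched.
// The real MDMA runtime's behavior may differ.
function resolveBindings(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (whole, name: string) =>
    name in values ? values[name] : whole
  );
}

const description = 'Approve deployment of {{change_type}} change with {{risk_level}} risk level.';
const rendered = resolveBindings(description, { change_type: 'database', risk_level: 'high' });
// rendered is 'Approve deployment of database change with high risk level.'
```

Re-running the substitution whenever a bound field changes is what makes dependent components update as the user types.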
### Code Example: From JSON Schema to a Single Prompt
The JSON Schema approach requires three artifacts:
- Schema definition (Pydantic/JSON Schema)
- LLM integration (function calling or structured output)
- Frontend component (React form mapped to schema fields)
The MDMA approach requires one:
```typescript
import { buildSystemPrompt } from '@mobile-reality/mdma-prompt-pack';

const systemPrompt = buildSystemPrompt({
  customPrompt: `You are a customer support assistant.
When a user reports an issue, generate an MDMA form to collect
structured details, followed by a tasklist for resolution steps.`,
});

// Send to any LLM — OpenAI, Claude, Gemini, local models
const response = await llm.chat({
  system: systemPrompt,
  messages: conversation,
});

// The response contains Markdown with ```mdma blocks.
// Parse it into an interactive document:
import { unified } from 'unified';
import remarkParse from 'remark-parse';
import { remarkMdma } from '@mobile-reality/mdma-parser';

const processor = unified().use(remarkParse).use(remarkMdma);
const ast = await processor.run(processor.parse(response));
```
The LLM decides what fields to include based on the conversation. No predefined schema. The form structure emerges from context.
### What the User Actually Sees vs What the System Parses
When the LLM generates this response:
````markdown
I've reviewed your request. Here's the change management form:

```mdma
id: change-request
type: form
fields:
  - name: change_title
    type: text
    label: Change Title
    required: true
  - name: risk_level
    type: select
    label: Risk Level
    options:
      - { label: Low, value: low }
      - { label: Medium, value: medium }
      - { label: High, value: high }
  - name: rollback_plan
    type: textarea
    label: Rollback Plan
    required: true
onSubmit: submit-change
```

```mdma
id: tech-lead-approval
type: approval-gate
title: Tech Lead Approval
requiredApprovers: 1
allowedRoles:
  - tech-lead
  - engineering-manager
onApprove: proceed-to-deploy
onDeny: return-to-author
requireReason: true
```
````
The user sees: a paragraph of text, a rendered form with input fields and dropdowns, and an approval gate with Approve/Deny buttons.
The system sees: typed AST nodes, validated component schemas, dispatchable actions, and a binding map connecting the form to the approval gate.
The audit log sees: every field change, every button click, every approval decision — timestamped, actor-tagged, and hash-chained.
One response. Three consumers. Zero custom UI code.
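The hash-chaining behind that audit trail can be sketched concretely. This is an illustrative implementation of the general technique, assuming SHA-256 over concatenated fields; the entry schema here is not MDMA's actual log format:

```typescript
import { createHash } from 'node:crypto';

// Sketch of tamper-evident hash chaining: each entry's hash covers the
// previous entry's hash, so editing any past event breaks every later hash.
// Field names are illustrative assumptions, not MDMA's actual log schema.
interface AuditEntry {
  timestamp: string;
  actor: string;
  event: string;
  prevHash: string;
  hash: string;
}

function appendEntry(log: AuditEntry[], actor: string, event: string): AuditEntry[] {
  const prevHash = log.length ? log[log.length - 1].hash : 'GENESIS';
  const timestamp = new Date().toISOString();
  const hash = createHash('sha256')
    .update(`${prevHash}|${timestamp}|${actor}|${event}`)
    .digest('hex');
  return [...log, { timestamp, actor, event, prevHash, hash }];
}

let log: AuditEntry[] = [];
log = appendEntry(log, 'alice', 'form.submit');
log = appendEntry(log, 'bob', 'gate.approve');
// Tampering with log[0] now invalidates log[1].hash on re-verification.
```

Verification walks the chain and recomputes each hash; the first mismatch pinpoints where history was altered.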
## MDMA vs Instructor vs Outlines vs Guidance

### Feature Comparison at a Glance
| Feature | Instructor | Outlines | Guidance | MDMA |
|---|---|---|---|---|
| Output format | JSON (Pydantic) | JSON/regex/grammar | JSON/structured text | Markdown + YAML components |
| Schema definition | Python classes | JSON Schema/regex | Template programs | YAML in Markdown |
| Requires custom UI | Yes | Yes | Yes | No — renderer included |
| Human-readable output | No (raw JSON) | No | Partially | Yes — it's Markdown |
| Approval workflows | Not built-in | Not built-in | Not built-in | First-class approval-gate |
| Audit trail | Not built-in | Not built-in | Not built-in | Hash-chained event log |
| PII handling | Not built-in | Not built-in | Not built-in | Automatic detection & redaction |
| Token efficiency | JSON overhead | JSON overhead | Varies | 34-38% fewer tokens |
| LLM compatibility | OpenAI, Anthropic, others | Local models (HF) | Local models | Any LLM (prompt-based) |
| Validation | Pydantic validators | Grammar constraints | Template constraints | YAML schema + field validation |
### When You Still Need JSON (and When You Don't)
Use Instructor/Outlines/Guidance when:
- Output feeds directly into a machine pipeline (no human interaction)
- You need guaranteed JSON for database insertion or API calls
- You're doing bulk extraction from documents (NER, classification)
- Latency is critical and you need constrained decoding
Use MDMA when:
- A human needs to see, review, or interact with the output
- You need approval workflows or multi-step processes
- Compliance requires audit trails (HIPAA, SOX, MiFID)
- You want the LLM to dynamically decide what data to collect
- You're building a chatbot that needs to do more than return text
The key distinction: Instructor and friends solve LLM → machine communication. MDMA solves LLM → human → machine communication.
## Building Human-in-the-Loop Workflows Without a Separate Frontend

### Approval Gates as First-Class Output
Most human-in-the-loop implementations follow this pattern:
- LLM produces JSON
- Backend stores the JSON
- Frontend renders a custom approval UI
- User approves/denies
- Backend processes the decision
That's five steps across three systems. With MDMA, it's one:
````markdown
The deployment package is ready for review.

```mdma
id: deploy-approval
type: approval-gate
title: Production Deployment Approval
description: >
  Deploying v2.4.1 to production. Changes include the new payment
  processing module and updated rate limits.
requiredApprovers: 2
allowedRoles:
  - tech-lead
  - engineering-manager
  - director
onApprove: execute-deployment
onDeny: return-to-author
requireReason: true
```
````
The LLM generates the approval gate. The renderer displays it. The runtime enforces role-based access, collects approvals, and fires the appropriate action. The event log records who approved, when, and why — with hash-chained integrity.
No custom UI. No separate approval system. No integration work.
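The enforcement mechanics are simple to sketch. This is an illustrative model of role checks and approval counting, assuming a plain in-memory structure; the real MDMA runtime's types and behavior may differ:

```typescript
// Minimal sketch of approval-gate enforcement: role checks, approval
// counting, and a status change when the threshold is met.
// Types and semantics are assumptions, not the MDMA runtime's API.
interface Gate {
  requiredApprovers: number;
  allowedRoles: string[];
  approvals: { actor: string; role: string; reason: string }[];
}

function approve(gate: Gate, actor: string, role: string, reason: string): 'pending' | 'approved' {
  if (!gate.allowedRoles.includes(role)) {
    throw new Error(`${role} is not allowed to approve this gate`);
  }
  gate.approvals.push({ actor, role, reason });
  return gate.approvals.length >= gate.requiredApprovers ? 'approved' : 'pending';
}

const gate: Gate = {
  requiredApprovers: 2,
  allowedRoles: ['tech-lead', 'director'],
  approvals: [],
};
approve(gate, 'carol', 'tech-lead', 'Reviewed the diff');             // 'pending'
const status = approve(gate, 'dave', 'director', 'Rollback plan OK'); // 'approved'
```

In a real deployment, the `onApprove` action would fire at the moment the status flips, and each call to `approve` would also append to the audit log.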
### Multi-Step Forms Driven by Conversation Context
MDMA components communicate through bindings, enabling multi-step workflows within a single document:
````markdown
## Step 1: Risk Assessment

```mdma
id: risk-form
type: form
fields:
  - name: change_type
    type: select
    label: Change Type
    required: true
    options:
      - { label: Infrastructure, value: infrastructure }
      - { label: Application, value: application }
      - { label: Database, value: database }
  - name: risk_level
    type: select
    label: Risk Level
    required: true
    options:
      - { label: Low, value: low }
      - { label: Medium, value: medium }
      - { label: High, value: high }
onSubmit: assess-risk
```

## Step 2: Pre-deployment Checklist

```mdma
id: pre-deploy-checklist
type: tasklist
items:
  - id: tests-pass
    text: All tests pass in CI
    required: true
  - id: code-review
    text: Code reviewed by 2 engineers
    required: true
  - id: staging-verified
    text: Changes verified on staging
    required: true
  - id: rollback-ready
    text: Rollback plan documented
    required: true
onComplete: checklist-done
```

## Step 3: Approval

```mdma
id: change-approval
type: approval-gate
title: Change Approval
description: >
  Approve deployment of {{change_type}} change with {{risk_level}} risk level.
requiredApprovers: 1
allowedRoles:
  - tech-lead
  - engineering-manager
onApprove: execute-change
onDeny: reject-change
requireReason: true
```
````
Notice: `{{change_type}}` and `{{risk_level}}` in the approval gate description are bindings to the form fields above. When the user fills the form, the approval gate description updates in real time.
The entire workflow — assessment, checklist, approval — lives in one document generated by one LLM response. No orchestration service. No state machine definition. No workflow engine configuration.
## Getting Started with MDMA

### Installation and First Render

```bash
npm install @mobile-reality/mdma-parser \
  @mobile-reality/mdma-runtime \
  @mobile-reality/mdma-renderer-react \
  @mobile-reality/mdma-prompt-pack
```
Parse and render an MDMA document in under 20 lines:
```tsx
import { unified } from 'unified';
import remarkParse from 'remark-parse';
import { remarkMdma } from '@mobile-reality/mdma-parser';
import { createDocumentStore } from '@mobile-reality/mdma-runtime';
import { MdmaProvider, MdmaDocument } from '@mobile-reality/mdma-renderer-react';

// Parse
const processor = unified().use(remarkParse).use(remarkMdma);
const ast = await processor.run(processor.parse(markdownString));

// Create runtime
const store = createDocumentStore(ast, {
  documentId: 'my-doc',
  sessionId: crypto.randomUUID(),
  environment: 'production',
});

// Render
function App() {
  return (
    <MdmaProvider store={store} ast={ast}>
      <MdmaDocument />
    </MdmaProvider>
  );
}
```
### Plugging into Your Existing LLM Pipeline
MDMA works with any LLM. Use the prompt pack to give the model MDMA authoring capabilities:
```typescript
import { buildSystemPrompt } from '@mobile-reality/mdma-prompt-pack';

const systemPrompt = buildSystemPrompt({
  customPrompt: `You are an HR onboarding assistant.
When a new employee needs to be onboarded, generate MDMA forms to collect
their information, a tasklist for IT setup, and an approval gate for their manager.`,
});

// Works with OpenAI
const openaiResponse = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: 'New hire: Senior Engineer starting March 1st' },
  ],
});

// Works with Anthropic
const anthropicResponse = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  system: systemPrompt,
  messages: [
    { role: 'user', content: 'New hire: Senior Engineer starting March 1st' },
  ],
});

// Extract the Markdown text (the accessor is provider-specific):
//   OpenAI:    openaiResponse.choices[0].message.content
//   Anthropic: the text blocks in anthropicResponse.content
const markdown = openaiResponse.choices[0].message.content ?? '';

// Parse the response — same code regardless of LLM provider
const ast = await processor.run(processor.parse(markdown));
const store = createDocumentStore(ast, { /* config */ });
```
The LLM generates the document structure dynamically based on conversation context. Different conversations produce different forms, different checklists, different approval flows — all handled by the same renderer.
## Conclusion
JSON Schema solved the structured LLM output problem for machine-to-machine communication. But the moment a human enters the loop — reviewing data, filling forms, approving decisions — JSON becomes the wrong abstraction. You end up building custom UI for every schema, managing state across frontend and backend, and bolting on audit trails as an afterthought.
MDMA takes a different approach: let the LLM write what it's best at (Markdown), extend it with interactive components (YAML blocks), and handle everything else — rendering, validation, state management, audit logging — in the runtime.
The result: structured output that humans can actually use, without the JSON tax.
