Structured LLM Output Without JSON Schemas — A Different Approach

Introduction

Every developer integrating LLMs hits the same wall: the model returns plain text, and you need structured data. The industry converged on JSON Schema, Pydantic models, and function calling to solve this. But what if forcing JSON on LLMs is the wrong abstraction entirely?

This article introduces a different paradigm — one where structured output is not just machine-parseable, but also human-renderable. Where the LLM doesn't return a JSON blob you have to build UI for, but a document that is the UI.

The JSON Schema Era: How We Got Here

From Regex Parsing to Pydantic — A Brief History

In 2022, getting structured data from an LLM meant prompt engineering and prayer. You'd write "respond in JSON format" and hope for the best. When the model inevitably returned Sure! Here's the JSON: before the actual payload, you'd write regex to extract it.
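That workaround can be sketched in a few lines — a hypothetical `extractJson` helper (illustrative, not from any real library) that hunts for the first parseable object in chatty model output:

```typescript
// A sketch of the 2022-era workaround: pull the first JSON object out of
// chatty LLM output. `extractJson` is a hypothetical helper, not a real API.
function extractJson(raw: string): unknown {
  const start = raw.indexOf('{');
  if (start === -1) return null;
  // Try progressively shorter candidates ending in '}' until one parses.
  for (let end = raw.length; end > start; end--) {
    if (raw[end - 1] !== '}') continue;
    try {
      return JSON.parse(raw.slice(start, end));
    } catch {
      // Not valid JSON yet -- keep shrinking the candidate window.
    }
  }
  return null;
}

// "Sure! Here's the JSON:" prefixes are exactly what this guards against.
const reply = `Sure! Here's the JSON:\n{"status": "open", "priority": 2}`;
console.log(extractJson(reply)); // { status: 'open', priority: 2 }
```

Fragile, slow on long outputs, and silently wrong on truncated JSON — which is why the tooling wave below happened.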

Then came the tooling wave:

  • Instructor (2023) wrapped Pydantic models around LLM calls, giving you type-safe extraction with automatic retries when validation failed.
  • Outlines and Guidance (2023-2024) took it further with constrained decoding — forcing the model's token generation to follow a grammar, guaranteeing valid JSON at the token level.
  • OpenAI Structured Outputs (2024) made it a first-class API feature. Define a JSON Schema, get conforming output. No parsing, no retries.

By 2025, the ecosystem settled: if you want structured data from an LLM, you define a schema and the model fills it in. Problem solved.

Or is it?

Why Every LLM Provider Shipped JSON Mode

The answer is simple: developers asked for it. When you're building a pipeline — LLM output feeds into a database, triggers a function, or populates a UI — you need predictable structure. JSON is universal. Every language parses it. Every database stores it.

But this framing treats LLM output as data transfer — a serialization problem between the model and your backend. That's one valid use case. It's not the only one.

The Hidden Costs of Forcing JSON on LLMs

34% More Tokens — The JSON Tax You're Already Paying

JSON is verbose by design. Curly braces, quoted keys, colons, commas — all structural tokens that carry no semantic content. Research and benchmarks consistently show that Markdown is 34-38% more token-efficient than JSON for equivalent data.

Here's the same information in both formats:

JSON (87 tokens):

{
  "patient": {
    "name": "Jane Doe",
    "date_of_birth": "1990-03-15",
    "chief_complaint": "Persistent headache for 3 days",
    "vitals": {
      "blood_pressure": "120/80",
      "heart_rate": 72,
      "temperature": 98.6
    },
    "assessment": "Tension-type headache. No red flags.",
    "plan": "OTC analgesics, follow up in 1 week if no improvement"
  }
}

Markdown with MDMA (fewer tokens, and the user can actually read it):

# Patient Intake

```mdma
id: patient-intake
type: form
fields:
  - name: patient_name
    type: text
    label: Patient Name
    required: true
    sensitive: true
  - name: dob
    type: date
    label: Date of Birth
    sensitive: true
  - name: chief_complaint
    type: textarea
    label: Chief Complaint
    required: true
  - name: assessment
    type: textarea
    label: Assessment
  - name: plan
    type: textarea
    label: Plan
onSubmit: submit-intake
```

The JSON version gives you data. The MDMA version gives you data and an interactive form the user fills out — without writing a single line of frontend code.

At scale, the token difference compounds. If your application makes 10,000 LLM calls per day, switching from JSON to Markdown-based structured output can cut your token costs by a third.
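A back-of-envelope sketch of that compounding effect. Every number here — tokens per call, the savings rate, the per-token price — is an illustrative assumption, not a measurement:

```typescript
// Back-of-envelope estimate of the "JSON tax" at scale. Every number here
// (tokens per call, price per token) is an illustrative assumption.
const callsPerDay = 10_000;
const jsonTokensPerCall = 500;       // typical structured response, assumed
const markdownSavings = 0.34;        // low end of the 34-38% range cited above
const pricePerMillionTokens = 10;    // USD, hypothetical output-token price

const tokensSavedDaily = callsPerDay * jsonTokensPerCall * markdownSavings;
const dollarsSavedDaily = (tokensSavedDaily / 1_000_000) * pricePerMillionTokens;

console.log(`~${Math.round(tokensSavedDaily).toLocaleString('en-US')} tokens saved per day`);
console.log(`~$${dollarsSavedDaily.toFixed(2)} saved per day`);
```

Under these assumptions, that is roughly 1.7M output tokens per day that do nothing but satisfy a serialization format.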

10-15% Reasoning Degradation in JSON Mode

This one is less discussed but well-documented. When you force an LLM into JSON mode, you're constraining its generation process. The model can no longer "think out loud" — it must produce valid JSON from the first token.

Multiple benchmarks show 10-15% performance degradation on reasoning tasks when using constrained JSON output compared to free-form generation. The model is spending capacity on structural compliance instead of problem-solving.

With Markdown, the LLM writes in the format it was trained on. Markdown is the lingua franca of the internet — it is everywhere in the training data. The model doesn't fight the format; it flows with it.

Schema Boilerplate That Scales with Complexity

A simple JSON Schema for a contact form is manageable. A schema for a multi-step KYC workflow with conditional fields, nested objects, and validation rules? You're looking at hundreds of lines of schema definition before writing any application logic.

Here's what a KYC form schema looks like in Pydantic:

from datetime import date
from typing import Literal

from pydantic import BaseModel, EmailStr, Field

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str = Field(pattern=r'^\d{5}(-\d{4})?$')

class KYCApplication(BaseModel):
    full_name: str = Field(min_length=2, max_length=100)
    date_of_birth: date
    ssn: str = Field(pattern=r'^\d{3}-\d{2}-\d{4}$')
    email: EmailStr
    phone: str
    address: Address
    employment_status: Literal['employed', 'self-employed', 'unemployed', 'retired']
    annual_income: float = Field(ge=0)
    source_of_funds: str
    is_pep: bool  # Politically Exposed Person
    documents_provided: list[str]

Then you need a separate UI component to render this form, validation logic in the frontend, and mapping between the schema and the rendered fields.

With MDMA, the schema, the UI definition, and the validation rules are one thing:

```mdma
id: kyc-form
type: form
fields:
  - name: full_name
    type: text
    label: Full Legal Name
    required: true
    sensitive: true
  - name: date_of_birth
    type: date
    label: Date of Birth
    required: true
    sensitive: true
  - name: ssn
    type: text
    label: Social Security Number
    required: true
    sensitive: true
    validation:
      pattern: "^\\d{3}-\\d{2}-\\d{4}$"
      message: "Format: XXX-XX-XXXX"
  - name: email
    type: email
    label: Email Address
    required: true
    sensitive: true
  - name: employment_status
    type: select
    label: Employment Status
    required: true
    options:
      - { label: Employed, value: employed }
      - { label: Self-Employed, value: self-employed }
      - { label: Unemployed, value: unemployed }
      - { label: Retired, value: retired }
  - name: annual_income
    type: number
    label: Annual Income
    required: true
    validation:
      min: 0
  - name: source_of_funds
    type: textarea
    label: Source of Funds
    required: true
  - name: is_pep
    type: checkbox
    label: Politically Exposed Person (PEP)
onSubmit: submit-kyc
```

One definition. The LLM generates it. The renderer displays it. The runtime validates and collects the data. No Pydantic model. No separate frontend component. No mapping layer.
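To make the "one definition" claim concrete, here is a hypothetical sketch of how a runtime could validate a submitted value against a field definition like the SSN field above. `FieldDef` and `validateField` are illustrative names, not the actual mdma-runtime API:

```typescript
// Hypothetical sketch of runtime-side field validation for definitions like
// the SSN field above. Mirrors the YAML shape; not the real
// @mobile-reality/mdma-runtime API.
interface FieldDef {
  name: string;
  required?: boolean;
  validation?: { pattern?: string; message?: string; min?: number };
}

function validateField(field: FieldDef, value: string): string | null {
  if (field.required && value.trim() === '') {
    return `${field.name} is required`;
  }
  const v = field.validation;
  if (v?.pattern && !new RegExp(v.pattern).test(value)) {
    return v.message ?? `${field.name} is invalid`;
  }
  if (v?.min !== undefined && Number(value) < v.min) {
    return `${field.name} must be at least ${v.min}`;
  }
  return null; // null means the value passed
}

const ssn: FieldDef = {
  name: 'ssn',
  required: true,
  validation: { pattern: '^\\d{3}-\\d{2}-\\d{4}$', message: 'Format: XXX-XX-XXXX' },
};

console.log(validateField(ssn, '123-45-6789')); // null -- valid
console.log(validateField(ssn, '123456789'));   // "Format: XXX-XX-XXXX"
```

The point is that the validation rules live in the document the LLM wrote, so one generic function covers every form it will ever generate.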

Function Calling vs Structured Output vs Generative UI — When to Use Which

The industry treats these as competing approaches. They're not — they solve different problems.

Function Calling: Great for Actions, Wrong for Data Collection

Function calling (tool use) excels when you want the LLM to do something: search a database, call an API, send a message. The model decides which function to invoke and with what parameters.

But function calling is awkward for collecting data from users. The model calls a `collect_user_info` function, your backend receives JSON, and then... you still need to build UI to display and edit that data. Function calling is a model-to-system interface, not a model-to-user interface.

JSON Structured Output: Reliable but Invisible to Users

Structured output via JSON Schema gives you validated, typed data. It's the right choice when the output feeds directly into a pipeline — no human in the loop.

But the moment a human needs to see, review, or modify that output, you're building custom UI. Every new schema means a new form component. Every schema change means a frontend update.

Generative UI: The Missing Third Option

What if the LLM's output was the interface?

This is the generative UI paradigm: instead of the model returning data that you render, the model returns a renderable document that collects, displays, and processes data.

MDMA implements this with extended Markdown. The LLM writes standard Markdown (headings, paragraphs, lists) interspersed with YAML-defined interactive components. A single renderer handles every document. No per-schema UI work.

Use function calling when the LLM needs to take action on behalf of the user. Use JSON structured output when the data flows into a machine-only pipeline. Use generative UI (MDMA) when a human needs to interact with the output.

What If the Structured Output Was Also the UI?

The False Choice Between Machine-Parseable and Human-Readable

The current paradigm forces a choice: either the LLM returns data (JSON, machine-parseable, invisible to users) or text (Markdown, human-readable, unstructured). You pick one and build infrastructure for the other.

MDMA rejects this trade-off. An MDMA document is:

  • Human-readable: it's Markdown. Headings, paragraphs, and lists render as expected.
  • Machine-parseable: `mdma` blocks parse into typed AST nodes with known schemas.
  • Interactive: forms collect input, buttons trigger actions, approval gates enforce workflows.
  • Auditable: every interaction is logged with tamper-evident hash chaining.

The same document serves the user, the system, and compliance — without conversion layers.

Markdown as a Structured Format — Why LLMs Already Prefer It

LLMs don't "prefer" JSON. They were trained on the internet, and the internet runs on Markdown. README files, documentation, forum posts, chat messages — all Markdown or Markdown-adjacent.

When you ask an LLM to generate Markdown, you're asking it to work in its native medium. When you ask it to generate JSON, you're asking it to switch to a format optimized for machines, not for language models.

MDMA leans into this. The LLM writes Markdown as usual, and when it needs to express structure — a form, a table, a workflow gate — it drops into an `mdma` YAML block. The transition is natural:

Based on your description, this sounds like a P2 incident.
Let me set up the triage workflow.

## Incident Assessment

```mdma
id: incident-form
type: form
fields:
  - name: incident_title
    type: text
    label: Incident Title
    required: true
  - name: severity
    type: select
    label: Severity Level
    required: true
    options:
      - { label: "P1 - Critical", value: P1 }
      - { label: "P2 - High", value: P2 }
      - { label: "P3 - Medium", value: P3 }
      - { label: "P4 - Low", value: P4 }
  - name: affected_systems
    type: text
    label: Affected Systems
    required: true
  - name: customer_impact
    type: textarea
    label: Customer Impact
    required: true
onSubmit: submit-incident
```

Once submitted, I'll route this to the on-call team and
create the response checklist.

Natural language and structured components coexist in one document. The model doesn't need to choose between explaining context and collecting data — it does both.

Extended Markdown: Forms, Tables, and Approval Gates from LLM Output

How MDMA Extends Markdown with Interactive Components

MDMA (Markdown Document with Mounted Applications) adds nine component types to standard Markdown via fenced code blocks with the mdma language tag:

| Component | Purpose | Example Use Case |
| --- | --- | --- |
| `form` | Structured data collection | Patient intake, bug reports, KYC |
| `table` | Tabular data with sorting/filtering | Search results, audit logs |
| `approval-gate` | Multi-step approval workflow | Manager sign-off, compliance review |
| `tasklist` | Checklist with completion tracking | Pre-deploy checklist, triage steps |
| `button` | Clickable action with confirmation | Notify Slack, deploy to production |
| `callout` | Highlighted message (info/warning/error) | SLA warnings, compliance notices |
| `chart` | Data visualization | Revenue trends, error rates |
| `webhook` | HTTP request trigger | Slack notifications, API calls |
| `thinking` | Collapsed AI reasoning | Debug transparency |

Components communicate through bindings — {{field_name}} expressions that resolve at runtime. When a user fills a form field, bound components update automatically.
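The substitution step of that binding mechanism can be sketched in a few lines — a minimal `resolveBindings` helper, not the real reactive implementation:

```typescript
// Minimal sketch of {{field_name}} binding resolution. The real MDMA runtime
// resolves bindings reactively; this shows only the substitution step.
function resolveBindings(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    name in values ? values[name] : match, // leave unknown bindings untouched
  );
}

const description = 'Approve deployment of {{change_type}} change with {{risk_level}} risk.';
const formState = { change_type: 'database', risk_level: 'high' };

console.log(resolveBindings(description, formState));
// "Approve deployment of database change with high risk."
```

Leaving unresolved bindings in place (rather than erasing them) is a deliberate choice here: the UI can then show which fields still await input.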

Code Example: From JSON Schema to a Single Prompt

The JSON Schema approach requires three artifacts:

  1. Schema definition (Pydantic/JSON Schema)
  2. LLM integration (function calling or structured output)
  3. Frontend component (React form mapped to schema fields)

The MDMA approach requires one:

import { buildSystemPrompt } from '@mobile-reality/mdma-prompt-pack';

const systemPrompt = buildSystemPrompt({
  customPrompt: `You are a customer support assistant. When a user
  reports an issue, generate an MDMA form to collect structured
  details, followed by a tasklist for resolution steps.`,
});

// Send to any LLM — OpenAI, Claude, Gemini, local models
const response = await llm.chat({
  system: systemPrompt,
  messages: conversation,
});

// The response contains Markdown with ```mdma blocks
// Parse it into an interactive document
import { unified } from 'unified';
import remarkParse from 'remark-parse';
import { remarkMdma } from '@mobile-reality/mdma-parser';

const processor = unified().use(remarkParse).use(remarkMdma);
const ast = await processor.run(processor.parse(response));

The LLM decides what fields to include based on the conversation. No predefined schema. The form structure emerges from context.

What the User Actually Sees vs What the System Parses

When the LLM generates this response:

I've reviewed your request. Here's the change management form:

```mdma
id: change-request
type: form
fields:
  - name: change_title
    type: text
    label: Change Title
    required: true
  - name: risk_level
    type: select
    label: Risk Level
    options:
      - { label: Low, value: low }
      - { label: Medium, value: medium }
      - { label: High, value: high }
  - name: rollback_plan
    type: textarea
    label: Rollback Plan
    required: true
onSubmit: submit-change
```

```mdma
id: tech-lead-approval
type: approval-gate
title: Tech Lead Approval
requiredApprovers: 1
allowedRoles:
  - tech-lead
  - engineering-manager
onApprove: proceed-to-deploy
onDeny: return-to-author
requireReason: true
```

The user sees: a paragraph of text, a rendered form with input fields and dropdowns, and an approval gate with Approve/Deny buttons.

The system sees: typed AST nodes, validated component schemas, dispatchable actions, and a binding map connecting the form to the approval gate.

The audit log sees: every field change, every button click, every approval decision — timestamped, actor-tagged, and hash-chained.

One response. Three consumers. Zero custom UI code.
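The hash-chaining idea behind that audit log can be sketched as follows — a minimal illustration of tamper evidence, not the actual MDMA event-log implementation:

```typescript
import { createHash } from 'node:crypto';

// Sketch of tamper-evident hash chaining for an audit log, in the spirit of
// the event log described above (not the actual MDMA implementation).
interface AuditEvent { actor: string; action: string; at: string; }
interface ChainedEvent extends AuditEvent { prevHash: string; hash: string; }

function appendEvent(log: ChainedEvent[], event: AuditEvent): ChainedEvent[] {
  const prevHash = log.length ? log[log.length - 1].hash : 'GENESIS';
  const hash = createHash('sha256')
    .update(prevHash + JSON.stringify(event))
    .digest('hex');
  return [...log, { ...event, prevHash, hash }];
}

function verifyChain(log: ChainedEvent[]): boolean {
  return log.every((e, i) => {
    const expectedPrev = i === 0 ? 'GENESIS' : log[i - 1].hash;
    const { prevHash, hash, ...event } = e;
    const recomputed = createHash('sha256')
      .update(expectedPrev + JSON.stringify(event))
      .digest('hex');
    return prevHash === expectedPrev && hash === recomputed;
  });
}

let log: ChainedEvent[] = [];
log = appendEvent(log, { actor: 'alice', action: 'form.submit', at: '2025-01-01T10:00Z' });
log = appendEvent(log, { actor: 'bob', action: 'gate.approve', at: '2025-01-01T10:05Z' });
console.log(verifyChain(log)); // true

log[0].actor = 'mallory'; // tamper with history...
console.log(verifyChain(log)); // false -- the chain detects it
```

Because each hash covers the previous one, editing any historical entry breaks every link after it — which is what makes the log tamper-evident rather than merely append-only.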

MDMA vs Instructor vs Outlines vs Guidance

Feature Comparison at a Glance

| Feature | Instructor | Outlines | Guidance | MDMA |
| --- | --- | --- | --- | --- |
| Output format | JSON (Pydantic) | JSON/regex/grammar | JSON/structured text | Markdown + YAML components |
| Schema definition | Python classes | JSON Schema/regex | Template programs | YAML in Markdown |
| Requires custom UI | Yes | Yes | Yes | No — renderer included |
| Human-readable output | No (raw JSON) | No | Partially | Yes — it's Markdown |
| Approval workflows | Not built-in | Not built-in | Not built-in | First-class approval-gate |
| Audit trail | Not built-in | Not built-in | Not built-in | Hash-chained event log |
| PII handling | Not built-in | Not built-in | Not built-in | Automatic detection & redaction |
| Token efficiency | JSON overhead | JSON overhead | Varies | 34-38% fewer tokens |
| LLM compatibility | OpenAI, Anthropic, others | Local models (HF) | Local models | Any LLM (prompt-based) |
| Validation | Pydantic validators | Grammar constraints | Template constraints | YAML schema + field validation |

When You Still Need JSON (and When You Don't)

Use Instructor/Outlines/Guidance when:

  • Output feeds directly into a machine pipeline (no human interaction)
  • You need guaranteed JSON for database insertion or API calls
  • You're doing bulk extraction from documents (NER, classification)
  • Latency is critical and you need constrained decoding

Use MDMA when:

  • A human needs to see, review, or interact with the output
  • You need approval workflows or multi-step processes
  • Compliance requires audit trails (HIPAA, SOX, MiFID)
  • You want the LLM to dynamically decide what data to collect
  • You're building a chatbot that needs to do more than return text

The key distinction: Instructor and friends solve LLM → machine communication. MDMA solves LLM → human → machine communication.

Building Human-in-the-Loop Workflows Without a Separate Frontend

Approval Gates as First-Class Output

Most human-in-the-loop implementations follow this pattern:

  1. LLM produces JSON
  2. Backend stores the JSON
  3. Frontend renders a custom approval UI
  4. User approves/denies
  5. Backend processes the decision

That's five steps across three systems. With MDMA, it's one:

The deployment package is ready for review.

```mdma
id: deploy-approval
type: approval-gate
title: Production Deployment Approval
description: >
  Deploying v2.4.1 to production. Changes include
  the new payment processing module and updated rate limits.
requiredApprovers: 2
allowedRoles:
  - tech-lead
  - engineering-manager
  - director
onApprove: execute-deployment
onDeny: return-to-author
requireReason: true
```

The LLM generates the approval gate. The renderer displays it. The runtime enforces role-based access, collects approvals, and fires the appropriate action. The event log records who approved, when, and why — with hash-chained integrity.

No custom UI. No separate approval system. No integration work.
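The enforcement logic such a runtime needs is small. Here is a hypothetical sketch of the `requiredApprovers` / `allowedRoles` rules from the gate above — illustrative code, not the mdma-runtime source:

```typescript
// Hypothetical sketch of approval-gate enforcement. Field names follow the
// YAML above; the logic is illustrative, not the mdma-runtime source.
interface GateDef { requiredApprovers: number; allowedRoles: string[]; }
interface Approval { userId: string; role: string; reason: string; }

function gateStatus(gate: GateDef, approvals: Approval[]): 'open' | 'approved' {
  const valid = approvals.filter(
    (a, i) =>
      gate.allowedRoles.includes(a.role) &&
      // one vote per user: keep only the first approval from each userId
      approvals.findIndex((b) => b.userId === a.userId) === i,
  );
  return valid.length >= gate.requiredApprovers ? 'approved' : 'open';
}

const gate: GateDef = { requiredApprovers: 2, allowedRoles: ['tech-lead', 'director'] };

const one = [{ userId: 'u1', role: 'tech-lead', reason: 'LGTM' }];
console.log(gateStatus(gate, one)); // "open" -- needs a second approver

const two = [...one, { userId: 'u2', role: 'director', reason: 'Reviewed rollback plan' }];
console.log(gateStatus(gate, two)); // "approved"
```

Approvals from roles outside `allowedRoles` simply don't count toward the quorum, which is the behavior the gate definition promises.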

Multi-Step Forms Driven by Conversation Context

MDMA components communicate through bindings, enabling multi-step workflows within a single document:

## Step 1: Risk Assessment

```mdma
id: risk-form
type: form
fields:
  - name: change_type
    type: select
    label: Change Type
    required: true
    options:
      - { label: Infrastructure, value: infrastructure }
      - { label: Application, value: application }
      - { label: Database, value: database }
  - name: risk_level
    type: select
    label: Risk Level
    required: true
    options:
      - { label: Low, value: low }
      - { label: Medium, value: medium }
      - { label: High, value: high }
onSubmit: assess-risk
```

## Step 2: Pre-deployment Checklist

```mdma
id: pre-deploy-checklist
type: tasklist
items:
  - id: tests-pass
    text: All tests pass in CI
    required: true
  - id: code-review
    text: Code reviewed by 2 engineers
    required: true
  - id: staging-verified
    text: Changes verified on staging
    required: true
  - id: rollback-ready
    text: Rollback plan documented
    required: true
onComplete: checklist-done
```

## Step 3: Approval

```mdma
id: change-approval
type: approval-gate
title: Change Approval
description: >
  Approve deployment of {{change_type}} change
  with {{risk_level}} risk level.
requiredApprovers: 1
allowedRoles:
  - tech-lead
  - engineering-manager
onApprove: execute-change
onDeny: reject-change
requireReason: true
```

Notice: {{change_type}} and {{risk_level}} in the approval gate description are bindings to the form fields above. When the user fills the form, the approval gate description updates in real time.

The entire workflow — assessment, checklist, approval — lives in one document generated by one LLM response. No orchestration service. No state machine definition. No workflow engine configuration.

Getting Started with MDMA

Installation and First Render

npm install @mobile-reality/mdma-parser \
            @mobile-reality/mdma-runtime \
            @mobile-reality/mdma-renderer-react \
            @mobile-reality/mdma-prompt-pack

Parse and render an MDMA document in under 20 lines:

import { unified } from 'unified';
import remarkParse from 'remark-parse';
import { remarkMdma } from '@mobile-reality/mdma-parser';
import { createDocumentStore } from '@mobile-reality/mdma-runtime';
import { MdmaProvider, MdmaDocument } from '@mobile-reality/mdma-renderer-react';

// Parse
const processor = unified().use(remarkParse).use(remarkMdma);
const ast = await processor.run(processor.parse(markdownString));

// Create runtime
const store = createDocumentStore(ast, {
  documentId: 'my-doc',
  sessionId: crypto.randomUUID(),
  environment: 'production',
});

// Render
function App() {
  return (
    <MdmaProvider store={store} ast={ast}>
      <MdmaDocument />
    </MdmaProvider>
  );
}

Plugging into Your Existing LLM Pipeline

MDMA works with any LLM. Use the prompt pack to give the model MDMA authoring capabilities:

import { buildSystemPrompt } from '@mobile-reality/mdma-prompt-pack';

const systemPrompt = buildSystemPrompt({
  customPrompt: `You are an HR onboarding assistant. When a new
  employee needs to be onboarded, generate MDMA forms to collect
  their information, a tasklist for IT setup, and an approval
  gate for their manager.`,
});

// Works with OpenAI
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: 'New hire: Senior Engineer starting March 1st' },
  ],
});

// Works with Anthropic
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  system: systemPrompt,
  messages: [
    { role: 'user', content: 'New hire: Senior Engineer starting March 1st' },
  ],
});

// Parse the response — same code regardless of LLM provider
const ast = await processor.run(processor.parse(response.content));
const store = createDocumentStore(ast, { /* config */ });

The LLM generates the document structure dynamically based on conversation context. Different conversations produce different forms, different checklists, different approval flows — all handled by the same renderer.

Conclusion

JSON Schema solved the structured LLM output problem for machine-to-machine communication. But the moment a human enters the loop — reviewing data, filling forms, approving decisions — JSON becomes the wrong abstraction. You end up building custom UI for every schema, managing state across frontend and backend, and bolting on audit trails as an afterthought.

MDMA takes a different approach: let the LLM write what it's best at (Markdown), extend it with interactive components (YAML blocks), and handle everything else — rendering, validation, state management, audit logging — in the runtime.

The result: structured output that humans can actually use, without the JSON tax.


Matt Sadowski

CEO of Mobile Reality
