

Getting Started with Markdown for AI Agents: Build Your First MDMA Agent in 5 Minutes

*Figure: interactive AI agent deployment workflow using markdown for AI agents, with MDMA code and an approval interface.*

Introduction

Markdown for AI agents is replacing custom parsing pipelines as the standard way to connect large language models with actionable interfaces. What if your agents could bypass intent analysis and document conversion entirely, receiving structured content directly in a format both humans and machines can read? That's the shift happening right now — and it eliminates the fragile parsing logic that plagues production AI systems.

At Mobile Reality, we've tackled this across 75+ production AI deployments. We built MDMA (Markdown Document with Mounted Applications) to extend standard text markdown with interactive components — forms, tables, approval gates — allowing LLMs to generate validated UI through an API-first architecture where every document passes strict Zod schema validation before rendering. You can read more about our approach in our guide to building AI agents.

Markdown for agents solidified as a standard protocol in 2026, coinciding with Anthropic's structured outputs GA release in late Jan 2026 (Feb rollout across platforms). This guarantees that API responses conform to schemas, making data streams and files predictable — eliminating the cascading failures that come from parsing unstructured text. You can select any compatible TypeScript or Python client to consume these API responses.

In this tutorial, you'll build a mortgage pre-approval agent that outputs interactive forms using MDMA. We'll cover setup, file structure, and deployment — giving you production-ready foundations faster than traditional development paths. You don't need to design custom frontends or copy boilerplate between projects.

Why Markdown for Agents Transforms How AI Systems Process Content

Raw HTML bloats agent context windows with presentation noise that models must filter before processing semantic content. You don't want to pay for wasted tokens. According to Cloudflare's research, converting HTML to structured markdown reduces token costs by approximately 80%, shrinking a 16,180-token web page to just 3,150 tokens. This means your agents receive clean, parseable content without navigating DOM hierarchies.

The quality impact is equally significant. Analysis of LLM output formats shows GPT-4 scores 81.2% on reasoning tasks with Markdown prompts versus 73.9% with JSON — a 7.3-point gap. When you force a model to simultaneously reason about a problem and conform to a rigid schema, both suffer. And since output tokens cost 3-10x more than input tokens across major providers, format choice directly affects your bill.
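
The headline figures are easy to verify from the numbers cited above:

```typescript
// Token counts cited above for the same web page in each format.
const htmlTokens = 16_180;
const markdownTokens = 3_150;

// Relative reduction when serving Markdown instead of HTML.
const reduction = 1 - markdownTokens / htmlTokens;
console.log(`${(reduction * 100).toFixed(1)}% fewer tokens`); // 80.5% fewer tokens

// Reasoning-accuracy gap between Markdown and JSON prompts (GPT-4).
const reasoningGap = 81.2 - 73.9;
console.log(`${reasoningGap.toFixed(1)}-point gap`); // 7.3-point gap
```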

| Format | Simple Element | Complex Document | Reasoning Accuracy |
| --- | --- | --- | --- |
| HTML | 12-15 tokens | 16,180 tokens | Baseline |
| JSON | 8-10 tokens | ~8,000 tokens | 73.9% (GPT-4) |
| Markdown | 3 tokens | 3,150 tokens | 81.2% (GPT-4) |

Modern web infrastructure now supports this natively through HTTP header negotiation. Cloudflare's Markdown for Agents enables automatic conversion via Accept: text/markdown, allowing AI crawlers and agents to request simplified content directly without downloading the full file. Response metadata includes x-markdown-tokens header counts and content-signal classifications, giving your API consumers precise visibility into costs. You don't need to maintain separate content pipelines for different client platforms.
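
From the client side, that negotiation is just a header on the request. A sketch, where `buildAgentRequest` is a hypothetical helper rather than part of any SDK:

```typescript
// Build a request that asks an origin for the Markdown variant of a page.
// buildAgentRequest is a hypothetical helper, not a Cloudflare or MDMA API.
function buildAgentRequest(url: string): { url: string; headers: Record<string, string> } {
  return {
    url,
    headers: {
      // Signals that this client prefers simplified Markdown over full HTML.
      Accept: 'text/markdown',
    },
  };
}

const req = buildAgentRequest('https://example.com/pricing');
// An agent would then issue fetch(req.url, { headers: req.headers }) and read
// the x-markdown-tokens response header to see what the content costs.
```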

MDMA extends these efficiency gains into interactive web workflows. By embedding YAML component definitions within standard Markdown, we generate validated forms and approval gates without per-feature API development. Your agents output structured documents that render immediately, and rendering is deterministic; you don't build custom UI for each workflow. The open source TypeScript tooling handles the entire path from LLM output to interactive app.

What MDMA Adds to Standard Markdown

Standard Markdown gives you headings, paragraphs, lists, and code blocks. MDMA extends it with nine interactive component types defined in YAML inside fenced code blocks. The model writes natural prose, then drops in a component definition wherever structured interaction is needed.

Here's a minimal example — a contact form embedded in a Markdown document:

````markdown
# Contact Us

Fill out the form below and we'll get back to you within 24 hours.

```mdma
id: contact-form
type: form
fields:
  - name: full_name
    type: text
    label: Full Name
    required: true
  - name: email
    type: email
    label: Email Address
    required: true
    sensitive: true
  - name: message
    type: textarea
    label: Your Message
    required: true
onSubmit: submit-contact
```
````

The sensitive: true flag on the email field triggers automatic PII redaction in the audit log — the runtime never stores that value in plain text. This is handled by the framework, not by your application code.

MDMA ships nine component types that cover the interaction patterns we kept hitting across production projects. Each type captures structured information and renders content that web applications can select and process:

| Component | What It Does |
| --- | --- |
| Form | Multi-field data collection with typed fields, validation, and PII flags |
| Button | Action trigger with confirmation dialog (primary, secondary, danger) |
| Tasklist | Interactive checklist with per-item completion state |
| Table | Sortable, filterable data display with column types |
| Chart | Data visualization — renders as table by default, override with Recharts |
| Callout | Alert banners (info, warning, error, success) |
| Approval Gate | Workflow blocker requiring N approvers with role restrictions |
| Webhook | HTTP trigger with retries, timeout, and environment-based policy |
| Thinking | Collapsible AI reasoning block for chain-of-thought transparency |
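
Under the hood, the first step is pulling these fenced component definitions out of the document. A dependency-free sketch of that extraction (the real @mobile-reality/mdma-parser builds a typed AST via remark rather than returning raw strings):

```typescript
// Extract the raw YAML bodies of mdma-fenced blocks from a Markdown string.
// Simplified stand-in for @mobile-reality/mdma-parser, which produces a
// typed, validated AST instead of raw strings.
function extractMdmaBlocks(markdown: string): string[] {
  const blocks: string[] = [];
  const fence = /```mdma\n([\s\S]*?)```/g;
  let match: RegExpExecArray | null;
  while ((match = fence.exec(markdown)) !== null) {
    blocks.push(match[1].trimEnd());
  }
  return blocks;
}
```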

Building a Mortgage Pre-Approval Agent

Mortgage workflows are a good test case because they combine everything that makes plain chat insufficient: structured data collection, sensitive fields, multi-step validation, and mandatory approval gates. Here's how to build one from scratch.

Step 1: Install the Packages

```bash
pnpm add @mobile-reality/mdma-parser @mobile-reality/mdma-runtime \
  @mobile-reality/mdma-renderer-react @mobile-reality/mdma-prompt-pack
```

You need four packages: the parser transforms Markdown files into a typed AST, the runtime manages state and events, the React renderer displays components, and the prompt-pack teaches your LLM the MDMA syntax. As a developer, you don't need to configure MCP servers or write custom API handlers — the tool handles content parsing and validation out of the box.

Step 2: Configure the System Prompt

The prompt-pack includes a 327-line system prompt that teaches any LLM the exact MDMA syntax — all nine component types, binding expressions, validation rules, and a self-check checklist the model runs before finalizing output.

```typescript
import { buildSystemPrompt } from '@mobile-reality/mdma-prompt-pack';

const systemPrompt = buildSystemPrompt({
  customPrompt: `You are a mortgage pre-approval assistant.
When a user requests a loan assessment, generate:
1. A form collecting loan amount, term, income, and credit score
2. Mark income and SSN fields as sensitive: true
3. An approval gate requiring a senior underwriter sign-off
4. A callout summarizing the risk assessment`
});
```

The function always includes the full MDMA spec regardless of your custom instructions. This prevents the model from "forgetting" the component format in long conversations.
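
The composition pattern itself is simple: the full spec comes first and custom instructions are appended. The sketch below is illustrative; `MDMA_SPEC` stands in for the real ~327-line spec shipped by the prompt-pack:

```typescript
// Illustrative sketch of the spec-plus-custom-prompt composition pattern.
// MDMA_SPEC is a placeholder; the real prompt-pack ships the full spec.
const MDMA_SPEC =
  '## MDMA component syntax\n(component types, binding expressions, validation rules, self-check...)';

function buildSystemPromptSketch(options: { customPrompt?: string }): string {
  // The full spec is always included, so long conversations cannot drift
  // away from the component format.
  const parts = [MDMA_SPEC];
  if (options.customPrompt) parts.push(options.customPrompt);
  return parts.join('\n\n');
}
```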

Step 3: Parse and Render the LLM Response

When the model responds with Markdown containing MDMA blocks, parse it into an AST and create a reactive document store:

```typescript
import { unified } from 'unified';
import remarkParse from 'remark-parse';
import { remarkMdma } from '@mobile-reality/mdma-parser';
import { createDocumentStore } from '@mobile-reality/mdma-runtime';
import type { MdmaRoot } from '@mobile-reality/mdma-spec';

const processor = unified().use(remarkParse).use(remarkMdma);

async function parseResponse(markdown: string) {
  const tree = processor.parse(markdown);
  const ast = (await processor.run(tree)) as MdmaRoot;

  const store = createDocumentStore(ast, {
    documentId: 'mortgage-assessment',
    sessionId: crypto.randomUUID(),
    environment: 'production',
  });

  // Listen to all events (field changes, approvals, etc.)
  store.getEventBus().onAny((action) => {
    console.log('Action:', action.type, action.componentId);
  });

  return { ast, store };
}
```

Step 4: Render in React

```tsx
import { MdmaDocument } from '@mobile-reality/mdma-renderer-react';
import '@mobile-reality/mdma-renderer-react/styles.css';

function MortgageAssessment({ ast, store }) {
  return <MdmaDocument ast={ast} store={store} />;
}
```

That's it. The renderer handles all nine component types, applies default styles, and wires up state management automatically. Every field change, approval decision, and button click flows through the store and gets logged. You don't write client-side event handlers; the renderer wires them up for you.

Step 5: React to User Actions

The store dispatches typed actions you can subscribe to:

```typescript
// User fills in loan amount
store.dispatch({
  type: 'FIELD_CHANGED',
  componentId: 'mortgage-form',
  field: 'loan_amount',
  value: '350000',
});

// Underwriter approves the application
store.dispatch({
  type: 'APPROVAL_GRANTED',
  componentId: 'underwriter-approval',
  actor: { id: 'user-123', role: 'senior-underwriter' },
});

// Check component state
const formState = store.getComponentState('mortgage-form');
console.log(formState?.values.loan_amount); // '350000'
console.log(formState?.touched); // true
```

Every dispatched action is recorded in the event log with timestamps, actor IDs, and — if the field is marked sensitive — automatic PII redaction.

What the LLM Actually Generates

When you send the mortgage system prompt to GPT-4o or Claude and a user asks "I want to apply for a $350,000 mortgage," the model responds with something like this:

````markdown
# Mortgage Pre-Approval Assessment

Based on your request, here's the pre-approval form. Please provide
the following information for evaluation.

```mdma
id: mortgage-form
type: form
fields:
  - name: loan_amount
    type: number
    label: Requested Loan Amount ($)
    required: true
    default: 350000
  - name: loan_term
    type: select
    label: Loan Term
    required: true
    options:
      - { label: "15 Years", value: 15 }
      - { label: "30 Years", value: 30 }
  - name: annual_income
    type: number
    label: Annual Gross Income ($)
    required: true
    sensitive: true
  - name: ssn
    type: text
    label: Social Security Number
    required: true
    sensitive: true
  - name: credit_score
    type: select
    label: Credit Score Range
    options:
      - { label: "Excellent (750+)", value: excellent }
      - { label: "Good (700-749)", value: good }
      - { label: "Fair (650-699)", value: fair }
      - { label: "Poor (<650)", value: poor }
onSubmit: submit-mortgage-application
```

```mdma
id: risk-callout
type: callout
variant: info
title: Risk Assessment
content: "Application will be evaluated against current lending criteria. Pre-approval does not guarantee final approval."
```

```mdma
id: underwriter-approval
type: approval-gate
title: Senior Underwriter Review
description: "Loan application for ${{mortgage-form.loan_amount}} requires underwriter sign-off before proceeding."
requiredApprovers: 1
allowedRoles:
  - senior-underwriter
  - risk-officer
onApprove: proceed-to-closing
onDeny: return-to-applicant
requireReason: true
```
````

The parser validates every block against Zod schemas. If the model generates an invalid component — a missing required field, a malformed binding expression, a type that doesn't exist — the validator catches it before rendering. No malformed UI reaches the client app. You can test this output locally with curl against your API server before deploying.

Validating Documents Before Production

MDMA includes a validator with 10 static analysis rules. Run it from the CLI or programmatically in your CI pipeline:

```bash
# Validate all MDMA documents
npx @mobile-reality/mdma-cli validate "documents/**/*.md"

# Auto-fix common issues (kebab-case IDs, missing sensitivity flags)
npx @mobile-reality/mdma-cli validate "documents/**/*.md" --fix

# JSON output for CI integration
npx @mobile-reality/mdma-cli validate "documents/**/*.md" --json
```

Or use it programmatically:

```typescript
import { validate } from '@mobile-reality/mdma-validator';

const result = validate(markdownString, { autoFix: true });

console.log(`${result.summary.errors} errors, ${result.summary.warnings} warnings`);
console.log(`${result.fixCount} issues auto-fixed`);
```

The validator checks YAML correctness, schema conformance, unique IDs, kebab-case formatting, binding resolution, action references, and PII sensitivity flags. It catches problems like a form referencing a non-existent submit handler or a binding expression pointing to a component that doesn't exist in the document. Run it in your CI pipeline so invalid documents never reach production.
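
Two of those rules, unique IDs and kebab-case formatting, are simple enough to illustrate standalone (a simplification of what the validator does internally):

```typescript
// Simplified versions of two validator rules: component IDs must be unique
// across the document and written in kebab-case.
const KEBAB_CASE = /^[a-z][a-z0-9]*(-[a-z0-9]+)*$/;

function checkIds(ids: string[]): string[] {
  const errors: string[] = [];
  const seen = new Set<string>();
  for (const id of ids) {
    if (!KEBAB_CASE.test(id)) errors.push(`"${id}" is not kebab-case`);
    if (seen.has(id)) errors.push(`duplicate id "${id}"`);
    seen.add(id);
  }
  return errors;
}
```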

Enterprise Considerations

Three features that matter when you move from prototype to production:

PII redaction. Fields marked sensitive: true are automatically redacted before logging. The TypeScript runtime detects five PII categories — email, phone, SSN, credit card, name patterns — and applies hash, mask, or omit strategies. For a mortgage workflow handling income information and social security numbers, this is not optional. The web content never exposes sensitive information to unauthorized systems.
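
As an illustration of the mask strategy, a sketch that keeps only the first and last character of an email's local part (the real runtime also performs category detection and supports the hash and omit strategies):

```typescript
// Illustrative mask strategy: keep the first and last character of the
// local part of an email and hide the rest. Not the runtime's actual
// redactor, which detects PII categories and picks a strategy per field.
function maskEmail(value: string): string {
  const [local, domain] = value.split('@');
  // Too short to mask meaningfully, or not an email: hide everything.
  if (!domain || local.length < 3) return '***';
  return `${local[0]}${'*'.repeat(local.length - 2)}${local[local.length - 1]}@${domain}`;
}

console.log(maskEmail('jane.doe@example.com')); // j******e@example.com
```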

Tamper-evident audit trail. Every event — field change, approval, button click — is recorded with a sequence number, a hash of the current entry, and the hash of the previous entry. Integrity verification is one call: store.getEventLog().verifyIntegrity(). When a regulator asks "who approved this loan application and when?" you have a cryptographically verifiable answer.
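
The chaining itself is only a few lines: each entry's hash commits to its payload and the previous entry's hash, so editing any historical entry breaks verification for everything after it. A minimal sketch with Node's crypto module (illustrative, not the runtime's actual log format):

```typescript
import { createHash } from 'node:crypto';

// Minimal hash-chained log. Each entry commits to the previous entry's hash,
// so tampering with any entry invalidates all later ones.
interface LogEntry { seq: number; payload: string; prevHash: string; hash: string; }

function appendEntry(log: LogEntry[], payload: string): LogEntry[] {
  const prevHash = log.length ? log[log.length - 1].hash : 'GENESIS';
  const seq = log.length;
  const hash = createHash('sha256').update(`${seq}|${payload}|${prevHash}`).digest('hex');
  return [...log, { seq, payload, prevHash, hash }];
}

function verifyIntegrity(log: LogEntry[]): boolean {
  return log.every((entry, i) => {
    const prevHash = i === 0 ? 'GENESIS' : log[i - 1].hash;
    const expected = createHash('sha256')
      .update(`${entry.seq}|${entry.payload}|${prevHash}`)
      .digest('hex');
    return entry.prevHash === prevHash && entry.hash === expected;
  });
}
```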

Policy engine. Environment-based rules prevent dangerous operations in non-production environments. Block webhook calls in preview, block emails in staging, allow everything in production. The policy design ensures a developer doesn't accidentally trigger a live underwriting API during testing. You don't copy production configs to dev — the platform handles environment separation automatically.
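
That gating reduces to a small lookup table. A sketch with illustrative rules matching the examples above (the real policy engine is configurable):

```typescript
// Illustrative environment-based policy: which side effects run where.
// The rule set here is an example; the actual engine is configurable.
type Environment = 'preview' | 'staging' | 'production';
type SideEffect = 'webhook' | 'email';

const POLICY: Record<Environment, Record<SideEffect, boolean>> = {
  preview: { webhook: false, email: false },
  staging: { webhook: true, email: false },
  production: { webhook: true, email: true },
};

function isAllowed(env: Environment, effect: SideEffect): boolean {
  return POLICY[env][effect];
}
```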

Conclusion

Markdown for AI agents works because it aligns with how LLMs actually generate content — naturally, with lower token costs and better reasoning quality than JSON alternatives. MDMA extends that foundation with validated interactive components that turn model output into forms, approval gates, and workflows without per-use-case frontend development.

  • Token costs drop up to 80% compared to HTML and 16% compared to JSON, with measurably better model reasoning accuracy
  • Nine component types cover the interaction patterns that come up in production AI workflows — data collection, approvals, task tracking, webhooks
  • Every component is validated against Zod schemas before rendering — malformed output never reaches the user
  • PII redaction, tamper-evident audit logs, and environment-based policies ship in the framework, not as afterthoughts
  • The full TypeScript stack — parser, runtime, renderer, prompt-pack, validator, CLI — is open source on GitHub

Clone the repo, run the mortgage example, and validate your setup in minutes, then adapt the blueprints for your own projects. Markdown for agents makes development faster by removing the gap between what AI systems generate and what users need. For advanced orchestration patterns, see our guide to building AI agents and business automation with AI agents.

Frequently Asked Questions

What is MDMA and how does it differ from standard Markdown?

MDMA (Markdown Document with Mounted Applications) extends standard Markdown by embedding interactive components defined in YAML inside fenced code blocks, enabling LLMs to generate validated forms, approval gates, and buttons alongside natural prose. While standard Markdown only provides static content like headings and lists, MDMA documents pass strict Zod schema validation before rendering, turning model output into interactive web workflows without requiring custom frontend development for each use case.

How much does Markdown reduce token costs for AI agents?

According to Cloudflare's research cited in the article, converting HTML to structured Markdown reduces token costs by approximately 80%, shrinking a 16,180-token web page to just 3,150 tokens. Additionally, GPT-4 scores 81.2% on reasoning tasks with Markdown prompts versus 73.9% with JSON, delivering both significant cost savings and measurably better model accuracy.

What types of interactive components can MDMA generate?

MDMA supports nine component types defined in YAML: Form, Button, Tasklist, Table, Chart, Callout, Approval Gate, Webhook, and Thinking. These cover essential production patterns including multi-field data collection with validation, workflow blockers requiring role-based approvals, HTTP triggers with retry logic, and collapsible reasoning blocks for transparency.

How does MDMA protect sensitive data like PII in production?

MDMA automatically redacts fields marked with `sensitive: true` before logging, detecting five PII categories including SSN, credit card, email, phone, and name patterns using hash, mask, or omit strategies. The framework also maintains a tamper-evident audit trail where every event is recorded with sequence numbers and cryptographic hashes linking to previous entries, providing integrity verification through a single API call.

AI-Powered Interactive Documents & Generative UI Insights

Are you exploring how large language models can move beyond plain text to deliver structured, interactive experiences? At Mobile Reality, we're pioneering the intersection of Markdown and generative UI — enabling LLMs to return forms, approval workflows, and dynamic components instead of static responses. Our growing library of articles covers the technical foundations, business applications, and architectural patterns behind this shift.

Dive into these resources to understand why generative UI is replacing plain-text chat interfaces across healthcare, fintech, and enterprise workflows. If you'd like to integrate MDMA into your product or explore a partnership, reach out to our team. And if you're passionate about shaping the future of LLM-powered interfaces, check our open positions — we're hiring.


Matt Sadowski

CEO of Mobile Reality

Related articles

Google A2UI vs MDMA 2026: Cut AI UI Token Costs 16% (01.04.2026)
Cut AI UI token costs by 16% using MDMA’s Markdown vs Google A2UI JSON. Gain audit trails, PII redaction, approval gates, and better model reasoning.

Generative vs Agentic AI: Key Differences for Business 2026 (27.03.2026)
Agentic AI drives autonomous business decisions, while generative AI powers content. Understand their roles to boost efficiency and strategic impact in 2026.

LLM Interface: The Missing Layer Between Your AI Model and Your Users (24.03.2026)
Discover what an LLM interface is and why most AI products fail at the last mile. Learn how to turn raw model output into interactive forms, approvals, and workflows with an open-source solution.