~50,000 tokens
Just to parse raw HTML — most of it navigation, ads, boilerplate. The actual content is buried.
robots.txt taught crawlers where to go. sitemap.xml told search engines what exists. ARA tells AI agents what a site is, what it contains, and how to interact with it — in 300 tokens instead of 50,000.
```json
{
  "ara_version": "1.0",
  "identity": {
    "name": "TechShop",
    "type": "ecommerce",
    "languages": ["en", "fr"]
  },
  "resources": {
    "products": "/schemas/product",
    "orders": "/schemas/order"
  },
  "actions": "/.well-known/ara/actions.json",
  "protocols": ["REST", "MCP", "A2A"],
  "policies": { "rate_limit": "60/min", "auth": "bearer" }
}
```
Every AI agent visiting your site wastes thousands of tokens parsing HTML noise. There's no standard. Every agent guesses.
Lossy information retrieval from HTML parsing: agents miss facts, misunderstand structure, and hallucinate capabilities.
Screenshot analysis, DOM scraping, UI automation — all brittle. One site redesign breaks every agent integration.
Existing standards each solve a slice. ARA is the only one designed end-to-end for AI agent interaction.
| Capability | robots.txt | sitemap.xml | Schema.org | llms.txt | OpenAPI | ARA |
|---|---|---|---|---|---|---|
| Site discovery | — | Partial | — | Partial | — | ✓ Complete |
| Global overview | — | URLs only | — | Plain text | — | ✓ Structured |
| Data schemas | — | — | Fragmented | — | Yes | ✓ Semantic |
| Actions | — | — | Limited | — | Yes | ✓ Multi-protocol |
| Intent mapping | — | — | — | — | — | ✓ Native |
| MCP / A2A support | — | — | — | — | — | ✓ Native |
| LLM-optimized digest | — | — | — | Basic | — | ✓ Optimized |
| Agent policies | Basic | — | — | — | Partial | ✓ Complete |
llms.txt is a plain text file with links. Useful as a first step — but it gives AI agents no structure, no schemas, no actions, and no machine-readable policies.
| Feature | llms.txt | ARA |
|---|---|---|
| Format | Plain text / markdown links | Structured JSON |
| Site overview | Partial (manual) | Complete (structured) |
| Data schemas | — | JSON Schema + Schema.org |
| Available actions | — | Full query & mutation definitions |
| Protocol support | — | REST, MCP, A2A, GraphQL |
| Access policies | — | Rate limits, auth, data usage |
| Token cost for agents | ~800 tokens to parse | ~150 tokens (manifest only) |
| Machine-readable | Partially | Fully |
Run /ara migrate to convert your existing llms.txt into full ARA files automatically.
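The mapping is mechanical enough to sketch. Assuming a typical llms.txt (a markdown title plus link list), the conversion boils down to lifting each link into the manifest's resource map. The regex and output shape below are illustrative only, not the /ara migrate implementation:

```python
import json
import re

# A hypothetical llms.txt in the common title-plus-links shape.
LLMS_TXT = """# TechShop
- [Products](https://techshop.example/products): full catalog
- [Returns](https://techshop.example/returns): return policy
"""

def llms_to_manifest_skeleton(text: str) -> dict:
    """Turn llms.txt markdown links into a minimal ARA manifest skeleton."""
    title = re.search(r"^#\s+(.+)$", text, re.M)
    links = re.findall(r"\[([^\]]+)\]\(([^)]+)\)", text)
    return {
        "ara_version": "1.0",
        "identity": {"name": title.group(1) if title else "unknown"},
        "resources": {name.lower(): url for name, url in links},
    }

print(json.dumps(llms_to_manifest_skeleton(LLMS_TXT), indent=2))
```

The skeleton still needs schemas, actions, and policies filled in by hand (or by /ara transform), which is exactly the structure llms.txt never carried.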
One HTTP GET to manifest.json gives an AI agent complete understanding of your site — identity, structure, schemas, actions, and policies.
manifest.json
~150 tokens
Identity, content map, capabilities, protocols, policies. The single entry point an agent fetches first.
"identity": { "name": "TechShop", "type": "ecommerce" }
schemas/
~250 tokens
Semantic resource schemas with Schema.org annotations — agents understand your data types without inferring them.
"products": { "type": "catalog", "count": 2000 }
actions.json
~350 tokens
Agent actions with natural-language intent examples — agents know what they can do and how to call it.
"search_products": { "intent": "find products by..." }
digest.md
~300 tokens
LLM-optimized 200–400 token summary — AI search engines cite your facts, not their guesses.
```markdown
# TechShop
- 2,000 products across 14 categories
- Free EU shipping above €50
- 30-day returns, 2-yr warranty
```
GPTBot, ClaudeBot, PerplexityBot, Google-Extended and 10 other AI crawlers don't know ARA exists yet. We solved this with server-side content negotiation.
Content negotiation serves /.well-known/ara/digest.md to AI crawlers, and a Link: </.well-known/ara/manifest.json>; rel="ara-manifest" header goes on every response
<link rel="ara-manifest"> + <meta name="ara:manifest"> tags
potentialAction pointing to manifest (Schema.org-compatible)
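The server-side trick can be sketched in a few lines. The bot names and file paths come from this page; the middleware itself is illustrative (WSGI-style pseud
o-routing, not the /ara enforce output for any particular framework):

```python
# AI crawler user-agents named above (subset; real lists are longer).
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")
ARA_LINK = '</.well-known/ara/manifest.json>; rel="ara-manifest"'

def route(path: str, user_agent: str) -> tuple[str, dict]:
    """Send known AI crawlers to the digest; advertise the manifest to all."""
    headers = {"Link": ARA_LINK}  # attached to every response
    if any(bot in user_agent for bot in AI_BOTS):
        return "/.well-known/ara/digest.md", headers
    return path, headers

print(route("/products", "Mozilla/5.0 (compatible; GPTBot/1.0)"))
```

Human visitors keep getting the HTML page; only user-agents on the bot list are redirected to the token-cheap digest.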
Whether you use an AI editor or prefer the command line — ARA can be set up either way.
If you use Claude Code, Cursor, Opencode, or another AI-powered editor, dedicated ARA agents automate the entire setup in minutes — audit, generate, enforce, monitor.
/ara transform https://yoursite.com — generates all 4 ARA files
/ara enforce https://yoursite.com — injects middleware automatically
/ara audit https://yoursite.com — verifies your grade (A–F)
If you don't use an AI editor, you can still implement ARA manually. Use npx to generate a starting template, then review and complete the generated files by hand.
mkdir -p .well-known/ara — create the directory
npx ara-generate https://yoursite.com --output .well-known/ara/ — generates a template
npx ara-validate https://yoursite.com — check your score
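Before running the validator, a local sanity check is easy to script. The four file names come from the layers described above; the real A–F grading belongs to ara-validate and ara-audit, and this sketch only checks presence and JSON validity:

```python
import json
from pathlib import Path

REQUIRED = ["manifest.json", "actions.json", "digest.md"]  # plus schemas/

def check_ara_dir(root: Path) -> list[str]:
    """Return a list of problems in .well-known/ara/ (empty means it looks OK)."""
    problems = [f"missing {name}" for name in REQUIRED if not (root / name).exists()]
    if not (root / "schemas").is_dir():
        problems.append("missing schemas/ directory")
    for name in ("manifest.json", "actions.json"):
        f = root / name
        if f.exists():
            try:
                json.loads(f.read_text())
            except json.JSONDecodeError:
                problems.append(f"{name} is not valid JSON")
    return problems
```

Point it at .well-known/ara/ after npx ara-generate and fix anything it reports before asking for a grade.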
The fastest way to go ARA-ready — 4 Claude Code agents automate the full lifecycle (audit → generate → enforce → monitor) in minutes. Currently available for Claude Code only — Cursor and Opencode agents are in development.
Scores any site A–F across 13 criteria — finds missing layers, detects llms.txt, checks enforcement signals.
Generates all 4 ARA files from any URL or local codebase — manifest, schemas, actions, digest.md.
Injects content-negotiation middleware for 8+ frameworks — forces AI bots to read ARA.
Measures GEO impact — tracks citation rate and semantic accuracy across AI search engines.
claude mcp install ara — or copy agents from github.com/aka9871/ara-agents to your ~/.claude/agents/ directory
ARA is designed to be adopted progressively. Start with the manifest, add layers when you're ready.
manifest.json → schemas/ → actions.json
Three commands. No DNS changes. No deploys. Works with any stack.
Run the official ARA validator against any URL — instant grade with detailed breakdown.
Open validator →
Full v1.0 specification, JSON schemas, examples, and reference implementations.
Read on GitHub →