How this site is built.

This website is my personal playground for experimenting with new technologies. Every page is designed to be read equally well by humans and by AI agents. Here's the build, in detail.

Stack

Framework
Astro 6 — static-site generator, content collections with Zod schemas.
Hosting
Amazon S3 behind CloudFront. Static-only, no backend, no session state.
Deployment
GitHub Actions on every push to main. Build → S3 sync → CloudFront invalidation → IndexNow ping.
Search
Pagefind — fully client-side, no server, no analytics on queries.
Editor
VS Code with Claude Code. Hand-tuned.

Agent-readiness surfaces

Underneath the surface, every page on this website has a machine-readable copy. Each of these files serves a distinct purpose, helping AI agents know exactly where and how to consume my content.

Discovery

How agents figure out what's available without crawling the entire website.

/robots.txt
Typical crawler permissions, plus a bonus: a Content-Signals section declaring search, ai-input, and ai-train preferences, so AI agents know how they may use my content.
/ai-policy.txt
My policy on AI use and citation attribution. robots.txt points here when an agent needs more nuance.
/.well-known/agent-skills/index.json
The site-wide manifest (RFC agent-skills v0.2): every readable surface listed with a sha256 content hash, so agents can verify content hasn't changed before re-fetching.
/.well-known/llms.txt
RFC 8615 alias of /llms.txt — covers crawlers that probe the standardised well-known path first.
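The manifest's hash-based change detection can be sketched like this. A minimal sketch, assuming a `{ url, sha256 }` entry shape — the actual agent-skills v0.2 schema and field names may differ:

```typescript
import { createHash } from "node:crypto";

// Illustrative manifest entry: a surface URL plus a SHA-256 hash of its body.
// An agent that cached the hash can skip re-fetching while it still matches.
interface SurfaceEntry {
  url: string;
  sha256: string;
}

function entryFor(url: string, body: string): SurfaceEntry {
  return {
    url,
    sha256: createHash("sha256").update(body, "utf8").digest("hex"),
  };
}

const entry = entryFor("https://example.com/llms.txt", "# llms.txt\n");

// After an edit, the manifest hash no longer matches the agent's cached one:
const after = entryFor("https://example.com/llms.txt", "# llms.txt\nnew line\n");
const changed = entry.sha256 !== after.sha256;
```

The hash lives in the manifest, so one fetch of index.json answers "did anything change?" for every surface at once.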

Site-wide content

Three complementary takes on the corpus, each tuned to a different consumer.

/llms.txt
Curated index. A short table of contents at the top, pointing AI tools to my essays, glossary, and well-known paths. This answers the question "where to start" when an AI agent lands directly on an article instead of the homepage.
/llms-full.txt
Full corpus: every essay concatenated into one plain-text file, for agents that prefer one fetch over many. Kept fully in sync with each page's standalone llms.txt sibling, so agents can pick whichever format suits their needs and capabilities.
/corpus.json
Structured catalog of every essay (title, slug, pillar, publication date, URL) — for programmatic indexing without parsing raw HTML.
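A corpus.json record might look like this — field names follow the description above, but the exact schema (and the sample values) are illustrative, not the site's actual output:

```typescript
// Shape of one corpus.json record, mirroring the fields listed above.
interface CorpusEntry {
  title: string;
  slug: string;
  pillar: string;
  date: string; // ISO 8601 publication date
  url: string;
}

// Hypothetical entry for demonstration only.
const sample: CorpusEntry = {
  title: "Example essay",
  slug: "example-essay",
  pillar: "writing",
  date: "2025-01-01",
  url: "https://example.com/writing/example-essay/",
};

// An indexer can sort and filter without parsing any HTML:
const corpus: CorpusEntry[] = [sample];
const latest = [...corpus].sort((a, b) => b.date.localeCompare(a.date))[0];
```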

Standards-based feeds

The two feed formats most readers and crawlers already speak.

/feed.json
JSON Feed 1.1 — modern feed format for readers and agents that prefer JSON over XML.
/rss.xml
RSS 2.0 — legacy-compatible feed for traditional readers and crawlers that don't support JSON Feed.

/now snapshot

/now is the freshest page on the site, so it gets its own machine-readable bundle.

/now.json
JSON snapshot of the current /now page: where I am in the world, what I'm working on, thinking about, or doing.
/now/llms.txt
Plain-text render of /now with the freshness badge baked in, designed for AI grounding when the agent only cares about the "what's current" beat.
/now/rss.xml
Change-feed: a single-item RSS whose pubDate moves only when /now actually changes — gives agents a cheap signal to re-poll.
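The change-feed trick — bumping pubDate only when /now actually changes — comes down to comparing a content hash at build time. A hypothetical helper, not the site's actual code:

```typescript
import { createHash } from "node:crypto";

// Carry the previous content hash and pubDate between builds; only move
// pubDate forward when the rendered /now content has really changed.
function nextPubDate(
  content: string,
  prev: { hash: string; pubDate: string },
  buildDate: string,
): { hash: string; pubDate: string } {
  const hash = createHash("sha256").update(content, "utf8").digest("hex");
  return hash === prev.hash ? prev : { hash, pubDate: buildDate };
}

const prev = {
  hash: createHash("sha256").update("v1", "utf8").digest("hex"),
  pubDate: "Mon, 06 Jan 2025 00:00:00 GMT",
};

// Redeploy with unchanged content: pubDate stays put, pollers stay quiet.
const unchanged = nextPubDate("v1", prev, "Tue, 07 Jan 2025 00:00:00 GMT");
// Real edit: pubDate moves, and agents know to re-fetch.
const edited = nextPubDate("v2", prev, "Tue, 07 Jan 2025 00:00:00 GMT");
```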

Every article, glossary term, press entry, and stream item also gets an accompanying llms.txt sibling — same content as the rendered page, with an attribution header at the top.

Build-pipeline highlights

  • Auto-derived /now sections. The "Recently" block on the /now page pulls from Stream and Writing at build time, so the page stays fresh even when the handwritten content ages.
  • Citation reciprocity in JSON-LD. Press entries are emitted twice (standalone NewsArticle nodes plus compact citation arrays inside Person/ProfilePage). Glossary terms get stable @id URLs so AI tools can resolve entities.
  • Conditional GET on llms surfaces. S3 sets ETag + Last-Modified; CloudFront passes them through. Agents re-polling llms.txt get 304 Not Modified when nothing has changed. Zero bytes, zero cost.
  • Site-wide Link response header. CloudFront advertises discovery entry points on every response. Agents can find llms.txt, agent-skills, and ai-policy without crawling.
  • IndexNow on every deploy. Pings Bing, Yandex, Seznam, and Naver — including the llms surface URLs explicitly, not just the HTML pages.
  • Per-article enrichment. wordCount, timeRequired (ISO 8601 at 225 WPM), and SpeakableSpecification for voice-readable slices. Glossary mentions inside articles auto-emit about edges to the canonical term — no per-post frontmatter needed.
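The timeRequired derivation above is plain arithmetic: word count over 225 WPM, rounded up, serialised as an ISO 8601 duration. A sketch of the idea — the site's own helper may differ:

```typescript
// ISO 8601 reading time at 225 words per minute, e.g. 1800 words -> "PT8M".
// Math.max(1, ...) keeps very short pages at a one-minute floor.
function timeRequired(wordCount: number, wpm = 225): string {
  const minutes = Math.max(1, Math.ceil(wordCount / wpm));
  return `PT${minutes}M`;
}

const essay = timeRequired(1800); // 1800 / 225 = 8 minutes exactly
const note = timeRequired(100);   // under a minute, floored to one
```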

Design principles

Type

Three faces, three jobs. -webkit-font-smoothing: antialiased throughout.

  • Literata (Display · Serif): high-contrast serif for headlines.
  • Inter (Body · Sans): a clean sans for sustained reading.
  • JetBrains Mono (Mono · Code): for code and the footer terminal.

Colour

Two themes, swapped via the footer toggle. Single accent. SVG logos use fill="currentColor" so they theme-adapt without a second asset.

Light
  • Surface: #f9f8f6
  • Surface alt: #eeebe4
  • Ink: #2b3139
  • Muted: #6b6359
  • Primary: #3348a5
Dark
  • Surface: #151515
  • Surface alt: #1b1b1b
  • Ink: #edeae6
  • Muted: #b8b4b1
  • Primary: #6970bf

Voice
  • Direct.
  • Operator-first.
  • British English.
  • Honest about what failed.
  • Specific over generic.

Same voice across homepage, 404, footer terminal, and unsubscribe flows.

Architecture

Single source → many surfaces. Edit once; everything downstream rebuilds together — no manual sync.

Sources
  • src/content/ (MDX · MD)
  • src/data/ (TypeScript)
Surfaces
  • HTML
  • JSON-LD
  • llms.txt
  • RSS
  • JSON Feed
  • OG cards
  • corpus.json
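The fan-out can be pictured as one content record mapped to every derived per-page surface at build time. An illustrative sketch — the route shapes are assumptions, not the site's actual paths:

```typescript
// One essay slug fans out into each of its per-page surfaces.
function surfacesFor(slug: string): string[] {
  return [
    `/writing/${slug}/`,          // rendered HTML (with JSON-LD inline)
    `/writing/${slug}/llms.txt`,  // plain-text sibling for agents
    `/og/${slug}.png`,            // generated OG card
  ];
}

// Site-wide surfaces (llms-full.txt, rss.xml, feed.json, corpus.json)
// rebuild from the same collection, which is why nothing needs manual sync.
const s = surfacesFor("example-essay");
```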

What's deliberately not on this site

  • No popups, exit-intents, or modals. The site never interrupts you to ask for an email.
  • No third-party tracking beyond Google Analytics. No Facebook pixel, no LinkedIn Insight tag, no session replay.
  • No comments. If you want to respond, write to me. I usually reply within a few business days.
  • No paywall, no gated content, no signup wall. Every essay, glossary term, and press entry is public on first visit.
  • No AI-generated content presented as written by me. When AI helps, it's editorial. Voice and judgment stay mine. Always.