# How I track when AI tools cite my work
Source: https://simonbeauloye.com/writing/future-media/ai-citation-tracking-routine/
Published: 2026-05-02
Pillar: future-media
Author: Simon Beauloye (https://simonbeauloye.com)
License: CC-BY-4.0 (attribution required)
Cite as: Simon Beauloye, "How I track when AI tools cite my work", https://simonbeauloye.com/writing/future-media/ai-citation-tracking-routine/
AI-use policy: https://simonbeauloye.com/ai-policy.txt
> Instead of wondering how often AI models cite my work, I built a small weekly automation that puts a real number on every page of this site.
## Takeaways
- I had no real way to tell whether AI tools were citing my essays. A small weekly routine now turns the question into a number on every page of the site.
- Three signals run in one weekly review: per-article counts, a site-wide cumulative counter, and quality publications proposed as press features.
- Behind every visible citation is a structured-data record that AI systems can read directly, so the citation graph is machine-readable too.
## Signals
- Claim: A single weekly Claude Code routine pulls data from the Ahrefs Brand Radar and Site Explorer APIs and bundles three reader- and machine-visible authority signals into one pull request on simonbeauloye.com.
Year: 2026
Source: Operating record, simonbeauloye.com build
- Claim: Per-article cited-by entries are aggregated by source and rendered as a single line, e.g. ChatGPT 12 · Perplexity 14 · Gemini 9 · Copilot 7 · Google AI 4, above the Ask Simon bar on every essay.
Year: 2026
Source: Operating record, simonbeauloye.com build
- Claim: The site-wide cumulative AI mention counter is monotonic by construction. Both the cumulative semantics of the Brand Radar API and a defensive max-merge guard in the sync script ensure the public number never decreases.
Year: 2026
Source: Operating record, simonbeauloye.com build
- Claim: Brand Radar tracks six AI surfaces today: ChatGPT, Perplexity, Gemini, Copilot, Google AI Overviews, and Google AI Mode. The two Google surfaces are merged into a single Google AI chip in the reader UI.
Year: 2026
Source: Ahrefs Brand Radar product documentation
- Claim: Each weekly run makes seven Ahrefs API calls in total: three to brand-radar/ai-responses for the per-article citations, three more to brand-radar/mentions-overview for the site-wide counter, and one to site-explorer/all-backlinks for the press proposals.
Year: 2026
Source: Operating record, simonbeauloye.com build
- Claim: A 24h idempotency guard checks git log for prior brand-radar commits before any work runs, preventing cron and manual double-fires from doubling the Ahrefs API quota.
Year: 2026
Source: Operating record, simonbeauloye.com build
- Claim: Every AI citation lands in the article's JSON-LD as a subjectOf CreativeWork node, and site-wide per-tool totals land on the homepage Person node as interactionStatistic counters, mirroring the visible cited-by line in structured data.
Year: 2026
Source: Operating record, simonbeauloye.com build
## Citations
- https://ahrefs.com/brand-radar
- https://claude.com/blog/introducing-routines-in-claude-code
## Article
For a while I've been wondering how often AI tools were quoting my work in their answers. I knew it was happening. Someone would mention having seen one of my pieces in a ChatGPT response, or I'd land on a Perplexity answer that linked back to my site. But I had no real numbers. Was it ten times a month or two hundred?
Most authors I talk to are in a similar place. There's a vague sense the citations are happening, but they don't know how often, or whether it's worth letting AI agents scrape their content at all.
So I built a small weekly automation that does the checking for me and writes the result straight onto the site.
**Same prompt, same time every week, fully auto-generated.** ✌️
This piece walks through what it puts on the page, and then, if you're really curious about the technical details, what's running behind it.
## What you see on the site
Three things, in three places.
The first is a small line at the end of every essay that shows how often that particular article has been cited by AI tools. One small icon per AI, one count per icon. If a tool has never cited a particular article, its icon doesn't show up on that page. Hover any icon and you see the prompt that triggered the citation when one is available.
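The aggregation behind that line can be sketched as follows. The record shape (`{"tool": ...}`) and the `GOOGLE_MERGE` mapping name are my assumptions about the build, not its actual code, but the merge rule itself comes from the setup described here: the two Google surfaces collapse into a single Google AI chip, and a tool with zero citations never renders.

```python
# Illustrative only: the real sync script's record shape may differ.
GOOGLE_MERGE = {
    "Google AI Overviews": "Google AI",
    "Google AI Mode": "Google AI",
}

def cited_by_line(citations: list[dict]) -> str:
    """Collapse raw citation records into the one-line cited-by summary."""
    counts: dict[str, int] = {}
    for c in citations:
        tool = GOOGLE_MERGE.get(c["tool"], c["tool"])
        counts[tool] = counts.get(tool, 0) + 1
    # A tool with zero citations never enters counts, so its icon
    # simply doesn't show up on that page.
    return " · ".join(f"{tool} {n}" for tool, n in counts.items())
```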
The second is a single line on the home page, the about page, and the press section. It's the cumulative count of every time an AI tool has mentioned my name in an answer. This is a different signal from the per-article citations above: a brand-name mention isn't the same as a URL citation, so the two counts don't add up. Both are useful, and both only ever go up.
The third one isn't visible to readers at all. When a high-quality publication links to me for the first time (a real outlet rather than a random scraper), the automation drafts a small file into the press folder and opens a GitHub PR asking me to take a look. I review the proposal, drop in the outlet's logo, tighten the blurb, and decide whether the piece is worth featuring on the [press page](/press/).
The first two ship to the live site as soon as I approve the routine's weekly review. The third sits as a draft until I do the editorial work. Everything sits behind the same review.
## How it works behind the scenes
If you're curious about what's doing the work, here's the slightly nerdier walk-through. Skip ahead if it's not your thing.
The data comes from a tool called Ahrefs Brand Radar. Ahrefs is best known as an SEO tool. Brand Radar is their newer product that watches the major AI chatbots and AI search surfaces and reports back when a configured brand or URL appears in their answers. Six surfaces today: ChatGPT, Perplexity, Gemini, Copilot, Google AI Overviews, and Google AI Mode. Unfortunately, Claude and Grok aren't available yet, but the list keeps changing as the AI landscape evolves.
The routine that pulls the data is one entry on **Claude Code's `/schedule`** feature. Claude Code is the AI coding tool I use to build most of what's on this site. The `/schedule` feature lets a Claude Code session run on a recurring schedule, with the same prompt every time. Once a week, this particular session wakes up, pulls fresh data from Ahrefs, runs three small scripts in the repo, and opens a single review for me to look over.
Each run makes seven calls to Ahrefs in total. Three for the per-article citations, three more for the site-wide counter, and one for the press proposals. The split into three calls per data type is forced on us by the API: Brand Radar refuses to mix Google data sources with each other or with the chatbot sources in a single call. So the routine queries each one separately and stitches the results back together itself.
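The query-then-stitch step can be sketched like this. The grouping follows the API constraint just described (the two Google surfaces can't share a call with each other or with the chatbot sources); the `fetch` signature and group layout are illustrative assumptions, not the routine's actual code.

```python
from typing import Callable

# Grouping implied by the Brand Radar constraint: Google surfaces
# can't be mixed with each other or with the chatbot sources.
SOURCE_GROUPS = [
    ["ChatGPT", "Perplexity", "Gemini", "Copilot"],
    ["Google AI Overviews"],
    ["Google AI Mode"],
]

def pull_endpoint(fetch: Callable[[str, list[str]], list[dict]],
                  endpoint: str) -> list[dict]:
    """Query one endpoint once per source group, then stitch the rows."""
    rows: list[dict] = []
    for sources in SOURCE_GROUPS:
        rows.extend(fetch(endpoint, sources))
    return rows
```

With three groups per endpoint, the two Brand Radar endpoints account for six calls, and the single backlinks call brings the weekly total to seven.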
After the data lands, three small scripts in the repo do the file updates. One writes the citations into the right article file. One updates the file behind the site-wide counter. One drafts the press proposals. The Claude Code routine itself is just orchestration: ask Ahrefs, reshape the data, hand each piece to the right script, then bundle everything into one review.
Before any work runs at all, the routine first checks whether it already ran in the last 24 hours. If it did, it exits straight away. That stops the scheduled run and a manual run from doubling up on the same data and burning the Ahrefs quota for nothing.
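One way a guard like that can be written is to grep recent git history before doing anything else. The commit-message marker here is an assumption about the repo's conventions, not the routine's actual code.

```python
import subprocess

def ran_recently(marker: str = "brand-radar", hours: int = 24) -> bool:
    """Return True if a commit mentioning the marker landed in the window.

    A sketch of the idempotency guard: if a matching commit exists in
    the last `hours` hours, the routine exits before any API call.
    """
    out = subprocess.run(
        ["git", "log", f"--since={hours} hours ago",
         f"--grep={marker}", "--oneline"],
        capture_output=True, text=True, check=True,
    ).stdout
    return bool(out.strip())
```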
## The two design decisions worth naming
The first is what happens when a number goes down. Every public count on this site is merged with the value already published rather than overwritten: the sync script keeps whichever number is higher, so an API blip can never pull a live counter down.
The same thing applies to individual citations too. Once an article gains an AI citation, it stays in the records. Even if the AI response disappears from Ahrefs the following week, the entry on the article remains. Removing visible social proof retroactively because of an API blip would be misleading.
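The defensive max-merge described above fits in a few lines. This is a minimal sketch, assuming counts keyed by tool name; the actual script's data layout may differ.

```python
def merge_counts(published: dict[str, int],
                 fetched: dict[str, int]) -> dict[str, int]:
    """Max-merge: a public counter may rise but never fall.

    If the API returns a lower number (a blip, a re-crawl, a source
    dropping out), the previously published value wins.
    """
    merged = dict(published)
    for tool, count in fetched.items():
        merged[tool] = max(merged.get(tool, 0), count)
    return merged
```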
The second decision is what the routine doesn't auto-merge. Two of the three signals are derived data with no editorial decision left to make. They could go straight to the live site without my review. They don't.
**Same review surface, single audit trail.**
Opening the weekly review to check the changes is also a very good way for me to spot trouble. If I suddenly go from 200 mentions one week to just 4 the next, I'll want to understand what happened.
## Why the citations are also machine-readable
The visible cited-by line is what a reader sees. Underneath, the same data is also published in a structured format that AI systems and search engines can read directly without parsing the page. I discuss this in more detail in the [Colophon](/colophon/).
This is part of how I think about [GEO](/glossary/geo/) past the dashboard level. Every per-article citation is added to the article's structured-data block as an entry that names the AI tool that cited it. The site-wide totals are added to the home page's structured profile block as a per-tool counter. The visible HTML and the structured data carry the same numbers, just in two different formats for two different audiences.
This matters because the citations themselves are the thing AI systems are trying to make sense of. When ChatGPT cites an article and that article publishes a structured record back saying *yes, ChatGPT cited me, here is the prompt that triggered it*, the citation graph closes.
And it's a virtuous cycle too: search engines that bias toward websites that are often cited by AI models will get the signal in a form that they can read without parsing the HTML.
I don't know if this materially changes how often the next ChatGPT or Perplexity response will feature this site's content. But the structured data costs nothing to generate, so why not make it easier for them to digest?
And while measuring AI's citations and traffic is good, the more important question, of course, remains to understand [what happens to the rest of the publishing business](/writing/future-media/ai-restructures-publishing/) when AI becomes the primary distribution channel.
## Where this is going next
The next thing I want is a per-article delta in the weekly review, so I can see which essays are picking up new citations week over week without scanning the whole diff. After that, a small build-time check that flags when the site-wide counter hasn't refreshed in 14 days, in case the routine ever breaks silently.
If you want to set up the same kind of tracking on your own site, reach out and I'll be happy to share the specs in more detail. The Ahrefs side costs whatever your Brand Radar plan costs; the calls reuse a report you've already configured. Everything else is plumbing.
If you're working on AI citation tracking, GEO measurement, or anything in this neighbourhood, I'd love to hear what you're seeing.
Find me on [LinkedIn](https://www.linkedin.com/in/simonbeauloye/), [X](https://x.com/simonbeauloye), or via the [contact page](/contact/).
## See also
- [Site index](https://simonbeauloye.com/llms.txt)
- [Full corpus](https://simonbeauloye.com/llms-full.txt)
- [Pillar index (future-media)](https://simonbeauloye.com/llms/future-media/llms.txt)
- [Pillar hub (future-media)](https://simonbeauloye.com/writing/future-media/)
- [AI-use policy](https://simonbeauloye.com/ai-policy.txt)
### Related essays
- [Most media publishers are solving the wrong AI problem](https://simonbeauloye.com/writing/future-media/ai-restructures-publishing/)
### Glossary terms referenced
- [Generative Engine Optimisation (GEO)](https://simonbeauloye.com/glossary/geo/) — The practice of structuring a website so AI answer engines (ChatGPT, Claude, Perplexity, Google AI Overviews) can ingest, ground, and cite its content reliably.