How we measure agent-readiness.
MAGNET (Machine-Actionable Generative Entity Test) is the framework Nmow built to score whether AI agents can discover, understand, trust, and transact with a website. It covers forty-six items across seven dimensions, calibrated for the MENA market.
What MAGNET actually measures.
Most GEO checklists you’ll find online are about ten items long. They look at schema markup, mention llms.txt, suggest a few content tweaks, and call it done. That’s a starting point, not a methodology.
Forty-six items, seven dimensions
Architecture in the open, rubric in the audit
Seven dimensions, at a glance.
Each dimension addresses a discrete failure mode in agent-readiness. Weights shift by business profile; the diagram shows the default service-business weighting.
Can agents retrieve deterministic facts about your offerings?
Can agents retrieve deterministic facts about your business and your offerings without inferring from prose?
Heaviest weight
Scored items
Can vision-based agents click the right elements?
Can vision-based agents click the right elements and complete workflows without hallucinating element function?
Vision-agent ready
Scored items
When an LLM cites your content, will the citation hold up?
When an LLM cites your content, will the citation hold up to scrutiny?
Citation defensibility
Scored items
Can the agents that want to read your site actually reach it?
Access prerequisite
Scored items
Does your timestamp evidence match reality?
Timestamp truth
Scored items
What evidence of credibility does an LLM find?
What evidence of credibility does an LLM find when weighing whether to cite you?
Citation worthiness
Scored items
Can an agent complete your primary conversion on a buyer’s behalf?
Can an agent complete your primary conversion action on a buyer’s behalf, end-to-end?
Transactability
Scored items
Five bands. What your score means in practice.
Each total score maps to a band, and each band maps to a recommended next step.
Agent-Native
Agent-Friendly
Legacy-Optimized
Frictional
Dark Site
Weights shift by business profile.
Generic checklists score every site the same way. MAGNET adjusts dimension weights based on the business profile being audited, because the failure modes that matter most differ by category.
Default weighting
Commerce-weighted
Authority-weighted
Trust-weighted
The exact per-profile weighting is part of the audit. The right profile is determined during the audit’s Discovery phase. Hybrid businesses (e.g., e-commerce with a strong content arm) are scored under both profiles, with the lower of the two scores presented as the headline.
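The profile mechanics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the audit's implementation: the weight values below are invented placeholders (the real per-profile weights are not published), the profile keys are hypothetical names, and each dimension is assumed to be scored from 0 to 100 before weighting.

```python
import math
from typing import Dict, Iterable

# HYPOTHETICAL weights for illustration only; the real per-profile
# weights are part of the audit and are not published on this page.
PROFILE_WEIGHTS: Dict[str, Dict[str, float]] = {
    "commerce": {"D1": 0.20, "D2": 0.10, "D3": 0.15, "D4": 0.10,
                 "D5": 0.05, "D6": 0.15, "D7": 0.25},
    "authority": {"D1": 0.20, "D2": 0.05, "D3": 0.25, "D4": 0.10,
                  "D5": 0.10, "D6": 0.25, "D7": 0.05},
}


def weighted_score(dim_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-dimension scores (assumed 0-100) into one weighted total."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("profile weights must sum to 1")
    return sum(weights[dim] * dim_scores[dim] for dim in weights)


def headline_score(dim_scores: Dict[str, float], profiles: Iterable[str]) -> float:
    """Score under every applicable profile and report the lowest total."""
    return min(weighted_score(dim_scores, PROFILE_WEIGHTS[p]) for p in profiles)
```

With these placeholder weights, a hybrid site that is strong on structured data but weak on conversion scores lower under the commerce profile than under the authority profile, so the commerce total becomes the headline.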
Five failures cap your maximum score.
Some failures are categorical. A site can score perfectly on six dimensions and still be effectively invisible to agents because of one fundamental break. The veto checks identify those breaks and cap the maximum achievable score until they’re fixed.
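A minimal sketch of the capping logic, using the five cap values listed in this section. The identifier names are hypothetical, and one behaviour is an assumption: the page does not say what happens when several vetoes trigger at once, so the sketch applies the lowest cap.

```python
from typing import Iterable

# Cap values as listed in this section; the keys are hypothetical identifiers.
VETO_CAPS = {
    "not_crawlable": 25,
    "crawler_blocking": 30,
    "js_only_rendering": 40,
    "human_in_the_loop_conversion": 40,
    "zero_schema": 50,
}


def apply_veto_caps(raw_score: float, triggered: Iterable[str]) -> float:
    """Limit a raw score to the strictest cap among the triggered vetoes.

    ASSUMPTION: when more than one veto triggers, the lowest cap wins.
    A score already below the cap is returned unchanged.
    """
    caps = [VETO_CAPS[name] for name in triggered]
    return min([raw_score, *caps])
```

Under this sketch, a site with a raw score of 92 and no structured data would be reported at 50, and the same site with AI crawlers also blocked would be reported at 30.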
Not Crawlable
The content isn't reachable at all: an auth wall, infinite scroll with no real URLs, or other access blockers. Agents can't reach it, so dimension scoring is theoretical.
Cap: maximum 25
Crawler Blocking
The site blocks the major AI crawlers wholesale (GPTBot, ClaudeBot, PerplexityBot and peers), whether intentionally or via a misconfigured WAF.
Cap: maximum 30
JS-Only Rendering
The majority of meaningful content only renders client-side, with no server-rendered HTML for the agent crawlers that don't execute JavaScript. At-scale fetch studies show AI crawlers don't run JavaScript the way Googlebot does.
Cap: maximum 40
Human-in-the-Loop Conversion
The primary conversion can only be completed through synchronous human contact (a phone call, WhatsApp, an in-person visit) with no digital alternative an agent can traverse.
Cap: maximum 40
Zero Schema
No structured data of any kind across the audited pages (no JSON-LD, microdata, or RDFa). A single Organization schema in the head is enough to clear it.
Cap: maximum 50
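As a rough illustration of the Zero Schema check, the sketch below scans raw HTML for the three structured-data formats named above using only the Python standard library. It is a presence heuristic under assumptions, not the audit's detector: it does not validate the markup, the RDFa test looks only for `typeof` or `vocab` attributes, and the function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Iterable, Set


class StructuredDataDetector(HTMLParser):
    """Records which structured-data formats appear in a page's HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.found: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        # Valueless attributes such as a bare `itemscope` arrive with value None.
        attr_map = {name: (value or "") for name, value in attrs}
        if tag == "script" and attr_map.get("type", "").strip().lower() == "application/ld+json":
            self.found.add("json-ld")
        if "itemscope" in attr_map or "itemtype" in attr_map:
            self.found.add("microdata")
        if "typeof" in attr_map or "vocab" in attr_map:
            self.found.add("rdfa")


def structured_data_formats(html: str) -> Set[str]:
    """Return the set of structured-data formats detected in one page."""
    detector = StructuredDataDetector()
    detector.feed(html)
    detector.close()
    return detector.found


def zero_schema(pages: Iterable[str]) -> bool:
    """True only if no audited page carries any structured data at all."""
    return all(not structured_data_formats(page) for page in pages)
```

Because the check is about presence across all audited pages, a single page with one JSON-LD block is enough for `zero_schema` to return `False`.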
What MAGNET doesn’t measure.
MAGNET measures agent-readiness specifically. It does not measure overall business health, marketing performance, or brand strength. A site can score 100 on MAGNET and still have a failing business: the score is necessary but not sufficient.
Paid media performance
Organic search ranking
Conversion rate optimization
Social media presence
Operational fundamentals
Brand sentiment
The performance side runs on its own method.
MAGNET measures the agent era. Growth, paid, and retention answer to a discipline that is just as explicit, surfaced on every service page rather than buried in a deck.
The honest horizon
Map the growth loops, model the math, isolate the binding constraint, then call the number you will actually hit, not the one in the pitch deck.
See the method
The retention curve
Engineer activation, engagement, and resurrection by state so the curve flattens onto a retained base instead of decaying toward zero.
See the method
MER over vanity ROAS
Read the whole funnel for leaks, judge spend on blended efficiency, and report on the number that maps to revenue instead of the one each platform claims for itself.
See the method
We score ourselves on the framework we sell.
nmow.ai’s current MAGNET score, with the per-dimension breakdown. Updated quarterly. The score is generated by the same audit infrastructure used in client engagements: no special treatment, no shortcuts, no aspirational rounding.
D6 Entity Authority is our weakest dimension: Wikipedia and Wikidata entries are pending. D7 Agentic Conversion is partial because Nmow’s primary conversion is consultation booking rather than direct transaction; we score against the booking flow’s agent-readiness, not against e-commerce checkout. (Scores shown are being refreshed under MAGNET v2.)
Common questions about the framework.
Why these seven dimensions and not others?
Each dimension addresses a discrete failure mode we’ve seen in production audits across MENA businesses. The framework was iterated against five test sites before launch: these seven were the categories that recurred. Earlier drafts tried four dimensions (too coarse) and eleven (overlapping). Seven is the resolution at which dimensions stay distinct without leaving meaningful gaps.
Why this specific weighting?
Weights reflect the relative impact each dimension has on whether an agent will cite or transact with a site. D1 Structured Data, D3 Content Extractability, and D6 Entity Authority tie for the most weight: structured data, extractable content, and resolvable entity identity are the strongest drivers of whether an agent cites you. D7 carries the least in the default profile because most businesses aren’t transactional, though it rises sharply for e-commerce. Profile-based weight shifts further calibrate to category-specific dynamics. The exact item-level rubric inside each dimension is proprietary to the audit deliverable.
Is MAGNET open-source?
The architecture (the seven dimensions, weighting, banding, and veto logic) is documented publicly on this page. The full 46-item rubric, the per-item scoring criteria, and the automated detection infrastructure are proprietary to the Nmow audit. The reason: the value isn’t in the framework existing, it’s in the rigor of how each item is scored. Publishing the rubric without the scoring discipline would invite the kind of "I read the checklist" surface-level work that motivated us to build a real methodology in the first place.
How often is the framework updated?
Quarterly minor versions; annual major versions. Minor versions adjust item weights and add new items as the agentic landscape evolves. Major versions can change dimension structure if needed: we expect at most one major version every 18 months. The current version is v2.0.
Can I score myself against MAGNET without buying an audit?
You can use the public architecture on this page to estimate where you sit. Reading through the seven dimensions and asking "are we strong here? weak here?" usually places teams within a band. But a real score requires the rubric and the detection infrastructure, both proprietary. The closest you can get without us is: identify which dimensions you’re confident on and which you’re not, and use that as your priority list.
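One self-check that needs no rubric is the Crawler Blocking veto described on this page. The sketch below, which uses Python's standard `urllib.robotparser`, reports which of the three crawlers named there are disallowed by a robots.txt file. It is a partial check under assumptions: it reads robots.txt rules only, so it cannot see blocks imposed by a WAF or by IP filtering, and the function name is hypothetical.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# User-agent tokens for the crawlers named in the Crawler Blocking veto.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_ai_crawlers(robots_txt: str, path: str = "/") -> List[str]:
    """Return the AI crawlers that robots.txt forbids from fetching `path`."""
    parser = RobotFileParser()
    # Parse the rules from text, so the check works on a saved copy of the file.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, path)]
```

Paste the contents of your own robots.txt into `blocked_ai_crawlers`; an empty list means the file itself is not the blocker, though it says nothing about network-level blocking.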
Why is this MENA-calibrated specifically?
Generic GEO frameworks were built against English-language, US/EU-default sites. MAGNET was built against MENA-specific failure modes: bilingual coherence between Arabic and English, regional payment rail support (Mada, STC Pay, Tabby, Tamara) at the markup level, regulated-vertical content disclosure norms in Saudi Arabia and the UAE, geographic IP-blocking that catches LLM crawler infrastructure, and Arabic Wikipedia entity scarcity. None of those score in a generic framework. All of them score here.
See where your site sits on MAGNET.
The Agent-Readiness Audit is a four-week diagnostic that scores your site against the full framework, with a prioritized remediation plan.