Microsoft Clarity now reports bots that ignore Robots.txt


Microsoft Clarity now displays bot requests that violate a website’s robots.txt rules in the tool’s Bot Analytics dashboard, the company announced in a blog post.

Clarity calculates and displays these requests as a percentage of total bot activity over a given period. The view adds to the existing AI visibility tools in the dashboard, which in May began showing the anchor queries behind AI citations.

What the Violations view shows

When a bot sends a request to a Clarity-connected website, the tool now checks that request against the site’s robots.txt directives to determine if the path was denied.

Denied bot requests are then calculated and displayed as a percentage of total bot activity over a given period.
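
The check and the percentage described above can be approximated offline. The following is a minimal sketch, not Clarity's implementation: it uses Python's standard `urllib.robotparser`, and the robots.txt rules, bot names, and request paths are all made-up examples.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /checkout/
"""


def is_denied(parser, user_agent, path):
    """Return True when robots.txt disallows this path for this user agent."""
    return not parser.can_fetch(user_agent, path)


def violation_share(parser, requests):
    """Percentage of bot requests that targeted a disallowed path."""
    if not requests:
        return 0.0
    denied = sum(1 for agent, path in requests if is_denied(parser, agent, path))
    return 100.0 * denied / len(requests)


parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Made-up sample of (user agent, requested path) pairs.
sample_requests = [
    ("ExampleBot", "/blog/post-1"),
    ("ExampleBot", "/private/report"),
    ("OtherBot", "/checkout/cart"),
    ("OtherBot", "/"),
]

share = violation_share(parser, sample_requests)
print(f"{share:.1f}% of bot requests hit disallowed paths")
```

In this sample, two of the four requests target disallowed paths, so the share is 50%.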

Site owners can filter the displayed bot requests by bot operator, bot name, activity type, and requested URLs and paths, which makes it possible to compare the patterns of crawlers known to follow the rules with those that do not.

A side-by-side view sets crawlers generally considered compliant against those showing violations.

How to turn it on

The feature is not enabled automatically. A site’s project administrator must turn it on in the AI Visibility section of Project Settings, particularly for sites using a supported CDN.

Supported CDNs include Fastly, Amazon CloudFront, Cloudflare, Azure Front Door and Akamai. WordPress sites using the latest Microsoft Clarity plugin are also supported.

Why it matters

With concerns about AI crawlers eating up server resources and skewing analytics, being able to see this activity is important.

And since Clarity is free, it offers a no-cost way to check whether bots follow these rules. It only tells you that the requests happened, not why.

This data only covers requests that reached paths prohibited by a site’s robots.txt file. Robots.txt is advisory and doesn’t block anything, so Clarity records successful requests rather than those it stopped.

The move also recognizes that the manual alternative does not scale: combing server logs for bot requests and testing each URL against the robots.txt file by hand to identify unauthorized requests. Clarity now counts the crawler requests that violate a site’s rules automatically.
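
For a sense of what the manual route involves, here is a rough sketch that parses access-log lines and tallies disallowed requests per user agent. It assumes logs in the common Combined Log Format; the log lines, robots.txt rules, bot names, and the `count_violations` helper are hypothetical, and real logs would also need bot identification that this sketch skips.

```python
import re
from collections import Counter
from urllib.robotparser import RobotFileParser

# Hypothetical rules and log lines, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""

LOG_LINES = [
    '203.0.113.5 - - [01/Jan/2025:10:00:00 +0000] "GET /admin/login HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Jan/2025:10:00:05 +0000] "GET /articles/1 HTTP/1.1" 200 2048 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [01/Jan/2025:10:01:00 +0000] "GET /admin/users HTTP/1.1" 200 128 "-" "OtherBot/2.1"',
    "malformed line",
]

# Combined Log Format tail: "request line" status size "referrer" "user agent".
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_violations(lines, parser):
    """Count requests to disallowed paths, keyed by user-agent string."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # skip lines that do not fit the assumed log format
        agent, path = match["agent"], match["path"]
        if not parser.can_fetch(agent, path):
            counts[agent] += 1
    return counts


parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

violations = count_violations(LOG_LINES, parser)
for agent, total in violations.most_common():
    print(f"{agent}: {total} disallowed request(s)")
```

Even this small version has to handle log formats, malformed lines, and user-agent matching, which is the kind of upkeep an automated dashboard takes over.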

Looking to the future

Websites now have a more accurate, automated way to assess how well their robots.txt rules are being followed.

The big question is whether making this behavior easier to measure will change crawler behavior or whether it will just help site owners keep clearer track of what’s going on.


