Only crawl canonical URLs in Site Audit
closed
Andrey Kirillov
Hi Luke,
Thank you for the request. Unfortunately, Site Audit can't know whether a page is canonical until it has crawled it, so there is no easy way to exclude all non-canonical pages from a crawl. However, you can use a few options under Crawl Settings to get close to your goal:
- Switch off the option called "Follow links on non-canonical pages". Site Audit will then not crawl links found on non-canonical pages.
- Switch on the option called "Remove URL parameters". Site Audit will then ignore all pages with parameters.
- Use Include/Exclude rules to limit the crawl to the URL patterns you want.
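
To illustrate why a page has to be fetched before its canonical status is known, here is a minimal Python sketch. It is not how Site Audit works internally; the class `CanonicalParser` and the function `is_canonical` are hypothetical names, and treating a page with no canonical tag as self-canonical is an assumption made for this example. The canonical link lives in the page's HTML, so the check needs the downloaded markup:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonical = attr["href"]


def is_canonical(page_url, html):
    """Return True if the page declares itself (or nothing) as canonical."""
    parser = CanonicalParser()
    parser.feed(html)
    if parser.canonical is None:
        # Assumption: no canonical tag means the page is self-canonical.
        return True
    # Resolve relative hrefs and ignore a trailing slash when comparing.
    target = urljoin(page_url, parser.canonical)
    return target.rstrip("/") == page_url.rstrip("/")


html = '<html><head><link rel="canonical" href="https://example.com/shoes"></head></html>'
print(is_canonical("https://example.com/shoes", html))
print(is_canonical("https://example.com/shoes?color=red", html))
```

The second URL only turns out to be non-canonical after its HTML has been read, which is why the settings above work around the problem rather than filter up front.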
Also, if you have a sitemap that contains all canonical pages on your website, you can start the crawl from it and set Max depth to 0. This way Site Audit will only crawl the pages listed in the sitemap.
Let me know if this helps.
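
Before relying on the sitemap approach, it may be worth checking which URLs the sitemap actually lists. The following is a rough Python sketch, assuming a standard sitemaps.org `urlset` file; the function name `sitemap_urls` and the sample XML are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value found under <url> entries in a sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content used only for this example.
sample = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/shoes</loc></url>
</urlset>
"""

for url in sitemap_urls(sample):
    print(url)
```

If the printed list contains parameterized or duplicate URLs, the sitemap is probably not limited to canonical pages, and a crawl with Max depth set to 0 would still pick them up.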