Technical SEO·12 min read

How to Do a Complete SEO Audit in 2026

An SEO audit is the difference between guessing what is wrong with your site and knowing exactly where to spend your time. This guide walks through the process we use on client sites: crawling, technical health checks, content evaluation, Core Web Vitals, internal linking analysis, and schema validation. No fluff, no theory without application.

Why most sites need an audit more than they need new content

There is a persistent belief in SEO that publishing more content is always the right next move. In our experience running SEO audits across industries, the opposite is usually true. Most sites are sitting on fixable problems that suppress the content they already have. Orphaned pages that no internal link points to. Title tags competing against each other for the same query. Redirect chains that bleed authority across three or four hops before landing anywhere useful. These are not obscure edge cases. They are the norm.

An audit gives you a clear inventory of what is broken, what is underperforming, and what is actually working. It replaces opinion with data. And because search engines keep changing how they evaluate pages, especially with AI-generated overviews and entity-based ranking, a site that ranked well eighteen months ago may have structural problems it did not have before. The goal is not to produce a 200-page PDF that nobody reads. The goal is to walk away with a prioritized list of actions that will move organic traffic within the next quarter.

Setting up your audit toolkit

You do not need a dozen paid subscriptions to run a meaningful audit. The essential stack is smaller than most people assume, and the free tools from Google and Microsoft cover more ground than they get credit for.

Start with Google Search Console. This is non-negotiable. It is the only place you will get first-party data on how Google sees your site: which queries you appear for, which pages are indexed, which pages have errors, and how your Core Web Vitals perform in the field. If you do not have it verified, stop here and set it up. Every other step depends on it.

Add Bing Webmaster Tools next. Bing powers a meaningful share of search traffic, particularly in enterprise environments where Edge is the default browser, and its backlink data is surprisingly useful as a cross-reference. It also feeds data to several AI search surfaces, so understanding how Bing indexes your site matters more in 2026 than it did two years ago.

For behavior data, install Microsoft Clarity. It is free, has no traffic limits, and gives you session recordings and heatmaps. During an audit, Clarity answers questions that analytics alone cannot: are users actually scrolling to your CTAs? Are they rage-clicking on something that is not a link? Is your above-the-fold layout causing confusion on mobile? This behavioral layer adds context that pure crawl data misses entirely.

For crawling, Screaming Frog remains the industry standard. The free version handles up to 500 URLs, which is enough for smaller sites. For anything larger, the paid license is worth it. Screaming Frog will surface broken links, redirect chains, missing meta tags, duplicate content, orphaned pages, and structured data errors in a single crawl. It is the closest thing to a Swiss army knife in technical SEO.

Google Lighthouse, built into Chrome DevTools, handles page-level performance audits. Run it on your five or ten most important pages to get a baseline for Core Web Vitals, accessibility, and best practices. Do not obsess over the score itself, but pay close attention to the specific diagnostics it flags: unused JavaScript, render-blocking resources, images without explicit dimensions.

You can also use our SEO score calculator to get a quick baseline reading before diving into the full audit. It will not replace a proper crawl, but it highlights the most obvious gaps upfront.

Running the crawl and reading the results

The crawl is the foundation of every audit. Configure Screaming Frog to respect your robots.txt (or not, depending on whether you suspect it is blocking important content), set the user agent to Googlebot, and let it run. On a site with a few thousand pages, this might take twenty minutes. On larger sites, you may want to run it overnight.

Once the crawl finishes, the first thing to check is the response code distribution. You want to see the vast majority of your URLs returning 200 status codes. A high number of 301 redirects is not inherently bad, but redirect chains (301 to 301 to 301) waste crawl budget and dilute link equity. Anything returning a 404 that should be live is an immediate fix. 5xx errors signal server-side problems that need escalation to your hosting or engineering team.
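A minimal sketch of that first pass, assuming the crawl export has been loaded into a list of dictionaries. The field names `url`, `status`, and `redirect_to` are hypothetical and will need mapping to whatever columns your crawler actually exports.

```python
from collections import Counter


def status_distribution(rows):
    """Count URLs per status class (2xx, 3xx, 4xx, 5xx)."""
    return Counter(f"{int(r['status']) // 100}xx" for r in rows)


def redirect_chains(rows, min_hops=2):
    """Follow redirect targets and return every chain with at least min_hops hops."""
    target = {r["url"]: r["redirect_to"] for r in rows if r.get("redirect_to")}
    chains = []
    for start in target:
        path, seen = [start], {start}
        while path[-1] in target:
            nxt = target[path[-1]]
            path.append(nxt)
            if nxt in seen:  # redirect loop: stop following
                break
            seen.add(nxt)
        if len(path) - 1 >= min_hops:
            chains.append(path)
    return chains


# Hypothetical sample rows standing in for a crawl export.
rows = [
    {"url": "/a", "status": "301", "redirect_to": "/b"},
    {"url": "/b", "status": "301", "redirect_to": "/c"},
    {"url": "/c", "status": "200", "redirect_to": ""},
    {"url": "/old", "status": "404", "redirect_to": ""},
]

print(status_distribution(rows))
print(redirect_chains(rows))
```

Each chain it returns is a candidate for collapsing into a single redirect from the first URL to the final destination.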

Next, look at the indexability report. Screaming Frog will flag pages that are noindexed, canonicalized to another URL, or blocked by robots.txt. Cross-reference this with Google Search Console's index coverage report. The overlap matters: if Search Console says a page is "Crawled, currently not indexed" and Screaming Frog shows it has thin content and zero internal links, you have your diagnosis. If Search Console says a page is excluded by "Alternate page with proper canonical tag" but you never set that canonical intentionally, you have a different kind of problem entirely.

Pay particular attention to the depth report. Pages buried four or five clicks from the homepage rarely rank well, not because Google refuses to crawl them, but because the lack of internal link equity signals low importance. If your most valuable content is deep in the site architecture, that is a structural issue worth fixing. For a more detailed look at crawl infrastructure, our technical SEO service covers how we approach this for sites with complex architectures.
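Click depth is simply the shortest path from the homepage through internal links, so it can be recomputed from any list of links. The sketch below assumes a hypothetical `links` dictionary mapping each page to the pages it links to, and uses the four-click cutoff mentioned above.

```python
from collections import deque


def click_depths(links, home="/"):
    """Breadth-first search from the homepage; depth is the minimum number of clicks."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link graph: page -> pages it links to.
links = {
    "/": ["/blog", "/services"],
    "/blog": ["/blog/page-2"],
    "/blog/page-2": ["/blog/page-3"],
    "/blog/page-3": ["/blog/old-guide"],
}

depths = click_depths(links)
deep_pages = sorted(url for url, depth in depths.items() if depth >= 4)
print(deep_pages)
```

Pages that never appear in `depths` are unreachable from the homepage by links alone, which is a separate problem from being deep.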

Evaluating technical health beyond the crawl

A crawl will catch the mechanical issues, but a proper technical SEO evaluation goes further. Start with your XML sitemap. Load it in a browser and compare the URLs it contains against what your crawl found. A sitemap should include every indexable page you want Google to know about and exclude everything else: paginated pages, filtered views, staging URLs, and anything you have noindexed. A mismatch between your sitemap and your actual indexable pages creates confusion for search engines.
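The sitemap comparison is a set difference in both directions. This sketch assumes a standard `urlset` sitemap (not a sitemap index) and a hypothetical set of indexable URLs taken from your crawl. The sample XML and URLs are illustrative only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Extract every <loc> value from a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text}


def compare(sitemap, indexable):
    """Report mismatches between the sitemap and the crawl's indexable URLs."""
    return {
        "in_sitemap_not_indexable": sorted(sitemap - indexable),
        "indexable_not_in_sitemap": sorted(indexable - sitemap),
    }


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
  <url><loc>https://example.com/blog?page=2</loc></url>
</urlset>"""

crawled_indexable = {
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/guides/audit",
}

report = compare(sitemap_urls(sample), crawled_indexable)
print(report)
```

The first list holds URLs to remove from the sitemap (or make indexable). The second holds pages to add.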

Check your robots.txt by visiting /robots.txt directly. Verify that it is not blocking CSS or JavaScript files that Googlebot needs to render your pages. This was a common problem years ago, and it still surfaces on sites that inherited legacy directives. Also confirm your sitemap URL is declared in robots.txt. It is a small thing, but it helps crawlers discover your sitemap without relying solely on Search Console submission.
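Both checks can be scripted with Python's standard library. The robots.txt content and asset URLs below are hypothetical. Note that `urllib.robotparser` applies rules in file order rather than Google's longest-match precedence, so treat the result as a first approximation and confirm anything surprising in Search Console.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a legacy rule that blocks JavaScript.
robots_txt = """\
User-agent: *
Disallow: /assets/js/
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Assets the page needs in order to render (illustrative URLs).
render_assets = [
    "https://example.com/assets/js/app.js",
    "https://example.com/assets/css/site.css",
]

blocked = [url for url in render_assets if not parser.can_fetch("Googlebot", url)]
sitemaps = parser.site_maps() or []  # site_maps() returns None when nothing is declared

print("Blocked render assets:", blocked)
print("Declared sitemaps:", sitemaps)
```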

HTTPS is table stakes at this point, but mixed content still shows up regularly. Look for pages served over HTTPS that load images, scripts, or stylesheets over HTTP. Browsers may block that content or show security warnings, and search engines treat it as a negative signal. Screaming Frog's "Insecure Content" tab makes these easy to find.
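For a spot check on a single page without opening the crawler, a small parser can list subresources requested over plain HTTP. This is a sketch that only looks at common tags in static HTML. It will not see resources injected by JavaScript or referenced inside CSS, and the sample markup is hypothetical.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collect subresources loaded over http:// from an HTTPS page's HTML."""

    SRC_TAGS = {"img", "script", "iframe", "source", "video", "audio"}

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.SRC_TAGS:
            url = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            url = attrs.get("href")
        else:
            return  # plain <a> links are not mixed content
        if url and url.lower().startswith("http://"):
            self.insecure.append((tag, url))


page_html = """
<link rel="stylesheet" href="https://example.com/site.css">
<img src="http://example.com/hero.jpg" alt="Hero">
<script src="http://cdn.example.com/widget.js"></script>
<a href="http://example.com/old-page">Old page</a>
"""

finder = MixedContentFinder()
finder.feed(page_html)
print(finder.insecure)
```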

Canonical tags deserve their own pass. Open Screaming Frog's canonicals report and look for self-referencing canonicals (which are correct), canonicals pointing to different URLs (which may or may not be intentional), and pages with no canonical at all. The most dangerous pattern is when two pages canonicalize to each other, creating a loop. Google will pick one, and it may not be the one you intended.
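The four cases can be sorted mechanically once you have a mapping of each URL to its declared canonical. The mapping below is hypothetical, and the sketch only detects two-page loops, which is the pattern described above.

```python
def classify_canonicals(canonicals):
    """Label each URL as self, other, loop, or missing based on its canonical."""
    report = {}
    for url, canonical in canonicals.items():
        if not canonical:
            report[url] = "missing"
        elif canonical == url:
            report[url] = "self"
        elif canonicals.get(canonical) == url:
            report[url] = "loop"  # the two pages point at each other
        else:
            report[url] = "other"  # review: may or may not be intentional
    return report


# Hypothetical URL -> canonical mapping from a crawl.
canonicals = {
    "/a": "/a",
    "/b": "/c",
    "/c": "/b",
    "/d": "/a",
    "/e": None,
}

print(classify_canonicals(canonicals))
```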

Structured data validation is the last piece of the technical puzzle. Use Search Console's Rich Results report to see which schemas Google has recognized and which have errors. Common issues include missing required fields, incorrect date formats, and schema types that do not match the actual page content. If you are running an ecommerce site, review your Product schema. If you are publishing articles, verify your Article schema has author, datePublished, and dateModified fields populated correctly.
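As a rough pre-check before relying on Search Console, the Article fields named above can be verified directly against a page's JSON-LD. This sketch assumes a single JSON-LD object rather than a list or `@graph`, and uses `datetime.fromisoformat` as an approximation of ISO 8601 validation. The function name and sample data are hypothetical.

```python
import json
from datetime import datetime

# The fields this article recommends checking, not an official required list.
FIELDS_TO_CHECK = ("author", "datePublished", "dateModified")


def article_issues(jsonld_text):
    """Return a list of problems found in a single Article JSON-LD object."""
    data = json.loads(jsonld_text)
    if data.get("@type") != "Article":
        return ["not an Article"]
    issues = [f"missing {field}" for field in FIELDS_TO_CHECK if not data.get(field)]
    for field in ("datePublished", "dateModified"):
        value = data.get(field)
        if value:
            try:
                datetime.fromisoformat(value)
            except (TypeError, ValueError):
                issues.append(f"{field} is not ISO 8601: {value}")
    return issues


good = json.dumps({
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "datePublished": "2026-01-15",
    "dateModified": "2026-02-01T09:30:00+00:00",
})
bad = json.dumps({
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    "datePublished": "15/01/2026",
})

print(article_issues(good))
print(article_issues(bad))
```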

Core Web Vitals: where performance meets rankings

Core Web Vitals stopped being optional the moment Google made them a ranking signal, and they remain part of how Google evaluates page experience. The three metrics that matter are Largest Contentful Paint (LCP), Interaction to Next Paint (INP), and Cumulative Layout Shift (CLS). If any of these fall outside the "good" threshold on the majority of your pages, you are leaving rankings on the table.

LCP measures how quickly the largest visible element loads. The target is under 2.5 seconds. The most common culprits behind a slow LCP are unoptimized hero images, slow server response times (TTFB above 800ms), and render-blocking JavaScript that delays the main content paint. Run Lighthouse on your key landing pages and check the LCP element it identifies. Sometimes it is an image you can optimize. Sometimes it is a font file loaded from a third-party CDN. Sometimes it is a layout component that depends on JavaScript to render, which means the browser cannot paint it until the script executes.

INP replaced First Input Delay in March 2024 and measures responsiveness more comprehensively. Where FID only measured the input delay of the first interaction, INP observes every interaction throughout the page lifecycle and reports one of the slowest (on pages with many interactions, the most extreme outliers are ignored). If your pages have heavy JavaScript, complex event handlers, or third-party scripts that block the main thread, INP will suffer. The threshold for "good" is 200 milliseconds or less.

CLS measures visual stability. A CLS score above 0.1 means elements are shifting around as the page loads, which frustrates users and signals a poor experience. The usual causes are images without explicit width and height attributes, ads or embeds that inject content after initial render, and web fonts that swap in late and change text dimensions. Use our Core Web Vitals calculator to assess where your pages currently stand and identify which metric needs the most attention.

Field data from Search Console's Core Web Vitals report matters more than lab data from Lighthouse, because field data reflects real user experiences across devices and network conditions. However, lab data is what you use to diagnose and fix issues. Use both: field data to identify which pages fail, lab data to understand why.
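To make the thresholds concrete, here is a small sketch that rates a page from raw field samples. It uses the published "good" and "poor" boundaries for each metric and evaluates the 75th percentile, which is the percentile Google uses for field assessment. The sample values and function names are hypothetical.

```python
import math

# (good upper bound, poor lower bound) for each metric.
THRESHOLDS = {
    "lcp_ms": (2500, 4000),
    "inp_ms": (200, 500),
    "cls": (0.1, 0.25),
}


def p75(values):
    """75th percentile using the nearest-rank method."""
    ordered = sorted(values)
    return ordered[math.ceil(0.75 * len(ordered)) - 1]


def rate(metric, values):
    """Classify a metric as good, needs improvement, or poor at the 75th percentile."""
    good, poor = THRESHOLDS[metric]
    value = p75(values)
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"


# Hypothetical field samples for one page.
samples = {
    "lcp_ms": [1800, 2100, 2400, 3900],
    "inp_ms": [120, 180, 250, 600],
    "cls": [0.02, 0.3, 0.31, 0.4],
}

for metric, values in samples.items():
    print(metric, rate(metric, values))
```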

Auditing content quality and coverage

Technical health gets your pages crawled and indexed. Content quality determines where they rank. The content portion of an audit evaluates whether your pages deserve the rankings you are targeting, and whether your overall content strategy has gaps that competitors are exploiting.

Pull a list of all indexed pages from Screaming Frog, along with their word count, title tags, meta descriptions, and H1 tags. Sort by word count and look at the bottom of the list. Pages with fewer than 300 words of unique content are thin content candidates. They may be legitimate (a contact page does not need 2,000 words), but many will be tag pages, author archives, or old landing pages that add no value. These pages need to be consolidated, expanded, or noindexed.

Title tag and H1 duplication is more common than people expect. When two pages compete for the same keyword with nearly identical titles, they cannibalize each other. Neither ranks as well as a single, consolidated page would. Screaming Frog's duplicate detection makes this easy to spot. The fix is usually to merge the weaker page into the stronger one and redirect.
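If you want to reproduce the duplicate check outside the crawler, grouping pages by a normalized title is enough to surface the obvious cases. The sketch below normalizes only case and whitespace, so near-duplicates with different wording will not be caught. The page list is hypothetical.

```python
from collections import defaultdict


def duplicate_titles(pages):
    """Group URLs that share the same title after case and whitespace normalization."""
    groups = defaultdict(list)
    for url, title in pages:
        key = " ".join(title.lower().split())
        groups[key].append(url)
    return {title: urls for title, urls in groups.items() if len(urls) > 1}


# Hypothetical (url, title) pairs from a crawl export.
pages = [
    ("/seo-audit", "SEO Audit Guide"),
    ("/blog/seo-audit-guide", "SEO  audit guide "),
    ("/pricing", "Pricing"),
]

print(duplicate_titles(pages))
```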

Search Console's Performance report reveals which pages get impressions but low clicks. A page with thousands of impressions and a click-through rate below two percent has a positioning or messaging problem. The title and description may not match the search intent, or the page may be ranking on page two and needs a content refresh to break through. Filter by query to understand what users actually searched for when they saw your page. If the queries do not align with your content, you have an intent mismatch that no amount of on-page tweaking will fix. You need to restructure the page around what users actually want.
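The same filter can be applied to an exported Performance report. This sketch uses the two percent cutoff from above and assumes 1,000 impressions as the floor for "thousands of impressions". Both are parameters to tune, and the field names and sample rows are hypothetical.

```python
def low_ctr_pages(rows, min_impressions=1000, max_ctr=0.02):
    """Return (page, ctr) pairs with enough impressions and a CTR under the cutoff."""
    flagged = []
    for row in rows:
        impressions = row["impressions"]
        if impressions < min_impressions:
            continue  # too little data to judge
        ctr = row["clicks"] / impressions
        if ctr < max_ctr:
            flagged.append((row["page"], round(ctr, 4)))
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical rows from a Performance report export.
rows = [
    {"page": "/guide", "clicks": 30, "impressions": 5000},
    {"page": "/pricing", "clicks": 400, "impressions": 4000},
    {"page": "/new", "clicks": 1, "impressions": 50},
    {"page": "/tools", "clicks": 45, "impressions": 3000},
]

flagged_pages = [page for page, _ in low_ctr_pages(rows)]
print(flagged_pages)
```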

A thorough content strategy review also looks at topical coverage. Map your existing content against the topics your audience cares about. Where are the gaps? If you sell project management software but have no content about team collaboration, resource planning, or workflow automation, you are ceding those topics to competitors. A content audit should produce a clear list of topics to create, pages to consolidate, and content to retire.

Internal linking: the most underrated part of any audit

If content quality determines your ceiling, internal linking determines how much of that ceiling you actually reach. Internal links distribute authority across your site, help search engines understand topical relationships, and guide users toward conversion. Most sites do internal linking poorly, and the audit is where you find out how poorly.

Screaming Frog's inlinks report shows how many internal links point to each page. Your most important pages, the ones you want to rank for competitive terms, should have the most internal links. In practice, this is rarely the case. The homepage and navigation pages get linked heavily by default, while the high-value content pages you spent weeks creating sit with three or four internal links. That is a distribution problem you can fix without writing a single new piece of content.

Orphaned pages are the extreme version of this problem. These are pages that exist on your site but receive zero internal links. Search engines can still find them through the sitemap, but the lack of internal links signals that even your own site does not consider them important. Either link to them from relevant pages or remove them.
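Both the inlink distribution and the orphan list fall out of the same data: a list of internal links plus the full set of URLs you know about (for example, from the sitemap). The sketch below assumes hypothetical `(source, target)` pairs and treats the homepage as exempt from the orphan check.

```python
from collections import Counter


def inlink_counts(edges):
    """Count internal links pointing at each page, ignoring self-links."""
    return Counter(target for source, target in edges if source != target)


def orphans(known_urls, edges, home="/"):
    """Known URLs that receive zero internal links (homepage excluded)."""
    counts = inlink_counts(edges)
    return sorted(url for url in known_urls if counts[url] == 0 and url != home)


# Hypothetical internal links as (source, target) pairs.
edges = [
    ("/", "/blog"),
    ("/", "/services"),
    ("/blog", "/services"),
    ("/blog", "/blog/audit-guide"),
]
# Hypothetical full URL list, for example from the XML sitemap.
known_urls = {"/", "/blog", "/services", "/blog/audit-guide", "/landing/2023-promo"}

print(inlink_counts(edges).most_common())
print(orphans(known_urls, edges))
```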

Anchor text matters more for internal links than most people realize. Generic anchors like "click here" or "learn more" waste an opportunity to reinforce topical relevance. Descriptive anchor text that includes target keywords (without being spammy about it) helps search engines understand what the target page is about. Review your top landing pages and check what anchor text other pages use when linking to them. If the anchors are all generic, that is a quick win.
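One quick way to quantify this for a landing page is the share of its inbound anchors that are generic. The phrase list below is an assumption to extend for your own site, and the anchors are hypothetical.

```python
# Assumed list of generic anchor phrases; extend as needed.
GENERIC_ANCHORS = {"click here", "learn more", "read more", "here", "this page"}


def generic_anchor_share(anchors):
    """Fraction of anchors that match a generic phrase (case-insensitive)."""
    if not anchors:
        return 0.0
    generic = sum(1 for anchor in anchors if anchor.strip().lower() in GENERIC_ANCHORS)
    return generic / len(anchors)


# Hypothetical anchors pointing at one landing page.
anchors = ["Learn more", "click here", "technical SEO audit checklist", "Read more"]

print(generic_anchor_share(anchors))
```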

Broken internal links are pure waste. Every broken link is a dead end for both users and crawlers. Screaming Frog flags these immediately. Fix them by updating the href to the correct destination or removing the link if the target page no longer exists. On larger sites, broken internal links tend to accumulate after migrations, URL changes, and content deletions. A quarterly sweep keeps them under control.

Schema markup gaps and structured data validation

Structured data has moved from a nice-to-have to a baseline requirement. It is not a direct ranking factor, but it determines whether your content is eligible for rich results, and it makes your pages easier for search engines to interpret for AI-generated summaries and knowledge panels. An audit that skips schema review is an incomplete audit.

Start with what you have. Screaming Frog's structured data tab shows which pages contain JSON-LD, microdata, or RDFa markup, and which schema types are deployed. Cross-reference this with Search Console's Enhancements section, which reports on the rich result types Google has detected on your site, such as Product, FAQ, and Breadcrumb. Any errors or warnings flagged here need to be addressed, because invalid schema is worse than no schema. It can prevent your pages from appearing in rich results entirely.

The more interesting question is what schema you are missing. If you publish articles, every article page should have Article schema with headline, author, datePublished, and dateModified. If you have a FAQ on any page, FAQPage schema describes those Q&A pairs in machine-readable form, although since 2023 Google has limited FAQ rich results to well-known government and health sites, so do not count on the rich result itself. If your site has a physical location, LocalBusiness schema should be on your contact or location pages. BreadcrumbList schema improves how your URLs appear in search results and helps search engines understand your site hierarchy.

For sites focused on AI search optimization, structured data plays a particularly important role. AI systems that generate search summaries rely heavily on structured data to identify entities, relationships, and factual claims. Sites with robust schema markup are more likely to be cited in AI-generated answers because the data is easier for machines to parse and trust.

Using AI to accelerate the audit process

A manual audit produces the most thorough results, but AI tools can meaningfully accelerate parts of the process. The crawl itself is still mechanical, but the analysis layer benefits from AI's ability to process large datasets and identify patterns that a human reviewing a spreadsheet might miss.

Claude, for example, is genuinely useful for analyzing crawl exports. Export your Screaming Frog data as a CSV, feed it to Claude, and ask it to identify patterns: which URL directories have the highest error rates, which content clusters have the thinnest pages, where redirect chains are concentrated. Claude Code can even automate the generation of redirect maps and meta tag recommendations based on crawl data. This does not replace the judgment calls an experienced SEO makes, but it compresses the data processing phase from hours to minutes.
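It helps to be able to sanity-check whatever patterns an AI tool reports. As one example, error rate per top-level directory is a few lines of Python. The `(url, status)` rows below are hypothetical, and URLs at the site root are grouped under `/`.

```python
from collections import defaultdict
from urllib.parse import urlparse


def error_rate_by_directory(rows):
    """Share of URLs returning 4xx or 5xx, grouped by top-level directory."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for url, status in rows:
        segments = [s for s in urlparse(url).path.split("/") if s]
        directory = f"/{segments[0]}/" if len(segments) > 1 else "/"
        totals[directory] += 1
        if status >= 400:
            errors[directory] += 1
    return {directory: errors[directory] / totals[directory] for directory in totals}


# Hypothetical (url, status code) pairs from a crawl export.
crawl_rows = [
    ("https://example.com/blog/a", 200),
    ("https://example.com/blog/b", 404),
    ("https://example.com/shop/x", 200),
    ("https://example.com/shop/y", 200),
    ("https://example.com/about", 200),
]

print(error_rate_by_directory(crawl_rows))
```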

Where AI falls short is context. An AI can tell you that a page has a 404 error, but it cannot tell you whether that page drove revenue last quarter or whether the URL was shared in a viral LinkedIn post that still sends referral traffic. That contextual layer, connecting technical findings to business impact, still requires human judgment. The best audit workflow uses AI for data processing and pattern recognition, then applies human expertise for prioritization and strategic recommendations.

Turning findings into a prioritized action plan

The audit itself is only valuable if it leads to action. Every audit we run produces a single prioritized document that organizes findings by impact and effort. Not everything can be fixed at once, and trying to fix everything simultaneously usually means nothing gets done well.

The highest priority fixes are anything blocking indexation of important pages. A noindex tag accidentally deployed to your product pages during a staging push, a canonical tag pointing your highest-traffic page to a redirect, a robots.txt rule blocking an entire subdirectory. These are fires. They cost you traffic every day they remain unfixed, and the fix is usually simple.

Next come Core Web Vitals failures on high-traffic pages. If your top ten landing pages fail INP or LCP in field data, fixing those pages will have an outsized impact on your overall search performance. The effort varies: sometimes it is as simple as lazy-loading offscreen images, and sometimes it requires rewriting how third-party scripts load. Either way, the return is measurable and often fast.

After that, internal linking improvements and content consolidation typically deliver the best returns. These are medium-effort tasks that compound over time. Adding ten internal links to an underlinked page will not double its traffic overnight, but across dozens of pages, the cumulative effect on authority distribution is significant.

Lower priority but still valuable: meta tag optimization, schema additions, image alt text, and accessibility improvements. These are the kinds of tasks that are easy to batch and hand off to a junior team member or automate with a script. They matter, but they matter less than the items above.

Set clear timelines. Critical fixes within one week. High-impact improvements within thirty days. Medium-priority work within a quarter. Then schedule a follow-up crawl to verify the fixes are live and measure the impact. An audit without a follow-up is just an expensive report.

Frequently asked questions about SEO audits

How long does a full SEO audit take?

A thorough audit for a small to medium site (under 5,000 pages) typically takes 8 to 15 hours spread across a few days. Larger enterprise sites can take several weeks. The crawl alone may run for hours on sites with tens of thousands of URLs. The analysis and recommendation phase usually takes longer than the data collection.

How often should I perform an SEO audit?

Run a comprehensive audit every six months, with lighter technical checks monthly. After any major site change (a CMS migration, a redesign, a URL restructure) or a significant algorithm update, run a focused audit on the affected areas immediately. Continuous monitoring through Search Console catches many issues between formal audits.

What is the most common issue found during audits?

Thin or duplicate content, without question. Most sites have pages with near-identical content, auto-generated tag or category pages with minimal unique value, or legacy pages that were never consolidated. After that, missing or misconfigured canonical tags and broken internal links are the next most frequent findings.

Can I do an SEO audit myself without expensive tools?

Yes. Google Search Console, Bing Webmaster Tools, Lighthouse, Microsoft Clarity, and Screaming Frog's free tier cover the majority of what you need. You will miss some competitive intelligence without paid tools, but the technical and on-page audit can be done well with free resources alone.

What should I prioritize after completing an audit?

Anything blocking indexation comes first: noindex tags on important pages, broken canonical chains, crawl errors. Then address Core Web Vitals failures. Content consolidation and internal linking improvements typically deliver the highest return on effort after technical fixes are in place.
