
SEMrush Toolbox: Site Audit, Technical SEO




Ross Tavendale: Hello everyone and welcome to another SEMrush webinar. Today we are joined by a very special guest. I personally have never heard of him, but apparently he's a big deal. It's none other than Mr. Craig Campbell. Craig, how you doing?

Craig Campbell: All good. All good.

Ross Tavendale: What have you been up to today? Anything interesting?

Craig Campbell: Hosted an event in Glasgow. Had some good speakers. Great event, just trying to push the name out there in front of a local audience, because no one knows who I am up here. They're like, "Who? What?" 

Ross Tavendale: No, that's not true, you're definitely well known... We're here to talk about the Site Audit tool. I'm going to take people through it and show them how it works, but for everyone who is interested, I've also done a training session on the audit tool which is available at the SEMrush Academy.

Craig, after that, I believe some of the audience have submitted their own businesses and you're going to live audit them on the spot. Is that right?

Craig Campbell: Yes. There's a website out there called powerrubber.com; the guy behind it attends these SEMrush sessions all the time, so when I was scrolling through the list of submissions I recognized the name. I always see Power Rubber talking in the YouTube chat and stuff.

Using the Site Audit Tool in SEMrush 

Ross Tavendale: All righty, so I'm now sharing the screen. This is a website called The Children's Furniture Company, and this is the dashboard that you see as soon as you log in to SEMrush. Numbers like these are a bit scary: 77,000 warnings, 3,000 errors and 2,500 notices.

Craig, see when you look at this, what's the first thing that jumps out at you and scares you a little bit here?

Craig Campbell: For me, errors are always the first thing that I'm looking at, but HTTPS only working across 69% of the website is a little worrying as well.

I'm not sure which of those I would tackle first. They're probably equally important, but I would start from the ground up and go for the HTTPS and resolve that, which should normally be a very quick fix, to be fair.

Ross Tavendale: Let's go into the errors and I'll show you exactly what's going on. We'll go across the top tab so you can get an idea of what's going on here. They've got a bunch of problems with broken JavaScript and CSS, that feels like something that you'd write a Jira ticket for to give to a developer. 

One of the things Craig mentioned is the HTTPS problem, specifically mixed content. When it comes to mixed content, Craig, give us an overview of what it is and why you need to fix it.

HTTPS URL Issues and Mixed Content

Craig Campbell: Previously, and I can't remember the dates...but obviously Google have come out and said that every website should have an SSL certificate. Previously people had websites built when it was HTTP, and now it's HTTPS, so what we've got now is people getting their SSL certificates installed, but all the links and stuff like that on their website are not all HTTPS, they're HTTP. 

Essentially you're giving two different variations of a URL to Google, which you don't want to do, you want to make sure that that HTTP forwards on to the HTTPS. It's broken links essentially, and as far as I'm concerned, then I'd want to make sure that there's no juice flowing out of my website and everything is just seamless right into the HTTPS version of my website.
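As a rough illustration of what the audit tool is checking for here, this Python sketch scans a page's HTML for resources still loaded over plain HTTP. The sample markup and URLs are hypothetical, and a real audit would also check CSS url() references:

```python
import re

def find_mixed_content(html: str) -> list[str]:
    """Return resource URLs (src attributes) still served over plain HTTP."""
    # Only resources (images, scripts, iframes) trigger mixed-content
    # warnings; plain <a href> links to HTTP pages do not.
    pattern = re.compile(r'src\s*=\s*["\'](http://[^"\']+)["\']', re.IGNORECASE)
    return pattern.findall(html)

page = (
    '<img src="http://example.com/logo.png">'
    '<script src="https://cdn.example.com/app.js"></script>'
)
insecure = find_mixed_content(page)  # only the plain-HTTP image is flagged
```

Running this across a crawl gives you the list of assets to update to HTTPS (or protocol-relative) references.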

Ross Tavendale: That's actually a big, easy win to do straight at the start. When we're running through it, we've got issues at the top, and then it takes you through all of the crawled pages on the site as well. 

This is particularly useful if you're looking to do some sort of content inventory. I've had to disconnect GA because this is real people's information, but you can connect up Google Analytics so you can start looking at pages according to issues and also the amount of page views you're getting. 

Because at the end of the day, look, if it's something that's driving you several thousand page views, but it’s chock-full of errors, then there's probably quite a bit of opportunity there for you to dig into.

Building a Website Structure that Improves SEO

Now when it comes to UX, that's always a big thing in our world; making sure that everything's properly siloed in the correct navigation, so that's where the site structure piece comes in really handy. 

This particular site is really flat, so one of the things you'd be saying is, "We probably want to start building this out into different content silos because there's absolutely nothing in here." They've got literally three top level and it's straight into their shop, which is pretty poor when it comes to keyword targeting. 

Craig, I know you do a lot of affiliate stuff in retailing and things like that, how would you typically structure navigations and how would you typically structure out someone's website?

Craig Campbell: If you're selling, say sports equipment for example, and I have a football section, I've got a rugby section, American football section or whatever, then I want the domain name/football/ and then the product, so that everything has a proper structure and it's relevant and all that kind of stuff. 

Because if not, and everything's all flat and you've got domainname.com/nikefootballboots, then domainname/rugbytopforscotland, or whatever the hell your URL's going to be, it's just going to be outrageously messy.

I think starting from the ground up, start with a good silo structure is the biggest place to do it. 
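The domain/category/product structure Craig describes can be sketched programmatically; this hypothetical Python helper builds siloed URLs from category and product names:

```python
import re

def slugify(text: str) -> str:
    """Lowercase and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def product_url(domain: str, category: str, product: str) -> str:
    """Build a siloed URL: https://domain/category/product."""
    return f"https://{domain}/{slugify(category)}/{slugify(product)}"

url = product_url("example.com", "Football", "Nike Football Boots")
# -> "https://example.com/football/nike-football-boots"
```

The point is that every product URL carries its category in the path, so relevance and hierarchy are visible to both users and crawlers.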

Ross Tavendale: The Statistics view is really nice if you're client-side, or if you're agency-side pitching a client, to run one of the audits and give them a quick overview of what's going on. Then you can start prioritizing things, like the fact that 99% of the pages have no schema markup.

For those who don't know, schema markup is a type of code that essentially describes the content and describes the entities inside of the HTML, so Google finds it much easier to pick up. It allows you then to get much more rich snippets in the search.
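As a minimal example of the kind of markup being discussed, here is a Product JSON-LD snippet built in Python. The product details are invented; see schema.org for the full vocabulary:

```python
import json

# Hypothetical product data for a children's furniture shop.
product_schema = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Kids' Oak Bunk Bed",
    "description": "Solid oak bunk bed for children.",
    "offers": {
        "@type": "Offer",
        "price": "499.00",
        "priceCurrency": "GBP",
        "availability": "https://schema.org/InStock",
    },
}

# This would be embedded in the page as:
# <script type="application/ld+json"> ... </script>
json_ld = json.dumps(product_schema, indent=2)
```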

Sitemap Pages and Indexed Pages in Google

This here is a rich snippet, which is generated from schema. Crawl depth is really important as well: if you've got a massive site with a huge, flat structure, it's going to be very hard for the bot to get into the very deep pages because there's no actual path for it to follow. 42% of pages only have one internal link; no wonder the site isn't ranking and isn't doing well.

Craig, do you ever look at the number of pages on the sitemap versus how many are indexed inside of Google?

Craig Campbell: That's quite a basic check for me, because obviously when you're outsourcing content for affiliate sites you've got to see whether those pages are actually being indexed and stuff.

Ross Tavendale: If you've got a website and you want to see how many pages are typically in the index, this is a really basic, basic, basic way of doing it, and there's a lot of SEOs screaming at their computer right now saying, "Ah, that's rubbish. That's not how that works." 

I know it's not; you need to go into Search Console to see this properly. But if you do a little site: command it will show you roughly how many pages are in the index, and if the number is wildly different, you probably want to look into why pages aren't being indexed.
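The sitemap-versus-index comparison starts with knowing how many URLs the sitemap actually lists; a quick stdlib-only Python sketch (with a made-up two-URL sitemap) counts them:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def count_sitemap_urls(sitemap_xml: str) -> int:
    """Count <loc> entries in a sitemap, to compare against site: results."""
    root = ET.fromstring(sitemap_xml)
    return len(root.findall("sm:url/sm:loc", SITEMAP_NS))

# Hypothetical sitemap content.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/shop</loc></url>
</urlset>"""
```

If the sitemap lists thousands of URLs but the site: command shows only a handful, that gap is the thing to investigate.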

Then the killer feature for me inside all of this is the compare crawl. What you can see here is the pages crawled, overall technical score, issues, errors and warnings. That relates back to the initial dashboard. 

When we move over, we can then see month-on-month as we are doing these changes. When I do the setup inside of the Site Audit tool, I like to have it run every single week.

The cool thing about that is as the amount of fixes you're doing increases, you get this progress report. Let me just zoom out of this. The progress report essentially takes you through all the errors, warnings and notices that you're battering through and taking down to zero, and then you can overlay your Google Analytics there on top of that as well.

Issues to Prioritize in a SEMrush Site Audit

All right, so just jumping into some of the issues. The way this is split up Craig, we've got a bunch of stuff about sitemaps and 404s and mixed content, so if I'm a beginner SEO person, typically you'd want to start where there are tons of issues, but that's probably not always the case. Is there a particular area that you would start on over others?

Craig Campbell: For me personally, I would be fixing the mixed content first. I see that as fairly important. I've seen a question pop in there as well: how do you fix HTTP? If you're on WordPress you can use a plugin, Really Simple SSL, which will force everything over to HTTPS. It really depends on the platform you're on. But that's a two-minute fix for me, and I feel it's the most important one to tackle first, over hreflang tags and whatever else.

When I'm looking at a website I want to fix things at a foundational level first, and then worry about things like international targeting and stuff. I want to start at the bottom, get all the foundations accurate, the website's working, it's loading, there are no JavaScript problems, so I'd probably go into the JavaScript stuff next. 

Again, people go, "Oh why? Why are you not going for the broken links or whatever else is on there?" The JavaScript thing, I wouldn't be fixing it, I will just delegate that task to a developer. 

Between the mixed content issue and the JavaScript, it's like two minutes worth of work for me, and then I would start to tackle everything else. It's not as if I would be fixing something and then waiting three weeks, and going back and fixing the broken links, that's just the way I see it.

Auditing Internal and External Links

Ross Tavendale: The next big thing I really need to bring people's attention to is internal linking. For me, this is such a big deal. One of the things I recommend is to go and look at all of the different pages and the external links pointing to them, and work out where all of the authority is held.

Maybe you wrote a blog post, it's blown up, maybe you’ve done a press release at some point that's had a bunch of links pointing to it, we want to see where that is so we can start redistributing some of that internal link equity.

In terms of this stuff, so broken links, yeah, just fix them. Do that before 404s because the broken links cause the 404s. There's no point in redirecting these if you can just update it and have it fixed. Broken external links, what do you think about that one Craig? I'm not as bothered about these, especially if it’s such a small amount.

Craig Campbell: Broken external links is secondary as far as I'm concerned. As you say, the broken internal links is something that's massive, so I would come back to it. 

Ross Tavendale: Absolutely. Anything in the notices section, so interestingly, if it's orphaned in the sitemap, that's a massive issue so that needs fixing very quickly. 

Permanent redirects, again, this isn't a huge issue. If you've fixed everything and it's all perfect, but your rankings aren't moving, maybe have a look and see that you haven't redirected something in a strange way. 

The Pareto Principle for On-Site SEO 

This bit down here has always been fascinating to me. On this particular site, 95% of the links are pointing to 30% of the pages. I typically go by an 80/20 rule, so 80% of links pointing to the 20% of money pages that matter to you. Do you have any kind of structured rules for doing that, Craig, or do you do it by keyword? How do you do the internal linking bit?

Craig Campbell: I don't really do 80/20 as such; I think that kind of ratio is roughly the right figure. Looking across a lot of my affiliate websites, there are probably 20 or 30 money pages on each, so what I would say to anyone listening is: if, say, Ross Tavendale says "80/20", don't take it as gospel and stop at 20.

If you were doing it in terms of 100 pages, 80/20 would suggest that you've got 20 money pages, but if I had 30 or 35 really good money pages, then that ratio would change slightly from Ross's. I think 80/20 is a good guide to use because, let's be realistic, even on my own blog, which has got hundreds of blog posts, 20 of them probably still get read, if that.
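The 80/20 idea can be checked against real crawl data; this Python sketch (the pages and link counts are invented) measures what share of internal links point at the top fraction of pages:

```python
def share_of_links_to_top_pages(link_counts, top_fraction=0.2):
    """Fraction of internal links pointing at the top `top_fraction` of
    pages, ranked by how many internal links each page receives."""
    counts = sorted(link_counts.values(), reverse=True)
    top_n = max(1, round(len(counts) * top_fraction))
    total = sum(counts)
    return sum(counts[:top_n]) / total if total else 0.0

# Hypothetical crawl: page -> number of internal links pointing at it.
crawl = {"/": 40, "/shop": 25, "/about": 5, "/blog/a": 3, "/blog/b": 2}
share = share_of_links_to_top_pages(crawl)  # top 1 of 5 pages holds 40/75 of links
```

Comparing this figure against where your money pages actually sit tells you whether link equity is flowing to the pages that earn.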

Ross Tavendale: Also, if you connect GA, you can then start seeing if you're actually getting real page use. What we're doing is we're correlating the amount of clicks someone's getting to the amount of actual traffic that's being sent from Google itself.

Now I feel like I've been talking quite a bit. 

Sample Website Technical SEO Audit

Craig Campbell: Power Rubber has a website which is powerrubber.com, and he has asked us to have a look at it and use it as an example for the site audit. I, earlier on, put Power Rubber through the SEMrush tool, and it comes back with a fairly decent score of 82%.

I think anything between 80 and 100 is good. There's sometimes some false positives, some silly little button that you've got on there that is flagging up mixed content or whatever, and that's going to knock your score down. 

Sometimes there's just things that you just have to accept that are there, because SEMrush are suggesting that you've maybe got too many redirects, but you may have a big website and those redirects are legit.

Again, there's not much to fix there, but 82% doesn't look too bad on the outside, and when you really have to dig deep, the first thing we always look at is the errors, as we said earlier. Errors is the first place I would always go to, fix them first and then look onto your warnings and then notices.

We don't have a huge amount of errors on here, and you'll see here, there are 310 pages with a slow load speed. That's something you would have to do research on. Could be a low-cost server you're on, it could be something within the site build, too many plugins, whatever it may be. 

You would have to identify why those pages are loading slowly, and you can then start to run your website through third-party tools like GTmetrix and various others out there, because it is flagging that some of these URLs are taking three or four seconds to load. That's obviously not good enough, so we want to do further research into that. That would be one of the errors that Power Rubber has.

The next one which would concern me more than probably the other two is why 118 pages can't be crawled. Having a look at some of these, it may have been an error where when SEMrush was crawling it, for whatever reason, there was a glitch in the system that couldn't get there. Some of these do look like money pages and product pages, so I would want to know why they're not being crawled and make sure that that gets fixed.

Again, I think people out there have to use the Site Audit tool as a guide because I’ve seen a few questions popping up, and people are saying, "What about schema? It doesn't tell us how to fix it or how to do it." SEMrush, the audit tool, is there to guide you, and then you have to go and implement schema. 

Some people say there are plugins that do it, but in my opinion you want to do schema manually. I'm not sure what you would say to that, Ross, but I don't see any reason why you wouldn't use manual schema.

I think you've got to use SEMrush as a guide and then elaborate on it. It's telling you there's a problem and you have to then go on and identify what those problems are and do further research, and then implement what you think is right at that point.

Ross Tavendale: One of the interesting things about the tool is that if you hover over "Why and how to fix it", it does tell you to a degree how to fix things in these little boxes. If you go to schema.org, that's going to show you all of the different types of schema that you can use depending on your business type.

There's a guy called Simo Ahava if you want to Google him when you guys get a bit of time, he's a Tag Manager expert. He's got a bunch of really great blog posts on how to do that implementation.

Craig Campbell: Cool. Good information. From there, obviously you want to identify warnings, and Ross was talking about false positives. I'm not sure whether you would agree here, but look at this warning: 831 pages have a low text-to-HTML ratio. Obviously there are going to be pages that have less content compared to the amount of code on them, but that is probably something I would ignore in a lot of cases, Ross. I'm not sure, do you feel that's important?

Ross Tavendale: It's an interesting one. If they're using some sort of common CMS with tons of plugins, and also we can see the fact that nothing's compressed and their site speed's horrific, it's probably because there's so much code that it's constantly firing all the time, and we don't actually know if it's being used or it's not being used.

I'm not that bothered about thin content; I'm more bothered by the fact that there's too much code on the page for what it's trying to do. If you go into Chrome, open Inspect Element, look at the Network tab and refresh the page, you'll see a waterfall of everything that's firing, and it will also show you the things that are firing but not actually used on the page to change anything. That's something you probably want to look at.

If you're running things like Google Maps, literally go right click, inspect element, look at all the blue lights in there. If there's tons of JavaScript and things like that, you might think that you don't need the vast, vast majority of it. 

Low text? It's an e-commerce site, so that's just going to happen. For me, back to that 80/20 rule: if you've got your 20% of core landing pages, make sure they are not in that list. For other things like product pages, if they're important to you, add more content; otherwise, I tend to do it through landing pages, not through products.
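For anyone wanting to measure the text-to-HTML ratio being discussed, here is a rough stdlib-only Python sketch; it ignores script and style contents, and the sample page is hypothetical:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, ignoring script and style contents."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

def text_to_html_ratio(html: str) -> float:
    """Visible text length divided by total page length."""
    parser = TextExtractor()
    parser.feed(html)
    text = "".join(parser.parts).strip()
    return len(text) / len(html) if html else 0.0

page = "<html><head><script>var x=1;</script></head><body><p>Hello world</p></body></html>"
ratio = text_to_html_ratio(page)
```

There is no agreed threshold; the useful signal is a page whose ratio is drastically lower than its template siblings.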

Craig Campbell: I think it's a fair statement there in terms of the money pages. Just make sure your money pages are not in there is probably a good thing to be doing. Probably something I should do myself as well, rather than just put that on the back-burner.

You've also got issues with uncached JavaScript and CSS files, and again, that's more of a dev thing. I would pass it to a dev and tell them to sort that stuff out; it will help your load speed.

You've got 254 pages that don't have meta descriptions. There are probably some product pages and suchlike in there. Everyone knows that meta descriptions don't really help you rank directly, because everyone spammed them to death back in the day, but indirectly they can encourage clicks, so I think it is important to sort that out.

I do think meta descriptions are quite important for encouraging clicks, and I know a lot of people just ignore them and don't give a hoot. When you start to encourage clicks and improve click-through rate, that indirectly helps your rankings, in my opinion. Would you agree with that, Ross?

Ross Tavendale: We actually ran a study on this at Type A Media. We changed titles and descriptions to try and manipulate click-through rate in particular areas of the website. Click-through rate goes up and so does traffic...and rankings stay high.

Having something that's compelling to click on, schema feeds into that a lot. Making sure you've got rich snippets in play and things of that nature. Descriptions are a biggy for me to make it compelling to click.
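A simple check for the missing meta descriptions mentioned above can be scripted; this stdlib-only Python sketch (the sample snippets are invented) reports whether a page carries a non-empty description:

```python
from html.parser import HTMLParser

class MetaDescriptionFinder(HTMLParser):
    """Capture the content of <meta name="description" ...> if present."""
    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            a = dict(attrs)
            if a.get("name", "").lower() == "description":
                self.description = a.get("content", "")

def has_meta_description(html: str) -> bool:
    finder = MetaDescriptionFinder()
    finder.feed(html)
    return bool(finder.description and finder.description.strip())

with_desc = '<head><meta name="description" content="Handmade kids beds."></head>'
without = "<head><title>Shop</title></head>"
```

Run over a crawl, the pages returning False are the ones to prioritize, starting with the money pages.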

Craig Campbell: Cool. Two pages don't have a header. Obviously go in and identify that. 

Ross Tavendale: When you see things like H1 tag problems and then you see those sorts of pages, it looks like there's fundamental problems at a template level that they probably need to look at first. 

When you're doing these sorts of audits, and for the Power Rubber guys actually in the chat, I would seriously consider working out, of all the different templates you've got across the site, which are the core ones powering the parts of the site with the most traffic, and look at fixing those first. It's interesting, it says it's not finding a sitemap as well.

Craig Campbell: Sometimes it happens with my own website and there is a sitemap there. Not sure why that would happen, but that's why I just clicked on that one there. There obviously is a sitemap.

Ross Tavendale: It may not be in the correct syntax. A lot of the time when SEMrush says there's no XML sitemap, you click through and there is a page that loads called Sitemap, but it's in the wrong format. Although technically there is a page called Sitemap with links on it, it's not in the right format to be considered an XML sitemap.

For the guys out there, once you've got your sitemap, get it straight into Search Console and submitted, and make sure there are no breakages or orphan pages in there, because from what I've seen, an up-to-date sitemap is quite a strong crawling signal.

Craig Campbell: Notices is something you would probably come along and fix after you'd sorted out the initial stuff. You've got 4000 URLs with a permanent redirect, again, there might be a very good reason. 32 pages are blocked from crawling; that would concern me, I would want to know why. If it's just author pages and stuff, that's completely fine. 29 pages have more than one header tag. Again, there could be debate over that, but you would want to identify that. 

24 pages that only have one incoming internal link, so you probably want to look at your internal linking structure there and make sure that none of those, as Ross suggested earlier, are good money pages, so I think you want to make sure that you're powering up those money pages with a better internal linking structure. 

Overall, it's one of those websites where, when something like that lands on your desk, you go, "Jeez, there's a lot of problems there," but a lot of those problems can be sorted very, very quickly. It's something that's probably just been left to the side.

SEMrush is looking for every last wee thing that's broken, so it can be daunting when you think your website's got 700 errors. I'm pretty sure if I put Power Rubber into the SEMrush Organic Research report, it will still be getting decent traffic and stuff like that.

I think you've always got to think ahead, try to better yourself and be better than the competition, so good housekeeping is really important.

That's where I think the Site Audit tool on SEMrush is very, very aggressive, in my opinion. It sometimes flags up false positives, but through experience of using it, it's a tool I would use as part of my everyday toolkit. I think it does a great job.

Fixing Plugin Problems

Ross Tavendale: Here's a good question. Northern Dock Systems says, "The site audit shows lots of errors we received from 'popular plugins' we use on our website, which we regularly update. Should we be concerned? Are the plugins causing errors, and is this a problem?"

Craig Campbell: Yes and no. SEMrush is not Google; SEMrush is there to highlight that there may be an error there. But if these plugins are working well and they're adding functionality...

I'm not condoning heavy reliance on plugins for functionality in your websites; there are a lot of reasons why you don't want to do that. I think you can get a website made that's not reliant on plugins, and that would be the best way to do it.

If you have to use plugins for one reason or another, whether it's a functionality thing or it's just where you're at in life, then I wouldn't be too concerned that SEMrush didn't like them. I'd be more concerned if Google didn't like them. If there were issues with rankings and stuff, that's when I would take notice because SEMrush don't have a say in the rankings. They're there to just help guide you.

Tips to Increase Website Speed

Ross Tavendale: Northern Dock also asks, "Besides optimizing images, what can we do to increase site speed?"

Craig Campbell: You'd have to look at your server, cache and plugins, all that kind of stuff to improve performance, and there are other things that the guys can do in terms of cleaning up code on your website. Loads of templates out there have so much bloated code it's unreal, so all of that kind of stuff can be done to improve your site speed.

If you're like me, adding a plugin that optimizes your images is about as far as your technical knowledge goes. If I went in and started cleaning up code, I would blow up a website, so I wouldn't even dream of doing that.

Just continue to re-evaluate the website and see if there's anything else you can do that's cheap and cost-effective and that's going to bring that load time down. There are loads of guys out there doing site speed optimization for a couple of hundred bucks, so I think outsourcing it is always a good idea.

Ross Tavendale: One thing I would say for site speed is run a Lighthouse report; it's inside Google Chrome's DevTools. What that's going to do is give you a full rundown of everything that's blocking rendering on the site.

There's something known as the critical rendering path: resources load in a waterfall sequence, and one typically has to wait for another to finish before it can load. If you've got tons and tons of resources, they block your critical rendering path, and a lot of the time the things that are loading don't change anything on the page, so getting rid of them is a good port of call.

The guys in the chat mentioned CDNs, so potentially look at things like Cloudflare to speed it up. And if you're on shared hosting, don't be, that's silly; go onto a VPS or some sort of cloud service like Amazon AWS. I'm all out of speed stuff.

Craig was saying take away the bloat in your code: minify your JavaScript and CSS, and combine your small images into CSS sprites rather than serving lots of separate files. A sprite is just a single image file that contains everything, with CSS picking out each part, so the browser makes one request instead of many. Lighthouse audits would be a biggy for me.
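Minification of the kind mentioned here just strips bytes that browsers don't need; this deliberately naive Python sketch shows the idea for CSS (real minifiers such as cssnano handle far more cases):

```python
import re

def minify_css(css: str) -> str:
    """Naive CSS minifier: strip comments and collapse whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)   # remove /* comments */
    css = re.sub(r"\s+", " ", css)                     # collapse whitespace
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)       # trim around punctuation
    return css.strip()

styles = """
/* header styles */
h1 {
    color: #333;
    margin: 0;
}
"""
mini = minify_css(styles)  # -> "h1{color:#333;margin:0;}"
```

In practice you would let the build pipeline do this, but the example makes clear why minified files are smaller without changing behavior.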

Optimizing Product vs Category Pages

Here's another question: "I use category pages on my web shop as landing pages and optimize those, is that a good idea? When you have some thousand products it is time-consuming to make them all SEO-friendly." He's asking whether you should optimize product pages or category landing pages.

Craig Campbell: It's really up to yourself. I think product pages can bring in a lot of long-tail traffic initially: clicks, people on your website and all that kind of stuff. Category pages are going to be a little bit harder and more competitive, so if you're asking which one to do first, I wouldn't necessarily do one or the other first.

I think you have to look at the top 20 pages on your website, the ones that get the most search, and whether that's a category or a product page, look to optimize them. 

Ross Tavendale: I'm in a similar boat. For me, it very much depends on the volume around the keyword of the category and the volume around the product. If the product is Nike Air Max trainers, which I suppose is technically a category because there's multiple types, I would optimize that. If the category is sneakers or trainers, it'd really depend. 

Just because you're sending a bunch of generic traffic to your trainers or your sneakers page, does it convert to the same rate and is it worth the same level of cash value as optimizing for the product page?

Don't be really myopic when it comes to search. Search is great, traffic is lovely, rankings are great. If it's not making you any money then probably move away from it and do a bit of analysis up front to work that out.

Properly Removing Pages 

One of the guys in the chat is saying, "Any other recommendations other than 301? Should I just remove them from the sitemap and wait for them to die?" So he wants to get pages out of the index, or he's not bothered about them anymore. What would you typically do there? Are you going to just let them 404? Are you going to 410 them? What's your mindset?

Craig Campbell: 410, always, and the reason I say that is that I've seen so many websites in the past that have just done a 301 or whatever, or just left them as 404s, and they get jammed in the index, draining power from your website.

If those pages are gone and never coming back, then I would always use a 410 to say they're gone, and that's all there is to it, just to retain power in your website.

One, I don't think leaving 404 errors is a good thing, and two, things can sometimes get jammed in the index. I know Google is supposed to eventually come back to your website and the 404 will eventually drop out, but I've seen loads of instances where those 404s have not dropped and they've had to be 410'd to get rid of them.

Ross Tavendale: Also, if they need to stay on the site for a navigational purpose, you can't just delete them. You could always apply a noindex tag; just don't do it in the robots.txt anymore, because Google no longer honors noindex there. You can noindex the page itself, and in the robots.txt you could potentially disallow crawling if you really want to, but I think Craig's option of 410'ing makes sense, because a 410 tells Google the page is gone for good and to drop it from the index.
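The 301-where-possible, 410-otherwise policy Craig and Ross land on can be expressed as a tiny decision helper; this Python sketch (the URLs and redirect map are hypothetical) is one way such logic might be wired into a site's routing:

```python
def response_for_removed_page(url, redirect_map):
    """Decide how to answer a request for a page that no longer exists.

    Policy from the discussion: 301 to a close equivalent when one
    exists, otherwise 410 Gone so the URL drops out of the index,
    rather than leaving it to linger as a 404.
    """
    if url in redirect_map:
        return 301, redirect_map[url]  # permanent redirect to replacement
    return 410, None                   # gone for good: remove from index

# Hypothetical mapping of retired URLs to their replacements.
redirects = {"/old-cots": "/childrens-beds"}
```

For example, `response_for_removed_page("/old-cots", redirects)` yields a 301 to the replacement page, while an unmapped dead URL gets a clean 410.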

We're coming to the end, Craig. It's been an absolute pleasure. Where can people find you? Craigcampbellseo.com now?

Craig Campbell: Yeah, craigcampbellseo.com, and you'll be able to get me on Twitter, Facebook and all that from there, if you want to talk to me.

Ross Tavendale: Excellent. I know a good agency in London if you're struggling with your rankings Craig. It's typeamedia.net, or @rtavs on Twitter.

Well, that's everything from us guys. Thank you so much and we will see you next time.

Craig Campbell: Cheers. Thank you.
