Sometimes you make something better not by adding to it, but by taking away.
When I got into the SEO game about 10 years ago, I was taught that it is important to pump as many pages into Google’s index as possible. That really, really doesn’t work anymore.
Panda and other Google algorithms have taught us that quality, not quantity, is the name of the game. An effective approach to keeping quality high is “SEO pruning”: cutting or editing underperforming pages to make a site stronger. Better to have less content, but great content.
That’s what SEO pruning is all about.
The Idea Behind SEO Pruning
The concept of pruning is borrowed from bonsai trees. They tend to grow toward the top, which means the outer branches grow out of proportion, while the inner branches get crummy and die off. Pruning gets a bonsai tree into the desired shape by making sure growth is distributed more evenly.
The same applies to websites and SEO performance.
Most sites follow a power curve: a small number of pages gets most of the traffic. That in itself is not bad, but it results in a long tail of underperforming pages that don’t get enough “light”. They are the crummy branches. What you want instead is a high ratio of pages with top 10 rankings to indexed pages. Sites with few underperforming pages seem to get a sort of extra boost.
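To make that ratio concrete, here is a minimal sketch of how you might compute it. The page list and its best-position values are made up for illustration:

```python
# Hypothetical pages mapped to their best ranking position (None = no ranking).
pages = {
    "/guide-to-pruning": 3,
    "/old-news-post": 45,
    "/tag/seo": None,
    "/product-x": 8,
}

indexed = len(pages)
top10 = sum(1 for pos in pages.values() if pos is not None and pos <= 10)
ratio = top10 / indexed

print(f"{top10}/{indexed} pages rank in the top 10 ({ratio:.0%})")
# → 2/4 pages rank in the top 10 (50%)
```

The higher that share, the fewer crummy branches your site is dragging along.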
We don’t know the exact mechanics behind pruning, but it seems to be a mix of optimizing crawl budget, quality score, and user experience. The idea of a quality score originally comes from paid search, but the concept also exists in relation to a site’s link graph or brand combination searches. For content, Google seems to measure the quality of all pages in its index and roll it up into a “domain quality score”.
Now, the question is: how can you prune your site, and what tools should you use?
Let’s take a look!
How to Do SEO Pruning with SEMrush
The goal of this step-by-step guide is to create a spreadsheet that tells you which URLs are underperforming, so you can prune them. I am using one of my personal projects in Germany to walk through the tutorial.
What the final result should look like. Hat tip to conditional formatting.
Step 1: Crawl Your Site with “Site Audit”
First, we need to crawl our site, so go to “Site Audit” under “Projects” and get one started.
The “Site Audit” feature in SEMrush: Dashboard -> Projects -> Site Audit.
Once the crawl has finished, you get a report that lets you export the crawled pages (on “Crawled Pages”, click the export button in the top right).
Crawled pages report in the Site Audit feature.
Export and save the data.
Step 2: Extract User Behavior and Backlink Data
Now we have a basic index of pages, but we still need the data to assess which ones to prune. We can use many different criteria to assess the performance of a URL:
- Organic entrances
- % bounced entrances / bounce rate
- Total pageviews/sessions
- Number + quality of backlinks
- Crawl frequency
- # of ranking keywords
- Social shares
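You can combine these criteria into a simple rule that flags likely pruning candidates. Here is a sketch; the URLs, metric names, and thresholds are all illustrative, not SEMrush fields:

```python
# Hypothetical per-URL metrics, as they might come out of your exports.
urls = [
    {"url": "/top-article", "entrances": 1200, "bounce_rate": 0.35,
     "backlinks": 40, "ranking_keywords": 25},
    {"url": "/tag/misc", "entrances": 2, "bounce_rate": 0.90,
     "backlinks": 0, "ranking_keywords": 0},
]

def is_underperformer(page):
    """A page 'underperforms' if it has almost no traffic, no links,
    and no ranking keywords. Thresholds are arbitrary starting points."""
    return (page["entrances"] < 10
            and page["backlinks"] == 0
            and page["ranking_keywords"] == 0)

candidates = [p["url"] for p in urls if is_underperformer(p)]
print(candidates)  # → ['/tag/misc']
```

Tune the thresholds to your own site; the point is to make the criteria explicit rather than eyeballing the spreadsheet.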
In my example, I used the number of ranking keywords, backlinks (from SEMrush and Search Console), traffic, and user signals. “Organic Traffic Insights” has all of that (it can be found in the Projects section).
The “Organic Traffic Insights” report in SEMrush.
Don’t forget to define the time range before you export. By default, it is set to 7 days, but that is not very representative. Instead, choose the last 3-6 months.
Pick the right time range for the data export.
What you get out should look something like this:
Export from “Organic Traffic Insights” in SEMrush
What is nice is that the export gives you average user behavior metrics across your site to benchmark against. Plus, below the URL overview, you get all ranking keywords for each URL*.
*(With the latter you can create a pivot table and check whether several URLs rank for the same keyword to assess potential keyword cannibalization. But that is a topic for another blog article.)
We also need to export backlinks, so head over to “Backlink Analytics” -> “Backlinks” and export ‘em all. Just make sure you exclude lost backlinks before the export; otherwise, your list will be noisy.
“Backlinks” in SEMrush
Additionally, you can add backlinks from any tool you like, e.g., Search Console. The goal is to get the number of backlinks per URL, or a backlink quality metric.
In Search Console, just click on “Links” -> “Top linked pages - externally” and export that list.
External links report in Google Search Console: Status -> Links -> Top linked pages - externally.
Once you have downloaded all the data, import the CSVs into the spreadsheet from the crawl in step 1. Then, use VLOOKUP to group all data per URL. Create a pivot table for the backlink data so you can group the sum (or average strength, if you use a proprietary metric) of backlinks per URL.
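If you prefer to do the VLOOKUP and pivot steps in code, pandas can join the exports per URL. A sketch with made-up stand-in data in place of the real CSV exports:

```python
import pandas as pd

# Stand-ins for the CSV exports; in practice use pd.read_csv(...) on each file.
crawl = pd.DataFrame({"url": ["/a", "/b", "/c"]})
traffic = pd.DataFrame({"url": ["/a", "/b"], "sessions": [500, 12]})
backlinks = pd.DataFrame({"url": ["/a", "/a", "/b"],
                          "source": ["x.com", "y.com", "z.com"]})

# Pivot-table equivalent: count backlinks per URL.
links_per_url = (backlinks.groupby("url").size()
                 .reset_index(name="backlink_count"))

# VLOOKUP equivalent: left-join everything onto the crawled page list.
report = (crawl.merge(traffic, on="url", how="left")
               .merge(links_per_url, on="url", how="left")
               .fillna(0))
print(report)
```

The left joins keep every crawled URL in the report, so pages with zero traffic or zero links stay visible, which is exactly what you need for pruning.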
You should now have a list of URLs with information about rankings, traffic, and user behavior performance.
I indicated which data comes from SEMrush (orange) and which from Search Console (grey) in my spreadsheet:
The final pruning table, sorted by links from Search Console.
Step 3: Identify Underperforming Pages
From here on it’s all Excel magic.
Apply conditional formatting to each column and sort the table by traffic to see why certain URLs are not performing well. Sometimes a page has a high bounce rate; other times it has too few links, or it was never optimized for a keyword.
What you want to do is take the whole list and group URLs into three buckets.
Performers (do nothing)
The first bucket is for pages that are performing well. They rank (well) for keywords, bring in traffic, and have good user signals and backlinks. So, you obviously don’t want to prune those. Pages that cannot be removed also fall into this bucket: the homepage, Terms of Service, feature landing pages, etc.
Slight Underperformers (edit)
Slight underperformers bring in some traffic but don’t rank for any keyword in the top 10. They could get onto the first page of Google with some optimization. They might also have brought in a lot of traffic once but dropped in rankings over time.
Definite Underperformers (merge or redirect)
The last group is reserved for pages that never got any traffic or backlinks. They are sometimes duplicates or overlap with other content, e.g., two articles targeting the same topic/keyword. Often, they are just pages that are part of the infrastructure and have never been optimized for a (good) keyword, e.g., archive, author, tag pages.
Pages that often fall into this bucket are:
- Author/tag/archive pages
- Blog articles
- Hub pages (from topic clusters)
- Empty inventory (products that aren’t available anymore)
It is best to redirect those pages either to a similar one (perhaps one targeting the same keyword) or to the next page up in the hierarchy (this could be the blog overview page or a category page). You can also merge the content into another page, depending on how feasible that is.
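The three buckets can be expressed as a simple decision rule. Here is a sketch; the field names and thresholds are mine, not SEMrush’s, so adapt them to your own spreadsheet columns:

```python
def bucket(page):
    """Assign a URL to one of the three pruning buckets.
    Expects a dict with hypothetical keys; thresholds are illustrative."""
    if page.get("protected"):             # homepage, ToS, feature pages
        return "performer"
    if page["traffic"] == 0 and page["backlinks"] == 0:
        return "definite underperformer"  # merge or redirect
    if page["best_position"] is None or page["best_position"] > 10:
        return "slight underperformer"    # edit
    return "performer"                    # do nothing

page = {"traffic": 30, "backlinks": 1, "best_position": 24}
print(bucket(page))  # → slight underperformer
```

Running every row of the spreadsheet through a rule like this keeps the bucketing consistent, instead of deciding each URL ad hoc.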
Step 4: Determine What to Do with Underperformers
Pruning doesn’t mean deleting URLs. You have a couple of options for what to do with the second and third buckets:
- Merge the content of two or more URLs into one. Make sure to 301-redirect the old URLs to the new one.
- Redirect a URL to another one, or to the next page up in the hierarchy.
- Edit/rework/rewrite/update the content.
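Once those decisions are recorded in the spreadsheet, the redirect rules can be generated mechanically. A sketch that emits Apache-style `Redirect 301` lines; the decision list stands in for a hypothetical “action” column:

```python
# Stand-in for the "action" column of the pruning spreadsheet.
decisions = [
    {"url": "/old-post-a", "action": "redirect", "target": "/guide"},
    {"url": "/old-post-b", "action": "merge",    "target": "/guide"},
    {"url": "/thin-page",  "action": "edit",     "target": None},
]

# Merged pages also need a 301 to the page that absorbed their content.
rules = [f"Redirect 301 {d['url']} {d['target']}"
         for d in decisions
         if d["action"] in ("redirect", "merge")]

print("\n".join(rules))
```

Pages marked “edit” get no rule, since their URL stays live. If your server isn’t Apache, swap the output format for your platform’s redirect syntax.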
Start by setting all URLs in the underperformer bucket to meta=noindex. You might not be able to change all pages, for example an author page that users still need to be able to click on. But you can prevent Google from indexing it, which already helps.
Then, go through the list of underperformers (bucket 3) and decide what to do with each of the pages you can change (merge, redirect, or edit).
Only then would I start working on the second bucket, the slight underperformers. In some cases, it might not even be necessary to prune slight underperformers.
Step 5: Release Changes in Stages
At this point, you just have to make the changes and roll them out.
I recommend releasing in stages: start with underperformers and wait for a bit (3-6 weeks). See what happens and make a data-informed decision. If the change is positive for your site, you can try to continue with the slight underperformers.
You don’t want to prune too much at once. Try to find the optimum, instead.
What results can you achieve with pruning?
One site we pruned at Atlassian gained significantly in top 20 rankings, and we gained roughly 25% in organic traffic about a month after we pruned.
# of keywords over the last 12 months in SEMrush.
How often should you prune?
I recommend pruning 1-2x per year; otherwise, you risk “over-pruning”. It is like cutting a tree back too far. Sometimes the results need a bit of time to show. It often depends on the size of the site; a larger site will show results faster.
Who should prune?
Everybody, but your site shouldn’t be too small. If you have 100 pages, it probably doesn’t make sense, but you can start thinking about it from 1,000 pages on. Keep in mind that these are arbitrary values, provided to give a rough idea of what “small” and “large” mean. You need some content to prune; you can’t cut off branches that aren’t there yet.
What else can you prune?
You can (and should) prune backlinks and inventory according to the same principle: disavow spammy backlinks and get rid of products nobody buys to improve the overall quality of your profile. It is all part of keeping things in order.
Sometimes, less is more.