Don’t Take Google for Granted When it Comes to Indexation

Over the seventeen years or so that I’ve been involved with SEO, one of the most common statements I’ve heard from non-SEOs is that you should just trust Google to pick up the most important pages on your website and let it get on with the job. This is a very dangerous view to take, especially when it comes to seasonal products, where you absolutely have to make sure those pages are working hard for you.

One of the conventional ways to see how much of your site has been indexed by Google is to submit a sitemap and wait for Google to update the ‘indexed’ number. You can even add these numbers to a spreadsheet and chart the ratio of indexed to submitted pages over time.
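If you would rather script that ratio than keep it in a spreadsheet, a minimal sketch could look like the one below. The figures are made-up placeholders standing in for numbers you would copy from your own sitemap report, and the function name is our own.

```python
# Placeholder figures for illustration only: (label, submitted, indexed).
# Replace them with the numbers shown in your own sitemap report.
snapshots = [
    ("week 1", 1200, 950),
    ("week 2", 1250, 940),
    ("week 3", 1300, 1010),
]


def indexation_ratio(submitted, indexed):
    """Return the share of submitted URLs that are reported as indexed."""
    if submitted == 0:
        return 0.0  # avoid dividing by zero for an empty sitemap
    return indexed / submitted


for label, submitted, indexed in snapshots:
    ratio = indexation_ratio(submitted, indexed)
    print(f"{label}: {indexed} of {submitted} indexed ({ratio:.1%})")
```

A falling ratio from one snapshot to the next is the signal to dig further.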


The problem is that this only tells you that you’ve got a problem, not how to solve it. Some SEOs suggest creating separate sitemaps for different types of pages, such as products, collections and other non-core pages. Software such as Shopify will do this for you. But that only narrows down the problem; it doesn’t highlight the individual pages that aren’t indexed and so cannot send you traffic.

For years we’ve wanted a tool that can help us identify which pages on our own sites aren’t indexed, and now we’ve found one. URL Profiler is a great bit of software which allows you to import your sitemaps (and other URL lists) and see which URLs Google has indexed and in which part of its database it has put them. This second aspect is vitally important, as some of your pages could be classically viewed as “indexed” while Google has such a low opinion of them that they’re very unlikely ever to be returned for a normal search.

URL Profiler

In the URL Profiler reports each URL is assigned “Base” if it is in the normal database, “Deep” if it is in Google’s extra one for undesirable URLs, or “None” if either you have instructed Google not to index it or Google sees no need to do so. The reports also list the ‘cached date’, so you can see (roughly) when Google last crawled the page. This will help you work out whether any changes you have recently made to the page, e.g. changing the copy to include new keywords or topics, will be reflected in the search results yet.
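Once the report is exported as a CSV, the “Deep” and “None” URLs can be pulled out with a few lines of Python. This is only a sketch: the column names "URL" and "Index Status" are assumptions, so check them against the header row of your own export, and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict

# Assumed column names; adjust them to match the header row of your export.
URL_COLUMN = "URL"
STATUS_COLUMN = "Index Status"


def group_by_status(csv_file):
    """Group the URLs in a CSV report by their index status."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Treat a missing or blank status the same as "None".
        status = (row.get(STATUS_COLUMN) or "").strip() or "None"
        groups[status].append(row[URL_COLUMN])
    return dict(groups)


# Invented sample rows standing in for a real export file.
sample = io.StringIO(
    "URL,Index Status\n"
    "https://example.com/,Base\n"
    "https://example.com/old-page,Deep\n"
    "https://example.com/hidden,None\n"
)

groups = group_by_status(sample)
needs_attention = groups.get("Deep", []) + groups.get("None", [])
for url in needs_attention:
    print(url)
```

With a real export you would pass `open("report.csv", newline="", encoding="utf-8")` instead of the in-memory sample; the file name here is hypothetical.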

Why is it important to run this report?
There are various reasons to check which pages are indexed. The first is that you may create products and assign them to a category or collection which itself isn’t linked to from anywhere within the website; often a navigation link is missing. So you may believe you have put products live, but they can only be found through your internal search service.

Another reason is that you may have created sections of content in the past to optimise your search results, and now that Google and the other search engines have changed tack you feel the need to remove them, perhaps with the upcoming update in mind. In that case you want to be absolutely sure that you have removed every single one of those potentially damaging pages, and checking their indexation is how you do it. Essentially, you go through the list by eye to confirm that each one has been removed or noindexed.

The third reason, and one we often find, is that you have product variant URLs whose pages appear virtually identical. You may, for example, have products in different sizes or colours, and Google has taken the decision itself to exclude the other versions. In this case it makes sense to review how you create these variations: if Google takes the view that a proportion of your site has little value, it could affect its perception of your website as a whole.

Sometimes products are also duplicated by mistake. You may have duplicated a product to carry over some information and forgotten to change other aspects of the product page, which may cause one or more pages not to be indexed.

A fourth reason is that you may have mistakenly instructed Google not to index pages via a noindex tag. Often the tag has been left on by mistake, or was added by a member of staff who didn’t know what it meant. Using the URL Profiler software you will easily be able to spot these pages.
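For anyone unsure what the tag looks like, it is a robots meta tag in the page’s head with "noindex" among its directives. The sketch below uses only Python’s standard library to check a page’s HTML for one. It is our own illustration, not how URL Profiler works, and it does not cover noindex sent via the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Record whether a page carries a robots meta tag that blocks indexing."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() not in ("robots", "googlebot"):
            return
        directives = [d.strip().lower() for d in attributes.get("content", "").split(",")]
        # "none" is shorthand for "noindex, nofollow".
        if "noindex" in directives or "none" in directives:
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```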

Other ways to use URL Profiler
When you run URL Profiler you can also get the amount of content on each page. You can re-order the CSV report to show the shortest pages first and establish which need to be improved and which should have noindex added. Google’s Panda algorithm targets thin and low-quality pages, so being able to spot the short ones should help you solve part of the problem.
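The same re-ordering can be done in a script if the report is too large to sort comfortably by hand. In this sketch the column names "URL" and "Word Count" and the 150-word cut-off are all assumptions chosen for illustration, and the sample rows are invented.

```python
import csv
import io

# Assumed column names; adjust them to match the header row of your export.
URL_COLUMN = "URL"
WORD_COUNT_COLUMN = "Word Count"


def shortest_pages(csv_file, threshold):
    """Return (url, word_count) pairs below the threshold, shortest first."""
    pages = []
    for row in csv.DictReader(csv_file):
        value = (row.get(WORD_COUNT_COLUMN) or "").strip()
        if not value.isdigit():
            continue  # skip rows where no word count was recorded
        count = int(value)
        if count < threshold:
            pages.append((row[URL_COLUMN], count))
    return sorted(pages, key=lambda page: page[1])


# Invented sample rows standing in for a real export file.
sample = io.StringIO(
    "URL,Word Count\n"
    "https://example.com/guide,900\n"
    "https://example.com/product-a,40\n"
    "https://example.com/product-b,120\n"
    "https://example.com/broken,\n"
)

# 150 words is an arbitrary cut-off, not a figure published by Google.
thin_pages = shortest_pages(sample, threshold=150)
for url, count in thin_pages:
    print(count, url)
```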

Another issue with search engines is that you may have content duplicated throughout your website and on other websites. Although the indexation report may show you the symptoms of duplication, URL Profiler’s specific ‘duplication’ report will show you the causes.

First run it looking for internal duplication, which should flag up duplicated products, categories and other forms of content. Then run it against external websites, which should flag up whether other sites have copied your content (bad for SEO) or you’ve simply used manufacturers’ descriptions (also bad for SEO).
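To make the idea of internal duplication concrete, here is a rough sketch that groups pages whose text is identical once case and whitespace are ignored. It is a deliberately simple exact-match check of our own, not the method URL Profiler uses, and it will miss near-duplicates where only a few words differ. The page paths and copy are invented.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text):
    """Hash the text after collapsing whitespace and lowercasing it."""
    normalised = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def find_exact_duplicates(pages):
    """Given a dict of URL -> page text, return groups of URLs sharing the same text."""
    by_hash = defaultdict(list)
    for url, text in pages.items():
        by_hash[fingerprint(text)].append(url)
    return [urls for urls in by_hash.values() if len(urls) > 1]


# Invented product copy for illustration.
pages = {
    "/products/red-scarf": "A warm wool scarf.  Hand finished.",
    "/products/blue-scarf": "a warm wool scarf. hand finished.",
    "/products/hat": "A knitted hat for cold days.",
}

for group in find_exact_duplicates(pages):
    print("Same copy on:", ", ".join(group))
```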

Another SEO metric that you may want to track, perhaps against indexation, is which pages have high (adjusted) bounce rates, as this could show you how users’ perception of quality feeds into Google’s decisions about which pages to index, and which pages you should fix.

You could even find your most shared content by selecting the “social shares” option. This way you can then work out which types of content are the most successful and replicate that approach.

From the couple of weeks we’ve spent using URL Profiler we can see it’s a very powerful tool that allows you to fix issues you previously didn’t know existed. Get the 14-day trial and see what you think of it.