What Is Log File Analysis?

Introduction

Google has limited time and resources to fulfill its mission of organizing the world's information and making it universally accessible and useful. One consequence of these limits is that Google can crawl only a fraction of the pages on large enterprise websites. As we have discussed in our post on crawl budget, enterprise webmasters must prioritize technical SEO to ensure Google frequently crawls their most important traffic- and revenue-driving pages.

This article provides an introduction to log file analysis, a task allowing marketers to study how Google interacts with their websites in order to inform changes for technical SEO.

For technical SEO purposes, a log file is a collection of server data from a given period of time showing requests to your webpages from humans and search engines. It's almost like a sign-in sheet for your website: it records who visited, where they came from, and other information about them.

Marketers analyze the data from these log files in order to understand, for example, how their website is being crawled by Googlebot. The insights from this data can be used to resolve bugs, errors, or hacks that are negatively impacting how Google discovers, understands, and adds their content to search results.

Q: What are examples of things I might see in a log?

A log file typically records the following for each request:

The IP address of the client making the request
The date and time the request was made (timestamp)
The requested URL
The HTTP status code (was the request redirected? Is it a 404 page?)
The user agent (this is where we can see if the visitor was Googlebot)
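
To make these fields concrete, here is a minimal Python sketch that pulls them out of a single log line. It assumes the server writes the common "combined" access log layout; your server may log different fields or a different order. The sample line and the `parse_line` helper are made up for illustration.

```python
import re
from datetime import datetime

# Pattern for the common "combined" access log layout. This is an assumption:
# adjust it if your server is configured with a different log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["timestamp"] = datetime.strptime(
        entry["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    # The user agent is only what the visitor claims to be; it can be spoofed.
    entry["is_googlebot"] = "Googlebot" in entry["user_agent"]
    return entry

# Made-up sample line, for illustration only.
sample = (
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /products/blue-widget HTTP/1.1" 200 5123 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
entry = parse_line(sample)
print(entry["ip"], entry["url"], entry["status"], entry["is_googlebot"])
```

Each parsed entry carries the five items listed above, which is all the later analysis needs.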

WHAT TECHNICAL SEO INSIGHTS COME FROM LOG FILE ANALYSIS?

Log file analysis yields answers to important technical SEO questions such as whether:

crawl budget is being used efficiently
certain pages are being crawled more often than others
Google is unaware of certain areas or pages on a website
Google is facing accessibility issues in certain areas of a website
Google is visiting your site frequently or infrequently

HOW TO PERFORM LOG FILE ANALYSIS

A log file can be difficult to parse manually because it includes so much data. If you don't really know what you're looking at, or how to isolate what you're looking for, it can be challenging to get the information you need from the log data to create a technical SEO strategy.

Luckily, there are tools that can help. One of the best tools to use for SEO purposes is Screaming Frog's Log File Analyser. Using it, or another SEO log analyzer, will help you visualize and organize your site's log data so it's easier to understand what's going on.

Google Search Console also offers some comparable crawl data through its Crawl Stats report.

As we said, log file analysis can answer important technical SEO questions about your website. An SEO log file analysis tool will speed up the process to help you get answers more easily.

SEO Log File Analyzers Can Help You Understand:

Crawled URLs

How many pages of my site is Googlebot able to crawl within its crawl budget? Log files contain important crawl data concerning your site. You can see exactly which URLs Googlebot and other search engine bots requested during each visit.

Crawl Budget

Have I maximized my crawl budget or is there room for improvement? You can analyze which URLs are being crawled on your site so that you can identify crawl issues and make better use of your crawl budget.

Crawl Frequency

How often is Googlebot visiting my site? Analyzing the server log of your site will help you understand how often search engines are crawling your webpages, the number of URLs being crawled for each visit, and which user agent bots visit your site the most.
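
As a rough illustration of how these three numbers can be tallied, the Python sketch below counts Googlebot requests per day, distinct URLs per day, and total requests per bot. The entries are hypothetical, already-parsed log records, and the `BOT_MARKERS` list is an illustrative assumption rather than a complete list of crawlers.

```python
from collections import Counter

# Hypothetical, already-parsed log entries: (date, user agent, URL).
entries = [
    ("2023-10-10", "Googlebot/2.1", "/"),
    ("2023-10-10", "Googlebot/2.1", "/products"),
    ("2023-10-10", "bingbot/2.0", "/"),
    ("2023-10-11", "Googlebot/2.1", "/"),
    ("2023-10-11", "Mozilla/5.0 (Windows NT 10.0)", "/contact"),
]

# Substrings used to recognise bots; illustrative, not exhaustive.
BOT_MARKERS = ("Googlebot", "bingbot")

def bot_name(user_agent):
    """Return the first matching bot marker, or None for other visitors."""
    for marker in BOT_MARKERS:
        if marker in user_agent:
            return marker
    return None

hits_per_day = Counter()  # Googlebot requests per day
urls_per_day = {}         # distinct URLs Googlebot requested per day
hits_per_bot = Counter()  # total requests per bot

for date, user_agent, url in entries:
    bot = bot_name(user_agent)
    if bot is None:
        continue  # skip human visitors and unrecognised agents
    hits_per_bot[bot] += 1
    if bot == "Googlebot":
        hits_per_day[date] += 1
        urls_per_day.setdefault(date, set()).add(url)

for date in sorted(hits_per_day):
    print(date, hits_per_day[date], "requests,", len(urls_per_day[date]), "URLs")
print(hits_per_bot.most_common())
```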

Broken Links & Errors

Do my site links look healthy for search engine bots and users? It's important to stay on top of any errors or broken links across your site.
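
One way to surface these problems is to group error responses by URL. The Python sketch below does this for a handful of hypothetical (URL, status code) pairs and lists the URLs with the most 4xx and 5xx responses first; the data is made up for illustration.

```python
from collections import defaultdict

# Hypothetical parsed entries: (URL, HTTP status code).
requests = [
    ("/", 200),
    ("/old-page", 404),
    ("/old-page", 404),
    ("/checkout", 500),
    ("/products", 200),
]

# For each URL, count how often each error status (4xx or 5xx) was returned.
errors = defaultdict(lambda: defaultdict(int))
for url, status in requests:
    if status >= 400:
        errors[url][status] += 1

# Sort URLs by their total number of error responses, most first.
worst = sorted(
    errors.items(),
    key=lambda item: sum(item[1].values()),
    reverse=True,
)
for url, counts in worst:
    print(url, dict(counts))
```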

301 Redirects

Do I have too many redirects on my site? Log data can help you see the number of 301 redirects on your site. If there are too many redirects across your site, or too many in a redirect chain, you will be able to identify the problem so you can work to solve it.
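
A log line shows that a URL answered with a 301, but the common log layouts usually don't record where the redirect points. The Python sketch below therefore assumes you also have a redirect map from another source, such as your server configuration. The `redirect_map` data and the `redirect_chain` helper are hypothetical and only illustrate how chain length could be measured.

```python
# Hypothetical parsed entries: (URL, HTTP status code).
requests = [
    ("/a", 301),
    ("/b", 301),
    ("/home-old", 301),
    ("/d", 200),
]

# Hypothetical redirect map (source URL -> destination URL), taken from
# somewhere other than the log, e.g. server configuration.
redirect_map = {"/a": "/b", "/b": "/c", "/c": "/d", "/home-old": "/"}

def redirect_chain(start, redirect_map, max_hops=10):
    """Follow redirects from start and return the list of URLs visited."""
    chain = [start]
    current = start
    while current in redirect_map and len(chain) <= max_hops:
        current = redirect_map[current]
        if current in chain:
            chain.append(current)
            break  # redirect loop detected; stop following
        chain.append(current)
    return chain

# URLs the log shows answering with a 301.
redirected_urls = sorted({url for url, status in requests if status == 301})

# Report chains with more than one hop, since those are the ones worth flattening.
long_chains = {}
for url in redirected_urls:
    chain = redirect_chain(url, redirect_map)
    hops = len(chain) - 1
    if hops > 1:
        long_chains[url] = chain

for url, chain in long_chains.items():
    print(" -> ".join(chain))
```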

Slow Pages

Which pages on my site are slower and harder for search engines to crawl? It's important to recognize that certain pages might be harder for Googlebot to crawl. Once you're aware of the problem, you can look into page speed solutions.
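
Finding slow pages from a log only works if the server records a response time for each request, which is not part of every default log layout. Assuming that field is available, the Python sketch below averages response times per URL and flags pages above a cut-off. The data and the one-second threshold are illustrative assumptions, not figures from Google.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parsed entries: (URL, response time in seconds).
# Assumes the server was configured to log a response time field.
timings = [
    ("/", 0.12),
    ("/search", 1.80),
    ("/search", 2.20),
    ("/products", 0.30),
    ("/", 0.08),
]

SLOW_THRESHOLD_SECONDS = 1.0  # illustrative cut-off, not a Google figure

# Collect every response time observed for each URL.
times_by_url = defaultdict(list)
for url, seconds in timings:
    times_by_url[url].append(seconds)

average_by_url = {url: mean(times) for url, times in times_by_url.items()}

# URLs whose average response time exceeds the cut-off, slowest first.
slow_pages = sorted(
    (url for url, avg in average_by_url.items() if avg > SLOW_THRESHOLD_SECONDS),
    key=lambda url: average_by_url[url],
    reverse=True,
)
print(slow_pages)
```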

Uncrawled & Orphan Pages

Are there pages on my site that search engine crawlers are unable to find? There might be pages that search engine crawlers simply don't crawl or that they are unable to find. This can happen for a variety of reasons, depending on the specific page and the structure of your site. Once you've identified an uncrawled page, you can investigate why.
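
In practice this comparison comes down to set differences. The Python sketch below assumes you have three lists of URLs: those in your XML sitemap, those reachable through internal links (for example from a site crawl), and those Googlebot requested according to the log. All three sets are hypothetical.

```python
# URLs you expect search engines to crawl, e.g. from an XML sitemap (hypothetical).
sitemap_urls = {"/", "/products", "/products/blue-widget", "/about"}

# URLs reachable through internal links, e.g. from a site crawl (hypothetical).
linked_urls = {"/", "/products", "/products/blue-widget", "/about"}

# URLs Googlebot requested according to the log (hypothetical).
googlebot_urls = {"/", "/products", "/old-landing-page"}

# In the sitemap but never requested by Googlebot in this log period.
uncrawled = sitemap_urls - googlebot_urls

# Requested by Googlebot but not linked internally: possible orphan pages.
orphan_candidates = googlebot_urls - linked_urls

print("Uncrawled:", sorted(uncrawled))
print("Possible orphans:", sorted(orphan_candidates))
```

A URL missing from a short log window is not necessarily unknown to Google, so treat the results as a starting point for investigation.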

CONCLUSION

In short, log file analysis is one of the first steps towards helping search engines index and rank your website better.

If you are looking for solutions that address recommendations from log file analysis, check out our technical SEO software platform here.