Pinpointing the cause of errors or slow page load times can be a challenge, especially when the issue is sporadic and the site is high-traffic.
To help our customers track down any problems that their site might be experiencing, we've made a CLI tool called pagely-logwatch available for you to use.
Running pagely-logwatch outputs the following sections:
- Header Status Returns: non-HIT's, sorted by most up top
- Header Status Returns: longer than two second load, sorted by most up top
- Look for 503/504 only
- User-Agents grouped with IP: return code, non-HIT's, sorted by most up top
- User-Agents grouped with IP: non-HIT's, sorted by most up top
- User-Agents slower than 5 seconds, sorted by most up top
Basic Usage
To use pagely-logwatch, access your server via SSH and run the command as outlined in the usage examples below.
pagely-logwatch [grep_expression] [-f /custom/path/file.log]
General Usage Examples
Here are a few quick examples of how you can use pagely-logwatch to identify issues.
(The examples in this section are intended as a quick reference. For more examples of tracking down specific issues, take a look at the case-specific examples later in this article.)
Filtering By Domain
To run pagely-logwatch on a specific domain, just pass the domain as the first argument like this:
pagely-logwatch example.com
Filtering By Domain and Date
To narrow things down even further, you can pass additional filters with grep, such as a specific date:
pagely-logwatch "example.com | grep '19/Mar/2020'"
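If the date alone still returns too much, the same idea can be pushed down to a single hour. The sketch below is only an illustration: it assumes the filter expression behaves like an ordinary grep pipeline and that timestamps look like 19/Mar/2020:14:02:11, and it runs against a few hypothetical sample lines (the sample_log and filter_by_hour helpers are made up for this example) rather than a real log.

```shell
# Hypothetical sample lines; the real log layout on your server may differ.
sample_log() {
  cat <<'EOF'
example.com 203.0.113.5 - - [19/Mar/2020:14:02:11 +0000] "GET / HTTP/1.1" 200 512
example.com 203.0.113.6 - - [19/Mar/2020:15:40:03 +0000] "GET /about HTTP/1.1" 200 734
example.com 203.0.113.7 - - [20/Mar/2020:14:05:59 +0000] "GET / HTTP/1.1" 200 512
EOF
}

# Keep only example.com requests from 19 March 2020 between 14:00 and 14:59.
filter_by_hour() {
  grep 'example.com' | grep '19/Mar/2020:14:'
}

# Only the first sample line survives both filters.
sample_log | filter_by_hour
```

Under the same assumptions, the equivalent filter expression would be "example.com | grep '19/Mar/2020:14:'".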
Filtering By Domain and Request Method/Path
If you're tracking down issues on a specific page, filtering by the request method and path will help you narrow down the possibilities:
pagely-logwatch "example.com | grep 'GET /example/path/here'"
Filtering By Domain and Status Code
If you're looking for a specific status code, you can filter the logs like this:
pagely-logwatch "example.com | grep '403'"
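Keep in mind that a bare grep '403' matches those three digits anywhere in a line, including paths and byte counts. The sketch below shows the difference on hypothetical sample lines, assuming a log layout where the status code directly follows the quoted request line; the helper names are made up for illustration, and your actual log layout may differ.

```shell
# Hypothetical sample lines; only the first one is a real 403 response.
sample_log() {
  cat <<'EOF'
example.com 203.0.113.5 - - [19/Mar/2020:14:02:11 +0000] "GET /private HTTP/1.1" 403 512
example.com 203.0.113.6 - - [19/Mar/2020:14:03:40 +0000] "GET /item/403 HTTP/1.1" 200 734
example.com 203.0.113.7 - - [19/Mar/2020:14:05:59 +0000] "GET / HTTP/1.1" 200 4031
EOF
}

# Loose match: "403" anywhere in the line (path, byte count, status).
loose_403() { grep '403'; }

# Stricter match: "403" immediately after the closing quote of the request line.
strict_403() { grep '" 403 '; }

sample_log | loose_403    # matches all three sample lines
sample_log | strict_403   # matches only the /private request
```

If you use the stricter pattern inside a pagely-logwatch filter expression, the embedded double quote will likely need escaping; how the tool handles that is not covered here, so treat it as something to test first.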
Case-Specific Examples
Here are a few examples of common scenarios where pagely-logwatch may help to track down a specific problem.
Locating 404 Not Found Errors
pagely-logwatch can help you locate 404 errors occurring across your site. With a list of 404 errors in hand, you can identify broken links or opportunities to redirect users to more relevant content.
pagely-logwatch "example.com | grep '404'"
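To turn a list of 404s into a ranked list of broken URLs, the matching lines can be tallied by path. This is a minimal sketch using standard awk and sort on hypothetical sample lines. It assumes the request line is the first double-quoted field and the status code follows it, which may not match your log layout. The sample_log and top_404_paths helpers are illustrative and not part of pagely-logwatch.

```shell
# Hypothetical sample lines; the real log layout on your server may differ.
sample_log() {
  cat <<'EOF'
example.com 203.0.113.5 - - [19/Mar/2020:14:02:11 +0000] "GET /old-page HTTP/1.1" 404 512
example.com 203.0.113.6 - - [19/Mar/2020:14:03:40 +0000] "GET /old-page HTTP/1.1" 404 512
example.com 203.0.113.7 - - [19/Mar/2020:14:05:59 +0000] "GET /missing.png HTTP/1.1" 404 512
example.com 203.0.113.8 - - [19/Mar/2020:14:06:10 +0000] "GET / HTTP/1.1" 200 734
EOF
}

# Count 404 responses per path, most frequent first.
# Splitting on double quotes: field 2 is the request line,
# field 3 begins with the status code.
top_404_paths() {
  awk -F'"' '{
    split($2, req, " ")    # req[2] is the requested path
    split($3, rest, " ")   # rest[1] is the status code
    if (rest[1] == "404") count[req[2]]++
  }
  END { for (p in count) print count[p], p }' | sort -rn
}

sample_log | top_404_paths
```

Paths at the top of the output are the ones hit most often, which makes them the first candidates for a redirect or a link fix.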
Identifying Pages with Slow Load Times
The pagely-logwatch script can also help you identify paths that are taking too long to load. Simply run the pagely-logwatch command and locate the "Header Status Returns: longer than two second load, sorted by most up top" section.
Reviewing Bot Activity
If you're attempting to identify bot activity, you can use the following command to filter out standard Windows, Mac, and Linux clients.
Note: Since this works by filtering on the user agent string, it won't be 100% reliable. While it will usually get you the information that you need, user agent strings are set by the incoming request, so a bot can still present itself as a normal desktop visitor.
pagely-logwatch "example.com | grep -v 'Windows' | grep -v 'Mac' | grep -v 'Linux'"
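The three chained grep -v calls can also be written as a single extended-regex exclusion, which is easier to extend with more terms. The sketch below checks that both forms keep the same lines, using hypothetical sample lines and made-up helper names. Whether the single-grep form passes cleanly through a pagely-logwatch filter expression is an assumption you should verify on your own server.

```shell
# Hypothetical sample lines with user agent strings in the last quoted field.
sample_log() {
  cat <<'EOF'
example.com 203.0.113.5 - - [19/Mar/2020:14:02:11 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
example.com 203.0.113.6 - - [19/Mar/2020:14:03:40 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15)"
example.com 203.0.113.7 - - [19/Mar/2020:14:05:59 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"
example.com 203.0.113.8 - - [19/Mar/2020:14:06:10 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
EOF
}

# Original form: three exclusions chained together.
chained() { grep -v 'Windows' | grep -v 'Mac' | grep -v 'Linux'; }

# Equivalent form: one exclusion with an extended-regex alternation.
combined() { grep -Ev 'Windows|Mac|Linux'; }

sample_log | chained    # leaves only the ExampleBot line
sample_log | combined   # same result
```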