goaccess(1) | User Manuals | goaccess(1) |
goaccess - fast web log analyzer and interactive viewer.
goaccess [filename] [options...] [-c][-M][-H][-q][-d][...]
GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through your browser.
It provides fast and valuable HTTP statistics for system administrators who require a visual server report on the fly.
GoAccess parses the specified web log file and outputs the data to the X terminal. Features include:
Expanding the panel can display more information such as host's reverse DNS lookup result, country of origin and city. If the -a argument is enabled, a list of user agents can be displayed by selecting the desired IP address, and then pressing ENTER.
NOTE: Optionally and if configured, all panels can display the average time taken to serve the request.
There are three storage options that can be used with GoAccess. Choosing one will depend on your environment and needs.
Multiple options can be used to configure GoAccess. For a complete up-to-date list of configure options, run ./configure --help
The following options can be supplied to the command or specified in the configuration file. If specified in the configuration file, long options need to be used without prepending -- and without using the equal sign =.
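The rule above for writing a long option in the configuration file can be illustrated with a small Python sketch. The helper name to_config_line is hypothetical and not part of GoAccess. It only handles options written as --name=value and makes no assumption about how options without a value are written in the configuration file.

```python
def to_config_line(cli_option: str) -> str:
    """Turn '--name=value' into the 'name value' form described above.

    Hypothetical helper for illustration only; not part of GoAccess.
    """
    if not cli_option.startswith("--") or "=" not in cli_option:
        raise ValueError("expected a long option of the form --name=value")
    # Drop the leading '--' and split on the first '=' only, so that
    # values containing '=' are kept intact.
    name, value = cli_option[2:].split("=", 1)
    return f"{name} {value}"


print(to_config_line("--addr=0.0.0.0"))
```

Under these assumptions, the command-line form --addr=0.0.0.0 becomes the configuration line "addr 0.0.0.0".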
Note that if there are spaces within the format, the string needs to be enclosed in single/double quotes. Inner quotes need to be escaped.
Color Syntax
DEFINITION space/tab colorFG#:colorBG# [attributes,PANEL]
FG# = foreground color [-1...255] (-1 = default term color)
BG# = background color [-1...255] (-1 = default term color)
Optionally, it is possible to apply color attributes (multiple attributes are comma separated), such as: bold, underline, normal, reverse, blink
If desired, it is possible to apply custom colors per panel, that is, a metric in the REQUESTS panel can be of color A, while the same metric in the BROWSERS panel can be of color B.
Available color definitions:
COLOR_MTRC_HITS
COLOR_MTRC_VISITORS
COLOR_MTRC_DATA
COLOR_MTRC_BW
COLOR_MTRC_AVGTS
COLOR_MTRC_CUMTS
COLOR_MTRC_MAXTS
COLOR_MTRC_PROT
COLOR_MTRC_MTHD
COLOR_MTRC_HITS_PERC
COLOR_MTRC_HITS_PERC_MAX
COLOR_MTRC_VISITORS_PERC
COLOR_MTRC_VISITORS_PERC_MAX
COLOR_PANEL_COLS
COLOR_BARS
COLOR_ERROR
COLOR_SELECTED
COLOR_PANEL_ACTIVE
COLOR_PANEL_HEADER
COLOR_PANEL_DESC
COLOR_OVERALL_LBLS
COLOR_OVERALL_VALS
COLOR_OVERALL_PATH
COLOR_ACTIVE_LABEL
COLOR_BG
COLOR_DEFAULT
COLOR_PROGRESS
See configuration file for a sample color scheme.
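As a rough illustration of the color syntax described above, the following Python sketch splits one definition into its parts and checks the color range. It is not GoAccess's own parser. The function name, the returned dictionary, and the choice to treat any trailing token that is not a known attribute as the panel name are assumptions made for this example.

```python
import re

# Attributes listed in the Color Syntax section above.
KNOWN_ATTRIBUTES = {"bold", "underline", "normal", "reverse", "blink"}
COLOR_PAIR = re.compile(r"^color(-?\d+):color(-?\d+)$")


def parse_color_definition(line):
    """Parse 'DEFINITION colorFG#:colorBG# [attributes,PANEL]' (sketch only)."""
    parts = line.split()  # splits on spaces or tabs
    if len(parts) < 2:
        raise ValueError("expected DEFINITION and colorFG#:colorBG#")
    definition, pair = parts[0], parts[1]
    match = COLOR_PAIR.match(pair)
    if match is None:
        raise ValueError("expected colorFG#:colorBG#")
    fg, bg = int(match.group(1)), int(match.group(2))
    for value in (fg, bg):
        if not -1 <= value <= 255:
            raise ValueError("color numbers must be in the range -1..255")
    attributes, panel = [], None
    # Assumption: remaining tokens are comma or space separated; anything
    # that is not a known attribute is taken to be the panel name.
    for token in ",".join(parts[2:]).split(","):
        if not token:
            continue
        if token.lower() in KNOWN_ATTRIBUTES:
            attributes.append(token.lower())
        else:
            panel = token
    return {"definition": definition, "fg": fg, "bg": bg,
            "attributes": attributes, "panel": panel}


print(parse_color_definition("COLOR_MTRC_HITS color110:color-1 bold,REQUESTS"))
```

The sample line gives the hits metric foreground color 110 on the default terminal background, in bold, for the REQUESTS panel only.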
--html-prefs='{"theme":"bright","perPage":5,"layout":"horizontal","showTables":true,"visitors":{"plot":{"chartType":"bar"}}}'
Note: This is just a WebSocket server that provides the raw real-time data. It is not a web server itself. To access your report's HTML file, you will still need your own HTTP server: place the generated report in its document root directory and open the HTML file in your browser. The browser will then open another WebSocket connection to the WebSocket server you may set up here, to keep the dashboard up to date.
Only if configured using --with-openssl
Only if configured using --with-openssl
Bits-hidden | Level 1 | Level 2 | Level 3 |
IPv4 | 8 | 16 | 24 |
IPv6 | 64 | 80 | 96 |
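Assuming the table above gives the number of low-order address bits that are zeroed at each level, the effect can be sketched in Python with the standard ipaddress module. The function name anonymize and the BITS_HIDDEN mapping are illustrative and not part of GoAccess.

```python
import ipaddress

# Bits hidden per level, copied from the table above.
BITS_HIDDEN = {
    4: {1: 8, 2: 16, 3: 24},
    6: {1: 64, 2: 80, 3: 96},
}


def anonymize(ip, level=1):
    """Zero the low-order bits of an address for the given level (sketch)."""
    addr = ipaddress.ip_address(ip)
    hidden = BITS_HIDDEN[addr.version][level]
    # Keep only the leading (total bits - hidden bits) of the address.
    prefix = addr.max_prefixlen - hidden
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize("192.168.20.100", 1))
print(anonymize("2001:db8:85a3:8d3:1319:8a2e:370:7348", 1))
```

With level 1, the IPv4 address keeps its first three octets and the IPv6 address keeps its first 64 bits.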
req
Only ignore request from valid requests
panels
Ignore request from panels.
Note that it will count them towards the total number of requests
If using GeoIP2, you will need to download the GeoLite2 City or Country database from MaxMind.com and use the option --geoip-database to specify the database. You can also get updated database files for GeoIP legacy; these are available as GeoLite Legacy Databases from MaxMind.com. IPv4 and IPv6 files are supported as well. For updated DB URLs, please see the default GoAccess configuration file.
Note: --geoip-city-data is an alias of --geoip-database.
GoAccess can parse virtually any web log format.
Predefined options include Common Log Format (CLF), Combined Log Format (XLF/ELF), including virtual host, Amazon CloudFront (Download Distribution), Google Cloud Storage and W3C format (IIS).
GoAccess allows any custom format string as well.
There are two ways to configure the log format. The easiest is to run GoAccess with -c to prompt a configuration window. Otherwise, it can be configured under ~/.goaccessrc or the %sysconfdir%.
Note: If the query string is in %U, there is no need to use %q. However, if the URL path does not include any query string, you may use %q and the query string will be appended to the request.
It uses a special specifier which consists of a tilde before the host specifier, followed by the character(s) that delimit the XFF field, which are enclosed by curly braces, i.e., "~h{, }"
For example, "~h{, }" is used in order to parse "11.25.11.53, 17.68.33.17" field which is delimited by a comma and a space (enclosed by double quotes).
XFF field | specifier |
"192.1.2.3, 192.68.33.17, 192.1.1.2" | "~h{, }" |
"192.1.2.12", "192.68.33.17" | ~h{", } |
192.1.2.12, 192.68.33.17 | ~h{, } |
192.1.2.14 192.68.33.17 192.1.1.2 | ~h{ } |
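The rows of the table above can be mimicked with a short Python sketch that treats every character inside the curly braces as a delimiter. It assumes that the first token that is a valid IPv4 or IPv6 address is the one taken as the host, which the text above does not state. The function name first_client_ip is hypothetical.

```python
import ipaddress


def first_client_ip(field, delims):
    """Return the first valid IP in an XFF field, or None (sketch only).

    `delims` holds the characters that appear between the curly braces
    of the specifier, e.g. ', ' for ~h{, }.
    """
    # Treat every delimiter character as whitespace, then split.
    for ch in delims:
        field = field.replace(ch, " ")
    for token in field.split():
        try:
            return str(ipaddress.ip_address(token))
        except ValueError:
            continue  # not an address, try the next token
    return None


print(first_client_ip("192.1.2.12, 192.68.33.17", ", "))
print(first_client_ip('"192.1.2.12", "192.68.33.17"', '", '))
print(first_client_ip("192.1.2.14 192.68.33.17 192.1.1.2", " "))
```

For the first row of the table, where the whole field is enclosed in double quotes, the quotes belong to the surrounding format string, so only the text between them would be passed to this helper.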
Note: In order to get the average, cumulative and maximum time served in GoAccess, you will need to start logging response times in your web server. In Nginx you can add $request_time to your log format, or %D in Apache.
Important: If multiple time served specifiers are used at the same time, the first option specified in the format string will take priority over the other specifiers.
GoAccess requires the following fields:
Note: Piping data into GoAccess won't prompt a log/date/time configuration dialog; you will need to define it beforehand in your configuration file or on the command line.
To output to a terminal and generate an interactive report:
To generate an HTML report:
To generate a JSON report:
To generate a CSV file:
GoAccess also allows great flexibility for real-time filtering and parsing. For instance, to quickly diagnose issues by monitoring logs since GoAccess was started:
And even better, to filter while keeping a pipe open to preserve real-time analysis, we can make use of tail -f and a pattern-matching tool such as grep, awk, sed, etc:
or to parse from the beginning of the file while keeping the pipe open and applying a filter
or to convert the log date timezone to a different timezone, e.g., Europe/Berlin
There are several ways to parse multiple logs with GoAccess. The simplest is to pass multiple log files to the command line:
It's even possible to parse files from a pipe while reading regular files:
Note that the single dash is appended to the command line to let GoAccess know that it should read from the pipe.
Now if we want to add more flexibility to GoAccess, we can do a series of pipes. For instance, if we would like to process all compressed log files access.log.*.gz in addition to the current log file, we can do:
Note: On Mac OS X, use gunzip -c instead of zcat.
GoAccess has the ability to output real-time data in the HTML report. You can even email the HTML file since it is composed of a single file with no external file dependencies, how neat is that!
The process of generating a real-time HTML report is very similar to the process of creating a static report. Only --real-time-html is needed to make it real-time.
By default, GoAccess will use the host name of the generated report. Optionally, you can specify the URL to which the client's browser will connect. See https://goaccess.io/faq for a more detailed example.
By default, GoAccess listens on port 7890. To use a different port, you can specify it as (make sure the port is open):
And to bind the WebSocket server to an address other than 0.0.0.0, you can specify it as:
Note: To output real time data over a TLS/SSL connection, you need to use --ssl-cert=<cert.crt> and --ssl-key=<priv.key>.
Another useful pipe would be filtering dates out of the web log
The following will get all HTTP requests starting on 05/Dec/2010 until the end of the file.
or using relative dates such as yesterday's or tomorrow's date:
If we want to parse only a certain time-frame from DATE a to DATE b, we can do:
If we want to preserve only a certain amount of data and recycle storage, we can keep only a certain number of days. For instance, to keep and show the last 5 days:
Assuming your log contains the virtual host (server blocks) field. For instance:
And you would like to append the virtual host to the request in order to see which virtual host the top URLs belong to:
To exclude a list of virtual hosts you can do the following:
To parse specific pages, e.g., page views, html, htm, php, etc. within a request:
Note: $7 is the request field for the common and combined log formats (without Virtual Host). If your log includes Virtual Host, then you probably want to use $8 instead. It's best to check which field you are shooting for, e.g.:
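To see why the request lands in field 7, or in field 8 once a virtual host is prepended, the following Python sketch splits a log line on whitespace the way awk does by default. The sample log lines and the helper name awk_field are made up for illustration.

```python
def awk_field(line, n):
    """Return the n-th whitespace-separated field (1-based, like awk's $n)."""
    return line.split()[n - 1]


# Made-up combined log format line, without and with a leading virtual host.
combined = ('192.168.0.1 - - [05/Dec/2010:10:15:32 +0000] '
            '"GET /index.html HTTP/1.1" 200 1043 "-" "Mozilla/5.0"')
with_vhost = "example.com " + combined

print(awk_field(combined, 7))    # request path without virtual host
print(awk_field(with_vhost, 8))  # request path with virtual host
print(awk_field(combined, 9))    # status code without virtual host
```

Because the date and the time zone are separated by a space, the bracketed timestamp occupies two fields, which is why the request path is field 7 rather than field 6.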
Or to parse a specific status code, e.g., 500 (Internal Server Error):
Also, it is worth pointing out that if we want to run GoAccess at lower priority, we can run it as:
and if you don't want to install it on your server, you can still run it from your local machine:
Note: SSH requires -n so GoAccess can read from stdin. Also, make sure to use SSH keys for authentication as it won't work if a passphrase is required.
GoAccess has the ability to process logs incrementally through its internal storage and dump its data to disk. It works in the following way:
NOTES
GoAccess keeps track of the inodes of all the files processed (assuming files will stay on the same partition). In addition, it extracts a snippet of data from the log along with the last line parsed of each file and the timestamp of the last line parsed, e.g., inode:29627417|line:20012|ts:20171231235059
First, it checks whether the snippet matches the log being parsed. If it does, it assumes the log hasn't changed dramatically, e.g., hasn't been truncated. If the inode does not match the current file, it parses all lines. If the current file matches the inode, it then reads the remaining lines and updates the count of lines parsed and the timestamp. As an extra precaution, it won't parse log lines with a timestamp ≤ the one stored.
Piped data works based off the timestamp of the last line read. For instance, it will parse and discard all incoming entries until it finds a timestamp >= the one stored.
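The piped-data rule can be sketched in Python as follows. This is an illustration of the behaviour as literally described, not GoAccess's implementation: entries are dropped until one has a timestamp greater than or equal to the stored one, and everything after that point is kept. The function name and the (timestamp, line) tuples are assumptions, with timestamps written as YYYYMMDDhhmmss integers like the ts value shown above.

```python
from itertools import dropwhile


def resume_from_pipe(entries, stored_ts):
    """Discard piped entries until one reaches the stored timestamp (sketch).

    `entries` is an iterable of (timestamp, line) tuples.
    """
    return list(dropwhile(lambda entry: entry[0] < stored_ts, entries))


stored = 20171231235059
incoming = [
    (20171231235058, "already seen"),
    (20171231235059, "same second as the stored timestamp"),
    (20180101000001, "new entry"),
]
for ts, line in resume_from_pipe(incoming, stored):
    print(ts, line)
```

Only the first entry is discarded here, because the second one already has a timestamp equal to the stored value.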
For instance:
then, load it with
To read persisted data only (without parsing new data)
Each active panel has a total of 366 items, or 50 in the real-time HTML report. The number of items is customizable using max-items. Note that HTML, CSV and JSON output allow a maximum number greater than the default value of 366 items per panel.
A hit is a request (line in the access log), e.g., 10 requests = 10 hits. HTTP requests with the same IP, date, and user agent are considered a unique visit.
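The difference between hits and unique visits can be shown with a minimal Python sketch that follows the definition above: every request counts as a hit, and requests sharing the same IP, date, and user agent collapse into one visit. The function name and the sample tuples are made up for illustration.

```python
def count_hits_and_visitors(requests):
    """Count hits and unique visits from (ip, date, user_agent) tuples."""
    hits = 0
    visitors = set()
    for ip, date, agent in requests:
        hits += 1                        # every log line is one hit
        visitors.add((ip, date, agent))  # same triple counts once
    return hits, len(visitors)


sample = [
    ("10.0.0.1", "05/Dec/2010", "Firefox"),
    ("10.0.0.1", "05/Dec/2010", "Firefox"),  # same visit, second hit
    ("10.0.0.1", "06/Dec/2010", "Firefox"),  # new day, new visit
    ("10.0.0.2", "05/Dec/2010", "Chrome"),
]
print(count_hits_and_visitors(sample))
```

The four sample requests produce four hits but only three unique visits.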
If you want to enable dual-stack support, please use --addr=:: instead of the default --addr=0.0.0.0.
The generated report will attempt to reconnect to the WebSocket server after 1 second with exponential backoff. It will attempt to connect 20 times.
If you think you have found a bug, please send me an email at goaccess@prosoftcorp.com or use the issue tracker at https://github.com/allinurl/goaccess/issues
Gerardo Orellana <hello@goaccess.io>
For more details about it, or new releases, please visit https://goaccess.io
SEPTEMBER 2023 | GNU+Linux |