Website analytics with Goaccess

I’ve been looking into what’s available in the world of web analytics. If you aren’t familiar with the term, it’s key information for answering questions such as

There’s just no getting around the fact that the giant in this area is Google Analytics. It’s easy to see why, just add some javascript to your website and Google will take care of the rest. Particularly good for those using shared hosting accounts and don’t have access to log files. And of course this helps Google track everyone on the internet.

But what if you don’t want to help Google track and record people for advertising purposes? What if you want LESS Google in your life? That’s where Goaccess comes in. Goaccess is a fast open source program that provides website metrics in a useful and organized manner. One stand out feature I really like is the ability for Goaccess to provide information in a terminal or as a static website. The static website is admittedly nicer to look at, but being able to SSH into a server to get immediate information is pretty slick. I should mention Goaccess also provides a websocket server as a third way to provide information. I haven’t seen a reason to experiment with this option - my websites don’t warrant millisecond level real time updates.

To set Goaccess up you first need to make sure your web server is set up to provide access logs. In my case I’m using Caddy so my Caddyfile looks something like this:

example.com {
    tls pretend@example.com #email address
    root /var/www/html #place where your website files are located
    log / /var/www/logs/examplewebsite_access.log "{combined}"

The log line is the important section. The “/” tells Caddy you want to log the entire example.com website. The /var/www/logs/examplewebsite_access.log tells Caddy where to store the access file and what to name it. Caddy has two logging formats, common (the default) and combined. Combined is common plus two additional bits of information. So in this Caddyfile we tell Caddy to use the combined (full) logging format. For more information on logging consult the Caddy docs page.

I would also like to point out that Victor Augusteo has instructions to combine Caddy & Goaccess using Docker. I’ve never gotten into using Docker, but for those who have, Victor’s approach might be preferable.

Now that Caddy is all set up it’s time to focus on Goaccess. I have my main computer (Ubuntu 19.04) and my server (Ubuntu 18.04 lts). A quick look shows that the current version of Goaccess is 1.3, but both of my computers only have 1.2 in their repositories. I don’t want to use Docker. The Official Goaccess repository for Debian/Ubuntu brings up errors. Well, time to roll up our sleeves and just compile it. The Goaccess website has some nice clear instructions for compiling that I’m going to stick close to - with one important change at the end.

First the prep work:

Ubuntu 19.04

sudo apt install build-essential libncurses-dev libgeoip-dev checkinstall

Ubuntu 18.04 lts

sudo apt install build-essential libncursesw5-dev libgeoip-dev checkinstall

Important notes: If you plan on using Goaccess web server functionality you may want to add the optional OpenSSL dependency. In addition, I couldn’t get the MaxMind GeoLocation support to work on either 18.04 or 19.04 so I’m sticking with the legacy GeoLocation support (libgeoip-dev).

You may have noticed that checkinstall is listed for both Ubuntu 19.04 and 18.04 but checkinstall doesn’t appear anywhere on the Goaccess website. When you install software the old fashioned way (./configure, make, make install) it doesn’t play well with the rest of your software. You are usually lucky if the programmer included a make uninstall. Package management systems don’t play well with make, make install methods. You might overwrite important files, OR the package manager could overwrite key files to software you installed with make. Checkinstall exists to solve that problem, it integrates your compliled program into your existing package manager (.deb or .rpm) and lets you cleanly remove or upgrade the program. Bottom line: if you are using a .deb system (Ubuntu, Debian, Linux Mint, etc.) or .rpm system (Red Hat) then ALWAYS use checkinstall instead of using make install.

Now to download, compile, and install

wget https://tar.goaccess.io/goaccess-1.3.tar.gz
tar -xzvf goaccess-1.3.tar.gz
cd goaccess-1.3/
./configure --enable-utf8 --enable-geoip=legacy
sudo checkinstall

Because checkinstall was used instead of “make install” Goaccess has been turned into a .deb file and can be handled like any other .deb package. If you want to remove Goaccess at some point you can now easily do so with

sudo apt remove goaccess

and Goaccess will be cleanly removed like any other .deb file.

UPDATE Apt has been pestering me to “upgrade” to the older 1.2 version. The fix for this is to select option #3 (specify a version number) during checkinstall and use 1:1.3 instead of the 1.3 provided default.

Now that Goaccess has been compiled and installed:

To use it through the terminal

goaccess /var/www/logs/examplewebsite_access.log -c

When the program first starts up it will ask what log format to use. You will want the NCSA Combined Log Format.

To create a static webpage

goaccess /var/www/logs/examplewebsite_access.log -o report.html –log-format=COMBINED

There is one particularly nice feature that is listed on the github page that I don’t think is listed prominently enough. If you have Goaccess installed on your local computer and have multiple computers as web servers you can use SSH to get web analytics WITHOUT needing to install Goaccess on each computer.

ssh your_username@example.com ‘cat /path_to_your_logs/examplewebsite_access.log’ | goaccess -c –log-format=COMBINED -

This example does require that you log into computers with SSH by using key authentication and not the less secure password login method.