---
title: "Disabling cgit scraping logs"
date: 2021-11-03T10:47:00
tags: ["Docker", "Linux", "Servers", "Snippets", "Software"]
---

Every couple of months my server runs out of disk space. The cause is usually the docker directory blowing up in size to a few gigabytes, which on my small VPS can really start to cause issues.

You can find the offending directories using the excellent `ncdu` via:
```
sudo ncdu /var/lib/docker
```

When you have the ID of the offending directory (e.g. `/var/lib/docker/overlay2/6fe41495127cc92398107df951416ec27463fd4ff6525a7d227bcf0c4e63803a`) you can find the corresponding container via:
```
for i in $(docker ps -a | awk '{if (NR!=1) {print $NF}}')
do
    if docker inspect "$i" | grep -q 6fe41495127cc92398107df951416ec27463fd4ff6525a7d227bcf0c4e63803a
    then
        echo "$i"
    fi
done
```

With the offender found, you can start a shell in this container and browse to the files (in my case, `/var/log`):
```
docker exec -it cgit sh
cd /var/log/httpd/
ls -lah
```

Here I have a one-gigabyte `error_log` and a hundred-megabyte `access_log`. Having a look at the files with `tail -f`, these logs are mainly caused by bots scraping diffs.

Now let's get these disabled. There's a `robots=index, nofollow` option in `/etc/cgitrc` that can be changed to `robots=none`. To stop this option being reset when the container restarts, we'll mount this file from the host filesystem. Below are the relevant lines from my `docker-compose.yml` file:
```
volumes:
  - $CONFDIR/cgit/cgitrc:/etc/cgitrc
```

As an added bonus, we can fix a long-standing issue with this container where code that should be highlighted is just blank. The line in question is `source-filter=/opt/highlight.sh`. Comment out or remove this line and you'll have code previews working as expected.
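
Both `cgitrc` changes can also be scripted. Here's a rough sketch, assuming you've already copied the file out of the container to the host path used in the volume mount; `patch_cgitrc` is a made-up helper name, and the `docker cp` line in the comment is just one way of getting that initial copy:

```shell
# Hypothetical helper: patch a host-side copy of cgitrc in place.
# Assumes the file was first copied out of the container, e.g.:
#   docker cp cgit:/etc/cgitrc "$CONFDIR/cgit/cgitrc"
patch_cgitrc() {
    file="$1"
    # First expression: replace whatever robots= value is set with "none".
    # Second expression: comment out the highlight source filter.
    # -i.bak edits in place and keeps the original as "$file.bak".
    sed -i.bak \
        -e 's/^robots=.*/robots=none/' \
        -e 's|^source-filter=/opt/highlight.sh|#&|' \
        "$file"
}
```

Calling `patch_cgitrc "$CONFDIR/cgit/cgitrc"` and then recreating the container should pick up the edited file.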

Unfortunately, even with the above in place, the logs immediately start filling up again with bot user agents. Time for a more janky solution! Logging in this container is controlled by the `/etc/httpd/conf/httpd.conf` file, so again we're going to mount this from the host filesystem with a `docker-compose` declaration:

```
volumes:
  - $CONFDIR/cgit/cgitrc:/etc/cgitrc
  - $CONFDIR/cgit/httpd.conf:/etc/httpd/conf/httpd.conf
```

With this file on our host filesystem, we can now edit it. The offending lines are as follows:
```
ErrorLog "logs/error_log"
LogLevel warn

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog "logs/access_log" combined
```

All you need to do is prefix every line except `ErrorLog` with a `#` symbol, then change the `ErrorLog` location to `/dev/null`.

With that drastic and janky change in place, restart the container and you should notice that no more logs are being created.
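
If you'd rather not edit `httpd.conf` by hand, the same change can be sketched with `sed`. This assumes a host-side copy of the file already exists at the mounted path; `silence_httpd_logs` is a hypothetical helper name, not something shipped with the container:

```shell
# Hypothetical helper: disable Apache logging in a host-side httpd.conf.
silence_httpd_logs() {
    conf="$1"
    # Comment out active LogLevel, LogFormat and CustomLog directives
    # (lines that are already commented do not match), then point
    # ErrorLog at /dev/null. -i.bak keeps a backup as "$conf.bak".
    sed -i.bak \
        -e 's/^[[:space:]]*LogLevel[[:space:]]/#&/' \
        -e 's/^[[:space:]]*LogFormat[[:space:]]/#&/' \
        -e 's/^[[:space:]]*CustomLog[[:space:]]/#&/' \
        -e 's|^\([[:space:]]*\)ErrorLog[[:space:]].*|\1ErrorLog "/dev/null"|' \
        "$conf"
}
```

Run it as `silence_httpd_logs "$CONFDIR/cgit/httpd.conf"` before restarting the container. The backup copy makes it easy to diff the result or undo the change later.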