Diffstat (limited to 'content/posts/automating-paperwork-payslips.md')
-rw-r--r-- | content/posts/automating-paperwork-payslips.md | 84 |
1 files changed, 84 insertions, 0 deletions
diff --git a/content/posts/automating-paperwork-payslips.md b/content/posts/automating-paperwork-payslips.md
new file mode 100644
index 0000000..cf3e55d
--- /dev/null
+++ b/content/posts/automating-paperwork-payslips.md
@@ -0,0 +1,84 @@
+---
+title: "Automating Grabbing Payslips For Use With Paperless"
+date: 2019-02-04T11:47:00
+tags: ["formats", "guides", "linux", "servers", "software"]
+---
+
+My workplace has recently started sending out payslips as email attachments instead of the usual physical sheet, which I'm a big fan of. [Paperless](https://github.com/the-paperless-project/paperless) is always on hand to sort and process any paperwork I have, which keeps things organised and under control.
+
+To tie all these processes together, we're going to use `getmail`, `mpack` and `qpdf`.
+
+Please note that this will download your **entire inbox** every time, so it helps if you don't run the script too often and keep your inbox size at a manageable level.
+
+Firstly, we'll need to specify a number of variables for use later in the script:
+```
+email_sender="payslipsender@address"
+email_username="youremailaddress"
+email_password="youremailpassword"
+payslip_password="yourpdfpassword"
+payslip_pattern=Payslip
+payslip_filetype=pdf
+import_directory="$HOME/import"
+temp_directory="$(mktemp -d)"
+```
+
+We'll change to the temporary directory and make the directories that `getmail` expects to be there:
+```
+cd "$temp_directory" || exit
+mkdir {cur,new,tmp}
+```
+
+As I don't really want to keep a copy of my whole inbox around for no good reason, I dump my email to a temporary directory and write my `getmail` config file into this directory with a heredoc.
+Here I'm using IMAP with SSL, but getmail supports [a number of different methods of grabbing mail](http://pyropus.ca/software/getmail/configuration.html#conf-retriever):
+
+```
+cat << EOF > getmailrc
+[retriever]
+type = SimpleIMAPSSLRetriever
+server = your.imap.server
+username = $email_username
+port = 993
+password = $email_password
+
+[destination]
+type = Maildir
+path = $temp_directory/
+EOF
+```
+
+Then run `getmail` using the temporary directory as your working directory:
+```
+getmail --getmaildir "$temp_directory"
+```
+
+Change directory to our newly saved items, then extract all attachments from messages that mention the payslip sender. Lastly, move the attachments that match our search pattern in the variables above to the Paperless import directory.
+```
+cd new || exit
+grep "$email_sender" ./* | cut -f1 -d: | uniq | xargs munpack -f
+mv "$payslip_pattern"*"$payslip_filetype" "$import_directory"
+```
+
+Now Paperless won't work on these files unless they're decrypted, which we can do as follows:
+```
+cd "$import_directory" || exit
+for i in $payslip_pattern*$payslip_filetype; do
+    fileProtected=0
+    qpdf "$i" --check || fileProtected=1
+    if [ $fileProtected == 1 ]; then
+        qpdf --password="$payslip_password" --decrypt "$i" "decrypt-$i" && rm "$i"
+    fi
+done
+```
+
+Now we have a directory full of unencrypted files for Paperless to work with. Last but not least, we'll need to delete the temporary directory we used:
+```
+rm -r "$temp_directory"
+```
+
+Lastly, all you need to do is set up the above script as a cron job to run after pay day! The cron line I'm using is as follows:
+```
+0 0 2 * * $HOME/path/to/script/payslip.sh &
+```
+
+* **Edit 2019-02-27:** `pdftk` replaced with `qpdf` as it required Java, which pulls down ~200MB of dependencies.
+* **Edit 2019-08-09:** Added cron section.
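
A possible hardening of the cleanup step, sketched under some assumptions: if the script exits early (for example when one of the `cd ... || exit` guards fires), the final `rm -r` never runs, and the downloaded mail plus the `getmailrc` holding the email password stay on disk. The helper `with_temp_maildir` below is hypothetical and not part of the script in this commit. It runs a command inside a fresh temporary Maildir and removes the directory through an `EXIT` trap, however the command ends.

```shell
# Hypothetical helper (not in the original script): run a command inside a
# fresh Maildir-style temporary directory and delete that directory on every
# exit path of the subshell, not only on success.
with_temp_maildir() {
    (
        temp_directory="$(mktemp -d)" || exit 1
        # The EXIT trap fires when this subshell ends, whatever the reason.
        trap 'rm -rf "$temp_directory"' EXIT
        # Create the cur/new/tmp layout that a Maildir destination expects.
        mkdir "$temp_directory/cur" "$temp_directory/new" "$temp_directory/tmp" || exit 1
        cd "$temp_directory" || exit 1
        # Run the caller's command and pass its exit status back out.
        "$@"
        status=$?
        exit "$status"
    )
}
```

One way to use it would be to wrap the fetch and extract steps in a function of your own (say `fetch_payslips`, also a made-up name) and call `with_temp_maildir fetch_payslips`, which would make the explicit `rm -r "$temp_directory"` at the end unnecessary.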