path: root/content/posts/scraping-now-albums.md
author: breadcat, 2020-06-19 12:23:15 +0100
committer: breadcat, 2020-06-19 12:23:15 +0100
commit: 70bb5d5a801428b0fb390abf79f19ffcf5e29c67
tree: b9fd7990156bd58bc38d58f91829c05933215102 /content/posts/scraping-now-albums.md
parent: 0f9a31348079c0a061bcc194912e75cc1c07bc1f
Simple migration of existing posts to hugo format
Diffstat (limited to 'content/posts/scraping-now-albums.md')
-rw-r--r-- content/posts/scraping-now-albums.md | 37
1 files changed, 37 insertions, 0 deletions
diff --git a/content/posts/scraping-now-albums.md b/content/posts/scraping-now-albums.md
new file mode 100644
index 0000000..f3eff3b
--- /dev/null
+++ b/content/posts/scraping-now-albums.md
@@ -0,0 +1,37 @@
+---
+title: "Scraping and Grabbing Now! albums"
+date: 2018-12-04T16:28:00
+tags: ["guides", "linux", "lists", "music", "servers", "snippets", "software"]
+---
+
+Recently a colleague at work asked me to download an album for them. Unfortunately, as it was a compilation album whose individual tracks had already been released a million times, it wasn't going to turn up through the usual channels.
+
+No matter though, vague scripting to the rescue! The tracklist that I was after was available on the [Now website](https://www.nowmusic.com/album/now-rock-n-roll/), which could be scraped without any issues.
+
+```
+source=$(wget -qO- 'https://www.nowmusic.com/album/now-rock-n-roll/')
+artists=$(printf '%s\n' "$source" | grep artist | sed 's/^.*>\([^<]*\)<.*$/\1/')
+titles=$(printf '%s\n' "$source" | grep '"title"' | sed 's/^.*>\([^<]*\)<.*$/\1/')
+paste <(printf '%s\n' "$artists") <(printf '%s\n' "$titles") | sed -e 's/\t/ - /g' > parse_list.txt
+```
+
+Now we have all 73 tracks in a single text file, no fuss, no muss.
+
+All of these tracks are incredibly likely to have been uploaded to YouTube, so we can grab them using the ever-excellent `youtube-dl`.
+
+To manage this, we'll run a YouTube search for every entry and grab the first result, converting it to `mp3` along the way.
+
+```
+while IFS= read -r line; do youtube-dl -x --audio-format=mp3 "ytsearch:$line lyrics"; done < parse_list.txt
+```
+
+Please note, I append a " lyrics" in the search string to avoid too obvious music videos that sometimes have
+
+With this, we have 73 `mp3` files dumped into our working directory with messy filenames. I usually throw these into `beets` in singleton mode via docker to improve the quality of the filenames/tags.
+
+```
+docker run -it -v "$(pwd)":/music linuxserver/beets bash
+beet im -s /music
+```
+
+This will take some time, and will need a lot of nannying as there are no existing tags to work with initially. Once the process finishes, however, you'll be rewarded with tagged files ready to (rock 'n) roll.
\ No newline at end of file
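
As an aside on the post's pairing and looping steps, here is a minimal sketch that runs offline, and it is not the author's exact method. The artist and title values are placeholders rather than entries from the real tracklist, `workdir` and `tab` are names introduced only for this illustration, and `echo` stands in for the `youtube-dl` call. It uses temporary files instead of process substitution so that it also runs under a plain POSIX `sh`.

```shell
#!/bin/sh
# Placeholder data standing in for the scraped lists (hypothetical values).
artists='Artist One
Artist Two'
titles='First Song
100% Second Song'

# Hypothetical scratch directory so the sketch leaves the working directory alone.
workdir=$(mktemp -d)

# printf '%s\n' prints the data literally, so a "%" in a title is not
# mistaken for a format specifier.
printf '%s\n' "$artists" > "$workdir/artists.txt"
printf '%s\n' "$titles" > "$workdir/titles.txt"

# paste joins the files line by line with a tab, which sed then turns
# into " - " to give "Artist - Title".
tab=$(printf '\t')
paste "$workdir/artists.txt" "$workdir/titles.txt" \
  | sed "s/$tab/ - /" > "$workdir/parse_list.txt"

# IFS= keeps surrounding whitespace and -r keeps backslashes, so each
# entry reaches the search string unchanged.
while IFS= read -r line; do
  echo "would search for: ytsearch:$line lyrics"
done < "$workdir/parse_list.txt"
```

Swapping the `echo` line for the `youtube-dl` command from the post should give the same behaviour as the original loop, with the entries passed through untouched.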