Friday, April 16, 2010

Scripts to add replaygain tags to a flac collection

If, like me, you like to be able to play your music without constantly having to lunge for the volume control, and you store the master copy of your (ripped) CD collection in FLAC format, you might be interested in a quick way to (re)calculate all the ReplayGain tags on your music. I'll also go into how to automatically create smaller MP3 copies of the files with the ReplayGain bias "burned in" so stupid players still get the volume right.
Finally, there's another little script here that can be used to fix up issues like inconsistent artist naming. It only does "The X" -> "X, The" at the moment, but it's easy enough to extend.
Read on if you're happy with shell scripting (and are a Linux or advanced Mac OS X user).

The following shell snippet will scan the target directory, in this case $HOME/flac, and calculate ReplayGain for all .flac files within it. It assumes that each directory containing any flac files may be treated as an album for purposes of ReplayGain tagging. None of the other track metadata is touched and the audio stream is not altered in any way.
You'll need the flac tools installed since metaflac is what we'll use to calculate and apply replaygain tags. On an Ubuntu system the flac tools are in the (surprise!) flac package.
Copy this to
if test $# -lt 1 ; then
        echo "Usage: $0 target-dir [target-dir [target-dir [ ... ] ] ]"
        exit 1
fi
# Treat every directory that contains .flac files as one album.
find "$@" -type d -execdir \
 bash -c 'echo "Scanning `pwd`/$1"; if ls "$1"/*.flac >&/dev/null ; then metaflac --add-replay-gain "$1"/*.flac; fi' _ {} \;
Then invoke it with sh, passing the directory you want to recursively retag as the argument, for example $HOME/flac. You may specify more than one target directory to scan.
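Before letting metaflac loose, it can be reassuring to see which directories will be treated as albums. Here is a minimal sketch of such a dry run. The function name list_album_dirs is made up for this example, nothing is written to your files, and it assumes no path contains a newline.

```shell
#!/bin/sh
# list_album_dirs is a hypothetical helper, not part of the flac tools.
# Print each directory under the given targets that directly contains
# at least one .flac file, one per line, without duplicates.
list_album_dirs() {
        find "$@" -type f -name '*.flac' | while IFS= read -r f ; do
                # Strip the file name to leave the containing directory
                printf '%s\n' "${f%/*}"
        done | sort -u
}

# Example: list_album_dirs "$HOME/flac"
```

Each line of output is a directory that the tagging script would hand to metaflac as a single album.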
Once you've tagged your flac files with track and album level ReplayGain information you can play music from different publishers, in different styles, and of different ages without some of it being almost inaudible and other tracks mind blowingly loud. Your player must support ReplayGain, of course.
Rhythmbox (if you use it) doesn't apply ReplayGain by default, but does support it. To enable:
gconftool-2 --type bool --set /apps/rhythmbox/use_replaygain 1
No idea about iTunes and friends - can they even play flac?
Handily, you can also bulk convert your flac formatted collection to other formats (say, mid-quality MP3 or vorbis) for something like a portable player or a laptop with a small disk and in the process "burn in" the volume bias indicated by the ReplayGain tagging. That way players that don't understand per-track preamp, replaygain, etc will still produce reasonable volume levels without you having to re-encode your tracks from a lossy source and get the usual horrid results.
The lame mp3 encoder automatically calculates replaygain, but it doesn't handle per-album gain. You can instruct it not to apply replaygain and instead apply the preamp recorded in the flac metadata with the undocumented flac decoder option --apply-replaygain-which-is-not-lossless. There may be better ways; ideally you could specify an encoding preamp to the mp3 encoder.
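To make "burning in" concrete: a ReplayGain value is a gain in decibels, and applying it amounts to multiplying every sample by 10^(dB/20). The sketch below only does that arithmetic. db_to_scale is an illustrative name, not something flac or lame provide, and flac itself does the real work during decoding.

```shell
#!/bin/sh
# db_to_scale is an illustrative helper, not part of flac or lame.
# Convert a ReplayGain value such as "-6.02 dB" into the linear factor
# each sample would be multiplied by: scale = 10^(dB/20).
db_to_scale() {
        # "gain + 0" makes awk use the leading number and ignore the " dB" suffix
        awk -v gain="$1" 'BEGIN { printf "%.3f\n", exp(log(10) * (gain + 0) / 20) }'
}

# Example: db_to_scale "-6.02 dB"
# (a gain of about -6 dB roughly halves the amplitude)
```

A negative gain gives a factor below 1 (quieter), a positive gain a factor above 1 (louder), and 0 dB leaves the audio unchanged.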
This script will scan the directory flac in the current directory and write MP3 files to the mp3 directory under the current directory. Output will be medium/high quality VBR MP3 with gain burned in to the audio stream and proper ID3 tags included. You will need the lame MP3 encoder installed to use this script. You can run one instance of the script per CPU core for best performance, as the instances coordinate through lock files to avoid stepping on each other's toes or duplicating work. The script will skip files that already exist and are up to date in the target directory, so you can re-run it after ripping your latest CD acquisitions and it'll only encode the new files.
Of course, only a crazy person would run this without first making backups of their music collection. Then again, you'd have backups already, right? Right? Sigh.
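Running one instance per core, as suggested above, can be done with a small loop. This is only a sketch: run_instances is a hypothetical helper, and ./your-script stands in for whatever name you save the conversion script under.

```shell
#!/bin/sh
# run_instances is a hypothetical helper for this example.
# Start N copies of a command in the background and wait for all of them.
# Usage: run_instances N command [args...]
run_instances() {
        n="$1"
        shift
        i=0
        while [ "$i" -lt "$n" ] ; do
                "$@" &
                i=$((i + 1))
        done
        # Block until every background copy has finished
        wait
}

# Example, one instance per core on Linux (nproc is part of GNU coreutils):
#   run_instances "$(nproc)" ./your-script flac
```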
Save to and chmod a+x

#!/bin/bash
if test "$#" -lt 1 ; then
        echo "Usage: $0 target-directory"
        exit 1
fi

function gettag() {
        # Print the value of tag $1 from the exported tag file
        awk -F = /^${1}=/' { print $2; }' /tmp/tags-$$
}

function rmlock() {
        # Remove our lock file on exit or when interrupted
        trap '' EXIT
        rm -f "${lockfn}"
        exit 2
}

if test "$1" = "convert" ; then
        # Called by self to convert a track. $2 is source track name,
        # $3 is source dir prefix, and $4 is target dir prefix.
        # The track metadata is extracted, then the track is decoded with replaygain
        # bias applied during decoding. High quality VBR MP3 encoding is then done
        # on the adjusted audio stream. The end result is a decent quality MP3
        # that will play at a sane volume even on players that don't understand
        # ReplayGain MP3 headers/tags.
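        # Derive the input and output names (assumed: swap the source prefix
        # for the target prefix and .flac for .mp3).
        ifn="$2"
        ofn="$4${ifn#"$3"}"
        ofn="${ofn%.flac}.mp3"
        if ! test -e "${ofn}" || test "${ifn}" -nt "${ofn}" ; then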
                # The file is either absent in the target directory or is older
                # than the source file.
                # Check to see if another instance of the script is currently encoding
                # this file, and if not, start encoding it ourselves. There's a race here,
                # in that another instance may start work on the file between when we check
                # and when we start work, but we don't actually care as the worst outcome
                # is a little wasted CPU time.
                # We'll use a lock file in the target directory to indicate activity.
                odir="`dirname "${ofn}"`"
                lockfn="${odir}/.`basename "${ofn}"`.lock"
                if test -e "${lockfn}" 2>/dev/null ; then
                        # Appears locked by another instance. Is it still alive?
                        if kill -0 $(cat "${lockfn}") 2>/dev/null ; then
                                echo "Skipping ${ifn} - locked by another instance"
                                exit 1
                        else
                                echo "Clearing stale lock file for ${ofn}"
                                rm "${lockfn}"
                        fi
                fi
                # Notify any concurrent instances that we're working on this file
                trap rmlock EXIT SIGTERM SIGINT SIGQUIT SIGHUP
                mkdir -p "${odir}"
                touch "${lockfn}"
                # then process it
                echo "$$" > "${lockfn}"
                metaflac --export-tags-to=/tmp/tags-$$ "${ifn}"
                echo -n "Converting: ${ifn} ..."
                flac -d "$2" -c -s --apply-replaygain-which-is-not-lossless 2>/dev/null \
                        | nice lame -S -h -V 3 --noreplaygain \
                                --tt "$(gettag TITLE)" --ta "$(gettag ARTIST)" --tl "$(gettag ALBUM)" \
                                --ty "$(gettag DATE)" --tn "$(gettag TRACKNUMBER)" --tg "$(gettag GENRE)" \
                                --ignore-tag-errors \
                                - "/tmp/working-$$.mp3"
                # $? is the exit status of lame, the last command in the pipeline
                if [ $? -eq 0 ] ; then
                        mv /tmp/working-$$.mp3 "${ofn}"
                fi
                echo " done"
        else
                echo "Skipping ${ifn} - destination exists and is up to date"
        fi
        exit 0
fi

# For each flac file in the source tree test to see if an MP3 must
# be created and if so, make one.
find "$1" -name \*.flac \
        -exec "$0" convert "{}" flac mp3 \;
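The lock-file check used above is easy to try in isolation. In this sketch, lock_is_live is a name chosen for illustration. It reports whether a lock file exists and names a process that is still running, which is the same test the script makes with kill -0.

```shell
#!/bin/sh
# lock_is_live is an illustrative helper name.
# Succeeds if the lock file exists and holds the PID of a live process.
# kill -0 sends no signal; it only checks that the PID can be signalled.
lock_is_live() {
        [ -e "$1" ] || return 1
        pid="$(cat "$1" 2>/dev/null)"
        [ -n "$pid" ] || return 1
        kill -0 "$pid" 2>/dev/null
}

# Example:
#   if lock_is_live "mp3/Artist/Album/.track.mp3.lock" ; then echo "busy" ; fi
```

A lock left behind by a crashed instance fails this check, which is why the script can safely clear it and carry on.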
Here's another script that goes through the collection and changes FLAC metadata to rename, eg, "The Cure" to "Cure, The". It's easily adapted to do other things too.
#!/bin/bash
# Fix some common metadata issues on my flac collection, including rewriting
# artist names from "The X" to "X, The".
# Typical usage:
#    find flac -type f -name \*.flac -exec bin/cleanup_flac "{}" \;

if test $# -lt 1 ; then
        echo "Usage: $0 target-file.flac"
        exit 1
fi

function gettag() {
        # Print the value of tag $1 from the exported tag file
        awk -F = /^${1}=/' { print $2; }' /tmp/tags-$$
}

metaflac --export-tags-to=/tmp/tags-$$ "$1"
artist="$(gettag ARTIST)"

# Convert "The X" -> "X, The"
echo -n "Testing \"$1\": "
if test "${artist:0:4}" = "The "; then
        artist="${artist:4}, The"
        metaflac --remove-tag="ARTIST" --set-tag="ARTIST=${artist}" "$1"
        echo "artist changed to ${artist}"
else
        echo "ok"
fi
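If you want to extend the script with more rules, it helps to pull the renaming into a function that can be tested without touching any files. A sketch, where normalize_artist is a hypothetical name and the rule is the same "The X" -> "X, The" rewrite:

```shell
#!/bin/sh
# normalize_artist is a hypothetical helper for this example.
# Move a leading "The " to the end: "The Cure" becomes "Cure, The".
# Names that merely start with the letters "The" (no space) are left alone,
# as is a name with nothing after "The ".
normalize_artist() {
        case "$1" in
                "The "?*) printf '%s, The\n' "${1#The }" ;;
                *)        printf '%s\n' "$1" ;;
        esac
}

# Example: normalize_artist "The Cure"
```

Further rules can be added as extra case patterns, and the result passed to metaflac --set-tag only when it differs from the original.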
