This Python package provides APIs to read, calculate, and write ReplayGain data, as well as two scripts that use these APIs to apply ReplayGain information to audio files.
This is a Python 3 fork of Felix Krull's `rgain` repository on Bitbucket.
ReplayGain is a proposed standard published by David Robinson in 2001 to measure and normalize the perceived loudness of audio in computer audio formats such as MP3 and Ogg Vorbis. It allows media players to normalize loudness for individual tracks or albums. This avoids the common problem of having to manually adjust volume levels between tracks when playing audio files from albums that have been mastered at different loudness levels.
-- Source: Wikipedia
ReplayGain is the name of a technique invented to achieve the same perceived playback loudness of audio files. It defines an algorithm to measure the perceived loudness of audio data.
-- Source: hydrogenaud.io
To install these dependencies on Debian or Ubuntu (16.10 or newer):
console
$ apt install \
gir1.2-gstreamer-1.0 \
gstreamer1.0-plugins-base \
gstreamer1.0-plugins-good \
gstreamer1.0-plugins-bad \
gstreamer1.0-plugins-ugly \
python3 \
python3-gi
(Or, if you prefer to install the latest PyGObject from source code, replace `python3-gi` with `libcairo2-dev libgirepository1.0-dev`.)
You will also need GStreamer decoding plugins for any audio formats you want to use.
Just install it like any other Python package using `pip`:
console
$ python3 -m pip install --user rgain3
replaygain
This is a program like, say, `vorbisgain` or `mp3gain`, the difference being that instead of supporting just one format, it supports several.
The basic usage of the program is simple:
console
$ replaygain AUDIOFILE1 AUDIOFILE2 ...
There are various options; see them by running:
console
$ replaygain --help
collectiongain
This program is designed to apply Replay Gain to whole music collections. It also lets you simply add new files, run `collectiongain`, and have it replay-gain those files without asking twice.
To use it, simply run:
console
$ collectiongain PATH_TO_MUSIC
and re-run it whenever you add new files. Run:
console
$ collectiongain --help
to see possible options.
If, however, you want to find out how exactly `collectiongain` works, read on (but be warned: it's long, boring, technical, incomprehensible and awesome).
`collectiongain` runs in two phases: the file collecting phase and the actual run. Prior to analyzing any audio data, `collectiongain` gathers all audio files in the directory and determines a so-called album ID for each from the file's tags:
- If the file contains a MusicBrainz album ID, that is used.
- Otherwise, if the file contains an album tag, it is joined with either
  - a MusicBrainz album artist ID, if that exists,
  - an album artist tag, if that exists,
  - the artist tag,
  - or nothing if none of the above exist.

  The resulting artist-album combination is the album ID for that file.
- If the file doesn't contain a MusicBrainz album ID or an album tag, it is presumed to be a single track without album; it will only get track gain, no album gain.
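The album ID rules above can be sketched in a few lines of Python. This is only an illustration, not the package's actual code: the function name `album_id`, the tag keys, the `" - "` separator, and the exact fallback order of artist identifiers are all assumptions made for the example.

```python
from typing import Mapping, Optional


def album_id(tags: Mapping[str, str]) -> Optional[str]:
    """Return an album ID for a file's tags, or None for a single track.

    The tag keys below are hypothetical placeholders, not necessarily
    the names the package actually reads.
    """
    # 1. A MusicBrainz album ID is used as-is.
    mb_album = tags.get("musicbrainz_albumid")
    if mb_album:
        return mb_album

    # 2. Without an album tag, the file counts as a single track.
    album = tags.get("album")
    if not album:
        return None

    # 3. Join the album with the first artist identifier that exists
    #    (assumed order: MusicBrainz album artist ID, album artist, artist).
    for key in ("musicbrainz_albumartistid", "albumartist", "artist"):
        artist = tags.get(key)
        if artist:
            return f"{artist} - {album}"

    # 4. No artist information at all: the album name alone is the ID.
    return album
```

For instance, `album_id({"album": "X", "artist": "A"})` yields an artist-album combination, while a file with neither an album tag nor a MusicBrainz album ID yields `None` and would only get track gain.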
Since this step takes a relatively long time, the album IDs are cached between runs of `collectiongain`. If a file was modified or a new file was added, the album ID will be (re-)calculated for that file only.
The program will also cache an educated guess as to whether a file was already processed and had ReplayGain added. If `collectiongain` thinks so, that file will be ignored entirely for the actual run. This flag is set whenever the file is processed in the actual run phase (save for dry runs, which you can enable with the `--dry-run` switch) and is cleared whenever a file was changed. You can pass the `--ignore-cache` switch to make `collectiongain` totally ignore the cache; in that case, it will behave as if no cache was present and read your collection from scratch.
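As a rough illustration of the caching behaviour described above, here is a minimal sketch. It is not the real cache format: `CacheEntry`, `refresh`, and `mark_processed` are hypothetical names, and using the file modification time to detect changes is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheEntry:
    mtime: float               # modification time seen on the last scan (assumed change detector)
    album_id: Optional[str]    # cached album ID (None for single tracks)
    processed: bool = False    # educated guess: gain was already written


def refresh(cache: Dict[str, CacheEntry], path: str, mtime: float,
            compute_album_id: Callable[[str], Optional[str]],
            ignore_cache: bool = False) -> CacheEntry:
    """Return an up-to-date cache entry for ``path``."""
    entry = None if ignore_cache else cache.get(path)
    if entry is None or entry.mtime != mtime:
        # New or changed file: recompute the album ID and clear the flag.
        entry = CacheEntry(mtime, compute_album_id(path), processed=False)
        cache[path] = entry
    return entry


def mark_processed(entry: CacheEntry, dry_run: bool = False) -> None:
    """Set the processed flag after the actual run, except on dry runs."""
    if not dry_run:
        entry.processed = True
```

An unchanged file reuses its entry and keeps its flag, so the slow tag-reading step (`compute_album_id` here) only runs for new or modified files.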
For the actual run, `collectiongain` will simply look at all files that have survived the cleansing described above; for files that don't contain ReplayGain information, `collectiongain` will calculate it and write it to the files (use the `--force` flag to calculate gain even if the file already has gain data).
Here comes the big moment of the album ID: files that have the same album ID are
considered to be one album (duh) for the calculation of album gain. If only one
file of an album is missing gain information, the whole album will be
recalculated to make sure the data is up-to-date.
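The grouping rule above could look roughly like the following sketch. The `Track` type and the `plan_run` function are hypothetical and exist only for this example; the actual run phase is more involved.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Track(NamedTuple):
    path: str
    album_id: Optional[str]   # None marks a single track
    has_gain: bool            # file already carries ReplayGain data


def plan_run(tracks: Iterable[Track],
             force: bool = False) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return the albums and single tracks that need (re)calculation."""
    albums: Dict[str, List[Track]] = defaultdict(list)
    singles: List[Track] = []
    for track in tracks:
        if track.album_id is None:
            singles.append(track)
        else:
            albums[track.album_id].append(track)

    # One track without gain data is enough to redo the whole album.
    albums_todo = {
        album: [t.path for t in members]
        for album, members in albums.items()
        if force or not all(t.has_gain for t in members)
    }
    singles_todo = [t.path for t in singles if force or not t.has_gain]
    return albums_todo, singles_todo
```

With `force=True`, every album and every single track is scheduled regardless of existing gain data, mirroring the `--force` flag.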
Proper ReplayGain support for MP3 files is a bit of a mess: on the one hand, there is the `mp3gain` application which was relatively widely used (I don't know if it still is). It directly modifies the audio data, which has the advantage that it works with pretty much any player, but it also means you have to decide ahead of time whether you want track gain or album gain. Besides, it's just not very elegant. On the other hand, there are at least two commonly used ways to store proper ReplayGain information in ID3v2 tags.
Now, in general you don't have to worry about this when using this package: by default, `replaygain` and `collectiongain` will read and write ReplayGain information in the two most commonly used formats. However, if for whatever reason you need more control over the MP3 ReplayGain information, you can use the `--mp3-format` option (supported by both programs) to change the behaviour.
Possible choices with this switch are:
| Name | Description |
|------|-------------|
| `replaygain.org` (alias: `fb2k`) | Replay Gain information is stored in ID3v2 TXXX frames. This format is specified on the replaygain.org website as the recommended format for MP3 files. Notably, this format is used by music players like foobar2000 and Quod Libet. The latter can also fall back on the legacy format. |
| `legacy` (alias: `ql`) | Replay Gain information is stored in ID3v2.4 RVA2 frames. This format is described as "legacy" by replaygain.org; however, it might still be the primary format for some music players. It should be noted that this format does not support volume adjustments of more than 64 dB: if the calculated gain value is smaller than -64 dB or greater than or equal to +64 dB, it is clamped to these limit values. |
| `default` | This is the default implementation used by both `replaygain` and `collectiongain`. When writing ReplayGain data, both the `replaygain.org` and the `legacy` formats are written. As for reading, if a file contains data in both formats, both data sets are read and then compared. If they match up, that ReplayGain information is returned for the file. However, if they don't match, no ReplayGain data is returned to signal that this file does not contain valid (read: consistent) ReplayGain information. |
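To make the clamping of the legacy format and the consistency check of the default format more concrete, here is a small sketch. It is not the package's implementation: the function names, the comparison tolerance, the upper clamp value (which assumes RVA2's 1/512 dB resolution), and the behaviour when only one format is present are all assumptions.

```python
from typing import Optional

RVA2_MIN_DB = -64.0
# Assumed upper limit: the largest value below +64 dB at 1/512 dB resolution.
RVA2_MAX_DB = 64.0 - 1.0 / 512.0


def clamp_for_legacy(gain_db: float) -> float:
    """Clamp a gain value to the range the legacy RVA2 format can store."""
    return max(RVA2_MIN_DB, min(RVA2_MAX_DB, gain_db))


def read_default(rgorg_gain: Optional[float],
                 legacy_gain: Optional[float],
                 tolerance: float = 0.01) -> Optional[float]:
    """Combine gain values read from both formats.

    Returns the gain if both formats agree (within an assumed tolerance),
    None if they disagree. If only one format is present, this sketch
    simply returns that value, which is an assumption.
    """
    if rgorg_gain is None:
        return legacy_gain
    if legacy_gain is None:
        return rgorg_gain
    if abs(rgorg_gain - legacy_gain) <= tolerance:
        return rgorg_gain
    return None  # inconsistent data: treat the file as having no gain info
```

Returning `None` on a mismatch means the file is handled like one without ReplayGain data, so it would be picked up for recalculation on the next run.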
Note that developing from source requires the Python headers, and therefore the `python3.x-dev` system package, to be installed.
Fork and clone this repository. Inside the checkout, create a virtualenv and install `rgain3` in develop mode:
console
$ python3 -m venv env
$ source env/bin/activate
(env) $ python -m pip install -Ue .
To run the tests with the Python version of your current virtualenv, install the `test` extras and invoke `pytest`:
console
(env) $ python -m pip install -Ue ".[test]"
(env) $ pytest
You can run tests for all supported Python versions using `tox` like so:
console
(env) $ tox
With the exception of the manpages, all files are::
The manpages were originally written for the Debian project and are::
I see that multiple audio formats are supported, but only mp4 is supported for video.
Is there a reason why other formats such as mkv can't be supported? The codec can't be the problem, since an mkv file remuxed to mp4 works normally, but maybe the container itself is incompatible?
Anyway, if supporting other formats is possible, I am willing to implement it as I know Python, but it would be nice if I could get some guidance in the process.
Hey.
Opus has its own special tags for R128 gain data:
- R128_TRACK_GAIN
- R128_ALBUM_GAIN
See https://datatracker.ietf.org/doc/html/rfc7845#section-5.2.1 .
Would be nice if one could select whether only these (default), only replaygain or both types of tags would be written.
Default to only write Opus' type because of the RFC which says:
To avoid confusion with multiple normalization schemes, an Opus comment header SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK tags, unless they are only to be used in some context where there is guaranteed to be no such confusion.
Thanks.
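For reference, RFC 7845 stores the R128 tags as Q7.8 fixed-point values in dB, written as ASCII integers. A minimal sketch of how such a selection could work follows. The function names and the `mode` values are hypothetical, not an existing option of this package, and converting between the ReplayGain and R128 reference levels is deliberately left to the caller.

```python
from typing import Dict


def to_q7_8(gain_db: float) -> str:
    """Format a gain in dB as a Q7.8 fixed-point integer string."""
    value = round(gain_db * 256)
    # Q7.8 is a signed 16-bit quantity.
    value = max(-32768, min(32767, value))
    return str(value)


def gain_tags(r128_gain_db: float, replaygain_db: float,
              mode: str = "r128") -> Dict[str, str]:
    """Build track gain tags for an Opus file.

    ``mode`` is a hypothetical switch: "r128" (only the Opus tags),
    "replaygain" (only the ReplayGain tags) or "both". The two gain
    values use different reference levels and are passed in separately.
    """
    if mode not in ("r128", "replaygain", "both"):
        raise ValueError(f"unknown mode: {mode!r}")
    tags: Dict[str, str] = {}
    if mode in ("r128", "both"):
        tags["R128_TRACK_GAIN"] = to_q7_8(r128_gain_db)
    if mode in ("replaygain", "both"):
        tags["REPLAYGAIN_TRACK_GAIN"] = f"{replaygain_db:.2f} dB"
    return tags
```

Album gain would follow the same pattern with `R128_ALBUM_GAIN` and `REPLAYGAIN_ALBUM_GAIN`.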
Python distutils/setuptools/packaging allows for generating RPM packages which then can be installed on Fedora or Red Hat systems.
This adds the necessary metadata to pull in the dependencies of the program without having to install them by hand, a step that people often forget and then the program does not work.
We also point out the documentation files in the program to be added to the RPM.
The current implementation seems to rely on identical `ARTIST` and `ALBUM` tags to recognize files as belonging to the same album. Obviously this doesn’t work well for samplers (and things like split EPs, for that matter). Tagging the tracks with a mutual `ALBUMARTIST` isn’t a real solution either.
I would suggest a command line option for `replaygain` that switches off the auto-detection and forces all tracks to be treated as belonging to the same album.
For `collectiongain`, a command line option could be introduced that forces tracks in the same directory to be treated as belonging to the same album.
An alternative would be a 'light' version of the detection that only looks for matching `ALBUM` tags. This would however lead to problems when iterating through big collections with `collectiongain`, because bands don’t seem to have agreed on using world-wide unique album names. ;-)
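The directory-based grouping suggested for `collectiongain` could be as simple as the following sketch. The function name is hypothetical, and POSIX-style paths are assumed to keep the example deterministic.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List


def group_by_directory(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Treat all files that share a parent directory as one album."""
    albums: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        # The parent directory stands in for the album ID.
        albums[str(PurePosixPath(path).parent)].append(path)
    return dict(albums)
```

This sidesteps tag comparison entirely, so samplers and split EPs stored in one directory end up in one album.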
On the one hand, this fixes a syntax error in the string formatting performed within the `__str__` method by removing it and relying on the method provided by the dataclass.
On the other hand, this change also made it possible to remove the `__repr__`, `__eq__` and `__ne__` methods, which are also provided by the `dataclass` wrapper.
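A short illustration of what the `dataclass` decorator provides, using a hypothetical stand-in class rather than the one touched by this change. Strictly speaking, `@dataclass` generates `__repr__` and `__eq__`; `str()` falls back to `__repr__` when no `__str__` is defined, and `!=` is derived from `__eq__` by Python itself.

```python
from dataclasses import dataclass


@dataclass
class ExampleGain:
    """Hypothetical stand-in for the class touched by this change."""
    gain: float
    peak: float = 1.0


a = ExampleGain(-6.2, 0.98)
b = ExampleGain(-6.2, 0.98)

# __repr__ is generated; str() falls back to it when __str__ is absent.
print(str(a))            # ExampleGain(gain=-6.2, peak=0.98)

# __eq__ is generated and compares fields; != is derived from it.
print(a == b, a != b)    # True False
```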
I'm updating from 1.0.0 to 1.1.0 in Debian unstable, and during my build/test procedure I found that an automated test was intermittently getting stuck. As a smoke-test for new versions, I run this script https://salsa.debian.org/python-team/packages/rgain3/-/blob/debian/master/debian/tests/replaygain in a virtual machine. The expected result is that it performs replay gain analysis on the four test audio clips found in the same directory as the script itself, as though they were an album (they're actually short clips taken from sound-theme-freedesktop), then terminates successfully.
However, the result I'm actually getting in the test VM is that it hangs, like this:
```
Checking for Replay Gain information ...
message-new-instant.oga:none
phone-incoming-call.oga:none
phone-outgoing-busy.oga:none
phone-outgoing-calling.oga:none
Calculating Replay Gain information ...
message-new-instant.oga:7.70 dB
phone-incoming-call.oga:-8.91 dB
phone-outgoing-busy.oga:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4621:gst_pad_push_data:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4368:gst_pad_chain_data_unchecked:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4621:gst_pad_push_data:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4368:gst_pad_chain_data_unchecked:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4621:gst_pad_push_data:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4368:gst_pad_chain_data_unchecked:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4621:gst_pad_push_data:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4368:gst_pad_chain_data_unchecked:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4621:gst_pad_push_data:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4368:gst_pad_chain_data_unchecked:
(replaygain:1723): GStreamer-WARNING **: 11:55:36.510: ../gst/gstpad.c:4621:gst_pad_push_data:
```
Message https://marc.info/?l=gstreamer-devel&m=138703546904972&w=2 on the GStreamer upstream mailing list suggests that this might be to do with the change in commit d5cfdd86 - maybe it shouldn't be sending a flush event?
Unfortunately I can't seem to reproduce this when not in the virtual machine, so perhaps it's timing-related.