20 December, 2015
Rip. Mix. Burn.
Summary: In this article, we describe a command line utility (rip) which rips compact disc (CD) audio to the MPEG Audio Layer III (MP3) file format. The program is a thin wrapper, written in Perl, over a pipe (or pipeline) utilizing cdparanoia and LAME to read the CD audio and generate the MP3 file without intermediate or temporary files. A complement of ID3v2 tag information is added to the MP3 output file from user-defined album information. The rip program is licensed under the GNU General Public License version 3.
Introduction
Rip. Mix. Burn. from the 2001 Apple television commercial
In January 2001, at Macworld San Francisco, Steve Jobs gave the keynote address, announcing the release date of Mac OS X and introducing a new PowerMac G4, iTunes and, "one more thing..." the PowerBook G4 Titanium. For us today, it is the introduction of iTunes that is most interesting. Steve began the iTunes introduction by explaining the phrase "Rip. Mix. Burn." to anyone "over 30" (and making the standard pitch that this gives the end user, not the record companies, the power to listen to music as they want; if only the record company executives had been listening carefully). After setting up the growing trend, Steve admitted that Apple was "late to this party" in regard to the digital music revolution. His major criticism of other music programs was that they were "too complex." Users did not understand, or even know about, the features of these programs. The iTunes application he introduced that day (originally SoundJam MP, purchased from Casady & Greene) he described as "Really Clean. Really Simple. Far more powerful." (Apple of 2015, are you listening to your leader?)
Thinking about the first of those verbs (Mix is still relevant; Burn has largely been relegated to the burn barrel of history thanks to flash storage), is a monolithic application like iTunes the cleanest and simplest way to rip CD audio? Some might answer "yes" to that question, since iTunes works well enough. But for those of you who appreciate the power of the UN*X shell (and remember, Mac OS X is BSD UN*X under a pretty set of graphics), the answer would most decidedly be "no."
The beauty of the UN*X design is that each program does one thing and does that one thing well. All programs can interact through the common language of files. Files on the disk are certainly files, but so are other devices like the console, the network or the black hole of the null device (/dev/null). In addition, programs can be written to read from standard input and write to standard output (also "files") so that a sequence of programs can process data, each performing its own function well (thanks to Douglas McIlroy and Ken Thompson's night of work in 1973). This is the UN*X pipe, or pipeline, and it allows programs to work together, with each program (often referred to as a "filter") operating on the output of the one before it. Pipelines can become long commands as many programs are joined together to process data.
However, for the problem of ripping CD audio to MP3, there are just two core functions: reading the CD audio and encoding that data to the MP3 file format. The program cdparanoia is a great choice for reading CD audio. Not only is it very good at reading data from less than perfect optical drives and media (hence the paranoia in the name), it also happily writes the audio data to standard output (necessary for our pipeline), and the program is stable (the last stable version, 10.2, was released in 2008). With a program to read CD audio data, the next necessary piece is an MP3 encoder. The clear choice is LAME. LAME has a long history of development, happily reads audio data from standard input, and is also quite stable (the last stable version, 3.99, was released in 2011). And that's it for the heavy lifting.
The simple view of the pipeline to rip the first track of the album currently in the CD-ROM drive is as follows (with $ serving as our prompt stand-in):
$cdparanoia 1 - | lame --preset standard - outfile.mp3
Here we are invoking cdparanoia, asking it to read the audio data from track number 1 and write that data to standard output (indicated by the - in place of the output file name). The default audio output format for cdparanoia is the WAV file format, which LAME happily accepts. The output of cdparanoia is piped directly to the input of LAME (that's what the vertical bar "|" is for). The LAME program is called with the standard preset (variable bit rate, quality 2, mostly transparent to many people for most audio sources). The - instructs LAME to read from standard input, and the MP3 format audio will be written to outfile.mp3. In this pipeline, the audio data stays entirely in memory from when it is read from the CD to when it is written to the MP3 file. All that is left to do is to write a wrapper around this pipeline (a slightly longer version of this command line in practice) to iterate over each track on the album.
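To make the plumbing concrete, here is a minimal sketch of the same two-stage pipeline driven from a script. It is written in Python rather than rip's own Perl, and the function names (build_pipeline, rip_track) are ours for illustration, not taken from the rip source. It assumes cdparanoia and lame are installed and on the PATH.

```python
import subprocess


def build_pipeline(track, outfile, lame_options=("--preset", "standard")):
    """Return the argument lists for the two stages of the pipeline."""
    # "-" tells cdparanoia to write to standard output ...
    reader = ["cdparanoia", str(track), "-"]
    # ... and tells lame to read from standard input.
    encoder = ["lame", *lame_options, "-", outfile]
    return reader, encoder


def rip_track(track, outfile):
    """Run cdparanoia | lame for one track; return True if both succeed."""
    reader_cmd, encoder_cmd = build_pipeline(track, outfile)
    reader = subprocess.Popen(reader_cmd, stdout=subprocess.PIPE)
    encoder = subprocess.Popen(encoder_cmd, stdin=reader.stdout)
    # Close our copy of the pipe so cdparanoia is signalled if lame exits early.
    reader.stdout.close()
    encoder_status = encoder.wait()
    reader_status = reader.wait()
    return reader_status == 0 and encoder_status == 0
```

Calling rip_track(1, "outfile.mp3") would run the equivalent of the command line above. No temporary WAV file is created; the audio passes from one process to the other through the pipe.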
The rip program and interface
The program rip is a solution to that problem. The rip code is written in Perl, a classic "glue language" for automating shell commands. (There isn't anything singular about the choice of Perl for this task. rip could have been written as a Bash script, in Python, PHP or any of a variety of other languages that allow easy interaction with the shell, so there is no need to defend your tool of choice.) rip does four things to manage the job: it parses the command-line options controlling its behavior, parses a file containing information about the album, iterates through the tracks on the audio CD invoking the cdparanoia to LAME pipeline for each track, and indicates progress. The album information comes from a file because it isn't necessarily stored on the audio CD. There is an extension to the Red Book (literally a red book that defined the audio CD format, c. 1980) called CD-Text (c. 1996), but we will not rely on that data being there.
The rip command line options control not only rip itself but also give full control of the LAME MP3 encoder. The command line and its options are illustrated below:
$rip --help
Usage: rip [OPTION] input

Rip CD audio -> mp3.

 -a, --adh      album description format (input) help
 -f, --file     specify file name with string (default: "\tn - \ti.mp3")
                 macro options:
                  \al - album title
                  \ar - artist
                  \ti - song title
                  \tn - track number
 -h, --help     give this help
 -p, --pc       alter file names for correctness on windows machines
 -s, --silent   do not show progress bar
 -o, --options  pass next argument to LAME (default: "--preset standard")
 -v, --version  show version information

$
Since the first option is help about the album description format, let's look at that format. rip uses its own format for specifying album information. Certainly freedb is available with a plethora of CD information. That information is easy to retrieve (try searching for CDDB with your favorite package manager, or you can find perl-CDDB in CPAN), but it is user generated and isn't necessarily consistent, or even correct, for all discs. To help with that consistency, rip expects you to type a bit of album information. That is not much to ask if you're going to the trouble of ripping your CD collection and you'd like it done well, and it gives you a little more control if you'd only like to rip a few tracks from the album. Running rip with the -a flag produces an example album description file for a late 1990s release of the album Texas Flood by Stevie Ray Vaughan and Double Trouble.
$./rip -a
input example - album description format

#Stevie Ray Vaughan & Double Trouble --- Texas Flood (1983)
artist=Stevie Ray Vaughan and Double Trouble
album=Texas Flood
cover=tf.jpg
year=1983

Love Struck Baby
Pride and Joy
Texas Flood
Tell Me
Testify
Rude Mood
Mary Had a Little Lamb
Dirty Pool
I'm Cryin'
Lenny
SRV Speaks
Tin Pan Alley (AKA Roughest Place in Town)
Testify (Live)
Mary Had a Little Lamb (Live)
Wham! (Live)

# begins a comment line
# blank lines are ignored
# use - as a track name to skip that track

$
The format is a plain text file. Comment lines, beginning with #, and blank lines are ignored. The album information is given in the delimiter-separated value (DSV) style, one key=value pair per line. The album information includes artist, album, year and file name for the cover art. This information will be included in the ID3v2 tag generated by LAME. The track names appear in order, each on one line. If you are only interested in ripping the hits from an album, such as the six singles from Jagged Little Pill by Alanis Morissette, use - in place of a song title and rip will skip that track (you can stop listing tracks once you've included the last one you want to rip). From the Jagged Little Pill example, looking at first the whole album and then the singles from that album:
$cat jlp.ad
#Alanis Morissette --- Jagged Little Pill (1995)
artist=Alanis Morissette
album=Jagged Little Pill
cover=jlp.jpg
year=1995

All I Really Want
You Oughta Know
Perfect
Hand In My Pocket
Right Through You
Forgiven
You Learn
Head Over Feet
Mary Jane
Ironic
Not The Doctor
Wake Up
$cat jlp_the_singles.ad
#Alanis Morissette --- Jagged Little Pill (1995)
artist=Alanis Morissette
album=Jagged Little Pill
cover=jlp.jpg
year=1995

All I Really Want
You Oughta Know
-
Hand In My Pocket
-
-
You Learn
Head Over Feet
-
Ironic
$
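The album description format is simple enough that a parser fits in a few lines. The sketch below is in Python (not rip's Perl) and reflects our reading of the rules described above, not the actual rip source. In particular, treating only artist, album, cover and year as keys is our assumption, made so that a song title containing "=" would still be read as a track.

```python
# Keys recognised as album information (an assumption based on the examples).
KNOWN_KEYS = {"artist", "album", "cover", "year"}


def parse_album_description(text):
    """Parse album description text into (info, tracks).

    info maps keys such as "artist" to their values. tracks is a list of
    (track_number, title) pairs for the tracks that should be ripped.
    """
    info = {}
    tracks = []
    track_number = 0
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are ignored
        key, separator, value = line.partition("=")
        if separator and key.strip() in KNOWN_KEYS:
            info[key.strip()] = value.strip()
            continue
        # Every remaining line occupies one track position on the disc.
        track_number += 1
        if line != "-":  # "-" means skip this track
            tracks.append((track_number, line))
    return info, tracks
```

Fed the contents of jlp_the_singles.ad, this returns the four album fields and six tracks numbered 1, 2, 4, 7, 8 and 10. The skipped positions still advance the track counter, which is what keeps the track numbers aligned with the disc.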
The -f option specifies how the files will be named. The next argument after -f is a string prototype for the file name. This string can include both constants (string literals that will end up in the file name) and macros that are expanded, or replaced, with the corresponding information for each track. If you are ripping the singles from Jagged Little Pill in an empty directory for that album, you can name your files as ## - track title.mp3 by using the following command line:
$rip -sf "\tn - \ti.mp3" jlp_the_singles.ad
$ls
01 - All I Really Want.mp3  04 - Hand In My Pocket.mp3  08 - Head Over Feet.mp3
02 - You Oughta Know.mp3    07 - You Learn.mp3          10 - Ironic.mp3
$
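The macro expansion behind -f can be sketched as a single regular expression substitution. This is an illustration in Python, not the rip source; the function name is ours, and the two-digit zero padding of the track number is inferred from the listing above.

```python
import re

# Matches a backslash followed by one of the four macro names from the help text.
MACRO_PATTERN = re.compile(r"\\(al|ar|ti|tn)")


def expand_file_name(prototype, album, artist, title, track_number):
    """Replace \\al, \\ar, \\ti and \\tn in prototype with the track's data."""
    values = {
        "al": album,
        "ar": artist,
        "ti": title,
        "tn": f"{track_number:02d}",  # assumed two-digit padding
    }
    # A single pass means text inserted by one macro is never expanded again.
    return MACRO_PATTERN.sub(lambda match: values[match.group(1)], prototype)


name = expand_file_name(
    r"\tn - \ti.mp3", "Jagged Little Pill", "Alanis Morissette", "You Learn", 7
)
print(name)  # 07 - You Learn.mp3
```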
File names can combine fixed text (such as the " - " separator used above) with macros (such as the track number (\tn) and song title (\ti) used above). This can be useful if you rip each album twice, once at insane settings for the permanent home collection and once at more conservative settings for mobile devices in the workout room. By adding fixed text to the prototype you can mark each file by bitrate, so there is no need to inspect file properties to find the appropriate version.
The -h option gives the help message seen above and is always available if you don't remember the program usage or switch for a particular behavior.
Windows file systems do not allow a number of characters in file names. The -p option replaces those characters (\ : * ? " | < >) with alphabetic character representations. This is useful if you have access to a Windows machine or share your ripped tracks with Windows devices on the home network. The slash (/) character is replaced in all file names, as it is the UN*X directory separator and cannot appear in a file name. While you don't often find this character in song titles, classic examples are the songs "Kuru/Speak Like A Child" and "6/4 Jam" from the 2000 Sony release of the "Jaco Pastorius" album (1976) by Jaco Pastorius.
The default behavior of rip is to show a progress bar indicating the total progress of the rip, which can take several minutes. The progress bar is shown below; it can be suppressed with the -s flag if you don't want any output or are running rip from within another program.
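A character replacement like this can be sketched with a translation table. Note that the replacement words below are hypothetical placeholders: this article does not list the substitutions rip actually makes, so treat the mapping (and the function name) as an assumption made for illustration, written here in Python.

```python
# Hypothetical replacements; rip's real substitutions are not listed here.
ALWAYS_REPLACED = {"/": " slash "}
WINDOWS_REPLACEMENTS = {
    "\\": " backslash ",
    ":": " colon ",
    "*": " star ",
    "?": " question ",
    '"': " quote ",
    "|": " pipe ",
    "<": " less ",
    ">": " greater ",
}


def sanitize_file_name(name, windows_safe=False):
    """Replace characters that are unsafe in file names.

    The slash is always replaced; the Windows-only characters are
    replaced when windows_safe is True (the -p behavior).
    """
    table = dict(ALWAYS_REPLACED)
    if windows_safe:
        table.update(WINDOWS_REPLACEMENTS)
    cleaned = name.translate(str.maketrans(table))
    # Collapse the extra spaces introduced around the replacement words.
    return " ".join(cleaned.split())


print(sanitize_file_name("Kuru/Speak Like A Child"))  # Kuru slash Speak Like A Child
```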
$./rip jlp_the_singles.ad
 |=======>                                | You Oughta Know
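A bar of this shape is easy to draw. The following Python sketch is our own guess at the layout (a 40-character bar with an arrow head, followed by the current track title), not code from rip.

```python
def render_progress(fraction, label, width=40):
    """Return one line of progress output for a fraction between 0 and 1."""
    fraction = min(max(fraction, 0.0), 1.0)  # clamp out-of-range input
    filled = int(fraction * width)
    # The filled part is a run of "=" ending in a ">" arrow head.
    bar = "=" * (filled - 1) + ">" if filled > 0 else ""
    # Pad the bar with spaces so the closing "|" stays in the same column.
    return f" |{bar:<{width}}| {label}"


def show_progress(fraction, label):
    """Redraw the bar in place by ending the line with a carriage return."""
    print(render_progress(fraction, label), end="\r", flush=True)
```

Calling show_progress after each track (or more often) overwrites the previous bar instead of scrolling the terminal, and a silent mode like -s simply skips the call.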
The most important option for your output is the -o option. With this switch, the next argument is passed directly to LAME to direct its compression work. The default value in rip, when the flag isn't used, is "--preset standard". For LAME, this preset produces a variable bitrate file of high, but not maximal, quality. For those of you who have used LAME, "--preset standard" is equivalent to "-V 2" on the command line. The highest-quality variable bitrate setting is "--preset extreme", and a constant bitrate of 320 kbps is the "--preset insane" option. If you have used LAME directly before, you can pass any combination of options to LAME through this mechanism, tailoring your output to exactly what you wish.
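To show how a single option string and the album information might end up on the LAME command line, here is a hedged Python sketch. The function name and argument layout are ours, not rip's. The tag switches used (--add-id3v2, --tt, --tn, --ta, --tl, --ty and --ti for cover art) are standard LAME options, but whether rip uses exactly these is an assumption.

```python
import shlex


def build_lame_command(outfile, info, title, track_number,
                       options="--preset standard"):
    """Build a lame argument list that reads audio from standard input."""
    # Split the user's option string the way a shell would.
    command = ["lame", *shlex.split(options)]
    # Per-track ID3v2 fields.
    command += ["--add-id3v2", "--tt", title, "--tn", str(track_number)]
    # Per-album fields, added only when present in the album description.
    tag_flags = {"artist": "--ta", "album": "--tl", "year": "--ty", "cover": "--ti"}
    for key, flag in tag_flags.items():
        if key in info:
            command += [flag, info[key]]
    command += ["-", outfile]
    return command
```

With options="-V 0" the list starts with lame -V 0, so anything you would type after lame on the command line can be passed through unchanged.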
The -v option prints the rip version and exits. As of this writing, rip is at version 0.1.
The rip source code and dependencies
The rip source code is licensed under the GNU General Public License version 3. It depends on Perl (any Perl 5 version should work, and that'll be almost any version of Perl you find in the wild), cdparanoia (version 10.2 from September 2008 is the version you're likely to find on any system) and LAME (the latest version is 3.99, released in October 2011). You may find an older version of LAME on some systems, but it should work just fine, particularly if you pass options that version understands via the "-o" flag.
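Before starting a long rip it is worth confirming that both external programs are installed. A small check might look like the following Python sketch; it is not part of rip, and the function name is ours.

```python
import shutil


def missing_tools(tools=("cdparanoia", "lame")):
    """Return the names of required programs that are not on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Please install: " + ", ".join(missing))
```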
rip v0.1 (2015-12-20)
Conclusion
Rip. Mix. Burn. That is the vision Steve Jobs packaged neatly for Macintosh users in January 2001. The idea certainly wasn't novel, but the paradigm still holds, at least for anyone "over thirty" who has a compact disc collection. The program rip wraps a single pipeline utilizing an excellent CD audio reader, cdparanoia, and what is considered the best MP3 encoder, LAME. rip can either hide the complexity of the process or allow full control of the underlying encoder, and with it full control in producing the MP3 audio you want. The rip program allows for precise file naming and ID3v2 tagging of MP3 files. The rip program also provides a clean, simple interface.
So don't be afraid to tackle ripping your entire CD collection or finding classic albums on CD at garage sales or used record stores and CD exchanges. rip can help.
rip. Mix. Burn.