Play It Again Paul, LLC | Content Challenges Posed by Personal Recordings

Content Challenges Posed by Personal Recordings (November 2019)

Working with commercial tapes and cassettes is straightforward. Unless the tape has been damaged, the sound quality is good, and inserts identify the album title, artist(s), and track titles. After digitizing the recording, I can easily spot where tracks begin and end (see Figure 1) and split the tracks into individual audio files, adding album, artist, and track metadata to each file. After a bit more processing, the collection of files is then ready to be burned onto a CD, converted to MP3 files, or delivered as WAV files, depending on the customer’s desires.

Figure 1. Waveform from one side of a commercial cassette (five tracks).

Personal recordings are very different. As presented in last month’s article, Technical Challenges Posed by Personal Recordings, several factors can interfere with obtaining clean audio. The issues don’t end there, however. There are additional challenges associated with the tapes’ content, and these are explored below.

Identifying Tracks in the Waveforms

Personal recordings can contain all kinds of things: musical performances or practice sessions, verbal presentations, storytelling, conversations, audio “skits,” kids goofing off with the recorder, and throw-away content. These recordings, their tracks, and even fragments of tracks are precious to customers because they evoke memories of relatives, friends, an event, a time in one’s life, Little Sally at four years old, or how Little Johnny sounded practicing the piano 50 years ago. Though they may not be able to verbalize it, customers want a collection of audio tracks assembled in meaningful groupings and orders, and that is what I aim to provide. After conversion, the first step toward that goal is identifying tracks within the waveforms and splitting them into separate audio files, but that is rarely straightforward (see Figure 2, where the 30-minute waveform might represent a single track or more than a dozen, and the customer may not know).

Figure 2. Waveform from one side of a personal cassette (unknown number of tracks).

The easiest way to define tracks is to treat each side of a tape as a single track. However, the resulting track would run between 15 and 45 minutes for a cassette and just over two hours (128 minutes) for a 2400’ reel-to-reel tape recorded at 3.75 inches/second. In my opinion, this approach is less than optimal, and I believe customers would appreciate recordings that are more accessible.
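The reel-to-reel figure above follows from simple arithmetic: tape length divided by tape speed. The short sketch below works it through. The function name is only illustrative, and it assumes one uninterrupted pass of the tape in one direction.

```python
def tape_side_minutes(length_feet: float, speed_ips: float) -> float:
    """Playing time, in minutes, for one pass of a tape of the given length."""
    length_inches = length_feet * 12      # 12 inches per foot
    seconds = length_inches / speed_ips   # speed is in inches per second
    return seconds / 60


# A 2400-foot reel at 3.75 inches/second
print(tape_side_minutes(2400, 3.75))  # 128.0
```

Doubling the speed to 7.5 inches/second halves the playing time of the same reel.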

Labels on personal cassette inserts or tape boxes are often helpful in identifying tracks—when they’re present, if they’re complete and up to date, and if they refer to the actual cassette or tape that’s in the case or box. Their usefulness is limited, though, if contiguous tracks have similar content, voices, and labels.

Sometimes there are artifacts that could indicate the beginning or end of a track. One such artifact is audio that begins at an abnormally high pitch and quickly descends to normal (a phenomenon that one customer labeled onomatopoetically as a “slurp”). This can indicate the beginning of a track, since it occurs when the audio to be recorded is already in progress as recording begins: the electrical signal is sent to the record head before the tape has gotten up to speed. Something similar can happen when recording is stopped. If the electrical signal is still being sent to the record head while the tape is slowing to a stop, the pitch will rapidly shift toward higher frequencies (the inverse of a “slurp”). Another artifact indicating that recording has been stopped is a loud pop left behind when the recorder lifts the tape off the record head.

With musical content, these artifacts provide cues that one track has ended and another is about to begin. The same might be true with speech content. However, the artifacts may merely indicate that recording was paused, in which case what occurs before and after the artifact should be treated as belonging to the same track. In that case, and with customer concurrence, I will apply a rapid fade-out to the preceding segment and a rapid fade-in to the following segment, inserting a brief period of silence between the segments (the length determined by the content and tempo in each segment).

In the absence of artifacts, separate tracks can usually be identified by the performance of a musical piece or by what appears to be a complete conversation, event, etc. Changes in voices, topic of conversation, etc. are less reliable cues: they could indicate either the continuation of the current track or the beginning of a new one. Customer input is needed to determine how to proceed.

Clearly, nailing down tracks can be an extensive undertaking. Ideally, the customer would listen to the tapes before giving them to me and provide track identification instructions anchored by words or phrases within the recordings. Otherwise, the approach I usually take is to provide a collection of converted “segments” (i.e., candidate tracks) in the sequence in which they occurred on the tape, delivered on a flash drive, on CDs, or via cloud storage. The customer can then listen to all the segments in succession and compile instructions on which segments should be treated as complete tracks, combined with preceding and/or following segments, or discarded.
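As an illustration only, a first pass at candidate segment boundaries could be automated by looking for long quiet stretches in the digitized audio. The sketch below is an assumption about one possible approach, not a description of how segments are actually chosen here. The function name, the amplitude threshold, and the minimum gap length are all hypothetical, and on hissy personal tapes such gaps are hints at best.

```python
def find_quiet_gaps(samples, sample_rate, threshold=0.02, min_gap_seconds=2.0):
    """Return (start_sec, end_sec) for each run of quiet samples
    that lasts at least min_gap_seconds."""
    min_len = int(sample_rate * min_gap_seconds)
    gaps = []
    run_start = None
    for i, sample in enumerate(samples):
        if abs(sample) < threshold:
            if run_start is None:
                run_start = i          # a quiet run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                gaps.append((run_start / sample_rate, i / sample_rate))
            run_start = None
    # Handle a quiet run that extends to the end of the recording.
    if run_start is not None and len(samples) - run_start >= min_len:
        gaps.append((run_start / sample_rate, len(samples) / sample_rate))
    return gaps
```

Each gap found this way is only a candidate split point; the customer’s review of the resulting segments still decides what counts as a track.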

Defining Track Metadata

After tracks on each cassette or tape are identified, I will ask the customer to provide titles (subject to change as the project progresses) and identify who was speaking or performing. If neither the customer nor I can identify a musical piece, it will end up being called “Unidentified Piece #1.” With non-musical tracks, I may suggest a track title based on the content. Though my suggestions aren’t likely to be the final titles, they’ll provide a starting point for the customer’s consideration and subsequent revision.

Track title and artist/speaker information will be added to the audio files’ metadata (see Labeling MP3 Tracks: ID3 Tags and Altering Metadata) and, if CDs are desired, incorporated into the CD-Text file burned onto the CD and included on the CD case insert (see Labeling Music CDs: CD-Text). If the customer can identify the year in which the track was recorded, that value can be incorporated into the metadata and insert as well.

Grouping and Sequencing Tracks

Once all the tracks have been identified, split, labeled, and cleaned up, the remaining tasks are collecting tracks into logical groupings and sequencing them within those groupings. This can be trivial if there are only a handful of tracks in the project, but more serious consideration is warranted if there are a couple hundred tracks.

The cassettes or tapes themselves often don’t provide any help. When someone wants to record something, he or she might grab a new tape, pick a previously used tape and fast-forward to the end of what was previously recorded, or intentionally or accidentally record over something that was previously there—potentially leaving a fragment of the original recording.

When there will be dozens of tracks in a project, I will create a spreadsheet to log track numbers (based on their appearance on the original media, not the final sequencing), the media from which they came, and phrases describing their content. Later, I will add track titles, performers or participants, and dates (if available). As this log progresses, candidate groupings are likely to emerge, and I will share those suggested groupings with the customer. Initially, the customer might wish to place music recordings in one group and voice recordings in another, put family voices in one group and friends’ voices in another, separate tracks by the events or topics they are associated with, and so forth. Between my suggestions and the customer’s reflections, we can finalize the groupings destined to be burned onto CDs or become folders for MP3 or WAV files, along with the track order for each grouping.
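A log like this can live in any spreadsheet program. For illustration, the sketch below writes the same kind of log as a CSV file that a spreadsheet can open. The column names and the sample rows are made up for the example; columns that are filled in later (title, performers, date, group) simply start out blank.

```python
import csv
import io

# Hypothetical column names for the track log.
FIELDS = ["track_no", "source_media", "description",
          "title", "performers", "date", "group"]


def write_log(rows, handle):
    """Write track-log rows as CSV, leaving missing columns blank."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {"track_no": 1, "source_media": "Cassette 1, Side A",
     "description": "piano practice"},
    {"track_no": 2, "source_media": "Cassette 1, Side A",
     "description": "children telling a story"},
]
buffer = io.StringIO()   # stands in for a file opened with newline=""
write_log(rows, buffer)
print(buffer.getvalue())
```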

If the customer desires MP3 or WAV files, the groupings will be reflected as folders, and I’ll add sequence numbers to the tracks’ file names within each group’s folder to reflect the order. After the project has been completed, the customer is free to reorganize the files in any fashion by renaming the folders or moving audio files from one folder to another—and making appropriate changes to the sequence numbers in the tracks’ file names to reflect the revised order.
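The sequence numbers work best when they are zero-padded, so that file managers and players that sort by name keep “10” after “09” rather than after “1”. A small sketch of that naming scheme follows; the function name and the exact “number, space, title” pattern are assumptions for illustration.

```python
def numbered_names(titles, extension="mp3"):
    """Prefix each title with a zero-padded sequence number."""
    # Use at least two digits, and more if the group has 100+ tracks.
    width = max(2, len(str(len(titles))))
    return [f"{i:0{width}d} {title}.{extension}"
            for i, title in enumerate(titles, start=1)]


print(numbered_names(["Piano Practice", "Unidentified Piece #1"]))
# ['01 Piano Practice.mp3', '02 Unidentified Piece #1.mp3']
```

Reordering a folder later is then a matter of changing the leading numbers in the file names.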

While there is no practical limit to the number of MP3 or WAV files appearing in a folder, this is not true for CDs. The decisions about grouping and sequencing tracks require a bit more deliberation, since they must take CDs’ capacity into account: 700 MB (80 minutes) for consumer grade CDs or 650 MB (74 minutes) for premium CDs. Also, since the CD-Text file has single fields for the album title and album artist(s), thought will need to be given to how to title the CD and represent the artists. (Track-level information about track artists/participants and dates is still reflected on the CD case insert.) After the CDs are delivered, the customer might have a change of mind about grouping and sequencing. He or she can rip, rearrange, and reburn the tracks onto new CDs, but doing so risks loss of album and track metadata (see Hints for Ripping a CD and Hints for Burning a CD).
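To see how capacity shapes a grouping, the sketch below fills CDs in track order and starts a new disc whenever the next track would not fit. It is a simplified illustration: the function name is hypothetical, durations are in minutes, and the pauses a burning program may add between tracks are ignored.

```python
def split_into_cds(durations_minutes, capacity_minutes=80.0):
    """Assign track numbers (1-based) to CDs, keeping the original order."""
    cds, current, used = [], [], 0.0
    for index, minutes in enumerate(durations_minutes, start=1):
        if minutes > capacity_minutes:
            raise ValueError(f"track {index} ({minutes} min) is longer than one CD")
        if current and used + minutes > capacity_minutes:
            cds.append(current)          # this disc is full; start another
            current, used = [], 0.0
        current.append(index)
        used += minutes
    if current:
        cds.append(current)
    return cds


print(split_into_cds([30, 30, 25, 40, 45]))  # [[1, 2], [3, 4], [5]]
```

A purely mechanical split like this can break up tracks that belong together, which is why the grouping decisions come first and the capacity check second.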

The Rewards

Although converting personal recordings entails more extensive efforts, to me the results are worth the investment. I feel the customer appreciates what I provide, not only for the content, but also because of his or her role as a participant in the project. And I derive satisfaction and enjoyment from delivering a product for the customer to treasure.
