Field Notes from Ghana, Part 1: Audio recording

Last December, I took a somewhat unusual data-gathering trip to Accra, Ghana. Rather than targeting a single language, the trip focused on a few of Ghana’s extant surrogate language systems. Over the course of two weeks, I made field recordings that incorporated several languages, instruments, modalities, and methodologies.

This post is the first in an ongoing series of observations and discussion about that trip. Over this series I intend to describe some of my methodology and goals—and ultimately, some conclusions—in the hope that it helps others plan similar efforts. Today, I’ll give a basic overview and then dig into one of my current technical preoccupations: controlling dynamic range in surrogate language recordings.

Overview

In my limited time, I wanted to scratch the surface of as many available surrogate languages as possible. The intention was a sort of “surrogate language speed dating”, in which I would work with several practitioners and speakers for a few sessions at a time, using existing materials as an aid.

On this trip, I spent most of my time with Benjamin N., a practitioner of the Birifor and Dagara variants of the gyil surrogate tradition. Birifor and (northern) Dagara are typically considered distinct variants within the Dagaare dialect continuum, both associated with northern Ghana and southern Burkina Faso. My work with Benjamin expanded across both language varieties and several modalities: gyil or resonator xylophone, gaŋgaa or double-headed cylindrical drum, and whistling. I dedicated one or more elicitation sessions to each language-modality pairing, during which I would elicit both existing (proverbial) and novel (productive) material. I also worked with speakers of Eʋe, drawing on existing materials, and briefly with a Twi surrogate language practitioner. All told, these language-modality pairings add up to eight surrogate varieties, six from the Birifor/Dagara continuum.

Crucially, the goal was not to gain a full, nuanced picture of any system as a whole. Instead, I wanted a sketch: a small but usable list of existing surrogate words and phrases and their spoken equivalents (in the West African tradition, the target is mostly proverbial expressions), a fairly robust surrogate phonology, and a preliminary sense of the system’s productivity and flexibility.

There’s an obvious disadvantage to this approach: a sketch of each system doesn’t support much confidence in any larger scientific conclusions.

But I think there’s an advantage worth considering, given the topic at hand: just fifty or a hundred years ago, there’s reason to think these systems were widespread in many parts of the world. Now, they’re increasingly difficult to track down, and quickly losing vitality. This kind of quick-and-dirty approach helps cover a lot of ground, which is necessary if we want the bigger picture of how speech surrogacy works, before the population and genetic diversity of these systems is greatly diminished. I see this work as hands-on typology, and we need more of it before it’s too late.

Now, I want to discuss just one of the many methodological considerations I faced in Accra: how to produce listenable field recordings.

The dynamic range problem

I’m not a professional audio engineer, and my approach here is probably very different from a savvy field engineer’s. Still, I think I got decent recordings in a low-profile setup realistic for linguistic research.

Eliciting linguistic and musical surrogate data simultaneously requires a lot of equipment flexibility. An intimate interview setting conditions a consultant to speak quietly; surrogate languages tend to be loud and carry long distances. A consultant may switch from a low speaking voice to a loud drumbeat, yell or whistle many times within the course of a recording.

This can be a strain on equipment that needs a “sweet spot” of sensitivity to produce a faithful recording without too-quiet spots or clipping. This problem is exacerbated by the equipment a field linguist tends to have access to: typically no more than a few pieces of consumer or “pro-sumer” gear. The usual tools of an audio engineer—multiple high-quality microphones, close monitoring and level adjustment—just aren’t realistic for a researcher focused on data gathering.
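To put numbers on the “sweet spot”, it can help to check a test recording before a session. The following is a minimal Python sketch, not a tool used on this trip: it assumes the audio has already been loaded as a list of floating-point samples between -1.0 and 1.0, and the function name `level_report` and the `clip_level` threshold are illustrative choices.

```python
import math

def level_report(samples, clip_level=0.999):
    """Summarize peak level, RMS level and clipped-sample count.

    Assumes `samples` is a non-empty list of floats in [-1.0, 1.0],
    where 1.0 is digital full scale (0 dBFS).
    """
    def to_db(value):
        # Convert a linear amplitude to decibels relative to full scale.
        return 20.0 * math.log10(value) if value > 0 else float("-inf")

    peak = max(abs(x) for x in samples)
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return {
        "peak_dbfs": to_db(peak),
        "rms_dbfs": to_db(rms),
        # Samples at (or effectively at) full scale are treated as clipped.
        "clipped": sum(1 for x in samples if abs(x) >= clip_level),
    }

report = level_report([0.1, -0.1, 1.0, -1.0])
```

A peak that sits far below 0 dBFS suggests the input level is too low for quiet speech; a non-zero `clipped` count suggests it is too high for the loud passages.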

So, what is the best option for a surrogate language researcher with only, say, a handheld recorder and a lavalier mic?

Familiarity with basic digital audio processing goes a long way here. Compressor/limiters are the basic audio dynamic controls that allow a recording to be made at a safe sensitivity level, then adjusted so that quieter parts are still audible. This approach was sufficient for my Twi recordings, where I used a single lavalier mic to record quiet speech, louder sung vocals and a plucked string instrument at once. Moderate compression was sufficient to keep everything within a listenable dynamic range.
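For readers who have not used a compressor, here is a rough Python sketch of the core idea: anything louder than a threshold is scaled down by a ratio, and the whole signal can then be raised with make-up gain. It is deliberately simplified (no attack or release smoothing, which real compressors have), and the parameter names and default values are assumptions for illustration, not the settings used on the Twi recordings.

```python
import math

def compress(samples, threshold_db=-20.0, ratio=4.0, makeup_db=0.0):
    """Static downward compressor for float samples in [-1.0, 1.0]."""
    makeup = 10 ** (makeup_db / 20.0)
    out = []
    for x in samples:
        level = abs(x)
        if level == 0.0:
            out.append(0.0)
            continue
        level_db = 20.0 * math.log10(level)
        if level_db > threshold_db:
            # Above the threshold, the overshoot is divided by the ratio.
            target_db = threshold_db + (level_db - threshold_db) / ratio
            gain = 10 ** ((target_db - level_db) / 20.0)
        else:
            # Below the threshold, the sample is left alone.
            gain = 1.0
        out.append(x * gain * makeup)
    return out

# A quiet sample (0.05) and a full-scale one (1.0).
quiet, loud = compress([0.05, 1.0], threshold_db=-20.0, ratio=4.0)
```

With these example settings the quiet sample passes through unchanged while the full-scale one is reduced to roughly 0.18, so the gap between them shrinks from about 26 dB to about 11 dB. Raising `makeup_db` afterwards brings the quiet material up to a listenable level.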

Compression is going to be unsatisfactory in more demanding settings, however. Noisier environments and wider dynamic ranges require more severe compression. At a certain point, this begins to affect both the sound quality and the faithfulness of the data. I reached this point with recordings featuring the gyil and gaŋgaa, both percussion instruments with huge dynamic ranges.

One solution is the “safety track”. You may use two recording devices, or one, if your recorder can send the same signal to multiple inputs. When one track is set at a significantly lower level than the other, a dynamically variable recording will produce two complementary tracks: a less sensitive one (the “safety”) with areas that are too quiet, and a more sensitive one with too-loud clipping areas. Then you may manually cut between the two tracks as appropriate (a “gate” on the safety will silence the quieter parts of that track automatically if you wish). This method is technically appropriate but becomes extremely time intensive if there’s any amount of quick transition between speaking and playing.
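The manual cutting described above could in principle be roughed out automatically. The Python sketch below is one hypothetical way to do it, not the workflow used for these recordings: it assumes both tracks are time-aligned lists of float samples at the same sample rate, and the function name, `block_size` and `clip_level` are illustrative assumptions.

```python
def splice_safety(hot, safety, block_size=1024, clip_level=0.99):
    """Block by block, keep the sensitive ("hot") track unless it clips.

    Returns the spliced samples and a list naming the source of each block.
    Cuts are hard (no crossfade) and no level matching is applied.
    """
    if len(hot) != len(safety):
        raise ValueError("tracks must be the same length and time-aligned")
    out, sources = [], []
    for start in range(0, len(hot), block_size):
        hot_block = hot[start:start + block_size]
        # Treat any sample at or near full scale as evidence of clipping.
        if any(abs(x) >= clip_level for x in hot_block):
            out.extend(safety[start:start + block_size])
            sources.append("safety")
        else:
            out.extend(hot_block)
            sources.append("hot")
    return out, sources

# Tiny example: the second block of the hot track is clipped.
mixed, sources = splice_safety(
    hot=[0.1, 0.2, 1.0, 1.0],
    safety=[0.01, 0.02, 0.3, 0.4],
    block_size=2,
)
# sources is ["hot", "safety"]
```

The `sources` list doubles as a log of where the cuts fall, which is useful for checking them by ear. A usable version would also need crossfades at the cut points and some level matching between the two tracks.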

One alternate method edges out a little on the branch of “practices a pro engineer may disapprove of”, but it is ultimately what I settled on. Essentially, the approach is to use our two recording devices—the handheld and the lavalier—to divide the frequency space in half, reducing the dynamic load on each one.

I’ll give an example: recording a gyil player. A lavalier mic attached to the player’s collar as normal will hang almost directly above the keys of the gyil, so it will pick up the voice and the mid-high range of the instrument; it is likely to peak and distort in the low-mid frequencies during the louder gyil sections.

The handheld recorder may be positioned pointing at the bass-register keys of the instrument to pick up the “low end”; it won’t be close enough to pick up the voice in detail.

Set up correctly, these two tracks can be mixed together to produce a full-bodied recording. All this requires is some simple but drastic equalizing, “chopping” the high-mid end off one track and the low from the other. Here’s an example from one of my gyil sessions:

Figure: High-pass filter. The lavalier track (I used an Audio-Technica AT803), with a high-pass filter removing the low end, and some minor adjustments for a clearer sound in the high-mid range.

Figure: Low-pass filter. The handheld track (I used a Zoom H4n field recorder), with a low-pass filter removing the high-mid range and some pretty serious gain in the low end.

You can see from these figures how each pattern directly complements the other, with the low- and high-pass filters meeting directly in the low-mid range. I followed up with some compression on the lavalier, which boosted the vocals without overly compromising the sound quality of the gyil. Played together, the two tracks maintain crispness in the high end and a substantial presence in the low end, and any clipping from the low-mid region of the lavalier mic is excised completely. These specific EQ patterns will look different for different equipment and different surrogate systems; the point is to have the option of a drastic EQ intervention on a distorted track without harming its overall sound quality.
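The complementary filtering described above can be illustrated with a toy crossover in Python. This is only a sketch under stated assumptions, not the EQ used for these recordings: it uses gentle one-pole filters where the curves described here are much steeper, and the 250 Hz crossover point, the `low_gain` parameter and all function names are hypothetical. Both tracks are assumed to be time-aligned lists of float samples at the same sample rate.

```python
import math

def low_pass(samples, cutoff_hz, sample_rate):
    """One-pole low-pass filter (about 6 dB per octave above the cutoff)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = dt / (rc + dt)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)  # move a fraction of the way toward the input
        out.append(y)
    return out

def high_pass(samples, cutoff_hz, sample_rate):
    """Complementary high-pass: the input minus its low-passed version."""
    low = low_pass(samples, cutoff_hz, sample_rate)
    return [x - l for x, l in zip(samples, low)]

def crossover_mix(lavalier, handheld, cutoff_hz=250.0,
                  sample_rate=44100, low_gain=1.0):
    """Take the highs from the lavalier and the lows from the handheld."""
    highs = high_pass(lavalier, cutoff_hz, sample_rate)
    lows = low_pass(handheld, cutoff_hz, sample_rate)
    return [h + low_gain * l for h, l in zip(highs, lows)]

# Stand-in data: in practice these would be the two recorded tracks.
lavalier = [0.0, 0.5, -0.25, 1.0, -1.0]
handheld = [0.0, 0.1, 0.2, 0.1, 0.0]
mixed = crossover_mix(lavalier, handheld, cutoff_hz=250.0, low_gain=2.0)
```

Because the two filters are exact complements, feeding the same signal to both inputs with `low_gain=1.0` returns that signal unchanged, which is a quick sanity check that the filters meet cleanly at the crossover. Raising `low_gain` corresponds to the low-end boost on the handheld track.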

There are still tradeoffs to this approach. In my recordings, there is a tendency to lose brilliance and presence in the high end of the gyil, and the gaŋgaa is probably inescapably boomy. To my ear, the results are nevertheless more than acceptable, though not at the level of high-quality field recording equipment tended by an able audio engineer. Some of these recordings will likely be available on this blog in the future for listeners to make their own judgements.

That is just one of the technical considerations I think is fairly particular to surrogate language recordings. In future installments, I’ll write about more of the linguistic methodology I tried and some of the results of the trip, but hopefully these technological points are worth experimenting with as well.

-Lucas James


Dartmouth '21, Linguistics.