Speech Surrogates – Encoding language through music, whistling, and other modalities

Senegalese sabar – Is it a drum language?

Today’s post is a guest post from Sofiya Ros, a PhD student at the University of Utrecht in the Netherlands. Learn more about Sofiya on the contributor bios page!

Senegalese drummers maintain a practice of playing drums in correlation with speech. These drummers belong to the social class of griots [Hale 1998, Tang 2012], and their most common drum is a single-headed drum known as the sabar.

Although sabar drums are rarely used as a speech surrogate and their main function is to affect the listener rather than to convey a message, it is clear that the practices of playing the sabar involve a close connection to linguistic expressions. In personal interviews, griots say that ‘the sabar can speak’ and utter spoken expressions that correspond to the sabar rhythms they play [Winter 2014: 646].

To what extent do such correlations justify the claim that Sabar is a drum language?

Playing the sabar involves at least nine different drum strokes (hand strokes, stick strokes, or combinations of the two), which can be seen as the basic phonemic units of the genre. These strokes combine into longer Sabar rhythms, which can be correlated with spoken utterances in Wolof, the lingua franca of Senegal.

The mapping between Sabar and Wolof is not as clear as in better-known cases of speech surrogacy on drums (e.g. the Yoruba drum language). Yoruba has three contrastive tones (high, mid and low) as well as tone contours formed by combining two of these tones. The tones and contours of spoken Yoruba can be represented by the notes of the Yoruba drums, so the drums literally mimic the spoken utterances. Sabar works differently, as Wolof is not tonal.

I am working with the data collected during previous expeditions to Senegal: bàkks (classical phrases in Sabar, not improvised on the spot) and improvisations in Sabar and their translations to Wolof. Our data has the advantage that it includes not only bàkks, which are more like fossils, previously generated phrases learnt by heart, but also improvisations, which attest that Sabar drumming is still productive and its performance is not restricted to an existing repertoire of traditional texts.

So, there they are, Senegalese griots, and here am I, with recordings of their drumming and translations, trying to find out what lies behind the drumming. I am approaching this problem by working out the rules of “translation” from Wolof to Sabar rhythms. Sabar exhibits certain rules of its own: Winter argues that rhythm production “involves grammatical operations different from those of the spoken language, and that meaningful sabar rhythms deserve to be studied as a separate object for linguistic research, a drum language referred to as Sabar” [Winter 2014: 645].

Since Sabar rhythms are clearly connected to Wolof, we should be interested in the rules and regularities that govern this connection.

My first step is to find the rules of the translation, that is, the correlation between the rhythms and the spoken language. First, I test the hypothesis of a phonological mapping between the two languages, meaning that each stroke of Sabar represents a syllable or a number of syllables in Wolof.

Surprisingly, there were enough correlations to suggest that phonological mapping is a feature of the system. For example, bi (‘the’) is rendered with the ‘gin’ stroke in 45 cases and only once as ‘tan’:

(1) Adduna bi
tan tan gin gin
‘this world’

However, of course there are irregularities as well:

(2) Jëfee ndigël rekka wóor
turun gin tan tac tan rwan
‘to do what is commanded is the only true way’

(3) Jëf rekk gu baa-xa wóor
ce rwe gin pax gin gin
‘to do what is good is the only true way’

In (2) and (3) the same word wóor (‘true’) is drummed differently: as rwan in (2) and as gin in (3).
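One way to keep track of such regularities and exceptions is to tally, for each Wolof word, which strokes render it. The sketch below is only an illustration: it assumes the word-to-stroke alignment has already been done by hand, it contains just the three pairings discussed in examples (1)-(3), and the function names are invented for this example.

```python
from collections import Counter, defaultdict

# (Wolof word, Sabar stroke) pairs, aligned by hand (an assumption of this sketch).
pairs = [
    ("bi", "gin"),     # example (1)
    ("wóor", "rwan"),  # example (2)
    ("wóor", "gin"),   # example (3)
]

def stroke_profile(pairs):
    """Count, for every word, how often each stroke renders it."""
    profile = defaultdict(Counter)
    for word, stroke in pairs:
        profile[word][stroke] += 1
    return profile

def consistency(counter):
    """Share of a word's tokens rendered by its most frequent stroke."""
    return max(counter.values()) / sum(counter.values())

profile = stroke_profile(pairs)
for word, strokes in profile.items():
    print(word, dict(strokes), round(consistency(strokes), 2))
```

Fed the counts reported above for bi (45 ‘gin’, 1 ‘tan’), consistency would come out at 45/46, or about 0.98, while a word like wóor that splits evenly between two strokes scores 0.5.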

My first statistical analysis has already shown some correlations. For example, the ‘gin’ stroke (a hand stroke at the edge of the drum) represents short vowels in 1738 cases (87%) and long vowels in 252 cases (13%). This distribution differs significantly from the overall distribution of short and long vowels in the data: short vowels are significantly more frequent with gin than in general (χ² = 54.1531, p < .00001).
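For readers who want to see the shape of such a comparison, here is a minimal sketch of a two-category chi-square goodness-of-fit test using only the Python standard library. The gin counts are the ones reported above. The corpus-wide share of short vowels is not given in this post, so BASELINE_SHORT is a purely hypothetical placeholder, and the exact test variant behind the reported statistic may differ from this one.

```python
import math

def chi_square_gof_2cat(observed_a, observed_b, expected_share_a):
    """Pearson chi-square goodness-of-fit test for two categories (1 df)."""
    n = observed_a + observed_b
    expected_a = n * expected_share_a
    expected_b = n * (1.0 - expected_share_a)
    statistic = ((observed_a - expected_a) ** 2 / expected_a
                 + (observed_b - expected_b) ** 2 / expected_b)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

short_gin, long_gin = 1738, 252  # counts reported for the 'gin' stroke
BASELINE_SHORT = 0.80            # HYPOTHETICAL overall share of short vowels

stat, p = chi_square_gof_2cat(short_gin, long_gin, BASELINE_SHORT)
share = short_gin / (short_gin + long_gin)
print(f"short vowels under gin: {share:.1%}")
print(f"chi-square = {stat:.2f}, p = {p:.3g}")
```

Replacing the placeholder with the real corpus-wide proportion is the only change needed to reuse the function on actual data.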

I am still working on the generalisations. Nevertheless, in spite of some irregularities, I am already getting a shadowy feeling that there is a way out of this chaos.

-Sofiya Ros

References

Hale, Thomas A. 1998. Griots and griottes. Bloomington: Indiana University Press.
Tang, Patricia. 2012. The rapper as a modern griot: Reclaiming ancient traditions. Hip hop Africa: New African music in a globalizing world, ed. by Eric Charry, 79–91. Bloomington: Indiana University Press.
Winter, Yoad. 2014. On the grammar of a Senegalese drum language. Language 90.3: 644–668.

How to decode a surrogate language, or, speech surrogates as wug testing

Let’s say in the course of doing fieldwork on a language, you come across a speech surrogate. What do you do? How do you begin to unravel the system and discover its inner workings?

As with most of my posts, I’ll focus on musical surrogates here, though I believe the methodology would be largely similar regardless of the modality.

With a musical surrogate language, chances are that there will be a preexisting repertoire of songs or commonly used phrases. While of course you should record as many of these as possible (more on the overall health of speech surrogates in another post), I wouldn’t recommend starting here to decode the surrogate unless the connection to speech is immediately apparent. The reason for this is that this material is likely to have been learned by rote, and quite possibly as music first before understanding its linguistic underpinnings. The repertoire may have been passed down largely unchanged from generation to generation, while the spoken language slowly evolved, creating a confusing distance between the two. It may not even be based on speech at all, but rather on a sung repertoire, as in the case of the Sambla balafon (McPherson 2018, McPherson and James ms). Finally, in most cultures, musical surrogate languages in their natural setting are full of proverbs and other figurative language, which may make it difficult to translate and compare spoken language to surrogate speech, especially if fieldwork is in its early stages or there is no thorough grammatical description available. Thus, even though the fixed repertoire may be the most natural, it is likely not the best place to start.

Instead, start with spontaneity. Ask how to say simple things. Treat it as you would regular elicitation for the spoken language. Depending upon the natural productivity of the surrogate language, this will be an easier or harder task for your musician consultants. In this post, I’ll describe my experience with the Sambla balafon, where the tradition remains highly productive. We’ll have other posts that describe cases where that may not be true, though I would maintain that testing productive phrases in this way is still an informative exercise regardless of how natural it is for the consultants.

It boils down to this: Elicitation with musical surrogate languages is essentially a kind of wug testing. Wug testing takes a morphological or phonological pattern observable in natural speech and asks speakers to apply it to novel forms. It reveals what speakers know about the rules and patterns in their language. We infer these rules from observing regular speech, but in regular speech speakers could simply be reproducing forms that they have heard rather than productively applying those rules. When we ask a musician to produce a novel phrase in the surrogate language, we are essentially asking, “What rules have they internalized from the repertoire or tradition, and what can that reveal about the relationship between the spoken and surrogate language?” Ideally it can go a step further and also reveal something about the way the spoken language works.

For the Sambla balafon, my initial meeting with Mamadou Diabate in Vienna provided me with a range of phrases on the balafon, most of which Mamadou himself had offered. This included some everyday phrases like “Where are you from?” or “What are you doing?” as well as a couple of elicited phrases that I snuck in, like “I will buy a goat” vs. “I will buy goats”. Though some of the phrases were more complex than my level of Seenku understanding at the time, it was apparent that factors like tone and vowel length were playing a role in the surrogate language.

When I returned to Burkina Faso the following summer, I was determined to get to the bottom of how exactly tone and syllable structure worked, and whether there were any other phonological contrasts that I was missing. I spent a few days with Mamadou’s nephew Nigo Diabate focusing on elicitation. I began with a simple sentence whose tones I felt more or less confident about:

mó sḭ̌ səmâ nɛ̏
‘I am dancing’

Then, just as in regular spoken elicitation, I systematically varied one element, in this case, the subject pronoun, swapping out the High-toned 1sg mó for a Superhigh-toned 1pl mi̋:

mi̋ sḭ̌ səmâ nɛ̏
‘We are dancing’

Sure enough, only the beginning of the phrase differed on the balafon, with the pronoun mi̋ corresponding to a higher note on the balafon. Transcriptions of these two phrases are provided below, where the Seenku names of the notes of the balafon are abbreviated along the left-hand side and the square of the grid is filled in when that note is struck. (See Strand 2009 or McPherson 2018 for discussion of Sambla tuning and note names.)

After a few iterations with different subjects, I varied the verb, changing from səmâ ‘dance’ with a contour tone to kȍee with a level extralow tone.

mó sḭ̌ kȍee nɛ̏
‘I am singing’

Sure enough, the notes corresponding to the verb dropped to the same level.
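The logic of this kind of minimal-pair elicitation can be sketched in a few lines of Python. Everything in the example is illustrative: the note numbers are HYPOTHETICAL stand-ins (one integer per stroke, higher meaning a higher key), not the actual balafon transcriptions, and the function name is invented here.

```python
def differing_positions(rendition_a, rendition_b):
    """Return the stroke positions at which two equal-length renditions differ."""
    if len(rendition_a) != len(rendition_b):
        raise ValueError("renditions must have the same number of strokes")
    return [i for i, (a, b) in enumerate(zip(rendition_a, rendition_b)) if a != b]

# HYPOTHETICAL note sequences for two phrases that differ only in the subject pronoun.
phrase_with_high_subject = [3, 2, 2, 3, 1]
phrase_with_superhigh_subject = [4, 2, 2, 3, 1]

print(differing_positions(phrase_with_high_subject, phrase_with_superhigh_subject))
```

If swapping one spoken element changes only the corresponding positions on the instrument, that is good evidence that the varied feature (here, tone) is what the surrogate encodes.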

Of course, the relationship between the spoken and surrogate language may not always be straightforward or one-to-one. For the Sambla balafon, for instance, elicitation sometimes reveals free variation, such as the following two musical equivalents of ȁ sḭ̌ səmâ nɛ̏ ‘s/he is dancing’, with an extralow-toned subject pronoun:

Notice that in both of these cases, the səmâ nɛ̏ part is played lower on the instrument than it was with the other pronouns. I suspect this has to do with the initial extralow tone, but it remains a bit of a mystery, since when asked, a musician will also accept it played higher up.

Cases of variation like this highlight the importance of (1) recording the same phrase more than once, preferably on different occasions to avoid self-priming, and (2) collecting data from multiple consultants, if possible. This can help triage errors from true variation, identify which version might be most natural, and determine how consistent the surrogate language system is between practitioners. Much of this is common sense from wug testing, highlighting once again the connection between surrogate languages and these other types of experimental work.
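A simple bookkeeping sketch shows how repeated recordings can help sort variation from slips. The session log below is entirely HYPOTHETICAL (invented consultants, sessions and note sequences), and the helper names are made up for this illustration.

```python
from collections import Counter, defaultdict

# HYPOTHETICAL log entries: (phrase, consultant, session, rendition as note numbers).
recordings = [
    ("s/he is dancing", "A", 1, (1, 2, 2, 3, 1)),
    ("s/he is dancing", "A", 2, (1, 2, 1, 2, 1)),
    ("s/he is dancing", "B", 1, (1, 2, 2, 3, 1)),
    ("s/he is dancing", "B", 2, (1, 2, 2, 3, 1)),
]

def variants_by_phrase(recordings):
    """Count how often each rendition of a phrase occurs, and who produced it."""
    counts = defaultdict(Counter)
    players = defaultdict(lambda: defaultdict(set))
    for phrase, consultant, _session, rendition in recordings:
        counts[phrase][rendition] += 1
        players[phrase][rendition].add(consultant)
    return counts, players

def singletons(counts, phrase):
    """Renditions attested only once: candidates for errors, pending re-elicitation."""
    return [r for r, n in counts[phrase].items() if n == 1]

counts, players = variants_by_phrase(recordings)
print(singletons(counts, "s/he is dancing"))
```

A rendition that recurs across sessions and consultants is more likely to be true variation, while a one-off is worth checking again before it goes into the analysis.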

Before wrapping up this post, it’s worth pointing out that different consultants will have differing levels of tolerance for this kind of elicitation. In my experience in Mali and Burkina Faso, classical wug testing in the spoken language is a near impossible task (“But that’s not a word…”), but the Sambla balafonists have been fairly tolerant of my bizarre requests to say unusual things on the instrument. Thus, elicitation work on speech surrogates offers a unique source of “external evidence” (Churma 1979) to probe the phonology of a language where more traditional methods like wug testing may fail.

Nevertheless, there are limits. The Sambla balafonists are much more willing to provide phrases than single words, since phrases offer some possibility of disambiguation and thus render the message more natural. For the same reasons, they are not too keen on meaningless frame sentences that would allow us to look for subtle differences in the musical rendition of single words. As Mamadou has told me, it just doesn’t mean anything and thus isn’t true balafon speech. This may depend on the tradition, since I have seen others like Samuel Akinbo working with Yorùbá dùndún drummers (Akinbo 2019) have more success eliciting single words.

Only once the productive rules are worked out through elicitation would I recommend tackling the fixed repertoire. You’ll be armed with the musicians’ metalinguistic knowledge of the system and can use that to determine how close a match the rote material is to its spoken translation.

References

Akinbo, Samuel. 2019. Representation of Yorùbá tones by a talking drum: An acoustic analysis. Langues et Linguistique Africaine.

Churma, Donald. 1979. Arguments from external evidence in phonology. PhD dissertation, Ohio State University.

McPherson, Laura. 2018. The talking balafon of the Sambla: Grammatical principles and documentary implications. Anthropological Linguistics 60.3: 255-294.

McPherson, Laura and Lucas James. Ms. Artistic adaptation of Seenku tone: Musical surrogates vs. vocal music. Submitted to Selected Proceedings of ACAL50.

Strand, Julie. 2009. The Sambla xylophone: Tradition and identity in Burkina Faso. PhD dissertation, Wesleyan University.

Enphrasing isn’t just about disambiguation

In this post, I want to talk about enphrasing, a phenomenon of abridging surrogate systems that Stern (1957) defines simply as: ‘the lexical unit is replaced with a phrase’. This phenomenon is attested in systems across Central and West Africa (Finnegan 2012), in the Amazon (Seifart, Meyer et al. 2018), and Southeast Asia (e.g. Bradley 1979). I have not found any evidence of enphrasing in North America.

Enphrasing substitutes longer expressions for shorter ones, sometimes directly elaborating on the original word or phrase. Here’s an example from Kele, which I’ll continue to draw from below: in drummed Kele, the word songe ‘moon’ is replaced with a sequence meaning ‘the moon looks down on the earth’ (Carrington 1949:33). There’s a classical analogue in the Old Norse kenning, which replaced words like vargr ‘wolf’ with elaborations like svalg áttbogi ylgjar ‘the evil offspring of the she-wolf’ for aesthetic purposes.

Enphrasing is often explained functionally as a way of disambiguating surrogate speech. Because abridging systems encode a limited amount of phonological information, utterances with similar properties (e.g. the same tone pattern) can often be homophonous. Enphrasing is said to help with this problem: as explained in Seifart et al. (2018), “short words that would come out as homophones in drumming are replaced by longer, less ambiguous expressions, often with poetic creativity.” This explanation dates back to the coining of the term itself in Stern (1957).

But I think there’s another dimension to enphrasing that needs to be mentioned alongside disambiguation: roteness. It’s no earth-shattering observation, but I think it’s useful to include as part of the framing: enphrased sequences can’t just be longer phrases, they also have to be rote expressions in common use.

Here’s a neat example of what I mean using the same Kele word as I mentioned before, songe ‘moon’. According to Carrington, both songe and koko ‘fowl’ would be drummed on their own as two high-toned strokes. He reports an enphrased sequence for koko that corresponds to the phrase ‘the fowl, the little one which says kiokio’. With enphrasing, the homophony between songe and koko is eliminated.

But hold on: couldn’t ‘the moon looks down on the earth’ just as easily be ‘the fowl looks down on the earth’? On the drum, the phrases should be homophonous, beginning with two high-toned strokes. Semantically speaking, a flighted bird is no less likely a candidate to look down on the earth than the moon. Why are these enphrased sequences not just as ambiguous as the original word pair?
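The puzzle can be made concrete with a toy model. In this sketch, the two-high-stroke pattern for both words and the English glosses come from the discussion above; the assumption that a shared predicate would be drummed identically after either subject, and all function names, are simplifications introduced for illustration.

```python
# Both words are drummed as two high strokes (H H), per Carrington as cited above.
subjects = {"moon": "HH", "fowl": "HH"}

# The conventional elaborations reported for each word (English glosses only).
repertoire = {
    "moon": "the moon looks down on the earth",
    "fowl": "the fowl, the little one which says kiokio",
}

def free_readings(predicate, subjects, heard_pattern):
    """Every subject + predicate reading available if phrases were composed freely."""
    return [f"the {s} {predicate}" for s, p in subjects.items() if p == heard_pattern]

def rote_readings(predicate, subjects, heard_pattern, repertoire):
    """Only those readings that match a conventional phrase in the shared repertoire."""
    known = set(repertoire.values())
    return [r for r in free_readings(predicate, subjects, heard_pattern) if r in known]

print(free_readings("looks down on the earth", subjects, "HH"))
print(rote_readings("looks down on the earth", subjects, "HH", repertoire))
```

Under free composition the drummed phrase stays two-ways ambiguous; checking it against a closed, shared repertoire leaves a single reading.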

The answer isn’t a big revelation, but I think it’s important. These are rote phrases, understood by a community of practice to be part of their existing repertoire. Enphrasing doesn’t just elaborate single words into longer phrases. It also pares down the universe of possible correspondences to a cognitively manageable level, ensuring practitioners and their audiences are drawing from a shared reservoir of mutually understood, appropriate language.

Why is this framing important? For one thing, it’s useful to be reminded that speech surrogacy is as much a cultural phenomenon as it is a linguistic one. The practitioners of a given system share a language, a traditional music, a regional or ethnic identity, perhaps even a familial tie. The same goes for their non-practicing audience. The lexicon of a system—the words that are common enough to have an enphrased sequence, and the content of enphrased sequences themselves—is shaped by this shared language and culture. Notably, Central and West African systems are not only rich with proverbial expressions drawn from pre-existing oral literature but are often seen to form part of that oral literature itself (Finnegan 2012). It’s worth it to consider the cultural role that enphrasing plays, not only its functionality.

More expansively, I also think it helps guide our research into broader questions about roteness in surrogate speech. One of the fundamental typological questions that I am most interested in is: what circumstances produce novel utterances in surrogate speech? It’s clear that surrogate systems have different levels of productivity. Differences in the use of enphrasing may help explain these disparities.

Take the Dagaare gyil tradition. Musically, it is a living tradition, and individual performers certainly compose new pieces and embellish the existing repertoire. But linguistically, it is a good example of a system that is currently pretty much unproductive. Its traditional performance is restricted to an existing repertoire of texts; only the “progression of variations [is] left to the performer’s discretion”, and even that is somewhat restrained to a traditional ordering (Campbell 2005:48).

Michael Vercelli illustrates this in the gyil’s role in Dagaare funerals:

“[The] gyil player will directly address the participants through the use of understood phrases spoken on the instrument … Just as a contemporary wedding DJ would choose songs appropriate for the bride and groom’s first dance, the gyil player must select songs appropriate for the specific funeral ceremony.”

—(Vercelli 2012:3)

These “understood phrases” aren’t examples of enphrasing. They don’t replace short words with disambiguated rote expressions. They’re more like fossils, a repertoire of previously generated phrases encased in musical amber. This intrigues me, because I have worked with two Dagaare surrogate practitioners—a gyil player and a whistler—who effortlessly produced novel utterances under elicitation conditions. It’s not impossible to come up with a phrase like “the good red guinea fowl” or “I was walking” and produce it within the Dagaare surrogate system. However, my consultants have always caveated these attempts with words like “artificial”, “experiment”, and “unnatural”. It seems that the surrogate system is intact, but the act of generating novel phrases has been pushed out of the scope of the tradition as practiced.

So while it isn’t enphrasing by definition, I want to argue that this ‘fossilization’ is part of the same process that creates enphrasing. There is a set of functional and cultural pressures to rely on a small repertoire of disambiguated rote expressions: they are not only easier to process and understand, but can also serve to strengthen shared community ties. Dagaare traditional music seems to have responded to those pressures, allowing a rich tradition of recognizable rote expressions to take root.

What’s on the other end of the spectrum? As I mentioned before, I haven’t found any North American whistling systems that rely on elaborated stock phrases. The documentation I have read suggests that these systems are frequently used for short, spontaneous conversations. Familiarity is essential for comprehension; but, rather than being built in with a common repertoire of elaborated rote sequences, these systems arrive at familiarity primarily through everyday speech and context clues.

Here’s an example from a Tlaxcalan Spanish whistling system:

Tlaxcalans, like Gomeros, boast that they can whistle anything they can say. Generally, however, conversations are between family members, friends, or neighbors, and are restricted to short which tends to put exchanges in familiar contexts. Such whistled requests or advice as “Bring me a shovel”, or “I’m going to the pueblo” are common.

A typical exchange between a farmer in his field and a friend on the road might consist of the following: “Pedro, ¿a donde vas?” (“Pedro, where are you going?”). “A Tlaxcala” (“To Tlaxcala”, the nearby capital city of the state). “¿Porque?” (“Why?”). “A vender mis cebollas” (“To sell my onions”); and end with a whistled “Adios”.

—Wilken (1979:883)

Mazatec whistling, similarly, is “frequently (though not necessarily) concerned with topics immediately obvious to both parties … and used in situations where cultural context plays a much greater part” (Cowan 1948:283-4).

Neither Tlaxcalan Spanish nor Mazatec whistling allows for complex, sprawling dialogue, and familiarity is still essential to comprehension. But it’s obvious that these short, everyday phrases constitute an opposite strategy to a system like Dagaare’s. These systems seem to be unaffected by the pressures towards disambiguated rote expressions, instead relying informally on cultural and situational context.

Many surrogate systems probably fall in between these extremes, employing a mixture of spontaneity and roteness. This middle ground is (finally) where enphrasing appears. At the risk of running long on this post, I should give a few examples, because it’s important to describe how enphrasing actually works: creating stock phrases that can be spontaneously combined in novel sequences. I’ll leave it to the reader to evaluate where their own speech surrogates of study fall on this continuum.

My first example is Bora manguaré drumming. Briefly, this Amazonian surrogate system has a ‘singing mode’, where a set of rote rhythmic phrases, associated with sung lyrics, are played as a form of musical performance. That’s just like the Dagaare gyil tradition. But manguaré also has a “talking mode … used to transmit relatively informal messages and public announcements” (Seifart et al. 2018:6). These messages take the form of enphrased sequences:

“nouns and verbs are marked with special disyllabic markers…On nouns, the marker –úβù is used…For verbs, the marker – is used… In drummed messages, these markers do not carry any semantic value, but function purely to identify the preceding sequences of beats as representing nouns or verbs…there are conventional long forms for words that occur frequently in manguaré messages to render them less ambiguous… for instance, the Bora noun referring to a commonly hunted deer species nììβúgwà is replaced in manguaré messages with ìámé-tùùtáβààbè néébá-nììβúgwà-úβù, literally ‘deceased annatto deer, damaged animal’.”

—(ibid:9)

Even more interestingly, these enphrased sequences are embedded into a standard ‘frame’, helping to reduce the cognitive load of comprehension further:

—(ibid:6)

Manguaré shows how enphrasing can balance the need for flexible, spontaneous communication with the advantages of roteness. By elaborating words in identifiable units, situated clearly within a frame, enphrasing allows manguaré a lot of intelligibility without an extremely restrictive lexicon of immediate topics. Given this system, it’s easy to imagine, as Seifart and colleagues suggest, that “manguarés were in daily use [in the 20th century], including for conversations about almost any subject and that messages were relayed from one roundhouse to another to reach further distances” (ibid:5).

Returning to Kele, there are many similarities here with Bora drumming, though the details differ. As with Bora drumming, Kele drumming employs a variety of communication strategies. There are documented cases of fully fixed paragraph-length sequences in this system. For instance:

“Another stock communication is the announcement of a dance, again with the drum speaking in standardized and repetitive phrases:

All of you, all of you,

come, come, come

let us dance

in the evening

when the sky has gone down river

down to the ground. [Carrington 1949: 61-2]”

—(Finnegan 2012:472)

It’s not apparent that these announcements are embedded in a ‘musical mode’ the way that rote expressions are in the Dagaare and Bora systems. However, Carrington certainly describes them as ‘standardized’, and associates them with cultural practices (like dances and funerals) in a similar way. This portion of the surrogate practice shows a strict reliance on elaborated rote phrases.

In addition, though, we see the famous Kele enphrased sequences in action. Of course, an enphrased sequence corresponding to a single word like ‘moon’ or ‘fowl’ is pragmatically useless without a larger context. As in Bora, that larger context is provided within a frame where multiple sequences can be combined to produce a novel utterance. Take this example (lightly edited from Carrington for ease of comparison), which is not associated with a traditional social function and instead announces a news event and provides listeners with instructions:

English: ‘The missionary is coming up-river to our village tomorrow. Bring water and firewood to his house.’

Kele (spoken): Bosongo atoya ko nda bokenge wasu lɛlɛngo. eʃaka balia la toala ko nda ndakɔ yande.

Kele (drummed, with spoken equivalent):

bosongo olimo ko nda lokonda	‘white man spirit from the forest
wa lokasa lwa lonjwa	of the leaf used for roofs
atoya likolo atoya likolo	comes up-river, comes up-river
ko lɛlɛngɔ ekaliekele	when tomorrow has risen
likolo ko nda use	on high in the sky
ko nda likelenge liboki	to the town and the village
liaaka la iso	of us
yaku yaku yaku yaku	come, come, come, come
yatikeke balia ba lɔkɔila	bring water of lɔkɔila vine
yatikeke tokolokolo twa toala	bring sticks of firewood
ko nda ndakɔ ya tumbe elundu likolo	to the house with shingles high up above
ya bosongo olimo ko nda lokonda	of the white man spirit from the forest
wa lokasa lwa lonjwa	of the leaf used for roofs’

—Carrington (1949:54)

There are a few interesting points here. First is its spontaneity: given the context, it seems likely that this really is a fairly novel message. It’s not clear why the Kele tradition would have developed such a specific stock phrase out of whole cloth, though of course it’s possible. In any case there are multiple similar cases in Carrington’s documentation, showing that these phrases must have been generated quite readily and productively. This seems like a spontaneous message aided by the availability of relevant enphrased sequences.

The other interesting part is the evidence of a frame. While it’s not as clear cut as in the Bora drumming system, there are differences in the syntax of the drummed message compared to the spoken one, suggesting that the surrogate system has a preferred ordering.

Here’s an informal gloss to highlight the difference:

Spoken Kele: Missionary comes [up-river] to village our tomorrow. Bring water and firewood to house his.

Drummed Kele: Missionary comes up-river tomorrow to village our. Bring water bring firewood to house of missionary.

The main variation is with the verb yatikeke, which appears before both ‘firewood’ and ‘lɔkɔila vine’ only in the drummed phrase. If enphrasing simply replaced these two words with their elaborated equivalents, the conjunction could have remained intact. But it seems that the bring [noun] frame is preferred in this instance, just as it is in Bora drumming.

Without going further into the other differences—namely the restatement of the possessive pronoun’s referent and the change of position of the temporal phrase—I will say that enphrasing systems need to be subject to a lot more detailed analysis, syntactic and otherwise. I would bet that many more systems employ these tools, including the framing devices on display in Bora and Kele, than we’re aware of.

So while much has been written on the basic idea and content of enphrasing, I would like to see a much more detailed discussion around its actual mechanics and properties. I’m encouraged by the rigorous approach taken in papers like Seifart, Meyer, Grawunder, and Dentel’s work on Bora drumming. I hope this post may help inspire a more complex discussion in that vein.

—Lucas James

References

Bradley, D. “Speech through Music: the Sino-Tibetan Gourd Reed-Organ.” Bulletin of the School of Oriental and African Studies, vol. 42, no. 3, 1979, pp. 535–540., doi:10.1017/S0041977X00135773.

Campbell, Corinna Siobhan. Gyil music of the Dagarti people: Learning, performing, and representing a musical culture. Diss. Bowling Green State University, 2005.

Carrington, John F. “A comparative study of some Central African gong-languages”. Vol. 13. Académie royale des sciences d’outre-mer. Classe des sciences morales et politiques, 1949.

Cowan, George M. “Mazateco Whistle Speech.” Language, vol. 24, no. 3, 1948, pp. 280–286. JSTOR, www.jstor.org/stable/410362.

Finnegan, Ruth. Oral Literature in Africa, Open Book Publishers, 2012. ProQuest Ebook Central.

Godsey, Larry Dennis. “The Use of the Xylophone in the Funeral Ceremony of the Birifor of Northwest Ghana.” Diss. University of California, Los Angeles, 1980.

Lewis, T. Becoming a garamut player in Baluan, Papua New Guinea: Musical analysis as a pathway to learning, Taylor and Francis, 2018. doi:10.4324/9781315406503.

Seifart, Meyer et al. “Reducing language to rhythm: Amazonian Bora drummed language exploits speech rhythm for long-distance communication.” Royal Society open science vol. 5,4 170354. 25 Apr. 2018, doi:10.1098/rsos.170354

Stern, Theodore. “Drum and whistle languages: An analysis of speech surrogates.” American Anthropologist 59.3 (1957): 487-506.

Vercelli, Michael. “Ritual Communication Through Percussion: Identity and Grief Governed by Birifor Gyil Music.” DMA diss. University of Arizona, 2006.

Wilken, Gene C. “Whistle speech in Tlaxcala (Mexico).” Anthropos H. 5./6 (1979): 881-888.

Field Notes from Ghana, Part 1: Audio recording

Last December, I took a somewhat unusual data-gathering trip to Accra, Ghana. Rather than targeting a single language, the trip focused on a few of Ghana’s extant surrogate systems. Over the course of two weeks, I made field recordings that incorporated several languages, instruments, modalities, and methodologies.

This post is the first in an ongoing series of observations and discussion about that trip. Over this series I intend to describe some of my methodology and goals—and ultimately, some conclusions—in the hope that it helps others plan similar efforts. Today, I’ll give a basic overview and then dig into one of my current technical preoccupations: controlling dynamic range in surrogate language recordings.

Overview

In my limited time, I wanted to scratch the surface of as many available surrogate languages as possible. The intention was a sort of “surrogate language speed dating”, in which I would work with several practitioners and speakers for a few sessions at a time, using existing materials as an aid.

On this trip, I spent most of my time with Benjamin N., a practitioner of the Birifor and Dagara variants of the gyil surrogate tradition. Birifor and (northern) Dagara are typically considered distinct variants within the Dagaare dialect continuum, both associated with northern Ghana and southern Burkina Faso. My work with Benjamin spanned both language varieties and several modalities: gyil or resonator xylophone, gaŋgaa or double-headed cylindrical drum, and whistling. I dedicated one or more elicitation sessions to each language-modality pairing, during which I would elicit both existing (proverbial) and novel (productive) material. I also worked with speakers of Eʋe, drawing on existing materials, and briefly with a Twi surrogate language practitioner. All told, these language-modality pairings add up to eight surrogate varieties, six from the Birifor/Dagara continuum.

Crucially, the goal was not to gain a full, nuanced picture of any system as a whole. Instead, I wanted a sketch: a small but usable list of existing surrogate words and phrases and their spoken equivalents (in the West African tradition, the target is mostly proverbial expressions), a fairly robust surrogate phonology, and a preliminary sense of the system’s productivity and flexibility.

There’s an obvious disadvantage to this approach: it doesn’t allow much confidence in any larger scientific conclusions.

But I think there’s an advantage worth considering, given the topic at hand: there’s reason to think that just fifty or a hundred years ago, these systems were widespread in many parts of the world. Now, they’re increasingly difficult to track down, and quickly losing vitality. This kind of quick-and-dirty approach helps cover a lot of ground, which is necessary if we want the bigger picture of how speech surrogacy works, before the population and genetic diversity of these systems are greatly diminished. I see this work as hands-on typology, and we need more of it before it’s too late.

Now, I want to discuss just one of the many methodological considerations I faced in Accra: how to produce listenable field recordings.

The dynamic range problem

I’m not a professional audio engineer, and my approach here is probably very different from a savvy field engineer’s. Still, I think I got decent recordings in a low-profile setup realistic for linguistic research.

Eliciting linguistic and musical surrogate data simultaneously requires a lot of equipment flexibility. An intimate interview setting conditions a consultant to speak quietly; surrogate languages tend to be loud and carry long distances. A consultant may switch from a low speaking voice to a loud drumbeat, yell or whistle many times within the course of a recording.

This can be a strain on equipment that needs a “sweet spot” of sensitivity to produce a faithful recording without too-quiet spots or clipping. This problem is exacerbated by the equipment a field linguist tends to have access to: typically no more than a few pieces of consumer or “pro-sumer” gear. The usual tools of an audio engineer—multiple high-quality microphones, close monitoring and level adjustment—just aren’t realistic for a researcher focused on data gathering.

So, what is the best option for a surrogate language researcher with only, say, a handheld recorder and a lavalier mic?

Familiarity with basic digital audio processing goes a long way here. Compressors and limiters are the basic dynamics controls: they allow a recording to be made at a safe sensitivity level, then adjusted afterwards so that quieter parts are still audible. This approach worked for my Twi recordings, where I used a single lavalier mic to record quiet speech, louder sung vocals and a plucked string instrument at once. Moderate compression was sufficient to keep everything within a listenable dynamic range.
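For readers who have not worked with a compressor before, the following is a minimal sketch of the idea in Python, not the processing chain used for these recordings. It assumes samples are floats between -1.0 and 1.0, and the threshold, ratio and make-up gain are illustrative values only. Real compressors also smooth their gain changes with attack and release times, which this static version leaves out.

```python
import math

def compress(samples, threshold_db=-20.0, ratio=4.0, makeup_db=6.0):
    """Static downward compressor with make-up gain (no attack/release).

    All parameter values are illustrative assumptions.
    """
    out = []
    for x in samples:
        level = abs(x)
        if level == 0.0:
            out.append(0.0)  # silence stays silent; also avoids log10(0)
            continue
        level_db = 20.0 * math.log10(level)
        if level_db > threshold_db:
            # Above the threshold, `ratio` dB of input yields 1 dB of output.
            level_db = threshold_db + (level_db - threshold_db) / ratio
        # Make-up gain lifts the whole signal, so quiet passages become audible.
        out_db = level_db + makeup_db
        y = math.copysign(10.0 ** (out_db / 20.0), x)
        out.append(max(-1.0, min(1.0, y)))  # hard limit at full scale
    return out

# A quiet sample (-40 dB) and a full-scale sample (0 dB):
quiet, loud = compress([0.01, 1.0])
```

With these example settings, the 40 dB gap between the two input samples shrinks to 25 dB: the quiet sample is raised by the make-up gain, while the loud one is pulled down by the ratio.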

Compression is going to be unsatisfactory in more demanding settings, however. Noisier environments and wider dynamic ranges require more severe compression. At a certain point, this begins to affect both the sound quality and the faithfulness of the data. I reached this point with recordings featuring the gyil and gaŋgaa, both percussion instruments with huge dynamic ranges.

One solution is the “safety track”. You may use two recording devices, or one, if your recorder can send the same signal to multiple inputs. When one track is set at a significantly lower level than the other, a dynamically variable recording will produce two complementary tracks: a less sensitive one (the “safety”) with areas that are too quiet, and a more sensitive one with too-loud clipping areas. Then you may manually cut between the two tracks as appropriate (a “gate” on the safety will silence the quieter parts of that track automatically if you wish). This method is technically appropriate but becomes extremely time intensive if there’s any amount of quick transition between speaking and playing.
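The manual cutting step can in principle be automated. The sketch below is a rough illustration under several assumptions: the two tracks are sample-aligned, the level difference between them is known, and samples are floats where 1.0 is full scale. The function name, block size and clip threshold are hypothetical choices, not part of any particular audio tool.

```python
def merge_safety(hot, safety, gain_offset_db=12.0, clip_level=0.99, block=4):
    """Per block, keep the sensitive ("hot") track unless it clips;
    otherwise substitute the safety track, scaled up to match.

    Assumes both tracks are sample-aligned and that the safety was
    recorded `gain_offset_db` quieter than the hot track.
    """
    if len(hot) != len(safety):
        raise ValueError("tracks must be sample-aligned and equal in length")
    match = 10.0 ** (gain_offset_db / 20.0)  # linear gain that aligns levels
    merged = []
    for start in range(0, len(hot), block):
        hot_block = hot[start:start + block]
        if max(abs(s) for s in hot_block) >= clip_level:
            # Clipped region: take the undistorted safety track instead.
            merged.extend(s * match for s in safety[start:start + block])
        else:
            merged.extend(hot_block)
    return merged

# Toy example: the second block of the hot track is clipped.
hot = [0.1, 0.2, -0.1, 0.0, 1.0, 1.0, -1.0, 0.5]
safety = [0.01, 0.02, -0.01, 0.0, 0.2, 0.15, -0.3, 0.05]
merged = merge_safety(hot, safety, gain_offset_db=20.0)
```

Two caveats apply. The merged values can exceed 1.0, so the result needs to stay in floating point and be normalized or compressed afterwards. Hard switches at block boundaries can also produce audible clicks, so a practical version would crossfade between the tracks over much longer blocks.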

One alternate method edges out a little on the branch of “practices a pro engineer may disapprove of”, but it is ultimately what I settled on. Essentially, the approach is to use our two recording devices—the handheld and the lavalier—to divide the frequency space in half, reducing the dynamic load on each one.

I’ll give an example: recording a gyil player. A lavalier mic attached to the player’s collar as normal will hang almost directly above the keys of the gyil, so it will pick up the voice and the mid-high range of the instrument; it is likely to peak and distort in the low-mid frequencies during the louder gyil sections.

The handheld recorder may be positioned pointing at the bass-register keys of the instrument to pick up the “low end”; it won’t be close enough to pick up the voice in detail.

Set up correctly, these two tracks can be mixed together to produce a full-bodied recording. All this requires is some simple but drastic equalizing, “chopping” the high-mid end off one track and the low from the other. Here’s an example from one of my gyil sessions:

High-pass filter: The lavalier track (I used an Audio-Technica AT803), with a high-pass filter removing the low end, and some minor adjustments for a clearer sound in the high-mid range.

Low-pass filter: The handheld track (I used a Zoom H4n field recorder), with a low-pass filter removing the high-mid range and some pretty serious gain in the low end.

You can see from these figures how each pattern directly complements the other, with the low- and high-pass filters meeting directly in the low-mid range. I followed up with some compression on the lavalier, which boosted the vocals without overly compromising the sound quality of the gyil. Played together, the two tracks maintain crispness in the high end and a substantial presence in the low end, and any clipping from the low-mid region of the lavalier mic is excised completely. These specific EQ patterns will look different for different equipment and different surrogate systems; the point is to have the option of a drastic EQ intervention on a distorted track without harming its overall sound quality.
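The split-and-mix idea can be sketched in a few lines of Python. This is a simplified illustration rather than the EQ actually applied to the recordings: it uses gentle first-order filters where an EQ plugin would normally use much steeper ones, and the 300 Hz crossover, the sample rate and the function names are all assumptions chosen for the example.

```python
import math

def low_pass(samples, cutoff_hz, sample_rate):
    """First-order (one-pole) low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)  # move a fraction of the way toward the input
        out.append(y)
    return out

def high_pass(samples, cutoff_hz, sample_rate):
    """Complementary high-pass: the input minus its low-passed copy."""
    low = low_pass(samples, cutoff_hz, sample_rate)
    return [x - l for x, l in zip(samples, low)]

def mix_split(lavalier, handheld, crossover_hz=300.0,
              sample_rate=44100, low_gain=1.0):
    """Keep the highs of the lavalier and the lows of the handheld.

    `crossover_hz` and `low_gain` are illustrative assumptions; both
    tracks are assumed to be time-aligned and equal in length.
    """
    highs = high_pass(lavalier, crossover_hz, sample_rate)
    lows = low_pass(handheld, crossover_hz, sample_rate)
    return [h + low_gain * l for h, l in zip(highs, lows)]
```

Because the high-pass here is defined as the input minus its low-passed copy, the two filters meet at the crossover without leaving a gap: feeding the same signal into both inputs with `low_gain=1.0` returns that signal unchanged. With two different microphones, anything distorted below the crossover in the lavalier track is attenuated rather than fully removed by a first-order filter, and a steeper filter would remove more of it.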

There are still tradeoffs to this approach. In my recordings, there is a tendency to lose brilliance and presence in the high end of the gyil, and the gaŋgaa is probably inescapably boomy. To my ear, the results are nevertheless more than acceptable, though not at the level of high-quality field recording equipment tended by an able audio engineer. Some of these recordings will likely be available on this blog in the future for listeners to make their own judgements.

That is just one of the technical considerations I think is fairly particular to surrogate language recordings. In future installments, I’ll write about more of the linguistic methodology I tried and some of the results of the trip, but hopefully these technological points are worth experimenting with as well.

-Lucas James

Why study musical surrogate languages? A linguist’s perspective

It may seem like an unusual choice for a linguist to study musical instruments, as if we’re dallying uninvited with the territory of ethnomusicology. However, linguistics is the study of human language in all of its myriad forms, and I would argue that musical surrogate languages represent one of these forms. Now, it’s not a primary form like spoken or signed language. No child grows up with a musical surrogate language as their mother tongue, used for all day-to-day interactions. But musical surrogate languages are linguistically grounded in a way that other forms of non-verbal self-expression, like dance or dress, are not. In other words, we can learn a lot about spoken language through musical surrogate languages.

Here, I’ll offer an anecdote about how I stumbled into this area, in the hopes that it will encourage other linguists to more purposefully join the quest to better understand these systems.

It was 2013. I was in Burkina Faso on my first major field trip to document Seenku, a small Mande language spoken near Bobo-Dioulasso (by an ethnicity known to outsiders as the Sambla). By this time, I had been working in West Africa for five years, having recently wrapped up a project in neighboring Mali documenting Tommo So, a member of the Dogon language family. A lifelong lover of music, I had spent the last year analyzing Tommo So vocal music to understand what happened to the tones of the language when it was set to melody. Even in my early days working with Seenku, it was clear that its tone system was more complex than that of Tommo So, so I wondered what those tones would do when sung.

I asked my two main consultants, Clement and Emma, if they could bring me any recordings of Seenku songs. A couple days later, Clement gave me a folder of mp3s from a Sambla musician, Mamadou Diabate. That night, I greedily pored over the music, thrilled to hear what this language would sound like in song. To my great disappointment, the majority of the recordings were instrumental, a delicious but impenetrable tangle of xylophones (known in West Africa as balafons). I managed to find a couple of sung phrases and brought them up with Clement and Emma the next day to translate.

tɛ́nɛ́ gba̰̋ kú dɔ̏ɔ-nɛ̏-fi̋ɛ kənű mḭ̏ɛ̰, kʊ́ lɛ́ í bɛ̋ í wó bṵ́ɛ̰ɛ-kɛ̰̋ɛ̰
‘Whoever smells the odor of second day millet beer, he will say he worships the fetish.’

I thought I had wrapped up the work when Clement said enigmatically, “I know what the balafon is saying.”

Until that moment, the thought had never crossed my mind that an instrument could speak. Sure, I had heard the phrase “talking drums” before, but never given it much thought. Clement told me how everyone in the village could understand when the balafon spoke to them and offered a few translations from the song we were listening to (Ji Te So, from Mamadou Diabate’s album Keneya). And then I turned my attention back to pressing matters of conjugating verbs and figuring out phrasal tone, tucking the balafon conversation away for a rainy day.

That day came the following year, when a winter trip to Burkina Faso was postponed due to revolution. I was headed to Europe for a workshop and remembered that Mamadou Diabate lived in Vienna. On a whim, I reached out to him to see if I could pay him a visit and try to unravel some of the mysteries of how the balafon speaks. He agreed, and before I knew it, there we were in his studio. He was sitting behind his balafon, a huge instrument of wood and gourds, dressed in a crisp white shirt and black slacks, ready for a concert later. He told me that anything you could say with your mouth, you could say on the balafon. I offered a few phrases I’d been learning (things you would probably never actually say on the balafon, but you theoretically could).

“I will buy a goat.”
“I will buy goats.”

And there it was: the tonal difference that distinguishes Seenku singulars from plurals, clear as day in the notes of the balafon. A pitch difference, subtle in the human voice, rendered categorically distinct by the keys of the instrument. Mamadou explicitly pointed out that “goat”, bî, had two tones, and he enunciated them: bí. ȉ. Then he played them on the balafon. I had been grappling with the representation of this word, wondering whether that slight pitch fall I heard was just a phonetic effect or a true phonological feature of the word, a question Mamadou all but answered for me in the space of thirty seconds.

Since that visit in November of 2014, it has become increasingly clear how much we as linguists can learn from musical surrogate languages. They are the musical embodiment of what speakers know about their language, whether consciously or subconsciously. Used productively, they can provide a window onto underlying forms, as in the case of the Sambla balafon where postlexical tone processes aren’t encoded when a musician plays. In rote lyrics or proverbs, they can reveal archaic language or crystallized encodings that suggest earlier stages of the grammar. Rhythmic measurements can shed light on inner speech and how closely it matches verbal production, even though the oral articulators aren’t involved. And since not every linguistic contrast is encoded in surrogate speech, the choice of what is encoded allows us to explore questions of salience, functional load, or the balance between the need for communication and aesthetic principles.

Most research on surrogate languages has been carried out by ethnomusicologists or anthropologists, who produce fascinating work on the musical and cultural underpinnings of the systems. But linguists have been slow to come to the table, leaving many questions unanswered and many details unquestioned.

With most musical surrogate systems falling into disuse, time is of the essence. Just like the call to document endangered languages in the 1990s, this is my call for us to document endangered musical surrogate languages. Otherwise we’ll never know what they could have taught us about human communication.

-Laura McPherson