Clearly, Lomax’s project is right up Fitch’s alley. Essentially, cantometrics is a comprehensive componential analysis of musical styles. Componential analysis came into vogue in the mid-1950s and was applied to kinship terminologies and other semantic domains, and even, as we can see, to musical styles. (Hockett’s identification of language’s design features above is also a componential analysis, and it reflects Hockett’s involvement in componential studies of kinterm semantics in the early 1960s.) An example of Lomax’s coding sheets that nicely illustrates his method is shown below:
Lomax clustered human populations into geographic regions showing strong similarities in their musical styles as defined by cantometrics (see below, Fig. 2, from Lomax, Alan. “Factors of Musical Style,” Theory and Practice: Essays Presented to Gene Weltfish, edited by Stanley Diamond. The Hague: Mouton, 1980, p. 39).
He interpreted it as a tree with two roots. One root is in Siberia/Patagonia, with immediate offshoots into a) Nuclear America and Circum-Pacific (South America, North America, Australia, Melanesia) and thence into Oceania and Melanesia and into East Africa; b) Central Asia and thence to Europe and Asian High culture. The other root is in African hunter-gatherers (Pygmies and Bushmen), with an offshoot in Early Agriculture. The Siberian-Patagonian style and its derivatives are characterized by “male-dominated solos or rough unison choralizing, by free or irregular rhythms and by a steadily increasing information load in various parameters – in glottal, then other ornaments, in long phrases and complex melodic form, in increasingly explicit texts and in complexly organized orchestral accompaniment” (Lomax 1980, 39-40). The Pygmy-Bushmen style and its derivatives are polyphonic, interlocked, more feminized (or at least with a balance of men and women in the ensemble), regular in rhythm, repetitious, melodically brief, cohesive and well-integrated, without ornamentation, with meaningless vocables and frequent yodeling.
The world of traditional music is, therefore, sharply divided along the axes of male vs. female, monophonic vs. polyphonic, rhythmically regular vs. irregular, solo vs. multi-part, language-friendly vs. meaningless. But the macroareas defined by the Siberian-Patagonian and Pygmy-Bushmen roots overlap, with examples of Pygmy-Bushmen vocalizing popping up sporadically in Lithuania, Georgia, Vietnam, Papua New Guinea and Amazonia. One gets the impression that the Pygmy-Bushmen style is a relic survival in poorly accessible geographic areas such as the Ethiopian Highlands (Dorze), the Caucasus (Svan), Papua New Guinea and the Amazonian rainforest.
In the book entitled “Who Asked the First Question? The Origins of Human Choral Singing, Intelligence, Language and Speech” (Logos, 2006), a prominent Georgian ethnomusicologist, Joseph Jordania, expanded on Darwin’s Musical Protolanguage theory by arguing that
all hominid species beginning with Homo erectus communicated by means of polyphonic, multi-part, interlocked music. Homo sapiens expanded out of Africa without possessing articulated speech.
Correspondingly, we see traces of the Pygmy-Bushmen style everywhere around the globe, including such remote areas as Papua New Guinea and South America. After the initial expansion of music-minded Homo sapiens, its regional varieties began developing articulated speech independently of each other, just as thousands of years later they would begin developing agriculture in a small number of centers in the Middle East, China, Papua New Guinea and the New World, from where the new economic type spread by diffusion and migration. Jordania (2006, 349) wrote,
“After the advent of articulated speech musical (pitch) language lost its initial survival value, was marginalized and started disappearing. Articulated speech became the main communication medium in human societies. Early human musical abilities started to decline. The ancient tradition of choral singing started disappearing century by century and millennia by millennia. Musical activity, formerly an important part of social activity, also started to decline and became a field for professional activity.”
This is completely in line with Fitch’s thinking. It also translates into concrete musical traditions, which allows one to begin thinking about the evolution of music into speech as a historical process rather than a theoretical abstraction.
What uniquely characterizes Jordania’s position is his belief that human populations outside of Africa began switching to articulated speech earlier than African (and to a certain degree European) populations. Levels of linguistic diversity accurately reflect this process: in areas such as the New World and Papua New Guinea linguistic diversity is exorbitant. America counts 140-150 linguistic stocks with no demonstrated kinship between them. Africa and Europe, on the other hand, have very low diversity when it is defined as the number of independent stocks. Africa has no more than 20; Europe around 10 (counting Pictish, Etruscan and other extinct languages).
Jordania’s hypothesis carries with it very specific predictions for human population history. Modern humans must have expanded out of Africa in the Mid-Pleistocene, prior to the emergence of clear signs of modern human behavior in the archaeological record. The discovery of a fully modern chin in Zhirendong (South China) at 100,000 BP agrees with this.
Assuming that Upper Paleolithic technologies were enabled by the same symbolic capacity as the newly evolved faculty of speech, the Middle-to-Upper Paleolithic transition must have happened independently in different parts of the world. We do have plenty of evidence for this: in Australia and Southeast Asia the modern human “behavioral package” is poorly visible until the Holocene, and the transition from Middle Stone Age to Late Stone Age occurred in South Africa around 45,000 BP without any apparent connection to the MP-UP transition in Europe, which began around the same time. The lack of archaeological indicators of modern human behavior in association with “anatomically modern humans” in Africa supports the notion that they lacked the symbolic capacity that produced language.
Up until now, Jordania’s model seems like it could work without coming into conflict with archaeology and paleobiology. But several problems loom large.
The New World must have been peopled during the very first Mid-Pleistocene wave out of Africa because the Pygmy-Bushmen style is well-attested in South America. Not only is it well-attested, it is attested there in a unique form unknown in the Old World, which Victor Grauer calls “canonic-echoic” (see below). If a “mutation” affected not a derived tradition but a “basal clade,” then this mutation must be old. But archaeology presently cannot support such an early entry into the New World. This may not be the biggest objection to Jordania, though, because under my out-of-America model the paucity of artifacts in the New World prior to 15,000 YBP is indicative not of the lack of human presence but rather of small population size and a “non-invasive” ecological adaptation.
More importantly, and it’s here that Jordania’s model begins to fall apart, if the New World, the Sahul and parts of Asia – all having elevated levels of linguistic diversity – shifted to language earlier than Africans and Europeans, then why don’t we see the proliferation of definitive signs of modern human behavior in the archaeological record of Australia, Papua New Guinea and the New World after 100,000 BP? They are much more readily available in Africa and Europe than in Australia, Papua New Guinea and the New World. Finally, Jordania’s insight implies that language evolved multiple times in one single hominid species (Homo sapiens sapiens). Fitch’s position, on the contrary, is that, while music evolved multiple times in different taxa (whales, birds, monkeys, humans), language is uniquely human, hence it must have evolved only once.
It’s unlikely that such a complex system as language equipped with propositional, combinatorial semantics evolved multiple times in the past 100,000 years but hardly ever before. But the elegance with which Jordania’s model fits with the Musical Protolanguage theory and explains the vastly different levels of linguistic diversity in Africa vs. the New World keeps me awake.
Victor Grauer, Alan Lomax’s one-time student and research assistant, has recently returned to active ethnomusicological research and decided to convert Lomax’s two-root tree of musical styles into a single-root tree that would reflect the population genetic trees produced in support of the recent out-of-Africa model of human dispersals.
He started off by re-arranging Lomax’s diagram above into a more clearly bifurcating tree and by relabeling Lomax’s musical styles with a mix of descriptive labels (the Siberian-Patagonian root became “breathless solo,” Australian style “iterative one-beat,” the Amazonian version of the Pygmy-Bushmen style “canonic-echoic,” etc.) and number-letter combinations inspired by the “haplotypes” of population geneticists.
The resulting single-root tree mimicked genetic phylogenies by showing a) the radical divergence of the Pygmy-Bushmen style of vocalizing from the rest of human musical styles; and b) the progressive loss of the rich properties of the Pygmy-Bushmen style (polyphony, open-throated singing, interlock, yodeling, etc.) as a consequence of serial founder effects befalling humans on their journey out of Africa and into the Americas.
Primitive polyphony turned into monophony, multi-part singing into solo and unison, open throats yielded to breathless, constricted and coarse vocalizing, etc. At the same time, new properties emerged (from Jordania’s point of view, under the influence of growing and positively selected speech) where previously there were none: for example, meaningless vocables turned into long phrases. As with Jordania’s, Grauer’s model makes sense but proves nothing. On closer examination, it begins to raise doubts. First, the compatibility between genetic trees and Grauer’s musical tree is far from perfect:
in genetic trees, Pygmies and Bushmen don’t share a clade. Instead, as in Y-DNA, Pygmies fall into the same clade (BT) with the rest of humans, while Bushmen are outliers. If genes and music mapped well onto each other,
we would expect Pygmies to share more cantometric properties with the rest of humans but not with Bushmen. But Grauer (“Concept, Style, and Structure in the Music of the African Pygmies and Bushmen: A Study in Cross-Cultural Analysis,” Ethnomusicology 53 (3), 2009) insists that
Pygmies and Bushmen belong to the same musical tradition. Second, in most genetic trees, American Indians occupy downstream clades and it’s widely believed that the New World was colonized much later than other continents.
But on Grauer’s tree we have American Indians represented in all clades (from shouted hocket in Hupa to breathless solo in Patagonia to iterative one-beat in North America). Third, the Pygmy-Bushmen style has clear parallels in Papua New Guinea and South America, but genetic trees do not document any haplotype sharing between Sub-Saharan Africa, Papua New Guinea and Amazonia or the Andes. Finally, since the publication of Grauer’s tree of musical styles,
new genetic evidence (Denisovan and Neandertal admixture in modern humans) has falsified the recent out-of-Africa with serial founder effects model of modern human evolution, thus leaving Grauer without the solid foundation in “hard data” that his theory once enjoyed.
Genetics aside, there is nothing in ethnomusicological data per se that necessitates an out-of-Africa reading of the distribution of modern human musical traditions. First, the derived nature of monophony cannot be clearly documented. Australia is completely monophonic without any trace of polyphony. The fact that the Pygmy-Bushmen style tends to survive in isolated pockets and geographic refugia does not mean that more widespread styles are all derived from it.
It only suggests that the Pygmy-Bushmen style may have been more prominent in the past, just as “breathless solo” or “iterative one-beat,” also confined to relic areas, may have been. Second, although both Jordania and Grauer claim to have walked away from the old-fashioned evolutionary model of musical evolution from simple to complex, the subjugation of the whole of human musical history to one single trend of disintegration of a bundle of properties attested in African foragers sounds like the revival of evolutionism in its most primitive form.
Third, if, as Fitch argues, music and language share the same kin communication base, then monophony cannot be a derived trait because lullabies (presumably primordial among humans) are of necessity monophonic. Hence, monophony and polyphony can both be ancestral because they are confined to different domains of human life. Fourth, many archaic monophonic traditions (such as “iterative one-beat” found in Australia and North America or breathless solo, the other root in Lomax’s model) are tightly linked to drumming, and drumming is an ancestral trait in humans homologous with drumming in chimpanzees (Fitch 2006, 194-195). At the same time,
polyphonic singing is often associated with complex musical instruments such as panpipes (again, attested from Sub-Saharan Africa to the Andes). Archaeologically, the earliest attested instruments (flutes dated at 36,000 years in Europe) are naturally simple. This may indicate that
both vocal and instrumental polyphony must have undergone considerable progressive evolution since the Late Pleistocene. Fifth, it’s unclear if monophony and polyphony are as starkly differentiated as Grauer often portrays them. For instance, meaningless vocables are widely found in North America (tribes there were even known to outsiders by their most popular nonsense words) but solo and unison vocalizing dominate there. The loose and flexible structure of Amazonian “canonic-echoic” (from Grauer’s A clade) echoes the “free or irregular rhythms” of “breathless solo” (Grauer’s B clade).
In fact, Grauer acknowledges that there is a crossover between the two divergent musical styles taking place not in Africa, but in the Circumpolar region.
“While B2 seems in many ways almost the opposite of any of the African A styles, there are some very interesting points in common, as indicated by the scored lines linking B2 with A2. The two styles are both characterized by yodel or yodel-like vocalizing (especially in the “joik” songs of Lapland, but also reflected in the heavy glottalization found throughout this style area), continuous vocalizing (interrupted by gasps for breath), wide intervals, and an emphasis on “nonsense” vocables.”
Sixth, the change from a more loose and flexible structure of “canonic-echoic” toward more regimented Pygmy-Bushmen style seems to be more natural than the change in the opposite direction. (And Grauer uses the same logic to argue that Pygmy kinship systems are more archaic compared to Bantu.)