Cambridge Encyclopedia :: Cambridge Encyclopedia Vol. 17

comparative method - Terminology, Origin and development

In anthropology, initially an attempt to locate individual human societies within the framework of an evolutionary history of mankind. The aim was to classify societies into types, each corresponding to a particular evolutionary level. The term may now apply to any method for comparing different cultures or social institutions.

The comparative method (in comparative linguistics) is a technique used by linguists to demonstrate genetic relationships between languages. It works by comparing lists of cognate terms from the languages in question. From these cognate lists, regular sound correspondences between the languages are established, and a sequence of regular sound changes can then be postulated which allows the proto-language to be reconstructed from its daughter languages.

Developed in the 19th century through the study of the Indo-European languages, the comparative method remains the standard by which mainstream linguists judge whether two languages are related, with alternative lexicostatistical methods widely considered to be less reliable. Criticisms of the comparative method have also arisen as a result of a number of advances in linguistic thought - including several proposed alternatives to the traditional linear model of language descent - with the result that reconstructions obtained by the comparative method are now generally treated with a degree of skepticism.

Terminology

In the present context, related has a specific meaning: two languages are genetically related if they are descended from the same ancestor language (Lyovin 1997:1-2). French and Spanish, for example, both descend from Latin and are therefore considered to belong to the same family of languages, the Romance languages (Beekes 1995:25).

Descent, in turn, is defined in terms of transmission across the generations: children learn a language from the parents' generation and are then influenced by their peers.

However, it is possible for languages to have different degrees of relatedness. English, for example, is related to both German and Russian, but is more closely related to German than to Russian. The reason for this is that although all three languages share a common ancestor, Proto-Indo-European, English and German also share as a more recent common ancestor one of the daughter languages of Proto-Indo-European, Proto-Germanic, whilst Russian does not. Therefore, English and German are considered to belong to a different subgroup of the Indo-European language family, the Germanic languages, than Russian (which belongs to the Slavic subgroup) (Beekes 1995:22, 27-29). The division of related languages into subgroups by the comparative method is accomplished by finding languages with large numbers of shared linguistic innovations from the parent language; two languages having many shared retentions from the parent language is not sufficient evidence of a subgroup.
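The idea of degrees of relatedness can be pictured as a search for the most recent common ancestor in a family tree. The following is a minimal sketch in Python, not a linguistic tool: the parent links come from the paragraph above, and the node name "Slavic" is an assumed placeholder for the ancestor of the Slavic subgroup.

```python
# Parent links taken from the paragraph above. "Slavic" is a placeholder
# name (an assumption) for the common ancestor of the Slavic subgroup.
PARENT = {
    "English": "Proto-Germanic",
    "German": "Proto-Germanic",
    "Proto-Germanic": "Proto-Indo-European",
    "Russian": "Slavic",
    "Slavic": "Proto-Indo-European",
}

def ancestors(language):
    """Return the chain of ancestors of a language, nearest first."""
    chain = []
    while language in PARENT:
        language = PARENT[language]
        chain.append(language)
    return chain

def most_recent_common_ancestor(a, b):
    """Return the nearest ancestor shared by a and b, or None if there is none."""
    seen = set(ancestors(a))
    for candidate in ancestors(b):
        if candidate in seen:
            return candidate
    return None

print(most_recent_common_ancestor("English", "German"))
print(most_recent_common_ancestor("English", "Russian"))
```

English and German meet at Proto-Germanic, while English and Russian meet only at Proto-Indo-European, which is what "more closely related" means here.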

This definition of relatedness implies that even if two languages are quite similar in their vocabularies, they are not necessarily closely related.

The comparative method is a method for proving relatedness in the sense just given, as well as a method for reconstructing the proto-phonemes of a language family and uncovering the phonological changes the languages of that family have undergone.

Origin and development

The first known systematic attempt to prove the relationship between two languages on the basis of similarity of grammar and lexicon was made by the Hungarian János Sajnovics in 1770, when he attempted to demonstrate the relationship between Sami and Hungarian (work that was later extended to the whole Finno-Ugric language family in 1799 by his fellow countryman Samuel Gyarmathi) (Szemerényi 1996:6). The origin of modern historical linguistics as a whole, however, is often traced back to Sir William Jones, an English philologist living in India, who in 1786 made his famous observation:

"The Sanskrit language, whatever be its antiquity, is of a wonderful structure; ..." (Jones 1786, quoted in Lehmann 1967 and Szemerényi 1996:4)

Jones' insight was in conceiving of the idea of a proto-language, and consequently of the type of "family tree" model of language development (one proto-language splitting into various daughter languages, some of those then splitting again into further languages), upon which the comparative method is based.

It was the German scholar Friedrich Schlegel who in 1808 first stated the importance of using the oldest possible form of a language when trying to prove its relationships (Szemerényi 1996:7); then, in 1818, the Danish philologist Rasmus Christian Rask developed the principle of regular sound changes to explain his observations of similarities between individual words in the Germanic languages and their cognates in Greek and Latin (Szemerényi 1996:17). It was another German, Jacob Grimm - better known for his Fairy Tales - who in Deutsche Grammatik (published 1819-37 in four volumes) first made use of something resembling the modern comparative method in attempting to show the development of the Germanic languages from a common origin, the first systematic study of diachronic language change (Szemerényi 1996:7-8).

Neither Rask nor Grimm, however, could explain apparent exceptions to the sound laws they had discovered. Though the German linguist Hermann Grassmann explained one of these anomalies with the publication of his sound law in 1862 (Szemerényi 1996:19), it was in 1875 that a Danish scholar, Karl Verner, made a methodological breakthrough when he formulated the sound law which now bears his name, and which was the first sound law to use comparative evidence to show that a phonological change in one phoneme could depend on other factors within the same word, such as the neighbouring phonemes and the position of the accent (Szemerényi 1996:20): in other words, the modern concept of conditioning environments.

Similar discoveries were made by a group of young, radical German academics at the University of Leipzig known as Junggrammatiker (usually rendered as Neogrammarians in English) in the late 1800s, leading them to conclude that all sound changes were ultimately regular, and resulting in two of them, Karl Brugmann and Hermann Osthoff, making in 1878 the famous statement that "sound laws have no exceptions" (Szemerényi 1996:21). This revolutionary idea is fundamental to the modern comparative method, since the method necessarily assumes regular correspondences between sounds in related languages, and consequently regular sound changes from the proto-language. It was this Neogrammarian Hypothesis which led to the comparative method being applied to reconstruct Proto-Indo-European (PIE), with Indo-European being at that time by far the most well-studied language family.

There is no rigid procedure for applying the comparative method, but linguists generally agree on the basic steps, which are as follows:

Assemble cognate lists

Genetic relationship between two (or more) languages can be established if they show a number of regular correspondences in native vocabulary, which means that there is a regularly recurring match between the phonetic structure of basic words with similar meanings (Lyovin 1997:2-3).

             one    two   three  four  five   man     sea   taboo  octopus  canoe  enter
 Tongan      taha   ua    tolu   ...   nima   taŋata  tahi  tapu   feke     vaka   ...
 Samoan      tasi   lua   tolu   ...   lima   taŋata  tai   tapu   feʔe     vaʔa   ulu
 Māori       tahi   rua   toru   ɸā    rima   taŋata  tai   tapu   ɸeke     waka   uru
 Rarotongan  taʔi   rua   toru   ʔā    rima   taŋata  tai   tapu   ʔeke     vaka   uru
 Hawaiian    kahi   lua   kolu   ...   lima   kanaka  kai   kapu   heʔe     waʔa   ulu
 Rapanui     -tahi  -rua  -toru  -ha   -rima  taŋata  tai   tapu   heke     vaka   uru
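The "regularly recurring match" described above can be made concrete by counting which sounds line up across paired words. The sketch below is a rough illustration only: it uses forms from the Tongan and Hawaiian rows, keeps only pairs of equal length, and aligns them position by position, which is a simplifying assumption (real cognate alignment must handle lost and inserted segments). The function name is hypothetical.

```python
from collections import Counter

# Forms copied from the Tongan and Hawaiian rows of the table above,
# restricted to pairs with the same number of segments.
TONGAN = ["taha", "tolu", "nima", "taŋata", "tapu", "feke", "vaka"]
HAWAIIAN = ["kahi", "kolu", "lima", "kanaka", "kapu", "heʔe", "waʔa"]

def correspondences(words_a, words_b):
    """Count segment pairs (a, b) found at the same position in paired words."""
    counts = Counter()
    for word_a, word_b in zip(words_a, words_b):
        if len(word_a) != len(word_b):
            continue  # skip pairs that cannot be aligned one-to-one
        counts.update(zip(word_a, word_b))
    return counts

counts = correspondences(TONGAN, HAWAIIAN)
for (a, b), n in counts.most_common():
    if a != b:  # show only positions where the two languages differ
        print(f"{a} : {b}  ({n} times)")
```

In this small sample the pair t : k recurs five times and k : ʔ twice, which is the kind of recurrence the method looks for; pairs seen only once carry little weight on their own.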

Caution needs to be exercised to avoid including borrowings or false cognates in the list, which could skew or obscure the correct data (Lyovin 1997:3-5). For example, English taboo resembles the Polynesian forms tapu and kapu listed above. Though this may seem to be a cognate, showing that English is genetically related to the Polynesian languages, it is not, as the similarity is due to the fact that English borrowed the word from Tongan (OED 1989:"taboo, tabu, a. and n."). Even basic vocabulary can be borrowed: Finnish, for example, borrowed the word for mother, äiti, from Gothic aiþei (Campbell 2004:65, 300), while Pirahã, a Muran language of South America, borrowed all its pronouns from Nhengatu (Thomason and Everett n.d.:8-12).

For example, although the correspondence d- : d- (where the notation "A : B" means "A corresponds to B") in English day and Latin dies is not regular, English and Latin do exhibit a very regular correspondence of t- : d- (Beekes 1995:127), as the following forms show (dingua is an Old Latin form of the word later attested as lingua):

 English   ten   two   tow   tongue   tooth 
 Latin   decem   duo   duco   dingua   dent- 
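The difference between a recurring correspondence and an isolated lookalike can be shown by tallying word-initial pairs. This is a minimal sketch under two assumptions: the first letter of the spelling stands in for the initial sound (which happens to work for these forms), and the cutoff of three occurrences is arbitrary and chosen only for illustration.

```python
from collections import Counter

# Word pairs from the table above, plus day/dies from the preceding paragraph.
PAIRS = [
    ("ten", "decem"), ("two", "duo"), ("tow", "duco"),
    ("tongue", "dingua"), ("tooth", "dent-"), ("day", "dies"),
]

def initial_correspondences(pairs):
    """Count (English initial, Latin initial) letter pairs."""
    return Counter((english[0], latin[0]) for english, latin in pairs)

def recurring(counts, minimum=3):
    """Keep correspondences seen at least `minimum` times (arbitrary cutoff)."""
    return {pair: n for pair, n in counts.items() if n >= minimum}

counts = initial_correspondences(PAIRS)
print(counts)
print(recurring(counts))
```

Here t- : d- is supported by all five forms in the table, whereas d- : d- rests on the single pair day and dies and so falls below the cutoff.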

Since a truly systematic correspondence can hardly be accidental, if we can rule out alternative possibilities like massive borrowing, the correspondence can be attributed to common descent.

 1.   *ke   Pre-Sanskrit "and"
 2.   *ce   Velars replaced by palatals before *i and *e
 3.   ca    *e becomes a

The attested Sanskrit form for "and" is ca.
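The derivation above depends on the two changes applying in a particular order, which a short sketch can make explicit. The rule notation below is an assumption for illustration (regular expressions over plain letters), and only the velar k is modelled, not the full class of velars.

```python
import re

# Each rule is (pattern, replacement, description), applied in order.
RULES = [
    (r"k(?=[ie])", "c", "velars replaced by palatals before *i and *e"),
    (r"e", "a", "*e becomes a"),
]

def derive(form, rules=RULES):
    """Apply the ordered rules and return every stage of the derivation."""
    stages = [form]
    for pattern, replacement, _description in rules:
        form = re.sub(pattern, replacement, form)
        stages.append(form)
    return stages

print(derive("ke"))                # palatalization first, then *e > a
print(derive("ke", RULES[::-1]))   # reversed order, for comparison
```

With the rules in the order given, *ke passes through *ce to ca. Reversing them yields ka, because the vowel that conditions the palatalization has already changed, so the attested form is evidence for the relative chronology of the two changes.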

In the Dravidian languages of Telugu, Tamil and Malayalam, velar plosives in Proto-Dravidian have been replaced by the corresponding palatal if the velar plosive is followed by /i/, /iː/, /e/ or /eː/. However, this change is absent in Kannada and a few other languages in the family.

Verner's Law, discovered by Karl Verner in about 1875, is a similar case: the voicing of consonants in Germanic languages underwent a change that was determined by the position of the old Indo-European accent.

To take another example, when we examine the Romance languages, descended from Latin, we find two different correspondence sets which both involve k:

       Italian   Spanish   Portuguese   French
 1.    k         k         k            k
 2.    k         k         k            ʃ
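Whether two such sets go back to one proto-phoneme or two depends on whether the environments in which they occur overlap. A minimal sketch of that check follows; the environment labels (the sound following the consonant) are hypothetical and are not data from the text, and the function names are likewise illustrative.

```python
def in_complementary_distribution(envs_a, envs_b):
    """True if the two correspondence sets never share an environment."""
    return set(envs_a).isdisjoint(envs_b)

def proto_phonemes_needed(envs_a, envs_b):
    """One proto-phoneme if the sets are complementary, otherwise two."""
    return 1 if in_complementary_distribution(envs_a, envs_b) else 2

# Hypothetical environments (the following sound), for illustration only.
set_1_envs = {"o", "u", "r"}
set_2_envs = {"a"}

print(proto_phonemes_needed(set_1_envs, set_2_envs))  # no shared environment
print(proto_phonemes_needed({"a", "o"}, {"a"}))       # both occur before "a"
```

When the environments are disjoint, the difference between the sets can be attributed to conditioning and a single proto-phoneme suffices; when they overlap, the contrast must be reconstructed for the proto-language.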

What we do in this situation is try to see if the two sets occur in complementary distribution (in which case they reflect a single proto-phoneme) or if both occur in identical environments (in which case they must reflect separate proto-phonemes).

The Algonquianist Leonard Bloomfield used the reflexes of the clusters in four of the daughter languages of Proto-Algonquian to come up with the following correspondence sets (although the clusters are shown here ending in -k, this also generally applies to clusters ending in any of the plosives):

 1.   kk   hk   hk   hk
 2.   kk   hk   sk   hk
 3.   sk   hk   sk   čk
 4.   šk   šk   sk   sk
 5.   sk   šk   hk   hk

Although the five correspondence sets overlap with one another in various places, they are not in complementary distribution, and so Bloomfield recognized that a different cluster must be reconstructed for each set (his reconstructions were, respectively, *hk, *xk, *čk, *šk, and *çk).

The direction of a reconstructed change should also be plausible. For example, the voicing of voiceless plosives between vowels is an extremely common sound change, occurring in languages all over the world, whilst the devoicing of voiced plosives between vowels is extremely uncommon. Therefore, if a linguist were comparing two languages with a correspondence of -t- : -d- between vowels, they would reconstruct the proto-phoneme as *-t- and assume that it became voiced to -d- in the second language (unless they had a very good reason not to). Similarly, in Bearlake, a dialect of the Athabaskan language of Slavey, there has been a sound change of Proto-Athabaskan *ts → Bearlake (Campbell 1997).

Another assumption used in determining a proto-phoneme is that our reconstruction should ideally involve as few sound changes as possible to arrive at the modern reflexes in the daughter languages. In other words, unless there is persuasive evidence to the contrary, we should reconstruct for a proto-phoneme whatever value is the most common reflex in the daughter languages. For example, in the Algonquian languages, we find the following correspondence set:

 m   m   m   m   m   b 
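The principle of assuming as few changes as possible can be illustrated by counting how many daughter languages would have to change under each candidate reconstruction. This is a minimal sketch that treats each language as changing independently (an assumption that ignores subgrouping); the function names are illustrative.

```python
# The six reflexes of the correspondence set above.
REFLEXES = ["m", "m", "m", "m", "m", "b"]

def changes_needed(candidate, reflexes):
    """Number of daughter languages whose reflex differs from the candidate."""
    return sum(1 for reflex in reflexes if reflex != candidate)

def most_economical(reflexes):
    """Return the attested reflex that requires the fewest changes."""
    candidates = sorted(set(reflexes))  # sorted only to make ties deterministic
    return min(candidates, key=lambda c: changes_needed(c, reflexes))

for candidate in sorted(set(REFLEXES)):
    print(candidate, changes_needed(candidate, REFLEXES))
print(most_economical(REFLEXES))
```

Under this count, reconstructing *m requires one change and reconstructing *b requires five, so *m is preferred unless other evidence outweighs the difference.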

The simplest reconstruction for this set would be either *m or *b. Because the reflex of this proto-phoneme is m in five of the languages compared here and b in only one of them, reconstructing *b would require five separate changes of *b → m, whereas reconstructing *m requires only a single change of *m → b in one language of the family. *m is therefore the more economical reconstruction.

A reconstructed system should also be typologically plausible. For example, if the reconstructed phonemes fit together in the following system, the linguist would be suspicious, because languages generally (though not always) tend to maintain symmetry in their phonemic inventories:

 Voiceless  p t k
 Voiced  (b) d g
 Voiced aspirated  gʷʱ

Since the mid-20th century, a number of linguists have argued that this system is, at best, very suspicious typologically (Szemerényi 1996:143).

"...Provided we keep [the interpretation of the results and the method itself] apart, the Comparative Method can continue to be used in the reconstruction of earlier stages of languages."

The assumption that sound laws have no exceptions is problematic even on theoretical grounds: the very fact that different languages evolved according to different sound-change laws seems to indicate a degree of arbitrariness in language evolution.

Borrowings, areal diffusion and random mutations

Even the Neogrammarians recognized that, apart from the general sound change laws, languages are also subject to borrowings from other languages and other sporadic changes (such as irregular inflections, compounding, and abbreviation) that affect one word at a time, or small subsets of words.

Attempts to apply the comparative method to languages which have been affected by the process of areal diffusion can also be problematic. This is, in essence, a subtle form of borrowing, which can take place when a significant number of speakers of one language have some competence in another, possibly unrelated language. This may lead to the languages acquiring phonological characteristics from one another, sometimes even without the conscious borrowing of lexical or morphological forms, with the result that the two languages may end up appearing to be genetically related when in fact they are not. It is also possible that two or more unrelated languages may appear to be related as the result of them all individually undergoing areal diffusion from a third unrelated language (Aikhenvald & ...).

The other exceptions to the sound laws are a more serious problem, because they occur in genetic language transmission.

In principle, as those sporadic changes accumulate, they will increasingly obscure the systematic sound laws, and eventually prevent the recognition of the genetic relationship between languages, or lead to incorrect reconstructions of proto-languages and incorrect family trees.

Gradual application

More recently, William Labov and other linguists who have studied contemporary language changes in detail have discovered that even a systematic sound change is at first applied in an unsystematic fashion, with the percentage of its occurrence in a person's speech dependent on various social factors (Beekes 1995:55). Often the sound change begins to affect some words in a language, and then gradually spreads to others, a process known as lexical diffusion.

In the Tree Model, daughter languages are seen as branching out from the proto-language, gradually growing more and more distant from it through accumulated phonological, morpho-syntactic, and lexical changes. For example, here is a diagram of the Uto-Aztecan family of languages, spoken throughout the southern and western United States and Mexico (diagram based on Mithun 1999 and Campbell 1997):

The comparative method has been criticised for its reliance on the Tree Model, used here to represent the Uto-Aztecan language family. (Not all of the branches and languages are shown, for lack of space.)

Wave Model

Since languages change gradually, there are long periods in which different dialects of a language, as they evolve into separate languages, remain in contact with one another and influence each other. Therefore, the Tree Model does not reflect the reality of how languages change, as even once they are completely separated, languages which are near to one another will continue to influence each other, often sharing grammatical, phonological, and lexical innovations. A change in one language of a family will often spread to neighboring languages. The following diagram illustrates this conception of language change, called the Wave Model:

The Wave Model has been proposed as an alternative model of language change.

This is a serious challenge to the comparative method, which is entirely based on the assumption that each language has a single "genetic" parent, and hence that the genetic relationship between two languages is due to their descent from a common ancestor.

In essence, the Punctuated-Equilibrium Model of language relationships, proposed by Dixon, suggests that most languages, for most of human history, were in a state of "equilibrium" with their neighbors: they changed relatively slowly, and exchanged areal features (including phonological features, morphological patterns, and loanwords) with one another. However, at various times, Dixon proposes, there were events which "punctuated" this state of equilibrium (including natural disasters like volcanic eruptions and floods, invasions into the area by new cultures, the expansion into a new area of the people speaking the language, or the invention of a new, life-changing technology), causing the languages to enter a brief period of very rapid change, where they would quickly split into many new daughter languages. If this model does represent the way languages change, then the Tree Model - and the comparative method, which is necessarily based on it - can only be applied in limited cases.

The comparative method also assumes that the proto-language was uniform. This is problematic, as even in extremely small language communities there are always dialect differences, whether based on area, gender, class, or other factors (the Pirahã language of Brazil, for example, is spoken by only several hundred people, but has at least two different dialects, one spoken by men and one by women (Aikhenvald & ...)). Therefore, the single proto-language reconstructed by the comparative method is, in all likelihood, a language which never existed.

Creoles

Another potential problem for the comparative method is the phenomenon of creole language formation, where a new language is formed from a complicated combination of two languages that are not closely related. In these events, the new language may end up with a lexicon and phonology which are derived from both parent languages, in varying proportions. While the comparative method may be able to detect the existence of a genetic relation between the creole and the parent languages (or between two creoles with shared parents), the reconstructed "proto-language" is likely to be a thoroughly artificial construct.

Subjectivity of the reconstruction

While the identification of systematic sound correspondences between known languages is fairly objective, the reconstruction of their common ancestral language is inherently subjective. In the Proto-Algonquian example above, the choice of *m as the proto-phoneme is only the most economical hypothesis, not a certainty. It is conceivable that a Proto-Algonquian language with *b in those positions split into two branches, one which preserved *b and one which changed it to *m instead (contrast, for example, the Romance and Celtic branches of Indo-European). It is also possible that the nearest common ancestor of the Algonquian languages used some other sound instead, such as *p, which eventually mutated to *b in one branch and to *m in the other.
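The dependence of the reconstruction on the assumed family tree can be shown with a small parsimony count. The sketch below is illustrative only: it uses a simple dynamic-programming count of changes along branches, the reflexes m and b come from the correspondence set discussed earlier, and the two tree shapes are hypothetical, not established subgroupings.

```python
def min_changes(tree, root_state, states=("m", "b")):
    """Fewest sound changes on `tree` if the root had `root_state`.

    A tree is either a string (an attested reflex) or a list of subtrees.
    """
    def cost(node):
        # Map each possible state of `node` to the fewest changes below it.
        if isinstance(node, str):  # leaf: only the attested reflex is possible
            return {s: (0 if s == node else float("inf")) for s in states}
        child_costs = [cost(child) for child in node]
        return {
            s: sum(min(c[t] + (s != t) for t in states) for c in child_costs)
            for s in states
        }
    return cost(tree)[root_state]

# Hypothetical tree shapes over the same six reflexes.
flat = ["m", "m", "m", "m", "m", "b"]                # six independent daughters
branched = [["m", "m", "m", "m", "m"], ["b"]]        # two primary branches

for name, tree in (("flat", flat), ("branched", branched)):
    print(name, {state: min_changes(tree, state) for state in ("m", "b")})
```

With six independent daughters, *m needs one change and *b needs five. If the five languages with m form a single branch, however, either reconstruction needs only one change, so the count alone no longer decides between them.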
