This page explains in detail the rationale for the
language classification scheme outlined here.
By and large, I followed the scheme provided in Ruhlen (1991).
The main principle behind my presentation is that no
language should be more than three hierarchical levels away from the top level.
Language families are listed in alphabetical order. The name of the family is
Where I thought this necessary, I added another
hierarchical level, that of a branch. The names of the branches are given under
the language family name, bulleted and in normal letters.
The right-hand column of the table in which the
classification is provided includes the names of some of the languages included
in each family or branch. This listing is not meant to be exhaustive. Where a
family includes some isolates (i.e. individual languages
that cannot be classified lower down in the hierarchy), I included their names
on the corresponding row. Language isolates that cannot be classified by
language family at all are listed in the last row of the table.
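The layout described above can be pictured as a small tree. The following is only an illustrative sketch in Python, not part of the classification itself: the names `Node`, `levels_below`, `within_limit` and `MAX_LEVELS` are hypothetical, and the reading of "three hierarchical levels" as top level, family, branch, language is an assumption. The sample entries (Indo-European, Anatolian, Hittite, Japanese, Ainu, Elamite) are taken from the discussion further down this page.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed reading of the principle: top level -> family -> branch -> language.
MAX_LEVELS = 3


@dataclass
class Node:
    """One entry in the classification: the top level, a family, a branch or a language."""
    name: str
    children: List["Node"] = field(default_factory=list)


def levels_below(node: Node) -> int:
    """Return how many hierarchical levels lie between `node` and its deepest descendant."""
    if not node.children:
        return 0
    return 1 + max(levels_below(child) for child in node.children)


def within_limit(top: Node) -> bool:
    """Check that no language is more than MAX_LEVELS levels away from `top`."""
    return levels_below(top) <= MAX_LEVELS


# A tiny sample, not the full table.
top = Node("All languages", [
    # Family with a branch level: family -> branch -> language.
    Node("Indo-European", [Node("Anatolian", [Node("Hittite")])]),
    # Family without branches: family -> language.
    Node("Japanese", [Node("Japanese")]),
    # The last row of the table: languages not assigned to any family.
    Node("Language isolates", [Node("Ainu"), Node("Elamite")]),
])
```

Calling `within_limit(top)` on this sample checks the three-level rule; wrapping `top` inside a further node would make the check fail, which is the situation the branch level is meant to avoid.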
I shall now explain some of the divergences from Ruhlen,
in the order of the classification itself, except that I shall deal with the
native languages of the Americas first.
American Languages in general
Classification of native languages
in the Americas is difficult. There is a very large number of
languages and of potential language families. In the 20th century there was a
succession of efforts aimed at reducing the number of first-level categories,
until Greenberg (1987) managed to arrive at just three: Amerind, Na-Dené and
Eskimo-Aleut, a proposal incorporated into Ruhlen as well. This has
generated an enormous controversy, into which I shall not enter here.
However, it is clearly not reasonable to classify all native "Indian" languages of the western hemisphere
into just two groups, while leaving the Old World families mostly intact. In my
view, we can either accept something like the Nostratic hypothesis for Eurasia and
the Amerind hypothesis for the Americas, or, alternatively, we can
accept lower-level families in both areas. It is the latter approach that I intend to follow here.
The sheer number of language
families in the Americas is such that a practical classification, such as the
one aimed at here, must lump some of the families together. For North America,
it would be tempting to simply accept the six phyla proposed by Sapir (1929):
Eskimo-Aleut, Algonkian-Wakashan, Na-Dené, Penutian, Hokan-Siouan and
Aztec-Tanoan. This would, however, disregard research done since then. I
decided, therefore, to accept Ruhlen's Na-Dené and Eskimo-Aleut as
separate families in my scheme, and then take his second- or third-level
subdivisions of Amerind as the other families, with some provisos (names in
bold are subdivisions in Ruhlen, underlined if first-level
under Amerind, not underlined if second- or third-level; names in
blue are family names
in my classification):
By Ruhlen's own admission, this is a
grab-bag of languages, with little in the way of demonstrated genetic unity.
It is best to reduce it to families whose unity is accepted by most specialists
(and several of which are recognizable by the general public):
- A by-now well established name for Algonquian + the Californian languages
Wiyot and Yurok.
- The three subdivisions of
Mosan should be kept separate: Chimakuan,
Salish, and Wakashan.
- Similarly, the three subdivisions of Keresiouan,
namely Caddoan, Iroquoian and
Siouan-Yuchi, should be kept separate.
- Finally, Kutenai and Keres should be
mentioned as language isolates.
I have no argument with Ruhlen here. Keep the
family as is.
The unity of most of Penutian has been accepted for a
while now, but Greenberg's additions of the Gulf group (incl. distant Yuki and
Wappo in California) cannot be accepted by the logic of this classification.
Neither will I include the Mayan group of languages in Penutian, despite
Ruhlen's enthusiasm - the EB article (1993: 771) specifically rejects this
connection. Therefore the best solution seems to me to keep
Penutian with its "original" meaning (Ruhlen's
subgroups I-VI), and keep Gulf and Mayan as
separate families, the latter to be called Mexican
to conform to Ruhlen to this extent.
Tanoan and Uto-Aztecan are kept separate by
Ruhlen. Here, however, it is the EB (and, by implication, Voegelin and
Voegelin) which is ready to group these two families together as the
Aztec-Tanoan "stock" (EB 1993: 770). So shall I.
Oto-Manguean. I do not propose to enter the
quagmire of evidence and counter-evidence for this unit, and for deciding which languages
belong to it. I shall simply adopt this as a convenient designation for Central
Mexican languages that are neither Aztec-Tanoan nor Maya.
Here I simply express my admiration that Ruhlen has
managed to classify the myriad languages south of Mexico into one of these four
units. Even Greenberg has a greater number!
I have neither the background knowledge nor the time to
try to disentangle these stocks/phyla, so - for now at any rate - I shall keep
Other language families
Although still not accepted by some scholars, this group's
existence has wide support. I see no reason to reject it, or not to place its
three subdivisions at the same hierarchical level. It may be justifiable to
establish an intermediate grouping Mongolian-Tungus, but there is no need to do
so in this classification.
Placing Ainu, Japanese and Korean in Altaic, on the other
hand, is not a good idea. Despite Miller (1971; 1996), placing these languages
in the Altaic group is still not universally accepted, and I opt for the
conservative solution: Japanese and
Korean will be families in their own right
(mostly because of the large number of speakers), while Ainu will be
listed as a language isolate.
Ruhlen (p.188) and the EB (p.746) agree that the
indigenous languages of Australia (but not Tasmania) are all related.
This family, with its two main subdivisions Munda
and Mon-Khmer, should not pose any problems. I have, however, eliminated the
higher-level group Austric that Ruhlen placed it in, together with
Miao-Yao, Daic and Austronesian. The reason is simple: such a wide-ranging
Austric grouping has no justification in my view - the proposed genetic
relationship is too uncertain.
The unity and composition of this family are not in doubt.
Its membership in Austric (see more on this below) is problematic, and I do
not accept it.
Austronesian is probably the best example of a language
family whose genetic subdivision is not really suitable for a practical
classification. Some first-level subdivisions may contain just a handful of
marginal languages, while others may have a large number of important ones. If we
wish to have subdivisions of roughly equal strength, we need to reorganize
Ruhlen's scheme as follows:
Atayalic, Tsoulic, and Paiwanic
are to be merged as Formosan.
Malayo-Polynesian is to be divided into its
two sub-units: Western Malayo-Polynesian can be retained, while
Central-Eastern Malayo-Polynesian is to be further subdivided. Once
again, one sub-unit, Central Malayo-Polynesian can provide us with a
useful subdivision, while Eastern Malayo-Polynesian can be
sub-divided into South Halmahera - NW New Guinea, on one hand, and
Oceanic, on the other.
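The reorganization just described can be summarized as a mapping from Ruhlen's units to the branch names used in this classification. This is a minimal sketch in Python; the names `REGROUPING`, `branches` and `merged_units` are hypothetical, and the unit names are copied as they are spelled in the text above.

```python
# Ruhlen's unit -> branch name used in this classification.
# Units that are retained unchanged simply map to themselves.
REGROUPING = {
    "Atayalic": "Formosan",
    "Tsoulic": "Formosan",
    "Paiwanic": "Formosan",
    "Western Malayo-Polynesian": "Western Malayo-Polynesian",
    "Central Malayo-Polynesian": "Central Malayo-Polynesian",
    "South Halmahera - NW New Guinea": "South Halmahera - NW New Guinea",
    "Oceanic": "Oceanic",
}


def branches(regrouping):
    """Return the resulting branch names, without duplicates, in first-seen order."""
    seen = []
    for branch in regrouping.values():
        if branch not in seen:
            seen.append(branch)
    return seen


def merged_units(regrouping, branch):
    """Return the units of Ruhlen's scheme that end up under `branch`."""
    return [unit for unit, target in regrouping.items() if target == branch]
```

With this mapping, `branches(REGROUPING)` yields five Austronesian branches, and `merged_units(REGROUPING, "Formosan")` lists the three units that were merged.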
Same comment as above.
I have excluded Elamite from this group. I have seen no
references to conclusive evidence that Elamite, a language that we know from a
few ancient records and that we know very little about, is related to the
Dravidian languages. Let's keep it as a language isolate until we know more.
A non-controversial grouping, surely! I see no advantage
to calling it Indo-Hittite, and - as can be seen - I have placed Hittite into
the Anatolian subdivision, which is at the same level as, e.g. Germanic and
As elsewhere on this page, I am not making a statement
here about my beliefs as to the chronology of subgrouping within a language
family. Anatolian may well have split off from Proto-Indo-European before any of
the other branches - however, this is simply not relevant for a pragmatic
sub-classification of the family.
I also believe that the well-known branches of
Indo-European all deserve their own place in the scheme, and there is no need
for intermediate stages, such as Balto-Slavic or Italo-Celtic.
One innovation: I think that the Romance languages also
deserve their own subdivision. Romance linguistics is clearly a very different
field of study from the study of Latin and its Italic relatives, therefore the
Romance languages should be classed separately. (By the same logic, the modern
Indo-Aryan languages may have to be transferred to a different category
This assemblage of languages, one of the most diverse and
least known, may, or may not, form a genetic unity. At this stage, however,
there is no need to subdivide it, and certainly no evidence to merge it with
See comments under Austro-Asiatic for the rejection of the
I accept the first-level subdivision by Ruhlen,
i.e. into Kordofanian and Niger-Congo. The further subdivision of Niger-Congo,
however, is too unwieldy in Ruhlen, so I decided to follow the
subdivision provided in the Encyclopaedia Britannica (EB) (1993: Vol.22,
pp. 750-751), itself based on Voegelin & Voegelin (1977).
These six subdivisions are included in my classification on the same level as
My scheme is slightly different from Ruhlen's: I
placed Karen and Tibeto-Burman at the same level as Sinitic.
This is analogous to what I did elsewhere (Niger-Kordofanian, Austronesian), in
order to even out the uneven distribution of languages among sub-branches.
Ruhlen calls this category Uralic-Yukaghir. I think
that changing the well-established name of a language family because of the
addition of one other language, even if it is supposed to have split off first,
is an unnecessary complication. If, say, Ket is found to be Uralic in
the future, are we to rename the family Uralic-Yukaghir-Ket?
"Languages of the World". 1993. Encyclopaedia
Britannica, 15th ed., Vol.22, pp. 572-796.
Greenberg J.H. 1966. The
Languages of Africa. Bloomington: Indiana University.
- - 1987. Language in the Americas. Stanford:
Stanford University Press.
Grimes B.F. (ed.) 1988. Ethnologue: Languages of the
World. Dallas: Summer Institute of Linguistics.
Miller R.A. 1971. Japanese and the Other Altaic
Languages. Chicago: Chicago University Press.
- - 1996. Languages and History: Japanese, Korean, and
Altaic. Bangkok: White Orchid Press & Oslo: Institute for Comparative
Research in Human Culture.
Ruhlen M. 1991. A Guide to the World's Languages.
London: Hodder & Stoughton.
Sapir E. 1929. "Central and North American Languages,"
Encyclopaedia Britannica, 14th ed., Vol.5.
Voegelin C.F. and F.M. 1977.
Classification and Index of the World's Languages. New York: Elsevier.