The Origins


All living organisms communicate - some say that even rocks and minerals do, but this is outside my personal experience and I shall not consider that possibility here.

The very first living organisms on our planet were probably single-cell bacteria, and cells still communicate through chemical messages in animal and vegetable bodies.

This is why our senses of taste and, particularly, the much more refined sense of smell are the oldest and least rational of all, since the chemical messages we receive through them are forwarded to the oldest part of our brain, the so-called reptilian brain. In a fraction of a second, an odour in particular can send our mind flying back many years to the far places where we first perceived it: for instance the smell of newly mown grass, laundry drying in the sun or stubble burning in a field, which we may now seldom experience in our urban lives.

Next came seeing and hearing - there is no absolute certainty which was first - and communications rose to a more complex level (visual and auditory) in a number of higher life forms that were endowed with the corresponding sensory organs, i.e. the animals.

Animals mostly communicate by emitting cries and assuming ritualised postures - in addition to sending olfactory messages - but these communications are not articulated, i.e. they are basically simple messages conveying a single meaning (alarm, fear, submission, attraction, challenge, etc.). This is partly due to an inherent anatomical limitation: only humans possess in their throats the U-shaped hyoid bone that allows them to modulate the flow of expired air into a larger variety of sounds.

The human hyoid bone

It was probably a random genetic variation but, as in all such cases, since it afforded its possessors the advantage of more complex and effective communications, over time it became a fixed genetic trait and developed further with increasing use.

Although many animals also have a simpler hyoid apparatus that permits them to produce a number of calls and sounds, it does not exhibit the same complexity and therefore the same capacity for flexible emission.

Spoken Language

We do not know exactly where and when the first human language was uttered - some hazard a guess of some 100,000-150,000 years ago in Africa in the Middle Paleolithic, at the time of Homo sapiens and Homo neanderthalensis, supported by paleoanthropological findings of human hyoid bones from 60,000 years ago.

Skull of Ethiopian Homo sapiens (ca. 160,000 BC) - Skull of European Homo neanderthalensis (undated)

We can only surmise that articulated oral messages again provided an evolutionary advantage - sound can carry much farther than gestures, goes over and around obstacles, etc.

"Watch out, a sabre-toothed tiger is hiding in that clump of trees!"

could have been signalled by hand movements, too, but if your hunt mate was himself behind an outcropping boulder, would he have seen them in time to avoid the danger?

In any case, language must have begun as a number of simple coded messages like those exchanged by other animals, and its further development must have taken thousands and thousands of years, eventually differentiating into several 'families' and covering most of our globe.

Present-day distribution of human language families (from Wikipedia)

Classifying a language as belonging to a given family is not an exact science and is often the subject of controversy among linguists, witness the number of isolates - like Basque in Europe, an 'orphan' language with no known ancestors or progeny.

The major problem in identifying the world's first modern human speech is the absence of corroborating written evidence until about 5,400 years ago, which brings us to the next aspect of language.

Written Language

Once humans had acquired language, eventually they found it expedient to mark it down more permanently, which led over time to the development of written scripts/alphabets.

The first permanent marks left by humans were cave paintings and petroglyphs, considered by most scholars to have some ritual motivation.

Bison in Altamira Cave, Spain (33,000-23,000 BC) - Fighting warriors on rock drawing, Val Camonica, Italy (8,000-6,000 BC)

On the other hand, pictographs/pictograms, i.e. rock-graven symbols believed to represent synthetically some concept/idea rather than trying to depict an actual image, are believed to be the precursors of writing, which seems to have been motivated by the need to record permanently the deeds/attributes/accounting of some deity or ruler.

For instance, the pre-Dynastic Narmer or Scorpion King palette is reputed to be a precursor of Egyptian hieroglyphic writing.

The Narmer palette, Nekhen (Hierakonpolis), Southern Egypt (ca. 3,300 BC)

This form of 'pictorial script' resembles cartoon vignettes, and was still used until quite recently (1800s) by some North-American Indian tribes, who seem never to have reached the alphabetised stage of their southern cousins, such as those located in Mexico - witness also the similar case of Australian Aborigines and other primitive cultures.

Description of an Ojibwan initiation rite on birch bark

This is probably due to a mostly nomadic lifestyle: a settled agricultural community implies buildings, walls, etc., i.e. a culture with more durable materials available on which writing may survive for decades and even centuries, whereas pelts, leather, wood, etc. are perishable.

The first attested languages start appearing in the 3rd millennium BC in the Middle East (the Fertile Crescent):

Estimated Date | Language & Family                                    | Name of Script | Type
ca. 2,900 BC   | Sumerian (language isolate)                          |                |
ca. 2,700 BC   | Egyptian                                             |                |
ca. 2,600 BC   | Akkadian                                             |                |
ca. 2,600 BC   | Eblaite                                              |                |
ca. 2,600 BC   | Elamite (language isolate, possibly Elamo-Dravidian) |                |
ca. 2,100 BC   | Hurrian                                              |                |
ca. 1,700 BC   | Luwian                                               |                |
ca. 1,650 BC   | Cretan Minoan                                        | Linear A       | Undeciphered, presumed mixed
ca. 1,600 BC   | Hittite                                              |                |
ca. 1,500 BC   | Canaanite                                            |                | Presumed phonetic
ca. 1,400 BC   | Mycenaean Greek                                      | Linear B       | Syllabic
ca. 1,200 BC   | Archaic Chinese                                      |                |
ca. 1,100 BC   | Phoenician                                           |                |

Phoenician, the father of many contemporary scripts (Greek, Latin, etc.), was one of the last to appear some 3,000 years ago.

As can be inferred from the table above and the preceding text, writing developed by adopting 'pictures' that progressively came to represent smaller units of the spoken language:

  • Words/ideas (logographic scripts; the term ideographic is no longer used by linguists)
  • Syllables (syllabic scripts)
  • Single sounds (phonetic scripts)

Phonetic scripts are unarguably more efficient, conveying a language with a much smaller number of symbols than the other systems.

Breaking an Unknown Script

The first step in deciphering an unknown script indeed consists in determining how many different symbols it contains, thereby identifying its type - at least roughly:

  • Fewer than 50: phonetic
  • 50-500: syllabic
  • Over 500: logographic
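The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, every character in the sample (punctuation included) is naively counted as one sign, and the thresholds are simply the rough ones listed above.

```python
def count_symbols(sample: str) -> int:
    """Count the distinct non-whitespace symbols in a text sample."""
    # Assumption: each character is one sign of the script.
    return len({ch for ch in sample if not ch.isspace()})


def classify_by_inventory(symbol_count: int) -> str:
    """Apply the rough thresholds: <50 phonetic, 50-500 syllabic, >500 logographic."""
    if symbol_count < 50:
        return "phonetic"
    if symbol_count <= 500:
        return "syllabic"
    return "logographic"


def classify_script(sample: str) -> str:
    """Guess the script type of a sample from the size of its symbol inventory."""
    return classify_by_inventory(count_symbols(sample))


print(classify_script("the quick brown fox"))  # phonetic
```

A real corpus would have to be large enough for rare signs to show up, which is one reason why the size of the available corpus matters so much.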

The next necessary steps in this difficult process are finding:

  1. A decently large corpus (body) of extant texts
  2. A related extant language
  3. Multilingual texts featuring one or more other known languages

Jean-François Champollion (1790-1832) finally succeeded in deciphering the Egyptian hieroglyphs because, besides being a brilliant philologist and Orientalist:

  1. He had very abundant material to work with.
  2. He had learned Coptic, the liturgical language still used today by the Coptic Church of Egypt and in effect the late form of Egyptian in use from about the 3rd century AD (the equivalent of Latin for the Catholic Church).
  3. He had the Rosetta stone of 196 BC, discovered in 1799 during the French expedition to Egypt, with text in Egyptian hieroglyphic, Demotic and Greek.

Jean-François Champollion - The Rosetta stone

Too small a corpus is why Etruscan is still largely a mystery today: although it can be 'read' fairly accurately - its script is based on adapted Greek letters - its limited available vocabulary is poorly understood and the language is therefore difficult to assign to any known family, an assignment that could in turn provide further useful clues.

Map All Sounds?

An alphabet does not necessarily represent in writing all the sounds of its language. This is particularly evident in Semitic languages - both ancient and comparatively new like Arabic and Hebrew - whose scripts are defined by linguists as consonantal or abjad (from the Arabic word for alphabet), with vowels often omitted altogether.

Actually, Semitic consonantal groups (usually trigrams, i.e. groups of three consonants) represent strong semantic nuclei that are the basis for derivatives and synonyms with identical or similar meanings, a coherence not exhibited by Indo-European languages.
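As a rough illustration of such a consonantal nucleus, the sketch below strips the vowels from a few words and compares what is left. The example words are my own assumption, not part of any corpus discussed here: they are simplified Latin transliterations of the Arabic words for 'he wrote', 'book' and 'writer'. The function name is hypothetical.

```python
VOWELS = set("aeiou")


def consonant_skeleton(word: str) -> str:
    """Return the word with its vowels removed, as a vowel-less script would record it."""
    return "".join(ch for ch in word.lower() if ch.isalpha() and ch not in VOWELS)


# Assumed examples: simplified transliterations sharing the nucleus k-t-b.
words = ["kataba", "kitab", "katib"]
skeletons = {word: consonant_skeleton(word) for word in words}

for word, skeleton in skeletons.items():
    print(word, "->", skeleton)
```

All three words reduce to the same three consonants, which is why a reader who knows the language can usually restore the missing vowels from context.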

Left or Right?

Another aspect to be eventually settled for a writing system is its direction.

Early systems were not much concerned with this, and lines went whichever way (both horizontally and vertically) took the writer's fancy or was dictated by the shape/material of his writing tools and surfaces. Some also went in one direction on one line and reversed it on the next, following a path similar to that of a plough working over a field (boustrophedon writing).
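The plough-like path can be mimicked with a very small sketch. It assumes that the first line runs left to right and that only the order of the characters is reversed on alternate lines (in some ancient inscriptions the letter shapes themselves were mirrored as well, which plain text cannot show). The function name is hypothetical.

```python
def boustrophedon(lines: list[str]) -> list[str]:
    """Reverse the character order of every second line, starting with the second."""
    return [line if i % 2 == 0 else line[::-1] for i, line in enumerate(lines)]


text = ["abc", "def", "ghi"]
for line in boustrophedon(text):
    print(line)
```

Applying the function twice gives back the original lines, since reversing a line twice restores it.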

The Greek alphabet and its successors settled on a left-to-right/top-to-bottom direction, while Arabic and Hebrew chose right-to-left. Scripts that incorporate Chinese characters have traditionally been written vertically (top-to-bottom) from the right to the left margin of the page.

Some Latecomers

In both Europe and the Far East, some peoples reached their alphabetisation stage much later than their neighbours, being influenced by them with different results.

Slavic Countries - Cyrillic

In 862 AD two Greek monk brothers, later canonized as Saints Cyril and Methodius, were sent as Byzantium's missionaries among the Slavic peoples of Great Moravia and Pannonia. In order to reproduce their sacred and liturgical Greek texts for the local people in their Slavic languages, they are credited with having devised the Glagolitic alphabet, based on medieval cursive Greek letters for Slavic sounds also present in Greek and new letters for those which were not.

The Glagolitic Baška tablet, Croatia (1100 AD)

In the 10th century AD, the Preslav Literary School of the First Bulgarian Empire developed a corresponding, simpler Cyrillic alphabet that gradually replaced Glagolitic as the script for most Slavic languages, eventually spreading also east of the Urals into Asia under the Russian Soviet regime.

Cyrillic was therefore an efficient invention, adopting Greek capital letters whenever phonetically possible and 'ad hoc' new letters for uniquely Slavic sounds.

Greek & Russian alphabets
The 12 blue Russian letters at right have the same shape and sound as the Greek letters at left

Japan - A Messy Alphabetic Hodgepodge

The Japanese, too, had no writing system of their own until the 4th century AD, when Chinese books on Confucian philosophy and Buddhism were brought to their country and prompted them to start from scratch, using Chinese characters for their sounds (phonetic values).

An unwise choice, because:

  • Chinese and Japanese are totally unrelated languages
  • Chinese words are monosyllabic, Japanese words polysyllabic

This doubtful solution created a lot of confusion, for example because the Chinese character used to represent a given Japanese syllable had a Chinese meaning of its own, usually quite unrelated to the meaning of the Japanese word where it had been artlessly inserted.

These awkward Japanese attempts must have been met with laughter and scorn by the learned Chinese, with a consequently shameful Japanese 'loss of face'.

Eventually the Japanese decided to develop their own independent writing system, and came up with two sets of syllabic scripts:
  • Hiragana, for Japanese words
  • Katakana, for foreign words

The table at right shows Hiragana characters ordered by columns (vowels) and rows (consonants). This script can raise some criticism:

  • No consistency of form across columns or rows
  • Low aesthetic level

Most alphabets of Oriental languages are also formally elegant, witness the straight lines of Sanskrit characters, the derived characters of Tibetan, the curvy characters of Sinhalese, Thai, Laotian, vertical Mongolian, etc. etc. Even the Hangul of related Korean is more pleasing to the eye than the Nipponic squiggles.

Indian Sanskrit | Tibetan Lantsa | Sinhalese Elu Hodiya | Thailandese Thai | Laotian Lao | Mongolian Mongyol Bicig | Korean Hangul

Table of Hiragana characters

After some time, the pull of Chinese culture was however too strong to resist, so the Japanese re-introduced Chinese characters into their texts, but choosing them now for their meaning (semantic value) rather than for their sound.

The final result is that today a Japanese text can contain about 70% of Chinese characters and 30% of Hiragana characters (mostly for suffixes), but the final complication is that the former may be pronounced in at least 2 different ways:

  1. The On-yomi or 'phonetic reading' (Chinese sounds)
  2. The Kun-yomi or 'explanatory reading' (Japanese sounds)

On-yomi readings have 4 further 'dynastic' alternatives:

  1. Kan'-on (Han reading)
  2. Go-on (Wu reading)
  3. To-on (Tang reading)
  4. Kan'yo-on (popular reading)

Can a messier way be imagined to write down a language?

Additional Script Symbols

Diacritical Marks

When a given script does not represent satisfactorily the sounds of the language that uses it for writing, additional small symbols are often added to its letters to cover this deficiency, such as accents of various forms (tildes, cedillas, diaereses/umlauts, etc.).

Accents can indicate stress, vowel pitch (acute/grave), an 'irregular' pronunciation, or a combination of these.

In French the diaeresis marks adjacent vowels that must be pronounced separately rather than as a single sound (e.g. Noël, naïve, etc.), while the circumflex shows where a following S once was (e.g. fenêtre, fantôme, etc.).

Many languages omit accents altogether, others like Greek and Hungarian put them on all words.

Punctuation


Some time after a writing system was settled, the need arose to add marks showing the reader where to make short or long pauses - in ancient times reading was always performed aloud, even when alone.

In fact St. Augustine in his Confessions (6.3) relates his visit to fellow bishop St. Ambrose in Milan in ca. 384 AD, and notes in wonder:

"Oculi ducebantur per paginas et cor intellectum rimabatur, vox autem et lingua quiescebant."
(His eyes were led over the pages and his mind caught their meaning,
but his voice and tongue remained silent).

The purpose of punctuation was for a very long time only prosodic, i.e. an aid to reciting a text aloud, and only much later became syntactical, i.e. an aid to mark sentence/period structures.


A list of my books on Languages & Language-Related Subjects can be found on another page.

