TIKKI TIKKI TEMBO:
The Chemistry of Protolanguage
Yuri Tarnopolsky
2004
Keywords: protolanguage, language origin, language evolution, speech generation, chemistry, chemical nomenclature, linearization, Pattern Theory, Transition State Theory, complexity, Ulf Grenander, George Zipf, Noam Chomsky, Joseph Greenberg, Manfred Eigen, Ilya Prigogine, Walter Ross Ashby, René Thom, George Hammond, Robert Rosen, axiom of closure.
ABSTRACT
Protolanguage (Derek Bickerton) in linguistics corresponds to an evolutionary stage preceding grammaticalized language as we know it. It may be possible to reconstruct the principles of protolanguage by turning to the most general principles of evolution in a larger picture, of which chemistry is a relevant part. Both linguistics and chemistry deal with discrete combinatorial systems. Considering the chemical origin of life, chemical analogies might offer some insight into the origin of mind, language, and society, all of which developed on the platform of life. The conceptual basis for discrete combinatorial systems, including chemistry and language, can be found in Pattern Theory (Ulf Grenander), where ideas, utterances, and molecules are configurations. To draw the parallel further, chemistry uses its own language of chemical nomenclature to represent non-linear molecular structures as linear strings of symbols. Chemistry pays particular attention to the intimate mechanisms of structural transformations. A tentative concept of the mechanism of protolanguage generation is suggested: a kinetically controlled linearization of a typically non-linear observable configuration through a non-observable thought. Generation of linear expressions in protolanguage is viewed as a process of generalized chemistry, going from a typically non-linear initial state through a transition state toward the linear output, under the constraint of maximal preservation of configuration topology.
CONTENTS
1. Introduction
2. Preview of main ideas
3. Chemistry and linguistics: sister sciences
4. Noam Chomsky and Joseph Greenberg
5. Chinese and Chemicalese
6. René Thom and images of change
7. Configurations, patterns, and Nean
8. Some risky ideas about mathematics and life
9. Chemolinguistry: a chimera
10. Tikki Tikki Tembo: language as a form of life
11. Zipfing the chimera
12. A chemist and a chimp speak Nean
13. Scenes from the cave life told in Nean
14. Concluding remarks
15. APPENDIX
15.1 Example of Chemicalese
15.2 Examples of real-life large configurations
15.3 The chemical view of the world
15.4 Program nean
References
The words and verses differ, each from each,
Compounded out of different elements...
Lucretius (De rerum natura, II)
The order and connection of ideas is the same
as the order and connection of things.
Spinoza (Ethica, II, VII)
1. INTRODUCTION
Mark C. Baker, in his book The Atoms of Language (Baker, 2001), drew a consistent analogy between linguistics and chemistry. Within the Principles and Parameters framework, a language is similar to a chemical element in the sense that it is a combination of certain parameters. Baker acknowledged that most people would associate words with atoms of language, but he simply put this view aside as correct “in one sense” (Baker, 2001, p. 51). He was, of course, right to point to the Periodic System as a metaphor for the combinatorial nature of language. Moreover, the book manifested the true chemical spirit: it was built around numerous observable linguistic examples, just as any typical chemical monograph is built around hundreds of structures and their transformations.
It would be unbecoming for a linguist to say “It’s Greek to me” about chemistry anyway, but there is a genuine feeling of kinship between the two areas. Linguists evoke chemistry as the science epitomizing not only complexity but also its successful conquest. The kinship was prophetically noticed long ago, even before the birth of chemistry, but after the birth of the famous Greek Democritus:
The words and verses differ, each from each,
Compounded out of different elements—
Not since few only, as common letters, run
Through all the words, or no two words are made,
One and the other, from all like elements,
But since they all, as general rule, are not
The same as all. Thus, too, in other things,
Whilst many germs common to many things
There are, yet they, combined among themselves,
Can form new wholes to others quite unlike.
Thus fairly one may say that humankind,
The grains, the gladsome trees, are all made up
Of different atoms (Lucretius, 1958, Book II).
Chemistry, for its part, has been using extensive linguistic parallels for nucleic acids and proteins since the discovery of their relation. Moreover, much earlier, chemistry developed its own tongue, with a lexicon heavily borrowed from Greek and a refined grammar with codified inflections and word order.
Mark Baker’s book was the last drop in the bucket of observations that I had accumulated over a long time. This paper is an attempt by a chemist to view words as atoms of a chemistry.
I am a chemist without any linguistic credentials whatsoever, but with a lifelong interest in languages. I am, to varying degrees, familiar with properties of such languages as Russian (native), English (current), German (studied at school), French, Hungarian, Japanese, Hebrew, and a few others. My very limited hands-on experience with the non-Indo-European languages, as well as a better (but by no means perfect) knowledge of English and Russian, both Indo-European but diametrically opposite, persuaded me that, for all the striking differences in their design, all languages perform the same function with the same means. Neither the opulence of Bantu languages, with their classifiers and suffixes, nor the intricately woven ribbons of the Na-Dene verbs could shake my conviction. The function is a representation of a non-linear “source,” whatever it is, and the means is an optimal linearization of the non-linear representation.
"The vocal-auditory channel has some desirable features as a medium of communication: it has a high bandwidth, its intensity can be modulated to conceal the speaker or to cover large distances, and it does not require light, proximity, a face-to-face orientation, or tying up the hands. However it is essentially a serial interface, lacking the full two-dimensionality needed to convey graph or tree structures and typographical devices such as fonts, subscripts, and brackets. The basic tools of a coding scheme employing it are an inventory of distinguishable symbols and their concatenation." (Pinker and Bloom, 1990).
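
As a rough illustration of what linearizing a non-linear "source" for a serial channel might involve, here is a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scene (a "hunter" bonded to three other atoms), the function names, and in particular the choice of "total stretch" (the summed distance in the string between atoms that are bonded in the configuration) as the measure of how well a linear order preserves the original connections.

```python
from itertools import permutations

def stretch(order, bonds):
    """Sum of distances in the linear order between bonded atoms.

    A lower value means that atoms connected in the configuration
    stay closer together in the string. This cost is an illustrative
    assumption, not a definition taken from the paper.
    """
    pos = {atom: i for i, atom in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in bonds)

def best_linearization(atoms, bonds):
    """Brute-force search over all orders; feasible only for tiny inputs."""
    return min(permutations(atoms), key=lambda order: stretch(order, bonds))

# Hypothetical configuration: one central atom bonded to three others.
atoms = ["hunter", "spear", "mammoth", "cave"]
bonds = [("hunter", "spear"), ("hunter", "mammoth"), ("hunter", "cave")]

naive = tuple(atoms)                      # central atom placed first
best = best_linearization(atoms, bonds)   # central atom moved inward

print(naive, stretch(naive, bonds))
print(best, stretch(best, bonds))
```

In this toy case no order can keep all three bonds adjacent: a string gives each symbol at most two neighbors, so some connection must be stretched, and the search only picks the order that stretches the least.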
For over twenty years I have been watching the development of Pattern Theory (Grenander, 1976-2003), sometimes from a close distance, regarding it as a general approach to complex systems consisting of atom-like elements and connecting bonds. It became clear to me that this mathematical theory of everything nicely covered not only molecules and languages but also every discrete combinatorial system we could come in touch with, and did so with an unprecedented combination of generality and realism.
Furthermore, I have witnessed the entire genesis and evolution of the science of complexity, starting from Prigogine (1984), who formulated the most fundamental principles of complex natural systems such as life, mind, and society, and further toward Artificial Life (Adami, 1998), where languages and molecules were of the same kin already at the inception (Eigen, 1971-1979). “Natural” here is the opposite of “artificial,” such as virtual reality, where people can walk on the ceiling and turn into wolves right before your eyes.
I am also familiar with the language of musical notation, a couple of programming languages, and, due to my profession, the curious language of chemical nomenclature invented by organic chemistry to verbally communicate the non-linear molecular structure.
Finally, the language of poetry, rarely spoken in everyday life, is my bonus pass to a gym where one can exercise linking distant meanings and close sounds.
The enormous literature comprising computational, formal, traditional, and historical linguistics, Artificial Intelligence, Artificial Life, mathematical structures, physics of open systems, chemistry, and details of Pattern Theory is probed here only highly selectively and superficially. The growing but still manageable bibliography on language evolution and computation has been nicely collected and presented at the website of the University of Illinois at Urbana-Champaign (Language Origin, WWW).
My intent is not to formulate a theory (this should be entrusted to professionals) but to offer a new (but organically grown!) spice for the boiling cauldron of linguistic ideas. Whoever likes the aroma can use it for meditation, inspiration, and, who knows, for some fun time after a Ph.D. thesis. I wish to share the widest and most comprehensive view (an elusive but honorable goal) of the intellectual jungle where linguistics and chemistry are of the same blood. In short, if it all boils, it boils both down and up to Pattern Theory.
I believe that my outsider status, as well as the claim for a larger picture, grants me the privilege of choosing my own far-from-academic style, which is just being natural. I cannot walk on the ceiling.
I further refer to the following key figures who painted a large picture with language in the landscape: Lucretius, Ulf Grenander, George Zipf, Manfred Eigen, Ilya Prigogine, Walter Ross Ashby, René Thom, and George Hammond, all of them, except Lucretius and Zipf, natural scientists and mathematicians. Among the modern linguists, David Lightfoot is, I believe, the maître of the eagle’s-eye view. I mention others in the main text.
I will widely use WWW sources. They may die out with time, but a peculiar life-like property of the Web is that new ones will be cooked up and can be searched for and found, garnished with ads. For better or worse, money will never be out of the larger picture.
© Yuri Tarnopolsky, 2004