- a bottom-up approach
Arnold Neumaier (University of Vienna, Austria)
This page is part of the FMathL - Formal mathematical language web site at http://www.mat.univie.ac.at/~neum/FMathL.html
In this essay, formal reasoning is understood as the art of unambiguous communication of knowledge by means of written texts. Communication partners include human beings and computers.
This paper is about a formalization of reasoning, including mathematical reasoning, from first principles. The far goal towards which we work is to do this in such a way that that an intelligent being can - in principle - learn from the formalization without any supervision everything needed to understand contemporary mathematics texts. This paper outlines some important steps towards this goal.
In particular, the formalization should be understandable with little effort by 14 year old children interested in mathematics and its foundations. This precludes the traditional way taken by people working on the foundations of logics and mathematics, who assume that the reader is well-trained in mathematics, or at least in lower level formal logic.
The formalization should also be understandable by robots having sufficiently developped pattern recognition techniques, but need not have any built in expert knowledge about our language or even our symbols.
The quotes from the bible illustrate poetically the extent to which our robot's relation to us mimick our relation to God, our creator.
''But I say unto you, that every idle word that men shall speak,
they shall give account thereof in the day of judgment.''
(Jesus, according to Matthew 12:36)
Mathematical reasoning is typically done in a sloppy formal way, whose informal aspects are communicated through training in institutions like universities, and are rarely completely rigorous. On the other hand, mathematicians believe that all their proofs can be formalized rigorously, should the need arise. However, an actual proof only consists of enough hints from which a hypothetical expert should be able to reconstruct a formal proof.
In the following, I outline a scheme for a formal setting in which all mathematics can be formalized, in a way which is close to the way mathematical texts are traditionally written and interpreted. However, following the paradigm proposed by my brother Wilfried Neumaier, it is at the same time aimed at making the formal language easily comprehensible to people untrained in mathematics but competent as speakers of a natural language. In order to achieve this, a small portion of the natural language is formalized by giving it a rigid meaning sufficient to allow the unambiguous communication of knowledge.
The general principle behind communication patterns is the ability of the communication partners for pattern recognition, together with their ability to draw logical conclusions. (If the latter fails, communication failed and we speak instead of misunderstanding.)
Logic requires directed thinking. Indeed, among the typical English phrases used in logical reasoning are 'follows since', 'follows from', 'as a consequence', ('folgt weil', 'folgt aus', 'als Folge davon' in German), indicating the presence of a direction of following, a sequence of steps. Germans 'folgern' when they infer something logically, call a logical conclusion 'Folgerung' and call something 'folgerichtig' if it is logically consistent.
For higher level communication, this directed nature is primarily sequential in form of a sequence of smallest meaningful units, although multimedia communication allows more complex patterns. We may refer to the smallest meaningful units as symbols and to sequences of such units as texts.
Texts are always coded in some medium, and are classified by it. Text may be written, coded in symbols defined by spatial changes of stable ink patterns called characters. Text may be verbal, coded in symbols defined by temporal changes of stable oscillation patterns of air called phonemes. Text may be mental, coded in symbols defined by temporal changes of the neural activity of a brain. These are the traditional media. In our modern world, brains must be understood in a general sense, including the internal memory and processing units of computers. On the technical level, text may also be optical, coded in symbols defined by temporal changes of stable oscillation patterns of light, and magnetic, coded in in symbols defined by temporal changes of stable magnetic patterns of tapes and disks. The future might reveal further media useful for communicating text.
An interpreter is a (human or artificial) being able to receive, create, and manipulate text in one or several of its form. This requires (and imparts) at least limited understanding of the properties of text.
Mental text, usually called thought, is private and understandable only to the interpreter in whose brain it resides. The other kinds of text mentioned are, in contrast, public, understandable in principle by any interpreter with sufficiently developed pattern recognition facilities for patterns in the medium in which the text is coded. Understanding text requires some sort of learning. The simplest form of learning - the typical way computers learn - is the Nuremberg funnel (Nürnberger Trichter), whose modern form consists in providing completely programmed translation software (where only its use must be learned). On the other hand, the most complex kind of learning - achieved routinely by human children - is partially supervised learning by means of examples in context, later enhanced by more formal language training using (informal) grammars.
Such learning is also required to decipher texts in dead languages which are no longer understood by anyone. Even limited understandable context, such as the Rosetta stone, can be of decisive help in forming an understanding of unknown texts.
Work along the proposed lines, augmented by a large set of examples from which to learn, should be able to serve as such a Rosetta stone from which an interpreter with enough visual pattern recognition facilities but without any knowledge of our culture can - completely and unambiguously - reconstruct how to read and understand the formal core of contemporary mathematical texts.
We propose to achieve this by creating, within an informal text environment, protected areas of formal text, so-called documents, which are described in a completely unambiguous, formal way. Starting from very simple examples, which require only a minimal amount of formal pattern recognition, an informal, but at first approximate understanding of the formal language used inside a document can be inferred and built up, using only the documents. The surrounding informal text is in principle superfluous, but is added to help the reader already competent in normal English language - no background in mathematics or computer science is assumed - understand the formal part without much effort.
Once enough understanding in informally interpreting documents is gained, the language in which documents are written is formally defined, so that the learning interpreter (an intelligent robot, say, having its private operating system about which nothing is known) can check whether its formal interpretation of the language syntax is correct by testing whether all documents in this paper follow the given syntax. (At this stage, the robot would take little notice of the informal text, except as playing material to find out to which extent the informal text resembles the formal text. A very intelligent robot might even guess correctly some of its meaning!
Once the formal basis of texts is learnt, the robot learns - using additional documents from the logic stage of this paper - talking about checking texts for their logical content, still on a formal level without meaning. If the robot has some built-in expert knowledge about the functioning of its private operating system - even if it were created by extra-terrestrial beings without any knowledge of our culture and hence does not resemble our robots -, the robot may compare its private way of thinking with the new formal way of thinking and will be able to identify the practices, thereby learning the meaning of the language. Again, the documents provide sufficient material for the robot to check for itself the success of learning.
Otherwise, a formal way of acquiring meaning is provided by additional documents forming the semantics stage of this paper. After that, even a robot without knowledge of its private operating system but with some self-observation capabilities will be able to identify the practices we mean when we talk about text manipulation with the practices the robot observes applying itself.
Thus the robot will fully understand all formal documents, and will be able to understand any library of documents in our formal sense that it might be interested in reading. Sufficient capabilities and interest assumed, the robot will also be able to create meaningful such documents, understandable to us. Thus it could impart to us any mathematical knowledge it might have and care to share with us.
In a further stage, the robot could try to guess the meaning of the informal text. This is an intellectual task of a level similar to what children do in school, and should be formally teachable, too. However, this is beyond the scope of the present paper.
''In the beginning was the Word, and the Word was with God,
and the Word was God. The same was in the beginning with God.
All our written knowledge is communicated by written texts, so we shall base our formalization on written texts. Thus, in the following, all texts are considered to be written texts made from characters. However, everything we say has analogues for other kinds of texts.
Texts require interpretation to be understood. The analysis of texts naturally splits into three levels: The lowest level, the syntax defines what one may say without ambiguity, i.e., which texts are correctly formed in the language used, and thus defines what a language is. The middle level, the semantics defines what a text means, i.e., which information a text conveys. Finally, the highest level, the logic defines what one is allowed to do, i.e., which texts are fully comprehensible, and thus allows the formation of clear conclusions from a text.
These levels are related in a specific way which will become clear in the course of our exposition. In particular, each level can be mapped into each lower level. The formulation of everything is done by means of correctly formed texts and thus requires syntax. And to know what these checking activities mean requires semantics, an additional language level explaining how the categorys translate into activities for organizing information. Finally, actually performing these activities in real life, in particular the checking such a translation for correctness, requires logic.
On the other hand, the explanation of syntax needs semantics and logic. Hence there appears to be a circularity which invites paradox. Indeed, without some care it is possible to form seemingly meaningful but contradictory formal concepts such as Russell's famous set of all sets which do not contain themselves. However, modern logic has overcome these obstacles and is able to give a rigorous framework for formal reasoning. The form of such a framework will be outlined in the following in a programmatic way.
''All things were made by him [i.e., by the Word that was God];
and without him was not any thing made that was made.''
To define a formal language requires that there is already a rudimentary understanding of texts on some level, including the ability to manipulate texts. The level where such a rudimentary understanding is used is called the informal level.
Remarkably, the essence of the final formal structure is completely independent of the language used to express the informal level, and thus displays the eternal qualities of platonic reality. Indeed, this essence may even be equated with platonic reality. In a sense, platonic reality is what can be unambiguously communicated between reasoning subjects, whether they are human beings, computers, robots, extra-terrestrial beings, or God.
For humans, the informal level is the level of ordinary-day language; for computers it is a programming language which allows standard textprocessing facilities. In particular, we assume that the informal understanding and manipulation of texts includes the parsing of texts into characters, words, lines, and sentences, the recognition of a text inside a text, providing also its context, copying a text before or after another text, the replacement of a text inside a text by another text, the ability to select texts according to certain criteria, and similar activities, as they are required by the informal language.
Documents are a special class of informal text conforming to rules which are at first discussed informally, but as the exposition proceeds, are given a fully rigorous definition and meaning in terms of other documents. Thus while documents will be used from the beginning, the fact that they are documents, and their formal meaning will become apparent only after enough categorys are available to express this in a formal way.
Apart from documents we shall also use sloppy texts, which are informal texts that serve as abbreviations of documents. The rules for making abbreviations and undoing them are defined only informally, but are such that on the informal level it is always possible to transform the sloppy text, usually uniquely, into a document. (If uniqueness is not guaranteed, abbreviation is not allowed.) While no proof of this will be given when defining abbreviations, it will be clear to the programming expert that these transformations can be taught to a computer and hence are formalizable. On the other hand, the same document may have a number of different equivalent sloppy formal versions.
Sloppy texts mainly serve to relieve human readers from having to interpret monotonic and/or repetitive but structured documents, allwing them to use instead the traditional interpretation skills which the mathematical tradition regards as standard. This includes setting or deleting 'superfluous' brackets or quotation marks, using compound sentences in place of simple ones. However, they could be completely avoided by undoing all abbreviations.
Mathematics, the art and science of clear concepts and their relations, can be defined as consisting of everything formulated as a library of sloppy text which can be translated unambiguously into formal documents. Of course, mathematicians are often even more sloppy than is allowed on the sloppy formal level. But the papers and books they write are still understandable under the assumption that their texts mean something meaningful. The necessary interpretation skills needed to make sense of such approximately correct texts include misprint recognition and correction, inventing definitions and guessing properties of undefined objects, and similar corrective actions.
''In the beginning God created the heavens and the earth.
And the earth was without form, and void; and darkness was upon
the face of the deep. And the Spirit of God moved upon the face
of the waters.''
Because initially only the informal level is available, our formalization process will proceed in six stages.
The first stage is purely informal. It introduces enough context so that readers having a different informal view of the matter will hopefully be able to identify within their mental world all concepts used informally, with sufficient clarity that they are able to perform the activities required to understand the informal level.
''And God saw the light, that it was good:
and God divided the light from the darkness.''
In the second stage, we formalize the notions of text and numbers, which are essential to further progress, by using formal documents that specify and exemplify what is meant by formal texts. Syntax, semantics and logic of these documents are still described informally.
In the third stage, the syntax will be formally presented, while semantics and logic are still informal. In particular, this means that the language in which documents are presented is already fully correct in the formal sense, but the expressive power will be kept small since all explanations must be given informally. A formal definition of the language requires activities involving the formal meaning of texts and hence is beyond this stage.
In the fourth stage, syntax and semantics will be formally presented, while the logic is still informal. In particular, this means that the language in which the semantics is presented is already fully correct in the formal sense, and has the full expressive power needed for assigning meaning. However, all explanations about 'giving' meaning must still be given informally and hence are beyond this stage.
In the fifth stage, syntax, semantics and logic will be formally presented, and everything done before on the formal level can be checked to be indeed correctly formed, correctly processed, and correctly interpreted. This is done by formally defining algorithms that are able to perform all tasks needed, and by proving that these algorithms work correctly.
Of course, the actual execution of these algorithms and the actual checking of the proofs belongs to the final, sixth stage. It is part of physical reality, and is done by an interpreter, typically a human being or a computer. Since the latter are built according to formal design principles (which could easily be presented in the present framework), the consistent formalization is completed.
In this way, a fully satisfying basis for formal reasoning is established.
''Verily, verily, I say unto you, He that believeth on me,
the works that I do shall he do also; and greater works than these
shall he do.''
(Jesus, according to John 14:12)
This self-consistent basis is relevant for completely safe program verification. One can start with extremely simple but slow kernels and verify more efficient algorithms by checking their definition and their proof with a lower level verifier.
The proposed setting may also be useful for the automatic translation of mathematics texts, if the latter are specified as formal documents with a well-defined meaning.
my home page (http://www.mat.univie.ac.at/~neum/)
Arnold Neumaier (Arnold.Neumaier@univie.ac.at)