%LaTeX-file for pdflatex

\documentclass[12pt,runningheads]{cl2emult}


%\usepackage{makeidx}  % allows index generation
\usepackage{graphicx} % standard LaTeX graphics tool
                    % for including eps-figure files
%\usepackage{subeqnar} % subnumbers individual equations
                      % within an array
%\usepackage{multicol} % used for the two-column index
%\usepackage{cropmark} % cropmarks for pages without
                      % pagenumbers
%\usepackage{lnp}      % placeholder for figures
%\makeindex            % used for the subject index
                      % please use the style sprmidx.sty with
                      % your makeindex program
                      
                      %\ifx\undefined\du\def\du#1{\underline{\underline{#1}}}\fi

%OPTIONAL%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\usepackage{amsmath}   % useful for coding complex math
%\mathindent\parindent % needed in case "Amstex" is used
%

\textwidth13.7cm
\hoffset-1.5cm

\def\({\left(}
\def\){\right)}


\begin{document}
%double underlined \du{x} text or $\du{xy}$ math string
\title*{ Mathemagics\footnote{Lectures given at a school held in Chapelle des Bois (April 5-10, 1999) on 
``Noise, oscillators and algebraic randomness''.} \\(A Tribute to L. Euler and R. Feynman)}
%
%
\toctitle{Mathemagics (A Tribute to L. Euler and R. Feynman) }
% allows explicit linebreak for the table of content
%
%
\titlerunning{Mathemagics }
% allows abbreviation of title, if the full title is too long
% to fit in the running head
%
\author{Pierre Cartier\thanks{cartier@ihes.fr}}
%
\authorrunning{Pierre Cartier}
% if there are more than two authors,
% please abbreviate author list for running head
%
%
\institute{CNRS, Ecole Normale Sup\'erieure de Paris, 
45 rue d'Ulm, 75230 Paris Cedex 05} 

\maketitle              % typesets the title of the contribution

\bigskip

%First page headline in AmS-TeX for S\'eminaire Lotharingien de Combinatoire
%--first part
\thispagestyle{myheadings}
\font\rms=cmr8 
\font\its=cmti8 
\font\bfs=cmbx8
\markright{\its S\'eminaire Lotharingien de
Combinatoire \bfs 44 \rms (2000), Article~B44d\hfill}
\def\thepage{}
%
%

\hfill {\it To the memory of Gian-Carlo Rota,}

\hfill {\it modern master of mathematical magical tricks}

\vglue 1cm

\centerline{\bf Table of contents}

\bigskip

\noindent 1. Introduction

\noindent 2. A new look at the exponential

2.1. The power of exponentials

2.2. Taylor's formula and exponential

2.3. Leibniz's formula

2.4. Exponential vs. logarithm

2.5. Infinitesimals and exponentials

2.6. Differential equations

\noindent 3. Operational calculus

3.1. An algebraic digression: umbral calculus

3.2. Binomial sequences of polynomials

3.3. Transformation of polynomials

3.4. Expansion formulas

3.5. Signal transforms

3.6. The inverse problem

3.7. A probabilistic application

3.8. The Bargmann-Segal transform

3.9. The quantum harmonic oscillator

\noindent 4. The art of manipulating infinite series

4.1. Some divergent series

4.2. Polynomials of infinite degree and summation of series

4.3. The Euler-Riemann zeta function

4.4. Sums of powers of numbers

4.5. Variation I: Did Euler really fool himself?

4.6. Variation II: Infinite products

\noindent 5. Conclusion: From Euler to Feynman

\noindent References

%First page headline in AmS-TeX for S\'eminaire Lotharingien de Combinatoire
%--restoring the headers and pagenumbering
\pagenumbering{arabic}
\addtocounter{page}{1}
\markboth{\small Pierre Cartier}{\small Mathemagics}
%
%

\newpage

\section{Introduction}
The implicit philosophical belief of the working mathematician is today the 
Hilbert-Bourbaki formalism. Ideally, one works within a 
\textbf{closed system}: the basic principles are clearly enunciated once 
for all, including  (that is an addition of twentieth century science) the 
formal rules of logical reasoning clothed in mathematical form. The basic 
principles include precise definitions of all mathematical objects, and the 
coherence between the various branches of mathematical sciences is achieved 
through reduction to basic models in the universe of sets. A very  
important feature of the system is its \textbf{non-contradiction} ; after 
G\"odel, we have lost the initial hopes to establish this non-contradiction 
by a formal reasoning, but one can live with a corresponding belief in 
non-contradiction. The whole structure is certainly very appealing, but the 
illusion is that it is eternal, that it will function for ever according to 
the same principles.  What history of mathematics teaches us is that the 
principles of mathematical deduction, and not simply the mathematical 
theories, have evolved over the centuries. In modern times, theories like 
General Topology or Lebesgue's Integration Theory represent an almost 
perfect model of precision, flexibility and harmony, and their 
applications, for instance to probability theory, have been very 
successful. 

My thesis is: \textbf{there is another way of doing mathematics, equally 
successful, and the two methods should supplement each other and not 
fight.}

This other way bears various names: symbolic method, operational calculus, 
operator theory \dots Euler was the first to use such methods in his 
extensive study of infinite series, convergent  as well as divergent. The 
calculus of differences was developed by G. Boole around 1860 in a symbolic 
way, then Heaviside created his own symbolic calculus to deal with systems 
of differential equations in electric circuitry. But the modern master was 
R.~Feynman who used his diagrams, his disentangling  of operators, his path 
integrals \dots The method consists in stretching the formulas to their 
extreme consequences, resorting to some internal feeling of coherence and 
harmony. They are obvious pitfalls in such methods, and only experience can 
tell you that for the Dirac $\delta$-function an expression like $x 
\delta(x)$ or $\delta'(x)$ is lawful, but not $\delta(x)/x$ or 
$\delta(x)^2$. Very often, these so-called symbolic methods have been 
substantiated by later rigorous developments, for instance Schwartz 
distribution theory gives a rigorous meaning to $\delta(x)$, but physicists 
used sophisticated formulas in \lq\lq momentum space" long before Schwartz 
codified the Fourier transformation for distributions. The Feynman \lq\lq 
sums over histories" have been immensely successful in many problems, 
coming from physics as well from mathematics, despite the lack of a 
comprehensive rigorous theory.

To conclude, I would like to offer some 
remarks about the word \lq\lq formal". For the mathematician, it usually 
means \lq\lq according to the standard of formal rigor, of formal logic". 
For the physicists, it is more or less synonymous with \lq\lq heuristic" as 
opposed to \lq\lq rigorous". It is very often a source of misunderstanding 
between these two groups of scientists. 

\section {A new look at the exponential}
\label{exponential}
\subsection{The power of exponentials}
\label{power}
The multiplication of numbers started as a shorthand for repeated 
additions, for instance $7$ times $3$ (or rather \lq\lq seven taken three 
times") is the sum of three terms equal to $7$ 
%
$$
7 \times3=\underbrace{7+7+7}_{3~\mathrm{times}}.\nonumber 
\label{equ0}
$$
%
In the same vein $7^3$ (so denoted by Viete and Descartes) means  
$\underbrace{7\times7\times7}_{3~\mathrm{factors}}$. There is no difficulty 
to define $x^2$ as $xx$ or $x^3$ as $xxx$ for any kind of  multiplication 
(numbers, functions, matrices \dots) and Descartes uses interchangeably 
$xx$ or $x^2$, $xxx$ or $x^3$. 

In the exponential (or power) notation, the exponent plays the role of an 
operator. A great progress, taking approximately the years from $1630$ to 
$1680$ to accomplish, was to generalize $a^b$ to new cases where the 
operational meaning of the exponent $b$ was much less visible. By $1680$, a 
well defined meaning has been assigned to $a^b$ for $a$, $b$ real numbers, 
$a>0$. Rather than to retrace the historical route, we shall use a formal 
analogy with vector algebra. From the original definition of $a^b$ as 
$a\times\dots\times a $ ($b$ factors), we deduce the fundamental rules of 
operation, namely 
%
\begin{equation}
(a\times a')^b=a^b \times {a'}^b, ~a^{b+b'}=a^b\times 
a^{b'},~(a^b)^{b'}=a^{b b'},~a^1=a. 
\label{equ1}
\end{equation}
%
The other rules for manipulating powers are easy consequences of the rules 
embodied in (\ref{equ1}). The fundamental rules for \textbf{vector algebra} 
are as follows: 
%
\begin{eqnarray}
&&(\mathbf{v}+\mathbf{v}').\lambda=\mathbf{v}.\lambda+\mathbf{v}'.\lambda, 
~\mathbf{v}.(\lambda+\lambda')=\mathbf{v}.\lambda+\mathbf{v}.\lambda',\nonumber 
\\ &&~(\mathbf{v}.\lambda).\lambda'=\mathbf{v}.(\lambda\lambda'),~\mathbf{v}.1=\mathbf{v}. 
\label{equ2}
\end {eqnarray}
%
The analogy is striking provided we compare  the product $a\times a'$ of 
numbers to the sum $\mathbf{v}+\mathbf{v}'$ of vectors, and the 
exponentiation $a^b$ to the scaling $\mathbf{v}.\lambda$ of the vector 
$\mathbf{v}$ by the scalar $\lambda$. 

In modern terminology, to define $a^b$ for $a$, $b$ real, $a>0$ means that 
we want to consider the set $\mathbf R_+^{\times}$ of real numbers $a>0$ as 
a vector space over the field of real numbers $\mathbf R$. But to vectors, 
one can assign coordinates: if the coordinates of the vector 
$\mathbf{v}(\mathbf{v}')$ are the $v_i(v_i')$, then the coordinates of 
$\mathbf{v}+\mathbf{v}'$ and $\mathbf{v}.\lambda$ are respectively 
$v_i+v_i'$ and $v_i.\lambda$. Since we have only one degree of freedom in 
$\mathbf R_+^{\times}$, we should need one coordinate, that is a bijective 
map $L$ from $\mathbf R_+^{\times}$ to $\mathbf R$ such that 
%
\begin{equation}
L(a\times a')=L(a)+L(a').
\label{equ3}
\end{equation}
%
Once such a logarithm $L$ has been constructed, one defines $a^b$ in such a 
way that $L(a^b)=L(a).b$. It remains the daunting task to construct a 
logarithm. With hindsight, and using the tools of calculus, here is the 
simple definition of \lq\lq natural logarithms" 
%
\begin{equation}
\ln(a)=\int_1^a dt/t~~~\mathrm{for}~a>0.
\label{equ4}
\end{equation}
%
In other words, the logarithm function $\ln(t)$ is the primitive of $1/t$ 
which vanishes for $t=1$. The inverse function $\exp s$ (where $t= \exp s$ 
is synonymous to $\ln(t)=s$) is defined for all real $s$, with positive 
values, and is the unique solution to the differential equation $f'=f$ with 
initial value $f(0)=1$. The final definition of powers is then given by 
%
\begin{equation}
a^b=\exp(\ln(a).b).
\label{equ5}
\end{equation}
%
If we denote by $e$ the unique number with logarithm equal to $1$ (hence 
$e=2.71828\dots$), the exponential is given by $\exp a=e^a$. 

\textbf{The main character in the exponential is the exponent}, as it 
should be, in complete reversal from the original view where $2$ in $x^2$, 
or $3$ in $x^3$ are mere markers.

\subsection{Taylor's formula and exponential}
\label{Taylor}
We deal with the expansion of a function $f(x)$ around a fixed value $x_0$ of $x$, in the form
%
\begin{equation}
f(x_0+h)=c_0+c_1h+\cdots+c_ph^p+\cdots.
\label{equ6}
\end{equation}
%
This can be an infinite series, or simply a finite order expansion (include 
then a remainder). If the function $f(x)$ admits sufficiently many 
derivatives, we can deduce from (\ref{equ6}) the chain of relations
%
\begin{eqnarray} 
&&f'(x_0+h)=c_1+2c_2h+\cdots \nonumber \\ 
&&f''(x_0+h)=2c_2+6c_3h+\cdots\nonumber\\ 
&&f'''(x_0+h)=6c_3+24c_4h+\cdots~.\nonumber 
\label{equ6_1}
\end{eqnarray}
%
By putting $h=0$, deduce
%
$$
f(x_0)=c_0,~~ f'(x_0)=c_1,~~ f''(x_0)=2c_2, \ldots\nonumber
\label{equ6_2}
$$
%
and in general $f^{(p)}(x_0)=p!c_p$. Solving for the $c_p$'s and inserting 
into (\ref{equ6}) we get Taylor's expansion 
%
\begin{equation}
f(x_0+h)=\sum_{p\ge 0}\frac{1}{p!}f^{(p)}(x_0)h^p.
\label {equ7}
\end{equation}
%
Apply this to the case $f(x)= \exp x$, $x_0=0$. Since the function $f$ is 
equal to its own derivative $f'$, we get $f^{(p)}=f$ for all $p$'s, hence 
$f^{(p)}(0)=f(0)=e^0=1$. The result is 
%
\begin{equation}
\exp h=\sum_{p\ge 0}\frac{1}{p!}h^p.
\label{equ8}
\end{equation}
%
This is one of the most important formulas in mathematics. The idea is that 
this series can now be used to define the exponential of large classes of 
mathematical objects: complex numbers, matrices, power series, operators. 
For the modern mathematician, a natural setting is provided by a complete 
normed algebra $A$, with norm satisfying $||ab||\le||a||\cdot ||b||$. For any 
element $a$ in $A$, we define $\exp~a$ as the sum of the series $\sum_{p 
\ge 0}a^p/p!$, and the inequality 
%
\begin{equation}
||a^p/p!||\le ||a||^p/p!
\label{equ9}
\end{equation}
%
shows that the series is absolutely convergent.

But this would not exhaust the power of the exponential. For instance, if 
we take (after Leibniz) the step to denote by $\mathbf{D}f$ the derivative 
of $f$, $\mathbf{D}^2f$ the second derivative, etc\dots (another instance of 
the exponential notation!), then Taylor's formula reads as 
%
\begin{equation}
f(x+h)=\sum_{p \ge 0}\frac{1}{p!}h^p \mathbf{D}^p f(x).
\label{equ10}
\end{equation}
%
This can be interpreted by saying that the \textbf{shift operator} 
$\mathbf{T}_h$ taking a function $f(x)$ into $f(x+h)$ is equal to $\sum_{p 
\ge 0}\frac{1}{p!}h^p\mathbf{D}^p$, that is, to the exponential $\exp h\mathbf{D}$ (question: 
who was the first mathematician to cast Taylor's formula in these terms?). 
Hence the obvious operator formula 
$\mathbf{T}_{h+h'}=\mathbf{T}_h\cdot\mathbf{T}_{h'}$ reads as 
%
\begin{equation}
\exp (h+h')\mathbf{D}=\exp h\mathbf{D}\cdot\exp h'\mathbf{D}.
\label{equ11}
\end{equation}
%
Notice that for numbers, the logarithmic rule is
%
\begin{equation}
\ln(a\cdot a')=\ln(a)+\ln(a')
\label{equ12}
\end{equation}
%
according to the historical aim of reducing via logarithms the 
multiplications to additions. By inversion, the exponential rule is 
%
\begin{equation}
\exp(a+a')=\exp(a)\cdot\exp(a').
\label{equ13}
\end{equation}
%
Hence formula (\ref{equ11}) is obtained from (\ref{equ13}) by substituting 
$h\mathbf{D}$ to $a$ and $h'\mathbf{D}$  to $a'$. 

But life is not so easy. If we take two matrices $A$ and $B$ and calculate 
$\exp(A+B)$ and $\exp A\cdot\exp B$ by expansion we get 
%
\begin{equation}
\exp(A+B)=I+(A+B)+\frac{1}{2}(A+B)^2+\frac{1}{6}(A+B)^3+\cdots
\label{equ14}
\end{equation}
%
%
\begin{eqnarray}
&&\exp A . \exp B=I+(A+B)+\frac{1}{2}(A^2+2AB+B^2)\nonumber \\ 
&&+\frac{1}{6}(A^3+3A^2B+3AB^2+B^3)+\cdots. 
\label{equ15}
\end{eqnarray}
%
If we compare the terms of degree $2$ we get 
%
\begin{equation}
\frac{1}{2}(A+B)^2=\frac{1}{2}(A^2+AB+BA+B^2)
\label{equ16}
\end{equation}
%
in (\ref{equ14}) and not $\frac{1}{2}(A^2+2AB+B^2)$. Harmony is restored if 
$A$ and $B$ commute: indeed $AB=BA$ entails 
%
\begin{equation}
A^2+AB+BA+B^2=A^2+2AB+B^2
\label{equ17}
\end{equation}
%
and more generally the \textbf{binomial formula}
%
\begin{equation}
(A+B)^n=\sum_{i=0}^n \left ( \begin{array}{cc}n\\i\end{array}\right)
A^iB^{n-i} 
\label{equ18}
\end{equation}
%
for any $n\ge 0$. By summation one gets
%
\begin{equation}
\exp(A+B)=\exp A\cdot\exp B
\label{equ19}
\end{equation}
%
\textbf{if $A$ and $B$ commute, but not in general.} 
The success in (\ref{equ11}) comes from the obvious fact that $h\mathbf{D}$ commutes  
to $h'\mathbf{D}$ since numbers commute to (linear) operators.

\subsection{Leibniz's formula}
\label{Leibniz}
Leibniz's formula for the higher order derivatives of the product of two 
functions is the following one
%
\begin{equation}
\mathbf{D}^n(fg)=\sum_{i=0}^n \left ( 
\begin{array}{cc}n\\i\end{array}\right)\mathbf{D}^if .\mathbf{D}^{n-i}g. 
\label{equ20}
\end{equation}
%
The analogy with the binomial theorem is striking and was noticed early. 
Here are possible explanations. For the shift operator, we have 
%
\begin{equation}
\mathbf{T}_h=\exp h\mathbf{D}
\label{equ21}
\end{equation}
%
by Taylor's formula and
%
\begin{equation}
\mathbf{T}_h(fg)=\mathbf{T}_h f\cdot\mathbf{T}_h g
\label{equ22}
\end{equation}
%
by an obvious calculation. Combining these formulas we get
%
\begin{equation}
\sum _{n \ge 0}\frac{1}{n!}h^n\mathbf{D}^n(fg)=\sum_{i \ge 0}\frac{1}{i!}h^i \mathbf{D}^i f.
\sum_{j \ge 0}\frac{1}{j!}h^j \mathbf{D}^j g;
\label{equ23}
\end{equation}
%
equating the terms containing the same power $h^n$ of $h$, one gets
%
\begin{equation}
\mathbf{D}^n(fg)=\sum_{i+j=n}\frac{n!}{i!j!}\mathbf{D}^if\cdot\mathbf{D}^jg
\label{equ24}
\end{equation}
%
that is, Leibniz's formula. 

Another explanation starts from the case $n=1$, that is 
%
\begin{equation}
\mathbf{D}(fg)=\mathbf{D}f\cdot g+f\cdot \mathbf{D}g.
\label{equ25}
\end{equation}
%
In a heuristic way it means that $\mathbf{D}$ applied to a product $fg$ is 
the sum of two operators $\mathbf{D}_1$ acting on $f$ only and 
$\mathbf{D}_2$ acting on $g$ only. These actions being independent, 
$\mathbf{D}_1$ commutes to $\mathbf{D}_2$ hence the binomial formula 
%
\begin{equation}
\mathbf{D}^n=(\mathbf{D}_1+\mathbf{D}_2)^n=\sum_{i=0}^n \left ( 
\begin{array}{cc}n\\i\end{array}\right) \mathbf{D}_1^i\cdot \mathbf{D}_2^{n-i}.
\label{equ26}
\end{equation}
%
By acting on the product $fg$ and observing that 
$\mathbf{D}_1^i\cdot \mathbf{D}_2^j$ transforms $fg$ into $\mathbf{D}^i 
f\cdot \mathbf{D}^j g$, one recovers Leibniz's formula. In more detail, to 
calculate $\mathbf{D}^2(fg)$, one applies $\mathbf{D}$ to $\mathbf{D}(fg)$. 
Since $\mathbf{D}(fg)$ is the sum of two terms $\mathbf{D}f\cdot g$ and 
$f\cdot \mathbf{D}g$ apply $\mathbf{D}$ to $\mathbf{D}f\cdot g$ to get 
$\mathbf{D}(\mathbf{D}f)g+\mathbf{D}f\cdot \mathbf{D}g$ and to
$f\cdot \mathbf{D}g$ 
to get $\mathbf{D}f\cdot \mathbf{D}g+f\cdot \mathbf{D}(\mathbf{D}g)$, hence the sum 
%
\begin{eqnarray}
&&\mathbf{D}(\mathbf{D}f)\cdot g+\mathbf{D}f\cdot
\mathbf{D}g+\mathbf{D}f\cdot \mathbf{D}g
+f\cdot \mathbf{D}(\mathbf{D}g)\nonumber 
\\ &&~~=\mathbf{D}^2f\cdot g+2\mathbf{D}f\cdot \mathbf{D}g+f\cdot \mathbf{D}^2g\nonumber. 
\label{equ26_1}
\end{eqnarray}
%

This last proof can rightly be called \lq\lq formal" since we act on the 
formulas, not on the objects: $\mathbf{D}_1$ transforms $f\cdot g$ into 
$\mathbf{D}f\cdot g$ but this doesn't mean that from the equality of functions 
$f_1\cdot g_1=f_2\cdot g_2$ one gets $\mathbf{D}f_1\cdot
g_1=\mathbf{D}f_2\cdot g_2$ 
(counterexample: from $fg$=$gf$, we cannot infer 
$\mathbf{D}f\cdot g=\mathbf{D}g\cdot f$). The modern explanation is provided by the 
notion of tensor products: if $V$ and $W$ are two vector spaces (over the 
real numbers as coefficients, for instance), equal or distinct, there 
exists a new vector space $V\otimes W$ whose elements are formal finite 
sums $\sum_i\lambda_i(v_i\otimes w_i)$ (with scalars $\lambda_i$ and $v_i$ 
in $V$, $w_i$ in $W$); we take as basic rules the consequences of the fact 
that $v 
\otimes w$ is bilinear in $v$, $w$, but nothing more. Taking $V$ and $W$ to 
be the space $C^{\infty}(I)$ of the functions defined and indefinitely 
differentiable in an interval $I$ of $\mathbf  R$, we define the operators 
$\mathbf{D}_1$ and $\mathbf{D}_2$ in $C^{\infty}(I) 
\otimes C^{\infty}(I)$ by 
%
\begin{equation}
\mathbf{D}_1(f \otimes g)=\mathbf{D}f \otimes g,~~\mathbf{D}_2(f\otimes g)=f \otimes \mathbf{D}g.
\label{equ27}
\end{equation}
%
The two operators $\mathbf{D}_1\mathbf{D}_2$ and $\mathbf{D}_2\mathbf{D}_1$ 
transform $f 
\otimes g$ into $\mathbf{D}f 
\otimes \mathbf{D}g$, hence $\mathbf{D}_1$ and $\mathbf{D}_2$ commute.
Define $\bar{\mathbf{D}}$ as $\mathbf{D}_1+\mathbf{D}_2$   hence 
%
\begin{equation}
\bar{\mathbf{D}}(f \otimes g)=\mathbf{D}f\otimes g+ f \otimes \mathbf{D}g.
\label{equ28}
\end{equation}
%
We can now calculate $\bar{\mathbf{D}}^n=(\mathbf{D}_1+\mathbf{D}_2)^n$ by 
the binomial formula as in (\ref{equ26}) with the conclusion  
%
\begin{equation}
\bar{\mathbf{D}}^n (f \otimes g)=\sum_{i=0}^n 
\left ( \begin{array}{cc} n\\i  \end{array} \right)
\mathbf{D}^i f \otimes \mathbf{D}^{n-i} g.
\label{equ29}
\end{equation}
%

The last step is to go from (\ref{equ29}) to (\ref{equ20}). The rigorous 
reasoning is as follows. There is a linear operator $\mu$ taking $f\otimes 
g$ into $f\cdot g$ and mapping $C^{\infty}(I) \otimes C^{\infty}(I)$ into 
$C^{\infty}(I)$; this follows from the fact that the product $f\cdot g$ is 
bilinear in $f$ and $g$. The formula (\ref{equ25}) is expressed by 
$\mathbf{D}\circ 
\mu=\mu \circ \bar{\mathbf{D}}$ in operator terms, according to the diagram:  
%
$$
\begin{array}{cccc}
  &C^{\infty}(I) \otimes C^{\infty}(I)  & \stackrel{\mu}\longrightarrow & 
  C^{\infty}(I)\\ & \bar \mathbf{D}\downarrow & &  \downarrow \mathbf{D}\\ 
  & C^{\infty}(I) \otimes C^{\infty}(I)     & \stackrel{\mu}\longrightarrow  
  &C^{\infty}(I) .\nonumber 
\end{array}
$$
%
An easy induction entails $\mathbf{D}^n \circ \mu=\mu \circ 
\bar{\mathbf{D}}^n$, and from  (\ref{equ29}) one gets 
%
\begin{eqnarray}
&&\mathbf{D}^n(fg)=\mathbf{D}^n(\mu(f\otimes 
g))=\mu(\bar{\mathbf{D}}^n(f\otimes g))\nonumber 
\\ &&=\mu(\sum_{i=0}^n\left ( \begin{array}{cc} n\\i  \end{array} \right) 
\mathbf{D}^if  
\otimes \mathbf{D}^{n-i}g)=
\sum_{i=0}^n \left ( \begin{array}{cc} n\\i  \end{array} \right)
 \mathbf{D}^if\cdot \mathbf{D}^{n-i}g.
\label{equ30}
\end{eqnarray}
%
\textbf{In words: first replace the ordinary product $f\cdot g$ by the neutral tensor 
product $f \otimes g$, perform all calculations using the fact that 
$\mathbf{D}_1$ commutes with $\mathbf{D}_2$, then restore the product
$\cdot $ in 
place of $\otimes$.} 

When the vector spaces $V$ and $W$ consist of functions of one variable, 
the tensor product $f \otimes g$ can be interpreted as the function 
$f(x)g(y)$ in two variables $x$, $y$; moreover 
$\mathbf{D}_1=\partial/\partial x$, $\mathbf{D}_2=\partial/\partial y$ and 
$\mu$ takes a function $F(x,y)$ of two variables into the one-variable 
function $F(x,x)$ hence $f(x)g(y)$ into $f(x)g(x)$ as it should. Formula 
(\ref{equ25}) reads now 
%
\begin{equation}
\frac {\partial}{\partial x} (f(x)g(x))=
\(\frac {\partial}{\partial x}+\frac {\partial}{\partial y}\)
f(x)g(y) \Big|_{y=x}.
\label{equ31}
\end{equation} 
%
The previous \lq\lq formal" proof is just a rephrasing of a familiar proof using 
Schwarz's theorem that $\frac {\partial}{\partial x}$ and $\frac 
{\partial}{\partial y}$ commute. 

Starting from the tensor product $\mathcal{H}_1 \otimes \mathcal{H}_2$ of 
two vector spaces, one can iterate and obtain 
%
$$
\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3,~~
\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3 \otimes \mathcal{H}_4,\ldots \nonumber.
\label{equ31a}
$$
%
Using once again the exponential notation, $\mathcal{H}^{\otimes n}$ is the 
tensor product of $n$ copies of $\mathcal {H}$, with elements of the form 
$\sum \lambda\cdot (\psi_1\otimes \dots\otimes \psi_n)$. In quantum physics, 
$\mathcal H$ represents the state vectors of a particle, and 
$\mathcal{H}^{\otimes n}$ represents the state vectors of a system of $n$ 
independent particles of the same kind. If $H$ is an operator in 
$\mathcal{H}$ representing for instance the energy of a particle, we define 
$n$ operators $H_i$ in $\mathcal{H}^{\otimes n}$ by 
%
\begin{equation}
H_i(\psi_1\otimes\dots\otimes \psi_n)= 
\psi_1\otimes\cdots \otimes H\psi_i\otimes \cdots \otimes\psi_n 
\label{equ32}
\end{equation} 
%
(the energy of the $i$-th particle). Then $H_1,\dots,H_n$ commute pairwise 
and $H_1+\cdots+H_n$ is the total energy if there is no interaction. 
Usually, there  is a pair interaction represented by an operator $V$ in 
$\mathcal{H}\otimes 
\mathcal{H}$; then the total energy is given by $\sum_{i=1}^n 
H_i+\sum_{i<j} V_{ij}$ with 
%
\begin{equation}
V_{12}(\psi_1\otimes\psi_2\otimes\cdots\otimes\psi_n)=V(\psi_1\otimes\psi_2)\otimes\psi_3\otimes  
\cdots
\label{equ33}
\end{equation} 
%
%
\begin{equation}
V_{23}(\psi_1\otimes\cdots\otimes\psi_n)=\psi_1\otimes 
V(\psi_2\otimes\psi_3)\otimes \cdots \otimes\psi_n 
\label{equ34}
\end{equation} 
%
etc\dots There are obvious commutation relations like
%
\begin{eqnarray}
&&H_iH_j=H_jH_i \nonumber \\ && 
H_iV_{jk}=V_{jk}H_i~~\mathrm{if}~~i,j,k~~\mathrm{are}~ \mathrm{distinct}. 
\nonumber 
\end{eqnarray}
%
This is the so-called \lq\lq locality principle": if two operators $A$ and 
$B$ refer to disjoint collections of particles $(\mathbf a)$ for $A$ and 
$(\mathbf b)$ for $B$, they commute. 

Faddeev and his collaborators made an extensive use of this notation in 
their study of quantum integrable systems. Also, Hirota introduced his 
so-called bilinear notation for differential operators connected with 
classical integrable systems (solitons). 

\subsection{Exponential vs. logarithm} 
\label{logarithm}  

In the case of real numbers, one usually starts from the logarithm and 
invert it to define the exponential (called antilogarithm not so long ago).  
Positive numbers have a logarithm; what about the logarithm of $-1$ for 
instance?

Things are worse in the complex domain. For a complex number $z$, define 
its exponential by the convergent series 
%
\begin{equation}
\exp z=\sum_{n\ge0}\frac{1}{n!}z^n.
\label{equ35}
\end{equation} 
%
From the binomial formula, using the commutativity $zz'=z'z$ one gets
%
\begin{equation}
\exp(z+z')=\exp z\cdot \exp z'
\label{equ36}
\end{equation} 
%
as before. Separating real and imaginary part of the complex number 
$z=x+iy$ gives Euler's formula 
%
\begin{equation}
\exp(x+iy)=e^x(\cos y+i\sin y)
\label{equ37}
\end{equation} 
%
subsuming trigonometry to complex analysis. The trigonometric lines are the 
\lq\lq natural" ones, meaning that the angular unit is the radian (hence $\sin 
\delta\simeq\delta$ for small $\delta$). 

From an intuitive view of trigonometry, it is obvious that the points of a 
circle of equation $x^2+y^2=R^2$ can be uniquely parametrized in the form 
%
\begin{equation}
x=R~\cos\theta,~~y=R~\sin\theta
\label{equ38}
\end{equation} 
%
with $-\pi<\theta\le\pi$, but the subtle point is to show that the 
geometric definition of $\sin\theta$ and $\cos\theta$ agree with the 
analytic one given by (\ref{equ37}). Admitting this, every complex number 
$u\ne 0$ can be written as an exponential $\exp z_0$, where $z_0=x_0+iy_0$, 
$x_0$ real and $y_0$ in the interval $]-\pi,\pi]$. The number $z_0$ is 
called the principal determination of the logarithm of $u$, denoted by 
$\mathrm {Ln} (u)$. But the general solution of the equation $\exp z=u$ is 
given by $z=z_0+2\pi i n$ where $n$ is a rational integer. Hence a nonzero 
complex number has infinitely many logarithms. The functional property 
(\ref{equ36}) of the exponential cannot be neatly inverted: for the 
logarithms we can only assert that $\mathrm{Ln}(u_1\cdots u_p)$ and 
$\mathrm{Ln}(u_1)+\ldots+\mathrm{Ln}(u_p)$ differ by the addition of an 
integral multiple of $2\pi i$. 

The exponential of a (real or complex) square matrix A is defined by the 
series  
%
\begin{equation}
\exp A=\sum_{n\ge0}\frac{1}{n!}A^n.
\label{equ39}
\end{equation} 
%
There are two classes of matrices for which the exponential is easy to 
compute: 

\textbf{a)} Let $A$ be diagonal $A=\mathrm{diag}(a_1,\ldots,a_n)$. Then $\exp A$ is 
diagonal with elements $\exp a_1,\ldots,\exp a_n$.  Hence any  
\textbf{complex} diagonal matrix with non zero diagonal elements is an exponential, hence 
admits a logarithm, and even infinitely many ones. 

\textbf{b)} Suppose that $A$ is a special upper triangular matrix, with zeroes on 
the diagonal, of the type  
%
$$
A =\left ( 
\begin{array}{cccc}
0 & a & b &c \\ &0 & d & e \\& &0 & f \\& & &  0 
\end{array}
\right )\nonumber.
\label{matrix A}
$$
%
Then $A^d=0$ if $A$ is of size $d\times d$. Hence $\exp A$ is equal to 
$I+B$ where $B$ is of the form 
$A+\frac{1}{2}A^2+\frac{1}{6}A^3+\cdots+\frac{1}{(d-1)!} A^{d-1}$. Hence 
$B$ is  again a special upper triangular matrix and $A$ can be recovered by 
the formula 
%
\begin{equation}
A=B-\frac{B^2}{2}+\frac{B^3}{3}-\cdots +(-1)^d \frac{B^{d-1}}{d-1}.
\label{equ40}
\end{equation} 
%
This is just the truncated series for $\ln(I+B)$ (notice $B^d=0$). Hence in 
the case of these special triangular matrices, exponential and logarithm 
are inverse operations.

In general, $A$ can be put in triangular form $A=UTU^{-1}$ where $T$ is 
upper triangular. Let $\lambda_1,\dots,\lambda_d$ be the diagonal elements of  
$T$, that is the eigenvalues of $A$. Then 
%
\begin{equation}
\exp A=U\cdot \exp T\cdot U^{-1}
\label{equ41}
\end{equation} 
%
where $\exp T$ is triangular, with the diagonal elements $\exp  
\lambda_1,\ldots ,\exp \lambda_d$. Hence
%
\begin{equation}
\det(\exp A)=\prod_{i=1}^d \exp \lambda_i=\exp \sum_{i=1}^d 
\lambda_i=\exp(\mathrm{Tr}(A)). 
\label{equ42}
\end{equation} 
%
The determinant of $\exp A$ is therefore nonzero. Conversely \textbf {any 
complex matrix} $M$ \textbf {with a nonzero determinant is an exponential}: 
for the proof, write $M$ in the form $U T U^{-1}$ where $T$ is composed of 
Jordan blocks of the form 
%
$$
T_s =\left ( 
\begin{array}{ccccc}
\lambda & 1 &.  &. &0\\ & .&. &.  &.  \\&0 & &. & 1 \\.&. &. &. &  \lambda 
\end{array}
\right )~~ \mathrm{with}~\lambda \ne 0  ~~\nonumber.
\label{matrix T}
$$
%
From the existence of the complex logarithm of $\lambda$ and the study 
above of triangular matrices, it follows that $T_s$ is an exponential, 
hence $T$ and $M=UTU^{-1}$ are exponentials.

Let us add a few remarks:

 a) A complex matrix with nonzero determinant has infinitely many 
 logarithms; it is possible to normalize things to select one of them, but 
 the conditions are rather artificial. 
 
 b) A real matrix with nonzero determinant is not always the exponential of 
 a real matrix; for example, choose $M =\left ( 
\begin{array}{cc} 1   &0\\0&-1 \end{array} \right )$. 
This is not surprising since $-1$ has no real logarithm, but many complex 
logarithms of the form $k\pi i$ with $k$ odd.

c) The noncommutativity of the multiplication of matrices implies that in 
general $\exp(A+B)$ is not equal to $\exp A\cdot \exp B$ . Here the logarithm of 
a product cannot be the sum of the logarithms, whatever normalization we    
choose. 

\subsection{Infinitesimals and exponentials}
\label{infinitesimal}

There are many notations in use for the higher order derivatives of a 
function $f$. Newton uses $\dot f, \ddot{f},\ldots$, the customary notation 
is $f',f'',\ldots$. Once again, the exponential notation can be 
systematized, $f^{(m)}$ or $\mathbf{D}^m f$ denoting the $m$-th derivative of $f$, 
for $m=0,1,\ldots$. This notation emphasizes that the derivation is a 
functional operator, hence 
%
\begin{equation}
(f^{(m)})^{(n)}=f^{(m+n)},~~\mathrm{or}~~\mathbf{D}^m(\mathbf{D}^n f)=\mathbf{D}^{m+n} f
\label{equ43}.
\end{equation}
%
In this notation, it is cumbersome to write the chain rule for the 
derivative of a composite function 
%
\begin{equation}
\mathbf{D}(f \circ g)=(\mathbf{D}f \circ g)\cdot \mathbf{D}g.
\label{equ44}
\end{equation}
%

Leibniz's notation for the derivative is $dy/dx$ if $y=f(x)$. Leibniz was 
never able to give a completely rigorous definition of the infinitesimals 
$dx, dy,\\\dots$\footnote{In modern times, Abraham Robinson has vindicated 
them using the tools of formal logic. There has been many interesting 
applications of his nonstandard analysis, but one has to admit that it 
remains too cumbersome to provide a viable alternative to the standard 
analysis. Maybe in the $21$st century!}. His explanation of the derivative 
is as follows: starting from $x$, increment it by an infinitely small 
amount $dx$; then $y=f(x)$ is incremented by $dy$, see Figure~1.
%
\begin{figure}
\centering 
\includegraphics[width=1.0\textwidth]{cartier1.pdf} 
\caption{\textbf{Geometrical description}: an infinitely small portion of the curve $y=f(x)$,
 after  zooming, becomes infinitely close to a straight line, our function 
 is \lq\lq smooth", not fractal-like. }\label{figure1} 
\end{figure} 
%
%
\begin{equation}
f(x+dx)=y+dy.
\label{equ45}
\end{equation}
%
Then the derivative is $f'(x)=dy/dx$, hence according to
(\ref{equ45}),
%
\begin{equation}
f(x+dx)=f(x)+f'(x) dx.
\label{equ46}
\end{equation}
%
This cannot be literally true, otherwise the function $f(x)$ would be 
linear. The true formula is 
%
\begin{equation}
f(x+dx)=f(x)+f'(x) dx+ o(dx)
\label{equ47}
\end{equation}
%
with an error term $o(dx)$ which is infinitesimal, of a higher order than 
$dx$, meaning $o(dx)/dx$ is again infinitesimal. In other words, the 
derivative $f'(x)$, independent of $dx$, is infinitely close to 
$\frac{f(x+dx)-f(x)}{dx}$ for all infinitesimals $dx$. The modern 
definition, as well as Newton's point of view of fluents, is a 
\textbf{dynamical} one: when $dx$ goes to $0$, $\frac{f(x+dx)-f(x)}{dx}$ tends 
to the limit $f'(x)$. Leibniz's notion is \textbf{statical}: $dx$ is a 
given, fixed quantity. But there is a hierarchy of infinitesimals: $\eta$ 
is of higher order than $\epsilon$ if $\eta/\epsilon$ is again 
infinitesimal. In the formulas, equality is always to be interpreted up to 
an infinitesimal error of a certain order, not always made explicit. 

We use these notions to describe the logarithm and the exponential. By 
definition, the derivative of $\ln x$ is $\frac{1}{x}$, hence 
%
$$
\frac{d\ln x}{dx}=\frac{1}{x},~~\mathrm{that~ is}~\ln(x+dx)=\ln(x)+\frac{dx}{x}\nonumber .
$$
%
Similarly for the exponential
%
$$
\frac{d\exp x}{dx}=\exp x,~~\mathrm{that~ is}~\exp(x+dx)=(\exp x) (1+dx)\nonumber .
$$
%
This is a rule of compound interest. Imagine a fluctuating daily rate of 
interest, namely $\epsilon_1,\epsilon_2,\dots,\epsilon_{365}$ for the days of 
a given year, every daily rate being of the order of $0.0003$. For a fixed 
investment $C$, the daily reward is $C\epsilon_i$ for day $i$, hence the 
capital becomes $C+C\epsilon_1+\dots+C\epsilon_{365}=C\cdot (1+\sum_i 
\epsilon_i)$, that is approximately $C(1+.11)$. If we reinvest every day 
our profit, invested capital changes according to the rule: 
%
$$
\begin{array}{ccccc} 
&C_{i+1}=~~&C_i&+~~~~~C_i \epsilon_i=&C_i(1+\epsilon_i). \\ &\uparrow 
~~&\uparrow&~~~~~\uparrow&\\ 
&\mathrm{capital}&~~\mathrm{capital}&~~~~~\mathrm{profit}&\\ 
&\mathrm{at~day~} i+1&~~\mathrm{at~ day~} i&~~~~~\mathrm{during~ day~} i& 
\nonumber 
\end{array}
$$
%
At the end of the year, our capital is $C\cdot \prod_i(1+\epsilon_i)$. We can 
now formulate the \lq\lq bankers rule": 
%
$$
\mathrm{if}~S=\epsilon_1+\dots+\epsilon_N,~\mathrm{then}~\exp S=
(1+\epsilon_1)\cdots(1+\epsilon_N). ~~~~(B)\nonumber
\label{banker}
$$
%
Here $N$ is infinitely large, and $\epsilon_1,\ldots,\epsilon_N$ are 
infinitely small; in our example, $S=0.11$, hence 
$\exp~S=1+S+\frac{1}{2}S^2+\ldots$ is  equal to $1.1163\ldots$: 
\textbf{by reinvesting daily, the yearly profit of $11\%$ is increased to 
$11.63\%$}. 

Formula (B) is not true without reservation. It certainly holds if all 
$\epsilon_i$ are of the same sign, or more generally if 
$\sum_i|\epsilon_i|$ is of the same order as $\sum \epsilon_i=x$. For a 
counter-example, take $N=2p^2$ with half of the $\epsilon_i$ being equal to 
$+\frac{1}{p}$, and the other half to $-\frac{1}{p}$ (hence $\sum_i 
\epsilon_i=0$ while $\prod_i (1+\epsilon_i)$ is infinitely close to $1/e=\exp(-1)$).

To connect definition (B) of the exponential to the power series expansion  
$\exp S=1+S+\frac{1}{2!}S^2+\cdots$ one can proceed as follows: by algebra 
we get 
%
\begin{equation}
\prod_{i=1}^N(1+\epsilon_i)=\sum_{k=0}^N S_k,
\label{equ48}
\end{equation}
%
where $S_0=1, S_1=\epsilon_1+\dots+\epsilon_N=S$, and generally
%
\begin{equation}
S_k=\sum_{i_1<\dots<i_k} \epsilon_{i_1} \cdots \epsilon_{i_k}.
\label{equ49}
\end{equation}
%
We have to compare $S_k$ to 
$\frac{1}{k!}S^k=\frac{1}{k!}(\epsilon_1+\cdots+\epsilon_N)^k$. Developing 
the $k$-th power of $S$ by the multinomial formula, we obtain $S_k$ plus 
error terms each containing at least one of the $\epsilon_i$'s to a higher 
power, $\epsilon_i^2,\epsilon_i^3,\dots$, hence infinitesimal compared to the 
$\epsilon_i$'s. The general \textbf{principle of compensation of 
errors}\footnote{This terminology was coined by Lazare Carnot in $1797$. 
Our formulation is more precise than his!} is as follows: given a 
sum of infinitesimals 
%
\begin{equation}
\Sigma = \eta_1+\cdots+\eta_M
\label{equ50}
\end{equation}
%
and new summands $\eta_j'=\eta_j+o(\eta_j)$ 
with an error $o(\eta_j)$ of higher order than $\eta_j$, we obtain that
%
\begin{equation}
\Sigma' = \eta_1'+\cdots+\eta_M'
\label{equ51}
\end{equation}
%
is equal to $\Sigma$ plus an error term $o(\eta_1)+\cdots+o(\eta_M)$. \textbf{If the 
$\eta_j$ are of the same sign, the error is $o(\Sigma)$, that is negligible  
compared to $\Sigma$.} 

%
\begin{figure}\centering 
\includegraphics[width=1.0\textwidth]{cartier2.pdf} 
\caption{\textbf{Leibniz' continuum}:  by zooming, a finite segment of line is made of a large 
number of atoms of space: a fractal.}
\label{figure2} 
\end{figure} 
%
The implicit view of the continuum underlying Leibniz's calculus is as 
follows: a finite segment of a line is made of an infinitely large number 
of geometric atoms of space which can be arranged in a succession, each 
atom $x$ being separated by $dx$ from the next one. Hence in the definition  
of the logarithm
%
\begin{equation}
\ln a=\int_1^a \frac{dx}{x}~~~~(\mathrm{for}~a>1),
\label{equ52}
\end{equation}
%
we really have $\sum_{1\le x\le a}\frac{dx}{x}$. Similarly, the bankers 
rule (B) should be interpreted as 
%
\begin{equation}
\exp a=\prod_{0\le x\le a}(1+dx)~~(\mathrm{for}~a>0).
\label{equ53}
\end{equation}
%
\subsection{Differential equations}
\label{differential}
The previous formulation of the exponential suggests a method to solve a 
differential equation, for instance $y'=ry$. In differential form 
%
\begin{equation}
dy=r(x)ydx,
\label{equ54}
\end{equation}
%
that is
%
\begin{equation}
y+dy=(1+r(x)dx)y.
\label{equ55}
\end{equation}
%
The solution is
%
\begin{equation}
y(b)=\prod_{a\le x\le b}(1+r(x)dx)\cdot y(a).
\label{equ56}
\end{equation}
%
What is the meaning of this product? Putting $\epsilon(x)=r(x)dx$, an 
infinitesimal, and expanding the product as in (\ref{equ48}), we get 
%
\begin{equation}
\prod_x (1+\epsilon(x))=\sum_{k\ge0}\sum_{a\le x_1< \ldots <x_k\le b} \epsilon(x_1)\cdots\epsilon(x_k);
\label{equ57}
\end{equation}
%
reinterpreting the multiple sum as a multiple integral, this is
%
\begin{equation}
\sum_{k\ge 0}\int\cdots\int_{\Delta_k} r(x_1)\cdots r(x_k)dx_1\cdots dx_k.
\label{equ58}
\end{equation}
%
The domain of integration $\Delta_k$ is given by the inequalities
%
\begin{equation}
a\le x_1\le x_2\le \ldots\le x_k\le b.
\label{equ59}
\end{equation}
%

The classical solution to the differential equation $y'=ry$ is given by
%
\begin{equation}
y(b)=(\exp \int_a^b r(x) dx)\cdot y(a).
\label{equ60}
\end{equation}
%
Let us see how to go from (\ref{equ58}) to (\ref{equ60}). Geometrically, 
consider the hypercube $C_k$ given by 
%
\begin{equation}
a\le x_1\le b,\cdots,a\le x_k\le b
\label{equ61}
\end{equation}
%
in the euclidean space $\mathbf R^k$ of dimension $k$ with coordinates 
$x_1,\ldots,x_k$. The group $S_k$ of the permutations $\sigma$ of 
$\{1,\ldots,k\}$ acts on $\mathbf R^k$, by transforming the vector 
$\mathbf{x}$ with coordinates   $x_1,\ldots,x_k$ into the vector 
$\sigma.\mathbf{x}$ with coordinates $x_{\sigma^{-1}(1)},\ldots 
,x_{\sigma^{-1}(k)}$. Then the cube $C_k$ is the union of the $k!$ 
transforms $\sigma(\Delta_k)$. Since the function $r(x_1)\ldots r(x_k)$ to 
be integrated is symmetrical in the variables $x_1,\ldots,x_k$ and moreover 
two distinct domains $\sigma(\Delta_k)$ and $\sigma'(\Delta_k)$ overlap by 
a subset of dimension $<k$, hence of volume  $0$, we see that the integral 
of $r(x_1)\cdots r(x_k)$ over $C_k$ is $k!$ times the integral over 
$\Delta_k$. That is 
%
\begin{eqnarray}
&&\int \cdots\int_{\Delta_k} r(x_1)\cdots r(x_k)dx_1\cdots dx_k=\nonumber 
\\ &&~~\frac{1}{k!}\int_a^b dx_1 \cdots \int_a^b dx_k ~r(x_1)\cdots r(x_k)=\frac{1}{k!}
(\int_a^b r(x)dx)^k\nonumber.
\label{equ61a}
\end{eqnarray}
%
Summing over $k$, and using the definition of an exponential by a series, 
we conclude 
%
\begin{equation}
\sum_{k\ge 0}\int\cdots\int_{\Delta_k} r(x_1) \cdots r(x_k)dx_1 \cdots dx_k=
\exp \int_a^b r(x) dx.
\label{equ62}
\end{equation}
%
as promised.

The same method applies to the linear systems of differential equations. We 
cast them in the matrix form
%
\begin{equation}
y'=A\cdot y,
\label{equ63}
\end{equation}
%
that is the differential form
%
\begin{equation}
dy=A(x)ydx.
\label{equ64}
\end{equation}
%
Here $A(x)$ is a matrix depending on the variable $x$, and $y(x)$ is a 
vector (or matrix) function of $x$. From (\ref{equ64}) we get 
%
\begin{equation}
y(x+dx)=(I+A(x)dx)y(x).
\label{equ65}
\end{equation}
%
Formally the solution is given by
%
\begin{equation}
y(b)=\prod_{a\le x\le b}(I+A(x)dx)\cdot y(a).
\label{equ67}
\end{equation}
%

We have to take into account the \textbf{noncommutativity} of the products 
$A(x)A(y)A(z)\ldots$ . Explicitly, if we have chosen intermediate points 
%
$$
a=x_0<x_1<\ldots<x_N=b,\nonumber
\label{equ66a}
$$
%
with infinitely small spacing
%
$$
dx_1=x_1-x_0, dx_2=x_2-x_1,\dots,dx_N=x_N-x_{N-1}\nonumber,
\label{equ66b}
$$
%
the product in (\ref{equ67}) is 
%
$$
(I+A(x_N) dx_N)(I+A(x_{N-1})dx_{N-1})\cdots(I+A(x_1)dx_1).\nonumber
\label{equ66c}
$$
%
We use the notation $\overleftarrow{ \prod}_{1\le i\le N} U_i$ for a 
\textbf{reverse product} $U_N U_{N-1}\cdots U_1$; hence the previous product can be written as  
$\overleftarrow{ \prod}_{1\le i\le N} (I+A(x_i) dx_i)$ and we should 
replace $\displaystyle{\prod}$ by $\overleftarrow{\prod}$ in equation (\ref{equ67}). The 
noncommutative version of equation (\ref{equ48}) is 
%
\begin{equation}
\overleftarrow{\prod}_{1\le i\le N}(I+A_i)=\sum_{k=0}^N ~\sum_{i_1>\cdots>i_k}A_{i_1}\cdots A_{i_k}.
\label{equ68}
\end{equation}
%
Let us define the resolvant (or propagator) as the matrix
%
\begin{equation}
U(b,a)=\overleftarrow{\prod}_{a\le x\le b}(I+A(x)dx).
\label{equ69}
\end{equation}
%
Hence the differential equation $dy=A(x) y dx$ is solved by 
$y(b)=U(b,a)y(a)$ and from (\ref{equ68}) we get
%
\begin{equation}
U(b,a)=\sum_{k\ge 0} \int\cdots \int_{\Delta_k} A(x_k)\cdots 
A(x_1)dx_1\cdots dx_k 
\label{equ70}
\end{equation}
%
with the factors $A(x_i)$ in \textbf{reverse order}
%
\begin{equation}
A(x_k)\cdots A(x_1)~~\mathrm{for}~x_1<\ldots <x_k.
\label{equ70a}
\end{equation}
%

One owes to R.~Feynman and F.~Dyson (1949) the following notational trick. 
If we have a product of factors $U_1,\cdots ,U_N$, each attached to a point 
$x_i$ on a line, we denote by $T(U_1 \cdots U_N)$ (or more precisely by 
$\overleftarrow{T}(U_1\cdots U_N)$) the product $U_{i_1}\cdots U_{i_N}$ 
where the permutation $i_1\ldots i_N$ of $1\ldots N$ is such that 
$x_{i_1}>\cdots > x_{i_N}$. Hence in the rearranged product the abscisses 
attached to the factors increase from right to left. We argue now as in the 
proof of (\ref{equ62}) and conclude that 
%
\begin{eqnarray}
&&\int \cdots \int_{\Delta_k} A(x_k)\cdots A(x_1)dx_1\cdots dx_k\nonumber 
\\
&&=\frac{1}{k!} 
\int_a^bdx_1 \cdots\int_a^b dx_k T(A(x_1)\cdots A(x_k)).
\label{equ71}
\end{eqnarray}
%
We can rewrite the propagator as
%
\begin{equation}
U(b,a)=T \! \exp \int_a^b A(x)dx,
\label{equ72}
\end{equation}
%
with the following interpretation:

a) First use the series $\exp S=\sum_{k\ge0}\frac{1}{k!}S^k$ to expand 
$\exp\int_a^b A(x) dx$. 

b) Expand $S^k=(\int_a^b A(x)dx)^k$ as a multiple integral 
%
$$
\int_a^b dx_1\cdots\int_a^bdx_k~ A(x_1)\cdots A(x_k)\nonumber.
$$
% 

c) Treat $T$ as a linear operator commuting with series and integrals, 
hence 
%
\begin{eqnarray}
&&T \! \exp S=\sum_{k\ge 0}\frac{1}{k!}T(S^k)=\sum_{k\ge 0}\frac{1}{k!} 
T\{\int_a^bdx_1\cdots\int_a^bdx_k~ A(x_1)\cdots A(x_k)\} \nonumber \\ 
&&~~~~ ~~~~=\sum_{k\ge 0}\frac{1}{k!} \int_a^bdx_1\cdots\int_a^bdx_k 
~T(A(x_1)\cdots A(x_k)) 
\nonumber. 
\end{eqnarray}
% 

We give a few properties of the $T$ (or time ordered) exponential:

a) Parallel to the rule
%
\begin{equation}
\int_a^c A(x)dx=\int_a^b A(x)dx+\int_b^c A(x)dx~~(\mathrm{for}~a<b<c)
\label{equ73}
\end{equation}
% 
we get
%
\begin{equation}
T \! \exp \int_a^c A(x)dx=T \! \exp \int_b^c A(x)dx\cdot T \! \exp \int_a^b A(x)dx.
\label{equ74}
\end{equation}
% 
Notice that, in (\ref{equ73}), the two matrices
%
$$
L=\int_a^b A(x)dx,~~M=\int_b^c A(x)dx \nonumber
\label{equ74a}
$$
% 
do not commute, hence $\exp(L+M)$ is in general different from $\exp
L\cdot \exp 
M$. Hence formula (\ref{equ74}) is not in general valid for the ordinary 
exponential.

b) The next formula embodies the classical method of \lq\lq variation of 
constants" and is known in the modern literature as a \lq\lq gauge 
transformation". It reads as 
%
\begin{equation}
S(b)\cdot T \! \exp \int_a^b A(x)dx\cdot S(a)^{-1}=T \! \exp \int_a^b B(x) dx
\label{equ75}
\end{equation}
% 
with
%
\begin{equation}
B(x)=S(x)A(x)S(x)^{-1}+S'(x)S(x)^{-1},
\label{equ76}
\end{equation}
% 
where $S(x)$ is an invertible matrix depending on the variable $x$. The 
general formula (\ref{equ75}) can be obtained by \lq\lq taking a continuous 
reverse product" $\overleftarrow{\prod}_{a\le x\le b}$ over the 
infinitesimal form  
%
\begin{equation}
S(x+dx)(I+A(x)dx))S(x)^{-1}=I+B(x)dx
\label{equ77}
\end{equation}
% 
(for the proof, write $S(x+dx)=S(x)+S'(x)dx$ and neglect the terms 
proportional to $(dx)^2$). We leave it as an exercise to the reader to 
prove (\ref{equ75}) from the expansion (\ref{equ70}) for the propagator.

c) There exists a complicated formula for the $T$-exponential $T \! 
\exp\int_a^b A(x)\\dx$ when $A(x)$ is of the form $\frac{A_1(x)+A_2(x)}{2}$.
Neglecting terms of order $(dx)^2$, we get 
%
\begin{equation}
I+A(x)dx=\(I+A_2(x)\frac{dx}{2}\)\(I+A_1(x)\frac{dx}{2}\)
\label{equ78}
\end{equation}
% 
and we can then perform the product $\overleftarrow{\prod}_{a\le x\le b}$. 
This formula is the foundation of the \textbf{multistep method} in 
numerical analysis: starting from the value $y(x)$ at time $x$ of the 
solution to the equation $y'=Ay$, we split the infinitesimal interval 
$[x,x+dx]$ into two parts 
%
$$
I_1=[x,x+\frac{dx}{2}],~~I_2=[x+\frac{dx}{2},x+dx]\nonumber ;
\label{equ78a}
$$
% 
we move at speed $A_1(x)y(x)$ during $I_1$ and then at speed 
$A_2(x)y(x+\frac{dx}{2})$ during $I_2$. Let us just mention one corollary 
of this method, the so-called \textbf{Trotter-Kato-Nelson} formula: 
%
\begin{equation}
\exp(L+M)=\mathrm{lim}_{n\rightarrow \infty}(\exp(L/n)\exp(M/n))^n .
\label{equ79}
\end{equation}
% 

d) If the matrices $A(x)$ pairwise commute, the $T$-exponential of 
$\int_a^b A(x)dx$ is equal to the ordinary exponential. In the general 
case, the following formula holds 
%
\begin{equation}
T \! \exp\int_a^b A(x) dx=\exp V(b,a)
\label{equ80}
\end{equation}
%
where $V(b,a)$ is explicitly calculated using integration and iterated Lie 
brackets. Here are the first terms
\begin{eqnarray}
&&V(b,a)=\int_a^b A(x)dx+\frac{1}{2}\int 
\int_{\Delta_2}[A(x_2),A(x_1)]dx_1dx_2 \nonumber \\ && +\frac{1}{3}\int \int 
\int_{\Delta_3}[A(x_3),[A(x_2),A(x_1)]]dx_1dx_2dx_3\\
&&-\frac{1}{6}\int \int 
\int_{\Delta_3}[A(x_2),[A(x_3),A(x_1)]]dx_1dx_2dx_3+\cdots~~\nonumber. 
\label {equ80b}
\end{eqnarray}
%
The higher-order terms involve integrals of order $k\ge4$. As far as I can 
ascertain, this formula was first enunciated by K. Friedrichs around $1950$  
in his work on the foundations of Quantum Field Theory. A corollary is the 
\textbf{Campbell-Hausdorff formula}: 
%
\begin{eqnarray}
&&\exp L\cdot \exp M= \nonumber \\ 
&&\exp(L+M+\frac{1}{2}[L,M]+\frac{1}{12}[L,[L,M]]+\frac{1}{12}[M,[M,L]]+\cdots). 
\label{equ81}
\end{eqnarray}
%
It can be derived from (\ref{equ80}) by putting $a=0$, $b=2$, $A(x)=M$ for 
$0\le x\le 1$ and $A(x)=L$ for $1\le x\le 2$.

The $T$-exponential found lately numerous geometrical applications. If $C$ 
is a curve in a space of arbitrary dimension, the line integral $\int_C 
A_{\mu}(x)dx^{\mu}$ is well-defined and the corresponding $T$-exponential 
%
\begin{equation}
T \! \exp \int_C A_{\mu}(x)dx^{\mu}
\label{equ82}
\end{equation}
%
is closely related to the parallel transport along the curve $C$.

\section {Operational calculus}
\label{operational}
\setcounter{equation}{0}
\subsection{An algebraic digression: umbral calculus}
\label{algebraic}
We first consider the classical \textbf{Bernoulli numbers}. I claim that 
they are defined by the equation 
%
\begin{equation}
(B+1)^n=B^n~~\mathrm{for}~n\ge 2,
\label{equ1bis}
\end{equation}
%
together with the initial condition $B^0=1$. The meaning is the following: 
expand $(B+1)^n$ by the binomial theorem, then replace the power $B^k$ by 
$B_k$. Hence $(B+1)^2=B^2$ gives $B^2+2B^1+B^0=B^2$, that is after lowering  
the indices $B_2+2B_1+B_0=B_2$, that is $2B_1+B_0=0$. Treating 
$(B+1)^3=B^3$ in a similar fashion gives $3B_2+3B_1+B_0=0$. We write the 
first equations of this kind 
%
\begin{eqnarray}
&&n=2~~~~2B_1+B_0=0\nonumber \\ &&n=3~~~~3B_2+3B_1+B_0=0\nonumber \\ 
&&n=4~~~~4B_3+6B_2+4B_1+B_0=0 \nonumber \\ 
&&n=5~~~~5B_4+10B_3+10B_2+5B_1+B_0=0.\nonumber 
\end{eqnarray} 
%
Starting from $B_0=1$ we get successively
%
$$
B_1=-\frac{1}{2},~B_2=\frac{1}{6},~B_3=0,~B_4=-\frac{1}{30},\dots\nonumber
$$
%
Using the same kind of formalism, define the Bernoulli polynomials by
%
\begin{equation}
B_n(X)=(B+X)^n.
\label{equ2bis}
\end{equation}
%
According to the previous rule, we first expand $(B+X)^n$ using the 
binomial theorem, then replace $B^k$ by $B_k$. Hence we get explicitly 
%
\begin{equation}
B_n(X)=\sum_{k=0}^n \left ( \begin{array}{cc} n\\k  \end{array} \right) 
B_{n-k}X^k .
\label{equ3bis}
\end{equation}
%
Since $\frac{d}{dX}(X+c)^n=n(X+c)^{n-1}$ for any $c$ independent of $X$, we 
expect 
%
\begin{equation}
\frac{d}{dX}B_n(X)=nB_{n-1}(X).
\label{equ4bis}
\end{equation}
%
This is easy to check on the explicit definition (\ref{equ3bis}). Here is a 
similar calculation 
%
$$
(B+(X+Y))^n=((B+X)+Y)^n=\sum_{k=0}^n \left ( \begin{array}{cc} n\\k  
\end{array} \right)(B+X)^{n-k} Y^k \nonumber , 
$$
%
from which we expect to find
%
\begin{equation}
B_n(X+Y)=\sum_{k=0}^n \left ( \begin{array}{cc} n\\k  
\end{array} \right) B_{n-k} (X) Y^k .
\label{equ5bis}
\end{equation}
%
Indeed from (\ref{equ4bis}) we get
%
\begin{equation}
\(\frac{d}{dX}\)^k B_n(X)=\frac{n!}{(n-k)!}B_{n-k}(X)
\label{equ6bis}
\end{equation}
%
by induction on $k$, hence (\ref{equ5bis}) follows from Taylor's formula 
$B_n(X+Y)=\sum_{k\ge 0}\frac{1}{k!}(\frac{d}{dX})^k B_n(X) Y^k$.

\medskip

We deduce now a generating series for the Bernoulli numbers. Formally
%
\begin{eqnarray}
&&(e^S -1)e^{BS}=e^Se^{BS}-e^{BS}=e^{(B+1)S}-e^{BS}\nonumber \\ 
&&~~~~=\sum_{n\ge 0}\frac{1}{n!}S^n((B+1)^n-B^n)=S((B+1)^1-B^1)=S 
\nonumber. 
\end{eqnarray} 
%
Since $e^{BS}=\sum_{n\ge 0}\frac{1}{n!}B^nS^n$, we expect
%
\begin{equation}
\sum_{n\ge 0}B_n S^n/n!=\frac{S}{e^S-1}.
\label{equ7bis}
\end{equation}
%
Again this can be checked rigorously.

What is the secret behind these calculations?

We consider functions $F(B,X,\ldots)$ depending on a variable $B$ and other 
variables $X,\ldots$. Assume that $F(B,X,\ldots)$ can be expanded as a 
polynomial or power series in $B$, namely 
%
\begin{equation}
F(B,X,\ldots)=\sum_{n\ge 0}B^nF_n(X,\ldots).
\label{equ8bis}
\end{equation}
%
Then the \lq\lq mean value" with respect to $B$ is defined by
%
\begin{equation}
<F(B,X,\dots)>=\sum_{n\ge 0}B_nF_n(X,\dots),
\label{equ9bis}
\end{equation}
%
where the $B_n$'s are the Bernoulli numbers: this corresponds to the rule 
\lq\lq lower the index in $B^n$". If the function $F(B,X,\ldots)$ can be expanded into a series
$\sum_i F_i(B,X,\ldots)G_i(X,\ldots)$ where the $G_i$'s are independent of 
$B$, then obviously\footnote{So far we considered only identities linear in 
the $B_n$'s. If we want to treat nonlinear terms, like products
$B_m\cdot B_n$, 
we need to introduce two independent symbols $B$ and $B'$ and use the 
umbral rule to replace $B^m {B'}^n$ by $B_m B_n$. In probabilistic terms 
(see Section~\ref{probabilistic}), we introduce two independent random 
variables and take the mean value with respect to both simultaneously.} 
%
\begin{equation}
<F(B,X,\ldots)>=\sum_i <F_i(B,X,\ldots)>G_i(X,\ldots).
\label{equ10bis}
\end{equation}
%
The formal calculations given above are justified by this simple rule \textbf{which 
affords also a probabilistic interpretation }(see 
Section~\ref{probabilistic}). 

The previous method is loosely described as ``umbral calculus''. We insisted on speaking of ``mean 
values'' to 
keep in touch with physical applications. From a purely mathematical point of view, it is just applying a 
linear functional acting on polynomials in $B$, mapping $B^n$ into $B_n$ for all $n$'s.

\subsection{Binomial sequences of polynomials} 
\label{binomial}
These are sequences of polynomials $U_0(X),U_1(X),\dots$ in one variable $X$ 
satisfying the following relations: 

 a) $U_0(X)$ is a constant; 
 
 b) for any $n\ge 1$, one gets
%
\begin{equation}
\frac{d}{dX} U_n(X)=n U_{n-1}(X).
\label{equ11bis}
\end{equation}
%
By induction on $n$ it follows that $U_n(X)$ is of degree $\le n$. The 
binomial sequence is \textbf{normalized} if furthermore $U_0(X)=1$, in 
which case every $U_n(X)$ is a monic polynomial of degree $n$, that is 
%
$$
U_n(X)=X^n+c_1X^{n-1}+\ldots+c_n. \nonumber
\label{equ12bbis}
$$
%
Applying Taylor's formula as above (derivation   of formula 
(\ref{equ5bis})), one gets 
%
\begin{equation}
U_n(X+Y)=\sum_{k=0}^n \left ( \begin{array}{cc} n\\k  
\end{array} \right) U_{n-k}(X) Y^k.
\label{equ12bis}
\end{equation}
%
We introduce now a numerical sequence by $u_n=U_n(0)$ for $n\ge 0$. Putting 
$X=0$ in (\ref{equ12bis}) and then replacing $Y$ by $X$ (as a variable), we 
get 
%
\begin{equation}
U_n(X)=\sum_{k=0}^n \left ( \begin{array}{cc} n\\k  
\end{array} \right)u_{n-k}
X^k.
\label{equ13bis}
\end{equation}
%
Conversely, given any numerical sequence $u_0, u_1,\dots$, and defining the 
polynomials $U_n(X)$ by (\ref{equ13bis}), one derives immediately the 
relations  
%
\begin{equation}
\frac{d}{dX}U_n(X)=nU_{n-1}(X),~~U_n(0)=u_n.
\label{equ14bis}
\end{equation}
%

\textbf{The exponential generating series} for the constants $u_n$ is given by
%
\begin{equation}
u(S)=\sum_{n\ge 0}u_n S^n/n! .
\label{equ15bis}
\end{equation}
%
From (\ref{equ13bis}), one obtains the exponential generating series 
%
$$ 
U(X,S)=\sum_{n\ge0}U_n(X)S^n/n!\nonumber
$$
%
for the polynomials $U_n(X)$, namely in the form 
 %
\begin{equation}
U(X,S)=u(S)e^{XS}.
\label{equ16bis}
\end{equation}
%
This could be expected. Writing $\partial_X,~\partial_S\ldots$ for the 
partial derivatives, the basic relation $\partial_XU_n=nU_{n-1}$ translates  
as $(\partial_X-S)U(X,S)=0$ or equivalently as
%
\begin{equation}
\partial_X(e^{-XS}U(X,S))=0.
\label{equ17bis}
\end{equation}
%
Hence $e^{-XS}U(X,S)$ depends only on $S$, and putting $X=0$ we obtain the 
value $U(0,S)=u(S)$.

The umbral calculus can be successfully applied to our case. Hence $U_n(X)$ 
can be interpreted as $\langle(X+U)^n\rangle$ provided $\langle U^n \rangle 
=u_n$. Similarly $u(S)$ is equal to $\langle e^{US} \rangle$ and $U(X,S)$ to 
$\langle e^{(X+U)S}\rangle$. The symbolic derivation of (\ref{equ16bis}) is 
as follows 
%
$$
U(X,S)=\langle e^{(X+U)S}\rangle=\langle e^{XS}\cdot e^{US}\rangle = 
e^{XS}\langle e^{US}\rangle=e^{XS}u(S).\nonumber 
\label{equ17bbis}
$$
%

We describe in more detail the three basic binomial sequences of 
polynomials: 

a) The sequence $I_n(X)=X^n$ satisfies obviously (\ref{equ11bis}).  In this 
(rather trivial) case, we get 
%
$$
i_0=1,~i_1=i_2=\ldots=0,~I(S)=1,~I(X,S)=e^{XS}. \nonumber
\label{equ17cbis}
$$
% 

b) \textbf{The Bernoulli polynomials} obey the rule (\ref{equ11bis}) (see 
formula (\ref{equ4bis})). I claim that they are characterized by the normalization $B_0 (X) = 1$ and the 
further property 
%
\begin{equation}
\int_0^1 B_n(x) dx=0~~\mathrm{for}~n\ge1.
\label{equ18bis}
\end{equation}
% 
Indeed, introducing the exponential generating series
%
\begin{equation}
B(X,S)=\sum_{n\ge0}B_n(X)S^n/n!,
\label{equ19bis}
\end{equation}
% 
the requirement (\ref{equ18bis}) is equivalent to the integral formula
%
\begin{equation}
\int_0^1 B(x,S)dx=1.
\label{equ20bis}
\end{equation}
% 
According to the general theory of binomial sequences, $B(X,S)$ is of the 
form $b(S)e^{XS}$, hence 
%
$$
\int_0^1 B(x,S)dx=\int_0^1 b(S)e^{xS}dx=b(S)\(\frac{e^S-1}{S}\).\nonumber
\label{equ20bbis}
$$
% 
Solving (\ref{equ20bis}) we get $b(S)=S/(e^S-1)$ and from (\ref{equ7bis}) 
this is the exponential generating series for the Bernoulli numbers. The 
exponential generating series for the Bernoulli polynomials is therefore 
%
\begin{equation}
B(X,S)=\frac{Se^{XS}}{e^S-1}.
\label{equ21bis}
\end{equation}
% 
Here is a short table: 
%
\begin {eqnarray}
&&B_0(X)=1\nonumber \\ &&B_1(X)=X-\frac{1}{2}\nonumber 
\\&&B_2(X)=X^2-X+\frac{1}{6}\nonumber \\&& B_3(X)=X^3-\frac{3}{2}X^2+\frac{1}{2}X.\nonumber 
\end{eqnarray}
%

c) We come to the \textbf{Hermite polynomials} which form the normalized 
binomial sequence of polynomials characterized by  
%
\begin{equation}
\int_{-\infty}^{+\infty}H_n(x)d{\gamma}(x)=0~~\mathrm{for}~n\ge1,
\label{equ22bis}
\end{equation}
% 
where $d\gamma (x)$ denotes the normal probability law, that is
%
\begin{equation}
d{\gamma}(x)=(2\pi)^{-1/2}e^{-x^2/2}dx.
\label{equ23bis}
\end{equation}
% 
We follow the same procedure as for the Bernoulli polynomials. Hence for 
the exponential generating series 
%
\begin{equation}
H(X,S)=\sum_{n\ge0}H_n(X)S^n/n!=h(S)e^{XS}
\label{equ24bis}
\end{equation}
% 
we get
%
\begin{equation}
\int_{-\infty}^{+\infty}H(x,S)d{\gamma}(x)=1,
\label{equ25bis}
\end{equation}
% 
that is
%
\begin{equation}
1/h(S)=\int_{-\infty}^{+\infty}e^{xS}d{\gamma}(x).
\label{equ26bis}
\end{equation}
% 
The last integral being easily evaluated, we conclude
%
\begin{equation}
h(S)=e^{-S^2/2}.
\label{equ27bis}
\end{equation}
% 
From this relation, we can evaluate $H(X,S)$ namely
%
\begin{equation}
H(X,S)=e^{XS-S^2/2}=e^{X^2/2}e^{-(X-S)^2/2}
\label{equ28bis}
\end{equation}
% 
and using Taylor's expansion for $e^{-(X-S)^2/2}$, we get
%
\begin{equation}
H_n(X)=(-1)^n e^{X^2/2}\(\frac{d}{dX}\)^n e^{-X^2/2}.
\label{equ29bis}
\end{equation}
% 
In the spirit of operator calculus, use the identity
%
\begin{equation}
e^{X^2/2}\frac{d}{dX}e^{-X^2/2}=\frac{d}{dX}-X,
\label{equ30bis}
\end{equation}
% 
whence
%
\begin{equation}
H_n(X)=\(X-\frac{d}{dX}\)^n.1.
\label{equ31bis}
\end{equation}
% 
This is tantamount to a recursion formula
%
\begin{equation}
H_{n+1}(X)=XH_n(X)-\frac{d}{dX}H_n(X)=XH_n(X)-nH_{n-1}(X).
\label{equ32bis}
\end{equation}
% 
The following table is then easily derived: 
%
\begin{eqnarray} &&H_0(X)=1\nonumber 
\\ &&H_1(X)=X \nonumber \\ &&H_2(X)=X^2-1 \nonumber \\ && H_3(X)=X^3-3X 
\nonumber \\&&H_4(X)=X^4-6X^2+3.\nonumber 
\end{eqnarray} 
%
\subsection{Transformation of polynomials}
\label{transformation}
We use the standard notation $\mathbf C [X]$ to denote the vector space of 
polynomials in the variable $X$ with complex coefficients. Since the 
monomials $X^n$ form a basis of $\mathbf C [X]$, a linear operator 
$\mathbf{U}:~\mathbf C [X]\rightarrow \mathbf C [X]$ is completely 
determined  by the sequence of polynomials $U_n(X)$ defined as the image 
$\mathbf{U}[X^n]$ of $X^n$ under $\mathbf{U}$. Here are a few examples: 
%
\begin{eqnarray}
&&\mathbf{I}~~\mathrm{identity~ operator}~~~~~~~~~~~~I_n(X)=X^n \nonumber 
\\ && 
\mathbf{D}~~\mathrm{derivation} ~~~\frac{d}{dX}~~~~~~~~~~~~D_n(X)=nX^{n-1}\nonumber \\
&& 
\mathbf{T}_c~~\mathrm{translation~operator}~~~~~~T_{c,n}(X)=(X+c)^n.\nonumber 
\end{eqnarray}
%
Notice that in general $\mathbf{T}_c$ transforms a polynomial $P(X)$ into 
$P(X+c)$ and Taylor's formula amounts to 
%
\begin{equation}
\mathbf{T}_c=e^{c\mathbf{D}};
\label{equ33bis}
\end{equation}
% 
furthermore $\mathbf{T}_0=\mathbf{I}$. From the definition of the 
derivative, one gets 
%
\begin{equation}
\mathbf{D}=
\lim_{c\rightarrow0}
(\mathbf{T}_c-\mathbf{I})/c.
\label{equ34bis}
\end{equation}
%
 
We can reformulate the properties of binomial sequences:

- the definition $\mathbf{D}U_n(X)=nU_{n-1}(X)$ amounts to $\mathbf{D}\mathbf{U}=\mathbf{U}\mathbf{D}$;

- the exponential generating series $U(X,S)$ is nothing else than $\mathbf{U}[e^{XS}]$;

- formula (\ref{equ12bis}), after substituting $c$ to $Y$ reads as 
%
$$
U_n(X+c)=\sum_{k=0}^n \left ( 
\begin{array}{cc}n\\k\end{array}\right)U_{n-k}(X)c^k \nonumber 
\label{equ34bbis}
$$
% 
that is
%
$$
\mathbf{T}_c\mathbf{U}[X^n]=\sum_{k=0}^n  \left ( 
\begin{array}{cc}n\\k\end{array}\right) \mathbf{U}[X^{n-k}]c^k
=\mathbf{U}[(X+c)^n]=\mathbf{U}\mathbf{T}_c[X^n]\nonumber.
\label{equ34cbis}
$$
%
Hence this formula expresses that $\mathbf{U}$ commutes to $\mathbf{T}_c$
%
\begin{equation}
\mathbf{T}_c\mathbf{U}=\mathbf{U}\mathbf{T}_c;
\label{equ35bis}
\end{equation}
%

- formula (\ref{equ13bis}) can be rewritten as 
%
\begin{equation}
\mathbf{U}[X^n]=\sum_{k\ge 0}\frac{1}{k!}u_k\mathbf{D}^k [X^n].
\label{equ36bis}
\end{equation}
% 
From the definition (\ref{equ15bis}) of the exponential generating series, 
we obtain 
% 
\begin{equation}
\mathbf{U}=u(\mathbf{D}).
\label{equ37bis}
\end{equation}
% 

To sum up, our operators are characterized by the following equivalent 
properties:

a) $\mathbf{U}$ commutes with the derivative $\mathbf{D}$;

b) $\mathbf{U}$ commutes with the translation operators $\mathbf{T}_c$;

c) $\mathbf{U}$ can be expressed as a power series $u(\mathbf{D})$  in 
$\mathbf{D}$.\\ Furthermore, since $\mathbf{D}$ acts on $e^{XS}$ by 
multiplication by $S$, the operator $\mathbf{U}=u(\mathbf{D})$ multiplies $e^{XS}$ 
by $u(S)$. Hence we recover formula (\ref{equ16bis}). 

\subsection{Expansion formulas} 
\label{expansion}
As we saw before, $B_n(X)$ and $H_n(X)$ are monic polynomials and therefore 
the sequences $(B_n(X))_{n\ge 0}$ and $(H_n(X))_{n\ge 0}$ are two bases of 
the vector space $\mathbf{C}[X]$. Hence an arbitrary polynomial $P(X)$ can 
be expanded as a linear combination of the Bernoulli polynomials, as well 
as of the Hermite polynomials. Our aim is to give explicit formulas.
 
Consider a general binomial sequence $(U_n(X))_{n\ge 0}$ such that $u_0 
\neq 0$, with exponential generating series $U(X,S)=u(S)e^{XS}$. Introduce the inverse
series $v(S)=1/u(S)$; explicitly
% 
$$
v(S)=\sum_{n\ge 0}v_nS^n/n!,\nonumber
\label{equ38bbis}
$$
%  
and the coefficients $v_n$ are defined inductively by
% 
\begin{equation}
v_0=1/u_0,~v_n=-\frac{1}{u_0}\sum_{k=1}^n \left ( 
\begin{array}{cc}n\\k\end{array}\right) u_kv_{n-k}.
\label{equ38bis}
\end{equation}
% 
In the spirit of umbral calculus, let us define the linear form $\phi_0$ on 
$\mathbf{C}[X]$ by $\phi_0[X^n]=v_n$. I claim that \textbf{the expansion 
of an arbitrary polynomial in terms of the $U_n$'s  is given by}
% 
\begin{equation}
P(X)=\sum_{n\ge 0}\frac{1}{n!}\phi_0[\mathbf{D}^n P]\cdot U_n(X).
\label{equ39bis}
\end{equation}
% 

Before giving a proof, let us examine the three basic examples:

a) If $U_n(X)=X^n$, then $u(S)=1$, hence $v(S)=1$. That is $v_0=1$ and 
$v_n=0$ for $n\ge1$. The linear form $\phi_0$ is given by $\phi_0[P]=P(0)$ 
and formula (\ref{equ39bis}) reduces to MacLaurin's expansion 
% 
\begin{equation}
P(X)=\sum_{n\ge0}\frac{1}{n!} (\mathbf{D}^nP) (0)\cdot X^n.
\label{equ40bis}
\end{equation}
% 

b) For the Bernoulli polynomials we know that the series $1/b(S)$ is equal to
$(e^S-1)/S$, hence $v_n=\frac{1}{n+1}$. The linear form $\phi_0$ is defined  
by $\phi_0[X^n]=\frac{1}{n+1}$, that is 
% 
\begin{equation}
\phi_0[P]=\int_0^1P(x)dx.
\label{equ41bis}
\end{equation}
% 
Hence
% 
\begin{equation}
P(X)=\sum_{n\ge 0}\frac{1}{n!} \left(\int_0^1\mathbf{D}^n P(x)dx\right)
\cdot B_n(X).
\label{equ42bis}
\end{equation}
% 

c) In the case of Hermite polynomials, we know that $1/h(S)$ is equal to 
$e^{S^2/2}$, hence 
% 
\begin{equation}
v_{2m}=\frac{(2m)!}{m!2^m},~~v_{2m+1}=0.
\label{equ43bis}
\end{equation}
% 
According to (\ref{equ26bis}), we get 
$v_n=\int_{-\infty}^{+\infty}x^nd{\gamma}(x)$, hence 
% 
\begin{equation}
\phi_0[P] = \int_{-\infty}^{+\infty} P(x) d\gamma 
(x)=(2\pi)^{-1/2}\int_{-\infty}^{+\infty}P(x)e^{-x^2/2}dx.
\label{equ44bis}
\end{equation}
% 

In these three cases, the formula for $\phi_0[P]$ takes a similar form, 
namely 
% 
\begin{equation}
\phi_0[P]=\int_a^b P(x)w(x)dx
\label{equ45bis}
\end{equation}
% 
with the following prescriptions: 

$a=-\infty,~b=+\infty,~~w(x)=\delta(x)$ in case a), 

$a=0,~b=1,~~w(x)=1$ in case b),

$a=-\infty,~b=+\infty,~~w(x)=(2\pi)^{-1/2}e^{-x^2/2}$ in case c). \\The 
normalization $\phi_0[1]=1$ amounts to $\int_a^bw(x)dx=1$, that is $w(x)$ 
is the probability density of a random variable taking values in the 
interval $[a,b]$ (see Section~\ref{probabilistic}). 

There is a peculiarity in case c). Namely, according to the general 
formula (\ref{equ39bis}), an arbitrary polynomial $P(X)$ can be expanded in 
a series of Hermite polynomials $\sum_{n\ge 0}c_nH_n(X)$ where $c_n$ is 
equal  to $\frac{1}{n!}\int_{-\infty}^{+\infty}\mathbf{D}^nP(x) d{\gamma}(x)$. 
Integrating by parts and taking into account the definition 
(\ref{equ29bis}) of $H_n(X)$, we obtain 
% 
\begin{equation}
c_n= \frac{1}{n!} \int_{-\infty}^{+\infty}P(x)H_n(x)d{\gamma}(x).
\label{equ46bis}
\end{equation}
% 
This amounts to the \textbf{orthogonality relation}
% 
\begin{equation}
\int_{-\infty}^{+\infty}H_m(x)H_n(x)d{\gamma}(x)=\delta_{mn}n!
\label{equ47bis}
\end{equation}
% 
for the Hermite polynomials. There is no such orthogonality relation for 
the Bernoulli polynomials.

One final word about the proof of (\ref{equ39bis}). By linearity, it 
suffices to consider the case $P=U_m$, that is to prove the 
\textbf{biorthogonality relation} 
% 
\begin{equation}
\phi_0[\mathbf{D}^nU_m]=n!\delta_{mn}.
\label{equ48bis}
\end{equation}
% 
We first calculate $\phi_0[U_m]$. From formula (\ref{equ13bis}), we obtain
% 
$$
\phi_0[U_m]=\sum_{k=0}^m \left ( 
\begin{array}{cc}m\\k\end{array}\right)u_{m-k}\phi_0[X^k]=\sum_{k=0}^m \left ( 
\begin{array}{cc}m\\k\end{array}\right)u_{m-k}v_k ,\nonumber
\label{equ48bbis}
$$
% 
and from (\ref{equ38bis}), $\phi_0[U_m]$ is $0$ for $m\ge 1$. Since 
$\mathbf{D}^n U_m$ is proportional to $U_{m-n}$ according to the basic 
formula $\mathbf{D}U_m=mU_{m-1}$, one gets $\phi_0[\mathbf{D}^n U_m]=0$ for 
$m \neq n $. Finally $\mathbf{D}^mU_m=m! u_0$, hence $\phi_0[\mathbf{D}^m 
U_m] = v_0 m! u_0 =m!$. 
\subsection{Signal transforms}
\label{signal}
 A transmission device transforms a suitable \textbf{input} $f$ into an 
 \textbf{output} F. Both are evolving in time and are represented by 
 functions of time $f(t)$ and $F(t)$.\footnote{For simplicity, we restrict 
 to the case where input and output are scalars and not vectors.} We 
 assume the device to be \textbf{linear} and in a \textbf{stationary} 
 regime, that is there is a linear operator $\textbf{V}$ taking $f(t)$ into  
 $F(t)$ (linearity) and $f(t+\tau)$ into $F(t+\tau)$ for any fixed $\tau$ 
 (stationarity). 
 
 Here are the main types of response: 
 %
 \begin{eqnarray} 
 &&~~~~\underline{Input}~~~~~~~~\underline{Output}\nonumber\\ && 
 ~~~~\delta(t)~~~~~~~~~~ I(t)\nonumber\\ &&~~~~e^{pt}~~~~~~~~ 
 ~~~\Theta(t)e^{pt}\nonumber\\ &&~~~~t^n~~~~~~~~~~~~V_n(t)\nonumber 
 \end{eqnarray} 
 %
 In the first case, $\delta(t)$ is a Dirac singular function, that is a  
 \textbf{pulse}, and $I(t)$ is the \textbf{impulse response}. By 
 stationarity, $\mathbf{V}$ transforms $\delta(t-\tau)$ into $I(t-\tau)$; an 
 arbitrary input $f$ can be represented as a superposition of pulses 
% 
\begin{equation}
f(t)=\int_{-\infty}^{+\infty}f(\tau)\delta(t-\tau)d\tau
\label{equ49bis}
\end{equation}
% 
hence by linearity the output
% 
\begin{equation}
F(t)=\int_{-\infty}^{+\infty}f(\tau)I(t-\tau)d\tau=\int_{-\infty}^{+\infty}f(t-\tau)I(\tau)d\tau. 
\label{equ50bis}
\end{equation}
% 
In the \textbf{non-anticipating case}, $I(t)$ is zero before the pulse 
$\delta(t)$ occurs, that is $I(t)=0$ for $t<0$. In this case the output is 
given by 
% 
\begin{equation}
F(t)=\int_{-\infty}^t f(\tau)I(t-\tau)d\tau. 
\label{equ51bis}
\end{equation}
% 

In the case of the exponential input $f(t)=e^{pt}$, the output is equal to 
$\int_{-\infty}^{+\infty}e^{p\tau}I(t-\tau)d\tau 
=\int_{-\infty}^{+\infty}e^{p(t-\tau)}I(\tau)d\tau $ according to (\ref{equ50bis}),
 that is, to $\Theta(p)e^{pt}$ with the \textbf{spectral gain}
% 
\begin{equation}
\Theta(p)=\int_{-\infty}^{+\infty}e^{-p\tau}I(\tau)d\tau.
\label{equ52bis}
\end{equation}
% 
We can give an \textbf{a priori} argument: $f_p(t)=e^{pt}$ is a solution of 
the differential equation $\mathbf{D}f_p=pf_p$; since the operator 
$\mathbf{V}$ is stationary, that is, commutes with the translation operators 
$\mathbf{T}_c$, it commutes with 
$\mathbf{D}=\lim_{c\rightarrow0}(\mathbf{T}_c-\mathbf{I})/c$. Hence the 
output $F_p$ corresponding to the input $f_p$ is a solution of the 
differential equation $\mathbf{D}F_p=pF_p$, hence is proportional to 
$e^{pt}$. 

In a similar way, the monomials $t^n$ satisfy the cascade of differential 
equations 
% 
$$
\mathbf{D}[t]=1,~~\mathbf{D}[t^2]=2t,~~\mathbf{D}[t^3]=3t^2,\ldots \nonumber
\label{equ52bbis}
$$
% 
Since $\mathbf{V}$ commutes with $\mathbf{D}$ and the constants are the 
solutions of the differential equation $\mathbf{D}(f)=0$, it follows that 
the images  $V_n(t)=\mathbf{V}[t^n]$ form a binomial sequence of 
polynomials. Explicitly 
% 
$$
V_n(t)=\int_{-\infty}^{+\infty}(t-\tau)^nI(\tau)d\tau=\sum_{k=0}^n \left ( 
\begin{array}{cc}n\\k\end{array}\right)v_{n-k}t^k \nonumber
\label{equ52cbis}
$$
% 
with the constants
% 
\begin{equation}
v_n=(-1)^n\int_{-\infty}^{+\infty}I(\tau)\tau^nd\tau=V_n(0).
\label{equ53bis}
\end{equation}
% 
Comparing (\ref{equ52bis}) to (\ref{equ53bis}) we conclude
% 
\begin{equation}
\Theta(p)=\sum_{n\ge 0}v_np^n/n!.
\label{equ54bis}
\end{equation}
% 
More generally, since $e^{pt}$ is equal to $\sum_{n\ge 
0}\frac{1}{n!}p^nt^n$, application of the linear operator $\mathbf{V}$ 
gives 
% 
\begin{equation}
\mathbf{V}[e^{pt}]=\sum_{n\ge 0}\frac{1}{n!}p^nV[t^n],
\label{equ55bis}
\end{equation}
% 
that is
% 
\begin{equation}
\Theta(p)e^{pt}=\sum_{n\ge 0}\frac{1}{n!}p^n V_n(t).
\label{equ56bis}
\end{equation}
% 
Up to the change in notation ($p$ for $S$, and $t$ for $X$), the spectral 
gain $\Theta(p)$ is nothing else than the numerical exponential generating 
series associated to the binomial sequence $(V_n(t))_{n\ge 0}$. 

Comparing with the results obtained in Section~\ref{transformation}, it is 
tempting to write the operator $\mathbf{V}$ as $\Theta(\mathbf{D})$. 
According to (\ref{equ54bis}), $\Theta(\mathbf{D})$ can be interpreted as 
$\sum_{n\ge 0}v_n\mathbf{D}^n/n!$, but it is known that infinite order 
differential operators are not so easily dealt with. A better 
interpretation is obtained  via Laplace or Fourier transform. Indeed since 
$\mathbf{D}$ multiplies $e^{pt}$ by $p$, any function $F(\mathbf{D})$ ought 
to multiply $e^{pt}$ by  $F(p)$, and the rule 
$\mathbf{V}[e^{pt}]=\Theta(p)e^{pt}$ is in agreement with the 
interpretation $\mathbf{V}=\Theta (\mathbf{D})$. If the input can be 
represented as a \textbf{Laplace transform} 
% 
\begin{equation}
f(t)=\int e^{pt}\phi(p)dp,
\label{equ57bis}
\end{equation}
% 
then $\mathbf{V}=\Theta (\mathbf{D})$ transforms it into the output 
% 
\begin{equation}
F(t)=\int e^{pt}\Theta(p)\phi(p)dp.
\label{equ58bis}
\end{equation}
% 
Similarly, if the input $f(t)$ is given by its \textbf{spectral resolution} 
(or Fourier transform)
% 
\begin{equation}
f(t)=\int_{-\infty}^{+\infty}\hat{f}(\omega)e^{i \omega t}d\omega, 
\label{equ59bis}
\end{equation}
%
then the output is given by 
% 
\begin{equation}
F(t)=\int_{-\infty}^{+\infty}\Theta(i\omega)\hat{f}(\omega)e^{i \omega 
t}d\omega. 
\label{equ60bis}
\end{equation}
%

This is \textbf{Heaviside's magic trilogy}:

~~~~~~~~~~~~~\textbf{symbolic}~~~~~~$p$

~~~~~~~~~~~~~\textbf{operator}~~~~~~$\mathbf{D}$

~~~~~~~~~~~~~\textbf{spectral}~~~~~~$i\omega=2\pi i \nu$~~~~($\nu$ 
frequency, $\omega=2\pi\nu$ pulsation)
% 
$$
\Theta(p)\longleftrightarrow\Theta(\mathbf{D})\longleftrightarrow \Theta(2\pi i\nu).\nonumber
\label{equ60bbis}
$$
%
Recall that in the Laplace transform, $p$ is a complex variable, while in 
the Fourier transform $\omega$ and $\nu$ are real variables (see example in   
the next section).

\subsection{The inverse problem}
\label{inverse}
This is the problem of recovering the input, knowing the output. In 
operator terms, we have to compute the inverse $\mathbf{U}$ of the operator 
$\mathbf{V}$ (if it exists!). Since $\mathbf{V}$ is stationary, so is 
$\mathbf{U}$, and at the level of polynomial inputs and outputs, 
$\mathbf{U}$ corresponds to a binomial sequence of polynomials $U_n(t)$:

\[\left\{\begin{array}{ll}
\mbox{input}&U_n(t)\\ \mbox{output}&t^n.
\end{array}\right.\] 

Together with the numerical sequence $v_n=V_n(0)$, we have to consider the 
numerical sequence $u_n=U_n(0)$. Introducing the exponential generating 
series 
% 
\begin{equation}
u(S)=\sum_{n\ge0}\frac{1}{n!}u_nS^n,~v(S)=\sum_{n\ge0}\frac{1}{n!}v_nS^n=\Theta(S), 
\label{equ61bis}
\end{equation}
%
we can write $\mathbf{U}=u(\mathbf{D})$ and $\mathbf{V}=v(\mathbf{D})$ at 
least when acting on polynomials. Since $\mathbf{U}$ and $\mathbf{V}$ are 
inverse operators, we expect the relation $u(S)v(S)=1$, equivalent to the 
chain of relations 
% 
\begin{equation}
u_0v_0=1,~\sum_{k=0}^n \left ( 
\begin{array}{cc}n\\k\end{array}\right)u_kv_{n-k}=0~~\mathrm{for}~n\ge1,
\label{equ62bis}
\end{equation}
%
to hold. Indeed, this is easily checked (see Section~\ref{expansion}, 
formula (\ref{equ38bis})). 

Since the input $U_n(t)$ corresponds to the output $t^n$, a 
Taylor-MacLaurin expansion of the output corresponds to an expansion of 
the input in terms of the $U_n(t)$. Fix an epoch $t_0$ and use the Taylor 
expansion of the output 
% 
\begin{equation}
F(t)=\sum_{n\ge0}\frac{1}{n!} \mathbf{D}^n F(t_0)(t-t_0)^n.
\label{equ63bis}
\end{equation}
%
Applying the operator $\mathbf{U}$, we get
% 
\begin{equation}
f(t)=\sum_{n\ge0}\frac{1}{n!} \mathbf{D}^n F(t_0)\cdot U_n(t-t_0),
\label{equ64bis}
\end{equation}
%
since $\mathbf{U}$ transforms $(t-t_0)^n$ into $U_n(t-t_0)$ by 
stationarity. The reader is invited to compare this formula to formula 
(\ref{equ39bis}). 

We give one illustrating example. Let the output be a moving average of the 
input
% 
\begin{equation}
F(t)=\int_{t-1}^t f(s) ds,
\label{equ65bis}
\end{equation}
%
corresponding to the impulse response shown in Figure~3.

%
\begin{figure}
\centering 
\includegraphics[width=0.8\textwidth]{cartier3.pdf} 
\caption{The impulse response corresponding to (\ref{equ65bis}) }\label{figure3} 
\end{figure} 
%
The spectral gain is
% 
\begin{equation}
\Theta(p)=\int_0^1 e^{-p\tau}d\tau=\frac{1-e^{-p}}{p}.
\label{equ66bis}
\end{equation}
%
The inverse series $u(p)=1/\Theta(p)$ is given by
% 
\begin{equation}
u(p)=\frac{p}{1-e^{-p}}.
\label{equ67bis}
\end{equation}
%
Since $u(-p)=\frac{p}{e^p-1}$ is the exponential generating series of the 
Bernoulli numbers, the polynomials $U_n(t)$ are easily identified,
% 
\begin{equation}
U_n(t)=(-1)^n B_n(-t)=B_n(t)+nt^{n-1}=B_n(t+1).
\label{equ68bis}
\end{equation}
%
Notice that $\Theta(p)$ vanishes for $p\neq 0$ of the form $p=2\pi in$ with 
an integer $n$; equivalently, the inverse function $u(p)$ has poles for 
$p\neq 0,~p=2\pi i n$. Hence not every output is admissible since 
(\ref{equ65bis}) entails $\sum_n F(t+n)=\int_{-\infty}^{+\infty}f(s)ds$. 
That is, an output satisfies the necessary (and sufficient) condition 
% 
\begin{equation}
\sum_n F(t+n)=c~~(c \ \hbox{a constant)}
\label{equ69bis}
\end{equation}
%
and the input $f(t)$ can be reconstructed from the output $F(t)$ up to the 
addition of a function $f_0(t)$ with 
% 
\begin{equation}
f_0(t)=f_0(t+1),~\int_0^1f_0(t)dt=0.
\label{equ70bis}
\end{equation}
%
\textbf{Exercise} a) Derive from (\ref{equ64bis}) the relation
% 
\begin{equation}
f(t)=\sum_{n\ge 0}\frac{1}{n!}u_n\mathbf{D}^nF(t),
\label{equ71bis}
\end{equation}
%
for a general transmission device, where 
$1/\Theta(p)=\sum_{n\ge0}u_np^n/n!$.

b) In the particular case (\ref{equ65bis}), one gets $u_n=B_n(1)$, hence 
$u_n=B_n$ if $n\ge 2$, and $u_0=1$, $u_1=\frac{1}{2}$.

c) Deduce from relation (\ref{equ7bis}) that $B_n=0$ for $n\ge 3$, $n$ odd.

d) Derive the Euler-MacLaurin summation formula 
% 
\begin{eqnarray}
&&\frac{1}{2}(f(t)+f(t-1))\nonumber \\ &&=\int_{t-1}^t 
f(s)ds+\sum_{m\ge1}\frac{1}{(2m)!}B_{2m}[\mathbf{D}^{2m-1}f(t)-\mathbf{D}^{2m-1}f(t-1)]. 
\label{equ72bis}
\end{eqnarray}
%
\subsection{A probabilistic application}
\label{probabilistic}
We consider a random variable $\xi$. In general, we denote by $\langle X 
\rangle$ the mean value of a random variable $X$. We want to define a 
probabilistic version of the so-called Wick Powers in Quantum Field Theory. 
 
 The goal is to associate to $\xi$ a sequence of random variables $:\xi^n:$ 
 such that 
 
 a) \textbf{the mean value of} $:\xi^n:$ \textbf{is} 0 for $n\ge1$; 
 
 b) \textbf{there exists a normalized binomial sequence of polynomials} 
 $\Pi_n(X)$ \textbf{such that} $:\xi^n:=\Pi_n(\xi)$.
 
 Let $w(x)$ be the probability density associated to $\xi$, hence $w(x)\ge 
 0$ and $\int_{-\infty}^{+\infty}w(x)dx=1$. Moreover, for any (non random) 
 function $f(x)$ of a real variable $x$, the random variable $f(\xi)$ has a 
 mean value given by 
% 
\begin{equation}
\langle f(\xi)\rangle=\int_{-\infty}^{+\infty}f(x)w(x)dx.
\label{equ73bis}
\end{equation}
%
Hence the conditions a) and b) amount to
% 
\begin{equation}
0=\langle \Pi_n(\xi)\rangle=\int_{-\infty}^{+\infty}\Pi_n(x)w(x)dx 
~~\mathrm{for}~ n\ge1. 
\label{equ74bis}
\end{equation}
%
Using the same method as in Section~\ref{binomial}, we introduce the 
exponential generating series $\Pi(X,S)=\sum_{n\ge 0}\Pi_n(X)S^n/n!$, hence  
the relation (\ref{equ74bis}) translates into 
% 
\begin{equation}
\int_{-\infty}^{+\infty}\Pi(x,S)w(x)dx=1.
\label{equ75bis}
\end{equation}
%
Putting $\pi(S)=\Pi(0,S)$, hence $\Pi(X,S)=\pi(S)e^{XS}$, we derive
% 
\begin{equation}
1/\pi(S)=\int_{-\infty}^{+\infty}e^{xS}w(x)dx.
\label{equ76bis}
\end{equation}
%

We translate these relations into probabilistic jargon: replace $S$ by $p$ 
and $X$ by $\xi$ to get 
% 
\begin{equation}
1/\pi(p)=\langle e^{p\xi}\rangle
\label{equ77bis}
\end{equation}
%
% 
\begin{equation}
\Pi(\xi,p)=\sum_{n\ge 0}\frac{1}{n!}p^n:\xi^n:
\label{equ78bis}
\end{equation}
%
% 
\begin{equation}
\Pi(\xi,p)=\pi(p)e^{p\xi}.
\label{equ79bis}
\end{equation}
%
Extending the definition of $:~ :$ by linearity to 
$e^{p\xi}=\sum_{n\ge0}\frac{1}{n!}p^n\xi^n$, we rewrite (\ref{equ78bis})  
as $\Pi(\xi,p)=:e^{p\xi}:$. \textbf{Here is the conclusion} 
% 
\begin{equation}
:e^{p\xi}:~~=~~\frac{e^{p\xi}}{\langle e^{p\xi}\rangle}.
\label{equ80bis}
\end{equation}
%

Let us specialize our results in the case of the binomial sequences 
considered so far: 

a) If $\xi=0$, then $\langle e^{p\xi}\rangle=1$, hence 
$:e^{p\xi}:=e^{p\xi}=1$. That is $:\xi^n:=0$ for $n\ge1$.

b) Suppose that $\xi$ is \textbf{uniformly distributed} in the interval 
$[0,1]$, that is $w(x)=1$ if $0\le x\le 1$, and $w(x)=0$ otherwise. Then  
% 
\begin{equation}
\langle e^{p\xi}\rangle=\int_0^1 e^{px}dx=\frac{e^p-1}{p}.
\label{equ81bis}
\end{equation}
%
We get
% 
\begin{equation}
\sum_{n\ge 0}\frac{1}{n!}p^n :\xi^n:~~=~~:e^{p\xi}:~~=~~\frac{pe^{p\xi}}{e^p-1},
\label{equ82bis}
\end{equation}
%
that is
% 
\begin{equation}
:\xi^n:~~=~~B_n(\xi)
\label{equ83bis}
\end{equation}
%
where $B_n(X)$ is the Bernoulli polynomial of degree $n$. In particular
\begin{eqnarray}
&&:\xi:~~=~~\xi-\langle \xi \rangle= \xi-\frac{1}{2} \nonumber \\ && 
:\xi^2:~~=~~\xi^2-\xi+\frac{1}{6}\nonumber\\&& :\xi^3:~~=~~\xi^3-\frac{3}{2}\xi^2+\frac{1}{2}\xi,  
~~\mathrm{etc}\ldots\nonumber
\end{eqnarray}

c) Assume now that $\xi$ is \textbf{normalized}: $\langle 
\xi\rangle=0,~\langle \xi^2\rangle=1$, and follows a \textbf{Gaussian law}. 
Then $w(x)=(2\pi)^{-1/2}e^{-x^2/2}$ and
% 
\begin{equation}
\langle e^{p\xi}\rangle=(2\pi)^{-1/2}\int_{-\infty}^{+\infty}e^{px-x^2/2} dx=e^{p^2/2}.
\label{equ84bis}
\end{equation}
%
Reasoning as above, we obtain
% 
\begin{equation}
:\xi^n:~~=~~H_n(\xi)
\label{equ85bis}
\end{equation}
%
where $H_n(X)$ is the Hermite polynomial of degree $n$. Explicitly
\begin{eqnarray}
&&:\xi:~~=~~\xi \nonumber \\ && 
:\xi^2:~~=~~\xi^2 -1\nonumber\\&& :\xi^3:~~=~~\xi^3-3\xi\nonumber.
\end{eqnarray}
To get a general formula, apply (\ref{equ80bis}) to obtain the pair of 
relations 
% 
\begin{equation}
:e^{p\xi}:~~=~~e^{-p^2/2}e^{p\xi},~e^{p\xi}=e^{p^2/2}:e^{p\xi}:.
\label{equ86bis}
\end{equation}
%
Equating equal powers of $p$, we derive 
% 
\begin{equation}
:\xi^n:~~=~~\sum_{0\le k\le n/2}(-1)^k \frac{n!}{2^kk!(n-2k)!}\xi^{n-2k},
\label{equ87bis}
\end{equation}
%
and conversely
% 
\begin{equation}
\xi^n=\sum_{0\le k\le n/2} \frac{n!}{2^kk!(n-2k)!}:\xi^{n-2k}:.
\label{equ88bis}
\end{equation}
%
Notice that the orthogonality relation (\ref{equ47}) for the Hermite 
polynomials translates in probabilistic terms into 
% 
\begin{equation}
\langle :\xi^m:~:\xi^n:\rangle=m!\delta_{mn},
\label{equ89bis}
\end{equation}
%
hence the sequence $1,:\xi:,:\xi^2:,\ldots$ is derived from the natural 
sequence $1,~\xi,~\xi^2,\ldots$ by orthogonalization.

To conclude, we can use the reflected probability density $w(-x)$ as an 
impulse response and define the input-output relation by 
% 
\begin{equation}
F(t)=\int f(t+\tau)w(\tau)d\tau,
\label{equ90bis}
\end{equation}
%
that is
% 
\begin{equation}
F(t)=\langle f(t+\xi)\rangle
\label{equ91bis}
\end{equation}
%
in probabilistic terms. The interpretation is that the input is spoiled by 
random delay in transmission. Then $\Pi_n(t)$ is the input corresponding to 
the output $t^n$. Analytically this is expressed by 
% 
\begin{equation}
\int_{-\infty}^{+\infty}\Pi_n(t+\tau)w(\tau)d\tau=t^n,
\label{equ92bis}
\end{equation}
%
and probabilistically by
% 
\begin{equation}
\langle :(\xi+t)^n:\rangle=t^n.
\label{equ93bis}
\end{equation}
%
\subsection{The Bargmann-Segal transform}
\label{Bargmann}
Let us consider again the input-output transformation in the Gaussian case. 
It is then called the Bargmann-Segal transform (or $B$-transform), denoted 
by $\mathbf{B}$. According to (\ref{equ90bis}), we have   
% 
\begin{equation}
\mathbf{B}f(z)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}f(x+z)e^{-x^2/2}dx
\label{equ94bis}
\end{equation}
%
or
% 
\begin{equation}
\mathbf{B}f(z)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}f(x)e^{-(z-x)^2/2}dx.
\label{equ94bbis}
\end{equation}
%
Comparing formulas (\ref{equ24}) and (\ref{equ28}), one obtains
% 
\begin{equation}
e^{-(z-x)^2/2}=e^{-x^2/2}\sum_{n\ge0}H_n(x)z^n/n!,
\label{equ95bis}
\end{equation}
%
and by integrating term by term in the expression (95) one concludes 
% 
\begin{equation}
\mathbf{B}f(z)=\sum_{n\ge0}\Gamma_n(f) z^n/n!
\label{equ96bis}
\end{equation}
with
% 
\begin{equation}
\Gamma_n(f)=\int_{-\infty}^{+\infty}H_n(x)f(x)d{\gamma}(x).
\label{equ97bis}
\end{equation}
%
Taking into account the orthogonality property (\ref{equ47}) namely 
% 
$$
\int_{-\infty}^{+\infty}H_m(x)H_n(x)d{\gamma}(x)=\delta_{mn}n!,\nonumber
$$
%
one derives $\Gamma_n(f)=n!c_n$ for $f$ given by a series 
$\sum_{n\ge0}c_nH_n(x)$. That is, \textbf{the $B$-transform takes} 
$\sum_{n\ge0}c_nH_n(x)$ into $\sum_{n\ge0}c_nz^n$. 

To be more precise, we need to introduce some function spaces. The natural 
one is $L^2(d{\gamma})$ consisting of the (measurable) functions $f(x)$ for 
which the integral $\int_{-\infty}^{+\infty}|f(x)|^2d{\gamma}(x)$ is 
finite; the scalar product is given by 
% 
\begin{equation}
\langle f_1|f_2\rangle=\int_{-\infty}^{+\infty}\overline{f_1(x)}f_2(x)d{\gamma}(x).
\label{equ98bis}
\end{equation}
%
In this space, the functions $He_n(x):=H_n(x)/(n!)^{1/2}$ (for 
$n=0,1,\ldots$) form an orthonormal basis\footnote{The orthonormality 
condition $\langle He_m|He_n\rangle=\delta_{mn}$ is nothing else than the 
orthogonality condition (\ref{equ47}). But it requires a proof to show that 
this system is \textbf{complete}, that is that any function in the Hilbert 
space $L^2(d{\gamma})$ can be approximated by polynomials (in the norm 
convergence).}. The $B$-transform takes this space onto the space of series 
$\sum c_n z^n$ with $\sum_{n\ge 0}n!|c_n|^2 <\infty$. 

In its original form (94), the transformation $\mathbf{B}$ 
requires  $z$ to be real, but the form (\ref{equ94bbis}) extends to the case 
of a complex number $z$. Indeed, from the property that $\sum_{n\ge 
0}n!|c_n|^2$ is finite, it follows that the series $\sum_{n\ge 0}c_nz^n$ 
has an infinite radius of convergence, hence represents an entire function 
of the complex variable $z$. The space of such entire functions is denoted 
$\mathcal{F}(\mathbf{C})$ and called the \textbf{Fock space} (in one degree 
of freedom, see Section~3.9).  

The elements of $L^2(d{\gamma})$ can be interpreted as the random variables 
of the form $X=f(\xi)$ with $\langle|X|^2 \rangle$ finite, where $\xi$ is a 
normalized Gaussian random variable. We saw that $\mathbf{B}$ takes 
$H_n(\xi)~=~:\xi^n:$ into $z^n$. Hence it is tempting to denote by $:~ 
:$ the map inverse to $\mathbf{B}$, so that $:z^n:~=~H_n(x)$.
 We have \textbf{a new kind of operational calculus} 
%
$$
\begin{array}{ccc}
& \mathbf{B} &  \\L^2(d{\gamma})  & \rightleftharpoons 
&\mathcal{F}(\mathbf{C}), 
\\ & :~ : &  \nonumber
\\  
\end{array}
$$
%
\textbf{where $\mathbf{B}$ transforms a function $f(x)$ or the random variable $X=f(\xi)$ into the entire 
function} 
% 
\begin{equation}
\mathbf{B}X(z)=e^{-z^2/2}\langle X\cdot e^{z\xi}\rangle
\label{equ99bis}
\end{equation}
%
\textbf{and the inverse map takes an entire function $\Phi(z)=\sum_{n\ge0}c_nz^n$ 
in the Fock space into the random variable} $:\Phi(\xi):=\sum_{n\ge 
0}c_n:\xi^n:$. 

According to the definition (94), $\mathbf{B}$ takes the function $e^{px}$ 
into $e^{pz+p^2/2}$, that is it acts as $e^{\mathbf{D}^2/2}$  where 
$\mathbf{D}$ is the derivation, followed by the change of variable $x$ into 
$z$. This result can be reformulated as follows. Using the exponential 
generating series 
% 
\begin{equation}
\sum_{n\ge0}H_n(x) p^n/n!=e^{px-p^2/2}
\label{equ100bis}
\end{equation}
%
for the Hermite polynomials, and noting that $e^{\mathbf{D}^2/2}$ applied 
to $e^{pz-p^2/2}$ gives $e^{px}$, we conclude that $e^{\mathbf{D}^2/2}$ 
takes $H_n(x)$ into $x^n$, that is $\mathbf{B}$ \textbf{coincides on the 
polynomials with the differential operator} $e^{\mathbf{D}^2/2}$ \textbf{of 
infinite degree}. 

We would like to conclude the general rule  
% 
\begin{equation}
\mathbf{B}=e^{\mathbf{D}^2/2},
\label{equ101bis}
\end{equation}
%
hence
% 
\begin{equation}
: ~:~~=e^{-\mathbf{D}^2/2}.
\label{equ102bis}
\end{equation}
%
One way to substantiate these claims is to consider the heat (or diffusion) 
equation 
% 
\begin{equation}
\partial_sF(s,x)=\frac{1}{2}\partial_x^2 F(s,x)
\label{equ103bis}
\end{equation}
%
with initial value
% 
\begin{equation}
F(0,x)=f(x).
\label{equ104bis}
\end{equation}
%
Since $\mathbf{D}=\partial_x$, the solution of equation (\ref{equ103bis}) 
can be written formally as $F(s,x)=e^{s\mathbf{D}^2/2}f(x)$, hence 
$e^{\mathbf{D}^2/2}f(x)$ \textbf{represents the value for $s=1$ of the 
solution of equation (\ref{equ103bis}) which agrees for $s=0$ with $f(x)$}. 
But we know an explicit solution to the heat equation 
% 
\begin{equation}
F(s,x)=\frac{1}{\sqrt{2\pi s}}\int_{-\infty}^{+\infty}f(x+u)e^{-u^2/2s}du.
\label{equ105bis}
\end{equation}
%
Comparing with (\ref{equ94bis}), we obtain
% 
\begin{equation}
\mathbf{B}f(x)=F(1,x),
\label{equ106bis}
\end{equation}
%
and this relation is the true expression of 
$\mathbf{B}=e^{\mathbf{D}^2/2}$. 

The operator $e^{\mathbf{D}^2/2}$ (or $\mathbf{B}$) is \textbf{smoothing}. 
That is, if we simply assume that $f$ belongs to $L^2(d{\gamma})$ (that is 
that the integral $\int_{-\infty}^{+\infty}|f(x)|^2 e^{-x^2/2}dx$ is 
finite), then the function $F(1,x)=e^{\mathbf{D}^2/2}f(x)$ extends to an 
entire function  in the complex domain. Conversely, the \textbf{Wick 
operator}  $:~:~=e^{-\mathbf{D}^2/2}$ makes sense only for the functions 
$g(x)$ (for $x$ real) which extend in the complex domain into a function 
$\Phi(z)$ (for $z$   complex) belonging to the Fock space 
$\mathcal{F}(\mathbf{C})$. 
\subsection{The quantum harmonic oscillator}
\label{quantum}
Let us rewrite the definition of the $B$-transform as an integral operator
% 
\begin{equation}
\mathbf{B}f(z)=\int_{-\infty}^{+\infty}B(z,x)f(x)dx
\label{equ107bis}
\end{equation}
%
with a kernel
% 
\begin{equation}
B(z,x)=(2\pi)^{-1/2}e^{-(z-x)^2/2}.
\label{equ108bis}
\end{equation}
%
It is often more convenient to replace the Hilbert space $L^2(d\gamma)$ by 
the Hilbert space $L^2(\mathbf{R})$ \footnote{consisting of the 
(measurable) functions $\phi(x)$ such that 
$\int_{-\infty}^{+\infty}|\phi(x)|^2 dx$ be finite, with scalar product 
$\langle \phi_1|\phi_2 \rangle 
=\int_{-\infty}^{+\infty}\overline{\phi_1(x)}\phi_2(x)dx$.}.
 Defining the function $u_0(x)$ by $(2\pi)^{-1/4}e^{-x^2/4}$, we get 
% 
\begin{equation}
d\gamma (x)=u_0(x)^2dx,
\label{equ109bis}
\end{equation}
%
hence $\int |f(x)|^2 d\gamma (x)$ is finite if and only if 
$\int|f(x)u_0(x)|^2dx$ is finite. That is, the multiplication by the 
function $u_0(x)$ gives an isometry of $L^2(d\gamma)$ onto 
$L^2(\mathbf{R})$. We can transfer the $B$-transform to $L^2(\mathbf{R})$, 
as the isometry $\mathbf{B}'$ of $L^2(\mathbf{R})$ onto 
$\mathcal{F}(\mathbf{C})$ 
\footnote{with the scalar product $\langle 
\Phi_1|\Phi_2 \rangle =\sum_{n\ge0}n!\overline{c}_{n,1} c_{n,2} $ for $\Phi_j(z)=\sum_{n\ge0}
c_{n,j} z^n \\(j=1,2)$.} defined by $\mathbf{B}f=\mathbf{B}'(fu_0)$. 
Explicitly  
% 
\begin{equation}
\mathbf{B}' f(z)=\int_{-\infty}^{+\infty}B'(z,x)f(x)dx
\label{equ110bis}
\end{equation}
%
with 
% 
\begin{equation}
B'(z,x)=u_0(x)^{-1}B(z,x)=(2\pi)^{-1/4}e^{-z^2/2+zx-x^2/4}.
\label{equ111bis}
\end{equation}
%
Many properties are easier to describe in the Fock space. For instance the 
function $1$ is called the \textbf{ground state} $\Omega$, the 
multiplication by $z$  is called the \textbf{creation operator}, denoted by 
$\mathbf{a}^*$, and the derivation $\partial _z=\frac{d}{dz}$ is the 
\textbf{annihilation operator}, denoted by $\mathbf{a}$. The vectors 
% 
\begin{equation}
e_n=\frac{1}{\sqrt{n!}}(\mathbf{a}^*)^n\Omega,
\label{equ112bis}
\end{equation}
%
that is the functions $e_n(z)=\frac{1}{\sqrt{n!}}z^n$, form an orthonormal 
basis of $\mathcal{F}(\mathbf{C})$ with $e_0=\Omega$. An easy calculation 
gives 
%
\begin{equation}
\left\{
\begin{array}{c}
\mathbf{a}e_n=n^{1/2}e_{n-1}~\mathrm{for}~n\ge1,~\mathbf{a}e_0=0\\ 
\mathbf{a}^*e_n=(n+1)^{1/2}e_{n+1}, 
\end{array}
\right. \
\label{equ113bis}
\end{equation}
%
hence the matrices 
$$\mathbf{a} =\left ( 
\begin{array}{ccccccc}
0 & \sqrt{1} & 0 & 0& .& .& . \\0 &0 & \sqrt{2} & 0 &.&.&.\\0&0 &0 & 
\sqrt{3}&.&.&.\\ 
0&0 &0 &0&.&.&. \\. &. &.&.&.&.&.\\.&.&.&.&.&.&.\\
\end{array}
\right ),\quad \mathbf{a}^* =\left ( 
\begin{array}{ccccccc}
0 & 0 & 0 &0&.&.&. \\\sqrt{1} &0 & 0 & 0 &.&.&.\\0&\sqrt{2} &0 & 0&.&.&.\\ 
0&0 &\sqrt{3} &0&.&.&. \\.&.&.&.&.&.&.\\.&.&.&.&.&.&.\\ 
\end{array}
\right )$$
in the basis $(e_n)_{n\ge0}$; it follows that $\mathbf{a}$ \textbf{and} 
$\mathbf{a}^*$ \textbf{are adjoint to each other}. Moreover from the 
definitions $\mathbf{a}^*=z,~\mathbf{a}=\partial_z$ follows the commutation  
relation 
% 
\begin{equation}
\mathbf{a}\mathbf{a}^*-\mathbf{a}^*\mathbf{a}=1.
\label{equ114bis}
\end{equation}
%
Finally the \textbf{number operator} $\mathbf{N}=\mathbf{a}^*\mathbf{a}$ is 
given by $\mathbf{N}=z\partial_z$, hence is diagonalized in the basis 
$(e_n)$ 
% 
\begin{equation}
\mathbf{N}e_n=ne_n.
\label{equ115bis}
\end{equation}
%

In the spirit of operational calculus, we transfer these results from the 
Fock space model to the spaces $L^2(d\gamma)$ and $L^2(\mathbf{R})$. The 
following table summarizes these translations (where $\partial_x$ is 
$\frac{d}{dx}$):  
$$\begin{array}{cccc} Space & 
L^2(d\gamma)&L^2(\mathbf{R})&\mathcal{F}(\mathbf{C})\\ \\
~~~\Omega~~~~&1~~~~&(2\pi)^{-1/4}e^{-x^2/4}=u_0(x)~~~~&1\\\\
e_n&(n!)^{-1/2}H_n(x)~~~~&(n!)^{-1/2}H_n(x)u_0(x)~~~~&(n!)^{-1/2}z^n\\\\
\mathbf{a}^*&x-\partial_x&x/2-\partial_x&z\\\\
\mathbf{a}&\partial_x&x/2+\partial_x&\partial_z\\\\
\mathbf{N}&x\partial_x-\partial_x^2&-\partial_x^2+x^2/4-1/2&z\partial_z\\\\
\end{array}$$
For instance, the fact that $\mathbf{a}^*$ corresponds to $x-\partial_x$ in 
$L^2(d\gamma)$ is proved as follows: from the definition of $B(z,x)$ one 
gets 
% 
\begin{equation}
(x+\partial_x)B(z,x)=zB(z,x).
\label{equ116bis}
\end{equation}
%
Multiplying by $f(x)$ and integrating by parts, we get
% 
\begin{eqnarray}
&&\int B(z,x)(x-\partial_x)f(x)dx=\int(x+\partial_x)B(z,x)f(x)dx\nonumber 
\\&&=z\int B(z,x)f(x)dx\nonumber, 
\label{equ116bbis}
\end{eqnarray}
%
that is $\mathbf{B}((x-\partial_x)f)=z\mathbf{B}f$. The other cases are 
similar. 

We apply these results to the \textbf{harmonic oscillator}. In classical 
mechanics, the harmonic oscillator is described by the 
\textbf{Hamiltonian} $H=\frac{p^2}{2m}+\frac{Kq^2}{2}$ in canonical coordinates $p$,$q$. The equation of 
motion 
is $\ddot{q}+\omega^2q=0$ with the pulsation $\omega=\sqrt{K/m}$, and the 
momentum $p=m\dot{q}$. To get the corresponding quantum Hamiltonian 
$\mathbf{H}$, replace $p$ by the operator $\mathbf{p}=-i\hbar\partial_q$ 
hence 
% 
\begin{equation}
\mathbf{H}=-\frac{\hbar^2}{2m}\partial_q^2+\frac{m\omega^2q^2}{2}.
\label{equ117bis}
\end{equation}
% 
Introduce the dimensionless coordinate $x=(2m\omega/\hbar)^{1/2}q$. Then 
$\mathbf{H}$ can be rewritten as 
% 
\begin{equation}
\mathbf{H}=\hbar\omega(\mathbf{a}^*\mathbf{a}+\frac{1}{2})
\label{equ118bis}
\end{equation}
% 
in the model $L^2(\mathbf{R})$. From the diagonalization of 
$\mathbf{N}=\mathbf{a}^*\mathbf{a}$, we conclude that the energy levels of 
the quantum harmonic oscillator (that is, the eigenvalues of $\mathbf{H})$ 
are given by $\hbar\omega(n+\frac{1}{2})$ with $n=0,1,2,\ldots$: that is 
\textbf{Planck's radiation law}, with the correction $\frac{1}{2}$ giving 
$\frac{1}{2}\hbar\omega$ for the energy of the ground state $u_0=\Omega.$ 



\section{The art of manipulating infinite series}
\label{art}
\setcounter{equation}{0}
\subsection{Some divergent series}
\label{divergent}
Euler claimed that $S=1-1+1-1+\ldots$ is equal to $\frac{1}{2}$. Here is 
the purported proof:
\vspace{1.0ex}

\vbox{
$\hbox to2cm {\hfil$S$}=1~-1~+1~-1+\ldots$

$\hbox to2cm {\hfil$+~S$}=\hphantom{1~-{}}1~-1~+1-\ldots$ 

~~~~~~~~~~~------------------------------------

$\hbox to2cm {\hfil$2~S$}=1~+0~+0~+0+\ldots=1.$
}
\vspace{1.0ex}

What is implicit is the use of two rules: 

a) If $S=u_0+u_1+u_2+\ldots$, then $S=0+u_0+u_1+\ldots$

b) If $S=u_0+u_1+u_2+\ldots$ and $S'=u_0+u'_1+u'_2+\ldots,$ then 
$S+S'=(u_0+u'_0)+(u_1+u'_1)+(u_2+u'_2)+\ldots~$. \\These rules certainly 
hold for convergent series but to extend them to  divergent series is 
somewhat hazardous. 

Let us repeat the previous calculation in a slightly more general form:
\vspace{1.0ex}

\vbox{
$\hbox to2cm {\hfil$S$}=1~-t~+t^2~-t^3+\ldots$
 
$\hbox to2cm {\hfil$+~tS$}=~~~~~~t~-t^2~+t^3-\ldots$

~~~~~~~~---------------------------------------

$\hbox to2cm {\hfil$(1+t)S$}=1~+0~+0~+0+\ldots=1.$ 
}

\vspace{1.0ex}

The result is 
% 
\begin{equation}
1-t+t^2-t^3+\ldots=\frac{1}{1+t},
\label{equ1ter}
\end{equation}
%
the classical summation of the geometric series. If $t$ is a real number 
such that $|t|<1$, the geometric series is convergent, and the use of rules  
a) and b) is justified. To get Euler's result, take the limiting value 
$t=1$ in (\ref{equ1ter}). 

What we need is the explicit description of various procedures to define 
rigorously the sum of certain divergent series (not all at once) and to 
compare these procedures. \textbf{Suppose we want to define the sum} 
% 
\begin{equation}
S=u_0+u_1+\ldots.
\label{equ2ter}
\end{equation}
%
\textbf{Introduce weights} $p_{0,t}, p_{1,t},\ldots$ \textbf{and the weighted series}
% 
\begin{equation}
S_t=p_{0,t}u_0+p_{1,t}u_1+\ldots.
\label{equ3ter}
\end{equation}
%
\textbf{If the series $S_t$ is convergent for each value of the parameter 
$t$, and $S_t$ approaches a limit $S$ when $t$ approaches some limiting 
value $t_0$, then $S$ is the sum for this procedure\footnote{See Knopp's 
book \cite{Knopp} for this method.}}. 

The previous procedure is reasonable only when $\lim_{t\rightarrow 
t_0}p_{n,t}=1$ for $n=0,1,\ldots$. Some examples: 

a) $p_{0,N}=p_{1,N}=\ldots=p_{N,N}=1,~p_{n,N}=0$ for $n>N$ and 
$N=0,1,2,\ldots$. Then the weighted sum amounts to the finite sum 
% 
$$
S_N=u_0+\ldots+u_N \nonumber
\label{equ4bter}
$$
%
(obviously convergent) and the convergence of $S_N$ towards a limit $S$ 
corresponds to the convergence of the series $u_0+u_1+u_2+\ldots$ in the 
standard sense, with the standard sum $S$.

b) Put $\sigma_N=\frac{1}{N+1}(S_0+\ldots+S_N)$; this corresponds to the 
weights 
%
\begin{equation}
p_{n,N}=\left\{\begin{array}{ll} 1-\frac{n}{N+1}&\mbox{~~for~$0\le n\le 
N$}\\ 0&\mbox{~~for~$n>N$}. 
\end{array}
\right.
\label{equ4ter}
\end{equation}
%
If $\sigma_N$ converges to a limit $\sigma$, this is the 
\textbf{Cesaro-sum} of the series $u_0+u_1+u_2+\ldots$.

c) To get the \textbf{Abel summation}, we introduce the weights 
$p_{n,t}=t^n$ for $n=0,1,2,\ldots$ and a real parameter $t$ with $0<t<1$. 
We take therefore the limit for $t=1$ of the power series 
$\sum_{n\ge0}u_nt^n$. 

It is known that every convergent series with sum $S$ is Cesaro-summable 
with the same sum $\sigma=S$. Similarly, Cesaro summation is extended by 
Abel summation. Euler's example is $u_n=(-1)^n$, hence 
% 
\begin{equation}
S_N=\left\{\begin{array}{ll} 1&\mbox{~~if~$N \geq 0$~is~even}\\ 
0&\mbox{~~if~$N \geq 0$~is~odd} 
\end{array}
\right.
\label{equ5ter}
\end{equation}
%
and therefore
%
\begin{equation}
%\[
\sigma_N=\left\{\begin{array}{ll}
\frac{1}{2}&\mbox{~~if~$N$~is~odd}\\ 
\frac{1}{2}+\frac{1}{2N+2}&\mbox{~~if~$N$~is~even}.
\end{array}
\right.
%\]
\label{equ6ter}
\end{equation}
It follows that $\sigma_N$ converges to $\sigma=\frac{1}{2}$. Hence the 
series $1-1+1-1+\ldots$ is Cesaro-summable to $\frac{1}{2}$, and a previous 
calculation shows that it is Abel-summable to $\frac{1}{2}$ also.

The scope of Abel summation can be extended in various ways. For instance, 
if the sequence $(u_n)$ is bounded, that is $|u_n|\le M$ for 
$n=0,1,2,\ldots$ with a constant $M$ independent of $n$, then the power 
series $\sum_{n\ge 0}u_nz^n$ converges for any complex number $z$ with 
$|z|<1$ and defines therefore a holomorphic function $U(z)$ in the open 
disk $|z|<1$ (see Fig. \ref{unitdisk}).
%
\begin{figure}\centering 
\includegraphics[width=0.5\textwidth]{cartier4.pdf} 
\caption{The open unit disk}
\label{unitdisk} 
\end{figure}
%
If the limit $\lim_{r \rightarrow 1}U(r e^{i\theta})$(for $0\le r<1$) 
exists, it can be taken as an Abel sum for the series 
$\sum_{n\ge0}u_ne^{in\theta}$. 

In a slightly more general way, we can assume that the sequence $(u_n)$ is 
polynomially bounded, that is 
% 
$$
|u_n|\le C n^k \nonumber
$$
%
for all $n=1,2,\ldots$ and some constants $C>0$ and $k=1,2,\ldots$ The 
radius of convergence of the series $\sum_{n\ge 0}u_nz^n$ is still at least 
$1$, and if $U(1)=\lim_{r 
\rightarrow 1}U(r)=\lim_{r \rightarrow 1}\sum_{n\ge 0}u_n r^n$ exists, it is 
the Abel sum for $u_0+u_1+u_2+\ldots$ 

We just give one example, namely $u_n=(-1)^n n^k$ for $k=0,1,2,\ldots$. We 
calculate 
% 
\begin{eqnarray}
&&U(z)=\sum_{n\ge0}u_nz^n=\sum_{n\ge0}(-1)^nn^kz^n=\sum_{n\ge0}n^k(-z)^n\nonumber\\ 
&&=\sum_{n\ge0}(z\partial_z)^k(-z)^n=(z\partial_z)^k\sum_{n\ge0}(-z)^n=(z\partial_z)^k\frac{1}{1+z}. 
\nonumber
\end{eqnarray}
%
Particular cases:
% 
\begin{eqnarray}
&&k=0,~~~~U(z)=\frac{1}{1+z},~~~~~~~~~~~~~~~~U(1)=\frac{1}{2}\nonumber\\
&&k=1,~~~~U(z)=\frac{-z}{(1+z)^2},~~~~~~~~~~~~U(1)=-\frac{1}{4}\nonumber\\
&&k=2,~~~~U(z)=\frac{z(z-1)}{(1+z)^3},~~~~~~~~~~~~U(1)=0\nonumber\\
&&k=3,~~~~U(z)=\frac{-z^3+4z^2-z}{(1+z)^4},~~~~U(1)=\frac{1}{8}\nonumber,
\end{eqnarray}
%
that is 
% 
$$
\sum_{n\ge0}(-1)^n=\frac{1}{2}\nonumber
$$
% 
$$
\sum_{n\ge0}(-1)^nn=-\frac{1}{4}\nonumber
$$
% 
$$
\sum_{n\ge0}(-1)^nn^2=0\nonumber
$$
%
$$
\sum_{n\ge0}(-1)^4 n^3=\frac{1}{8}\nonumber.
$$
% 
In general, we get $U(1)=(z\partial_z)^k\frac{1}{1+z}|_{z=1}$ that is, 
after the change of variable $z=e^u$, 
\begin{equation}
\sum_{n\ge0}(-1)^n n^k=\partial_u^k \frac{1}{e^u+1}\bigg|_{u=0}.
\label{equ7ter} 
\end{equation}
% 
Using the exponential generating series for the Bernoulli numbers in the 
form 
% 
$$
\frac{1}{e^u-1}=\frac{1}{u}+\sum_{k\ge0}\frac{B_{k+1}}{(k+1)!} u^k \nonumber
$$
% 
we obtain
% 
\begin{equation}
\frac{1}{e^u+1}=\frac{1}{e^u-1}-\frac{2}{e^{2u}-1}=\sum_{k\ge0}\frac{(1-2^{k+1}) B_{k+1}}{(k+1)!}u^k,
\label{equ8ter}
\end{equation}
% 
hence Euler's result
%
\begin{equation}
\sum_{n\ge0}(-1)^n n^k=\frac{(1-2^{k+1})B_{k+1}}{k+1}.
\label{equ9ter} 
\end{equation}
% 
We leave it to the reader to rederive the previous cases $0\le k\le3$ using 
the  values for $B_1,B_2,B_3,B_4$ given in Section~3.1. We come back to 
this result in Section~4.4. 

Euler gave formulas for wildly divergent series like $\sum_{n\ge0}(-1)^n 
n!$. Using the classical formula 
%
\begin{equation}
n!=\int_0^{\infty}e^{-t}t^n dt
\label{equ10ter} 
\end{equation}
and assuming unrestrictedly as legitimate formal rules both term-by-term integration and summation of a 
geometric series, we get 
%
\begin{eqnarray}
&&\sum_{n\ge0}(-1)^n n!=\sum_{n\ge0}(-1)^n\int_0^{\infty}e^{-t}t^n 
dt\nonumber  \\&&~~~~=\int_0^{\infty}e^{-t}\sum_{n\ge0}(-t)^n 
~dt=\int_0^{\infty}\frac{e^{-t}}{1+t}dt\nonumber, 
\label{equ10bter} 
\end{eqnarray} 
the last integral being convergent. This is just the 
beginning of the use of Borel transform and Borel summation for divergent 
series. 
\subsection{Polynomials of infinite degree and summation of series}
\label{polynomials}
It is an important principle that \textbf{a polynomial can be reconstructed 
from its roots}. More precisely, let 
%
\begin{equation}
P(z)=c_nz^n+c_{n-1}z^{n-1}+\ldots+c_1z+c_0
\label{equ11ter} 
\end{equation}
%
(with $c_n\neq 0$) be a polynomial of degree $n$ with complex coefficients. 
If $\lambda_1$ is a root of $P$, that is $P(\lambda_1)=0$, it is elementary  
to factorize $P(z)=(z-\lambda_1)P_1(z)$ where $P_1(z)$ is a polynomial of 
degree $n-1$. Continuing this process, we end up with a factorization 
%
\begin{equation}
P(z)=(z-\lambda_1)\ldots(z-\lambda_m)Q(z),
\label{equ12ter} 
\end{equation}
%
where the polynomial $Q(z)$ of degree $n-m$ has no more roots. According to 
a highly non-trivial result, first stated by d'Alembert (1746) and proved 
by Gau\ss~(1797), a polynomial without roots is a constant, hence the 
factorization (\ref{equ12ter}) takes the form 
%
\begin{equation}
P(z)=c_n(z-\lambda_1)\ldots(z-\lambda_n).
\label{equ13ter} 
\end{equation}
%
By a well known calculation, one derives the following 
relations between coefficients and roots
\begin{eqnarray}
&&\lambda_1+\ldots+\lambda_n=-c_{n-1}/c_n\nonumber \\
&&\sum_{i<j}\lambda_i\lambda_j=c_{n-2}/c_n,~\mathrm{etc}\ldots\nonumber
\end{eqnarray}
%

For our purposes, it is better to use the inverses of the roots, assumed to 
be nonzero. Since the logarithmic derivative transforms product into sum 
and annihilates constants, we derive 
%
\begin{equation}
\mathbf{D}P(z)/P(z)=\sum_{i=1}^n\frac{1}{z-\lambda_i}.
\label{equ14ter} 
\end{equation}
% 
Use of the geometric series gives
%
\begin{equation}
\sum_{i=1}^n\frac{1}{z-\lambda_i}=-\sum_{i=1}^n\sum_{k\ge0}z^k/\lambda_i^{k+1}.
\label{equ15ter} 
\end{equation}
%
Introducing the sums of inverse powers of roots
%
\begin{equation}
\gamma_k=\sum_{i=1}^n \lambda_i^{-k},
\label{equ16ter} 
\end{equation}
%
we conclude from this calculation
%
\begin{equation}
z\mathbf{D}P(z)+P(z)\sum_{k\ge1}\gamma_k z^k=0.
\label{equ17ter} 
\end{equation}
%
Assuming for simplicity $c_0=1$ and equating the coefficients of equal 
powers of $z$, we obtain the following variant of \textbf{Newton's 
relations} 
%
\begin{equation}
\gamma_k+c_1\gamma_{k-1}+\ldots+c_{k-1}\gamma_1+kc_k=0
\label{equ18ter} 
\end{equation}
%
for $k\ge1$. It is important to notice that \textbf{the degree $n$ of 
$P(z)$ does not appear explicitly in the relation (\ref{equ18ter})}, which 
can be solved inductively 
%
\begin{eqnarray}
&&\gamma_1=-c_1 \label{equ19ter}\\ &&\gamma_2=c_1^2-2c_2 \label{equ20ter}\\ 
&&\gamma_3=-c_1^3+3c_1c_2-3c_3\label{equ21ter}\\ 
&&\gamma_4=c_1^4-4c_1^2c_2+4c_1c_3-4c_4+2c_2^2 \label{equ22ter}.
\end{eqnarray}
%

Around $1734$, Euler undertook to calculate the sum of the series 
$S_2=\sum_{n\ge1}\frac{1}{n^2}$. This series is slowly convergent, but 
Euler invented efficient acceleration methods for summing series and 
calculated the sum $S_2=1.64493406\ldots$; he recognized $S_2=\pi^2/6$. He 
obtained also the value of $S_4=\sum_{n\ge1}1/n^4$ to be $\pi^4/90$. To 
establish these results rigorously, he introduced the equation $\sin x=0$ 
admitting the solutions $x=0,\pm\pi,\pm2\pi,\pm3\pi,\ldots$ Discarding the  
root $x=0$ and using the power series expansion of $\sin x$, we are led to 
consider the equation 
%
$$
1-\frac{x^2}{6}+\frac{x^4}{120}-\ldots=0 \nonumber
\label{equ18bter} 
$$
%
with roots $\pm\pi,\pm2\pi,\pm3\pi,\ldots$. With the previous notations we 
have
\begin{eqnarray}
&&c_1=0,~c_2=-\frac{1}{6},~c_3=0,~c_4=\frac{1}{120},\ldots\nonumber  \\ 
&&\gamma_2=\sum_{n\ge1}\left[\frac{1}{(\pi n)^2}+\frac{1}{(-\pi 
n)^2}\right]=2S_2/\pi^2 
\nonumber\\ && \gamma_4=\sum_{n\ge1}\left[\frac{1}{(\pi n)^4}+\frac{1}{(-\pi 
n)^4}\right]=2S_4/\pi^4\nonumber. 
\end{eqnarray}
Assuming that the relations (\ref{equ20ter}) and (\ref{equ22ter}) still 
hold, we get 
\begin{eqnarray}
&&2S_2/\pi^2=\gamma_2=-2c_2=\frac{1}{3}\nonumber\\ 
&&2S_4/\pi^4=\gamma_4=-4c_4+2c_2^2=-\frac{1}{30}+\frac{1}{18}=\frac{1}{45}\nonumber. 
\end{eqnarray}
The desired relations
%
$$
S_2=\pi^2/6,~S_4=\pi^4/90 \nonumber
$$
%
follow immediately.

To summarize the method used by Euler:

a) first guess the value from accurate numerical work; 

b) consider the function  
%
$$
\frac{\sin x}{x}=1-\frac{x^2}{6}+\frac{x^4}{120}-\ldots\nonumber
$$
%
as a polynomial of infinite degree, with infinitely many roots 
$\pm\pi,\pm2\pi,\break\pm3\pi,\ldots$;

c) since Newton's relations (\ref{equ19ter}) to (\ref{equ22ter}) \textbf{do not 
involve explicitly the degree $n$ of the polynomial}, assume their validity  
in the case $n=\infty$ as well, and exploit them for $P(x)=(\sin x)/x$.


\subsection{The Euler-Riemann zeta function}
\label{zetafunction}
We use Riemann's definition and notation
%
\begin{equation}
\zeta(s)=\sum_{n\ge1}n^{-s}.
\label{equ23ter}
\end{equation}
%
The series converges absolutely for any complex number $s$ with real part 
$\Re(s)$ greater than $1$. It has been shown by Riemann that $\zeta(s)$ can 
be analytically continued to the whole complex plane, the only singularity 
being a pole of order $1$ at $s=1$, that is $\zeta(s)-1/(s-1)$ is an entire 
function. Obviously $\zeta(1)=\sum_{n\ge1}1/n$ is a divergent series, but 
$\zeta(s)$ is defined when $s\neq 1$ is an integer (positive or negative). 
Euler was the first to calculate $\zeta(s)$ when $s$ is an integer. 

We consider the case where $s$ is \textbf{even and strictly positive}. 
Euler proved the formula 
%
\begin{equation}
\zeta(2k)=\frac{2^{2k-1}\pi^{2k}|B_{2k}|}{(2k)!},
\label{equ24ter}
\end{equation}
%
and in particular
%
\begin{equation}
\zeta(2)=\frac{2\pi^2|B_2|}{2!}=\frac{\pi^2}{6}
\label{equ25ter}
\end{equation}
%
%
\begin{equation}
\zeta(4)=\frac{8\pi^4|B_4|}{4!}=\frac{\pi^4}{90}.
\label{equ26ter}
\end{equation}
%
Since $\zeta(2)=\sum_{n\ge1}1/n^2=S_2$ and similarly $\zeta(4)=S_4$, we 
recover the formulas for $S_2$ and $S_4$. The method we used in the 
previous section could be extended to cover the general case 
(\ref{equ24ter}), but it is simpler to go back to the formula given for the 
logarithmic derivative in (\ref{equ14ter}). For the function $\sin z$, the 
logarithmic derivative is $\cot z=\cos z/\sin z$. This function is 
meromorphic in the whole complex plane, with simple poles of residue $1$ at   
each integral multiple of $\pi$. \textbf{Euler assumed at first that, in 
analogy with (\ref{equ14ter}), $\cot z$ should be equal to the sum of its 
polar contributions, that is} 
%
\begin{equation}
\cot z=\sum_{n=-\infty}^{+\infty}\frac{1}{z-n\pi}.
\label{equ27ter}
\end{equation}
%

Assume this relation for a moment, and derive (\ref{equ24ter}). The series 
(\ref{equ27ter}) is not absolutely convergent, but can be summed in a 
symmetrical way by taking $\sum_{n=-\infty}^{+\infty}$ to be 
$\lim_{N\rightarrow\infty}\sum_{n=-N}^{+N}$. Hence 
%
\begin{equation}
\cot z-\frac{1}{z}=\sum_{n=1}^{\infty}\frac{2z}{z^2-n^2\pi^2}.
\label{equ28ter}
\end{equation}
%
The right-hand side can be developed using the geometric series; for 
$|z|<\pi$, the series involved are absolutely convergent, hence
%
\begin{eqnarray}
&&\sum_{n=1}^{\infty}\frac{2z}{z^2-n^2\pi^2}=-\sum_{n=1}^{\infty}2z\sum_{k=1}^{\infty}
\frac{(z^2)^{k-1}}{(n^2\pi^2)^k}=-2\sum_{n=1}^{\infty}\sum_{k=1}^{\infty}\frac{z^{2k-1}}
{n^{2k}\pi^{2k}}\nonumber \\
&&=-2\sum_{k=1}^{\infty}\frac{z^{2k-1}}{\pi^{2k}}\sum_{n=1}^{\infty}\frac{1}{n^{2k}}\nonumber,
\end{eqnarray}
% 
that is
%
\begin{equation}
\cot z=\frac{1}{z}-2\sum_{k\ge1}\frac{\zeta(2k)}{\pi^{2k}}z^{2k-1}.
\label{equ29ter}
\end{equation}
%
Another use of the exponential generating series for the Bernoulli numbers 
yields 
%
\begin{eqnarray}
&&\cot z=i\frac{e^{2iz}+1}{e^{2iz}-1}=\frac{2i}{e^{2iz}-1}+i\nonumber \\
&&\hphantom{\cot
z}{}=2i\bigg\{\frac{1}{2iz}-\frac{1}{2}+\sum_{k\ge1}\frac{B_{2k}}{(2k)!}
(2iz)^{2k-1}\bigg\}+i,\nonumber 
\label{equ29bter}
\end{eqnarray}
%
hence finally 
%
\begin{equation}
\cot z=\frac{1}{z}+\sum_{k\ge1}\frac{(-1)^k2^{2k}B_{2k}}{(2k)!}z^{2k-1}.
\label{equ30ter}
\end{equation}
%
To establish (\ref{equ24ter}), it is enough to compare the expansions 
(\ref{equ29ter}) and (\ref{equ30ter}) for $\cot z$ and to note that 
$\zeta(2k)=\sum_{n\ge1}\frac{1}{n^{2k}}$ is the sum of a convergent series 
of positive numbers, hence $\zeta(2k)>0$.

Euler's proof for the expansion (\ref{equ27ter}) of $\cot z$ is reproduced 
in many textbooks. Here is a variant which seems to have been unnoticed so 
far. Define 
%
\begin{equation}
\Phi(z)=\cot z-\sum_{n=-\infty}^{+\infty}\frac{1}{z-n\pi}.
\label{equ31ter}
\end{equation}
%
Examining the poles of $\cot z$, we see that $\Phi(z)$ is an entire 
function of the complex variable $z$. A simple manipulation yields the 
functional equation 
%
\begin{equation}
\Phi(z)=\frac{1}{2}\left[
\Phi\left(\frac{z}{2}\right)+\Phi\left(\frac{z+\pi}{2}\right)\right];
\label{equ32ter}
\end{equation}
%
we have to prove that $\Phi(z)=0$ for all $z$.

a) \textbf{The function $\Phi$ is bounded}: indeed, denote by $C_n$ the set 
of complex numbers whose modulus is at most $(2^n+1)\pi$. Since $C_1$ is a 
compact set and $\Phi$ is continuous, there exists a constant $M>0$ such 
that $|\Phi(z)|\le M $ for $z$ in $C_1$. Assuming the estimate 
$|\Phi(z)|\le M$ for $z$ in $C_n$, we use the functional equation for $z$ 
in $C_{n+1}$, 
%
\begin{equation}
|\Phi(z)|\le 
\frac{1}{2}\left|\Phi\left(\frac{z}{2}\right)\right|
+\frac{1}{2}\left|\Phi\left(\frac{z+\pi}{2}\right)\right| ,
\label{equ33ter}
\end{equation}
%
and observe that both $z/2$ and $(z+\pi)/2$ belong to $C_n$, hence 
$|\Phi(z/2)|\le M $, $|\Phi((z+\pi)/2)|\le M $; from (\ref{equ33ter}) we 
conclude that $|\Phi(z)|\le M$ (for $z$ in $C_{n+1})$. Every complex number 
belongs to some set $C_n$, hence $|\Phi(z)|\le M$ for all $z$.

b) We appeal now to \textbf{Liouville's theorem} to conclude that $\Phi$, 
being a bounded entire function, is a constant, hence $\Phi(z)=\Phi(0)$.

c) The function $\Phi$ is \textbf{odd}, that is $\Phi(-z)=-\Phi(z)$, hence 
$\Phi(0)=0$.

Liouville's theorem, the main ingredient in this proof, was proved around 
$1850$, a century after Euler worked on these questions. It is interesting 
to note that the d'Alembert-Gau\ss~theorem is an easy corollary of Liouville's 
theorem. [\textbf{Hint}: if $P(z)$ is a polynomial without roots, the 
function $\Phi(z)=1/P(z)$ is entire and bounded, hence a constant; that is,  
$P(z)$ is a constant.]

\subsection{Sums of powers of numbers}
The other result of Euler about $\zeta(s)$ can be stated as follows:
%
\begin{equation}
1^k+2^k+3^k+\ldots=-\frac{B_{k+1}}{k+1}
\label{equ34ter}
\end{equation}
%
for $k=1,2,3,\ldots$ It looks at first odd, since it gives a finite 
value for an infinite sum of positive numbers, obviously divergent since 
each term is at least $1$. Euler's derivation is more or less as follows.

Formula (\ref{equ9ter}) can be written as 
%
\begin{equation}
1^k-2^k+3^k-4^k+\ldots=-(1-2\cdot 2^k)\frac{B_{k+1}}{k+1}.
\label{equ35ter}
\end{equation}
%
On the other hand, multiply the left-hand side of (\ref{equ34ter}) by 
$1-2\cdot 2^k$ and rearrange. This yields 
 
\vspace{1.0ex}

~~~~~~$(1-2\cdot 2^k)(1^k+2^k+3^k+\ldots)=$ 


~~~~~~$~~1^k+2^k+3^k+4^k+5^k+6^k+\ldots$ 

$-2(~~~~~~~~~~2^k~~~~~~+4^k~~~~~~~+6^k+\ldots)$

~~~~--------------------------------------------

$=~~~~~1^k-2^k+3^k-4^k+5^k-6^k+\ldots$

\vspace{1.0ex}

\noindent and finally (\ref{equ34ter}) is obtained from (\ref{equ35ter}). 

This procedure is highly questionable, but can be fixed as follows. We 
introduce  two functions 
%
\begin{equation}
\zeta(s)=\sum_{n\ge1}n^{-s},~~\eta(s)=\sum_{n\ge1}(-1)^{n-1}n^{-s}.
\label{equ36ter}
\end{equation}
%
Provided these functions can be continued analytically to the negative 
integers, formula (\ref{equ34ter}) and (\ref{equ35ter}) read respectively 
as 
%
\begin{eqnarray}
&&\zeta(-k)=-\frac{B_{k+1}}{k+1}\nonumber 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\, (34') 
\\&&\eta(-k)=(2^{k+1}-1)\frac{B_{k+1}}{k+1}\nonumber
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (35') 
\label{equ35bter}
\end{eqnarray}
%
for $k=1,2,\ldots$ Furthermore
%
\begin{eqnarray}
&&\eta(s)=\sum_{\mathrm{n~odd}}n^{-s}-\sum_{\mathrm{n~even}}n^{-s}=\sum_{n\ge1}n^{-s}-2
\sum_{\mathrm{n~even}}n^{-s}\nonumber \\
&&~~~~~~~~~~=\zeta(s)-2\sum_{m\ge1}(2m)^{-s}\nonumber
\label{equ36bter}
\end{eqnarray}
%
 and finally 
%
\begin{equation}
\eta(s)=(1-2^{1-s})\zeta(s).
\label{equ37ter}
\end{equation}
%
Our manipulation of series is justified as long as $\Re (s) >1$, but the 
final formula remains valid for all $s$ for which both $\zeta(s)$ and 
$\eta(s)$ are regular (analytic continuation!). In particular 
%
\begin{equation}
\eta(-k)=(1-2^{k+1})\zeta(-k).
\label{equ38ter}
\end{equation}
% 
Hence formulas (34$\mathrm{'}$) and (35$\mathrm{'}$) are equivalent, 
substantiating Euler's derivation. 

Using the known values of the Bernoulli numbers, we deduce 
%
\begin{eqnarray}
&&\zeta(-2)=\zeta(-4)=\zeta(-6)=\ldots=0\nonumber\\ 
&&\zeta(-1)=-\frac{1}{12},~\zeta(-3)=\frac{1}{120},~\zeta(-5)=-\frac{1}{252},\ldots\nonumber. 
\end{eqnarray}
% 
It can also be shown that $\zeta(0)=-1/2$. Hence we get the paradoxical 
results: 
%
\begin{eqnarray}
&&\nonumber 
\zeta(0)=1+1+1+\ldots=-\frac{1}{2}\\&&\zeta(-1)=1+2+3+\ldots=-\frac{1}{12}\nonumber .
\end{eqnarray}
% 

Among the many methods available to construct the analytical continuation 
of $\zeta(s)$, we select the following one using $\eta(s)$. Indeed, from 
Euler's definition of the gamma function 
%
\begin{equation}
\Gamma(s)=\int_0^{\infty}e^{-t}t^{s-1}dt
\label{equ39ter}
\end{equation}
% 
(for $\Re (s)>1$), we deduce by a simple change of variable
%
\begin{equation}
\Gamma(s)n^{-s}=\int_0^{\infty}e^{-nt}t^{s-1}dt.
\label{equ40ter}
\end{equation}
% 
By summation and term by term integration
%
\begin{eqnarray}
&&\Gamma(s)\eta(s)=\sum_{n\ge1}(-1)^{n-1}\Gamma(s)n^{-s}=\sum_{n\ge1}(-1)^{n-1}\int_0^{\infty}e^{-nt}t^{s-
1
}dt\nonumber\\ 
&&~~~~~~=\int_0^{\infty}
\bigg\{\sum_{n\ge1}(-1)^{n-1}e^{-nt}\bigg\}\cdot t^{s-1}dt\nonumber, 
\end{eqnarray}
% 
that is
%
\begin{equation}
\Gamma(s)\eta(s)=\int_0^{\infty}\frac{t^{s-1}}{e^t+1}dt.
\label{equ41ter}
\end{equation}
% 
All the calculations are justified as long as $\Re (s)>1$. We use now a 
general principle, unknown to Euler, to deal with integrals of the type 
%
\begin{equation}
\Phi(s)=\int_0^{\infty}F(t)t^{s-1}dt.
\label{equ42ter}
\end{equation}
% 
We split the integral as $\int_0^1+\int_0^{\infty}$.

a) Assuming that $F(t)$ decreases at infinity faster than any power 
$t^{-k}$ (for $k\ge0$), then
%
\begin{equation}
\Phi_{1,\infty}(s)=\int_1^{\infty}F(t)t^{s-1}dt
\label{equ43ter}
\end{equation}
%  
extends to an entire function.

b) Assuming that $F(t)$ is differentiable to any order on the closed 
interval $[0,1]$, then 
%
\begin{equation}
\Phi_{0,1}(s)=\int_0^1F(t)t^{s-1}dt
\label{equ44ter}
\end{equation}
% 
extends to a meromorphic function in the complex plane $\mathbf{C}$. The 
only singularities\footnote{\textbf{Hint}: integrate by parts using 
$\frac{d}{dt}t^{-s}=-st^{-s-1}$.} are at $s=0,-1,-2,\ldots$ with singular 
part  $\frac{\mathbf{D}^kF(0)}{k!(s+k)}$ around $s=-k$. 

Applying this principle to the definition (\ref{equ39ter}) of $\Gamma(s)$ 
we recover the well-known fact that $\Gamma(s)$ extends to a meromorphic 
function, with poles at $s=0,-1,-2,\ldots$ and singular part 
$\frac{(-1)^k}{k!(s+k)}$ around $s=-k$. We use now formula (\ref{equ41ter}) 
for $\Gamma(s)\eta(s)$. Hence this function extends to a meromorphic 
function with poles at $s=0,-1,\ldots$ and singular part $\frac{c_k}{s+k}$ 
around $s=-k$, where 
%
\begin{equation}
\frac{1}{e^t+1}=\sum_{k\ge0}c_kt^k.
\label{equ45ter}
\end{equation}
% 
According to (\ref{equ8ter}), we get
%
\begin{equation}
c_k=\frac{(1-2^{k+1})B_{k+1}}{(k+1)!}.
\label{equ46ter}
\end{equation}
% 
Dividing $\Gamma(s)\eta(s)$ by $\Gamma(s)$, the poles cancel; hence 
$\eta(s)$ extends to an entire function, and comparing the singular parts 
of $\Gamma(s)\eta(s)$ and $\Gamma(s)$ around $s=-k$, we find
%
\begin{equation}
\eta(-k)=(-1)^kk!c_k.
\label{equ47ter}
\end{equation}
% 
We have to distinguish between several cases:

$\bullet$ $k=0$ yields $\eta(0)=c_0=-B_1=\frac{1}{2}$;

$\bullet$ $k\ge2$ is even yields $\eta(-k)=0$ since $B_r=0$ for $r=k+1$ odd;

$\bullet$ $k\ge1$ is odd, then $(-1)^k=-1$ and 
%
\begin{equation}
\eta(-k)=\frac{(2^{k+1}-1)B_{k+1}}{k+1}.
\label{equ48ter}
\end{equation}
% 
This is Euler's formula (\ref{equ35ter}) or formula (35$\mathrm{'}$). The 
analytical continuation of $\zeta(s)$ can now be performed by using 
(\ref{equ37ter}), that is we define 
 %
\begin{equation}
\zeta(s)=\frac{\eta(s)}{1-2^{1-s}}.
\label{equ49ter}
\end{equation}
% 
Since $\eta(s)$ is entire, the only singularity of $\zeta(s)$ is a pole at 
$s=1$, with singular part $\frac{1}{s-1}$. We can now calculate $\zeta(-k)$  
from $\eta(-k)$ and get
 %
\begin{equation}
\zeta(-k)=-\frac{B_{k+1}}{k+1}
\label{equ50ter}
\end{equation}
% 
for $k=1,2,\ldots$, as expected. Furthermore, from $\eta(0)=\frac{1}{2}$ we 
get the remaining value 
%
\begin{equation}
\zeta(0)=-\eta(0)=-\frac{1}{2}.
\label{equ51ter}
\end{equation}
% 
\subsection{Variation I: Did Euler really fool himself?}
Bourbaki wrote (in \cite{Bourbaki1}, page~VI.29): 
\lq\lq Mais la tendance au calcul formel est la plus forte, et 
l'extraordinaire intuition d'Euler lui-m\^{e}me ne l'emp\^{e}che pas de 
tomber parfois dans l'absurde, lorsqu'il \'{e}crit par exemple 
$0=\sum_{n=-\infty}^{+\infty}x^n$". Did Euler really fool himself? 

To keep with our habits (after Cauchy!) denote by $z$ a complex variable 
and try  to evaluate the sum of $I=\sum_{n=-\infty}^{+\infty}z^n$. We break 
the sum into $I_++I_--1$, where 
 %
$$
I_+=\sum_{n\ge0}z^n,~~I_-=\sum_{n\le0}z^n \nonumber .
$$
%  
By the geometric series, we get $I_+=\frac{1}{1-z}$ and 
$I_-=\frac{1}{1-1/z}$ and simple algebra gives 
%
\begin{equation}
I_++I_-=\frac{1}{1-z}+\frac{z}{z-1}=\frac{1-z}{1-z}=1,
\label{equ52ter}
\end{equation}
% 
hence $I=0$ as claimed. What is paradoxical is that there is no complex 
number $z\neq0$ for which both series $I_+$ and $I_-$ converge 
simultaneously, since $\sum_{n\ge0}z^n$ converges for $|z|<1$ and 
$\sum_{n\le0}z^n$ converges for $|z|>1$. We really need analytical 
continuation: $I_+$ as a function of $z$ extends from the convergence 
domain $|z|<1$ to $\mathbf{C}-\{1\}$ as the rational function 
$\frac{1}{1-z}$, and one goes from $I_+$to $I_-$ by inverting $z$ (into 
$1/z$). If both $I_+$ and $I_-$ are extended in this way to 
$\mathbf{C}-\{1\}$, the calculation (\ref{equ52ter}) is perfectly valid, 
hence $I=0$ in this sense.

Another method to prove $I=0$ is to observe that multiplication of $I$ by $z$ 
shifts $z^n$ to $z^{n+1}$, hence rearranges the series, hence $Iz=I$, hence 
$I(z-1)=0$, and by dividing by $z-1$ we get $I=0$ for $z\neq 1$. 
Nevertheless, there is    some trouble. Consider the critical region 
$|z|=1$ where both $I_+$ and $I_-$ diverge, and use polar coordinates 
$z=e^{2\pi i u}$. Then $I$ is the series 
%
\begin{equation}
J(u)=\sum_{n=-\infty}^{+\infty}e^{2\pi in u}.
\label{equ53ter}
\end{equation}
%
Playing with Fourier series, introduce  a test function $f(u)$ supposed to 
be smooth (i.e., infinitely differentiable) and periodic, $f(u+1)=f(u)$. We 
expand it as a Fourier series 
%
\begin{equation}
f(u)=\sum_{n=-\infty}^{+\infty}c_ne^{2\pi in u}
\label{equ54ter}
\end{equation}
%
with
%
\begin{equation}
c_n=\int_0^1 e^{-2\pi inu}f(u)du.
\label{equ55ter}
\end{equation}
%
From (\ref{equ54ter}) we get, by putting $u=0$,
%
\begin{equation}
f(0)=\sum_{n=-\infty}^{+\infty}c_n
\label{equ56ter}
\end{equation}
%
hence by (\ref{equ55ter}) and (\ref{equ53ter})
%
$$
f(0)=\sum_{n=-\infty}^{+\infty}\int_0^1 e^{-2\pi 
inu}f(u)du=\int_0^1\left\{\sum_{n=-\infty}^{+\infty}e^{-2\pi 
inu}\right\}\cdot f(u)du,\nonumber 
$$
%
and finally
%
\begin{equation}
f(0)=\int_0^1J(u)f(u)du.
\label{equ57ter}
\end{equation}
%
Remove now the assumption $f(u+1)=f(u)$ by introducing a smooth function 
$\phi(u)$ vanishing off some finite interval and by defining 
%
\begin{equation}
f(u)=\sum_{m=-\infty}^{+\infty}\phi(u+m)
\label{equ58ter}
\end{equation}
%
(an absolutely convergent series). By an easy manipulation, one derives 
from (\ref{equ57ter}) 
%
\begin{equation}
\sum_{m=-\infty}^{+\infty}\phi(m)=\int_{-\infty}^{+\infty}J(u)\phi(u)du.
\label{equ59ter}
\end{equation}
%
Using the standard Dirac's function $\delta(u)$, we get by definition 
%
$$
\phi(m)=\int_{-\infty}^{+\infty}\phi(u)\delta(u-m)du ,\nonumber
$$
%
hence 
%
\begin{equation}
0=\int_{-\infty}^{+\infty}\left\{J(u)-\sum_{m=-\infty}^{+\infty}\delta(u-m)
\right\}\phi(u)du. 
\label{equ60ter}
\end{equation}
%
Since the test function $\phi$ is arbitrary, we can omit it from 
(\ref{equ60ter}), hence the conclusion 
%
\begin{equation}
J(u)=\sum_{m=-\infty}^{+\infty}\delta(u-m).
\label{equ61ter}
\end{equation}
%
That is, by substituting $e^{2\pi iu}$ to $z$, \textbf{the series} 
$I=\sum_{n=-\infty}^{+\infty} z^n$ \textbf{is not} $0$ \textbf{but} \\ 
$\sum_{m=-\infty}^{+\infty}\delta(u-m)$ . So Euler was wrong, but not too 
much, since $\delta(u-m)=0$ for $u\neq m$, hence  
$\sum_{n=-\infty}^{+\infty} z^n$ is $0$ for $z\neq 1$. 

Recall the other proof, using
%
\begin{equation}
I(z-1)=0;
\label{equ62ter}
\end{equation}
%
division by $z-1$ gives $I=0$, provided $z\neq 1$, corresponding to $u\notin 
\mathbf{Z}$ for $z=e^{2\pi i u}$. Formula (\ref{equ62ter}) is equivalent to 
%
\begin{equation}
J(u)(e^{2\pi i u}-1)=0,
\label{equ63ter}
\end{equation}
and this suggests a new proof of (\ref{equ61ter}). Indeed, if $f(u)$ is a 
smooth function with isolated simple zeroes $u_m$, then $J(u)f(u)=0$ implies  
that $J(u)$ is a linear combination of terms $c_m\delta(u-u_m)$. Here 
$f(u)=e^{2\pi i u}-1$, hence $u_m=m$ for $m$ in $\mathbf{Z}$, that is 
$m=0,\pm1,\pm2,\ldots$ hence $J(u)= \sum_{m=-\infty}^{+\infty} 
c_m\delta(u-m)$ for suitable coefficients $c_m$. But $J(u+1)=J(u)$, hence 
all coefficients $c_m$ are equal to some constant $c$ and 
$J(u)=c\sum_{m=-\infty}^{+\infty} \delta(u-m)$. It remains to calculate the 
normalization constant $c$. That kind of argument could be understood by 
Euler, but it acquires now a rigorous meaning due to Laurent Schwartz's 
theory of distributions ($200$ years after Euler!)\footnote{Our final 
result can be expressed as $\sum_{n=-\infty}^{+\infty}e^{2\pi 
inu}=\sum_{m=-\infty}^{+\infty}\delta(u-m)$. It is equivalent to Poisson's 
summation formula.}. 

Another version of our proof is by using a contour integral (see
Fig.~\ref{path}). 
%
\begin{figure}\centering 
\includegraphics[width=0.6\textwidth]{cartier5.pdf} 
\caption{Path for the contour integral}
\label{path} 
\end{figure}
%
Consider a function  $\Phi(z)$ holomorphic in a domain containing the 
annulus $r\le |z|\le R$ bounded by $C_+$ and $C_-$ (beware the 
orientations). The rational function $R(z)=\frac{1}{z-1}$ is given by a 
convergent series $\sum_{n=-\infty}^{-1} z^n$ for $|z|>1$, hence for $z$ in 
$C_+$. It follows 
 %
\begin{equation}
\int_{C_+}R(z)\Phi(z)dz=\sum_{n=-\infty}^{-1}\int_{C_+}z^n\Phi(z)dz.
\label{equ64ter}
\end{equation}
%
Similarly
%
\begin{equation}
\int_{C_-}R(z)\Phi(z)dz=\sum_{n=0}^{\infty}\int_{C_-}z^n\Phi(z)dz,
\label{equ65ter}
\end{equation}
%
and using the residue formula,
%
$$
\int_{C_+}\sum_{n=-\infty}^{-1}z^n\cdot
\Phi(z)dz+\int_{C_-}\sum_{n=0}^{\infty}z^n\cdot \Phi(z)dz=2\pi i 
\Phi(1).\nonumber
$$
%
A shorthand would be
%
\begin{equation}
\sum_{n=-\infty}^{+\infty}z^n=2\pi i \delta(z-1)
\label{equ66ter}
\end{equation}
%
using $\delta$-functions in the complex domain\footnote{A classical 
formula is
%
$$
\delta(f(u))=\sum_m \frac{1}{|f'(u_m)|}\delta(u-u_m)\nonumber,
$$
% 
the summation being extended over the solutions of the equation $f(u_m)=0$  
(provided $f'(u_m)\neq 0$). In this formula $f(u)$ is a real-valued 
function of a real variable $u$. Assuming that it remains valid for 
$f(u)=e^{2\pi i u}-1$ (with complex values), we derive
%
$$
2\pi i \delta(z-1)=\sum_{m=-\infty}^{+\infty}\delta(u-m)\nonumber
$$
% 
for $z=e^{2\pi i u}$. This brings together our two methods.}. 

Let us go back to sums of powers and Bernoulli numbers and polynomials. A 
classical formula reads as follows:
%
\begin{equation}
B_k(u)=-k!\sum_{n\neq 0}\frac{e^{2\pi inu}}{(2\pi i n)^k}.
\label{equ67ter}
\end{equation}
%
A complex version is the following
%
$$
\kern4cm\kern-7.34pt
\sum_{n\neq 0}\frac{z^n}{n^k}=
-\frac{(2\pi i)^k}{k!}B_k\left(\frac{\log z}{2 \pi i}\right)
\nonumber \kern-7.34pt\kern4cm(68)_k
\label{equ68kter}
$$
%
(by the change of variables $z=e^{2\pi iu}$) for $k=0,1,2,\ldots$. For 
$k=0$, this reads as Euler's \lq\lq absurd formula" 
%
$$
\kern6cm\kern-21.61pt
\sum_{n\neq 0}z^n=-1,
\nonumber \kern6cm\kern-21.61pt
(68)_0
\label{equ680ter}
$$
%
since $B_0(x)=1$. The case $k=1$ is
%
$$
\kern6cm\kern-37.01pt
\sum_{n\neq 0}\frac{z^n}{n}=\pi i -\log z; 
\nonumber \kern6cm\kern-37.01pt
(68)_1
\label{equ681ter}
$$
%
of course
%
\begin{eqnarray}
&&\sum_{n=1}^{\infty}\frac{z^n}{n}=\log\frac{1}{1-z}\nonumber\\ && 
\sum_{n=-\infty}^{-1}\frac{z^n}{n}=-\sum_{m=1}^{\infty}\frac{(z^{-1})^m}{m}=-\log \frac{1}{1-1/z}\nonumber 
\end{eqnarray}
%
and $\mathrm{(68)_1}$ amounts to 
\setcounter{equation}{68}
%
\begin{equation}
\log\frac{1}{1-z}-\log \frac{1}{1-1/z}=\pi i -\log z.
\label{equ69ter}
\end{equation}
%
Since $\log (-1)=-\pi i$ and 
$\frac{1}{1-1/z}=\frac{z}{z-1}=\frac{z\cdot (-1)}{1-z}$, this relation follows 
from $\log uv=\log u+\log v$, but some care has to be exercized with the 
multivalued complex logarithm (notice the ambiguity $\log(-1)=\pm 
\pi i$ for instance). 

For the general case, notice the following. Define 
%
\begin{equation}
L_k(z)=\sum_{n\neq 0}\frac{z^n}{n^k},~~R_k(z)=-\frac{(2\pi 
i)^k}{k!}B_k\left(\frac{\log z}{2 \pi i}\right). 
\label{equ70ter}
\end{equation}
%
We know already that $L_0(z)=R_0(z)$ and $L_1(z)=R_1(z)$. Furthermore, it 
is obvious that  
%
\begin{equation}
z\frac{d}{dz}L_k(z)=L_{k-1}(z),
\label{equ71ter}
\end{equation}
%
and from the fact that the derivative of $B_k(x)$ is $kB_{k-1}(x)$, one 
gets 
%
\begin{equation}
z\frac{d}{dz}R_k(z)=R_{k-1}(z).
\label{equ72ter}
\end{equation}
%
So we can easily conclude that $L_2(z)-R_2(z)$ is a constant, which has to 
be shown to be $0$ to prove $L_2(z)=R_2(z)$. Then $L_3(z)-R_3(z)$ is a 
constant, etc\dots

This line of argument can be made rigorous. Introduce the polylogarithmic 
functions\footnote{The dilogarithm $Li_2(z)$ was known to Euler, and 
further developed in the $19^\mathrm{th}$ century in connection with 
Lobatchevski geometry. Fifteen years ago, the subject was almost forgotten, 
to be resurrected by geometers and mathematical physicists alike. It is now 
a hot subject of research.} 
%
\begin{equation}
\mathrm{Li}_k(z)=\sum_{n=1}^{\infty}\frac{z^n}{n^k}.
\label{equ73ter}
\end{equation}
%
Our formula reads now as follows:
%
$$
\kern4cm\kern-47.87pt
\mathrm{Li}_k(z)+(-1)^k\mathrm{Li}_k\left(\frac{1}{z}\right)=
-\frac{(2\pi i)^k}{k!}B_k\left(\frac{\log z}{2 
\pi i}\right). \kern4cm\kern-47.87pt
(74)_k\nonumber
\label{equ74ter}
$$
%
\setcounter{equation}{74}
To make sense of it, we proceed as follows:

a) We cut the complex plane along the real interval $[0,+\infty[$, to get 
$\Omega_0=\mathbf{C}-[0,+\infty[$.

b) In the cut plane, we choose the somewhat unusual branch of the  
logarithm $\log (r e^{i\theta})=\log r +i\theta$ for $0<\theta<2\pi$. 

c) We define the function $\mathrm{Li}_k(z)$ by the convergent series 
(\ref{equ73ter})  for $|z|<1$, and verify that 
%
\begin{equation}
z\frac{d}{dz}\mathrm{Li}_k(z)=\mathrm{Li}_{k-1}(z)
\label{equ75ter}
\end{equation}
%
for $k=1,2,\ldots$ and $\mathrm{Li}_0(z)=\frac{z}{1-z}$. Since the cut 
plane $\Omega_1=\mathbf{C}-[1,\infty[$ is simply connected, any holomorphic 
function in $\Omega_1$ has a primitive, hence by (\ref{equ75ter}), each 
$\mathrm{Li}_k(z)$ extends analytically to $\Omega_1$. 

d) For $z$ in $\Omega_0$, both $z$ and $\frac{1}{z}$ are in $\Omega_1$, 
hence both $\mathrm{Li}_k(z)$ and $\mathrm{Li}_k(\frac{1}{z})$ are defined 
for $z$ in $\Omega_0$, and formula $(74)_k$ is asserted for $z$ in 
$\Omega_0$. 

e) The cases $k=0$ and $k=1$ are settled as before.

f) From (\ref{equ75ter}) and the rule for the derivative of $B_k(x)$, we 
get   that the validity of $(74)_k$ for the index $k$ implies that of 
$(74)_{k+1}$ for the index $k+1$ up to the addition of a constant. To show 
that it is $0$ use the fact that for $k\ge2$, the series 
$\sum_{n=1}^{\infty}\frac{z^n}{n^k}$ converges also for $|z|=1$, and study 
the limiting value for $z\rightarrow 1$, using $B_k(0)=B_k(1)$. 

\textbf{So after all, Euler was right!}

Putting $z=1$ in $(74)_k$ we obtain the value of $\zeta(k)+(-1)^k 
\zeta(k)$. For $k$ odd, we get $0=0$, but for $k$ even, we recover the value of $\zeta(k)$ 
given by (\ref{equ24ter}).

\subsection{Variation II: Infinite products}

Suppose we want to calculate $\infty!=1\cdot 2\cdot 3\cdots$. Going to logarithms we 
define 
%
\begin{equation}
\infty !=\exp\left(\sum_{n=1}^{\infty}\log n\right).
\label{equ76ter}
\end{equation}
%
Suppose we have a series $\sum_{n\ge1}a_n$ with $na_n$ bounded. What is known as the 
$\zeta$-\textbf{summation procedure} fits our general framework at the beginning of
Section~4.1: 
consider the convergent series $\sum_{n\ge1}a_n n^{-\epsilon}$ for 
$\epsilon >0$, and let $\epsilon$ tend to $0$. To sum the series 
$\sum_{n\ge1}\log n$, we should consider
$\sum_{n\ge1}n^{-\epsilon}\cdot \log n$ 
but this converges for $\epsilon >1$ only and we cannot go directly to the 
limit $\epsilon \rightarrow 0$. What we have to do is to consider the 
series $\sum_{n\ge1}n^{-s}\cdot \log n$ for $\Re( s)>1$; this is obviously the 
derivative $-\zeta'(s)$ of the Riemann zeta function, hence it can be 
analytically continued to the neighborhood of $0$. The regularized sum of 
$\sum_{n\ge 1}\log n$ is then $-\zeta'(0)$ and finally 
%
\begin{equation}
\infty !=e^{-\zeta'(0)}.
\label{equ77ter}
\end{equation}
%
From the formulas (\ref{equ37ter}) and (\ref{equ41ter}), one derives 
without much ado $\zeta'(0)=-\frac{1}{2}\log 2\pi$ (a formula more or less  
equivalent to Stirling's formula). Conclusion: 
%
\begin{equation}
\infty !=\sqrt{2\pi}.
\label{equ78ter}
\end{equation}
%

\textbf{General rule}: to normalize a divergent product $\Pi_{n\ge1}a_n$, 
introduce the series $\sum_{n\ge1}a_n^{-s}=Z(s)$, make an analytic 
continuation from the convergence domain $\Re (s)>\sigma_0$ to $s=0$ and 
define 
%
\begin{equation}
\prod_{n\ge 1}^{\mathrm{reg}}a_n:=e^{-Z'(0)}.
\label{equ79ter}
\end{equation}
%
Generalizing slightly (\ref{equ78ter}), we can use this method to prove the 
identity\footnote{Formally: $\frac{\infty!}{(\infty+v)!}=\Gamma(v+1)$.} 
%
\begin{equation}
\prod_{n\ge 0}^{\mathrm{reg}}(n+v)=\frac{\sqrt{2\pi}}{\Gamma(v)}.
\label{equ80ter}
\end{equation}
%
We can compare this to the Weierstra{\ss} product expansion for the gamma 
function 
%
\begin{equation}
\frac{1}{\Gamma(v)}=ve^{\gamma
v}\prod_{n\ge1}\left(1+\frac{v}{n}\right)e^{-v/n}.
\label{equ81ter}
\end{equation}
%
A careless, but nevertheless instructive, comparison of (\ref{equ80ter}) 
and (\ref{equ81ter}) is as follows: 
%
\begin{eqnarray}
&&\prod_{n\ge1}\left(1+\frac{v}{n}\right)e^{-v/n}=
\prod_{n\ge1}\frac{n+v}{n}e^{-v/n}\nonumber  
\\
&&=\prod_{n\ge1}(n+v)\bigg(\prod_{n\ge1}n\bigg)^{-1} \exp\bigg(-v 
\sum_{n\ge1}1/n\bigg)\nonumber 
\end{eqnarray}
%
and regularizing the divergent series $\sum_{n\ge1}\frac{1}{n}$ by the 
Euler constant $\gamma$, we are through! Notice that the two most important  
properties of $\Gamma$, namely

1) the functional equation $\Gamma(v+1)=v\Gamma(v)$;

2) the function $\frac{1}{\Gamma(v)}$ of a complex variable $v$ is entire 
with zeroes at $0,-1,-2,\ldots$ 

\noindent can be read off immediately from 
(\ref{equ80ter}).

According to our general method, the proof of (\ref{equ80ter}) requires to 
study the function  
%
\begin{equation}
\zeta(s,v)=\sum_{n\ge0}(n+v)^{-s}
\label{equ82ter}
\end{equation}
%
known as Hurwitz zeta function (see \cite{Waldschmidt} for more details). 
We list a few properties: 

a) \textbf{a particular case} $\zeta(s,1)=\zeta(s)$; 

b) \textbf{functional equations}: 

\begin{eqnarray}
&&\zeta(s,v+1)=\zeta(s,v)-v^{-s} \\
 &&\partial_v\zeta(s,v)=-s\zeta(s+1,v); 
\label{equ84ter}
\end{eqnarray}

c) \textbf{analytic continuation}: for fixed $v$, $\zeta(s,v)$ can be 
analytically continued to the complex plane with one singularity at $s=1$, 
with singular part $\frac{1}{s-1}$; hence $\zeta(s,v)-\zeta(s)$ is an 
entire function; 

d) \textbf{special values}: 
%
\begin{equation}
\zeta(-k,v)=-\frac{B_{k+1}(v)}{k+1}
\label{equ85ter}
\end{equation}
%
for $k=0,1,2,\ldots$

The last relation can be written, in the spirit of Euler, as 
%
\begin{equation}
v^k+(v+1)^k+(v+2)^k+\ldots=-\frac{B_{k+1}(v)}{k+1}.
\label{equ86ter}
\end{equation}
As a particular case we get the surprising identity
%
\begin{equation}
v^0+(v+1)^0+(v+2)^0+\ldots=\frac{1}{2}-v.
\label{equ87ter}
\end{equation}
%
\section{Conclusion: From Euler to Feynman}
\setcounter{equation}{0}
Feynman is the modern heir to Euler. Among his many contributions to 
theoretical physics, the most famous one is his use of diagrams to encode 
in a   very compact way complicated integrals with significance in 
experiments in high energy physics. His method of diagrams has been 
generalized by various  authors (Cvitanovic, Penrose,\dots) to provide a very 
flexible tool for computations in tensor analysis. 

His really bold discovery is the use of integrals in function spaces (see 
for instance \cite{Dewitt}), the so-called Feynman path integrals. These 
(so far) ill-defined integrals are powerful tools to evaluate infinite 
series and infinite products. We give just one example. Consider the 
Hilbert space  $L^2(0,2\pi)$ of functions $f(x)$ with $0<x<2\pi$ and 
$\int_0^{2\pi}|f(x)|^2dx$  finite. The unbounded operator 
$\Delta=-d^2/dx^2$ can be diagonalized with eigenfunctions $e_n(x)=e^{inx}$  
(for $n=0,\pm1,\pm2,\ldots$) corresponding to the eigenvalue $n^2$. Hence 
the characteristic determinant $\det(v-\Delta)$ is an entire function with 
the eigenvalues as zeroes. Using our regularized products, one now defines 
the regularized determinant as 
%
\begin{equation}
\mathrm{det}^{\mathrm{reg}}(v-\Delta)=
v\left(\prod_{n\ge1}^{\mathrm{reg}}(v-n^2)^2\right)~~~~
\label{equC1}
\end{equation}
%
($0$ is a simple eigenvalue, and $1^2,2^2,\ldots$ are eigenvalues of 
multiplicity $2$). This can be evaluated by a formula due to Euler 
%
\begin{equation}
\sin v=v\prod_{n\ge1}\left(1-\frac{v^2}{n^2\pi^2}\right),
\label{equC2}
\end{equation}
%
equivalent (via logarithmic derivatives) to the formula
%
\begin{equation}
\cot v=\frac{1}{v}+\sum_{n\ge1}\frac{2v}{v^2-n^2\pi^2}
\label{equC3}
\end{equation}
%
also due to Euler, and considered above.

Feynman's bold step is as follows. From matrix calculus, we learn the 
following integral formula for a characteristic determinant 
%
\begin{equation}
\det(v-A)=\left[\int_{\mathbf{R}^n}d^nx ~\exp
-\pi\left(v\sum_{i=1}^nx_i^2-\sum_{i,j}a_{i,j}x_ix_j\right)\right]^{-2}
\label{equC4}
\end{equation}
%
where $d^nx$ is the volume element $dx_1\cdots dx_n$ in the Euclidean space 
$\mathbf{R}^n$, and $A=(a_{i,j})$ is a real symmetric, positive definite, 
matrix of size $n\times n$. By analogy, Feynman writes $\det(v-\Delta)$ as 
the square of the inverse of
%
\begin{equation}
\int_{L^2(0,2\pi)}\mathcal{D}x\cdot \exp(-\pi S(x)),
\label{equC5}
\end{equation}
%
where the so-called action $S(x)$ is defined by
%
\begin{equation}
S(x)=v\int_0^{2\pi}x(t)^2dt + \int_0^{2\pi}x'(t)^2dt
\label{equC6}
\end{equation}
%
(the variable in $[0,2\pi]$ is denoted by $t$, the function in 
$L^2(0,2\pi)$ by $x(t)$, and its derivative by $x'(t)$). The symbol 
$\mathcal{D}x$ is formally a volume element in the Hilbert space 
$L^2(0,2\pi)$ (infinite-dimensional generalization of the Euclidean space 
$\mathbf{R}^n$), sometimes written as $C\prod_t dx(t)$. Its rigorous 
definition is the main problem \cite{Dewitt}. 

Part of these calculations have been put into a rigorous framework, but not  
all of them.

\textbf{After all, Feynman shall be right!}

\section*{Acknowledgements} Ils vont \`{a} Michel Planat et Michel Waldschmidt,
pour leur impeccable organi\-sation du colloque de Chapelle des Bois, et 
pour leur conviction que ce texte verrait le jour. Leurs amicales pressions  
et leur aide mat\'erielle ont grandement contribu\'e \`{a} l'heureux 
d\'enouement. Merci aussi \`a Christian Krattenthaler pour une relecture extr\^emement soign\'ee de ce 
texte.

\begin{thebibliography}{}
\addcontentsline{toc}{section}{References}

\bibitem{Borel} Borel E. (1902) Le\c{c}ons sur les s\'{e}ries \`a termes positifs. 
Gauthier-Villars. 

\bibitem{Bourbaki1} Bourbaki N. (1976) Fonctions d'une variable r\'eelle (chapitres 1 \`a 7).
Hermann, edited by Masson since 1980. 

\bibitem{Bourbaki2} Bourbaki N. (1984) El\'ements d'histoire des math\'ematiques.
Masson.

\bibitem{Cartier} Cartier P. (1966) Quantum mechanical relations and theta functions, 
in Algebraic groups and discontinuous subgroups. Proc. Symp. Pure Maths, 
vol.~IX, American Mathematical Society, Providence, 361-383. 

\bibitem{Dewitt} DeWitt-Morette C., Cartier P. and Folacci P. (editors) (1997) Functional Integration,
Basics and Applications, Nato Asi, Series B, vol 361. Plenum Press.

\bibitem{Euler} Euler L. (1755) Institutiones calculi differentialis cum ejus usu in analysi infinitorum 
ac doctrina serierum. Berlin, 1910. 

\bibitem{Hardy} Hardy G. H. (1924) Orders of infinity. Cambridge Univ. Press
(reprinted by Hafner, New York, 1971).

\bibitem{Knopp} Knopp K. (1964) Theorie und Anwendung der unendlichen Reihen. Springer (5th
edition).

\bibitem{Waldschmidt} Waldschmidt M. {\it et al} (1995) From Number Theory to Physics. Springer
(see Chapter~1 by P.~Cartier: An introduction to zeta functions).

\bibitem{Weil} Weil A. (1984) Number Theory, An Approach Through History.
Birkh\"auser.

\end{thebibliography}


\end {document}


