of score-based diffusion model. The ODE can be derived from the Fokker-Planck equation, which we describe in the generality of Feller processes below.
Semigroups
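Let $X$ be a Banach space. Suppose that to every $t \in [0, \infty)$ we associate a bounded linear mapping (also called an operator) $Q(t) \in \mathcal{B}(X)$ so as to have
(i) $Q(0) = I$ is the identity,
(ii) $Q(s + t) = Q(s)Q(t)$ for all $s, t \ge 0$,
(iii) $\lim_{t \to 0} \lVert Q(t)x - x \rVert = 0$ for every $x \in X$.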
If (i) and (ii) hold, the family of operators $\{Q(t)\}$ is called a one-parameter semigroup. In view of (ii) we shall later see that under favourable circumstances it is possible to represent the operators in exponential form $Q(t) = e^{tA}$ for an operator $A$. To begin this investigation, for every $\varepsilon > 0$ we associate to $Q$ the operator
$$A_\varepsilon x = \frac{Q(\varepsilon)x - x}{\varepsilon}$$
for every $x \in X$. Define the operator
$$Ax = \lim_{\varepsilon \to 0} A_\varepsilon x$$
whenever the limit exists (in the topology generated by the norm of the Banach space $X$), in which case we say $x$ is in the domain $D(A)$ of $A$. Observe that $D(A)$ is a vector subspace of $X$ and $A : D(A) \to X$ is a linear operator. The operator $A$ is called the infinitesimal generator of the semigroup $\{Q(t)\}$.
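The definitions above can be made concrete in finite dimensions. The following is a minimal sketch, assuming the illustrative choice $X = \mathbb{R}^2$ with $Q(t)$ the rotation by angle $t$ (this example is not from the text). It checks the semigroup law numerically and shows the difference quotient $A_\varepsilon$ approaching the matrix $A$ with $Q(t) = e^{tA}$.

```python
import math

def Q(t):
    """Rotation by angle t: a semigroup of operators on R^2."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]

def matmul(P, R):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * R[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(P, R):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(P[i][j] - R[i][j]) for i in range(2) for j in range(2))

def difference_quotient(eps):
    """A_eps = (Q(eps) - I) / eps, the approximation to the generator."""
    q = Q(eps)
    identity = [[1.0, 0.0], [0.0, 1.0]]
    return [[(q[i][j] - identity[i][j]) / eps for j in range(2)]
            for i in range(2)]

# Generator of the rotation group: Q(t) = exp(tA) with A = [[0, -1], [1, 0]].
A = [[0.0, -1.0], [1.0, 0.0]]

# Semigroup law Q(s + t) = Q(s) Q(t) for one pair (s, t).
semigroup_error = max_abs_diff(Q(0.3 + 1.1), matmul(Q(0.3), Q(1.1)))
# The difference quotient at a small eps should be close to A.
generator_error = max_abs_diff(difference_quotient(1e-6), A)
print(semigroup_error, generator_error)
```

In this finite-dimensional case every vector lies in the domain of the generator, which is the favourable situation in which the exponential form holds exactly.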
Exponential Representation of Semigroups
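We can say several things about a semigroup $\{Q(t)\}$ satisfying (i)-(iii) above:
(1) There are constants $C, \gamma$ such that $\lVert Q(t) \rVert \le C e^{\gamma t}$ for every $t \ge 0$.
(2) $t \mapsto Q(t)x$ is a continuous map of $[0, \infty)$ into $X$ for every $x \in X$.
(3) $D(A)$ is dense in $X$ and $A$ is closed.
(4) The differential equation $\frac{d}{dt} Q(t)x = AQ(t)x = Q(t)Ax$ holds for every $x \in D(A)$.
(5) For every $x \in X$ the convergence $Q(t)x = \lim_{\varepsilon \to 0} \exp(tA_\varepsilon)x$ is uniform on every compact subset of $[0, \infty)$.
(6) If $\lambda \in \mathbb{C}$ with real part $\operatorname{Re} \lambda > \gamma$, then the integral $R(\lambda)x = \int_0^\infty e^{-\lambda t} Q(t)x \, dt$ defines a bounded linear operator $R(\lambda)$, called the resolvent of the semigroup, whose range is $D(A)$ and which is the inverse of $\lambda I - A$.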
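Property (5) above is almost the exponential representation we set out to find, but under what conditions does $Q(t) = e^{tA}$ hold exactly? The following theorem answers this: any one of these three statements implies the other two.
(a) $D(A) = X$.
(b) $\lim_{\varepsilon \to 0} \lVert Q(\varepsilon) - I \rVert = 0$.
(c) $A \in \mathcal{B}(X)$ and $Q(t) = e^{tA}$ for every $t \ge 0$.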
The representation of semigroups specialized to Hilbert space is also interesting to describe: if the semigroup consists of normal operators satisfying condition (iii), then for a normal infinitesimal generator $A$ we have $Q(t) = e^{tA}$. Similarly, if the semigroup satisfying (iii) consists of unitary operators, then we have the representation $Q(t) = e^{itS}$ for some self-adjoint operator $S$.
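Two of the properties in this section lend themselves to a quick numerical check: the approximation of $Q(t)$ by exponentials of the difference quotients, and the resolvent integral. The sketch below assumes the simplest possible setting, the scalar semigroup $Q(t) = e^{at}$ on $X = \mathbb{R}$ with a hypothetical value $a = -0.5$, so that the growth bound is $\gamma = a$ and the resolvent should equal $1/(\lambda - a)$.

```python
import math

a = -0.5  # generator of the scalar semigroup Q(t) = exp(a t) on X = R

def Q(t):
    return math.exp(a * t)

def A_eps(eps):
    """Difference quotient (Q(eps) - 1) / eps approximating the generator a."""
    return (Q(eps) - 1.0) / eps

def approx_Q(t, eps):
    """exp(t A_eps), which should converge to Q(t) as eps -> 0."""
    return math.exp(t * A_eps(eps))

def resolvent(lam, upper=60.0, steps=200_000):
    """Midpoint rule for the integral of exp(-lam t) Q(t) over [0, upper]."""
    h = upper / steps
    return h * sum(math.exp(-lam * (k + 0.5) * h) * Q((k + 0.5) * h)
                   for k in range(steps))

# Error of the exponential approximation at t = 2 for shrinking eps.
errors = [abs(approx_Q(2.0, eps) - Q(2.0)) for eps in (1e-1, 1e-2, 1e-3)]

# R(lam) should invert (lam - a); lam = 1 satisfies lam > gamma = a.
lam = 1.0
resolvent_error = abs(resolvent(lam) * (lam - a) - 1.0)
print(errors, resolvent_error)
```

The truncation of the integral at `upper` and the midpoint rule are conveniences of this sketch, not part of the theory.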
Hille-Yosida Theorem
Which operators generate semigroups? Conversely, what properties do infinitesimal generators of semigroups have? The answer lies in the Hille-Yosida theorem, which states:
A densely defined operator $A$ in a Banach space $X$ is the infinitesimal generator of a semigroup $\{Q(t)\}$ satisfying properties (i)-(iii) above if and only if there are constants $C, \gamma$ such that
$$\lVert (\lambda I - A)^{-m} \rVert \le C (\lambda - \gamma)^{-m}$$
for all $\lambda > \gamma$ and all positive integers $m$.
The starting point for the study of Feller processes is to regard the transition function (t.f.) of a Markov process as a family of operators on a function space. Concretely, let $P_{s,t}(x, \Gamma)$ be a (possibly time dependent) t.f. and let $f$ be a function from a suitable function space; then the integral
$$P_{s,t} f(x) = \int P_{s,t}(x, dy) f(y)$$
defines an operator. In general, the Chapman-Kolmogorov relations are
$$\int P_{s,t}(x, dy) P_{t,v}(y, \Gamma) = P_{s,v}(x, \Gamma)$$
for any $s < t < v$. When the t.f. depends on $s$ and $t$ through the difference $t - s$ alone, we write $P_t$ for $P_{0,t}$ and call it a homogeneous transition function. In this situation the Chapman-Kolmogorov relations turn into
$$P_{s+t}(x, \Gamma) = \int P_s(x, dy) P_t(y, \Gamma),$$
a form that reminds us of the semigroup relation (ii).
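To make the semigroup structure precise, let $E$ be a locally compact space with countable base (abbrev. LCCB space), and let $C_0(E)$ be the set of real-valued continuous functions on $E$ that vanish at infinity. Recall that a positive operator maps positive functions to positive functions. A Feller semigroup on $C_0(E)$ is a family $(T_t)_{t \ge 0}$ of positive linear operators on $C_0(E)$ such that
(i) $T_0 = I$ and $\lVert T_t \rVert \le 1$ for every $t \ge 0$,
(ii) $T_{t+s} = T_t T_s$ for any pair $s, t \ge 0$,
(iii) $\lim_{t \downarrow 0} \lVert T_t f - f \rVert = 0$ for every $f \in C_0(E)$.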
Each Feller semigroup $(T_t)$ on $C_0(E)$ gives rise to a unique homogeneous transition function $P_t$ such that $T_t f(x) = P_t f(x)$ for every $f \in C_0(E)$ and every $x \in E$. We call the transition function associated to a Feller semigroup a Feller transition function. A t.f. $(P_t)$ is a Feller t.f. if and only if $P_t C_0(E) \subset C_0(E)$ for each $t$, and $\lim_{t \downarrow 0} P_t f(x) = f(x)$ for every $f \in C_0(E)$ and every $x \in E$. A proof can be found in
Reference
D. Revuz, M. Yor, Continuous Martingales and Brownian Motion, Grundlehren der mathematischen Wissenschaften, Third Edition
A Feller process is a Markov process having a Feller transition function.
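As a small illustration of these definitions, consider a finite state space, where every function is continuous and vanishes at infinity trivially. The sketch below assumes a two-state continuous-time Markov chain on $E = \{0, 1\}$ with hypothetical jump rates `ALPHA` and `BETA` (an example chosen here, not taken from the text), and checks the three defining properties of a Feller semigroup for its transition matrices.

```python
import math

ALPHA, BETA = 2.0, 1.0  # hypothetical jump rates 0 -> 1 and 1 -> 0

def P(t):
    """Transition matrix P_t of the two-state chain (closed form)."""
    total = ALPHA + BETA
    decay = math.exp(-total * t)
    return [
        [(BETA + ALPHA * decay) / total, ALPHA * (1.0 - decay) / total],
        [BETA * (1.0 - decay) / total, (ALPHA + BETA * decay) / total],
    ]

def apply(Pt, f):
    """(T_t f)(x) = sum over y of P_t(x, y) f(y) for a function f on {0, 1}."""
    return [sum(Pt[x][y] * f[y] for y in range(2)) for x in range(2)]

def sup_norm(f):
    return max(abs(v) for v in f)

def gap(u, v):
    """Sup-norm distance between two functions on {0, 1}."""
    return sup_norm([p - q for p, q in zip(u, v)])

f = [1.0, -3.0]
s, t = 0.4, 0.9

# (i) T_0 is the identity and T_t is a contraction in the sup norm.
identity_error = gap(apply(P(0.0), f), f)
contraction_ok = sup_norm(apply(P(t), f)) <= sup_norm(f)

# (ii) Semigroup property T_{s+t} f = T_s (T_t f).
semigroup_error = gap(apply(P(s + t), f), apply(P(s), apply(P(t), f)))

# (iii) Strong continuity at zero: the gap shrinks as t decreases.
continuity_gaps = [gap(apply(P(h), f), f) for h in (1e-1, 1e-2, 1e-3)]
print(identity_error, contraction_ok, semigroup_error, continuity_gaps)
```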
Convolution Semigroup
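An example of Feller semigroups is furnished by convolution semigroups, constructed from families $(\mu_t)_{t \ge 0}$ of probability measures on $\mathbb{R}^d$ satisfying
(i) $\mu_s * \mu_t = \mu_{s+t}$ for any pair $s, t \ge 0$,
(ii) $\mu_0 = \delta_0$ and $\lim_{t \downarrow 0} \mu_t = \delta_0$ in the vague topology.
The associated Feller semigroup is $T_t f(x) = \int f(x + y) \, \mu_t(dy)$, and from it we can derive the Feller transition function
$$P_t(x, \Gamma) = \int 1_\Gamma(x + y) \, \mu_t(dy).$$
The transition functions of $d$-dimensional Brownian motion are of this type. For 1-dimensional Brownian motion, the t.f. comes from convolution with the probability measures given by the densities
$$g_t(x) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{x^2}{2t}\right).$$
If the transition function of a process $X$ is given by a convolution semigroup $(\mu_t)$, then $X$ has stationary independent increments. The law of the increment $X_t - X_s$ is $\mu_{t-s}$. A proof can be found in
Reference
D. Revuz, M. Yor, Continuous Martingales and Brownian Motion, Grundlehren der mathematischen Wissenschaften, Third Edition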
Stationary above means that the law of the increment $X_t - X_s$ depends on $s$ and $t$ only through their difference $t - s$.
Conversely, if a Feller process $X$ has stationary independent increments, then its transition functions are generated from a convolution semigroup $(\mu_t)$, with $\mu_t$ the law of $X_t - X_0$. Such a process is called a Lévy process.
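The convolution property for the Brownian case can be checked directly. The following sketch, which assumes the centred normal densities with variance $t$ and uses a simple midpoint rule (a numerical convenience, not part of the theory), compares the convolution of the densities for times $s$ and $t$ with the density for time $s + t$.

```python
import math

def g(t, x):
    """Density of the centred normal law with variance t."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def convolve_at(s, t, x, half_width=12.0, steps=4000):
    """Midpoint rule for the integral of g_s(x - y) g_t(y) dy over a window."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        y = -half_width + (k + 0.5) * h
        total += g(s, x - y) * g(t, y)
    return total * h

s, t = 0.5, 1.5
points = [-1.0, 0.0, 0.7, 2.0]
# Compare (g_s * g_t)(x) with g_{s+t}(x) at a few points.
convolution_errors = [abs(convolve_at(s, t, x) - g(s + t, x)) for x in points]
print(max(convolution_errors))
```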
Forward and Backward Equations
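The abstract results about semigroups and their infinitesimal generators can be applied to the case of Feller processes. In particular we focus on the property that $t \mapsto T_t f$ is differentiable for $f \in D(A)$ and
$$\frac{d}{dt} T_t f = A T_t f = T_t A f,$$
where $A$ as above denotes the infinitesimal generator of the (Feller) semigroup.
We can view $T_t f(x) = \int P_t(x, dy) f(y)$ as a pairing of the measure $P_t(x, \cdot)$ and the function $f$. With this duality in mind we write $T_t f(x) = \langle P_t(x, \cdot), f \rangle$, so that $\frac{d}{dt} T_t f = T_t A f$ reads
$$\frac{d}{dt} \langle P_t(x, \cdot), f \rangle = \langle P_t(x, \cdot), A f \rangle.$$
With $A^*$ being the formal adjoint of $A$ we can rewrite this as
$$\frac{d}{dt} P_t(x, \cdot) = A^* P_t(x, \cdot);$$
this is called the forward equation or the Fokker-Planck equation. The left side appears to involve differentiating a measure; however, given that measures can be expressed in terms of functions (e.g. a probability measure is represented by its density function, and more generally an absolutely continuous measure by its Radon-Nikodym derivative), the above differential equation makes sense. Analogously, the equation
$$\frac{d}{dt} P_t(\cdot, \Gamma) = A P_t(\cdot, \Gamma)$$
is called the backward equation.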
Generator for Brownian Motion
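Denote by $C_0^2(\mathbb{R}^d)$ the space of twice continuously differentiable functions where the function and its first two derivatives vanish at infinity. Then for Brownian motion in $\mathbb{R}^d$ started at the origin, the infinitesimal generator is $\frac{1}{2}\Delta$. In other words
$$A f = \frac{1}{2} \sum_{i=1}^d \frac{\partial^2 f}{\partial x_i^2}, \qquad f \in C_0^2(\mathbb{R}^d).$$
It can be shown that for $d = 1$ the domain of the operator is $D(A) = C_0^2(\mathbb{R})$, and for $d \ge 2$ the domain $D(A)$ is the proper subspace of $C_0(\mathbb{R}^d)$ consisting of functions $f$ such that $\Delta f \in C_0(\mathbb{R}^d)$ in the sense of distributions (since such functions need not possess second derivatives).
For Brownian motion, writing $p_t(x, y)$ for the transition density, the forward and backward equations are
$$\frac{\partial}{\partial t} p_t(x, y) = \frac{1}{2} \Delta_y p_t(x, y), \qquad \frac{\partial}{\partial t} p_t(x, y) = \frac{1}{2} \Delta_x p_t(x, y).$$
This is why the normal density functions defined above and their higher dimensional analogues are fundamental solutions of the heat equation $\frac{\partial u}{\partial t} = \frac{1}{2} \Delta u$.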
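The claim that the normal densities solve the heat equation can be verified with finite differences. This is a minimal sketch in one dimension, assuming central difference quotients with a hypothetical step size `h`; it evaluates the residual of $\partial_t g = \frac{1}{2} \partial_{xx} g$ at a few sample points.

```python
import math

def g(t, x):
    """Density of the centred normal law with variance t."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def heat_residual(t, x, h=1e-3):
    """Central-difference estimate of dg/dt - (1/2) d^2g/dx^2 at (t, x)."""
    dt = (g(t + h, x) - g(t - h, x)) / (2.0 * h)
    dxx = (g(t, x + h) - 2.0 * g(t, x) + g(t, x - h)) / (h * h)
    return dt - 0.5 * dxx

residuals = [abs(heat_residual(t, x))
             for t in (0.5, 1.0, 2.0)
             for x in (-1.0, 0.0, 1.5)]
print(max(residuals))
```

The residual is not exactly zero because of discretization error, which scales like the square of the step size.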
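If $B$ is a $d$-dimensional Brownian motion started at the origin, then for a $d \times d$ matrix $\sigma$ and a point $x \in \mathbb{R}^d$ we can define a Markov process starting away from the origin by $X_t = x + \sigma B_t$. The corresponding infinitesimal generator is
$$A f = \frac{1}{2} \sum_{i,j=1}^d a_{ij} \frac{\partial^2 f}{\partial x_i \partial x_j},$$
where $a = \sigma \sigma^{T}$.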
Proofs can be found in
Reference
D. Revuz, M. Yor, Continuous Martingales and Brownian Motion, Grundlehren der mathematischen Wissenschaften, Third Edition
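To describe the infinitesimal generator for Feller processes more generally, suppose $(T_t)$ is a Feller semigroup on $C_0(\mathbb{R}^d)$ with generator $A$ and $C_K^\infty \subset D(A)$, where $C_K^\infty$ and $C_K^2$ denote the smooth and the twice continuously differentiable functions with compact support. Then for every relatively compact open set $U$, there exist functions $a_{ij}, b_i, c$ on $U$ and a kernel $N$ such that for $f \in C_K^2$ and $x \in U$
$$A f(x) = c(x) f(x) + \sum_{i} b_i(x) \frac{\partial f}{\partial x_i}(x) + \frac{1}{2} \sum_{i,j} a_{ij}(x) \frac{\partial^2 f}{\partial x_i \partial x_j}(x) + \int_{\mathbb{R}^d \setminus \{x\}} \Big[ f(y) - f(x) - 1_U(y) \sum_{i} (y_i - x_i) \frac{\partial f}{\partial x_i}(x) \Big] N(x, dy),$$
where $N(x, \cdot)$ is a Radon measure on $\mathbb{R}^d \setminus \{x\}$, the matrix $a(x) = (a_{ij}(x))$ is symmetric positive semidefinite, $c \le 0$, and $a$ and $c$ do not depend on $U$.
The idea behind this generator is that the infinitesimal perturbation consists of adding a translation $b$, a Gaussian process with covariance $a$, and jumps given by $N$, while $c$ accounts for the possibility of the process being killed.
If the process has continuous paths, then its infinitesimal generator is given on $C_K^2$ by the semi-elliptic second order differential operator
$$A f(x) = c(x) f(x) + \sum_{i} b_i(x) \frac{\partial f}{\partial x_i}(x) + \frac{1}{2} \sum_{i,j} a_{ij}(x) \frac{\partial^2 f}{\partial x_i \partial x_j}(x),$$
where $a(x)$ is symmetric positive semidefinite.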
More details about this section can be found in
Reference
O. Kallenberg, Foundations of Modern Probability, Springer, Third Edition
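Next we define diffusion processes in the context of the above discussion. Suppose that for every $x \in \mathbb{R}^d$ the matrix $a(x) = (a_{ij}(x))$ is symmetric positive semidefinite. Let $b : \mathbb{R}^d \to \mathbb{R}^d$ be a vector field, and suppose further that the maps $x \mapsto a(x)$ and $x \mapsto b(x)$ are Borel measurable and locally bounded. We associate to the pair $(a, b)$ the second order differential operator
$$L = \frac{1}{2} \sum_{i,j=1}^d a_{ij}(x) \frac{\partial^2}{\partial x_i \partial x_j} + \sum_{i=1}^d b_i(x) \frac{\partial}{\partial x_i}.$$
A Markov process $X$ with state space $\mathbb{R}^d$ is said to be a diffusion process with generator $L$ if $X$ has continuous paths, and if for any $x \in \mathbb{R}^d$ and any $f \in C_K^\infty$
$$E_x[f(X_t)] = f(x) + E_x\Big[ \int_0^t L f(X_s) \, ds \Big].$$
We say that $X$ has covariance or diffusion coefficient $a$ and drift $b$. If we further introduce dependence on time, with coefficients $a(s, x)$, $b(s, x)$ and operators $L_s$, then correspondingly we obtain a non-homogeneous diffusion if for any $x \in \mathbb{R}^d$ and $s < t$
$$E_{s,x}[f(X_t)] = f(x) + E_{s,x}\Big[ \int_s^t L_u f(X_u) \, du \Big].$$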
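The defining identity of a diffusion can be checked in a case where everything is explicit. The sketch below assumes a one-dimensional Ornstein-Uhlenbeck process with drift $b(x) = -\theta x$ and constant diffusion coefficient $a = \sigma^2$ (an illustrative example with hypothetical parameter values, not taken from the text), and uses $f(y) = y^2$. This $f$ is not compactly supported, so it lies outside the stated class of test functions, but for this process the identity can still be verified by direct calculation.

```python
import math

THETA, SIGMA = 0.8, 0.6  # hypothetical drift rate and noise level

def second_moment(x, t):
    """E_x[X_t^2] for the process dX = -THETA X dt + SIGMA dW, X_0 = x."""
    mean = x * math.exp(-THETA * t)
    variance = SIGMA ** 2 * (1.0 - math.exp(-2.0 * THETA * t)) / (2.0 * THETA)
    return mean * mean + variance

def expected_generator(x, s):
    """E_x[(L f)(X_s)] for f(y) = y^2, where (L f)(y) = SIGMA^2 - 2 THETA y^2."""
    return SIGMA ** 2 - 2.0 * THETA * second_moment(x, s)

def integral_of_generator(x, t, steps=2000):
    """Midpoint rule for the integral of E_x[(L f)(X_s)] over [0, t]."""
    h = t / steps
    return h * sum(expected_generator(x, (k + 0.5) * h) for k in range(steps))

x, t = 1.3, 2.0
lhs = second_moment(x, t)                  # E_x[f(X_t)]
rhs = x * x + integral_of_generator(x, t)  # f(x) + E_x[integral of L f]
identity_error = abs(lhs - rhs)

# The short-time difference quotient recovers (L f)(x).
quotient = (second_moment(x, 1e-6) - x * x) / 1e-6
generator_error = abs(quotient - (SIGMA ** 2 - 2.0 * THETA * x * x))
print(identity_error, generator_error)
```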
Diffusion Approximation of Markov Chains
We now sketch in non-rigorous terms the approximation of discrete-time Markov chains by diffusions and, more generally, by continuous-time Markov processes. We begin with a discussion of convergence of Feller processes $X^n \to X$. As a Feller process can be specified by various equivalent objects (the Feller semigroup, the infinitesimal generator), we can describe this convergence in a multitude of equivalent ways. We do not make precise the various modes of convergence here and refer the interested reader to
Reference
D. Revuz, M. Yor, Continuous Martingales and Brownian Motion, Grundlehren der mathematischen Wissenschaften, Third Edition
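Let $X$ and $X^1, X^2, \dots$ be Feller processes in a state space $E$ with semigroups $(T_t)$ and $(T_{n,t})$ respectively, and generators $A$ and $A_n$ on the domains $D(A)$ and $D(A_n)$ respectively. Let $\mathcal{D}$ be a so-called core of $A$, which is a special vector subspace of $D(A)$. Then the following (non-precise) statements are equivalent:
(1) Any element $f \in \mathcal{D}$ of the core can be approximated by a sequence $f_n \in D(A_n)$ in the sense that $f_n \to f$ and $A_n f_n \to A f$.
(2) Convergence of the operators $T_{n,t} \to T_t$ for each $t > 0$.
(3) Convergence of functions $T_{n,t} f \to T_t f$ for every $f \in C_0(E)$.
(4) The convergence in distribution $X_0^n \to X_0$ implies convergence of the Feller processes $X^n \to X$ in distribution.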
This abstract convergence result can be applied to approximating Markov chains:
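Let $Y^1, Y^2, \dots$ be discrete-time Markov chains in the state space $E$ with transition operators $U_1, U_2, \dots$ and let $X$ be a Feller process with semigroup $(T_t)$ and generator $A$. Fix a core $\mathcal{D}$ of the generator and let $h_n > 0$ be a sequence of numbers that converges to zero. Then the above equivalent statements apply to the processes and operators
$$X_t^n = Y^n_{[t/h_n]}, \qquad A_n = h_n^{-1}(U_n - I), \qquad T_{n,t} = U_n^{[t/h_n]}.$$
In particular we can approximate suitably rescaled discrete-time Markov chains by diffusion processes.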
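The classical instance of this approximation is the simple random walk converging to Brownian motion. The sketch below assumes a walk with time step $h$ and spatial step $\sqrt{h}$, and the test function $f = \cos$, for which the Brownian generator and semigroup have the closed forms $-\frac{1}{2}\cos x$ and $e^{-t/2} \cos x$. All names are illustrative. It shows the rescaled discrete generator and the iterated transition operator approaching their continuous counterparts as $h$ decreases.

```python
import math

def f(x):
    return math.cos(x)

def U(func, x, h):
    """Transition operator of the simple random walk with step sqrt(h)."""
    step = math.sqrt(h)
    return 0.5 * (func(x + step) + func(x - step))

def A_n(func, x, h):
    """Discrete generator (U - I) / h applied to func at x."""
    return (U(func, x, h) - func(x)) / h

def T_n(func, x, t, h):
    """U^k func at x with k = [t / h], via the binomial law of the walk."""
    k = math.floor(t / h + 1e-9)  # small tolerance guards against rounding
    step = math.sqrt(h)
    return sum(math.comb(k, j) / 2 ** k * func(x + (2 * j - k) * step)
               for j in range(k + 1))

x, t = 0.4, 1.0
time_steps = (1e-1, 1e-2, 1e-3)

# Brownian motion: A f = f'' / 2 and T_t f(x) = E[f(x + B_t)]; for f = cos
# these are -cos(x) / 2 and exp(-t / 2) cos(x).
generator_errors = [abs(A_n(f, x, h) + 0.5 * math.cos(x)) for h in time_steps]
semigroup_errors = [abs(T_n(f, x, t, h) - math.exp(-t / 2) * math.cos(x))
                    for h in time_steps]
print(generator_errors, semigroup_errors)
```

Both error sequences shrink roughly in proportion to $h$, which is consistent with convergence of the generators on a core implying convergence of the semigroups.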