Abstract
We establish a stochastic maximum principle (SMP) for control problems of partially observed diffusions of mean-field type with risk-sensitive performance functionals.
1 Introduction
In optimal control problems for diffusions of mean-field type, the performance functional, drift and diffusion coefficient depend not only on the state and the control but also on the probability distribution of the state-control pair. The mean-field coupling makes the control problem time-inconsistent in the sense that the Bellman Principle is no longer valid, which motivates the use of the stochastic maximum principle (SMP) approach to solve this type of optimal control problem instead of trying extensions of the dynamic programming principle (DPP). This class of control problems has been studied by many authors, including [1, 2, 5, 7, 15, 20]. The performance functionals considered in these papers have been of risk-neutral type, i.e., the running cost/profit terms are expected values of stage-additive payoff functions. Not all behavior, however, can be captured by risk-neutral performance. One way of capturing risk-averse and risk-seeking behaviors is to exponentiate the performance functional before taking expectation (see [17]).
The first paper that we are aware of and which deals with risk-sensitive optimal control in a mean field context is [24]. Using a matching argument, the authors derive a verification theorem for a risk-sensitive mean-field game whose underlying dynamics is a Markov diffusion. This matching argument freezes the mean-field coupling in the dynamics, which yields a standard risk-sensitive HJB equation for the value-function. The mean-field coupling is then retrieved through the Fokker-Planck equation satisfied by the marginal law of the optimal state.
In a recent paper [11], the authors established a risk-sensitive SMP for mean-field type control. The risk-sensitive control problem was first reformulated as a control problem with an augmented state process and a terminal payoff. An intermediate stochastic maximum principle was then obtained by applying the SMP of ([5], Theorem 2.1) for loss functionals without running cost, but with an augmented state in higher dimension and complete observation of the state. The intermediate first- and second-order adjoint processes were then transformed into a simpler form using a logarithmic transformation derived in [12].
Optimal control of partially observed diffusions (without mean-field coupling) has been studied by many authors, including the non-exhaustive references [3, 4, 8–10, 13, 14, 16, 19, 21, 23, 26, 27], using both the DPP and SMP approaches. Reference [23] derives an SMP for the most general model of optimal control of partially observed diffusions under risk-neutral performance functionals. Recently, Wang et al. [25] extended the SMP to partially observed optimal control of diffusions with risk-neutral performance functionals of mean-field type.
The purpose of this paper is to establish a stochastic maximum principle for a class of risk-sensitive mean-field type control problems under partial observation. Following the above-mentioned papers on optimal control under partial observation, in particular [23], our strategy is to transform the partially observable control problem into a completely observable one and then apply the approach suggested in [11] to derive a suitable risk-sensitive SMP. To the best of our knowledge, a risk-sensitive maximum principle under partial observation that does not pass through the DPP, in particular for mean-field type controls, had not been established in earlier works.
The paper is organized as follows. In Sect. 2, we present the model and state the partially observable risk-sensitive SMP, which constitutes the main result; its proof is given in Sect. 3. Finally, in Sect. 4, we apply the risk-sensitive SMP to the linear-exponential-quadratic setup under partial observation. To streamline the presentation, we only consider the one-dimensional case. The extension to the multidimensional case is straightforward. Furthermore, we consider diffusion models where the control enters only the drift coefficient, which leads to an SMP with only one pair of adjoint processes. The general Peng-type SMP can be derived following, e.g., [11, 23].
2 Statement of the Problem
Let \(T>0\) be a fixed time horizon and \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}})\) be a given filtered probability space on which are defined two independent standard one-dimensional Brownian motions \(W=\{W_s\}_{s\ge 0}\) and \(Y=\{Y_s\}_{s\ge 0}\). Let \(\mathscr {F}_t^{W}\) and \(\mathscr {F}_t^{Y}\) be the \({\mathrm{l\negthinspace P}}\)-completed natural filtrations generated by W and Y, respectively. Set \({\mathrm{l\negthinspace F}}^Y:=\{\mathscr {F}_s^{Y},\ 0\le s \le T\}\) and \({\mathrm{l\negthinspace F}}:=\{{\mathscr {F}}_s,\ 0\le s \le T\}\), where \(\mathscr {F}_s=\mathscr {F}_s^{W} \vee \mathscr {F}_s^{Y}\).
We consider a mean-field type version of the stochastic controlled system with partial observation considered in [23], which is an extension of the model considered in [4, 14], to which we refer for further details.
The model is defined as follows.
(i) An admissible control u is an \({\mathrm{l\negthinspace F}}^{Y}\)-adapted process with values in a non-empty subset (not necessarily convex) U of \({\mathrm{l\negthinspace R}}\) and satisfies \(E[\int _0^T|u(t)|^2dt]<\infty \). We denote the set of all admissible controls by \(\mathscr {U}\). The control u is called partially observable.
(ii) Given a control process \(u\in \mathscr {U}\), we consider the signal-observation pair \((x^u,Y)\) which satisfies the following SDE of mean-field type
where,
and \(\beta (t,x): [0,T] \times {\mathrm{l\negthinspace R}}\longrightarrow {\mathrm{l\negthinspace R}}\) is a Borel measurable function.
In this model, the observation process Y, which carries the controls u, is assumed to be a given Brownian motion independent of W and is supposed to admit a decomposition as a trend \(\int _0^{\cdot }\beta (t,x^u(t))dt\) (a functional of the state process \(x^u\)) corrupted by a process \(\widetilde{W}^u\), which is a priori not observable.
The case \(\alpha =0\) corresponds to the model considered in [4, 14]. A more general model would let the function \(\beta \) depend on the control u and be of mean-field type. To keep the presentation simple, we skip this case in this paper, but the main results do extend to it.
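The display for the signal-observation system (1) is not reproduced above. Based on the surrounding description (mean-field coefficients \(b,\sigma ,\alpha \), observation drift \(\beta \), and the control entering only the drift), a plausible reconstruction, offered as a sketch rather than the paper's exact display, is the following, where the precise mean-field argument \(E^u[x^u(t)]\) is an assumption:

```latex
\left\{
\begin{aligned}
dx^u(t) &= b\big(t,x^u(t),E^u[x^u(t)],u(t)\big)\,dt
         + \sigma\big(t,x^u(t),E^u[x^u(t)]\big)\,dW_t\\
        &\quad + \alpha\big(t,x^u(t),E^u[x^u(t)]\big)\,d\widetilde{W}^u_t,
         \qquad x^u(0)=x_0,\\
dY_t &= \beta\big(t,x^u(t)\big)\,dt + d\widetilde{W}^u_t,
         \qquad Y_0=0.
\end{aligned}
\right.
```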
Before we formulate the control problem, we show that the system (1) has a weak solution. Introduce the density process defined on \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}})\) by
which solves the linear SDE
Assuming the function \(\beta \) is bounded (see Assumption 1 below), \(\rho \) is a uniformly integrable martingale such that, for every \(p\ge 2\),
where, C is a constant which depends only on the bound of \(\beta \), p and T. Define \(d{\mathrm{l\negthinspace P}}^{u}=\rho ^u(T)d{\mathrm{l\negthinspace P}}\). By Girsanov’s Theorem, \({\mathrm{l\negthinspace P}}^{u}\) is a probability measure. Moreover, \(\widetilde{W}^u\) is a \( {\mathrm{l\negthinspace P}}^u\)-standard Brownian motion independent of W. This in turn entails that \(({\mathrm{l\negthinspace P}}^u,x^u,Y,W,\widetilde{W}^u)\) is a weak solution of (1).
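The displays defining \(\rho ^u\) and its SDE are missing from the text above. Consistent with the Girsanov argument of this paragraph and the boundedness of \(\beta \), a standard sketch (the exact display may differ) is

```latex
\rho^u(t) = \exp\Big(\int_0^t \beta(s,x^u(s))\,dY_s
          - \frac{1}{2}\int_0^t \beta^2(s,x^u(s))\,ds\Big),
\qquad
d\rho^u(t) = \rho^u(t)\,\beta(t,x^u(t))\,dY_t,\quad \rho^u(0)=1,
```

so that \(\widetilde{W}^u_t := Y_t - \int _0^t \beta (s,x^u(s))\,ds\) is a \({\mathrm{l\negthinspace P}}^u\)-Brownian motion and the moment estimate reads \(E\big[\sup _{0\le t\le T}|\rho ^u(t)|^p\big]\le C\).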
The objective is to characterize admissible controls which minimize the risk-sensitive cost functional given by
where, \(\theta \) is the risk-sensitivity index,
Any \(\bar{u}(\cdot )\in {\mathscr {U}}\) which satisfies
is called a risk-sensitive optimal control under partial observation.
Let \(\varPsi _T=\int _0^T f(t, x(t),E^u[x(t)], u(t)) dt\,+\,h(x(T), E^u[x(T)])\) and consider the payoff functional given by
When the risk-sensitive index \(\theta \) is small, the loss functional \(\widetilde{\varPsi }_{\theta }\) can be expanded as
where, \(\text{ var }_u(\varPsi _T)\) denotes the variance of \(\varPsi _T\) w.r.t. \( {\mathrm{l\negthinspace P}}^u\). If \(\theta <0\), the variance of \(\varPsi _T\), as a measure of risk, improves the performance \(\widetilde{\varPsi }_{\theta }\), in which case the optimizer is called a risk seeker. But, when \(\theta >0\), the variance of \(\varPsi _T\) worsens the performance \(\widetilde{\varPsi }_{\theta }\), in which case the optimizer is called risk averse. The risk-neutral loss functional \(E^u[\varPsi _{T}]\) can be seen as the limit of the risk-sensitive functional \( \widetilde{\varPsi }_{\theta }\) as \(\theta \rightarrow 0\).
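The displays for \(\widetilde{\varPsi }_{\theta }\) and its small-\(\theta \) expansion are missing; the discussion of risk attitudes above is consistent with the standard sketch

```latex
\widetilde{\varPsi}_{\theta}
 := \frac{1}{\theta}\log E^u\big[e^{\theta \varPsi_T}\big]
  = E^u[\varPsi_T] + \frac{\theta}{2}\,\mathrm{var}_u(\varPsi_T) + O(\theta^2),
 \qquad \theta \to 0.
```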
Since \(d{\mathrm{l\negthinspace P}}^{u}=\rho ^u(T)d{\mathrm{l\negthinspace P}}\), the associated risk-sensitive cost functional becomes
where, on \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}})\), the process \((\rho ^u,x^u)\) satisfies the following dynamics:
We have recast the partially observable control problem into the following completely observable control problem: Minimize \(J^{\theta }(u(\cdot ))\) defined by (4) subject to (5).
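As a numerical illustration, the completely observable system (5) can be simulated under the reference measure \({\mathrm{l\negthinspace P}}\) by an Euler-Maruyama scheme. The sketch below is an assumption-laden toy model: the concrete coefficient functions `b`, `sigma`, `alpha`, `beta` and the form of the state dynamics under \({\mathrm{l\negthinspace P}}\) (obtained by substituting \(d\widetilde{W}^u_t = dY_t - \beta \,dt\)) are illustrative, not taken from the paper's displays.

```python
import numpy as np

# Euler-Maruyama sketch of the completely observable pair (rho^u, x^u) of (5)
# under the reference measure P, where W and Y are independent P-Brownian
# motions.  All coefficient functions below are illustrative assumptions.
rng = np.random.default_rng(0)

T, N, M = 1.0, 200, 2000                      # horizon, time steps, MC paths
dt = T / N

b     = lambda t, x, m, u: -x + 0.5 * m + u   # drift, mean-field through m
sigma = lambda t, x, m: 0.3                   # W-diffusion coefficient
alpha = lambda t, x, m: 0.2                   # loading on observation noise
beta  = lambda t, x: np.tanh(x)               # bounded observation drift

u = 0.0                                       # a constant open-loop control
x = np.zeros(M)                               # state x^u, x^u(0) = 0
rho = np.ones(M)                              # density rho^u, rho^u(0) = 1

for k in range(N):
    t = k * dt
    dW = rng.normal(0.0, np.sqrt(dt), M)
    dY = rng.normal(0.0, np.sqrt(dt), M)      # Y is a P-Brownian motion
    m = x.mean()                              # empirical mean-field term
    bx = beta(t, x)
    # density: drho = rho * beta dY  (linear SDE, Girsanov kernel)
    rho = rho + rho * bx * dY
    # state under P: dx = (b - alpha*beta) dt + sigma dW + alpha dY
    x = x + (b(t, x, m, u) - alpha(t, x, m) * bx) * dt \
          + sigma(t, x, m) * dW + alpha(t, x, m) * dY

print(round(rho.mean(), 3))                   # E[rho^u(T)] is 1 in theory
```

The sample mean of \(\rho ^u(T)\) being close to 1 reflects the martingale property of \(\rho ^u\) used above to define \({\mathrm{l\negthinspace P}}^u\).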
The main result of this paper is a stochastic maximum principle (SMP) in terms of necessary optimality conditions for the problem (3) subject to (5).
We will only consider the case where the risk-sensitive parameter is positive, \(\theta >0\). The case \(\theta <0\) can be treated in a similar fashion by considering \(\theta =-\bar{\theta }, \bar{\theta }>0\), and \(\bar{f}=-f,\bar{h}=-h\) in the performance functional (4).
We will make the following assumption.
Assumption 1
The functions \(b, \sigma ,\alpha ,\beta , f, h\) are twice continuously differentiable with respect to (x, m). Moreover, these functions and their first derivatives with respect to (x, m) are bounded and continuous in (x, m, u).
To keep the presentation less technical, we impose these assumptions although they are restrictive and can be made weaker.
Under these assumptions, in view of Girsanov's theorem and ([18], Proposition 1.2), for each \(u\in {\mathscr {U}}\), the SDE (5) admits a unique weak solution \((\rho ^u, x^u)\).
We now state an SMP to characterize optimal controls \(\bar{u}(\cdot )\in \mathscr {U}\) which minimize (4), subject to (5). Let \((\bar{\rho },\bar{x}):=(\rho ^{\bar{u}},x^{\bar{u}})\) denote the corresponding state process, defined as the solution of (5).
We introduce the following notation.
We define the risk-neutral Hamiltonian as follows. For \((p,q)\in {\mathrm{l\negthinspace R}}^2\times {\mathrm{l\negthinspace R}}^{2\times 2}\),
where, \(*\) denotes the transpose of a matrix or a vector.
We also introduce the risk-sensitive Hamiltonian: for \((p,q, \ell )\in {\mathrm{l\negthinspace R}}^2\times {\mathrm{l\negthinspace R}}^{2\times 2}\times {\mathrm{l\negthinspace R}}^2\),
We have \(H=H^0\).
Setting
the explicit form of the Hamiltonian (8) reads
Setting \(\theta =0\), we obtain the explicit form of the Hamiltonian (7):
With the obvious notation for the derivatives of the functions \(b,\alpha ,\beta , \sigma ,f,h\), w.r.t. the arguments x and m, we further set
With this notation, the system (5) can be rewritten in the following compact form
We define the risk-neutral Hamiltonian associated with random variables X such that \(\phi (X)\) and \(\tilde{\phi }(X)\) are in \(L^1(\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace P}})\) as follows (with the obvious abuse of notation): for \((p,q)\in {\mathrm{l\negthinspace R}}^2\times {\mathrm{l\negthinspace R}}^{2\times 2}\),
We also introduce the risk-sensitive Hamiltonian: for \((p,q, \ell )\in {\mathrm{l\negthinspace R}}^2\times {\mathrm{l\negthinspace R}}^{2\times 2}\times {\mathrm{l\negthinspace R}}^2\),
For \(g\in \{b,c, \sigma ,\alpha ,\beta \}\) and \(u\in U\), we set
and
Let
We introduce the adjoint equations involved in the risk-sensitive SMP for our control problem.
where, in view of (2) and (13), for \(k\in \{\rho ,x\}\),
and
We note that the processes \((\hat{p},\hat{q},\ell )\) may depend on the sensitivity index \(\theta \). To ease notation, we do not make this dependence explicit.
Below, we will show that, under Assumption 1, (14) admits a unique \({\mathrm{l\negthinspace F}}\)-adapted solution \((\hat{p},\hat{q},\hat{v}^{\theta },\ell )\) such that
Moreover,
Lemma 1
The process defined on \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}})\) by
is a uniformly integrable \({\mathrm{l\negthinspace F}}\)-martingale.
The process \(L^{\theta }\) defines a new probability measure \({\mathrm{l\negthinspace P}}^{\theta }\) equivalent to \({\mathrm{l\negthinspace P}}\) by setting \(L^{\theta }_t:=\frac{d{\mathrm{l\negthinspace P}}^{\theta }}{d{\mathrm{l\negthinspace P}}}|_{{\mathscr {F}}_t}\). By Girsanov’s theorem, the process \(B_t^{\theta }:=B_t-\theta \int _0^t \ell (s)ds,\,\, 0\le t\le T\) is a \({\mathrm{l\negthinspace P}}^{\theta }\)-Brownian motion.
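The display defining \(L^{\theta }\) in Lemma 1 is missing. Consistent with the proof of Lemma 1 below and with the identity \(v^{\theta }(t)/v^{\theta }(0)=L^{\theta }_t\) used in Sect. 3.3, a plausible form is the Doléans-Dade exponential

```latex
L^{\theta}_t
 = \exp\Big(\theta\int_0^t \langle \ell(s), dB_s\rangle
          - \frac{\theta^2}{2}\int_0^t |\ell(s)|^2\,ds\Big)
 = \frac{v^{\theta}(t)}{v^{\theta}(0)}, \qquad 0\le t\le T,
```

where B denotes the two-dimensional Brownian motion driving the system (presumably \(B=(W,Y)^*\), an assumption here) and \(\ell =(\ell _1,\ell _2)\) is the second component of the solution of the quadratic BSDE for \((Z,\ell )\).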
The following theorem is the main result of the paper. Let \(E^{\theta }[\,\cdot \,]\) denote the expectation w.r.t. \({\mathrm{l\negthinspace P}}^{\theta }\).
Theorem 1
(Risk-sensitive maximum principle) Let Assumption 1 hold. If the process \((\bar{\rho }(\cdot ),\bar{x}(\cdot ),\bar{u}(\cdot ))\) is an optimal solution of the risk-sensitive control problem (3)–(5), then there are two pairs of \({\mathrm{l\negthinspace F}}\)-adapted processes \((v^{\theta }, \ell )\) and \((\hat{p},\hat{q})\) which satisfy (14)–(15), such that
for all \(u\in U,\) almost every t and \({\mathrm{l\negthinspace P}}^{\theta }-\)almost surely.
Remark 1
The boundedness of the coefficients and their derivatives imposed in Assumption 1 guarantees the solvability of the system of forward-backward SDEs (5) and (14). In fact, Theorem 1 applies provided we can solve the system of forward-backward SDEs (5) and (14). A typical example of such a situation is the classical Linear-Quadratic (LQ) control problem (see Sect. 4 below), in which the coefficients involved are at most quadratic, but not necessarily bounded.
3 Proof of the Main Result
In this section we give the proof of Theorem 1, presented in several steps.
3.1 An Intermediate SMP for Mean-Field Type Control
In this subsection we first reformulate the risk-sensitive control problem associated with (4)–(5) as a control problem with an augmented state process and a terminal payoff. An intermediate stochastic maximum principle is then obtained by applying the SMP of ([1], Theorem 3.1, or [5], Theorem 2.1) for loss functionals without running cost. We then transform the intermediate first-order adjoint processes into a simpler form. The mean-field type control problem for the cost functional (4) under the dynamics (5) is equivalent to
subject to
We introduce the following notation.
With this notation, the system (18) can be rewritten in the following compact form
and the risk-sensitive cost functional (4) becomes
where,
We define the Hamiltonian associated with random variables R such that \(\phi (R)\in L^1(\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace P}})\) as follows. For \((p,q)\in {\mathrm{l\negthinspace R}}^3\times {\mathrm{l\negthinspace R}}^{3\times 3}\),
where, \(\varGamma ^*\) denotes the transpose of the matrix \(\varGamma \).
Setting
the explicit form of the Hamiltonian (19) reads
In view of (12), we set
We may apply the SMP for risk-neutral mean-field type control (cf. [1], Theorem 3.1 or [5], Theorem 2.1) to the augmented state dynamics \((\rho , x,\xi )\) to derive the first order adjoint equation:
This is a system of linear backward SDEs of mean-field type which, in view of ([6], Theorem 3.1), under Assumption 1, admits a unique \({\mathrm{l\negthinspace F}}\)-adapted solution (p, q) satisfying
where, \(|\cdot |\) denotes the usual Euclidean norm with appropriate dimension.
We may apply the SMP for control of SDEs of mean-field type from ([1], Theorem 3.1, or [5], Theorem 2.1), together with the SMP for risk-neutral partially observable SDEs derived in ([23], Theorem 2.1), to obtain the following SMP.
Proposition 1
Let Assumption 1 hold. If \((\bar{R}(\cdot ), \bar{u}(\cdot ))\) is an optimal solution of the risk-neutral control problem (17) subject to the dynamics (18), then there is a unique pair of \({\mathrm{l\negthinspace F}}\)-adapted processes (p, q) which satisfies (21)–(22) such that
for all \(u\in U,\) almost every t and \({\mathrm{l\negthinspace P}}-\)almost surely.
3.2 Transformation of the First Order Adjoint Process
Although Proposition 1 provides an SMP for the risk-sensitive mean-field type control problem with partial observation, augmenting the state process with the third component \(\xi \) yields a system of three adjoint equations that appears complicated to solve in concrete situations. In this section we apply the transformation of the adjoint processes (p, q) introduced in [11] so as to eliminate the third component \((p_3,q_{31},q_{32})\) in (21) and express the SMP in terms of only two adjoint processes, which we denote \((\hat{p},\hat{q})\), where
Indeed, noting that from (21), we have \(dp_3(t) =\langle q_3(t),dB_t\rangle \) and \( p_3(T)=- \theta \psi ^{\theta }_T,\) the explicit solution of this backward SDE is
where,
In particular, we have \(v^{\theta }(0)= E[\psi ^{\theta }_T]\). Therefore, in view of (24), it would be natural to choose a transformation of (p, q) into an adjoint process \((\hat{p}, \hat{q})\), where,
such that
This would imply that, for almost every \(0\le t\le T\),
which in turn would reduce the number of adjoint processes to those of the form given by (23).
We consider the following transform:
In view of (21), we have
We should identify the processes \(\hat{\alpha }\) and \(\hat{q}\) such that
for which (25) and (26) are satisfied.
In order to investigate the properties of these new processes \((\hat{p},\hat{q})\), the following properties of the generic martingale \(v^{\theta }\), used in [11], are essential. We reproduce them here for the sake of completeness. Since, by Assumption 1, f and h are bounded by some constant \(C>0\), we have
Therefore, \(v^{\theta }\) is a uniformly integrable \({\mathrm{l\negthinspace F}}\)-martingale satisfying
Hence, in view of (2), we have
Furthermore, the martingale \(v^{\theta }\) enjoys the following useful logarithmic transform established in ([12], Proposition 3.1)
and
Moreover, the process Z is the first component of the \({\mathrm{l\negthinspace F}}\)-adapted pair of processes \((Z,\ell )\) which is the unique solution to the following quadratic BSDE:
where, \(\ell (t)=(\ell _1(t),\ell _2(t))\) satisfies
In particular, \(v^{\theta }\) solves the following linear backward SDE
Hence,
Proof of Lemma 1. In view of (30),
is a uniformly integrable \({\mathrm{l\negthinspace F}}\)-martingale. \(\square \)
To identify the processes \(\tilde{\alpha }\) and \(\tilde{q}\) such that
we may apply Itô’s formula to the process \({p}(t)=\theta v^{\theta }\tilde{p}(t)\), use (21) and (34) and identify the coefficients. We obtain
Therefore,
where, \(B_t^{\theta }:=B_t-\theta \int _0^t \ell (s)ds,\,\, 0\le t\le T\), which is, in view of (35) and Girsanov’s Theorem, a \({\mathrm{l\negthinspace P}}^{\theta }\)-Brownian motion, where \(\frac{d{\mathrm{l\negthinspace P}}^{\theta }}{d{\mathrm{l\negthinspace P}}}\Big |_{{\mathscr {F}}_t}:=L^{\theta }_t\).
In particular,
Therefore, noting that \(\hat{p}_3(t):=[\theta v^{\theta }(t)]^{-1}p_3(t)\) is square-integrable, we obtain
Thus, its quadratic variation satisfies \(\int _0^T|\hat{q}_3(t)|^2dt=0\), \({\mathrm{l\negthinspace P}}^{\theta }\)-a.s. This implies that, for almost every \(0\le t\le T\), \(\hat{q}_3(t)=0\), \({\mathrm{l\negthinspace P}}^{\theta }\)- and \({\mathrm{l\negthinspace P}}\)-a.s.
Hence, we can drop the last components from the adjoint processes \((\hat{p},\hat{q})\) and only consider (keeping the same notation)
for which (37) reduces to the risk-sensitive adjoint equation:
In view of the uniqueness of the \({\mathrm{l\negthinspace F}}\)-adapted pair (p, q) solving (21) and of the pair \((v^{\theta },\ell )\) satisfying (33) and (34), the solution of the system of backward SDEs (38) is unique and satisfies (15).
3.3 Risk-Sensitive Stochastic Maximum Principle
We may use the transform (27) and (36) to obtain the explicit form (11) of the risk-sensitive Hamiltonian \(H^{\theta }\) defined by
where, \(H^e\) is defined by (19).
Let
and
We have
where, we recall that \(v^{\theta }(t)/v^{\theta }(0)=L^{\theta }_t=d{\mathrm{l\negthinspace P}}^{\theta }/d{\mathrm{l\negthinspace P}}|_{{\mathscr {F}}_t}\).
Now, since \(\theta >0\) and \(v^{\theta }(0)=E[\psi _T^{\theta }]>0\), the variational inequality (1) translates into
for all \(u\in U,\) almost every t and \({\mathrm{l\negthinspace P}}^{\theta }-\)almost surely. This finishes the proof of Theorem 1.
4 Illustrative Example: Linear-Quadratic Risk-Sensitive Model Under Partial Observation
To illustrate our approach, we consider a one-dimensional linear diffusion with an exponential-quadratic cost functional. Perhaps the easiest example of a linear-quadratic (LQ) risk-sensitive control problem with mean-field coupling is
where, \(a, b, \alpha ,\beta ,\mu \) and \(\sigma \) are real constants.
In this section we illustrate our approach by considering only the LQ risk-sensitive control problem under partial observation without the mean-field coupling, i.e., \(\mu =0\), so that our result can be compared with [8], where a similar example (in many dimensions) is studied using the Dynamic Programming Principle. The case \(\mu \ne 0\) can be treated in a similar fashion (cf. [11]).
We consider the linear-quadratic risk-sensitive control problem:
where, \(a, b, \alpha ,\beta \) and \(\sigma \) are real constants.
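The displays for the LQ problem (40) did not survive extraction. A plausible reconstruction with \(\mu =0\), in which the placement of the constants and the quadratic cost weights are assumptions made for illustration, is

```latex
\left\{
\begin{aligned}
&dx(t) = \big(a\,x(t) + b\,u(t)\big)\,dt + \sigma\,dW_t + \alpha\,d\widetilde{W}^u_t,
 \qquad x(0)=x_0,\\
&dY_t = \beta\,x(t)\,dt + d\widetilde{W}^u_t,\qquad Y_0=0,\\
&J^{\theta}(u(\cdot)) = E^u\Big[\exp\Big(\theta\Big(
 \tfrac{1}{2}\int_0^T u^2(t)\,dt + \tfrac{1}{2}\,x^2(T)\Big)\Big)\Big]
 \longrightarrow \min_{u\in\mathscr{U}}.
\end{aligned}
\right.
```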
An admissible process \((\bar{\rho }(\cdot ), \bar{x}(\cdot ), \bar{u}(\cdot ))\) satisfying the necessary optimality conditions of Theorem 1 is obtained by solving the following system of forward-backward SDEs (cf. (5) and (14); see Remark 1 above).
where,
and the associated risk-sensitive Hamiltonian is
In general, the solution \((v^{\theta },\ell )\) primarily gives the correct form of the process \(\ell \), which may be a function of the optimal control \(\bar{u}\). Inserting \(\ell \) into the BSDE satisfied by (p, q) in the system (41) and solving for (p, q), we arrive at the characterization of the optimal control of our problem.
For the LQ control problem, it turns out that by considering the BSDE satisfied by \((v^{\theta },\ell )\), we can find an explicit form of the optimal control \(\bar{u}\). Indeed, by (31), this is equivalent to considering the BSDE satisfied by \((Z,\ell )\):
Since \(\bar{u}\) is \( {\mathscr {F}}^Y_t\)-adapted, the form of \(Z_T\) suggests that we characterize \(\bar{u}\) and \(\ell \) such that
where, \(\gamma \) and \(\eta \) are deterministic functions such that \(\gamma (T)=1\) and \(\eta (T)=0\). In view of the SDEs satisfied by \((\bar{\rho },\bar{x})\) in (41), applying Itô’s formula and identifying the coefficients, we get
and
Hence,
where, the first equation is the risk-sensitive Riccati equation, and
By the conditional Jensen’s inequality, we have
Therefore, the optimal control is
and the optimal dynamics solves the linear SDE
where, by the filter equation of Theorem 8.1 in [22], \(\pi _t(\bar{x}):=E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t]\) is the solution of the SDE on \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}}^{\theta })\):
where, \(\bar{Y}^{\theta }_t=Y_t-\int _0^t(\theta \alpha \gamma (s)+\beta ) \pi _s(\bar{x})ds\) is an \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}^Y, {\mathrm{l\negthinspace P}}^{\theta })\)-Brownian motion.
Inserting the form (43) of \(\ell \) into the BSDE satisfied by (p, q) in the system (41) and solving for (p, q), we arrive at the same characterization of the optimal control of our problem, obtained as a maximizer of the associated \(H^{\theta }\) given by (42). We sketch the main steps and omit the details.
We have
The BSDE satisfied by (p, q) then reads
In view of Theorem 1, if \(\bar{u}\) is an optimal control of the system (40), it is necessary that
This yields
The associated state dynamics \(\bar{x}\) then solves the SDE
It remains to compute \(E^{\theta }[p_2(t)|{\mathscr {F}}^Y_t]\). Indeed, inserting the form (43) of \(\ell \) into the BSDE satisfied by (p, q) in the system (47), applying Itô's formula and identifying the coefficients, it is easy to check that \((p_1(t),q_{11}(t),q_{12}(t))\) given by
solves the first adjoint equation in (47). Furthermore, since \(p_2(T)=-\bar{x}(T)\), setting
where, \(\lambda \) is a deterministic function such that \(\lambda (T)=1\), and identifying the coefficients, we find that \(\lambda \) satisfies the risk-sensitive Riccati equation in (44). Moreover,
By uniqueness of the solution of the risk-sensitive Riccati equation in (44), it follows that \(\lambda =\gamma \). Therefore,
Summing up: the optimal control of the LQ-problem (41) is
where, \(\gamma \) solves the risk-sensitive Riccati equation
The optimal dynamics solves the linear SDE
and the filter \(\pi _t(\bar{x}):=E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t]\) is the solution of the SDE on \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}}^{\theta })\):
where, \(\bar{Y}^{\theta }_t=Y_t-\int _0^t(\theta \alpha \gamma (s)+\beta ) \pi _s(\bar{x})ds\) is an \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}^Y, {\mathrm{l\negthinspace P}}^{\theta })\)-Brownian motion.
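Numerically, the risk-sensitive Riccati equation in (44) can be solved backward from the terminal condition \(\gamma (T)=1\) by a simple Euler scheme. Since the display of (44) is not reproduced here, the right-hand side below uses a generic scalar risk-sensitive LQ form, \(\dot{\gamma } = -2a\gamma + (b^2-\theta \alpha ^2)\gamma ^2 - 1\); this specific form, and the parameter values, are illustrative assumptions rather than the paper's formula.

```python
import numpy as np

# Sketch: backward Euler solution of a scalar Riccati ODE with gamma(T) = 1,
# as required by (44).  The RHS  gamma' = -2a*gamma + (b^2 - theta*alpha^2)*gamma^2 - 1
# is a generic risk-sensitive LQ form, assumed here for illustration only.
a, b_, alpha_, theta = -1.0, 1.0, 0.5, 0.1   # illustrative model constants
T, N = 1.0, 1000
dt = T / N

gamma = np.empty(N + 1)
gamma[N] = 1.0                               # terminal condition gamma(T) = 1
for k in range(N, 0, -1):
    rhs = -2.0 * a * gamma[k] + (b_**2 - theta * alpha_**2) * gamma[k]**2 - 1.0
    gamma[k - 1] = gamma[k] - dt * rhs       # explicit Euler, backward in time

print(round(gamma[0], 4))                    # gamma(0), used in the optimal feedback
```

With these (assumed) parameters the solution flows backward from 1 toward the stable root of the quadratic right-hand side, so \(\gamma \) stays positive and bounded on [0, T].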
References
Andersson, D., Djehiche, B.: A maximum principle for SDE’s of mean-field type. Appl. Math. Optim. 63(3), 341–356 (2010)
Bensoussan, A., Sung, K.C.J., Yam, S.C.P., Yung, S.P.: Linear-quadratic mean field games. Preprint, arXiv:1404.5741 (2014)
Baras, J.S., Elliott, R.J., Kohlmann, M.: The partially observed stochastic minimum principle. SIAM J. Control Optim. 27(6), 1279–1292 (1989)
Bensoussan, A.: Maximum principle and dynamic programming approaches of the optimal control of partially observed diffusions. Stochastics 9, 169–222 (1983)
Buckdahn, R., Djehiche, B., Li, J.: A general stochastic maximum principle for SDEs of mean-field type. Appl. Math. Optim. 64(2), 197–216 (2011)
Buckdahn, R., Li, J., Peng, S.: Mean-field backward stochastic differential equations and related partial differential equations. Stoch. Process. Appl. 119(10), 3133–3154 (2009)
Carmona, R., Delarue, F.: Forward-backward stochastic differential equations and controlled McKean-Vlasov dynamics. Preprint, arXiv:1303.5835 (2013)
Charalambous, C.D.: Partially observable nonlinear risk-sensitive control problems: dynamic programming and verification theorem. IEEE Trans. Autom. Control 42(8), 1130–1138 (1997)
Charalambous, C.D., Hibey, J.: Minimum principle for partially observable nonlinear risk-sensitive control problems using measure-valued decompositions. Stochast. Stochast. Rep. 57, 247–288 (1996)
Davis, M.H.A., Varaiya, P.: Dynamic programming conditions for partially observable stochastic systems. SIAM J. Control Optim. 11(2), 226–261 (1973)
Djehiche, B., Tembine, H., Tempone, R.: A stochastic maximum principle for risk-sensitive mean-field type control. IEEE Trans. Autom. Control (2014). doi:10.1109/TAC.2015.2406973
El-Karoui, N., Hamadène, S.: BSDEs and risk-sensitive control, zero-sum and nonzero-sum game problems of stochastic functional differential equations. Stoch. Process. Appl. 107, 145–169 (2003)
Fleming, W.H.: Optimal control of partially observable diffusions. SIAM J. Control Optim. 6, 194–214 (1968)
Hausmann, U.G.: The maximum principle for optimal control of diffusions with partial information. SIAM J. Control Optim. 25, 341–361 (1987)
Hosking, J.: A stochastic maximum principle for a stochastic differential game of a mean-field type. Appl. Math. Optim. 66, 415–454 (2012)
Huang, J., Wang, G., Xiong, J.: A maximum principle for partial information backward stochastic control problems with applications. SIAM J. Control Optim. 48, 2106–2117 (2009)
Jacobson, D.H.: Optimal stochastic linear systems with exponential criteria and their relation to differential games. IEEE Trans. Autom. Control AC-18, 124–131 (1973)
Jourdain, B., Méléard, S., Woyczynski, W.: Nonlinear SDEs driven by Lévy processes and related PDEs. Alea 4, 1–29 (2008)
Kwakernaak, H.: A minimum principle for stochastic control problems with output feedback. Syst. Control Lett. 1, 74–77 (1981)
Li, J.: Stochastic maximum principle in the mean-field controls. Automatica 48, 366–373 (2012)
Li, X., Tang, S.: General necessary conditions for partially observed optimal stochastic controls. J. Appl. Probab. 32, 1118–1137 (1995)
Liptser, R.S., Shiryayev, A.N.: Statistics of Random Processes, vol. 1. Springer, New York (1977)
Tang, S.: The Maximum Principle for partially observed optimal control of stochastic differential equations. SIAM J. Control Optim. 36(5), 1596–1617 (1998)
Tembine, H., Zhu, Q., Basar, T.: Risk-sensitive mean-field games. IEEE Trans. Autom. Control 59(4), 835–850 (2014)
Wang, G., Zhang, C., Zhang, W.: Stochastic maximum principle for mean-field type optimal control under partial information. IEEE Trans. Autom. Control 59(2), 522–528 (2014)
Whittle, P.: A risk-sensitive maximum principle: the case of imperfect state observations. IEEE Trans. Autom. Control 36, 793–801 (1991)
Zhou, X.Y.: On the necessary conditions of optimal control for stochastic partial differential equations. SIAM J. Control Optim. 31, 1462–1478 (1993)
Open Access This chapter is distributed under the terms of the Creative Commons Attribution Noncommercial License, which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
© 2016 The Author(s)
Djehiche, B., Tembine, H. (2016). Risk-Sensitive Mean-Field Type Control Under Partial Observation. In: Benth, F., Di Nunno, G. (eds) Stochastics of Environmental and Financial Economics. Springer Proceedings in Mathematics & Statistics, vol 138. Springer, Cham. https://doi.org/10.1007/978-3-319-23425-0_9
Print ISBN: 978-3-319-23424-3
Online ISBN: 978-3-319-23425-0