Dynamic Optimization for Agent-Based Systems and Inverse Optimal Control

Abstract: This dissertation is concerned with three problems within the field of optimization for agent-based systems. First, the inverse optimal control problem is investigated for a single-agent system. Given a dynamic process, the goal is to recover the quadratic cost function from observations of optimal control sequences. Such an estimate could then help us develop a better understanding of the physical system and reproduce a similar optimal controller in other applications. Next, problems of optimization over networked systems are considered. A novel differential game approach is proposed for the optimal intrinsic formation control of multi-agent systems. As for the credit scoring problem, an optimal filtering framework is utilized to recursively improve the scoring accuracy based on dynamic network information.

In paper A, the finite-horizon inverse optimal control problem is investigated, where the linear quadratic (LQ) cost function is to be estimated from the optimal feedback controller. Although the infinite-horizon inverse LQ problem is well studied with numerous results, the finite-horizon case is still an open problem. To the best of our knowledge, we present the first complete necessary and sufficient condition for the existence of corresponding LQ cost functions. In feasible cases, the analytic expression of the whole solution space is derived and the equivalence of weighting matrices is discussed. For infeasible problems, an infinite-dimensional convex problem is formulated to obtain a best-fit approximate solution with minimal control residual, where the optimality condition is solved under a static quadratic programming framework to facilitate the computation.

In paper B, the optimal formation control problem of a multi-agent system is studied. The foraging behavior of N agents is modeled as a finite-horizon non-cooperative differential game under local information, and its Nash equilibrium is studied.
The collaborative swarming behavior that emerges from non-cooperative individual actions also sheds new light on such phenomena in nature. The proposed framework has tutorial value, since it offers a systematic approach to formation control in which the desired formation is obtained intrinsically, by adjusting only the individual costs and the network topology. In contrast to most existing methodologies, which regulate formation errors with respect to a pre-defined pattern, the proposed method requires no prior information about the desired pattern. We refer to this type of formation control as intrinsic formation control. Patterns of regular polygons, antipodal formations and Platonic solids can be achieved as Nash equilibria of the game, while inter-agent collisions are naturally avoided.

Paper C considers the credit scoring problem by incorporating dynamic network information, where the advantages of such incorporation are investigated in two scenarios. First, when score publication is merely individual-dependent, an optimal Bayesian filter is designed for risk prediction, where network observations are utilized to provide a reference for the bank on future financial decisions. Furthermore, a recursive Bayes estimator is proposed to improve the accuracy of score publication by incorporating the dynamic network topology as well. It is shown that, under the proposed evolution framework, the designed estimator has a higher precision than any efficient estimator, and its mean square error is strictly smaller than the Cramér-Rao lower bound for clients within a certain range of scores.
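The last claim of paper C may look surprising, since the Cramér-Rao lower bound is usually quoted as a floor on estimation error. The bound applies to unbiased estimators, however, and a Bayes estimator is generally biased. The following minimal sketch illustrates this in a generic scalar Gaussian setting; it is not the estimator or the network model of the dissertation. The function names and the parameters `sigma2` (noise variance), `n` (number of observations) and `tau2` (prior variance) are assumptions introduced only for this illustration.

```python
import math


def mse_unbiased(sigma2, n):
    """MSE of the sample mean of n N(theta, sigma2) observations.

    The sample mean is unbiased and efficient, so this value equals the
    Cramer-Rao lower bound sigma2 / n.
    """
    return sigma2 / n


def shrinkage_weight(sigma2, n, tau2):
    """Weight a in (0, 1) of the posterior mean a * xbar under a N(0, tau2) prior."""
    return tau2 / (tau2 + sigma2 / n)


def mse_bayes(theta, sigma2, n, tau2):
    """Exact MSE of the posterior-mean estimator a * xbar at a fixed true theta."""
    a = shrinkage_weight(sigma2, n, tau2)
    variance = a ** 2 * sigma2 / n          # variance of a * xbar
    bias_sq = (1.0 - a) ** 2 * theta ** 2   # squared bias at the true theta
    return variance + bias_sq


def crossover(sigma2, n, tau2):
    """|theta| below which the Bayes estimator has MSE below sigma2 / n."""
    a = shrinkage_weight(sigma2, n, tau2)
    return math.sqrt((1.0 + a) / (1.0 - a) * sigma2 / n)


if __name__ == "__main__":
    sigma2, n, tau2 = 1.0, 10, 1.0  # illustrative values only
    print("Cramer-Rao bound:", mse_unbiased(sigma2, n))
    print("crossover |theta|:", crossover(sigma2, n, tau2))
    for theta in (0.0, 0.5, 1.0, 2.0):
        print(theta, mse_bayes(theta, sigma2, n, tau2))
```

In this toy setting the biased estimator beats the bound only when the true parameter lies within a bounded interval around the prior mean, which mirrors the qualifier "within a certain range of scores" above. Outside that interval the squared bias dominates and the unbiased estimator is better.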
