Commit c0bd006

update
1 parent c0445c6 commit c0bd006

2 files changed

Lines changed: 119 additions & 37 deletions

class01/Manifest.toml

Lines changed: 34 additions & 33 deletions
Some generated files are not rendered by default.

class01/class01_intro.jl

Lines changed: 85 additions & 4 deletions
@@ -20,6 +20,7 @@ end
 begin
 import Pkg
 Pkg.activate(".")
+Pkg.instantiate()
 # Pkg.status()
 using PlutoUI
 using Random
@@ -38,11 +39,89 @@ using ForwardDiff
 # ╔═╡ ec473e69-d5ec-4d6a-b868-b89dadb85705
 ChooseDisplayMode()
 
+# ╔═╡ 1f774f46-d57d-4668-8204-dc83d50d8c94
+md"# Intro - Optimal Control and Learning
+
+In this course, we are interested in problems with the following structure:
+
+```math
+\begin{equation}
+\!\!\!\!\!\!\!\!\min_{\substack{(\mathbf y_1,\mathbf x_1)\\\mathrm{s.t.}}}
+\!\underset{%
+\phantom{\substack{(\mathbf y_1,\mathbf x_1)\\\mathrm{s.t.}}}%
+\!\!\!\!\!\!\!\!\!\!(\mathbf y_1,\mathbf x_1)\in\mathcal X_1(\mathbf x_0)%
+}{%
+\!\!\!\!f(\mathbf x_1,\mathbf y_1)%
+}
++\mathbb{E}_1\Bigl[
+\quad \cdots
+
+\;+\;\mathbb{E}_t\Bigl[
+\min_{\substack{(\mathbf y_t,\mathbf x_t)\\\mathrm{s.t.}}}
+\!\underset{%
+\phantom{\substack{(\mathbf y_t,\mathbf x_t)\\\mathrm{s.t.}}}%
+\!\!\!\!(\mathbf y_t,\mathbf x_t)\in\mathcal X_t(\mathbf x_{t-1},w_t)%
+}{%
+\!\!\!\!\!\!\!\!\!\!f(\mathbf x_t,\mathbf y_t)%
+}
++\mathbb{E}_{t+1}[\cdots]
+\Bigr]\cdots\Bigr].
+\end{equation}
+```
+which minimizes a first-stage cost function $f(\mathbf{x}_1,
+\mathbf{y}_1)$ plus the expected value of future costs over possible
+values of the exogenous stochastic variable $\{w_{t}\}_{t=2}^{T} \in
+\Omega$.
+
+Here, $\mathbf{x}_0$ is the initial system state and the
+control decisions $\mathbf{y}_t$ are obtained at every period $t$
+under a feasible region defined by the incoming state
+$\mathbf{x}_{t-1}$ and the realized uncertainty $w_t$. $\mathbb{E}_t$ represents the expected value over future uncertainties $\{w_{\tau}\}_{\tau=t}^{T}$. This
+optimization program assumes that the system is entirely defined by
+the incoming state, a common modeling choice in many frameworks (e.g.,
+MDPs). This is without loss of generality,
+since any information can be appended to the state. The system
+constraints can be generally posed as:
+
+```math
+\begin{align}
+&\mathcal{X}_t(\mathbf{x}_{t-1}, w_t)=
+\begin{cases}
+\mathcal{T}(\mathbf{x}_{t-1}, w_t, \mathbf{y}_t) = \mathbf{x}_t \\
+h(\mathbf{x}_t, \mathbf{y}_t) \geq 0
+\end{cases}
+\end{align}
+```
+"
+
+# ╔═╡ a0f71960-c97c-40d1-8f78-4b1860d2e0a2
+md"""
+where the outgoing state of the system $\mathbf{x}_t$ is a
+transformation of the incoming state, the realized uncertainty,
+and the control variables. Markov Decision Processes (MDPs) refer
+to $\mathcal{T}$ as the "transition kernel" of the system. State and
+control variables are restricted further by the constraints
+$h(\mathbf{x}_t, \mathbf{y}_t) \geq 0$.
+We consider policies that map past information into decisions:
+in period $t$, the available information is summarized by
+$(\mathbf{x}_{t-1}, w_t)$, and an optimal policy is given by the
+solution of the dynamic equations:
+
+```math
+\begin{align}
+V_{t}(\mathbf{x}_{t-1}, w_t) = &\min_{\mathbf{x}_t, \mathbf{y}_t} \quad \! \! f(\mathbf{x}_t, \mathbf{y}_t) + \mathbb{E}[V_{t+1}(\mathbf{x}_t, w_{t+1})] \\
+& \text{ s.t. } \quad\mathbf{x}_t = \mathcal{T}(\mathbf{x}_{t-1}, w_t, \mathbf{y}_t) \nonumber \\
+& \quad \quad \quad \! \! h(\mathbf{x}_t, \mathbf{y}_t) \geq 0. \nonumber
+\end{align}
+```
+"""
+
 # ╔═╡ 52005382-177b-4a11-a914-49a5ffc412a3
-md"# 101 (Continuous-Time) Dynamics
-#### A Crash Course
+section_outline(md"A Crash Course:",md" (Continuous-Time) Dynamics
+")
 
-General form for a smooth system:
+# ╔═╡ 8ea866a6-de0f-4812-8f59-2aebec709243
+md"General form for a smooth system:
 
 ```math
 \dot{x} = f(x,u) \quad \text{First-Order Ordinary Differential Equation (ODE)}
@@ -56,7 +135,6 @@ u \in \mathbb{R}^{m} & \text{Control} \\
 \dot{x} \in \mathbb{R}^{n} & \text{Time derivative of } x \\
 \end{cases}
 ```
-
 "
 
 # ╔═╡ 2be161cd-2d4c-4778-adca-d45f8ab05f98
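
Note on cell `8ea866a6`: the general form $\dot{x} = f(x,u)$ becomes concrete once a specific $f$ and an integrator are chosen. The sketch below is not part of this commit. It assumes a hypothetical pendulum model (`pendulum`, with illustrative parameters `g` and `l`) and a fixed-step fourth-order Runge-Kutta scheme (`rk4_step`, `simulate`); all of these names are invented for illustration, and the control is held constant over each step.

```julia
# Hypothetical example system: pendulum with state x = [θ, ω] and torque input u.
# Implements xdot = f(x, u) for this particular (assumed) model.
function pendulum(x, u; g = 9.81, l = 1.0)
    θ, ω = x
    return [ω, -(g / l) * sin(θ) + u]
end

# One explicit fourth-order Runge-Kutta step, with u held constant over dt.
function rk4_step(f, x, u, dt)
    k1 = f(x, u)
    k2 = f(x .+ 0.5dt .* k1, u)
    k3 = f(x .+ 0.5dt .* k2, u)
    k4 = f(x .+ dt .* k3, u)
    return x .+ (dt / 6) .* (k1 .+ 2 .* k2 .+ 2 .* k3 .+ k4)
end

# Roll the dynamics forward from x0 for N steps under a constant input u.
function simulate(f, x0, u, dt, N)
    xs = [x0]
    for _ in 1:N
        push!(xs, rk4_step(f, xs[end], u, dt))
    end
    return xs
end

# Example: release the unforced pendulum from 0.1 rad and simulate 10 seconds.
xs = simulate(pendulum, [0.1, 0.0], 0.0, 0.01, 1000)
```

Any function with the signature `f(x, u)` returning a vector of the same length as `x` can be passed to `simulate` in place of `pendulum`; the step size `dt` is a tuning choice, not something prescribed by the notebook.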
@@ -951,7 +1029,10 @@ end
 # ╔═╡ Cell order:
 # ╟─13b12c00-6d6e-11f0-3780-a16e73360478
 # ╟─ec473e69-d5ec-4d6a-b868-b89dadb85705
+# ╟─1f774f46-d57d-4668-8204-dc83d50d8c94
+# ╟─a0f71960-c97c-40d1-8f78-4b1860d2e0a2
 # ╟─52005382-177b-4a11-a914-49a5ffc412a3
+# ╟─8ea866a6-de0f-4812-8f59-2aebec709243
 # ╟─2be161cd-2d4c-4778-adca-d45f8ab05f98
 # ╟─b452ee52-ee33-44ad-a980-6a6e90954ee1
 # ╟─9f62fae9-283c-44c3-8d69-29bfa90faf29
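
Note on cell `a0f71960`: the dynamic equations for $V_t(\mathbf{x}_{t-1}, w_t)$ can be solved by backward recursion when states, controls, and uncertainties are finite. The sketch below is not part of this commit. It assumes a hypothetical toy reservoir (integer storage, three equally likely inflows, squared deviation from a fixed demand as the stage cost, and a zero terminal value $V_{T+1} = 0$); the names `transition`, `feasible`, `stage_cost`, and `solve_dp` and every constant are invented for illustration.

```julia
# Hypothetical toy problem: a reservoir with integer storage levels.
const XMAX = 4            # storage capacity (assumed)
const W = [0, 1, 2]       # possible inflows w_t, assumed equally likely
const DEMAND = 2          # per-period demand (assumed)
const HORIZON = 3         # number of stages T (assumed)

# Transition x_t = T(x_{t-1}, w_t, y_t): storage plus inflow minus release.
transition(x_prev, w, y) = x_prev + w - y

# Constraint h(x_t, y_t) >= 0, written here as a feasibility check.
feasible(x, y) = 0 <= x <= XMAX && y >= 0

# Stage cost f(x_t, y_t): squared deviation of the release from demand.
stage_cost(x, y) = (DEMAND - y)^2

# Backward recursion; V[t][(x_prev, w)] stores V_t(x_{t-1}, w_t).
function solve_dp()
    V = [Dict{Tuple{Int,Int},Float64}() for _ in 1:HORIZON+1]
    policy = [Dict{Tuple{Int,Int},Int}() for _ in 1:HORIZON]
    for x in 0:XMAX, w in W
        V[HORIZON+1][(x, w)] = 0.0   # terminal condition V_{T+1} = 0
    end
    for t in HORIZON:-1:1, x_prev in 0:XMAX, w in W
        best_val, best_y = Inf, 0
        for y in 0:(x_prev + w)      # cannot release more than is available
            x = transition(x_prev, w, y)
            feasible(x, y) || continue
            # E[V_{t+1}(x_t, w_{t+1})] under the uniform inflow distribution
            expected = sum(V[t+1][(x, w_next)] for w_next in W) / length(W)
            val = stage_cost(x, y) + expected
            if val < best_val
                best_val, best_y = val, y
            end
        end
        V[t][(x_prev, w)] = best_val
        policy[t][(x_prev, w)] = best_y
    end
    return V, policy
end

V, policy = solve_dp()
```

`policy[t][(x_prev, w)]` then gives the release chosen in period `t` for incoming storage `x_prev` and realized inflow `w`, which is the sense in which a policy maps the available information into a decision. Exhaustive enumeration like this only scales to very small problems.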
