Risk aversion
This tutorial was generated using Literate.jl. Download the source as a .jl
file. Download the source as a .ipynb
file.
In Introductory theory, we implemented a basic version of the SDDP algorithm. This tutorial extends that implementation to add risk-aversion.
Packages
This tutorial uses the following packages. For clarity, we call import PackageName
so that we must prefix PackageName.
to all functions and structs provided by that package. Everything not prefixed is either part of base Julia, or we wrote it.
import ForwardDiff
import HiGHS
import Ipopt
import JuMP
import Statistics
Risk aversion: what and why?
Often, the agents making decisions in complex systems are risk-averse, that is, they care more about avoiding very bad outcomes, than they do about having a good average outcome.
As an example, consumers in a hydro-thermal problem may be willing to pay a slightly higher electricity price on average, if it means that there is a lower probability of blackouts.
Risk aversion in multistage stochastic programming has been well studied in the academic literature, and is widely used in production implementations around the world.
Risk measures
One way to add risk aversion to models is to use a risk measure. A risk measure is a function that maps a random variable to a real number.
You are probably already familiar with lots of different risk measures. For example, the mean, median, mode, and maximum are all risk measures.
We call the act of applying a risk measure to a random variable "computing the risk" of a random variable.
To keep things simple, and because we need it for SDDP, we restrict our attention to random variables $Z$ with a finite sample space $\Omega$ and positive probabilities $p_{\omega}$ for all $\omega \in \Omega$. We denote the realizations of $Z$ by $Z(\omega) = z_{\omega}$.
A risk measure, $\mathbb{F}[Z]$, is a convex risk measure if it satisfies the following axioms:
Axiom 1: monotonicity
Given two random variables $Z_1$ and $Z_2$, with $Z_1 \le Z_2$ almost surely, then $\mathbb{F}[Z_1] \le F[Z_2]$.
Axiom 2: translation equivariance
Given two random variables $Z_1$ and $Z_2$, then for all $a \in \mathbb{R}$, $\mathbb{F}[Z + a] = \mathbb{F}[Z] + a$.
Axiom 3: convexity
Given two random variables $Z_1$ and $Z_2$, then for all $a \in [0, 1]$,
\[\mathbb{F}[a Z_1 + (1 - a) Z_2] \le a \mathbb{F}[Z_1] + (1-a)\mathbb{F}[Z_2].\]
Now we know what a risk measure is, let's see how we can use them to form risk-averse decision rules.
Risk-averse decision rules: Part I
We started this tutorial by explaining that we are interested in risk aversion because some agents are risk-averse. What that really means, is that they want a policy that is also risk-averse. The question then becomes, how do we create risk-averse decision rules and policies?
Recall from Introductory theory that we can form an optimal decision rule using the recursive formulation:
\[\begin{aligned} V_i(x, \omega) = \min\limits_{\bar{x}, x^\prime, u} \;\; & C_i(\bar{x}, u, \omega) + \mathbb{E}_{j \in i^+, \varphi \in \Omega_j}[V_j(x^\prime, \varphi)]\\ & x^\prime = T_i(\bar{x}, u, \omega) \\ & u \in U_i(\bar{x}, \omega) \\ & \bar{x} = x, \end{aligned}\]
where our decision rule, $\pi_i(x, \omega)$, solves this optimization problem and returns a $u^*$ corresponding to an optimal solution.
If we can replace the expectation operator $\mathbb{E}$ with another (more risk-averse) risk measure $\mathbb{F}$, then our decision rule will attempt to choose a control decision now that minimizes the risk of the future costs, as opposed to the expectation of the future costs. This makes our decisions more risk-averse, because we care more about the worst outcomes than we do about the average.
Therefore, we can form a risk-averse decision rule using the formulation:
\[\begin{aligned} V_i(x, \omega) = \min\limits_{\bar{x}, x^\prime, u} \;\; & C_i(\bar{x}, u, \omega) + \mathbb{F}_{j \in i^+, \varphi \in \Omega_j}[V_j(x^\prime, \varphi)]\\ & x^\prime = T_i(\bar{x}, u, \omega) \\ & u \in U_i(\bar{x}, \omega) \\ & \bar{x} = x. \end{aligned}\]
To convert this problem into a tractable equivalent, we apply Kelley's algorithm to the risk-averse cost-to-go term $\mathbb{F}_{j \in i^+, \varphi \in \Omega_j}[V_j(x^\prime, \varphi)]$, to obtain the approximated problem:
\[\begin{aligned} V_i^K(x, \omega) = \min\limits_{\bar{x}, x^\prime, u} \;\; & C_i(\bar{x}, u, \omega) + \theta\\ & x^\prime = T_i(\bar{x}, u, \omega) \\ & u \in U_i(\bar{x}, \omega) \\ & \bar{x} = x \\ & \theta \ge \mathbb{F}_{j \in i^+, \varphi \in \Omega_j}\left[V_j^k(x^\prime_k, \varphi)\right] + \frac{d}{dx^\prime}\mathbb{F}_{j \in i^+, \varphi \in \Omega_j}\left[V_j^k(x^\prime_k, \varphi)\right]^\top (x^\prime - x^\prime_k)\quad k=1,\ldots,K. \end{aligned}\]
Note how we need to explicitly compute a risk-averse subgradient! (We need a subgradient because the function might not be differentiable.) When constructing cuts with the expectation operator in Introductory theory, we implicitly used the law of total expectation to combine the two expectations; we can't do that for a general risk measure.
If it's not obvious why we can use Kelley's here, try to use the axioms of a convex risk measure to show that $\mathbb{F}_{j \in i^+, \varphi \in \Omega_j}[V_j(x^\prime, \varphi)]$ is a convex function w.r.t. $x^\prime$ if $V_j$ is also a convex function.
Our challenge is now to find a way to compute the risk-averse cost-to-go function $\mathbb{F}_{j \in i^+, \varphi \in \Omega_j}\left[V_j^k(x^\prime_k, \varphi)\right]$, and a way to compute a subgradient of the risk-averse cost-to-go function with respect to $x^\prime$.
Primal risk measures
Now we know what a risk measure is, and how we will use it, let's implement some code to see how we can compute the risk of some random variables.
We're going to start by implementing the primal version of each risk measure. We implement the dual version in the next section.
First, we need some data:
Z = [1.0, 2.0, 3.0, 4.0]
4-element Vector{Float64}:
1.0
2.0
3.0
4.0
with probabilities:
p = [0.1, 0.2, 0.4, 0.3]
4-element Vector{Float64}:
0.1
0.2
0.4
0.3
We're going to implement a number of different risk measures, so to leverage Julia's multiple dispatch, we create an abstract type:
abstract type AbstractRiskMeasure end
and function to overload:
"""
primal_risk(F::AbstractRiskMeasure, Z::Vector{<:Real}, p::Vector{Float64})
Use `F` to compute the risk of the random variable defined by a vector of costs
`Z` and non-zero probabilities `p`.
"""
function primal_risk end
Main.primal_risk
We want Vector{<:Real}
instead of Vector{Float64}
because we're going to automatically differentiate this function in the next section.
Expectation
The expectation, $\mathbb{E}$, also called the mean or the average, is the most widely used convex risk measure. The expectation of a random variable is just the sum of $Z$ weighted by the probability:
\[\mathbb{F}[Z] = \mathbb{E}_p[Z] = \sum\limits_{\omega\in\Omega} p_{\omega} z_{\omega}.\]
struct Expectation <: AbstractRiskMeasure end
function primal_risk(::Expectation, Z::Vector{<:Real}, p::Vector{Float64})
return sum(p[i] * Z[i] for i in 1:length(p))
end
primal_risk (generic function with 1 method)
Let's try it out:
primal_risk(Expectation(), Z, p)
2.9000000000000004
WorstCase
The worst-case risk measure, also called the maximum, is another widely used convex risk measure. This risk measure doesn't care about the probability vector p
, only the cost vector Z
:
\[\mathbb{F}[Z] = \max[Z] = \max\limits_{\omega\in\Omega} z_{\omega}.\]
struct WorstCase <: AbstractRiskMeasure end
function primal_risk(::WorstCase, Z::Vector{<:Real}, ::Vector{Float64})
return maximum(Z)
end
primal_risk (generic function with 2 methods)
Let's try it out:
primal_risk(WorstCase(), Z, p)
4.0
Entropic
A more interesting, and less widely used risk measure is the entropic risk measure. The entropic risk measure is parameterized by a value $\gamma > 0$, and computes the risk of a random variable as:
\[\mathbb{F}_\gamma[Z] = \frac{1}{\gamma}\log\left(\mathbb{E}_p[e^{\gamma Z}]\right) = \frac{1}{\gamma}\log\left(\sum\limits_{\omega\in\Omega}p_{\omega} e^{\gamma z_{\omega}}\right).\]
Prove that the entropic risk measure satisfies the three axioms of a convex risk measure.
struct Entropic <: AbstractRiskMeasure
γ::Float64
function Entropic(γ)
if !(γ > 0)
throw(DomainError(γ, "Entropic risk measure must have γ > 0."))
end
return new(γ)
end
end
function primal_risk(F::Entropic, Z::Vector{<:Real}, p::Vector{Float64})
return 1 / F.γ * log(sum(p[i] * exp(F.γ * Z[i]) for i in 1:length(p)))
end
primal_risk (generic function with 3 methods)
exp(x)
overflows when $x > 709$. Therefore, if we are passed a vector of Float64
, use arbitrary precision arithmetic with big.(Z)
.
function primal_risk(F::Entropic, Z::Vector{Float64}, p::Vector{Float64})
return Float64(primal_risk(F, big.(Z), p))
end
primal_risk (generic function with 4 methods)
Let's try it out for different values of $\gamma$:
for γ in [0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1_000.0]
println("γ = $(γ), F[Z] = ", primal_risk(Entropic(γ), Z, p))
end
γ = 0.001, F[Z] = 2.9004449279791005
γ = 0.01, F[Z] = 2.9044427792027596
γ = 0.1, F[Z] = 2.9437604953310674
γ = 1.0, F[Z] = 3.264357634151263
γ = 10.0, F[Z] = 3.879608772845574
γ = 100.0, F[Z] = 3.987960271956741
γ = 1000.0, F[Z] = 3.998796027195674
The entropic has two extremes. As $\gamma \rightarrow 0$, the entropic acts like the expectation risk measure, and as $\gamma \rightarrow \infty$, the entropic acts like the worst-case risk measure.
Computing risk measures this way works well for computing the primal value. However, there isn't an obvious way to compute a subgradient of the risk-averse cost-to-go function, which we need for our cut calculation.
There is a nice solution to this problem, and that is to use the dual representation of a risk measure, instead of the primal.
Dual risk measures
Convex risk measures have a dual representation as follows:
\[\mathbb{F}[Z] = \sup\limits_{q \in\mathcal{M}(p)} \mathbb{E}_q[Z] - \alpha(p, q),\]
where $\alpha$ is a concave function that maps the probability vectors $p$ and $q$ to a real number, and $\mathcal{M}(p) \subseteq \mathcal{P}$ is a convex subset of the probability simplex:
\[\mathcal{P} = \{p \ge 0\;|\;\sum\limits_{\omega\in\Omega}p_{\omega} = 1\}.\]
The dual of a convex risk measure can be interpreted as taking the expectation of the random variable $Z$ with respect to the worst probability vector $q$ that lies within the set $\mathcal{M}$, less some concave penalty term $\alpha(p, q)$.
If we define a function dual_risk_inner
that computes q
and α
:
"""
dual_risk_inner(
F::AbstractRiskMeasure, Z::Vector{Float64}, p::Vector{Float64}
)::Tuple{Vector{Float64},Float64}
Return a tuple formed by the worst-case probability vector `q` and the
corresponding evaluation `α(p, q)`.
"""
function dual_risk_inner end
Main.dual_risk_inner
then we can write a generic dual_risk
function as:
function dual_risk(
F::AbstractRiskMeasure,
Z::Vector{Float64},
p::Vector{Float64},
)
q, α = dual_risk_inner(F, Z, p)
return sum(q[i] * Z[i] for i in 1:length(q)) - α
end
dual_risk (generic function with 1 method)
Expectation
For the expectation risk measure, $\mathcal{M}(p) = \{p\}$, and $\alpha(\cdot, \cdot) = 0$. Therefore:
function dual_risk_inner(::Expectation, ::Vector{Float64}, p::Vector{Float64})
return p, 0.0
end
dual_risk_inner (generic function with 1 method)
We can check we get the same result as the primal version:
dual_risk(Expectation(), Z, p) == primal_risk(Expectation(), Z, p)
true
Worst-case
For the worst-case risk measure, $\mathcal{M}(p) = \mathcal{P}$, and $\alpha(\cdot, \cdot) = 0$. Therefore, the dual representation just puts all of the probability weight on the maximum outcome:
function dual_risk_inner(::WorstCase, Z::Vector{Float64}, ::Vector{Float64})
q = zeros(length(Z))
_, index = findmax(Z)
q[index] = 1.0
return q, 0.0
end
dual_risk_inner (generic function with 2 methods)
We can check we get the same result as the primal version:
dual_risk(WorstCase(), Z, p) == primal_risk(WorstCase(), Z, p)
true
Entropic
For the entropic risk measure, $\mathcal{M}(p) = \mathcal{P}$, and:
\[\alpha(p, q) = \frac{1}{\gamma}\sum\limits_{\omega\in\Omega} q_\omega \log\left(\frac{q_\omega}{p_{\omega}}\right).\]
One way to solve the dual problem is to explicitly solve a nonlinear optimization problem:
function dual_risk_inner(F::Entropic, Z::Vector{Float64}, p::Vector{Float64})
N = length(p)
model = JuMP.Model(Ipopt.Optimizer)
JuMP.set_silent(model)
# For this problem, the solve is more accurate if we turn off problem
# scaling.
JuMP.set_optimizer_attribute(model, "nlp_scaling_method", "none")
JuMP.@variable(model, 0 <= q[1:N] <= 1)
JuMP.@constraint(model, sum(q) == 1)
JuMP.@NLexpression(
model,
α,
1 / F.γ * sum(q[i] * log(q[i] / p[i]) for i in 1:N),
)
JuMP.@NLobjective(model, Max, sum(q[i] * Z[i] for i in 1:N) - α)
JuMP.optimize!(model)
return JuMP.value.(q), JuMP.value(α)
end
dual_risk_inner (generic function with 3 methods)
We can check we get the same result as the primal version:
for γ in [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
primal = primal_risk(Entropic(γ), Z, p)
dual = dual_risk(Entropic(γ), Z, p)
success = primal ≈ dual ? "✓" : "×"
println("$(success) γ = $(γ), primal = $(primal), dual = $(dual)")
end
✓ γ = 0.001, primal = 2.9004449279791005, dual = 2.900444927979128
✓ γ = 0.01, primal = 2.9044427792027596, dual = 2.9044427792027667
✓ γ = 0.1, primal = 2.9437604953310674, dual = 2.9437604953310674
✓ γ = 1.0, primal = 3.264357634151263, dual = 3.264357634151263
✓ γ = 10.0, primal = 3.879608772845574, dual = 3.8796087723675954
✓ γ = 100.0, primal = 3.987960271956741, dual = 3.987960271956741
This method of solving the dual problem "on-the-side" is used by SDDP.jl for a number of risk measures, including a distributionally robust risk measure with the Wasserstein distance. Check out all the risk measures that SDDP.jl supports in Add a risk measure.
The "on-the-side" method is very general, and it lets us incorporate any convex risk measure into SDDP. However, this comes at an increased computational cost and potential numerical issues (e.g., not converging to the exact solution).
However, for the entropic risk measure, Dowson, Morton, and Pagnoncelli (2020) derive the following closed form solution for $q^*$:
\[q_\omega^* = \frac{p_{\omega} e^{\gamma z_{\omega}}}{\sum\limits_{\varphi \in \Omega} p_{\varphi} e^{\gamma z_{\varphi}}}.\]
This is faster because we don't need to use Ipopt, and it avoids some of the numerical issues associated with solving a nonlinear program.
function dual_risk_inner(F::Entropic, Z::Vector{Float64}, p::Vector{Float64})
q, α = zeros(length(p)), big(0.0)
peγz = p .* exp.(F.γ .* big.(Z))
sum_peγz = sum(peγz)
for i in 1:length(q)
big_q = peγz[i] / sum_peγz
α += big_q * log(big_q / p[i])
q[i] = Float64(big_q)
end
return q, Float64(α / F.γ)
end
dual_risk_inner (generic function with 3 methods)
Again, note that we use big
to avoid introducing overflow errors, before explicitly casting back to Float64
for the values we return.
We can check we get the same result as the primal version:
for γ in [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
primal = primal_risk(Entropic(γ), Z, p)
dual = dual_risk(Entropic(γ), Z, p)
success = primal ≈ dual ? "✓" : "×"
println("$(success) γ = $(γ), primal = $(primal), dual = $(dual)")
end
✓ γ = 0.001, primal = 2.9004449279791005, dual = 2.9004449279791005
✓ γ = 0.01, primal = 2.9044427792027596, dual = 2.9044427792027596
✓ γ = 0.1, primal = 2.9437604953310674, dual = 2.943760495331067
✓ γ = 1.0, primal = 3.264357634151263, dual = 3.264357634151263
✓ γ = 10.0, primal = 3.879608772845574, dual = 3.879608772845574
✓ γ = 100.0, primal = 3.987960271956741, dual = 3.987960271956741
Risk-averse subgradients
We ended the section on primal risk measures by explaining how we couldn't use the primal risk measure in the cut calculation because we needed some way of computing a risk-averse subgradient:
\[\theta \ge \mathbb{F}_{j \in i^+, \varphi \in \Omega_j}\left[V_j^k(x^\prime_k, \varphi)\right] + \frac{d}{dx^\prime}\mathbb{F}_{j \in i^+, \varphi \in \Omega_j}\left[V_j^k(x^\prime_k, \varphi)\right]^\top (x^\prime - x^\prime_k).\]
The reason we use the dual representation is because of the following theorem, which explains how to compute a risk-averse gradient.
Let $\omega \in \Omega$ index a random vector with finite support and with nominal probability mass function, $p \in \mathcal{P}$, which satisfies $p > 0$.
Consider a convex risk measure, $\mathbb{F}$, with a convex risk set, $\mathcal{M}(p)$, so that $\mathbb{F}$ can be expressed as the dual form.
Let $V(x,\omega)$ be convex with respect to $x$ for all fixed $\omega\in\Omega$, and let $\lambda(\tilde{x}, \omega)$ be a subgradient of $V(x,\omega)$ with respect to $x$ at $x = \tilde{x}$ for each $\omega \in \Omega$.
Then, $\sum_{\omega\in\Omega}q^*_{\omega} \lambda(\tilde{x},\omega)$ is a subgradient of $\mathbb{F}[V(x,\omega)]$ at $\tilde{x}$, where
\[q^* \in \argmax_{q \in \mathcal{M}(p)}\left\{{\mathbb{E}}_q[V(\tilde{x},\omega)] - \alpha(p, q)\right\}.\]
This theorem can be a little hard to unpack, so let's see an example:
function dual_risk_averse_subgradient(
V::Function,
# Use automatic differentiation to compute the gradient of V w.r.t. x,
# given a fixed ω.
λ::Function = (x, ω) -> ForwardDiff.gradient(x -> V(x, ω), x);
F::AbstractRiskMeasure,
Ω::Vector,
p::Vector{Float64},
x̃::Vector{Float64},
)
# Evaluate the function at x=x̃ for all ω ∈ Ω.
V_ω = [V(x̃, ω) for ω in Ω]
# Solve the dual problem to obtain an optimal q^*.
q, α = dual_risk_inner(F, V_ω, p)
# Compute the risk-averse subgradient by taking the expectation of the
# subgradients w.r.t. q^*.
dVdx = sum(q[i] * λ(x̃, ω) for (i, ω) in enumerate(Ω))
return dVdx
end
dual_risk_averse_subgradient (generic function with 2 methods)
We can compare the subgradient obtained with the dual form against the automatic differentiation of the primal_risk
function.
function primal_risk_averse_subgradient(
V::Function;
F::AbstractRiskMeasure,
Ω::Vector,
p::Vector{Float64},
x̃::Vector{Float64},
)
inner(x) = primal_risk(F, [V(x, ω) for ω in Ω], p)
return ForwardDiff.gradient(inner, x̃)
end
primal_risk_averse_subgradient (generic function with 1 method)
As our example function, we use:
V(x, ω) = ω * x[1]^2
V (generic function with 1 method)
with:
Ω = [1.0, 2.0, 3.0]
3-element Vector{Float64}:
1.0
2.0
3.0
and:
p = [0.3, 0.4, 0.3]
3-element Vector{Float64}:
0.3
0.4
0.3
at the point:
x̃ = [3.0]
1-element Vector{Float64}:
3.0
If $\mathbb{F}$ is the expectation risk-measure, then:
\[\mathbb{F}[V(x, \omega)] = 2 x^2.\]
The function evaluation $x=3$ is $18$ and the subgradient is $12$. Let's check we get it right with the dual form:
dual_risk_averse_subgradient(V; F = Expectation(), Ω = Ω, p = p, x̃ = x̃)
1-element Vector{Float64}:
12.0
and the primal form:
primal_risk_averse_subgradient(V; F = Expectation(), Ω = Ω, p = p, x̃ = x̃)
1-element Vector{Float64}:
12.0
If $\mathbb{F}$ is the worst-case risk measure, then:
\[\mathbb{F}[V(x, \omega)] = 3 x^2.\]
The function evaluation at $x=3$ is $27$, and the subgradient is $18$. Let's check we get it right with the dual form:
dual_risk_averse_subgradient(V; F = WorstCase(), Ω = Ω, p = p, x̃ = x̃)
1-element Vector{Float64}:
18.0
and the primal form:
primal_risk_averse_subgradient(V; F = WorstCase(), Ω = Ω, p = p, x̃ = x̃)
1-element Vector{Float64}:
18.0
If $\mathbb{F}$ is the entropic risk measure, the math is a little more difficult to derive analytically. However, we can check against our primal version:
for γ in [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
dual =
dual_risk_averse_subgradient(V; F = Entropic(γ), Ω = Ω, p = p, x̃ = x̃)
primal = primal_risk_averse_subgradient(
V;
F = Entropic(γ),
Ω = Ω,
p = p,
x̃ = x̃,
)
success = primal ≈ dual ? "✓" : "×"
println("$(success) γ = $(γ), primal = $(primal), dual = $(dual)")
end
✓ γ = 0.001, primal = [12.03239965008496], dual = [12.03239965008496]
✓ γ = 0.01, primal = [12.323650575272044], dual = [12.323650575272044]
✓ γ = 0.1, primal = [14.9332498638561], dual = [14.933249863856101]
✓ γ = 1.0, primal = [17.999012701279042], dual = [17.999012701279042]
✓ γ = 10.0, primal = [17.999999999999996], dual = [18.0]
× γ = 100.0, primal = [NaN], dual = [18.0]
Uh oh! What happened with the last line? It looks our primal_risk_averse_subgradient
encountered an error and returned a subgradient of NaN
. This is because of the overflow issue with exp(x)
. However, we can be confident that our dual method of computing the risk-averse subgradient is both correct and more numerically robust than the primal version.
As another sanity check, notice how as $\gamma \rightarrow 0$, we tend toward the solution of the expectation risk-measure [12]
, and as $\gamma \rightarrow \infty$, we tend toward the solution of the worse-case risk measure [18]
.
Risk-averse decision rules: Part II
Why is the risk-averse subgradient theorem helpful? Using the dual representation of a convex risk measure, we can re-write the cut:
\[\theta \ge \mathbb{F}_{j \in i^+, \varphi \in \Omega_j}\left[V_j^k(x^\prime_k, \varphi)\right] + \frac{d}{dx^\prime}\mathbb{F}_{j \in i^+, \varphi \in \Omega_j}\left[V_j^k(x^\prime_k, \varphi)\right]^\top (x^\prime - x^\prime_k),\quad k=1,\ldots,K\]
as:
\[\theta \ge \mathbb{E}_{q_k}\left[V_j^k(x^\prime_k, \varphi) + \frac{d}{dx^\prime}V_j^k(x^\prime_k, \varphi)^\top (x^\prime - x^\prime_k)\right] - \alpha(p, q_k),\quad k=1,\ldots,K,\]
where $q_k = \mathrm{arg}\sup\limits_{q \in\mathcal{M}(p)} \mathbb{E}_q[V_j^k(x_k^\prime, \varphi)] - \alpha(p, q)$.
Therefore, we can formulate a risk-averse decision rule as:
\[\begin{aligned} V_i^K(x, \omega) = \min\limits_{\bar{x}, x^\prime, u} \;\; & C_i(\bar{x}, u, \omega) + \theta\\ & x^\prime = T_i(\bar{x}, u, \omega) \\ & u \in U_i(\bar{x}, \omega) \\ & \bar{x} = x \\ & \theta \ge \mathbb{E}_{q_k}\left[V_j^k(x^\prime_k, \varphi) + \frac{d}{dx^\prime}V_j^k(x^\prime_k, \varphi)^\top (x^\prime - x^\prime_k)\right] - \alpha(p, q_k),\quad k=1,\ldots,K \\ & \theta \ge M. \end{aligned}\]
where $q_k = \mathrm{arg}\sup\limits_{q \in\mathcal{M}(p)} \mathbb{E}_q[V_j^k(x_k^\prime, \varphi)] - \alpha(p, q)$.
Thus, to implement risk-averse SDDP, all we need to do is modify the backward pass to include this calculation of $q_k$, form the cut using $q_k$ instead of $p$, and subtract the penalty term $\alpha(p, q_k)$.
Implementation
Now we're ready to implement our risk-averse version of SDDP.
As a prerequisite, we need most of the code from Introductory theory.
Click to view code from the tutorial "Introductory theory".
struct State
in::JuMP.VariableRef
out::JuMP.VariableRef
end
struct Uncertainty
parameterize::Function
Ω::Vector{Any}
P::Vector{Float64}
end
struct Node
subproblem::JuMP.Model
states::Dict{Symbol,State}
uncertainty::Uncertainty
cost_to_go::JuMP.VariableRef
end
struct PolicyGraph
nodes::Vector{Node}
arcs::Vector{Dict{Int,Float64}}
end
function Base.show(io::IO, model::PolicyGraph)
println(io, "A policy graph with $(length(model.nodes)) nodes")
println(io, "Arcs:")
for (from, arcs) in enumerate(model.arcs)
for (to, probability) in arcs
println(io, " $(from) => $(to) w.p. $(probability)")
end
end
return
end
function PolicyGraph(
subproblem_builder::Function;
graph::Vector{Dict{Int,Float64}},
lower_bound::Float64,
optimizer,
)
nodes = Node[]
for t in 1:length(graph)
model = JuMP.Model(optimizer)
states, uncertainty = subproblem_builder(model, t)
JuMP.@variable(model, cost_to_go >= lower_bound)
obj = JuMP.objective_function(model)
JuMP.@objective(model, Min, obj + cost_to_go)
if length(graph[t]) == 0
JuMP.fix(cost_to_go, 0.0; force = true)
end
push!(nodes, Node(model, states, uncertainty, cost_to_go))
end
return PolicyGraph(nodes, graph)
end
function sample_uncertainty(uncertainty::Uncertainty)
r = rand()
for (p, ω) in zip(uncertainty.P, uncertainty.Ω)
r -= p
if r < 0.0
return ω
end
end
return error("We should never get here because P should sum to 1.0.")
end
function sample_next_node(model::PolicyGraph, current::Int)
if length(model.arcs[current]) == 0
return nothing
else
r = rand()
for (to, probability) in model.arcs[current]
r -= probability
if r < 0.0
return to
end
end
return nothing
end
end
function forward_pass(model::PolicyGraph, io::IO = stdout)
incoming_state =
Dict(k => JuMP.fix_value(v.in) for (k, v) in model.nodes[1].states)
simulation_cost = 0.0
trajectory = Tuple{Int,Dict{Symbol,Float64}}[]
t = 1
while t !== nothing
node = model.nodes[t]
ω = sample_uncertainty(node.uncertainty)
node.uncertainty.parameterize(ω)
for (k, v) in incoming_state
JuMP.fix(node.states[k].in, v; force = true)
end
JuMP.optimize!(node.subproblem)
if JuMP.termination_status(node.subproblem) != JuMP.MOI.OPTIMAL
error("Something went terribly wrong!")
end
outgoing_state = Dict(k => JuMP.value(v.out) for (k, v) in node.states)
stage_cost =
JuMP.objective_value(node.subproblem) - JuMP.value(node.cost_to_go)
simulation_cost += stage_cost
incoming_state = outgoing_state
push!(trajectory, (t, outgoing_state))
t = sample_next_node(model, t)
end
return trajectory, simulation_cost
end
function upper_bound(model::PolicyGraph; replications::Int)
simulations = [forward_pass(model, devnull) for i in 1:replications]
z = [s[2] for s in simulations]
μ = Statistics.mean(z)
tσ = 1.96 * Statistics.std(z) / sqrt(replications)
return μ, tσ
end
function lower_bound(model::PolicyGraph)
node = model.nodes[1]
bound = 0.0
for (p, ω) in zip(node.uncertainty.P, node.uncertainty.Ω)
node.uncertainty.parameterize(ω)
JuMP.optimize!(node.subproblem)
bound += p * JuMP.objective_value(node.subproblem)
end
return bound
end
function evaluate_policy(
model::PolicyGraph;
node::Int,
incoming_state::Dict{Symbol,Float64},
random_variable,
)
the_node = model.nodes[node]
the_node.uncertainty.parameterize(random_variable)
for (k, v) in incoming_state
JuMP.fix(the_node.states[k].in, v; force = true)
end
JuMP.optimize!(the_node.subproblem)
return Dict(
k => JuMP.value.(v) for
(k, v) in JuMP.object_dictionary(the_node.subproblem)
)
end
evaluate_policy (generic function with 1 method)
First, we need to modify the backward pass to compute the cuts using the risk-averse subgradient theorem:
function backward_pass(
model::PolicyGraph,
trajectory::Vector{Tuple{Int,Dict{Symbol,Float64}}},
io::IO = stdout;
risk_measure::AbstractRiskMeasure,
)
println(io, "| Backward pass")
for i in reverse(1:length(trajectory))
index, outgoing_states = trajectory[i]
node = model.nodes[index]
println(io, "| | Visiting node $(index)")
if length(model.arcs[index]) == 0
continue
end
# =====================================================================
# New! Create vectors to store the cut expressions, V(x,ω) and p:
cut_expressions, V_ω, p = JuMP.AffExpr[], Float64[], Float64[]
# =====================================================================
for (j, P_ij) in model.arcs[index]
next_node = model.nodes[j]
for (k, v) in outgoing_states
JuMP.fix(next_node.states[k].in, v; force = true)
end
for (pφ, φ) in zip(next_node.uncertainty.P, next_node.uncertainty.Ω)
next_node.uncertainty.parameterize(φ)
JuMP.optimize!(next_node.subproblem)
V = JuMP.objective_value(next_node.subproblem)
dVdx = Dict(
k => JuMP.reduced_cost(v.in) for (k, v) in next_node.states
)
# =============================================================
# New! Construct and append the expression
# `V_j^K(x_k, φ) + dVdx_j^K(x'_k, φ)ᵀ(x - x_k)` to the list of
# cut expressions.
push!(
cut_expressions,
JuMP.@expression(
node.subproblem,
V + sum(
dVdx[k] * (x.out - outgoing_states[k]) for
(k, x) in node.states
),
)
)
# Add the objective value to Z:
push!(V_ω, V)
# Add the probability to p:
push!(p, P_ij * pφ)
# =============================================================
end
end
# =====================================================================
# New! Using the solutions in V_ω, compute q and α:
q, α = dual_risk_inner(risk_measure, V_ω, p)
println(io, "| | | Z = ", Z)
println(io, "| | | p = ", p)
println(io, "| | | q = ", q)
println(io, "| | | α = ", α)
# Then add the cut:
c = JuMP.@constraint(
node.subproblem,
node.cost_to_go >=
sum(q[i] * cut_expressions[i] for i in 1:length(q)) - α
)
# =====================================================================
println(io, "| | | Adding cut : ", c)
end
return nothing
end
backward_pass (generic function with 2 methods)
We also need to update the train
loop of SDDP to pass a risk measure to the backward pass:
function train(
model::PolicyGraph;
iteration_limit::Int,
replications::Int,
# =========================================================================
# New! Add a risk_measure argument
risk_measure::AbstractRiskMeasure,
# =========================================================================
io::IO = stdout,
)
for i in 1:iteration_limit
println(io, "Starting iteration $(i)")
outgoing_states, _ = forward_pass(model, io)
# =====================================================================
# New! Pass the risk measure to the backward pass.
backward_pass(model, outgoing_states, io; risk_measure = risk_measure)
# =====================================================================
println(io, "| Finished iteration")
println(io, "| | lower_bound = ", lower_bound(model))
end
μ, tσ = upper_bound(model; replications = replications)
println(io, "Upper bound = $(μ) ± $(tσ)")
return
end
train (generic function with 1 method)
Risk-averse bounds
This section is important.
When we had a risk-neutral policy (i.e., we only used the expectation risk measure), we discussed how we could form valid lower and upper bounds.
The upper bound is still valid as a Monte Carlo simulation of the expected cost of the policy. (Although this upper bound doesn't capture the change in the policy we wanted to achieve, namely that the impact of the worst outcomes were reduced.)
However, if we use a different risk measure, the lower bound is no longer valid!
We can still calculate a "lower bound" as the objective of the first-stage approximated subproblem, and this will converge to a finite value. However, we can't meaningfully interpret it as a bound with respect to the optimal policy. Therefore, it's best to just ignore the lower bound when training a risk-averse policy.
Example: risk-averse hydro-thermal scheduling
Now it's time for an example. We create the same problem as Introductory theory:
model = PolicyGraph(;
graph = [Dict(2 => 1.0), Dict(3 => 1.0), Dict{Int,Float64}()],
lower_bound = 0.0,
optimizer = HiGHS.Optimizer,
) do subproblem, t
JuMP.set_silent(subproblem)
JuMP.@variable(subproblem, volume_in == 200)
JuMP.@variable(subproblem, 0 <= volume_out <= 200)
states = Dict(:volume => State(volume_in, volume_out))
JuMP.@variables(subproblem, begin
thermal_generation >= 0
hydro_generation >= 0
hydro_spill >= 0
inflow
end)
JuMP.@constraints(
subproblem,
begin
volume_out == volume_in + inflow - hydro_generation - hydro_spill
demand_constraint, thermal_generation + hydro_generation == 150.0
end
)
fuel_cost = [50.0, 100.0, 150.0]
JuMP.@objective(subproblem, Min, fuel_cost[t] * thermal_generation)
uncertainty =
Uncertainty([0.0, 50.0, 100.0], [1 / 3, 1 / 3, 1 / 3]) do ω
return JuMP.fix(inflow, ω)
end
return states, uncertainty
end
A policy graph with 3 nodes
Arcs:
1 => 2 w.p. 1.0
2 => 3 w.p. 1.0
Then we train a risk-averse policy, passing a risk measure to train
:
train(
model;
iteration_limit = 3,
replications = 100,
risk_measure = Entropic(1.0),
)
Starting iteration 1
| Backward pass
| | Visiting node 3
| | Visiting node 2
| | | Z = [1.0, 2.0, 3.0, 4.0]
| | | p = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
| | | q = [1.0, 0.0, 0.0]
| | | α = 1.0986122886681098
| | | Adding cut : 150 volume_out + cost_to_go ≥ 22498.901387711332
| | Visiting node 1
| | | Z = [1.0, 2.0, 3.0, 4.0]
| | | p = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
| | | q = [1.0, 0.0, 0.0]
| | | α = 1.0986122886681098
| | | Adding cut : 150 volume_out + cost_to_go ≥ 37497.802775422664
| Finished iteration
| | lower_bound = 12497.802775422664
Starting iteration 2
| Backward pass
| | Visiting node 3
| | Visiting node 2
| | | Z = [1.0, 2.0, 3.0, 4.0]
| | | p = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
| | | q = [0.5999999999999176, 0.2000000000000412, 0.2000000000000412]
| | | α = 0.14834174943478465
| | | Adding cut : 89.99999999998764 volume_out + cost_to_go ≥ 13499.851658248712
| | Visiting node 1
| | | Z = [1.0, 2.0, 3.0, 4.0]
| | | p = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
| | | q = [1.0, 0.0, 0.0]
| | | α = 1.0986122886681098
| | | Adding cut : 100 volume_out + cost_to_go ≥ 29998.594667538695
| Finished iteration
| | lower_bound = 14998.594667538693
Starting iteration 3
| Backward pass
| | Visiting node 3
| | Visiting node 2
| | | Z = [1.0, 2.0, 3.0, 4.0]
| | | p = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
| | | q = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
| | | α = 5.551115123125783e-17
| | | Adding cut : cost_to_go ≥ -5.551115123125783e-17
| | Visiting node 1
| | | Z = [1.0, 2.0, 3.0, 4.0]
| | | p = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
| | | q = [1.0, 0.0, 0.0]
| | | α = 1.0986122886681098
| | | Adding cut : 100 volume_out + cost_to_go ≥ 29998.594667538695
| Finished iteration
| | lower_bound = 14998.594667538693
Upper bound = 9649.35014663117 ± 875.4055832011778
Finally, evaluate the decision rule:
evaluate_policy(
model;
node = 1,
incoming_state = Dict(:volume => 150.0),
random_variable = 75,
)
Dict{Symbol, Float64} with 8 entries:
:volume_out => 200.0
:demand_constraint => 150.0
:hydro_spill => 0.0
:inflow => 75.0
:volume_in => 150.0
:thermal_generation => 125.0
:hydro_generation => 25.0
:cost_to_go => 9998.59
For this trivial example, the risk-averse policy isn't very different from the policy obtained using the expectation risk-measure. If you try it on some bigger/more interesting problems, you should see the expected cost increase, and the upper tail of the policy decrease.