{
"cells": [
{
"cell_type": "markdown",
"source": [
"# Introductory theory"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"!!! note\n",
" This tutorial is aimed at advanced undergraduates or early-stage graduate\n",
" students. You don't need prior exposure to stochastic programming!\n",
" (Indeed, it may be better if you don't, because our approach is\n",
" non-standard in the literature.)\n",
"\n",
" This tutorial is also a living document. If parts are unclear, please\n",
" [open an issue](https://github.com/odow/SDDP.jl/issues/new) so it can be\n",
" improved!"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"This tutorial will teach you how the stochastic dual dynamic programming\n",
"algorithm works by implementing a simplified version of the algorithm."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Our implementation is very much a \"vanilla\" version of SDDP; it doesn't have\n",
"(m)any fancy computational tricks (e.g., the ones included in SDDP.jl) that\n",
"you need to code a performant or stable version that will work on realistic\n",
"instances. However, our simplified implementation will work on arbitrary\n",
"policy graphs, including those with cycles such as infinite horizon problems!"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"**Packages**\n",
"\n",
"This tutorial uses the following packages. For clarity, we call\n",
"`import PackageName` so that we must prefix `PackageName.` to all functions\n",
"and structs provided by that package. Everything not prefixed is either part\n",
"of base Julia, or we wrote it."
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"import ForwardDiff\n",
"import HiGHS\n",
"import JuMP\n",
"import Statistics"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"!!! tip\n",
" You can follow along by installing the above packages, and copy-pasting\n",
" the code we will write into a Julia REPL. Alternatively, you can download\n",
" the Julia `.jl` file which created this tutorial [from GitHub](https://github.com/odow/SDDP.jl/blob/master/docs/src/tutorial/21_theory_intro.jl)."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"## Preliminaries: background theory"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Start this tutorial by reading An introduction to SDDP.jl, which\n",
"introduces the necessary notation and vocabulary that we need for this\n",
"tutorial."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"## Preliminaries: Kelley's cutting plane algorithm"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Kelley's cutting plane algorithm is an iterative method for minimizing convex\n",
"functions. Given a convex function $f(x)$, Kelley's constructs an\n",
"under-approximation of the function at the minimum by a set of first-order\n",
"Taylor series approximations (called **cuts**) constructed at a set of points\n",
"$k = 1,\\ldots,K$:\n",
"$$\n",
"\\begin{aligned}\n",
"f^K = \\min\\limits_{\\theta \\in \\mathbb{R}, x \\in \\mathbb{R}^N} \\;\\; & \\theta\\\\\n",
"& \\theta \\ge f(x_k) + \\frac{d}{dx}f(x_k)^\\top (x - x_k),\\quad k=1,\\ldots,K\\\\\n",
"& \\theta \\ge M,\n",
"\\end{aligned}\n",
"$$\n",
"where $M$ is a sufficiently large negative number that is a lower bound for\n",
"$f$ over the domain of $x$."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Kelley's cutting plane algorithm is a structured way of choosing points $x_k$\n",
"to visit, so that as more cuts are added:\n",
"$$\n",
"\\lim_{K \\rightarrow \\infty} f^K = \\min\\limits_{x \\in \\mathbb{R}^N} f(x)\n",
"$$\n",
"However, before we introduce the algorithm, we need to introduce some bounds."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Bounds"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"By convexity, $f^K \\le f(x)$ for all $x$. Thus, if $x^*$ is a minimizer of\n",
"$f$, then at any point in time we can construct a lower bound for $f(x^*)$ by\n",
"solving $f^K$."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Moreover, we can use the primal solutions $x_k^*$ returned by solving $f^k$ to\n",
"evaluate $f(x_k^*)$ to generate an upper bound."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Therefore, $f^K \\le f(x^*) \\le \\min\\limits_{k=1,\\ldots,K} f(x_k^*)$."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"When the lower bound is sufficiently close to the upper bound, we can\n",
"terminate the algorithm and declare that we have found an solution that is\n",
"close to optimal."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Implementation"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Here is pseudo-code fo the Kelley algorithm:"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"1. Take as input a convex function $f(x)$ and a iteration limit $K_{max}$.\n",
" Set $K = 0$, and initialize $f^K$. Set $lb = -\\infty$ and $ub = \\infty$.\n",
"2. Solve $f^K$ to obtain a candidate solution $x_{K+1}$.\n",
"3. Update $lb = f^K$ and $ub = \\min\\{ub, f(x_{K+1})\\}$.\n",
"4. Add a cut $\\theta \\ge f(x_{K+1}) + \\frac{d}{dx}f\\left(x_{K+1}\\right)^\\top (x - x_{K+1})$ to form $f^{K+1}$.\n",
"5. Increment $K$.\n",
"6. If $K = K_{max}$ or $|ub - lb| < \\epsilon$, STOP, otherwise, go to step 2."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"And here's a complete implementation:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function kelleys_cutting_plane(\n",
" # The function to be minimized.\n",
" f::Function,\n",
" # The gradient of `f`. By default, we use automatic differentiation to\n",
" # compute the gradient of f so the user doesn't have to!\n",
" dfdx::Function = x -> ForwardDiff.gradient(f, x);\n",
" # The number of arguments to `f`.\n",
" input_dimension::Int,\n",
" # A lower bound for the function `f` over its domain.\n",
" lower_bound::Float64,\n",
" # The number of iterations to run Kelley's algorithm for before stopping.\n",
" iteration_limit::Int,\n",
" # The absolute tolerance ϵ to use for convergence.\n",
" tolerance::Float64 = 1e-6,\n",
")\n",
" # Step (1):\n",
" K = 0\n",
" model = JuMP.Model(HiGHS.Optimizer)\n",
" JuMP.set_silent(model)\n",
" JuMP.@variable(model, θ >= lower_bound)\n",
" JuMP.@variable(model, x[1:input_dimension])\n",
" JuMP.@objective(model, Min, θ)\n",
" x_k = fill(NaN, input_dimension)\n",
" lower_bound, upper_bound = -Inf, Inf\n",
" while true\n",
" # Step (2):\n",
" JuMP.optimize!(model)\n",
" x_k .= JuMP.value.(x)\n",
" # Step (3):\n",
" lower_bound = JuMP.objective_value(model)\n",
" upper_bound = min(upper_bound, f(x_k))\n",
" println(\"K = $K : $(lower_bound) <= f(x*) <= $(upper_bound)\")\n",
" # Step (4):\n",
" JuMP.@constraint(model, θ >= f(x_k) + dfdx(x_k)' * (x .- x_k))\n",
" # Step (5):\n",
" K = K + 1\n",
" # Step (6):\n",
" if K == iteration_limit\n",
" println(\"-- Termination status: iteration limit --\")\n",
" break\n",
" elseif abs(upper_bound - lower_bound) < tolerance\n",
" println(\"-- Termination status: converged --\")\n",
" break\n",
" end\n",
" end\n",
" println(\"Found solution: x_K = \", x_k)\n",
" return\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"Let's run our algorithm to see what happens:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"kelleys_cutting_plane(;\n",
" input_dimension = 2,\n",
" lower_bound = 0.0,\n",
" iteration_limit = 20,\n",
") do x\n",
" return (x[1] - 1)^2 + (x[2] + 2)^2 + 1.0\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"!!! warning\n",
" It's hard to choose a valid lower bound! If you choose one too loose, the\n",
" algorithm can take a long time to converge. However, if you choose one so\n",
" tight that $M > f(x^*)$, then you can obtain a suboptimal solution. For a\n",
" deeper discussion of the implications for SDDP.jl, see\n",
" Choosing an initial bound."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"## Preliminaries: approximating the cost-to-go term"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"In the background theory section, we discussed how you could formulate an\n",
"optimal policy to a multistage stochastic program using the dynamic\n",
"programming recursion:\n",
"$$\n",
"\\begin{aligned}\n",
"V_i(x, \\omega) = \\min\\limits_{\\bar{x}, x^\\prime, u} \\;\\; & C_i(\\bar{x}, u, \\omega) + \\mathbb{E}_{j \\in i^+, \\varphi \\in \\Omega_j}[V_j(x^\\prime, \\varphi)]\\\\\n",
"& x^\\prime = T_i(\\bar{x}, u, \\omega) \\\\\n",
"& u \\in U_i(\\bar{x}, \\omega) \\\\\n",
"& \\bar{x} = x,\n",
"\\end{aligned}\n",
"$$\n",
"where our decision rule, $\\pi_i(x, \\omega)$, solves this optimization problem\n",
"and returns a $u^*$ corresponding to an optimal solution. Moreover, we alluded\n",
"to the fact that the cost-to-go term (the nasty recursive expectation) makes\n",
"this problem intractable to solve."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"However, if, excluding the cost-to-go term (i.e., the `SP` formulation),\n",
"$V_i(x, \\omega)$ can be formulated as a linear program (this also works for\n",
"convex programs, but the math is more involved), then we can make some\n",
"progress by noticing that $x$ only appears as a right-hand side term of the\n",
"fishing constraint $\\bar{x} = x$. Therefore, $V_i(x, \\cdot)$ is convex with\n",
"respect to $x$ for fixed $\\omega$. (If you have not seen this result before,\n",
"try to prove it.)"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"The fishing constraint $\\bar{x} = x$ has an associated dual variable. The\n",
"economic interpretation of this dual variable is that it represents the change\n",
"in the objective function if the right-hand side $x$ is increased on the scale\n",
"of one unit. In other words, and with a slight abuse of notation, it is the\n",
"value $\\frac{d}{dx} V_i(x, \\omega)$. (Because $V_i$ is not differentiable, it\n",
"is a [subgradient](https://en.wikipedia.org/wiki/Subderivative) instead of a\n",
"derivative.)"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"If we implement the constraint $\\bar{x} = x$ by setting the lower- and upper\n",
"bounds of $\\bar{x}$ to $x$, then the [reduced cost](https://en.wikipedia.org/wiki/Reduced_cost)\n",
"of the decision variable $\\bar{x}$ is the subgradient, and we do not need to\n",
"explicitly add the fishing constraint as a row to the constraint matrix."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"!!! tip\n",
" The subproblem can have binary and integer variables, but you'll need to\n",
" use Lagrangian duality to compute a subgradient!"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Stochastic dual dynamic programming converts this problem into a tractable\n",
"form by applying Kelley's cutting plane algorithm to the $V_j$ functions in\n",
"the cost-to-go term:\n",
"$$\n",
"\\begin{aligned}\n",
"V_i^K(x, \\omega) = \\min\\limits_{\\bar{x}, x^\\prime, u} \\;\\; & C_i(\\bar{x}, u, \\omega) + \\theta\\\\\n",
"& x^\\prime = T_i(\\bar{x}, u, \\omega) \\\\\n",
"& u \\in U_i(\\bar{x}, \\omega) \\\\\n",
"& \\bar{x} = x \\\\\n",
"& \\theta \\ge \\mathbb{E}_{j \\in i^+, \\varphi \\in \\Omega_j}\\left[V_j^k(x^\\prime_k, \\varphi) + \\frac{d}{dx^\\prime}V_j^k(x^\\prime_k, \\varphi)^\\top (x^\\prime - x^\\prime_k)\\right],\\quad k=1,\\ldots,K \\\\\n",
"& \\theta \\ge M.\n",
"\\end{aligned}\n",
"$$"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"All we need now is a way of generating these cutting planes in an iterative\n",
"manner. Before we get to that though, let's start writing some code."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"## Implementation: modeling"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Let's make a start by defining the problem structure. Like SDDP.jl, we need a\n",
"few things:\n",
"\n",
"1. A description of the structure of the policy graph: how many nodes there\n",
" are, and the arcs linking the nodes together with their corresponding\n",
" probabilities.\n",
"2. A JuMP model for each node in the policy graph.\n",
"3. A way to identify the incoming and outgoing state variables of each node.\n",
"4. A description of the random variable, as well as a function that we can\n",
" call that will modify the JuMP model to reflect the realization of the\n",
" random variable.\n",
"5. A decision variable to act as the approximated cost-to-go term."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"!!! warning\n",
" In the interests of brevity, there is minimal error checking. Think about\n",
" all the different ways you could break the code!"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Structs"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"The first struct we are going to use is a `State` struct that will wrap an\n",
"incoming and outgoing state variable:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"struct State\n",
" in::JuMP.VariableRef\n",
" out::JuMP.VariableRef\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"Next, we need a struct to wrap all of the uncertainty within a node:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"struct Uncertainty\n",
" parameterize::Function\n",
" Ω::Vector{Any}\n",
" P::Vector{Float64}\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"`parameterize` is a function which takes a realization of the random variable\n",
"$\\omega\\in\\Omega$ and updates the subproblem accordingly. The finite discrete\n",
"random variable is defined by the vectors `Ω` and `P`, so that the random\n",
"variable takes the value `Ω[i]` with probability `P[i]`. As such, `P` should\n",
"sum to 1. (We don't check this here, but we should; we do in SDDP.jl.)"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Now we have two building blocks, we can declare the structure of each node:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"struct Node\n",
" subproblem::JuMP.Model\n",
" states::Dict{Symbol,State}\n",
" uncertainty::Uncertainty\n",
" cost_to_go::JuMP.VariableRef\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"* `subproblem` is going to be the JuMP model that we build at each node.\n",
"* `states` is a dictionary that maps a symbolic name of a state variable to a\n",
" `State` object wrapping the incoming and outgoing state variables in\n",
" `subproblem`.\n",
"* `uncertainty` is an `Uncertainty` object described above.\n",
"* `cost_to_go` is a JuMP variable that approximates the cost-to-go term."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Finally, we define a simplified policy graph as follows:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"struct PolicyGraph\n",
" nodes::Vector{Node}\n",
" arcs::Vector{Dict{Int,Float64}}\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"There is a vector of nodes, as well as a data structure for the arcs. `arcs`\n",
"is a vector of dictionaries, where `arcs[i][j]` gives the probability of\n",
"transitioning from node `i` to node `j`, if an arc exists."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"To simplify things, we will assume that the root node transitions to node `1`\n",
"with probability 1, and there are no other incoming arcs to node 1. Notably,\n",
"we can still define cyclic graphs though!"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"We also define a nice `show` method so that we don't accidentally print a\n",
"large amount of information to the screen when creating a model:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function Base.show(io::IO, model::PolicyGraph)\n",
" println(io, \"A policy graph with $(length(model.nodes)) nodes\")\n",
" println(io, \"Arcs:\")\n",
" for (from, arcs) in enumerate(model.arcs)\n",
" for (to, probability) in arcs\n",
" println(io, \" $(from) => $(to) w.p. $(probability)\")\n",
" end\n",
" end\n",
" return\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"### Functions"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Now we have some basic types, let's implement some functions so that the user\n",
"can create a model."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"First, we need an example of a function that the user will provide. Like\n",
"SDDP.jl, this takes an empty `subproblem`, and a node index, in this case\n",
"`t::Int`. You could change this function to change the model, or define a new\n",
"one later in the code."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"We're going to copy the example from An introduction to SDDP.jl,\n",
"with some minor adjustments for the fact we don't have many of the bells and\n",
"whistles of SDDP.jl. You can probably see how some of the SDDP.jl\n",
"functionality like `@stageobjective` and `SDDP.parameterize`\n",
"help smooth some of the usability issues like needing to construct both the\n",
"incoming and outgoing state variables, or needing to explicitly declare\n",
"`return states, uncertainty`."
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function subproblem_builder(subproblem::JuMP.Model, t::Int)\n",
" # Define the state variables. Note how we fix the incoming state to the\n",
" # initial state variable regardless of `t`! This isn't strictly necessary;\n",
" # it only matters that we do it for the first node.\n",
" JuMP.@variable(subproblem, volume_in == 200)\n",
" JuMP.@variable(subproblem, 0 <= volume_out <= 200)\n",
" states = Dict(:volume => State(volume_in, volume_out))\n",
" # Define the control variables.\n",
" JuMP.@variables(subproblem, begin\n",
" thermal_generation >= 0\n",
" hydro_generation >= 0\n",
" hydro_spill >= 0\n",
" inflow\n",
" end)\n",
" # Define the constraints\n",
" JuMP.@constraints(\n",
" subproblem,\n",
" begin\n",
" volume_out == volume_in + inflow - hydro_generation - hydro_spill\n",
" demand_constraint, thermal_generation + hydro_generation == 150.0\n",
" end\n",
" )\n",
" # Define the objective for each stage `t`. Note that we can use `t` as an\n",
" # index for t = 1, 2, 3.\n",
" fuel_cost = [50.0, 100.0, 150.0]\n",
" JuMP.@objective(subproblem, Min, fuel_cost[t] * thermal_generation)\n",
" # Finally, we define the uncertainty object. Because this is a simplified\n",
" # implementation of SDDP, we shall politely ask the user to only modify the\n",
" # constraints, and not the objective function! (Not that it changes the\n",
" # algorithm, we just have to add more information to keep track of things.)\n",
" uncertainty = Uncertainty([0.0, 50.0, 100.0], [1 / 3, 1 / 3, 1 / 3]) do ω\n",
" return JuMP.fix(inflow, ω)\n",
" end\n",
" return states, uncertainty\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"The next function we need to define is the analog of\n",
"`SDDP.PolicyGraph`. It should be pretty readable."
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function PolicyGraph(\n",
" subproblem_builder::Function;\n",
" graph::Vector{Dict{Int,Float64}},\n",
" lower_bound::Float64,\n",
" optimizer,\n",
")\n",
" nodes = Node[]\n",
" for t in 1:length(graph)\n",
" # Create a model.\n",
" model = JuMP.Model(optimizer)\n",
" JuMP.set_silent(model)\n",
" # Use the provided function to build out each subproblem. The user's\n",
" # function returns a dictionary mapping `Symbol`s to `State` objects,\n",
" # and an `Uncertainty` object.\n",
" states, uncertainty = subproblem_builder(model, t)\n",
" # Now add the cost-to-go terms:\n",
" JuMP.@variable(model, cost_to_go >= lower_bound)\n",
" obj = JuMP.objective_function(model)\n",
" JuMP.@objective(model, Min, obj + cost_to_go)\n",
" # If there are no outgoing arcs, the cost-to-go is 0.0.\n",
" if length(graph[t]) == 0\n",
" JuMP.fix(cost_to_go, 0.0; force = true)\n",
" end\n",
" push!(nodes, Node(model, states, uncertainty, cost_to_go))\n",
" end\n",
" return PolicyGraph(nodes, graph)\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"Then, we can create a model using the `subproblem_builder` function we defined\n",
"earlier:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"model = PolicyGraph(\n",
" subproblem_builder;\n",
" graph = [Dict(2 => 1.0), Dict(3 => 1.0), Dict{Int,Float64}()],\n",
" lower_bound = 0.0,\n",
" optimizer = HiGHS.Optimizer,\n",
")"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"## Implementation: helpful samplers"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Before we get properly coding the solution algorithm, it's also going to be\n",
"useful to have a function that samples a realization of the random variable\n",
"defined by `Ω` and `P`."
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function sample_uncertainty(uncertainty::Uncertainty)\n",
" r = rand()\n",
" for (p, ω) in zip(uncertainty.P, uncertainty.Ω)\n",
" r -= p\n",
" if r < 0.0\n",
" return ω\n",
" end\n",
" end\n",
" return error(\"We should never get here because P should sum to 1.0.\")\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"!!! note\n",
" `rand()` samples a uniform random variable in `[0, 1)`."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"For example:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"for i in 1:3\n",
" println(\"ω = \", sample_uncertainty(model.nodes[1].uncertainty))\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"It's also going to be useful to define a function that generates a random walk\n",
"through the nodes of the graph:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function sample_next_node(model::PolicyGraph, current::Int)\n",
" if length(model.arcs[current]) == 0\n",
" # No outgoing arcs!\n",
" return nothing\n",
" else\n",
" r = rand()\n",
" for (to, probability) in model.arcs[current]\n",
" r -= probability\n",
" if r < 0.0\n",
" return to\n",
" end\n",
" end\n",
" # We looped through the outgoing arcs and still have probability left\n",
" # over! This means we've hit an implicit \"zero\" node.\n",
" return nothing\n",
" end\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"For example:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"for i in 1:3\n",
" # We use `repr` to print the next node, because `sample_next_node` can\n",
" # return `nothing`.\n",
" println(\"Next node from $(i) = \", repr(sample_next_node(model, i)))\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"This is a little boring, because our graph is simple. However, more\n",
"complicated graphs will generate more interesting trajectories!"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"## Implementation: the forward pass"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Recall that, after approximating the cost-to-go term, we need a way of\n",
"generating the cuts. As the first step, we need a way of generating candidate\n",
"solutions $x_k^\\prime$. However, unlike the Kelley's example, our functions\n",
"$V_j^k(x^\\prime, \\varphi)$ need two inputs: an outgoing state variable and a\n",
"realization of the random variable."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"One way of getting these inputs is just to pick a random (feasible) value.\n",
"However, in doing so, we might pick outgoing state variables that we will\n",
"never see in practice, or we might infrequently pick outgoing state variables\n",
"that we will often see in practice. Therefore, a better way of generating the\n",
"inputs is to use a simulation of the policy, which we call the **forward**\n",
"**pass**."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"The forward pass walks the policy graph from start to end, transitioning\n",
"randomly along the arcs. At each node, it observes a realization of the random\n",
"variable and solves the approximated subproblem to generate a candidate\n",
"outgoing state variable $x_k^\\prime$. The outgoing state variable is passed as\n",
"the incoming state variable to the next node in the trajectory."
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function forward_pass(model::PolicyGraph, io::IO = stdout)\n",
" println(io, \"| Forward Pass\")\n",
" # First, get the value of the state at the root node (e.g., x_R).\n",
" incoming_state =\n",
" Dict(k => JuMP.fix_value(v.in) for (k, v) in model.nodes[1].states)\n",
" # `simulation_cost` is an accumlator that is going to sum the stage-costs\n",
" # incurred over the forward pass.\n",
" simulation_cost = 0.0\n",
" # We also need to record the nodes visited and resultant outgoing state\n",
" # variables so we can pass them to the backward pass.\n",
" trajectory = Tuple{Int,Dict{Symbol,Float64}}[]\n",
" # Now's the meat of the forward pass: beginning at the first node:\n",
" t = 1\n",
" while t !== nothing\n",
" node = model.nodes[t]\n",
" println(io, \"| | Visiting node $(t)\")\n",
" # Sample the uncertainty:\n",
" ω = sample_uncertainty(node.uncertainty)\n",
" println(io, \"| | | ω = \", ω)\n",
" # Parameterizing the subproblem using the user-provided function:\n",
" node.uncertainty.parameterize(ω)\n",
" println(io, \"| | | x = \", incoming_state)\n",
" # Update the incoming state variable:\n",
" for (k, v) in incoming_state\n",
" JuMP.fix(node.states[k].in, v; force = true)\n",
" end\n",
" # Now solve the subproblem and check we found an optimal solution:\n",
" JuMP.optimize!(node.subproblem)\n",
" if JuMP.termination_status(node.subproblem) != JuMP.MOI.OPTIMAL\n",
" error(\"Something went terribly wrong!\")\n",
" end\n",
" # Compute the outgoing state variables:\n",
" outgoing_state = Dict(k => JuMP.value(v.out) for (k, v) in node.states)\n",
" println(io, \"| | | x′ = \", outgoing_state)\n",
" # We also need to compute the stage cost to add to our\n",
" # `simulation_cost` accumulator:\n",
" stage_cost =\n",
" JuMP.objective_value(node.subproblem) - JuMP.value(node.cost_to_go)\n",
" simulation_cost += stage_cost\n",
" println(io, \"| | | C(x, u, ω) = \", stage_cost)\n",
" # As a penultimate step, set the outgoing state of stage t and the\n",
" # incoming state of stage t + 1, and add the node to the trajectory.\n",
" incoming_state = outgoing_state\n",
" push!(trajectory, (t, outgoing_state))\n",
" # Finally, sample a new node to step to. If `t === nothing`, the\n",
" # `while` loop will break.\n",
" t = sample_next_node(model, t)\n",
" end\n",
" return trajectory, simulation_cost\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"Let's take a look at one forward pass:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"trajectory, simulation_cost = forward_pass(model);"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"## Implementation: the backward pass"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"From the forward pass, we obtained a vector of nodes visited and their\n",
"corresponding outgoing state variables. Now we need to refine the\n",
"approximation for each node at the candidate solution for the outgoing state\n",
"variable. That is, we need to add a new cut:\n",
"$$\n",
"\\theta \\ge \\mathbb{E}_{j \\in i^+, \\varphi \\in \\Omega_j}\\left[V_j^k(x^\\prime_k, \\varphi) + \\frac{d}{dx^\\prime}V_j^k(x^\\prime_k, \\varphi)^\\top (x^\\prime - x^\\prime_k)\\right]\n",
"$$\n",
"or alternatively:\n",
"$$\n",
"\\theta \\ge \\sum\\limits_{j \\in i^+} \\sum\\limits_{\\varphi \\in \\Omega_j} p_{ij} p_{\\varphi}\\left[V_j^k(x^\\prime_k, \\varphi) + \\frac{d}{dx^\\prime}V_j^k(x^\\prime_k, \\varphi)^\\top (x^\\prime - x^\\prime_k)\\right]\n",
"$$"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"It doesn't matter what order we visit the nodes to generate these cuts for.\n",
"For example, we could compute them all in parallel, using the current\n",
"approximations of $V^K_i$."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"However, we can be smarter than that."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"If we traverse the list of nodes visited in the forward pass in reverse, then\n",
"we come to refine the $i^{th}$ node in the trajectory, we will already have\n",
"improved the approximation of the $(i+1)^{th}$ node in the trajectory as well!\n",
"Therefore, our refinement of the $i^{th}$ node will be better than if we\n",
"improved node $i$ first, and then refined node $(i+1)$."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Because we walk the nodes in reverse, we call this the **backward pass**."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"!!! info\n",
" If you're into deep learning, you could view this as the equivalent of\n",
" back-propagation: the forward pass pushes primal information through the\n",
" graph (outgoing state variables), and the backward pass pulls dual\n",
" information (cuts) back through the graph to improve our decisions on the\n",
" next forward pass."
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function backward_pass(\n",
" model::PolicyGraph,\n",
" trajectory::Vector{Tuple{Int,Dict{Symbol,Float64}}},\n",
" io::IO = stdout,\n",
")\n",
" println(io, \"| Backward pass\")\n",
" # For the backward pass, we walk back up the nodes.\n",
" for i in reverse(1:length(trajectory))\n",
" index, outgoing_states = trajectory[i]\n",
" node = model.nodes[index]\n",
" println(io, \"| | Visiting node $(index)\")\n",
" if length(model.arcs[index]) == 0\n",
" # If there are no children, the cost-to-go is 0.\n",
" println(io, \"| | | Skipping node because the cost-to-go is 0\")\n",
" continue\n",
" end\n",
" # Create an empty affine expression that we will use to build up the\n",
" # right-hand side of the cut expression.\n",
" cut_expression = JuMP.AffExpr(0.0)\n",
" # For each node j ∈ i⁺\n",
" for (j, P_ij) in model.arcs[index]\n",
" next_node = model.nodes[j]\n",
" # Set the incoming state variables of node j to the outgoing state\n",
" # variables of node i\n",
" for (k, v) in outgoing_states\n",
" JuMP.fix(next_node.states[k].in, v; force = true)\n",
" end\n",
" # Then for each realization of φ ∈ Ωⱼ\n",
" for (pφ, φ) in zip(next_node.uncertainty.P, next_node.uncertainty.Ω)\n",
" # Setup and solve for the realization of φ\n",
" println(io, \"| | | Solving φ = \", φ)\n",
" next_node.uncertainty.parameterize(φ)\n",
" JuMP.optimize!(next_node.subproblem)\n",
" # Then prepare the cut `P_ij * pφ * [V + dVdxᵀ(x - x_k)]``\n",
" V = JuMP.objective_value(next_node.subproblem)\n",
" println(io, \"| | | | V = \", V)\n",
" dVdx = Dict(\n",
" k => JuMP.reduced_cost(v.in) for (k, v) in next_node.states\n",
" )\n",
" println(io, \"| | | | dVdx′ = \", dVdx)\n",
" cut_expression += JuMP.@expression(\n",
" node.subproblem,\n",
" P_ij *\n",
" pφ *\n",
" (\n",
" V + sum(\n",
" dVdx[k] * (x.out - outgoing_states[k]) for\n",
" (k, x) in node.states\n",
" )\n",
" ),\n",
" )\n",
" end\n",
" end\n",
" # And then refine the cost-to-go variable by adding the cut:\n",
" c = JuMP.@constraint(node.subproblem, node.cost_to_go >= cut_expression)\n",
" println(io, \"| | | Adding cut : \", c)\n",
" end\n",
" return nothing\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"## Implementation: bounds"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Lower bounds"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Recall from Kelley's that we can obtain a lower bound for $f(x^*)$ be\n",
"evaluating $f^K$. The analogous lower bound for a multistage stochastic\n",
"program is:"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"$$\n",
"\\mathbb{E}_{i \\in R^+, \\omega \\in \\Omega_i}[V_i^K(x_R, \\omega)] \\le \\min_{\\pi} \\mathbb{E}_{i \\in R^+, \\omega \\in \\Omega_i}[V_i^\\pi(x_R, \\omega)]\n",
"$$"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Here's how we compute the lower bound:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function lower_bound(model::PolicyGraph)\n",
" node = model.nodes[1]\n",
" bound = 0.0\n",
" for (p, ω) in zip(node.uncertainty.P, node.uncertainty.Ω)\n",
" node.uncertainty.parameterize(ω)\n",
" JuMP.optimize!(node.subproblem)\n",
" bound += p * JuMP.objective_value(node.subproblem)\n",
" end\n",
" return bound\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"!!! note\n",
" The implementation is simplified because we assumed that there is only one\n",
" arc from the root node, and that it pointed to the first node in the\n",
" vector."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Because we haven't trained a policy yet, the lower bound is going to be very\n",
"bad:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"lower_bound(model)"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"### Upper bounds"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"With Kelley's algorithm, we could easily construct an upper bound by\n",
"evaluating $f(x_K)$. However, it is almost always intractable to evaluate an\n",
"upper bound for multistage stochastic programs due to the large number of\n",
"nodes and the nested expectations. Instead, we can perform a Monte Carlo\n",
"simulation of the policy to build a statistical estimate for the value of\n",
"$\\mathbb{E}_{i \\in R^+, \\omega \\in \\Omega_i}[V_i^\\pi(x_R, \\omega)]$, where\n",
"$\\pi$ is the policy defined by the current approximations $V^K_i$."
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function upper_bound(model::PolicyGraph; replications::Int)\n",
" # Pipe the output to `devnull` so we don't print too much!\n",
" simulations = [forward_pass(model, devnull) for i in 1:replications]\n",
" z = [s[2] for s in simulations]\n",
" μ = Statistics.mean(z)\n",
" tσ = 1.96 * Statistics.std(z) / sqrt(replications)\n",
" return μ, tσ\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"!!! note\n",
" The width of the confidence interval is incorrect if there are cycles in\n",
" the graph, because the distribution of simulation costs `z` is not\n",
" symmetric. The mean is correct, however."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Termination criteria"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"In Kelley's algorithm, the upper bound was deterministic. Therefore, we could\n",
"terminate the algorithm when the lower bound was sufficiently close to the\n",
"upper bound. However, our upper bound for SDDP is not deterministic; it is a\n",
"confidence interval!"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Some people suggest terminating SDDP when the lower bound is contained within\n",
"the confidence interval. However, this is a poor choice because it is too easy\n",
"to generate a false positive. For example, if we use a small number of\n",
"replications then the width of the confidence will be large, and we are more\n",
"likely to terminate!"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"In a future tutorial (not yet written...) we will discuss termination criteria\n",
"in more depth. For now, pick a large number of iterations and train for as\n",
"long as possible."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"!!! tip\n",
" For a rule of thumb, pick a large number of iterations to train the\n",
" policy for (e.g.,\n",
" $10 \\times |\\mathcal{N}| \\times \\max\\limits_{i\\in\\mathcal{N}} |\\Omega_i|$)"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"## Implementation: the training loop"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"The `train` loop of SDDP just applies the forward and backward passes\n",
"iteratively, followed by a final simulation to compute the upper bound\n",
"confidence interval:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function train(\n",
" model::PolicyGraph;\n",
" iteration_limit::Int,\n",
" replications::Int,\n",
" io::IO = stdout,\n",
")\n",
" for i in 1:iteration_limit\n",
" println(io, \"Starting iteration $(i)\")\n",
" outgoing_states, _ = forward_pass(model, io)\n",
" backward_pass(model, outgoing_states, io)\n",
" println(io, \"| Finished iteration\")\n",
" println(io, \"| | lower_bound = \", lower_bound(model))\n",
" end\n",
" println(io, \"Termination status: iteration limit\")\n",
" μ, tσ = upper_bound(model; replications = replications)\n",
" println(io, \"Upper bound = $(μ) ± $(tσ)\")\n",
" return\n",
"end"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"Using our `model` we defined earlier, we can go:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"train(model; iteration_limit = 3, replications = 100)"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"Success! We trained a policy for a finite horizon multistage stochastic\n",
"program using stochastic dual dynamic programming."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"## Implementation: evaluating the policy"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"A final step is the ability to evaluate the policy at a given point."
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"function evaluate_policy(\n",
" model::PolicyGraph;\n",
" node::Int,\n",
" incoming_state::Dict{Symbol,Float64},\n",
" random_variable,\n",
")\n",
" the_node = model.nodes[node]\n",
" the_node.uncertainty.parameterize(random_variable)\n",
" for (k, v) in incoming_state\n",
" JuMP.fix(the_node.states[k].in, v; force = true)\n",
" end\n",
" JuMP.optimize!(the_node.subproblem)\n",
" return Dict(\n",
" k => JuMP.value.(v) for\n",
" (k, v) in JuMP.object_dictionary(the_node.subproblem)\n",
" )\n",
"end\n",
"\n",
"evaluate_policy(\n",
" model;\n",
" node = 1,\n",
" incoming_state = Dict(:volume => 150.0),\n",
" random_variable = 75,\n",
")"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"!!! note\n",
" The random variable can be **out-of-sample**, i.e., it doesn't have to be\n",
" in the vector $\\Omega$ we created when defining the model! This is a\n",
" notable difference to other multistage stochastic solution methods like\n",
" progressive hedging or using the deterministic equivalent."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"## Example: infinite horizon"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"As promised earlier, our implementation is actually pretty general. It can\n",
"solve any multistage stochastic (linear) program defined by a policy graph,\n",
"including infinite horizon problems!"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"Here's an example, where we have extended our earlier problem with an arc from\n",
"node 3 to node 2 with probability 0.5. You can interpret the 0.5 as a discount\n",
"factor."
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"model = PolicyGraph(\n",
" subproblem_builder;\n",
" graph = [Dict(2 => 1.0), Dict(3 => 1.0), Dict(2 => 0.5)],\n",
" lower_bound = 0.0,\n",
" optimizer = HiGHS.Optimizer,\n",
")"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"Then, train a policy:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"train(model; iteration_limit = 3, replications = 100)"
],
"metadata": {},
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"Success! We trained a policy for an infinite horizon multistage stochastic\n",
"program using stochastic dual dynamic programming. Note how some of the\n",
"forward passes are different lengths!"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"evaluate_policy(\n",
" model;\n",
" node = 3,\n",
" incoming_state = Dict(:volume => 100.0),\n",
" random_variable = 10.0,\n",
")"
],
"metadata": {},
"execution_count": null
}
],
"nbformat_minor": 3,
"metadata": {
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.10.4"
},
"kernelspec": {
"name": "julia-1.10",
"display_name": "Julia 1.10.4",
"language": "julia"
}
},
"nbformat": 4
}