Integrality

There's nothing special about binary and integer variables in SDDP.jl. Your models may contain a mix of binary, integer, and continuous state and control variables. Use the standard JuMP syntax to add binary and integer variables.

For example:

using SDDP, HiGHS
model = SDDP.LinearPolicyGraph(
    stages = 3,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, t
    @variable(sp, 0 <= x <= 100, Int, SDDP.State, initial_value = 0)
    @variable(sp, 0 <= u <= 200, integer = true)
    @variable(sp, v >= 0)
    @constraint(sp, x.out == x.in + u + v - 150)
    @stageobjective(sp, 2u + 6v + x.out)
end

A policy graph with 3 nodes.
 Node indices: 1, 2, 3
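
The same syntax covers binary variables. As a minimal sketch, the variant below adds a hypothetical binary control `z` (it is not part of the model above) that switches ordering on and off; the fixed cost of 50 per order is an assumed value chosen only for illustration:

```julia
using SDDP, HiGHS

model = SDDP.LinearPolicyGraph(
    stages = 3,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, t
    @variable(sp, 0 <= x <= 100, Int, SDDP.State, initial_value = 0)
    @variable(sp, 0 <= u <= 200, Int)
    @variable(sp, v >= 0)
    # Hypothetical binary control: z = 1 if an order is placed in this stage.
    @variable(sp, z, Bin)
    # u can be positive only when z = 1.
    @constraint(sp, u <= 200 * z)
    @constraint(sp, x.out == x.in + u + v - 150)
    # Assumed fixed ordering cost of 50, added to the original stage objective.
    @stageobjective(sp, 2u + 6v + x.out + 50z)
end
```

`Bin` and `Int` are the standard JuMP keywords, so nothing SDDP-specific is needed to declare `z`.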

Specifying a duality handler

If you want finer control over how SDDP.jl computes subgradients in the backward pass, you can pass an SDDP.AbstractDualityHandler to the duality_handler argument of SDDP.train.

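The duality handlers implemented in SDDP.jl are:

SDDP.ContinuousConicDuality (the default)
SDDP.LagrangianDuality
SDDP.StrengthenedConicDuality
SDDP.BanditDuality

Here is an example that uses SDDP.BanditDuality: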

using SDDP, HiGHS
model = SDDP.LinearPolicyGraph(
    stages = 3,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, t
    @variable(sp, 0 <= x <= 100, Int, SDDP.State, initial_value = 0)
    @variable(sp, 0 <= u <= 200, integer = true)
    @variable(sp, v >= 0)
    @constraint(sp, x.out == x.in + u + v - 150)
    @stageobjective(sp, 2u + 6v + x.out)
end
SDDP.train(
    model;
    duality_handler = SDDP.BanditDuality(),
    log_every_iteration = true,
)

-------------------------------------------------------------------
         SDDP.jl (c) Oscar Dowson and contributors, 2017-25
-------------------------------------------------------------------
problem
  nodes           : 3
  state variables : 1
  scenarios       : 1.00000e+00
  existing cuts   : false
options
  solver          : serial mode
  risk measure    : SDDP.Expectation()
  sampling scheme : SDDP.InSampleMonteCarlo
subproblem structure
  VariableRef                             : [5, 5]
  AffExpr in MOI.EqualTo{Float64}         : [1, 1]
  VariableRef in MOI.GreaterThan{Float64} : [4, 4]
  VariableRef in MOI.Integer              : [2, 2]
  VariableRef in MOI.LessThan{Float64}    : [2, 3]
numerical stability report
  matrix range     [1e+00, 1e+00]
  objective range  [1e+00, 6e+00]
  bounds range     [1e+02, 2e+02]
  rhs range        [2e+02, 2e+02]
-------------------------------------------------------------------
 iteration    simulation      bound        time (s)     solves  pid
-------------------------------------------------------------------
         1   9.000000e+02  9.000000e+02  3.737099e-01         6   1
         2   9.000000e+02  9.000000e+02  4.353840e-01        15   1
         3S  9.000000e+02  9.000000e+02  5.683281e-01        21   1
         4   9.000000e+02  9.000000e+02  6.048889e-01        27   1
         5S  9.000000e+02  9.000000e+02  6.527359e-01        33   1
         6   9.000000e+02  9.000000e+02  6.852419e-01        39   1
         7S  9.000000e+02  9.000000e+02  7.340109e-01        45   1
         8   9.000000e+02  9.000000e+02  7.663419e-01        51   1
         9S  9.000000e+02  9.000000e+02  8.144901e-01        57   1
        10   9.000000e+02  9.000000e+02  8.473580e-01        63   1
        11S  9.000000e+02  9.000000e+02  8.952351e-01        69   1
        12   9.000000e+02  9.000000e+02  9.275680e-01        75   1
        13S  9.000000e+02  9.000000e+02  9.760029e-01        81   1
        14   9.000000e+02  9.000000e+02  1.009063e+00        87   1
        15S  9.000000e+02  9.000000e+02  1.058713e+00        93   1
        16   9.000000e+02  9.000000e+02  1.091831e+00        99   1
        17S  9.000000e+02  9.000000e+02  1.140489e+00       105   1
        18   9.000000e+02  9.000000e+02  1.172885e+00       111   1
        19S  9.000000e+02  9.000000e+02  1.221605e+00       117   1
        20   9.000000e+02  9.000000e+02  1.253963e+00       123   1
-------------------------------------------------------------------
status         : simulation_stopping
total time (s) : 1.253963e+00
total solves   : 123
best bound     :  9.000000e+02
simulation ci  :  9.000000e+02 ± 0.000000e+00
numeric issues : 0
-------------------------------------------------------------------

Using a different optimizer to compute duals

If your default optimizer cannot compute dual variables (for example, because you are using Gurobi to solve a mixed-integer nonlinear program), you can pass a second optimizer to the duality handler as a positional argument. SDDP.jl uses this optimizer to solve the subproblem once integrality has been relaxed. For example:

using SDDP, HiGHS, Ipopt
model = SDDP.LinearPolicyGraph(
    stages = 3,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, t
    @variable(sp, 0 <= x <= 100, Int, SDDP.State, initial_value = 0)
    @variable(sp, 0 <= u <= 200, integer = true)
    @variable(sp, v >= 0)
    @constraint(sp, x.out == x.in + u + v - 150)
    @stageobjective(sp, 2u + 6v + x.out)
end
SDDP.train(
    model;
    duality_handler = SDDP.ContinuousConicDuality(Ipopt.Optimizer),
)

-------------------------------------------------------------------
         SDDP.jl (c) Oscar Dowson and contributors, 2017-25
-------------------------------------------------------------------
problem
  nodes           : 3
  state variables : 1
  scenarios       : 1.00000e+00
  existing cuts   : false
options
  solver          : serial mode
  risk measure    : SDDP.Expectation()
  sampling scheme : SDDP.InSampleMonteCarlo
subproblem structure
  VariableRef                             : [5, 5]
  AffExpr in MOI.EqualTo{Float64}         : [1, 1]
  VariableRef in MOI.GreaterThan{Float64} : [4, 4]
  VariableRef in MOI.Integer              : [2, 2]
  VariableRef in MOI.LessThan{Float64}    : [2, 3]
numerical stability report
  matrix range     [1e+00, 1e+00]
  objective range  [1e+00, 6e+00]
  bounds range     [1e+02, 2e+02]
  rhs range        [2e+02, 2e+02]
-------------------------------------------------------------------
 iteration    simulation      bound        time (s)     solves  pid
-------------------------------------------------------------------
         1   9.000000e+02  9.000000e+02  4.302301e-01         6   1
        20   9.000000e+02  9.000000e+02  1.251437e+00       123   1
-------------------------------------------------------------------
status         : simulation_stopping
total time (s) : 1.251437e+00
total solves   : 123
best bound     :  9.000000e+02
simulation ci  :  9.000000e+02 ± 0.000000e+00
numeric issues : 0
-------------------------------------------------------------------

Convergence

SDDP.jl cannot guarantee that it will find a globally optimal policy when some of the variables are discrete. However, in most cases we find that it still produces an integer-feasible policy that performs well in simulation.

Moreover, when the number of nodes in the graph is large, or there is uncertainty, we are not aware of another algorithm that can claim to find a globally optimal policy.

Does SDDP.jl implement the SDDiP algorithm?

Most discussions of SDDiP in the literature conflate two unrelated things:

First, how to compute dual variables.
Second, when the algorithm will converge to a globally optimal policy.

Computing dual variables

The stochastic dual dynamic programming algorithm requires a subgradient of the objective with respect to the incoming state variable.

One way to obtain a valid subgradient is to compute an optimal value of the dual variable $\lambda$ in the following subproblem:

\[\begin{aligned} V_i(x, \omega) = \min\limits_{\bar{x}, x^\prime, u} \;\; & C_i(\bar{x}, u, \omega) + \mathbb{E}_{j \in i^+, \varphi \in \Omega_j}[V_j(x^\prime, \varphi)]\\ & x^\prime = T_i(\bar{x}, u, \omega) \\ & u \in U_i(\bar{x}, \omega) \\ & \bar{x} = x \quad [\lambda] \end{aligned}\]

The easiest option is to relax integrality of the discrete variables to form a linear program and then use linear programming duality to obtain the dual. Alternatively, we can use Lagrangian duality, which does not require relaxing the integrality constraints.

To compute the Lagrangian dual $\lambda$, we penalize $\lambda^\top(\bar{x} - x)$ in the objective instead of enforcing the constraint:

\[\begin{aligned} \max\limits_{\lambda}\min\limits_{\bar{x}, x^\prime, u} \;\; & C_i(\bar{x}, u, \omega) + \mathbb{E}_{j \in i^+, \varphi \in \Omega_j}[V_j(x^\prime, \varphi)] - \lambda^\top(\bar{x} - x)\\ & x^\prime = T_i(\bar{x}, u, \omega) \\ & u \in U_i(\bar{x}, \omega) \end{aligned}\]
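
To see why any multiplier from this problem yields a valid cut, here is a short derivation. It is a sketch that uses only the symbols above; $\theta_i$ and the trial point $\hat{x}$ are notation introduced here. For a fixed $\lambda$, define the part of the inner problem that does not depend on $x$:

\[\begin{aligned} \theta_i(\lambda, \omega) = \min\limits_{\bar{x}, x^\prime, u} \;\; & C_i(\bar{x}, u, \omega) + \mathbb{E}_{j \in i^+, \varphi \in \Omega_j}[V_j(x^\prime, \varphi)] - \lambda^\top \bar{x}\\ & x^\prime = T_i(\bar{x}, u, \omega) \\ & u \in U_i(\bar{x}, \omega) \end{aligned}\]

The inner problem drops the constraint $\bar{x} = x$, and the penalty term vanishes whenever $\bar{x} = x$, so its optimal value cannot exceed $V_i(x, \omega)$. Evaluating at a trial point $\hat{x}$ with multiplier $\hat{\lambda}$ therefore gives an affine lower bound with slope $\hat{\lambda}$:

\[\begin{aligned} V_i(x, \omega) \;\ge\; \theta_i(\hat{\lambda}, \omega) + \hat{\lambda}^\top x \;=\; \left[\theta_i(\hat{\lambda}, \omega) + \hat{\lambda}^\top \hat{x}\right] + \hat{\lambda}^\top (x - \hat{x}) \quad \text{for all } x. \end{aligned}\]

The bracketed term is the value of the inner problem at $\hat{x}$. Maximizing over $\lambda$ picks the multiplier that makes this bound as large as possible at $\hat{x}$.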

You can use Lagrangian duality in SDDP.jl by passing SDDP.LagrangianDuality to the duality_handler argument of SDDP.train.

Compared with linear programming duality, the Lagrangian problem is difficult to solve because it requires the solution of many mixed-integer programs instead of a single linear program. This is one reason why "SDDiP" has poor performance.
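
The trade-off can be seen on a toy problem. The sketch below is not SDDP.jl code: the subproblem, the grids, and the function names are all invented for illustration, and the "mixed-integer solve" is plain enumeration. It compares the bound from the linear programming relaxation with the Lagrangian bound at one state, and counts how many inner problems the Lagrangian approach solves.

```julia
# Toy subproblem (assumed for illustration):
#   V(x) = min 2u  s.t.  xbar + u >= 1.5,  xbar = x,  0 <= xbar <= 1,  u in {0, 1, 2}
U = 0:2             # integer control values
XBAR = 0:0.25:1     # grid for xbar; it contains the points where the minimum is attained
LAMBDA = -6:0.5:6   # candidate multipliers

# LP relaxation: with u in [0, 2] the optimum is u = 1.5 - x, so for 0 <= x <= 1
# the relaxed value is 2 * (1.5 - x) and its slope with respect to x is -2.
lp_value(x) = 2 * (1.5 - x)
lp_slope = -2.0

# One "mixed-integer solve": minimize the Lagrangian for a fixed multiplier.
function inner(λ, x)
    return minimum(2u - λ * (xbar - x) for u in U, xbar in XBAR if xbar + u >= 1.5)
end

# Outer maximization over the multiplier grid, counting the inner solves.
function lagrangian_dual(x)
    best_value, best_λ, solves = -Inf, 0.0, 0
    for λ in LAMBDA
        value = inner(λ, x)
        solves += 1
        if value > best_value
            best_value, best_λ = value, λ
        end
    end
    return best_value, best_λ, solves
end

x = 0.75
lag_value, lag_λ, solves = lagrangian_dual(x)
println("LP relaxation: value = ", lp_value(x), ", slope = ", lp_slope)
println("Lagrangian:    value = ", lag_value, ", slope = ", lag_λ, ", inner solves = ", solves)
```

At x = 0.75 the true value is 2. The relaxation reports 1.5 from a single solve, while the Lagrangian bound reaches 2 but needs one inner solve per candidate multiplier. Practical implementations search over the multiplier more cleverly than a grid, but each step still requires a mixed-integer solve.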

Convergence

The second part of SDDiP is a very tightly scoped claim: if all of the state variables are binary and the algorithm uses Lagrangian duality to compute a subgradient, then it will converge to an optimal policy.

In many cases, papers claim to "do SDDiP," but they have state variables which are not binary. In these cases, the algorithm is not guaranteed to converge to a globally optimal policy.
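
A small numerical illustration of why the binary assumption matters is sketched below. It is not a proof. The subproblem, grids, and function names are invented for this example and are not part of SDDP.jl. It compares the true value of a toy mixed-integer subproblem with the best Lagrangian bound at two binary states (0 and 1) and at one fractional state (0.25).

```julia
# Toy subproblem (assumed for illustration):
#   V(x) = min 2u  s.t.  xbar + u >= 1.5,  xbar = x,  0 <= xbar <= 1,  u in {0, 1, 2}
U = 0:2
XBAR = 0:0.25:1     # grid for the relaxed copy of the state
LAMBDA = -6:0.5:6   # candidate multipliers

# True value: the state is fixed, so only the integer control is enumerated.
true_value(x) = minimum(2u for u in U if x + u >= 1.5)

# Lagrangian for a fixed multiplier: the copy xbar is free to differ from x.
relaxed_value(λ, x) =
    minimum(2u - λ * (xbar - x) for u in U, xbar in XBAR if xbar + u >= 1.5)

# Best lower bound over the multiplier grid.
lagrangian_bound(x) = maximum(relaxed_value(λ, x) for λ in LAMBDA)

for x in (0.0, 0.25, 1.0)
    println("x = ", x, ": V(x) = ", true_value(x), ", Lagrangian bound = ", lagrangian_bound(x))
end
```

In this toy problem the bound matches the true value at x = 0 and x = 1, but at x = 0.25 the bound is 3 while the true value is 4. Cuts built at non-binary states can therefore sit strictly below the value function, which is why the convergence claim does not carry over.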

One work-around that has been suggested is to discretize the state variables into a set of binary state variables. However, this leads to a large number of binary state variables, which is another reason why "SDDiP" has poor performance.
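
To get a feel for the growth in state dimension, the sketch below counts the binary variables needed to encode a bounded integer state. The helper names `n_bits`, `to_bits`, and `from_bits` are hypothetical and are not part of SDDP.jl; only `ndigits` and `digits` from Julia's Base are used.

```julia
# Number of binary variables needed to represent every integer in 0:upper.
n_bits(upper::Integer) = ndigits(upper; base = 2)

# Little-endian binary digits of `value`, padded to the width required for 0:upper.
to_bits(value::Integer, upper::Integer) = digits(value; base = 2, pad = n_bits(upper))

# Recover the integer from its binary digits.
from_bits(bits) = sum(b * 2^(k - 1) for (k, b) in enumerate(bits))

println(n_bits(100))                   # integer state with 0 <= x <= 100
println(to_bits(37, 100))              # one binary state variable per entry
println(from_bits(to_bits(37, 100)))   # round trip back to the integer
println(n_bits(1000))                  # a state in [0, 100] discretized in steps of 0.1
```

The single integer state `x` from the examples above would become seven binary state variables, and a continuous state on the same range discretized in steps of 0.1 would need ten. Every state variable in the model is multiplied in the same way.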

In general, we recommend that you introduce integer variables into your model without fear of the consequences, and that you treat the resulting policy as a good heuristic, rather than an attempt to find a globally optimal policy.