
Trouble debugging gradients of nonlinear model #318

@sunnyxxchen

Description

I have a nonlinear (nonconvex) program that I am solving with Ipopt and using DiffOpt to differentiate through. I have been having trouble getting correct gradients and am not sure if this is a numerical issue with my model formulation, or if there is something more fundamental going on. Here is a pared-down example that demonstrates my issue.

# packages used below (modeling, solver, differentiation, and finite-difference check)
using JuMP, Ipopt, DiffOpt, FiniteDifferences

# define parameter value
ϕ0 = [0.0];

# define constants used in model
Phigh = 0.4412123910369964;
Plow = 0.357171935601378;
c(Nsep) = (1 / (1 + 2*Nsep))^2
cols = 2; 
reqwakes = 1; 
Nwakes = 1;
aprev = 0.1 * ones(2, 1);
rhist = Phigh * ones(1);
normalized_rhist = [1];
predhorz = 2;
KP = 0.5 * π/4;

model = DiffOpt.nonlinear_diff_model(Ipopt.Optimizer)

# define decision variables
JuMP.@variable(model, 0.05 <= a[1:cols, -reqwakes:Nwakes] <= 1/3)

# fix certain values
for j=1:cols, n=1:reqwakes
    JuMP.fix(a[j, n - cols], aprev[j, n]; force=true)
end

JuMP.fix.(a[end, 0:end], 0.333333; force=true)

# define parameter
JuMP.@variable(model, ϕ in JuMP.Parameter(ϕ0[1]))

# define parameterized signal
scale = (Phigh - Plow) / 2;
shift = (Phigh + Plow) / 2;

JuMP.@variables(model, begin
               Pref_unscaled[t=1:predhorz]
               Pref_var[t=1:predhorz]
           end)

JuMP.fix.(Pref_unscaled[1], reverse(normalized_rhist))

JuMP.@constraints(model, begin
                 [t=2:predhorz], Pref_unscaled[t] .== ϕ * Pref_unscaled[t-1]
                 [t=1:predhorz], Pref_var[t] == (scale * Pref_unscaled[t] + shift)
             end)

# define NL model expressions
JuMP.@expressions(model, begin
                     CP[j=1:cols, n=0:Nwakes], 4 * a[j,n] * (1 - a[j,n])^2
                     δV[j=1:cols, n=0:Nwakes], 2 * sqrt( sum( (a[prev,n-(j-prev)] * c(j-prev) )^2 for prev=1:j-1; init=0) )
                     u[j=1:cols, n=0:Nwakes], (1 - δV[j,n])
                     Pcap[j=1:cols, n=0:Nwakes], CP[j,n] * u[j,n]^3
                     Pgen[n=0:Nwakes], sum(Pcap[j,n] for j=1:cols)
                 end)

# define objective
JuMP.@objective(model, Min, sum( (KP * Pgen[n] - Pref_var[1+n])^2 for n=0:Nwakes))

With this model, I vary ϕ and find the optimal a, producing the following curve.

[Image: optimal a as a function of ϕ]
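
For context, the curve was produced with a sweep roughly like the following; the range, resolution, and plotting call are illustrative rather than the exact script I ran.

# sweep ϕ, re-solve, and record the optimal a[1, 0] at each value
# (illustrative range and resolution)
ϕrange = range(0.0, 1.0; length=50)
asolved = map(ϕrange) do ϕval
    JuMP.set_parameter_value(model[:ϕ], ϕval)
    JuMP.optimize!(model)
    JuMP.value(model[:a][1, 0])
end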

I then take the derivative of a w.r.t. ϕ using both DiffOpt

DiffOpt.empty_input_sensitivities!(model)

DiffOpt.set_reverse_variable.(model, model[:a][1, 0], 1)
DiffOpt.reverse_differentiate!(model)

dadϕ = DiffOpt.get_reverse_parameter.(model, model[:ϕ]);

and finite differencing.

function solve_a_ϕ0_manual(ϕ0)
    JuMP.set_parameter_value.(model[:ϕ], ϕ0)
    
    JuMP.optimize!(model)

    JuMP.assert_is_solved_and_feasible(model)

    asolved_new = JuMP.value.(model[:a]);

    return asolved_new[1, 0]
end

dadϕ_fdm = FiniteDifferences.grad(FiniteDifferences.forward_fdm(2, 1; factor=1e6), solve_a_ϕ0_manual, ϕ0);

This returns problematic results. In the flat portion of the plot, e.g., ϕ0 = 0.0, both gradients are "close" to 0, but still differ significantly.

dadϕ = -9.871980439887741e-10
dadϕ_fdm = ([-5.972732875606716e-8],)
dadϕ_fdm[1] / dadϕ = [60.501871047818184]

In the changing portion of the plot, e.g., ϕ0 = 1.0, the gradients are closer but still don't match well.

dadϕ = -0.03988513330240084
dadϕ_fdm = ([-0.04102676532022559],)
dadϕ_fdm[1] / dadϕ = [1.02862299617176]
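
In case it helps with reproduction, the finite-difference number can also be sanity-checked with plain central differences around the same point; the step sizes below are arbitrary.

# plain central-difference check of da/dϕ at ϕ0 = 1.0 (step sizes arbitrary)
for h in (1e-3, 1e-4, 1e-5)
    g = (solve_a_ϕ0_manual([1.0 + h]) - solve_a_ϕ0_manual([1.0 - h])) / (2h)
    println("h = $h  central difference ≈ $g")
end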

Plotting both gradients as tangent lines on the curve shows that finite differencing is giving me the more accurate result here.
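
For reference, the tangent overlay was done roughly like this (a Plots.jl sketch that reuses ϕrange and asolved from the sweep above; styling and exact values are not important).

using Plots

# overlay tangent lines at ϕ0 using both gradient estimates
# (assumes ϕrange and asolved from the sweep sketch above)
a0 = solve_a_ϕ0_manual(ϕ0)
plt = Plots.plot(ϕrange, asolved; xlabel="ϕ", ylabel="a[1, 0]", label="optimal a[1, 0]")
Plots.plot!(plt, ϕrange, a0 .+ dadϕ .* (ϕrange .- ϕ0[1]); label="DiffOpt tangent")
Plots.plot!(plt, ϕrange, a0 .+ dadϕ_fdm[1][1] .* (ϕrange .- ϕ0[1]); label="finite-difference tangent")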

Any insight would be appreciated!
