Title: Conjugate Gradient Methods for Large-scale Unconstrained Optimization
Slide 1: Conjugate Gradient Methods for Large-scale Unconstrained Optimization
- Dr. Neculai Andrei
- Research Institute for Informatics, Bucharest
- and Academy of Romanian Scientists
Ovidius University, Constantza, Romania, March 27, 2008
Slide 2: Contents
- Problem definition
- Unconstrained optimization methods
- Conjugate gradient methods
  - Classical conjugate gradient algorithms
  - Hybrid conjugate gradient algorithms
  - Scaled conjugate gradient algorithms
  - Modified conjugate gradient algorithms
  - Parametric conjugate gradient algorithms
- Applications
Slide 3: Problem definition
$\min f(x)$, $x \in \mathbb{R}^n$, where:
- $f$ is continuously differentiable
- the gradient $g(x) = \nabla f(x)$ is available
- $n$ is large
- the Hessian is unavailable
Necessary optimality conditions: $\nabla f(x^*) = 0$.
Sufficient optimality conditions: $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*)$ positive definite.
Slide 4: Unconstrained optimization methods
General iteration: $x_{k+1} = x_k + \alpha_k d_k$, with step length $\alpha_k$ and search direction $d_k$.
1) Line search: fix the direction $d_k$, then choose the step length $\alpha_k$ along it.
2) Trust-region algorithms: minimize a quadratic approximation of $f$ within a region around $x_k$.
The way the step length and the search direction are computed influences the behavior of the algorithm.
Slide 5: Step length computation
1) Armijo rule: $f(x_k + \alpha d_k) \le f(x_k) + \rho\,\alpha\,g_k^T d_k$, with $\rho \in (0,1)$.
2) Goldstein rule: $f(x_k) + (1-\delta)\,\alpha\,g_k^T d_k \le f(x_k + \alpha d_k) \le f(x_k) + \delta\,\alpha\,g_k^T d_k$, with $\delta \in (0, 1/2)$.
Slide 6: 3) Wolfe conditions:
$f(x_k + \alpha_k d_k) \le f(x_k) + \rho\,\alpha_k\,g_k^T d_k$,
$\nabla f(x_k + \alpha_k d_k)^T d_k \ge \sigma\,g_k^T d_k$, with $0 < \rho < \sigma < 1$.
Implementations:
- Shanno (1978)
- Moré and Thuente (1992-1994)
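To make the rules above concrete, here is a minimal Python sketch: a backtracking search enforcing the Armijo rule, plus a checker for the Wolfe conditions. The function names and the constants rho, sigma, tau are illustrative choices of mine, not the Shanno or Moré-Thuente implementations.

```python
import numpy as np

def armijo_backtracking(f, grad, x, d, alpha0=1.0, rho=1e-4, tau=0.5, max_iter=50):
    """Backtrack until f(x + a*d) <= f(x) + rho * a * grad(x)^T d (Armijo rule)."""
    fx, slope = f(x), grad(x).dot(d)   # slope must be negative (descent direction)
    a = alpha0
    for _ in range(max_iter):
        if f(x + a * d) <= fx + rho * a * slope:
            break
        a *= tau                        # shrink the trial step
    return a

def satisfies_wolfe(f, grad, x, d, a, rho=1e-4, sigma=0.9):
    """Standard Wolfe conditions: Armijo decrease plus the curvature test."""
    armijo = f(x + a * d) <= f(x) + rho * a * grad(x).dot(d)
    curvature = grad(x + a * d).dot(d) >= sigma * grad(x).dot(d)
    return armijo and curvature
```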
Slide 7: Proposition. Assume that $d_k$ is a descent direction and $\nabla f$ satisfies the Lipschitz condition $\|\nabla f(x) - \nabla f(x_k)\| \le L\|x - x_k\|$ for all $x$ on the line segment connecting $x_k$ and $x_{k+1}$, where $L$ is a positive constant.
If the line search satisfies the Goldstein conditions, then $\alpha_k \ge \dfrac{2\delta\,|g_k^T d_k|}{L\,\|d_k\|^2}$.
If the line search satisfies the Wolfe conditions, then $\alpha_k \ge \dfrac{(1-\sigma)\,|g_k^T d_k|}{L\,\|d_k\|^2}$.
Slide 8: Remarks
1) The Newton method, the quasi-Newton methods, and the limited-memory quasi-Newton methods have the ability to accept unit step lengths along the iterations.
2) In conjugate gradient methods the step lengths
may differ from 1 in a very unpredictable
manner. They can be larger or smaller than 1
depending on how the problem is scaled.
N. Andrei, Acceleration of conjugate gradient algorithms for unconstrained optimization, 2007 (submitted to JOTA).
Slide 9: Methods for Unconstrained Optimization
1) Steepest descent (Cauchy, 1847)
2) Newton
3) Quasi-Newton (Broyden, 1965 and many others)
Slide 10: 4) Conjugate Gradient Methods (1952)
$x_{k+1} = x_k + \alpha_k d_k$, $d_{k+1} = -g_{k+1} + \beta_k d_k$, where $\beta_k$ is known as the conjugate gradient parameter.
5) Truncated Newton method (Dembo et al., 1982)
6) Trust Region methods
Slide 11: 7) Conic model method (Davidon, 1980)
8) Tensor methods (Schnabel and Frank, 1984)
Slide 12: 9) Methods based on systems of differential equations: the gradient flow method (Courant, 1942)
10) Direct search methods:
- Hooke-Jeeves (pattern search) (1961)
- Powell (conjugate directions) (1964)
- Rosenbrock (coordinate system rotation) (1960)
- Nelder-Mead (rolling the simplex) (1965)
- Powell UOBYQA (quadratic approximation) (1994-2000)
N. Andrei, Critica Ratiunii Algoritmilor de Optimizare fara Restrictii [A Critique of the Reason of Unconstrained Optimization Algorithms], Editura Academiei Romane, 2008.
Slide 13: Conjugate Gradient Methods
Magnus Hestenes (1906-1991)
Eduard Stiefel (1909-1978)
Slide 14: The prototype of the conjugate gradient algorithm
Step 1. Select the initial starting point $x_0$, compute $g_0 = \nabla f(x_0)$ and set $d_0 = -g_0$, $k = 0$.
Step 2. Test a criterion for stopping the iterations (e.g. $\|g_k\|_\infty \le \varepsilon$).
Step 3. Determine the step length $\alpha_k$ by the Wolfe conditions.
Step 4. Update the variables: $x_{k+1} = x_k + \alpha_k d_k$.
Step 5. Compute the conjugate gradient parameter $\beta_k$.
Step 6. Compute the search direction: $d_{k+1} = -g_{k+1} + \beta_k d_k$.
Step 7. Restart: if the restart criterion holds (e.g. Powell's $|g_{k+1}^T g_k| \ge 0.2\,\|g_{k+1}\|^2$), then set $d_{k+1} = -g_{k+1}$.
Step 8. Compute the initial guess of the step length, set $k = k + 1$ and continue with Step 2.
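A minimal Python sketch of Steps 1-8, assuming SciPy's Wolfe line search for Step 3 and a pluggable `beta_fn` for Step 5; the crude fallback step and the omission of Step 8's initial step-length guess are simplifications of mine.

```python
import numpy as np
from scipy.optimize import line_search  # Wolfe line search (c1=1e-4, c2=0.9)

def cg_prototype(f, grad, x0, beta_fn, eps=1e-6, max_iter=10000):
    """Sketch of the prototype CG algorithm; beta_fn(g_new, g_old, d) -> beta_k."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                            # Step 1
    for _ in range(max_iter):
        if np.linalg.norm(g, np.inf) <= eps:          # Step 2: stopping test
            break
        alpha = line_search(f, grad, x, d, gfk=g)[0]  # Step 3: Wolfe conditions
        if alpha is None:
            alpha = 1e-8                              # crude fallback for the sketch
        x_new = x + alpha * d                         # Step 4
        g_new = grad(x_new)
        beta = beta_fn(g_new, g, d)                   # Step 5
        d_new = -g_new + beta * d                     # Step 6
        if abs(g_new.dot(g)) >= 0.2 * g_new.dot(g_new):  # Step 7: Powell restart
            d_new = -g_new
        x, g, d = x_new, g_new, d_new                 # Step 8: next iteration
    return x

# Usage with the Dai-Yuan parameter on the Rosenbrock function:
from scipy.optimize import rosen, rosen_der
x_star = cg_prototype(rosen, rosen_der, [-1.2, 1.0],
                      lambda gn, go, d: gn.dot(gn) / d.dot(gn - go))
```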
Slide 15: Convergence Analysis
Theorem. Suppose that:
1) the level set $S = \{x : f(x) \le f(x_0)\}$ is bounded,
2) the function $f$ is continuously differentiable,
3) the gradient is Lipschitz continuous, i.e. $\|\nabla f(x) - \nabla f(y)\| \le L\|x - y\|$ for all $x, y \in S$.
Consider any conjugate gradient method where $d_k$ is a descent direction and $\alpha_k$ is obtained by the strong Wolfe line search.
If $\sum_{k \ge 0} \dfrac{1}{\|d_k\|^2} = \infty$, then $\liminf_{k \to \infty} \|g_k\| = 0$.
Slide 16: Classical conjugate gradient algorithms ($y_k = g_{k+1} - g_k$, $s_k = x_{k+1} - x_k$)
1. Hestenes-Stiefel (HS): $\beta_k^{HS} = \dfrac{g_{k+1}^T y_k}{y_k^T d_k}$
2. Polak-Ribière-Polyak (PRP): $\beta_k^{PRP} = \dfrac{g_{k+1}^T y_k}{g_k^T g_k}$
3. Liu-Storey (LS): $\beta_k^{LS} = \dfrac{g_{k+1}^T y_k}{-g_k^T d_k}$
4. Fletcher-Reeves (FR): $\beta_k^{FR} = \dfrac{g_{k+1}^T g_{k+1}}{g_k^T g_k}$
5. Conjugate Descent, Fletcher (CD): $\beta_k^{CD} = \dfrac{g_{k+1}^T g_{k+1}}{-g_k^T d_k}$
6. Dai-Yuan (DY): $\beta_k^{DY} = \dfrac{g_{k+1}^T g_{k+1}}{y_k^T d_k}$
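These six formulas translate directly into code; a small NumPy helper (names mine) that returns all of them for one iteration:

```python
import numpy as np

def classical_betas(g_new, g_old, d):
    """Classical CG parameters; y = g_{k+1} - g_k, d = current direction."""
    y = g_new - g_old
    return {
        "HS":  g_new.dot(y) / d.dot(y),
        "PRP": g_new.dot(y) / g_old.dot(g_old),
        "LS":  g_new.dot(y) / (-g_old.dot(d)),
        "FR":  g_new.dot(g_new) / g_old.dot(g_old),
        "CD":  g_new.dot(g_new) / (-g_old.dot(d)),
        "DY":  g_new.dot(g_new) / d.dot(y),
    }
```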
Slide 17: Classical conjugate gradient algorithms (performance profiles figure)
Slide 18: Classical conjugate gradient algorithms
7. Dai-Liao (DL): $\beta_k^{DL} = \dfrac{g_{k+1}^T (y_k - t\,s_k)}{y_k^T d_k}$, $t > 0$
8. Dai-Liao plus (DL+): $\beta_k^{DL+} = \max\left\{\dfrac{g_{k+1}^T y_k}{y_k^T d_k},\ 0\right\} - t\,\dfrac{g_{k+1}^T s_k}{y_k^T d_k}$
9. Andrei, sufficient descent condition (CGSD)
N. Andrei, A Dai-Yuan conjugate gradient
algorithm with sufficient descent and conjugacy
conditions for unconstrained optimization.
Applied Mathematics Letters, vol.21, 2008,
pp.165-171.
Slide 19: Classical conjugate gradient algorithms (figure)
Slide 20: Hybrid conjugate gradient algorithms - projections
10. Hybrid Dai-Yuan (hDY): $\beta_k = \max\left\{-\dfrac{1-\sigma}{1+\sigma}\,\beta_k^{DY},\ \min\{\beta_k^{HS}, \beta_k^{DY}\}\right\}$
11. Hybrid Dai-Yuan zero (hDYz): $\beta_k = \max\{0,\ \min\{\beta_k^{HS}, \beta_k^{DY}\}\}$
12. Gilbert-Nocedal (GN): $\beta_k = \max\{-\beta_k^{FR},\ \min\{\beta_k^{PRP}, \beta_k^{FR}\}\}$
Slide 21: Hybrid conjugate gradient algorithms - projections
13. Hu-Storey (HuS): $\beta_k = \max\{0,\ \min\{\beta_k^{PRP}, \beta_k^{FR}\}\}$
14. Touati-Ahmed and Storey (TaS): $\beta_k = \beta_k^{PRP}$ if $0 \le \beta_k^{PRP} \le \beta_k^{FR}$, otherwise $\beta_k = \beta_k^{FR}$
15. Hybrid LS-CD: $\beta_k = \max\{0,\ \min\{\beta_k^{LS}, \beta_k^{CD}\}\}$
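The projection-type hybrids are clippings of the classical parameters; a sketch building on the `classical_betas` helper above, with sigma = 0.9 assumed as the Wolfe curvature constant in hDY:

```python
def hybrid_betas(b, sigma=0.9):
    """Projection-type hybrid CG parameters from a dict b of classical betas."""
    c = (1 - sigma) / (1 + sigma)
    return {
        "hDY":   max(-c * b["DY"], min(b["HS"], b["DY"])),
        "hDYz":  max(0.0, min(b["HS"], b["DY"])),
        "GN":    max(-b["FR"], min(b["PRP"], b["FR"])),
        "HuS":   max(0.0, min(b["PRP"], b["FR"])),
        "TaS":   b["PRP"] if 0.0 <= b["PRP"] <= b["FR"] else b["FR"],
        "LS-CD": max(0.0, min(b["LS"], b["CD"])),
    }
```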
Slide 22: Hybrid conjugate gradient algorithms - projections (figure)
Slide 23: Hybrid conjugate gradient algorithms - convex combination
16. Convex combination of PRP and DY from the conjugacy condition (CCOMB, Andrei):
$\beta_k = (1 - \theta_k)\,\beta_k^{PRP} + \theta_k\,\beta_k^{DY}$, with $\theta_k$ determined from the conjugacy condition $y_k^T d_{k+1} = 0$.
If $\theta_k \le 0$, then $\beta_k = \beta_k^{PRP}$.
If $\theta_k \ge 1$, then $\beta_k = \beta_k^{DY}$.
N. Andrei, New hybrid conjugate gradient
algorithms for unconstrained optimization.
Encyclopedia of Optimization, 2nd Edition,
Springer, August 2008, Entry 761.
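One way to see the construction: since $\beta_k(\theta)$ is affine in $\theta$ and the pure conjugacy condition $y_k^T d_{k+1} = 0$ forces $\beta_k = \beta_k^{HS}$, $\theta_k$ solves a one-dimensional linear equation. The sketch below follows that reasoning and is illustrative only; the published CCOMB formula for $\theta_k$ may be written in a different algebraic form.

```python
def ccomb_beta(g_new, g_old, d):
    """Illustrative CCOMB-style parameter: beta = (1-theta)*PRP + theta*DY,
    theta chosen so that y^T d_{k+1} = 0 (which forces beta = beta_HS)."""
    y = g_new - g_old
    b_prp = g_new.dot(y) / g_old.dot(g_old)
    b_dy  = g_new.dot(g_new) / d.dot(y)
    b_hs  = g_new.dot(y) / d.dot(y)
    if b_dy == b_prp:                   # degenerate case: any theta works
        return b_prp
    theta = (b_hs - b_prp) / (b_dy - b_prp)
    theta = min(max(theta, 0.0), 1.0)   # clip: theta<0 -> PRP, theta>1 -> DY
    return (1 - theta) * b_prp + theta * b_dy
```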
Slide 24: Hybrid conjugate gradient algorithms - convex combination
17. Convex combination of PRP and DY from the Newton direction (NDOMB, Andrei):
$\beta_k = (1 - \theta_k)\,\beta_k^{PRP} + \theta_k\,\beta_k^{DY}$, with $\theta_k$ determined by matching the Newton direction.
If $\theta_k \le 0$, then $\beta_k = \beta_k^{PRP}$.
If $\theta_k \ge 1$, then $\beta_k = \beta_k^{DY}$.
N. Andrei, New hybrid conjugate gradient
algorithms as a convex combination of PRP and DY
for unconstrained optimization. ICI Technical
Report, October 1, 2007. (submitted AML)
Slide 25: Hybrid conjugate gradient algorithms - convex combination (figure)
Slide 26: Hybrid conjugate gradient algorithms - convex combination (figure)
Slide 27: Hybrid conjugate gradient algorithms - convex combination
18. Convex combination of HS and DY from the Newton direction (HYBRID, Andrei):
$\beta_k = (1 - \theta_k)\,\beta_k^{HS} + \theta_k\,\beta_k^{DY}$, with $\theta_k$ determined by matching the Newton direction under the secant condition $\nabla^2 f(x_{k+1})\,s_k = y_k$.
If $\theta_k \le 0$, then $\beta_k = \beta_k^{HS}$.
If $\theta_k \ge 1$, then $\beta_k = \beta_k^{DY}$.
N. Andrei, A hybrid conjugate gradient algorithm
for unconstrained optimization as a convex
combination of Hestenes-Stiefel and
Dai-Yuan. Studies in Informatics and Control,
vol.17, No.1, March 2008, pp.55-70.
Slide 28: Hybrid conjugate gradient algorithms - convex combination
19. Convex combination of HS and DY from the Newton direction with a modified secant condition (HYBRIDM, Andrei):
$\beta_k = (1 - \theta_k)\,\beta_k^{HS} + \theta_k\,\beta_k^{DY}$, with $\theta_k$ determined as above but from a modified secant condition.
If $\theta_k \le 0$, then $\beta_k = \beta_k^{HS}$.
If $\theta_k \ge 1$, then $\beta_k = \beta_k^{DY}$.
N. Andrei, A hybrid conjugate gradient algorithm
with modified secant condition for unconstrained
optimization. ICI Technical Report, February 6,
2008 (submitted to Numerical Algorithms)
Slide 29: Hybrid conjugate gradient algorithms - convex combination (figure)
Slide 30: Scaled conjugate gradient algorithms
N. Andrei, Scaled memoryless BFGS preconditioned
conjugate gradient algorithm for unconstrained
optimization. Optimization Methods and Software,
22 (2007), pp.561-571.
Slide 31: Scaled conjugate gradient algorithms
A) Secant condition: $B_{k+1} s_k = y_k$
B) Modified secant condition
N. Andrei, Accelerated conjugate gradient
algorithm with modified secant condition for
unconstrained optimization. ICI Technical Report,
March 3, 2008. (Submitted, JOTA, 2007)
Slide 32: Scaled conjugate gradient algorithms
C) Hessian/vector approximation by finite differences:
$\nabla^2 f(x_{k+1})\,v \approx \dfrac{\nabla f(x_{k+1} + \delta v) - \nabla f(x_{k+1})}{\delta}$
N. Andrei, Accelerated conjugate gradient
algorithm with finite difference Hessian /
vector product approximation for unconstrained
optimization. ICI Technical Report, March 4,
2008 (submitted Math. Programm.)
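The forward-difference Hessian/vector product costs only one extra gradient evaluation; a sketch, where the step-size heuristic for delta is my assumption rather than the report's exact rule:

```python
import numpy as np

def hessian_vector_fd(grad, x, v):
    """Forward difference: H(x) v ~ (grad(x + delta*v) - grad(x)) / delta."""
    eps = np.finfo(float).eps
    # balance truncation vs. rounding error (heuristic scaling, assumption)
    delta = 2.0 * np.sqrt(eps) * (1.0 + np.linalg.norm(x)) / max(np.linalg.norm(v), 1e-16)
    return (grad(x + delta * v) - grad(x)) / delta
```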
Slide 33: Scaled conjugate gradient algorithms
20. Birgin-Martínez (BM): $\theta_{k+1} = \dfrac{s_k^T s_k}{s_k^T y_k}$ (spectral scaling), $\beta_k = \dfrac{(\theta_{k+1} y_k - s_k)^T g_{k+1}}{s_k^T y_k}$
21. Birgin-Martínez plus (BM+): as above with $\beta_k$ truncated at zero
22. Scaled Polak-Ribière-Polyak (sPRP)
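In code, the spectral scaling and the Birgin-Martínez parameter read as follows (a sketch from the formulas above; the `plus` flag models the BM+ truncation):

```python
def birgin_martinez(g_new, s, y, plus=False):
    """Spectral scaling theta = s^T s / s^T y and the BM parameter
    beta = (theta*y - s)^T g_new / (s^T y); plus=True truncates beta at 0."""
    theta = s.dot(s) / s.dot(y)
    beta = (theta * y - s).dot(g_new) / s.dot(y)
    if plus:
        beta = max(beta, 0.0)
    return theta, beta   # direction: d_new = -theta * g_new + beta * s
```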
Slide 34: Scaled conjugate gradient algorithms
23. Scaled Fletcher-Reeves (sFR)
24. Scaled Hestenes-Stiefel (sHS)
Slide 35: Scaled conjugate gradient algorithms
25. SCALCG (secant condition)
N. Andrei, Scaled conjugate gradient algorithms
for unconstrained optimization. Computational
Optimization and Applications, vol. 38, no. 3,
(2007), pp.401-416.
Slide 36: Scaled conjugate gradient algorithms
Theorem. Suppose that the step length $\alpha_k$ satisfies the Wolfe conditions; then the direction $d_{k+1}$ is a descent direction.
N. Andrei, A scaled BFGS preconditioned conjugate
gradient algorithm for unconstrained
optimization. Applied Mathematics Letters, 20
(2007), pp.645-650.
Slide 37: Scaled conjugate gradient algorithms
The Powell restarting procedure: restart when $|g_{k+1}^T g_k| \ge 0.2\,\|g_{k+1}\|^2$.
The direction $d_{k+1}$ is computed using a double update scheme for the restart iterations.
N. Andrei, A scaled nonlinear conjugate gradient algorithm for unconstrained optimization. Optimization: A Journal of Mathematical Programming and Operations Research, accepted.
Slide 38: Scaled conjugate gradient algorithms
N. Andrei, Scaled memoryless BFGS preconditioned
conjugate gradient algorithm for unconstrained
optimization. Optimization Methods and Software,
22 (2007), pp.561-571.
Slide 39: Scaled conjugate gradient algorithms
Lemma. If $\nabla f$ is Lipschitz continuous, then the search directions remain bounded.
Theorem. For strongly convex functions, the algorithm with Wolfe line search is globally convergent: $\lim_{k \to \infty} \|g_k\| = 0$.
Slide 40: Scaled conjugate gradient algorithms (figure)
Slide 41: Scaled conjugate gradient algorithms
26. ASCALCG (secant condition)
In conjugate gradient methods the step lengths
may differ from 1 in a very unpredictable manner.
They can be larger or smaller than 1 depending on
how the problem is scaled.
General Theory of Acceleration
N. Andrei, (2007) Acceleration of conjugate
gradient algorithms for unconstrained
optimization. ICI Technical Report, October 24,
2007. (submitted JOTA, 2007)
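The acceleration admits a compact reading: with $\varphi(t) = f(x_k + t d_k)$, the Wolfe step $\alpha_k$ is rescaled by $\varphi'(0)/(\varphi'(0) - \varphi'(\alpha_k))$, a secant step on $\varphi'$, at the cost of one extra gradient evaluation. A sketch in my own notation, not necessarily the $a_k, b_k$ form of the paper:

```python
def accelerated_step(grad, x, d, alpha, g):
    """Rescale the Wolfe step alpha by a secant step on phi'(t) = grad(x+t*d)^T d."""
    gz = grad(x + alpha * d)
    dphi0 = g.dot(d)            # phi'(0) < 0 for a descent direction
    dphia = gz.dot(d)           # phi'(alpha)
    denom = dphi0 - dphia
    if denom < 0.0:             # positive curvature along d: rescale the step
        return x + (dphi0 / denom) * alpha * d
    return x + alpha * d        # otherwise keep the ordinary step
```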
Slide 42: Scaled conjugate gradient algorithms (figure)
Slide 43: Scaled conjugate gradient algorithms (figure)
Slide 44: Scaled conjugate gradient algorithms
If $d_k$ is a descent direction, then ...
Slide 45: Scaled conjugate gradient algorithms (... computation)
Slide 46: Proposition. Suppose that $f$ is a uniformly convex function on the level set $S$, and $d_k$ satisfies the sufficient descent condition $g_k^T d_k \le -c_1\,\|g_k\|^2$, where $c_1 > 0$, and $\|d_k\| \le c_2\,\|g_k\|$, where $c_2 > 0$. Then the sequence generated by ACG converges linearly to $x^*$, the solution to the optimization problem.
Slide 47: Scaled conjugate gradient algorithms
N. Andrei, Accelerated scaled memoryless BFGS preconditioned conjugate gradient algorithm for unconstrained optimization. ICI Technical Report, March 10, 2008 (submitted to Numerische Mathematik, 2008).
Slide 48: Scaled conjugate gradient algorithms (figure)
Slide 49: Scaled conjugate gradient algorithms (figure)
Slide 50: Modified conjugate gradient algorithms
27. Andrei Sufficient Descent Condition from
PRP (A-prp)
28. Andrei Sufficient Descent Condition from DY
(ACGA)
29. Andrei Sufficient Descent Condition from DY
zero (ACGA)
Slide 51: Modified conjugate gradient algorithms
30) CG_DESCENT (Hager-Zhang, 2005, 2006): $\beta_k^{HZ} = \dfrac{1}{d_k^T y_k}\left(y_k - 2 d_k \dfrac{\|y_k\|^2}{d_k^T y_k}\right)^T g_{k+1}$, truncated as $\beta_k = \max\{\beta_k^{HZ}, \eta_k\}$ with $\eta_k = \dfrac{-1}{\|d_k\|\,\min\{\eta, \|g_k\|\}}$.
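In code, the Hager-Zhang parameter with its lower-bound truncation; eta = 0.01 is the constant suggested in their papers, as far as I recall:

```python
import numpy as np

def beta_hager_zhang(g_new, g_old, d, eta=0.01):
    """CG_DESCENT parameter beta_HZ with the truncation beta = max(beta_HZ, eta_k)."""
    y = g_new - g_old
    dy = d.dot(y)
    beta = (y - 2.0 * d * y.dot(y) / dy).dot(g_new) / dy
    eta_k = -1.0 / (np.linalg.norm(d) * min(eta, np.linalg.norm(g_old)))
    return max(beta, eta_k)
```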
Slide 52: Modified conjugate gradient algorithms (figure)
Slide 53: Modified conjugate gradient algorithms (figure)
Slide 54: Modified conjugate gradient algorithms (figure)
Slide 55: Modified conjugate gradient algorithms (figure)
Slide 56: Modified conjugate gradient algorithms (see Slide 30)
31) ACGMSEC
For a particular value of the parameter, we get exactly the Perry method.
Theorem. If ..., then ...
N. Andrei, Accelerated conjugate gradient
algorithm with modified secant condition for
unconstrained optimization. ICI Technical Report,
March 3, 2008. (Submitted Applied Mathematics
and Optimization, 2008)
Slide 57: Modified conjugate gradient algorithms (figure)
Slide 58: Modified conjugate gradient algorithms (figure)
Slide 59: Modified conjugate gradient algorithms
32) ACGHES
N. Andrei, Accelerated conjugate gradient
algorithm with finite difference Hessian /
vector product approximation for unconstrained
optimization. ICI Technical Report, March 4,
2008 (submitted Math. Programm.)
Slide 60: Modified conjugate gradient algorithms (figure)
Slide 61: Modified conjugate gradient algorithms (figure)
Slide 62: Comparisons with other UO methods (figure)
Slide 63: Parametric conjugate gradient algorithms
33) Yabe-Takano
34) Yabe-Takano
Slide 64: Parametric conjugate gradient algorithms
35) Parametric CG suggested by Dai-Yuan
36) Parametric CG suggested by Nazareth
Slide 65: Parametric conjugate gradient algorithms
37) Parametric CG suggested by Dai-Yuan
Slide 66: Applications
A1) Elastic-Plastic Torsion (c = 5) (nx = 200, ny = 200), MINPACK-2 collection
Slide 67: SCALCG: iter = 445, fg = 584, cpu = 8.49 s; ASCALCG: iter = 240, fg = 269, cpu = 6.93 s
n = 40000 variables
Slide 68: A2) Pressure Distribution in a Journal Bearing (ecc = 0.1, b = 10) (nx = 200, ny = 200)
Slide 70: A3) Optimal Design with Composite Materials
Slide 72: A4) Steady-State Combustion (Bratu Problem)
Slide 74: A5) Ginzburg-Landau (1-dimensional)
Free Gibbs energy
Slide 75: n = 1000 variables
Slide 76: Thank you!