Preface xv

1 Optimization problem tasks and how they arise 1

1.1 The general optimization problem 1

1.2 Why the general problem is generally uninteresting 2

1.3 (Non-)Linearity 4

1.4 Objective function properties 4

2 Optimization algorithms – an overview 9

2.1 Methods that use the gradient 9

2.2 Newton-like methods 12

2.3 The promise of Newton’s method 13

2.4 Caution: convergence versus termination 14

2.5 Difficulties with Newton’s method 14

2.6 Least squares: Gauss–Newton methods 15

2.7 Quasi-Newton or variable metric method 17

2.8 Conjugate gradient and related methods 18

2.9 Other gradient methods 19

2.10 Derivative-free methods 19

2.11 Stochastic methods 20

2.12 Constraint-based methods – mathematical programming 21

References 22

3 Software structure and interfaces 25

3.1 Perspective 25

3.2 Issues of choice 26

3.3 Software issues 27

3.4 Specifying the objective and constraints to the optimizer 28

3.5 Communicating exogenous data to problem definition functions 28

3.6 Masked (temporarily fixed) optimization parameters 32

3.7 Dealing with inadmissible results 33

3.8 Providing derivatives for functions 34

3.9 Derivative approximations when there are constraints 36

3.10 Scaling of parameters and function 36

3.11 Normal ending of computations 36

3.12 Termination tests – abnormal ending 37

3.13 Output to monitor progress of calculations 37

3.14 Output of the optimization results 38

3.15 Controls for the optimizer 38

3.16 Default control settings 39

3.17 Measuring performance 39

3.18 The optimization interface 39

References 40

4 One-parameter root-finding problems 41

4.1 Roots 41

4.2 Equations in one variable 42

4.3 Some examples 42

4.4 Approaches to solving 1D root-finding problems 51

4.5 What can go wrong? 52

4.6 Being a smart user of root-finding programs 54

4.7 Conclusions and extensions 54

References 55

5 One-parameter minimization problems 56

5.1 The optimize() function 56

5.2 Using a root-finder 57

5.3 But where is the minimum? 58

5.4 Ideas for 1D minimizers 59

5.5 The line-search subproblem 61

References 62

6 Nonlinear least squares 63

6.1 nls() from package stats 63

6.2 A more difficult case 65

6.3 The structure of the nls() solution 72

6.4 Concerns with nls() 73

6.5 Some ancillary tools for nonlinear least squares 79

6.6 Minimizing R functions that compute sums of squares 81

6.7 Choosing an approach 82

6.8 Separable sums of squares problems 86

6.9 Strategies for nonlinear least squares 93

References 93

7 Nonlinear equations 95

7.1 Packages and methods for nonlinear equations 95

7.2 A simple example to compare approaches 97

7.3 A statistical example 103

References 106

8 Function minimization tools in the base R system 108

8.1 optim() 108

8.2 nlm() 110

8.3 nlminb() 111

8.4 Using the base optimization tools 112

References 114

9 Add-in function minimization packages for R 115

9.1 Package optimx 115

9.2 Some other function minimization packages 118

9.3 Should we replace optim() routines? 121

References 122

10 Calculating and using derivatives 123

10.1 Why and how 123

10.2 Analytic derivatives – by hand 124

10.3 Analytic derivatives – tools 125

10.4 Examples of use of R tools for differentiation 125

10.5 Simple numerical derivatives 127

10.6 Improved numerical derivative approximations 128

10.7 Strategy and tactics for derivatives 129

References 131

11 Bounds constraints 132

11.1 Single bound: use of a logarithmic transformation 132

11.2 Interval bounds: use of a hyperbolic transformation 133

11.3 Setting the objective large when bounds are violated 135

11.4 An active set approach 136

11.5 Checking bounds 138

11.6 The importance of using bounds intelligently 138

11.7 Post-solution information for bounded problems 139

Appendix 11.A Function transfinite 141

References 142

12 Using masks 143

12.1 An example 143

12.2 Specifying the objective 143

12.3 Masks for nonlinear least squares 147

12.4 Other approaches to masks 148

References 148

13 Handling general constraints 149

13.1 Equality constraints 149

13.2 Sumscale problems 158

13.3 Inequality constraints 163

13.4 A perspective on penalty function ideas 167

13.5 Assessment 167

References 168

14 Applications of mathematical programming 169

14.1 Statistical applications of math programming 169

14.2 R packages for math programming 170

14.3 Example problem: L1 regression 171

14.4 Example problem: minimax regression 177

14.5 Nonlinear quantile regression 179

14.6 Polynomial approximation 180

References 183

15 Global optimization and stochastic methods 185

15.1 Panorama of methods 185

15.2 R packages for global and stochastic optimization 186

15.3 An example problem 187

15.4 Multiple starting values 196

References 202

16 Scaling and reparameterization 203

16.1 Why scale or reparameterize? 203

16.2 Formalities of scaling and reparameterization 204

16.3 Hobbs’ weed infestation example 205

16.4 The KKT conditions and scaling 210

16.5 Reparameterization of the weeds problem 214

16.6 Scale change across the parameter space 214

16.7 Robustness of methods to starting points 215

16.8 Strategies for scaling 222

References 223

17 Finding the right solution 224

17.1 Particular requirements 224

17.2 Starting values for iterative methods 225

17.3 KKT conditions 226

17.4 Search tests 228

References 229

18 Tuning and terminating methods 230

18.1 Timing and profiling 230

18.2 Profiling 234

18.3 More speedups of R computations 238

18.4 External language compiled functions 242

18.5 Deciding when we are finished 247

18.5.1 Tests for things gone wrong 248

References 249

19 Linking R to external optimization tools 250

19.1 Mechanisms to link R to external software 251

19.2 Prepackaged links to external optimization tools 252

19.3 Strategy for using external tools 253

References 254

20 Differential equation models 255

20.1 The model 255

20.2 Background 256

20.3 The likelihood function 258

20.4 A first try at minimization 258

20.5 Attempts with optimx 259

20.6 Using nonlinear least squares 260

20.7 Commentary 261

Reference 262

21 Miscellaneous nonlinear estimation tools for R 263

21.1 Maximum likelihood 263

21.2 Generalized nonlinear models 266

21.3 Systems of equations 268

21.4 Additional nonlinear least squares tools 268

21.5 Nonnegative least squares 270

21.6 Noisy objective functions 273

21.7 Moving forward 274

References 275

Appendix A R packages used in examples 276

Index 279