
Model Reduction for Dynamic Real-Time Optimization of Chemical Processes

Cover design by Rob Bergervoet. © Copyright 2005 Edoch.

Model Reduction for Dynamic Real-Time Optimization of Chemical Processes

DISSERTATION

for the award of the degree of doctor at the Technische Universiteit Delft, by authority of the Rector Magnificus, Prof.dr.ir. J.T. Fokkema, chairman of the Board for Doctorates, to be defended in public on Thursday 15 December at 13:00 by Jogchem VAN DEN BERG, mechanical engineer, born in Enschede.

This dissertation has been approved by the promotor: Prof.ir. O.H. Bosgra.

Composition of the doctoral committee:
Rector Magnificus, chairman
Prof.ir. O.H. Bosgra, Technische Universiteit Delft, promotor
Prof.dr.ir. A.C.P.M. Backx, Technische Universiteit Eindhoven
Prof.ir. J. Grievink, Technische Universiteit Delft
Dr.ir. P.J.T. Verheijen, Technische Universiteit Delft
Prof.dr.-Ing. H.A. Preisig, Norwegian University of Science and Technology
Prof.dr.-Ing. W. Marquardt, RWTH Aachen
Prof.dr.ir. P.A. Wieringa, Technische Universiteit Delft

Published by: OPTIMA, P.O. Box 84115, 3009 CC Rotterdam, The Netherlands. Telephone: +31-102201149, Telefax: +31-104566354, E-mail: account@ogc.nl

ISBN 90-8559-152-x

Keywords: chemical processes, model reduction, optimization.

© Copyright 2005 by Jogchem van den Berg. All rights reserved. No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from Jogchem van den Berg. Printed in The Netherlands.

Preface

After a memorable period of almost seven years, the result of my research is finally down in black and white. And that feels good. During my graduation project I became increasingly convinced that it would be well worth doing a PhD. After a few conversations with Ton Backx and Okko Bosgra about an international project, the topic of model reduction came up, and I immediately believed it would be a fascinating subject. I look back with great pleasure on the absurdist conversations, alternated with intense, nitpicking technical discussions. They usually started seriously, but fortunately there was always someone with a refreshing remark to put the importance of the matter into perspective. I would like to thank a few people in particular. First of all, of course, Okko, who gave me all the freedom to follow my own plan on the basis of our always interesting technical discussions. I would like to thank Adrie for his passionate explanations of everything related to chemistry and for acting as a sounding board for me. I want to thank my roommates Rob and Dennis, nestor David, Martijn, Eduard, Branko, Camile, Gideon, Leon, Maria, Martijn, Matthijs and Agnes, Carsten, Debbie, Peter, Piet, Sjoerd and Ton for all the coffee-table conversations and chats over drinks. I also want to thank my colleagues on the project, among them Martin, Jitendra, Wolfgang Marquardt, Mario, Jobert, Wim, Sjoerd, Celeste, Piet-Jan, Pieter, Peter Verheijen and Johan Grievink. Without you the project would certainly not have succeeded. Finally, I want to thank my parents for the unconditional support I have always received from them. My brother Mattijs and Kirsten for Lynn, for whom I can now finally be a generous uncle. Rob for the design of the cover of my little book, and my other friends, who all this time have had to listen to stories about the ups and downs I experienced during my PhD. On to the next challenge!

Jogchem van den Berg
Rotterdam, October 2005


Contents

Preface

1 Introduction and problem formulation
  1.1 Introduction
  1.2 Problem exploration
  1.3 Literature on nonlinear model reduction
  1.4 Solution directions
  1.5 Research questions
  1.6 Outline of this thesis

2 Model order reduction suitable for large scale nonlinear models
  2.1 Model order reduction
  2.2 Balanced reduction
  2.3 Proper orthogonal decomposition
  2.4 Balanced reduction revisited
  2.5 Evaluation on a process model
  2.6 Discussion

3 Dynamic optimization
  3.1 Base case
  3.2 Results
  3.3 Model quality
  3.4 Discussion

4 Physics-based model reduction
  4.1 Rigorous model
  4.2 Physics-based reduced model
  4.3 Model quality
  4.4 Discussion

5 Model order reduction by projection
  5.1 Introduction
  5.2 Projection of nonlinear models
  5.3 Results of model reduction by projection
  5.4 Discussion

6 Conclusions and future research
  6.1 Conclusions
  6.2 Future research

Bibliography

List of symbols

A Gramians
  A.1 Balancing transformations
  A.2 Perturbed empirical Gramians

B Proper orthogonal decomposition

C Nonlinear Optimization

D Gradient information of projected models

Summary

Samenvatting

Curriculum Vitae

Chapter 1

Introduction and problem formulation

This thesis explores the possibilities of model reduction techniques for online optimization-based control in the chemical process industry. The success of this control approach on industrial-scale problems started in the petrochemical industry and is now being adopted by the chemical process industry. This is a challenge because the operation of chemical plants differs from petrochemical plant operation, imposing different requirements on optimization-based control and consequently on process models. For online optimization-based control the computational time is limited, so computational load is a critical issue. This thesis focuses on the models and their contribution to online optimization-based control.

1.1 Introduction

The value of models in the process industries becomes apparent in practice and in the literature, where numerous successful applications of steady-state plant design optimization and model-based control are reported. This development was boosted by maturing commercial modelling tools and continuously increasing computing power. A side effect of this development is that not only can larger processes with more unit operations be modelled, but each unit operation can be modelled in more detail as well. Especially spatially distributed systems, such as distillation columns and tubular reactors, are well-known model size boosters. Large-scale models are in principle not a problem, at most an inconvenience for the model developer because of the long simulation times involved. Application of models in an online setting, however, seriously pushes the demands on models

to their limits, since the available time for computations is limited. Numerous solutions are conceivable that contribute to solving this issue, varying from buying faster computers to solving approximate, computationally less demanding control problems. Industry focuses on the implementation of effective optimization-based control solutions, whereas university groups focus on understanding the (in)effectiveness of these solutions. Understanding gives direction to the development of more effective solutions suitable for industrial applications. This thesis is the result of close collaboration between industry and universities, aiming at a symbiosis of these two different focuses.

Online optimization

From digital control theory (see e.g. Åström and Wittenmark, 1997) we know that sampling introduces a delay in the control system, limiting controller bandwidth. The sampling period is therefore preferably chosen as short as possible. A similar situation holds for online optimization. When applying online optimization-based control we can make a tradeoff between a high-precision solution with a long sampling period and an approximate solution with a short sampling period. In case of a fairly good model and low-frequency disturbances, a low sample rate would most probably be sufficient. However, a tight quality constraint in combination with a fast response to a disturbance for the same system will require a much higher controller bandwidth to maintain performance. So the tradeoff between solution accuracy and sampling period depends on plant-model mismatch, disturbance characteristics, plant dynamics and the presence of constraints. With current modelling packages, processes can be modelled in great detail and model accuracy seems unquestionable. Unfortunately, in reality we always have to deal with uncertain parameters and stochastic disturbances. One can think of heat exchanger fouling, catalyst decay, uncertain reaction rates and uncertain flow patterns. This motivates the necessity of a feedback mechanism that deals with disturbances and uncertainties at a suitable sampling interval. So we can develop large-scale models, based on modelling assumptions, that have some degree of accuracy. Still we need to estimate some key uncertain parameters online (e.g. heat exchanger fouling) from data. Why not choose a shorter sampling period, enabling a higher controller bandwidth, by allowing some model approximation? At some point there must be a break-even point where the performance of a controller based on an accurate model at a low sampling rate is as good as that of a controller based on a less accurate model at a high sampling rate. Applications of linear model predictive controllers to mildly nonlinear processes illustrate this successful tradeoff between model accuracy and sample frequency. This observation is the main motivation to research and develop nonlinear model

approximation techniques for online optimization-based control. The aim of this approximate model is to improve performance by adding accuracy to the predictive part without being overtaken by the downside of a slightly longer sampling period.

Model reduction

Model reduction is not a purpose in itself. Without being specific about what is aimed at by reduction, model reduction is meaningless. In this thesis we will assess the value of model reduction techniques for dynamic real-time optimization. In system theory we associate model reduction with model-order reduction, which implies a reduction of the number of differential equations. Because linear model reduction was successful, this notion was carried over to the reduction of process models governed by nonlinear differential and algebraic equations (dae). The first difference between these two model types is obviously that dae process models are nonlinear in their differential part. This nonlinearity was precisely the extension we looked for, since it should improve model accuracy. The second difference is that dae process models consist of many algebraic equations. For a set of (index-one) nonlinear differential algebraic equations we generally cannot eliminate these algebraic equations by analytical substitution, which would bring us back to ordinary differential equations. This implies that we deal with a truly different model structure if implicit algebraic equations are present. This difference has far-reaching consequences for the notion of reduction of nonlinear models, as will become clear later in this thesis. As discussed in the previous section, both solution accuracy and computational load of the online optimization determine the online performance. In case of online optimization-based control we in principle do not care about the exact number of differential or algebraic variables of a model; model reduction should result in a reduction of the optimization time while accepting some small degradation of solution accuracy. Model approximation is an alternative description for the model reduction used in this thesis, since it is less associated with model order reduction only. A model is not only characterized by its number of differential and algebraic equations, but also by structural properties such as sparsity and controllability-observability. Time scales, nonlinearity and steady-state gains are important properties as well. The relation between model properties and computational load and accuracy is not always trivial, which can result in unexpected findings when evaluating model reduction techniques. Note further that the model is not the only degree of freedom within the optimization framework. The success of model reduction for online optimization-based control depends on the optimization strategy and implementation details such as the choice of solvers and solver options.

Realizing that the final judgement of the success of a model reduction technique for online optimization depends on many different implementation choices, we elaborate in the next section on the different aspects that affect the optimization problem.

1.2 Problem exploration

In this section we explore different aspects to be considered when discussing dynamic real-time optimization of large-scale nonlinear chemical processes. First of all we need a model representing the process behavior. This is not a trivial task, but it is nowadays supported by various commercially available tools. Such a model is then confronted with plant data to assess its validity, which in general requires changes to the model. After several iterations we end up with a validated model, which is ready for use within online optimization. Since we are interested in the future dynamic evolution of some key output process variables, we require simulation techniques to relate them to future process input variables. Computation of the best future inputs is done by formulating and solving an optimization problem. We will touch on the differences between the two main implementation variants for solving such a dynamic optimization problem. Finally we will motivate the need for model reduction by showing the computational consequences of a straightforward implementation of this optimization problem for the model size that is common for industrial chemical processes.

Mathematical models

Numerous different names are available for models, each referring to a specific property of the model. One can distinguish between models based on conservation laws and data-driven models. The first class of models is referred to as first-principles, fundamental, rigorous or theoretical models; these are in general nonlinear dynamic continuous-time models. They are formulated as a set of ordinary differential equations (ode model) or a set of differential algebraic equations (dae model). The second class of models is referred to as identified, step response or impulse response models; these are in general defined as linear discrete-time input-output regressive models. Nonlinear static models can be added to these linear dynamic models, giving an overall nonlinear behavior. The subdivision is not as black and white as stated here, and combinations of both classes are possible, as the term hybrid models already implies. Process models that are used for dynamic optimization are in general formulated as a set of differential algebraic equations. In the special case that

all algebraic equations can be eliminated by substitution, the dae model can be rewritten as a set of ordinary differential equations. The distinction between these two model types is important because of their different numerical integration properties, which will be treated later in this chapter. Characteristic for models in the process industry are physical property relations, which generally do not allow elimination by substitution and to a large extent contribute to the number of algebraic equations. Physical property calculations can also be hidden in an external module interfaced to the simulation software. Caution is required when interfacing both pieces of software.¹ Partial differential equations (pde model) describe microscopic conservation laws and emerge naturally when spatial distributions are modelled. Classical examples are the tubular heat exchanger and the tubular reactor. Although this type of model is explicitly mentioned here, it can be translated into a dae model that approximates the true partial differential equations. Dae and ode models both have internal variables, which implies that the effect of all past inputs on the future can be captured by a single value for all variables of the model at time zero, as opposed to most identified models, which in general are regressive and require historical data to capture the future effect of past inputs. A disadvantage of continuous-time nonlinear models is the computational effort for simulation, whereas simulation with a discrete-time model is trivial. On the other hand, stability of a nonlinear discrete-time model is nontrivial. An advantage of rigorous models is that they in general have a larger validity range than identified models because of the fundamental origin of their nonlinear equations. Nonlinear identification techniques for dynamic processes are emerging but are still far from mature. Although modelling can still be tedious, developments in commercial process simulation and modelling tools such as gPROMS and Aspen Custom Modeler allow people with different modelling skills to use and build rigorous models quite efficiently. A graphical user interface with drag-and-drop features has increased accessibility for more people than only diehard command-prompt programmers. Still, thorough modelling knowledge is required to deal with fundamental problems such as index problems. All mathematical models can be described by their structure and parameters. The fundamental question is how to decide on the model structure and how to interpret a mismatch between plant and simulated data. Do we need to change the model structure or do we need to change the model parameter values? No tools are available to discriminate between those two options, let alone to help find a better structure.

¹ Error handling in the external module can conflict with error handling by the simulation software.

If we do have a match between plant and simulated data, we could still have the situation where not all parameters can be uniquely determined from the available plant data. The danger is that the match for this specific data set can be satisfactory while a new data set gives a terrible match. Whenever possible it seems wise to arrange parameters in order of uncertainty. Dedicated experiments can decrease the uncertainty of specific parameters such as activity coefficients and pre-exponential factors in an Arrhenius equation, or physical properties such as specific heat. Besides parameter uncertainty we have structural uncertainty, due to lumping based on uncertain flow patterns or uncertain reaction schemes. In some cases we can exchange structural uncertainty for parameter uncertainty by adding a structure that can be deactivated by a parameter. The risk is that we end up with too many parameters to be determined uniquely from the available data. Computation of the values of the model parameters is discussed in the next section.

Identification and validation

Computation of parameter values can be formulated as a parameter optimization problem and is referred to as a parameter estimation problem when the model structure is based on first principles. When the model structure is motivated by mathematical arguments, identification is more commonly used to describe the procedure. Basically both boil down to a parameter optimization problem minimizing some error function. Important for model validation are the model requirements. Typically, model requirements are defined in terms of an error tolerance on key process variables over some predefined operating region. Most often these are steady-state operating points, but these requirements can also be checked for dynamic operation. Less common is a model requirement defined in terms of a maximum computational effort. The objective of parameter identification is to find those parameters that minimize the error over this operating region. The resulting parameter values of either estimation or identification are then ready for validation. During the validation procedure the parameter values are fixed and the model is used to generate predictions based on new input-output data. This split of data is also referred to as the estimation data set and the validation data set. From a more philosophical point of view one could better refer to model validation as model unfalsification (Kosut, 1995); a model is valid until proven otherwise. This touches on the problem that for nonlinear models not all possible input signals can be validated against plant data. For linear models we can assess the model error because of the superposition principle and the duality between time and frequency domain. Model identification is a data-driven approach to develop models.
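To make the estimation procedure above concrete, here is a minimal sketch (not taken from the thesis; the reaction, temperature and parameter values are invented for illustration) that fits the Arrhenius parameters of a toy batch reaction to measurement data by least squares. It also illustrates the identifiability issue discussed above: at a single temperature only the lumped rate constant is really determined, so the two parameters cannot both be estimated uniquely.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

R = 8.314  # J/(mol K)

def simulate(theta, t_meas, T=350.0, c0=1.0):
    """Toy batch reaction A -> B with Arrhenius kinetics, first order in A."""
    k0, Ea = theta
    k = k0 * np.exp(-Ea / (R * T))
    sol = solve_ivp(lambda t, c: [-k * c[0]], (0.0, t_meas[-1]), [c0], t_eval=t_meas)
    return sol.y[0]

# "Plant" data, generated here for illustration; in practice this is measured.
t_meas = np.linspace(0.0, 600.0, 13)
c_meas = simulate([5.0e3, 4.0e4], t_meas) + 0.01 * np.random.randn(t_meas.size)

def residuals(theta):
    return simulate(theta, t_meas) - c_meas

fit = least_squares(residuals, x0=[1.0e3, 3.0e4], bounds=([0.0, 0.0], [1e7, 1e6]))
print("estimated (k0, Ea):", fit.x)
# Data at several temperatures (a dedicated experiment) would be needed to
# separate k0 and Ea, i.e. to make both parameters uniquely identifiable.
```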

The required data can either be obtained during normal operation or, as in most cases, from dedicated experiments (e.g. step response tests). An elegant property of this approach is that the identified model is both observable and controllable, which is typically not the case for rigorous models. Since only a very limited number of modes are controllable and observable, this results in low-order models. In that sense rigorous modelling can learn from model identification techniques. Linear model identification is a mature area, whereas for nonlinear identification several techniques are available (e.g. Volterra series, neural nets and splines) without a thorough theoretical foundation. Neural networks have a very flexible mathematical structure with many parameters. By means of a parameter optimization, referred to as training of the neural net, an error criterion is minimized. The result is tested (validated) against data that was not used for training. Many papers have been written on this appealing topic, with the main focus on the parameter optimization strategy and the internal approximating functions and structure. A danger of neural nets is over-fitting, which results in poor interpolative predictions. Over-fitting implies that the data used for training is not rich enough to uniquely determine all parameters (comparable to an under-determined or ill-conditioned least squares solution). The extrapolative predictive capability is acknowledged to be very bad (Can et al., 1998), and one is even advised to train the neural net with data that encloses a little more than the relevant operating envelope. This reveals another weak spot of this type of data-driven model, since data is required at operating conditions that are undesired. Lots of data is required, which can be very costly if these data have to be generated by dedicated tests. A validated rigorous model can take away part of this problem when used as an alternative data generator.

Simulation

Simulation of linear and discrete-time models is a straightforward task, whereas simulation of continuous-time nonlinear models is more involved. In general, simulation is executed by a numerical integration routine, available in many different variants. Basic variants are described in textbooks such as those by Shampine (1994), Dormand (1996) and Brenan et al. (1996). The main problem with this method is that the efficiency of these routines is strongly affected by heuristics in e.g. error, step-size and prediction order control, which is not easy to see through. Easily understandable are fixed step-size explicit integration routines like Euler and Runge-Kutta schemes. The main problem here is the poor efficiency for stiff systems due to the small integration step-size. The stability region of explicit integration schemes limits the step size, whereas the stability of implicit integration routines does not depend on the step-size.
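The step-size limitation of explicit schemes on stiff problems is easy to demonstrate. The sketch below (illustrative only, using a standard stiff test problem rather than a process model) integrates a Van der Pol oscillator with an explicit Runge-Kutta method and with an implicit BDF method at the same tolerances and compares the number of accepted steps.

```python
from scipy.integrate import solve_ivp

mu = 100.0  # large mu makes the Van der Pol oscillator stiff

def vdp(t, y):
    return [y[1], mu * (1.0 - y[0] ** 2) * y[1] - y[0]]

for method in ("RK45", "BDF"):  # explicit vs. implicit integrator
    sol = solve_ivp(vdp, (0.0, 200.0), [2.0, 0.0], method=method,
                    rtol=1e-6, atol=1e-9)
    print(f"{method}: {sol.t.size} steps")
# The explicit scheme is forced to tiny steps by its stability region; the
# implicit BDF scheme covers the same interval in far fewer, larger steps.
```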

Implicit fixed step-size integration routines can be viewed as an optimization solved by iterative Newton steps. A well-known property of this Newton-step-based optimization is its fast convergence, given a good initial guess. Numerous approaches based on different interpolation polynomials are available for this initial guess, among which the Gear predictor (Dormand, 1996) is probably the best known. Routines with a variable step-size are more tedious to understand, owing to the heuristics in step-size control. This control balances the step-size against the number of Newton steps needed for convergence, with the objective of minimizing computational load. Similarly, the order of the prediction mechanism may be variable and controlled by heuristics. Inspection of all options reveals that many handles are available to influence the numerical integration (e.g. absolute tolerance, relative tolerance, convergence tolerance, maximum iterations, maximum iterations without improvement, effective zero, perturbation factor, pivot search depth, etc.). Fixed step-size numerical integration routines exhibit a variable numerical integration error with a pre-computed upper bound, whereas variable step-size routines maximize the step-size subject to a maximum integration error tolerance. Experience shows that consistent initialization of dae models is a delicate issue and far from trivial, since it reduces to an optimization problem with as many degrees of freedom as there are variables to be initialized (easily over tens of thousands of variables). Not only does a good initial guess speed up convergence, in practice it appears to be a necessity; with a default initial guess, initialization will most probably fail. Modelling of dae systems in practice is done by developing a small model that is gradually extended, reusing previously converged values as an (incomplete) initial guess for both algebraic and differential variables. Numerical integration routines were developed for autonomous systems. Discontinuities can be handled, but at the cost of a (computationally expensive) re-initialization. Since a digital controller would introduce a discontinuity at every sample time, and consequently require a re-initialization, it can be attractive to approximate this digital controller by its analogue (continuous) equivalent if possible. For simulation of optimal trajectories defined on a basis of discontinuous functions, it might be worthwhile to approximate the trajectory by a set of continuous and differentiable basis functions. This reduces the number of re-initializations and can therefore improve computational efficiency, unless the step-size has to be reduced drastically where the differentiable approximation introduces large gradients. Selecting a solver and fine-tuning solver options, balancing speed and robustness, is a tedious exercise and makes it hard to draw general conclusions about the different available solvers. Generally, models of chemical processes exhibit different time-scales (stiff systems) and a low degree of interaction between variables (sparsity). Sparse implicit solvers deal with this type of model very efficiently.
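A tiny sketch of what the consistent-initialization issue discussed above means for an index-one dae of the form dx/dt = f(x, z), 0 = g(x, z): given the differential state x0, the algebraic variables z0 must first be solved from g(x0, z0) = 0. The two-variable "model" below is invented purely for illustration; in an industrial dae the same step involves tens of thousands of coupled equations, which is why a good initial guess is essential.

```python
from scipy.optimize import fsolve

def g(z, x):
    """Made-up implicit algebraic equation 0 = g(x, z), e.g. an equilibrium relation."""
    return z**3 + 2.0 * z - x

def consistent_init(x0, z_guess):
    z0, info, ier, msg = fsolve(g, z_guess, args=(x0,), full_output=True)
    if ier != 1:
        raise RuntimeError("initialization failed: " + msg)
    return float(z0[0])

x0 = 1.5
z0 = consistent_init(x0, z_guess=0.0)
print("consistent algebraic state:", z0, "residual:", g(z0, x0))
```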

Optimization

As in the world of modelling, the field of dynamic optimization has its own jargon to address specific characteristics of the problem. Most optimization problems in the process industry can be characterized as non-convex, nonlinear, constrained optimization problems. In practice this implies that only locally optimal solutions can be found instead of globally optimal solutions. The presence of constraints requires constraint handling, which can be done in different ways (see e.g. the textbooks by Nash and Sofer, 1996 and Edgar and Himmelblau, 1989). Often these constraints are multiplied by Lagrange multipliers and added to the objective, which transforms the original optimization problem into an unconstrained optimization problem. We can distinguish between penalty and barrier functions. The penalty function approach allows (intermediate) solutions that violate constraints (most probably they will, since solutions tend to lie at the constraints), whereas the barrier function approach requires a feasible initial guess and from this solution guarantees feasibility. Finding a feasible initial guess can already be very challenging, which explains the popularity of the penalty function approach. For steady-state plant design optimization (Floudas, 1995), typical optimization parameters are equipment sizes, recycle flows and operating conditions like temperature, pressure and concentration. Discrete decision variables that determine the type of equipment (or the number of distillation trays) yield a computationally hard optimization known as a mixed integer nonlinear program (minlp). The optimization problem to be solved for the computation of optimal input trajectories is referred to as a dynamic optimization problem and generally assumes smooth nonlinear models without discontinuities. Using a parametrization of these trajectories by means of basis functions and coefficients, such a problem can be written as a nonlinear program. The choice of basis functions determines the set of possible solutions. A typical set of basis functions consists of functions that are one on a specific time interval and zero otherwise. This basis allows a progressive distribution of decision variables over time, which is very commonly used in online applications. A progressive basis reflects the desire (or expectation!) of an optimal solution with (possibly) high-frequency control moves in the beginning and low-frequency control moves towards the end of the control horizon. Since a clever choice of basis functions could reduce the number of basis functions (and consequently the number of optimization parameters), this is an interesting field of research.
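The piecewise-constant basis described above is easily made explicit. In the sketch below (an illustration with invented interval boundaries and input values, not a parametrization used later in this thesis), the input trajectory is u(t) = sum_k u_k phi_k(t), where each phi_k is one on its own interval and zero elsewhere, and the intervals are chosen progressively longer towards the end of the horizon.

```python
import numpy as np

def piecewise_constant(u_coeffs, t_grid):
    """u(t) = sum_k u_k * phi_k(t), with phi_k = 1 on [t_k, t_{k+1}) and 0 elsewhere."""
    def u(t):
        k = np.clip(np.searchsorted(t_grid, t, side="right") - 1, 0, len(u_coeffs) - 1)
        return u_coeffs[k]
    return u

# Progressive grid: frequent moves early in the horizon, sparse moves later.
t_grid = np.array([0.0, 5.0, 10.0, 20.0, 40.0, 80.0, 160.0])  # minutes
u_coeffs = np.array([1.0, 1.2, 0.9, 0.95, 1.0, 1.0])          # one decision variable per interval
u = piecewise_constant(u_coeffs, t_grid)
print(u(3.0), u(12.0), u(150.0))  # -> 1.0 0.9 1.0
```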

For the solution of dynamic optimization problems we need to distinguish between the sequential and the simultaneous approach (Kraft, 1985; Vassilidis, 1993). The sequential approach computes a function evaluation by simulation of the model, followed by a gradient-based update of the solution. This sequence is repeated until solution tolerances are satisfied (converged solution) or some other termination criterion is met (non-converged solution). In the simultaneous approach, also referred to as the collocation method (Neuman and Sen, 1973; Biegler, 2002), not only the input trajectory is parameterized but the state trajectories as well. Each trajectory is described by a set of basis functions and coefficients from which the time derivatives can be computed. At each discrete point in time this trajectory time derivative should satisfy the time derivative defined by the model equations. This results in a nonlinear program (nlp) type of optimization problem where the objective is minimized subject to a very large set of coupled equality constraints representing the process behavior. The free parameters of this nlp are both the parameters that define the input trajectory and the parameters that describe the state trajectories. Since mathematically there is no difference between these parameters and all parameters are updated together at each iteration step, this method is called the simultaneous approach. In general the sequential approach outperforms the simultaneous approach for large systems. This is not a rigid conclusion, since in both areas researchers are developing better algorithms exploiting structure and computationally cheap approximations. Note that in the sequential approach the model equations are satisfied at all (intermediate) solutions by means of simulation, whereas in the simultaneous approach intermediate solutions generally do not satisfy the model equations. Note furthermore that with a fixed input trajectory the collocation method is an alternative to numerical integration. Both the sequential and the simultaneous approach are implemented as an approximate Newton-step type of optimization. The Hessian is approximated by an iterative scheme that efficiently reuses derivative information; a true Newton step is simply not worthwhile because of its computational load. Optimization routines require the sensitivity of the objective function (and constraints) with respect to the optimization parameters. This sensitivity is reflected by partial derivatives that can be computed by numerical perturbation or, in special cases, by analytical derivatives. Jacobian information generated during simulation can be used to build a linear time-variant model along the trajectory, which proves to be an efficient and suitable approximation of the partial derivatives. Furthermore, parametric sensitivities can also be derived by integration of the sensitivity equations or by solving the adjoint equations. Reuse of Jacobian information from the simulation and exploitation of structure can reduce the computational load, resulting in an attractive alternative.
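A stripped-down illustration of the sequential approach (single shooting) is given below: every objective evaluation embeds a full simulation of the model, after which the optimizer updates the input parameters. The first-order "process", the two input moves and the cost function are all invented for this sketch; a derivative-free optimizer is used only to keep the example short, whereas in practice a gradient-based solver with the sensitivity information discussed above would be used.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

t_switch, t_end, x_ref = 5.0, 10.0, 1.0

def simulate(u_moves):
    """Toy process dx/dt = -x + u with two piecewise-constant input moves."""
    def rhs(t, x):
        u = u_moves[0] if t < t_switch else u_moves[1]
        return [-x[0] + u]
    return solve_ivp(rhs, (0.0, t_end), [0.0], dense_output=True, max_step=0.1)

def objective(u_moves):
    sol = simulate(u_moves)                  # sequential approach: simulate, then update
    t = np.linspace(0.0, t_end, 101)
    x = sol.sol(t)[0]
    return np.trapz((x - x_ref) ** 2, t) + 1e-3 * np.sum(np.asarray(u_moves) ** 2)

result = minimize(objective, x0=[0.5, 0.5], method="Nelder-Mead")
print("optimal input moves:", result.x)
```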

Industrial process operation and control

Process operation covers a very wide area and involves different people throughout the company. The main objective of plant operation is to maximize the profitability of the plant. The primary task of plant operation is safeguarding. Safety of people and the environment always gets the highest priority. In order to achieve this, hardware measures are implemented. Furthermore, measurements are combined to determine the status of the plant. If a dangerous situation is detected, a prescribed scenario is launched that shuts down the plant safely. For the detection as well as for the development of scenarios, models can be employed. Fault detection can be considered a subtask within the safeguarding system. It involves the determination of the status of the plant. A fault does not always induce a plant shutdown but can also trigger a maintenance action. Basic control is the first level in the hierarchy as depicted in Figure 1.1, providing control actions to keep the process at the desired conditions. Unstable processes can be stabilized, allowing safe operation. Typically, basic controllers receive temperature, pressure and flow measurements and act on valve positions. Furthermore, all kinds of smart control solutions are developed to increase performance. Different linearizing transformation schemes and decoupling, ratio and feedforward schemes are implemented in the distributed control system (dcs) and perform quite well. These schemes are to a high degree based on process knowledge, but are nevertheless not referred to as advanced process control. Steady-state energy optimization (pinch) studies can reduce energy costs by rearranging energy streams (heat integration). A side effect is the introduction of (undesired) interaction between different parts of a plant. An upset downstream can act as a disturbance upstream without a material recycle being present. Material recycle streams are known to introduce large time constants of several hours (Luyben et al., 1999). Both heat integration and material recycles complicate control for operators. Automation of the process industry took place very gradually. Nowadays most measurements are digitally available in the control room, from which practically all controls can be executed. The availability of these measurements was a necessity for the development of advanced process control techniques, such as model predictive control (see the tutorial paper by Rawlings, 2000), and because of its success, it initiated real-time process optimization. Scheduling can be considered the top level of plant operations (Tousain, 2002), as depicted in Figure 1.1. At this level it is decided what product is produced at what time and sometimes even by which plant. Processing information from the sales and marketing department, the purchasing department, and the storage of raw materials and end products is a very complex task. Without radical simplifications, implementation of a scheduling problem would result in

a mixed integer optimization that exceeds the complexity of a dynamic optimization. Therefore models used for scheduling problems only reflect very basic properties, preventing the scheduling problem from exploding. Scheduling will not be discussed further in this thesis, although it is recognized as a field with large opportunities.

[Figure 1.1: Control hierarchy with different layers and information transfer. The layers, from top to bottom, are the scheduler, the (dynamic) real-time optimizer, the model predictive controller, and the plant with its basic controllers, with information flowing both up and down between adjacent layers.]

In practice very pragmatic solutions are implemented, such as the production of different products in a fixed order, referred to as a product wheel. This rigid way of production has the advantage that detailed information is available to predict all costs that are involved. The downside is that opportunities are missed because of this inflexible operation. The availability of model-based process control enables a larger variety of transitions between different products. Information on the characteristics of different transitions can be made available and can be exploited by the scheduling task. This increases the potential of scheduling but requires powerful tools. Production nowadays shifts from bulk to specialties, which creates new opportunities for those who know how to swiftly control their processes within new specifications (Backx et al., 2000). The capability of fast and cheap transitions enables companies to produce and sell on demand at usually favorable prices and brings added value to the business. In order to be more flexible, stationary plant operation is replaced by a more flexible transient (or even batch-wise) type of operation. Another driver to improve process control is environmental legislation, which becomes more and more stringent and pushes operation to its limits. Optimization-based process control contributes to this flexible and competitive high-quality plant operation.

Economic dynamic real-time optimization plays a key role in bringing money to the business, since it translates a schedule into economically optimal set point trajectories for the plant. At least as important is the feedback that the dynamic optimization can give to the scheduling optimization in terms of e.g. minimally required transition times and estimated costs of different and possibly new transitions. This information, depicted by the arrow from the dynamic real-time optimizer to the scheduler in Figure 1.1, enables improved scheduling performance because the information is more accurate and complete and allows for more flexible operation. This enhanced information can, for example, make the difference between accepting and refusing a customer's order. Dynamic real-time optimization plays a crucial role in connecting scheduling to plant control and can make a significant contribution to the profitability of a plant.

Real-time process optimization

State-of-the-art operated plants have a layered control structure where the plant's steady-state optimum is computed recursively by the real-time process optimizer, providing set points that are tracked by a linear model predictive controller. Besides being implementable from a computational point of view, this approach was also acceptable from the operator's perspective, with a safety argument that pleads for a layered control structure: in case of failure of the real-time optimization the process is not out of control, only the optimization is not executed. Since a state-of-the-art optimizer assumes some steady-state condition, this condition is checked before the optimizer is started (Backx et al., 2000). This check is somewhat arbitrary because in practice a plant is never in steady state. Before the next steady-state optimization is executed, a parameter estimation is carried out using online data. The result of the steady-state optimization is a new set of set points, causing a dynamic response of the plant. Only after the process is stable again can a next cycle be started, which limits the update frequency of the optimal set points. If a process is operated quasi steady-state and the optimal conditions change gradually, this approach can be very effective. For a continuous process that produces different products we require economically optimal transitions from one operating point to the other. Including process dynamics in the optimization enables exploitation of the full potential of the plant. The result of this optimization approach will be a set of optimal set point trajectories and a predicted dynamic response of the process. In this approach we do not require steady-state conditions to start an optimization, and it enables shorter transition times. Shorter transition times generally result in a reduction of off-spec material and therefore increase the profitability of a plant. The real-time steady-state optimizer and linear model predictive controller can be replaced by a single dynamic real-time optimization (drto) based on one

large-scale nonlinear dynamic process model. This problem should be solved at the sample rate of the model predictive controller to maintain disturbance rejection properties similar to those of the linear model predictive controller. The prediction horizon of the optimization should be a couple of times the dominant process time constant. The implications of this straightforward implementation are discussed next.

[Figure 1.2: Typical chemical process flow sheet with multiple different unit operations and material recycle streams, representing the behavior of a broad class of industrial processes.]

Implications of straightforward implementation

Let us now explore the implications of a straightforward implementation of dynamic real-time optimization as a replacement of the real-time steady-state optimizer and linear model predictive controller. In a typical chemical process, two or more components react into the product of interest, followed by one or more separation steps. This represents typical behavior of a broad class of industrial processes, and therefore findings can be carried over to many plants. In general, one or more side reactions take place, introducing extra components. The use of a catalyst can shift selectivity but never prevents side reactions completely. If we assume only one side reaction, we already have to deal with four species, or even five if we take the catalyst into account. We can separate the four species with three distillation columns, as depicted in Figure 1.2, if we assume that the catalyst can be separated by a decanter. The recycle streams introduce positive feedback and therefore long time constants in the overall plant dynamics (Luyben et al., 1999). If we assume instantaneous phase equilibrium and uniform mixing on each tray, the number of differential equations that describes the separation

of this chemical process is

n_x = n_c (n_s + 1)(n_t + 2),

where n_x is the number of differential equations, n_c is the number of columns, n_s is the number of species involved and n_t is the average number of trays per column. The one in the formula represents the energy balance and the two represents the reboiler and condenser of a column. For a setup with three columns with twenty, forty and sixty trays and five species, we already need over seven hundred and fifty differential equations. If the reaction takes place in a tubular reactor, we need to add a partial differential equation to the model. This can only be implemented after discretization, easily adding another three hundred equations (five species times sixty discretization points) to the model and bringing the total over a thousand equations. So we can extend the previous formula to

n_x = n_c (n_s + 1)(n_t + 2) + n_s n_d,

where n_d is the number of discretization points. In practice the number of algebraic variables is three to ten times the number of differential equations, depending on the implementation of the physical properties (as a hidden module or as explicit equations in the model). This brings the total number of equations to several thousand, up to ten thousand. This estimate serves as a lower bound for a first-principles industrial process model, and illustrates the number of equations that should be handled by plant-wide model-based process operation techniques. Fortunately the models have a very nice structure that can be exploited. This model property is referred to as model sparsity and reflects the interaction or coupling between equations and variables. This property can be visualized by a matrix, the so-called Jacobian matrix J: if the j-th variable occurs in the i-th equation, the element J(i, j) is nonzero, and zero otherwise. Most zero elements are structural and so do not depend on variable values. These elements do not have to be recomputed during simulation, which allows for an efficient implementation of simulation algorithms. The number of nonzero elements for process models is about five to ten percent of the total number of elements of the Jacobian matrix. Next we estimate the number of manipulated variables involved. Every column has five manipulated variables: the outgoing flows from reboiler and condenser, the reboiler duty, the condenser duty and the reflux rate or ratio. In this simple example we can manipulate, within the reactor, the fresh feed flow and composition, the feed-to-catalyst ratio, the cooling rate and the outgoing flow of the reactor. This brings the number of manipulated variables to twenty, all potential candidates to be computed by model-based optimization.
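A quick back-of-the-envelope check of the model-size estimate above, using the numbers quoted in the text:

```python
# n_x = n_c (n_s + 1)(n_t + 2) + n_s n_d
n_c, n_s = 3, 5                    # columns, species
n_t = (20 + 40 + 60) / 3           # average number of trays per column
n_d = 60                           # discretization points of the tubular reactor
n_columns = n_c * (n_s + 1) * (n_t + 2)   # 3 * 6 * 42 = 756 differential equations
n_reactor = n_s * n_d                     # 5 * 60 = 300 differential equations
print(n_columns, n_reactor, n_columns + n_reactor)   # 756, 300, 1056
# With three to ten algebraic variables per differential equation, the full
# DAE model indeed runs to several thousand, up to ten thousand, equations.
```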

The number of parameters involved can be computed with the following formula:

n_p = n_u H / t_s,

where n_p is the number of free parameters, n_u is the number of manipulated variables, H is the control horizon and t_s is the sampling period. For a typical process as described in this section, the dominant time constant can be over a day, especially if recycle streams are present, introducing positive feedback. An acceptable sampling rate for most manipulated variables is one sample per one to five minutes; pressure control, however, might require a much higher sampling rate. For a horizon of three times the dominant time constant, twenty inputs and a sampling period of five minutes, the total number of free parameters is over seventeen thousand. This results in a very large optimization problem that is not very likely to give sensible results. A selection of manipulated variables and a clever parametrization of the input signals can reduce this number of free parameters. The input signal can even be implemented as a fully adaptive, problem-dependent parameterization generated by repetitive solution of increasingly refined optimization problems (Schlegel et al., 2005). The adaptation is based on a wavelet analysis of the solution profiles obtained in the previous step. In practice, first some base-layer control would be implemented around each column to control levels and pressures. Set points for these controllers could then be degrees of freedom for optimization. The added value of including these set points within a dynamic optimization is not evident, but small inventories could decrease transition times. If for some reason the added value of these degrees of freedom is expected to be small, they can be removed from the optimization problem, reducing the number of optimization parameters. Suppose we want to perform one nonlinear integration of the rigorous model within one sampling period to compute a prediction for an input trajectory. In this case we need to simulate three days within five minutes. This requires a simulation speed of over eight hundred times real time. If a sampling period of one hour is acceptable, we still need a simulation speed of seventy-two times real time. In this scenario we did not account for multiple simulations (in the case of the sequential optimization approach approximately between five and twenty) or for computations other than simulation. Depending on the input sequence, for the size of models considered here, the simulation speed on a current standard computer is between one and twenty times real time. This reveals the tremendous gap between the desired and the current status of numerical integration.
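The parameter count and the required simulation speed quoted above can be verified in a few lines (a sketch using the example figures from the text):

```python
# n_p = n_u * H / t_s for the example transition problem
n_u = 20                    # manipulated variables
tau_dom = 24 * 60           # dominant time constant: one day, in minutes
H = 3 * tau_dom             # horizon of three dominant time constants
t_s = 5                     # sampling period in minutes
print("free parameters:", n_u * H // t_s)                 # 17280, "over seventeen thousand"
print("required speed-up:", H // t_s, "x real time")      # 864, "over eight hundred times"
print("with a one-hour period:", H // 60, "x real time")  # 72 times real time
```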

Nevertheless, numerical solvers are already very sophisticated, handling different time-scales (also referred to as stiff systems) and exploiting model structure. With current commercial modelling tools we usually end up with a set of differential and algebraic equations. To keep the model in line with the process, measurements are used to estimate the actual state of the process by means of an observer, e.g. an extended Kalman filter (e.g. Lewis, 1986). This is a model-based filtering technique, balancing model error against measurement error. The resulting state is then used as a corrected initial condition for the model. Finding a consistent solution for this new initial condition of a set of differential algebraic equations is called an initialization problem, which is hard to solve without a good initial guess. Fortunately, we can use the uncorrected state as an initial guess, which should be good enough. Still, this initialization problem has to be solved every iteration at the cost of valuable computing time. Going online with a straightforward implementation of real-time dynamic plant optimization based on first-principles models introduces an enormous computational overload. At present only very pragmatic solutions are available, which directly provoke all kinds of comments, such as on the inconsistency that is introduced by the use of different models in different layers of the plant operation hierarchy. These approaches are legitimized by the argument that no better alternatives are readily available. Despite all this criticism of the pragmatic solutions for model-based plant operation, the approach has proven to contribute to the profitability of the plant. This profitability can only be increased if consistent solutions are developed that replace the pragmatic solutions. Model reduction can provide a consistent solution and is explored in this thesis. First we will continue with the model reduction techniques available in the literature.

1.3 Literature on nonlinear model reduction

Models that are available for large-scale industrial processes can in general be characterized as a set of differential and algebraic equations (dae). Therefore we search for model reduction techniques that are applicable to this general class of models. This class of models is capable of describing the majority of processes and is more general than a set of ordinary differential equations (ode). Transformation of a dae into an ode is not possible in general and is regarded as a major model reduction step. Since we are interested in the effect of different models on the computational load for optimization, every technique mapping one model to another model is a candidate model reduction technique. This implies that different modelling and identification techniques can be considered, using the original model as a data-generating plant replacement. Marquardt (2001) states that the proper way to assess model reduction techniques for online nonlinear model-based control is to compare the closed-loop performance based on the original model, at a low sampling frequency, with

the reduced model at a higher sampling frequency. The maximum sampling frequencies are determined by the computational load, which is related to the differences between the original and the reduced model. The reduced model should enable higher sampling frequencies, compensating for the loss in accuracy and therefore resulting in higher closed-loop performance. No assessments of this type have been found in the literature. Therefore we will need to resort to more general literature on model reduction and nonlinear modelling techniques.

Computation and performance assessment of NLMPC

Findeisen et al. (2002) assessed the computation and performance of nonlinear model predictive control. The implementation of the control problem used in that paper was so-called direct shooting, which is a special, efficient implementation of the simultaneous approach (Diehl et al., 2002). In their assessment, different models are compared under closed-loop control. The different models of the 20-tray distillation column were a nonlinear wave model with 2 differential and 3 algebraic equations, a concentration and holdup model with 42 ordinary differential equations, and a more detailed model (including tray temperatures) with 44 differential and 122 algebraic equations. All the different models were the result of remodelling; thus extra simplifications and assumptions based on physics and process knowledge resulted in reduced models. The effect on the computational load is presented, even for different control horizons, distinguishing between the maximum and the average computation time. More simplified models resulted in a lower computational load, which is not surprising. More interesting is that the reduction in computational load is quantified. The increase of controller performance due to the higher sampling frequency enabled by the reduced computational effort was not presented. Neither is it completely clear how big the modelling error between the original and the reduced models is. The load of state estimation for these different reduced models was assessed as well. Furthermore, the computational load of different nonlinear predictive schemes was assessed with the original model in the case of no plant-model mismatch.

Nonlinear wave approximation

Balasubramhanya and Doyle III (2000) developed a reduced-order model of a batch reactive distillation column using travelling waves. The reader is referred to e.g. Marquardt (1990) and Kienle (2000) for more details on travelling waves. This nonlinear model was successfully deployed within a nonlinear model predictive controller (nlmpc) and linear Kalman filter that was computationally more efficient than an nlmpc based on the original full-order nonlinear model. Although the original full-order model was only a 31st-order ode and the re-

duced model a 5th-order ode, computations were over six times faster in closed loop with the nlmpc based on the reduced model, while performance was maintained. Performance was quite high despite a prediction horizon of only two samples and a control horizon of one. Furthermore they compared the nonlinear models with the linearized model, illustrating the level of nonlinearity of the process.

Simplified physical property models

Successful use of simplified physical property models within flowsheet optimization is reported by Ganesh and Biegler (1987). A simple flash with recycle as well as a more involved ethylbenzene process with a reactor, a flash, two columns and two recycle streams are presented in that paper. In both cases the rigorous phase equilibrium model (Soave-Redlich-Kwong) was approximated by a simplified model (Raoult's law, Antoine's equation and ideal enthalpy). This type of model simplification is based on process knowledge and physical insight. It is a tailored approach, but applicable to all models where phase equilibria are to be computed. Reductions of up to an order of magnitude were reported by straightforward use of the simplified model within the optimization. A danger of this approach, already reported by Biegler (1985), is that the optimum does not coincide with the optimum of the original model. By combining the rigorous and the simplified model in their optimization strategy, Ganesh and Biegler can guarantee convergence to the true optimum of the original model while still reducing the computational load by over thirty percent. Model simplification based on physics appears to be successful for steady-state flowsheet optimization. Chimowitz and Lee (1985) reported an increase of computational efficiency by a factor of order three by the use of local thermodynamic models. According to Chimowitz, up to 90% of the computational time was used for thermodynamic computations during simulation, motivating their approach of model approximation. The local thermodynamic models are integrated with the numerical solver, where an updating mechanism for the parameters of the local models was included. This approach is not easy to use since model reduction and numerical algorithm are integrated. Ledent and Heyen (1994) attempted to use local models within dynamic simulations but were not successful due to discontinuities introduced by updating the local models. Still, local models as such, without an update mechanism, can be used to reduce computational load despite their limited validity.
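To give an impression of the kind of simplification involved, the sketch below computes vapour-liquid K-values from Raoult's law with Antoine vapour pressures, the cheap substitute for a rigorous equation-of-state flash mentioned above. The components and Antoine coefficients are illustrative textbook-style values, not taken from the cited papers.

```python
# Antoine equation: log10(Psat [mmHg]) = A - B / (C + T [degC]).
# Coefficients below are illustrative values for benzene and toluene.
ANTOINE = {"benzene": (6.90565, 1211.033, 220.790),
           "toluene": (6.95464, 1344.800, 219.480)}

def psat_mmhg(component, T_celsius):
    A, B, C = ANTOINE[component]
    return 10.0 ** (A - B / (C + T_celsius))

def raoult_k_values(T_celsius, P_mmhg):
    """Simplified phase equilibrium: K_i = Psat_i / P (ideal liquid and vapour)."""
    return {c: psat_mmhg(c, T_celsius) / P_mmhg for c in ANTOINE}

print(raoult_k_values(T_celsius=100.0, P_mmhg=760.0))
# A rigorous model (e.g. Soave-Redlich-Kwong) replaces these one-line K-values
# by an iterative equation-of-state calculation, which dominates simulation time.
```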

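To see why such a simplification is computationally attractive: with Raoult's law and the Antoine equation the vapour-liquid equilibrium ratios follow from explicit formulas, whereas a cubic equation of state such as Soave-Redlich-Kwong must be solved iteratively at every evaluation. A minimal sketch is given below; it is purely illustrative and not taken from Ganesh and Biegler, and the Antoine coefficients are indicative values only.

    import numpy as np

    # Antoine coefficients (indicative values; log10(P/bar), T in K)
    ANTOINE = {"benzene": (4.01814, 1203.835, -53.226),
               "toluene": (4.07827, 1343.943, -53.773)}

    def psat_antoine(component, T):
        """Pure-component vapour pressure [bar] from the Antoine equation."""
        A, B, C = ANTOINE[component]
        return 10.0 ** (A - B / (T + C))

    def k_values_raoult(components, T, P):
        """Ideal K-values K_i = y_i / x_i = Psat_i(T) / P (Raoult's law).

        One explicit formula per component: no iterative equation-of-state
        solve, which is what makes the simplified property model cheap."""
        return np.array([psat_antoine(c, T) / P for c in components])

    # Equilibrium ratios for a benzene/toluene mixture at 370 K and 1 bar
    print(k_values_raoult(["benzene", "toluene"], T=370.0, P=1.0))

In the rigorous model, each such evaluation would instead require solving the cubic equation of state for both phases and evaluating fugacity coefficients, which largely explains why the simplified property model is so much cheaper to evaluate.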
Chimowitz and Lee (1985) reported an increase in computational efficiency of about a factor of three by the use of local thermodynamic models. According to Chimowitz, up to 90% of the computational time during simulation was spent on thermodynamic computations, motivating their approach of model approximation. The local thermodynamic models are integrated with the numerical solver, and an updating mechanism for the parameters of the local models is included. This approach is not easy to use since model reduction and numerical algorithm are intertwined. Ledent and Heyen (1994) attempted to use local models within dynamic simulations, but were not successful due to the discontinuities introduced by updating the local models. Still, local models as such, without an update mechanism, can be used to reduce the computational load despite their limited validity.

Perregaard (1993) worked on model simplification and reduction for simulation and optimization of chemical processes. The objective of his paper is to present a procedure that, through simplification of the algebraic equations for phase equilibrium calculations, reduces the computing time needed to solve the model equations without affecting the convergence characteristics of the numerical method. Furthermore, it exploits the inherent structure of the equations representing the chemical process. This structured, equation-oriented framework was adopted from Gani et al. (1990), who distinguish between differential, explicit algebraic and implicit algebraic equations. The key observation is that for Newton-like methods the Jacobian can be approximated during intermediate iterations: the true Jacobian is replaced by a cheap-to-compute approximate Jacobian, derived from local thermodynamic models with analytical derivatives. Several cases are presented, with reported reductions in overall computation times of the order of 20-60%, without loss of accuracy and without side effects on the convergence of the numerical method.
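The computational idea behind the approximate Jacobian can be illustrated with a Newton-type iteration that evaluates and factorizes the exact Jacobian only occasionally and reuses a cheap approximation in between. The sketch below is generic: it simply freezes the Jacobian for a number of iterations, whereas Perregaard derives the approximation from local thermodynamic models with analytical derivatives; the demo system is arbitrary.

    import numpy as np
    import scipy.linalg as la

    def newton_frozen_jacobian(f, jac, x0, refresh=5, tol=1e-10, maxit=50):
        """Newton-like iteration that re-evaluates the (expensive) Jacobian only
        every `refresh` iterations and reuses its LU factorization in between."""
        x = np.asarray(x0, dtype=float)
        lu = piv = None
        for k in range(maxit):
            r = f(x)
            if np.linalg.norm(r) < tol:
                return x, k
            if k % refresh == 0:                 # expensive: exact Jacobian + factorization
                lu, piv = la.lu_factor(jac(x))
            x = x - la.lu_solve((lu, piv), r)    # cheap: reuse the factorization
        return x, maxit

    # Mildly nonlinear demo system with solution x = (1, 1)
    f = lambda x: np.array([x[0] + 0.1 * x[1]**2 - 1.1,
                            x[1] + 0.1 * x[0]**2 - 1.1])
    jac = lambda x: np.array([[1.0, 0.2 * x[1]],
                              [0.2 * x[0], 1.0]])
    print(newton_frozen_jacobian(f, jac, x0=[0.0, 0.0]))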
Støren and Hertzberg (1997) developed a tailored DAE solver that is computationally more efficient and reliable, and report a limited reduction (34-63%) in computation times for dynamic optimization calculations. Their approach also exploits local thermodynamic models.

Model order reduction by projection

Many papers are available on nonlinear model reduction by projection; more precisely, on order reduction of nonlinear models by linear projection, where order refers to the number of differential equations. A generic procedure can be formulated in three steps. First, a suitable coordinate transformation is applied that reveals the important contributions to the process dynamics. Second, the new coordinate system is decomposed into two subspaces. Finally, the dynamics are reformulated in the new coordinates, where the unimportant dynamics are either truncated or retained as algebraic constraints (residualization). In the case of residualization the resulting model is in DAE format and will not reduce the computational effort (Marquardt, 2001) due to the increased complexity (loss of sparsity). Therefore, in most cases the transformed model is truncated. An approximate solution of the residualized modes with reduced computational load is known as slaving. Aling (1997) reported increasing computational load with increasing residualization, and reduced computational load when the solution of the slaved modes was approximated.

In most papers, projection is applied to ordinary differential equations (Marquardt, 2001). Only Löffler and Marquardt (1991) applied their projection to both differential and algebraic equations. As an error measure between original and reduced model, plots of trajectories of key variables are used; these are simply generated by simulating a specific input sequence applied to both models. In some papers, computation times of the simulations are reported as well. Information on the numerical integration algorithm used is mostly not given, despite the fact that it is crucial for interpreting the results. This becomes clear when comparing an explicit fixed-step integration scheme with a variable step-size implicit scheme. Extremely important is the ability of the algorithms to exploit sparsity (Bogle and Perkins, 1990). Process models are known to be very sparse, which some numerical integration algorithms exploit efficiently, thereby reducing the computational load of simulation. Projection methods in general destroy this sparsity, which is reflected in the computational load of the algorithms that rely on it.

Projection methods differ in how the projection is computed. Two main approaches are discussed next: proper orthogonal decomposition and balanced projection.

Proper orthogonal decomposition

A popular choice is projection based on a proper orthogonal decomposition (POD), with its origin in fluid dynamics (see e.g. Berkooz et al., 1993; Holmes et al., 1997; Sirovich, 1991). This approach is also referred to as the Karhunen-Loève expansion or the method of empirical eigenfunctions. Bendersky and Christofides (2000) apply it to the static optimization of a catalytic rod and a packed-bed reactor described by partial differential equations, resulting in reductions of over a factor of thirty in computational load. In order to find the empirical eigenfunctions they generated data with the original model, gridding the design variables between their upper and lower bounds. For the packed bed, with three design variables at nine equally spaced values, this implied 9^3 = 729 simulations representing the complete operating envelope. This is a brute-force solution to a problem also addressed by Marquardt (2001). For dynamic optimization, however, this would not be attractive due to the much higher number of decision variables: typically over four inputs and at least 10 points in time would imply 9^40 ≈ 10^38 simulations.
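In its simplest form, the empirical eigenfunctions are the leading left singular vectors of a matrix of simulated snapshots, and the reduced model follows from a Galerkin projection onto that basis. The sketch below is a generic illustration of this procedure and does not reproduce any of the cited case studies; note that the reduced right-hand side still calls the full nonlinear function, which is one reason why order reduction by projection does not automatically reduce the computational load.

    import numpy as np
    from scipy.integrate import solve_ivp

    def full_rhs(t, x):
        """Full-order model dx/dt = f(x): a diffusion-like linear part plus a
        mild nonlinearity (a stand-in for a spatially discretized PDE)."""
        A = -2.0 * np.eye(x.size) + np.eye(x.size, k=1) + np.eye(x.size, k=-1)
        return A @ x - 0.1 * x**3

    # 1. Generate snapshots with the full-order model (one training trajectory)
    n = 50
    x0 = np.sin(np.linspace(0.0, np.pi, n))
    t_eval = np.linspace(0.0, 5.0, 200)
    sol = solve_ivp(full_rhs, (0.0, 5.0), x0, t_eval=t_eval)
    snapshots = sol.y                              # n x number_of_snapshots

    # 2. POD basis: leading left singular vectors of the snapshot matrix
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    r = 5
    Phi = U[:, :r]                                 # projection basis, n x r

    # 3. Galerkin projection: x ~ Phi z  with  dz/dt = Phi^T f(Phi z)
    def reduced_rhs(t, z):
        return Phi.T @ full_rhs(t, Phi @ z)

    red = solve_ivp(reduced_rhs, (0.0, 5.0), Phi.T @ x0, t_eval=t_eval)
    print("max reconstruction error:", np.abs(Phi @ red.y - snapshots).max())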
Baker and Christofides (2000) applied proper orthogonal projection to a rapid thermal chemical vapor deposition (RTCVD) process model in order to design a nonlinear output feedback controller with four inputs and four outputs. This design can be done off-line, so no computational load aspects were mentioned. They show that the nonlinear output feedback controller outperforms four PI controllers in a disturbance-free scenario, although the deposition was still unevenly distributed. Adding a model-based feedforward would improve the control solution and might diminish the difference between the nonlinear output feedback controller and the four PI controllers.

Aling et al. (1997) applied the proper orthogonal decomposition reduction method to a rapid thermal processing system. The reduction in the number of differential equations was impressive: from one hundred and ten to fewer than forty. The reduction in computational load for simulation was up to a factor of ten. First, a simplified model, a set of ordinary differential equations, is derived from a finite element model (Aling, 1996). This model is then further reduced by proper orthogonal decomposition to orders forty, twenty, ten and five by truncation. These truncated models are then further treated by residualization, which transforms the ODE into a DAE that is solved using the DDASAC solver. Since residualization does not reduce the computational load, they propose a so-called pseudo-steady approximation (slaving), which is a computationally cheaper solution than residualization.

Order reduction by balanced projection

Lall (1999, 2002) introduced empirical Gramians as an equivalent of the linear Gramians used for balanced linear model order reduction (Moore, 1981). Hahn and Edgar (2002) elaborate on model order reduction by balancing empirical Gramians and show significant model order reduction but limited reduction in simulation times. Some closed-loop results were presented, but few details were given on the implementation of the model predictive control scheme. The performance of the controller based on the reduced model was as good as that of the controller based on the full-order model, but no reduction in computational effort was achieved.

Lee et al. (2000) exploit subspace identification (Favoreel et al., 2000) for control-relevant model reduction by balanced truncation. The technique is control-relevant because it is based on the input-to-output map instead of the input-to-state map that is used in proper orthogonal decomposition model reduction. This argument holds for all balanced model reduction techniques, such as those based on empirical Gramians (Lall, 2002; Hahn and Edgar, 2002).

Newman and Krishnaprasad (1998) compared proper orthogonal decomposition and the method of balancing. Their focus was on ordinary differential equations describing the heat transfer in a rapid thermal chemical vapor deposition (RTCVD) process for semiconductor manufacturing. The transformation used to balance the nonlinear system was derived from a linear model at a nominal operating point; the transformation balancing this linear model was then applied to the nonlinear model. The transformation matrices were very ill-conditioned, and a method proposed by Safonov and Chiang (1989) was used to overcome this problem. An idea was suggested for finding a better approximation of the nonlinear balancing approach presented by Scherpen (1993). The order of the models was significantly reduced by both projection methods with acceptable error, but no results were presented on the reduction of the computational load.

Zhang and Lam (2002) developed a reduction technique for bilinear systems that outperformed a Gramian-based reduction, though it was demonstrated only on a small example. The model reduction problem is solved with a gradient flow technique that optimizes the H2 error between the original and the reduced-order model.
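For a linear (or linearized) model the balancing and truncation step underlying these methods can be stated compactly: compute the controllability and observability Gramians, find a state transformation that makes both equal and diagonal, and truncate the states with small Hankel singular values. The sketch below implements the standard linear square-root algorithm; it is only an illustration (the empirical-Gramian variants replace the Lyapunov solutions by Gramians estimated from simulation or plant data), and the example system is arbitrary.

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

    def balanced_truncation(A, B, C, r):
        """Reduce a stable LTI system (A, B, C) to order r by balanced truncation."""
        # Gramians: A P + P A^T + B B^T = 0 and A^T Q + Q A + C^T C = 0
        P = solve_continuous_lyapunov(A, -B @ B.T)
        Q = solve_continuous_lyapunov(A.T, -C.T @ C)
        # Square-root balancing: P = R R^T, eigendecomposition of R^T Q R
        R = cholesky(P, lower=True)
        U, s, _ = svd(R.T @ Q @ R)                 # s = (Hankel singular values)^2
        T = R @ U @ np.diag(s ** -0.25)            # x = T z balances both Gramians
        Tinv = np.diag(s ** 0.25) @ U.T @ np.linalg.solve(R, np.eye(A.shape[0]))
        Ab, Bb, Cb = Tinv @ A @ T, Tinv @ B, C @ T
        return Ab[:r, :r], Bb[:r, :], Cb[:, :r], np.sqrt(s)

    # Example: well-damped random system of order 8 reduced to order 3
    rng = np.random.default_rng(0)
    A = -np.diag(np.linspace(2.0, 6.0, 8)) + 0.1 * rng.standard_normal((8, 8))
    B = rng.standard_normal((8, 2))
    C = rng.standard_normal((2, 8))
    Ar, Br, Cr, hsv = balanced_truncation(A, B, C, r=3)
    print("Hankel singular values:", np.round(hsv, 4))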
Singular perturbation

Reducing the number of differential equations is straightforward if the model is in the standard form of a singularly perturbed system (Kokotovic, 1986). In this special case we can distinguish between the first group of differential equations, representing the slow dynamics, and the remaining differential equations, associated with the fast dynamics. Model reduction is then done by assuming that the fast dynamics behave like algebraic constraints, which reduces the number of differential equations.
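In standard notation (cf. Kokotovic, 1986) the singularly perturbed form and the corresponding reduction can be written compactly as

\[
\dot{x} = f(x, z, u), \qquad \varepsilon \, \dot{z} = g(x, z, u), \qquad 0 < \varepsilon \ll 1,
\]

with $x$ the slow and $z$ the fast states. Setting $\varepsilon = 0$ turns the fast differential equations into the algebraic constraints $0 = g(\bar{x}, \bar{z}, u)$; the reduced (quasi-steady-state) model is then $\dot{\bar{x}} = f(\bar{x}, \varphi(\bar{x}, u), u)$, where $\bar{z} = \varphi(\bar{x}, u)$ solves the constraints, so that only the slow differential equations remain.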
For some differential equations it is fairly obvious to which time scale they belong, but in general this is nontrivial. Tatrai et al. (1994a, 1994b) and Robertson et al. (1996a, 1996b) use state-to-eigenvalue association to bring models into standard form. This involves a homotopy procedure with a continuation parameter that varies from zero to one, weighting the system matrix at some operating point with its trace. At different values of the continuation parameter the eigenvalues of the composed matrix are computed, enabling the state-to-eigenvalue association. The problem is that the true eigenvalues result from the interaction between several differential equations and therefore, in principle, cannot be assigned to one particular differential state. Duchêne and Rouchon (1996) show that the originally chosen state space is not necessarily the best coordinate system in which to apply singular perturbation. They illustrate this on a simple example and later demonstrate their approach on a case study with 13 species and 67 reactions. Their reduction approach is compared with a quasi-steady-state approach and with the original model by plotting time responses to a non-steady-state initial condition.

Reaction kinetics simplification

Petzold (1999) applied an optimization-based method to determine which reactions dominate the overall dynamics. The aim is to derive the simplest reaction system that retains the essential features of the full system. The original mixed-integer nonlinear program (MINLP) is approximated, by means of a so-called beta function, by a problem that can be solved with a standard sequential quadratic programming (SQP) method. Results are presented for several reaction mechanisms. Androulakis (2000) also formulates the selection of dominant reaction mechanisms as an MINLP, but uses a branch-and-bound algorithm to solve it. Edwards et al. (2000) eliminate not only reactions but also species by solving an MINLP using DICOPT. Li and Rabitz (1993) presented approximate lumping schemes based on singular perturbation, which they later developed into a combined symbolic and numerical technique for constrained nonlinear lumping, applied to an oxidation model (Li and Rabitz, 1996). Significant order reductions were presented, but the effect on computational load was not discussed in these papers.

Nonlinear empirical modelling

Empirical models have a very low computational complexity and therefore allow fast simulations. Typically, their interpolation capabilities are comparable to those of fundamental models, but the extrapolation capabilities of fundamental models are far superior (Can et al., 1998). Since we do not want to restrict ourselves to optimal trajectories that are interpolations of historical data, this type of model seems unsuitable for dynamic optimization. Nevertheless, we mention some literature on nonlinear empirical modelling.

Sentoni et al. (1996) successfully applied Laguerre systems combined with neural nets to approximate nonlinear process behaviour. Safavi et al. (2002) present a hybrid model of a binary distillation column, combining overall mass and energy balances with a neural net that accounts for the separation. This model was used for an online optimization of the distillation column and compared to the full mechanistic model. The resulting optima were close, indicating that the hybrid model was performing well. No results were presented on the computational benefit of the hybrid model.

Ling and Rivera (1998) present control-relevant model reduction of Volterra models by a Hammerstein model with a reduced number of parameters. The focus was on the closed-loop performance of a simple polymerization reactor described by 4 ordinary differential equations, and the computational aspects were not discussed. Later, Ling and Rivera (2001) presented a three-step approach to derive control-relevant models. First, a nonlinear ARX model is estimated from plant data using an orthogonal least-squares algorithm. Second, a Volterra series model is generated from the nonlinear ARX model. Finally, a restricted-complexity model is estimated from the Volterra series through the model reduction algorithm described above. This involves quite a few nontrivial steps to finally arrive at the reduced model.

Miscellaneous modelling techniques

Norquay et al. (1999) successfully deployed a Wiener model within a model predictive controller on an industrial splitter, using linear dynamics and a static output nonlinearity. Pearson and Pottmann (2000) compare Wiener, Hammerstein and nonlinear feedback structures for gray-box identification, all based on linear dynamics interconnected with a static nonlinear element. Pearson (2003) elaborates, in his review paper on nonlinear identification, on the selection of nonlinear structures for computer control.
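As an illustration of how cheap such block-structured models are to evaluate, a discrete-time Wiener model is just a linear state-space filter followed by a static output map (a Hammerstein model applies the static nonlinearity at the input instead). The sketch below is generic and unrelated to Norquay's industrial application; the first-order dynamics and the saturation-type output map are hypothetical choices.

    import numpy as np

    def simulate_wiener(A, B, C, phi, u, x0=None):
        """Discrete-time Wiener model: linear dynamics x+ = A x + B u, v = C x,
        followed by a static output nonlinearity y = phi(v)."""
        x = np.zeros(A.shape[0]) if x0 is None else np.asarray(x0, dtype=float)
        y = []
        for uk in u:
            x = A @ x + B * uk          # linear dynamic block
            y.append(phi(C @ x))        # static nonlinear block
        return np.array(y)

    # First-order linear dynamics with a saturation-type output map
    A = np.array([[0.9]])
    B = np.array([0.1])
    C = np.array([1.0])
    u = np.ones(50)                     # unit step input
    y = simulate_wiener(A, B, C, np.tanh, u)
    print(y[-1])                        # approaches tanh(1.0), roughly 0.76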
Stewart et al. (1985) presented a rigorous model reduction approach for nonlinear spatially distributed systems, such as distillation columns, by means of orthogonal collocation. A stiff solver was used to test and compare the original model with different collocation strategies, which appear to be remarkably efficient.

Briesen and Marquardt (2000) present results on adaptive model reduction for the simulation of thermal cracking of multi-component hydrocarbon mixtures. This method provides an error-controlled simulation: during simulation an adaptive grid reduces the model complexity where possible. The error estimation governs the efficiency of the complete procedure, and no results were presented on the reduction of computational load. Online use of such a model would require some adaptation of the Kalman filter, since the order of the model changes continuously.

Kumar and Daoutidis (1999) applied a nonlinear input-output linearizing feedback controller to a high-purity distillation column; the controller was non-robust when based on the original model but had excellent robustness properties when based on a singularly perturbed reduced model. No details were presented on the effect of the model reduction on computational load. See e.g. Nijmeijer and van der Schaft (1990), Isidori (1989) and Kurtz and Henson (1997) for more details on feedback linearizing control.

An important observation from this literature overview is that no reduction techniques are available that are directly linked to reducing the computational load of simulation or optimization. All techniques have a different focus, and the effect on computational load can only be evaluated by implementation. There is no model synthesis technique for dynamic optimization that provides an optimal model for a fixed sampling interval and prediction horizon. The most promising model reduction techniques should therefore be evaluated on their merits for real-time dynamic optimization by implementation.

1.4 Solution directions

The gap revealed for consistent online nonlinear model-based optimization is caused by its computational load. Since computing speed approximately doubles every eighteen months, one could argue that it is only a matter of time before the gap is closed. With the gap of a factor of eight hundred derived in the previous section (about 9.6 doublings, since 2^9.6 ≈ 800, each taking eighteen months), we would still need to wait almost fifteen years until computers are fast enough, assuming that the improvements in computing speed can be extrapolated. However, suppose next decade's computing power were available today: we would immediately want to control even more unit operations with this optimization-based controller, or increase the sampling frequency to improve performance. This brings us back to square one, where the basic question is how to reduce the computational load of the optimization problem so that it can be solved within
