New method for data replication in distributed heterogeneous database systems

Miroslaw Kasper
Department of Computer Science
AGH University of Science and Technology

Supervisor: Grzegorz Dobrowolski

Krakow, 2011

I dedicate this thesis to my wife Iwona.

Acknowledgements

I would like to acknowledge my supervisor Grzegorz Dobrowolski for his tireless guidance and many useful comments about the material in the thesis. His intuition, insight and ideas were crucial in helping me progress in this line of research, and more importantly, in obtaining useful results.

Contents

1 Introduction  1

2 Database replication  8
  2.1 Database replication definition  9
  2.2 Replication Model  11
  2.3 Development of replication techniques  14
  2.4 Review of the approaches  16
    2.4.1 Time of transactions  17
      2.4.1.1 Eager replication  17
      2.4.1.2 Lazy replication  20
    2.4.2 System architecture  26
    2.4.3 Middleware based replication  28
    2.4.4 Server interaction  29
    2.4.5 Transaction termination  30
    2.4.6 Environment complexity  31
  2.5 Review of replication techniques usage  32
    2.5.1 Transactional data processing  32
    2.5.2 Large amount of data  37
    2.5.3 Multimedia data  39
    2.5.4 Data in mobile systems  40
    2.5.5 Real time data processing  42
    2.5.6 Spatial data storing  43
    2.5.7 Replication for High Availability  45
  2.6 Summary  46

3 Theta replication method  49
  3.1 General assumptions and requirements  50
  3.2 Theta replication architecture  52
  3.3 Approach details  55
    3.3.1 Middleware components  55
    3.3.2 Transaction processing  57
    3.3.3 Communication in middleware layer  60
    3.3.4 Conflict resolution  62
    3.3.5 Failure resistance  68
  3.4 Complexity analysis  70
    3.4.1 Selection of the transaction identifier  71
    3.4.2 Conflict resolution  71
  3.5 Summary  72

4 In-laboratory testing  74
  4.1 Testing implementation  76
  4.2 Performance indices and testing plan  78
  4.3 Results  85
    4.3.1 Scalability  86
    4.3.2 Percentage of reads  92
    4.3.3 Conflicting transactions ratio  94
    4.3.4 Size of database copies  96
    4.3.5 Portability  100
  4.4 Summary  100

5 Real life evaluation  102
  5.1 Introduction to IBIS system  102
    5.1.1 System logical structure and architecture  103
    5.1.2 Agents interactions  108
  5.2 Database architecture  109
  5.3 Replication implementation for IBIS  114
    5.3.1 Requirements  115
    5.3.2 Data replication architecture  116
    5.3.3 Data flow and data management in replicated IBIS  119

    5.3.4 Implementation details  122
  5.4 Evaluation scenarios and results  124
    5.4.1 Failure resistance  125
    5.4.2 Web Interface  129
    5.4.3 Crawling  134
    5.4.4 Data Searching  139
    5.4.5 Strategy management  144
    5.4.6 Method adaptation  149
  5.5 Summary  150

6 Conclusions  152

A Theta method software deployment  155

B Implementation details  160
  B.1 In-laboratory  160
  B.2 IBIS case  162

C Test environment details  164
  C.1 In-laboratory  164
  C.2 IBIS case  165

References  169

List of Figures

2.1 Ways of updates propagation in replicas  11
2.2 Updates control in replicas  12
2.3 SRDF based data replication architecture  19
2.4 Architecture for Snapshot based data replication  22
2.5 Cascade replication for PostgreSQL Slony-I  25
2.6 Database replication architecture with centralized (a) and distributed (b) middleware  28
2.7 DARX application architecture  36
2.8 Bayou system architecture  42
2.9 Example of Regional Locking (Source: [44])  45
3.1 General architecture of Theta replication system  52
3.2 Communication layers for Theta replication  53
3.3 Data flow in Theta replication system  54
3.4 Middleware components  56
3.5 Transaction processing in Theta approach  58
3.6 Parameter set containing transactions information  61
3.7 Data structure containing details of transaction  61
3.8 Data structure containing details of transaction  62
3.9 TIDs selection  64
3.10 Conflict Resolution activity diagram  65
3.11 Idea of Conflict Resolution  67
3.12 Middleware transaction logging  68
3.13 Idea of transaction recovery inside middleware  69

4.1 Location of Theta method in overall replication system  77
4.2 Middleware external interfaces  78
4.3 Scalability  87
4.4 Centralized system versus Theta replication  90
4.5 Queries percentage influence on response times  92
4.6 Conflicts ratio influence on the approach response times  94
4.7 Database size influence on mean response times – tables with indexes  97
4.8 Database size influence on mean response times – long lasting transactions for tables without indexes  99
5.1 System logical structure (Source: [90])  104
5.2 IBIS system architecture (Source: [90])  105
5.3 Architecture of IBIS platform (Source: [65])  106
5.4 Interactions between agents (Source: [65])  108
5.5 IBIS databases (Source: [65])  110
5.6 Strategy database  111
5.7 Working database  112
5.8 Configuration database  113
5.9 Architecture of IBIS platform with data replication  117
5.10 IBIS agents and processes communication  118
5.11 Activity diagram for IBIS replication  119
5.12 Communication in IBIS with replication  120
5.13 Deployment diagram for Theta replication management in IBIS  122
5.14 Data flow for iBATIS data mapper (Source: [40])  123
5.15 Web Interface control panel for IBIS system  130
5.16 Crawling test architecture  135
5.17 Data Searching test architecture  140
5.18 Strategy management test architecture  144
5.19 Strategy management efficiency comparison  148
5.20 Listing for inserting web object – centralized database  149
5.21 Listing for inserting web object – replicated database  150
A.1 Elements placed on the attached CD  156
A.2 Middleware directory structure for IBIS implementation  158

A.3 Running middleware processes for replication in IBIS  159
C.1 Laboratory environment for Theta replication implementation  164
C.2 Test environment for IBIS system with data replication  167

Chapter 1

Introduction

Modern systems are increasingly dispersed geographically, while at the same time the requirement to reduce data access times grows ever stricter. Ensuring appropriate data access times is required in every kind of data processing system, and it is especially important in systems implemented for complex, distributed environments that process huge numbers of transactions [3, 54, 61, 84]. Moreover, growing companies expand their activities to many countries all over the world, and data processing requirements and methods change at the same time. Such systems face additional requirements: ensuring an appropriate level of service availability and efficiency in environments consisting of many data centers spread among distant countries. On the other hand, these demands become more and more difficult to fulfil as the complexity of the system grows (multi-node environments, long distances between remote nodes, large numbers of exchanged messages).

Data replication is an invaluable technique for fulfilling the high demands of distributed data management systems [3, 15, 54, 61, 84]. The majority of approaches presented in the literature for data replication in complex, multi-node, transactional environments focus mainly on fault-tolerance issues [4, 32, 38] and the security of the system [97, 116]. For instance, in multi-agent systems, which by their nature are distributed systems, these approaches address problems related to communication and interaction between agents, as well as agent coordination [97].

Data synchronization in distributed, multi-node systems, as in many other distributed systems, is performed between a large number of remote nodes. As a consequence, the whole system using the replication approach must ensure a high level of scalability, understood as the possibility of extending the data management system by adding remote replicas while its overall performance increases, or at least does not decrease [48, 54, 61]. Data in systems covering wide areas can be stored in different database management systems running on a variety of operating systems and hardware platforms; it is therefore also necessary to enable appropriate cooperation between replication nodes in such heterogeneous systems [48].

The aim of this research is therefore to propose a replication method for distributed systems with a large number of nodes, suitable for systems in which a high level of scalability is a priority. The proposed method should also meet fault-tolerance requirements and should be applicable to heterogeneous environments based on various software and hardware platforms. The following sections of this chapter introduce the research thesis and then define the main activities connected with the performed research tasks.

The thesis of this research is that the Theta replication method, which enables concurrent transaction processing without the need for distributed locks, provides a non-conflicting and highly scalable technique for data replication in heterogeneous database systems. The Theta replication method and the related Conflict Prevention algorithm are original contributions of this research. The proposed method has been designed for a high-availability multi-tier architecture with distributed middleware [7, 50].
In the proposed approach, the users' transactions are delivered to all middleware nodes without preserving any order, and there they are reassembled into an order that guarantees the same overall result in all replicas. The middleware uses its own concurrency control mechanism, called the Conflict Prevention algorithm, which enables transactions to be executed in parallel. Transactions are verified against the possibility of conflicts using the Conflict Prevention algorithm, which is designed especially for the purposes of the Theta method. The Conflict Prevention algorithm defines a transaction processing order that ensures non-conflicting processing. Moreover, it supports executing transactions with a degree of parallelism, while at the same time guaranteeing data consistency in replicas (the same results of processed transactions in each replica). Since the Conflict Prevention algorithm ensures that a conflict cannot happen in any replica, which technically is a single-instance database, transactions are applied to the system as they would be applied in a system with a centralized database. As a result, distributed data locks are not required during data replication. The Theta approach is suitable for heterogeneous environments, meaning that replication can be implemented in environments with different RDBMSs, operating systems and/or hardware platforms without complex gateways, additional drivers, etc.

The research contributions of this thesis and the planned research tasks are:

• Analysis of the usage spectrum of replication techniques,
• Proposal of the new replication method – Theta replication,
• Experimental evaluation of the method,
• Practical implementation of the Theta method.

The following part of this section provides a short introduction to the tasks performed for the purposes of this research.

Analysis of the usage spectrum of replication techniques

In recent years many replication approaches have been proposed in the literature [17, 54, 61, 76, 77, 84] or implemented in database management systems used in real-life solutions [42, 67, 99, 104, 105]. The review of replication approaches introduced in Chapter 2 focuses on the relations between different types of replication. Furthermore, it identifies key aspects of data replication approaches and helps recognize and define directions for further work on replication issues.

Comparison of conceptually similar approaches is not easy because of many subtle differences in the mechanisms used, and comparing test results for various replication techniques is usually of little value, since different assumptions are made for different approaches, often designed for mutually exclusive solutions (replication for OLTP systems, DSS systems, multimedia systems, etc.). It is also very difficult, and usually impossible, to perform experiments with the available replication approaches, since their source code or deployment software is not accessible. Therefore the review presented in Chapter 2 analyses the replication approaches from the perspective of their usage.

Theta replication method

The key component of the new approach is the Conflict Prevention algorithm realized in the middleware. It determines an optimal order of transaction execution to ensure data coherence across database copies and to keep data identical in all copies. The proper order of transaction execution is determined regardless of the time and order in which particular transactions appear in the middleware. After the conflict analysis is performed, non-conflicting transactions are submitted to the database concurrently, which is correct according to 1-copy serializability theory [5], whilst conflicting transactions have to be executed apart from the transactions they are in conflict with.

In a system with a Theta replication implementation, every location consists of an instance of the middleware software and an instance of the database management system. Both the middleware and database instances may work in a cluster configuration. The middleware is located between the clients submitting transactions and the database. To submit their transactions, clients use a dedicated driver called the Theta Connector. The Theta Connector converts the users' requests into plain sets of parameters.
Afterwards, these parameter sets are sent to the middleware, where conflict resolution is performed. When the conflict resolution procedure is finished, the parameters are decoded into stored procedure calls, and these procedures are then executed in each copy of the database with a degree of parallelism.
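The flow just described can be sketched as follows. This is an illustrative Python sketch, not the thesis implementation: the function names, the structure of the parameter set (`proc`, `args`, `write_set`) and the conflict test based on overlapping write sets are assumptions made only for this example.

```python
def encode_request(proc_name, args, write_set):
    """Theta Connector step: convert a user request into a plain parameter set."""
    return {"proc": proc_name, "args": list(args), "write_set": frozenset(write_set)}

def conflicts(tx_a, tx_b):
    """Assume two transactions conflict when their write sets overlap."""
    return bool(tx_a["write_set"] & tx_b["write_set"])

def schedule(parameter_sets):
    """Middleware step: group transactions into batches so that each batch is
    mutually non-conflicting and may run in parallel on every replica, while
    the batches themselves are applied in the same order everywhere."""
    batches = []
    for tx in parameter_sets:
        for batch in batches:
            if not any(conflicts(tx, other) for other in batch):
                batch.append(tx)   # no conflict: join a parallel batch
                break
        else:
            batches.append([tx])   # conflicts with every batch: run apart
    return batches

def decode_to_call(tx):
    """Final step: decode a parameter set back into a stored procedure call."""
    args = ", ".join(repr(a) for a in tx["args"])
    return f"CALL {tx['proc']}({args})"
```

For example, three transactions of which two touch the same row would be scheduled as two batches: the two non-conflicting ones together, the conflicting one apart.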

The following list provides the general assumptions and requirements for the proposed data replication approach:

• Ensuring a high level of scalability,
• Reduction or elimination of locks between remote replicas, so that distributed transactions are not a necessity,
• Reduction of the number of messages exchanged between replication nodes,
• Transaction processing in parallel,
• Replication for multi-node environments,
• Portability and ease of use in heterogeneous environments.

Chapter 3 introduces the model and concept of the Theta replication method, as well as an extensive description of the approach.

Theta method experimental evaluation

The data management systems of the majority of companies are required to be accessible all the time, and any unforeseen and unwanted data loss or modification is not acceptable. Thus there is no possibility of conducting research experiments in a production environment, and it was decided to prepare a prototype implementation of the proposed Theta replication method and to conduct the appropriate experiments in a simulated environment, referred to here as in-laboratory testing. The main purpose of these experiments is verification of the Theta replication method presented in Chapter 3. The following factors are taken into consideration:

• High scalability level,
• Transaction processing in parallel,
• Portability and possibility of usage in heterogeneous environments,

• Deadlock detection and resolution,
• Reduced communication,
• Resistance to failures,
• Ease of introducing changes.

The issues related to the in-laboratory implementation of the Theta replication method and its experimental evaluation are presented in Chapter 4.

Practical implementation of the Theta method

IBIS [65] is a system that continuously monitors the Internet and classifies the objects found (first of all WWW pages) by matching their content against user-defined profiles describing areas of interest. The IBIS platform is a multi-agent, multi-user system. Its implementation consists of a set of machines on which the processes of particular agents run. Performing their tasks towards the common goal, the agents have to work on consistent data, easily and efficiently accessed from the servers on which the agent processes run. Because of the distributed, multi-node architecture of the IBIS system, data replication based on the Theta replication method proves well suited to fulfilling these demands. As part of this research, data replication based on the Theta method is designed and implemented for the IBIS multi-agent system. The Theta replication implementation for IBIS allows the method to be evaluated in a real-life system, providing opportunities for functional verification of the replication process, fault-tolerance tests and a practical scalability review. Chapter 5 introduces the IBIS platform and then presents extensively the practical implementation of the Theta replication method for the IBIS system, including an appropriate evaluation of the solution.

Appendices

The Theta replication method is implemented and evaluated in the in-laboratory and real-life environments. A wide-ranging evaluation of the implemented method

has been performed, and the replication software built for it can be used for data sharing based on the Theta replication method. It is planned to package the software and to prepare a universal installer to simplify deployment and configuration of the replication software. The attached materials, described in the appendices, can be used to set up a replication environment as used in the performed evaluation of the Theta method. Appendix A provides details of the replication software implementing the Theta method. A detailed specification of the software used for the IBIS system implementation can be found in Appendix B, while the test environment organization, including the hardware specification, is presented in Appendix C.

Chapter 2

Database replication

As stated in the introduction, the majority of currently used systems are distributed among many remote locations. Supply chain management, customer relationship management, business intelligence and many other types of applications usually do not share data or business rules, nor communicate with each other to exchange data. Thus, identical data is stored in multiple locations, and as a consequence business processes cannot be automated. A solution to this problem can be Enterprise Application Integration [27], the process of linking applications within a single organization together to simplify and automate business processes. Access to huge amounts of data distributed among remote sites, and operation on that data, is also realized using data replication techniques. Besides storing copies of the same data in multiple remote locations, database replication improves the performance, scalability and fault-tolerance of systems. In the data replication process it is essential to keep copies consistent across the whole system. Several correctness criteria have been proposed for database replication: 1-copy serializability [5, 49], generalized snapshot isolation [1, 22] and 1-copy snapshot isolation [61].

Data replication is one of the most advantageous techniques for dealing with failures and improving efficiency in a variety of data storage systems. It has been an area of research for almost thirty years, as the first publications related to data replication appeared in the late seventies [103, 111]. Years of detailed studies have enabled the development of a wide range of algorithms and protocols for maintaining data replication in distributed environments. The

turn of the 21st century has brought rapid development of those techniques, as various approaches to database replication have been proposed. Many replication approaches have been developed and implemented for several system architectures. A typical attribute of replication approaches is that they address only specific applications, such as synchronization of certain tables, data transfer between online transaction processing (OLTP) systems and decision support systems (DSS), hot fail-over, etc. It is a great challenge to design and implement an approach that would be suitable for many systems with different characteristics. This issue is extremely complex and still requires a lot of research. Moreover, it is possible that a universal replication approach can never be found, since the various requirements of future systems may make it necessary to use solutions designed especially to meet new demands.

In the last few years a lot of replication techniques for database systems have been proposed in the literature and many practical implementations have appeared. A comparison of these conceptually similar techniques is not easy because of many subtle differences in the mechanisms used. Furthermore, it is extremely hard to relate results coming from their different areas of usage. All these factors make replication techniques very difficult to compare with each other. Moreover, even though a lot of research has been completed and many replication techniques have been studied, it is still highly probable that there are many more possibilities for database replication which have not been explored, or even proposed, yet.

This chapter first introduces data replication and provides data replication use cases. Afterwards, there is a review of replication approaches and their development.
Then there is a survey of replication approaches from the point of view of their suitability and feasibility for usage in particular systems. The last section covers issues related to the implementation of data replication in today's database management systems.

2.1 Database replication definition

Database replication is the creation and maintenance of multiple copies of the same database [108]. It is a technique that ensures coherence and

forces data synchronization in a distributed database system. Database copies are called replicas and may be located in many remote sites. Database replication should be transparent to users, as if they were working on a single-instance database. In a system with a centralized database, every client connects to the same server. In a system in which the database is replicated among various sites, a client may choose a replica to connect to, or, as is more often the case in practice, the client connects to a dedicated replica, usually the nearest one in terms of network distance. If replicas run on separate servers, the database replication system has two important advantages compared to a centralized database:

• High availability – if one replica crashes due to a software or hardware failure, the remaining replicas can continue processing, while a centralized database system becomes completely unavailable after a single crash.

• Increased performance – the transaction processing load can be distributed among all the replicas in the system. This leads to larger throughput (since queries and read operations do not change the database state, they can be executed independently in a single replica only) and shorter response times for queries (because a query can be executed in one replica, usually in the same location as the client, without any additional communication among the replicas).

Higher availability and better performance of the system come at the following costs:

• Overhead related to additional processing and communication – the replicas require additional communication to ensure that modifications are applied to all of the database copies, which increases the load on the machines and in the communication network, and thus degrades the overall system performance.
• System complexity – synchronization of the database copies among replicas requires the use of advanced communication and transaction processing algorithms.
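As a small illustration of the replica-selection policy mentioned above, where a client connects to the nearest replica in terms of network distance, the following Python sketch picks the replica with the lowest measured round-trip time. The replica names and latency values are hypothetical, chosen only for the example.

```python
def nearest_replica(latencies_ms):
    """Pick the replica with the smallest measured round-trip time (ms)."""
    return min(latencies_ms, key=latencies_ms.get)

# Hypothetical measurements taken by one client, in milliseconds.
replicas = {"krakow": 2.1, "warsaw": 9.8, "london": 31.5}
chosen = nearest_replica(replicas)  # this client would connect to "krakow"
```

In practice the measurement and the choice would be made by the client driver or a load balancer, but the principle is the same: reads go to the closest available copy.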

2.2 Replication Model

When replicated, a simple single-node transaction may apply its updates remotely either as part of the same transaction (eager) or as separate transactions (lazy). In either case, if data is replicated at N nodes, the transaction does N times as much work [30]. Figure 2.1 shows the two ways to propagate updates to replicas (eager and lazy). Eager updates are applied to all replicas of an object as part of the original transaction. With lazy updates, one replica is updated by the originating transaction, and updates to the other replicas propagate asynchronously, typically as a separate transaction for each node.

Figure 2.1: Ways of updates propagation in replicas

Updates may be controlled in two ways, as presented in Fig. 2.2: either all updates are initiated by a master copy of the object, or updates may be initiated by any replica (group ownership). Group ownership has many more chances for conflicting updates.

The considered model assumes that the database consists of a fixed set of objects. There is a fixed number of nodes, each storing a copy of all replicated objects. Each node originates a fixed number of transactions per second. Each

transaction updates a fixed number of objects. Inserts and deletes are modeled as updates; reads are ignored. Replica update requests have a transmit delay and also require processing by the sender and receiver. These delays and this extra processing are ignored; only the work of sequentially updating the replicas at each node is modeled. Some nodes are mobile and disconnected most of the time. When first connected, a mobile node sends and receives deferred replica updates. The parameters used in the replication model are listed in Table 2.1.

Figure 2.2: Updates control in replicas

Name          Description
DB_Size       Number of distinct objects in the database
Nodes         Number of nodes; each node replicates all objects
Transactions  Number of concurrent transactions at a node (a derived value)
TPS           Number of transactions per second originating at this node
Actions       Number of updates in a transaction
Action_Time   Time to perform an action

Table 2.1: Variables used in the model and analysis

Each node generates TPS transactions per second. Each transaction involves a fixed number of actions, and each action requires a fixed time to execute. Thus

the duration of a transaction equals Actions × Action_Time. Given these two observations, the number of concurrent transactions originating at a node is:

    Transactions = TPS × Actions × Action_Time    (2.1)

In a system of N nodes, N times as many transactions will be originating per second. Since each update transaction must replicate its updates to the other (N − 1) nodes, the transaction size for eager systems grows by a factor of N and the node update rate grows by N². In lazy systems, each user update transaction generates N − 1 lazy replica updates, so there are N times as many concurrent transactions, and the node update rate is N² higher. This non-linear growth in node update rates leads to unstable behavior as the system is scaled up [30].

Eager replication updates all replicas when a transaction updates any instance of the object. There are no serialization anomalies (inconsistencies) and no need for reconciliation in eager systems; locking detects potential anomalies and converts them to waits or deadlocks [30]. In a single-node system with eager replication, transactions have about Transactions × Actions / 2 resources locked (each transaction is about half way complete). Since objects are chosen uniformly from the database, the chance that a request by one transaction will ask for a resource locked by any other transaction is Transactions × Actions / (2 × DB_Size). A transaction makes Actions such requests, so the chance that it will wait sometime in its lifetime is approximately [29, 31]:

    PW = 1 − (1 − Transactions × Actions / (2 × DB_Size))^Actions    (2.2)

    PW ≈ Transactions × Actions² / (2 × DB_Size)    (2.3)

A deadlock consists of a cycle of transactions waiting for one another. The probability that a transaction forms a cycle of length two is PW² divided by the number of transactions. Cycles of length j are proportional to PW^j and so are

even less likely when PW << 1. The probability that a transaction deadlocks is approximately:

    PD = PW^2 / Transactions = (Transactions ∗ Actions^4) / (4 ∗ DB_Size^2)    (2.4)
    PD = (TPS ∗ Action_Time ∗ Actions^5) / (4 ∗ DB_Size^2)    (2.5)

Lazy group replication allows any node to update any local data. When the transaction commits, it is sent to every other node so that its updates can be applied to the replicas at the destination nodes. It is possible for two nodes to update the same object and race each other to install their updates at the other nodes. The replication mechanism must detect this and reconcile the two transactions so that their updates are not lost. Transactions that would wait in an eager replication system face reconciliation in a lazy-group replication system. Waits are much more frequent than deadlocks, because it takes two waits to make a deadlock [30]. Eager replication waits cause delays, while deadlocks create application faults. With lazy replication, the much more frequent waits are what determines the reconciliation frequency. Thus, the system-wide lazy-group reconciliation rate follows the transaction wait rate equation:

    Lazy_Group_Reconciliation_Rate = (TPS^2 ∗ Action_Time ∗ (Actions ∗ Nodes)^3) / (2 ∗ DB_Size)    (2.6)

2.3 Development of replication techniques

At the beginning of its development, data replication for databases was realized by modifications to the source code of the database management system engine. These changes were made in various parts of the engine, for example in the transactional log module, which records all modifications in the database, or with the use of additional modules ensuring group communication. An example of a system based on this idea is the implementation of Postgres-R [53, 54], which is characterized by relatively good performance, as the overhead

related to the replication process is low. However, since it is necessary to modify the source code of the essential part of the database, namely its engine, systems implemented on the basis of such code modifications are very hard to port to different system platforms. Difficulties appear not only when different database engines or operating systems are used, but also for different versions of a database supplied by the same vendor, even for the next release of that database.

The solution presented in [5] is based on the propagation of all local operations to each remote site in the system. Unfortunately, propagating every operation to the remaining sites led to frequent occurrences of distributed deadlocks among these operations. Thus, after further research, the ROWAA approach (Read One Write All Available), which ensures the coherency of the replication process, was designed. In this approach all operations related to a single transaction are first processed at one site only, and afterward the data modifications are transferred to the remaining nodes, without the need for any additional communication messages. Since fewer messages are transferred through the network to process a transaction, the transaction time is shorter and the replication performance is improved. The Optimistic Two-Phase Locking protocol (O2PL) presented in [9] is an example of this approach, and one of the first to ensure replicated data coherence on the basis of ROWAA. It is realized as an adaptation of the two-phase locking protocol (2PL) in which local transactions are distinguished from remote transactions. This makes it possible to predict and avoid deadlocks, or at least to decrease their number.
Starting transaction processing immediately after the transaction is delivered to a site significantly decreases the overall time required to commit that transaction. After a complete transaction is delivered, its correct order is determined, and if the transaction does not conflict with preceding transactions it is committed. Otherwise, the whole transaction is processed from the beginning once more. To decrease the influence of deadlocks on data replication, approaches based on group communication were designed and implemented. The Group Communication Systems (GCS) approaches used in [11, 43] provide a mechanism that guarantees the necessary order of message delivery in the network, and also enable failure detection at any site of the system. The most restrictive order of delivered messages requires

the same order of delivery at every node of the system, which makes it possible to avoid distributed deadlocks. Group communication based approaches have been widely explored, and several systems using them have been implemented, for instance the Basic Replication Protocol (BRP) presented in [43] or the Postgres-R project in [53, 54].

Unfortunately, group communication based approaches are not the most effective technique for preventing deadlocks in all possible applications or for every type of transaction. This is because additional messages must be exchanged between sites to ensure the proper order of delivery of messages related to transactions. The resulting overhead can significantly decrease the performance of the approach [18]. Further research on GCS techniques reduced the influence of network latencies on the overall time required to deliver a transaction to every node in the correct order. An example of such a solution is the Generic Broadcast approach discussed in [2, 78], in which the delivery order matters only for those transactions that are conflicting, while the remaining transactions can be delivered in any order. Another example is the Optimistic Atomic Broadcast approach presented in [55, 115]. Messages in this approach are delivered in the same order as they were received, which enables quick application of writesets at the remote nodes, despite the necessity of waiting for the final order of transactions before they are committed. Thus, only those remote transactions whose writesets did not follow the total order are rolled back and reapplied in the correct order. Group communication based techniques have also been used for the epidemic algorithms presented in [37].

2.4 Review of the approaches

A well-known and useful classification was proposed by Gray in [30], where replication approaches are grouped according to two parameters.
These parameters are the place at which a transaction is initiated and the time at which a transaction is distributed to the nodes. Wiesmann in [117, 118] proposes an extended classification using three parameters: the first is the architecture of the server, the second is the degree of communication among database nodes during the execution of a transaction, and the last is the transaction termination protocol.

The classification of replication techniques is presented in the following subsections. Replication techniques are distinguished on the basis of the architecture of the replication system, the time of data propagation between replicas, the interaction between nodes, the way transactions are terminated, and the possibility of use in heterogeneous environments.

2.4.1 Time of transactions

In a replicated system, the consistency of data across all replicas has to be ensured in the presence of updates. Some replication protocols require strong consistency, which means that data must be consistent at all times. In such cases replicas must coordinate updates before the response is sent to the client, and the response time increases. There are also protocols that require only weak consistency, allowing data to be temporarily inconsistent. This gives faster responses for writes, but transactions may read data that is not up to date, since updates have not yet been applied at every remote replica. Weak consistency protocols may even require rolling back updates that were previously applied and committed, even though the client received a confirmation that the commit was successful.

Depending on the time at which a modification made at one site is propagated to the other replication sites, two replication approaches are distinguished: eager and lazy. In the eager approach a transaction may be committed only when it can be committed at all sites. Unlike the eager approach, the lazy approach allows updates to be committed before they are propagated to the other sites.

2.4.1.1 Eager replication

The eager (synchronous) replication approach requires immediate propagation of changes from the database node where the transaction was submitted to all nodes in the replication system. Thus, eager replication keeps all database replicas totally synchronized at all nodes by updating all the replicas as a part of one atomic transaction.
Eager replication ensures serializable execution of transactions, so there are no concurrency anomalies. However, eager replication reduces update performance and increases transaction response times, because extra updates and messages are added to the transaction [30].
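Updating all replicas as part of one atomic transaction requires an all-or-nothing agreement among the nodes. The following minimal sketch illustrates the idea with a simple prepare/commit vote; the Replica class and its methods are illustrative, not any vendor's API:

```python
# Sketch of eager (synchronous) update propagation: the transaction
# commits at all replicas, or at none of them.
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}       # committed state
        self.pending = None  # updates prepared but not yet committed

    def prepare(self, updates):
        # Phase 1: hold the updates and vote on the outcome.
        self.pending = dict(updates)
        return True          # vote "yes" (a real replica may refuse)

    def commit(self):
        # Phase 2: make the prepared updates durable.
        self.data.update(self.pending)
        self.pending = None

    def abort(self):
        self.pending = None

def eager_commit(replicas, updates):
    """Apply `updates` at every replica, or at none of them."""
    votes = [r.prepare(updates) for r in replicas]
    if all(votes):
        for r in replicas:
            r.commit()
        return True
    for r in replicas:
        r.abort()
    return False

replicas = [Replica("A"), Replica("B"), Replica("C")]
assert eager_commit(replicas, {"x": 1})
assert all(r.data == {"x": 1} for r in replicas)
```

The client only receives its confirmation after every replica has voted, which is exactly where the extra messages and response time of eager replication come from.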

Eager replication implementations are most often based on the Two-Phase Commit protocol (2PC) [5] or on its modifications, such as the Three-Phase Commit protocol (3PC) [98]. Two-Phase Commit is a distributed algorithm that requires all nodes in a distributed system to agree before committing a transaction. The protocol results in either all nodes committing the transaction or all nodes aborting it, even in the case of network or node failures. The greatest drawback of the Two-Phase Commit protocol is that it is a blocking protocol, as it usually relies on the Two-Phase Locking protocol (2PL). In 2PL a node locks data (rows, tables) while it waits for a message, which means that other processes competing for resource locks held by the blocked processes have to wait for the locks to be released.

Ensuring 1-copy serializability, which guarantees that data is coherent and integral, is the most important advantage of eager replication: data is up to date at every node at all times. The disadvantages of this technique are reduced scalability and low efficiency. A further consequence of locking is the possibility of deadlocks, which also significantly decrease performance. Eager replication protocols ensure strong consistency and fault tolerance, as updates must be confirmed at the remote replicas before replying to clients. However, this has a significant impact on user-visible performance. Also, in contrast with lazy protocols, approaches to eager replication provide very different trade-offs between performance and flexibility [46]. The following part of this subsection presents popular commercial implementations of the eager replication approach.

Volume Replication

Replication of disk volumes performed at the block I/O level is a straightforward replication approach for general purposes.
By intercepting each block written by the application to designated volumes and shipping it over the network, a remote copy is maintained, ready for fail-over. Reads are performed on the local copy. The replication process is thus completely transparent to the application. The downside of the approach is that remote updates are performed with a block granularity, which depending on the application might represent a large

network overhead. The approach is also restricted to fail-over, as the backup copy usually cannot be used even for read-only access, due to the lack of cache synchronization at the application level.

[Figure 2.3: SRDF based data replication architecture]

Examples of volume replication solutions can be found in Veritas Volume Replicator [106] and EMC Symmetrix Remote Data Facility [23], available for many different operating systems, as well as in the open source DRBD for the Linux OS [33].

RAIDb protocols

A Redundant Array of Inexpensive Databases (RAIDb) provides better performance and fault tolerance than a single-instance database. This is achieved at a lower cost by combining multiple database instances into an array of databases. Like RAID for disks, the different RAIDb levels provide various cost/performance/fault-tolerance trade-offs. RAIDb-0 features full partitioning, RAIDb-1 offers full replication, and RAIDb-2 introduces an intermediate solution called partial replication, in which the user can define the degree of replication of each database table [46].
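The role of a RAIDb-1 (full replication) controller can be illustrated with a small sketch: writes are broadcast to every backend, while reads are load-balanced across them. The class and method names are illustrative and do not reflect C-JDBC's actual API:

```python
import itertools

# Sketch of a RAIDb-1 controller: clients see a single database,
# every write goes to all backends, reads are balanced round-robin.
class RAIDb1Controller:
    def __init__(self, backends):
        self.backends = backends
        self._round_robin = itertools.cycle(backends)

    def write(self, key, value):
        # Full replication: update every copy before acknowledging.
        for backend in self.backends:
            backend[key] = value

    def read(self, key):
        # Any copy can serve the read; rotate to spread the load.
        return next(self._round_robin)[key]

db = RAIDb1Controller([{}, {}, {}])
db.write("account", 100)
assert db.read("account") == 100
assert db.read("account") == 100   # served by a different backend
```

A RAIDb-2 controller would differ only in routing each table to the subset of backends that replicate it, at the cost of a lookup table from objects to backends.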

In RAIDb the distribution complexity is hidden from the clients, so they are provided with the view of a single database, as in a centralized architecture. As with RAID, a controller is located in front of the underlying resources; the clients send their requests directly to the RAIDb controller, which distributes them among the set of RDBMSs. RAIDb is implemented as a software solution in C-JDBC [107].

2.4.1.2 Lazy replication

In lazy (asynchronous) replication the updates of a transaction are propagated once it has already committed. The lazy approach does not require a continuous connection between all nodes. Every node works independently, and reads and writes are processed locally. Updates are propagated to remote nodes either as SQL statements or as log records containing the results of the executed operations. Despite many disadvantages, the lazy approach has been used in several commercial DBMSs, and it is an important option when mobile or frequently disconnecting databases are considered. Lazy replication can synchronize data for selected tables only, rather than the whole database instance. Synchronization can be executed periodically, on demand, or immediately after a successful modification at the local node. Processing transactions locally allows them to complete quickly, but does not always ensure replica consistency and may lead to conflicts, data incoherence and a high abort rate. The advantages of the lazy replication approach are scalability, no overhead related to Two-Phase Locking, resistance to single-node failures, and no need for a constant connection among all nodes. Lazy replication has become a standard feature of many commercial database management systems and open-source database projects [42, 63, 67, 99, 105].

In a replication system implementing the lazy approach, every transaction is executed and committed at a single replica without synchronization with the other replicas.
Every other replica is synchronized later, as changes are captured, distributed and applied. As a result, the overall performance of the whole system, measured as transaction processing time, improves significantly. However,

since updates can be lost after the failure of any replica, lazy replication approaches are not suitable for fault-tolerance solutions in which strong data consistency is required [46].

There is a variety of implementations in which lazy replication operates in different ways. The main differences between implementations of the lazy approach lie in the following areas:

• the way the whole system is managed,
• the system architecture, determining which replica processes and publishes updates to other replicas,
• the way in which updates are captured, distributed and applied,
• the filtering of the processed transactions.

Lazy replication protocols can be divided into three phases:

• Capture – updates performed on replicated objects are transformed into a format suitable for publication,
• Distribution – changes to published objects are propagated to the relevant replicas,
• Apply – updates are applied at the relevant replicas.

The following part of this subsection contains a general description of the most popular commercial database management systems that implement the lazy replication approach.

Oracle

Oracle Database versions 10g and 11g offer three basic replication solutions [8, 67, 68]:

• Snapshot replication,
• Streams replication,

• Advanced replication.

Snapshot replication, also called materialized view replication, is based on capturing changes to materialized views, which are then propagated and applied at the remote sites. A snapshot is a query that has its data materialized, i.e. populated in the form of a table. When a snapshot is created, a table corresponding to the column list in the query is created. When the snapshot is refreshed, that underlying table is populated with the results of the query. For replication, as data in a table in the master database changes, the snapshot is refreshed as scheduled and the data is distributed to the replicated databases.

[Figure 2.4: Architecture for Snapshot based data replication]

Oracle Streams Replication enables the propagation and management of data, transactions and events in a data stream, either within a database or from one database to another. Streams replication passes published updates, events or even messages to subscribed destinations, where they are applied or further processed. The Oracle Streams technology uses the operation logs (redo logs) as input for a capture process. The capture process formats both operations on data

(DML) and data definitions (DDL) into events, which are enqueued for the distribution process. The distribution process reads the input queue and enqueues the events in a remote database. The queue from which the events are propagated is called the source queue, while the queue receiving the events is called the destination queue. There can be a one-to-many, many-to-one, or many-to-many relationship between source and destination queues. Depending on the type of the events, an apply process at the destination site applies them directly to the database, or may dequeue the events and send them to an apply handler, which performs customized processing of each event and then applies it to the replicated object.

Oracle Advanced Replication is realized as a set of supporting triggers and procedures for each replicated object. These triggers and procedures enqueue specific remote procedure calls according to the command executed. This queue is consumed by the Oracle implementation of the distribution process, which pushes and pulls the deferred calls, propagating the changes from the source to the destination databases. Then the apply process dequeues this information and updates the subscriber. Since Oracle Advanced Replication allows updates to be performed at every replica, a conflict detection mechanism is used at every site. When a conflict is detected, the actions defined for the specific type of conflict are performed (a conflict resolution procedure).

MS SQL Server

MS SQL Server 2005/2008 provides three replication solutions: transactional replication, snapshot replication and merge replication [63, 101, 104].

Transactional Replication uses the log as a source to capture incremental changes that were made to published objects. The capture process copies transactions marked for replication from the log to the distribution agent. Basically, the capture process reads the transaction log and queues the committed transactions marked for replication.
Then, the distribution agent reads the queued transactions and applies them to the subscribers.. 23.
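The capture/distribute/apply pattern behind log-based transactional replication can be sketched as a simple queueing pipeline. This is a toy model, not the actual agent implementation; the log record fields are illustrative:

```python
from collections import deque

# Toy model of log-based lazy replication: capture scans the transaction
# log for committed transactions marked for replication, distribution
# queues them, and apply replays them at a subscriber in commit order.
log = [
    {"txid": 1, "committed": True,  "replicate": True,  "changes": {"a": 1}},
    {"txid": 2, "committed": False, "replicate": True,  "changes": {"b": 2}},  # not committed
    {"txid": 3, "committed": True,  "replicate": False, "changes": {"c": 3}},  # not published
    {"txid": 4, "committed": True,  "replicate": True,  "changes": {"a": 5}},
]

def capture(log):
    # Only committed transactions marked for replication are captured.
    return [t for t in log if t["committed"] and t["replicate"]]

def distribute(transactions):
    # In practice this ships records over the network; here it is a queue.
    return deque(transactions)

def apply_changes(subscriber, queue):
    # Replay the queued transactions at the subscriber, in commit order.
    while queue:
        subscriber.update(queue.popleft()["changes"])

subscriber = {}
apply_changes(subscriber, distribute(capture(log)))
assert subscriber == {"a": 5}   # txid 1 then txid 4 were applied
```

Because the subscriber is only updated after the publisher has committed, the pipeline is asynchronous: the publisher never waits for the apply step.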

In Snapshot Replication the entire published object is captured; thus it distributes data exactly as it appears at a specific moment in time. This solution does not monitor the updates made against an object. Roughly, the capture process is implemented by a snapshot agent. Periodically, it copies the replicated object's schema and data from the publisher to a snapshot folder for future use by a distribution agent, which also acts as an apply process. Snapshot replication can be used by itself, but the snapshot process, which creates a copy of all of the objects and data specified by a publication, is very often used to provide the initial state of data and database objects for transactional and merge replication.

Merge Replication allows the publisher and the subscribers to make updates while they are connected or disconnected. When both are connected, it merges the updates. The capture process tracks the changes at the publisher and at the subscribers, while the distribution process distributes the changes and also acts as an apply process. Merge replication allows various sites to work autonomously and later merge their updates into a single, uniform result. Because updates are made at more than one node, the same data may have been updated by the publisher and by more than one subscriber. Therefore, conflicts can occur when updates are merged, and merge replication provides a number of ways to handle them.

IBM DB2

IBM DB2 Universal Database 8.2 provides two different solutions that can be used to replicate data from and to relational databases: SQL replication and Q replication [41, 42].

In SQL Replication changes are captured at the sources, and staging tables are used to store the committed transactional data. The changes are then read from the staging tables and replicated to the corresponding target tables.
With staging tables, data can be captured and staged once for delivery to multiple targets, in different formats, and at different delivery intervals.

Q Replication is a replication solution that can replicate large volumes of data at very low latency. Q Replication captures changes to the source

tables and converts the committed transactional data into messages. The data is sent as soon as it is committed at the source and read by Q Replication; it is not staged in tables. The messages are sent to the target location through message queues. These messages are then read from the queues at the destination sites, converted back into transactional data and applied to the target tables. IBM DB2 UDB version 8.2 also provides a solution called event publishing for converting committed source changes into messages in an XML format and publishing those messages to applications such as message brokers.

PostgreSQL

Slony-I is a replication solution for the PostgreSQL database which implements lazy replication in a "master to multiple slaves" scheme [99]. In Slony-I, the capture process is implemented using triggers which log the changes made against the published objects. These changes are then periodically distributed by a replication daemon, which connects directly to the publisher, reads the logged changes and forwards them to the subscribers.

[Figure 2.5: Cascade replication for PostgreSQL Slony-I]

Slony-I allows several subscribers to be connected in cascade. The management process is not integrated or centralized, so the maintenance tasks must be performed at each replica separately.

MySQL

MySQL includes a built-in lazy replication protocol that can be configured to propagate updates between replicas [64]. Replication enables data from one MySQL database server (called the master) to be replicated to one or more MySQL database servers (the slaves). Replication is asynchronous, and the slaves do not need to be connected permanently to receive updates from the master. MySQL supports one-way, asynchronous replication, in which one server acts as the master, while one or more other servers act as slaves.

2.4.2 System architecture

The first parameter to consider in the classification of database replication is the place where transactions start to execute. In the primary copy approach updates can be executed only at the primary (master) site, whilst the update everywhere approach allows updates to be executed at any site (usually at the user's local site). In either case, updates made at any site have to be propagated to the other sites according to the chosen propagation time.

Primary copy approaches are designed for the Master-Slave architecture and require a specific node (the primary copy) to be associated with all data in the system. Each data update, before being sent to all nodes, has to be sent to the primary node, where it is processed – executed or analyzed to determine the order of execution. After processing the transactions, the primary copy propagates the updates or their results to the remaining nodes. The replication mechanism in the primary copy approach does not require distributed protocols like Two-Phase Commit (2PC) or Two-Phase Locking (2PL), which eliminates the overhead related to these protocols. Transaction conflicts are also eliminated in the primary copy approach, as transactions are processed only at the primary node and conflict resolution is realized as in a single-instance database. On the other hand, this technique has many drawbacks which prevent it from being used widely.
These disadvantages are: single point of failure connected with one. 26.

primary node, bottlenecks at the primary node, and read-only access to data at nodes that are not the primary copy.

Since the primary copy approach creates a bottleneck in the replication system and a single point of failure, some modifications of the approach are used to overcome these limitations. Because this approach easily overloads the primary node, read-only transactions can be served by the secondary nodes, which balances the load. Bottlenecks can also be avoided by partitioning the data among more than one node. With different primary nodes for subsets of the data, update transactions can be executed in parallel, dividing the load among these nodes. Furthermore, in case of a crash of the primary copy, one of the other nodes can become the primary. The primary copy approach is therefore mainly used for fault tolerance: in the case of a single node crash, the database is still available to users. This technique is also applicable to data transfer between Online Transactional Processing (OLTP) systems and Decision Support Systems (DSS).

The update everywhere approach corresponds to the Multi-Master architecture. This database replication technique does not restrict the nodes at which updating transactions can be processed, and updates can be performed at each node in the system. In the update everywhere approach updates can reach two different nodes at the same time, even though they are conflicting, in contrast to the primary copy approach. In update everywhere replication the nodes are equal to each other. Each client transaction is processed by one node, and the changes are then propagated to the other replicas. However, updating data at every node gives rise to conflicts between transactions, and conflict resolution approaches are used to deal with them. Update everywhere approaches are well suited to dealing with failures, as election protocols are not necessary to continue processing.
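The routing difference between the two architectures can be made concrete with a small sketch; the node names and router classes below are purely illustrative:

```python
# Sketch contrasting where update transactions may execute in the
# primary copy and update everywhere architectures.
class PrimaryCopyRouter:
    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = secondaries

    def route_write(self, txn):
        return self.primary                       # every update goes to the master

    def route_read(self, i):
        # Read-only transactions can be served by secondaries to balance load.
        return self.secondaries[i % len(self.secondaries)]

class UpdateEverywhereRouter:
    def __init__(self, nodes):
        self.nodes = nodes

    def route_write(self, txn, client_site):
        return self.nodes[client_site]            # usually the user's local site

pc = PrimaryCopyRouter("N0", ["N1", "N2"])
ue = UpdateEverywhereRouter(["N0", "N1", "N2"])
assert pc.route_write("t1") == "N0"               # always the primary
assert ue.route_write("t1", client_site=2) == "N2"  # any node accepts updates
```

The price of the symmetric routing on the right is visible immediately: two writes routed to different nodes may conflict, which is exactly why update everywhere systems need conflict detection and resolution.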
Another advantage of the update everywhere approach, related to the symmetric architecture of this replication system, is the possibility of reading and writing data at every replica. When the update everywhere technique is used in connection with the eager approach, it usually requires the use of Two-Phase Commit or Two-Phase Locking protocols, which significantly decreases the overall performance of the system. The replication conflicts appearing when the technique is used in connection with lazy replication are also

a serious problem, as they lower replication performance and require special treatment. The complexity of concurrency control in the update everywhere approach requires additional maintenance, and is usually realized in a middleware subsystem which coordinates global transactions.

2.4.3 Middleware based replication

Many algorithms and protocols are used for maintaining data replication in distributed environments [3, 43, 61, 76]. To ensure data consistency, concurrency control in database management systems is based either on the internal mechanisms of the database management system or on a middleware tier [7, 50]. Concurrency control based on middleware protocols is becoming more and more popular because of its flexibility and simplicity. Unlike systems with a modified database engine, replication systems built on a middleware architecture make it possible to realize replication that is transparent to users and applications. If middleware based replication is implemented, no modifications to the database engine code are necessary.

[Figure 2.6: Database replication architecture with centralized (a) and distributed (b) middleware]

In recent years many middleware based replication protocols for databases have appeared. A middleware protocol maintains replication control in a middleware

subsystem placed between the clients and the databases. Middleware based replication can be developed and maintained independently of the database management systems, and it can be used in heterogeneous environments. The distributed middleware architecture has many advantages with regard to replication control, which is why this architecture has been chosen for further research. The most important features of middleware based replication are:

• Increased fault tolerance of the system – in case of failure of one of the middleware components, it is easily replaced.
• Scalability, as new databases and middleware servers can easily be added wherever they are required.
• The possibility of online upgrades and maintenance.
• Increased performance of the middleware.

2.4.4 Server interaction

This section provides more information on the classification based on server interaction, a parameter related to the number of messages sent among database nodes to apply transactions. This interaction affects the amount of network traffic and consequently adds significant overhead to the processing of transactions. Since applications are more and more geographically dispersed and the distances among nodes are getting extremely large, network communication becomes an important factor in the replication process. Server interaction is expressed as a function of the number of messages necessary to handle the operations of a transaction [118]. The number of network interactions influences the replication protocol significantly and determines the order in which transactions are processed. On the basis of the interaction among nodes, two cases are distinguished: constant interaction and linear interaction.

In constant interaction the number of messages used to synchronize the database nodes is constant and does not depend on the number of operations in the transaction.
Typically, these approaches merge all operations related to the transaction in a single message and use only one message per. 29.

transaction. Replication techniques might also use more than one message, but in the constant interaction approach the number of messages is always fixed, independent of the complexity of the transaction. By contrast, in linear interaction the processing of a transaction requires the exchange of a variable number of messages. This number depends on the number of operations in the particular transaction, as it is usually proportional to the number of individual operations that make up the entire transaction. These individual operations might be simple SQL statements, changes collected in writesets, or records from database logs containing the results of a transaction executed at a specific node.

2.4.5 Transaction termination

The way a transaction is terminated determines how atomicity is guaranteed, and leads to two replication techniques: one needs an exchange of messages to fulfill the ACID requirements, while the other satisfies the ACID properties without any additional transmission of messages [117].

Approaches based on voting termination require additional messages to terminate a transaction and establish a coherent state of the replicas. In voting protocols the decision to abort a transaction can be made by the primary node (weak voting) or by any replica (strong voting). Voting messages might be very straightforward, such as a single message sent by one replica to confirm its state; on the basis of this message the other replicas determine their states. The message exchange can also be much more complex, as in the case of protocols like the two-phase commit protocol.

Non-voting termination techniques allow the replicas at all nodes to decide on their own whether to commit or abort the transaction. However, transactions have to be executed in the same order at each replica.
Although ensuring the same processing order may seem an extremely strict requirement, it is only apparently so, because for non-conflicting transactions an identical order is not necessary.
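The non-voting scheme described above can be sketched as a short simulation. This is a minimal illustration, not any specific published protocol: the transaction representation and the conflict rule are chosen purely for clarity. Because every replica receives the same totally ordered stream of transactions and the certification rule is deterministic, all replicas reach the same commit or abort decision without exchanging any termination messages.

```python
def certify(ordered_txns):
    """Certify a totally ordered list of transactions on one replica.

    Each transaction is a dict with a 'start' position (how many
    transactions it had observed as committed when it began) plus
    'readset' and 'writeset' item sets. A transaction aborts if some
    transaction committed at or after its start position wrote an item
    it read or wrote; otherwise it commits.
    """
    decisions = []
    committed = []  # (position, writeset) of committed transactions
    for pos, txn in enumerate(ordered_txns):
        conflict = any(
            cpos >= txn["start"]
            and (wset & (txn["readset"] | txn["writeset"]))
            for cpos, wset in committed
        )
        if conflict:
            decisions.append("abort")
        else:
            decisions.append("commit")
            committed.append((pos, txn["writeset"]))
    return decisions


# The same ordered input, certified independently on three "replicas";
# no messages are needed because the decision is deterministic.
txns = [
    {"start": 0, "readset": {"x"}, "writeset": {"x"}},
    {"start": 0, "readset": {"x"}, "writeset": {"y"}},  # reads x, conflicts with txn 0
    {"start": 0, "readset": {"z"}, "writeset": {"z"}},  # disjoint items, no conflict
]
replica_decisions = [certify(txns) for _ in range(3)]
print(replica_decisions[0])  # prints ['commit', 'abort', 'commit']
```

Note that the third transaction commits even though it runs concurrently with the first two, which illustrates the remark above: identical ordering matters only for conflicting transactions.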

2.4.6. Environment complexity

The architecture of a replication environment might be quite simple, as in systems with all replicas configured in exactly the same way. It can also be extremely complex, owing to the many layers used in such systems: client, middleware and database tiers. Cooperation and replication among different database engines used in one replication system is a great challenge. Depending on whether an approach can be adapted for replicating data among miscellaneous types of databases, we can distinguish two classes of approaches: heterogeneous and non-heterogeneous replication [48]. Non-heterogeneous approaches work in environments in which every replica is implemented upon the same type of database. This kind of replication is much easier to implement than replication for heterogeneous databases. However, the demand for replication among different types of databases is growing stronger and stronger, and heterogeneous replication remains an area for future research. Heterogeneous approaches can be used in systems with a number of varied databases, which means all these databases might be supplied by different software vendors. In particular, these databases may have different APIs. Thus, realizing replication for heterogeneous databases is a very complex and demanding task. There are interfaces for accessing data in a heterogeneous environment, such as Open DataBase Connectivity (ODBC) and Java DataBase Connectivity (JDBC). They can be used for accessing data among replicas; however, this adds another layer to an already complicated environment and might adversely affect replication performance.
Object-relational mapping libraries, represented by Enterprise Objects Framework, Hibernate, Enterprise Java Beans and so forth, which convert data between incompatible databases and object-oriented programming languages, can also be used for replication, but the overhead connected with object mapping might be too high, reducing replication effectiveness in practice.
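As an illustration of the common-interface idea, the sketch below uses Python's DB-API 2.0, which plays the same role for Python programs that JDBC and ODBC play in the discussion above: replication code is written against the generic interface, and only the driver module and connection arguments differ per replica. Here sqlite3 (from the standard library) stands in for every engine so the example is runnable; in a real heterogeneous system each connection would come from a different driver module.

```python
import sqlite3


def apply_change(conn, statement, params):
    """Apply one replicated statement through the generic DB-API cursor.

    Nothing here names a concrete engine; any DB-API connection works.
    (Caveat: the parameter placeholder style varies between drivers --
    '?' for sqlite3, '%s' for some others -- so a real heterogeneous
    layer must translate statements per target engine.)
    """
    cur = conn.cursor()
    cur.execute(statement, params)
    conn.commit()


# Two replica connections. Both are in-memory sqlite3 databases so the
# sketch runs anywhere; the engine-specific part is confined to how
# each connection is created.
replicas = [sqlite3.connect(":memory:"), sqlite3.connect(":memory:")]
for conn in replicas:
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")

# Propagate the same change to every replica through the common interface.
for conn in replicas:
    apply_change(conn, "INSERT INTO accounts VALUES (?, ?)", (1, 100))

print([conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
       for conn in replicas])  # prints [100, 100]
```

The design point is that heterogeneity is pushed to the edges (connection creation and statement dialect), while the replication logic itself stays engine-neutral, at the cost of the extra interface layer mentioned above.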

2.5. Review of replication techniques usage

Database management systems used in present-day systems can be classified, according to the way data is managed, as operational and analytical databases. Operational databases are used to perform everyday duties in many organizations, institutions and companies. They are applied not only in systems which need to gather and store data, but also in systems which need to modify data. An operational database stores dynamically changing data which reflects the actual state of reality. On the other hand, analytical databases are used to store historical and archival data, or information related to past events. When a company wants to analyze market tendencies, gain access to long-term statistical data, or perform business forecasts, it uses data stored in analytical databases. Data maintained by analytical databases is hardly ever modified, if at all. Moreover, this data always presents the state of objects at some established moment in the past. Furthermore, new technologies and data formats related to newly designed database management systems more and more frequently require storing diverse data types like sounds, graphics, animations and videos, alongside traditional text and numbers. Existing applications are continuously being developed using modern, advanced relations between different types of data, which makes it easier to search for the demanded data or objects.

The following subsections review database management systems for different types of applications with respect to the usage of data replication in those systems. The replication aspects introduced in the review are related to transactional and reporting data processing, as well as to storing large objects, mobile data, real-time data and spatial data.

2.5.1. Transactional data processing
The characteristic feature of On-line Transaction Processing (OLTP) systems is that they are used to process large volumes of simple read or write transactions. Transactional systems which operate on huge amounts of data are nowadays applied in many implementations of banking systems, telecommunications, reservation systems (flight tickets, for instance), etc. The emphasis in OLTP systems is put on retaining data integrity in multi-user environments, as well as on the efficiency of data processing, measured as a number of transactions per time unit. In a typical OLTP system database efficiency is a key factor, since users demand fast answers and reactions during performed operations, despite the fact that a large number of transactions causes performance degradation related to disk contention, resource locking, deadlocks, etc. The quality and usability of such systems is highly dependent on the rate of database writes (writes to disks or other I/O devices). To fulfill the excessive requirements for databases in OLTP systems, they are usually designed to store as little data as possible, and the time for INSERT, UPDATE or DELETE operations is minimized. Additionally, data replication can be applied to improve performance and increase the availability of a transactional system. When data replication is used in OLTP systems, it is necessary to continuously ensure data consistency between all copies of data.

In the literature there are many replication approaches that could be used for data replication in OLTP systems; examples can be found in [3, 15, 54, 61, 84]. An approach that is nowadays very often used in commercial OLTP systems is eager replication based on the 2-Phase Commit protocol (2PC) described in [5]. These approaches ensure data consistency and fulfill the demands of 1-copy serializability [49]. However, because of the necessity of locking resources distributed among many remote locations, response times and the risk of distributed deadlocks increase, which leads to the lower scalability of such systems.
Thus, such systems are usually restricted to only a few nodes which operate not very far from each other. In general, replication for OLTP systems is implemented using an architecture with equivalent nodes (multi-master, update-everywhere replication). This is caused by the requirement of data consistency in every replica. OLTP systems with equivalent nodes allow data to be read and written in any replica in the whole system. Since the data in each copy is identical, they are also suitable for ensuring high resistance to failures (fault tolerance). In OLTP systems with replication based on 2PC, efficiency is usually significantly reduced by distributed locking and deadlocks. Another drawback of eager replication with equivalent nodes is the necessity of conflict resolution, which makes the implementation of such systems more difficult and impairs the efficiency of such solutions.

In systems built in an architecture in which one node is more important than the others (primary copy, master-slave replication), data is at first modified in the main node and only after transaction completion is it distributed to the other nodes. This leads to data inconsistency between particular replicas, and as a consequence the master-slave architecture is not suitable for OLTP systems. Lazy replication approaches are also not feasible for replication in OLTP systems. This is a result of the deferred propagation of changes, which does not ensure a consistent data state in each copy of the data. Conflicts therefore appear, which very often leads to the necessity of performing complex conflict resolution procedures [16, 20, 95, 110]. It is obvious that all these factors cause a significant decrease in the overall performance of such implementations.

Data processing in transactional distributed multi-node systems

One of the most important features of any distributed system is its transparency: it is seen by a user as if it were a single, integrated system. A distributed system may have a common goal, such as solving a large computational problem. Alternatively, each computer may have its own user with individual needs, and the purpose of the distributed system is to coordinate the use of shared resources or provide communication services to the users [28, 79]. The large volume of transactions and huge amount of data stored in such systems require complex solutions to achieve appropriate performance, data availability and security. Distributed cooperative applications (e.g., e-commerce, air traffic control, city traffic management) are now designed as sets of autonomous entities called agents [21].
A system with a number of interacting agents is called a multi-agent system. Distributed multi-agent systems have become a powerful tool in the process of designing scalable software. The general outline of distributed agent software consists of computational entities which interact with one another towards a common goal that is beyond their individual capabilities [32]. In the
