
Autonomic Management of Cloud-Native Applications

Ph.D. Dissertation
Joanna Kosińska

Supervisor:

Prof. dr hab. inż. Krzysztof Zieliński


1 Introduction 1

1.1 Motivation and Thesis Statement . . . 4

1.2 Scope of Research . . . 6

1.3 Thesis Outcomes . . . 7

1.4 Structure of the Dissertation . . . 8

2 Technological Background and Related Work 11
2.1 Principles of Cloud-native . . . 12

2.1.1 Cloud-native application (CNApp) . . . 16

2.1.2 Basics of the containers . . . 20

2.1.3 Orchestrating the containers . . . 23

2.2 Basics of Resource Management . . . 25

2.2.1 Policy-based management . . . 25

2.2.2 Knowledge Representation and Reasoning (KRR) . . . 26

2.2.3 Technologies related to Cloud-native management . . . 28

2.3 Autonomic Computing Paradigm . . . 30

2.4 Summary . . . 34

3 Autonomic management aspects among Cloud-native applications 37
3.1 System requirements . . . 38

3.2 Model of a Cloud-native execution environment . . . 39

3.3 Abstraction of application management concept in Cloud-native environments . . . 41

3.4 Cloud-native observability concept . . . 47

3.5 Cloud-native autonomic element concept . . . 52

3.6 Summary . . . 53
4 High-level design of an autonomic management framework for Cloud-native applications


4.2 Microservices architecture of AMoCNA . . . 64
4.2.1 Instrumentation Layer . . . 69
4.2.2 Observation Layer . . . 70
4.2.3 Inference Layer . . . 71
4.2.4 Control Layer . . . 72
4.2.5 Management Layer . . . 73

4.3 Algorithms presenting initialization of AMoCNA . . . 75

4.4 Summary . . . 78

5 AMoCNA Implementation 79
5.1 Selection of the technology stack . . . 79

5.1.1 Alignment with AMoCNA . . . 80

5.2 Inclusion of AMoCNA in Cloud-native compliant environments . . . 82

5.3 Details of the implementation . . . 83

5.3.1 Cloud-native application outline . . . 84

5.3.2 Autonomic Element microservice . . . 85

5.3.3 Management Policies microservice . . . 93

5.4 Summary . . . 94

6 System evaluation 95
6.1 Evaluation methodology . . . 96

6.1.1 AMoCNA testbed . . . 97

6.2 Evaluation outcomes . . . 103

6.2.1 Evaluation 1 - Consequences of dynamic adjustment of the Cloud-native execution environment . . . 103

6.2.2 Evaluation 2 - Influence of a rule engine approach to autonomic management enforcement . . . 109

6.2.3 Evaluation 3 - Declarative management policies approach to autonomic management . . . 113

6.2.4 Evaluation 4 - System performance . . . 116

6.3 Summary . . . 121

7 Summary 123
7.1 Contribution to the Cloud-native area . . . 123

7.2 Future works . . . 125

Glossary 133

Bibliography 134


Introduction

Cloud computing [1, 2] refers to the concept of exposing computing technologies on the Internet. The Cloud is a metaphor for an Internet-based network, represented on topology diagrams by a cloud symbol that disregards the complexity of communication techniques. The concept of Cloud Computing assumes the existence of three basic elements: so-called thin clients [3]1, distributed processing infrastructures (Grid) [4, 5] and utility computing [6–8]. The aim of the Grid computing model is to serve computing resources on demand. This is done by dedicated servers connected with high-speed networks located in datacenters. Datacenters are equipped with redundant power supplies and cooling devices, and also with solutions enhancing computing reliability and data storage. Grid computing leverages virtualization techniques [9] as a means of effective resource management. Virtualization can be viewed as part of a trend in enterprise IT. The first mentions date back to the 1960s and concerned only the virtualization of a single computer's resources. Nowadays the meaning of this term has broadened. Virtualization concerns many aspects, such as hardware [10], storage [11], network [12], operating system [13], etc. The aim of each aspect is to simplify the management of the application and its environment.

Grids and the technologies that they use2 are standardized by the Open Grid Forum (OGF)3 community. The third important element is utility computing, that is, paying only for the used resources and the length of the usage. The Grid infrastructure provides services for coordinated usage of utility resources [14].

Cloud Computing (CC) environments are constructed to host applications. These follow the guidelines of the Twelve Factor App4. The specified best practices help to avoid

1 Services are available Anywhere, Anytime and from Anything.
2 E.g. The Simple API for Grid Applications (SAGA), Job Submission Description Language (JSDL), Open Grid Services Architecture (OGSA).
3 http://www.ogf.org
4 https://12factor.net


methods used in traditional client-server programming (e.g. writing to a local filesystem). A new architectural style known as Cloud-native has been established to augment the Twelve Factor patterns of designing modern applications. On-premise applications are simply migrated to the Cloud environment, as opposed to new applications built in a Cloud-native manner, fully exploiting the CC model. Cloud-native is an approach to developing and deploying applications according to the concepts of:

• DevOps – a clipped compound of the words development and operations. Its main characteristics are automation and monitoring of each step of the software development life cycle, its deployment and runtime. It is founded on collaboration between development and operations teams so that they are more efficient, innovate faster, and deliver higher value to businesses and customers [15]. The RightScale5 company survey6 describes and illustrates, with many diagrams, current trends in DevOps. It also shows the level of adoption of appropriate techniques among research centers and enterprises.

• Continuous Integration/Continuous Delivery (CI/CD) – refers to the practices of Continuous Integration (CI) and Continuous Delivery (CD). The CI approach assumes merging the developer teams' (namely the DevOps teams described earlier) work into a shared repository very frequently (several times a day) to prevent integration obstacles. CD, in turn, means continuously releasing the application changes, without waiting for the whole bundle to be ready before committing every change. In this way the developer can get feedback frequently and react faster. More explanation of the CI/CD approach is given on page 13.

• Containers – offer a perfect execution environment for microservices. A container is a way of packaging all processes and libraries that make up an application. The packaging is standardized by the Open Container Initiative (OCI). Compared to traditional Virtual Machines (VMs), their creation and destruction impose low overhead. The virtualization is on the Operating System (OS) level rather than on the hardware level (Figure 2.4).

• Microservices – an architecture style derived from Service Oriented Architecture (SOA) [16], where the application is a collection of small services independent of each other. A microservice can be deployed, scaled and restarted without interfering with the work of other application components. It communicates with the environment through an HTTP API. Microservices decompose a monolithic application into small, independent components (a minimal sketch of such a service follows this list).
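To make the microservices concept concrete, the following is a minimal sketch of a single-capability service exposing an HTTP API, written in Java against the JDK's built-in com.sun.net.httpserver package; the endpoint name, port and response body are illustrative assumptions, not taken from this dissertation's implementation.

    // A minimal sketch of one microservice exposing one capability over HTTP,
    // using only the JDK. Endpoint, port and payload are illustrative assumptions.
    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class GreetingService {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            // One microservice (usually) exposes one capability; here: /greeting.
            server.createContext("/greeting", exchange -> {
                byte[] body = "{\"message\":\"hello\"}".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.start();
        }
    }

Each such small process owns one capability and talks over HTTP, which is precisely what an orchestrator later schedules, scales and restarts independently of the rest of the application.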

The tight integration of enterprise systems with the Cloud-native approach, which is undoubtedly attractive from the financial perspective, must not complicate resource management and utilization. This requirement is particularly critical if the management cost increases significantly.

5 https://www.rightscale.com
6 https://www.rightscale.com/lp/2017-state-of-the-cloud-report

Cloud-native leverages an open source software stack to deploy new applications as containers, and dynamically orchestrates those containers to optimize resource utilization [17]. However, the distribution over the network creates potential malfunctions in many more components, which are harder to detect. After deployment, the environment needs to be observed and analyzed with monitoring, logging and tracing tools in order to quickly diagnose problems and to do troubleshooting. The monitoring metrics are also helpful in other management tasks such as: reservation, brokering, provisioning, installation, deployment, scheduling, operations like start, stop, delete, and many more system functions. It should be emphasized that these functions are initiated not only by monitoring components but also through human guidelines. Crucial, however, is independence from administrator intervention.

Adaptive and autonomic management of computing resources is not a new problem in computer science. Figure 1.1 depicts the hierarchical structure of the adjectives automatic, adaptive and autonomic, sometimes erroneously used as synonyms. The arrow shows the direction of development of subsequent systems. Further up the hierarchy the set of system properties increases, achieving its maximum for autonomic systems.

Figure 1.1: Pyramid of dependencies between automatic, adaptive and autonomic systems

Automatic systems cover repeatable processes operated by a human or specifically programmed. The process is steered when to start or stop and what exactly to do. It operates within its presets and is predictable; nevertheless, automation saves time. Adaptive systems in turn cover processes that change their behavior based on the execution context. The process improves its performance through direct and indirect environmental impacts or feedback. On the other hand, autonomic systems cover more complex processes. The system possesses broad knowledge about its execution environment and can operate beyond its boundaries. The gained knowledge enables such systems to make decisions on their own and operate without human intervention. In autonomic systems, monitoring of environment parameters constitutes an inevitable part. These systems are often called 3A systems (Automatic, Adaptive and Aware) [18, 19]. Automatic and adaptive features are described above, whereas awareness is reached through monitoring functionality. Although Autonomic Computing (AC) was founded a long time ago, it is still a top research


challenge, especially in the context of other technologies. The present dissertation is an attempt to answer the question of how it is possible to manage the lifecycle of a Cloud-native application through techniques such as AC utilizing high-level policies, and also what the requirements of AC are in the context of Cloud-native. The research regards automation of Cloud-native environments in order to attain efficiency in resource delivery. The principal problem of each dissertation constitutes the research motivation and thesis statement. These are described in the following section. The next section shows the boundaries of the research. All is concluded with the outcomes of this dissertation.

1.1 Motivation and Thesis Statement

Introducing the thesis, aim and scope of the dissertation requires briefly characterizing fundamental notions important for this dissertation. These concepts regard the main components of the research domain, and they are: Cloud-native, Autonomic Computing (AC) and resource management. The Introduction chapter of this dissertation includes a short explanation of each term, with their thorough review in the following chapter (Technological Background and Related Work).

Designing new applications and rebuilding existing ones in accordance with Cloud-native guidelines is an indispensable move to stay up to date with the business. Cloud-native has already changed the vision of developing, deploying and operating applications. Although this trend has a positive overtone, some negative aspects and difficulties should be clearly stated.

The current trend in IT is to move workloads from bare metal to virtualized environments and then to containers, towards serverless. In response to this challenge microservices have risen. Enterprise microservice-based applications contain thousands of instances of containerized services. Such distributed, often geographically dispersed applications entail an additional management cost that businesses have to carry. At early stages administrators must tackle the heterogeneity of computing resource pools (diverse computing types, vendors, installed OSes or versions, diverse installed middleware, etc.), also to avoid vendor lock-in and the consequences of such an attitude. The bias towards diversity is seen also at higher levels in the application stack, where providers must deal with a variety of workloads (e.g., high performance computing) and their special requirements. An indispensable capability is on-demand self-service provisioning of IT infrastructure resources. Especially in the case of ever-growing environments, manual management processes have become inefficient. To still have a competitive offer for customers, providers must meet the agreed Service Level Agreement (SLA) at similar costs. In order to ensure internal and external requirements, many organizations implement strong governance practices and controls. These rules guard the compliance standards through policy-based management. Considering the above presumptions, it is hard not to violate the SLA requirements, especially when the


demands are still growing. The complication of basic management tasks has led to seeking different ways to accomplish these vital operations. The AC paradigm in the context of Cloud-native environments seems to be a remedy.

For decades, system components and software elements have been evolving to deal with the increased complexity of system control, resource sharing and operational management [20]. Developing components responsible for self-management adds the following autonomous characteristics to the system [21]: self-configuration, self-healing, self-optimization and self-protection. Such system architectures solve the overall complexity of resource management. The present dissertation proposes a self-management architecture model in a Cloud-Native Execution Environment by distinguishing autonomic element constructs. The model is then enriched with the ability to process declarative management policies.
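The following minimal Java sketch illustrates the general shape of such a self-management (control) loop: observations feed a knowledge store, a hard-coded threshold stands in for a declarative policy, and an action is executed when the policy fires. The metric, threshold and scaling action are illustrative assumptions, not the dissertation's actual autonomic element implementation, which is specified in later chapters.

    // A minimal, hypothetical self-management loop: monitor -> reason -> execute,
    // with shared knowledge. All names and values are illustrative assumptions.
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class AutonomicElement {
        // Knowledge: the latest observations, shared by all loop phases.
        private final Map<String, Double> knowledge = new ConcurrentHashMap<>();

        double monitor() {
            // Placeholder for real instrumentation, e.g. querying a metrics endpoint.
            return Math.random(); // pretend CPU utilization in [0, 1]
        }

        boolean reason(double cpu) {
            knowledge.put("cpu", cpu);
            // A declarative policy would live in a rule engine; hard-coded for brevity.
            return cpu > 0.8;
        }

        void execute() {
            // Placeholder for enforcement, e.g. asking an orchestrator for one more replica.
            System.out.println("scaling out, cpu=" + knowledge.get("cpu"));
        }

        public static void main(String[] args) {
            AutonomicElement element = new AutonomicElement();
            ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
            loop.scheduleAtFixedRate(() -> {
                double cpu = element.monitor();
                if (element.reason(cpu)) {
                    element.execute();
                }
            }, 0, 5, TimeUnit.SECONDS);
        }
    }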

Cloud-native is also about velocity. Application velocity is based on user feedback. Cloud-native uses velocity in the context of application development, but this term can successfully be applied within diverse planes. Such an attitude ideally fits the contracted SLA. The level-of-service definitions should be specific and measurable in each area. This allows the Quality of Service (QoS) to be benchmarked and, if stipulated by the agreement, rewarded or penalized accordingly [22].

Autonomic behavior of the system requires specification of the goal that should be achieved as a result of adaptation or autonomous process realization. Such a goal is usually expressed as a collection of QoS or Quality of Experience (QoE) values [23]. These metrics are available only if the system's components are observable. The Cloud Native Computing Foundation (CNCF) treats this feature as a fourth, non-obligatory step in its Cloud-native landscape [24]. Some tools addressing this area in Cloud-native environments have been developed (described in subsection 2.2.3). However, there is still an inevitable need to transform those metrics into executable actions to continuously improve customer experience (QoE) and Return on Investment (ROI). The aforementioned features are hard to achieve without solid information processing. KRR is the field of Artificial Intelligence (AI) [25] whose main aim is to process the gained information. The processing spans from data representation, used to formalize the surroundings and simplify the complexity of the system, to reasoning from knowledge, which aims at finding logic to automate the system's tasks and proactively mitigate possible SLA violations. This dissertation, apart from the aforementioned objectives, aims also at utilizing the gained knowledge while enforcing high-level policies.

The Introduction chapter briefly presented the concepts of Cloud-native, AC and resource management. However, their introduction was made separately. The novelty of this dissertation assumes that AC and resource management exist in the context of Cloud-native, hence the notion of autonomic management exploited further in the dissertation. It is defined as follows:


Autonomic management is the ability of a system to perform administrative tasks such as installation, configuration, optimization and maintenance in heterogeneous computing systems with little or no human intervention. The automation is achieved through high-level policies.

The thesis statement is as follows:

Cloud-native environments enriched with a declarative policy-based approach compliant with components' observability are an effective method of autonomic management realization.

where:

observability is the characteristic of a component describing its state. It concentrates on collecting metrics in order to enhance them with KRR capabilities. All this information is analyzed, and then a Cloud-native application is profiled accordingly. Tightly coupled with observability is monitoring, which supports collecting the current measurements, their transformation and visualization.

Cloud-native is a very fresh approach; however, it is the future of software development. It has not yet been tested with AC paradigms, although its usage in this context seems to be a natural step. It seems that after the philosophy of Cloud-native hardens, other technologies will be applied to it, particularly AC.

This dissertation overcomes the aforementioned inconveniences in the context of Cloud-native applications. To solve the identified needs, pointed out in this chapter, it proposes the concept of autonomic management of Cloud-native applications with special emphasis on the containerized nature of such environments.

1.2 Scope of Research

The concept of resource management in Cloud-native has a very broad meaning. It concerns many aspects, depending on the context. Low-level infrastructure management includes, among others: provisioning, orchestration, migration, tagging, discovery, etc. Containerization enforces tasks and capabilities such as: state management and scheduling, high availability and fault tolerance, security, networking, service discovery, continuous deployment, governance, etc. The former tasks are similar to those in Grid Computing or CC and are widely elaborated for those technologies [26–28]. The containerization tasks are mainly the responsibility of orchestrator strategies. However, supporting these management tasks is still a field of many research works. One of the solutions assumes data monitoring. Traditional monitoring is focused mainly on improving IT resiliency. In a Cloud-native environment, effective monitoring, exploration of raw data like metrics, logging, tracing and dynamic instrumentation of resources still


is a challenging issue. The need to understand an application's internal state is a common problem, and solving it delivers significant benefits for IT environments. Monitoring the data is beneficial for management systems, but, in contrast, observability enables enhancing the data with knowledge that facilitates mining relevant information.

This dissertation focuses on autonomic management of Cloud-native environments through observation of all their internal components. The observations must be done across all levels of the Cloud-native application stack. Only a global view of the system gives authoritative results that can be utilized by the system's components realizing autonomic management.

The proposed extension of current Cloud-native environments is based on knowledge processing capabilities to detect, correlate and respond to diverse events across multiple real-time data sources. The gained measurements, and the reasoning based on them, differ with the number and kind of components being instrumented. Cloud-native environments are a great mixture of types of components (computing resources, container engines, containers, orchestrators, etc.) whose observability is a must. The primary role of observing the components is to help match their expected quality with perceived quality [29]. The QoS metrics determine the established SLA that defines service parameters such as availability, performance, operation, etc. These parameters are an input for SLA enforcement processes. The current standard is to use high-level declarative policies to govern the system behavior. Moreover, these policies are formed in a Domain Specific Language (DSL) and natural language, filling the language gap between common users and IT specialists. Policies provide the right approach to efficiently and effectively address the requirements of Cloud-native environments.

Furthermore, this dissertation analyses several technological stacks and standards, and then chooses the ones utilized while creating the development environment for the proposed solution. The presented research is an enhancement of current Cloud-native environments with observability to support autonomic management with the declarative policy-based approach.

1.3 Thesis Outcomes

The main contributions of this dissertation are as follows:

• Critical analysis of the current knowledge in the area of Cloud-native environments and their autonomic management. The analysis lists the capabilities and limitations of existing solutions in the context related to this dissertation.

• Clarification of the boundaries of the Cloud-native execution environment. The proposed model contains all components constituting a Cloud-native application.


• View of the application management concept in Cloud-native environments based on observability data. The study presents both a high-level view with its upper bounds and a low-level view with its lower bounds. Meanwhile, the management actions that can be enforced on the Cloud-native application are listed.

• Conceptualizing mechanisms of declarative policy management in Cloud-native

environments. The policy-driven management is realized in conjunction with the capabilities offered by rule engines.

• Specification of an observability stack of Cloud-native applications. The stack assumes diversity of components that compose a Cloud-native application.

• Development of a blueprint for a Cloud-native autonomic element. The proposal takes into account the Cloud-native paradigms that feature its model. Hereinafter, the main building block, namely the control loop, is called the Cloud-native MRE-K loop.

• Design and implementation of the framework for Autonomic Management of Cloud-Native Applications (AMoCNA). The framework is driven by declarative management policies and internally led by the capabilities of autonomic elements. Enhancing the Cloud-native autonomic elements with reasoning over observation data proves the thesis statement.

• Evaluation of the proposed concepts, which are dedicated to improving the performance of Cloud-native applications. The implemented AMoCNA system is supportive in this case. The evaluation exposes the pros and cons of particular concepts in the context of autonomic management of Cloud-native applications.

1.4 Structure of the Dissertation

The present dissertation has been divided into the following chapters:

Chapter 1 introduces the aspects addressed in this dissertation. It describes the shortened history of moving to Cloud-native. The distinction between the terms automatic, adaptive and autonomic is also clarified. After a brief explanation of crucial concepts and the definitions important from this dissertation's perspective, it is possible to present the issues of this research. This includes the motivation and thesis statement, its scope and outcomes.

Chapter 2 reviews current standards and technologies related to designing modern CC environments, which constitute a crucial research area of this dissertation. This chapter describes the Cloud-native concept in detail and shows the features of a Cloud-native application and platform. This includes an explanation of container technology and its orchestration. Chapter 2 also contains a discussion


of the management of such environments, with emphasis on the AC paradigm. This information constitutes a strong foundation for further deliberation and research contained in subsequent chapters of the present dissertation.

Chapter 3 aims at proposing the concepts aligned with the thesis statement. The chapter starts with systematizing current knowledge about Cloud-native. The presented Cloud-native execution environment model is its effect. This model is further elaborated in subsequent concepts. Abstracting the resource management solves continuous evolution problems among components of the Cloud-native environment. This abstraction, bound with the AC paradigm, results in fully autonomic management of Cloud-native applications. The proposed concepts are intensively explored in the design and implementation phase of this dissertation, showing the urgency of the research area.

Chapter 4 fully elaborates the concepts developed in Chapter 3 to design the system for Autonomic Management of Cloud-Native Applications. This system, named AMoCNA, leverages the Cloud-native principles that are already seen in the microservices architecture of AMoCNA. The division of the framework into three logical parts: (i) the Cloud-native execution environment, (ii) Cloud-native Autonomic Computing, and (iii) the Cloud-native management policies part, clearly distinguishes the crucial capabilities of AMoCNA. This chapter ends with the main algorithms of AMoCNA, which are helpful in implementing systems of a similar class.

Chapter 5 proposes a mapping of the AMoCNA prototype by implementing the framework with the usage of concrete technologies. The chosen technologies conform to the concepts developed earlier. This chapter presents the details of the AMoCNA implementation with the chosen technology stack. The implementation of AMoCNA is of great prominence for proving the correctness of this thesis statement, not only theoretically but also practically.

Chapter 6 is the evaluation of the developed concepts and a verification of their conformance with the requirements listed in Chapter 3. It utilizes the AMoCNA prototype to conduct several test scenarios.

Chapter 7 summarizes and concludes the present dissertation and proposes ways of further development of the undertaken research.

The additional material contains a bibliography and a glossary with acronyms and definitions of terms used in this dissertation. Also attached is an appendix containing important Java code, Kubernetes Object definitions and some declared policies in the DRL language.


Technological Background and Related Work

This chapter is a summary of current knowledge in the field of aspects related to this dissertation. The main scope of this thesis concerns CC environments, with emphasis on their native applications. The concept of developing and deploying applications with this approach is shown. The following sections explain in detail the notion of a container and its orchestrator, and show the benefits of containerization. Containerization is the foundation of Cloud-native applications. The scope of this dissertation, apart from an architectural design, includes also a prototype implementation. Hence it is also crucial to choose and discuss standards and environments related to Cloud-native. The decision has impact on the architecture design and influences its features. It has been decided to use open source software. Its functionality does not stand out from commercial products, whereas it is publicly available and its usage for research purposes is uncomplicated [30].

Another important aspect is associated with the research related to resource management in Cloud-native environments. Particular attention has been put on building such systems. The related works are thoroughly analyzed, exposing the areas demanding more attention. Finally, the principles of Autonomic Computing are analyzed. AC, in this dissertation, is a crucial solution in Cloud-native environments. The in-depth analysis carried out aims at proving the novelty of this thesis.

The structure of this chapter is as follows. Subsequent sections describe technologies no less important for proving the correctness of the thesis statement. This dissertation includes also a working implementation prototype, hence the description will also be helpful while choosing an appropriate technology stack to implement the proposed prototype. The first section introduces the concept of being Cloud-native. Presented are characteristics that architectures designed in a Cloud-native manner should possess.


This section also resolves doubts concerning Cloud-native applications and Cloud-native platforms. Meanwhile, a brief description of various CC models is provided. Tightly coupled with Cloud-native are containers and their orchestrators. Their characteristics are given in the first section. This section includes also a non-exhaustive list of key characteristics gained by a container orchestration environment. The second section discusses the research related to resource management in Cloud-native environments. Resource management actions preserve the established SLA contract. The concepts proposed in this dissertation fulfill this requirement through policy-based management and KRR techniques. Both terms are explained in this section. Additionally, the section analyzes the popular solutions in this area, including their capabilities and limitations. The next section presents the concepts and methods of AC. This paradigm constitutes the fundamentals of the present dissertation. The extensive description is concluded with a figure of the properties of AC and points out the elements most valuable in the scope of this dissertation. Finally, the chapter is summarized.

2.1 Principles of Cloud-native

Cloud-native is a buzzword nowadays and stands for an approach to building and running applications that fully exploits the advantages of the CC delivery model [31]. Cloud-native gives new standards for developing and deploying applications. The twelve factor app7 patterns describe how a software-as-a-service application should be built. The patterns focus on speed, safety and scale. Speed means delivering value more quickly; safety is the ability to move rapidly but also maintain stability, availability and durability; and scale is the elasticity to respond to changes. The overall objective of being Cloud-native is to improve the speed of application delivery, its scalability and resilience, and finally to reduce the technical risk of its deployment [32]. To be Cloud-native means also to be agnostic: agnostic of the underlying infrastructure, of the OS, of the programming language, etc. Such an attitude prevents vendor lock-in; the developer treats all resources simply as cloud-based.

CNCF is an open community formed for the standardization of Cloud-native computing, for the creation of Cloud-native communities among the industry's top developers, end users and vendors, and thirdly for collecting open source software dedicated to making cloud native computing universal and sustainable. The foundation is vendor-neutral, fostering deployment of the fastest-growing projects on GitHub, including Kubernetes8, Prometheus9 and Envoy10.

7 https://12factor.net
8 https://kubernetes.io
9 https://prometheus.io
10 https://www.envoyproxy.io


The CNCF definition of Cloud-native is as follows:

Definition 1.

Cloud-native computing uses an open source software stack to deploy applications as microservices, packaging each part into its own container, and dynamically orchestrating those containers to optimize resource utilization.

CNCF also provided a trail map, which is an overview of the process of moving towards a Cloud-native architecture [24]. The main identified and obligatory steps are:

1. Containerization – the process of exposing an application as a container. Containers are described in detail in the following subsection 2.1.2. Tightly connected with this notion are microservices. They focus on splitting the entire application into smaller components representing particular functionality. Cloud-native is built around the concept of loosely coupled microservices [33]. The proliferation of the microservices architecture started in 2014 with the publication of Martin Fowler's article [34]. Like SOA, it is an architectural concept independent of any technology and vendor. Therefore it is not standardized, but there are some common characteristics around the development and deployment of applications compliant with this architecture. The microservices style of programming differs from the old style of programming monolithic applications. Monolithic applications are usually built with the MVC11 pattern in mind. Changes to the system require building and deploying a new version of the whole application. Microservices represent the decomposition of monolithic systems into independently deployable services [36]. One microservice is (usually) responsible for exposing one capability. Table 2.1 shows the differences between the concepts of a monolithic application and the microservices style. The comparison favors microservices for larger systems; however, the complexity of such systems increases. As shown in [37], the increased cost of modern application style concepts is illusory.

2. Continuous Integration/Continuous Delivery (CI/CD) – a philosophy that fosters flexible and agile development, deployment and operations of all kinds of software. Continuous Delivery and its precursor, Continuous Integration, are practices in software development. In the CI [41] approach, each new part of the software is integrated frequently; at least, each team member integrates every day by committing code to a controlled source code repository. Potential errors are detected as quickly as possible and should be fixed fast. This process relies on automated builds and tests. CI includes a constant feedback loop from the end users, thus not only supporting error detection but also the end users' satisfaction level. Feedback allows attention to be put on key areas of the software and all rising requirements to be met. A proper CI technique runs builds in multiple environments.

11 The Model View Controller pattern assumes the existence of three main parts of the system [35], thus its name.


Feature              | Monolithic Architecture | Microservices Architecture
Architecture concept | All functionality is in a single process | Each element of functionality is separated into a service
Programming language | Rather one language | Each microservice can be written in a different language
Data storage         | One solution | Each microservice has its own database
Networking style     | Network services are based on hardware and configured specifically for the application usage | Network services are represented by Software Defined Networking (SDN) [38, 39] components
Scalability          | The whole application is replicated on multiple servers | Only those services that need it are replicated
Communication style  | The communication mechanism is complex and carries logic (e.g. an ESB bus includes message routing, choreography, transformations, business rules [40]) | Microservices communicate over lightweight protocols such as HTTP and JSON, as opposed to complex ones, e.g. BPEL
Responsibility       | Managed by one single team | Can be managed by many teams
Maintenance          | A change made to a small part of the application requires the entire monolith to be rebuilt and deployed | The entire application is not rebuilt and deployed; updates are made only on the microservices requiring them

Table 2.1: Comparison of characteristics of the monolithic application concept with the microservices concept

CD [42], on the other hand, is a complementary technique to CI. In the CD approach the software can be released to production at any time. CD is a compound of CI techniques with deploying the software in environments similar to production. CD is often confused with Continuous Deployment. CD only assumes that the software can be deployed into production at any time. Executables are built and tested, but not necessarily deployed as frequently (e.g. due to business policies). If an unsatisfactory release is shipped, it can be quickly upgraded. Hence, the importance of feedback from the testers' team can be noticed. The key to CI/CD operation is the automation of the process of software delivery. This automation requires scripting, testing and configuration tools. The most popular open-source technologies in these areas are:

• Build management – Apache Ant, Apache Maven, Gradle.
• CI – Jenkins, Bamboo.


The RightScale survey12 describes and illustrates, with many diagrams, current trends in DevOps.

3. Orchestration – the process of automation, coordination and management of certain IT tasks. Starting more and more containers and containerized applications, divided into hundreds of pieces, complicates their management and orchestration [43]. An orchestrator logically orders the loosely coupled microservices into a dependency graph and organizes their deployment. The proliferation of containers raised the need for their orchestration. Container orchestrators can control containers remotely. Orchestrators help users build, scale and manage modern applications and their dynamic lifecycles. They solve potential issues that might arise among containers run across many machines. This includes high availability, scaling, replication, fault tolerance and isolation. These orchestrators allow dynamic management of infrastructure, namely exposing infrastructure as programmable entities. Orchestrators are described in detail in subsection 2.1.3.

The remaining steps are optional. From the list, only the Observability & analysis step is given a brief description, for the reason that it is proposed in this dissertation to be a means of supporting autonomic management aspects among Cloud-native applications.

4. Observability & analysis – a concept which originated in Control Theory [44]. In compliance with that theory, a system is said to be observable if its current state can be determined in finite time using only the outputs. Observability is a characteristic of the system; it is a noun. Monitoring, in contrast, is the action of gathering the measures; it is a verb. The measurements of the overall microservice performance determine the application's QoS metrics. To attain negotiated SLA parameters the system has to properly externalize its state through instrumentation techniques. This optional step constitutes a research area of this dissertation. Observability is often defined as consisting of: (i) logging, (ii) metrics and (iii) tracing. CNCF advises using its projects Prometheus13 for metrics, Fluentd for logging and Jaeger14 for tracing. This dissertation focuses on collecting and then analyzing the metrics, hence the usage of Prometheus in the concept evaluation phase (Chapter 6).
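As an illustration of externalizing state, the sketch below serves a single counter in the Prometheus text exposition format from a /metrics endpoint, using only the JDK. The metric name and port are illustrative assumptions; a production service would normally use an official Prometheus client library instead.

    // A minimal observability sketch: a /metrics endpoint in the Prometheus
    // text exposition format. Metric name and port are illustrative assumptions.
    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.atomic.AtomicLong;

    public class MetricsEndpoint {
        private static final AtomicLong SCRAPES = new AtomicLong();

        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(9100), 0);
            server.createContext("/metrics", exchange -> {
                String body = "# HELP app_metrics_scrapes_total Times this endpoint was scraped.\n"
                            + "# TYPE app_metrics_scrapes_total counter\n"
                            + "app_metrics_scrapes_total " + SCRAPES.incrementAndGet() + "\n";
                byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, bytes.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(bytes);
                }
            });
            server.start(); // Prometheus can now scrape this endpoint periodically.
        }
    }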

Following the Cloud-native guidelines, the notions of Cloud-native applications and Cloud-native platforms come into being. The difference between those two concepts is subtle, but important for this dissertation. Both notions are distinguished in the further part of this section.

12 https://www.rightscale.com/lp/2017-state-of-the-cloud-report
13 https://prometheus.io
14 https://www.jaegertracing.io


2.1.1 Cloud-native application (CNApp)

A Cloud-native application, abbreviated further as CNApp, is developed and deployed in compliance with the standards enumerated on page 13. The enumeration is composed of existing and new software development patterns. The existing patterns include the CI/CD programming style as software automation (infrastructure and systems), API integrations, and SOA requirements. On the other hand, the new Cloud-native patterns include containerization, orchestration of those containers, and the microservices architecture. Figure 2.1 is a summary of the main building blocks of being Cloud-native. This example

Figure 2.1: Transformation chain towards a Cloud-native application

takes into consideration the steps necessary to achieve a Cloud-native application. The left arrow shows the direction of transformations. Each subsequent layer proceeds from the prior layer, getting the entities mentioned in the Cloud-native computing definition (Definition 1). The bottom layer executes the containerization process. In effect this layer consists of containers responsible for proper functionality. The Orchestration Layer consists of entities (orchestrators) that manage the containers. An orchestrator schedules the allocation of the containers among the provisioned infrastructure. Figure 2.1 depicts only the network management capabilities of orchestrators. The description of a container's orchestrator can be found in subsection 2.1.3. In Figure 2.1 there are two overlay networks: (i) vlan10 and (ii) vlan20. Container c1 is attached to both networks. There are three


replicas of c1, all attached to both networks and deployed according to scheduling rules on those nodes that fulfill, among others, the resource utilization requirements. These containers, with the definition of their configurations, comprise microservices (depicted as colored cuboids at the third level in the hierarchy of a Cloud-native application). Microservices are then collected together and configured according to a file. This grouping realizes some (or all) functionality of an application. The last important link is the applications composed of microservices connected and working together. In the picture they are depicted as solid figures composed of cubes. Fulfilling all requirements concerning resource availability and management becomes another, often tedious, challenge. Rather than focusing on proper resource management, a Cloud-native application can leverage a CC platform (as depicted in Figure 2.1). Microservices can benefit from CC platforms in different fields, from authentication and authorization mechanisms through the acquisition of computing, storage and network resources. Cloud-native computing utilizes the advantages of the CC model. A CC platform allows concentrating mainly on the business functionality of a Cloud-native application. Its importance for the Cloud-native environment justifies the following description of CC basics.

CC environments

The research area of this dissertation focuses also on modern ways of modeling and provisioning resources in a CC environment. A CC platform, as was pointed out in the previous subsection, has vital functionality for Cloud-native applications. This subsection characterizes the aforementioned computing style and analyzes the available solutions. CC history is known from archive magazines such as Computer Weekly15 and Forbes16, various blogs, or books [45–47]. There are many definitions of CC, and the majority of them have common parts. However, the most complete and compact definition was proposed by NIST17 [48]:

Definition 2.

Cloud Computing (CC) is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. The CC model is characterized by: on-demand self-service, broad network access, resource pooling, rapid elasticity and measured services. The three service models are: (i) Infrastructure as a Service (IaaS), (ii) Platform as a Service (PaaS), (iii) Software as a Service (SaaS). The CC model can be deployed as a private cloud, community cloud, public cloud or a hybrid cloud.

15 http://www.computerweekly.com
16 http://www.forbes.com
17 https://www.nist.gov


The boundaries of the service models are blurred. E.g., SaaS platforms cannot exist without a solid PaaS, and PaaS platforms require advanced infrastructure concepts offered in the IaaS model. Being Cloud-native means fully exploiting the CC model: its infrastructure, middleware, authentication and authorization, logging, network management, and many more, depending on the platform offerings. Among CC models these services are offered by the IaaS and PaaS models. From PaaS platforms derive Cloud-native platforms, which significantly enhance the development of Cloud-native applications. This is shown in the previous subsection and elaborated in the following subsection.

Figure 2.2: Division of responsibility of resource management among CC models.

Cloud-native platforms

Another important group of Cloud-native entities are Cloud-native platforms [49, 50], which are based on CC platforms. Cloud-native platforms accelerate the development of Cloud-native applications. They abstract the underlying infrastructure and provide the aforementioned Cloud-native capabilities. Cloud-native platforms are beneficial to (i) developers, as they can focus only on the application's core aim; they do not, e.g., think of the testbeds and are not dependent on other teams. The platform provisions the resources unnoticeably. Secondly, they are beneficial to (ii) application maintainers, as they gain control over the application lifecycle, provisioning, deployment, upgrades, and security patches. The application is just pushed to the platform, and using well-known tool chains, with little or no code modification, the application is deployed. The platform supports the application deployment workflow, automatically performing example steps: uploading external files according to a metadata description, choosing the appropriate execution environment (this term is explained in the next chapter), starting the additional services (e.g. streams of logs, messaging engines, caching components, databases, etc.). And finally, they are beneficial to (iii) application providers, as they influence the application's availability.

Cloud-native platforms center on the application while provisioning and maintaining the parts of the technology stack needed to achieve the Cloud-native aspect. They accomplish this by undertaking tasks that must be done (e.g. container orchestration or scheduling) but that are not directly related to the development of the application. Thus using a Cloud-native platform is justified. The benefits of the usage of a Cloud-native platform include,


but are not limited to:

• Provisioning computing resources as VMs, bare metal servers, etc.
• Provisioning storage resources as databases, disks, etc.

• Provisioning network as SDN controllers, load balancers, virtual switches, service discovery, etc.

• Creating, scheduling and orchestrating containers.

• User and Role-based Access Control (RBAC) management.
• Monitoring and auditing capabilities.

• Aggregating logs.

• Multidimensional scaling.

• Providing High Availability (HA), fault tolerance and resilience.

In the above itemization, the first three items are obligatory in CC platforms, namely in IaaS. In Figure 2.3 they are depicted as virtual and physical resources. The next items (apart from the last one) are realized through configuring and installing additional software components as services or plugins. Often these steps are delivered in the PaaS model with the fulfillment of IaaS capabilities. In Figure 2.3 they are depicted as an execution environment for a microservice with additional services. An execution environment guarantees application runtime, configuration and resource tuning. The execution environment, namely a container, does not have exclusive access to resources (depicted as dashed lines at the bottom). On the other hand, good practices dictate having a separate container for each microservice. The last item is realized by CC platform features and by third-party solutions having nothing in common with CC models. This item is realized while designing the topology of the system.

Figure 2.3: Structure of a Cloud-native platform

Cloud-native platforms have all the features of CC platforms. They expand the feature set with reliability and predictability over the CC-based infrastructure. Resiliency is an essential component of continuity. In CC it is achieved through the concept of availability zones, which separate the underlying infrastructure. This mechanism provides the ability to place resources, such as instances and data, in multiple locations, hence to duplicate the resources in different locations. Other mechanisms supporting resiliency include restarting unresponsive VMs and containers, dynamic routing and load balancing traffic amongst availability zones. In traditional CC


there is no guarantee concerning the reliability of the infrastructure. In modern networks HA is achieved through device and connection redundancy. The network needs to have a proper alternative path selection technique to take advantage of additional equipment and the links between them. There also exist software solutions to this problem, namely clusters. The clustering mechanism is utilized by orchestrators too. It is introduced in subsection 2.1.3.

The above thorough analysis of the Cloud-native platform's structural and behavioral models has shown its profits for Cloud-native systems. Yet, one platform has to be chosen for this dissertation's prototype.

Summary

Cloud-native is a very fresh trend but has one of the fastest adoption rates of any new approach. The first notices about it are dated at the very end of 2016. R&D communities have published as yet only a couple of publications concerning this technology [51, 52]. However, the papers show that Cloud-native is still in the incubating phase. It has to be stressed that there does not exist one accepted and widely used Cloud-native definition. The CNCF community aims at systematizing the knowledge in the Cloud-native field. This dissertation is grounded in CNCF ideas. The Cloud-native concept is still evolving and its products are rapidly developed. Such immaturity imposes many challenges that need to be solved. Some of them are conceptualized in Chapter 3 of the present dissertation.

2.1.2 Basics of the containers

Containerization is the process of exposing an application as a container. This dissertation's research is done among such applications. Knowledge of container mechanisms is crucial to understand the developed architecture.

The concept of containers has been known for decades (e.g. Solaris Zones18); it was introduced in the Linux operating system a long time ago. Containers are based on Linux kernel features:

• Cgroup – namely control group, organizes processes in a hierarchy and allocates system resources to that hierarchy. With cgroups, resource utilization is configurable.

• Namespaces – define the resources that are available to the process. Namespaces allow a process to treat a global resource as if the process had its own isolated instance of that resource [53].

• Filesystems – are mounted to the process in a read-write or read-only manner. The operations are guarded by a certain namespace.
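As a small illustration of the cgroup feature, the sketch below creates a cgroup and caps its memory by writing to the cgroup filesystem. It assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup, the memory controller enabled in the parent's cgroup.subtree_control, and root privileges; the group name and the limit are illustrative assumptions.

    // A minimal sketch of resource limiting with cgroup v2, driven from Java by
    // writing to the cgroup filesystem. Assumes cgroup v2 at /sys/fs/cgroup,
    // the memory controller enabled in the parent, and root privileges.
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class CgroupDemo {
        public static void main(String[] args) throws Exception {
            Path group = Path.of("/sys/fs/cgroup/demo");    // illustrative group name
            Files.createDirectories(group);                 // creating the directory creates the cgroup
            Files.writeString(group.resolve("memory.max"), "268435456\n"); // 256 MiB cap
            long pid = ProcessHandle.current().pid();
            // Moving a process into the group subjects it to the limit.
            Files.writeString(group.resolve("cgroup.procs"), pid + "\n");
        }
    }

Container runtimes perform essentially these writes (plus namespace setup and filesystem mounts) on the application's behalf.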


In 2015 the OCI was established. Its mission is to build a vendor-neutral, portable and open specification and runtime for containers. The OCI currently contains two specifications: the Runtime Specification and the Image Specification [54]. The agnostic container image is defined as follows:

Definition 3.

A container image is a lightweight, stand-alone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries, settings [55].

From the technical point of view a container image is just a set of serialized file systems with some configuration and metadata. Deploying it (running it as a container) means mounting the filesystem in a namespace. The containers' aim is similar to VMs, but they are

Figure 2.4: VM versus Container. (a) VM virtualizes hardware: Figure 2.4a depicts 3 VMs running on top of a hypervisor. (b) Container virtualizes OS: Figure 2.4b depicts 3 containers running on top of a container runtime inside a bare metal server.

more portable and efficient [56]. Figure 2.4 shows that containers abstract the operating system level while VMs are an abstraction of the physical infrastructure. Each VM includes a full copy of an operating system with all binaries and libraries. A VM image takes up a couple of GB, while a container's image takes up only a couple of MB. Containers share the OS with other containers but run as isolated processes. VMs are slow to boot; on the contrary, containers start almost instantly. However, in the VMs' favour is the fact that they isolate processes better and solve the security issues more maturely [57]. The proposed dissertation's infrastructure benefits from combining containers and VMs (Figure 2.5), achieving greater flexibility in deploying and managing applications. On the basis of the above description, containerization is:

• Lightweight – containers share the OS kernel, start much faster, and use a fraction of the memory compared to booting an entire OS [58].

• Portable – builds are local but containers can be deployed and run anywhere, on any operating system.


Figure 2.5: Standalone containers versus containers run inside VMs. (a) Standalone containers: Figure 2.5a depicts 3 containers running on a bare metal server. (b) Containers run inside VMs: Figure 2.5b depicts 4 applications. Starting from the left, AppA and AppB run in two separate containers in a single VM. The next application (containerized) runs in another VM, and the last application runs directly in a VM.

• Efficient in resource utilization – regarding the fast container start and further resource sharing.

• Isolation aware – not only among containers but also from the underlying infrastructure. The application's needs are limited to the container boundaries instead of the whole machine.

• Independent – different containers can run on the same host without the knowledge of the others.

• Modular – monolithic applications are split into discrete units.

• Open-source – most container runtimes and images are free.

The most popular and widely used container technology is Docker. The launch of Docker in 2013 transformed application standards and management. Other examples of container technologies are CoreOS' rkt (Rocket)19 or Cloud Foundry's Garden (Warden). Commercial examples are VMware Photon Platform, vSphere Integrated Containers, Microsoft's Windows Server containers, Hyper-V containers or VMware ThinApp. There are lots of container runtimes; however, Docker is the absolute leader20. This justifies the usage of Docker in further research. Choosing this container runtime has an impact on this dissertation's concepts. Therefore the further part of this dissertation focuses on Docker containerization.

19 https://coreos.com/rkt/
20 A survey by DevOps.com and ClusterHQ: https://clusterhq.com/assets/pdfs/state-of-container-usage-june-2015.pdf


2.1.3 Orchestrating the containers

The term orchestration is rooted in music, where an orchestra [59] is a large instrumental ensemble typical of classical music, which mixes instruments from different families. Orchestras are usually led by a conductor who directs the performance. The conductor unifies the orchestra, sets the tempo and shapes the sound of the ensemble. Transferring the above description to the IT field, the following definition is reached:

Definition 4.

An orchestrator is a workflow management solution. An orchestrator enables automating the creation, monitoring, and deployment of resources in the environment [60].

On the other hand, there is the verb scheduling. To schedule a container means to allocate the resources required by workloads. The scheduling decisions are compliant with defined scheduling policies. Scheduling a container is one of the orchestrator's functionalities. Vital in the Cloud-native environment is the relation between the Container Engine and the Orchestrator. A Container Engine meets the needs of managing a single container on a single computing resource, while an Orchestrator abstracts the infrastructure resources (computing, network and storage) to allow end-users to treat the entire cluster as a single deployment target [61]. The cluster management and orchestration features are an additional set of tools built on top of the Container Engine. Orchestrators are the primary container management interface of distributed deployments. They refer to the act of container scheduling, cluster (described below) management, and also the provisioning of additional hosts.
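A deliberately naive sketch of that scheduling step follows: place a container on the first node whose free CPU and memory fit the container's resource request. Real orchestrators combine far richer policies (affinity, spreading, taints, priorities); the Node and Request shapes are illustrative assumptions.

    // A deliberately naive scheduler: the first node whose free CPU and memory
    // fit the request wins. Node/Request shapes are illustrative assumptions.
    import java.util.List;
    import java.util.Optional;

    public class NaiveScheduler {
        record Node(String name, int freeMilliCpu, int freeMemMib) {}
        record Request(int milliCpu, int memMib) {}

        static Optional<Node> schedule(List<Node> cluster, Request req) {
            return cluster.stream()
                    .filter(n -> n.freeMilliCpu() >= req.milliCpu()
                              && n.freeMemMib() >= req.memMib())
                    .findFirst();
        }

        public static void main(String[] args) {
            List<Node> cluster = List.of(
                    new Node("node-a", 200, 256),
                    new Node("node-b", 1500, 2048));
            // A container asking for 0.5 CPU and 512 MiB lands on node-b.
            System.out.println(schedule(cluster, new Request(500, 512))
                    .map(Node::name).orElse("unschedulable"));
        }
    }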

The environment wherein the containers, together with their orchestrator, run forms a cluster:

Definition 5.

A cluster consists of two or more nodes collaborating to accomplish a particular task. A node stands for a single computing resource.

There are four main reasons for creating clusters, each resulting in a cluster type:

Storage cluster assures filesystem consistency [62] among servers performing read and write operations in parallel. The purpose of this cluster type is to simplify file administration by limiting the number of file copies and upgrades. Additionally, a storage cluster eliminates the need to duplicate application data, simplifies the creation of backups and helps to recover from failures.

HA cluster [63] assures a minimum amount of downtime by eliminating any Single Point of Failure (SPF) and by migrating the workload from a failed node. The preservation of data integrity during migration must be emphasized. This cluster type is also known as a fail-over cluster.

Load balancing cluster [64] distributes the workload between different cluster nodes. Depending on the current load, the number of cluster nodes can be adjusted to satisfy the QoS requirements. The failure of a node is not noticed by end-users; in such situations the traffic is directed to other nodes (a minimal sketch of this behavior follows the list).

High performance cluster [65] uses its nodes to accomplish computing tasks in parallel. The idea is to aggregate computing power to deliver higher performance. Such a solution is exploited to deal with computationally demanding problems.
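As announced above, the following toy round-robin balancer (all names illustrative, not a production design) sketches the load-balancing behavior: traffic silently moves away from a failed node, so end-users do not observe the failure.

import itertools

class RoundRobinBalancer:
    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._ring = itertools.cycle(self._nodes)
        self._healthy = set(self._nodes)

    def mark_failed(self, node):
        self._healthy.discard(node)   # health checks would drive this

    def next_node(self):
        # Skip failed nodes; clients only see traffic move elsewhere.
        for _ in range(len(self._nodes)):
            node = next(self._ring)
            if node in self._healthy:
                return node
        raise RuntimeError("no healthy nodes left")

lb = RoundRobinBalancer(["node-1", "node-2", "node-3"])
lb.mark_failed("node-2")
print([lb.next_node() for _ in range(4)])
# ['node-1', 'node-3', 'node-1', 'node-3']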

Mapping the aforementioned definition to the Cloud-native environment, the main purpose of grouping hosts (bare metal or VMs) into an orchestrated environment is the accomplishment of a given task by microservices. Nodes in these environments are cluster components that have a container runtime installed. Orchestrators usually implement services that realize most of the features of the cluster types described above. The characteristics of Cloud-native environments described below, namely the orchestrator characteristics, mainly result from the clustering properties. Running containers as a cluster and orchestrating those containers broadens the environments' features (see the itemization on page 21) to:

• Manageability – the cluster state is managed from one place that is also responsible for container scheduling and placement. The tasks that orchestrators ensure out of the box are: (i) system monitoring, (ii) starting up or shutting down the containers, (iii) apportioning resource utilization among applications, (iv) load balancing between the active application instances, (v) sharing authentication secrets, and many more.

• Scalability – clusters can scale on demand, with support and synchronization across different hardware and operating systems.

• HA – clusters have the characteristic of always being available [66]. The system attempts to protect itself against noticeable failures and automatically tries to repair them.

• Fault tolerance – ensures that the system operates properly in the case of a failure in one of its components. To address this issue, cluster nodes are duplicated.

• Security – communication among cluster nodes is over TLS. The exchanged data can be encrypted.

The difficulties introduced by Cloud-native environments, some of which are mentioned in the above list, are overcome with orchestration solutions. Doing all the aforementioned tasks manually requires a lot of effort and is too slow to react to unexpected events. Developing a custom tool that automates these tasks is likewise a tedious challenge. Orchestrators enable a vast set of capabilities for distributed, hierarchical deployments. Moreover, the orchestrator simplifies runtime modifications driven by changing requirements. Their considerable significance for Cloud-native environments is hence evident.

The following sections explain the terms management and autonomic, showing at the same time their importance for Cloud-native environments and for this dissertation.

2.2 Basics of Resource Management

This section describes concepts related to the management of Cloud-native applications and the methods of its realization. Firstly, it explains the basic terminology. Then two aspects are emphasized, namely policy-based management and Knowledge Representation and Reasoning. The former declares the actions (what) that should be enforced, and the latter determines when the actions should be executed.

A Cloud-native application is treated in the present dissertation as an object being managed, utilising resources (computing, network, storage) that are managed indirectly. The PhD dissertation [10] thoroughly describes and defines the terms resource, resource management and effectiveness of resource management. Duplicating those definitions is needless; however, the resource management definition is quoted below, as resource management is the principal objective of this dissertation.

Definition 6.

Resource management [10] is a process resulting in a change of resource availability or its state. The resource might be influenced directly or indirectly by the change in its surrounding environment or the change in conditions of its operation.

A Resource Management System (RMS) acts as a middleware between resources (computing, network and storage) and the application's requirements. The requirements are negotiated in a contract and encapsulated in QoS metrics. The RMS seeks to maximize the resource metrics while attaining the negotiated SLA. The agreed parameters are usually included in high-level directives, called policies, that are a means of system management. The following subsection briefly characterizes this important term, namely a policy.
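The mediating role of an RMS can be sketched as a single step of a control loop that compares an observed QoS metric against the negotiated SLA and derives a resource action. The metric, thresholds and scaling bounds below are purely illustrative assumptions.

def reconcile(observed_latency_ms, replicas, sla_latency_ms=200,
              min_replicas=1, max_replicas=10):
    """One step of an RMS-style control loop: compare an observed QoS
    metric against the negotiated SLA and derive a resource action."""
    if observed_latency_ms > sla_latency_ms and replicas < max_replicas:
        return replicas + 1          # SLA violated: add capacity
    if observed_latency_ms < 0.5 * sla_latency_ms and replicas > min_replicas:
        return replicas - 1          # ample headroom: release resources
    return replicas                  # within bounds: no change

print(reconcile(observed_latency_ms=350, replicas=2))  # 3
print(reconcile(observed_latency_ms=60,  replicas=3))  # 2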

2.2.1 Policy-based management

The growth of the number of resources, users and connected devices increases the complexity of their management. It also increases the complexity of the network infrastructure and of the overall system architecture. It is desirable that management tasks execute autonomously, with minimal administrator intervention. The problem touches not only the devices, but also the means of communication, data replication, assuring acceptable latency, etc. Apart from autonomic management, the administrator should be able to perform management tasks from different places, and the management interface needs to take the administrator's context into account. Policy-based management aims to solve the aforementioned inconveniences, and in today's systems it is a standard and default technique. An undeniable convenience is the ability to change the system behavior without reimplementing it. Policies adjust the presets of managed elements at runtime; the changes are applied dynamically without modifying the underlying implementation. Providing for the above aspects, the policy is defined as follows:

Definition 7.

Policy is a set of limitations imposed onto the set of all possible system behaviors. As a result, a subset of all acceptable system behaviors is obtained [67]. The acceptable system behavior constitutes the specification of user demands. The system needs tools that help to acquire and represent policies — high-level specifications of goals and constraints, typically represented as rules or utility functions — and map them onto lower-level actions [68]. There are numerous definitions of policy (e.g. [67, 69]); all of them are, however, a formal behavioral guide.

The above definition can be simplified: a policy is reasoning on the basis of actual system parameters (e.g. memory usage, number of processes, etc.). This point of view deserves elaboration, especially as it has significant importance for the present dissertation. The following subsection explains the usage of rule engines as a tool for policy manipulation.
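Definition 7 can be mirrored almost literally in code: a policy is a set of constraints that filters the set of all possible system behaviors down to the acceptable subset. The following minimal illustration uses hypothetical actions and constraints.

# All actions the system could take in the current state.
possible_behaviors = [
    {"action": "scale_out", "cost_per_h": 4.0},
    {"action": "scale_in",  "cost_per_h": 1.0},
    {"action": "migrate",   "cost_per_h": 6.0},
]

# A policy: constraints that every acceptable behavior must satisfy.
policy = [
    lambda b: b["cost_per_h"] <= 5.0,      # budget constraint
    lambda b: b["action"] != "migrate",    # migration forbidden at runtime
]

acceptable = [b for b in possible_behaviors
              if all(constraint(b) for constraint in policy)]
print([b["action"] for b in acceptable])   # ['scale_out', 'scale_in']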

2.2.2 Knowledge Representation and Reasoning (KRR)

In every system, data play a crucial role; thus it is important to process them properly. KRR is a solution to this problem. The knowledge representation part of KRR refers to expressing knowledge in symbolic form, whereas the reasoning part regards drawing conclusions using this knowledge.

KRR capabilities can be delivered through rule engines. At a high level, rule engines are composed of three elements: ontology, rules and data. The reasoning is performed by the rules, or strictly speaking by the rule engine. A rule is a statement that specifies the execution of one or more actions in case its prerequisites are met [70].

Definition 8.

Rules are usually represented in the form of expressions:

if [list of requirements] then [list of actions]

The most common case [71] is to use rule processing (a rule engine) as a tool for implementing the policy decision point (according to the IETF/DMTF Policy Architecture [72]). Thus, rules are the expressions of system policies. Importantly, rule processing enables easy policy modifications, simplifying the development of policy management tools and policy storage in repositories. A rule engine also fulfills the strategy pattern objectives: at runtime, it selects the action to enforce depending on the processed policy.

Figure 2.6: A simplified view of a rule engine architecture.

Main elements of a rule engine are (Figure 2.6):

• Inference engine – matches the rules against the facts.

• Production memory – also called a rule base; it loads the rules and exposes them during system operation.

• Working memory – also called the set of facts; facts are asserted, modified or retracted here. Facts are units of information representing the current state of a given system component.

Rule engines [73] enable flexible creation and evolution of complex logic in the form of rule sets. Actions can modify the fact set, thus it is necessary to process the action set in a cascade-like style. An effective solution to this problem is proposed in [73], which introduces an algorithm that serves as a basis for many current rule engine implementations. The crucial advantage of a rule engine is its ability to operate in isolation from the code implementing the main system logic. The usage of rule engines in systems fulfills the following assumptions: separation of system logic from its implementation, and ease of changing the system operation policy without the necessity to recompile the application source. Existing rules realizing a given strategy do not need to be modified while extending the rule set. Adding and removing rules (policy management changes) can be done without stopping the system. Moreover, rules can be shared among different applications and organizations.
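The cascade-like processing described above can be conveyed by a naive forward-chaining engine. The toy implementation below (with illustrative rules and fact names) is in no way representative of real rule engine internals, but it shows production memory, working memory and the match-act cycle in which an action asserting new facts triggers further rules.

def run_engine(rules, facts):
    """Naive forward chaining: fire the first rule whose condition matches
    the working memory, apply its action, and repeat until quiescence.
    Real engines avoid re-matching everything on each cycle."""
    fired = True
    while fired:
        fired = False
        for condition, action in rules:          # production memory
            if condition(facts) and not facts.get(action.__name__):
                action(facts)                    # may assert new facts
                facts[action.__name__] = True    # fire each rule only once
                fired = True
                break
    return facts

def raise_alarm(facts):
    facts["alarm"] = True

def scale_out(facts):
    facts["replicas"] = facts["replicas"] + 1

rules = [
    (lambda f: f["mem_usage"] > 0.9, raise_alarm),
    (lambda f: f.get("alarm"),       scale_out),   # cascades off the first rule
]

facts = {"mem_usage": 0.95, "replicas": 2}         # working memory
print(run_engine(rules, facts))
# {'mem_usage': 0.95, 'replicas': 3, 'alarm': True,
#  'raise_alarm': True, 'scale_out': True}

Note how the second rule fires only because the first one asserted a new fact — this is the cascade that forward-chaining algorithms handle efficiently.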

Facilitation of policy specification, policy processing and the ability of runtime reconfiguration are only a subset of the core rule engine capabilities that make it a perfect tool to exploit in Cloud-native environments. Rule engines are the building blocks of the concepts developed in chapter 3 in the context of Cloud-native applications. Their principal role in these environments is to aid the fulfillment of policy-based management.


2.2.3 Technologies related to Cloud-native management

This section describes the management solutions and technology stacks widely adopted by Cloud-native environments, even in enterprise-grade systems. Their analysis will verify whether some of them could be useful while implementing the management enhancement in a Cloud-native system. It will also reveal their weak points, indicating areas worth concentrating on.

The foundation of resource management is monitoring metrics [74]. Application Performance Management (APM) is an IT technology that evolves as new technologies appear; it is based on the concept of monitoring. Traditional monitoring solutions take metrics from each server and the applications it runs. Cloud-native creates new challenges for monitoring: the metrics need to be correlated on several layers (Figure 3.5). Apart from its high distribution, the key difference from traditional monitoring is the dynamic nature of the environment [75]. A separate environment is provisioned for each container, each having its own virtual networks, its own storage access methods, and sharing CPU access with other competing processes. The diversity of metrics for cluster nodes and containers allows resource usage to be properly monitored. Resources are shared among containers according to specified CPU, IO and memory limits; however, it might be difficult to specify the limits properly without any support. The further part of this subsection briefly describes Cloud-native resource management tools, emphasizing those supporting Docker containerization. Their analysis will be used to select a monitoring tool useful while implementing the model extensions in the area of autonomic management of systems compliant with Cloud-native applications.
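Before surveying the tools, it is worth showing how such limits are expressed in practice. A minimal sketch with the Python Docker SDK follows; the image and the limit values are purely illustrative.

import docker

client = docker.from_env()

# Cap the container at 256 MiB of RAM and half a CPU core. Without such
# limits, a noisy container can starve its neighbours on the same node.
container = client.containers.run(
    "nginx:alpine",
    detach=True,
    mem_limit="256m",        # memory ceiling
    nano_cpus=500_000_000,   # 0.5 CPU (in units of 1e-9 CPUs)
)
print(container.attrs["HostConfig"]["Memory"])   # 268435456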

Firstly, the Prometheus22 platform should be enumerated. It is a CNCF recommended project, written mostly in Golang23. Prometheus is an open-source metrics collection and monitoring system with a multi-dimensional data model, a flexible query language over these data (PromQL) and an efficient time-series database. The central point is the Prometheus Server, responsible for the monitoring process of its Targets. Prometheus provides an extensive amount of instrumentation client libraries and data Exporters. The monitoring includes metrics exposed internally by the system, such as logs or statsd, as well as complex third-party solutions. The integration points to external systems are divided (and already provisioned) into categories: (i) databases (e.g. MySQL, Oracle, PostgreSQL), (ii) hardware related (e.g. IoT Edison, Node/system metrics), (iii) messaging systems (e.g. NATS, RabbitMQ), (iv) HTTP (e.g. Apache, Nginx), (v) storage (e.g. Ceph, Hadoop), (vi) APIs (e.g. AWS, GitHub, Mozilla), (vii) other monitoring (e.g. JMX, Nagios, Sensu), (viii) miscellaneous (e.g. JIRA, cAdvisor, Xen). A substantial exporter for this dissertation, and especially for containerized environments, is cAdvisor. cAdvisor24 is a common tool from Google that analyzes the performance characteristics of running containers.

22 https://prometheus.io
23 http://golang.org
24 https://github.com/google/cadvisor
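For completeness, the following minimal sketch shows how an application exposes custom metrics to Prometheus with the official Python client library (prometheus_client); the metric names and the port are arbitrary choices.

from prometheus_client import Counter, Gauge, start_http_server
import random
import time

REQUESTS = Counter("app_requests_total", "Total handled requests")
IN_FLIGHT = Gauge("app_requests_in_flight", "Requests being processed")

def handle_request():
    with IN_FLIGHT.track_inprogress():   # gauge rises, then falls back
        REQUESTS.inc()
        time.sleep(random.uniform(0.01, 0.1))  # simulated work

if __name__ == "__main__":
    # Prometheus scrapes http://<host>:8000/metrics; a PromQL query such as
    # rate(app_requests_total[5m]) then yields the request rate.
    start_http_server(8000)
    while True:
        handle_request()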
