Lecture 1

(1)

Lecture 1

Introduction to database systems

(2)

Basic database concepts

A database stores data in operating system files.

A database management system (DBMS) is a

special software for managing large amount

of data stored in operating system files.

(3)

Disadvantages of the simplest DBMSs based files:

 Problems in accessing data – users need special applications to operate on data stored in files.

 Problems with integrity constraints – it is required to implement mechanisms which allow to ensure that the data meets some conditions.

 Lack of concurrency control – one user can perform operations on data which are accessed by other

users at the same time.

 Data inconsistency – without special mechanisms a database can be left in an inconsistent state (e.g.

some integrity constraints can not be preserved, data entered) after user modifications.

 Lack of backup and restoring mechanisms –

database should posses mechanisms which allow to archive its content and restore backups in case of failure.

(4)

Contemporary database management systems provide the following features:

 Higher level of abstractions – DBMS hides details connected with physical data representation and storage operations.

 Efficient data access – DBMS provides diverse

mechanisms which allow to process data efficiently.

 Data integrity – DBMS allows to define variety of integrity constraints and it has mechanisms to enforce these constraints.

 Security – users can gain access to some (not to all) data through specific privileges or roles.

 Concurrent access – DBMS provides very

sophisticated mechanisms to allow users work simultaneously on the same date.

 Archiving and recovering mechanisms – DBMS is equipped with variety mechanisms to restore database from backup files in case of failure.

(5)

Architecture of database management system

The ANSI/SPARC architecture is the most known framework for database management systems.

Conceptual level

Internal level External level

User 1/

Application 1 ^...

User 2/

Application 2

User n/

Application n

Stored data

(6)

The external level represents the data model dedicated to specific users and

client applications. In relational databases it involves user schemas, privileges, tables, views, etc.

The conceptual level defines the data model for the whole enterprise, not only to

particular users. In particular it contains the database catalog and its schema.

The internal level concerns physical data

storage.

(7)

The classical client-server architecture is based on the ANSI/SPARC model.



Users connect to the server through a

network and send their requests (e.g. SQL queries).



The server sends then the answer back to

the clients.

(8)

The main components of DBMS.

 A query parser takes user queries to perform their syntactic and semantic analysis.

 The parsed query is passed to a query optimizer which examines various execution plans for this query.

 The plan with the lowest cost is presented to a query

evaluator which processes it and returns its result.

 The query evaluator make use of a resource manager. It defines how records are arranged in files and allows access to desired sets of records.

 A buffer manager is

responsible for moving pages from disk to memory buffers.

 It cooperates with a storage manager to obtain data from disk.

 A recovery manager is

responsible for preserving the effects of completed

transactions even if the system fails.

Storage (database files) Query parser

Query optimizer

Query evaluator

Resource manager

Buffer manager

Storage manager

Transaction manager Recovery

manager

(9)

User requests forms database transactions.

A transaction manager ensures atomicity and isolation of transactions.

Atomicity means that all operations of a

transaction must be

successfully completed;

otherwise the transaction is cancelled.

Isolation ensures that effects made by a

transaction are not visible to other transactions

Storage (database files) Query parser Query optimizer Query evaluator

Resource manager

Buffer manager

Storage manager

Transaction manager Recovery

manager

(10)

Relational database management systems (RDBMSs)



It was proposed in 1970 by Edgar Codd.



The model assumes that data are stored in relations which can be understood as two- dimensional tables with columns and rows.



The model also introduces the collection of operators (e.g. selection, projection, join) which act on relations and return also

relations.

(11)

Example

Assume that there is a need to store

information about departments and their

employees. The above information can be

kept in the following two relations (tables):

(12)

DEPARTMENTS

EMPLOYEES

(13)



The DEPT_ID column in both tables models the relationships between them.



The relationship is modelled by storing the values from a unique column (e.g.

DEPARTMENTS.DEPT_ID) of one table in a column of another table (e.g.

EMPLOYEES.DEPT_ID).



Relations can be considered as time-varying

sets of rows if the modifications operations are allowed on them. Hence the above two tables represent some states of the relations

EMPLOYEES and DEPARTMENTS. They are called

instances of these relations.

(14)

 The concept of a schema is introduced to describe the structure of relations.

 The schema involves the relation name and the names of its attributes (columns).

 It specifies also data types for these attributes. For example, the previously considered relations can have the following schemas:

DEPARTMENTS(DEPT_ID:integer,DEPT_NAME:string, CITY:string,STREET:string,

POSTAL_CODE:string)

EMPLOYEES(EMP_ID:integer,FIRST_NAME:string, LAST_NAME:string,

SALARY:float,HIRE_DATE:date, DEPT_ID:integer)

 The rows (tuples) of relations must confirm to their schemas. For example, the values of the DEPT_ID attribute of the DEPARTMENTS relation must be integers.

Schemas

(15)

Structured Query Language (SQL)

Each RDBMS provides a language to manipulate relations and their contents. The language which contemporary RDBMSs apply today is called SQL (Structured Query Language).

It has several dialects depending on the DBMS on which it is applied. SQL consists of the following:

 Data Definition Language (DDL) – it is applied to create, modify, and remove database objects (e.g.

tables, indices, views). DDL mainly involves CREATE, ALTER, and DROP statements.

 Data Manipulation Language (DML) – it is applied to retrieve, insert, update, and delete data from tables.

DML involves SELECT, INSERT, UPDATE, and DELETE statements.

 Data Control Language (DCL) – it allows to set and remove database privileges. DCL involves GRANT, DENNY, and REVOKE statements.

(16)

Example 1.1: Sample SQL query.

The query below retrieves information (first name, last name and salary) about employees earning more than 4000.

SELECT FIRST_NAME, LAST_NAME, SALARY FROM EMPLOYEES

WHERE SALARY > 4000

The execution plan for this can involve the following steps:

 examine all the rows of the EMPLOYEES relation which occurs in the FROM clause,

 choose only the rows which satisfy the selection conditions occurring in the WHERE clause,

 display the values of the attributes of those rows (attribute names are indicated in the SELECT

clause).

(17)

Main database objects

The following database objects are supported by contemporary RDBMSs:

 Table – it is a collection of rows in which each row has the same number of fields (columns).

 Constraint – it defines rules to enforce the integrity of a database.

 View- it can be considered a virtual table because its rows are not explicitly stored in the database. If a user query addresses a view then its rows are obtained by executing the select

statement which is associated with this view.

 Index – it is auxiliary (redundant) data structure that allows to speed up retrieval of data.

 Procedure – it is a saved collection of SQL statements which perform specific tasks.

 Function – it is a stored routine that must return a value.

 Trigger – it is a special kind of the stored procedure which automatically runs whenever a specific event occurs in the database.

(18)

Examples of RDBMSs



Microsoft SQL Server

https://www.microsoft.com/sqlserver



Oracle Server

www.oracle.com/database



DB2 www.ibm.com/db2



Sybase www.sybase.com



MySQL http://www.mysql.com/



PostgreSQL http://www.postgresql.org/

(19)

Other database management systems

Object-oriented database systems

An object-oriented database management system

(OODBMS) is a database management system which is based on object-oriented data model.

This model introduces concepts such as complex

objects, classes, attributes, relationships, methods, typing, persistence, inheritance, polymorphism etc.

Object-oriented databases (object bases) consists of objects. The relationships between object are

modelled by pointer objects.

(20)

Object-oriented concepts

Objects are abstract entities which represent things or notions occurring in the real world.

For example, the following entities are objects:

 an employee named John Smith,

 a bank account with the number 5683487554867,

 a car with VIN 678GH9056893.

Each object possess a unique internal identifier (OID), an external name (assigned by database designers) and a set of attributes. The state of an object is determined by the values stored in its attributes. Objects usually posses a set of methods (e.g. functions, procedures) which are applied to read and

modify their state.

Note that the concept of a primary key does not occur in an

object model. It is not necessary to add an additional attribute to an object which uniquely identifies it. In the object model the system automatically generates a unique internal identifier for each object. Internal identifiers are also used to model relations between objects. In the relational model a foreign key is applied to implement relationships between tables.

(21)

Classes

To capture the structure of objects the concept of classes is introduced.

A class is a container which holds information common to a specific group of objects.

The elements of the class are called invariants (e.g. attributes, methods).

An object assigned to a class is called an

instance of this class. All instances of a given class are equipped with its invariants.

The actual set of instances of the class is called

extent of this class.

(22)

Inheritance

An inheritance is a special type of relation between classes which introduces some hierarchy of classes.

One class C1 can be defined as a subclass of another class C2.

It means that instances of C1 will be equipped with all the variants of C2.

The class C1 can also stores additional invariants.

Multiple inheritance takes place if a class

inherits properties of at least two classes.

(23)

Example of classes hierarchy

Vehicle vin owneryear getYear()

getOwner() setOwner(...) displayDesc()

number_plateCar mark model

getNP() setNP(...) getMark() getModel()

displayDesc()

Bicycle

number_of_wheels type

getWheels() getType(...) displayDesc()

object object

object object object object

object object

i

(24)

Polymorphism

It is possible to equip an object with different methods which have the same names and signatures. Each such a method is the invariant of a specific class to which the object belongs. Polymorphism causes that sending a message to objects can result in different actions depending on the class of object which receives the message.

For instance, the class Vehicle contains a method

displayDesc() which display some information about its instances. The specialized classes Car and Bicycle also contain a method displayDesc() (each class posses own code of this method). When an object receives a message displayDesc() the system verifies to which class the object belongs. After detecting the object class it calls an appropriate method displayDesc().

(25)

Example of object-oriented database

The above class diagram (schema) introduces three classes:

PersonClass, EmpClass and DeptClass. Persistent instances of these classes can be accessed using their names Person,

Employee and Department. A Department object stores a collection (named employs) of references to the instances of EmpClass. An Employee object stores single reference (named worksIn) to the Department object.

employs

worksIn

EmpClass:Employee salary: float

hire_date: date DeptClass:Department

dept_name: string city: string

street: string postal_code: string

PersonClass:Person first_name: string last_name: string birth_date: date getFullName()

inherits

1 *

(26)

Each object possesses a unique internal identifier. Object identifiers will be

denoted by ij, j=1,2,...Three types of objects are used to model an object store for the above diagram: atomic, reference, and complex. They are modelled as the

following triples:

 atomic object as <i, n, v> - an object has the internal identifier i, the external name n and the atomic value v.

 reference object as <im, n, ip> - an object has the identifier im and the name n. The identifier ip is the value of this object and denotes a

reference (pointer) to another object.

 complex object as <i, n, S> - an object has the identifier im and the name n. S denotes arbitrary set of objects.

i₂₀ Department i₂₁ dept_name “Marketing”

i₂₂ city “Lodz”

i₂₃ street “Piotrkowska 145”

i₂₄ postal_code “90-024”

i₂₅ employs i₈ i₂₆employs i₁₅

i₈ Employee i₉ first_name ”John”

i₁₀ last_name ”Smith”

i₁₁ birth_date 23-02-1984 i₁₂ salary 6300

i₁₃ hire_date 11-07-2013 i₁₄ worksIn i₂₀

i₁₅ Employee i₁₆ first_name ”John”

i₁₇ last_name ”Smith”

i₁₈ birth_date 23-02-1984 i₁₉ salary 4800

i₂₀ hire_date 12-10-2013 i₂₁ worksIn i₂₀

i₃ EmpClass (invariants of EmpClass) i₂ PersonClass

(invariants of PersonClass) i₁ DeptClass

(invariants of DeptClass)

instance of instance of

instance of subclass of

(27)

Lecture 1

Lecture 1

Introduction to database systems

Basic database concepts

A database stores data in operating system files.

A database management system (DBMS) is a

special software for managing large amount

of data stored in operating system files.

Architecture of database management system

The external level represents the data model dedicated to specific users and

client applications. In relational databases it involves user schemas, privileges, tables, views, etc.

The conceptual level defines the data model for the whole enterprise, not only to

particular users. In particular it contains the database catalog and its schema.

The internal level concerns physical data

storage.

The classical client-server architecture is based on the ANSI/SPARC model.

Users connect to the server through a

network and send their requests (e.g. SQL queries).

The server sends then the answer back to

the clients.

Relational database management systems (RDBMSs)

It was proposed in 1970 by Edgar Codd.

The model assumes that data are stored in relations which can be understood as two- dimensional tables with columns and rows.

The model also introduces the collection of operators (e.g. selection, projection, join) which act on relations and return also

relations.

Example

Assume that there is a need to store

information about departments and their

employees. The above information can be

kept in the following two relations (tables):

DEPARTMENTS

EMPLOYEES

The DEPT_ID column in both tables models the relationships between them.

The relationship is modelled by storing the values from a unique column (e.g.

DEPARTMENTS.DEPT_ID) of one table in a column of another table (e.g.

EMPLOYEES.DEPT_ID).

Relations can be considered as time-varying

sets of rows if the modifications operations are allowed on them. Hence the above two tables represent some states of the relations

EMPLOYEES and DEPARTMENTS. They are called

instances of these relations.

Schemas

Structured Query Language (SQL)

Example 1.1: Sample SQL query.

Main database objects

Examples of RDBMSs

Microsoft SQL Server

https://www.microsoft.com/sqlserver

Oracle Server

www.oracle.com/database

DB2 www.ibm.com/db2

Sybase www.sybase.com

MySQL http://www.mysql.com/

PostgreSQL http://www.postgresql.org/

Other database management systems

Object-oriented database systems

Object-oriented concepts

Classes

To capture the structure of objects the concept of classes is introduced.

A class is a container which holds information common to a specific group of objects.

The elements of the class are called invariants (e.g. attributes, methods).

An object assigned to a class is called an

instance of this class. All instances of a given class are equipped with its invariants.

The actual set of instances of the class is called

extent of this class.

Inheritance

An inheritance is a special type of relation between classes which introduces some hierarchy of classes.

One class C1 can be defined as a subclass of another class C2.

It means that instances of C1 will be equipped with all the variants of C2.

The class C1 can also stores additional invariants.

Multiple inheritance takes place if a class

inherits properties of at least two classes.

Example of classes hierarchy

Polymorphism

Example of object-oriented database

XML database systems

XML (Extensible Markup Language) is flexible

text format designed to store and transport

data. Each XML document has a tree structure

which begins with the root.