A DBMS is a complex software system. This section is meant to discuss the types of software components that constitute a DBMS and the types of computer system software with which the DBMS interacts.
This is an overview of the key components of DBMS. This figure is by no means exhaustive! Some components may be left out, some others added.
To the left there are various ways to interact with DBMS. DDL statements are meant to create data definitions which will be stored in a catalog. Interactive queries are typically executed from a front-end tool such as a comment line interface. Applications interact with DBMS using embedded DML statements. The DBA can use various database tools to maintain the DBMS.
To facilitate the usage of this, the DBMS will provide various interfaces which will invoke the components. The most important components are : the connection Manager, the security manager, the DDL compiler, the query processor and the storage manager.
The Query Processor
The query processor is one of the most important parts of the DBMS. It will assist in the execution of database queries such as retrieval of data, insertion, update and removal of data from the database. The query processor consists of :
DML compiler :
it provides a set of constructs to select, insert update and delete data.
DML :
Procedural DML : specifies how to navigate in the database to locate and modify the data. Usually starts by positioning on one specific record and
navigate from there onwards to other records using memory pointers. Procedural DML is also referred to as record at a time DML. DBMSs with procedural DML do not have a query processor available. In other words, the application developer has to explicitly define the optimization.
This isn’t the preferred implementation since it complicates the efficiency, maintenance and transparency of database applications.
Declarative DML : More efficient implementation. The DML statements specify what data should be retrieved or what changes should be made
rather than how they should be done. The DBMS will then determine the physical execution in terms of access for a navigational strategy. It is usually
set-at-a-time DML.
Ex: SQL
Applications often need to access the database. DML is embedded in the application in order to access the database. The application is written in the host language (Ex : Java). Some DML code is embedded inside the application (Ex : embedded SQL).
Data structures of DBMS and DML may be different from the data structure of the host language. This is often referred to an Impedance Mismatch problem. This problem can be solved in various ways.
- Choose a host language and DBMS with comparable data structures. EX combine java with object oriented DBMS
- Opt to use middleware to map data structures from the DBMS to the host language and vice-versa.
The DML compiler will start by extracting the DML statements from the host language. It will collaborate with the query parser, query rewriter, query optimizer, query executor for executing the DML statements. Errors will be generated and reported if necessary.
Query parser will parse the query into the internal representation format. Il will check the query for syntactical and semantically correctness. To do so, it will make extensive use of the catalog to verify if the integrity constraints have been respected. Again errors will be generated and recorded if necessary.
Query rewriter will optimize the query independent of the current database state. It will simplify it using predefined rules and heuristics. Nested queries might be reformulated or flattens to join queries.
Query optimizer he will optimize the query based on the current database state. Can make use of pre-defined indices which are part of the physical scheme and provide a quick access to the date. He will come up with various execution plans and evaluate their costs by aggregating the estimate number of input/output operations, …
Query executer
The Storage manager
Storage manager includes :
Transaction Manager supervises the execution of database transactions. A database transaction is a sequence of retried operations considered to be an atomic unites. This manager will create a schedule to improve efficiency and execution performance. It also guarantees the ACID properties in a multiuser environment.
Buffer Manager is responsible for managing the buffer or cache of the DBMS for speedy access. He is responsible for intelligently cashing the date in the buffer for speed access. It needs to constantly monitor the buffer to decide which content should be removed and which one added. In case data in the buffer has been updated, it must also synchronize the corresponding external memory. It must also be able to serve multiple transactions simultaneously.
Lock manager essential for providing concurrency control. Before a transaction can read or write a database object, it must acquire a lock. A read lock allows a transaction to read a database object whilst a write lock allows a transaction to update. To enforce a transaction atomicity consistency, a lock database object may prevent other transactions from using it.
Locking protocol which describes the locking rules and locking tables with the lock information
Log Manager keeps track of all database operations in a log file. He will be called upon to undo actions of aborted transactions or during crash recoveries.
All these components interact in various ways depending on which interface part is being executed. The database itself contains the raw data, the indices and the catalog.
DBMS
The connection manager allows setting up a database connection. It can be set up locally as well as through a network. He will verify the authentication information such as username and password and will return in connection handle. The database connection can either run as a single process or as a thread within a process. A thread represents an execution plot within a process. It represents the smallest unit of processor scheduling. Multiple threads can run within a process and share resources.
The security manager will verify whether a user has the right privileges to execute the database actions required. Some users can have read access whilst others have write access. This can be refined to certain parts of the data. He will retrieve these privileges from the catalog.
The DDL compiler compiles the data definitions specified in DDL. Ideally the DBMS should foresee 3 DDLS : one for the physical scheme, one for the conceptual scheme, one for the external scheme. Often there is just 1 DDL with 3 independent sets of instructions. This is the
case for most relation databases which use SQL as their DDL. The DDL compiler will first parse the DDL statements and check there correctness and then translate it to an internal format and generate errors if required. Then he will register the data definitions in the catalog where it can be used by all the other components of the DBMS.
Database utilities
Loading utility allows providing the database with information from a variety of different sources such as another DBMS text file, shell files, etc.
Reorganization utility automatically reorganizes the data in order to improve the performance. It can also create new access paths to improve the performance.
Performance monitoring utilities report various KPI’s such as storage space consumed. It also provides statistics to the DBA. The DBA uses the statistics in making decisions such as whether or not to reorganize files or whether to add or drop indexes to improve performance.
User management utilities allow creating user groups or accounts and assigning privileges to them.
Back-up and recovery utility : creates a backup copy of the database, usually by dumping the entire database onto tape or other mass storage medium. The backup copy can be used to restore the database in case of catastrophic disk failure. Incremental backups are also often used, where only changes since the previous backup are recorded.
Other utilities may be available for sorting files, handling data compression, monitoring access by users, interfacing with the network, and performing other functions.
DBMS interfaces
A DBMS needs to interact with many different parties such as a database designer, a database administrator and application or even the end user. To facilitate the communication, it will provide many different interfaces.
- Web-based interfaces
- Stand-alone query language interfaces
- Command line interface
- Forms-based interface
- Graphical user interface
- Natural Language interfaces
- Admin Interfaces for the DBA
- Network interfaces