GATE: Architecture Of A Database System
The requirements of a database system as described above has led to the identification of three distinct levels of abstraction in such a system namely the physical level, the global or conceptual schema level and the user view level. This three, level architecture was proposed by the Standards and Planning Requirements Committee of the American National Standards Committee on Computers and Information Processing (ANSI/SPARC Report, Tsichritis and Klug, 1978). Although the architecture proposed by this group was rather detailed, the essence was to divide the database structure in such a way that the distinct layers could be described at different levels of abstraction, with clearly defined interfaces between them.
The merits of this three-level architecture have long been recognized and most modern DBMSs support, to some extent, the separation of the physical database, the schema and the user views. The architecture is also consistent with the principles of software engineering, which advocate a separation of levels of concern and a modular approach to software development.
- The Conceptual Schema Level: The conceptual schema level is the global level, i.e.. the logical description of the entire database. It is the overall logical view of the data and their interrelationships as seen by the database modelers, by the systems analysts and programmers who are concerned with implementation, and by terminal user and application programs which require access to the entire database. The conceptual schema also contains the logical description of the usage constraints which are designed to preserve the integrity and privacy of the data against invalid modification or unauthorized access.
- The user view: The user view level (sometimes called the subschema level in network and hierarchical systems) is concerned with individual user views of j the database. Views are provided to satisfy the needs of applications which require access to only a subset of the data. As described earlier, a view may present to the user or application program, an aspect of the data model which is very different from that in the schema, and which has been tailored specifically to the application.
Database software may be divided into three distinct categories:
- Languages (commutations) to create, use and maintain the database.
- Utilities to provide support facilities such as report generation, graphical output, and statistical operations.
- Operational routines for run-time management, including routines for back-up and recovery, and for concurrency control.
In the commercial environment, database, languages have traditionally been divided into several different categories, namely data description languages, data manipulation languages and query languages. These may be described as follows.
The Schema Data Description Language (DDL)
The schema DDL is a high level notation for describing the record types and relationships existing in the database in terms of an underlying data model. Commercial DBMSs typically provide their own unique schema DDL. With network and hierarchical systems the schema DDL may be quite complex and resemble a programming language (typically COBOL) in style and form. For relational systems the schema DDL is usually quite simple (reflecting the simpler structures of the relational model), and may even be interactive.
Subschema Data Description Languages
Subschema DDLs are notations for describing a view of the database, often in a manner compatible with a standard programming language. Most network and hierarchical DBMSs support several subschema DDLs, interfacing typically to popular languages such as COBOL, FORTRAN. PL/1 and Pascal. Relational typically provide a view definition facility which permits the description of logical relational views derived from the permanent structures present in the schema, ouch views may then be accessed and manipulated using the same techniques as are used for the complete database.
The Data Manipulation Language (DML)
The DML is a language used by the applications programmer or ‘knowledge’ terminal user to communicate with the database. It provides facilities to; insertion, deletion, modification and retrieval o data-commercial DBMSs usually provide data manipulation language of various types, both procedural and non-procedural, for end-users as well as for system programmers. Thus a DML may take one of several possible forms:
- An interactive commercial language.
- A library of pre-defined procedures which may be called by application programs written in standard programming languages. This approach to data manipulation is available when complex computation is to be-performed on the data.
- A procedural programming language, unique to the particular DBMS, but which may be based on a standard programming language with added facilities for data manipulation within a database environment. The data manipulation language SQL (Chamberlin et al, 1976), is rapidly emerging as an industry-wide standard interface to commercial relational systems. It is often provided both as a stand-alone command language as well as embedded in a variety of programming languages.
Some relational database, management system do not make any distinction between data description-languages and data manipulation languages. Rather, the approach that they adopt is to combine data description, view definition, and data manipulation within the framework of a well-structured programming language. For example, there have been a number of attempts in recent years, to integrate the relational model of data with languages such as Pascal, Modula-2 and ADA.
A query language is a very high level, nonprocedural-language (often ‘English-like’ ) provided by most DBMSs to facilitate retrieval or simple updates when communicating with the database from a terminal. The user is not required to specify the detailed logical procedures, as is the case with a DML, but only a ‘structured’ statement of the information required or the action to be carried out. The query language processor must, then determine the correct access paths required to resolve a query. Query languages range in power and sophistication from semi-procedural interactive programming applications (user views) must be identified, not only so that such views may be incorporated in the overall system design, but also to identify access rights for the purposes of security and privacy. Thus the database designer must discuss the requirements in depth with every possible class of user and study the manner in which data is currently processed. In the existing environment, data may be processed manually or possibly in a computerized system using files and application programs. In the former case the data processing needs may not be well defined, and decisions must be made as to what can and should be automated. Where a file-based computerized system exists, the various application programs may be rather loosely related. Some may be grossly inefficient, while others may be redundant or of historical interest only.
The objectives of the database designer at the requirements analysis stage are:
- To obtain a clear and concise description of the infrastructure of the enterprise to be modeled.
- To derive information about the nature and volume of data to be stored and processed. In particular, the properties (or attributes) of the data objects must be ascertained, together with their domains, ordering requirements, sort criteria, units of measurement, etc. The frequency with which the data objects are accessed or updated and the expected growth in the volume of the data are also important factors which may influence the implementation phase, as are the requirements for data protection.
- To compile information on the nature and volume of the transactions (functions) occurring in the enterprise and the relationships between the various transactions. Important factors which must be determined include the exact purpose of the transaction, the frequency with which it is performed, the data requirements of the transaction, and the data and information which it produces. It is also important to ascertain the nature of any interactions which may take place between the various transactions. In particular, information is required on the data flow between transactions and the sequence in which they must be performed.