How does the ANSI-SPARC architecture promote logical and physical data independence in databases?

The American National Standards Institute (ANSI) Standards Planning and Requirements Committee (SPARC) produced the ANSI-SPARC architecture in 1975 (ANSI, 1975). It is a three-level schema architecture consisting of the internal (physical), conceptual (logical), and external levels.

The ANSI-SPARC architecture and framework have been widely applied in relational databases. Information at the three levels is represented by the internal, conceptual, and external schemas, respectively. The diagram below illustrates the ANSI-SPARC architecture.

The internal level consists of an internal schema, which describes the physical storage structure of the database. It uses a physical data model and describes the complete details of data storage and access paths for the database. An external schema, by contrast, may include classes defined in the conceptual schema as well as derived classes that are defined, directly or indirectly, on conceptual schema classes and that need not themselves appear in the conceptual schema.

The conceptual level has a conceptual schema, which describes the structure of the whole database for a community of users. The conceptual schema hides the details of physical storage structures and concentrates on describing entities, data types, relationships, user operations, and constraints. Usually, a representational data model is used to describe the conceptual schema when a database system is implemented. This implementation conceptual schema is often based on a conceptual schema design in a high-level data model.

The external level includes a number of external schemas, or user views. It concerns the way users perceive the data. Each external schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group. As in the conceptual level, each external schema is typically implemented using a representational data model.
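The idea of several external schemas over one conceptual schema can be sketched with SQL views. The example below is only an illustration, assuming SQLite through Python's standard `sqlite3` module; the `employee` table, the two views, and the `view_columns` helper are hypothetical names that do not come from the ANSI-SPARC report.

```python
import sqlite3


def build_database() -> sqlite3.Connection:
    """Create a small conceptual schema and two external schemas (views)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Conceptual level: one table describing every employee (hypothetical).
        CREATE TABLE employee (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            department TEXT NOT NULL,
            salary REAL NOT NULL
        );
        INSERT INTO employee (id, name, department, salary) VALUES
            (1, 'Ada', 'Engineering', 120000),
            (2, 'Grace', 'Engineering', 130000),
            (3, 'Edgar', 'Research', 110000);

        -- External schema for a staff directory: the salary column is hidden.
        CREATE VIEW staff_directory AS
            SELECT id, name, department FROM employee;

        -- External schema for payroll reporting: only totals per department.
        CREATE VIEW payroll_summary AS
            SELECT department, SUM(salary) AS total_salary
            FROM employee
            GROUP BY department;
    """)
    return conn


def view_columns(conn: sqlite3.Connection, view_name: str) -> list:
    """Return the column names that a given view exposes."""
    cursor = conn.execute(f"SELECT * FROM {view_name}")
    return [column[0] for column in cursor.description]


conn = build_database()
directory_columns = view_columns(conn, "staff_directory")
payroll_rows = conn.execute(
    "SELECT department, total_salary FROM payroll_summary ORDER BY department"
).fetchall()

print(directory_columns)
print(payroll_rows)
```

Each user group queries only its own view, so the directory application never sees salaries and the payroll report never sees individual rows.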

The goal of the three-schema architecture is to separate the user applications from the physical database. Most Database Management Systems (DBMS) do not separate the three levels completely and explicitly, but support the three-schema architecture to some extent. The three-level ANSI-SPARC architecture has an important place in database technology development because it clearly separates the users’ external level, the database’s conceptual level, and the internal storage level for designing a database. In a DBMS based on the three-schema architecture, each user group refers to its own external schema.

Data independence in a DBMS means the ability to change the schema at one level without affecting the schema at the next higher level. When the schema at some level is changed, the schema at the next higher level remains unchanged; only the mapping between the two levels is changed. There are two types of data independence: logical data independence and physical data independence.

Logical data independence is the ability to change the conceptual schema without having to change external schemas. A database administrator may change the conceptual schema to expand the database, e.g. by adding a record type or data item. Only the view definition and the mappings need to be changed in a DBMS that supports logical data independence. After the conceptual schema undergoes a logical reorganization, application programs that reference the external schema constructs must work as before.
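A minimal sketch of logical data independence, again assuming SQLite through Python's `sqlite3` module. All table, view, and function names are hypothetical. The conceptual schema is reorganized (a new `department` record type and a new `budget` data item are added), only the view definition is rewritten, and the application query stays exactly the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original conceptual schema, plus the external schema (a view) the application uses.
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, department TEXT);
    INSERT INTO employee VALUES (1, 'Ada', 'Engineering'), (2, 'Edgar', 'Research');
    CREATE VIEW staff_directory AS
        SELECT id, name, department FROM employee;
""")


def application_query(connection: sqlite3.Connection) -> list:
    """Application code: it refers only to the external schema."""
    return connection.execute(
        "SELECT name, department FROM staff_directory ORDER BY id"
    ).fetchall()


before = application_query(conn)

# Logical reorganization of the conceptual schema: departments become their
# own record type with a new data item (budget), and the view is redefined
# so that it maps onto the new tables.
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT, budget REAL);
    INSERT INTO department (name) SELECT DISTINCT department FROM employee;

    CREATE TABLE employee_v2 (
        id INTEGER PRIMARY KEY,
        name TEXT,
        department_id INTEGER REFERENCES department(id)
    );
    INSERT INTO employee_v2
        SELECT e.id, e.name, d.id
        FROM employee e JOIN department d ON d.name = e.department;

    DROP VIEW staff_directory;
    DROP TABLE employee;

    CREATE VIEW staff_directory AS
        SELECT e.id AS id, e.name AS name, d.name AS department
        FROM employee_v2 e JOIN department d ON d.id = e.department_id;
""")

after = application_query(conn)
print(before == after)
```

The only artifact that changed on the application's side of the boundary is the external/conceptual mapping, expressed here as the view definition.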

Physical data independence is the ability to change the internal schema without having to change the conceptual schema. Consequently, the external schemas need not be changed either. It allows users to frame queries in terms of the logical structure of the data and lets a query processor automatically translate them into efficient plans that access the physical storage structures. For example, creating additional access structures to improve the performance of retrieval or update must not affect users, provided the same data as before remains in the database.
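The access-structure example can be sketched as follows, assuming SQLite through Python's `sqlite3` module. The table, the index name, and the helper functions are hypothetical. An index is added at the internal level; the query text and its result stay the same, and only the plan chosen by the query processor changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employee (name, department) VALUES (?, ?)",
    [
        (f"employee-{i}", "Research" if i % 10 == 0 else "Engineering")
        for i in range(1000)
    ],
)

# The query is written purely against the logical structure of the data.
QUERY = "SELECT id, name FROM employee WHERE department = ? ORDER BY id"


def run_query(connection: sqlite3.Connection) -> list:
    """Run the unchanged application query."""
    return connection.execute(QUERY, ("Research",)).fetchall()


def query_plan(connection: sqlite3.Connection) -> str:
    """Return SQLite's plan description; the detail text is the last column."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN " + QUERY, ("Research",)
    ).fetchall()
    return " | ".join(row[-1] for row in rows)


rows_before = run_query(conn)
plan_before = query_plan(conn)

# Internal-schema change only: add an access structure on department.
conn.execute("CREATE INDEX idx_employee_department ON employee (department)")

rows_after = run_query(conn)
plan_after = query_plan(conn)

print(rows_before == rows_after)
print(plan_before)
print(plan_after)
```

The exact wording of the plan text depends on the SQLite version, so it is best treated as diagnostic output rather than a stable interface.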

In general, physical data independence exists in most databases and file environments where physical details such as the exact location of data on disk, and hardware details of storage encoding, placement, compression, splitting, merging of records, and so on are hidden from the user.

With these three levels, it is possible to use complex structures at the internal level for efficient operations while providing a simpler, more convenient interface at the external level. The separation also helps to maintain data independence between the schemas. Different users can access the same data through different customized views, and they need not be concerned with the details of physical data storage. The physical storage structure of the database can be changed without requiring changes to the conceptual structure of the database or to the users' views. No user needs to know where or how the physical bits are stored, and each client is given only the data that it is allowed to see. The conceptual structure of the database can also be changed without affecting end users.

The ANSI-SPARC architecture can make it easier to achieve true data independence, both physical and logical. However, the two levels of mappings create an overhead during compilation or execution of a query or program, leading to inefficiencies in the DBMS. This is the reason why few DBMSs have implemented the full ANSI-SPARC architecture.

Rajiv Shah
Oct 5, 2021