What is Data Replication?
Data replication is the process of storing separate copies of the database at two or more sites. It is a popular fault-tolerance technique in distributed databases.
Advantages of Data Replication
- Reliability − If any site fails, the database system continues to work because a copy is available at one or more other sites.
- Reduction in Network Load − Since local copies of data are available, queries can be processed with less network traffic, particularly during peak hours. Updates to the copies can be deferred to off-peak hours.
- Quicker Response − Because local copies of data are available, queries can be processed locally, which shortens response time.
- Simpler Transactions − Transactions require fewer joins of tables located at different sites and minimal coordination across the network. Thus, they become simpler in nature.
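The reliability and local-read advantages above can be illustrated with a minimal sketch. The `Site` class and the `read_with_failover` function are hypothetical names introduced only for this example, and each site is assumed to hold a full copy of the data in memory. The reader tries the local copy first and falls back to another site only when the local one is down.

```python
class Site:
    """A hypothetical site holding a full copy of the data (assumption)."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)  # each site keeps its own separate copy
        self.up = True

    def read(self, key):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


def read_with_failover(sites, key):
    """Try the sites in order (local first) and return (site name, value)."""
    for site in sites:
        try:
            return site.name, site.read(key)
        except ConnectionError:
            continue  # this copy is unreachable; try the next one
    raise ConnectionError("no replica is available")


records = {"emp:1": "Alice"}
local = Site("local", records)
remote = Site("remote", records)

first = read_with_failover([local, remote], "emp:1")   # served by the local copy
local.up = False                                        # simulate a local site failure
second = read_with_failover([local, remote], "emp:1")  # served by the remote copy
```

While the local site is up, the read never touches the network; after the simulated failure, the same call is answered by the remote copy.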
Disadvantages of Data Replication
- Increased Storage Requirements − Maintaining multiple copies of data increases storage costs. The total storage space required is a multiple of what a centralized system needs, since each replicating site holds its own copy.
- Increased Cost and Complexity of Data Updating − Each time a data item is updated, the update needs to be reflected in all the copies of the data at the different sites. This requires complex synchronization techniques and protocols.
- Undesirable Application-Database Coupling − If such update mechanisms are not used, resolving data inconsistencies requires complex coordination at the application level. This results in undesirable coupling between the application and the database.
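The update cost described above can be sketched in a few lines. This is an illustration under simplifying assumptions, not a real synchronization protocol: each replica is modeled as a plain dictionary, and `update_all` and `is_consistent` are hypothetical helpers. One logical update becomes one physical write per copy, and an update that reaches only some copies leaves the replicas inconsistent.

```python
def update_all(replicas, key, value):
    """Apply one logical update to every copy; return the number of writes."""
    writes = 0
    for replica in replicas:
        replica[key] = value
        writes += 1
    return writes


def is_consistent(replicas, key):
    """True when every copy holds the same value for key."""
    return len({replica.get(key) for replica in replicas}) == 1


# Three sites, each with its own copy of the same data item.
replicas = [{"stock": 10} for _ in range(3)]

writes = update_all(replicas, "stock", 9)  # one logical update, three physical writes
consistent_after_full_update = is_consistent(replicas, "stock")

replicas[0]["stock"] = 8  # an update that reached only one site
consistent_after_partial_update = is_consistent(replicas, "stock")
```

The second check returns `False`, which is the situation that either the database's synchronization protocol or, failing that, the application has to detect and repair.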
Some commonly used replication techniques are −
- Snapshot replication
- Near-real-time replication
- Pull replication
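As a rough illustration of how two of these techniques differ, the sketch below assumes the usual meanings of the terms: snapshot replication copies the data as it stands at one point in time, while in pull replication the replica requests the changes it has not yet received. The `Primary` and `Replica` classes and the in-memory change log are hypothetical simplifications, not the interface of any particular database product.

```python
import copy


class Primary:
    """Hypothetical source site that keeps its data and a log of changes."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Hypothetical site that holds a copy of the primary's data."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries already reflected in the copy

    def load_snapshot(self, primary):
        """Snapshot style: replace the whole copy with the data as of now."""
        self.data = copy.deepcopy(primary.data)
        self.applied = len(primary.log)

    def pull(self, primary):
        """Pull style: ask the primary for the changes not yet seen."""
        pending = primary.log[self.applied:]
        for key, value in pending:
            self.data[key] = value
        self.applied += len(pending)
        return len(pending)


primary = Primary()
primary.write("a", 1)

replica = Replica()
replica.load_snapshot(primary)   # replica now matches the primary

primary.write("b", 2)
stale = "b" not in replica.data  # the snapshot does not include later writes
pulled = replica.pull(primary)   # fetch only the one missing change
```

Under the same assumptions, near-real-time replication would amount to propagating each change shortly after it is written rather than waiting for the next snapshot or pull.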