Database Management System: September 2011

Monday 26 September 2011

External View Level

The external view level is closest to the users. It is concerned with the way the data is viewed by individual users, so it can be thought of as the individual user level. A user can be either an application programmer or an end-user; the DBA is an important special case.
The external level consists of many different external views of the database. Each external view describes the part of the database that a particular user group is interested in and hides the rest of the database from that group. In addition, different views may have different representations of the same data. For example, one user may view a date in the form (day, month, year), while another may view it as (year, month, day). Some users may view calculated data, which is not actually stored in the database but is created temporarily when needed. For example, the marks of students in the subject "C++" are stored in the database, and the average marks for this subject are calculated by the system when the user refers to them. Similarly, if the date of birth of a student is stored in the database, the age of the student can be calculated from it.
The external views are defined by means of external schemas, which are written in the data definition language (DDL). Usually, the DBA writes an external schema to create a user view. An external record is a record as seen by a particular user (a part of his external view). An external view is thus a collection of external records.
The external schemas are compiled by the DBMS and stored in its data dictionary. The DBMS uses the external schema created for a specific user to create a user interface for accessing the database. The user interface created through the external schema accepts and displays information in the format the user expects. It also acts as a barrier that hides from the user the information he is not permitted to access.
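
The idea of several external views over the same stored data can be sketched with SQL views. The example below is only an illustration, assuming SQLite (through Python's sqlite3 module) as the DBMS; the table, column and view names are hypothetical and do not come from any particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    roll_no       INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    date_of_birth TEXT NOT NULL,      -- stored once, as YYYY-MM-DD
    cpp_marks     INTEGER NOT NULL
);
INSERT INTO student VALUES (1, 'David', '1993-04-15', 62);
INSERT INTO student VALUES (2, 'Kate',  '1992-11-02', 90);

-- External view for one user group: date shown as (day, month, year).
-- The marks column is hidden from this group.
CREATE VIEW student_dmy AS
SELECT name, strftime('%d-%m-%Y', date_of_birth) AS dob FROM student;

-- External view for another group: date shown as (year, month, day).
CREATE VIEW student_ymd AS
SELECT name, strftime('%Y/%m/%d', date_of_birth) AS dob FROM student;

-- Calculated data: the average is not stored, it is computed on request.
CREATE VIEW cpp_average AS
SELECT AVG(cpp_marks) AS average_marks FROM student;
""")

dmy_rows = conn.execute("SELECT name, dob FROM student_dmy ORDER BY name").fetchall()
ymd_rows = conn.execute("SELECT name, dob FROM student_ymd ORDER BY name").fetchall()
average = conn.execute("SELECT average_marks FROM cpp_average").fetchone()[0]

print(dmy_rows)
print(ymd_rows)
print(average)
```

Each user group queries only its own view, so the stored format of the date and the presence of the marks column stay hidden from it.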

Sunday 25 September 2011

DATABASE SYSTEM ARCHITECTURE

We know that the database system provides users with an abstract view of data by hiding certain details of how data is stored and manipulated. Therefore, before designing a database, the data of the organization is considered at an abstract level.
Database system architecture means the design or construction of a database system. It provides the general concept and structure of the database system. The architecture of most commercial database management systems is based on the three-level architecture proposed by DBTG.

The Three-Level Architecture
An early proposal for a standardized terminology (or vocabulary) and architecture for database systems was developed and published in 1971 by the DBTG (Data Base Task Group) appointed by CODASYL. A similar architecture and terminology were developed and published in 1975 by SPARC (Standards Planning and Requirements Committee) of the American National Standards Institute (ANSI). As a result of these and later reports, databases can be viewed at three levels, known as the external, conceptual and internal levels. The three-level architecture is also known as the three-schema architecture. Its purpose is to separate the user applications from the physical database. The reasons for this separation are:

Different users need different views of the same data.
Users should not have to deal directly with the physical database storage details.
The DBA should be able to change the database storage structure or storage device without affecting the users' views.

The three-level architecture is divided into three view levels: the external view level, the conceptual view level and the internal view level. The Figure shows the three-level architecture of the database system.

Thursday 22 September 2011

Data Processing On Internet and Object Oriented Database Systems

Data Processing On Internet

The Internet was introduced in 1969 by the Advanced Research Projects Agency (ARPA) of the USA. Today, most database systems are online: the databases, the DBMS software and the database applications are stored on a Web server. Database technology is used in conjunction with Internet technology to access data on the Web server, and database applications are developed using Internet technology for this purpose. A database on the Internet uses the hypertext transfer protocol (HTTP), dynamic hypertext markup language (DHTML) and extensible markup language (XML) to communicate information between the database application and the database stored on the Web server.

Object Oriented Database Systems

By the mid-1980s, it had become clear that there were several fields where relational databases were not well suited, due to the types of data involved. These included medicine, multimedia and high-energy physics, all of which needed more flexibility in how their data was represented and accessed. This led to research into object-oriented databases, in which users could define their own methods of access to data and how it was represented and manipulated. By the early 1990s, two kinds of system had appeared: the object-oriented DBMS (OODBMS) and the object-relational DBMS (ORDBMS). However, unlike previous models, the actual composition of these models is not clear. This evolution represents the third generation of database management systems.

Client-Server Database Application and Entity Relationship Model

Client-Server Database Application

Earlier multi-user architectures used a mainframe computer to process the database. The mainframe provided all the functions to the connected users directly. It contained the DBMS software, application programs and user interfaces. The users connected to the mainframe through their terminals, and remote users were connected to it through a communication network.
In the mid-1980s, most users began to share data through local area networks (LANs). Microcomputers were linked in a LAN so that data and resources such as printers and storage devices could be shared. The first LAN applications enabled users to share resources such as printers and storage devices (e.g., large-capacity disks) and to communicate via electronic mail. End-users also wanted to share their databases, which led to the development of multi-user database applications on local area networks. After this, the client-server architecture was introduced to share data on the computer network.

Entity-Relationship Model
The relational model also has limited modeling capabilities. In 1976, Peter Chen presented the Entity-Relationship (E-R) model for database design. The E-R data model is a detailed, logical representation of the data of an organization. It is expressed in terms of entities, relationships between entities, and the attributes (or properties) of both the entities and their relationships. The E-R model is normally expressed as an E-R diagram, which is a graphical representation of an E-R model.
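
The three building blocks named above (entities, relationships and attributes) can be written down as plain data structures. This is a minimal sketch rather than a standard notation; the class names and the Student/Course/Enrolls example are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    """A thing the organization keeps data about, with its attributes."""
    name: str
    attributes: Tuple[str, ...]


@dataclass(frozen=True)
class Relationship:
    """An association between entities; it may carry attributes of its own."""
    name: str
    entities: Tuple[Entity, ...]
    attributes: Tuple[str, ...] = ()


student = Entity("Student", ("roll_no", "name", "date_of_birth"))
course = Entity("Course", ("course_code", "title"))
enrolls = Relationship("Enrolls", (student, course), ("grade",))


def describe(rel: Relationship) -> str:
    """Render a relationship as one text line, a stand-in for an E-R diagram."""
    linked = " - ".join(entity.name for entity in rel.entities)
    return f"{rel.name}: {linked} [{', '.join(rel.attributes)}]"


print(describe(enrolls))
```

Here "grade" belongs to neither Student nor Course alone, which is why the relationship carries attributes of its own.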

Relational Database Systems

In 1970, E. F. Codd of the IBM Research Laboratory published a paper on the relational data model. In this paper he described a new system (i.e. the relational database model) for storing and working with large databases. He applied the concepts of relational algebra (a branch of mathematics) to describe the new system. Instead of records being stored in some sort of linked list of free-form records as in CODASYL, his concept was to use a "table" of fixed-length records. Many experimental relational database management systems were implemented thereafter, with the first commercial products appearing in the late 1970s and early 1980s. In the 1970s, IBM started working on a prototype system based on Codd's concepts, called System R. This project led to two major developments:

The development of a structured query language called SQL, which has since become the standard language for relational systems.
The production of various commercial relational DBMS products during the 1980s, for example DB2 from IBM and ORACLE from ORACLE Corporation.

Now there are several hundred relational database management systems for both mainframe and microcomputer environments, though many of them only loosely follow the concept of the relational model. Other examples of multi-user relational database management systems are INGRES from Computer Associates, INFORMIX from Informix Software Inc. and SYBASE from Sybase Inc.
In 1979, dBase-II was developed by Ashton-Tate and was called a relational DBMS. It was very popular on PCs, but it was not a truly relational DBMS product. In fact, it was a programming language with generalized file-processing capabilities.

Brief History of Database and DBMS

Due to advances in the electronics industry, the increased processing and storage capacity of computers has opened the doors for computer scientists to develop various techniques to store large amounts of related data in an efficient and compact manner. The concept of the database was introduced by IBM in the 1960s. A brief description of the development of DBMS and database models is given below:

Hierarchical DBMS

A large amount of research was conducted during the 1960s. As a result, NAA (North American Aviation) developed software known as GUAM (Generalized Update Access Method). GUAM was based on the concept that smaller components come together as parts of larger components, and so on, until the final product is assembled. This arrangement forms a hierarchical structure, and such a system is therefore known as a hierarchical DBMS.

Network DBMS

In the mid-1960s, another system known as IDS (Integrated Data Store) was developed by General Electric. This work was headed by Charles Bachman. This development led to a new type of database system known as the network DBMS. The network database system was developed partly to address the need to represent more complex data relationships than could be modeled with hierarchical structures, and partly to impose a database standard. To establish such standards, the Conference on Data Systems Languages (CODASYL), comprising representatives of the US Government and the world of business and commerce, formed a List Processing Task Force in 1965. It was renamed the Data Base Task Group (DBTG) in 1967. The terms of reference for the DBTG were to define standard specifications for an environment that would allow database creation and data manipulation. A draft report was issued in 1969 and the initial report describing a network database implementation was issued in 1971. The DBTG proposal identified three components:

The network schema, which represents the logical organization of the entire database as seen by the DBA. It includes a definition of the database name, the type of each record, and the components of each record type.
The sub-schema, which represents the part of the database as seen by the user or application program.
A data management language, which is used to define the data structure and to manipulate the data.

For standardization, the DBTG specified three distinct languages:

A schema Data Definition Language (DDL), which enables DBA to define the schema.
A sub-schema DDL, which allows the application programmers to define the parts of the database they require.
A Data Manipulation Language (DML), to manipulate the data of database.

Wednesday 21 September 2011

Relationship Between DBMS and Application Program

There is a very close relationship between the database management system and the application program. The application program provides the user interface to send requests to the database management system and to receive processed results from it. The database management system processes the requests, returns the results to the application program, and also controls and manages the database.
In most database management systems, application programs are used for interfacing with the database. The user communicates with the database system through the user-interface part of the application program. The application program is developed according to the requirements of the users. The application programmer must have complete knowledge of the database management system and the databases used in it in order to develop the application program easily. The application program provides an easy and user-friendly interface to access the database.

Disadvantages of DBMS

Although there are many advantages of DBMS, the DBMS may also have some minor disadvantages. These are:

Cost of Hardware and Software

A processor with high data-processing speed and a large memory is required to run the DBMS software. It means that you have to upgrade the hardware used for the file-based system. Similarly, DBMS software is also very costly.

Cost of Data Conversion

When a computer file-based system is replaced with a database system, the data stored in data files must be converted to database files. Converting the data of data files into a database is a difficult and costly process. You have to hire database system designers along with application programmers, or alternatively take the services of a software house. So a lot of money has to be paid for developing software.

Cost of Staff Training

Most database management systems are complex, so users need training to use the DBMS. Training is required at all levels, including programming, application development, and database administration. The organization has to pay a large amount for training the staff who run the DBMS.

Appointing Technical Staff

Trained technical persons such as database administrators, application programmers and data entry operators are required to handle the DBMS. You have to pay handsome salaries to these persons. Therefore, the system cost increases.

Database Damage

In most organizations, all data is integrated into a single database. If the database is damaged due to an electric failure or is corrupted on the storage media, then your valuable data may be lost forever.

Advantages of DBMS

The database management system has a number of advantages compared to the traditional computer file-based processing approach. The DBA must keep these benefits and capabilities in mind while designing databases and monitoring the DBMS.
The main advantages of DBMS are described below.

Controlling Data Redundancy

In non-database systems, each application program has its own private files. In this case, duplicate copies of the same data are created in many places. In a DBMS, all data of an organization is integrated into a single database. The data is recorded in only one place in the database and it is not duplicated.

Sharing of Data

In a DBMS, data can be shared by authorized users of the organization. The database administrator manages the data and gives users the rights to access it. Many users can be authorized to access the same piece of information simultaneously. Remote users can also share the same data. Similarly, the data of the same database can be shared between different application programs.

Data Consistency

By controlling the data redundancy, the data consistency is obtained. If a data item appears only once, any update to its value has to be performed only once and the updated value is immediately available to all users. If the DBMS has controlled redundancy, the database system enforces consistency.

Integration of Data

In a database management system, data is stored in tables. A single database contains multiple tables, and relationships can be created between tables (or associated data entities). This makes it easy to retrieve and update data.

Integrity Constraints

Integrity constraints or consistency rules can be applied to the database so that only correct data can be entered into it. The constraints may be applied to data items within a single record or they may be applied to relationships between records.
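
Both kinds of constraint can be sketched in SQL. The example below is an illustration only, assuming SQLite through Python's sqlite3 module; the tables, columns and the helper try_insert are hypothetical. A CHECK constraint applies to a data item within a single record, and a foreign key applies to a relationship between records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    salary  REAL CHECK (salary >= 0),                        -- rule on one data item
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)  -- rule between records
);
INSERT INTO department VALUES (10, 'Accounts');
""")

INSERT_EMPLOYEE = "INSERT INTO employee VALUES (?, ?, ?, ?)"


def try_insert(row):
    """Try to store one employee record and report what the DBMS decided."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(INSERT_EMPLOYEE, row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


ok = try_insert((1, "John", 5000.0, 10))
bad_salary = try_insert((2, "Kate", -100.0, 10))   # violates the CHECK rule
bad_dept = try_insert((3, "Amelia", 4000.0, 99))   # department 99 does not exist
stored = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

print(ok, bad_salary, bad_dept, stored, sep="\n")
```

The rules live in the database definition, so every application program that inserts data is held to them.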

Data Security

Forms

A form is a very important object of a DBMS. You can create forms very easily and quickly in a DBMS. Once a form is created, it can be used many times and modified very easily. The created forms are also saved along with the database and behave like a software component. A form provides a very easy (user-friendly) way to enter data into the database, edit data and display data from the database. Non-technical users can also perform various operations on the database through forms without going into the technical details of a database.

Report Writers

Most DBMSs provide report writer tools for creating reports. Users can create reports very easily and quickly. Once a report is created, it can be used many times and modified very easily. The created reports are also saved along with the database and behave like a software component.

Control Over Concurrency

In a computer file-based system, if two users are allowed to access data simultaneously, it is possible that they will interfere with each other. For example, if both users attempt to perform update operation on the same record, then one may overwrite the values recorded by the other. Most database management systems have sub-systems to control the concurrency so that transactions are always recorded with accuracy.
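
The overwrite problem described above is often called a lost update. The sketch below does not run real concurrent users; it replays one unlucky interleaving step by step, then shows the same two updates issued as single UPDATE statements inside transactions, assuming SQLite through Python's sqlite3 module. The stock item and quantities are made up for illustration.

```python
import sqlite3


def lost_update():
    """Two users read the same value, then each writes back its own result."""
    stock = {"pens": 10}
    seen_by_a = stock["pens"]        # user A reads 10
    seen_by_b = stock["pens"]        # user B reads 10 before A has written
    stock["pens"] = seen_by_a - 3    # A records a sale of 3 and writes 7
    stock["pens"] = seen_by_b - 4    # B records a sale of 4 and writes 6; A's update is lost
    return stock["pens"]


def controlled_update():
    """Each sale is one UPDATE in its own transaction, applied to the current value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER)")
    conn.execute("INSERT INTO stock VALUES ('pens', 10)")
    conn.commit()
    for sold in (3, 4):
        with conn:  # one transaction per sale
            conn.execute("UPDATE stock SET qty = qty - ? WHERE item = 'pens'", (sold,))
    return conn.execute("SELECT qty FROM stock WHERE item = 'pens'").fetchone()[0]


print(lost_update())        # 6: only the second sale is reflected
print(controlled_update())  # 3: both sales are reflected
```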

Backup and Recovery Procedures

In a computer file-based system, the user must create backups of data regularly to protect valuable data from damage due to failures of the computer system or application program. This is very time-consuming if the amount of data is large. Most DBMSs provide 'backup and recovery' sub-systems that automatically create backups of data and restore the data if required.
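
As one concrete illustration, Python's sqlite3 module exposes a backup call that copies a whole database from one connection to another. The sketch below uses two in-memory databases so that it runs anywhere; a real backup would go to a file on separate storage, and the table and data are hypothetical.

```python
import sqlite3

# The working database.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO student VALUES (?, ?)", [(1, "David"), (2, "John")])
source.commit()

# Backup: copy the entire database into a second connection.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Simulate damage to the working database.
source.execute("DELETE FROM student")
source.commit()
damaged_count = source.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# Recovery: copy the backup over the damaged database.
backup_copy.backup(source)
restored = source.execute("SELECT name FROM student ORDER BY roll_no").fetchall()

print(damaged_count, restored)
```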

Data Independence

The separation of data structure of database from the application program that uses the data is called data independence. In DBMS, you can easily change the structure of database without modifying the application program.
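
A small sketch of this, assuming SQLite through Python's sqlite3 module with hypothetical table and function names: the "application program" below asks for data by column name only, so adding a column and an index to the table does not require any change to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'David')")
conn.commit()


def student_names(connection):
    """The application program: it names the columns it needs and nothing else."""
    rows = connection.execute("SELECT name FROM student ORDER BY roll_no")
    return [row[0] for row in rows]


before = student_names(conn)

# The DBA changes the structure of the database.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
conn.execute("CREATE INDEX idx_student_name ON student(name)")
conn.commit()

after = student_names(conn)   # same program, unchanged
print(before, after)
```

Not every change is this harmless: dropping or renaming a column that the program uses would still break it.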

Components of DBMS

A database management system (DBMS) consists of several components. Each component plays a very important role in the database management system environment. The major components of a database management system are:

Software
Hardware
Data
Procedures
Database Access Language
Users

Software

The main component of a DBMS is the software. It is the set of programs used to handle the database and to control and manage the overall computerized database. It includes:

The DBMS software itself, which is the most important software component in the overall system.
The operating system, including the network software used in a network to share the data of the database among multiple users.
Application programs developed in programming languages such as C++ or Visual Basic, which are used to access the database in the database management system. Each program contains statements that request the DBMS to perform operations on the database, such as retrieving, updating or deleting data. The application programs may be run from conventional or online workstations or terminals.

Hardware

Hardware consists of a set of physical electronic devices such as computers (together with associated I/O devices like disk drives), storage devices, I/O channels, and electromechanical devices that interface between computers and real-world systems. It is impossible to implement the DBMS without hardware devices. In a network, a powerful computer with high data-processing speed and a storage device with large storage capacity are required as the database server.

Data

Data is the most important component of the DBMS. The main purpose of DBMS is to process the data. In DBMS, databases are defined, constructed and then data is stored, updated and retrieved to and from the databases. The database contains both the actual (or operational) data and the metadata (data about data or description about data).

Procedures

Procedures refer to the instructions and rules that help to design the database and to use the DBMS. The users who operate and manage the DBMS require documented procedures on how to use or run the database management system. These may include procedures:

To install the new DBMS.
To log on to the DBMS.
To use the DBMS or application program.
To make backup copies of database.
To change the structure of database.
To generate the reports of data retrieved from database.

Database Access Language

The database access language is used to move data to and from the database. Users use the database access language to enter new data, change existing data and retrieve required data from databases. The user writes a set of appropriate commands in a database access language and submits them to the DBMS. The DBMS translates the user commands and sends them to a specific part of the DBMS called the Database Jet Engine. The database engine generates a set of results according to the commands submitted by the user, converts these into a user-readable form called an Inquiry Report and then displays them on the screen. Administrators may also use the database access language to create and maintain databases.

The most popular database access language is SQL (Structured Query Language). Relational databases are required to have a database query language.

Users

The users are the people who manage the databases and perform different operations on them in the database system. There are three kinds of people who play different roles in a database system:

Application Programmers
Database Administrators
End-Users

Application Programmers

The people who write application programs in programming languages (such as Visual Basic, Java, or C++) to interact with databases are called application programmers.

Database Administrators

A person who is responsible for managing the overall database management system is called database administrator or simply DBA.

End-Users

The end-users are the people who interact with database management system to perform different operations on database such as retrieving, updating, inserting, deleting data etc.

Monday 19 September 2011

DATABASE MANAGEMENT SYSTEM (DBMS)

A database management system is typically considered a computerized record-keeping system. More precisely, a DBMS is a collection of programs used to define, create and maintain a database. Basically, a database management system is a general-purpose software package whose overall purpose is to maintain information and to make that information available on demand. You can also develop special-purpose DBMS software (in Visual Basic, C++ etc.) to create and maintain a database.
There are many functions of general-purpose DBMS software but the main functions are:

Defining the Structure of Database: which involves defining tables and their relationships, fields and their data types, and constraints for the data to be stored in the database.
Populating the Database: which involves storing data in the database.
Manipulating the Database: which involves retrieving specific data, updating data, inserting data, and generating reports.
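
The three functions listed above can be seen side by side in a short session. This is a sketch only, assuming SQLite through Python's sqlite3 module; the marks table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1. Defining the structure: a table, its fields, data types and constraints.
conn.execute("""
CREATE TABLE marks (
    name  TEXT PRIMARY KEY,
    test1 INTEGER NOT NULL,
    test2 INTEGER NOT NULL
)
""")

# 2. Populating the database: storing data in it.
conn.executemany(
    "INSERT INTO marks VALUES (?, ?, ?)",
    [("David", 62, 63), ("John", 50, 75), ("Kate", 90, 82)],
)

# 3. Manipulating the database: update, insert, retrieve, report.
conn.execute("UPDATE marks SET test1 = 55 WHERE name = 'John'")
conn.execute("INSERT INTO marks VALUES ('Amelia', 75, 80)")
conn.commit()

report = conn.execute(
    "SELECT name, test1 + test2 AS total FROM marks ORDER BY total DESC"
).fetchall()

for name, total in report:
    print(f"{name:<8}{total:>5}")
```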

Now consider the example of a university whose computer file-based processing system is converted to a database management system. The data about applicants, students, courses and faculty is stored in a single file called the database. The data is integrated into the database, which means that the data items are stored in a compatible format and the logical connections among them are also stored. The database contains a description of its own structure so that the database management system "knows" what data items exist and how they are related to each other. Many users can share the database through the database management system. The database management system also provides a user interface for online queries. Users can access the database directly from terminals, using a query language such as Structured Query Language (SQL).

Types of Database

The database may be of different types but there are two generic database architectures. These are:

Centralized Database
Distributed Database

Centralized Database

A database in which all data is located at a single computer (or site) and which multiple users can access is known as a centralized database. A centralized database provides an efficient way to access and update data. These databases are usually used in computer network environments. Examples of centralized databases are:

Personal Computer Databases
Client/Server Databases
Central Computer Databases

(i) Personal Computer Databases

A personal computer database is normally created and maintained by a single user on a personal computer. Personal computer databases are commonly used in small businesses or organizations. If there is a need to share data, the database can be stored on a database server in a local area network so that multiple users can access and update it.

(ii) Client/Server Databases

Client/server databases are used in small to medium organizations or businesses to share data among multiple users in a local area network. Microcomputers are often used in a local area network.

The client/server architecture is designed for the distribution of work on a computer network in which many clients may share the data (or services). Here is an example of client/server database

(iii) Central Computer Databases

The central computer databases are commonly used in central computers in large organizations. The central computer may be a mainframe or minicomputer. These databases are accessed by a large number of users. The users at remote locations can also access the database using remote terminals and data communication links.

Distributed Database
Many organizations/departments have sub-offices in different cities and countries. In such cases, the distributed databases are used instead of centralized databases. A distributed database is a single logical database, which is spread physically across computers in multiple locations (such as cities or countries).
The distributed databases are further divided into two categories:

Homogeneous Databases
Heterogeneous Databases

Homogeneous Databases

The homogeneous database means that the database technology is the same at each of the locations (or sites) and that the data at various locations are also compatible. In a homogeneous system, all nodes use the same hardware and software for the database system.

The following conditions must be satisfied for a homogeneous database:

The operating system used at each location must be same or compatible.
The data structures used at each location must be same or compatible.
The database application (or DBMS) used at each location must be same or compatible.

Heterogeneous Databases

Heterogeneous database systems are the opposite of homogeneous database systems. In a heterogeneous system, different nodes may have different hardware and software, and the data structures at various nodes or locations may also be incompatible.

DATABASE

A database is a collection of related data stored in an efficient and compact manner. The word "efficient" means that stored data can be accessed very easily and quickly. Similarly, the word "compact" means that stored data takes up as little space as possible. In the above definition of database, the phrase "related data" is used. It means that a database contains data or information about a particular topic such as:

Database of employees that contains data of employees of an organization or department.
Database of students that contains data of students of a college/university etc.

A database holds related data as well as a description of that data. For this reason, a database is also defined as a self-describing collection of integrated records. The description of data is known as the system catalog, data dictionary or metadata (data about data). For example, when a table of a database is designed, the data type, size, format and other properties of its fields are specified. This is an example of metadata, which describes the properties of the data to be stored in the fields of the table.
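
The self-describing nature of a database can be observed directly. In the sketch below, which assumes SQLite through Python's sqlite3 module and a hypothetical student table, no data rows are stored at all, yet the database can already report which tables exist and what fields each one has.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, marks REAL)"
)

# SQLite keeps its catalog in a table called sqlite_master.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

# PRAGMA table_info returns one row per field:
# (position, name, declared type, NOT NULL flag, default value, primary-key flag)
columns = [
    (row[1], row[2], bool(row[3]))
    for row in conn.execute("PRAGMA table_info(student)")
]

print(tables)
for name, data_type, not_null in columns:
    print(name, data_type, "NOT NULL" if not_null else "")
```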

The data of any organization is an integral part of it. Data is very important for developing new products and marketing them. It must be accurate and available when needed. This is the reason that all organizations must organize and manage their data in databases. Databases are used for a variety of purposes in an organization.

Almost all organizations and government departments of every country in the world use databases to maintain their records.

Sunday 18 September 2011

Disadvantages of Computer File-based Processing System

Although a computer file-based processing system has many advantages over a manual record-keeping system, it has some limitations. The basic disadvantages (or limitations) of a computer file-based processing system are described below.

Data Redundancy

Redundancy means having multiple copies of the same data. In computer file-based processing system, each application program has its own data files. The same data may be duplicated in more than one file. The duplication of data may create many problems such as:

To update a specific record, the same data must be updated in all files; otherwise different files may hold different information about a specific item.
Valuable storage space is wasted.

Data Inconsistency

Data inconsistency means that different files may contain different information about a particular object or person. Redundancy leads to inconsistency: when the same data is stored in multiple locations, inconsistency may occur.
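
How redundancy turns into inconsistency can be shown in a few lines. In this sketch two Python dictionaries stand in for the private data files of two offices; the office names, roll number and addresses are made up for illustration.

```python
# Each office keeps its own copy of the same student record.
admission_file = {101: {"name": "David", "address": "12 Mall Road"}}
dean_file = {101: {"name": "David", "address": "12 Mall Road"}}


def update_address(data_file, roll_no, new_address):
    """Update one office's copy only, as a separate application program would."""
    data_file[roll_no]["address"] = new_address


def inconsistent_records(file_a, file_b):
    """Return the roll numbers whose copies no longer agree."""
    return [key for key in file_a if key in file_b and file_a[key] != file_b[key]]


conflicts_before = inconsistent_records(admission_file, dean_file)

# The student reports a new address to the admission office only.
update_address(admission_file, 101, "7 Canal View")

conflicts_after = inconsistent_records(admission_file, dean_file)
print(conflicts_before, conflicts_after)
```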

Data Isolation

In computer file-based system, data is isolated in separate files. It is difficult to update and to access particular information from data files.

Data Atomicity

Data atomicity means that a data item or record is either entered as a whole or not entered at all. It is difficult to ensure this in a computer file-based system.
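
For contrast, the sketch below shows the all-or-nothing behaviour that a DBMS transaction provides, assuming SQLite through Python's sqlite3 module. The two tables and the function enter_record are hypothetical: a student's record is spread over both tables, and either both parts are stored or neither is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE marks (
    roll_no INTEGER PRIMARY KEY,
    score   INTEGER CHECK (score BETWEEN 0 AND 100)
);
""")


def enter_record(roll_no, name, score):
    """Enter both parts of a record in one transaction."""
    try:
        with conn:  # commits if both inserts succeed, rolls back otherwise
            conn.execute("INSERT INTO student VALUES (?, ?)", (roll_no, name))
            conn.execute("INSERT INTO marks VALUES (?, ?)", (roll_no, score))
        return True
    except sqlite3.IntegrityError:
        return False


first = enter_record(1, "David", 62)
second = enter_record(2, "John", 150)   # invalid score: the student row is undone too

students = [row[0] for row in conn.execute("SELECT name FROM student")]
print(first, second, students)
```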

Data Dependence

In computer file-based processing systems, the data stored in a file depends upon the application program through which the file was created. It means that the structure of data files is coupled with the application program.

The physical structure of data files and records is defined in the application program code. It is therefore difficult to change the structure of data files or records: if you want to change the structure (or format) of a data file, you have to modify the application program.
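
The coupling described above can be demonstrated with a fixed-length record whose layout is written into the program. This is a sketch using Python's struct module; the layouts and function names are hypothetical.

```python
import struct

# Record layout hard-coded in the "old" application program:
# a 10-byte name followed by a 2-byte unsigned integer for marks.
LAYOUT_V1 = struct.Struct("<10sH")


def write_record_v1(name, marks):
    return LAYOUT_V1.pack(name.encode("ascii"), marks)


def read_record_v1(raw):
    name, marks = LAYOUT_V1.unpack(raw)
    return name.rstrip(b"\x00").decode("ascii"), marks


# Later the file format changes: the name field is widened to 20 bytes.
LAYOUT_V2 = struct.Struct("<20sH")


def old_program_can_read(raw):
    """Check whether the unmodified old program can still read a record."""
    try:
        read_record_v1(raw)
        return True
    except struct.error:
        return False


old_record = write_record_v1("David", 62)
new_record = LAYOUT_V2.pack(b"David", 62)

print(read_record_v1(old_record))
print(old_program_can_read(new_record))
```

Until the old program is edited to use the new layout, it cannot read records written in the new format.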

Program Maintenance

In computer file-based processing system, the structure of data file is coupled with the individual application programs. Therefore, any modification to a data file such as size of a data field, its type etc. requires the modification of the application program also. This process of modifying the program is referred to as program maintenance.

Data Sharing

In computer file-based processing systems, each application program uses its own private data files. The computer file-based processing systems do not provide the facility to share data of a data file among multiple users on the network.

Data Security

The computer file-based processing system does not provide a proper security system against illegal access to data. Anyone can easily change or delete valuable data stored in a data file. It is the most complicated problem of the file-processing system.

Incompatible File Format

In computer file-based processing systems, the structure of a data file is coupled with the application program and depends on the programming language in which the application program was developed. As a result, data files created by a program written in one language may not be directly usable by a program written in another.

COMPUTER FILE-BASED PROCESSING SYSTEM

Before understanding the characteristics of a pure database system, it is necessary to know about the computer file-based processing system, which was used in the past (before the database system). In a computer file-based processing system, the data is usually kept in computer files on magnetic disk or tape. In a typical system, each department has its own set of application programs and data files. Data is stored and managed in data files through application programs. Each department (or user) defines and implements the data files needed for a specific application. Each application program is developed with its own set of data files to meet the needs of a particular department.

For example, the admission office of a college may have an application program for maintaining records of candidates for admission in the college. The admission office forwards information about enrolled students to the head of the department (or dean or registrar). If the information about students/candidates arrives in the form of printed reports, it must be re-entered into the computer. The same applies to information concerning courses offered, schedules, and so on. The dean's office may have its own students file, course file and faculty file. In each case, all departments or offices have their own data files. The departments/offices of the university share information by copying data files on disk or by obtaining printouts with the permission of the owner.

DATA PROCESSING

Data is processed to get the required results. For this purpose, different operations may be performed on data. Therefore, data processing is defined as "a sequence of operations on data to convert it into useful information". The important operations that can be performed on data are:

Arithmetic and logical operations.
Sending and receiving data from one location to another.
Classification of data.
Arranging data into a specific order.

The data processing can be accomplished through following methods:

Manual Data Processing.
Mechanical Data Processing.
Electronic Data Processing.

Manual Data Processing

In manual data processing, data is processed manually without using any machine or tool to get required results. In manual data processing, all the calculations and logical operations are manually performed on data.

Mechanical Data Processing

In mechanical data processing, data is processed by using different tools like typewriters, mechanical printers or other mechanical devices.

Electronic Data Processing

It is the modern technique to process data. The data is processed through computer. Data and set of instructions are given to the computer as input and the computer automatically processes the data according to given set of instructions.

Data

The word 'data' refers to facts concerning things such as people, objects, events etc. A list of class students' roll numbers, names, marks etc. is an example of student data. Therefore, data is defined as a collection of raw facts and figures, collected for a specific purpose.

For example, a collection of students' data may look like the following:

David 62 63 64

John 50 75 70

Kate 90 82 85

Amelia 75 80 60

The above data does not convey proper meaning, because no relation among the given values is shown and the data values are not labeled.
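
Once the values are labeled and a few operations are applied, the same rows become information. The sketch below assumes, purely for illustration, that the three numbers on each row are marks in three tests; the source does not say what they represent.

```python
# The raw rows from the example above.
raw_data = [
    ("David", 62, 63, 64),
    ("John", 50, 75, 70),
    ("Kate", 90, 82, 85),
    ("Amelia", 75, 80, 60),
]


def process(rows):
    """Label each row, add a total and an average, and arrange by total."""
    results = []
    for name, *marks in rows:
        total = sum(marks)                         # arithmetic operation
        results.append({
            "name": name,
            "total": total,
            "average": round(total / len(marks), 2),
        })
    # Arranging the data into a specific order: highest total first.
    return sorted(results, key=lambda record: record["total"], reverse=True)


information = process(raw_data)
for record in information:
    print(f"{record['name']:<8}{record['total']:>5}{record['average']:>8}")
```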

Types of Data

Data can be divided into three types. These are:

1) Numeric Data.

2) Alphabetic Data.

3) Alphanumeric Data.

Numeric Data

It consists of the digits 0 to 9, the + and - signs, and the decimal point.

Alphabetic Data

It consists of all the letters of the alphabet, i.e. A to Z and a to z.

Alphanumeric Data

It consists of alphabetic letters, numeric digits (0-9) and some special characters such as #, $, etc.
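
The three types can be told apart by the characters a value contains. The function below is a rough sketch of that rule, not a validator: it looks only at the character set, so it does not check that a numeric value is a well-formed number. The function name is hypothetical.

```python
import string

NUMERIC_CHARS = set(string.digits + "+-.")
ALPHABETIC_CHARS = set(string.ascii_letters)


def classify(value):
    """Classify a non-empty string by the set of characters it uses."""
    if not value:
        raise ValueError("value must not be empty")
    chars = set(value)
    if chars <= NUMERIC_CHARS:
        return "numeric"
    if chars <= ALPHABETIC_CHARS:
        return "alphabetic"
    return "alphanumeric"


for sample in ("-12.5", "Kate", "B-52#"):
    print(sample, classify(sample))
```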
