Lecture 15
3 Designing Databases 8.4 Database Trends 8.5 Management Requirements Discussion Questions
The first few terms, field, record, file, database, are depicted in Figure 8.1, which shows the relationship between them. An entity is basically the person, place, thing, or event about which we maintain information. Each characteristic or quality describing an entity is called an attribute. Each record requires a key field, or unique identifier. The best example of this is your Social Security number: there is only one per person. That explains in part why so many companies and organizations ask for your Social Security number when you do business with them.
Suppose you decide to create a database for your newspaper delivery business. To succeed, you need to keep accurate, useful information for each of your customers. You set up a database to maintain the information. For each customer, you create a record. Within each record you have the following fields: customer name, address, ID, date last paid. Smith, Jones, and Brooks are the records within a file you decide to call Paper Delivery. The entities then are Smith, Jones, and Brooks, the people on whom you are maintaining information. The attributes are customer name, address, ID, and date last paid. The key field in this file is the ID number; perhaps you'll use their phone number, provided it is unique for each record. This is a simplistic example of a database, but it should help you understand the terminology.
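The terminology above can be sketched in a few lines of Python. This is only an illustration: the class name `CustomerRecord`, the function `add_record`, and all of the sample values (phone numbers, addresses, dates) are made up for the example and do not come from the text.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    """One record; each attribute below is a field."""
    customer_id: str      # key field (hypothetical phone number used as the ID)
    name: str
    address: str
    date_last_paid: str


# The "file": a collection of records, looked up by key field.
paper_delivery = {}


def add_record(file, record):
    """Add a record, refusing duplicates so the key field stays unique."""
    if record.customer_id in file:
        raise ValueError(f"duplicate key field: {record.customer_id}")
    file[record.customer_id] = record


# Sample data, invented for illustration only.
add_record(paper_delivery, CustomerRecord("555-0101", "Smith", "12 Oak St", "2024-01-15"))
add_record(paper_delivery, CustomerRecord("555-0102", "Jones", "9 Elm Ave", "2024-01-20"))
add_record(paper_delivery, CustomerRecord("555-0103", "Brooks", "40 Pine Rd", "2024-02-01"))
```

Rejecting a duplicate ID is what makes the key field a unique identifier; without that check, two customers could end up sharing one record.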
Accessing Records
When we were describing secondary storage, we talked about magnetic tape and disk storage for computer data. To understand how information is accessed from these media, think about the difference between a music cassette tape and a music CD. If you want to get to a particular song on a cassette tape, you must pass by all the other songs sequentially. If you want to get to a song on CD, you can go directly to that song without worrying about any of the others. That is the difference between sequential and direct access organization for database records.
Sequential file organization, in conjunction with magnetic tape, is typically used for processing the same information on all records at the same time. It is also good for processing many records at once, commonly called batch processing. Direct or random file organization is used with magnetic disks. Because of increased speed and improved technological methods of recording data on disks, many companies now use disks instead of tapes. The other advantage that disks have over tapes is that disks don't physically deteriorate as fast as tapes do. There is less danger of damaging the surface of the disks than there is of breaking a tape.
Figure 8.4 shows that records are not stored sequentially but at random. The transform algorithm uses the value in the key field to find the storage location and access the record.
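A minimal sketch of how a transform algorithm might work is shown below. The text does not say which algorithm is used, so the remainder-after-division scheme, the slot count `NUM_SLOTS`, and the names `transform`, `store`, and `fetch` are all assumptions made for illustration.

```python
NUM_SLOTS = 7  # assumed number of storage locations

# Each slot holds a list so that two keys landing on the same slot can coexist.
slots = [[] for _ in range(NUM_SLOTS)]


def transform(key: int) -> int:
    """Turn a key field value into a storage location (assumed scheme: remainder)."""
    return key % NUM_SLOTS


def store(key: int, record: str) -> None:
    """Place the record at the location computed from its key."""
    slots[transform(key)].append((key, record))


def fetch(key: int):
    """Go directly to the computed location instead of scanning every record."""
    for stored_key, record in slots[transform(key)]:
        if stored_key == key:
            return record
    return None


# Hypothetical customer IDs.
store(1001, "Smith")
store(1002, "Jones")
store(1008, "Brooks")  # lands on the same slot as 1001
```

Only the one slot computed from the key is examined, which is the direct-access counterpart of jumping straight to a song on a CD.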
Even worse, the fields and records for Marketing probably don't have the same structure and meaning as the fields and records for Accounting, or those for Production. Each record describes basically the same entity (customers or products), but it is very possible that each database file will have different information, or attributes, in records concerning the same entity. All of this may have happened with the best of intentions. All the departments began with the goal of making their part of the organization more efficient. Eventually these good intentions can cost big dollars to bring the islands together, resolve data conflicts between them, and retrain people to understand the new database structures. Bottom Line: Managers and workers must know and understand how databases are constructed so they know how to use the information resource to their advantage. Managers must guard against problems inherent with islands of information and understand that sometimes resolution of short-term problems is far costlier in the long term.
information. The goal of this language should be to make it easy for users. The basic idea is to establish a single data element that can serve multiple users in different departments depending on the situation. Otherwise, you'll be tying up programmers to get information from the database that users should be able to get on their own.
Data dictionary. Each data element or field should be carefully analyzed to determine what it will be used for, who will be the primary user, and how it fits into the overall scheme of things. Then write it all down and make it easily available to all users. This is one of the most important steps in creating a good database.
Figure 8.7 shows a properly constructed data dictionary report. You can see exactly who owns the data element and all the business functions that use the data element. It also lists the people who have access to the data element. Why is it so important to document the data dictionary? Let's say Suzy, who was in on the initial design and building of the database, moves on and Joe takes her place. It may not be so apparent to him what all the data elements really mean, and he can easily make mistakes from not knowing or understanding the correct use of the data. He will apply his own interpretation, which may or may not be correct. Once again, it ultimately comes down to a persware problem.
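To make the idea concrete, here is a rough sketch of what one data dictionary entry could look like if kept as a simple Python structure. The field names, the owning department, the business functions, and the helper `can_access` are hypothetical; they are not taken from Figure 8.7.

```python
# One entry per data element; every value below is invented for illustration.
data_dictionary = {
    "date_last_paid": {
        "description": "Date of the customer's most recent payment",
        "type": "date",
        "owner": "Accounting",
        "business_functions": ["Billing", "Collections"],
        "authorized_users": ["Suzy", "Joe"],
    },
}


def describe(element: str) -> str:
    """Return a one-line summary a newcomer could read instead of guessing."""
    entry = data_dictionary[element]
    return f"{element} ({entry['type']}): {entry['description']}; owner: {entry['owner']}"


def can_access(user: str, element: str) -> bool:
    """Check whether a user is listed as having access to a data element."""
    return user in data_dictionary[element]["authorized_users"]
```

With an entry like this on record, someone in Joe's position can look up what a data element means rather than applying his own interpretation.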
Physical views of items are often different from the logical views of the same items when they are actually being used. For instance, assume you store tablets of paper in your lower right desk drawer. You store your pencils in the upper left drawer. When it comes time to write your request for a pay raise, you pull out the paper and pencil and put them together on your desktop. It isn't important to the task at hand where the items were stored physically; you are concerned with the logical idea of the two items coming together to help you accomplish the task. The physical view of data cares about where the data are actually stored in the record or in a file. The physical view is important to programmers who must manipulate the data as they are physically stored in the database. Does it really matter to the user that the customer address is physically stored on the disk before the customer name? Probably not. However, when users create a report of customers located in Indiana they generally will list the customer name first and then the address. So it's more important to the end user to bring the data from their physical location on the storage device to a logical view in the output device, whether screen or paper. Bottom Line: Database Management Systems have three critical components: the data definition language, the data manipulation language, and the data dictionary. Managers should ensure that all three receive attention. Managers should also make sure that end users are involved in developing these three components.
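The difference between the two views can be sketched as follows. The storage layout, the sample rows, and the function name `logical_view` are assumptions for illustration; a real DBMS hides far more detail than this.

```python
# Physical view: the order in which fields happen to be stored (address first).
PHYSICAL_FIELDS = ("address", "state", "name")

# Sample stored rows, invented for illustration.
stored_rows = [
    ("12 Oak St", "IN", "Smith"),
    ("9 Elm Ave", "OH", "Jones"),
    ("40 Pine Rd", "IN", "Brooks"),
]


def logical_view(rows, state):
    """Present customers in one state as (name, address), whatever the storage order."""
    name_at = PHYSICAL_FIELDS.index("name")
    address_at = PHYSICAL_FIELDS.index("address")
    state_at = PHYSICAL_FIELDS.index("state")
    return [(row[name_at], row[address_at]) for row in rows if row[state_at] == state]


indiana_report = logical_view(stored_rows, "IN")
```

The report lists the name before the address even though the address is stored first, so the user never needs to know the physical layout.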
Hierarchical Databases
The hierarchical data model presents data to users in a treelike structure. Think of a mother and her children. A child only has one mother and inherits some of her characteristics, such as eye color or hair color. A mother might have one or more children to which she passes some of her characteristics but usually not exact ones. The child then goes on to develop its own characteristics separate from the mother.
In a hierarchical database, characteristics from the parent are passed to the child by a pointer just as a human mother will have a genetic connection to each human child. You can see how this database pointer works by looking at Figure 8.10.
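As a rough sketch of the pointer idea, the snippet below builds a small tree in which every child holds a pointer to exactly one parent. The class `Node`, the function `path_to_root`, and the segment names are hypothetical and are not taken from Figure 8.10.

```python
class Node:
    """A segment in a hierarchy: one parent at most, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent          # pointer to the single parent
        self.children = []            # pointers to the children
        if parent is not None:
            parent.children.append(self)


def path_to_root(node):
    """Follow parent pointers upward and return the names visited."""
    names = []
    while node is not None:
        names.append(node.name)
        node = node.parent
    return names


# Hypothetical hierarchy.
customer = Node("Customer")
orders = Node("Orders", parent=customer)
payments = Node("Payments", parent=customer)
order_item = Node("Order Item", parent=orders)
```

Because each node has only one parent pointer, there is exactly one path from any record back to the top of the tree.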
Network Database
A network data model is a variation of the hierarchical model. Take the same scenario with one parent and many children and add a father and perhaps a couple of stepparents. Now a child isn't restricted to only one parent (the mother) but can have many. That is, a parent can have many children and a child can have many parents. The parents pass on certain characteristics to the children, but the children also have their own distinct characteristics.
As with hierarchical structures, each relationship in a network database must have a pointer from all the parents to all the children and back, as Figure 8.11 demonstrates. These two types of databases, the hierarchical and the network, work well together since they can easily pass data back and forth. But because these database structures use pointers, which are actually additional data elements, the size of the database can grow very quickly and cause maintenance and operation problems.
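The pointer overhead mentioned above can be illustrated with a small sketch. The class `NetNode`, the functions `link` and `pointer_count`, and the node names are assumptions made for the example, not part of Figure 8.11.

```python
class NetNode:
    """A record in a network model: many parents and many children allowed."""

    def __init__(self, name):
        self.name = name
        self.parents = []
        self.children = []


def link(parent, child):
    """Create the relationship in both directions (parent to child and back)."""
    parent.children.append(child)
    child.parents.append(parent)


def pointer_count(nodes):
    """Count every stored pointer, since each one is extra data to maintain."""
    return sum(len(node.parents) + len(node.children) for node in nodes)


# Hypothetical example: two parents, each related to all three children.
parents = [NetNode("Parent A"), NetNode("Parent B")]
children = [NetNode("Child X"), NetNode("Child Y"), NetNode("Child Z")]
for parent in parents:
    for child in children:
        link(parent, child)

total_pointers = pointer_count(parents + children)
```

Five records here already carry twelve pointers, which hints at why the size of a pointer-based database can grow quickly.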
A relational data model uses tables in which data are stored to extract and combine data in different combinations. The tables are sometimes called files, although that is actually a misnomer, since you can have multiple tables in one file. (Make sure you review the description of fields and records in the text.) In a relational database, each table contains a primary key, a unique identifier for each record. To make sure the tables relate to each other, the primary key from one table is stored in a related table as a secondary key. For instance, in the Customer table the primary key is the unique Customer ID. That primary key is then stored in the Order Table as the secondary key so that the two tables have a direct relationship.
Customer Table
  Field Name                 Description
  Customer Name              Self Explanatory
  Customer Address           Self Explanatory
  Customer ID                Primary Key -----> Customer ID in the Order Table
  Order Number               Secondary Key
Order Table
  Field Name                 Description
  Order Number               Primary Key
  Order Item                 Self Explanatory
  Number of Items Ordered    Self Explanatory
  Customer ID                Secondary Key
Use these three basic operations to develop relational databases:
  Select: create a subset of records meeting the stated criteria
  Join: combine related tables to provide more information than individual tables
  Project: create a new table from subsets of previous tables
The biggest problem with these databases is the misconception that every data element should be stored in the same table. In fact, each data element should be analyzed in relation to other data elements with the goal of making the tables as small in size as possible. The ideal relational database will have many small tables, not one big one. On the surface that may seem like extra work and effort, but by keeping the tables small, they can serve a wider audience because they are more flexible. This setup is especially helpful in reducing redundancy and increasing the usefulness of data.
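The select, join, and project operations can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the column names and sample rows are invented, the Order table is named `orders` because ORDER is a reserved word in SQL, and the Customer table here keeps only Customer ID as the link between the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        customer_name    TEXT,
        customer_address TEXT
    );
    CREATE TABLE orders (
        order_number   INTEGER PRIMARY KEY,
        order_item     TEXT,
        number_ordered INTEGER,
        customer_id    INTEGER REFERENCES customer(customer_id)
    );
    INSERT INTO customer VALUES (1, 'Smith', '12 Oak St'), (2, 'Jones', '9 Elm Ave');
    INSERT INTO orders VALUES (100, 'Daily paper', 1, 1),
                              (101, 'Sunday paper', 2, 1),
                              (102, 'Daily paper', 3, 2);
""")

# Select: a subset of records meeting a stated criterion.
selected = conn.execute(
    "SELECT * FROM orders WHERE number_ordered > 1 ORDER BY order_number"
).fetchall()

# Join: combine the related tables through the shared Customer ID.
joined = conn.execute(
    "SELECT c.customer_name, o.order_item "
    "FROM customer c JOIN orders o ON o.customer_id = c.customer_id "
    "ORDER BY o.order_number"
).fetchall()

# Project: keep only some of the columns.
projected = conn.execute(
    "SELECT customer_name, customer_address FROM customer ORDER BY customer_id"
).fetchall()
```

Note that the customer's name and address are stored once, in the customer table, and reach the order data only through the join.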
TABLE 8.1
Table 8.1 compares these alternatives on several dimensions to show you the advantages and disadvantages of each. What you should remember is that none of these databases is very good if you don't keep the end user in mind. If you're not careful, you'll wind up with lots of information that no one can use.
Creating a database
Don't start pounding on the keyboard just yet! First, you should think long and hard about how you use the available information in your current situation. Think of the good and the bad of how it is organized, stored, and used. Now imagine how this information could be organized better and used more easily throughout the organization. What part of the current system would you be willing to get rid of and what would you add?
Involve as many users in this planning stage as possible. They are the ones who will prosper or suffer because of the decisions you make at this point.
Determine the relationships among the data elements that you currently have (entity-relationship diagram). The data don't necessarily have to be in a computer for you to consider the impact.
Determine which data elements work best together and how you will organize them in tables.
Break your groups of data into as small a unit as possible (normalization). Even when you say it's as small as it can get, go back again. Avoid redundancy between tables.
Decide what the key identifier will be for each record.
See, you've done all this and you haven't even touched the computer yet! Give it your best shot in the beginning: it costs a lot of time, money, and frustration to go back and make changes or corrections or to live with a poorly designed database.
Bottom Line: There are three types of databases: hierarchical, network, and relational. Relational databases are becoming the most popular of the three because they are easier to work with, easier to change, and can serve a wider range of needs throughout the organization.
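The normalization step described above can be sketched in a few lines. The flat rows, the field names, and the function `normalize` are hypothetical; the point is only to show one big table with repeated customer details being broken into two smaller tables.

```python
# One big table: the customer's name and address repeat on every order.
flat_rows = [
    {"order_number": 100, "customer_id": 1, "customer_name": "Smith",
     "customer_address": "12 Oak St", "order_item": "Daily paper"},
    {"order_number": 101, "customer_id": 1, "customer_name": "Smith",
     "customer_address": "12 Oak St", "order_item": "Sunday paper"},
    {"order_number": 102, "customer_id": 2, "customer_name": "Jones",
     "customer_address": "9 Elm Ave", "order_item": "Daily paper"},
]


def normalize(rows):
    """Split the flat rows into a customer table and an order table."""
    customers = {}   # keyed by customer_id, so each customer is stored once
    orders = []
    for row in rows:
        customers[row["customer_id"]] = {
            "customer_name": row["customer_name"],
            "customer_address": row["customer_address"],
        }
        orders.append({
            "order_number": row["order_number"],
            "customer_id": row["customer_id"],   # link back to the customer table
            "order_item": row["order_item"],
        })
    return customers, orders


customers, orders = normalize(flat_rows)
```

After the split, a change of address for Smith is made in one place instead of on every order row.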
8.4 Database Trends
Recent database trends include the growth of distributed databases and the emergence of object-oriented and hypermedia databases.
Distributed Databases
Distributed databases are usually found in very large corporations that require multiple sites to have immediate, fast access to data. As the book points out, there are lots of disadvantages, so you should be careful in determining if this is the right way for you to run your business.
FIGURE 8.18
Hypermedia database
Data Warehouses
As organizations want and need more information about the company, the products, and the customers, the concept of data warehousing has become very popular. Remember those islands of information we keep talking about? Unfortunately, too many of them have proliferated over the years, and now companies are trying to rein them in using data warehousing. No, data warehouses are not great big buildings with shelves and shelves of bits and bytes stored on them. They are huge computer files that store old and new data about anything and everything a company wants to maintain information on. Since the data warehouse can be cumbersome, a company can break the information into smaller groups called data marts. It's easier and cheaper to sort through smaller groups of data. It's still useful to have a huge data warehouse, though, so that information is available to everyone who wants or needs it. You can let the user determine how the data will be manipulated and used. Using a data warehouse correctly can give management a tremendous amount of information that can be used to trim costs, reduce inventory, put products in the right stores, etc.
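One simple way to picture a data mart is as a focused slice of the warehouse. The sketch below assumes the warehouse is just a list of rows tagged by subject area; the row layout, the sample figures, and the function `build_data_mart` are invented for illustration and say nothing about how real warehouse products work.

```python
# A tiny stand-in for a data warehouse: old and new data on many subjects.
warehouse = [
    {"subject": "Sales", "year": 2022, "item": "Daily paper", "amount": 1200},
    {"subject": "Sales", "year": 2023, "item": "Sunday paper", "amount": 800},
    {"subject": "Inventory", "year": 2023, "item": "Daily paper", "amount": 300},
    {"subject": "Customers", "year": 2023, "item": "New subscribers", "amount": 45},
]


def build_data_mart(rows, subject):
    """Copy out only the rows for one subject area into a smaller group."""
    return [row for row in rows if row["subject"] == subject]


sales_mart = build_data_mart(warehouse, "Sales")
total_sales = sum(row["amount"] for row in sales_mart)
```

Working against the smaller group means less data to sort through, while the full warehouse remains available for anyone who needs the wider picture.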
Even though Web browsers have been around for only a few years, they are far easier to use than most of the query languages associated with the other programs on mainframe computer systems. That's why many companies are starting to link their databases to a Web-like browser. They are finding out that it's easier to provide their "road warriors" with Web-like browsers attached to the computer at the main office. Employees anywhere can have up-to-the-minute access to any information they need. It's also proving cheaper to create browser applications that can more easily link information from disparate systems than to try to combine all the systems. Bottom Line: There are many ways to manipulate databases so that an organization can save money and still have useful information. With technological improvements, companies don't have to continually start from scratch but can blend the old with the new when they want to update their systems.
Nothing is ever as easy as it sounds. As Figure 8.21 shows, there is a lot more to a viable, useful database than just its structure.
Data Administration
Ask any manager what her resources are and she's likely to list people, equipment, buildings, and money. Very few managers will include information on the list, yet it can be more valuable than some of the others. A data administration function, reporting to senior management, can help emphasize the importance of this resource. This function can help define and structure the information requirements for the entire organization to ensure it receives the attention it deserves.
Data Administration is responsible for:
  Developing information policies
  Planning for data
  Overseeing logical database design
  Data dictionary development
  Monitoring the usage of data by techies and non-techies
No one part of the organization should feel it owns information to the exclusion of other departments or people in the organization. A certain department may have the primary responsibility for updating and maintaining the information, but that department still has to share it across the whole company. Well-written information policies can outline the rules for using this important resource, including how it will be shared, maintained, distributed, and updated.
Data planning
At the beginning we said that as many users as possible should be brought together to plan the database. We believed it so much then that we'll say it again here. By excluding groups of users in the planning stages, no matter how insignificant that group may seem, a company courts trouble.
Discussion Questions:
1. Given what you know so far, how would you structure a database for an organization to which you belong? You could use a sorority, fraternity, social group or work group you're currently involved with.
2. Why do relational database management systems appear to be better than hierarchical or network database management systems?
3. What do you see as the benefits of using a Web-like browser to access information from a data warehouse?
4. What is a data mart? What are the advantages of having one?
5. What should managers focus on when building a database?
Telecommunications and Networks 9.1 The Telecommunications Revolution 9.2 Components and Functions of a Telecommunications System 9.3 Communications Networks 9.4 Electronic Commerce and Electronic Business Technologies 9.5 Management Issues and Decisions Discussion Questions