Monday, February 11, 2008

Understanding The Normalization of Database Tables

The first time I was introduced to this concept of normalization was in my Information Systems Design and Analysis class (IT 361) and honestly it never made sense to me. Perhaps, it was due to the fact that we were not focusing so much on this material in the analysis class like we do now. Now, that I look back and I have reviewed more of the material in the textbook Database Systems: Design, Implementation, and Management by Rob Coronel 7th edition, I see how they are related. I can not explain it quite as eloquently as the text does, but I know the relationship that exists has to due with the the System Development Life Cycle(SDLC) and the Database Life Cycle (DBLC). If I can recall this correctly, each of these cycles or processes has a framework or methodology (best practices) for being executed. They are connected in that each is a reflection of the other. The Rob Coronel text puts it more eloquently in Chapter 9, Database Design, "that successful database design must reflect the information system of which the database is a part. Successful information systems are developed within a framework known as the SDLC. That within the information system, the most successful databases are subject to frequent evaluation and revision within a framework known as the DBLC(Coronel 359)." This text talks about how the database is part of the overall picture in an information system (Coronel 359) and that seems logical only now to me.

It was not that I simply could not make the connection of Information System to Database System because obviously the common thread between them is information. However, it was not until I begin to learn database design that things became more clear to me. I began to see them as partners in a marriage in the sense that one is not more important than the other; they must coexist in a way that the needs of each are met; and to be successful they can not be rushed into or occur at random but must be carefully planned so that all needs or requirements are met. Otherwise, like a marriage poorly put together the two will fall apart or fail. Thus, I realize the importance of the sequence of the systems analysis class and the database design class and why it was important to introduce some database concepts in the analysis class.

I recall my instructor explaining primary keys, candidate keys, composite keys, and first, second, and third normal form and feeling overwhelmed because I simply could not figure out why these concepts made sense to system analysis and design. Now, I was taking the class via delayed tape and was always playing catchup with the material perhaps that is the reason why I never quite understood it. At any rate, I remember my instructor saying that we would see this again later in some class. Honestly, I'm sure he likely said with database design, but I totally forgot about it.

So, now the concepts of database design has been introduced to me and we finally get to the normalization of database tables and I begin to panic. I'm thinking oh oh, here goes that difficult stuff again that never quite made sense to me. However, I watch the DVD for IT 450 and recall the ERD concepts and it does not seem so bad. After watching the class lecture, I then decide to tackle the chapter on normalization to see if the book read easier. I found myself rereading and dozing off a little bit because of the wording, but then I start to pay attention to the diagrams. It seems simple enough at least going from first normal form (1NF) to second normal form (2NF). This simplicity, I discovered by finally understanding what partial dependency ("dependency based only on part of a composite key" Colonel 154); transitive dependency (dependedency of one attribute that is not part of a key upon another attribute that is not part of a key Colonel 154); and desirable dependencies actually meant to the database design.

I understand that the goal of this normalization process it to maintain integrity of data to minimize errors that could impact the performance of the database. Yes, this is basically what I should have got from class, but it is still sometimes difficult to apply these concepts. In an effort to understand normalization, I tried to find real world examples of situations in which database design was poor due to normalization errors, however, finding that exact example was like finding a needle in a haystack. However, I did find some blogs written by professional database programmers that helped me to understand the normalizaton of database tables (http://database-programmer.blogspot.com/2007/12/database-skills-first-normal-form.html). I am not as strong with it as I like therefore I will not try to explain it. This is just a blog acknowledging that I am more comfortable with the normalization of database tables because of external research as well as class lectures and the text. Now, how I will fair come test time depends on how much time I get to practice normalizing tables and the difficulty level of the questions.

No comments: