Everything I Learned About Databases From...: 2008

Wednesday, April 2, 2008

Seeking Out Problematic Sources In Biometrics Early Instead Of Waiting To Treat Its Symptoms

In all of my blogs about biometrics, I address an ongoing concern that biometrics are being deployed as an "identity document" or for personal identification without further research and analysis on the potential magnitude of the implications. I find this topic very interesting in addition to the privacy issues surrounding databases. Therefore, I continued researching and reading articles about implementing biometrics as a more efficient security measure. I weighed the pros and cons of what I discovered before finding a document from a public policy forum on biometrics. I am sure this is neither the first nor the last time a group will convene to determine the best approach to defining an appropriate policy for implementing biometric systems for various identity verification applications. I thought, "Whew! Finally, a diverse group of intellectual individuals has taken the implications of biometrics seriously enough to initiate the development of a suitable public policy." It is obvious that we will not be able to avoid every potential problem that comes with the use of biometrics, but we may be able to minimize the impacts if we seek out the sources of the problems instead of waiting to treat symptoms.

I like the idea of seeking the cure before the disease, or at least analyzing and researching the issue to discover the source of problems. It is an approach that is taken in Project Quality Management and, in healthcare, in a natural approach called homeopathy. If we can get to the source of the disease or problem, we can prevent it instead of treating the symptoms, which perpetuates the existence of the disease or problem. The public forum discussions on biometrics seem to have taken a holistic approach, as they sought diverse perspectives for meeting the challenges of biometrics in advance. In the article, a list of major concerns and recommendations has been proposed. I summarized them as follows:

1. Concern for the potential abuse of biometric systems
2. Using "biometrics in an immigration and citizenship context could create
we-versus-them mentality."
3. Trading off security with privacy concerns
4. Implementation of biometrics without establishing an identity policy
5. Failure to perform a proper assessment of biometric implications
6. Concern that "technology will drive policy if we don't ensure that policy
imperatives are driving the development of technology."
7. Addressing the need for "a business case for using biometric applications in
identity documentation, including a national identity card."
8. Concern that the "perceived dichotomy between security and privacy is false."
9. "Use of biometrics in identity documentation presents genuine issues that merit
serious public discussion."
10. Monitoring and controlling quality and performance of such systems

Reference Article or Link:
http://www.cic.gc.ca/english/pdf/pub/biometrics.pdf

Enhanced Driver Licenses In Washington, Does This Arouse Suspicions For Anyone?

Reference Article or Link:
http://www.dmv.org/news-alerts/enhanced-driver-license.php

"Why Enhance the Driver License?
In a continued effort to develop alternative forms of identification compliant with the Western Hemisphere Travel Initiative, the Department of Homeland Security came up with the idea for these voluntary licenses and ID cards. The hope is that the EDL/ID―which denotes identity and citizenship―will make travel across land and sea ports of entry much more convenient.

Benefits of the EDL/ID
Using Radio Frequency Identification Tags (RFIT) and other measures that make forgery more difficult, EDL/IDs are encoded with the proper information to replace passports at border crossings. Furthermore, the EDL is less expensive and easier to tote around than a traditional passport. Washington Governor Chris Gregoire described EDLs as "a way to boost security at our border without hampering trade and tourism." If all goes as planned in Washington, we just might see these alternative forms of identification across all states.

Part of the Intelligence Reform and Terrorism Prevention Act of 2004, the Western Hemisphere Travel Initiative requires travelers to carry passports when crossing the borders into Canada and Mexico (as well as Bermuda and the Caribbean). EDLs would take the place of a passport for U.S. citizens crossing at these land and sea ports of entry, but not for international air travel. This represents a savings to consumers, with passports costing $97 and EDLs $45 (DMV-Washington 1998-2008)."

I read this piece after I searched for more information on the Department of Motor Vehicles in Washington, D.C. I was just looking for information about traffic light cameras and mailed tickets for traffic violations, which I wanted to blog about because of my own experience with the topic. After reading it, I recalled all of the information in Database Nation, the articles I had read, and my blogs on the use of biometrics with respect to our national security. I began to think, "Uh-oh. It's happening. They are deploying these systems without further research and analysis of their implications for accurate personal identification."

Then, I thought that perhaps the Enhanced Driver License, despite being an effort to protect our national security, is just another way to track our traveling habits. I became suspicious of what might actually be done with the information. If our spending and driving behaviors are already being tracked by collecting data from credit cards, discount shopping cards, and smart tags, then this is just another way for them to find out even more about us. Could this type of license be used as a GPS-enabled tracking device? Or could it be another way for the DMV to collect information about our traveling habits and then sell it to third parties who inundate us with travel promotions and solicitations for credit cards with frequent flyer programs?

Beyond the collection of travel data, I worry that the issues surrounding biometrics will multiply with the issuance of such a license. This could definitely lead to big business for identification card counterfeiters, which in turn magnifies the problem of identity theft. We will have to worry about malicious individuals who, for their own personal gain, seek employment at organizations or agencies that maintain biometric data. What about collected data that may be used to wrongly accuse someone of a crime simply because they were in the wrong place at the wrong time? I think that we are asking for trouble here despite all of the potential benefits. The deployment of Enhanced Driver's Licenses as a more convenient alternative to the passport seems to be in effect already, which means I will stay alert to the good and the potential bad this program may cause. Who knows? Maybe it will be successful, there will be no mistaken identities, and the worst we will endure is an increase in junk mail.

Who's Watching You?

About 2 years ago, I went to our mailbox to retrieve our mail and discovered a letter from the Washington, D.C. Department of Motor Vehicles. I thought, "Okay, this is kind of odd. We have never lived in D.C., so this must be some kind of solicitation of support or some error." The letter was addressed to my husband, a person who never reads mail because he has delegated that job to me, his personal secretary. So, I opened the letter, and to my surprise there was a partial picture of the back of his truck fit into the outline of a triangle. There was his license plate, along with smaller bits of the image showing the street and traffic signs where he had been driving. As I scanned the rest of the letter, I discovered the time at which the picture had been taken and the speed at which my husband had been driving. There was a notice in bold type indicating that my husband had been caught speeding on camera, and instructions for paying the fine were also included. "Wow! I cannot believe this!" were the first words I uttered.

Now, we were already aware of cameras being installed at traffic lights to monitor traffic violations, catch the offenders, and of course generate more revenue for those states or districts that had deployed them. Yet, it was still shocking to have it happen to us because no one else we knew, despite their frequent offenses, had ever been sent a ticket in the mail for a traffic violation. At that very moment, we began to take it very seriously that not only were our driving habits being monitored, but that all of our daily activities were also under surveillance.

When my husband and I began discussing the details of the ticket, he recalled just where he was heading that day and what he was doing at the light. Yet, he was never aware that his quick trips to get lunch, gas, or gum from a convenience store might actually be monitored. We began to explore all of the problems this could cause for anyone who is being watched, whether they are on their best behavior or not.

We imagined what this could mean for someone who is driving the company car and just happens to get caught speeding. This information gets sent to the employer in the form of a ticket. Before the cameras and this type of information sharing among agency databases, it might have been something that an employee could have handled discreetly while maintaining what appeared to be a good driving record on the surface, at least until the employer updated the data it stored on its drivers via regular DMV background searches. All of a sudden, the employee has more than just a traffic violation to worry about; there may also be conditions of continued employment or reprimands on the job. If that is not enough, consider the possibility of an error caused by two cars traveling closely together through a light, one speeding and one not. Is it not likely that a picture could be taken of the second car, a traffic violation could be recorded, and a speeding ticket could be sent to the wrong person? There have been instances where multiple cars were pulled over for speeding simultaneously because they were traveling close enough together within range of a radar gun to all be considered violators. I am sure that there are statistics to support the accuracy of radar guns in catching speeding violations, but the margin of error cannot be disregarded. A similar margin of error must be taken into account with the use of traffic light cameras. There is always a chance that cameras malfunction or lag and capture the wrong data, which then gets fed into a system where that information is shared among agencies that can make critical decisions about individuals. In this case, the decisions seem to be somewhat automated, but I could be wrong. What are the implications of deploying systems like this despite the potential for inaccuracy?

We believe that the magnitude of the implications of these surveillance systems is beyond what we could ever imagine because there are too many factors that are not being taken into account. As with all other data stored in systems like these, the potential for abuse by malicious individuals privy to that data within the organizations that maintain it is of great concern. You never know who is watching you, what information is being kept about you, and just how that information can be used against you. Thus, we keep this in the forefront of our minds most of the time, but have to be careful not to get too comfortable with the cameras like the individuals on reality TV shows.

Medical Databases Offer No Data Protection: Can This Be Serious?

I have browsed through another chapter in Garfinkel's Database Nation and am growing more paranoid and concerned each day, or at least while reading. A section in Chapter 6, "To Know Your Future," discusses how medical databases are maintained with information about our medical records that could be used for other purposes. I was not aware that, according to the book, no laws adequately safeguard our medical records. Literally, anyone in a doctor's office could maliciously use the contents of our records to commit identity theft or leak information to outsiders without being reasonably punished. Well, that stinks!

Even before reading this book, I always felt that it was not necessary to include my Social Security Number on medical forms for each doctor or hospital visit. It just seems odd to have to keep sharing something that is considered the current unique identifier that links so much information to me. When I think of how many doctors I no longer see (e.g. specialists, unfavorable doctors, and doctors left behind during a move), I wonder what could actually be done with my information.

I worry that, because I am no longer a current patient, some individual privy to my personal data will discount my very existence because they do not have to see my face again, or will believe that an old patient equates to old data that is not important.

It is overwhelming to think that someone can simply disrupt your life by going into your past to reveal some medical secret that you thought was between you and your doctor, and not be prosecuted to the fullest extent of the law. This chapter and section reveal why it is so important for the laws to catch up with technology. I am always hearing about how technology law is becoming more important to that particular industry.

Still, I wonder if we will ever catch up or slow down long enough to assess the potential sources of problems that medical records databases, among countless other types of databases, present before we start thinking about treating their symptoms. I wonder what it will take. Perhaps regulation will increase when information regarding a significant government official surfaces. Of course, if that happens, it may be an internal leak in a grander scheme of things, but that's another topic.

Biometrics: "The Silver Bullet For Terrorism?"

Since 9/11 and the war in Iraq, we as a nation have been growing increasingly concerned about how to prevent terrorist attacks. I have seen documentaries and other things in the media that lead us to believe that we may be a little closer to finding a solution. Already, we have begun to make traveling by plane a major undertaking and an often tedious, risky process. I feel that it is often risky because we do not know the full scope of what will be deemed a threat on a plane. I have heard about people being detained or delayed simply for having a toenail clipper or a fingernail file in their carry-on bags. Perfumes (if not already prohibited) might even be seen as a threat if there is no way of really knowing the true contents of their containers. I have watched on the news and read in papers so much about the need to arrive very early to get through the security screening processes in time to make your flight. The worst case scenario I recently read about was one where passengers were prohibited from getting off the plane while it sat grounded for longer than 2 hours, just as a security measure. We must go through all of these hassles so that the powers that be can try to sniff out potential terrorists or other threats.

The list of stories and bad experiences goes on and on, but from what I have seen, heard, or read so far, these incidents are rarely magnified because these security screening processes are based upon promoting and maintaining national security. If a few feathers get ruffled in the process of achieving this broad-based goal, then we have to see it as utilitarianism at best: the greatest good for the greatest number.

Some argue that there needs to be a modification to this process, an easier way to weed out the potential problems or threats to our national security. There is already so much technology being directed toward this effort, and specialized training is continuously being provided to the essential personnel, yet that is not enough.

Proponents of the increased use of technological security measures would like to see biometric systems implemented to more accurately identify those who may or may not pose a threat to our national security (EFF Sep 2003). It seems that whatever systems are already in place must be deemed far less efficient, perhaps because they cannot use unique identifiers to distinguish terrorists from ordinary citizens. Thus, biometric-based security systems would resolve this inefficiency through a more accurate means of catching terrorists, because they can use unique identifiers from human "bio" samples to spot these types of criminals during a simple scan (EFF Sep 2003).

The belief is that biometrics (e.g. fingerprints) could pinpoint the person seeking to execute some harmful act more accurately. If we would simply deploy this type of technology on a broad basis as the most appropriate security protocol, then we could ultimately rely a whole lot less on all other, less efficient methods.

The Electronic Frontier Foundation (EFF) is concerned that the marketing efforts of proponents of biometrics are duping us into believing that this is the best alternative, the end-all, be-all, cure-all (EFF Sep 2003). This group worries that this "silver bullet" mentality (EFF Sep 2003) downplays the unspoken side effects of its use and what we know to be fact. That fact is that database consistency is not guaranteed, because data managed by individuals is likely to be error-prone or at risk of manipulation.

We cannot escape errors forever. Biometrics are limited not only by human error, but also by the hardware and other physical storage systems that will house them (e.g. failed and/or outdated systems). What will happen if biometric data has to be continually transferred between systems for either system upgrades or information sharing? The potential for inconsistent retrieval of data and lost updates are among a whole host of issues that need to be considered.
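To make the "lost update" worry concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration (the `Store` class, the record fields, and the version counter); it does not describe how any real biometric system works. Two clerks read the same record, each fixes a different field, and the second save silently wipes out the first. The checked version refuses a save that is based on a stale copy.

```python
class StaleWriteError(Exception):
    """Raised when a writer saves a copy older than what is stored."""


class Store:
    """Hypothetical single-record store with a version counter."""

    def __init__(self, record):
        self.record = dict(record)
        self.version = 0

    def read(self):
        # Hand back a private copy plus the version it was read at.
        return dict(self.record), self.version

    def naive_write(self, record):
        # Overwrites whatever is stored, even if it changed since the read.
        self.record = dict(record)
        self.version += 1

    def checked_write(self, record, version_read):
        # Refuse the write if someone else saved after this copy was read.
        if version_read != self.version:
            raise StaleWriteError("record changed since it was read")
        self.record = dict(record)
        self.version += 1


def demo_lost_update():
    store = Store({"name": "J. Doe", "template": "scan-v1"})
    clerk_a, _ = store.read()
    clerk_b, _ = store.read()
    clerk_a["name"] = "Jane Doe"      # clerk A fixes the name
    clerk_b["template"] = "scan-v2"   # clerk B replaces the scan
    store.naive_write(clerk_a)
    store.naive_write(clerk_b)        # A's name fix is silently lost
    return store.record


def demo_checked_update():
    store = Store({"name": "J. Doe", "template": "scan-v1"})
    clerk_a, version_a = store.read()
    clerk_b, version_b = store.read()
    clerk_a["name"] = "Jane Doe"
    clerk_b["template"] = "scan-v2"
    store.checked_write(clerk_a, version_a)
    try:
        store.checked_write(clerk_b, version_b)
    except StaleWriteError:
        # Clerk B re-reads the fresh record and reapplies the change.
        clerk_b, version_b = store.read()
        clerk_b["template"] = "scan-v2"
        store.checked_write(clerk_b, version_b)
    return store.record


if __name__ == "__main__":
    print(demo_lost_update())
    print(demo_checked_update())
```

The version check is only one possible safeguard (locking is another), and it only helps if every system that touches the data actually uses it, which is exactly the kind of thing that gets missed when data is shuttled between systems.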

The list of potential issues that are a major concern for the EFF and citizens who oppose the deployment of biometric systems on the national level will likely grow. We can expect no overnight eradication of terrorism via some magic pill or "silver bullet" (EFF Sep 2003). What we must do is weigh the issues and do more research before we adopt and implement biometrics on the enterprise level.

I agree with the EFF regarding taking a minimalist approach until we know more about the overall impacts of biometric systems. We should approach with caution the use of any technology that involves storing and accessing individuals' bio-data and other personal data to make critical business decisions. That holds even if we come to share the EFF's belief that biometrics may enhance the current technological security infrastructure and should therefore be deployed in parallel with established systems.

We have to remember that the nature of the decisions to be made based upon the information in these types of database systems is critical not only to the businesses that implement such systems, but also to the lives of the citizens who may suffer the greatest impact if something were to go wrong.

Reference Article or Link:
http://www.eff.org/wp/biometrics-whos-watching-you

Tuesday, April 1, 2008

What Are Data Warehouses? Looking Beyond The Obvious Definition

During my research on relational databases and database privacy issues, I kept getting search results for data warehouses and business intelligence. When I searched for the term, some searches gave me more technical information than I could digest. I used other resources to get a basic but official definition, but prior to that I just imagined that this was an SAT exam. Thus, I chose to use context clues in the abstract search results and my basic eye-balling of the word. Normally, I would call upon skills I acquired from some Latin in high school to decompose the words, but it seemed simple enough. So, I gathered that it is basically information inventory (data) that is physically stored in a virtual warehouse. Although this inventory of data may be physically stored on computers and their hard drives, these physical storage systems are mobile, unlike a physical warehouse of more tangible inventory. Relocating an entire warehouse of data via data transmission technologies is more feasible than moving the tangible goods traditionally stored in warehouses. It sounds simple enough, although somewhat contradictory.

Now, the following is the official definition that I got from Wikipedia, which seemed reasonable enough to prevent an instant headache.

Data Warehouses:
"A 'data warehouse' is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and analysis. [1]

This classic definition of the data warehouse focuses on data storage. However, the means to retrieve and analyze data, to extract, transform and load data, and to manage dictionary data are also considered essential components of a data warehousing system. Many references to data warehousing use this broader context. An expanded definition for data warehousing includes tools for business intelligence, tools to extract, transform, and load data into the repository, and tools to manage and retrieve metadata (Wikipedia.com 02-Apr-08)."
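To picture the "extract, transform, and load" part of that definition, here is a minimal sketch using Python's built-in `sqlite3` module. The source rows, the `fact_orders` table, and the cleanup rules are all hypothetical, invented only for illustration; a real warehouse would use far more elaborate tools.

```python
import sqlite3

# Hypothetical rows from an operational system:
# (order_id, customer, amount as text, order date)
SOURCE_ROWS = [
    (1, " Alice ", "19.99", "2008-03-30"),
    (2, "BOB", "5.00", "2008-03-31"),
    (3, "alice", "not-a-number", "2008-04-01"),  # bad record
]


def extract():
    """Extract: pull raw rows from the (pretend) operational system."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: tidy names, parse amounts, set aside rows that fail."""
    clean, rejected = [], []
    for order_id, customer, amount, day in rows:
        try:
            clean.append((order_id, customer.strip().lower(), float(amount), day))
        except ValueError:
            rejected.append(order_id)
    return clean, rejected


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, "
        "amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run all three steps and report which order ids were rejected."""
    clean, rejected = transform(extract())
    load(conn, clean)
    return rejected


def totals_by_customer(conn):
    """A simple reporting query of the kind a warehouse exists to answer."""
    return conn.execute(
        "SELECT customer, ROUND(SUM(amount), 2) FROM fact_orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print("rejected order ids:", run_etl(connection))
    print(totals_by_customer(connection))
```

The point of the sketch is the separation of steps: the messy data is cleaned once on its way in, so that every report afterward reads from the same tidy table.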

After reading a simplified but official definition, I realized the limitations of my own definition of data warehouses. Those limitations had to do with the purpose and implementation of data warehouses. I found an article called "The Case for Data Warehousing" (Greenfield 1) that helped me to understand what data warehouses are and why they are implemented.

What I gained from this article was that data warehouses can also be massive storage for data about data, or metadata, meaning the data definitions stored in a database that represent a model for how data is to be used. They are stored separately from the operational systems so that data retrieval is faster. Basically, when companies decide to implement data warehouses, they do so in hopes of improving the integrity, accuracy, and consistency of data and minimizing the time required for processing database transactions. Overall, it is about database optimization.

Most companies that decide to implement a data warehouse do so to optimize the overall performance of their business by optimizing the management of the data critical to it, or what is mostly considered business intelligence. I gathered from the article that a data warehouse can have serious implications for a business if it is not executed successfully, and this is clearest to me when I think about data modeling. I recall just how important that task was when I had to perform it for information systems analysis and design. Each piece of data coming in and going out has to be processed, stored, and managed properly, or the system of processes is virtually inefficient.

Still, I remain limited in my overall knowledge of the subject. Therefore, I will simply say that I get the gist of what data warehouses are and their significance to business. I think that I may also have some good examples of data warehouses, but have no idea of how well they are being optimized. Perhaps I can research two very popular ones (e.g. the Social Security Administration and the credit bureaus) to see what I can discover about how the data is being managed to create business intelligence.

Referencing Article:
http://www.dwinfocenter.org/casefor.html

More About the Oracle vs. PeopleSoft Battle

In a previous blog, I talked about just how much and how little I knew about PeopleSoft. I revealed that, despite pursuing a career in IT, I did not realize that PeopleSoft was a major database company. However, I knew that PeopleSoft was a big deal because everywhere I turned someone was talking about PeopleSoft training. I also noted that my previous employer was making the transition to PeopleSoft systems just as I was exiting the company. Later, I discovered via a lot of media coverage of the Oracle v. PeopleSoft battle just who and what PeopleSoft was and that they were fighting Oracle's acquisition of them. Also, I mentioned that employee and customer attitudes at PeopleSoft were very unfavorable toward Oracle for various reasons, the primary ones being jobs and product support. Besides what I presented previously and my knowledge of Oracle's successful acquisition, there was not much that I knew about the whole Oracle vs. PeopleSoft battle. Therefore, I ended my blog with the promise to go seek more information about it.

During my research, I found the article "PeopleSoft's Last Hurrah?" (Gilbert 21-Sep-04) on CNET News.com. After reading it, I discovered that the concerns about Oracle's plan to discontinue support of PeopleSoft were valid ones. I don't know the exact details because I did not have time to do as much digging as I would have liked, but the article mentioned Oracle's plans to support the product for only 10 years after the acquisition. My immediate sentiment upon reading that was "Ouch!" Then, I began to think that 10 years is a long time in IT and that technology will inevitably change faster than we can adapt to and adopt it. Yet, I could not ignore what a company that had invested a lot of money in the PeopleSoft product might have been thinking at the onset and end of this battle. Obviously, they had to see it as sunk costs and begin raising funds for the eventual costs of a new database platform. Or they could run the risk of keeping a product for which support might be limited to the expertise that internal personnel had gained through experience with the product.

I contemplated PeopleSoft's position for a while, taking into account employee and customer concerns. Then, I tried to visualize things from the Oracle perspective. PeopleSoft was Oracle's biggest competitor and, well, business is business. If we see it from a general business perspective, then it is basically survival of the fittest, and Oracle was determined to be the survivor. I imagined that the employees and customers at Oracle could easily have been in the same positions as those at PeopleSoft. Therefore, I take no sides in this matter, but try to foster understanding in business.

However, I did not understand how PeopleSoft, despite being "...the second-largest supplier of enterprise resource management software, behind SAP and just ahead of Oracle" (Gilbert 1), was in the unfortunate position of being taken over by the very company it seemed to be outperforming. I am sure that if I dug a little deeper, I could trace it all back to the financial statements and whatever else those Wall Street journalists report. Unfortunately, my research was limited by time and by my aim of satisfying only the gist of my curiosity, even though the article also revealed the financial problems PeopleSoft faced after acquiring a rival company. All in all, what I learned about the Oracle vs. PeopleSoft battle was that it represented another cycle of business in which a hostile takeover led to antitrust suits, bitter words between opposing CEOs, the eventual win for the acquiring company and loss for the acquired company, and the usual gamut associated with mergers and acquisitions.

Still, I would like to know a little more about Oracle, PeopleSoft, SAP, and other major providers of enterprise resource management software. I am more interested in Oracle now, partly due to this topic, but primarily because it seems like the chosen one so often when I hear about databases, SQL, and other related enterprise resource management discussions. Another unique reason is that there has been an Oracle headquarters or branch located near my past employer and now within walking distance of my home. Although I never really knew exactly what area I would pursue in IT, I was always fascinated by Oracle because of its association with databases and the fun I had building a database in community college. Who knows? It may be a symbol of something, but I will not know until I actually gain more experience with database design without the ease of Microsoft Access.

Referencing Article:
http://www.zdnet.com.au/insight/software/soa/PeopleSoft-s-last-hurrah-/0,139023769,139160071-2,00.htm

"Create One Version of Truth": Can This Ideal Be Realized in Databases Managed by Humans?

In my blog "Could 'End-User Buy-In And Support For Accurate Data' Resolve The 'Garbage-In, Garbage-Out' Issues Of Databases?", I discussed how the organization in the article, Rensselaer Polytechnic Institute, had developed what could be a best practices methodology for ensuring accurate data and consistent databases. I summarized the list of steps that Rensselaer implemented to facilitate this process of maintaining accurate data while gaining a consistent database in return. The list is included again as follows:

1. Create cross-functional support.
2. Think big, start small, deliver quickly.
3. Create one version of data truth.
4. Provide support for new behaviors.

Notice that I put the third item in the list in boldface type to coincide with this blog topic. I was reviewing the list of blog entries on this site to ensure that any blogs that were a prequel to a series of subsequent blogs had indeed been followed up by those sequel blogs. When I scanned the list of specific ones to concatenate, this blog suddenly redirected my focus. I kept thinking about a recent situation in which a critical judgement or decision had been made on the basis of certain information provided by what was obviously deemed an accurate and reliable repository. Although the data gathered from this repository was not subjected to check constraints, it was presumed accurate.

Thus a critical decision or judgement was made when in fact the data retrieved from this repository was inconsistent. The repository had not been updated to reflect errors that had been made, so it was presented to the end user as valid. A major issue with this inconsistent data was that the keeper of the data made a mistake and acknowledged it but, either for lack of accountability or out of forgetfulness, did not update the repository. When confronted with the error, the data keeper (dba) still did not make changes to the repository, so the inconsistent, inaccurate data that was retrieved resulted in actions and ideas that had negative impacts on the objects in the repository.

In search of understanding what caused the bad data to be stored in the repository despite knowledge of errors, a deficiency was discovered. There had been no clearly defined data definitions or any model for how information or different events would be handled since the business had hired a new data keeper (dba). Thus, problems surfaced because the business requirements had changed and no one had updated the repository to reflect those changes. Users kept making updates that were being either discarded or lost because the new data keeper did not communicate the new business rules, so inconsistent data kept being rewritten. The problem was the lack of communication regarding data issues on the part of the data keeper. Information continued to be modified so much that it began to cause conflict, and to resolve it the data keeper created a version of truth for the repository and communicated the change to upper management instead of the users. Upper management then made a critical decision influenced by bad data and a desire to resolve the chaos within the system. Data definitions were established and business rules were communicated to users and stored in a new repository as a version of truth. It seemed that a lot of the problems were resolved for the data keeper and upper management, but not for the users. Inconsistent data and poor communication had tarnished the users' credibility, and no one had created one version of truth.

There was a version of truth extracted from the repository, but it was inaccurate. Then, there was a version of truth communicated to upper management via the users that was deemed unworthy of recording in the new repository despite its validity. Upper management stepped in at the request of the new data keeper to establish a version of truth that everyone would have to accept. This eliminated chaos, but did not truly create one version of truth because there were multiple ideas of what the one version of truth entailed. Since the sets of data had not been checked against each other and no concatenation of the data from the repository (dba), the users, and upper management had been produced, no one was able to create one version of truth.
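To make the idea of checking the sets of data against each other a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the three dictionaries, the item names, and the find_disagreements helper are hypothetical and do not come from the actual situation described above. It simply lists every record where the repository, the users, and upper management do not agree.

```python
# Hypothetical records: three sources describing the same items.
repository = {"item-1": "active", "item-2": "closed"}
users = {"item-1": "active", "item-2": "open"}
management = {"item-1": "active", "item-2": "closed", "item-3": "active"}


def find_disagreements(sources):
    """Return {key: {source_name: value}} for each key whose value differs
    between sources or is missing from at least one source."""
    all_keys = set()
    for records in sources.values():
        all_keys.update(records)

    disagreements = {}
    for key in sorted(all_keys):
        # A missing key shows up as None so that gaps count as conflicts too.
        values = {name: records.get(key) for name, records in sources.items()}
        if len(set(values.values())) > 1:
            disagreements[key] = values
    return disagreements


conflicts = find_disagreements(
    {"repository": repository, "users": users, "management": management}
)
for key, values in conflicts.items():
    print(key, values)
```

A report like this does not decide whose version is right. It only shows where the versions differ, which is the step that was skipped before one version of truth was declared.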

This is sort of a wacky analytical blog in which the revelation is that, just as there are multiple sides to every story, there are multiple sides to the presentation of information stored in databases. What makes a single story have so many sides is that each user can interpret data in a variety of ways and, based upon some ideal, can present it as they see fit. This may lead to inconsistencies in the data and cause problems for the database system if there are no checks and balances. Since individuals maintain databases and can modify, within certain restrictions, the data contained in them, there is potential for human error. If no one is willing to check the database for those errors to ensure that all data is consistent with everything that has been presented, then it will be impossible to have a database with one version of truth. It will be more along the lines of deciding whose version of truth is more acceptable despite bad data, thus making this concept difficult to realize in some settings.

What Are Biometrics?

The first time I ever heard the term biometrics was two years ago during a database concepts lecture. Yet, when it was explained to me, I knew that it had been a big deal on the news after the 9/11 incident. I believe at that time I was bombarded by the news constantly reporting national security issues and intelligence conferences where major IT security companies were heavily sought. Vaguely, I remember an anchorperson talking about how the government was considering the use of software that could identify criminals, terrorists, or otherwise potentially dangerous individuals before they board a plane simply by scanning the irises of their eyes. Immediately, I thought, wow, we're really going sci-fi now. I cannot recall how many times I had seen all that technology in futuristic movies, spy movies, and movies that involve government corruption, e.g. Enemy of the State, Pelican Brief, and the list just gets longer. Still, I had never really made the connection to databases despite being obviously aware that some instance of information sharing and checking had to occur. Thus, I decided to look around to see what interesting articles I could find about biometrics.

First, I just wanted to know what it meant. An official definition for biometrics was provided on the website http://www.eff.org/wp/biometrics-whos-watching-you and given as the following:

"Biometrics refers to the automatic identification or identity verification of living persons using their enduring physical or behavioral characteristics. Many body parts, personal characteristics and imaging methods have been suggested and used for biometric systems: fingers, hands, feet, faces, eyes, ears, teeth, veins, voices, signatures, typing styles, gaits and odors ."

Biometrics, as far as I understand, are basically metric systems (or measuring systems) based upon living organisms or tissue (bio) that are used in an attempt to uniquely identify (primary key ideal) individuals. My interpretation or paraphrasing of the term calls to mind a database. When I think about how biometrics can be used to discriminate one related entity from another, I begin to think about primary keys, which uniquely identify a set of data, and normalization, which optimizes the retrieval of that data. However, as with most databases, there are limitations and potential for errors. This presents a lot of major concerns for the EFF and others, which leads to another blog discussion on biometrics.
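For anyone who wants to see the primary key ideal in action, below is a small sketch using Python's built-in sqlite3 module. The person table, the template_id column, the made-up names, and the try_enroll helper are all my own assumptions for illustration and are not taken from the EFF article. The point is only that a primary key refuses a second row with the same identifier. A real biometric system compares samples approximately rather than by exact equality, which is where the limitations and errors mentioned above creep in.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    " template_id TEXT PRIMARY KEY,"  # hypothetical ID for a stored biometric template
    " full_name TEXT NOT NULL)"
)
conn.execute("INSERT INTO person VALUES (?, ?)", ("fp-0001", "Alice Example"))


def try_enroll(template_id, full_name):
    """Insert a person; return False if the primary key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO person VALUES (?, ?)", (template_id, full_name)
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = try_enroll("fp-0002", "Bob Example")        # new key, accepted
duplicate = try_enroll("fp-0001", "Carol Example")  # key already used, rejected
count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(first, duplicate, count)
```

The database can only guarantee that the key is unique. It cannot guarantee that the key was attached to the right person in the first place.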

Should We Be Concerned About Biometrics?

Referencing Article: http://www.eff.org/wp/biometrics-whos-watching-you

"Why be concerned about biometrics? Proponents argue that:

A) biometrics themselves aren't dangerous because all the real dangers are associated with the database behind the biometric information, which is little different from problems of person-identifying information (PII) databases generally;

B) biometrics actually promote privacy, e.g., by enabling more reliable identification and thus frustrating identity fraud.

But biometric systems have many components. Only by analyzing a system as a whole can one understand its costs and benefits. Moreover, we must understand the unspoken commitments any such system imposes (EFF Sep 2003)."

I read this and was moved by the idea that a system that could have significant implications on personal identification data is being prematurely recommended for broad based deployment without significant analysis of its advantages and disadvantages.

The advantages of biometrics seem to create a false sense of security I believe in part because we have relied on them for so long in other applications. Biometrics have been used in law enforcement where we fingerprint criminals and maintain that information in records to possibly identify repeat offenders, update current records with new information about individuals, or for use in otherwise pertinent applications. As far as I know, fingerprints have been very reliable in identifying individuals and solving criminal cases and now with the advancement in forensic science we are able to use DNA and DNA databases to close cases that would otherwise be deemed cold or open indefinitely. This biometric data has also been very useful in exonerating the falsely accused and getting justice for these individuals and their families.

It seems simple enough that we could rely on "bio" samples unique to each individual to perform check constraints against "live" samples in biometric databases (EFF Sep 2003). Yet, I have to wonder what would happen in the case of individuals who share similar DNA characteristics, particularly twins, parents, other siblings, and family members. Now, the argument could be made that each individual's fingerprints are unique, but what about scanning faces and other body parts for personal identification? I have lost count of the number of times someone has mistaken me, my mother, and/or my sister for someone else. Most of the time it was because we look so much like other members of our family, which is to be expected when someone just glances at your face and recalls a familiar image. I can see this being a potential problem for a database system as well. The potential for errors seems to exceed the scope of biometrics, particularly when you factor in identical twins. I am just curious to know how we could prevent mistakes in that instance. These are just one or two potential disadvantages of biometrics. Obviously, we need to rethink early adoption of these systems as primary tools for personal identification.
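The look-alike problem can be sketched with a toy example in Python. The numbers below are made up and are not real face measurements, and the distance-plus-threshold comparison is only my assumption about how a simple matcher might work, not a description of any actual biometric product. Two sisters have nearly identical stored templates, so a live sample from one of them matches both when the threshold is loose and matches nobody when it is too strict.

```python
import math

# Hypothetical stored templates: made-up feature values, not real measurements.
enrolled = {
    "sister_a": [0.80, 0.32, 0.55],
    "sister_b": [0.81, 0.30, 0.56],
    "stranger": [0.20, 0.90, 0.10],
}


def distance(a, b):
    """Euclidean distance between two feature lists of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def matches(sample, threshold):
    """Names of every enrolled template within the threshold of the sample."""
    return sorted(
        name
        for name, template in enrolled.items()
        if distance(sample, template) <= threshold
    )


live_sample = [0.80, 0.31, 0.55]  # pretend this scan came from sister_a

loose = matches(live_sample, threshold=0.05)    # accepts both sisters
strict = matches(live_sample, threshold=0.005)  # rejects everyone, even sister_a
print(loose, strict)
```

Loosening the threshold lets relatives in, and tightening it locks the right person out. That trade-off is one reason I think these systems need more analysis before they become primary tools for identification.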

I agree with the Electronic Frontier Foundation that we need a realistic model for building the most efficient biometric systems before we implement this technology, instead of promoting it as a cure-all solution for personal identification and the most accurate way to combat crimes that rely on biometric resolutions. We do not know enough about the impacts of relying too heavily on such systems. It is very easy to allow the pros to outweigh the cons when we approach advancements in technology. It is almost like kids who are easily persuaded by the promotion of all these innovative and technologically advanced toys or gadgets, so they adopt them by persuading their parents of the surface benefits. Consider, for example, the continuous stream of video games and cell phones that saturate the market. We do not realize that with the early adoption of these systems and gadgets we ultimately pay the price of systems and phones that become outdated almost the moment we buy them, or of poor behaviors that result from not understanding the full extent of product adoption (e.g. laziness on the part of children who would rather sit and play games all day instead of being active, or excessive text messaging that leaves us paying exorbitant cell phone bills). We may get more than we bargained for if we act prematurely in the adoption of new technologies instead of thinking things through before we act.

It all comes down to asking ourselves whether we should be concerned about biometrics, gathering data about that question, and evaluating and analyzing that data before we rush to deploy a system that may do more harm than its proposed good.

Monday, March 31, 2008

Women's Influence on Popular Database Yahoo

I was searching the Internet for more blog articles when the headline "Yahoo To Launch Site For Women" caught my eye. It was nothing extraordinary, but I found it interesting that a popular search engine like Yahoo would be taking the time out to cater to a more specific demographic---women. I guess I should not have been surprised with all the marketing the Internet and its search engines promote. However, I naturally assumed they lumped us all into one group of web surfers with subcategories that were more focused on interests by topic, but not so much by the nature of our sex. It makes sense that Yahoo would redirect their focus to a particular market niche, but why now?

In search of the answer to that question, I kept reading the article, which later revealed what seems most logical: there is a profit issue. Apparently, Yahoo is one of the Internet companies that is not faring so well as of late. I guess that shows how much I really pay attention to this stuff. Honestly, I take it for granted that the Yahoos and Googles will always be there at my and everyone else's disposal if ever an Internet search beckons them. Yet, that does not seem to be the case, as Yahoo, despite its model, is subject to the cycles of all business---growth, maturity, decline, and death. It seems Yahoo is trying to prevent the latter part of that cycle. In its quest to revive business, Yahoo has recognized a market niche that it has yet to truly tap. It has realized the power and influence women continue to have in the marketplace. Apparently, our opinions are significant factors not only in determining the square footage of spaces in real estate, but also in the cyberspaces where Yahoo maintains a residence.

Thus, Yahoo has initiated the task of designing its user interface to reflect what their internal research indicates is of most importance to women. Yahoo has decided to partner with media companies (magazines) to develop products and services that cater to women and their needs or interests. The following is an excerpt from the article:

"Amy Iorio, vice president for Yahoo Lifestyles, said internal research also shows women are looking for a site to aggregate various content and communications tools.

'These women were sort of caretakers for everybody in their lives,' she said. 'They didn’t feel like there was a place that was looking at the whole them — as a parent, as a spouse, as a daughter. They were looking for one place that gave them everything.' (New York Associated Press 31-Mar-08).”

I think this is definitely a good move for Yahoo, judging just by how modifying their content may affect me. Often, I go to Yahoo primarily to check email and rarely use it to perform a search unless my browser references it. All I hear about is Google, and that's likely what most people use. I cannot recall anyone ever saying "Oh just Yahoo me" the way they say "Google Me." Naturally, I began using Google because it was so well promoted that I believed it must be a great company as well as Internet resource. However, my first email account was created via Yahoo, so I have remained somewhat loyal. If they modify the content on the user interface to something that interests me more, then I would probably access the site more. In my opinion, Yahoo has made a smart decision to address the needs of women by providing "content and communication tools (NY AP 31-Mar-08)" on their sites that will allow women to feel like they have everything in one place.

Referencing Article:
Yahoo to launch site for women
31st March 2008, 7:30 WST
http://www.thewest.com.au/default.aspx?MenuID=145&ContentID=65337

Friday, March 28, 2008

Anarchy, Anarchists, and Anarchism Either Way You Say It Spells Rebellion Against DBs

Anarchy: a utopian society of individuals who enjoy complete freedom without government

Anarchist: 1: a person who rebels against any authority, established order, or ruling power 2: a person who believes in, advocates, or promotes anarchism or anarchy

Anarchism: a political theory holding all forms of governmental authority to be unnecessary and undesirable and advocating a society based on voluntary cooperation and free association of individuals and groups

While researching information about one of my favorite movies, Enemy of the State starring Will Smith, I stumbled upon an interview with the writer of the film, David Marconi. It was an in-depth and almost eerie interview, one in which the article's writer, faced with a challenging interviewee, simply had to sit back, turn on his recorder, and allow the director to speak at his leisure. The eerie vibe I felt as I continued to read the article made so much sense to me later when the words anarchy, anarchist, and anarchism were defined and discussed. It made sense that the article's writer would not be able to systematically interview this guy---the writer/director. In that instance, I realized that this man lived and breathed this philosophy of a truly autonomous life without authority, rigor, and control so much that he was not willing to compromise that belief even during an interview. For him, there was no standard protocol---no need for structure.

Immediately, I was intrigued by this concept of free will and no implications of time which he later discussed during the interview. When I analyzed the article, I thought of Garfinkel's Database Nation and began to understand why Marconi was the perfect someone to make a movie like Enemy of the State. The more that I read the article the more I was taken aback by what this man had to say because he obviously was no bumbling idiot. A lot of what he said made sense to me particularly when he discussed the implications for time and the way children initially view time. The recurring theme was about giving up freedoms for an authoritative ideal of rigid structure. I kept thinking about the idea of restrictions, structures, and form while I read the article and how I could make them analogous to databases. The key to me was the form with respect to database design.

I abandoned my idea of just wanting to write about GIS, criminal databases, government spying, and the whole intricate web of ideas related to the events that occur in the movie. The idea of form when we design a database with respect to table normalization for database optimization reflected the ideal of a rigid structure (normalization) and implications of time (e.g. optimization). I thought about how we even evolved to this point of needing to be so organized or structured. Then, I began to contrast that with this ideal that we were convinced by some authority perhaps the business with its business rules that the task we were executing was necessary for order and to optimize time. It all seemed so intricate to me, but yet it made sense if I chose to see things from the perspective of an anarchist. Thus, I would see the database as another construct of that authoritative system that needs to place restrictions not only on the contents or data it keeps, but on us as the keepers (DBAs) of that data. And either way you spell it, A-N-A-R-C-H-Y, A-N-A-R-C-H-I-S-T-S, or A-N-A-R-C-H-I-S-M, it is rebellion against databases.

Referencing Article:
http://www.altpr.org/apr12/zerzan.html

Wednesday, March 26, 2008

What I Know About PeopleSoft: Formerly An Emerging Brand Now It's The Oracle Brand

I don't know much about PeopleSoft except that it became a big deal in 2000 to the institution (a community college) where I attended classes and worked as a part-time IT support tech. Honestly, I had no clue that it was a database or anything. All I remember were the grunts from tenured staff that they were having to learn a new system because someone was "moving their cheese." During that time, I was a member of a 3-person IT staff and we had regular meetings about the grunting, but only one of us (a senior staff member) had received training on PeopleSoft. It just so happened that particular individual liked having a monopoly on the information and seemed quite content to be delegated the job of training everyone else---essential personnel of course. I was just a student and fell into the non-essential personnel at the moment (as long as I was not needed to put out little fires, this was not something that would immediately affect my job). Therefore, I was enjoying my cheese for the moment.

I recall planning my move to Virginia Beach to attend college on campus, so I only got the gist of what the PeopleSoft brouhaha was all about. Essentially, it was being used for student enrollment, other registration tasks, and employee information. During that time, the concept of VoIP was also being thrown around as a requirement for deploying the PeopleSoft system. Other than that I was clueless.

So, I packed my bags, moved to VA Beach, and left the IT field for the life of a commuter on campus and that of a debt collector. The next time I heard the name PeopleSoft was in a few job search requirements and later during news of Oracle's acquisition of the company in 2004-2005. I remember seeing upset employees' resistance and, I think, picket signs disparaging Oracle for acquiring the company. There was talk of layoffs and downsizing, and rumors that Oracle had acquired its competition out of fear and planned to let the brand die by not providing support for the application. Some of the concerns seemed logical and valid while others did not make complete business sense.

Layoffs and downsizing are normal during mergers and acquisitions, but trying to kill off an established brand and application that had already been adopted by many users did not make business sense. If I recall correctly, Oracle made a statement regarding continued support and plans to keep some of PeopleSoft's employees. I am a little shaky on the details, so I decided to research it further to learn more about PeopleSoft and Oracle's acquisition of the company. What I learned was...(to be continued in another blog).

Vendor Quality and Assurance: A Lesson On Vendor Relations

"...Software companies, including Oracle, typically include clauses in their license agreements that remove their liability from any kind of negative business-related events resulting from the software (like crashing all your company's servers), nor are there any warranties for the buyer to fall back on.

In other words, it's "buyer beware" on steroids. "If it doesn't do what you thought, it's not our fault," Jones adds. "[Vendors] are very unlikely to do anything that creates promise in the future."

Two of the main reasons why software companies typically include such licensing provisions are because, first, they can; and second, because there are accounting rules designed to stop vendors from misstating the timing of when they record revenues. For example, if a license agreement includes a 12-month warranty period, explains Jones, the vendor could not book the revenue from the software deal until after the 12 months expired (Wailgum 1)."

I stumbled upon this article soon after a Quality Management lecture in my Project Management class. My instructor gave us a synopsis of a poorly managed IT project courtesy of an article about the problems with the automation of the Census in today's Virginian-Pilot. The article discussed the complications of the project, the initial procurement costs of an estimated $5.1 million, and a projected additional $2 million plus to make changes to this project despite its failure to meet project requirements. The vendor requested more money to fix problems due to the fact that the product it created was not user friendly. In its defense, the vendor blamed the Census Bureau for failing to clearly state the specifications of the product it was so expensively paid to complete. Now, from what I have gained from the Project Management course so far and what my now agitated instructor reiterated, the vendor was responsible for the efficient management of this project and basically passed the buck. If a project fails, then it is the responsibility of the vendor's Project Manager and the vendor should be held liable.

Ironically, the article that I'm primarily referencing seems to reflect a similar situation in which Oracle's licensing agreement almost allowed it to escape liability despite any potential negative impacts to its customers. However, this European bank did its due diligence, ensuring via tactful negotiations that it had quality assurance in the event of product failure despite an often ironclad licensing clause for Oracle products.

Initially, when I read the headline and the opening excerpt in this blog, I had mixed feelings about Oracle regarding quality and accountability. I guess that is why it pays to dig a little deeper. Oracle, for as long as I have known, is a very widely respected company, except among some or at least a significant percentage of downsized PeopleSoft employees (I'll save that blog for next time). I am sure that there may still be some instances where a customer was displeased, but I have not researched those yet.

Anyway, I continued to read further and was quite impressed with how the customer (a European bank negotiator) handled what could have been a stonewalled situation. Hats off to the negotiator in this deal because he really did his homework and mapped out a strategy for what could serve as a best practices approach to managing vendor relationships (Wailgum 3).

The negotiator researched licensing agreements with respect to vendor audits and gained a background on how Oracle and other software vendors typically maneuvered their way out of quality of service (QoS) guarantees via legal backdoors. He did what we often still struggle with in IT and as a society in general, which was to gain insightful information regarding past transactions or events and use it most efficiently. The customer realized an opportunity to minimize the costs and risks of doing business by maintaining focus on what was most important to them (quality and reliability) while also providing a benefit to Oracle---for a hefty price of course. Oracle saw it as a win-win situation and, despite its reputation for playing hardball with licensing issues, decided to bend its rules to accommodate the European bank. Perhaps it was the weight of the money bags on its back that primarily sealed the deal and not so much that Oracle was becoming soft on its licensing policies.

Although it is very likely the money and obvious mutual benefits sealed the deal, I like to think that Oracle is committed to upholding its reputation as one of the best database (application) vendors out there and cares about its customers.

Referencing Article:
http://www.cio.com/article/198000/How_a_European_Bank_Got_Oracle_to_Surrender_Key_Software_Licensing_Points

Tuesday, March 25, 2008

IMDB.COM: IS THIS INTERNET MOVIE DATABASE AN INVITATION TO CAREER NETWORKING OR CAREER HACKING?

What I found most astounding was that this database is available to ordinary users as well as entertainment professionals despite celebrity privacy issues and frequent cell phone hacking to get the very information that may be accessible on IMDb.com. Then, I thought perhaps the creation of this site was an effort to give hackers some of the information they seek in a less violating and mutually agreeable way. I am sure no one really agrees with me on this, but after reading (in Database Nation) about all the sneaky ways information is collected, I may not be wrong. At the very least, it could be useful for someone without malicious intent who may need a creative way to network in the entertainment industry due to limited connections. If I look at it from the perspective of someone just trying to catch a harmless entertainment career break, then I step out of the paranoid and overly concerned role that Garfinkel cast upon me. I begin to see this type of database as just as vital as those other career sites like Monster and CareerBuilder.

Professional online databases for people in the entertainment industry seem very beneficial for those trying to get discovered in this business, but I wonder if this could be inviting more trouble. Today, with all the instances of cell phone hacking and other IT security issues associated with malicious people trying to gain access to celebrity information, could this be a disaster waiting to happen? I will keep an eye and ear out for news that the data or security of this site has been compromised because it seems like a perfect place to target for those who are enticed by or obsessed with discovering personal information about celebrities.

Thursday, March 20, 2008

My First Reaction to Reading A Chapter of Database Nation

This blog is about a journal entry I wrote in September 2006 soon after reading a chapter of Database Nation: The Death of Privacy in the 21st Century by Simson Garfinkel (http://www.oreilly.com/catalog/dbnationtp/). I had just given birth to an 8 lb baby boy three weeks after undertaking Introduction to Database Concepts for the first time. If I recall correctly (because I was quite delirious due to managing a newborn), I found the material in the course textbook to be overwhelming and a little dry. Thus, when I had the Database Nation book sent to me, I was never happier to find an easy read that could be used to complete a journal assignment. Unfortunately, I had to drop the course due to health complications and was unable to attempt it again until the Spring 2008 semester. I began to approach the tasks of reading the book and completing the journal again when I found something I wrote the day I got the Database Nation book. The entry is included below.

September 27, 2006, 1:01 a.m.

I just received my Database Nation paperback outside my door today. No one knocked on my door to see if I was even home to receive the package. UPS just dropped it in front of my door, unlike all of the other times when they require me to sign or will not leave a package otherwise. Usually, they stop by the rental office if no one answers the door to my apartment. It's funny because I was home this time, but never heard anyone knock on the door. I even sat in my living room near the door in anticipation of a package. Most of the day had gone by when my son arrived home from school with a package in hand (much to my surprise) that had been left outside our door. Now, I know it is typical for UPS to just leave packages outside if no one is home, but it was atypical for our UPS agent. I had been worried that the package would get delivered later today or days after we had gone out of town on a family trip.

My first thought was that I need to notify UPS not to just leave unattended packages by my door because someone could steal or discard them or, much worse, open them to obtain information about me. I can be a privacy lunatic sometimes, but I feel like the issues that surround privacy are so infinite that we cannot even begin to fathom the best solution. All of this I pondered just at the thought of receiving a book that explores the intricacies of privacy and its relationship to databases.

I am reluctant to open this book because I worry that it will be yet another dry read. However, I discover that I am more intrigued by each page I read. So many privacy issues and ideas cross my mind. There is much to consider. I think about how much this book speaks to my everyday life and how technology has evolved to cause an evolution of great proportions in my own life. I have changed so much without really being aware of the change.

As I read all of the examples of how databases can affect our lives via the aspect of privacy, I begin to think about the movie "Enemy of the State." It is one of my favorite movies because it explores the possibilities of privacy issues like Database Nation does. I begin to read more pages of the book and ponder what angle I will write about for my journal assignment. I think of so many things that I am overwhelmed. I can not remember them all. In fact, I find myself so overwhelmed that I can only relate it to how much of that same feeling overcomes me when I consider how to protect my privacy.

Ultimately, I feel naked no matter what I try to do to conceal my information. I shred things a lot now. In the past, I never cared about shredding and would never have thought about buying a shredder because there was no obvious need. Now, I shred anything with my own or other household members' personal information on it if it must be discarded. Sometimes, I take shredded things and flush them down the toilet. Luckily, I have managed not to clog up the plumbing, although I fear that any day my toilets are going to grow tired, give a big hiccup, and expel everything that I tried to flush, conceal, eliminate, or keep away from identity thieves.

I try to process all that I have read and all of the thoughts that the book has generated. Then, I think about how I can tackle the journal entries. Maybe I can compare and contrast the movie "Enemy of the State" and Database Nation. Another option would be to read the articles, the Rob textbook, or Simson Garfinkel's Database Nation. Maybe I should just read Garfinkel's book and go with whatever I feel passionate enough to write about. How about voicing my concerns and how they relate or don't relate to Mr. Garfinkel's? I could compare and contrast our ideas.
(to be continued....)

Saturday, February 23, 2008

Database Privacy Issue Quotable: "Garbage In and Garbage Out"

While contemplating my next blog topic about databases, I decided to Google privacy issues and databases. My search returned quite a few interesting articles, as expected. The article I found most interesting was about the very controversial use of massive databases for law enforcement applications. Everyone except these law enforcement agencies seems to be concerned with the potential problems that could stem from databases that are supposed to maintain data about criminals and facilitate efficient information sharing among all of our nation's law enforcement systems. There is much debate about the potential for abuse among members of these agencies, but more importantly about the possibility that information entered into a database about individuals may not be verified as accurate and may do more harm than its proposed good. Many questions surround this issue as it relates back to our individual privacy.

Who will police the collecting, generating, storing, and dissemination of information? What safeguards, if any, exist to protect the innocent who may become victims of human errors with respect to managing such personal data? If there are any monitoring and controlling measures in place, how well are they truly being implemented? Are we really any safer if we give up our right to privacy in exchange for an even greater unknown? It is that unknown that we put into the hands of the agencies and systems that are supposed to protect us, yet continue to fall short daily. Will we feel any greater sense of security in our daily lives if we have these massive communicative efforts purported on our behalf when we know that life holds no guarantees?

Each day presents uncertainty for us all and despite the efforts to safeguard lives, we will not be able to prevent every instance of a crime. We may come close, but at what price? Is giving up our privacy and freedoms comparable to feeling secure in our environment? Can we truly feel secure in a world where we must give up what is ours despite where we may fit into society?

When I contemplate the possibilities that exist at both ends of the spectrum, I am always left with more uncertainty and questions. Initially, I read this article and the idea of one department of justice "OneDOJ (Eggen 1)" facilitated by massive information sharing seemed like a great idea. It spoke of unity of all the systems created to protect us and almost signaled an end to the bureaucratic divisions that too often have failed us due to insufficient communication and coordination. If I closed my eyes and dared to dream, then I could see it like the Justice League with Captain America and his Super Friends working together for the greater good. If only it were so simple, then the controversy may not exist. Unfortunately, life is only comical sometimes and quite different from the comic books.

Law enforcement and all of its systems are not superheroes despite their ability to fight crime. Generally, the comic book characters who are the objects of heroism do not feel victimized by those who are trying to protect them. However, the same cannot be said for innocent people victimized by potential errors in database systems that may lead to false accusations and arrests. As this article states, some information stored in databases is grossly inaccurate, so this broad-based information sharing is like putting trash into the system: data pollution, to say the least. We may forget about One Department of Justice being akin to Captain America and the Justice League. We must instead scrap the comic and go back to the drawing board, for as long as we remain human there will be errors in these database systems. Thus, in this case it is basically "garbage in and garbage out" (Eggen 1).

"Justice Dept. Database Stirs Privacy Fears Size and Scope of the Interagency Investigative Tool Worry Civil Libertarians" (By Dan Eggen Washington Post Staff Writer December 26, 2006; A07)

Wednesday, February 20, 2008

IMDb.com: The Internet Movie Database (Data Validity and Transaction Management)

As I was searching for more database articles to blog about, Google returned results for a professional online database for people in the entertainment industry. This seems like a great site for entertainment professionals. The database must be almost infinite in size, as everyone can contribute data, similar to the wiki technology used on Wikipedia. I explored the site a little bit, trying to find out more about it in the Home section, but could not find anything useful. So I decided to investigate the frequently asked questions section, digging for more information about IMDb. I did not find much about the company there either. I did find, however, a question about the source of the data in the database and the nature of its accuracy or reliability. This question made me think about the "garbage in and garbage out" quote from a previous blog and then about transaction management.

I thought about the previous blog because the FAQ section of this site, like many websites today, issued a disclaimer regarding the source and validity of the information it presents. There was an honest statement about how the information being shared might not be accurate because the sources vary, coming from within the industry or otherwise from the public. Therefore, the information being viewed by various users at some point in time might be inaccurate due to updates to the database. Now, I know this does not necessarily mean that there is no lock on the database tables that store the information, but it made me think about transaction management.

I thought about how today I might access some data about a movie or business contact in the industry (given that I may be in that league of site members) that might be incorrect, but that might later be updated by the time I accessed it again. Now, I just want to make sure that I have the concept of transaction management correct. Transaction management applies to users with rights to access, read, update, or delete objects in a database. If I access this website just searching for information, then I am only able to read the information as it is presented, regardless of its accuracy. However, if I can update information about a business contact that is being shared with other users in the database, then I can only do this when other users are not reading or making changes to the database also. I am assuming that this process applies when we are accessing the same tables, e.g. those objects related to the information that I am trying to change. It seems like common sense to me in some instances, but in others I am a little confused. At any rate, I am still learning about transaction management, so I will go back to the validity of data.
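One way to picture this is a minimal sketch using Python's built-in sqlite3 module. The contact table, the phone numbers, and the function name are all made up for illustration and say nothing about how the site itself stores its data. In this sketch, at least, a reader is not locked out while a change is pending; the reader simply keeps seeing the last committed value until the writer commits. Other database systems may handle this differently.

```python
import os
import sqlite3
import tempfile


def demo_transaction():
    """Return the phone number a reader sees before and after a writer commits."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "contacts.db")  # hypothetical database file
        writer = sqlite3.connect(path)  # the user who updates a contact
        reader = sqlite3.connect(path)  # a user who is only searching
        try:
            # Set up one made-up business contact and commit it.
            writer.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, phone TEXT)")
            writer.execute("INSERT INTO contact VALUES (1, '555-0100')")
            writer.commit()

            query = "SELECT phone FROM contact WHERE id = 1"

            # The writer changes the contact, but the transaction is still open.
            writer.execute("UPDATE contact SET phone = '555-0199' WHERE id = 1")
            before = reader.execute(query).fetchall()[0][0]

            # Only after the commit does the reader see the new value.
            writer.commit()
            after = reader.execute(query).fetchall()[0][0]
            return before, after
        finally:
            writer.close()
            reader.close()
```

Calling demo_transaction() should return the old number first and the new number second, which matches the idea that I could read stale data today and see the corrected data on a later visit.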

If I were networking on this site or researching anything on it, as on any other site, I would have to worry about how accurate the data is because of who supplies it. Since this site acknowledges that its database may contain inaccurate or inconsistent data supplied by users, it is not fully reliable; it is like putting some garbage into the system and getting some garbage out of it.

This blog is not meant to be heavily analytical or otherwise educationally valid. I was just exploring some of the database concepts and topics I have been exposed to in class or have considered by way of blog responses to database articles.

Google vs. Scroogle: The Good and Bad of Search Engines

http://boston.com/news/nation/articles/2006/01/21/google_subpoena_roils_the_web/

The link above references another article that sheds light on how our privacy is further being compromised by database technology, by the companies who initiate this compromise, and by the government with its attempts to gain greater access to personal information via the claim of civil protection. As the article's headline, "Google Subpoena Roils The Web," suggests, there are indeed a lot of conflicting emotions regarding whether Google, the mega search engine, should release the contents of its repositories to the government in an effort to fight Internet crime.

The information in Google's repositories can or could identify users via a reverse search of information obtained from online users who utilize search engines. It seems that Google and many other search engine companies collect data from users conducting searches, store it for various business-related research (at least that is their official statement, with no real admission of what they actually do with that data), and maintain this information indefinitely despite the risk of data compromise. These search engine companies create indices to the information that they maintain, like a reference resource in a library, and, as with a printed reference book, the information remains available as long as it is not discarded. The only difference here is that the public does not have access to the very information about them that the search engine companies seem to treat as public. Until I read this article, I was not fully aware of just how much information, or the exact nature of the information, was being collected and stored by companies who gather data from Internet users. Now I am aware, and unfortunately so is the government, not that this was an entity left in the dark, because it is basically becoming more like George Orwell's Big Brother. If Big Brother is becoming more of a reality, then all I can say is Poor Us and Poor Them.

Poor Us: all of us Internet users duped by the convenience and useful resources that these mega databank companies provide. And now it's Poor Them, the search engine companies, whose chickens have come home to roost as the government seeks to benefit from their profiteering efforts. Now, they are being sued to release the information that they perhaps never should have collected in the first place or, at the very least, should never have stored. They have no definite plans for these obvious data warehouses, so instead of discarding the data they continue to stockpile it despite the potential for misuse in the hands of malicious individuals or, in another worst-case scenario, possibly our government in its quest to help.

A lot of these companies have begun to cooperate with the government, vowing to supply only necessary information while protecting users' identities. However, Google is putting up a fight, supposedly to protect the users, yet this seems a little contradictory because there is no mention of simply purging the data. I say that if Google is really interested in the rights of the Internet users to whom it provides both a good service and a disservice, then it can choose to end this two-sided arrangement and simply do good by its users: delete those repositories of user information. If Google does nothing more than put up a good fight with the government over its right to keep to itself the data collected via users' ignorance of its unauthorized data gathering, then Google and companies like it might as well join the government in the emerging role of Big Brother.

Google and companies alike, that execute these sneaky activities without fully disclosing them to the users of their services represent what is bad about search engines. It is one of the most popular search engines around and has become a catch phrase in pop culture, but I wonder how popular Google would be if ALL users knew that their personal information was being collected. I don't believe most users are aware regardless of what is on the news, Internet, or in the papers. It is big news to me and I obtain information from all mediums. Perhaps, I have been too comfortable with the information that I could get from the resource to notice what information that resource could and possibly would get from me. I am sure that is likely the case for most users. Yet, I wonder what they would do differently if armed with the knowledge of all the companies like Google to whom we must be easy prey.

Well, I know what I would do: go back to the basics for research, use public computers when I can, or search for online services run by public interest groups (privacy activists) who believe in our continued right to privacy despite the advances of technology. One such group mentioned in this article runs Scroogle.org, which represents the good, and what could be a good quality, of all search engines. It is a "public interest group, Public Information Research Inc. of San Antonio," which "runs scroogle.org, an Internet service that disguises the Internet address of searchers who want to run Google and Yahoo searches anonymously" (Hiawatha Bray, Globe, 2006, page 2).

"...Internet users concerned about privacy should do their Internet searches through Scroogle or other Internet ''proxies" that hide the address of the searcher (Osphalt, Bray 2). " Also, Google and other search companies should regularly erase their database of saved searches (Osphalt, Bray 2). "Perhaps they should consider whether it's worthwhile to keep all this information indefinitely. (Osphalt, Bray 2)."

After reading this, I visited the site scroogle.org and added it to my list of search engines on Internet Explorer. Now, I would like to believe that if I use this site to conduct my searches online that it will honestly perform the duty that it advertises. I will keep my fingers crossed because in today's society everyone has something to lose and something to gain even the "do good" groups. It is that tradeoff that got us all in this mess in the first place as we gained the information we searched for and lost our privacy in the process. However, I remain hopeful that there are still pure, privacy advocates out there who lend legitimate services such as scroogle.org to all of us still concerned about the loss of our privacy.

Could "End User Buy-In And Support For Accurate Data" Resolve The "Garbage In, Garbage Out" Issues of Databases?

In a previous blog, I referenced a database privacy quotable, "Garbage In, Garbage Out," to address not only the concerns for data privacy, but also the accuracy of that data. Database accuracy is such a big concern because information about every aspect of our lives continues to be recorded and stored in seemingly infinite data warehouses, accessible to individuals who have the power to grant or deny us privileges. It is not enough that our ability to make decisions is no longer an autonomous process; that which should be our sole right is now dictated by powerhouses of information that could be dangerously inaccurate. We see the effects of bad data maintained in businesses when individuals experience identity theft, when a keeper of the data stored about us makes a clerical error, or when a company never validates the integrity or quality of the data it stores.

I recall reading an excerpt of Database Nation (Simson Garfinkel) in which he discusses how outdated the data is at a lot of the companies who continuously collect and store data about individuals. This old or bad data may be rarely updated or never validated in a lot of companies whose sole function is to share information for the purpose of conducting business. We know this to be true because of the credit reporting, billing, and mailing errors that many of us have to live with daily. How many times have you received someone else's mail? And how many times have you received calls at your residence for the same wrong person at the wrong phone number, even after multiple years have passed? Someone or some company simply is not properly maintaining the data being stored. This failure to validate and update data is not only annoying, but can be detrimental if this bad data is used to make critical business decisions.

We recognize this problem obviously, but what can we do about it? How do we begin to tackle this problem when it seems like an infinite one that will take infinite lifetimes to resolve? When companies maintain bad data while simultaneously collecting and storing new data, how can they reasonably expect to achieve data accuracy and consistency? It is possible we may never solve this problem, but if we choose to tackle the problem then we might minimize the number of errors and their impacts. There are companies who have taken steps toward approaching a solution.

In the article that is referenced below, "The Secret to Successful Business Intelligence: A Top Notch Data Warehouse," Rensselaer Polytechnic Institute seeks to gain end-user endorsement and support for achieving accurate data. An essential first step for the institution was to review how data was being defined, stored, and used by various entities within the institution. It appeared that data was being managed with no real guidelines in place. There was chaos in a sense because no "data definitions" were established (Daniel 1). Different departments and business functions used their own definitions and methods for looking at data (Daniel 1). This presented a lot of problems for Rensselaer, as the following excerpt reveals.

"...Finally, the admissions staff needed more timely demographic information about its applicants to inform student selection decisions.

Getting a handle on the data has been critical because higher education today is a tough arena. Government funding is down, requests for financial aid are up and admitting a diverse student body—in terms of gender, geography, ethnicity and academic achievement—has become more challenging. All these factors make balancing the supply of enrollment acceptances and financial aid with the demand from student applicants more challenging than in the past. The better Rensselaer could optimize its administrative resources and time, the more revenue it would have for courses and scholarships to attract the best and the brightest.

The answer was a business intelligence and enterprise data warehouse implementation (Daniel 1)."

Ultimately, Rensselaer decided to "create enterprisewide processes for collecting and using data (Daniel 1)," which included communication, training, and support for end users (Daniel 1). They implemented this new process via the following focused steps, as listed in the article:

1. Create cross-functional support.
2. Think big, start small, deliver quickly.
3. Create one version of data truth.
4. Provide support for new behaviors.

I like Rensselaer's approach to solving an ongoing business intelligence problem. It showed some maturity (e.g. as it applies to Capability Maturity Model® Integration) on the part of the institution to go from chaos to at least recognizing the problem and trying to find a valid solution. Also, I believe that their new approach of putting more value on the keepers of data by providing "broad user support (Daniel 2)" could serve as a best-practices methodology for other companies seeking to improve the quality of the information they store. Rensselaer's redirection serves as a good best practice because "enterprise data warehouse and business intelligence projects' success depends on broad user support and because consequential business decisions are made on the faith that information is accurate (Daniel 2)."

If every data warehouse infrastructure applies this ideal of continuous process improvement (CPI) or total quality management (TQM) of data, then we may get closer to weeding out the garbage that could potentially go into databases, thus minimizing the garbage that comes out of them as well. It looks like it did well for Rensselaer in terms of ROI and "optimized expenses (Daniel 4)." See the section titled "An A+ for Rensselaer's Business Intelligence" on page 4 of the article.
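As a rough illustration of weeding out garbage before it goes in, here is a minimal sketch using Python's built-in sqlite3 module. The applicant table, its columns, and the GPA range are hypothetical and are not taken from the article or from Rensselaer's actual system; they only stand in for an agreed "data definition" that every department shares. Rows that break the definition are refused by the database and set aside for someone to correct.

```python
import sqlite3


def load_applicants(rows):
    """Insert rows into a hypothetical applicant table; return (stored ids, rejected rows)."""
    conn = sqlite3.connect(":memory:")
    # One shared definition of a valid applicant record (assumed rules).
    conn.execute(
        """
        CREATE TABLE applicant (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL CHECK (length(trim(name)) > 0),
            gpa  REAL NOT NULL CHECK (gpa BETWEEN 0.0 AND 4.0)
        )
        """
    )
    rejected = []
    for row in rows:
        try:
            # "with conn" commits on success and rolls back on an error.
            with conn:
                conn.execute(
                    "INSERT INTO applicant (id, name, gpa) VALUES (?, ?, ?)", row
                )
        except sqlite3.IntegrityError:
            # The row violated a constraint, so it never reaches the table.
            rejected.append(row)
    stored = [r[0] for r in conn.execute("SELECT id FROM applicant ORDER BY id")]
    conn.close()
    return stored, rejected
```

For example, load_applicants([(1, "Ada", 3.8), (2, "", 3.0), (3, "Bob", 7.5)]) keeps only the first row and hands the other two back as rejects. Constraints like these only catch data that is obviously malformed; a wrong but plausible value still needs the end-user support that the article describes.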

Referencing Article:
http://www.cio.com/article/151601/The_Secret_to_Successful_Business_Intelligence_A_Top_Notch_Data_Warehouse
"Outdated information and disagreement over data definitions was impeding Rensselaer Polytechnic Institute's progress. To the rescue: a business intelligence plan that emphasized end user buy-in and support for accurate data"

Privacy Issues Surrounding Databases

This topic is one of an infinite nature once you begin to explore all the instances of privacy piracy. I got more information about how our privacy gets compromised by databases than I could ever have imagined. When I discover the ways that databases or database technologies can be used, and are being used, to capture information about us as we go about our daily lives, I am in awe that more is not being done to regulate these acts.

When I speak of these acts, I am referring to how easily we get tricked into providing information about ourselves to individuals who use it for their own internal purposes or sell or share it with third parties without our consent. As I read more excerpts of Simson Garfinkel's Database Nation, I could not believe that I had not been more guarded with my information.

There are certain instances where I questioned the collection of my information e.g. in hospitals, doctor's offices, and other cases where the collection of my personal data might seem more pertinent to my well-being. Yet, I discovered the gross abuse of my personal information even in places where I least expected infringement upon my privacy (e.g. medical records). I was not aware that when I signed consent to release my medical information to what I believed were eligible third parties like the providers of my medical insurance, that other third parties for which I never knowingly would have given consent also have permission to access my medical data. Who knew that the keepers of medical records could sell that information to insurance companies, current and potential employers, and anyone else who could use that information to make critical decisions about individuals.

It is like we do not even own our information any more once we give consent, whether in awareness or ignorance, for third parties to collect our personal data. When I think of how I use my Visa check cards instead of cash and sign up for discount programs in grocery stores in the quest to be frugal, I feel silly now each time I go to my mailbox and am inundated with junk mail or solicitations. Then, I think of all my efforts to opt out of marketing promotions and credit card offers and see them as wasted when there is no way to eliminate each group of individuals responsible for the intricate web of advertising. Who has time to call each company that sells their information to ask it to stop doing so or to remove their names from the proverbial list? I even get ads faxed to me now, which is such a gross waste of paper. It angers me to think that not only am I paying the price for someone else's careless mishandling of my information, but I also incur costs in terms of paper and other printer supplies, not to mention the infinite cost of privacy lost.

The privacy issues surrounding databases are so infinite, and so coupled with the ever-increasing advances in technologies that assist the neglectful transmission of our personal information, that we may never see an end to this problem. At best, we can only expect that technology will exacerbate the problem, allowing it to get worse before it gets better. Ultimately, deciding when things get better is not up to the keepers, seekers, and senders of our information, but up to us, the victims of database privacy abuses.

References:
Database Nation: The Death of Privacy in the 21st Century by Simson Garfinkel
ISBN 0-596-00105-3

Monday, February 11, 2008

Exploiting Children Via Databases (Database Nation Discussion)

In Chapter 7 "Buy Now!: Selling It To Our Youngest Consumers", Simson Garfinkel talks about this exploitative practice of collecting data from children while they use the Internet to be stored in databases for marketing purposes. I was unaware for some odd reason that sites that I once deemed child-friendly because of the age-appropriate content, were equally as harmful to my children. I now know why I have received all types of marketing offers in the mail for various children's magazines, toys, and other products for which I did not personally seek.

It is not enough that we have to deal with all of the commercials marketing to kids and with other child predators. Now, we have to start screening the sites that our children frequent on the Internet that appear harmless. Who preys on children's ignorance of data gathering methods anyway? Are companies so concerned with increased profits that they will stop at nothing to entice a child to provide personal information about themselves and sometimes other members of their families?
I wonder if they realize that in their attempts to gain information for marketing campaigns geared toward children they might be putting children in harm's way. There is no way to guarantee that some malicious individuals are not intercepting the same personal data that they see as harmless.

Where is the social responsibility in all of these ploys to collect information? Who will protect the information that they are gathering about children? I do not want to even think about what type of malicious individuals could be working for the companies who collect address information from children via mandatory product or site registrations. I think that we really need tougher regulations in place to sanction any company who attempts to collect information from children without their parents' consent despite disclaimers and acceptance agreements. There needs to be some way to protect our kids from further exploitation.

I read that some regulatory efforts have been made toward limiting how data can be gathered, but not toward prohibiting or completely outlawing data collection from minors. We as parents will simply have to continue to regulate this practice ourselves. Such a task will require continuous monitoring and trying to control what information is allowed into our homes and what is allowed to go out of them. A Database Nation is definitely where we reside today, but you would think they would take it a little easier on the parents, the decision and purchase makers. It's like our jobs as parents are not difficult enough; now we must learn to navigate around this intricate mess.

Understanding The Normalization of Database Tables

The first time I was introduced to this concept of normalization was in my Information Systems Design and Analysis class (IT 361), and honestly it never made sense to me. Perhaps it was due to the fact that we were not focusing on this material in the analysis class as much as we do now. Now that I look back and have reviewed more of the material in the textbook Database Systems: Design, Implementation, and Management by Rob and Coronel, 7th edition, I see how they are related. I cannot explain it quite as eloquently as the text does, but I know the relationship that exists has to do with the System Development Life Cycle (SDLC) and the Database Life Cycle (DBLC). If I recall this correctly, each of these cycles or processes has a framework or methodology (best practices) for being executed. They are connected in that each is a reflection of the other. The Rob and Coronel text puts it more eloquently in Chapter 9, Database Design, "that successful database design must reflect the information system of which the database is a part. Successful information systems are developed within a framework known as the SDLC. That within the information system, the most successful databases are subject to frequent evaluation and revision within a framework known as the DBLC (Coronel 359)." This text talks about how the database is part of the overall picture in an information system (Coronel 359), and that seems logical to me only now.

It was not that I simply could not make the connection of Information System to Database System, because obviously the common thread between them is information. However, it was not until I began to learn database design that things became clearer to me. I began to see them as partners in a marriage in the sense that one is not more important than the other; they must coexist in a way that the needs of each are met; and to be successful they cannot be rushed into or occur at random but must be carefully planned so that all needs or requirements are met. Otherwise, like a marriage poorly put together, the two will fall apart or fail. Thus, I realize the importance of the sequence of the systems analysis class and the database design class and why it was important to introduce some database concepts in the analysis class.

I recall my instructor explaining primary keys, candidate keys, composite keys, and first, second, and third normal form, and feeling overwhelmed because I simply could not figure out why these concepts mattered to systems analysis and design. Now, I was taking the class via delayed tape and was always playing catch-up with the material; perhaps that is the reason why I never quite understood it. At any rate, I remember my instructor saying that we would see this again later in some class. Honestly, I'm sure he likely said database design, but I totally forgot about it.

So, now the concepts of database design have been introduced to me, and we finally get to the normalization of database tables, and I begin to panic. I'm thinking uh-oh, here goes that difficult stuff again that never quite made sense to me. However, I watch the DVD for IT 450 and recall the ERD concepts, and it does not seem so bad. After watching the class lecture, I then decide to tackle the chapter on normalization to see if the book reads easier. I found myself rereading and dozing off a little bit because of the wording, but then I started to pay attention to the diagrams. It seems simple enough, at least going from first normal form (1NF) to second normal form (2NF). This simplicity I discovered by finally understanding what partial dependency ("dependency based only on part of a composite key," Coronel 154), transitive dependency (dependency of one attribute that is not part of a key upon another attribute that is not part of a key, Coronel 154), and desirable dependencies actually mean to the database design.
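As a rough illustration of a partial dependency, here is a minimal sketch in plain Python rather than a real database. The enrollment rows, grades, and function name are made up for this example and do not come from the textbook. The composite key is (student_id, course_id), but the course title depends on course_id alone, so moving it into its own table is the step from 1NF toward 2NF.

```python
def split_partial_dependency(rows):
    """Split 1NF enrollment rows into two tables without the partial dependency.

    Each input row is (student_id, course_id, course_title, grade).
    The key is (student_id, course_id), but course_title depends on
    course_id alone, which is the partial dependency being removed.
    """
    courses = {}      # course_id -> course_title, stored once per course
    enrollments = []  # (student_id, course_id, grade), keyed by both ids
    for student_id, course_id, course_title, grade in rows:
        courses[course_id] = course_title
        enrollments.append((student_id, course_id, grade))
    return courses, enrollments


# Hypothetical 1NF data: the course title repeats for every enrolled student.
rows_1nf = [
    (1, "IT361", "Information Systems Design and Analysis", "A"),
    (2, "IT361", "Information Systems Design and Analysis", "B"),
    (1, "IT450", "Introduction to Database Concepts", "A"),
]
courses, enrollments = split_partial_dependency(rows_1nf)
```

After the split, each course title is stored once in courses, so renaming a course means changing one entry instead of hunting down every enrollment row that repeats it.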

I understand that the goal of this normalization process is to maintain the integrity of data and to minimize errors that could impact the performance of the database. Yes, this is basically what I should have got from class, but it is still sometimes difficult to apply these concepts. In an effort to understand normalization, I tried to find real-world examples of situations in which database design was poor due to normalization errors, but finding that exact example was like finding a needle in a haystack. However, I did find some blogs written by professional database programmers that helped me to understand the normalization of database tables (http://database-programmer.blogspot.com/2007/12/database-skills-first-normal-form.html). I am not as strong with it as I would like, so I will not try to explain it. This is just a blog acknowledging that I am more comfortable with the normalization of database tables because of external research as well as class lectures and the text. Now, how I will fare come test time depends on how much time I get to practice normalizing tables and the difficulty level of the questions.

My Introduction to Relational Databases

My very first experience with relational databases was between 1997 and 2000. I was enrolled in community college, seeking an Associate's degree in Information Systems Technology. The course was called Introduction to Database Management; however, I do not believe I really learned how to manage databases. My instructor was great and I managed to get through the course just fine, but I did not retain much of what I learned. I believe that was largely because I did not use the skills I learned beyond the course. What I remember most is that the course project required a lot of time and that I ended up buying a computer and the Microsoft Professional Suite to have unlimited access to Microsoft Access 97.

I spent countless hours learning to build a database, creating primary keys, and performing queries for sample projects on either student enrollment (courses) or medical records (patients). Today, I have discovered that those sample projects must be the best examples, because we use them as models again in my Introduction to Database Concepts course, IT 450.

In my IT 450 course, I am learning about ERDs and relational schemas for the first time since they were previewed in my Systems Analysis and Design classes (IT 361 and IT 473). They are critical to database design, yet I never learned about them in my very first database management course. I only recall using a thick textbook called Microsoft Access 97 and building the database according to a long list of requirements. The requirements were not written like the business requirements that I am being exposed to now. They were more along the lines of what would be expected once our database design was analyzed with respect to primary keys, field and index requirements, datatypes, and other things. This did not bother me at the time because I never knew how I would use this course later. However, it intrigues me today that either I was asleep at the wheel or the very basis for database design was not a focus of the course. Honestly, I do not remember any discussion of ERDs and relational schemas.

Today, I am a little disappointed because I feel I would have been more prepared for my current course, especially with all of the logistical issues I'm having with the course delivery. I would feel a lot less pressure while trying to grasp the concepts had we at least discussed this topic back then. Perhaps the idea was just to introduce us succinctly to databases and leave the real groundwork to higher educational institutions, those providing the Bachelor's and Master's degrees. If that is the case, then I can minimize my disappointment and focus on actually learning, and possibly applying in the workforce, what I gain from the Introduction to Database Concepts course today.
