Chapter 4 - Storing Data in Databases


Why do we need databases when our data has already been stored on disk?

Before the hierarchical file system was invented, users of a mainframe had only one folder to work with. Pause for a minute and think of this in terms of your current devices. Imagine all the files on your computer could only be stored in one folder! You already find it difficult to remember all the files on your computer. Now imagine that every time you wanted to find a file you couldn’t go to ‘Downloads’ or ‘Documents’, but had one big sprawl of file names to scan through. People had to carry notebooks in which they wrote down the names of the files they were working on, because they couldn’t just create a notepad file with notes to themselves and save it to their ‘Desktop’. Worse, to cope with the limited memory on these early systems, files had to be deleted regularly, and since every user had access to every file, one person deleting the wrong file could mean lost data or non-functional programs for the other users of these multi-user systems.

Fortunately, this horror scenario was transformed by the creation of the hierarchical file system. Special folders could be created for system files, like the familiar ‘Program Files’ folder in Windows or the ‘Applications’ folder on a Mac, where new programs are installed and kept separate from the rest of the user’s files. This innovation simplified a lot about how computers could be shared by multiple people, for instance through ‘user folders’ linked to specific users and used to keep one person’s files separate from everyone else’s documents. What a difference a little hierarchy can make! However, as it tends to go with humans, we are never satisfied for long.

People started to wonder whether not only files, but also the data contained within files, could be stored using similar tree-like, hierarchical structures. Why stop at creating a single file for all the information that needs to be computed? In the 1960s, the good folks at IBM had the opportunity to explore this idea in more detail. When North American Aviation (later North American Rockwell) won the bid in 1961 to build the spacecraft for the Apollo program to send a man to the moon and back, they needed a system to manage the large bills of materials associated with the construction of the spacecraft.

If you’ve never seen a bill of materials (BOM), let me describe what it looks like. A bill of materials is a list of all the items you need to construct a certain number of a final product. Each bill of materials has, at the very top, a description of the item being made, followed by a table containing the parts, their cost, their quantity and any other details needed to identify each part uniquely. For instance, the bill of materials for a coffee table might include the type of wood required for the legs, including the dimensions and the number of pieces needed for the table. Hold on a minute, I should probably use the example of a chair so we don’t confuse a physical table with a database table. So, the bill of materials for a chair with metal legs might include the type of metal, the thickness and the length, and the number of pieces if the metal is sold in fixed lengths. It will also include the type of screws and bolts required for the chair, the material for the seat, any special paints required to finish the chair, and so on.

IBM needed to build a system to capture this sort of hierarchical information for the individual parts of the spacecraft. For a spacecraft, you can imagine that some parts have to be built first and then used as components of other parts of the craft, so the system also needed referential links between parts.

This hierarchical, tree-like structure was the foundation of the design of the ‘Information Control System (ICS)’ and of the interface created to enter information into each individual bill of materials, called Data Language/Interface (DL/I, also read as ‘Data Language One’). Because the format of a bill of materials is standard, DL/I gave programs a uniform way to interact with the system, whether that meant adding new information to a specific bill of materials or reading information that was already stored.
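
To make the tree idea concrete, here is a minimal Python sketch of the chair’s bill of materials as a hierarchy of parts. It is only an illustration and not how ICS or DL/I stored data: the `Part` class, its field names and all the prices (in cents) are assumptions invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Part:
    """One node in a bill of materials: a part plus the sub-parts it is built from."""

    name: str
    quantity: int = 1            # how many of this part the parent needs
    unit_cost_cents: int = 0     # purchase price of one unit (0 for assembled parts)
    children: list[Part] = field(default_factory=list)

    def total_cost_cents(self) -> int:
        """Cost of this part, including everything nested beneath it."""
        one_unit = self.unit_cost_cents + sum(c.total_cost_cents() for c in self.children)
        return self.quantity * one_unit


def outline(part: Part, depth: int = 0) -> list[str]:
    """Return the tree as indented lines, one line per part."""
    lines = [f"{'  ' * depth}{part.quantity} x {part.name}"]
    for child in part.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical chair; every number here is made up for illustration.
chair = Part("chair", children=[
    Part("metal leg", quantity=4, unit_cost_cents=350),
    Part("seat", children=[
        Part("plywood board", unit_cost_cents=600),
        Part("screw", quantity=8, unit_cost_cents=10),
    ]),
    Part("can of paint", unit_cost_cents=400),
])

if __name__ == "__main__":
    print("\n".join(outline(chair)))
    print("total cost in cents:", chair.total_cost_cents())
```

Notice that the seat is itself a small bill of materials nested inside the chair’s. That nesting of parts within parts is the hierarchy the text describes.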

ICS introduced the idea that application code should be separate from the data, and that a management layer should ‘watch over’ the data in an application, for instance by providing response codes that report on the status of the data. During DL/I processing, the system reports what happened on each call by issuing a status code, and each code comes with a documented programmer response for fixing the issue. The DL/I status codes are two-character alphanumeric codes, running from AA (destination of command wrongly specified) to Z0 (invalid data found in the input). Famously, the DL/I response when a transaction completed successfully was … ‘nothing!’: a blank code (written ‘bb’ for two blanks), which meant you could proceed to the next step. The seeds of the perennial programming ideas that ‘Success is Silent’ and ‘no news is good news’ can be seen in this technology. IBM successfully delivered this system for NASA, playing a key role in landing a man on the moon. In 1968, the year before Neil Armstrong walked on the moon in July 1969, IBM released ICS and DL/I under the descriptive name IMS (Information Management System). As of 2005, IBM IMS (which retained the same core hierarchical structure) was being used by 95% of Fortune 1000 companies.
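
The ‘success is silent’ convention described above can be sketched in a few lines of Python. This is a toy stand-in rather than the IMS programming interface: the `get_part` function and the in-memory `store` are hypothetical, and the only detail borrowed from DL/I is the shape of the codes (two blanks for success, and GE, the DL/I code for ‘segment not found’).

```python
SUCCESS = "  "      # two blanks: the call worked, nothing to report
NOT_FOUND = "GE"    # DL/I status code for "segment not found"


def get_part(store: dict, name: str) -> tuple:
    """Toy lookup that returns (status_code, record) the way a DL/I call reports back."""
    if name in store:
        return SUCCESS, store[name]
    return NOT_FOUND, None


# Hypothetical in-memory "database" holding one part record.
store = {"metal leg": {"quantity": 4}}

if __name__ == "__main__":
    for wanted in ("metal leg", "armrest"):
        status, record = get_part(store, wanted)
        if status.strip():                      # a non-blank code needs attention
            print(f"{wanted}: problem, status code {status!r}")
        else:                                   # blank code: carry on silently
            print(f"{wanted}: {record}")
```

The calling program only has to react when the code is non-blank, which is why a successful call can stay silent.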

The Relational Model

<aside> 💡

Assembly Language

https://www.youtube.com/watch?v=yOyaJXpAYZQ

</aside>

https://www.informit.com/articles/article.aspx?p=377307&seqNum=2