A database is a collection of data that is organised in a way that makes it easily accessed and maintained. We normally store data using a relational database. What this means is that we’re going to store the data in more than one table and these tables will be linked together by certain fields.
To better understand databases there are some key terms that are worth learning.
This lesson we’ll learn about:
- Types of relational database management systems (RDBMS) and their characteristics.
- Relational data structures
- Relation algebra sets
1. Types of Relational Database
When talking about types of databases, we often mean between flat-file databases and relational databases. Flat-file databases are where all data is stored in a single table, whereas relational databases are where the data is split across several tables.
However, there are also different types of relational databases. We will normally classify them as:
This is a database management system that is designed to store data locally, to be accessed on an individual personal computer system. An example of this is Microsoft Access.
These systems are often specifically targeted to be simple to use, often not requiring any understanding of the SQL that underlines relational databases. Instead they use an easy-to-use GUI that make it easier to navigate and perform actions with.
This is a database management system that is designed to be stored on a server that can be accessed over a network by a large number of users on client devices.
This is like a websites database. Most websites store the usernames & passwords, as well as many other things, in a database kept on the website server. Then all the hundreds, or even thousands, of users can submit data to the database, and retrieve data from it.
These systems often require far greater technical knowledge, requiring users to perform operations using SQL in a command-line interface, rather than through a GUI. Proprietary server RDBMS software, such as Oracle, is often very expensive.
However there are a number of free open source options, such as MySQL, that are very commonly used, particularly on the web.
Why might we prefer to use a desktop database instead of a server database? Research the answer to this online.
2. Relational Database Structures
To understand the structure of a database we need to know a few important terms. These are:
- Relations – Most of us know what a table is from using spreadsheet software. It is a grid made up of rows and columns. Well in relational databases we often refer to these as “relations”.
- Tuples – A row in a table is known as a record, or a tuple. A tuple holds all the data on a single item. For example, if we have a table to store book details for a shop, then each tuple is an individual book the shop sells, and it will store all the data on that individual book such as the ISBN, book title and price.
- Attributes – An individual piece of data in a record is known as a field, or attribute. In the previous example we said that each record for a book will store the ISBN, book title and price. Well these three things are all examples of attributes. Attributes appear as columns in a database table.
- Domains – All the possible allowable values for an attribute. This is slightly different to the data type of the attribute. For example, a field may have an integer number data type, which defines that it can only allow whole numbers to be entered. However, there may be additional rules applied, such as that the number must be between 1 & 10. The domain would therefore be this range of whole numbers.
- Cardinality – How unique an attribute is in terms of its data values. Some attributes will have a wide range of different data values entered. For example, the primary key field will have a completely unique value for every record. Where there is a large percentage of unique values, this is known as “High Cardinality”. Where there are a lot of repeated values across the entities tuples, this is known as having “Low Cardinality”.
Terms like “records” and “tuples”, or “attributes” and “fields” are often used quite interchangeably. Make sure you are comfortable with using these terms in each of their different forms.
3. Relational Algebra Sets
In order to manipulate and retrieve data from a database we use Structured Query Language. The basis of SQL, and other query languages, is defined by a mathematical query language called relational algebra.
Relational algebra uses a number of operations to allow us to work with our database. These include:
- Select – Retrieves a subset of tuples from a relation. This subset is made up of those tuples that meet certain criteria. We represent the select operation with the symbol σ.
- Union – Retrieves all the tuples from two relations combined together. This requires both relations to have the same number of fields, with matching data types. Any duplicate tuples following the union are removed. We represent a union operation with the symbol ∪.
- Intersect – Similar to a Union, but only retrieves the tuples from two relations that appear in both relations. We represent an intersect operation with the symbol ∩.
- Join – Combines the attributes from two relations. So, the columns from the first table will be shown alongside the columns in the second table. This requires the two relations to have one attribute in common. We represent a join operation with the symbol ⋈.
There are actually a number of additional relational algebra operations, such as “Project”. Research these online.
- A relational database stores data across multiple tables that are linked together through related fields.
- Desktop databases are designed to store data locally, to be accessed on an individual personal computer system.
- Server databases are designed to be stored on a server that can be accessed over a network by a large number of users on client devices.
- A relation is a table in a database.
- A tuple is a row in a database table. This is also known as a record.
- An attribute is an individual data item in a record. This is also known as a field.
- The domain is all the possible allowable values for an attribute.
- The cardinality is how unique an attribute is in terms of its data values.
- A select operation retrieves a subset of tuples from a relation.
- A union operation retrieves all the tuples from two relations combined together.
- A intersect operation retrieves the tuples from two relations that appear in both relations.
- A join operation combines the attributes from two relations.