PostgreSQL is an open source relational database management system that has seen active development for multiple decades. Distributions run on most operating systems including Linux, macOS, and Windows. This lab focuses on pgAdmin, which is a cross-platform graphical user interface for administering PostgreSQL databases.

Expected Duration
1–2 hours
Deadline
None (no explicit submission is required)
Points
None

Learning Objectives

  • Create a relational database using PostgreSQL (pgAdmin)
  • Execute SQL queries using pgAdmin

Help Policy

Authorized Resources
Any
Notes
Never copy another person’s work and submit it as your own.

You must document all help received from all sources, including the instructor and instructor-provided course materials (such as the textbook).

Assignment

Prior to starting this assignment, you must complete the software installation instructions to create a virtual machine (VM) and install PostgreSQL (and pgAdmin) in that VM.

This tutorial introduces many topics that we will cover in detail throughout the rest of the course. You are not expected to be able to write the SQL queries by yourself, but trying to work through them will be good practice for upcoming lessons.

In this lab, you will construct a database for a garage that handles automotive repairs. This lab introduces data definition and data manipulation using the Structured Query Language (SQL).

Configuring pgAdmin

Open pgAdmin. Because pgAdmin has just been installed, a connection to the PostgreSQL server must be established although you might simply be requested to provide the password for the local PostgreSQL instance.

Click the “plug” icon (tooltip: “Add a connection to a server.”) on the left of the toolbar. Enter the following information in the dialog box:

Property Value
Name PostgreSQL
Host localhost
Username student
Password CompSci364

Click the “OK” button on the dialog.

Now that you have logged into the PostgreSQL server, the object browser displays contents of the database cluster. pgAdmin allows users to view a hierarchy of databases and their objects in the left frame. In the main window, the top pane displays details about the selected object and any pertinent statistics. When applicable, the SQL pane has a reverse-engineered SQL script that could be used to (re)construct the current object.

Create the Database

Create a new database named garage by expanding the server, right-clicking Databases, and then clicking New Database…. Enter “garage” in the Database field and click the OK button.

In the object browser (left side bar), you should see the garage database listed. If not, right-click Databases in the object browser and then click the Refresh… button.

Tables

The tables required are described below along with each table’s fields and keys. Detailed instructions are included for constructing the first table; you are expected to repeat the steps for the others.

Expand database, schema, and public nodes for the garage database in the object browser (left side bar).1 Right click on Tables and select New Table… to open the table creation wizard. Use the following steps to create the person table:

  1. On the first tab, you will need to name your new table. Type “person” in the Name field.
  2. Move to the tab labeled Columns.
    1. Click the Add button. Create the id column by typing “id” in the Name field. Select “serial” as the data type. On the Definition tab, select the Not NULL check box. Click the OK button.
    2. Click the Add button. Enter “lastname” in the Name field. Select “character varying” as the data type, and enter “25” in the Length field. (This length specifies the maximum number of characters that will be stored – longer values are truncated to fit.) On the Definition tab, select the Not NULL check box. Click the OK button.
    3. Define the firstname field analogously to the lastname field.
    4. Click the Add button. Enter “phone” in the Name field. Select “numeric” as the data type, and enter “10” as the length and “0” as the precision. (The length of a NUMERIC field is the total number of digits that will be stored; precision is the number of digits that follow the decimal point.) On the Definition tab, select the Not NULL check box. Click the OK button.
  3. Move to the Constraints tab. Primary Key should appear in the combo box at the bottom of the dialog box. Click the Add button.

    On the Properties tab, enter “PK_person” as the name of the primary key. On the Columns tab, select the “id” field in the Column combo box. Click the Add button.

    Click the OK button.

  4. Move to the SQL tab, which appears at the end of the list, to view the SQL statement that will actually create the table. Examine the SQL and make sure you understand the CREATE TABLE (...) portion.
  5. Click the OK button to create the table.

Verify that the person table appears in the object browser. (You may have to right click Tables and then click Refresh… to see it listed).

At this point, you may be wondering if all this effort, particularly navigating through the table creation wizard, is worth it. As you’re beginning, you’ll probably find the table creation wizard to be helpful because it eliminates the need to know SQL’s CREATE TABLE syntax. In the long term, though, writing the SQL command “by hand” is much faster as you might have guessed from reviewing it before clicking the “Save” button in the wizard.

Define all the remaining tables that are shown below. Do this in the order they appear so you don’t have trouble setting the foreign key constraints. Instructions to create a foreign key constraint follow the table definitions.

Table Field Data Type Keys and Constraints
person id serial primary key, not NULL
  lastname character varying(25) not NULL
  firstname character varying(25) not NULL
  phone numeric(10) not NULL
customer id serial primary and foreign key, not NULL
employee id serial primary and foreign key, not NULL
  wage numeric(5, 2) default value = 5.15, not NULL
car carid serial primary key, not NULL
  customerid serial foreign key, not NULL
  vin character(17) unique
  make character varying(15) not NULL
  model character varying(50) not NULL
  year numeric(4) not NULL
work workid serial primary key, not NULL
  carid serial foreign key, not NULL
  description text not NULL
  date date not NULL
  cost numeric(7, 2) default = 0.00, not NULL
workrecord employeeid serial primary and foreign key, not NULL
  workid serial primary and foreign key, not NULL

Creating a foreign key is done in the Constraints tab. Use the following steps to define the foreign key for the customer table:

  1. Select the Constraints tab in the table creation wizard.
  2. Select Foreign Key in the combo box that appears at the bottom of the dialog box. Click the Add button.
  3. Click the Columns tab, ensure that “public.person” is selected in the References combo box, “id” is selected in the Local column combo box, and “id” is selected in the Referencing combo box. Click the Add button.
  4. Click the OK button.

This foreign key stipulates that any entry in the customer table is associated with a corresponding entry in the person table. This concept is known as referential integrity – i.e., the customer’s id must refer to (or reference) a valid person’s id.

Queries

Included in pgAdmin is a query tool that can be used to execute arbitrary SQL statements. The top half of the query tool is the entry window where SQL commands are entered. The bottom section is reserved for the query’s output. Because the garage database is currently empty, the tables must be populated. Download garage_data.sql. This file consists of SQL statements that will insert data into each table. Open the Query Tool (click the SQL icon on the toolbar or Tools > Query Tool on the menu bar).

Note: Due to the size of the file, do not copy-paste it into the query window. Doing so might cause pgAdmin to hang.

Click the play button on the toolbar (tooltip: “Execute query”) to run the queries.

The Query Tool retains a history of all the queries which have been executed in the Query History tab in the Output pane (bottom of window). Along with the individual SQL statements, it keeps track of the execute time and results (number of rows retrieved/affected) until clearing the history (Query > Clear history on the menu bar). Clearing the history at this point is a good idea since so many individual queries have been executed.

Type SELECT * FROM person to view the entire person table. When a query is sent to the PostgreSQL server, the query optimizer generates a query plan which specifies how the SQL statement will be executed. pgAdmin can show the steps involved in running a query in a graphical way. Write a SQL statement to view all the records in the person table, but instead of executing the query, ask PostgreSQL to explain it by clicking the Explain button on the tool bar (or Query > Explain in the menu bar). The Explain tab is shown with a diagram of the person table. Clicking on the person table should bring up a tool tip reading something like

Seq Scan on person (cost=0.00..87.00 rows=5000 width=25)

The first piece of text indicates the operation, and the second indicates the table. Next come four numbers. The numbers contain the following information:

  • Estimated startup cost (how long before the first row can be returned)
  • Estimated total cost (if all rows are retrieved)
  • Estimated number of rows this step will return
  • Estimated average widths (in bytes) of output rows

Running several more complicated queries will demonstrate the complexities of the query plan. Three queries are listed below; following each query description is an explanation of the SQL required to perform it. If you are comfortable with the SQL language, try not to look at the solutions unless you are stuck. Pay most attention to the final query which takes advantage of a subquery expression by building upon the previous two queries.

  1. Obtain a listing of the first and last names of all customers. The results should be sorted by last name in ascending order.

  2. Obtain the id number and number of vehicles for all customers who have more than three cars.

  3. Obtain the first and last names and the total amount of money spent by individuals who own three or more vehicles and have spent more than fifteen hundred dollars at the garage. Sort in descending order by the total amount of money spent.

Indexes

Adding additional table indexes can improve query performance. Obtain the listing of first and last names in the person table where the last names are in ascending order (do not order first names as well). Examine the steps in the query plan. You should see a sequential scan followed by a sort. Look at the time required to do both operations. Now, index the lastname field of the person table using the following syntax:

CREATE INDEX index_name ON table (field [, ...]);

Run the first and last name query again. Pay special attention to the changes in the query plan. No longer is there a scan followed by a sort; the index allows the individual steps to be combined into one which is faster than before. Indexing fields that are used in join operations can significantly improve performance. Because queries in the garage database use primary keys for join operations, additional indexes are unneeded—PostgreSQL automatically defines a unique index when a primary key is specified.


  1. PostgreSQL organizes a server as a hierarchy: database > schema > object. (A table is one type of object.) A schema is loosely analogous to a package or module in most programming languages: schemas are used to group related objects, and objects in different schemas can share the same name. Schemas can also simplify access control by granting roles different permissions to different schemas. In practice, we will use the public schema, which exists by default in a PostgreSQL database.