PostgreSQL Databases: Connect To R and Python

PostgreSQL is a commonly used database language for creating and managing large amounts of data effectively.

Here, you will see how to:

1) create a PostgreSQL database using the Linux terminal

2) connect the PostgreSQL database to R using the “RpostgreSQL” library, and to Python using the “psycopg2” library

Creating our PostgreSQL database

In this example, we are going to create a simple database containing a table of dates, cities, and average temperature in degrees (Celsius).

We will name our database weather, and our table cities.

Once we open a Linux terminal, we enter the following:

sudo -u postgres createdb weather
sudo -u postgres psql weather

This creates our weather database. We then enter \password to set a password:

\password

Once prompted, we will enter a password rainbow, or whatever password you prefer 🙂

Enter new password: rainbow
Enter it again: rainbow

We then enter psql to initiate PostgreSQL and \conninfo to test the connection:

stocks=# psql
stocks-# \conninfo
You are connected to database "weather" as user "postgres" via socket in "/var/run/postgresql" at port "5432".

Create Table for Database

linux postgresql

Once we have verified our connection, we can now create the table. As mentioned, we will name our table cities, and include the fields date, city_name, and averagetemp_celsius. We will define each variable as DATE, VARCHAR(50), and DECIMAL(7,2) respectively.

CREATE TABLE cities (
    date    DATE,
    city_name    VARCHAR(50),
    averagetemp_celsius         DECIMAL(7,2)
);

Once that’s done, we can then insert the appropriate values into our table:

INSERT INTO cities VALUES
('02/08/2017', 'New York', 25),
('02/08/2017','Los Angeles',28),
('03/08/2017','London',18),
('03/08/2017','Paris',22),
('03/08/2017','Los Angeles',28),
('04/08/2017','Berlin',18),
('04/08/2017','Tokyo',25),
('05/08/2017','Zurich',24),
('05/08/2017','Shanghai',29);

Connect Python to Database using psycopg2

1. We import our psycopg2 library, and enter our database credentials:

import psycopg2
conn = psycopg2.connect("dbname=weather user=postgres password=rainbow host=localhost")

2. Now, let’s execute two commands. Firstly, we wish to insert an extra row of data into our database from Python directly:

cur = conn.cursor()
cur.execute("INSERT INTO cities (date, city_name,averagetemp_celsius) VALUES (%s, %s, %s)",
...      ("05/08/2017", "Sydney", "16"))

3. Once we have done that, we can then display our data within the Python console:

cur.execute("SELECT date, city_name, averagetemp_celsius from cities")
rows = cur.fetchall()

4. And, we see that our data is displayed in the Python console:

psycopg2

5. In order to make the changes to the database permanent, we now commit our changes to the cities database:

conn.commit()

6. We have committed the necessary changes and can now close out our connection:

cur.close()
conn.close()

Connect R to database using RpostgreSQL

Firstly, we install the RpostgreSQL package:

install.packages("RPostgreSQL")
require("RPostgreSQL")

Then, we define the database password within R:

pw <- {
  "rainbow"
}

We define our driver and set up the connection:

drv <- dbDriver("PostgreSQL")

con <- dbConnect(drv, dbname = "weather",
                 host = "localhost", port = 5432,
                 user = "postgres", password = pw)
rm(pw)

We verify that the “cities” table exists:

dbExistsTable(con, "cities")

Once verified, we read our table into R using dbReadTable:

myTable <- dbReadTable(con,c("cities"))
attach(myTable)

RpostgreSQL

And there’s our table – we’ve successfully connected our PostgreSQL database to R! You might also find the following tutorial helpful in using PostgreSQL with Ubuntu.

dbSendQuery: Editing database directly from R

Now that we've created our database, what if we wish to edit directly from R? For instance, suppose that we have a new row of values that we would like to input. Instead of having to access the PostgreSQL database directly every time, we would like to send a query from R that does this for us.

Let us suppose that we wish to insert the following three values: '06/08/2017', 'Sao Paulo', '16'.

To do this, we create a variable res that uses a dbSendQuery command to send an INSERT INTO query as follows:

res <- dbSendQuery(con, statement=paste("INSERT INTO cities (date, city_name, averagetemp_celsius) VALUES ('06/08/2017', 'Sao Paulo', '16')"));

dbSendQuery

Once we have sent the query, we now see that upon opening the associated data frame in R, our entry for Sao Paulo has been included and our query has been committed!

Conclusion

In this tutorial, you have learned how to:

  1. Create a database and table in PostgreSQL
  2. Connect a Python environment to a database using psycopg2
  3. Commit queries to the database and add additional entries