MySQLdb: Connect Python and mySQL Databases Together

psycopg2

In a previous tutorial, we set up a financial database using a range of mySQL queries, and used such queries to create separate tables and discriminate among data in those tables. However, there are many occasions when a user needs to connect to a mySQL database through an external program. This is particularly the case with Python, which integrates quite well with mySQL through the MySQLdb library. If you are not familiar with the workings of mySQL, then I strongly recommend reading the previous tutorial, which provides a guide for the commands being used here.

Steps to create mySQL database and run queries

The below tutorial illustrates how to establish a connection to the mySQL database through Python, and run the appropriate queries from within the Python environment to manipulate the database directly.

1. CHECK DATABASE CONNECTION

I always recommend testing the connection to the database directly before engaging with Python. Often times, the Python code will not run, leading the user to believe there is a problem with the script, when in fact the database is being blocked through a firewall or some other issue.

We can run the telnet command through a terminal (e.g. Linux) as follows:

telnet IPAddress Port

As an example, if we were to try to connect to the google.com port, we can input as follows:

telnet google.com 80
Trying 216.58.198.174...
Connected to google.com.

 
2. CONNECT TO DATABASE USING MYSQLDB LIBRARY

import MySQLdb
#Enter database connection values
dsn_database = "your_databasename"
dsn_hostname = "localhost"
dsn_port = 3306
dsn_uid = "your_username"
dsn_pwd = "your_password"
conn = MySQLdb.connect(host=dsn_hostname, port=dsn_port, user=dsn_uid, passwd=dsn_pwd, db=dsn_database)

Note that depending on whether you are running the database locally or on a cloud, then you may either have to input “localhost”, or the IP Address where your database is located under dsn_hostname, respectively.
 
3. CREATE TABLE

Using the conn.query() command, we now use our prior SQL commands in order to create the stock_selections table. Again, we can choose to drop/delete the table if it already exists in the database, and then assign new variables to our table:

conn.query("""DROP TABLE IF EXISTS stock_selections""")
conn.query("""CREATE TABLE stock_selections
(
ID varchar(255),
YearlyReturn decimal(18,5),
DividendYield decimal(18,5),
DebtEquity decimal(18,5),
PERatio decimal(18,5),
PricetoFCFRatio decimal(18,5),
YearlyEPSChange decimal(18,5),
NetIncome decimal(18,5),
ShareholderEquity decimal(18,5),
Sales decimal(18,5),
Inventory decimal(18,5))""")

 
4. INSERT DATA INTO TABLE

Again, we use the conn.query command, we then populate the newly created table with the appropriate data:

conn.query("""INSERT INTO stock_selections VALUES(‘3SOWO’, ‘8.11’, ‘0.7’, ‘0.46’, ‘14.8’, ‘7.39’, ‘-9.58’, ‘718805730’, ‘3146789135’, ‘398935732’, ‘152591876’), (‘AFASM’, ‘14.2’, ‘0.71’, ‘1.29’, ‘5.94’, ‘7.4’, ‘29.29’, ‘493395901’, ‘3171592425’, ‘871819920’, ‘495389897’)""")

 
5. QUERY DATA

Upon populating the table with the relevant data, we then use the cursor.execute command to hold and execute our query to filter out our data. Note that the many other queries executed in the previous tutorial can also be executed under this command:

cursor=conn.cursor()
cursor.execute("""SELECT ID, YearlyReturn, DividendYield, NetIncome / ShareholderEquity ROE, Sales / Inventory Inventory_Turnover, YearlyEPSChange
FROM stock_selections
WHERE (DividendYield >= 3 ) AND YearlyEPSChange >= 30
ORDER BY ID""")
cursor.fetchone()

 
6. PRINT OUT DATA

Once this is done, we can then print out the data generated by Python:

print "\nStock Data:\n"
rows = cursor.fetchall()
import pprint
pprint.pprint(rows)

 
7. CLOSE CONNECTION

We then close the connection using the conn.close() command.
 
Using Python in conjunction with mySQL is very useful when it comes to importing data and then using it for a more advanced level of statistical analysis than would be possible using SQL queries alone. This type of integration is particularly efficient across virtual machines such as Microsoft Azure, where the Azure SQL database and Python environment are stored on the same cloud and can interact with greater efficiency. All in all, to get the most out of SQL databases, being able to integrate the same with Python opens up a level of analysis which would not be possible through using mySQL alone.

Author: Michael Grogan

Michael Grogan is a machine learning consultant and educator, with a profound passion for statistics and data science.