How to Quickly Plot Data with Python on your Computer

Here’s a quick tutorial on the easiest way to get set up and ready to chart data on your computer.

Recently, I had some data I wanted to examine and plot quickly. My mind jumped to Python as an easy way to explore the data and chart areas of interest.

While it is easy to play with Python in a web kernel (DataCamp, Codecademy, Kaggle), I wanted to be able to chart the data on my own computer. After searching through a couple of tutorials, I realized the information needed to do this is scattered across the internet.

1. Installing Python

The first part of the process is to install Python and the required dependencies on your computer.

Easy Way – Anaconda

Anaconda Front Page

The easiest way to install Python is to install the Anaconda distribution for data science. It works on macOS, Windows, and Linux.

https://www.anaconda.com/distribution/

Screenshot of Anaconda Distribution Page

You can download either the Python 3 or the Python 2 version. I personally recommend installing Python 3.

Once the download is completed, you can launch the package installer and complete the installation.

Installing Anaconda onto MacOS X Part 1
Installing Anaconda into Mac OS X Part 2

Hard Way – Homebrew (MacOSX)

MacOSX

1. Check your python version

$ python --version
Python 2.7.3 :: Continuum Analytics, Inc.

2. Install Xcode

To run Homebrew properly on your Mac, you first need to install Xcode (or at least its Command Line Tools) as a dependency.

Option A

$ xcode-select --install

Option B

Go to the App Store and Install Xcode.

3. Install Homebrew

http://homebrew.sh

In your Terminal paste the following command:

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

To verify that the installation worked as expected, type the following command:

$ brew doctor
Your system is ready to brew.

4. Installing Python

To install Python, run the brew command in your Terminal. Homebrew can install many other packages besides Python.

$ brew install python3 

Once you have successfully installed python3, you can check the version to verify that the installation was successful.

$ python3 --version

To start a Python session type in:

$ python3
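
If you took the Homebrew route, keep in mind that it installs only the interpreter; the libraries used later in this tutorial (Pandas, NumPy, Matplotlib, and Seaborn) come bundled with Anaconda but have to be installed separately otherwise, for example with pip inside a virtual environment. A minimal sketch for checking what is missing, where the helper name missing_packages is my own and not part of any library:

```python
import importlib.util

# Packages this tutorial imports later on
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All plotting dependencies are installed.")
```

The check only looks the packages up and does not import them, so it runs quickly even on a bare installation.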

2. Selecting your Editing Experience

For the rest of this tutorial, I’m going to assume you installed the Anaconda Package.

1. Start Anaconda Navigator

There are a couple of options in the Anaconda Navigator. Jupyter Notebook is the most common one; it provides a great framework for doing data analysis and annotating your steps along the way.

For this tutorial, we are going to use Spyder, which provides an IDE-like experience for quickly iterating on our analysis. This is similar to RStudio or MATLAB.

2. Launch Spyder

3. Importing your Data

First, we want to add the dependencies required to read and plot data. We are going to import Pandas, NumPy, Matplotlib, and Seaborn into our Python file.

# Import Required Python Dependencies 
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Next, we are going to import the file you want to analyze. For the purpose of this exercise, we are going to use sample data from Kaggle. Kaggle is a great place where you can download and use different data sets to explore trends or practice your data science skills.

We are going to import this file into Pandas so that we can easily plot the DataFrame using Pandas and Seaborn.

To add a file to a Pandas DataFrame, you can use the pd.read_csv() command. We are also going to clean up our data to make it more readable and cut off parts we don’t want to analyze.
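
A note on the encoding argument used in the snippet below: pd.read_csv() assumes UTF-8 unless told otherwise, so a file saved in a different encoding can fail to load. Here is a small sketch of that behavior, using a made-up one-row file written to a temporary folder (the city and value are invented for illustration and are not taken from the dataset):

```python
import os
import tempfile

import pandas as pd

# Hypothetical CSV content; the accented character is what trips up UTF-8
raw = "city,nkill\nBogotá,2\n".encode("ISO-8859-1")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    with open(path, "wb") as fh:
        fh.write(raw)

    # Default read: pandas tries UTF-8 and rejects the ISO-8859-1 byte
    try:
        pd.read_csv(path)
        utf8_ok = True
    except UnicodeDecodeError:
        utf8_ok = False

    # Same file, read with the encoding it was written in
    sample = pd.read_csv(path, encoding="ISO-8859-1")

print("Read as UTF-8 succeeded:", utf8_ok)
print(sample)
```

If you hit a UnicodeDecodeError on your own file, trying encoding='ISO-8859-1' is a common first step, though the right value depends on how the file was exported.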

# Read in the file 
terror = pd.read_csv('globalterrorismdb_0718dist.csv', encoding='ISO-8859-1', low_memory=False)

# Rename Columns for better readability
terror.rename(columns={'iyear':'Year','imonth':'Month','iday':'Day','country_txt':'Country','region_txt':'Region','attacktype1_txt':'AttackType','target1':'Target','nkill':'Killed','nwound':'Wounded','summary':'Summary','gname':'Group','targtype1_txt':'Target_type','weaptype1_txt':'Weapon_type','motive':'Motive'},inplace=True)

# Select columns we are most interested in analyzing
terror=terror[['Year','Month','Day','Country','Region','city','latitude','longitude','AttackType','Killed','Wounded','Target','Summary','Group','Target_type','Weapon_type','Motive']]

# Calculate the casualties (Both Killed + Wounded)
terror['casualties']=terror['Killed']+terror['Wounded']

print(terror.head(4))

Output:

   Year  Month  Day    ...     Weapon_type Motive casualties
0  1970      7    2    ...         Unknown    NaN        1.0
1  1970      0    0    ...         Unknown    NaN        0.0
2  1970      1    0    ...         Unknown    NaN        1.0
3  1970      1    0    ...      Explosives    NaN        NaN
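
Notice the NaN in the casualties column of the last row: adding two columns with + gives NaN whenever either value is missing. If you would rather count a missing value as zero, one option is Series.add() with fill_value=0. A small sketch on made-up numbers (the column names match the tutorial, the values do not come from the dataset):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Killed": [1.0, 0.0, np.nan, np.nan],
    "Wounded": [0.0, np.nan, 2.0, np.nan],
})

# Plain addition: any missing value makes the sum missing
toy["strict"] = toy["Killed"] + toy["Wounded"]

# fill_value=0 treats a single missing side as 0;
# rows where both sides are missing still come out as NaN
toy["lenient"] = toy["Killed"].add(toy["Wounded"], fill_value=0)

print(toy)
```

Which version is appropriate depends on whether a missing count in your data means "zero" or "unknown", so treat the lenient variant as an assumption about the data rather than a fix.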

Lastly, we are going to print out some descriptive information about the data.
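
For a quick descriptive summary, pandas offers info(), describe(), and isna().sum(). A minimal sketch on a few made-up rows (the values and the single-letter country labels are invented for illustration); on the real data you would call the same methods on the terror DataFrame:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Year": [1970, 1970, 1971, 1972],
    "Killed": [1.0, 0.0, np.nan, 3.0],
    "Country": ["A", "B", "A", "C"],
})

# Column names, dtypes, and non-null counts
toy.info()

# Count, mean, std, min, quartiles, and max for the numeric columns
summary = toy.describe()
print(summary)

# Number of missing values in each column
missing = toy.isna().sum()
print(missing)
```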

4. Charting your Data

Using Seaborn we can plot some areas of interest. For this exercise, I’m going to plot two simple visualizations of the data.

Terrorist Activities Each Year

plt.style.use('fivethirtyeight')

# Using Seaborn we can plot the Terrorist attacks by Year 
plt.subplots(figsize=(15,6))
sns.countplot(x='Year',data=terror,palette='RdYlGn_r',edgecolor=sns.color_palette('dark',7))
plt.xticks(rotation=90)
plt.title('Number Of Terrorist Activities Each Year')
plt.show()
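
A count plot simply counts how many rows fall into each category, so you can get the numbers behind the chart directly from pandas. This is a handy sanity check when a bar looks off. A sketch on a few invented years:

```python
import pandas as pd

toy = pd.DataFrame({"Year": [1970, 1972, 1970, 1971, 1972, 1972]})

# Rows per year, sorted chronologically like the x-axis of the chart
per_year = toy["Year"].value_counts().sort_index()
print(per_year)
```

On the real data, the equivalent would be terror['Year'].value_counts().sort_index().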

Attack Methods by Terrorists

# We can also plot the Attack Methods by Terrorists
plt.subplots(figsize=(15,6))
sns.countplot(x='AttackType',data=terror,palette='inferno',order=terror['AttackType'].value_counts().index)
plt.xticks(rotation=90)
plt.title('Attacking Methods by Terrorists')
plt.show()
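
The order argument in this chart is what sorts the bars from most to least frequent: value_counts() returns the categories sorted by count in descending order, and its index is the list of category names in that order. A small sketch with made-up labels (not the dataset's actual category names):

```python
import pandas as pd

toy = pd.DataFrame({
    "AttackType": [
        "Bombing", "Assassination", "Bombing",
        "Kidnapping", "Bombing", "Assassination",
    ]
})

counts = toy["AttackType"].value_counts()

# Category names, most frequent first; this is what gets passed to order=
order = counts.index.tolist()
print(order)
```

Leaving out order would let Seaborn pick the bar order itself, which is usually harder to read for a ranking like this.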

Full Code Block

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Feb  5 16:55:13 2019

@author: oscarbarillas
"""

# Import Required Python Dependencies 
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Read in the file 
terror = pd.read_csv('globalterrorismdb_0718dist.csv', encoding='ISO-8859-1', low_memory=False)

# Rename Columns for better readability
terror.rename(columns={'iyear':'Year','imonth':'Month','iday':'Day','country_txt':'Country','region_txt':'Region','attacktype1_txt':'AttackType','target1':'Target','nkill':'Killed','nwound':'Wounded','summary':'Summary','gname':'Group','targtype1_txt':'Target_type','weaptype1_txt':'Weapon_type','motive':'Motive'},inplace=True)

# Select columns we are most interested in analyzing
terror=terror[['Year','Month','Day','Country','Region','city','latitude','longitude','AttackType','Killed','Wounded','Target','Summary','Group','Target_type','Weapon_type','Motive']]

# Calculate the casualties (Both Killed + Wounded)
terror['casualties']=terror['Killed']+terror['Wounded']

print(terror.head(4))



plt.style.use('fivethirtyeight')

# Using Seaborn we can plot the Terrorist attacks by Year 

plt.subplots(figsize=(15,6))
sns.countplot(x='Year',data=terror,palette='RdYlGn_r',edgecolor=sns.color_palette('dark',7))
plt.xticks(rotation=90)
plt.title('Number Of Terrorist Activities Each Year')
plt.show()


# We can also plot the Attack Methods by Terrorists
plt.subplots(figsize=(15,6))
sns.countplot(x='AttackType',data=terror,palette='inferno',order=terror['AttackType'].value_counts().index)
plt.xticks(rotation=90)
plt.title('Attacking Methods by Terrorists')
plt.show()

Additional Reading

https://seaborn.pydata.org/

https://pandas.pydata.org/pandas-docs/stable/
