Learning To Use APIs In Data Analysis

A Step-by-Step Guide to Building Your First Crypto Data Analyser

Welcome to another data adventure! If you are a beginner trying to grow your data skills, you have probably come across APIs at some point in your learning journey.

This is because APIs are how users access real-time data from servers where the data points are constantly changing. One such example is the world of cryptocurrency, where prices can change dramatically within seconds.

In this guide, you will learn about APIs and how to use them to access data from their respective servers, analyse the data, visualise it and extract important insights from it.

The data we will be working with comes from CoinMarketCap, but you can follow a similar process if you wish to access data from another platform (just check the documentation on their website for guidance).

What You'll Need

Before we begin this adventure, here are a few things you need to get the best out of this guide:

  • A basic understanding of Python programming.

  • Python installed on your computer and a text editor or IDE such as Jupyter Notebook, PyCharm or Visual Studio Code (or any IDE you prefer) to write and execute your Python script.

  • Good internet access

  • Access to the CoinMarketCap API (or that of the platform you want to access). Don't worry if you are not familiar with this; it is a straightforward process and we'll guide you through it shortly.

Getting Set!

Installing The Required Libraries

First, we begin by installing the following libraries which we need for the task ahead.

  • Requests: This helps to communicate with APIs by requesting data from web servers.

  • Pandas: This library helps us to manipulate and analyse the data we receive from the API.

  • Matplotlib and Seaborn: These are two different libraries that help us to create plots and charts (Matplotlib) and attractive statistical graphics (Seaborn).

If you haven't already installed these libraries, you can do that by opening your command prompt or terminal and running the following command:

      pip install requests pandas matplotlib seaborn
    

Setting Up the Development Environment

As discussed earlier, you should have Python and your preferred code editor already installed. I've found it helpful to use Visual Studio Code together with Jupyter Notebooks, but any code editor you are comfortable with will work just fine. (If you would like me to write about why this is my preference, its benefits and how to set it up, please leave a comment and I will write about it soon.)

APIs In A Nutshell

API stands for Application Programming Interface. But what does that even mean? Let's explain with an example. Say you are visiting a restaurant and a waiter presents you with a menu to choose what you want.

Once you make your choice, the waiter passes your order to the kitchen, which then prepares your meal. You don't need to know how it is prepared or go into the kitchen to help with anything. All you need to do is specify what you want and bingo! You have it on your table.

Relating this to APIs: you, the restaurant visitor, are the API user; the waiter with the menu is the API; the menu items are the endpoints; and the kitchen is the system that processes your order. This organised and efficient process allows you to access the data you need without causing any mishaps for the owners of the platform.

Having understood that, it's time to get your API keys.

Getting Your CoinMarketCap API Keys

There are many platforms from which we can access crypto data, such as CoinGecko, CoinDesk, Cryptowatch, Binance etc. In this guide, we will be making requests to CoinMarketCap's servers, and to do that we first need to get our API keys. Here is how to do that:

  1. Visit the official CoinMarketCap website (https://coinmarketcap.com/).

  2. If you are a new user you have to sign up for an account but if you already have one, just log in. At the time of writing this guide, you have to scroll to the bottom of the page, click on Crypto API and follow the prompts to sign up.

  3. Once logged in, hover over the API Key box on the top panel, click to copy your unique API key, then keep it safe.

  4. Navigate to the API DOCUMENTATION section on the left panel of the screen and click to find all the instructions you need to use the API key, including code samples. Copy the code sample provided for Python to your IDE, then replace the sandbox URL with that of CoinMarketCap and the sample API key with your unique API key (see the next section for more details).

With that, we are ready for the next landmark on this adventure: fetching cryptocurrency data.

Fetching Crypto Data

Fetching cryptocurrency data is the heart of our task. We will first write the script that does this and then convert it into a function so that we can easily call it each time we make a request.

Writing the Script

Importing libraries: We begin by importing the necessary libraries and modules we'll be using.

from requests import Request, Session
from requests.exceptions import ConnectionError, Timeout, TooManyRedirects
import json

API Endpoints: Next, we define the API endpoint (URL) and the parameters that specify the data we want to retrieve. The sample code we copied sets "limit" to "5000", but I changed mine to "15" so that it takes less time to run. You may edit yours to suit your needs.

# Defining the API URL and parameters
url = 'https://pro-api.coinmarketcap.com/v1/cryptocurrency/listings/latest'
parameters = {
  'start': '1',
  'limit': '15',
  'convert': 'USD'
}
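To see how the parameters relate to the endpoint, here is a small sketch that builds the full request URL by hand using Python's standard library. It is only for inspection: the requests library does this encoding for us when we pass params=parameters, and the variable name full_url is my own.

```python
from urllib.parse import urlencode

url = 'https://pro-api.coinmarketcap.com/v1/cryptocurrency/listings/latest'
parameters = {'start': '1', 'limit': '15', 'convert': 'USD'}

# Join the endpoint and the encoded parameters into one query URL
full_url = f"{url}?{urlencode(parameters)}"
print(full_url)
```

Each key and value pair in the dictionary becomes one key=value entry after the question mark, separated by ampersands.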

Headers: We include the headers, which give the API important information about our request, including the API key.

# Add your API key to the headers
headers = {
  'Accepts': 'application/json',
  'X-CMC_PRO_API_KEY': 'your-api-key-here',
}
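Since the API key should be kept safe, one option is to read it from an environment variable instead of typing it into the script. This is a minimal sketch, not part of the CoinMarketCap sample code; the variable name CMC_API_KEY is hypothetical and you would need to set it in your terminal first.

```python
import os

# CMC_API_KEY is a hypothetical environment variable name chosen for this sketch.
# If it is not set, we fall back to the placeholder so the script still runs.
api_key = os.environ.get('CMC_API_KEY', 'your-api-key-here')

headers = {
    'Accepts': 'application/json',
    'X-CMC_PRO_API_KEY': api_key,
}
```

This way the key never appears in code that you share or publish.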

Sessions: A session lets us reuse certain settings, such as headers, across multiple requests without the need to specify them each time.

# Create a Session and set the headers
session = Session()
session.headers.update(headers)

API Request: Having put all that in place, it is time to make the API request. We will be housing it in a try and except block to capture exceptions that may arise such as connection errors and time-outs.

After making the request, the response we get is usually in JSON format and the json.loads function helps us to convert it into a dictionary or list.

try:
  response = session.get(url, params=parameters)
  data = json.loads(response.text)
  print(data)
except (ConnectionError, Timeout, TooManyRedirects) as e:
  print(e)
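A request can also succeed at the network level and still return an error message, for example when the API key is wrong. The sketch below checks for that before using the data. It assumes the response body carries a status object with error_code and error_message fields next to data, which is my reading of the CoinMarketCap response format, so please confirm it in their documentation. The function name extract_listings and both sample bodies are made up for illustration.

```python
import json

def extract_listings(payload_text):
    """Parse a response body and return the list stored under 'data'.

    Raises ValueError when the body reports an error or has no 'data' key.
    """
    payload = json.loads(payload_text)
    status = payload.get('status', {})
    if status.get('error_code', 0) != 0:
        raise ValueError(status.get('error_message') or 'API reported an error')
    if 'data' not in payload:
        raise ValueError("response has no 'data' key")
    return payload['data']

# Made-up sample bodies (assumed shape, not real API output)
ok_body = '{"status": {"error_code": 0, "error_message": null}, "data": [{"name": "Bitcoin"}]}'
bad_body = '{"status": {"error_code": 1001, "error_message": "This API Key is invalid."}}'

print(extract_listings(ok_body))
try:
    extract_listings(bad_body)
except ValueError as e:
    print('Request failed:', e)
```

In the real script you would pass response.text to the function instead of the sample strings.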

Next, we use pd.set_option to display all the columns of the table, while the pd.json_normalize function creates a DataFrame from the JSON data.

import pandas as pd
pd.set_option('display.max_columns', None)
df = pd.json_normalize(data['data'])
print(df)
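To get a feel for what pd.json_normalize does, here is a small sketch on a made-up sample shaped like one entry of the listings data. The values are purely illustrative and the real response contains many more fields.

```python
import pandas as pd

# Made-up sample with a nested 'quote' dictionary; values are illustrative only
sample = [
    {'name': 'Bitcoin', 'symbol': 'BTC',
     'quote': {'USD': {'price': 100.0, 'percent_change_1h': 0.5}}},
    {'name': 'Ethereum', 'symbol': 'ETH',
     'quote': {'USD': {'price': 10.0, 'percent_change_1h': -0.2}}},
]

# Nested keys are flattened into dotted column names such as 'quote.USD.price'
flat = pd.json_normalize(sample)
print(list(flat.columns))
```

This is where the long column names like quote.USD.percent_change_1h used later in this guide come from.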

API Runner Function

Since we are going to be fetching the data multiple times, we can encapsulate the script into a function so that we can call it in a loop, making the process easier, shorter, faster and less error-prone.

def api_runner():
  # Endpoint and query parameters for the latest listings
  url = 'https://pro-api.coinmarketcap.com/v1/cryptocurrency/listings/latest'
  parameters = {
    'start':'1',
    'limit':'15',
    'convert':'USD'
  }
  # Replace the placeholder with your own API key
  headers = {
    'Accepts': 'application/json',
    'X-CMC_PRO_API_KEY': 'your-api-key-here',
  }

  session = Session()
  session.headers.update(headers)

  try:
    response = session.get(url, params=parameters)
    data = json.loads(response.text)
    #print(data)
    return data
  except (ConnectionError, Timeout, TooManyRedirects) as e:
    print(e)

Storing And Managing Data

Now that we have successfully fetched the data we want, we need somewhere to store it. We will be storing our data in a CSV file, which we first create using the pandas library.

# Creating a DataFrame from the retrieved data (the listings live under data['data'])
df = pd.json_normalize(data['data'])

# Saving the DataFrame to a CSV file
df.to_csv('crypto_data.csv', index=False)

We have now created a new CSV file named 'crypto_data.csv' which stores our cryptocurrency data.

Appending data

Next, we want to make sure every time we make requests, new data is appended and not overwritten.

#Appending new data to our stored csv file
df.to_csv('crypto_data.csv', mode='a', header=False, index=False)

mode='a' means we are appending data.

header=False makes sure new headers are not created each time we append data, and index=False stops the index column from being written.
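The create and append steps can also be combined so that the header is written only when the file does not exist yet. This is a minimal sketch under that assumption; the helper name append_to_csv is my own, and the example writes made-up values to a temporary folder so it does not touch your real file.

```python
import os
import tempfile
import pandas as pd

def append_to_csv(df, path):
    """Append df to path, writing the header row only when the file is new."""
    is_new_file = not os.path.exists(path)
    df.to_csv(path, mode='a', header=is_new_file, index=False)

# Try it out in a temporary folder with made-up values
csv_path = os.path.join(tempfile.mkdtemp(), 'crypto_data.csv')
batch = pd.DataFrame({'name': ['Bitcoin'], 'price': [100.0]})

append_to_csv(batch, csv_path)  # first call creates the file with a header
append_to_csv(batch, csv_path)  # second call appends rows only

stored = pd.read_csv(csv_path)
print(stored)
```

With this helper you do not need to remember whether the file has already been created before each run.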

Loops for multiple API calls

Finally, we write a loop to run the function we created as many times as needed and print a message to notify us when each run is completed. We use sleep to specify the time interval between each run.

from time import sleep

for i in range(10):
    api_runner()
    print('successful')
    sleep(15)

In the code above we are running the function (the function housing our API requests) 10 times and "successful" is printed after each run while a 15-second interval is taken between each run.
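If you want to keep what each run fetched, the loop can collect the results in a list. The sketch below assumes the fetch function returns the parsed response. The helper run_repeatedly and the stand-in fake_fetch are hypothetical names used so the example runs without an API key.

```python
from time import sleep

def run_repeatedly(fetch, runs=10, interval=15):
    """Call fetch() `runs` times, pausing `interval` seconds between calls."""
    results = []
    for i in range(runs):
        results.append(fetch())
        print(f'run {i + 1} of {runs} successful')
        if i < runs - 1:  # no need to wait after the final run
            sleep(interval)
    return results

# Stand-in for the real API function so the sketch runs offline
def fake_fetch():
    return {'data': []}

batches = run_repeatedly(fake_fetch, runs=3, interval=0)
```

Passing the function as an argument keeps the loop reusable for any other endpoint you decide to query later.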

Data Manipulation With Pandas

Ok, we have our data now but it may not be exactly how we want it, so we need to manipulate it to make the data easier to work with.

We use pd.set_option('display.float_format', ...) to display decimal numbers to 5 decimal places.
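Here is one way that option could be written, as a minimal sketch with made-up numbers. The lambda formats every float with exactly 5 decimal places when it is displayed; the stored values themselves are not changed.

```python
import pandas as pd

# Show every float with exactly 5 decimal places when printing
pd.set_option('display.float_format', lambda x: '%.5f' % x)

prices = pd.Series([0.1234567, 42.0])  # made-up values
shown = str(prices)
print(shown)

# Restore the default display behaviour if you no longer want it
pd.reset_option('display.float_format')
```

Fixing the number of decimal places also stops very small values from being shown in scientific notation.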

Next, a good look at the DataFrame will show that it may be challenging to visualise, for a few reasons.

  • Each run produced multiple values for the same headers. For example, you'll notice that percent_change_1h for Bitcoin (and every other cryptocurrency) appears 10 times (or the number of times your script ran), so we need to group the values as follows:

# Grouping by 'name' and selecting the desired columns
df2 = df.groupby('name', sort=False)[
    ['quote.USD.percent_change_1h',
    'quote.USD.percent_change_24h',
    'quote.USD.percent_change_7d',
    'quote.USD.percent_change_30d',
    'quote.USD.percent_change_60d',
    'quote.USD.percent_change_90d']
].mean()  # Use .mean() or another aggregation function as needed

  • Grouping the data may have solved one problem but it raises another subtle challenge. After a groupby operation in Pandas, the structure of the DataFrame changes: each cryptocurrency is now a row label and each time interval is a separate column, which is difficult to plot directly. Unstacking simplifies this by reshaping the DataFrame into a long table with one row per cryptocurrency and time interval. Here is how to do that with a few lines of code:

# Unstack the data
df3 = df2.unstack().reset_index()

We added .reset_index() to turn the index produced by groupby and unstack back into ordinary columns.
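To make the reshaping concrete, here is a small sketch of the group and unstack steps on made-up rows (two runs, two coins, two percent change columns). The numbers are illustrative only.

```python
import pandas as pd

# Made-up rows: two runs of the script, two coins, two percent change columns
df = pd.DataFrame({
    'name': ['Bitcoin', 'Ethereum', 'Bitcoin', 'Ethereum'],
    'quote.USD.percent_change_1h': [0.5, -0.2, 0.7, -0.4],
    'quote.USD.percent_change_24h': [2.0, 1.0, 4.0, 3.0],
})

# One averaged row per coin
df2 = df.groupby('name', sort=False)[
    ['quote.USD.percent_change_1h', 'quote.USD.percent_change_24h']
].mean()

# One row per (percent change column, coin) pair
df3 = df2.unstack().reset_index()

print(list(df3.columns))
print(df3)
```

The result has three columns: the first holds the original column names, the second holds the coin names and the third holds the averaged values. The first and third come out with automatic labels, which is why the columns need naming before plotting.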

  • Names like quote.USD.percent_change_1h are too long and would make our visualisation look untidy, so we first label the columns and then replace the long names with short ones as follows:

# Label the three columns produced by unstack().reset_index()
df3.columns = ['name', 'percent_change', 'values']

# Rename the columns so that each label matches its contents
df3.rename(columns={
    'percent_change': 'cryptocurrency',
    'name': 'percent_change'
}, inplace=True)

# Replace the long names with short names
df3['percent_change'] = df3['percent_change'].replace({
    'quote.USD.percent_change_1h': '1h',
    'quote.USD.percent_change_24h': '24h',
    'quote.USD.percent_change_7d': '7d',
    'quote.USD.percent_change_30d': '30d',
    'quote.USD.percent_change_60d': '60d',
    'quote.USD.percent_change_90d': '90d'
})

  • We renamed the columns a second time after we created the visualisation because we felt it made more sense this way. Feel free to modify it in a way that makes sense to you too.

Data Visualisation With Seaborn

This is the final stage of creating our very own crypto data analyser. It's time to see what we have been up to with data visualisation. For this, we'll be using Python's Seaborn library to create a catplot (categorical plot) to show the percentage change in the value of various cryptocurrencies over time.

# Plot with Seaborn
import seaborn as sns

sns.catplot(x='percent_change', y='values', hue='cryptocurrency', data=df3, kind='point')

The code above specifies which columns the x and y axes represent and which column the individual colours (hues) represent.

Here is the result of our code:

Percent Change In The Value Of Cryptocurrencies Over Time

Analysis

From the visual above we can see that over the past 90 days, most cryptocurrencies increased in value in the first month (90d to 60d), except for Toncoin, which saw a sharp decline. All the other cryptocurrencies then declined sharply over the following two months. There are several ways this visual could be improved. For example, it would read better if the percent_change axis were labelled time_interval and reversed, i.e. running from 90d to 1h instead of 1h to 90d. However, this guide focuses on showing you how to retrieve data using APIs and on opening your eyes to what is possible.
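The axis improvement mentioned above could be prepared along these lines. This is a sketch on made-up rows shaped like df3, not the code used for the chart in this guide; the names interval_order and df_plot are my own.

```python
import pandas as pd

# Made-up long-format rows shaped like df3; values are illustrative only
df3 = pd.DataFrame({
    'percent_change': ['1h', '24h', '7d', '30d', '60d', '90d'],
    'cryptocurrency': ['Bitcoin'] * 6,
    'values': [0.1, 1.2, -3.4, 5.6, 7.8, 9.0],
})

# Oldest interval first, so the chart reads left to right through time
interval_order = ['90d', '60d', '30d', '7d', '24h', '1h']

# Give the column a clearer name and an explicit order
df_plot = df3.rename(columns={'percent_change': 'time_interval'})
df_plot['time_interval'] = pd.Categorical(
    df_plot['time_interval'], categories=interval_order, ordered=True
)
df_plot = df_plot.sort_values('time_interval').reset_index(drop=True)
print(df_plot)
```

Passing df_plot to sns.catplot with x='time_interval' and order=interval_order should then draw the axis from 90d to 1h.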

Conclusion

We've just learned about APIs and how to use them to fetch data. We fetched the data, stored it in a CSV file and then did some transformations to enable us to work with and visualise our data better. Finally, we used Seaborn to visualise the data.

This guide was kept as simple as possible, with an emphasis on retrieving data using APIs. There is so much more you can do with all the data available and so much insight you can draw.

Thank you for coming on this adventure with me. I wish you good luck and I can't wait to see all the amazing things you'll come up with. Comments? Questions? Feedback? Please write them in the comments; I would love to know who is reading and what you think.

Credits

This guide was inspired and influenced by the work of Alex Freberg (Alex The Analyst) on YouTube. Though the results are similar, I had to modify my code to make it work.