How to Convert Python Dictionary to Dataframe

Python dictionaries and pandas dataframes are two popular data structures among software developers. A dictionary stores data compactly as key-value pairs, whereas a dataframe stores data in tabular form and makes it easy to analyze. We often need to convert a Python dictionary into a dataframe, most commonly when analyzing JSON data: since JSON is easily parsed into a Python dictionary, converting that dictionary into a dataframe lets you analyze it quickly. There are several ways to do this. In this article, we will learn how to convert a Python dict into a dataframe.

Why Convert Python Dict to Dataframe

A dataframe stores data as a table of rows and columns. It supports numerous data manipulation operations such as grouping, filtering, aggregation and sorting, and it provides many built-in functions to analyze your data quickly. A dictionary is just a collection of key-value pairs, so it is difficult to analyze until you convert it into something more structured. So if you want to perform any kind of data analysis or transformation, it is advisable to convert the dict into a dataframe first.
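
As a rough illustration of this workflow, the sketch below parses a small JSON string into Python objects, converts them into a dataframe, and then filters and aggregates it. The JSON payload, the city names and the 30-degree threshold are invented for this example.

```python
import json

import pandas as pd

# Hypothetical JSON payload, made up for illustration
raw = ('[{"city": "Pune", "temp": 34}, '
       '{"city": "Pune", "temp": 30}, '
       '{"city": "Delhi", "temp": 39}]')

records = json.loads(raw)      # JSON array -> list of Python dicts
df = pd.DataFrame(records)     # one row per dict, keys become columns

# Filtering: keep only readings above 30 degrees
hot = df[df['temp'] > 30]

# Grouping and aggregation: average temperature per city
avg_temp = df.groupby('city')['temp'].mean()

print(hot)
print(avg_temp)
```

Doing the same filtering and averaging on the raw list of dicts would need explicit loops, which is the main reason to convert first.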

How to Convert Python Dictionary into Dataframe

There are three main ways to convert a Python dict to a dataframe. They are:

  1. Using DataFrame() Constructor
  2. Using from_dict() function
  3. Using from_dict() function with orient='index' option

We will learn each of these solutions one by one.

1. Using DataFrame() Constructor

If you have a simple, flat dict of key-value pairs, you can use the DataFrame() constructor directly to convert it into a dataframe.

Let us say you have the following dictionary, which maps dates to temperatures.

data = {'2024-07-01': 34,
        '2024-07-02': 35,
        '2024-07-03': 33,
        '2024-07-04': 39,
        '2024-07-05': 35,
        '2024-07-06': 30}

Here is an example that converts the above dict into a dataframe with two columns, Date and Temperature. Each key-value pair becomes one row, and we pass the column names through the columns argument.

import pandas as pd

# list() turns the dict view into a list of (date, temperature) tuples
df = pd.DataFrame(list(data.items()), columns=['Date', 'Temperature'])
print(df)

Here is the output you will see with row indexes and column names.

         Date  Temperature
0  2024-07-01           34
1  2024-07-02           35
2  2024-07-03           33
3  2024-07-04           39
4  2024-07-05           35
5  2024-07-06           30
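
The same flat dict can also be converted by going through a pandas Series first. This is an alternative route, not the method used above, and the variable names s and df_alt are chosen only for this sketch.

```python
import pandas as pd

data = {'2024-07-01': 34, '2024-07-02': 35, '2024-07-03': 33,
        '2024-07-04': 39, '2024-07-05': 35, '2024-07-06': 30}

# Keys become the Series index, values become the Series data
s = pd.Series(data, name='Temperature')

# Name the index 'Date', then move it into a regular column
df_alt = s.rename_axis('Date').reset_index()

print(df_alt)
```

This should give the same two columns, Date and Temperature, without building a list of tuples by hand.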

Let us look at a different example. Here the dictionary describes a single record, with each key naming a field.

data = {'name': 'john', 'age': '25', 'score': '20'}

If you wrap the dict in a list, the constructor treats it as one row and uses the keys as column names.

df = pd.DataFrame([data])
print(df)

Here is the output you will see.

   name age score
0  john  25    20
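
The reason for wrapping the dict in a list becomes clearer if you try to pass it as-is. A minimal sketch, using the same dict as above (the names raised, df_list and df_index are only illustrative):

```python
import pandas as pd

data = {'name': 'john', 'age': '25', 'score': '20'}

# Passing a dict of plain scalars directly fails, because pandas
# cannot tell how many rows the dataframe should have
try:
    pd.DataFrame(data)
    raised = False
except ValueError as err:
    raised = True
    print('ValueError:', err)

# Wrapping the dict in a list, or supplying an index, each give one row
df_list = pd.DataFrame([data])
df_index = pd.DataFrame(data, index=[0])

print(df_list)
print(df_index)
```

Either form works; wrapping in a list is convenient when more records may be appended to the list later.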

2. Using from_dict() function

The pandas library provides a from_dict() function that converts a dictionary directly into a dataframe. It is documented for a dict whose values are lists or dicts but, in practice, with its default settings it also accepts a list of dictionaries. Let us say you have the following list of dictionaries. Each list item holds the data for one row, and each key-value pair in the item is a column name and the value for that column.

data = [{'name': 'john', 'age': 25, 'score': 20},
        {'name': 'jim', 'age': 35, 'score': 25},
        {'name': 'jane', 'age': 20, 'score': 39}]

Here is the code to convert the above list of dictionaries into a dataframe.

import pandas as pd
df = pd.DataFrame.from_dict(data)

print(df)

Here is the output you will see.

   name  age  score
0  john   25     20
1   jim   35     25
2  jane   20     39
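
For a list of dictionaries, the DataFrame() constructor and DataFrame.from_records() are alternative routes to the same table. The sketch below also shows what happens when one record lacks a key; the incomplete record is invented for illustration.

```python
import pandas as pd

data = [{'name': 'john', 'age': 25, 'score': 20},
        {'name': 'jim', 'age': 35, 'score': 25},
        {'name': 'jane', 'age': 20, 'score': 39}]

# Two other ways to build the same table from a list of dicts
df_ctor = pd.DataFrame(data)
df_records = pd.DataFrame.from_records(data)

# Hypothetical incomplete input: the last record has no 'score' key
partial = [{'name': 'john', 'age': 25, 'score': 20},
           {'name': 'jane', 'age': 20}]
df_partial = pd.DataFrame(partial)

print(df_ctor)
print(df_partial)                            # missing value shows as NaN
print(df_partial['score'].isna().tolist())
```

A missing key does not raise an error. The cell is filled with NaN, so it is worth checking for gaps after converting records that come from an external source.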

from_dict() also works well when your dict is structured such that each key is a dataframe column. Here is another example to illustrate this. Let us say you have the following dictionary, where each key is a column name and the corresponding list holds that column's values, one per row.

data = {
    'name': ['John', 'Jim', 'Jane'],
    'age': [25, 35, 20],
    'score': [25, 20, 39]
}

Here too, you can call the from_dict() function directly on your dictionary to get a dataframe.

import pandas as pd
df = pd.DataFrame.from_dict(data)

print(df)

You will see the following output.

   name  age  score
0  John   25     25
1   Jim   35     20
2  Jane   20     39
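
With a dict of lists, from_dict() in its default mode behaves like the plain constructor, and both expect every list to have the same length. A short sketch to illustrate; the malformed dict named bad is made up for this example.

```python
import pandas as pd

data = {'name': ['John', 'Jim', 'Jane'],
        'age': [25, 35, 20],
        'score': [25, 20, 39]}

# With the default orient='columns', from_dict() and the plain
# constructor build the same dataframe from a dict of lists
df_from_dict = pd.DataFrame.from_dict(data)
df_ctor = pd.DataFrame(data)
same = df_from_dict.equals(df_ctor)
print(same)

# Hypothetical malformed input: the lists have different lengths
bad = {'name': ['John', 'Jim', 'Jane'], 'age': [25, 35]}
try:
    pd.DataFrame.from_dict(bad)
    failed = False
except ValueError as err:
    failed = True
    print('ValueError:', err)
```

If your lists can differ in length, you need to pad or trim them before converting.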

As you can see, from_dict() is a versatile function that supports several different dictionary structures.

3. Using from_dict() function with orient='index' option

By default, from_dict() turns the dict keys into the column names of the resulting dataframe. But sometimes you may want the keys to become row indexes instead, with each key's values forming a row. You can do this by passing orient='index' to from_dict().

Let us say you have the following dictionary such that each key is a row index and its values are the row values.

data = {'row1': ['john', 25, 20],
        'row2': ['jim', 35, 25],
        'row3': ['jane', 20, 39]}

You can convert this dict into a dataframe by adding the orient='index' option. Since the data does not contain column names, you also need to supply them through the columns argument.

import pandas as pd

df = pd.DataFrame.from_dict(data, orient='index', columns=['name','age','score'])

print(df)

Here is the output you will see. The dict has been converted into a dataframe in which each key is a row index and the supplied names label the columns.

      name  age  score
row1  john   25     20
row2   jim   35     25
row3  jane   20     39
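
If each row is itself a dict rather than a list, orient='index' can take the column names from the inner keys, so the columns argument is not needed. The nested dict below is a hypothetical variation of the data above.

```python
import pandas as pd

# Hypothetical nested dict: outer keys are row labels,
# inner keys are column names
data = {'row1': {'name': 'john', 'age': 25, 'score': 20},
        'row2': {'name': 'jim', 'age': 35, 'score': 25},
        'row3': {'name': 'jane', 'age': 20, 'score': 39}}

df = pd.DataFrame.from_dict(data, orient='index')

print(df)

# Rows can now be looked up by their label
print(df.loc['row2', 'age'])
```

This shape is common when a JSON object maps an identifier to a record.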

Conclusion

In this article, we have learnt several ways to convert a Python dictionary into a dataframe. Depending on how your dictionary is structured, you can use the DataFrame() constructor or the from_dict() function with different options to convert it quickly. The key is to choose the approach and arguments that match the shape of your dict. Since a dict can take many different structures, this can be tricky at times, but the constructor and from_dict() together cover most use cases.
