How to Access a JSON Column with Pandas
As a data scientist or software engineer, you may come across JSON columns in your data. JSON is a popular data format used for storing and exchanging data on the web. It is a lightweight, text-based format that is easy to read and write.
When working with JSON data in Python, Pandas is an excellent library to use. Pandas is a powerful data manipulation tool that provides efficient data structures for working with structured data.
In this tutorial, we will walk through the steps of accessing a JSON column with Pandas. We will cover the following topics:
- What is a JSON column?
- How to load a JSON file into Pandas
- How to access a JSON column in Pandas
- How to manipulate a JSON column in Pandas
Table of Contents
- What is a JSON Column?
- How to Load a JSON File into Pandas
- How to Access a JSON Column in Pandas
- How to Manipulate a JSON Column in Pandas
- Conclusion
What is a JSON Column?
A JSON column is a column in a table that contains data in JSON format. JSON stands for JavaScript Object Notation, and it is a lightweight data format that is easy to read and write.
JSON columns are commonly used in databases such as MySQL, Postgres, and MongoDB. They are used to store data that is not easily represented in a traditional tabular format.
Let’s consider the following json-formatted DF.
[
{
"id": 1,
"name": "John Doe",
"age": 30,
"attribute": {
"height": 176,
"weight": 80
}
},
{
"id": 2,
"name": "Alice Smith",
"age": 28,
"attribute": {
"height": 167,
"weight": 55
}
},
{
"id": 3,
"name": "Bob Johnson",
"age": 35,
"attribute": {
"height": 192,
"weight": 85
}
}
]
How to Load a JSON File into Pandas
Before we can access a JSON column with Pandas, we need to load the JSON data into a Pandas DataFrame. The easiest way to do this is to use the read_json
function in Pandas.
Here is an example of how to load a JSON file into a Pandas DataFrame:
import pandas as pd
df = pd.read_json('data.json')
print(df)
Output:
id name age attribute
0 1 John Doe 30 {'height': 176, 'weight': 80}
1 2 Alice Smith 28 {'height': 167, 'weight': 55}
2 3 Bob Johnson 35 {'height': 192, 'weight': 85}
In this example, we are loading a JSON file called data.json
into a Pandas DataFrame called df
.
How to Access a JSON Column in Pandas
Once we have loaded the JSON data into a Pandas DataFrame, we can access the JSON column using the bracket notation.
Here is an example of how to access a JSON column in Pandas:
import pandas as pd
df = pd.read_json('data.json')
json_column = df['attribute']
print(json_column)
Output:
0 {'height': 176, 'weight': 80}
1 {'height': 167, 'weight': 55}
2 {'height': 192, 'weight': 85}
Name: attribute, dtype: object
In this example, we are accessing a JSON column called attribute
in the Pandas DataFrame called df
.
How to Manipulate a JSON Column in Pandas
Pandas provides several methods for manipulating JSON data. Here are some common operations:
Extracting Data from a JSON Column
To extract data from a JSON column in Pandas, we can use the apply
method along with a lambda function.
Here is an example of how to extract data from a JSON column in Pandas:
import pandas as pd
df = pd.read_json('data.json')
json_column = df['attribute']
extracted_data = json_column.apply(lambda x: x['height'])
print(extracte_data)
Output:
0 176
1 167
2 192
Name: attribute, dtype: int64
In this example, we are extracting data from a JSON column called json_column_name
in the Pandas DataFrame called df
. We are extracting the value associated with the key 'key'
in the JSON data.
Adding Data to a JSON Column
To add data to a JSON column in Pandas, we can use the apply
method along with a lambda function.
Here is an example of how to add data to a JSON column in Pandas:
import pandas as pd
df = pd.read_json('data.json')
json_column = df['attribute']
# Add new data to json column
json_column = json_column.apply(lambda x: {**x, 'city': 'DC'})
df['attribute'] = json_column
print(df)
Output:
id name age attribute
0 1 John Doe 30 {'height': 176, 'weight': 80, 'city': 'DC'}
1 2 Alice Smith 28 {'height': 167, 'weight': 55, 'city': 'DC'}
2 3 Bob Johnson 35 {'height': 192, 'weight': 85, 'city': 'DC'}
In this example, we are adding a new key-value pair to a JSON column called attribute
in the Pandas DataFrame called df
. We are adding the key 'city'
with the value 'DC'
to each JSON object in the column.
Updating Data in a JSON Column
To update data in a JSON column in Pandas, we can use the apply
method along with a lambda function.
Here is an example of how to update data in a JSON column in Pandas:
import pandas as pd
df = pd.read_json('data.json')
json_column = df['attribute']
# replace a value in json column
json_column = json_column.apply(lambda x: {**x, 'city': 'New York'})
df['attribute'] = json_column
print(df)
Output:
id name age attribute
0 1 John Doe 30 {'height': 176, 'weight': 80, 'city': 'New York'}
1 2 Alice Smith 28 {'height': 167, 'weight': 55, 'city': 'New York'}
2 3 Bob Johnson 35 {'height': 192, 'weight': 85, 'city': 'New York'}
In this example, we are updating the value associated with the key 'city'
in a JSON column called attribute
in the Pandas DataFrame called df
. We are updating the value to 'New York'
for each JSON object in the column.
Conclusion
In this tutorial, we have covered the basics of accessing a JSON column with Pandas. We have learned how to load a JSON file into Pandas, how to access a JSON column in Pandas, and how to manipulate a JSON column in Pandas.
JSON columns are a powerful tool for storing and exchanging data in a flexible and lightweight format. With Pandas, you can easily work with JSON columns and perform common data manipulation operations.
About Saturn Cloud
Saturn Cloud is your all-in-one solution for data science & ML development, deployment, and data pipelines in the cloud. Spin up a notebook with 4TB of RAM, add a GPU, connect to a distributed cluster of workers, and more. Request a demo today to learn more.
Saturn Cloud provides customizable, ready-to-use cloud environments for collaborative data teams.
Try Saturn Cloud and join thousands of users moving to the cloud without
having to switch tools.