How to rename DataFrame columns in Pandas
Whether you’re changing the name of one or several columns or completely reassigning your DataFrame’s header, renaming columns in pandas is very simple.
Renaming specific columns can be easily accomplished with the built-in rename() method. This method takes a dictionary of columns to be renamed and their new names, in the format old: new. Remember to specify that you want to change column names with the axis argument, as this method can also be used to rename rows. Here’s rename() in action:
import pandas as pd
data = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30], 'c': [100, 200, 300]})
# either axis value (1 or 'columns') targets the column names; each call returns a new DataFrame
data_renamed = data.rename({'a': 'a1', 'b': 'b1'}, axis=1)
data_renamed = data.rename({'a': 'a2', 'b': 'b2'}, axis='columns')
data_renamed
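As a side note, rename() also accepts the mapping through its columns keyword, which makes the axis argument unnecessary. By default, keys that do not match any column are silently ignored, so a typo in an old name goes unnoticed. Passing errors='raise' turns that case into a KeyError. The sketch below illustrates both behaviors; the column name 'z' is a made-up stand-in for a misspelled name.

```python
import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30], 'c': [100, 200, 300]})

# columns= is equivalent to passing the mapping together with axis=1
renamed = data.rename(columns={'a': 'a1', 'b': 'b1'})

# 'z' is not a column (hypothetical typo): nothing is renamed, no error is raised
unchanged = data.rename(columns={'z': 'z1'})

# errors='raise' reports the missing key instead of ignoring it
try:
    data.rename(columns={'z': 'z1'}, errors='raise')
    missing_key_raised = False
except KeyError:
    missing_key_raised = True
```

Using errors='raise' is worth considering when the old names come from user input or an external file.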
If you’d rather rename columns in-place than assign the output to a new variable, pass inplace=True:
import pandas as pd
data = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30], 'c': [100, 200, 300]})
data.rename({'a': 'a1', 'b': 'b2'}, axis=1, inplace=True)
data
You can also reassign all column names from a list by using set_axis():
import pandas as pd
data = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30], 'c': [100, 200, 300]})
data2 = data.set_axis(['d', 'e', 'f'], axis=1)
data2
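Two properties of set_axis() are easy to overlook: in recent pandas versions it leaves the original DataFrame untouched, and the list must contain exactly one name per column. A minimal sketch of both, reusing the same example data (the variable names are only illustrative):

```python
import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30], 'c': [100, 200, 300]})

# set_axis() returns a new DataFrame; the original keeps its column names
data2 = data.set_axis(['d', 'e', 'f'], axis=1)
original_names = list(data.columns)
new_names = list(data2.columns)

# A list with the wrong number of names is rejected with a ValueError
try:
    data.set_axis(['d', 'e'], axis=1)
    length_mismatch_raised = False
except ValueError:
    length_mismatch_raised = True
```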
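In older versions of pandas, set_axis() also takes an inplace argument. That argument was deprecated in pandas 1.5.0 and removed in pandas 2.0. From pandas 1.5.0 onward, set_axis() instead accepts a copy boolean argument that specifies whether the underlying data is copied into the new DataFrame object. By default, the data is copied. Note that set_axis() returns a new DataFrame either way, so the result still has to be assigned. To assign headers without copying the underlying data, use:
import pandas as pd
data = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30], 'c': [100, 200, 300]})
# copy=False avoids copying the data, but the result must still be assigned
data = data.set_axis(['d', 'e', 'f'], axis=1, copy=False)
data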
It’s also possible to reassign headers directly in-place by assigning a list to the columns attribute. Be aware that this depends on the positions of the columns rather than their names, so it is less robust than renaming with a dictionary.
import pandas as pd
data = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30], 'c': [100, 200, 300]})
data.columns = ['d', 'e', 'f']
data
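To make the robustness concern concrete, here is a small sketch that assumes the same columns arrive in a different order than expected. Positional assignment then labels the wrong data, while a name-based mapping is unaffected. The variable names by_position and by_name are only illustrative.

```python
import pandas as pd

# Same data as before, but 'b' now comes before 'a'
data = pd.DataFrame({'b': [10, 20, 30], 'a': [1, 2, 3], 'c': [100, 200, 300]})

# Positional: 'd' ends up labelling the values that used to be called 'b'
by_position = data.copy()
by_position.columns = ['d', 'e', 'f']

# Name-based: 'd' always labels the values that used to be called 'a'
by_name = data.rename(columns={'a': 'd', 'b': 'e', 'c': 'f'})
```

Here by_position['d'] holds 10, 20, 30, whereas by_name['d'] holds 1, 2, 3.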
So far, we’ve only considered examples of small datasets where it’s feasible to manually write out old and new column names. If you’re looking for a way to apply a text transformation to every column name in a DataFrame, you can also pass a function to rename(). This functionality is especially useful if, for example, you would like to remove extra delimiters from column names:
import pandas as pd
data = pd.DataFrame({'a,': [1, 2, 3], 'b,': [10, 20, 30], 'c,': [100, 200, 300]})
# creating a new DataFrame, but you can also pass inplace=True here
data_renamed = data.rename(columns=lambda x: x.replace(',', ''))
data_renamed
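The same idea extends to any callable that maps one name to another. As a rough sketch, the example below normalizes some made-up, messy headers with a named helper function (clean is a hypothetical name), and then shows an alternative that applies the same steps through the .str accessor on the column index.

```python
import pandas as pd

# Hypothetical headers with stray whitespace and mixed case
data = pd.DataFrame({' First Name ': ['Ada'], 'Last Name': ['Lovelace']})

def clean(name):
    # Trim whitespace, lowercase, and replace inner spaces with underscores
    return name.strip().lower().replace(' ', '_')

# rename() calls clean() once per column name and returns a new DataFrame
data_clean = data.rename(columns=clean)

# Alternative: vectorized string methods on the column index
data_clean2 = data.copy()
data_clean2.columns = (
    data.columns.str.strip().str.lower().str.replace(' ', '_', regex=False)
)
```

A named function is easier to test and reuse than a lambda once the transformation involves more than one step.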
In summary, pandas provides several quick and easy methods for renaming DataFrame columns, whether you’re looking to rename a single column, rename all columns using a list, or rename all columns using a function.
Additional Resources:
How to drop Pandas DataFrame rows with NAs in a specific column