How to get a list of column names from a Pandas DataFrame
Pandas makes it easy to obtain a list of column names from a DataFrame. For an extremely concise solution, you can simply call list() on your DataFrame object, which will return a list of header names:
import pandas as pd
data = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [10, 20, 30, 40, 50]})
list(data)
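The call above works because iterating over a DataFrame yields its column labels rather than its rows. A minimal sketch of that behavior, reusing the same example data (the variable names `names` and `labels` are illustrative only):

```python
import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [10, 20, 30, 40, 50]})

# list() consumes the DataFrame as an iterable of column labels.
names = list(data)

# An explicit loop over the DataFrame visits the same labels in the same order.
labels = [label for label in data]

print(names)
print(labels)
```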
There are also two ways to use tolist(): call it on the NumPy array returned by columns.values, or call it directly on the Index object. If performance is a priority, the first method listed below is usually faster than the second, but either works:
# faster option: tolist() on the underlying NumPy array
data.columns.values.tolist()
# equivalent result: tolist() on the Index itself
data.columns.tolist()
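If you want to check the speed difference on your own setup, a rough timing sketch with the standard-library timeit module might look like the following. The iteration count of 10,000 is an arbitrary choice, and the absolute numbers will vary with your machine and pandas version, so treat the output as indicative rather than definitive.

```python
import timeit

import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [10, 20, 30, 40, 50]})

# Both approaches should return the same list of column names.
via_values = data.columns.values.tolist()
via_index = data.columns.tolist()

# Time each call; 10,000 repetitions is an arbitrary, illustrative choice.
t_values = timeit.timeit(lambda: data.columns.values.tolist(), number=10_000)
t_index = timeit.timeit(lambda: data.columns.tolist(), number=10_000)

print(f"columns.values.tolist(): {t_values:.4f}s")
print(f"columns.tolist():        {t_index:.4f}s")
```

For a two-column DataFrame the difference is tiny in absolute terms, so readability may matter more than speed unless this call sits in a hot loop.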
You can also swap out the values attribute for the to_numpy() method; the pandas documentation recommends to_numpy() over values for clarity.
data.columns.to_numpy().tolist()
Finally, if you are using Python 3.5+, you can use unpacking generalizations to return a list of column names. This class of operations also includes ways to output your column names as a tuple or set, if so desired. Here’s how to do it for a list (yes, it’s that quick and easy!):
[*data]
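The tuple and set variants mentioned above follow the same pattern. A short sketch using the same example data (the variable names are illustrative only):

```python
import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [10, 20, 30, 40, 50]})

as_list = [*data]    # unpack column labels into a list
as_tuple = (*data,)  # the trailing comma is required to build a tuple
as_set = {*data}     # a set of labels; sets do not preserve order

print(as_list)
print(as_tuple)
print(as_set)
```

Use the list or tuple form when column order matters, since a set does not guarantee it.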
In summary, in Python 3.5 and beyond, iterable unpacking operators provide the most concise way to get a list of column names from a DataFrame. For previous versions of Python, and where conciseness is preferred, list() is an alternative. If performance is a priority, columns.values.tolist() (or its to_numpy() equivalent) is usually the fastest option.
Additional Resources:
How to drop Pandas DataFrame rows with NAs in a specific column