Python: Display All Columns of a Pandas DataFrame in '.describe()'
Setting Up Your Environment
Before we dive in, make sure you have the necessary tools installed. You'll need Python and the Pandas library. Python itself cannot be installed with pip; get it from python.org or your system's package manager. Once Python is available, install Pandas with the following command:
pip install pandas
Understanding the “.describe()” Method
The .describe() method in Pandas is a convenient way to get a quick overview of your data. By default, it reports the count, mean, standard deviation, minimum, 25th percentile (Q1), median (50th percentile or Q2), 75th percentile (Q3), and maximum of each numeric column.
import pandas as pd
# Create a simple dataframe
df = pd.DataFrame({
'A': [1, 2, 3, 4, 5],
'B': [2, 3, 4, 5, 6],
'C': [3, 4, 5, 6, 7],
'D': [1, 2, 3, 4, 5],
'E': [2, 3, 4, 5, 6],
'F': [3, 4, 5, 6, 7],
'G': [1, 2, 3, 4, 5],
'H': [2, 3, 4, 5, 6],
'I': [3, 4, 5, 6, 7],
'J': [1, 2, 3, 4, 5],
'K': [2, 3, 4, 5, 6],
'L': [3, 4, 5, 6, 7],
'M': [1, 2, 3, 4, 5],
'N': [2, 3, 4, 5, 6],
'O': [3, 4, 5, 6, 7],
'P': [1, 2, 3, 4, 5],
'Q': [2, 3, 4, 5, 6],
'R': [3, 4, 5, 6, 7]
})
print(df.describe())
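With a DataFrame as wide as the one above, the printed summary may be abbreviated: when the output does not fit within its display limits, Pandas replaces the middle columns with an ellipsis (...). The statistics are still computed for every column; only the printed view is shortened. This is where we need to tweak our display settings.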
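To confirm that only the printed view is shortened, you can inspect the shape of the result. The sketch below uses a hypothetical stand-in frame (wide, with made-up column names col_0 to col_17) rather than the df above, and also shows transposing the summary, which is one possible way to get one row per column without changing any settings.

```python
import pandas as pd

# Hypothetical stand-in data: 18 numeric columns, 5 rows each
wide = pd.DataFrame({f"col_{i}": range(5) for i in range(18)})

summary = wide.describe()

# The result holds 8 statistics for all 18 columns,
# regardless of how print() chooses to abbreviate it
print(summary.shape)

# Transposing gives one row per original column and 8 statistic columns,
# which usually fits on screen without adjusting display options
transposed = summary.T
print(transposed)
```

Whether the plain print(summary) call is actually abbreviated depends on your terminal width and Pandas version, so treat the ellipsis as something you may or may not see.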
Displaying All Columns
To display all columns, you need to adjust the Pandas display options. Set the display.max_columns option to None, which removes the limit on how many columns Pandas will print.
pd.set_option('display.max_columns', None)
Now, when you use the .describe() method, all columns will be displayed.
print(df.describe())
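pd.set_option changes the setting for the rest of the session. If you only want the wider output for a single print, a context manager can scope the change. The following is a minimal sketch, again using a hypothetical stand-in frame; the display.width value of 1000 is an arbitrary choice meant to keep the columns from wrapping onto several blocks of lines.

```python
import pandas as pd

# Hypothetical stand-in data: 18 numeric columns, 5 rows each
wide = pd.DataFrame({f"col_{i}": range(5) for i in range(18)})

# Remember the current setting so we can compare afterwards
before = pd.get_option("display.max_columns")

# Options set here apply only inside the with-block
with pd.option_context("display.max_columns", None, "display.width", 1000):
    inside = pd.get_option("display.max_columns")
    print(wide.describe())

# Outside the block the previous value is back in effect
after = pd.get_option("display.max_columns")
print(before, inside, after)
```

If you did use pd.set_option and want to undo it, pd.reset_option('display.max_columns') restores the default.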
Customizing the “.describe()” Method
While the default summary statistics provided by .describe() are useful, you might want to customize them to better suit your needs. You can do this by passing a list of percentiles to the percentiles parameter. Each value must be a fraction between 0 and 1.
print(df.describe(percentiles=[.10, .20, .30, .40, .50, .60, .70, .80, .90]))
This will display the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles of your data.
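The requested percentiles appear as extra rows in the result, labelled as percentages. A small sketch with a hypothetical single-column frame (scores) shows how the row labels change:

```python
import pandas as pd

# Hypothetical single-column data for illustration
scores = pd.DataFrame({"score": [1, 2, 3, 4, 5]})

# Ask for the 10th, 50th, and 90th percentiles instead of the defaults
custom = scores.describe(percentiles=[0.1, 0.5, 0.9])

# The quartile rows (25%, 75%) are replaced by the requested percentiles
labels = list(custom.index)
print(labels)
# ['count', 'mean', 'std', 'min', '10%', '50%', '90%', 'max']
```

The median is listed explicitly here because whether Pandas adds the 50% row on its own when it is omitted can vary between versions.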
Conclusion
The Pandas .describe() method is a powerful tool for quickly understanding your data. By adjusting the Pandas display options, you can ensure that all columns are displayed, giving you a complete overview of your dataset. Remember, data understanding is a crucial step in the data science process, and tools like Pandas make this step easier and more efficient.