How can I select only numeric columns in Pandas?

Pandas is a popular Python library for data manipulation and analysis, and it offers several ways to select specific columns from a DataFrame. To select only the numeric columns, use the select_dtypes() method and set its include argument to "number" (or, equivalently, np.number). The method returns a new DataFrame that contains only the columns with a numeric dtype and leaves out the rest. This is convenient for datasets with many columns, because mathematical operations and statistical summaries can then be run on just the relevant columns.

Select Only Numeric Columns in Pandas


You can use the following basic syntax to select only numeric columns in a pandas DataFrame:

import pandas as pd
import numpy as np

df.select_dtypes(include=np.number)

The following example shows how to use this function in practice.

Example: Select Only Numeric Columns in Pandas

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
print(df)

  team  points  assists  rebounds
0    A      18        5        11
1    B      22        7         8
2    C      19        7        10
3    D      14        9         6
4    E      14       12         6
5    F      11        9         5
6    G      20        9         9
7    H      28        4        12

We can use the following syntax to select only the numeric columns in the DataFrame:

import numpy as np

#select only the numeric columns in the DataFrame
df.select_dtypes(include=np.number)

   points  assists  rebounds
0      18        5        11
1      22        7         8
2      19        7        10
3      14        9         6
4      14       12         6
5      11        9         5
6      20        9         9
7      28        4        12

Notice that only the three numeric columns have been selected – points, assists, and rebounds.
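As mentioned at the top of this article, the data type can also be given as the string "number" instead of np.number. The short sketch below uses a smaller, made-up DataFrame (the values are for illustration only) to check that both spellings select the same columns, and uses the exclude argument to get the non-numeric columns instead:

```python
import numpy as np
import pandas as pd

# small illustrative DataFrame (hypothetical values, not the one above)
df_small = pd.DataFrame({'team': ['A', 'B', 'C'],
                         'points': [18, 22, 19],
                         'assists': [5.5, 7.0, 7.5]})

# two equivalent ways to ask for numeric columns
by_class = df_small.select_dtypes(include=np.number)
by_string = df_small.select_dtypes(include='number')

# the complement: everything that is not numeric
non_numeric = df_small.select_dtypes(exclude='number')

print(by_class.equals(by_string))
print(by_string.columns.tolist())
print(non_numeric.columns.tolist())
```

Both integer and float columns count as numeric here, so points and assists are selected while team is left out.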

We can verify that these columns are numeric by using the dtypes attribute to display the data type of each column in the DataFrame:

#display data type of each variable in DataFrame
df.dtypes

team        object
points       int64
assists      int64
rebounds     int64
dtype: object

From the output we can see that team is an object (i.e. string) while points, assists, and rebounds are all numeric.
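If you would rather check this in code than read the dtypes output by eye, one option is the is_numeric_dtype() helper from pandas.api.types. The sketch below rebuilds a shortened version of the DataFrame (an assumption made to keep the example self-contained) and reports, for each column, whether its dtype is numeric:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# shortened copy of the example DataFrame, so this block runs on its own
df_check = pd.DataFrame({'team': ['A', 'B', 'C'],
                         'points': [18, 22, 19],
                         'assists': [5, 7, 7],
                         'rebounds': [11, 8, 10]})

# map each column name to True/False depending on its dtype
numeric_flags = {col: is_numeric_dtype(df_check[col]) for col in df_check.columns}

print(numeric_flags)
```

Note that is_numeric_dtype() also treats boolean columns as numeric, whereas select_dtypes(include=np.number) does not select them, so the two approaches can disagree on DataFrames that contain bool columns.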

Note that we can also use the following code to get a list of the numeric columns in the DataFrame:

#display list of numeric variables in DataFrame
df.select_dtypes(include=np.number).columns.tolist()

['points', 'assists', 'rebounds']

This allows us to quickly see the names of the numeric variables in the DataFrame without seeing their actual values.
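A list of column names like this is also handy for running calculations on the numeric columns only. As a minimal sketch, the code below recreates the example DataFrame and computes the mean of each numeric column (the variable names numeric_cols and col_means are just illustrative choices):

```python
import numpy as np
import pandas as pd

# recreate the example DataFrame so this block runs on its own
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# get the names of the numeric columns
numeric_cols = df.select_dtypes(include=np.number).columns.tolist()

# compute the mean of each numeric column only
col_means = df[numeric_cols].mean()

print(col_means)
```

Because the team column is excluded before the calculation, the result contains one mean for each of points, assists, and rebounds.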

Cite this article

stats writer (2024). How can I select only numeric columns in Pandas?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-can-i-select-only-numeric-columns-in-pandas/

