Filter your data with Python

Selection of rows or columns

python
filter
Author

Clément Rieux

Published

January 1, 2023

How to filter data in Python?

This tutorial shows how to filter the rows and columns of a pandas DataFrame in Python.

Data

import pandas as pd

# Sample data: six people, each with a name, an age and a city
noms = ['Jean', 'Lucie', 'Pierre', 'Marie', 'Antoine', 'Sophie']
ages = [25, 30, 20, 40, 35, 28]
villes = ['Paris', 'Lyon', 'Marseille', 'Toulouse', 'Bordeaux', 'Nantes']
df = pd.DataFrame({'Nom': noms, 'Age': ages, 'Ville': villes})

print(df)
       Nom  Age      Ville
0     Jean   25      Paris
1    Lucie   30       Lyon
2   Pierre   20  Marseille
3    Marie   40   Toulouse
4  Antoine   35   Bordeaux
5   Sophie   28     Nantes

Filters

Filter rows for people under 30:

df_30 = df[df['Age'] < 30]
print(df_30)
      Nom  Age      Ville
0    Jean   25      Paris
2  Pierre   20  Marseille
5  Sophie   28     Nantes
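
To see why the filter above works, it can help to look at the condition on its own. The sketch below rebuilds the same sample data and assumes a variable name, `masque`, that is not in the original; it only illustrates the idea of a boolean mask.

```python
import pandas as pd

# Same sample data as in the Data section
df = pd.DataFrame({
    'Nom': ['Jean', 'Lucie', 'Pierre', 'Marie', 'Antoine', 'Sophie'],
    'Age': [25, 30, 20, 40, 35, 28],
    'Ville': ['Paris', 'Lyon', 'Marseille', 'Toulouse', 'Bordeaux', 'Nantes'],
})

# The comparison returns a boolean Series, one True/False per row
# ("masque" is a name chosen for this illustration)
masque = df['Age'] < 30
print(masque.tolist())  # [True, False, True, False, False, True]

# Indexing with the mask keeps only the rows where it is True
print(df[masque])

# ~ inverts the mask: rows for people aged 30 or over
print(df[~masque])
```

Note that the filtered rows keep their original index labels (0, 2, 5), which is why the output above is not renumbered.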

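Select rows for people under 30 living in Nantes or Paris. Each condition needs its own parentheses, because & and | bind more tightly than comparison operators such as < and ==: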

df_filtre = df[(df['Age'] < 30) & ((df['Ville'] == 'Nantes') | (df['Ville'] == 'Paris'))]

print(df_filtre)
      Nom  Age   Ville
0    Jean   25   Paris
5  Sophie   28  Nantes
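
When a column has to match one of several values, chaining `==` comparisons with `|` gets long quickly. One possible alternative is `Series.isin`, sketched here on the same sample data; the names `villes_cibles` and `df_isin` are assumptions made for this example.

```python
import pandas as pd

# Same sample data as in the Data section
df = pd.DataFrame({
    'Nom': ['Jean', 'Lucie', 'Pierre', 'Marie', 'Antoine', 'Sophie'],
    'Age': [25, 30, 20, 40, 35, 28],
    'Ville': ['Paris', 'Lyon', 'Marseille', 'Toulouse', 'Bordeaux', 'Nantes'],
})

# Cities to keep (hypothetical variable name)
villes_cibles = ['Nantes', 'Paris']

# isin() is True for rows whose city appears in the list
df_isin = df[(df['Age'] < 30) & df['Ville'].isin(villes_cibles)]
print(df_isin)
```

This should select the same two rows as the version with `|`, and adding a third city only means extending the list.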

Select only the name and city columns by passing a list of column names:

df_name_city = df[['Nom', 'Ville']]
print(df_name_city)
       Nom      Ville
0     Jean      Paris
1    Lucie       Lyon
2   Pierre  Marseille
3    Marie   Toulouse
4  Antoine   Bordeaux
5   Sophie     Nantes
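
Selecting columns can also be approached from the other side, by naming the columns to leave out. A minimal sketch with `DataFrame.drop`, assuming the same sample data; `df_sans_age` is a name invented for this example.

```python
import pandas as pd

# Same sample data as in the Data section
df = pd.DataFrame({
    'Nom': ['Jean', 'Lucie', 'Pierre', 'Marie', 'Antoine', 'Sophie'],
    'Age': [25, 30, 20, 40, 35, 28],
    'Ville': ['Paris', 'Lyon', 'Marseille', 'Toulouse', 'Bordeaux', 'Nantes'],
})

# drop() returns a new DataFrame without the listed columns;
# the original df is left unchanged
df_sans_age = df.drop(columns=['Age'])
print(df_sans_age.columns.tolist())  # ['Nom', 'Ville']
```

This tends to be convenient when a table has many columns and only a few need to be removed.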

Get the name and age of people under 30 who live in Nantes or Paris. With .loc, the first argument selects rows (here a boolean mask) and the second selects columns:

filtre = (df['Age'] < 30) & ((df['Ville'] == 'Nantes') | (df['Ville'] == 'Paris'))
df_filtre = df.loc[filtre, ['Nom', 'Age']]

print(df_filtre)
      Nom  Age
0    Jean   25
5  Sophie   28
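
The same row filter can also be written as a string expression with `DataFrame.query`, which some readers find easier to scan. This is only a sketch of an alternative, not a replacement for the `.loc` approach above; `df_query` is a name chosen for the example.

```python
import pandas as pd

# Same sample data as in the Data section
df = pd.DataFrame({
    'Nom': ['Jean', 'Lucie', 'Pierre', 'Marie', 'Antoine', 'Sophie'],
    'Age': [25, 30, 20, 40, 35, 28],
    'Ville': ['Paris', 'Lyon', 'Marseille', 'Toulouse', 'Bordeaux', 'Nantes'],
})

# query() filters the rows using column names inside a string,
# then the list of names selects the columns
df_query = df.query("Age < 30 and Ville in ['Nantes', 'Paris']")[['Nom', 'Age']]
print(df_query)
```

The result should match `df.loc[filtre, ['Nom', 'Age']]`. The string form works well with simple column names like these; names containing spaces need backticks inside the expression.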