How to filter data in Python?
This tutorial shows how to filter the rows and columns of a pandas DataFrame in Python.
Data
import pandas as pd
noms = ['Jean', 'Lucie', 'Pierre', 'Marie', 'Antoine', 'Sophie']
ages = [25, 30, 20, 40, 35, 28]
villes = ['Paris', 'Lyon', 'Marseille', 'Toulouse', 'Bordeaux', 'Nantes']
df = pd.DataFrame({'Nom': noms, 'Age': ages, 'Ville': villes})
print(df)
       Nom  Age      Ville
0     Jean   25      Paris
1    Lucie   30       Lyon
2   Pierre   20  Marseille
3    Marie   40   Toulouse
4  Antoine   35   Bordeaux
5   Sophie   28     Nantes
Filters
Filter rows for people under 30:
df_30 = df[df['Age'] < 30]
print(df_30)
      Nom  Age      Ville
0    Jean   25      Paris
2  Pierre   20  Marseille
5  Sophie   28     Nantes
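The expression inside the brackets is itself a boolean Series with one True/False value per row, and indexing with it keeps the rows marked True. The sketch below rebuilds the tutorial's DataFrame and inspects that intermediate result; the variable name `mask` is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Nom': ['Jean', 'Lucie', 'Pierre', 'Marie', 'Antoine', 'Sophie'],
    'Age': [25, 30, 20, 40, 35, 28],
    'Ville': ['Paris', 'Lyon', 'Marseille', 'Toulouse', 'Bordeaux', 'Nantes'],
})

# Comparing a column to a value yields one boolean per row.
mask = df['Age'] < 30  # 'mask' is an illustrative name, not from the tutorial
print(mask.tolist())  # [True, False, True, False, False, True]

# Indexing with the mask keeps only the rows where it is True.
df_30 = df[mask]
print(df_30['Nom'].tolist())  # ['Jean', 'Pierre', 'Sophie']
```

Note that the filtered rows keep their original index labels (0, 2, 5). If a fresh 0-based index is preferred, `df_30.reset_index(drop=True)` is one way to get it.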
Select rows for people under 30 who live in Nantes or Paris:
df_filtre = df[(df['Age'] < 30) & ((df['Ville'] == 'Nantes') | (df['Ville'] == 'Paris'))]
print(df_filtre)
      Nom  Age   Ville
0    Jean   25   Paris
5  Sophie   28  Nantes
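Each comparison is wrapped in parentheses because `&` and `|` bind more tightly than `<` and `==` in Python. When one column is tested against several values, `Series.isin` can replace the chain of `==` comparisons joined by `|`. A minimal sketch of that alternative, assuming the same data as above (the name `villes_cibles` is hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    'Nom': ['Jean', 'Lucie', 'Pierre', 'Marie', 'Antoine', 'Sophie'],
    'Age': [25, 30, 20, 40, 35, 28],
    'Ville': ['Paris', 'Lyon', 'Marseille', 'Toulouse', 'Bordeaux', 'Nantes'],
})

# Hypothetical helper list holding the cities to keep.
villes_cibles = ['Nantes', 'Paris']

# isin() returns True for rows whose 'Ville' is in the list,
# which is equivalent to chaining == comparisons with |.
df_filtre = df[(df['Age'] < 30) & df['Ville'].isin(villes_cibles)]
print(df_filtre['Nom'].tolist())  # ['Jean', 'Sophie']
```

This form stays readable as the list of accepted values grows.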
Keep only the name and city columns:
df_name_city = df[['Nom', 'Ville']]
print(df_name_city)
       Nom      Ville
0     Jean      Paris
1    Lucie       Lyon
2   Pierre  Marseille
3    Marie   Toulouse
4  Antoine   Bordeaux
5   Sophie     Nantes
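The double brackets matter here: passing a list of column names returns a DataFrame, while a single name in single brackets returns a Series. A short sketch illustrating the difference on the same data (the names `une_colonne` and `une_colonne_df` are introduced for illustration only):

```python
import pandas as pd

df = pd.DataFrame({
    'Nom': ['Jean', 'Lucie', 'Pierre', 'Marie', 'Antoine', 'Sophie'],
    'Age': [25, 30, 20, 40, 35, 28],
    'Ville': ['Paris', 'Lyon', 'Marseille', 'Toulouse', 'Bordeaux', 'Nantes'],
})

# A list of names selects several columns and returns a DataFrame.
df_name_city = df[['Nom', 'Ville']]
print(type(df_name_city).__name__)  # DataFrame

# A single name returns a Series...
une_colonne = df['Nom']
print(type(une_colonne).__name__)  # Series

# ...while a one-element list still returns a DataFrame.
une_colonne_df = df[['Nom']]
print(type(une_colonne_df).__name__)  # DataFrame
```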
Get the name and age of people under 30 who live in Nantes or Paris:
filtre = (df['Age'] < 30) & ((df['Ville'] == 'Nantes') | (df['Ville'] == 'Paris'))
df_filtre = df.loc[filtre, ['Nom', 'Age']]
print(df_filtre)
      Nom  Age
0    Jean   25
5  Sophie   28
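The same row and column selection can also be written with `DataFrame.query`, which takes the condition as a string. This is offered as an alternative sketch under the same data, not as the tutorial's method:

```python
import pandas as pd

df = pd.DataFrame({
    'Nom': ['Jean', 'Lucie', 'Pierre', 'Marie', 'Antoine', 'Sophie'],
    'Age': [25, 30, 20, 40, 35, 28],
    'Ville': ['Paris', 'Lyon', 'Marseille', 'Toulouse', 'Bordeaux', 'Nantes'],
})

# query() filters the rows from a string expression;
# the trailing [[...]] then keeps only the wanted columns.
df_filtre = df.query("Age < 30 and Ville in ['Nantes', 'Paris']")[['Nom', 'Age']]
print(df_filtre['Nom'].tolist())  # ['Jean', 'Sophie']
print(df_filtre['Age'].tolist())  # [25, 28]
```

The `.loc[filtre, ['Nom', 'Age']]` form above selects rows and columns in a single step, whereas the `query` form does it in two; which reads better is largely a matter of taste.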