How to delete DataFrame row in Pandas based on column value?
Removing rows from a DataFrame based on specific column values is a common operation in data cleaning and preprocessing. Below are two straightforward methods:
1. Filtering Rows with Boolean Indexing (Most Common)
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Chicago", "San Francisco", "Seattle"]
})

# Suppose we want to remove rows where "City" == "Chicago"
df_filtered = df[df["City"] != "Chicago"]
print(df_filtered)
Explanation:
- df["City"] != "Chicago" creates a boolean Series indicating which rows have a "City" different from "Chicago".
- We keep only those rows by slicing df[...].
- This method returns a new DataFrame; the original df remains unchanged.
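To make the two steps in the explanation visible, here is a minimal sketch that stores the boolean mask in its own variable before applying it. The names keep_mask, cities_to_remove, and df_multi are illustrative and not part of the original example. The isin() variation is an added assumption about a likely follow-up need: removing several values at once.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Chicago", "San Francisco", "Seattle"],
})

# Step 1: build the boolean mask (True means "keep this row").
keep_mask = df["City"] != "Chicago"

# Step 2: index with the mask; the result is a new DataFrame.
df_filtered = df[keep_mask]

# Variation (an assumption, not in the original example):
# remove several values at once by negating isin().
cities_to_remove = ["Chicago", "Seattle"]  # hypothetical list
df_multi = df[~df["City"].isin(cities_to_remove)]

print(keep_mask.tolist())            # [True, False, True, True]
print(df_filtered["Name"].tolist())  # ['Alice', 'Charlie', 'David']
print(df_multi["Name"].tolist())     # ['Alice', 'Charlie']
print(len(df))                       # 4, the original is untouched
```

Keeping the mask in a named variable can also make longer filters easier to read and reuse.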
2. Dropping Rows by Index Using drop()
# Identify the rows where "City" == "Chicago"
rows_to_drop = df[df["City"] == "Chicago"].index

# Drop them from the DataFrame
df.drop(rows_to_drop, inplace=True)
print(df)
Explanation:
- First, we get the index of rows we want to remove: df[df["City"] == "Chicago"].index.
- Then we call df.drop(rows_to_drop, inplace=True) to remove those rows in place.
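Because drop() works on index labels, the labels of the remaining rows are not renumbered afterwards. The sketch below repeats the steps above on the same sample data and prints the index at each stage. The final reset_index call is an optional extra that the original example does not include.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Chicago", "San Francisco", "Seattle"],
})

# Index labels of the rows that match the condition.
rows_to_drop = df[df["City"] == "Chicago"].index
print(list(rows_to_drop))  # [1]

# Remove those labels from df itself.
df.drop(rows_to_drop, inplace=True)
print(list(df.index))      # [0, 2, 3], label 1 is gone

# Optional (not in the original example): renumber the rows from 0.
df.reset_index(drop=True, inplace=True)
print(list(df.index))      # [0, 1, 2]
```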
Additional Tips
- inplace=True modifies the original DataFrame. If you prefer returning a new DataFrame and keeping the original intact, omit inplace=True and assign the result back to df or another variable.
- Chaining: If you do something like df[df["City"] == "Chicago"].drop(...), be careful: the drop is applied to the temporary filtered result, not to df itself, so df stays unchanged unless you assign the result. Usually, it's clearer to separate the steps.
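Both tips can be checked with a short sketch. The names df_new and chained are illustrative, and the argument passed to the chained drop() is a hypothetical stand-in for whatever the elided call would contain.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Chicago", "San Francisco", "Seattle"],
})

rows_to_drop = df[df["City"] == "Chicago"].index

# Without inplace=True, drop() returns a new DataFrame.
df_new = df.drop(rows_to_drop)
print(len(df), len(df_new))  # 4 3

# Chained form: drop() runs on the temporary filtered result.
# The index argument here is a hypothetical choice for illustration.
chained = df[df["City"] == "Chicago"].drop(index=rows_to_drop)
print(len(chained))  # 0, the only filtered row was dropped
print(len(df))       # 4, df itself was never modified
```

Splitting the filter and the drop into separate statements makes it easier to see which object each call affects.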
Mastering Python for Data Analysis
If you’re looking to strengthen your Python skills and ensure you have a solid grasp of fundamental concepts (e.g., data structures, file I/O, object-oriented design), consider Grokking Python Fundamentals by DesignGurus.io. This course helps you build a robust foundation, which in turn makes advanced data manipulation with Pandas more intuitive and efficient.
Happy Data Wrangling!
CONTRIBUTOR
TechGrind