When working with Pandas DataFrames, we often need to select specific columns based on their index positions.
Whether you are preprocessing data for machine learning, building visualizations, or cleaning a dataset, selecting columns by index range is a quick way to get exactly the columns you need.
Let’s see how to select columns by index in Pandas using multiple methods such as .iloc, .loc, column slicing, and NumPy indexing.
Sample DataFrame
Before we dive into the methods, let’s create a sample DataFrame to work with:
import pandas as pd
# Sample DataFrame
data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 40],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
'Salary': [70000, 80000, 90000, 100000],
'Department': ['HR', 'Engineering', 'Marketing', 'Finance']
}
df = pd.DataFrame(data)
print(df)
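Since every method here depends on knowing which position each column occupies, it can help to list the positions first. The snippet below is a minimal sketch that rebuilds the same sample DataFrame; the variable names `positions` and `salary_pos` are only illustrative.

```python
import pandas as pd

# Same sample DataFrame as above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
    'Salary': [70000, 80000, 90000, 100000],
    'Department': ['HR', 'Engineering', 'Marketing', 'Finance'],
})

# Pair each zero-based position with its column name
positions = dict(enumerate(df.columns))
print(positions)

# Look up the position of a single column by its name
salary_pos = df.columns.get_loc('Salary')
print(salary_pos)
```

Positions start at 0, so Name is column 0 and Department is column 4.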
Method 1: Using iloc to Select Columns by Index Range
The .iloc accessor selects rows and columns by zero-based integer position. A slice includes its start position and excludes its stop position.
# Select columns from index 1 to 3 (Age, City, Salary)
# : selects all rows
# 1:4 selects columns from index 1 up to (but not including) 4
subset = df.iloc[:, 1:4]
print(subset)
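As a rough illustration of how .iloc treats positions, the sketch below compares a half-open slice with two common variations: a list of non-contiguous positions and a negative slice counted from the end. The variable names `middle`, `picked`, and `last_two` are chosen here for illustration.

```python
import pandas as pd

# Same sample DataFrame as above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
    'Salary': [70000, 80000, 90000, 100000],
    'Department': ['HR', 'Engineering', 'Marketing', 'Finance'],
})

# Half-open slice: positions 1, 2 and 3 (position 4 is excluded)
middle = df.iloc[:, 1:4]
print(list(middle.columns))

# A list of positions selects columns that are not next to each other
picked = df.iloc[:, [0, 3]]
print(list(picked.columns))

# A negative start counts from the end: the last two columns
last_two = df.iloc[:, -2:]
print(list(last_two.columns))
```

In each case the leading : keeps all rows, so only the set of columns changes.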
Method 2: Using Column Indexing with df.columns
You can also slice df.columns, the Index that holds the column names, by position and then pass the resulting names back to the DataFrame.
# Select column names from index 1 to 3
selected_columns = df.columns[1:4]
# Use these column names to select from DataFrame
subset = df[selected_columns]
print(subset)
This is especially useful when the positions are computed at runtime, or when you want to keep the selected column names for later use.
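One way to work dynamically with column positions is to wrap the df.columns slice in a small function. The helper below, `select_by_position`, is a hypothetical name introduced for this sketch and is not part of Pandas; it assumes the column names are unique.

```python
import pandas as pd

# Same sample DataFrame as above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
    'Salary': [70000, 80000, 90000, 100000],
    'Department': ['HR', 'Engineering', 'Marketing', 'Finance'],
})


def select_by_position(frame, start, stop):
    """Hypothetical helper: select columns start..stop-1 via their names."""
    names = frame.columns[start:stop]  # slice the Index of column names
    return frame[names]                # select those names from the frame


# The positions could come from configuration or user input at runtime
start, stop = 1, 4
subset = select_by_position(df, start, stop)
print(list(subset.columns))

# With unique column names, this matches the .iloc result
same_as_iloc = subset.equals(df.iloc[:, start:stop])
print(same_as_iloc)
```

If a DataFrame has duplicate column names, selecting by name returns every column with that name, so .iloc is the safer choice in that case.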