Creating a DataFrame is the first step in almost every data analysis task using Pandas. A DataFrame is a two-dimensional labeled data structure with rows and columns.
In this article, you’ll learn different ways to create a Pandas DataFrame, from dictionaries and lists to NumPy arrays and external sources like CSV or JSON.
Method 1: Create DataFrame from a Dictionary (Most Common Way)
This is the simplest and most widely used way to create a DataFrame. Pass a Python dictionary whose keys become the column names and whose values (lists of equal length) become the column data.
import pandas as pd
data = {
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
Method 2: Create DataFrame from a List of Dictionaries
Each dictionary represents a row in the DataFrame, and the keys act as column headers. This method is very flexible when your data comes from parsed JSON or API responses.
import pandas as pd
data = [
{'Name': 'Alice', 'Age': 25, 'City': 'New York'},
{'Name': 'Bob', 'Age': 30, 'City': 'Los Angeles'},
{'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'}
]
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
Why use this?
> Perfect for handling JSON-like or API data
> Automatically infers columns from dictionary keys
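Column inference also matters when the dictionaries do not all share the same keys, which is common with API data. The sketch below uses made-up records (the `Email` field is only an illustration) to show that the columns become the union of all keys and that missing values are filled with NaN.

```python
import pandas as pd

# Hypothetical records: the second one has no 'City' key,
# and the third one carries an extra 'Email' key.
records = [
    {'Name': 'Alice', 'Age': 25, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 30},
    {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago', 'Email': 'charlie@example.com'},
]

df = pd.DataFrame(records)

# Every key seen in any record becomes a column.
print(df.columns.tolist())

# Count the missing values that were filled in per column.
print(df.isna().sum())
```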
Method 3: Create DataFrame from a List of Lists
When your data is in a plain nested list format, you can manually define column names. This approach is simple but less descriptive than using dictionaries.
import pandas as pd
data = [
['Alice', 25, 'New York'],
['Bob', 30, 'Los Angeles'],
['Charlie', 35, 'Chicago']
]
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])
print(df)
Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
Why use this?
> Easy when you have raw tabular data in list format
> Useful for quick manual testing or prototyping
Method 4: Create DataFrame from a Dictionary of Series
Each Pandas Series becomes a column in the DataFrame, aligned by index. This is useful when combining multiple Series objects with shared or custom indices.
import pandas as pd
names = pd.Series(['Alice', 'Bob', 'Charlie'])
ages = pd.Series([25, 30, 35])
cities = pd.Series(['New York', 'Los Angeles', 'Chicago'])
df = pd.DataFrame({'Name': names, 'Age': ages, 'City': cities})
print(df)
Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
Why use this?
> Good for combining separate Series objects
> Allows easy control over index alignment
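Index alignment is easiest to see when the Series do not share exactly the same labels. In this small sketch the labels are hypothetical; Pandas matches values by label rather than by position and fills the gaps with NaN.

```python
import pandas as pd

# Two Series with partially overlapping, hypothetical labels.
ages = pd.Series([25, 30], index=['alice', 'bob'])
cities = pd.Series(['Los Angeles', 'Chicago'], index=['bob', 'charlie'])

df = pd.DataFrame({'Age': ages, 'City': cities})

# Rows cover the union of both indexes; values are matched by label,
# and labels missing from one Series show up as NaN in that column.
print(df)
```

Only `bob` appears in both Series, so it is the only row with a value in both columns.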
Method 5: Create DataFrame from a NumPy Array
You can convert a NumPy array into a DataFrame and define column names manually. This is useful when working with numerical data or machine learning outputs.
import pandas as pd
import numpy as np
data = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
df = pd.DataFrame(data, columns=['A', 'B', 'C'])
print(df)
Output:
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9
Why use this?
> Integrates Pandas with NumPy arrays
> Efficient for numerical data and ML preprocessing
Method 6: Create DataFrame from a Tuple List
This works like the list-of-lists method, except that each row is a tuple. It is common when rows arrive as tuples, for example from a SQL query or CSV parsing.
import pandas as pd
data = [
('Alice', 25, 'New York'),
('Bob', 30, 'Los Angeles'),
('Charlie', 35, 'Chicago')
]
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])
print(df)
Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
Method 7: Create Empty DataFrame and Add Columns Later
Sometimes you need to initialize an empty DataFrame first and add data dynamically. This is useful for iterative or streaming data collection.
import pandas as pd
df = pd.DataFrame()
df['Name'] = ['Alice', 'Bob', 'Charlie']
df['Age'] = [25, 30, 35]
df['City'] = ['New York', 'Los Angeles', 'Chicago']
print(df)
Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
Why use this?
> Ideal for building DataFrames dynamically
> Great for iterative data collection loops
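If the data arrives one row at a time rather than one column at a time, a common pattern is to buffer the rows in a plain list and build the DataFrame once at the end, which avoids resizing it on every iteration. A minimal sketch, where `incoming` is a hypothetical stand-in for your real data source:

```python
import pandas as pd

rows = []  # plain Python list used as a buffer

# Stand-in for a real data source (hypothetical values).
incoming = [('Alice', 25), ('Bob', 30), ('Charlie', 35)]

for name, age in incoming:
    # Each row is stored as a dictionary keyed by column name.
    rows.append({'Name': name, 'Age': age})

# Build the DataFrame once, after the loop has finished.
df = pd.DataFrame(rows)
print(df)
```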
Method 8: Create DataFrame from a CSV File
You can directly load data from a CSV file using `read_csv()`. This is one of the most common real-world ways to create DataFrames.
import pandas as pd
# Read CSV file into DataFrame
df = pd.read_csv('data.csv')
print(df.head())
Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
Why use this?
> Easiest way to load external data
> Supports many parameters like delimiter, encoding, etc.
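To illustrate a few of those parameters, here is a small sketch that reads semicolon-separated text. The sample data is made up, and `StringIO` stands in for a file on disk so the example runs without any external file; with a real path you could also pass `encoding`.

```python
import pandas as pd
from io import StringIO

# Hypothetical semicolon-separated data, standing in for a file on disk.
csv_text = "Name;Age;City\nAlice;25;New York\nBob;30;Los Angeles\n"

df = pd.read_csv(
    StringIO(csv_text),
    sep=';',                  # column delimiter
    usecols=['Name', 'Age'],  # load only these columns
    dtype={'Age': 'int64'},   # set the column type explicitly
)
print(df)
```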
Method 9: Create DataFrame from JSON Data
You can also load JSON files or strings directly into a DataFrame. This is handy for API responses or structured web data.
import pandas as pd
from io import StringIO
json_data = '''
[
{"Name": "Alice", "Age": 25, "City": "New York"},
{"Name": "Bob", "Age": 30, "City": "Los Angeles"},
{"Name": "Charlie", "Age": 35, "City": "Chicago"}
]
'''
# Wrap the string in StringIO: recent pandas versions deprecate passing a literal JSON string
df = pd.read_json(StringIO(json_data))
print(df)
Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
Why use this?
> Ideal for reading API responses
> Nested JSON can be flattened into columns with pd.json_normalize()
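For nested JSON, flattening is usually done with `pd.json_normalize()`, which works on already-parsed Python objects rather than on a raw string. The records below are hypothetical; the sketch shows how a nested `address` object turns into dotted column names.

```python
import pandas as pd

# Hypothetical API-style records with a nested 'address' object.
records = [
    {'Name': 'Alice', 'address': {'City': 'New York', 'Zip': '10001'}},
    {'Name': 'Bob', 'address': {'City': 'Los Angeles', 'Zip': '90001'}},
]

# Nested keys are flattened into dotted column names
# such as 'address.City' and 'address.Zip'.
df = pd.json_normalize(records)
print(df.columns.tolist())
```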
Method 10: Create DataFrame from Another DataFrame
You can create a new DataFrame from an existing one by copying or selecting specific columns.
import pandas as pd
df1 = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago']
})
# Create new DataFrame with selected columns
df2 = pd.DataFrame(df1, columns=['Name', 'City'])
print(df2)
Output:
      Name         City
0    Alice     New York
1      Bob  Los Angeles
2  Charlie      Chicago
Why use this?
> Efficient for selecting or duplicating data
> Use .copy() when you need a copy that is guaranteed to be independent of the original
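When the goal is to transform the new DataFrame without touching the original, an explicit `.copy()` makes that intent unambiguous. A minimal sketch using bracket selection instead of the constructor (the value 'Boston' is just an illustrative edit):

```python
import pandas as pd

df1 = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
})

# Select two columns and take an explicit, independent copy.
df2 = df1[['Name', 'City']].copy()

# Changing the copy leaves the original untouched.
df2.loc[0, 'City'] = 'Boston'
print(df1.loc[0, 'City'])
print(df2.loc[0, 'City'])
```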
FAQs — How to Create a Pandas DataFrame
How to create a Pandas DataFrame from a dictionary?
Use a Python dictionary with column names as keys and lists as values:
import pandas as pd
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
print(df)
This is the most common way to create a DataFrame manually.
How to create an empty DataFrame in Pandas?
Create an empty DataFrame with no data using:
import pandas as pd
df = pd.DataFrame()
print(df)
You can later add rows or columns to it.
How to create a Pandas DataFrame from a list of lists?
Pass a list of lists (each sublist is a row) and optionally column names:
import pandas as pd
data = [[1, 'Alice'], [2, 'Bob']]
df = pd.DataFrame(data, columns=['ID', 'Name'])
print(df)
How to create a DataFrame from a list of dictionaries in Pandas?
Each dictionary represents a row of data:
import pandas as pd
data = [{'Name': 'Alice', 'Age': 25}, {'Name': 'Bob', 'Age': 30}]
df = pd.DataFrame(data)
print(df)
How to create a DataFrame from NumPy arrays in Pandas?
Combine NumPy arrays with column labels:
import numpy as np
import pandas as pd
arr = np.array([[1, 2], [3, 4]])
df = pd.DataFrame(arr, columns=['A', 'B'])
print(df)
How to create a DataFrame with custom index in Pandas?
Use the index parameter to specify row labels:
import pandas as pd
data = {'Score': [90, 85, 88]}
df = pd.DataFrame(data, index=['Math', 'Science', 'English'])
print(df)
How to create a DataFrame from CSV data in Pandas?
Use pd.read_csv() to load data from a CSV file:
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head())
How to create a DataFrame using a Series in Pandas?
Combine multiple Series objects into a DataFrame:
import pandas as pd
name = pd.Series(['Alice', 'Bob'])
age = pd.Series([25, 30])
df = pd.DataFrame({'Name': name, 'Age': age})
print(df)
How to create a DataFrame with random data in Pandas?
Use NumPy’s random functions to generate sample data:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(3, 2), columns=['A', 'B'])
print(df)
How to create a DataFrame with specific data types in Pandas?
Use the dtype parameter or convert after creation:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3]}, dtype='float')
print(df.dtypes)
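The "convert after creation" route typically uses `astype()`, which also lets you set a different type for each column. A short sketch with made-up columns `A` and `B`:

```python
import pandas as pd

# 'B' starts out as strings that happen to contain digits.
df = pd.DataFrame({'A': [1, 2, 3], 'B': ['4', '5', '6']})

# Convert selected columns after creation, one target type per column.
df = df.astype({'A': 'float64', 'B': 'int64'})
print(df.dtypes)
```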