To merge multiple CSV files into a single file in Python, a common approach is to use the pandas library, a powerful tool for data manipulation and analysis.
Let's look at a sample real-world scenario of how we can combine multiple CSV files into one CSV file.
Use case scenario: The business has sales data for different cities stored as separate CSV files. These files are all in the same format and have the same headers. We need to prepare a single CSV file to be loaded into our database.
In a real-world situation, this is particularly useful when we have to migrate hundreds of data files. For practice, I have taken 3 CSV files.
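Since the whole approach relies on every file sharing the same headers, it can be worth checking that before merging. The following is only a minimal sketch: the function name `find_header_mismatches` is hypothetical, and it assumes the files are UTF-8 encoded with the header on the first row.

```python
import csv
import glob
import os


def find_header_mismatches(directory):
    """Return {path: header} for CSV files whose header differs from the first file's.

    Hypothetical helper; assumes UTF-8 files with the header on the first row.
    """
    paths = sorted(glob.glob(os.path.join(directory, "*.csv")))
    headers = {}
    for path in paths:
        # newline="" is the setting the csv module documentation recommends
        with open(path, newline="", encoding="utf-8") as f:
            # An empty file yields an empty header list
            headers[path] = next(csv.reader(f), [])
    if not headers:
        return {}
    # Treat the first file (in sorted order) as the reference header
    reference = headers[paths[0]]
    return {p: h for p, h in headers.items() if h != reference}
```

Calling `find_header_mismatches('Sales Sample Data')` should return an empty dictionary when all files line up; any entries it does return are the files to inspect by hand.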

If we open an individual file, it will have the data of only that city.



I use Jupyter Notebook. If you would like to know how to install it, you can follow the blog post here.
- First, make sure you have the pandas library installed. You can install it using pip if you haven’t already:
pip install pandas
- Once pandas is installed, you can use the following code to merge multiple CSV files into a single file:
import pandas as pd
import glob
# Specify the directory where your CSV files are located
directory = 'Sales Sample Data'
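# Use the glob module to get a list of all CSV files in the directory
# (sorted for a stable order; skip a merged_file.csv left over from an earlier run)
all_files = sorted(
    f for f in glob.glob(directory + "/*.csv")
    if not f.endswith("merged_file.csv")
)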
# Initialize an empty list to store DataFrames
df_list = []
# Loop through the list of CSV files and read them into DataFrames
for filename in all_files:
    df = pd.read_csv(filename)
    df_list.append(df)
# Concatenate the DataFrames into a single DataFrame
merged_df = pd.concat(df_list, ignore_index=True)
# Specify the output file name and path
output_file = 'Sales Sample Data/merged_file.csv'
# Save the merged DataFrame to a new CSV file
merged_df.to_csv(output_file, index=False)
print("CSV files have been merged and saved to", output_file)
This code does the following:
- It uses the glob module to get a list of all CSV files in the specified directory.
- It reads each CSV file into a DataFrame and appends it to df_list.
- It uses pd.concat() to concatenate all DataFrames into a single DataFrame.
- Finally, it saves the merged DataFrame to a new CSV file.
This will create a single merged CSV file that contains the data from all the individual CSV files.
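Because the merged file is saved into the same folder that is being scanned, it is worth making sure a previous merged file is never picked up as input on a later run. Here is a rough sketch of the same steps wrapped in a reusable function. The function name `merge_csv_files`, the `add_source` flag, and the `source_file` column are my own illustrative choices, not part of pandas.

```python
import glob
import os

import pandas as pd


def merge_csv_files(directory, output_name="merged_file.csv", add_source=False):
    """Merge every CSV in `directory`, except the output file itself.

    Hypothetical helper; assumes all input files share the same headers.
    """
    output_path = os.path.join(directory, output_name)
    # Sort for a stable order and leave out the output file from earlier runs
    paths = sorted(
        p for p in glob.glob(os.path.join(directory, "*.csv"))
        if os.path.abspath(p) != os.path.abspath(output_path)
    )
    if not paths:
        raise FileNotFoundError(f"No CSV files found in {directory!r}")

    frames = []
    for path in paths:
        df = pd.read_csv(path)
        if add_source:
            # Illustrative extra column recording which file each row came from
            df["source_file"] = os.path.basename(path)
        frames.append(df)

    merged = pd.concat(frames, ignore_index=True)
    merged.to_csv(output_path, index=False)
    return merged
```

For the sample data this could be called as `merge_csv_files('Sales Sample Data')`. Passing `add_source=True` may help if the individual files do not already contain a city column, since the file name is then the only record of where a row came from.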
The result will look like this.

The merged file contains all the sales data in a single file.
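Before loading the merged file into a database, a quick sanity check is to confirm that no rows were lost or duplicated. The sketch below compares row counts; `row_counts_match` is a hypothetical helper name, and it assumes the list of input files does not include the merged file itself.

```python
import pandas as pd


def row_counts_match(input_files, merged_file):
    """Return True if merged_file has as many rows as all input_files combined.

    Hypothetical helper; input_files must not contain merged_file.
    """
    expected = sum(len(pd.read_csv(path)) for path in input_files)
    actual = len(pd.read_csv(merged_file))
    return expected == actual
```

With the variables from the merge code, this would be `row_counts_match(all_files, output_file)`. Matching counts do not prove the contents are identical, but a mismatch is a clear sign that something went wrong.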


