HOW TO MERGE MULTIPLE CSV FILES INTO A SINGLE FILE IN PYTHON

To merge multiple CSV files into a single file in Python, the common approach is to use the pandas library, a powerful tool for data manipulation and analysis.

Let's look at a sample real-world scenario of how we can combine multiple CSV files into one CSV file.

Use case scenario: The business has sales data for different cities stored as separate CSV files. These files are all in the same format and have the same headers. We need to prepare a single CSV file to be loaded into our database.

In a real-world situation, this is particularly useful when we have to migrate hundreds of data files. For practice, I have taken 3 CSV files.

If we open an individual file, it will have the data for only that city.

I use Jupyter notebook. If you would like to know how to install Jupyter notebook, you can follow the blog post here

  1. First, make sure you have the pandas library installed. If you haven’t already, you can install it with pip by running pip install pandas in a terminal.

  2. Once pandas is installed, you can use the following code to merge multiple CSV files into a single file:
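
Here is a minimal sketch of what such code can look like. The function name merge_csv_files and its two parameters are my own placeholders for illustration, so adjust the folder and output file name to your setup. It assumes all the CSV files sit in one folder and share the same headers, as in our use case.

```python
import glob
import os

import pandas as pd


def merge_csv_files(input_dir, output_file):
    """Merge every CSV file in input_dir into one CSV file at output_file.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Collect the CSV files; sorting keeps the row order reproducible.
    csv_files = sorted(glob.glob(os.path.join(input_dir, "*.csv")))
    if not csv_files:
        raise FileNotFoundError(f"No CSV files found in {input_dir!r}")

    # Read each file into a DataFrame and append it to df_list.
    df_list = []
    for csv_file in csv_files:
        df_list.append(pd.read_csv(csv_file))

    # Stack the DataFrames on top of each other and renumber the rows.
    merged_df = pd.concat(df_list, ignore_index=True)

    # Write without the index column so the output keeps the input headers.
    merged_df.to_csv(output_file, index=False)
    return merged_df
```

A call could look like merge_csv_files("sales_data", "merged_sales.csv"), where both names are made up for this example. It is safer to write the output outside the input folder, otherwise a second run would pick up the merged file as one more input.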

This code does the following:

  • It uses the glob module to get a list of all CSV files in the specified directory.
  • It reads each CSV file into a DataFrame and appends it to the df_list.
  • It uses pd.concat() to concatenate all DataFrames into a single DataFrame.
  • Finally, it saves the merged DataFrame to a new CSV file.

This will create a single merged CSV file that contains the data from all the individual CSV files.

The result will look like this.

The merged file will have all the sales data in a single file.
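
If you want to double-check that nothing was lost, one simple sanity check is to compare row counts. This is only a sketch: the helper name row_counts_match is hypothetical, and it assumes every file has a single header row.

```python
import pandas as pd


def row_counts_match(source_files, merged_file):
    """Return True if merged_file has as many rows as all source_files combined.

    Hypothetical helper for illustration; it compares counts only, not values.
    """
    # Count the data rows (header excluded) across all the input files.
    expected_rows = sum(len(pd.read_csv(path)) for path in source_files)

    # Count the data rows in the merged output.
    actual_rows = len(pd.read_csv(merged_file))

    return expected_rows == actual_rows
```

A matching count does not prove the values are identical, but it quickly catches a file that was skipped or read twice.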
