
How to Combine Multiple CSV Files Using Python

Python is a powerful programming language that allows you to perform numerous file operations. Sometimes you may need to combine multiple CSV files using Python. While you can do this with sed/awk commands on Linux, you may also need to merge CSV files from within your application or website, and in such cases a Python script is usually easier. In this article, we will learn how to combine multiple CSV files using Python.


Let us say you have 100 CSV files named 1.csv, 2.csv, …, 100.csv and you need to merge them into a single file, out.csv. In that case, you can use the following Python script. We assume that each file has a header row, so we copy the header only from the first file and skip it in all the others.

fout = open("out.csv", "a")  # "a" appends, so delete out.csv before re-running
# first file:
for line in open("1.csv"):
    fout.write(line)

# now the rest:
for num in range(2, 101):  # 2.csv through 100.csv
    f = open(str(num) + ".csv")
    next(f)  # skip the header
    for line in f:
        fout.write(line)
    f.close()  # release the file handle
fout.close()

In the above code, we first open 1.csv and copy all its lines (including the header) to out.csv. Then we open each of the remaining files in turn, skip its header, and copy the rest of it line by line to out.csv.
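
The same steps can also be wrapped in a reusable function. The following is only a minimal sketch, assuming every input file shares the same header row. The name merge_csv_files and its parameters are made up for illustration and do not come from any library. It uses with blocks so that files are closed even if an error occurs, opens the output in "w" mode so that re-running does not append duplicate rows, and adds a newline when a file's last line lacks one.

```python
def merge_csv_files(input_paths, output_path):
    """Concatenate CSV files, keeping only the first header row found.

    Hypothetical helper for illustration; assumes all files share one header.
    """
    header_written = False
    with open(output_path, "w") as fout:
        for path in input_paths:
            with open(path) as fin:
                header = next(fin, None)  # None means the file is empty
                if header is None:
                    continue
                if not header_written:
                    fout.write(header if header.endswith("\n") else header + "\n")
                    header_written = True
                for line in fin:
                    # Guard against a final line without a trailing newline,
                    # which would otherwise glue two rows together.
                    fout.write(line if line.endswith("\n") else line + "\n")
```

For the example above, it could be called as merge_csv_files([str(n) + ".csv" for n in range(1, 101)], "out.csv").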

If your files do not have sequential filenames like 1.csv, 2.csv, etc., you can pass them in a list or any other iterable.

filenames = ['abc.csv', 'xyz.csv', 'pqr.csv']

fout = open("out.csv", "a")  # "a" appends, so delete out.csv before re-running
# first file:
for line in open(filenames[0]):
    fout.write(line)

# now the rest:
for name in filenames[1:]:
    f = open(name)  # each list entry is already a complete filename
    next(f)  # skip the header
    for line in f:
        fout.write(line)
    f.close()  # release the file handle
fout.close()
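
Copying line by line assumes that no field contains an embedded newline and that every file really has the same columns. As a hedged alternative, the sketch below uses Python's standard csv module, which parses quoted fields correctly, and raises an error when a header differs from the first one. The function name merge_with_header_check is hypothetical.

```python
import csv


def merge_with_header_check(input_paths, output_path):
    """Merge CSV files row by row, refusing files whose header differs.

    Hypothetical helper for illustration only.
    """
    expected_header = None
    with open(output_path, "w", newline="") as fout:
        writer = csv.writer(fout, lineterminator="\n")
        for path in input_paths:
            with open(path, newline="") as fin:
                reader = csv.reader(fin)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if expected_header is None:
                    # The first non-empty file defines the expected columns.
                    expected_header = header
                    writer.writerow(header)
                elif header != expected_header:
                    raise ValueError(
                        f"{path}: header {header} does not match {expected_header}"
                    )
                writer.writerows(reader)  # copy the remaining data rows
```

Parsing every row is slower than copying raw lines, so this is worth it mainly when the inputs come from different sources and may not be uniform.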

Alternatively, if all your CSV files are in one folder, you can use the os.listdir() function to list them and then merge them. Here is an example that merges all CSV files in /home/ubuntu/data:

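# importing os module
import os

folder = "/home/ubuntu/data"
# keep only the .csv files, in sorted order, as full paths
filenames = sorted(
    os.path.join(folder, name)
    for name in os.listdir(folder)
    if name.endswith(".csv")
)

fout = open("out.csv", "a")  # "a" appends, so delete out.csv before re-running
# first file:
for line in open(filenames[0]):
    fout.write(line)

# now the rest:
for name in filenames[1:]:
    f = open(name)
    next(f)  # skip the header
    for line in f:
        fout.write(line)
    f.close()  # release the file handle
fout.close()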
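
Note that os.listdir() returns every entry in the folder, in arbitrary order, as bare names rather than full paths. If out.csv is written into the same folder, a second run would also pick it up as an input. The following sketch, using the standard glob module, is one way to build a safer input list. The helper name list_csv_files and its exclude parameter are assumptions made for illustration.

```python
import glob
import os


def list_csv_files(folder, exclude=None):
    """Return sorted full paths of the .csv files in folder.

    Hypothetical helper; `exclude` is an optional path (such as the
    merged output file) to leave out of the result.
    """
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if exclude is not None:
        excluded = os.path.abspath(exclude)
        paths = [p for p in paths if os.path.abspath(p) != excluded]
    return paths
```

The sort here is alphabetical, so 10.csv comes before 2.csv. If the row order of the merged file matters, sort the list by whatever key suits your filenames.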

In this article, we have learnt how to merge multiple CSV files into a single CSV file using Python.
