Python provides numerous ways to work with files, including PDF files. Sometimes you may need to combine multiple PDF files into a single file. In this article, we will learn how to merge PDF files using Python.
How to Merge PDF Files Using Python
Here are the different ways to merge PDF files using Python. For this purpose, we will use the PyPDF2 library.
1. Install PyPDF2
Open a terminal and run the following command to install PyPDF2 for Python.
$ pip install PyPDF2
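After installing, it can help to confirm which release of PyPDF2 is actually present, since class names have changed between releases (as far as we know, PyPDF2 3.0 and later expose the merger class as PdfMerger rather than PdfFileMerger). The following is a minimal sketch using only the standard library; the helper name installed_version is our own and not part of PyPDF2.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the PyPDF2 version if it is installed, otherwise None.
print(installed_version("PyPDF2"))
```

If this prints None, the install step did not target the interpreter you are running.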
2. Merge PDF Files
PyPDF2 provides several ways to merge PDF files. We will look at them one by one.
File Concatenation
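Let us say you have the PDF files file1.pdf, file2.pdf, and file3.pdf. In this case, we import PdfFileMerger from PyPDF2 and use append() to add each file to the end of the output. Note that PdfFileMerger is the class name in PyPDF2 1.x and 2.x; in PyPDF2 3.0 and later the class is called PdfMerger, so adjust the import if you have a newer release.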
from PyPDF2 import PdfFileMerger

pdfs = ['file1.pdf', 'file2.pdf', 'file3.pdf']
merger = PdfFileMerger()
for pdf in pdfs:
    merger.append(pdf)
merger.write("result.pdf")
merger.close()
In the above code, we combine file1.pdf, file2.pdf, and file3.pdf into result.pdf. We first create a PdfFileMerger object, then loop through the list of filenames and append each file to it. Next, we call write() to save the combined content to a single file, result.pdf. Lastly, we call close() to close both the input and output files. Please note that if you list only filenames in pdfs, Python resolves them relative to the current working directory, that is, the directory from which you run the script, which is not necessarily where the script is stored. So you may want to use full paths instead of relative paths.
pdfs = ['/home/ubuntu/file1.pdf', '/home/ubuntu/file2.pdf', '/home/ubuntu/file3.pdf']
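Instead of typing full paths by hand, you could build them from a base directory and check that every input exists before merging. The following is a minimal sketch using pathlib; the helper name resolve_pdfs and the base directory are our own illustrative assumptions, not part of PyPDF2.

```python
from pathlib import Path


def resolve_pdfs(names, base_dir):
    """Return absolute paths for the given filenames under base_dir.

    Raises FileNotFoundError early if any input file is missing,
    so the problem surfaces before the merge starts.
    """
    base = Path(base_dir).resolve()
    paths = [base / name for name in names]
    missing = [str(path) for path in paths if not path.is_file()]
    if missing:
        raise FileNotFoundError("Missing input files: " + ", ".join(missing))
    return [str(path) for path in paths]
```

For example, resolve_pdfs(['file1.pdf', 'file2.pdf', 'file3.pdf'], '/home/ubuntu') would return the same list of full paths as above, provided all three files exist.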
File Merging
You can also use the merge() function to combine PDF files. Unlike append(), it lets you specify an insertion point in the output file. Its first argument is the position, that is, the number of pages already in the output after which the new pages are inserted.
from PyPDF2 import PdfFileMerger

pdfs = ['file1.pdf', 'file2.pdf', 'file3.pdf']
merger = PdfFileMerger()
for pdf in pdfs:
    merger.merge(2, pdf)
merger.write("result.pdf")
merger.close()
In this case, we use merge() to insert each PDF after the 2nd page of the output built so far.
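To see what repeated insertion at the same position does to page order, here is a small model that uses plain Python lists in place of PDF pages. It assumes that merge() behaves like a list insertion at a zero-based index; the function insert_pages and the page labels are hypothetical and only illustrate the idea, they do not call PyPDF2.

```python
def insert_pages(merged, position, new_pages):
    """Model of merge(): splice new_pages into merged at a zero-based index."""
    merged[position:position] = new_pages
    return merged


# Three stand-in documents with three labelled pages each (assumed sizes).
documents = {
    "file1.pdf": ["f1-p1", "f1-p2", "f1-p3"],
    "file2.pdf": ["f2-p1", "f2-p2", "f2-p3"],
    "file3.pdf": ["f3-p1", "f3-p2", "f3-p3"],
}

merged = []
for name in ["file1.pdf", "file2.pdf", "file3.pdf"]:
    insert_pages(merged, 2, documents[name])

print(merged)
```

Under this model, the pages of file3.pdf end up ahead of those of file2.pdf, because each new file is inserted at the same position. If you want the files in their original order, append() is the simpler choice.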
Using Page Ranges
The above examples add each PDF in full to the output. If you want to add only specific pages and not the entire document, you can use the pages keyword argument and pass a tuple of the form (start, stop[, step]) to specify the page range. Page indexes start at 0, and the stop index is not included.
from PyPDF2 import PdfFileMerger

pdfs = ['file1.pdf', 'file2.pdf', 'file3.pdf']
merger = PdfFileMerger()
for pdf in pdfs:
    merger.append(pdf, pages=(0, 3))
merger.write("result.pdf")
merger.close()
In the above code, we append only the first 3 pages of each document to create a single document. Here is another example, where we append alternate pages 1, 3, and 5.
# Another example
merger.append(pdf, pages=(0, 6, 2))  # pages 1, 3, 5
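The tuple follows the same convention as Python's built-in range(), so you can preview which pages a given tuple selects before merging. The following is a minimal sketch; the helper name selected_page_numbers is our own and is not part of PyPDF2.

```python
def selected_page_numbers(pages):
    """Translate a (start, stop[, step]) tuple into 1-based page numbers."""
    return [index + 1 for index in range(*pages)]


print(selected_page_numbers((0, 3)))     # [1, 2, 3]
print(selected_page_numbers((0, 6, 2)))  # [1, 3, 5]
```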
It is important to remember to call the PdfFileMerger object's close() method when you have finished writing the PDF file. This ensures that both input and output files are closed properly.
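One way to make sure close() is always called, even if appending or writing raises an error, is to wrap the merger in contextlib.closing from the standard library. The following is a minimal sketch; the function merge_pdfs and its merger_cls parameter are our own illustrative names, and the merger class is passed in so that the sketch does not depend on a particular PyPDF2 release.

```python
from contextlib import closing


def merge_pdfs(paths, output_path, merger_cls):
    """Append every file in paths and write the result to output_path.

    merger_cls is any class whose instances provide append(), write()
    and close(), such as PyPDF2's merger class.
    """
    # closing() calls merger.close() on exit, even when an exception occurs.
    with closing(merger_cls()) as merger:
        for path in paths:
            merger.append(path)
        merger.write(output_path)
```

With PyPDF2 installed, this could be called as merge_pdfs(pdfs, "result.pdf", PdfFileMerger).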
In this article, we have seen how to merge PDF files using Python. You can customize these examples as per your requirements.
Sreeram has more than 10 years of experience in web development, Python, Linux, SQL and database programming.