
How to Download Large Files in Python Requests

Requests is a popular, feature-rich Python library for working with URLs and downloading files. By default, however, a downloaded file is held entirely in memory before being written to disk. If the file is large, it may not fit in memory at all. In such cases, how do you download a large file with Requests? In this article, we will learn the steps to do it.



Let us say you have the following URL to a large .txt file (>1GB).

url = 'http://www.example.com/data.txt'

1. Using iter_content()

First, we will parse the URL to obtain just the filename.

local_filename = url.split('/')[-1]
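Note that a bare split('/') breaks down when the URL carries a query string (e.g. ?token=abc would end up in the filename). A more robust sketch using only the standard library (filename_from_url is a helper name chosen here for illustration):

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath

def filename_from_url(url):
    """Extract the last path component of a URL, ignoring any query string."""
    return PurePosixPath(urlparse(url).path).name

print(filename_from_url('http://www.example.com/data.txt?token=abc'))  # data.txt
```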

Next, we download the URL as a stream by calling requests.get() with stream=True. We use open() to open the local file in binary write mode and iter_content() to write the response to disk in chunks of 8192 bytes.

import requests

with requests.get(url, stream=True) as r:
    r.raise_for_status()
    with open(local_filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)

You can wrap the above code in a reusable function.

def download_file(url):
    local_filename = url.split('/')[-1]

    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192): 
                f.write(chunk)
    return local_filename
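The core of this approach, writing an iterable of byte chunks to a file object, can be factored out and exercised without any network access (write_chunks is a helper name chosen here for illustration). The Requests documentation notes that iter_content() may yield empty keep-alive chunks, which are safe to skip:

```python
import io

def write_chunks(chunks, fileobj):
    """Write an iterable of byte chunks to a file object,
    skipping empty keep-alive chunks. Returns bytes written."""
    total = 0
    for chunk in chunks:
        if chunk:  # filter out keep-alive chunks
            fileobj.write(chunk)
            total += len(chunk)
    return total

buf = io.BytesIO()
written = write_chunks([b"abc", b"", b"defg"], buf)
print(written)  # 7
```

The same function works unchanged with r.iter_content(chunk_size=8192) as the chunks argument and an open file as fileobj.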

2. Using shutil

You can also use the shutil module with Requests to simplify this code further. Just pass the raw response stream (r.raw) to the copyfileobj() function.

import requests
import shutil

def download_file(url):
    local_filename = url.split('/')[-1]
    with requests.get(url, stream=True) as r:
        with open(local_filename, 'wb') as f:
            shutil.copyfileobj(r.raw, f)

    return local_filename

Please note, this approach copies the raw byte stream, so it will not decompress gzip-encoded responses. Setting r.raw.decode_content = True before copying tells the underlying urllib3 response to decode the content as it is read.
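To see what that decoding involves, here is an in-memory sketch that simulates a gzip-encoded body with local bytes (no network involved); wrapping the file-like source in gzip.GzipFile decompresses it on the fly as copyfileobj reads:

```python
import gzip
import io
import shutil

# Simulate a gzip-encoded response body with in-memory bytes.
payload = b"hello world\n" * 100
compressed = gzip.compress(payload)

# r.raw behaves like a file object; wrapping such a stream in
# gzip.GzipFile decompresses it chunk by chunk during the copy.
source = gzip.GzipFile(fileobj=io.BytesIO(compressed))
destination = io.BytesIO()
shutil.copyfileobj(source, destination)

print(destination.getvalue() == payload)  # True
```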

3. Using urllib

If you don’t mind using another library, you can skip Requests altogether and use urllib. Just see the code below. Its urlretrieve() function downloads large files and writes them to disk without loading them fully into memory.

from urllib.request import urlretrieve

url = 'http://www.example.com/data.txt'
dst = 'data.txt'
urlretrieve(url, dst)
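urlretrieve() also accepts a reporthook callback, which it calls with (block_num, block_size, total_size) after each block, making it easy to report progress on a large download. A minimal sketch (percent_done and report_progress are names chosen here for illustration):

```python
from urllib.request import urlretrieve

def percent_done(block_num, block_size, total_size):
    """Compute download progress for a urlretrieve reporthook.
    Returns None when the server does not report a Content-Length."""
    if total_size <= 0:
        return None
    return min(100, block_num * block_size * 100 // total_size)

def report_progress(block_num, block_size, total_size):
    percent = percent_done(block_num, block_size, total_size)
    if percent is not None:
        print(f"\rDownloaded: {percent}%", end="")

# urlretrieve(url, dst, reporthook=report_progress)
```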

In this article, we have learnt how to download large files in Python. All three approaches stream the file to disk in chunks instead of holding it in memory. If you are already using Requests, iter_content() or shutil.copyfileobj() works well; otherwise, urllib's urlretrieve() is the simplest low-memory option.

