Large files are difficult to manage, store, and transfer, so you may often need to split a file in Linux into smaller pieces that are easier to use and move around. There are several utilities in Linux for this purpose. In this article, we will learn how to split a file in Linux using the split command.
How to Split File in Linux
The split command is present by default on most Linux systems. It allows you to split a file in different ways, according to your requirement.
Split Files by Number of Lines
Sometimes you may need to split a file in Linux by specifying the number of lines you want in each smaller file.
For example, let us say you have a large file data.txt that you want to split into 5 parts based on number of lines.
First, we use the wc command to work out how many lines should go into each smaller file.
echo $(( $(wc -l < data.txt) / 5 ))
In the above command, wc -l counts the total number of lines in the file, and the shell arithmetic expansion $(( ... )) divides that count by 5 to get the number of lines for each smaller file. This is integer division, so any remainder is discarded.
The general syntax for splitting a file based on number of lines is
split -l number_of_lines -d -a suffix_length large_file_name small_file_name_prefix
In the above command, we mention the number of lines in each smaller file, the large file's name, and the prefix for the smaller files. The -d option (available in GNU split) numbers the output files with digits instead of the default letters, and -a sets the length of that suffix. Note that -a does not control how many parts are created; that depends only on the number of lines per part. Here is the command to split our file data.txt into 5 parts. Instead of typing the number of lines in each part, we substitute the wc calculation we created above.
split -l $(( $(wc -l < data.txt) / 5 )) -d -a 4 data.txt data.split.txt
In this case, the split command will split data.txt into smaller files data.split.txt0000, data.split.txt0001,…
Please note, if the total number of lines is not exactly divisible by 5, the leftover lines will be written to a 6th file.
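The behaviour described above can be checked on a small test file. The following is only a sketch, assuming GNU split; the 103-line sample file and the temporary working directory are made up for illustration.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample file with 103 numbered lines.
seq 1 103 > data.txt

# Lines per part, using integer division: 103 / 5 gives 20.
lines_per_part=$(( $(wc -l < data.txt) / 5 ))

# -d gives numeric suffixes, -a 4 makes each suffix 4 digits long.
split -l "$lines_per_part" -d -a 4 data.txt data.split.txt

# Print the line count of every piece.
wc -l data.split.txt*
```

With 103 lines, this should produce five files of 20 lines each, plus a sixth file, data.split.txt0005, holding the remaining 3 lines.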
Alternatively, you can simply specify the number of lines for each smaller file directly. Here is an example to split data.txt into smaller files, each of 1000 lines.
split -l 1000 data.txt data.split.txt
In this case, the number of smaller files is not fixed in advance; it depends on the total number of lines in the file. Since no suffix options are given, the smaller files get the default two-letter suffixes: data.split.txtaa, data.split.txtab, and so on.
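To see how the number of pieces follows from the line count, here is a small sketch. The 3500-line sample file and the variable names are assumptions made for the example.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample file with 3500 lines.
seq 1 3500 > data.txt

# Split into pieces of 1000 lines each.
split -l 1000 data.txt data.split.txt

# List the pieces (default suffixes are aa, ab, ac, ...).
ls data.split.txt*

# Expected number of pieces: total lines divided by 1000, rounded up.
total=$(wc -l < data.txt)
expected=$(( (total + 999) / 1000 ))
echo "expected pieces: $expected"
```

Here the last piece, data.split.txtad, holds only the 500 lines left over after three full pieces.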
This approach is useful for line-oriented text files such as logs and data dumps. It is not suitable for binary files such as images, which are not organized into lines of text. In such cases, you need to split files based on their size, as described below.
Split Files by Size
In this case, we will split files in Linux by specifying the size of each smaller file. You can do this using the -b or --bytes option. Here is an example command to split data.txt into smaller files, each of size 1 MiB (1024 × 1024 bytes).
split --bytes=1M data.txt data.split.txt
OR
split -b 1M data.txt data.split.txt
Note that the short option takes its value after a space (-b 1M), not after an equals sign. You can specify the size of the smaller files in kibibytes, mebibytes, or gibibytes by using the suffix K, M, or G respectively. In GNU split, these are powers of 1024, while KB, MB, and GB are powers of 1000. If you don't specify any suffix, the split command will read the value as a number of bytes.
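As a quick check of size-based splitting, the sketch below splits a binary file and then joins the pieces back together with cat. The file names data.bin and restored.bin and the 2.5 MiB of random data are assumptions made for the example.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical binary sample file: 2.5 MiB (2621440 bytes) of random data.
head -c 2621440 /dev/urandom > data.bin

# Split into pieces of at most 1 MiB each.
split -b 1M data.bin data.split.

# Print the size in bytes of every piece.
wc -c data.split.*

# The shell expands the glob in sorted order, so cat joins aa, ab, ac in sequence.
cat data.split.* > restored.bin

# cmp succeeds only if both files are byte-for-byte identical.
cmp data.bin restored.bin && echo "pieces reassemble to the original"
```

This should give two pieces of 1048576 bytes and a third of 524288 bytes. Joining with cat works because the default suffixes sort in the same order in which the pieces were written.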
In this article, we have learnt how to split files in Linux using split command, in different ways. You can customize it as per your requirement and include it in your applications, or run it as a standalone command, as needed.
Please note, the split command can be used on all file formats, not just text files. You can even use it on images if you want, but in that case you should split the file based on size and not on number of lines.
Sreeram has more than 10 years of experience in web development, Python, Linux, SQL and database programming.