
The cut command in Linux is an essential utility for text processing, designed to extract particular segments from each line of a file or from piped input. This command does not modify the original file but instead reads the data and displays the desired portions in the standard output. In this guide, we will delve into the functionality of the cut command in Linux and provide practical, real-world examples to demonstrate its usage.
Exploring the cut Command
The cut command is instrumental for anyone dealing with structured text, facilitating effective data manipulation and extraction within Unix-like environments. By extracting portions of a line based on byte positions, character positions, separators, or fields, cut proves invaluable for filtering and organizing data in shell scripts and command-line operations. Its applications range from retrieving specific columns from CSV files to trimming unnecessary characters or analyzing logs. Although often employed with files directly, cut also works on the output of other commands when used in a pipeline.
Basic Syntax of the cut Command
The cut command is straightforward, utilizing options followed by a file name. The syntax is as follows:
cut [OPTIONS] [FILE]
In this structure, OPTIONS dictate how the cut command operates, allowing you to select a field separator (like a comma), choose specific fields, set ranges, and exclude lines missing the separator, among other functionalities. If a file isn’t specified, cut will read from standard input. You can also provide multiple files; cut processes them in the order given and writes the results as one continuous output.
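As a minimal sketch of both input modes, the snippet below reads once from standard input and once from two files. The files one.txt and two.txt and their contents are hypothetical and exist only for this demonstration:

```shell
# Hypothetical sample files, created in a temporary directory for this demo.
tmpdir=$(mktemp -d)
printf 'a:b:c\n' > "$tmpdir/one.txt"
printf 'd:e:f\n' > "$tmpdir/two.txt"

# No FILE given: cut reads from standard input.
from_stdin=$(printf 'x:y:z\n' | cut -d ':' -f 2)

# Several FILEs: each one is processed in turn, and the results follow one another.
from_files=$(cut -d ':' -f 1,3 "$tmpdir/one.txt" "$tmpdir/two.txt")

echo "$from_stdin"
echo "$from_files"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

The first echo should print the second field of the piped line, and the second should print fields 1 and 3 of each file, one line per input line.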
Commonly Used Options
The cut command offers a variety of options to pinpoint exact segments of text to extract. Here are some of the most frequently used:
- -f or --fields=LIST: Allows selection of specific fields based on a designated delimiter.
- -b or --bytes=LIST: Extracts specified bytes from each line.
- -c or --characters=LIST: Retrieves specific characters from each line.
- -d or --delimiter=DELIM: Sets a custom single-character delimiter instead of the default tab.
- --complement: Outputs everything except the specified fields, bytes, or characters.
- -s or --only-delimited: Skips lines lacking the delimiter; such lines are included by default.
- --output-delimiter=STRING: Sets a different delimiter for the output than the one used in the input.
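The effect of -s is easiest to see side by side. The following sketch uses made-up input in which one line contains no comma at all:

```shell
# Hypothetical input: two comma-delimited lines and one line without a comma.
input=$(printf 'id,name\nno delimiter here\n101,Anees\n')

# Default behavior: the line without a comma is printed in full.
default_out=$(printf '%s\n' "$input" | cut -d ',' -f 2)

# With -s: the line without a comma is dropped from the output.
strict_out=$(printf '%s\n' "$input" | cut -d ',' -s -f 2)

echo "$default_out"
echo "---"
echo "$strict_out"
```

This can matter when a file mixes headers, comments, or blank lines with delimited records.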
The -f, -b, and -c options utilize a LIST to define what to extract. You can specify the following:
- A single number like 2.
- Multiple numbers separated by commas (with no spaces), like 1,3,5.
- A range like 2-4 (to extract values from 2 to 4).
- N- to denote extraction from position N to the end.
- -M to signify extraction from the start up to position M.
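To make the LIST forms concrete, here is a small sketch that applies each one to the same ten-letter string with -c. The string and variable names are chosen purely for illustration:

```shell
# A ten-character sample line; position 1 is "a", position 10 is "j".
line='abcdefghij'

single=$(printf '%s\n' "$line" | cut -c 2)        # one position
several=$(printf '%s\n' "$line" | cut -c 1,3,5)   # comma-separated positions
range=$(printf '%s\n' "$line" | cut -c 2-4)       # closed range
to_end=$(printf '%s\n' "$line" | cut -c 8-)       # from position 8 to the end
from_start=$(printf '%s\n' "$line" | cut -c -3)   # from the start to position 3

printf '%s\n' "$single" "$several" "$range" "$to_end" "$from_start"
```

The same LIST syntax applies to -b and -f; only the unit being counted changes.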
Utilizing the cut Command in Linux
To illustrate how the cut command functions, let’s execute some practical examples. First, let’s create a sample file named “mte.csv” using the echo command:
echo -e "empID,empName,empDesig\n101,Anees,Author\n102,Asghar,Manager\n103,Damian,CEO" > mte.csv

Next, we can check the contents of the file using the cat command:
cat mte.csv

It’s crucial to mention that the cut command merely presents the specified output without changing the file itself.
Extracting Data By Characters
To extract characters by position, utilize the -c option with the cut command:
cut -c 1,8 mte.csv
This command extracts the first and eighth characters from each row:

To extract characters within a specified range, apply the following command:
cut -c 1-8 mte.csv
This extracts characters from positions 1 to 8 in each row:

Extracting By Byte
To extract specific bytes, utilize the -b option with the cut command:
cut -b 1-3 mte.csv
This command extracts the first three bytes from each line in the file mte.csv:
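For plain ASCII text, one character occupies one byte, so -b and -c select the same data. The sketch below checks this on a single sample record; with multibyte encodings such as UTF-8 the two can differ, and how -c handles them depends on the cut implementation in use:

```shell
# An ASCII-only sample record (one byte per character).
line='101,Anees,Author'

# Select positions 1-3, first counted as bytes, then as characters.
by_bytes=$(printf '%s\n' "$line" | cut -b 1-3)
by_chars=$(printf '%s\n' "$line" | cut -c 1-3)

echo "$by_bytes"
echo "$by_chars"
```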

Extracting By Field (Column)
To extract an entire field from a file, utilize the cut command with the -f and -d options:
cut -d',' -f2 mte.csv
In this command, -d',' designates a comma as the delimiter (it must be a single character), while -f2 indicates that cut should extract the second field from each line:
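A self-contained version of this step, assuming the sample employee data is supplied inline rather than read from mte.csv, might look like this:

```shell
# Sample employee records, supplied inline so the snippet runs on its own.
csv=$(printf 'empID,empName,empDesig\n101,Anees,Author\n102,Asghar,Manager\n103,Damian,CEO\n')

# Split each line on commas and keep only the second field.
names=$(printf '%s\n' "$csv" | cut -d ',' -f 2)

echo "$names"
```

The output should be the header empName followed by the three employee names, one per line.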

Implementing Custom Delimiters in cut
Though cut defaults to using a tab as a delimiter, if fields are separated by a different character, use -d to specify the correct one. For example, to extract the fifth word from a space-separated sentence, you can use:
echo "Hey! Geeks Welcome to Maketecheasier.com" | cut -d ' ' -f 5

Excluding Specific Fields During Extraction
You can omit certain fields while extracting text from a file by employing the --complement option with the cut command. This option specifies that cut should output all fields apart from the designated ones:
cut -d',' -f1 --complement mte.csv
This command skips the first column and returns the remainder of the content:
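The sketch below shows the same idea on inline data. It assumes GNU cut, since --complement is a GNU extension rather than part of POSIX:

```shell
# Sample employee records, supplied inline so the snippet runs on its own.
csv=$(printf 'empID,empName,empDesig\n101,Anees,Author\n102,Asghar,Manager\n')

# Select field 1, then invert the selection so everything except field 1 is printed.
# Assumes GNU cut: --complement is not a POSIX option.
without_id=$(printf '%s\n' "$csv" | cut -d ',' -f 1 --complement)

echo "$without_id"
```

For this input, an equivalent without the extension would be -f 2-, but --complement is handier when the excluded fields sit in the middle of the line.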

Modifying the Default Output Delimiter
By default, when extracting fields, the cut command retains the input delimiter in the output. However, you can alter the output delimiter by using the --output-delimiter option:
cut -d',' -f1-3 --output-delimiter='-' mte.csv
This command utilizes a hyphen as a separator in the output:
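As a runnable sketch on inline data (again assuming GNU cut, because --output-delimiter is a GNU extension):

```shell
# Sample employee records, supplied inline so the snippet runs on its own.
csv=$(printf 'empID,empName,empDesig\n101,Anees,Author\n')

# Read comma-separated fields, but join the selected fields with a hyphen on output.
# Assumes GNU cut: --output-delimiter is not a POSIX option.
hyphenated=$(printf '%s\n' "$csv" | cut -d ',' -f 1-3 --output-delimiter='-')

echo "$hyphenated"
```

Unlike -d, the output delimiter may be a string of more than one character.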

Combining cut with Other Linux Commands
The cut command can also be utilized in conjunction with other Linux commands using the pipe symbol (|). For instance, the following command extracts the first five characters from each output line of the who command:
who | cut -c 1-5

In another example, you can use the cut command along with head to display the first two lines of “mte.csv,” extracting only the empName and empDesig fields:
head -n 2 mte.csv | cut -d ',' -f2,3
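The pipeline can be tried without the file by feeding the same sample records through printf, as in this small sketch:

```shell
# Sample employee records, supplied inline so the snippet runs on its own.
csv=$(printf 'empID,empName,empDesig\n101,Anees,Author\n102,Asghar,Manager\n103,Damian,CEO\n')

# head keeps the first two lines; cut then keeps fields 2 and 3 of each.
top_two=$(printf '%s\n' "$csv" | head -n 2 | cut -d ',' -f 2,3)

echo "$top_two"
```

Because the first line is the header, the result contains the column names plus one data row.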

Navigating Irregular Data Formats with the Linux cut Command
The cut command excels when handling data that is well-formatted with consistent delimiters (like commas or tabs). However, if you encounter files with inconsistent spacing or mixed delimiters, applying cut alone may yield unsatisfactory results. To address these scenarios, it’s often beneficial to clean the data in advance using commands like tr or sed, ensuring cut can effectively extract the correct portions.
Managing Excess Spaces
Consider a file named “mteData.txt” where fields are separated by varying spaces:
cat mteData.txt

Since cut treats every single space as a separate delimiter, utilize tr to normalize the spacing before applying cut:
cat mteData.txt | tr -s ' ' | cut -d ' ' -f1-2
This command processes “mteData.txt,” squeezes each run of spaces into a single space using tr -s, and then extracts the first two fields:
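The difference is visible in the following sketch. The records are invented stand-ins for the contents of mteData.txt, which are not reproduced here:

```shell
# Hypothetical records with uneven runs of spaces between fields.
data=$(printf 'Anees   Author    Pakistan\nDamian  CEO   France\n')

# Without normalization, each extra space yields an empty field,
# so fields 1-2 are the first word plus an empty second field.
naive=$(printf '%s\n' "$data" | cut -d ' ' -f 1-2)

# tr -s squeezes each run of spaces to one before cut splits the line.
clean=$(printf '%s\n' "$data" | tr -s ' ' | cut -d ' ' -f 1-2)

echo "$naive"
echo "---"
echo "$clean"
```

Note that tr -s does not remove leading spaces; a line that begins with spaces would still start with an empty first field.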

Managing Mixed Delimiters
In cases where a file uses a combination of spaces and commas, normalize the format with sed. For example, a file named “mteData1.txt” contains:
cat mteData1.txt

Utilize sed with cut to convert all spaces to commas and then extract the first and third fields:
sed 's/ /,/g' mteData1.txt | cut -d ',' -f1,3
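Here is a self-contained sketch of the same approach. The two records are hypothetical stand-ins for mteData1.txt, each mixing a space and a comma as separators:

```shell
# Hypothetical records that mix spaces and commas as field separators.
mixed=$(printf '101 Anees,Author\n102,Asghar Manager\n')

# Replace every space with a comma, then split on commas and keep fields 1 and 3.
picked=$(printf '%s\n' "$mixed" | sed 's/ /,/g' | cut -d ',' -f 1,3)

echo "$picked"
```

This simple substitution assumes that spaces never appear inside a field value; data with embedded spaces would need a more careful rule.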

Conclusion
Throughout this article, we’ve uncovered the functionalities of the Linux cut command, a vital tool for extracting data from files or piped inputs. With its simple syntax, you can effortlessly obtain characters, bytes, or fields based on a specified delimiter. Additionally, we showcased how to combine the cut command with other utilities such as tr, sed, and head to manage unclean data and produce cleaner output. Whether you’re handling CSV files, analyzing logs, or cleansing data, the cut command is an indispensable asset for text processing in Unix-like environments.
Frequently Asked Questions
1. What is the primary purpose of the cut command in Linux?
The cut command in Linux is primarily used for extracting specific sections of text from files or the output of other commands. It enables users to manipulate and format data effectively based on delimiters, byte positions, or character positions.
2. Can I combine the cut command with other Linux commands?
Yes! The cut command can be seamlessly integrated with other Linux commands using the pipe symbol (|). This allows for powerful data processing, enabling you to filter and format outputs from various commands.
3. How can I specify a custom delimiter when using the cut command?
You can specify a custom delimiter by using the -d option followed by the desired delimiter character. For example, to use a comma as a delimiter, you would use -d','.