On Unix-like operating systems, the diff command analyzes two files and prints the lines that are different. In essence, it outputs a set of instructions for how to change one file to make it identical to the second file.

This page covers the GNU/Linux version of diff.

Description

The diff software does not actually change the files it compares. However, it can optionally generate a script (if the -e option is specified) for the program ed or ex which can apply the changes.

  • Description
  • Viewing diff Output In Context.
  • Unified mode
  • Finding differences in directory contents.
  • Using diff to create an editing script.
  • Commonly-used diff options.
  • Syntax
  • Options
  • Examples
  • Related commands
  • Linux commands help

For example, consider two files, file1.txt and file2.txt.

If file1.txt contains the following four lines of text:

I need to buy apples. I need to run the laundry. I need to wash the dog. I need to get the car detailed.

…and file2.txt contains these four lines:

I need to buy apples. I need to do the laundry. I need to wash the car. I need to get the dog detailed.

…then we can use diff to automatically display for us which lines differ between the two files with this command:

diff file1.txt file2.txt

…and the output is:

2,4c2,4 < I need to run the laundry. < I need to wash the dog. < I need to get the car detailed.

I need to do the laundry. I need to wash the car. I need to get the dog detailed.

Let’s take a look at what this output means. The important thing to remember is that when diff is describing these differences to you, it’s doing so in a prescriptive context: it’s telling you how to change the first file to make it match the second file.

The first line of the diff output contains:

  • line numbers corresponding to the first file,
  • a letter (a for add, c for change, or d for delete), and
  • line numbers corresponding to the second file.

In our output above, “2,4c2,4” means: “Lines 2 through 4 in the first file need to be changed to match lines 2 through 4 in the second file.” It then tells us what those lines are in each file:

  • Lines preceded by a < are lines from the first file;
  • lines preceded by > are lines from the second file.

The three dashes ("—") merely separate the lines of file 1 and file 2.

Let’s look at another example. Let’s say our two files look like this:

file1.txt:

I need to go to the store. I need to buy some apples. When I get home, I’ll wash the dog.

file2.txt:

I need to go to the store. I need to buy some apples. Oh yeah, I also need to buy grated cheese. When I get home, I’ll wash the dog.

Output:

2a3

Oh yeah, I also need to buy grated cheese.

Here, the output is telling us “After line 2 in the first file, a line needs to be added: line 3 from the second file.” It then shows us what that line is.

Now let’s see what it looks like when diff tells us we need to delete a line.

file1:

I need to go to the store. I need to buy some apples. When I get home, I’ll wash the dog. I promise.

file2:

Our command:

The output:

4d3 < I promise.

Here, the output is telling us “You need to delete line 4 in the first file so that both files sync up at line 3.” It then shows us the contents of the line that needs to be deleted.

Viewing diff Output In Context

The examples above show the default output of diff. It’s intended to be read by a computer, not a human, so for human purposes, sometimes it helps to see the context of the changes.

GNU diff, which is the version most linux users are using, offers two different ways to do this: “context mode” and “unified mode”.

To view differences in context mode, use the -c option. For instance, let’s say file1.txt and file2.txt contain the following:

apples oranges kiwis carrots

apples kiwis carrots grapefruits

Let’s look at the contextual output for the diff of these two files. Our command is:

diff -c file1.txt file2.txt

And our output looks like this:

*** file1.txt 2014-08-21 17:58:29.764656635 -0400 — file2.txt 2014-08-21 17:58:50.768989841 -0400


*** 1,4 **** apples

  • oranges kiwis carrots — 1,4 —- apples kiwis carrots
  • grapefruits

The first two lines of this output show us information about our “from” file (file 1) and our “to” file (file 2). It lists the file name, modification date, and modification time of each of our files, one per line. The “from” file is indicated by “***”, and the “to” file is indicated by “—”.

The line “***************” is a separator.

The next line has three asterisks ("") followed by a line range from the first file (in this case lines 1 through 4, separated by a comma). Then four asterisks ("*").

Then it shows us the contents of those lines. If the line is unchanged, it’s prefixed by two spaces. If the line is changed, however, it’s prefixed by an indicative character and a space. The character meanings are as follows:

After the lines from the first file, there are three dashes ("—"), then a line range, then four dashes ("—-"). This indicates the line range in the second file that syncs up with our changes in the first file.

If there is more than one section that needs to change, diff shows these sections one after the other. Lines from the first file are still indicated with “***”, and lines from the second file with “—”.

Unified mode

Unified mode (the -u option) is similar to context mode, but it doesn’t display any redundant information. Here’s an example, using the same input files as our last example:

diff -u file1.txt file2.txt

— file1.txt 2014-08-21 17:58:29.764656635 -0400 +++ file2.txt 2014-08-21 17:58:50.768989841 -0400 @@ -1,4 +1,4 @@ apples -oranges kiwis carrots +grapefruits

The output is similar to above, but as you can see, the differences are “unified” into one set.

Finding differences in directory contents

diff can also compare directories by providing directory names instead of file names. See the Examples section.

Using diff to create an editing script

The -e option tells diff to output a script, which can be used by the editing programs ed or ex, containing a sequence of commands. The commands are a combination of c (change), a (add), and d (delete) which, when executed by the editor, modify the contents of file1 (the first file specified on the diff command line) so that it matches the contents of file2 (the second file specified).

Let’s say we have two files with the following contents:

Once upon a time, there was a girl named Persephone. She had black hair. She loved her mother more than anything. She liked to sit outside in the sunshine with her cat, Daisy. She dreamed of being a painter when she grew up.

file2.txt

Once upon a time, there was a girl named Persephone. She had red hair. She loved chocolate chip cookies more than anything. She liked to sit outside in the sunshine with her cat, Daisy. She would look up into the clouds and dream of being a world-famous baker.

We can run the following command to analyze the two files with diff and produce a script to create a file identical to file2.txt from the contents of file1.txt:

diff -e file1.txt file2.txt

…and the output looks like this:

5c She would look up into the clouds and dream of being a world-famous baker. . 2,3c She had red hair. She loved chocolate chip cookies more than anything. .

Notice that the changes are listed in reverse order: the changes closer to the end of the file are listed first, and changes closer to the beginning of the file are listed last. This order is to preserve line numbering; if we made the changes at the beginning of the file first, that might change the line numbers later in the file. So the script starts at the end, and works backwards.

Here, the script is telling the editing program: “change line 5 to (the following line), and change lines 2 through 3 to (the following two lines).”

Next, we should save the script to a file. We can redirect the diff output to a file using the > operator, like this:

diff -e file1.txt file2.txt > my-ed-script.txt

This command does not display anything on the screen (unless there is an error); instead, the output is redirected to the file my-ed-script.txt. If my-ed-script.txt doesn’t exist, it is created; if it exists already, it is overwritten.

If we now check the contents of my-ed-script.txt with the cat command…

cat my-ed-script.txt

…we see the same script we saw displayed above.

There’s still one thing missing, though: we need the script to tell ed to actually write the file. All that’s missing from the script is the w command, which writes the changes. We can add this to our script by echoing the letter “w” and using the » operator to add it to our file. (The » operator is similar to the > operator. It redirects output to a file, but instead of overwriting the destination file, it appends to the end of the file.) The command looks like this:

echo “w” » my-ed-script.txt

Now, we can check to see that our script has changed by running the cat command again:

5c She would look up into the clouds and dream of being a world-famous baker. . 2,3c She had red hair. She loved chocolate chip cookies more than anything. . w

Now our script, when issued to ed, makes the changes and write the changes to disk.

So how do we get ed to do this?

We can issue this script to ed with the following command, telling it to overwrite our original file. The dash ("-") tells ed to read from the standard input, and the < operator directs our script to that input. In essence, the system enters whatever is in our script as input to the editing program. The command looks like this:

ed - file1.txt < my-ed-script.txt

This command displays nothing, but if we look at the contents of our original file…

cat file1.txt

…we can see that file1.txt now matches file2.txt exactly.

Commonly-used diff options

Here are some useful diff options to take note of:

In this example, ed overwrote the contents of our original file, file1.txt. After running the script, the original text of file1.txt disappears, so make sure you understand what you’re doing before running these commands!

These are only some of the most commonly-used diff options. What follows is a complete list of diff options and their function.

Syntax

diff [OPTION]… FILES

Options

LETTERs are as follows for new group, lowercase for old group:

LFMT (only) may contain:

Both GFMT and LFMT may contain:

FILES takes the form “FILE1 FILE2” or “DIR1 DIR2” or “DIR FILE…” or “FILE… DIR”.

If the –from-file or –to-file options are given, there are no restrictions on FILE(s). If a FILE is a dash ("-"), diff reads from standard input.

Exit status is either 0 if inputs are the same, 1 if different, or 2 if diff encounters any trouble.

Examples

Here’s an example of using diff to examine the differences between two files side by side using the -y option, given the following input files:

diff -y file1.txt file2.txt

apples apples oranges < kiwis kiwis carrots carrots > grapefruits

And as promised, here is an example of using diff to compare two directories:

diff dir1 dir2

Only in dir1: tab2.gif Only in dir1: tab3.gif Only in dir1: tab4.gif Only in dir1: tape.htm Only in dir1: tbernoul.htm Only in dir1: tconner.htm Only in dir1: tempbus.psd

bdiff — Identify the differences between two very big files.cmp — Compare two files byte by byte.comm — Compare two sorted files line by line.dircmp — Compare the contents of two directories, listing unique files.ed — A simple text editor.pr — Format a text file for printing.ls — List the contents of a directory or directories.sdiff — Compare two files, side-by-side.