I have file1.txt

this is the original text  
line2  
line3  
line4  
happy hacking !  

and file2.txt

this is the original text  
line2  
line4  
happy hacking !  
GNU is not UNIX  

If I run diff file1.txt file2.txt, I get:

3d2  
< line3  
5a5  
> GNU is not UNIX  

How is the output generally interpreted? I think that < means removed but what do 3d2 or 5a5 mean?

If I do:

$ diff -u file1.txt file2.txt  
--- file1.txt        2013-07-06 17:44:59.180000000 +0200  
+++ file2.txt        2013-07-06 17:39:53.433000000 +0200  
@@ -1,5 +1,5 @@  
 this is the original text  
 line2  
-line3  
 line4  
 happy hacking !  
+GNU is not UNIX  

The results are clearer but what does @@ -1,5 +1,5 @@ mean?

Jim
  • 10,120

5 Answers

In your first diff output (the so-called "normal diff" format), the meaning is as follows:

< - denotes lines in file1.txt

> - denotes lines in file2.txt

3d2 and 5a5 give the affected line numbers and the action performed: d stands for delete, a for add, and c for change. The number to the left of the letter is a line number in file1.txt, and the number to the right is a line number in file2.txt; both refer to the original, unmodified files. So 3d2 tells you that line 3 of file1.txt was deleted, and that it would have appeared after line 2 of file2.txt had it not been deleted (put loosely, after deletion the line counter went back to line number 2). 5a5 tells you that a line was added after line 5 of file1.txt ("happy hacking !"), and that the added line is line 5 of file2.txt.
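To make the reading rule concrete, here is a minimal Python sketch that splits a change command into its three parts. The name parse_change_command and the tuple layout are illustrative choices made for this example, not anything diff itself provides.

```python
import re

# A change command is: line or range in file 1, one of a/c/d, line or range in file 2.
CHANGE_RE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def parse_change_command(command):
    """Return ((first_start, first_end), action, (second_start, second_end))."""
    match = CHANGE_RE.match(command)
    if match is None:
        raise ValueError(f"not a change command: {command!r}")
    f_start, f_end, action, s_start, s_end = match.groups()
    # A single number such as "3" is treated as the one-line range 3..3.
    first = (int(f_start), int(f_end or f_start))
    second = (int(s_start), int(s_end or s_start))
    return first, action, second


for cmd in ("3d2", "5a5", "5,7c8,10"):
    print(cmd, "->", parse_change_command(cmd))
```

For 3d2 this yields line 3 of the first file, the action d, and line 2 of the second file, which is the reading given above.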

The output of the diff -u command is formatted a bit differently (the so-called "unified diff" format). Here diff shows a single merged piece of text instead of two separate ones. In the line @@ -1,5 +1,5 @@, the part -1,5 relates to file1.txt and the part +1,5 to file2.txt. They tell us that the piece shown covers 5 lines of file1.txt starting at line 1, and likewise 5 lines of file2.txt starting at line 1.

As I have already said, the lines from both files are shown together:

 this is the original text  
 line2  
-line3  
 line4  
 happy hacking !  
+GNU is not UNIX

Here - denotes the lines which were deleted from file1.txt, and + denotes the lines which were added.
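As a cross-check of the @@ line, the following Python sketch parses the header and counts the body lines that belong to each file. It is only an illustration: parse_hunk_header and count_hunk_lines are hypothetical helper names, and the hunk body is copied from the question with trailing spaces dropped.

```python
import re

# "@@ -start,length +start,length @@"; a length of 1 may be omitted.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(header):
    """Return (old_start, old_length, new_start, new_length)."""
    match = HUNK_RE.match(header)
    if match is None:
        raise ValueError(f"not a hunk header: {header!r}")
    old_start, old_len, new_start, new_len = match.groups()
    return (
        int(old_start),
        1 if old_len is None else int(old_len),
        int(new_start),
        1 if new_len is None else int(new_len),
    )


def count_hunk_lines(body):
    """Count how many lines of the hunk body belong to each file."""
    # Lines starting with a space are in both files, "-" only in the old file,
    # "+" only in the new file.
    old_count = sum(1 for line in body if line.startswith((" ", "-")))
    new_count = sum(1 for line in body if line.startswith((" ", "+")))
    return old_count, new_count


hunk_body = [
    " this is the original text",
    " line2",
    "-line3",
    " line4",
    " happy hacking !",
    "+GNU is not UNIX",
]

old_start, old_len, new_start, new_len = parse_hunk_header("@@ -1,5 +1,5 @@")
print(count_hunk_lines(hunk_body) == (old_len, new_len))  # prints True
```

Four unchanged lines plus one removed line give the 5 in -1,5, and four unchanged lines plus one added line give the 5 in +1,5.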

Stephen Kitt
  • 434,908
John Smith
  • 2,779
  • For completeness in the last paragraph: a space ' ' at the start of a line shows it is unchanged. – Matthew Wilcoxson Mar 16 '22 at 18:28
  • While it's easy enough to find online diff apps, are there any online diff-file app that provides visual cues to diff files? – Konchog May 18 '22 at 09:07
  • @Konchog not sure about online services, but out of desktop apps I like meld most. – John Smith May 19 '22 at 07:32
  • @JohnSmith, yeah I was looking for a patch file viewer, not a diff app. – Konchog May 19 '22 at 09:06
  • @JohnSmith Can you elaborate on "after deletion the line counter went back to line number 2" for 3d2? – ado sar Oct 04 '22 at 20:00
  • @adosar before deleting the line you're on the line 3 (the line counter is 3), after deleting the line it gets back you to the line 2 (the counter is 2). This is how I understand it – E. Shcherbo Oct 04 '22 at 21:56

Summary:

Given diff file1 file2, < marks a line that is in file1 but missing from file2, and > marks a line that is in file2 but missing from file1. The 3d2 and 5a5 can usually be ignored when reading the output; they are edit commands for patch, which is often used with diff.

Full Answer:

Many *nix utilities offer Texinfo manuals as well as the simpler man pages. You can access these by running the info command, for example info diff. In this case, the section you are interested in is:

2.4.2 Detailed Description of Normal Format

The normal output format consists of one or more hunks of differences; each hunk shows one area where the files differ. Normal format hunks look like this:

 CHANGE-COMMAND
 < FROM-FILE-LINE
 < FROM-FILE-LINE...
 ---
 > TO-FILE-LINE
 > TO-FILE-LINE...

There are three types of change commands. Each consists of a line number or comma-separated range of lines in the first file, a single character indicating the kind of change to make, and a line number or comma-separated range of lines in the second file. All line numbers are the original line numbers in each file. The types of change commands are:

`LaR'
     Add the lines in range R of the second file after line L of the
     first file.  For example, `8a12,15' means append lines 12-15 of
     file 2 after line 8 of file 1; or, if changing file 2 into file 1,
     delete lines 12-15 of file 2.

`FcT'
     Replace the lines in range F of the first file with lines in range
     T of the second file.  This is like a combined add and delete, but
     more compact.  For example, `5,7c8,10' means change lines 5-7 of
     file 1 to read as lines 8-10 of file 2; or, if changing file 2 into
     file 1, change lines 8-10 of file 2 to read as lines 5-7 of file 1.

`RdL'
     Delete the lines in range R from the first file; line L is where
     they would have appeared in the second file had they not been
     deleted.  For example, `5,7d3' means delete lines 5-7 of file 1;
     or, if changing file 2 into file 1, append lines 5-7 of file 1
     after line 3 of file 2.
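Each description in the manual excerpt also says what the command means when changing file 2 into file 1. That reversal can be sketched in a few lines of Python. The helper name reverse_change_command is made up for this example, and it only rewrites a single command the way the manual describes; it does not claim that diff with its arguments swapped will always print exactly these commands.

```python
import re

# Left side, action letter, right side; each side is a line or "start,end".
CHANGE_RE = re.compile(r"^(\d+(?:,\d+)?)([acd])(\d+(?:,\d+)?)$")

# Reading a command in the other direction: add <-> delete, change stays change.
REVERSED_ACTION = {"a": "d", "d": "a", "c": "c"}


def reverse_change_command(command):
    """Rewrite a file1-to-file2 change command as a file2-to-file1 command."""
    match = CHANGE_RE.match(command)
    if match is None:
        raise ValueError(f"not a change command: {command!r}")
    left, action, right = match.groups()
    return f"{right}{REVERSED_ACTION[action]}{left}"


for cmd in ("8a12,15", "5,7c8,10", "5,7d3"):
    print(cmd, "->", reverse_change_command(cmd))
```

The three manual examples come out as 12,15d8, 8,10c5,7 and 3a5,7, matching the "if changing file 2 into file 1" wording.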

terdon
  • 242,166
  • To make sure I understand this: if you were to swap file1 and file2, would all XaY become YdX, XcY become YcX, and XdY become YaX? – BallpointBen Oct 28 '20 at 04:23
  • @BallpointBen more likely the differences listed will change in order. It's much easier to understand if you just create a couple of files to play with. – terdon Oct 28 '20 at 12:20

1) Rename the file parameters to help you remember what's going on like this:

Rather than:

diff f1 f2    # f1=file 1, and f2=file2

think

diff file-to-edit file-with-updates

The output describes edits to make to file-to-edit, using lines taken from file-with-updates.


2) Also these command renames might help you think about what is happening:

d stands for delete, but 'remove' is more clearly what happens.
a stands for add, but 'insert' is more clearly what happens.

c stands for change = d + a, or 'remove + insert'.


Used like this:

2,4d1 or in general D(s)-d-N = delete ('remove') line(s) D from file-to-edit. N is the line of file-with-updates after which they would have appeared.

4a2,4 or in general N-a-U(s) = after line N of file-to-edit, add ('insert') update line(s) U.

Note: Parameters for these two are nearly symmetric; just reversed left to right.


2,4c5,6 or in general R(s)-c-U(s) = Remove R(s) lines, then insert updated lines U(s) in their place.


For example:

4a2,4 means after line 4, add (insert) updated lines 2-4 (i.e. "2,4" means lines 2, 3 and 4)

2,4d1 means remove lines 2-4 (2, 3 and 4).

2,4c5,6 means remove lines 2-4 (2, 3 and 4), and insert updated lines 5-6 (5 and 6).
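The remove/insert reading can be tried out with a small Python sketch that applies a normal-format diff to a list of lines. This is a simplified stand-in for what patch does, not its actual implementation: apply_normal_diff is a hypothetical helper, and it assumes well-formed input and ignores markers such as "\ No newline at end of file".

```python
import re

CHANGE_RE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def apply_normal_diff(old_lines, diff_lines):
    """Apply normal-format diff output to old_lines and return the new lines."""
    result = []
    pos = 0  # how many lines of old_lines have been copied or removed so far
    i = 0
    while i < len(diff_lines):
        match = CHANGE_RE.match(diff_lines[i])
        if match is None:
            raise ValueError(f"expected a change command: {diff_lines[i]!r}")
        start = int(match.group(1))
        end = int(match.group(2) or start)
        action = match.group(3)
        i += 1
        if action == "a":
            # Insert: keep everything up to and including line `start`.
            result.extend(old_lines[pos:start])
            pos = start
        else:
            # Remove (d) or remove + insert (c): keep lines before `start`,
            # then skip the old lines start..end.
            result.extend(old_lines[pos:start - 1])
            pos = end
        # Read the hunk body: "> " lines are inserted, "< " and "---" are skipped.
        while i < len(diff_lines) and not CHANGE_RE.match(diff_lines[i]):
            if diff_lines[i].startswith("> "):
                result.append(diff_lines[i][2:])
            i += 1
    result.extend(old_lines[pos:])  # copy whatever follows the last hunk
    return result


file1_lines = ["this is the original text", "line2", "line3", "line4", "happy hacking !"]
normal_diff = ["3d2", "< line3", "5a5", "> GNU is not UNIX"]
print(apply_normal_diff(file1_lines, normal_diff))
```

Run on the files from the question, the result is the content of file2.txt: line3 is removed and "GNU is not UNIX" is inserted after the last line.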

Elliptical view
  • 3,921

The above answers are good. However, as a beginner I found them slightly difficult to understand, and upon searching further I found a very useful link: Linux Diff Command & Examples

The site explains the concept in a simple and easy to understand manner.

The diff command is easier to understand if you consider it this way:

Essentially, it outputs a set of instructions for how to change one file to make it identical to the second file.

Each of the following cases is explained well:

a for add, c for change, d for delete
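The "set of instructions" view can be reproduced with Python's standard difflib module, which reports delete, insert and replace operations between two sequences. The sketch below turns those into a/c/d commands. It is an illustration only: difflib uses its own matching algorithm, so on other inputs its result may differ from what diff prints, and change_commands and line_range are names invented for this example.

```python
import difflib


def line_range(start, end):
    """Format the 1-based inclusive lines start..end as diff does: "3" or "3,5"."""
    return str(start) if start == end else f"{start},{end}"


def change_commands(old_lines, new_lines):
    """List normal-diff style change commands that turn old_lines into new_lines."""
    commands = []
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    # Each opcode covers old_lines[i1:i2] and new_lines[j1:j2] (0-based, end exclusive).
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            commands.append(f"{line_range(i1 + 1, i2)}d{j1}")
        elif tag == "insert":
            commands.append(f"{i1}a{line_range(j1 + 1, j2)}")
        elif tag == "replace":
            commands.append(f"{line_range(i1 + 1, i2)}c{line_range(j1 + 1, j2)}")
    return commands


file1_lines = ["this is the original text", "line2", "line3", "line4", "happy hacking !"]
file2_lines = ["this is the original text", "line2", "line4", "happy hacking !", "GNU is not UNIX"]
print(change_commands(file1_lines, file2_lines))  # prints ['3d2', '5a5']
```

For the two files in the question this gives the same two commands that diff printed.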

userAsh
  • 99

I suggest using:

diff -rupP file1.txt file2.txt > result.patch

Then, when you read result.patch, you will instantly know the difference.

These are the meanings of the command line switches:

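-r: compare directories recursively, including subdirectories

-u: use the unified output format, with 3 lines of context by default

-p (small): show which C function each change is in

-P (capital): when comparing directories, treat a file that is absent from the first directory as empty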

runlevel0
  • 1,609
Ravi
  • 79