-1

I have two files

$ cat file1
c c1
b b1
$ cat file2
a a2
c c2
b b2

What happens if I specify nonexisting join field(s) for one or both of the input files?

Why do the following outer joins generate the outputs? Does it agree with outer joins in relational algebra? (SQL doesn't allow to specify nonexisting join fields)

Thanks.

$ join -a 1 -a 2 -o auto -t " " -e NULL  -1 1 -2 5 <(sort file1) <(sort file2)
NULL NULL a a2
NULL NULL b b2
NULL NULL c c2
b b1 NULL NULL
c c1 NULL NULL

$ join -a 1 -a 2 -o auto -t " " -e NULL  -1 5 -2 1 <(sort file1) <(sort file2)
NULL b b1 NULL
NULL c c1 NULL
a NULL NULL a2
b NULL NULL b2
c NULL NULL c2

$ join -a 1 -a 2 -o auto -t " " -e NULL  -1 5 -2 5 <(sort file1) <(sort file2)
NULL b b1 a a2
NULL b b1 b b2
NULL b b1 c c2
NULL c c1 a a2
NULL c c1 b b2
NULL c c1 c c2

Originated from What does `join -e` mean?

Tim
  • 101,790

1 Answers1

-1

NULL and NULL are equal, so the outer join in the third example is the same as inner join.

NULL and a not-NULL value are not equal, so the outer joins in the first two examples don't contain any records from inner joins, because inner joins are empty.

Tim
  • 101,790