2

'File A' has a list of universities ranked in a given year, say 2018, and 'File B' has the list of universities ranked in 2017.

File A (2018 Rankings):

University of Oxford
University of Cambridge
California Institute of Technology
Stanford University
Massachusetts Institute of Technology

File B (2017 Rankings):

University of Oxford
California Institute of Technology
Stanford University
University of Cambridge
Massachusetts Institute of Technology

'Stanford University' is ranked 4th in 2018, whereas it was 3rd in 2017. I want one output file containing only the universities from the 2017 rankings that moved above Stanford in the 2018 rankings and, similarly, a second file listing the universities from the 2017 rankings that moved below Stanford in the 2018 rankings.

The expected output is two files containing:

**Ranked above Stanford:**
University of Cambridge

**Ranked below Stanford:**
NONE

NONE, because no university that was ranked above Stanford in the 2017 rankings dropped below Stanford in the 2018 rankings.

I want to be able to do this for any university mentioned in the list.

The data shown here are snippets from huge data files that contain lists of 1000+ ranked universities. I want to run this analysis for only a few universities.

CCC
  • 79
  • Desired way is to print all the things ranked above 'd' in one file and ranked below in other file. – CCC Sep 08 '17 at 10:29
  • 'File A' has a list of universities ranked in, say, 2011, and 'File B' has a list of universities ranked in 2013. I want to know how many universities, and which ones, have moved above/below 'University X' in 2013. Sorry, I am not able to paste the data here. – CCC Sep 08 '17 at 10:36
  • 1
    Which d of file B should be used? And why line by line -- there seems to be only one line? And put it in the question. Honestly. – Philippos Sep 08 '17 at 10:39
  • I did put in question now, there are files with names of people ranked too, hence I posted in an awful way. Two d's are a mistake. I edited too but it hasn't updated! – CCC Sep 08 '17 at 10:43
  • 1
    You could most certainly paste some sample data here; change university names to Looney Tune characters or something else unique; make sure it maps to the output you expect. – Jeff Schaller Sep 08 '17 at 11:08
  • 1
    provide sample data and expected output. I would suggest you to use diff command for comparing two files. – Sagar Sep 08 '17 at 11:27
  • 1
    If you throw out the current examples and replace them with samples that a) have fewer lines per file and b) use shorter, easier to read dummy entries while still illustrating your intent plus you add output that shows exactly what you expect .... you are sure to get the answer you seek. – B Layer Sep 08 '17 at 12:38
  • 1
    Answer given by @AFSHIN solves my problem perfectly. – CCC Sep 08 '17 at 19:22

2 Answers

2

Now that the question has been clarified, here is the final solution:

awk -F'\n' -v RS='Stanford University' '
     # Records 1 and 2 (first file, 2018): names above and below Stanford
     NR==1 && NR==FNR{for (i=1;i<NF;i++)above[$i]++;next}
     NR==2&&NR==FNR{for (j=2;j<NF;j++)below[$j]++;next}
     # Records 3 and 4 (second file, 2017): names above and below Stanford
     NR==3{for (x=1;x<NF;x++)X2017[$x]++;next}
     NR==4{for (y=2;y<NF;y++)Y2017[$y]++;next}
END{ for (Z in Y2017) {if (Z in above) print Z>"Ranked-above.txt" };
     for (T in X2017) {if (T in below) print T>"Ranked-below.txt" };
}' 2018  2017

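The output is written to two files, Ranked-above.txt and Ranked-below.txt. awk only creates a file when something is printed to it, so for this sample data only Ranked-above.txt is created.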

**Ranked-above.txt**
University of Cambridge

**Ranked-below.txt**

To run this for another university, set RS='University NAME HERE'.
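
As a hedged alternative sketch (not part of the original answer): a multi-character RS is treated as a regular expression by GNU awk and mawk, but POSIX leaves that behavior unspecified, and a university name containing regex metacharacters would need escaping. The version below instead reads both files line by line and compares rank positions using an exact string match. The variable name `target` and the array names are my own assumptions, the sample files are recreated only so that the block is self-contained, and the output files are emptied first so that both always exist.

```shell
# Recreate the sample data from the question (file names as in the command above).
printf '%s\n' 'University of Oxford' 'University of Cambridge' \
    'California Institute of Technology' 'Stanford University' \
    'Massachusetts Institute of Technology' > 2018
printf '%s\n' 'University of Oxford' 'California Institute of Technology' \
    'Stanford University' 'University of Cambridge' \
    'Massachusetts Institute of Technology' > 2017

# Start with empty output files so both exist even when nothing moved.
: > Ranked-above.txt
: > Ranked-below.txt

awk -v target='Stanford University' '
    NR == FNR { rank_new[$0] = FNR; next }   # first file (2018): name -> rank
              { rank_old[$0] = FNR }         # second file (2017): name -> rank
    END {
        if (!(target in rank_new) || !(target in rank_old)) {
            print target " is missing from one of the files" > "/dev/stderr"
            exit 1
        }
        for (u in rank_old) {
            if (u == target || !(u in rank_new)) continue
            # below the target in 2017, above it in 2018
            if (rank_old[u] > rank_old[target] && rank_new[u] < rank_new[target])
                print u > "Ranked-above.txt"
            # above the target in 2017, below it in 2018
            if (rank_old[u] < rank_old[target] && rank_new[u] > rank_new[target])
                print u > "Ranked-below.txt"
        }
    }' 2018 2017
```

Changing `target` selects another university. The `for (u in ...)` loop visits names in an unspecified order, so pipe the output files through `sort` if the order matters.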

αғsнιη
  • 41,407
  • Thank you for the response, but this just separates the lines above/below 'd' in each file. What I want is, for a given rank of 'University X' in file A, the list of universities that have gone up/down in rank in 'file B' when compared with the rank of 'University X' in 'file B'. – CCC Sep 08 '17 at 11:20
  • 1
    WOW! Perfect,Thank you :) How to learn shell/awk/grep to solve things like these on my own? -_-

    PS:-I tried to upvote the answer, but I don't have required 15 credits yet :(

    – CCC Sep 08 '17 at 19:18
  • 1
    @CCC First, accept that you will never find a complete, ready-made solution to your exact problem anywhere, so you should divide your problem into small questions, find answers to each of them, and finally combine those answers to solve your problem : ) Where should you look to learn scripting? Here or on the other great stackoverflow.com sites. PS. You now have the reputation privileges : ) – αғsнιη Sep 09 '17 at 03:01
  • Thank you, I have done the needful. I mean even to learn basic scripting/awk in a structured focused way. Internet is useful, but sometimes one can easily get lost and scattered all around. – CCC Sep 09 '17 at 11:45
2

This is not an answer to your question in the sense that it doesn't produce the output that you required. It does however produce a table of changes in ranks between the lines in the two files.

The following awk program will output the change in ranking between the two files like this:

$ awk -f script.awk rankings-2017.txt rankings-2018.txt
        University of Oxford
 +2     University of Cambridge
 -1     California Institute of Technology
 -1     Stanford University
NEW     Uppsala University
 -1     Massachusetts Institute of Technology

("Uppsala University" was added on the second to last row of the second file).

The script:

NR == FNR       { rank[++n] = $0 }
NR != FNR       { ++nn;
    for (i = 1; i <= n; ++i) {
        if (rank[i] == $0) {
            if (i == nn) {
                printf("   ");
            } else {
                printf("%+3d", i - nn);
            }
            printf("\t%s\n", $0);
            next;
        }
    }
    printf("NEW\t%s\n", $0);
}
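
For the 1000+ line files mentioned in the question, the inner loop could be replaced by an array lookup. This is a minimal sketch of that variant, assuming the same argument order (older file first) and the same output format as the script above. The array name `pos` and the output file `rank-changes.txt` are my own choices, and the sample files are recreated only so that the block is self-contained. One difference: if a name appears twice in the older file, this version uses its last position, whereas the loop above uses the first.

```shell
# Sample data: the 2017 list from the question, and the 2018 list
# with "Uppsala University" added on the second to last row.
printf '%s\n' 'University of Oxford' 'California Institute of Technology' \
    'Stanford University' 'University of Cambridge' \
    'Massachusetts Institute of Technology' > rankings-2017.txt
printf '%s\n' 'University of Oxford' 'University of Cambridge' \
    'California Institute of Technology' 'Stanford University' \
    'Uppsala University' 'Massachusetts Institute of Technology' > rankings-2018.txt

awk '
    NR == FNR    { pos[$0] = FNR; next }            # older file: name -> rank
    !($0 in pos) { printf("NEW\t%s\n", $0); next }  # not in the older file
    {
        d = pos[$0] - FNR                           # positive = moved up
        if (d == 0) {
            printf("   \t%s\n", $0)
        } else {
            printf("%+3d\t%s\n", d, $0)
        }
    }' rankings-2017.txt rankings-2018.txt > rank-changes.txt

cat rank-changes.txt
```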
Kusalananda
  • 333,661
  • 1
    Thank you for spending time and answering. Doesn't matter if it didn't give a desired output! This is very useful, as my next problem was to find the change in ranking for each university!! Thank you again. PS:- Not able to upvote the answer as my reputation is less than 15. – CCC Sep 08 '17 at 19:28
  • 2
    @CCC Come back later ;-) – Kusalananda Sep 08 '17 at 19:35