Suppose I have many files in a folder. From each of them I would like to print the rows of the table shown below, starting at Center Number 1 (excluding the header lines that contain "Center Number") and ending at the line where Center Number 55 occurs.

The highest Center Number can be more or less than 55 in other files. Essentially, I want to print the lines from Center Number 1 to 55.

Kindly let me know if I am not clear.

Could you kindly help me extract this table from multiple files in a folder?

Standard orientation:                         
 ---------------------------------------------------------------------
 Center     Atomic      Atomic             Coordinates (Angstroms)
 Number     Number       Type             X           Y           Z
 ---------------------------------------------------------------------
      1          6           0       -1.215837    3.069318   -0.012683
      2          6           0        0.000000    3.745098   -0.004581
      3          6           0        1.215836    3.069317   -0.012691
      4          1           0        0.000000    4.830496   -0.014896
      5          6           0        2.560967    1.294763   -0.051815
      6          6           0        2.553538    3.571032   -0.074149
      7          6           0        3.377990    2.462531   -0.100004
      8          1           0        4.457741    2.469234   -0.167815
      9          5           0       -0.000001    0.772241    0.246764
     10          9           0       -0.000006   -0.297150   -0.651382
     11          9           0        0.000002    0.314025    1.553890
     12          7           0        1.257033    1.680283   -0.003563
     13          7           0       -1.257035    1.680284   -0.003556
     14          6           0       -2.560970    1.294765   -0.051802
     15          6           0       -3.377991    2.462533   -0.100003
     16          6           0       -2.553540    3.571034   -0.074123
     17          1           0       -4.457741    2.469236   -0.167825
     18          6           0        2.954901    5.014409   -0.124724
     19          1           0        2.567267    5.512958   -1.022276
     20          1           0        2.579039    5.570960    0.742911
     21          1           0        4.044288    5.113616   -0.135637
     22          6           0       -2.954903    5.014412   -0.124680
     23          1           0       -2.579044    5.570952    0.742963
     24          1           0       -2.567265    5.512971   -1.022225
     25          1           0       -4.044290    5.113619   -0.135596
     26          6           0       -2.954407   -0.091281   -0.068456
     27          6           0       -4.239390   -0.514493    0.004248
     28          1           0       -2.143544   -0.805389   -0.154218
     29          1           0       -5.028393    0.231421    0.091938
     30          6           0        2.954406   -0.091282   -0.068458
     31          6           0        4.239389   -0.514491    0.004242
     32          1           0        2.143544   -0.805392   -0.154206
     33          1           0        5.028392    0.231425    0.091913
     34          6           0       -4.709129   -1.896616   -0.017045
     35          6           0       -6.094342   -2.142706    0.054640
     36          6           0       -3.844921   -3.007822   -0.105300
     37          6           0       -6.600001   -3.440248    0.037225
     38          1           0       -6.776966   -1.298722    0.124539
     39          6           0       -4.350526   -4.302792   -0.122289
     40          1           0       -2.771378   -2.855372   -0.158161
     41          6           0       -5.729597   -4.527650   -0.051717
     42          1           0       -7.673133   -3.602373    0.093453
     43          1           0       -3.666214   -5.144353   -0.190030
     44          1           0       -6.119334   -5.541811   -0.064992
     45          6           0        4.709130   -1.896613   -0.017036
     46          6           0        6.094345   -2.142700    0.054612
     47          6           0        3.844922   -3.007823   -0.105246
     48          6           0        6.600006   -3.440242    0.037205
     49          1           0        6.776971   -1.298713    0.124475
     50          6           0        4.350528   -4.302792   -0.122227
     51          1           0        2.771376   -2.855375   -0.158076
     52          6           0        5.729602   -4.527646   -0.051692
     53          1           0        7.673140   -3.602364    0.093405
     54          1           0        3.666216   -5.144355   -0.189932
     55          1           0        6.119340   -5.541808   -0.064961
 ---------------------------------------------------------------------

Thanks for the kind replied and sorry for the delay. I may not have been clear enough.

Below are the truncated contents of two files (File 1 and File 2) from which I would like to grep/print the table. There are hundreds of such files in the directory. I would like to print the table that appears below the string 'Standard orientation:' up to the last line within the table. Thank you for the help. Please let me know if I am not clear.

Input File 1

 Largest Abelian subgroup         C1      NOp   1
 Largest concise Abelian subgroup C1      NOp   1
                         Standard orientation:                         
 ---------------------------------------------------------------------
 Center     Atomic      Atomic             Coordinates (Angstroms)
 Number     Number       Type             X           Y           Z
 ---------------------------------------------------------------------
      1          6           0       -1.216586    3.100980    0.000455
      2          6           0        0.000001    3.773566    0.000597
      3          6           0        1.216588    3.100980    0.000455
      4          1           0        0.000001    4.859634    0.000803
      5          6           0        2.545495    1.320077    0.000192
      6          6           0        2.555481    3.591962    0.000448
      7          6           0        3.373040    2.468224    0.000313
      8          1           0        4.455638    2.453296    0.000315
      9          5           0        0.000001    0.764927    0.000563
     10          9           0       -0.000002   -0.025840   -1.144052
     11          9           0        0.000004   -0.025210    1.145650
     12          7           0        1.253950    1.705003    0.000299
     13          7           0       -1.253948    1.705003    0.000304
     14          6           0       -2.545494    1.320077    0.000203
     15          6           0       -3.373038    2.468224    0.000323
     16          6           0       -2.555479    3.591962    0.000449
     17          1           0       -4.455636    2.453297    0.000331
     18          6           0        2.975497    5.031002    0.000174
     19          1           0        2.605223    5.563377   -0.885146
     20          1           0        2.598139    5.565869    0.880953
     21          1           0        4.066160    5.115561    0.004377
     22          6           0       -2.975495    5.031002    0.000141
     23          1           0       -2.597868    5.565972    0.880740
     24          1           0       -2.605491    5.563274   -0.885356
     25          1           0       -4.066157    5.115563    0.004663
     26          6           0       -3.042055   -0.137670   -0.000010
     27          6           0       -4.329115   -0.456310   -0.000102
     28          1           0       -2.243779   -0.891949   -0.000057
     29          1           0       -5.127391    0.297969   -0.000059
     30          6           0        3.042055   -0.137671   -0.000027
     31          6           0        4.329114   -0.456312   -0.000125
     32          1           0        2.243778   -0.891949   -0.000118
     33          1           0        5.127391    0.297967   -0.000031
     34          6           0       -4.825676   -1.914057   -0.000287
     35          6           0       -6.193953   -2.184917   -0.000378
     36          6           0       -3.906762   -2.963823   -0.000907
     37          6           0       -6.643592   -3.505635   -0.000774
     38          1           0       -6.918278   -1.357574   -0.000224
     39          6           0       -4.356219   -4.284249   -0.000825
     40          1           0       -2.827996   -2.749959   -0.001252
     41          6           0       -5.725089   -4.555194   -0.000398
     42          1           0       -7.722380   -3.718843   -0.000068
     43          1           0       -3.632201   -5.111953   -0.000588
     44          1           0       -6.079423   -5.596224   -0.000077
     45          6           0        4.825674   -1.914059   -0.000371
     46          6           0        6.193952   -2.184919   -0.000464
     47          6           0        3.906760   -2.963824    0.000042
     48          6           0        6.643589   -3.505638   -0.000459
     49          1           0        6.918277   -1.357577   -0.000455
     50          6           0        4.356216   -4.284250   -0.000432
     51          1           0        2.827994   -2.749960    0.000531
     52          6           0        5.725086   -4.555196   -0.001042
     53          1           0        7.722377   -3.718846   -0.001310
     54          1           0        3.632197   -5.111954   -0.000832
     55          1           0        6.079420   -5.596226   -0.001671
 ---------------------------------------------------------------------
 Rotational constants (GHZ):           0.1374145           0.0808973           0.0514693
 Leave Link  202 at Sat Jan 11 09:07:38 2020, MaxMem=   536870912 cpu:               0.2 elap:               0.0

Input File 2

Largest Abelian subgroup         C1      NOp   1
 Largest concise Abelian subgroup C1      NOp   1
                         Standard orientation:                         
 ---------------------------------------------------------------------
 Center     Atomic      Atomic             Coordinates (Angstroms)
 Number     Number       Type             X           Y           Z
 ---------------------------------------------------------------------
      1          6           0       -1.217328    3.807808   -0.035244
      2          6           0       -0.001626    4.483869   -0.038956
      3          6           0        1.214343    3.808279   -0.038119
      4          1           0       -0.001868    5.568980   -0.065942
      5          6           0        2.559803    2.033594   -0.051567
      6          6           0        2.551865    4.309250   -0.108853
      7          6           0        3.376521    3.200643   -0.118650
      8          1           0        4.456191    3.206513   -0.187829
      9          5           0       -0.000701    1.515223    0.258023
     10          9           0       -0.001535    0.432163   -0.623591
     11          9           0        0.000936    1.077139    1.572032
     12          7           0        1.255845    2.419557   -0.007705
     13          7           0       -1.258221    2.419071   -0.004733
     14          6           0       -2.562130    2.032603   -0.045515
     15          6           0       -3.379455    3.199335   -0.110684
     16          6           0       -2.555209    4.308262   -0.102805
     17          1           0       -4.459285    3.204786   -0.177327
     18          6           0        2.952863    5.751758   -0.182066
     19          1           0        2.564068    6.236386   -1.086712
     20          1           0        2.577903    6.321497    0.677361
     21          1           0        4.042215    5.850997   -0.195786
     22          6           0       -2.956937    5.750615   -0.175054
     23          1           0       -2.580176    6.320490    0.683495
     24          1           0       -2.570460    6.235404   -1.080607
     25          1           0       -4.046357    5.849432   -0.186208
     26          6           0       -2.955292    0.646389   -0.040413
     27          6           0       -4.240099    0.224094    0.040298
     28          1           0       -2.144380   -0.068795   -0.116152
     29          1           0       -5.029157    0.971114    0.117451
     30          6           0        2.953516    0.647533   -0.047382
     31          6           0        4.238674    0.225740    0.030295
     32          1           0        2.142705   -0.067967   -0.121195
     33          1           0        5.027621    0.973068    0.105568
     34          6           0       -4.709570   -1.158284    0.040791
     35          6           0       -6.094646   -1.403512    0.117881
     36          6           0       -3.845231   -2.270547   -0.031404
     37          6           0       -6.600050   -2.701267    0.120995
     38          1           0       -6.777366   -0.558687    0.175613
     39          6           0       -4.350581   -3.565723   -0.027904
     40          1           0       -2.771783   -2.118719   -0.087866
     41          6           0       -5.729520   -3.789738    0.047739
     42          1           0       -7.673080   -2.862717    0.180971
     43          1           0       -3.666171   -4.408093   -0.083517
     44          6           0        4.708682   -1.156454    0.029695
     45          6           0        6.094033   -1.401143    0.103481
     46          6           0        3.844607   -2.269056   -0.040417
     47          6           0        6.599948   -2.698702    0.105411
     48          1           0        6.776561   -0.556050    0.159568
     49          6           0        4.350467   -3.564035   -0.038101
     50          1           0        2.770967   -2.117646   -0.094317
     51          6           0        5.729671   -3.787511    0.034253
     52          1           0        7.673182   -2.859733    0.162829
     53          1           0        3.666254   -4.406671   -0.092065
     54          8           0       -6.243296   -5.124253    0.049984
     55          8           0        6.243970   -5.121826    0.035293
     56          6           0       -7.624282   -5.104707   -0.320687
     57          1           0       -7.967776   -6.107046   -0.469779
     58          1           0       -8.195969   -4.643809    0.457546
     59          1           0       -7.742425   -4.548641   -1.227182
     60          6           0        7.642659   -5.094520   -0.261059
     61          1           0        8.169083   -4.628290    0.545419
     62          1           0        7.999744   -6.095036   -0.388951
     63          1           0        7.805719   -4.539801   -1.161391
 ---------------------------------------------------------------------
 Rotational constants (GHZ):           0.0924797           0.0555654           0.0350317
 Leave Link  202 at Sat Jan 11 16:58:46 2020, MaxMem=   536870912 cpu:               0.2 elap:               0.1
AdminBee
Rag
  • I already posted an answer, but on second thought your question is unclear enough that more input is needed. 1. Is the example you provided the desired output or a sample input? 2. Do you want to print the entire line content, or only specific fields of each line for those where the "center number" is between 1 and 55? 3. Do you want to print the header, too (and be it once)? Please edit your post to clarify, in particular add an example of the input as well as desired output. – AdminBee Jan 06 '21 at 13:06
  • You have made an amendment to your question that is currently unclear: Are these two another example of input, just as the first one you included? If so, please be sure to add the desired output, too (maybe instead of one of these input examples). If it is a new problem, please ask a different question instead. If the answers already provided could not solve your original problem, please indicate that in a comment; otherwise consider accepting the one you consider most helpful, so that others facing similar issue may find it more easily. – AdminBee Jan 12 '21 at 08:29
  • Dear @AdminBee Thanks for the comment. I have actually responded to the suggestions given by the contributors. I learnt that I cannot attach files here, also, when I try to add comment to the contributor's response, I ran short of space, hence I have included them in the question, by editing it. My response starts from "Thanks for the kind replied". My issue is not resolved yet. I have given the truncated content from the file I am trying to retrieve data for a better understanding. Kindly let me know if I am not clear. – Rag Jan 12 '21 at 08:49
  • Please note that you would still need to provide the desired output in order for contributors to understand what you want to do. Also, if you say your "issue is not resolved yet", please explain where/how the solutions provided so far do not yet achieve the desired outcome: are there error messages, or is the output incomplete/mixed with undesired text? – AdminBee Jan 13 '21 at 13:07

2 Answers

1

An awk solution comes to mind:

awk '$1~/^[[:digit:]]+$/ && $1>=1 && $1<=55' *.txt

This will process all files ending in .txt (or whatever the actual file names are in your case) and inspect the first column of every line. It will print the line provided that the first column

  • is an integer number, and
  • its value is between 1 and 55 (inclusive)

That way, we can be sure that processing is restricted to the actual "file body", ignoring the "table header".
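As a rough illustration of that behaviour, the following sketch builds a tiny sample file in the question's layout and runs the same kind of filter over it. The function name `filter_rows`, its upper-bound argument, and the sample file are assumptions made for this example only. The pattern is written as `[0-9]` instead of `[[:digit:]]` because some older awk implementations do not support POSIX character classes.

```shell
#!/bin/sh
# filter_rows MAX FILE...: print lines whose first field is an integer in 1..MAX.
# (Hypothetical helper; the name and the MAX argument are not from the answer.)
filter_rows() {
    max=$1; shift
    awk -v max="$max" '$1 ~ /^[0-9]+$/ && $1 >= 1 && $1 <= max' "$@"
}

# Build a small sample file that mimics the layout shown in the question.
demo_dir=$(mktemp -d)
cat > "$demo_dir/sample.txt" <<'EOF'
                         Standard orientation:
 ---------------------------------------------------------------------
 Center     Atomic      Atomic             Coordinates (Angstroms)
 Number     Number       Type             X           Y           Z
 ---------------------------------------------------------------------
      1          6           0       -1.215837    3.069318   -0.012683
      2          6           0        0.000000    3.745098   -0.004581
      3          6           0        1.215836    3.069317   -0.012691
 ---------------------------------------------------------------------
 Rotational constants (GHZ):           0.1374145           0.0808973           0.0514693
EOF

# Prints only the rows numbered 1 and 2; the header, the dashed lines,
# row 3 and the trailing line are all skipped.
filter_rows 2 "$demo_dir/sample.txt"

rm -rf "$demo_dir"
```

Passing the bound as a variable makes it easy to try a value other than 55 for files whose tables are longer or shorter.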

Note that the "sanity check" of the first column is stringent in that it only allows non-negative integers, which seems to be the case judging from your example file content. If you want to relax this to accept any numerical value (with the only restriction being that it falls into the desired range), you could change

$1~/^[[:digit:]]+$/

to

$1+0==$1

as noted by @αғsнιη in a comment.
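To make the difference between the two tests concrete, here is a small sketch that applies both to a few sample values. The helper name `classify` and the sample values are chosen for illustration only, and `[0-9]` stands in for `[[:digit:]]` for portability to older awk implementations.

```shell
#!/bin/sh
# classify VALUE: report which of the two awk tests accept VALUE as a first field.
# (Hypothetical helper, for illustration only.)
classify() {
    printf '%s\n' "$1" | awk '{
        strict  = ($1 ~ /^[0-9]+$/) ? "yes" : "no"   # non-negative integers only
        relaxed = ($1 + 0 == $1)    ? "yes" : "no"   # any value awk treats as numeric
        printf "%s strict=%s relaxed=%s\n", $1, strict, relaxed
    }'
}

for value in 12 3M -2.956937 Center; do
    classify "$value"
done
```

With a POSIX-conforming awk, `12` should pass both tests, `-2.956937` should pass only the relaxed one, and `3M` and `Center` should fail both, because a field that does not look like a number is compared as a string.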

AdminBee
  • what is the use of partial matching $1~/[[:digit:]]+/ when you then have restrict checking $1 is between [1,50]? – αғsнιη Jan 17 '21 at 10:06
  • @αғsнιη You are right, it should be a full matching. I have edited the answer – AdminBee Jan 18 '21 at 08:42
  • no, I meant why do you need that at all? – αғsнιη Jan 18 '21 at 09:20
  • @αғsнιη Ah, I see. I included it because string comparison in awk is sometimes awkward ( ;) ), and a test of $1>2 will return true even if $1 only starts with a number, e.g. 3M. So in order to ensure that the "print" condition is really only triggered by lines in the "data" part of the file, and not something in the header that only starts with a number, I included this additional check (although it may seem a bit pedantic). – AdminBee Jan 18 '21 at 09:25
  • I see, but then you only check for (unsigned) integers. What if the value were something like -2.956937, as in the following columns, or a value in scientific notation like 2e-18? It would be better to replace $1~/^[[:digit:]]+$/ with $1+0==$1 && …. – αғsнιη Jan 18 '21 at 09:46
  • @αғsнιη The example file content and several aspects of the file header imply that the "Center Number" can only be a non-negative integer. You are of course correct that the test construct you mention can be used in a wider range of applications where the first column can be any number (but it is then of course less specific, which may or may not be desirable). – AdminBee Jan 18 '21 at 09:58
-2

If you cd into the directory, the following should output the first 55 lines of each file within that directory. The output can in turn be sorted, piped to a file, etc. There are nicer ways to do this, but this should work.

ls -al | awk '{print $9 }' | while read line; do cat $line | head -55 ; done

Note that if the command does not output anything, you may be required to change the column within "awk" command to the one correct for your OS.

  • Welcome to the site, and thank you for your contribution. Please note, however, that parsing the output of ls is highly discouraged as it can stumble on spaces and other special characters in the filenames, and the format of the ls output (in particular timestamps for ls -l) can depend on the user's locale. – AdminBee Jan 06 '21 at 13:09
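
One way to avoid parsing ls is to let the shell expand a glob. The sketch below does that and, instead of taking a fixed number of lines, prints the rows between the second and third dashed line after each "Standard orientation:" heading, which is what the amended question seems to ask for. The function name `print_tables` is hypothetical, and the sketch assumes every file follows the layout of the samples in the question (heading, dashed line, two header lines, dashed line, data rows, dashed line).

```shell
#!/bin/sh
# print_tables DIR: for every regular file in DIR, print the data rows of each
# table that follows a "Standard orientation:" line.
# (Hypothetical helper; assumes the file layout shown in the question.)
print_tables() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue            # skip directories and unmatched globs
        awk '
            /Standard orientation:/ { dashes = 0; in_block = 1; next }
            in_block && /^ *-+ *$/ {
                dashes++                   # dashed lines 1 and 2 frame the header
                if (dashes == 3) in_block = 0   # dashed line 3 closes the table
                next
            }
            in_block && dashes == 2        # data rows: print the whole line
        ' "$f"
    done
}
```

Calling `print_tables .` from inside the directory would process every regular file there, including names that contain spaces. If a file contains several such tables, all of them are printed, and the number of rows per table does not need to be known in advance.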