4
cat a.txt
1387111124 ./asfjlasdf.txt
1348681215 ./akdfyxcv.txt

So I need this:

cat a.txt | SOMEMAGIC
1387111124 ./asfjlasdf.txt

The output should only have the lines that are younger than 3 months. How can I achieve this? (the first column is always the unix date, the second is some string without spaces)

Kevin
  • 40,767
gasko peter
  • 5,514

3 Answers3

6

Alternatively, if your date doesn't support %s, with many awk implementations, you can use srand() to get the current Unix time:

 awk 'BEGIN{srand(); d = srand() - 3 * 30 * 24 * 60 * 60}; $1 > d' < a.txt

Or use perl:

 perl -ne'BEGIN{$d = time - 3 * 30 * 24 * 60 * 60}print if $_ > $d' < a.txt

Note that 3 months here are counted as a fixed period of 90 days of 86400 seconds each.

If instead you want: not before the same day of the month, 3 months ago at the same time of the day (if now is 2013-12-15 21:02:01, that would be since 2013-09-15 21:02:01 (91 days and 1 hour ago in a European Union timezone for instance)), see @Zelda's answer, or:

 perl -MPOSIX -ne 'BEGIN{@t=localtime;$t[4]-=3;$d=mktime @t}
                   print if $_ > $d' < a.txt

(with the caveat that on May 29 (non leap year), May 30 & 31, Jul 31st and Dec 31st, the 3-months ago date will be the same as for GNU date -d '3 months ago').

If instead, you mean this month, the previous, or the one before the previous (if today is 2013-12-15, that would be since 2013-10-01 00:00:00), that would be:

 perl -MPOSIX -ne 'BEGIN{@t=localtime;$d=mktime 0,0,0,1,$t[4]-2,$t[5]}
                   print if $_ >= $d' < a.txt
  • Why < a.txt for awk? – Bernhard Dec 15 '13 at 15:33
  • 1
    @Bernhard, awk has some issues with some filenames (those containing = characters). Having the shell open the file does not have the issue and has other advantages. – Stéphane Chazelas Dec 15 '13 at 20:34
  • And how about memory for the hypothetical case of a huge file? – Bernhard Dec 16 '13 at 06:43
  • @Bernhard, the only difference is who opens the file for reading (the shell with "<" on fd 0, awk on a different fd without "<"), not those reads from the file (awk in both cases). There could be memory issues if lines are huge (with some awk implementations that don't have a limit there), but that's regardless of whether the shell or awk opens the file. – Stéphane Chazelas Dec 16 '13 at 09:46
  • I test awk '{}' file and awk '{}' <file on a hypothetical huge file (1GB), and did not see noticable differences indeed. – Bernhard Dec 16 '13 at 10:27
3

I would use awk and date (provided your date implementation, like GNU date, supports the non-standard %s format).

$ cat a.txt
1387111124 ./asfjlasdf.txt
1348681215 ./akdfyxcv.txt
$ awk -v date="$(date +%s)" '$1>(date-60*60*24*90)' a.txt
1387111124 ./asfjlasdf.txt

Use the date +%s to get the amount of seconds since 1970. Then substract 90 days and check whether the first column is larger than this.

Bernhard
  • 12,272
  • @Stephane Why would you need quotes around the subshell for this specific date? No spaces in output. – Bernhard Dec 15 '13 at 14:39
  • [Oops] Omiting quotes is the split+glob operator. The splitting is done on the characters of the current value of $IFS, not necessarily space. The question I would ask would be: why do you want to use the split+glob operator here? – Stéphane Chazelas Dec 15 '13 at 20:24
3

In some of the other answers 3 months is assumed to be 90 days. This is only true for a relatively small number of days per year. To get a more accurate time stamp for 3 months before now, you can use the following Python script (only using its standard library). That is accurate except for May 29 (non leap year), May 30 & 31, July 31st and Dec 31st.

#!/usr/bin/env python

import sys
from datetime import datetime

now = datetime.now()
#now = datetime(2013,5,31,12,0,0)
for x in range(4): # 0 - 3 days earlier
    try:
        if now.month < 4:
            now_min_3_months = now.replace(year=now.year-1, month=now.month+9, 
                                           day=now.day-x)
        else:
            now_min_3_months = now.replace(month=now.month-3, day=now.day-x)
        break
    except ValueError:
        pass
#print (int(now_min_3_months.timestamp()))
if sys.version_info < (3,):
    import calendar
    now_min_3_months_ts = calendar.timegm(now_min_3_months.utctimetuple())
else:
    now_min_3_months_ts = now_min_3_months.timestamp()

for line in sys.stdin:
    ts, rest = line.split(' ', 1)
    if int(ts) < now_min_3_months_ts:
        continue
    sys.stdout.write(line)

Pipe a.txt into this script to get the output.

Thanks go to Stephane and Anton for pointing out errors in the original answer.

Zelda
  • 6,262
  • 1
  • 23
  • 28
  • As Stephane indicates this does not work, you better use the non-standard dateutil library for that. And even then you can debate about what is 3 months before May 31st is something you can debate about. (dateutil calculates that to be Feb 28th) – Anthon Dec 15 '13 at 14:56