1

Let's say I send an email containing a link to my website to someone who I really hope will visit it (fingers crossed):

http://www.example.com/?utm_source=email392

or

http://www.example.com/somefile.pdf?utm_source=email392

How can I make Linux trigger an action (such as sending an automated email to myself) when this URL is visited, by regularly examining /var/log/apache2/other_vhosts_access.log?

I can't do it at the PHP level because I need it for various sources/websites: some of them use PHP, some don't and are just links to files to be downloaded, etc. Even for the websites that use PHP, I don't want to modify every index.php to do it from there. That's why I prefer a method based on parsing the Apache log.

Basj
  • 2,519
  • 1
    Notice that the URL might be visited by something else (e.g. Google bots) – Basile Starynkevitch Sep 14 '17 at 15:17
  • Yes @BasileStarynkevitch, I checked my logs and bots do visit my website, that's right, but they never do with the precise pattern /?utm_source=onlycommunicated_tooneperson_viaemail – Basj Sep 15 '17 at 11:57

6 Answers

5

Live log monitoring using bash process substitution:

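#!/bin/bash

# IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
while IFS= read -r line
do
    # action here, log line in $line
    printf 'matched: %s\n' "$line"

# -F matches the URL as a fixed string; --line-buffered makes grep flush
# each match immediately instead of block-buffering its output to the pipe.
done < <(tail -n 0 -f /var/log/apache2/other_vhosts_access.log | \
         grep -F --line-buffered '/somefile.pdf?utm_source=email392')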

Process substitution feeds the read loop with the output from the pipeline inside <(...). The log line itself is assigned to variable $line.

Logs are watched using tail -f, which outputs lines as they are written to the logs. If your log files are moved periodically by logrotate, add --follow=name and --retry options to watch the file path instead of just the file descriptor.

Output from tail is piped to grep, which filters the relevant lines matching your URLs.
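As a rough sketch of how the pieces above could be wrapped into reusable functions, assuming GNU tail: the names on_hit, filter_hits and watch_log are made up for illustration, and on_hit is only a placeholder for whatever action you want to run.

```shell
#!/bin/bash

# Hypothetical action: called once per matching log line.
on_hit() {
    printf 'hit: %s\n' "$1"
}

# Read lines on stdin and call on_hit for each line that contains
# the fixed string given as $1 (no regex interpretation).
filter_hits() {
    local pattern=$1 line
    while IFS= read -r line; do
        case $line in
            *"$pattern"*) on_hit "$line" ;;
        esac
    done
}

# Follow the log by name and retry when it disappears, so that
# rotation by logrotate does not end the watch.
watch_log() {
    tail -n 0 --follow=name --retry "$1" 2>/dev/null | filter_hits "$2"
}
```

It would be started with something like watch_log /var/log/apache2/other_vhosts_access.log '/somefile.pdf?utm_source=email392'. Matching inside the read loop instead of piping through grep is a design choice here: it avoids grep's output buffering entirely.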

sebasth
  • 14,872
1

You can take a one liner like this:

grep -q "utm_source=email392" /var/log/apache2/other_vhosts_access.log && grep "utm_source=email392" /var/log/apache2/other_vhosts_access.log | mail -s "Accessed!" foo@bar.com

and run it periodically via cron.

Explaining it in more detail: the first grep is used only to check whether further action is needed (adding -q makes it quiet, hiding any matches it might find). && means that the rest of the line will only run if the first grep finds a match (i.e. returns 0). If that is the case, the matching line(s) found by the second grep (run without -q, so that it prints them) are piped into mail and sent to foo@bar.com, in an email with the subject specified by the -s argument ("Accessed!").

The same logic (grep -q ... && ...) can be used to perform any other actions. You can run whatever you want after &&, e.g. a shell script for more complex stuff.

Note that if you run this at a higher frequency than the log's rotation -- e.g. checking hourly but rotating the logs daily -- the action might be triggered multiple times, since grep will keep finding the same line(s) over and over again until the log rotates.
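One possible way around the repeated triggering, sketched under assumptions: remember how many bytes of the log were already examined and only grep what was appended since. The function name new_lines and the state-file path are hypothetical, and the sketch assumes a tail and head that support -c.

```shell
#!/bin/bash

# Print only the part of the log ($1) that was appended since the
# previous call. The byte offset is remembered in a state file ($2).
new_lines() {
    local log=$1 state=$2 offset=0 size
    [ -f "$state" ] && offset=$(cat "$state")
    size=$(( $(wc -c < "$log") ))
    # A file smaller than the saved offset means the log was rotated:
    # start again from the beginning.
    [ "$size" -lt "$offset" ] && offset=0
    # Emit exactly the bytes between the old offset and the size just
    # measured, so lines written meanwhile are left for the next run.
    tail -c "+$((offset + 1))" "$log" | head -c "$((size - offset))"
    printf '%s\n' "$size" > "$state"
}
```

From cron, the output would then replace the log file in the grep, for example new_lines /var/log/apache2/other_vhosts_access.log /var/tmp/utm.offset | grep "utm_source=email392", where /var/tmp/utm.offset is just an example path.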

Zé Loff
  • 2,112
  • Thanks for your answer, but this is a problem indeed: "if you run this at a higher frequency than the log's rotation -- e.g. checking hourly but rotating the logs daily -- the action might be triggered multiple times", because I only rotate logs weekly. – Basj Sep 15 '17 at 11:52
1

While writing my solution I found that the first answer is similar to mine. I would also recommend avoiding crontab in this case. Here is my code:

#!/bin/bash
file="$1"
pattern="$2"

tail -f -n0 "$file" | {
   while IFS= read -r line
   do
      # grep -q only sets the exit status; -F matches the pattern literally
      if printf '%s\n' "$line" | grep -qF -- "$pattern" ; then
         echo "visited $pattern" | mail user@example.com
      fi
   done
}

In addition, you can run it in the background with the & operator:

./checklog.sh /var/log/apache2/other_vhosts_access.log "somefile.pdf?utm_source=email392" &

or start it as a 'daemon' when the system boots up.

Rui F Ribeiro
  • 56,709
  • 26
  • 150
  • 232
Wax
  • 61
  • Nice! Will this run forever? (I would like this: run forever, until I kill it). Also will it reopen the file automatically when the logs rotate? – Basj Sep 15 '17 at 11:59
  • It will run until you kill it or the system goes down (in almost any case) :-) As mentioned in the first answer by sebasth, you can use the --follow and --retry options when the log rotates. I'm glad I could help. – Wax Sep 15 '17 at 12:12
  • Why did you put while IFS=(nothinghere) read -r line? – Basj Sep 15 '17 at 12:38
  • 1
    The value of the IFS (internal field separator) variable holds the character(s) used to split the string read. Therefore read will assign the whole line to line when IFS is empty. – Wax Sep 15 '17 at 13:11
  • Thanks again. Now a linked issue just in case you have an idea ;) – Basj Sep 15 '17 at 14:25
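To illustrate the IFS explanation in the comments above, here is a small demonstration of what read stores with and without IFS= and -r. The sample input string is made up.

```shell
#!/bin/bash

# Sample line with surrounding spaces and a backslash (invented input).
input='  GET /a\tb  '

# Default IFS, no -r: surrounding whitespace is trimmed and the
# backslash is consumed as an escape character.
read line_default <<< "$input"

# Empty IFS plus -r: the line is stored exactly as it was read.
IFS= read -r line_raw <<< "$input"

printf '[%s]\n' "$line_default"   # prints [GET /atb]
printf '[%s]\n' "$line_raw"       # prints [  GET /a\tb  ]
```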
1

Try fail2ban with the apache-badbots.conf filter (replace its regex with your URL) and sendmail.conf as the action:

[mycustombot]
enabled = true
# your "custom" script
filter = apache-badbots
action = sendmail[name=MyBadBot,dest=youremail@be.happy]
logpath = /your/access/logs/*/path

BrenoZan
  • 345
0

Posting what I'm finally using, for future reference (ok I know one liners are sometimes bad, but...):

tail -F -n0 /var/log/apache2/other_vhosts_access.log | grep --line-buffered "?src=_" | { while IFS= read -r line; do echo "$line" | mail test@example.com; done } &

Notes:

  • I have to use grep --line-buffered because grep otherwise block-buffers its output when writing to a pipe, so matches would reach the loop late.

  • tail -F is equivalent to --follow=name --retry.

(Of course credit goes to sebasth and Wax.)

Basj
  • 2,519
0

You can do that using rsyslog and the ommail module:

http://www.rsyslog.com/doc/v8-stable/configuration/modules/ommail.html

Something like:

module(load="ommail")

if $msg contains "/somefile.pdf?utm_source=email392" then {
   action(type="ommail" server="..." port=".."
       mailfrom="...."
       mailto="..."
       subject.text="Page Viewed!")
}

This will work if Apache is configured to log using syslog.