I'm now using a workaround that lets me reach Google, and the page looks very different... so I can edit the question and comment on the answers normally. I have deleted most of the earlier discussion to get to the point.
I want to stop tailing and return 0 when the keywords are found, so that the script goes on to the next commands. If the keywords are not found within one minute, the whole script should stop and return an error code. I'm using set -euxo pipefail, which is required.
timeout 1m tail -Fn 0 --pid=$(ps -ef | grep "sed /$keywords" | grep -v grep | awk '{print $2}') $log_file | sed "/$keywords/q"
The command above, which I used before, seemed OK when I tested it. But in Jenkins it sometimes returned "Build step 'Execute shell' marked build as failure" even though the keywords were found.
I found the reason when I restarted the program manually: the command returned code 141. I looked up the code and found that it is related to the combination of tail -f and a pipe |, as described in http://www.pixelbeat.org/programming/sigpipe_handling.html.
I adapted a command from another question for my purpose. It seemed fine, except that "tail -Fn 0 balabala.log" remained in the background and only disappeared after several minutes. Still, it is the closest to what I want.
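A minimal sketch of where a status like 141 can come from, assuming bash with pipefail and a shell in which SIGPIPE is not ignored. The yes | head pipeline is only a stand-in for tail | sed: the reader exits early, the writer is terminated by SIGPIPE (signal 13), and the shell reports 128 + 13.

```bash
#!/usr/bin/env bash
# Stand-in for "tail ... | sed .../q": the reader quits before the writer does.
set -o pipefail

status=0
# head exits after one line; yes then writes into a closed pipe.
# With pipefail, the writer's failure becomes the status of the whole pipeline.
yes | head -n 1 >/dev/null || status=$?

echo "pipeline status: $status"
```

Under set -e the same non-zero status would stop the script, which matches the Jenkins failure described above.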
{ sed /"$keywords"/q; kill -13 $!; } < <(exec timeout 1m tail -Fn 0 $log_file)
It's beyond my comprehension... I looked up the usages but still feel uncertain.
- I changed kill -s PIPE "$!" to kill -13 $! to shorten the script.
- I'm still confused about the usage of { } < <(). They are like alien words to me...
- Can exec be deleted? It seems to behave differently when I don't write it.
- What's wrong with the tail in the background? Is it dangerous if I start many programs at the same time?
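A small sketch of what the { ...; } < <(...) construct does, assuming bash. The printf command and the variable names are only placeholders for illustration: the braces group commands in the current shell, < <(cmd) connects the output of cmd to the group's standard input, and $! holds the PID of the process started by <(...).

```bash
#!/usr/bin/env bash
# { ...; } runs its commands in the current shell, so variables set inside survive.
# < <(cmd) starts cmd in the background and feeds its output to the group's stdin.
{
  read -r first_line   # reads the first line written by the substituted command
  writer_pid=$!        # PID of the process started by <( ... )
} < <(printf 'alpha\nbeta\n')

echo "first line: $first_line"
echo "writer pid: $writer_pid"
```

In the command from the question, sed plays the role of read, and kill uses $! to signal the timeout process.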
Here is the log in Jenkins:
......
+ keywords='cloud-service-notice has been started successfully'
+ log_file=/data/jars/logs/info.cloud-service-notice.log
+ cd /data/jars/cloud-service-notice
+ nohup java -jar /data/jars/cloud-service-notice/cloud-service-notice.jar --spring.profiles.active=test
+ sed '/cloud-service-notice has been started successfully/q'
++ exec timeout 1m tail -Fn 0 /data/jars/logs/info.cloud-service-notice.log
2019-07-02 10:31:12,544 [main] INFO o.s.c.a.AnnotationConfigApplicationContext.prepareRefresh[588] - Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@54a097cc: startup date [Tue Jul 02 10:31:12 CST 2019]; root of context hierarchy
......
2019-07-02 10:31:33,860 [main] INFO c.enneagon.service.notice.NoticeApp.main[24] - cloud-service-notice has been started successfully
+ kill -13 20021
+ ssh web01 'cd /data/releases/cloud-service-notice/20190702-104108/.. &&' 'ls -1 | sort -r | awk '\''FNR > 20 {printf("rm -rf %s\n", $0);}'\'' | bash'
Finished: SUCCESS
The process 20021 was timeout 1m tail -Fn 0 balabala.log. After the script finished, the process 20023 tail -Fn 0 balabala.log remained, and disappeared a few minutes later.
[root@web01 scripts]# ps -ef | grep notice
root 20020 1 27 10:41 ? 00:01:07 java -jar /data/jars/cloud-service-notice/cloud-service-notice.jar --spring.profiles.active=test
root 20023 1 0 10:41 ? 00:00:00 tail -Fn 0 /data/jars/logs/info.cloud-service-notice.log
root 20461 18966 0 10:45 pts/1 00:00:00 grep --color=auto notice
I will answer my own question with this command, but I'm not sure about it. I tested it on my local machines, and will put it in our production environment for further testing.
After many tests, I finally changed the command to this:
{ sed /"$keywords"/q; kill $!; } < <(exec timeout 1m tail -Fn 0 $log_file)
The -13 was simply removed; it was the reason why tail -Fn 0 balabala.log remained. I can roughly answer the four questions above:
- kill -15 is better because I added timeout to this command. $! is the pid of timeout. tail -Fn 0 balabala.log is a child process of timeout here, and the default signal 15 kills it as well.
- It is just a process substitution combined with a command group {...}. I can even leave out kill, because after 1 minute timeout will end in the background on its own. So this command without kill is still acceptable: sed /"$keywords"/q < <(exec timeout 1m tail -Fn 0 $log_file). In this situation, it will always return 0.
- Better not. Without it, two upper-level processes are shown while the command is running. But it will be all right after finishing.
- The reason is in the first point above: timeout was killed, but tail was left.
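For the requirement in the question (fail when the keywords do not show up in time), here is a minimal sketch, assuming bash and GNU timeout, tail and grep. wait_for_keyword is a hypothetical helper name. It swaps sed for grep -q so that the exit status says directly whether the keywords were seen; unlike the sed version, it does not echo the log lines.

```bash
#!/usr/bin/env bash
# wait_for_keyword is a hypothetical helper, not part of any tool.
# Usage: wait_for_keyword LOG_FILE KEYWORDS [LIMIT]
# Returns 0 once a line matching KEYWORDS is appended to LOG_FILE,
# non-zero if LIMIT (default 1m) passes first.
wait_for_keyword() {
  local log_file=$1 keywords=$2 limit=${3:-1m} found=0
  {
    # grep -q exits 0 at the first match, 1 at end of input (timeout expired).
    grep -q -- "$keywords" || found=$?
    # Stop timeout if it is still running; it passes SIGTERM on to tail.
    # Ignore the error when timeout has already exited.
    kill "$!" 2>/dev/null || true
  } < <(exec timeout "$limit" tail -Fn 0 "$log_file")
  return "$found"
}
```

Called as wait_for_keyword "$log_file" "$keywords" 1m in a script running under set -e, a missing keyword would stop the script with a non-zero status.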
grep -A 1 $keyword | tail -n 1 so you get the line after your keyword. (or tail -n 2 but you get the idea) – KuboMD Jun 28 '19 at 18:36
tail buffers its output, and when it gets the success line, it hasn't filled its buffer. It times out, timeout kills it, flushing the buffers, which go to sed, and you see the line in the output from sed. Try using stdbuf or unbuffer, etc. to disable buffering. Also, please don't use code formatting when quoting people. – muru Jun 29 '19 at 14:55
tail won't exit after it has written an extra line after sed has exited (after the keyword has been found). – Stéphane Chazelas Jun 29 '19 at 15:12