82

When trying to write the stdout from a Python script to a text file (python script.py > log), the text file is created when the command is started, but the actual content isn't written until the Python script finishes. For example:

script.py:

import time
for i in range(10):
    print('bla')
    time.sleep(5)

prints to stdout every 5 seconds when called with python script.py, but when I call python script.py > log, the size of the log file stays zero until the script finishes. Is it possible to directly write to the log file, such that you can follow the progress of the script (e.g. using tail)?

EDIT: It turns out that python -u script.py does the trick; I didn't know about the buffering of stdout.

Bart

4 Answers

86

This happens because when a process's STDOUT is redirected to something other than a terminal, the output is normally buffered into an OS-specific-sized buffer (often 4k or 8k). Conversely, when outputting to a terminal, STDOUT is line-buffered or not buffered at all, so you'll see output after each \n or after each character.

You can generally change the STDOUT buffering with the stdbuf utility:

stdbuf -oL python script.py > log

Now if you tail -F log, you should see each line output immediately as it is generated.


Alternatively, explicitly flushing the output stream after each print should achieve the same result. In Python, sys.stdout.flush() does this. If you are using Python 3.3 or newer, the print function also has a flush keyword argument that does the same: print('hello', flush=True).
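If you would rather change the buffering once instead of flushing after every print, text streams in Python 3.7 or newer have a reconfigure() method that can switch them to line buffering. The sketch below is only an illustration: make_line_buffered and demo are hypothetical helper names, and the demo uses a temporary file to stand in for a redirected stdout.

```python
import os
import sys
import tempfile


def make_line_buffered(stream=None):
    """Switch a text stream to line buffering (assumes Python 3.7+)."""
    stream = sys.stdout if stream is None else stream
    stream.reconfigure(line_buffering=True)
    return stream


def demo():
    """Return the on-disk size right after writing one line, with no explicit flush."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline="\n") as f:
            make_line_buffered(f)
            f.write("bla\n")  # the newline triggers a flush
            return os.path.getsize(path)
    finally:
        os.remove(path)
```

In a real script you would call make_line_buffered() once near the top, so that every later print that ends a line reaches the log file right away.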

  • 11
    Thanks, I didn't know about the buffering! Knowing that, Google pretty quickly told me that python -u script.py does the trick. EDIT So many answers at once, I accepted yours since it pointed me in the direction of the buffering. – Bart Feb 02 '15 at 21:59
  • 2
    @julbra Cool, yes I didn't know python had that option either. Some command-line programs also have similar options - e.g. --line-buffered for grep, but some others don't. stdbuf is the general catchall utility to deal with those that don't. – Digital Trauma Feb 02 '15 at 22:01
  • @DigitalTrauma: Isn't it better to use no buffering at all i.e. stdbuf -o0 python script.py > log in this kind of determined circumstances? – heemayl Feb 02 '15 at 22:20
  • 1
    @heemayl -oL is a compromise. In general larger buffers will provide better performance when redirecting somewhere (fewer system calls and fewer I/O operations). However if it is absolutely necessary to see each character as it is output then yes, -o0 would be required. – Digital Trauma Feb 02 '15 at 22:25
  • @Paul Please avoid copy-pasting content between answers, or at the very least mention the original authors who provided the content. – Bakuriu Feb 03 '15 at 13:14
  • @Bakuriu I wrote the edit myself. What answer do you think I copy-pasted the text from? Edit 1: I see you mentioned the same thing in your own answer. Again, I didn't copy anything. In fact, DigitalTrauma's answer was the only one I read when I edited it. Edit 2: And to be fair, you didn't mention anything about this being available only since Python 3.3. – Paul Feb 03 '15 at 20:49
  • @Paul Before introducing new content in an answer you should look at the other answers, to avoid this kind of situation. Only edits about formatting/adding references and such don't need the context from other answers. (Btw: Python<3.3 is my definition of ancient version of python.) – Bakuriu Feb 03 '15 at 21:14
  • Oh my, many thanks! I was pulling my hair out for hours: why would I not see any output from a simple pipe from a python program streaming output piped to sed? Worked in Ubuntu, but not on Debian, at least not until I killed the python program. Turns out the output was just caught in the stdout buffer on Debian! But not on Ubuntu. Go figure! Your stdbuf -oL solution fixed it instantly. – CuriousMarc Apr 22 '20 at 06:19
57

This should do the job:

import time, sys
for i in range(10):
    print('bla')
    sys.stdout.flush()
    time.sleep(5)

Since Python buffers stdout by default, I have used sys.stdout.flush() here to flush the buffer.

Another solution is to use Python's -u (unbuffered) switch. So the following will work too:

python -u script.py >> log
heemayl
20

A variation on the theme of using Python's own option for unbuffered output is to use #!/usr/bin/python -u as the first line of the script.

With #!/usr/bin/env python, that extra argument is not going to work, so alternatively one could run PYTHONUNBUFFERED=1 ./myscript.py > output.txt, or do it in two steps:

$ export PYTHONUNBUFFERED=1
$ ./myscript.py
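The same environment variable can be set when launching a script from another Python program. This is a rough sketch, not a definitive recipe: CHILD and first_line_before_exit are made-up names, and the child is an inline stand-in for your real script. It prints a line, then blocks on stdin, so the parent can only read that line early if the child's stdout is unbuffered.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a long-running script: prints, then waits on stdin.
CHILD = "import sys; print('bla'); sys.stdin.readline(); print('done')"


def first_line_before_exit():
    """Run CHILD with PYTHONUNBUFFERED=1 and read its first line while it still runs."""
    env = dict(os.environ, PYTHONUNBUFFERED="1")
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        env=env,
        text=True,
    )
    first = proc.stdout.readline()        # arrives although the child has not exited
    still_running = proc.poll() is None   # child is blocked waiting for stdin
    proc.stdin.write("\n")                # let the child continue
    proc.stdin.close()
    rest = proc.stdout.read()
    proc.wait()
    return first.strip(), still_running, rest.strip()
```

Without PYTHONUNBUFFERED=1 in env, the first readline() would likely hang, because 'bla' would sit in the child's buffer while the child waits for input.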
15

You should pass flush=True to the print function:

import time

for i in range(10):
    print('bla', flush=True)
    time.sleep(5)

According to the documentation, by default, print doesn't enforce anything about flushing:

Whether output is buffered is usually determined by file, but if the flush keyword argument is true, the stream is forcibly flushed.

And the documentation for sys's streams says:

When interactive, standard streams are line-buffered. Otherwise, they are block-buffered like regular text files. You can override this value with the -u command-line option.
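To see what "block-buffered like regular text files" means in practice, here is a small sketch that writes one line to a temporary file and checks the file size on disk before and after an explicit flush. The function name sizes_before_and_after_flush is just an illustrative choice, and the temporary file stands in for the redirected log.

```python
import os
import tempfile


def sizes_before_and_after_flush():
    """Write one short line to a regular text file; report its size before and after flush()."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline="\n") as f:
            f.write("bla\n")                # stays in Python's buffer for now
            before = os.path.getsize(path)
            f.flush()                       # pushes the buffered text to the file
            after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)
```

The short line stays in the buffer until flush() is called, which is the same reason the log file in the question stays empty while the script is still running.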


If you are stuck with an ancient version of Python, you have to call the flush method of the sys.stdout stream:

import sys
import time

for i in range(10):
    print('bla')
    sys.stdout.flush()
    time.sleep(5)
Bakuriu
  • 1
    The flush=True argument works nicely with Python 3.4.2, indeed doesn't work with the ancient (..) Python 2.7.9 – Bart Feb 03 '15 at 09:19
  • This answer suggests the same thing that DigitalTrauma said 10 hours prior. You should upvote his post, not post the same thing again. – dotancohen Feb 03 '15 at 12:51
  • 4
    @dotancohen Actually the part about print(flush=True) was added to that answer after mine by a third-party author. I find it bad taste to rip content from my answer and put it in another without credit. I decided to add my answer solely because no answers provided any mention of the simplest way of achieving what the OP wanted in newer versions of python, and I added the "old way" just for completeness. The next time please check the revision history before commenting and/or downvoting. – Bakuriu Feb 03 '15 at 13:11
  • @Bakuriu: I'm sorry then! This shows a good reason to always post why when downvoting. Could you please edit the post a bit so that I can change my downvote to an upvote? Thank you! – dotancohen Feb 03 '15 at 14:14
  • It should work with Python 2.7 if you do __future__ import: from __future__ import print_function. But yes, that's for compatibility with Python 3 only – Sergiy Kolodyazhnyy May 30 '18 at 20:19