
The ways that I've thought of:

  1. If we get to reproduce the scenario in real time, use

    tail -f application.log | tee /tmp/tailed_log

But it's not certain that we'll get to reproduce the scenario in real time.

  2. Grep the logs

    grep -i "string_to_search" application.log > /tmp/greppedLOG

The problem with this is that it will skip the lines where "string_to_search" doesn't appear, and some of the other strings that get skipped are also important.

I've not tried this, but it could work:

grep -i "string_to_search || string_to_not_skip" application.log > /tmp/greppedLOG

As I said, there are strings that, if missed, will leave the whole issue jumbled, so although this is most likely what I'll do if I don't find another solution, I'm not happy about it. A rough sketch of what I mean is below.
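
A slightly fuller version of the same idea (still untested), using extended-regex alternation so both patterns are kept, plus a couple of lines of context around each hit; the second pattern is just a placeholder for whatever other string must not be skipped:

    grep -Ei -B2 -A2 "string_to_search|string_to_not_skip" application.log > /tmp/greppedLOG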

  3. Another way is to grep the match plus the next n lines.

    grep -A5000 "string_to_search" application.log > /tmp/greppedLOG

The problem with this is that you don't know how long the log for a particular user is, so this isn't a reliable way.

  4. Another way that I use is to split the large log file.

    split -n 20 application.log /tmp/smallfile

But my senior colleagues have recommended against this too, because it is an expensive process. Sometimes there is not enough space on the servers that we administer, and if we split the log file in that case, the server goes down.

  5. Another way that I think could work is to get the logs from one time to another. I've not tried this, but according to https://unix.stackexchange.com/a/751676/569181, this could work:
    LC_ALL=C awk -v beg=10:00:00 -v end=13:00:00 '
      match($0, /[0-2][0-9]:[0-5][0-9]:[0-5][0-9]/) {
        t = substr($0, RSTART, 8)
        if (t >= end) selected = 0
        else if (t >= beg) selected = 1
      }
      selected'

The problem with this is that we don't always know the time when the customer made the transaction. But some of the time we are able to find it, and then we can check the logs around that window; a usage sketch is below.
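
For the record, this is roughly how I'd run it, assuming the awk body above has been saved to a file (the name /tmp/timewindow.awk is only a placeholder) and that the times are examples:

    LC_ALL=C awk -v beg=10:00:00 -v end=13:00:00 -f /tmp/timewindow.awk application.log > /tmp/timewindowLOG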

Other ideas off the top of my head:

Don't split the file if the home partition in df -H output is more than 85% full, even if I type the split command. I think I need an alias plus a small script for that; a rough sketch is below.
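
Something like this wrapper is what I'm imagining. It is untested, and the script name, the 85% threshold, and the output prefix are my own placeholders:

    #!/bin/bash
    # guarded_split.sh - refuse to split a file when its filesystem is already too full
    file="$1"
    # percentage used on the filesystem holding the file, digits only
    pct=$(df -H --output=pcent -- "$file" | tail -n 1 | tr -dc '0-9')
    if [ "${pct:-100}" -ge 85 ]; then
        echo "Not splitting: filesystem holding $file is ${pct}% full" >&2
        exit 1
    fi
    split -n 20 -- "$file" /tmp/smallfile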

Zipping the logs and then using ggrep isn't helpful either, because in terms of space it is the same thing as splitting.

Anything else you can recommend? Are there better solutions? Is this even a technical problem? Or our client's problem? I am really frustrated with this situation.

  • I'd suggest setting up logrotate and applying it to the existing file, too. This would then prevent the same scenario in future and allow compressing older logs. However, I am not sure how the initial splitting would be done regarding your disk size limitation. – FelixJN Aug 18 '23 at 13:05
  • The proper command to search for either of two strings is grep -Ei "string_to_search|string_to_not_skip". If you need to see lines before or after the found lines, you'll want the -A and -B options. – doneal24 Aug 18 '23 at 13:06
  • And I agree with @FelixJN that you should not let the log file get that large in the first place. – doneal24 Aug 18 '23 at 13:07
  • For cutting the large logfile, you could do as follows: use dd to write the last n bytes to a file, then use truncate to reduce the size of the logfile by that amount. Loop through that until the file is nicely chopped up. Of course this does not take newlines as the standard cutoff position and needs to be executed with great care. – FelixJN Aug 18 '23 at 13:25
  • @FelixJN You can deal with newlines in that case by dd-ing out a 4K block about where you want to split, and counting the part-line from the front of that. Add that to the original start point, and you have an exact size including a proper last line. Finicky but doable -- I have some tested code somewhere in the archive. – Paul_Pedant Aug 18 '23 at 13:36
  • One way to split the file is using the coreutils split: split --line-bytes=100M logfile. This will split the file into multiple components, always splitting at the end-of-line. (Note - I haven't investigated how this handles multi-byte encodings, such as UTF-8. It should be safe.) – Popup Aug 18 '23 at 15:05
  • "But my senior colleagues have recommended against this" - but your employer cannot afford to provision hosts which are not live servers? Or provide an adequately secure workstation for you to run this locally? – symcbean Aug 18 '23 at 15:28
  • So, first of all, symcbean is right: get a VM with at least 32 GB RAM, open the file in neovim (nvim -u NONE -R application.log), and just search to your heart's desire in it (once opened, /\Vstring_to_search). I just tried it; generated a 15 GB file with a couple hundred million lines, and yeah, that works. It's not "fast" by any means, but it gives you enough speed to find the places you want to start looking into, copy a couple thousand lines before to a new file, and then work fast. Analyzing the log on an IO-bound server is a bad idea anyway. – Marcus Müller Aug 18 '23 at 16:12
  • Generally, the fact that you have a >10 GB log is probably bad to begin with (see comments above about logrotate) and it might mean you're using the wrong tools to analyze the logs. Specific logs have specific tools; for example, packet logs have tools tailored to making intruder detection easy; database logs have other tools; webserver logs… Also, if this application is under your control, it might be time to consider better logging formats than plaintext, if, and only if, the application is actually logging the things you need (if it's not, adjust logging to omit irrelevant information). – Marcus Müller Aug 18 '23 at 16:14
  • What's the problem with less application.log and searching for your string within it with /? – Stéphane Chazelas Aug 18 '23 at 16:36
  • "if we split the log file in that case, server gets down" - if the server is that delicate you shouldn't be using it in a production environment. Fixing the application writing to the log file, as recommended on at least one other of your recent questions is the correct solution here. Fix the issue not the symptom – Chris Davies Aug 18 '23 at 16:45
  • The log file is that of a big corporation. It's easily 10GB per day. That's normal, as it's very much needed in case of transaction failures imo. – achhainsan Aug 19 '23 at 06:45
  • "The log file is that of a big corporation. It's easily 10GB per day." That's still not an excuse not to use logrotate. If you split it into smaller chunks and compress (with something like lz4) you should get much more manageable logs. lz4 (or zstd) is great for logs, as it's blindingly fast to operate on compressed files. Keep the logs compressed, and decompress in a pipe to do whatever you need to. – Popup Aug 21 '23 at 08:04
  • any good books on logrotate? or is it easy? – achhainsan Aug 21 '23 at 08:11
  • google it, read some results, then try to use it, then ask a specific question about it if you run into a problem using it. – Ed Morton Aug 21 '23 at 11:21
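
To make the logrotate suggestion from these comments concrete, here is a minimal sketch of a drop-in config; the path, schedule, and rotation count are assumptions rather than anything stated in the thread:

    # /etc/logrotate.d/application -- hypothetical path and log location
    /var/log/application/application.log {
        daily
        rotate 14
        # gzip by default; see compresscmd in logrotate(8) for lz4 or zstd
        compress
        delaycompress
        missingok
        notifempty
        # copytruncate keeps the writing process's file handle valid
        copytruncate
    }

Rotated, compressed logs can still be searched in a pipe, e.g. zcat application.log.2.gz | grep -i string_to_search, which is the pattern Popup's comment describes.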

4 Answers


This script will split a text file into a given number of sections, avoiding splitting text lines across sections. It can be used where there is only sufficient space to hold one section at a time. It operates by copying sections of the source file starting at the end, then truncating the source to free up space. So if you have a 1.8GB file and 0.5GB free space, you would need to use 4 sections (or more if you wish to have smaller output files). The last section is just renamed, as there is no need to copy it. After splitting, the source file no longer exists (there would be no room for it anyway).

The main part is an awk script (wrapped in Bash), which only sets up the section sizes (adjusting each boundary so that it coincides with a newline). It uses the system() function to invoke dd, truncate and mv for all the heavy lifting.

$ bash --version
GNU bash, version 4.4.20(1)-release (x86_64-pc-linux-gnu)
$ awk --version
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 4.0.1, GNU MP 6.1.2)
$ dd --version
dd (coreutils) 8.28
$ truncate --version
truncate (GNU coreutils) 8.28

The script takes between one and four arguments:

./splitBig Source nSect Dest Debug

Source: is the filename of the file to be split into sections.

nSect: is the number of sections required (default 10).

Dest: is a printf() format used to generate the names of the sections. Default is Source.%.3d, which appends serial numbers (from .001 up) to the source name. Section numbers correspond to the original order of the source file.

Debug: generates some diagnostics (default is none).

Test Results:

$ mkdir TestDir
$ cd TestDir
$ 
$ cp /home/paul/leipzig1M.txt ./
$ ls -s -l
total 126608
126608 -rw-rw-r-- 1 paul paul 129644797 Aug 27 15:54 leipzig1M.txt
$ 
$ time ../splitBig leipzig1M.txt 5

real    0m0.780s
user    0m0.045s
sys     0m0.727s
$ ls -s -l
total 126620
25324 -rw-rw-r-- 1 paul paul 25928991 Aug 27 15:56 leipzig1M.txt.001
25324 -rw-rw-r-- 1 paul paul 25929019 Aug 27 15:56 leipzig1M.txt.002
25324 -rw-rw-r-- 1 paul paul 25928954 Aug 27 15:56 leipzig1M.txt.003
25324 -rw-rw-r-- 1 paul paul 25928977 Aug 27 15:56 leipzig1M.txt.004
25324 -rw-rw-r-- 1 paul paul 25928856 Aug 27 15:56 leipzig1M.txt.005
$
$ rm lei*
$ cp /home/paul/leipzig1M.txt ./
$ ls -s -l
total 126608
126608 -rw-rw-r-- 1 paul paul 129644797 Aug 27 15:57 leipzig1M.txt
$ time ../splitBig leipzig1M.txt 3 "Tuesday.%1d.log" 1
.... Section 3 ....
#.. findNl: dd bs=8192 count=1 if="leipzig1M.txt" skip=86429864 iflag=skip_bytes status=none
#.. system: dd bs=128M if="leipzig1M.txt" skip=86430023 iflag=skip_bytes of="Tuesday.3.log" status=none
#.. system: truncate -s 86430023 "leipzig1M.txt"
.... Section 2 ....
#.. findNl: dd bs=8192 count=1 if="leipzig1M.txt" skip=43214932 iflag=skip_bytes status=none
#.. system: dd bs=128M if="leipzig1M.txt" skip=43214997 iflag=skip_bytes of="Tuesday.2.log" status=none
#.. system: truncate -s 43214997 "leipzig1M.txt"
.... Section 1 ....
#.. system: mv "leipzig1M.txt" "Tuesday.1.log"

real    0m0.628s
user    0m0.025s
sys     0m0.591s
$ ls -s -l
total 126612
42204 -rw-rw-r-- 1 paul paul 43214997 Aug 27 15:58 Tuesday.1.log
42204 -rw-rw-r-- 1 paul paul 43215026 Aug 27 15:58 Tuesday.2.log
42204 -rw-rw-r-- 1 paul paul 43214774 Aug 27 15:58 Tuesday.3.log
$

Script:

#! /bin/bash --

LC_ALL="C"

splitFile () { #:: (inFile, Pieces, outFmt, Debug)

local inFile="${1}" Pieces="${2}" outFmt="${3}" Debug="${4}"

local Awk='

BEGIN {
    SQ = "\042"; szLine = 8192; szFile = "128M";
    fmtLine = "dd bs=%d count=1 if=%s skip=%d iflag=skip_bytes status=none";
    fmtFile = "dd bs=%s if=%s skip=%d iflag=skip_bytes of=%s status=none";
    fmtClip = "truncate -s %d %s";
    fmtName = "mv %s %s";
}

function findNl (fIn, Seek, Local, cmd, lth, txt) {

cmd = sprintf (fmtLine, szLine, SQ fIn SQ, Seek);
if (Db) printf ("#.. findNl: %s\n", cmd);
cmd | getline txt; close (cmd);
lth = length (txt);
if (lth == szLine) printf ("#### Line at %d will be split\n", Seek);
return ((lth == szLine) ? Seek : Seek + lth + 1);

}

function Split (fIn, Size, Pieces, fmtOut, Local, n, seek, cmd) {

for (n = Pieces; n > 1; n--) {
    if (Db) printf (".... Section %3d ....\n", n);
    seek = int (Size * ((n - 1) / Pieces));
    seek = findNl( fIn, seek);
    cmd = sprintf (fmtFile, szFile, SQ fIn SQ, seek,
        SQ sprintf (outFmt, n) SQ);
    if (Db) printf ("#.. system: %s\n", cmd);
    system (cmd);
    cmd = sprintf (fmtClip, seek, SQ fIn SQ);
    if (Db) printf ("#.. system: %s\n", cmd);
    system (cmd);
}
if (Db) printf (".... Section %3d ....\n", n);
cmd = sprintf (fmtName, SQ fIn SQ, SQ sprintf (outFmt, n) SQ);
if (Db) printf ("#.. system: %s\n", cmd);
system (cmd);

}

{ Split( inFile, $1, Pieces, outFmt); }
'
    stat -L -c "%s" "${inFile}" | awk -v inFile="${inFile}" \
        -v Pieces="${Pieces}" -v outFmt="${outFmt}" \
        -v Db="${Debug}" -f <( printf '%s' "${Awk}" )
}

Script body starts here.

splitFile "${1}" "${2:-10}" "${3:-${1}.%.3d}" "${4}"

Paul_Pedant
  • I stopped at "truncating the source file". – achhainsan Aug 27 '23 at 19:29
  • @achhainsan It is a solution to your "Sometimes there is not enough space on the servers that we administer, and if we split the log file in that case, the server goes down." It truncates only the section which has already been copied elsewhere. It also answers your other question "How can I remove everything except last n bytes from the file in Linux?", except it does that for the entire file to divide it up, in order to re-use the space. It is a perfectly cromulent procedure, and tested extensively. – Paul_Pedant Aug 27 '23 at 22:28
  • Upped, I'm not doubting the script but it's risky in production environment. – achhainsan Aug 28 '23 at 03:03
  • @achhainsan The hard part is probably dealing with any logger processes which are keeping the current log open. Doing anything to the log that is not an atomic action is likely to disconnect the file from the process, or lose data. The source code of logrotate might show a technique (which won't be scriptable), but your system might just depend on all writers using open-append-close for every transaction. Best practice is probably to log to a socket which is read by a very robust process. – Paul_Pedant Aug 28 '23 at 09:30

I'm not at all clear what you're trying to achieve. As far as I can tell, the question you're asking is the one in the title, "How to view a log file that's worth 10GB+?", and the question itself simply contains ideas and thoughts that you yourself think might work.

So answering the only question I can find, one option is to use a pager such as less

less 10GBlogfile

The documentation (man less) eventually lists the keys you can use, or once you've started it you can use h to get help - a list of keys and associated actions. For starters, G will Go to the last line, cursor keys (including PageUp, PageDown) will move around, / will search for an RE string, n/N will search for the next/previous match, and q will quit the pager.
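
For example, to open the file and jump straight to the first occurrence of a pattern (the pattern here is only a placeholder):

less +/string_to_search 10GBlogfile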

Chris Davies

I know that you already tried split -n 20, but have you thought of split -n 20 --filter 'grep <whatever>' or something along those lines? That will split the original file into components and pipe each one separately to whatever command you want.

The splitting shouldn't be very costly - especially if you do split --bytes=100M - it's basically just a seek and read/write. However, I'm not sure how it handles variable-length encodings such as UTF-8. If you know that the data is ASCII, then it's pretty safe. Otherwise, you'd be better off doing something like split --line-bytes=<size> - but that will have to parse a lot more data, which can be costly.
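
A concrete, untested sketch of that idea, splitting on line boundaries and letting each chunk stream through grep (the || true just stops a chunk with no matches from being reported as a filter failure):

split -n l/20 --filter='grep -i "string_to_search" || true' application.log > /tmp/greppedLOG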

Popup
  • The problem is that the original file is left in place, and the OP states that there may not be space for the new copies alongside it. I found my tested code: an awk script which does the math, finds whole lines, and throws system calls like dd bs=128M if="./TestFile.txt" skip=116680341 iflag=skip_bytes of="./Out.10.txt" status=none and truncate -s 116680341 "./TestFile.txt", working from the end of the file backwards. It does 130MB ten ways in one second, but it scares the heck out of me, even after I retested it. – Paul_Pedant Aug 18 '23 at 16:26
  • @Paul_Pedant - mind making that an answer? – FelixJN Aug 18 '23 at 20:10
  • @FelixJN Done, but it does not seem to be popular. – Paul_Pedant Aug 27 '23 at 22:31

You seem to be trying to extract and analyse errors from the logs. There is no generic answer to the problem - how you isolate the events associated with a specific pattern in a log file is completely dependent on the structure of the log file and the nature of the thing generating the log files.

you don't know how long the log for particular user is

Is there an explicit (username) or implicit (session id, process id, IP address) identifier? If not, it sounds like you need one. You will then have to make multiple passes through the log files (a sketch follows the list below) to:

  1. Identify error instances, timestamps and user identifier
  2. Capture surrounding non-error events
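
A rough sketch of those two passes, assuming (purely for illustration) that each line carries an identifier of the form session=<id>; adapt the patterns to whatever the log actually contains:

# pass 1: collect the identifiers that appear on error lines
grep -i 'error' application.log | grep -oE 'session=[A-Za-z0-9]+' | sort -u > /tmp/ids

# pass 2: pull every line belonging to those identifiers, errors and surrounding events alike
grep -F -f /tmp/ids application.log > /tmp/relevantLOG
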
symcbean