Unix bash/ksh : Selection of first non space character from file from specific line

Question

I have file file1.txt whose contents are as follows:

Date List
-----------
    Quarter Date
         Year Date
             Month Date

Now I want to read the non-space content of each row of the file into a variable. For example, for row 2 the variable should contain only Quarter Year, with the leading spaces removed.

I tried:

tail -1 file1.txt > variable1

But it doesn't work.

kos · Accepted Answer · 2015-06-12T11:24:14.933

7

Using sed:

variable1="$(< inputfile sed -n '3s/ *//p')"

variable1="$([...])": runs the command [...] in a subshell and assigns its output to the variable variable1
< inputfile: redirects the content of inputfile to sed's stdin
-n: suppresses sed's automatic printing of each line

sed command breakdown:

3: performs the following command only on the 3rd line of input
s: performs a substitution
/: starts the search pattern
 *: (a space followed by *) matches zero or more space characters
/: ends the search pattern / starts the replacement string
/: ends the replacement string (which is empty, so the match is replaced with nothing) / starts the flags
p: prints the line if the substitution succeeded
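
As a quick check of the breakdown above, here is a minimal sketch that recreates the sample input in a temporary file and runs the same sed command on it. The temporary file stands in for inputfile and is an assumption for illustration only.

```shell
# Recreate the sample input from the question in a temporary file.
inputfile=$(mktemp)
printf '%s\n' \
  'Date List' \
  '-----------' \
  '    Quarter Date' \
  '         Year Date' \
  '             Month Date' > "$inputfile"

# On line 3 only: delete the leading run of spaces and print the result.
variable1="$(< "$inputfile" sed -n '3s/ *//p')"

# Brackets make any remaining leading or trailing spaces visible.
printf '[%s]\n' "$variable1"    # prints [Quarter Date]

rm -f "$inputfile"
```

Because the pattern is not anchored, it relies on sed taking the leftmost match, which on this line is the run of spaces at the start.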

edited Jun 12 '15 at 11:24

answered Jun 12 '15 at 09:03

kos

2,887

hat's off great answer – Aman Jun 12 '15 at 09:14
@StéphaneChazelas The caret is needed for cases such as the first line of the sample input, where if missing it will drop the first space / sequence of spaces later in the string. – kos Jun 12 '15 at 11:13
No, * matches the empty string, so it will always match at the beginning of the string. – Stéphane Chazelas Jun 12 '15 at 11:16
@StéphaneChazelas Nope, I'll take it back, I was thinking about +. You're right, it'll be consumed. – kos Jun 12 '15 at 11:23

score 5 · Answer 2 · answered Jun 12 '15 at 09:10

5

First read the desired line into a variable (line 3 in the example):

var=$(sed -n '3p' file1.txt)

The sed command prints (p) the 3rd line of the file. Then strip the leading spaces using parameter expansion:

echo "${var#"${var%%[![:space:]]*}"}"

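The inner expansion, ${var%%[![:space:]]*}, removes everything from the first non-whitespace character onward, leaving only the leading whitespace. The outer expansion, ${var#...}, then removes that leading whitespace from the beginning of the line.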

Output is:

Quarter Date
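
The two-step expansion can be wrapped in a small function so that it is reusable. This is only a sketch: the name strip_leading is a hypothetical helper, not something from the answer.

```shell
# strip_leading STRING: print STRING without its leading whitespace.
# (Hypothetical helper name, chosen for illustration.)
strip_leading() {
  # ${1%%[![:space:]]*} keeps only the leading whitespace of $1;
  # the outer ${1#...} then removes exactly that prefix.
  printf '%s\n' "${1#"${1%%[![:space:]]*}"}"
}

var='    Quarter Date'
var=$(strip_leading "$var")
printf '[%s]\n' "$var"    # prints [Quarter Date]
```

A string with no leading whitespace passes through unchanged, since the inner expansion is then empty and the outer one removes nothing.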

answered Jun 12 '15 at 09:10

chaos

48,171

wow, interesting use of bash's pattern matching… – Mingye Wang Jun 12 '15 at 09:31
@chaos how to store this result into a variable – Aman Jun 12 '15 at 10:56
@Aman Just var="${var#"${var%%[![:space:]]*}"}" – chaos Jun 12 '15 at 10:57

Mingye Wang · Answer 3 · 2015-06-12T11:06:58.823

4

tail -1 file1.txt > variable1 writes to the file variable1.

Use command substitution (bash.info 3.5.4, POSIX sh) instead:

variable1="$(tail -1 file1.txt)"
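
To make the difference concrete, here is a small sketch of the command substitution on a two-line sample file. The temporary file is an assumption for illustration, and tail -n 1 is used as the standard spelling of tail -1.

```shell
# Two-line stand-in for file1.txt (temporary file, for illustration only).
file1=$(mktemp)
printf '%s\n' 'Date List' '             Month Date' > "$file1"

# Command substitution captures tail's output in the variable
# instead of writing it to a file named variable1.
variable1="$(tail -n 1 "$file1")"

# Brackets show that the leading spaces are still present at this point.
printf '[%s]\n' "$variable1"    # prints [             Month Date]

rm -f "$file1"
```

The captured value still carries its leading spaces, which is why a stripping step is needed afterwards.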

Btw my version of tail from GNU in cygwin doesn't have the -1 option. Instead, I use sed:

# Strip all whitespace at the beginning of each line.
# NOTE: [[:space:]] is the POSIX character class and works in basic regular
#       expressions, so no -r (GNU sed) or -E (BSD sed) flag is needed.
# NOTE: \s is a GNU extension; [[:space:]] is the portable equivalent.
# NOTE: the g flag is unnecessary because ^ can match only once per line.
variable1="$(sed -e 's/^[[:space:]]*//' < file1.txt)"

Combined with line selection:

# EDIT: limit the [s]ubstitute operation to the 4th line only, and
#       [p]rint directly from s.
variable1="$(sed -ne '4s/^[[:space:]]*//p' < file1.txt)"
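
If the line number needs to vary, the same command can take it as a parameter. The following is a sketch under that assumption; trimmed_line is a hypothetical helper name, and the line number is assumed to be a positive integer.

```shell
# trimmed_line FILE N: print line N of FILE without leading whitespace.
# (Hypothetical helper; N is assumed to be a positive integer.)
trimmed_line() {
  sed -n "${2}s/^[[:space:]]*//p" < "$1"
}

# Recreate the sample input from the question in a temporary file.
file1=$(mktemp)
printf '%s\n' \
  'Date List' \
  '-----------' \
  '    Quarter Date' \
  '         Year Date' \
  '             Month Date' > "$file1"

variable1="$(trimmed_line "$file1" 4)"
printf '[%s]\n' "$variable1"    # prints [Year Date]

line5="$(trimmed_line "$file1" 5)"
rm -f "$file1"
```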

edited Jun 12 '15 at 11:06

answered Jun 12 '15 at 09:07

Mingye Wang

1,181

The -r option is only available on GNU sed (as far as I know) and anyway you're not using it, so I think it's safe to drop, along with the -e option which is redundant. In any case the last command doesn't work. – kos Jun 12 '15 at 09:31
@kos The -e option is just a matter of sed style. In the second code block, I have already mentioned that in BSD sed there is another flag for ERE. – Mingye Wang Jun 12 '15 at 09:40
Yes, what I meant is that by dropping it it'd have matched a wider number of standards right away. Also the g modifier would be safe to be dropped – kos Jun 12 '15 at 10:03
\s is a (recent) GNUism. It won't work elsewhere. The standard equivalent is [[:space:]]. Note that you don't have to keep the history of your answer in. It's OK to change your answer without keeping trace of the older versions. – Stéphane Chazelas Jun 12 '15 at 11:02

score 3 · Answer 4 · edited Apr 13 '17 at 12:36

With ksh/zsh/bash:

IFS=' ' read -r variable < <(tail -n 1 file)

read strips leading and trailing space characters if space is found in IFS (which it is by default along with tab and newline).

You can also do:

while IFS=' ' read -r variable <&3; do
  something with "$variable"
done 3< file

This processes the file line by line (though that's not usually the way to go in shells), with $variable holding the current line's content with leading and trailing space characters removed.
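
As a minimal sketch of that loop in action, the example below collects each trimmed line in brackets so the stripping is visible. The sample file and the result variable are assumptions for illustration.

```shell
# Small sample file with leading and trailing spaces (for illustration only).
file=$(mktemp)
printf '%s\n' '    Quarter Date  ' '         Year Date' > "$file"

result=''
# read strips leading and trailing spaces because IFS contains a space;
# -r keeps backslashes literal. File descriptor 3 leaves stdin untouched.
while IFS=' ' read -r variable <&3; do
  result="${result}[${variable}]"
done 3< "$file"

printf '%s\n' "$result"    # prints [Quarter Date][Year Date]

rm -f "$file"
```

Reading from file descriptor 3 rather than stdin means commands inside the loop body can still read from the terminal or a pipe.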
