I am learning Perl, but I don't know how to solve this problem.

I have a .txt file of the following form:

1 16.3346384
2 11.43483
3 1.19819
4 1.1113829
5 1.0953443
6 1.9458343
7 1.345645
8 1.3847385794
9 1.3534344
10 2.1117454
11 1.17465
12 1.4587485

The first column only contains the line number, which is not of interest here, but it is present in the file; the values in the second column are the relevant part.

I want to output the longest contiguous sequence of lines which feature numbers smaller than 2.00 in the second column. For the above example, this would be lines 3 to 9, and the output should be:

1.19819
1.1113829
1.0953443
1.9458343
1.345645
1.3847385794
1.3534344

Perl one line:

perl -ne '$n = (split)[1]; if ($n >= 2) {if ($i > $max) {$longest=$cur; $max=$i}; $cur=""; $i=0} else {$cur .= $n . "\n"; $i++} END {print $i > $max ? $cur : $longest}' < file.txt

Multi line for better readability:

perl -ne '
  $n = (split)[1];
  if ($n >= 2) {
    if ($i > $max) {
      $longest = $cur;
      $max = $i;
    }
    $cur = "";
    $i = 0;
  } else {
    $cur .= $n . "\n";
    $i++;
  }
  END {
    print $i > $max ? $cur : $longest
  }' < file.txt

One liner with awk:

awk '$2 >= 2 { if (i > max) {res=cur; max=i} cur=""; i=0} $2 < 2 {cur = cur $2 "\n"; i++} END {if (i > max) res=cur; printf "%s", res}' file.txt

Multi line:

awk '
  $2 >= 2 {
    if (i > max) {
      res = cur
      max = i
    }
    cur = ""
    i = 0
  }
  $2 < 2 {
    cur = cur $2 "\n"
    i++
  }
  END {
    if (i > max) res = cur
    printf "%s", res
  }' file.txt

This is not quite a trivial task. There is also debate whether providing a finished program is helpful for others learning to solve a problem in a programming language, but I believe it has its merits, so I propose the following program (let's call it findlongestsequence.pl):

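#!/usr/bin/env perl
use strict;
use Getopt::Long;

my $limit; my $infile;
GetOptions( 'limit=f' => \$limit, 'infile=s' => \$infile );

my $lineno=0; my $groupstart;
my $currlength=0; my $maxlength=0; my $ingroup=0;
my @columns; my @groupbuf; my @longestgroup;

open(my $fh, '<', $infile) or exit 1;
while (<$fh>) {
    $lineno++;
    @columns = split(/\s+/,$_);

    # Start of a new group: remember where it began and reset the buffer
    if ( $ingroup == 0 && $columns[1]<$limit ) {
        $ingroup=1; $groupstart=$lineno; @groupbuf=();
    }

    if ( $ingroup == 1 ) {
        if ( $columns[1]>=$limit ) {
            # End of group: keep it if it is the longest so far
            $ingroup=0; $currlength=$lineno-$groupstart;
            if ( $currlength>$maxlength ) {
                $maxlength=$currlength; @longestgroup=@groupbuf;
            }
        } else {
            push(@groupbuf, $columns[1]);
        }
    }
}
close($fh);

# A group may also be terminated by end-of-file
if ( $ingroup == 1 ) {
    $currlength=$lineno-$groupstart+1;
    if ( $currlength>$maxlength ) {
        $maxlength=$currlength; @longestgroup=@groupbuf;
    }
}

print join("\n",@longestgroup),"\n";
exit 0;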

You can call the program as

./findlongestsequence.pl --infile input.txt --limit 2.0

This will first interpret the command-line parameters using Getopt::Long.

It will then open the file and read it line by line, keeping a line counter in $lineno. Every line is split into columns at whitespace.

  • If the program is not inside a group of lines with values < $limit ($ingroup is zero), but encounters a suitable line, it will record that it is now in such a group ($ingroup set to one), store the group start in $groupstart and start buffering the column 2 values in an array @groupbuf.
  • If the program is inside such a group, but the current value is greater than or equal to $limit, it will recognize the end-of-group and calculate its length. If this is longer than the previously recorded longest group, the content (@groupbuf) and length ($currlength) of the new longest group are copied to @longestgroup and $maxlength, respectively.

Since it is possible that a group is terminated by end-of-file rather than by a line with value >= $limit, the same check is also performed if $ingroup is true at end-of-file.

At the end, the content of @longestgroup is printed with \n as token separator.
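
The bookkeeping described above can also be sketched compactly. What follows is a minimal Python sketch of the same idea, not the Perl program itself; the function name longest_run_below and its parameters are hypothetical, and skipping lines with fewer than two columns is an assumption that the original description does not cover.

```python
def longest_run_below(lines, limit):
    """Return the column-2 strings of the longest contiguous run below limit."""
    longest = []  # best group seen so far (plays the role of @longestgroup)
    current = []  # group currently being collected (plays the role of @groupbuf)
    for line in lines:
        columns = line.split()
        if len(columns) < 2:
            continue  # assumption: ignore blank or malformed lines
        if float(columns[1]) < limit:
            current.append(columns[1])
        else:
            # End of group: keep it only if it is strictly longer than the best
            if len(current) > len(longest):
                longest = current
            current = []
    # A group may be terminated by end-of-file instead of a large value
    if len(current) > len(longest):
        longest = current
    return longest


if __name__ == "__main__":
    sample = [
        "1 16.3346384", "2 11.43483", "3 1.19819", "4 1.1113829",
        "5 1.0953443", "6 1.9458343", "7 1.345645", "8 1.3847385794",
        "9 1.3534344", "10 2.1117454", "11 1.17465", "12 1.4587485",
    ]
    print("\n".join(longest_run_below(sample, 2.0)))
```

Run on the sample data from the question, this prints the seven values from lines 3 to 9. Comparing with a strict > means that an earlier group wins when two groups have the same length.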

Using any awk:

$ cat tst.awk
$2 >= 2 {
    max = getMax(cur,max)
    cur = ""
    next
}
{ cur = cur $2 ORS }
END {
    printf "%s", getMax(cur,max)
}
function getMax(a,b) {
    return ( gsub(ORS,"&",a) > gsub(ORS,"&",b) ? a : b )
}

$ awk -f tst.awk file

Maybe something like:

<input perl -snle '
  if ($_ < $limit) {
    $n++;
  } else {
    $max = $n if $n > $max;
    $n = 0;
  }
  END {
    print ($n > $max ? $n : $max);
  }' -- -limit=2 -max=0

Or if instead of the number of lines in that largest group of lines you want to see those lines as per newer edits to your question:

<input perl -snle '
  if ($_ < $limit) {
    push @lines, $_;
  } else {
    @max = @lines if @lines > @max;
    @lines = ();
  }
  END {
    print for @lines > @max ? @lines : @max;
  }' -- -limit=2

If, as someone edited in your question, the line numbers are part of the data, add the -a option (awk mode where records are split into the @F array) and replace $_ (the whole record) with $F[1] (the second field, $F[0] being the first).

Idiomatic solution using <> for reading input and the flipflop operator.

#!/usr/bin/env perl
use strict;
use warnings;
# https://unix.stackexchange.com/questions/766081/how-to-print-the-longest-sequence-of-lines-featuring-numbers-smaller-than-a-thre
my $threshold = 2.00;
my ($section, $maxsection, $len, $maxlen) = ("", "", 0, 0);
my $flipflop;
while (<>) {
    # Remove leading line number
    s/^\s*\d+\s+//;
    # Flip flop operator
    # https://www.effectiveperlprogramming.com/2010/11/make-exclusive-flip-flop-operators/
    if ($flipflop = ($_ < $threshold) .. ($_ >= $threshold)) {
        if ($flipflop =~ /E0$/) {
            # End of section
            if ($len > $maxlen) {
                $maxsection = $section;
                $maxlen = $len;
            }
            $len = 0;
            $section = "";
        } else {
            $section .= $_;
            $len++;
        }
    }
}
# One last possible end of section
if ($flipflop && $len > $maxlen) {
    $maxsection = $section;
}
print $maxsection;

Using Raku (formerly known as Perl_6)

~$ raku -ne 'BEGIN my (@max,@tmp);  $_ .= words;  \
             if .[1]  < 2 { @tmp.push: .[1] };    \
             if .[1] !< 2 { @max = @tmp if @tmp.elems > @max.elems; @tmp = Empty };  \
             END @max.elems >= @tmp.elems ?? (.put for @max) !! (.put for @tmp);'  file


~$ raku -ne 'BEGIN my (@max,@tmp);  $_ .= words;  \
             when .[1]  < 2 { @tmp.push: .[1] };  \
             default { @max = @tmp if @tmp.elems > @max.elems; @tmp = Empty };  \
             END @max.elems >= @tmp.elems ?? (.put for @max) !! (.put for @tmp);'  file

Here are answers written in Raku, a member of the Perl-family of programming languages. Raku features rational numbers, if ever you need to maintain precision when performing simple math operations (e.g. say 0.1 + 0.2 - 0.3;).

  • The first answer reads lines into $_ using the -ne non-autoprinting linewise flags. Both a @max and a @tmp array are declared. The line is broken on whitespace into words and saved back into $_ with .=. If (if statement) the second column .[1] satisfies the criterion, the value is pushed onto the @tmp array. If not, the @tmp array overwrites the @max array if it has more elems (elements), and in either case the @tmp array is then emptied (Empty). At the END, to check whether the final contiguous sequence is the longest, Raku's Test ?? True !! False ternary operator is used to output the longer array.

  • The second answer is similar to the first except when statements are used. In Raku once a when conditional is satisfied its associated block is executed and control reverts to the outer block, skipping any subsequent when or default statements. See reference below.
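
The remark above about rational numbers can be illustrated outside Raku as well. The following is a small Python sketch (not Raku code) that uses the standard-library fractions.Fraction type as a stand-in for Raku's rationals; it only shows why exact rational arithmetic differs from binary floating point for the 0.1 + 0.2 - 0.3 example.

```python
from fractions import Fraction

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly,
# so the result is a tiny non-zero remainder.
float_result = 0.1 + 0.2 - 0.3

# Exact rational arithmetic: 1/10 + 2/10 - 3/10 is exactly zero.
exact_result = Fraction("0.1") + Fraction("0.2") - Fraction("0.3")

print(float_result == 0)  # False
print(exact_result == 0)  # True
```

For the threshold comparison in this question, such precision is unlikely to matter; it becomes relevant once arithmetic is performed on the values.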

Sample Input:

1 16.3346384
2 11.43483
3 1.19819
4 1.1113829
5 1.0953443
6 1.9458343
7 1.345645
8 1.3847385794
9 1.3534344
10 2.1117454
11 1.17465
12 1.4587485

Sample Output:

1.19819
1.1113829
1.0953443
1.9458343
1.345645
1.3847385794
1.3534344

NOTE: The code above will output the first longest contiguous sequence in the case of a tie.
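
The tie-breaking behaviour mentioned in the note can be made explicit. Here is a minimal Python sketch (again not Raku), assuming the values have already been extracted as numbers; the name first_longest_run is hypothetical. It relies on the documented behaviour of Python's built-in max(), which returns the first maximal item it encounters.

```python
from itertools import groupby


def first_longest_run(values, limit):
    """Return the first longest contiguous run of values below limit."""
    # groupby splits the sequence into consecutive stretches that are
    # either all below the limit or all not below it; keep only the former.
    runs = [list(group)
            for below, group in groupby(values, key=lambda v: v < limit)
            if below]
    # max() returns the first maximal element, so the earliest run wins a tie.
    return max(runs, key=len, default=[])


if __name__ == "__main__":
    # Two runs of length 2: the earlier one is reported.
    print(first_longest_run([1.0, 1.1, 5.0, 1.2, 1.3], 2))
```

To prefer the last run in a tie instead, one option would be to iterate over the runs in reverse order before taking the maximum.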



If you do not want to over-engineer, try this command line one-liner:

awk '{print $2}' yourfile.txt | sort -g > youroutput.txt

  1. The first command picks the second column of your file.
  2. The second command sorts the selected column using a general numeric sort and writes the result into the output file. Note that this sorts all values; it does not by itself select the contiguous run of lines below the threshold. For more details and fiddling, check the man pages of awk and sort.
