-1

I have an application log file that also captures the execution time for each executed SQL statement, like below:-

Sample logs:

I 2023-11-15 07:18:00.743Z 1760463 37 ZVRwqItu6aw-b8eejMLI_gAAAAU 1@45077318@A PHP Request Summary: 7 warnings/errors on 6 lines
I 2023-11-15 07:17:15.927Z 1773299 99 ZVRwenUWDwrXl_9NnC-2vwAAAAM 1@45077318@A There is no cache key assigned for the current PID
I 2023-11-15 07:17:55.778Z 1773397 24 ZVRwoydG6PTd3KWR_MnfVAAAABU - Initiating db proxy socket to dev01:8080
I 2023-11-15 07:17:55.865Z 1773397 33 ZVRwoydG6PTd3KWR_MnfVAAAABU 1@45077318@A execStmtEx: time: 0.002 / 0 rows: 1 sql: <some_plsql_statement>
I 2023-11-15 07:17:56.185Z 1773397 65 ZVRwoydG6PTd3KWR_MnfVAAAABU 1@45077318@A applicationConnection TRX Record Commit (execStmtEx) time: 0
I 2023-11-15 07:17:57.515Z 1755787 26 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.001 / 0 rows: 1 sql: <some_sql_here>
I 2023-11-15 07:17:57.519Z 1755787 26 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.001 / 0 rows: 0 sql: <some_sql_here>
I 2023-11-15 07:17:57.529Z 1755787 27 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.002 / 0 rows: 1 sql: <some_sql_here>
I 2023-11-15 07:17:57.531Z 1755787 27 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.001 / 0 rows: 0 <some_sql_here>
I 2023-11-15 07:17:57.535Z 1755787 28 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.002 / 0 rows: 1 sql: <some_sql_here>
I 2023-11-15 07:17:57.542Z 1755787 28 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.002 / 0 rows: 1 sql: <some_sql_here>
...

I want to grep for all these SQL queries along with their execution times, sorted in descending order(based on their execution times).

Desired output:

time: 0.002 | sql: <some_plsql_statement>
time: 0.002 | sql: <some_sql_here>

How can I achieve this with Linux commands? Could someone please help me to achieve this? Thanks in advance.

9
  • 2
    Does the file contain other lines too? If so please include a sample of those so we can see how to build a pattern that extracts only the relevant lines. (Right now sort would process the lines directly - no need for grep at all.) Commented Nov 15, 2023 at 8:35
  • @ChrisDavies, sorry for the late response, yes file contains other entries also related to the application. Could you please update your answer accordingly?
    – Arpit Jain
    Commented Nov 15, 2023 at 9:14
  • 2
    @ArpitJain no, please update your question accordingly. We cannot help you parse data that you don't show us. Also, please edit and explain how you want the output sorted. In descending order, yes, but of what? Of the time the command took to execute? Of the time it was launched? Alphabetically? Basically, show us a more complete example of your input, including lines you do not want to keep, and then show us the output you would want from that example.
    – terdon
    Commented Nov 15, 2023 at 9:45
  • 1
    Yes, that helps, but adding the expected output would be even better.
    – terdon
    Commented Nov 15, 2023 at 9:55
  • 1
    Yes, so please show us. Will all the input lines have exactly the same format? Can we expect the time to always be the 10th field? Or do you mean the time at the 3rd field?
    – terdon
    Commented Nov 15, 2023 at 9:57

2 Answers 2

2

Extract the relevant fields, printing only lines that match, and sort the result by descending time of execution:

sed -En 's/^.*(time: [0-9.]+) .* (sql: .*)/\1 | \2/p' logfile | sort -k2,3nr

Output

time: 0.002 | sql: <some_plsql_statement>
time: 0.002 | sql: <some_sql_here>
time: 0.002 | sql: <some_sql_here>
time: 0.002 | sql: <some_sql_here>
time: 0.001 | sql: <some_sql_here>
time: 0.001 | sql: <some_sql_here>
0
1

Assuming your lines will always have the same number of fields, and that sql: will only appear on the lines you want, you can do this to sort on the 10th field which seems to be the time it took to run the command:

$ grep sql: log | sort -rnk 10,10
I 2023-11-15 07:17:57.542Z 1755787 28 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.002 / 0 rows: 1 sql: <some_sql_here>
I 2023-11-15 07:17:57.535Z 1755787 28 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.002 / 0 rows: 1 sql: <some_sql_here>
I 2023-11-15 07:17:57.529Z 1755787 27 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.002 / 0 rows: 1 sql: <some_sql_here>
I 2023-11-15 07:17:55.865Z 1773397 33 ZVRwoydG6PTd3KWR_MnfVAAAABU 1@45077318@A execStmtEx: time: 0.002 / 0 rows: 1 sql: <some_plsql_statement>
I 2023-11-15 07:17:57.519Z 1755787 26 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.001 / 0 rows: 0 sql: <some_sql_here>
I 2023-11-15 07:17:57.515Z 1755787 26 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.001 / 0 rows: 1 sql: <some_sql_here>

Or, if you want to sort on the 3rd field which I am guessing is the time the command was executed, use:

$ grep sql: log | sort -rnk 3,3
I 2023-11-15 07:17:57.542Z 1755787 28 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.002 / 0 rows: 1 sql: <some_sql_here>
I 2023-11-15 07:17:57.535Z 1755787 28 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.002 / 0 rows: 1 sql: <some_sql_here>
I 2023-11-15 07:17:57.529Z 1755787 27 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.002 / 0 rows: 1 sql: <some_sql_here>
I 2023-11-15 07:17:57.519Z 1755787 26 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.001 / 0 rows: 0 sql: <some_sql_here>
I 2023-11-15 07:17:57.515Z 1755787 26 ZVRwpA5grkn1knR4gk6FMwAAABA 1@45077318@A queryResults: time: 0.001 / 0 rows: 1 sql: <some_sql_here>
I 2023-11-15 07:17:55.865Z 1773397 33 ZVRwoydG6PTd3KWR_MnfVAAAABU 1@45077318@A execStmtEx: time: 0.002 / 0 rows: 1 sql: <some_plsql_statement>

And if you only want the text from time: to the end of the line, assuming GNU grep, do:

$ grep -o 'time:.* sql:.*' log | sort -rnk 2,2
time: 0.002 / 0 rows: 1 sql: <some_sql_here>
time: 0.002 / 0 rows: 1 sql: <some_sql_here>
time: 0.002 / 0 rows: 1 sql: <some_sql_here>
time: 0.002 / 0 rows: 1 sql: <some_plsql_statement>
time: 0.001 / 0 rows: 1 sql: <some_sql_here>
time: 0.001 / 0 rows: 0 sql: <some_sql_here>
6
  • Thanks for the answer, could you please also add some details like how it works with each command so it would help me to understand and modify if needed, I would really appreciate that.
    – Arpit Jain
    Commented Nov 15, 2023 at 10:09
  • 1
    @ArpitJain there is nothing special here. Just look at man grep and man sort, and read the documentation for the options I am using. The -o tells grep to only print the matching section of the line, the -n tells sort to sort numerically and the -r to sort in reverse order. The -k n,n means "only sort on field n".
    – terdon
    Commented Nov 15, 2023 at 10:11
  • grep sql: log | sort -rnk 2,2, here in your answer it should be 10,10 if we want to sort on the 10th field?
    – Arpit Jain
    Commented Nov 15, 2023 at 10:19
  • @ArpitJain eeeek! Sorry! I had a copy/paste error. Please see updated answer. Yes, you want -k 10,10 to sort on the 10th field (if you use -k 10 it sorts from the 10th field and until the end of the line; with 10,10 it sorts only on the 10th).
    – terdon
    Commented Nov 15, 2023 at 10:24
  • You're welcome. And it starts from 1.
    – terdon
    Commented Nov 15, 2023 at 10:32

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .