0

I have a csv file that get's SFTP'd to a RHEL box I have setup. It then gets transformed a little before another process SFTP's in and grabs it sending it off to be imported to a DB of a 3rd party.

I need to have a shell script open file one, take the id & code fields.. search file two for them.. grab the pay field from file 2 and insert into the pay field of file one.

Example:

File1.csv
501,101,
505,101,
507,103,

File2.csv
501,101,5.50
505,101,6.25
507,103,1.25

I want the 5.50 in File2 to be inserted into File1 in the field after 101. Obviously search through File1 line by line and repeat.

Doing some searching, I've come across the join command which seems like it could work but I have 2 common fields I have to join against and I looked at the documentation and it seems that only 1 common field is supported.

Something like:

join -t, -0 1.1 1.2 2.3 -1 1 -2 1

That takes care of the first lookup.. but I need to match both columns 1 & 2 in order to get my desired result.

UPDATE

So in doing some more digging I've been playing with Join & Awk together to make this work. Feel like I am close but just missing something as my output right now is blank.

My actual datafile is laid out differently than my basic example above. Here is a better representation:

File1.csv
Column1,Column2,Column3,Column4,Column5,Column6,Column7


File2.csv
Column1,Column2,Column3,Column4,Column5

I want to set the value of Column6 in File1 to the Value of Column4 in File2 should there be a match of Columns 1,4,7 in File1 and Columns 2,5,3 in File2

My attempt thus far is:

join -t, -j1 -0 1.1 1.2 1.3 1.4 1.5 2.6 1.7 <(<file1.csv awk -F "," 
'{print $1"-"$4"-"$7","$0}' | sort -k1,1) <(<file2.csv awk -F "," '{print ($2"-"$5"-"$3","$0}' | sort -k1,1) > out.csv

I still want to include the entries that don't have a match. If I include

-a1

it sort of accomplishes this but the field i am matching to is blank. Instead I want it to just retain the original data.

4
  • 3
    If you write the script and have difficulty, you might show your work and ask here for help. Super User is not a script-writing service, though. Commented Aug 22, 2023 at 16:09
  • Thank you. I was messing with the join command so I'll post up how far I got.
    – Jeremy
    Commented Aug 22, 2023 at 20:10
  • (1) Your join command (that allegedly takes care of the first lookup) does not seem valid. Do you mean join -t, -o '1.1 1.2 2.3' -1 1 -2 1 File1.csv File2.csv? (2) I think the example files are not really a good example because for them the first lookup is enough; or even cat File2.csv is enough. (3) If you replace the first comma in each line with _ then you will have just one common field. In the result replace back. Can you take it from here? Commented Aug 22, 2023 at 20:46
  • So I've gotten join to work now... spits out a file where there is a match. Last part of the equation is to include the existing data if there isnt a match. If I include -a1 it sort of does this however i dont want the linked field of 2.6 in the example above.. i'd want 1.6 since there wasnt a match.
    – Jeremy
    Commented Aug 23, 2023 at 0:17

0

You must log in to answer this question.

Browse other questions tagged .