Parsing Long Excel Report Files for Predefined Results

Asked 5 years, 1 month ago

Modified 5 years, 1 month ago

I work as an IT intern at a multinational, and I was given the tedious task of combing through a multi-column Excel report of 2,500+ rows in search of inactive servers.

Here's a sample row from that file:

Then I got another Excel file, this time with just the DB codes (80+ of them).

My task was:

Go through the big report file
Find the company by its DB Code
Check whether the server is active and, if it is not, flag it for decommissioning
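
As a rough illustration of the three steps above, here is a minimal sketch of the lookup logic on in-memory data. The function name `flag_for_decommission` and the `Other Corp` row are hypothetical, and it assumes each report row has already been reduced to a (full name, code, active) triple.

```python
def flag_for_decommission(report_rows, wanted_codes):
    """Return (full_name, code, active, decomm) tuples for the wanted codes."""
    wanted = set(wanted_codes)  # a set makes each membership check cheap
    results = []
    for full_name, code, active in report_rows:
        if code in wanted:
            # an inactive server is the one that gets flagged
            decomm = "yes" if active == "no" else "no"
            results.append((full_name, code, active, decomm))
    return results


rows = [
    ("Random, Inc", "RNDM", "yes"),
    ("Acme Inc.", "ACM", "no"),
    ("Other Corp", "OTHR", "no"),  # hypothetical row whose code is not wanted
]
print(flag_for_decommission(rows, ["RNDM", "ACM"]))
```

Rows whose code is not in the list are skipped, so only the companies from the second file end up in the result.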

Of course, as you might expect, I was told to return the results in a spreadsheet in the following format:

Full name: Acme Inc. | Code: ACM | Active?: no | Decomm?: yes

Full name:, Code:, etc. are column headers. Here, they are included just for readability.

If I were to do it manually, I'd most probably die of boredom. But! There's Python, right?

So, I exported some of the columns from the report into a tab-delimited file and drafted this:

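def read_file_to_list(file_name):
    with open(file_name, 'r') as file_handler:
        # rstrip("\n") only removes the newline; line[:-1] would cut off a
        # real character when the last line has no trailing newline
        return [line.rstrip("\n") for line in file_handler]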


def make_dic(file_name):
    with open(file_name, 'r') as f:
        rows = (line.replace('"', "").strip().split("\t") for line in f)
        return {row[0]:row[1:] for row in rows}


def search(dic, ou_codes):
    counter = 1
    with open("output.txt", "w") as output:
        #writing a header
        output.write("Full name\tCode\tActive?\tDecomm?\n")
        for k, v in dic.items():
            for code in ou_codes:
                if v[0] == code:
                    #writing back line by line to a tab delimited file
                    outputline = "{}\t{}\t{}\t{}\n".format(k, *v,
                    "yes" if v[1] == "no" else "no")
                    output.write(outputline)

                    print("{}. Full name: {} | Code: {} | Active?: {} | Decomm?: {}".format(counter, k, *v, "yes" if v[1] == "no" else "no"))
                    counter += 1

decomm_codes = read_file_to_list('decomm_codes.txt')
all_of_it = make_dic('big_report.txt')

search(all_of_it, decomm_codes)

That spits out:

1. Full name: Random, Inc | Code: RNDM | Active?: yes | Decomm?: no
2. Full name: Acme Inc. | Code: ACM | Active?: no | Decomm?: yes
3. Full name: Fake Bank, Ltd.  | Code: FKBNK | Active?: yes | Decomm?: no

Is there a way to refactor the search function?
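
One possible direction, sketched here under assumptions rather than as a definitive answer: replace the inner loop with a set lookup, let the standard `csv` module write the tab-delimited rows, and pass in the output stream so the function can be tried without touching the disk. The `output` parameter and the returned count are additions that are not in the original code, and the sketch assumes every dictionary value holds exactly two items (code and active flag).

```python
import csv
import io


def search(report, codes, output):
    """Write one tab-delimited row per report entry whose code is wanted.

    report: dict mapping full name -> [code, active], as built by make_dic
    codes:  iterable of DB codes to look for
    output: any writable text stream (an open file or io.StringIO)
    Returns the number of data rows written (header excluded).
    """
    wanted = set(codes)
    writer = csv.writer(output, delimiter="\t", lineterminator="\n")
    writer.writerow(["Full name", "Code", "Active?", "Decomm?"])
    count = 0
    for full_name, (code, active) in report.items():
        if code in wanted:
            decomm = "yes" if active == "no" else "no"
            writer.writerow([full_name, code, active, decomm])
            count += 1
    return count


report = {"Random, Inc": ["RNDM", "yes"], "Acme Inc.": ["ACM", "no"]}
buffer = io.StringIO()
written = search(report, ["ACM"], buffer)
print(written)
print(buffer.getvalue())
```

To write to disk instead, the same function could be called with a file opened via `open("output.txt", "w", newline="")`.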

Finally, here are the contents of the decomm_codes.txt and big_report.txt files.

decomm_codes.txt:

RNDM
ACM
FKBNK

big_report.txt:

"Random, Inc"   RNDM    yes
Acme Inc.   ACM no
"Fake Bank, Ltd. "  FKBNK   yes

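For reference, the quoted fields in big_report.txt can also be handled by the standard `csv` module instead of a manual `replace('"', "")`. The sketch below assumes the columns are separated by real tab characters (the listing above shows them as spaces); `read_report` and `SAMPLE` are hypothetical names, and stripping the name is an extra step the original `make_dic` does not perform.

```python
import csv
import io

# The three sample rows, written with explicit tab separators (an assumption).
SAMPLE = (
    '"Random, Inc"\tRNDM\tyes\n'
    'Acme Inc.\tACM\tno\n'
    '"Fake Bank, Ltd. "\tFKBNK\tyes\n'
)


def read_report(lines):
    """Map full name -> [code, active] from tab-delimited lines."""
    reader = csv.reader(lines, delimiter="\t")  # default quotechar is '"'
    # strip() removes stray spaces inside the quotes, e.g. "Fake Bank, Ltd. "
    return {row[0].strip(): row[1:] for row in reader if row}


report = read_report(io.StringIO(SAMPLE))
print(report)
```

With a real file, the same function could be given a handle opened with `open("big_report.txt", newline="")`.
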
edited Jun 17, 2019 at 13:57

Stephen Rauch

asked Jun 14, 2019 at 13:02

baduker

"... but how do I write the results back into a tab-delimited text file?" is likely off-topic here, since only existing and working code can be reviewed. Apart from that it seems like a good question.
– AlexV
Commented Jun 14, 2019 at 13:24
