Read file line by line, split its content, skip empty lines

Question

I have a file like this:

a;a_desc
b;b_desc  

c;
d
;
e;e_desc

What I want is:

read line by line
remove new line character
skip empty lines
if there is a string before the semicolon but not after, use the string before twice
if there is a string but no semicolon, use the string twice
return a list

This is what I want to get:

[['a', 'a_desc'], ['b', 'b_desc'], ['c', 'c'], ['d', 'd'], ['e', 'e_desc']]

What I have so far:

filename = 'data.txt'

with open(filename, 'r') as f:
    x = [line.rstrip('\n') for line in f.readlines() if not line.isspace()]
    xx = [line.split(';') for line in x]
    content = [line for line in xx if line[0]]

print(content)

That will give me:

[['a', 'a_desc'], ['b', 'b_desc'], ['c', ''], ['d'], ['e', 'e_desc']]

I could probably add more loops to handle the c and d lines correctly.
But is there a shorter way than adding all those loops?

Thanks!

I actually do not expect this to be in the file. It is a user generated file. The user should be smart enough to not create this :) — Paul, Commented Feb 13, 2018 at 19:25

Brendan Abel · Accepted Answer · 2018-02-13 19:11:08Z

2

You could use a single loop, check the values at each step, and double the list if it has only one element.

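with open(filename, 'r') as f:
    data = []
    # Iterate over the file directly; readlines() is not needed
    for line in f:
        # Strip the newline and any surrounding whitespace
        line = line.strip()
        if not line:
            continue
        # Drop empty pieces, e.g. 'c;' -> ['c'] and ';' -> []
        line_list = [s for s in line.split(';') if s]
        if not line_list:
            continue
        if len(line_list) == 1:
            # Only one value present: use it twice
            line_list *= 2
        data.append(line_list)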
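
To try the loop without a file on disk, it can be wrapped in a function that accepts any iterable of lines. This is only a sketch and not part of the original answer: the name `parse_lines` is hypothetical, the sample is the question's data held in an `io.StringIO`, and it strips all surrounding whitespace rather than only the newline.

```python
import io


def parse_lines(lines):
    """Hypothetical helper: turn 'key;desc' lines into [key, desc] pairs."""
    data = []
    for line in lines:
        # Keep only the non-empty pieces around the semicolon
        parts = [s for s in line.strip().split(';') if s]
        if not parts:
            # Blank line or a lone ';'
            continue
        if len(parts) == 1:
            # Only one value present, so use it twice
            parts *= 2
        data.append(parts)
    return data


# Sample data from the question, held in memory instead of data.txt
sample = io.StringIO("a;a_desc\nb;b_desc\n\nc;\nd\n;\ne;e_desc\n")
print(parse_lines(sample))
```

Because an open file is itself an iterable of lines, `parse_lines(f)` inside the `with open(...)` block should behave the same way.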

edited Feb 13, 2018 at 19:11

answered Feb 13, 2018 at 19:09


Omar Einea · Accepted Answer · 2018-02-13 19:30:58Z

1

Another, maybe simpler solution:

data = []
with open('data.txt') as f:
    # Loop through lines (without \n)
    for line in f.read().splitlines():
        # Clean line from surrounding spaces before testing it
        line = line.strip()
        # Make sure line isn't empty or a lone semicolon
        # (compare by value; `is` tests identity and is unreliable for strings)
        if line not in ('', ';'):
            cells = line.split(';')
            # Use first cell twice if second is empty or not there
            if len(cells) < 2 or not cells[1]:
                cells = [cells[0]] * 2
            data.append(cells)

edited Feb 13, 2018 at 19:30

answered Feb 13, 2018 at 19:25


RomanPerekhrest · Accepted Answer · 2018-02-13 19:32:23Z

1

Use the following approach with a single for loop and a few if conditions:

with open(filename) as f:
    result = []
    for r in f.read().splitlines():
        r = r.strip()
        if r and r[0] != ';':
            pair = r.split(';')
            result.append([pair[0]] * 2 if len(pair) == 1 or not pair[1] else pair)

print(result)

The output:

[['a', 'a_desc'], ['b', 'b_desc'], ['c', 'c'], ['d', 'd'], ['e', 'e_desc']]
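
A possible variation on the same idea, not taken from the answer above, uses `str.partition`, which always returns three parts and so avoids the length check. The helper name `parse_line` and the in-memory `lines` list are assumptions made for illustration:

```python
def parse_line(line):
    """Hypothetical helper: return [key, desc], or None if the line has no key."""
    # partition always returns three parts: (before, separator, after)
    key, _, desc = line.strip().partition(';')
    if not key:
        # Blank line, or nothing before the semicolon
        return None
    # Fall back to the key when the description is missing or empty
    return [key, desc or key]


# The question's sample lines, already split (stands in for the file)
lines = ["a;a_desc", "b;b_desc  ", "", "c;", "d", ";", "e;e_desc"]
result = [pair for pair in map(parse_line, lines) if pair]
print(result)
```

Like the loop above, this sketch skips any line that starts with a semicolon. A line with more than one semicolon would keep everything after the first one as the description.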

answered Feb 13, 2018 at 19:32

