
I have a file like this:

a;a_desc
b;b_desc  

c;
d
;
e;e_desc

What I want is:

  • read line by line
  • remove new line character
  • skip empty lines
  • if there is a string before the semicolon but not after, use the string before twice
  • if there is a string but no semicolon, use the string twice
  • return a list

This is what I want to get:

[['a', 'a_desc'], ['b', 'b_desc'], ['c', 'c'], ['d', 'd'], ['e', 'e_desc']]

What I have so far:

filename = 'data.txt'

with open(filename, 'r') as f:

    x = [line.rstrip('\n') for line in f.readlines() if not line.isspace()]

    xx = [line.split(';') for line in x]

    content = [line for line in xx if line[0]]

print(content)  

That will give me:

[['a', 'a_desc'], ['b', 'b_desc'], ['c', ''], ['d'], ['e', 'e_desc']]

I could probably add more loops to handle the c and d lines correctly.
But is there a shorter way that avoids all the extra loops?

Thanks!

  • How should an item like a;;;;; be treated? Commented Feb 13, 2018 at 19:20
  • I actually do not expect this to be in the file. It is a user generated file. The user should be smart enough to not create this :)
    – Paul
    Commented Feb 13, 2018 at 19:25

3 Answers


You could use a single loop, check the values at each step, and double the list if it has only one element.

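with open(filename, 'r') as f:
    data = []
    for line in f:
        # Strip the newline and any surrounding whitespace
        line = line.strip()
        if not line:
            continue
        # Split on ';' and drop empty fields
        line_list = [s for s in line.split(';') if s]
        if not line_list:
            continue  # the line contained only semicolons
        # A single value is used twice
        if len(line_list) == 1:
            line_list *= 2
        data.append(line_list)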
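
As a quick check, the same single-loop idea can be run against the sample from the question without touching the disk. This is only a sketch: the function name parse_pairs and the SAMPLE string are made up for illustration, and io.StringIO stands in for the real file.

```python
import io

# Sample data from the question, including the trailing spaces on the "b" line
SAMPLE = "a;a_desc\nb;b_desc  \n\nc;\nd\n;\ne;e_desc\n"


def parse_pairs(f):
    """Read lines from a file-like object and return a list of two-item lists."""
    data = []
    for line in f:
        line = line.strip()          # drop newline and surrounding whitespace
        if not line:
            continue                 # skip empty lines
        parts = [s for s in line.split(';') if s]
        if not parts:
            continue                 # skip lines that hold only semicolons
        if len(parts) == 1:
            parts *= 2               # use the single value twice
        data.append(parts)
    return data


print(parse_pairs(io.StringIO(SAMPLE)))
```

With an actual file you would pass the object returned by open(filename) instead of the StringIO.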

Another, maybe simpler solution:

data = []
with open('data.txt') as f:
    # Loop through lines (without \n)
    for line in f.read().splitlines():
        # Remove surrounding whitespace
        line = line.strip()
        # Make sure line isn't empty or a lone semicolon
        # (compare by value; `is` checks identity and is unreliable for strings)
        if line not in (';', ''):
            # Split the line into cells
            cells = line.split(';')
            # Use first cell twice if second is empty or not there
            if len(cells) < 2 or not cells[1]:
                cells = [cells[0]] * 2
            data.append(cells)

Use the following approach with a single for loop and a few if conditions:

with open(filename) as f:
    result = []
    for r in f.read().splitlines():
        r = r.strip()
        if r and r[0] != ';':
            pair = r.split(';')
            result.append([pair[0]] * 2 if len(pair) == 1 or not pair[1] else pair)

print(result)

The output:

[['a', 'a_desc'], ['b', 'b_desc'], ['c', 'c'], ['d', 'd'], ['e', 'e_desc']]
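
If you want something more compact, one possible variant (a sketch, not tested beyond the sample in the question) uses str.partition, which always returns three parts whether or not a semicolon is present. The names parse_line and parse_lines are hypothetical helpers introduced here.

```python
def parse_line(line):
    """Return [first, second] for one line, or None if the line should be skipped."""
    # partition() yields (before, separator, after); 'after' is '' when
    # there is no semicolon or nothing follows it
    first, _, second = line.strip().partition(';')
    first, second = first.strip(), second.strip()
    if not first:
        return None                  # empty line or nothing before the semicolon
    return [first, second or first]  # fall back to the first string


def parse_lines(lines):
    """Apply parse_line to every line and keep only the usable results."""
    return [pair for pair in map(parse_line, lines) if pair is not None]


sample = ["a;a_desc\n", "b;b_desc  \n", "\n", "c;\n", "d\n", ";\n", "e;e_desc\n"]
print(parse_lines(sample))
```

Since an open file is iterable line by line, parse_lines(f) should work directly inside a with open(filename) as f: block. Note that partition only splits on the first semicolon, so for a line like a;;;;; the remaining semicolons end up in the second element.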
