I have a column where a string of characters represents a monthly series of events.

A str.split method would be ideal but I can't find the right pattern, regex or otherwise, to split on each character.

Col Foo
 BBBAAAAAR

into

Col Foo_1 | Col Foo_2 | Col Foo_3 | Col Foo_4 ...
B         |B          |B          |A          ...

I currently do it in a loop:

for keys, frames in data_frame_dict.items():
   temp1 = frames.Col_Foo.str.split(' ',expand=True).add_prefix('Feat_Mon_') 

and then append...

Which works for spaces, but I want every character in a column, which right now has no separation between each element.

But I can't find the method pattern that works for a string of characters either here or in the docs.

EDIT: I have already tried '' as a separator and it returns the right number of columns, but they're all empty. It's as if it's splitting on each character and returning the empty space between characters.

5
  • frames['Col Foo'].str.split(' ',expand=True).add_prefix('Feat_Mon_') <----- use '' in place of ' '
    – Pygirl
    Commented Mar 18, 2020 at 20:07
  • I have already tried '' and it returns the right number of columns, but they're all empty. It's as if it's splitting on each character and returning the empty space between characters. Commented Mar 18, 2020 at 20:21
  • But it works for us. Can you show us a screenshot?
    – Pygirl
    Commented Mar 18, 2020 at 20:28
  • imgur.com/a/ugwWZw7 Commented Mar 18, 2020 at 21:01
  • I see it works for a simple dataframe, thank you. I have converted the column to string just in case, but it's still not working. I'll investigate that, but this answers the initial question. Commented Mar 18, 2020 at 22:15

2 Answers 2

5

If you want to split by character, and the column is of type object, you only need to do what you are already doing, but pass an empty string '' instead of ' ' as the argument to str.split. This splits each string into its individual characters. Depending on the pandas version, the result may include an empty first and last column.

So the following code should work:

frame['Col Foo'].str.split('',expand=True)
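
As an alternative that does not depend on how an empty separator is handled, here is a minimal sketch that turns each string into a list of its characters and builds the columns from those lists. The sample data and the 1-based Feat_Mon_ column names are assumptions for illustration, and it assumes every value in the column is a string.

```python
import pandas as pd

# Hypothetical sample data; rows may have different lengths
df = pd.DataFrame({"Col Foo": [" BBBAAAAAR", "BBA", "R"]})

# Strip surrounding whitespace, then turn each string into a list of characters
chars = df["Col Foo"].str.strip().map(list)

# One column per character position; shorter rows are padded with missing values
wide = pd.DataFrame(chars.tolist(), index=df.index)

# Name the columns Feat_Mon_1, Feat_Mon_2, ... (1-based, as in the question)
wide.columns = [f"Feat_Mon_{i + 1}" for i in range(wide.shape[1])]

print(wide)
```

Because the lists are built per row, no empty leading or trailing columns appear, and the result keeps the original index so it can be joined back onto the source frame.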
2
  • 1
    should be frame['Col Foo'].str.split('',expand=True)
    – Pygirl
    Commented Mar 18, 2020 at 20:03
  • I mentioned under your comment above that I already tried this, but it didn't work. Commented Mar 18, 2020 at 20:25
0

Are they all the same length? I believe you could convert Col Foo to a string and then just iterate through the string character by character. If they're all the same length, you could hardcode the character positions into a DataFrame without a loop. *Sorry, this would've been a comment, but I don't have the rep to comment.
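
A rough sketch of that idea, assuming a known maximum length: pick out each character position with str.get, which returns a missing value when a string is too short. MAX_LEN, the sample data, and the Feat_Mon_ names are hypothetical and only for illustration.

```python
import pandas as pd

# Hypothetical sample data with variable-length strings
df = pd.DataFrame({"Col Foo": ["BBBAAAAAR", "BBA"]})

# Assumed upper bound on the string length
MAX_LEN = 9

# str.get(i) extracts the character at position i, or NaN if the string is shorter
wide = pd.DataFrame(
    {f"Feat_Mon_{i + 1}": df["Col Foo"].str.get(i) for i in range(MAX_LEN)}
)

print(wide)
```

This still loops over positions, but only MAX_LEN times rather than once per row, and each step is a vectorized column operation.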

1
  • They can be of variable length, but there is a set maximum. Commented Mar 18, 2020 at 20:24