Feature request: Series.flatmap, DataFrame.flatmap #8517

I'm working on some language analysis and using pandas to munge the data and grab some descriptive stats. This is just an illustrative example; I'm doing all kinds of slightly different things.

Suppose I have a series containing chunks of text, and I want to turn each row into multiple rows, preserving the index values. Here are the naive results:

``` python
In [53]: s=pd.Series(['This is text No 1.', 'and here is no. 2','and 3'],index=['Alice','Bob','Alice'])
    ...: s
Out[53]: 
Alice    This is text No 1.
Bob       and here is no. 2
Alice                 and 3
dtype: object

In [54]: s.map(lambda x: x.split(' '))
Out[54]: 
Alice    [This, is, text, No, 1.]
Bob       [and, here, is, no., 2]
Alice                    [and, 3]
dtype: object

In [55]: s.apply(lambda x: pd.Series(x.split(' ')))
Out[55]: 
          0     1     2    3    4
Alice  This    is  text   No   1.
Bob     and  here    is  no.    2
Alice   and     3   NaN  NaN  NaN
```

What I'd like to be able to do is this (made-up example):

``` python
In [67]: s.flatmap(lambda x: x.split(' '))
Out[67]: 
Alice    This
Alice      is
Alice    text
Alice      No
Alice      1.
Bob       and
Bob      here
Bob        is
Bob       no.
Bob         2
Alice     and
Alice       3
dtype: object
```
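For what it's worth, here is a minimal sketch of how the Series case might be approximated, assuming pandas 0.25 or later, where `Series.explode` is available. The `flatmap` function below is a hypothetical helper written for illustration, not an existing pandas method.

``` python
import pandas as pd


def flatmap(s, func):
    """Hypothetical helper: apply func to each value and flatten the results.

    func must return a list-like for every value; the original index label
    is repeated once per produced item.
    """
    # map() yields one list per row; explode() gives each list item its own row.
    # Note: an empty list becomes a single NaN row rather than disappearing.
    return s.map(func).explode()


s = pd.Series(['This is text No 1.', 'and here is no. 2', 'and 3'],
              index=['Alice', 'Bob', 'Alice'])
out = flatmap(s, lambda x: x.split(' '))
print(out)
```

The duplicate `Alice` label is kept as-is, so the two Alice chunks stay separate and in their original order.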

In general, I'd like to be able to explode a single row in a dataframe into multiple rows by transforming one column value into multiple values, each becoming a new row with the values of the other columns preserved. For example:

``` python
In [69]: df=pd.DataFrame([['2014-01-01','Alice',"A B"],['2014-01-02','Bob','C D']],columns=['dt','name','text'])
    ...: df
Out[69]: 
           dt   name text
0  2014-01-01  Alice  A B
1  2014-01-02    Bob  C D

In [70]: df.flatmap(lambda x: x.split(),on='text')
Out[70]: 
           dt   name text
0  2014-01-01  Alice    A
1  2014-01-01  Alice    B
2  2014-01-02    Bob    C
3  2014-01-02    Bob    D
```
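The DataFrame case could be sketched along the same lines, again assuming pandas 0.25 or later for `DataFrame.explode`. The `flatmap` helper and its `on` parameter are hypothetical and only mirror the made-up call above.

``` python
import pandas as pd


def flatmap(df, func, on):
    """Hypothetical helper: explode column `on` after applying func to it.

    func must return a list-like for every value in the column; the other
    columns are repeated for each produced item.
    """
    # Replace the column with lists, then give each list item its own row
    exploded = df.assign(**{on: df[on].map(func)}).explode(on)
    # Renumber the rows 0..n-1, as in the made-up output above
    return exploded.reset_index(drop=True)


df = pd.DataFrame([['2014-01-01', 'Alice', 'A B'],
                   ['2014-01-02', 'Bob', 'C D']],
                  columns=['dt', 'name', 'text'])
result = flatmap(df, lambda x: x.split(), on='text')
print(result)
```

Dropping the `reset_index` call would keep the original row labels instead, which may be preferable when the index carries meaning.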

Perhaps there's another way to do this, but that's how my natural instinct suggests it should be done; flatmap is a fairly universal concept.
Groupby already does similar things based on return type. It doesn't have to be limited to groupby, though.

