ENH: Create dataframes using every combination of given values, like R's expand.grid() #7426

Closed
onesandzeroes opened this issue Jun 11, 2014 · 1 comment · Fixed by #7556

@onesandzeroes
Contributor

I find R's expand.grid() function quite useful for quick creation of example datasets. For example:

expand.grid(height = seq(60, 70, 5), weight = seq(100, 180, 40), sex = c("Male","Female"))
   height weight    sex
1      60    100   Male
2      65    100   Male
3      70    100   Male
4      60    140   Male
5      65    140   Male
6      70    140   Male
7      60    180   Male
8      65    180   Male
9      70    180   Male
10     60    100 Female
11     65    100 Female
12     70    100 Female
13     60    140 Female
14     65    140 Female
15     70    140 Female
16     60    180 Female
17     65    180 Female
18     70    180 Female

A simple implementation of this for pandas is easy to put together:

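import itertools

import pandas as pd

def expand_grid(dct):
    # Cartesian product of all value lists, one tuple per output row
    rows = itertools.product(*dct.values())
    # list() so the column labels are a plain list rather than a dict view
    return pd.DataFrame.from_records(rows, columns=list(dct.keys()))

df = expand_grid(
    {'height': range(60, 71, 5),
     'weight': range(100, 181, 40),
     'sex': ['Male', 'Female']}
)
print(df)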
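
One difference worth noting: itertools.product varies its last argument fastest, whereas the R output above varies the first column (height) fastest. Here is a minimal sketch of a variant that reproduces R's row order. The name expand_grid_r_order is hypothetical and not part of pandas:

```python
import itertools

import pandas as pd


def expand_grid_r_order(dct):
    """Hypothetical variant: rows ordered like R's expand.grid()."""
    keys = list(dct.keys())
    # itertools.product varies its LAST argument fastest, so feed the
    # columns in reverse order and flip each row back afterwards.
    rows = itertools.product(*(dct[k] for k in reversed(keys)))
    records = [row[::-1] for row in rows]
    return pd.DataFrame.from_records(records, columns=keys)


grid = expand_grid_r_order(
    {'height': range(60, 71, 5),
     'weight': range(100, 181, 40),
     'sex': ['Male', 'Female']}
)
print(grid.head(4))
```

Both versions produce the same 18 combinations. Only the order of the rows differs, so this only matters if the output needs to line up row by row with R.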

Do people think this would be a useful addition?

If so, what kind of features should it have beyond the basics? A dtypes argument, specifying which column should be the index, etc.?
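
To make the question concrete, here is a rough sketch of what those two options might look like. The function name expand_grid_ext and the dtypes and index parameters are assumptions for illustration only, not an existing pandas API:

```python
import itertools

import pandas as pd


def expand_grid_ext(dct, dtypes=None, index=None):
    """Hypothetical extension of expand_grid.

    dtypes: optional mapping of column name -> dtype, passed to astype()
    index:  optional column name (or list of names) to use as the index
    """
    keys = list(dct.keys())
    rows = list(itertools.product(*dct.values()))
    df = pd.DataFrame.from_records(rows, columns=keys)
    if dtypes is not None:
        # Convert only the columns named in the mapping
        df = df.astype(dtypes)
    if index is not None:
        df = df.set_index(index)
    return df


df_ext = expand_grid_ext(
    {'height': range(60, 71, 5), 'sex': ['Male', 'Female']},
    dtypes={'height': 'float64', 'sex': 'category'},
    index='sex',
)
print(df_ext)
```

Both options are thin wrappers around DataFrame.astype and DataFrame.set_index, so callers could just as easily chain those calls onto the basic version.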

I'm also not sure if expand_grid is the most intuitive name, but given that it's duplicating R functionality, maybe it's best just to leave it as is.

@jreback
Contributor

jreback commented Jun 11, 2014

Here is what we use internally for this: https://github.com/pydata/pandas/blob/master/pandas/util/testing.py#L862

I could see this as a cookbook recipe. Want to put a small example together for that?

@jreback jreback added the Docs label Jun 11, 2014
@jreback jreback changed the title ENH: Create dataframes using every combination of given values, like R's expand.grid() ENH: Create dataframes using every combination of given values, like R's expand.grid() Jun 24, 2014
@jreback jreback added this to the 0.14.1 milestone Jun 25, 2014
jorisvandenbossche added a commit that referenced this issue Jun 25, 2014
DOC: Cookbook recipe for emulating R's expand.grid() (#7426)