
BUG: Fixed non-unique indexing memory allocation issue with .ix/.loc (GH4280) #4283


Merged: jreback merged 1 commit into pandas-dev:master on Jul 18, 2013


jreback (Contributor) commented Jul 18, 2013

closes #4280

Had a weird memory allocation scheme (that's why it's a scheme!) when determining
a non-unique indexer. Fixed to allocate dynamically instead.
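
To illustrate what "dynamic" means here, the following is a minimal sketch of building a non-unique indexer with a result buffer that grows on demand. It is not the code from this commit: the function name `get_indexer_non_unique_sketch`, the doubling growth factor, and the use of `-1` for missing labels are assumptions made for the example.

```python
import numpy as np


def get_indexer_non_unique_sketch(index_values, targets):
    """Return the positions in index_values that match each target label.

    Hypothetical helper for illustration only; not the pandas internals.
    """
    # Map each label to every position where it occurs in the index.
    positions = {}
    for pos, label in enumerate(index_values):
        positions.setdefault(label, []).append(pos)

    # Start with one slot per target and grow when needed, instead of
    # guessing an upper bound on the number of matches up front.
    n_alloc = max(len(targets), 1)
    result = np.empty(n_alloc, dtype=np.int64)
    count = 0
    for label in targets:
        for pos in positions.get(label, [-1]):  # -1 marks a missing label (assumption)
            if count >= n_alloc:
                # Double the buffer and copy over what has been written so far.
                n_alloc *= 2
                grown = np.empty(n_alloc, dtype=np.int64)
                grown[:count] = result[:count]
                result = grown
            result[count] = pos
            count += 1
    return result[:count]


# Label 0 occurs three times, so two targets yield four positions.
print(get_indexer_non_unique_sketch([0, 1, 2, 0, 0], [0, 1]))
```

The point of growing the buffer is that the number of matches cannot be derived from the number of targets alone: in the session in this description, 100000 target labels select 200000 rows.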

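# assumes: import numpy as np; import pandas as pd
#          from pandas import DataFrame; from numpy.random import randn

In [1]: columns = list('ABCDEFG')

In [2]: def gen_test(l, l2):
   ...:     # l rows with a unique index, followed by l2 rows that all share the label 0
   ...:     return pd.concat([DataFrame(randn(l, len(columns)), index=range(l), columns=columns),
   ...:                       DataFrame(np.ones((l2, len(columns))), index=[0]*l2, columns=columns)])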

In [3]: df = gen_test(900000,100000)

In [5]: mask = np.arange(100000)

In [6]: df
Out[6]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000000 entries, 0 to 0
Data columns (total 7 columns):
A    1000000  non-null values
B    1000000  non-null values
C    1000000  non-null values
D    1000000  non-null values
E    1000000  non-null values
F    1000000  non-null values
G    1000000  non-null values
dtypes: float64(7)

In [7]: df.loc[mask]
Out[7]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 200000 entries, 0 to 99999
Data columns (total 7 columns):
A    200000  non-null values
B    200000  non-null values
C    200000  non-null values
D    200000  non-null values
E    200000  non-null values
F    200000  non-null values
G    200000  non-null values
dtypes: float64(7)
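
A scaled-down version of the session above can serve as a quick check of the expected row counts. This is a sketch that assumes a recent pandas (where `.ix` no longer exists, so only `.loc` is exercised); the name `gen_test_small` and the sizes are chosen for illustration.

```python
import numpy as np
import pandas as pd

columns = list("ABCDEFG")


def gen_test_small(n_unique, n_dup):
    # Same layout as gen_test: a block with a unique index, followed by
    # a block whose rows all carry the label 0.
    unique_part = pd.DataFrame(np.random.randn(n_unique, len(columns)),
                               index=range(n_unique), columns=columns)
    dup_part = pd.DataFrame(np.ones((n_dup, len(columns))),
                            index=[0] * n_dup, columns=columns)
    return pd.concat([unique_part, dup_part])


df_small = gen_test_small(9, 3)
labels = np.arange(3)            # labels 0, 1, 2
selected = df_small.loc[labels]  # label 0 matches 1 + 3 rows, labels 1 and 2 match one each
print(len(selected))
```

With these sizes, label 0 contributes four rows and labels 1 and 2 one row each, mirroring how 100000 labels select 200000 rows in the full-size example.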

jreback added a commit that referenced this pull request Jul 18, 2013
BUG: Fixed non-unique indexing memory allocation issue with .ix/.loc (GH4280)
@jreback jreback merged commit 673931d into pandas-dev:master Jul 18, 2013
Development

Successfully merging this pull request may close these issues.

MemoryError when using .loc or .ix