
BUG: GH3109 fixed issues where passing an axis of 'columns' would fail #3110


Merged
merged 1 commit into from
Mar 26, 2013

Conversation

jreback
Contributor

@jreback jreback commented Mar 20, 2013

BUG: some operations expected an axis number; this fix allows passing the regular axis name as well, e.g.
df.sum(axis='columns') will now work

note: narrowed the scope of this PR to just a bug fix

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: df = pd.DataFrame(np.random.rand(5,2),columns=['A','B'])

In [4]: df.sum(axis=1)
Out[4]: 
0    1.150325
1    0.789142
2    0.581486
3    0.864212
4    0.731511

In [5]: df.sum(axis='columns')
---------------------------------------------------------------------------

Exception: Must have 0<= axis <= 1
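
For illustration only, the kind of normalization this fix implies can be sketched as below. `resolve_axis` and `AXIS_NAMES` are hypothetical names made up for this sketch; this is not the code from the PR or from pandas internals, just a minimal version of "accept either the number or the canonical name" for a 2-D frame.

```python
# Hypothetical helper (an assumption for illustration, not pandas code):
# accept an axis given either as a number or as a canonical name.
AXIS_NAMES = {"index": 0, "columns": 1}


def resolve_axis(axis):
    """Return 0 or 1 for an axis given as an int or a canonical name."""
    if isinstance(axis, str):
        if axis not in AXIS_NAMES:
            raise ValueError(f"No axis named {axis!r}")
        return AXIS_NAMES[axis]
    if isinstance(axis, int) and 0 <= axis <= 1:
        return int(axis)
    raise ValueError("Must have 0 <= axis <= 1")


axis_by_name = resolve_axis("columns")
axis_by_number = resolve_axis(1)
```

With a single resolver like this called at the top of every reduction, `axis=1` and `axis='columns'` would take the same code path, which is the uniformity the bug report asks for.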

@jreback
Contributor Author

jreback commented Mar 20, 2013

@wesm @y-p

what do you think?

as an aside this doesn't allow actually accessing the index like
df.myaxis2, that's a bit more complicated (save for later)

@ghost

ghost commented Mar 21, 2013

I'm thinking... why don't we create a milestone called "rug" and put
all the remaining 0.11 issues under it.

@jreback
Contributor Author

jreback commented Mar 25, 2013

@y-p @lodagro any negative thoughts on this?

aside from rugging......

@ghost

ghost commented Mar 25, 2013

I'm -1 on this. grokking the axis numbering is not that hard,
and 0/1 is pretty concise, which is nice.
axis aliases especially seem like overkill to me, and having canonical names
provides uniformity when reading other people's code.

my 2c.

@jreback
Contributor Author

jreback commented Mar 25, 2013

really

ok

the main issue is that some functions accept 0/1 and also index/columns
while some don't

I have also seen others want to use aliases

maybe I'll just put that up as a recipe then

@ghost

ghost commented Mar 25, 2013

That's a different matter. making the API more uniform - definitely.
but axis aliases - I could see the need for this if we had a 10d panel,
less so for 2/3d...

@jreback
Contributor Author

jreback commented Mar 25, 2013

ok I'll separate this and maybe just leave a recipe (it's trivial really, just set _AXIS_ALIASES)

@ghost

ghost commented Mar 25, 2013

yeah, maybe doing the refactor and putting things in place without exposing
new user-facing APIs would be good. If someone gets a lot of benefit from it,
they can build on it by rolling their own.
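
A "roll your own" version of aliases can live entirely in user code. The sketch below is an assumption for illustration: `MY_AXIS_ALIASES` and `canonical_axis` are hypothetical names, not part of pandas, and the only pandas behavior relied on is that `DataFrame.sum` accepts the canonical name `'columns'`.

```python
import pandas as pd

# Hypothetical user-side alias table (not part of pandas).
MY_AXIS_ALIASES = {"rows": "index", "cols": "columns"}


def canonical_axis(axis):
    """Translate a personal alias to a canonical axis; pass anything else through."""
    return MY_AXIS_ALIASES.get(axis, axis)


df = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})

# The alias is resolved before pandas ever sees it, so no new
# user-facing API is needed in the library itself.
row_totals = df.sum(axis=canonical_axis("cols"))
```

Because the translation happens outside the library, code that is shared with others can still be read with only the canonical names in mind.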

@jreback
Contributor Author

jreback commented Mar 25, 2013

I can easily remove the global set/clear (and make it a recipe), but I still think the local name setting is useful (2nd example)? or does this make it less clear?

In [183]: df = DataFrame(randn(3,2),columns=['c1','c2'],
                   index=['i1','i2','i3'])

In [184]: df.index.name = 'myaxis1'

In [185]: df.columns.name = 'myaxis2'

In [186]: df.sum(axis='myaxis1')
Out[186]: 
myaxis2
c1        -0.168212
c2         0.166038
dtype: float64

In [187]: df.xs('c1', axis='myaxis2')
Out[187]: 
myaxis1
i1         0.758071
i2        -0.548502
i3        -0.377780
Name: c1, dtype: float64

@ghost

ghost commented Mar 25, 2013

ok, take it from another angle: canonical names are unambiguous regardless of
what operations the object goes through. are index names preserved through all
data operations in pandas? if not, then the axis name might disappear
unexpectedly. you might get the "wrong" one when two indices with different
names are involved.

Perhaps names are preserved religiously everywhere, I'm not sure because that's a tough
question to answer with confidence. Merging this opens the door to
all sorts of issues which we can't fully anticipate and which we'll have to implement
workarounds for (if found) since we won't be able to remove it once it's in. and those
workarounds might become ugly. it's similar to the metadata discussion that happened a couple
of months ago.

In that sense, the global aliases are actually a cleaner concept, because they really
are just aliases, not a piece of data tied to a particular object which might mutate
in the future.
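
The concern about names tied to an object can be made concrete with a small sketch. `axis_from_name` is a hypothetical lookup written for this illustration (it is not the code proposed in the PR); the frame reuses the `myaxis1`/`myaxis2` names from the example above.

```python
import pandas as pd


def axis_from_name(df, name):
    """Hypothetical lookup: return the number of the axis whose Index carries `name`."""
    for number, labels in enumerate(df.axes):
        if labels.name == name:
            return number
    raise KeyError(name)


df = pd.DataFrame(
    [[1, 2], [3, 4], [5, 6]],
    index=pd.Index(["i1", "i2", "i3"], name="myaxis1"),
    columns=pd.Index(["c1", "c2"], name="myaxis2"),
)

# The same name points at a different axis number once the frame is transposed.
before = axis_from_name(df, "myaxis1")
after = axis_from_name(df.T, "myaxis1")

# Concatenating frames whose row indexes carry different names leaves the
# combined index without a name, so a name-based lookup has nothing to find.
other = df.rename_axis("something_else")
combined_name = pd.concat([df, other]).index.name
```

Here `before` and `after` differ, and `combined_name` is `None`, so `axis_from_name` would raise on the concatenated frame. A global alias such as `'columns'` has neither problem, because it does not depend on the state of any particular object.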

@jreback
Contributor Author

jreback commented Mar 26, 2013

you are of course right

going to strip out everything but the bug fix

always like features, but then have to support them

@ghost

ghost commented Mar 26, 2013

I'm +1 on the other 150+ PRs you've made this month :)

jreback added a commit that referenced this pull request Mar 26, 2013
BUG: GH3109 fixed issues where passing an axis of 'columns' would fail
@jreback jreback merged commit 9eb4689 into pandas-dev:master Mar 26, 2013
Labels
Bug Enhancement Indexing Related to indexing on series/frames, not to indexes themselves

1 participant