DOC: Add Comparison with Excel documentation #23042
Codecov Report
@@            Coverage Diff            @@
##           master   #23042   +/-   ##
==========================================
+ Coverage   92.19%   92.19%   +<.01%
==========================================
  Files         169      169
  Lines       50911    50873     -38
==========================================
- Hits        46939    46904     -35
+ Misses       3972     3969      -3
Looks good. I added some comments that I think could make the examples better.
Maybe another recipe for Excel users could be a simple example of DataFrame.pivot_table()? If you think that would be useful, it'd probably be good to link to the reshaping documentation, where pivot tables are explained in more detail.
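A minimal sketch of what such a pivot table recipe might look like. The `sales` data and its column names are invented for illustration and are not taken from the PR:

```python
import pandas as pd

# Hypothetical sales data, invented for this example.
sales = pd.DataFrame({
    'region': ['North', 'North', 'South', 'South', 'South'],
    'product': ['Apples', 'Pears', 'Apples', 'Apples', 'Pears'],
    'units': [10, 4, 7, 3, 5],
})

# Rows = region, columns = product, cells = summed units,
# roughly what dragging fields into an Excel PivotTable produces.
table = sales.pivot_table(index='region', columns='product',
                          values='units', aggfunc='sum')
print(table)
```

Here `aggfunc='sum'` plays the role of Excel's "Sum of units" setting; other aggregations such as `'mean'` or `'count'` can be passed the same way.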
doc/source/cookbook.rst
Outdated
.. ipython:: python

   df = pd.DataFrame({'AAA': [1]*8, 'BBB': list(range(0,8))}); df
I prefer examples that look somewhat realistic. I think that makes it easier for users to understand what's going on.
See for example: https://github.com/pandas-dev/pandas/blob/master/pandas/core/generic.py#L789
Also, make sure that the code in the examples follows PEP-8 (there are missing spaces around * and after the comma in range).
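For reference, a PEP-8 formatted version of the line under review might look like the sketch below. It only fixes spacing and splits the trailing `; df` onto its own line; the data itself is unchanged:

```python
import pandas as pd

# Same data as the reviewed line, with spaces around * and after commas.
df = pd.DataFrame({'AAA': [1] * 8, 'BBB': list(range(0, 8))})
print(df)
```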
doc/source/cookbook.rst
Outdated
df = pd.DataFrame({'AAA': [1]*8, 'BBB': list(range(0,8))}); df

# Fill numbers with difference 4 starting from 1
# in rows 2 to 5 in column AAA.
I think it'd be better to have all the explanations in the text before the Python block, instead of comments.
doc/source/cookbook.rst
Outdated
# Fill numbers with difference 4 starting from 1
# in rows 2 to 5 in column AAA.

df.iloc[2:(5+1)].AAA = [ x*4 + 2 for x in range(0, len(df.iloc[2:(5+1)])) ]
I find this a bit too complex for a recipe. I'd prefer the case where in Excel you've got a 1 in the first cell, and you drag it to create a sequence. I think that case is more common, and the code will be much easier to read. What do you think?
Dragging a one down will just fill ones in Excel. I'm trying to show that they can insert whatever series they want, irrespective of how complex it is. But you're right, I should change it to something simpler and just say in the text that more can be done.
hehe, let's say shift+dragging, so it's 1, 2, 3... :) Maybe you can show both examples, starting with the simplest. I just think showing the current code directly will feel a bit too scary for an average pandas user.
I agree. I'll change it.
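A rough sketch of the two fill-handle cases discussed in this thread, simplest first. This is one possible rewrite, not the code that ended up in the PR. It assigns through `.loc` with a row and column selector instead of `df.iloc[...].AAA = ...`, since chained assignment may write to a temporary copy rather than to `df`:

```python
import pandas as pd

df = pd.DataFrame({'AAA': [1] * 8, 'BBB': list(range(0, 8))})

# Simplest case: a 1, 2, 3, ... sequence down the whole column.
df['AAA'] = list(range(1, len(df) + 1))

# Arithmetic series: start at 1, step 4, written into rows 2 to 5.
# .loc slices by label and includes both endpoints.
start, step = 1, 4
rows = df.loc[2:5].index
df.loc[rows, 'AAA'] = [start + step * i for i in range(len(rows))]
print(df)
```

Any other series can be written the same way by changing the list comprehension on the last assignment.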
btw, I don't use Excel that much myself, but maybe Excel users would also appreciate a short example of a search/replace equivalent?
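One way a search/replace equivalent might look, as a sketch with invented data. `Series.replace` matches whole cell values, while `Series.str.replace` matches substrings inside text cells:

```python
import pandas as pd

# Hypothetical data, invented for this example.
df = pd.DataFrame({'city': ['NYC', 'Boston', 'NYC'],
                   'note': ['ship to NYC', 'hold', 'NYC pickup']})

# Whole-cell match, like Find & Replace with
# "Match entire cell contents" ticked.
df['city'] = df['city'].replace('NYC', 'New York')

# Substring match inside text cells; regex=False treats the
# search term as a literal string.
df['note'] = df['note'].str.replace('NYC', 'New York', regex=False)
print(df)
```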
Yeah, maybe using apply().
I was also thinking of pivot tables, dropping duplicates, formulae, adding a row or multiple rows, random data, and VLOOKUP.
Excel makes some plots really easy; should that be included?
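Two of the items on that list, Remove Duplicates and VLOOKUP, could be sketched as below. The `orders` and `prices` frames are hypothetical, and a left merge is only one common way to approximate VLOOKUP:

```python
import pandas as pd

# Hypothetical data, invented for this example.
orders = pd.DataFrame({'order_id': [1, 2, 2, 3],
                       'sku': ['A1', 'B2', 'B2', 'C3']})
prices = pd.DataFrame({'sku': ['A1', 'B2'], 'price': [2.5, 4.0]})

# Remove Duplicates: keep the first of each fully identical row.
orders = orders.drop_duplicates().reset_index(drop=True)

# VLOOKUP-like lookup: a left join keeps every order and fills
# NaN where the SKU has no entry in the price table.
priced = orders.merge(prices, on='sku', how='left')
print(priced)
```

Unlike VLOOKUP, which returns only the first match, a merge returns one row per match, so the lookup table should have unique keys if a one-to-one result is expected.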
That sounds good, yes, but then let's move this from the cookbook to a new documentation page "Comparison with Excel". You can find pages for Comparison with SQL, SAS and Stata already (check the end of the left sidebar here: http://pandas.pydata.org/pandas-docs/stable/)
3dc8539 to 645e4e4
645e4e4 to c84405f
@@ -0,0 +1,121 @@
.. currentmodule:: pandas
.. _compare_with_excel:
I think it might be OK to link to the Excel docs for these functions and/or include a screenshot showing this?
I couldn't find a user manual or docs for Excel. I could only find this quick start guide. I'll try to come up with pictures and a GIF for the fill handle.
Should I be using .. image:: abc.gif to attach the GIF?
Here are a few links to docs which might help with the Excel user manual hunt!
University of Arizona's bare-bones (but functional) walk-through. It offers some examples, which seem like they could help with documentation written for an Excel-to-pandas comparison.
Towson's counterpart does a more extensive job and includes examples of including an image background (albeit using Bing). Two birds, one stone?
Hope this helps!
Closing as stale, though this would definitely be a nice documentation update. If you'd like to pick it back up please ping
closes #22993
Could someone give me more suggestions of Excel functions that are used often?