DOC: Add Comparison with Excel documentation #23042

Closed
wants to merge 1 commit into from


rotuna commented Oct 8, 2018

closes #22993

Could someone give me more suggestions of Excel functions that are used often?

codecov bot commented Oct 8, 2018

Codecov Report

Merging #23042 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #23042      +/-   ##
==========================================
+ Coverage   92.19%   92.19%   +<.01%     
==========================================
  Files         169      169              
  Lines       50911    50873      -38     
==========================================
- Hits        46939    46904      -35     
+ Misses       3972     3969       -3
Flag Coverage Δ
#multiple 90.61% <ø> (-0.01%) ⬇️
#single 42.32% <ø> (+0.01%) ⬆️
Impacted Files Coverage Δ
pandas/core/arrays/period.py 92.04% <0%> (-0.82%) ⬇️
pandas/core/arrays/base.py 95.71% <0%> (-0.18%) ⬇️
pandas/core/arrays/datetimes.py 96.82% <0%> (-0.1%) ⬇️
pandas/core/dtypes/dtypes.py 95.56% <0%> (-0.03%) ⬇️
pandas/io/pytables.py 92.44% <0%> (-0.01%) ⬇️
pandas/io/packers.py 88.04% <0%> (ø) ⬆️
pandas/core/dtypes/base.py 100% <0%> (ø) ⬆️
pandas/core/generic.py 96.65% <0%> (ø) ⬆️
pandas/core/arrays/datetimelike.py 95.56% <0%> (ø) ⬆️
pandas/core/indexes/range.py 95.73% <0%> (ø) ⬆️
... and 5 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1a61e26...c84405f. Read the comment docs.

Member

datapythonista left a comment

Looks good, I added some comments that I think could make the examples better.

Maybe another recipe for Excel users could be a simple example of DataFrame.pivot_table()? If you think that would be useful, it'd probably be good to link to the reshaping documentation, where pivot tables are explained in more detail.
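As a rough illustration of the kind of pivot-table recipe suggested here, the following is a minimal sketch. The `sales` data and its column names are invented for the example and do not come from the PR.

```python
import pandas as pd

# Hypothetical sales data, invented for illustration.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "product": ["Apples", "Pears", "Apples", "Pears"],
    "units": [10, 4, 7, 9],
})

# Rows = region, columns = product, cell values = summed units,
# similar to dragging fields into an Excel PivotTable.
summary = sales.pivot_table(index="region", columns="product",
                            values="units", aggfunc="sum")
print(summary)
```

Changing `aggfunc` (for example to `"mean"`) would correspond to changing the "Summarize values by" setting in Excel.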


.. ipython:: python

df = pd.DataFrame({'AAA': [1]*8, 'BBB': list(range(0,8))}); df

Member

I prefer examples that look somewhat realistic. I think that makes it easier for users to understand what's going on.

See for example: https://github.com/pandas-dev/pandas/blob/master/pandas/core/generic.py#L789

Also, make sure that the code in the examples follows PEP-8 (there are missing spaces around * and after the comma in range).
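One possible shape for a more realistic, PEP-8-style example is sketched below. The inventory table and its column names are assumptions made up for illustration, not what the PR ended up using.

```python
import pandas as pd

# Hypothetical inventory table, invented for illustration; note the
# spaces around operators and after commas, as PEP 8 recommends.
df = pd.DataFrame({
    "item": ["pen", "pencil", "eraser", "ruler"],
    "quantity": [12, 30, 8, 5],
    "unit_price": [1.5, 0.5, 0.75, 2.0],
})

# A derived column, comparable to a formula filled down a column.
df["total"] = df["quantity"] * df["unit_price"]
print(df)
```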

df = pd.DataFrame({'AAA': [1]*8, 'BBB': list(range(0,8))}); df

# Fill numbers with difference 4 starting from 1
# in rows 2 to 5 in column AAA.

Member

I think it'd be better to have all the explanations in the text before the Python block, instead of comments.

# Fill numbers with difference 4 starting from 1
# in rows 2 to 5 in column AAA.

df.iloc[2:(5+1)].AAA = [ x*4 + 2 for x in range(0, len(df.iloc[2:(5+1)])) ]

Member

I find this a bit too complex for a recipe. I'd prefer the case where, in Excel, you've got a 1 in the first cell and you drag it to create a sequence. I think that case is more common, and the code will be much easier to read. What do you think?

Author

Dragging a one down will just fill in ones in Excel. I'm trying to show that they can insert whatever series they want, irrespective of how complex it is. But you're right, I should change it to something simpler and just say in the text that more can be done.

Member

Hehe, let's say shift+dragging, so it's 1, 2, 3... :) Maybe you can show both examples, starting with the simplest. I just think that showing the current code directly will feel a bit too scary for an average pandas user.
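A minimal sketch of the two cases discussed in this thread, simplest first, might look like the following. It reuses the `AAA`/`BBB` frame from the diff and follows the code comment's description (a step of 4 starting from 1 in rows 2 to 5); the use of `.loc` here is one possible approach, not necessarily what the author intended.

```python
import pandas as pd

df = pd.DataFrame({"AAA": [1] * 8, "BBB": list(range(8))})

# Simplest case: a 1, 2, 3, ... sequence down the whole column.
df["AAA"] = list(range(1, len(df) + 1))

# Stepped series: 1, 5, 9, 13 in rows 2 to 5 (inclusive) of column AAA.
# .loc is label-based and includes both endpoints; using a single .loc
# call also avoids chained assignment.
df.loc[2:5, "AAA"] = [1 + 4 * i for i in range(4)]
print(df)
```

Writing through a single `.loc[rows, column]` indexer assigns into the original frame, whereas `df.iloc[...].AAA = ...` may operate on a temporary copy.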

Author

I agree. I'll change it.

Member

By the way, I don't use Excel that much myself, but maybe Excel users would also appreciate a short example of a search/replace equivalent?
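A search/replace equivalent could be sketched roughly as follows, using `Series.replace` for whole-cell matches and `Series.str.replace` for substrings. The data is invented for illustration.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "city": ["NYC", "Boston", "NYC"],
    "note": ["late order", "on time", "late refund"],
})

# Whole-cell match, like Find & Replace with "Match entire cell contents".
df["city"] = df["city"].replace("NYC", "New York")

# Substring match within each string; regex=False treats the pattern
# as a literal string rather than a regular expression.
df["note"] = df["note"].str.replace("late", "delayed", regex=False)
print(df)
```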

Author

Yeah, maybe using apply()

I was thinking of pivot tables, dropping duplicates, formulae, adding one or more rows, random data, and VLOOKUP.

Excel makes some plots really easy; should that be included?
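Three of the ideas listed above (dropping duplicates, a VLOOKUP-style lookup, and a formula column) could be sketched as below. The `orders` and `prices` tables are hypothetical, and a left merge is only one common way to approximate VLOOKUP.

```python
import pandas as pd

# Hypothetical order and price tables, invented for illustration.
orders = pd.DataFrame({
    "product": ["pen", "pencil", "pen", "ruler"],
    "quantity": [2, 5, 2, 1],
})
prices = pd.DataFrame({
    "product": ["pen", "pencil", "ruler"],
    "unit_price": [1.5, 0.5, 2.0],
})

# "Remove Duplicates": drop rows that repeat an earlier row exactly.
orders = orders.drop_duplicates().reset_index(drop=True)

# VLOOKUP-style lookup: a left merge pulls unit_price in by product.
orders = orders.merge(prices, on="product", how="left")

# Formula column, like =B2*C2 filled down.
orders["total"] = orders["quantity"] * orders["unit_price"]
print(orders)
```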

Member

That sounds good, yes, but then let's move this from the cookbook to a new documentation page "Comparison with Excel". You can find pages for Comparison with SQL, SAS and Stata already (check the end of the left sidebar here: http://pandas.pydata.org/pandas-docs/stable/)

datapythonista changed the title from "Cookbook excel" to "DOC: Add Comparison with Excel documentation" on Oct 9, 2018
@@ -0,0 +1,121 @@
.. currentmodule:: pandas
.. _compare_with_excel:

Contributor

I think it might be OK to either link to the Excel docs for these functions and/or include a screenshot showing this?

Author

I couldn't find a user manual or docs for Excel. I could only find this quick start guide. I'll try to come up with pictures and a GIF for the fill handle.

Should I be using .. image:: abc.gif to attach a GIF?

Remnan13 commented Oct 17, 2018

Here are a few links to docs which might help with the excel user manual hunt!

University of Arizona's bare-bones (but functional) walk-through. It offers some examples, which seem like they could help with documentation written for an Excel-to-pandas comparison.

Towson's counterpart does a more extensive job and includes examples of including an image background (albeit using Bing). Two birds, one stone?

Hope this helped!

Member

WillAyd commented Nov 23, 2018

Closing as stale, though this would definitely be a nice documentation update. If you'd like to pick it back up please ping

@WillAyd WillAyd closed this Nov 23, 2018
@afeld afeld mentioned this pull request Dec 18, 2020
26 tasks
@afeld afeld mentioned this pull request Jan 6, 2021
17 tasks
Successfully merging this pull request may close these issues.

Equivalent to the Excel fill handle
5 participants