Explain filtering of rows and selection of columns in a more Pandas-centric way #34
Comments
Agree, we should probably fix this. It'll need some thought on how to do it right though.
I agree it is important to say this in the right way and I think it is worth putting time into because it can be confusing for students if we explain this central part in an unclear way. I think we can avoid |
I like that strategy for I think I will still explain It's easy enough to do this (row position vs index) |
I will try to incorporate this into my current pass because we definitely have to get this right the first time. I'm actually going to relabel this as a bug because it's so off currently.
A solution to this is |
Hmm...OK, how about this: In chapter 1, we will only teach Ch 3 we should give a deeper introduction to these though, including some of the trickiness, e.g. how |
Yeah, I think it makes sense to start with an easier introduction that skips some of the details. Actually, it would make sense to change the current order of the two paragraphs and teach selecting columns before filtering rows because it is more straightforward to just type in the name of a column inside I think this can be a good intro level of explanation:
The remaining operations are selecting columns by number and rows by name, but I don't think we teach that in the R version of the course either.
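For reference, a minimal sketch of those two remaining operations, using standard pandas indexers. The data frame and its labels are hypothetical, not taken from the course materials.

```python
import pandas as pd

# Hypothetical toy data with string row labels (assumption for illustration).
df = pd.DataFrame(
    {"population": [2.6, 6.2], "province": ["BC", "ON"]},
    index=["Vancouver", "Toronto"],
)

# Select a column by number (position) with .iloc[]:
# all rows, first column.
first_col = df.iloc[:, 0]

# Select a row by name (index label) with .loc[].
toronto = df.loc["Toronto"]
```

Here `.iloc[]` is purely positional and `.loc[]` is purely label-based, which is the row position vs index distinction raised earlier in the thread.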
@joelostblom my plan for Ch1 is:
Then in chapter 3, we can go into the intricacies of indices, the full generality of I mentioned this as well in #39 I think this is the most natural way to get students going without getting bogged down in slicing/indexing/blah blah blah in their first lecture (one caveat: I do need to check the worksheets and tutorials in week 1+2 to see if it's possible to get away with this) |
I made some comments on the specifics of this approach directly in #48. I agree with the general idea of getting students going without being bogged down in details.
The introduction of `[]` for row filtering and `.loc[]` for column filtering seems like a translation of `filter` and `select` from the tidyverse, but in Pandas `[]` can be used for either columns (by names) or rows (by slices or boolean masks), whereas `.loc[]` is used when both rows and columns are to be selected at the same time to avoid ambiguous chained operations.
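A minimal sketch of the distinction described above, assuming a small hypothetical data frame (the column names and values are made up for illustration and do not come from the course text):

```python
import pandas as pd

# Hypothetical toy data (assumption for illustration only).
df = pd.DataFrame(
    {
        "city": ["Vancouver", "Toronto", "Montreal"],
        "population": [2.6, 6.2, 4.3],
        "province": ["BC", "ON", "QC"],
    }
)

# [] with a list of names selects columns.
cols = df[["city", "population"]]

# [] with a boolean mask filters rows.
big = df[df["population"] > 4]

# [] with a slice selects rows by position.
first_two = df[0:2]

# .loc[] takes the row selector and the column selector together,
# which avoids the chained form df[df["population"] > 4]["city"].
big_cities = df.loc[df["population"] > 4, "city"]
```

The same `[]` operator dispatches on what it is given (names, a mask, or a slice), which is why mapping it only to row filtering can mislead students coming from `filter` and `select`.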