WIP: Make weekday_name field in DatetimeIndex categorical #21177

Closed

sivakar12

@jschendel jschendel added Enhancement Datetime Datetime data dtype Categorical Categorical Data Type labels May 23, 2018
@mroeschke
Member

Thanks for the PR!

weekday_name was deprecated in this latest release, so I think it would be more important to apply this feature to day_name (and month_name). You'll want to make adjustments here: https://github.com/pandas-dev/pandas/blob/v0.23.0/pandas/core/indexes/datetimes.py#L2513-L2537
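
For reference, a minimal sketch of the two methods mentioned above as they behave through the public API. The dates are arbitrary (a week starting Monday, 2018-05-21), and the categorical check is only there to show that the result is a plain label index rather than a categorical:

```python
import pandas as pd

# One full week starting on Monday, 2018-05-21 (arbitrary example dates).
idx = pd.date_range("2018-05-21", periods=7, freq="D")

day_names = idx.day_name()      # method form that replaces weekday_name
month_names = idx.month_name()  # same pattern for months

# Neither result carries a categorical dtype.
is_categorical = isinstance(day_names.dtype, pd.CategoricalDtype)

print(list(day_names))
print(is_categorical)
```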

@@ -93,7 +94,14 @@ def f(self):
result = fields.get_date_field(values, field)
result = self._maybe_mask_results(result, convert='float64')

return Index(result, name=self.name)
if field in ['weekday_name', 'day_name']:
cats = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
Member

These are already defined in the calendar standard library (calendar.day_name) or here:

DAYS_FULL = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',
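
A rough sketch of the first option, taking the category lists from the standard library instead of spelling them out. The variable names are illustrative only, and this assumes the process locale yields the same names that `day_name()` returns (any label not found in the categories would become a missing value):

```python
import calendar

import pandas as pd

# calendar.day_name starts on Monday; calendar.month_name has an empty
# string at index 0, so that entry is skipped.
day_categories = list(calendar.day_name)
month_categories = list(calendar.month_name)[1:]

idx = pd.date_range("2018-05-21", periods=3, freq="D")

# Wrap the string labels in an ordered categorical using those categories.
days = pd.Categorical(idx.day_name(), categories=day_categories, ordered=True)

print(days.categories)
```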

@codecov

codecov bot commented May 23, 2018

Codecov Report

Merging #21177 into master will decrease coverage by 0.08%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master   #21177      +/-   ##
==========================================
- Coverage   91.92%   91.84%   -0.09%     
==========================================
  Files         153      153              
  Lines       49563    49505      -58     
==========================================
- Hits        45559    45466      -93     
- Misses       4004     4039      +35
Flag        Coverage Δ
#multiple   90.23% <0%> (-0.09%) ⬇️
#single     41.88% <0%> (+0.07%) ⬆️

Impacted Files                    Coverage Δ
pandas/io/formats/printing.py     89.38% <0%> (-3.71%) ⬇️
pandas/util/testing.py            84.81% <0%> (-1.16%) ⬇️
pandas/plotting/_core.py          82.39% <0%> (-1.15%) ⬇️
pandas/core/dtypes/missing.py     91.95% <0%> (-0.58%) ⬇️
pandas/core/tools/datetimes.py    84.43% <0%> (-0.55%) ⬇️
pandas/core/dtypes/cast.py        88.06% <0%> (-0.43%) ⬇️
pandas/core/algorithms.py         94.5% <0%> (-0.34%) ⬇️
pandas/core/indexes/category.py   97.03% <0%> (-0.25%) ⬇️
pandas/io/common.py               70.04% <0%> (-0.25%) ⬇️
pandas/io/json/json.py            92.23% <0%> (-0.24%) ⬇️
... and 21 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b36b451...e0451e0.

@@ -93,7 +94,12 @@ def f(self):
result = fields.get_date_field(values, field)
result = self._maybe_mask_results(result, convert='float64')

return Index(result, name=self.name)
if field in ['weekday_name', 'day_name']:
Member

I don't think all this is needed here.

@mroeschke
Member

It's useful to add some tests (even if they fail for now) just to see if the changes are on the right track.

test_script.py Outdated
@@ -0,0 +1,5 @@
import pandas
Member

Instead, add tests in pandas/tests/indexes/datetimes/test_misc.py near the other day_name and month_name tests
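
A sketch of what such a test could look like. `day_name_categorical` is a hypothetical stand-in for the behaviour this PR proposes (it is not a pandas function), the test name is made up, and the category list assumes English day names:

```python
import pandas as pd
import pandas.testing as tm

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def day_name_categorical(index):
    """Hypothetical stand-in for the proposed day_name() return value."""
    return pd.CategoricalIndex(index.day_name(), categories=DAYS,
                               ordered=True, name=index.name)


def test_day_name_is_ordered_categorical():
    # 2018-05-21 is a Monday.
    idx = pd.date_range("2018-05-21", periods=3, freq="D", name="when")
    expected = pd.CategoricalIndex(["Monday", "Tuesday", "Wednesday"],
                                   categories=DAYS, ordered=True,
                                   name="when")
    result = day_name_categorical(idx)
    tm.assert_index_equal(result, expected)
    assert result.ordered
```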

Author

My bad! I didn't mean to stage this file. I'll remove this.

@@ -0,0 +1 @@
!coverage.py: This is a private format, don't read it directly!{"lines":{"/home/sivakar/Projects/pandas/pandas/tslib.py":[],"/home/sivakar/Projects/pandas/pandas/json.py":[],"/home/sivakar/Projects/pandas/pandas/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/_version.py":[],"/home/sivakar/Projects/pandas/pandas/parser.py":[],"/home/sivakar/Projects/pandas/pandas/testing.py":[],"/home/sivakar/Projects/pandas/pandas/lib.py":[],"/home/sivakar/Projects/pandas/pandas/conftest.py":[],"/home/sivakar/Projects/pandas/pandas/core/missing.py":[],"/home/sivakar/Projects/pandas/pandas/core/panel.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexing.py":[],"/home/sivakar/Projects/pandas/pandas/core/resample.py":[],"/home/sivakar/Projects/pandas/pandas/core/series.py":[],"/home/sivakar/Projects/pandas/pandas/core/sorting.py":[],"/home/sivakar/Projects/pandas/pandas/core/internals.py":[],"/home/sivakar/Projects/pandas/pandas/core/config_init.py":[],"/home/sivakar/Projects/pandas/pandas/core/algorithms.py":[],"/home/sivakar/Projects/pandas/pandas/core/generic.py":[],"/home/sivakar/Projects/pandas/pandas/core/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/core/frame.py":[],"/home/sivakar/Projects/pandas/pandas/core/accessor.py":[],"/home/sivakar/Projects/pandas/pandas/core/index.py":[],"/home/sivakar/Projects/pandas/pandas/core/config.py":[],"/home/sivakar/Projects/pandas/pandas/core/categorical.py":[],"/home/sivakar/Projects/pandas/pandas/core/window.py":[],"/home/sivakar/Projects/pandas/pandas/core/apply.py":[],"/home/sivakar/Projects/pandas/pandas/core/base.py":[],"/home/sivakar/Projects/pandas/pandas/core/datetools.py":[],"/home/sivakar/Projects/pandas/pandas/core/common.py":[],"/home/sivakar/Projects/pandas/pandas/core/strings.py":[],"/home/sivakar/Projects/pandas/pandas/core/api.py":[],"/home/sivakar/Projects/pandas/pandas/core/nanops.py":[],"/home/sivakar/Projects/pandas/pandas/core/ops.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/datetimes.py":
[],"/home/sivakar/Projects/pandas/pandas/core/indexes/interval.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/timedeltas.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/numeric.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/datetimelike.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/base.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/period.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/frozen.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/category.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/api.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/accessors.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/multi.py":[],"/home/sivakar/Projects/pandas/pandas/core/indexes/range.py":[],"/home/sivakar/Projects/pandas/pandas/core/util/hashing.py":[],"/home/sivakar/Projects/pandas/pandas/core/util/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/core/dtypes/missing.py":[],"/home/sivakar/Projects/pandas/pandas/core/dtypes/dtypes.py":[],"/home/sivakar/Projects/pandas/pandas/core/dtypes/generic.py":[],"/home/sivakar/Projects/pandas/pandas/core/dtypes/concat.py":[],"/home/sivakar/Projects/pandas/pandas/core/dtypes/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/core/dtypes/base.py":[],"/home/sivakar/Projects/pandas/pandas/core/dtypes/common.py":[],"/home/sivakar/Projects/pandas/pandas/core/dtypes/api.py":[],"/home/sivakar/Projects/pandas/pandas/core/dtypes/inference.py":[],"/home/sivakar/Projects/pandas/pandas/core/dtypes/cast.py":[],"/home/sivakar/Projects/pandas/pandas/core/reshape/tile.py":[],"/home/sivakar/Projects/pandas/pandas/core/reshape/reshape.py":[],"/home/sivakar/Projects/pandas/pandas/core/reshape/concat.py":[],"/home/sivakar/Projects/pandas/pandas/core/reshape/melt.py":[],"/home/sivakar/Projects/pandas/pandas/core/reshape/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/core/resha
pe/merge.py":[],"/home/sivakar/Projects/pandas/pandas/core/reshape/util.py":[],"/home/sivakar/Projects/pandas/pandas/core/reshape/api.py":[],"/home/sivakar/Projects/pandas/pandas/core/reshape/pivot.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/scope.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/expr.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/expressions.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/eval.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/pytables.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/engines.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/check.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/common.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/api.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/ops.py":[],"/home/sivakar/Projects/pandas/pandas/core/computation/align.py":[],"/home/sivakar/Projects/pandas/pandas/core/arrays/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/core/arrays/categorical.py":[],"/home/sivakar/Projects/pandas/pandas/core/arrays/base.py":[],"/home/sivakar/Projects/pandas/pandas/core/sparse/scipy_sparse.py":[],"/home/sivakar/Projects/pandas/pandas/core/sparse/series.py":[],"/home/sivakar/Projects/pandas/pandas/core/sparse/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/core/sparse/frame.py":[],"/home/sivakar/Projects/pandas/pandas/core/sparse/array.py":[],"/home/sivakar/Projects/pandas/pandas/core/sparse/api.py":[],"/home/sivakar/Projects/pandas/pandas/core/groupby/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/core/groupby/groupby.py":[],"/home/sivakar/Projects/pandas/pandas/core/tools/datetimes.py":[],"/home/sivakar/Projects/pandas/pandas/core/tools/timedeltas.py":[],"/home/sivakar/Projects/pandas/pandas/core/tools/numeric.py":[],"/home/sivakar/Projects/pandas/pandas/core/tools/__init_
_.py":[],"/home/sivakar/Projects/pandas/pandas/formats/style.py":[],"/home/sivakar/Projects/pandas/pandas/formats/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/tseries/offsets.py":[],"/home/sivakar/Projects/pandas/pandas/tseries/frequencies.py":[],"/home/sivakar/Projects/pandas/pandas/tseries/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/tseries/holiday.py":[],"/home/sivakar/Projects/pandas/pandas/tseries/api.py":[],"/home/sivakar/Projects/pandas/pandas/tseries/plotting.py":[],"/home/sivakar/Projects/pandas/pandas/tseries/converter.py":[],"/home/sivakar/Projects/pandas/pandas/util/decorators.py":[],"/home/sivakar/Projects/pandas/pandas/util/_tester.py":[],"/home/sivakar/Projects/pandas/pandas/util/_validators.py":[],"/home/sivakar/Projects/pandas/pandas/util/_depr_module.py":[],"/home/sivakar/Projects/pandas/pandas/util/_print_versions.py":[],"/home/sivakar/Projects/pandas/pandas/util/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/util/_decorators.py":[],"/home/sivakar/Projects/pandas/pandas/util/testing.py":[],"/home/sivakar/Projects/pandas/pandas/util/_test_decorators.py":[],"/home/sivakar/Projects/pandas/pandas/util/_doctools.py":[],"/home/sivakar/Projects/pandas/pandas/api/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/api/extensions/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/api/types/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/compat/pickle_compat.py":[],"/home/sivakar/Projects/pandas/pandas/compat/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/compat/chainmap_impl.py":[],"/home/sivakar/Projects/pandas/pandas/compat/chainmap.py":[],"/home/sivakar/Projects/pandas/pandas/compat/numpy/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/compat/numpy/function.py":[],"/home/sivakar/Projects/pandas/pandas/plotting/_converter.py":[],"/home/sivakar/Projects/pandas/pandas/plotting/_core.py":[],"/home/sivakar/Projects/pandas/pandas/plotting/_compat.py":[],"/home/sivakar/Projects/pandas/pandas/plotting/_tools.py"
:[],"/home/sivakar/Projects/pandas/pandas/plotting/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/plotting/_timeseries.py":[],"/home/sivakar/Projects/pandas/pandas/plotting/_misc.py":[],"/home/sivakar/Projects/pandas/pandas/plotting/_style.py":[],"/home/sivakar/Projects/pandas/pandas/errors/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/computation/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/computation/expressions.py":[],"/home/sivakar/Projects/pandas/pandas/tools/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/tools/merge.py":[],"/home/sivakar/Projects/pandas/pandas/tools/plotting.py":[],"/home/sivakar/Projects/pandas/pandas/types/concat.py":[],"/home/sivakar/Projects/pandas/pandas/types/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/types/common.py":[],"/home/sivakar/Projects/pandas/pandas/io/pickle.py":[],"/home/sivakar/Projects/pandas/pandas/io/parsers.py":[],"/home/sivakar/Projects/pandas/pandas/io/s3.py":[],"/home/sivakar/Projects/pandas/pandas/io/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/io/clipboards.py":[],"/home/sivakar/Projects/pandas/pandas/io/date_converters.py":[],"/home/sivakar/Projects/pandas/pandas/io/stata.py":[],"/home/sivakar/Projects/pandas/pandas/io/parquet.py":[],"/home/sivakar/Projects/pandas/pandas/io/pytables.py":[],"/home/sivakar/Projects/pandas/pandas/io/packers.py":[],"/home/sivakar/Projects/pandas/pandas/io/gbq.py":[],"/home/sivakar/Projects/pandas/pandas/io/excel.py":[],"/home/sivakar/Projects/pandas/pandas/io/feather_format.py":[],"/home/sivakar/Projects/pandas/pandas/io/html.py":[],"/home/sivakar/Projects/pandas/pandas/io/common.py":[],"/home/sivakar/Projects/pandas/pandas/io/api.py":[],"/home/sivakar/Projects/pandas/pandas/io/sql.py":[],"/home/sivakar/Projects/pandas/pandas/io/formats/latex.py":[],"/home/sivakar/Projects/pandas/pandas/io/formats/css.py":[],"/home/sivakar/Projects/pandas/pandas/io/formats/printing.py":[],"/home/sivakar/Projects/pandas/pandas/io/formats/csvs.py":[],"/h
ome/sivakar/Projects/pandas/pandas/io/formats/format.py":[],"/home/sivakar/Projects/pandas/pandas/io/formats/style.py":[],"/home/sivakar/Projects/pandas/pandas/io/formats/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/io/formats/console.py":[],"/home/sivakar/Projects/pandas/pandas/io/formats/excel.py":[],"/home/sivakar/Projects/pandas/pandas/io/formats/html.py":[],"/home/sivakar/Projects/pandas/pandas/io/formats/terminal.py":[],"/home/sivakar/Projects/pandas/pandas/io/msgpack/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/io/msgpack/_version.py":[],"/home/sivakar/Projects/pandas/pandas/io/msgpack/exceptions.py":[],"/home/sivakar/Projects/pandas/pandas/io/json/json.py":[],"/home/sivakar/Projects/pandas/pandas/io/json/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/io/json/table_schema.py":[],"/home/sivakar/Projects/pandas/pandas/io/json/normalize.py":[],"/home/sivakar/Projects/pandas/pandas/io/sas/sas_xport.py":[],"/home/sivakar/Projects/pandas/pandas/io/sas/sas7bdat.py":[],"/home/sivakar/Projects/pandas/pandas/io/sas/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/io/sas/sas_constants.py":[],"/home/sivakar/Projects/pandas/pandas/io/sas/sasreader.py":[],"/home/sivakar/Projects/pandas/pandas/io/clipboard/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/io/clipboard/clipboards.py":[],"/home/sivakar/Projects/pandas/pandas/io/clipboard/windows.py":[],"/home/sivakar/Projects/pandas/pandas/io/clipboard/exceptions.py":[],"/home/sivakar/Projects/pandas/pandas/_libs/__init__.py":[],"/home/sivakar/Projects/pandas/pandas/_libs/tslibs/__init__.py":[]}}
Member

Oops. Please be sure not to commit any extraneous files such as this one.

Author

Sorry! Did not notice that. Thought .gitignore would handle these things.
Shall I amend the last commit or create a new commit deleting the file?

Member

In your next commit you can simply uncommit or delete this file.

@sivakar12
Author

I find that I have to change the data type in a lot of existing tests by setting ordered and categories properties.
How about creating a custom categorical dtype for month_name and day_name as in https://pandas.pydata.org/pandas-docs/stable/generated/pandas.api.types.CategoricalDtype.html?
If so where can this type be put?
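
To make the question concrete, a minimal sketch of a shared dtype built with `CategoricalDtype`. The constant name `DAY_NAME_DTYPE` is hypothetical, and where such a constant would live in pandas is exactly the open question above:

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical shared constant; assumes English day names.
DAY_NAME_DTYPE = CategoricalDtype(
    categories=["Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday"],
    ordered=True,
)

idx = pd.date_range("2018-05-25", periods=4, freq="D")  # Fri, Sat, Sun, Mon

# The same dtype object can be applied to an Index or to a Series,
# so tests can compare against one definition.
index_days = idx.day_name().astype(DAY_NAME_DTYPE)
series_days = pd.Series(idx).dt.day_name().astype(DAY_NAME_DTYPE)

print(index_days.dtype == DAY_NAME_DTYPE)
print(series_days.dtype == DAY_NAME_DTYPE)
```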

@mroeschke
Member

Go ahead and change the dtype in the existing tests. Since this is an API change, the existing tests will need to be adjusted.

@pep8speaks

pep8speaks commented May 26, 2018

Hello @sivakar12! Thanks for updating the PR.

Cheers! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on June 20, 2018 at 18:39 Hours UTC

@sivakar12 sivakar12 force-pushed the weekday-as-categorical branch 2 times, most recently from 2439396 to b13a7c1 Compare May 26, 2018 11:41
@jreback
Contributor

jreback commented Jun 19, 2018

can you rebase & fixup

@sivakar12 sivakar12 force-pushed the weekday-as-categorical branch from b13a7c1 to e0451e0 Compare June 20, 2018 18:39
@sivakar12
Author

OK. I've rebased

Contributor

@jreback jreback left a comment

failing the build - pls investigate.
add a subsection in 0.24.0, showing the previous and the new behavior in enhancements.
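
A sketch of the kind of previous/new comparison such a subsection might show. Since this PR was not merged, `proposed` is built by hand here to stand in for the new behaviour; the dates and the English category list are assumptions:

```python
import pandas as pd

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Friday, Monday, Wednesday (arbitrary example dates).
idx = pd.DatetimeIndex(["2018-05-25", "2018-05-21", "2018-05-23"])

# Previous behaviour: plain string labels, which sort alphabetically.
previous = idx.day_name()

# Proposed behaviour (stand-in): ordered categorical, sorts in calendar order.
proposed = pd.CategoricalIndex(previous, categories=DAYS, ordered=True)

previous_sorted = list(previous.sort_values())
proposed_sorted = list(proposed.sort_values())

print(previous_sorted)  # ['Friday', 'Monday', 'Wednesday']
print(proposed_sorted)  # ['Monday', 'Wednesday', 'Friday']
```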

@@ -2537,7 +2539,8 @@ def day_name(self, locale=None):
         result = fields.get_date_name_field(values, 'day_name',
                                             locale=locale)
         result = self._maybe_mask_results(result)
-        return Index(result, name=self.name)
+        return CategoricalIndex(result, ordered=True, name=self.name,
Contributor

this must have broken some tests as well
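
One case existing tests would likely need to cover is missing timestamps. The sketch below, using only public API and a hand-built `CategoricalIndex` as a stand-in for the changed return value, shows how a NaT entry might carry through; the category list again assumes English names:

```python
import pandas as pd

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Monday, a missing timestamp (NaT), Wednesday.
idx = pd.DatetimeIndex(["2018-05-21", None, "2018-05-23"])

names = idx.day_name()  # the NaT position becomes a missing label
cat = pd.CategoricalIndex(names, categories=DAYS, ordered=True)

# In a categorical, a missing value is stored as code -1.
codes = list(cat.codes)
missing = cat.isna().tolist()

print(codes)
print(missing)
```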

@jreback
Contributor

jreback commented Oct 11, 2018

closing as stale, if you want to continue working, pls ping and we can re-open. you will need to merge master.

@jreback jreback closed this Oct 11, 2018
Labels
Categorical Categorical Data Type Datetime Datetime data dtype Enhancement

Successfully merging this pull request may close these issues.

ENH: return .dt.weekday/isoweekday/month_name/day_name as ordered categoricals
5 participants