
ENH: Add end and end_day options for origin from resample #37805
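The feature this PR proposes (and which later shipped in pandas 1.3, tracked as GH 37804) lets `resample` anchor its bins to the *end* of the data instead of the start. A minimal sketch of the intended behavior, assuming pandas >= 1.3; `closed` and `label` are passed explicitly for clarity rather than relying on the defaults the option picks:

```python
import pandas as pd

# Seven points, 7 minutes apart: 23:30, 23:37, ..., 00:12 the next day.
idx = pd.date_range("2000-10-01 23:30:00", periods=7, freq="7min")
ts = pd.Series(range(7), index=idx)

# origin="end" phases the 17-minute bin edges backward from the last
# timestamp (00:12), so the final bin closes exactly on the final point.
out = ts.resample("17min", origin="end", closed="right", label="right").sum()
print(out)
# 2000-10-01 23:38:00     1
# 2000-10-01 23:55:00     5
# 2000-10-02 00:12:00    15
```

Going backward from 00:12 the edges are 23:55, 23:38, 23:21, and summing the values 0..6 across those right-closed bins gives 1, 5 and 15; a start-anchored origin generally leaves the last bin overshooting the data instead.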

Closed (wants to merge 44 commits)
Commits (44):
39f1e8f  ENH: Add 'end' option in resample's origin (GYHHAHA, Nov 13, 2020)
cd5aa64  Update resample.py (GYHHAHA, Nov 13, 2020)
0184b1d  Update resample.py (GYHHAHA, Nov 13, 2020)
ff35b6f  Update test_resample_api.py (GYHHAHA, Nov 13, 2020)
8c4549e  Update resample.py (GYHHAHA, Nov 13, 2020)
b835d1a  Update test_resample_api.py (GYHHAHA, Nov 13, 2020)
bf15c67  Update test_resample_api.py (GYHHAHA, Nov 13, 2020)
e4b01d8  Update test_datetime_index.py (GYHHAHA, Nov 13, 2020)
d096ccd  add backward para and end_day option (GYHHAHA, Nov 27, 2020)
222ef8d  add doc-string (GYHHAHA, Nov 27, 2020)
90c9c5f  add test cases (GYHHAHA, Nov 27, 2020)
eae898c  fix format (GYHHAHA, Nov 27, 2020)
2ee1000  Update test_resample_api.py (GYHHAHA, Nov 27, 2020)
3442e00  Update test_resample_api.py (GYHHAHA, Nov 27, 2020)
a33acac  Update test_resample_api.py (GYHHAHA, Nov 27, 2020)
7c54839  Update test_resample_api.py (GYHHAHA, Nov 27, 2020)
a4e0a39  flake8 fix (GYHHAHA, Nov 27, 2020)
0e2e390  break lines (GYHHAHA, Nov 27, 2020)
9f4844a  Update resample.py (GYHHAHA, Nov 27, 2020)
5b7f396  fix docstring (GYHHAHA, Nov 27, 2020)
115c92a  split tests (GYHHAHA, Nov 28, 2020)
7d8d67a  Update generic.py (GYHHAHA, Nov 28, 2020)
77fc4a3  doc added & tests fix (GYHHAHA, Nov 28, 2020)
0cff41e  Merge branch 'master' into master (GYHHAHA, Nov 28, 2020)
b492293  fix doc (GYHHAHA, Nov 28, 2020)
561096c  Merge remote-tracking branch 'upstream/master' (GYHHAHA, Dec 11, 2020)
76a015a  Revert "Merge remote-tracking branch 'upstream/master'" (GYHHAHA, Dec 11, 2020)
a0262ab  Revert "fix doc" (GYHHAHA, Dec 11, 2020)
b990c5f  Revert "Merge branch 'master' into master" (GYHHAHA, Dec 11, 2020)
8e8c1e6  Revert "doc added & tests fix" (GYHHAHA, Dec 11, 2020)
cc9f2e0  Revert "Update generic.py" (GYHHAHA, Dec 11, 2020)
629773a  Revert "split tests" (GYHHAHA, Dec 11, 2020)
c79155b  Revert "fix docstring" (GYHHAHA, Dec 11, 2020)
f46c924  Revert "Update resample.py" (GYHHAHA, Dec 11, 2020)
af99a33  Revert "break lines" (GYHHAHA, Dec 11, 2020)
d7db83b  Revert "flake8 fix" (GYHHAHA, Dec 11, 2020)
69183f6  Revert "Update test_resample_api.py" (GYHHAHA, Dec 11, 2020)
5b9afee  Revert "Update test_resample_api.py" (GYHHAHA, Dec 11, 2020)
216bff3  Revert "Update test_resample_api.py" (GYHHAHA, Dec 11, 2020)
5409a75  Revert "Update test_resample_api.py" (GYHHAHA, Dec 11, 2020)
90ddc36  Revert "fix format" (GYHHAHA, Dec 11, 2020)
7b3cffb  Revert "add test cases" (GYHHAHA, Dec 11, 2020)
2d51a8a  Revert "add doc-string" (GYHHAHA, Dec 11, 2020)
c24d8f9  Revert "add backward para and end_day option" (GYHHAHA, Dec 11, 2020)
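The "end_day" option named in the commits above ("add backward para and end_day option") anchors bins to the ceiling midnight of the last day rather than to the last observation itself. A hedged sketch of the difference, assuming pandas >= 1.3 behavior; note that for a 15-minute rule any midnight anchor puts the edges on clean quarter-hours, which is the point of the variant:

```python
import pandas as pd

idx = pd.date_range("2000-10-01 23:30:00", periods=7, freq="7min")
ts = pd.Series(range(7), index=idx)

# origin="end" would phase 15-minute edges off the last point (00:12),
# giving ragged edges at :12, :57, :42, ...  origin="end_day" anchors
# to the ceiling midnight of the last day instead, so the edges land
# on :00 / :15 / :30 / :45.
out = ts.resample("15min", origin="end_day", closed="right", label="right").sum()
print(out)
# 2000-10-01 23:30:00     0
# 2000-10-01 23:45:00     3
# 2000-10-02 00:00:00     7
# 2000-10-02 00:15:00    11
```

In short, "end" guarantees the last bin closes on the final observation, while "end_day" trades that for calendar-aligned edges.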
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -18,7 +18,7 @@ jobs:
steps:

- name: Setting conda path
-run: echo "${HOME}/miniconda3/bin" >> $GITHUB_PATH
+run: echo "::add-path::${HOME}/miniconda3/bin"

- name: Checkout
uses: actions/checkout@v1
@@ -98,7 +98,7 @@ jobs:
steps:

- name: Setting conda path
-run: echo "${HOME}/miniconda3/bin" >> $GITHUB_PATH
+run: echo "::set-env name=PATH::${HOME}/miniconda3/bin:${PATH}"

- name: Checkout
uses: actions/checkout@v1
1 change: 0 additions & 1 deletion .gitignore
@@ -12,7 +12,6 @@
*.log
*.swp
*.pdb
-*.zip
.project
.pydevproject
.settings
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -26,7 +26,7 @@ repos:
name: isort (cython)
types: [cython]
- repo: https://github.com/asottile/pyupgrade
-rev: v2.7.4
+rev: v2.7.3
hooks:
- id: pyupgrade
args: [--py37-plus]
7 changes: 6 additions & 1 deletion .travis.yml
@@ -35,6 +35,11 @@ matrix:
fast_finish: true

include:
+- dist: bionic
+python: 3.9-dev
+env:
+- JOB="3.9-dev" PATTERN="(not slow and not network and not clipboard)"
+
- env:
- JOB="3.8, slow" ENV_FILE="ci/deps/travis-38-slow.yaml" PATTERN="slow" SQL="1"
services:
@@ -89,7 +94,7 @@ install:
script:
- echo "script start"
- echo "$JOB"
-- source activate pandas-dev
+- if [ "$JOB" != "3.9-dev" ]; then source activate pandas-dev; fi
- ci/run_tests.sh

after_script:
17 changes: 8 additions & 9 deletions Dockerfile
@@ -1,4 +1,4 @@
-FROM quay.io/condaforge/miniforge3
+FROM continuumio/miniconda3

# if you forked pandas, you can pass in your own GitHub username to use your fork
# i.e. gh_username=myname
@@ -15,6 +15,10 @@ RUN apt-get update \
# Verify git, process tools, lsb-release (common in install instructions for CLIs) installed
&& apt-get -y install git iproute2 procps iproute2 lsb-release \
+#
+# Install C compilers (gcc not enough, so just went with build-essential which admittedly might be overkill),
+# needed to build pandas C extensions
+&& apt-get -y install build-essential \
#
# cleanup
&& apt-get autoremove -y \
&& apt-get clean -y \
@@ -35,14 +39,9 @@ RUN mkdir "$pandas_home" \
# we just update the base/root one from the 'environment.yml' file instead of creating a new one.
#
# Set up environment
-RUN conda install -y mamba
-RUN mamba env update -n base -f "$pandas_home/environment.yml"
+RUN conda env update -n base -f "$pandas_home/environment.yml"

# Build C extensions and pandas
-SHELL ["/bin/bash", "-c"]
-RUN . /opt/conda/etc/profile.d/conda.sh \
-&& conda activate base \
-&& cd "$pandas_home" \
-&& export \
-&& python setup.py build_ext -j 4 \
+RUN cd "$pandas_home" \
+&& python setup.py build_ext --inplace -j 4 \
&& python -m pip install -e .
2 changes: 1 addition & 1 deletion Makefile
@@ -9,7 +9,7 @@ clean_pyc:
-find . -name '*.py[co]' -exec rm {} \;

build: clean_pyc
-python setup.py build_ext
+python setup.py build_ext --inplace

lint-diff:
git diff upstream/master --name-only -- "*.py" | xargs flake8
44 changes: 22 additions & 22 deletions README.md
@@ -60,27 +60,27 @@ Here are just a few of the things that pandas does well:
and saving/loading data from the ultrafast [**HDF5 format**][hdfstore]
- [**Time series**][timeseries]-specific functionality: date range
generation and frequency conversion, moving window statistics,
-date shifting and lagging
-
-
-[missing-data]: https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html
-[insertion-deletion]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#column-selection-addition-deletion
-[alignment]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html?highlight=alignment#intro-to-data-structures
-[groupby]: https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#group-by-split-apply-combine
-[conversion]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#dataframe
-[slicing]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#slicing-ranges
-[fancy-indexing]: https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html#advanced
-[subsetting]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-indexing
-[merging]: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html#database-style-dataframe-or-named-series-joining-merging
-[joining]: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html#joining-on-index
-[reshape]: https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html
-[pivot-table]: https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html
-[mi]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#hierarchical-indexing-multiindex
-[flat-files]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#csv-text-files
-[excel]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#excel-files
-[db]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#sql-queries
-[hdfstore]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#hdf5-pytables
-[timeseries]: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#time-series-date-functionality
+date shifting and lagging.
+
+
+[missing-data]: https://pandas.pydata.org/pandas-docs/stable/missing_data.html#working-with-missing-data
+[insertion-deletion]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html#column-selection-addition-deletion
+[alignment]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html?highlight=alignment#intro-to-data-structures
+[groupby]: https://pandas.pydata.org/pandas-docs/stable/groupby.html#group-by-split-apply-combine
+[conversion]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe
+[slicing]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#slicing-ranges
+[fancy-indexing]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#advanced-indexing-with-ix
+[subsetting]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing
+[merging]: https://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging
+[joining]: https://pandas.pydata.org/pandas-docs/stable/merging.html#joining-on-index
+[reshape]: https://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-and-pivot-tables
+[pivot-table]: https://pandas.pydata.org/pandas-docs/stable/reshaping.html#pivot-tables-and-cross-tabulations
+[mi]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#hierarchical-indexing-multiindex
+[flat-files]: https://pandas.pydata.org/pandas-docs/stable/io.html#csv-text-files
+[excel]: https://pandas.pydata.org/pandas-docs/stable/io.html#excel-files
+[db]: https://pandas.pydata.org/pandas-docs/stable/io.html#sql-queries
+[hdfstore]: https://pandas.pydata.org/pandas-docs/stable/io.html#hdf5-pytables
+[timeseries]: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-series-date-functionality

## Where to get it
The source code is currently hosted on GitHub at:
@@ -154,7 +154,7 @@ For usage questions, the best place to go to is [StackOverflow](https://stackove
Further, general questions and discussions can also take place on the [pydata mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata).

## Discussion and Development
-Most development discussions take place on GitHub in this repo. Further, the [pandas-dev mailing list](https://mail.python.org/mailman/listinfo/pandas-dev) can also be used for specialized discussions or design issues, and a [Gitter channel](https://gitter.im/pydata/pandas) is available for quick development related questions.
+Most development discussions take place on github in this repo. Further, the [pandas-dev mailing list](https://mail.python.org/mailman/listinfo/pandas-dev) can also be used for specialized discussions or design issues, and a [Gitter channel](https://gitter.im/pydata/pandas) is available for quick development related questions.

## Contributing to pandas [![Open Source Helpers](https://www.codetriage.com/pandas-dev/pandas/badges/users.svg)](https://www.codetriage.com/pandas-dev/pandas)

12 changes: 0 additions & 12 deletions asv_bench/benchmarks/algorithms.py
@@ -5,7 +5,6 @@
from pandas._libs import lib

import pandas as pd
-from pandas.core.algorithms import make_duplicates_of_left_unique_in_right

from .pandas_vb_common import tm

@@ -175,15 +174,4 @@ def time_argsort(self, N):
self.array.argsort()


-class RemoveDuplicates:
-def setup(self):
-N = 10 ** 5
-na = np.arange(int(N / 2))
-self.left = np.concatenate([na[: int(N / 4)], na[: int(N / 4)]])
-self.right = np.concatenate([na, na])
-
-def time_make_duplicates_of_left_unique_in_right(self):
-make_duplicates_of_left_unique_in_right(self.left, self.right)


from .pandas_vb_common import setup # noqa: F401 isort:skip
43 changes: 0 additions & 43 deletions asv_bench/benchmarks/categoricals.py
@@ -1,5 +1,3 @@
-import string
-import sys
import warnings

import numpy as np
@@ -69,47 +67,6 @@ def time_existing_series(self):
pd.Categorical(self.series)


-class AsType:
-def setup(self):
-N = 10 ** 5
-
-random_pick = np.random.default_rng().choice
-
-categories = {
-"str": list(string.ascii_letters),
-"int": np.random.randint(2 ** 16, size=154),
-"float": sys.maxsize * np.random.random((38,)),
-"timestamp": [
-pd.Timestamp(x, unit="s") for x in np.random.randint(2 ** 18, size=578)
-],
-}
-
-self.df = pd.DataFrame(
-{col: random_pick(cats, N) for col, cats in categories.items()}
-)
-
-for col in ("int", "float", "timestamp"):
-self.df[col + "_as_str"] = self.df[col].astype(str)
-
-for col in self.df.columns:
-self.df[col] = self.df[col].astype("category")
-
-def astype_str(self):
-[self.df[col].astype("str") for col in "int float timestamp".split()]
-
-def astype_int(self):
-[self.df[col].astype("int") for col in "int_as_str timestamp".split()]
-
-def astype_float(self):
-[
-self.df[col].astype("float")
-for col in "float_as_str int int_as_str timestamp".split()
-]
-
-def astype_datetime(self):
-self.df["float"].astype(pd.DatetimeTZDtype(tz="US/Pacific"))
-

class Concat:
def setup(self):
N = 10 ** 5
Expand Down
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/groupby.py
@@ -486,7 +486,7 @@ def setup(self):
tmp2 = (np.random.random(10000) * 10.0).astype(np.float32)
tmp = np.concatenate((tmp1, tmp2))
arr = np.repeat(tmp, 10)
-self.df = DataFrame({"a": arr, "b": arr})
+self.df = DataFrame(dict(a=arr, b=arr))

def time_sum(self):
self.df.groupby(["a"])["b"].sum()