PERF: Construction of a DatetimeIndex from a list of Timestamp with timezone #51247

Merged
merged 22 commits into from
Mar 15, 2023
Changes from 18 commits
2 changes: 2 additions & 0 deletions ci/deps/actions-310-numpydev.yaml
@@ -18,9 +18,11 @@ dependencies:
- python-dateutil
- pytz
- pip

- pip:
- "cython"
- "--extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple"
- "--pre"
- "numpy"
- "scipy"
- "tzdata>=2022.1; platform_system=='Windows'"
4 changes: 3 additions & 1 deletion ci/deps/actions-310.yaml
@@ -49,8 +49,10 @@ dependencies:
- scipy>=1.7.1
- sqlalchemy>=1.4.16
- tabulate>=0.8.9
- tzdata>=2022a
- xarray>=0.21.0
- xlrd>=2.0.1
- xlsxwriter>=1.4.3
- zstandard>=0.15.2

- pip:
- tzdata>=2022.1; platform_system=="Windows"
Member

@lithomas1 Mar 13, 2023

It's probably better to be consistent and go with the pip installed tzdata everywhere, so we don't get surprises if/when Github updates its runner images.

I guess there's a risk of things breaking for people without the tzdata package installed, but that's more of a zoneinfo/Python stdlib problem.
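
For context on the zoneinfo point: the standard-library `zoneinfo` module looks for the system IANA database first and falls back to the PyPI `tzdata` package, so a machine with neither cannot resolve any key. A minimal sketch of a probe for that situation follows; the helper name `can_load_zone` is hypothetical and is not part of pandas or this PR.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def can_load_zone(key: str) -> bool:
    """Return True if zoneinfo can resolve `key` on this machine."""
    try:
        ZoneInfo(key)
    except ZoneInfoNotFoundError:
        # Raised when the key is in neither the system database nor tzdata.
        return False
    return True


if __name__ == "__main__":
    # The result depends on the machine: system tz database and/or tzdata.
    for key in ("UTC", "Europe/London"):
        print(key, can_load_zone(key))
```

On a stock Windows install without `tzdata`, a probe like this would be expected to report False even for common keys, which is the failure mode the pip-installed dependency is meant to avoid.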

Member Author

agree, but isn't that what I've done (i.e. pip-installed tzdata)?

Member

I meant on all platforms not just Windows, sorry if I was unclear.

Member Author

ok, sure, I've removed the Windows-specific condition, and have installed/required it everywhere

4 changes: 3 additions & 1 deletion ci/deps/actions-311.yaml
@@ -49,8 +49,10 @@ dependencies:
- scipy>=1.7.1
- sqlalchemy>=1.4.16
- tabulate>=0.8.9
- tzdata>=2022a
- xarray>=0.21.0
- xlrd>=2.0.1
- xlsxwriter>=1.4.3
- zstandard>=0.15.2

- pip:
- tzdata>=2022.1; platform_system=="Windows"
3 changes: 3 additions & 0 deletions ci/deps/actions-38-downstream_compat.yaml
@@ -68,3 +68,6 @@ dependencies:
- pandas-gbq>=0.15.0
- pyyaml
- py

- pip:
- tzdata>=2022.1; platform_system=="Windows"
2 changes: 1 addition & 1 deletion ci/deps/actions-38-minimum_versions.yaml
@@ -52,11 +52,11 @@ dependencies:
- scipy=1.7.1
- sqlalchemy=1.4.16
- tabulate=0.8.9
- tzdata=2022a
- xarray=0.21.0
- xlrd=2.0.1
- xlsxwriter=1.4.3
- zstandard=0.15.2

- pip:
- pyqt5==5.15.1
- tzdata==2022.1; platform_system=="Windows"
3 changes: 3 additions & 0 deletions ci/deps/actions-38.yaml
@@ -53,3 +53,6 @@ dependencies:
- xlrd>=2.0.1
- xlsxwriter>=1.4.3
- zstandard>=0.15.2

- pip:
- tzdata>=2022.1; platform_system=="Windows"
4 changes: 3 additions & 1 deletion ci/deps/actions-39.yaml
@@ -49,8 +49,10 @@ dependencies:
- scipy>=1.7.1
- sqlalchemy>=1.4.16
- tabulate>=0.8.9
- tzdata>=2022a
- xarray>=0.21.0
- xlrd>=2.0.1
- xlsxwriter>=1.4.3
- zstandard>=0.15.2

- pip:
- tzdata>=2022.1; platform_system=="Windows"
3 changes: 3 additions & 0 deletions ci/deps/actions-pypy-38.yaml
@@ -22,3 +22,6 @@ dependencies:
- numpy
- python-dateutil
- pytz

- pip:
- tzdata>=2022.1; platform_system=="Windows"
2 changes: 1 addition & 1 deletion ci/test_wheels_windows.bat
@@ -3,7 +3,7 @@ pd.test(extra_args=['-m not clipboard and not single_cpu and not slow and not ne
pd.test(extra_args=['-m not clipboard and single_cpu and not slow and not network and not db'])

python --version
- pip install pytz six numpy python-dateutil
+ pip install pytz six numpy python-dateutil tzdata>=2022.1
pip install hypothesis>=6.34.2 pytest>=7.0.0 pytest-xdist>=2.2.0 pytest-asyncio>=0.17
pip install --find-links=pandas/dist --no-index pandas
python -c "%test_command%"
19 changes: 0 additions & 19 deletions doc/source/getting_started/install.rst
@@ -308,25 +308,6 @@ Dependency Minimum Version pip ext
`numba <https://github.com/numba/numba>`__ 0.53.1 performance Alternative execution engine for operations that accept ``engine="numba"`` using a JIT compiler that translates Python functions to optimized machine code using the LLVM compiler.
===================================================== ================== ================== ===================================================================================================================================================================================

Timezones
^^^^^^^^^

Installable with ``pip install "pandas[timezone]"``

========================= ========================= =============== =============================================================
Dependency Minimum Version pip extra Notes
========================= ========================= =============== =============================================================
tzdata 2022.1(pypi)/ timezone Allows the use of ``zoneinfo`` timezones with pandas.
2022a(for system tzdata) **Note**: You only need to install the pypi package if your
system does not already provide the IANA tz database.
However, the minimum tzdata version still applies, even if it
is not enforced through an error.

If you would like to keep your system tzdata version updated,
it is recommended to use the ``tzdata`` package from
conda-forge.
========================= ========================= =============== =============================================================

Visualization
^^^^^^^^^^^^^

4 changes: 4 additions & 0 deletions doc/source/whatsnew/v2.0.0.rst
@@ -647,6 +647,10 @@ If installed, we now require:
+-------------------+-----------------+----------+---------+
| python-dateutil | 2.8.2 | X | X |
+-------------------+-----------------+----------+---------+
| | |on Windows| |
| tzdata | 2022.1 |for | X |
| | |timezones | |
+-------------------+-----------------+----------+---------+

For `optional libraries <https://pandas.pydata.org/docs/getting_started/install.html>`_ the general recommendation is to use the latest version.
The following table lists the lowest version per library that is currently being tested throughout the development of pandas.
2 changes: 1 addition & 1 deletion environment.yml
@@ -52,7 +52,6 @@ dependencies:
- scipy>=1.7.1
- sqlalchemy>=1.4.16
- tabulate>=0.8.9
- tzdata>=2022a
- xarray>=0.21.0
- xlrd>=2.0.1
- xlsxwriter>=1.4.3
@@ -119,3 +118,4 @@ dependencies:
- pip:
- sphinx-toggleprompt
- typing_extensions; python_version<"3.11"
- tzdata>=2022.1
5 changes: 2 additions & 3 deletions pyproject.toml
@@ -27,7 +27,8 @@ dependencies = [
"numpy>=1.21.0; python_version>='3.10'",
"numpy>=1.23.2; python_version>='3.11'",
"python-dateutil>=2.8.2",
- "pytz>=2020.1"
+ "pytz>=2020.1",
+ "tzdata>=2022.1; platform_system=='Windows'"
Member

does this need to be checked in setup.py? or even just import it in timezones.pyx?

Member Author

I honestly don't know, I couldn't reproduce this when building from source

Tempted to just ship it, and then check with the nightlies?

Member

@lithomas1 thoughts?

Member

LGTM. I'll tag this as build to run the wheel builders, which should sniff this stuff out.

Member Author

@lithomas1 did you try this?

Member

Yeah, the wheel builder jobs ran on this PR. Looks like there were some failures. I don't have Windows access, so I can't help more, sorry.

Member Author

> the wheel builder jobs ran on this PR

Ah, I see now sorry - thanks! Taking a look

Member

Actually, you might need a check for this in setup.py?

If you uninstall tzdata and try to run python setup.py develop, does this error correctly?

I don't know if setuptools reads this section of pyproject.toml.
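
One way to answer the "is it actually installed" half of this question from Python is to ask the packaging metadata directly. The sketch below uses the standard-library `importlib.metadata`; the helper `installed_version` is a hypothetical name for illustration, and it is not a check that this PR adds to setup.py.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str) -> str | None:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # No metadata for this distribution in the current environment.
        return None


if __name__ == "__main__":
    found = installed_version("tzdata")
    print("tzdata:", found if found is not None else "not installed")
```

Running this before and after `python setup.py develop` in a fresh virtual environment would show whether the dependency declared in pyproject.toml was picked up.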

Member Author

giving this a go

Member Author

Looks like this worked

In a new virtual environment on Windows, I:

  • pip installed numpy
  • pip installed cython
  • pip installed versioneer[toml]
  • ran python setup.py develop

and tzdata got installed, along with python-dateutil. Here's (part of) my output:

copying build\lib.win-amd64-cpython-38\pandas\_libs\window\aggregations.cp38-win_amd64.pyd -> pandas\_libs\window
copying build\lib.win-amd64-cpython-38\pandas\_libs\window\indexers.cp38-win_amd64.pyd -> pandas\_libs\window
copying build\lib.win-amd64-cpython-38\pandas\_libs\writers.cp38-win_amd64.pyd -> pandas\_libs
copying build\lib.win-amd64-cpython-38\pandas\io\sas\_sas.cp38-win_amd64.pyd -> pandas\io\sas
copying build\lib.win-amd64-cpython-38\pandas\io\sas\_byteswap.cp38-win_amd64.pyd -> pandas\io\sas
copying build\lib.win-amd64-cpython-38\pandas\_libs\json.cp38-win_amd64.pyd -> pandas\_libs
Creating c:\users\user\pandas-dev\.venv\lib\site-packages\pandas.egg-link (link to .)
Adding pandas 2.1.0.dev0+186.g4b054da685 to easy-install.pth file

Installed c:\users\user\pandas-dev
Processing dependencies for pandas==2.1.0.dev0+186.g4b054da685
Searching for tzdata>=2022.1
Reading https://pypi.org/simple/tzdata/
C:\Users\User\pandas-dev\.venv\lib\site-packages\pkg_resources\__init__.py:123: PkgResourcesDeprecationWarning:  is an invalid version and will not be supported in a future release
  warnings.warn(
Downloading https://files.pythonhosted.org/packages/fa/5e/f99a7df3ae2079211d31ec23b1d34380c7870c26e99159f6e422dcbab538/tzdata-2022.7-py2.py3-none-any.whl#sha256=2b88858b0e3120792a3c0635c23daf36a7d7eeeca657c323da299d2094402a0d
Best match: tzdata 2022.7
Processing tzdata-2022.7-py2.py3-none-any.whl
Installing tzdata-2022.7-py2.py3-none-any.whl to c:\users\user\pandas-dev\.venv\lib\site-packages
Adding tzdata 2022.7 to easy-install.pth file

Installed c:\users\user\pandas-dev\.venv\lib\site-packages\tzdata-2022.7-py3.8.egg

]
classifiers = [
'Development Status :: 5 - Production/Stable',
@@ -57,7 +58,6 @@ matplotlib = "pandas:plotting._matplotlib"
[project.optional-dependencies]
test = ['hypothesis>=6.34.2', 'pytest>=7.0.0', 'pytest-xdist>=2.2.0', 'pytest-asyncio>=0.17.0']
performance = ['bottleneck>=1.3.2', 'numba>=0.53.1', 'numexpr>=2.7.1']
timezone = ['tzdata>=2022.1']
computation = ['scipy>=1.7.1', 'xarray>=0.21.0']
fss = ['fsspec>=2021.07.0']
aws = ['s3fs>=2021.08.0']
@@ -112,7 +112,6 @@ all = ['beautifulsoup4>=4.9.3',
'SQLAlchemy>=1.4.16',
'tables>=3.6.1',
'tabulate>=0.8.9',
'tzdata>=2022.1',
'xarray>=0.21.0',
'xlrd>=2.0.1',
'xlsxwriter>=1.4.3',
2 changes: 1 addition & 1 deletion requirements-dev.txt
@@ -41,7 +41,6 @@ s3fs>=2021.08.0
scipy>=1.7.1
SQLAlchemy>=1.4.16
tabulate>=0.8.9
tzdata>=2022.1
xarray>=0.21.0
xlrd>=2.0.1
xlsxwriter>=1.4.3
@@ -88,4 +87,5 @@ requests
pygments
sphinx-toggleprompt
typing_extensions; python_version<"3.11"
tzdata>=2022.1
setuptools>=61.0.0
6 changes: 2 additions & 4 deletions scripts/validate_min_versions_in_sync.py
@@ -39,7 +39,6 @@
EXCLUDE_DEPS = {"tzdata", "blosc"}
EXCLUSION_LIST = {
"python=3.8[build=*_pypy]": None,
"tzdata": None,
"pyarrow": None,
}
# pandas package is not available
@@ -228,10 +227,9 @@ def get_versions_from_ci(content: list[str]) -> tuple[dict[str, str], dict[str,
continue
elif seen_required and line.strip():
if "==" in line:
- package, version = line.strip().split("==")
-
+ package, version = line.strip().split("==", maxsplit=1)
else:
- package, version = line.strip().split("=")
+ package, version = line.strip().split("=", maxsplit=1)
package = package[2:]
if package in EXCLUDE_DEPS:
continue
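
The `maxsplit=1` change matters once a pinned pip entry carries an environment marker, because the marker itself contains `==`. The sketch below reproduces that on a sample line modeled on the minimum-versions file in this diff; `split_pin` is a hypothetical helper written for illustration, not the script's actual function.

```python
from __future__ import annotations


def split_pin(line: str) -> tuple[str, str]:
    """Split a '- name==version' YAML entry at the first '==' only."""
    package, version = line.strip().split("==", maxsplit=1)
    # Drop the leading "- " list marker, as the script does with package[2:].
    return package[2:], version


sample = '  - tzdata==2022.1; platform_system=="Windows"'

# Without maxsplit, the marker's own "==" produces a third piece,
# so unpacking into two names would raise ValueError.
naive_parts = sample.strip().split("==")

package, version = split_pin(sample)

if __name__ == "__main__":
    print(len(naive_parts))
    print(package, "|", version)
```

Note that the version piece still carries the marker text after the split; how the script treats that remainder is outside this sketch.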