forked from pydata/xarray

Commit 3c0d585

Merge branch 'main' into groupby-save-codes-new
* main:
  Preserve `base` and `loffset` arguments in `resample` (pydata#7444)
  ignore the `pkg_resources` deprecation warning (pydata#7594)
  Update contains_cftime_datetimes to avoid loading entire variable array (pydata#7494)
  Support first, last with dask arrays (pydata#7562)
  update the docs environment (pydata#7442)
  Add xCDAT to list of Xarray related projects (pydata#7579)
  [pre-commit.ci] pre-commit autoupdate (pydata#7565)
  fix nczarr when libnetcdf>4.8.1 (pydata#7575)
  use numpys SupportsDtype (pydata#7521)
2 parents ea3ed87 + 6d771fc commit 3c0d585

30 files changed

+435
-157
lines changed

.pre-commit-config.yaml

+1-1
@@ -16,7 +16,7 @@ repos:
         files: ^xarray/
   - repo: https://github.com/charliermarsh/ruff-pre-commit
     # Ruff version.
-    rev: 'v0.0.248'
+    rev: 'v0.0.253'
     hooks:
       - id: ruff
         args: ["--fix"]

ci/requirements/doc.yml

+3-4
@@ -4,7 +4,7 @@ channels:
   - conda-forge
   - nodefaults
 dependencies:
-  - python=3.9
+  - python=3.10
   - bottleneck
   - cartopy
   - cfgrib>=0.9
@@ -23,18 +23,17 @@ dependencies:
   - pandas>=1.4
   - pooch
   - pip
-  - pydata-sphinx-theme>=0.4.3
   - pyproj
   - rasterio>=1.1
   - scipy!=1.10.0
   - seaborn
   - setuptools
   - sparse
   - sphinx-autosummary-accessors
-  - sphinx-book-theme >= 0.0.38
+  - sphinx-book-theme >= 0.3.0
   - sphinx-copybutton
   - sphinx-design
-  - sphinx!=4.4.0
+  - sphinx>=5.0
   - zarr>=2.10
   - pip:
       - sphinxext-rediraffe

doc/conf.py

+3-4
@@ -97,8 +97,8 @@


 extlinks = {
-    "issue": ("https://github.com/pydata/xarray/issues/%s", "GH"),
-    "pull": ("https://github.com/pydata/xarray/pull/%s", "PR"),
+    "issue": ("https://github.com/pydata/xarray/issues/%s", "GH%s"),
+    "pull": ("https://github.com/pydata/xarray/pull/%s", "PR%s"),
 }

 # sphinx-copybutton configurations
@@ -244,12 +244,11 @@
     use_repository_button=True,
     use_issues_button=True,
     home_page_in_toc=False,
-    extra_navbar="",
-    navbar_footer_text="",
     extra_footer="""<p>Xarray is a fiscally sponsored project of <a href="https://numfocus.org">NumFOCUS</a>,
     a nonprofit dedicated to supporting the open-source scientific computing community.<br>
     Theme by the <a href="https://ebp.jupyterbook.org">Executable Book Project</a></p>""",
     twitter_url="https://twitter.com/xarray_devs",
+    icon_links=[],  # workaround for pydata/pydata-sphinx-theme#1220
 )

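Note on the `extlinks` change: newer Sphinx releases appear to expect the caption (the second tuple element) to carry its own `%s` placeholder, which is filled with the role target, rather than acting as a bare prefix. A minimal sketch of that substitution in plain Python, assuming simple `%` formatting; `render_extlink` is a hypothetical helper for illustration, not a Sphinx function:

```python
# Mapping copied from the new side of the doc/conf.py diff.
extlinks = {
    "issue": ("https://github.com/pydata/xarray/issues/%s", "GH%s"),
    "pull": ("https://github.com/pydata/xarray/pull/%s", "PR%s"),
}


def render_extlink(role: str, target: str) -> tuple[str, str]:
    """Return (url, caption) for a role such as :issue:`7340` (hypothetical helper)."""
    url_template, caption_template = extlinks[role]
    # Both templates contain exactly one %s, filled with the role target.
    return url_template % target, caption_template % target


url, caption = render_extlink("issue", "7340")
print(url, caption)
```

With the old captions ("GH", "PR") the second `%` operation would fail, because the string has no placeholder to consume the argument.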

doc/ecosystem.rst

+1
@@ -45,6 +45,7 @@ Geosciences
 - `xarray-spatial <https://xarray-spatial.org/>`_: Numba-accelerated raster-based spatial processing tools (NDVI, curvature, zonal-statistics, proximity, hillshading, viewshed, etc.)
 - `xarray-topo <https://xarray-topo.readthedocs.io/>`_: xarray extension for topographic analysis and modelling.
 - `xbpch <https://github.com/darothen/xbpch>`_: xarray interface for bpch files.
+- `xCDAT <https://xcdat.readthedocs.io/>`_: An extension of xarray for climate data analysis on structured grids.
 - `xclim <https://xclim.readthedocs.io/>`_: A library for calculating climate science indices with unit handling built from xarray and dask.
 - `xESMF <https://pangeo-xesmf.readthedocs.io/>`_: Universal regridder for geospatial data.
 - `xgcm <https://xgcm.readthedocs.io/>`_: Extends the xarray data model to understand finite volume grid cells (common in General Circulation Models) and provides interpolation and difference operations for such grids.

doc/user-guide/interpolation.rst

+1-1
@@ -50,7 +50,7 @@ array-like, which gives the interpolated result as an array.
     # interpolation
     da.interp(time=[2.5, 3.5])

-To interpolate data with a :py:doc:`numpy.datetime64 <reference/arrays.datetime>` coordinate you can pass a string.
+To interpolate data with a :py:doc:`numpy.datetime64 <numpy:reference/arrays.datetime>` coordinate you can pass a string.

 .. ipython:: python

doc/user-guide/weather-climate.rst

+1-1
@@ -233,7 +233,7 @@ For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:

 .. ipython:: python

-    da.resample(time="81T", closed="right", label="right", base=3).mean()
+    da.resample(time="81T", closed="right", label="right", offset="3T").mean()

 .. _Timestamp-valid range: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timestamp-limitations
 .. _ISO 8601 standard: https://en.wikipedia.org/wiki/ISO_8601
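The doc change above swaps `base=3` for `offset="3T"` at an 81-minute frequency. One way to read that equivalence, sketched with the standard library only: `base` counts multiples of the frequency's underlying unit (here one minute), so the offset is `base` times that unit. `base_to_offset` is a hypothetical name, and the arithmetic is an assumption about how the conversion behaves, not a copy of xarray's helper:

```python
from datetime import timedelta


def base_to_offset(base: int, n: int, unit: timedelta) -> timedelta:
    """Convert a legacy ``base`` into an ``offset`` (illustrative assumption).

    ``n`` and ``unit`` describe the frequency, e.g. "81T" is n=81, unit=1 minute.
    """
    freq = n * unit
    # Dividing the full period by n recovers the unit; base then scales it.
    return base * freq // n


offset = base_to_offset(base=3, n=81, unit=timedelta(minutes=1))
print(offset)  # 0:03:00
```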

doc/whats-new.rst

+11
@@ -25,13 +25,22 @@ New Features

 - Fix :py:meth:`xr.cov` and :py:meth:`xr.corr` now support complex valued arrays (:issue:`7340`, :pull:`7392`).
   By `Michael Niklas <https://github.com/headtr1ck>`_.
+- Support dask arrays in ``first`` and ``last`` reductions.
+  By `Deepak Cherian <https://github.com/dcherian>`_.

 Breaking changes
 ~~~~~~~~~~~~~~~~


 Deprecations
 ~~~~~~~~~~~~
+- Following pandas, the ``base`` and ``loffset`` parameters of
+  :py:meth:`xr.DataArray.resample` and :py:meth:`xr.Dataset.resample` have been
+  deprecated and will be removed in a future version of xarray.  Using the
+  ``origin`` or ``offset`` parameters is recommended as a replacement for using
+  the ``base`` parameter and using time offset arithmetic is recommended as a
+  replacement for using the ``loffset`` parameter (:pull:`7444`).  By `Spencer
+  Clark <https://github.com/spencerkclark>`_.


 Bug fixes
@@ -42,6 +51,8 @@ Bug fixes
 - Fix matplotlib raising a UserWarning when plotting a scatter plot
   with an unfilled marker (:issue:`7313`, :pull:`7318`).
   By `Jimmy Westling <https://github.com/illviljan>`_.
+- Improved performance in ``open_dataset`` for datasets with large object arrays (:issue:`7484`, :pull:`7494`).
+  By `Alex Goodman <https://github.com/agoodm>`_ and `Deepak Cherian <https://github.com/dcherian>`_.

 Documentation
 ~~~~~~~~~~~~~

xarray/backends/zarr.py

+2-2
@@ -207,7 +207,7 @@ def _get_zarr_dims_and_attrs(zarr_obj, dimension_key, try_nczarr):
                 "which are required for xarray to determine variable dimensions."
             ) from e

-    nc_attrs = [attr for attr in zarr_obj.attrs if attr.startswith("_NC")]
+    nc_attrs = [attr for attr in zarr_obj.attrs if attr.lower().startswith("_nc")]
     attributes = HiddenKeyDict(zarr_obj.attrs, [dimension_key] + nc_attrs)
     return dimensions, attributes

@@ -495,7 +495,7 @@ def get_attrs(self):
         return {
             k: v
             for k, v in self.zarr_group.attrs.asdict().items()
-            if not k.startswith("_NC")
+            if not k.lower().startswith("_nc")
         }

     def get_dimensions(self):
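
Both hunks make the `_NC` prefix check case-insensitive, presumably so that NCZarr bookkeeping attributes are hidden whether they are spelled `_NC...` or `_nc...`. A small stand-alone sketch of the filter on a plain dict; the attribute names in `attrs` are made up for illustration:

```python
def visible_attrs(attrs: dict) -> dict:
    """Drop keys that start with "_nc" in any letter case."""
    return {k: v for k, v in attrs.items() if not k.lower().startswith("_nc")}


attrs = {
    "_NCZARR_ATTR": {"types": {}},  # upper-case spelling (hypothetical key)
    "_nczarr_attr": {"types": {}},  # lower-case spelling (hypothetical key)
    "units": "K",
}
print(visible_attrs(attrs))  # {'units': 'K'}
```

The previous check, `startswith("_NC")`, would have kept the lower-case key in the result.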

xarray/coding/calendar_ops.py

+3-3
@@ -147,7 +147,7 @@ def convert_calendar(
     from xarray.core.dataarray import DataArray

     time = obj[dim]
-    if not _contains_datetime_like_objects(time):
+    if not _contains_datetime_like_objects(time.variable):
         raise ValueError(f"Coordinate {dim} must contain datetime objects.")

     use_cftime = _should_cftime_be_used(time, calendar, use_cftime)
@@ -319,8 +319,8 @@ def interp_calendar(source, target, dim="time"):
         target = DataArray(target, dims=(dim,), name=dim)

     if not _contains_datetime_like_objects(
-        source[dim]
-    ) or not _contains_datetime_like_objects(target):
+        source[dim].variable
+    ) or not _contains_datetime_like_objects(target.variable):
         raise ValueError(
             f"Both 'source.{dim}' and 'target' must contain datetime objects."
         )

xarray/coding/cftime_offsets.py

+1-1
@@ -1267,7 +1267,7 @@ def date_range_like(source, calendar, use_cftime=None):
     if not isinstance(source, (pd.DatetimeIndex, CFTimeIndex)) and (
         isinstance(source, DataArray)
         and (source.ndim != 1)
-        or not _contains_datetime_like_objects(source)
+        or not _contains_datetime_like_objects(source.variable)
     ):
         raise ValueError(
             "'source' must be a 1D array of datetime objects for inferring its range."

xarray/coding/frequencies.py

+2-1
@@ -79,11 +79,12 @@ def infer_freq(index):
         If there are fewer than three values or the index is not 1D.
     """
     from xarray.core.dataarray import DataArray
+    from xarray.core.variable import Variable

     if isinstance(index, (DataArray, pd.Series)):
         if index.ndim != 1:
             raise ValueError("'index' must be 1D")
-        elif not _contains_datetime_like_objects(DataArray(index)):
+        elif not _contains_datetime_like_objects(Variable("dim", index)):
             raise ValueError("'index' must contain datetime-like objects")
         dtype = np.asarray(index).dtype
         if dtype == "datetime64[ns]":

xarray/core/accessor_dt.py

+1-1
@@ -574,7 +574,7 @@ def __new__(cls, obj: T_DataArray) -> CombinedDatetimelikeAccessor:
         # we need to choose which parent (datetime or timedelta) is
         # appropriate. Since we're checking the dtypes anyway, we'll just
         # do all the validation here.
-        if not _contains_datetime_like_objects(obj):
+        if not _contains_datetime_like_objects(obj.variable):
             raise TypeError(
                 "'.dt' accessor only available for "
                 "DataArray with datetime64 timedelta64 dtype or "

xarray/core/common.py

+64-40
@@ -11,9 +11,16 @@
 import pandas as pd

 from xarray.core import dtypes, duck_array_ops, formatting, formatting_html, ops
+from xarray.core.indexing import BasicIndexer, ExplicitlyIndexed
 from xarray.core.options import OPTIONS, _get_keep_attrs
+from xarray.core.pdcompat import _convert_base_to_offset
 from xarray.core.pycompat import is_duck_dask_array
-from xarray.core.utils import Frozen, either_dict_or_kwargs, is_scalar
+from xarray.core.utils import (
+    Frozen,
+    either_dict_or_kwargs,
+    emit_user_level_warning,
+    is_scalar,
+)

 try:
     import cftime
@@ -40,6 +47,7 @@
         ScalarOrArray,
         SideOptions,
         T_DataWithCoords,
+        T_Variable,
     )
     from xarray.core.variable import Variable

@@ -843,6 +851,12 @@ def _resample(
             For frequencies that evenly subdivide 1 day, the "origin" of the
             aggregated intervals. For example, for "24H" frequency, base could
             range from 0 through 23.
+
+            .. deprecated:: 2023.03.0
+                Following pandas, the ``base`` parameter is deprecated in favor
+                of the ``origin`` and ``offset`` parameters, and will be removed
+                in a future version of xarray.
+
         origin : {'epoch', 'start', 'start_day', 'end', 'end_day'}, pd.Timestamp, datetime.datetime, np.datetime64, or cftime.datetime, default 'start_day'
             The datetime on which to adjust the grouping. The timezone of origin
             must match the timezone of the index.
@@ -858,6 +872,12 @@ def _resample(
         loffset : timedelta or str, optional
             Offset used to adjust the resampled time labels. Some pandas date
             offset strings are supported.
+
+            .. deprecated:: 2023.03.0
+                Following pandas, the ``loffset`` parameter is deprecated in favor
+                of using time offset arithmetic, and will be removed in a future
+                version of xarray.
+
         restore_coord_dims : bool, optional
             If True, also restore the dimension order of multi-dimensional
             coordinates.
@@ -928,8 +948,8 @@ def _resample(
         """
         # TODO support non-string indexer after removing the old API.

-        from xarray.coding.cftimeindex import CFTimeIndex
         from xarray.core.dataarray import DataArray
+        from xarray.core.groupby import TimeResampleGrouper
         from xarray.core.resample import RESAMPLE_DIM

         if keep_attrs is not None:
@@ -959,28 +979,36 @@ def _resample(
         dim_name: Hashable = dim
         dim_coord = self[dim]

-        if isinstance(self._indexes[dim_name].to_pandas_index(), CFTimeIndex):
-            from xarray.core.resample_cftime import CFTimeGrouper
-
-            grouper = CFTimeGrouper(
-                freq=freq,
-                closed=closed,
-                label=label,
-                base=base,
-                loffset=loffset,
-                origin=origin,
-                offset=offset,
+        if loffset is not None:
+            emit_user_level_warning(
+                "Following pandas, the `loffset` parameter to resample will be deprecated "
+                "in a future version of xarray. Switch to using time offset arithmetic.",
+                FutureWarning,
             )
-        else:
-            grouper = pd.Grouper(
-                freq=freq,
-                closed=closed,
-                label=label,
-                base=base,
-                offset=offset,
-                origin=origin,
-                loffset=loffset,
+
+        if base is not None:
+            emit_user_level_warning(
+                "Following pandas, the `base` parameter to resample will be deprecated in "
+                "a future version of xarray. Switch to using `origin` or `offset` instead.",
+                FutureWarning,
             )
+
+        if base is not None and offset is not None:
+            raise ValueError("base and offset cannot be present at the same time")
+
+        if base is not None:
+            index = self._indexes[dim_name].to_pandas_index()
+            offset = _convert_base_to_offset(base, freq, index)
+
+        grouper = TimeResampleGrouper(
+            freq=freq,
+            closed=closed,
+            label=label,
+            origin=origin,
+            offset=offset,
+            loffset=loffset,
+        )
+
         group = DataArray(
             dim_coord, coords=dim_coord.coords, dims=dim_coord.dims, name=RESAMPLE_DIM
         )
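
The argument handling added in this hunk can be sketched in isolation: warn on each deprecated argument, reject `base` combined with `offset`, then translate `base` into `offset`. This is a stand-alone approximation built on the standard `warnings` module; `resolve_offset` and its `convert` callback are hypothetical stand-ins for the inline code and for `_convert_base_to_offset`, and the warning texts are shortened:

```python
import warnings


def resolve_offset(base=None, offset=None, loffset=None, convert=lambda b: b):
    """Mirror the order of checks in the hunk (hypothetical helper)."""
    if loffset is not None:
        warnings.warn(
            "`loffset` is deprecated; use time offset arithmetic.",
            FutureWarning,
            stacklevel=2,
        )
    if base is not None:
        warnings.warn(
            "`base` is deprecated; use `origin` or `offset`.",
            FutureWarning,
            stacklevel=2,
        )
    if base is not None and offset is not None:
        raise ValueError("base and offset cannot be present at the same time")
    if base is not None:
        # In the real code this step also needs the frequency and the index.
        offset = convert(base)
    return offset


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = resolve_offset(base=3, convert=lambda b: f"{b}T")

print(result, [w.category.__name__ for w in caught])
```

Note the ordering: the deprecation warning for `base` fires before the mutual-exclusion check, so a caller passing both arguments sees the warning and then the `ValueError`.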
@@ -1770,31 +1798,27 @@ def is_np_timedelta_like(dtype: DTypeLike) -> bool:
     return np.issubdtype(dtype, np.timedelta64)


-def _contains_cftime_datetimes(array) -> bool:
-    """Check if an array contains cftime.datetime objects"""
+def _contains_cftime_datetimes(array: Any) -> bool:
+    """Check if a array inside a Variable contains cftime.datetime objects"""
     if cftime is None:
         return False
-    else:
-        if array.dtype == np.dtype("O") and array.size > 0:
-            sample = np.asarray(array).flat[0]
-            if is_duck_dask_array(sample):
-                sample = sample.compute()
-                if isinstance(sample, np.ndarray):
-                    sample = sample.item()
-            return isinstance(sample, cftime.datetime)
-        else:
-            return False

+    if array.dtype == np.dtype("O") and array.size > 0:
+        first_idx = (0,) * array.ndim
+        if isinstance(array, ExplicitlyIndexed):
+            first_idx = BasicIndexer(first_idx)
+        sample = array[first_idx]
+        return isinstance(np.asarray(sample).item(), cftime.datetime)

-def contains_cftime_datetimes(var) -> bool:
+    return False
+
+
+def contains_cftime_datetimes(var: T_Variable) -> bool:
     """Check if an xarray.Variable contains cftime.datetime objects"""
-    if var.dtype == np.dtype("O") and var.size > 0:
-        return _contains_cftime_datetimes(var.data)
-    else:
-        return False
+    return _contains_cftime_datetimes(var._data)


-def _contains_datetime_like_objects(var) -> bool:
+def _contains_datetime_like_objects(var: T_Variable) -> bool:
     """Check if a variable contains datetime like objects (either
     np.datetime64, np.timedelta64, or cftime.datetime)
     """
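
The rewritten `_contains_cftime_datetimes` indexes a single element with `(0,) * array.ndim` instead of converting the whole array first. The sketch below illustrates why that matters for lazily loaded data. It uses a hypothetical `LazyGrid` class that counts element reads; the class stands in for a backend array and is not an xarray type, and `datetime.datetime` stands in for `cftime.datetime`:

```python
from datetime import datetime


class LazyGrid:
    """Hypothetical lazily loaded N-D array that counts element reads."""

    def __init__(self, shape, fill):
        self.shape = shape
        self.ndim = len(shape)
        self.fill = fill
        self.elements_read = 0

    def __getitem__(self, idx):
        # This sketch only supports a full tuple of integer indices.
        if len(idx) != self.ndim:
            raise IndexError("expected one index per dimension")
        self.elements_read += 1
        return self.fill

    def load_all(self):
        # Simulates materializing every element, as np.asarray(array) would.
        total = 1
        for size in self.shape:
            total *= size
        self.elements_read += total
        return [self.fill] * total


def first_element_is(array, kind) -> bool:
    """Check the type of the first element without loading the rest."""
    first_idx = (0,) * array.ndim
    return isinstance(array[first_idx], kind)


grid = LazyGrid((1000, 1000), datetime(2000, 1, 1))
print(first_element_is(grid, datetime), grid.elements_read)
```

Calling `grid.load_all()` before sampling would instead register a million reads, which is the cost the commit message ("avoid loading entire variable array") refers to.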
