Commit 9cc17b7

Merge remote-tracking branch 'upstream/master' into unique-index
2 parents bf5a943 + 4729766 commit 9cc17b7

104 files changed, +810 -8906 lines changed

.pre-commit-config.yaml

+1
@@ -15,3 +15,4 @@ repos:
     hooks:
       - id: isort
         language: python_venv
+        exclude: ^pandas/__init__\.py$|^pandas/core/api\.py$

README.md

+1-1
@@ -225,7 +225,7 @@ Most development discussion is taking place on github in this repo. Further, the

 All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome.

-A detailed overview on how to contribute can be found in the **[contributing guide](https://dev.pandas.io/contributing.html)**. There is also an [overview](.github/CONTRIBUTING.md) on GitHub.
+A detailed overview on how to contribute can be found in the **[contributing guide](https://dev.pandas.io/docs/contributing.html)**. There is also an [overview](.github/CONTRIBUTING.md) on GitHub.

 If you are simply looking to start working with the pandas codebase, navigate to the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) and start looking through interesting issues. There are a number of issues listed under [Docs](https://github.com/pandas-dev/pandas/issues?labels=Docs&sort=updated&state=open) and [good first issue](https://github.com/pandas-dev/pandas/issues?labels=good+first+issue&sort=updated&state=open) where you could start out.

azure-pipelines.yml

+17-6
@@ -104,7 +104,7 @@ jobs:
     displayName: 'Running benchmarks'
     condition: true

-- job: 'Docs'
+- job: 'Web_and_Docs'
   pool:
     vmImage: ubuntu-16.04
   timeoutInMinutes: 90
@@ -119,6 +119,11 @@ jobs:
       ci/setup_env.sh
     displayName: 'Setup environment and build pandas'

+  - script: |
+      source activate pandas-dev
+      python web/pandas_web.py web/pandas
+    displayName: 'Build website'
+
   - script: |
       source activate pandas-dev
       # Next we should simply have `doc/make.py --warnings-are-errors`, everything else is required because the ipython directive doesn't fail the build on errors (https://github.com/ipython/ipython/issues/11547)
@@ -128,15 +133,21 @@ jobs:
     displayName: 'Build documentation'

   - script: |
-      cd doc/build/html
+      mkdir -p to_deploy/docs
+      cp -r web/build/* to_deploy/
+      cp -r doc/build/html/* to_deploy/docs/
+    displayName: 'Merge website and docs'
+
+  - script: |
+      cd to_deploy
       git init
       touch .nojekyll
       echo "dev.pandas.io" > CNAME
       printf "User-agent: *\nDisallow: /" > robots.txt
       git add --all .
       git config user.email "[email protected]"
-      git config user.name "pandas-docs-bot"
-      git commit -m "pandas documentation in master"
+      git config user.name "pandas-bot"
+      git commit -m "pandas web and documentation in master"
     displayName: 'Create git repo for docs build'
     condition : |
       and(not(eq(variables['Build.Reason'], 'PullRequest')),
@@ -160,10 +171,10 @@ jobs:
           eq(variables['Build.SourceBranch'], 'refs/heads/master'))

   - script: |
-      cd doc/build/html
+      cd to_deploy
       git remote add origin [email protected]:pandas-dev/pandas-dev.github.io.git
       git push -f origin master
-    displayName: 'Publish docs to GitHub pages'
+    displayName: 'Publish web and docs to GitHub pages'
     condition : |
       and(not(eq(variables['Build.Reason'], 'PullRequest')),
           eq(variables['Build.SourceBranch'], 'refs/heads/master'))
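
The 'Merge website and docs' step in this file copies the website build into `to_deploy/` and the Sphinx HTML into `to_deploy/docs/`. Below is a minimal Python sketch of that layout, not part of this commit. The helper name `merge_site` and the temporary directories are hypothetical. Unlike `cp -r web/build/*`, `shutil.copytree` also copies top-level hidden files, so this only approximates the shell step.

```python
import shutil
import tempfile
from pathlib import Path


def merge_site(web_build, docs_html, target):
    """Copy the website into target/ and the docs into target/docs/ (hypothetical helper)."""
    target = Path(target)
    # Equivalent in spirit to `mkdir -p to_deploy/docs`.
    (target / "docs").mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok requires Python 3.8+.
    shutil.copytree(web_build, target, dirs_exist_ok=True)
    shutil.copytree(docs_html, target / "docs", dirs_exist_ok=True)
    return target


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    web = tmp / "web" / "build"
    docs = tmp / "doc" / "build" / "html"
    web.mkdir(parents=True)
    docs.mkdir(parents=True)
    (web / "index.html").write_text("site")
    (docs / "index.html").write_text("docs")

    out = merge_site(web, docs, tmp / "to_deploy")
    # Relative paths of every file that would be published.
    MERGED = sorted(p.relative_to(out).as_posix() for p in out.rglob("*") if p.is_file())
```

With this layout the docs end up under a `docs/` prefix, which matches the README link changing from `/contributing.html` to `/docs/contributing.html` in the same commit.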

ci/azure/posix.yml

+7
@@ -60,15 +60,21 @@ jobs:
       echo "Creating Environment"
       ci/setup_env.sh
     displayName: 'Setup environment and build pandas'
+
   - script: |
       source activate pandas-dev
       ci/run_tests.sh
     displayName: 'Test'
+
   - script: source activate pandas-dev && pushd /tmp && python -c "import pandas; pandas.show_versions();" && popd
+    displayName: 'Build versions'
+
   - task: PublishTestResults@2
     inputs:
       testResultsFiles: 'test-data-*.xml'
       testRunTitle: ${{ format('{0}-$(CONDA_PY)', parameters.name) }}
+    displayName: 'Publish test results'
+
   - powershell: |
       $junitXml = "test-data-single.xml"
       $(Get-Content $junitXml | Out-String) -match 'failures="(.*?)"'
@@ -94,6 +100,7 @@ jobs:
         Write-Error "$($matches[1]) tests failed"
       }
     displayName: 'Check for test failures'
+
   - script: |
       source activate pandas-dev
       python ci/print_skipped.py

ci/print_skipped.py

+23-35
@@ -1,52 +1,40 @@
 #!/usr/bin/env python
-
-import math
 import os
-import sys
 import xml.etree.ElementTree as et


-def parse_results(filename):
+def main(filename):
+    if not os.path.isfile(filename):
+        return
+
     tree = et.parse(filename)
     root = tree.getroot()
-    skipped = []
-
     current_class = ""
-    i = 1
-    assert i - 1 == len(skipped)
     for el in root.findall("testcase"):
         cn = el.attrib["classname"]
         for sk in el.findall("skipped"):
             old_class = current_class
             current_class = cn
-            name = "{classname}.{name}".format(
-                classname=current_class, name=el.attrib["name"]
-            )
-            msg = sk.attrib["message"]
-            out = ""
             if old_class != current_class:
-                ndigits = int(math.log(i, 10) + 1)
-
-                # 4 for : + space + # + space
-                out += "-" * (len(name + msg) + 4 + ndigits) + "\n"
-            out += "#{i} {name}: {msg}".format(i=i, name=name, msg=msg)
-            skipped.append(out)
-            i += 1
-            assert i - 1 == len(skipped)
-    assert i - 1 == len(skipped)
-    # assert len(skipped) == int(root.attrib['skip'])
-    return "\n".join(skipped)
-
-
-def main():
-    test_files = ["test-data-single.xml", "test-data-multiple.xml", "test-data.xml"]
-
-    print("SKIPPED TESTS:")
-    for fn in test_files:
-        if os.path.isfile(fn):
-            print(parse_results(fn))
-    return 0
+                yield None
+            yield {
+                "class_name": current_class,
+                "test_name": el.attrib["name"],
+                "message": sk.attrib["message"],
+            }


 if __name__ == "__main__":
-    sys.exit(main())
+    print("SKIPPED TESTS:")
+    i = 1
+    for file_type in ("-single", "-multiple", ""):
+        for test_data in main("test-data{}.xml".format(file_type)):
+            if test_data is None:
+                print("-" * 80)
+            else:
+                print(
+                    "#{i} {class_name}.{test_name}: {message}".format(
+                        **dict(test_data, i=i)
+                    )
+                )
+                i += 1
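
The rewritten script turns the parser into a generator that yields `None` whenever the test class changes and a dict for each skipped test, leaving numbering and formatting to the caller. The sketch below shows that pattern on an inlined report so it needs no files. It is not part of this commit: the sample XML and the names `iter_skipped` and `format_report` are made up for illustration.

```python
import xml.etree.ElementTree as et

# Hypothetical JUnit-style report, inlined so the sketch runs without files.
SAMPLE = """<testsuite>
  <testcase classname="pkg.TestA" name="test_one"><skipped message="no scipy"/></testcase>
  <testcase classname="pkg.TestA" name="test_two"/>
  <testcase classname="pkg.TestB" name="test_three"><skipped message="slow"/></testcase>
</testsuite>"""


def iter_skipped(root):
    """Yield None at each class change, then a dict per skipped test."""
    current_class = ""
    for el in root.findall("testcase"):
        for sk in el.findall("skipped"):
            old_class, current_class = current_class, el.attrib["classname"]
            if old_class != current_class:
                yield None  # separator marker, rendered by the caller
            yield {
                "class_name": current_class,
                "test_name": el.attrib["name"],
                "message": sk.attrib["message"],
            }


def format_report(root):
    """Number the skipped tests and render separators as 80 dashes."""
    lines = []
    i = 1
    for item in iter_skipped(root):
        if item is None:
            lines.append("-" * 80)
        else:
            lines.append("#{i} {class_name}.{test_name}: {message}".format(i=i, **item))
            i += 1
    return lines


REPORT = format_report(et.fromstring(SAMPLE))
```

Keeping the counter outside the generator is what lets the committed script number skipped tests continuously across several result files.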

ci/run_tests.sh

+3-10
@@ -1,13 +1,6 @@
-#!/bin/bash
+#!/bin/bash -e

-set -e
-
-if [ "$DOC" ]; then
-    echo "We are not running pytest as this is a doc-build"
-    exit 0
-fi
-
-# Workaround for pytest-xdist flaky collection order
+# Workaround for pytest-xdist (it collects different tests in the workers if PYTHONHASHSEED is not set)
 # https://github.com/pytest-dev/pytest/issues/920
 # https://github.com/pytest-dev/pytest/issues/1075
 export PYTHONHASHSEED=$(python -c 'import random; print(random.randint(1, 4294967295))')
@@ -16,7 +9,7 @@ if [ -n "$LOCALE_OVERRIDE" ]; then
     export LC_ALL="$LOCALE_OVERRIDE"
     export LANG="$LOCALE_OVERRIDE"
     PANDAS_LOCALE=`python -c 'import pandas; pandas.get_option("display.encoding")'`
-    if [[ "$LOCALE_OVERIDE" != "$PANDAS_LOCALE" ]]; then
+    if [[ "$LOCALE_OVERRIDE" != "$PANDAS_LOCALE" ]]; then
         echo "pandas could not detect the locale. System locale: $LOCALE_OVERRIDE, pandas detected: $PANDAS_LOCALE"
         # TODO Not really aborting the tests until https://github.com/pandas-dev/pandas/issues/23923 is fixed
         # exit 1
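
The comment in this script says pytest-xdist workers can collect different tests when `PYTHONHASHSEED` is unset, which is why one random seed is exported before the run. As a rough illustration, not part of this commit, the sketch below picks a seed the same way and checks that two separate interpreter processes given that seed agree on a string hash. The helper name `hash_in_child` is hypothetical.

```python
import os
import random
import subprocess
import sys


def hash_in_child(seed):
    """Return hash('pandas') as computed by a fresh interpreter with the given seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('pandas'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# Same range as the shell one-liner in ci/run_tests.sh.
seed = random.randint(1, 4294967295)
first = hash_in_child(seed)
second = hash_in_child(seed)
```

Without a shared seed, each process would normally randomize string hashing on its own, so anything ordered by hash could differ between workers.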

doc/redirects.csv

-2
@@ -503,7 +503,6 @@ generated/pandas.DataFrame.to_parquet,../reference/api/pandas.DataFrame.to_parqu
 generated/pandas.DataFrame.to_period,../reference/api/pandas.DataFrame.to_period
 generated/pandas.DataFrame.to_pickle,../reference/api/pandas.DataFrame.to_pickle
 generated/pandas.DataFrame.to_records,../reference/api/pandas.DataFrame.to_records
-generated/pandas.DataFrame.to_sparse,../reference/api/pandas.DataFrame.to_sparse
 generated/pandas.DataFrame.to_sql,../reference/api/pandas.DataFrame.to_sql
 generated/pandas.DataFrame.to_stata,../reference/api/pandas.DataFrame.to_stata
 generated/pandas.DataFrame.to_string,../reference/api/pandas.DataFrame.to_string
@@ -1432,7 +1431,6 @@ generated/pandas.Series.to_msgpack,../reference/api/pandas.Series.to_msgpack
 generated/pandas.Series.to_numpy,../reference/api/pandas.Series.to_numpy
 generated/pandas.Series.to_period,../reference/api/pandas.Series.to_period
 generated/pandas.Series.to_pickle,../reference/api/pandas.Series.to_pickle
-generated/pandas.Series.to_sparse,../reference/api/pandas.Series.to_sparse
 generated/pandas.Series.to_sql,../reference/api/pandas.Series.to_sql
 generated/pandas.Series.to_string,../reference/api/pandas.Series.to_string
 generated/pandas.Series.to_timestamp,../reference/api/pandas.Series.to_timestamp

doc/source/reference/frame.rst

-8
@@ -357,15 +357,7 @@ Serialization / IO / conversion
    DataFrame.to_msgpack
    DataFrame.to_gbq
    DataFrame.to_records
-   DataFrame.to_sparse
    DataFrame.to_dense
    DataFrame.to_string
    DataFrame.to_clipboard
    DataFrame.style
-
-Sparse
-~~~~~~
-.. autosummary::
-   :toctree: api/
-
-   SparseDataFrame.to_coo

doc/source/reference/series.rst

-11
@@ -577,18 +577,7 @@ Serialization / IO / conversion
    Series.to_sql
    Series.to_msgpack
    Series.to_json
-   Series.to_sparse
    Series.to_dense
    Series.to_string
    Series.to_clipboard
    Series.to_latex
-
-
-Sparse
-------
-
-.. autosummary::
-   :toctree: api/
-
-   SparseSeries.to_coo
-   SparseSeries.from_coo

doc/source/user_guide/io.rst

+8
@@ -4641,6 +4641,14 @@ Several caveats.

 See the `Full Documentation <https://github.com/wesm/feather>`__.

+.. ipython:: python
+   :suppress:
+
+   import warnings
+   # This can be removed once building with pyarrow >=0.15.0
+   warnings.filterwarnings("ignore", "The Sparse", FutureWarning)
+
+
 .. ipython:: python

    df = pd.DataFrame({'a': list('abc'),
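
The suppressed block added here calls `warnings.filterwarnings("ignore", "The Sparse", FutureWarning)`. The second argument is a regular expression matched against the start of the warning message, so only `FutureWarning`s beginning with "The Sparse" are silenced. The sketch below, not part of this commit, demonstrates that with two invented warning messages. The exact text pandas or pyarrow emits is an assumption and is not reproduced here.

```python
import warnings


def emit():
    # Both messages are hypothetical stand-ins for real library warnings.
    warnings.warn("The SparseDataFrame class is deprecated", FutureWarning)
    warnings.warn("Some other change is coming", FutureWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Inserted at the front of the filter list, so it wins over "always".
    warnings.filterwarnings("ignore", "The Sparse", FutureWarning)
    emit()

messages = [str(w.message) for w in caught]
```

Only the second warning is recorded, which suggests the doc build stays quiet about sparse deprecations while still surfacing other `FutureWarning`s.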

doc/source/user_guide/sparse.rst

+5-15
@@ -6,12 +6,6 @@
 Sparse data structures
 **********************

-.. note::
-
-   ``SparseSeries`` and ``SparseDataFrame`` have been deprecated. Their purpose
-   is served equally well by a :class:`Series` or :class:`DataFrame` with
-   sparse values. See :ref:`sparse.migration` for tips on migrating.
-
 Pandas provides data structures for efficiently storing sparse data.
 These are not necessarily sparse in the typical "mostly 0". Rather, you can view these
 objects as being "compressed" where any data matching a specific value (``NaN`` / missing value, though any value
@@ -168,6 +162,11 @@ the correct dense result.
 Migrating
 ---------

+.. note::
+
+   ``SparseSeries`` and ``SparseDataFrame`` were removed in pandas 1.0.0. This migration
+   guide is present to aid in migrating from previous versions.
+
 In older versions of pandas, the ``SparseSeries`` and ``SparseDataFrame`` classes (documented below)
 were the preferred way to work with sparse data. With the advent of extension arrays, these subclasses
 are no longer needed. Their purpose is better served by using a regular Series or DataFrame with
@@ -366,12 +365,3 @@ row and columns coordinates of the matrix. Note that this will consume a signifi

    ss_dense = pd.Series.sparse.from_coo(A, dense_index=True)
    ss_dense
-
-
-.. _sparse.subclasses:
-
-Sparse subclasses
------------------
-
-The :class:`SparseSeries` and :class:`SparseDataFrame` classes are deprecated. Visit their
-API pages for usage.

doc/source/whatsnew/v0.16.0.rst

+2-4
@@ -91,8 +91,7 @@ Interaction with scipy.sparse

 Added :meth:`SparseSeries.to_coo` and :meth:`SparseSeries.from_coo` methods (:issue:`8048`) for converting to and from ``scipy.sparse.coo_matrix`` instances (see :ref:`here <sparse.scipysparse>`). For example, given a SparseSeries with MultiIndex we can convert to a `scipy.sparse.coo_matrix` by specifying the row and column labels as index levels:

-.. ipython:: python
-   :okwarning:
+.. code-block:: python

    s = pd.Series([3.0, np.nan, 1.0, 3.0, np.nan, np.nan])
    s.index = pd.MultiIndex.from_tuples([(1, 2, 'a', 0),
@@ -121,8 +120,7 @@ Added :meth:`SparseSeries.to_coo` and :meth:`SparseSeries.from_coo` methods (:is
 The from_coo method is a convenience method for creating a ``SparseSeries``
 from a ``scipy.sparse.coo_matrix``:

-.. ipython:: python
-   :okwarning:
+.. code-block:: python

    from scipy import sparse
    A = sparse.coo_matrix(([3.0, 1.0, 2.0], ([1, 0, 0], [0, 2, 3])),

doc/source/whatsnew/v0.18.1.rst

+2-4
@@ -393,8 +393,7 @@ used in the ``pandas`` implementation (:issue:`12644`, :issue:`12638`, :issue:`1

 An example of this signature augmentation is illustrated below:

-.. ipython:: python
-   :okwarning:
+.. code-block:: python

    sp = pd.SparseDataFrame([1, 2, 3])
    sp
@@ -409,8 +408,7 @@ Previous behaviour:

 New behaviour:

-.. ipython:: python
-   :okwarning:
+.. code-block:: python

    np.cumsum(sp, axis=0)

doc/source/whatsnew/v0.19.0.rst

+2-4
@@ -1235,8 +1235,7 @@ Operators now preserve dtypes

 - Sparse data structure now can preserve ``dtype`` after arithmetic ops (:issue:`13848`)

-.. ipython:: python
-   :okwarning:
+.. code-block:: python

    s = pd.SparseSeries([0, 2, 0, 1], fill_value=0, dtype=np.int64)
    s.dtype
@@ -1245,8 +1244,7 @@ Operators now preserve dtypes

 - Sparse data structure now support ``astype`` to convert internal ``dtype`` (:issue:`13900`)

-.. ipython:: python
-   :okwarning:
+.. code-block:: python

    s = pd.SparseSeries([1., 0., 2., 0.], fill_value=0)
    s

doc/source/whatsnew/v0.20.0.rst

+2-3
@@ -338,8 +338,7 @@ See the :ref:`documentation <sparse.scipysparse>` for more information. (:issue:

 All sparse formats are supported, but matrices that are not in :mod:`COOrdinate <scipy.sparse>` format will be converted, copying data as needed.

-.. ipython:: python
-   :okwarning:
+.. code-block:: python

    from scipy.sparse import csr_matrix
    arr = np.random.random(size=(1000, 5))
@@ -351,7 +350,7 @@ All sparse formats are supported, but matrices that are not in :mod:`COOrdinate

 To convert a ``SparseDataFrame`` back to sparse SciPy matrix in COO format, you can use:

-.. ipython:: python
+.. code-block:: python

    sdf.to_coo()
