
Commit 7faa509

Author: Pyry Kovanen
Merge remote-tracking branch 'upstream/master' into empty-json-empty-df-fix
2 parents: fc15ba0 + abfac97

96 files changed: +3113, -382 lines (only a subset of the changed files is shown below)

Makefile (+1)

@@ -23,3 +23,4 @@ doc:
 	cd doc; \
 	python make.py clean; \
 	python make.py html
+	python make.py spellcheck

doc/make.py (+15, -2)

@@ -224,8 +224,9 @@ def _sphinx_build(self, kind):
         --------
         >>> DocBuilder(num_jobs=4)._sphinx_build('html')
         """
-        if kind not in ('html', 'latex'):
-            raise ValueError('kind must be html or latex, not {}'.format(kind))
+        if kind not in ('html', 'latex', 'spelling'):
+            raise ValueError('kind must be html, latex or '
+                             'spelling, not {}'.format(kind))
 
         self._run_os('sphinx-build',
                      '-j{}'.format(self.num_jobs),

@@ -304,6 +305,18 @@ def zip_html(self):
                      '-q',
                      *fnames)
 
+    def spellcheck(self):
+        """Spell check the documentation."""
+        self._sphinx_build('spelling')
+        output_location = os.path.join('build', 'spelling', 'output.txt')
+        with open(output_location) as output:
+            lines = output.readlines()
+            if lines:
+                raise SyntaxError(
+                    'Found misspelled words.'
+                    ' Check pandas/doc/build/spelling/output.txt'
+                    ' for more details.')
+
 
 def main():
     cmds = [method for method in dir(DocBuilder) if not method.startswith('_')]

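The new ``spellcheck`` command is wired into ``main()`` like the other builder methods, so it is normally run as ``python make.py spellcheck`` from ``pandas/doc/``. A minimal sketch of the equivalent programmatic use (assumptions for illustration: the working directory is ``pandas/doc/`` and ``sphinxcontrib-spelling`` is installed)::

    # Hedged sketch: drives the new spellcheck() method directly instead of
    # going through main(); DocBuilder and num_jobs come from doc/make.py.
    from make import DocBuilder

    builder = DocBuilder(num_jobs=1)
    try:
        # builds the 'spelling' output, then inspects build/spelling/output.txt
        builder.spellcheck()
    except SyntaxError as exc:
        # spellcheck() raises SyntaxError when output.txt is non-empty
        print(exc)
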
doc/source/advanced.rst (+2, -2)

@@ -342,7 +342,7 @@ As usual, **both sides** of the slicers are included as this is label indexing.
                        columns=micolumns).sort_index().sort_index(axis=1)
    dfmi
 
-Basic multi-index slicing using slices, lists, and labels.
+Basic MultiIndex slicing using slices, lists, and labels.
 
 .. ipython:: python

@@ -1039,7 +1039,7 @@ On the other hand, if the index is not monotonic, then both slice bounds must be
    KeyError: 'Cannot get right slice bound for non-unique label: 3'
 
 :meth:`Index.is_monotonic_increasing` and :meth:`Index.is_monotonic_decreasing` only check that
-an index is weakly monotonic. To check for strict montonicity, you can combine one of those with
+an index is weakly monotonic. To check for strict monotonicity, you can combine one of those with
 :meth:`Index.is_unique`
 
 .. ipython:: python

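The weak-versus-strict distinction corrected in the second hunk can be illustrated with a small sketch (the index values are invented for the example)::

    import pandas as pd

    idx = pd.Index([1, 2, 2, 3])                   # non-decreasing, but contains a duplicate
    idx.is_monotonic_increasing                    # True: weakly monotonic
    idx.is_monotonic_increasing and idx.is_unique  # False: not strictly increasing
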
doc/source/basics.rst (+1, -1)

@@ -593,7 +593,7 @@ categorical columns:
    frame = pd.DataFrame({'a': ['Yes', 'Yes', 'No', 'No'], 'b': range(4)})
    frame.describe()
 
-This behaviour can be controlled by providing a list of types as ``include``/``exclude``
+This behavior can be controlled by providing a list of types as ``include``/``exclude``
 arguments. The special value ``all`` can also be used:
 
 .. ipython:: python

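For context, the ``include``/``exclude`` behavior described in that hunk selects which dtypes ``describe()`` summarizes; a short sketch reusing the ``frame`` from the context lines::

    import pandas as pd

    frame = pd.DataFrame({'a': ['Yes', 'Yes', 'No', 'No'], 'b': range(4)})
    frame.describe(include=['object'])   # summarize only the string column 'a'
    frame.describe(include='all')        # summarize every column at once
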
doc/source/conf.py (+4)

@@ -73,10 +73,14 @@
               'sphinx.ext.ifconfig',
               'sphinx.ext.linkcode',
               'nbsphinx',
+              'sphinxcontrib.spelling'
               ]
 
 exclude_patterns = ['**.ipynb_checkpoints']
 
+spelling_word_list_filename = ['spelling_wordlist.txt', 'names_wordlist.txt']
+spelling_ignore_pypi_package_names = True
+
 with open("index.rst") as f:
     index_rst_lines = f.readlines()

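The files listed in ``spelling_word_list_filename`` are plain word lists consumed by ``sphinxcontrib-spelling``, one accepted word per line; a hypothetical excerpt of ``spelling_wordlist.txt`` (these particular entries are illustrative, not taken from the commit) might look like::

    ndarray
    dtype
    Cython
    groupby
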
doc/source/contributing.rst (+19)

@@ -436,6 +436,25 @@ the documentation are also built by Travis-CI. These docs are then hosted `here
 <http://pandas-docs.github.io/pandas-docs-travis>`__, see also
 the :ref:`Continuous Integration <contributing.ci>` section.
 
+Spell checking documentation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When contributing documentation to **pandas**, it's good to check whether your work
+contains any spelling errors. Sphinx provides an easy way to spell check documentation
+and docstrings.
+
+Running the spell check is easy. Just navigate to your local ``pandas/doc/`` directory and run::
+
+    python make.py spellcheck
+
+The spell check takes a few minutes to run (between 1 and 6 minutes). Sphinx will alert you
+with warnings about misspelt words - these misspelt words are written to a file called
+``output.txt``, which you can find in your local ``pandas/doc/build/spelling/`` directory.
+
+The Sphinx spelling extension uses an EN-US dictionary to check words, which means that in
+some cases you might need to add a word to this dictionary. You can do so by adding the word to
+the bag-of-words file named ``spelling_wordlist.txt`` located in the folder ``pandas/doc/``.
+
 .. _contributing.code:
 
 Contributing to the code base

doc/source/contributing_docstring.rst (+3, -3)

@@ -103,7 +103,7 @@ left before or after the docstring. The text starts in the next line after the
 opening quotes. The closing quotes have their own line
 (meaning that they are not at the end of the last sentence).
 
-In rare occasions reST styles like bold text or itallics will be used in
+In rare occasions reST styles like bold text or italics will be used in
 docstrings, but is it common to have inline code, which is presented between
 backticks. It is considered inline code:
 

@@ -706,7 +706,7 @@ than 5, to show the example with the default values. If doing the ``mean``, we
 could use something like ``[1, 2, 3]``, so it is easy to see that the value
 returned is the mean.
 
-For more complex examples (groupping for example), avoid using data without
+For more complex examples (grouping for example), avoid using data without
 interpretation, like a matrix of random numbers with columns A, B, C, D...
 And instead use a meaningful example, which makes it easier to understand the
 concept. Unless required by the example, use names of animals, to keep examples

@@ -877,7 +877,7 @@ be tricky. Here are some attention points:
   the actual error only the error name is sufficient.
 
 * If there is a small part of the result that can vary (e.g. a hash in an object
-  represenation), you can use ``...`` to represent this part.
+  representation), you can use ``...`` to represent this part.
 
 If you want to show that ``s.plot()`` returns a matplotlib AxesSubplot object,
 this will fail the doctest ::

doc/source/cookbook.rst (+11, -11)

@@ -286,7 +286,7 @@ New Columns
    df = pd.DataFrame(
         {'AAA' : [1,1,1,2,2,2,3,3], 'BBB' : [2,1,3,4,5,1,2,3]}); df
 
-Method 1 : idxmin() to get the index of the mins
+Method 1 : idxmin() to get the index of the minimums
 
 .. ipython:: python

@@ -307,7 +307,7 @@ MultiIndexing
 
 The :ref:`multindexing <advanced.hierarchical>` docs.
 
-`Creating a multi-index from a labeled frame
+`Creating a MultiIndex from a labeled frame
 <http://stackoverflow.com/questions/14916358/reshaping-dataframes-in-pandas-based-on-column-labels>`__
 
 .. ipython:: python

@@ -330,7 +330,7 @@ The :ref:`multindexing <advanced.hierarchical>` docs.
 Arithmetic
 **********
 
-`Performing arithmetic with a multi-index that needs broadcasting
+`Performing arithmetic with a MultiIndex that needs broadcasting
 <http://stackoverflow.com/questions/19501510/divide-entire-pandas-multiindex-dataframe-by-dataframe-variable/19502176#19502176>`__
 
 .. ipython:: python

@@ -342,7 +342,7 @@ Arithmetic
 Slicing
 *******
 
-`Slicing a multi-index with xs
+`Slicing a MultiIndex with xs
 <http://stackoverflow.com/questions/12590131/how-to-slice-multindex-columns-in-pandas-dataframes>`__
 
 .. ipython:: python

@@ -363,7 +363,7 @@ To take the cross section of the 1st level and 1st axis the index:
 
    df.xs('six',level=1,axis=0)
 
-`Slicing a multi-index with xs, method #2
+`Slicing a MultiIndex with xs, method #2
 <http://stackoverflow.com/questions/14964493/multiindex-based-indexing-in-pandas>`__
 
 .. ipython:: python

@@ -386,13 +386,13 @@ To take the cross section of the 1st level and 1st axis the index:
    df.loc[(All,'Math'),('Exams')]
    df.loc[(All,'Math'),(All,'II')]
 
-`Setting portions of a multi-index with xs
+`Setting portions of a MultiIndex with xs
 <http://stackoverflow.com/questions/19319432/pandas-selecting-a-lower-level-in-a-dataframe-to-do-a-ffill>`__
 
 Sorting
 *******
 
-`Sort by specific column or an ordered list of columns, with a multi-index
+`Sort by specific column or an ordered list of columns, with a MultiIndex
 <http://stackoverflow.com/questions/14733871/mutli-index-sorting-in-pandas>`__
 
 .. ipython:: python

@@ -664,7 +664,7 @@ The :ref:`Pivot <reshaping.pivot>` docs.
 `Plot pandas DataFrame with year over year data
 <http://stackoverflow.com/questions/30379789/plot-pandas-data-frame-with-year-over-year-data>`__
 
-To create year and month crosstabulation:
+To create year and month cross tabulation:
 
 .. ipython:: python

@@ -677,7 +677,7 @@ To create year and month crosstabulation:
 Apply
 *****
 
-`Rolling Apply to Organize - Turning embedded lists into a multi-index frame
+`Rolling Apply to Organize - Turning embedded lists into a MultiIndex frame
 <http://stackoverflow.com/questions/17349981/converting-pandas-dataframe-with-categorical-values-into-binary-values>`__
 
 .. ipython:: python

@@ -1029,8 +1029,8 @@ Skip row between header and data
    01.01.1990 05:00;21;11;12;13
    """
 
-Option 1: pass rows explicitly to skiprows
-""""""""""""""""""""""""""""""""""""""""""
+Option 1: pass rows explicitly to skip rows
+"""""""""""""""""""""""""""""""""""""""""""
 
 .. ipython:: python

doc/source/dsintro.rst (+2, -2)

@@ -1014,7 +1014,7 @@ Deprecate Panel
 Over the last few years, pandas has increased in both breadth and depth, with new features,
 datatype support, and manipulation routines. As a result, supporting efficient indexing and functional
 routines for ``Series``, ``DataFrame`` and ``Panel`` has contributed to an increasingly fragmented and
-difficult-to-understand codebase.
+difficult-to-understand code base.
 
 The 3-D structure of a ``Panel`` is much less common for many types of data analysis,
 than the 1-D of the ``Series`` or the 2-D of the ``DataFrame``. Going forward it makes sense for

@@ -1023,7 +1023,7 @@ pandas to focus on these areas exclusively.
 Oftentimes, one can simply use a MultiIndex ``DataFrame`` for easily working with higher dimensional data.
 
 In addition, the ``xarray`` package was built from the ground up, specifically in order to
-support the multi-dimensional analysis that is one of ``Panel`` s main usecases.
+support the multi-dimensional analysis that is one of ``Panel`` s main use cases.
 `Here is a link to the xarray panel-transition documentation <http://xarray.pydata.org/en/stable/pandas.html#panel-transition>`__.
 
 .. ipython:: python

doc/source/ecosystem.rst (+3, -3)

@@ -187,8 +187,8 @@ and metadata disseminated in
 `SDMX <http://www.sdmx.org>`_ 2.1, an ISO-standard
 widely used by institutions such as statistics offices, central banks,
 and international organisations. pandaSDMX can expose datasets and related
-structural metadata including dataflows, code-lists,
-and datastructure definitions as pandas Series
+structural metadata including data flows, code-lists,
+and data structure definitions as pandas Series
 or multi-indexed DataFrames.
 
 `fredapi <https://github.com/mortada/fredapi>`__

@@ -263,7 +263,7 @@ Data validation
 `Engarde <http://engarde.readthedocs.io/en/latest/>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Engarde is a lightweight library used to explicitly state your assumptions abour your datasets
+Engarde is a lightweight library used to explicitly state your assumptions about your datasets
 and check that they're *actually* true.
 
 .. _ecosystem.extensions:

doc/source/enhancingperf.rst (+2, -2)

@@ -32,7 +32,7 @@ Cython (Writing C extensions for pandas)
 ----------------------------------------
 
 For many use cases writing pandas in pure Python and NumPy is sufficient. In some
-computationally heavy applications however, it can be possible to achieve sizeable
+computationally heavy applications however, it can be possible to achieve sizable
 speed-ups by offloading work to `cython <http://cython.org/>`__.
 
 This tutorial assumes you have refactored as much as possible in Python, for example

@@ -806,7 +806,7 @@ truncate any strings that are more than 60 characters in length. Second, we
 can't pass ``object`` arrays to ``numexpr`` thus string comparisons must be
 evaluated in Python space.
 
-The upshot is that this *only* applies to object-dtype'd expressions. So, if
+The upshot is that this *only* applies to object-dtype expressions. So, if
 you have an expression--for example
 
 .. ipython:: python

doc/source/extending.rst (+1, -1)

@@ -167,7 +167,7 @@ you can retain subclasses through ``pandas`` data manipulations.
 
 There are 3 constructor properties to be defined:
 
-- ``_constructor``: Used when a manipulation result has the same dimesions as the original.
+- ``_constructor``: Used when a manipulation result has the same dimensions as the original.
 - ``_constructor_sliced``: Used when a manipulation result has one lower dimension(s) as the original, such as ``DataFrame`` single columns slicing.
 - ``_constructor_expanddim``: Used when a manipulation result has one higher dimension as the original, such as ``Series.to_frame()`` and ``DataFrame.to_panel()``.

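The three constructor properties in that hunk are easiest to see in a short sketch of the documented subclassing pattern (the class names are illustrative, not part of this commit)::

    import pandas as pd

    class SubclassedSeries(pd.Series):
        @property
        def _constructor(self):
            return SubclassedSeries         # same dimensions: Series -> Series

        @property
        def _constructor_expanddim(self):
            return SubclassedDataFrame      # one dimension higher, e.g. Series.to_frame()

    class SubclassedDataFrame(pd.DataFrame):
        @property
        def _constructor(self):
            return SubclassedDataFrame      # same dimensions: DataFrame -> DataFrame

        @property
        def _constructor_sliced(self):
            return SubclassedSeries         # one dimension lower, e.g. single-column slicing

    df = SubclassedDataFrame({'A': [1, 2], 'B': [3, 4]})
    type(df['A'])                           # SubclassedSeries, not a plain Series
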
doc/source/groupby.rst (+2, -2)

@@ -994,7 +994,7 @@ is only interesting over one column (here ``colname``), it may be filtered
 Handling of (un)observed Categorical values
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-When using a ``Categorical`` grouper (as a single grouper, or as part of multipler groupers), the ``observed`` keyword
+When using a ``Categorical`` grouper (as a single grouper, or as part of multiple groupers), the ``observed`` keyword
 controls whether to return a cartesian product of all possible groupers values (``observed=False``) or only those
 that are observed groupers (``observed=True``).
 

@@ -1010,7 +1010,7 @@ Show only the observed values:
 
    pd.Series([1, 1, 1]).groupby(pd.Categorical(['a', 'a', 'a'], categories=['a', 'b']), observed=True).count()
 
-The returned dtype of the grouped will *always* include *all* of the catergories that were grouped.
+The returned dtype of the grouped will *always* include *all* of the categories that were grouped.
 
 .. ipython:: python

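The ``observed`` keyword described in that hunk is easiest to see side by side; a small sketch that mirrors the example already present in the context lines::

    import pandas as pd

    cat = pd.Categorical(['a', 'a', 'a'], categories=['a', 'b'])
    s = pd.Series([1, 1, 1])

    s.groupby(cat, observed=False).count()   # includes the unobserved category 'b' with count 0
    s.groupby(cat, observed=True).count()    # only the observed category 'a'
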
doc/source/indexing.rst (+1, -1)

@@ -700,7 +700,7 @@ Current Behavior
 Reindexing
 ~~~~~~~~~~
 
-The idiomatic way to achieve selecting potentially not-found elmenents is via ``.reindex()``. See also the section on :ref:`reindexing <basics.reindexing>`.
+The idiomatic way to achieve selecting potentially not-found elements is via ``.reindex()``. See also the section on :ref:`reindexing <basics.reindexing>`.
 
 .. ipython:: python

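A short sketch of the ``.reindex()`` idiom mentioned in that hunk (the labels are invented for the example)::

    import pandas as pd

    s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
    s.reindex(['a', 'z'])   # 'z' is not in the index, so it comes back as NaN instead of raising
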
doc/source/install.rst (+2, -2)

@@ -31,7 +31,7 @@ PyPI and through conda.
 Starting **January 1, 2019**, all releases will be Python 3 only.
 
 If there are people interested in continued support for Python 2.7 past December
-31, 2018 (either backporting bugfixes or funding) please reach out to the
+31, 2018 (either backporting bug fixes or funding) please reach out to the
 maintainers on the issue tracker.
 
 For more information, see the `Python 3 statement`_ and the `Porting to Python 3 guide`_.

@@ -199,7 +199,7 @@ Running the test suite
 ----------------------
 
 pandas is equipped with an exhaustive set of unit tests, covering about 97% of
-the codebase as of this writing. To run it on your machine to verify that
+the code base as of this writing. To run it on your machine to verify that
 everything is working (and that you have all of the dependencies, soft and hard,
 installed), make sure you have `pytest
 <http://doc.pytest.org/en/latest/>`__ and run:

doc/source/internals.rst (+1, -1)

@@ -41,7 +41,7 @@ There are functions that make the creation of a regular index easy:
 - ``date_range``: fixed frequency date range generated from a time rule or
   DateOffset. An ndarray of Python datetime objects
 - ``period_range``: fixed frequency date range generated from a time rule or
-  DateOffset. An ndarray of ``Period`` objects, representing Timespans
+  DateOffset. An ndarray of ``Period`` objects, representing timespans
 
 The motivation for having an ``Index`` class in the first place was to enable
 different implementations of indexing. This means that it's possible for you,

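For reference, the two index factories mentioned in that hunk behave roughly as follows (dates chosen arbitrarily for the example)::

    import pandas as pd

    pd.date_range('2018-01-01', periods=3, freq='D')   # DatetimeIndex of fixed-frequency timestamps
    pd.period_range('2018-01', periods=3, freq='M')    # PeriodIndex of Period objects (timespans)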