Commit 8433562

tommyod authored and jreback committed
Spellcheck of docs, a few minor changes (pandas-dev#18973)
1 parent 4d571bb commit 8433562

6 files changed, +208 -186 lines changed

doc/source/advanced.rst

+43 -35
Original file line number, diff line number, diff line change
@@ -24,9 +24,9 @@ See the :ref:`Indexing and Selecting Data <indexing>` for general indexing docum
2424
Whether a copy or a reference is returned for a setting operation, may
2525
depend on the context. This is sometimes called ``chained assignment`` and
2626
should be avoided. See :ref:`Returning a View versus Copy
27-
<indexing.view_versus_copy>`
27+
<indexing.view_versus_copy>`.
2828

29-
See the :ref:`cookbook<cookbook.selection>` for some advanced strategies
29+
See the :ref:`cookbook<cookbook.selection>` for some advanced strategies.
3030

3131
.. _advanced.hierarchical:
3232

@@ -46,7 +46,7 @@ described above and in prior sections. Later, when discussing :ref:`group by
4646
non-trivial applications to illustrate how it aids in structuring data for
4747
analysis.
4848

49-
See the :ref:`cookbook<cookbook.multi_index>` for some advanced strategies
49+
See the :ref:`cookbook<cookbook.multi_index>` for some advanced strategies.
5050

5151
Creating a MultiIndex (hierarchical index) object
5252
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -59,7 +59,7 @@ can think of ``MultiIndex`` as an array of tuples where each tuple is unique. A
5959
``MultiIndex.from_tuples``), or a crossed set of iterables (using
6060
``MultiIndex.from_product``). The ``Index`` constructor will attempt to return
6161
a ``MultiIndex`` when it is passed a list of tuples. The following examples
62-
demo different ways to initialize MultiIndexes.
62+
demonstrate different ways to initialize MultiIndexes.
6363

6464

6565
.. ipython:: python
@@ -196,7 +196,8 @@ highly performant. If you want to see the actual used levels.
196196
# for a specific level
197197
df[['foo','qux']].columns.get_level_values(0)
198198
199-
To reconstruct the ``MultiIndex`` with only the used levels
199+
To reconstruct the ``MultiIndex`` with only the used levels, the
200+
``remove_unused_levels`` method may be used.
200201

201202
.. versionadded:: 0.20.0
202203
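A minimal sketch of the behaviour this hunk describes, assuming a small illustrative index (the labels below are not taken from the original example): slicing a ``MultiIndex`` keeps every original level, and ``remove_unused_levels`` rebuilds the index with only the levels still in use.

```python
import pandas as pd

# Illustrative index; the labels are assumptions, not from the original docs.
mi = pd.MultiIndex.from_product([["bar", "foo"], ["one", "two"]])

# Keep only the entries under 'foo'; the levels still list 'bar'.
sub = mi[2:]
levels_before = list(sub.levels[0])

# Rebuild the index so the levels contain only the values actually used.
trimmed = sub.remove_unused_levels()
levels_after = list(trimmed.levels[0])

print(levels_before, levels_after)
```

The values of the index are unchanged by the call; only the stored levels shrink.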

@@ -216,7 +217,7 @@ tuples:
216217
s + s[:-2]
217218
s + s[::2]
218219
219-
``reindex`` can be called with another ``MultiIndex`` or even a list or array
220+
``reindex`` can be called with another ``MultiIndex``, or even a list or array
220221
of tuples:
221222

222223
.. ipython:: python
@@ -230,7 +231,7 @@ Advanced indexing with hierarchical index
230231
-----------------------------------------
231232

232233
Syntactically integrating ``MultiIndex`` in advanced indexing with ``.loc`` is a
233-
bit challenging, but we've made every effort to do so. for example the
234+
bit challenging, but we've made every effort to do so. For example the
234235
following works as you would expect:
235236

236237
.. ipython:: python
@@ -286,7 +287,7 @@ As usual, **both sides** of the slicers are included as this is label indexing.
286287
287288
df.loc[(slice('A1','A3'),.....), :]
288289
289-
  rather than this:
290+
  You should **not** do this:
290291

291292
.. code-block:: python
292293
@@ -315,7 +316,7 @@ Basic multi-index slicing using slices, lists, and labels.
315316
316317
dfmi.loc[(slice('A1','A3'), slice(None), ['C1', 'C3']), :]
317318
318-
You can use a ``pd.IndexSlice`` to have a more natural syntax using ``:`` rather than using ``slice(None)``
319+
You can use :class:`pandas.IndexSlice` to facilitate a more natural syntax using ``:``, rather than using ``slice(None)``.
319320

320321
.. ipython:: python
321322
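To make the comparison in this hunk concrete, here is a small sketch on a hypothetical frame (``frame`` and its labels are assumptions for illustration). It selects the same rows once with ``slice(None)`` and once with ``pd.IndexSlice``.

```python
import pandas as pd

# Hypothetical two-level frame, used only for illustration.
index = pd.MultiIndex.from_product([["A0", "A1"], ["B0", "B1"]])
frame = pd.DataFrame({"x": range(4)}, index=index)

idx = pd.IndexSlice

# Every first-level label, but only 'B1' on the second level.
verbose = frame.loc[(slice(None), "B1"), :]
concise = frame.loc[idx[:, "B1"], :]

print(concise)
```

Both spellings should select the same rows; ``idx[:, "B1"]`` simply builds the tuple of slices for you.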
@@ -344,7 +345,7 @@ slicers on a single axis.
344345
345346
dfmi.loc(axis=0)[:, :, ['C1', 'C3']]
346347
347-
Furthermore you can *set* the values using these methods
348+
Furthermore you can *set* the values using the following methods.
348349

349350
.. ipython:: python
350351
@@ -379,7 +380,7 @@ selecting data at a particular level of a MultiIndex easier.
379380
df.loc[(slice(None),'one'),:]
380381
381382
You can also select on the columns with :meth:`~pandas.MultiIndex.xs`, by
382-
providing the axis argument
383+
providing the axis argument.
383384

384385
.. ipython:: python
385386
@@ -391,7 +392,7 @@ providing the axis argument
391392
# using the slicers
392393
df.loc[:,(slice(None),'one')]
393394
394-
:meth:`~pandas.MultiIndex.xs` also allows selection with multiple keys
395+
:meth:`~pandas.MultiIndex.xs` also allows selection with multiple keys.
395396

396397
.. ipython:: python
397398
@@ -403,13 +404,13 @@ providing the axis argument
403404
df.loc[:,('bar','one')]
404405
405406
You can pass ``drop_level=False`` to :meth:`~pandas.MultiIndex.xs` to retain
406-
the level that was selected
407+
the level that was selected.
407408

408409
.. ipython:: python
409410
410411
df.xs('one', level='second', axis=1, drop_level=False)
411412
412-
versus the result with ``drop_level=True`` (the default value)
413+
Compare the above with the result using ``drop_level=True`` (the default value).
413414

414415
.. ipython:: python
415416
@@ -470,7 +471,7 @@ allowing you to permute the hierarchical index levels in one step:
470471
Sorting a :class:`~pandas.MultiIndex`
471472
-------------------------------------
472473

473-
For MultiIndex-ed objects to be indexed & sliced effectively, they need
474+
For MultiIndex-ed objects to be indexed and sliced effectively, they need
474475
to be sorted. As with any index, you can use ``sort_index``.
475476

476477
.. ipython:: python
@@ -623,7 +624,8 @@ Index Types
623624
-----------
624625

625626
We have discussed ``MultiIndex`` in the previous sections pretty extensively. ``DatetimeIndex`` and ``PeriodIndex``
626-
are shown :ref:`here <timeseries.overview>`. ``TimedeltaIndex`` are :ref:`here <timedeltas.timedeltas>`.
627+
are shown :ref:`here <timeseries.overview>`, and information about
628+
``TimedeltaIndex`` is found :ref:`here <timedeltas.timedeltas>`.
627629

628630
In the following sub-sections we will highlight some other index types.
629631

@@ -647,44 +649,46 @@ and allows efficient indexing and storage of an index with a large number of dup
647649
df.dtypes
648650
df.B.cat.categories
649651
650-
Setting the index, will create a ``CategoricalIndex``
652+
Setting the index will create a ``CategoricalIndex``.
651653

652654
.. ipython:: python
653655
654656
df2 = df.set_index('B')
655657
df2.index
656658
657659
Indexing with ``__getitem__/.iloc/.loc`` works similarly to an ``Index`` with duplicates.
658-
The indexers MUST be in the category or the operation will raise.
660+
The indexers **must** be in the category or the operation will raise a ``KeyError``.
659661

660662
.. ipython:: python
661663
662664
df2.loc['a']
663665
664-
These PRESERVE the ``CategoricalIndex``
666+
The ``CategoricalIndex`` is **preserved** after indexing:
665667

666668
.. ipython:: python
667669
668670
df2.loc['a'].index
669671
670-
Sorting will order by the order of the categories
672+
Sorting the index will sort by the order of the categories (Recall that we
673+
created the index with ``CategoricalDtype(list('cab'))``, so the sorted
674+
order is ``cab``.).
671675

672676
.. ipython:: python
673677
674678
df2.sort_index()
675679
676-
Groupby operations on the index will preserve the index nature as well
680+
Groupby operations on the index will preserve the index nature as well.
677681

678682
.. ipython:: python
679683
680684
df2.groupby(level=0).sum()
681685
df2.groupby(level=0).sum().index
682686
683-
Reindexing operations, will return a resulting index based on the type of the passed
684-
indexer, meaning that passing a list will return a plain-old-``Index``; indexing with
687+
Reindexing operations will return a resulting index based on the type of the passed
688+
indexer. Passing a list will return a plain-old ``Index``; indexing with
685689
a ``Categorical`` will return a ``CategoricalIndex``, indexed according to the categories
686-
of the PASSED ``Categorical`` dtype. This allows one to arbitrarily index these even with
687-
values NOT in the categories, similarly to how you can reindex ANY pandas index.
690+
of the **passed** ``Categorical`` dtype. This allows one to arbitrarily index these even with
691+
values **not** in the categories, similarly to how you can reindex **any** pandas index.
688692

689693
.. ipython :: python
690694
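The ``CategoricalIndex`` statements earlier in this hunk can be checked with a short sketch. The frame below is an assumption modelled on the description (categories ordered ``c``, ``a``, ``b``), not a copy of the original example.

```python
import pandas as pd

# Assumed example data: column B is categorical with categories in the order c, a, b.
df = pd.DataFrame({
    "A": range(6),
    "B": pd.Categorical(list("aabbca"), categories=list("cab")),
})
df2 = df.set_index("B")

# A label outside the categories raises a KeyError.
try:
    df2.loc["e"]
    missing_raises = False
except KeyError:
    missing_raises = True

# Indexing with a valid label keeps the CategoricalIndex.
kept_type = isinstance(df2.loc["a"].index, pd.CategoricalIndex)

# Sorting follows the category order, not alphabetical order.
sorted_labels = list(df2.sort_index().index)
print(missing_raises, kept_type, sorted_labels)
```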
@@ -720,7 +724,8 @@ Int64Index and RangeIndex
720724
721725
Indexing on an integer-based Index with floats has been clarified in 0.18.0, for a summary of the changes, see :ref:`here <whatsnew_0180.float_indexers>`.
722726
723-
``Int64Index`` is a fundamental basic index in *pandas*. This is an Immutable array implementing an ordered, sliceable set.
727+
``Int64Index`` is a fundamental basic index in pandas.
728+
This is an Immutable array implementing an ordered, sliceable set.
724729
Prior to 0.18.0, the ``Int64Index`` would provide the default index for all ``NDFrame`` objects.
725730
726731
``RangeIndex`` is a sub-class of ``Int64Index`` added in version 0.18.0, now providing the default index for all ``NDFrame`` objects.
@@ -742,7 +747,7 @@ same.
742747
sf = pd.Series(range(5), index=indexf)
743748
sf
744749
745-
Scalar selection for ``[],.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)
750+
Scalar selection for ``[],.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``).
746751
747752
.. ipython:: python
748753
@@ -751,30 +756,32 @@ Scalar selection for ``[],.loc`` will always be label based. An integer will mat
751756
sf.loc[3]
752757
sf.loc[3.0]
753758
754-
The only positional indexing is via ``iloc``
759+
The only positional indexing is via ``iloc``.
755760
756761
.. ipython:: python
757762
758763
sf.iloc[3]
759764
760-
A scalar index that is not found will raise ``KeyError``
765+
A scalar index that is not found will raise a ``KeyError``.
761766
762-
Slicing is ALWAYS on the values of the index, for ``[],ix,loc`` and ALWAYS positional with ``iloc``
767+
Slicing is primarily on the values of the index when using ``[],ix,loc``, and
768+
**always** positional when using ``iloc``. The exception is when the slice is
769+
boolean, in which case it will always be positional.
763770
764771
.. ipython:: python
765772
766773
sf[2:4]
767774
sf.loc[2:4]
768775
sf.iloc[2:4]
769776
770-
In float indexes, slicing using floats is allowed
777+
In float indexes, slicing using floats is allowed.
771778
772779
.. ipython:: python
773780
774781
sf[2.1:4.6]
775782
sf.loc[2.1:4.6]
776783
777-
In non-float indexes, slicing using floats will raise a ``TypeError``
784+
In non-float indexes, slicing using floats will raise a ``TypeError``.
778785
779786
.. code-block:: ipython
780787
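A compact sketch of the float-index rules summarised in this hunk, assuming a series shaped like the ``sf`` used above (the index values here are illustrative). It uses only ``.loc`` and ``.iloc``.

```python
import pandas as pd

# Assumed float-indexed series, similar in shape to the ``sf`` in the docs.
sf = pd.Series(range(5), index=[1.5, 2, 3, 4.5, 5])

# Scalar selection with .loc is label based; 3 and 3.0 refer to the same label.
by_int = sf.loc[3]
by_float = sf.loc[3.0]

# Positional access goes through .iloc only.
by_position = sf.iloc[3]

# Float slice bounds select by label value, with both endpoints included.
window = sf.loc[2.1:4.6]
print(by_int, by_float, by_position, list(window.index))
```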
@@ -786,7 +793,7 @@ In non-float indexes, slicing using floats will raise a ``TypeError``
786793
787794
.. warning::
788795
789-
Using a scalar float indexer for ``.iloc`` has been removed in 0.18.0, so the following will raise a ``TypeError``
796+
Using a scalar float indexer for ``.iloc`` has been removed in 0.18.0, so the following will raise a ``TypeError``:
790797
791798
.. code-block:: ipython
792799
@@ -816,13 +823,13 @@ Selection operations then will always work on a value basis, for all selection o
816823
dfir.loc[0:1001,'A']
817824
dfir.loc[1000.4]
818825
819-
You could then easily pick out the first 1 second (1000 ms) of data then.
826+
You could retrieve the first 1 second (1000 ms) of data as such:
820827
821828
.. ipython:: python
822829
823830
dfir[0:1000]
824831
825-
Of course if you need integer based selection, then use ``iloc``
832+
If you need integer based selection, you should use ``iloc``:
826833
827834
.. ipython:: python
828835
@@ -975,6 +982,7 @@ consider the following Series:
975982
s
976983
977984
Suppose we wished to slice from ``c`` to ``e``, using integers this would be
985+
accomplished as such:
978986
979987
.. ipython:: python
980988

doc/source/basics.rst

+7 -6
Original file line number, diff line number, diff line change
@@ -436,7 +436,7 @@ General DataFrame Combine
436436
~~~~~~~~~~~~~~~~~~~~~~~~~
437437

438438
The :meth:`~DataFrame.combine_first` method above calls the more general
439-
DataFrame method :meth:`~DataFrame.combine`. This method takes another DataFrame
439+
:meth:`DataFrame.combine`. This method takes another DataFrame
440440
and a combiner function, aligns the input DataFrame and then passes the combiner
441441
function pairs of Series (i.e., columns whose names are the same).
442442

@@ -540,8 +540,8 @@ will exclude NAs on Series input by default:
540540
np.mean(df['one'])
541541
np.mean(df['one'].values)
542542
543-
``Series`` also has a method :meth:`~Series.nunique` which will return the
544-
number of unique non-NA values:
543+
:meth:`Series.nunique` will return the number of unique non-NA values in a
544+
Series:
545545

546546
.. ipython:: python
547547
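As a brief illustration of the ``nunique`` sentence in this hunk, the sketch below uses made-up values. The ``dropna=False`` call is included only to show the contrast with the default.

```python
import numpy as np
import pandas as pd

# Illustrative values: two distinct numbers plus one missing value.
s = pd.Series([1.0, 1.0, 2.0, np.nan])

# By default, missing values are not counted.
n_unique = s.nunique()

# With dropna=False, NaN counts as one additional distinct value.
n_unique_with_na = s.nunique(dropna=False)
print(n_unique, n_unique_with_na)
```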
@@ -852,7 +852,8 @@ Aggregation API
852852
The aggregation API allows one to express possibly multiple aggregation operations in a single concise way.
853853
This API is similar across pandas objects, see :ref:`groupby API <groupby.aggregate>`, the
854854
:ref:`window functions API <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
855-
The entry point for aggregation is the method :meth:`~DataFrame.aggregate`, or the alias :meth:`~DataFrame.agg`.
855+
The entry point for aggregation is :meth:`DataFrame.aggregate`, or the alias
856+
:meth:`DataFrame.agg`.
856857

857858
We will use a similar starting frame from above:
858859

@@ -1913,8 +1914,8 @@ dtype of the column will be chosen to accommodate all of the data types
19131914
# string data forces an ``object`` dtype
19141915
pd.Series([1, 2, 3, 6., 'foo'])
19151916
1916-
The method :meth:`~DataFrame.get_dtype_counts` will return the number of columns of
1917-
each type in a ``DataFrame``:
1917+
The number of columns of each type in a ``DataFrame`` can be found by calling
1918+
:meth:`~DataFrame.get_dtype_counts`.
19181919

19191920
.. ipython:: python
19201921
