DOC: update the IntervalIndex.from_array docstring #20224

Merged

@verakai verakai commented Mar 10, 2018

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

  • PR title is "DOC: update the docstring"
  • The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
  • The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • The html version looks good: python doc/make.py --single <your-function-or-method>
  • It has been proofread on language by another sprint participant

################################################################################
################# Docstring (pandas.IntervalIndex.from_arrays) #################
################################################################################

Construct an IntervalIndex from a given element in a left
and right array.

Parameters
----------
left : array-like (1-dimensional)
    Left bounds for each interval.
right : array-like (1-dimensional)
    Right bounds for each interval.
closed : {'left', 'right', 'both', 'neither'}, default 'right'
    Whether the intervals are closed on the left-side, right-side, both
    or neither.
name : object, optional
    Name to be stored in the index.
copy : boolean, default False
    Copy the data.
dtype : dtype or None, default None
    If None, dtype will be inferred.

    .. versionadded:: 0.23.0.

Returns
-------
index : IntervalIndex

Examples
--------
>>> pd.IntervalIndex.from_arrays([0, 1, 2], [1, 2, 3])
IntervalIndex([(0, 1], (1, 2], (2, 3]]
              closed='right',
              dtype='interval[int64]')

If you want to segment different groups of people based on
ages, you can apply the method as follows:

>>> ages = pd.IntervalIndex.from_arrays([0, 2, 13],
...                                     [2, 13, 19], closed='left')
>>> ages
IntervalIndex([[0, 2), [2, 13), [13, 19)]
      closed='left',
      dtype='interval[int64]')
>>> s = pd.Series(['baby', 'kid', 'teen'], ages)
>>> s
[0, 2)      baby
[2, 13)      kid
[13, 19)    teen
dtype: object

Notes
-----
Each element of `left` must be smaller or equal to the `right` element
at the same position, ie, ``left[i] <= right[i]``.

See Also
--------
interval_range : Function to create a fixed frequency IntervalIndex.
IntervalIndex.from_breaks : Construct an IntervalIndex from an array of
                            splits.
IntervalIndex.from_tuples : Construct an IntervalIndex from a
                            list/array of tuples.
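
As a usage note on the ages example above, the sketch below shows how a Series indexed this way could be queried with a scalar age. The use of `.loc` and the sample ages (5 and 13) are illustrative choices, not part of the proposed docstring.

```python
import pandas as pd

# Same bins as in the docstring example: [0, 2), [2, 13), [13, 19)
ages = pd.IntervalIndex.from_arrays([0, 2, 13], [2, 13, 19], closed='left')
s = pd.Series(['baby', 'kid', 'teen'], index=ages)

# A scalar label is matched against the interval that contains it.
group_for_5 = s.loc[5]    # 5 falls in [2, 13)
group_for_13 = s.loc[13]  # left-closed, so 13 belongs to [13, 19)
print(group_for_5, group_for_13)
```

Because the intervals are closed on the left, a boundary value such as 13 is assigned to the interval that starts at it rather than the one that ends at it.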

################################################################################
################################## Validation ##################################
################################################################################

Errors found:
	No summary found (a short summary in a single line should be present at the beginning of the docstring)

@@ -142,20 +142,24 @@ class IntervalIndex(IntervalMixin, Index):

 Parameters
 ----------
-data : array-like (1-dimensional)
+data : array-Like (1-dimensional)

array-like is fine

Also, could you limit your changes to just the from_arrays method? We don't want conflicts with other contributors.

[13, 19) teen
dtype: object

Notes

I think the order is Returns, Notes, See Also, then Examples.

@@ -457,7 +461,8 @@ def from_breaks(cls, breaks, closed='right', name=None, copy=False,
 def from_arrays(cls, left, right, closed='right', name=None, copy=False,
 dtype=None):
 """
-Construct an IntervalIndex from a a left and right array
+Construct an IntervalIndex from a given element in a left
+and right array.

Can you try to keep this in a single line? I also don't fully understand the "from a given element in ", so would maybe remove that part

Index : The base pandas Index type.
Interval : A bounded slice-like interval; the elements of an IntervalIndex.
qcut : Quantile-based discretization function.
cut : Return indices of half-open bins to which each value of x belongs.

I know this is the explanation in the cut docstring, but it is actually not correct (we are having a discussion about this in the PR doing the cut function :)). I would keep the explanation that was there before about converting array of continuous data into intervals
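
To make the "continuous data into intervals" reading of `cut` concrete, here is a minimal sketch that passes an IntervalIndex built with `from_arrays` as the bins. The sample values (1, 7, 15, 40) are made up for illustration and do not come from this PR.

```python
import pandas as pd

# Non-overlapping bins built from left and right bounds.
bins = pd.IntervalIndex.from_arrays([0, 2, 13], [2, 13, 19], closed='left')

# Each value is assigned to the interval that contains it;
# 40 lies outside every bin and becomes NaN (category code -1).
binned = pd.cut([1, 7, 15, 40], bins=bins)
codes = list(binned.codes)
print(binned)
```

The result is a Categorical whose categories are the intervals themselves, which is why describing `cut` as returning "indices of bins" is misleading.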

@@ -471,11 +476,15 @@ def from_arrays(cls, left, right, closed='right', name=None, copy=False,
 name : object, optional
 Name to be stored in the index.
 copy : boolean, default False
-copy the data
+Copy the data.
 dtype : dtype or None, default None

Can you change "dtype or None, default None" to "dtype, optional" ?
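
For readers unfamiliar with the parameter, a short sketch of what leaving `dtype` out versus passing it looks like. The `'interval[float64]'` string is just one example of a dtype that could be passed; it is an illustrative choice, not something this PR prescribes.

```python
import numpy as np
import pandas as pd

# dtype omitted: the interval subtype is inferred from the bounds.
inferred = pd.IntervalIndex.from_arrays([0, 1], [1, 2])

# dtype given: the bounds are cast to the requested subtype.
as_float = pd.IntervalIndex.from_arrays([0, 1], [1, 2],
                                        dtype='interval[float64]')

print(inferred.dtype.subtype, as_float.dtype.subtype)
```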

Notes
-----
Each element of `left` must be smaller or equal to the `right` element
at the same position, ie, ``left[i] <= right[i]``.

Can you also add that all elements within each array should have similar type? For example, the following raises on master since the mixed types result in object subtype (works on 0.22.0, but will be disallowed in the next release):

In [2]: pd.__version__
Out[2]: '0.23.0.dev0+482.gc3d491a'

In [3]: left = [1, pd.Timestamp('20180101')]

In [4]: right = [2, pd.Timestamp('20180201')]

In [5]: pd.IntervalIndex.from_arrays(left, right)
---------------------------------------------------------------------------
TypeError: category, object, and string subtypes are not supported for IntervalIndex
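
The session above can be contrasted with a homogeneous input in a small sketch. The variable names and the specific dates in the second half are illustrative assumptions; the mixed-type case mirrors the example in the comment.

```python
import pandas as pd

# Mixed int/Timestamp bounds are inferred as object dtype and rejected.
try:
    pd.IntervalIndex.from_arrays([1, pd.Timestamp('20180101')],
                                 [2, pd.Timestamp('20180201')])
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Bounds of a single type construct normally.
left = pd.to_datetime(['2018-01-01', '2018-02-01'])
right = pd.to_datetime(['2018-02-01', '2018-03-01'])
dates = pd.IntervalIndex.from_arrays(left, right)
print(mixed_ok, dates)
```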


Notes
-----
Each element of `left` must be smaller or equal to the `right` element

smaller --> less than
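
For context on the constraint this Notes sentence describes, a minimal sketch of a valid and an invalid call. The inputs are made up, and the `ValueError` is what pandas raises in the versions I am aware of; treat the exact exception as an assumption for other releases.

```python
import pandas as pd

# Valid: every left bound is <= its right bound (equal bounds are allowed).
ok = pd.IntervalIndex.from_arrays([0, 1, 2], [1, 2, 2])

# Invalid: left[0] > right[0].
try:
    pd.IntervalIndex.from_arrays([2], [1])
    raised = False
except ValueError:
    raised = True

print(len(ok), raised)
```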

codecov bot commented Mar 13, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@dd7f567).
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master   #20224   +/-   ##
=========================================
  Coverage          ?   91.72%           
=========================================
  Files             ?      150           
  Lines             ?    49152           
  Branches          ?        0           
=========================================
  Hits              ?    45086           
  Misses            ?     4066           
  Partials          ?        0

Flag         Coverage Δ
#multiple    90.11% <ø> (?)
#single      41.84% <ø> (?)

Impacted Files                     Coverage Δ
pandas/core/indexes/interval.py    93.08% <ø> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update dd7f567...277a0c5.

@TomAugspurger TomAugspurger added this to the 0.23.0 milestone Mar 13, 2018
@TomAugspurger TomAugspurger added the Interval Interval data type label Mar 13, 2018
@TomAugspurger TomAugspurger merged commit 7183439 into pandas-dev:master Mar 13, 2018
@TomAugspurger

Thanks @verakai!

Labels: Docs, Interval (Interval data type)

4 participants