diff --git a/doc/source/development/internals.rst b/doc/source/development/internals.rst
index f9cff9634f3cb..3dd687ef2087d 100644
--- a/doc/source/development/internals.rst
+++ b/doc/source/development/internals.rst
@@ -19,9 +19,6 @@ containers for the axis labels:
   assuming nothing about its contents. The labels must be hashable (and
   likely immutable) and unique. Populates a dict of label to location in
   Cython to do ``O(1)`` lookups.
-* ``Int64Index``: a version of ``Index`` highly optimized for 64-bit integer
-  data, such as time stamps
-* ``Float64Index``: a version of ``Index`` highly optimized for 64-bit float data
 * :class:`MultiIndex`: the standard hierarchical index object
 * :class:`DatetimeIndex`: An Index object with :class:`Timestamp` boxed elements (impl are the int64 values)
 * :class:`TimedeltaIndex`: An Index object with :class:`Timedelta` boxed elements (impl are the in64 values)
diff --git a/doc/source/user_guide/advanced.rst b/doc/source/user_guide/advanced.rst
index b8df21ab5a5b4..d6f7bc67b543c 100644
--- a/doc/source/user_guide/advanced.rst
+++ b/doc/source/user_guide/advanced.rst
@@ -848,125 +848,35 @@ values **not** in the categories, similarly to how you can reindex **any** panda
 
 .. _advanced.rangeindex:
 
-Int64Index and RangeIndex
-~~~~~~~~~~~~~~~~~~~~~~~~~
+RangeIndex
+~~~~~~~~~~
 
-.. deprecated:: 1.4.0
-    In pandas 2.0, :class:`Index` will become the default index type for numeric types
-    instead of ``Int64Index``, ``Float64Index`` and ``UInt64Index`` and those index types
-    are therefore deprecated and will be removed in a futire version.
-    ``RangeIndex`` will not be removed, as it represents an optimized version of an integer index.
-
-:class:`Int64Index` is a fundamental basic index in pandas. This is an immutable array
-implementing an ordered, sliceable set.
-
-:class:`RangeIndex` is a sub-class of ``Int64Index`` that provides the default index for all ``NDFrame`` objects.
-``RangeIndex`` is an optimized version of ``Int64Index`` that can represent a monotonic ordered set. These are analogous to Python `range types `__.
-
-.. _advanced.float64index:
-
-Float64Index
-~~~~~~~~~~~~
-
-.. deprecated:: 1.4.0
-    :class:`Index` will become the default index type for numeric types in the future
-    instead of ``Int64Index``, ``Float64Index`` and ``UInt64Index`` and those index types
-    are therefore deprecated and will be removed in a future version of Pandas.
-    ``RangeIndex`` will not be removed as it represents an optimized version of an integer index.
-
-By default a :class:`Float64Index` will be automatically created when passing floating, or mixed-integer-floating values in index creation.
-This enables a pure label-based slicing paradigm that makes ``[],ix,loc`` for scalar indexing and slicing work exactly the
-same.
-
-.. ipython:: python
-
-   indexf = pd.Index([1.5, 2, 3, 4.5, 5])
-   indexf
-   sf = pd.Series(range(5), index=indexf)
-   sf
-
-Scalar selection for ``[],.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``).
+:class:`RangeIndex` is a sub-class of :class:`Index` that provides the default index for all :class:`DataFrame` and :class:`Series` objects.
+``RangeIndex`` is an optimized version of ``Index`` that can represent a monotonic ordered set. These are analogous to Python `range types `__.
+A ``RangeIndex`` will always have an ``int64`` dtype.
 
 .. ipython:: python
 
-   sf[3]
-   sf[3.0]
-   sf.loc[3]
-   sf.loc[3.0]
+   idx = pd.RangeIndex(5)
+   idx
 
-The only positional indexing is via ``iloc``.
+``RangeIndex`` is the default index for all :class:`DataFrame` and :class:`Series` objects:
 
 .. ipython:: python
 
-   sf.iloc[3]
+   ser = pd.Series([1, 2, 3])
+   ser.index
+   df = pd.DataFrame([[1, 2], [3, 4]])
+   df.index
+   df.columns
 
-A scalar index that is not found will raise a ``KeyError``.
-Slicing is primarily on the values of the index when using ``[],ix,loc``, and
-**always** positional when using ``iloc``. The exception is when the slice is
-boolean, in which case it will always be positional.
-
-.. ipython:: python
-
-   sf[2:4]
-   sf.loc[2:4]
-   sf.iloc[2:4]
-
-In float indexes, slicing using floats is allowed.
-
-.. ipython:: python
-
-   sf[2.1:4.6]
-   sf.loc[2.1:4.6]
-
-In non-float indexes, slicing using floats will raise a ``TypeError``.
-
-.. code-block:: ipython
-
-   In [1]: pd.Series(range(5))[3.5]
-   TypeError: the label [3.5] is not a proper indexer for this index type (Int64Index)
-
-   In [1]: pd.Series(range(5))[3.5:4.5]
-   TypeError: the slice start [3.5] is not a proper indexer for this index type (Int64Index)
-
-Here is a typical use-case for using this type of indexing. Imagine that you have a somewhat
-irregular timedelta-like indexing scheme, but the data is recorded as floats. This could, for
-example, be millisecond offsets.
-
-.. ipython:: python
-
-   dfir = pd.concat(
-       [
-           pd.DataFrame(
-               np.random.randn(5, 2), index=np.arange(5) * 250.0, columns=list("AB")
-           ),
-           pd.DataFrame(
-               np.random.randn(6, 2),
-               index=np.arange(4, 10) * 250.1,
-               columns=list("AB"),
-           ),
-       ]
-   )
-   dfir
-
-Selection operations then will always work on a value basis, for all selection operators.
-
-.. ipython:: python
-
-   dfir[0:1000.4]
-   dfir.loc[0:1001, "A"]
-   dfir.loc[1000.4]
-
-You could retrieve the first 1 second (1000 ms) of data as such:
-
-.. ipython:: python
-
-   dfir[0:1000]
-
-If you need integer based selection, you should use ``iloc``:
+A ``RangeIndex`` will behave similarly to a :class:`Index` with an ``int64`` dtype and operations on a ``RangeIndex``,
+whose result cannot be represented by a ``RangeIndex``, but should have an integer dtype, will be converted to an ``Index`` with ``int64``.
+For example:
 
 .. ipython:: python
 
-   dfir.iloc[0:5]
+   idx[[0, 2]]
 
 .. _advanced.intervalindex:
 
diff --git a/doc/source/user_guide/indexing.rst b/doc/source/user_guide/indexing.rst
index 276157b2868b4..ec7fa5356aada 100644
--- a/doc/source/user_guide/indexing.rst
+++ b/doc/source/user_guide/indexing.rst
@@ -1582,8 +1582,27 @@ lookups, data alignment, and reindexing. The easiest way to create an
    index
    'd' in index
 
-You can also pass a ``name`` to be stored in the index:
+or using numbers:
+
+.. ipython:: python
 
+   index = pd.Index([1, 5, 12])
+   index
+   5 in index
+
+If no dtype is given, ``Index`` tries to infer the dtype from the data.
+It is also possible to give an explicit dtype when instantiating an :class:`Index`:
+
+.. ipython:: python
+
+   index = pd.Index(['e', 'd', 'a', 'b'], dtype="string")
+   index
+   index = pd.Index([1, 5, 12], dtype="int8")
+   index
+   index = pd.Index([1, 5, 12], dtype="float32")
+   index
+
+You can also pass a ``name`` to be stored in the index:
 
 .. ipython:: python
 
diff --git a/doc/source/user_guide/io.rst b/doc/source/user_guide/io.rst
index dc21b9f35d272..50aabad2d0bd3 100644
--- a/doc/source/user_guide/io.rst
+++ b/doc/source/user_guide/io.rst
@@ -4756,7 +4756,7 @@ Selecting coordinates
 ^^^^^^^^^^^^^^^^^^^^^
 
 Sometimes you want to get the coordinates (a.k.a the index locations) of your query. This returns an
-``Int64Index`` of the resulting locations. These coordinates can also be passed to subsequent
+``Index`` of the resulting locations. These coordinates can also be passed to subsequent
 ``where`` operations.
 
 .. ipython:: python
diff --git a/doc/source/user_guide/timedeltas.rst b/doc/source/user_guide/timedeltas.rst
index 318ca045847f4..3a75aa0b39b1f 100644
--- a/doc/source/user_guide/timedeltas.rst
+++ b/doc/source/user_guide/timedeltas.rst
@@ -477,7 +477,7 @@ Scalars type ops work as well. These can potentially return a *different* type o
    # division can result in a Timedelta if the divisor is an integer
    tdi / 2
 
-   # or a Float64Index if the divisor is a Timedelta
+   # or a float64 Index if the divisor is a Timedelta
    tdi / tdi[0]
 
 .. _timedeltas.resampling:
diff --git a/doc/source/whatsnew/v0.13.0.rst b/doc/source/whatsnew/v0.13.0.rst
index df9f0a953ffab..8ce038200acc4 100644
--- a/doc/source/whatsnew/v0.13.0.rst
+++ b/doc/source/whatsnew/v0.13.0.rst
@@ -310,7 +310,7 @@ Float64Index API change
 
 - Added a new index type, ``Float64Index``. This will be automatically created when passing floating values in index creation.
   This enables a pure label-based slicing paradigm that makes ``[],ix,loc`` for scalar indexing and slicing work exactly the
-  same. See :ref:`the docs`, (:issue:`263`)
+  same. (:issue:`263`)
 
   Construction is by default for floating type values.
 
diff --git a/doc/source/whatsnew/v2.0.0.rst b/doc/source/whatsnew/v2.0.0.rst
index f2615950afec1..ec3ae500a2c11 100644
--- a/doc/source/whatsnew/v2.0.0.rst
+++ b/doc/source/whatsnew/v2.0.0.rst
@@ -28,6 +28,82 @@ The available extras, found in the :ref:`installation guide` for more information (:issue:`42717`)
 - Removed deprecated :attr:`Timestamp.freq`, :attr:`Timestamp.freqstr` and argument ``freq`` from the :class:`Timestamp` constructor and :meth:`Timestamp.fromordinal` (:issue:`14146`)
 - Removed deprecated :class:`CategoricalBlock`, :meth:`Block.is_categorical`, require datetime64 and timedelta64 values to be wrapped in :class:`DatetimeArray` or :class:`TimedeltaArray` before passing to :meth:`Block.make_block_same_class`, require ``DatetimeTZBlock.values`` to have the correct ndim when passing to the :class:`BlockManager` constructor, and removed the "fastpath" keyword from the :class:`SingleBlockManager` constructor (:issue:`40226`, :issue:`40571`)
 - Removed deprecated global option ``use_inf_as_null`` in favor of ``use_inf_as_na`` (:issue:`17126`)