diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index d01956bb79e11..ad36a68c448a9 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -36,23 +36,6 @@ repos:
       - id: rst-backticks
         # these exclusions should be removed and the files fixed
         exclude: (?x)(
-            text\.rst|
-            timeseries\.rst|
-            visualization\.rst|
-            missing_data\.rst|
-            options\.rst|
-            reshaping\.rst|
-            scale\.rst|
-            merging\.rst|
-            cookbook\.rst|
-            enhancingperf\.rst|
-            groupby\.rst|
-            io\.rst|
-            overview\.rst|
-            panel\.rst|
-            plotting\.rst|
-            10min\.rst|
-            basics\.rst|
             categorical\.rst|
             contributing\.rst|
             contributing_docstring\.rst|
diff --git a/doc/source/getting_started/overview.rst b/doc/source/getting_started/overview.rst
index 032ba73a7293d..57d87d4ec8a91 100644
--- a/doc/source/getting_started/overview.rst
+++ b/doc/source/getting_started/overview.rst
@@ -40,7 +40,7 @@ Here are just a few of the things that pandas does well:
     higher dimensional objects
   - Automatic and explicit **data alignment**: objects can be explicitly
     aligned to a set of labels, or the user can simply ignore the labels and
-    let `Series`, `DataFrame`, etc. automatically align the data for you in
+    let ``Series``, ``DataFrame``, etc. automatically align the data for you in
     computations
   - Powerful, flexible **group by** functionality to perform
    split-apply-combine operations on data sets, for both aggregating and
diff --git a/doc/source/reference/panel.rst b/doc/source/reference/panel.rst
index 94bfe87fe39f0..37d48c2dadf2e 100644
--- a/doc/source/reference/panel.rst
+++ b/doc/source/reference/panel.rst
@@ -7,4 +7,4 @@ Panel
 =====
 .. currentmodule:: pandas
 
-`Panel` was removed in 0.25.0. For prior documentation, see the `0.24 documentation <https://pandas.pydata.org/pandas-docs/version/0.24/>`_
+``Panel`` was removed in 0.25.0. For prior documentation, see the `0.24 documentation <https://pandas.pydata.org/pandas-docs/version/0.24/>`_
diff --git a/doc/source/reference/plotting.rst b/doc/source/reference/plotting.rst
index 95657dfa5fde5..632b39a1fa858 100644
--- a/doc/source/reference/plotting.rst
+++ b/doc/source/reference/plotting.rst
@@ -7,7 +7,7 @@ Plotting
 ========
 .. currentmodule:: pandas.plotting
 
-The following functions are contained in the `pandas.plotting` module.
+The following functions are contained in the ``pandas.plotting`` module.
 
 .. autosummary::
    :toctree: api/
diff --git a/doc/source/user_guide/10min.rst b/doc/source/user_guide/10min.rst
index 93c50fff40305..c3746cbe777a3 100644
--- a/doc/source/user_guide/10min.rst
+++ b/doc/source/user_guide/10min.rst
@@ -431,9 +431,9 @@ See more at :ref:`Histogramming and Discretization <basics.discretization>`.
 String Methods
 ~~~~~~~~~~~~~~
 
-Series is equipped with a set of string processing methods in the `str`
+Series is equipped with a set of string processing methods in the ``str``
 attribute that make it easy to operate on each element of the array, as in the
-code snippet below. Note that pattern-matching in `str` generally uses `regular
+code snippet below. Note that pattern-matching in ``str`` generally uses `regular
 expressions <https://docs.python.org/3/library/re.html>`__ by default (and in
 some cases always uses them). See more at :ref:`Vectorized String Methods
 <text.string_methods>`.
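A minimal sketch of the ``str`` accessor behavior the 10min.rst hunk above documents (illustrative data; any recent pandas works):

.. code-block:: python

    import pandas as pd

    s = pd.Series(["A", "B", "aAba", None])
    # Element-wise lowercasing; missing values propagate as NaN.
    s.str.lower()
    # ``contains`` interprets its pattern as a regular expression by default.
    s.str.contains("a|B")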
diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index 6b13319061ea4..e348111fe7881 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -1459,7 +1459,7 @@ for altering the ``Series.name`` attribute.
 
 .. versionadded:: 0.24.0
 
 The methods :meth:`DataFrame.rename_axis` and :meth:`Series.rename_axis`
-allow specific names of a `MultiIndex` to be changed (as opposed to the
+allow specific names of a ``MultiIndex`` to be changed (as opposed to the
 labels).
 
 .. ipython:: python
@@ -1592,7 +1592,7 @@ index value along with a Series containing the data in each row:
        row
 
    All values in ``row``, returned as a Series, are now upcasted
-   to floats, also the original integer value in column `x`:
+   to floats, including the original integer value in column ``x``:
 
 .. ipython:: python
@@ -1787,8 +1787,8 @@ used to sort a pandas object by its index levels.
 
 .. versionadded:: 1.1.0
 
 Sorting by index also supports a ``key`` parameter that takes a callable
-function to apply to the index being sorted. For `MultiIndex` objects,
-the key is applied per-level to the levels specified by `level`.
+function to apply to the index being sorted. For ``MultiIndex`` objects,
+the key is applied per-level to the levels specified by ``level``.
 
 .. ipython:: python
@@ -1812,8 +1812,8 @@ For information on key sorting by value, see :ref:`value sorting
 
 By values
 ~~~~~~~~~
 
-The :meth:`Series.sort_values` method is used to sort a `Series` by its values. The
-:meth:`DataFrame.sort_values` method is used to sort a `DataFrame` by its column or row values.
+The :meth:`Series.sort_values` method is used to sort a ``Series`` by its values. The
+:meth:`DataFrame.sort_values` method is used to sort a ``DataFrame`` by its column or row values.
 The optional ``by`` parameter to :meth:`DataFrame.sort_values` may be used to
 specify one or more columns to use to determine the sorted order.
@@ -1855,8 +1855,8 @@ to apply to the values being sorted.
 
     s1.sort_values()
     s1.sort_values(key=lambda x: x.str.lower())
 
-`key` will be given the :class:`Series` of values and should return a ``Series``
-or array of the same shape with the transformed values. For `DataFrame` objects,
+``key`` will be given the :class:`Series` of values and should return a ``Series``
+or array of the same shape with the transformed values. For ``DataFrame`` objects,
 the key is applied per column, so the key should still expect a Series and
 return a Series, e.g.
diff --git a/doc/source/user_guide/cookbook.rst b/doc/source/user_guide/cookbook.rst
index 7542e1dc7df6f..e33e85d3d2224 100644
--- a/doc/source/user_guide/cookbook.rst
+++ b/doc/source/user_guide/cookbook.rst
@@ -1270,7 +1270,7 @@ Often it's useful to obtain the lower (or upper) triangular form of a correlatio
 
     corr_mat.where(mask)
 
-The `method` argument within `DataFrame.corr` can accept a callable in addition to the named correlation types. Here we compute the `distance correlation <https://en.wikipedia.org/wiki/Distance_correlation>`__ matrix for a `DataFrame` object.
+The ``method`` argument within ``DataFrame.corr`` can accept a callable in addition to the named correlation types. Here we compute the `distance correlation <https://en.wikipedia.org/wiki/Distance_correlation>`__ matrix for a ``DataFrame`` object.
 
 .. ipython:: python
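The callable form of ``method`` referenced in the cookbook hunk takes two 1-D arrays and returns a scalar; a quick sketch (``abs_pearson`` is a made-up illustration, not from the docs):

.. code-block:: python

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.random.randn(50, 3), columns=["a", "b", "c"])

    def abs_pearson(x, y):
        # Receives two 1-D ndarrays; must return a scalar.
        return abs(np.corrcoef(x, y)[0, 1])

    df.corr(method=abs_pearson)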
diff --git a/doc/source/user_guide/enhancingperf.rst b/doc/source/user_guide/enhancingperf.rst
index 9e101c1a20371..ce9db0a5279c3 100644
--- a/doc/source/user_guide/enhancingperf.rst
+++ b/doc/source/user_guide/enhancingperf.rst
@@ -488,9 +488,9 @@ These operations are supported by :func:`pandas.eval`:
 * Attribute access, e.g., ``df.a``
 * Subscript expressions, e.g., ``df[0]``
 * Simple variable evaluation, e.g., ``pd.eval('df')`` (this is not very useful)
-* Math functions: `sin`, `cos`, `exp`, `log`, `expm1`, `log1p`,
-  `sqrt`, `sinh`, `cosh`, `tanh`, `arcsin`, `arccos`, `arctan`, `arccosh`,
-  `arcsinh`, `arctanh`, `abs`, `arctan2` and `log10`.
+* Math functions: ``sin``, ``cos``, ``exp``, ``log``, ``expm1``, ``log1p``,
+  ``sqrt``, ``sinh``, ``cosh``, ``tanh``, ``arcsin``, ``arccos``, ``arctan``, ``arccosh``,
+  ``arcsinh``, ``arctanh``, ``abs``, ``arctan2`` and ``log10``.
 
 This Python syntax is **not** allowed:
diff --git a/doc/source/user_guide/groupby.rst b/doc/source/user_guide/groupby.rst
index f745dab00bab8..52342de98de79 100644
--- a/doc/source/user_guide/groupby.rst
+++ b/doc/source/user_guide/groupby.rst
@@ -216,10 +216,10 @@ in case you want to include ``NA`` values in group keys, you could pass ``dropna
 
 .. ipython:: python
 
-    # Default `dropna` is set to True, which will exclude NaNs in keys
+    # Default ``dropna`` is set to True, which will exclude NaNs in keys
     df_dropna.groupby(by=["b"], dropna=True).sum()
 
-    # In order to allow NaN in keys, set `dropna` to False
+    # In order to allow NaN in keys, set ``dropna`` to False
     df_dropna.groupby(by=["b"], dropna=False).sum()
 
 The default setting of ``dropna`` argument is ``True`` which means ``NA`` are not included in group keys.
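A runnable sketch of the ``dropna`` group-key behavior shown in the groupby hunk (requires pandas >= 1.1; the frame is illustrative):

.. code-block:: python

    import numpy as np
    import pandas as pd

    df_dropna = pd.DataFrame({"a": [1, 2, 3], "b": [1.0, np.nan, 1.0]})

    df_dropna.groupby(by=["b"], dropna=True).sum()   # NaN key excluded
    df_dropna.groupby(by=["b"], dropna=False).sum()  # NaN kept as its own group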
diff --git a/doc/source/user_guide/io.rst b/doc/source/user_guide/io.rst
index a0b16e5fe5d1c..fc5aad12cd5e8 100644
--- a/doc/source/user_guide/io.rst
+++ b/doc/source/user_guide/io.rst
@@ -117,9 +117,9 @@ index_col : int, str, sequence of int / str, or False, default ``None``
 usecols : list-like or callable, default ``None``
     Return a subset of the columns. If list-like, all elements must either
     be positional (i.e. integer indices into the document columns) or strings
-    that correspond to column names provided either by the user in `names` or
+    that correspond to column names provided either by the user in ``names`` or
     inferred from the document header row(s). For example, a valid list-like
-    `usecols` parameter would be ``[0, 1, 2]`` or ``['foo', 'bar', 'baz']``.
+    ``usecols`` parameter would be ``[0, 1, 2]`` or ``['foo', 'bar', 'baz']``.
     Element order is ignored, so ``usecols=[0, 1]`` is the same as ``[1, 0]``.
     To instantiate a DataFrame from ``data`` with element order preserved use
@@ -157,7 +157,7 @@ General parsing configuration
 
 dtype : Type name or dict of column -> type, default ``None``
     Data type for data or columns. E.g. ``{'a': np.float64, 'b': np.int32}``
-    (unsupported with ``engine='python'``). Use `str` or `object` together
+    (unsupported with ``engine='python'``). Use ``str`` or ``object`` together
     with suitable ``na_values`` settings to preserve and not interpret dtype.
 engine : {``'c'``, ``'python'``}
@@ -215,19 +215,19 @@ na_values : scalar, str, list-like, or dict, default ``None``
 keep_default_na : boolean, default ``True``
     Whether or not to include the default NaN values when parsing the data.
-    Depending on whether `na_values` is passed in, the behavior is as follows:
+    Depending on whether ``na_values`` is passed in, the behavior is as follows:
 
-    * If `keep_default_na` is ``True``, and `na_values` are specified, `na_values`
+    * If ``keep_default_na`` is ``True``, and ``na_values`` are specified, ``na_values``
       is appended to the default NaN values used for parsing.
-    * If `keep_default_na` is ``True``, and `na_values` are not specified, only
+    * If ``keep_default_na`` is ``True``, and ``na_values`` are not specified, only
       the default NaN values are used for parsing.
-    * If `keep_default_na` is ``False``, and `na_values` are specified, only
-      the NaN values specified `na_values` are used for parsing.
-    * If `keep_default_na` is ``False``, and `na_values` are not specified, no
+    * If ``keep_default_na`` is ``False``, and ``na_values`` are specified, only
+      the NaN values specified ``na_values`` are used for parsing.
+    * If ``keep_default_na`` is ``False``, and ``na_values`` are not specified, no
       strings will be parsed as NaN.
 
-    Note that if `na_filter` is passed in as ``False``, the `keep_default_na` and
-    `na_values` parameters will be ignored.
+    Note that if ``na_filter`` is passed in as ``False``, the ``keep_default_na`` and
+    ``na_values`` parameters will be ignored.
 na_filter : boolean, default ``True``
     Detect missing value markers (empty strings and the value of na_values). In
     data without any NAs, passing ``na_filter=False`` can improve the performance
@@ -276,10 +276,10 @@ Iteration
 +++++++++
 
 iterator : boolean, default ``False``
-    Return `TextFileReader` object for iteration or getting chunks with
+    Return ``TextFileReader`` object for iteration or getting chunks with
     ``get_chunk()``.
 chunksize : int, default ``None``
-    Return `TextFileReader` object for iteration. See :ref:`iterating and chunking
+    Return ``TextFileReader`` object for iteration. See :ref:`iterating and chunking
     <io.chunking>` below.
 
 Quoting, compression, and file format
 +++++++++++++++++++++++++++++++++++++
@@ -299,7 +299,7 @@ compression : {``'infer'``, ``'gzip'``, ``'bz2'``, ``'zip'``, ``'xz'``, ``None``
 
     .. versionchanged:: 0.24.0 'infer' option added and set to default.
     .. versionchanged:: 1.1.0 dict option extended to support ``gzip`` and ``bz2``.
-    .. versionchanged:: 1.2.0 Previous versions forwarded dict entries for 'gzip' to `gzip.open`.
+    .. versionchanged:: 1.2.0 Previous versions forwarded dict entries for 'gzip' to ``gzip.open``.
 thousands : str, default ``None``
     Thousands separator.
 decimal : str, default ``'.'``
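A small sketch of the ``chunksize``/``TextFileReader`` iteration described above (in-memory CSV so the snippet is self-contained):

.. code-block:: python

    import io
    import pandas as pd

    data = "a,b\n1,x\n2,y\n3,z\n"
    reader = pd.read_csv(io.StringIO(data), chunksize=2)  # TextFileReader
    for chunk in reader:
        print(chunk.shape)  # (2, 2) then (1, 2)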
@@ -327,17 +327,17 @@ comment : str, default ``None``
     Indicates remainder of line should not be parsed. If found at the beginning
     of a line, the line will be ignored altogether. This parameter must be a
     single character. Like empty lines (as long as ``skip_blank_lines=True``), fully
-    commented lines are ignored by the parameter `header` but not by `skiprows`.
+    commented lines are ignored by the parameter ``header`` but not by ``skiprows``.
     For example, if ``comment='#'``, parsing '#empty\\na,b,c\\n1,2,3' with
-    `header=0` will result in 'a,b,c' being treated as the header.
+    ``header=0`` will result in 'a,b,c' being treated as the header.
 encoding : str, default ``None``
     Encoding to use for UTF when reading/writing (e.g. ``'utf-8'``). `List of
     Python standard encodings <https://docs.python.org/3/library/codecs.html#standard-encodings>`_.
 dialect : str or :class:`python:csv.Dialect` instance, default ``None``
     If provided, this parameter will override values (default or not) for the
-    following parameters: `delimiter`, `doublequote`, `escapechar`,
-    `skipinitialspace`, `quotechar`, and `quoting`. If it is necessary to
+    following parameters: ``delimiter``, ``doublequote``, ``escapechar``,
+    ``skipinitialspace``, ``quotechar``, and ``quoting``. If it is necessary to
     override values, a ParserWarning will be issued. See :class:`python:csv.Dialect`
     documentation for more details.
@@ -436,7 +436,7 @@ worth trying.
 
     mixed_df['col_1'].apply(type).value_counts()
     mixed_df['col_1'].dtype
 
-   will result with `mixed_df` containing an ``int`` dtype for certain chunks
+   will result in ``mixed_df`` containing an ``int`` dtype for certain chunks
    of the column, and ``str`` for others due to the mixed dtypes from the
    data that was read in. It is important to note that the overall column will be
    marked with a ``dtype`` of ``object``, which is used for columns with mixed dtypes.
@@ -896,7 +896,7 @@ You can also use a dict to specify custom name columns:
 
     df
 
 It is important to remember that if multiple text columns are to be parsed into
-a single date column, then a new column is prepended to the data. The `index_col`
+a single date column, then a new column is prepended to the data. The ``index_col``
 specification is based off of this new set of columns rather than the original
 data columns:
@@ -937,7 +937,7 @@ Pandas will try to call the ``date_parser`` function in three different ways. If
 an exception is raised, the next one is tried:
 
 1. ``date_parser`` is first called with one or more arrays as arguments,
-   as defined using `parse_dates` (e.g., ``date_parser(['2013', '2013'], ['1', '2'])``).
+   as defined using ``parse_dates`` (e.g., ``date_parser(['2013', '2013'], ['1', '2'])``).
 
 2. If #1 fails, ``date_parser`` is called with all the columns
    concatenated row-wise into a single array (e.g., ``date_parser(['2013 1', '2013 2'])``).
@@ -1369,7 +1369,7 @@ Files with fixed width columns
 While :func:`read_csv` reads delimited data, the :func:`read_fwf` function works
 with data files that have known and fixed column widths. The function parameters
-to ``read_fwf`` are largely the same as `read_csv` with two extra parameters, and
+to ``read_fwf`` are largely the same as ``read_csv`` with two extra parameters, and
 a different usage of the ``delimiter`` parameter:
 
 * ``colspecs``: A list of pairs (tuples) giving the extents of the
@@ -1402,7 +1402,7 @@ Consider a typical fixed-width data file:
 
     print(open('bar.csv').read())
 
 In order to parse this file into a ``DataFrame``, we simply need to supply the
-column specifications to the `read_fwf` function along with the file name:
+column specifications to the ``read_fwf`` function along with the file name:
 
 .. ipython:: python
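A runnable sketch of the ``colspecs`` usage referenced above (the column extents are illustrative; intervals are half-open, end exclusive):

.. code-block:: python

    import io
    import pandas as pd

    data = ("id8141    360.242940\n"
            "id1594    444.953632\n")
    # (from, to) pairs giving each fixed-width field's extent.
    pd.read_fwf(io.StringIO(data), colspecs=[(0, 6), (10, 21)], header=None)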
@@ -1718,7 +1718,7 @@ The ``Series`` and ``DataFrame`` objects have an instance method ``to_csv`` whic
 allows storing the contents of the object as a comma-separated-values file. The
 function takes a number of arguments. Only the first is required.
 
-* ``path_or_buf``: A string path to the file to write or a file object. If a file object it must be opened with `newline=''`
+* ``path_or_buf``: A string path to the file to write or a file object. If a file object it must be opened with ``newline=''``
 * ``sep`` : Field delimiter for the output file (default ",")
 * ``na_rep``: A string representation of a missing value (default '')
 * ``float_format``: Format string for floating point numbers
@@ -1726,13 +1726,13 @@ function takes a number of arguments. Only the first is required.
 * ``header``: Whether to write out the column names (default True)
 * ``index``: whether to write row (index) names (default True)
 * ``index_label``: Column label(s) for index column(s) if desired. If None
-  (default), and `header` and `index` are True, then the index names are
+  (default), and ``header`` and ``index`` are True, then the index names are
   used. (A sequence should be given if the ``DataFrame`` uses MultiIndex).
 * ``mode`` : Python write mode, default 'w'
 * ``encoding``: a string representing the encoding to use if the contents are
   non-ASCII, for Python versions prior to 3
-* ``line_terminator``: Character sequence denoting line end (default `os.linesep`)
-* ``quoting``: Set quoting rules as in csv module (default csv.QUOTE_MINIMAL). Note that if you have set a `float_format` then floats are converted to strings and csv.QUOTE_NONNUMERIC will treat them as non-numeric
+* ``line_terminator``: Character sequence denoting line end (default ``os.linesep``)
+* ``quoting``: Set quoting rules as in csv module (default csv.QUOTE_MINIMAL). Note that if you have set a ``float_format`` then floats are converted to strings and csv.QUOTE_NONNUMERIC will treat them as non-numeric
 * ``quotechar``: Character used to quote fields (default '"')
 * ``doublequote``: Control quoting of ``quotechar`` in fields (default True)
 * ``escapechar``: Character used to escape ``sep`` and ``quotechar`` when
@@ -1885,7 +1885,7 @@ preservation of metadata including but not limited to dtypes and index names.
    Any orient option that encodes to a JSON object will not preserve the ordering of
    index and column labels during round-trip serialization. If you wish to preserve
-   label ordering use the `split` option as it uses ordered containers.
+   label ordering use the ``split`` option as it uses ordered containers.
 
 Date handling
 +++++++++++++
@@ -2240,7 +2240,7 @@ For line-delimited json files, pandas can also return an iterator which reads in
 
     df
     df.to_json(orient='records', lines=True)
 
-    # reader is an iterator that returns `chunksize` lines each iteration
+    # reader is an iterator that returns ``chunksize`` lines each iteration
     reader = pd.read_json(StringIO(jsonl), lines=True, chunksize=1)
     reader
    for chunk in reader:
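The line-delimited reader from the hunk above, as a self-contained sketch:

.. code-block:: python

    import io
    import pandas as pd

    jsonl = '{"a": 1, "b": 2}\n{"a": 3, "b": 4}\n'
    # With ``lines=True`` and ``chunksize``, an iterator of DataFrames is returned.
    reader = pd.read_json(io.StringIO(jsonl), lines=True, chunksize=1)
    for chunk in reader:
        print(chunk)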
@@ -3092,7 +3092,7 @@ Dtype specifications
 ++++++++++++++++++++
 
 As an alternative to converters, the type for an entire column can
-be specified using the `dtype` keyword, which takes a dictionary
+be specified using the ``dtype`` keyword, which takes a dictionary
 mapping column names to types. To interpret data with no type inference, use
 the type ``str`` or ``object``.
@@ -3748,8 +3748,8 @@ Passing ``min_itemsize={`values`: size}`` as a parameter to append
 will set a larger minimum for the string columns. Storing ``floats,
 strings, ints, bools, datetime64`` are currently supported. For string
 columns, passing ``nan_rep = 'nan'`` to append will change the default
-nan representation on disk (which converts to/from `np.nan`), this
-defaults to `nan`.
+nan representation on disk (which converts to/from ``np.nan``), this
+defaults to ``nan``.
 
 .. ipython:: python
@@ -4045,7 +4045,7 @@ Query via data columns
 ++++++++++++++++++++++
 
 You can designate (and index) certain columns that you want to be able
-to perform queries (other than the `indexable` columns, which you can
+to perform queries (other than the ``indexable`` columns, which you can
 always query). For instance say you want to perform this common
 operation, on-disk, and return just the frame that matches this
 query. You can specify ``data_columns = True`` to force all columns to
 be ``data_columns``.
@@ -4076,7 +4076,7 @@ be ``data_columns``.
 
     store.root.df_dc.table
 
 There is some performance degradation by making lots of columns into
-`data columns`, so it is up to the user to designate these. In addition,
+``data columns``, so it is up to the user to designate these. In addition,
 you cannot change data columns (nor indexables) after the first
 append/put operation (Of course you can simply read in the data and
 create a new table!).
@@ -4203,7 +4203,7 @@ having a very wide table, but enables more efficient queries.
 
 The ``append_to_multiple`` method splits a given single DataFrame
 into multiple tables according to ``d``, a dictionary that maps the
-table names to a list of 'columns' you want in that table. If `None`
+table names to a list of 'columns' you want in that table. If ``None``
 is used in place of a list, that table will have the remaining
 unspecified columns of the given DataFrame. The argument ``selector``
 defines which table is the selector table (which you can make queries from).
@@ -4843,8 +4843,8 @@ Parquet supports partitioning of data based on the values of one or more columns
 
     df.to_parquet(path='test', engine='pyarrow', partition_cols=['a'], compression=None)
 
-The `path` specifies the parent directory to which data will be saved.
-The `partition_cols` are the column names by which the dataset will be partitioned.
+The ``path`` specifies the parent directory to which data will be saved.
+The ``partition_cols`` are the column names by which the dataset will be partitioned.
 Columns are partitioned in the order they are given. The partition splits are
 determined by the unique values in the partition columns.
 The above example creates a partitioned dataset that may look like:
@@ -5495,7 +5495,7 @@ SAS formats
 -----------
 
 The top-level function :func:`read_sas` can read (but not write) SAS
-`xport` (.XPT) and (since *v0.18.0*) `SAS7BDAT` (.sas7bdat) format files.
+XPORT (.xpt) and (since *v0.18.0*) SAS7BDAT (.sas7bdat) format files.
 
 SAS files only contain two value types: ASCII text and floating point
 values (usually 8 bytes but sometimes truncated). For xport files,
@@ -5543,7 +5543,7 @@ SPSS formats
 .. versionadded:: 0.25.0
 
 The top-level function :func:`read_spss` can read (but not write) SPSS
-`sav` (.sav) and `zsav` (.zsav) format files.
+SAV (.sav) and ZSAV (.zsav) format files.
 
 SPSS files contain column names. By default the whole file is read,
 categorical columns are converted into ``pd.Categorical``,
@@ -5566,7 +5566,7 @@ avoid converting categorical columns into ``pd.Categorical``:
 
     df = pd.read_spss('spss_data.sav', usecols=['foo', 'bar'],
                       convert_categoricals=False)
 
-More information about the `sav` and `zsav` file format is available here_.
+More information about the SAV and ZSAV file formats is available here_.
 
 .. _here: https://www.ibm.com/support/knowledgecenter/en/SSLVMB_22.0.0/com.ibm.spss.statistics.help/spss/base/savedatatypes.htm
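The parquet partitioning touched above, sketched end to end (assumes ``pyarrow`` is installed; ``test`` is a scratch directory):

.. code-block:: python

    import pandas as pd

    df = pd.DataFrame({"a": [0, 0, 1, 1], "b": range(4)})
    # Writes test/a=0/... and test/a=1/..., one directory per unique key.
    df.to_parquet(path="test", engine="pyarrow",
                  partition_cols=["a"], compression=None)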
diff --git a/doc/source/user_guide/merging.rst b/doc/source/user_guide/merging.rst
index bc8fc5a7e4f4e..aee56a2565310 100644
--- a/doc/source/user_guide/merging.rst
+++ b/doc/source/user_guide/merging.rst
@@ -77,7 +77,7 @@ some configurable handling of "what to do with the other axes":
            levels=None, names=None, verify_integrity=False, copy=True)
 
 * ``objs`` : a sequence or mapping of Series or DataFrame objects. If a
-  dict is passed, the sorted keys will be used as the `keys` argument, unless
+  dict is passed, the sorted keys will be used as the ``keys`` argument, unless
   it is passed, in which case the values will be selected (see below). Any None
   objects will be dropped silently unless they are all None in which case a
   ValueError will be raised.
@@ -1234,7 +1234,7 @@ resetting indexes.
    DataFrame.
 
 .. note::
-   When DataFrames are merged using only some of the levels of a `MultiIndex`,
+   When DataFrames are merged using only some of the levels of a ``MultiIndex``,
    the extra levels will be dropped from the resulting merge. In order to
    preserve those levels, use ``reset_index`` on those level names to move
    those levels to columns prior to doing the merge.
@@ -1487,7 +1487,7 @@ compare two DataFrame or Series, respectively, and summarize their differences.
 
 This feature was added in :ref:`V1.1.0 <whatsnew_110>`.
 
-For example, you might want to compare two `DataFrame` and stack their differences
+For example, you might want to compare two ``DataFrame`` and stack their differences
 side by side.
 
 .. ipython:: python
@@ -1523,7 +1523,7 @@ If you wish, you may choose to stack the differences on rows.
 
     df.compare(df2, align_axis=0)
 
-If you wish to keep all original rows and columns, set `keep_shape` argument
+If you wish to keep all original rows and columns, set the ``keep_shape`` argument
 to ``True``.
 
 .. ipython:: python
diff --git a/doc/source/user_guide/missing_data.rst b/doc/source/user_guide/missing_data.rst
index 06a7c6e33768e..9294897686d46 100644
--- a/doc/source/user_guide/missing_data.rst
+++ b/doc/source/user_guide/missing_data.rst
@@ -251,7 +251,7 @@ can propagate non-NA values forward or backward:
 **Limit the amount of filling**
 
 If we only want consecutive gaps filled up to a certain number of data points,
-we can use the `limit` keyword:
+we can use the ``limit`` keyword:
 
 .. ipython:: python
    :suppress:
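A sketch of the ``limit`` keyword covered in the missing-data hunk (illustrative Series):

.. code-block:: python

    import numpy as np
    import pandas as pd

    s = pd.Series([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan])
    # Forward-fill, but fill at most one consecutive NaN.
    s.fillna(method="ffill", limit=1)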
diff --git a/doc/source/user_guide/options.rst b/doc/source/user_guide/options.rst
index 398336960e769..563fc941294d1 100644
--- a/doc/source/user_guide/options.rst
+++ b/doc/source/user_guide/options.rst
@@ -109,7 +109,7 @@ It's also possible to reset multiple options at once (using a regex):
 
 ``option_context`` context manager has been exposed through the
 top-level API, allowing you to execute code with given option values. Option values
-are restored automatically when you exit the `with` block:
+are restored automatically when you exit the ``with`` block:
 
 .. ipython:: python
@@ -306,10 +306,10 @@ display.encoding UTF-8 Defaults to the detected en
                                            meant to be displayed on
                                            the console.
 display.expand_frame_repr       True       Whether to print out the full DataFrame
                                            repr for wide DataFrames across
-                                           multiple lines, `max_columns` is
+                                           multiple lines, ``max_columns`` is
                                            still respected, but the output will
                                            wrap-around across multiple "pages"
-                                           if its width exceeds `display.width`.
+                                           if its width exceeds ``display.width``.
 display.float_format            None       The callable should accept a floating
                                            point number and return a string with
                                            the desired format of the number.
@@ -371,11 +371,11 @@ display.max_rows 60 This sets the maximum numbe
                                            fully or just a truncated or
                                            summary repr. 'None' value means
                                            unlimited.
 display.min_rows                10         The numbers of rows to show in a truncated
-                                           repr (when `max_rows` is exceeded). Ignored
-                                           when `max_rows` is set to None or 0. When set
-                                           to None, follows the value of `max_rows`.
+                                           repr (when ``max_rows`` is exceeded). Ignored
+                                           when ``max_rows`` is set to None or 0. When set
+                                           to None, follows the value of ``max_rows``.
 display.max_seq_items           100        when pretty-printing a long sequence,
-                                           no more then `max_seq_items` will
+                                           no more than ``max_seq_items`` will
                                            be printed. If items are omitted,
                                            they will be denoted by the addition
                                            of "..." to the resulting string.
diff --git a/doc/source/user_guide/reshaping.rst b/doc/source/user_guide/reshaping.rst
index 1b90aeb00cf9c..e6797512ce3cf 100644
--- a/doc/source/user_guide/reshaping.rst
+++ b/doc/source/user_guide/reshaping.rst
@@ -609,8 +609,8 @@ This function is often used along with discretization functions like ``cut``:
 See also :func:`Series.str.get_dummies <pandas.Series.str.get_dummies>`.
 
 :func:`get_dummies` also accepts a ``DataFrame``. By default all categorical
-variables (categorical in the statistical sense, those with `object` or
-`categorical` dtype) are encoded as dummy variables.
+variables (categorical in the statistical sense, those with ``object`` or
+``categorical`` dtype) are encoded as dummy variables.
 
 .. ipython:: python
diff --git a/doc/source/user_guide/scale.rst b/doc/source/user_guide/scale.rst
index cddc3cb2600fd..206d8dd0f4739 100644
--- a/doc/source/user_guide/scale.rst
+++ b/doc/source/user_guide/scale.rst
@@ -214,7 +214,7 @@ work for arbitrary-sized datasets.
     for path in files:
         # Only one dataframe is in memory at a time...
         df = pd.read_parquet(path)
-        # ... plus a small Series `counts`, which is updated.
+        # ... plus a small Series ``counts``, which is updated.
         counts = counts.add(df['name'].value_counts(), fill_value=0)
     counts.astype(int)
@@ -349,7 +349,7 @@ Now we can do things like fast random access with ``.loc``.
 
     ddf.loc['2002-01-01 12:01':'2002-01-01 12:05'].compute()
 
-Dask knows to just look in the 3rd partition for selecting values in `2002`. It
+Dask knows to just look in the 3rd partition for selecting values in 2002. It
 doesn't need to look at any other data.
 
 Many workflows involve a large amount of data and processing it in a way that
diff --git a/doc/source/user_guide/text.rst b/doc/source/user_guide/text.rst
index e03ba74f95c90..dd6ac37d88f08 100644
--- a/doc/source/user_guide/text.rst
+++ b/doc/source/user_guide/text.rst
@@ -266,7 +266,7 @@ i.e., from the end of the string to the beginning of the string:
 
 Some caution must be taken to keep regular expressions in mind! For example, the
 following code will cause trouble because of the regular expression meaning of
-`$`:
+``$``:
 
 .. ipython:: python
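The ``$`` pitfall from the text.rst hunk, made concrete (``regex=True`` spelled out for clarity):

.. code-block:: python

    import pandas as pd

    dollars = pd.Series(["12", "-$10", "$10,000"])
    dollars.str.replace("$", "", regex=True)    # regex anchor: removes nothing
    dollars.str.replace(r"\$", "", regex=True)  # escaped: strips the dollar sign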
diff --git a/doc/source/user_guide/timeseries.rst b/doc/source/user_guide/timeseries.rst
index 868bf5a1672ff..253fea122b3f8 100644
--- a/doc/source/user_guide/timeseries.rst
+++ b/doc/source/user_guide/timeseries.rst
@@ -1800,12 +1800,12 @@ See :ref:`groupby.iterating-label` or :class:`Resampler.__iter__` for more.
 
 .. _timeseries.adjust-the-start-of-the-bins:
 
-Use `origin` or `offset` to adjust the start of the bins
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Use ``origin`` or ``offset`` to adjust the start of the bins
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 .. versionadded:: 1.1.0
 
-The bins of the grouping are adjusted based on the beginning of the day of the time series starting point. This works well with frequencies that are multiples of a day (like `30D`) or that divide a day evenly (like `90s` or `1min`). This can create inconsistencies with some frequencies that do not meet this criteria. To change this behavior you can specify a fixed Timestamp with the argument ``origin``.
+The bins of the grouping are adjusted based on the beginning of the day of the time series starting point. This works well with frequencies that are multiples of a day (like ``30D``) or that divide a day evenly (like ``90s`` or ``1min``). This can create inconsistencies with some frequencies that do not meet this criterion. To change this behavior you can specify a fixed Timestamp with the argument ``origin``.
 
 For example:
diff --git a/doc/source/user_guide/visualization.rst b/doc/source/user_guide/visualization.rst
index 8ce4b30c717a4..f41912445455d 100644
--- a/doc/source/user_guide/visualization.rst
+++ b/doc/source/user_guide/visualization.rst
@@ -67,7 +67,7 @@ On DataFrame, :meth:`~DataFrame.plot` is a convenience to plot all of the column
 
     @savefig frame_plot_basic.png
     df.plot();
 
-You can plot one column versus another using the `x` and `y` keywords in
+You can plot one column versus another using the ``x`` and ``y`` keywords in
 :meth:`~DataFrame.plot`:
 
 .. ipython:: python
@@ -496,7 +496,7 @@ Area plot
 
 You can create area plots with :meth:`Series.plot.area` and :meth:`DataFrame.plot.area`.
 Area plots are stacked by default. To produce stacked area plot, each column must be either all positive or all negative values.
-When input data contains `NaN`, it will be automatically filled by 0. If you want to drop or fill by different values, use :func:`dataframe.dropna` or :func:`dataframe.fillna` before calling `plot`.
+When input data contains ``NaN``, it will be automatically filled by 0. If you want to drop or fill by different values, use :func:`dataframe.dropna` or :func:`dataframe.fillna` before calling ``plot``.
 
 .. ipython:: python
    :suppress:
@@ -1078,7 +1078,7 @@ layout and formatting of the returned plot:
 
     plt.close('all')
 
-For each kind of plot (e.g. `line`, `bar`, `scatter`) any additional arguments
+For each kind of plot (e.g. ``line``, ``bar``, ``scatter``) any additional arguments
 keywords are passed along to the corresponding matplotlib function
 (:meth:`ax.plot() <matplotlib.axes.Axes.plot>`,
 :meth:`ax.bar() <matplotlib.axes.Axes.bar>`,
@@ -1271,7 +1271,7 @@ Using the ``x_compat`` parameter, you can suppress this behavior:
 
     plt.close('all')
 
 If you have more than one plot that needs to be suppressed, the ``use`` method
-in ``pandas.plotting.plot_params`` can be used in a `with statement`:
+in ``pandas.plotting.plot_params`` can be used in a ``with`` statement:
 
 .. ipython:: python
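The ``plot_params.use`` pattern from the final hunk, as a sketch (assumes matplotlib is available):

.. code-block:: python

    import pandas as pd

    df = pd.DataFrame({"x": range(10), "y": range(10)})
    # The option is restored when the ``with`` block exits.
    with pd.plotting.plot_params.use("x_compat", True):
        df.plot(x="x", y="y")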