
TST: Fix doctest in _parse_latex_table_styles #42674


Closed
datapythonista opened this issue Jul 22, 2021 · 12 comments · Fixed by #42752
Labels: Docs, good first issue, Testing (pandas testing functions or related to the test suite)

Comments

@datapythonista
Member

tl;dr

Fix the following doctest error:

____________________________________________________ [doctest] pandas.io.formats.style_render._parse_latex_table_styles _____________________________________________________
1249 
1250     Return the first 'props' 'value' from ``tables_styles`` identified by ``selector``.
1251 
1252     Examples
1253     --------
1254     >>> table_styles = [{'selector': 'foo', 'props': [('attr','value')],
UNEXPECTED EXCEPTION: SyntaxError('invalid syntax', ('<doctest pandas.io.formats.style_render._parse_latex_table_styles[0]>', 2, 72, "                {'selector': 'bar', 'props': [('attr', 'overwritten')]},\n"))
Traceback (most recent call last):
  File "/home/mgarcia/miniconda3/envs/pandas-dev/lib/python3.8/doctest.py", line 1336, in __run
    exec(compile(example.source, filename, "single",
  File "<doctest pandas.io.formats.style_render._parse_latex_table_styles[0]>", line 2
    {'selector': 'bar', 'props': [('attr', 'overwritten')]},
                                                           ^
SyntaxError: invalid syntax
/home/mgarcia/src/pandas/pandas/io/formats/style_render.py:1254: UnexpectedException
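
The traceback is consistent with the first dict never being closed: its line ends in `],` rather than `]},`, so the parser treats the next line as still inside that dict and fails at the trailing comma. The sketch below checks this reading with `ast.parse`. It is a hypothetical reconstruction, since only the first two lines of the example appear in the traceback; the list is closed after the second dict purely for illustration, and `parses` is a helper name introduced here.

```python
import ast

# Hypothetical reconstruction of the doctest source (assumption: the real
# example has more lines; the list is closed early for illustration).
broken = (
    "table_styles = [{'selector': 'foo', 'props': [('attr','value')],\n"
    "                {'selector': 'bar', 'props': [('attr', 'overwritten')]}]\n"
)

# Same source with the closing brace added to the first dict.
fixed = (
    "table_styles = [{'selector': 'foo', 'props': [('attr','value')]},\n"
    "                {'selector': 'bar', 'props': [('attr', 'overwritten')]}]\n"
)


def parses(source):
    """Return True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(parses(broken), parses(fixed))
```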

Detailed instructions

Python allows example code to be included in docstrings, as in:

def add(num1, num2):
    """
    Computes the sum of the two numbers.

    Examples
    --------
    >>> add(2, 2)
    4
    """
    return num1 + num2

In pandas, we use this convention to document most objects. Tools such as pytest
can run the examples and check that the actual output matches the expected output.
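
As a rough illustration of what such tools do, the standard-library `doctest` module can run the example from the `add` docstring directly. This is only a sketch, not how the pandas CI invokes the doctests; `run_docstring_examples` is a hypothetical helper introduced here.

```python
import doctest


def add(num1, num2):
    """
    Computes the sum of the two numbers.

    Examples
    --------
    >>> add(2, 2)
    4
    """
    return num1 + num2


def run_docstring_examples(func):
    # Hypothetical helper: parse the docstring and run every example in it.
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_docstring_examples(add)
print(results.failed, results.attempted)  # prints "0 1"
```

The result reports one attempted example and no failures, because the actual output of `add(2, 2)` matches the `4` written in the docstring.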

For historical reasons, we have many examples where the code fails to run, or the
actual output differs from the expected output. For instance, consider the following
incorrect examples:

def add(num1, num2):
    """
    Computes the sum of the two numbers.

    Examples
    --------
    >>> add(2, 2)
    5

    >>> add(2, 2
    4

    >>> add(2, number)
    4

    ...
    """
    return num1 + num2

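All of them fail, each for a different reason: the first has the wrong expected
output, the second is missing a closing parenthesis, and the third uses an
undefined variable. To test the docstring of an object, run the following command: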

python -m pytest --doctest-modules pandas/core/frame.py::pandas.core.frame.DataFrame.info

Here, pandas/core/frame.py is the file where the docstring is defined, and
pandas.core.frame.DataFrame.info is the object. To test a whole file, remove
the :: and the object name from the command above.
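
To see how the three incorrect examples above fail without a pandas checkout, each one can be run on its own with the standard-library `doctest` module. This is a sketch only; `count_failures` and the labels in `examples` are names introduced here for illustration.

```python
import doctest


def add(num1, num2):
    return num1 + num2


# The three incorrect examples from the docstring, one snippet each.
examples = {
    "wrong expected output": ">>> add(2, 2)\n5\n",
    "missing parenthesis": ">>> add(2, 2\n4\n",
    "undefined name": ">>> add(2, number)\n4\n",
}


def count_failures(source):
    # Hypothetical helper: run one snippet and return how many examples failed.
    parser = doctest.DocTestParser()
    test = parser.get_doctest(source, {"add": add}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the textual report; only the failure count is used here.
    return runner.run(test, out=lambda text: None).failed


failures = {label: count_failures(src) for label, src in examples.items()}
print(failures)
```

Each snippet counts as one failure: the first because the output does not match, the other two because running the example raises an unexpected exception (`SyntaxError` and `NameError` respectively).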

In general, the errors in the examples can be fixed by:

  • Fixing a typo (a missing comma, a misspelled variable name...)
  • Defining an object that hasn't been defined (for example, if df is used but
    no sample dataset df has been defined first)
  • Fixing the expected output, when it's wrong
  • In exceptional cases, skipping examples that can't work when run,
    for example a function that connects to a private web service. In
    such cases, we can add # doctest: +SKIP at the end of the lines
    that should not run
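
The last option can be sketched with the standard-library `doctest` module. Everything in this example is hypothetical: `fetch_report` stands in for a function that needs a private web service, and the URL is a placeholder. The point is only that an example marked `# doctest: +SKIP` is shown to readers but never executed.

```python
import doctest


def fetch_report(url):
    """
    Hypothetical function that would need access to a private service.

    Examples
    --------
    >>> fetch_report("https://example.invalid/report")  # doctest: +SKIP
    'report body'
    """
    raise ConnectionError("no network access in this sketch")


parser = doctest.DocTestParser()
test = parser.get_doctest(
    fetch_report.__doc__, {"fetch_report": fetch_report}, "fetch_report", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed)
```

No failure is reported even though calling `fetch_report` would raise, because the skipped line is never run.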

To properly fix an example for the first time, follow these steps:

  • Install a pandas development environment on your computer. There are
    simplified instructions on this page,
    and more detailed information on the official pandas contributing page.
  • Run the doctests for the object of interest (the one in this issue),
    and make sure the examples are still broken in the master branch of
    pandas
  • Fix the file locally, and run the doctests again to make sure the
    fix works as expected
  • Optionally, check that the code in the examples
    follows PEP-8, and fix the style if it doesn't
  • Commit your changes, push your branch to a fork, and open a pull
    request. Make sure you edit the line Closes #XXXX with the issue
    number you are addressing, so the issue is automatically closed
    when the pull request is merged
  • Make sure the continuous integration of your pull request finishes
    in green. If it doesn't, check whether the problem is in your changes
    (sometimes things break in master for technical reasons, and in
    that case you just need to wait for a core developer to fix the
    problem)
  • Address any comments from the reviewers (just make changes locally,
    commit, and push to your branch; there is no need to open new pull requests)
@datapythonista added the Testing, Docs, and good first issue labels on Jul 22, 2021
@Leonardofreua
Contributor

I'll start looking into this problem, OK?

@datapythonista
Member Author

Sure. If you write take in a comment, the issue will be assigned to you, which makes it clearer that nobody else should work on it. It's a hack we added since we couldn't get GitHub to let anyone assign issues to themselves.

@Leonardofreua
Contributor

take

@Leonardofreua
Contributor

@datapythonista thanks for the tip.

@aneesh98
Contributor

Hi @Leonardofreua, I hope you are doing well. I wanted to ask whether it's okay with you if I work on this issue, because the doctest mentioned here is failing the CI checks in my pull request #42700. If you are already working on it and plan to open a PR, please let me know and I won't take it up, no problem.

@Leonardofreua
Contributor

Hi @aneesh98, I hope you are doing well too. Yes, I'm working on some doctests, but what do you think about opening a PR that fixes just the doctests failing in your PR? Then I'll open another one fixing the remaining ones.

@aneesh98
Contributor

Hi @Leonardofreua, I am doing well, thanks for asking. This particular doctest is failing in the CI check of my pull request, which is why I'm asking whether it's okay with you if I go ahead and fix this particular case and include the fix in my PR.

@Leonardofreua
Contributor

@aneesh98 you can proceed without any problem.

@Leonardofreua
Contributor

Leonardofreua commented Jul 26, 2021

Hi @datapythonista, I hope you are doing well. I would like to confirm a small detail with you for the case where the only alternative is to use # doctest: +SKIP.

For applied styles that change a cell's text or background color, could the examples also receive # doctest: +SKIP, since the result can't be expressed in the doctest output?

Example:

>>> def highlight_max(x, color):
...     return np.where(x == np.nanmax(x.to_numpy()), f"color: {color};", None)
>>> df = pd.DataFrame(np.random.randn(5, 2), columns=["A", "B"])
>>> df.style.apply(highlight_max, color='red')             
>>> df.style.apply(highlight_max, color='blue', axis=1)    
>>> df.style.apply(highlight_max, color='green', axis=None)

Or is there an alternative to show these changed results?

@datapythonista
Member Author

Good point. There are a few alternatives. The first is to show the code without generating any output, for example by assigning with result = ..., or by ending the line with a semicolon.

Alternatively, you can use +ELLIPSIS, which is similar to +SKIP, but instead of skipping the test you validate only the part of the output that you care about, and use ... to show that something else will be there. No strong preference in this case; use whatever you think will be more useful for readers of that documentation page.
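
A minimal sketch of the assignment and +ELLIPSIS options, using only the standard library rather than a pandas Styler. The examples are hypothetical: `object()` stands in for any value whose repr contains a part that changes between runs, and the expected repr assumes CPython, where it includes a memory address.

```python
import doctest

# Hypothetical doctest source: the first example assigns the result, so no
# output is produced; the second matches only the stable part of the repr.
source = """
>>> result = object()
>>> object()  # doctest: +ELLIPSIS
<object object at 0x...>
"""

parser = doctest.DocTestParser()
test = parser.get_doctest(source, {}, "ellipsis_sketch", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

Unlike +SKIP, both examples are actually executed here, so a typo or an undefined name in them would still be caught.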

@Leonardofreua
Contributor

@datapythonista I'm going to try some of the suggestions you gave.

But I ended up finding the documentation for the background_gradient() method, and I found this alternative quite interesting: displaying the result via a .jpg image.

image

This way the documentation makes it very clear what will happen when running the code.

@datapythonista
Member Author

We need the tests to pass, so we don't have errors in our examples, but what's important is that the documentation is useful for users. Didn't check this example in detail, but what you propose seems like a good idea. Thanks!

4 participants