Skip to content

[BLD] Add script that fails build if git tags do not exist #27770

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 7, 2019

Conversation

qwhelan
Copy link
Contributor

@qwhelan qwhelan commented Aug 6, 2019

The Travis build process can fail in hard to decipher ways if git tags are not synced between upstream and a developer's repo. This failure mode would occur in the main repo if we go 2000 commits (current clone depth, ~9 months) without tagging a release. This PR explicitly fails the build with instructions of how to resolve if this situation is encountered.

The underlying issue is that versioneer and the underlying git describe rely on tags to exist in the local repo for the purposes of creating a version description like:

v0.25.0-116-gee54d95952

As Travis makes a shallow clone, these tags are will not exist if:

  • A developer has not explicitly synced tags between repos (the default git behavior)
  • pandas goes 2000 commits without tagging a commit

If tags do not exist, we get something like:

0+untagged.2000.g66ada8c

Which causes causes tests involving downstream libraries like pyarrow, statsmodels, etc to fail their internal version checks in sometimes indecipherable ways like this:

=================================== FAILURES ===================================

____________________________ TestFeather.test_error ____________________________
[gw0] linux -- Python 3.7.3 /home/travis/miniconda3/envs/pandas-dev/bin/python
self = <pandas.tests.io.test_feather.TestFeather object at 0x7f353ab95c18>
    def test_error(self):
        for obj in [
            pd.Series([1, 2, 3]),
            1,
            "foo",
            pd.Timestamp("20130101"),
            np.array([1, 2, 3]),
        ]:
>           self.check_error_on_write(obj, ValueError)
pandas/tests/io/test_feather.py:49: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/tests/io/test_feather.py:27: in check_error_on_write
    to_feather(df, path)
pandas/io/feather_format.py:24: in to_feather
    from pyarrow import feather
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    from distutils.version import LooseVersion
    import os
    
    import six
    import pandas as pd
    import warnings
    
>   from pyarrow.compat import pdapi
E   ImportError: cannot import name 'pdapi' from 'pyarrow.compat' (/home/travis/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/pyarrow/compat.py)

With some digging, you can eventually find that the root cause is:

self = LooseVersion ('0+untagged.2000.g66ada8c'), other = LooseVersion ('0.21')
    def _cmp (self, other):
        if isinstance(other, str):
            other = LooseVersion(other)
    
        if self.version == other.version:
            return 0
>       if self.version < other.version:
E       TypeError: '<' not supported between instances of 'str' and 'int'

Syncing tags between repos resolves the issue.

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@qwhelan qwhelan force-pushed the git_tags_check branch 2 times, most recently from 95f5dd5 to 00e8d8a Compare August 6, 2019 05:22
@gfyoung gfyoung added the Build Library building on various platforms label Aug 6, 2019
@gfyoung gfyoung requested a review from jreback August 6, 2019 07:09
@WillAyd
Copy link
Member

WillAyd commented Aug 6, 2019

Is there a downside to just removing this depth altogether?

https://docs.travis-ci.com/user/customizing-the-build/#git-clone-depth

(side note - unclear to me if removing it defaults to 50 or unlimited but figured you might know already)

@qwhelan
Copy link
Contributor Author

qwhelan commented Aug 6, 2019

@WillAyd Doing a full clone takes a few seconds longer (~6s instead of ~3s). Not sure if there's a space constraint/etc to care about, but Travis probably sees decent savings by setting it as the default.

If we do go that route, I think it'd make sense to leave the python test in case a dev is working with a shallow clone.

@WillAyd
Copy link
Member

WillAyd commented Aug 6, 2019

Yea I think the 3 seconds is probably worth the simplicity; it's insignificant in the overall process

@jreback jreback added this to the 1.0 milestone Aug 7, 2019
@jreback
Copy link
Contributor

jreback commented Aug 7, 2019

yeah this is fine. we originally set the clone depth as IIRC that was how the examples did it. over the years have made it bigger; so this is fine. thanks.

@jreback jreback merged commit 9724ace into pandas-dev:master Aug 7, 2019
quintusdias pushed a commit to quintusdias/pandas_dev that referenced this pull request Aug 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants