[BLD] Add script that fails build if git tags do not exist #27770

qwhelan · 2019-08-06T03:53:13Z

The Travis build process can fail in hard to decipher ways if git tags are not synced between upstream and a developer's repo. This failure mode would occur in the main repo if we go 2000 commits (current clone depth, ~9 months) without tagging a release. This PR explicitly fails the build with instructions of how to resolve if this situation is encountered.

The underlying issue is that versioneer and the underlying git describe rely on tags to exist in the local repo for the purposes of creating a version description like:

v0.25.0-116-gee54d95952

As Travis makes a shallow clone, these tags are will not exist if:

A developer has not explicitly synced tags between repos (the default git behavior)
pandas goes 2000 commits without tagging a commit

If tags do not exist, we get something like:

0+untagged.2000.g66ada8c

Which causes causes tests involving downstream libraries like pyarrow, statsmodels, etc to fail their internal version checks in sometimes indecipherable ways like this:

=================================== FAILURES ===================================

____________________________ TestFeather.test_error ____________________________
[gw0] linux -- Python 3.7.3 /home/travis/miniconda3/envs/pandas-dev/bin/python
self = <pandas.tests.io.test_feather.TestFeather object at 0x7f353ab95c18>
    def test_error(self):
        for obj in [
            pd.Series([1, 2, 3]),
            1,
            "foo",
            pd.Timestamp("20130101"),
            np.array([1, 2, 3]),
        ]:
>           self.check_error_on_write(obj, ValueError)
pandas/tests/io/test_feather.py:49: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/tests/io/test_feather.py:27: in check_error_on_write
    to_feather(df, path)
pandas/io/feather_format.py:24: in to_feather
    from pyarrow import feather
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    from distutils.version import LooseVersion
    import os
    
    import six
    import pandas as pd
    import warnings
    
>   from pyarrow.compat import pdapi
E   ImportError: cannot import name 'pdapi' from 'pyarrow.compat' (/home/travis/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/pyarrow/compat.py)

With some digging, you can eventually find that the root cause is:

self = LooseVersion ('0+untagged.2000.g66ada8c'), other = LooseVersion ('0.21')
    def _cmp (self, other):
        if isinstance(other, str):
            other = LooseVersion(other)
    
        if self.version == other.version:
            return 0
>       if self.version < other.version:
E       TypeError: '<' not supported between instances of 'str' and 'int'

Syncing tags between repos resolves the issue.

WillAyd · 2019-08-06T15:22:43Z

Is there a downside to just removing this depth altogether?

https://docs.travis-ci.com/user/customizing-the-build/#git-clone-depth

(side note - unclear to me if removing it defaults to 50 or unlimited but figured you might know already)

qwhelan · 2019-08-06T15:54:52Z

@WillAyd Doing a full clone takes a few seconds longer (~6s instead of ~3s). Not sure if there's a space constraint/etc to care about, but Travis probably sees decent savings by setting it as the default.

If we do go that route, I think it'd make sense to leave the python test in case a dev is working with a shallow clone.

WillAyd · 2019-08-06T21:59:07Z

Yea I think the 3 seconds is probably worth the simplicity; it's insignificant in the overall process

jreback · 2019-08-07T13:26:57Z

yeah this is fine. we originally set the clone depth as IIRC that was how the examples did it. over the years have made it bigger; so this is fine. thanks.

…v#27770)

qwhelan force-pushed the git_tags_check branch 2 times, most recently from 95f5dd5 to 00e8d8a Compare August 6, 2019 05:22

gfyoung added the Build Library building on various platforms label Aug 6, 2019

gfyoung requested a review from jreback August 6, 2019 07:09

qwhelan added 2 commits August 6, 2019 20:59

[BLD] Add script that fails build if git tags do not exist

7f5f2f7

[BLD] Have Travis builds perform full clone instead of shallow ones

2d362bb

qwhelan force-pushed the git_tags_check branch from 00e8d8a to 2d362bb Compare August 7, 2019 03:59

jreback added this to the 1.0 milestone Aug 7, 2019

jreback merged commit 9724ace into pandas-dev:master Aug 7, 2019

quintusdias pushed a commit to quintusdias/pandas_dev that referenced this pull request Aug 16, 2019

[BLD] Add script that fails build if git tags do not exist (pandas-de…

13957ca

…v#27770)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BLD] Add script that fails build if git tags do not exist #27770

[BLD] Add script that fails build if git tags do not exist #27770

qwhelan commented Aug 6, 2019 •

edited

Loading

WillAyd commented Aug 6, 2019

qwhelan commented Aug 6, 2019

WillAyd commented Aug 6, 2019

jreback commented Aug 7, 2019

[BLD] Add script that fails build if git tags do not exist #27770

[BLD] Add script that fails build if git tags do not exist #27770

Conversation

qwhelan commented Aug 6, 2019 • edited Loading

WillAyd commented Aug 6, 2019

qwhelan commented Aug 6, 2019

WillAyd commented Aug 6, 2019

jreback commented Aug 7, 2019

qwhelan commented Aug 6, 2019 •

edited

Loading