BLD/TST: add pyarrow on CI to macosx build #18714

Closed
jreback opened this issue Dec 10, 2017 · 1 comment

Labels
CI (Continuous Integration), Compat (pandas objects compatibility with NumPy or Python functions), IO Parquet (parquet, feather)

jreback commented Dec 10, 2017

xref #18662 (comment)

Currently failing with pyarrow 0.7.1 and fastparquet (fp) 0.1.3 on macOS:

(pandas) bash-3.2$ pytest pandas/tests/io/test_parquet.py --tb=short
=========================================================================================== test session starts ===========================================================================================
platform darwin -- Python 3.6.1, pytest-3.2.1, py-1.4.34, pluggy-0.4.0
rootdir: /Users/jreback/pandas, inifile: setup.cfg
plugins: xdist-1.16.0, cov-2.3.1
collected 38 items                                                                                                                                                                                         

pandas/tests/io/test_parquet.py .....F............s...s...x...s.s.....

================================================================================================ FAILURES =================================================================================================
_________________________________________________________________________________________ test_cross_engine_pa_fp _________________________________________________________________________________________
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/api.py:96: in __init__
    with open_with(fn2, 'rb') as f:
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/util.py:44: in default_open
    return open(f, mode)
E   NotADirectoryError: [Errno 20] Not a directory: '/var/folders/h3/mr_r3bkj5yg0pbx9fr3tk1r00000gp/T/tmpii71wdx8/_metadata'

During handling of the above exception, another exception occurred:
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/api.py:119: in _parse_header
    fmd = read_thrift(f, parquet_thrift.FileMetaData)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/thrift_structures.py:22: in read_thrift
    obj.read(pin)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py:1899: in read
    _elem53.read(iprot)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py:1742: in read
    _elem33.read(iprot)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py:1656: in read
    self.meta_data.read(iprot)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py:1487: in read
    self.statistics.read(iprot)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py:298: in read
    iprot.skip(ftype)
../miniconda3/envs/pandas/lib/python3.6/site-packages/thrift/protocol/TProtocol.py:208: in skip
    self.readString()
../miniconda3/envs/pandas/lib/python3.6/site-packages/thrift/protocol/TProtocol.py:184: in readString
    return binary_to_str(self.readBinary())
../miniconda3/envs/pandas/lib/python3.6/site-packages/thrift/compat.py:37: in binary_to_str
    return bin_val.decode('utf8')
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 2: invalid start byte

During handling of the above exception, another exception occurred:
pandas/tests/io/test_parquet.py:186: in test_cross_engine_pa_fp
    result = read_parquet(path, engine=fp)
pandas/io/parquet.py:211: in read_parquet
    return impl.read(path, columns=columns, **kwargs)
pandas/io/parquet.py:123: in read
    return self.api.ParquetFile(path).to_pandas(columns=columns, **kwargs)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/api.py:102: in __init__
    self._parse_header(f, verify)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/api.py:122: in _parse_header
    self.fn)
E   fastparquet.util.ParquetException: Metadata parse failed: /var/folders/h3/mr_r3bkj5yg0pbx9fr3tk1r00000gp/T/tmpii71wdx8
======================================================================== 1 failed, 32 passed, 4 skipped, 1 xfailed in 2.93 seconds ========================================================================
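
The first error in the chain (`NotADirectoryError` on `.../_metadata`) looks like an expected probe rather than the real problem: the traceback suggests fastparquet first tries the path as a dataset directory and only then parses the file itself, which is where the `UnicodeDecodeError` surfaces. A minimal stdlib-only sketch of that probe-then-fall-back pattern, assuming this reading of the traceback is right (`read_header_bytes` is a hypothetical helper, not fastparquet's API):

```python
import os
import tempfile


def read_header_bytes(path):
    """Return (probe_error_name, first_four_bytes) for a Parquet path.

    Hypothetical helper that only mimics the control flow seen in the
    traceback; it does not parse any Parquet metadata.
    """
    try:
        # Dataset case: path is a directory holding a _metadata summary file.
        with open(os.path.join(path, "_metadata"), "rb") as f:
            return None, f.read(4)
    except OSError as exc:
        # Single-file case: the probe fails (NotADirectoryError on POSIX).
        # Anything raised inside this handler is reported as
        # "During handling of the above exception, another exception occurred".
        probe_error = type(exc).__name__
        with open(path, "rb") as f:
            return probe_error, f.read(4)


with tempfile.TemporaryDirectory() as tmp:
    single_file = os.path.join(tmp, "data.parquet")
    with open(single_file, "wb") as f:
        f.write(b"PAR1")  # Parquet magic bytes, used here as dummy content
    probe_error, payload = read_header_bytes(single_file)
    print(probe_error, payload)
```

If that reading holds, the part worth investigating is the second exception, raised while skipping a `statistics` field in the file metadata.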
jreback added the Compat, Difficulty Intermediate, and IO Parquet labels Dec 10, 2017
jreback modified the milestones: 0.22.0, 0.21.1 Dec 10, 2017

jreback commented Dec 10, 2017

The macOS build tests an older version of feather-format, but that does not prevent us from installing pyarrow there as well, so this failure should be reproducible on CI.
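
For reproducing outside the test suite, a rough sketch of the cross-engine round trip (write with pyarrow, read with fastparquet) might look like the following. The frame contents and the `cross_engine_roundtrip` name are made up for illustration and are not taken from `test_cross_engine_pa_fp`, so this may or may not trigger the same metadata parse failure:

```python
import os
import tempfile


def cross_engine_roundtrip():
    """Write a frame with pyarrow, read it back with fastparquet.

    Returns a short status string instead of raising, so it can be run
    on machines where one of the engines is missing.
    """
    try:
        import pandas as pd
        import pyarrow  # noqa: F401
        import fastparquet  # noqa: F401
    except ImportError as exc:
        return "skipped: {}".format(exc)

    # Illustrative data only; the real test uses its own fixture frame.
    df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cross_engine.parquet")
        try:
            df.to_parquet(path, engine="pyarrow")
            result = pd.read_parquet(path, engine="fastparquet")
        except Exception as exc:
            return "failed: {}: {}".format(type(exc).__name__, exc)
    return "ok" if result.equals(df) else "mismatch"


status = cross_engine_roundtrip()
print(status)
```

A `failed: ParquetException ...` status with the versions listed above would match the CI failure; `mismatch` would only mean the two engines disagree on dtypes or index for this particular frame.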
