FIX: 'parser_trim_buffers' properly initializes word pointers #13788

Closed
wants to merge 12 commits into from

Conversation

ivannz
Contributor

@ivannz ivannz commented Jul 25, 2016

Summary

The pull request:

Details

Basically, the function parser_trim_buffers did not properly update the word pointers in parser->words and related fields. This pull request fixes that.

Changes in parser_trim_buffers:

  • the block L1239 -- L1256 (/* trim stream */) was augmented with a loop that updates parser->words;
  • the blocks L1224 -- L1237 (/* trim words, word_starts */) and L1239 -- L1256 (/* trim stream */) were swapped to preserve pointer consistency.
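
The pointer update described in the list above can be sketched roughly as follows. This is a simplified illustration, not the code in tokenizer.c: the `mini_parser` struct and the `mini_trim_stream` function are hypothetical names, plain `realloc` stands in for `safe_realloc`, and the sketch assumes that each word's offset from the start of the stream is stored alongside its pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the parser state (not the real parser_t). */
typedef struct {
    char   *stream;       /* one contiguous buffer of NUL-terminated words */
    size_t  stream_len;   /* bytes in use */
    size_t  stream_cap;   /* bytes allocated */
    char  **words;        /* words[i] points into stream */
    size_t *word_starts;  /* offset of word i from the start of stream */
    size_t  words_len;    /* number of words */
} mini_parser;

/* Shrink the stream buffer to new_cap bytes and re-derive every word pointer.
   Returns 0 on success (or when there is nothing to trim), -1 if realloc fails. */
static int mini_trim_stream(mini_parser *p, size_t new_cap) {
    char *base;
    size_t i;

    assert(p != NULL);
    if (new_cap == 0 || new_cap < p->stream_len || new_cap >= p->stream_cap) {
        return 0;  /* not a valid shrink; leave the buffer alone */
    }

    base = (char *)realloc(p->stream, new_cap);
    if (base == NULL) {
        return -1;  /* the old buffer and the old pointers are still valid */
    }

    /* realloc may have moved the block, so the old words[i] may dangle.
       Rebuild them from the offsets, which do not depend on the address. */
    for (i = 0; i < p->words_len; ++i) {
        p->words[i] = base + p->word_starts[i];
    }
    p->stream = base;
    p->stream_cap = new_cap;
    return 0;
}
```

Rebuilding the pointers unconditionally keeps the sketch short; checking whether the buffer address actually changed before looping would be an equally reasonable choice.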

Output of test.sh:

======================================================================
FAIL: test_round_trip_frame_sep (pandas.io.tests.test_clipboard.TestClipboard)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ivannz/Github/pd_fork/pandas/pandas/io/tests/test_clipboard.py", line 73, in test_round_trip_frame_sep
    self.check_round_trip_frame(dt, sep=',')
  File "/Users/ivannz/Github/pd_fork/pandas/pandas/io/tests/test_clipboard.py", line 69, in check_round_trip_frame
    tm.assert_frame_equal(data, result, check_dtype=False)
  File "/Users/ivannz/Github/pd_fork/pandas/pandas/util/testing.py", line 1276, in assert_frame_equal
    right.columns))
  File "/Users/ivannz/Github/pd_fork/pandas/pandas/util/testing.py", line 1022, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: DataFrame are different

DataFrame shape (number of columns) are different
[left]:  2, Index([u'en', u'es'], dtype='object')
[right]: 0, Index([], dtype='object')

----------------------------------------------------------------------
Ran 10377 tests in 1197.728s

FAILED (SKIP=363, failures=1)

Apart from this, there were a couple of deprecation warnings in files other than tokenizer.c.

@jreback
Contributor

jreback commented Jul 25, 2016

can you add a test that reproduces the segfault (and this PR fixes)

@jreback jreback added IO CSV read_csv, to_csv Bug labels Jul 25, 2016
@jreback
Contributor

jreback commented Jul 25, 2016

this is going to need a run of the asv suite for the csv benchmarks to see if anything changed. see here

@codecov-io

codecov-io commented Jul 25, 2016

Current coverage is 85.23% (diff: 100%)

Merging #13788 into master will not change coverage

@@             master     #13788   diff @@
==========================================
  Files           140        140          
  Lines         50415      50415          
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
  Hits          42971      42971          
  Misses         7444       7444          
  Partials          0          0          

Powered by Codecov. Last update a3cddfa...d59624e

@ivannz
Contributor Author

ivannz commented Jul 26, 2016

I ran the ASV benchmarks. It says

[ 12.88%] ··· Running gil.nogil_read_csv.time_nogil_read_csv                                                                                          327.76ms
[ 12.94%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_datetime                                                                                 326.62ms
[ 13.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_object                                                                                    18.14ms
[ 30.96%] ··· Running io_bench.read_csv_from_s3.time_read_nrows                                                                                        n/a;...
[ 31.02%] ··· Running io_bench.read_csv_infer_datetime_format_custom.time_read_csv_infer_datetime_format_custom                                        11.53ms
[ 31.08%] ··· Running io_bench.read_csv_infer_datetime_format_iso8601.time_read_csv_infer_datetime_format_iso8601                                       2.07ms
[ 31.14%] ··· Running io_bench.read_csv_infer_datetime_format_ymd.time_read_csv_infer_datetime_format_ymd                                               2.13ms
[ 31.20%] ··· Running io_bench.read_csv_skiprows.time_read_csv_skiprows                                                                                12.84ms
[ 31.26%] ··· Running io_bench.read_csv_standard.time_read_csv_standard                                                                                11.19ms
[ 34.46%] ··· Running packers.packers_read_csv.time_packers_read_csv                                                                                  180.63ms
[ 36.88%] ··· Running parser_vb.read_csv_comment2.time_read_csv_comment2                                                                               25.18ms
[ 36.94%] ··· Running parser_vb.read_csv_default_converter.time_read_csv_default_converter                                                              1.94ms
[ 37.00%] ··· Running parser_vb.read_csv_default_converter_python_engine.time_read_csv_default_converter                                                3.26ms
[ 37.06%] ··· Running parser_vb.read_csv_default_converter_with_decimal.time_read_csv_default_converter_with_decimal                                    1.98ms
[ 37.12%] ··· Running parser_vb.read_csv_default_converter_with_decimal_python_engine.time_read_csv_default_converter_with_decimal                      9.93ms
[ 37.18%] ··· Running parser_vb.read_csv_precise_converter.time_read_csv_precise_converter                                                              1.91ms
[ 37.24%] ··· Running parser_vb.read_csv_roundtrip_converter.time_read_csv_roundtrip_converter                                                          2.61ms
[ 37.30%] ··· Running parser_vb.read_csv_thou_vb.time_read_csv_thou_vb                                                                                 25.74ms
[ 37.36%] ··· Running parser_vb.read_csv_vb.time_read_csv_vb                                                                                           21.59ms
[ 62.88%] ··· Running gil.nogil_read_csv.time_nogil_read_csv                                                                                          321.78ms
[ 62.94%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_datetime                                                                                 322.15ms
[ 63.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_object                                                                                    16.19ms
[ 80.96%] ··· Running io_bench.read_csv_from_s3.time_read_nrows                                                                                        n/a;...
[ 81.02%] ··· Running io_bench.read_csv_infer_datetime_format_custom.time_read_csv_infer_datetime_format_custom                                        11.52ms
[ 81.08%] ··· Running io_bench.read_csv_infer_datetime_format_iso8601.time_read_csv_infer_datetime_format_iso8601                                       2.03ms
[ 81.14%] ··· Running io_bench.read_csv_infer_datetime_format_ymd.time_read_csv_infer_datetime_format_ymd                                               2.10ms
[ 81.20%] ··· Running io_bench.read_csv_skiprows.time_read_csv_skiprows                                                                                12.41ms
[ 81.26%] ··· Running io_bench.read_csv_standard.time_read_csv_standard                                                                                10.81ms
[ 84.46%] ··· Running packers.packers_read_csv.time_packers_read_csv                                                                                  171.84ms
[ 86.88%] ··· Running parser_vb.read_csv_comment2.time_read_csv_comment2                                                                               25.49ms
[ 86.94%] ··· Running parser_vb.read_csv_default_converter.time_read_csv_default_converter                                                              1.87ms
[ 87.00%] ··· Running parser_vb.read_csv_default_converter_python_engine.time_read_csv_default_converter                                                3.26ms
[ 87.06%] ··· Running parser_vb.read_csv_default_converter_with_decimal.time_read_csv_default_converter_with_decimal                                    1.80ms
[ 87.12%] ··· Running parser_vb.read_csv_default_converter_with_decimal_python_engine.time_read_csv_default_converter_with_decimal                      9.94ms
[ 87.18%] ··· Running parser_vb.read_csv_precise_converter.time_read_csv_precise_converter                                                              1.74ms
[ 87.24%] ··· Running parser_vb.read_csv_roundtrip_converter.time_read_csv_roundtrip_converter                                                          2.42ms
[ 87.30%] ··· Running parser_vb.read_csv_thou_vb.time_read_csv_thou_vb                                                                                 25.13ms
[ 87.36%] ··· Running parser_vb.read_csv_vb.time_read_csv_vb                                                                                           20.23ms

...

BENCHMARKS NOT SIGNIFICANTLY CHANGED.

The full results are in the attached archive.
results.tar.gz

@ivannz
Contributor Author

ivannz commented Jul 26, 2016

I added a test that checks for either a segfault or memory corruption during parsing.

@jorisvandenbossche
Member

@ivannz You pulled in some changes from other PRs that have been merged recently. Can you rebase to fix this? Normally

git fetch upstream
git rebase upstream/master
git push -f origin parser_trim_fix

should do the trick

@jorisvandenbossche
Member

For the asv benchmarks, you will need to compare against current master to see if anything changed. In fact, we just merged a pull request to clarify how to do this (#13794):

asv continuous upstream/master HEAD -b csv

@ivannz ivannz force-pushed the parser_trim_fix branch from c47684b to aec02e1 Compare July 26, 2016 09:28
@jreback
Contributor

jreback commented Jul 26, 2016

tests are just like any other parser test
put in io/tests/parser/common.py

@ivannz ivannz force-pushed the parser_trim_fix branch from aec02e1 to c092c2b Compare July 26, 2016 10:07
@ivannz
Contributor Author

ivannz commented Jul 26, 2016

I had to make the test more stressful, because sometimes safe_realloc just expands the parser->stream buffer in place, which does not corrupt the pointers in parser->words.

import pandas as pd
from pandas.compat import StringIO
record_ = "9999-9,99:99,,,,ZZ,ZZ,,,ZZZ-ZZZZ,.Z-ZZZZ,-9.99,,,9.99,ZZZZZ,,-99,9,ZZZ-ZZZZ,ZZ-ZZZZ,,9.99,ZZZ-ZZZZZ,ZZZ-ZZZZZ,ZZZ-ZZZZ,ZZZ-ZZZZ,ZZZ-ZZZZ,ZZZ-ZZZZ,ZZZ-ZZZZ,ZZZ-ZZZZ,999,ZZZ-ZZZZ,,ZZ-ZZZZ,,,,,ZZZZ,ZZZ-ZZZZZ,ZZZ-ZZZZ,,,9,9,9,9,99,99,999,999,ZZZZZ,ZZZ-ZZZZZ,ZZZ-ZZZZ,9,ZZ-ZZZZ,9.99,ZZ-ZZZZ,ZZ-ZZZZ,,,,ZZZZ,,,ZZ,ZZ,,,,,,,,,,,,,9,,,999.99,999.99,,,ZZZZZ,,,Z9,,,,,,,ZZZ,ZZZ,,,,,,,,,,,ZZZZZ,ZZZZZ,ZZZ-ZZZZZZ,ZZZ-ZZZZZZ,ZZ-ZZZZ,ZZ-ZZZZ,ZZ-ZZZZ,ZZ-ZZZZ,,,999999,999999,ZZZ,ZZZ,,,ZZZ,ZZZ,999.99,999.99,,,,ZZZ-ZZZ,ZZZ-ZZZ,-9.99,-9.99,9,9,,99,,9.99,9.99,9,9,9.99,9.99,,,,9.99,9.99,,99,,99,9.99,9.99,,,ZZZ,ZZZ,,999.99,,999.99,ZZZ,ZZZ-ZZZZ,ZZZ-ZZZZ,,,ZZZZZ,ZZZZZ,ZZZ,ZZZ,9,9,,,,,,ZZZ-ZZZZ,ZZZ999Z,,,999.99,,999.99,ZZZ-ZZZZ,,,9.999,9.999,9.999,9.999,-9.999,-9.999,-9.999,-9.999,9.999,9.999,9.999,9.999,9.999,9.999,9.999,9.999,99999,ZZZ-ZZZZ,,9.99,ZZZ,,,,,,,,ZZZ,,,,,9,,,,9,,,,,,,,,,ZZZ-ZZZZ,ZZZ-ZZZZ,,ZZZZZ,ZZZZZ,ZZZZZ,ZZZZZ,,,9.99,,ZZ-ZZZZ,ZZ-ZZZZ,ZZ,999,,,,ZZ-ZZZZ,ZZZ,ZZZ,ZZZ-ZZZZ,ZZZ-ZZZZ,,,99.99,99.99,,,9.99,9.99,9.99,9.99,ZZZ-ZZZZ,,,ZZZ-ZZZZZ,,,,,-9.99,-9.99,-9.99,-9.99,,,,,,,,,ZZZ-ZZZZ,,9,9.99,9.99,99ZZ,,-9.99,-9.99,ZZZ-ZZZZ,,,,,,,ZZZ-ZZZZ,9.99,9.99,9999,,,,,,,,,,-9.9,Z/Z-ZZZZ,999.99,9.99,,999.99,ZZ-ZZZZ,ZZ-ZZZZ,9.99,9.99,9.99,9.99,9.99,9.99,,ZZZ-ZZZZZ,ZZZ-ZZZZZ,ZZZ-ZZZZZ,ZZZ-ZZZZZ,ZZZ-ZZZZZ,ZZZ,ZZZ,ZZZ,ZZZ,9.99,,,-9.99,ZZ-ZZZZ,-999.99,,-9999,,999.99,,,,999.99,99.99,,,ZZ-ZZZZZZZZ,ZZ-ZZZZ-ZZZZZZZ,,,,ZZ-ZZ-ZZZZZZZZ,ZZZZZZZZ,ZZZ-ZZZZ,9999,999.99,ZZZ-ZZZZ,-9.99,-9.99,ZZZ-ZZZZ,99:99:99,,99,99,,9.99,,-99.99,,,,,,9.99,ZZZ-ZZZZ,-9.99,-9.99,9.99,9.99,,ZZZ,,,,,,,ZZZ,ZZZ,,,,,"
csv_data = "\\n".join([record_]*173) + "\\n"
Contributor

just write actual code. no need to shell out to do this.

further you need to compare the result with an expected frame.

@ivannz
Contributor Author

ivannz commented Jul 26, 2016

Here are the results of asv continuous -E virtualenv upstream/master HEAD -b csv:

Running 50 total benchmarks (2 commits * 1 environments * 25 benchmarks)
[  0.00%] · For pandas commit hash c092c2b0:
[  2.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv                                                                                                                                330.53ms
[  4.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_datetime                                                                                                                       331.21ms
[  6.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_object                                                                                                                          16.88ms
[  8.00%] ··· Running io_bench.frame_to_csv.time_frame_to_csv                                                                                                                                54.20ms
[ 10.00%] ··· Running io_bench.frame_to_csv2.time_frame_to_csv2                                                                                                                             107.84ms
[ 12.00%] ··· Running io_bench.frame_to_csv_date_formatting.time_frame_to_csv_date_formatting                                                                                                13.89ms
[ 14.00%] ··· Running io_bench.frame_to_csv_mixed.time_frame_to_csv_mixed                                                                                                                   180.47ms
[ 16.00%] ··· Running io_bench.read_csv_from_s3.time_read_nrows                                                                                                                              n/a;...
[ 18.00%] ··· Running io_bench.read_csv_infer_datetime_format_custom.time_read_csv_infer_datetime_format_custom                                                                              11.98ms
[ 20.00%] ··· Running io_bench.read_csv_infer_datetime_format_iso8601.time_read_csv_infer_datetime_format_iso8601                                                                             2.15ms
[ 22.00%] ··· Running io_bench.read_csv_infer_datetime_format_ymd.time_read_csv_infer_datetime_format_ymd                                                                                     2.15ms
[ 24.00%] ··· Running io_bench.read_csv_skiprows.time_read_csv_skiprows                                                                                                                      13.10ms
[ 26.00%] ··· Running io_bench.read_csv_standard.time_read_csv_standard                                                                                                                      11.36ms
[ 28.00%] ··· Running io_bench.write_csv_standard.time_write_csv_standard                                                                                                                    22.73ms
[ 30.00%] ··· Running packers.packers_read_csv.time_packers_read_csv                                                                                                                        187.62ms
[ 32.00%] ··· Running packers.packers_write_csv.time_packers_write_csv                                                                                                                      625.05ms
[ 34.00%] ··· Running parser_vb.read_csv_comment2.time_read_csv_comment2                                                                                                                     26.11ms
[ 36.00%] ··· Running parser_vb.read_csv_default_converter.time_read_csv_default_converter                                                                                                    1.97ms
[ 38.00%] ··· Running parser_vb.read_csv_default_converter_python_engine.time_read_csv_default_converter                                                                                      3.33ms
[ 40.00%] ··· Running parser_vb.read_csv_default_converter_with_decimal.time_read_csv_default_converter_with_decimal                                                                          1.98ms
[ 42.00%] ··· Running parser_vb.read_csv_default_converter_with_decimal_python_engine.time_read_csv_default_converter_with_decimal                                                           10.16ms
[ 44.00%] ··· Running parser_vb.read_csv_precise_converter.time_read_csv_precise_converter                                                                                                    1.91ms
[ 46.00%] ··· Running parser_vb.read_csv_roundtrip_converter.time_read_csv_roundtrip_converter                                                                                                2.67ms
[ 48.00%] ··· Running parser_vb.read_csv_thou_vb.time_read_csv_thou_vb                                                                                                                       23.23ms
[ 50.00%] ··· Running parser_vb.read_csv_vb.time_read_csv_vb                                                                                                                                 22.11ms

[ 50.00%] · For pandas commit hash 690d52cf:
[ 52.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv                                                                                                                                321.53ms
[ 54.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_datetime                                                                                                                       323.61ms
[ 56.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_object                                                                                                                          20.18ms
[ 58.00%] ··· Running io_bench.frame_to_csv.time_frame_to_csv                                                                                                                                53.05ms
[ 60.00%] ··· Running io_bench.frame_to_csv2.time_frame_to_csv2                                                                                                                             110.65ms
[ 62.00%] ··· Running io_bench.frame_to_csv_date_formatting.time_frame_to_csv_date_formatting                                                                                                13.39ms
[ 64.00%] ··· Running io_bench.frame_to_csv_mixed.time_frame_to_csv_mixed                                                                                                                   182.09ms
[ 66.00%] ··· Running io_bench.read_csv_from_s3.time_read_nrows                                                                                                                              n/a;...
[ 68.00%] ··· Running io_bench.read_csv_infer_datetime_format_custom.time_read_csv_infer_datetime_format_custom                                                                              12.03ms
[ 70.00%] ··· Running io_bench.read_csv_infer_datetime_format_iso8601.time_read_csv_infer_datetime_format_iso8601                                                                             2.11ms
[ 72.00%] ··· Running io_bench.read_csv_infer_datetime_format_ymd.time_read_csv_infer_datetime_format_ymd                                                                                     2.19ms
[ 74.00%] ··· Running io_bench.read_csv_skiprows.time_read_csv_skiprows                                                                                                                      13.24ms
[ 76.00%] ··· Running io_bench.read_csv_standard.time_read_csv_standard                                                                                                                      11.19ms
[ 78.00%] ··· Running io_bench.write_csv_standard.time_write_csv_standard                                                                                                                    22.07ms
[ 80.00%] ··· Running packers.packers_read_csv.time_packers_read_csv                                                                                                                        175.88ms
[ 82.00%] ··· Running packers.packers_write_csv.time_packers_write_csv                                                                                                                      638.85ms
[ 84.00%] ··· Running parser_vb.read_csv_comment2.time_read_csv_comment2                                                                                                                     27.93ms
[ 86.00%] ··· Running parser_vb.read_csv_default_converter.time_read_csv_default_converter                                                                                                    1.85ms
[ 88.00%] ··· Running parser_vb.read_csv_default_converter_python_engine.time_read_csv_default_converter                                                                                      3.38ms
[ 90.00%] ··· Running parser_vb.read_csv_default_converter_with_decimal.time_read_csv_default_converter_with_decimal                                                                          1.91ms
[ 92.00%] ··· Running parser_vb.read_csv_default_converter_with_decimal_python_engine.time_read_csv_default_converter_with_decimal                                                           10.13ms
[ 94.00%] ··· Running parser_vb.read_csv_precise_converter.time_read_csv_precise_converter                                                                                                    1.77ms
[ 96.00%] ··· Running parser_vb.read_csv_roundtrip_converter.time_read_csv_roundtrip_converter                                                                                                2.52ms
[ 98.00%] ··· Running parser_vb.read_csv_thou_vb.time_read_csv_thou_vb                                                                                                                       24.98ms
[100.00%] ··· Running parser_vb.read_csv_vb.time_read_csv_vb                                                                                                                                 24.66ms
BENCHMARKS NOT SIGNIFICANTLY CHANGED.

Complete results are in this archive:
results.tar.gz

@jreback
Contributor

jreback commented Jul 26, 2016

pls add a note in whatsnew / bug fix section

@ivannz
Contributor Author

ivannz commented Jul 26, 2016

@jreback, I rewrote the test as you suggested. Is it a problem if nosetests does not recover from a segfault when running it?

@jreback
Contributor

jreback commented Jul 26, 2016

@ivannz the whole point is for it NOT to recover. A segfault is as noticeable as any other error. If you have a test that segfaults, and it is fixed, it will pass.

@ivannz ivannz force-pushed the parser_trim_fix branch from c092c2b to 5489ad6 Compare July 26, 2016 11:47
@ivannz
Contributor Author

ivannz commented Jul 26, 2016

I added a note in whatsnew / bug fix section of v0.19.0.txt.

Here are the latest results of asv continuous -E virtualenv upstream/master HEAD -b csv

· Running 50 total benchmarks (2 commits * 1 environments * 25 benchmarks)
[  0.00%] · For pandas commit hash c715ec76:
[  2.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv                                                                                                                                332.52ms
[  4.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_datetime                                                                                                                       336.29ms
[  6.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_object                                                                                                                          17.21ms
[  8.00%] ··· Running io_bench.frame_to_csv.time_frame_to_csv                                                                                                                                53.66ms
[ 10.00%] ··· Running io_bench.frame_to_csv2.time_frame_to_csv2                                                                                                                             107.95ms
[ 12.00%] ··· Running io_bench.frame_to_csv_date_formatting.time_frame_to_csv_date_formatting                                                                                                14.02ms
[ 14.00%] ··· Running io_bench.frame_to_csv_mixed.time_frame_to_csv_mixed                                                                                                                   185.26ms
[ 16.00%] ··· Running io_bench.read_csv_from_s3.time_read_nrows                                                                                                                              n/a;...
[ 18.00%] ··· Running io_bench.read_csv_infer_datetime_format_custom.time_read_csv_infer_datetime_format_custom                                                                              11.95ms
[ 20.00%] ··· Running io_bench.read_csv_infer_datetime_format_iso8601.time_read_csv_infer_datetime_format_iso8601                                                                             2.14ms
[ 22.00%] ··· Running io_bench.read_csv_infer_datetime_format_ymd.time_read_csv_infer_datetime_format_ymd                                                                                     2.17ms
[ 24.00%] ··· Running io_bench.read_csv_skiprows.time_read_csv_skiprows                                                                                                                      13.40ms
[ 26.00%] ··· Running io_bench.read_csv_standard.time_read_csv_standard                                                                                                                      11.43ms
[ 28.00%] ··· Running io_bench.write_csv_standard.time_write_csv_standard                                                                                                                    25.87ms
[ 30.00%] ··· Running packers.packers_read_csv.time_packers_read_csv                                                                                                                        178.49ms
[ 32.00%] ··· Running packers.packers_write_csv.time_packers_write_csv                                                                                                                      629.38ms
[ 34.00%] ··· Running parser_vb.read_csv_comment2.time_read_csv_comment2                                                                                                                     26.11ms
[ 36.00%] ··· Running parser_vb.read_csv_default_converter.time_read_csv_default_converter                                                                                                    1.98ms
[ 38.00%] ··· Running parser_vb.read_csv_default_converter_python_engine.time_read_csv_default_converter                                                                                      3.35ms
[ 40.00%] ··· Running parser_vb.read_csv_default_converter_with_decimal.time_read_csv_default_converter_with_decimal                                                                          2.01ms
[ 42.00%] ··· Running parser_vb.read_csv_default_converter_with_decimal_python_engine.time_read_csv_default_converter_with_decimal                                                           10.10ms
[ 44.00%] ··· Running parser_vb.read_csv_precise_converter.time_read_csv_precise_converter                                                                                                    1.88ms
[ 46.00%] ··· Running parser_vb.read_csv_roundtrip_converter.time_read_csv_roundtrip_converter                                                                                                2.69ms
[ 48.00%] ··· Running parser_vb.read_csv_thou_vb.time_read_csv_thou_vb                                                                                                                       22.01ms
[ 50.00%] ··· Running parser_vb.read_csv_vb.time_read_csv_vb                                                                                                                                 25.23ms

[ 50.00%] · For pandas commit hash 98c5b88d:
[ 52.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv                                                                                                                                319.35ms
[ 54.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_datetime                                                                                                                       318.53ms
[ 56.00%] ··· Running gil.nogil_read_csv.time_nogil_read_csv_object                                                                                                                          18.57ms
[ 58.00%] ··· Running io_bench.frame_to_csv.time_frame_to_csv                                                                                                                                55.03ms
[ 60.00%] ··· Running io_bench.frame_to_csv2.time_frame_to_csv2                                                                                                                             103.54ms
[ 62.00%] ··· Running io_bench.frame_to_csv_date_formatting.time_frame_to_csv_date_formatting                                                                                                13.35ms
[ 64.00%] ··· Running io_bench.frame_to_csv_mixed.time_frame_to_csv_mixed                                                                                                                   180.89ms
[ 66.00%] ··· Running io_bench.read_csv_from_s3.time_read_nrows                                                                                                                              n/a;...
[ 68.00%] ··· Running io_bench.read_csv_infer_datetime_format_custom.time_read_csv_infer_datetime_format_custom                                                                              11.53ms
[ 70.00%] ··· Running io_bench.read_csv_infer_datetime_format_iso8601.time_read_csv_infer_datetime_format_iso8601                                                                             2.05ms
[ 72.00%] ··· Running io_bench.read_csv_infer_datetime_format_ymd.time_read_csv_infer_datetime_format_ymd                                                                                     2.09ms
[ 74.00%] ··· Running io_bench.read_csv_skiprows.time_read_csv_skiprows                                                                                                                      12.41ms
[ 76.00%] ··· Running io_bench.read_csv_standard.time_read_csv_standard                                                                                                                      11.40ms
[ 78.00%] ··· Running io_bench.write_csv_standard.time_write_csv_standard                                                                                                                    22.19ms
[ 80.00%] ··· Running packers.packers_read_csv.time_packers_read_csv                                                                                                                        175.62ms
[ 82.00%] ··· Running packers.packers_write_csv.time_packers_write_csv                                                                                                                      603.39ms
[ 84.00%] ··· Running parser_vb.read_csv_comment2.time_read_csv_comment2                                                                                                                     25.62ms
[ 86.00%] ··· Running parser_vb.read_csv_default_converter.time_read_csv_default_converter                                                                                                    1.79ms
[ 88.00%] ··· Running parser_vb.read_csv_default_converter_python_engine.time_read_csv_default_converter                                                                                      3.26ms
[ 90.00%] ··· Running parser_vb.read_csv_default_converter_with_decimal.time_read_csv_default_converter_with_decimal                                                                          1.83ms
[ 92.00%] ··· Running parser_vb.read_csv_default_converter_with_decimal_python_engine.time_read_csv_default_converter_with_decimal                                                           10.13ms
[ 94.00%] ··· Running parser_vb.read_csv_precise_converter.time_read_csv_precise_converter                                                                                                    1.74ms
[ 96.00%] ··· Running parser_vb.read_csv_roundtrip_converter.time_read_csv_roundtrip_converter                                                                                                2.44ms
[ 98.00%] ··· Running parser_vb.read_csv_thou_vb.time_read_csv_thou_vb                                                                                                                       21.64ms
[100.00%] ··· Running parser_vb.read_csv_vb.time_read_csv_vb                                                                                                                                 24.61ms

BENCHMARKS NOT SIGNIFICANTLY CHANGED.

@@ -5,6 +5,8 @@
import platform
import codecs

import subprocess

Contributor

remove this

@ivannz
Contributor Author

ivannz commented Jul 26, 2016

@jorisvandenbossche, could you please cancel Travis CI job #21105?

except ValueError:
    # Ignore unsupported dtype=object by engine=python
    # in this case output_ list is empty
    pass
Contributor

instead raise nose.SkipTest('....') here

@ivannz ivannz force-pushed the parser_trim_fix branch from 68b5d37 to d59624e Compare July 26, 2016 23:23
@jreback jreback closed this in 31f8e4d Jul 27, 2016
@jreback
Contributor

jreback commented Jul 27, 2016

thanks!

@pijucha
Contributor

pijucha commented Jul 31, 2016

I tried to compile the master branch on Windows with VS2015, and it failed with these errors:

pandas/src/parser/tokenizer.c(1260): error C2036: 'void *': unknown size
pandas/src/parser/tokenizer.c(1264): error C2036: 'void *': unknown size

So I guess newptr in these two lines should be cast to char *.
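
For context, ISO C does not define arithmetic on void * because void has no size; GCC accepts it as an extension and treats the size as 1, while MSVC rejects it with C2036. A minimal sketch of the suggested cast is shown below. The helper name `byte_at` is hypothetical and does not appear in tokenizer.c; it only illustrates the cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: address of the byte `offset` bytes past `base`. */
static char *byte_at(void *base, size_t offset) {
    assert(base != NULL);
    /* `base + offset` is not valid ISO C because void has no size.
       Casting to char * makes the step size exactly one byte. */
    return (char *)base + offset;
}
```

Under this assumption, an expression such as `newptr + offset` would be written as `(char *)newptr + offset`, which should compile on both GCC and MSVC.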

@jreback
Contributor

jreback commented Jul 31, 2016

can u make a new issue? and fix if u can

@pijucha
Contributor

pijucha commented Jul 31, 2016

OK. I'll submit the fix later.

Labels
Bug IO CSV read_csv, to_csv
Development

Successfully merging this pull request may close these issues.

Unexpected segmentation fault in pd.read_csv C-engine
5 participants