chore: influxdb_client/client/write: fix data_frame_to_list_of_points #183

rogpeppe · 2021-01-13T09:22:34Z

Fix the possibility of data corruption by using a much simpler
regular expression to fix up the results.

Also avoid mutating the DataFrame that's been passed in
by making a shallow copy. This changes semantics
so that if a column mentioned in PointSettings exists
in the actual data too, it won't be overridden.
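
For illustration, a minimal sketch of the copy-then-fill behaviour described above, assuming pandas is available. The helper name `with_default_tags` and the sample columns are hypothetical and are not the serializer's actual code; the sketch only shows why a shallow copy leaves the caller's DataFrame alone and why an existing column wins over a default tag of the same name.

```python
import pandas as pd


def with_default_tags(data_frame, default_tags):
    """Return a copy of data_frame with default tag columns added.

    Hypothetical helper for illustration only; not the library's API.
    """
    # Shallow copy: a new DataFrame object, so adding columns below
    # does not add them to the caller's frame.
    result = data_frame.copy(deep=False)
    for key, value in default_tags.items():
        # A column that already exists in the data takes precedence
        # over a default tag with the same name.
        if key not in result.columns:
            result[key] = value
    return result


original = pd.DataFrame({"location": ["coyote_creek"], "water_level": [1.0]})
tagged = with_default_tags(original, {"location": "default", "host": "h1"})

print(list(original.columns))
print(list(tagged.columns))
```

Here `original` keeps its two columns, while `tagged` gains `host` and keeps the `location` value that was already in the data.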

Also change the test_write_data_frame test, which is essentially a benchmark
of the data_frame_to_list_of_points function, so that it does only that.
It can then be run easily on a local machine and is independent
of network speed. It runs in about 10s, which is comparable
to the previous performance.
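
A rough sketch of what a network-independent benchmark can look like, using only the standard library. The `to_line_protocol` function is a hypothetical stand-in for the serializer under test, and the row shape is assumed for the example; the point is that only the in-memory conversion is timed, with no client or server involved.

```python
import time


def to_line_protocol(rows, measurement):
    """Hypothetical stand-in for the serializer under test."""
    return [
        f"{measurement} value={value} {timestamp}"
        for timestamp, value in rows
    ]


def benchmark(serialize, rows, measurement):
    """Time one in-memory serialization pass; no client, no network."""
    start = time.perf_counter()
    lines = serialize(rows, measurement)
    elapsed = time.perf_counter() - start
    return lines, elapsed


# Assumed sample data: (timestamp, value) pairs.
rows = [(i, float(i)) for i in range(1000)]
lines, elapsed = benchmark(to_line_protocol, rows, "h2o")
print(f"serialized {len(lines)} rows in {elapsed:.4f}s")
```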

Closes #182

CHANGELOG.md updated
Rebased/mergeable
A test has been added if appropriate
pytest tests completes successfully
Commit messages are in semantic format

codecov · 2021-01-13T15:16:41Z

Codecov Report

Merging #183 (51174a1) into master (91dcafb) will increase coverage by 0.08%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #183      +/-   ##
==========================================
+ Coverage   89.72%   89.81%   +0.08%     
==========================================
  Files          26       26              
  Lines        1957     1963       +6     
==========================================
+ Hits         1756     1763       +7     
+ Misses        201      200       -1

| Impacted Files | Coverage Δ |
|---|---|
| ...fluxdb_client/client/write/dataframe_serializer.py | `98.66% <100.00%> (+1.56%)` ⬆️ |
| influxdb_client/client/write/point.py | `98.36% <100.00%> (ø)` |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 91dcafb...51174a1.

philjb · 2021-01-13T17:57:03Z

influxdb_client/client/write/dataframe_serializer.py

+    print(f'measurement_name: {measurement_name}')
+    print(f'keys: {keys}')
+    print(f'tag columns: {data_frame_tag_columns}')
+    print(f'lambda p: f"""{{measurement_name}}{tags} {fields} {timestamp}"""')


Are the prints meant to stay?

rogpeppe · 2021-01-13

Nope, that's just me debugging. This PR is still a draft as yet.

bednar

Thanks for awesome PR - nice code docs, perfect performance 👍

rogpeppe force-pushed the rog-002-fix-dataframe-serializer branch 3 times, most recently from 735c358 to caba68a Compare January 13, 2021 09:30

rogpeppe changed the title ~~Rog 002 fix dataframe serializer~~ chore: influxdb_client/client/write: fix data_frame_to_list_of_points Jan 13, 2021

rogpeppe force-pushed the rog-002-fix-dataframe-serializer branch 8 times, most recently from d2f85a8 to f4be9f4 Compare January 13, 2021 15:12

philjb reviewed Jan 13, 2021

rogpeppe force-pushed the rog-002-fix-dataframe-serializer branch 2 times, most recently from 9da33f0 to 8f03699 Compare January 14, 2021 13:19

rogpeppe marked this pull request as ready for review January 14, 2021 14:30

rogpeppe force-pushed the rog-002-fix-dataframe-serializer branch from 8f03699 to 64df1c7 Compare January 14, 2021 15:39

rogpeppe force-pushed the rog-002-fix-dataframe-serializer branch from 64df1c7 to 51174a1 Compare January 14, 2021 15:53

bednar approved these changes Jan 18, 2021

bednar merged commit 5e6569c into influxdata:master Jan 18, 2021

bednar added this to the 1.14.0 milestone Jan 18, 2021
