FluxTable / FluxRecord can't handle tables with duplicate column labels #500

pgorczak · 2022-09-12T14:02:28Z

Specifications

Code sample to reproduce problem

Query measurements that include a field named result, or any other label that appears by default in the annotated CSV response header, and use pivot so that the field name becomes a column in that header. The annotated CSV returned by the API will then contain columns with duplicate labels.

Expected behavior

Being able to access all data returned by the API as annotated CSV

Actual behavior

The Flux CSV parser can't handle duplicate header names. It turns CSV rows into FluxRecords whose internal values are stored as Python dicts, so values from columns with the same label overwrite each other.

See flux_csv_parser.py from line 262 for the logic that collapses a list of columns into a dict based on the column label.
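The overwrite can be illustrated without the client at all. The following is a minimal sketch, not the parser's actual code: the labels and row values are hypothetical stand-ins, and the loop only mimics the idea of keying values by column label.

```python
# Hypothetical column labels and one data row; "result" appears twice.
labels = ["result", "table", "_time", "result"]
row = ["_result", 0, "2022-09-13T06:24:33.746Z", 25.3]

# Keying values by label, as a dict-based record does, keeps only the
# last value seen for each duplicated label.
values = {}
for label, value in zip(labels, row):
    values[label] = value

print(values)
print(len(row), len(values))
```

The row has four cells but the dict ends up with three entries, because the first "result" value is replaced by the second.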

Additional info

No response

bednar · 2022-09-12T14:52:35Z

Hi @pgorczak,

thanks for using our client.

I will take a look.

Regards

pgorczak · 2022-09-12T15:00:18Z

Thank you @bednar :) I just clarified the expected behavior, since the problem isn't about measurements but rather about how the annotated CSV is parsed into Python objects.

bednar · 2022-09-13T06:25:37Z

Just for clarification, the problem is caused by Annotated CSV with duplicate column names:

#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,string,string,double
#group,false,false,true,true,false,true,true,false
#default,_result,,,,,,,
,result,table,_start,_stop,_time,_measurement,location,result
,,0,2022-09-13T06:14:40.469404272Z,2022-09-13T06:24:40.469404272Z,2022-09-13T06:24:33.746Z,my_measurement,Prague,25.3
,,0,2022-09-13T06:14:40.469404272Z,2022-09-13T06:24:40.469404272Z,2022-09-13T06:24:39.299Z,my_measurement,Prague,25.3
,,0,2022-09-13T06:14:40.469404272Z,2022-09-13T06:24:40.469404272Z,2022-09-13T06:24:40.454Z,my_measurement,Prague,25.3

from datetime import datetime

from influxdb_client import WritePrecision, InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

with InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org", debug=True) as client:
    query_api = client.query_api()

    # Write a point whose field is named "result", the same label as the default "result" column.
    p = Point("my_measurement") \
        .tag("location", "Prague") \
        .field("result", 25.3) \
        .time(datetime.utcnow(), WritePrecision.MS)
    write_api = client.write_api(write_options=SYNCHRONOUS)

    write_api.write(bucket="my-bucket", record=p)

    # pivot() turns the field name into a column label, producing a second "result" column.
    tables = query_api.query(
        'from(bucket:"my-bucket") |> range(start: -10m) |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")')
    for table in tables:
        print(table)
        for record in table.records:
            # record.values is a dict, so only one "result" entry survives
            print(record.values)
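To check whether a response is affected, the header row of the annotated CSV can be inspected directly. This is a rough sketch using only the standard library. The CSV text is a trimmed-down, hypothetical variant of the response above, and duplicate_labels is an illustrative helper, not part of the client.

```python
import csv
import io
from collections import Counter

# Trimmed, hypothetical annotated CSV with a duplicated "result" label.
ANNOTATED_CSV = """#datatype,string,long,dateTime:RFC3339,double
#group,false,false,false,false
#default,_result,,,
,result,table,_time,result
,,0,2022-09-13T06:24:33.746Z,25.3
"""


def duplicate_labels(annotated_csv):
    """Return the column labels that occur more than once in the header row."""
    for row in csv.reader(io.StringIO(annotated_csv)):
        # Skip blank lines and annotation rows such as #datatype or #group.
        if not row or row[0].startswith("#"):
            continue
        # The first remaining row is the header; its first cell is empty.
        counts = Counter(row[1:])
        return [label for label, count in counts.items() if count > 1]
    return []


print(duplicate_labels(ANNOTATED_CSV))
```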

bednar · 2022-09-14T09:14:20Z

Hi @pgorczak,

#502 adds the ability to access your data via record.row, which is an array of the column values from the Annotated CSV row.

If you would like to use the fixed version before the regular release, please install the client via:

pip install git+https://github.com/influxdata/influxdb-client-python.git@record-row-array
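One way to use record.row is to pair it with the column labels of the table so that duplicated labels stay visible. The sketch below assumes each table exposes columns with a label attribute and each record exposes row. labelled_rows is a hypothetical helper, and the SimpleNamespace objects are stand-ins with made-up values so the example runs without a server.

```python
from types import SimpleNamespace


def labelled_rows(tables):
    """Yield each record as a list of (label, value) pairs, keeping duplicates."""
    for table in tables:
        labels = [column.label for column in table.columns]
        for record in table.records:
            # zip keeps positional order, so both "result" columns survive.
            yield list(zip(labels, record.row))


# Stand-ins for a query result; with a real client this would be the
# value returned by query_api.query(...).
table = SimpleNamespace(
    columns=[SimpleNamespace(label=name) for name in ["result", "table", "location", "result"]],
    records=[SimpleNamespace(row=["_result", 0, "Prague", 25.3])],
)

for pairs in labelled_rows([table]):
    print(pairs)
```

Unlike a dict, a list of pairs does not require unique labels, which is why the values are kept by position.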

What do you think about this solution?

Regards

pgorczak · 2022-09-19T17:06:58Z

Sorry for the late response, @bednar. I somehow didn't get the notification.

Just tried record.row and it works like a charm. I think it's a useful alternative way to access row values. Thank you!

bednar · 2022-09-20T07:16:28Z

@pgorczak thanks for testing. I will keep this issue open until #502 is merged.

pgorczak added the bug (Something isn't working) label Sep 12, 2022

pgorczak changed the title ~~FluxRecord can't handle tables with duplicate column labels~~ FluxTable / FluxRecord can't handle tables with duplicate column labels Sep 12, 2022

bednar self-assigned this Sep 12, 2022

bednar added the state: in progress label Sep 12, 2022

bednar mentioned this issue Sep 14, 2022 in feat: add FluxRecord.row with response data stored in array #502 (merged)

bednar removed the state: in progress label Sep 14, 2022

pgorczak closed this as completed Sep 20, 2022

bednar reopened this Sep 20, 2022

bednar closed this as completed in #502 Sep 21, 2022

bednar added this to the 1.33.0 milestone Sep 21, 2022
