Occasionally losing data points along with error message: "The batch item wasn't processed successfully because: (400) {"code":"invalid","message":"writing requires points"}" #80

Closed
joeyhagedorn opened this issue Apr 12, 2020 · 16 comments
Labels
wontfix This will not be worked on
Milestone

Comments

@joeyhagedorn

I've been encountering occasional errors with a very simple Python program that writes batches of points. My usage is more or less basic, so I'm not clear why this is happening. Perhaps the batching machinery is improperly creating empty batches and dropping points?

I get many log messages like the following:

The batch item wasn't processed successfully because: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'Date': 'Sat, 11 Apr 2020 23:14:18 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Content-Length': '54', 'Connection': 'keep-alive', 'Strict-Transport-Security': 'max-age=15724800; includeSubDomains', 'x-platform-error-code': 'invalid'})
HTTP response body: {"code":"invalid","message":"writing requires points"}

Observe how the attached Chronograf CSV data is missing some values (0, 9, 27, 30, 36, etc.).

I've attached some sample code, sample local output, and a sample CSV exported from InfluxDB explorer UI.
Also attached in this Gist for nice formatting.

SampleCode.py.txt
LocalOutput.txt
2020-04-11-16-47_chronograf_data.csv.txt

Configuration Info:

InfluxDB version: InfluxDB Cloud 2.0
influxdb_client python module version: 1.5.0
Python version: 3.7.3
OS: Raspbian Linux (Buster)
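
For context, here is a minimal sketch of a batching write along the lines described above. This is not the attached SampleCode.py; the URL, token, bucket, measurement, and option values are placeholders.

import time
from datetime import datetime
from influxdb_client import InfluxDBClient, Point, WriteOptions

# Placeholder connection details; the real values are in the attached sample.
client = InfluxDBClient(url="https://us-west-2-1.aws.cloud2.influxdata.com", token="my-token", org="my-org")

# Batching write: flush_interval is given in milliseconds, not as a number of points.
write_api = client.write_api(write_options=WriteOptions(batch_size=8, flush_interval=1_000))

for i in range(50):
    point = Point("pressure").tag("sensor", "sensor1").field("PSI", float(i)).time(datetime.utcnow())
    write_api.write(bucket="my-bucket", record=point)
    time.sleep(0.5)

# Dispose of the batching buffer so the last batch is flushed before exit.
write_api.__del__()
client.__del__()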

@joeyhagedorn
Author

A little more investigation seems to point to the error being associated with the flush_interval.

@bednar
Contributor

bednar commented Apr 12, 2020

@joeyhagedorn thanks! We will take a look.

@bednar bednar added the bug Something isn't working label Apr 12, 2020
@bednar bednar added this to the 1.6.0 milestone Apr 12, 2020
@joeyhagedorn
Author

Thanks! I suspect the root cause is related to the very low flush_interval (I misunderstood this earlier as a number of points rather than milliseconds). However, I'd still not expect it to drop points, even when specified with a low value.

@bednar
Contributor

bednar commented Apr 13, 2020

Yeah, it should write everything into the database.

Thanks for the precise description of the bug.

@bednar
Contributor

bednar commented Apr 16, 2020

Hi @joeyhagedorn,

I tried to simulate it but everything works fine for me. Could you try it again?

I prepared the following test case:

https://github.com/influxdata/influxdb-client-python/blob/bednar/loosing-data/examples/loosing_data.py

One small glitch in your code: at the end you should call write_api.__del__() to dispose of the buffer.

Regards

@bednar bednar modified the milestones: 1.6.0, 1.7.0 Apr 16, 2020
@joeyhagedorn
Author

joeyhagedorn commented Apr 20, 2020

Hrm… testing again today, the original problem I had been encountering seems to be 100% resolved. It reproduced 100% of the time before with the sample code I supplied, and doesn't reproduce at all anymore.

I changed the "delay" to 0.1 seconds so I could try many more iterations, and after a while found that one data point was dropped.

I did once encounter this error:

The batch item wasn't processed successfully because: (502)
Reason: Bad Gateway
HTTP response headers: HTTPHeaderDict({'Date': 'Mon, 20 Apr 2020 19:34:12 GMT', 'Content-Type': 'text/html', 'Content-Length': '154', 'Connection': 'keep-alive', 'Strict-Transport-Security': 'max-age=15724800; includeSubDomains'})
HTTP response body: <html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>openresty</center>
</body>
</html>

and subsequently found that the point being uploaded was indeed dropped and not retried:

Pressure: PSI count: 499
Pressure: PSI count: 499

Perhaps the original "Reason: Bad Request" errors were due to a server issue that is now resolved, and the missing data points come from the retry behavior not being what I expect?

In an effort to exacerbate the problem, I changed the flush_interval to "1" and ran your example repeatedly:
for i in {1..25}; do python3 testinfluxdb.py ; done
and indeed found more points dropped:

Pressure: PSI count: 2437
Pressure: PSI count: 2447

@bednar
Contributor

bednar commented Apr 21, 2020

Hi @joeyhagedorn,

The flush_interval=1 is definitely the source of the problem; thanks for helping us with that.

I was able to simulate and fix the issue with the empty request:

The batch item wasn't processed successfully because: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'Date': 'Sat, 11 Apr 2020 23:14:18 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Content-Length': '54', 'Connection': 'keep-alive', 'Strict-Transport-Security': 'max-age=15724800; includeSubDomains', 'x-platform-error-code': 'invalid'})
HTTP response body: {"code":"invalid","message":"writing requires points"}

see #85

I will continue to investigate why points are dropped when flush_interval is set to '1'.

FYI: the library only retries a batch if the response status is 429 or 503. This follows the API specification: https://v2.docs.influxdata.com/v2.0/api/#operation/PostWrite.

Regards
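
To make the retry rule concrete, here is a rough sketch of the retry-related options (the connection details are placeholders; intervals are in milliseconds). Per the comment above, only 429 and 503 responses are retried; other failures such as 400 or 502 are only logged and the batch is not retried.

from influxdb_client import InfluxDBClient, WriteOptions

# Placeholder connection details.
client = InfluxDBClient(url="http://localhost:9999", token="my-token", org="my-org")

# retry_interval (ms) only applies when the server answers 429 or 503;
# other errors (e.g. 400, 502) are reported through the logger and the batch is dropped.
options = WriteOptions(batch_size=500, flush_interval=10_000, retry_interval=5_000)
write_api = client.write_api(write_options=options)

# ... write points ...

write_api.__del__()
client.__del__()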

@joeyhagedorn
Author

Awesome, thanks for the thorough investigation.

@bednar bednar removed this from the 1.7.0 milestone May 12, 2020
@Eslih

Eslih commented Jun 23, 2020

I am facing the same / very similar issue. I am using the following parameters:

batch_size=1_000, flush_interval=10, retry_interval=1_000

I tried flush intervals of 1, 5, 10, 20, 25, and so on, but didn't find the exact level at which I lose data. I don't need to flush my data this fast, but as @joeyhagedorn said: this shouldn't occur.

Even though not all data is saved to InfluxDB, I don't get any errors (debug mode is enabled). It might be a memory leak. I don't see much load on my InfluxDB server. On my "client" server one core is at about 80% usage (I should look into whether I can parallelize the workload).

FYI, server specs (InfluxDB):

  • 16 virtual cores (Intel Xeon Platinum 8176)
  • 64GB RAM (about 12GB in use)
  • SSD storage

Client is running on another VM, same server, 32GB RAM.

Using client version 1.8.0.dev0; the server is InfluxDB 2.0.0-beta.10 (I will check whether beta 12 solves the issue).
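
For reference, those parameters as passed to the batching writer look roughly like this (a sketch; the connection details are placeholders). Note that flush_interval=10 means the buffer is flushed every 10 ms, the same very low interval regime in which points were observed to be dropped above.

from influxdb_client import InfluxDBClient, WriteOptions

client = InfluxDBClient(url="http://localhost:9999", token="my-token", org="my-org")

# batch_size is a point count; flush_interval and retry_interval are milliseconds.
write_api = client.write_api(
    write_options=WriteOptions(batch_size=1_000, flush_interval=10, retry_interval=1_000)
)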

@Eslih

Eslih commented Jun 24, 2020

Using client version 1.8.0.dev0; the server is InfluxDB 2.0.0-beta.10 (I will check whether beta 12 solves the issue).

Tested with the stable release (1.8.0) and InfluxDB 2.0.0-beta.12 as well. As expected, the issue still exists.

@bednar
Contributor

bednar commented Jun 24, 2020

@Eslih thanks for the detailed info

@Neocryan

Hi there, I'm encountering this problem too:
My project collects financial data daily and writes it to the DB, inserting around 500 points per day. I used the default parameters on the write_api.
Every time my script runs it loses a sizeable part of the data, with no errors or logs. I need to run the same script 2-3 times before all the data is updated successfully.
Are there write_api parameters that ensure all of the data is uploaded?

@bednar
Contributor

bednar commented Jul 1, 2020

Hi @Neocryan,

Could you share a little more about how you use the client? A code snippet would be helpful. Could you also enable debug mode:

client = InfluxDBClient(url="http://localhost:9999", token="my-token", org="my-org", debug=True)

and check that the response from InfluxDB is correct?

For your purpose (500 points daily) it is ideal to use the default synchronous write:

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:9999", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

# A synchronous write blocks until the server acknowledges the request.
write_api.write(bucket="my-bucket", record=data)

client.__del__()

Regards

@Neocryan

Neocryan commented Jul 1, 2020

Hi @bednar, SYNCHRONOUS works, thanks!

@danxmoran

(I think) this can be closed now. "Writing requires points" was a server-side error from InfluxDB Cloud, and it was recently removed. Empty request bodies will now return 204 responses.

@bednar
Contributor

bednar commented Oct 26, 2021

The issue seems to be fixed. Please use a with statement for initializing the client and the batching write_api.

The following script was used for testing:

import time
from datetime import datetime
from influxdb_client import InfluxDBClient, WriteOptions, Point

url = "https://us-west-2-1.aws.cloud2.influxdata.com"
token = "..."
org = "..."
bucket = "..."
measurement = "python-loosing-data_" + str(datetime.now())

with InfluxDBClient(url=url, token=token, debug=False) as client:
    # A small batch size and flush interval (milliseconds) to exercise the batching path.
    options = WriteOptions(batch_size=8, flush_interval=8, jitter_interval=0, retry_interval=1000)
    with client.write_api(write_options=options) as write_api:
        for i in range(50):
            valOne = float(i)
            valTwo = float(i) + 0.5
            pointOne = Point(measurement).tag("sensor", "sensor1").field("PSI", valOne).time(time=datetime.utcnow())
            pointTwo = Point(measurement).tag("sensor", "sensor2").field("PSI", valTwo).time(time=datetime.utcnow())

            write_api.write(bucket, org, [pointOne, pointTwo])
            print("PSI Readings: (%f, %f)" % (valOne, valTwo))
            time.sleep(0.5)

    # Count the points written under this measurement to verify that nothing was lost.
    query = f'from(bucket: "{bucket}") |> range(start: 0) |> filter(fn: (r) => r["_measurement"] == "{measurement}") |> count()'
    tables = client.query_api().query(query, org)
    for table in tables:
        for record in table.records:
            print(f'{record.get_measurement()}: {record.get_field()} count: {record.get_value()}')

print("end")

@bednar bednar closed this as completed Oct 26, 2021
@bednar bednar added the wontfix This will not be worked on label Oct 26, 2021
@bednar bednar added this to the 1.23.0 milestone Oct 26, 2021