Ingestion errors when a datetime object has 0 on the milliseconds #2941

Open
joseparajelesGL opened this issue Apr 23, 2025 · 4 comments

@joseparajelesGL

We have a date field on ES8 with the following mapping

"date": {"type": "date"},

But we are getting a validation error:

failed to parse date field [2025-03-26T20:25:00-22:00] with format [strict_date_optional_time||epoch_millis]

Due to the lack of the decimal places of the seconds.
We are sending a datetime object, so it triggers this line:

formatted_data = data.isoformat()
That line calls .isoformat(). However, datetime.isoformat() omits the fractional part when the microseconds are 0 and falls back to second-level precision: https://github.com/python/cpython/blob/95d9dea1c4ed1b1de80074b74301cee0b38d5541/Lib/_pydatetime.py#L177-L179
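
The behavior can be reproduced with the standard library alone. A minimal sketch, using the value from the error message above as sample data (nothing here touches the Elasticsearch client):

```python
from datetime import datetime, timedelta, timezone

# Offset copied from the failing value above. Python itself accepts any
# offset strictly between -24:00 and +24:00.
tz = timezone(timedelta(hours=-22))

whole_second = datetime(2025, 3, 26, 20, 25, 0, tzinfo=tz)
with_micros = whole_second.replace(microsecond=123456)

# With the default timespec="auto", the fraction is dropped when microsecond == 0.
print(whole_second.isoformat())  # 2025-03-26T20:25:00-22:00
print(with_micros.isoformat())   # 2025-03-26T20:25:00.123456-22:00
```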

I can work around this by subclassing the JsonSerializer and adding an override for datetimes. However, is there a more permanent solution you could implement in the Python client?
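
A rough sketch of that workaround, written against the standard library's json module so that it runs on its own. The function name datetime_default is hypothetical, and the assumption is that the same logic would go into the default() method of a JsonSerializer subclass in the client.

```python
import json
from datetime import date, datetime, timedelta, timezone


def datetime_default(value):
    """Fallback for json.dumps: always emit millisecond precision for datetimes."""
    if isinstance(value, datetime):
        # timespec="milliseconds" keeps ".000" even when microsecond == 0
        # and truncates anything finer than a millisecond.
        return value.isoformat(timespec="milliseconds")
    if isinstance(value, date):
        return value.isoformat()
    raise TypeError(f"Unable to serialize {value!r} (type: {type(value).__name__})")


doc = {"date": datetime(2025, 3, 26, 20, 25, 0, tzinfo=timezone(timedelta(hours=2)))}
print(json.dumps(doc, default=datetime_default))
# {"date": "2025-03-26T20:25:00.000+02:00"}
```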

@gioboa

gioboa commented Apr 24, 2025

Hi 👋, I'm more than happy to help on this.
Let's wait for the team members' feedback.

@miguelgrinberg
Contributor

Due to the lack of the decimal places of the seconds.

Hi! How did you make the assessment that the problem is the missing millisecond or microsecond fraction?

I think the problem is your timezone offset instead. First of all, my understanding is that a timezone offset of -22:00, like the one in your example above, does not apply to any place on Earth. According to Wikipedia, timezone offsets range from -12:00 to +14:00. So you should figure out why you are using an invalid timezone offset.

I was able to insert datetimes with integer seconds, and also with millisecond or microsecond fractions, just fine, as long as I did not include the timezone offset.

I was also able to insert datetimes that include valid timezone offsets, but for this I had to use a custom format for the date field.

Here is a mapping that I used for my test:

PUT /datetest/_mapping
{
  "properties": {
    "date": {
      "type": "date"
    },
    "datetz": {
      "type": "date",
      "format": "yyyy-MM-dd'T'HH:mm:ssZZZZZ||yyyy-MM-dd'T'HH:mm:ss.SSSZZZZZ||yyyy-MM-dd'T'HH:mm:ss.SSSSSSZZZZZ"
    }
  }
}

Here are some examples of datetimes that worked with the above definitions. Note that I changed your -22:00 to -12:00 to make it a valid timezone offset.

POST /datetest/_doc
{
  "date": "2025-03-26T20:25:00",
  "datetz": "2025-03-26T20:25:00-12:00"
}
POST /datetest/_doc
{
  "date": "2025-03-26T20:25:00.123",
  "datetz": "2025-03-26T20:25:00.123-12:00"
}
POST /datetest/_doc
{
  "date": "2025-03-26T20:25:00.123456",
  "datetz": "2025-03-26T20:25:00.123456-12:00"
}

@joseparajelesGL
Author

Sorry for the late reply, I just got back to this.
I ran these:

PUT date-test
{
  "mappings": {
    "properties": {
      "date": { "type": "date" }
    }
  }
}
PUT date-test-2
{
  "mappings": {
    "properties": {
      "date": { "type": "date", "format":"yyyy-MM-dd'T'HH:mm:ssZZZZZ||yyyy-MM-dd'T'HH:mm:ss.SSSZZZZZ||yyyy-MM-dd'T'HH:mm:ss.SSSSSSZZZZZ"}
    }
  }
}

And these worked as expected

PUT date-test/_create/full-ms-odd-tz
{
  "date": "2025-03-26T20:25:00.000-18:00"
}

PUT date-test/_create/full-ms-normal-tz
{
  "date": "2025-03-26T20:25:00.000+02:00"
}
PUT date-test-2/_create/full-ms-normal-tz
{
  "date": "2025-03-26T20:25:00.000-02:00"
}

With this one failing a different validation:

PUT date-test-2/_create/full-ms-odd-tz
{
  "date": "2025-03-26T20:25:00.000-22:00"
}

{
  "error": {
    "root_cause": [
      {
        "type": "date_time_exception",
        "reason": "date_time_exception: Zone offset not in valid range: -18:00 to +18:00"
      }
    ],
    "type": "document_parsing_exception",
    "reason": "[2:11] failed to parse field [date] of type [date] in document with id 'full-ms-odd-tz'. Preview of field's value: '2025-03-26T20:25:00.000-22:00'",
    "caused_by": {
      "type": "date_time_exception",
      "reason": "date_time_exception: Zone offset not in valid range: -18:00 to +18:00"
    }
  },
  "status": 400
}

The rest were properly indexed

Similar with

PUT date-test/_create/no-ms-odd-tz
{
  "date": "2025-03-26T20:25:00-18:00"
}

PUT date-test/_create/no-ms-normal-tz
{
  "date": "2025-03-26T20:25:00+02:00"
}

PUT date-test-2/_create/no-ms-odd-tz
{
  "date": "2025-03-26T20:25:00-18:00"
}

PUT date-test-2/_create/no-ms-normal-tz
{
  "date": "2025-03-26T20:25:00-02:00"
}

It looks like the issue is the internal Java validation that restricts timezone offsets to the range -18:00 to +18:00. Is there a way of getting the underlying error instead of just failed to parse date field [2025-03-26T20:25:00.000-20:00] with format [strict_date_optional_time||epoch_millis]?
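
A client-side check along these lines could catch such values before they are sent. This is only a sketch: the ±18:00 bounds are copied from the error message above, and offset_in_valid_range is a hypothetical helper, not part of the Python client.

```python
from datetime import datetime, timedelta, timezone

# Bounds taken from the "Zone offset not in valid range" message above.
MAX_OFFSET = timedelta(hours=18)


def offset_in_valid_range(value: datetime) -> bool:
    """Return True if the datetime is naive or its UTC offset is within +/-18:00."""
    offset = value.utcoffset()
    if offset is None:  # naive datetime, no offset to validate
        return True
    return -MAX_OFFSET <= offset <= MAX_OFFSET


ok = datetime(2025, 3, 26, 20, 25, tzinfo=timezone(timedelta(hours=-18)))
bad = datetime(2025, 3, 26, 20, 25, tzinfo=timezone(timedelta(hours=-22)))
print(offset_in_valid_range(ok))   # True
print(offset_in_valid_range(bad))  # False
```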

@miguelgrinberg
Contributor

The more specific error message is reported under the caused_by key in the Elasticsearch error. We noticed that these are sometimes useful and recently made a change to include them in the Python exceptions. See #2932. Unfortunately, the only released version that includes this is 9.0.1, which only works with Elasticsearch 9. The same change is going to be released in the 8.19.0 client, currently scheduled for July 2025.
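
In the meantime, the nested reason can be pulled out of the error body by hand. A minimal sketch, assuming the body has the same shape as the 400 response posted earlier in this thread. root_reason is a hypothetical helper, and how the body is obtained from the client's exception (for example through a body attribute) is an assumption to check against your client version.

```python
from typing import Any, Optional


def root_reason(body: dict[str, Any]) -> Optional[str]:
    """Follow nested "caused_by" entries and return the innermost "reason"."""
    node = body.get("error")
    reason = None
    while isinstance(node, dict):
        reason = node.get("reason", reason)
        node = node.get("caused_by")
    return reason


# Copy of the relevant parts of the error response shown earlier in this thread.
body = {
    "error": {
        "type": "document_parsing_exception",
        "reason": "[2:11] failed to parse field [date] of type [date] in document "
                  "with id 'full-ms-odd-tz'. Preview of field's value: "
                  "'2025-03-26T20:25:00.000-22:00'",
        "caused_by": {
            "type": "date_time_exception",
            "reason": "date_time_exception: Zone offset not in valid range: -18:00 to +18:00",
        },
    },
    "status": 400,
}
print(root_reason(body))
# date_time_exception: Zone offset not in valid range: -18:00 to +18:00
```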
