Spurious ConsumerFetchSizeTooSmall error? #135

Closed
@clofresh

I'm making a batch fetch request on 16 partitions and getting ConsumerFetchSizeTooSmall() errors. I tried raising the max buffer size to something extremely large, like 1 GB, but it still gives the error.

2014-02-27 21:00:28,728 WARNING [kafka] Fetch size too small, increase to 104857600 (2x) and retry
2014-02-27 21:00:28,733 ERROR [kafka] Max fetch size 1073741824 too small
2014-02-27 21:00:28,733 ERROR [dd] [10931] [core.py:165] -·
Traceback (most recent call last):
...
  File "/usr/local/lib/python2.6/dist-packages/kafka_python-0.9.0_6db14-py2.6.egg/kafka/consumer.py", line 323, in get_messages
    message = self.get_message(block, timeout)
  File "/usr/local/lib/python2.6/dist-packages/kafka_python-0.9.0_6db14-py2.6.egg/kafka/consumer.py", line 343, in get_message
    self._fetch()
  File "/usr/local/lib/python2.6/dist-packages/kafka_python-0.9.0_6db14-py2.6.egg/kafka/consumer.py", line 404, in _fetch
    raise e
ConsumerFetchSizeTooSmall
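
For context, the two log lines above are consistent with a retry loop roughly like the one below. This is only a sketch inferred from the log messages, not the actual kafka-python code: `fetch_with_retry`, `try_fetch`, and the exception class defined here are hypothetical stand-ins.

```python
class ConsumerFetchSizeTooSmall(Exception):
    """Stand-in for the client's exception of the same name."""


def fetch_with_retry(try_fetch, buffer_size, max_buffer_size):
    """Call try_fetch(size), doubling size on failure up to max_buffer_size.

    try_fetch is a hypothetical callable that raises
    ConsumerFetchSizeTooSmall when it cannot decode a complete message.
    """
    size = buffer_size
    while True:
        try:
            return try_fetch(size)
        except ConsumerFetchSizeTooSmall:
            if size >= max_buffer_size:
                # Already at the cap: give up and surface the error.
                raise
            # Double the buffer, but never exceed the configured maximum.
            size = min(size * 2, max_buffer_size)
```

If the decode failure does not actually depend on the buffer size, a loop of this shape would keep doubling until it hits the cap and then raise, which matches what the log shows.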

Hacking the code to just reraise the original exception gives this stack trace:

   File "/usr/local/lib/python2.6/dist-packages/kafka_python-0.9.0_6db14-py2.6.egg/kafka/consumer.py", line 323, in get_messages
     message = self.get_message(block, timeout)
   File "/usr/local/lib/python2.6/dist-packages/kafka_python-0.9.0_6db14-py2.6.egg/kafka/consumer.py", line 343, in get_message
     self._fetch()
   File "/usr/local/lib/python2.6/dist-packages/kafka_python-0.9.0_6db14-py2.6.egg/kafka/consumer.py", line 386, in _fetch
     for message in resp.messages:
   File "/usr/local/lib/python2.6/dist-packages/kafka_python-0.9.0_6db14-py2.6.egg/kafka/protocol.py", line 117, in _decode_message_set_iter
     (msg, cur) = read_int_string(data, cur)
   File "/usr/local/lib/python2.6/dist-packages/kafka_python-0.9.0_6db14-py2.6.egg/kafka/util.py", line 50, in read_int_string
     raise BufferUnderflowError("Not enough data left")
 BufferUnderflowError: Not enough data left

I think it may be a bug in the client logic, since I can read individually from each partition.
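
To illustrate the kind of check that fails in the second trace, here is a minimal sketch of a length-prefixed read over a byte buffer. It is a simplified stand-in written for this example, not the library's `read_int_string`; the function name `read_length_prefixed` and the exception class defined here are assumptions.

```python
import struct


class BufferUnderflowError(Exception):
    """Stand-in for the client's exception of the same name."""


def read_length_prefixed(data, cur):
    """Read a 4-byte big-endian length at data[cur:], then that many bytes.

    Returns (value, new_cursor). A negative length is treated as null.
    """
    if len(data) < cur + 4:
        raise BufferUnderflowError("Not enough data left")
    (length,) = struct.unpack(">i", data[cur:cur + 4])
    cur += 4
    if length < 0:
        return None, cur
    if len(data) < cur + length:
        # The prefix promises more bytes than the buffer holds.
        raise BufferUnderflowError("Not enough data left")
    return data[cur:cur + length], cur + length


payload = struct.pack(">i", 5) + b"hello"
value, end = read_length_prefixed(payload, 0)

try:
    read_length_prefixed(payload[:-2], 0)  # last two bytes cut off
    truncated_raises = False
except BufferUnderflowError:
    truncated_raises = True
```

A reader like this raises the same error whether the buffer was cut short or the cursor is pointing at the wrong offset, so the error alone does not show that the fetch size is the problem.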

Here is a sample response payload:

https://www.dropbox.com/s/cw1i1ct1p1aplp4/raw_fetch_response.txt.gz
