Skip to content

putMetric in MetricsLogger is not ThreadSafe and leading to ArrayIndexOutOfBoundsExceptions #61

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dbadami opened this issue Dec 29, 2020 · 9 comments
Labels
enhancement New feature or request

Comments

@dbadami
Copy link

dbadami commented Dec 29, 2020

Hi,

We're adding metrics using using the MetricsLogger putMetric method in our application. We have multiple threads adding a metric for the same key and we received an ArrayIndexOutOfBoundsException when doing so. In our logs we were able to track it down to the addValue method in MetricDefinition.

Our hypothesis is that this is happening because the resizing of the array list is synchronized on so you can have an element that can being inserted at index that is greater than the current size of the array.

Happy to make the fix if the change is as simple as:

   MetricDefinition(String name, Unit unit, double value) {
        this(name, unit, Collections.synchronizedList(new ArrayList<>(Arrays.asList(value))));
    }

Thanks,

Devavrat

@yaozhaoy yaozhaoy added the enhancement New feature or request label Jan 8, 2021
@yaozhaoy
Copy link
Contributor

yaozhaoy commented Jan 8, 2021

The library is not thread-safe. Your fix can fix one place but there're also other places needs. We will prioritize the enhancement of this.

@axeliux
Copy link

axeliux commented Mar 30, 2021

We also need to make sure that the size of the list does not go beyond the current 100 limit.

#72

@goldmar
Copy link

goldmar commented Apr 20, 2021

The current API is not compatible with being used in a multi-threaded application at all.

Since putProperty, putDimensions, and putMetric are separate calls, it is not possible to publish a metric with a given set of properties and dimensions without other threads potentially changing the properties and dimensions concurrently.

Are there any plans to address this?

@jaredcnance
Copy link
Member

jaredcnance commented Apr 20, 2021

@goldmar this assumes you’re using the same metrics logger instance on multiple threads, right? Can you use a different instance per thread? You could access it through a ThreadLocal to simplify this. There are definitely other enhancements we need to make to improve throughput for socket writing but even with that I think you’d still want a distinct logger per thread. Can you help us understand your use case a little better? It’s possible I’ve misunderstood. Is there some issue you’re hitting where distinct metrics loggers per thread are not working as expected?

@goldmar
Copy link

goldmar commented Apr 20, 2021

@jaredcnance Thank you for the quick reply! We have started building our first service on ECS Fargate and are just trying to figure out the best way to emit metrics. We have configured Firelens with fluentbit and it works well, however I am not sure yet what is the best way to use this logging library. We have an Akka Streams application which consumes data from Kinesis and SQS and writes to Elasticsearch.

I don't think using ThreadLocals is the way to go in an async context; I think I'll create a custom Sink for emitting metrics in Akka instead. This will guarantee that there is only a single thread calling the MetricsLogger.

@jaredcnance
Copy link
Member

jaredcnance commented Apr 20, 2021

I see, so you want to be able to do something like:

  1. Collect some metrics or metadata on a single MetricsLogger
  2. Pass the logger into one or more async contexts where they can add new metrics or metadata
  3. Join the async contexts (e.g. Future.get()) and flush the metrics

Or even:

  1. Collect some metrics or metadata on a single MetricsLogger
  2. Clone the metrics logger and pass into an async context
  3. Flush from the async context

Is that the generalized use case?

@goldmar
Copy link

goldmar commented Apr 20, 2021

So what I would like to do, is pass around a single MetricsLogger around in my async contexts and emit my metrics. This is obviously not possible because MetricsLogger is not only not threadsafe but also has the above mentioned limitations of its API. If you look at the implementations of different Logger classes (log4j1, log4j2, logback), all of these classes are threadsafe. So I think it is reasonable to expect this to be the case here as well, since the use cases are somewhat related.

Since this is currently not possible, a workaround is required. It could be using a ThreadLocal as you suggested above (I am not sure whether this plays well together with Akka though) or some other way of ensuring that the object is not called from multiple threads at the same time.

@jaredcnance
Copy link
Member

jaredcnance commented Apr 20, 2021

We discussed offline and to restate goldmar@'s specific ask is for an additional interface that simplifies the emission of metrics where flush() can be called immediately. This provides a better DX for multithreaded scenarios since user's don't have to be concerned with state and concurrency, but also comes at the loss of colocating related values and increased log volume. In addition, this kind of flush every time behavior should likely be implemented (or used carefully) after we support async writing. Otherwise every individual metric can block on a socket write.

add(
  String key,
  Double value,
  CloudWatchUnit unit,
  List<DimensionSet> dimensions,
  Map<String, String> properties
);

addAll(
  List<Metric> metrics,
  List<DimensionSet> dimensions,
  Map<String, String> properties
);

@Stephen-Bao
Copy link
Contributor

Stephen-Bao commented Jun 7, 2022

@dbadami Hi Devavrat,
As we’re continuing working on enhancing thread safety of the library, we’re trying to gather all available information on reported thread-safe problems. Since now the version has changed to v1.0.5, I would greatly appreciate it if you can provide more information about whether you’re still experiencing issues while using this library with multi-threading, what exception type you get, and any other details about your issue.
I would like to simulate the issue using your specific example as well. Will it be possible to share a part of your application(if not complete) that can help me do that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

7 participants