Skip to content

putDimension with same key is adding always new elements #91

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
miguelcss opened this issue Nov 24, 2021 · 5 comments · Fixed by #105
Closed

putDimension with same key is adding always new elements #91

miguelcss opened this issue Nov 24, 2021 · 5 comments · Fixed by #105
Assignees
Labels
bug Something isn't working

Comments

@miguelcss
Copy link

miguelcss commented Nov 24, 2021

Hello,

I'm using the lib for a Java service on EC2, not serverless, so I create a single instance of the MetricsLogger that is created as @Bean in a Spring context, and then I have have it in the class that log metrics as:

@Component
public class MetricsHelper {

    private MetricsLogger metricsLogger;
    
    public void publishMetric(String operation) {
        metrics.putDimensions(DimensionSet.of("API", operation));
        metricsLogger.putMetric("Request", 1, Unit.COUNT);
        metricsLogger.flush();
    }

Then I use this helper as such

        metricsHelper.publishMetric("SomeOperation");

However, when checking the logGroup, as the logged metrics increase, I see an ever increasing list of "Dimensions", even if the value for the metric is just 1, the event includes a huge Dimensions list

A sample record:

{
    "_aws": {
        "Timestamp": 1637780784996,
        "CloudWatchMetrics": [
            {
                "Namespace": "someNameSpace",
                "Metrics": [
                    {
                        "Name": "Time",
                        "Unit": "Milliseconds"
                    }
                ],
                "Dimensions": [
                    [
                        "LogGroup",
                        "ServiceName",
                        "ServiceType",
                        "Api"
                    ],
                    [
                        "LogGroup",
                        "ServiceName",
                        "ServiceType",
                        "Api"
                    ],
                    [
                        "LogGroup",
                        "ServiceName",
                        "ServiceType",
                        "Api"
                    ],
                    [
                        "LogGroup",
                        "ServiceName",
                        "ServiceType",
                        "Api"
                    ],
                    [
                        "LogGroup",
                        "ServiceName",
                        "ServiceType",
                        "Api"
                    ],
                    [
                        "LogGroup",
                        "ServiceName",
                        "ServiceType",
                        "Api"
                    ],
                    [
                        "LogGroup",
                        "ServiceName",
                        "ServiceType",
                        "Api"
                    ],
                    [
                        "LogGroup",
                        "ServiceName",
                        "ServiceType",
                        "Api"
                    ],
                    [
                        "LogGroup",
                        "ServiceName",
                        "ServiceType",
                        "Api"
                    ],
                    [
                        "LogGroup",
                        "ServiceName",
                        "ServiceType",
                        "Api"
                    ]
                ]
            }
        ],
        "LogGroupName": "SomeService-metrics"
    },
    "Time": 327,
    "Api": "someOperation",
    "LogGroup": "SomeService-metrics",
    "ServiceName": "SomeService",
    "ServiceType": "someType"
}

Digging a bit in code, whenever calling putDimension, indeed a new element is added to a java List<DimensionSet>.
Code is here in MetricDirective.

Should this really be a list? if adding a new dimension that is already used, we could just update the value, this feels like it could be a Map and not a List to avoid this growing record size.

Or, is this the right usage? maybe when doing flush, this list should be cleaned? From the README.md I feel like the usage is correct, and user should not be worried about dimension internal structure (growing, or deleting/removing elements).

So maybe use a map would solve it, or integrate a clean in the flush

@miguelcss
Copy link
Author

miguelcss commented Nov 24, 2021

I see that setDimensions MetricsContext code actually creates a new array, so it can be used to prevent the issue, but by doing this we override any existing defintions such as build in dimensions for LogGroup, ServiceName and ServiceType.

How can we add a new dimension while keeping existing ones, without adding new empty elements in each request to that Dimensions List?

I can create a new object in each metric flush, as it would happen in a Lambda/serverless context, but that seems poor usage of memory for EC2 context for what are similar metric flush, using same setup for a given service.

@miguelcss
Copy link
Author

Any feedback on this?

@arielapostoli arielapostoli self-assigned this Jan 6, 2022
@arielapostoli arielapostoli added the bug Something isn't working label Jan 6, 2022
@arielapostoli
Copy link

This is a bug and we will be working on a fix. We do not have an ETA yet.

@miguelcss
Copy link
Author

@arielapostoli Thank you for the update.

I have a very small PR for simplifying the test docker image , and I have another one prepared with some refactor simplifications, could you, or someone in the team take a look at that PR?
Thanks

@paggynie
Copy link
Contributor

paggynie commented Jun 19, 2022

you can try

public class MetricsHelper {

    private MetricsLogger metricsLogger;

   // separate put dimension with publish. Call this whenever new dimension is needed. Otherwise it just needs to be called once
   public void putDimensions(String operation) {
       metricsLogger.putDimensions(DimensionSet.of("API", operation));
   }
    
    public void publishMetric() {
        metricsLogger.putMetric("Request", 1, Unit.COUNT);
        // flush will copy and keep existing dimensions 
        metricsLogger.flush();
    }

Then use it in following way:

metricsLogger.putDimensions("someOperation");
metrics.publishMetric();
metrics.publishMetric(); // will still keep dimension API: someOperation
metrics.publishMetric();

because after flush, the existing dimension will be copied and kept

Note that we use List<DimensionSet> because it is a list of map of dimensions - The Dimensionset is backed by a map. The generated dimensions in emf format will be "Dimensions": [ [key1, key2], [key3, key4] ] (value is defined in outer json)
if you don't want to modify existing dimension, eg key1, and still want to keep existing dimension, but only want to add new dimensions, eg add [key5, key6], you can use the approach I mentioned above. It will generate [ [key1, key2], [key3, key4], [key5, key6]]

The dimensions you use `["LogGroup", "ServiceName", "ServiceType","Api" ] seems won't change once set so this makes me ^ should be enough

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
4 participants