Skip to content

Memory performance counters #4172

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

Memory performance counters #4172

wants to merge 3 commits into from

Conversation

Mpdreamz
Copy link
Member

@Mpdreamz Mpdreamz commented Oct 22, 2019

This PR updates us to the latest RecycableMemoryStreamManager from:

https://github.com/microsoft/Microsoft.IO.RecyclableMemoryStream

This now implements GetBuffer() so we can investigate removing our ToArray() calls.
We have a pending PR for this for a while now here: #3797

I've extended it so that it emits metrics using the counters System.Diagnostics.Tracing.*Counter

To aid with visualizing and testing this I imported @russcam StackOverflow indexer to our codebase (as Tests.LongRunning).

I've added --profile switch that emits the process id on run and waits for confirmation to run on stdin. At the end of the run it will ask to run again if --profile is specified.

This allows you to run dotnet counters and other profiles/visualizers/tracers:

Start a cluster:

$ ./build.sh cluster readonly

Indexing the askubuntu stack exchange questions and answers:

$  dotnet run -c RELEASE -- posts -e http://localhost:9200 -f /home/mpdreamz/datasets/askubuntu/Posts.xml --profile

Will emit:

[17:34:35.074] Current Process Id: 5290
[17:34:35.080] Press any key to continue...

From another console starting:

dotnet counters monitor --refresh-interval 1 Elasticsearch-Net-RecyclableMemoryStream -p 5290
[Elasticsearch-Net-RecyclableMemoryStream]
    Active memory streams                              0    
    Large buffers active                               0    
    Large buffers free                                 0    
    Large pool in use size                             0    
    Pooled blocks active                             300    
    Small pool free blocks                    39,321,600    
    Small pool in use size                             0    
    Small pool max size                                0 

And after a 2nd run:

[Elasticsearch-Net-RecyclableMemoryStream]
    Active memory streams                              0    
    Large buffers active                               0    
    Large buffers free                                 0    
    Large pool in use size                             0    
    Pooled blocks active                             300    
    Small pool free blocks                    39,321,600    
    Small pool in use size                             0    
    Small pool max size                                0 

Thats 40mb after indexing a 1GB xml file twice.

Pools are reused but there is currently no upper limit to this pool this means that the pool will be as large as the users heaviest load and most likely stays around there.

  • Is this bad? Do we want to tackle this?
  • Default buffer size is quite high would we benefit from a lower buffer size?
  • Do we ever want large buffers in the pool?
  • Can we set Small pool max size to anything that works for everyone?

Related tickets:
#4165
microsoft/Microsoft.IO.RecyclableMemoryStream#75
#3797
microsoft/Microsoft.IO.RecyclableMemoryStream#23

@Mpdreamz
Copy link
Member Author

Closing this in favor of smaller dedicated PR's

@andreminelli
Copy link

There is any chance of these fixes get into v6.x? It's happening on release 6.8.1, too :(

@CSharpBender
Copy link

Is this fixed? I'm seeing this issue in Elasticsearch 7.6.1
Elasticsearch.Net.RecyclableMemoryStreamManager is using 3GB of memory on one of my small services that is indexing some documents from time to time. Initially I thought I have a memory leak in my code.

@Mpdreamz
Copy link
Member Author

Mpdreamz commented Nov 2, 2020

If you update to 7.8.1 or over we changed the default away from Elasticsearch.Net.RecyclableMemoryStreamManager to one that news a MemoryStream everytime. This should behave much better out of the box.

@Mpdreamz
Copy link
Member Author

Mpdreamz commented Nov 2, 2020

Actually on 7.x the default has not changed yet

You can call .MemoryStreamFactory(Elasticsearch.Net.MemoryStreamFactory.Default) to change away from RecyclableMemoryStreamManager

I will add another PR to switch the default (and look at backporting this to 6.x @andreminelli)

@CSharpBender
Copy link

CSharpBender commented Nov 3, 2020

Thanks @Mpdreamz for the fast response.
I did some further tests and the RecyclableMemoryStreamManager misbehaves only when I'm using .DisableDirectStreaming() for debugging purposes. Otherwise the memory is correctly released.
So, for now I'm going to use the Default memory stream only in Debug because it fixes the high memory footprint issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants