Skip to content

Reduce allocations in the CodeLens providers #719

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Aug 24, 2018

Conversation

rjmholt
Copy link
Contributor

@rjmholt rjmholt commented Aug 8, 2018

Went and tackled some easy stuff in the CodeLens provider (since disabling it seems to give the best performance improvement for EditorServices).

There should be no semantic changes from this.

I didn't go after caching, since that's a bit riskier and probably needs some design discussion about where the caches live (i.e. what objects are responsible for them).

var scriptFile =
_editorSession.Workspace.GetFile(
codeLensParams.TextDocument.Uri);
ScriptFile scriptFile = _editorSession.Workspace.GetFile(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like. The rule for using var on my team is this - use var when the type of the RHS is obvious i.e. it's a literal, a new statement or type-cast. And also when required - anonymous types. Do not use var when the type is not obvious, which it usually isn't when invoking a method (or reading a property).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agreed on that rule!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also 100% agree!

Copy link
Collaborator

@SeeminglyScience SeeminglyScience left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! One concern/question

acc.AddRange(providerResult);
}

return acc.ToArray();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think allocating a list and then an array would end up with more allocations. I don't know enough about how SelectMany is implemented to say that with any certainty, but maybe a quick benchmark test would be a good idea.

Speaking of, benchmarks is a good bullet point for #720

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I wrote a nasty little benchmark:

    class Program
    {
        private static Random s_random = new Random();

        static void Main(string[] args)
        {
            var tests = new Dictionary<string, Action>
            {
                { "Select Many Concatenation", ConcatWithSelectMany },
                { "List Build Concatentation", ConcatWithList },
            };

            RunBenchmark(tests);
        }

        private static void ConcatWithSelectMany()
        {
            GenerateMultiData(d => d.ToTransformedDummyData()).SelectMany(d => d).ToArray();
        }

        private static void ConcatWithList()
        {
            var acc = new List<TransformedDummyData>();
            foreach (TransformedDummyData[] dataArr in GenerateMultiData(d => d.ToTransformedDummyData()))
            {
                acc.AddRange(dataArr);
            }
            acc.ToArray();
        } 

        private static IEnumerable<TransformedDummyData[]> GenerateMultiData(Func<DummyData, TransformedDummyData> process, int quantity = 5)
        {
            for (int i = 0; i < quantity; i++)
            {
                yield return GenerateData(s_random.Next(5, 15)).Select(process).ToArray();
            }
        }

        private static IEnumerable<DummyData> GenerateData(int quantity = 10)
        {
            for (int i = 0; i < quantity; i++)
            {
                yield return new DummyData(s_random.Next(), new object(), Guid.NewGuid().ToString());
            }
        }

        private static IntervalData Time(string testName, Action action, int repeats = 1000, int subInterval = 100)
        {
            var times = new List<double>(repeats);
            for (int i = 0; i < repeats; i++)
            {
                var watch = Stopwatch.StartNew();
                for (int j = 0; j < subInterval; j++)
                {
                    action();
                }
                watch.Stop();
                times.Add((double)watch.ElapsedMilliseconds / subInterval);
            }

            double maxTime = times.Max();
            double minTime = times.Min();
            double meanTime = times.Average();
            double[] sortedTimes = times.OrderBy(t => t).ToArray();

            return new IntervalData
            {
                TestName = testName,
                MaxTime = maxTime,
                MaxTimeIndex = times.IndexOf(maxTime),
                MinTime = minTime,
                MinTimeIndex = times.IndexOf(minTime),
                TotalTime = times.Sum(),
                MeanTime = meanTime,
                StdDev = Math.Sqrt(times.Sum(t => Math.Pow(t - meanTime, 2)) / (times.Count - 1)),
                MedianTime = sortedTimes.Length % 2 == 0 ? sortedTimes[sortedTimes.Length / 2] : sortedTimes[sortedTimes.Length / 2 + 1]
            };
        }

        private static void RunBenchmark(IDictionary<string, Action> tests)
        {
            foreach (KeyValuePair<string, Action> test in tests)
            {
                IntervalData result = Time(test.Key, test.Value);
                Console.WriteLine($"Test {result.TestName}:");
                Console.WriteLine($"\tTotal Time:\t\t\t{result.TotalTime}");
                Console.WriteLine($"\tMean Time:\t\t\t{result.MeanTime}");
                Console.WriteLine($"\tMedian Time:\t\t\t{result.MedianTime}");
                Console.WriteLine($"\tStd Dev:\t\t\t{result.StdDev}");
                Console.WriteLine($"\tMax Time:\t\t\t{result.MaxTime}");
                Console.WriteLine($"\tMax Time Index:\t\t\t{result.MaxTimeIndex}");
                Console.WriteLine($"\tMin Time:\t\t\t{result.MinTime}");
                Console.WriteLine($"\tMin Time Index:\t\t\t{result.MinTimeIndex}");
                Console.WriteLine();
            }
        }
    }

    public class DummyData
    {
        public DummyData(int x, object y, string z)
        {
            X = x;
            Y = y;
            Z = z;
        }

        public int X { get; }

        public object Y { get; }

        public string Z { get; }
    }

    public static class DummyDataExtensions
    {
        public static TransformedDummyData ToTransformedDummyData(this DummyData data)
        {
            return new TransformedDummyData(data.X+1, data.Z?.ToUpper());
        }
    }

    public class TransformedDummyData
    {
        public TransformedDummyData(int x, string z)
        {
            X = x;
            Z = z;
        }

        public int X { get; }

        public string Z { get; }
    }

    public class IntervalData
    {
        public string TestName;

        public double MaxTime;

        public int MaxTimeIndex;

        public double MinTime;

        public int MinTimeIndex;

        public double TotalTime;

        public double MeanTime;

        public double MedianTime;

        public double StdDev;
    }

With the following results:

Test Select Many Concatenation:
	Total Time:			60.7800000000007
	Mean Time:			0.0607800000000007
	Median Time:			0.06
	Std Dev:			0.00494125652198061
	Max Time:			0.13
	Max Time Index:			0
	Min Time:			0.06
	Min Time Index:			3

Test List Build Concatentation:
	Total Time:			60.4900000000008
	Mean Time:			0.0604900000000008
	Median Time:			0.06
	Std Dev:			0.00472040063788888
	Max Time:			0.13
	Max Time Index:			431
	Min Time:			0.06
	Min Time Index:			1


Don't know how representative my benchmark really is, but given that result, I will revert to SelectMany since it's simpler and may end up allocating less in the long run.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

I converted your benchmark to use BenchmarkDotNet.

Here's the results I got in CoreCLR/Windows

Method Mean Error StdDev Scaled ScaledSD Gen 0 Allocated
ConcatWithSelectMany 36.28 us 0.7209 us 0.8302 us 1.00 0.00 5.4321 16.82 KB
ConcatWithList 38.01 us 0.9687 us 2.8562 us 1.05 0.08 5.6763 17.6 KB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, see, that's what I should be doing -- using the grown ups tools.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested on Linux btw. So Framework is possibly different.


return lenses;
return lenses.ToArray();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echo: SelectMany vs List > Array

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to leave this one on the basis of avoiding some LINQ closures

}
}

return acc.ToArray();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echo: SelectMany vs List > Array

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto on LINQ

Range = foundReference.ScriptRegion.ToRange()
});
}
referenceLocations = acc.ToArray();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echo: SelectMany vs List > Array

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto on LINQ

Copy link
Member

@TylerLeonhardt TylerLeonhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just a couple questions

this.symbolProvider
.ProvideDocumentSymbols(scriptFile);
var lenses = new List<CodeLens>();
foreach (SymbolReference symbol in _symbolProvider.ProvideDocumentSymbols(scriptFile))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only calls ProvideDocumentSymbols once and iterates through the result, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, yes. Why do you ask?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spoke to @SeeminglyScience about this. We're good

symbol.ScriptRegion))
.ToArray();
var acc = new List<CodeLens>();
foreach (SymbolReference sym in _symbolProvider.ProvideDocumentSymbols(scriptFile))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question

else
{
var acc = new List<Location>();
foreach (SymbolReference foundReference in referencesResult.FoundReferences)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like all these loops could be a little more DRY.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in each particular loop? Or the fact we are doing loops like this in each call?

They were decidedly pithier as LINQ expressions, but that meant inefficiencies. The problem is that the logic inside each loop is different, so we either write out the loop or use some kind of function pointer or closure.

But I guess to address the DRY need, we should:

  • Identify what the repetition you're seeing is, and
  • Work out what the best refactoring is that preserves efficiency

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the foreach's were all doing similar things but they're actually quite different :) we're good

.SelectMany(r => r)
.ToArray();
return InvokeProviders(provider => provider.ProvideCodeLenses(scriptFile))
.SelectMany(codeLens => codeLens)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More LINQ to remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the one I put back in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, see #719 (comment)

@rjmholt
Copy link
Contributor Author

rjmholt commented Aug 17, 2018

AppVeyor keeps timing out. We may need to run the tests manually

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants