LINQPad pathologically slow behavior on large scripts after many edits
The syntax-highlighting editor used by LINQPad 6 (and presumably LINQPad 5) exhibits O(n^2) behavior after many edits
(a second bug: Ctrl-C does not copy the currently selected text in the About Box. Also, it would probably be good if there was a way
to copy the LINQPad version and CPU type..)
LINQPad 6 v6.6.8 (X64)
.NET Core version (host): 3.1.1
.NET Core version (queries): 3.1.1
Roslyn Version: 3.4.0-beta4-19568-04
FSharp.Compiler.Service version: 188.8.131.52
I often use LINQPad to prototype new designs for things; between the built-in debugger and its expansive visualization capabilities, I can very rapidly test out new ideas. As such, I frequently end up with scripts on the order of 10KB, sometimes much larger.
One thing that has afflicted my LINQPad (versions 5 and 6) is that for some queries -- and it always seems to be large ones -- the editor starts slowing down noticeably: consider, for example, the incident that prompted this posting: I could start typing "the quick brown fox jumped over the lazy dogs", by the time I finish typing "dogs", the UI hadn't even started processing "over" yet.
When this happens, if I close and re-open the query, the UI becomes speedy again, at least for a little while; eventually, however, it starts slowing down again. This also occurs if I choose Clone Query: the new query window is nice and responsive, while the old one continues to be painfully slow. (This is insidious, because it means that I can't just file a bug report saying "this query is slow to edit" because it won't be, not at first.)
Troubleshooting attempt 1: ETW trace
I ran an ETW trace while the problem was happening, and I saw LINQPad6.exe sitting at a baseline of about 12.5% CPU usage with brief spikes of up to 25%. As this is a i7-6700 (4-cores/8-threads), that suggests one thread (the UI thread) steadily consuming 100% of a single processing unit, with a second thread doing brief bursts of work on a second processing unit.
Troubleshooting Attempt 2: Visual Studio trace
This is where I struck gold.
I ran a CPU sampling with VS2019, and that gave me a much more clear picture: 66% of the CPU time was spent in the
StringBuilder::get_Chars method, which is the indexer for StringBuilder (
char ch = sb[idx]).
The biggest callers of this method are:
(everything else was less than 1%)
Some interesting facts about the StringBuilder implementation in .NET Core and .NET 4 (though not .NET 2):
- It stores the character data in a "rope" structure: a singly-linked list of chunks, stored in reverse order: that, is the head of the list is the last chunk, and each element contains a link to the previous chunk. (This makes a lot of sense, because since StringBuilder is for building a string; as such, most edits are likely to happen at the end, and having direct access to the end means appends are O(1).)
- If you add a very large amount of data to the StringBuilder at once -- such as when you initialize it from a string, or have very large Insert/Append -- it'll put it all into one oversized chunk
- If you're inserting short strings -- such as single characters -- into the middle, however, it'll break things up into 16-character chunks, to minimize the amount of data movement needed.
Therefore, if you're inserting one character at a time into a StringBuilder, you'll mostly end up with a linked-list of 16-character chunks.
Therefore, after enough edits, the average number of steps needed to retrieve the character at a specific index is ((n / 16) / 2), which means that the StringBuilder index algorithm is O(n).
However, those methods?
GetCharacterColumn? I suspect they're being called in a loop, for all of the characters in the file. Which is n operations. And doing an O(n) operation n times is O(n^2).
This explains why reopening the file makes things fast again: the StringBuilder is initialized once with the entire contents of the file, so it's a single chunk -- making
get_Chars O(1), and the editor's methods O(n). But as I start editing -- and most of my edits get done at the beginning of the file -- it starts slowing down.
As a final test, I went back to the tab exhibiting one-second-per-letter behavior, scrolled to the very end (so viewing and editing on the head chunk) and started typing; the UI reacted instantly. I went back to the very first line and started typing, and lo! One second per letter.
Thanks for that excellent analysis. I've written a query to test the performance characteristics of StringBuilder under those conditions and you're absolutely correct:
I've just uploaded a new LINQPad 6 beta (6.7.3) with the editor modified such that it recreates the StringBuilder every 250 edits to prevent excessive fragmentation. Let me know whether this fixes the problem.
Joe, I tried the beta 6.7.3 and seems to offer a dramatic improvement in performance.
I wrote a script to send a series of "// xxxxxxx" lines (100 'x'es per line) using SendKeys/SendInput, and while there's still a problem somewhere, it's obviously much faster:
(at this point the script crashed; I have no idea why, I suspect a bug in the Automation code I'm using to interrogate the text editor.)
On LINQPad 6.7.3 things looked much better:
It's obvious that something is slowing down eventually, but it's a lot less noticeable.
If you want to hammer it a lot on your own, here's a script that can do that:
When running it, turning on CAPS LOCK for a few seconds should cause it to stop trying to dump keyboard output, so you can switch to another application without causing massive problems.
I suspect you might need a different data structure entirely to store your text -- but since the real fault seems to lie in the editor control, I doubt you can simply swap in a new one. You've made it not-pathologically-slow, but it's still doing a https://wiki.c2.com/?ShlemielThePainter algorithm.
Thanks, that's good news. I'll add the same fix to LINQPad 5 in the next few days. In terms of further improving performance, I believe that the bottlenecks are no longer in StringBuilder (although let me know if you have evidence otherwise).
I've run another trace, and I definitely don't see StringBuilder::get_Chars() on there anymore.
Unfortunately, I can't work out exactly where the next slowdown lies, because I'm running up against what seems to be a Visual Studio bug, and I don't know how to report it to Microsoft.
In multiple places where an async method is invoked, there's a call out to EtwEventWriteTransfer (in advapi32.dll), below which lies a whole new copy of the call stack, all the way up to the __scrt_common_main_seh call (the executable entry point)! As a result, a large majority of methods have their CPU time incorrectly computed, and I can't really find the true hot spots.
Hope this is not out of place. But I just wanted to post a huge
I've been using LINQPad for years in much the same way as OP to prototype ideas and sometimes they get fairly complex and lengthy. And this has been a very significant pain point for me, that the editing gets incredibly slow after awhile. I am so happy that the root cause of this has been uncovered. Please accept my sincerest gratitude!
Thanks for your support! And a big thank-you to @StevieO.
Joined the forums just to echo the THANK YOU sentiments! I've been working around it by copy/pasting the text, or losing my undo buffer. It makes so much more sense now!