LINQPad pathologically slow behavior on large scripts after many edits
The syntax-highlighting editor used by LINQPad 6 (and presumably LINQPad 5) exhibits O(n^2) behavior after many edits
(a second bug: Ctrl-C does not copy the currently selected text in the About Box. Also, it would probably be good if there was a way
to copy the LINQPad version and CPU type..)
LINQPad 6 v6.6.8 (X64)
.NET Core version (host): 3.1.1
.NET Core version (queries): 3.1.1
Roslyn Version: 3.4.0-beta4-19568-04
FSharp.Compiler.Service version: 184.108.40.206
I often use LINQPad to prototype new designs for things; between the built-in debugger and its expansive visualization capabilities, I can very rapidly test out new ideas. As such, I frequently end up with scripts on the order of 10KB, sometimes much larger.
One thing that has afflicted my LINQPad (versions 5 and 6) is that for some queries -- and it always seems to be large ones -- the editor starts slowing down noticeably: consider, for example, the incident that prompted this posting: I could start typing "the quick brown fox jumped over the lazy dogs", by the time I finish typing "dogs", the UI hadn't even started processing "over" yet.
When this happens, if I close and re-open the query, the UI becomes speedy again, at least for a little while; eventually, however, it starts slowing down again. This also occurs if I choose Clone Query: the new query window is nice and responsive, while the old one continues to be painfully slow. (This is insidious, because it means that I can't just file a bug report saying "this query is slow to edit" because it won't be, not at first.)
Troubleshooting attempt 1: ETW trace
I ran an ETW trace while the problem was happening, and I saw LINQPad6.exe sitting at a baseline of about 12.5% CPU usage with brief spikes of up to 25%. As this is a i7-6700 (4-cores/8-threads), that suggests one thread (the UI thread) steadily consuming 100% of a single processing unit, with a second thread doing brief bursts of work on a second processing unit.
Troubleshooting Attempt 2: Visual Studio trace
This is where I struck gold.
I ran a CPU sampling with VS2019, and that gave me a much more clear picture: 66% of the CPU time was spent in the
StringBuilder::get_Chars method, which is the indexer for StringBuilder (
char ch = sb[idx]).
The biggest callers of this method are:
(everything else was less than 1%)
Some interesting facts about the StringBuilder implementation in .NET Core and .NET 4 (though not .NET 2):
- It stores the character data in a "rope" structure: a singly-linked list of chunks, stored in reverse order: that, is the head of the list is the last chunk, and each element contains a link to the previous chunk. (This makes a lot of sense, because since StringBuilder is for building a string; as such, most edits are likely to happen at the end, and having direct access to the end means appends are O(1).)
- If you add a very large amount of data to the StringBuilder at once -- such as when you initialize it from a string, or have very large Insert/Append -- it'll put it all into one oversized chunk
- If you're inserting short strings -- such as single characters -- into the middle, however, it'll break things up into 16-character chunks, to minimize the amount of data movement needed.
Therefore, if you're inserting one character at a time into a StringBuilder, you'll mostly end up with a linked-list of 16-character chunks.
Therefore, after enough edits, the average number of steps needed to retrieve the character at a specific index is ((n / 16) / 2), which means that the StringBuilder index algorithm is O(n).
However, those methods?
GetCharacterColumn? I suspect they're being called in a loop, for all of the characters in the file. Which is n operations. And doing an O(n) operation n times is O(n^2).
This explains why reopening the file makes things fast again: the StringBuilder is initialized once with the entire contents of the file, so it's a single chunk -- making
get_Chars O(1), and the editor's methods O(n). But as I start editing -- and most of my edits get done at the beginning of the file -- it starts slowing down.
As a final test, I went back to the tab exhibiting one-second-per-letter behavior, scrolled to the very end (so viewing and editing on the head chunk) and started typing; the UI reacted instantly. I went back to the very first line and started typing, and lo! One second per letter.