Home

LINQPad pathologically slow behavior on large scripts after many edits

TL;DR

The syntax-highlighting editor used by LINQPad 6 (and presumably LINQPad 5) exhibits O(n^2) behavior after many edits

Version info

(a second bug: Ctrl-C does not copy the currently selected text in the About Box. Also, it would probably be good if there was a way
to copy the LINQPad version and CPU type..)

LINQPad 6 v6.6.8 (X64)
.NET Core version (host): 3.1.1
.NET Core version (queries): 3.1.1
Roslyn Version: 3.4.0-beta4-19568-04
FSharp.Compiler.Service version: 22.0.3.0

Background

I often use LINQPad to prototype new designs for things; between the built-in debugger and its expansive visualization capabilities, I can very rapidly test out new ideas. As such, I frequently end up with scripts on the order of 10KB, sometimes much larger.

One thing that has afflicted my LINQPad (versions 5 and 6) is that for some queries -- and it always seems to be large ones -- the editor starts slowing down noticeably: consider, for example, the incident that prompted this posting: I could start typing "the quick brown fox jumped over the lazy dogs", by the time I finish typing "dogs", the UI hadn't even started processing "over" yet.

When this happens, if I close and re-open the query, the UI becomes speedy again, at least for a little while; eventually, however, it starts slowing down again. This also occurs if I choose Clone Query: the new query window is nice and responsive, while the old one continues to be painfully slow. (This is insidious, because it means that I can't just file a bug report saying "this query is slow to edit" because it won't be, not at first.)

Troubleshooting attempt 1: ETW trace

I ran an ETW trace while the problem was happening, and I saw LINQPad6.exe sitting at a baseline of about 12.5% CPU usage with brief spikes of up to 25%. As this is a i7-6700 (4-cores/8-threads), that suggests one thread (the UI thread) steadily consuming 100% of a single processing unit, with a second thread doing brief bursts of work on a second processing unit.

Troubleshooting Attempt 2: Visual Studio trace

This is where I struck gold.

I ran a CPU sampling with VS2019, and that gave me a much more clear picture: 66% of the CPU time was spent in the StringBuilder::get_Chars method, which is the indexer for StringBuilder (char ch = sb[idx]).

The biggest callers of this method are:

Method CPU time
EditorView::DrawTextAreaDisplayLines 26%
DisplayLineCollection::GetLayout 25%
SyntaxEditorRenderingCache::MeasureText 10%
DisplayLineCollection::GetLineLength 3%
EditorView::GetCharacterColumn 1%

(everything else was less than 1%)

StringBuilder internals

Some interesting facts about the StringBuilder implementation in .NET Core and .NET 4 (though not .NET 2):

  1. It stores the character data in a "rope" structure: a singly-linked list of chunks, stored in reverse order: that, is the head of the list is the last chunk, and each element contains a link to the previous chunk. (This makes a lot of sense, because since StringBuilder is for building a string; as such, most edits are likely to happen at the end, and having direct access to the end means appends are O(1).)
  2. If you add a very large amount of data to the StringBuilder at once -- such as when you initialize it from a string, or have very large Insert/Append -- it'll put it all into one oversized chunk
  3. If you're inserting short strings -- such as single characters -- into the middle, however, it'll break things up into 16-character chunks, to minimize the amount of data movement needed.

Therefore, if you're inserting one character at a time into a StringBuilder, you'll mostly end up with a linked-list of 16-character chunks.

Therefore, after enough edits, the average number of steps needed to retrieve the character at a specific index is ((n / 16) / 2), which means that the StringBuilder index algorithm is O(n).

However, those methods? DrawTextAreaDisplayLines, GetLayout, MeasureText, GetCharacterColumn? I suspect they're being called in a loop, for all of the characters in the file. Which is n operations. And doing an O(n) operation n times is O(n^2).

This explains why reopening the file makes things fast again: the StringBuilder is initialized once with the entire contents of the file, so it's a single chunk -- making get_Chars O(1), and the editor's methods O(n). But as I start editing -- and most of my edits get done at the beginning of the file -- it starts slowing down.

As a final test, I went back to the tab exhibiting one-second-per-letter behavior, scrolled to the very end (so viewing and editing on the head chunk) and started typing; the UI reacted instantly. I went back to the very first line and started typing, and lo! One second per letter.

Comments

  • Thanks for that excellent analysis. I've written a query to test the performance characteristics of StringBuilder under those conditions and you're absolutely correct:
    http://share.linqpad.net/54c67i.linq

    I've just uploaded a new LINQPad 6 beta (6.7.3) with the editor modified such that it recreates the StringBuilder every 250 edits to prevent excessive fragmentation. Let me know whether this fixes the problem.

  • Joe, I tried the beta 6.7.3 and seems to offer a dramatic improvement in performance.

    I wrote a script to send a series of "// xxxxxxx" lines (100 'x'es per line) using SendKeys/SendInput, and while there's still a problem somewhere, it's obviously much faster:

    LINQPad 6.6.8:

    [2020-03-16T18:15:09.8092607-04:00] Added 153 characters in 00:00:00.9954097 (153.7/sec)
    [2020-03-16T18:15:10.8089573-04:00] Added 137 characters in 00:00:00.9994679 (137.1/sec)
    [2020-03-16T18:15:11.8122872-04:00] Added 125 characters in 00:00:01.0035911 (124.6/sec)
    [2020-03-16T18:15:12.8089527-04:00] Added 111 characters in 00:00:00.9961522 (111.4/sec)
    [2020-03-16T18:15:14.8069482-04:00] Added 186 characters in 00:00:01.9987840 (93.1/sec)
    [2020-03-16T18:15:15.8083327-04:00] Added 84 characters in 00:00:01.0006835 (83.9/sec)
    [2020-03-16T18:15:16.8094619-04:00] Added 74 characters in 00:00:01.0011186 (73.9/sec)
    [2020-03-16T18:15:18.8099845-04:00] Added 134 characters in 00:00:02.0010248 (67.0/sec)
    [2020-03-16T18:15:19.8074672-04:00] Added 62 characters in 00:00:00.9970201 (62.2/sec)
    [2020-03-16T18:15:21.8155571-04:00] Added 117 characters in 00:00:02.0081069 (58.3/sec)
    [2020-03-16T18:15:23.8181305-04:00] Added 103 characters in 00:00:02.0020217 (51.4/sec)
    [2020-03-16T18:15:25.8240144-04:00] Added 100 characters in 00:00:02.0065665 (49.8/sec)
    [2020-03-16T18:15:27.8259802-04:00] Added 94 characters in 00:00:02.0011751 (47.0/sec)
    [2020-03-16T18:15:30.8112390-04:00] Added 126 characters in 00:00:02.9854697 (42.2/sec)
    [2020-03-16T18:15:32.8297998-04:00] Added 76 characters in 00:00:02.0185896 (37.7/sec)
    [2020-03-16T18:15:35.8210084-04:00] Added 104 characters in 00:00:02.9905225 (34.8/sec)
    [2020-03-16T18:15:39.8152705-04:00] Added 125 characters in 00:00:03.9946897 (31.3/sec)
    [2020-03-16T18:15:42.8135855-04:00] Added 86 characters in 00:00:02.9982165 (28.7/sec)
    [2020-03-16T18:15:46.8414000-04:00] Added 107 characters in 00:00:04.0272866 (26.6/sec)
    [2020-03-16T18:15:51.8131021-04:00] Added 117 characters in 00:00:04.9719634 (23.5/sec)
    [2020-03-16T18:15:56.8148669-04:00] Added 106 characters in 00:00:05.0022651 (21.2/sec)
    [2020-03-16T18:16:01.8435579-04:00] Added 96 characters in 00:00:05.0283232 (19.1/sec)
    [2020-03-16T18:16:07.8348406-04:00] Added 105 characters in 00:00:05.9914205 (17.5/sec)
    [2020-03-16T18:16:13.8532989-04:00] Added 97 characters in 00:00:06.0184253 (16.1/sec)
    [2020-03-16T18:16:21.8391721-04:00] Added 117 characters in 00:00:07.9854329 (14.7/sec)
    [2020-03-16T18:16:28.8612767-04:00] Added 94 characters in 00:00:07.0215562 (13.4/sec)
    [2020-03-16T18:16:37.8173122-04:00] Added 110 characters in 00:00:08.9567701 (12.3/sec)
    [2020-03-16T18:16:46.8678253-04:00] Added 100 characters in 00:00:09.0503310 (11.0/sec)
    [2020-03-16T18:16:57.8371668-04:00] Added 111 characters in 00:00:10.9693085 (10.1/sec)
    [2020-03-16T18:17:08.8084459-04:00] Added 99 characters in 00:00:10.9706173 (9.0/sec)
    [2020-03-16T18:17:21.8147713-04:00] Added 107 characters in 00:00:13.0066765 (8.2/sec)
    [2020-03-16T18:17:34.8793867-04:00] Added 104 characters in 00:00:13.0639042 (8.0/sec)
    [2020-03-16T18:17:48.8128941-04:00] Added 109 characters in 00:00:13.9337089 (7.8/sec)
    

    (at this point the script crashed; I have no idea why, I suspect a bug in the Automation code I'm using to interrogate the text editor.)

    On LINQPad 6.7.3 things looked much better:

    [2020-03-16T18:19:10.1427416-04:00] Added 200 characters in 00:00:01.0088760 (198.2/sec)
    [2020-03-16T18:19:11.1383161-04:00] Added 180 characters in 00:00:00.9946775 (181.0/sec)
    [2020-03-16T18:19:12.1426626-04:00] Added 171 characters in 00:00:01.0047496 (170.2/sec)
    [2020-03-16T18:19:13.1391471-04:00] Added 161 characters in 00:00:00.9956863 (161.7/sec)
    [2020-03-16T18:19:14.1414325-04:00] Added 157 characters in 00:00:01.0027708 (156.6/sec)
    [2020-03-16T18:19:15.1398356-04:00] Added 148 characters in 00:00:00.9978507 (148.3/sec)
    [2020-03-16T18:19:16.1434805-04:00] Added 144 characters in 00:00:01.0038047 (143.5/sec)
    [2020-03-16T18:19:17.1418713-04:00] Added 135 characters in 00:00:00.9983976 (135.2/sec)
    [2020-03-16T18:19:18.1397343-04:00] Added 135 characters in 00:00:00.9981770 (135.2/sec)
    [2020-03-16T18:19:19.1394410-04:00] Added 127 characters in 00:00:00.9991222 (127.1/sec)
    [2020-03-16T18:19:20.1407617-04:00] Added 127 characters in 00:00:01.0012870 (126.8/sec)
    [2020-03-16T18:19:21.1403604-04:00] Added 124 characters in 00:00:00.9998809 (124.0/sec)
    [2020-03-16T18:19:22.1443829-04:00] Added 121 characters in 00:00:01.0032315 (120.6/sec)
    [2020-03-16T18:19:23.1397225-04:00] Added 113 characters in 00:00:00.9961540 (113.4/sec)
    [2020-03-16T18:19:24.1411483-04:00] Added 111 characters in 00:00:01.0005790 (110.9/sec)
    [2020-03-16T18:19:25.1393991-04:00] Added 114 characters in 00:00:00.9988968 (114.1/sec)
    [2020-03-16T18:19:26.1401060-04:00] Added 108 characters in 00:00:01.0006846 (107.9/sec)
    [2020-03-16T18:19:27.1385183-04:00] Added 113 characters in 00:00:00.9981560 (113.2/sec)
    [2020-03-16T18:19:28.1405192-04:00] Added 118 characters in 00:00:01.0015870 (117.8/sec)
    [2020-03-16T18:19:29.1410583-04:00] Added 115 characters in 00:00:01.0000372 (115.0/sec)
    [2020-03-16T18:19:30.1421637-04:00] Added 116 characters in 00:00:01.0018351 (115.8/sec)
    [2020-03-16T18:19:31.1393259-04:00] Added 106 characters in 00:00:00.9965382 (106.4/sec)
    [2020-03-16T18:19:32.1413579-04:00] Added 109 characters in 00:00:01.0022345 (108.8/sec)
    [2020-03-16T18:19:33.1423143-04:00] Added 104 characters in 00:00:01.0011187 (103.9/sec)
    [2020-03-16T18:19:34.1411617-04:00] Added 103 characters in 00:00:00.9988180 (103.1/sec)
    [2020-03-16T18:19:35.1406057-04:00] Added 100 characters in 00:00:00.9993622 (100.1/sec)
    [2020-03-16T18:19:37.1426302-04:00] Added 192 characters in 00:00:02.0011833 (95.9/sec)
    [2020-03-16T18:19:38.1471346-04:00] Added 95 characters in 00:00:01.0047634 (94.5/sec)
    [2020-03-16T18:19:39.1457866-04:00] Added 89 characters in 00:00:00.9991945 (89.1/sec)
    

    It's obvious that something is slowing down eventually, but it's a lot less noticeable.
    If you want to hammer it a lot on your own, here's a script that can do that:

    http://share.linqpad.net/27mn5p.linq

    When running it, turning on CAPS LOCK for a few seconds should cause it to stop trying to dump keyboard output, so you can switch to another application without causing massive problems.

    I suspect you might need a different data structure entirely to store your text -- but since the real fault seems to lie in the editor control, I doubt you can simply swap in a new one. You've made it not-pathologically-slow, but it's still doing a https://wiki.c2.com/?ShlemielThePainter algorithm.

  • Thanks, that's good news. I'll add the same fix to LINQPad 5 in the next few days. In terms of further improving performance, I believe that the bottlenecks are no longer in StringBuilder (although let me know if you have evidence otherwise).

  • I've run another trace, and I definitely don't see StringBuilder::get_Chars() on there anymore.

    Unfortunately, I can't work out exactly where the next slowdown lies, because I'm running up against what seems to be a Visual Studio bug, and I don't know how to report it to Microsoft.

    In multiple places where an async method is invoked, there's a call out to EtwEventWriteTransfer (in advapi32.dll), below which lies a whole new copy of the call stack, all the way up to the __scrt_common_main_seh call (the executable entry point)! As a result, a large majority of methods have their CPU time incorrectly computed, and I can't really find the true hot spots.

  • Hope this is not out of place. But I just wanted to post a huge

    THANK YOU!!!

    I've been using LINQPad for years in much the same way as OP to prototype ideas and sometimes they get fairly complex and lengthy. And this has been a very significant pain point for me, that the editing gets incredibly slow after awhile. I am so happy that the root cause of this has been uncovered. Please accept my sincerest gratitude!

    Best Regards,
    ~ mellamokb

  • Thanks for your support! And a big thank-you to @StevieO.

Sign In or Register to comment.