The syntax-highlighting editor used by LINQPad 6 (and presumably LINQPad 5) exhibits O(n^2) behavior after many edits
(a second bug: Ctrl-C does not copy the currently selected text in the About Box. Also, it would probably be good if there was a way
to copy the LINQPad version and CPU type..)
LINQPad 6 v6.6.8 (X64)
.NET Core version (host): 3.1.1
.NET Core version (queries): 3.1.1
Roslyn Version: 3.4.0-beta4-19568-04
FSharp.Compiler.Service version: 184.108.40.206
I often use LINQPad to prototype new designs for things; between the built-in debugger and its expansive visualization capabilities, I can very rapidly test out new ideas. As such, I frequently end up with scripts on the order of 10KB, sometimes much larger.
One thing that has afflicted my LINQPad (versions 5 and 6) is that for some queries -- and it always seems to be large ones -- the editor starts slowing down noticeably: consider, for example, the incident that prompted this posting: I could start typing "the quick brown fox jumped over the lazy dogs", by the time I finish typing "dogs", the UI hadn't even started processing "over" yet.
When this happens, if I close and re-open the query, the UI becomes speedy again, at least for a little while; eventually, however, it starts slowing down again. This also occurs if I choose Clone Query: the new query window is nice and responsive, while the old one continues to be painfully slow. (This is insidious, because it means that I can't just file a bug report saying "this query is slow to edit" because it won't be, not at first.)
Troubleshooting attempt 1: ETW trace
I ran an ETW trace while the problem was happening, and I saw LINQPad6.exe sitting at a baseline of about 12.5% CPU usage with brief spikes of up to 25%. As this is a i7-6700 (4-cores/8-threads), that suggests one thread (the UI thread) steadily consuming 100% of a single processing unit, with a second thread doing brief bursts of work on a second processing unit.
Troubleshooting Attempt 2: Visual Studio trace
This is where I struck gold.
I ran a CPU sampling with VS2019, and that gave me a much more clear picture: 66% of the CPU time was spent in the
StringBuilder::get_Chars method, which is the indexer for StringBuilder (
char ch = sb[idx]).
The biggest callers of this method are:
(everything else was less than 1%)
Some interesting facts about the StringBuilder implementation in .NET Core and .NET 4 (though not .NET 2):
- It stores the character data in a "rope" structure: a singly-linked list of chunks, stored in reverse order: that, is the head of the list is the last chunk, and each element contains a link to the previous chunk. (This makes a lot of sense, because since StringBuilder is for building a string; as such, most edits are likely to happen at the end, and having direct access to the end means appends are O(1).)
- If you add a very large amount of data to the StringBuilder at once -- such as when you initialize it from a string, or have very large Insert/Append -- it'll put it all into one oversized chunk
- If you're inserting short strings -- such as single characters -- into the middle, however, it'll break things up into 16-character chunks, to minimize the amount of data movement needed.
Therefore, if you're inserting one character at a time into a StringBuilder, you'll mostly end up with a linked-list of 16-character chunks.
Therefore, after enough edits, the average number of steps needed to retrieve the character at a specific index is ((n / 16) / 2), which means that the StringBuilder index algorithm is O(n).
However, those methods?
GetCharacterColumn? I suspect they're being called in a loop, for all of the characters in the file. Which is n operations. And doing an O(n) operation n times is O(n^2).
This explains why reopening the file makes things fast again: the StringBuilder is initialized once with the entire contents of the file, so it's a single chunk -- making
get_Chars O(1), and the editor's methods O(n). But as I start editing -- and most of my edits get done at the beginning of the file -- it starts slowing down.
As a final test, I went back to the tab exhibiting one-second-per-letter behavior, scrolled to the very end (so viewing and editing on the head chunk) and started typing; the UI reacted instantly. I went back to the very first line and started typing, and lo! One second per letter.
In addition to targeting .NET Core 3, there are a bunch of new features:
- You can now add references to other .linq files, via the the new #load directive.
- The extensibility model for writing custom data context drivers has been updated for LINQPad 6.
Creating a project is now as simple as running a LINQPad script, and publishing it is as simple as publishing a NuGet package.
- When executing a selection in 'C# Program' mode, the selected text can now call other methods in the query.
- The 'Go to definition' shortcut (F12) now works for symbols defined in My Extensions (as well as #load-ed queries).
- Outlining now works in statements mode, too, if you have #regions or methods defined.
- If you hold down the control key anywhere in the editor, it will display the quick-info tooltip, as well as any errors or warnings on that line.
- If you type /// before a member, LINQPad will now expand it into a simple (one-line) XML summary. These summaries are picked up from #load-ed queries, as well as My Extensions.
- The back-end for the NuGet Package Manager has been re-written to be faster and more reliable. It now works directly from the local user cache,
and recognizes reference assemblies and native dependencies in line with .NET Core protocols.
When restoring packages, LINQPad 6 searches all enabled package sources.
- Queries now automatically re-loaded when the file is changed externally.
- When calling file-based methods such as File.Open, the editor now offers autocompletion on the file path.
- There are new options available when calling Dump - to suppress column totals, control repeating headers, and specify a depth at which to initially collapse the results.
- When compiling your queries, LINQPad 6 preferences reference assemblies over runtime assemblies. If the reference assemblies for the current
.NET Core runtime version have not been installed, LINQPad will offer to download the appropriate NuGet package.
- LINQPad 6 supports soft cancellation. By monitoring this.QueryCancelToken, your query can respond to the Cancel button, and
elect to end early without the underlying process being killed.
- You can now connect to SQL CE databases without installing SQL CE. LINQPad will automatically download the NuGet package as required.
let me introduce my new CSV data context dynamic driver.
- create new data context from CSV files, one table per file
- automatic relations detection
More info and download here:
Any comments and suggestions are welcome.
Happy CSV querying
For those who didn't do it yet: download the 4.47 version from the beta section and extract lprun.exe to the same directory as LINQPad.exe. To use lprun.exe from any Command Prompt or PowerShell console on your machine, add the path where you execute LINQPad from to your system path (I use "C:\Program Files\LINQPad\") as follows:
Right click 'Computer', then click 'Properties', 'Advanced system settings'. Click 'Environment Variables'. Look for the variable Path in the System variables section, select it, and click the Edit button. Add the above mention path to the Variable value (overly concerned warning: do not remove the current value, use the arrow-right key to move to the end and dehilight the current selection and use a semi-colon to separate the current value from your LINQPad path) and click OK.
When you open a new command prompt or PowerShell console window, you can now run lprun whenever you want! It opens up new ways of interacting with your system, and even with PowerShell. Joe, thanks a lot!