Caching things between execution runs (but same LINQPad process, same open tab)?

jamesmanning · March 2012

There are times when I want to fetch something expensive and keep it around in memory between execution runs, ideally so that it still looks and feels like a local variable (var foo = GetStuff() in a 'C# Statements' run). In such a scenario I guess I'm really looking for more of a REPL-like environment (like VS's Immediate Window or PowerShell) instead of the 'run, then throw away the results' behavior that seems to happen now.

It seems like there are at least a few somewhat-painful-but-obvious options, including (de)serializing the object graph between runs, or forcing LINQPad to preserve the app domains and then using a static variable in some class to hold the graph between runs.

Since it seems like this is something others have likely already run across and dealt with themselves, though, I'm guessing (well, hoping!) there's already a better answer out there that I just haven't found yet.

Any recommendations/ideas/pointers?

John · March 2012

I use LINQPad every single day to solve a variety of problems and have never needed a true REPL environment, such as you might get with Python. It would be interesting to know more about the specific problem you are working on or, if it is more general, the process you take to solve a given problem. Perhaps we can help make suggestions then. If a true REPL is what you are after, I have heard that Mono has this already, so you might take a look at that.

jamesmanning · March 2012

I typically just deal with the fact that I might have to re-request data.

Most of the time, if it's too painful to keep regenerating/refetching the data (for instance, when doing 'ad-hoc' analysis over something like a remote web service that takes multiple minutes and returns lots of data), I just switch over to PowerShell and deal with it there, but I really miss having the LINQPad environment, specifically getting to write C# with IntelliSense.

I could likely export it out to Excel and do the analysis there, but 1) I'm slower there than in LINQPad and 2) often the first chunks of analysis aren't fetching all the data, but instead doing things like getting counts grouped by certain criteria, which is much faster to do via SQL than by fetching all the data.

I may just need to bite the bullet and get better at Excel (and maybe learn PowerPivot), but it'd be nice to be able to keep 'expensive' data/objects around in memory instead of having them thrown out.

Since Roslyn already includes a REPL sample, I was hoping that maybe LINQPad could enable it if Roslyn is around. I'm not sure if it's something that could be 'added on' without changing LINQPad (I couldn't think of how, at least).

JoeAlbahari · March 2012

It would be easy enough to save/restore data between runs - in fact you can do that right now (albeit clumsily) as follows:

AppDomain.CurrentDomain.SetData ("foo", "bar");
AppDomain.CurrentDomain.GetData ("foo").Dump();

(Comment out the first line the second time you run it.)
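A small variation avoids having to comment out the first line: check the slot and only populate it when it is empty. This is a minimal sketch, not a LINQPad feature; the SessionStore class and GetOrAdd method are hypothetical names introduced here for illustration.

```csharp
using System;

// Hypothetical helper (not part of LINQPad): keeps a value alive in the
// current AppDomain so it survives between runs while the domain is preserved.
public static class SessionStore
{
    public static T GetOrAdd<T>(string key, Func<T> factory)
    {
        object existing = AppDomain.CurrentDomain.GetData(key);

        // Reuse the stored value only if it still has a compatible type.
        if (existing is T) return (T)existing;

        // Nothing usable is stored yet: build the value once and remember it.
        T created = factory();
        AppDomain.CurrentDomain.SetData(key, created);
        return created;
    }
}
```

Usage would look something like SessionStore.GetOrAdd("foo", () => "bar").Dump(); the factory only runs when the slot is empty or holds an incompatible type.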

The difficulty is in typing the data. For example, if the result of your query was an IQueryable of an anonymous type, you'd want to somehow get the result back typed as such, so that you can run queries over it in subsequent runs.

JoeAlbahari · March 2012

Actually, I've just had an idea.

Go to the "My Extensions" query and define this method:

public static class MyExtensions
{
    public static IEnumerable<T> Cache<T> (this IEnumerable<T> o, string key = "default")
    {
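        string slot = "Cache." + key;
        object existing = AppDomain.CurrentDomain.GetData (slot);

        // Reuse the stored array only if its element type is still compatible with T.
        if (existing is Array && typeof (T).IsAssignableFrom (existing.GetType().GetElementType()))
            return (IEnumerable<T>) existing;

        // First run (or incompatible type): evaluate the sequence once and store the result.
        var result = o.ToArray();
        AppDomain.CurrentDomain.SetData (slot, result);
        return result;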
    }
}

Usage:

var customers = Customers.Cache("cust");
...

This will retrieve the customers from the database only on the first run.

Here are the limitations:

- It works only for IQueryable or lazily-evaluated IEnumerable sequences
- The element type T must be "stable" (in other words, it must not get recompiled between query runs). So it can be a CLR type, a type from a typed datacontext, or a custom type that you define in My Extensions or any assembly that you reference. Anonymous types will NOT work because they're different each time you compile. Tuples are OK.

Also note that LINQPad sometimes recycles app domains for performance and other reasons. You can avoid this by going to Edit | Preferences | Advanced and checking "Always Preserve Application Domains".

You can force LINQPad to clear the cache by pressing Shift+Control+F5.

JoeAlbahari · March 2012

The following enhanced version will let you cache anonymous types, too, whose members can contain other anonymous types and enumerables/lists/arrays of anonymous types:

public static class MyExtensions
{
	public static IEnumerable<T> Cache<T> (this IEnumerable<T> o, string key = "default")
	{
		string slot = "Cache." + key;		
		object existing = AppDomain.CurrentDomain.GetData (slot);
		
		if (existing is Array && typeof (T).IsAssignableFrom (existing.GetType().GetElementType()))
			return (IEnumerable<T>) existing;
			
		if (existing is Array && CanShredAnonymousObject (typeof (T), existing.GetType().GetElementType()))
			return ShredEnumerable<T> ((IEnumerable)existing, existing.GetType().GetElementType()).ToArray();

		var result = o.ToArray();
		AppDomain.CurrentDomain.SetData (slot, result);
		return result;
	}
	
	static IEnumerable<TTarget> ShredEnumerable<TTarget> (IEnumerable source, Type sourceElementType)
	{
		foreach (var element in source)
			yield return (TTarget) ShredAnonymousObject (element, sourceElementType, typeof (TTarget));
	}	
	
	static bool CanShredAnonymousObject (Type sourceType, Type targetType)
	{
		return
			sourceType.Name.StartsWith ("<") && 
			targetType.Name.StartsWith ("<") && 
			sourceType.GetProperties ().Select (p => p.Name).OrderBy (p => p).SequenceEqual (
			targetType.GetProperties ().Select (p => p.Name).OrderBy (p => p));
	}
	
	static object ShredAnonymousObject (object source, Type sourceType, Type targetType)
	{
		object[] args = targetType.GetConstructors().Single()
			.GetParameters()
			.Select (p => ShredValue (sourceType.GetProperty (p.Name).GetValue (source, null), p.ParameterType))
			.ToArray();

		return Activator.CreateInstance (targetType, args);
	}
	
	static object ShredValue (object source, Type targetType)
	{	
		if (source == null) return null;
		Type sourceType = source.GetType();
		
		if (targetType.IsAssignableFrom (sourceType)) return source;
		
		if (targetType.IsArray && source is Array && CanShredAnonymousObject (sourceType.GetElementType(), targetType.GetElementType()))
		{
			var sourceElementType = sourceType.GetElementType();
			var targetElementType = targetType.GetElementType();
			var sourceArray = (Array) source;
			var targetArray = Array.CreateInstance (targetElementType, sourceArray.Length);			
			for (int i = 0; i < sourceArray.Length; i++)
				targetArray.SetValue (ShredAnonymousObject (sourceArray.GetValue (i), sourceElementType, targetElementType), i);
			return targetArray;
		}
		
		if (targetType.IsGenericType &&
			(targetType.GetGenericTypeDefinition() == typeof (IEnumerable<>) || targetType.GetGenericTypeDefinition() == typeof (List<>)) && 
			sourceType.GetInterface ("System.Collections.Generic.IEnumerable`1") != null &&
			CanShredAnonymousObject (
				targetType.GetGenericArguments()[0], 
				sourceType.GetInterface ("System.Collections.Generic.IEnumerable`1").GetGenericArguments()[0]))
		{
			var sourceElementType = sourceType.GetInterface ("System.Collections.Generic.IEnumerable`1").GetGenericArguments()[0];
			var targetElementType = targetType.GetGenericArguments()[0];
			var target = (IList) Activator.CreateInstance (typeof (List<>).MakeGenericType (targetElementType));
			foreach (var sourceElement in (IEnumerable)source)
				target.Add (ShredAnonymousObject (sourceElement, sourceElementType, targetElementType));
			return target;
		}	
		
		throw new NotSupportedException ("Unrecognized type: " + targetType.FullName);
	}
}

JoeAlbahari · April 2012

An enhanced version of Cache() is built into the latest LINQPad beta:
www.linqpad.net/beta.aspx

jamesmanning · April 2012

YOU MAGNIFICENT BASTARD!

(hopefully that comes across correctly as complimentary)

Tormod · August 2012

This is huge! Suddenly I have blazingly fast "Interactive C# to anything". My root object wasn't an enumerable, but this was easily fixed by creating a helper iterator method with only one line: yield return new MyCustomObject(). The object takes about 30-40 seconds to create out of a bunch of local files, so recreating it on every run was an annoyance.
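The one-line iterator described above might look roughly like this. It is a sketch only: MyCustomObject here is an empty stand-in for the real, expensive type, and the Loader class and LoadOnce method are hypothetical names.

```csharp
using System.Collections.Generic;

// Stand-in for the expensive-to-build root object.
public class MyCustomObject { }

public static class Loader
{
    // Iterator methods are lazy: the constructor below runs only when the
    // sequence is actually enumerated, so a cache hit never triggers it.
    public static IEnumerable<MyCustomObject> LoadOnce()
    {
        yield return new MyCustomObject();
    }
}
```

With the Cache extension from earlier in this thread, usage would be along the lines of var root = Loader.LoadOnce().Cache("root").Single();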

Tormod · August 2012

Whoops. Didn't even see the Util.Cache() method. I have one question, though. Could it include an overload to clear the cache in case the creation parameters change (perhaps an optional bool "ClearCache" parameter defaulting to false)? I assume that if I change the key, I will get a new object, but then the other object is kept in memory. The reason the object is cached is that it is big and does not change often between query executions.
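For the hand-rolled Cache<T> extension posted earlier in this thread, one way to drop an entry is to overwrite its slot with null. This is a minimal sketch that assumes the "Cache." + key slot naming from that extension; it says nothing about how the built-in version stores its data. The MyCacheHelpers class and ClearCache method are hypothetical names.

```csharp
using System;

// Hypothetical companion to the Cache<T> extension shown earlier in the thread.
public static class MyCacheHelpers
{
    // Overwrites the slot with null so the next Cache(key) call re-evaluates
    // the sequence and the old array becomes eligible for garbage collection.
    public static void ClearCache(string key = "default")
    {
        AppDomain.CurrentDomain.SetData("Cache." + key, null);
    }
}
```

Calling MyCacheHelpers.ClearCache("cust") before the query would then force a refetch on that run.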
