Bill Blogs in C# -- Linq

Created: 11/1/2013 7:04:09 PM

I gave three talks at DevIntersection in Las Vegas. All the demos are on my GitHub page. 

 

The Modern C# demos are in the Modern C# repository.

The TypeScript demos are in the TypeScriptFlashCards repository.

The Practical LINQ demos are in the Practical LINQ repository.

In all cases, the commits mirror the changes I made during the presentations. Thank you for attending my talks, and I hope it was worth the time investment you made.  If you have any questions or comments, please leave them below and I’ll answer them with repository updates or new blog posts.

Created: 10/31/2013 12:11:02 PM

I finally had enough of a break that I can post the slides from my two talks at the Boston Code Camp.

My first talk was Practical LINQ, N tips and tricks for some number N. You can download the presentation materials here: PracticalLINQpptx.pdf

The demos are on GitHub: PracticalLINQ. Each of the labeled commits mirrors the changes I made during the presentation.

My second presentation was The Task Asynchronous Programming Model. The presentation materials are here: TAPExplained.pdf

Those demos are also available on GitHub: AsyncVoid. Again, the labeled commits match the changes I made during the presentation.

If you have any questions or comments, please make them below, and I’ll update the repositories, or write a new blog post about what’s changed.

Created: 10/22/2013 6:20:15 PM

I'll be at DevIntersection / AngleBrackets next week. If you're going, I hope to see you there.

I'll be giving three different sessions:

Monday at 3:45: Modern C#

Too often, we teach C# by explaining features version-by-version. That practice leads developers to believe they should prefer the C# practices we used at the turn of the millennium. It's time to teach C# 5 without taking students on a journey through the idioms we used in all previous versions. Instead, this session teaches the language from the perspective of C# 5, leveraging the idioms we use today throughout the material. Encourage developers to follow the practices we use now, for the reasons we use them now. Encourage your peers to leverage the best of modern C# instead of reaching for the classic answers.

Tuesday at noon: If TypeScript is the answer, what was the question?

TypeScript promises to bring C# developers to the world of client-side JavaScript. But what about existing JavaScript developers? What about existing JavaScript libraries? In this session, we'll look at what TypeScript adds to JavaScript, and how to integrate TypeScript into the right places in your development, without forcing it into areas where plain vanilla JavaScript is the correct answer.

Tuesday at 2:15: LINQ in Practice

LINQ idioms can be used to write more readable code for many everyday problems. Yet many developers reach for the familiar imperative idioms instead. This session shows developers how to recognize common situations where LINQ would create more readable, maintainable, and extendable code. You'll learn to recognize imperative code smells and replace them with LINQ queries. You'll learn to write those queries in ways that help other developers understand and leverage the LINQ queries you've created.

I'm excited about all three of them. It's a good mix of super new unreleased technology, new technology, and everyday practices that could be improved.

But wait, there's more. When I'm not speaking, I'll be writing code at the Humanitarian Toolbox Hackathon. We've made quite a bit of progress on a few different apps since last Spring when we held our first hackathon at DevIntersection. In fact, our first app is almost ready for trials with relief organizations. If you're at DevIntersection, take some time between sessions and stop by. Write some code for an open source project that will have a positive impact on the lives of people affected by natural disasters. Contribute, and learn something.

Oh, and we may have some surprise guests stop by during the hackathon.

If you haven't registered yet, there's still time. The conferences are co-located, so you can register for either AngleBrackets or DevIntersection. You can attend sessions listed under either conference, and attend the hackathon as well.

I hope to see you there.

Created: 10/14/2013 1:15:33 PM

This coming Saturday, October 19th, I’ll be at Boston Code Camp 20. I’ve got two talks:  Practical LINQ in the morning, and The Task Asynchronous Programming Model in the late afternoon.

This will be the first time I’ve spoken at the Boston Code Camp, and I’m looking forward to meeting developers in the New England community. I’m spending more time on the East Coast now and I’d like to become more involved with that community.

Between my sessions, I’ll be splitting time between web sessions and Windows 8 sessions. There’s a great set of content on both topics, and it will be a good chance for me to pick up some knowledge in both areas.

I hope to see you there.

Created: 8/29/2011 5:10:24 PM

I try to avoid performance topics in my blog, because it’s very hard to make generalizations about performance. So, read this not as general performance guidance, but as a way to understand Lazy Evaluation, LINQ, and Functional Programming.

A developer asked me a question about an algorithm that calculated some statistics for a very large sequence of numbers.  Here’s a (slightly) modified version of the code that calculated the Standard Deviation:

var mean = sequence.Sum() / sequence.Count();
var variance = sequence.Select(n => n * n).Sum() / sequence.Count() -
               (sequence.Sum() * sequence.Sum()) /
               ((double)sequence.Count() * (double)sequence.Count());
var stdDeviation = Math.Sqrt(variance);

In the production code, each value in the sequence is calculated from other values, and requires a file read.

The important point for this code was that the sequence (more than a million numbers) was being enumerated eight times. Every call to Count(), or Sum(), including calculating the sum of squares, required enumerating the entire sequence. For a sample like this, you could pre-calculate and store the count, sum, and sum of squares:

double sum = sequence.Sum();
int count = sequence.Count();
var mean = sum / count;
var variance = sequence.Select(n => n * n).Sum() / count -
               (sum * sum) / ((double)count * (double)count);
var stdDeviation = Math.Sqrt(variance);

More complicated algorithms may require more work.  In many cases, a better solution is to use a different method, Aggregate, to calculate all the needed values in one enumeration.  Here’s the same calculation using one enumeration to calculate the sum, sum of squares, and count for the sequence:

var seed = new { Count = 0, Sum = 0.0, SumOfSquares = 0.0 };
var accValues = sequence.Aggregate(seed, (currentTotal, element) => new
{
    Count = currentTotal.Count + 1,
    Sum = currentTotal.Sum + element,
    SumOfSquares = currentTotal.SumOfSquares + element * element
});

// now, calculate Standard deviation:
var mean = accValues.Sum / accValues.Count;
var variance = accValues.SumOfSquares / accValues.Count -
               (accValues.Sum * accValues.Sum) /
               ((double)accValues.Count * (double)accValues.Count);
var stdDeviation = Math.Sqrt(variance);

I said in my opening that this wasn’t about performance. In fact, in my tests on a sequence of random numbers, all three samples take very similar times, within 100ms. In the full production version, where enumeration took much more time, and there were more calculations (resulting in more than 8 enumerations), the results were quite striking.

Another possible enhancement to this algorithm would be to use a new version of Aggregate in PLINQ to perform some of the calculations in parallel.
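
Here’s a hedged sketch (not from the original post) of how that might look, using the PLINQ Aggregate overload that takes a per-partition seed factory and a combiner; sequence is the same source as above:

var accValues = sequence.AsParallel().Aggregate(
    // seed factory: each partition gets its own accumulator
    () => new { Count = 0, Sum = 0.0, SumOfSquares = 0.0 },
    // fold each element into its partition's accumulator
    (acc, element) => new
    {
        Count = acc.Count + 1,
        Sum = acc.Sum + element,
        SumOfSquares = acc.SumOfSquares + element * element
    },
    // combine the per-partition accumulators
    (left, right) => new
    {
        Count = left.Count + right.Count,
        Sum = left.Sum + right.Sum,
        SumOfSquares = left.SumOfSquares + right.SumOfSquares
    },
    // final projection
    total => total);

The standard deviation math that follows is unchanged; only the accumulation runs in parallel.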

What I’m saying is this: please don’t rewrite LINQ queries that you’re happy with. Instead, remember that lazy evaluation means re-calculating the values each time. File this away, and when you see code that does exhibit the problems I mentioned above, try folding the enumerations into a single operation that calculates multiple results.

Created: 1/26/2011 4:11:06 AM

One of the reasons I love CodeMash is that I get great questions from very smart people. At this last CodeMash, one of the questions was:

Is there a way to merge two sequences into a new sorted sequence, assuming that both source sequences are already sorted?

There is an obvious simple answer using LINQ:

sequence1.Concat(sequence2).OrderBy(n => n)

That obvious answer works, but it is sadly slower than it needs to be. A faster solution would make use of the fact that the input sequences are already sorted. In fact, the query above is much slower because the Concat() call puts the two sequences in an order that forces OrderBy() to do more work. Here’s a merge that takes advantage of the existing sort order:

public static IEnumerable<T> sortedMerge<T>(
    IEnumerable<T> sequence1,
    IEnumerable<T> sequence2,
    IComparer<T> comparer)
{
    var iter1 = sequence1.GetEnumerator();
    var iter2 = sequence2.GetEnumerator();
    bool seq2 = iter2.MoveNext();
    bool seq1 = iter1.MoveNext();

    while (seq1 && seq2)
    {
        if (comparer.Compare(iter1.Current, iter2.Current) <= 0)
        {
            yield return iter1.Current;
            seq1 = iter1.MoveNext();
        }
        else
        {
            yield return iter2.Current;
            seq2 = iter2.MoveNext();
        }
    }

    // might have some items in one of the sequences remaining.
    while (seq1)
    {
        yield return iter1.Current;
        seq1 = iter1.MoveNext();
    }
    while (seq2)
    {
        yield return iter2.Current;
        seq2 = iter2.MoveNext();
    }
}

public static IEnumerable<T> SortedMerge<T>(
    IEnumerable<T> sequence1,
    IEnumerable<T> sequence2)
    where T : IComparable<T>
{
    var comparer = Comparer<T>.Default;
    return sortedMerge(sequence1, sequence2, comparer);
}

It’s just a bit more work: keep returning the next item from whichever sequence has the smaller current value. The two overloads allow you to merge sequences that are sorted either using the type’s IComparable<T> implementation or a custom IComparer<T>.
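
A quick usage sketch (hypothetical data, not from the original post):

var evens = new[] { 2, 4, 6, 8, 10 };
var odds = new[] { 1, 3, 5, 7, 9 };

foreach (var n in SortedMerge(evens, odds))
    Console.Write("{0} ", n);   // prints 1 2 3 4 5 6 7 8 9 10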

I wrote this using a TDD approach, ensuring that my version agreed with the original LINQ query. You can get the code and the tests from Elevate.

Created: 9/30/2010 7:01:31 PM

In my last post, I discussed how you could compose methods together at runtime. I hinted that there was more to come.  The issue had to do with the Trace() method.  Here’s sample output from one run of the application, as coded for my last post:

enter operation 
trace 
Printing current sequence 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
enter operation 
dbl 
enter operation 
done 
2 
4 
6 
8 
10 
12 
14 
16 
18 
20

Notice how the output from Trace happens immediately.  Well, that’s because Trace calls ToList() in order to create a snapshot of the list at that point in time.  But what if you want to defer execution?  Well, you might try the obvious implementation of Trace():

private static IEnumerable<int> Trace(IEnumerable<int> sequence)
{
    foreach (int item in sequence)
    {
        Console.WriteLine(item);
        yield return item;
    }
}

That doesn’t work. Because everything is evaluated lazily, including the trace, you’ll get the trace output interspersed with the final results:

enter operation 
trace 
enter operation 
dbl 
enter operation 
done 
Printing current sequence 
1 
2 
2 
4 
3 
6 
4 
8 
5 
10 
6 
12 
7 
14 
8 
16 
9 
18 
10 
20

That’s not what I want either.  I want the trace operation to occur as a single operation but not until the entire pipeline is composed, and the entire pipeline gets executed.  What I want is this output:

enter operation 
trace 
enter operation 
dbl 
enter operation 
done 
Printing current sequence 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
2 
4 
6 
8 
10 
12 
14 
16 
18 
20

Now, the trace command executes only when the function composition has completed, and yet it happens in one operation. The Trace() method still uses ToList() in order to pull all the elements through the pipeline and get a single trace of the full list.
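
The original post doesn’t show the revised method, but a minimal sketch of that Trace() might look like this, still using ToList() to snapshot the sequence in one pass when the composed pipeline finally runs:

private static IEnumerable<int> Trace(IEnumerable<int> sequence)
{
    Console.WriteLine("Printing current sequence");
    var snapshot = sequence.ToList();   // pull every element through the pipeline now
    foreach (var item in snapshot)
        Console.WriteLine(item);
    return snapshot;
}

Because this version is not an iterator (no yield return), its body runs as soon as the composed function invokes it, producing the single block of trace output shown above.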

To make that work, you have to compose Func objects instead of the sequence.  Then, when you enumerate all the elements in the composed functions, all the composed functions execute in a chain. First, you need a Compose() method that lets you compose two like functions into a single operation that executes the first function, then the second function on all the items in the sequence:

public static class Extensions
{
    public static Func<IEnumerable<int>> Compose(this Func<IEnumerable<int>> srcFunc,
        Func<IEnumerable<int>, IEnumerable<int>> nextFunc)
    {
        // wish extension syntax worked here:
        // () => srcFunc().nextFunc();
        return () => nextFunc(srcFunc());
    }

    // Initial method:
    public static Func<IEnumerable<int>> Compose(this IEnumerable<int> src,
        Func<IEnumerable<int>, IEnumerable<int>> nextFunc)
    {
        // wish extension syntax worked here:
        // () => src.nextFunc();
        return () => nextFunc(src);
    }
}

Next, you need to modify the main program logic to compose functions instead of sequences:
 
Func<IEnumerable<int>> sequence = () => Enumerable.Range(1, 10);
bool done = false;
do
{
    Console.WriteLine("enter operation");
    var input = Console.ReadLine();
    switch (input)
    {
        case "odd":
            sequence = sequence.Compose((s) => s.Where(n => n % 2 == 1));
            break;
        case "even":
            sequence = sequence.Compose((s) => s.Where(n => n % 2 == 0));
            break;
        case "uniq":
            sequence = sequence.Compose((s) => s.Distinct());
            break;
        case "sqr":
            sequence = sequence.Compose((s) => s.Select(n => n * n));
            break;
        case "dbl":
            sequence = sequence.Compose((s) => s.Select(n => n * 2));
            break;
        case "trace":
            sequence = sequence.Compose(s => Trace(s));
            break;
        case "done":
            done = true;
            break;
    }
} while (!done);

foreach (var item in sequence())
    Console.WriteLine(item);

All the changes fall out from the fact that sequence is now a Func<IEnumerable<int>> rather than an IEnumerable<int>. Now, all the compositions in the switch statement use that Compose() method to stitch together the functions. Finally, notice that I need to invoke the composed function stored in the sequence variable (sequence()) in the final foreach statement.
 
The first version is certainly easier to comprehend. And, it is simpler. However, it may lack some features that you need if any of the methods you are composing have side effects. (It may be important exactly when those side effects occur.)

Created: 9/27/2010 7:09:09 PM

Last week my SRT colleague Chris Marinos wrote a blog post discussing the differences between F# and C# quotations. If you don’t read his post carefully, and if you read my latest article on the C# Developer Center, you may think he’s not correct about how C# resolves calls to IQueryable and IEnumerable. He is correct, and his post shows yet another subtle rule in how the C# compiler resolves parameters.

Here’s the code that could confuse readers:

// refactoring the predicate into a helper method
// NOTE: in practice, I'd probably leave this as a lambda
// since it's neither complicated nor multi-purpose.
public bool Condition(int x)
{
    return x > 15;
}

var values =
    0.Through(20)
     .Where(Condition);

As Chris points out in his article, this will work, but it will use the LINQ to Objects implementation for the Where clause. In my article on LINQ providers, I said that you would use the AsEnumerable() method to force the LINQ to Objects implementation. Well, we’re both right. Had Chris written the query this way:

// refactoring the predicate into a helper method
// NOTE: in practice, I'd probably leave this as a lambda
// since it's neither complicated nor multi-purpose.
public bool Condition(int x)
{
    return x > 15;
}

var values =
    0.Through(20)
     .Where(x => Condition(x));

That would use the LINQ to SQL implementation (based on IQueryable), and you’d have an exception thrown because the LINQ to SQL libraries has no idea how to implement a general method in T-SQL.

Did you spot the difference?

In the first case, the parameter to Where() is a Method Group. In the second case, the parameter to Where() is a Lambda Expression whose body is a MethodCallExpression. That makes all the difference. A method group cannot be converted to an expression tree, but it can be converted to a delegate (if the argument lists allow it). IQueryable.Where() takes an Expression<Func<T, bool>> as its parameter. IEnumerable.Where() takes a Func<T, bool>. Both IQueryable.Where() and IEnumerable.Where() are in scope (because IQueryable<T> derives from IEnumerable<T>), but only one matches. Therefore, in Chris’ sample, the IEnumerable version gets called.
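
A minimal sketch (not from either post) of the conversion rules in play; it needs using System.Linq.Expressions, and Condition is the helper method shown above:

Func<int, bool> asDelegate = Condition;                  // OK: a method group converts to a delegate
// Expression<Func<int, bool>> bad = Condition;          // compile error: a method group cannot
                                                         // convert to an expression tree
Expression<Func<int, bool>> asExpression = x => Condition(x);  // OK: a lambda converts to either form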

In my C# DevCenter article, I continued to use lambda expressions as the parameters to my methods. That meant when the IQueryable implementation could not do what I wanted, it would throw an exception. Then, I would add the AsEnumerable() call in order to expressly choose the LINQ to Objects implementation.
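
As a hedged illustration (queryableSource here is a stand-in for any IQueryable<int>, such as a LINQ to SQL projection), AsEnumerable() changes the static type of the query, so everything after it binds to the LINQ to Objects methods and runs locally:

var values = queryableSource
    .Where(n => n > 15)          // still IQueryable: the provider can translate this
    .AsEnumerable()              // hand off to LINQ to Objects from here on
    .Where(x => Condition(x));   // arbitrary C# method, executed in memory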

I prefer using the AsEnumerable() call because it makes your intent clear. I would not rely on passing a method group to a query method in order to force the LINQ to Objects implementation; someone who doesn’t read closely would assume it is wrong, and change it. However, it is important to understand why sometimes your code does exactly what it is supposed to do, even when that’s not what you intended.

Created: 2/4/2010 8:48:00 PM

This is going to be fun.  It’s a bit of LINQ, a bit of academic Computer Science, and a bit of meteorology.

Euler Problem 14 concerns a sequence referred to as hailstorm numbers.

Hailstorm number sequences are generated by applying one of two functions to a number in order to generate the next number. In this case, the sequence is:

n -> n / 2 (n is even)

n -> 3n + 1 (n is odd)

When n is even, the next number in the sequence is smaller. When n is odd, the next number in the sequence is larger.

In all cases, it is believed that for every starting number, the sequence will oscillate for some time, and then eventually converge to some minimum number. (In this case, that minimum number is 1.)

These sequences are called hailstorm numbers because they (very simplistically) act like hail in a storm: oscillating up and down in a thunderhead before eventually falling to earth.

This particular problem asks you to find the sequence that has the longest chain, given input numbers less than 1 million.

Writing some code

The brute force method will take your computer a very long time to compute. The problem asks for the longest sequence, and you’ve got a million of them to compute.

Stack size is a related problem.  You could write a method that computes the next value, and then finds the sequence size recursively:

 

private static long Generate(long n)
{
    if (n == 1) { return 1; }
    var next = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    return Generate(next) + 1;
}

A LINQ query gives you the answer:

var answer = (from StartingValue in Enumerable.Range(1, 999999)
              let SequenceSize = Generate(StartingValue)
              orderby SequenceSize descending
              select new { StartingValue, SequenceSize }).First();

This works, and does give the correct answer.

But I’m not really satisfied, because it takes more than 15 seconds (on my PDC laptop) to finish.

Making it Faster

There are two steps to making this faster. The final version uses a technique called memoization. Memoization enables you to avoid computing the same result more than once. Pure functions have a useful property: their output depends on their inputs, and nothing else. Therefore, once you’ve computed the sequence length for, say, 64, you should never need to compute it again. It’s always going to be the same.

Memoization means storing the result of a computation and returning the stored result rather than doing the work again. This can provide significant savings in a recursive algorithm like this one. For example, memoizing the result for 64 (which is 7) means saving the computations for 64, 32, 16, 8, 4, 2, and 1. Memoizing the results for longer sequences would mean correspondingly larger savings.

You could modify the Generate method to provide storage for previously computed results.  Here’s how you would do that:

 

private static Dictionary<long, long> StoredResults = new Dictionary<long, long>();

private static long Generate(long n)
{
    if (StoredResults.ContainsKey(n))
        return StoredResults[n];
    if (n == 1)
    {
        StoredResults[n] = 1;
        return 1;
    }
    var next = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    var answer = Generate(next) + 1;
    StoredResults[n] = answer;
    return answer;
}

But, what’s the fun in that? There’s no reuse. It’s a one-off solution.

I want to write a generic Memoize function that lets me memoize any function of one argument. Wes Dyer’s post explains this technique in detail. Memoize is a generic method that executes a function, abstracting away both the parameter type and the result type:

public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> function)
{
    var previousResults = new Dictionary<T, TResult>();

    // Important: this returns a lambda, not the result.
    return (T arg) =>
    {
        if (previousResults.ContainsKey(arg))
            return previousResults[arg];
        else
        {
            TResult result = function(arg);
            previousResults.Add(arg, result);
            return result;
        }
    };
}

 

The first question you may have is how previousResults actually works. It’s a local variable, not static storage. How can it possibly live beyond the scope of the method?

Well, that’s the magic of a closure. Memoize doesn’t return a value; it returns a Func that enables you to find the value later. That Func captures the dictionary. I go into this technique in Items 33 and 40 in More Effective C#.

In order to use this, you need to change Generate() from a regular method into a lambda expression (even if it is a multi-line lambda):

   1: Func<long, long> GenerateSequenceSize = null;
   2: GenerateSequenceSize = (long n) =>
   3: {
   4:     if (n == 1) { return 1; }
   5:     var next = (n % 2 == 0) ? n / 2 : 3 * n + 1;
   6:     return GenerateSequenceSize(next) + 1;
   7: };
   8:  
   9: GenerateSequenceSize = GenerateSequenceSize.Memoize();

Now, we have the generate function in a form we can memoize. The only trick here is that you have to set GenerateSequenceSize to null before you assign it in line 2. Otherwise, the compiler complains about using an unassigned variable in line 6.
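
With the memoized lambda in place, the query from earlier simply calls GenerateSequenceSize instead of Generate (a small usage sketch, not shown in the original post):

var answer = (from StartingValue in Enumerable.Range(1, 999999)
              let SequenceSize = GenerateSequenceSize(StartingValue)
              orderby SequenceSize descending
              select new { StartingValue, SequenceSize }).First();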

You can extend the Memoize function to methods with more than one input, but for now, I’ll leave that as an exercise for the reader.

Created: 10/19/2009 5:39:00 PM

You probably noticed that Visual Studio 2010 Beta 2 was released for download today (for MSDN subscribers). The general release will be Wednesday (Oct 21).

I’ve had limited time (obviously) to work with this, but I’m already impressed.  The WPF editor has shown lots of progress. It’s much more responsive than in earlier beta builds. The language features (at least for C#) are coming along well.

That bodes well for the announced release date of March 22, 2010. Yes, they’ve placed a stake in the ground, and this release has an official launch date.

In addition, Microsoft made some announcements about MSDN licensing and pricing.  Microsoft has the full announcement here. There are a couple of interesting items that are very important in this announcement:

1. Every Visual Studio Premium license includes Team Foundation Server with one CAL. That means if your team has VS Premium, you can use TFS right out of the box.

2. Windows Azure “Development and Test Use”. Visual Studio Premium (and above) will include compute hours (and data storage) in Windows Azure for test purposes. (UPDATE: The full terms are here.) VS2010 with Premium MSDN will initially get 750 hours of compute time per month, 10 gigabytes of storage, and more.

That promises to be a very exciting 2010!

Created: 8/26/2009 3:15:30 AM

Well, it’s been a long time since I’ve taken the time to solve and blog about one of the Euler problems.  It was time to pick this up again.

Problem 13 says:

Work out the first 10 digits of the sum of the following one-hundred 50-digit numbers. <long list of numbers elided>

The key here is that you only need the first 10 digits of the answer. Therefore, you only need to add the first 11 digits of each of the 100 numbers. Here’s why: these numbers do not contain any 0’s in the first position, so the sum of all the first digits is at least 100, and that leading column contributes at least 3 digits to the final answer. Once you get to the 11th column, those digits can’t sum to more than 900 (one hundred 9’s), so they won’t affect anything in the first 10 digits. That’s the end of the work.

That makes the final answer one LINQ query:

static void Main(string[] args)
{
    var finalAnswer = (from index in Enumerable.Range(0, 11)
                       from s in listOfDigits
                       select int.Parse(s[index].ToString()) *
                              (long)Math.Pow(10, 11 - index))
                      .Sum();
    Console.WriteLine(finalAnswer);
}

listOfDigits (elided) is a list of strings where each string contains the digits for one number.

The first line of the query creates an enumeration for the first 11 digits.

The next two lines select a single character from each string, parsing that character to create an integer.

Next, do a little math to move that single digit into the correct column for the sum.

Finally, sum all the integers.

Sweet, huh?

Created: 8/19/2009 12:29:16 PM

Chris Marinos started Elevate as a learning time project at SRT Solutions.  It’s still very young, but it’s already very useful to me.

Chris wrote a great introductory post on the library, which is hosted on CodePlex.

Elevate contains methods we use, usually based on LINQ queries, or otherwise built on top of the .NET BCL.  There are methods to build sequences, sub-divide very large sequences into chunks, perform pattern matching on sequences, and some general purpose composable APIs.

It’s still a very young project, and we’re adding functionality on a regular basis. We’re also actively seeking other community members’ input and contribution. Please visit the project, join and submit ideas. Or, just download it, try it, and suggest ideas on the CodePlex pages.

Created: 7/23/2009 3:04:04 AM

A friend asked me about some issues he was having using Enumerable.Cast<T>(). In his mind, it just wasn’t working. Like so many problems, it was working correctly, just not the way he expected. It’s worth examining.

Examine this class:

public class MyType
{
    public String StringMember { get; set; }

    public static implicit operator String(MyType aString)
    {
        return aString.StringMember;
    }

    public static implicit operator MyType(String aString)
    {
        return new MyType { StringMember = aString };
    }
}

Note: I normally recommend against conversion operators (see Item 28 in Effective C#), but they’re the key to this issue.

Consider this code (assume that GetSomeStrings() returns a sequence of strings):

var answer1 = GetSomeStrings().Cast<MyType>();
try
{
    foreach (var v in answer1)
        Console.WriteLine(v);
}
catch (InvalidCastException)
{
    Console.WriteLine("Cast Failed!");
}

You’d expect that GetSomeStrings().Cast<MyType>() would correctly convert each string to a MyType using the implicit conversion operator defined in MyType. It doesn’t; it throws an InvalidCastException.

The above code is equivalent to this construct, using a query expression:

var answer3 = from MyType v in GetSomeStrings()
              select v;
try
{
    foreach (var v in answer3)
        Console.WriteLine(v);
}
catch (InvalidCastException)
{
    Console.WriteLine("Cast failed again");
}

The type declaration on the range variable is converted to a call to Cast<MyType> by the compiler (See Item 36 in More Effective C#). Again, it throws an InvalidCastException.

Here’s one way to restructure the code so that it works:

var answer2 = from v in GetSomeStrings()
              select (MyType)v;
foreach (var v in answer2)
    Console.WriteLine(v);

What’s the difference? The two versions that don’t work use Cast<T>(), and the version that works includes the cast in the lambda used as the argument to Select().

And, that’s where the difference lies.

Cast<T>() Cannot Access User-Defined Conversions

When the compiler creates the IL for Cast<T>(), it can only assume the functionality in System.Object. System.Object does not contain any conversion methods; therefore, Cast<T>() does not generate any IL that might call conversion operators.

Cast<T>() succeeds only when its argument is derived from the target type (or implements the target type, if the target is an interface). Otherwise, Cast<T>() fails.

On the other hand, placing the cast in the lambda for the Select clause enables the compiler to know about the conversion operators in the MyType class. That means it succeeds.

As I’ve pointed out before, I normally view Conversion operators as a code smell. On occasion, they are useful, but often they’ll cause more problems than they are worth. Here, without the conversion operators, no developer would be tempted to write the example code that didn’t work.

Of course, if I’m recommending against conversion operators, I should offer an alternative. MyType already contains a read/write property to store the string value, so you can just remove the conversion operators and write either of these constructs:

var answer4 = GetSomeStrings().Select(n => new MyType { StringMember = n });
var answer5 = from v in GetSomeStrings()
              select new MyType { StringMember = v };

Also, if you needed to, you could create a different constructor for MyType.

Created: 7/13/2009 3:08:52 PM

During my recent vacation, I read the final print version of Essential LINQ, by Charlie Calvert and Dinesh Kulkarni.

Normally, I try to answer the question, “Who should read this book?” That answer eluded me on this book, due to the thorough treatment Charlie and Dinesh give the subject. Essential LINQ is approachable by developers that have minimal experience with LINQ, and yet those developers that have been using LINQ since day one will learn something from this book.

The Essence of LINQ and LINQ Patterns and Practices

Everyone will learn something from two chapters in this book. “The Essence of LINQ” describes the principles behind the LINQ libraries and language enhancements. By understanding the goals of LINQ, you’ll immediately gain insight that will make LINQ more approachable and more productive for you.

In chapter 16, toward the end of the book, Charlie and Dinesh discuss some patterns you’ll run into while developing with LINQ. You’ll learn how to use LINQ to SQL entities as part of a large multi-tier application. You’ll learn how to improve performance in LINQ applications. You’ll learn how to separate concerns in LINQ-based applications. LINQ is too new to be considered complete in terms of ‘best practices’, and thankfully neither Charlie nor Dinesh approach this subject with that kind of arrogance. Instead, they offer their recommendations and invite discussion on the subject.

A catalog of LINQ Functionality

Throughout the rest of the book, Charlie and Dinesh explain LINQ to Objects and LINQ to SQL from a usage standpoint, and from an internal design standpoint. In other chapters, LINQ to XML is discussed. The authors provide examples of transforming data between XML and relational storage models as well. In every section, they tie together those features with the concepts discussed in “The Essence of LINQ”. That continuity helps to reinforce your understanding of LINQ through its design concepts.

After reading this book, you’ll be able to leverage LINQ in all your regular coding tasks. You’ll have a much better understanding of how LINQ works, and when you’ll encounter subtle differences in how different LINQ providers may behave.

LINQ to SQL vs. Entity Framework

It seems you can’t discuss LINQ without at least wading into the controversy of LINQ to SQL vs. Entity Framework. This book wades there as well (it was finished about the time the first EF release was made). More time is spent on LINQ to SQL, as it is more approachable from an internal design perspective. However, the chapters that cover EF build on that knowledge to help you understand how the two database LINQ technologies are more complementary than adversarial. In addition, they touch on when you should consider one over the other in your application.

A look at providers

This section is the least complete, but the most useful for looking into the future of the LINQ universe. It’s too easy to view LINQ as LINQ to Objects, LINQ to SQL, and LINQ to XML, and nothing more. This chapter gives you a brief view of some of the other providers people have created for other types of data stores. Looking at some of those providers (especially the IQToolkit) will give you a greater appreciation for how LINQ can be used with a wider variety of data sources than you ever imagined.

So, is this book for you?

If you are interested in being more productive with LINQ, you should read this book. You’ll probably thumb through it again and again as you search for better ways to solve different problems.

Created: 1/21/2009 9:32:53 PM

I took some time to create the MyPhotos sample for the Live Framework SDK (currently in CTP).

I’m impressed with the consistency of the programming model. You can get quite a bit done using just a few objects: a LiveOperatingEnvironment, a Mesh, a MeshObject, and a DataFeed.

In only a few pages of code, I’ve got a desktop application that allows me to create folders in the mesh, upload pictures, and search for them.

The code looks amazingly familiar to any C# developer (I wrote it in C#; I believe any VB.NET developer would say the same about the VB.NET version).

There are a few items to point out in the API. This method shows how you can create a query to find data feeds in a given mesh object. It returns the Live Mesh Files data feed for a single mesh object:

DataFeed GetFileDataFeed(MeshObject meshObject)
{
    DataFeed theDataFeed = (from dataFeed in meshObject.CreateQuery<DataFeed>()
                            where dataFeed.Resource.Title == LIVE_MESH_FILES
                            select dataFeed).FirstOrDefault<DataFeed>();

    return theDataFeed;
}

It’s as simple as running a query against the mesh object. More importantly, the CreateQuery<T>() method returns a LiveQuery<T> object, and LiveQuery<T> implements IQueryProvider! All the query execution will happen at the server. The Live Framework parses the query and minimizes traffic by pushing logic to the cloud rather than pulling data down from the cloud to query locally.

As you explore the rest of the Live APIs you’ll find that many of the APIs that search for data in the cloud make use of IQueryProvider.

Creating objects is as simple as adding them to the correct collection.  Also, you can add other properties to the resource collection associated with an object. It’s four lines of code to upload and annotate a picture with a rating:

 

DataEntry dataEntry = theDataFeed.DataEntries.Add(filestream, filename);

// Add user data (ratings) to data entry
dataEntry.Resource.SetUserData<Ratings>(theRating);

// Required to see user data on all devices/Live Desktop/Data Model Browser
dataEntry.Update();

// Make sure everyone is synchronized
theDataFeed.SyncEntries.Synchronize();
theDataFeed.Update();

More coming soon… There are cloud applications in my future.

Current Projects

I create content for .NET Core. My work appears in the .NET Core documentation site. I'm primarily responsible for the section that will help you learn C#.

All of these projects are Open Source (using the Creative Commons license for content, and the MIT license for code). If you would like to contribute, visit our GitHub Repository. Or, if you have questions, comments, or ideas for improvement, please create an issue for us.

I'm also the president of Humanitarian Toolbox. We build Open Source software that supports Humanitarian Disaster Relief efforts. We'd appreciate any help you can give to our projects. Look at our GitHub home page to see a list of our current projects. See what interests you, and dive in.

Or, if you have a group of volunteers, talk to us about hosting a codeathon event.