I recently received an email with a set of questions about how to construct code that manipulates strings. I’m paraphrasing, but the question boiled down to whether I recommended always using StringBuilder to construct strings, or whether it should be avoided except when building a string from multiple objects in a loop.
The answer, as it usually is with performance-related questions, is “it depends”. Any time you change code with the intent of improving performance, you need to measure your application both before and after the change. A lot of non-local program characteristics affect performance.
That said, there are some general guidelines that can help you make a good first choice.
The goal you’re trying to achieve is usually to minimize the amount of garbage that gets created. Strings are immutable, so every time you assign a new value to a string variable, you’re changing the reference rather than the string itself, and the string it previously referred to becomes garbage unless something else still refers to it.
var str1 = "foo";
str1 = "bar"; // the string "foo" is now garbage.
The snippet above creates one object that becomes garbage in that same snippet. One item of garbage is not a big issue. However, if this snippet is on a frequently executed code path, it will add up quickly.
However, string assignment in and of itself does not necessarily create garbage. Consider this code:
var str2 = "foo";
var str1 = str2;
str2 = "bar"; // No extra garbage: str1 and str2 now refer to two different objects
I wouldn’t use this as an example of great coding style. The code creates two different strings, each pointed to by one reference. There’s an assignment, but this snippet does not create garbage. What happens to the strings created depends on code outside of the snippet above.
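To make the two snippets above concrete, here is a minimal sketch. The class name ImmutabilityDemo and the helper JoinWithPlus are hypothetical names made up for illustration. It shows that appending to a string produces a new object while the old one stays intact for as long as something still refers to it.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Hypothetical helper: joins parts with +=.
    public static string JoinWithPlus(string[] parts)
    {
        var result = string.Empty;
        foreach (var part in parts)
        {
            // += cannot modify the existing string. Appending non-empty text
            // produces a new string object, and the previous value of result
            // becomes garbage once nothing refers to it.
            result += part;
        }
        return result;
    }

    public static void Main()
    {
        var str2 = "foo";
        var str1 = str2; // both variables refer to the same string object
        Console.WriteLine(ReferenceEquals(str1, str2)); // True

        str2 += "bar"; // a new string is created; str1 still refers to "foo"
        Console.WriteLine(str1); // foo
        Console.WriteLine(str2); // foobar
        Console.WriteLine(ReferenceEquals(str1, str2)); // False

        Console.WriteLine(JoinWithPlus(new[] { "a", "b", "c" })); // abc
    }
}
```

In JoinWithPlus nothing else holds on to the intermediate values, so each one is discarded as soon as the next is assigned. That is the pattern that adds up on a hot path.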
Because strings are immutable, and programmers want to manipulate strings, the .NET Base Class Library includes the StringBuilder class. StringBuilder lets you modify a string-like object and then retrieve the immutable string once you’ve created the final version.
Nerd note: The string.Format() methods build their result with a StringBuilder (or an equivalent internal builder, depending on the .NET version). If you are arguing over the relative performance of string.Format() and StringBuilder, please stop.
In general, for simple formatting operations that can be done in one statement, my preference is for string.Format. It’s simpler to read, and it returns the final immutable string after creating it using a StringBuilder.
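As a rough illustration of the "one statement" case, here is a sketch using string.Format. The Describe helper, its parameters, and the message text are all hypothetical. The invariant culture is passed only so the output does not vary from machine to machine.

```csharp
using System;
using System.Globalization;

public static class FormatDemo
{
    // Hypothetical helper: formats a one-line summary in a single statement.
    public static string Describe(string customer, int itemCount, decimal total)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "{0} ordered {1} item(s) for a total of {2:F2}",
            customer, itemCount, total);
    }

    public static void Main()
    {
        Console.WriteLine(Describe("Ada", 3, 19.5m));
        // prints: Ada ordered 3 item(s) for a total of 19.50
    }
}
```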
For more complicated code that builds a string, I’ll use a StringBuilder.
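Here is a small sketch of what I mean by "more complicated": the string is built across a loop with a conditional separator and a fallback for empty input. DescribeItems is a hypothetical helper, not anything from the BCL.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BuilderDemo
{
    // Hypothetical helper: builds "Items: a, b, c", or "Items: (none)" when empty.
    public static string DescribeItems(IEnumerable<string> items)
    {
        var builder = new StringBuilder("Items: ");
        var first = true;
        foreach (var item in items)
        {
            if (!first)
            {
                builder.Append(", "); // separator only between items
            }
            builder.Append(item);
            first = false;
        }
        if (first)
        {
            builder.Append("(none)"); // the loop never ran
        }
        // Only this call produces the final immutable string.
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(DescribeItems(new[] { "apple", "pear", "fig" }));
        // prints: Items: apple, pear, fig
        Console.WriteLine(DescribeItems(new string[0]));
        // prints: Items: (none)
    }
}
```

For this particular shape string.Join would also work. The loop stands in for logic that is too involved for a single call.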
So should you ever use string.Concat?
It’s very rare that I say a certain method should never be used. The BCL team members (and every other developer I know) have enough work to do without writing methods that should never be called. I will use string.Concat when none of the string objects being concatenated are being discarded. Consider this:
var FullName = person.FirstName + person.LastName;
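Neither the person object nor its FirstName and LastName strings become garbage here, because they are still referenced after the statement runs. The + operator on strings compiles to a call to string.Concat, which is implemented to build the result without creating temporary objects.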
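The sketch below puts the operator form and the explicit string.Concat call side by side. The Person class here is a hypothetical stand-in for whatever type person has in the line above, and the space between the names is my addition for readable output.

```csharp
using System;

// Hypothetical type standing in for the person object above.
public sealed class Person
{
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}

public static class ConcatDemo
{
    // The + operator on strings is compiled to a string.Concat call.
    public static string WithOperator(Person p)
    {
        return p.FirstName + " " + p.LastName;
    }

    // The same operation written explicitly.
    public static string WithConcat(Person p)
    {
        return string.Concat(p.FirstName, " ", p.LastName);
    }

    public static void Main()
    {
        var person = new Person("Ada", "Lovelace");
        Console.WriteLine(WithOperator(person)); // Ada Lovelace
        Console.WriteLine(WithConcat(person));   // Ada Lovelace

        // The inputs are still alive and unchanged after concatenation.
        Console.WriteLine(person.FirstName); // Ada
    }
}
```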
Remember: If you’re not measuring, it’s not engineering
These are, at best, general guidelines. If you are making any changes where the goal is to increase performance, measure before and after the change. Make sure that any changes you make actually do positively impact performance.
Most importantly, make sure you’re optimizing code that is executed often enough that your changes have a real impact on your users.
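If you want a starting point for measuring, here is a deliberately crude sketch that times two ways of building the same string and counts generation 0 garbage collections along the way. All names in it (MeasureDemo, WithPlus, WithBuilder, Measure) and the loop sizes are made up for illustration. A Stopwatch loop like this is sensitive to JIT warm-up and machine noise, so treat the numbers as a rough signal and prefer a dedicated benchmarking harness and a profiler for real decisions.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class MeasureDemo
{
    // Builds "012...(count-1)" with +=, discarding an intermediate string per step.
    public static string WithPlus(int count)
    {
        var result = string.Empty;
        for (var i = 0; i < count; i++)
        {
            result += i.ToString();
        }
        return result;
    }

    // Builds the same text with a StringBuilder.
    public static string WithBuilder(int count)
    {
        var builder = new StringBuilder();
        for (var i = 0; i < count; i++)
        {
            builder.Append(i);
        }
        return builder.ToString();
    }

    // Crude measurement: elapsed time plus the number of gen-0 collections.
    public static void Measure(string label, Func<int, string> build, int count, int iterations)
    {
        GC.KeepAlive(build(count)); // warm-up run so JIT cost is not timed

        var gen0Before = GC.CollectionCount(0);
        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            GC.KeepAlive(build(count));
        }
        stopwatch.Stop();
        var gen0 = GC.CollectionCount(0) - gen0Before;

        Console.WriteLine("{0}: {1} ms, {2} gen-0 collections",
            label, stopwatch.ElapsedMilliseconds, gen0);
    }

    public static void Main()
    {
        // Sizes are arbitrary; use values that resemble your real workload.
        Measure("+=", WithPlus, 1000, 200);
        Measure("StringBuilder", WithBuilder, 1000, 200);
    }
}
```

Run it in a release build, and run it on the workload you actually care about. The relative results can change with string sizes and iteration counts.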