KnowDotNet NetRefactor

A Quick Introduction to Boxing

...and why it matters

by William Ryan

Reference Types:

By far, these are the more common types that you'll run into.  By their very nature, space for Reference types is allocated from the Managed Heap.  And since they come from the Managed Heap, they directly affect garbage collection and memory management.  This is the point I'd like to emphasize: if you instantiate a new instance of a Reference type, that simple task could trigger a garbage collection, depending on how much memory is already in use.  This is a very big reason to keep your code as tight as possible and make sure you don't allocate what you don't need.  In a nutshell, Reference types generally cost more to allocate and clean up than Value types, so they should be used judiciously.

Value Types:

These are the polar opposites of Reference types.  They don't get their own allocation from the Managed Heap: a Value type declared as a local variable lives on the thread's stack, and one declared as a field is stored inline in whatever contains it.  This is good for many reasons.  Since they don't come out of the heap on their own, creating one can't cause a Garbage Collection to occur.  The other primary benefit is that the variable holds the data itself.  No reference to it is stored, so there is no pointer to follow to get at the value.
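The difference between "the variable holds the data" and "the variable holds a reference" is easiest to see when you copy one variable to another.  Here is a minimal sketch in C#; PointValue and PointReference are hypothetical types made up for the illustration.

```csharp
using System;

// Hypothetical types for illustration only.
struct PointValue { public int X; }      // value type
class PointReference { public int X; }   // reference type

static class CopySemanticsDemo
{
    // Assigning a struct copies its data, so changing the copy
    // leaves the original untouched.
    public static int ModifyStructCopy()
    {
        PointValue a = new PointValue();
        a.X = 1;
        PointValue b = a;   // copies the fields themselves
        b.X = 2;            // changes only the copy
        return a.X;
    }

    // Assigning a class instance copies the reference, so both
    // variables point at the same object on the heap.
    public static int ModifyClassAlias()
    {
        PointReference a = new PointReference();
        a.X = 1;
        PointReference b = a;   // copies the reference, not the object
        b.X = 2;                // changes the one shared object
        return a.X;
    }

    static void Main()
    {
        Console.WriteLine(ModifyStructCopy());  // prints 1
        Console.WriteLine(ModifyClassAlias());  // prints 2
    }
}
```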

How do you know the difference?

There's a clean and elegant way to discern this.  Anything declared as a class is a Reference type.  The classes in System.Collections, System.Data, and just about every other namespace are Reference types.  On the other hand, Structures and Enumerations are Value types.  (This is one of the big reasons I've advocated using Enumerations so liberally in programming.  Not only do they help readability, they are also cheap, since an Enumeration is stored as a plain integer.)  So always use Value types, right?  Well, not necessarily.  Boxing and Unboxing are two things you really need to keep in mind.
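If you'd rather ask the runtime than reason it out, the Type.IsValueType property answers the question directly.  The sketch below checks a few built-in types alongside three hypothetical ones (Color, Money, and Invoice are invented for the example).

```csharp
using System;
using System.Collections;

// Hypothetical types for illustration only.
enum Color { Red, Green }
struct Money { public decimal Amount; }
class Invoice { }

static class TypeKindDemo
{
    // True for structures and enumerations, false for classes.
    public static bool IsValue(Type t)
    {
        return t.IsValueType;
    }

    static void Main()
    {
        Console.WriteLine(IsValue(typeof(int)));        // prints True
        Console.WriteLine(IsValue(typeof(Color)));      // prints True
        Console.WriteLine(IsValue(typeof(Money)));      // prints True
        Console.WriteLine(IsValue(typeof(Invoice)));    // prints False
        Console.WriteLine(IsValue(typeof(string)));     // prints False
        Console.WriteLine(IsValue(typeof(ArrayList)));  // prints False
    }
}
```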

Boxing and Unboxing:

Jeffrey Richter's Applied Microsoft .NET Framework Programming has a truly elegant discussion of boxing, and if you haven't done so already, do yourself a favor and purchase a copy of it (but beforehand, prepare yourself to deal with learning about your bad habits that you probably didn't even know existed).

Well, we know that Value types are generally more efficient than Reference types.  So why use a Reference type when a Value type would do?  And what happens if you combine them?  A prime example is any of the System.Collections classes.  If you are like most programmers, you use collections to store things.  These things can be strings, ints, Points, Foos, Bars, Excel spreadsheets, and anything else.  These collections store every item as an Object, which is a Reference type.  So what happens when a Value type is stored in a Reference type?  Boxing!

As Richter's book notes, this is what happens when Boxing takes place:

1)    Memory is allocated from the managed heap.  The amount of memory allocated is the size the value type requires plus an additional overhead to consider this value type to be a true object.  The additional overhead includes a method table pointer and a SyncBlockIndex.
2)    The value type's fields are copied to the newly allocated heap memory.
3)    The address of the object is returned.  This address is now a reference to an object; THE VALUE TYPE IS NOW A REFERENCE TYPE.
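The steps above can be observed from C#, at least indirectly.  The following is a rough sketch, not anything taken from Richter's book: it shows that the box holds a copy of the value (step 2) and that every boxing operation produces a separate heap object (steps 1 and 3).

```csharp
using System;

static class BoxingDemo
{
    // The box holds a copy, so changing the original afterwards
    // does not change what is inside the box.
    public static int BoxThenChangeOriginal()
    {
        int count = 10;
        object boxed = count;  // boxing: heap allocation plus a copy of the value
        count = 20;            // changes the stack variable only
        return (int)boxed;     // reads the copy stored in the box
    }

    // Boxing the same variable twice allocates two separate objects.
    public static bool TwoBoxesAreDistinctObjects()
    {
        int count = 10;
        object first = count;   // first allocation
        object second = count;  // second allocation
        return !object.ReferenceEquals(first, second);
    }

    static void Main()
    {
        Console.WriteLine(BoxThenChangeOriginal());       // prints 10
        Console.WriteLine(TwoBoxesAreDistinctObjects());  // prints True
    }
}
```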

In short, you pay the cost of creating the Value type when you create it.  Then you pay the cost of creating a Reference type, plus a little more for the copy.  DoublePlusBad!

So Unboxing is just the opposite?  NO.  How could you turn a Reference type back into a Value type?  If you could, .NET would do it right off the bat in most cases.

So what is unboxing?  Well, let's say you loaded a list of Foos into an ArrayList or similar collection object.  An ArrayList isn't strongly typed (it hands every item back as an Object), so when you reference, say, Item[0] and the item is a Value type, its value has to go somewhere.  So how is this accomplished?  Unboxing.  That is, using the reference and copying its values to a value type on the Managed Heap.  Just kidding, that would be silly.  Actually, the fields are copied out of the boxed object into a Value type variable on the thread's stack.
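Here is roughly what that round trip looks like with an ArrayList in C#.  This is only a sketch; the method name is made up for the example.

```csharp
using System;
using System.Collections;

static class UnboxingDemo
{
    // Stores three ints in an ArrayList and adds them back up.
    public static int SumBoxedInts()
    {
        ArrayList list = new ArrayList();
        list.Add(1);  // each Add boxes the int into a new heap object
        list.Add(2);
        list.Add(3);

        int total = 0;
        foreach (object item in list)
        {
            // The cast is the unboxing step: the value is copied out
            // of the boxed object into a plain int.
            total += (int)item;
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumBoxedInts());  // prints 6
    }
}
```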

Neither boxing nor unboxing is ideal, but both are often a necessary evil.  However, you have a lot of control over them.  For instance, I could make a class BillsFoo and instantiate it.  Right off the bat, you know this is a Reference type.  Well, let's say that BillsFoo is simply a container object.  Then I could accomplish the same thing with a Structure, which is a Value type.  So should I always use a Value type?  Well, think about it for a second.  Let's say I create a collection of 50,000 BillsFoos and I have to iterate through them all over my app.  You can easily see that a lot of boxing (loading them into the collection) and unboxing (assigning them from the collection) is going to go on.  So, in this case, the boxing/unboxing might make the Structure prohibitive compared to a class, but if I didn't need to load these into a collection, the choice would be pretty straightforward.
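To make the 50,000-item scenario concrete, here is a sketch of both versions side by side.  FooStruct and FooClass are hypothetical stand-ins for BillsFoo, and the code only marks where the boxing and unboxing happen; it doesn't measure anything, so profile your own app before deciding.

```csharp
using System;
using System.Collections;

// Hypothetical stand-ins for BillsFoo, one of each kind.
struct FooStruct { public int Id; }
class FooClass { public int Id; }

static class CollectionCostDemo
{
    public static long SumStructs(int count)
    {
        ArrayList list = new ArrayList(count);
        for (int i = 0; i < count; i++)
        {
            FooStruct foo = new FooStruct();
            foo.Id = i;
            list.Add(foo);  // boxing: heap allocation plus a copy, every time
        }

        long total = 0;
        foreach (object item in list)
        {
            FooStruct foo = (FooStruct)item;  // unboxing: copy back to the stack
            total += foo.Id;
        }
        return total;
    }

    public static long SumClasses(int count)
    {
        ArrayList list = new ArrayList(count);
        for (int i = 0; i < count; i++)
        {
            FooClass foo = new FooClass();  // one heap allocation, no extra copy
            foo.Id = i;
            list.Add(foo);                  // stores the reference as-is
        }

        long total = 0;
        foreach (object item in list)
        {
            FooClass foo = (FooClass)item;  // a cast only; nothing is copied
            total += foo.Id;
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumStructs(50000));  // prints 1249975000
        Console.WriteLine(SumClasses(50000));  // prints 1249975000
    }
}
```

Both methods return the same answer.  The difference is in the work done along the way: the Structure version pays for a box on every Add and a copy on every read.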

Remember also that both VB.NET and C# handle boxing and unboxing natively.  This is a mixed blessing: on the one hand, you don't have to write the code to deal with it; on the other hand, you may not know that your code is terribly inefficient.  From my experience in newsgroups, there are a great many developers out there who have no concept of boxing or its consequences.  Similarly, many ignore the performance ramifications of their code in regard to Disposing of objects and Garbage collection.  In my next article, I'll build upon this understanding to show you why and how this is a really important consideration.
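As an illustration of how quietly this happens, here is a short C# sketch.  Nothing in it mentions boxing, yet two of the three methods box their argument simply because the parameter they feed is typed as Object.  The method names are hypothetical.

```csharp
using System;

static class ImplicitBoxingDemo
{
    // Any value type passed to an Object parameter is boxed at the
    // call site; no cast or keyword is needed.
    public static string DescribeArgument(object value)
    {
        return value.GetType().Name;
    }

    // String.Format takes Object arguments, so count is boxed here.
    public static string FormatCount(int count)
    {
        return String.Format("Count: {0}", count);
    }

    // Calling ToString() on the int directly avoids the box.
    public static string FormatCountWithoutBoxing(int count)
    {
        return "Count: " + count.ToString();
    }

    static void Main()
    {
        Console.WriteLine(DescribeArgument(5));          // prints Int32
        Console.WriteLine(FormatCount(5));               // prints Count: 5
        Console.WriteLine(FormatCountWithoutBoxing(5));  // prints Count: 5
    }
}
```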
