|
|
Performance Issues in ADO.NET DataTable Usage.Use Enum to Access Items in DataRow | | Use Enum to Access DataColumns in a DataRow for Performance. Integer indexes are faster than String indexes for accessing an Item in a DataRow.
I am in the midst of refactoring some code written in the early days of VB.NET, when I and other project members were just learning .NET strategies. The particular method is walking through the rows of a DataTable. In the first place, it had many lines of code like the following line.
jlength = dt.Rows(i).Item("job_length")
|
The first step that I took was to place the some code at the top of the For Loop so that I could remove all of the references to "dt.Rows(i).Item(...)". The reason is obvious; for every line where this type of reference is made, the Compiler has to generate code to evaluate dt, rows(i), and Item("name"). I corrected this problem by placing this code.
Dim dr As DataRow
For i = 0 To inbrRows
dr = dt.Rows(i)
|
Now I can change the reference to the Item in the DataRow by using the dr object instead of dt.Rows(i), as follows.
jlength = dr("job_length")
|
That will give you better performance as you access the DataColumns in the DataRow.
Now, assume that I have many columns selected in the DataTable. It is faster to use an Integer index to the column than to use a String index, as shown in the next line.
However, if I have many columns in the table, it will be very easy for me to use the wrong number, e.g., 12 instead of 13, etc., and if there is such a column, and it is the right data type, I may not raise an error, but I sure do have a bug! Even if I am lucky enough to code it properly, the code is not very readable, to put it lightly. The solution is to use an Enum. Even without an add-in (which I might stop and write very soon), I can create the Enum with a couple of simple steps.
First, I will create an empty Enum shell as shown below.
Private Enum MDF
End Enum
|
In this case, I am working in a MissingDataFields Report method, thus the name "MDF". Next, I will go to the Stored Procedure and copy the Select clause of the procedure and paste it into the Enum shell as shown next.
Private Enum MDF
Select author_id, date_dictated, transcribed_file_time,
work_type,excel_job_number,foreign_batch,
job_number, author_name, excel_author_name, department,
date_transcribed,dictation_date_time,transcription_date_time,job_length
End Enum
|
Obviously, the Enum is no good at this point, so I simply remove the Select and and then start highlighting the commas (,) and press return until all of them are gone. The result will be as follows.
Private Enum MDF
author_id
date_dictated
transcribed_file_time
work_type
foreign_batch
job_number
author_name
department
date_transcribed
dictation_date_time
transcription_date_time
job_length
End Enum
|
And guess what, the first name in the Enum will have a value of "0", the second will be "1", etc. The Enum values are automatically ordered the same way the the columns will appear in the DataTable. Now, I can start referencing the desired columns, using their real name, but the code generated will be using an integer value, and Intellisense will kick in as soon as I type "MDF.". An example is shown next.
sLastDept = CStr(dr(MDF.department))
sLastWT = CStr(dr(MDF.work_type))
| While I am on the subject of performance, again, while doing some refactoring of some very long methods, I created a smaller method and passed a DataTable object from the calling method to the new method. The new method is being called in a For..Loop and only processing one DataRow of the DataTable each time it is called. I found that my new method was causing the report to run 8-10 times longer than it had before I refactored the larger method. Immediately, I suspected that passing the DataTable (with several hundred rows in it) was the problem. I changed the call to pass only one DataRow and the performance reverted to normal. Even though the DataTable was passed ByRef, there was still a huge performance hit when passing the whole DataTable.
|
|