KnowDotNet

Dynamic SQL Needs to die

by William Ryan

For two years now, I've espoused the point that there's NO, as in 0, reason to use concatenated dynamic SQL any more.  I'm by no means the only one and if you asked anyone who really understands the subject, there's no doubt there's unanimity on this subject.  Yet just about every day, in one newsgroup/forum or another, there's someone who posts code with Dynamic SQL/Concatenated SQL that's the source of their trouble.  And there's even more people who have code that's next to impossible to debug b/c they are using it.  

Before I continue, let me clearly define what I mean by dynamic SQL.  Many people have taken it to mean SQL Statements that aren't implemented via stored procedures. That's absolutely not what I mean.  So let me show you an example:

"SELECT * FROM SomeTable WHERE Field1 = '" & tbFirstValue.Text &  "' AND SecondField = '" & tbSecondValue.Text & "'"


This can just as easily be implemented with:

"SELECT * FROM SomeTable Where Field1 = @FirstValue AND SecondField = @SecondField"


This pretty much proves that anything you do with concatenated dynamic SQL can be done with Parameterized dynamic SQL.  If possible, Stored Procedures are much better, but sometimes you don't know in advance what fields you need and using dynamic sql is a lot easier to implement.  If you need to use it, use parameters. Why?

1)  Concatenating values is error prone.  Take the first example that I used.  Pretend that I forgot one of the apostrophes.  The code would compile and everything would appear to be fine.  However, once I executed the command, it would blow up and result in a Syntax error.  This is an easy mistake to make no matter how careful you are.  Also, pretend you had 15 values in the Where clause.    It'd be a feat indeed to get it right the first time.

Another example is Date fields.  Different RDBMS implementations require a different Date Format.  It's very easy to forget about this and forget one of the critical formatting chunks.  Again, if you are totally careful you probably won't make many mistakes here, but your new programmers probably will.  Someone editing your code probably will etc.

Another problem is fields that have an apostrophe in them.  This happens so often it's amazing.  And the worst part is that it usually doesn't come up in testing.  You usually find out from a customer who can't finish their work b/c you were too lazy to code this correctly or too proud to admit your method sucks.  Since this statement is in the code, you'll have to fix it, rebuild it, and redeploy it.  That's a lot of inconvenience for everyone involved and totally unfair to burden other people with.  Face it, you won't be there for ever so even if you don't mind fixing it and don't mind putting your customers through the inconvenience, someone else is going to be burdened by this.  As a professional, you should be better than this.  Another common thing that comes up is that you can just write a function that changes the single quotes to double quotes so it can be used.  Well, you'll have to call this on every SQL statement to safely assume it will work.  You'll probably forget to do it at least once, and other programmers may forget it or not know that you have this function when they maintain your code.  

2) It's a security risk.  If you do this, you can kid yourself and pretend that your UI validation can prevent every problem associated with it.  In order to do this, you'll have to spend a lot of time writing this validation code that adds NO value to your app.  At best it prevents damage from happening.  If you forget it in one critical place, you could expose your database to a hacker, and you could thereby be responsible if a hacker figures out this vulnerability and exposes it.  Even if you think you are that good at UI validation, you are playing with fire here, and you can't, as a professional, honestly say that it's ok to expose your company to this level of risk.  A hacker may do something that you can't fix.  They may get data that you can't take back from them.  You expose your company and its customers to a great deal of risk.  All because of you are too proud or too lazy to prevent it.  Again this is inexcusable.  Moreover, you are invariably going to waste time writing validation code that otherwise wouldn't be necessary.  That's  a flagrant waste of your employers money.  If you don't have a problem with that, then at least you can admit that time spent validating input in other ways would be a better use of your time and company resources.  

NOBODY THINKS they are going to get hacked.  Actually, it'd be more correct to say that nobody who's been hacked thought that their code would be the entry point to the hack.  Even if no one figures out you are the one that did it, you still no that your shortsightedness led to this problem.  And even if that doesn't bother you...I ask you this.  Would you feel the same way if it was your financial data that was exposed?  Would you feel the same way if it was your company?  Would you want someone excessively proud or lazy to potentially destroy your company and at best, cause it to spend a lot of money unnecessarily or cause your company a lot of embarassment?

Also, by virtue of using code like this as opposed to Stored Procedures, you will have to grant many more permissions than are necessary.  Do you really think this is ok unless you absolutely have no other choice?

3) It's a preformance killer.  Your query won't be able to take advantage of cached execution plans.  That means that the same query can hit the db over and over again and each time the Server won't be able to reuse plans it's already cached.

4)  It's unmanageable or difficult to manage at best.  I saw a post the other day where someone had a complex dynamic/concatenated sql statement.  The person was told by someone that their code was unreadable and asked to change it to parameters before they continue.  It's also unclear what's going on.  Sure you can look at it for a few minutes, but like I mentioned in item 1, one single quote that's missing can cause the whole thing to blow up.  That's not always easy to spot.  One customer with an Irish last name can cause the same hassle.  Try debugging that.  Sure you can do it, but it's slow and time consuming.  The code is unclear, hard to read and even harder to debug.  If you used parameters, you wouldn't have to step through the whole statement. You wouldn't have to take a guess as to what the field value was.

I'd also like to mention the whole OOP aspect of this.  How is the first statement object oriented?  Compare that with a Parameterized query.  Which is more object oriented?  Also, what if you wanted to clear all of the values or set all of the values to a specified value.  How would you do it with the first statment?  Manually count all of the fields, and reconstruct the statement.  If you needed them to be other values othern than the textboxes for instance, you'd have a real problem if  the textboxes still had values in them.  So you could create variables.  This takes even more work and add no value.  And you'd still have to recreate the statement.  With Parameters, you could just loop through the collection and set the values iteratively.  You could loop through everything with a date and just switch them.  In every way it's easier and you don't need to recreate the statement.  Also, you've loosely coupled the parameters from the syntax. This way you can work on either piece in isolation.  Which is more object oriented?

I also came across a guy on the newsgroups the other day that said (rather snidely) that using parameters would break his data access layer. His layer just used strings as the commandtext and didn't use parameters at all.  The first thing I mentioned was that his data access layer is pretty lame.  Microsoft's DAAB has an easy solution for this, so they've proven it's easy to do.  And even if it wasn't, is "not breaking his data access layer" worth risking the integrity of his database? Is it worth all of the associated headaches due to maintenance issues and different dates or last names?  I really doubt he'd show his boss his post and the reasoning behind it, which is testimony to its frailty

Conclusion:

Everyone who's experienced with ADO.NET that I've come across, and that's a LOT of people, agree that dynamic sql w/out parameters is a terrible way to do things.  It's all risk and no benefit.  Pride is usually the thing that gets in the way, people don't like admitting they did something incorrectly.  I take the opposite view.  As a programmer you are going to do things wrong. You are going to find out that there's better ways to do stuff you thought was good. You are going to write code you thought was great but is actually awful.  It's life, deal with it.  As a professional, if you insist on using concatenated sql statement w/out parameters, you put your company, its customers and your reputation at risk, and there's no benefit.  This is a lot to lose just to keep a BAD habit.  It's often difficult to find and catch every security risk, but this one is right out in the open.  It's a bug waiting to happen.  Don't punish innocent people just b/c of your pride, laziness or ingnorance. If you didn't know what you were doing was wrong, welcome to the club, every programmer out there has been in this boat.  And making a mistake is fine.  Doing something dangerous is fine---you're going to do it.  But not fixing it once you know it's dangerous, that's unforgiveable.  No one is going to fire you for using concatenated sql. If you go to your boss and tell him you need some time to start making some changes, it's not going to be a big deal unless your boss doesn't understand what's at risk.  It's an easy mistake to make.  It's the way things used to work and there are many book examples out there showing this way.  That's usually b/c authors try to keep things simple and only use one parameter concatenations.  And although I've seen a few examples of this early on in a book, I have yet to see it done after the chapter on parameters.

Do yourself and everyone else a favor and do things right the first time, or at least fix what you've done as soon as you can.  Even if you do it little by little, it's still a lot better than not doing it at all.  Better late than never.