Mismatches between data types on columns (or variables) can have disastrous results. It’s bitten me twice this past week. Take the following query:
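The table and column names below are simplified stand-ins, but the shape is the same: a nested subquery correlating the two UserName columns.

SELECT U.UserName,
       (SELECT MAX(SH.SearchDate)          -- SearchDate is an illustrative column name
          FROM dbo.SearchHistory AS SH
         WHERE SH.UserName = U.UserName    -- VARCHAR(256) compared to NVARCHAR(256)
       ) AS LastSearch
FROM dbo.Users AS U;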
Which results in the following plan being generated:
Four billion rows to an index spool? Really?
The issue here was that the U.UserName column was defined as NVARCHAR(256), whereas the SH.UserName column was defined as VARCHAR(256). As a result, every row in the SearchHistory table had to be implicitly converted from VARCHAR to NVARCHAR to perform the join (5,620,388 rows, and since it's a nested subquery, multiply that by the number of rows the parent query returns). This was pegging the CPU on a production server nightly.
The way SQL Server determines which column will be converted is strictly based upon data type precedence. A VARCHAR will always be converted upwards to an NVARCHAR, never the other way around. If you look at the data type precedence list, conversion happens from the bottom upwards.
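You can see precedence in action with a quick check: mix the two types in a single expression and look at what type comes out the other end.

-- The result type of a CASE expression follows data type precedence,
-- so mixing VARCHAR and NVARCHAR yields NVARCHAR.
SELECT SQL_VARIANT_PROPERTY(
           CASE WHEN 1 = 1
                THEN CAST('abc' AS VARCHAR(10))
                ELSE CAST(N'abc' AS NVARCHAR(10))
           END,
           'BaseType') AS ResultType;   -- returns nvarchar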
If I explicitly convert U.UserName from NVARCHAR to VARCHAR, the join no longer has to convert the entire 5-million-plus row SearchHistory table to NVARCHAR. This fixes the symptom, but the best solution is to ensure that your types match in the first place. This applies not only to joins, but to variables as well, as shown below.
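Using the same illustrative names as above, the band-aid is an explicit conversion on the single outer value, and the real fix is making the column types agree:

-- Symptom fix: convert one outer value per row instead of 5.6 million inner rows.
SELECT U.UserName,
       (SELECT MAX(SH.SearchDate)
          FROM dbo.SearchHistory AS SH
         WHERE SH.UserName = CONVERT(VARCHAR(256), U.UserName)
       ) AS LastSearch
FROM dbo.Users AS U;

-- Real fix: make the two columns the same type (either direction works, as long
-- as they match). Re-specify nullability and collation to match the existing definition.
ALTER TABLE dbo.SearchHistory ALTER COLUMN UserName NVARCHAR(256) NOT NULL;

-- Variables bite the same way: comparing a VARCHAR column to an NVARCHAR variable
-- forces an implicit conversion on the column side, which can turn a seek into a scan.
DECLARE @UserName NVARCHAR(256) = N'some.user';
SELECT COUNT(*) FROM dbo.SearchHistory AS SH WHERE SH.UserName = @UserName;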
This blog post from Jonathan Kehayias will help you find any queries that may be doing implicit conversions.
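If you just want a quick starting point, the general idea (this is a rough sketch, not the query from his post, and it can be expensive on a busy server) is to mine the plan cache for CONVERT_IMPLICIT:

WITH XMLNAMESPACES(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (50)
       st.text      AS query_text,
       qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE qp.query_plan.exist('//ScalarOperator[contains(@ScalarString, "CONVERT_IMPLICIT")]') = 1;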