Currently we are working in a projects which handles data from multiple databases (SQL Server 2005) with massive data. All together these databases size is around 3 TB and one of them is around 1.2 TB. Now I am coming to actual problem we ran into yesterday, one of our stored procedure was taking longer time compare to the time, it takes while executing the sql query separately contains by the stored procedure. In real scenario it is not suppose to be. After long investigation we figure out it is due to "Parameter Sniffing". Thanks to google:):)
A nice thing which I was not aware of, after working last 5 years with SQL Server. Lets come to the topic and try to understand how and why SQL Server does this??
What is Parameter Sniffing?
When an SP is not in the procedure cache, when it is executed the query optimizer needs to compile the SP to create an execution plan. In order to do this the query optimizer needs to look at the parameters that are passed and the body of the SP to determine the best method to go about processing the SP. When the query optimizer looks at the SP’s parameters, to help determine how to optimize the execution of the SP, it is known as parameter sniffing.
Once the query optimizer has sniffed the parameters and determined the best approach for processing the SP, it caches the execution plan in the procedure cache. All subsequent executions of the SP re-use the execution plan from the procedure cache regardless if different parameters are passed. The potential problem with this approach is the parameters that were used when the plan was cached might not produce an optimal plan for all execution of the SP, especially those that have significantly different set of records returned depending on the parameters passed. For instance, if you passed parameters that required a large number of records to be read, the plan might decide a table or index scan would be the most efficient method to process the SP. Then if the same SP was called with a different set of parameters that would only return a specific record, it would used the cached execution plan and perform an table or index scan operation to resolve it’s query, even if a index seek operation would be more efficient in returning the results for the second execution of the SP.
If you have an SP that sometimes processes quickly, and other times processes slowly with different sets of parameters, then possibly parameter sniffing is causing your procedures to have varying execution times. In this article we will look at different methods to write your SPs to help the query optimizer in picking a good execution plan most of the time.
Lets take one example which will give you clear understanding for the same. I am going to use SQL Server 2008 to illustrate this as SQL 2008 version is installed in my laptop. I am also going to use "AdventureWorks" database. If you don't have download from here .
Step 1: Create a new procedure on SalesOrderHeader table to get customer based on customer id.
USE AdventureWorks
GO
CREATE PROC GetCustomerOrders (@FirstCust int, @LastCust int)
AS
SELECT * FROM Sales.SalesOrderHeader
WHERE CustomerID between @FirstCust and @LastCust
This code accepts two parameters. These two parameters are used in a WHERE clause to determine which SalesOrderHeader records to return. Now, let’s look at the execution plan for this SP when we call it with two different sets of parameters. First, let’s call it with the following code:
USE AdventureWorks
GO
DBCC FREEPROCCACHE
EXEC GetCustomerOrders 1,1000
When I execute the above code, I get the following execution plan:
Now, I will execute (Note: Clearing the caches) the same SP with a smaller range of CustomerID’s and see what kind of execution plan I get. Here is the code I will be executing:
USE AdventureWorks
GO
DBCC FREEPROCCACHE
EXEC GetCustomerOrders 1,10
When I execute the above code, I get the following execution plan:
This time I get a different execution plan. Now, you can see that when using a smaller range of CustomerID’s (1-10), I get an Index Seek operation. Depending on which execution of this SP was the first to be execute it, would compile and cache the execution plan for all subsequent executions of this SP. To verify this let’s run the following code and review the execution plans:
USE AdventureWorks
GO
DBCC FREEPROCCACHE
EXEC GetCustomerOrders 1,1000
GO
EXEC GetCustomerOrders 1,10
Here is the execution plan for the above T-SQL batch:
By looking at this code, you can see that the second execution of the GetCustOrder SP now performs a Clustered Index Scan operation to find all the customer orders from a small range of CustomerID’s. This occurs because the query optimizer only sniffed the parameters for the first execution of the GetCustOrder, which had a large range, and then cached that execution plan. The cached execution plan was then used for the second execution of the GetCustOrder SP.
The first compile of an SP creates the execution plan based on the parameters passed, and then this plan is stored in the procedure cache for the current and all future executions of the same SP. The last code segment above demonstrated how this could cause our second execution of GetCustOrder to use a less efficient execution plan--in this case, a Clustered Index Scan operation instead of an Index Seek operation. Let’s look at ways to overcome this problem.
Now, How to Eliminate the Parameter Sniffing Problem:
Luckily the fix is easy. Take your stored procedure that looks like this (generalizing a query here:
USE AdventureWorks
GO
ALTER PROC GetCustomerOrders (@FirstCust int, @LastCust int)
AS
DECLARE @FC int
DECLARE @LC int
SET @FC = @FirstCust
SET @LC = @LastCust
SELECT * FROM Sales.SalesOrderHeader
WHERE CustomerID BETWEEN @FC AND @LC
Now try to execute the above the query and see the execution plan, but you have to try yourself... Enjoy...
Still, this method of eliminating the parameter sniffing problem doesn’t mean you will get an optimal plan for each execution of the SP. You still only get one execution plan stored in the procedure cache, which will be used for all executions of the SP. Although that one execution plan on average will perform optimal if you call the SP many times with many different parameter values. If you really want to create different execution plans based on the parameters passed then you will need to uses a different approach.
Wednesday, February 9, 2011
Subscribe to:
Posts (Atom)