How can you increase your query performance in SQL Server

July 29, 2007

Partitioning is the key, You can greatly increase your query performance by making use of the partitioning, partitioning is the horizontal splitting of rows in a table or index defined on a table, smaller logical units that act as one table or index, this would improve your querying performance because you are querying several smaller tables rather than on much larger one. And its advantages increase when the hosting server has multiple processors, which allow the partitions to be queried in parallel in the same query, and when the partitions are spread across multiple file groups.

In SQL Server 2005, administration is much easier because the database engine maintains the partitions in the background, allowing you to focus on other tasks (remember speed to the market mindset).

It’s not practical to partition every table in your database. The typical usage scenario for a partitioned table is one that is very large and is experiencing degrading performance over time. I’ll walk you through the steps of partitioning a large table.

Partitioning example:

The following example demonstrates how to partition a table that stores archived data to increase query performance. Before diving into the code for the example, here’s a brief overview of the moving parts that make up a partitioned table.

Partition Functions

When a table is partitioned, it is broken horizontally into smaller table partitions that are used together to assimilate the whole table. To do this, SQL Server has to know how to break the table into smaller parts. This is where a Partition Function comes into play. A Partition Function is the logic that maps the rows of a table or index into predefined partitions based on the values in a field of data. This will be the first partition object I create in the example.

Partition Schemes

You tell SQL Server how to horizontally split a table with a Partition Function. You also need to indicate how the partitions will be stored in the database. In SQL Server 2005, you map a table partition to a filegroup, which is basically a logic grouping that the database uses to store its data files and transaction log files. Every database has at least one filegroup named the Primary filegroup and additional filegroups for administrative and performance purposes. In a Partition Scheme, you can define that all partitions are mapped to the same filegroup, or you can use the scheme to split up the partitions across filegroups. The advantage is that, when the filegroups are split across individual disks, SQL Server will be better equipped to use resources when retrieving data. This advantage becomes even greater when you are running queries against the partitioned tables on a server with multiple processors.

Creating a partitioned table:

Step one is to define the partition function that you will use to map out the partitions in the table. In the Partition Function defined below, I use three partitions, one for each product type in the SalesArchive table, which I will define later. These partitions will essentially break the SalesArchive archive table into three different tables that are automatically maintained by SQL Server.

CREATE PARTITION FUNCTION [pf_Product](VARCHAR(10)) AS RANGE LEFT

FOR VALUES (N’ProductType1′, N’ProductType2′, N’ProductType3′)

The range in which a function is defined determines which boundary the partition values will belong. RANGE LEFT Specifies that partition values will be less than or equal to the values defined in the Partition Function.

The Partition Function is created, so now I need to create the Partition Scheme. For example, I will map all partitions to the primary filegroup. If I wanted to map the partitions to different filegroups, I would include the filegroup name ordinally in the filegroup list. Here’s the script for the partition scheme:

CREATE PARTITION SCHEME [ps_Product] AS PARTITION [pf_Product]

ALL TO ([PRIMARY])

Now that I have the tables set up for the partitions, I can create the SalesArchive table and load data into it. At the end of the CREATE TABLE statement, the Partition Scheme uses a field name from the table to tell SQL Server how to map the data from the table to partitions.

IF OBJECT_ID(‘SalesArchive’)>0
DROP TABLE [SalesArchive];
GO
CREATE TABLE [dbo].[SalesArchive]
(
[SaleID] [int] IDENTITY(1,1),
[Product] [varchar](10) NULL,
[SaleDate] [datetime] NULL,
[SalePrice] [money] NULL
)ON [ps_Product](Product)
GO

DECLARE @i SMALLINT
SET @i = 1
WHILE (@i <=10000)
BEGIN
INSERT INTO SalesArchive(Product, SaleDate, SalePrice)
VALUES(‘ProductType1’, DATEADD(mm, @i, ‘3/11/1919’), DATEPART(ms, GETDATE()) + (@i + 1))
INSERT INTO SalesArchive(Product, SaleDate, SalePrice)
VALUES(‘ProductType2’, DATEADD(mm, @i, ‘3/11/1927’), DATEPART(ms, GETDATE()) + (@i + 2))
INSERT INTO SalesArchive(Product, SaleDate, SalePrice)
VALUES(‘ProductType3’, DATEADD(mm, @i, ‘3/11/1908’), DATEPART(ms, GETDATE()) + (@i + 3))
SET @i = @i + 1
END

I’m going to make sure the partitions are working correctly by running some queries on the data. The following query returns all rows from the SalesArchive table and uses the $partition function to indicate which partition the returned row is a member of:

SELECT $partition.[pf_Product](Product), *

FROM SalesArchive

This query returns all partitions that are mapped to the SalesArchive table:

SELECT * From sys.partitions

WHERE OBJECT_NAME(object_id) = ‘SalesArchive’

Here are some great online references to learn more, and see some performance comparisons:

http://www.sqljunkies.com/article/f4920050-6c63-4109-93ff-c2b7eb0a5835.scuk

http://www.sqlskills.com/resources/Whitepapers/Partitioning%20in%20SQL%20Server%202005%20Beta%20II.htm

http://www.microsoft.com/technet/technetmag/issues/2007/03/Partitioning/default.aspx?loc=en

Posted by Waleed Abdelwahab Ali
Filed in Databases, Development

7 Comments »

7 Responses to “How can you increase your query performance in SQL Server”

Mohamed Meshref Says:

July 31, 2007 at 4:54 pm
Actually you can also increase the query performance by doing some stuff on the fly:

• SET NOCOUNT ON should be at the top of every stored procedure. SET NOCOUNT prevents DONE_IN_PROC messages (which are almost always discarded, anyway) from being sent to the client for each statement executed in a stored procedure.
• Don’t unnecessarily return large results from a stored procedure for further processing in a client app. Allow SQL Server to do what it does best: traverse, filter, sort, and aggregate data.
• Prefer implicit data-type (domain) constraints to check constraints. While you could certainly create a check constraint over a varchar column to ensure that it only contains numbers, a much simpler approach is to define it as a numeric column to begin with. Ditto for dates. You could define a check constraint over a character column that validated its contents to be sure it only stored valid dates, or you could just define the column as a date column to begin with.
• Prefer default constraints to default objects. In fact, don’t use default (or rule) objects at all, as they’ve been deprecated.
• Prefer check constraints to relational constraints for simple domain membership tests. For example, while you could set up a table containing possible customer genders so that you could then reference that table from a foreign key constraint, it’s far more efficient to simply embed the gender validation in a check constraint (e.g., CHECK Gender = ‘M’ or Gender = ‘F’).
• Prefer non-null columns to nullable columns. Nullability naturally complicates logical operations because it implies three-value logic. Don’t needlessly make every column in every table nullable. There are some column types (primary keys, for example) that cannot even be nullable. Think about whether a column will ever be omitted during an insertion and whether there’s a suitable default value for it.

Thanks to Ken 🙂

Reply
Mohamed Meshref Says:

July 31, 2007 at 4:57 pm
Read more about SQL Server and performance issues on Ken’s blog:

http://blogs.msdn.com/khen1234/default.aspx

Reply
Waleed Abdelwahab Says:

July 31, 2007 at 5:34 pm
Great addition, Many Thanks to our SQL Server Dinosaur, Meshref!

Reply
thiru Says:

September 6, 2007 at 3:31 pm
dear sir!

this query performance discription is very usful to me.i will test based on ur discription

regards
thiruppathi

Reply
sudhansu Says:

May 14, 2008 at 12:19 pm
my sp having one view and 10 functions with clustered index. it take 15 min time to execute 1200 records. you have any idea to increase the performance.

Reply
Asharaf Ali Says:

January 19, 2010 at 8:33 am
Hi
thanks. It helped me

Reply
Anonymous Says:

September 13, 2011 at 4:20 am
Thankyou very much for your need and simple example

Reply