Buy @ Amazon

Clustered Vs Non-Clustered Index In SQL Server


The comparison table below should be a quick refresher to those DB Specialists attending technical interviews or ones wanting to take a decision of whether or not to create an index for a table in the live server. 

Clustered Index

Non-Clustered Index

Helps you to store Data and index together.

Stores the data at one location and indices at another location. Non-clustered Index store key values only and has Pointers to Index rows (for Clustered table) or RID (for Heap/Non-Clustered table).
Stores data pages in the leaf nodes of the index.

Never stores data pages in the leaf nodes of the index.

You can sort the records and store clustered index physically in memory as per the order. A non-clustered index helps you to creates a logical order for data rows and uses pointers for physical data files.
Doesn’t require additional disk space

Require additional disk space to store the index separately

Faster data accessing.

Relatively slower.

The size of the clustered index is quite large. The size of the non-clustered index is small compared to the clustered index.
A table or view is allowed only 1 clustered index at a time. A table can have more than 1 non-clustered indexes.

Note: In Microsoft terminology, a heap is a table without a clustered index.


Think you got the concepts right? Now crack the quiz below ;)

Quiz: As a SQL Server DB Specialist, will you recommend creating an index (if so, what kind of indexes) for a table in the following use-case. 

  • You got a Item_Pricing_History table that grew to 5 million records holding data for last couple of years for specific items in a catalog. This table has no index in it thus far, to be explicit.
  • Everyday before the prices in the Item_Pricing table are updated, these records are copied to Item_Pricing_History
  • The Analytics team decided that it needs data no more than a quarter and the remaining data in the Item_Pricing_History can be archived for better performance of running analytics queries against this table. 
  • You have a scheduled job that copies 1000s of rows from Item_Pricing table to Item_Pricing_History.
  • You have a analytics team that runs a variety of queries on Item_Pricing_History table from time to time.
  • You have analytics jobs that run against Item_Pricing_History table that runs as a trigger for certain events.
  • You have the archival process on Item_Pricing_History table that runs as scheduled job archiving rows in batches. 
The junior DB developer seeks your advice on whether or not you should create an index for this table especially to speed up the archival process, so that the other processes will run faster with reduced table size. If you were to advice on an index, what index(s) should you be creating? 

Item_Pricing_History

id

Bigint

item_id

Int

price

Money

ts

Datetime


Note: No question can cover the complete context surrounding the problem. You are encouraged to make your assumptions explicit for the solutions you propose and even better if you could explain how your proposed solutions would vary depending on the variable in your assumptions.

Note: Interview questions are open ended and can be leading ones too ;)