The comparison table below should be a quick refresher to those DB Specialists attending technical interviews or ones wanting to take a decision of whether or not to create an index for a table in the live server.
Clustered Index |
Non-Clustered Index |
Helps you to store Data and index together. |
Stores the data at one location and indices at another location. Non-clustered Index store key values only and has Pointers to Index rows (for Clustered table) or RID (for Heap/Non-Clustered table). |
Stores data pages in the leaf nodes of the index. |
Never stores data pages in the leaf nodes of the index. |
You can sort the records and store clustered index physically in memory as per the order. | A non-clustered index helps you to creates a logical order for data rows and uses pointers for physical data files. |
Doesn’t require additional disk space |
Require additional disk space to store the index separately |
Faster data accessing. |
Relatively slower. |
The size of the clustered index is quite large. | The size of the non-clustered index is small compared to the clustered index. |
A table or view is allowed only 1 clustered index at a time. | A table can have more than 1 non-clustered indexes. |
Note: In Microsoft terminology, a heap is a table without a clustered index.
Think you got the concepts right? Now crack the quiz below ;)
Quiz: As a SQL Server DB Specialist, will you recommend creating an index (if so, what kind of indexes) for a table in the following use-case.
- You got a Item_Pricing_History table that grew to 5 million records holding data for last couple of years for specific items in a catalog. This table has no index in it thus far, to be explicit.
- Everyday before the prices in the Item_Pricing table are updated, these records are copied to Item_Pricing_History.
- The Analytics team decided that it needs data no more than a quarter and the remaining data in the Item_Pricing_History can be archived for better performance of running analytics queries against this table.
- You have a scheduled job that copies 1000s of rows from Item_Pricing table to Item_Pricing_History.
- You have a analytics team that runs a variety of queries on Item_Pricing_History table from time to time.
- You have analytics jobs that run against Item_Pricing_History table that runs as a trigger for certain events.
- You have the archival process on Item_Pricing_History table that runs as scheduled job archiving rows in batches.
Item_Pricing_History |
|
id |
Bigint |
item_id |
Int |
price |
Money |
ts |
Datetime |