
Choosing a Subset of Columns from a Pandas DataFrame


Selecting a subset of columns from a data set is a very common task in data science. In Python, the usual approaches are to build the list of columns to keep with a list comprehension, or to remove the unwanted ones with pandas' df.drop() method.

I typically use the list comprehension method. Its drawback is that it is more verbose than its drop() counterpart. It has one distinct advantage, though: it guarantees idempotency. Re-running the selection leaves the result unchanged, whereas calling drop() a second time on columns that are already gone raises a KeyError by default.
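To make the trade-off concrete, here is a minimal sketch comparing the two approaches. The frame, its column names, and the helper name `keep_with_comprehension` are made up for illustration; only the pandas calls themselves are real.

```python
import pandas as pd

# Toy data; the column names are assumptions for this example only.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"], "score": [0.5, 0.9]})
unwanted = ["id"]


def keep_with_comprehension(frame, unwanted):
    # Build the list of columns to keep, then select them.
    return frame[[c for c in frame.columns if c not in unwanted]]


# Applying the selection twice gives the same frame as applying it once.
once = keep_with_comprehension(df, unwanted)
twice = keep_with_comprehension(once, unwanted)

# drop() is shorter, but a second application fails with the default
# errors="raise" because "id" no longer exists.
dropped = df.drop(columns=unwanted)
try:
    dropped.drop(columns=unwanted)
    second_drop_failed = False
except KeyError:
    second_drop_failed = True
```

In a notebook, this is the difference between a cell you can safely re-run and one that breaks the second time.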

Being a huge fanboy of idempotency, coming from a mathematical background, and having a hatred for verbosity and duplication, I ended up using the utility methods shown in this blog post. Clearly, these utilities give the advantage of both brevity and idempotency.
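As a rough idea of what such a utility might look like, here is one possible sketch. The name `without_columns` and its signature are hypothetical, not taken from the post; it simply wraps the list comprehension so the call site stays short.

```python
import pandas as pd


def without_columns(frame: pd.DataFrame, unwanted) -> pd.DataFrame:
    """Return a new frame without the `unwanted` columns (hypothetical helper).

    Columns that are not present are silently ignored, so calling this
    repeatedly with the same arguments is safe.
    """
    unwanted = set(unwanted)
    return frame[[c for c in frame.columns if c not in unwanted]]


# Toy data; the column names are assumptions for this example only.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"], "score": [0.5, 0.9]})

result = without_columns(df, ["id"])
again = without_columns(result, ["id"])  # same result, no KeyError
```

If you prefer to stay with drop(), pandas also accepts `errors="ignore"`, which skips missing labels instead of raising.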

Yay, I win!