DataFrame.drop_duplicates
Return DataFrame with duplicate rows removed, optionally only considering certain columns.
subset : Columns to consider when identifying duplicates. By default, all columns are used.
keep : Determines which duplicates (if any) to keep.
- first : Drop duplicates except for the first occurrence.
- last : Drop duplicates except for the last occurrence.
- False : Drop all duplicates.
inplace : Whether to drop duplicates in place or to return a copy.
DataFrame with duplicates removed or None if inplace=True.
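As a quick check of the return behaviour described above, here is a minimal sketch. It assumes plain pandas, whose `drop_duplicates` follows the same `inplace` convention, so that it runs without a Spark session; the data is invented for illustration.

```python
import pandas as pd

# Illustrative data only (an assumption, not taken from this page).
df = pd.DataFrame({"a": [1, 2, 2], "b": ["x", "y", "y"]})
rows_before = len(df)

# Default (inplace=False): a new DataFrame is returned, df is unchanged.
deduped = df.drop_duplicates()
rows_after_copy = len(df)

# inplace=True: df itself is modified and the call returns None.
result = df.drop_duplicates(inplace=True)

print(rows_before, rows_after_copy, len(deduped), len(df), result)
```

Because the in-place call returns None, assigning its result back to `df` would discard the DataFrame.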
>>> df = ps.DataFrame( ..
>>> df
>>> df.drop_duplicates().sort_index()
>>> df.drop_duplicates('a').sort_index()
>>> df.drop_duplicates(['a', 'b']).sort_index()
>>> df.drop_duplicates(keep='last').sort_index()
>>> df.drop_duplicates(keep=False).sort_index()
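The calls above can be tried end to end with the following sketch. It is a stand-in under stated assumptions: it uses plain pandas rather than `ps` so that it runs without a Spark session, and the data is invented for illustration rather than being the DataFrame from the examples above. The comments show which row labels survive each call.

```python
import pandas as pd

# Made-up sample data (an assumption for illustration).
# Row 1 is a full duplicate of row 0; column 'a' repeats in rows 0-1 and 2-3.
df = pd.DataFrame({"a": [1, 1, 2, 2, 3], "b": ["x", "x", "y", "z", "z"]})

# Compare whole rows (all columns), keeping the first occurrence.
all_cols = df.drop_duplicates()

# Compare only column 'a'.
by_a = df.drop_duplicates("a")

# Compare columns 'a' and 'b', which here is the same as comparing all columns.
by_a_b = df.drop_duplicates(["a", "b"])

# Keep the last occurrence of each duplicated row instead of the first.
keep_last = df.drop_duplicates(keep="last")

# Drop every row that has a duplicate, keeping none of them.
drop_all = df.drop_duplicates(keep=False)

print(list(all_cols.index))   # [0, 2, 3, 4]
print(list(by_a.index))       # [0, 2, 4]
print(list(by_a_b.index))     # [0, 2, 3, 4]
print(list(keep_last.index))  # [1, 2, 3, 4]
print(list(drop_all.index))   # [2, 3, 4]
```

Plain pandas preserves the original row order here. The examples above append `.sort_index()`, presumably because row order after deduplication is not guaranteed in a distributed DataFrame.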