Series.compare
Compare to another Series and show the differences.
other: Series to compare with.
keep_shape: If True, all rows and columns are kept. Otherwise, only the ones with different values are kept.
keep_equal: If True, the result keeps values that are equal. Otherwise, equal values are shown as missing values (NaN, or None for string data).
Notes
Matching NaNs will not appear as a difference.
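The rule above, together with the keep_shape and keep_equal behavior, can be illustrated with a small pure-Python sketch. This is not how pyspark.pandas implements compare; the helper name `compare_sketch` is hypothetical, it works on plain lists, it reports integer positions rather than index labels, and it uses None as the placeholder for blanked-out equal values.

```python
import math


def _is_nan(value):
    # True only for float NaN; strings, None and ints are never NaN here.
    return isinstance(value, float) and math.isnan(value)


def compare_sketch(self_values, other_values, keep_shape=False, keep_equal=False):
    """Hypothetical helper: row-wise comparison of two equal-length lists.

    Returns a list of (position, self_value, other_value) tuples.
    """
    if len(self_values) != len(other_values):
        raise ValueError("both inputs must have the same length")
    rows = []
    for pos, (left, right) in enumerate(zip(self_values, other_values)):
        # Two NaNs in the same position count as a match, not a difference.
        same = left == right or (_is_nan(left) and _is_nan(right))
        if same and not keep_shape:
            continue  # drop rows that show no difference
        if same and not keep_equal:
            rows.append((pos, None, None))  # keep the row, blank the equal values
        else:
            rows.append((pos, left, right))
    return rows


nan = float("nan")
result = compare_sketch([1.0, nan, 3.0], [1.0, nan, 4.0])
print(result)  # [(2, 3.0, 4.0)]
```

Only position 2 is reported: the NaN pair at position 1 is treated as a match even though `nan == nan` is False in Python, which is why the sketch needs the explicit `_is_nan` check.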
Examples
>>> import pyspark.pandas as ps
>>> from pyspark.pandas.config import set_option, reset_option
>>> set_option("compute.ops_on_diff_frames", True)
>>> s1 = ps.Series(["a", "b", "c", "d", "e"])
>>> s2 = ps.Series(["a", "a", "c", "b", "e"])
Align the differences on columns
>>> s1.compare(s2).sort_index()
  self other
1    b     a
3    d     b
Keep all original rows
>>> s1.compare(s2, keep_shape=True).sort_index()
   self other
0  None  None
1     b     a
2  None  None
3     d     b
4  None  None
Keep all original rows and also all original values
>>> s1.compare(s2, keep_shape=True, keep_equal=True).sort_index()
  self other
0    a     a
1    b     a
2    c     c
3    d     b
4    e     e
>>> reset_option("compute.ops_on_diff_frames")