pyspark.pandas.to_numeric
Convert argument to a numeric type.
See also
DataFrame.astype
Cast argument to a specified dtype.
to_datetime
Convert argument to datetime.
to_timedelta
Convert argument to timedelta.
numpy.ndarray.astype
Cast a numpy array to a specified type.
Examples
>>> psser = ps.Series(['1.0', '2', '-3'])
>>> psser
0    1.0
1      2
2     -3
dtype: object

>>> ps.to_numeric(psser)
0    1.0
1    2.0
2   -3.0
dtype: float32
If the given Series contains a value that cannot be cast to float, that value is converted to np.nan.
>>> psser = ps.Series(['apple', '1.0', '2', '-3'])
>>> psser
0    apple
1      1.0
2        2
3       -3
dtype: object

>>> ps.to_numeric(psser)
0    NaN
1    1.0
2    2.0
3   -3.0
dtype: float32
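The coercion in the 'apple' example can be approximated in plain Python to see which positions would end up as NaN. This is only a rough sketch: `coerce_to_float` is a hypothetical helper, not part of pyspark.pandas, and Python's built-in `float()` does not follow exactly the same parsing rules as a Spark cast, so edge cases such as surrounding whitespace or digit separators may behave differently.

```python
import math


def coerce_to_float(values):
    """Hypothetical helper: convert each item to float, using NaN on failure."""
    result = []
    for value in values:
        try:
            result.append(float(value))
        except (TypeError, ValueError):
            # Anything float() rejects becomes NaN instead of raising.
            result.append(math.nan)
    return result


converted = coerce_to_float(['apple', '1.0', '2', '-3'])

# NaN never compares equal to itself, so use math.isnan to find failures.
invalid_positions = [i for i, v in enumerate(converted) if math.isnan(v)]

print(converted)
print(invalid_positions)
```

Because invalid entries are replaced silently, checking for NaN after conversion is one way to find inputs that could not be parsed.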
Lists, tuples, np.array objects, and scalars are also supported.
>>> ps.to_numeric(['1.0', '2', '-3'])
array([ 1.,  2., -3.])

>>> ps.to_numeric(('1.0', '2', '-3'))
array([ 1.,  2., -3.])

>>> ps.to_numeric(np.array(['1.0', '2', '-3']))
array([ 1.,  2., -3.])

>>> ps.to_numeric('1.0')
1.0