SparkContext.runJob
Executes the given partitionFunc on the specified set of partitions, returning the result as an array of elements.
If 'partitions' is not specified, this will run over all partitions.
Examples
>>> myRDD = sc.parallelize(range(6), 3)
>>> sc.runJob(myRDD, lambda part: [x * x for x in part])
[0, 1, 4, 9, 16, 25]

>>> myRDD = sc.parallelize(range(6), 3)
>>> sc.runJob(myRDD, lambda part: [x * x for x in part], [0, 2], True)
[0, 1, 16, 25]
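
In the second example, [0, 2] is the list of partition indices to run on, and the trailing True is the allowLocal flag. As a quick sketch (not part of the original docs), you can see why those indices select the elements 0, 1, 4, and 5 by inspecting the partition layout with RDD.glom(), which gathers each partition into a list; with range(6) split evenly across 3 partitions the layout is:

>>> myRDD = sc.parallelize(range(6), 3)
>>> myRDD.glom().collect()
[[0, 1], [2, 3], [4, 5]]
>>> sc.runJob(myRDD, lambda part: [x * x for x in part], [0, 2])
[0, 1, 16, 25]

Partitions 0 and 2 hold [0, 1] and [4, 5], so squaring their elements yields [0, 1, 16, 25].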