pyspark.resource.ResourceProfile
Resource profile to associate with an RDD. A pyspark.resource.ResourceProfile allows the user to specify executor and task requirements for an RDD that are applied during a stage. This allows the user to change the resource requirements between stages. A ResourceProfile is meant to be immutable, so the user cannot change it after building.
New in version 3.1.0.
Notes
This API is evolving.
Examples
Create Executor resource requests.
>>> executor_requests = (
...     ExecutorResourceRequests()
...     .cores(2)
...     .memory("6g")
...     .memoryOverhead("1g")
...     .pysparkMemory("2g")
...     .offheapMemory("3g")
...     .resource("gpu", 2, "testGpus", "nvidia.com")
... )
Create task resource requests.
>>> task_requests = TaskResourceRequests().cpus(2).resource("gpu", 2)
Create a resource profile.
>>> builder = ResourceProfileBuilder()
>>> resource_profile = builder.require(executor_requests).require(task_requests).build
Create an RDD with the resource profile.
>>> rdd = sc.parallelize(range(10)).withResources(resource_profile)
>>> rdd.getResourceProfile()
<pyspark.resource.profile.ResourceProfile object ...>
>>> rdd.getResourceProfile().taskResources
{'cpus': <...TaskResourceRequest...>, 'gpu': <...TaskResourceRequest...>}
>>> rdd.getResourceProfile().executorResources
{'gpu': <...ExecutorResourceRequest...>, 'cores': <...ExecutorResourceRequest...>, 'offHeap': <...ExecutorResourceRequest...>, 'memoryOverhead': <...ExecutorResourceRequest...>, 'pyspark.memory': <...ExecutorResourceRequest...>, 'memory': <...ExecutorResourceRequest...>}
Attributes
executorResources
    Dictionary of executor resource names to ExecutorResourceRequest objects.
id
    Unique id of this ResourceProfile (requires an active SparkContext).
taskResources
    Dictionary of task resource names to TaskResourceRequest objects.