pyspark.streaming.DStream.cogroup¶
-
DStream.
cogroup
(other: pyspark.streaming.dstream.DStream[Tuple[K, U]], numPartitions: Optional[int] = None) → pyspark.streaming.dstream.DStream[Tuple[K, Tuple[pyspark.resultiterable.ResultIterable[V], pyspark.resultiterable.ResultIterable[U]]]][source]¶ Return a new DStream by applying ‘cogroup’ between RDDs of this DStream and other DStream.
Hash partitioning is used to generate the RDDs with numPartitions partitions.