pyspark.sql.streaming.StreamingQuery

class pyspark.sql.streaming.StreamingQuery(jsq: py4j.java_gateway.JavaObject)[source]

A handle to a query that is executing continuously in the background as new data arrives. All these methods are thread-safe.

New in version 2.0.0.

Changed in version 3.5.0: Supports Spark Connect.

Notes

This API is evolving.

Methods

awaitTermination([timeout])

Waits for the termination of this query, either by query.stop() or by an exception.

exception()

New in version 2.1.0.

explain([extended])

Prints the (logical and physical) plans to the console for debugging purpose.

processAllAvailable()

Blocks until all available data in the source has been processed and committed to the sink.

stop()

Stop this streaming query.

Attributes

id

Returns the unique id of this query that persists across restarts from checkpoint data.

isActive

Whether this streaming query is currently active or not.

lastProgress

Returns the most recent StreamingQueryProgress update of this streaming query or None if there were no progress updates

name

Returns the user-specified name of the query, or null if not specified.

recentProgress

Returns an array of the most recent [[StreamingQueryProgress]] updates for this query.

runId

Returns the unique id of this query that does not persist across restarts.

status

Returns the current status of the query.