pyspark.sql.streaming.DataStreamWriter.format
- DataStreamWriter.format(source)
- Specifies the underlying output data source.
- New in version 2.0.0.
- Changed in version 3.5.0: Supports Spark Connect.
- Parameters
- source : str
- Name of the output data source, for example 'parquet', 'csv', or 'text'.
 
- Notes
- This API is evolving.
- Examples

>>> df = spark.readStream.format("rate").load()
>>> df.writeStream.format("text")
<...streaming.readwriter.DataStreamWriter object ...>

- This method sets the output format of the streaming write. The example below writes CSV files from the rate source in a streaming manner.

>>> import tempfile
>>> import time
>>> with tempfile.TemporaryDirectory(prefix="format1") as d:
...     with tempfile.TemporaryDirectory(prefix="format2") as cp:
...         df = spark.readStream.format("rate").load()
...         q = df.writeStream.format("csv").option("checkpointLocation", cp).start(d)
...         time.sleep(5)
...         q.stop()
...         spark.read.schema("timestamp TIMESTAMP, value STRING").csv(d).show()
+...---------+-----+
|...timestamp|value|
+...---------+-----+
...
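- Because format() returns the writer itself, the format choice can be wrapped in a small helper that rejects unexpected names before a query starts. The sketch below is illustrative only: FILE_SINK_FORMATS and build_file_sink_writer are hypothetical names, not part of PySpark, and the set lists only the common file-based formats, so it is an assumption rather than an exhaustive list of what Spark accepts.

```python
# Sketch only: FILE_SINK_FORMATS and build_file_sink_writer are hypothetical
# names introduced for illustration; they are not part of PySpark.
# The set is an assumed, non-exhaustive list of file-based output formats.
FILE_SINK_FORMATS = {"parquet", "orc", "json", "csv", "text"}


def build_file_sink_writer(df, fmt, checkpoint_dir):
    """Return df.writeStream configured for a file-based output format.

    `df` is assumed to be a streaming DataFrame; any object exposing a
    `writeStream` attribute with chainable format()/option() also works.
    """
    if fmt not in FILE_SINK_FORMATS:
        raise ValueError(f"unsupported file sink format: {fmt!r}")
    # format() and option() each return the writer, so the calls chain.
    return (
        df.writeStream
        .format(fmt)
        .option("checkpointLocation", checkpoint_dir)
    )
```

- With the names from the example above, build_file_sink_writer(df, "csv", cp).start(d) would start the same query as the chained call shown there.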