pyspark.sql.DataFrameWriter.insertInto
DataFrameWriter.insertInto(tableName, overwrite=None)
Inserts the content of the DataFrame into the specified table. It requires that the schema of the DataFrame is the same as the schema of the table.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.
Parameters
tableName : str
    Name of the table to insert into.
overwrite : bool, optional
    If True, overwrites existing data. Disabled by default.
Notes
Unlike DataFrameWriter.saveAsTable(), DataFrameWriter.insertInto() ignores the column names and just uses position-based resolution.

Examples
>>> _ = spark.sql("DROP TABLE IF EXISTS tblA")
>>> df = spark.createDataFrame([
...     (100, "Hyukjin Kwon"), (120, "Hyukjin Kwon"), (140, "Haejoon Lee")],
...     schema=["age", "name"]
... )
>>> df.write.saveAsTable("tblA")
Insert the data into the ‘tblA’ table, but with different column names. Because resolution is positional, the renamed columns still land in the same table columns.
>>> df.selectExpr("age AS col1", "name AS col2").write.insertInto("tblA")
>>> spark.read.table("tblA").sort("age").show()
+---+------------+
|age|        name|
+---+------------+
|100|Hyukjin Kwon|
|100|Hyukjin Kwon|
|120|Hyukjin Kwon|
|120|Hyukjin Kwon|
|140| Haejoon Lee|
|140| Haejoon Lee|
+---+------------+
>>> _ = spark.sql("DROP TABLE tblA")