# Class SQLContext
Main entry point for Spark SQL functionality.

A SQLContext can be used to create SchemaRDDs, register SchemaRDDs as tables, execute SQL over tables, cache tables, and read Parquet files.
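A typical session runs through that full cycle. The sketch below is illustrative, assuming an existing SparkContext named `sc` and sample data shaped like the doctests in this section:

>>> from pyspark.sql import SQLContext
>>> sqlCtx = SQLContext(sc)
>>> rdd = sc.parallelize([{"field1": 1, "field2": "row1"}])
>>> srdd = sqlCtx.inferSchema(rdd)            # RDD of dicts -> SchemaRDD
>>> sqlCtx.registerRDDAsTable(srdd, "table1")
>>> sqlCtx.sql("SELECT field1 FROM table1").collect()
[{'field1': 1}]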
## Instance Methods

- `__init__(sparkContext)`: Create a new SQLContext.
- `inferSchema(rdd)`: Infer and apply a schema to an RDD of dicts.
- `registerRDDAsTable(rdd, tableName)`: Registers the given RDD as a temporary table in the catalog.
- `parquetFile(path)`: Loads a Parquet file, returning the result as a SchemaRDD.
- `sql(sqlQuery)`: Return a SchemaRDD representing the result of the given query.
- `table(tableName)`: Returns the specified table as a SchemaRDD.
## Method Details

### `__init__(sparkContext)`

Create a new SQLContext.
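The doctests below refer to a `sqlCtx` and an `rdd` created elsewhere in the documentation fixture. A minimal setup consistent with their expected outputs might look like the following (the application name is illustrative):

>>> from pyspark import SparkContext
>>> from pyspark.sql import SQLContext
>>> sc = SparkContext("local", "SQLContext doctests")
>>> sqlCtx = SQLContext(sc)
>>> rdd = sc.parallelize([{"field1": 1, "field2": "row1"},
...                       {"field1": 2, "field2": "row2"},
...                       {"field1": 3, "field2": "row3"}])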
### `inferSchema(rdd)`

Infer and apply a schema to an RDD of dicts.

We peek at the first row of the RDD to determine the field names and types, and then use that schema to extract every row as a dictionary.

>>> srdd = sqlCtx.inferSchema(rdd)
>>> srdd.collect() == [{"field1": 1, "field2": "row1"}, {"field1": 2, "field2": "row2"},
...                    {"field1": 3, "field2": "row3"}]
True
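The result can still be treated as a regular RDD of dicts on the Python side, as the `collect()` output above suggests. A small sketch, assuming the fixture `rdd` above:

>>> srdd = sqlCtx.inferSchema(rdd)
>>> srdd.map(lambda row: row["field1"]).sum()
6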
### `registerRDDAsTable(rdd, tableName)`

Registers the given RDD as a temporary table in the catalog. Temporary tables exist only during the lifetime of this instance of SQLContext.

>>> srdd = sqlCtx.inferSchema(rdd)
>>> sqlCtx.registerRDDAsTable(srdd, "table1")
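Because the catalog is scoped to the instance, a table registered here should not be visible from a different SQLContext. A sketch of that boundary (the failure mode on an unknown table is an assumption, not documented output):

>>> srdd1 = sqlCtx.table("table1")     # resolvable in this context
>>> otherCtx = SQLContext(sc)
>>> otherCtx.table("table1")           # assumed to raise: "table1" lives only in sqlCtx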
### `parquetFile(path)`

Loads a Parquet file, returning the result as a SchemaRDD.

>>> import tempfile, shutil
>>> parquetFile = tempfile.mkdtemp()
>>> shutil.rmtree(parquetFile)
>>> srdd = sqlCtx.inferSchema(rdd)
>>> srdd.saveAsParquetFile(parquetFile)
>>> srdd2 = sqlCtx.parquetFile(parquetFile)
>>> srdd.collect() == srdd2.collect()
True
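Since the result is an ordinary SchemaRDD, loaded Parquet data can be registered and queried like any other table. A sketch continuing the round trip above (the table name "parquetTable" is illustrative):

>>> srdd2 = sqlCtx.parquetFile(parquetFile)
>>> sqlCtx.registerRDDAsTable(srdd2, "parquetTable")
>>> sorted(row["field1"] for row in sqlCtx.sql("SELECT field1 FROM parquetTable").collect())
[1, 2, 3]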
### `sql(sqlQuery)`

Return a SchemaRDD representing the result of the given query.

>>> srdd = sqlCtx.inferSchema(rdd)
>>> sqlCtx.registerRDDAsTable(srdd, "table1")
>>> srdd2 = sqlCtx.sql("SELECT field1 AS f1, field2 AS f2 FROM table1")
>>> srdd2.collect() == [{"f1": 1, "f2": "row1"}, {"f1": 2, "f2": "row2"},
...                     {"f1": 3, "f2": "row3"}]
True
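Anything the SQL dialect supports can be passed through. For instance, a filtered query over the same table (row order here assumes the fixture's input ordering is preserved):

>>> sqlCtx.sql("SELECT field2 FROM table1 WHERE field1 > 1").collect()
[{'field2': 'row2'}, {'field2': 'row3'}]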
### `table(tableName)`

Returns the specified table as a SchemaRDD.

>>> srdd = sqlCtx.inferSchema(rdd)
>>> sqlCtx.registerRDDAsTable(srdd, "table1")
>>> srdd2 = sqlCtx.table("table1")
>>> srdd.collect() == srdd2.collect()
True