| dapply {SparkR} | R Documentation | 
Apply a function to each partition of a SparkDataFrame.
Apply a function to each partition of a SparkDataFrame and collect the result back.
dapply(x, func, schema)

dapplyCollect(x, func)

## S4 method for signature 'SparkDataFrame,'function',structType'
dapply(x, func, schema)

## S4 method for signature 'SparkDataFrame,'function''
dapplyCollect(x, func)
| x | A SparkDataFrame | 
| func | A function to be applied to each partition of the SparkDataFrame. func should have only one parameter, to which a data.frame corresponding to each partition will be passed. The output of func should be a data.frame. | 
| schema | The schema of the resulting DataFrame after the function is applied. It must match the output of func. | 
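The contract described for func and schema can be illustrated without a Spark cluster. The following is a minimal base-R sketch, not SparkR's implementation: simulate_dapply and n_partitions are hypothetical names introduced here, and contiguous row chunks stand in for partitions. It shows that func receives one data.frame per partition, returns a data.frame, and that the column types of the combined result are what a schema passed to dapply would have to describe.

```r
# Local, Spark-free simulation of the dapply contract (illustration only).
# `simulate_dapply` and `n_partitions` are hypothetical names, not SparkR API.
simulate_dapply <- function(local_df, func, n_partitions = 2L) {
  # Assign rows to contiguous chunks that stand in for Spark partitions.
  chunk_id <- sort(rep_len(seq_len(n_partitions), nrow(local_df)))
  partitions <- split(local_df, chunk_id)

  # func is called once per partition with a data.frame argument
  # and is expected to return a data.frame.
  results <- lapply(partitions, func)
  stopifnot(all(vapply(results, is.data.frame, logical(1))))

  # Stack the per-partition results into a single data.frame.
  combined <- do.call(rbind, results)
  rownames(combined) <- NULL
  combined
}

local_df <- data.frame(a = 1:3, b = c(1, 2, 3), c = c("1", "2", "3"),
                       stringsAsFactors = FALSE)

# Filter rows and add a column, mirroring the dapply example on this page.
out <- simulate_dapply(local_df, function(x) {
  y <- x[x$a > 1, ]
  cbind(y, d = y$a + 1L)
})

# Column classes of the result: integer, numeric, character, integer.
# A schema for the equivalent dapply call would declare matching types.
print(sapply(out, class))
print(out)
```

In this sketch the added column is computed with 1L so that it stays an integer; using 1 instead would produce a double column, and a schema declaring "integer" would then no longer match the output of func.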
Other SparkDataFrame functions: SparkDataFrame-class,
[[, agg,
arrange, as.data.frame,
attach, cache,
collect, colnames,
coltypes, columns,
count, describe,
dim, distinct,
dropDuplicates, dropna,
drop, dtypes,
except, explain,
filter, first,
group_by, head,
histogram, insertInto,
intersect, isLocal,
join, limit,
merge, mutate,
ncol, persist,
printSchema,
registerTempTable, rename,
repartition, sample,
saveAsTable, selectExpr,
select, showDF,
show, str,
take, unionAll,
unpersist, withColumn,
write.df, write.jdbc,
write.json, write.parquet,
write.text
## Not run: 
##D   df <- createDataFrame(sqlContext, iris)
##D   df1 <- dapply(df, function(x) { x }, schema(df))
##D   collect(df1)
##D 
##D   # filter and add a column
##D   df <- createDataFrame(
##D           sqlContext, 
##D           list(list(1L, 1, "1"), list(2L, 2, "2"), list(3L, 3, "3")),
##D           c("a", "b", "c"))
##D   schema <- structType(structField("a", "integer"), structField("b", "double"),
##D                      structField("c", "string"), structField("d", "integer"))
##D   df1 <- dapply(
##D            df,
##D            function(x) {
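##D              # keep the rows whose first column is greater than 1
##D              y <- x[x[1] > 1, ]
##D              # append the first column plus 1 as a new column and return the result
##D              cbind(y, y[1] + 1L)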
##D            },
##D            schema)
##D   collect(df1)
##D   # the result
##D   #       a b c d
##D   #     1 2 2 2 3
##D   #     2 3 3 3 4
## End(Not run)
## Not run: 
##D   df <- createDataFrame(sqlContext, iris)
##D   ldf <- dapplyCollect(df, function(x) { x })
##D 
##D   # filter and add a column
##D   df <- createDataFrame(
##D           sqlContext, 
##D           list(list(1L, 1, "1"), list(2L, 2, "2"), list(3L, 3, "3")),
##D           c("a", "b", "c"))
##D   ldf <- dapplyCollect(
##D            df,
##D            function(x) {
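##D              # keep the rows whose first column is greater than 1
##D              y <- x[x[1] > 1, ]
##D              # append the first column plus 1 as a new column and return the result
##D              cbind(y, y[1] + 1L)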
##D            })
##D   # the result
##D   #       a b c d
##D   #       2 2 2 3
##D   #       3 3 3 4
## End(Not run)