| sample {SparkR} | R Documentation | 
Return a sampled subset of this SparkDataFrame using a random seed. Note: the result is not guaranteed to contain exactly the specified fraction of the total row count of the given SparkDataFrame.
sample(x, withReplacement = FALSE, fraction, seed)

sample_frac(x, withReplacement = FALSE, fraction, seed)

## S4 method for signature 'SparkDataFrame'
sample(x, withReplacement = FALSE, fraction, seed)

## S4 method for signature 'SparkDataFrame'
sample_frac(x, withReplacement = FALSE, fraction, seed)
| x | A SparkDataFrame | 
| withReplacement | Whether to sample with replacement. Default is FALSE. | 
| fraction | The approximate target fraction of rows to sample | 
| seed | Randomness seed value. Default is a random seed. | 
sample since 1.4.0
sample_frac since 1.4.0
Other SparkDataFrame functions: SparkDataFrame-class,
agg, alias,
arrange, as.data.frame,
attach,SparkDataFrame-method,
broadcast, cache,
checkpoint, coalesce,
collect, colnames,
coltypes,
createOrReplaceTempView,
crossJoin, cube,
dapplyCollect, dapply,
describe, dim,
distinct, dropDuplicates,
dropna, drop,
dtypes, exceptAll,
except, explain,
filter, first,
gapplyCollect, gapply,
getNumPartitions, group_by,
head, hint,
histogram, insertInto,
intersectAll, intersect,
isLocal, isStreaming,
join, limit,
localCheckpoint, merge,
mutate, ncol,
nrow, persist,
printSchema, randomSplit,
rbind, rename,
repartitionByRange,
repartition, rollup,
saveAsTable, schema,
selectExpr, select,
showDF, show,
storageLevel, str,
subset, summary,
take, toJSON,
unionByName, union,
unpersist, withColumn,
withWatermark, with,
write.df, write.jdbc,
write.json, write.orc,
write.parquet, write.stream,
write.text
## Not run: 
##D sparkR.session()
##D path <- "path/to/file.json"
##D df <- read.json(path)
##D collect(sample(df, fraction = 0.5))
##D collect(sample(df, FALSE, 0.5))
##D collect(sample(df, TRUE, 0.5, seed = 3))
## End(Not run)
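
The note that the sampled size is only approximate can be illustrated with a small, self-contained session. This is a minimal sketch, not part of the original examples: it assumes a local Spark installation reachable by `sparkR.session()`, and it uses R's built-in `faithful` dataset purely as illustrative data. The variable names (`total`, `n1`, `n2`, `n3`) are hypothetical.

```r
library(SparkR)

# Assumption: a local Spark installation is available to SparkR.
sparkR.session()

# Illustrative data only: R's built-in faithful dataset (272 rows).
df <- createDataFrame(faithful)
total <- nrow(df)

# Request roughly half of the rows without replacement.
# The returned count is typically close to, but not exactly, total / 2.
s1 <- sample(df, withReplacement = FALSE, fraction = 0.5, seed = 42)
n1 <- nrow(s1)

# Passing the same seed again on the same data is expected to
# yield a sample of the same size.
s2 <- sample(df, withReplacement = FALSE, fraction = 0.5, seed = 42)
n2 <- nrow(s2)

# sample_frac takes the same arguments as sample.
s3 <- sample_frac(df, withReplacement = TRUE, fraction = 0.5, seed = 3)
n3 <- nrow(s3)

print(c(total = total, n1 = n1, n2 = n2, n3 = n3))
```

Comparing `n1` with `total * 0.5` shows how far a given draw lands from the requested fraction. If an exact row count is needed, one option is to sample a slightly larger fraction and then apply `limit`.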