| repartitionByRange {SparkR} | R Documentation |
repartitionByRange can be called in two ways:
1. Return a new SparkDataFrame range partitioned by the given column(s) into numPartitions partitions.
2. Return a new SparkDataFrame range partitioned by the given column(s), using spark.sql.shuffle.partitions as the number of partitions.
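To make "range partitioned" concrete, the following is a minimal conceptual sketch in plain R (no Spark required). The helper `range_partition` is hypothetical and is not part of SparkR. It assumes the range boundaries are taken from exact quantiles of the key, whereas Spark estimates its boundaries from a sample of the data, so actual partition contents may differ.

```r
# Conceptual sketch only: plain R, no Spark session involved.
# Assumption: boundaries are exact quantiles of the key; Spark instead
# estimates boundaries from a sample, so real results can differ.
range_partition <- function(keys, num_partitions) {
  # Interior cut points that split the keys into roughly equal-sized ranges.
  probs <- seq_len(num_partitions - 1) / num_partitions
  bounds <- unique(quantile(keys, probs = probs, names = FALSE, type = 1))
  # findInterval() counts how many bounds lie below each key; with
  # left.open = TRUE a key equal to a bound stays in the lower range.
  findInterval(keys, bounds, left.open = TRUE) + 1L
}

keys <- c(15, 3, 42, 8, 23, 16, 4, 30)
part <- range_partition(keys, 3L)

# Group the keys by the partition they were assigned to.
split(keys, part)
```

Each partition holds a contiguous range of key values, and the ranges do not overlap. Rows inside a partition are not sorted by this step.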
repartitionByRange(x, ...)

## S4 method for signature 'SparkDataFrame'
repartitionByRange(x, numPartitions = NULL, col = NULL, ...)
| x | a SparkDataFrame. |
| ... | additional column(s) to be used in the range partitioning. |
| numPartitions | the number of partitions to use. |
| col | the column by which the range partitioning will be performed. |
repartitionByRange since 2.4.0
Other SparkDataFrame functions: SparkDataFrame-class,
agg, alias,
arrange, as.data.frame,
attach,SparkDataFrame-method,
broadcast, cache,
checkpoint, coalesce,
collect, colnames,
coltypes,
createOrReplaceTempView,
crossJoin, cube,
dapplyCollect, dapply,
describe, dim,
distinct, dropDuplicates,
dropna, drop,
dtypes, exceptAll,
except, explain,
filter, first,
gapplyCollect, gapply,
getNumPartitions, group_by,
head, hint,
histogram, insertInto,
intersectAll, intersect,
isLocal, isStreaming,
join, limit,
localCheckpoint, merge,
mutate, ncol,
nrow, persist,
printSchema, randomSplit,
rbind, rename,
repartition, rollup,
sample, saveAsTable,
schema, selectExpr,
select, showDF,
show, storageLevel,
str, subset,
summary, take,
toJSON, unionByName,
union, unpersist,
withColumn, withWatermark,
with, write.df,
write.jdbc, write.json,
write.orc, write.parquet,
write.stream, write.text
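## Not run: 
sparkR.session()
path <- "path/to/file.json"
df <- read.json(path)
# Range partition by col1 and col2, using spark.sql.shuffle.partitions partitions
newDF <- repartitionByRange(df, col = df$col1, df$col2)
# Range partition by col1 and col2 into 3 partitions
newDF <- repartitionByRange(df, 3L, col = df$col1, df$col2)
## End(Not run)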
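As a further hedged sketch, the result of a range repartition can be inspected with `getNumPartitions` and `spark_partition_id`. This assumes a working local Spark installation with SparkR attached. The `mtcars` data set and the names `byRange`, `tagged`, `perPart`, and `pid` are illustrative choices, not part of this help page. The reported partition count and per-partition ranges depend on the data and the Spark configuration, so no output is shown.

```r
library(SparkR)
sparkR.session()

# mtcars ships with base R; it stands in here for real data.
df <- createDataFrame(mtcars)

# Explicit partition count, range partitioned on the mpg column.
byRange <- repartitionByRange(df, 4L, col = df$mpg)
getNumPartitions(byRange)

# Tag each row with the id of the partition it landed in,
# then summarise the mpg range and row count per partition.
tagged <- withColumn(byRange, "pid", spark_partition_id())
perPart <- agg(groupBy(tagged, "pid"),
               min_mpg = min(tagged$mpg),
               max_mpg = max(tagged$mpg),
               rows = n(tagged$mpg))
head(arrange(perPart, "pid"))

sparkR.session.stop()
```

If the range partitioning behaves as described, the min_mpg to max_mpg intervals of different partitions should not overlap.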