| spark.findFrequentSequentialPatterns {SparkR} | R Documentation | 
A parallel PrefixSpan algorithm to mine frequent sequential patterns.
spark.findFrequentSequentialPatterns returns a complete set of frequent sequential patterns.
For more details, see PrefixSpan.
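Each row of the input holds one sequence of itemsets, written in R as a list of lists: the outer list is the sequence and each inner list is one itemset. A minimal sketch (assuming a running Spark session; the sequence below mirrors the first row of the example at the end of this page):

library(SparkR)
sparkR.session()

# One sequence of itemsets: the itemset {1, 2} followed by the itemset {3}.
seq1 <- list(list(1L, 2L), list(3L))

# A SparkDataFrame with a single "sequence" column; each element of the outer
# list passed to createDataFrame is one row.
df <- createDataFrame(list(list(seq1)), schema = c("sequence"))
showDF(df)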
spark.findFrequentSequentialPatterns(data, ...)

## S4 method for signature 'SparkDataFrame'
spark.findFrequentSequentialPatterns(
  data,
  minSupport = 0.1,
  maxPatternLength = 10L,
  maxLocalProjDBSize = 32000000L,
  sequenceCol = "sequence"
)
| data | A SparkDataFrame. | 
| ... | additional argument(s) passed to the method. | 
| minSupport | Minimal support level required for a sequential pattern to be considered frequent. | 
| maxPatternLength | Maximal length of a mined sequential pattern. | 
| maxLocalProjDBSize | Maximum number of items (including delimiters used in the internal storage format) allowed in a projected database before local processing. | 
| sequenceCol | Name of the sequence column in the dataset. | 
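As a hedged illustration of how these arguments combine (the events_df SparkDataFrame and its "events" column are hypothetical, not part of this page):

# Hypothetical sketch: mine patterns from a column named "events" instead of
# the default "sequence", overriding the other defaults for illustration.
patterns <- spark.findFrequentSequentialPatterns(
  events_df,
  minSupport = 0.3,            # pattern must be supported by >= 30% of sequences
  maxPatternLength = 5L,       # ignore patterns longer than 5 itemsets
  maxLocalProjDBSize = 32000000L,
  sequenceCol = "events"       # sequences are stored in the "events" column here
)
showDF(patterns)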
A complete set of frequent sequential patterns in the input sequences of itemsets.
The returned SparkDataFrame contains columns of sequence and corresponding
frequency. Its schema is:
sequence: ArrayType(ArrayType(T)), freq: integer,
where T is the item type.
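A quick way to confirm this shape (a sketch, assuming frequency holds the result of a call like the one in the example below):

# sequence is an array of arrays of the item type; freq is the number of input
# sequences that support each pattern.
printSchema(frequency)

# Peek at a few mined patterns and their frequencies.
head(frequency)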
spark.findFrequentSequentialPatterns(SparkDataFrame) since 3.0.0
## Not run: 
##D df <- createDataFrame(list(list(list(list(1L, 2L), list(3L))),
##D                            list(list(list(1L), list(3L, 2L), list(1L, 2L))),
##D                            list(list(list(1L, 2L), list(5L))),
##D                            list(list(list(6L)))),
##D                       schema = c("sequence"))
##D frequency <- spark.findFrequentSequentialPatterns(df, minSupport = 0.5, maxPatternLength = 5L,
##D                                                   maxLocalProjDBSize = 32000000L)
##D showDF(frequency)
## End(Not run)
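A possible follow-up on the frequency result above (a sketch; size is SparkR's array-length column function):

# Keep only patterns that span at least two itemsets.
longer <- filter(frequency, size(frequency$sequence) >= 2)
showDF(longer)

# Bring the (typically small) pattern set back to the driver as an R data.frame
# with list columns holding the nested sequences.
local_patterns <- collect(frequency)
str(local_patterns)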