public class HdfsSpout extends BaseRichSpout
| Constructor and Description | 
|---|
| HdfsSpout() | 
| Modifier and Type | Method and Description | 
|---|---|
| void | ack(Object msgId) - Storm has determined that the tuple emitted by this spout with the msgId identifier has been fully processed. | 
| void | close() - Called when an ISpout is going to be shut down. | 
| void | declareOutputFields(OutputFieldsDeclarer declarer) - Declare the output schema for all the streams of this topology. | 
| protected void | emitData(List<Object> tuple, org.apache.storm.hdfs.spout.HdfsSpout.MessageId id) | 
| void | fail(Object msgId) - The tuple emitted by this spout with the msgId identifier has failed to be fully processed. | 
| SpoutOutputCollector | getCollector() | 
| org.apache.hadoop.fs.Path | getLockDirPath() | 
| void | nextTuple() - When this method is called, Storm is requesting that the Spout emit tuples to the output collector. | 
| void | open(Map<String,Object> conf, TopologyContext context, SpoutOutputCollector collector) - Called when a task for this component is initialized within a worker on the cluster. | 
| HdfsSpout | setArchiveDir(String archiveDir) | 
| HdfsSpout | setBadFilesDir(String badFilesDir) | 
| HdfsSpout | setClocksInSync(boolean clocksInSync) | 
| HdfsSpout | setCommitFrequencyCount(int commitFrequencyCount) | 
| HdfsSpout | setCommitFrequencySec(int commitFrequencySec) | 
| HdfsSpout | setHdfsUri(String hdfsUri) | 
| HdfsSpout | setIgnoreSuffix(String ignoreSuffix) | 
| HdfsSpout | setLockDir(String lockDir) | 
| HdfsSpout | setLockTimeoutSec(int lockTimeoutSec) | 
| HdfsSpout | setMaxOutstanding(int maxOutstanding) | 
| HdfsSpout | setReaderType(String readerType) | 
| HdfsSpout | setSourceDir(String sourceDir) | 
| HdfsSpout | withConfigKey(String configKey) - Sets the key name under which HDFS options are placed. | 
| HdfsSpout | withOutputFields(String... fields) - Output field names. | 
| HdfsSpout | withOutputStream(String streamName) - Set output stream name. | 
Methods inherited from BaseRichSpout and its supertypes: activate, deactivate, getComponentConfiguration
Methods inherited from java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
public HdfsSpout setCommitFrequencyCount(int commitFrequencyCount)
public HdfsSpout setCommitFrequencySec(int commitFrequencySec)
public HdfsSpout setMaxOutstanding(int maxOutstanding)
public HdfsSpout setLockTimeoutSec(int lockTimeoutSec)
public HdfsSpout setClocksInSync(boolean clocksInSync)
public HdfsSpout withOutputFields(String... fields)
Output field names. The number of fields depends on the reader type.
public HdfsSpout withConfigKey(String configKey)
Sets the key name under which HDFS options are placed (similar to the HDFS bolt). The default key name is ‘hdfs.config’.
public org.apache.hadoop.fs.Path getLockDirPath()
public SpoutOutputCollector getCollector()
public void nextTuple()
Description copied from interface: ISpout
When this method is called, Storm is requesting that the Spout emit tuples to the output collector. This method should be non-blocking, so if the Spout has no tuples to emit, this method should return. nextTuple, ack, and fail are all called in a tight loop in a single thread in the spout task. When there are no tuples to emit, it is courteous to have nextTuple sleep for a short amount of time (like a single millisecond) so as not to waste too much CPU.
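The non-blocking contract described above can be sketched without any Storm classes. The following is a minimal, simplified model, not HdfsSpout's actual implementation: `IdleAwareSource`, its in-memory queue, and the `emitted` list (standing in for the output collector) are all hypothetical names introduced for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a spout; it does not use any Storm classes.
class IdleAwareSource {
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> emitted = new ArrayList<>(); // stands in for the collector
    private int idleCalls = 0;

    // Makes a record available for a later nextTuple() call.
    void offer(String record) {
        pending.add(record);
    }

    // Mirrors the nextTuple contract: emit at most one record and return,
    // never waiting for input to arrive.
    void nextTuple() {
        String record = pending.poll();
        if (record == null) {
            idleCalls++;
            try {
                Thread.sleep(1); // brief pause so an idle loop does not spin the CPU
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            }
            return;
        }
        emitted.add(record);
    }

    List<String> emitted() {
        return emitted;
    }

    int idleCalls() {
        return idleCalls;
    }
}
```

Because the caller invokes nextTuple, ack, and fail from the same thread, the sleep in this sketch is kept to one millisecond so that pending ack and fail calls are not delayed noticeably.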
protected void emitData(List<Object> tuple, org.apache.storm.hdfs.spout.HdfsSpout.MessageId id)
public void open(Map<String,Object> conf, TopologyContext context, SpoutOutputCollector collector)
Description copied from interface: ISpout
Called when a task for this component is initialized within a worker on the cluster. It provides the spout with the environment in which the spout executes. The environment consists of the following parameters:
conf - The Storm configuration for this spout. This is the configuration provided to the topology, merged with the cluster configuration on this machine.
context - This object can be used to get information about this task’s place within the topology, including the task id and component id of this task, input and output information, etc.
collector - The collector is used to emit tuples from this spout. Tuples can be emitted at any time, including in the open and close methods. The collector is thread-safe and should be saved as an instance variable of this spout object.
public void close()
Description copied from interface: ISpout
Called when an ISpout is going to be shut down. There is no guarantee that close will be called, because the supervisor terminates worker processes on the cluster with kill -9.
The one context where close is guaranteed to be called is when a topology is killed while running Storm in local mode.
Specified by: close in interface ISpout
Overrides: close in class BaseRichSpout
public void ack(Object msgId)
Description copied from interface: ISpout
Storm has determined that the tuple emitted by this spout with the msgId identifier has been fully processed. Typically, an implementation of this method will take that message off the queue and prevent it from being replayed.
Specified by: ack in interface ISpout
Overrides: ack in class BaseRichSpout
public void fail(Object msgId)
Description copied from interface: ISpout
The tuple emitted by this spout with the msgId identifier has failed to be fully processed. Typically, an implementation of this method will put that message back on the queue to be replayed at a later time.
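The typical ack and fail behavior described above (drop a message once acknowledged, requeue it on failure) can be illustrated with a small stand-alone model. This is a sketch under stated assumptions and does not reflect how HdfsSpout tracks its own MessageId objects: `ReplayTracker`, its numeric message ids, and the in-memory queue are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of ack/fail bookkeeping; it does not use any Storm classes.
class ReplayTracker {
    private final Queue<String> toEmit = new ArrayDeque<>();
    private final Map<Long, String> inFlight = new HashMap<>();
    private long nextId = 0;

    // Queues a record for emission.
    void offer(String record) {
        toEmit.add(record);
    }

    // Emits the next queued record and returns its message id,
    // or -1 when nothing is queued.
    long emitNext() {
        String record = toEmit.poll();
        if (record == null) {
            return -1;
        }
        long id = nextId++;
        inFlight.put(id, record); // remember the record until it is acked or failed
        return id;
    }

    // Fully processed: forget the record so it is never replayed.
    void ack(long msgId) {
        inFlight.remove(msgId);
    }

    // Processing failed: put the record back on the queue for a later replay.
    void fail(long msgId) {
        String record = inFlight.remove(msgId);
        if (record != null) {
            toEmit.add(record);
        }
    }

    int inFlightCount() {
        return inFlight.size();
    }

    int queuedCount() {
        return toEmit.size();
    }
}
```

In this model a replayed record receives a fresh message id on its next emission. That is a design choice of the sketch, not a statement about Storm or HdfsSpout.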
Specified by: fail in interface ISpout
Overrides: fail in class BaseRichSpout
public void declareOutputFields(OutputFieldsDeclarer declarer)
Description copied from interface: IComponent
Declare the output schema for all the streams of this topology.
declarer - this is used to declare output stream ids, output fields, and whether or not each output stream is a direct stream
Copyright © 2021 The Apache Software Foundation. All rights reserved.