    Fault Tolerance Guarantees of Data Sources and Sinks

    Flink’s fault tolerance mechanism recovers programs in the presence of failures and continues to execute them. Such failures include machine hardware failures, network failures, and transient program failures.

    Flink can guarantee exactly-once state updates to user-defined state only when the source participates in the snapshotting mechanism. The following table lists the state update guarantees of Flink coupled with the bundled connectors.

    Please read the documentation of each connector to understand the details of the fault tolerance guarantees.

    Source                 Guarantees                                    Notes
    ---------------------  --------------------------------------------  ----------------------------------------------------
    Apache Kafka           exactly once                                  Use the appropriate Kafka connector for your version
    AWS Kinesis Streams    exactly once
    RabbitMQ               at most once (v 0.10) / exactly once (v 1.0)
    Twitter Streaming API  at most once
    Collections            exactly once
    Files                  exactly once
    Sockets                at most once
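
    The difference between the "exactly once" and "at most once" rows above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not Flink code: `Checkpoint` and `run_with_failure` are hypothetical names, the "operator state" is a plain record counter, and a single failure is injected at a fixed position. A replayable source rewinds to the checkpointed offset after recovery, while a non-replayable source (such as a socket) cannot re-read what was lost.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    offset: int  # source read position at snapshot time
    count: int   # operator state (a record counter) at snapshot time


def run_with_failure(records, checkpoint_at, fail_at, replayable):
    """Count records, crash once at `fail_at`, then recover from the last checkpoint."""
    offset, count = 0, 0
    checkpoint = Checkpoint(offset=0, count=0)
    failed = False
    while offset < len(records):
        if offset == checkpoint_at:
            # Snapshot the source position and the operator state together.
            checkpoint = Checkpoint(offset, count)
        if offset == fail_at and not failed:
            failed = True
            count = checkpoint.count        # state is rolled back to the snapshot
            if replayable:
                offset = checkpoint.offset  # source rewinds and replays
            # A non-replayable source keeps its position, so the records
            # read between the checkpoint and the failure are lost.
            continue
        count += 1
        offset += 1
    return count


records = list(range(10))
exactly_once = run_with_failure(records, checkpoint_at=4, fail_at=7, replayable=True)
at_most_once = run_with_failure(records, checkpoint_at=4, fail_at=7, replayable=False)
print(exactly_once, at_most_once)
```

    With ten records, a checkpoint before record 4, and a failure before record 7, the replayable run counts every record exactly once, while the non-replayable run misses the three records that were processed after the checkpoint and then rolled back.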

    To guarantee end-to-end exactly-once record delivery (in addition to exactly-once state semantics), the data sink needs to take part in the checkpointing mechanism. The following table lists the delivery guarantees (assuming exactly-once state updates) of Flink coupled with bundled sinks:

    Sink                 Guarantees                    Notes
    -------------------  ----------------------------  ---------------------------------------------------
    HDFS BucketingSink   exactly once                  Implementation depends on Hadoop version
    Elasticsearch        at least once
    Kafka producer       at least once / exactly once  Exactly once with transactional producers (v 0.11+)
    Cassandra sink       at least once / exactly once  Exactly once only for idempotent updates
    AWS Kinesis Streams  at least once
    File sinks           at least once
    Socket sinks         at least once
    Standard output      at least once
    Redis sink           at least once
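
    The notes on the Kafka and Cassandra sinks point to two common ways of turning at-least-once delivery into an exactly-once result: idempotent updates and transactional writes tied to checkpoints. The following toy simulation contrasts them with a plain append-only sink. It is a sketch under stated assumptions rather than the connectors' actual implementation: all class and function names are hypothetical, records are replayed from the last checkpoint after a single injected failure, and a "transaction" is just an in-memory buffer.

```python
class AppendSink:
    """Writes every record immediately, so replayed records appear twice."""

    def __init__(self):
        self.output = []

    def write(self, key, value):
        self.output.append((key, value))

    def on_checkpoint_complete(self):
        pass  # nothing is tied to checkpoints

    def on_restore(self):
        pass  # earlier writes cannot be undone


class IdempotentSink(AppendSink):
    """Keyed upsert: writing the same record again leaves the result unchanged."""

    def __init__(self):
        self.output = {}

    def write(self, key, value):
        self.output[key] = value


class TransactionalSink(AppendSink):
    """Buffers writes and makes them visible only when a checkpoint completes."""

    def __init__(self):
        self.output = []
        self.pending = []

    def write(self, key, value):
        self.pending.append((key, value))

    def on_checkpoint_complete(self):
        self.output.extend(self.pending)  # commit the open transaction
        self.pending = []

    def on_restore(self):
        self.pending = []                 # abort uncommitted writes


def run(sink, records, checkpoint_at, fail_at):
    """Feed records to the sink, fail once, and replay from the last checkpoint."""
    offset, restart_from, failed = 0, 0, False
    while offset < len(records):
        if offset == checkpoint_at and not failed:
            sink.on_checkpoint_complete()
            restart_from = offset
        if offset == fail_at and not failed:
            failed = True
            sink.on_restore()
            offset = restart_from
            continue
        sink.write(*records[offset])
        offset += 1
    sink.on_checkpoint_complete()  # final checkpoint at end of input
    return sink.output


records = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)]
appended = run(AppendSink(), records, checkpoint_at=2, fail_at=4)
upserted = run(IdempotentSink(), records, checkpoint_at=2, fail_at=4)
committed = run(TransactionalSink(), records, checkpoint_at=2, fail_at=4)
```

    In this run the append-only sink ends up with seven entries because records "c" and "d" are written again during replay, which is the at-least-once behavior. The idempotent and transactional sinks both end with each of the five records represented once, at the cost of requiring keyed updates or delaying visibility until a checkpoint completes.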