Class MessagingService

  • All Implemented Interfaces:
    MessageDelivery, MessagingServiceMBean

    public class MessagingService
    extends MessagingServiceMBeanImpl
    implements MessageDelivery
    MessagingService implements all internode communication - with the exception of SSTable streaming (for now). Specifically, it's responsible for dispatch of outbound messages to other nodes and routing of inbound messages to their appropriate IVerbHandler.

    Using MessagingService: sending requests and responses

    There are two ways to send a Message; pick one depending on the desired behaviour:

    1. To send a request that expects a response, use the sendWithCallback(Message, InetAddressAndPort, RequestCallback) method. When a success response is received, RequestCallback.onResponse(Message) is invoked on the provided callback. If a failure response arrives (see Verb.FAILURE_RSP), or if no response arrives within the verb's configured expiry time, RequestCallback.onFailure(InetAddressAndPort, RequestFailureReason) is invoked instead.
    2. To send a response, or a message that expects no response, use the send(Message, InetAddressAndPort) method.

    See also: Message.out(Verb, Object), Message.responseWith(Object), and Message.failureResponse(RequestFailureReason).
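
    The request/callback contract described above can be modelled in a few lines. This is a minimal sketch using stand-in types, not Cassandra's classes: CallbackSketch, its Callback interface, and the String payloads are all hypothetical, and nothing is actually sent over the network. It only illustrates that a registered callback is completed exactly once, either by a success response or by a failure/expiry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Stand-in model only: every name here is hypothetical, not part of Cassandra.
final class CallbackSketch
{
    // Plays the role of RequestCallback in this sketch.
    interface Callback
    {
        void onResponse(String payload);
        void onFailure(String from, String reason);
    }

    private final Map<Long, Callback> pending = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    // Request path: remember the callback under a fresh id; no real I/O happens here.
    long sendWithCallback(String payload, String to, Callback cb)
    {
        long id = nextId.incrementAndGet();
        pending.put(id, cb);
        return id;
    }

    // A success response completes the callback; removal guarantees at-most-once delivery.
    void onResponseArrived(long id, String payload)
    {
        Callback cb = pending.remove(id);
        if (cb != null)
            cb.onResponse(payload);
    }

    // A failure response and an expiry both end up in onFailure.
    void onFailureOrExpiry(long id, String from, String reason)
    {
        Callback cb = pending.remove(id);
        if (cb != null)
            cb.onFailure(from, reason);
    }

    // Simple callback that records what it was told, for demonstration purposes.
    static final class Recording implements Callback
    {
        final List<String> events = new ArrayList<>();

        public void onResponse(String payload)
        {
            events.add("response:" + payload);
        }

        public void onFailure(String from, String reason)
        {
            events.add("failure:" + from + ":" + reason);
        }
    }
}
```

    In this model a response that arrives after the callback has already expired is silently dropped, because the callback was removed from the pending map when it expired.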

    Using MessagingService: handling a request

    As described in the previous section, to handle responses you only need to implement the RequestCallback interface, so long as your response verb handler is the default ResponseVerbHandler. Implementing request handling takes two steps:

    1. Create an IVerbHandler to process incoming requests and responses for the new type (if applicable).
    2. Add a new Verb to the enum for the new request type and, if applicable, one for the response message.

    MessagingService will then automatically invoke your handler whenever a Message with this verb arrives.
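
    The two steps can be illustrated with a toy dispatcher. This is a sketch under stated assumptions: the Handler interface stands in for IVerbHandler, the two-constant Verb enum and its String payloads are hypothetical, and the real Verb enum carries far more metadata than a handler reference.

```java
// Stand-in model only: Handler, Verb and dispatch are hypothetical simplifications.
final class VerbDispatchSketch
{
    // Step 1: a handler type that processes a payload and may produce a reply payload.
    interface Handler
    {
        String doVerb(String payload);
    }

    // Step 2: each verb is declared together with the handler that serves it.
    enum Verb
    {
        ECHO_REQ(payload -> "echo:" + payload),
        ECHO_RSP(payload -> null); // responses produce no further reply in this sketch

        final Handler handler;

        Verb(Handler handler)
        {
            this.handler = handler;
        }
    }

    // Routing an inbound message is then a lookup on the verb it carries.
    static String dispatch(Verb verb, String payload)
    {
        return verb.handler.doVerb(payload);
    }
}
```

    Binding the handler to the enum constant means a verb cannot exist without a handler, which is why adding the enum entry is enough for messages to be routed.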

    Architecture of MessagingService

    QOS

    Since our messaging protocol is TCP-based and doesn't yet support interleaving messages with each other, we need a way to prevent head-of-line blocking from adversely affecting all messages, in particular large messages getting in the way of smaller ones. To achieve that (somewhat), we maintain three messaging connections to and from each peer:

    - one for large messages, defined as being larger than OutboundConnections.LARGE_MESSAGE_THRESHOLD (65KiB by default)
    - one for small messages, defined as being smaller than that threshold
    - one for urgent messages, which are usually small and/or need to arrive promptly, e.g. gossip-related ones
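
    A minimal sketch of how a message might be assigned to one of the three connections follows. The class and method names are hypothetical, the threshold is passed in rather than read from OutboundConnections, and two details are assumptions of this sketch rather than statements about the implementation: that the size check takes precedence over urgency, and that a message exactly at the threshold counts as small.

```java
// Stand-in model only: the selection rules below are assumptions for illustration.
final class ConnectionChoiceSketch
{
    enum ConnectionType { URGENT_MESSAGES, SMALL_MESSAGES, LARGE_MESSAGES }

    static ConnectionType choose(int serializedSize, boolean urgent, int largeThreshold)
    {
        // Assumption: anything above the threshold goes to the large connection,
        // even if it was flagged urgent, so it cannot delay small urgent traffic.
        if (serializedSize > largeThreshold)
            return ConnectionType.LARGE_MESSAGES;

        return urgent ? ConnectionType.URGENT_MESSAGES : ConnectionType.SMALL_MESSAGES;
    }
}
```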

    Wire format and framing

    Small messages are grouped together into frames, and large messages are split over multiple frames. Framing provides application-level integrity protection to otherwise raw streams of data: we use CRC24 for frame headers and CRC32 for the entire payload. LZ4 is optionally used for compression.

    The on-wire format of individual messages is described in the comments for Message.Serializer, along with format evolution notes. For the list and descriptions of available frame decoders, see the FrameDecoder comments. The frame wire format is documented in the javadoc of the FrameDecoder implementations: see FrameDecoderCrc and FrameDecoderLZ4 in particular.
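
    To make the idea of payload integrity protection concrete, here is a small sketch that appends a CRC32 to a payload and verifies it on decode, using java.util.zip.CRC32. The layout used here (4-byte length, payload, 4-byte CRC32) is hypothetical and is not the actual frame format; in particular, the CRC24-protected header and LZ4 compression are omitted. See FrameDecoderCrc and FrameDecoderLZ4 for the real layouts.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Stand-in model only: this is NOT the real frame layout.
final class FrameCrcSketch
{
    // Hypothetical layout: [4-byte payload length][payload][4-byte CRC32 of payload].
    static byte[] encode(byte[] payload)
    {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(4 + payload.length + 4)
                         .putInt(payload.length)
                         .put(payload)
                         .putInt((int) crc.getValue())
                         .array();
    }

    // Recomputes the CRC32 over the payload and rejects the frame on mismatch.
    static byte[] decode(byte[] frame)
    {
        ByteBuffer in = ByteBuffer.wrap(frame);
        int length = in.getInt();
        byte[] payload = new byte[length];
        in.get(payload);
        int expected = in.getInt();

        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        if ((int) crc.getValue() != expected)
            throw new IllegalStateException("Payload CRC mismatch");
        return payload;
    }
}
```

    Note that the length field in this sketch is unprotected, which is exactly the gap a separate header checksum closes.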

    Architecture of outbound messaging

    OutboundConnection is the core class implementing outbound connection logic, with OutboundConnection.enqueue(Message) being its main entry point. The connections are initiated by OutboundConnectionInitiator. The Netty pipeline for outbound messaging connections generally consists of the following handlers:

    [(optional) SslHandler] <- [FrameEncoder]

    OutboundConnection handles the entire lifetime of a connection, from the very first handshake to any necessary reconnects. Message-delivery flow varies depending on the connection type.

    For ConnectionType.SMALL_MESSAGES and ConnectionType.URGENT_MESSAGES, Message serialization and delivery occur directly on the event loop. See OutboundConnection.EventLoopDelivery for details.

    For ConnectionType.LARGE_MESSAGES, to ensure that servicing large messages doesn't block timely service of other requests, message serialization is offloaded to a companion thread pool (SocketFactory.synchronousWorkExecutor). Most of the work will be performed by AsyncChannelOutputPlus. Please see OutboundConnection.LargeMessageDelivery for details.

    To prevent fast clients, or slow nodes on the other end of the connection, from overwhelming a host with enqueued, unsent messages on heap, we impose strict limits on how much memory enqueued, undelivered messages can claim. Every individual connection gets an exclusive permit quota to use, 4MiB by default; every endpoint (a group of large, small, and urgent connections) is capped, by default, at 128MiB of undelivered messages; and a global limit of 512MiB is imposed on all endpoints combined.

    On an attempt to OutboundConnection.enqueue(Message), the connection will attempt to allocate permits for message-size number of bytes from its exclusive quota; if successful, it will add the message to the queue; if unsuccessful, it will need to allocate the remainder from both the endpoint and global reserves, and if it fails to do so, the message will be rejected, and its callbacks, if any, immediately expired.

    For a more detailed description please see the docs and comments of OutboundConnection.
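
    The three-tier permit scheme can be sketched as follows. This is a simplified, single-threaded model with hypothetical names (PermitSketch, Limit, tryEnqueue); the real accounting must be safe under concurrent access and is documented in OutboundConnection. The sketch only shows the order of allocation: exclusive quota first, then the remainder from both the endpoint and global reserves, with rollback if either reserve refuses.

```java
// Stand-in model only: single-threaded and simplified for illustration.
final class PermitSketch
{
    // A capped byte counter; plays the role of an endpoint or global reserve.
    static final class Limit
    {
        final long capacity;
        long used;

        Limit(long capacity)
        {
            this.capacity = capacity;
        }

        boolean tryAllocate(long bytes)
        {
            if (used + bytes > capacity)
                return false;
            used += bytes;
            return true;
        }

        void release(long bytes)
        {
            used -= bytes;
        }
    }

    final Limit exclusive; // owned by this connection alone
    final Limit endpoint;  // shared by the connections to one peer
    final Limit global;    // shared by all endpoints

    PermitSketch(long exclusiveCapacity, Limit endpoint, Limit global)
    {
        this.exclusive = new Limit(exclusiveCapacity);
        this.endpoint = endpoint;
        this.global = global;
    }

    // Returns true if the message may be enqueued, false if it must be rejected.
    boolean tryEnqueue(long messageSize)
    {
        long fromExclusive = Math.min(messageSize, exclusive.capacity - exclusive.used);
        long remainder = messageSize - fromExclusive;

        if (remainder > 0)
        {
            // The remainder must be granted by BOTH shared reserves.
            if (!endpoint.tryAllocate(remainder))
                return false;
            if (!global.tryAllocate(remainder))
            {
                endpoint.release(remainder); // roll back the partial allocation
                return false;
            }
        }
        exclusive.used += fromExclusive;
        return true;
    }
}
```

    With an exclusive quota of 4, an endpoint reserve of 10 and a global reserve of 12 (tiny numbers chosen for readability), messages of size 3 and 5 are accepted, while a following message of size 7 is rejected because the endpoint reserve would be exceeded.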

    Architecture of inbound messaging

    InboundMessageHandler is the core class implementing inbound connection logic, paired with FrameDecoder. Inbound connections are initiated by InboundConnectionInitiator. The primary entry points to these classes are FrameDecoder.channelRead(ShareableBytes) and AbstractMessageHandler.process(FrameDecoder.Frame). The Netty pipeline for inbound messaging connections generally consists of the following handlers:

    [(optional) SslHandler] -> [FrameDecoder] -> [InboundMessageHandler]

    FrameDecoder is responsible for decoding incoming frames and work stashing; InboundMessageHandler then takes decoded frames from the decoder and processes the messages contained in them.

    The flow differs between small and large messages. Small ones are deserialized immediately, and only then scheduled for execution on the right thread pool for the Verb. Large messages, on the other hand, aren't deserialized until they are just about to be executed on the appropriate Stage.

    Similarly to outbound handling, inbound messaging imposes strict memory utilisation limits on individual endpoints and on global aggregate consumption, and implements simple flow control, to prevent a single fast endpoint from overwhelming a host. Every individual connection gets an exclusive permit quota to use, 4MiB by default; every endpoint (a group of large, small, and urgent connections) is capped, by default, at 128MiB of unprocessed messages; and a global limit of 512MiB is imposed on all endpoints combined.

    On arrival of a message header, the handler will attempt to allocate permits for message-size number of bytes from its exclusive quota; if successful, it will proceed to deserializing and processing the message. If unsuccessful, the handler will attempt to allocate the remainder from its endpoint and global reserves; if either allocation is unsuccessful, the handler will cease any further frame processing and tell FrameDecoder to stop reading from the network; subsequently, it will put itself on a special AbstractMessageHandler.WaitQueue, to be reactivated once more permits become available.

    For a more detailed description please see the docs and comments of InboundMessageHandler and FrameDecoder.
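
    The pause-and-resume behaviour can be sketched with a single shared reserve and a FIFO wait queue. This is a deliberately reduced model with hypothetical names: it collapses the exclusive, endpoint and global tiers into one capacity, identifies handlers by a String, and resumes waiters strictly in arrival order, none of which is claimed to match AbstractMessageHandler.WaitQueue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Stand-in model only: one reserve, single-threaded, FIFO wake-up order assumed.
final class InboundBackpressureSketch
{
    private static final class Waiter
    {
        final String handler;
        final long bytes;

        Waiter(String handler, long bytes)
        {
            this.handler = handler;
            this.bytes = bytes;
        }
    }

    private final long capacity;
    private long used;
    private final Queue<Waiter> waitQueue = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    InboundBackpressureSketch(long capacity)
    {
        this.capacity = capacity;
    }

    // Called when a message header arrives; returns false if the handler had to pause.
    boolean onHeader(String handler, long messageSize)
    {
        if (used + messageSize <= capacity)
        {
            used += messageSize;
            log.add(handler + " processes " + messageSize);
            return true;
        }
        // Not enough permits: stop reading and park until permits are released.
        waitQueue.add(new Waiter(handler, messageSize));
        log.add(handler + " paused");
        return false;
    }

    // Called once a message has been processed; wakes parked handlers that now fit.
    void onProcessed(long messageSize)
    {
        used -= messageSize;
        while (!waitQueue.isEmpty() && used + waitQueue.peek().bytes <= capacity)
        {
            Waiter w = waitQueue.poll();
            used += w.bytes;
            log.add(w.handler + " resumed with " + w.bytes);
        }
    }
}
```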

    Observability

    MessagingService exposes diagnostic counters for both outbound and inbound directions - received and sent bytes and message counts, overload bytes and message count, error bytes and error counts, and many more. See InternodeInboundMetrics and InternodeOutboundMetrics for JMX-exposed counters. We also provide system_views.internode_inbound and system_views.internode_outbound virtual tables - implemented in InternodeInboundTable and InternodeOutboundTable respectively.
    • Method Detail

      • getVersionOrdinal

        public static int getVersionOrdinal​(int version)
        This is an optimisation to speed up the translation of the serialization version to the MessagingService.Version enum ordinal.
        Parameters:
        version - the serialization version
        Returns:
        a MessagingService.Version ordinal value
      • sendWithCallback

        public void sendWithCallback​(Message message,
                                     InetAddressAndPort to,
                                     RequestCallback cb)
        Send a non-mutation message to a given endpoint. This method specifies a callback which is invoked with the actual response.
        Specified by:
        sendWithCallback in interface MessageDelivery
        Parameters:
        message - message to be sent.
        to - endpoint to which the message needs to be sent
        cb - callback used to pass responses back to the caller, or to signal to the caller that a timeout occurred.
      • sendWriteWithCallback

        public void sendWriteWithCallback​(Message message,
                                          Replica to,
                                          AbstractWriteResponseHandler<?> handler)
        Send a mutation message or a Paxos Commit to a given endpoint. This method specifies a callback which is invoked with the actual response. Also holds the message (only mutation messages) to determine if it needs to trigger a hint (uses StorageProxy for that).
        Parameters:
        message - message to be sent.
        to - endpoint to which the message needs to be sent
        handler - callback used to pass responses back to the caller, or to signal to the caller that a timeout occurred.
      • send

        public void send​(Message message,
                         InetAddressAndPort to)
        Send a message to a given endpoint. This method adheres to the fire and forget style messaging.
        Specified by:
        send in interface MessageDelivery
        Parameters:
        message - message to be sent.
        to - endpoint to which the message needs to be sent
      • respond

        public <V> void respond​(V response,
                                Message<?> message)
        Send a response to the given message back to its sender. This method adheres to the fire and forget style messaging.
        Specified by:
        respond in interface MessageDelivery
        Parameters:
        message - the message being responded to
        response - the payload of the response
      • closeOutbound

        public void closeOutbound​(InetAddressAndPort to)
        Only to be invoked once we believe the endpoint will never be contacted again. We close the connection after a five-minute delay, to give asynchronous operations a chance to terminate.
      • removeInbound

        public void removeInbound​(InetAddressAndPort from)
        Only to be invoked once we believe the connections will never be used again.
      • interruptOutbound

        public void interruptOutbound​(InetAddressAndPort to)
        Closes any currently open channel/connection to the endpoint without causing any message loss; we will try to re-establish connections immediately.
      • maybeReconnectWithNewIp

        public io.netty.util.concurrent.Future<java.lang.Void> maybeReconnectWithNewIp​(InetAddressAndPort address,
                                                                                       InetAddressAndPort preferredAddress)
        Reconnect to the peer using the given preferred address. Outstanding messages in each channel will be sent on the current channel. Typically this function is used for something like EC2 public IP addresses, which need to be used for communication between EC2 regions.
        Parameters:
        address - IP Address to identify the peer
        preferredAddress - IP Address to use (and prefer) going forward for connecting to the peer
      • shutdown

        public void shutdown()
        Wait for callbacks and don't allow any more to be created (since they could require writing hints).
      • shutdown

        public void shutdown​(long timeout,
                             java.util.concurrent.TimeUnit units,
                             boolean shutdownGracefully,
                             boolean shutdownExecutors)
      • shutdownAbrubtly

        public void shutdownAbrubtly()
      • listen

        public void listen()
      • waitUntilListening

        public void waitUntilListening()
                                throws java.lang.InterruptedException
        Throws:
        java.lang.InterruptedException