File transfer
When transferring all or part of an sstable, only the DATA component is shipped. To that end,
there are three "modes" of an sstable transfer that need to be handled somewhat differently:
1) uncompressed sstable - data needs to be read into user space so it can be manipulated: checksum validation,
stream compression (see next section), and/or TLS encryption.
2) compressed sstable, transferred with SSL/TLS - data needs to be read into user space as that is where the TLS encryption
needs to happen. Netty does not allow the pretense of doing zero-copy transfers when TLS is in the pipeline;
data must explicitly be pulled into user-space memory for TLS encryption to work.
3) compressed sstable, transferred without SSL/TLS - data can be streamed via zero-copy transfer as the data does not
need to be manipulated (it can be sent "as-is").
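The three modes above reduce to a decision on two inputs: whether the sstable is already compressed, and whether TLS is in the pipeline. The following is a minimal sketch of that decision only; the class, enum, and method names are hypothetical and are not taken from the actual streaming code.

```java
// Illustrative sketch only: all names here are hypothetical.
final class TransferModeSketch {
    enum Mode {
        USER_SPACE_UNCOMPRESSED,    // mode 1: validate checksums, optionally stream-compress and/or encrypt
        USER_SPACE_COMPRESSED_TLS,  // mode 2: already compressed, but TLS needs the bytes in user space
        ZERO_COPY                   // mode 3: already compressed, no TLS, bytes can be sent as-is
    }

    // Picks the transfer mode from the two properties the text describes.
    static Mode choose(boolean sstableCompressed, boolean tlsEnabled) {
        if (!sstableCompressed) {
            return Mode.USER_SPACE_UNCOMPRESSED;
        }
        return tlsEnabled ? Mode.USER_SPACE_COMPRESSED_TLS : Mode.ZERO_COPY;
    }

    // Only mode 3 avoids pulling the data into user-space memory.
    static boolean needsUserSpaceCopy(Mode mode) {
        return mode != Mode.ZERO_COPY;
    }

    public static void main(String[] args) {
        for (boolean compressed : new boolean[] { false, true }) {
            for (boolean tls : new boolean[] { false, true }) {
                System.out.println("compressed=" + compressed + ", tls=" + tls
                                   + " -> " + choose(compressed, tls));
            }
        }
    }
}
```

Note that an uncompressed sstable lands in mode 1 whether or not TLS is enabled, since the data has to be read into user space either way.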
Compressing the data
We always want to transfer as few bytes as possible over the wire when streaming a file. If the
sstable is not already compressed via table compression options, we apply an on-the-fly stream compression
to the data. The stream compression format is documented in StreamCompressionSerializer.
You may be wondering: why implement your own compression scheme? Why not use Netty's built-in compression codecs,
like Lz4FrameEncoder? That makes complete sense if all the sstables
to be streamed are not using sstable compression (and obviously you wouldn't use stream compression when the sstables
are using sstable compression). The problem is when you have a mix of files, some using sstable compression
and some not. You can either:
- send the files of one type over one kind of socket, and the others over another socket
- send them both over the same socket, and adjust the handling for each file type.
I've opted for the latter to keep socket/channel management simpler and cleaner.
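
To make the second option concrete, here is a rough sketch of deciding per file, on one shared channel, whether stream compression should be applied. The OutgoingFile type, its fields, and the method names are assumptions made up for illustration; the real decision lives in the streaming code and the wire format is the one documented in StreamCompressionSerializer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: all names here are hypothetical.
final class PerFileCompressionSketch {
    // Minimal stand-in for a file queued for streaming.
    static final class OutgoingFile {
        final String name;
        final boolean sstableCompressed;

        OutgoingFile(String name, boolean sstableCompressed) {
            this.name = name;
            this.sstableCompressed = sstableCompressed;
        }
    }

    // Stream compression is applied only when the sstable is not already compressed.
    static boolean applyStreamCompression(OutgoingFile file) {
        return !file.sstableCompressed;
    }

    // Describes how each file in a mixed batch would be handled on the same channel.
    static List<String> plan(List<OutgoingFile> files) {
        List<String> steps = new ArrayList<>();
        for (OutgoingFile file : files) {
            String handling = applyStreamCompression(file) ? "stream-compress" : "send as-is";
            steps.add(file.name + ": " + handling);
        }
        return steps;
    }

    public static void main(String[] args) {
        List<OutgoingFile> batch = Arrays.asList(
            new OutgoingFile("a-Data.db", true),
            new OutgoingFile("b-Data.db", false));
        for (String step : plan(batch)) {
            System.out.println(step);
        }
    }
}
```

Because the choice is made per file rather than per connection, a fixed channel-wide codec such as Lz4FrameEncoder would not fit: it would also compress files that are already compressed.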