Transform this Mappable into another by mapping after. We do not call this method map because it would conflict with Mappable, unfortunately.
Because TupleConverter cannot be covariant, we need to jump through this hoop. A typical implementation might take (implicit conv: TupleConverter[T]) and then:
override def converter[U >: T] = TupleConverter.asSuperConverter[T, U](conv)
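The hoop can be illustrated with a self-contained sketch. Conv, asSuperConv, and the Animal/Dog types below are illustrative stand-ins, not Scalding's actual TupleConverter API, but the widening pattern is the same:

```scala
// Simplified model of the covariance hoop: an invariant typeclass
// explicitly widened to a supertype.
trait Conv[T] { def convert(s: String): T }

object Conv {
  // Safe because Conv only produces T, never consumes it.
  def asSuperConv[T, U >: T](c: Conv[T]): Conv[U] =
    c.asInstanceOf[Conv[U]]
}

class Animal(val name: String)
class Dog(name: String) extends Animal(name)

object CovarianceHoop {
  implicit val dogConv: Conv[Dog] = new Conv[Dog] {
    def convert(s: String): Dog = new Dog(s)
  }

  // Mirrors: override def converter[U >: T] = asSuperConverter[T, U](conv)
  def converter[U >: Dog](implicit conv: Conv[Dog]): Conv[U] =
    Conv.asSuperConv[Dog, U](conv)
}
```

The cast is sound for a produce-only typeclass; making the trait itself covariant would be cleaner, but the sketch assumes it must stay invariant, as TupleConverter does.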
Subclasses of Source MUST override this method. They may call out to TestTapFactory to make Taps suitable for testing.
If you want to filter, you should use this and output an Iterable of length 0 or 1. A filter does not change column names, whereas here we generally expect the columns to change.
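The 0-or-1 output pattern can be sketched in plain Scala; keepEvens is an illustrative function, not part of Scalding's API:

```scala
// Filtering by emitting an Iterable of length 0 or 1 per element,
// the shape described above: Nil drops the element, List(x) keeps it.
object ZeroOrOne {
  def keepEvens(xs: List[Int]): List[Int] =
    xs.flatMap { x => if (x % 2 == 0) List(x) else Nil }
}
```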
Don't build the string of the whole iterable, which can be huge. We take the first 10 items plus the identityHashCode of the iterable.
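A minimal sketch of that truncation, in plain Scala (shortString is an illustrative name, not the actual method):

```scala
// Stringify only the first 10 items and append the identity hash,
// so a huge iterable is never fully rendered. For simplicity this
// sketch always appends an ellipsis, even for short iterables.
object IterShow {
  def shortString(iter: Iterable[_]): String =
    iter.take(10).mkString("Iterable(", ", ", ", ...)") +
      "@" + Integer.toHexString(System.identityHashCode(iter))
}
```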
Allows you to read a Tap on the submit node. NOT FOR USE IN THE MAPPERS OR REDUCERS. A typical use might be reading it in Job.next to determine whether another job is needed.
The mock passed in to scalding.JobTest may be considered either a mock of the Tap or of the Source. By default, as of 0.9.0, it is treated as a mock of the Source. If you set this to true, the mock in TestMode will be treated as a mock of the Tap (which must be transformed) rather than of the Source.
Write the pipe, but return the input so it can be chained into the next operation.
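The chaining this enables can be modeled in plain Scala. Pipe below is a toy stand-in for a Cascading/Scalding pipe, and the buffer stands in for a sink:

```scala
import scala.collection.mutable.ListBuffer

// Toy model of a pipe whose write step returns its input, so further
// operations can be chained after an intermediate write.
final case class Pipe[A](data: List[A]) {
  def map[B](f: A => B): Pipe[B] = Pipe(data.map(f))
  // "Write" to a sink (here just a buffer) and return the input pipe.
  def write(sink: ListBuffer[A]): Pipe[A] = { sink ++= data; this }
}
```

This is why `pipe.write(intermediate).map(...)` works in a Scalding job: the write is a side effect and the pipeline continues from the same pipe.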
(Since version 0.9.0) replace with Mappable.toIterator
Allows an iterable defined in the job (on the submitter) to be used within a Job as you would a Pipe/RichPipe.
These lists should probably be very tiny by Hadoop standards. If they are getting large, you should probably dump them to HDFS and use the normal mechanisms to address the data (a FileSource).
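A hedged usage sketch, assuming Scalding's IterableSource and Tsv; this depends on the Scalding jars and is not compiled here, and LookupJob and the field names are illustrative:

```scala
import com.twitter.scalding._

// A small in-memory list used as a pipe inside a Job.
class LookupJob(args: Args) extends Job(args) {
  val small = List(("a", 1), ("b", 2))

  IterableSource(small, ('key, 'value))
    .read
    .write(Tsv(args("output")))
}
```

Per the note above, keep such lists tiny; anything sizable belongs on HDFS behind a FileSource.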