lamp.data

package lamp.data

Type members

Classlikes

trait BatchStream[+I, S, C]
case object EmptyBatch extends StreamControl[Nothing]
case object EndStream extends StreamControl[Nothing]
object IOLoops

Contains training loops and helpers around them

The two training loops implemented here are:

sealed trait LoopState
case class NonEmptyBatch[I](batch: I) extends StreamControl[I]
case class Peek(label: String) extends Module
object Reader
object SWA
case class SWALoopState(model: Seq[STen], optimizer: Seq[STen], epoch: Int, lastValidationLoss: Option[Double], minValidationLoss: Option[Double], numberOfAveragedModels: Int, averagedModels: Option[Seq[Tensor]], learningCurve: List[(Int, Double, Option[Double])]) extends LoopState
case class SimpleLoopState(model: Seq[STen], optimizer: Seq[STen], epoch: Int, lastValidationLoss: Option[Double], minValidationLoss: Option[Double], minValidationLossModel: Option[(Int, Seq[Tensor])], learningCurve: List[(Int, Double, Option[(Double, Double)])]) extends LoopState
case class SimpleThenSWALoopState(simple: SimpleLoopState, swa: Option[SWALoopState]) extends LoopState
object StateIO
sealed trait StreamControl[+I]
object Text
object Writer

Serializes tensors

This format is similar to the ONNX external tensor serialization format, but it uses JSON rather than protobuf.

Format specification

Sequences of tensors are serialized into a JSON descriptor and a data blob. The schema of the descriptor is the case class lamp.data.schemas.TensorList. The location field in this schema holds a path to the data blob. If the location is a relative POSIX path, it is interpreted relative to the file path where the descriptor itself is written.
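The rule for relative locations can be sketched as follows. This is a minimal illustration and not lamp's own code: LocationSketch and resolveBlob are hypothetical names, and the sketch assumes that "relative to the file path where the descriptor is written" means relative to the directory containing the descriptor file.

```scala
import java.nio.file.{Path, Paths}

// Hypothetical helper, not part of lamp.
object LocationSketch {
  // Resolves the `location` field of a descriptor to a path to the data blob.
  // Assumption: a relative location is resolved against the directory
  // that contains the descriptor file.
  def resolveBlob(descriptorFile: Path, location: String): Path = {
    val loc = Paths.get(location)
    if (loc.isAbsolute) loc
    else descriptorFile.toAbsolutePath.resolveSibling(loc).normalize()
  }
}
```

Under this assumption, a descriptor written to /models/net.json with location net.bin points at /models/net.bin, while an absolute location is used unchanged.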

The descriptor may be embedded into larger JSON structures.

The data blob itself is the raw data in little-endian byte order. Floating-point values are IEEE-754. The descriptor specifies the byte offset and byte length of each tensor inside the data blob. As such, the data blob contains no framing or other control bytes, but it may contain padding bytes between tensors.
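As an illustration of this layout, the sketch below slices 32-bit floats out of a blob using a byte offset and byte length. It uses only java.nio and is not lamp's Reader or Writer API: BlobSketch, readFloats and writeFloats are hypothetical names, and the element type (4-byte float) is assumed here, whereas in the real format it would come from the descriptor.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical helpers, not part of lamp.
object BlobSketch {
  // Encodes floats as raw little-endian IEEE-754 bytes, without any framing.
  def writeFloats(values: Array[Float]): Array[Byte] = {
    val buf = ByteBuffer.allocate(values.length * 4).order(ByteOrder.LITTLE_ENDIAN)
    values.foreach(v => buf.putFloat(v))
    buf.array()
  }

  // Reads the floats stored in blob[byteOffset, byteOffset + byteLength).
  // Bytes outside that range (for example padding) are never touched.
  def readFloats(blob: Array[Byte], byteOffset: Int, byteLength: Int): Array[Float] = {
    require(byteLength % 4 == 0, "byte length must be a multiple of 4 for 32-bit floats")
    val buf = ByteBuffer.wrap(blob, byteOffset, byteLength).order(ByteOrder.LITTLE_ENDIAN)
    Array.fill(byteLength / 4)(buf.getFloat())
  }
}
```

For example, a blob may hold a two-element tensor at offset 0 with length 8, then 8 padding bytes, then a one-element tensor at offset 16 with length 4. Because the reader relies only on offsets and lengths, the padding needs no special handling.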