lamp.data

package lamp.data

Members list

Packages

package lamp.data.bert

Greedy contraction of consecutive n-grams

Attributes

Data loader and inference utilities for the language model module in lamp.nn.languagemodel

Attributes

Type members

Classlikes

trait BatchStream[+I, S, C]

A functional stateful stream of items

lamp's training loops work from data presented in BatchStreams.

An instance of BatchStream is a description of the data stream; by itself it does not allocate or store any data. The stream needs to be driven by an interpreter. lamp.data.IOLoops and the companion object BatchStream contain the interpreters that do something useful with a BatchStream.

See the abstract members and the companion object for more documentation.

Type parameters

C

the type of accessory resources (e.g. buffers). The stream might need an instance of this type to do its work. The intended use is for fixed, pre-allocated pinned buffer pairs that facilitate host-device copies. See lamp.Device.toBatched and lamp.BufferPair.

I

the item type; the stream will yield items of this type

S

the state type; the stream will carry over and accumulate state of this type
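The idea of a stream that is only a description, driven by a separate interpreter, can be illustrated with a minimal sketch. Everything below (`MiniStream`, `Control`, `fromVector`, `foldLeft`) is a hypothetical, simplified stand-in written for this illustration. It is not lamp's BatchStream API, and it omits the resource type parameter C entirely.

```scala
import scala.annotation.tailrec

// Local, simplified stand-ins; these are NOT lamp's actual definitions.
sealed trait Control[+I]
case object End extends Control[Nothing]
case object Empty extends Control[Nothing]
final case class NonEmpty[I](batch: I) extends Control[I]

// A stream is only a description: an initial state and a pure step function.
trait MiniStream[+I, S] {
  def init: S
  def next(state: S): (S, Control[I])
}

object MiniStream {
  // Describes a stream over `data` cut into batches of `batchSize`.
  // The state is the current offset into `data`; no batch exists until `next` is called.
  def fromVector[A](data: Vector[A], batchSize: Int): MiniStream[Vector[A], Int] =
    new MiniStream[Vector[A], Int] {
      val init: Int = 0
      def next(offset: Int): (Int, Control[Vector[A]]) =
        if (offset >= data.length) (offset, End)
        else (offset + batchSize, NonEmpty(data.slice(offset, offset + batchSize)))
    }

  // Interpreter: drives the stream until End and folds over the yielded batches.
  def foldLeft[I, S, B](stream: MiniStream[I, S], zero: B)(f: (B, I) => B): B = {
    @tailrec
    def loop(state: S, acc: B): B =
      stream.next(state) match {
        case (_, End)             => acc
        case (s, Empty)           => loop(s, acc)
        case (s, NonEmpty(batch)) => loop(s, f(acc, batch))
      }
    loop(stream.init, zero)
  }
}

// Usage: sum each batch of two elements.
val stream = MiniStream.fromVector(Vector(1, 2, 3, 4, 5), batchSize = 2)
val batchSums = MiniStream.foldLeft(stream, Vector.empty[Int])((acc, b) => acc :+ b.sum)
// batchSums == Vector(3, 7, 5)
```

Keeping the state explicit in the step function means the same description can be interpreted more than once, each time starting from `init`.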

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
Self type
BatchStream[I, S, C]
object BatchStream

Attributes

Companion
trait
Supertypes
class Object
trait Matchable
class Any
Self type

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
trait Codec

An abstraction around byte to token encodings.

Attributes

Supertypes
class Object
trait Matchable
class Any
Known subtypes
object IdentityCodec.type
trait CodecFactory[T <: Codec]

An abstraction around byte to token encodings.

Attributes

Supertypes
class Object
trait Matchable
class Any
Known subtypes
object DataParallel

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
case object EmptyBatch extends StreamControl[Nothing]

Attributes

Supertypes
trait Singleton
trait Product
trait Mirror
trait Serializable
trait Product
trait Equals
trait StreamControl[Nothing]
class Object
trait Matchable
class Any
Self type
EmptyBatch.type
case object EndStream extends StreamControl[Nothing]

Attributes

Supertypes
trait Singleton
trait Product
trait Mirror
trait Serializable
trait Product
trait Equals
trait StreamControl[Nothing]
class Object
trait Matchable
class Any
Self type
EndStream.type

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
object IOLoops

Contains training loops and helpers around them

The two training loops implemented here are:

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
IOLoops.type
object IdentityCodec extends Codec

Attributes

Supertypes
trait Codec
class Object
trait Matchable
class Any
Self type

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
sealed trait LoopState

Attributes

Supertypes
class Object
trait Matchable
class Any
Known subtypes
case class NonEmptyBatch[I](batch: I) extends StreamControl[I]

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
trait StreamControl[I]
class Object
trait Matchable
class Any
case class Peek(label: String) extends Module

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
trait GenericModule[Variable, Variable]
class Object
trait Matchable
class Any
object Reader

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
Reader.type
object SWA

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
SWA.type
case class SWALoopState(model: Seq[STen], optimizer: Seq[STen], epoch: Int, lastValidationLoss: Option[Double], minValidationLoss: Option[Double], numberOfAveragedModels: Int, averagedModels: Option[Seq[Tensor]], learningCurve: List[(Int, Double, Option[Double])]) extends LoopState

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
trait LoopState
class Object
trait Matchable
class Any
case class SimpleLoopState(model: Seq[STen], optimizer: Seq[STen], epoch: Int, lastValidationLoss: Option[Double], minValidationLoss: Option[Double], minValidationLossModel: Option[(Int, Seq[Tensor])], learningCurve: List[(Int, Double, Option[(Double, Double)])]) extends LoopState

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
trait LoopState
class Object
trait Matchable
class Any
case class SimpleThenSWALoopState(simple: SimpleLoopState, swa: Option[SWALoopState]) extends LoopState

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
trait LoopState
class Object
trait Matchable
class Any
object StateIO

Helpers to read and write training loop state

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
StateIO.type
sealed trait StreamControl[+I]

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
Known subtypes
object EmptyBatch.type
object EndStream.type
class NonEmptyBatch[I]
object StreamControl

Attributes

Companion
trait
Supertypes
trait Sum
trait Mirror
class Object
trait Matchable
class Any
Self type
object Text

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
Text.type

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any

Attributes

Companion
trait
Supertypes
class Object
trait Matchable
class Any
Self type

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any

Attributes

Companion
trait
Supertypes
class Object
trait Matchable
class Any
Self type
object Writer

Serializes tensors

This format is similar to the ONNX external tensor serialization format, but it uses JSON rather than protobuf.

Format specification

Sequences of tensors are serialized into a JSON descriptor and a data blob. The schema of the descriptor is the case class lamp.data.schemas.TensorList. The location field in this schema holds a path to the data blob. If this is a relative POSIX path then it is relative to the file path where the descriptor itself is written. Otherwise it is an absolute path of the data blob file.
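The path rule can be sketched with java.nio.file. The helper `resolveBlob` is hypothetical and not part of lamp. It assumes that "relative to the file path where the descriptor itself is written" means relative to the directory containing the descriptor file, and the example paths are made up.

```scala
import java.nio.file.{Path, Paths}

// Hypothetical helper, not part of lamp: resolves the `location` value found in a
// descriptor against the path of the descriptor file.
def resolveBlob(descriptorFile: Path, location: String): Path = {
  val loc = Paths.get(location)
  if (loc.isAbsolute) loc // absolute locations are used as they are
  else descriptorFile.toAbsolutePath.resolveSibling(loc).normalize // assumption: sibling of the descriptor
}

// Usage with made-up POSIX paths.
val relativeCase = resolveBlob(Paths.get("/models/run1/weights.json"), "weights.bin")
val absoluteCase = resolveBlob(Paths.get("/models/run1/weights.json"), "/data/blob.bin")
```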

The descriptor may be embedded into larger JSON structures.

The data blob itself is the raw data in little endian byte order. Floating point is IEEE-754. The descriptor specifies the byte offset and byte length of the tensors inside the data blob. As such, the data blob contains no framing or other control bytes, but it may contain padding bytes between tensors.
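The blob layout described above can be illustrated with a small sketch based on java.nio.ByteBuffer. The names `Entry`, `pack` and `unpack` are hypothetical and are not lamp's Writer or Reader API; the real descriptor schema is lamp.data.schemas.TensorList. The sketch handles only 32-bit floats, and aligning each tensor to a fixed boundary is just one assumed reason for padding bytes to appear.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical descriptor entry: where one tensor lives inside the blob.
final case class Entry(byteOffset: Long, byteLength: Long)

// Packs float arrays into one little endian blob with no framing bytes.
// Each tensor starts at a multiple of `align`, so zero padding may sit between tensors.
def pack(tensors: Seq[Array[Float]], align: Int): (Array[Byte], Seq[Entry]) = {
  def roundUp(n: Int): Int = (n + align - 1) / align * align
  val lengths = tensors.map(_.length * 4) // 4 bytes per IEEE-754 single
  val offsets = lengths.scanLeft(0)((off, len) => roundUp(off + len)).init
  val total = if (tensors.isEmpty) 0 else offsets.last + lengths.last
  val buffer = ByteBuffer.allocate(total).order(ByteOrder.LITTLE_ENDIAN)
  tensors.zip(offsets).foreach { case (t, off) =>
    buffer.position(off)
    t.foreach(v => buffer.putFloat(v))
  }
  (buffer.array(), offsets.zip(lengths).map { case (o, l) => Entry(o.toLong, l.toLong) })
}

// Reads one tensor back using only its byte offset and byte length.
def unpack(blob: Array[Byte], e: Entry): Array[Float] = {
  val buffer = ByteBuffer
    .wrap(blob, e.byteOffset.toInt, e.byteLength.toInt)
    .order(ByteOrder.LITTLE_ENDIAN)
  Array.fill(e.byteLength.toInt / 4)(buffer.getFloat())
}

// Usage: a 12-byte tensor followed by 4 padding bytes, then an 8-byte tensor.
val (blob, entries) = pack(Seq(Array(1.0f, 2.0f, 3.0f), Array(4.0f, 5.0f)), align = 8)
val second = unpack(blob, entries(1))
```

Because the offsets and lengths live in the descriptor, a reader can skip the padding without the blob carrying any markers of its own.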

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
Writer.type