class AutoGGUFVisionModel extends AnnotatorModel[AutoGGUFVisionModel] with HasBatchedAnnotateTextImage[AutoGGUFVisionModel] with HasEngine with HasLlamaCppModelProperties with HasLlamaCppInferenceProperties with HasProtectedParams with CompletionPostProcessing
Multimodal annotator that uses the llama.cpp library to generate text completions with large language models. It supports ingesting images for captioning.
At the moment, only CLIP-based models are supported.
For settable parameters and their explanations, see HasLlamaCppInferenceProperties and HasLlamaCppModelProperties, and refer to the llama.cpp documentation of server.cpp for more information.
If these parameters are not set, the annotator will default to the values provided by the model.
This annotator expects a column of annotator type AnnotationImage for the image and Annotation for the caption. Note that the image annotation must contain raw image bytes without preprocessing. We provide the helper function ImageAssembler.loadImagesAsBytes to load image bytes from a directory.
Pretrained models can be loaded with pretrained of the companion object:
val autoGGUFVisionModel = AutoGGUFVisionModel.pretrained()
  .setInputCols("image", "document")
  .setOutputCol("completions")
The default model is "Qwen2.5_VL_3B_Instruct_Q4_K_M_gguf", if no name is provided.
For available pretrained models please see the Models Hub.
For extended examples of usage, see the AutoGGUFVisionModelTest and the example notebook.
Note
To use GPU inference with this annotator, make sure to use the Spark NLP GPU package and set the number of GPU layers with the setNGpuLayers method.
When using larger models, we recommend adjusting GPU usage with setNCtx and setNGpuLayers according to your hardware to avoid out-of-memory errors.
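As a sketch of the GPU-related settings mentioned above (the concrete values are illustrative assumptions and depend on your hardware):

```scala
// Sketch: GPU inference configuration (requires the Spark NLP GPU package).
// The values below are examples, not recommendations.
val visionModel = AutoGGUFVisionModel.pretrained()
  .setInputCols("caption_document", "image_assembler")
  .setOutputCol("completions")
  .setNGpuLayers(99) // number of layers to offload to the GPU
  .setNCtx(4096)     // context size; lower it to reduce GPU memory usage
```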
Example
import com.johnsnowlabs.nlp.ImageAssembler
import com.johnsnowlabs.nlp.annotator._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.util.io.ResourceHelper
import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

val documentAssembler = new DocumentAssembler()
  .setInputCol("caption")
  .setOutputCol("caption_document")

val imageAssembler = new ImageAssembler()
  .setInputCol("image")
  .setOutputCol("image_assembler")

val imagesPath = "src/test/resources/image/"
val data: DataFrame = ImageAssembler
  .loadImagesAsBytes(ResourceHelper.spark, imagesPath)
  .withColumn("caption", lit("Caption this image.")) // Add a caption to each image.

val nPredict = 40
val model = AutoGGUFVisionModel.pretrained()
  .setInputCols("caption_document", "image_assembler")
  .setOutputCol("completions")
  .setBatchSize(4)
  .setNGpuLayers(99)
  .setNCtx(4096)
  .setMinKeep(0)
  .setMinP(0.05f)
  .setNPredict(nPredict)
  .setNProbs(0)
  .setPenalizeNl(false)
  .setRepeatLastN(256)
  .setRepeatPenalty(1.18f)
  .setStopStrings(Array("</s>", "Llama:", "User:"))
  .setTemperature(0.05f)
  .setTfsZ(1)
  .setTypicalP(1)
  .setTopK(40)
  .setTopP(0.95f)

val pipeline = new Pipeline().setStages(Array(documentAssembler, imageAssembler, model))
pipeline
  .fit(data)
  .transform(data)
  .selectExpr("reverse(split(image.origin, '/'))[0] as image_name", "completions.result")
  .show(truncate = false)

+-----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|image_name       |result                                                                                                                                                               |
+-----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|palace.JPEG      |[ The image depicts a large, ornate room with high ceilings and beautifully decorated walls. There are several chairs placed throughout the space, some of which have cushions]|
|egyptian_cat.jpeg|[ The image features two cats lying on a pink surface, possibly a bed or sofa. One cat is positioned towards the left side of the scene and appears to be sleeping while holding]|
|hippopotamus.JPEG|[ A large brown hippo is swimming in a body of water, possibly an aquarium. The hippo appears to be enjoying its time in the water and seems relaxed as it floats]|
|hen.JPEG         |[ The image features a large chicken standing next to several baby chickens. In total, there are five birds in the scene: one adult and four young ones. They appear to be gathered together]|
|ostrich.JPEG     |[ The image features a large, long-necked bird standing in the grass. It appears to be an ostrich or similar species with its head held high and looking around. In addition to]|
|junco.JPEG       |[ A small bird with a black head and white chest is standing on the snow. It appears to be looking at something, possibly food or another animal in its vicinity. The scene takes place out]|
|bluetick.jpg     |[ A dog with a red collar is sitting on the floor, looking at something. The dog appears to be staring into the distance or focusing its attention on an object in front of it.]|
|chihuahua.jpg    |[ A small brown dog wearing a sweater is sitting on the floor. The dog appears to be looking at something, possibly its owner or another animal in the room. It seems comfortable and relaxed]|
|tractor.JPEG     |[ A man is sitting in the driver's seat of a green tractor, which has yellow wheels and tires. The tractor appears to be parked on top of an empty field with]|
|ox.JPEG          |[ A large bull with horns is standing in a grassy field.]|
+-----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
- By Inheritance
- AutoGGUFVisionModel
- CompletionPostProcessing
- HasProtectedParams
- HasLlamaCppInferenceProperties
- HasLlamaCppModelProperties
- HasEngine
- HasBatchedAnnotateTextImage
- AnnotatorModel
- CanBeLazy
- RawAnnotator
- HasOutputAnnotationCol
- HasInputAnnotationCols
- HasOutputAnnotatorType
- ParamsAndFeaturesWritable
- HasFeatures
- DefaultParamsWritable
- MLWritable
- Model
- Transformer
- PipelineStage
- Logging
- Params
- Serializable
- Identifiable
- AnyRef
- Any
Instance Constructors
Type Members
- implicit class ProtectedParam[T] extends Param[T]
- Definition Classes
- HasProtectedParams
- type AnnotationContent = Seq[Row]
Internal types to show Rows as a relevant StructType. Should be deleted once Spark releases UserDefinedTypes to @developerAPI.
- Attributes
- protected
- Definition Classes
- AnnotatorModel
- type AnnotatorType = String
- Definition Classes
- HasOutputAnnotatorType
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def $[T](param: Param[T]): T
- Attributes
- protected
- Definition Classes
- Params
- def $$[T](feature: StructFeature[T]): T
- Attributes
- protected
- Definition Classes
- HasFeatures
- def $$[K, V](feature: MapFeature[K, V]): Map[K, V]
- Attributes
- protected
- Definition Classes
- HasFeatures
- def $$[T](feature: SetFeature[T]): Set[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
- def $$[T](feature: ArrayFeature[T]): Array[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- def _transform(dataset: Dataset[_], recursivePipeline: Option[PipelineModel]): DataFrame
- Attributes
- protected
- Definition Classes
- AnnotatorModel
- def afterAnnotate(dataset: DataFrame): DataFrame
- Attributes
- protected
- Definition Classes
- AnnotatorModel
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def batchAnnotate(batchedAnnotations: Seq[(Annotation, AnnotationImage)]): Seq[Seq[Annotation]]
Completes the batch of annotations.
- batchedAnnotations
The single batch of annotations
- returns
Completed text sequences that belong to the same original row
- Definition Classes
- AutoGGUFVisionModel → HasBatchedAnnotateTextImage
- def batchProcess(rows: Iterator[_]): Iterator[Row]
- Definition Classes
- HasBatchedAnnotateTextImage
- val batchSize: IntParam
Size of every batch (Default depends on model).
- Definition Classes
- HasBatchedAnnotateTextImage
- def beforeAnnotate(dataset: Dataset[_]): Dataset[_]
- Attributes
- protected
- Definition Classes
- AnnotatorModel
- val cachePrompt: BooleanParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val chatTemplate: Param[String]
- Definition Classes
- HasLlamaCppModelProperties
- final def checkSchema(schema: StructType, inputAnnotatorType: String): Boolean
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
- final def clear(param: Param[_]): AutoGGUFVisionModel.this.type
- Definition Classes
- Params
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @HotSpotIntrinsicCandidate() @native()
- def close(): Unit
Closes the llama.cpp model backend, freeing resources. The model is reloaded when used again.
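As a usage sketch (pipeline, data, and model are assumed to be defined as in the example above):

```scala
// Sketch: force evaluation, then free the native llama.cpp backend.
val results = pipeline.fit(data).transform(data)
results.collect() // materialize the completions first
model.close()     // frees native resources; the model is reloaded on next use
```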
- def copy(extra: ParamMap): AutoGGUFVisionModel
Requirement for annotator copies.
- Definition Classes
- RawAnnotator → Model → Transformer → PipelineStage → Params
- def copyValues[T <: Params](to: T, extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
- final def defaultCopy[T <: Params](extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
- val defragmentationThreshold: FloatParam
- Definition Classes
- HasLlamaCppModelProperties
- val disableLog: BooleanParam
- Definition Classes
- HasLlamaCppModelProperties
- val disableTokenIds: IntArrayParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val dynamicTemperatureExponent: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val dynamicTemperatureRange: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val engine: Param[String]
This param is set internally once via loadSavedModel. That's why there is no setter.
- Definition Classes
- HasEngine
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def explainParam(param: Param[_]): String
- Definition Classes
- Params
- def explainParams(): String
- Definition Classes
- Params
- def extraValidate(structType: StructType): Boolean
- Attributes
- protected
- Definition Classes
- RawAnnotator
- def extraValidateMsg: String
Override for additional custom schema checks
- Attributes
- protected
- Definition Classes
- RawAnnotator
- final def extractParamMap(): ParamMap
- Definition Classes
- Params
- final def extractParamMap(extra: ParamMap): ParamMap
- Definition Classes
- Params
- val features: ArrayBuffer[Feature[_, _, _]]
- Definition Classes
- HasFeatures
- val flashAttention: BooleanParam
- Definition Classes
- HasLlamaCppModelProperties
- val frequencyPenalty: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- def get[T](feature: StructFeature[T]): Option[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
- def get[K, V](feature: MapFeature[K, V]): Option[Map[K, V]]
- Attributes
- protected
- Definition Classes
- HasFeatures
- def get[T](feature: SetFeature[T]): Option[Set[T]]
- Attributes
- protected
- Definition Classes
- HasFeatures
- def get[T](feature: ArrayFeature[T]): Option[Array[T]]
- Attributes
- protected
- Definition Classes
- HasFeatures
- final def get[T](param: Param[T]): Option[T]
- Definition Classes
- Params
- def getBatchSize: Int
Size of every batch.
- Definition Classes
- HasBatchedAnnotateTextImage
- def getCachePrompt: Boolean
- Definition Classes
- HasLlamaCppInferenceProperties
- def getChatTemplate: String
- Definition Classes
- HasLlamaCppModelProperties
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- final def getDefault[T](param: Param[T]): Option[T]
- Definition Classes
- Params
- def getDefragmentationThreshold: Float
- Definition Classes
- HasLlamaCppModelProperties
- def getDisableLog: Boolean
- Definition Classes
- HasLlamaCppModelProperties
- def getDisableTokenIds: Array[Int]
- Definition Classes
- HasLlamaCppInferenceProperties
- def getDynamicTemperatureExponent: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getDynamicTemperatureRange: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getEngine: String
- Definition Classes
- HasEngine
- def getFlashAttention: Boolean
- Definition Classes
- HasLlamaCppModelProperties
- def getFrequencyPenalty: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getGrammar: String
- Definition Classes
- HasLlamaCppInferenceProperties
- def getIgnoreEos: Boolean
- Definition Classes
- HasLlamaCppInferenceProperties
- def getInferenceParameters: InferenceParameters
- Attributes
- protected
- Definition Classes
- HasLlamaCppInferenceProperties
- def getInputCols: Array[String]
- returns
input annotations columns currently used
- Definition Classes
- HasInputAnnotationCols
- def getInputPrefix: String
- Definition Classes
- HasLlamaCppInferenceProperties
- def getInputSuffix: String
- Definition Classes
- HasLlamaCppInferenceProperties
- def getLazyAnnotator: Boolean
- Definition Classes
- CanBeLazy
- def getLogVerbosity: Int
- Definition Classes
- HasLlamaCppModelProperties
- def getMainGpu: Int
- Definition Classes
- HasLlamaCppModelProperties
- def getMetadata: String
Get the metadata for the model
- Definition Classes
- HasLlamaCppModelProperties
- def getMetadataMap: Map[String, Map[String, String]]
- Definition Classes
- HasLlamaCppModelProperties
- def getMinKeep: Int
- Definition Classes
- HasLlamaCppInferenceProperties
- def getMinP: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getMiroStat: String
- Definition Classes
- HasLlamaCppInferenceProperties
- def getMiroStatEta: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getMiroStatTau: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getModelDraft: String
- Definition Classes
- HasLlamaCppModelProperties
- def getModelIfNotSet: GGUFWrapperMultiModal
- def getModelParameters: ModelParameters
- Attributes
- protected
- Definition Classes
- HasLlamaCppModelProperties
- def getNBatch: Int
- Definition Classes
- HasLlamaCppModelProperties
- def getNCtx: Int
- Definition Classes
- HasLlamaCppModelProperties
- def getNDraft: Int
- Definition Classes
- HasLlamaCppModelProperties
- def getNGpuLayers: Int
- Definition Classes
- HasLlamaCppModelProperties
- def getNGpuLayersDraft: Int
- Definition Classes
- HasLlamaCppModelProperties
- def getNKeep: Int
- Definition Classes
- HasLlamaCppInferenceProperties
- def getNPredict: Int
- Definition Classes
- HasLlamaCppInferenceProperties
- def getNProbs: Int
- Definition Classes
- HasLlamaCppInferenceProperties
- def getNThreads: Int
- Definition Classes
- HasLlamaCppModelProperties
- def getNThreadsBatch: Int
- Definition Classes
- HasLlamaCppModelProperties
- def getNUbatch: Int
- Definition Classes
- HasLlamaCppModelProperties
- def getNoKvOffload: Boolean
- Definition Classes
- HasLlamaCppModelProperties
- def getNuma: String
- Definition Classes
- HasLlamaCppModelProperties
- final def getOrDefault[T](param: Param[T]): T
- Definition Classes
- Params
- final def getOutputCol: String
Gets the name of the annotation column that will be generated
- Definition Classes
- HasOutputAnnotationCol
- def getParam(paramName: String): Param[Any]
- Definition Classes
- Params
- def getPenalizeNl: Boolean
- Definition Classes
- HasLlamaCppInferenceProperties
- def getPenaltyPrompt: String
- Definition Classes
- HasLlamaCppInferenceProperties
- def getPresencePenalty: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getRemoveThinkingTag: Option[String]
- Definition Classes
- CompletionPostProcessing
- def getRepeatLastN: Int
- Definition Classes
- HasLlamaCppInferenceProperties
- def getRepeatPenalty: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getRopeFreqBase: Float
- Definition Classes
- HasLlamaCppModelProperties
- def getRopeFreqScale: Float
- Definition Classes
- HasLlamaCppModelProperties
- def getRopeScalingType: String
- Definition Classes
- HasLlamaCppModelProperties
- def getSamplers: Array[String]
- Definition Classes
- HasLlamaCppInferenceProperties
- def getSeed: Int
- Definition Classes
- HasLlamaCppInferenceProperties
- def getSplitMode: String
- Definition Classes
- HasLlamaCppModelProperties
- def getStopStrings: Array[String]
- Definition Classes
- HasLlamaCppInferenceProperties
- def getSystemPrompt: String
- Definition Classes
- HasLlamaCppModelProperties
- def getTemperature: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getTfsZ: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getTokenBias: Map[String, Float]
- Definition Classes
- HasLlamaCppInferenceProperties
- def getTokenIdBias: Map[Int, Float]
- Definition Classes
- HasLlamaCppInferenceProperties
- def getTopK: Int
- Definition Classes
- HasLlamaCppInferenceProperties
- def getTopP: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getTypicalP: Float
- Definition Classes
- HasLlamaCppInferenceProperties
- def getUseChatTemplate: Boolean
- Definition Classes
- HasLlamaCppInferenceProperties
- def getUseMlock: Boolean
- Definition Classes
- HasLlamaCppModelProperties
- def getUseMmap: Boolean
- Definition Classes
- HasLlamaCppModelProperties
- def getYarnAttnFactor: Float
- Definition Classes
- HasLlamaCppModelProperties
- def getYarnBetaFast: Float
- Definition Classes
- HasLlamaCppModelProperties
- def getYarnBetaSlow: Float
- Definition Classes
- HasLlamaCppModelProperties
- def getYarnExtFactor: Float
- Definition Classes
- HasLlamaCppModelProperties
- def getYarnOrigCtx: Int
- Definition Classes
- HasLlamaCppModelProperties
- val gpuSplitMode: Param[String]
Set how to split the model across GPUs
- NONE: No GPU split
- LAYER: Split the model across GPUs by layer
- ROW: Split the model across GPUs by rows
- Definition Classes
- HasLlamaCppModelProperties
- val grammar: Param[String]
- Definition Classes
- HasLlamaCppInferenceProperties
- final def hasDefault[T](param: Param[T]): Boolean
- Definition Classes
- Params
- def hasParam(paramName: String): Boolean
- Definition Classes
- Params
- def hasParent: Boolean
- Definition Classes
- Model
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- val ignoreEos: BooleanParam
- Definition Classes
- HasLlamaCppInferenceProperties
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
- val inputAnnotatorTypes: Array[AnnotatorType]
Annotator reference id. Used to identify elements in metadata or to refer to this annotator type
- Definition Classes
- AutoGGUFVisionModel → HasInputAnnotationCols
- final val inputCols: StringArrayParam
Columns that contain annotations necessary to run this annotator. AnnotatorType is used for both input and output columns if not specified.
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
- val inputPrefix: Param[String]
- Definition Classes
- HasLlamaCppInferenceProperties
- val inputSuffix: Param[String]
- Definition Classes
- HasLlamaCppInferenceProperties
- final def isDefined(param: Param[_]): Boolean
- Definition Classes
- Params
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def isSet(param: Param[_]): Boolean
- Definition Classes
- Params
- def isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- val lazyAnnotator: BooleanParam
- Definition Classes
- CanBeLazy
- def log: Logger
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logName: String
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- val logVerbosity: IntParam
- Definition Classes
- HasLlamaCppModelProperties
- def logWarning(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- val logger: Logger
- Attributes
- protected
- Definition Classes
- HasLlamaCppModelProperties
- val mainGpu: IntParam
- Definition Classes
- HasLlamaCppModelProperties
- val metadata: ProtectedParam[String]
- Definition Classes
- HasLlamaCppModelProperties
- val minKeep: IntParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val minP: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val miroStat: Param[String]
- Definition Classes
- HasLlamaCppInferenceProperties
- val miroStatEta: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val miroStatTau: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val modelDraft: Param[String]
- Definition Classes
- HasLlamaCppModelProperties
- def msgHelper(schema: StructType): String
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
- val nBatch: IntParam
- Definition Classes
- HasLlamaCppModelProperties
- val nCtx: IntParam
- Definition Classes
- HasLlamaCppModelProperties
- val nDraft: IntParam
- Definition Classes
- HasLlamaCppModelProperties
- val nGpuLayers: IntParam
- Definition Classes
- HasLlamaCppModelProperties
- val nGpuLayersDraft: IntParam
- Definition Classes
- HasLlamaCppModelProperties
- val nKeep: IntParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val nPredict: IntParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val nProbs: IntParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val nThreads: IntParam
- Definition Classes
- HasLlamaCppModelProperties
- val nThreadsBatch: IntParam
- Definition Classes
- HasLlamaCppModelProperties
- val nUbatch: IntParam
- Definition Classes
- HasLlamaCppModelProperties
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- val noKvOffload: BooleanParam
- Definition Classes
- HasLlamaCppModelProperties
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- val numaStrategy: Param[String]
Set optimization strategies that help on some NUMA systems (if available)
Available Strategies:
- DISABLED: No NUMA optimizations
- DISTRIBUTE: Spread execution evenly over all nodes
- ISOLATE: Only spawn threads on CPUs on the node that execution started on
- NUMA_CTL: Use the CPU map provided by numactl
- MIRROR: Mirrors the model across NUMA nodes
- Definition Classes
- HasLlamaCppModelProperties
- def onWrite(path: String, spark: SparkSession): Unit
- Definition Classes
- AutoGGUFVisionModel → ParamsAndFeaturesWritable
- val optionalInputAnnotatorTypes: Array[String]
- Definition Classes
- HasInputAnnotationCols
- val outputAnnotatorType: AnnotatorType
- Definition Classes
- AutoGGUFVisionModel → HasOutputAnnotatorType
- final val outputCol: Param[String]
- Attributes
- protected
- Definition Classes
- HasOutputAnnotationCol
- lazy val params: Array[Param[_]]
- Definition Classes
- Params
- var parent: Estimator[AutoGGUFVisionModel]
- Definition Classes
- Model
- val penalizeNl: BooleanParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val penaltyPrompt: Param[String]
- Definition Classes
- HasLlamaCppInferenceProperties
- val presencePenalty: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- def processCompletions(results: Array[String]): Array[String]
- Attributes
- protected
- Definition Classes
- CompletionPostProcessing
- val removeThinkingTag: Param[String]
- Definition Classes
- CompletionPostProcessing
- val repeatLastN: IntParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val repeatPenalty: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val ropeFreqBase: FloatParam
- Definition Classes
- HasLlamaCppModelProperties
- val ropeFreqScale: FloatParam
- Definition Classes
- HasLlamaCppModelProperties
- val ropeScalingType: Param[String]
Set the RoPE frequency scaling method, defaults to linear unless specified by the model.
- UNSPECIFIED: Don't use any scaling
- LINEAR: Linear scaling
- YARN: YaRN RoPE scaling
- Definition Classes
- HasLlamaCppModelProperties
- val samplers: StringArrayParam
- Definition Classes
- HasLlamaCppInferenceProperties
- def save(path: String): Unit
- Definition Classes
- MLWritable
- Annotations
- @throws("If the input path already exists but overwrite is not enabled.") @Since("1.6.0")
- val seed: IntParam
- Definition Classes
- HasLlamaCppInferenceProperties
- def set[T](param: ProtectedParam[T], value: T): AutoGGUFVisionModel.this.type
Sets the value for a protected Param.
If the parameter was already set, it will not be set again. Default values do not count as a set value and can be overridden.
- T
Type of the parameter
- param
Protected parameter to set
- value
Value for the parameter
- returns
This object
- Definition Classes
- HasProtectedParams
- def set[T](feature: StructFeature[T], value: T): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
- def set[K, V](feature: MapFeature[K, V], value: Map[K, V]): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
- def set[T](feature: SetFeature[T], value: Set[T]): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
- def set[T](feature: ArrayFeature[T], value: Array[T]): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
- final def set(paramPair: ParamPair[_]): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- Params
- final def set(param: String, value: Any): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- Params
- final def set[T](param: Param[T], value: T): AutoGGUFVisionModel.this.type
- Definition Classes
- Params
- def setBatchSize(size: Int): AutoGGUFVisionModel.this.type
Size of every batch.
- Definition Classes
- HasBatchedAnnotateTextImage
- def setCachePrompt(cachePrompt: Boolean): AutoGGUFVisionModel.this.type
Whether to remember the prompt to avoid reprocessing it
- Definition Classes
- HasLlamaCppInferenceProperties
- def setChatTemplate(chatTemplate: String): AutoGGUFVisionModel.this.type
The chat template to use
- Definition Classes
- HasLlamaCppModelProperties
- def setDefault[T](feature: StructFeature[T], value: () => T): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
- def setDefault[K, V](feature: MapFeature[K, V], value: () => Map[K, V]): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
- def setDefault[T](feature: SetFeature[T], value: () => Set[T]): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
- def setDefault[T](feature: ArrayFeature[T], value: () => Array[T]): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
- final def setDefault(paramPairs: ParamPair[_]*): AutoGGUFVisionModel.this.type
- Attributes
- protected
- Definition Classes
- Params
- final def setDefault[T](param: Param[T], value: T): AutoGGUFVisionModel.this.type
- Attributes
- protected[org.apache.spark.ml]
- Definition Classes
- Params
- def setDefragmentationThreshold(defragThold: Float): AutoGGUFVisionModel.this.type
Set the KV cache defragmentation threshold
- Definition Classes
- HasLlamaCppModelProperties
- def setDisableLog(disableLog: Boolean): AutoGGUFVisionModel.this.type
- Definition Classes
- HasLlamaCppModelProperties
- def setDisableTokenIds(disableTokenIds: Array[Int]): AutoGGUFVisionModel.this.type
Set the token ids to disable in the completion. This corresponds to setTokenBias with a value of Float.NEGATIVE_INFINITY.
- Definition Classes
- HasLlamaCppInferenceProperties
- def setDynamicTemperatureExponent(dynatempExponent: Float): AutoGGUFVisionModel.this.type
Set the dynamic temperature exponent
- Definition Classes
- HasLlamaCppInferenceProperties
- def setDynamicTemperatureRange(dynatempRange: Float): AutoGGUFVisionModel.this.type
Set the dynamic temperature range
- Definition Classes
- HasLlamaCppInferenceProperties
- def setFlashAttention(flashAttention: Boolean): AutoGGUFVisionModel.this.type
Whether to enable Flash Attention
- Definition Classes
- HasLlamaCppModelProperties
- def setFrequencyPenalty(frequencyPenalty: Float): AutoGGUFVisionModel.this.type
Set the repetition alpha frequency penalty
- Definition Classes
- HasLlamaCppInferenceProperties
- def setGpuSplitMode(splitMode: String): AutoGGUFVisionModel.this.type
Set how to split the model across GPUs
- NONE: No GPU split
- LAYER: Split the model across GPUs by layer
- ROW: Split the model across GPUs by rows
- Definition Classes
- HasLlamaCppModelProperties
- def setGrammar(grammar: String): AutoGGUFVisionModel.this.type
Set BNF-like grammar to constrain generations
- Definition Classes
- HasLlamaCppInferenceProperties
- def setIgnoreEos(ignoreEos: Boolean): AutoGGUFVisionModel.this.type
Set whether to ignore end of stream token and continue generating (implies --logit-bias 2-inf)
- Definition Classes
- HasLlamaCppInferenceProperties
- final def setInputCols(value: String*): AutoGGUFVisionModel.this.type
- Definition Classes
- HasInputAnnotationCols
- def setInputCols(value: Array[String]): AutoGGUFVisionModel.this.type
Overrides required annotators column if different than default
- Definition Classes
- HasInputAnnotationCols
- def setInputPrefix(inputPrefix: String): AutoGGUFVisionModel.this.type
Set the prompt to start generation with
- Definition Classes
- HasLlamaCppInferenceProperties
- def setInputSuffix(inputSuffix: String): AutoGGUFVisionModel.this.type
Set a suffix for infilling
- Definition Classes
- HasLlamaCppInferenceProperties
- def setLazyAnnotator(value: Boolean): AutoGGUFVisionModel.this.type
- Definition Classes
- CanBeLazy
- def setLogVerbosity(logVerbosity: Int): AutoGGUFVisionModel.this.type
Set the verbosity threshold. Messages with a higher verbosity will be ignored.
Values map to the following:
- GGML_LOG_LEVEL_NONE = 0
- GGML_LOG_LEVEL_DEBUG = 1
- GGML_LOG_LEVEL_INFO = 2
- GGML_LOG_LEVEL_WARN = 3
- GGML_LOG_LEVEL_ERROR = 4
- GGML_LOG_LEVEL_CONT = 5 (continue previous log)
- Definition Classes
- HasLlamaCppModelProperties
- def setMainGpu(mainGpu: Int): AutoGGUFVisionModel.this.type
Set the GPU that is used for scratch and small tensors
- Definition Classes
- HasLlamaCppModelProperties
- def setMetadata(metadata: String): AutoGGUFVisionModel.this.type
Set the metadata for the model
- Definition Classes
- HasLlamaCppModelProperties
- def setMinKeep(minKeep: Int): AutoGGUFVisionModel.this.type
Set the minimum number of tokens the samplers should return (0 = disabled)
- Definition Classes
- HasLlamaCppInferenceProperties
- def setMinP(minP: Float): AutoGGUFVisionModel.this.type
Set min-p sampling
- Definition Classes
- HasLlamaCppInferenceProperties
- def setMiroStat(mirostat: String): AutoGGUFVisionModel.this.type
Set MiroStat sampling strategies.
- DISABLED: No MiroStat
- V1: MiroStat V1
- V2: MiroStat V2
- Definition Classes
- HasLlamaCppInferenceProperties
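The three MiroStat setters work together: the strategy selects the algorithm, while tau and eta tune it. A hedged sketch (the tau and eta values mirror common llama.cpp defaults and are illustrative):

```scala
// Hypothetical sketch: enable MiroStat V2 sampling.
val model = AutoGGUFVisionModel.pretrained()
  .setInputCols("caption_document", "image_assembler")
  .setOutputCol("completions")
  .setMiroStat("V2")
  .setMiroStatTau(5.0f) // target entropy
  .setMiroStatEta(0.1f) // learning rate
```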
- def setMiroStatEta(mirostatEta: Float): AutoGGUFVisionModel.this.type
Set the MiroStat learning rate, parameter eta
- Definition Classes
- HasLlamaCppInferenceProperties
- def setMiroStatTau(mirostatTau: Float): AutoGGUFVisionModel.this.type
Set the MiroStat target entropy, parameter tau
- Definition Classes
- HasLlamaCppInferenceProperties
- def setModelDraft(modelDraft: String): AutoGGUFVisionModel.this.type
Set the draft model for speculative decoding
- Definition Classes
- HasLlamaCppModelProperties
- def setModelIfNotSet(spark: SparkSession, wrapper: GGUFWrapperMultiModal): AutoGGUFVisionModel.this.type
- def setNBatch(nBatch: Int): AutoGGUFVisionModel.this.type
Set the logical batch size for prompt processing (must be >=32 to use BLAS)
- Definition Classes
- HasLlamaCppModelProperties
- def setNCtx(nCtx: Int): AutoGGUFVisionModel.this.type
Set the size of the prompt context
- Definition Classes
- HasLlamaCppModelProperties
- def setNDraft(nDraft: Int): AutoGGUFVisionModel.this.type
Set the number of tokens to draft for speculative decoding
- Definition Classes
- HasLlamaCppModelProperties
- def setNGpuLayers(nGpuLayers: Int): AutoGGUFVisionModel.this.type
Set the number of layers to store in VRAM (-1 = use default)
- Definition Classes
- HasLlamaCppModelProperties
- def setNGpuLayersDraft(nGpuLayersDraft: Int): AutoGGUFVisionModel.this.type
Set the number of layers to store in VRAM for the draft model (-1 = use default)
- Definition Classes
- HasLlamaCppModelProperties
- def setNKeep(nKeep: Int): AutoGGUFVisionModel.this.type
Set the number of tokens to keep from the initial prompt
- Definition Classes
- HasLlamaCppInferenceProperties
- def setNParallel(nParallel: Int): AutoGGUFVisionModel.this.type
Sets the number of parallel processes for decoding. This is an alias for setBatchSize.
- nParallel
The number of parallel processes for decoding
- def setNPredict(nPredict: Int): AutoGGUFVisionModel.this.type
Set the number of tokens to predict
- Definition Classes
- HasLlamaCppInferenceProperties
- def setNProbs(nProbs: Int): AutoGGUFVisionModel.this.type
Set the number of top-token probabilities to output, if greater than 0.
- Definition Classes
- HasLlamaCppInferenceProperties
- def setNThreads(nThreads: Int): AutoGGUFVisionModel.this.type
Set the number of threads to use during generation
- Definition Classes
- HasLlamaCppModelProperties
- def setNThreadsBatch(nThreadsBatch: Int): AutoGGUFVisionModel.this.type
Set the number of threads to use during batch and prompt processing
- Definition Classes
- HasLlamaCppModelProperties
- def setNUbatch(nUbatch: Int): AutoGGUFVisionModel.this.type
Set the physical batch size for prompt processing (must be >=32 to use BLAS)
- Definition Classes
- HasLlamaCppModelProperties
- def setNoKvOffload(noKvOffload: Boolean): AutoGGUFVisionModel.this.type
Whether to disable KV offload
- Definition Classes
- HasLlamaCppModelProperties
- def setNumaStrategy(numa: String): AutoGGUFVisionModel.this.type
Set optimization strategies that help on some NUMA systems (if available)
Available Strategies:
- DISABLED: No NUMA optimizations
- DISTRIBUTE: Spread execution evenly over all nodes
- ISOLATE: Only spawn threads on CPUs on the node that execution started on
- NUMA_CTL: Use the CPU map provided by numactl
- MIRROR: Mirror the model across NUMA nodes
- Definition Classes
- HasLlamaCppModelProperties
- final def setOutputCol(value: String): AutoGGUFVisionModel.this.type
Overrides annotation column name when transforming
- Definition Classes
- HasOutputAnnotationCol
- def setParent(parent: Estimator[AutoGGUFVisionModel]): AutoGGUFVisionModel
- Definition Classes
- Model
- def setPenalizeNl(penalizeNl: Boolean): AutoGGUFVisionModel.this.type
Set whether to penalize newline tokens
- Definition Classes
- HasLlamaCppInferenceProperties
- def setPenaltyPrompt(penaltyPrompt: String): AutoGGUFVisionModel.this.type
Override which part of the prompt is penalized for repetition.
- Definition Classes
- HasLlamaCppInferenceProperties
- def setPresencePenalty(presencePenalty: Float): AutoGGUFVisionModel.this.type
Set the repetition alpha presence penalty
- Definition Classes
- HasLlamaCppInferenceProperties
- def setRemoveThinkingTag(value: String): AutoGGUFVisionModel.this.type
Set a thinking tag (e.g. think) to be removed from the output. Will produce the regex (?s)<$TAG>.+?</$TAG>
- Definition Classes
- CompletionPostProcessing
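The removal can be reproduced in plain Scala to see what the generated regex matches (the tag name and input string are illustrative):

```scala
// Sketch of the pattern produced for a thinking tag of "think".
// (?s) makes '.' match newlines, so multi-line thinking blocks are removed;
// .+? is non-greedy, so each tagged block is matched separately.
val tag = "think"
val pattern = s"(?s)<$tag>.+?</$tag>".r
val raw = "<think>step 1...\nstep 2...</think>The final answer."
val cleaned = pattern.replaceAllIn(raw, "")
// cleaned == "The final answer."
```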
- def setRepeatLastN(repeatLastN: Int): AutoGGUFVisionModel.this.type
Set the last n tokens to consider for penalties
- Definition Classes
- HasLlamaCppInferenceProperties
- def setRepeatPenalty(repeatPenalty: Float): AutoGGUFVisionModel.this.type
Set the penalty of repeated sequences of tokens
- Definition Classes
- HasLlamaCppInferenceProperties
- def setRopeFreqBase(ropeFreqBase: Float): AutoGGUFVisionModel.this.type
Set the RoPE base frequency, used by NTK-aware scaling
- Definition Classes
- HasLlamaCppModelProperties
- def setRopeFreqScale(ropeFreqScale: Float): AutoGGUFVisionModel.this.type
Set the RoPE frequency scaling factor, expands context by a factor of 1/N
- Definition Classes
- HasLlamaCppModelProperties
- def setRopeScalingType(ropeScalingType: String): AutoGGUFVisionModel.this.type
Set the RoPE frequency scaling method, defaults to linear unless specified by the model.
- NONE: Don't use any scaling
- LINEAR: Linear scaling
- YARN: YaRN RoPE scaling
- Definition Classes
- HasLlamaCppModelProperties
- def setSamplers(samplers: Array[String]): AutoGGUFVisionModel.this.type
Set which samplers to use for token generation, in the given order.
Available Samplers are:
- TOP_K: Top-k sampling
- TFS_Z: Tail free sampling
- TYPICAL_P: Locally typical sampling p
- TOP_P: Top-p sampling
- MIN_P: Min-p sampling
- TEMPERATURE: Temperature sampling
- Definition Classes
- HasLlamaCppInferenceProperties
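Sampler order matters, since each sampler filters the candidate token set before the next one runs. A sketch (the order and parameter values are illustrative):

```scala
// Hypothetical sketch: apply top-k first, then top-p, then temperature.
val model = AutoGGUFVisionModel.pretrained()
  .setInputCols("caption_document", "image_assembler")
  .setOutputCol("completions")
  .setSamplers(Array("TOP_K", "TOP_P", "TEMPERATURE"))
  .setTopK(40)
  .setTopP(0.95f)
  .setTemperature(0.7f)
```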
- def setSeed(seed: Int): AutoGGUFVisionModel.this.type
Set the RNG seed
- Definition Classes
- HasLlamaCppInferenceProperties
- def setStopStrings(stopStrings: Array[String]): AutoGGUFVisionModel.this.type
Set the strings that stop token generation when encountered
- Definition Classes
- HasLlamaCppInferenceProperties
- def setSystemPrompt(systemPrompt: String): AutoGGUFVisionModel.this.type
Set a system prompt to use
- Definition Classes
- HasLlamaCppModelProperties
- def setTemperature(temperature: Float): AutoGGUFVisionModel.this.type
Set the temperature
- Definition Classes
- HasLlamaCppInferenceProperties
- def setTfsZ(tfsZ: Float): AutoGGUFVisionModel.this.type
Set tail free sampling, parameter z
- Definition Classes
- HasLlamaCppInferenceProperties
- def setTokenBias(tokenBias: HashMap[String, Double]): AutoGGUFVisionModel.this.type
Set the tokens to disable during completion. (Override for PySpark)
- Definition Classes
- HasLlamaCppInferenceProperties
- def setTokenBias(tokenBias: Map[String, Float]): AutoGGUFVisionModel.this.type
Set the tokens to disable during completion.
- Definition Classes
- HasLlamaCppInferenceProperties
- def setTokenIdBias(tokenIdBias: HashMap[Integer, Double]): AutoGGUFVisionModel.this.type
Set the token ids to disable in the completion. (Override for PySpark)
- Definition Classes
- HasLlamaCppInferenceProperties
- def setTokenIdBias(tokenIdBias: Map[Int, Float]): AutoGGUFVisionModel.this.type
Set the token ids to disable in the completion.
- Definition Classes
- HasLlamaCppInferenceProperties
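The two overload families accept either token strings (setTokenBias) or token ids (setTokenIdBias); per the setDisableTokenIds note above, a bias of Float.NEGATIVE_INFINITY disables a token entirely. A sketch (the token string and id are hypothetical placeholders):

```scala
// Hypothetical sketch: bias or disable specific tokens.
val model = AutoGGUFVisionModel.pretrained()
  .setInputCols("caption_document", "image_assembler")
  .setOutputCol("completions")
  .setTokenBias(Map("User:" -> -5.0f))                 // discourage a token
  .setTokenIdBias(Map(2 -> Float.NegativeInfinity))    // disable a token id
```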
- def setTopK(topK: Int): AutoGGUFVisionModel.this.type
Set top-k sampling
- Definition Classes
- HasLlamaCppInferenceProperties
- def setTopP(topP: Float): AutoGGUFVisionModel.this.type
Set top-p sampling
- Definition Classes
- HasLlamaCppInferenceProperties
- def setTypicalP(typicalP: Float): AutoGGUFVisionModel.this.type
Set locally typical sampling, parameter p
- Definition Classes
- HasLlamaCppInferenceProperties
- def setUseChatTemplate(useChatTemplate: Boolean): AutoGGUFVisionModel.this.type
Set whether generation should apply a chat template
- Definition Classes
- HasLlamaCppInferenceProperties
- def setUseMlock(useMlock: Boolean): AutoGGUFVisionModel.this.type
Whether to force the system to keep the model in RAM rather than swapping or compressing it
- Definition Classes
- HasLlamaCppModelProperties
- def setUseMmap(useMmap: Boolean): AutoGGUFVisionModel.this.type
Whether to memory-map the model (faster load, but may increase pageouts if not using mlock)
- Definition Classes
- HasLlamaCppModelProperties
- def setYarnAttnFactor(yarnAttnFactor: Float): AutoGGUFVisionModel.this.type
Set the YaRN scale sqrt(t) or attention magnitude
- Definition Classes
- HasLlamaCppModelProperties
- def setYarnBetaFast(yarnBetaFast: Float): AutoGGUFVisionModel.this.type
Set the YaRN low correction dim or beta
- Definition Classes
- HasLlamaCppModelProperties
- def setYarnBetaSlow(yarnBetaSlow: Float): AutoGGUFVisionModel.this.type
Set the YaRN high correction dim or alpha
- Definition Classes
- HasLlamaCppModelProperties
- def setYarnExtFactor(yarnExtFactor: Float): AutoGGUFVisionModel.this.type
Set the YaRN extrapolation mix factor
- Definition Classes
- HasLlamaCppModelProperties
- def setYarnOrigCtx(yarnOrigCtx: Int): AutoGGUFVisionModel.this.type
Set the YaRN original context size of the model
- Definition Classes
- HasLlamaCppModelProperties
- val stopStrings: StringArrayParam
- Definition Classes
- HasLlamaCppInferenceProperties
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- val systemPrompt: Param[String]
- Definition Classes
- HasLlamaCppModelProperties
- val temperature: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val tfsZ: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- def toString(): String
- Definition Classes
- Identifiable → AnyRef → Any
- val tokenBias: StructFeature[Map[String, Float]]
- Definition Classes
- HasLlamaCppInferenceProperties
- val tokenIdBias: StructFeature[Map[Int, Float]]
- Definition Classes
- HasLlamaCppInferenceProperties
- val topK: IntParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val topP: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- final def transform(dataset: Dataset[_]): DataFrame
Given requirements are met, this applies the ML transformation within a Pipeline or standalone. The output annotation will be generated as a new column; previous annotations remain available separately. Metadata is built at schema level to record the annotations' structural information outside their content.
- dataset
Dataset[Row]
- Definition Classes
- AnnotatorModel → Transformer
- def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
- Definition Classes
- Transformer
- Annotations
- @Since("2.0.0")
- def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
- Definition Classes
- Transformer
- Annotations
- @varargs() @Since("2.0.0")
- final def transformSchema(schema: StructType): StructType
Requirement for pipeline transformation validation. It is called on fit().
- Definition Classes
- RawAnnotator → PipelineStage
- def transformSchema(schema: StructType, logging: Boolean): StructType
- Attributes
- protected
- Definition Classes
- PipelineStage
- Annotations
- @DeveloperApi()
- val typicalP: FloatParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val uid: String
- Definition Classes
- AutoGGUFVisionModel → Identifiable
- val useChatTemplate: BooleanParam
- Definition Classes
- HasLlamaCppInferenceProperties
- val useMlock: BooleanParam
- Definition Classes
- HasLlamaCppModelProperties
- val useMmap: BooleanParam
- Definition Classes
- HasLlamaCppModelProperties
- def validate(schema: StructType): Boolean
Takes a Dataset and checks whether all the required annotation types are present.
- schema
to be validated
- returns
True if all the required types are present, else false
- Attributes
- protected
- Definition Classes
- RawAnnotator
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- def wrapColumnMetadata(col: Column): Column
- Attributes
- protected
- Definition Classes
- RawAnnotator
- def write: MLWriter
- Definition Classes
- ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable
- val yarnAttnFactor: FloatParam
- Definition Classes
- HasLlamaCppModelProperties
- val yarnBetaFast: FloatParam
- Definition Classes
- HasLlamaCppModelProperties
- val yarnBetaSlow: FloatParam
- Definition Classes
- HasLlamaCppModelProperties
- val yarnExtFactor: FloatParam
- Definition Classes
- HasLlamaCppModelProperties
- val yarnOrigCtx: IntParam
- Definition Classes
- HasLlamaCppModelProperties
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)
Inherited from CompletionPostProcessing
Inherited from HasProtectedParams
Inherited from HasLlamaCppInferenceProperties
Inherited from HasLlamaCppModelProperties
Inherited from HasEngine
Inherited from HasBatchedAnnotateTextImage[AutoGGUFVisionModel]
Inherited from AnnotatorModel[AutoGGUFVisionModel]
Inherited from CanBeLazy
Inherited from RawAnnotator[AutoGGUFVisionModel]
Inherited from HasOutputAnnotationCol
Inherited from HasInputAnnotationCols
Inherited from HasOutputAnnotatorType
Inherited from ParamsAndFeaturesWritable
Inherited from HasFeatures
Inherited from DefaultParamsWritable
Inherited from MLWritable
Inherited from Model[AutoGGUFVisionModel]
Inherited from Transformer
Inherited from PipelineStage
Inherited from Logging
Inherited from Params
Inherited from Serializable
Inherited from Identifiable
Inherited from AnyRef
Inherited from Any
Parameters
A list of (hyper-)parameter keys this annotator can take. Users can set and get the parameter values through setters and getters, respectively.