Trains an unlabeled parser that finds a grammatical relation between two words in a sentence.
Unlabeled parser that finds a grammatical relation between two words in a sentence.
A dependency parser provides information about the relationships between words. For example, dependency parsing can tell you what the subjects and objects of a verb are, as well as which words modify (describe) the subject. This can help you find precise answers to specific questions.
This is the instantiated model of the DependencyParserApproach. For training your own model, please see the documentation of that class.
Pretrained models can be loaded with the pretrained method of the companion object:
val dependencyParser = DependencyParserModel.pretrained()
  .setInputCols("sentence", "pos", "token")
  .setOutputCol("dependency")
The default model is "dependency_conllu" if no name is provided.
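The default can also be spelled out explicitly. The fragment below is a minimal sketch, assuming the English variant of "dependency_conllu" is wanted (the language code "en" is an assumption, not stated in this documentation), and assuming an active Spark session with Spark NLP on the classpath and network access to download the model.

```scala
import com.johnsnowlabs.nlp.annotators.parser.dep.DependencyParserModel

// Load the model by name and language instead of relying on the default.
// "en" is an assumed language code; check the Models Hub for other languages.
val dependencyParser = DependencyParserModel.pretrained("dependency_conllu", "en")
  .setInputCols("sentence", "pos", "token")
  .setOutputCol("dependency")
```

Naming the model explicitly keeps a pipeline reproducible if the default ever changes.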
For available pretrained models please see the Models Hub.
For extended examples of usage, see the Spark NLP Workshop and the DependencyParserApproachTestSpec.
import spark.implicits._
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.annotators.parser.dep.DependencyParserModel
import com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronModel
import com.johnsnowlabs.nlp.annotators.sbd.pragmatic.SentenceDetector
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val sentence = new SentenceDetector()
  .setInputCols("document")
  .setOutputCol("sentence")

val tokenizer = new Tokenizer()
  .setInputCols("sentence")
  .setOutputCol("token")

val posTagger = PerceptronModel.pretrained()
  .setInputCols("sentence", "token")
  .setOutputCol("pos")

val dependencyParser = DependencyParserModel.pretrained()
  .setInputCols("sentence", "pos", "token")
  .setOutputCol("dependency")

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  sentence,
  tokenizer,
  posTagger,
  dependencyParser
))

val data = Seq(
  "Unions representing workers at Turner Newall say they are 'disappointed' after talks with stricken parent " +
    "firm Federal Mogul."
).toDF("text")
val result = pipeline.fit(data).transform(data)

result.selectExpr("explode(arrays_zip(token.result, dependency.result)) as cols")
  .selectExpr("cols['0'] as token", "cols['1'] as dependency").show(8, truncate = false)

+------------+------------+
|token       |dependency  |
+------------+------------+
|Unions      |ROOT        |
|representing|workers     |
|workers     |Unions      |
|at          |Turner      |
|Turner      |workers     |
|Newall      |say         |
|say         |Unions      |
|they        |disappointed|
+------------+------------+
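The output pairs each token with another word, which the table suggests is the token's head (with ROOT marking the root of the sentence). The following is a minimal sketch in plain Scala, without Spark, of how such pairs might be regrouped to list the dependents of each head. The object name DependencyHeads is hypothetical, the pairs are copied from the eight rows shown, and matching heads by word form is a simplification that would be ambiguous if a word occurred twice in the sentence.

```scala
// Hypothetical helper for reading (token, head) pairs; not part of Spark NLP.
object DependencyHeads {
  // The eight (token, head) rows from the example output.
  val pairs: Seq[(String, String)] = Seq(
    "Unions" -> "ROOT",
    "representing" -> "workers",
    "workers" -> "Unions",
    "at" -> "Turner",
    "Turner" -> "workers",
    "Newall" -> "say",
    "say" -> "Unions",
    "they" -> "disappointed"
  )

  // The root token is the one whose head is the ROOT marker.
  val root: Option[String] = pairs.collectFirst { case (token, "ROOT") => token }

  // Group the dependents under each head; order within a group follows the sentence.
  val dependentsByHead: Map[String, Seq[String]] =
    pairs.groupBy(_._2).map { case (head, group) => head -> group.map(_._1) }

  def main(args: Array[String]): Unit = {
    println(s"root: ${root.getOrElse("<none>")}")
    dependentsByHead.foreach { case (head, deps) =>
      println(s"$head <- ${deps.mkString(", ")}")
    }
  }
}
```

Because the parser is unlabeled, this only recovers which word attaches to which; the kind of relation (subject, object, modifier) requires the typed dependency parser.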
See also TypedDependencyParserModel, which extracts labels for the dependencies.
This is the companion object of DependencyParserApproach. Please refer to that class for the documentation.
This is the companion object of DependencyParserModel. Please refer to that class for the documentation.
Trains an unlabeled parser that finds a grammatical relation between two words in a sentence.
For instantiated/pretrained models, see DependencyParserModel.
A dependency parser provides information about the relationships between words. For example, dependency parsing can tell you what the subjects and objects of a verb are, as well as which words modify (describe) the subject. This can help you find precise answers to specific questions.
The required training data can be set in one of two ways (only one can be chosen for a particular model):
setDependencyTreeBank, for a dependency treebank in the Penn Treebank format
setConllU, for a dataset in the CoNLL-U format
Apart from that, no additional training data is needed.
See DependencyParserApproachTestSpec for further reference on how to use this API.
Example
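A minimal training sketch, assuming Spark NLP is on the classpath and the pretrained part-of-speech model can be downloaded. The object name and the treebank path are placeholders, not values confirmed by this documentation. Since the parser learns only from the treebank (or CoNLL-U file), the pipeline is fitted on an empty dataset.

```scala
import com.johnsnowlabs.nlp.SparkNLP
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.annotators.parser.dep.DependencyParserApproach
import com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronModel
import com.johnsnowlabs.nlp.annotators.sbd.pragmatic.SentenceDetector
import org.apache.spark.ml.Pipeline

// Hypothetical entry point, used here only to make the sketch self-contained.
object TrainDependencyParser {
  def main(args: Array[String]): Unit = {
    val spark = SparkNLP.start()
    import spark.implicits._

    val documentAssembler = new DocumentAssembler()
      .setInputCol("text")
      .setOutputCol("document")

    val sentence = new SentenceDetector()
      .setInputCols("document")
      .setOutputCol("sentence")

    val tokenizer = new Tokenizer()
      .setInputCols("sentence")
      .setOutputCol("token")

    val posTagger = PerceptronModel.pretrained()
      .setInputCols("sentence", "token")
      .setOutputCol("pos")

    // Choose exactly one training source: setDependencyTreeBank or setConllU.
    // The path below is a placeholder; point it at your own treebank.
    val dependencyParserApproach = new DependencyParserApproach()
      .setInputCols("sentence", "pos", "token")
      .setOutputCol("dependency")
      .setDependencyTreeBank("path/to/dependency_treebank")

    val pipeline = new Pipeline().setStages(Array(
      documentAssembler,
      sentence,
      tokenizer,
      posTagger,
      dependencyParserApproach
    ))

    // No additional training data is needed, so an empty dataset is enough to fit.
    val emptyDataSet = Seq.empty[String].toDF("text")
    val pipelineModel = pipeline.fit(emptyDataSet)
  }
}
```

To train from CoNLL-U data instead, the setDependencyTreeBank call would be swapped for setConllU with the path to a CoNLL-U file.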
See also TypedDependencyParserApproach, which extracts labels for the dependencies.