Syntactic parsing analyzes the words of a sentence for their grammatical roles and arranges them so that the relationships among the words become explicit.

Dependency grammar and part-of-speech tags are the most important attributes of text syntax.
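
As a rough illustration (not taken from the plugin code), an unlabeled dependency parse can be stored as one head index per token, next to the token's part-of-speech tag. The Token class, the example sentence and its tags are assumptions made for this sketch only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DependencySketch {

    /** One token of a parsed sentence (hypothetical type, for illustration only). */
    static final class Token {
        final int index;   // 1-based position in the sentence
        final String form; // surface form of the word
        final String pos;  // part-of-speech tag
        final int head;    // index of the governing token; 0 marks the root

        Token(int index, String form, String pos, int head) {
            this.index = index;
            this.form = form;
            this.pos = pos;
            this.head = head;
        }
    }

    /** Renders each unlabeled dependency as "head -> dependent". */
    static List<String> edges(List<Token> sentence) {
        List<String> result = new ArrayList<>();
        for (Token token : sentence) {
            String head = token.head == 0
                ? "ROOT"
                : sentence.get(token.head - 1).form;
            result.add(head + " -> " + token.form);
        }
        return result;
    }

    /** Hand-annotated example sentence (assumed, not produced by a model). */
    static List<Token> example() {
        return Arrays.asList(
            new Token(1, "She", "PRON", 2),
            new Token(2, "reads", "VERB", 0),
            new Token(3, "books", "NOUN", 2));
    }

    public static void main(String[] args) {
        for (String edge : edges(example())) {
            System.out.println(edge);
        }
    }
}
```

The parse is "unlabeled" because only the head of each word is recorded, not the type of the relation (subject, object, and so on).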

Corpus

Coming soon.

Plugins

DependencyBuilder

@Plugin(type = SparkSink.PLUGIN_TYPE)
@Name("DependencyBuilder")
@Description("A building stage for an Apache Spark-NLP based Unlabeled Dependency Parser model.")
public class DependencyBuilder extends TextSink {

    ...

}

Parameters

Model Name: The unique name of the Unlabeled Dependency Parser model.
Corpus Field: The name of the field in the input schema that contains the annotated corpus document.
Corpus Format: The format of the training corpus. Supported values are 'conll-u' (CoNLL-U corpus) and 'treebank' (TreeBank corpus). Default is 'conll-u'.

Model Configuration

Iterations: The number of training iterations. Default is 10.
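
For readers unfamiliar with the 'conll-u' corpus format: a CoNLL-U token line carries ten tab-separated fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), and lines starting with '#' are comments. The following is a minimal sketch of reading such lines. It is not how the plugin or Spark-NLP reads the corpus; the class name, method names and sample sentence are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ConlluSketch {

    /** True for lines that describe a token, false for comments and blank lines. */
    static boolean isTokenLine(String line) {
        return !line.trim().isEmpty() && !line.startsWith("#");
    }

    /** Splits a token line into its ten CoNLL-U fields. */
    static String[] parseTokenLine(String line) {
        // limit -1 keeps trailing empty fields instead of dropping them
        String[] fields = line.split("\t", -1);
        if (fields.length != 10) {
            throw new IllegalArgumentException(
                "Expected 10 fields but found " + fields.length);
        }
        return fields;
    }

    /** Summarizes each token as "FORM/UPOS head=HEAD". */
    static List<String> summarize(String document) {
        List<String> result = new ArrayList<>();
        for (String line : document.split("\n")) {
            if (!isTokenLine(line)) {
                continue;
            }
            String[] f = parseTokenLine(line);
            // f[1] = FORM, f[3] = UPOS, f[6] = HEAD
            result.add(f[1] + "/" + f[3] + " head=" + f[6]);
        }
        return result;
    }

    /** Hand-written sample sentence (assumed, not from a real treebank). */
    static String sample() {
        return "# text = She reads books\n"
            + "1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
            + "2\treads\tread\tVERB\t_\t_\t0\troot\t_\t_\n"
            + "3\tbooks\tbook\tNOUN\t_\t_\t2\tobj\t_\t_\n";
    }

    public static void main(String[] args) {
        for (String token : summarize(sample())) {
            System.out.println(token);
        }
    }
}
```

An unlabeled parser needs only the HEAD column of such a corpus; the DEPREL column, which names the relation type, is not required for training.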

DependencyParser

@Plugin(type = SparkCompute.PLUGIN_TYPE)
@Name("DependencyParser")
@Description("A transformation stage that leverages an Unlabeled Dependency Parser model "
  + "to extract syntactic relations between words in a text document.")
public class DependencyParser extends TextCompute {

    ...

}

Parameters

Model Name: The unique name of the Unlabeled Dependency Parser model.
Part of Speech Name: The unique name of the Part of Speech model.
Text Field: The name of the field in the input schema that contains the text document.
Sentence Field: The name of the field in the output schema that contains the extracted sentences.
Dependency Field: The name of the field in the output schema that contains the word dependencies.