Plugins
DateMatcher
This plugin recognizes the following kinds of date and time expressions and converts them into a user-defined output format:
- 1978-01-28
- 1984/04/02
- 1/02/1980
- 2/28/79
- The 31st of April in the year 2008
- Fri, 21 Nov 1997
- Jan 21, '97
- Sun, Nov 21
- jan 1st
- next thursday
- last wednesday
- today
- tomorrow
- yesterday
- next week
- next month
- next year
- day after
- the day before
- 0600h
- 06:00 hours
- 6pm
- 5:30 a.m.
- at 5
- 12:59
- 23:59
- 1988/11/23 6pm
- next week at 7.30
- 5 am tomorrow
@Plugin(type = SparkCompute.PLUGIN_TYPE)
@Name("DateMatcher")
@Description("A transformation stage that reads different forms of date and time expressions "
+ "and converts them to a provided date format. This stage transforms each text document "
+ "into a list of sentences where each detected date and time expression is replaced by "
+ "the provided format. As an alternative, the list of detected date and time expressions "
+ "is returned.")
public class DateMatcher extends TextCompute {
...
}
Parameters
Text Field | The name of the field in the input schema that contains the text document. |
Date Field | The name of the field in the output schema that contains the text matches. |
Date Format | The expected output date format. Default is 'yyyy/MM/dd'. |
Output Option | An option to determine how to format the output of the date matcher. Supported values are 'extract' and 'replace'. Default is 'replace'. |
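To illustrate how the Date Format and Output Option parameters interact, here is a minimal sketch in plain Java. It is not the plugin's implementation: the class `DateFormatSketch` and its helper methods are hypothetical, the date is assumed to be already detected, and the sketch assumes the format string uses Java-style pattern letters, as the default 'yyyy/MM/dd' suggests.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;

public class DateFormatSketch {

    // Hypothetical helper: renders a detected date with the configured 'Date Format' pattern.
    static String render(LocalDate date, String dateFormat) {
        return date.format(DateTimeFormatter.ofPattern(dateFormat));
    }

    // 'replace' (assumed semantics): the detected expression is swapped for the formatted date.
    static String replace(String sentence, String expression, LocalDate date, String dateFormat) {
        return sentence.replace(expression, render(date, dateFormat));
    }

    // 'extract' (assumed semantics): only the formatted matches are returned.
    static List<String> extract(LocalDate date, String dateFormat) {
        return Arrays.asList(render(date, dateFormat));
    }

    public static void main(String[] args) {
        String sentence = "The invoice was issued on 1978-01-28.";
        LocalDate detected = LocalDate.of(1978, 1, 28); // assumed result of the date detection step

        System.out.println(replace(sentence, "1978-01-28", detected, "yyyy/MM/dd"));
        System.out.println(extract(detected, "yyyy/MM/dd"));
    }
}
```

With 'replace' the sentence keeps its shape and only the date expression changes; with 'extract' the surrounding text is dropped and only the formatted dates remain.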
PhraseMatcher
@Plugin(type = SparkCompute.PLUGIN_TYPE)
@Name("PhraseMatcher")
@Description("A transformation stage that leverages the Spark NLP Text Matcher to detect provided "
+ "phrases in the input text document.")
public class PhraseMatcher extends TextCompute {
...
}
Parameters
Text Field | The name of the field in the input schema that contains the text document. |
Phrase Field | The name of the field in the output schema that contains the text matches. |
Phrases | A delimiter-separated list of text phrases. |
Phrase Delimiter | The delimiter used to separate the different text phrases. |
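The relationship between the Phrases and Phrase Delimiter parameters can be sketched as follows. This is an illustration only: the class `PhraseListSketch` and its methods are hypothetical, and the lookup uses a plain substring test rather than the Spark NLP Text Matcher the plugin relies on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class PhraseListSketch {

    // Splits the 'Phrases' value on the 'Phrase Delimiter'; the delimiter is quoted
    // so that characters such as '|' are treated literally rather than as regex syntax.
    static List<String> parsePhrases(String phrases, String delimiter) {
        List<String> result = new ArrayList<>();
        for (String phrase : phrases.split(Pattern.quote(delimiter))) {
            String trimmed = phrase.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    // Returns the phrases that occur verbatim in the text (simple stand-in for the real matcher).
    static List<String> findPhrases(String text, List<String> phrases) {
        List<String> matches = new ArrayList<>();
        for (String phrase : phrases) {
            if (text.contains(phrase)) {
                matches.add(phrase);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> phrases = parsePhrases("data quality|text document|machine learning", "|");
        String text = "Each text document is checked for data quality.";
        System.out.println(findPhrases(text, phrases));
    }
}
```

Choose a delimiter that cannot occur inside any phrase, since the list is split on every occurrence of it.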
RegexMatcher
@Plugin(type = SparkCompute.PLUGIN_TYPE)
@Name("RegexMatcher")
@Description("A transformation stage that leverages the Spark NLP Regex Matcher to detect provided "
+ "Regex rules in the input text document.")
public class RegexMatcher extends TextCompute {
...
}
Parameters
Text Field | The name of the field in the input schema that contains the text document. |
Regex Field | The name of the field in the output schema that contains the text matches. |
Regex Rules | A delimiter-separated list of Regex rules. |
Rule Delimiter | The delimiter used to separate the different Regex rules. |
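As a rough illustration of how the Regex Rules and Rule Delimiter parameters fit together, the sketch below splits a rule list and applies each rule with `java.util.regex`. The class `RegexRulesSketch` and its methods are hypothetical, each rule is assumed to be a bare regular expression, and the Spark NLP Regex Matcher used by the plugin may interpret rules differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexRulesSketch {

    // Splits the 'Regex Rules' value on the 'Rule Delimiter' and compiles each rule.
    static List<Pattern> parseRules(String rules, String delimiter) {
        List<Pattern> compiled = new ArrayList<>();
        for (String rule : rules.split(Pattern.quote(delimiter))) {
            if (!rule.trim().isEmpty()) {
                compiled.add(Pattern.compile(rule.trim()));
            }
        }
        return compiled;
    }

    // Collects every match of every rule, grouped in rule order.
    static List<String> findMatches(String text, List<Pattern> rules) {
        List<String> matches = new ArrayList<>();
        for (Pattern rule : rules) {
            Matcher matcher = rule.matcher(text);
            while (matcher.find()) {
                matches.add(matcher.group());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Two assumed rules: an ISO-style date and a two-letter, three-digit code.
        List<Pattern> rules = parseRules("\\d{4}-\\d{2}-\\d{2};[A-Z]{2}\\d{3}", ";");
        System.out.println(findMatches("Order AB123 shipped on 2020-05-17.", rules));
    }
}
```

The delimiter should be a character that does not appear inside any rule; a comma, for example, would split a quantifier such as `{2,3}` in two.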