What is in it?
PredictiveWorks. supports state-of-the-art big data pipelines built on top of Apache Spark and Google CDAP. The seamless integration of these open-source giants is a solid foundation for standardization and unification.
As a result, all important business questions along Gartner’s Analytic Continuum can be answered in the same way.
Connector Plugins
PredictiveWorks. offers data integration plugins called connectors. Connectors are built to access data stores, streaming platforms, cloud services and SaaS applications.
Users who want to read or write data for any use case are no longer hampered by the plethora of different APIs, data formats and technologies that current data sources and destinations come with.
Connectors are organized in three plugin categories:
Built-in Connectors
Google CDAP ships with a wide variety of connector plugins for popular data stores, streaming platforms, cloud services and SaaS applications. These plugins offer easy-to-use start and end points for data pipelines.
Graph Connector
Connected or linked data represent an important part of the data spectrum. Graph analytics is the instrument of choice to analyze linked data, and graph databases the tool of choice to persist them.
PredictiveGraph. is a distributed in-memory graph database, and part of the portfolio of PredictiveWorks.
This connector makes PredictiveGraph. available as a pluggable data destination for any data pipeline.
Purpose-built Connector
PredictiveWorks. complements Google CDAP’s built-in connectors with plugins that address the demands of specific business areas or verticals, such as Cyber Defense and the Internet of Things.
Analytics Plugins
PredictiveWorks. organizes advanced data analytics in five plugin categories:
Deep Learning
PredictiveWorks. externalizes Intel's BigDL deep learning library as standardized analytics plugins that can be combined seamlessly with any other plugin.
Neural networks of any flavor can be used as pipeline components via point-and-click selection.
Deep learning plugins are organized as the WorksDL package.
Machine Learning
PredictiveWorks. externalizes Apache Spark ML machine learning as standardized analytics plugins that can be combined seamlessly with any other plugin.
Feature engineering, classification, regression and more can be used as pipeline components via point-and-click selection.
Machine learning plugins are organized as the WorksML package.
Natural Language
PredictiveWorks. supports standardized Natural Language Processing by externalizing John Snow Labs' Spark NLP library as pipeline plugins.
Dependency parsing, part-of-speech tagging, named entity recognition, sentiment analysis, word embeddings and more can be used as pipeline components and combined seamlessly with any other plugin.
Natural language plugins are organized as the WorksText package.
Queries & Rules
SQL queries for data discovery and business rules for condition matching are still important components of the data analytics spectrum.
PredictiveWorks. externalizes the query functionality of Apache Spark SQL for ad-hoc aggregation, grouping and filtering of both data in motion and historical data at rest. This plugin is organized as the WorksSQL package.
The complex event processing capability of PredictiveWorks. is complemented by the Drools business rule engine. This plugin is organized as the WorksRules package.
SQL-based data discovery and business rule execution can be combined with any other available plugin through a point-and-click interface.
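To make the combination of query and rule concrete, here is a minimal sketch of the kind of aggregation, grouping and filtering query described above, followed by a simple condition match. It is an illustration under stated assumptions, not the WorksSQL or WorksRules plugin API: Python's built-in `sqlite3` module stands in for Apache Spark SQL so the example runs without a cluster, the `events` table and its columns are hypothetical, and a plain Python condition stands in for a Drools rule.

```python
import sqlite3

# Assumption: sqlite3 is only a stand-in for Spark SQL; the table and
# column names below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (device TEXT, severity INTEGER, payload_bytes INTEGER)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("sensor-a", 1, 100),
        ("sensor-a", 4, 250),
        ("sensor-b", 3, 400),
        ("sensor-b", 5, 600),
        ("sensor-c", 2, 50),
    ],
)

# Filtering (WHERE), grouping (GROUP BY) and aggregation (COUNT, SUM).
query = """
    SELECT device, COUNT(*) AS alerts, SUM(payload_bytes) AS total_bytes
    FROM events
    WHERE severity >= 3
    GROUP BY device
    ORDER BY device
"""
rows = conn.execute(query).fetchall()

# Stand-in for a business rule: match devices whose aggregated values
# satisfy a condition. A real rule engine would express this declaratively.
flagged = [
    device
    for device, alerts, total_bytes in rows
    if alerts >= 2 and total_bytes > 500
]

print(rows)
print(flagged)
```

The query result feeds directly into the condition check, which mirrors how a query stage and a rule stage could be chained in one pipeline.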
Time Series
PredictiveWorks. supports standardized time series forecasting and prediction by externalizing Dr. Krusche & Partner's Spark Time library as pipeline plugins.
Time series aggregation, interpolation and resampling can be combined with seasonal and trend decomposition, regression and forecasting, or with any other plugin.
Time series plugins are organized as the WorksTS package.
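As a rough illustration of what resampling, aggregation and interpolation mean for a time series, the following sketch buckets irregular observations into hourly means and fills empty buckets by linear interpolation. It uses only the Python standard library and does not reflect the Spark Time library's API; the function names `resample_mean` and `interpolate_linear` and the sample data are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean


def resample_mean(points, step):
    """Aggregate (timestamp, value) pairs into fixed-width buckets by mean.

    Buckets without any observation are returned with the value None.
    """
    points = sorted(points)
    start, end = points[0][0], points[-1][0]
    buckets = {}
    for ts, value in points:
        index = (ts - start) // step  # timedelta // timedelta yields an int
        buckets.setdefault(index, []).append(value)
    count = (end - start) // step + 1
    return [
        (start + i * step, mean(buckets[i]) if i in buckets else None)
        for i in range(count)
    ]


def interpolate_linear(series):
    """Fill None gaps by linear interpolation between known neighbours."""
    values = [v for _, v in series]
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            values[i] = values[left] + weight * (values[right] - values[left])
    return [(ts, v) for (ts, _), v in zip(series, values)]


# Hypothetical sample: two readings in the first hour, one three hours later.
t0 = datetime(2024, 1, 1, 0, 0)
raw = [
    (t0, 10.0),
    (t0 + timedelta(minutes=20), 14.0),
    (t0 + timedelta(hours=3), 20.0),
]

hourly = resample_mean(raw, timedelta(hours=1))
filled = interpolate_linear(hourly)

for ts, value in filled:
    print(ts.isoformat(), value)
```

The gap-free, evenly spaced output is the shape that decomposition, regression and forecasting steps typically expect as input.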