Crux users can now create data pipeline validations, which confirm that data meets their specifications, and anomaly detections, which identify outliers in the data.
Guarantee your data conforms to supplier specifications
Identify anomalies using a proprietary machine learning model
Define custom data quality checks to evaluate proprietary metrics and thresholds
Receive data quality calculations for all datasets in a single, machine-readable format available via Crux Query or any Crux Deliver destination
Crux Protect runs data quality checks consisting of two components: a metric (the item to be measured) and an upper and/or lower threshold.
Metrics are available for many key data quality dimensions. Each can be further modified by using select operators (e.g. and/or logic) and modifiers (e.g. grouping or filters) to create the desired data quality check.
Metrics fall into the following categories: missing and duplicate values; semantic meaning and formatting; and numeric calculations, both statistical and arithmetic. Custom metrics can also be defined to assess bespoke quality points or business-specific rules (e.g. detecting anomalies on an internal time-series factor or validating that unstructured text doesn’t contain HTML).
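As an illustration of the custom-metric idea, here is a minimal Python sketch of the "text doesn't contain HTML" example. It is not Crux Protect's API: the function name `html_fraction`, the `HTML_TAG` pattern, and the sample values are all hypothetical, and the regular expression is only a rough heuristic for tag-shaped text.

```python
import re

# Hypothetical heuristic: matches text shaped like an HTML tag,
# e.g. <p>, </div>, <br/>, <a href="x">. Not a full HTML parser.
HTML_TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*(\s[^<>]*)?/?>")


def html_fraction(values):
    """Illustrative custom metric: share of non-empty text values
    that contain at least one HTML-like tag."""
    texts = [v for v in values if v]
    if not texts:
        return 0.0
    flagged = sum(1 for v in texts if HTML_TAG.search(v))
    return flagged / len(texts)


# Made-up sample: one value contains tags, one uses < and > as
# comparison signs (not flagged), and one is missing (ignored).
sample = ["Quarterly revenue rose 4%", "<p>Press release</p>", "a < b and c > d", None]
metric_value = html_fraction(sample)
```

A metric like this would then be paired with an upper threshold (for example, zero) to form a complete check.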
For each check, thresholds set the boundaries that determine whether the check passes or fails. Currently, thresholds are either static or relative; a relative threshold is based on a historical value of the metric.
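The metric-plus-threshold structure can be sketched in a few lines of Python. This is a hedged illustration, not the product's interface: the `Check` class, `missing_fraction`, `relative_check`, the inclusive bounds, and the percentage-based tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Check:
    """Hypothetical check: a named metric plus optional lower/upper bounds."""
    name: str
    lower: Optional[float] = None
    upper: Optional[float] = None

    def passes(self, value: float) -> bool:
        # Assumption: bounds are inclusive; a missing bound is not enforced.
        if self.lower is not None and value < self.lower:
            return False
        if self.upper is not None and value > self.upper:
            return False
        return True


def missing_fraction(values):
    """Example metric: share of values that are missing (None)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def relative_check(name, historical, tolerance):
    """Relative threshold (assumed form): bounds sit within
    +/- tolerance (a fraction) of a historical metric value."""
    delta = abs(historical) * tolerance
    return Check(name, lower=historical - delta, upper=historical + delta)


# Static threshold: at most 25% of values may be missing.
prices = [10.0, None, 10.4, 9.8, None, 10.1, 10.2, 9.9, 10.0, 10.3]
static = Check("missing_fraction", upper=0.25)
static_ok = static.passes(missing_fraction(prices))

# Relative threshold: row count must stay within 10% of the previous value.
row_count = relative_check("row_count", historical=1000, tolerance=0.10)
relative_ok = row_count.passes(850)
```

In this made-up data, the static check passes because 2 of 10 values are missing, while the relative check fails because 850 falls below the lower bound of 900.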
Anomaly Detection can be used to calculate thresholds automatically, based on a proprietary model which dynamically adapts to account for changes in a metric’s behavior.
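The model behind Anomaly Detection is proprietary and not described here, so the following Python sketch substitutes a simple, well-known stand-in to show what an automatically calculated, adapting threshold can look like: bounds set at the trailing mean plus or minus `k` sample standard deviations. The function names, the window size, and the value of `k` are assumptions for illustration only.

```python
from statistics import mean, stdev


def adaptive_bounds(history, window=7, k=3.0):
    """Stand-in technique (not Crux's model): derive bounds from the
    most recent `window` observations as mean +/- k * sample std dev."""
    recent = history[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two observations")
    center = mean(recent)
    spread = stdev(recent)
    return center - k * spread, center + k * spread


def is_anomaly(history, value, window=7, k=3.0):
    """Flag a new metric value that falls outside the adaptive bounds."""
    lower, upper = adaptive_bounds(history, window, k)
    return value < lower or value > upper


# Made-up daily row counts for a dataset.
daily_rows = [1000, 1010, 990, 1005, 995, 1002, 998]
flag_spike = is_anomaly(daily_rows, 1500)
flag_normal = is_anomaly(daily_rows, 1003)
```

Because the bounds are recomputed from a trailing window, they shift as the metric's recent behavior shifts, which is the general property the paragraph above describes.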
Output from Crux Protect is published as a new dataset, which can be delivered to clients via Crux Deliver or Crux Query. Notifications are generated in the respective location when data quality results are available.
Crux Protect is a key component of Crux’s data ecosystem, which includes 24/7 monitoring, change management, and secure access. Your customized datasets can be made available through a multitude of delivery destinations via Crux Deliver, including cloud platforms and API, or you can have Crux host the storage for you via Crux Query.
Focus on extracting unique value from data by offloading time-consuming data delivery and operations work to Crux.
Central data hub with integrations to popular cloud platforms
Activate data pipelines to access 14k datasets from 140+ Data Suppliers
Secure, ready-made infrastructure with immediate onboarding
Reliable, with a 24/7 operations and support team
Customizable platform and services to meet end use case needs