Industrial Use Case · Cork Production

Turning Factory Data into End-Product Quality Predictions

How we built predictive models that anticipate weight, density, granulometry, and moisture — and delivered an operational simulator capable of forecasting reprocessing outcomes before they happen.

14+ Predictive Models Developed
2 Sources: Industrial SCADA + ACC Labs
4 Variables: Weight · Density · Granulometry · Moisture

Chapter 1

Process Diagnosis and Methodology Design

The challenge was to turn critical variables from Amorim Cork Solutions' production process into reliable predictions. To do that, we mapped two data universes — the industrial history from SCADA and the lab samples from ACC Labs — and precisely defined which models would be needed for each variable and each measurement point along the line.

The solution covers 14 distinct product-variable combinations, using Linear Regression models for weight, density, and moisture, and Logistic Regression for granulometry, the only categorical variable in the process. Each model was designed around a clear business objective: predict REP1 output before the end of the shift.
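
To make the linear-model choice concrete, here is a minimal sketch of a single-feature ordinary least squares fit in plain Python. The feature (a mill speed setpoint), the target (shift output weight in kg), and all numbers are hypothetical; the source does not list the actual features, and the production models presumably use many inputs and a dedicated library.

```python
from statistics import fmean

def fit_ols(x, y):
    """Fit y ≈ intercept + slope * x by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    x_mean, y_mean = fmean(x), fmean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

def predict(intercept, slope, x_new):
    """Apply the fitted line to a new input value."""
    return intercept + slope * x_new

# Hypothetical data: mill speed setpoint vs. shift output weight (kg).
mill_speed = [10.0, 12.0, 14.0, 16.0]
weight_kg = [205.0, 245.0, 285.0, 325.0]
intercept, slope = fit_ols(mill_speed, weight_kg)
```

The two fitted numbers can be read directly ("each extra unit of speed adds `slope` kg"), which is the kind of interpretability a linear model offers.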

Process overview (diagram):

  • REP1: Input Raw Material
  • SCADA: Industrial Data (Sensors · 25M rows)
  • LABS: ACC Labs (Samples per Shift)
  • ML: Predictive Models (Linear / Logistic Regression)
  • OUT: Final Product (Predicted Specifications)

Models by Variable

Predicted Variable                | Model Type          | Data Source
Weight (MLI 1/2, 0.25/0.5, 0.5/1) | Linear Regression   | SCADA
Weight Biomass · Excess           | Linear Regression   | SCADA
Density (MLI 1/2, 0.5/1)          | Linear Regression   | ACC Labs
Moisture (MLI 1/2, 0.5/1)         | Linear Regression   | ACC Labs
Granulometry (MLI 0.5/1)          | Logistic Regression | ACC Labs

Approach Foundations

  • Dual Data Sources: Industrial SCADA data combined with ACC Labs samples for maximum process coverage.
  • Linear Models by Design: A deliberate choice for linear regression to ensure interpretability, auditability, and operational trust.
  • Rigorous Cross-Validation: Each model goes through k-fold cross-validation, with separate train and test splits in every fold, to check that it generalizes beyond the data it was fitted on.
  • Multidimensional Metrics: MAE, MAPE, R², and Spearman correlation evaluated together for every model developed.
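
The k-fold splitting mentioned in the list above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the project's code: the function name, the shuffling, and the seed are all hypothetical, and the source does not say whether folds were shuffled or kept in chronological order.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# Ten hypothetical shifts split into three folds.
folds = list(k_fold_indices(n_samples=10, k=3))
```

Every sample lands in exactly one test fold, so each model is scored only on shifts it did not see during fitting. For time-ordered shift data, a chronological split may be a better fit than shuffling.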

* Full methodology walkthrough available in Chapter 1 of the video.

Chapter 2

Data Pipeline and Unified Lineage on Foundry

End-to-End Traceability

Every data point consumed by the models has a documented origin and an auditable transformation. We built a full lineage graph — from raw factory data all the way to the model in production — running on Palantir Foundry infrastructure.

  • Raw Layer: Unmodified ingestion of industrial sensor data and laboratory logs.
  • Intermediate Layer: Transformations, cleansing, noise filtering, and duplicate record removal.
  • Primary Layer: Final analytical dataset, aggregated per shift, ready for modeling.
  • Models Layer: Versioned, traceable ML artifacts with associated evaluation metrics.
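
As a rough picture of what the Raw, Intermediate, and Primary layers do, the sketch below runs the same three steps on a handful of in-memory records. It is plain Python, not Foundry code; the record layout, the sensor name, and the plausible-range limits are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical raw sensor records: (shift_id, timestamp, sensor, value).
raw = [
    ("2024-01-01-A", "06:00", "scale_kg", 101.0),
    ("2024-01-01-A", "06:00", "scale_kg", 101.0),   # exact duplicate
    ("2024-01-01-A", "06:05", "scale_kg", 99.0),
    ("2024-01-01-A", "06:10", "scale_kg", -5.0),    # implausible reading
    ("2024-01-01-B", "14:00", "scale_kg", 110.0),
]

def to_intermediate(records, low=0.0, high=1000.0):
    """Drop exact duplicates and readings outside an assumed plausible range."""
    seen, cleaned = set(), []
    for rec in records:
        if rec in seen or not (low <= rec[3] <= high):
            continue
        seen.add(rec)
        cleaned.append(rec)
    return cleaned

def to_primary(records):
    """Aggregate cleaned readings to one mean value per (shift, sensor)."""
    grouped = defaultdict(list)
    for shift_id, _timestamp, sensor, value in records:
        grouped[(shift_id, sensor)].append(value)
    return {key: fmean(values) for key, values in grouped.items()}

intermediate = to_intermediate(raw)
primary = to_primary(intermediate)
```

Keeping each layer as a pure function of the previous one is what makes the lineage auditable: any value in `primary` can be traced back to the raw rows that produced it.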

Connected Data Sources

Source System    | Data Consumed                          | Status
Industrial SCADA | Process sensors · 25M+ rows            | Automated
ACC Labs         | Lab samples per shift                  | Automated
Shift Emails     | OCR via Gemini · Screen data and notes | Automated
MES / EPC        | Batch types and input moisture         | Automated

* Architecture built on Palantir Foundry infrastructure.

Lineage Architecture (Raw → Models)

RAW (SCADA Raw · ACC Labs Raw · MES / EPC · Emails OCR) → INTERMEDIATE (Cleansing & Validation) → PRIMARY (Shift Dataset) → ML (14× Models)

Data Pipeline — Live Demo

Chapter 3

Model Analysis, Hypotheses, and Shift Overview

With clean, unified data in place, the platform delivers three layers of intelligence: detailed evaluation of each predictive model, investigation of process hypotheses with statistical validation, and a shift-by-shift operational view that connects what happened on the factory floor to what the models predict.

  • Model Analysis: MAE · MAPE · R² · Spearman
  • Empirical Knowledge: Hypotheses · Correlations · Scatter
  • Shift Overview: Scales · Mills · Rotex

Model Performance

Each model is evaluated with error distributions, Predictions vs. Actuals, and baseline comparisons, on separate train and test sets, to check how well it generalizes to unseen shifts.

Process Hypotheses

The platform investigates and documents hypotheses such as the impact of raw material moisture and the influence of origin (China vs. other batches) on the stability and quality of the final product.
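
One simple way to check a hypothesis of this kind is a permutation test on the difference in group means, sketched below. The source does not name the statistical method actually used, so this is an assumed stand-in, and the moisture values for the two origin groups are invented for illustration.

```python
import random
from statistics import fmean

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical final-product moisture (%) for two raw material origins.
origin_a = [7.9, 8.1, 8.3, 8.0, 8.2, 8.4]
origin_other = [6.9, 7.1, 7.0, 7.2, 6.8, 7.0]
p_value = permutation_test(origin_a, origin_other)
```

A small p-value suggests the gap between origins is unlikely to be a labeling accident; it does not by itself establish that origin causes the difference.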

Shift-by-Shift Operations

Scale evolution over time, setpoints vs. actual mill speeds, and Rotex/screens used each shift — all accessible through a single date selector.

Chapter 4

Reprocessing Simulator — Decide Before You Act

The Simulator is where analytical intelligence becomes an operational tool. The plant manager configures shift parameters — raw material type, mill settings, and maintenance history — and receives, within seconds, a production forecast and expected final product specifications, complete with confidence intervals and failure probability per variable.

INPUTS

  • Raw Material Type
  • Rotex & Screen Config.
  • Mills (MIM + PPS)
  • Days Since Maintenance

Volis Engine: Active ML Models

PREDICTED OUTPUTS

  • Estimated Production Volume (kg)
  • Predicted Density + Confidence Interval
  • Predicted Moisture + Confidence Interval
  • Granulometry + Failure Probability (%)

Configuration

Raw material, scale, and equipment parameters all adjustable from a single screen.

Prediction

Active ML models deliver real-time estimates for every input combination configured.

Intervals

Every prediction includes the lower and upper bounds of a prediction interval, quantifying model uncertainty.
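
For a single-feature linear model, such bounds can be sketched as ŷ ± z · s · sqrt(1 + 1/n + (x₀ − x̄)² / Sxx). The code below is a hedged illustration: the data are hypothetical, and it uses a normal quantile where the exact formula uses Student's t, which slightly understates the width on small samples. How the simulator actually computes its intervals is not stated in the source.

```python
from statistics import NormalDist, fmean

def prediction_interval(x, y, x_new, confidence=0.95):
    """Return (lower, y_hat, upper) for a new observation at x_new."""
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least three paired observations")
    x_mean, y_mean = fmean(x), fmean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom.
    s = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    # Standard error of a single new observation at x_new.
    se = s * (1 + 1 / n + (x_new - x_mean) ** 2 / sxx) ** 0.5
    # Assumption: normal quantile used in place of Student's t.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    y_hat = intercept + slope * x_new
    return y_hat - z * se, y_hat, y_hat + z * se

# Hypothetical data: input moisture (%) vs. final product moisture (%).
input_moisture = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
final_moisture = [5.1, 5.4, 6.1, 6.4, 7.1, 7.4]
low, y_hat, high = prediction_interval(input_moisture, final_moisture, x_new=8.5)
```

The interval widens as `x_new` moves away from the mean of the training inputs, which is a useful warning when a simulated configuration sits far from anything seen in production.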

Risk

Failure probability per product flags spec deviations before they occur on the production line.
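
For a continuous variable, one way to turn a prediction and its uncertainty into a failure probability is to assume a normal predictive distribution and measure the mass outside the spec limits. The sketch below does that; the density value, standard error, and spec limits are hypothetical. For granulometry, a logistic regression model outputs a probability directly, so no such step is needed.

```python
from statistics import NormalDist

def failure_probability(predicted, std_error, spec_low, spec_high):
    """Probability that the true value falls outside [spec_low, spec_high].

    Assumes the prediction error is normally distributed around `predicted`.
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    if spec_low >= spec_high:
        raise ValueError("spec_low must be below spec_high")
    dist = NormalDist(mu=predicted, sigma=std_error)
    in_spec = dist.cdf(spec_high) - dist.cdf(spec_low)
    return 1.0 - in_spec

# Hypothetical density prediction centered within a spec window of ±2 sigma.
centered = failure_probability(predicted=60.0, std_error=2.0,
                               spec_low=56.0, spec_high=64.0)

# Same window, but the prediction sits right on the upper limit.
on_limit = failure_probability(predicted=64.0, std_error=2.0,
                               spec_low=56.0, spec_high=64.0)
```

A prediction in the middle of the window carries a failure probability of a few percent, while one sitting on a limit is close to a coin flip, even though both point estimates are nominally in spec.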

Conclusion

Less waste. More control. Quality predicted before the shift ends.

Amorim Cork Solutions no longer operates in the dark. Every shift starts with a quality forecast for the final product, and operators can simulate decisions before making them. If your industrial operation deals with untracked process variability or unpredictable output quality, this problem already has a solution. We can do the same for you.
