Volis · Predictive Intelligence for Cork Production

Chapter 1

Process Diagnosis and Methodology Design

The challenge was to turn critical variables from Amorim Cork Solutions' production process into reliable predictions. To do that, we mapped two data universes — the industrial history from SCADA and the lab samples from ACC Labs — and precisely defined which models would be needed for each variable and each measurement point along the line.

The solution covers 14 distinct product-variable combinations, using Linear Regression models for weight, density, and moisture, and Logistic Regression for granulometry — the only categorical variable in the process. Each model was designed around a clear business objective: predict REP 1 output before the end of the shift.

Models by Variable

Predicted Variable	Model Type	Data Source
Weight (MLI 1/2, 0.25/0.5, 0.5/1)	Linear Regression	SCADA
Weight Biomass · Excess	Linear Regression	SCADA
Density (MLI 1/2, 0.5/1)	Linear Regression	ACC Labs
Moisture (MLI 1/2, 0.5/1)	Linear Regression	ACC Labs
Granulometry (MLI 0.5/1)	Logistic Regression	ACC Labs

Approach Foundations

Dual Data Sources: Industrial SCADA data combined with ACC Labs samples for maximum process coverage.
Linear Models by Design: A deliberate choice for linear regression to ensure interpretability, auditability, and operational trust.
Rigorous Cross-Validation: Each model goes through k-fold cross-validation, rotating which portion of the data is held out for testing, to estimate how well it generalizes to data it has not seen.
Multidimensional Metrics: MAE, MAPE, R², and Spearman correlation evaluated together for every model developed.
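
The validation approach above can be sketched in plain Python. This is an illustrative example only, not the project's code: the single `mill_speed` feature, the synthetic data, and all function names are assumptions made for the sketch, and a one-feature least-squares fit stands in for the production models.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def ranks(values):
    """Average 1-based ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den

def metrics(actual, predicted):
    """MAE, MAPE (in percent), R2 and Spearman correlation for one fold."""
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((a - mean(actual)) ** 2 for a in actual)
    return {
        "MAE": mean(abs(e) for e in errors),
        "MAPE": 100 * mean(abs(e / a) for e, a in zip(errors, actual)),
        "R2": 1 - ss_res / ss_tot,
        # Spearman is the Pearson correlation of the ranks.
        "Spearman": pearson(ranks(actual), ranks(predicted)),
    }

def k_fold_scores(xs, ys, k=5, seed=0):
    """Fit on k-1 folds, score on the held-out fold, for each of the k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        preds = [slope * xs[i] + intercept for i in test_idx]
        scores.append(metrics([ys[i] for i in test_idx], preds))
    return scores

# Synthetic data (assumption): weight depends linearly on mill speed plus noise.
rng = random.Random(42)
mill_speed = [rng.uniform(50, 100) for _ in range(60)]
weight = [3.0 * s + 20 + rng.gauss(0, 5) for s in mill_speed]

scores = k_fold_scores(mill_speed, weight)
average = {name: mean(fold[name] for fold in scores) for name in scores[0]}
print(average)
```

Reading the four metrics side by side is the point of the exercise: MAE and MAPE describe the size of the error, R² the share of variance explained, and Spearman whether the model at least ranks shifts in the right order.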

* Full methodology walkthrough available in Chapter 1 of the video.

Chapter 2

Data Pipeline and Unified Lineage on Foundry

End-to-End Traceability

Every data point consumed by the models has a documented origin and an auditable transformation. We built a full lineage graph — from raw factory data all the way to the model in production — running on Palantir Foundry infrastructure.

Raw Layer: Unmodified ingestion of industrial sensor data and laboratory logs.
Intermediate Layer: Transformations, cleansing, noise filtering, and duplicate record removal.
Primary Layer: Final analytical dataset, aggregated per shift, ready for modeling.
Models Layer: Versioned, traceable ML artifacts with associated evaluation metrics.
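
As a rough illustration of the Raw, Intermediate, and Primary layers, the sketch below uses plain Python rather than any Foundry API. The row fields, the `-999.0` fault reading, the valid range, and the function names are all hypothetical, and a simple range check stands in for the actual noise filtering.

```python
from collections import defaultdict
from statistics import mean

# Raw layer: hypothetical sensor rows, kept exactly as ingested.
raw = [
    {"ts": "2024-01-01T06:10", "shift": "A", "sensor": "scale_1", "value": 101.0},
    {"ts": "2024-01-01T06:10", "shift": "A", "sensor": "scale_1", "value": 101.0},  # duplicate
    {"ts": "2024-01-01T06:20", "shift": "A", "sensor": "scale_1", "value": -999.0},  # implausible reading
    {"ts": "2024-01-01T06:30", "shift": "A", "sensor": "scale_1", "value": 99.0},
    {"ts": "2024-01-01T14:10", "shift": "B", "sensor": "scale_1", "value": 110.0},
]

def to_intermediate(rows, valid_range=(0.0, 1000.0)):
    """Intermediate layer: drop exact duplicates and out-of-range readings."""
    low, high = valid_range
    seen, clean = set(), []
    for row in rows:
        key = (row["ts"], row["sensor"], row["value"])
        if key in seen or not (low <= row["value"] <= high):
            continue
        seen.add(key)
        clean.append(row)
    return clean

def to_primary(rows):
    """Primary layer: one aggregated record per (shift, sensor)."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["shift"], row["sensor"])].append(row["value"])
    return {key: {"mean": mean(vals), "n": len(vals)} for key, vals in groups.items()}

intermediate = to_intermediate(raw)
primary = to_primary(intermediate)
print(primary)
```

Keeping each layer as a pure function of the previous one is what makes the lineage auditable: any value in the Primary layer can be traced back to the raw rows that produced it.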

Connected Data Sources

Source System	Data Consumed	Status
Industrial SCADA	Process sensors · 25M+ rows	Automated
ACC Labs	Lab samples per shift	Automated
Shift Emails	OCR via Gemini · Screen data and notes	Automated
MES / EPC	Batch types and input moisture	Automated

* Architecture discreetly structured on Palantir Foundry infrastructure.

Lineage Architecture (Raw → Models)

Data Pipeline — Live Demo

Chapter 3

Model Analysis, Hypotheses, and Shift Overview

With clean, unified data in place, the platform delivers three layers of intelligence: detailed evaluation of each predictive model, investigation of process hypotheses with statistical validation, and a shift-by-shift operational view that connects what happened on the factory floor to what the models predict.

Model Performance

Each model is evaluated with error distributions, predictions vs. actuals, and baseline comparisons, on separate train and test sets, to check that performance holds on data the model has not seen.
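
A baseline comparison can be as simple as the sketch below. It assumes a naive baseline that always predicts the training-set mean, which is one common choice and not necessarily the baseline used on the platform; all numbers are made up for illustration.

```python
from statistics import mean

def mae(actual, predicted):
    """Mean absolute error."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

# Made-up values for one variable (assumption).
train_actual = [100.0, 104.0, 98.0, 102.0]
test_actual = [101.0, 97.0, 105.0]
model_pred = [100.5, 98.0, 103.5]  # hypothetical model output on the test set

# Naive baseline: always predict the mean seen during training.
baseline_pred = [mean(train_actual)] * len(test_actual)

model_mae = mae(test_actual, model_pred)
baseline_mae = mae(test_actual, baseline_pred)
improvement = 1 - model_mae / baseline_mae  # share of baseline error removed

print(f"model MAE={model_mae:.2f}, baseline MAE={baseline_mae:.2f}, "
      f"improvement={improvement:.1%}")
```

A model that cannot beat this kind of baseline on the test set adds no information beyond the historical average, however good its training metrics look.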

Process Hypotheses

The platform investigates and documents hypotheses such as the impact of raw material moisture and the influence of origin (China vs. other batches) on the stability and quality of the final product.
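
One way to validate such a hypothesis statistically is a permutation test on the difference between two groups of batches. The sketch below is an assumption about how this could be done, not a description of the platform's method; the density values and group names are invented.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Invented density readings for two batch origins (assumption).
origin_batches = [62.1, 63.4, 61.8, 64.0, 62.9, 63.7]
other_batches = [60.2, 59.8, 61.0, 60.5, 59.9, 60.7]

p_value = permutation_p_value(origin_batches, other_batches)
print(f"p-value: {p_value:.4f}")
```

A small p-value says the observed gap would be unlikely if origin made no difference; it does not by itself say the gap is large enough to matter operationally.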

Shift-by-Shift Operations

Scale evolution over time, setpoints vs. actual mill speeds, and the Rotex/screens used each shift are all accessible through a single date selector.

Chapter 4

Reprocessing Simulator — Decide Before You Act

The Simulator is where analytical intelligence becomes an operational tool. The plant manager configures shift parameters (raw material type, mill settings, and maintenance history) and receives, within seconds, a production forecast and expected final product specifications, complete with prediction intervals and a failure probability per variable.

Configuration

Raw material, scale, and equipment parameters are all adjustable from a single screen.

Prediction

Active ML models deliver real-time estimates for every input combination configured.

Intervals

Every prediction includes a lower and upper prediction interval, quantifying model uncertainty.
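
A minimal sketch of how such an interval could be derived, assuming roughly normal residuals with constant spread: take the point prediction and add or subtract a multiple of the residual standard deviation measured on held-out data. This simplification ignores uncertainty in the fitted coefficients, and the residual values and function name are illustrative only.

```python
from statistics import NormalDist, stdev

def prediction_interval(point, residuals, coverage=0.95):
    """Approximate interval: point +/- z * residual standard deviation."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # about 1.96 for 95%
    half_width = z * stdev(residuals)
    return point - half_width, point + half_width

# Hypothetical held-out residuals (actual minus predicted) for one variable.
residuals = [-1.2, 0.4, 0.9, -0.3, 1.1, -0.8, 0.2, -0.3]

low, high = prediction_interval(100.0, residuals)
print(f"prediction 100.0, 95% interval [{low:.2f}, {high:.2f}]")
```

Raising `coverage` widens the interval, which is the trade-off an operator makes between certainty and usefulness.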

Risk

Failure probability per product flags spec deviations before they occur on the production line.
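
For a continuous variable, one simple way to obtain a failure probability is to treat the prediction error as normally distributed and measure how much of that distribution falls outside the spec limits. The sketch below makes that assumption explicitly; the spec limits, residual spread, and function name are hypothetical.

```python
from statistics import NormalDist

def failure_probability(predicted, residual_std, spec_low, spec_high):
    """Probability that the actual value lands outside [spec_low, spec_high],
    assuming normally distributed prediction error."""
    dist = NormalDist(mu=predicted, sigma=residual_std)
    return dist.cdf(spec_low) + (1 - dist.cdf(spec_high))

# Hypothetical spec window of 94 to 106, residual standard deviation of 2.
p_centered = failure_probability(100.0, 2.0, 94.0, 106.0)  # prediction mid-spec
p_near_limit = failure_probability(105.0, 2.0, 94.0, 106.0)  # prediction near upper limit

print(f"centered: {p_centered:.4f}, near limit: {p_near_limit:.4f}")
```

The same point prediction can carry very different risk depending on how close it sits to a limit, which is why a probability is more actionable than the forecast alone.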

Conclusion

Less waste. More control. Quality predicted before the shift ends.

Amorim Cork Solutions no longer operates in the dark. Every shift starts with a quality forecast for the final product, and operators can simulate decisions before making them. If your industrial operation deals with untracked process variability or unpredictable output quality, this problem already has a solution. We can do the same for you.