
Train production ML models in minutes

Multi-engine AutoML with H2O and FLAML. Automatically selects algorithms, tunes hyperparameters, and builds stacked ensembles — with parallel engine orchestration for Enterprise plans.

Screenshot: AutoML experiment leaderboard with model comparison and metrics (platform.coreplexml.io)

Two engines. One experiment. Best model wins.

Run H2O and FLAML in parallel on the same dataset. Each engine explores different optimization strategies independently. CorePlexML picks the best model across all engines automatically.

H2O: 15+ algorithms · FLAML: cost-aware sklearn · Enterprise: parallel execution

H2O: XGBoost, GBM, Deep Learning, GLM, DRF, Stacked Ensembles
FLAML: Random Forest, Extra Trees, Logistic Regression
Coming soon: AutoGluon Tabular, MLJAR-supervised, TPOT
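As a sketch of what a multi-engine experiment request could look like, the helper below builds a payload and picks the execution mode from the number of engines requested. The field names (`engines`, `execution_mode`) are assumptions based on the endpoint descriptions further down; check the API reference for the exact schema.

```python
# Hypothetical payload builder for a multi-engine experiment.
# Field names are assumptions, not the confirmed API schema.

def build_parallel_experiment(dataset_version_id: str, target: str,
                              engines: list[str], max_per_plan: int = 3) -> dict:
    """Build an experiment payload, enforcing a per-plan engine cap."""
    if len(engines) > max_per_plan:
        raise ValueError(f"Plan allows at most {max_per_plan} engines per experiment")
    return {
        "dataset_version_id": dataset_version_id,
        "target_column": target,
        "problem_type": "classification",
        "engines": engines,  # e.g. ["h2o", "flaml"]
        "execution_mode": "parallel" if len(engines) > 1 else "single",
    }

payload = build_parallel_experiment("dsv_123", "churn", ["h2o", "flaml"])
```

A single-engine request would carry `"execution_mode": "single"`, matching the per-experiment limits in the plan table below.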

From raw data to production model in three steps

No manual feature engineering, hyperparameter tuning, or algorithm selection. AutoML handles the entire pipeline.

1. Upload Your Data

Drag and drop CSV, Excel, JSON, or XML files. The platform automatically detects column types, identifies the target variable, and profiles your data for quality issues.
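To illustrate the kind of column-type detection the upload step performs, here is a minimal sketch using only the standard library. This is an illustrative heuristic, not the platform's actual profiling logic.

```python
# Minimal column-type inference sketch: numeric vs. categorical.
import csv
import io

def infer_column_types(csv_text: str) -> dict[str, str]:
    """Classify each column as 'numeric' or 'categorical' from its values."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    types = {}
    for col in rows[0].keys():
        values = [r[col] for r in rows if r[col] != ""]
        try:
            for v in values:
                float(v)  # every value parses as a number
            types[col] = "numeric"
        except ValueError:
            types[col] = "categorical"
    return types

sample = "age,plan,churn\n34,pro,0\n51,free,1\n"
print(infer_column_types(sample))
# {'age': 'numeric', 'plan': 'categorical', 'churn': 'numeric'}
```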

2. Configure & Train

Select your target column, problem type, and engine (H2O or FLAML). AutoML tests multiple algorithms with Bayesian hyperparameter optimization and stacked ensembles. Enterprise plans support parallel multi-engine execution.
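The classification-vs-regression choice can often be inferred from the target column itself. The heuristic below (few distinct values means classification) is a simplified sketch, not the platform's detection rule.

```python
def infer_problem_type(target_values, max_classes: int = 20) -> str:
    """Illustrative heuristic: few distinct target values -> classification,
    many distinct values -> regression."""
    distinct = set(target_values)
    if len(distinct) <= max_classes:
        return "classification"
    return "regression"

print(infer_problem_type([0, 1, 0, 1]))   # classification
```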

3. Evaluate & Deploy

Review the model leaderboard with metrics, SHAP explanations, and feature importance. Deploy the best model to production with one click via MLOps.
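Picking the winner from a leaderboard is just a max over a chosen metric. The snippet below sketches that selection against a hypothetical leaderboard shape (`model_id` plus a `metrics` dict); the real response schema may differ.

```python
def best_model(leaderboard: list[dict], metric: str = "auc",
               higher_is_better: bool = True) -> dict:
    """Return the leaderboard entry with the best score on `metric`."""
    key = lambda m: m["metrics"][metric]
    return max(leaderboard, key=key) if higher_is_better else min(leaderboard, key=key)

board = [
    {"model_id": "xgb_1", "metrics": {"auc": 0.91}},
    {"model_id": "ens_1", "metrics": {"auc": 0.94}},
]
print(best_model(board)["model_id"])  # ens_1
```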

Key Capabilities

Everything you need to get the most out of the AutoML module.

Automated Algorithm Selection

15+ algorithms tested automatically across dozens of trained model configurations. XGBoost, GBM, deep learning, GLM, and more. The engine picks the best for your data.

Hyperparameter Tuning

Bayesian optimization finds optimal hyperparameters faster than grid or random search.

Stacked Ensembles

Combine multiple models into powerful ensembles that outperform any single algorithm.
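To show the intuition behind stacking, here is a deliberately simplified sketch: a weighted average of per-model probabilities. Real stacked ensembles fit a meta-learner on out-of-fold predictions rather than using fixed weights.

```python
def stack_predictions(base_preds: dict[str, list[float]],
                      weights: dict[str, float]) -> list[float]:
    """Weighted average of each base model's predicted probabilities."""
    total = sum(weights.values())
    n = len(next(iter(base_preds.values())))
    return [
        sum(weights[m] * base_preds[m][i] for m in base_preds) / total
        for i in range(n)
    ]

preds = {"xgb": [0.9, 0.2], "gbm": [0.7, 0.4]}
print(stack_predictions(preds, {"xgb": 3, "gbm": 1}))  # approx [0.85, 0.25]
```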

GPU Acceleration

Leverage GPU compute for faster training on large datasets. Automatic fallback to CPU when needed.
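Device selection with fallback can be sketched as a small policy function. The row threshold and return values below are hypothetical; the platform's actual dispatch logic is not documented here.

```python
def select_device(gpu_available: bool, dataset_rows: int,
                  gpu_min_rows: int = 100_000) -> str:
    """Prefer GPU for large datasets; fall back to CPU otherwise.
    The 100k-row threshold is an illustrative assumption."""
    if gpu_available and dataset_rows >= gpu_min_rows:
        return "cuda"
    return "cpu"

print(select_device(gpu_available=True, dataset_rows=2_000_000))  # cuda
```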

15+ algorithms with stacked ensembles

H2O.ai evaluates XGBoost, GBM, Deep Learning, Random Forest, GLM, and Stacked Ensembles with Bayesian hyperparameter tuning.

XGBoost

Gradient-boosted decision trees optimized for speed and performance. Handles missing values natively and supports GPU acceleration.

Gradient Boosting (GBM)

Sequential ensemble method that builds trees correcting previous errors. Excellent for tabular data with complex feature interactions.

Deep Learning

Multi-layer neural networks with configurable architectures. Automatic regularization, dropout, and early stopping for production stability.

Random Forest (DRF)

Parallel ensemble of decision trees with bagging. Robust against overfitting and provides reliable feature importance rankings.

Generalized Linear (GLM)

Interpretable linear models with regularization (L1/L2). Ideal when model explainability is a regulatory requirement.

Stacked Ensembles

Meta-learner that combines predictions from all trained models. Typically achieves the best performance by leveraging model diversity.

Cost-aware optimization with scikit-learn

Microsoft FLAML finds the best model within a time budget using cost-aware search. Lightweight, fast, and ideal for quick iterations.
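The core idea of cost-aware search under a time budget can be sketched in a few lines: try cheap candidates first and stop when the budget runs out. This is a simplified illustration of the strategy, not FLAML's actual search algorithm.

```python
import time

def cost_aware_search(candidates: list[dict], train_fn, time_budget_s: float):
    """Evaluate candidates cheapest-first until the time budget is spent.
    Each candidate carries an 'est_cost' estimate; train_fn returns a score."""
    best_score, best_cfg = float("-inf"), None
    deadline = time.monotonic() + time_budget_s
    for cfg in sorted(candidates, key=lambda c: c["est_cost"]):
        if time.monotonic() >= deadline:
            break
        score = train_fn(cfg)
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score
```

Cheap learners get evaluated early, so even a tight budget yields a usable model; extra time is spent on progressively more expensive candidates.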

Random Forest (sklearn)

Scikit-learn Random Forest with automatic hyperparameter search via FLAML's cost-aware optimization. Fast convergence for tabular data.

Extra Trees

Extremely randomized trees with faster training. FLAML optimizes split thresholds and tree depth automatically for best performance.

Logistic Regression (L1)

Sparse regularized classifier via FLAML. Ideal for high-dimensional datasets where feature selection and interpretability matter.

Engine capabilities scale with your plan

Choose your engine per experiment. Enterprise plans unlock parallel multi-engine execution for maximum model coverage.

Plan        Engines       Per Experiment    Parallel
Free        H2O           1                 —
Pro         H2O, FLAML    1 (choose one)    —
Team        H2O, FLAML    1 (choose one)    —
Enterprise  H2O, FLAML    Up to 3           ✓ Up to 3
Coming soon: AutoGluon Tabular, MLJAR-supervised, and TPOT engines are in planning. The multi-engine architecture is designed to support additional engines as they become available.
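Client-side, the plan matrix above can be mirrored as a simple validation table. The dict below is a hypothetical encoding of that matrix; actual enforcement happens server-side.

```python
# Hypothetical client-side mirror of the plan capability matrix.
PLAN_CAPS = {
    "free":       {"engines": {"h2o"},          "max_parallel": 1},
    "pro":        {"engines": {"h2o", "flaml"}, "max_parallel": 1},
    "team":       {"engines": {"h2o", "flaml"}, "max_parallel": 1},
    "enterprise": {"engines": {"h2o", "flaml"}, "max_parallel": 3},
}

def validate_engines(plan: str, requested: list[str]) -> None:
    """Raise ValueError if the requested engines exceed the plan's limits."""
    caps = PLAN_CAPS[plan]
    unknown = set(requested) - caps["engines"]
    if unknown:
        raise ValueError(f"Plan '{plan}' cannot use engines: {sorted(unknown)}")
    if len(requested) > caps["max_parallel"]:
        raise ValueError(
            f"Plan '{plan}' allows at most {caps['max_parallel']} engine(s) per experiment")
```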
Engines: H2O + FLAML (multi-engine)
Problem Types: Classification & Regression
Execution Mode: Single or Parallel
Explainability: SHAP & Feature Importance
GPU Support: CUDA with CPU fallback
Cross-Validation: Automatic k-fold
Plan Capabilities: Engine access by plan
Coming Soon: AutoGluon, MLJAR, TPOT
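Automatic k-fold cross-validation boils down to partitioning row indices into k validation folds, each paired with the remaining rows for training. A minimal index-splitting sketch:

```python
def kfold_indices(n: int, k: int = 5):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    # Distribute any remainder across the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size
```

Every row appears in exactly one validation fold, so the k out-of-fold scores can be averaged into a single estimate of generalization performance.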

Train models programmatically

Use the Python SDK to automate your training pipelines. Full experiment management, from data upload to SHAP explanations.

train_model.py
from coreplexml import CorePlexMLClient

client = CorePlexMLClient(
    base_url="https://api.coreplexml.io",
    api_key="sk_your_api_key"
)

# Upload training data
dataset = client.datasets.upload(
    project_id="proj_abc",
    file_path="customers.csv",
    name="Customer Churn Data"
)

# Start AutoML training with engine selection
experiment = client.experiments.create(
    project_id="proj_abc",
    dataset_version_id=dataset["dataset_version_id"],
    target_column="churn",
    problem_type="classification",
    engine="h2o",  # or "flaml"
    config={"max_models": 20, "balance_classes": True}
)

# Wait for training to complete
result = client.experiments.wait(experiment["id"], timeout=3600)
print(f"Best model: {result['best_model_id']}")
print(f"AUC: {result['metrics']['auc']:.4f}")

# Get feature importance and SHAP values
explain = client.experiments.explain(experiment["id"])
for feat in explain["feature_importance"][:5]:
    print(f"  {feat['feature']}: {feat['importance']:.3f}")

AutoML API

RESTful endpoints for experiment management, model training, and predictions.

POST
/api/experiments

Create AutoML experiment with engine selection (h2o, flaml) and execution mode (single, parallel)

GET
/api/experiments/automl-engines

List available AutoML engines and their capabilities

GET
/api/experiments/capabilities

Get effective AutoML capabilities for the current user and plan

GET
/api/experiments/{id}/engine-runs

List engine runs for a multi-engine experiment

GET
/api/experiments/{id}/status

Check training progress and current status

GET
/api/experiments/{id}/explain

Get feature importance, SHAP values, and explainability data

GET
/api/models/engines

Get distribution of models by engine (h2o, flaml)

POST
/api/models/{id}/predict

Make predictions using any trained model (H2O or FLAML)
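For callers not using the SDK, the predict endpoint can be reached with the standard library alone. The sketch below builds the request; the `{"rows": [...]}` body shape is an assumption based on the endpoint list, so verify it against the API reference before use.

```python
# Sketch of constructing a predict request with the stdlib only.
import json
import urllib.request

def build_predict_request(base_url: str, model_id: str,
                          rows: list[dict], api_key: str) -> urllib.request.Request:
    """Build a POST request for /api/models/{id}/predict."""
    body = json.dumps({"rows": rows}).encode()  # body shape is assumed
    return urllib.request.Request(
        url=f"{base_url}/api/models/{model_id}/predict",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

req = build_predict_request("https://api.coreplexml.io", "mdl_42",
                            [{"age": 34, "plan": "pro"}], "sk_your_api_key")
# urllib.request.urlopen(req) would send it; omitted here.
```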

From experiment to model analytics

- Multi-engine training: H2O + FLAML with plan capabilities (platform.coreplexml.io/experiments#new)
- Parallel engine logs: H2O + FLAML running simultaneously (platform.coreplexml.io/experiments/.../logs)
- Parallel execution: H2O + FLAML with engine runs and results (platform.coreplexml.io/experiments/...)
- 4-step AutoML experiment wizard (platform.coreplexml.io/experiments#new)
- ROC curves, confusion matrix, and precision-recall (platform.coreplexml.io/models/...)
- SHAP values and variable importance (platform.coreplexml.io/models/.../features)
- Cumulative gains, lift, and K-S statistics (platform.coreplexml.io/models/.../gains)

Ready to get started?

Start building with CorePlexML today. Free tier available — no credit card required.