Apache SystemML AI/ML

This presentation gives an overview of the Apache SystemML AI/ML project. It explains Apache SystemML in terms of its functionality and dependencies, and describes how SystemDS was forked from it to provide greater functionality. It also includes links for further information and for connecting with the author.

Slides: 12
Provided by: semtechs


Transcript and Presenter's Notes

1
What Is Apache SystemML?
  • A machine learning system
  • Designed to scale to Spark / Hadoop clusters
  • Open source / Apache 2 license
  • Developed in Java
  • Supports R-like and Python-like languages
    • These are designed to scale into the big data range
  • Automatic optimization at scale for data and
    cluster

2
SystemML Execution Modes
  • SystemML supports multiple execution modes, including
    • Standalone
    • Spark Batch
    • Spark MLContext
    • Hadoop Batch
    • Java Machine Learning Connector (JMLC)

3
SystemML Dependencies
  • SystemDS was forked from SystemML 1.2
  • Current dependencies
    • Java 8
    • Scala 2.11
    • Python 2.7/3.5
    • Hadoop 2.6
    • Spark 2.1

4
What Is Apache SystemDS?
  • Forked from Apache SystemML 1.2 in September
    2018
  • Supports linear algebra programs over matrices
  • Replaces the underlying data model and compiler
  • Substantially extends the supported
    functionalities
  • Supports the whole data science lifecycle
    • Data integration, cleaning
    • Feature engineering
    • Efficient local and distributed ML model training
    • Deployment, serving

5
What Is Apache SystemDS?
  • R-like languages for
    • The data science lifecycle stages
    • Differing expertise levels
  • High-level scripts are compiled into hybrid
    execution plans
    • For local, in-memory CPU / GPU operations
    • For distributed operations on Apache Spark
  • The underlying data model is DataTensors
    • Tensors (multi-dimensional arrays) whose first
      dimension may have a heterogeneous and nested
      schema
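
The slide does not say how a compiler decides between local and distributed operations. As a rough illustration only, the Python sketch below assumes a simple rule: estimate the worst-case dense size of the operands and the result, run locally if everything fits in a memory budget, and fall back to Spark otherwise. The names MatrixShape, choose_backend and plan_matmul, and the 1 GiB budget, are hypothetical and are not part of the SystemDS API.

```python
from dataclasses import dataclass

BYTES_PER_DOUBLE = 8  # assumed size of one dense double-precision cell


@dataclass(frozen=True)
class MatrixShape:
    """Hypothetical matrix descriptor: dimensions only, no data."""
    rows: int
    cols: int

    def dense_bytes(self) -> int:
        # Worst-case estimate: every cell stored as a double.
        return self.rows * self.cols * BYTES_PER_DOUBLE


def choose_backend(inputs, output, local_budget_bytes):
    """Return 'local' if all operands fit in the budget, else 'spark'."""
    needed = sum(m.dense_bytes() for m in inputs) + output.dense_bytes()
    return "local" if needed <= local_budget_bytes else "spark"


def plan_matmul(a, b, local_budget_bytes):
    """Pick a backend for the matrix product a %*% b."""
    if a.cols != b.rows:
        raise ValueError("inner dimensions must match")
    out = MatrixShape(a.rows, b.cols)
    return choose_backend([a, b], out, local_budget_bytes)


budget = 1024 ** 3  # hypothetical 1 GiB local memory budget
small_plan = plan_matmul(MatrixShape(1000, 100), MatrixShape(100, 10), budget)
large_plan = plan_matmul(
    MatrixShape(10_000_000, 1000), MatrixShape(1000, 1000), budget
)
print(small_plan, large_plan)
```

With these assumed shapes, the small product needs well under 1 MB and stays local, while the large one needs about 80 GB for its left operand alone and is sent to the distributed backend. A real optimizer would also account for sparsity, intermediate results and operator-specific costs.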

6
SystemDS Algorithms
  • Descriptive Statistics
    • Univariate Statistics
    • Bivariate Statistics
    • Stratified Bivariate Statistics
  • Classification
    • Multinomial Logistic Regression
    • Support Vector Machines
      • Binary-Class Support Vector Machines
      • Multi-Class Support Vector Machines
    • Naive Bayes
    • Decision Trees
    • Random Forests
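
To give a feel for what the descriptive statistics entries compute, here is a minimal pure-Python sketch of a few univariate measures and one bivariate measure (Pearson correlation). It is an illustration on tiny in-memory lists, not the SystemDS scripts themselves; the function names univariate and pearson are made up for this example.

```python
import math


def univariate(xs):
    """Basic univariate statistics using the sample (n - 1) variance."""
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return {"n": n, "mean": mean, "variance": variance,
            "std": math.sqrt(variance)}


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric lists."""
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


stats = univariate([2, 4, 4, 4, 5, 5, 7, 9])
corr = pearson([1, 2, 3], [2, 4, 6])
print(stats["mean"], stats["variance"], corr)
```

The scalable versions apply the same formulas as aggregations over distributed matrices rather than Python lists.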

7
SystemDS Algorithms
  • Clustering
    • K-Means Clustering
  • Regression
    • Linear Regression
    • Stepwise Linear Regression
    • Generalized Linear Models
    • Stepwise Generalized Linear Regression
    • Regression Scoring and Prediction
  • Matrix Factorization
    • Principal Component Analysis
    • Matrix Completion via Alternating Minimizations
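
As a small worked example of the regression entry, the sketch below fits ordinary least squares for a single feature plus an intercept using the closed-form solution, then scores new inputs. It is a toy illustration in plain Python; the function names fit_simple_linear_regression and predict are assumptions for this example and do not come from SystemDS.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x (closed form)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, xs):
    """Score new inputs with a fitted (intercept, slope) pair."""
    intercept, slope = model
    return [intercept + slope * x for x in xs]


# Data generated from y = 1 + 2x, so the fit should recover those values.
model = fit_simple_linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
predictions = predict(model, [4, 5])
print(model, predictions)
```

For many features the same idea becomes solving the normal equations over matrices, which is the kind of linear algebra program these systems are built to scale.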

8
SystemDS Algorithms
  • Survival Analysis
    • Kaplan-Meier Survival Analysis
    • Cox Proportional Hazard Regression Model
  • Factorization Machines
    • Factorization Machine
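
For readers new to survival analysis, the following sketch shows the Kaplan-Meier product-limit estimate on a handful of records. At each observed event time the survival probability is multiplied by (1 - events / subjects still at risk), and censored subjects only reduce the at-risk count. This is a minimal illustration with made-up data, and the function name kaplan_meier is an assumption rather than a SystemDS call.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival)] for each distinct event time.

    durations: time until the event or until censoring
    observed:  True if the event occurred, False if censored
    """
    if len(durations) != len(observed):
        raise ValueError("inputs must have the same length")
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events (not censorings) that happen exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up data: five subjects, two of them censored (at times 2 and 4).
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
for time, prob in curve:
    print(time, round(prob, 3))
```

The estimate only drops at the three event times; the censored records at times 2 and 4 shrink the at-risk count without producing a drop of their own.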

9
SystemDS Deep Neural Nets
  • Use SystemDS to implement deep neural networks
    • Specify the network in Keras format and invoke
      it with the Keras2DML API
    • Specify the network in Caffe format and invoke
      it with the Caffe2DML API
    • Use the DML-bodied SystemDS-NN library
  • Ease training compute resource issues with
    • Native BLAS (Basic Linear Algebra Subprograms)
    • SystemDS GPU backend

10
Available Books
  • See Big Data Made Easy
    • Apress Jan 2015
  • See Mastering Apache Spark
    • Packt Oct 2015
  • See Complete Guide to Open Source Big Data
    Stack
    • Apress Jan 2018
  • Find the author on Amazon
    • www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
  • Connect on LinkedIn
    • www.linkedin.com/in/mike-frampton-38563020

11
Connect
  • Feel free to connect on LinkedIn
    • www.linkedin.com/in/mike-frampton-38563020
  • See my open source blog at
    • open-source-systems.blogspot.com/
  • I am always interested in
    • New technology
    • Opportunities
    • Technology-based issues
    • Big data integration