Apache Flink Training

Description:

Learntek is a global online training provider offering courses in Big Data Analytics, Hadoop, Machine Learning, Deep Learning, IoT, AI, Cloud Technology, DevOps, Digital Marketing, and other IT and management subjects.

Slides: 14
Provided by: learntek12
Transcript and Presenter's Notes

1
APACHE FLINK
2
Apache Flink
The following topics will be covered in our
Apache Flink Online Training
3
What is Apache Flink?
  • Apache Flink is an open source stream processing
    framework developed by the Apache Software
    Foundation.
  • The core of Apache Flink is a distributed
    streaming dataflow engine written in Java and
    Scala.
  • Apache Flink's dataflow programming model
    provides event-at-a-time processing on both
    finite and infinite datasets.
  • At a basic level, Flink programs consist of
    streams and transformations.

4
What is Apache Flink? (continued)
Conceptually, a stream is a (potentially
never-ending) flow of data records, and a
transformation is an operation that takes one
or more streams as input, and produces one or
more output streams as a result. Programs can be
written in Java, Scala, Python, and SQL and are
automatically compiled and optimized into
dataflow programs that are executed in a cluster
or cloud environment.
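
The stream and transformation idea can be sketched in a few lines of plain Python. This is only a conceptual illustration, not the Flink API: the names source, map_stream and filter_stream are hypothetical helpers, and generators stand in for streams that could in principle never end.

```python
from typing import Callable, Iterable, Iterator


def source(records: Iterable[str]) -> Iterator[str]:
    """A stream: yields records one at a time (could be unbounded)."""
    for record in records:
        yield record


def map_stream(fn: Callable, stream: Iterator) -> Iterator:
    """A transformation: takes one stream in, produces one stream out."""
    for record in stream:
        yield fn(record)


def filter_stream(pred: Callable, stream: Iterator) -> Iterator:
    """A transformation that keeps only records satisfying pred."""
    for record in stream:
        if pred(record):
            yield record


# Compose source -> filter -> map. Nothing runs until the result is consumed,
# and each record flows through the whole chain one at a time.
lines = source(["flink", "", "stream", "batch"])
pipeline = map_stream(str.upper, filter_stream(bool, lines))
result = list(pipeline)
print(result)
```

Because every stage is lazy, the same chain would work unchanged on a source that never terminates, which is the sense in which a program is built from streams and transformations.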
5
Why Apache Flink?
  • Flink provides a high-throughput, low-latency
    streaming engine as well as support for
    event-time processing and state management.
  • Flink applications are fault-tolerant in the
    event of machine failure and support
    exactly-once semantics.
  • Flink executes arbitrary dataflow programs in a
    data-parallel and pipelined manner.
  • Flink's pipelined runtime system enables the
    execution of both bulk/batch and stream
    processing programs.
  • Flink's runtime also supports the execution of
    iterative algorithms natively.
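
To make event-time processing and state management more concrete, here is a minimal plain-Python sketch (not Flink code). It assumes a hypothetical 10-second tumbling window and events given as (key, event_time) pairs; it leaves out watermarks, checkpointing and everything else a real engine needs for fault tolerance.

```python
from collections import defaultdict

WINDOW_SIZE = 10  # seconds; hypothetical window length for illustration


def window_start(event_time: int) -> int:
    """Start of the tumbling window that contains event_time."""
    return event_time - (event_time % WINDOW_SIZE)


def count_per_window(events):
    """Count events per (key, window) using the timestamp each event carries."""
    state = defaultdict(int)  # keyed state: (key, window_start) -> count
    for key, event_time in events:
        state[(key, window_start(event_time))] += 1
    return dict(state)


# Events arrive out of order (time 7 after time 14); because the window is
# chosen from the event's own timestamp, each still lands in the right window.
events = [("a", 3), ("b", 12), ("a", 14), ("a", 7), ("b", 19)]
counts = count_per_window(events)
print(counts)
```

The point of the sketch is that the result depends on when events happened, not on when they arrived, and that the per-key counts are state the engine must keep between records.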

6
Flink Introduction
  • Architecture
  • Distributed Execution
  • Job Manager
  • Task Manager
  • Features
  • Deploying Flink on Google Cloud and AWS

7
Data Stream API
  • Execution environment
  • Data sources
  • Transformations
  • Data sinks
  • Connectors

8
Batch Processing API
  • Data sources
  • Transformations
  • Broadcast Variable
  • Connectors to various Systems
  • Iterations

9
Structured data handling using the Table API
  • Registering tables
  • Accessing the registered table
  • Operators
  • Data types
  • SQL

10
Complex event processing
  • Introduction to CEP and Flink CEP
  • Event Streams
  • Pattern API
  • Continuity
  • Selecting from Pattern
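
As a rough illustration of the Pattern API and Continuity topics, the plain-Python sketch below (not the Flink CEP API) matches a two-event pattern over a list of event names. The function match_pairs and the event names are hypothetical; the strict flag mimics the difference between requiring the second event immediately after the first and allowing unrelated events in between.

```python
def match_pairs(events, first, second, strict):
    """Return index pairs (i, j) where `first` at i is followed by `second` at j.

    strict=True  : `second` must be the very next event.
    strict=False : unrelated events may occur in between; only the first
                   `second` after each `first` is matched.
    """
    matches = []
    for i, event in enumerate(events):
        if event != first:
            continue
        if strict:
            if i + 1 < len(events) and events[i + 1] == second:
                matches.append((i, i + 1))
        else:
            for j in range(i + 1, len(events)):
                if events[j] == second:
                    matches.append((i, j))
                    break
    return matches


events = ["fail", "ok", "fail", "alert", "fail", "alert"]
strict_matches = match_pairs(events, "fail", "alert", strict=True)
relaxed_matches = match_pairs(events, "fail", "alert", strict=False)
print(strict_matches)
print(relaxed_matches)
```

With the relaxed setting the "fail" at index 0 also matches, because the "ok" between it and the next "alert" is skipped.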

11
Graph API
  • Flink Graph Library Gelly
  • Graph Representation
  • Graph Properties
  • Graph Transformations
  • Graph Mutations
  • Iterative Graph Processing
  • Scatter-Gather Processing
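
The scatter-gather style of iterative graph processing can be sketched in plain Python as follows. This is not Gelly code; it is a minimal single-source shortest paths example under assumed names (sssp_scatter_gather, a dict-based graph), in which each superstep has active vertices send candidate distances along their outgoing edges and receiving vertices keep the smallest one.

```python
import math


def sssp_scatter_gather(edges, source, max_supersteps=100):
    """edges: dict mapping vertex -> list of (neighbor, weight)."""
    vertices = set(edges)
    for targets in edges.values():
        vertices.update(t for t, _ in targets)
    dist = {v: math.inf for v in vertices}
    dist[source] = 0.0
    active = {source}  # only vertices whose value changed send messages
    for _ in range(max_supersteps):
        if not active:
            break
        # Scatter: each active vertex sends a candidate distance to neighbors.
        inbox = {}
        for v in active:
            for target, weight in edges.get(v, []):
                inbox.setdefault(target, []).append(dist[v] + weight)
        # Gather: each vertex keeps the best message if it improves its value.
        active = set()
        for v, messages in inbox.items():
            best = min(messages)
            if best < dist[v]:
                dist[v] = best
                active.add(v)
    return dist


graph = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 2.0)], "c": [("d", 1.0)]}
distances = sssp_scatter_gather(graph, "a")
print(distances)
```

The loop stops once no vertex updates its value, which is the usual convergence condition for this kind of iteration.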

12
Integration between Flink and Hadoop
  • Flink-Yarn Session
  • Job Submission to Flink
  • Execution of a Flink job on YARN
  • Flink and YARN interaction details
