Apache Flink Online Training PowerPoint PPT Presentation


Title: Apache Flink Online Training


1
  • APACHE FLINK 

2
Apache Flink
  • The following topics will be covered in our
    Apache Flink Online Training.

3
What is Apache Flink?
  • Apache Flink is an open source stream processing
    framework developed by the Apache Software
    Foundation. The core of Apache Flink is a
    distributed streaming dataflow engine written in
    Java and Scala. Apache Flink's dataflow
    programming model provides event-at-a-time
    processing on both finite and infinite datasets.
    At a basic level, Flink programs consist of
    streams and transformations.

4
What is Apache Flink? (continued)
  • Conceptually, a stream is a (potentially
    never-ending) flow of data records, and a
    transformation is an operation that takes one or
    more streams as input, and produces one or more
    output streams as a result. Programs can be
    written in Java, Scala, Python, and SQL and are
    automatically compiled and optimized into
    dataflow programs that are executed in a cluster
    or cloud environment.
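
The stream and transformation idea described above can be sketched in plain Python. This is only a conceptual illustration, assuming generators as a stand-in for streams; it does not use Flink's API, and the names `source`, `map_stream`, `filter_stream`, `merge`, and `demo` are hypothetical helpers invented for this sketch.

```python
from itertools import zip_longest
from typing import Callable, Iterable, Iterator, List


def source(records: Iterable[str]) -> Iterator[str]:
    """Stand-in for a stream: hands out records one at a time."""
    for record in records:
        yield record


def map_stream(fn: Callable[[str], str], stream: Iterable[str]) -> Iterator[str]:
    """Transformation: one input stream, one output stream."""
    for record in stream:
        yield fn(record)


def filter_stream(pred: Callable[[str], bool], stream: Iterable[str]) -> Iterator[str]:
    """Transformation that drops records failing the predicate."""
    for record in stream:
        if pred(record):
            yield record


def merge(*streams: Iterable[str]) -> Iterator[str]:
    """Transformation with several input streams, interleaved round-robin."""
    sentinel = object()
    for group in zip_longest(*streams, fillvalue=sentinel):
        for record in group:
            if record is not sentinel:
                yield record


def demo() -> List[str]:
    """Chain the transformations into a small dataflow and collect the output."""
    words = source(["flink", "", "stream"])
    more = source(["batch"])
    pipeline = map_stream(str.upper, filter_stream(bool, merge(words, more)))
    return list(pipeline)
```

Because each step is a generator, records flow through the chain one at a time rather than being materialized between steps, which loosely mirrors the event-at-a-time model. A real Flink program would instead be distributed across a cluster and optimized before execution.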

5
Why Apache Flink?
  • Flink provides a high-throughput, low-latency
    streaming engine as well as support for
    event-time processing and state management. Flink
    applications are fault-tolerant in the event of
    machine failure and support exactly-once
    semantics. Flink executes arbitrary dataflow
    programs in a data-parallel and pipelined manner.
    Flink's pipelined runtime system enables the
    execution of bulk/batch and stream processing
    programs. Furthermore, Flink's runtime supports
    the execution of iterative algorithms natively.
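
To make event-time processing and state a little more concrete, here is a minimal plain-Python sketch, not Flink code. It assumes events are `(key, event_time_ms, value)` tuples and groups them into fixed-size (tumbling) windows by the timestamp carried in each event, so out-of-order arrival does not change the result. The function names and the event layout are assumptions made for this illustration. Watermarks, checkpointing, and exactly-once recovery are deliberately left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed event layout for this sketch: (key, event_time_ms, value)
Event = Tuple[str, int, int]


def window_start(event_time_ms: int, size_ms: int) -> int:
    """Start of the tumbling window that contains the given event time."""
    return event_time_ms - (event_time_ms % size_ms)


def tumbling_window_sums(
    events: Iterable[Event], size_ms: int
) -> Dict[Tuple[str, int], int]:
    """Sum values per (key, window start), using event time, not arrival order."""
    # This dictionary plays the role of keyed state: one running sum
    # per key and window.
    state: Dict[Tuple[str, int], int] = defaultdict(int)
    for key, event_time_ms, value in events:
        state[(key, window_start(event_time_ms, size_ms))] += value
    return dict(state)


def demo() -> Dict[Tuple[str, int], int]:
    """Events arrive out of order but still land in the correct windows."""
    events = [
        ("a", 1500, 1),
        ("a", 200, 2),   # arrives late, belongs to the earlier window
        ("b", 900, 5),
        ("a", 1000, 3),
    ]
    return tumbling_window_sums(events, size_ms=1000)
```

In this sketch the results are only read after all input has been consumed. A streaming engine has to decide when a window is complete while data is still arriving, and that is the part this example does not model.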

6
Flink Introduction
  • Architecture
  • Distributed Execution
  • Job Manager
  • Task Manager
  • Features
  • Deploying Flink on Google Cloud and AWS

7
Data Stream API 
  • Execution environment
  • Data sources
  • Transformations
  • Data sinks
  • Connectors

8
Batch Processing API
  • Data sources
  • Transformations
  • Broadcast Variable
  • Connectors to various Systems
  • Iterations

9
Structured data handling using the Table API
  • Registering tables
  • Accessing the registered table
  • Operators
  • Data types
  • SQL

10
Complex event processing
  • Introduction to CEP and Flink CEP
  • Event Streams
  • Pattern API
  • Continuity
  • Selecting from Pattern

11
Graph API
  • Flink Graph Library Gelly
  • Graph Representation
  • Graph Properties
  • Graph Transformations
  • Graph Mutations
  • Iterative Graph Processing
  • Scatter-Gather Processing

12
Integration between Flink and Hadoop
  • Flink-Yarn Session
  • Job Submission to Flink
  • Execution of a Flink job on YARN
  • Flink and YARN interaction details
