Title: Apache Flink Training
APACHE FLINK
Apache Flink
The following topics will be covered in our
Apache Flink Online Training
What is Apache Flink?
- Apache Flink is an open-source stream processing
framework developed by the Apache Software Foundation. The core of
Apache Flink is a distributed streaming dataflow
engine written in Java and Scala. Apache Flink's
dataflow programming model provides
event-at-a-time processing on both finite and
infinite datasets. At a basic level, Flink
programs consist of streams and transformations.
Conceptually, a stream is a (potentially
never-ending) flow of data records, and a
transformation is an operation that takes one
or more streams as input, and produces one or
more output streams as a result. Programs can be
written in Java, Scala, Python, and SQL and are
automatically compiled and optimized into
dataflow programs that are executed in a cluster
or cloud environment.
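
The stream and transformation idea described above can be sketched in plain Python. This is only a conceptual illustration, not the Flink API: the helper names (source, map_stream, filter_stream, run_pipeline) are hypothetical, and a real Flink program would run distributed across a cluster instead of in a single process.

```python
from typing import Callable, Iterable, Iterator


def source(records: Iterable[str]) -> Iterator[str]:
    """A stream: records are handed on one at a time and could be endless."""
    for record in records:
        yield record


def map_stream(fn: Callable, stream: Iterator) -> Iterator:
    """A transformation: one input stream in, one output stream out."""
    for record in stream:
        yield fn(record)


def filter_stream(predicate: Callable, stream: Iterator) -> Iterator:
    """A transformation that keeps only the records matching the predicate."""
    for record in stream:
        if predicate(record):
            yield record


def run_pipeline(records: Iterable[str]) -> list:
    """Chain source -> map -> filter and collect the results (the sink)."""
    lengths = map_stream(len, source(records))
    return list(filter_stream(lambda n: n > 3, lengths))


if __name__ == "__main__":
    print(run_pipeline(["flink", "job", "stream", "api"]))
```

Because each stage is a generator, a record passes through every transformation before the next record is read, which loosely mirrors event-at-a-time processing.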
Why Apache Flink?
- Flink provides a high-throughput, low-latency
streaming engine as well as support for
event-time processing and state management. Flink
applications are fault-tolerant in the event of
machine failure and support exactly-once
semantics. Flink executes arbitrary dataflow
programs in a data-parallel and pipelined
manner. Flink's pipelined runtime system enables
the execution of bulk/batch and stream
processing programs. Furthermore, Flink's runtime
supports the execution of iterative algorithms
natively.
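
To make event-time processing and state a little more concrete, here is a minimal plain-Python sketch, not Flink code. It assumes each event is a (key, timestamp) pair and counts events per key in fixed-size (tumbling) windows, based on the timestamp carried by the event instead of its arrival order. The function name tumbling_window_counts is hypothetical, and the sketch leaves out watermarks, checkpointing, and distribution, which a real Flink job relies on.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[str, int]  # (key, event timestamp in milliseconds)


def tumbling_window_counts(
    events: Iterable[Event], window_ms: int
) -> Dict[Tuple[str, int], int]:
    """Count events per (key, window start), using each event's own timestamp."""
    counts: Dict[Tuple[str, int], int] = defaultdict(int)  # the keyed state
    for key, timestamp_ms in events:
        # The window is chosen from the event time, not from arrival order.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(key, window_start)] += 1
    return dict(counts)


if __name__ == "__main__":
    # The last event arrives late but still lands in the first window.
    events = [("a", 1000), ("b", 1500), ("a", 6200), ("a", 4999)]
    print(tumbling_window_counts(events, window_ms=5000))
```

The out-of-order event ("a", 4999) is counted in the window starting at 0 even though it arrives after an event from the next window, which is the main point of assigning windows by event time.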
Flink Introduction
- Architecture
- Distributed Execution
- Job Manager
- Task Manager
- Features
- Deploying Flink on Google Cloud and AWS
Data Stream API
- Execution environment
- Data sources
- Transformations
- Data sinks
- Connectors
Batch Processing API
- Data sources
- Transformations
- Broadcast Variable
- Connectors to various Systems
- Iterations
Structured data handling using the Table API
- Registering tables
- Accessing the registered table
- Operators
- Data types
- SQL
Complex event processing
- Introduction to CEP and Flink CEP
- Event Streams
- Pattern API
- Continuity
- Selecting from Pattern
Graph API
- Flink Graph Library Gelly
- Graph Representation
- Graph Properties
- Graph Transformations
- Graph Mutations
- Iterative Graph Processing
- Scatter-Gather Processing
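
As a rough illustration of the scatter-gather style of iterative graph processing listed above, the following plain-Python sketch computes single-source shortest paths. It does not use Gelly; the function name shortest_paths and the adjacency-list input format are assumptions made for this example. In each superstep, active vertices scatter candidate distances along their outgoing edges, and receiving vertices gather those messages and keep the smallest one.

```python
import math
from typing import Dict, List, Tuple

Edges = Dict[str, List[Tuple[str, float]]]  # vertex -> [(target, weight)]


def shortest_paths(
    edges: Edges, source: str, max_supersteps: int = 100
) -> Dict[str, float]:
    """Single-source shortest paths written in a scatter-gather style."""
    vertices = set(edges)
    for targets in edges.values():
        vertices.update(target for target, _ in targets)

    distance = {vertex: math.inf for vertex in vertices}
    distance[source] = 0.0
    active = {source}  # vertices whose value changed in the last superstep

    for _ in range(max_supersteps):
        if not active:
            break
        # Scatter: each active vertex sends a candidate distance to its neighbours.
        inbox: Dict[str, List[float]] = {}
        for vertex in active:
            for target, weight in edges.get(vertex, []):
                inbox.setdefault(target, []).append(distance[vertex] + weight)
        # Gather: each vertex keeps the best message if it improves its value.
        active = set()
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < distance[vertex]:
                distance[vertex] = best
                active.add(vertex)
    return distance


if __name__ == "__main__":
    graph = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 2.0)], "c": []}
    print(shortest_paths(graph, "a"))
```

The iteration stops once no vertex changes its value, so only vertices that were updated in one superstep send messages in the next.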
Integration between Flink and Hadoop
- Flink-Yarn Session
- Job Submission to Flink
- Execution of a Flink job on YARN
- Flink and YARN interaction details