Overview

The Apache Spark with Scala course is designed to make you an expert in the Big Data Hadoop ecosystem. It covers the Scala programming language, Spark Streaming, Spark SQL, machine learning programming, GraphX programming, and shell scripting for Spark.

Availability: In stock

Regular Price: ₹12,999.00

Special Price: ₹11,999.00

Upcoming Classes

31st Aug 2019
4 Days
Weekend Online (WebEx) & Offline (Bangalore) Classes - Saturday & Sunday
Time : 11 AM to 6 PM
Price : ₹11,999

30th Nov 2019
4 Days
Weekend Online (WebEx) & Offline (Bangalore) Classes - Saturday & Sunday
Time : 11 AM to 6 PM
Price : ₹11,999

Course Description

Day-wise distribution of classes :

  • Day 1 : Scala, Spark introduction and components of Spark, operations on a single RDD
  • Day 2 : Operations on paired RDDs, fault tolerance and persistence, optimizing Spark code, IO in Spark
  • Day 3 : Spark Streaming, SparkSQL

Course Instructor : Raju Kumar Mishra

Objective

Note :

  1. I will start with a set of objective questions that gives me an idea of the participants' background and encourages them to learn.
  2. Every fundamental concept is backed by objective questions and hands-on exercises.
  3. At the end of every day there will be an objective test.
  4. There will be a number of real-world stories related to big data and Spark.
  5. I will end the course with a set of multiple-choice questions that demonstrates the participants' improvement.

 

Introduction to Big Data and Distributed Computing :

 

Big data analysis is the future. This section of the course will help you understand the need for distributed computation.

  • Introduction to data.
  • Data science: a vision.
  • Introduction to big data.
  • Parallel computation.
  • Problems with parallel computation.
  • Traditional parallel computation systems.

Hadoop :

  • Introduction to Hadoop.
  • Hadoop Components.
  • HDFS and its architecture.
  • HDFS commands :
    • mkdir
    • ls
    • rmdir and rm
    • copyFromLocal
    • put
    • cat
    • copyToLocal
    • get
    • touchz
    • mv
    • cp
    • distcp
    • etc.
  • fsimage and edits log files.
  • Hadoop property files.
  • Introduction to MapReduce.
  • Shortcomings of MapReduce.

 

Scala :

  • Introduction to Scala
  • Scala variables
  • Operators in Scala
  • Interactive mode and script-based programming: an introduction
  • Scala data types and operations on them
  • Scala collections (Tuple, Map, etc.)
  • Control flow and looping in Scala
  • Functions in Scala (declaration, definition, types and calling)
  • Object-oriented Scala
  • Introduction to functional programming in Scala
  • Pattern matching: an introduction
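
As a taste of the Scala topics listed above, here is a minimal sketch in plain Scala (no Spark needed). The object name, values, and function names are made up for illustration and are not taken from the course material.

```scala
// Hypothetical names throughout; plain Scala, no Spark required.
object ScalaBasics {
  // A tuple groups a fixed number of values of possibly different types.
  val course: (String, Int) = ("Spark with Scala", 4)

  // An immutable Map from topic to the day on which it is taught.
  val topicDay: Map[String, Int] =
    Map("Scala" -> 1, "Paired RDD" -> 2, "Spark Streaming" -> 3)

  // A function definition with an explicit return type.
  def square(x: Int): Int = x * x

  // Functional style: map is a higher-order function that takes square as an argument.
  val squares: List[Int] = List(1, 2, 3).map(square)

  // Pattern matching on a literal, on a type with a guard, and on tuple structure.
  def describe(value: Any): String = value match {
    case 0                         => "zero"
    case n: Int if n > 0           => "positive integer"
    case (name: String, days: Int) => s"$name runs for $days days"
    case _                         => "something else"
  }

  def main(args: Array[String]): Unit = {
    println(squares)
    println(describe(course))
    println(topicDay.getOrElse("Scala", -1))
  }
}
```

The same expressions can be typed one at a time into the Scala interactive shell to explore each feature separately.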

 

Spark Introduction :

  • Introduction to Spark.
  • Spark and Hadoop (Similarity and Differences)
  • Spark execution (master-slave system, driver, cluster manager and executors)
  • Spark Shell
  • Resilient Distributed Dataset (RDD)

Operations On RDD :

  • Creation of RDD
  • Transformation and Action Introduction
  • Lazy evaluation
  • Some important transformations :
    • filter
    • map
    • flatMap
    • distinct
    • sample
    • union
    • intersection
    • subtract
    • cartesian
  • Some important actions :
    • first
    • take
    • top
    • reduce
    • fold
    • aggregate
    • foreach
    • count
    • collect
  • Creation of paired RDD
  • Some important transformations on pair RDD :
    • combineByKey
    • mapValues
    • groupByKey
    • reduceByKey
    • sortByKey
    • subtractByKey
    • Joins and their types
    • cogroup
  • Some important actions on pair RDD :
    • lookup
    • collectAsMap
    • countByKey
  • Hands-on with all the functions
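
To make the difference between lazy transformations and actions concrete, the following is a minimal sketch that uses a few of the functions listed above. It assumes Spark is on the classpath and runs in local mode; the object name, helper functions, and sample lines are hypothetical and are not the course's own exercises.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

// Hypothetical sketch; names and data are illustrative only.
object RddOperationsSketch {
  // flatMap, map and reduceByKey are lazy transformations;
  // collectAsMap is the action that actually runs the job.
  def wordCounts(sc: SparkContext, lines: Seq[String]): Map[String, Int] = {
    val words = sc.parallelize(lines).flatMap(_.split(" "))        // creation of an RDD + transformation
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _)   // paired RDD, summed per key
    counts.collectAsMap().toMap
  }

  // filter and distinct are transformations; collect is an action.
  def distinctLongWords(sc: SparkContext, lines: Seq[String], minLength: Int): Seq[String] =
    sc.parallelize(lines)
      .flatMap(_.split(" "))
      .filter(_.length >= minLength)
      .distinct()
      .collect()
      .sorted
      .toSeq

  def main(args: Array[String]): Unit = {
    // local[*] runs Spark inside this JVM; an assumption made for easy experimentation.
    val spark = SparkSession.builder().appName("rdd-operations-sketch").master("local[*]").getOrCreate()
    val lines = Seq("spark runs on the jvm", "scala runs on the jvm")
    println(wordCounts(spark.sparkContext, lines))
    println(distinctLongWords(spark.sparkContext, lines, 4))
    spark.stop()
  }
}
```

Nothing is computed until `collectAsMap` or `collect` is called, which is the lazy evaluation behaviour named in the list above.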

Fault tolerance and Persistence :

  • RDD lineage
  • persistence
  • Benefit of persistence

 

Optimizing Spark programs :

  • Introduction to partitioning
  • Built-in partitioners (Hash and Range)
  • Benefits of partitioning
  • groupByKey and reduceByKey comparison
  • Spark broadcasting and accumulators
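
The groupByKey and reduceByKey comparison, and the use of broadcast variables and accumulators, can be sketched as follows. This is an illustrative example under the assumption of a local Spark installation; the object name, function names, and sample data are hypothetical.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

// Hypothetical sketch; names and data are illustrative only.
object OptimizingSketch {
  // reduceByKey combines values inside each partition before the shuffle,
  // so less data usually crosses the network.
  def sumWithReduceByKey(sc: SparkContext, sales: Seq[(String, Int)]): Map[String, Int] =
    sc.parallelize(sales).reduceByKey(_ + _).collectAsMap().toMap

  // groupByKey shuffles every value and only then sums them; same result, more data moved.
  def sumWithGroupByKey(sc: SparkContext, sales: Seq[(String, Int)]): Map[String, Int] =
    sc.parallelize(sales).groupByKey().mapValues(_.sum).collectAsMap().toMap

  // A broadcast variable ships a small lookup table to the executors once;
  // an accumulator counts the codes that were not found.
  // Note: accumulator updates made inside a transformation may be applied
  // more than once if a task is re-executed.
  def labelCities(sc: SparkContext, codes: Seq[String], names: Map[String, String]): (Seq[String], Long) = {
    val lookup = sc.broadcast(names)
    val unknown = sc.longAccumulator("unknown codes")
    val labelled = sc.parallelize(codes).map { code =>
      lookup.value.getOrElse(code, { unknown.add(1L); "unknown" })
    }.collect().toSeq // the action runs the job, so the accumulator is populated afterwards
    (labelled, unknown.sum)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("optimizing-sketch").master("local[*]").getOrCreate()
    val sc = spark.sparkContext
    val sales = Seq(("a", 1), ("b", 2), ("a", 3))
    println(sumWithReduceByKey(sc, sales))
    println(sumWithGroupByKey(sc, sales))
    println(labelCities(sc, Seq("BLR", "XX"), Map("BLR" -> "Bangalore")))
    spark.stop()
  }
}
```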

IO in Spark :

  • Text file
  • CSV file
  • JSON
  • Data from HDFS

   

Spark Streaming :

  • Introduction to Spark Streaming
  • Transformation
  • Reading from HDFS
  • Window Concept
  • Push-based and pull-based receivers
  • Kafka integration with Spark Streaming
  • Performance

SparkSQL :

  • Introduction to SparkSQL
  • SparkSQL datatype
  • DataFrame: an introduction.
  • Creation of a DataFrame.
  • Summary statistics on a DataFrame.
  • Aggregation on given data.
  • SparkSQL and SQL
  • Introduction to Hive.
  • Using data from Hive and HiveQL.
  • Optimizing SparkSQL code.
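
The SparkSQL topics above (creating a DataFrame, summary statistics, aggregation, and the relationship between SparkSQL and SQL) can be sketched as follows. The object name, column names, and rows are made up for illustration, and a local Spark installation is assumed.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, count}

// Hypothetical sketch; names and data are illustrative only.
object SparkSqlSketch {
  // Creation of a DataFrame from a local collection (made-up rows).
  def samplePeople(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("Asha", "Bangalore", 31), ("Ravi", "Bangalore", 27), ("Meena", "Pune", 35))
      .toDF("name", "city", "age")
  }

  // Aggregation with the DataFrame API.
  def statsByCity(people: DataFrame): DataFrame =
    people.groupBy("city")
      .agg(count("name").as("people"), avg("age").as("avg_age"))
      .orderBy("city")

  // The same aggregation expressed in SQL over a temporary view.
  def statsByCitySql(spark: SparkSession, people: DataFrame): DataFrame = {
    people.createOrReplaceTempView("people")
    spark.sql(
      "SELECT city, COUNT(name) AS people, AVG(age) AS avg_age " +
        "FROM people GROUP BY city ORDER BY city")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("sparksql-sketch").master("local[*]").getOrCreate()
    val people = samplePeople(spark)
    people.describe("age").show() // summary statistics for one column
    statsByCity(people).show()
    statsByCitySql(spark, people).show()
    spark.stop()
  }
}
```

Both `statsByCity` and `statsByCitySql` should yield the same rows; which style to use is largely a matter of preference.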

Spark Code Deployment and Cluster Managers :

  • Submitting Spark code on the Standalone cluster manager.
  • Submitting Spark code on YARN.
  • Submitting Spark code on Mesos.

 

Note : Every part of the course is accompanied by hands-on exercises. A number of objective questions will keep you thinking throughout.

 

Projects :

 

Project 1 : Data preparation and aggregation with Spark core. The aggregation will be implemented using the Spark core APIs.

The MovieLens dataset will be used for the aggregation.

 

Project 2 : Visualizing word frequencies in streaming data using Kafka and Spark Streaming integration.

 

Project 3 : Implementation of a moving average using SparkSQL.
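
One common way to compute a moving average in SparkSQL is a window function over a sliding frame of rows. The sketch below assumes a three-row window and made-up `day` and `price` columns; the project's actual data, window size, and approach may differ.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.avg

// Hypothetical sketch; column names, window size and data are assumptions.
object MovingAverageSketch {
  // Average of the current row and the two rows before it, ordered by day.
  // SQL equivalent: AVG(price) OVER (ORDER BY day ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
  // Without partitionBy, Spark moves all rows to a single partition, which is fine for a small demo only.
  def movingAverage(prices: DataFrame): DataFrame = {
    val window = Window.orderBy("day").rowsBetween(-2, Window.currentRow)
    prices.withColumn("moving_avg", avg("price").over(window))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("moving-average-sketch").master("local[*]").getOrCreate()
    import spark.implicits._
    val prices = Seq((1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)).toDF("day", "price")
    movingAverage(prices).orderBy("day").show()
    spark.stop()
  }
}
```

For the first rows, where fewer than three values are available, the average is taken over the rows that exist.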

 

Project 4 : Data preprocessing, data manipulation and aggregation using SparkSQL. This will be done using real-time data.

Customer Reviews

Chiranjibi

Nice place to learn Spark

My experience with Wallsoul was very informative.
I gained practical knowledge as well as some theoretical knowledge.
The team of coordinators was very supportive and the faculty was awesome.
I appreciate my trainer (Mr. Raju Mishra), who gave me a lot of knowledge to grow in the Big Data world.

Arun Goudar

Excellent platform to explore technologies.

This course helped me understand Spark and Scala in depth. Spending time here is worth more than at any other institute. Good facilities and interactive sessions helped me gain more knowledge of the technology. Real-time examples and scenarios gave me a clear picture of how Spark components are used. If you are looking for in-depth knowledge of the technology, you are in the right place. Happy Learning.
