Apache Spark Fundamentals

COURSE
Packt Admin
2 hrs


Course Overview 

This video course is a comprehensive tutorial that teaches you the fundamentals of Apache Spark, one of the trending big data processing frameworks on the market today. We introduce you to the various components of the Spark framework so you can efficiently process, analyze, and visualize data. You will also get a brief introduction to Apache Hadoop and the Scala programming language before you start writing Spark programs. You will learn Apache Spark programming fundamentals such as Resilient Distributed Datasets (RDDs) and see which operations perform transformations and actions on an RDD. We show you how to load and save data from various sources, including different file formats, NoSQL stores, and RDBMS databases. We also explain advanced Spark programming concepts such as managing key-value pairs and accumulators. Finally, you will discover how to create an effective Spark application and execute it on a Hadoop cluster to analyze your data and gain insights that support informed business decisions. By the end of this video course, you will be well-versed in the fundamentals of Apache Spark and able to put them into practice.

Style and Approach: Filled with examples, this course helps you learn the fundamentals of Apache Spark and get started quickly. You will learn to build Spark applications and execute them on a Hadoop cluster.
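As a taste of the RDD fundamentals covered in the course, the following minimal Scala sketch (our own illustration, not taken from the course material; the app name and local master are placeholders) shows the difference between lazy transformations and eager actions:

    import org.apache.spark.{SparkConf, SparkContext}

    object RddBasics {
      def main(args: Array[String]): Unit = {
        // Placeholder app name and local master for a quick experiment;
        // on a real cluster these would usually come from spark-submit.
        val conf = new SparkConf().setAppName("RddBasics").setMaster("local[*]")
        val sc   = new SparkContext(conf)

        // Transformations are lazy: they only describe new RDDs.
        val numbers = sc.parallelize(1 to 10)
        val squares = numbers.map(n => n * n)     // map is a transformation
        val evens   = squares.filter(_ % 2 == 0)  // filter is a transformation

        // Actions are eager: they trigger execution and return results.
        println(s"Count of even squares: ${evens.count()}")
        println(s"Values: ${evens.collect().mkString(", ")}")

        sc.stop()
      }
    }

Nothing is computed when map and filter are called; Spark only runs the job once an action such as count or collect asks for a result.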


Target Audience 

This course is for data scientists, big data developers, and analysts who want to learn the fundamentals of Apache Spark from a single, comprehensive source instead of spending countless hours on the internet piecing together bits from different sources. Some familiarity with Scala would be helpful.


Learning Objectives 

  • Learn the history of Apache Spark and get an introduction to its components
  • Learn how to get started with Apache Spark
  • Get an introduction to Apache Hadoop, its processes, and its components: HDFS, YARN, and MapReduce
  • Get an introduction to the Scala programming language and its fundamentals, such as classes, objects, and collections
  • Understand Apache Spark programming fundamentals and Resilient Distributed Datasets (RDDs)
  • See which operations perform transformations and actions on an RDD
  • Find out how to load and save data in Spark
  • Write a Spark application in Scala and execute it on a Hadoop cluster, as sketched below
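As a rough preview of the last objective, here is a minimal word-count application in Scala (a sketch of our own; the HDFS paths and object name are hypothetical) that loads a text file, builds key-value pairs, and saves the result:

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCountApp {
      def main(args: Array[String]): Unit = {
        // No master is set here, so the same jar can run locally or on YARN
        // depending on how it is launched with spark-submit.
        val conf = new SparkConf().setAppName("WordCountApp")
        val sc   = new SparkContext(conf)

        // Load a text file from HDFS (placeholder path), split it into words,
        // map each word to a (word, 1) pair, and sum the counts per word.
        val counts = sc.textFile("hdfs:///data/input.txt")
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)

        // Write the results back to HDFS; the output directory must not exist.
        counts.saveAsTextFile("hdfs:///data/wordcount-output")

        sc.stop()
      }
    }

Such an application is typically packaged into a jar and launched with spark-submit, passing a YARN master so that it runs on the Hadoop cluster.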

Business Outcomes  

  • Leverage the power of Apache Spark to perform efficient data processing and analytics on your data in real time
  • Process and analyze streams of data with ease and perform machine learning efficiently
  • Get the most out of a trending big data framework for all your data processing needs
Learning
Section 1: Introducing Spark
1.1 Course Overview (video)
1.2 Spark Introduction (video)
1.3 Spark Components (video)
1.4 Getting Started (video)
Section 2: Hadoop and Spark
2.1 Introduction to Hadoop (video)
2.2 Hadoop Processes and Components (video)
2.3 HDFS and YARN (video)
2.4 MapReduce (video)
Section 3: Scala from 30,000 feet
3.1 Introduction to Scala (video)
3.2 Scala Programming Fundamentals (video)
3.3 Objects in Scala (video)
3.4 Collections (video)
Section 4: Spark Programming
4.1 Spark Execution (video)
4.2 Understanding RDD (video)
4.3 RDD Operations (video)
Section 5: Advanced Spark Programming
5.1 Loading and Saving Data in Spark (video)
5.2 Managing Key-Value Pairs (video)
5.3 Accumulators (video)
5.4 Writing a Spark Application (video)