Course

Apache Spark Fundamentals

Packt

Updated Feb 13, 2019

This video is a comprehensive tutorial to help you learn the fundamentals of Apache Spark, one of the trending big data processing frameworks on the market today. We will introduce you to the components of the Spark framework that you use to process, analyze, and visualize data efficiently. You will also get a brief introduction to Apache Hadoop and the Scala programming language before you start programming with Spark. You will learn Spark programming fundamentals such as Resilient Distributed Datasets (RDDs) and see which operations perform a transformation or an action on an RDD. We'll show you how to load and save data from various data sources, including different types of files, NoSQL stores, and RDBMS databases. We'll also explain advanced Spark programming concepts such as managing key-value pairs and accumulators. Finally, you'll discover how to create an effective Spark application and execute it on a Hadoop cluster to analyze the data and gain insights for informed business decisions. By the end of this video, you will be well versed in the fundamentals of Apache Spark and able to apply them in Spark.

Style and Approach

Filled with examples, this course will help you learn the fundamentals of Apache Spark and get started with the framework. You will learn to build Spark applications and execute them on a Hadoop cluster.

Target Audience

This course is for data scientists, big data technology developers and analysts who want to learn the fundamentals of Apache Spark from a single, comprehensive source, instead of spending countless hours on the internet trying to take bits and pieces from different sources. Some familiarity with Scala would be helpful.

Business Outcomes

Leverage the power of Apache Spark to perform efficient data processing and analytics on your data in real time
Process and analyze streams of data with ease and perform machine learning efficiently
A comprehensive tutorial to help you get the most out of the trending Big Data framework for all your data processing needs

Related learning

Advanced Analytics and Real-Time Data Processing in Apache Spark ⋅ Course ⋅ 180 mins

Learning PySpark ⋅ Course ⋅ 180 mins

Big Data Processing using Apache Spark ⋅ Course ⋅ 60 mins

Big Data Analytics Using Apache Spark ⋅ Course ⋅ 840 mins

Packt

GLOBAL

Packt is an exciting global IT content provider in the web development and emerging technology space. Founded in 2004 in Birmingham, UK, Packt's mission is to help the world put software to work in new ways, through the delivery of effective learning and information services to IT professionals. Packt's content is developed for IT professionals, web developers, students and IT hobbyists who are looking to upskill or re-skill. The content is designed to be hands-on rather than theory-based, with a clear focus on 'learning by doing', giving learners something to show at the end of each course! Through our partnership with Packt, Go1 Content Hub customers will now have access to the very latest AI, machine learning, data science and web development online courses.