Course

Big Data Processing using Apache Spark

Packt

Updated Jan 21, 2019

Every year, the volume of data that we need to store and analyze grows substantially. When we aggregate all the data about our users and analyze it to find insights, terabytes of data undergo processing. To process such amounts of data, we need a technology that can distribute computations and make them more efficient. Apache Spark is a technology that allows us to process big data, leading to faster and more scalable processing. In this course, we will learn how to leverage Apache Spark to process big data quickly. We will cover the basics of the Spark API and its architecture in detail. In the second section of the course, we will learn about data mining and data cleaning, looking at the input data structure and how input data is loaded. In the third section, we will write actual jobs that analyze data. By the end of the course, you will have a sound understanding of the Spark framework, which will help you write code and understand how big data is processed.

Style and Approach: Filled with hands-on examples, this course will help you learn how to process big data using Apache Spark.

Target Audience

If you are a software engineer interested in big data processing, then this course is for you. A basic understanding and functional knowledge of Apache Spark and big data are required.

Business Outcomes

Explore the Apache Spark architecture and delve into its API and key features
Implement efficient big data processing using this framework
Write code that is maintainable and easy to test

Packt

GLOBAL

Packt is a global IT content provider in the web development and emerging technology space. Founded in 2004 in Birmingham, UK, Packt's mission is to help the world put software to work in new ways, through the delivery of effective learning and information services to IT professionals. Packt's content is developed for IT professionals, web developers, students, and IT hobbyists who are looking to upskill or re-skill. The content is designed to be hands-on rather than theory-based, with a clear focus on 'learning by doing' that gives learners something to show at the end of each course. Through our partnership with Packt, Go1 Content Hub customers will now have access to the very latest AI, machine learning, data science, and web development online courses.