Apache Spark with Scala - Hands On with Big Data!

Get to grips with the fundamentals of Apache Spark for real-time Big Data processing

7 hours 23 minutes

Learn

Frame your Big Data problems as Apache Spark jobs

Set up the development environment for Scala and Apache Spark

Develop efficient Spark applications using Scala

Build and deploy Spark jobs on Hadoop clusters

Process real-time streams of data using Spark Streaming

Query your structured data using SparkSQL and work with the DataSets API

Analyze and process graph structures using Spark’s GraphX module

About

"Big data" analysis is a hot and highly valuable skill – and this course will teach you the hottest technology in big data: Apache Spark. Employers including Amazon, eBay, NASA JPL, and Yahoo all use Spark to quickly extract meaning from massive data sets across a fault-tolerant Hadoop cluster. You'll learn those same techniques, using your own Windows system right at home. It's easier than you might think, and you'll be learning from a former engineer and senior manager at Amazon and IMDb.

Learn and master the art of framing data analysis problems as Spark problems through over 20 hands-on examples, and then scale them up to run on cloud computing services.

This course is very hands-on; you'll spend most of your time following along with the instructor as you write, analyze, and run real code together – both on your own system, and in the cloud using Amazon's Elastic MapReduce service. About 7.5 hours of video content is included, with over 20 real examples of increasing complexity you can build, run, and study yourself. Move through them at your own pace, on your own schedule. The course wraps up with an overview of other Spark-based technologies, including Spark SQL, Spark Streaming, and GraphX.

Features

Understand the fundamentals of Scala and the Apache Spark ecosystem

Handle large streams of data with Spark Streaming and perform Machine Learning in real time with Spark MLlib

Comprehensive tutorial packed with practical examples to help you develop real-world Big Data applications with Spark and Scala

Course Length: 7 hours 23 minutes

ISBN: 9781787129849

Author

Frank Kane

Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers around the clock. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology and on teaching others about big data analysis.