Spark for Data Science Cookbook

by Padma Priya Chitturi

Year: 2017

Pages: 692

Publisher: Packt Publishing

ISBN: 978-1785880100, 1785880101


KEY FEATURES

* Optimize your workflow with Spark in data science, and get solutions to all your big data problems
* Large-scale data science made easy with Spark
* Get recipes to make the most of Spark's power and speed in predictive analytics

BOOK DESCRIPTION

Spark has emerged as the big data platform of choice for data scientists. The real power and value proposition of Apache Spark is its platform to execute data science tasks. Spark's unique use case is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualizations to allow data scientists to tackle the complexities that come with raw unstructured data sets.

This hands-on, practical resource will allow you to dive in and become comfortable and confident in working with Spark for data science. We will walk you through various techniques to deal with simple and complex data science tasks with Spark. We'll effectively offer solutions to problematic concepts in data science using Spark's data science libraries. The book will help you derive intelligent information at every step of the way through simple yet efficient recipes that will not only show you how to implement algorithms, but also optimize your work.

WHAT YOU WILL LEARN

* Explore the topics of data mining, text mining, NLP, information retrieval, and machine learning
* Solve real-world analytical problems with large data sets
* Get the flavor of challenges in data science and address them with a variety of analytical tools on a distributed system like Spark (apt for iterative algorithms), which offers in-memory processing and more flexibility for data analysis at scale


Algorithms, Big Data, Data Analysis, Data Mining, Data Science, Machine Learning, NLP, Processing, Spark, Tools
