Explore the world of data science from scratch with Julia by your side
* An in-depth exploration of Julia's growing ecosystem of packages
* Work with the most powerful open-source libraries for deep learning, data wrangling, and data visualization
* Learn about deep learning using Mocha.jl and bring speed and high performance to data analysis on large data sets
Julia is a fast, high-performance language that is well suited to data science, with a mature package ecosystem, and it is now feature complete. It is a good tool for a data science practitioner. A famous Harvard Business Review article called data scientist the sexiest job of the 21st century (https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century).
This book will help you get familiarised with Julia's rich ecosystem, which is continuously evolving, allowing you to stay on top of your game.
This book contains the essentials of data science and gives a high-level overview of advanced statistics and techniques. You will generate insights by performing inferential statistics and reveal hidden patterns and trends using data mining. The book offers practical coverage of statistics and machine learning. You will develop the knowledge to build statistical models and machine learning systems in Julia, with attractive visualizations.
You will then delve into the world of deep learning in Julia and get to know Mocha.jl, a framework with which you can create artificial neural networks and implement deep learning.
This book addresses the challenges of real-world data science problems, including data cleaning, data preparation, inferential statistics, statistical modeling, building high-performance machine learning systems and creating effective visualizations using Julia.
WHAT YOU WILL LEARN
* Apply statistical models in Julia for data-driven decisions
* Understand the process of data munging and data preparation using Julia
* Explore techniques to visualize data using Julia and D3-based packages
* Use Julia to create self-learning systems with cutting-edge machine learning algorithms
* Create supervised and unsupervised machine learning systems using Julia, and explore ensemble models
* Build a recommendation engine in Julia
* Dive into Julia’s deep learning framework and build a system using Mocha.jl
ABOUT THE AUTHOR
Anshul Joshi is a data science professional with more than 2 years of experience, primarily in data munging, recommendation systems, predictive modeling, and distributed computing. He is a deep learning and AI enthusiast. Most of the time, he can be caught exploring GitHub or trying anything new he can get his hands on. He blogs on anshuljoshi.xyz.
TABLE OF CONTENTS
1. The Groundwork – Julia's Environment
2. Data Munging
3. Data Exploration
4. Deep Dive into Inferential Statistics
5. Making Sense of Data Using Visualization
6. Supervised Machine Learning
7. Unsupervised Machine Learning
8. Creating Ensemble Models
9. Time Series
10. Collaborative Filtering and Recommendation System
11. Introduction to Deep Learning
A practical guide to obtaining, transforming, exploring, and analyzing data using Python, MongoDB, and Apache Spark
Learn to use various data analysis tools and algorithms to classify, cluster, visualize, simulate, and forecast your data
Apply Machine Learning algorithms to different kinds of data such as social networks, time series, and images
A hands-on guide to understanding the nature of data and how to turn it into insight
Book Description
Beyond buzzwords like Big Data or Data Science, there are great opportunities to innovate in many businesses using data analysis to build data-driven products. Data analysis involves asking many questions about data in order to discover insights and generate value for a product or a service. This book explains the basic data algorithms without the theoretical jargon, and you'll get hands-on experience turning data into insights using machine learning techniques. We will perform data-driven innovation processing for several types of data such as text, images, social network graphs, documents, and time series, showing you how to implement large data processing with MongoDB and Apache Spark.
What you will learn
Acquire, format, and visualize your data
Build an image-similarity search engine
Generate meaningful visualizations anyone can understand
Get started with analyzing social network graphs
Find out how to implement sentiment text analysis
Install data analysis tools such as Pandas, MongoDB, and Apache Spark
Get to grips with Apache Spark
Implement machine learning algorithms such as classification or forecasting
About the Author
Hector Cuesta is founder and Chief Data Scientist at Dataxios, a machine intelligence research company. He holds a BA in Informatics and an M.Sc. in Computer Science. He provides consulting services for data-driven product design, with experience in a variety of industries including financial services, retail, fintech, e-learning, and human resources. He is a robotics enthusiast in his spare time.
Real-World Machine Learning is a practical guide designed to teach working developers the art of ML project execution. Without overdosing you on academic theory and complex mathematics, it introduces the day-to-day practice of machine learning, preparing you to successfully build and deploy powerful ML systems.
About the Technology
Machine learning systems help you find valuable insights and patterns in data, which you'd never recognize with traditional methods. In the real world, ML techniques give you a way to identify trends, forecast behavior, and make fact-based recommendations. It's a hot and growing field, and up-to-speed ML developers are in demand.
About the Book
Real-World Machine Learning will teach you the concepts and techniques you need to be a successful machine learning practitioner without overdosing you on abstract theory and complex mathematics. By working through immediately relevant examples in Python, you'll build skills in data acquisition and modeling, classification, and regression. You'll also explore the most important tasks like model validation, optimization, scalability, and real-time streaming. When you're done, you'll be ready to successfully build, deploy, and maintain your own powerful ML systems.
* Predicting future behavior
* Performance evaluation and optimization
* Analyzing sentiment and making recommendations
About the Reader
No prior machine learning experience assumed. Readers should know Python.
About the Authors
Henrik Brink, Joseph Richards and Mark Fetherolf are experienced data scientists engaged in the daily practice of machine learning.
Table of Contents
PART 1: THE MACHINE-LEARNING WORKFLOW
1. What is machine learning?
2. Real-world data
3. Modeling and prediction
4. Model evaluation and optimization
5. Basic feature engineering
PART 2: PRACTICAL APPLICATION
6. Example: NYC taxi data
7. Advanced feature engineering
8. Advanced NLP example: movie review sentiment
9. Scaling machine-learning workflows
10. Example: digital display advertising
Analyze your data and delve deep into the world of machine learning with the latest Spark version, 2.0
About This Book
Perform data analysis and build predictive models on huge datasets that leverage Apache Spark
Learn to integrate data science algorithms and techniques with the fast and scalable computing features of Spark to address big data challenges
Work through practical examples on real-world problems with sample code snippets
Who This Book Is For
This book is for anyone who wants to leverage Apache Spark for data science and machine learning. If you are a technologist who wants to expand your knowledge to perform data science operations in Spark, or a data scientist who wants to understand how algorithms are implemented in Spark, or a newbie with minimal development experience who wants to learn about Big Data Analytics, this book is for you!
What You Will Learn
Consolidate, clean, and transform your data acquired from various data sources
Perform statistical analysis of data to find hidden insights
Explore graphical techniques to see what your data looks like
Use machine learning techniques to build predictive models
Build scalable data products and solutions
Start programming using the RDD, DataFrame and Dataset APIs
Become an expert by improving your data analytical skills
This is the era of Big Data. The term ‘Big Data’ implies big innovation and enables a competitive advantage for businesses. Apache Spark was designed to perform Big Data analytics at scale, and so Spark is equipped with the necessary algorithms and supports multiple programming languages. Whether you are a technologist, a data scientist, or a beginner to Big Data analytics, this book will provide you with all the skills necessary to perform statistical data analysis, data visualization, and predictive modeling, and to build scalable data products or solutions using Python, Scala, and R.
Get more from your data through creating practical machine learning systems with Python
Using machine learning to gain deeper insights from data is a key skill required by modern application developers and analysts alike. Python is a wonderful language for developing machine learning applications. As a dynamic language, it allows for fast exploration and experimentation. With its excellent collection of open source machine learning libraries, you can focus on the task at hand while being able to quickly try out many ideas.
This book shows you exactly how to find patterns in your raw data. You will start by brushing up on your Python machine learning knowledge and getting introduced to the libraries. You'll quickly get to grips with serious, real-world projects on datasets, using modeling and creating recommendation systems. Later on, the book covers advanced topics such as topic modeling, basket analysis, and cloud computing. These will extend your abilities and enable you to create large, complex systems.
With this book, you gain the tools and understanding required to build your own systems, tailored to solve your real-world data analysis problems.
In the past few years, the generation of data and our capability to store and process it have grown exponentially. There is a need for scalable analytics frameworks and people with the right skills to get the information needed from this Big Data. Apache Mahout is one of the first and most prominent Big Data machine learning platforms. It implements machine learning algorithms on top of distributed processing platforms such as Hadoop and Spark.
Starting with the basics of Mahout and machine learning, you will explore prominent algorithms and their implementation in Mahout development. You will learn about Mahout building blocks, addressing feature extraction, reduction, and the curse of dimensionality, and delving into classification use cases with the random forest and Naïve Bayes classifiers, as well as item- and user-based recommendation. You will then work with clustering in Mahout using the K-means algorithm and implement Mahout without MapReduce. Finish with a flourish by exploring end-to-end use cases on customer analytics and test analytics to gain real-life, practical know-how of analytics projects.
Who This Book Is For
If you are a Java developer and want to use Mahout and machine learning to solve Big Data Analytics use cases, then this book is for you. Familiarity with shell scripts is assumed but no prior experience is required.
Essential Techniques for Predictive Analysis
Learn a simpler and more effective way to analyze data and predict outcomes with Python
Machine Learning in Python shows you how to successfully analyze data using only two core machine learning algorithms, and how to apply them using Python. By focusing on two algorithm families that effectively predict outcomes, this book is able to provide full descriptions of the mechanisms at work, and the examples that illustrate the machinery with specific, hackable code. The algorithms are explained in simple terms with no complex math and applied using Python, with guidance on algorithm selection, data preparation, and using the trained models in practice. You will learn a core set of Python programming techniques, various methods of building predictive models, and how to measure the performance of each model to ensure that the right one is used. The chapters on penalized linear regression and ensemble methods dive deep into each of the algorithms, and you can use the sample code in the book to develop your own data analysis solutions.
Machine learning algorithms are at the core of data analytics and visualization. In the past, these methods required a deep background in math and statistics, often in combination with the specialized R programming language. This book demonstrates how machine learning can be implemented using the more widely used and accessible Python programming language.
Explore over 110 recipes to analyze data and build predictive models with simple and easy-to-use R code
The R language is a powerful open source functional programming language. At its core, R is a statistical programming language that provides impressive tools to analyze data and create high-level graphics.
This book covers the basics of R by setting up a user-friendly programming environment and performing data ETL in R. Data exploration examples are provided that demonstrate how powerful data visualization and machine learning are in discovering hidden relationships. You will then dive into important machine learning topics, including data classification, regression, clustering, association rule mining, and dimension reduction.
Who This Book Is For
If you want to learn how to use R for machine learning and gain insights from your data, then this book is ideal for you. Regardless of your level of experience, this book covers the basics of applying R to machine learning through to advanced techniques. While it is helpful if you are familiar with basic programming or machine learning concepts, you do not require prior experience to benefit from this book.
Successfully leverage advanced machine learning techniques using the Clojure ecosystem
Clojure for Machine Learning is an introduction to machine learning techniques and algorithms. This book demonstrates how you can apply these techniques to real-world problems using the Clojure programming language.
It explores many machine learning techniques and also describes how to use Clojure to build machine learning systems. This book starts off by introducing the simple machine learning problems of regression and classification. It also describes how you can implement these machine learning techniques in Clojure. The book also demonstrates several Clojure libraries, which can be useful in solving machine learning problems.
Create scalable machine learning applications to power a modern data-driven business using Spark
Apache Spark is a framework for distributed computing that is designed from the ground up to be optimized for low-latency tasks and in-memory data storage. It is one of the few frameworks for parallel computing that combines speed, scalability, in-memory processing, and fault tolerance with ease of programming and a flexible, expressive, and powerful API design.
This book guides you through the basics of Spark's API used to load and process data and prepare the data to use as input to the various machine learning models. There are detailed examples and real-world use cases for you to explore common machine learning models including recommender systems, classification, regression, clustering, and dimensionality reduction. You will cover advanced topics such as working with large-scale text data, and methods for online machine learning and model evaluation using Spark Streaming.