Designing and Operating Resilient Database Systems
If you’re an IT professional looking to broaden your knowledge of database administration, this practical book takes you through each component of site reliability and operations within the context of database engines. IT staffers with minimal database operations experience can use this knowledge as a foundation of the architecture and operations within a specific database.
This book uses open-source engines such as MySQL, PostgreSQL, MongoDB, and Cassandra as examples throughout.
Combine the incredible powers of Spark, Mesos, Akka, Cassandra, and Kafka to build data processing platforms that can take on even the hardest of your data troubles!
* This highly practical guide shows you how to use the best of the big data technologies to solve your response-critical problems
* Learn the art of making cheap-yet-effective big data architecture without using complex Greek-letter architectures
* Use this easy-to-follow guide to build fast data processing systems for your organization
SMACK is an open source full stack for big data architecture. It is a combination of Spark, Mesos, Akka, Cassandra, and Kafka. This stack is the newest technique to tackle critical real-time analytics for big data. This highly practical guide will teach you how to integrate these technologies to create a highly efficient data analysis system for fast data processing.
We'll start off with an introduction to SMACK and show you when to use it. First you'll get to grips with functional thinking and problem solving using Scala. Next you'll come to understand the Akka architecture. Then you'll get to know how to improve the data structure architecture and optimize resources using Apache Spark.
Moving forward, you'll learn how to achieve linear scalability in databases with Apache Cassandra. You'll get to grips with high-throughput distributed messaging using Apache Kafka. We'll show you how to build a cheap but effective cluster infrastructure with Apache Mesos. Finally, you will dive deep into the different aspects of SMACK and get the chance to practice them through a few case studies.
By the end of the book, you will be able to integrate all the components of the SMACK stack and use them together to achieve highly effective and fast data processing.
WHAT YOU WILL LEARN
* Build an affordable yet powerful cluster infrastructure
* Make queries, reports, and graphs based on your business' demands
* Manage and exploit unstructured and NoSQL data sources
* Use tools to monitor the performance of your architecture
* Integrate all the technologies and decide which is better at replacing or reinforcing the others
A Step-by-Step Guide
Leverage the power of visualization in business intelligence and data science to make quicker and better decisions. Use statistics and data mining to make compelling and interactive dashboards. This book will help those familiar with Tableau software chart their journey to being a visualization expert.
Pro Tableau demonstrates the power of visual analytics and teaches you how to:
* Connect to various data sources such as spreadsheets, text files, relational databases (Microsoft SQL Server, MySQL, etc.), non-relational databases (NoSQL such as MongoDB, Cassandra), R data files, etc.
* Write your own custom SQL, etc.
* Perform statistical analysis in Tableau using R
* Use a multitude of charts (pie, bar, stacked bar, line, scatter plots, dual axis, histograms, heat maps, tree maps, highlight tables, box and whisker, etc.)
What you’ll learn
* Connect to various data sources such as relational databases (Microsoft SQL Server, MySQL), non-relational databases (NoSQL such as MongoDB, Cassandra), write your own custom SQL, join and blend data sources, etc.
* Leverage table calculations (moving average, year-over-year growth, LOD (Level of Detail), etc.)
* Integrate Tableau with R
* Tell a compelling story with data by creating highly interactive dashboards
Who this book is for
All levels of IT professionals, from executives responsible for determining IT strategies to systems administrators, to data analysts, to decision makers responsible for driving strategic initiatives, etc. The book will help those familiar with Tableau software chart their journey to being a visualization expert.
Achieve scalability and high availability without compromising on performance
* See how to get 100 percent uptime with your Cassandra applications using this easy-to-follow guide
* Learn how to avoid common and not-so-common mistakes while working with Cassandra using this highly practical guide
* Get familiar with the intricacies of working with Cassandra for high availability in your work environment with this go-to guide
Apache Cassandra is a massively scalable, peer-to-peer database designed for 100 percent uptime, with deployments in the tens of thousands of nodes, all supporting petabytes of data. This book offers a practical insight into building highly available, real-world applications using Apache Cassandra.
The book starts with the fundamentals, helping you to understand how Apache Cassandra’s architecture allows it to achieve 100 percent uptime when other systems struggle to do so. You’ll get an excellent understanding of data distribution, replication, and Cassandra’s highly tunable consistency model. Then we take an in-depth look at Cassandra's robust support for multiple data centers, and you’ll see how to scale out a cluster. Next, the book explores the domain of application design, with chapters discussing the native driver and data modeling. Lastly, you’ll find out how to steer clear of common anti-patterns and take advantage of Cassandra’s ability to fail gracefully.
WHAT YOU WILL LEARN
* Understand how the core architecture of Cassandra enables highly available applications
* Use replication and tunable consistency levels to balance consistency, availability, and performance
* Set up multiple data centers to enable failover, load balancing, and geographic distribution
* Add capacity to your cluster with zero downtime
* Take advantage of high availability features in the native driver
* Create data models that scale well and maximize availability
* Understand common anti-patterns so you can avoid them
* Keep your system working well even during failure scenarios
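As a rough illustration of the tunable-consistency bullet above (a sketch that is not taken from the book): in Cassandra, a read is generally guaranteed to see the latest acknowledged write when the number of replicas that acknowledge the write plus the number consulted on the read exceeds the replication factor. The function names below are hypothetical helpers, not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for a QUORUM operation: a strict majority of RF."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when the read and write replica sets must overlap (W + R > RF)."""
    return write_replicas + read_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # QUORUM writes with QUORUM reads overlap on at least one replica.
    print(q, reads_see_latest_write(rf, q, q))
    # ONE/ONE favors latency and availability, but a read may miss the latest write.
    print(reads_see_latest_write(rf, 1, 1))
```

Lower consistency levels need fewer replicas to respond, which favors availability and latency; higher levels trade some of that for stronger read guarantees.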
ABOUT THE AUTHOR
Robbie Strickland has been involved in the Apache Cassandra project since 2010, and he initially went to production with the 0.5 release. He has made numerous contributions over the years, including work on drivers for C# and Scala and multiple contributions to the core Cassandra codebase. In 2013 he became the very first certified Cassandra developer, and in 2014 DataStax selected him as an Apache Cassandra MVP.
Robbie has been an active speaker and writer in the Cassandra community and is the founder of the Atlanta Cassandra Users Group. Other examples of his writing can be found on the DataStax blog, and he has presented numerous webinars and conference talks over the years.
TABLE OF CONTENTS
1. Cassandra's Approach to High Availability
2. Data Distribution
4. Data Centers
5. Scaling Out
6. High Availability Features in the Native Java Client
7. Modeling for Availability
9. Failing Gracefully
Imagine what you could do if scalability wasn't a problem. With this hands-on guide, you’ll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This expanded second edition, updated for Cassandra 3.0, provides the technical details and practical examples you need to put this database to work in a production environment.
Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra’s non-relational design, with special attention to data modeling. If you’re a developer, DBA, or application architect looking to solve a database scaling issue or future-proof your application, this guide helps you harness Cassandra’s speed and flexibility.
* Understand Cassandra’s distributed and decentralized structure
* Use the Cassandra Query Language (CQL) and cqlsh—the CQL shell
* Create a working data model and compare it with an equivalent relational model
* Develop sample applications using client drivers for languages including Java, Python, and Node.js
* Explore cluster topology and learn how nodes exchange data
* Maintain a high level of performance in your cluster
* Deploy Cassandra on site, in the Cloud, or with Docker
* Integrate Cassandra with Spark, Hadoop, Elasticsearch, Solr, and Lucene
Beginning Apache Cassandra Development introduces you to one of the most robust and best-performing NoSQL database platforms on the planet. Apache Cassandra is a distributed wide-column store. It is specifically designed to manage large amounts of data across many commodity servers without there being any single point of failure. This design approach makes Apache Cassandra a robust and easy-to-implement platform when high availability is needed.
Design, build, and analyze your data intricately using Cassandra
Starting with a quick introduction to Cassandra, this book flows through various aspects such as fundamental data modeling approaches, selection of data types, designing a data model, and choosing suitable keys and indexes, through to a real-world application, all the while applying the best practices covered in this book.
Although the application is small, you will be involved in the full development life cycle. You will go through the design considerations of coming up with a flexible and sustainable data model for a stock market technical-analysis application written in Python. As the business changes continually, so does the data model, so you will also learn techniques for evolving a data model to address new business requirements. Running a web-scale Cassandra cluster requires many careful considerations such as evolving a data model, performance tuning, and system monitoring. This book is an invaluable tutorial for anyone who wants to adopt Cassandra.
Understand and apply Cassandra design and usage patterns, and solve real-world business or technical problems
Cassandra is a powerful data store solution in the open source NoSQL world. The ability to use its vast capabilities correctly is the need of the hour as more developers start using this powerful tool. Hence, it becomes important to be able to understand how and where to apply Cassandra correctly.
This practical guide will help you understand the strengths and weaknesses of Cassandra and teach you how to identify the business and technical use cases that Cassandra solves. You will also learn how to solve real-world business problems and use Cassandra in the best possible way.
Harness the power of Apache Cassandra to build scalable, fault-tolerant, and readily available applications
Apache Cassandra is a massively scalable, peer-to-peer database designed for 100 percent uptime, with deployments in the tens of thousands of nodes supporting petabytes of data.
This book offers readers a practical insight into building highly available, real-world applications using Apache Cassandra. The book starts with the fundamentals, helping you to understand how the architecture of Apache Cassandra allows it to achieve 100 percent uptime when other systems struggle to do so. You'll have an excellent understanding of data distribution, replication, and Cassandra's highly tunable consistency model. This is followed by an in-depth look at Cassandra's robust support for multiple data centers, and how to scale out a cluster. Next, the book explores the domain of application design, with chapters discussing the native driver and data modeling. Lastly, you'll find out how to steer clear of common antipatterns and take advantage of Cassandra's ability to fail gracefully.
What could you do with data if scalability wasn't a problem? With this hands-on guide, you'll learn how Apache Cassandra handles hundreds of terabytes of data while remaining highly available across multiple data centers - capabilities that have attracted Facebook, Twitter, and other data-intensive companies. Cassandra: The Definitive Guide provides the technical details and practical examples you need to assess this database management system and put it to work in a production environment.
Author Eben Hewitt demonstrates the advantages of Cassandra's nonrelational design, and pays special attention to data modeling. If you're a developer, DBA, application architect, or manager looking to solve a database scaling issue or future-proof your application, this guide shows you how to harness Cassandra's speed and flexibility.