IT eBooks
Download, Read, Use
Kafka: The Definitive Guide

Every enterprise application creates data, whether it's log messages, metrics, user activity, outgoing messages, or something else. And how to move all of this data becomes nearly as important as the data itself. If you're an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds. Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you'll learn Kafka's design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer. Understand publish-subscribe messaging and how it fits in the big data ecosystem; Explore Kafka producers and consumers for wri ...
Mastering Azure Analytics

Microsoft Azure has over 20 platform-as-a-service (PaaS) offerings that can act in support of a big data analytics solution. So which one is right for your project? This practical book helps you understand the breadth of Azure services by organizing them into a reference framework you can use when crafting your own big data analytics solution. You'll not only be able to determine which service best fits the job, but also learn how to implement a complete solution that scales, provides human fault tolerance, and supports future needs. Understand the fundamental patterns of the data lake and lambda architecture; Recognize the canonical steps in the analytics data pipeline and learn how to use Azure Data Factory to orchestrate them; Implement data lakes and lambda architectures, using Azure Data Lake Store, Data Lake Analytics, HDInsight (including Spark), Stream Analytics, SQL Data Warehouse, and Event Hubs; Understand where Azure Machine Learning fits i ...
Streaming Systems

Streaming data is a big deal in big data these days, and for good reason. Businesses crave ever more timely data, and streaming is a good way to achieve lower latency. Plus, streaming is a much easier way to tame the massive, unbounded data sets that are increasingly common today. Expanded from co-author Tyler Akidau's popular series of blog posts "Streaming 101" and "Streaming 102", this practical book shows data engineers, data scientists, and developers how to work with streaming or event-time data in a conceptual and platform-agnostic way. You'll go from "101"-level understanding of stream processing to a nuanced grasp of the what, where, when, and how of processing real-time data streams. Dive deep into topics including watermarks and windowing, as well as state and timers in the context of stream processing. Although the book uses Apache Beam code snippets to make examples concrete, it presents a general and broad explanation of streaming that's not tied to a specific frame ...
Learning Apache Drill

Apache Drill enables interactive analysis of massively large datasets, allowing you to execute SQL queries against data in many different data sources - including Hadoop and MongoDB clusters, HBase, or even your local file system - and get results quickly. With this practical guide, analysts and data scientists focused on business or research applications will learn how to incorporate Drill capabilities into complex programs, including how to use Drill queries to replace some MapReduce operations in a large-scale program. Drill committers Charles Givre and Paul Rogers provide an introduction to Drill and its ability to handle large files containing data in flexible formats with nested data structures and tables. You'll discover how this capability fills a gap in the Hadoop ecosystem. Additional topics show you how to: Prepare and organize data to maximize Drill performance; Set expectations for Drill performance on different data types and volumes; Reconcil ...
Think Data Structures

If you're a student studying computer science or a software developer preparing for technical interviews, this practical book will help you learn and review some of the most important ideas in software engineering - data structures and algorithms - in a way that's clearer, more concise, and more engaging than other materials. By emphasizing practical knowledge and skills over theory, author Allen Downey shows you how to use data structures to implement efficient algorithms, and then analyze and measure their performance. You'll explore the important classes in the Java collections framework (JCF), how they're implemented, and how they're expected to perform. Each chapter presents hands-on exercises supported by test code online. Use data structures such as lists and maps, and understand how they work; Build an application that reads Wikipedia pages, parses the contents, and navigates the resulting data tree; Analyze code to predict how fast it will run and how much memory it will ...
Learning ELK Stack

The ELK stack (Elasticsearch, Logstash, and Kibana) is a powerful combination of open source tools. Elasticsearch is for deep search and data analytics. Logstash is for centralized logging, log enrichment, and parsing. Kibana is for powerful and beautiful data visualizations. In short, the Elasticsearch ELK stack makes searching and analyzing data easier than ever before. This book will introduce you to the ELK (Elasticsearch, Logstash, and Kibana) stack, starting by showing you how to set up the stack by installing the tools and performing basic configuration. You'll move on to building a basic data pipeline using the ELK stack. Next, you'll explore the key features of Logstash and its role in the ELK stack, including creating Logstash plugins, which will enable you to use your own customized plugins. The importance of Elasticsearch and Kibana in the ELK stack is also covered, along with various types of advanced data analysis, and a variety of charts, tables, and maps. Finally, by th ...
Rails, Angular, Postgres, and Bootstrap

As a Rails developer, you care about user experience and performance, but you also want simple and maintainable code. Achieve all that by embracing the full stack of web development, from styling with Bootstrap, building an interactive user interface with AngularJS, to storing data quickly and reliably in PostgreSQL. Take a holistic view of full-stack development to create usable, high-performing applications, and learn to use these technologies effectively in a Ruby on Rails environment. ...
Implementing Cloud Design Patterns for AWS

Whether you are just getting your feet wet in cloud infrastructure or already creating complex systems, this book aims to describe patterns that can be used to fit your system needs. The initial patterns cover some basic processes, such as maintaining and storing backups and handling redundancy. The book then takes you through patterns of high availability. Following this, it discusses patterns for processing static and dynamic data and patterns for uploading data. The book then dives into patterns for databases and data processing. In the final leg of your journey, you will get to grips with advanced patterns on Operations and Networking and also get acquainted with Throw-away Environments. ...
Ruby Data Processing

Gain the basics of Ruby's map, reduce, and select functions and discover how to use them to solve data-processing problems. This compact hands-on book explains how you can encode certain complex programs in 10 lines of Ruby code, an astonishingly small number. You will walk through problems and solutions which are effective because they use map, reduce, and select. As you read Ruby Data Processing, type in the code, run the code, and ponder the results. Tweak the code to test the code and see how the results change. After reading this book, you will have a deeper understanding of how to break data-processing problems into processing stages, each of which is understandable, debuggable, and composable, and how to combine the stages to solve your data-processing problem. As a result, your Ruby coding will become more efficient and your programs will be more elegant and robust. Discover Ruby data processing and how to do it using the map, reduce, and select functions; Develop compl ...
Learning Predictive Analytics with R

R is statistical software that is used for data analysis. There are two main types of learning from data: unsupervised learning, where the structure of data is extracted automatically; and supervised learning, where a labeled part of the data is used to learn the relationship or scores in a target attribute. As important information is often hidden in a lot of data, R helps to extract that information with its many standard and cutting-edge statistical functions. This book is packed with easy-to-follow guidelines that explain the workings of the many key data mining tools of R, which are used to discover knowledge from your data. ...
Reproduction of site books is authorized only for informative purposes and strictly for personal, private use.
IT eBooks Group © 2011-2025