IT eBooks
Download, Read, Use

Analytics eBooks

Spark for Python Developers

Looking for a cluster computing system that provides high-level APIs? Apache Spark is your answer: an open source, fast, and general-purpose cluster computing system. Spark's multi-stage in-memory primitives provide performance up to 100 times faster than Hadoop, and it is also well suited to machine learning algorithms. Are you a Python developer inclined to work with the Spark engine? If so, this book will be your companion as you create data-intensive apps using Spark as a processing engine, Python visualization libraries, and web frameworks such as Flask. To begin with, you will learn the most effective way to install the Python development environment powered by Spark, Blaze, and Bokeh. You will then find out how to connect with data stores such as MySQL, MongoDB, Cassandra, and Hadoop. You'll expand your skills throughout, becoming familiar with the various data sources (GitHub, Twitter, Meetup, and blogs), their data structures, and solutions to effectively tackle complex ...
Practical Graph Analytics with Apache Giraph

Practical Graph Analytics with Apache Giraph helps you build data mining and machine learning applications using the Apache Foundation's Giraph framework for graph processing. This is the same framework as used by Facebook, Google, and other social media analytics operations to derive business value from vast amounts of interconnected data points. Graphs arise in a wealth of data scenarios and describe the connections that are naturally formed in both digital and real worlds. Examples of such connections abound in online social networks such as Facebook and Twitter, among users who rate movies from services like Netflix and Amazon Prime, and are useful even in the context of biological networks for scientific research. Whether in the context of business or science, viewing data as connected adds value by increasing the amount of information available to be drawn from that data and put to use in generating new revenue or scientific opportunities. Apache Giraph offers ...
Learning ELK Stack

The ELK stack (Elasticsearch, Logstash, and Kibana) is a powerful combination of open source tools. Elasticsearch is for deep search and data analytics. Logstash is for centralized logging, log enrichment, and parsing. Kibana is for powerful and beautiful data visualizations. In short, the ELK stack makes searching and analyzing data easier than ever before. This book will introduce you to the ELK stack, starting by showing you how to set it up by installing the tools and applying basic configuration. You'll move on to building a basic data pipeline using the ELK stack. Next, you'll explore the key features of Logstash and its role in the ELK stack, including how to create your own customized Logstash plugins. The importance of Elasticsearch and Kibana in the ELK stack is also covered, along with various types of advanced data analysis and a variety of charts, tables, and maps. Finally, by th ...
Fast Data Processing with Spark, 2nd Edition

Spark is a framework for writing fast, distributed programs. Spark solves problems similar to those addressed by Hadoop MapReduce, but with a fast in-memory approach and a clean functional-style API. With its ability to integrate with Hadoop and its built-in tools for interactive query analysis (Spark SQL), large-scale graph processing and analysis (GraphX), and real-time analysis (Spark Streaming), it can be used interactively to quickly process and query big datasets. Fast Data Processing with Spark, Second Edition covers how to write distributed programs with Spark. The book will guide you through every step required to write effective distributed programs, from setting up your cluster and interactively exploring the API to developing analytics applications and tuning them for your purposes. ...
Learning Predictive Analytics with R

R is statistical software used for data analysis. There are two main types of learning from data: unsupervised learning, where the structure of the data is extracted automatically, and supervised learning, where a labeled part of the data is used to learn the relationship to, or scores for, a target attribute. As important information is often hidden in large amounts of data, R helps to extract that information with its many standard and cutting-edge statistical functions. This book is packed with easy-to-follow guidelines that explain the workings of many of R's key data mining tools, which are used to discover knowledge from your data. ...
Pentaho Analytics for MongoDB Cookbook

MongoDB is an open source, schemaless NoSQL database system. Pentaho, a well-known open source analysis tool, provides high performance, high availability, and easy scalability for large sets of data. The features in Pentaho for MongoDB are designed to make organizations more agile and scalable, and to give applications better flexibility, faster performance, and lower costs. Whether you are brand new to online learning or a seasoned expert, this book will provide you with the skills you need to create turnkey analytic solutions that deliver insight and drive value for your organization. The book begins by taking you through Pentaho Data Integration and how it works with MongoDB. You will then be taken through the Kettle Thin JDBC Driver, which enables a Java application to interact with a database. This is followed by exploration of a MongoDB collection using Pentaho Instaview and creating reports with MongoDB as a datasource using Pentaho Rep ...
Practical Google Analytics and Google Tag Manager for Developers

Whether you're a marketer with development skills or a full-on web developer/analyst, Practical Google Analytics and Google Tag Manager for Developers shows you how to implement Google Analytics using Google Tag Manager to jumpstart your web analytics measurement. There's a reason that so many organizations use Google Analytics. Effective collection of data with Google Analytics can reduce customer acquisition costs, provide priceless feedback on new product initiatives, and offer insights that will grow a customer or client base. So where does Google Tag Manager fit in? Google Tag Manager allows for unprecedented collaboration between marketing and technical teams, lightning fast updates to your site, and standardization of the most common tags for on-site tracking and marketing efforts. To achieve the rich data you're really after to better serve your users' needs, you'll need the tools Google Tag Manager provides for a best-in-class implementation o ...
Real Time Analytics with SAP HANA

SAP HANA is an in-memory database created by SAP. SAP HANA breaks traditional database barriers to simplify IT landscapes, eliminating data preparation, pre-aggregation, and tuning. SAP HANA and in-memory computing allow you to instantly access huge volumes of structured and unstructured data, including text data, from different sources. Starting with data modeling, this fast-paced guide shows you how to add a system to SAP HANA Studio and create a schema, packages, and a delivery unit. Moving on, you'll get an understanding of real-time replication via SLT and learn how to perform it using SAP HANA Studio. We'll also have a quick look at SAP BusinessObjects Data Services and SAP Direct Extractor for data loads. After that, you will learn to create HANA artifacts: Analytical Privileges and Calculation Views. At the end of the book, we will explore the Smart Data Access option and the AFL library, and finally deliver pre-packaged functionality that can be used to build information models fas ...
Storm Blueprints: Patterns for Distributed Real-time Computation

Storm is the most popular framework for real-time stream processing. Storm provides the fundamental primitives and guarantees required for fault-tolerant distributed computing in high-volume, mission-critical applications. It is both an integration technology and a data flow and control mechanism, making it the core of many big data platforms. Storm is essential if you want to develop, deploy, and operate data processing flows capable of processing billions of transactions. Storm Blueprints: Patterns for Distributed Real-time Computation covers a broad range of distributed computing topics, including not only design and integration patterns, but also the domains and applications to which the technology is immediately useful and commonly applied. This book introduces you to Storm using real-world examples, beginning with simple Storm topologies. ...
Real-Time Analytics

Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics, expert Byron Ellis teaches data analysts the technologies needed to build an effective real-time analytics platform. This platform can then be used to make sense of the constantly changing data that is beginning to outpace traditional batch-based analysis platforms. The author is among a very few leading experts in the field. He has a prestigious background in research, development, analytics, real-time visualization, and Big Data streaming, and is uniquely qualified to help you explore this revolutionary field. Moving from a description of the overall architecture of real-time analytics to using specific tools to obtain targeted results, Real-Time Analytics leverages open source and modern commercial tools to construct robust, efficient systems that can provide real-time analysis in a cost-effective manner. ...
Reproduction of site books is authorized only for informative purposes and strictly for personal, private use.
IT eBooks Group © 2011-2026