In-Memory Analytics with Apache Arrow
Apache Arrow is designed to accelerate analytics and to make it easy to exchange data across big data systems.
In-Memory Analytics with Apache Arrow begins with a quick overview of the Apache Arrow format before helping you understand Arrow's versatility and benefits as you walk through a variety of real-world use cases. You'll cover key tasks such as enhancing data science workflows with Arrow, using Arrow and Apache Parquet with Apache Spark and Jupyter for better performance and hassle-free data translation, as well as working with Perspective, an open source interactive graphical and tabular analysis tool for browsers. As you advance, you'll explore the different data interchange and storage formats and become well versed in the relationships between Arrow, Parquet, Feather, Protobuf, FlatBuffers, JSON, and CSV. In addition to understanding the basic structure of the Arrow Flight and Flight SQL protocols, you'll learn about Dremio's usage of Apa ...
Even You Can Learn Statistics and Analytics, 4th Edition
This book discusses statistics and analytics in plain language, avoiding mathematical jargon. If you thought you couldn't learn these data analysis subjects because they were too technical or too mathematical, this book is for you!
This edition delivers more everyday examples and end-of-chapter exercises and contains updated instructions for using Microsoft Excel. You'll use downloadable data sets and spreadsheet solutions, template-based solutions you can put right to work. Using this book, you will understand the important concepts of statistics and analytics, including the basic vocabulary of these subjects.
Create tabular and visual summaries and learn to avoid common charting errors; Gain experience working with common descriptive statistics measures, including the mean, median, mode, standard deviation, and variance, among others; Understand the probability concepts that underlie inferential statistics; Learn how to apply hypothesis tests, us ...
Advanced Analytics and Deep Learning Models
The book provides readers with an in-depth understanding of concepts and technologies related to the importance of analytics and deep learning in many useful real-world applications such as e-healthcare, transportation, agriculture, stock market, etc.
Advanced analytics is an approach to data analysis that combines machine learning, artificial intelligence, graphs, text mining, data mining, and semantic analysis. Going beyond traditional business intelligence, it is the semi-autonomous or autonomous analysis of data using a variety of techniques and tools.
Deep learning and data analysis are both central to data science. Almost all private and public organizations collect large amounts of domain-specific data. Many companies, small and large, are exploring large amounts of data for existing and future technology. Deep learning also explores large amounts of unsupervised data, which makes it beneficial and effective for big data. Deep learning can be used t ...
Advanced Analytics with PySpark
The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics problems using PySpark, Spark's Python API, and other best practices in Spark programming.
Data scientists Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills offer an introduction to the Spark ecosystem, then dive into patterns that apply common techniques, including classification, clustering, collaborative filtering, and anomaly detection, to fields such as genomics, security, and finance. This updated edition also covers NLP and image processing.
If you have a basic understanding of machine learning and statistics and you program in Python, this book will get you started with large-scale data analysis.
Familiarize your ...
Adaptive Machine Learning Algorithms with Python
Learn to use adaptive algorithms to solve real-world streaming data problems. This book covers a multitude of data processing challenges, ranging from the simple to the complex. At each step, you will gain insight into real-world use cases, find solutions, explore code used to solve these problems, and create new algorithms for your own use.
Authors Chanchal Chatterjee and Vwani P. Roychowdhury begin by introducing a common framework for creating adaptive algorithms, and demonstrating how to use it to address various streaming data issues. Examples range from using matrix functions to solve machine learning and data analysis problems to more critical edge computation problems. They handle time-varying, non-stationary data with minimal compute, memory, latency, and bandwidth.
Upon finishing this book, you will have a solid understanding of how to solve adaptive machine learning and data analytics problems and be able to derive new algorithms for your own use cases. You will ...
Analytics Optimization with Columnstore Indexes in Microsoft SQL Server
Meet the challenge of storing and accessing analytic data in SQL Server in a fast and performant manner. This book illustrates how columnstore indexes can provide an ideal solution for storing analytic data, one that leads to faster analytic queries and the ability to ask and answer business intelligence questions with alacrity. The book provides a complete walkthrough of columnstore indexing that encompasses an introduction, best practices, hands-on demonstrations, explanations of common mistakes, and a detailed architecture, and it is suitable for professionals of all skill levels.
With little or no knowledge of columnstore indexing, you can become proficient with columnstore indexes as used in SQL Server and apply that knowledge in development, test, and production environments. This book serves as a comprehensive guide to the use of columnstore indexes and provides definitive guidelines. You will learn when columnstore indexes should be used, and the performance ga ...
Data Engineering with Google Cloud Platform
With this book, you'll understand how the highly scalable Google Cloud Platform (GCP) enables data engineers to create end-to-end data pipelines, from storing and processing data and orchestrating workflows to presenting data through visualization dashboards.
Starting with a quick overview of the fundamental concepts of data engineering, you'll learn the various responsibilities of a data engineer and how GCP plays a vital role in fulfilling those responsibilities. As you progress through the chapters, you'll be able to leverage GCP products to build a sample data warehouse using Cloud Storage and BigQuery and a data lake using Dataproc. The book gradually takes you through operations such as data ingestion, data cleansing, transformation, and integrating data with other sources. You'll learn how to design IAM for data governance, deploy ML pipelines with Vertex AI, leverage pre-built GCP models as a service, and visualize data with Google Data Studio to build compelling rep ...
This book helps business analysts generate powerful and sophisticated analyses from their data using DAX and get the most out of Microsoft Business Intelligence tools.
Extreme DAX will first teach you the principles of business intelligence, good model design, and how DAX fits into it all. Then, you'll launch into detailed examples of DAX in real-world business scenarios such as inventory calculations, forecasting, intercompany business, and data security. At each step, senior DAX experts will walk you through the subtleties involved in working with Power BI models and common mistakes to look out for as you build advanced data aggregations.
You'll deepen your understanding of DAX functions, filters, and measures, and how and when they can be used to derive effective insights. You'll also be provided with PBIX files for each chapter, so that you can follow along and explore in your own time. ...
Web App Development and Real-Time Web Analytics with Python
Learn to develop and deploy dashboards as web apps using the Python programming language, and how to integrate algorithms into web apps.
Author Tshepo Chris Nokeri begins by introducing you to the basics of constructing and styling static and interactive charts and tables before exploring the basics of HTML, CSS, and Bootstrap, including an approach to building web pages with HTML. From there, he'll show you the key Python web frameworks and techniques for building web apps with them. You'll then see how to style web apps and incorporate themes, including interactive charts and tables to build dashboards, followed by a walkthrough of creating URL routes and securing web apps. You'll then progress to more advanced topics, like building machine learning algorithms and integrating them into a web app. The book concludes with a demonstration of how to deploy web apps in prevalent cloud platforms.
Web App Development and Real-Time Web Analytics with Python is ideal for intermed ...
Practical Fraud Prevention
Over the past two decades, the booming ecommerce and fintech industries have become a breeding ground for fraud. Organizations that conduct business online are constantly engaged in a cat-and-mouse game with these invaders. In this practical book, Gilit Saporta and Shoshana Maraney draw on their fraud-fighting experience to provide best practices, methodologies, and tools to help you detect and prevent fraud and other malicious activities.
Data scientists, data analysts, and fraud analysts will learn how to identify and quickly respond to attacks. You'll get a comprehensive view of typical incursions as well as recommended detection methods. Online fraud is constantly evolving. This book helps experienced researchers safely guide and protect their organizations in this ever-changing fraud landscape.
With this book, you will: Examine current fraud attacks and learn how to mitigate them; Find the right balance between preventing fraud and providing a smooth customer experience; Sha ...
If you want to increase Tableau's value to your organization, this practical book has your back. Authors Ann Jackson and Luke Stanke guide data analysts through strategies for solving real-world analytics problems using Tableau. Starting with the basics and building toward advanced topics such as multidimensional analysis and user experience, you'll explore pragmatic and creative examples that you can apply to your own data.
Staying competitive today requires the ability to quickly analyze and visualize data and make data-driven decisions. With this guide, data practitioners and leaders alike will learn strategies for building compelling and purposeful visualizations, dashboards, and data products. Every chapter contains the why behind the solution and the technical knowledge you need to make it work.
Use this book as a high-value on-the-job reference guide to Tableau; Visualize different data types and tackle specific data challenges; Create compelling data visualizations ...