Advanced Analytics with PySpark
The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics problems using PySpark, Spark's Python API, and other best practices in Spark programming.
Data scientists Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills offer an introduction to the Spark ecosystem, then dive into patterns that apply common techniques, including classification, clustering, collaborative filtering, and anomaly detection, to fields such as genomics, security, and finance. This updated edition also covers NLP and image processing.
If you have a basic understanding of machine learning and statistics and you program in Python, this book will get you started with large-scale data analysis.
Familiarize your ...
Adaptive Machine Learning Algorithms with Python
Learn to use adaptive algorithms to solve real-world streaming data problems. This book covers a multitude of data processing challenges, ranging from the simple to the complex. At each step, you will gain insight into real-world use cases, find solutions, explore code used to solve these problems, and create new algorithms for your own use.
Authors Chanchal Chatterjee and Vwani P. Roychowdhury begin by introducing a common framework for creating adaptive algorithms and demonstrating how to use it to address various streaming data issues. Examples range from using matrix functions to solve machine learning and data analysis problems to more critical edge computation problems. The algorithms handle time-varying, non-stationary data with minimal compute, memory, latency, and bandwidth.
Upon finishing this book, you will have a solid understanding of how to solve adaptive machine learning and data analytics problems and be able to derive new algorithms for your own use cases. You will ...
Analytics Optimization with Columnstore Indexes in Microsoft SQL Server
Meet the challenge of storing and accessing analytic data in SQL Server quickly and efficiently. This book illustrates how columnstore indexes can provide an ideal solution for storing analytic data, leading to faster analytic queries and the ability to ask and answer business intelligence questions with alacrity. The book provides a complete walkthrough of columnstore indexing that encompasses an introduction, best practices, hands-on demonstrations, explanations of common mistakes, and a detailed architecture that is suitable for professionals of all skill levels.
With little or no knowledge of columnstore indexing, you can become proficient with columnstore indexes as used in SQL Server and apply that knowledge in development, test, and production environments. This book serves as a comprehensive guide to the use of columnstore indexes and provides definitive guidelines. You will learn when columnstore indexes should be used, and the performance ga ...
Data Engineering with Google Cloud Platform
With this book, you'll understand how the highly scalable Google Cloud Platform (GCP) enables data engineers to create end-to-end data pipelines, from storing and processing data and orchestrating workflows to presenting data through visualization dashboards.
Starting with a quick overview of the fundamental concepts of data engineering, you'll learn the various responsibilities of a data engineer and how GCP plays a vital role in fulfilling those responsibilities. As you progress through the chapters, you'll be able to leverage GCP products to build a sample data warehouse using Cloud Storage and BigQuery and a data lake using Dataproc. The book gradually takes you through operations such as data ingestion, data cleansing, transformation, and integrating data with other sources. You'll learn how to design IAM for data governance, deploy ML pipelines with Vertex AI, leverage pre-built GCP models as a service, and visualize data with Google Data Studio to build compelling rep ...
Extreme DAX
This book helps business analysts generate powerful and sophisticated analyses from their data using DAX and get the most out of Microsoft Business Intelligence tools.
Extreme DAX will first teach you the principles of business intelligence, good model design, and how DAX fits into it all. Then, you'll launch into detailed examples of DAX in real-world business scenarios such as inventory calculations, forecasting, intercompany business, and data security. At each step, senior DAX experts will walk you through the subtleties involved in working with Power BI models and common mistakes to look out for as you build advanced data aggregations.
You'll deepen your understanding of DAX functions, filters, and measures, and how and when they can be used to derive effective insights. You'll also be provided with PBIX files for each chapter, so that you can follow along and explore in your own time. ...
Web App Development and Real-Time Web Analytics with Python
Learn to develop and deploy dashboards as web apps using the Python programming language, and how to integrate algorithms into web apps.
Author Tshepo Chris Nokeri begins by introducing you to the basics of constructing and styling static and interactive charts and tables before exploring the basics of HTML, CSS, and Bootstrap, including an approach to building web pages with HTML. From there, he'll show you the key Python web frameworks and techniques for building web apps with them. You'll then see how to style web apps and incorporate themes, including interactive charts and tables to build dashboards, followed by a walkthrough of creating URL routes and securing web apps. You'll then progress to more advanced topics, like building machine learning algorithms and integrating them into a web app. The book concludes with a demonstration of how to deploy web apps on widely used cloud platforms.
Web App Development and Real-Time Web Analytics with Python is ideal for intermed ...
Practical Fraud Prevention
Over the past two decades, the booming ecommerce and fintech industries have become a breeding ground for fraud. Organizations that conduct business online are constantly engaged in a cat-and-mouse game with these invaders. In this practical book, Gilit Saporta and Shoshana Maraney draw on their fraud-fighting experience to provide best practices, methodologies, and tools to help you detect and prevent fraud and other malicious activities.
Data scientists, data analysts, and fraud analysts will learn how to identify and quickly respond to attacks. You'll get a comprehensive view of typical incursions as well as recommended detection methods. Online fraud is constantly evolving. This book helps experienced researchers safely guide and protect their organizations in this ever-changing fraud landscape.
With this book, you will: Examine current fraud attacks and learn how to mitigate them; Find the right balance between preventing fraud and providing a smooth customer experience; Sha ...
If you want to increase Tableau's value to your organization, this practical book has your back. Authors Ann Jackson and Luke Stanke guide data analysts through strategies for solving real-world analytics problems using Tableau. Starting with the basics and building toward advanced topics such as multidimensional analysis and user experience, you'll explore pragmatic and creative examples that you can apply to your own data.
Staying competitive today requires the ability to quickly analyze and visualize data and make data-driven decisions. With this guide, data practitioners and leaders alike will learn strategies for building compelling and purposeful visualizations, dashboards, and data products. Every chapter contains the why behind the solution and the technical knowledge you need to make it work.
Use this book as a high-value on-the-job reference guide to Tableau; Visualize different data types and tackle specific data challenges; Create compelling data visualizations ...
Advanced Analytics with Transact-SQL
Learn about business intelligence (BI) features in T-SQL and how they can help you with data science and analytics efforts without the need to bring in other languages such as R and Python. This book shows you how to compute statistical measures using your existing skills in T-SQL. You will learn how to calculate descriptive statistics, including centers, spreads, skewness, and kurtosis of distributions. You will also learn to find associations between pairs of variables, including calculating linear regression formulas and confidence levels with definite integration.
No analysis is good without data quality. Advanced Analytics with Transact-SQL introduces data quality issues and shows you how to check for completeness and accuracy, and measure improvements in data quality over time. The book also explains how to optimize queries involving temporal data, such as when you search for overlapping intervals. More advanced time-oriented information in the book includes haza ...
Mastering Tableau 2021, 3rd Edition
Tableau is one of the leading business intelligence (BI) tools used to solve data analysis challenges. With this book, you will master Tableau's features and offerings in various paradigms of the BI domain.
Updated with fresh topics including Quick Level of Detail expressions, the newest Tableau Server features, Einstein Discovery, and more, this book covers essential Tableau concepts and advanced functionalities. Leveraging Tableau Hyper files and using Prep Builder, you'll be able to perform data preparation and handling easily. You'll gear up to perform complex joins, spatial joins, unions, and data blending tasks using practical examples. Following this, you'll learn how to execute data densification and further explore expert-level examples to help you with calculations, mapping, and visual design using Tableau extensions. You'll also learn about improving dashboard performance, connecting to Tableau Server, and understanding data visualization with examples. Finally, you'll cov ...
Advancing into Analytics
Data analytics may seem daunting, but if you're an experienced Excel user, you have a unique head start. With this hands-on guide, intermediate Excel users will gain a solid understanding of analytics and the data stack. By the time you complete this book, you'll be able to conduct exploratory data analysis and hypothesis testing using a programming language.
Exploring and testing relationships are core to analytics. By using the tools and frameworks in this book, you'll be well positioned to continue learning more advanced data analysis techniques. Author George Mount, founder and CEO of Stringfest Analytics, demonstrates key statistical concepts with spreadsheets, then pivots your existing knowledge about data manipulation into R and Python programming.
This practical book guides you through:
- Foundations of analytics in Excel: Use Excel to test relationships between variables and build compelling demonstrations of important concepts in statistics a ...