IT eBooks
Download, Read, Use
Practical Python Data Wrangling and Data Quality

The world around us is full of data that holds unique insights and valuable stories, and this book will help you uncover them. Whether you already work with data or want to learn more about its possibilities, the examples and techniques in this practical book will help you more easily clean, evaluate, and analyze data so that you can generate meaningful insights and compelling visualizations. Complementing foundational concepts with expert advice, author Susan E. McGregor provides the resources you need to extract, evaluate, and analyze a wide variety of data sources and formats, along with the tools to communicate your findings effectively. This book delivers a methodical, jargon-free way for data practitioners at any level, from true novices to seasoned professionals, to harness the power of data. Use Python 3.8+ to read, write, and transform data from a variety of sources; Understand and use programming basics in Python to wrangle data at scale; Organize, document, and structu ...
Data Science on the Google Cloud Platform, 2nd Edition

Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud native tools on GCP. Throughout this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way. You'll learn how to: Employ best practices in building highly scalable data and ML pipelines on Google Cloud; Automate and schedule data ingest using Cloud Run; Create and populate a dashboard in Data Studio; Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery; Conduct interactive data exploration with BigQuery; Create a Bayesian model with Spark on Cloud Dataproc; Fore ...
Data Mesh

We're at an inflection point in data, where our data management solutions no longer match the complexity of organizations, the proliferation of data sources, and the scope of our aspirations to get value from data with AI and analytics. In this practical book, author Zhamak Dehghani introduces data mesh, a decentralized sociotechnical paradigm drawn from modern distributed architecture that provides a new approach to sourcing, sharing, accessing, and managing analytical data at scale. Dehghani guides practitioners, architects, technical leaders, and decision makers on their journey from traditional big data architecture to a distributed and multidimensional approach to analytical data management. Data mesh treats data as a product, considers domains as a primary concern, applies platform thinking to create self-serve data infrastructure, and introduces a federated computational model of data governance. Get a complete introduction to data mesh principles and its constituents; Des ...
Pro Data Visualization Using R and JavaScript, 2nd Edition

Use R 4, RStudio, Tidyverse, and Shiny to interrogate and analyze your data, and then use the D3 JavaScript library to format and display that data in an elegant, informative, and interactive way. You will learn how to gather data effectively, and also how to understand the philosophy and implementation of each type of chart, so as to be able to represent the results visually. With the popularity of the R language, the art and practice of creating data visualizations is no longer the preserve of mathematicians, statisticians, or cartographers. As technology leaders, we can gather metrics around what we do and use data visualizations to communicate that information. Pro Data Visualization Using R and JavaScript combines the power of the R language with the simplicity and familiarity of JavaScript to display clear and informative data visualizations. Gathering and analyzing empirical data is the key to truly understanding anything. We can track operational metrics to quantify the h ...
Pro Serverless Data Handling with Microsoft Azure

Design and build architectures on the Microsoft Azure platform specifically for data-driven and ETL applications. Modern cloud architectures rely on serverless components more than ever, and this book helps you identify those components of data-driven or ETL applications that can be tackled using the technologies available on the Azure platform. The book shows you which Azure components are best suited to form a strong foundation for data-driven applications in the Microsoft Azure Cloud. If you are a solution architect or a decision maker, the conceptual aspects of this book will help you gain a deeper understanding of the underlying technology and its capabilities. You will understand how to develop using Azure Functions, Azure Data Factory, and Logic Apps, and how to employ serverless databases in your application to achieve the best scalability and design. If you are a developer, you will benefit from the hands-on approach used throughout this book. Many practical examples and architectu ...
Data Engineering with Google Cloud Platform

With this book, you'll understand how the highly scalable Google Cloud Platform (GCP) enables data engineers to create end-to-end data pipelines, from storing and processing data and orchestrating workflows to presenting data through visualization dashboards. Starting with a quick overview of the fundamental concepts of data engineering, you'll learn the various responsibilities of a data engineer and how GCP plays a vital role in fulfilling those responsibilities. As you progress through the chapters, you'll be able to leverage GCP products to build a sample data warehouse using Cloud Storage and BigQuery and a data lake using Dataproc. The book gradually takes you through operations such as data ingestion, data cleansing, transformation, and integrating data with other sources. You'll learn how to design IAM for data governance, deploy ML pipelines with Vertex AI, leverage pre-built GCP models as a service, and visualize data with Google Data Studio to build compelling rep ...
Critical Data Literacy

A short course for students to increase their proficiency in analyzing and interpreting data visualizations. By completing this short course, students will be able to explain the importance of data literacy and identify data visualization issues in order to improve their own skills in data storytelling. The intended outcome of this course is to help students become more discerning and critical users of data, graphs, charts, and infographics. Understanding data visualizations has never been more important. Every day we are inundated with more data, graphs, and charts. Some of these data visualizations are well designed and easy to understand, and others are confusing and misleading. Data literacy is often framed as a set of skills for data professionals, but we believe data literacy is for everyone. Everyone can benefit from improving their understanding of how data is created and their ability to analyze and interpret data. In this book, we will introduce the key stages in ...
Building an Effective Data Science Practice

Gain a deep understanding of data science and the thought process needed to solve problems in that field using the required techniques, technologies, and skills that go into forming an interdisciplinary team. This book will enable you to set up an effective team of engineers, data scientists, analysts, and other stakeholders that can collaborate effectively on crucial aspects such as problem formulation, execution of experiments, and model performance evaluation. You'll start by delving into the fundamentals of data science - classes of data science problems, data science techniques and their applications - and gradually work up to building a professional reference operating model for a data science function in an organization. This operating model covers the roles and skills required in a team, the techniques and technologies they use, and the best practices typically followed in executing data science projects. Building an Effective Data Science Practice provides a common base ...
Essential Math for Data Science

Master the math needed to excel in data science, machine learning, and statistics. In this book author Thomas Nield guides you through areas like calculus, probability, linear algebra, and statistics and how they apply to techniques like linear regression, logistic regression, and neural networks. Along the way you'll also gain practical insights into the state of data science and how to use those insights to maximize your career. Learn how to: Use Python code and libraries like SymPy, NumPy, and scikit-learn to explore essential mathematical concepts like calculus, linear algebra, statistics, and machine learning; Understand techniques like linear regression, logistic regression, and neural networks in plain English, with minimal mathematical notation and jargon; Perform descriptive statistics and hypothesis testing on a dataset to interpret p-values and statistical significance; Manipulate vectors and matrices and perform matrix decomposition; Integrate and build upon incremental ...
Beginning Data Science in R 4, 2nd Edition

Discover best practices for data analysis and software development in R and start on the path to becoming a fully fledged data scientist. Updated for the R 4.0 release, this book teaches you techniques for both data manipulation and visualization and shows you the best way to develop new software packages for R. Beginning Data Science in R 4, Second Edition details how data science is a combination of statistics, computational science, and machine learning. You'll see how to efficiently structure and mine data to extract useful patterns and build mathematical models. This requires computational methods and programming, and R is an ideal programming language for this. This book is based on a number of lecture notes for classes the author has taught on data science and statistical programming using the R programming language. Modern data analysis requires computational skills and usually a minimum of programming. ...
Reproduction of site books is authorized only for informative purposes and strictly for personal, private use.
IT eBooks Group © 2011-2025