Essential Math for Data Science
Master the math needed to excel in data science, machine learning, and statistics. In this book author Thomas Nield guides you through areas like calculus, probability, linear algebra, and statistics and how they apply to techniques like linear regression, logistic regression, and neural networks. Along the way you'll also gain practical insights into the state of data science and how to use those insights to maximize your career.
Learn how to: Use Python code and libraries like SymPy, NumPy, and scikit-learn to explore essential mathematical concepts like calculus, linear algebra, statistics, and machine learning; Understand techniques like linear regression, logistic regression, and neural networks in plain English, with minimal mathematical notation and jargon; Perform descriptive statistics and hypothesis testing on a dataset to interpret p-values and statistical significance; Manipulate vectors and matrices and perform matrix decomposition; Integrate and build upon incremental ...
Beginning Data Science in R 4, 2nd Edition
Discover best practices for data analysis and software development in R and start on the path to becoming a fully-fledged data scientist. Updated for the R 4.0 release, this book teaches you techniques for both data manipulation and visualization and shows you the best way to develop new software packages for R.
Beginning Data Science in R 4, Second Edition details how data science is a combination of statistics, computational science, and machine learning. You'll see how to efficiently structure and mine data to extract useful patterns and build mathematical models. This requires computational methods and programming, and R is an ideal programming language for this.
This book is based on a number of lecture notes for classes the author has taught on data science and statistical programming using the R programming language. Modern data analysis requires computational skills and usually a minimum of programming. ...
Productive and Efficient Data Science with Python
This book focuses on Python-based tools and techniques that help you become highly productive in all aspects of a typical data science stack, such as statistical analysis, visualization, model selection, and feature engineering.
You'll review the inefficiencies and bottlenecks lurking in the daily business process and solve them with practical solutions. Automation of repetitive data science tasks is a key mindset that is promoted throughout the book. You'll learn how to extend your existing coding practice to handle larger datasets efficiently with the help of advanced libraries and packages that already exist in the Python ecosystem.
The book focuses on topics such as how to measure the memory footprint and execution speed of machine learning models, quality-test a data science pipeline, and modularize a data science pipeline for app development. You'll review Python libraries that come in very handy for automating and speeding up day-to-day tasks.
In the end ...
Fundamentals of Data Engineering
Data engineering has grown rapidly in the past decade, leaving many software engineers, data scientists, and analysts looking for a comprehensive view of this practice. With this practical book, you'll learn how to plan and build systems to serve the needs of your organization and customers by evaluating the best technologies available through the framework of the data engineering lifecycle.
Authors Joe Reis and Matt Housley walk you through the data engineering lifecycle and show you how to stitch together a variety of cloud technologies to serve the needs of downstream data consumers. You'll understand how to apply the concepts of data generation, ingestion, orchestration, transformation, storage, and governance that are critical in any data environment regardless of the underlying technology.
This book will help you: Get a concise overview of the entire data engineering landscape; Assess data engineering problems using an end-to-end framework of best practices; Cut through mar ...
Pro Data Mashup for Power BI
This book provides all you need to find data from external sources and load and transform that data into Power BI where you can mine it for business insights and a competitive edge. This ranges from connecting to corporate databases such as Azure SQL and SQL Server to file-based data sources, and cloud- and web-based data sources. The book also explains the use of Direct Query and Live Connect to establish instant connections to databases and data warehouses and avoid loading data.
The book provides detailed guidance on techniques for transforming inbound data into normalized data sets that are easy to query and analyze. This covers data cleansing, data modification, and standardization as well as merging source data into robust data structures that can feed into your data model. You will learn how to pivot and transpose data and extrapolate missing values as well as integrate external programs such as R and Python into a Power Query data flow. You will also see how to handle errors ...
Python for Data Analysis, 3rd Edition
Get the definitive handbook for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.10 and pandas 1.4, the third edition of this hands-on guide is packed with practical case studies that show you how to solve a broad set of data analysis problems effectively. You'll learn the latest versions of pandas, NumPy, and Jupyter in the process.
Written by Wes McKinney, the creator of the Python pandas project, this book is a practical, modern introduction to data science tools in Python. It's ideal for analysts new to Python and for Python programmers new to data science and scientific computing. Data files and related material are available on GitHub. ...
Practical Data Privacy
Between major privacy regulations like the GDPR and CCPA and expensive and notorious data breaches, there has never been so much pressure to ensure data privacy. Unfortunately, integrating privacy into data systems is still complicated. This essential guide will give you a fundamental understanding of modern privacy building blocks, like differential privacy, federated learning, and encrypted computation. Based on hard-won lessons, this book provides solid advice and best practices for integrating breakthrough privacy-enhancing technologies into production systems.
Practical Data Privacy answers important questions such as: What do privacy regulations like GDPR and CCPA mean for my data workflows and data science use cases? What does "anonymized data" really mean? How do I actually anonymize data? How do federated learning and analysis work? Homomorphic encryption sounds great, but is it ready for use? How do I compare and choose the best privacy-preserving technologies and method ...
Expert Data Modeling with Power BI, 2nd Edition
This book is a comprehensive guide to understanding the ins and outs of data modeling and how to confidently create full-fledged data models using Power BI. In this new, fully updated edition, you'll learn how to connect data from multiple sources, understand data, define and manage relationships between data, and shape data models to gain deep and detailed insights about your organization. As you advance through the chapters, the book will demonstrate how to prepare efficient data models in the Power Query Editor and use simpler DAX code with new data modeling features. You'll explore how to use the various data modeling and navigation techniques and perform custom calculations using the modeling features with the help of real-world examples. Finally, you'll learn how to use some new and advanced modeling features to enhance your data models to carry out a wide variety of complex tasks. Additionally, you'll learn valuable best practices and explore common data modeling complications a ...
Access Data Analysis Cookbook
This book offers practical recipes to solve a variety of common problems that users have with extracting Access data and performing calculations on it. Whether you use Access 2007 or an earlier version, this book will teach you new methods to query data, different ways to move data in and out of Access, how to calculate answers to financial and investment issues, how to jump beyond SQL by manipulating data with VBA, and more. ...
Real World Instrumentation with Python
Learn how to develop your own applications to monitor or control instrumentation hardware. Whether you need to acquire data from a device or automate its functions, this practical book shows you how to use Python's rapid development capabilities to build interfaces that include everything from software to wiring. You get step-by-step instructions, clear examples, and hands-on tips for interfacing a PC to a variety of devices.
Use the book's hardware survey to identify the interface type for your particular device, and then follow detailed examples to develop an interface with Python and C. Organized by interface type, data processing activities, and user interface implementations, this book is for anyone who works with instrumentation, robotics, data acquisition, or process control. ...