Practical Statistics for Data Scientists, 2nd EditionStatistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. The second edition of this popular guide adds comprehensive examples in Python, provides practical guidance on applying statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.
Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R or Python programming languages and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.
With this book, you'll learn: Why exploratory data analysis is a key preliminary step in data science; How random sampling can reduce bias and yield a higher-quality dataset, even with big data; How the principles of experimental design yield definitive answers to ...
Building an Anonymization PipelineHow can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner.
Luk Arbuckle and Khaled El Emam from Privacy Analytics explore end-to-end solutions for anonymizing device and IoT data, based on collection models and use cases that address real business needs. These examples come from some of the most demanding data environments, such as healthcare, using approaches that have withstood the test of time.
Create anonymization solutions diverse enough to cover a spectrum of use cases; Match your solutions to the data you use, the people you share it with, and your analysis goals; Build anonymization pipelines around various data collection models to cover different business needs; Generate an anonymized version of original data or use ...
PCI DSSGain a broad understanding of how PCI DSS is structured and obtain a high-level view of the contents and context of each of the 12 top-level requirements. The guidance provided in this book will help you effectively apply PCI DSS in your business environments, enhance your payment card defensive posture, and reduce the opportunities for criminals to compromise your network or steal sensitive data assets.
Businesses are seeing an increased volume of data breaches, where an opportunist attacker from outside the business or a disaffected employee successfully exploits poor company practices. Rather than being a regurgitation of the PCI DSS controls, this book aims to help you balance the needs of running your business with the value of implementing PCI DSS for the protection of consumer payment card data.
Applying lessons learned from history, military experiences (including multiple deployments into hostile areas), numerous PCI QSA assignments, and corporate cybersecurity and InfoS ...
Python One-LinersPython One-Liners will teach you how to read and write "one-liners": concise statements of useful functionality packed into a single line of code. You'll learn how to systematically unpack and understand any line of Python code, and write eloquent, powerfully compressed Python like an expert.
The book's five chapters cover tips and tricks, regular expressions, machine learning, core data science topics, and useful algorithms. Detailed explanations of one-liners introduce key computer science concepts and boost your coding and analytical skills.
You'll learn about advanced Python features such as list comprehension, slicing, lambda functions, regular expressions, map and reduce functions, and slice assignments.
You'll also learn how to: Leverage data structures to solve real-world problems, like using Boolean indexing to find cities with above-average pollution; Use NumPy basics such as array, shape, axis, type, broadcasting, advanced indexing, slicing, sorting, searching, aggr ...
Mastering Azure Machine LearningThe increase being seen in data volume today requires distributed systems, powerful algorithms, and scalable cloud infrastructure to compute insights and train and deploy machine learning (ML) models. This book will help you improve your knowledge of building ML models using Azure and end-to-end ML pipelines on the cloud.
The book starts with an overview of an end-to-end ML project and a guide on how to choose the right Azure service for different ML tasks. It then focuses on Azure ML and takes you through the process of data experimentation, data preparation, and feature engineering using Azure ML and Python. You'll learn advanced feature extraction techniques using natural language processing (NLP), classical ML techniques, and the secrets of both a great recommendation engine and a performant computer vision model using deep learning methods. You'll also explore how to train, optimize, and tune models using Azure AutoML and HyperDrive, and perform distributed training on Azure ML ...
Hands-On Machine Learning with C++C++ can make your machine learning models run faster and more efficiently. This handy guide will help you learn the fundamentals of machine learning (ML), showing you how to use C++ libraries to get the most out of your data. This book makes machine learning with C++ for beginners easy with its example-based approach, demonstrating how to implement supervised and unsupervised ML algorithms through real-world examples.
This book will get you hands-on with tuning and optimizing a model for different use cases, assisting you with model selection and the measurement of performance. You'll cover techniques such as product recommendations, ensemble learning, and anomaly detection using modern C++ libraries such as PyTorch C++ API, Caffe2, Shogun, Shark-ML, mlpack, and dlib. Next, you'll explore neural networks and deep learning using examples such as image classification and sentiment analysis, which will help you solve various problems. Later, you'll learn how to handle production and de ...
Build a Career in Data ScienceYou are going to need more than technical knowledge to succeed as a data scientist. Build a Career in Data Science teaches you what school leaves out, from how to land your first job to the lifecycle of a data science project, and even how to become a manager.
What are the keys to a data scientist's long-term success? Blending your technical know-how with the right "soft skills" turns out to be a central ingredient of a rewarding career.
Build a Career in Data Science is your guide to landing your first data science job and developing into a valued senior employee. By following clear and simple instructions, you'll learn to craft an amazing resume and ace your interviews. In this demanding, rapidly changing field, it can be challenging to keep projects on track, adapt to company needs, and manage tricky stakeholders. You'll love the insights on how to handle expectations, deal with failures, and plan your career path in the stories from seasoned data scientists included in the bo ...
Machine Learning with R, the tidyverse, and mlrMachine learning (ML) is a collection of programming techniques for discovering relationships in data. With ML algorithms, you can cluster and classify data for tasks like making recommendations or fraud detection and make predictions for sales trends, risk analysis, and other forecasts. Once the domain of academic data scientists, machine learning has become a mainstream business process, and tools like the easy-to-learn R programming language put high-quality data analysis in the hands of any programmer. Machine Learning with R, the tidyverse, and mlr teaches you widely used ML techniques and how to apply them to your own datasets using the R programming language and its powerful ecosystem of tools. This book will get you started!
Machine Learning with R, the tidyverse, and mlr gets you started in machine learning using R Studio and the awesome mlr machine learning package. This practical guide simplifies theory and avoids needlessly complicated statistics or math. All core ML tec ...
Deep Reinforcement Learning in ActionHumans learn best from feedback - we are encouraged to take actions that lead to positive results while deterred by decisions with negative consequences. This reinforcement process can be applied to computer programs allowing them to solve more complex problems that classical programming cannot. Deep Reinforcement Learning in Action teaches you the fundamental concepts and terminology of deep reinforcement learning, along with the practical skills and techniques you'll need to implement it into your own projects.
Deep reinforcement learning AI systems rapidly adapt to new environments, a vast improvement over standard neural networks. A DRL agent learns like people do, taking in raw data such as sensor input and refining its responses and predictions through trial and error.
Deep Reinforcement Learning in Action teaches you how to program AI agents that adapt and improve based on direct feedback from their environment. In this example-rich tutorial, you'll master foundational and ...
Excel Data Analysis, 2nd EditionThis book offers a comprehensive and readable introduction to modern business and data analytics. It is based on the use of Excel, a tool that virtually all students and professionals have access to. The explanations are focused on understanding the techniques and their proper application, and are supplemented by a wealth of in-chapter and end-of-chapter exercises. In addition to the general statistical methods, the book also includes Monte Carlo simulation and optimization. The second edition has been thoroughly revised: new topics, exercises and examples have been added, and the readability has been further improved.
The book is primarily intended for students in business, economics and government, as well as professionals, who need a more rigorous introduction to business and data analytics - yet also need to learn the topic quickly and without overly academic explanations. ...