Python Object-Oriented Programming, 4th Edition
Object-oriented programming (OOP) is a popular design paradigm in which data and behaviors are encapsulated in such a way that they can be manipulated together. Python Object-Oriented Programming, Fourth Edition dives deep into the various aspects of OOP, Python as an OOP language, common and advanced design patterns, and hands-on data manipulation and testing of more complex OOP systems. These concepts are consolidated by open-ended exercises, as well as a real-world case study at the end of every chapter, newly written for this edition. All example code is now compatible with Python 3.9+ syntax and has been updated with type hints for ease of learning.
Steven and Dusty provide a comprehensive, illustrative tour of important OOP concepts, such as inheritance, composition, and polymorphism, and explain how they work together with Python's classes and data structures to facilitate good design. The book also features an in-depth look at Python's exception handling and how ...
Data Science at the Command Line, 2nd Edition
This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 100 Unix power tools, useful whether you work with Windows, macOS, or Linux.
You'll quickly discover why the command line is an agile, scalable, and extensible technology. Even if you're comfortable processing data with Python or R, you'll learn how to greatly improve your data science workflow by leveraging the command line's power. This book is ideal for data scientists, analysts, engineers, system administrators, and researchers.
Obtain data from websites, APIs, databases, and spreadsheets; Perform scrub operations on text, CSV, HTML, XML, and JSON files; Explore data, compute descriptive statistics, and create visualizations; M ...
Essential Math for Data Science
Master the math needed to excel in data science, machine learning, and statistics. In this book, author Thomas Nield guides you through areas like calculus, probability, linear algebra, and statistics and how they apply to techniques like linear regression, logistic regression, and neural networks. Along the way you'll also gain practical insights into the state of data science and how to use those insights to maximize your career.
Learn how to: Use Python code and libraries like SymPy, NumPy, and scikit-learn to explore essential mathematical concepts like calculus, linear algebra, statistics, and machine learning; Understand techniques like linear regression, logistic regression, and neural networks in plain English, with minimal mathematical notation and jargon; Perform descriptive statistics and hypothesis testing on a dataset to interpret p-values and statistical significance; Manipulate vectors and matrices and perform matrix decomposition; Integrate and build upon incremental ...
Advanced Analytics with PySpark
The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics problems using PySpark, Spark's Python API, and other best practices in Spark programming.
Data scientists Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills offer an introduction to the Spark ecosystem, then dive into patterns that apply common techniques, including classification, clustering, collaborative filtering, and anomaly detection, to fields such as genomics, security, and finance. This updated edition also covers NLP and image processing.
If you have a basic understanding of machine learning and statistics and you program in Python, this book will get you started with large-scale data analysis.
Familiarize yourself wi ...
Effective Data Science Infrastructure
Effective Data Science Infrastructure: How to make data scientists more productive is a hands-on guide to assembling infrastructure for data science and machine learning applications. It reveals the processes used at Netflix and other data-driven companies to manage their cutting-edge data infrastructure. In it, you'll master scalable techniques for data storage, computation, experiment tracking, and orchestration that are relevant to companies of all shapes and sizes. You'll learn how you can make data scientists more productive with your existing cloud infrastructure, a stack of open source software, and idiomatic Python.
The author is donating proceeds from this book to charities that support women and underrepresented groups in data science.
Growing data science projects from prototype to production requires reliable infrastructure. Using the powerful new techniques and tooling in this book, you can stand up an infrastructure stack that will scale with any organization, from ...
Data-Oriented Programming
Data-Oriented Programming is a one-of-a-kind guide that introduces the data-oriented paradigm. This groundbreaking approach represents data with generic immutable data structures. It simplifies state management, eases concurrency, and does away with the common problems you'll find in object-oriented code. The book presents powerful new ideas through conversations, code snippets, and diagrams that help you quickly grok what's great about DOP. Best of all, the paradigm is language-agnostic: you'll learn to write DOP code that can be implemented in JavaScript, Ruby, Python, Clojure, and also in traditional OO languages like Java or C#.
Code that combines behavior and data, as is common in object-oriented designs, can introduce almost unmanageable complexity for state management. The data-oriented programming (DOP) paradigm simplifies state management by holding application data in immutable generic data structures and then performing calculations using non-mutating general-purpose fun ...
Pro Data Mashup for Power BI
This book provides all you need to find data from external sources and load and transform that data into Power BI where you can mine it for business insights and a competitive edge. This ranges from connecting to corporate databases such as Azure SQL and SQL Server to file-based data sources, and cloud- and web-based data sources. The book also explains the use of Direct Query and Live Connect to establish instant connections to databases and data warehouses and avoid loading data.
The book provides detailed guidance on techniques for transforming inbound data into normalized data sets that are easy to query and analyze. This covers data cleansing, data modification, and standardization as well as merging source data into robust data structures that can feed into your data model. You will learn how to pivot and transpose data and extrapolate missing values as well as integrate external programs such as R and Python into a Power Query data flow. You also will see how to handle errors ...
Python for Data Science
Python is an ideal choice for accessing, manipulating, and gaining insights from data of all kinds. Python for Data Science introduces you to the Pythonic world of data analysis with a learn-by-doing approach rooted in practical examples and hands-on activities. You'll learn how to write Python code to obtain, transform, and analyze data, practicing state-of-the-art data processing techniques for use cases in business management, marketing, and decision support.
You will discover Python's rich set of built-in data structures for basic operations, as well as its robust ecosystem of open-source libraries for data science, including NumPy, pandas, scikit-learn, matplotlib, and more. Examples show how to load data in various formats, how to streamline, group, and aggregate data sets, and how to create charts, maps, and other visualizations. Later chapters go in-depth with demonstrations of real-world data applications, including using location data to power a taxi service, market basket ...
Python for Geospatial Data Analysis
In spatial data science, things in closer proximity to one another likely have more in common than things that are farther apart. With this practical book, geospatial professionals, data scientists, business analysts, geographers, geologists, and others familiar with data analysis and visualization will learn the fundamentals of spatial data analysis to gain a deeper understanding of their data questions.
Author Bonny P. McClain demonstrates why detecting and quantifying patterns in geospatial data is vital. Both proprietary and open source platforms allow you to process and visualize spatial information. This book is for people familiar with data analysis or visualization who are eager to explore geospatial integration with Python. ...
Advanced Data Analytics Using Python, 2nd Edition
Understand advanced data analytics concepts such as time series and principal component analysis with ETL, supervised learning, and PySpark using Python. This book covers architectural patterns in data analytics, text and image classification, optimization techniques, natural language processing, and computer vision in the cloud environment.
Generic design patterns in Python programming are clearly explained, emphasizing architectural practices such as hot potato anti-patterns. You'll review recent advances in databases such as Neo4j, Elasticsearch, and MongoDB. You'll then study feature engineering in images and texts with implementing business logic and see how to build machine learning and deep learning models using transfer learning.
Advanced Data Analytics Using Python, 2nd Edition features a chapter on clustering with a neural network, regularization techniques, and algorithmic design patterns in data analytics with reinforcement learning. Finally, the recommender system in PySpa ...