MongoDB: The Definitive Guide, 3rd Edition
Manage your data with a system designed to support modern application development. Updated for MongoDB 4.2, the third edition of this authoritative and accessible guide shows you the advantages of using document-oriented databases. You'll learn how this secure, high-performance system enables flexible data models, high availability, and horizontal scalability.
Authors Shannon Bradshaw, Eoin Brazil, and Kristina Chodorow provide guidance for database developers, advanced configuration for system administrators, and use cases for a variety of projects. NoSQL newcomers and experienced MongoDB users will find updates on querying, indexing, aggregation, transactions, replica sets, ops management, sharding and data administration, durability, monitoring, and security.
In six parts, this book shows you how to: Work with MongoDB, perform write operations, find documents, and create complex queries; Index collections, aggregate data, and use transactions for your application; Configure a ...
Learning SQL, 3rd Edition
As data floods into your company, you need to put it to work right away, and SQL is the best tool for the job. With the latest edition of this introductory guide, author Alan Beaulieu helps developers get up to speed with SQL fundamentals for writing database applications, performing administrative tasks, and generating reports. You'll find new chapters on SQL and big data, analytic functions, and working with very large databases.
Each chapter presents a self-contained lesson on a key SQL concept or technique using numerous illustrations and annotated examples. Exercises let you practice the skills you learn. Knowledge of SQL is a must for interacting with data. With Learning SQL, you'll quickly discover how to put the power and flexibility of this language to work.
Move quickly through SQL basics and several advanced features; Use SQL data statements to generate, manipulate, and retrieve data; Create database objects, such as tables, indexes, and constraints with SQL schema st ...
Codeless Data Structures and Algorithms
In the era of self-taught developers and programmers, essential topics in the industry are frequently learned without a formal academic foundation. A solid grasp of data structures and algorithms (DSA) is imperative for anyone looking to do professional software development and engineering, but classes in the subject can be dry or spend too much time on theory and unnecessary readings. Regardless of your programming language background, Codeless Data Structures and Algorithms has you covered.
In this book, author Armstrong Subero will help you learn DSAs without writing a single line of code. Straightforward explanations and diagrams give you a confident handle on the topic while ensuring you never have to open your code editor, use a compiler, or look at an integrated development environment. Subero introduces you to linear, tree, and hash data structures and gives you important insights behind the most common algorithms that you can directly apply to your own programs.
Codeless ...
Building a Data Integration Team
Find the right people with the right skills. This book clarifies best practices for creating high-functioning data integration teams, enabling you to understand the skills and requirements, documents, and solutions for planning, designing, and monitoring both one-time migration and daily integration systems.
The growth of data is exploding. With multiple sources of information constantly arriving across enterprise systems, combining these systems into a single, cohesive, and documentable unit has become more important than ever. But the approach toward integration is much different than in other software disciplines, requiring the ability to code, collaborate, and disentangle complex business rules into a scalable model.
Data migrations and integrations can be complicated. In many cases, project teams save the actual migration for the last weekend of the project, and any issues can lead to missed deadlines or, at worst, corrupted data that needs to be reconciled post-deployment. T ...
Beginning Microsoft Power BI
Analyze company data quickly and easily using Microsoft's powerful data tools. Learn to build scalable and robust data models, clean and combine different data sources effectively, and create compelling and professional visuals.
Beginning Power BI is a hands-on, activity-based guide that takes you through the process of analyzing your data using the tools that make up the core of Microsoft's self-service BI offering. Starting with Power Query, you will learn how to get data from a variety of sources, and see just how easy it is to clean and shape the data prior to importing it into a data model. Using Power BI tabular and Data Analysis Expressions (DAX), you will learn to create robust, scalable data models that will serve as the foundation of your data analysis. From there you will enter the world of compelling interactive visualizations to analyze and gain insight into your data. You will wrap up your Power BI journey by learning how to package and share your reports an ...
Cloud Native Data Center Networking
If you want to study, build, or simply validate your thinking about modern cloud native data center networks, this is your book. Whether you're pursuing a multitenant private cloud, a network for running machine learning, or an enterprise data center, author Dinesh Dutt takes you through the steps necessary to design a data center that's affordable, high capacity, easy to manage, agile, and reliable.
Ideal for network architects, data center operators, and network and containerized application developers, this book mixes theory with practice to guide you through the architecture and protocols you need to create and operate a robust, scalable network infrastructure. The book offers a vendor-neutral way to look at network design. For those interested in open networking, this book is chock-full of examples using open source software, from FRR to Ansible.
In the context of a cloud native data center, you'll examine: Clos topology; Network disaggregation; Network operating system choi ...
The Practitioner's Guide to Graph Data
Graph data closes the gap between the way humans and computers view the world. While computers rely on static rows and columns of data, people navigate and reason about life through relationships. This practical guide demonstrates how graph data brings these two approaches together. By working with concepts from graph theory, database schema, distributed systems, and data analysis, you'll arrive at a unique intersection known as graph thinking.
Authors Denise Koessler Gosnell and Matthias Broecheler show data engineers, data scientists, and data analysts how to solve complex problems with graph databases. You'll explore templates for building with graph technology, along with examples that demonstrate how teams think about graph data within an application.
Build an example application architecture with relational and graph technologies; Use graph technology to build a Customer 360 application, the most popular graph data pattern today; Dive into hierarchical data and troubleshoot ...
Cassandra: The Definitive Guide, 3rd Edition
Imagine what you could do if scalability wasn't a problem. With this hands-on guide, you'll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This third edition, updated for Cassandra 4.0, provides the technical details and practical examples you need to put this database to work in a production environment.
Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra's nonrelational design, with special attention to data modeling. If you're a developer, DBA, or application architect looking to solve a database scaling issue or future-proof your application, this guide helps you harness Cassandra's speed and flexibility.
Understand Cassandra's distributed and decentralized structure; Use the Cassandra Query Language (CQL) and cqlsh, the CQL shell; Create a working data model and compare it with an equivalent relational model; Develop sample applications using ...
Practical Statistics for Data Scientists, 2nd Edition
Statistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. The second edition of this popular guide adds comprehensive examples in Python, provides practical guidance on applying statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.
Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R or Python programming languages and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.
With this book, you'll learn: Why exploratory data analysis is a key preliminary step in data science; How random sampling can reduce bias and yield a higher-quality dataset, even with big data; How the principles of experimental design yield definitive answers to ...
Building an Anonymization Pipeline
How can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner.
Luk Arbuckle and Khaled El Emam from Privacy Analytics explore end-to-end solutions for anonymizing device and IoT data, based on collection models and use cases that address real business needs. These examples come from some of the most demanding data environments, such as healthcare, using approaches that have withstood the test of time.
Create anonymization solutions diverse enough to cover a spectrum of use cases; Match your solutions to the data you use, the people you share it with, and your analysis goals; Build anonymization pipelines around various data collection models to cover different business needs; Generate an anonymized version of original data or use ...