IT eBooks
Apache Spark 2: Data Processing and Real-Time Analytics

Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your own data flow and machine learning programs on this platform. You will work with the different modules in Apache Spark, such as interactive querying with Spark SQL, using DataFrames and datasets, implementing streaming analytics with Spark Streaming, and applying machine learning and deep learning techniques on Spark using MLlib and various external tools. By the end of this elaborately designed Learning Path, you will have all the knowledge you need to master Apache Spark, and build your own big data processing and analytics pipeline quickly and without any hassle. ...
Machine Learning with Apache Spark Quick Start Guide

Every person and every organization in the world manages data, whether they realize it or not. Data is used to describe the world around us and can be used for almost any purpose, from analyzing consumer habits to fighting disease and serious organized crime. Ultimately, we manage data in order to derive value from it, and many organizations around the world have traditionally invested in technology to help process their data faster and more efficiently. But we now live in an interconnected world driven by mass data creation and consumption, where data is no longer rows and columns restricted to a spreadsheet but an organic and evolving asset in its own right. With this realization come major challenges for organizations: how do we manage the sheer size of data being created every second (think not only spreadsheets and databases, but also social media posts, images, videos, music, blogs and so on)? And once we can manage all of this data, how do we derive real value from it? Th ...
Data Analysis with Python

Data Analysis with Python offers a modern approach to data analysis so that you can work with the latest and most powerful Python tools, AI techniques, and open source libraries. Industry expert David Taieb shows you how to bridge data science with the power of programming and algorithms in Python. You'll work with complex algorithms and cutting-edge AI in your data analysis. Learn how to analyze data with hands-on examples using Python-based tools and Jupyter Notebook. You'll find the right balance of theory and practice, with extensive code files that you can integrate right into your own data projects. Then explore the power of this approach by working with it across key industry case studies. Four fascinating and full projects connect you to the most critical data analysis challenges you're likely to meet today. The first of these is an image recognition application with TensorFlow, embracing the importance of AI in data analysis today. The secon ...
Prepare Your Data for Tableau

Focus on the most important and most often overlooked factor in a successful Tableau project - data. Without a reliable data source, you will not achieve the results you hope for in Tableau. This book does more than teach the mechanics of data preparation. It teaches you how to look at data in a new way, how to recognize the most common issues that hinder analytics, and how to mitigate those issues one by one. Tableau can change the course of business, but the old adage of "garbage in, garbage out" is the hard truth that hides behind every Tableau sales pitch. That amazing sales demo does not work as well with bad data. The unfortunate reality is that almost all data starts out in a less-than-perfect state. Data prep is hard. Traditionally, we were forced into the world of the database, where complex ETL (Extract, Transform, Load) operations created by the data team did all the heavy lifting for us. Fortunately, we have moved past those days. With the introduction of the Tableau Dat ...
Data Science from Scratch, 2nd Edition

To really learn data science, you should not only master the tools - data science libraries, frameworks, modules, and toolkits - but also understand the ideas and principles underlying them. Updated for Python 3.6, this second edition of Data Science from Scratch shows you how these tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Packed with new material on deep learning, statistics, and natural language processing, this updated book shows you how to find the gems in today's messy glut of data. Get a crash course in Python; Learn the basics of linear algebra, statistics, and probability - and how and when they're used in data science; Collect, explore, clean, munge, and manipulate data; Dive into the fundamentals of machine ...
Practical Synthetic Data Generation

Building and testing machine learning models requires access to large and diverse data. But where can you find usable datasets without running into privacy issues? This practical book introduces techniques for generating synthetic data - fake data generated from real data - so you can perform secondary analysis to do research, understand customer behaviors, develop new products, or generate new revenue. Data scientists will learn how synthetic data generation provides a way to make such data broadly available for secondary purposes while addressing many privacy concerns. Analysts will learn the principles and steps for generating synthetic data from real datasets. And business leaders will see how synthetic data can help accelerate time to a product or solution. This book describes: Steps for generating synthetic data using multivariate normal distributions; Methods for distribution fitting covering different goodness-of-fit metrics; How to replicate the simple structure of orig ...
Graph Algorithms

Learn how graph algorithms can help you leverage relationships within your data to develop intelligent solutions and enhance your machine learning models. With this practical guide, developers and data scientists will discover how graph analytics deliver value, whether they're used for building dynamic network models or forecasting real-world behavior. Mark Needham and Amy Hodler from Neo4j explain how graph algorithms describe complex structures and reveal difficult-to-find patterns - from finding vulnerabilities and bottlenecks to detecting communities and improving machine learning predictions. You'll walk through hands-on examples that show you how to use graph algorithms in Apache Spark and Neo4j, two of the most common choices for graph analytics. Learn how graph analytics reveal more predictive elements in today's data; Understand how popular graph algorithms work and how they're applied; Use sample code and tips from more than 20 graph algorithm examples; Learn which alg ...
Graph Databases, 2nd Edition

Discover how graph databases can help you manage and query highly connected data. With this practical book, you'll learn how to design and implement a graph database that brings the power of graphs to bear on a broad range of problem domains. Whether you want to speed up your response to user queries or build a database that can adapt as your business evolves, this book shows you how to apply the schema-free graph model to real-world problems. This second edition includes new code samples and diagrams, using the latest Neo4j syntax, as well as information on new functionality. Learn how different organizations are using graph databases to outperform their competitors. With this book's data modeling, query, and code examples, you'll quickly be able to implement your own solution. Model data with the Cypher query language and property graph model; Learn best practices and common pitfalls when modeling with graphs; Plan and implement a graph database solution in test-driven fashion ...
Algorithms

Algorithms are the lifeblood of computer science. They are the machines that proofs build and the music that programs play. Their history is as old as mathematics itself. This book is a wide-ranging, idiosyncratic treatise on the design and analysis of algorithms, covering several fundamental techniques, with an emphasis on intuition and the problem-solving process. The book includes important classical examples, hundreds of battle-tested exercises, far too many historical digressions, and exaclty four typos. Jeff Erickson is a computer science professor at the University of Illinois, Urbana-Champaign; this book is based on algorithms classes he has taught there since 1998. ...
SQL Server Big Data Clusters

Use this guide to one of SQL Server 2019's most impactful features - Big Data Clusters. You will learn about data virtualization and data lakes for this complete artificial intelligence (AI) and machine learning (ML) platform within the SQL Server database engine. You will know how to use Big Data Clusters to combine large volumes of streaming data for analysis along with data stored in a traditional database. For example, you can stream large volumes of data from Apache Spark in real time while executing Transact-SQL queries to bring in relevant additional data from your corporate SQL Server database. Filled with clear examples and use cases, this book provides everything necessary to get started working with Big Data Clusters in SQL Server 2019. You will learn about the architectural foundations, which are made up of Kubernetes, Spark, HDFS, and SQL Server on Linux. You are then shown how to configure and deploy Big Data Clusters in on-premises environments or in the cloud. Next, ...
Reproduction of site books is authorized only for informative purposes and strictly for personal, private use.
IT eBooks Group © 2011-2026