Advanced Analytics with Spark, 2nd Edition
In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. Updated for Spark 2.1, this edition acts as an introduction to these techniques and other best practices in Spark programming.
You'll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques - including classification, clustering, collaborative filtering, and anomaly detection - to fields such as genomics, security, and finance.
If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you'll find the book's patterns useful for working on your own data applications.
Familiarize yourself with the Spark programming model; Become comfortable within the Spark ec ...
Agile Data Science 2.0
Data science teams looking to turn research into useful analytics applications require not only the right tools, but also the right approach if they're to succeed. With the revised second edition of this hands-on guide, up-and-coming data scientists will learn how to use the Agile Data Science development methodology to build data applications with Python, Apache Spark, Kafka, and other tools.
Author Russell Jurney demonstrates how to compose a data platform for building, deploying, and refining analytics applications with Apache Kafka, MongoDB, Elasticsearch, d3.js, scikit-learn, and Apache Airflow. You'll learn an iterative approach that lets you quickly change the kind of analysis you're doing, depending on what the data is telling you. Publish data science work as a web application, and effect meaningful change in your organization.
Build value from your data in a series of agile sprints, using the data-value pyramid; Extract features for statistical models from a single data ...
Hands-On Automated Machine Learning
AutoML is designed to automate parts of machine learning. Readily available AutoML tools make data science practitioners' work easier and have been well received in the advanced analytics community. Automated Machine Learning covers the foundation needed to create automated machine learning modules and helps you get up to speed with them in the most practical way possible.
In this book, you'll learn how to automate different tasks in the machine learning pipeline, such as data preprocessing, feature selection, model training, model optimization, and much more. It also demonstrates how you can use the available automation libraries, such as auto-sklearn and MLBox, and how to create and extend your own custom AutoML components for machine learning.
By the end of this book, you will have a clearer understanding of the different aspects of automated Machine Learning, and you'll be able to incorporate automation tasks using practical datasets. You can leverage your lea ...
Machine Learning Solutions
Machine learning (ML) helps you find hidden insights from your data without the need for explicit programming. This book is your key to solving any kind of ML problem you might come across in your job.
You'll encounter a set of simple to complex problems while building ML models, and you'll not only resolve these problems, but you'll also learn how to build projects based on each problem, with a practical approach and easy-to-follow examples.
The book includes a wide range of applications, from analytics and NLP to computer vision. Some of the applications you will be working on include stock price prediction, a recommendation engine, a chatbot, a facial expression recognition system, and many more. The problem examples we cover include identifying the right algorithm for your dataset and use cases, creating and labeling datasets, getting enough clean data to carry out processing, identifying outliers, overfitting datasets, hyperparameter tuning, and more. Here, ...
Learning Spark
Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates.
Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You'll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.
Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell; Leverage Spark's powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib; Use one programming paradigm instead of mixing and matching tools like Hiv ...
Robot Operating System (ROS) for Absolute Beginners
Learn how to get started with robotics programming using Robot Operating System (ROS). Aimed at absolute beginners in ROS, Linux, and Python, this short guide shows you how to build your own robotics projects.
ROS is an open-source and flexible framework for writing robotics software. With a hands-on approach and sample projects, Robot Operating System for Absolute Beginners will enable you to begin your first robot project. You will learn the basic concepts of working with ROS and begin coding with ROS APIs in both C++ and Python.
Install ROS; Review fundamental ROS concepts; Work with frequently used commands in ROS; Build a mobile robot from scratch using ROS. ...
Mastering Machine Learning Algorithms
Machine learning is a subset of AI that aims to make modern-day computer systems smarter and more intelligent. The real power of machine learning resides in its algorithms, which let machines handle even the most difficult tasks. However, as technology advances and data requirements grow, machines will have to be smarter than they are today to meet overwhelming data needs; mastering these algorithms and using them optimally is the need of the hour.
Mastering Machine Learning Algorithms is your complete guide to quickly getting to grips with popular machine learning algorithms. You will be introduced to the most widely used algorithms in supervised, unsupervised, and semi-supervised machine learning, and will learn how to use them in the best possible manner. Ranging from Bayesian models to the MCMC algorithm to Hidden Markov models, this book will teach you how to extract features from your dataset and perform dimensionality reduction by maki ...
Learning Regular Expressions
Regular expression experts have long been armed with an incredibly powerful tool, one that can be used to perform all sorts of sophisticated text processing and manipulation in just about every language and on every platform. That's the good news. The bad news is that for too long, regular expressions have been the exclusive property of only the most tech-savvy. Until now.
Ben Forta's Learning Regular Expressions teaches you the regular expressions that you really need to know, starting with simple text matches and working up to more complex topics, including the use of backreferences, conditional evaluation, and look-ahead processing. You'll learn what you can use, and you'll learn it methodically, systematically, and simply.
Regular expressions are nowhere near as complex as they appear to be at first glance. All it takes is a clear understanding of the problem being solved and how to leverage regular expressions to solve it.
Read and understand regular expressions; Use li ...
Adventures in Minecraft, 2nd Edition
If you love playing Minecraft and want to learn how to code and create your own mods, this book was designed just for you. Working within the game itself, you'll learn to set up and run your own local Minecraft server, interact with the game on PC, Mac and Raspberry Pi, and develop Python programming skills that apply way beyond Minecraft. You'll learn how to use coordinates, how to change the player's position, how to create and delete blocks and how to check when a block has been hit.
The adventures aren't limited to the virtual - you'll also learn how to connect Minecraft to a BBC micro:bit so your Minecraft world can sense and control objects in the real world! The companion website gives you access to tutorial videos to make sure you understand the book, starter kits to make setup simple, completed code files, and badges to collect for your accomplishments. Written specifically for young people by professional Minecraft geeks, this fun, easy-to-follow guide helps you expand Min ...
AWS Lambda Quick Start Guide
AWS Lambda is a part of AWS that lets you run your code without provisioning or managing servers. This enables you to deploy applications and backend services that operate with no upfront cost. This book gets you up to speed on how to build scalable systems and deploy serverless applications with AWS Lambda.
The book starts with the fundamental concepts of AWS Lambda, and then teaches you how to combine your applications with other AWS services, such as Amazon API Gateway and DynamoDB. It also gives a quick walkthrough of how to use the Serverless Framework to build larger applications, structure your code, and autogenerate boilerplate code so that you can get started quickly and work more productively.
Toward the end of the book, you will learn how to write, run, and test Lambda functions using Node.js, Java, Python, and C#. ...