Understanding Azure Data Factory
Improve your analytics and data platform to solve major challenges, including operationalizing big data and advanced analytics workloads on Azure. You will learn how to monitor complex pipelines, set alerts, and extend your organization's custom monitoring requirements.
This book starts with an overview of the Azure Data Factory as a hybrid ETL/ELT orchestration service on Azure. The book then dives into data movement and the connectivity capability of Azure Data Factory. You will learn about the support for hybrid data integration from disparate sources such as on-premise, cloud, or from SaaS applications. Detailed guidance is provided on how to transform data and on control flow. Demonstration of operationalizing the pipelines and ETL with SSIS is included. You will know how to leverage Azure Data Factory to run existing SSIS packages. As you advance through the book, you will wrap up by learning how to create a single pane for end-to-end monitoring, which is a key s ...
Apache Spark 2: Data Processing and Real-Time Analytics
Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your own data flow and machine learning programs on this platform.
You will work with the different modules in Apache Spark, such as interactive querying with Spark SQL, using DataFrames and datasets, implementing streaming analytics with Spark Streaming, and applying machine learning and deep learning techniques on Spark using MLlib and various external tools.
By the end of this elaborately designed Learning Path, you will have all the knowledge you need to master Apache Spark, and build your own big data processing and analytics pipeline quickly and without any hassle. ...
Go Machine Learning Projects
Go is the perfect language for machine learning; it helps to clearly describe complex algorithms, and also helps developers to understand how to run efficient optimized code. This book will teach you how to implement machine learning in Go to make programs that are easy to deploy and code that is not only easy to understand and debug, but also to have its performance measured.
The book begins by guiding you through setting up your machine learning environment with Go libraries and capabilities. You will then plunge into regression analysis of a real-life house pricing dataset and build a classification model in Go to classify emails as spam or ham. Using Gonum, Gorgonia, and STL, you will explore time series analysis along with decomposition and clean up your personal Twitter timeline by clustering tweets. In addition to this, you will learn how to recognize handwriting using neural networks and convolutional neural networks. Lastly, you'll learn how to choose the most appropriate m ...
Dynamic Oracle Performance Analytics
Use an innovative approach that relies on big data and advanced analytical techniques to analyze and improve Oracle Database performance. The approach used in this book represents a step-change paradigm shift away from traditional methods. Instead of relying on a few hand-picked, favorite metrics, or wading through multiple specialized tables of information such as those found in an automatic workload repository (AWR) report, you will draw on all available data, applying big data methods and analytical techniques to help the performance tuner draw impactful, focused performance improvement conclusions.
This book briefly reviews past and present practices, along with available tools, to help you recognize areas where improvements can be made. The book then guides you through a step-by-step method that can be used to take advantage of all available metrics to identify problem areas and work toward improving them. The method presented simplifies the tuning process and solves the probl ...
Apache Hadoop 3 Quick Start Guide
Apache Hadoop is a widely used distributed data platform. It enables large datasets to be efficiently processed instead of using one large computer to store and process the data. This book will get you started with the Hadoop ecosystem, and introduce you to the main technical topics, including MapReduce, YARN, and HDFS.
The book begins with an overview of big data and Apache Hadoop. Then, you will set up a pseudo Hadoop development environment and a multi-node enterprise Hadoop cluster. You will see how the parallel programming paradigm, such as MapReduce, can solve many complex data processing problems.
The book also covers the important aspects of the big data software development lifecycle, including quality assurance and control, performance, administration, and monitoring.
You will then learn about the Hadoop ecosystem, and tools such as Kafka, Sqoop, Flume, Pig, Hive, and HBase. Finally, you will look at advanced topics, including real time streaming using Apache Storm, ...
Machine Learning for Healthcare Analytics Projects
Machine Learning (ML) has changed the way organizations and individuals use data to improve the efficiency of a system. ML algorithms allow strategists to deal with a variety of structured, unstructured, and semi-structured data. Machine Learning for Healthcare Analytics Projects is packed with new approaches and methodologies for creating powerful solutions for healthcare analytics.
This book will teach you how to implement key machine learning algorithms and walk you through their use cases by employing a range of libraries from the Python ecosystem. You will build five end-to-end projects to evaluate the efficiency of Artificial Intelligence (AI) applications for carrying out simple-to-complex healthcare analytics tasks. With each project, you will gain new insights, which will then help you handle healthcare data efficiently. As you make your way through the book, you will use ML to detect cancer in a set of patients using support vector machines (SVMs) and k-Neare ...
Internet of Things for Architects
The Internet of Things (IoT) is the fastest growing technology market. Industries are embracing IoT technologies to improve operational expenses, product life, and people's well-being. An architectural guide is necessary if you want to traverse the spectrum of technologies needed to build a successful IoT system, whether that's a single device or millions of devices.
This book encompasses the entire spectrum of IoT solutions, from sensors to the cloud. We start by examining modern sensor systems and focus on their power and functionality. After that, we dive deep into communication theory, paying close attention to near-range PAN, including the new Bluetooth 5.0 specification and mesh networks. Then, we explore IP-based communication in LAN and WAN, including 802.11ah, 5G LTE cellular, SigFox, and LoRaWAN. Next, we cover edge routing and gateways and their role in fog computing, as well as the messaging protocols of MQTT and CoAP.
With the data now in internet form, you'll get an ...
Mastering Qlik Sense
Qlik Sense is a powerful, self-servicing Business Intelligence tool for data discovery, analytics and visualization. It allows you to create personalized Business Intelligence solutions from raw data and get actionable insights from it.
This book is your one-stop guide to mastering Qlik Sense, catering to all your organizational BI needs. You'll see how you can seamlessly navigate through tons of data from multiple sources and take advantage of the various APIs available in Qlik and its components for guided analytics. You'll also learn how to embed visualizations into your existing BI solutions and extend the capabilities of Qlik Sense to create new visualizations and dashboards that work across all platforms. We also cover other advanced concepts such as porting your Qlik View applications to Qlik Sense,and working with Qlik Cloud. Finally, you'll implement enterprise-wide security and access control for resources and data sources through practical examples.
With the kno ...
Mastering Microsoft Power BI
This book is intended for business intelligence professionals responsible for the design and development of Power BI content as well as managers, architects and administrators who oversee Power BI projects and deployments. The chapters flow from the planning of a Power BI project through the development and distribution of content to the administration of Power BI for an organization.
BI developers will learn how to create sustainable and impactful Power BI datasets, reports, and dashboards. This includes connecting to data sources, shaping and enhancing source data, and developing an analytical data model. Additionally, top report and dashboard design practices are described using features such as Bookmarks and the Power KPI visual.
BI managers will learn how Power BI's tools work together such as with the On-premises data gateway and how content can be staged and securely distributed via Apps. Additionally, both the Power BI Report Server and Power BI Premium are reviewed.
Python Data Analytics, 2nd Edition
Explore the latest Python tools and techniques to help you tackle the world of data acquisition and analysis. You'll review scientific computing with NumPy, visualization with matplotlib, and machine learning with scikit-learn.
This revision is fully updated with new content on social media data analysis, image analysis with OpenCV, and deep learning libraries. Each chapter includes multiple examples demonstrating how to work with each library. At its heart lies the coverage of pandas, for high-performance, easy-to-use data structures and tools for data manipulation.
Mastering Predictive Analytics with scikit-learn and TensorFlow
Python is a programming language that provides a wide range of features that can be used in the field of data science. Mastering Predictive Analytics with scikit-learn and TensorFlow covers various implementations of ensemble methods, how they are used with real-world datasets, and how they improve prediction accuracy in classification and regression problems.
This book starts with ensemble methods and their features. You will see that scikit-learn provides tools for choosing hyperparameters for models. As you make your way through the book, you will cover the nitty-gritty of predictive analytics and explore its features and characteristics. You will also be introduced to artificial neural networks and TensorFlow, and how it is used to create neural networks. In the final chapter, you will explore factors such as computational power, along with improvement methods and software enhancements for efficient predictive analytics.
By the end of this book, you will be well ...