Explore the modern market of data analytics platforms and the benefits of using Snowflake computing, the data warehouse built for the cloud.
With the rise of cloud technologies, organizations prefer to deploy their analytics using cloud providers such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform. Cloud vendors are offering modern data platforms for building cloud analytics solutions that collect data and consolidate it into a single storage solution, providing insights for business users. The core of any analytics framework is the data warehouse, and previously customers did not have many choices of platform to use.
Snowflake was built specifically for the cloud and it is a true game changer for the analytics market. This book will help onboard you to Snowflake and present best practices for deploying and using the Snowflake data warehouse. In addition, it covers modern analytics architecture and use cases. It provides use ...
Text Analytics with Python, 2nd Edition
Leverage Natural Language Processing (NLP) in Python and learn how to set up your own robust environment for performing text analytics. This second edition has gone through a major revamp and introduces several significant changes and new topics based on the recent trends in NLP.
You'll see how to use the latest state-of-the-art frameworks in NLP, coupled with machine learning and deep learning models for supervised sentiment analysis powered by Python to solve actual case studies. Start by reviewing Python for NLP fundamentals on strings and text data and move on to engineering representation methods for text data, including both traditional statistical models and newer deep learning-based embedding models. Improved techniques and new methods around parsing and processing text are discussed as well.
Text summarization and topic models have been overhauled so the book showcases how to build, tune, and interpret topic models in the context of an interest dataset on NIPS conferenc ...
Understanding Azure Data Factory
Improve your analytics and data platform to solve major challenges, including operationalizing big data and advanced analytics workloads on Azure. You will learn how to monitor complex pipelines, set alerts, and extend your organization's custom monitoring requirements.
This book starts with an overview of Azure Data Factory as a hybrid ETL/ELT orchestration service on Azure. The book then dives into data movement and the connectivity capability of Azure Data Factory. You will learn about the support for hybrid data integration from disparate sources such as on-premises systems, the cloud, or SaaS applications. Detailed guidance is provided on how to transform data and on control flow. A demonstration of operationalizing the pipelines and ETL with SSIS is included. You will know how to leverage Azure Data Factory to run existing SSIS packages. As you advance through the book, you will wrap up by learning how to create a single pane for end-to-end monitoring, which is a key s ...
Apache Spark 2: Data Processing and Real-Time Analytics
Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and build your own data flow and machine learning programs on this platform.
You will work with the different modules in Apache Spark, such as interactive querying with Spark SQL, using DataFrames and datasets, implementing streaming analytics with Spark Streaming, and applying machine learning and deep learning techniques on Spark using MLlib and various external tools.
By the end of this elaborately designed Learning Path, you will have all the knowledge you need to master Apache Spark, and build your own big data processing and analytics pipeline quickly and without any hassle. ...
Go Machine Learning Projects
Go is the perfect language for machine learning; it helps to clearly describe complex algorithms, and also helps developers to understand how to run efficient, optimized code. This book will teach you how to implement machine learning in Go to make programs that are easy to deploy and code that is not only easy to understand and debug, but whose performance can also be measured.
The book begins by guiding you through setting up your machine learning environment with Go libraries and capabilities. You will then plunge into regression analysis of a real-life house pricing dataset and build a classification model in Go to classify emails as spam or ham. Using Gonum, Gorgonia, and STL, you will explore time series analysis along with decomposition and clean up your personal Twitter timeline by clustering tweets. In addition to this, you will learn how to recognize handwriting using neural networks and convolutional neural networks. Lastly, you'll learn how to choose the most appropriate m ...
Dynamic Oracle Performance Analytics
Use an innovative approach that relies on big data and advanced analytical techniques to analyze and improve Oracle Database performance. The approach used in this book represents a step-change paradigm shift away from traditional methods. Instead of relying on a few hand-picked, favorite metrics, or wading through multiple specialized tables of information such as those found in an automatic workload repository (AWR) report, you will draw on all available data, applying big data methods and analytical techniques to help the performance tuner draw impactful, focused performance improvement conclusions.
This book briefly reviews past and present practices, along with available tools, to help you recognize areas where improvements can be made. The book then guides you through a step-by-step method that can be used to take advantage of all available metrics to identify problem areas and work toward improving them. The method presented simplifies the tuning process and solves the probl ...
Apache Hadoop 3 Quick Start Guide
Apache Hadoop is a widely used distributed data platform. It enables large datasets to be processed efficiently across many machines, rather than relying on one large computer to store and process the data. This book will get you started with the Hadoop ecosystem, and introduce you to the main technical topics, including MapReduce, YARN, and HDFS.
The book begins with an overview of big data and Apache Hadoop. Then, you will set up a pseudo Hadoop development environment and a multi-node enterprise Hadoop cluster. You will see how parallel programming paradigms, such as MapReduce, can solve many complex data processing problems.
The book also covers the important aspects of the big data software development lifecycle, including quality assurance and control, performance, administration, and monitoring.
You will then learn about the Hadoop ecosystem, and tools such as Kafka, Sqoop, Flume, Pig, Hive, and HBase. Finally, you will look at advanced topics, including real-time streaming using Apache Storm, ...
Machine Learning for Healthcare Analytics Projects
Machine Learning (ML) has changed the way organizations and individuals use data to improve the efficiency of a system. ML algorithms allow strategists to deal with a variety of structured, unstructured, and semi-structured data. Machine Learning for Healthcare Analytics Projects is packed with new approaches and methodologies for creating powerful solutions for healthcare analytics.
This book will teach you how to implement key machine learning algorithms and walk you through their use cases by employing a range of libraries from the Python ecosystem. You will build five end-to-end projects to evaluate the efficiency of Artificial Intelligence (AI) applications for carrying out simple-to-complex healthcare analytics tasks. With each project, you will gain new insights, which will then help you handle healthcare data efficiently. As you make your way through the book, you will use ML to detect cancer in a set of patients using support vector machines (SVMs) and k-Neare ...
Internet of Things for Architects
The Internet of Things (IoT) is the fastest growing technology market. Industries are embracing IoT technologies to reduce operational expenses and improve product life and people's well-being. An architectural guide is necessary if you want to traverse the spectrum of technologies needed to build a successful IoT system, whether that's a single device or millions of devices.
This book encompasses the entire spectrum of IoT solutions, from sensors to the cloud. We start by examining modern sensor systems and focus on their power and functionality. After that, we dive deep into communication theory, paying close attention to near-range PAN, including the new Bluetooth 5.0 specification and mesh networks. Then, we explore IP-based communication in LAN and WAN, including 802.11ah, 5G LTE cellular, SigFox, and LoRaWAN. Next, we cover edge routing and gateways and their role in fog computing, as well as the messaging protocols of MQTT and CoAP.
With the data now in internet form, you'll get an ...
Mastering Qlik Sense
Qlik Sense is a powerful, self-service Business Intelligence tool for data discovery, analytics, and visualization. It allows you to create personalized Business Intelligence solutions from raw data and get actionable insights from it.
This book is your one-stop guide to mastering Qlik Sense, catering to all your organizational BI needs. You'll see how you can seamlessly navigate through tons of data from multiple sources and take advantage of the various APIs available in Qlik and its components for guided analytics. You'll also learn how to embed visualizations into your existing BI solutions and extend the capabilities of Qlik Sense to create new visualizations and dashboards that work across all platforms. We also cover other advanced concepts such as porting your QlikView applications to Qlik Sense, and working with Qlik Cloud. Finally, you'll implement enterprise-wide security and access control for resources and data sources through practical examples.
With the kno ...
Mastering Microsoft Power BI
This book is intended for business intelligence professionals responsible for the design and development of Power BI content as well as managers, architects and administrators who oversee Power BI projects and deployments. The chapters flow from the planning of a Power BI project through the development and distribution of content to the administration of Power BI for an organization.
BI developers will learn how to create sustainable and impactful Power BI datasets, reports, and dashboards. This includes connecting to data sources, shaping and enhancing source data, and developing an analytical data model. Additionally, top report and dashboard design practices are described using features such as Bookmarks and the Power KPI visual.
BI managers will learn how Power BI's tools, such as the On-premises data gateway, work together and how content can be staged and securely distributed via Apps. Additionally, both the Power BI Report Server and Power BI Premium are reviewed.