IT eBooks
Download, Read, Use
R Web Scraping Quick Start Guide

Web scraping is a technique to extract data from websites. It simulates the behavior of a website user to turn the website itself into a web service to retrieve or introduce new data. This book gives you all you need to get started with scraping web pages using R programming. You will learn about the rules of RegEx and XPath, key components for scraping website data. We will show you web scraping techniques, methodologies, and frameworks. With this book's guidance, you will become comfortable with the tools to write and test RegEx and XPath rules. We will focus on examples of dynamic websites for scraping data and how to implement the techniques learned. You will learn how to collect URLs and then create XPath rules for your first web scraping script using the rvest library. From the data you collect, you will be able to calculate statistics and create R plots to visualize them. Finally, you will discover how to use Selenium drivers with R for more sophisticated scraping. You ...
Apache Hadoop 3 Quick Start Guide

Apache Hadoop is a widely used distributed data platform. It enables large datasets to be processed efficiently across a cluster of machines instead of relying on one large computer to store and process the data. This book will get you started with the Hadoop ecosystem and introduce you to the main technical topics, including MapReduce, YARN, and HDFS. The book begins with an overview of big data and Apache Hadoop. Then, you will set up a pseudo Hadoop development environment and a multi-node enterprise Hadoop cluster. You will see how a parallel programming paradigm such as MapReduce can solve many complex data processing problems. The book also covers the important aspects of the big data software development lifecycle, including quality assurance and control, performance, administration, and monitoring. You will then learn about the Hadoop ecosystem and tools such as Kafka, Sqoop, Flume, Pig, Hive, and HBase. Finally, you will look at advanced topics, including real-time streaming using Apache Storm, ...
Exploring Data Science

There's never been a better time to get into data science. But where do you start? Data Science is a broad field, incorporating aspects of statistics, machine learning, and data engineering. It's easy to become overwhelmed, or end up learning about a small section of data science or a single methodology. Exploring Data Science is a collection of five hand-picked chapters introducing you to various areas in data science and explaining which methodologies work best for each. John Mount and Nina Zumel, authors of Practical Data Science with R, selected these chapters to give you the big picture of the many data domains. You'll learn about time series, neural networks, text analytics, and more. As you explore different modeling practices, you'll see practical examples of how R, Python, and other languages are used in data science. Along the way, you'll experience a sample of Manning books you may want to add to your library. ...
Exploring Data with Python

Python has become a required skill for data science, and it's easy to see why. It's powerful, easy to learn, and includes libraries like pandas, NumPy, and scikit-learn that help you slice, scrub, munge, and wrangle your data. Even with a great language and fantastic tools, though, there's plenty to learn! Exploring Data with Python is a collection of chapters from three Manning books, hand-picked by Naomi Ceder, the chair of the Python Software Foundation. This free eBook starts building your foundation in data science processes with practical Python tips and techniques for working and aspiring data scientists. In it, you'll get a clear introduction to the data science process. Then, you'll practice using Python for processing, cleaning, and exploring interesting datasets. Finally, you'll get a practical demonstration of modelling and prediction with classification and regression. When you finish, you'll have a good overview of Python in data science and a well-lit path to continue yo ...
Julia 1.0 Programming Cookbook

Julia, with its dynamic nature and high performance, keeps the development time for computational models comparatively short and yields easy-to-maintain code. This book is a solution-based guide that takes you through different aspects of programming with Julia. Starting with the new features of Julia 1.0, each recipe addresses a specific problem, providing a solution and explaining how it works. You will work with the powerful Julia tools and data structures along with the most popular Julia packages. You will learn to create vectors, handle variables, and work with functions. You will be introduced to various recipes for numerical computing, distributed computing, and achieving high performance. You will see how to optimize data science programs with parallel computing and memory allocation. We will look into more advanced concepts such as metaprogramming and functional programming. Finally, you will learn how to tackle issues while working with databases ...
Mastering Matplotlib 2.x

In this book, you'll get hands-on with customizing your data plots with the help of Matplotlib. You'll start with customizing plots, making a handful of special-purpose plots, and building 3D plots. You'll explore non-trivial layouts, Pylab customization, and more about tile configuration. You'll be able to add text, put lines in plots, and also handle polygons, shapes, and annotations. Non-Cartesian and vector plots are exciting to construct, and you'll explore them further in this book. You'll delve into niche plots and visualize ordinal and tabular data. In this book, you'll be exploring 3D plotting, one of the best features when it comes to 3D data visualization, along with Jupyter Notebook, widgets, and creating movies for enhanced data representation. Geospatial plotting will also be explored. Finally, you'll learn how to create interactive plots with the help of Jupyter. Learn expert techniques for effective data visualization using Matplotlib 3 and Python with our latest off ...
Reactive Data Handling

We depend on web applications to be highly available and to provide us with up-to-the-second data. This shift toward real-time data processing is also a key aspect of the Internet of Things, which the Gartner Group predicts by 2020 will include 26 billion actively connected physical devices sending, receiving, and processing streams. That's a lot of data. The reactive application architecture is an answer to the requirements of high availability and resource efficiency. Reactive Data Handling is a collection of five hand-picked chapters introducing you to building reactive applications capable of handling real-time processing with large data loads. Manuel Bernhardt, author of Reactive Web Applications, selected these chapters to show you how reactive application architecture solves real-time data demands. You'll start with the high-level architecture of reactive applications and then look at low-level practical aspects. After you read these chapters, you'll understand the benefits ...
Splunk 7.x Quick Start Guide

Splunk is a leading platform and solution for collecting, searching, and extracting value from ever-increasing amounts of big data, and big data is eating the world! This book covers all the crucial Splunk topics and gives you the information and examples to get the immediate job done. You will find enough insights to support further research and use Splunk to suit any business environment or situation. Splunk 7.x Quick Start Guide gives you a thorough understanding of how Splunk works. You will learn about all the critical tasks for architecting, implementing, administering, and utilizing Splunk Enterprise to collect, store, retrieve, format, analyze, and visualize machine data. You will find step-by-step examples based on real-world experience and practical use cases that are applicable to all Splunk environments. There is a careful balance between adequate coverage of all the critical topics and short but relevant deep-dives into the configuration options and steps to carry out ...
Python for Data Mining Quick Syntax Reference

Learn how to use Python and its structures, how to install Python, and which tools are best suited for data analyst work. This book provides you with a handy reference and tutorial on topics ranging from basic Python concepts through to data mining, manipulating and importing datasets, and data analysis. Python for Data Mining Quick Syntax Reference covers each concept concisely, with many illustrative examples. You'll be introduced to several data mining packages, with examples of how to use each of them. The first part covers core Python including objects, lists, functions, modules, and error handling. The second part covers Python's most important data mining packages: NumPy and SciPy for mathematical functions and random data generation, pandas for dataframe management and data import, Matplotlib for drawing charts, and scikit-learn for machine learning. Install Python and choose a development environment; Understand the basic concepts of object-oriented programming; Imp ...
Hands-On Geospatial Analysis with R and QGIS

Managing spatial data has always been challenging, and it's getting more complex as the size of data increases. Spatial data is actually big data, and you need different tools and techniques to model it and create workflows around it. R and QGIS have powerful features that can make this job easier. This book is your companion for applying machine learning algorithms to GIS and remote sensing data. You'll start by gaining an understanding of the nature of spatial data and installing R and QGIS. Then, you'll learn how to use different R packages to import, export, and visualize data, before doing the same in QGIS. Screenshots are included to ease your understanding. Moving on, you'll learn about different aspects of managing and analyzing spatial data, before diving into advanced topics. You'll create powerful data visualizations using ggplot2, ggmap, raster, and other R packages. You'll learn how to use QGIS 3.2.2 to visualize and manage (create, edit, and fo ...
Reproduction of site books is authorized only for informative purposes and strictly for personal, private use.
IT eBooks Group © 2011-2025