Spark for Python Developers
Looking for a cluster computing system that provides high-level APIs? Apache Spark is your answer: an open-source, fast, general-purpose cluster computing system. Spark's multi-stage in-memory primitives provide performance up to 100 times faster than Hadoop, and it is also well suited to machine learning algorithms.
Are you a Python developer inclined to work with the Spark engine? If so, this book will be your companion as you create data-intensive apps using Spark as a processing engine, Python visualization libraries, and web frameworks such as Flask.
To begin with, you will learn the most effective way to install the Python development environment powered by Spark, Blaze, and Bokeh. You will then find out how to connect with data stores such as MySQL, MongoDB, Cassandra, and Hadoop.
You'll expand your skills throughout, becoming familiar with the various data sources (GitHub, Twitter, Meetup, and blogs), their data structures, and solutions to effectively tackle complex ...
Creating Data Stories with Tableau Public
Tableau Public is a useful tool in anyone's data reporting toolbox: it lets authors add an interactive data element to any article. Investigative journalists and bloggers can use it to tell a “data story” and let others explore their data visualization. Because Tableau Public visualizations are relatively easy to create, data stories can be developed rapidly. Readers can explore data associations in multiple-sourced public data, and state-of-the-art dashboard and chart graphics immerse them in an interactive experience.
This book offers investigative journalists, bloggers, and other data storytellers a rich discussion of visualization creation topics, features, and functions. It helps data storytellers quickly gain confidence in understanding and expanding their visualization-creation knowledge, and quickly create interesting, interactive data visualizations that bring richness and vibrancy to complex articles.
The b ...
Big Data Analytics with Spark
This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML.
Big Data Analytics with Spark shows you how to use Spark and leverage its easy-to-use features to increase your productivity. You learn to perform fast data analysis using its in-memory caching and advanced execution engine, employ in-memory computing capabilities for building high-performance machine learning and low-latency interactive analytics applications, and much more. Moreover, the book shows you how to use Spark as a single integrated platform for a variety of data processing tasks, including ETL pipelines, BI, live data stream processing, graph analytics, and machine learning.
The book also includes a chapter on Scala, the hottest functional programming l ...
Scalable Big Data Architecture
This book highlights the different types of data architecture and illustrates the many possibilities hidden behind the term "Big Data", from the usage of NoSQL databases to the deployment of stream analytics architecture, machine learning, and governance.
Scalable Big Data Architecture covers real-world, concrete industry use cases that leverage complex distributed applications, which involve web applications, RESTful APIs, and high throughput of large amounts of data stored in highly scalable NoSQL data stores such as Couchbase and Elasticsearch. This book demonstrates how data processing can be done at scale, from the usage of NoSQL data stores to the combination of Big Data distributions.
When the data processing is too complex and involves different processing topologies such as long-running jobs, stream processing, correlation of multiple data sources, and machine learning, it's often necessary to delegate the load to Hadoop or Spark and use NoSQL to serve processed dat ...
Social Media Mining with R
The growth of social media over the last decade has revolutionized the way individuals interact and industries conduct business. Individuals produce data at an unprecedented rate by interacting, sharing, and consuming content through social media. However, analyzing this ever-growing pile of data is quite tricky and, if done erroneously, could lead to wrong inferences.
By using this essential guide, you will gain hands-on experience with generating insights from social media data. This book provides detailed instructions on how to obtain, process, and analyze a variety of socially-generated data while providing a theoretical background to help you accurately interpret your findings. You will be shown R code and examples of data that can be used as a springboard as you get the chance to undertake your own analyses of business, social, or political data. ...
Scaling Big Data with Hadoop and Solr
As data grows exponentially day by day, extracting information becomes a tedious activity in itself. Technologies like Hadoop are trying to address some of the concerns, while Solr provides high-speed faceted search. Bringing these two technologies together is helping organizations resolve the problem of information extraction from Big Data by providing excellent distributed faceted search capabilities.
Scaling Big Data with Hadoop and Solr is a step-by-step guide that helps you build high-performance enterprise search engines while scaling data. Starting with the basics of Apache Hadoop and Solr, this book then dives into advanced topics of optimizing search with some interesting real-world use cases and sample Java code. ...
Learning IPython for Interactive Computing and Data Visualization
You already use Python as a scripting language, but did you know it is also increasingly used for scientific computing and data analysis? Interactive programming is essential in such exploratory tasks and IPython is the perfect tool for that. Once you've learnt it, you won't be able to live without it.
Learning IPython for Interactive Computing and Data Visualization is a practical, hands-on, example-driven tutorial that considerably improves your productivity during interactive Python sessions and shows you how to effectively use IPython for interactive computing and data analysis. ...
Learn Computer Science with Swift
Master the basics of solving logic puzzles and creating algorithms using Swift on Apple platforms. This book is based on the curriculum currently being used in common computer classes. You'll learn to automate algorithmic processes that scale using Swift in the context of iOS, macOS, tvOS, and watchOS.
Begin by understanding how to think computationally: to formulate a computational problem and recognize patterns and ways to validate it. Then jump ahead past the abstractions and conceptual work into using code snippets to build frameworks and write code using Xcode and Swift. Once you have frameworks in place, you'll learn to use algorithms and structure data. Finally, you'll see how to bring people into what you've built through a usable UI and how UI and code relate. ...
Text Analytics with Python
Derive useful insights from your data using Python. You will learn both basic and advanced concepts, including text and language syntax, structure, and semantics. You will focus on algorithms and techniques, such as text classification, clustering, topic modeling, and text summarization.
Text Analytics with Python teaches you the techniques related to natural language processing and text analytics, and you will gain the skills to know which technique is best suited to solve a particular problem. You will look at each technique and algorithm with both a bird's-eye view, to understand how it can be used, and a microscopic view, to understand the mathematical concepts and implement them to solve your own problems.
Understand the major concepts and techniques of natural language processing (NLP) and text analytics, including syntax and structure; Build a text classification system to categorize news articles, analyze app or game reviews using topic modeling and text summa ...
Applied Text Analysis with Python
The programming landscape of natural language processing has changed dramatically in the past few years. Machine learning approaches now require mature tools like Python's scikit-learn to apply models to text at scale. This practical guide shows programmers and data scientists who have an intermediate-level understanding of Python and a basic understanding of machine learning and natural language processing how to become more proficient in these two exciting areas of data science.
This book presents a concise, focused, and applied approach to text analysis with Python, and covers topics including text ingestion and wrangling, basic machine learning on text, classification for text analysis, entity resolution, and text visualization. Applied Text Analysis with Python will enable you to design and develop language-aware data products.
You'll learn how and why machine learning algorithms make decisions about language to analyze text; how to ingest, wrangle, and preprocess language d ...