A Common-Sense Guide to Data Structures and Algorithms
Algorithms and data structures are much more than abstract concepts. Mastering them enables you to write code that runs faster and more efficiently, which is particularly important for today's web and mobile apps. This book takes a practical approach to data structures and algorithms, with techniques and real-world scenarios that you can use in your daily production code. Graphics and examples make these computer science concepts understandable and relevant. You can use these techniques with any language; examples in the book are in JavaScript, Python, and Ruby.
Use Big O notation, the primary tool for evaluating algorithms, to measure and articulate the efficiency of your code, and modify your algorithm to make it faster. Find out how your choice of arrays, linked lists, and hash tables can dramatically affect the code you write. Use recursion to solve tricky problems and create algorithms that run exponentially faster than the alternatives. Dig into advanced data structures such a ...
Streaming Systems
Streaming data is a big deal in big data these days, and for good reason. Businesses crave ever more timely data, and streaming is a good way to achieve lower latency. Plus, streaming is a much easier way to tame the massive, unbounded data sets that are increasingly common today.
Expanded from co-author Tyler Akidau's popular series of blog posts "Streaming 101" and "Streaming 102", this practical book shows data engineers, data scientists, and developers how to work with streaming or event-time data in a conceptual and platform-agnostic way. You'll go from "101"-level understanding of stream processing to a nuanced grasp of the what, where, when, and how of processing real-time data streams.
Dive deep into topics including watermarks and windowing, as well as state and timers in the context of stream processing. Although the book uses Apache Beam code snippets to make examples concrete, it presents a general and broad explanation of streaming that's not tied to a specific frame ...
Applied Text Analysis with Python
The programming landscape of natural language processing has changed dramatically in the past few years. Machine learning approaches now require mature tools like Python's scikit-learn to apply models to text at scale. This practical guide shows programmers and data scientists who have an intermediate-level understanding of Python and a basic understanding of machine learning and natural language processing how to become more proficient in these two exciting areas of data science.
This book presents a concise, focused, and applied approach to text analysis with Python, and covers topics including text ingestion and wrangling, basic machine learning on text, classification for text analysis, entity resolution, and text visualization. Applied Text Analysis with Python will enable you to design and develop language-aware data products.
You'll learn how and why machine learning algorithms make decisions about language to analyze text; how to ingest, wrangle, and preprocess language d ...
Learning Apache Drill
Apache Drill enables interactive analysis of massively large datasets, allowing you to execute SQL queries against data in many different data sources - including Hadoop and MongoDB clusters, HBase, or even your local file system - and get results quickly. With this practical guide, analysts and data scientists focused on business or research applications will learn how to incorporate Drill capabilities into complex programs, including how to use Drill queries to replace some MapReduce operations in a large-scale program.
Drill committers Charles Givre and Paul Rogers provide an introduction to Drill and its ability to handle large files containing data in flexible formats with nested data structures and tables. You'll discover how this capability fills a gap in the Hadoop ecosystem.
Additional topics show you how to: Prepare and organize data to maximize Drill performance; Set expectations for Drill performance on different data types and volumes; Reconcil ...
Social Media Analytics Strategy
This book shows you how to use social media analytics to optimize your business performance. The tools discussed will prepare you to create and implement an effective digital marketing strategy. From understanding the data and its sources to detailed metrics, dashboards, and reports, this book is a robust tool for anyone seeking a tangible return on investment from social media and digital marketing.
Social Media Analytics Strategy speaks to marketers who do not have a technical background and creates a bridge into the digital world. Comparable books are either too technical for marketers (aimed at software developers) or too basic to take strategy into account. They also lack an overview of the entire process of using analytics within a company project, skip the everyday details, and do not touch upon common mistakes made by marketers.
This book highlights patterns of common challenges experienced by marketers from entry level to directors and C-leve ...
Spark: The Definitive Guide
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals.
You'll explore the basic operations and common functions of Spark's structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark's scalable machine-learning library.
Get a gentle overview of big data and Spark; Learn about DataFrames, SQL, and Datasets - Spark's core APIs - through worked examples; Dive into Spark's low-level APIs, RDDs, and execution of SQL and DataFrames; Understand how Spark runs on a cluster ...
Think Data Structures
If you're a student studying computer science or a software developer preparing for technical interviews, this practical book will help you learn and review some of the most important ideas in software engineering - data structures and algorithms - in a way that's clearer, more concise, and more engaging than other materials.
By emphasizing practical knowledge and skills over theory, author Allen Downey shows you how to use data structures to implement efficient algorithms, and then analyze and measure their performance. You'll explore the important classes in the Java collections framework (JCF), how they're implemented, and how they're expected to perform. Each chapter presents hands-on exercises supported by test code online.
Use data structures such as lists and maps, and understand how they work; Build an application that reads Wikipedia pages, parses the contents, and navigates the resulting data tree; Analyze code to predict how fast it will run and how much memory it will ...
JSON at Work
JSON is becoming the backbone for meaningful data interchange over the internet. This format is now supported by an entire ecosystem of standards, tools, and technologies for building truly elegant, useful, and efficient applications. With this hands-on guide, author and architect Tom Marrs shows you how to build enterprise-class applications and services by leveraging JSON tooling and message/document design.
JSON at Work provides application architects and developers with guidelines, best practices, and use cases, along with lots of real-world examples and code samples. You'll start with a comprehensive JSON overview, explore the JSON ecosystem, and then dive into JSON's use in the enterprise.
Get acquainted with JSON basics and learn how to model JSON data; Learn how to use JSON with Node.js, Ruby on Rails, and Java; Structure JSON documents with JSON Schema to design and test APIs; Search the contents of JSON documents with JSON Search tools; Convert JSON documents to other d ...
Machine Learning and Security
Can machine learning techniques solve our computer security problems and finally put an end to the cat-and-mouse game between attackers and defenders? Or is this hope merely hype? Now you can dive into the science and answer this question for yourself. With this practical guide, you'll explore ways to apply machine learning to security issues such as intrusion detection, malware classification, and network analysis.
Machine learning and security specialists Clarence Chio and David Freeman provide a framework for discussing the marriage of these two fields, as well as a toolkit of machine-learning algorithms that you can apply to an array of security problems. This book is ideal for security engineers and data scientists alike.
Learn how machine learning has contributed to the success of modern spam filters; Quickly detect anomalies, including breaches, fraud, and impending system failure; Conduct malware analysis by extracting useful information from computer binaries; Uncover at ...
Beginning Windows 8 Data Development
This book introduces novice developers to a range of data access strategies for storing and retrieving data both locally and remotely. It provides you with a range of fully working data access solutions and the insight you need to know when, and how, to apply each of the techniques to best advantage.
Focusing specifically on how the Windows 8 app developer can work with the Windows Runtime (often called WinRT) framework, this book provides careful analysis of the many options open to you, along with a comparison of their strengths and weaknesses under different conditions. With the days of a single database being the right choice for almost all development projects long gone, you will learn that the right choice for your app now depends on a variety of factors, and getting it right will be critical to your customer's end-user experience. ...