Data Center Handbook
Data Center Handbook provides the fundamentals, technologies, and best practices in designing, constructing, and managing mission-critical, energy-efficient data centers.
The most comprehensive single-source guide ever published in this field, with 36 chapters and over 350 illustrations written by 50 world-class authors. Offers disaster management techniques and lessons learned from the 2011 earthquake and tsunami in Japan and 2012 Superstorm Sandy. Discusses international standards and requirements, with contributions from experts in the United States, Canada, United Kingdom, France, Sweden, Japan, Korea, and China. ...
Mastering ElasticSearch
ElasticSearch is a fast, distributed, scalable search engine written in Java that leverages Apache Lucene capabilities, providing a new level of control over how you index and search even the largest sets of data.
Mastering ElasticSearch covers the intermediate and advanced functionalities of ElasticSearch and will let you understand not only how ElasticSearch works, but will also guide you through its internals such as caches, the Apache Lucene library, monitoring capabilities, and the Java API. In addition, you'll see the practical usage of ElasticSearch configuration parameters and the monitoring API, along with easy-to-use examples that show how to extend ElasticSearch by writing your own plugins. ...
Sams Teach Yourself Core Data for Mac and iOS in 24 Hours, 2nd Edition
In just 24 sessions of one hour or less, start using Core Data to build powerful data-driven apps for iOS devices and Mac OS X computers! Using this book's straightforward, step-by-step approach, you'll discover how Apple's built-in data persistence framework can help you meet any data-related requirement, from casual to enterprise-class. Beginning with the absolute basics, you'll learn how to create data models, build interfaces, interact with users, and work with data sources and table views. Every lesson builds on what you've already learned, giving you a rock-solid foundation for real-world success! ...
Bioinformatics with R Cookbook
Bioinformatics is an interdisciplinary field that develops and improves upon the methods for storing, retrieving, organizing, and analyzing biological data. R is the primary language used for handling most of the data analysis work done in the domain of bioinformatics.
Bioinformatics with R Cookbook is a hands-on guide that provides recipes, built on packages and tested code, for the computational tasks related to bioinformatics.
With the help of this book, you will learn how to analyze biological data using R, allowing you to infer new knowledge from your data coming from different types of experiments stretching from microarray to NGS and mass spectrometry. ...
HBase Essentials
With an example-oriented approach, this book begins by providing you with a step-by-step learning process to effortlessly set up HBase clusters and design schemas. Gradually, you will be taken through advanced data modeling concepts and the intricacies of the HBase architecture. Moreover, you will also get acquainted with the HBase client API and HBase shell. Essentially, this book aims to provide you with a solid grounding in the NoSQL columnar database space and also helps you take advantage of the real power of HBase using data scans, filters, and the MapReduce framework. Most importantly, the book also provides you with practical use cases covering various HBase clients, HBase cluster administration, and performance tuning. ...
scikit-learn Cookbook
Python is quickly becoming the go-to language for analysts and data scientists due to its simplicity and flexibility, and within the Python data space, scikit-learn is the unequivocal choice for machine learning. Its consistent API and plethora of features help solve any machine learning problem it comes across.
The book starts by walking through different methods to prepare your data—be it a dataset with missing values or text columns that require the categories to be turned into indicator variables. After the data is ready, you'll learn different techniques aligned with different objectives—be it a dataset with known outcomes such as sales by state, or more complicated problems such as clustering similar customers. Finally, you'll learn how to polish your algorithm to ensure that it's both accurate and resilient to new datasets. ...
IPython Notebook Essentials
In data science, it is difficult to present interesting visual or technical content, as it involves scientific notations that are not easy to type in a normal document format. IPython provides a web-based UI called Notebook, which creates a working environment for interactive computing that combines code execution with computational documents. IPython Notebook makes the task simpler as it was developed for scientific programming to solve larger problems through a series of smaller programs. IPython Notebook is used to learn Python in a fun and interactive way and to do some serious parallel / technical computing.
The book begins with an introduction to the efficient use of IPython Notebook for interactive computation. It then focuses on the integration of technologies such as matplotlib, pandas, and SciPy. The book aims to empower you to work with IPython Notebook for interactive computing, configure it, and create your own notebooks and research documents. You will learn ...
Mastering Hadoop
Hadoop is synonymous with Big Data processing. Its simple programming model, "code once and deploy at any scale" paradigm, and an ever-growing ecosystem make Hadoop an all-encompassing platform for programmers with different levels of expertise.
This book explores the industry guidelines to optimize MapReduce jobs and higher-level abstractions such as Pig and Hive in Hadoop 2.0. Then, it dives deep into Hadoop 2.0 specific features such as YARN and HDFS Federation.
This book is a step-by-step guide that focuses on advanced Hadoop concepts and aims to take your Hadoop knowledge and skill set to the next level. The data processing flow dictates the order of the concepts in each chapter, and each chapter is illustrated with code fragments or schematic diagrams. ...
Mastering Splunk
Splunk is the definitive technology solution used to manage the ever-growing volumes of machine-generated data. This technology is indispensable for industries involved in big data analysis, online services, education, finance, healthcare, retail, and telecommunications. So, having Splunk experience will be relevant for a long time to come!
This book will first take you through the evolution of Splunk and how it fits into an organization's architectural roadmap. Master advanced search topics and explore in-depth methods to leverage Splunk tables, charts, fields, and other cases. As we advance through the chapters, you will master the best practices of values and lookups, indexes, business effective dashboards, and discover the cornerstones of how to evolve your current Splunk application and its monitoring capabilities. Finally, we round things off with the discussion of transactions from an enterprise perspective. ...
Scala for Machine Learning
The discovery of information through data clustering and classification is becoming a key differentiator for competitive organizations. Machine learning applications are everywhere, from self-driving cars, engineering designs, biometrics, and trading strategies, to detection of genetic anomalies.
The book begins with an introduction to the functional capabilities of the Scala programming language, such as dependency injection and implicits, that are critical to the creation of machine learning algorithms.
Next, you'll learn about data preprocessing and filtering techniques. Following this, you'll move on to clustering and dimension reduction, Naïve Bayes, regression models, sequential data, regularization and kernelization, support vector machines, neural networks, genetic algorithms, and reinforcement learning. A review of the Akka framework and Apache Spark clusters concludes the tutorial. ...