Teaching Potato Computer Science
Google's "Three Carriages"
GFS
Bigtable
MapReduce
Chubby
Hadoop - open-source implementations of Google's GFS, Bigtable, and MapReduce
Online Articles
The Google File System
GFS is a scalable distributed file system for large distributed data-intensive applications.
GFS: Evolution on Fast-forward
Bigtable: A Distributed Storage System for Structured Data
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.
Spanner: Google’s Globally-Distributed Database
Spanner is Google’s scalable, multi-version, globally distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions.
F1: A Distributed SQL Database That Scales
F1 is a distributed relational database system built at Google to support the AdWords business. F1 is a hybrid database that combines high availability, the scalability of NoSQL systems like Bigtable, and the consistency and usability of traditional SQL databases. F1 is built on Spanner, which provides synchronous cross-datacenter replication and strong consistency.
Hive – A Petabyte Scale Data Warehouse Using Hadoop
Books
Hadoop Application Architecture
Hadoop: The Definitive Guide
Field Guide to Hadoop
MapReduce Design Patterns
HBase: The Definitive Guide
Programming Hive