Improve
Book
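- Big Data Daily Notes: Architecture and Algorithms (大数据日知录架构与算法)
- The Art of Spark Kernel Design: Architecture Design and Implementation (Spark内核设计的艺术:架构设计与实现)
- Distributed Caching in Depth: From Principles to Practice (深入分布式缓存:从原理到实践)
- From Paxos to ZooKeeper: Principles and Practice of Distributed Consistency (从Paxos到Zookeeper:分布式一致性原理与实践)
- Math for Programmers, Volumes 1-3 (程序员的数学:1,2,3)
- Algorithms, 4th Edition (算法(第四版))
- Head First Design Patterns (Head First:设计模式)
- Inside MySQL: The InnoDB Storage Engine (MySQL技术内幕:InnoDB存储引擎)
- Understanding the Java Virtual Machine: Advanced JVM Features and Best Practices (深入理解Java虚拟机:JVM高级特性与最佳实践)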
Blog
- Distributed Systems Architecture
- JackYu庾
- liuyaolei
- d0evi1
- 写点什么
- 石山园
- OopsOutOfMemory
- lmalds
- Oracle Big Data Blog
- DiDi Cloud Blog (滴滴云博客)
- NetEase Big Data Technology Zone (网易大数据技术专区)
- 柳伟卫/老卫/Way Lau’s Personal Site
- Things About Ops (运维那点事)
- I Love Natural Language Processing (我爱自然语言处理)
- 那海蓝蓝
- Xu Xueli's Blog (许雪里的博客)
- 徐阿衡
- ROOTLu
- Ws99
- 听见下雨的声音
- Pelhans Blog
- 莫烦PYTHON
- Jack Gao’s Blog
- What a DREAMY Journey
- PyLab
- Madhukar’s Blog
Site
- ML Wiki
- MBA Lib Wiki (MBA智库百科)
- ApacheCN
- DMLC
- dataArtisans
- Tencent Cloud - Cloud+ Community (腾讯云 - 云+社区)
- HBase Technology Community (HBase技术社区)
- Flink China
- PingCAP Blog
- Java Tutorials List
- Spring For All
- Xiaomi Ecosystem Cloud Documentation (小米生态云文档)
- Alibaba Cloud University (阿里云大学)
- Linux Virtual Server
- Tutorials Point
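- NetEase Big Data (网易大数据)
- Tianchi (天池)
- Out of Memory - Juke (内存溢出-聚客)
- Oracle Life
- Open Chinese Knowledge Graph (开放的中文知识图谱)
- Notes on NoSQL (NoSQL漫谈)
- Coder Farm (码农场)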
- PaperWeekly
- SofaSofa - Data Science Community (SofaSofa-数据科学社区)
- nbviewer
- Altinity
- KDnuggets
- MySlide
Project
InfluxData
InfluxData provides a modern time series platform, designed from the ground up to handle metrics and events. InfluxData’s products are based on an open source core consisting of four projects: Telegraf, InfluxDB, Chronograf, and Kapacitor, collectively called the TICK Stack.
Link
Oryx 2
Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, specialized for real-time, large-scale machine learning. It is a framework for building applications, and it also includes packaged, end-to-end applications for collaborative filtering, classification, regression, and clustering.
Link
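As a rough illustration of the lambda architecture mentioned above, here is a minimal sketch in plain Python. It is not Oryx 2's API and uses neither Spark nor Kafka; the class `LambdaCounter` and its methods are hypothetical names chosen for this example. It shows only the core idea: a batch layer recomputes a view from the full event log, a speed layer covers events that arrived since the last batch run, and the serving layer merges the two at query time.

```python
from collections import Counter


class LambdaCounter:
    """Toy lambda architecture for counting events per key (hypothetical example)."""

    def __init__(self):
        self.master_log = []         # append-only record of every event
        self.batch_view = Counter()  # recomputed from the full log on each batch run
        self.speed_view = Counter()  # incremental counts since the last batch run

    def ingest(self, key):
        # New events go to the master log and are reflected in the speed view at once.
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        # Batch layer: rebuild the view from scratch, then reset the speed layer,
        # because the batch view now covers everything the speed view held.
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, key):
        # Serving layer: merge the batch and real-time views.
        return self.batch_view[key] + self.speed_view[key]
```

A query stays correct whether or not the batch job has caught up, since any event is counted in exactly one of the two views.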
NLTK
Natural Language Toolkit
Link
DeepWalk
Deep Learning for Graphs
Link
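For context, DeepWalk learns node embeddings by generating truncated random walks over a graph and feeding them to a skip-gram model as if they were sentences. The sketch below covers only the walk-generation step, in plain Python; the function name `random_walks`, its parameters, and the sample graph are assumptions made for illustration and are not taken from the DeepWalk code base.

```python
import random


def random_walks(adjacency, walks_per_node, walk_length, seed=0):
    """Generate truncated random walks over a graph given as {node: [neighbors]}."""
    rng = random.Random(seed)
    nodes = list(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit the start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical four-node undirected graph.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = random_walks(graph, walks_per_node=2, walk_length=5)
```

Each walk can then be treated as a sentence of node IDs and passed to any skip-gram implementation to obtain the embeddings.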
ClickHouse
ClickHouse is an open source, column-oriented database management system capable of generating analytical data reports in real time using SQL queries.
- Blazing Fast
- Linearly Scalable
- Hardware Efficient
- Fault Tolerant
- Feature Rich
- Highly Reliable
- Simple and Handy
Link
ONOS
ONOS is the only SDN controller platform that supports the transition from legacy “brown field” networks to SDN “green field” networks. This enables exciting new capabilities, and disruptive deployment and operational cost points for network operators.
Link
SnappyData
SnappyData, the Spark Database.
Stream - Transact - Analyze - Predict all in one cluster
Link
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Link
JavaCC
Java Compiler Compiler, a parser generator for Java.
Link
Dr. Elephant
Dr. Elephant is a job- and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark.