

Introduction

The current version is organized around Spark SQL 2.x and draws on the open-source projects behind the mainstream distributed SQL compute engines.


Reference

Spark SQL: Spark SQL is Apache Spark's module for working with structured data.
Hive: The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive.
Presto: Distributed SQL Query Engine for Big Data.