

Projects & Articles¶

Projects¶

ACL¶

LDAP/AD¶

Kerberos¶

Apache Ranger¶

Apache Knox¶

Apache Sentry¶

Apache Metron¶

Apache Gobblin¶

A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

Apache Atlas¶

Atlas is a scalable and extensible set of core foundational governance services that enables enterprises to meet their compliance requirements within Hadoop effectively and efficiently, and that allows integration with the whole enterprise data ecosystem.

Apache Atlas provides open metadata management and governance capabilities for organizations to build a catalog of their data assets, classify and govern these assets and provide collaboration capabilities around these data assets for data scientists, analysts and the data governance team.

Features¶

Metadata types & instances
Classification
Lineage
Search/Discovery
Security & Data Masking

Links¶

Site, Docs, Repo

Apache Griffin¶

Apache Griffin is a data quality service platform built on Apache Hadoop and Apache Spark. It provides a framework for defining data quality models, executing data quality measurements, and automating data profiling and validation, as well as unified data quality visualization across multiple data systems. It aims to address data quality challenges in big data and streaming contexts.

Measure Model¶

Accuracy - Does the data reflect real-world objects or a verifiable source?
Profiling - Applies statistical analysis and assessment to data values within a dataset to check consistency, uniqueness, and logic.
Completeness - Is all necessary data present?
Timeliness - Is the data available at the time it is needed?
Anomaly detection - Pre-built algorithm functions that identify items, events, or observations which do not conform to an expected pattern or to other items in a dataset.
Validity - Are all data values within the data domains specified by the business?
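To make three of these dimensions concrete, here is a minimal sketch in plain Python. It is not Griffin's API or its measure DSL. The function names, the sample records, and the choice to express each measure as a simple ratio are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List, Optional, Set

Record = Dict[str, Optional[Any]]


def completeness(records: List[Record], field: str) -> float:
    """Fraction of records in which `field` is present and not None."""
    if not records:
        return 1.0
    present = sum(1 for r in records if r.get(field) is not None)
    return present / len(records)


def accuracy(source: List[Record], target: List[Record],
             key_fields: Iterable[str]) -> float:
    """Fraction of source records that have a match in target on `key_fields`."""
    keys = list(key_fields)
    if not source:
        return 1.0
    target_keys = {tuple(r.get(f) for f in keys) for r in target}
    matched = sum(1 for r in source
                  if tuple(r.get(f) for f in keys) in target_keys)
    return matched / len(source)


def validity(records: List[Record], field: str, allowed: Set[Any]) -> float:
    """Fraction of records whose `field` value lies in the allowed domain."""
    if not records:
        return 1.0
    valid = sum(1 for r in records if r.get(field) in allowed)
    return valid / len(records)


# Hypothetical sample data: `source` is the dataset under test,
# `target` plays the role of the verifiable reference.
source = [
    {"id": 1, "country": "US", "email": "a@example.com"},
    {"id": 2, "country": "DE", "email": None},
    {"id": 3, "country": "XX", "email": "c@example.com"},
    {"id": 4, "country": "FR", "email": "d@example.com"},
]
target = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "DE"},
    {"id": 4, "country": "IT"},
]

print(completeness(source, "email"))                   # 0.75
print(accuracy(source, target, ["id", "country"]))     # 0.5
print(validity(source, "country", {"US", "DE", "FR"}))  # 0.75
```

Each function returns a ratio between 0 and 1, so the results can be tracked over time or compared against a threshold. A real deployment would compute the same kind of ratio over distributed datasets rather than in-memory lists.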

Links¶

Site, Docs, Repo

Articles¶

Reference¶