Python Data Lineage (Gudu SQLFlow Lite version for python)
The Python data lineage package (also known as Gudu SQLFlow Lite version for python) is a toolset that analyzes SQL statements and stored procedures from various databases to extract complex data lineage relationships and visualize them.
Gudu SQLFlow Lite version for python lets Python developers quickly integrate data lineage analysis and visualization into their own Python applications. Data scientists can also use it in daily work to quickly discover data lineage in the complex SQL scripts that ETL jobs typically use to transform data on large data platforms.
Gudu SQLFlow Lite version for python is free for non-commercial use and can handle complex SQL statements up to 10k in length, including stored procedures. It supports the SQL dialects of more than 20 major database vendors, such as Oracle, DB2, Snowflake, Redshift, and Postgres.
Gudu SQLFlow Lite version for python includes a Java library for analyzing complex SQL statements and stored procedures to retrieve data lineage relationships, a Python file that utilizes jpype to call the APIs in the Java library, and a JavaScript library for visualizing data lineage relationships.
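The Python-to-Java bridge described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the `jar` directory, the class name `com.example.LineageAnalyzer`, and its static `analyze` method are hypothetical placeholders, so substitute the names found in the repository. Only the jpype calls themselves (`isJVMStarted`, `startJVM`, `JClass`) are real.

```python
import glob
import os


def build_classpath(jar_dir):
    """Collect every .jar file directly under jar_dir, sorted by path."""
    return sorted(glob.glob(os.path.join(jar_dir, "*.jar")))


def analyze_sql(sql, db_vendor, jar_dir="jar",
                analyzer_class="com.example.LineageAnalyzer"):
    """Call a Java analyzer through jpype and return its result as a string.

    ASSUMPTION: analyzer_class and its static analyze(sql, vendor) method
    are hypothetical; they stand in for whatever the bundled Java library
    actually exposes.
    """
    import jpype  # third-party package, installed as JPype1

    # A JVM can only be started once per Python process.
    if not jpype.isJVMStarted():
        jpype.startJVM(classpath=build_classpath(jar_dir))

    analyzer = jpype.JClass(analyzer_class)  # load the Java class by name
    return str(analyzer.analyze(sql, db_vendor))
```

Importing jpype inside the function keeps the classpath helper usable even on machines where JPype1 is not installed yet.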
Gudu SQLFlow Lite version for python can also automatically extract table and column constraints, as well as relationships between tables and fields, from DDL scripts exported from the database and generate an ER Diagram.
Automatically visualize data lineage
We can automatically obtain the data lineage relationships contained in the following Oracle SQL statement.
And visualize it as:
Python data lineage package features:
Generate interactive data lineage visualizations
Create data lineage in JSON/CSV/GRAPHML
Support SQL from more than 20 major database vendors
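To give a feel for the JSON and CSV export formats mentioned in the feature list, here is a small sketch that serializes lineage edges using only the Python standard library. The edge data and the `source`/`target` field names are assumptions made for illustration; the schema the package actually emits may differ.

```python
import csv
import io
import json

# Hypothetical lineage edges: (source column, target column).
edges = [
    ("scott.emp.sal", "v_emp_pay.total"),
    ("scott.emp.comm", "v_emp_pay.total"),
]


def edges_to_json(edges):
    """Serialize the edges as a JSON array of source/target objects."""
    return json.dumps([{"source": s, "target": t} for s, t in edges], indent=2)


def edges_to_csv(edges):
    """Serialize the edges as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["source", "target"])
    writer.writerows(edges)
    return buf.getvalue()


if __name__ == "__main__":
    print(edges_to_json(edges))
    print(edges_to_csv(edges))
```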
How the Python data lineage tool works
All of the above components are now packaged into a single repository on GitHub, and you can get it for free by simply cloning it.
– No database connection is needed.
– No internet connection is needed.
You only need a JDK and a Python interpreter to run this Python data lineage package locally.
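A quick way to confirm both prerequisites before running the package is sketched below. It assumes the `java` launcher is on your `PATH`, which is one common setup but not the only one (a JDK located through `JAVA_HOME` would not be detected here). The minimum Python version is likewise an assumed default, not a documented requirement.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 0)):
    """Report whether a java launcher and a recent enough Python are present.

    ASSUMPTION: min_python is an illustrative default, not a requirement
    stated by the package.
    """
    java_path = shutil.which("java")  # None when java is not on PATH
    return {
        "java_found": java_path is not None,
        "java_path": java_path,
        "python_ok": sys.version_info >= min_python,
    }


if __name__ == "__main__":
    for key, value in check_prerequisites().items():
        print(f"{key}: {value}")
```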