{"id":2963,"date":"2021-03-08T01:59:31","date_gmt":"2021-03-08T09:59:31","guid":{"rendered":"https:\/\/www.gudusoft.com\/?page_id=2963"},"modified":"2024-01-28T19:17:37","modified_gmt":"2024-01-29T03:17:37","slug":"python-data-lineage","status":"publish","type":"page","link":"https:\/\/www.gudusoft.com\/fr\/python-data-lineage\/","title":{"rendered":"Discovering data lineage has never been easier"},"content":{"rendered":"<div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-1 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"background-color: rgba(255,255,255,0);background-position: center center;background-repeat: no-repeat;border-width: 0px 0px 0px 0px;border-color:#e8eaf0;border-style:solid;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start\" style=\"max-width:1310.4px;margin-left: calc(-4% \/ 2 );margin-right: calc(-4% \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-0 fusion_builder_column_1_1 1_1 fusion-flex-column\"><div class=\"fusion-column-wrapper fusion-flex-justify-content-flex-start fusion-content-layout-column\" style=\"background-position:left top;background-repeat:no-repeat;-webkit-background-size:cover;-moz-background-size:cover;-o-background-size:cover;background-size:cover;padding: 0px 0px 0px 0px;\"><div class=\"fusion-text fusion-text-1\" style=\"line-height:26px;\"><h3>Python Data Lineage (Gudu SQLFlow Lite version for python)<\/h3>\n<p>The Python data lineage package (also known as Gudu SQLFlow Lite version for python) is a tool set that analyzes SQL statements and stored procedures from various databases to extract complex <a href=\"https:\/\/en.wikipedia.org\/wiki\/Data_lineage\">data lineage<\/a> relationships and visualize them.<\/p>\n<p><a href=\"https:\/\/github.com\/sqlparser\/python_data_lineage\" rel=\"nofollow\">Gudu SQLFlow Lite version for python<\/a> allows Python developers to quickly integrate data 
lineage analysis and visualization capabilities into their own Python applications. Data scientists can also use it in their daily work to quickly discover data lineage in the complex SQL scripts that ETL jobs typically use to transform data on large data platforms.<\/p>\n<p>Gudu SQLFlow Lite version for python is free for non-commercial use and can handle any complex SQL statement of up to 10k in length, including stored procedures. It supports the SQL dialects of more than 20 major database vendors, such as Oracle, DB2, Snowflake, Redshift, and Postgres.<\/p>\n<p>Gudu SQLFlow Lite version for python includes <a href=\"https:\/\/www.gudusoft.com\/fr\/sqlflow-java-library-2\/\">a Java library<\/a> for analyzing complex SQL statements and stored procedures to retrieve data lineage relationships, <a href=\"https:\/\/github.com\/sqlparser\/python_data_lineage\/blob\/main\/dlineage.py\">a Python file<\/a> that uses JPype to call the APIs in the Java library, and <a href=\"https:\/\/docs.gudusoft.com\/4.-sqlflow-widget\/get-started\">a JavaScript library<\/a> for visualizing data lineage relationships.<\/p>\n<p>Gudu SQLFlow Lite version for python can also automatically extract table and column constraints, as well as relationships between tables and fields, from <a href=\"https:\/\/docs.gudusoft.com\/6.-sqlflow-ingester\/introduction\">DDL scripts exported from the database<\/a> and generate an ER diagram.<\/p>\n<\/div><style type=\"text\/css\">@media only screen and (max-width:1024px) {.fusion-title.fusion-title-1{margin-top:10px!important;margin-bottom:20px!important;}}<\/style><div class=\"fusion-title title fusion-title-1 fusion-sep-none fusion-title-text fusion-title-size-one\" style=\"margin-top:10px;margin-bottom:20px;\"><h1 class=\"title-heading-left\" style=\"margin:0;\"><h3>Automatically visualize data lineage<\/h3><\/h1><\/div><div class=\"fusion-text fusion-text-2\"><div>By executing this 
command:<\/div>\n<div><\/div>\n<\/div><style type=\"text\/css\" scopped=\"scopped\">.fusion-syntax-highlighter-1 > .CodeMirror, .fusion-syntax-highlighter-1 > .CodeMirror .CodeMirror-gutters {background-color:#ffffff;}.fusion-syntax-highlighter-1 > .CodeMirror .CodeMirror-gutters { background-color: #ffffff; }.fusion-syntax-highlighter-1 > .CodeMirror .CodeMirror-linenumber { color: #e8eaf0; }<\/style><div class=\"fusion-syntax-highlighter-container fusion-syntax-highlighter-1 fusion-syntax-highlighter-theme-light\" style=\"opacity:0;margin-top:10px;margin-right:0px;margin-bottom:30px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:#e8eaf0;\"><div class=\"syntax-highlighter-copy-code\"><span class=\"syntax-highlighter-copy-code-title\" data-id=\"fusion_syntax_highlighter_1\" style=\"font-size:14px;\">Copy to Clipboard<\/span><\/div><textarea class=\"fusion-syntax-highlighter-textarea\" id=\"fusion_syntax_highlighter_1\" data-readonly=\"nocursor\" data-linenumbers=\"1\" data-linewrapping=\"\" data-theme=\"default\" data-mode=\"text\/x-sh\">python dlineage.py \/t oracle \/f test.sql \/graph<\/textarea><\/div><div class=\"fusion-text fusion-text-3\"><p>We can automatically obtain the data lineage relationships contained in the following Oracle SQL statement.<\/p>\n<\/div><style type=\"text\/css\" scopped=\"scopped\">.fusion-syntax-highlighter-2 > .CodeMirror, .fusion-syntax-highlighter-2 > .CodeMirror .CodeMirror-gutters {background-color:#ffffff;}.fusion-syntax-highlighter-2 > .CodeMirror .CodeMirror-gutters { background-color: #ffffff; }.fusion-syntax-highlighter-2 > .CodeMirror .CodeMirror-linenumber { color: #e8eaf0; }<\/style><div class=\"fusion-syntax-highlighter-container fusion-syntax-highlighter-2 fusion-syntax-highlighter-theme-light\" style=\"opacity:0;margin-top:10px;margin-right:0px;margin-bottom:10px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:#e8eaf0;\"><div 
class=\"syntax-highlighter-copy-code\"><span class=\"syntax-highlighter-copy-code-title\" data-id=\"fusion_syntax_highlighter_2\" style=\"font-size:14px;\">Copy to Clipboard<\/span><\/div><textarea class=\"fusion-syntax-highlighter-textarea\" id=\"fusion_syntax_highlighter_2\" data-readonly=\"nocursor\" data-linenumbers=\"1\" data-linewrapping=\"\" data-theme=\"default\" data-mode=\"text\/sql\">CREATE VIEW vsal \nAS \n  SELECT a.deptno                  \"Department\", \n         a.num_emp \/ b.total_count \"Employees\", \n         a.sal_sum \/ b.total_sal   \"Salary\" \n  FROM   (SELECT deptno, \n                 Count(*) num_emp, \n                 SUM(sal) sal_sum \n          FROM   scott.emp \n          WHERE  city = 'NYC' \n          GROUP  BY deptno) a, \n         (SELECT Count(*) total_count, \n                 SUM(sal) total_sal \n          FROM   scott.emp \n          WHERE  city = 'NYC') b \n;\n\nINSERT ALL\n\tWHEN ottl < 100000 THEN\n\t\tINTO small_orders\n\t\t\tVALUES(oid, ottl, sid, cid)\n\tWHEN ottl > 100000 AND ottl < 200000 THEN\n\t\tINTO medium_orders\n\t\t\tVALUES(oid, ottl, sid, cid)\n\tWHEN ottl > 200000 THEN\n\t\tINTO large_orders\n\t\t\tVALUES(oid, ottl, sid, cid)\n\tWHEN ottl > 290000 THEN\n\t\tINTO special_orders\nSELECT o.order_id oid, o.customer_id cid, o.order_total ottl,\no.sales_rep_id sid, c.credit_limit cl, c.cust_email cem\nFROM orders o, customers c\nWHERE o.customer_id = c.customer_id;<\/textarea><\/div><div class=\"fusion-text fusion-text-4\"><p>The lineage is then visualized as:<\/p>\n<\/div><div ><span class=\"fusion-imageframe imageframe-none imageframe-1 hover-type-none\"><a class=\"fusion-no-lightbox\" href=\"https:\/\/github.com\/sqlparser\/python_data_lineage\" target=\"_self\" aria-label=\"oracle_data_lineage\"><img decoding=\"async\" width=\"1024\" height=\"719\" alt=\"python data lineage\" src=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2024\/01\/oracle_data_lineage-1024x719.png\" class=\"img-responsive wp-image-6441\" 
srcset=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2024\/01\/oracle_data_lineage-200x140.png 200w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2024\/01\/oracle_data_lineage-400x281.png 400w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2024\/01\/oracle_data_lineage-600x421.png 600w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2024\/01\/oracle_data_lineage-800x562.png 800w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2024\/01\/oracle_data_lineage-1200x842.png 1200w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2024\/01\/oracle_data_lineage.png 1225w\" sizes=\"(max-width: 1024px) 100vw, (max-width: 640px) 100vw, 1024px\" \/><\/a><\/span><\/div><style type=\"text\/css\">@media only screen and (max-width:1024px) {.fusion-title.fusion-title-2{margin-top:10px!important;margin-bottom:1px!important;}}<\/style><div class=\"fusion-title title fusion-title-2 fusion-sep-none fusion-title-text fusion-title-size-one\" style=\"margin-top:10px;margin-bottom:1px;\"><h1 class=\"title-heading-left\" style=\"margin:0;\"><h3>Python data lineage package features:<\/h3><\/h1><\/div><ul class=\"fusion-checklist fusion-checklist-1\" style=\"font-size:18px;line-height:30.6px;\"><li class=\"fusion-li-item\"><span style=\"height:30.6px;width:30.6px;margin-right:12.6px;\" class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon fa fa-check\" style=\"color:#ffffff;\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\" style=\"margin-left:43.2px;\">\n<p>Generate interactive data lineage visualizations<\/p>\n<\/div><\/li><li class=\"fusion-li-item\"><span style=\"height:30.6px;width:30.6px;margin-right:12.6px;\" class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon fa fa-check\" style=\"color:#ffffff;\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\" style=\"margin-left:43.2px;\">\n<p>Create data lineage in JSON\/CSV\/GRAPHML<\/p>\n<\/div><\/li><li class=\"fusion-li-item\"><span 
style=\"height:30.6px;width:30.6px;margin-right:12.6px;\" class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon fa fa-check\" style=\"color:#ffffff;\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\" style=\"margin-left:43.2px;\">\n<p>Support SQL from more than 20 major database vendors<\/p>\n<\/div><\/li><\/ul><style type=\"text\/css\">@media only screen and (max-width:1024px) {.fusion-title.fusion-title-3{margin-top:10px!important;margin-bottom:30px!important;}}<\/style><div class=\"fusion-title title fusion-title-3 fusion-sep-none fusion-title-center fusion-title-text fusion-title-size-one\" style=\"margin-top:10px;margin-bottom:30px;\"><h1 class=\"title-heading-center\" style=\"margin:0;\"><h2 style=\"text-align: center;\">How the Python data lineage tool works<\/h2><\/h1><\/div><div style=\"text-align:center;\"><span class=\"fusion-imageframe imageframe-none imageframe-2 hover-type-none\"><a class=\"fusion-no-lightbox\" href=\"https:\/\/sqlflow.gudusoft.com\" target=\"_blank\" aria-label=\"python-data-lineage-overview\" rel=\"noopener noreferrer\"><img decoding=\"async\" width=\"1024\" height=\"1140\" alt=\"python data lineage\" src=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2021\/03\/python-data-lineage-overview-e1615281779694.png\" class=\"img-responsive wp-image-2969\"\/><\/a><\/span><\/div><div class=\"fusion-separator fusion-full-width-sep\" style=\"align-self: center;margin-left: auto;margin-right: auto;margin-top:10px;margin-bottom:30px;width:100%;\"><\/div><div class=\"fusion-text fusion-text-5\"><p>All of the above components are packaged into a single repository on GitHub, and you can get it for free by simply cloning it.<\/p>\n<\/div><style type=\"text\/css\" scopped=\"scopped\">.fusion-syntax-highlighter-3 > .CodeMirror, .fusion-syntax-highlighter-3 > .CodeMirror .CodeMirror-gutters {background-color:#ffffff;}.fusion-syntax-highlighter-3 > .CodeMirror .CodeMirror-gutters { background-color: #ffffff; 
}.fusion-syntax-highlighter-3 > .CodeMirror .CodeMirror-linenumber { color: #e8eaf0; }<\/style><div class=\"fusion-syntax-highlighter-container fusion-syntax-highlighter-3 fusion-syntax-highlighter-theme-light\" style=\"opacity:0;margin-top:0px;margin-right:0px;margin-bottom:20px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:#e8eaf0;\"><div class=\"syntax-highlighter-copy-code\"><span class=\"syntax-highlighter-copy-code-title\" data-id=\"fusion_syntax_highlighter_3\" style=\"font-size:14px;\">Copy to Clipboard<\/span><\/div><textarea class=\"fusion-syntax-highlighter-textarea\" id=\"fusion_syntax_highlighter_3\" data-readonly=\"nocursor\" data-linenumbers=\"1\" data-linewrapping=\"\" data-theme=\"default\" data-mode=\"text\/x-sh\">git clone https:\/\/github.com\/sqlparser\/python_data_lineage.git<\/textarea><\/div><div class=\"fusion-text fusion-text-6\"><p>&#8211; No database connection is needed.<br \/>\n&#8211; No internet connection is needed.<\/p>\n<p>You only need a JDK and a Python interpreter to run this Python data lineage package locally.<\/p>\n<\/div><div class=\"fusion-separator fusion-full-width-sep\" style=\"align-self: center;margin-left: auto;margin-right: auto;margin-top:10px;margin-bottom:20px;width:100%;\"><\/div><div style=\"text-align:center;\"><style type=\"text\/css\">.fusion-button.button-1 {border-radius:25px;}<\/style><a class=\"fusion-button button-flat fusion-button-default-size button-default button-1 fusion-button-default-span fusion-button-default-type\" target=\"_self\" href=\"https:\/\/github.com\/sqlparser\/python_data_lineage\"><span class=\"fusion-button-text\">Go to the GitHub repo now<\/span><\/a><\/div><\/div><\/div><style type=\"text\/css\">.fusion-body .fusion-builder-column-0{width:100% !important;margin-top : 0px;margin-bottom : 20px;}.fusion-builder-column-0 > .fusion-column-wrapper {padding-top : 0px !important;padding-right : 0px !important;margin-right : 1.92%;padding-bottom : 0px 
!important;padding-left : 0px !important;margin-left : 1.92%;}@media only screen and (max-width:1024px) {.fusion-body .fusion-builder-column-0{width:100% !important;order : 0;}.fusion-builder-column-0 > .fusion-column-wrapper {margin-right : 1.92%;margin-left : 1.92%;}}@media only screen and (max-width:640px) {.fusion-body .fusion-builder-column-0{width:100% !important;order : 0;}.fusion-builder-column-0 > .fusion-column-wrapper {margin-right : 1.92%;margin-left : 1.92%;}}<\/style><\/div><style type=\"text\/css\">.fusion-body .fusion-flex-container.fusion-builder-row-1{ padding-top : 0px;margin-top : 0px;padding-right : 0px;padding-bottom : 0px;margin-bottom : 0px;padding-left : 0px;}<\/style><\/div>","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/pages\/2963"}],"collection":[{"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/comments?post=2963"}],"version-history":[{"count":44,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/pages\/2963\/revisions"}],"predecessor-version":[{"id":6468,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/pages\/2963\/revisions\/6468"}],"wp:attachment":[{"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/media?parent=2963"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}