1. SQLFlow Widget: embed live data lineage content into your web app

    SQLFlow Widget

    SQLFlow Widget is a JavaScript library that instantly embeds the live data lineage content generated by SQLFlow into your web app.

    How to use SQLFlow Widget

    1. Create a simple HTML page.
    2. Include jquery.min.js.
    3. Include sqlflow.library.2.3.8.js.
    4. Include index.js.

    Allow end-users to enter SQL queries

    With SQLFlow Widget, you can create your own portal that accepts SQL queries directly from the user and shows the data lineage content returned by SQLFlow in specific sections of your web app. Please check this page to see how it works.

    Retrieve data lineage for a specific table/column

    SQLFlow enables you to periodically ingest SQL queries and metadata from different sources and save the generated data lineage results for later use. In your web app, you just specify a particular table or column, and SQLFlow Widget will fetch the data lineage of that table or column from SQLFlow and embed the live data lineage content into your web app. The live demo for this feature will be available soon!

    SQLFlow on-premise version server

    We have set up an SQLFlow on-premise version server (https://www.sqlflow.net/api/gspLive_backend), so you may set up the SQLFlow Widget on your site and connect to this SQLFlow server to see how it works. Please DON'T SUBMIT your private SQL to this site; use only demo SQL queries for evaluation purposes.
    Initialize the widget against this server like this:

    ```javascript
    const sqlflow = await SQLFlow.init({
      container: document.getElementById('demo-1'),
      width: 1000,
      height: 800,
      apiPrefix: 'https://www.sqlflow.net/api/gspLive_backend',
    });
    ```

    Source of the sample HTML file and JS file:

    https://www.sqlflow.net/sqlflow-widget/index.html

    ```html
    <!DOCTYPE html>
    <html lang="zh-CN">
      <head>
        <meta charset="UTF-8" />
        <title>SQLFlow Widget Demo</title>
        <script src="https://cdn.jsdelivr.net/npm/jquery@3.6.0/dist/jquery.min.js"></script>
        <script src="https://www.sqlflow.net/sqlflow-library/sqlflow.library.2.3.8.js"></script>
        <script src="index.js"></script>
        <style>
          body { padding: 0 20px 20px; }
          h1 { margin-top: 50px; font-size: 24px; }
          .config { margin: 20px; }
          .config .row { display: flex; align-items: center; margin-bottom: 10px; }
          .config .row input { width: 300px; }
          .config .row textarea { width: 500px; height: 300px; }
          .config .row input[type='checkbox'] { width: 20px; cursor: pointer; }
          .checkbox { cursor: pointer; }
          .checkbox .title { width: 220px; }
          .input .title { width: 80px; }
        </style>
      </head>
      <body>
        <div class="config">
          <div class="row input">
            <div class="title">sqltext</div>
            <textarea id="sqltext">
    CREATE VIEW vsal AS
      SELECT a.deptno "Department",
             a.num_emp / b.total_count "Employees",
             a.sal_sum / b.total_sal "Salary"
      FROM (SELECT deptno, COUNT(*) num_emp, SUM(sal) sal_sum
            FROM scott.emp
            WHERE city = 'NYC'
            GROUP BY deptno) a,
           (SELECT COUNT(*) total_count, SUM(sal) total_sal
            FROM scott.emp
            WHERE city = 'NYC') b;
            </textarea>
          </div>
          <div class="row checkbox">
            <div class="title">dataflow</div>
            <input id="dataflow" type="checkbox" checked />
          </div>
          <div class="row checkbox">
            <div class="title">impact</div>
            <input id="impact" type="checkbox" />
          </div>
          <div class="row checkbox">
            <div class="title">show intermediate recordset</div>
            <input id="recordset" type="checkbox" checked />
          </div>
          <div class="row checkbox">
            <div class="title">show function</div>
            <input id="function" type="checkbox" />
          </div>
          <button id="visualize">visualize</button>
        </div>
        <div class="sqlflow" id="demo-1"></div>
      </body>
    </html>
    ```

    https://www.sqlflow.net/sqlflow-widget/index.js

    ```javascript
    $(async () => {
      const $sqltext = $('#sqltext');
      const $dataflow = $('#dataflow');
      const $impact = $('#impact');
      const $recordset = $('#recordset');
      const $function = $('#function');
      const $visualize = $('#visualize');

      // get an instance of SQLFlow
      const sqlflow = await SQLFlow.init({
        container: document.getElementById('demo-1'),
        width: 1000,
        height: 800,
        apiPrefix: 'https://www.sqlflow.net/api/gspLive_backend',
      });

      // set the dbvendor property
      sqlflow.vendor.set('oracle');

      const visualize = async () => {
        // set the sql text property, then render the diagram
        sqlflow.sqltext.set($sqltext.val());
        sqlflow.visualize();
      };
      visualize();

      $visualize.click(visualize);

      $dataflow.change(() => {
        const checked = $dataflow.prop('checked');
        sqlflow.setting.dataflow.set(checked);
      });
      $impact.change(() => {
        const checked = $impact.prop('checked');
        sqlflow.setting.impact.set(checked);
      });
      $recordset.change(() => {
        const checked = $recordset.prop('checked');
        sqlflow.setting.showIntermediateRecordset.set(checked);
      });
      $function.change(() => {
        const checked = $function.prop('checked');
        sqlflow.setting.showFunction.set(checked);
      });
    });
    ```

    2021/09/13 SQLFlow

  2. Grabit connects to Snowflake using key pair authentication

    Snowflake supports key pair authentication as a more secure alternative to basic authentication (i.e. username and password), and grabit supports it as well. In order to connect to Snowflake using key pair authentication, you need to configure key pair authentication first.

    Configure the grabit

    After that, you may use a grabit config file like this to connect to Snowflake:

    ```json
    {
      "databaseServer": {
        "hostname": "127.0.0.1",
        "port": "433",
        "username": "USERNAME",
        "privateKeyFile": "/data/rsa_key.p8",
        "privateKeyFilePwd": "123456",
        "database": "DATABASE",
        "enableQueryHistory": true,
        "queryHistoryBlockOfTimeInMinutes": 30,
        "queryHistorySqlType": "SELECT,UPDATE",
        "snowflakeDefaultRole": "custom_role"
      },
      "SQLFlowServer": {
        "server": "http://127.0.0.1",
        "serverPort": "8081",
        "userId": "gudu|0123456789",
        "userSecret": ""
      },
      "SQLScriptSource": "database",
      "databaseType": "snowflake"
    }
    ```

    - privateKeyFile: the file that stores the private key.
    - privateKeyFilePwd: the password used when generating the key; leave it empty if no password was used.
    - For the other options, please check this document for a detailed explanation.

    Fetch SQL from query history

    In order to fetch SQL from the query history, we need to create a role that is able to access the SNOWFLAKE database and the warehouse as well. For example, we create a role custom_role and assign the privileges of test_warehouse to it like this:

    ```sql
    use role accountadmin;
    grant imported privileges on database snowflake to role custom_role;
    use role custom_role;

    use role sysadmin;
    GRANT ALL PRIVILEGES ON WAREHOUSE test_warehouse TO role custom_role;
    ```

    Official Document about QUERY_HISTORY
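    Snowflake's documentation generates the key pair with OpenSSL; as a sketch, the same artifacts (an encrypted PKCS#8 private key plus a PEM public key) can also be produced with Node's built-in crypto module. The passphrase below is chosen to match the privateKeyFilePwd in the grabit config above and is purely illustrative:

    ```javascript
    // Sketch: generate an RSA key pair for Snowflake key pair authentication.
    // The private key is written in encrypted PKCS#8 PEM form, the format
    // expected in the privateKeyFile (e.g. /data/rsa_key.p8) referenced above.
    const crypto = require('crypto');
    const fs = require('fs');

    const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
      modulusLength: 2048,
      publicKeyEncoding: { type: 'spki', format: 'pem' },
      privateKeyEncoding: {
        type: 'pkcs8',
        format: 'pem',
        cipher: 'aes-256-cbc',
        passphrase: '123456', // matches privateKeyFilePwd in the config
      },
    });

    fs.writeFileSync('rsa_key.p8', privateKey);  // point privateKeyFile here
    fs.writeFileSync('rsa_key.pub', publicKey);
    ```

    The public key (the base64 body of rsa_key.pub) must then be registered with the Snowflake user, e.g. via ALTER USER ... SET RSA_PUBLIC_KEY='MIIB...'; see Snowflake's key pair authentication documentation for the exact steps.
    
    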

    2021/09/07 Grabit SQLFlow

  3. Discover data lineage of SQL in CSV file

    SQLFlow can discover the data lineage of SQL queries stored in a CSV file.

    CSV file format

    In order to discover data lineage from SQL queries, at least one column that includes the SQL query must be present in the CSV file. Other columns such as database, schema, object type, and object name help SQLFlow make a more precise data lineage analysis, but they are not mandatory.

    Here is the common structure of a CSV file that may be submitted to SQLFlow for data lineage analysis:

    Database,Schema,ObjectType,ObjectName,ObjectCode,Notes

    Your CSV file may have a different structure; as long as the column that includes the SQL query exists, and you tell SQLFlow the position of that column, it can be analyzed correctly.

    Column information

    - database: the database that this SQL query will run against.
    - schema: the schema that includes the database objects in the SQL query.
    - objectType: the type of the SQL query, such as procedure, function, trigger, or view.
    - objectName: the name of the SQL query.
    - objectCode: the SQL query code.
    - notes: comments for this SQL query.

    Position of the columns in the CSV file

    In order to help SQLFlow understand your CSV file correctly, you need to specify the column position of each predefined column through the UI of SQLFlow. Please note that the column names need not be exactly the same as the predefined ones; SQLFlow only cares about the positions.
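    As an illustration, a one-row CSV in the default structure could look like this (the database, view, and query contents are made up):

    ```csv
    Database,Schema,ObjectType,ObjectName,ObjectCode,Notes
    sales_db,public,view,v_totals,"CREATE VIEW v_totals AS
    SELECT region, SUM(amount) total FROM orders GROUP BY region",nightly load
    ```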
    The default required columns are in this format:

    Database,Schema,ObjectType,ObjectName,ObjectCode,Notes

    If your CSV file has the same format, you may tell SQLFlow that:

    - Database is in column 1
    - Schema is in column 2
    - ObjectType is in column 3
    - ObjectName is in column 4
    - ObjectCode is in column 5
    - Notes is in column 6

    If you have a CSV file in a format like this:

    ```csv
    catalog,schema,type,code
    database1,schema1,procedure,"create procedure test_db.test_schema.PRC_STG(IN register_batch_id INT64)
    BEGIN
      # STEP 1
      # Declare variables
      --DECLARE date_to_query DATE;
      --DECLARE hour_to_query INT64;
      DECLARE v_insert_time DATETIME;
      ...."
    ```

    then the catalog (AKA database) is in the first column, the schema is in the second column, the object type is in the third column, and the query code is in the fourth column. In this CSV file, other columns from the default CSV format, such as ObjectName and Notes, don't exist.

    objectCode column

    The ObjectCode column includes the SQL query. Usually, the SQL query spans multiple lines, so it should be enclosed with an enclosing character, which is " by default. If the SQL query is enclosed with another character, that character must be set using objectCodeEncloseChar. Any occurrence of the objectCodeEncloseChar inside the SQL query itself should be escaped by objectCodeEscapeChar. The default objectCodeEscapeChar is ".

    Enclosing character of other columns

    Besides the objectCode column, if any other column is enclosed by the objectCodeEncloseChar as well, SQLFlow will remove this character automatically.

    Set the CSV format

    Upload a CSV file by creating a SQLFlow job. After uploading the CSV file, check the "This is a csv file" checkbox and tell SQLFlow the column positions. 0 means the column doesn't exist in the CSV file.
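    The enclosing and escaping rules above can be sketched in a few lines of JavaScript (the helper name encloseObjectCode is made up for illustration):

    ```javascript
    // Sketch: wrap a SQL query into a CSV objectCode field. By default the
    // enclosing character is `"`, and any `"` inside the SQL is escaped by
    // prefixing the escape character, which is also `"` by default -- i.e.
    // quotes are doubled, as in standard CSV.
    function encloseObjectCode(sql, encloseChar = '"', escapeChar = '"') {
      const escaped = sql.split(encloseChar).join(escapeChar + encloseChar);
      return encloseChar + escaped + encloseChar;
    }

    console.log(encloseObjectCode('SELECT a.deptno "Department" FROM scott.emp a'));
    // -> "SELECT a.deptno ""Department"" FROM scott.emp a"
    ```

    With the default settings, a multi-line query simply keeps its newlines inside the enclosed field, as in the example CSV above.
    
    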

    2021/09/05 SQLFlow