{"id":3554,"date":"2022-04-26T06:51:53","date_gmt":"2022-04-26T14:51:53","guid":{"rendered":"https:\/\/www.gudusoft.com\/?p=3554"},"modified":"2022-05-19T01:36:03","modified_gmt":"2022-05-19T09:36:03","slug":"amazon-redshift-data-lineage-gudu-sqlflow","status":"publish","type":"post","link":"https:\/\/www.gudusoft.com\/fr\/amazon-redshift-data-lineage-gudu-sqlflow\/","title":{"rendered":"Lign\u00e9e de donn\u00e9es Amazon Redshift | Gudu SQLFlow"},"content":{"rendered":"<div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-1 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"background-color: rgba(255,255,255,0);background-position: center center;background-repeat: no-repeat;border-width: 0px 0px 0px 0px;border-color:#e8eaf0;border-style:solid;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start\" style=\"max-width:1310.4px;margin-left: calc(-4% \/ 2 );margin-right: calc(-4% \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-0 fusion_builder_column_1_1 1_1 fusion-flex-column\"><div class=\"fusion-column-wrapper fusion-flex-justify-content-flex-start fusion-content-layout-column\" style=\"background-position:left top;background-repeat:no-repeat;-webkit-background-size:cover;-moz-background-size:cover;-o-background-size:cover;background-size:cover;padding: 0px 0px 0px 0px;\"><div class=\"fusion-text fusion-text-1\" style=\"line-height:26px;\"><h2><strong>Lign\u00e9e de donn\u00e9es Amazon Redshift | Gudu SQLFlow<\/strong><\/h2>\n<p>In your Amazon warehouse environment, use <strong><a class=\"rank-math-link\" href=\"https:\/\/docs.aws.amazon.com\/redshift\/latest\/dg\/c-getting-started-using-spectrum.html\">Amazon Redshift Spectrum<\/a><\/strong> to query data directly from files on Amazon S3, save data in Redshift databases, and use Business Intelligence tools such as Tableau, PowerBI, Looker, Qlik, Superset to generate reports from the data. The data originates from your business source system and lands on Amazon S3, then using an ETL tool such as DBT to transfer and store it in Redshift Database for later uses.<\/p>\n<div id=\"attachment_3581\" style=\"width: 587px\" class=\"wp-caption aligncenter\"><img aria-describedby=\"caption-attachment-3581\" decoding=\"async\" class=\"size-full wp-image-3581\" src=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Amazon_Redshift_Data_Lineage.png\" alt=\"Lign\u00e9e de donn\u00e9es Amazon Redshift\" width=\"577\" height=\"402\" srcset=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Amazon_Redshift_Data_Lineage-200x139.png 200w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Amazon_Redshift_Data_Lineage-300x209.png 300w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Amazon_Redshift_Data_Lineage-400x279.png 400w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Amazon_Redshift_Data_Lineage.png 577w\" sizes=\"(max-width: 577px) 100vw, 577px\" \/><p id=\"caption-attachment-3581\" class=\"wp-caption-text\">Lign\u00e9e de donn\u00e9es Amazon Redshift<\/p><\/div>\n<p>In order to have an overview of the data flow in your Amazon warehouse system, you need a <strong>outil de lign\u00e9e de donn\u00e9es<\/strong> to help you understand how the data arrived at a particular location, as well as the intermediate steps and transformations which happen as the data moves through the business system.<\/p>\n<p>One way to get <strong><a class=\"rank-math-link\" href=\"https:\/\/www.gudusoft.com\/fr\/quest-ce-que-la-lignee-des-donnees-pourquoi-est-elle-importante\/\">the data linage<\/a><\/strong> automatically from the Amazon warehouse environment is to analyze all SQL queries used during the data loading, transforming, and analysis. The good news is that all those SQL statements are stored in <strong><a class=\"rank-math-link\" href=\"https:\/\/docs.aws.amazon.com\/redshift\/latest\/mgmt\/db-auditing.html#db-auditing-logs\">the Redshift user activity log<\/a><\/strong> et <strong>Gudu<\/strong><strong> SQLFlow<\/strong> can parse those log files to discover the data lineage automatically.<\/p>\n<h3 class=\"wp-block-heading\">Redshift User activity log<\/h3>\n<p>The user activity log is useful primarily for troubleshooting purposes and here we use it for the data lineage discovery. It tracks information about the types of queries that both the users and the system perform in the database.<\/p>\n<p><strong>Logs each query before it is run on the database.<\/strong><\/p>\n<figure id=\"w251aac30c24c10b9c18b4\" class=\"wp-block-table\">\n<table>\n<thead>\n<tr>\n<th>Column name<\/th>\n<th>Description<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>recordtime<\/td>\n<td>Time the event occurred.<\/td>\n<\/tr>\n<tr>\n<td>db<\/td>\n<td>Database name.<\/td>\n<\/tr>\n<tr>\n<td>user<\/td>\n<td>User name.<\/td>\n<\/tr>\n<tr>\n<td>pid<\/td>\n<td>Process ID associated with the statement.<\/td>\n<\/tr>\n<tr>\n<td>userid<\/td>\n<td>User ID.<\/td>\n<\/tr>\n<tr>\n<td>xid<\/td>\n<td>Transaction ID.<\/td>\n<\/tr>\n<tr>\n<td>query<\/td>\n<td>A prefix of LOG: followed by the text of the query, including newlines.<\/td>\n<\/tr>\n<\/tbody>\n<\/table><figcaption>Redshift user activity log<\/figcaption><\/figure>\n<p>Please check <a class=\"rank-math-link\" href=\"https:\/\/docs.aws.amazon.com\/redshift\/latest\/mgmt\/db-auditing.html#db-auditing-logs\">this article<\/a> to see how to enable logging.<\/p>\n<\/p>\n<h5 class=\"wp-block-heading\">Amazon Redshift User Activity Audit Log Sample<\/h5>\n<\/p>\n<pre class=\"wp-block-code\"><code>'2018-05-21T06:00:09Z UTC [ db=prod_sales user=duc pid=99753 userid=95 xid=6728324 ]' LOG: create table SumoProdbackUp.organization as (select * from SumoProd.simpleuser) \r\n'2018-05-21T06:00:09Z UTC [ db=vendor user=ankit pid=36616 userid=53 xid=2956702 ]' LOG: DELETE FROM SumoProd.employee WHERE id = 38; \r\n'2018-05-21T06:20:09Z UTC [ db=dev user=himanshu pid=64458 userid=35 xid=5143208 ]' LOG: drop user testuser3<\/code><\/pre>\n<\/p>\n<p>&nbsp;<\/p>\n<\/p>\n<h3 class=\"wp-block-heading\">Automatic data lineage analysis<\/h3>\n<\/p>\n<p><strong><a class=\"rank-math-link\" href=\"https:\/\/sqlflow.gudusoft.com\">Gudu SQLFlow<\/a><\/strong> is a tool that automated SQL data lineage analysis across Databases, ETL, Business Intelligence, Cloud, and Hadoop environments by parsing SQL Script and stored procedure. It can also analyze the Redshift user activity log files to discover the data lineage and depict all the data movement graphically.<\/p>\n<\/p>\n<p>Here is a part of a data lineage picture that was generated after analyzing the Amazon Redshift user activity log files:<\/p>\n<\/p>\n<figure class=\"wp-block-image size-large\"><\/figure>\n<\/p>\n<figure class=\"wp-block-image alignnone wp-image-3564\"><a href=\"https:\/\/sqlflow.gudusoft.com\"><img decoding=\"async\" width=\"1024\" height=\"300\" class=\"wp-image-3564\" src=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-1024x300.png\" alt=\"Lign\u00e9e de donn\u00e9es Amazon Redshift\u00a0\" srcset=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-200x59.png 200w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-300x88.png 300w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-400x117.png 400w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-600x176.png 600w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-768x225.png 768w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-800x234.png 800w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-1024x300.png 1024w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-1200x352.png 1200w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-1536x450.png 1536w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage.png 1778w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption>Lign\u00e9e de donn\u00e9es Amazon Redshift<\/figcaption><\/figure>\n<\/p>\n<p>&nbsp;<\/p>\n<\/p>\n<h3><strong>Conclusion<\/strong>\u00a0<\/h3>\n<p>Merci d&#039;avoir lu notre article et nous esp\u00e9rons qu&#039;il pourra vous aider \u00e0 mieux comprendre <strong>Lign\u00e9e de donn\u00e9es Amazon Redshift<\/strong>. Si vous souhaitez en savoir plus sur <strong>Lign\u00e9e de donn\u00e9es Amazon Redshift<\/strong>, we would like to advise you to visit our website <a href=\"https:\/\/sqlflow.gudusoft.com\"><strong>Gudu SQLFlow<\/strong><\/a> for more information. Gudu SQLFlow, as a data lineage analysis tool, can not only analyze SQL script files, obtain data lineage, and perform visual display, but also allow users to provide data lineage in CSV format and perform visual display. Thanks again! <strong>( Edited by Ryan on Apr 26, 2022)<\/strong><\/p>\n<\/div><\/div><\/div><style type=\"text\/css\">.fusion-body .fusion-builder-column-0{width:100% !important;margin-top : 0px;margin-bottom : 0px;}.fusion-builder-column-0 > .fusion-column-wrapper {padding-top : 0px !important;padding-right : 0px !important;margin-right : 1.92%;padding-bottom : 0px !important;padding-left : 0px !important;margin-left : 1.92%;}@media only screen and (max-width:1024px) {.fusion-body .fusion-builder-column-0{width:100% !important;}.fusion-builder-column-0 > .fusion-column-wrapper {margin-right : 1.92%;margin-left : 1.92%;}}@media only screen and (max-width:640px) {.fusion-body .fusion-builder-column-0{width:100% !important;}.fusion-builder-column-0 > .fusion-column-wrapper {margin-right : 1.92%;margin-left : 1.92%;}}<\/style><\/div><style type=\"text\/css\">.fusion-body .fusion-flex-container.fusion-builder-row-1{ padding-top : 0px;margin-top : 0px;padding-right : 0px;padding-bottom : 0px;margin-bottom : 0px;padding-left : 0px;}<\/style><\/div>","protected":false},"excerpt":{"rendered":"","protected":false},"author":27,"featured_media":3583,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[14],"tags":[89,90,91,75,92],"_links":{"self":[{"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/posts\/3554"}],"collection":[{"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/users\/27"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/comments?post=3554"}],"version-history":[{"count":31,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/posts\/3554\/revisions"}],"predecessor-version":[{"id":3896,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/posts\/3554\/revisions\/3896"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/media\/3583"}],"wp:attachment":[{"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/media?parent=3554"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/categories?post=3554"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gudusoft.com\/fr\/wp-json\/wp\/v2\/tags?post=3554"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}