{"id":3554,"date":"2022-04-26T06:51:53","date_gmt":"2022-04-26T14:51:53","guid":{"rendered":"https:\/\/www.gudusoft.com\/?p=3554"},"modified":"2022-05-19T01:36:03","modified_gmt":"2022-05-19T09:36:03","slug":"amazon-redshift-data-lineage-gudu-sqlflow","status":"publish","type":"post","link":"https:\/\/www.gudusoft.com\/de\/amazon-redshift-data-lineage-gudu-sqlflow\/","title":{"rendered":"Amazon Redshift Datenherkunft |"},"content":{"rendered":"<div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-1 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"background-color: rgba(255,255,255,0);background-position: center center;background-repeat: no-repeat;border-width: 0px 0px 0px 0px;border-color:#e8eaf0;border-style:solid;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start\" style=\"max-width:1310.4px;margin-left: calc(-4% \/ 2 );margin-right: calc(-4% \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-0 fusion_builder_column_1_1 1_1 fusion-flex-column\"><div class=\"fusion-column-wrapper fusion-flex-justify-content-flex-start fusion-content-layout-column\" style=\"background-position:left top;background-repeat:no-repeat;-webkit-background-size:cover;-moz-background-size:cover;-o-background-size:cover;background-size:cover;padding: 0px 0px 0px 0px;\"><div class=\"fusion-text fusion-text-1\" style=\"line-height:26px;\"><h2><strong>Amazon Redshift Datenherkunft |<\/strong><\/h2>\n<p>In your Amazon warehouse environment, use <strong><a class=\"rank-math-link\" href=\"https:\/\/docs.aws.amazon.com\/redshift\/latest\/dg\/c-getting-started-using-spectrum.html\">Amazon Redshift Spectrum<\/a><\/strong> to query data directly from files on Amazon S3, save data in Redshift databases, and use Business Intelligence tools such as Tableau, PowerBI, Looker, Qlik, Superset to generate reports from the data. The data originates from your business source system and lands on Amazon S3, then using an ETL tool such as DBT to transfer and store it in Redshift Database for later uses.<\/p>\n<div id=\"attachment_3581\" style=\"width: 587px\" class=\"wp-caption aligncenter\"><img aria-describedby=\"caption-attachment-3581\" decoding=\"async\" class=\"size-full wp-image-3581\" src=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Amazon_Redshift_Data_Lineage.png\" alt=\"Amazon Redshift-Datenherkunft\" width=\"577\" height=\"402\" srcset=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Amazon_Redshift_Data_Lineage-200x139.png 200w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Amazon_Redshift_Data_Lineage-300x209.png 300w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Amazon_Redshift_Data_Lineage-400x279.png 400w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Amazon_Redshift_Data_Lineage.png 577w\" sizes=\"(max-width: 577px) 100vw, 577px\" \/><p id=\"caption-attachment-3581\" class=\"wp-caption-text\">Amazon Redshift-Datenherkunft<\/p><\/div>\n<p>In order to have an overview of the data flow in your Amazon warehouse system, you need a <strong>Datenherkunftstool<\/strong> to help you understand how the data arrived at a particular location, as well as the intermediate steps and transformations which happen as the data moves through the business system.<\/p>\n<p>One way to get <strong><a class=\"rank-math-link\" href=\"https:\/\/www.gudusoft.com\/de\/whats-data-lineage-why-important\/\">the data linage<\/a><\/strong> automatically from the Amazon warehouse environment is to analyze all SQL queries used during the data loading, transforming, and analysis. The good news is that all those SQL statements are stored in <strong><a class=\"rank-math-link\" href=\"https:\/\/docs.aws.amazon.com\/redshift\/latest\/mgmt\/db-auditing.html#db-auditing-logs\">the Redshift user activity log<\/a><\/strong> Und <strong>Gudu<\/strong><strong> SQLFlow<\/strong> can parse those log files to discover the data lineage automatically.<\/p>\n<h3 class=\"wp-block-heading\">Redshift User activity log<\/h3>\n<p>The user activity log is useful primarily for troubleshooting purposes and here we use it for the data lineage discovery. It tracks information about the types of queries that both the users and the system perform in the database.<\/p>\n<p><strong>Logs each query before it is run on the database.<\/strong><\/p>\n<figure id=\"w251aac30c24c10b9c18b4\" class=\"wp-block-table\">\n<table>\n<thead>\n<tr>\n<th>Column name<\/th>\n<th>Description<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>recordtime<\/td>\n<td>Time the event occurred.<\/td>\n<\/tr>\n<tr>\n<td>db<\/td>\n<td>Database name.<\/td>\n<\/tr>\n<tr>\n<td>user<\/td>\n<td>User name.<\/td>\n<\/tr>\n<tr>\n<td>pid<\/td>\n<td>Process ID associated with the statement.<\/td>\n<\/tr>\n<tr>\n<td>userid<\/td>\n<td>User ID.<\/td>\n<\/tr>\n<tr>\n<td>xid<\/td>\n<td>Transaction ID.<\/td>\n<\/tr>\n<tr>\n<td>query<\/td>\n<td>A prefix of LOG: followed by the text of the query, including newlines.<\/td>\n<\/tr>\n<\/tbody>\n<\/table><figcaption>Redshift user activity log<\/figcaption><\/figure>\n<p>Please check <a class=\"rank-math-link\" href=\"https:\/\/docs.aws.amazon.com\/redshift\/latest\/mgmt\/db-auditing.html#db-auditing-logs\">this article<\/a> to see how to enable logging.<\/p>\n<\/p>\n<h5 class=\"wp-block-heading\">Amazon Redshift User Activity Audit Log Sample<\/h5>\n<\/p>\n<pre class=\"wp-block-code\"><code>'2018-05-21T06:00:09Z UTC [ db=prod_sales user=duc pid=99753 userid=95 xid=6728324 ]' LOG: create table SumoProdbackUp.organization as (select * from SumoProd.simpleuser) \r\n'2018-05-21T06:00:09Z UTC [ db=vendor user=ankit pid=36616 userid=53 xid=2956702 ]' LOG: DELETE FROM SumoProd.employee WHERE id = 38; \r\n'2018-05-21T06:20:09Z UTC [ db=dev user=himanshu pid=64458 userid=35 xid=5143208 ]' LOG: drop user testuser3<\/code><\/pre>\n<\/p>\n<p>&nbsp;<\/p>\n<\/p>\n<h3 class=\"wp-block-heading\">Automatic data lineage analysis<\/h3>\n<\/p>\n<p><strong><a class=\"rank-math-link\" href=\"https:\/\/sqlflow.gudusoft.com\">Gudu SQLFlow<\/a><\/strong> is a tool that automated SQL data lineage analysis across Databases, ETL, Business Intelligence, Cloud, and Hadoop environments by parsing SQL Script and stored procedure. It can also analyze the Redshift user activity log files to discover the data lineage and depict all the data movement graphically.<\/p>\n<\/p>\n<p>Here is a part of a data lineage picture that was generated after analyzing the Amazon Redshift user activity log files:<\/p>\n<\/p>\n<figure class=\"wp-block-image size-large\"><\/figure>\n<\/p>\n<figure class=\"wp-block-image alignnone wp-image-3564\"><a href=\"https:\/\/sqlflow.gudusoft.com\"><img decoding=\"async\" width=\"1024\" height=\"300\" class=\"wp-image-3564\" src=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-1024x300.png\" alt=\"Amazon Redshift-Datenherkunft\u00a0\" srcset=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-200x59.png 200w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-300x88.png 300w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-400x117.png 400w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-600x176.png 600w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-768x225.png 768w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-800x234.png 800w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-1024x300.png 1024w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-1200x352.png 1200w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage-1536x450.png 1536w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/amazon-redshift-user-activity-log-data-lineage.png 1778w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption>Amazon Redshift-Datenherkunft<\/figcaption><\/figure>\n<\/p>\n<p>&nbsp;<\/p>\n<\/p>\n<h3><strong>Abschluss<\/strong>\u00a0<\/h3>\n<p>Vielen Dank f\u00fcr das Lesen unseres Artikels. Wir hoffen, dass er Ihnen dabei hilft, ein besseres Verst\u00e4ndnis zu erlangen von <strong>Amazon Redshift-Datenherkunft<\/strong>Wenn Sie mehr erfahren m\u00f6chten \u00fcber <strong>Amazon Redshift-Datenherkunft<\/strong>, we would like to advise you to visit our website <a href=\"https:\/\/sqlflow.gudusoft.com\"><strong>Gudu SQLFlow<\/strong><\/a> for more information. Gudu SQLFlow, as a data lineage analysis tool, can not only analyze SQL script files, obtain data lineage, and perform visual display, but also allow users to provide data lineage in CSV format and perform visual display. Thanks again! <strong>( Edited by Ryan on Apr 26, 2022)<\/strong><\/p>\n<\/div><\/div><\/div><style type=\"text\/css\">.fusion-body .fusion-builder-column-0{width:100% !important;margin-top : 0px;margin-bottom : 0px;}.fusion-builder-column-0 > .fusion-column-wrapper {padding-top : 0px !important;padding-right : 0px !important;margin-right : 1.92%;padding-bottom : 0px !important;padding-left : 0px !important;margin-left : 1.92%;}@media only screen and (max-width:1024px) {.fusion-body .fusion-builder-column-0{width:100% !important;}.fusion-builder-column-0 > .fusion-column-wrapper {margin-right : 1.92%;margin-left : 1.92%;}}@media only screen and (max-width:640px) {.fusion-body .fusion-builder-column-0{width:100% !important;}.fusion-builder-column-0 > .fusion-column-wrapper {margin-right : 1.92%;margin-left : 1.92%;}}<\/style><\/div><style type=\"text\/css\">.fusion-body .fusion-flex-container.fusion-builder-row-1{ padding-top : 0px;margin-top : 0px;padding-right : 0px;padding-bottom : 0px;margin-bottom : 0px;padding-left : 0px;}<\/style><\/div>","protected":false},"excerpt":{"rendered":"","protected":false},"author":27,"featured_media":3583,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[14],"tags":[89,90,91,75,92],"_links":{"self":[{"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/posts\/3554"}],"collection":[{"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/users\/27"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/comments?post=3554"}],"version-history":[{"count":31,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/posts\/3554\/revisions"}],"predecessor-version":[{"id":3896,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/posts\/3554\/revisions\/3896"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/media\/3583"}],"wp:attachment":[{"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/media?parent=3554"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/categories?post=3554"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/tags?post=3554"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}