{"id":3311,"date":"2022-04-18T17:33:06","date_gmt":"2022-04-19T01:33:06","guid":{"rendered":"https:\/\/www.gudusoft.com\/?p=3311"},"modified":"2022-07-03T04:35:23","modified_gmt":"2022-07-03T12:35:23","slug":"whats-data-lineage-why-important","status":"publish","type":"post","link":"https:\/\/www.gudusoft.com\/de\/whats-data-lineage-why-important\/","title":{"rendered":"What Is Data Lineage? | Why Is Data Lineage So Important?"},"content":{"rendered":"<div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-1 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"background-color: rgba(255,255,255,0);background-position: center center;background-repeat: no-repeat;border-width: 0px 0px 0px 0px;border-color:#e8eaf0;border-style:solid;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start\" style=\"max-width:1310.4px;margin-left: calc(-4% \/ 2 );margin-right: calc(-4% \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-0 fusion_builder_column_1_1 1_1 fusion-flex-column\"><div class=\"fusion-column-wrapper fusion-flex-justify-content-flex-start fusion-content-layout-column\" style=\"background-position:left top;background-repeat:no-repeat;-webkit-background-size:cover;-moz-background-size:cover;-o-background-size:cover;background-size:cover;padding: 0px 0px 0px 0px;\"><div class=\"fusion-text fusion-text-1\" style=\"line-height:26px;\"><h1><strong><b>What Is Data Lineage? | Why Is Data Lineage So Important?<\/b><\/strong><\/h1>\n<p style=\"text-align: left\">With the rapid development of the economy and technology,\u00a0we are surrounded by all kinds of data, and almost every part of a business depends on it in some way. When we&#8217;re busy deciding how best to manage our data, we may feel we don&#8217;t have time to look into what it is really worth to the company.\u00a0Consider this: data should be available to the company 24\/7. 
To that end, understanding where the data originated, how it arrived where it is, and how it circulates through the business is critical to its value.<\/p>\n<div id=\"attachment_3510\" style=\"width: 586px\" class=\"wp-caption aligncenter\"><img aria-describedby=\"caption-attachment-3510\" decoding=\"async\" class=\"size-full wp-image-3510\" src=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Data_Lineage-2.png\" alt=\"Data lineage\" width=\"576\" height=\"384\" srcset=\"https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Data_Lineage-2-200x133.png 200w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Data_Lineage-2-300x200.png 300w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Data_Lineage-2-400x267.png 400w, https:\/\/www.gudusoft.com\/wp-content\/uploads\/2022\/04\/Data_Lineage-2.png 576w\" sizes=\"(max-width: 576px) 100vw, 576px\" \/><p id=\"caption-attachment-3510\" class=\"wp-caption-text\">Data lineage<\/p><\/div>\n<p style=\"text-align: left\">Enter\u00a0<a href=\"https:\/\/www.gudusoft.com\/de\/blog\/sqlflow-visualize-data-lineage-stored-procedure\/\"><strong><b>data lineage<\/b><\/strong><\/a>, a valuable tool for tracing this gold mine of data back to its origin, understanding it, and ensuring that it ends up in the hands of those who need it most. So\u00a0<strong><b>what is data lineage<\/b><\/strong>? Why is data lineage so important? In this post, let&#8217;s take a closer look at\u00a0<strong><b>data lineage<\/b><\/strong>.<\/p>\n<h2><strong><b>What Is Data Lineage?<\/b><\/strong><\/h2>\n<p>Data lineage is the pedigree of the data. In short, it is a record of how the data arrived at a particular location, together with the intermediate steps and transformations that happen as the data moves through business systems. 
In essence,\u00a0<strong><b>data lineage<\/b><\/strong>\u00a0gives us a detailed map of the data&#8217;s journey, including all the steps along the way, as shown above.<\/p>\n<h2><strong><b>Data Lineage vs. Data Provenance<\/b><\/strong><\/h2>\n<p>The concept of data provenance is related to data lineage. It refers to the source of the data. Based on the provenance, we can make assumptions about the reliability and quality of the data.\u00a0Both\u00a0<strong><b>data warehouse<\/b><\/strong>\u00a0and\u00a0<strong><b><a href=\"https:\/\/en.wikipedia.org\/wiki\/Data_lake\">data lake<\/a> administrators<\/b><\/strong>\u00a0should focus on tracking data provenance and data lineage. Key aspects of metadata management include knowing where and when the data originated, who has touched it, and how it was modified.<\/p>\n<h2><strong><b>Why Is Data Lineage So Important?<\/b><\/strong><\/h2>\n<p>Knowing the provenance and lineage of data is important for the following reasons:<\/p>\n<p>First, it lets us assess the credibility of data based on its provenance.\u00a0Second, it helps us find and correct the sources of errors. Third,\u00a0it allows us to identify false assumptions about the data that might distort an analysis. Fourth,\u00a0it provides audit trails for data governance and regulatory purposes. Fifth, it helps us ensure that data flows are protected from tampering.\u00a0Finally, it enables us to identify and avoid data duplication, simplifying operations and reducing costs.<\/p>\n<h2><strong><b>What Business Value Can Data Lineage Provide?<\/b><\/strong><\/h2>\n<p>Although data lineage may seem like an abstract concept, a comprehensive understanding of the entire life cycle of data can add value to the business in several areas:<\/p>\n<h3><strong><b>1. 
Improve business performance<\/b><\/strong><\/h3>\n<p>Almost every decision in the modern enterprise relies on BI and decision support systems (DSS).\u00a0Examples include deciding which features to prioritize in new product design, where to advertise, and which sales and marketing strategies to use to maximize revenue, profitability, and customer loyalty.\u00a0The phrase &#8220;garbage in, garbage out&#8221; applies to all aspects of analysis. Wrong data can seriously distort results and hurt business performance.<\/p>\n<h3><strong><b>2. Manage regulatory compliance and risk<\/b><\/strong><\/h3>\n<p>Organizations in every industry must meet various regulatory requirements, some of which apply only to certain industries.\u00a0Examples include HIPAA, which aims to protect patient information in healthcare, and Basel, which aims to mitigate risk in international banking. Others, like the EU&#8217;s General Data Protection Regulation (GDPR), affect all industries.\u00a0Maintaining metadata that tracks data lineage for data governance purposes reduces the business risk and cost associated with compliance. It\u00a0also makes it easier and more cost-effective to comply with new regulations in the future.<\/p>\n<h3><strong><b>3. Handle evolving data sources<\/b><\/strong><\/h3>\n<p>Systems and data sources change as business conditions evolve.\u00a0For instance, an analytics application that estimates customer behavior just by looking at traditional point-of-sale data is almost certainly wrong.\u00a0This approach will miss e-commerce orders, in-app purchases, and a variety of other sales channels and customer demographics.\u00a0Though this may seem obvious, data bias and undetected data sources are a trap that even the most sophisticated organization can easily fall into.<\/p>\n<h3><strong><b>4. 
Reduce IT cost and risk<\/b><\/strong><\/h3>\n<p>What all of the above examples have in common is that they rely on information technology (IT). Organizations that understand their data sets and how they are used can build new applications more easily and solve problems with existing applications more quickly and economically. If the metadata describing the source of the data is clear, it is much easier and more cost-effective to modify or add an analytics application.<\/p>\n<h2><strong><b>How to Manage Data Lineage<\/b><\/strong><\/h2>\n<p>Data lineage management is particularly important in a data lake environment. A data lake contains diverse data sets in different formats from different sources, such as images, video files, log files, documents, raw text, or files in JSON, CSV, Apache Parquet, or Optimized Row Columnar (ORC) format.\u00a0In addition, datasets are constantly being added to the data lake, often quickly, and various tools can access and process the raw data to produce additional derived datasets.<\/p>\n<p>When this diversity and speed are combined with large volumes of data, manually tracking the origins and details of every data item is impossible.\u00a0Metadata management is therefore a particular concern for data lakes, and it must be automated.\u00a0Unlike the data itself, which is stored in the data lake, metadata is &#8220;data about data&#8221; and can take many forms.<\/p>\n<h2><strong><b>Conclusion<\/b><\/strong><\/h2>\n<p>Thank you for reading our article. We hope it helps you better understand\u00a0<strong><b>what data lineage is and why\u00a0it is so important<\/b><\/strong>.\u00a0If you want to know more about data lineage, we recommend visiting <strong><a href=\"https:\/\/www.gudusoft.com\/de\/\">Gudu SQLFlow<\/a><\/strong> for more information. Thanks again! 
<strong>\u00a0(Published by Ryan on Apr 18, 2022)<\/strong><\/p>\n<\/div><\/div><\/div><style type=\"text\/css\">.fusion-body .fusion-builder-column-0{width:100% !important;margin-top : 0px;margin-bottom : 0px;}.fusion-builder-column-0 > .fusion-column-wrapper {padding-top : 0px !important;padding-right : 0px !important;margin-right : 1.92%;padding-bottom : 0px !important;padding-left : 0px !important;margin-left : 1.92%;}@media only screen and (max-width:1024px) {.fusion-body .fusion-builder-column-0{width:100% !important;}.fusion-builder-column-0 > .fusion-column-wrapper {margin-right : 1.92%;margin-left : 1.92%;}}@media only screen and (max-width:640px) {.fusion-body .fusion-builder-column-0{width:100% !important;}.fusion-builder-column-0 > .fusion-column-wrapper {margin-right : 1.92%;margin-left : 1.92%;}}<\/style><\/div><style type=\"text\/css\">.fusion-body .fusion-flex-container.fusion-builder-row-1{ padding-top : 0px;margin-top : 0px;padding-right : 0px;padding-bottom : 0px;margin-bottom : 0px;padding-left : 
0px;}<\/style><\/div>","protected":false},"excerpt":{"rendered":"","protected":false},"author":27,"featured_media":3379,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[31,178],"tags":[55,56,54],"_links":{"self":[{"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/posts\/3311"}],"collection":[{"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/users\/27"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/comments?post=3311"}],"version-history":[{"count":12,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/posts\/3311\/revisions"}],"predecessor-version":[{"id":4989,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/posts\/3311\/revisions\/4989"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/media\/3379"}],"wp:attachment":[{"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/media?parent=3311"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/categories?post=3311"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gudusoft.com\/de\/wp-json\/wp\/v2\/tags?post=3311"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}