Challenges & solutions
Around your data warehouse and data lake
Data warehouses have for many years provided the insights that support important decisions within organizations. Although Massively Parallel Processing (MPP) architecture has enabled data warehouses to process large amounts of data with ease, they remain primarily focused on structured data.
As medium and large organizations are increasingly dealing with unstructured data as well as streaming data, they are running into the limitations of their data warehouses. In this first blog in a series of two, read about the next step in data-driven processes within these organizations: Lakehouse architecture.
As data warehouses are not suitable for storing unstructured data and streaming data, a data lake is often used for this purpose. A data lake is a repository for raw data in various formats. The advantage of a data lake is that it is inexpensive and supports all data formats. The disadvantage of a data lake is the lack of the following key features:
The lack of these features means that a data lake alone is not the solution for processing data into insights. Many organizations therefore use both a data lake and a data warehouse. This is not an ideal situation, as it splits data flows into silos: one for structured data (via the data warehouse) and one for unstructured data and streaming data (via the data lake). Many of these organizations are looking for a way to break down these silos. That solution is Lakehouse architecture.
What is a Lakehouse?
A Lakehouse is an open architecture that combines the features of a data lake and a data warehouse. It does this by implementing the data structures and data management features of a data warehouse on top of a data lake. This creates a win-win situation: you get the benefits of a data warehouse and the benefits of a data lake in a single platform.
What are the main benefits of a Lakehouse?
The main benefits of a Lakehouse are:
You may now be wondering: how do I offer all the data in a Lakehouse conveniently for use? A Lakehouse stores data both in raw form and in processed form with business logic applied. This distinction in dataset maturity is important: raw datasets are less useful for standard reports and KPIs, while processed data may exclude data that is of interest to Data Science.
A common solution for this is medallion architecture, in which data is categorized as Gold Data, Silver Data and Bronze Data. As with medals in sports, gold is better than silver and silver is better than bronze. In medallion architecture, raw data is labeled as Bronze Data and processed data is labeled as Silver Data or Gold Data.
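To make the three layers concrete, here is a minimal sketch in plain Python. It assumes a common convention that the text above does not spell out: Silver Data is cleaned and de-duplicated, and Gold Data is aggregated into report-ready figures. The sample records and the function names `to_silver` and `to_gold` are hypothetical and only serve as illustration.

```python
from collections import defaultdict

# Bronze: raw records exactly as they arrived (hypothetical sample data).
bronze = [
    {"order_id": "1", "region": " north ", "amount": "100.50"},
    {"order_id": "2", "region": "South", "amount": "20"},
    {"order_id": "2", "region": "South", "amount": "20"},    # duplicate record
    {"order_id": "3", "region": "north", "amount": "n/a"},   # unparseable amount
]


def to_silver(rows):
    """Clean, type and de-duplicate raw rows; skip rows that cannot be parsed."""
    seen = set()
    silver = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # the raw row stays available in Bronze for investigation
        order_id = int(row["order_id"])
        if order_id in seen:
            continue
        seen.add(order_id)
        silver.append(
            {
                "order_id": order_id,
                "region": row["region"].strip().lower(),
                "amount": amount,
            }
        )
    return silver


def to_gold(rows):
    """Aggregate cleaned rows into a report-ready KPI: revenue per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Note that the Bronze list is never modified. A Data Scientist can still inspect the rejected record, while a report only needs the Gold totals.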
Understanding and processing Bronze Data requires more expertise than understanding and processing Gold Data. In addition, Bronze Data may unintentionally contain more sensitive data than Gold Data. So it makes sense to restrict access to each type of data with roles and groups. For example: Bronze Data is accessible only to Data Engineers or Data Scientists. The Access Control Lists (ACLs) in Azure Data Lake Storage make this possible: rights can be set up per layer, source system or domain.
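As an illustration of per-layer rights, the sketch below builds a POSIX-style ACL string of the kind Azure Data Lake Storage uses (entries such as `group:<id>:r-x`, separated by commas). Only the Bronze rule comes from the text above. The readers of the Silver and Gold layers, the group names, the placeholder group IDs and the helper `build_acl` are all assumptions made for this example, and the code only produces the string; it does not call any Azure service.

```python
# Hypothetical mapping of medallion layers to the groups allowed to read them.
# Only the Bronze entry follows the article; Silver and Gold are assumptions.
LAYER_READERS = {
    "bronze": ["data-engineers", "data-scientists"],
    "silver": ["data-engineers", "data-scientists", "analysts"],
    "gold": ["data-engineers", "data-scientists", "analysts", "report-users"],
}


def build_acl(layer, group_ids):
    """Return an ACL string granting read and execute to the layer's groups.

    group_ids maps a group name to its directory object ID (placeholders here).
    """
    # Base entries: owner keeps full access, everyone else gets nothing.
    entries = ["user::rwx", "group::r-x", "other::---"]
    for name in LAYER_READERS[layer]:
        entries.append(f"group:{group_ids[name]}:r-x")
    return ",".join(entries)


# Placeholder IDs; in practice these would be the object IDs of real groups.
group_ids = {
    "data-engineers": "id-eng",
    "data-scientists": "id-sci",
    "analysts": "id-ana",
    "report-users": "id-rep",
}

for layer in ("bronze", "silver", "gold"):
    print(layer, build_acl(layer, group_ids))
```

Keeping the mapping in one place makes it easy to review who can see which layer. The same idea extends to a mapping keyed by source system or domain.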
Most of the new data platforms we implement for clients follow the Lakehouse architecture. A Lakehouse fits well with organizations that need a widely deployable analytics platform. Within a Lakehouse, different types of use cases are supported, ranging from Data Exploration to Data Science, Reporting and Business Intelligence. This requires high data literacy and extensive knowledge of data-driven work among your users.
Therefore, for some organizations, a data warehouse is still the best choice. A data warehouse fits well with organizations that want to focus solely on Reporting and Business Intelligence in the medium term. A data warehouse is a less complex data platform, so users need a lower level of data literacy and less extensive knowledge of data-driven work.
If a Lakehouse is the right decision for your organization, the key is to determine the right migration strategy. In the second blog in this series, we take you through determining the best migration strategy, the skill set needed and the different thinking that is essential when following the Lakehouse architecture.
Want to learn more about the key benefits and challenges of Lakehouse architecture, or get started right away? Connect with our data & analytics experts.