How to implement a Data Driven Culture

Building a data driven culture doesn’t happen overnight; it requires sustained focus and purposeful decision-making over time [1]. Here, I would like to mention the three most important points to consider when building a Self Service BI Platform.

In the modern approach to enterprise analytics, the IT department and the business units work hand in hand. The IT department sets up a central environment of trusted data or content and gives business users the ability to access that data, enter their questions, and get the answers they need [1]. One of the building blocks…


How to decide upon building KPIs and Dashboards in the best way possible

KPI is the abbreviation for Key Performance Indicator. The term refers to key figures that can be used to determine the performance of activities in companies. Which KPIs should be considered to measure success or failure depends on the company objectives [1]. These KPIs can be displayed together with other facts and figures on a so-called dashboard. State-of-the-art (self-service) BI tools such as Microsoft Power BI, Google Data Studio or Tableau can be used for this purpose. …


Impulses and Insights on how to successfully manage Data Science Projects

Managing data science projects involves several factors. This article presents five key elements: purpose, people, processes, platforms and programmability [1], and shows how you can benefit from them in your projects.

As in classic project management, a goal or purpose should always be formulated. Possible examples are:

  • Better business insights
  • Fraud prevention/detection
  • Prediction
  • Maximization problems, etc.

It is essential for a project within the field of Big Data or Data Science to have a specific purpose or goal.


Turn Data into Insights

Buzzwords like Big Data, Data Science and Business Intelligence are mentioned everywhere. But what are the typical potential new business models and products? Given the multitude of data sources, analysis options and business models, it’s easy to lose sight of the big picture.

In the following, three initial circumstances and scenarios are described in more detail.

If companies have large and valuable data pools at their disposal, decisions on the establishment of new data- and analytics-based business models are often made strategically. This applies to businesses which provide Big Data from satellites, weather phenomena, social media or telematics data. …


Ways of using Denormalization and Nested Data

BigQuery is a fully managed, serverless data warehouse on the Google Cloud Platform infrastructure that provides scalable, cost-effective and fast analytics over petabytes of data. It is delivered as a managed service and supports queries using standard SQL. In this article, I would like to mention two main techniques to make your BigQuery Data Warehouse efficient and performant [1].

SQL vs NoSQL: SQL databases are table-based, whereas NoSQL databases can be document-based, key-value or graph databases. SQL databases typically scale vertically, while NoSQL databases typically scale horizontally. …
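The two techniques named in the title, denormalization and nested data, can be illustrated with a minimal sketch in plain Python. It does not use the BigQuery API; the `customers` and `orders` tables and the helper names `denormalize` and `nest` are hypothetical and only serve to show the two resulting row shapes.

```python
from collections import defaultdict

# Hypothetical normalized tables, represented as lists of rows.
customers = [
    {"customer_id": 1, "name": "Alice"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 15.0},
]


def denormalize(customers, orders):
    """Flat join: repeat the customer name on every order row."""
    by_id = {c["customer_id"]: c for c in customers}
    return [
        {**o, "customer_name": by_id[o["customer_id"]]["name"]}
        for o in orders
    ]


def nest(customers, orders):
    """One row per customer, with the orders stored as a repeated nested field."""
    grouped = defaultdict(list)
    for o in orders:
        grouped[o["customer_id"]].append(
            {"order_id": o["order_id"], "amount": o["amount"]}
        )
    return [{**c, "orders": grouped[c["customer_id"]]} for c in customers]


flat_rows = denormalize(customers, orders)
nested_rows = nest(customers, orders)
```

The flat variant avoids a join at query time at the cost of repeating customer fields, while the nested variant keeps one row per customer and stores the orders inside it.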


Possibilities of how Cloud solutions can impact IT Management & Governance

Why IT managers shouldn’t be threatened when seeing clouds on the horizon

Improve your IT Assets, Resources and Capabilities to enhance your Business Success. The management of a company must be sure that its IT adequately supports the company’s goals. IT Management and good IT Governance are responsible for this. The Cloud and its commoditization of IT assets and resources have a massive impact on IT Governance as a whole. Smaller businesses and start-ups in particular can profit. This article explains how the cloud can help to meet the company’s goals [1].

The figure below gives a brief, high-level overview of how IT Governance is defined [2][3][4]:


What are the Dependencies to the Source Systems?

When integrating data from system A to system B, data engineers and other stakeholders should focus not only on the data process, e.g. via ETL/ELT, but also on the source system. Based on what I learned from earlier projects, the following circumstances must be taken into account:

When is a source system available? You have to consider maintenance cycles, downtimes, etc. Otherwise, if the system is not available, the data integration process will not work or only part of the data will be captured. Here, it makes sense to implement monitoring of the source system and work…


How to realize Big Data Projects

When setting up a Big Data landscape, there are five steps and topic blocks that must be taken into account during implementation.

In order to process data in a data lake or data warehouse, to analyze it or to make it usable for other systems, data must first be made available from the source systems. Examples of sources could be:

  • Internal systems like SAP, Salesforce, etc.
  • Internal databases from Oracle, Microsoft, MySQL etc.
  • Archived Files and Log Files
  • Documents
  • Social Media (like Facebook or Instagram API)
  • Web scraping data
  • Open API/Data

Besides classical batch and ETL process data integration (e.g…


What are the common Issues and how can they be solved?

In the field of Data Analytics and related topics like BI, Data Science, Data Engineering etc., you will often hear about the same problems when working in a project or on a product. Here, I want to share my experiences and possible solutions.

One of the most unpleasant moments in the life of every project or product manager is when the business department complains about the data quality. The problems can be of different kinds: errors in the source system, in the ETL process or in the report.

Solution: Here, it is a good idea to set up a monitoring system and…


How to Anonymize and Pseudonymize Data

Personal data is the core concept of data protection. Data protection law only applies when data relates to individuals. The GDPR, for example, increases fines to up to 20 million euros or, in the case of large companies and groups, up to 4% of the global group turnover of the previous year [1]. When working in the field of Big Data, Data Science or related fields, it is essential to know about these laws and how anonymization and pseudonymization make it possible to still use the data for your use cases.
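To make the difference between the two approaches concrete, here is a minimal sketch in Python. It assumes a keyed hash (HMAC-SHA256) as the pseudonymization technique and simple generalization as the anonymization step; neither is taken from the article, and the record layout, `SECRET_KEY`, and the helper names `pseudonymize` and `anonymize` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-securely-stored-key"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    The same input always maps to the same token, so records stay joinable,
    but anyone holding the key can recompute tokens for known identifiers.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize(record: dict) -> dict:
    """Drop the direct identifier and generalize age to a 10-year band."""
    decade = (record["age"] // 10) * 10
    return {"age_band": f"{decade}-{decade + 9}", "city": record["city"]}


# Hypothetical example record.
record = {"email": "jane@example.com", "age": 34, "city": "Hamburg"}

pseudonymized = {**record, "email": pseudonymize(record["email"])}
anonymized = anonymize(record)
```

Pseudonymized records can still be linked to each other through the token, whereas the anonymized record no longer carries a direct identifier. Whether generalization like this is sufficient in practice depends on the dataset and on which other attributes remain.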

This is any information relating to an identified or…

Christianlauer

Big Data Enthusiast based in Hamburg and Kiel.
