Big data – more than just data

These days, collecting and processing large amounts of data is crucial for optimising work processes and thus for ensuring the success and sustainable growth of your company. Two well-known examples of companies that use big data are the online giants Google and Facebook.

So when should you think about handing over the collection and processing of your data to professionals? Here’s a short checklist:

  • Do you deal with large amounts of data that can no longer be managed with a simple spreadsheet or smaller database applications?
  • Are there complex relationships between your data?
  • Do lots of different data types occur together (e.g. string, date, float, integer, money, text, binary)?

If you answered “yes” to one or more of these questions, ETL is often the solution.

ETL – extract, transform, load

The term ETL is often mentioned in connection with a “data warehouse”. It describes a process in which data is extracted from various sources, transformed into a consistent format and loaded into one central place (a repository) for further analysis and evaluation. The use of metadata is essential for making this process as efficient as possible.
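To make the three steps more tangible, here is a minimal sketch in Python. It assumes two invented sources (a CSV export and a list of records from another application) and uses an in-memory SQLite database as the repository. All names, fields and values are hypothetical and only serve as an illustration; a real ETL pipeline would look different depending on your systems.

```python
import csv
import io
import sqlite3
from datetime import date
from decimal import Decimal

# Hypothetical source 1: a CSV export, e.g. from a spreadsheet.
CSV_SOURCE = """customer,order_date,amount
Alice,2023-01-15,19.99
Bob,2023-02-01,5.50
"""

# Hypothetical source 2: records from another application,
# with a different date format and amounts stored in cents.
APP_SOURCE = [
    {"name": "Carol", "date": "03.02.2023", "total_cents": 1250},
]


def extract():
    """Extract: read the raw records from both sources unchanged."""
    csv_rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    return csv_rows, APP_SOURCE


def transform(csv_rows, app_rows):
    """Transform: bring both sources into one shape (name, ISO date, cents)."""
    unified = []
    for row in csv_rows:
        iso_date = date.fromisoformat(row["order_date"]).isoformat()
        cents = int(Decimal(row["amount"]) * 100)
        unified.append((row["customer"], iso_date, cents))
    for row in app_rows:
        day, month, year = row["date"].split(".")
        iso_date = date(int(year), int(month), int(day)).isoformat()
        unified.append((row["name"], iso_date, row["total_cents"]))
    return unified


def load(rows):
    """Load: write the unified rows into the repository (here: SQLite in memory)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer TEXT, order_date TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def run_etl():
    """Run all three steps in order and return the filled repository."""
    csv_rows, app_rows = extract()
    return load(transform(csv_rows, app_rows))
```

Once the data sits in one place and one format, a single query such as SELECT SUM(amount_cents) FROM orders can evaluate all sources at once.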

Metadata – superglue for data models

A collection of different data, data types and data links is only as good as the metadata model that describes it.

Metadata means “data about information objects”, i.e. data, processes, services, ontologies and information models. The metadata itself and its consistent management are what make it possible to describe the collected data and the relationships within it. This is done using clear, precise and explicit descriptions, including measurement units.
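As a rough illustration of what such a description can look like, the following Python sketch keeps a name, a definition, a data type and a measurement unit for each data element and checks incoming records against them. The class, the element names and the units are invented for this example and do not follow any particular standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataElement:
    """Metadata for one data element (hypothetical structure for illustration)."""
    name: str
    definition: str
    datatype: type
    unit: Optional[str] = None


# A tiny, made-up metadata collection keyed by element name.
REGISTRY = {
    element.name: element
    for element in [
        DataElement("weight", "Net weight of the shipped item", float, "kg"),
        DataElement("quantity", "Number of items in the shipment", int, "pieces"),
        DataElement("shipped_on", "Date the item left the warehouse", str),
    ]
}


def validate(record):
    """Return a list of problems found when checking a record against the metadata."""
    problems = []
    for key, value in record.items():
        element = REGISTRY.get(key)
        if element is None:
            problems.append(f"{key}: no metadata registered")
        elif not isinstance(value, element.datatype):
            problems.append(
                f"{key}: expected {element.datatype.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


def describe(key, value):
    """Render a value together with its unit and definition."""
    element = REGISTRY[key]
    unit = f" {element.unit}" if element.unit else ""
    return f"{element.name} = {value}{unit} ({element.definition})"
```

With the metadata in place, a bare number such as 2.5 becomes an unambiguous statement: describe("weight", 2.5) renders it with its unit and definition, and validate flags records whose fields are unknown or have the wrong type.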

The structure and framework of metadata registries (MDRs) are defined in the standards ISO/IEC 11179 and ISO/IEC 19763.

MDR – also useful for smaller volumes of data

The effort that goes into creating a small MDR is definitely worth it for the following reasons:

  1. Volumes of data are multiplying in our increasingly complex digital world
  2. Data relationships and dependencies are becoming more visible and easier to describe as the amount of data increases
  3. Use of an MDR makes data exchange easier to automate, more compatible with other MDRs and simpler to document

Just ask us!

Do you have any further questions about ETL, metadata, MDR or data models in general? Send us an email or use our chat.