Data Modeling interview questions and answers 👇

  1. Data Modelling Interview Questions


Data Modelling Interview Questions

What is Normalization?

View answer

Normalization is a formal approach that applies a set of rules to associate attributes with entities.

What is Denormalization?

View answer

Denormalization is the process of adding precomputed redundant data to an otherwise normalized relational database to improve read performance of the database. Normalizing a database involves removing redundancy so only a single copy exists of each piece of information.

What is ERD?

View answer

An entity-relationship diagram (ERD) is a data modeling technique that graphically illustrates an information system's entities and the relationships between those entities. An ERD is a conceptual and representational model of data used to represent the entity framework infrastructure.

What is a Surrogate Key?

View answer

A surrogate key uniquely identifies each entity in the dimension table, regardless of its natural source key. This is primarily because a surrogate key generates a simple integer value for every new entity. Surrogate keys are necessary to handle changes in dimension table attributes.

What are the Critical Relationship Types in a Data Model?

View answer

There are 3 different types of relations in the database:

  • one-to-one
  • one-to-many
  • many-to-many

What is an Enterprise Data Model?

View answer

An enterprise data model is a type of data model that presents a view of all data consumed across the organization. It provides an integrated yet broad overview of the enterprise's data, regardless of the data management technology used.

What are the 2 Design Schemas?

View answer

The two popular dimensional data models are:

  • Star schema: data is organized into facts (measurable items) and dimensions (reference information), where each fact is surrounded by its associated dimensions in a star-like pattern
  • Snowflake schema: resembles the star schema but includes additional layers of associated dimensions, making the branching pattern more complex

What is Slowly Changing Dimension?

View answer

A Slowly Changing Dimension (SCD) is a dimension that stores and manages both current and historical data over time in a data warehouse. It is considered and implemented as one of the most critical ETL tasks in tracking the history of dimension records.

What is a Data Mart?

View answer

A data mart is a subject-oriented database that is often a partitioned segment of an enterprise data warehouse. The subset of data held in a data mart typically aligns with a particular business unit like sales, finance, or marketing.

What is Granularity?

View answer

In dimensional modeling, granularity refers to the level of detail stored in a table. For example, a dimension such as Date (with Year and Quarter hierarchies) has a granularity at the quarter level but does not have information for individual days or months.

What is Data Sparsity?

View answer

Data sparsity signifies how much data we have for a particular dimension/entity of the model.

What are Subtype and Supertype Entities?

View answer

A supertype is a generic entity type that has a relationship with one or more subtypes. A subtype is a sub-grouping of the entities in an entity type that is meaningful to the organization and that shares common attributes or relationships distinct from other subgroups.

Why is Metadata important?

View answer

Metadata management is critical for organizations looking to understand the context, definition and lineage of key data assets. Data models play a key role in metadata management, as many of the key structural and business definitions are stored within the models themselves.

What is forward and reverse engineering?

View answer

Forward engineering is the process of generating the physical database schema from a physical model. You use the Schema Generation dialog to forward engineer a model and generate the schema. The schema that you generate includes all options that are supported in your target server.

Reverse engineering is the process of creating a data model from a database or a script. The modeling tool creates a graphical representation of the selected database objects and the relationships between the objects. This graphical representation can be a logical or a physical model.

What are Recursive Relationships?

View answer

A recursive relationship is a non-identifying relationship between two entities or tables that represents the fact that one company can own another company. In this type of relationship, the parent entity or table and the child entity or table are the same.

What is a Conformed Dimension?

View answer

In data warehousing, a conformed dimension is a dimension that has the same meaning to every fact with which it relates. Conformed dimensions allow facts and measures to be categorized and described in the same way across multiple facts and/or data marts, ensuring consistent reporting across the enterprise.

In what way are NoSQL databases better than relational databases?

View answer

NoSQL databases don't need a predefined schema, allowing you to work more freely with “unstructured data.” Relational databases are vertically scalable, but usually more expensive, whereas the horizontal scaling nature of NoSQL databases is more cost-efficient.

What is Junk Dimension?

View answer

A Junk Dimension is a dimension table consisting of attributes that do not belong in the fact table or in any of the existing dimension tables. The nature of these attributes is usually text or various flags, e.g. non-generic comments or just simple yes/no or true/false indicators.