If you’ve worked with Power Pivot in Excel, you’ve seen the central role of shared fields in establishing relationships between tables. But perhaps you’ve never considered how the total number of distinct records in those tables matters.
Cardinality in data modeling describes the number of observations in one table that can be related to the corresponding observations in another. If that definition of cardinality leaves you feeling a bit “bird-brained” (pun very much intended), let’s take a look at some examples in Excel:
This workbook consists of a simple employee database that is being prepared to become a Power Pivot model. Several types of cardinality exist in it — some better than others:
One-to-one cardinality
The simplest type of cardinality is one-to-one. This is when each observation in one table has one and only one match in the other.
For example, the emp
table contains records about each employee such as their hire date and salary. It makes sense that only employee would be listed once and only once here. There is also a mgr
table listing the direct supervisor for each employee. Assuming that each employee has one and only one manager, there should be one instance for each employee here too.
That makes this a one-to-one relationship.
One-to-one relationships in Power Pivot
One-to-one relationships are straightforward, but they may not be practical. It may be more efficient to merge the corresponding data instead of creating a separate table. As a result, one-to-one relationships are not widely used in data modeling, and they’re not supported in Power Pivot. Instead, all relationships are classified as one-to-many, which we’ll discuss next.
One-to-many relationships
In a one-to-many relationship, there can be many observations in one table related to one and only one observation in the other. emp
and dept
in our workbook make a good example. Every employee is assigned to one and only one department, yet multiple people can be in the same department:
If you’ve used lookup functions like VLOOKUP()
in the past, you might think of dept
as a “lookup table” in this example. Rather than repeat dept_title
unnecessarily in the emp
table, it stores that added information in a more compact, helper table. That said, one-to-many relationships can be a bit more complex and difficult to maintain than the simpler one-to-one relationship.
Many to many relationships
In a many-to-many relationship, each table contains multiple instances of the same observation. For example, the mgr_evals
table contains evaluation scores for each manager, which are collected every six months. How can we join this data together, so we know the employees that report to each manager?
This gets a little tricky as neither table can serve as an unduplicated “lookup table” for the other — it must be “resolved” in some way, often by creating a third junction table that can cross-reference each table, or by redoing the design of the data model’s tables entirely.
Why does cardinality matter?
Understanding cardinality is crucial to data modeling because it ensures data integrity and consistency. For example, if the cardinality of a relationship is one-to-many, then it is important to enforce that each instance of the “one” entity is associated with only one instance of the “many” entity, and vice versa. Failure to do so can lead to data inconsistencies and errors.
And, although Power Pivot doesn’t differentiate between one-to-one and one-to-many relationships, understanding and utilizing this concept can enhance the performance of the data model in Power BI.
Have you encountered these cardinality types in your work before? What have you done to resolve many-to-many relationships? Let me know in the comments.
Leave a Reply