BOM (Bill-Of-Materials) Scenario with SQL Server Analysis Services 2008 R2

While working on a Business Intelligence project where the data source is SAP R/3 and data is fetched directly from the SAP tables using SSIS 2008 with the Microsoft BizTalk SAP Connector; everything was smooth until one day I had to replicate a multiple parent parent-child relationship (aka bill-of-materials) that SAP creates in the Finance module.

These relationships occur when a child node can have more than one parent. Even though these types of relationships are not supported by default in SQL Server Analysis Services 2008 R2 I kept trying to implement them, until I’ve found a solution that works! In this document I explain how I did it.

In all bill-of-materials scenarios we have one fact table, one dimension table and one bridge or intermediate table that connects the fact table with the dimension table. This document is structured in 4 sections that will cover the 3 tables mentioned before plus one section for the required Analysis Services configuration.

Section 1 – Data Source Scenario

The data source tables are introduced and their structure explained in this section.  The fact table and its relationship with the dimension table are explained here.

This section will be quite familiar to those who have worked in a SAP environment, extracting data directly from SAP tables. If your scenario is not this specific SAP one, you can jump this section completely and go straight to section 2.

Section 2 – Building the Cost Element Dimension

A bill-of-materials dimension table has a slightly different structure than a normal dimension table in a non-BOM case. Section 2.3 covers this difference. The rest of the section details how to flatten the SAP hierarchy of nodes a leaves. If in your case you have a ready-to-go dimension table I still strongly suggest to read section 2.3.

Section 3 – Relate the Fact table with the Dimension table

This section describes the importance of the bridge or intermediate table that will connect the fact table with the dimension table.

Section 4 – Configuring the cube

Finally, section 4 details the configuration needed in Analysis Services 2008 R2 to make all parts work together in a bill-of-materials scenario.

Section 1 – Data source scenario

The specific solution covered in this document retrieves data for the Finance department to report on the expenditure of each department. In SAP terms that means the Cost Centers and Cost Elements, which are stored in the following tables’ structure:

Definition:

  • CSKS: Cost Center definition table
  • CSKA: Cost Element definition table

These two tables contain each individual Cost Center/Element identified by the OBJNR field in CSKS table and the KSTAR field in CSKA table.

Both Cost Centers and Cost Elements are structured in a hierarchy defined in the SETHEADERT, SETNODE and SETLEAF tables:

  • SETHEADERT: Description (long text) for the
    SETNODES
  • SETNODE: Declares for each node (SETNAME) which
    are its children (SUBSETNAME). This table is a normal recursive parent-child
    table. When a node doesn’t have any child it is considered a SETLEAF.
  • SETLEAF: Contains only the last members on the
    SETNODE hierarchy, i.e. the leaf nodes. Each leaf node spans a range of Cost
    Centers/Cost Elements (OBJNR/KSTAR); this range is declared in the VALFROM and
    VALTO fields.

The big problem is that each OBJNR/KSTAR can be in more than one SETLEAF and that one SETNODE can be in more than one SETNODE hierarchy.
This is known as a multi-parent parent-child relationship or bill-of-materials.

To complete the scenario and have an example let’s include our fact table.

COEP table is a fact table that stores for each expense in the company the amount and the cost center and cost element which that expense is related to.

For example, you buy a computer that goes to X (OBJNR) cost center and Y (KSTAR) cost element.

Relationships:

  • COEP is related with CSKS (Cost Center) by OBJNR field.
  • COEP is related with CSKA (Cost Element) by KSTAR field.
  • The rest of the fields are fact data.

Let’s have a simple example:

COEP Fact table:

  • OBJNR: KS100000120000
  • KSTAR: 0000404050

To make things simpler, in this document we will follow the solution for the Cost Element object. To implement the Cost Center object the same steps needs to be
followed.

The KSTAR 0000404050 appears in 2 leaf nodes (SETLEAF), PURSERV and MAINTREN. PURSERV node (SETNODE) appears under at least 2 different
parents as the following screenshot shows:

So, the expense that we have here needs to be reported under 2 different hierarchies but can’t be double billed if there is a parent node for the two hierarchy trees. This is a normal BOM scenario.

Section 2 – Building the Cost Element Dimension

2.1. Retrieve all SETNODE AND SETLEAF records

Using SSIS and the Microsoft BizTalk SAP Connector fetch all nodes and columns from the SAP tables described before and import them to a staging database in SQL Server.

The rows are filtered by the SETCLASS and SUBCLASS columns to retrieve only the ones that I’m interested. In my case SETCLASS = 0101 and SUBCLASS = QA10. These values could, and most probably would, be different in your company.

2.2. Detect the hierarchies and build the staging relationship tables

Hierarchy table for nodes (SETNODE)

Using a recursive stored procedure with a cursor inside, I build  a relationship table that contains all possible paths for the SETNODE hierarchies. The store procedure “walks” the SETNODE table and inserts rows in the Z_SETNODE_HIERARCHY table with the appropriate information.

Notice how the node with SETNAME equal to “109200” appears in 2 different hierarchies; one under “QA10_BJP” and the other under “QA10_CC”. It’s important to understand that this table has ALL possible hierarchy trees, that’s why node “109200” appears multiple times.

Complete LEAF information

A second table will store the complete SETLEAFs information crossed with the definition table CSKA. This table will be called Z_CSKA_SETLEAF.

To build this table I’m using another stored procedure, called p_Z_CSKA_SETLEAF.

Hierarchy table for LEAFs (SETLEAF)

A third table containing the relationships between SETLEAFs and SETNODEs is needed. In this table we have one entry for each KSTAR (SETLEAF) related to a SETNODE. 

If you are interested in the stored procedures that create the tables described above please click in the following link to a Word document which contains the code for the stored procedures in the order that should be executed: BOMStoredProcedures

Note: To execute the stored procedures I use the Execute SQL Task in SSIS.

The resulting tables from this process are:

2.3. Create the Dimension table DimCostElement

With the previous 2 tables now we can load them into ONE single Dimension table, DimCostElement. First we will load the nodes from the Z_SETNODE_HIERARCHY table and later the leaves from the Z_CSKA_SETLEAF_SETNODE table.

By joining these two tables we have “flattened” the hierarchy. Now as leaves and nodes are in the same table, they can be considered the same object type.

Remember that this Dimension table contains ALL the possible paths in the hierarchy, so a child will appear multiple times in the table. Also remember that a child now can be not only a leaf node member but also normal node member. There is no more distinction between leaf and node members.

The structure of this table is very important, let’s have a deeper look at it:

DimCostElement table contains a CostElementID and a ParentID field. The CostElementID field acts as ID for the table, while the ParentID field just refers to the former to create a parent-child relationship.

The SETNAME field contains the SETNAME for the SETNODEs and the KSTAR for the LEAFs. In this way we have unified the SETNAME for the two different object types. Be aware here that one SETNAME can have multiple CostElementIDs due to that SETNAME will appear in more than one hierarchy path.

The concept of decoupling the name of an object from its database ID, allowing the object name to have more than one ID is what allows us to have multiple-parent parent-child relationships. This is the most important concept to learn in this exercise.

The last two fields are description fields.  

Section 3 – Relate the Fact table with the Dimension table

The fact table, COEP, has for each entry one KSTAR field that relates the described expense with one Cost Element in the dimension table; but because one Cost Element can be in more than one hierarchy the relationship between the fact table and the dimension table cannot be direct. We need an intermediate table.

The intermediate table pairs each KSTAR from the fact table with all possible CostElementIDs in the DimCostElement table. Let’s remember that in our case the KSTAR field from the fact table contains the same value (here is the relationship) as the SETNAME field from the DimCostElement table, but because of the multi-parent parent-child hierarchy nature of the Cost Elements,
one SETNAME can have multiple CostElementIDs. The intermediate table stores for each KSTAR from the fact table all the CostElementIDs from the DimCostElement table (many-to-many relationship).

You can create another stored procedure that fills the fact-less table with the result of a join query between the fact table and the dimension table.

In SQL Server 2008 this type of intermediate tables is known as Fact-less tables. You can search the Books Online for further information about them.

In this solution we cross one of the golden rules in data warehousing. KSTAR values in the Fact table should be replaced by their equivalent CostElementID from the DimCostElement table. But because one KSTAR value can have more than one CostElementID (remember that one KSTAR can be in more than one leaf member), this is a better solution that saves us of creating another intermediate table.

With this last table the database is ready for Analysis Services. In the next section I describe how to configure the cube to allow the BOM relationships and properly aggregate them.

Section 4 – Configuring the cube

4.1. Adding the fact and dimension tables

First step will be to add the 3 previous tables (fact, dimension and intermediate) to the data source view and create the appropriate relationships between tables (if you are as me that I don’t create any relationships in the database itself).

Create an empty cube and let’s add the tables manually to fully control how Analysis Services understand them.

The 3 tables will be added and configured in the cube as follows:

  • The Fact COEP table will be added to the cube a fact table with one measure group.
  • The intermediate Fact-less table will be added to the cube as both a dimension and a fact table.
  • The DimCostElement table will be added to the cube as a dimension table.

The Fact-less table should only be configured as a fact table, as it doesn’t contain any data, but SSAS needs 2 dimensions when configuring many-to-many relationships. The reason behind this is that SSAS approaches many-to-many relationships as 2 regular relationships put together.

You can add first the Fact-less table as a dimension and later as a fact table or on the opposite order, it really doesn’t matter. To not confuse the final user set the “Visible” attribute of the dimension to “False” and do the same for each measure of the fact. This will prevent the false dimension and fact-less table to appear in any client software that queries the cube.

4.2. Configuring the Fact-less dimension

Attributes:

The Fact-less dimension will have 3 attributes:

  • Factless-Cost Element ID: This attribute is the field CostElementID field from the Fact-less table
  • FactCOEPActual – KSTAR: This attribute is the KSTAR field from the Fact COEP table (main fact table)
  • DimCostElement – Cost Element ID: This attribute is the CostElementID field from the DimCostElement table

Note: I like to name the attributes in these special tables with the name of their source table preceding their name for clarity purposes.

These 3 attributes are needed to later build the many-to-many relationships between the tables, as it is explained in the next section of this document.

4.3. Configuring the Dimension Usage

The Dimension Usage tab allows us to define the relationship between the dimensions and the measure groups (fact) contained in a cube. It is here where we can tell Analysis Services that we have a multi-parent parent-child between our 3 tables.

The screenshot below shows the final scenario for the Cost Center (Factless OBJNR Cost Center) and Cost Elements (Factless KSTAR Cost Element). As this document only covers the Cost Element case, let’s focus our attention on the Actual Values and Factless KSTAR Cost Element columns/measure groups and Dim Cost Element (Cost Element) and Factless KSTAR Cost Element rows/dimensions.

In this document I will not describe the different types of dimension relationships. I encourage you to read this topic in the Microsoft MSDN here: http://msdn.microsoft.com/en-us/library/ms175669(v=SQL.105).aspx

First relationship

(Dimension) Factless KSTAR Cost Element with (Measure G)
Factless KSTAR Cost Element:

This is a Regular relationship as the dimension and the measure group are jointly directly by a key column.

Configure the relationship as in the following screenshot:

Second relationship

(Dimension) Dim Cost Element with (Measure G) Factless KSTAR
Cost Element:

This is a Regular relationship as the dimension and the measure group are jointly directly by a key column.

Configure the relationship as in the following screenshot:

Third relationship

(Dimension) Factless KSTAR Cost Element with (Measure G) Actual Values:

This is a Regular relationship as the dimension and the measure group are jointly directly by a key column.

Configure the relationship as in the following screenshot:

Fourth relationship

(Dimension) Dim Cost Element with (Measure G) Actual Values:

This is the most important relationship to define. The Cost Element Dimension is related in a many-to-many fashion to the fact table. Analysis Services needs of an intermediate table (fact) in order to build the relationship.

As our intermediate table has all possible paths for all the hierarchies and the dimension table has a parent-child relationship built-in, this combination will give us the bill-of-materials scenario.

Configure the relationship as in the following screenshot:

After all the configuration is done, build the cube (first build the dimensions and later the measure groups) and start enjoying a multi-parent parent-child relationship (or Bill-of-Materials) in SQL Server Analysis Services 2008.

Advertisements
Tagged with: , , , , , , , , , , ,
Posted in SAP, SSAS
15 comments on “BOM (Bill-Of-Materials) Scenario with SQL Server Analysis Services 2008 R2
  1. Hi Jordi,

    thank you for this post, which IMO shows an elegant solution.

    I have been working on a BOM solution some time ago, using cascading many-to-many relations and other approaches. I’ve found that performance can be a big issue, depending to great extent on the size of the factless fact table. This corresponds to the results in the whitepaper http://www.microsoft.com/downloads/en/details.aspx?FamilyID=3494E712-C90B-4A4E-AD45-01009C15C665&displaylang=en by my colleague Erik Veerman and others.

    How was the performance of your solution?
    How big were your tables?
    How many levels did your data contain?

    Cheers,
    Helmut

    • jbartual says:

      Hi Helmut,

      I counted the members in the Cost Element dimension: 2930.
      The dimension is 14 levels deep.

      In the real cube the Cost Element dimension is always crossed with the Cost Center dimension which has the following numbers:
      Members: 710
      Deep levels: 9

      The fact tables have the following rows:
      Actual values: 157,331
      Plan values: 359,200
      Commitment values: 18,545

      As you can see this is not by any means a big deployment (is growing by the day) and maybe this is why I don’t have any performance issues.
      By now the system is running on a development virtual machine with just 4GB of RAM on a 64bits Windows 2008 R2 server with a two core Xeon @ 3.00 GHz.

  2. Sonic says:

    Hello Jordi,
    also big thanks for this post from me. Is it possible, to see the Stored Proc you mentioned under point 2.2 to create the relationship table? I don’t know how to start to create a table which shows me the Parent-Child-Architecture.

    Greets Sonic

    • jbartual says:

      Hi Sonic,

      Thank you for your interest. I’ve updated point 2.2 to include a bit more of information and a link to a Word document with the stored procedures source code. The stored procedures are in the order that they should be executed to properly build the 3 tables needed for the hierarchies.

      I hope this helps you and you can have these tricky relationships up and working very soon.

      Regards,
      Jordi

  3. Paolo says:

    Hello Jordi;

    Great Post, but I am having trouble at section 2.3. You mention flattening the Hierarchy, is this similar to the script that you use to create the dimension in the DSV (I am doing this with the Cost Centre Hierarchies)?

    SELECT A.ID AS CostCentreID,
    A.ParentID,
    A.SETNAME,
    B.SETLEAFKOSTL AS CostCentre,
    B.SETLEAFLTEXT AS Description
    FROM dbo.Z_SETNODE_HIERARCHY A
    FULL OUTER JOIN dbo.Z_CSKS_SETLEAF_SETNODE B ON A.ID = B.SETNODE_ID

    • jbartual says:

      Hi Paolo,

      By flattening the hierarchy I mean that in 1 table you will have both parent and child nodes and all possible paths of the hierarchy in a explicit manner. Is the less normalized way that you can think of a relationship.
      If you have successfuly created the Z_SETNODE_HIERARCHY and Z_CSKS_SETLEAF_SETNODE just load first all the SETNODE_HIERARCHY rows into the dimension table and later load all rows from SETLEAF_SETNODE. No need for the INNER JOIN, just create two data flow tasks to load first the parents and later the children.

      Hope this helps. Please let me know if you finally succeed creating the hierarchies.
      Jordi

  4. Paolo says:

    Hey Jordi,

    Thanks for your reply. I tried your suggestion, and have included a script that essentially emulates what I did in my DFT:
    SELECT DISTINCT
    a.id AS CostCentreID,
    a.ParentID AS ParentID,
    RTRIM(a.SETNAME) AS SetName,
    –***deals with nodes without descriptions
    ISNULL(b.DESCRIPT, RTRIM(a.SETNAME)) AS Descript,
    CASE WHEN b.DESCRIPT IS NULL THEN RTRIM(a.SETNAME)
    ELSE RTRIM(a.SETNAME) + ‘ – ‘ + RTRIM(b.DESCRIPT)
    END AS DisplayName
    –***
    FROM Z_SETNODE_HIERARCHY a
    LEFT OUTER JOIN Z_CSKS_SETLEAF b ON b.SETNAME = a.SETNAME
    UNION
    SELECT NULL,
    SETNODE_ID,
    RTRIM(SETLEAFKOSTL),
    RTRIM(SETLEAFLTEXT),
    RTRIM(SETLEAFKOSTL) + ‘ – ‘ + RTRIM(SETLEAFLTEXT)
    FROM dbo.Z_CSKS_SETLEAF_SETNODE

    When I create the cube and apply the relationships as you explained, the cube built successfully. However, the hierarchies are not correct, all I can see are the hierarchy of parent IDs, and not the description. Nor do the leaf nodes appear at the bottom of the hierarchies. Can you talk about the actual DIM Cost Centre (or Element) configuration, and also confirm if the leaf member CostCentreID values should be placed in the ParentID column?

    thanks again!
    Paolo

  5. Paolo says:

    Thanks Jordi! This has been extremely helpful. Well done.

  6. jpedroalmeida says:

    awesome!thanks!

  7. […] BOM (Bill-Of-Materials) Scenario with SQL Server Analysis Services 2008 R2 […]

  8. I love your blog.. very nice colors & theme.
    Did you make this website yourself or did you hire someone to do it for you?
    Plz answer back as I’m looking to create my own blog and would like to find out where u got this from. many thanks

  9. Тhis site was… how doo you say it? Relevant!! Finally I
    have found something tht helped me. Thanks!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Join 24 other followers

Categories
%d bloggers like this: