Unlocking the Potential of Power BI (Part 2): Optimizing Datamarts

Welcome to part 2 of our Power BI Datamart series. In part 1 (Unlocking the Potential of Power BI: Building and Connecting Datamarts for Advanced Analytics), we walked through setting up Datamarts in Power BI. In this article, we'll dive into strategies for optimizing Datamarts.

From efficient report creation to enhanced data accuracy, this series is your roadmap to unlocking Power BI’s full potential. Discover practical techniques, best practices, and a systematic approach to optimizing efficiency. Our journey starts with core principles and practices, laying the foundation for a profound shift in your data landscape. Join us in our exploration towards data-driven excellence.

Many organizations find themselves navigating constraints imposed by the scope, skillset, and collective experience of their BI and Analytics teams. This often leads to the creation of BI Architectures that lack optimization and scalability, resulting in overlooked business opportunities and the prospective costs of resolving performance bottlenecks.

Our team has deep expertise developing Power BI solutions for a wide range of industries with data across numerous platforms. Our Solution Architects and Power BI Consultants understand how information flows through your organization. We know how to manage and govern it, and how to tap into it effectively. We bring best-in-class Power BI Architecture, Dashboard and Report Development, access to thought leadership, and the ability to quickly execute a Proof of Value or Proof of Concept.

Introduction

Simply creating a Datamart is not enough, and it often doesn't solve the underlying issue that prompted the Datamart in the first place. To unlock the full potential of Power BI and drive impactful outcomes, it is essential to optimize your Datamart. An optimized Datamart streamlines report creation, gives developers a curated “database” to work with, and improves data accuracy.

The Benefits of Power BI Datamart Optimization

Consider an organization that sets up a Datamart and a Dataflow together. The Dataflow contains fact tables that use a complex ETL process to ingest and transform the data, and it is refreshed several times a day. The Datamart contains several dimension tables that are updated once a day and connected to both an on-premises SQL Server and cloud-based platforms.

  1. The benefit of using a Dataflow is that it keeps the data fresh without requiring the entire dataset to be refreshed, as a fully built-out data model would.
  2. The benefit of using a Datamart is having all dimension tables contained within one data model, with table relationships already mapped, data types correctly assigned to each column, and data cleansing already applied either upstream (at the data source) or in Power Query.
  3. The overarching benefit in this example is that using the Dataflow and Datamart together optimizes the use of Power BI resources on the assigned premium capacity. Companies often set up their Power BI workspace environments so that a developer writes a query that feeds a single Power BI or paginated report and is never repurposed. Using a Dataflow/Datamart lets developers save development time, reduce cloud and Power BI resource costs, and reduce development errors. Essentially, Datamarts act as a materialized view that is available across all workspaces unless restricted to a specific group. Another benefit is that you can assign a team or group to manage the two and serve as a point of contact if something is incorrect or doesn't load properly. This makes that team responsible for the data and follows data governance best practices.

Data Modeling and ETL Best Practices

With a Datamart serving as a materialized view for others to build reports from, you will want to optimize it using best practices for data modeling and design since this is where many of the optimizations can be performed. These best practices include:

Ingest only the data that is needed:
  • Refrain from bringing in additional fields/columns, because doing so defeats the purpose of building a Datamart/Dataflow in the first place.
  • If you are filtering on a time/date field, rather than hard-coding a start date for pulling in data, use a rolling window such as the most recent 24 months of data (see the sketch below).
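For example, here is a minimal Power Query (M) sketch of a rolling 24-month filter. The server, database, table, and column names (SalesDb, Sales, OrderDate) are hypothetical placeholders for your own source.

```
let
    // Connect to a hypothetical source table that has an OrderDate column
    Source = Sql.Database("myserver.database.windows.net", "SalesDb"),
    Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data],

    // Compute the rolling window start: first day of the month, 24 months back
    Today = Date.From(DateTime.LocalNow()),
    WindowStart = Date.StartOfMonth(Date.AddMonths(Today, -24)),

    // Filter on the window instead of hard-coding a start date; this step can fold to the source
    Recent = Table.SelectRows(Sales, each [OrderDate] >= WindowStart)
in
    Recent
```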

 

Use STAR schema data modeling:
  • STAR schema denormalizes the dimension tables around a central fact table, so tables join through simple one-to-many (1:*) relationships from each dimension to the fact.
  • STAR schema is easily scalable and can accommodate new fact or dimension tables.

 

Perform data transformations upstream in Azure SQL Database, on-premises SQL Server, or Snowflake:
  • Enhanced Performance: By performing data transformations upstream, such as in the data source itself or in the ETL process, you can reduce the amount of data transferred to Power BI. This minimizes the processing overhead within Power BI and improves overall query performance.
  • Data Filtering and Reduction: Upstream transformations allow you to apply data filtering and reduction techniques at the source. This can involve removing unnecessary columns, filtering rows based on specific criteria, or aggregating data before it reaches Power BI. By reducing the data volume upfront, you optimize the query performance and enhance the user experience within Power BI.
  • Data Cleansing and Standardization: Performing data cleansing and standardization tasks upstream helps ensure that the data ingested into Power BI is accurate and consistent. This can involve tasks such as removing duplicates, correcting data inconsistencies, and enforcing data quality rules. By cleansing the data early in the process, you improve the reliability and integrity of the data in Power BI.
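Where the transformation logic cannot yet live in a source view or ETL job, one hedged option is to hand the heavy lifting to the source from Power Query with a native query. The sketch below assumes a SQL source with hypothetical server, database, and table names; the EnableFolding option of Value.NativeQuery lets later Power Query steps continue to fold on connectors that support it.

```
let
    // Connect to the source (hypothetical server and database names)
    Source = Sql.Database("myserver.database.windows.net", "SalesDb"),

    // Hand the joins, filtering, and aggregation to the source engine.
    // Ideally this logic lives in a view or the ETL process; a native query is
    // shown here only to illustrate executing the work at the source.
    Result = Value.NativeQuery(
        Source,
        "SELECT CustomerKey, Region, SUM(SalesAmount) AS SalesAmount
           FROM dbo.FactSales
          WHERE OrderDate >= DATEADD(month, -24, CAST(GETDATE() AS date))
          GROUP BY CustomerKey, Region",
        null,
        [EnableFolding = true]  // lets later Power Query steps still fold, where supported
    )
in
    Result
```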

 

Power Query data transformations are sometimes needed because you are unable to perform every transformation upstream (a minimal sketch follows this list):
  • Flexible Data Transformations: Power Query Editor provides a powerful and flexible environment for data transformations. By leveraging Power Query, you can perform complex data manipulations, such as merging multiple data sources, creating calculated columns, and applying custom transformations.
  • Iterative Data Refinement: Power Query supports iterative data refinement, enabling you to apply multiple data transformation steps in a logical and structured manner. You can progressively build complex queries, perform conditional transformations, and apply data enrichment techniques.
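As a small illustration of that flexibility, the following Power Query (M) sketch merges two hypothetical staging queries and adds a derived column in the same refinement pass; the query and column names are placeholders.

```
let
    // Hypothetical staging queries defined elsewhere in the same Datamart/Dataflow
    Customers = #"Customers Staging",
    Regions = #"Regions Staging",

    // Merge the two sources on RegionID (left outer join)
    Merged = Table.NestedJoin(
        Customers, {"RegionID"},
        Regions, {"RegionID"},
        "Region", JoinKind.LeftOuter
    ),

    // Expand only the column needed downstream
    Expanded = Table.ExpandTableColumn(Merged, "Region", {"RegionName"}),

    // Add a derived column as part of the same refinement pass
    WithFullName = Table.AddColumn(
        Expanded, "FullName",
        each [FirstName] & " " & [LastName],
        type text
    )
in
    WithFullName
```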

 

Use Power Query rather than DAX when building out calculated columns:
  • Efficiency: Power Query operates during the data loading and transformation phase, while DAX operates during the data analysis and calculation phase. The difference is in how efficiently Power BI loads and processes the data.
  • Easier Data Transformation: Power Query offers a visual interface with a wide range of built-in functions and transformations, making it easier to perform complex data transformations and calculations. It provides a user-friendly environment for data shaping, merging, and filtering, allowing for flexible and intuitive data manipulation.
  • Exceptions: It’s important to note that there are scenarios where using DAX for calculated columns may be more appropriate. For instance, when the calculations are based on measures, dynamically changing context, or require complex aggregations, utilizing DAX expressions in calculated columns may be advantageous.
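A minimal sketch of the Power Query approach is shown below, assuming a hypothetical staging query with SalesAmount, TotalCost, and Quantity columns; the values are computed once at refresh time rather than as DAX calculated columns.

```
let
    // Hypothetical staging query with SalesAmount, TotalCost, and Quantity columns
    Source = #"Sales Staging",

    // Compute gross margin once at load time instead of as a DAX calculated column
    WithMargin = Table.AddColumn(
        Source, "GrossMargin",
        each [SalesAmount] - [TotalCost],
        Currency.Type
    ),

    // Bucket order size with a conditional expression, again at load time
    WithSizeBand = Table.AddColumn(
        WithMargin, "OrderSizeBand",
        each if [Quantity] >= 100 then "Large"
             else if [Quantity] >= 10 then "Medium"
             else "Small",
        type text
    )
in
    WithSizeBand
```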

 

Selecting the correct data types reduces storage requirements, improves query performance, and enhances the UX in Power BI:
  • Choose appropriate numeric, date/time, and text data types to optimize storage and calculation efficiency in Power BI.
  • Utilize dedicated currency data types for financial values to ensure accurate handling and compatibility.
  • Reduce the number of digits stored after the decimal point by using fixed decimal (currency) or whole number types where possible.
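For instance, a minimal Power Query (M) sketch of explicit type assignment might look like the following, with hypothetical column names; Currency.Type is M's fixed decimal type.

```
let
    // Hypothetical staging query
    Source = #"Sales Staging",

    // Assign the narrowest type that still fits the data:
    //  - whole numbers for keys and counts
    //  - fixed decimal (currency) for money, which also limits digits after the decimal
    //  - date instead of datetime when the time portion is never used
    Typed = Table.TransformColumnTypes(
        Source,
        {
            {"CustomerKey", Int64.Type},
            {"Quantity", Int64.Type},
            {"SalesAmount", Currency.Type},
            {"OrderDate", type date}
        }
    )
in
    Typed
```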

 

DAX Studio is a tool that helps Power BI developers analyze where bottlenecks occur in the execution of their queries:
  • Query Optimization: You can run and analyze DAX queries against your Power BI model, making it easier to identify inefficient or slow-performing queries. By analyzing query patterns, you can optimize calculations, utilize proper indexing, and rewrite queries for improved performance.
  • Profiling and Tracing: DAX Studio offers powerful profiling and tracing capabilities that allow you to capture and analyze query events, data refreshes, and formula evaluations. This helps identify excessive calculations, evaluate data load timings, and troubleshoot issues related to slow performance or unexpected results.
  • Query Plan Visualization: DAX Studio provides visual representations of query execution plans, allowing you to understand the query flow, identify expensive operations, and optimize the execution path. This visual analysis helps in fine-tuning calculations, leveraging query folding, and optimizing data retrieval.

Performance Tuning and Optimization Techniques

  • Query Folding is a feature of Power Query that pushes your data transformation steps back to the data source, where they execute as native queries, resulting in:
    • Improved Performance: Query folding reduces the amount of data transferred from the data source to Power BI. By pushing data transformations back to the source, it minimizes data movement and processing within Power BI, resulting in faster query execution and improved overall performance.
    • Data Consistency and Accuracy: When query folding is enabled, the transformations are executed on the original data source, ensuring that the results are consistent and accurate. This helps maintain data integrity and alignment with any changes or updates in the source data.
    • Scalability: Query folding allows for scalability as it offloads the processing burden to the data source. It enables handling larger datasets and complex transformations without overwhelming Power BI’s resources, allowing for smoother data integration and analysis.
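A minimal sketch of foldable steps is shown below, with hypothetical server, database, and table names. Each step has a direct SQL translation, so the work runs at the source; you can right-click the final applied step and check “View Native Query” to confirm that folding occurred.

```
let
    // Connect directly to a relational source so steps can fold (names are hypothetical)
    Source = Sql.Database("myserver.database.windows.net", "SalesDb"),
    Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data],

    // Each of these steps translates to SQL (WHERE / SELECT) and runs at the source
    Filtered = Table.SelectRows(Orders, each [Status] = "Shipped"),
    Trimmed = Table.SelectColumns(Filtered, {"CustomerKey", "OrderDate", "SalesAmount"}),
    Renamed = Table.RenameColumns(Trimmed, {{"SalesAmount", "Amount"}})
in
    Renamed
```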

 

  • Incremental Refresh is a feature in Power BI that enables you to refresh only a portion of your imported dataset, typically new or modified data, instead of refreshing the entire dataset:
    • Faster Refreshes: Incremental Refresh reduces the time required to refresh your dataset by focusing on updating only the modified or new data. This saves processing time and resources, allowing for quicker data availability and report generation.
    • Efficient Resource Utilization: By refreshing only the incremental changes, resource consumption, such as memory and processing power, is optimized. This improves the overall performance of your Power BI environment and ensures efficient utilization of resources.
    • Scalability: Incremental Refresh facilitates handling larger datasets over time without sacrificing performance. As the dataset grows, only the new or modified data needs to be refreshed, enabling the dataset to scale while maintaining refresh efficiency.
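The filter that drives incremental refresh is defined in Power Query against the reserved RangeStart and RangeEnd datetime parameters; the refresh policy itself is configured on the table in Power BI. A minimal sketch, with hypothetical source and column names:

```
let
    // RangeStart and RangeEnd must be defined as datetime parameters in Power Query;
    // the service substitutes their values for each incremental refresh partition
    Source = Sql.Database("myserver.database.windows.net", "SalesDb"),
    Sales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],

    // Keep the comparison half-open (>= and <) so rows are not duplicated across
    // adjacent partitions; this filter should fold to the source
    Partitioned = Table.SelectRows(
        Sales,
        each [OrderDateTime] >= RangeStart and [OrderDateTime] < RangeEnd
    )
in
    Partitioned
```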

 

  • Pre-aggregated tables are summary tables that store pre-calculated aggregated data at various levels of granularity. The benefits of data optimization for pre-aggregated tables are:
    • Improved Query Performance: By leveraging pre-aggregated tables, queries can directly access summarized data instead of performing complex calculations on detailed data. This results in faster query execution, reducing response times and enhancing overall performance.
    • Reduced Data Volume: Pre-aggregated tables store aggregated data, which significantly reduces the data volume compared to the detailed transactional data. This optimization reduces memory usage, improves data load times, and enables efficient storage utilization.
    • Scalability: With pre-aggregated tables, as the dataset grows, the size of the aggregated data remains relatively constant. This scalability advantage allows for efficient handling of larger datasets while maintaining optimal query performance.
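As a hedged illustration, the following Power Query (M) sketch builds a daily, per-product summary table from a hypothetical detailed fact query; reports that only need daily totals can then read this smaller table instead of scanning transaction-level data.

```
let
    // Hypothetical query returning transaction-level fact rows
    Source = #"FactSales Staging",

    // Summarize to one row per day and product so reports that only need daily
    // totals never have to scan the detailed data
    DailyProductSales = Table.Group(
        Source,
        {"OrderDate", "ProductKey"},
        {
            {"TotalSales", each List.Sum([SalesAmount]), Currency.Type},
            {"OrderCount", each Table.RowCount(_), Int64.Type}
        }
    )
in
    DailyProductSales
```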

Conclusion

Optimizing Power BI Datamarts offers significant benefits in terms of enhanced performance, accurate data representation, and streamlined analysis. By consolidating dimension tables, implementing data modeling best practices, leveraging Power Query, selecting appropriate data types, and utilizing tools like DAX Studio, organizations can unlock the full potential of Power BI. You can also use techniques such as query folding, incremental refresh, and pre-aggregated tables to further optimize query execution and resource utilization.

Embracing these optimization strategies not only empowers organizations to achieve efficient data analysis, insightful reporting, and informed decision-making within their Power BI environment, but also leads to cost savings by minimizing cloud resource usage and Power BI premium capacity utilization.

In part 3 of this Power BI Datamarts blog series, we’ll dive into strategies for leveraging data across your organization using advanced company-wide Datamart configuration options.

We can help you optimize your Power BI environment using Datamarts and much more. Contact us today to speak with a Tail Wind expert!

About the Author:

Rick J – BI Developer:

As a Business Intelligence Developer, I work with clients to elevate and maintain their Power BI environments by optimizing their premium capacity performance, delivering company solutions built on enhanced ETL processes and architecture, and acting as an advanced issue-resolution specialist. I've managed over 3,000 workspaces as a Power BI Administrator and developed C-suite reports using cloud-based data sources. My main technology stack resides in SQL, Python, machine learning, and M (Power Query), but I've been known to dabble in PowerShell and other languages where needed.