Table of Content
In the data-driven world, every action you perform creates a record, and modern business is overwhelmed with data. Thinking to get actionable insights by evaluating the dataset, you might need a tool. Ever heard of PowerBI? PowerBI is a SaaS-based data collective platform powered by Microsoft. From the inducement of Power BI in 2014, Microsoft has provided regular updates in it and made it more useful for its consumers. It converts your raw and scattered data into interactive visualizations at one place that helps users in making business decisions. Hundreds of certified connectors are provided in Power BI to connect data from different systems and even from files.
With this post our SharePoint Developers has explored some of the most influential aspects of Power BI before we start using it
Power BI Desktop
Power BI Desktop is a free tool that you install on Windows computers to generate reports and visualizations for yourself. It supports different databases and systems to extract data. It is an extremely handy tool for data scientists and developers to transform your information into meaningful visualizations. You can prepare reports using this tool but sharing of information is not possible within the tool.
- Connect, transform, and model data.
- Connect multiple sources into one dataset and generate reports.
- A vast variety of visualizations are provided to add to your report. Additionally, you can install custom visualizations from the market or develop customized visuals for yourself.
- Choose a color theme from existing templates and also create a new one with your own set of colors.
- Setup rules for row-level security.
- Power query editor and DAX queries to shape your data.
- Performance check for visualizations.
- Support Python.
- Publish reports in Power BI service.
Power BI Service
Power BI is a cloud service platform that allows users to create interactive dashboards and share the developed reports. It supports light report editing and has a prime focus on the collaboration of teams and organizations. Most of the features of the Power BI desktop are supported in Power BI services too. Power BI service lets users share their reports and dashboards among other users who can access these reports and dashboards using a web browser or mobile apps for Windows, Android, and Apple.
Power BI service has below extensive features to explore:
- Share reports and dashboards with other users.
- Users can subscribe and set alerts on reports and dashboards.
- Setup permissions – who can view and edit your reports.
- Manage row-level security.
- Create different workspaces.
- Create Power BI apps to securely share reports and dashboards to other users in teams.
- Analyze data in excel.
- Securely share reports in other systems via embed API.
- Schedule data refresh, setup on-premise gateways for legacy data.
Few things you could not achieve using only Power BI Service and hence you need Power BI Desktop:
- Add/edit rules for row-level security.
- Create calculated columns.
- Advanced query editor.
- Python and DAX support.
- Data transformation and modeling.
Power BI Connectivity Types
Power BI provides majorly four types of connectivity depending on the data source connector you are using.
- Composite (Mixed-mode)
These different connectivity types have their benefits and limitations. You should check all types to choose the best suitable options.
Import is the most important yet common feature of Power BI where almost all data sources can be used. It is extremely important for businesses to understand all the data that is to be imported and can be stored in a PBIX file. Storing data within the file and in-memory increases the performance for retrieving, querying, and loading reports and thus makes this connectivity type the fastest among all types. If the tables are large then this application is not applicable. Since the performance is dependent on the memory and machine processor, you will feel slowness during development while using the Power BI desktop tool on a local computer with a large amount of data.
Import connectivity type provides you to use full capabilities of the Power BI desktop tool. DAX functions are fully supported by this. Also, it stores all data in memory. You have the benefit of using the full capacity of datasets as permitted by licenses of Power BI.
When you are using Import type, you will have Reporting, Data and Modelling – 3 option tabs displayed in the Power BI Desktop tool.
The next in line is the Direct query. As the name suggests, it directly fires queries of data from the source. No data is stored in the Power BI report file as shown in the Import connectivity type. DirectQuery connectivity type is available only with relational database sources. Only the metadata of the source i.e.: table name, field names, relationships, etc. will be stored in the Power BI file except for actual data. This gives the major benefit when working with large data tables to achieve 1 GB restrictions of dataset size.
When interacting with reports, it requests data as per the applied filters into the data source using the details stored in the file. Also, when data often get updates, you will get nearly real-time data with this connection type. Performance may decrease since the queries are performed in real-time and no data is stored in the file. But there are several techniques provided to minimize the queries to the source.
Since data is not stored in the Power BI file, it has certain limitations in the capabilities of the Power BI desktop. Many DAX operations were not supported till the last year. Some transformations in Power Query are also restricted i.e.: change column data types, split columns, remove duplicates, etc. Also, data returned in DirectQuery are limited to 1 million rows, unless you have a Power BI premium license.
When you are using the DirectQuery type, you will have Reporting and Modelling – 2 option tabs displayed in the Power BI Desktop tool.
Data is not stored in the Power BI file for the LiveConnection connectivity type. All the data is queried using LiveConnection in the existing analysis Services model to interact with reports. Azure Analysis Service, SQL Server Analysis Service, and Power BI datasets hosted in Power BI services can be used with this connection type. Since these sources are analytical services, performance for querying is much better than DirectQuery. Generally, for enterprise deployment, this live connection is done using Power BI.
As the data is not stored in the Power BI file, the majority of people get confused between DirectQuery and LiveConnection types, but both are so much different and cannot be used on behalf of each other. Since Analytical Services works on data, you will not have much freedom on data transformation and authoring. Only report level DAX measures are available to add. These measures will be stored in the Power BI file but cannot make changes in the Analysis Service model data.
You will have all reporting capabilities when you use LiveConnection and thus only the Reports option tab will be displayed in the Power BI desktop tool.
In the past, Developers couldn’t create connections to multiple sources using Import and DirectQuery in a single report. But now it’s possible using the Composite connection type. You can include one table from the SQL server using DirectQuery type and other as an Import connection type in the same report. This way you can include a small amount of data in the Power BI file and connect large tables with DirectQuery type.
You have seen the variations in connection types to connect data sources in Power BI. This variation gives you the power to accommodate different types and sizes of data sources at your ease.
Let’s see licensing info for Power BI, luckily they have only 2 variations in that:
Power BI service has 2 licenses: Power BI Pro – to get started with and Power BI Premium – advanced data analytics, big data support and dedicated cloud compute.
Read More about Power BI + Google Analytics = Power Analytics
Have a look at the below comparisons before choosing licenses for your use:
|Features||Power BI Pro||Power BI Premium|
|Pricing||10 $ per month per user||4995 $ instance per month|
|Included with office 365 Enterprise E5||Yes||No|
|Dedicated cloud compute and storage resources||No||Yes|
|On-premise reporting through Power BI report server||No||Yes|
|Compute Processing environment||Shared||Dedicated|
|Content deployment in multiple regions||No||Yes|
|Incremental data refresh||Yes||Yes|
|Data refreshes per day||8||48|
|Allocate compute resources||No||Yes|
|Monitor performance for compute resources||No||Yes|
|Maximum size of individual dataset||1 GB||10 GB|
|Maximum storage||10 GB Per User||100 GB|
|Data security encryption||Yes||Yes|
You may notice the supported dataset size that Power BI supports in the above table. Perhaps you feel that Power BI is not useful when you have large data to serve. But you will be surprised to know that Microsoft uses some data compression techniques called VertiPaq storage engine to minimize the size of data after importing.
VertiPaq storage engine compresses the data and reduces the size up to 10x smaller. Although there is no specific equation defined for that. It depends on how you have managed the architecture of your raw data. Usually, it’s seen that 10 GB of source data is compressed and made of 1 GB in size in Power BI reports.
Despite the VertiPaq storage engine compressing the data, it is important that you only load the required data in Power BI to minimize the source data size. Because the gradual data refreshes in the report will increase the data size. Finally, this will directly or indirectly have an effect on the performance of reports and visualization.
Data Reduction techniques
There are 8 different ideas suggested by Microsoft to reduce the data size
- Remove unnecessary columns
Microsoft recommends that you include only those columns in the model that are required for reports. Your requirements may change over time, but you should be aware that data modeling and including new columns in models is easy.
- Remove unnecessary rows
Microsoft recommends that you include only a few rows in the model that are required in reports. By careful observations, you can set filters and allow only required rows in reports, which reduces the report size and also increases report performance. For example, you only need current year sales data for your report then instead of including all year sales data, you can filter and include only this year’s sales data rows in reports.
- Group by and summarize
These charts and visualizations need summarized data, and SharePoint developers normally pull all data first in Power BI and then shape them accordingly. Instead, you can load pre-summarize data in Power BI that will reduce its size and after importing in Power BI again it will be reduced. You can shrink the dataset size which is one of the effective ways to eliminate multiple rows and columns.
- Optimize Column data types
VertiPaq storage engine has different methods for each column for compression. It is seen that numeric columns can be compressed with high margin and perhaps decreases the size of the dataset. So, it’s advisable to check and set proper data types for columns in the table after importing the data.For example, if you have one column “Lead Number” with alpha-numeric values, like “L00001, L00002, L00003…”. Power BI will detect this as a text column due to alpha-numeric values. Since the prefix of the numbers is fixed in this column, you can remove the prefix and the column will be converted to number type. For the large tables, this minor change will give a huge effect on data compression and thus in data reduction.
- Preference for custom columns
Power BI provides facilities to create custom columns in tables. VertiPaq storage engine stores custom columns just like Power Query sourced columns. These columns are less effective during data reduction and take more time during data refreshes each time. Also, it’s advisable to add custom columns via Power Query editor instead of using direct Dax queries on models because Dax query custom columns are built once all Power query tables are refreshed and it increases the refresh time.However, if you create these calculated columns in the SQL server or any other systems before importing in Power BI will reduce the calculation efforts inside the engine and increase the performance.
- Disable Power Query Load
By default, Power Query enabled in Power BI fires a report to integrate data between different systems but within the same report. To avoid the loading of the query, you can disable it in the query editor as shown in the below image
- Disable auto date/time
Power BI Desktop has one option called “Auto Date/Time” which creates new data columns for storing date instances fragmented in the year, month, and day for better filter options when it is enabled. But when there are large tables used in reports, these new columns increase the size of the dataset. You can disable this option for date-time columns on which you don’t need such filters.
- Switch to Mixed mode
If you want to determine the storage of each table, the Power BI desktop application enables you to easily bifurcate and create space in the storage. It suggests to use composite connectivity – mixed mode to get the data. Whether you can import data in Power BI dataset or you can direct queries in the source system and request data as per your need. The direct query option is very useful when there are large size tables you have and data gets constant updates in it. This mixed-mode will be helpful for summarized data that you can get via direct query option and others can reside in the dataset.
Our modern business world is continuously surrounded by data, and as mentioned every move we take records of data. A magic tool that benefits users in understanding how data works and how it eliminates unnecessary storage space using Power BI. We hope this was an insightful blog and would have helped you with how to manage your data strategically.
More Related Blog Post
BI Tools – Microsoft Power BI vs. Google Data Studio
Know more about our SharePoint Development ServicesLearn More
Build your Team
Want to Hire Skilled Developers