This extensive guide explores Power BI, a business intelligence tool, offering a comprehensive look at its interface and core functionalities. It walks users through report creation, beginning with understanding the canvas, ribbon, and panes for filters, visualizations, and data. The text progresses to data importation from various sources, data cleaning using Power Query Editor, and dashboard construction with diverse visualizations like bar charts, column charts, and scatter plots. Furthermore, it covers advanced topics such as DAX (Data Analysis Expressions) for complex calculations, creating data models with fact and dimensional tables, and using parameters for interactive dashboards. The guide concludes with advice on sharing dashboards and best practices for effective data presentation.
Power BI Desktop: Interface and Fundamentals
The Power BI interface, primarily referring to the Power BI Desktop application, is designed for data analysis and dashboard creation, drawing inspiration from car dashboards for quick insights. It has a distinct layout and terminology compared to tools like Excel.
Key components of the Power BI interface include:
The Ribbon: Located at the top of the Power BI Desktop application, similar to other Microsoft products, it contains various tabs like Home, Insert, Modeling, View, Optimize, and Help, each offering different functionalities.
Home Tab: Primarily used for getting and editing data queries, connecting to various data sources like Excel workbooks, SQL Server, text files, and the internet. It also includes options to transform data, which opens the Power Query Editor, and to refresh queries.
Insert Tab: Allows users to insert new visuals, text boxes, shapes, and buttons into a report.
Modeling Tab: Used for creating measures, calculated columns, tables, and parameters, often utilizing the DAX language. It also includes options for managing relationships between tables.
View Tab: Enables changes to the report’s appearance, such as color themes (e.g., dark mode, light theme) and layout options. It also controls the visibility of various panes.
Optimize Tab: Contains tools like the Performance Analyzer to inspect and identify bottlenecks in report loading or cross-highlighting.
Help Tab: Provides access to help resources, though external chatbots like Gemini or ChatGPT are often recommended for more practical assistance.
Views: Located on the left-hand side, Power BI Desktop offers several views:
Report View: This is the primary area where users build their dashboards.
Table View: Allows users to view and inspect their loaded data in a tabular format, similar to a spreadsheet. It also enables formatting of data types and decimal places for columns.
Model View: Displays the data model, showing all loaded tables and the relationships between them. This view is crucial for understanding how different tables interact.
DAX Query View: A newer view that allows users to write and execute DAX queries to analyze data and define measures. It can also generate column statistics.
Panes: Located on the right-hand side, these provide interactive elements for report creation and data manipulation:
Filters Pane: Used to apply filters to visuals, specific pages, or all pages in a report.
Visualizations Pane: This is where users select different chart types (e.g., bar charts, line charts, pie charts, maps) and configure their properties, including axes, legends, and field wells. It also allows for formatting visuals, adding analytics features like trend lines, and toggling data labels.
Data Pane: Displays the data model, showing tables, columns, and measures that can be dragged into visuals.
Other Panes: Includes Bookmark Pane, Selection Pane, Performance Analyzer, and Sync Slicers, which are covered in more advanced lessons.
Canvas: The central area of the report view where dashboards are built and visuals are placed. Unlike Excel’s “worksheets,” Power BI reports consist of multiple “pages”.
Initial Setup and Terminology Differences: Power BI Desktop is available for free from the Microsoft Store. Upon opening, users can start with a blank report. The application may prompt users about features like dark mode, though the source recommends the light theme for tutorials due to contrast. Power BI refers to its files as “reports” and the individual tabs within a report as “pages,” differentiating them from Excel’s “workbooks” and “sheets”.
Interaction and Navigation: Users interact with the interface by selecting visuals, dragging fields between panes, and utilizing the various options on the ribbon. Navigation between pages can be done through page tabs at the bottom or by implementing buttons and bookmarks for more dynamic interaction.
The Power BI Service, a cloud-based platform, complements the Desktop application by allowing users to publish and share dashboards with co-workers or to the web, ensuring a single source of truth for data. However, advanced sharing features in the Power BI Service often require a Power BI Pro license.
Power BI: Power Query and DAX for Data Mastery
Data manipulation in Power BI is a crucial process, primarily handled through two powerful tools: Power Query for data extraction, transformation, and loading (ETL), and DAX (Data Analysis Expressions) for creating calculated data within the data model.
Data Manipulation with Power Query
Power Query is described as an ETL tool that allows users to extract data from various sources, transform it, and then load it into Power BI for visualization. It provides a graphical user interface (GUI) for performing these transformations without extensive coding, though it operates on a specialized language called M.
Accessing Power Query Editor: The Power Query Editor can be accessed from the “Home” tab in Power BI Desktop by selecting “Transform data”. This opens a separate window with its own ribbon, data view area, queries pane, and query settings pane.
Key Functionalities and Interface:
Connecting to Data Sources: Power Query supports hundreds of data sources, categorized broadly into files (Excel, CSV, PDF, text), databases (SQL Server, BigQuery), cloud services (Salesforce, Snowflake), and web sources. Users can directly import data or choose to “Transform data” to open the Power Query Editor first.
Folder Connections: A common use case is combining multiple files (e.g., monthly Excel sheets) from a single folder into one table. This can be done by connecting to a “Folder” source and then using the “Combine and Load” or “Combine and Transform Data” options.
Web Sources: Data from web pages, particularly tables, can be easily imported by pasting the URL.
Database Connections: Power Query can connect to various databases, requiring credentials and allowing for optional SQL statements to extract specific subsets of data. When connecting to databases, users choose between “Import mode” (loads all data into the Power BI file, faster performance, larger file size) and “DirectQuery” (data remains in the source, smaller file size, slower performance, limited DAX functionality). The source recommends using Import mode where possible for better performance and full functionality.
Power Query Editor Interface and Analysis:
Ribbon Tabs: The editor has tabs like “Home,” “Transform,” and “Add Column,” each offering different functionalities.
Queries Pane: Lists all loaded queries (tables).
Applied Steps: This pane on the right tracks every transformation applied to the data. Users can review, modify, or delete steps, allowing for iterative and non-destructive data cleaning. Each step generates M language code.
Formula Bar: Displays the M language code for the currently selected step.
Data View Area: Shows a preview of the data after the applied transformations.
Column Profiling (View Tab): The “View” tab offers features like “Column Profile,” “Column Distribution,” and “Column Quality” to inspect data, identify unique/distinct values, errors, and empty cells. This helps in understanding data quality and guiding transformations. Column profiling can be set to the top 1,000 rows or the entire data set.
Common Data Transformations in Power Query:
Data Type Conversion: Easily change data types (e.g., text to date/time, whole number to decimal). The editor asks if you want to replace the current step or add a new one.
Removing/Choosing Columns: Users can remove unnecessary columns or select specific columns to keep using “Remove Columns” or “Choose Columns”.
Replacing Values: Replace specific text or characters within a column (e.g., removing prefixes like “via” or cleaning up extraneous spaces).
Trimming/Formatting Text: “Format” options allow for changing case (uppercase, lowercase), and “Trim” removes leading and trailing whitespace.
Splitting Columns: Columns can be split by a delimiter into new columns or into new rows, which is particularly useful for handling multi-valued fields within a single cell.
Unpivoting Columns: Transforms columns into attribute-value pairs, useful when data is in a “pivot table” format and needs to be normalized.
Adding Custom Columns: Create new columns based on existing ones using formulas or conditional logic.
Standard Transformations (Add Column Tab): Perform mathematical operations like multiplication (e.g., calculating yearly salary from hourly pay).
Column from Example: Users provide examples of the desired output, and Power Query infers the M language code to generate the new column. This can be more intuitive for complex text manipulations or bucketing.
Conditional Columns: Create new columns based on “if-then-else” logic, similar to Excel’s IF function.
Custom Column (M Language): For more complex scenarios, users can write M language code directly to define new columns. AI chatbots like ChatGPT or Gemini can assist in generating this M language code.
Appending Queries: Combines rows from multiple tables with similar structures (same columns) by stacking them on top of each other. This is useful for consolidating data from different periods or sources.
Merging Queries: Combines columns from two or more tables based on matching values in common columns, similar to SQL joins. Different “Join Kinds” determine which rows are included (e.g., Left Outer, Right Outer, Inner, Full Outer, Left Anti, Right Anti). This is crucial for building star schemas by linking fact tables to dimensional tables.
Grouping Data (“Group By”): Aggregates data based on one or more columns, allowing for calculations like counts or sums for distinct groups, similar to pivot tables in Excel.
M Language: The underlying functional programming language that powers Power Query. Every action taken in the GUI translates into M code, which can be viewed and edited in the “Advanced Editor”. Understanding M can help with troubleshooting and advanced transformations. AI chatbots are recommended for assistance with M language queries.
Data Manipulation with DAX (Data Analysis Expressions)
DAX is a formula language used after data is loaded into the Power BI data model. Unlike Power Query, which focuses on data preparation, DAX focuses on creating new calculations and enriching the data model.
Key Functionalities:
Calculated Columns: New columns added directly to a table in the data model using DAX formulas. These calculations are performed during data import or refresh and are stored as part of the model. While possible, Power Query’s custom columns are generally preferred for efficiency and better compression.
Examples include creating an adjusted salary column or a combined yearly/hourly salary column.
Calculated Tables: Entire new tables created using DAX formulas. This is useful for creating lookup tables (e.g., a distinct list of job titles) or date dimension tables.
The CALENDAR and CALENDARAUTO functions are specifically mentioned for creating date tables. The ADDCOLUMNS function can be used to add columns like year, month, or weekday name to a calculated table.
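As a minimal sketch (table and column names here are assumptions, not from the source data model), a lookup table and a basic date table might be defined as:

Job Title Dim = DISTINCT('Job Postings Fact'[Job Title])

Date Dim = ADDCOLUMNS(CALENDARAUTO(), "Year", YEAR([Date]))  // CALENDARAUTO scans the model for its date range

CALENDAR(DATE(2023, 1, 1), DATE(2024, 12, 31)) could replace CALENDARAUTO to fix the range explicitly.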
Explicit Measures: Unlike implicit measures (automatically generated by dragging fields), explicit measures are explicitly defined using DAX formulas. They are highly recommended for complex calculations, ensuring reusability, and maintaining a “single source of truth” for calculations across a report. Measures are calculated at “query runtime” (when a visualization is built) and are not stored in the table directly.
Examples include Job Count, Median Yearly Salary, Skill Count, and Skills per Job.
DIVIDE function: A safer way to perform division, handling divide-by-zero errors.
CALCULATE function: One of the most powerful DAX functions, allowing expressions to be evaluated within a modified filter context. This is crucial for overriding or modifying existing filters and contexts.
ALL and ALLSELECTED functions: Used within CALCULATE to remove filters from a table or selected columns/rows, respectively, enabling calculations against totals or specific subsets.
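Putting these functions together, a minimal sketch of explicit measures (table and measure names mirror the examples above but are assumptions):

Job Count = COUNTROWS('Job Postings Fact')

Skill Count = COUNTROWS('Skills Job Dim')

Skills per Job = DIVIDE([Skill Count], [Job Count])  // returns BLANK rather than an error if Job Count is 0

Percent of All Jobs =
DIVIDE(
    [Job Count],
    CALCULATE([Job Count], ALL('Job Postings Fact'))  // ALL strips every filter, so the denominator is the grand total
)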
Parameters: While parameters are a user-facing feature, they rely on DAX to define their behavior.
Field Parameters: Allow users to dynamically switch the columns or measures displayed in a visual via a slicer. These parameters are created based on selected fields and generate DAX code.
Numeric Parameters (“What-if” Parameters): Enable users to input a numeric value (via a slider or field) that can then be used in DAX measures to perform “what-if” analysis (e.g., adjusting tax rates for take-home pay).
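The DAX that Power BI generates for these parameters looks roughly like the following (all names are illustrative):

// Field parameter: a calculated table of (display name, field, sort order) tuples
Select Category = {
    ("Job Title", NAMEOF('Job Postings Fact'[Job Title]), 0),
    ("Country", NAMEOF('Job Postings Fact'[Country]), 1)
}

// Numeric ("what-if") parameter: a value series plus a measure that reads the slicer selection
Select Deduction Rate = GENERATESERIES(0, 0.5, 0.01)
Deduction Rate Value = SELECTEDVALUE('Select Deduction Rate'[Value], 0.3)  // 0.3 is the default when nothing is selected

(When created through the UI, Power BI names the series column after the parameter itself.)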
Context in DAX: Understanding DAX requires comprehending “context,” which dictates how calculations are evaluated. There are three types, with precedence from highest to lowest:
Filter Context: Explicitly modified using DAX functions like CALCULATE.
Query Context: Determined by visual selections, relationships, and cross-filtering.
Row Context: Operates at an individual row level, typically seen in calculated columns.
Best Practices and Considerations
Power Query for Cleaning, DAX for Calculations: Generally, it is recommended to perform extensive data cleaning and transformations in Power Query before loading data into the model, as it leads to better compression, smaller file sizes, and faster data model operations. DAX is best used for creating measures and calculated fields that enrich the analysis after the data is loaded.
Star Schema: Organizing data into fact and dimensional tables (star schema) is a recommended practice for efficient data modeling and analysis, especially when dealing with complex relationships like multiple skills per job posting.
Measure Organization: Store all explicit measures in a dedicated “measures” table for better organization and ease of access.
Commenting DAX: Use comments (single-line // or multi-line /* */) to document DAX measures, improving readability and maintainability.
Data Size: Be mindful of file size implications, especially when importing large datasets or creating many calculated columns, as this can affect performance and sharing capabilities.
Power BI Data Visualization: A Comprehensive Guide
Data visualization in Power BI is a core functionality that allows users to translate raw data into insightful, interactive reports and dashboards. It is a critical skill for data and business analysts, enabling them to communicate data-driven insights effectively.
Power BI Desktop and Its Interface for Visualization
The primary tool for creating visualizations is Power BI Desktop, a free application. When building reports, users interact with several key components:
Ribbon: Located at the top, it contains various tabs like “Home,” “Insert,” “Modeling,” “View,” “Optimize,” and “Help,” which offer tools for data manipulation and visualization.
Views: Power BI Desktop offers different views:
Report View: This is the central canvas where dashboards are built by adding and arranging visuals. Pages within a report are analogous to worksheets in Excel.
Table View: Allows users to inspect and verify the loaded data, view all values, and perform basic formatting like changing data types or currency formats.
Model View: Displays the data model, including tables, columns, measures, and, crucially, relationships between tables. This view helps in understanding how different tables interact.
DAX Query View: A newer feature that allows users to write and execute DAX queries to evaluate measures or view column statistics. It can assist in troubleshooting DAX formulas.
Panes: Located on the right-hand side, these panes are essential for building and refining visuals:
Filters Pane: Used to apply filters at the visual, page, or all-page level, controlling which data is displayed.
Visualizations Pane: Contains a gallery of available chart types and options to format selected visuals.
Data Pane: Shows the data model, listing all loaded tables, their columns, and measures, allowing users to drag fields into visual wells.
Bookmark Pane: Manages bookmarks, which capture specific states of a report page (filters, visible visuals).
Selection Pane: Controls the visibility and order of elements on the canvas, useful for managing layers in design.
Performance Analyzer: Helps identify bottlenecks and slow-performing visuals by recording the time taken for interactions.
Sync Slicers Pane: Manages the synchronization of slicer selections across different report pages.
Canvas: The central area where visuals are added, arranged, and interacted with.
Chart Types and Their Applications
Power BI offers a wide range of built-in visuals, and understanding when to use each is crucial.
Column and Bar Charts:
Stacked Bar/Column Chart: Compares values across categories, with segments of bars/columns representing proportions of a whole.
Clustered Bar/Column Chart: Compares values across multiple categories side-by-side.
100% Stacked Bar/Column Chart: Similar to stacked charts but shows the proportion of each segment relative to 100%, useful for visualizing percentages.
Often used for showing distributions or comparisons of categorical data, such as “what are the top data jobs” or “what are the types of data jobs”. Columns run vertically, bars horizontally.
Line and Area Charts:
Line Chart: Ideal for showing trends over time, such as “what is the trend of jobs in 2024”. Trend lines can be added for further analysis.
Stacked Area Chart: Shows trends over time while also indicating the composition of a total, useful for breaking down categories over time.
100% Stacked Area Chart: Displays the proportion of categories over time, emphasizing their relative contribution to a total.
Combo Chart (Line and Stacked Column/Clustered Column Chart): Combines columns and lines to compare different measures, like yearly vs. hourly median salary.
Pie and Donut Charts:
Represent proportions of a whole.
Donut Charts: Similar to pie charts but with a hole in the middle.
Recommended for use with only “two to three values” to maintain readability. Examples include “what portion of postings don’t mention a degree” or “what portion of job postings are work from home”.
Tree Maps:
Display hierarchical data as a set of nested rectangles. The size of the rectangle corresponds to the value.
Good for showing breakdowns and can be used to filter other visuals when clicked. Example: “what are the type of data jobs” (e.g., full-time, contractor).
Scatter Plots:
Show the relationship between two numerical values, revealing trends or correlations.
Example: “hourly versus yearly salary of data jobs”. Trend lines can be added.
Maps:
Map Visual: Displays geographical data as dots or bubbles on a map, with bubble size often representing a measure like job count. Can include legends for categorical breakdowns (e.g., degree mentioned). Requires enabling in security settings.
Filled Map: Colors regions on a map based on a measure or category. The source finds it the “most useless” of the map visuals, since giving every region its own distinct color offers limited insight.
ArcGIS for Power BI Map: Offers advanced mapping capabilities, allowing for color-coding based on values. However, sharing reports with this visual requires an ArcGIS subscription.
Uncommon Charts:
Ribbon Chart: Shows rank over time, with ribbons connecting values. Can be visually cluttered with too many categories.
Waterfall Chart: Illustrates how an initial value is affected by a series of positive and negative changes, common in finance. Requires specific data formatting.
Funnel Chart: Visualizes stages in a sequential process, showing conversion rates or progression.
Tables and Matrices:
Table: Displays data in rows and columns, similar to a spreadsheet. Useful for showing detailed information and allowing users to export data.
Matrix: Functions like an Excel pivot table, allowing for hierarchical aggregation and drill-down capabilities.
Both support Conditional Formatting (background color, font color, data bars, icons, web URLs) to highlight patterns.
Sparklines can be added to matrices to show trends within individual cells.
Cards:
Display single key metrics or KPIs, typically placed prominently at the top of a dashboard.
Card (original): Simple display of a single value.
Card (new): Preferred because it can display multiple values and offers a more intuitive layout and title placement.
Gauge Card: Visualizes a single value against a target or range, showing progress or performance (e.g., median salary with min/max/average).
Multi-row Card: Displays multiple values across several rows, useful for listing several key figures.
KPI Card: Shows a key performance indicator, often with a trend line and color-coding (green/red) based on performance against a target.
Interactive Elements
Power BI enhances interactivity through:
Slicers: Allow users to filter data dynamically by making selections.
Styles: Vertical list, tile buttons, or dropdown.
Selection: Single select (radio buttons) or multi-select (holding Ctrl/Cmd). “Show select all” option can be enabled.
Types: Can be used for categorical data (e.g., job title), numerical ranges (e.g., salary), or date ranges (e.g., “between” dates, “relative date/time”).
Search: Can be enabled for large lists of values.
Sync Slicers: Allows a single slicer’s selection to apply across multiple report pages.
Buttons: Can be configured to perform various actions.
Page Navigation: Navigate to different report pages.
Q&A Button: Provides a tooltip to guide users on how to interact (e.g., “press Ctrl while clicking a button”).
Clear All Slicers: Resets all slicers on a page or report, providing an intuitive way to clear filters.
Apply All Slicers: Delays filtering until the button is clicked, useful for large datasets to improve performance.
Bookmark Actions: Activate specific bookmarks.
Bookmarks: Capture the current state of a report page, including applied filters, visible visuals, and visual properties. They allow users to quickly switch between different views or hide/show elements.
Can be set to preserve data (filters) or display (visual visibility) properties.
Drill Through: Enables users to navigate from one report page to another, passing filter context based on a selected data point. For example, clicking on a job title in one report can show a detailed view for only that job title on a drill-through page. A “back arrow” button is automatically added for navigation.
Formatting and Design Principles
Effective visualization in Power BI extends beyond just selecting chart types to thoughtful design and formatting.
Titles and Labels: Descriptive titles and clear labels are crucial for guiding the user’s understanding.
Coloring: Use color palettes consistently and strategically to draw attention to key insights. Avoid excessive or distracting colors. Dark mode themes are an option.
Font and Size: Adjust font sizes for readability.
Decimal Places and Display Units: Format numerical values appropriately (e.g., currency, thousands).
Gridlines: Often removed to reduce visual clutter.
Tooltips: Enhance interactivity by displaying additional information when hovering over data points.
Borders and Shadows: Can be used to group related visuals and add visual appeal.
Backgrounds: Can be made transparent for visuals to sit on custom backgrounds.
Edit Interactions: Control how visuals interact with each other when filtered or highlighted.
Dashboard Design Best Practices:
Problem-solving and Audience Focus: Always design with a clear problem and target audience in mind.
Simplicity: Keep designs simple and avoid overwhelming users with too many visuals or colors.
Symmetry and Layout: Symmetrical layouts, often with KPIs at the top and related visuals below, can improve intuitiveness.
Visual Cues: Use background shapes or grouping to create visual cues that associate related visuals and parameters.
Performance Analyzer: A tool to check the loading times of visuals and identify bottlenecks in report performance.
Overall, data visualization in Power BI is a comprehensive process that involves selecting appropriate visuals, applying detailed formatting, and incorporating interactive elements, all while adhering to best practices for effective dashboard design.
DAX: Power BI’s Calculation Engine
DAX (Data Analysis Expressions) is a powerful formula language used in Power BI for performing calculations on data that has already been loaded into the data model. It is distinct from M language, which is a programming language used in Power Query for data manipulation and transformation before data is loaded into Power BI.
Purpose and Usage of DAX
DAX allows users to add calculations to their data models, enabling more in-depth analysis and dynamic reporting. It is not exclusive to Power BI and can also be used in other Microsoft tools like Microsoft Excel, Microsoft Fabric, SQL Server Analysis Services, and Azure Analysis Services. DAX is particularly effective for performing calculations on large datasets.
Comparison with Excel Functions
DAX functions share a similar syntax with Excel functions, but they operate differently. While Excel functions typically operate on a single cell or a range of cells, DAX can perform calculations on single rows, entire columns, or even whole tables. For instance, the SUM function in DAX is similar to Excel’s SUM, but in DAX, you typically insert a column name rather than a cell or range.
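For instance, where Excel would total a range with =SUM(B2:B100), the DAX equivalent references an entire column (the table and column names below are assumptions):

Total Salary = SUM('Job Postings Fact'[Salary Year Avg])  // aggregates the whole column, however many rows it holds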
Comparison with M Language
DAX is a formula language (like SUM, AVERAGE), whereas M language is a more verbose programming language. Functions and structures in DAX are not interchangeable with those in M language; for example, M combines text with its Text.Combine function, while DAX uses functions like CONCATENATE (or the & operator).
Types of DAX Functions and Their Applications
DAX offers a wide range of functions categorized into:
Aggregation Functions: Such as AVERAGE, COUNT, MAX, MIN, and SUM.
Date and Time Functions: Including those for extracting day, minute, or month, and functions like CALENDAR and CALENDARAUTO for creating date tables.
Logical Functions: For operations like IF, AND, or OR statements.
Math and Trig Functions: For mathematical calculations.
DAX can be applied in Power BI using four primary methods:
Calculated Columns:
Calculated columns add new columns to an existing table in the data model.
They are computed immediately upon data import and are visible in both the data and report views.
Example: Creating a salary hour adjusted V2 column by multiplying salary hour average by 2080 (40 hours/week * 52 weeks/year). Another example is salary year and hour V2, which uses salary year average when present and falls back to salary hour adjusted V2 when it is null (both are sketched in DAX below).
Recommendation: While possible, it is generally recommended to perform data transformations and create new columns in Power Query using custom columns instead of DAX calculated columns. Power Query processes data before loading, leading to more efficient compression, smaller file sizes, and quicker data model operations. It also keeps all data cleaning in one centralized place.
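For reference, the two calculated columns described above might be written as follows (a sketch; column names are assumed):

Salary Hour Adjusted V2 = 'Job Postings Fact'[Salary Hour Avg] * 2080

Salary Year and Hour V2 =
IF(
    ISBLANK('Job Postings Fact'[Salary Year Avg]),  // no yearly figure: fall back to the adjusted hourly one
    'Job Postings Fact'[Salary Hour Adjusted V2],
    'Job Postings Fact'[Salary Year Avg]
)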
Calculated Tables:
Calculated tables create entirely new tables within the data model based on DAX expressions.
They are useful for creating lookup tables (e.g., job title dim using the DISTINCT function to get unique job titles) or date tables.
Example: Date Dimension Table: A date dim table can be created using CALENDAR (specifying start and end dates) or CALENDARAUTO (which automatically detects dates from the model). Additional columns like year, month number, month name, weekday name, week number, and weekday number can be added using functions like YEAR, MONTH, FORMAT, and WEEKNUM (see the sketch after this list).
Date tables can be marked as such in Power BI to enable automatic date-related functionalities. Sorting columns (e.g., weekday name by weekday number) helps ensure correct visual order.
Recommendation: Similar to calculated columns, creating and cleaning tables is often more beneficial to do in Power Query.
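Sketched in DAX, the date dimension described above might look like this (the date range is an assumption):

Date Dim =
ADDCOLUMNS(
    CALENDAR(DATE(2023, 1, 1), DATE(2024, 12, 31)),  // one row per calendar date in the range
    "Year", YEAR([Date]),
    "Month Number", MONTH([Date]),
    "Month Name", FORMAT([Date], "MMM"),
    "Weekday Name", FORMAT([Date], "ddd"),
    "Weekday Number", WEEKDAY([Date]),
    "Week Number", WEEKNUM([Date])
)

Marking the result as a date table and sorting Weekday Name by Weekday Number, as noted above, keeps visuals in calendar order.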
Explicit Measures:
Measures are dynamic calculations that are not computed until they are queried (e.g., when a visual is built). They are not visible in the table view.
They provide a “single source of truth” for calculations across different reports, preventing inconsistencies that can arise from implicit measures (where aggregation is chosen directly in a visual).
Creation: Measures are defined with a name followed by an equals sign and a DAX formula (e.g., Job Count = COUNTROWS('Job Postings Fact')).
Organization: Best practice is to create a dedicated table (e.g., _Measures) to store all explicit measures, improving organization.
Examples:
Job Count: Calculates the total number of job postings using COUNTROWS.
Median Yearly Salary: Calculates the median yearly salary using the MEDIAN function. Measures can be pre-formatted (e.g., currency, decimal places).
Skill Count: Counts the total number of skills for job postings using COUNTROWS('Skills Job Dim').
Skills Per Job: Calculates the ratio of Skill Count to Job Count using the DIVIDE function for safe division.
Job Percent: Calculates the percentage likelihood of a skill being in a job posting, demonstrating the CALCULATE and ALLSELECTED functions to manage filter context.
Median Yearly Take-Home Pay: Uses a numeric parameter to deduct a user-defined tax rate from the median yearly salary.
Commenting: Measures should be commented using // for single-line comments or /* … */ for multi-line comments to document their purpose and logic.
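Combining the last few points, a commented measure might read as follows (a sketch; the table name is an assumption):

/*
Job Percent: likelihood that a skill appears in a job posting.
ALLSELECTED keeps slicer choices but removes the visual's per-skill filter.
*/
Job Percent =
DIVIDE(
    [Job Count],  // postings for the current skill in the visual
    CALCULATE([Job Count], ALLSELECTED('Skills Job Dim'))  // postings across every skill in the current selection
)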
Parameters (using DAX):
Parameters allow end-users to dynamically change inputs in a chart without needing to modify the underlying DAX code.
Field Parameters: Enable users to dynamically switch between different columns or measures on an axis of a visual.
Example: A select category parameter can let users switch the Y-axis of a chart between Job Title, Country, Skills, or Company. A select measure parameter can switch between Median Yearly Salary and Job Count on the X-axis.
Numeric Parameters: Allow for “what-if” analysis by providing a slider or input field for numerical values.
Example: A select deduction rate parameter allows users to adjust a tax rate (e.g., from 0% to 50%) to see its impact on “take-home pay” calculations.
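Wired into a measure, the deduction-rate parameter might feed the take-home calculation like this (a sketch; Deduction Rate Value is assumed to be the measure that reads the parameter slicer):

Median Yearly Take-Home Pay =
[Median Yearly Salary] * (1 - [Deduction Rate Value])  // Deduction Rate Value reads the slicer via SELECTEDVALUE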
Context in DAX
Understanding evaluation contexts is crucial for complex DAX calculations:
Row Context (Lowest Precedence): Refers to the current row a calculation is being applied to. Calculations in calculated columns typically operate at the row context level. The RELATEDTABLE function can be used to count related rows for the current row context.
Query Context: Determines which rows from a table are included in a calculation based on visual selections, relationships, slicers, and cross-filtering. This is an abstract context derived from the visual itself.
Filter Context (Highest Precedence): Applied on top of query and row contexts. It can explicitly modify the calculation environment, overriding other contexts. The CALCULATE function is a powerful tool used to explicitly modify filter context. The ALL and ALLSELECTED functions can remove existing filters from columns or tables within a CALCULATE expression.
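To contrast row and filter context in code (a sketch with assumed table and column names): the first formula is a calculated column, evaluated once per row of the fact table; the second is a measure whose filter context is explicitly overridden.

Skills on This Posting = COUNTROWS(RELATEDTABLE('Skills Job Dim'))  // row context: counts the skills related to the current posting

Job Count All Countries = CALCULATE([Job Count], ALL('Job Postings Fact'[Country]))  // filter context: ignores any Country filter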
DAX Query View
The DAX query view in Power BI Desktop allows users to write and execute DAX queries to evaluate measures or view column statistics. It can also be used to define and evaluate measures, and even update the data model. While it requires some DAX knowledge, it can be assisted by quick queries for basic evaluations.
Learning and Troubleshooting DAX
For learning and troubleshooting DAX, the source recommends consulting official DAX documentation and utilizing AI chatbots like Google Gemini or ChatGPT, which can provide step-by-step instructions and code for DAX formulas. Additional courses on DAX are also recommended for deeper learning.
Power BI Dashboard Design and Sharing Guide
Dashboard creation, particularly using Power BI, involves a structured approach that prioritizes understanding the user’s needs, careful planning, and effective utilization of Power BI’s features for data visualization and interaction.
What is a Dashboard?
Analytical dashboards are inspired by car dashboards, providing users with quick insights at a glance. They consolidate key information and visuals to help users understand data and identify patterns or anomalies efficiently.
Tools for Dashboard Creation
Power BI Desktop is a free and popular business intelligence tool specifically designed for creating dashboards. While Excel can be used to build dashboards, it comes with limitations regarding data manipulation, formula complexity for interactive elements, and sharing, which Power BI aims to solve. Power BI is noted as the second most popular BI tool and is gaining popularity over competitors like Tableau.
Power BI Ecosystem for Dashboard Creation and Sharing
The Power BI ecosystem consists primarily of two parts:
Power BI Desktop (App): This is the application where dashboards are built. It’s free to install and allows users to load data, build reports (which contain multiple pages, unlike Excel’s worksheets), and design visualizations.
Power BI Service: This is a cloud-based platform accessible via an internet browser, designed for sharing dashboards. Dashboards published to the Power BI Service can be accessed by co-workers within shared workspaces, or even published to the web for public access if the data is not confidential. While there is a free option, it is very limited; a Power BI Pro license (paid) is often needed for sharing and collaboration. Microsoft Fabric is also an umbrella platform that consolidates various data tools, including Power BI.
Best Practices for Dashboard Design
To create effective dashboards that users will actually utilize, consider the following:
Define the Problem and Audience: Always ask: “What problem are we trying to solve with this dashboard?” and “Who are we designing this dashboard for?”. Dashboards are ineffective if they don’t address the specific concerns or problems of the end consumer.
Simplicity and Clarity: Avoid overwhelming dashboards with too many visuals or distracting colors. Simple color palettes help guide the user’s eye to important information.
Key Performance Indicators (KPIs): Place cards displaying key metrics (KPIs) prominently at the top of the dashboard, as they provide immediate value and draw attention.
Symmetry and Layout: A symmetrical layout, often with KPIs at the top and equally spaced graphs below, can improve readability and intuitiveness. Visual cues like backgrounds and boxes can group related elements and draw attention.
Interactivity: Incorporate features that allow users to interact with the data, such as slicers, buttons, and drill-through options.
Planning and Rough Drafting
Before building, it’s recommended to sketch out a rough design of the dashboard, or at least rough draft it within Power BI itself. This allows for early feedback from stakeholders and helps ensure the design aligns with the intended purpose.
Steps in Dashboard Creation (Power BI Desktop)
Start a New Page: Create a dedicated page for your dashboard.
Add a Title: Insert a text box for the dashboard title, formatting it appropriately for size and boldness.
Insert Slicers:
Slicers enable users to interactively filter data.
Types include vertical list, tile, and dropdown.
Enable search functionality for long lists.
Allow multi-select (default with Ctrl/Cmd) or enforce single-select.
The “Show select all” option is useful.
Date and numeric slicers (between, before, after, relative) can be added, though some date slicer types may have known bugs.
Slicers can be synchronized across multiple pages using the “Sync slicers” pane.
A “Clear all slicers” button can be added for user convenience, often styled with visual cues like shadows and rounded corners. An “Apply all slicers” button can be useful for very large datasets to control refresh performance.
Add Cards (KPIs):
Use card visuals (e.g., “Card (new)”) to display single, prominent data points like “Job Count,” “Median Yearly Salary,” or “Skills Per Job”.
New card visuals can display multiple fields.
Format callout values, labels, and remove borders as needed.
Other card types like Gauge cards (showing min, max, target values) and Multi-row cards are available. KPI cards show a value with a trend and color-coding based on goals.
Insert Charts/Visualizations:
Choose appropriate chart types (e.g., bar charts for comparison, line charts for trends over time, scatter plots for relationships, tree maps for hierarchical breakdown).
Formatting: Adjust axes (labels, values, ranges), legends, titles, and data labels for clarity.
Conditional Formatting: Use data bars, background colors, or icons to highlight specific values based on conditions. This helps draw the user’s attention.
Trend Lines: Add trend lines to visualize patterns in data, especially in line charts or scatter plots.
Matrices and Tables: These are useful for displaying detailed data and can include conditional formatting and sparklines (mini-charts within cells) for quick trends.
Implement Drill-through: This advanced feature allows users to right-click on a visual and navigate to a separate, detailed page filtered by their selection. A dedicated button can also be created for drill-through.
Use Parameters:
Field Parameters: Allow end-users to dynamically switch columns or measures displayed in a visual (e.g., changing a chart’s axis from “Job Title” to “Country” or “Skill”).
Numeric Parameters: Enable “what-if” analysis by allowing users to adjust numerical inputs (e.g., a tax deduction rate) via a slider, which then affects calculations in visuals.
Add Backgrounds and Organize Visually: Insert shapes (e.g., rounded rectangles) behind visuals to create visual groupings and a cohesive design. Set visual backgrounds to transparent to reveal these background shapes.
Hide Header Icons: Turn off header icons on visuals by making their transparency 100% to clean up the design.
Save Frequently: Power BI Desktop does not have an autosave feature, so frequent saving is crucial to prevent data loss.
Data Preparation for Dashboards
Effective dashboards rely on well-prepared data.
Power Query (M Language): Used for Extract, Transform, Load (ETL) operations before data is loaded into the Power BI data model. It’s recommended for data cleaning, shaping, and creating new columns or tables (e.g., combining data from multiple files in a folder, unpivoting data, cleaning text). Power Query transformations lead to more efficient data compression and smaller file sizes.
DAX (Data Analysis Expressions): A formula language used after data is loaded into the data model to add calculations. It is used for creating calculated columns, calculated tables, and explicit measures. While calculated columns and tables can be created with DAX, it’s generally recommended to do data transformations in Power Query for better performance and organization.
Explicit Measures: Dynamic calculations that are computed at query runtime (e.g., when a visual is built), providing a “single source of truth” for consistent calculations across reports. They are preferred over implicit measures (automatic aggregations) for complexity and control. Measures can be organized in a dedicated table and thoroughly commented for documentation.
Context in DAX: Understanding row context (individual row calculation), query context (visual/filter selection), and filter context (explicit modification, highest precedence) is crucial for complex DAX calculations.
Sharing Dashboards
After creation, dashboards can be shared in several ways:
Power BI File (.pbix): The dashboard file can be directly shared, but the recipient needs Power BI Desktop to open it, and version control can be an issue.
Power BI Service: Publishing to the Power BI Service allows for centralized access, sharing with specific groups (workspaces), and embedding reports (e.g., into websites). Admin settings may be required to enable features like “Publish to Web”.
GitHub: An online repository to store project files, including the Power BI file and a “readme” document that explains the project, showcases skills, and can link directly to the interactive dashboard in the Power BI Service. This method allows for version control and provides a professional portfolio for showcasing work.
LinkedIn: Projects hosted on platforms like GitHub or the Power BI Service can be linked and showcased on LinkedIn profiles, or shared directly via posts, to gain visibility and potential career opportunities.
Power BI for Data Analytics – Full Course for Beginners
This document serves as a transcript for a video tutorial focused on Microsoft Power BI, a business intelligence tool. The tutorial, led by Kevin, explains how to download and install Power BI, import data from various sources like Excel spreadsheets and the web, and transform that data for analysis. It then guides users through creating various visualizations such as bar charts, line charts, and maps, and demonstrates how to interact with and slice the data within the reports. Finally, the document covers customizing the report’s appearance and the process of saving and publishing the report for sharing and collaboration within the Power BI service.
Power BI: From Data to Insightful Reports
Microsoft Power BI is a tool used to gain insights from data. It was utilized at Microsoft to analyze business performance and make decisions based on that performance. Power BI Desktop is entirely free to download and install, regardless of whether you have an enterprise or commercial account.
The general workflow for using Power BI, as introduced in a tutorial, involves:
Downloading and installing Power BI.
Importing sample data.
Creating visualizations and reports.
Saving, publishing, and sharing these reports with others.
This overview serves as a “101” or introduction to Power BI.
Installation Methods
The easiest and recommended way to install Power BI is by clicking the “download free” button, which opens the Microsoft Store to the Power BI download page. Benefits of installing via the Microsoft Store include automatic updates, quicker downloads of only changed components, and the ability for any user (not just an admin) to install it. Alternatively, you can click “see download or language options” to download an executable (.EXE) file and install it manually, though this method does not use the Microsoft Store.
Getting Started and Interface
After installation, you can launch Power BI, which first displays a welcome screen. The most crucial initial step is to “get data,” as visualizations cannot be created without it. The welcome screen also shows recent data sources and previously created reports for quick access. Power BI offers training content, including videos and tutorials, to help users get up to speed.
The main interface of Power BI Desktop includes several views:
Report View: This is the default view, a blank canvas where visuals like charts, tables, or maps are created. On the right side, there are “fields” (all available data columns) and “visuals” (different types of visuals that can be built) panes.
Data View: Clicking this option displays a spreadsheet-like view of all imported and transformed data.
Model View: This view shows the relationships between different data tables. For example, if two tables are joined based on a common field like “country name,” a line will connect them, highlighting the relationship when hovered over.
Data Import and Transformation
Power BI can pull data from an extensive list of sources, including Excel spreadsheets, SQL databases, web sources (like Wikipedia articles), and Kusto queries. For example, data can be imported from an Excel spreadsheet containing revenue, cost, and profit data, along with details like country, product, sales, and dates. Additionally, data from the web, such as a Wikipedia article listing countries and their populations, can be pulled in.
Data transformation is a key step, allowing users to modify and select data before it’s brought into Power BI. This process opens the Power Query editor, where data is “shaped” and a data model is built. Examples of transformations include:
Filtering out specific data, such as removing “Fortune cookies” from product analysis. These filtered steps can also be undone.
Changing data types, like converting “units sold” from decimals to whole numbers.
Renaming columns for conciseness, such as changing “month name” to “month”.
Removing unnecessary columns, like “percent of world population,” “date,” “source,” or “rank” from imported web data.
Filtering rows to include only relevant data, such as specific countries where a company has locations (e.g., Canada, France, Germany, Mexico, United States).
Replacing values within columns, like removing an extra “D” from “United StatesD”.
Connecting Data Sources
Independent data tables can be connected or joined. This is done using the “merge queries” function, allowing tables to be linked based on common fields, such as “country name” between cookie sales data and country populations data. This enables the association of data from one source (e.g., population) with another (e.g., cookie sales).
Creating and Formatting Visualizations
After data is loaded and modeled, visualizations can be created on the report canvas. Users can insert a text box to add a title to the report. To create a visual, users can simply click on a data field (e.g., “profit” and “date”) and Power BI will suggest a default chart type (e.g., a bar chart). This can then be changed to another type, such as a line chart for profit by date. Other common visualizations include:
Map visualization: Automatically inserted when country data is selected, showing locations and allowing profit data to be displayed on the map, with dot sizes indicating profit levels. Can be switched to a treemap to show profit by country hierarchy.
Table: Allows presentation of data like country, population, and units sold in a structured format.
Bar chart: Used to show sales or profit by product, easily illustrating which products generate the most profit.
Visualizations can be formatted by clicking on the “format” option (paint roller icon) in the visualization pane. This allows adjustment of various elements, such as increasing title text size, to match company branding or preference. Reports can also have multiple pages.
Slicing and Sharing Data
Power BI reports allow for easy data slicing (filtering). A “slicer” visual can be added to a report, where users can select specific categories (e.g., country name) to filter all other visuals on the page. Clicking directly on elements within other visuals, such as a country on a map or in a table, can also serve as a quick way to slice the data.
Once a report is complete, it can be saved. The “power” of Power BI comes from its ability to share reports with others. Reports are published to the Power BI service (powerbi.com). From there, the report can be opened in the Power BI service, where it can still be filtered. The share dialog allows granting access to specific individuals via email, setting permissions (like allowing sharing or creating new content based on datasets), and sending email notifications.
Power BI: Data Transformation and Modeling with Power Query
Data transformation in Power BI is a crucial step that allows users to modify and select data before it is loaded into the Power BI environment. This process is carried out in the Power Query editor, where data is “shaped” and a data model is built.
Here are the key aspects and examples of data transformation discussed:
Purpose of Transformation
It enables users to modify their data and choose exactly what data they want to bring into Power BI.
It helps in building a structured data model suitable for analysis and visualization.
Accessing the Power Query Editor
After selecting data from a source (e.g., an Excel spreadsheet), users can choose “Transform data” instead of “Load” to open the Power Query editor.
Common Transformation Actions
Filtering Data: Users can filter out specific rows or values that are not relevant to the analysis. For example, a product line like “Fortune cookies” might be removed from the analysis if it’s not profitable or is distracting from other products. These filtered steps can also be undone later if needed.
Changing Data Types: Data types can be adjusted to ensure accuracy and usability. For instance, “units sold” might be changed from decimal numbers to whole numbers if fractional sales don’t make sense.
Renaming Columns: Columns can be renamed for conciseness or clarity, such as changing “month name” to simply “month”.
Removing Unnecessary Columns: Columns that are not needed for the analysis can be removed, such as “percent of world population,” “date,” “source,” or “rank” from a web-imported dataset.
Filtering Rows to Specific Subsets: Users can filter down rows to include only relevant data, such as selecting only countries where a company has locations (e.g., Canada, France, Germany, Mexico, United States).
Replacing Values: Specific values within columns can be replaced to correct inconsistencies, like removing an extra “D” from “United StatesD”.
Tracking Transformations (Applied Steps)
As changes are made in the Power Query editor, each transformation is recorded in a section called “applied steps” on the right-hand side of the interface. This allows users to see all the modifications made to the data and also provides the option to remove a step if it was made unintentionally.
Connecting Independent Data Sources (Merging Queries)
Power BI allows users to connect or join independent data tables, such as linking cookie sales data with country population data from a Wikipedia article.
This is done using the “merge queries” function, where tables are joined based on a common field (e.g., “country name”).
The “Model View” in Power BI Desktop visually represents these relationships between data tables, showing lines connecting tables that are joined.
Once all transformations are complete and the data model is built, users click “close and apply” to load the refined data into Power BI, ready for report creation.
Power BI: Crafting Interactive Reports and Visualizations
After data transformation and modeling, Power BI Desktop provides a Report View, which serves as a blank canvas where users create and arrange various visuals such as charts, tables, or maps. This blank area is referred to as the report editor.
On the right side of the Power BI Desktop interface, there are two key panes that facilitate report visualization:
Fields Pane: This pane displays all available data columns (called fields) from the imported and transformed data. Users can drag and drop these fields onto the canvas or select them to build visuals.
Visuals Pane: Located to the left of the fields pane, this section offers various types of visuals that can be built using the data.
Here’s a breakdown of how report visualization works:
Creating Visualizations
Starting a Visual: To create a visual, users can simply click on relevant data fields in the “fields” pane, such as “profit” and “date”.
Default Suggestions: Power BI often predicts and inserts a default chart type that it deems most likely suitable for the selected data, like a bar chart for profit by date.
Changing Visual Types: Users can easily change the chart type from the “visualizations” pane if the default doesn’t align with their needs (e.g., switching a bar chart to a line chart for profit by date).
Defining Visual Elements: The visualizations pane also allows users to define different elements of the chart, such as what fields serve as the axis, values, or legend.
Examples of Visualizations:
Text Box: Can be inserted to add a title to the report, providing context (e.g., “Kevin Cookie Company performance report”).
Line Chart: Useful for showing trends over time, such as profit by date.
Map Visualization: Automatically inserted when geographical data like “country” is selected. It shows locations with dots, and profit data can be dragged onto the map to represent profit levels by dot size.
Treemap: An alternative to the map view, it can display hierarchical data like profit by country, illustrating which country had the most or least profit.
Table: Allows presentation of data in a structured, spreadsheet-like format, such as country, population, and units sold. Users can drag and drop fields into the table.
Bar Chart: Used to show comparisons, such as sales or profit by product, clearly indicating top-performing products.
Formatting and Appearance
Themes: The “View” tab in the ribbon provides different themes (e.g., “executive” theme) that can be applied to change the overall look and feel of the report, including color schemes, to make it appear more professional.
Individual Visual Formatting: Each visual can be formatted individually by clicking on the “format” option (represented by a paint roller icon) within the visualization pane. This allows users to adjust elements like title text size or other visual properties to match company branding or preference.
Multiple Pages: Reports can span multiple pages, allowing for comprehensive data presentation.
Slicing and Interacting with Data
Slicer Visual: A “slicer” visual can be added to the report, typically based on a categorical field like “country name”. Selecting a specific category in the slicer will filter all other visuals on the page to reflect only that selection.
Direct Interaction with Visuals: Users can also slice data by directly clicking on elements within other visuals, such as clicking on a country on a map or in a table. This provides a quick way to filter the entire report based on that selection. Clicking a blank area or re-clicking a selection can undo the filter.
Saving and Sharing Reports
Once a report with visualizations is complete, it can be saved locally. The “power” of Power BI is realized when reports are published to the Power BI service (powerbi.com), enabling sharing and collaboration. In the Power BI service, reports remain interactive and can still be filtered. The share dialog allows users to grant access to specific individuals via email, set permissions (e.g., allowing sharing or creating new content based on datasets), and send email notifications.
Power BI: Collaborative Data Sharing Essentials
Data sharing in Power BI is a fundamental aspect that unlocks the full potential of the platform, moving beyond individual analysis to collaborative insights. While reports can be created and saved locally for personal use, the true “power” of Power BI lies in its ability to enable collaboration and allow others to interact with the created visualizations.
Here’s a discussion on data sharing:
Purpose of Sharing: The primary goal of sharing is to allow other individuals to view and interact with the visualizations and reports you’ve created. This facilitates collective analysis and decision-making based on the data.
The Sharing Process:
Local Saving: After creating a report and its visualizations, it is initially saved locally on your desktop as a .pbix file. At this stage, it can be used for individual analysis.
Publishing to Power BI Service: To share the report, it must first be “published”. This is done by navigating to the “file” menu and selecting the “publish” option, then choosing “publish to Power BI”.
Power BI Service (powerbi.com): The Power BI service is the online platform where all published reports are housed. Once published successfully, the report becomes accessible on powerbi.com. Reports opened in the Power BI service remain interactive, allowing users to filter data just as they would in the Power BI desktop application.
Sharing Options and Permissions:
From the Power BI service, you can click on the “share” button, typically found in the top right-hand corner.
This opens a “share dialog” that provides various options for granting access.
You can grant access to specific individuals by entering their email addresses.
Crucially, you can define permissions for those you share with:
You can allow recipients to share the report with others.
You can enable them to create new content based on the underlying datasets.
An option to send an email notification to the recipients is also available, which can include any changes made to the report.
Power BI Report Customization Guide
Report customization in Power BI allows users to refine the appearance and layout of their reports to enhance clarity, professionalism, and alignment with specific branding or preferences. This process goes beyond merely creating visualizations and focuses on making the report aesthetically pleasing and user-friendly.
Key aspects of report customization include:
Adding Contextual Elements:
Titles: Users can insert text boxes to add a main title to the report, providing immediate context (e.g., “Kevin Cookie Company performance report”). These titles can be resized and positioned to span the entire report.
Formatting Visuals:
Changing Chart Types: While Power BI often suggests a default chart type (e.g., bar chart) for selected data, users can easily switch to other visual types (e.g., line chart, treemap, map, table, bar chart) from the “visualizations” pane to better represent their data.
Defining Visual Elements: Within the visualization pane, users can explicitly define what fields should serve as the axis, values, or legend for a chart. They can also add secondary values.
Individual Visual Formatting: Each visual can be formatted independently. By selecting a visual and clicking on the “format” option (represented by a paint roller icon) in the visualizations pane, users can adjust various elements. For instance, the title text size of a visual can be increased to make it stand out. This allows users to match the visuals to their company’s brand, look, and feel.
Applying Themes:
Power BI provides different themes (e.g., “executive” theme) under the “View” tab on the ribbon. Applying a theme changes the overall color scheme and appearance of the report, contributing to a more professional look.
Organizing Layout:
Users can drag and drop visuals around the report editor (the blank canvas) to organize them as desired.
Reports are not limited to a single page; users can add multiple pages to their report to accommodate extensive data and different views. Pages can also be renamed.
By leveraging these customization features, users can transform raw data visualizations into polished, insightful reports that effectively communicate their findings. Once satisfied with the customization, the report can be saved locally and then published to the Power BI service for sharing.
How to use Microsoft Power BI – Tutorial for Beginners
This comprehensive guide provides an in-depth look into Power BI, a powerful business intelligence tool from Microsoft. It details the step-by-step process of installing and utilizing Power BI Desktop, covering essential data manipulation techniques such as text, numerical, date, and time transformations. The guide further explores advanced concepts like merging and appending queries, managing data relationships through primary and foreign keys, and understanding different cardinalities. Finally, it concludes with a focus on data visualization, demonstrating the creation of various charts and filters, and the process of publishing dashboards to the Power BI service.
Mastering Power BI: Data Analysis and Visualization
Power BI, developed by Microsoft, is a powerful business analytics tool designed for analyzing and visualizing data in insightful and interactive ways. It has gained popularity due to its user-friendly interface and robust features. Power BI is suitable for business analysts, data analysts, data scientists, or anyone who wants to work efficiently with data, providing necessary skills and knowledge to become proficient in data handling.
Key Capabilities and Features
Power BI allows users to transform, clean, analyze, and visualize data. It enables effortless data gathering from various platforms, including Excel, CSV files, different databases like MySQL, Postgres, Oracle, or other datasets. It is noted for its strong visualization capabilities, offering a wide range of charts such as bar plots, pie charts, and stack plots. Unlike Excel, Power BI has the capacity to work with large datasets and offers numerous deployment options. The end result of working with Power BI is often the creation of interactive and visually appealing dashboards.
Installation and Interface
To install Power BI Desktop for Windows, users typically download the executable file from Microsoft’s website. Once installed, its user interface is very similar to Excel, making it easy for Excel users to adapt. Power BI also offers tutorials, blogs, and forums for support. While desktop usage is common, Power BI reports can also be created and viewed on mobile phones. A company domain email address is generally required for login, though free business emails can be created for this purpose.
Data Handling and Transformation
Power BI provides various data connectors to import data from diverse sources. These include:
Files: Excel workbooks, Text/CSV files, XML, JSON, and PDF. Data can also be pulled from folders.
Databases: SQL Server, Oracle, Postgres, MySQL, and other databases.
Power Platform: Existing datasets loaded in Power Platform can be accessed.
Cloud Services (Azure): Azure SQL Database and other Azure options are available.
Online Services: Google Analytics, GitHub, LinkedIn Sales Navigator, and many more.
Other: Data can be scraped from the web, or connected to Hadoop, Spark, R scripts, and Python scripts.
Power BI offers extensive tools for data transformation:
Text Tools: Used for text manipulations like converting to lower/upper case, trimming whitespace, replacing values, combining values (concatenate), finding specific text, formatting text, and extracting specific parts of text using delimiters (e.g., username from an email address). These tools can either transform the existing column or add a new column with the transformed data.
Numerical Tools: Used for mathematical operations, statistics (maximum, median, average, standard deviation, count), rounding values, and applying filters. These can be applied by adding a new column or transforming an existing one.
Date and Time Tools: Essential for analyzing time-based patterns, such as identifying peak order times or days. They allow extraction of year, month, day, age calculations, and conversion of time formats (e.g., 24-hour to 12-hour). Regional settings may need adjustment for proper date parsing.
Pivoting and Unpivoting: These techniques allow converting rows to columns (pivoting) and columns to rows (unpivoting) to restructure data for easier analysis.
Conditional Columns: New columns can be created based on specified conditions, similar to conditional statements in programming.
Creating Tables: Users can manually create tables within Power BI by entering data directly.
DAX (Data Analysis Expressions)
DAX is a collection of functions, operators, and constants used in Power BI to create new data or transform existing data.
Purpose: DAX is used to calculate complex formulas, create measures, develop time intelligence calculations, and dynamically or statically analyze data.
Calculated Columns vs. Measures:
Calculated Columns: Create a new column in the data model, adding static data that consumes memory and updates when new data is added. They work row by row.
Measures: Dynamically calculate values at runtime, primarily for aggregations like sum, count, or average, and are used to create visual reports. They do not consume memory for each row. Measures can be implicit (automatically created by Power BI) or explicit (user-defined).
DAX Functions: Broadly categorized into:
Date and Time: Work on date-related calculations (e.g., NOW, YEAR, WEEKDAY).
Text Functions: Manipulate text strings (e.g., CONCATENATE, FIND, FORMAT, LEFT, LEN, LOWER, REPLACE, RIGHT, TRIM, UPPER).
Informative Functions: Provide information about data types and handle errors (e.g., IFERROR, ISBLANK).
Filter Functions: Filter data based on conditions (e.g., FILTER, CALCULATETABLE).
Math and Trigonometric Functions: Perform mathematical calculations (e.g., ABS, SIN, COS, TAN).
Statistical Functions: Used for statistical calculations (e.g., percentile, standard deviation).
Financial Functions: Aid in financial computations.
DAX Syntax: Typically involves a column name, an equals sign, a function, and then references to table and column names (e.g., ColumnName = Function(TableName[ColumnName])).
Operators: Used in DAX formulas for various purposes:
Arithmetic: +, -, *, / for mathematical operations.
Logical: AND, OR, NOT for combining or negating conditions.
Concatenation: & for joining text from multiple columns.
Reference: TableName[ColumnName] for referencing specific columns.
Parentheses: () for controlling execution order of formulas.
Miscellaneous: the colon (:) for separating elements in date and time values.
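A minimal sketch tying this syntax and these operators together (table and column names are hypothetical, not taken from the tutorial):

// Calculated column: evaluated row by row and stored in the model
Full Name = 'Customer'[First Name] & " " & 'Customer'[Last Name]
// Measure: aggregates at query time under the current filter context
Total Revenue = SUM('Sales'[Revenue])
// Parentheses control the order of arithmetic operations
Margin = ('Sales'[Price] - 'Sales'[Cost]) / 'Sales'[Price]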
Data Modeling and Relationships
Data modeling is crucial for connecting different tables and sources of data within Power BI, especially in companies with diverse datasets (e.g., product, sales, customer details).
Merge and Append Queries:
Merge: Combines two tables based on a common key (like a primary key and foreign key), increasing the number of columns, similar to SQL joins (inner, left, right, full, anti-joins).
Append: Stacks rows from multiple tables with similar columns into one table, increasing the number of rows.
Keys:
Primary Key: A unique identifier for each record in a table (e.g., product ID, Aadhaar card number).
Foreign Key: A column in one table that refers to the primary key in another table, allowing for duplicate values.
Cardinality: Describes the nature of the relationship between two tables based on primary and foreign keys.
One-to-one (1:1): Both tables have unique primary keys related to each other.
One-to-many (1:*): One table has a primary key, and the other has a foreign key that can be repeated multiple times.
Many-to-one (*:1): The reverse of one-to-many, where the foreign key is on the “many” side and the primary key is on the “one” side.
Many-to-many (*:*): Both tables have foreign keys that can be repeated.
Cross-Filter Direction: Defines the flow of data filtering between related tables (single or double direction).
Managing Relationships: Power BI can automatically detect relationships. Users can manually manage and edit these relationships, including setting cardinality and cross-filter direction, and activating/deactivating multiple relationships between tables.
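Once a relationship is active, DAX formulas can traverse it. For example, RELATED pulls a value from the “one” side of a one-to-many relationship into a calculated column on the “many” side; a sketch with hypothetical table names:

// On the Sales (many) side, fetch the matching product name from the Product (one) side
Product Name = RELATED('Product'[Product Name])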
Data Visualization
Visualization is a critical step in Power BI, revealing patterns and trends that are not apparent in raw row and column data.
Dashboard Elements: The report section is where visuals are built using fields (columns from tables) that can be dragged and dropped.
Visual Types: Power BI offers a wide array of built-in visuals:
Charts: Stacked bar, stacked column, clustered bar, clustered column, line, area, pie, scatter, donut, funnel, map, tree map.
Matrices: Powerful tools for visualizing data across different parameters and dimensions, allowing drill-down into subcategories.
Cards: Number cards (for highlighting single large numbers) and multi-row cards (for multiple pieces of information).
KPI Visuals: Show key performance indicators, often with trend lines, useful for comparing current and past performance.
Custom Visuals: Users can import additional visuals from the Power BI marketplace (e.g., boxplot, flow map, calendar).
Formatting and Customization: Visuals can be extensively formatted, including changing font size, colors, titles, background, borders, data labels, and themes.
Filtering:
Filter Pane: Allows applying filters on a specific visual, on the current page, or across all pages. Advanced filtering options like “greater than” or “less than” are available.
Slicers: Interactive tools for filtering data across the entire dashboard or different pages. They can display data as lists, dropdowns, or ranges (e.g., date sliders).
Sync Slicers: Allows the same filter to be applied consistently across multiple pages.
Interactivity Tools:
Buttons: Can be added to navigate between pages or trigger other actions.
Bookmarks: Capture the current state of a report page (e.g., filters applied, visuals visible) allowing users to return to that view.
Images: Can be inserted for branding (e.g., logos) or icons.
Publishing and Sharing
Once a dashboard is complete, it can be published to the Power BI service, which typically requires a user to be signed in. Published reports retain their interactivity and can be viewed online, shared with co-workers, or even published to the web without security if desired. Power BI also allows creating a mobile layout for dashboards, optimizing them for phone viewing.
Power BI: Data Analysis from Gathering to Visualization
Data analysis is a critical process for extracting insights and patterns from raw data to inform decision-making, and Power BI serves as a powerful business analytics tool to facilitate this. It involves several key steps, from data gathering and cleaning to sophisticated analysis and visualization.
The Role of a Data Analyst
A data analyst’s primary responsibility is to gather, interpret, process, and clean data, ultimately representing it in a graphical format. This graphical representation allows business strategists to understand the information better and use it to grow their business. Power BI is designed to provide the necessary skills and knowledge to become proficient in working efficiently with data.
Key Steps in Data Analysis using Power BI
Data Gathering (Data Connectors): Power BI offers extensive data connectors that allow users to effortlessly gather data from various platforms. These sources include:
Files: Excel workbooks, Text/CSV files, XML, JSON, and PDF. Data can also be pulled from folders.
Databases: SQL Server, Oracle, Postgres, and MySQL are among many databases from which data can be extracted.
Power Platform: Existing datasets loaded in Power Platform can be directly accessed.
Cloud Services (Azure): Azure SQL Database and other Azure options enable data retrieval from the cloud.
Online Services: Google Analytics, GitHub repositories, and LinkedIn Sales Navigator are examples of online services that can connect to Power BI.
Other: Data can be obtained by scraping the web, or by connecting to Hadoop, Spark, R scripts, and Python scripts.
Data Transformation and Cleaning: Once data is gathered, Power BI provides robust tools for cleaning and processing it. This includes:
Text Tools: Used for manipulations such as converting text to lower or upper case, trimming whitespace, replacing values, combining values (concatenate), finding specific text, formatting text, and extracting parts of text using delimiters (e.g., username from an email address). These tools can either transform an existing column or add a new one with the transformed data.
Numerical Tools: Applicable for mathematical operations, statistics (maximum, median, average, standard deviation, count), rounding values, and applying filters. Like text tools, they can transform existing columns or create new ones.
Date and Time Tools: Essential for analyzing time-based patterns (e.g., peak order times or days). They allow extraction of year, month, day, and age calculations, and conversion of time formats (e.g., 24-hour to 12-hour). Regional settings may need adjustment for proper date parsing.
Pivoting and Unpivoting: These techniques allow restructuring data by converting rows to columns (pivoting) or columns to rows (unpivoting) for easier analysis.
Conditional Columns: New columns can be created based on specified conditions, similar to conditional statements in programming.
Creating Tables: Users can manually create tables within Power BI by entering data directly.
Data Analysis Expressions (DAX): DAX is a collection of functions, operators, and constants used in Power BI to create new data or transform existing data.
Purpose: DAX is used to calculate complex formulas, create measures, develop time intelligence calculations, and dynamically or statically analyze data.
Calculated Columns vs. Measures:
Calculated Columns: Create a new column in the data model, adding static data that consumes memory and updates when new data is added. They work row by row.
Measures: Dynamically calculate values at runtime, primarily for aggregations like sum, count, or average, and are used to create visual reports. They do not consume memory for each row. Measures can be implicit (automatically created by Power BI) or explicit (user-defined).
DAX Functions: Broadly categorized into Date and Time, Text, Informative, Filter, Aggregation, Time Intelligence, Logical, Math and Trigonometric, Statistical, and Financial functions.
DAX Syntax: Typically involves a column name, an equals sign, a function, and then references to table and column names (e.g., ColumnName = Function(TableName[ColumnName])).
Operators: Used in DAX formulas, including arithmetic (+, -, *, /), comparison (>, <, =, >=, <=, <>), logical (AND, OR, NOT), concatenation (&), reference (TableName[ColumnName]), and parentheses () for controlling execution order.
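As an illustration of how filter logic modifies an aggregation, a measure might wrap SUM in CALCULATE with a condition; a minimal sketch with hypothetical names:

// Sum of Amount, restricted to rows where Amount exceeds 100
High Value Sales = CALCULATE(SUM('Sales'[Amount]), 'Sales'[Amount] > 100)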
Data Modeling and Relationships: Data modeling is crucial for connecting different tables and sources, especially in companies with diverse datasets (e.g., product, sales, customer details).
Merge and Append Queries:
Merge: Combines two tables based on a common key, increasing the number of columns, similar to SQL joins (inner, left, right, full, anti-joins).
Append: Stacks rows from multiple tables with similar columns into one table, increasing the number of rows.
Keys: Primary keys are unique identifiers, while foreign keys can be duplicated and refer to a primary key in another table.
Cardinality: Describes the relationship type between tables (one-to-one, one-to-many, many-to-one, many-to-many).
Cross-Filter Direction: Defines the flow of data filtering between related tables (single or double direction).
Managing Relationships: Power BI can automatically detect relationships, and users can manually manage and edit them, including setting cardinality and cross-filter direction.
Data Visualization: Visualization is a critical step in data analysis within Power BI, as it reveals patterns and trends not apparent in raw row and column data.
Dashboard Elements: Visuals are built in the report section by dragging and dropping fields (columns from tables).
Visual Types: Power BI offers a wide range of built-in visuals, including stacked bar, stacked column, clustered bar, clustered column, line, area, pie, scatter, donut, funnel, map, tree map, matrices, cards (number and multi-row), and KPI visuals. Users can also import custom visuals from the Power BI marketplace.
Formatting and Customization: Visuals can be extensively formatted, including changing font size, colors, titles, background, borders, data labels, and themes.
Filtering: Filters can be applied via the filter pane (on specific visuals, pages, or all pages) or interactive slicers (displaying data as lists, dropdowns, or ranges). Slicers can also be synced across multiple pages.
Interactivity Tools: Buttons can be added for page navigation or other actions, and bookmarks capture report states to allow users to return to specific views. Images can be inserted for branding or icons.
Publishing and Sharing: Completed dashboards can be published to Power BI service, requiring login, to be viewed online, shared with co-workers, or published to the web without security. Power BI also supports creating mobile layouts for dashboards, optimizing them for phone viewing.
Power BI: Mastering Data Visualization and Reporting
Data visualization is a crucial step in data analysis, transforming raw data into insightful and interactive visual representations to reveal patterns and trends that are not apparent in simple rows and columns. Power BI, a business analytics tool developed by Microsoft, is designed to facilitate this process, offering powerful features for visualizing data.
The Importance of Data Visualization
Visualizing data helps users see new things and discover patterns that might otherwise be missed. When data is presented in a graphical format, business strategists can better understand the information and use it to grow their business. Power BI provides the necessary skills and knowledge to become proficient in efficiently working with and visualizing data.
Key Aspects of Data Visualization in Power BI
Report Section and Visuals:
The primary area for creating visuals in Power BI is the report section.
Users can build visuals by dragging and dropping fields (columns from tables) from the “Fields” pane on the right-hand side.
Power BI offers a user-friendly interface with a wide range of interactive and powerful features for visualization.
Types of Visuals: Power BI includes many built-in chart types and allows for the import of custom visuals:
Bar and Column Charts: Stacked bar, stacked column, clustered bar, and clustered column charts are available for comparing values across categories.
Line and Area Charts: Used to show trends over time or categories.
Pie and Donut Charts: Represent parts of a whole. A donut chart can become a pie chart by reducing its inner radius to zero.
Scatter Plot: Displays relationships between two numerical variables.
Funnel Chart: Shows stages in a linear process.
Maps: Allows visualization of data geographically, using locations like countries or continents. Bubbles on the map can represent values, with their size corresponding to a measure like population. A “flow map” visual can also be imported to show destinations and origins or flows between regions.
Tree Maps: Display hierarchical data in a set of nested rectangles, where the size of each rectangle is proportional to its value. An existing chart, like a donut chart, can easily be converted into a tree map.
Matrices: A powerful tool for visualizing data on different parameters and dimensions, allowing for hierarchical drilling down from categories (e.g., continents) to subcategories (e.g., countries).
Cards: Used to highlight specific numeric information or text.
Number Cards: Display a single large number, such as total population or average values.
Multi-row Cards: Show multiple pieces of information, like sum of population, average life expectancy, and average GDP, in one visual.
Text Cards: Display textual information, such as the top-performing category based on an order quantity filter.
KPI (Key Performance Indicator) Visuals: Allow for showing performance metrics, often with a trend graph in the background, like the sum of population over time or company profit/loss.
Slicers: Interactive filtering tools that allow users to filter data across the entire dashboard or specific pages. Slicers can display data as a list, a dropdown, or a range slider (e.g., for years). They can also be synchronized across multiple pages.
Tables: Simple tabular representations of data.
Custom Visuals: Users can import additional visuals from the Power BI marketplace (AppSource) to enhance their dashboards.
Formatting and Customization: Power BI provides extensive options for customizing the appearance of visuals and dashboards:
Canvas Settings: Users can change the background color or add images to the canvas background to match a particular theme. Transparency can also be adjusted.
Themes: Different built-in themes are available, and users can also create their own custom themes.
Gridlines: Can be added to help arrange visuals neatly on the canvas.
Object Locking: Visuals can be locked in place to prevent accidental movement.
Axis Formatting: Users can change font size, colors, define ranges (minimum/maximum), and customize titles for X and Y axes.
Data Labels: Can be turned on or off to display specific values directly on the chart, with customizable colors and positions.
Colors: Colors of bars, slices (in donut charts), and text can be customized. Conditional formatting can be applied, for instance, to show a gradient of colors based on value (e.g., light blue for lowest to dark blue for highest).
Borders and Shadows: Visuals can have customizable borders and shadows to make the dashboard more interactive and visually appealing.
Spacing and Padding: Adjusting inner and outer padding for elements within charts helps control visual spacing.
Titles: Visual titles can be customized in terms of text, color, and font.
Filtering and Interactivity:
Filter Pane: Filters can be applied to individual visuals, to all visuals on a specific page, or to all visuals across all pages. Advanced filtering options include operators like “less than” or “greater than”.
Buttons: Can be added to dashboards for various actions, such as page navigation. Users can define the destination page for a button.
Bookmarks: Capture the current state of a report (including filters, sort order, and visible visuals), allowing users to return to specific views easily. Bookmarks can be linked to buttons for navigation.
Images: Logos or other icons can be added to the dashboard for branding or aesthetic purposes.
Publishing and Mobile View:
Mobile Layout: Dashboards created on desktops can be optimized for phone viewing by arranging elements within a mobile grid layout. This allows for scrolling and resizing visuals to fit mobile screens.
Publishing: Once a dashboard is complete and satisfactory, it can be published to the Power BI service for online viewing and sharing with co-workers. Reports can also be published to the web without security for public viewing.
Power BI Data Modeling: Relationships and Cardinality
Data modeling is a crucial aspect of data analysis in Power BI, particularly when dealing with information from various sources. It involves connecting different tables and managing the relationships between them to enable comprehensive and accurate data visualization and analysis.
Purpose and Importance of Data Modeling
Data modeling is essential because companies often have data stored in separate tables or databases, such as sales, product, and customer details. Creating relationships between these disparate tables allows for a unified view and accurate visualization of the data, which is vital for data analysis. Without proper data modeling, tables remain independent, and it becomes difficult to see relationships between them, leading to inaccurate or incomplete data display.
Key Concepts in Data Modeling
Primary Key: A column that contains unique values and is not repeated or duplicated within a table. For example, a product ID in a product table or an Aadhaar card number are primary keys because each is unique to a single entity.
Foreign Key: A column that can contain duplicate values and acts as a clone of a primary key from another table. For instance, a customer key in a sales table might appear multiple times if a customer buys several products, making it a foreign key, whereas the same customer key in the customer data table would be a primary key.
Relationships and Cardinality
Relationships are built between tables based on common primary and foreign keys. Power BI can automatically detect these relationships upon data load. The type of relationship between tables is known as cardinality:
One-to-One (1:1): Occurs when both tables involved in the relationship have unique primary keys in the joined columns. For example, an employee ID in an employee details table and the same employee ID in a bonus table, where both IDs are unique in their respective tables, form a one-to-one relationship.
One-to-Many (1:N): This is a common relationship where one table contains a primary key, and the related column in another table is a foreign key with multiple occurrences. An example is a product table with unique product IDs (primary key) linked to a sales table where product IDs can repeat for multiple sales (foreign key). The data flow typically goes from the ‘one’ side (primary key) to the ‘many’ side (foreign key).
Many-to-One (N:1): This is the inverse of one-to-many, where the foreign key is in the first table and the primary key is in the second.
Many-to-Many (N:N): This relationship occurs when both related columns in two tables are foreign keys, meaning values can repeat in both. It is generally advised to use this type of relationship sparingly.
Cross-Filter Direction: This refers to the direction of data flow between tables in a relationship.
Single Direction: Data flow is from the primary key side to the foreign key side (1 to Many).
Double Direction (Both): Data flow is bidirectional, allowing filtering from either side (primary key to foreign key and vice versa). This enables a third connected table to access data more easily, even if it doesn’t have a direct relationship.
Managing and Editing Relationships in Power BI
Power BI offers tools to manage and edit relationships:
Automatic Detection: Power BI can automatically detect and create relationships between tables when data is loaded, especially if common column names or keys exist.
Manual Creation: Users can manually create relationships by dragging and dropping common keys between tables in the ‘Model’ view.
Editing Relationships: Existing relationships can be edited to change their type (cardinality) or cross-filter direction. For instance, a user can modify a relationship from one-to-many to many-to-many or change its filter direction.
Activation/Deactivation: Only one active relationship can exist between two tables at any given time. If multiple potential relationships exist, others will appear as dotted lines, indicating they are deactivated. To activate a deactivated relationship, another active relationship between the same tables must be deactivated first.
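A deactivated relationship can still be used inside an individual measure through the USERELATIONSHIP function, without disturbing the active one. A sketch assuming a hypothetical Sales table whose Ship Date column has an inactive relationship to a Date table:

// Aggregate sales along the inactive Ship Date relationship for this measure only
Sales by Ship Date = CALCULATE(SUM('Sales'[Amount]), USERELATIONSHIP('Sales'[Ship Date], 'Date'[Date]))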
Proper data modeling ensures that relationships are correctly defined, leading to accurate data analysis and visualization in dashboards.
DAX Functions for Data Analysis and Power BI
DAX, which stands for Data Analysis Expressions, is a powerful functional language used in Power BI to create custom calculations for data analysis and visualization. It includes a library of functions, operators, and constants that can be used to perform dynamic aggregations and define new computed columns and measures within your data models.
Purpose and Application of DAX Functions
DAX functions are essential for transforming and analyzing data beyond what simple transformations can achieve. They allow users to:
Create calculated columns: These are new columns added to a table, where each row’s value is computed based on a DAX formula. Calculated columns are static and consume memory, updating when new data is added to the model.
Create measures: Measures are dynamic calculations that aggregate data, such as sums, averages, or counts, and are evaluated at query time, making them efficient for reporting and dashboard interactions. They do not consume memory until used in a visual.
Calculate complex formulas: DAX enables the creation of sophisticated calculations, including time intelligence calculations, to group data and derive insights.
Analyze data dynamically and statically: DAX expressions provide flexibility for various analytical needs.
Categories of DAX Functions
DAX functions are broadly categorized to handle different types of data and analytical needs:
Date and Time Functions: Used for operations on date and time data, such as extracting parts of a date (year, month, day), calculating age, or finding differences between dates. Examples include NOW(), YEAR(), WEEKDAY(), and DATEDIFF().
Text Functions: Used to manipulate text strings, such as concatenating text, changing case, trimming whitespace, or finding specific substrings. Examples include CONCATENATE(), FIND(), FORMAT(), LEFT(), RIGHT(), LEN(), LOWER(), UPPER(), REPLACE(), and TRIM().
Informative Functions: Provide information about data types or handle errors, like checking for text, even/odd numbers, or missing data. Examples include ISERROR(), ISBLANK(), and ISTEXT().
Filter Functions: Work based on specified conditions to filter data, often used with CALCULATE or FILTER to modify the evaluation context. Examples include FILTER() and ALL(); iterators such as SUMX() and COUNTX() are frequently combined with them to sum or count over a filtered table.
Aggregation Functions: Used to summarize data, such as SUM, COUNT, AVERAGE, MIN, and MAX.
Time Intelligence Functions: Specialized functions that enable calculations over time periods, essential for trend analysis.
Logical Functions: Implement conditional logic, evaluating expressions based on true/false conditions. Examples include IF(), AND(), OR(), NOT(), and SWITCH().
Math and Trigonometric Functions: Perform mathematical operations like absolute value, square root, exponents, or trigonometric calculations such as sine, cosine, and tangent. Examples include ROUNDUP(), ROUNDDOWN().
Statistical Functions: Used for statistical calculations like percentile or standard deviation.
Financial Functions: Help compute financial calculations.
Other Functions: A category for functions that don’t fit into the above, such as BLANK() or ERROR().
DAX Syntax
The general syntax for a DAX expression typically involves:
Column Name: The name of the new calculated column or measure being created.
Equals Sign (=): Indicates that the column or measure is defined by the subsequent expression.
Function: The DAX function to be used (e.g., SUM, COUNT, IF).
Table Name (optional for measures, often needed for calculated columns): Specifies the table containing the data.
Column Reference: The specific column on which the function operates, often enclosed in square brackets [].
Example: Total Price = SUM('Order Items'[Price])
Practical Examples of DAX Functions
LEN(): To find the number of digits or characters in a column, such as digit count of ID = LEN('Zomato Asia Africa'[Restaurant ID]).
LEFT() / RIGHT(): To extract a specified number of characters from the beginning or end of a text string. For instance, creating a “Short Day” column from “Day Name” using short day = LEFT('Customer Data'[Day Name], 3) to get “Thu” from “Thursday”.
LOWER() / UPPER(): To convert text in a column to lowercase or uppercase. For example, LOWER('Customer Data'[Day Name]) converts “Thu” to “thu”.
Concatenation (&): To combine values from multiple columns into one, like creating a full name: 'Customer Data'[Prefix] & " " & 'Customer Data'[First Name] & " " & 'Customer Data'[Last Name].
DATEDIFF(): To calculate the difference between two dates, useful for determining age. For example, DATEDIFF('Customers Data'[Birth Date], TODAY(), YEAR) to get age in years.
IF(): To apply conditional logic. For instance, creating a payment data column: IF('O list order payments'[Payment Value] > 100, "High Price", "Low Price").
Arithmetic Operators (+, -, *, /): Used for mathematical calculations on column values.
Comparison Operators (>, <, =, etc.): Used to compare values, yielding true/false results, often within conditional statements.
DAX functions are fundamental for performing advanced data manipulation and aggregation, enabling users to derive deeper insights from their data in Power BI.
This YouTube video tutorial covers fundamental statistical concepts for data analysis and data science. The presenter explains descriptive statistics (measures of central tendency and dispersion, graphical representations), probability (distributions, Bayes’ theorem), and inferential statistics (estimation, hypothesis testing). Various statistical tests (z-test, t-test, ANOVA, chi-squared test) are discussed, along with concepts like outliers, covariance, and correlation. The tutorial emphasizes practical applications and includes real-world examples to illustrate key ideas.
Statistics for Data Analysis and Data Science Study Guide
Quiz
What is the primary purpose of statistics in the context of data analysis and data science?
Briefly describe the difference between descriptive and inferential statistics.
What are the two main types of data based on their structure? Give an example of each.
Explain the difference between cross-sectional and time series data.
What is the difference between a population and a sample?
Name three sampling techniques used to collect data and briefly describe each one.
Why is the median sometimes a better measure of central tendency than the mean?
What do measures of dispersion tell you about a data set? Provide two examples of measures of dispersion.
What is the purpose of using a histogram and provide three examples of the different shapes they can have?
What is the difference between standardization and normalization?
Quiz Answer Key
The primary purpose of statistics in data analysis and data science is to collect, analyze, interpret, and draw meaningful conclusions from information and data to aid in decision-making. It is about extracting meaningful insights from data.
Descriptive statistics summarize and describe the main features of a dataset, such as measures of central tendency and dispersion. Inferential statistics, on the other hand, uses sample data to make inferences and predictions about a larger population.
The two main types of data based on their structure are structured data, which is organized in rows and columns (e.g., a spreadsheet), and unstructured data, which lacks a predefined format (e.g., emails, images, or videos).
Cross-sectional data is collected at a single point in time, such as data from a survey. Time series data, however, is collected over a sequence of time intervals, like daily stock prices or monthly sales figures.
A population is the entire group of individuals or items that are of interest in a study, while a sample is a subset of the population that is selected for analysis.
Three sampling techniques include: Stratified sampling, which divides the population into subgroups (strata) and randomly selects samples from each; Systematic sampling, which selects members at a regular interval from a starting point; and Random Sampling, which gives every individual in the population an equal chance of being selected.
The median is less influenced by outliers than the mean, making it a better choice when the data set contains extreme values that can skew the average.
Measures of dispersion describe the spread or variability of data points around the central tendency. Examples of dispersion include variance and standard deviation.
Histograms display the distribution of continuous data, using bins or intervals to show the frequency of values. Histograms can be symmetric, right-skewed, or left-skewed.
Standardization converts data to have a mean of zero and a standard deviation of one while preserving the original data distribution. Normalization scales all values to fall between zero and one, which is often useful in machine learning.
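In formula form (standard notation, not drawn from the source material):

z = \frac{x - \mu}{\sigma} (standardization)
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} (normalization)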
Essay Questions
Discuss the importance of understanding the different types of data and variables in statistical analysis. How does this knowledge affect the selection of appropriate statistical techniques?
Explain the concept of central tendency and dispersion in statistics. Describe different measures of each and discuss scenarios in which one measure may be preferred over another.
Describe the process of hypothesis testing, including the null and alternative hypotheses, p-values, and types of errors that can occur. Why is it important to establish statistically significant relationships?
Compare and contrast the various data visualization methods covered in the material (histograms, box plots, scatter plots). When is each visualization most appropriate?
Explain the concepts of probability distributions, especially focusing on the normal distribution and its applications in statistical analysis. How does the empirical rule relate to normal distribution?
Glossary of Key Terms
Central Tendency: A measure that represents the typical or central value of a dataset. Common measures include mean, median, and mode.
Confidence Interval: A range of values that is likely to contain a population parameter with a certain level of confidence.
Continuous Data: Data that can take any value within a given range (e.g., height, weight, temperature).
Covariance: A statistical measure of the degree to which two variables change together.
Cross-Sectional Data: Data collected at a single point in time.
Data: Facts and statistics collected together for reference or analysis.
Degrees of Freedom: The number of values in a statistical calculation that are free to vary.
Descriptive Statistics: Methods used to summarize and describe the main features of a dataset.
Discrete Data: Data that can only take specific values, often whole numbers (e.g., number of students in a class, number of cars).
Dispersion: A measure that describes the spread or variability of data points around the central tendency. Common measures include range, variance, and standard deviation.
Empirical Rule: Also known as the 68-95-99.7 rule, it describes the percentage of data within specific standard deviations from the mean in a normal distribution.
Hypothesis Testing: A statistical method used to evaluate a claim or hypothesis about a population parameter based on sample data.
Inferential Statistics: Methods used to make inferences and predictions about a population based on sample data.
Interquartile Range (IQR): A measure of dispersion calculated by subtracting the first quartile (Q1) from the third quartile (Q3).
Mean: The average value of a dataset, calculated by summing all values and dividing by the number of values.
Median: The middle value in a sorted dataset, dividing the dataset into two equal halves.
Mode: The value that appears most frequently in a dataset.
Normalization: A scaling technique that adjusts the values of data to a standard range, often between 0 and 1.
Null Hypothesis: The default statement or assumption that is tested in hypothesis testing, often indicating no effect or difference.
Outliers: Data points that are significantly different from other values in a dataset.
Population: The entire group of individuals or items that are of interest in a study.
Probability Distribution: A function that describes the likelihood of different outcomes for a random variable.
Random Variable: A variable whose value is a numerical outcome of a random phenomenon.
Sample: A subset of a population selected for analysis.
Sampling Techniques: Methods used to select a sample from a population, such as random, stratified, and systematic.
Scatter Plot: A graph that displays the relationship between two continuous variables.
Standard Deviation: A measure of the spread of data around the mean, calculated as the square root of the variance.
Standardization: A scaling technique that transforms data to have a mean of 0 and a standard deviation of 1.
Statistical Significance: A measure of the probability that an observed result is not due to random chance.
Time Series Data: Data collected over a sequence of time intervals.
Type I Error: Rejecting the null hypothesis when it is true (false positive).
Type II Error: Failing to reject the null hypothesis when it is false (false negative).
Variance: A measure of the average squared deviation of data points from the mean.
Statistics for Data Science
Briefing Document: Introduction to Statistics for Data Analysis and Data Science
Overall Theme: This document outlines a comprehensive introduction to statistics, emphasizing its importance for data analysis and data science. It covers fundamental concepts, techniques, and applications, moving from basic definitions to more advanced topics like hypothesis testing and probability distributions. The speaker aims to provide a foundational understanding suitable for both beginners and those preparing for data-related interviews.
Key Themes and Ideas:
Definition and Role of Statistics:
Statistics is a branch of mathematics involved with collecting, analyzing, interpreting, and drawing conclusions from information and data.
Quote: “Statistics is a branch of mathematics that involves collecting, analyzing, interpreting, and drawing conclusions from information and data.”
It’s essential for data analysis and is a core skill for data scientists.
Statistics helps extract meaningful information from data and aids in decision-making.
Quote: “Analyzing all the data thoroughly so that we extract some meaningful information that can help in decision-making.”
Statistics is used in everyday life, with examples such as health recommendations (e.g., dentist endorsements), probability (e.g., birthday sharing), and sales trends.
Types of Statistics:
Descriptive Statistics: Focuses on summarizing and describing data using measures like mean, median, mode, and measures of dispersion (range, variance, standard deviation).
Inferential Statistics: Uses sample data to make inferences and draw conclusions about a larger population. It involves making generalizations and predictions based on statistical analysis.
Quote: “You can make inferences in data only by using statistics.”
Types of Data:
Structured vs. Unstructured Data: Structured data is organized in rows and columns (e.g., tables, spreadsheets, databases).
Quote: “Structured data means data that has a structure, so the data can be organized in the form of rows and columns.”
Unstructured data lacks a predefined format (e.g., multimedia, text documents, emails).
Quote: “Unstructured data is multimedia content: images, audio, and video all fall under it.”
Cross-Sectional vs. Time Series Data: Cross-sectional data is collected at a single point in time (e.g., survey data, student test scores at one time).
Quote: “Data is collected at a single point of time.”
Time series data is collected over a sequence of time (e.g., daily stock prices, monthly sales data).
Quote: “Time series is the opposite of cross-sectional data: it is data that is collected over a sequence of time.”
Univariate vs. Multivariate Data: Univariate data has a single variable.
Multivariate data has two or more variables.
Types of Variables:
Nominal: Categorical data with no order (e.g., gender, colors).
Quote: “Nominal data gives you categories or labels in the data that have no particular order; an example could be gender.”
Ordinal: Categorical data with an order or sequence (e.g., education level, customer satisfaction ratings).
Quote: “You still get categories, but within these categories you get an order, a sequence, with intervals between the categories.”
Numerical: Quantitative data that represents measurements or counts.
Further divided into:
Interval: Numerical data with meaningful intervals but no true zero point (e.g., temperature in Celsius or Fahrenheit).
Ratio: Numerical data with meaningful intervals and a true zero point (e.g., height, weight, age).
Quote: “This is the difference between ratio and interval. Examples of ratio data include height, weight, and age: if we compare the ages of two people and the difference is zero, it means they are the same age.”
Population and Sample:
Population: The entire group of individuals or items that are being studied.
Quote: “The population is the entire group of individuals; suppose we need to do a study or research on all the people of India.”
Sample: A subset of the population that is used for analysis.
Quote: “We take some people from the population, perform our study and observations on them, and call that group a sample, which represents the population.”
Samples should be representative of the population.
Sampling Techniques:
Stratified Sampling: Dividing the population into subgroups (strata) based on characteristics, then taking random samples from each stratum.
Quote: “I divide the population based on some characteristic; suppose the characteristic here is gender, so I split it into males and females.”
Systematic Sampling: Selecting individuals from the population at regular intervals (e.g., every 10th person).
Quote: “In systematic sampling we follow a system to select people from the population: we start from a point and then cover every kth element.”
Measures of Central Tendency:
Mean: Average of all data points. Heavily affected by outliers.
Quote: “If we find the mean, it is 21.67. Between 21.67 and 6 you can see the difference is huge; because of an outlier, the average value that should have been six is now showing 21.67.”
Median: Middle value when data is ordered. Less influenced by outliers.
Quote: “You can calculate the median for a numerical variable, and it is less influenced by outliers.”
Mode: Most frequently occurring value. Useful for categorical data.
Quote: “The mode is the value that is repeated again and again in the data set, occurring most frequently.”
Measures of Dispersion:
Range: Difference between the maximum and minimum values.
Variance: Average of the squared differences from the mean.
Standard Deviation: Square root of the variance. Measures the spread of data around the mean.
Quote: “What is the difference between variance and standard deviation, and why is standard deviation mostly used?”
Quartiles: Divide the data into four equal parts. Q1 (25th percentile), Q2 (50th percentile, also the median), and Q3 (75th percentile).
Quote: “We divide the complete data set into four equal parts with three quartiles, so here you see Q1, Q2, and Q3.”
Percentiles: Divide the data into 100 equal parts.
Quote: “Percentiles divide the data into 100 parts; the first percentile is the value below which one percent of your data falls.”
Interquartile Range (IQR): The range between the first and third quartiles (Q3-Q1). Less sensitive to extreme values.
Quote: “If you need to calculate the interquartile range, you subtract Q1 from Q3, which gives you the middle 50% of the data; this is important when you only need the center of the distribution.”
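In standard notation, the dispersion measures above can be written as follows (population forms shown; a sample variance divides by n − 1 instead of N):

\text{Range} = x_{\max} - x_{\min}
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
\sigma = \sqrt{\sigma^2}
\text{IQR} = Q_3 - Q_1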
Frequency and Relative Frequency:
Frequency: Number of times a value occurs in a data set.
Relative Frequency: Frequency of a value divided by the total number of observations.
Data Visualization:
Histograms: Display the distribution of continuous data. Useful for identifying skewness, outliers, and central tendency.
Quote: “The histogram divides the data into bins, or intervals, which you can see on the x-axis, while the y-axis shows the frequency.”
Different shapes: Symmetric (normal), Right-Skewed, Left-Skewed.
Based on number of modes: Uni-modal, Bi-modal, Multi-modal.
Box Plots: Show the spread of data using quartiles and outliers.
Quote: “Here you get a box which represents the IQR; you can see the Q3 value and the Q1 value, so the box spans Q3 minus Q1.”
Components: Box (IQR), median line, whiskers, outliers.
Scatter Plots: Useful for visualizing the relationship between two continuous variables.
Quote: “The scatter plot is useful for visualizing the relationship between two continuous variables.”
Help identify outliers, strength, and direction of relationships.
Outliers:
Data points that are significantly different from other values.
Can skew results.
Identified through visualization and statistical methods (z-scores, IQR).
Quote: “Outliers are those data points which, compared to the normal data we have, are either much bigger or much smaller.”
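Both detection methods mentioned above can be sketched in a few lines of standard-library Python; the cutoffs (2 for z-scores, 1.5 × IQR for the fences) are common conventions, not values from the source:

```python
from statistics import mean, pstdev, quantiles

data = [6, 7, 5, 6, 8, 7, 6, 95]

m, s = mean(data), pstdev(data)
z_outliers = [x for x in data if abs((x - m) / s) > 2]

q1, _, q3 = quantiles(data, n=4)
low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
iqr_outliers = [x for x in data if x < low or x > high]

print(z_outliers, iqr_outliers)   # both rules flag 95 here
```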
Covariance and Correlation:
Covariance: Measures how two variables change together. Indicates direction (positive or negative), not the strength.
Quote: “Covariance is a statistical measure that describes how much two variables change together.”
Correlation: Measures the strength and direction of a linear relationship between two variables.
Quote: “Correlation is the standardized version of covariance, which tells you how strongly these two variables are related.”
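The contrast is easy to see numerically. A minimal sketch with invented data (statistics.covariance and statistics.correlation require Python 3.10 or later):

```python
from statistics import covariance, correlation

hours  = [1, 2, 3, 4, 5]           # e.g. hours studied
scores = [52, 58, 61, 67, 72]      # e.g. exam scores

print(covariance(hours, scores))   # positive: they move together
print(correlation(hours, scores))  # ~0.996: strong linear relationship
```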
Probability:
Probability function assigns a probability to each event in a sample space.
Calculated as favorable outcomes divided by total outcomes.
Complement of an event is the probability of all outcomes not in that event.
Quote: “The probability function is simply a function that assigns a probability to each event; we will see an example of this.”
Types of Events:
Joint Events: Events that can occur at the same time with some common outcomes.
Disjoint Events: Events that cannot occur at the same time, having no common outcomes.
Dependent Events: The occurrence of one event affects the probability of another.
Independent Events: The occurrence of one event does not affect the probability of another.
Conditional Probability:
Probability of an event given that another event has already occurred.
Uses Bayes’ Theorem.
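As a quick illustration of the standard formula P(A|B) = P(A and B) / P(B), here is a two-dice example invented for this sketch: A is “the sum is 8” and B is “the first die shows 5”:

```python
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]  # sample space

p_b       = sum(1 for i, j in outcomes if i == 5) / 36          # P(B) = 6/36
p_a_and_b = sum(1 for i, j in outcomes if i == 5 and i + j == 8) / 36

print(p_a_and_b / p_b)   # P(A|B) = (1/36) / (6/36) = 1/6
```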
Probability Distributions:
Random Variables: Outcomes of random experiments. Can be discrete (countable) or continuous (any value within a range).
Probability Mass Function (PMF): Probability distribution for discrete random variables.
Probability Density Function (PDF): Probability distribution for continuous random variables.
Quote: “The probability distribution of a discrete random variable is what we call the probability mass function.”
Binomial Distribution: Multiple Bernoulli trials (counting the number of successes in n trials).
Quote: “When the outcomes are either zero or one, they are in the form of a Bernoulli trial; we have just seen what a Bernoulli trial is, and the binomial distribution builds on such trials.”
Uniform Distribution: All values within an interval are equally likely.
Normal Distribution: Symmetrical, bell-shaped continuous probability distribution. Also known as Gaussian distribution.
Quote: “The normal distribution, also known as the Gaussian distribution, is a continuous, symmetric probability distribution characterized by a bell-shaped curve.”
Standard Normal Distribution: A normal distribution with a mean of zero and a standard deviation of one (z-distribution).
Quote: “You can also call the standard normal distribution the z-distribution.”
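A minimal sketch of these distributions using scipy.stats (assumed to be installed); the parameters are illustrative:

```python
from scipy.stats import bernoulli, binom, uniform, norm

print(bernoulli.pmf(1, p=0.3))           # one trial, P(success) = 0.3
print(binom.pmf(3, n=10, p=0.3))         # exactly 3 successes in 10 trials
print(uniform.pdf(0.5, loc=0, scale=2))  # flat density 1/(b-a) on [0, 2]
print(norm.pdf(0), norm.cdf(1.96))       # bell-curve height at the mean; ~0.975
```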
Standardization and Normalization:
Standardization: Converts data to a standard normal distribution (mean 0, standard deviation 1), using z-scores.
Quote: “Standardization is the process of converting a normal distribution, which we saw in the previous video, into the standard normal distribution.”
Normalization: Re-scales data to a range between 0 and 1 (e.g., min-max scaling).
Quote: “Normalization rescales a data set so that each value falls between zero and one.”
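The two rescalings side by side, as a standard-library sketch with invented data:

```python
from statistics import mean, pstdev

data = [10, 20, 30, 40, 50]

m, s = mean(data), pstdev(data)
standardized = [(x - m) / s for x in data]         # z-scores: mean 0, std 1

lo, hi = min(data), max(data)
normalized = [(x - lo) / (hi - lo) for x in data]  # min-max scaling into [0, 1]

print(standardized)   # symmetric around 0
print(normalized)     # [0.0, 0.25, 0.5, 0.75, 1.0]
```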
Empirical Rule (68-95-99.7 Rule):
For a normal distribution, approximately 68% of the data falls within one standard deviation of the mean, 95% within two, and 99.7% within three.
Quote: “You can also use the empirical rule: 68, 95, and 99.7.”
Inferential Statistics: Estimation
Use sample data to make inferences about the larger population.
Point Estimation: Providing a single “best guess” for a population parameter.
Quote: “Point estimate and interval estimate: before proceeding further, let us understand two terms; the first is the population parameter and the second is the sample statistic.”
Interval Estimation: Providing a range of values within which the population parameter is likely to fall, expressed as a confidence interval.
Quote: “In interval estimation we give a range and ask with what probability the population parameter falls inside it; how confident are we about the population?”
Confidence Intervals:
Estimate the range within which the true population parameter is likely to lie, with a specified confidence level.
Quote: “A 95% or 99% confidence interval means that 95 (or 99) percent of the time, the true population parameter is captured by the interval estimate.”
Calculated using the point estimate, margin of error, and a critical value determined by the desired confidence level.
Confidence level is usually 95% or 99%.
When the sample size (n) is greater than 30, the z-distribution is used; if n <= 30, the t-distribution is used.
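Putting those pieces together, here is a minimal sketch of a 95% z-based interval for a mean (appropriate when n > 30); the sample figures are invented and the code uses only the standard library:

```python
from statistics import NormalDist
from math import sqrt

n, sample_mean, sample_std = 100, 64.0, 8.0

critical = NormalDist().inv_cdf(0.975)      # ~1.96 for 95% confidence
standard_error = sample_std / sqrt(n)
margin_of_error = critical * standard_error

print(sample_mean - margin_of_error, sample_mean + margin_of_error)  # ~(62.43, 65.57)
```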
T-Distribution (Student’s t-distribution):
Used when the sample size is small (n<=30) and the population standard deviation is unknown.
Quote: “When the sample size, which we represent as n, is less than or equal to 30, then the distribution we use is the t-distribution.”
The curve is bell-shaped, but fatter in the tails than the normal distribution.
Degrees of freedom (df) are used as parameters for the distribution. (df = n-1).
Hypothesis Testing:
A statistical method to evaluate claims about population parameters using sample data.
Involves setting up a null hypothesis (H0) and an alternative hypothesis (H1).
Quote: “Hypothesis testing is very important from a research perspective, and in data analysis too; even if you go for interviews, hypothesis testing is a very big topic, and it is a practical implementation with data.”
Null Hypothesis (H0): A statement of no effect or no difference. The default position that we aim to test for evidence against.
Quote: “Our baseline is the one you could call the null hypothesis; the statement we set out to test is what we call the null hypothesis.”
Alternative Hypothesis (H1 or Ha): A statement that contradicts the null hypothesis. A hypothesis that suggests an alternative situation that we might accept when rejecting the null.
Quote: “The other possibility, the statement we would accept when rejecting the null, is what we call the alternate hypothesis, versus the null hypothesis.”
Level of Significance (α): A predetermined threshold for rejecting H0.
Quote: “The level of significance is a predetermined threshold, so it acts as a boundary to decide if we have enough evidence to reject the null hypothesis; you can also call it the rejection region.”
P-Value: Probability of obtaining the observed data or more extreme data, assuming the null hypothesis is true.
Quote: “If the value falls inside the rejection region, then we reject the null hypothesis. The next important term is the p-value: what is the p-value?”
Decision Rule: If the p-value is less than α, reject H0.
Quote: “If the p-value is less than alpha, reject the null hypothesis.”
Type I Error (False Positive): Rejecting H0 when it’s true.
Quote: “A Type I error can also be called a false positive.”
Type II Error (False Negative): Accepting H0 when it’s false.
Quote: “A Type II error is when the null hypothesis is accepted even though it is false.”
One-Tailed Test: The critical region is only in one direction (left or right tail).
Quote: “The critical region, which is the reason for rejecting the null hypothesis, will be on one side only: either in the right tail or in the left tail.”
Two-Tailed Test: The critical region is in both directions (both tails).
Quote: “In a two-tailed test, the critical region gets divided; it falls on both sides.”
Types of Hypothesis Tests:
Z-Test: Used to compare sample and population means when the population standard deviation is known and the sample is large.
Quote: “This test is used when the population standard deviation is known, and it is useful for large samples.”
T-Test: Used when population standard deviation is unknown and for smaller samples (n<=30).
Quote: “When we have a small sample size, then we use the t-test.”
Independent T-Test: For comparing the means of two independent groups.
Paired T-Test: For comparing the means of the same group before and after a treatment or condition; a short sketch of both tests follows.
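A minimal sketch of both t-test variants with scipy.stats (assumed to be installed); the group data are invented for illustration:

```python
from scipy.stats import ttest_ind, ttest_rel

group_a = [23, 25, 28, 30, 27]     # two independent groups
group_b = [31, 33, 29, 35, 32]

before = [70, 68, 75, 72, 74]      # same subjects, before and after
after  = [72, 71, 78, 74, 77]

t1, p1 = ttest_ind(group_a, group_b)   # independent two-sample t-test
t2, p2 = ttest_rel(before, after)      # paired t-test

print(p1 < 0.05, p2 < 0.05)            # reject H0 at alpha = 0.05?
```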
ANOVA (Analysis of Variance): Used to compare the means of more than two groups.
Quote: “How is the ANOVA test done? When the groups we compare are more than two, if we have to check whether they are the same or different, then we use ANOVA.”
One-way ANOVA: Checks for difference with one independent factor.
Quote: “Only one independent variable is taken here; so this is the independent variable, and then here you have the dependent variable.”
Two-way ANOVA: Checks for difference with two independent factors.
Quote: “What is different in two-way ANOVA is that the factor variables are more than one; there are two factors in it.”
Chi-Square Test: Used to test the association between two categorical variables.
Quote: “When both variables are categorical, then to check the association between them, the test we use is the chi-square test.”
Chi-Square Test of Independence: Tests for a relationship between two categorical variables.
Chi-Square Goodness of Fit Test: Compares an observed distribution to an expected one for a single categorical variable.
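A minimal sketch of the test of independence on a small contingency table, using scipy.stats (assumed to be installed); the counts are invented for illustration:

```python
from scipy.stats import chi2_contingency

observed = [[30, 10],   # rows: levels of one categorical variable
            [20, 40]]   # columns: levels of the other

chi2, p, dof, expected = chi2_contingency(observed)
print(p < 0.05)         # a small p-value suggests the variables are associated
```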
Intended Audience:
This document is suitable for:
Individuals new to statistics.
Students learning data analysis and data science.
Professionals looking to refresh their statistical knowledge.
Those preparing for data-related job interviews.
Summary:
This briefing document provides a comprehensive overview of statistical concepts and techniques covered in the source material. The speaker systematically introduces each concept, emphasizing the practical application in the context of data analysis and data science, and using relatable examples. It acts as a good foundation for anyone wanting to learn statistics for use in their analysis. The speaker also provides a solid overview for exam or interview preparation.
Statistics for Data Analysis and Data Science
Frequently Asked Questions on Statistics for Data Analysis and Data Science
What is statistics and what role does it play in data analysis and data science?
Statistics is a branch of mathematics focused on collecting, analyzing, interpreting, and drawing conclusions from data. In data analysis and data science, statistics provides the tools and techniques necessary to extract meaningful insights from information, make predictions, and support informed decision-making. It’s used to perform functions such as summarizing data (mean, median, mode), understanding data variability (measures of dispersion), and drawing inferences. Statistics is crucial in handling various types of data, applying appropriate analytical methods, and ensuring the robustness of conclusions.
What are the main types of statistics, and how do they differ?
The main types of statistics are descriptive statistics and inferential statistics. Descriptive statistics involves summarizing and describing the main features of a dataset, using measures such as mean, median, mode, and standard deviation. It focuses on portraying data in a simple, understandable way. Inferential statistics, on the other hand, uses sample data to make generalizations or predictions about a larger population. This involves hypothesis testing, confidence intervals, and regression analysis to draw conclusions that go beyond the immediate dataset.
What are the different types of data, and why is it important to know them?
Data can be broadly categorized based on its nature. First, there’s structured data, which is organized in rows and columns (like spreadsheets and databases), and unstructured data, such as multimedia content (images, audio, video), text (emails, articles, blogs), which don’t have a predefined format. Data can also be categorized as cross-sectional (collected at a single point in time, like survey data or student exam marks) or time-series data (collected over a sequence of time, like daily stock prices). Further, univariate data involves one variable, while multivariate data involves two or more variables. Knowing these data types is crucial because the appropriate statistical techniques vary depending on the nature of the data.
What are the key differences between a population and a sample, and why is it important to understand sampling techniques?
A population refers to the entire group of individuals or items you are interested in studying, whereas a sample is a subset of that population from which data is actually collected. Sampling techniques are essential because it’s often impractical or impossible to collect data from an entire population. Sampling is done to make inferences about the entire population by using a representative sample. Different sampling techniques like stratified sampling (dividing the population into subgroups and then taking samples), systematic sampling (selecting every kth element) are used to obtain representative samples so that accurate conclusions can be made.
How do outliers and extreme values affect statistical analyses, and what measures can be used to mitigate their impact?
Outliers and extreme values can skew statistical results, particularly measures like the mean. When an outlier is present, the median is a more robust measure of central tendency: it is the middle value of the ordered data and is not pulled by extremely high or low values. For measuring spread, the interquartile range (IQR) is similarly less sensitive to extreme values, which makes it useful for describing the spread of the data when outliers are present.
What are measures of central tendency, and when should you use them?
Measures of central tendency describe the “center” of a dataset. The mean is the average value, sensitive to outliers and best used for normally distributed data without extreme values. The median is the middle value, which is less sensitive to outliers and suitable for data with extreme values or skewed distributions. The mode is the most frequent value and is primarily used for categorical data, or for numerical data with few unique values. The choice of measure depends on the data’s distribution and the presence of outliers.
What are some common measures of dispersion, and what do they tell us about a dataset?
Measures of dispersion describe the spread or variability in a dataset. Range is a simple measure (difference between max and min values) that’s very sensitive to outliers. Variance measures the average squared deviation from the mean. Standard deviation is the square root of variance, providing a measure of spread in the same units as the original data, which can tell us how far individual data points are from the central tendency of the data. Quartiles and percentiles divide the dataset into four and 100 equal parts, respectively. The Interquartile range (IQR), the difference between the third and first quartiles, represents the middle 50% of the data and is less sensitive to extreme values.
What is the role of hypothesis testing in inferential statistics, and what are Type I and Type II errors?
Hypothesis testing is a method of making a statistical decision using experimental data. It involves testing a null hypothesis (a statement of no effect) against an alternative hypothesis (a statement of some effect or difference), and asks whether the sample data provide enough evidence against the null hypothesis. Type I errors occur when a true null hypothesis is rejected (false positive). Type II errors occur when a false null hypothesis is not rejected (false negative). The level of significance (alpha) is used to determine whether an effect is statistically significant: when the p-value is less than alpha, the null hypothesis is rejected. These tests allow informed decisions to be made from sample data, generalizing conclusions to a population.
Essential Statistics Concepts
The sources cover a variety of statistics topics, including descriptive statistics, probability, inferential statistics, and different types of data [1].
Descriptive Statistics [1, 2]
Descriptive statistics involves collecting, analyzing, and interpreting data to understand its main features [2].
It includes measures of central tendency, such as the mean, median, and mode [3, 4].
The mean is the average of a data set [4].
The median is the middle value of a data set [5].
The mode is the most frequently occurring value in a data set [5].
It also includes measures of dispersion, such as range, variance, and standard deviation [3].
Range is the difference between the maximum and minimum values in the data [3].
Variance is the average squared deviation from the mean, a measure of how spread out the data is [3, 6].
Standard deviation is the square root of the variance [3, 6].
Percentiles and quartiles are also used in descriptive statistics [2, 3].
Graphical representations, such as box plots, histograms, and scatter plots, are used to visualize data [3, 7].
Box plots are used to show the spread of data and identify outliers [3, 8].
Histograms display the distribution of data [3, 7].
Scatter plots visualize the relationship between two continuous variables [3, 9].
Probability [3, 10]
Probability is a measure of the likelihood of a particular event occurring [10].
Key concepts in probability include sample space, events, and probability functions [3, 11].
A sample space is the set of all possible outcomes of a random experiment [11].
An event is a subset of the sample space [11].
A probability function assigns a probability to each event in the sample space [12].
Different types of events include joint, disjoint, dependent, and independent events [3, 12].
Conditional probability is the probability of an event occurring given that another event has already occurred [3, 13].
Bayes’ theorem is a formula that describes how to update the probability of a hypothesis based on new evidence [3, 13].
Probability distributions describe the probability of different outcomes in a random experiment [3, 14].
Discrete random variables have a finite number of values [3, 14].
Continuous random variables can take on any value within a given range [3, 14].
The probability of discrete variables is described by the probability mass function (PMF) [3, 15].
The probability of continuous variables is described by the probability density function (PDF) [3, 15].
Specific probability distributions include the Bernoulli, binomial, uniform, and normal distributions [3, 16-19].
The Bernoulli distribution describes the probability of success or failure in a single trial [16].
The binomial distribution describes the probability of a certain number of successes in a fixed number of trials [17].
The uniform distribution gives equal probability to all outcomes within a given range [18].
The normal distribution is a bell-shaped distribution characterized by its mean and standard deviation [19].
Inferential Statistics [1, 20, 21]
Inferential statistics involves drawing conclusions about a population based on a sample [20, 21].
It includes concepts such as point and interval estimation, confidence intervals, and hypothesis testing [3, 20, 22].
Point estimation provides a single value as a best guess for an unknown population parameter [23].
Interval estimation provides a range of values within which a population parameter is likely to lie [24].
A confidence interval is an interval estimate with a specified level of confidence that it contains the true population parameter [20, 24].
Hypothesis testing is a method for evaluating a claim or hypothesis about a population parameter [20, 25].
It involves setting up a null hypothesis (a statement of no effect) and an alternative hypothesis (a statement that contradicts the null hypothesis) [3, 25].
The level of significance (alpha) is the predetermined threshold for rejecting the null hypothesis [3, 26].
The p-value is the probability of observing a result as extreme as, or more extreme than, the observed result if the null hypothesis is true [26].
One-tailed tests have a critical region on one side of the distribution, while two-tailed tests have critical regions on both sides [3, 27].
Common statistical tests include the z-test, t-test, chi-square test, and ANOVA [3, 28, 29].
The z-test is used to compare sample means to population means when the population standard deviation is known and the sample size is large [3, 28].
The t-test is used when the population standard deviation is unknown or the sample size is small [3, 29, 30].
The chi-square test is used to compare categorical variables [31].
ANOVA (analysis of variance) is used to compare the means of three or more groups [29].
Types of Data [1, 32-34]
Data can be structured (organized in rows and columns) or unstructured (multimedia, text) [32].
Data can be cross-sectional (collected at a single point in time) or time series (collected over time) [32].
Variables can be categorical or numerical [33].
Categorical variables can be nominal (no order) or ordinal (ordered) [33].
Numerical variables can be discrete (countable) or continuous (any value within a range) [33].
Numerical data can be interval (meaningful intervals but no true zero point) or ratio (meaningful intervals and a true zero point) [33].
A population is the entire group of individuals or items of interest, while a sample is a subset of the population [34].
Sampling techniques include stratified sampling (dividing the population into subgroups and taking samples from each subgroup) and systematic sampling (selecting every kth element from the population) [35].
Other Concepts
Outliers are data points that are significantly different from other data points [3, 8, 9].
Covariance is a measure of how two variables change together [3, 36].
Correlation is a measure of the strength and direction of a linear relationship between two variables [36].
Causation refers to a cause-and-effect relationship between two variables [37].
Standardization is the process of converting data to a standard normal distribution [38].
Normalization is a scaling technique that rescales data to a range between 0 and 1 [39].
The empirical rule states that for a normal distribution, approximately 68% of the data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations [3, 21, 36].
A Guide to Data Analysis
Data analysis is a systematic process of inspecting, collecting, cleaning, transforming, and modeling data with the goal of discovering useful information [1]. It involves several key steps, including defining the problem, collecting data, cleaning data, conducting exploratory data analysis, transforming data, formulating hypotheses, testing hypotheses, interpreting results, and documenting the analysis [1].
Here is a breakdown of the steps of data analysis:
Defining the problem or research question is the first step, which guides the entire process [1].
Data collection involves gathering the necessary data through surveys, experiments, observations, or existing datasets [1].
Data cleaning is crucial to remove inconsistencies and ensure accuracy in the data [1].
Exploratory data analysis (EDA) involves exploring and understanding the data through summary statistics and visualizations [1, 2]. This step often involves using descriptive statistics [1].
Data transformation may be needed to prepare the data for analysis, including normalization, standardization, or encoding categorical variables [1, 3].
Normalization rescales data so that each value falls between 0 and 1 [3]. This is useful when features are on different scales [4].
Standardization converts data to a standard normal distribution, where the mean is zero and the standard deviation is one [5]. This is useful when you want to know how many standard deviations a value is from the mean [4].
Hypothesis formulation involves creating a null hypothesis and an alternative hypothesis based on the research question [1].
Hypothesis testing uses statistical tests to determine whether there is enough evidence to reject the null hypothesis [1].
Common tests include z-tests, t-tests, chi-square tests, and ANOVA [1].
Interpretation of results involves analyzing the outcomes of the tests and drawing conclusions based on the evidence [1].
Documentation of the analysis process and report creation is essential for sharing findings and ensuring reproducibility [1].
Descriptive statistics is a key component of data analysis. It is used to understand the main features of a dataset [2]. It helps to organize and summarize information from the data set [2]. Descriptive statistics includes measures of central tendency (mean, median, and mode) [6], measures of dispersion (range, variance, standard deviation, percentiles, and quartiles) [6, 7], and graphical representations (box plots, histograms, and scatter plots) [8-10].
Inferential statistics is used to make predictions about a population based on a sample [11]. It is used to test a claim or hypothesis about a population parameter [12]. It includes concepts such as point and interval estimation, confidence intervals, and hypothesis testing [11-14].
Fundamentals of Probability Theory
Probability is a measure of the likelihood of a particular event occurring [1]. It is measured on a scale from zero to one, where zero means the event is impossible and one means the event is certain [1]. Values between zero and one represent varying degrees of likelihood [1].
Key concepts in probability include:
Sample space: The set of all possible outcomes of a random experiment [2]. For example, when tossing a coin, the sample space consists of “heads” and “tails” [2].
Event: A subset of the sample space, representing specific outcomes or combinations of outcomes [2]. For example, when rolling a die, the event of getting an even number would include 2, 4, and 6 [2].
Probability function: A function that assigns a probability to each event in the sample space [3]. The probability of an event is calculated as the number of favorable outcomes divided by the total number of outcomes [3].
Complement: The complement of an event includes all outcomes not in that event [3]. For example, the complement of getting an even number on a die roll would be getting an odd number [3]. The probability of a complement is calculated as 1 minus the probability of the event [3].
There are different types of events, including:
Joint events (or non-disjoint events): Two or more events that can occur at the same time and have some common outcomes [4].
Disjoint events (or mutually exclusive events): Two or more events that cannot occur at the same time and have no common outcomes [4].
Dependent events: Events where the outcome of one event affects the probability of another event [5].
Independent events: Events where the outcome of one event does not affect the probability of another event [6].
Conditional probability is the probability of an event occurring given that another event has already occurred [7]. The formula for conditional probability is: P(A|B) = P(A and B) / P(B) where P(A|B) is the probability of A given B, P(A and B) is the probability of both A and B occurring, and P(B) is the probability of B occurring [7].
Bayes’ theorem is a mathematical formula used to update the probability of an event based on new evidence [8]. The formula is: P(A|B) = [P(B|A) * P(A)] / P(B), where P(A|B) is the updated probability of A given B, P(B|A) is the probability of B given A, P(A) is the initial probability of A, and P(B) is the probability of B [8]. Bayes’ theorem has applications in machine learning, medical diagnosis, spam classification, recommendation systems, and fraud detection [8, 9].
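The update that Bayes’ theorem performs can be shown with made-up spam-filter numbers (all probabilities below are assumptions for this sketch):

```python
p_spam          = 0.20    # prior: 20% of mail is spam
p_word_spam     = 0.60    # the word appears in 60% of spam
p_word_not_spam = 0.05    # ...and in 5% of legitimate mail

# total probability of seeing the word
p_word = p_word_spam * p_spam + p_word_not_spam * (1 - p_spam)

# posterior: P(spam | word)
p_spam_given_word = p_word_spam * p_spam / p_word

print(round(p_spam_given_word, 3))   # 0.75: the evidence raises the prior
```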
Probability distributions describe the probability of different outcomes in a random experiment [10]. There are two types of random variables:
Discrete random variables have a finite number of values or values that can be counted [10]. The probability of discrete variables is described by the probability mass function (PMF) [11].
Continuous random variables can take on any value within a given range [10]. The probability of continuous variables is described by the probability density function (PDF) [11].
Specific probability distributions include:
Bernoulli distribution: Describes the probability of success or failure in a single trial [12]. The PMF is given by p if x=1 and 1-p if x=0, where p is the probability of success, and q or 1-p is the probability of failure [12].
Binomial distribution: Describes the probability of a certain number of successes in a fixed number of trials [13]. The PMF is given by nCx * p^x * (1-p)^(n-x), where n is the number of trials, x is the number of successes, and p is the probability of success [13].
Uniform distribution: Gives equal probability to all outcomes within a given range [14]. The PDF is 1/(b-a), where a and b are the range boundaries [14].
Normal distribution (also known as Gaussian distribution): A bell-shaped distribution characterized by its mean and standard deviation [15]. The PDF is a complex formula involving the mean and standard deviation [15]. A standard normal distribution has a mean of zero and a standard deviation of one [16].
These concepts form the foundation of probability theory, which is used extensively in statistical analysis and data science [17, 18].
Inferential Statistics: Estimation, Hypothesis Testing, and Statistical Tests
Inferential statistics involves drawing conclusions or making predictions about a population based on a sample of data [1-3]. This is often done because studying an entire population is not feasible [3]. It is a way to use samples to make observations and then generalize those observations to the entire population [4].
Key concepts and techniques in inferential statistics include:
Estimation: This involves approximating population parameters using sample statistics. There are two main types of estimation [5]:
Point estimation provides a single best guess for an unknown population parameter [6]. This method is simple but has limitations, such as the lack of information about the reliability of the estimate [7]. Common methods for calculating point estimates include Maximum Likelihood Estimator, Laplace Estimation, Wilson Estimation, and Jeffrey Estimation [7].
Interval estimation provides an interval within which the population parameter is likely to fall [8]. This is more accurate than point estimation because it includes a range of values, increasing the likelihood of capturing the true population parameter [8]. Confidence intervals are a crucial part of interval estimation [9].
Confidence Intervals: These are intervals constructed from sample data that are likely to contain the true population parameter. A confidence interval is associated with a confidence level, such as 95% or 99%. For example, a 95% confidence interval means that if we were to take 100 samples from a population, and calculate a confidence interval from each sample, 95 of those intervals would contain the true population parameter [9]. The formula for a confidence interval is: point estimate ± margin of error [10].
The margin of error is calculated as: critical value * standard error [10].
The standard error of a particular statistic is calculated by dividing the population standard deviation by the square root of the sample size [10].
The critical value is based on the desired level of confidence and can be obtained from z-tables (for large sample sizes) or t-tables (for small sample sizes) [11, 12].
When the sample size (n) is greater than 30, the distribution is considered a z-distribution [12]. When the sample size is less than or equal to 30, a t-distribution is used [12].
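For the small-sample case, the same point-estimate plus-or-minus margin-of-error formula applies with a t critical value. A minimal sketch using scipy.stats (assumed to be installed) with invented data:

```python
from statistics import mean, stdev
from math import sqrt
from scipy.stats import t

sample = [12.1, 11.8, 12.4, 12.0, 11.6, 12.3, 12.2]   # n <= 30
n = len(sample)

critical = t.ppf(0.975, df=n - 1)            # 95% confidence, df = n - 1
margin = critical * stdev(sample) / sqrt(n)  # critical value * standard error

print(mean(sample) - margin, mean(sample) + margin)
```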
Hypothesis Testing: This involves using sample data to evaluate a claim or hypothesis about a population parameter [13]. The process includes [3]:
Formulating a null hypothesis (a statement of no effect or no difference) and an alternate hypothesis (a statement that contradicts the null hypothesis) [13, 14].
Determining a level of significance (alpha), which acts as a boundary to decide whether there is enough evidence to reject the null hypothesis [14].
Calculating a p-value, which represents the strength of evidence against the null hypothesis. The p-value is compared to the alpha level. If the p-value is less than the alpha level, the null hypothesis is rejected [15].
Making a decision based on the p-value and alpha level.
Understanding that there can be errors in hypothesis testing, which includes:
Type I errors (false positives): rejecting the null hypothesis when it is true [15].
Type II errors (false negatives): failing to reject the null hypothesis when it is false [15].
Choosing between a one-tailed test (where the critical region is on one side of the distribution) or a two-tailed test (where the critical region is on both sides of the distribution) [16].
One-tailed tests look for evidence in only one direction, such as whether a value is greater than or less than a specific number [16].
Two-tailed tests look for evidence in both directions, such as whether a value is different from a specific number [16].
Types of Statistical Tests: There are various statistical tests used in hypothesis testing, including [16, 17]:
Z-tests: Used to compare sample means or population means when the population standard deviation is known and the sample size is large (greater than 30) [17].
One-sample z-tests are used when comparing a sample mean to a population mean [17].
Two-sample z-tests are used when comparing the means of two independent samples [17].
T-tests: Used when the population standard deviation is unknown, or the sample size is small (less than or equal to 30), or both [17].
Independent t-tests are used to compare the means of two independent groups [18].
Paired t-tests are used to compare the means of two related groups, such as the same group before and after a treatment [18, 19].
ANOVA (Analysis of Variance): Used when comparing the means of more than two groups. It utilizes the F test statistic to determine if any groups have significantly different means [19, 20].
One-way ANOVA is used when there is one factor influencing a response variable [20].
Two-way ANOVA is used when there are two factors influencing a response variable [21].
Chi-square tests: Used to test for associations between categorical variables [22].
Chi-square tests for independence are used to determine if two categorical variables are related [23].
Chi-square goodness-of-fit tests are used to compare observed values with expected values to determine if a sample follows a specific distribution [24].
In summary, inferential statistics allows for generalizing from samples to populations using concepts like estimation, confidence intervals, and hypothesis testing. These concepts are essential in data analysis and scientific research, helping to make informed decisions based on data [1, 3, 25].
Hypothesis Testing: Principles and Methods
Hypothesis testing is a crucial part of inferential statistics that uses sample data to evaluate a claim or hypothesis about a population parameter [1-3]. It helps in determining whether there is enough evidence to accept or reject a hypothesis [3].
The process of hypothesis testing involves several key steps [2]:
Formulating Hypotheses [2, 4]:
Null Hypothesis (H0): A baseline statement of no effect or no difference [2, 4]. It’s the default position that you aim to either reject or fail to reject.
Alternate Hypothesis (H1 or Ha): A statement that contradicts the null hypothesis [2, 4]. It proposes a specific effect or difference that you want to find evidence for.
Setting the Level of Significance (alpha) [2, 5]: This is a pre-determined threshold that acts as a boundary to decide if there’s enough evidence to reject the null hypothesis [5]. It represents the probability of rejecting the null hypothesis when it is actually true.
Calculating the p-value [2, 6]: This value represents the strength of the evidence against the null hypothesis [6]. It’s the probability of obtaining results as extreme as the observed results if the null hypothesis were true. The p-value is compared to the alpha level to make a decision about the null hypothesis.
Decision Making [2, 6]:
If the p-value is less than alpha, the null hypothesis is rejected in favor of the alternate hypothesis [6].
If the p-value is greater than or equal to alpha, there is not sufficient evidence to reject the null hypothesis.
Understanding Types of Errors [2, 6]:
Type I error (false positive): Rejecting the null hypothesis when it is actually true [2, 6].
Type II error (false negative): Failing to reject the null hypothesis when it is actually false [2, 6].
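The full decision rule above can be traced with a one-sample z-test computed from summary statistics (standard library only; all figures are invented for this sketch):

```python
from statistics import NormalDist
from math import sqrt

mu0, sample_mean, sigma, n = 50.0, 52.0, 6.0, 49   # H0: population mean = 50

z = (sample_mean - mu0) / (sigma / sqrt(n))        # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))       # two-tailed p-value

alpha = 0.05
print("reject H0" if p_value < alpha else "fail to reject H0")
```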
There are two types of tests that can be conducted within hypothesis testing, as determined by the directionality of the hypothesis being tested [7, 8]:
One-tailed test: This test is directional, meaning the critical region is on one side of the distribution [7]. A one-tailed test is used when the hypothesis is testing for a value that is either greater than or less than a specific value.
Two-tailed test: This test is non-directional, and the critical region is divided between both tails of the distribution [8]. This kind of test is used when the hypothesis is testing for a difference in the value, whether that difference is greater than or less than the expected value.
There are also various statistical tests that are used in hypothesis testing depending on the type of data and the specific research question [9]. Some common types of tests include:
Z-tests: Used when the population standard deviation is known and the sample size is large [9].
One-sample z-tests are used when comparing a single sample mean to a population mean [9].
Two-sample z-tests are used to compare the means of two independent samples [9].
T-tests: Used when the population standard deviation is unknown and/or the sample size is small (less than or equal to 30) [10, 11].
Independent t-tests are used to compare the means of two independent groups [11].
Paired t-tests are used to compare the means of two related groups, such as the same group before and after a treatment [11].
ANOVA (Analysis of Variance): Used to compare the means of more than two groups [12].
One-way ANOVA is used when there is one factor influencing a response variable [13].
Two-way ANOVA is used when there are two factors influencing a response variable [14].
Chi-square tests: Used to test for associations between categorical variables [15, 16].
Chi-square tests for independence are used to determine if two categorical variables are related [16].
Chi-square goodness-of-fit tests are used to compare observed values with expected values to determine if a sample follows a specific distribution [17].
By using these steps, hypothesis testing helps researchers and data analysts make informed decisions based on evidence from sample data [3].
Complete STATISTICS for Data Science | Data Analysis | Full Crash Course
This document is a tutorial on using Power BI, covering various aspects of data modeling and visualization. It extensively explains the creation and use of calculated columns and measures (DAX), demonstrates the implementation of different visualizations (tables, matrices, bar charts), and explores advanced features like calculation groups, visual level formatting, and field parameters. The tutorial also details data manipulation techniques within Power Query, including data transformations and aggregations. Finally, it guides users through publishing reports to the Power BI service for sharing.
Power BI Visuals and DAX Study Guide
Quiz
Instructions: Answer each question in 2-3 sentences.
What is the difference between “drill down” and “expand” in the context of a Matrix visual?
What is a “stepped layout” in a Matrix visual and how can you disable it?
How can you switch the placement of measures between rows and columns in a Matrix visual?
When using a Matrix visual with multiple row fields, how do you control subtotal visibility at different levels?
What is the primary difference between a pie chart and a tree map visual in Power BI?
How can you add additional information to a tooltip in a pie chart or treemap visual?
What is a key difference between the display options when using “Category” versus “Details” in a treemap?
What is the significance of the “Switch values on row group” option?
In a scatter plot visual, what is the purpose of the “Size” field?
How does the Azure Map visual differ from standard Power BI map visuals, and what are some of its advanced features?
Answer Key
“Drill down” navigates one level deeper in the hierarchy, replacing the current view, while “expand” shows all levels at once, adding to the current view.
A “stepped layout” creates an indented hierarchical view in the Matrix visual’s row headers. It can be disabled in the “Row headers” section of the visual’s format pane by toggling the “Stepped layout” option off.
In the Values section of the format pane, scroll to “Switch values on row group”. When enabled, measures are displayed on rows; when disabled, they appear on columns.
Subtotal visibility is controlled under the “Row subtotals” section of the formatting pane, where you can display subtotals for individual row levels or disable them entirely; the “per row level” setting controls which subtotals are visible in the matrix. You can also change where the subtotal label appears.
Pie charts show proportions of a whole using slices and a legend, whereas treemaps use nested rectangles, sized by value, to show hierarchical data without explicit percentage labels; treemaps use Category and Details wells rather than a legend.
You can add additional information to a tooltip by dragging measures or other fields into the “Tooltips” section of the visual’s field pane. The tooltips section allows for multiple values. Tooltips can also be switched on and off.
When you add a field to the “Category”, it acts as a primary grouping that is displayed and colored. When you add a field to the “Details” it is displayed within the existing category and the conditional formatting disappears.
“Switch values on row group” is an option in a Matrix visual that toggles whether measures appear in the row headers or the column headers, allowing for a KPI-style or pivot-style display. By default, values appear in the columns; when switched on, they appear in the rows.
In a scatter plot visual, the “Size” field represents a third variable: the larger the value, the bigger the bubble.
The Azure Map visual offers more advanced map styles (e.g., road, hybrid, satellite), auto-zoom controls, and other features. It allows for heatmaps, conditional formatting on bubbles, and cluster bubbles for detailed geographic analysis, unlike standard Power BI maps.
Essay Questions
Instructions: Respond to the following questions in essay format.
Compare and contrast the use of Matrix, Pie, and Treemap visuals, discussing their best use cases and how each represents data differently.
Discuss the various formatting options available for labels and values across different visuals. How can these formatting options be used effectively to improve data visualization and analysis?
Describe how the different components of the Power BI Matrix visual (e.g., row headers, column headers, sub totals, drill down, drill up) can be used to explore data hierarchies and gain insights.
Explain how the “Values” section and “Format” pane interact to create a specific visual output, focusing on the use of different measure types (e.g., aggregation vs. calculated measures).
Analyze the differences and best use cases for area and stacked area charts, focusing on how they represent changes over time or categories, and how they can be styled to communicate data effectively.
Glossary
Matrix Visual: A table-like visual that displays data in a grid format, often used for displaying hierarchical data.
Drill Down/Up: Actions that allow users to navigate through hierarchical data, moving down to more granular levels or up to higher levels.
Expand/Collapse: Actions to show or hide sub-levels within a hierarchical structure.
Stepped Layout: An indented layout for row headers in a Matrix visual, visually representing hierarchy.
Measures on Rows/Columns: Option in the Matrix visual to toggle the placement of measures between row or column headers.
Switch Values on Row Group: An option that changes where measures are displayed (on row or column headers).
Subtotals: Sum or average aggregations calculated at different levels of hierarchy within a Matrix visual.
Pie Chart: A circular chart divided into slices to show proportions of a whole.
Treemap Visual: A visual that uses nested rectangles to display hierarchical data, where the size of the rectangles corresponds to the value of each category or subcategory.
Category (Treemap): The main grouping used in a treemap, often with distinct colors.
Details (Treemap): A finer level of categorization that subdivides the main categories into smaller units.
Tooltip: Additional information that appears when a user hovers over an element in a visual.
Legend: A visual key that explains the color coding used in a chart.
Conditional Formatting: Automatically changing the appearance of visual elements based on predefined conditions or rules.
Scatter Plot: A chart that displays data points on a two-dimensional graph, where each point represents the values of two variables.
Size Field (Scatter Plot): A field that controls the size of the data points on a scatter plot, representing a third variable.
Azure Map Visual: An enhanced map visual that offers more advanced styles, heatmaps, and other geographic analysis tools.
Card Visual: A visual that displays a single value, often a key performance indicator (KPI).
DAX (Data Analysis Expressions): A formula language used in Power BI for calculations and data manipulation.
Visual Calculation: A calculation that is performed within the scope of a visual, rather than being defined as a measure.
Element Level Formatting: Formatting applied to individual parts of a visual (e.g., individual bars in a bar chart).
Global Format: A default or general formatting style that applies across multiple elements or objects.
Model Level Formatting: Formatting rules applied at the data model level that can be used as a default for all visuals.
Summarize Columns: A DAX function that groups data and creates a new table with the aggregated results.
Row Function: A DAX function that creates a table with a single row and specified columns.
IF Statement (DAX): A conditional statement that allows different calculations based on whether a logical test is true or false.
Switch Statement (DAX): A conditional statement similar to “case” that can handle multiple conditions or multiple values.
Mod Function: A DAX mathematical function that provides a remainder of a division.
AverageX: A DAX function that calculates the average value across a table or a column.
Values: A DAX function that returns the distinct values from a specified column.
Calculate: A DAX function that modifies the filter context of a calculation.
Include Level of Detail: A technique for incorporating more granular data into calculations without affecting other visual elements.
Remove Level of Detail: A technique that excludes a specified level of data from a calculation for aggregated analysis.
Filter Context: The set of filters that are applied to a calculation based on the current visual context.
Distinct Count: A function that counts the number of unique values in a column.
Percentage of Total: A way to display values as a proportion of a total, useful for understanding the relative contribution of various items.
All Function: A DAX function that removes filter context from specified tables or columns.
Allselected Function: A DAX function that removes filters coming from within the visual while retaining filters applied by slicers and other outside selections.
RankX Function: A DAX function to calculate ranks based on an expression.
Rank Function: A DAX function that assigns a rank to each row based on a specified column or measure.
Top N Function: A DAX function to select the top n rows based on a given value.
Keep Filters: A function that allows the visual filters to be retained or included during DAX calculations.
Selected Value: A DAX function used to return the value currently selected in a slicer.
Date Add: A DAX function that shifts the date forward or backward by a specified number of intervals (days, months, quarters, years).
EndOfMonth (EOMonth): A DAX function that returns the last day of the month for a specified date.
PreviousMonth: A DAX function that returns the dates of the previous month.
DatesMTD: A DAX function that returns the month-to-date range of dates, used to compute a running total for the current month.
TotalMTD: A DAX function that returns a month-to-date total directly, without needing to be wrapped in CALCULATE.
DatesYTD: A DAX function to calculate a year to date value, and can be used in combination with a fiscal year ending parameter.
IsInScope: A DAX function to determine the level of hierarchy for calculations.
Offset Function: A DAX function to access values in another row based on a relative position.
Window Function: A family of DAX functions, similar in spirit to SQL window functions, used to compute values based on previous or next rows or columns within a visual.
Index Function: A DAX function to find the data at a specified index from a table or a visual.
Row Number Function: A DAX function that provides a continuous sequence of numbers.
Power BI Visuals and DAX Deep Dive
Briefing Document: Power BI Visual Deep Dive
Document Overview:
This document summarizes key concepts and features related to various Power BI visuals, as described in the provided transcript. The focus is on the functionality and customization options available for Matrix, Pie/Donut, TreeMap, Area, Scatter, Map, and Card visuals, along with a detailed exploration of DAX (Data Analysis Expressions), including its use in calculated columns and measures, and some of the time-intelligence functions.
Main Themes and Key Ideas:
Matrix Visual Flexibility:
Hierarchical Data Exploration: The Matrix visual allows for drilling down and expanding hierarchical data. The “Next Level” feature takes you to the next available level, while “Expand” allows viewing of all levels simultaneously.
“…‘next level’ takes us to the next level, meaning it takes us to the next available level…”
Stepped vs. Non-Stepped Layout: Offers two layouts for rows: “stepped” (hierarchical indentation) and “non-stepped” (flat).
“this display is known as stepped layout… if you switch the stepped layout off, then it will give you this kind of look and feel; this is the non-stepped layout…”
Values on Rows or Columns: Measures can be switched to display on rows instead of columns, offering KPI-like views.
“I have this option ‘switch values on row group’; right now it is off, and if you switch it on you start seeing your measures on the rows…”
Complex Structures: Allows for the creation of complex multi-level structures using rows and columns, with drill-down options for both.
“I can create really complex structures using the Matrix visual…”
Total Control: Subtotals can be customized for each level of the hierarchy, with options to disable, rename, and position them.
“In this manner you can control the subtotals; let’s say you want the subtotals, you can give a subtotal a name…”
Pie/Donut Visual Customization:
Detailed Labels and Slices: The visual provides options for detailed labels and custom colors for each slice.
“for each slice you have the color; again, the pie visual uses a legend…”
Rotation: The starting point of the pie chart can be rotated.
“rotation: if you see, right now it’s starting from this position… the starting position changes…”
Donut Option: The pie chart can be converted to a donut chart, offering similar properties.
“and finally you can also have a donut instead of this one…”
Tooltip Customization: Additional fields and values can be added to the tooltip.
“if you want to add something additional on the tooltip, let’s say margin percentage, you can add it…”
Workaround for Conditional Formatting: While direct conditional formatting isn’t supported, workarounds exist.
TreeMap Visual Characteristics:
Horizontal Pie Alternative: The TreeMap is presented as a horizontal pie chart, showing area proportion.
Category, Details, and Values: Uses categories, details, and values, unlike the pie chart’s legend concept.
Conditional Formatting Limitation: Conditional formatting is not directly available when using details; colors can be applied to category levels or using conditional formatting rules.
“once I add the category on the details, you can see the fx option is no longer available for conditional formatting…”
Tooltips and Legends: Allows the addition of tooltips and enables the display of legends.
“again, if you want additional information on the tooltip you can add it; then we have size, title, and legends as usual…”
Area and Stacked Area Visuals:
Trend Visualization: These visuals are useful for visualizing trends over time.
Continuous vs. Categorical Axis: The x-axis can be set to continuous or categorical options.
“because I’m using the date field, I get the axis as a continuous option; I can also choose the categorical option, where I get the categorical values…”
Legend and Transparency: Legends can be customized, and fill transparency can be adjusted.
“if there is a shade transparency you want to control, you can do that: you can increase the transparency or you can decrease it…”
Conditional Formatting: While conditional formatting on series is limited at visual level, it is mentioned to be available with the work around.
Scatter Visual Features:
Measure-Based Axes: Best created with measures on both X and Y axes.
“the best way to create a scatter visual is having both the x-axis and the y-axis as measures…”
Dot Chart Alternative: Can serve as a dot chart when one axis is a category and another is a measure.
“this kind of becomes a dot chart…”
Bubble Sizes: Can use another measure to control the size of the bubbles.
Conditional Formatting for Markers: Offers options for conditional formatting of bubble colors using measures.
“you can also have conditional formatting done on these bubbles; the option is available under markers. If you go to the marker color you can see the fx sign, which means I can use a measure here…”
Series and Legends: Can use a category field for series and supports legends.
Azure Map Visual Features:
“let me try to add it again; it gives me a disclaimer. Also, let’s try to add some location to it…”
Multiple Styles: Supports various map styles including road, hybrid, satellite, and grayscale.
Auto Zoom and Controls: Includes auto-zoom and zoom controls.
“you have the auto zoom on, and you can have different options; if you want to disable the auto zoom…you can observe the difference…”
Layer Settings: Offers settings for bubble layers, heatmaps, and legends.
“then you have the layer settings, which include minimum and maximum; unselected values disappear; you can have legends, but we are not using legends as of now here…”
Conditional Formatting and Cluster Bubbles: Supports conditional formatting based on gradients, rules, or fields and has options for cluster bubbles.
“for color you have the conditional formatting option…we can do conditional formatting based on gradient, rule based, or field value based…”
Enhanced Functionality: The Azure Map visual is presented as a strong option with ongoing enhancements.
“the Azure Map visual is coming up as a stronger option compared to all other visuals and you’re getting a lot of enhancements on that…”
Card Visual Basics:
Single Measure Display: The Card visual is used to display a single numerical measure.
“you can have one measure only at a time…”
Customizable Formatting: Offers customization for size, position, padding, background, borders, shadow, and label formatting.
DAX and Formatting:
DAX Definition: DAX (Data Analysis Expressions) is a formula language used in Power BI for advanced calculations and queries.
“DAX, Data Analysis Expressions, is a formula expression language used in Analysis Services, Power BI, and Power Pivot in Excel…”
Formatting Levels: Formatting can be applied at the model, visual, and element level, allowing for detailed control over presentation.
“you will see at the model level we don’t have any decimal places; if you go to the tooltip of the second bar visual you don’t see any tooltip; on the table visual you see the visual-level format with one decimal place; on the first bar visual you see two decimal places on the data label, meaning the element-level formatting, and in the tooltip you see the visual-level formatting…”
Visual Calculations: Visual-level calculations in Power BI provide context-based calculated fields.
Measure Definitions: Measures can be defined in a DAX query using DEFINE MEASURE, specifying the table, the measure name, and the expression.
“first we say DEFINE MEASURE, then the table and the measure name, the new measure name or the measure name which you want, and then the definition, the expression basically…”
Summarize Columns: SUMMARIZECOLUMNS function allows grouping of data, filtering and defining aggregated expressions.
“if you remember when we came initially here we have been given a function which was summarize columns…”
Row Function: Row function helps in creating one row with multiple columns and measures.
“the ROW function can actually take name, expression, name, expression pairs and it only gives me one row; SUMMARIZECOLUMNS is even more powerful, it can have group bys also; we have not added the group by there…”
Common Aggregation Functions: Functions like SUM, MIN, MAX, COUNT, and DISTINCTCOUNT are used for data aggregation.
“we have something known as SUM, you already know this; the same way as SUM we have MIN, MAX, COUNT, DISTINCTCOUNT measures…”
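To make the query-view syntax concrete, here is a minimal DAX sketch, assuming a hypothetical Sales table, a Geography dimension, and a Net Amount column (all names are illustrative, not from the course files):

```dax
DEFINE
    -- a measure defined for this query only
    MEASURE Sales[Total Net] = SUM ( Sales[Net Amount] )

-- ROW returns a single row of name/expression pairs
EVALUATE
ROW ( "Row Count", COUNTROWS ( Sales ), "Total Net", [Total Net] )

-- SUMMARIZECOLUMNS adds group-by columns on top of the expressions
EVALUATE
SUMMARIZECOLUMNS (
    Geography[City],
    "Total Net", [Total Net],
    "Distinct Items", DISTINCTCOUNT ( Sales[Item ID] )
)
```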
Conditional Logic (IF & SWITCH):
IF Statements: Used for conditional logic, testing for a condition and returning different values for true/false outcomes.
“IF, what is my condition? IF category…because I’m creating a column I can simply use the column name, it belongs to the table, without using the table name, but the ideal situation is to use table name and column name…”
SWITCH Statements: An alternative to complex nested IF statements, handling multiple conditions, particularly for categorical or variable values.
“here what is going to happen is I will use SWITCH; now with SWITCH I can have an expression, the expression can be TRUE, then I have value, result, value, result combinations, but it can also be a column or a measure…”
SWITCH TRUE Variant: Used when multiple conditions need to be tested where the conditions are not the distinct values of a column.
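As an illustration, here are hedged formula-bar sketches of the three patterns, assuming hypothetical Sales and Item tables and a [Margin %] measure:

```dax
-- IF: one condition, two outcomes
Discount Band = IF ( Sales[Discount Percentage] > 10, "High", "Low" )

-- SWITCH on the distinct values of a column
Category Group =
SWITCH ( Item[Category], "Phones", "Tech", "Laptops", "Tech", "Other" )

-- SWITCH ( TRUE () ) when the conditions are not distinct values of one column
Margin Band =
SWITCH (
    TRUE (),
    [Margin %] < 0,  "Loss",
    [Margin %] < 20, "Thin",
    "Healthy"
)
```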
Level of Detail (LOD) Expressions:
AVERAGEX and SUMMARIZE: Functions such as AVERAGEX and SUMMARIZE are used to compute aggregates at a specified level of detail.
“with AVERAGEX I can use VALUES or SUMMARIZE; let me use VALUES as of now to begin with; VALUES, then let’s use geography city, till this level; then whatever aggregation I’m going to do goes in the expression, net…”
Calculations inside Expression: When doing aggregations inside AVERAGEX, CALCULATE is required to ensure correct results.
“if you are giving a table expression and you are using aggregation on the column, then you have to use CALCULATE in the expression; you cannot do it without that…”
Values vs. Summarize: VALUES returns distinct column values, while SUMMARIZE enables grouping and calculation of aggregates for multiple columns and measures in addition to group bys.
“SUMMARIZE can also include a calculation inside the table, so we have the group-by columns, and after that the expression says you can have name and expression here…”
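A hedged sketch of these patterns, assuming a Geography[City] column and a [Total Net] measure (illustrative names):

```dax
-- Iterate the distinct cities and average the city-level net
Avg City Net = AVERAGEX ( VALUES ( Geography[City] ), [Total Net] )

-- Aggregating a column inside the expression needs CALCULATE for the context transition
Avg City Net 2 =
AVERAGEX (
    VALUES ( Geography[City] ),
    CALCULATE ( SUM ( Sales[Net Amount] ) )
)

-- SUMMARIZE can carry the calculation as a named extension column
Avg City Net 3 =
AVERAGEX (
    SUMMARIZE ( Sales, Geography[City], "City Net", [Total Net] ),
    [City Net]
)
```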
Handling Filter Context:
Context Issues with Grand Totals: Direct use of measures in aggregated visuals can cause incorrect grand totals due to filter context.
“and this is what we call a calculation error because of the filter context you have used…”
Correcting Grand Totals: CALCULATE with functions like ALL or ALLSELECTED can correct grand total issues.
“the moment we added the CALCULATE, the results started coming out; as you are aware, when you use CALCULATE…”
Include vs Exclude: You can either include a specific dimension and exclude others, or simply remove a particular dimension’s context from your calculation (see the sketch below).
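A minimal sketch of the correction, assuming a [Total Net] measure and a Geography dimension:

```dax
-- Ignore every filter on city, including slicers
All Cities Net = CALCULATE ( [Total Net], ALL ( Geography[City] ) )

-- Remove the visual's row context on city but keep outside slicer selections
Selected Cities Net = CALCULATE ( [Total Net], ALLSELECTED ( Geography[City] ) )
```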
Distinct Counts and Percentages:
DISTINCTCOUNT Function: For counting unique values in a column.
“we use the function DISTINCTCOUNT on sales item ID; let me bring it here, this is 55…”
Alternative for Distinct: COUNTROWS(VALUES()) provides an equivalent distinct count for a single column; for a combination of columns and measures, the table can come from SUMMARIZE instead.
“COUNTROWS, VALUES; now for a single column I can use VALUES, we have learned that in the past: to get the distinct values you can use VALUES…”
Percentage of Total: DIVIDE function can be used to calculate percentages, handling zero division cases.
“to calculate percent of grand total of net, I want to use the DIVIDE function because I want to divide the current calculation by the grand total…”
Percentage of Subtotal: You can calculate the percentage of a subtotal by removing the context for level of detail.
“I can use REMOVEFILTERS of city; now there are only two levels, so I can say REMOVEFILTERS of geography city…”
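Putting these together, a sketch with illustrative table and measure names:

```dax
-- Two equivalent distinct counts for a single column
Distinct Items     = DISTINCTCOUNT ( Sales[Item ID] )
Distinct Items Alt = COUNTROWS ( VALUES ( Sales[Item ID] ) )

-- Percent of grand total: DIVIDE guards against division by zero
Net % of Total =
DIVIDE ( [Total Net], CALCULATE ( [Total Net], ALLSELECTED () ) )

-- Percent of subtotal: remove only the city context and keep the rest
Net % of Subtotal =
DIVIDE ( [Total Net], CALCULATE ( [Total Net], REMOVEFILTERS ( Geography[City] ) ) )
```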
Ranking and Top N:
RANKX Function: Used in DAX to assign ranks to rows based on a measure, but it has limitations.
“let me use this week start date column and create a rank, so I’ll give the name as week rank, make it a little bit bigger so that you can see it; and you can see RANK.EQ, RANKX, and RANK, three functions are there; I’m going to use RANKX…”
RANK Function: Alternative to RANKX, allows ranking by a column, handles ties, and can be used in measures.
“first thing it asks for is ties; second thing it asks for is relation, which is something like ALL or ALLSELECTED of item brand; order by, whatever order you want to give; blanks, in case you have blanks; partition by, in case you want to partition the rank within something; match by and reset…”
TOPN Function: Returns a table with the top N values based on a measure.
“the function is TOPN; now what is my N value? N value is 10, so I need the N value and I need a table expression, and here the table expression will be ALL or ALLSELECTED; then the order-by expression, order ascending or descending…”
Dynamic Top N: Achieved with modeling parameters.
“we have new parameters: one of them is a numeric range and another one is a field parameter; the field parameter we’re going to discuss after some time; the numeric parameter was previously also known as a what-if parameter…”
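A sketch of the ranking and Top N patterns, assuming an Item[Brand] column and a [Total Net] measure:

```dax
-- Rank the selected brands by net
Brand Rank = RANKX ( ALLSELECTED ( Item[Brand] ), [Total Net] )

-- Net restricted to the top 10 brands
Top 10 Brand Net =
CALCULATE (
    [Total Net],
    TOPN ( 10, ALLSELECTED ( Item[Brand] ), [Total Net], DESC )
)
```

For a dynamic Top N, the literal 10 would be swapped for the value measure that a numeric range parameter generates.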
Time Intelligence:
Date Table Importance: A well-defined date table is crucial for time intelligence calculations.
“so the first thing we want to make sure there is a date table…without a date table or a continuous set of dates this kind of calculation will not work…”
Date Range Creation: DAX functions enable the creation of continuous date ranges for various periods, such as month, quarter, and year start/end dates.
“and now we use the YEAR function, the MONTH function, and the year-month combination; so what will happen if I pass a date to that? It will return me the month of that date, and I need a number, so the MONTH function is going to give me the number, isn’t it…”
Total MTD Function: Calculates Month-to-Date value.
“I’m going to use TOTALMTD; TOTALMTD requires an expression and dates, and it can have a filter; if you need more than one filter then you can again use CALCULATE on top of TOTALMTD, otherwise TOTALMTD doesn’t require CALCULATE…”
Dates MTD Function: Also calculates MTD, and requires CALCULATE.
“this time I’ve clicked on a measure, so Measure tools is open; as of now I’ll click on new measure: CALCULATE net, DATESMTD; DATESMTD requires dates…”
YTD: Calculates Year-to-Date values using DATESYTD (with and without fiscal year end).
“let me calculate TOTALYTD and that’s going to give me YTD; let me bring in the YTD using DATESYTD, so net YTD equals CALCULATE of net and DATESYTD; DATESYTD requires dates and the year-end date…”
Previous Month Calculations: DATEADD to move dates backward and PREVIOUSMONTH for last month data.
“but inside the DATESMTD I want the entire set of dates to move a month back, so I’m going to use the function DATEADD; and please remember the understanding of DATEADD, that DATEADD also requires a continuous set of dates…”
Offset: A better option for getting the previous value, or any offset required.
“calculate net offset; I need the function OFFSET; what is it asking for? It is asking for a relation; what is my relation? ALLSELECTED date; and I need the offset; how many? Minus one; that is how we go to the minus-one date…”
Is In Scope: ISINSCOPE is a very powerful DAX function which can be used in place of multiple IF statements and allows handling of grand totals in a measure.
“if I’m in the month, is month in scope? I need this formula. What happens if I’m in the year, is year in scope? Or if I’m in a grand total…here ISINSCOPE is really important…”
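The patterns above as hedged sketches, assuming a marked date table named 'Date' (built, for example, with CALENDAR or CALENDARAUTO) and an illustrative [Total Net] measure:

```dax
-- Month-to-date, two equivalent forms
Net MTD   = TOTALMTD ( [Total Net], 'Date'[Date] )
Net MTD 2 = CALCULATE ( [Total Net], DATESMTD ( 'Date'[Date] ) )

-- Year-to-date; the optional second argument sets a fiscal year end
Net YTD        = CALCULATE ( [Total Net], DATESYTD ( 'Date'[Date] ) )
Net Fiscal YTD = CALCULATE ( [Total Net], DATESYTD ( 'Date'[Date], "6/30" ) )

-- Previous month, by shifting all dates back or using PREVIOUSMONTH
Net PM   = CALCULATE ( [Total Net], DATEADD ( 'Date'[Date], -1, MONTH ) )
Net PM 2 = CALCULATE ( [Total Net], PREVIOUSMONTH ( 'Date'[Date] ) )
```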
Window Functions:
Window: A DAX function very similar to SQL window functions; it helps in calculating running totals, rolling totals, and other cumulative calculations.
“the first one is very simple: MOD. MOD is a function which gives me the remainder; it takes a number and a divisor and gives the remainder, so we are learning a mathematical function, MOD, here…”
Index: A function which allows finding the top and bottom performers based on a calculation in the visual.
“I’m going to use the function known as INDEX; INDEX, which position? The first thing is position, then relation, order by, blanks, partition by in case you need the within, let’s say within a brand what is the top category, or within the year which is the top month, then match by; I need the top one…”
Rank: A DAX function very similar to RANKX but with additional flexibility in terms of columns and measures.
“what do I need? Ties, then whether something repeats; the relation is really important here, and I’m going to create this relation using SUMMARIZE over ALLSELECTED sales, because the things are coming from two different tables: the customer, which is a dimension to the sales, and the sales date, which comes from the sales table; that is why I definitely need the ALLSELECTED or the ALL data, and that is why I’m using ALLSELECTED on the sales inside the SUMMARIZE; from customer what I need is the name…”
Row Number: A very useful function which helps in creating sequential numbers, either overall or in a partitioned manner.
“I will bring item name from the item table, and I would like to bring the sales state from the sales table, and now I would like to bring one measure, net; now here I want to create a row number, and the row number can be based on any of my conditions…”
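Two hedged sketches of the window-function family, again with illustrative table and measure names:

```dax
-- Running total from the first date to the current one
Running Net =
CALCULATE (
    [Total Net],
    WINDOW (
        1, ABS,                          -- from the first row...
        0, REL,                          -- ...to the current row
        ALLSELECTED ( 'Date'[Date] ),
        ORDERBY ( 'Date'[Date], ASC )
    )
)

-- Top category per brand with INDEX: position 1 within each brand partition
Top Category Net =
CALCULATE (
    [Total Net],
    INDEX (
        1,
        ALLSELECTED ( Item[Brand], Item[Category] ),
        ORDERBY ( [Total Net], DESC ),
        PARTITIONBY ( Item[Brand] )
    )
)
```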
Visual Calculations:
Context-Based Calculations: Visual calculations perform calculations based on the visual’s context using DAX.
Reset Option: The RESET option in OFFSET can be used to make the calculation restart where needed.
“and as you can see, inside the brand 10 it is not getting the value for the first category; to make it easier to understand, let me first remove the subtotals, so let me hide the subtotals…”
RANK with Reset: Enables ranking within partitions.
“and as you can see the categories are ranked properly inside each brand so there is a reset happening for each brand and categories are ranked inside that…”
Implicit Measures: You can also use the visual’s implicit measures in a visual calculation.
“in this ROWNUMBER function I’m going to use the relation, which is rows; the next thing is order by, and in this order by I’m going to use something which we have in this visual: sum of quantity; see, I have not created a measure here, I’m going to use the visual’s sum of quantity in this visual calculation…”
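A few visual-calculation sketches, written against the visual’s implicit [Sum of quantity] field from the quote; these are entered through the visual’s New calculation option rather than the model:

```dax
Running Qty = RUNNINGSUM ( [Sum of quantity] )

Qty vs Previous = [Sum of quantity] - PREVIOUS ( [Sum of quantity] )

Qty Rank = RANK ( DENSE, ORDERBY ( [Sum of quantity], DESC ) )
```

The optional AXIS and RESET arguments (for example HIGHESTPARENT) control the direction of the calculation and where it restarts within the visual hierarchy.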
Conclusion:
The provided material covers a wide array of features and capabilities within Power BI. The document highlights the importance of understanding both the visual options and the underlying DAX language for effective data analysis and presentation. The exploration of time intelligence functions and new DAX functions further empowers users to create sophisticated and actionable reports. This is a good starting point for gaining deep knowledge of Power BI visuals.
Power BI Visuals and DAX: A Comprehensive Guide
Frequently Asked Questions on Power BI Visuals and DAX
What is the difference between “drill down,” “drill up,” and “expand” options in a Matrix visual?
Drill down moves to the next level of a hierarchy, while drill up returns to a higher level. Expand adds the next level without losing the current one and can be used multiple times for multiple levels, while “next level” only takes you to the next available level, one click at a time.
What is the difference between a “stepped layout” and a non-stepped layout in Matrix visuals?
A stepped layout displays hierarchical data with indentation, showing how values relate to each other within a hierarchy. A non-stepped layout displays all levels without indentation, in a more tabular fashion.
How can I control subtotal and grand total displays in a Matrix visual?
In the format pane under “Row sub totals,” you can enable/disable sub totals for all levels, individual row levels, and grand totals. You can also choose which level of sub totals to display, add custom labels, and position them at the top or bottom of their respective sections. Subtotals at each level are controlled by the highest level in the row hierarchy at that point.
What customization options are available for Pie and Donut visuals?
For both Pie and Donut visuals, you can adjust the colors of slices, add detail labels with percentage values, rotate the visual, control label sizes and placement, use a background, and add tooltips. Donut visuals can also be used with a transparent center to display a value in a card visual in the middle. With a Pie chart you additionally have the option of a legend with a title and placement options, which the Donut chart does not have.
How does the Treemap visual differ from the Pie and Donut visuals, and what customization options does it offer?
The Treemap visual uses rectangles to represent hierarchical data; it does not show percentages directly, and unlike Pie there is no legend. Instead, you have category, details, and values. You can add data labels and additional details as tooltips, adjust the font and label position, and add a background and control its transparency. Conditional formatting is only available at single category levels.
What are the key differences between Area and Stacked Area visuals, and how are they formatted?
Area charts visualize trends using a continuous area, while Stacked Area charts show the trends of multiple series stacked on top of one another. Both visuals share similar formatting options, including x-axis and y-axis customization, title and legend adjustments, reference lines, shade transparency, and the ability to switch between continuous and categorical axis types based on your dataset. These features are similar across a wide range of visualizations. You can use multiple measures on the y-axis or a legend to create an area visual, and you can use both a measure and a legend in the case of a stacked area visual.
What are the key components and customization options for the Scatter visual?
The Scatter visual plots data points based on X and Y axis values, usually measures. You can add a size variable to create bubbles and use different marker shapes or conditional formatting to color the markers. You can also add a play axis, tooltips, and a legend for more interactive visualizations. You cannot add a dimension to the Y axis; dimensions can go on color or size instead.
How do you use DAX to create calculated columns and measures, and what are the differences between them?
DAX (Data Analysis Expressions) is a language used in Power BI for calculations and queries in tabular data models. Calculated columns add new columns to a table based on DAX expressions. Measures are dynamic calculations based on aggregations, responding to filters and slicers, and do not add a column to the table. Both use the same formula language, but columns are fixed for each row while measures are evaluated when used. DAX calculations can be created as measure definitions as well as in the query view, where you can see your results in tabular format and from which you can create measures in the model view.
Mastering Power BI: A Comprehensive Guide
Power BI is a business intelligence and analytics service that provides insights through data analysis [1]. It is a collection of software services, apps, and connectors that work together to transform unrelated data sources into coherent, visually immersive, and interactive insights [1].
Key aspects of Power BI include:
Data Visualization: Power BI enables sharing of insights through data visualizations, which can be incorporated into reports and dashboards [1].
Scalability and Governance: It is designed to scale across organizations and has built-in governance and security features, allowing businesses to focus on data usage rather than management [1].
Data Analytics: This involves examining and analyzing data sets to draw insights, conclusions, and make data-driven decisions. Statistical and analytical techniques are used to interpret relevant information from data [1].
Business Intelligence: This refers to the technology, applications, and practices for collecting, integrating, analyzing, and presenting business information to support better decision-making [1]. Power BI can collect data from various sources, integrate them, analyze them, and present the results [1].
The journey of using Power BI and other business intelligence analytics tools starts with data sources [2]. Common sources include:
External sources such as Excel and databases [2].
Data can be imported into Power BI Desktop [2].
Import Mode: The data resides within Power BI [2].
Direct Query: A connection is created, but the data is not imported [2].
Power BI reports are created on the desktop using Power Query for data transformation, DAX for calculations, and visualizations [2].
Reports can be published to the Power BI service, an ecosystem for sharing and collaboration [2].
On-premises data sources require an on-premises gateway for data refresh [2]. Cloud sources do not need an on-premises gateway [2].
Published reports are divided into two parts: a dataset (or semantic model) and a report [2].
The dataset can act as a source for other reports [2].
Live connections can be created to reuse datasets [2].
Components of Power BI Desktop
Power Query: Used for data preparation, cleaning, and transformation [2].
The online version is known as data flow, available in two versions: Gen 1 and Gen 2 [2].
DAX: Used for creating complex measures and calculations [2].
Direct Lake: A new connection type in Microsoft Fabric that merges import and direct query [2].
Power BI Desktop Interface
The ribbon at the top contains menus for file, home, insert, modeling, view, optimize, help, and external tools [3].
The Home tab includes options to get data, transform data (Power Query), and modify data source settings [3].
The Insert tab provides visualization options [3].
The Modeling tab allows for relationship management, creating measures, columns, tables, and parameters [3].
The View tab includes options for themes, page views, mobile layouts, and enabling/disabling panes [3].
Power BI Service
Power BI Service is the ecosystem where reports are shared and collaborated on [2].
It requires a Pro license to create a workspace and share content [4].
Workspaces are containers for reports, paginated reports, dashboards, and datasets [4].
The service allows for data refresh scheduling, with Pro licenses allowing 8 refreshes per day and Premium licenses allowing 48 [2].
The service also provides for creation of apps for sharing content [4].
The service has a number of settings that can be configured by the admin, such as tenant settings, permissions, and data connections [4, 5].
Data Transformation with Power Query
Power Query is a data transformation and preparation engine [6].
It uses the “M” language for data transformation [6].
It uses a graphical interface with ribbons, menus, buttons, and interactive components to perform operations [6].
Power Query is available in Power BI Desktop, Power BI online, and other Microsoft products and services [6].
Common operations include connecting to data sources, extracting data, transforming data, and loading it into a model [6].
DAX (Data Analysis Expressions)
DAX is used for creating measures, calculated columns, and calculated tables [7].
It can be used in the Power BI Desktop and Power BI service [7].
The DAX query view allows for writing and executing DAX queries, similar to a SQL editor [7].
The query view has formatting options, commenting, and find/replace [7].
DAX query results must return a table [7].
Visuals
Power BI offers a range of visuals, including tables, slicers, charts, and combo visuals [8-10].
Text slicers allow for filtering data based on text input [10].
They can be used to create dependent slicers where other slicers are filtered by the text input [10].
Sync slicers allow for synchronizing slicers across different fields, even if the fields are in different tables [9].
Combo visuals combine charts, such as bar charts and line charts [9].
Conditional formatting can be applied to visuals based on DAX expressions [7].
Key Concepts
Data Quality: High-quality data is necessary for quality analysis [1].
Star Schema: Power BI models typically use a star schema with fact and dimension tables [11].
Semantic Model: A data model with relationships, measures, and calculations [2].
Import Mode: Data is loaded into Power BI [12].
Direct Query: Data is not imported; queries are sent to the source [12].
Live Connection: A connection to a semantic model, where the model is not owned by Power BI [12].
Direct Lake: Connection type that leverages Microsoft Fabric data lake [12].
These concepts and features help users analyze data and gain insights using Power BI.
Data Manipulation in Power BI Using Power Query and M
Data manipulation in Power BI primarily involves using Power Query for data transformation and preparation [1-3]. Power Query is a data transformation and data preparation engine that helps to manipulate data, clean data, and put it into a format that Power BI can easily understand [2]. It is a graphical user interface with menus, ribbons, buttons, and interactive components, making it easy to apply transformations [2]. The transformations are also tracked, with every step recorded [3]. Behind the scenes, Power Query uses a scripting language known as “M” language for all transformations [2].
Here are key aspects of data manipulation in Power BI:
Data Loading: Data can be loaded from various sources, such as Excel files, CSVs, and databases [4, 5].
When loading data, users can choose between “load data” (if the data is ready) or “transform data” to perform transformations before loading [5].
Data can be loaded via import mode, where the data resides within Power BI, or direct query, where a connection is created, but data is not imported [1, 5]. There is also Direct Lake, a new mode that combines the best of import and direct query for Microsoft Fabric lake houses and warehouses [1].
Power Query Editor: The Power Query Editor is the primary interface for performing data transformations [2].
It can be accessed by clicking “Transform Data” in Power BI Desktop [3].
The editor provides a user-friendly set of ribbons, menus, buttons and other interactive components for data manipulation [2].
The Power Query editor is also available in Power BI online, Microsoft Fabric data flow Gen2, Microsoft Power Platform data flows, and Azure data factory [2].
Data Transformation Steps: Power Query captures every transformation step, allowing users to track and revert changes [3].
Common transformations include:
Renaming columns and tables [3, 6].
Changing data types [3].
Filtering rows [7].
Removing duplicates [3, 8].
Splitting columns by delimiter or number of characters [9].
Grouping rows [9].
Pivoting and unpivoting columns [3, 10].
Merging and appending queries [8].
Creating custom columns using formulas [8, 9].
Column Operations: Power Query allows for examining column properties, such as data quality, distribution, and profiles [3].
Column Quality shows valid, error, and empty values [3].
Column Distribution shows the count of distinct and unique values [3].
Column Profile shows statistics such as count, error, empty, distinct, unique, min, max, average, standard deviation, odd, and even values [3].
Users can add custom columns with formulas or duplicate existing columns [8].
M Language: Power Query uses the M language for all data transformations [2].
M is a case-sensitive language [11].
M code can be viewed and modified in the Advanced Editor [2].
M code consists of a let statement holding variables and steps, expressions for the transformations, and an in statement that outputs a query formula step [11].
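A minimal let/in sketch, assuming a hypothetical Sales.csv path; each step is a named expression and the in clause returns the final step (remember that M is case-sensitive):

```m
let
    Source   = Csv.Document ( File.Contents ( "C:\Data\Sales.csv" ) ),
    Promoted = Table.PromoteHeaders ( Source ),
    Typed    = Table.TransformColumnTypes ( Promoted, { { "Quantity", Int64.Type } } )
in
    Typed
```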
Star Schema Creation: Power Query can be used to transform single tables into a star schema by creating multiple dimension tables and a fact table [12].
This involves duplicating tables, removing unnecessary columns, and removing duplicate rows [12].
Referencing tables is preferable to duplicating them because it only loads data once [12].
Cross Joins: Power Query does not have a direct cross join function, but it can be achieved using custom columns to bring one table into another, creating a Cartesian product [11].
Rank and Index: Power Query allows for adding index columns for unique row identification [9].
It also allows for ranking data within groups using custom M code [13].
Data Quality: Power Query provides tools to identify and resolve data quality issues, which is important for getting quality data for analysis [3, 12].
Performance: When creating a data model with multiple tables using Power Query, it is best to apply changes periodically, rather than all at once, to prevent it from taking too much time to load at the end [10].
By using Power Query and the M language, users can manipulate and transform data in Power BI to create accurate and reliable data models [2, 3].
Power BI Visualizations: A Comprehensive Guide
Power BI offers a variety of visualizations to represent data and insights, which can be incorporated into reports and dashboards [1]. These visualizations help users understand data patterns, trends, and relationships more effectively [1].
Key aspects of visualizations in Power BI include:
Types of Visuals: Power BI provides a wide array of visuals, including tables, matrices, charts, maps, and more [1].
Tables display data in a tabular format with rows and columns [1, 2]. They can include multiple sorts and allow for formatting options like size, style, background, and borders [2].
Table visuals can have multiple sorts by using the shift button while selecting columns [2].
Matrices are similar to tables, but they can display data in a more complex, multi-dimensional format.
Charts include various types such as:
Bar charts and column charts are used for comparing data across categories [3].
Line charts are used for showing trends over time [4].
Pie charts and donut charts display proportions of a whole [5].
Pie charts use legends to represent categories, and slices to represent data values [5].
Donut charts are similar to pie charts, but with a hole in the center [5].
Area charts and stacked area charts show the magnitude of change over time [6].
Scatter charts are used to display the relationship between two measures [6].
Combo charts combine different chart types, like bar and line charts, to display different data sets on the same visual [3].
Maps display geographical data [7].
Map visuals use bubbles to represent data values [7].
Shape map visuals use colors to represent data values [7].
Azure maps is a powerful map visual with various styles, layers, and options [8].
Tree maps display hierarchical data as nested rectangles [5].
Tree maps do not display percentages like pie charts [5].
Funnel charts display data in a funnel shape, often used to visualize sales processes [7].
Customization: Power BI allows for extensive customization of visuals, including:
Formatting Options: Users can modify size, style, color, transparency, borders, shadows, titles, and labels [2, 5].
Conditional Formatting: Visuals can be conditionally formatted based on DAX expressions, enabling dynamic visualization changes based on data [4, 9]. For instance, colors of scatter plot markers can change based on the values of discount and margin percentages [9].
Titles and Subtitles: Visuals can have titles and subtitles, which can be dynamic by using DAX measures [2].
Interactivity: Visuals in Power BI are interactive, allowing users to:
Filter and Highlight: Users can click on visuals to filter or highlight related data in other visuals on the same page [9].
Edit interactions can modify how visuals interact with each other. For example, you can prevent visuals from filtering each other or specify whether the interaction is filtering or highlighting [9].
Drill Through: Users can navigate to more detailed pages based on data selections [10].
Drill through buttons can be used to create more interactive reports, and the destination of the button can be conditional [10].
Tooltips: Custom tooltips can be created to provide additional information when hovering over data points [5, 10].
Tooltip pages can contain detailed information that is displayed as a custom tooltip. These pages can be customized to pass specific filters and parameters [10].
AI Visuals:
Key influencers analyze which factors impact a selected outcome [11].
Decomposition trees allow for root cause analysis by breaking down data into hierarchical categories [11].
Q&A visuals allow users to ask questions and display relevant visualizations [11].
Slicers: Slicers are used to filter data on a report page [9, 12].
List Slicers: Display a list of values to choose from [12].
Text slicers allow filtering based on text input [12].
Sync slicers synchronize slicers across different pages and fields [3, 12].
Card Visuals: Display single numerical values and can have formatting and reference labels [13].
New card visuals allow for displaying multiple measures and images [13].
Visual Calculations: Visual calculations are DAX calculations that are defined and executed directly on a visual. These calculations can refer to data within the visual, including columns, measures, and other visual calculations [14].
Visual calculations are not stored in the model but are stored in the visual itself [14].
These can be used for calculating running sums, moving averages, percentages, and more [14].
They can operate on aggregated data, often leading to better performance than equivalent measures [14].
They offer a variety of functions, such as RUNNINGSUM, MOVINGAVERAGE, PREVIOUS, NEXT, FIRST, and LAST. Many functions have optional AXIS and RESET parameters [14].
Bookmarks: Bookmarks save the state of a report page, including visual visibility [15].
Bookmarks can be used to create interactive reports, like a slicer panel, by showing and hiding visuals [15].
Bookmarks can be combined with buttons to create more interactive report pages [15].
By utilizing these visualizations and customization options, users can create informative and interactive dashboards and reports in Power BI.
Power BI Calculated Columns: A Comprehensive Guide
Calculated columns in Power BI are a type of column that you add to an existing table in the model designer. These columns use DAX (Data Analysis Expressions) formulas to define their values [1].
Here’s a breakdown of calculated columns, drawing from the sources:
Row-Level Calculations: Calculated columns perform calculations at the row level [2]. This means the formula is evaluated for each row in the table, and the result is stored in that row [1].
For example, a calculated column to calculate a “gross amount” by multiplying “sales quantity” by “sales price” will perform this calculation for each row [2].
Storage and Data Model: The results of calculated column calculations are stored in the data set or semantic model, becoming a permanent part of the table [1, 2].
This means that the calculated values are computed when the data is loaded or refreshed and are then saved with the table [3].
Impact on File Size: Because the calculated values are stored, calculated columns will increase the size of the Power BI file [2, 3].
The file size increases as new values are added into the table [2].
Performance Considerations: Calculated columns are computed during data load time, and this computation can impact load time [3].
Row-level calculations can be costly if the data is large, impacting runtime [4].
For large datasets, it may be more efficient to perform some calculations in a calculated column and then use measures for further aggregations [2].
Creation Methods: There are multiple ways to create a new calculated column [2]:
In Table Tools, you can select “New Column” [2, 3].
In Column Tools, you can select “New Column” after selecting a column [2].
You can also right-click on any table or column and choose “New Column” [2].
Formula Bar: The formula bar is used to create the new calculated column, with the following structure [2]:
The left side of the formula bar is where the new column is named [2].
The right side of the formula bar is where the DAX formula is written to define the column’s value [2].
Line numbers in the formula bar are not relevant and are added automatically [2].
Fully Qualified Names: When writing formulas, it is recommended to use fully qualified names (i.e., table name and column name) to avoid ambiguity [2].
Column Properties: Once a calculated column is created, you can modify its properties in the Column tools, like [2]:
Name.
Data type.
Format (e.g., currency, percentage, decimal places).
Summarization (e.g., sum, average, none).
Data category (e.g., city, state) [3].
Sort by column [3].
When to Use Calculated Columns: Use when you need row-level calculations that are stored with the data [2, 4].
Multiplication should be done at the row level and then summed up. When you have to multiply values across rows, you should use a calculated column or a measure with an iterator function like SUMX [4].
Calculated columns are suitable when you need to perform calculations that can be pre-computed and don’t change based on user interaction or filters [3].
When to Avoid Calculated Columns: When there is a division, the division should be done after aggregation [4]. It is generally better to first aggregate and then divide by using a measure.
Examples:
Calculating gross amount by multiplying sales quantity and sales price [2].
Calculating discount amount by multiplying gross amount by discount percentage and dividing it by 100 [2].
Calculating cost of goods sold (COGS) by multiplying sales quantity by sales cost [2].
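As formula-bar sketches, assuming those Sales columns exist under these illustrative names:

```dax
Gross Amount    = Sales[Sales Quantity] * Sales[Sales Price]
Discount Amount = Sales[Gross Amount] * Sales[Discount Percentage] / 100
COGS Amount     = Sales[Sales Quantity] * Sales[Sales Cost]
```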
Limitations: Calculated columns increase the file size [3].
Calculated columns are computed at data load time [3].
They are not dynamic and will not change based on filters and slicers [5, 6].
They are not suitable for aggregations [4].
In summary, calculated columns are useful for pre-calculating and storing row-level data within your Power BI model, but it’s important to be mindful of their impact on file size, load times, and to understand when to use them instead of measures.
Power BI Measures: A Comprehensive Guide
Measures in Power BI are dynamic calculation formulas that are used for data analysis and reporting [1]. They are different from calculated columns because they do not store values, but rather are calculated at runtime based on the context of the report [1, 2].
Here’s a breakdown of measures, drawing from the sources:
Dynamic Calculations: Measures are dynamic calculations, which means that the results change depending on the context of the report [1]. The results will change based on filters, slicers, and other user interactions [1]. Measures are not stored with the data like calculated columns; instead, they are calculated when used in a visualization [2].
Run-Time Evaluation: Unlike calculated columns, measures are evaluated at run-time [1, 2]. This means they are calculated when the report is being viewed and as the user interacts with the report [2].
This makes them suitable for aggregations and dynamic calculations.
No Storage of Values: Measures do not store values in the data model; they only contain the definition of the calculation [2]. Therefore, they do not increase the size of the Power BI file [3].
Aggregation: Measures are used for aggregated level calculations which means they are used to calculate sums, averages, counts, or other aggregations of data [3, 4].
Measures should be used for performing calculations on aggregated data [3].
Creation: Measures are created using DAX (Data Analysis Expressions) formulas [1]. Measures can be created in the following ways:
In the Home tab, select “New Measure” [5].
In Table Tools, select “New Measure” after selecting a table [5].
Right-click on a table or a column and choose “New Measure” [5].
Formula Bar: Similar to calculated columns, the formula bar is used to define the measure, with the following structure:
The left side of the formula bar is where the new measure is named.
The right side of the formula bar is where the DAX formula is written to define the measure’s value.
Naming Convention: When creating measures, a common practice is to add the word “amount” at the end of the column name so that the measure names can be simple without “amount” in the name [5].
Types of Measures:
Basic Aggregations: Measures can perform simple aggregations such as SUM, MIN, MAX, AVERAGE, COUNT, and DISTINCTCOUNT [6].
SUM adds up values [7].
MIN gives the smallest value in the column [6].
MAX gives the largest value in the column [6].
COUNT counts the number of values in a column [6].
DISTINCTCOUNT counts unique values in a column [6].
Time Intelligence Measures: Measures can use functions to perform time-related calculations like DATESMTD, DATESQTD, and DATESYTD [8].
Division Measures: When creating a measure that includes division, it is recommended to use the DIVIDE function, which can handle cases of division by zero [7].
Measures vs. Calculated Columns: Measures are dynamic, calculated at run-time, and do not increase file size [1, 2].
Calculated Columns are static, computed at data load time, and increase file size [3].
Measures are best for aggregations, and calculated columns are best for row-level calculations [3, 4].
Formatting: Measures can be formatted using the Measure tools or the Properties pane in the data model view [7].
Formatting includes setting the data type, number of decimal places, currency symbols, and percentage formatting [5, 7].
Multiple measures can be formatted at once using the model view [7].
Formatting can be set at the model level, which applies to all visuals unless overridden at the visual level [9].
Formatting can also be set at the visual level, which overrides the model-level formatting [9].
Additionally, formatting can be set at the element level, which overrides both the model and visual level formatting, such as data labels in a chart [9].
Examples: Calculating the total gross amount by summing the sales gross amount [7].
Calculating the total cost of goods sold (COGS) by summing the cogs amount [7].
Calculating total discount amount by summing the discount amount [7].
Calculating net amount by subtracting the discount from the gross amount [7].
Calculating margin by subtracting cogs from the net amount [7].
Calculating discount percentage by dividing the discount amount by the gross amount [7].
Calculating margin percentage by dividing the margin amount by the net amount [7].
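The same examples as hedged formula-bar sketches, with illustrative column names:

```dax
Gross      = SUM ( Sales[Gross Amount] )
COGS       = SUM ( Sales[COGS Amount] )
Discount   = SUM ( Sales[Discount Amount] )
Net        = [Gross] - [Discount]
Margin     = [Net] - [COGS]
Discount % = DIVIDE ( [Discount], [Gross] )
Margin %   = DIVIDE ( [Margin], [Net] )
```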
In summary, measures are used to perform dynamic calculations, aggregations, and other analytical computations based on the context of the report. They are essential for creating interactive and informative dashboards and reports [1].
This comprehensive course focuses on algorithmic trading, machine learning, and quantitative strategies using Python. It introduces participants to three distinct trading strategies: an unsupervised learning strategy using S&P 500 data and K-means clustering, a Twitter sentiment-based strategy for NASDAQ 100 stocks, and an intraday strategy employing a GARCH model for volatility prediction on simulated data. The course covers data preparation, feature engineering, backtesting strategies, and the role of machine learning in trading, while emphasizing that the content is for educational purposes only and not financial advice. Practical steps for implementing these strategies in Python are demonstrated, including data download, indicator calculation, and portfolio construction and analysis.
Algorithmic Trading Fundamentals and Opportunities
Based on the sources, here is a discussion of algorithmic trading basics:
Algorithmic trading is defined as trading on a predefined set of rules. These rules are combined into a strategy or a system. The strategy or system is developed using a programming language and is run by a computer.
Algorithmic trading can be used for both manual and automated trading. In manual algorithmic trading, you might use a screener developed algorithmically to identify stocks to trade, or an alert system that notifies you when conditions are triggered, but you would manually execute the trade. In automated trading, a complex system performs calculations, determines positions and sizing, and executes trades automatically.
Python is highlighted as the most popular language used in algorithmic trading, quantitative finance, and data science. This is primarily due to the vast amount of libraries available in Python and its ease of use. Python is mainly used for data pipelines, research, backtesting strategies, and automating low complexity systems. However, Python is noted as a slow language, so for high-end, complicated systems requiring very fast trade execution, languages like Java or C++ might be used instead.
The sources also present algorithmic trading as a great career opportunity within a huge industry, with potential jobs at hedge funds, banks, and prop shops. Key skills needed for those interested in this field include Python, backtesting strategies, replicating papers, and machine learning in trading.
Machine Learning Strategies in Algorithmic Trading
Drawing on the provided sources, machine learning plays a significant role within algorithmic trading and quantitative finance. Algorithmic trading itself involves trading based on a predefined set of rules, which are combined into a strategy or system developed using a programming language and run by a computer. Machine learning can be integrated into these strategies.
Here’s a discussion of machine learning strategies as presented in the sources:
Role and Types of Machine Learning in Trading
Machine learning is discussed as a key component in quantitative strategies. The course overview explicitly includes “machine learning in trading” as a topic. Two main types of machine learning are mentioned in the context of their applications in trading:
Supervised Learning: This can be used for signal generation by making predictions, such as generating buy or sell signals for an asset based on predicting its return or the sign of its return. It can also be applied in risk management to determine position sizing, the weight of a stock in a portfolio, or to predict stop-loss levels.
Unsupervised Learning: The primary use case highlighted is to extract insights from data. This involves analyzing financial data to discover patterns, relationships, or structures, like clusters, without predefined labels. These insights can then be used to aid decision-making. Specific unsupervised learning techniques mentioned include clustering, dimensionality reduction, anomaly detection, market regime detection, and portfolio optimization.
Specific Strategies Covered in the Course
The course develops three large quantitative projects that incorporate or relate to machine learning concepts:
Unsupervised Learning Trading Strategy (Project 1): This strategy uses unsupervised learning (specifically K-means clustering) on S&P 500 stocks. The process involves collecting daily price data, calculating various technical indicators (like Garman-Klass volatility, RSI, Bollinger Bands, ATR, MACD, and dollar volume) and features (including monthly returns for different time horizons and rolling Fama-French factor betas). This data is aggregated monthly and filtered to the top 150 most liquid stocks. K-means clustering is then applied to group stocks into similar clusters based on these features. A specific cluster (cluster 3, hypothesized to contain stocks with good upward momentum based on RSI) is selected each month, and a portfolio is formed using efficient frontier optimization to maximize the Sharpe ratio for stocks within that cluster. This portfolio is held for one month and rebalanced. A notable limitation mentioned is that the project uses a stock list that likely has survivorship bias.
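For orientation, a minimal sketch of the monthly clustering step, assuming features is a scaled feature matrix for one month's filtered stocks; scikit-learn is an assumed choice here, not confirmed by the course:

```python
from sklearn.cluster import KMeans

def assign_clusters(features, n_clusters: int = 4):
    """Return one cluster label per stock for a single month's feature matrix."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return km.fit_predict(features)
```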
Twitter Sentiment Investing Strategy (Project 2): This project uses Twitter sentiment data on NASDAQ 100 stocks. While it is described as not having “machine learning modeling”, the core idea is to demonstrate how alternative data can be used to create a quantitative feature for a strategy. An “engagement ratio” is calculated (Twitter comments divided by Twitter likes). Stocks are ranked monthly based on this ratio, and the top five stocks are selected for an equally weighted portfolio. The performance is then compared to the NASDAQ benchmark (QQQ ETF). The concept here is feature engineering from alternative data sources. Survivorship bias in the stock list is again noted as a limitation that might skew results.
Intraday Strategy using GARCH Model (Project 3): This strategy focuses on a single asset using simulated daily and 5-minute intraday data. It combines signals from two time frames: a daily signal derived from predicting volatility using a GARCH model in a rolling window, and an intraday signal based on technical indicators (like RSI and Bollinger Bands) and price action patterns on 5-minute data. A position (long or short) is taken intraday only when both the daily GARCH signal and the intraday technical signal align, and the position is held until the end of the day. While GARCH is a statistical model, not a typical supervised/unsupervised ML algorithm, it’s presented within this course framework as a quantitative prediction method.
Challenges in Applying Machine Learning
Applying machine learning in trading faces significant challenges:
Theoretical Challenges: The reflexivity/feedback loop makes predictions difficult. If a profitable pattern predicted by a model is exploited by many traders, their actions can change the market dynamics, making the initial prediction invalid (the strategy is “arbitraged away”). Predicting returns and prices is considered particularly hard, followed by predicting the sign/direction of returns, while predicting volatility is considered “not that hard” or “quite straightforward”.
Technical Challenges: These include overfitting (where the model performs well on training data but fails on test data) and generalization issues (the model doesn’t perform the same in real-world trading). Nonstationarity in training data and regime shifts can also ruin model performance. The black box nature of complex models like neural networks can make them difficult to interpret.
Skills for Algorithmic Trading with ML
Key skills needed for a career in algorithmic trading and quantitative finance include knowing Python, how to backtest strategies, how to replicate research papers, and understanding machine learning in trading. Python is the most popular language due to its libraries and ease of use, suitable for research, backtesting, and automating low-complexity systems, though slower than languages like Java or C++ needed for high-end, speed-critical systems.
In summary, machine learning in algorithmic trading involves using models, primarily supervised and unsupervised techniques, for tasks like signal generation, risk management, and identifying patterns. The course examples illustrate building strategies based on clustering (unsupervised learning), engineering features from alternative data, and utilizing quantitative prediction models like GARCH, while also highlighting the considerable theoretical and technical challenges inherent in this field.
Algorithmic Trading Technical Indicators and Features
Technical indicators are discussed in the sources as calculations derived from financial data, such as price and volume, used as features and signals within algorithmic and quantitative trading strategies. They form part of the predefined set of rules that define an algorithmic trading system.
The sources mention and utilize several specific technical indicators and related features:
Garman-Klass Volatility: An approximation used to measure the intraday volatility of an asset, used in the first project.
RSI (Relative Strength Index): Calculated using the pandas_ta package, it’s used in the first project. In the third project, it’s combined with Bollinger Bands to generate an intraday momentum signal. In the first project, it was intentionally not normalized to aid in visualizing clustering results.
Bollinger Bands: Includes the lower, middle, and upper bands, calculated using pandas_ta. In the third project, they are used alongside RSI to define intraday trading signals based on price action patterns.
ATR (Average True Range): Calculated using pandas_ta, it requires multiple data series as input, necessitating a group by apply methodology for calculation per stock. Used as a feature in the first project.
MACD (Moving Average Convergence Divergence): Calculated using pandas_ta, also requiring a custom function and group by apply methodology. Used as a feature in the first project.
Dollar Volume: Calculated as adjusted close price multiplied by volume, often divided by 1 million. In the first project, it’s used to filter for the top 150 most liquid stocks each month, rather than as a direct feature for the machine learning model.
Monthly Returns: Calculated for different time horizons (1, 2, 3, 6, 9, 12 months) using the pandas pct_change method, with outliers handled by clipping. These are added as features to capture momentum patterns.
Rolling Factor Betas: Derived from Fama-French factors using rolling regression. While not traditional technical indicators, they are quantitative features calculated from market data to estimate asset exposure to risk factors.
In the algorithmic trading strategies presented, technical indicators serve multiple purposes:
Features for Machine Learning Models: In the first project, indicators like Garman-Klass volatility, RSI, Bollinger Bands, ATR, and MACD, along with monthly returns and factor betas, form an 18-feature dataset used as input for a K-means clustering algorithm. These features help the model group stocks into clusters based on their characteristics.
Signal Generation: In the third project, RSI and Bollinger Bands are used directly to generate intraday trading signals based on price action patterns. Specifically, a long signal occurs when RSI is above 70 and the close price is above the upper Bollinger band, and a short signal occurs when RSI is below 30 and the close is below the lower band. This intraday signal is then combined with a daily signal from a GARCH volatility model to determine position entry.
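A hedged sketch of that signal logic, assuming a 5-minute DataFrame with a close column and the pandas_ta default output column names:

```python
import numpy as np
import pandas as pd
import pandas_ta as ta

def intraday_signal(df: pd.DataFrame) -> pd.Series:
    """Long (+1) when RSI > 70 and close is above the upper Bollinger band;
    short (-1) when RSI < 30 and close is below the lower band."""
    rsi = ta.rsi(df["close"], length=20)
    bands = ta.bbands(df["close"], length=20)
    upper, lower = bands["BBU_20_2.0"], bands["BBL_20_2.0"]

    signal = pd.Series(np.nan, index=df.index)
    signal[(rsi > 70) & (df["close"] > upper)] = 1
    signal[(rsi < 30) & (df["close"] < lower)] = -1
    return signal
```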
The process of incorporating technical indicators often involves:
Calculating the indicator for each asset, frequently by grouping the data by ticker symbol. Libraries like pandas_ta simplify this process.
Aggregating the calculated indicator values to a relevant time frequency, such as taking the last value for the month.
Normalizing or scaling the indicator values, particularly when they are used as features for machine learning models. This helps ensure features are on a similar scale.
Combining technical indicators with other data types, such as alternative data (like sentiment in Project 2, though not a technical indicator based strategy) or volatility predictions (like the GARCH model in Project 3), to create more complex strategies.
In summary, technical indicators are fundamental building blocks in the algorithmic trading strategies discussed, serving as crucial data inputs for analysis, feature engineering for machine learning models, and direct triggers for trading signals. Their calculation, processing, and integration are key steps in developing quantitative trading systems.
Algorithmic Portfolio Optimization and Strategy
Based on the sources, portfolio optimization is a significant component of the quantitative trading strategies discussed, particularly within the context of machine learning applications.
Here’s a breakdown of how portfolio optimization is presented:
Role in Algorithmic Trading Portfolio optimization is explicitly listed as a topic covered in the course, specifically within the first module focusing on unsupervised learning strategies. It’s also identified as a use case for unsupervised learning in trading, alongside clustering, dimensionality reduction, and anomaly detection. The general idea is that after selecting a universe of stocks, optimization is used to determine the weights or magnitude of the position in each stock within the portfolio.
Method: Efficient Frontier and Maximizing Sharpe Ratio In the first project, the strategy involves using efficient frontier optimization to maximize the Sharpe ratio for the stocks selected from a particular cluster. This falls under the umbrella of “mean variance optimization”. The goal is to find the weights that yield the highest Sharpe ratio based on historical data.
Process and Inputs To perform this optimization, a function is defined that takes the prices of the selected stocks as input. The optimization process involves several steps:
Calculating expected returns for the stocks, using methods like mean_historical_return.
Calculating the covariance matrix of the stock returns, using methods like sample_covariance.
Initializing the EfficientFrontier object with the calculated expected returns and covariance matrix.
Applying constraints, such as weight bounds for individual stocks. The sources mention potentially setting a maximum weight (e.g., 10% or 0.1) for diversification and a dynamic lower bound (e.g., half the weight of an equally weighted portfolio).
Using a method like max_sharpe on the efficient frontier object to compute the optimized weights.
The optimization requires at least one year of historical daily price data prior to the optimization date for the selected stocks.
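Putting these steps together, here is a minimal sketch assuming the PyPortfolioOpt library (whose EfficientFrontier, mean_historical_return, and sample covariance helpers match the method names above); it also includes the equal-weight fallback described under "Challenges and Workarounds" below:

```python
# A hedged sketch, not the course's exact code: compute Max Sharpe weights
# with PyPortfolioOpt, falling back to equal weights if the solver fails.
import pandas as pd
from pypfopt import expected_returns, risk_models
from pypfopt.efficient_frontier import EfficientFrontier

def optimize_weights(prices: pd.DataFrame, lower_bound: float = 0.0) -> pd.Series:
    """prices: at least one year of daily prices, one column per ticker."""
    mu = expected_returns.mean_historical_return(prices)  # expected returns
    cov = risk_models.sample_cov(prices)                  # covariance matrix
    # Weight bounds: dynamic lower bound, 10% cap for diversification.
    ef = EfficientFrontier(mu, cov, weight_bounds=(lower_bound, 0.10))
    try:
        ef.max_sharpe()               # may raise if the problem is infeasible
        return pd.Series(ef.clean_weights())
    except Exception:
        # Workaround: default to an equally weighted portfolio this month.
        return pd.Series(1.0 / prices.shape[1], index=prices.columns)
```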
Rebalancing Frequency In the first project, the portfolio is formed using the optimized weights and held for one month, after which it is rebalanced by re-optimizing the weights for the next month’s selected stocks.
Challenges and Workarounds A practical challenge encountered during the implementation is that the optimization solver can sometimes fail, resulting in an “infeasible” status. When the Max Sharpe optimization fails, the implemented workaround is to default to using equal weights for the portfolio in that specific month.
Contrast with Other Strategies Notably, the second project, the Twitter sentiment investing strategy, is explicitly described as not having “machine learning modeling”, and it does not implement efficient frontier optimization. Instead, it forms an equally weighted portfolio of the top selected stocks each month. This highlights that while portfolio optimization, particularly using sophisticated methods like Efficient Frontier, is a key strategy, simpler approaches like equal weighting are also used depending on the strategy’s complexity and goals.
Twitter Sentiment Trading Strategy Using Engagement Ratio
Based on the sources, Sentiment analysis is discussed in the context of a specific quantitative trading strategy referred to as the Twitter sentiment investing strategy. This strategy forms the basis of the second project covered in the course.
Here’s what the sources say about sentiment analysis and its use in this strategy:
Concept: Sentiment investing focuses on analyzing how people feel about certain stocks, industries, or the overall market. The underlying assumption is that public sentiment can impact stock prices. For example, if many people express positive sentiment about a company on Twitter, it might indicate that the company’s stock has the potential to perform well.
Data Source: The strategy utilizes Twitter sentiment data specifically for NASDAQ 100 stocks. The data includes information like date, symbol, Twitter posts, comments, likes, impressions, and a calculated “Twitter sentiment” value provided by a data provider.
Feature Engineering: Rather than using the raw sentiment or impressions directly, the strategy focuses on creating a derivative quantitative feature called the “engagement ratio”. This is done to potentially create more value from the data.
The engagement ratio is calculated as Twitter comments divided by Twitter likes.
The reason for using the engagement ratio is to gauge the actual engagement people have with posts about a company. This is seen as more informative than raw likes or comments, partly because many bots on Twitter can skew raw metrics. A high ratio (comments roughly matching or exceeding likes) suggests genuine engagement, whereas many likes but few comments may indicate bot activity.
Strategy Implementation (a sketch follows this list):
The strategy involves calculating the average engagement ratio for each stock every month.
Stocks are then ranked cross-sectionally each month based on their average monthly engagement ratio.
For portfolio formation, the strategy selects the top stocks based on this rank. Specifically, the implementation discussed selects the top five stocks for each month.
A key characteristic of this particular sentiment strategy, in contrast to the first project, is that it does not use machine learning modeling.
Instead of portfolio optimization methods like Efficient Frontier, the strategy forms an equally weighted portfolio of the selected top stocks each month.
The portfolio is rebalanced monthly.
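A simplified sketch of this monthly selection logic, assuming a DataFrame tweets with date, symbol, comments, and likes columns (illustrative names, not the data provider's actual schema):

```python
import pandas as pd

def top5_by_engagement(tweets: pd.DataFrame) -> pd.DataFrame:
    """tweets: rows with 'date', 'symbol', 'comments', and 'likes' columns."""
    tweets = tweets.copy()
    # Engagement ratio: Twitter comments divided by Twitter likes.
    tweets["engagement_ratio"] = tweets["comments"] / tweets["likes"]
    # Average engagement ratio for each stock, each month.
    monthly = (tweets.groupby([pd.Grouper(key="date", freq="M"), "symbol"])
                     ["engagement_ratio"].mean().reset_index())
    # Rank cross-sectionally within each month and keep the top five stocks.
    monthly["rank"] = (monthly.groupby("date")["engagement_ratio"]
                              .rank(ascending=False))
    top5 = monthly[monthly["rank"] <= 5].copy()
    top5["weight"] = 1.0 / 5          # equally weighted portfolio
    return top5
```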
Purpose: The second project serves to demonstrate how alternative or different data, such as sentiment data, can be used to create a quantitative feature and a potential trading strategy.
Performance: Using the calculated engagement ratio in the strategy showed that it created “a little bit of value above the NASDAQ itself” when compared to the NASDAQ index as a benchmark. Using raw metrics like average likes or comments for ranking resulted in similar performance or underperformance relative to the benchmark.
This text provides a comprehensive introduction to data science, covering its growth, career opportunities, and required skills. It explores various data science tools, programming languages (like Python and R), and techniques such as machine learning and deep learning. The materials also explain how to work with different data types, perform data analysis, build predictive models, and present findings effectively. Finally, it examines the role of generative AI in enhancing data science workflows.
Python & Data Science Study Guide
Quiz
What is the purpose of markdown cells in Jupyter Notebooks, and how do you create one?
Explain the difference between int, float, and string data types in Python and provide an example of each.
What is type casting in Python, and why is it important to be careful when casting a float to an integer?
Describe the role of variables in Python and how you assign values to them.
What is the purpose of indexing and slicing in Python strings? Give an example of each.
Explain the concept of immutability in the context of strings and tuples and how it affects their manipulation.
What are the key differences between lists and tuples in Python?
Describe dictionaries in Python and how they are used to store data using keys and values.
What are sets in Python, and how do they differ from lists or tuples?
Explain the difference between a for loop and a while loop and how each can be used.
Quiz Answer Key
Markdown cells allow you to add titles and descriptive text to your notebook. You create one by selecting ‘Markdown’ from the cell-type dropdown in the toolbar (which displays ‘Code’ by default).
int represents integers (e.g., 5), float represents real numbers (e.g., 3.14), and string represents sequences of characters (e.g., “hello”).
Type casting is changing the data type of an expression (e.g., converting a string to an integer). When converting a float to an int, information after the decimal point is lost, so you must be careful.
Variables store values in memory, and you assign a value to a variable using the assignment operator (=). For example, x = 10 assigns 10 to the variable x.
Indexing allows you to access individual characters in a string using their position (e.g., string[0]). Slicing allows you to extract a substring (e.g., string[1:4]).
Immutable data types cannot be modified after creation. If you want to change a string or a tuple, you must create a new string or tuple.
Lists are mutable, meaning you can change them after creation; tuples are immutable. Lists are defined using square brackets [], while tuples use parentheses ().
Dictionaries store key-value pairs, where keys are unique and immutable and the values hold the associated information. You use curly brackets {}, and each key is separated from its value by a colon (e.g., {“name”: “John”, “age”: 30}).
Sets are unordered collections of unique elements. They do not keep track of order, and only contain a single instance of any item.
A for loop is used to iterate over a sequence of elements, like a list or string. A while loop runs as long as a certain condition is true, and does not necessarily require iterating over a sequence.
Essay Questions
Discuss the role and importance of data types in Python, elaborating on how different types influence operations and the potential pitfalls of incorrect type handling.
Compare and contrast the use of lists, tuples, dictionaries, and sets in Python. In what scenarios is each of these data structures more beneficial?
Describe the concept of functions in Python, providing examples of both built-in functions and user-defined functions, and explaining how they can improve code organization and reusability.
Analyze the use of loops and conditions in Python, explaining how they allow for iterative processing and decision-making, and discuss their relevance in data manipulation.
Explain the relationships among object-oriented programming concepts (such as classes, objects, methods, and attributes), and discuss how they translate into more complex data structures and functional operations.
Glossary
Boolean: A data type that can have one of two values: True or False.
Class: A blueprint for creating objects, defining their attributes and methods.
Data Frame: A two-dimensional data structure in pandas, similar to a table with rows and columns.
Data Type: A classification that specifies which type of value a variable has, such as integer, float, string, etc.
Dictionary: A data structure that stores data as key-value pairs, where keys are unique and immutable.
Expression: A combination of values, variables, and operators that the computer evaluates to a single value.
Float: A data type representing real numbers with decimal points.
For Loop: A control flow statement that iterates over a sequence (e.g., list, tuple) and executes code for each element.
Function: A block of reusable code that performs a specific task.
Index: The position of an element in a sequence such as a string, list, or tuple.
Integer (Int): A data type representing whole numbers, positive or negative.
Jupyter Notebook: An interactive web-based environment for coding, data analysis, and visualization.
Kernel: A program that runs code in a Jupyter Notebook.
List: A mutable, ordered sequence of elements defined with square brackets [].
Logistic Regression: A classification algorithm that predicts the probability of an instance belonging to a class.
Method: A function associated with an object of a class.
NumPy: A Python library for numerical computations, especially with arrays and matrices.
Object: An instance of a class, containing its own data and methods.
Operator: Symbols that perform operations such as addition, subtraction, multiplication, or division.
Pandas: A Python library for data manipulation and analysis.
Primary Key: A unique identifier for each record in a table.
Relational Database: A database that stores data in tables with rows and columns and structured relationships between tables.
Set: A data structure that is unordered and contains only unique values.
Sigmoid Function: A mathematical function used in logistic regression that outputs a value between zero and one.
Slicing: Extracting a portion of a sequence (e.g., list, string) using indexes (e.g., [start:end:step]).
SQL (Structured Query Language): Language used to manage and manipulate data in relational databases.
String: A sequence of characters, defined with single or double quotes.
Support Vector Machine (SVM): A classification algorithm that finds an optimal hyperplane to separate data classes.
Tuple: An immutable, ordered sequence of elements defined with parentheses ().
Type Casting: Changing the data type of an expression.
Variable: A named storage location in a computer’s memory used to hold a value.
View: A virtual table based on the result of an SQL query.
While Loop: A control flow statement that repeatedly executes a block of code as long as a condition remains true.
Python for Data Science
Briefing Document: Python Fundamentals and Data Science Tools
I. Overview
This document provides a summary of core concepts in Python programming, specifically focusing on those relevant to data science. It covers topics from basic syntax and data types to more advanced topics like object-oriented programming, file handling, and fundamental data analysis libraries. The goal is to equip a beginner with a foundational understanding of Python for data manipulation and analysis.
II. Key Themes and Ideas
Jupyter Notebook Environment: The sources emphasize the practical use of Jupyter notebooks for coding, analysis, and presentation. Key functionalities include running code cells, adding markdown for explanations, and creating slides for presentation.
“you can now start working on your new notebook… you can create a markdown to add titles and text descriptions to help with the flow of the presentation… the slides functionality in Jupyter allows you to deliver code, visualization, text, and outputs of the executed code as part of a project”
Python Data Types: The document systematically covers fundamental Python data types, including:
Integers (int) & Floats (float): “you can have different types in Python they can be integers like 11, real numbers like 21.213… we can have int which stands for an integer and float that stands for float, essentially a real number”
Strings (str): “the type string is a sequence of characters” Strings are explained to be immutable and accessible by index, and to support various methods.
Booleans (bool): “A Boolean can take on two values the first value is true… Boolean values can also be false”
Type Casting: The sources teach how to change one data type to another. “You can change the type of the expression in Python this is called type casting… you can convert an INT to a float for example”
Expressions and Variables: These sections explain basic operations and variable assignment:
Expressions: “Expressions describe a type of operation the computers perform… for example basic arithmetic operations like adding multiple numbers” The order of operations is also covered.
Variables: Variables are used to “store values” and can be reassigned, and they benefit from meaningful naming.
Compound Data Types (Lists, Tuples, Dictionaries, Sets):
Tuples: Ordered, immutable sequences using parentheses. “tuples are an ordered sequence… tuples are expressed as comma-separated elements within parentheses”
Lists: Ordered, mutable sequences using square brackets. “lists are also an ordered sequence… a list is represented with square brackets” Lists support methods like extend, append, and del.
Dictionaries: Collection with key-value pairs. Keys must be immutable and unique. “a dictionary has keys and values… the keys are the first elements, they must be immutable and unique, each key is followed by a value separated by a colon”
Sets: Unordered collections of unique elements. “sets are a type of collection… they are unordered… sets only have unique elements” Set operations like add, remove, intersection, union, and subset checking are covered.
Control Flow (Conditions & Loops):
Conditional Statements (if, elif, else): “The if statement allows you to make a decision based on some condition… if that condition is true the set of statements within the if block are executed”
For Loops: Used for iterating over a sequence. “The for loop statement allows you to execute a statement or set of statements a certain number of times”
While Loops: Used for executing statements while a condition is true. “a while loop will only run if a condition is met”
Functions:
Built-in Functions: len(), sum(), sorted().
User-defined Functions: The syntax and best practices are covered, including documentation, parameters, return values, and scope of variables. “To define a function we start with the keyword def… the name of the function should be descriptive of what it does”
Object-Oriented Programming (OOP):
Classes & Objects: “A class can be thought of as a template or a blueprint for an object… An object is a realization or instantiation of that class” The concepts of attributes and methods are also introduced.
File Handling: The sources cover the use of Python’s open() function, modes for reading (‘r’) and writing (‘w’), and the importance of closing files.
“we use the open function… the first argument is the file path, this is made up of the file name and the file directory, the second parameter is the mode, common values used include r for reading, w for writing, and a for appending” The use of the with statement is advocated for automatic file closing.
Libraries (Pandas & NumPy):
Pandas: Introduction to DataFrames, importing data (read_csv, read_excel), and operations like head(), selection of columns and rows (iloc, loc), and unique value discovery. “One way pandas allows you to work with data is in a data frame” Data slicing and filtering are shown.
NumPy: Introduction to ND arrays, creation from lists, accessing elements, slicing, basic vector operations (addition, subtraction, multiplication), broadcasting and universal functions, and array attributes. “a numpy array or ND array is similar to a list… each element is of the same type”
SQL and Relational Databases: SQL is introduced as a way to interact with data in relational database systems using Data Definition Language (DDL) and Data Manipulation Language (DML). DDL statements like create table, alter table, drop table, and truncate are discussed, as well as DML statements like insert, select, update, and delete. Concepts like views and stored procedures are also covered, as well as accessing database table and column metadata.
“Data definition language or ddl statements are used to define change or drop database objects such as tables… data manipulation language or DML statements are used to read and modify data in tables”
Data Visualization, Correlation, and Statistical Methods:
Pivot Tables and Heat Maps: Techniques for reshaping data and visualizing patterns using pandas pivot() method and heatmaps. “by using the pandas pivot method we can pivot the body style variable so it is displayed along the columns and the drive wheels will be displayed along the rows”
Correlation: Introduction to the concept of correlation between variables, using scatter plots and regression lines to visualize relationships. “correlation is a statistical metric for measuring to what extent different variables are interdependent”
Pearson Correlation: A method to quantify the strength and direction of linear relationships, emphasizing both correlation coefficients and p-values. “Pearson correlation method will give you two values: the correlation coefficient and the P value”
Chi-Square Test: A method to identify if there is a relationship between categorical variables. “The chi-square test is intended to test how likely it is that an observed distribution is due to chance”
Model Development:
Linear Regression: Introduction to simple and multiple linear regression for predictive modeling with independent and dependent variables. “simple linear regression or SLR is a method to help us understand the relationship between two variables the predictor independent variable X and the target dependent variable y”
Polynomial Regression: Introduction to non linear regression models.
Model Evaluation Metrics: Introduction to evaluation metrics like R-squared (R2) and Mean Squared Error (MSE).
K-Nearest Neighbors (KNN): Classification algorithm based on similarity to other cases. K selection and distance computation are discussed. “the K-nearest neighbors algorithm is a classification algorithm that takes a bunch of labeled points and uses them to learn how to label other points”
Evaluation Metrics for Classifiers: Metrics such as the Jaccard index, F1 Score and log loss are introduced for assessing model performance.
“evaluation metrics explain the performance of a model… we can define Jaccard as the size of the intersection divided by the size of the union of two label sets”
Decision Trees: Algorithm for data classification by splitting attributes, recursive partitioning, impurity, entropy and information gain are discussed.
“decision trees are built using recursive partitioning to classify the data… the algorithm chooses the most predictive feature to split the data on”
Logistic Regression: Classification algorithm that uses a sigmoid function to calculate probabilities and gradient descent to tune model parameters.
“logistic regression is a statistical and machine learning technique for classifying records of a data set based on the values of the input fields… in logistic regression we use one or more independent variables such as tenure, age, and income to predict an outcome such as churn”
Support Vector Machines: Classification algorithm based on transforming data to a high-dimensional space and finding a separating hyperplane. Kernel functions and support vectors are introduced.
“a support vector machine is a supervised algorithm that can classify cases by finding a separator. SVM works by first mapping data to a high-dimensional feature space so that data points can be categorized, even when the data are not otherwise linearly separable”
III. Conclusion
These sources lay a comprehensive foundation for understanding Python programming as it is used in data science. From setting up a development environment in Jupyter Notebooks to understanding fundamental data types, functions, and object-oriented programming, the document prepares learners for more advanced topics. Furthermore, the document introduces data analysis and visualization concepts, along with model building through regression techniques and classification algorithms, equipping beginners with practical data science tools. It is crucial to delve deeper into practical implementations, which are often available in the labs.
Python Programming Fundamentals and Machine Learning
Python & Jupyter Notebook
How do I start a new notebook and run code? To start a new notebook, click the plus symbol in the toolbar. Once you’ve created a notebook, type your code into a cell and click the “Run” button or use the shortcut Shift + Enter. To run multiple code cells, click “Run All Cells.”
How can I organize my notebook with titles and descriptions? To add titles and descriptions, use markdown cells. Select “Markdown” from the cell type dropdown, and you can write text, headings, lists, and more. This allows you to provide context and explain the code.
Can I use more than one notebook at a time? Yes, you can open and work with multiple notebooks simultaneously. Click the plus button on the toolbar, or go to File -> Open New Launcher or New Notebook. You can arrange the notebooks side-by-side to work with them together.
How do I present my work using notebooks? Jupyter Notebooks support creating presentations. Using markdown and code cells, you can create slides by selecting the View -> Cell Toolbar -> Slides option. You can then view the presentation using the Slides icon.
How do I shut down notebooks when I’m finished? Click the stop icon (second from the top) in the sidebar; this releases the memory being used by the notebook. You can terminate all sessions at once or individually. You will know a notebook is successfully shut down when you see “No Kernel” at the top right.
Python Data Types, Expressions, and Variables
What are the main data types in Python and how can I change them? Python’s main data types include int (integers), float (real numbers), str (strings), and bool (booleans). You can change data types using type casting. For example, float(2) converts the integer 2 to a float 2.0, or int(2.9) will convert the float 2.9 to the integer 2. Casting a string like “123” to an integer is done with int(“123”) but will result in an error if the string has non-integer values. Booleans can be cast to integers where True is converted to 1, and False is converted to 0.
What are expressions and how are they evaluated? Expressions are operations that Python performs. These can include arithmetic operations like addition, subtraction, multiplication, division, and more. Python follows mathematical conventions when evaluating expressions, with parentheses having the highest precedence, followed by multiplication and division, then addition and subtraction.
How do I store values in variables and work with strings? You can store values in variables using the assignment operator =, and then use the variable name in place of the value it stores. Variables can store the results of expressions, and the type of a variable can be determined with the type() command. Strings are sequences of characters enclosed in single or double quotes; you can access individual elements using indexes and perform operations like slicing, concatenation, and replication.
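For example, the casting and string behaviors described above can be checked directly in a few lines:

```python
print(float(2))    # 2.0   -> int to float gains a decimal part
print(int(2.9))    # 2     -> float to int drops everything after the point
print(int("123"))  # 123   -> a numeric string casts cleanly
# int("hello") would raise a ValueError: not a valid integer literal
print(int(True))   # 1     -> True casts to 1, False to 0

name = "Python"
print(name[0])     # 'P'            -> indexing
print(name[1:4])   # 'yth'          -> slicing
print(name + "!")  # 'Python!'      -> concatenation
print(name * 2)    # 'PythonPython' -> replication
```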
Python Data Structures: Lists, Tuples, Dictionaries, and Sets
What are lists and tuples, and how are they different? Lists and tuples are ordered sequences used to store data. Lists are mutable, meaning you can change, add, or remove elements. Tuples are immutable, meaning they cannot be changed once created. Lists are defined using square brackets [], and tuples are defined using parentheses ().
What are dictionaries and sets? Dictionaries are collections that store data in key-value pairs, where keys must be immutable and unique. Sets are collections of unique elements. Sets are unordered and therefore do not have indexes or ordered keys. You can perform various mathematical set operations such as union, intersection, adding and removing elements.
How do I work with nested collections and change or copy lists? You can nest lists and tuples inside other lists and tuples, and you access elements in these structures using the same indexing conventions. Because lists are mutable, assigning one list variable to another makes both variables refer to the same list, so changes to one affect the other; this is called aliasing. To copy a list rather than reference the original, use [:] (e.g., new_list = old_list[:]) to create a new, independent copy.
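A short sketch illustrating mutability, aliasing, and cloning as described above:

```python
nums = [1, 2, 3]                      # list: mutable
point = (1, 2)                        # tuple: immutable
person = {"name": "John", "age": 30}  # dictionary: key-value pairs
unique = {1, 2, 2, 3}                 # set: duplicates dropped -> {1, 2, 3}

alias = nums             # both names now refer to the SAME list (aliasing)
alias.append(4)
print(nums)              # [1, 2, 3, 4]: the change shows through 'nums'

clone = nums[:]          # [:] creates an independent copy
clone.append(5)
print(nums)              # still [1, 2, 3, 4]: the original is unchanged
```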
Control Flow, Loops, and Functions
How do I use conditions and branching in Python? You can use if, elif, and else statements to perform different actions based on conditions. You use comparison operators (==, !=, <, >, <=, >=) which return True or False. Based on whether the condition is True, the corresponding code blocks are executed.
What is the difference between for and while loops? for loops are used for iterating over a sequence, like a list or tuple, executing a block of code for every item in that sequence. while loops repeatedly execute a block of code as long as a condition is True; make sure the condition eventually becomes False, or the loop will run forever.
What are functions and how do I create them? Functions are reusable blocks of code. They are defined with the def keyword followed by the function name, parentheses for parameters, and a colon. The function’s code block is indented. Functions can take inputs (parameters) and return values. Functions are documented in the first few lines using triple quotes.
What are variable scope and global/local variables? The scope of a variable is the part of the program where the variable is accessible. Variables defined outside of a function are global variables and are accessible everywhere. Variables defined inside a function are local variables and are only accessible within that function; there is no conflict if a local variable has the same name as a global one. If you want a function to update a global variable, use the global keyword inside the function’s scope followed by the name of the global variable.
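A compact sketch pulling these pieces together (conditions, both loop types, a documented function, and the global keyword):

```python
total = 0                          # a global variable

def add_scores(scores):
    """Add up the scores and update the global running total."""
    global total                   # let the function update the global name
    for s in scores:               # for loop: iterate over a sequence
        total += s
    return total

grade = 75
if grade >= 90:
    print("A")
elif grade >= 70:                  # this branch runs for grade = 75
    print("B")
else:
    print("C")

countdown = 3
while countdown > 0:               # while loop: runs until the condition fails
    countdown -= 1                 # make sure the condition eventually fails

print(add_scores([10, 20, 30]))    # 60
```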
Object Oriented Programming, Files, and Libraries
What are classes and objects in Python? Classes are templates for creating objects. An object is a specific instance of a class. You can define classes with attributes (data) and methods (functions that operate on that data) using the class keyword, you can instantiate multiple objects of the same class.
How do I work with files in Python? You can use the open() function to create a file object, you use the first argument to specify the file path and the second for the mode (e.g., “r” for reading, “w” for writing, “a” for appending). Using the with statement is recommended, as it automatically closes the file after use. You can use methods like read(), readline(), and write() to interact with the file.
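For instance, writing and then reading a file with the with statement (the file name example.txt is just illustrative):

```python
with open("example.txt", "w") as f:   # "w": write mode (creates/overwrites)
    f.write("first line\n")
    f.write("second line\n")

with open("example.txt", "r") as f:   # "r": read mode
    print(f.readline(), end="")       # read a single line
    print(f.read(), end="")           # read the remainder
# No explicit close() needed: 'with' closes the file automatically.
```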
What is a library and how do I use Pandas for data analysis? Libraries are pre-written code that helps solve problems, like data analysis. You can import libraries using the import statement, often with a shortened name (as keyword). Pandas is a popular library for data analysis that uses data frames to store and analyze tabular data. You can load files like CSV or Excel into pandas data frames and use its tools for cleaning, modifying, and exploring data.
How can I work with NumPy? NumPy is a library for numerical computing that works with arrays. You can create NumPy arrays from Python lists and access and slice data using indexing and slicing. NumPy arrays support many mathematical operations, which are usually much faster and require less memory than regular Python lists.
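A brief sketch of both libraries; the make and price columns are hypothetical:

```python
import numpy as np
import pandas as pd

# In practice you would load a file, e.g. df = pd.read_csv("data.csv");
# here a small DataFrame is built inline so the example runs as-is.
df = pd.DataFrame({"make": ["audi", "bmw", "audi", "volvo"],
                   "price": [18000, 26000, 17000, 21000]})
print(df.head())                  # first rows of the DataFrame
print(df["make"].unique())        # unique values in a column
subset = df.loc[df["price"] > 20000, ["make", "price"]]  # label-based filtering
print(subset)

a = np.array([1.0, 2.0, 3.0])     # NumPy array from a Python list
b = np.array([10.0, 20.0, 30.0])
print(a + b)                      # element-wise addition: [11. 22. 33.]
print(2 * a)                      # broadcasting a scalar: [2. 4. 6.]
print(a[1:3])                     # slicing, like lists: [2. 3.]
```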
Databases and SQL
What is SQL, a database, and a relational database? SQL (Structured Query Language) is a programming language used to manage data in a database. A database is an organized collection of data. A relational database stores data in tables with rows and columns, it uses SQL for its main operations.
What is an RDBMS and what are the basic SQL commands? RDBMS (Relational Database Management System) is a software tool used to manage relational databases. Basic SQL commands include CREATE TABLE, INSERT (to add data), SELECT (to retrieve data), UPDATE (to modify data), and DELETE (to remove data).
How do I retrieve data using the SELECT statement? You can use SELECT followed by column names to specify which columns to retrieve. SELECT * retrieves all columns from a table. You can add a WHERE clause followed by a predicate (a condition) to filter data using comparison operators (=, >, <, >=, <=, !=).
How do I use COUNT, DISTINCT, and LIMIT with select statements? COUNT() returns the number of rows that match a criteria. DISTINCT removes duplicate values from a result set. LIMIT restricts the number of rows returned.
How do I create and populate a table? You can create a table with the CREATE TABLE command. Provide the name of the table and, inside parentheses, define the name and data types for each column. Use the INSERT statement to populate tables using INSERT INTO table_name (column_1, column_2…) VALUES (value_1, value_2…).
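Since the document's examples are in Python, here is a sketch of these basic commands run through Python's built-in sqlite3 module so it can be executed as-is; the instructors table and its rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")    # throwaway in-memory database
cur = conn.cursor()

# CREATE TABLE and INSERT, as described above.
cur.execute("CREATE TABLE instructors (id INTEGER, name TEXT, city TEXT)")
cur.executemany("INSERT INTO instructors (id, name, city) VALUES (?, ?, ?)",
                [(1, "Ana", "Toronto"), (2, "Raj", "Toronto"),
                 (3, "Mei", "Chicago")])

# SELECT with a WHERE predicate.
print(cur.execute(
    "SELECT name FROM instructors WHERE city = 'Toronto'").fetchall())

# COUNT, DISTINCT, and LIMIT.
print(cur.execute("SELECT COUNT(*) FROM instructors").fetchone())      # (3,)
print(cur.execute("SELECT DISTINCT city FROM instructors").fetchall())
print(cur.execute("SELECT * FROM instructors LIMIT 2").fetchall())
conn.close()
```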
More SQL
What are DDL and DML statements? DDL (Data Definition Language) statements are used to define database objects like tables (e.g., CREATE, ALTER, DROP, TRUNCATE). DML (Data Manipulation Language) statements are used to manage data in tables (e.g., INSERT, SELECT, UPDATE, DELETE).
How do I use ALTER, DROP, and TRUNCATE tables? ALTER TABLE is used to add, remove, or modify columns. DROP TABLE deletes a table. TRUNCATE TABLE removes all data from a table, but leaves the table structure.
How do I use views in SQL? A view is an alternative way of representing data that exists in one or more tables. Use CREATE VIEW followed by the view name, the column names and AS followed by a SELECT statement to define the data the view should display. Views are dynamic and do not store the data themselves.
What are stored procedures? A stored procedure is a set of SQL statements stored and executed on the database server. This avoids sending multiple SQL statements from the client to the server, they can accept input parameters, and return output values. You can define them with CREATE PROCEDURE.
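A small follow-up sketch of the DDL statements and a view, again via sqlite3; note that SQLite itself has no TRUNCATE statement or stored procedures, so DELETE FROM stands in for TRUNCATE and stored procedures are not shown:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employees (id INTEGER, name TEXT, salary REAL)")
cur.execute("ALTER TABLE employees ADD COLUMN department TEXT")  # DDL: modify

# A view: a named SELECT; the data stays in the underlying table.
cur.execute("""CREATE VIEW high_earners AS
               SELECT name, salary FROM employees WHERE salary > 100000""")

cur.execute("DELETE FROM employees")   # remove all rows, keep the structure
cur.execute("DROP TABLE employees")    # DDL: delete the table itself
conn.close()
```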
Data Visualization and Analysis
What are pivot tables and heat maps, and how do they help with visualization? A pivot table is a way to summarize and reorganize data from a table and display it in a rectangular grid. A heat map is a graphical representation of a pivot table where data values are shown using a color intensity scale. These are effective ways to examine and visualize relationships between multiple variables.
How do I measure correlation between variables? Correlation measures the statistical interdependence of variables. You can use scatter plots to visualize the relationship between two numerical variables and add a linear regression line to show their trend. Pearson correlation measures the linear correlation between continuous numerical values, providing the correlation coefficient and P-value. Chi-square test is used to identify if an association between two categorical variables exists.
What is simple linear regression and multiple linear regression? Simple linear regression uses one independent variable to predict a dependent variable through a linear relationship; multiple linear regression uses several independent variables to predict the dependent variable.
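A minimal sketch of the pivot and correlation ideas with pandas and SciPy; the small dataset (body styles, drive wheels, prices) is synthetic, echoing the examples mentioned above:

```python
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "body_style":   ["sedan", "sedan", "suv", "suv", "hatchback", "hatchback"],
    "drive_wheels": ["fwd", "rwd", "fwd", "rwd", "fwd", "rwd"],
    "engine_size":  [1.6, 2.0, 2.5, 3.0, 1.4, 1.8],
    "price":        [15000, 18000, 24000, 30000, 12000, 16000],
})

# Pivot table: drive wheels along the rows, body styles along the columns.
pivot = df.pivot(index="drive_wheels", columns="body_style", values="price")
# (A heat map would render this grid with a color intensity scale,
#  e.g., via seaborn.heatmap(pivot).)

# Pearson correlation: coefficient and p-value for two numeric columns.
coef, p_value = stats.pearsonr(df["engine_size"], df["price"])
print(f"Pearson coefficient: {coef:.3f}, p-value: {p_value:.4f}")

# Chi-square test: association between two categorical variables.
table = pd.crosstab(df["body_style"], df["drive_wheels"])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square: {chi2:.2f}, p-value: {p:.3f}")
```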
Model Development
What is a model and how can I use it for predictions? A model is a mathematical equation used to predict a value (dependent variable) given one or more other values (independent variables). Models are trained with data that determines parameters for an equation. Once the model is trained you can input data and have the model predict an output.
What are R-squared and MSE, and how are they used to evaluate model performance? R-squared measures how well the model fits the data; it is often described as the percentage of variation in the data explained by the fitted line, its “goodness of fit”. Mean squared error (MSE) is the average of the squared differences between the predicted values and the true values. These scores measure model performance for continuous target values and are called in-sample evaluation metrics, as they use training data.
What is polynomial regression? Polynomial regression is a form of regression analysis in which the relationship between the independent variable and the dependent variable is modeled as an nth degree polynomial. This allows more flexibility in the curve fitting.
What are pipelines in machine learning? Pipelines are a way to streamline machine learning workflows. They combine multiple steps (e.g., scaling, model training) into a single entity, making the process of building and evaluating models more efficient.
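A short sketch tying these model development ideas together with scikit-learn (assumed here as the modeling library): a pipeline that scales the inputs, adds polynomial terms, and fits a linear regression, evaluated with R-squared and MSE on synthetic data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] ** 2 + rng.normal(0, 5, size=100)   # quadratic relationship

model = Pipeline([("scale", StandardScaler()),       # standardize inputs
                  ("poly", PolynomialFeatures(degree=2)),  # add x^2 terms
                  ("regress", LinearRegression())])  # fit the linear model
model.fit(X, y)
y_hat = model.predict(X)

print("R-squared:", r2_score(y, y_hat))              # goodness of fit
print("MSE:", mean_squared_error(y, y_hat))          # average squared error
```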
Machine Learning Classification Algorithms
What is the K-Nearest Neighbors algorithm and how does it work? The K-Nearest Neighbors algorithm (KNN) is a classification algorithm that uses labeled data points to learn how to label other points. It classifies new cases by looking at the ‘k’ nearest neighbors in the training data according to some dissimilarity metric; the most popular label among those neighbors becomes the predicted class for the new data point. The choice of ‘k’ and the distance metric are important, and the appropriate dissimilarity measure depends on the data type.
What are common evaluation metrics for classifiers? Common evaluation metrics for classifiers include Jaccard Index, F1 Score, and Log Loss. Jaccard Index measures similarity. F1 Score combines precision and recall. Log Loss is used to measure the performance of a probabilistic classifier like logistic regression.
What is a confusion matrix? A confusion matrix is used to evaluate the performance of a classification model. It shows the counts of true positives, true negatives, false positives, and false negatives. This helps evaluate where your model is making mistakes.
What are decision trees and how are they built? Decision trees use a tree-like structure in which nodes represent decisions based on features and branches represent outcomes. They are constructed by recursive partitioning: at each step the data is split on the attribute with the highest information gain, which is the entropy of the tree before the split minus the weighted entropy of the tree after the split, thereby minimizing impurity.
What is logistic regression and how does it work? Logistic regression is a machine learning algorithm used for classification. It models the probability of a sample belonging to a specific class using a sigmoid function: it returns a probability p of the outcome being one and (1 - p) of the outcome being zero. Parameter values are trained, typically with gradient descent, to produce accurate estimates.
What is the Support Vector Machine algorithm? A support vector machine (SVM) is a classification algorithm that transforms data into a high-dimensional space so the classes can be separated by a hyperplane. The algorithm maximizes the margin between classes and relies on the data points closest to the hyperplane, called support vectors, during learning; kernel functions perform the mapping into the higher-dimensional space.
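A brief sketch of these classifiers in scikit-learn, evaluated with a confusion matrix and F1 score on a synthetic toy dataset:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for clf in (KNeighborsClassifier(n_neighbors=5),  # KNN: vote of 5 neighbors
            LogisticRegression(),                 # sigmoid-based probabilities
            SVC(kernel="rbf")):                   # SVM with a kernel function
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    print(type(clf).__name__,
          "F1:", round(f1_score(y_test, pred), 3))
    print(confusion_matrix(y_test, pred))         # rows: TN FP / FN TP counts
```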
A Data Science Career Guide
A career in data science is enticing due to the field’s recent growth, the abundance of electronic data, advancements in artificial intelligence, and its demonstrated business value [1]. The US Bureau of Labor Statistics projects a 35% growth rate in the field, with a median annual salary of around $103,000 [1].
What Data Scientists Do:
Data scientists use data to understand the world [1].
They investigate and explain problems [2].
They uncover insights and trends hiding behind data and translate data into stories to generate insights [1, 3].
They analyze structured and unstructured data from varied sources [4].
They clarify questions that organizations want answered and then determine what data is needed to solve the problem [4].
They use data analysis to add to the organization’s knowledge, revealing previously hidden opportunities [4].
They communicate results to stakeholders, often using data visualization [4].
They build machine learning and deep learning models using algorithms to solve business problems [5].
Essential Skills for Data Scientists:
Curiosity is essential to explore data and come up with meaningful questions [3, 4].
Argumentation helps explain findings and persuade others to adjust their ideas based on the new information [3].
Judgment guides a data scientist to start in the right direction [3].
Comfort and flexibility with analytics platforms and software [3].
Storytelling is key to communicating findings and insights [3, 4].
Technical Skills: Knowledge of programming languages like Python, R, and SQL [6, 7]. Python is widely used in data science [6, 7].
Familiarity with databases, particularly relational databases [8].
Understanding of statistical inference and distributions [8].
Ability to work with Big Data tools like Hadoop and Spark [2, 9].
Experience with data visualization tools and techniques [4, 9].
Soft Skills: Communication and presentation skills [5, 9].
Critical thinking and problem-solving abilities [5, 9].
Creative thinking skills [5].
Collaborative approach [5].
Educational Background and Training
A background in mathematics and statistics is beneficial [2].
Training in probability and statistics is necessary [2].
Knowledge of algebra and calculus is useful [2].
Comfort with computer science is helpful [3].
A degree in a quantitative field such as mathematics or statistics is a good starting point [4].
Career Paths and Opportunities:
Data science is relevant due to the abundance of available data, algorithms, and inexpensive tools [1].
Data scientists can work across many industries, including technology, healthcare, finance, transportation, and retail [1, 2].
There is a growing demand for data scientists in various fields [1, 9, 10].
Job opportunities can be found in large companies, small companies, and startups [10].
The field offers a range of roles, from entry-level to senior positions and leadership roles [10].
Career advancement can lead to specialization in areas like machine learning, management, or consulting [5].
Some possible job titles include data analyst, data engineer, research scientist, and machine learning engineer [5, 6].
How to Prepare for a Data Science Career:
Learn programming, especially Python [7, 11].
Study math, probability, and statistics [11].
Practice with databases and SQL [11].
Build a portfolio with projects to showcase skills [12].
Network both online and offline [13].
Research companies and industries you are interested in [14].
Develop strong communication and storytelling skills [3, 9].
Consider certifications to show proficiency [3, 9].
Challenges in the Field
Companies need to understand what they want from a data science team and hire accordingly [9].
It’s rare to find a “unicorn” candidate with all desired skills, so teams are built with diverse skills [8, 11].
Data scientists must stay updated with the latest technology and methods [9, 15].
Data professionals face technical, organizational, and cultural challenges when using generative AI models [15].
AI models need constant updating and adapting to changing data [15].
Data science is a process of using data to understand the world, and it involves validating hypotheses with data [1]. It is also the art of uncovering insights and using them to make strategic choices for companies [1]. With a blend of technical skills, curiosity, and the ability to communicate effectively, a career in data science offers diverse and rewarding opportunities [2, 11].
Data Science Skills and Generative AI
Data science requires a combination of technical and soft skills to be successful [1, 2].
Technical Skills
Programming languages such as Python, R, and SQL are essential [3, 4]. Python is widely used in the data science industry [4].
Database knowledge, particularly with relational databases [5].
Understanding of statistical concepts, probability, and statistical inference [2, 6-9].
Experience with machine learning algorithms [2, 3, 6].
Familiarity with Big Data tools like Hadoop and Spark, especially for managing and manipulating large datasets [2, 3, 7].
Ability to perform data mining and data wrangling, including cleaning, transforming, and preparing data for analysis [3, 6, 9, 10].
Data visualization skills are important for effectively presenting findings [2, 3, 6, 11]. This includes using tools like Tableau, Power BI, and R’s visualization packages [7, 10-12].
Knowledge of cloud computing, and cloud-based data management [3, 12].
Experience using libraries such as pandas, NumPy, SciPy, and Matplotlib in Python is useful for data analysis and machine learning [4].
Familiarity with tools like Jupyter Notebooks, RStudio, and GitHub is important for coding, collaboration, and project sharing [3].
Soft Skills
Curiosity is essential for exploring data and asking meaningful questions [1, 2].
Critical thinking and problem-solving skills are needed to analyze and solve problems [2, 7, 9].
Communication and presentation skills are vital for explaining technical concepts and insights to both technical and non-technical audiences [1-3, 7, 9].
Storytelling skills are needed to translate data into compelling narratives [1, 2, 7].
Argumentation is essential for explaining findings [1, 2].
Collaboration skills are important, as data scientists often work with other professionals [7, 9].
Creative thinking skills allow data scientists to develop innovative approaches [9].
Good judgment to guide the direction of projects [1, 2].
Grit and tenacity to persevere through complex projects and challenges [12, 13].
Additional skills:
Business analysis is important to understand and analyze problems from a business perspective [13].
A methodical approach is needed for data gathering and analysis [1].
Comfort and flexibility with analytics platforms is also useful [1].
How Generative AI Can Help
Generative AI can assist data scientists in honing these skills [9]:
It can ease the learning process for statistics and math [9].
It can guide coding and help prepare code [9].
It can help data professionals with data preparation tasks such as cleaning, handling missing values, standardizing, normalizing, and structuring data for analysis [9, 14].
It can assist with the statistical analysis of data [9].
It can aid in understanding the applicability of different machine learning models [9].
Note: It is important to note that while these technical skills are important, it is not always necessary to be an expert in every area [13, 15]. A combination of technical knowledge and soft skills with a focus on continuous learning is ideal [9]. It is also valuable to gain experience by creating a portfolio with projects demonstrating these skills [12, 13].
A Comprehensive Guide to Data Science Tools
Data science utilizes a variety of tools to perform tasks such as data management, integration, visualization, model building, and deployment [1]. These tools can be categorized into several types, including data management tools, data integration and transformation tools, data visualization tools, model building and deployment tools, code and data asset management tools, development environments, and cloud-based tools [1-3].
Data Management Tools
Relational databases such as MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server, and IBM Db2 [2, 4, 5]. These systems store data in a structured format with rows and columns, and use SQL to manage and retrieve the data [4].
NoSQL databases like MongoDB, Apache CouchDB, and Apache Cassandra are used to store semi-structured and unstructured data [2, 4].
File-based tools such as Hadoop File System (HDFS) and cloud file systems like Ceph [2].
Elasticsearch is used for storing and searching text data [2].
Data warehouses, data marts and data lakes are also important for data storage and retrieval [4].
Data Integration and Transformation Tools
ETL (Extract, Transform, Load) tools are used to extract data from various sources, transform it into a usable format, and load it into a data warehouse [1, 4].
Apache Airflow, Kubeflow, Apache Kafka, Apache NiFi, Apache Spark SQL, and Node-RED are open-source tools used for data integration and transformation [2].
Informatica PowerCenter and IBM InfoSphere DataStage are commercial tools used for ETL processes [5].
Data Refinery is a tool within IBM Watson Studio that enables data transformation using a spreadsheet-like interface [3, 5].
Data Visualization Tools
Tools that present data in graphical formats, such as charts, plots, maps, and animations [1].
Programming libraries like PixieDust for Python, which also has a user interface that helps with plotting [2].
Hue, which can create visualizations from SQL queries [2].
Kibana, a data exploration and visualization web application [2].
Apache Superset is another web application used for data exploration and visualization [2].
Tableau, Microsoft Power BI, and IBM Cognos Analytics are commercial business intelligence (BI) tools used for creating visual reports and dashboards [3, 5].
Plotly Dash for building interactive dashboards [6].
R’s visualization packages such as ggplot, plotly, lattice, and leaflet [7].
Datameer is a cloud-based data visualization tool [3].
Model Building and Deployment Tools
Machine learning and deep learning libraries in Python such as TensorFlow, PyTorch, and scikit-learn [8, 9].
Apache PredictionIO and Seldon are open-source tools for model deployment [2].
MLeap is another tool to deploy Spark ML models [2].
TensorFlow Serving is used to deploy TensorFlow models [2].
SPSS Modeler and SAS Enterprise Miner are commercial data mining products [5].
IBM Watson Machine Learning and Google AI Platform Training are cloud-based services for training and deploying models [1, 3].
Code and Data Asset Management Tools
Git is the standard tool for code asset management, or version control, with platforms like GitHub, GitLab, and Bitbucket being popular for hosting repositories [2, 7, 10].
Apache Atlas, ODPi Egeria, and Kylo are tools used for data asset management [2, 10].
Informatica Enterprise Data Governance and IBM provide tools for data asset management [5].
Development Environments
Jupyter Notebook is a web-based environment that supports multiple programming languages, and is popular among data scientists for combining code, visualizations, and narrative text [4, 10, 11]. Jupyter Lab is a more modern version of Jupyter Notebook [10].
RStudio is an integrated development environment (IDE) specifically for the R language [4, 7, 10].
Spyder is an IDE that attempts to mimic the functionality of RStudio, but for the Python world [10].
Apache Zeppelin provides an interface similar to Jupyter Notebooks but with integrated plotting capabilities [10].
IBM Watson Studio provides a collaborative environment for data science tasks, including tools for data pre-processing, model training, and deployment, and is available in cloud and desktop versions [1, 2, 5].
Visual tools like KNIME and Orange are also used [10].
Cloud-Based Tools
Cloud platforms such as IBM Watson Studio, Microsoft Azure Machine Learning, and H2O Driverless AI offer fully integrated environments for the entire data science life cycle [3].
Amazon Web Services (AWS), Google Cloud, and Microsoft Azure provide various services for data storage, processing, and machine learning [3, 12].
Cloud-based versions of existing open-source and commercial tools are widely available [3].
Programming Languages
Python is the most widely used language in data science due to its clear syntax, extensive libraries, and supportive community [8]. Libraries include pandas, NumPy, SciPy, Matplotlib, TensorFlow, PyTorch, and scikit-learn [8, 9].
R is specifically designed for statistical computing and data analysis [4, 7]. Packages such as dplyr, stringr, ggplot, and caret are widely used [7].
SQL is essential for managing and querying databases [4, 11].
Scala and Java are general purpose languages used in data science [9].
C++ is used to build high-performance libraries such as TensorFlow [9].
JavaScript can be used for data science with libraries such as tensorflow.js [9].
Julia is used for high performance numerical analysis [9].
Generative AI Tools
Generative AI tools are also being used for various tasks, including data augmentation, report generation, and model development [13].
SQL through AI converts natural language queries into SQL commands [12].
Tools such as DataRobot, AutoGluon, H2O Driverless AI, Amazon SageMaker Autopilot, and Google Vertex AI are used for automated machine learning (AutoML) [14].
Free tools such as AIO are also available for data analysis and visualization [14].
These tools support various aspects of data science, from data collection and preparation to model building and deployment. Data scientists often use a combination of these tools to complete their work.
Machine Learning Fundamentals
Machine learning is a subset of AI that uses computer algorithms to analyze data and make intelligent decisions based on what it has learned, without being explicitly programmed [1, 2]. Machine learning algorithms are trained with large sets of data, and they learn from examples rather than following rules-based algorithms [1]. This enables machines to solve problems on their own and make accurate predictions using the provided data [1].
Here are some key concepts related to machine learning:
Types of machine learning: Supervised learning is a type of machine learning where a human provides input data and correct outputs, and the model tries to identify relationships and dependencies between the input data and the correct output [3]. Supervised learning comprises two types of models:
Regression models are used to predict a numeric or real value [3].
Classification models are used to predict whether some information or data belongs to a category or class [3].
Unsupervised learning is a type of machine learning where the data is not labeled by a human, and the models must analyze the data and try to identify patterns and structure within the data based on its characteristics [3, 4]. Clustering models are an example of unsupervised learning [3].
Reinforcement learning is a type of learning where a model learns the best set of actions to take given its current environment to get the most rewards over time [3].
Deep learning is a specialized subset of machine learning that uses layered neural networks to simulate human decision-making [1, 2]. Deep learning algorithms can label and categorize information and identify patterns [1].
Neural networks (also called artificial neural networks) are collections of small computing units called neurons that take incoming data and learn to make decisions over time [1, 2].
Generative AI is a subset of AI that focuses on producing new data rather than just analyzing existing data [1, 5]. It allows machines to create content, including images, music, language, and computer code, mimicking creations by people [1, 5]. Generative AI can also create synthetic data that has similar properties as the real data, which is useful for training and testing models when there isn’t enough real data [1, 5].
Model training is the process by which a model learns patterns from data [3, 6].
Applications of Machine Learning
Machine learning is used in many fields and industries [7, 8]:
Predictive analytics is a common application of machine learning [2].
Recommendation systems, such as those used by Netflix or Amazon, are also a major application [2, 8].
Fraud detection is another key area [2]. Machine learning is used to determine whether a credit card charge is fraudulent in real time [2].
Machine learning is also used in the self-driving car industry to classify objects a car might encounter [7].
Cloud computing service providers like IBM and Amazon use machine learning to protect their services and prevent attacks [7].
Machine learning can be used to find trends and patterns in stock data [7].
Machine learning is used to help identify cancer using X-ray scans [7].
Machine learning is used in healthcare to predict whether a human cell is benign or malignant [8].
Machine learning can help determine proper medicine for patients [8].
Banks use machine learning to make decisions on loan applications and for customer segmentation [8].
Websites such as Youtube, Amazon, or Netflix use machine learning to develop recommendations for their customers [8].
How Data Scientists Use Machine Learning
Data scientists use machine learning algorithms to derive insights from data [2]. They use machine learning for predictive analytics, recommendations, and fraud detection [2]. Data scientists also use machine learning for the following tasks:
Data preparation: Machine learning models benefit from the standardization of data, and data scientists use machine learning to address outliers or different scales in data sets [4].
Model building: Machine learning is used to build models that can analyze data and make intelligent decisions [1, 3].
Model evaluation: Data scientists need to evaluate the performance of the trained models [9].
Model deployment: Data scientists deploy models to make them available to applications [10, 11].
Data augmentation: Generative AI, a subset of machine learning, is used to augment data sets when there is not enough real data [1, 5, 12].
Code generation: Generative AI can help data scientists generate software code for building analytic models [1, 5, 12].
Data exploration: Generative AI tools can explore data, uncover patterns and insights, and assist with data visualization [1, 5].
Machine Learning Techniques
Several techniques are commonly used in machine learning [4, 13]:
Regression is a technique for predicting a continuous value, such as the price of a house [13] (a short R sketch follows this list).
Classification is a technique for predicting the class or category of a case [13].
Clustering is a technique that groups similar cases [4, 13].
Association is a technique for finding items that co-occur [13].
Anomaly detection is used to find unusual cases [13].
Sequence mining is used for predicting the next event [13].
Dimension reduction is used to reduce the size of data [13].
Recommendation systems associate people’s preferences with others who have similar tastes [13].
Support Vector Machines (SVM) are used for classification by finding a separator [14]. SVMs map data to a higher dimensional feature space so data points can be categorized [14].
Linear and Polynomial Models are used for regression [4, 15].
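To make a couple of these techniques concrete, here is a minimal R sketch running regression and clustering on the built-in mtcars and iris data sets; the data choices are illustrative assumptions, not from the source.

```r
# Regression: predict a continuous value (fuel efficiency from car weight)
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)                                  # coefficients and R-squared
predict(fit, newdata = data.frame(wt = 3.0))  # predicted mpg for a 3,000 lb car

# Clustering: group similar cases without using labels (unsupervised)
set.seed(42)                                  # reproducible cluster assignments
clusters <- kmeans(iris[, 1:4], centers = 3)
table(clusters$cluster, iris$Species)         # compare clusters to known species
```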
Tools and Libraries
Machine learning models are implemented using popular frameworks such as TensorFlow, PyTorch, and Keras [6]. These frameworks provide a Python API and also support other languages such as C++ and JavaScript [6]. Scikit-learn is a free machine learning library for the Python programming language that contains many classification, regression, and clustering algorithms [4].
The field of machine learning is constantly evolving, and data scientists are always learning about new techniques, algorithms and tools [16].
Generative AI: Applications and Challenges
Generative AI is a subset of artificial intelligence that focuses on producing new data rather than just analyzing existing data [1, 2]. It allows machines to create content, including images, music, language, computer code, and more, mimicking creations by people [1, 2].
How Generative AI Operates
Generative AI uses deep learning models like Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) [1, 2]. These models learn patterns from large volumes of data and create new instances that replicate the underlying distributions of the original data [1, 2].
Applications of Generative AI
Generative AI has a wide array of applications [1, 2]:
Natural Language Processing (NLP), such as OpenAI’s GPT-3, can generate human-like text, which is useful for content creation and chatbots [1, 2].
In healthcare, generative AI can synthesize medical images, aiding in the training of medical professionals [1, 2].
Generative AI can create unique and visually stunning artworks and generate endless creative visual compositions [1, 2].
Game developers use generative AI to generate realistic environments, characters, and game levels [1, 2].
In fashion, generative AI can design new styles and create personalized shopping recommendations [1, 2].
Generative AI can also be used for data augmentation by creating synthetic data with similar properties to real data [1, 2]. This is useful when there isn’t enough real data to train or test a model [1, 2].
Generative AI can be used to generate and test software code for constructing analytic models, which has the potential to revolutionize the field of analytics [2].
Generative AI can generate business insights and reports, and autonomously explore data to uncover hidden patterns and enhance decision-making [2].
Types of Generative AI Models
There are four common types of generative AI models [3]:
Generative Adversarial Networks (GANs) are known for their ability to create realistic and diverse data. They are versatile in generating complex data across multiple modalities like images, videos, and music. GANs are good at generating new images, editing existing ones, enhancing image quality, generating music, producing creative text, and augmenting data [3]. A notable example of a GAN architecture is StyleGAN, which is specifically designed for high-fidelity images of faces with diverse styles and attributes [3].
Variational Autoencoders (VAEs) discover the underlying patterns that govern data organization. They are good at uncovering the structure of data and can generate new samples that adhere to inherent patterns. VAEs are efficient, scalable, and good at anomaly detection. They can also compress data, perform collaborative filtering, and transform the style of one image into another [3]. An example of a VAE is VAEGAN, a hybrid model combining VAEs and GANs [3].
Autoregressive models are useful for handling sequential data like text and time series. They generate data one element at a time and are good at generating coherent text, converting text into natural-sounding speech, forecasting time series, and translating languages [3]. A prominent example of an autoregressive model is Generative Pre-trained Transformer (GPT), which can generate human-quality text, translate languages, and produce creative content [3].
Flow-based models are used to model the probability distribution of data, which allows for efficient sampling and generation. They are good at generating high-quality images and simulating synthetic data. Data scientists use flow-based models for anomaly detection and for estimating probability density functions [3]. An example of a flow-based model is RealNVP, which generates high-quality images of human faces [3].
Generative AI in the Data Science Life Cycle
Generative AI is a transformative force in the data science life cycle, providing data scientists with tools to analyze data, uncover insights, and develop solutions [4]. The data science lifecycle consists of five phases [4]:
Problem definition and business understanding: Generative AI can help generate new ideas and solutions, simulate customer profiles to understand needs, and simulate market trends to assess opportunities and risks [4].
Data acquisition and preparation: Generative AI can fill in missing values in data sets, augment data by generating synthetic data, and detect anomalies [4].
Model development and training: Generative AI can perform feature engineering, explore hyperparameter combinations, and generate explanations of complex model predictions [4].
Model evaluation and refinement: Generative AI can generate adversarial or edge cases to test model robustness and can train a generative model to mimic model uncertainty [4].
Model deployment and monitoring: Generative AI can continuously monitor data, provide personalized experiences, and perform A/B testing to optimize performance [4].
Generative AI for Data Preparation and Querying
Generative AI models are used for data preparation and querying tasks by:
Imputing missing values: VAEs can learn intricate patterns within the data and generate plausible values [5].
Detecting outliers: GANs can learn the boundaries of standard data distributions and identify outliers [5].
Reducing noise: Autoencoders can capture core information in data while discarding noise [5].
Data Translation: Neural machine translation (NMT) models can accurately translate text from one language to another, and can also perform text-to-speech and image-to-text translations [5].
Natural Language Querying: Large language models (LLMs) can interpret natural language queries and translate them into SQL statements [5].
Query Recommendations: Recurrent neural networks (RNNs) can capture the temporal relationship between queries, enabling them to predict the next query based on a user’s current query [5].
Query Optimization: Graph neural networks (GNNs) can represent data as a graph to understand connections between entities and identify the most efficient query execution plans [5].
Generative AI in Exploratory Data Analysis
Generative AI can also assist with exploratory data analysis (EDA) by [6]:
Generating descriptive statistics for numerical and categorical data.
Generating synthetic data to understand the distribution of a particular variable.
Modeling the joint distribution of two variables to reveal their potential correlation.
Reducing the dimensionality of data while preserving relationships between variables.
Enhancing feature engineering by generating new features that capture the structure of the data.
Identifying potential patterns and relationships in the data.
Generative AI for Model Development
Generative AI can be used for model development by [6]:
Helping select the most appropriate model architecture.
Assessing the importance of different features.
Creating ensemble models by generating diverse representations of data.
Interpreting the predictions made by a model by generating representatives of the data.
Improving a model’s generalization ability and preventing overfitting.
Tools for Model Development
Several generative AI tools are used for model development [7]:
DataRobot is an AI platform that automates the building, deployment, and management of machine learning models [7].
AutoGluon is an open-source automated machine learning library that simplifies the development and deployment of machine learning models [7].
H2O Driverless AI is a cloud-based automated machine learning platform that supports automatic model building, deployment, and monitoring [7].
Amazon SageMaker Autopilot is a managed service that automates the process of building, training, and deploying machine learning models [7].
Google Vertex AI is a fully managed cloud-based machine learning platform [7].
ChatGPT and Google Bard can be used for AI-powered script generation to streamline the model building process [7].
Considerations and Challenges
When using generative AI, there are several factors to consider, including data quality, model selection, and ethical implications [6, 8]:
The quality of training data is critical; bias in training data can lead to biased results [8].
The choice of model and training parameters determines how explainable the model output is [8].
There are ethical implications to consider, such as ensuring the models are used responsibly and do not contribute to malicious activities [8].
The lack of high-quality labeled data, the difficulty of interpreting models, the computational expense of training large models, and the lack of standardization are technical challenges in using generative AI [9].
There are also organizational challenges, including copyright and intellectual property issues, the need for specialized skills, integrating models into existing systems, and measuring return on investment [9].
Cultural challenges include risk aversion, data sharing concerns, and issues related to trust and transparency [9].
In summary, generative AI is a powerful tool with a wide range of applications across various industries. It is used for data augmentation, data preparation, data querying, model development, and exploratory data analysis. However, it is important to be aware of the challenges and ethical considerations when using generative AI to ensure its responsible deployment.
Data Science Full Course – Complete Data Science Course | Data Science Full Course For Beginners IBM
These documents function as a tutorial on data science in R, covering topics from the fundamentals of the R environment and data manipulation to advanced concepts in machine learning. The material explores visualizing data using various plotting techniques, including base graphics, ggplot2, and box plots, to gain insights into data distribution and relationships. Furthermore, it introduces regression models, specifically linear and logistic regression, explaining their mathematical basis and practical application in R for prediction. Finally, the sources discuss clustering algorithms, like hierarchical and k-means clustering, for grouping similar data points and touch upon time series analysis for understanding data trends over time, all while highlighting the essential skills and job roles within the data science field that often utilize R.
Essential Data Science Skills and R Applications
R for Data Science Fundamentals
Based on the provided source, here is a discussion of R for data science:
R as a Programming Language for Data Science
R is described as a widely used programming language for data science. It is considered more than just a programming language; it is also a tool for performing analytics on data. R is an open-source and free software environment for statistical computing and graphics. It supports most machine learning algorithms for data analytics, such as regression, association, clustering, and more. While Python is currently the main programming language in data science, R is powerful for quickly displaying and exploring data. Becoming proficient in R analytics makes transferring those skills to another language fairly easy, although R doesn't offer the same breadth of general-purpose libraries as Python.
Key Features and Advantages of R
Several advantages of using R are highlighted:
Open Source: R is completely free and open source with active community members.
Extensible: It offers various statistical and graphical techniques.
Compatible: R is compatible across all platforms, including Linux, Windows, and Mac. Its compatibility is continually growing, integrating with systems like cluster computing and Python.
Extensive Library: R has an extensive library of packages for machine learning and data analysis. The Comprehensive R Archive Network (CRAN) hosts around 10,000 packages focused on data analytics.
Easy Integration: R can be easily integrated with popular software like Tableau, SQL Server, and others.
Diversity and Ease of Use: The diverse capabilities and extensive libraries make R a versatile, easy-to-use tool for analyzing data: it is quick to apply different functions to a data set, explore the results, and iterate.
R Environment: RStudio
RStudio is presented as a popular Integrated Development Environment (IDE) for R. It conveniently organizes the workspace into multiple panes: the console (the main workspace) typically sits on the left, with environmental information and plots on the right. You can also write a script file in the upper-left panel and execute it, with the results appearing in the console on the bottom left.
R Packages
Packages are essential in R as they provide pre-assembled collections of functions and objects. Each package is hosted on the CRAN repository. Not all packages are loaded by default, but they can be installed on demand using install.packages() and accessed using the library() function. Installing only necessary packages saves space.
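For instance, installing and loading a package (dplyr here, purely as an example) takes two calls:

```r
install.packages("dplyr")  # one-time download from CRAN
library(dplyr)             # load the package into the current session
```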
Key packages mentioned for data science include:
dplyr: Used to transform and summarize tabular data. It's described as much faster and easier to read than base R. Functions include grouping data (group_by), summarizing (summarize), adding new variables (mutate), selecting columns (select), filtering rows (filter), sorting (arrange), and sampling (sample_n(), sample_frac()).
tidyr: Makes it easy to “tidy” data. It includes functions like gather (stacks multiple columns into a single column), spread (spreads single rows into multiple columns), separate (splits a single column into multiple), and unite (combines multiple columns). It’s also used for handling missing values, such as filling them.
ggplot2: Implements the grammar of graphics. It’s a powerful and flexible tool for creating sophisticated visualizations with little code. It’s part of the tidyverse ecosystem. You can build graphs by providing components like data, aesthetics (x, y axes), and geometric objects (geom). It uses sensible defaults if details aren’t provided. Different geom types are used for different graphs, e.g., geom_bar for bar charts, geom_point for scatter plots, geom_boxplot for box plots. You can customize elements like colors and sizes.
rpart: Used for partitioning data and creating decision trees.
rpart.plot: Helps in plotting decision trees created by rpart.
FSelector: Computes measures like Chi-squared, information gain, and entropy used in decision tree algorithms.
caret: A package for splitting data into training and test sets, used in machine learning workflows.
randomForest: The package for implementing the random forest algorithm.
e1071: A library containing support vector machine (SVM) functions.
DMwR: Contains the regr.eval function to compute error metrics like MAE, MSE, RMSE, and MAPE for regression models.
plotrix: Used for creating 3D pie charts.
caTools: Includes the sample.split function used for splitting data sets into training and test sets.
xlsx: Used to import data from Microsoft Excel spreadsheets.
elements.learn: Mentioned as a standard R library.
MASS: A package containing data sets like the UScereal data frame used for examples.
plotly: Provides the plot_ly() function for creating interactive web-based graphs via a JavaScript graphing library.
Data Structures in R
R supports various data structures, including vectors (the most basic), matrices, arrays, data frames, and lists. Vectors can contain numerous different values. Data frames are tabular data with rows and columns.
Data Import and Export
R can import data from various sources, including Excel, Minitab, CSV, table, and text files. Common functions for importing include read.table() for table files and read.csv() for CSV files, often specifying if the file has a header. Even if a file is saved as CSV, it might be separated by spaces or tabs, requiring adjustments in the read function. Exporting data is also straightforward using functions like write.table() or write.csv(). The xlsx package allows importing directly from .xlsx files.
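A short sketch of these import/export calls; the file names below are hypothetical placeholders:

```r
# read.csv assumes comma separators; read.table lets you override the
# separator when a "CSV" is actually space- or tab-delimited
sales <- read.csv("sales.csv", header = TRUE)
logs  <- read.table("logs.txt", header = TRUE, sep = "\t")

# Import directly from an Excel workbook via the xlsx package
library(xlsx)
budget <- read.xlsx("budget.xlsx", sheetIndex = 1)

# Export a data frame back out
write.csv(sales, "sales_clean.csv", row.names = FALSE)
```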
Data Wrangling/Manipulation
Data wrangling is the process of transforming raw data into an appropriate format for analytics; it involves cleaning, structuring, and enriching data. This is often considered the least favorite but most time-consuming aspect of data science. The dplyr and tidyr packages are specifically designed for data manipulation and tidying. dplyr functions like filter for filtering data, select for choosing specific columns, mutate for adding new variables, and arrange for sorting are key for data transformation. Tidyr functions like gather, spread, separate, and unite help restructure data. Handling missing values, such as using functions from tidyr to fill NA values, is part of data wrangling.
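A minimal wrangling sketch combining the dplyr and tidyr functions named above, run on the built-in mtcars data and a small toy data frame (both illustrative assumptions):

```r
library(dplyr)
library(tidyr)

# dplyr: filter rows, select columns, add a variable, sort
mtcars %>%
  filter(cyl == 4) %>%                   # keep only 4-cylinder cars
  select(mpg, wt, hp) %>%                # keep three columns
  mutate(power_to_weight = hp / wt) %>%  # add a new variable
  arrange(desc(mpg))                     # sort by mpg, highest first

# tidyr: stack wide columns into long form, then fill a missing value
scores <- data.frame(student = c("A", "B"),
                     math = c(90, NA),
                     science = c(85, 88))
scores %>%
  gather(subject, score, math, science) %>%  # columns -> key/value pairs
  fill(score)                                # fill NA from the row above
```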
Data Visualization
Data visualization in R is very powerful and quick. Visualizing data helps in understanding patterns. There are two types: exploratory (to understand the data yourself) and explanatory (to share understanding with others). R provides tools for both.
Types of graphics/systems in R:
Base graphics: Easiest to learn, used for simple plots like scatter plots using the plot() function.
Grid graphics: Powerful modules for building other tools.
Lattice graphics: General purpose system based on grid graphics.
ggplot2: Implements grammar of graphics, based on grid graphics. It’s a method of thinking about complex graphs in logical subunits.
Plot types supported in R include:
Bar chart (barplot(), geom_bar)
Pie chart (pie(), pie3D() from plotrix)
Histogram (hist(), geom_histogram)
Kernel density plots
Line chart
Box plot (boxplot(), geom_boxplot). These display data distribution based on minimum, quartiles, median, and maximum, and can show outliers. Box plots grouped by time periods can explore seasonality.
Heat map
Word cloud
Scatter plot (plot(), geom_point). These graph values of two variables (one on x, one on y) to assess their relationship.
Pairs plots (pairs()).
Visualizations can be viewed on screen or saved in various formats (pdf, png, jpeg, wmf, ps). They can also be copied and pasted into documents like Word or PowerPoint. Interactive plots can be created using the plot_ly library.
Machine Learning Algorithms in R
R supports various machine learning algorithms. The process often involves importing data, exploring/visualizing it, splitting it into training and test sets, applying the algorithm to the training data to build a model, predicting on the test data, and validating the model’s performance.
Linear Regression: A statistical analysis that attempts to show the linear relationship between two continuous variables. It creates a predictive model on data showing trends, often using the least squares method. In R, the lm() function is used to create a linear regression model. It is used to predict a number (continuous variable). Examples include predicting rent based on area or revenue based on traffic sources (paid, organic, social). Model validation can use metrics like RMSE (Root Mean Squared Error), calculated as the square root of the mean of the squared differences between predicted and actual values. The regr.eval function in the DMwR package provides multiple error metrics.
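A compact version of that workflow on the built-in cars data set (speed vs. stopping distance, an illustrative stand-in), with RMSE computed by hand:

```r
set.seed(1)
idx   <- sample(nrow(cars), 0.7 * nrow(cars))  # 70/30 train/test split
train <- cars[idx, ]
test  <- cars[-idx, ]

model <- lm(dist ~ speed, data = train)  # least-squares linear model
pred  <- predict(model, newdata = test)

sqrt(mean((pred - test$dist)^2))  # RMSE: root of mean squared differences
# DMwR's regr.eval(test$dist, pred) would report MAE, MSE, RMSE, and MAPE
```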
Logistic Regression: A classification algorithm used when the dependent variable is categorical (e.g., yes/no, true/false). It uses a sigmoid function to model the probability of belonging to a class. A threshold (usually 50%) is used to classify outcomes based on the predicted probability. The college admission problem (predicting admission based on GPA and rank) is presented as a use case.
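A sketch of the admission use case with glm; the admissions.csv file and its columns (admit flag, gpa, rank) are assumed for illustration:

```r
admissions <- read.csv("admissions.csv")     # columns: admit (0/1), gpa, rank
admissions$rank <- factor(admissions$rank)   # treat rank as categorical

model <- glm(admit ~ gpa + rank,
             data = admissions,
             family = binomial)              # logistic (sigmoid) regression

prob <- predict(model, type = "response")    # probabilities between 0 and 1
pred <- ifelse(prob > 0.5, 1, 0)             # apply the 50% threshold
table(predicted = pred, actual = admissions$admit)  # confusion matrix
```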
Decision Trees: A classification algorithm that splits data into nodes based on criteria like information gain (using algorithms like ID3). It has a root node, branch nodes, and leaf nodes (outcomes). R packages like rpart, rpart.plot, and FSelector are used. The process involves loading libraries, setting a working directory, importing data (potentially from Excel using xlsx), selecting relevant columns, splitting the data, creating the tree model using rpart, and visualizing it using rpart.plot. Accuracy can be evaluated using a confusion matrix. The survival prediction use case (survived/died on a ship based on features like sex, class, age) is discussed.
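A sketch of the survival use case with rpart and rpart.plot; the passengers.csv file and its column names are illustrative assumptions:

```r
library(rpart)
library(rpart.plot)

passengers <- read.csv("passengers.csv")  # assumed: survived, sex, pclass, age
passengers$survived <- factor(passengers$survived)

tree <- rpart(survived ~ sex + pclass + age,
              data = passengers,
              method = "class")           # classification tree
rpart.plot(tree)                          # visualize root, branches, leaves

pred <- predict(tree, passengers, type = "class")
table(predicted = pred, actual = passengers$survived)  # accuracy check
```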
Random Forest: An ensemble method that builds multiple decision trees (a “forest”) and combines their outputs. It can be used for both classification and regression. Packages like randomForest are used in R. Steps include loading data, converting categorical variables to factors, splitting data, training the model with randomForest, plotting error rate vs. number of trees, and evaluating performance (e.g., confusion matrix). The wine quality prediction use case is used as an example.
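A sketch of the wine-quality use case with randomForest; the file name and columns are assumed:

```r
library(randomForest)

wine <- read.csv("winequality.csv")      # assumed file with a quality column
wine$quality <- as.factor(wine$quality)  # factor target => classification

set.seed(123)
idx   <- sample(nrow(wine), 0.7 * nrow(wine))
train <- wine[idx, ]
test  <- wine[-idx, ]

rf <- randomForest(quality ~ ., data = train, ntree = 300)
plot(rf)                                 # error rate vs. number of trees
table(predicted = predict(rf, test), actual = test$quality)
```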
Support Vector Machines (SVM): A classification algorithm used for separating data points into classes. The e1071 package in R contains SVM functions. This involves reading data, creating indicator variables for classes (e.g., -1 and 1), creating a data frame, plotting the data, and running the svm model. The horse/mule classification problem is a use case.
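A toy version of the horse/mule classification with e1071; the six data points are invented purely to show the calls:

```r
library(e1071)

# Invented sample: height (m), weight (kg), and a -1/1 class label
animals <- data.frame(
  height = c(1.4, 1.5, 1.6, 1.1, 1.2, 1.3),
  weight = c(400, 450, 500, 250, 280, 300),
  type   = factor(c(1, 1, 1, -1, -1, -1))  # 1 = horse, -1 = mule
)
plot(animals$height, animals$weight, col = animals$type)  # inspect separability

model <- svm(type ~ height + weight, data = animals)
predict(model, data.frame(height = 1.45, weight = 420))   # classify a new animal
```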
Clustering: Techniques used to group data points based on similarity. The process can involve importing data, creating scatter plots (pairs) to visualize potential clusters, normalizing the data so metrics aren’t biased by scale, calculating distances between data points (like Euclidean distance), and creating a dendrogram to visualize the clusters. The use case of clustering US states based on oil sales is provided.
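A sketch of that clustering pipeline on the built-in USArrests data (a stand-in for the oil-sales data in the source):

```r
pairs(USArrests)                          # scatter-plot matrix to eyeball clusters

scaled <- scale(USArrests)                # normalize so no column dominates by scale
d  <- dist(scaled, method = "euclidean")  # pairwise Euclidean distances
hc <- hclust(d)                           # agglomerative hierarchical clustering

plot(hc, labels = rownames(USArrests))    # dendrogram of the state clusters
```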
Time Series Analysis: Analyzing data collected over time to identify patterns, seasonality, trends, and more. This involves loading time-stamped data (like electricity consumption or wind/solar power production), creating data frames, using the date column as an index, visualizing the data (line plots, plots of log differences, rolling averages), exploring seasonality using box plots grouped by time periods (e.g., months), and handling missing values.
R in Data Science Skills and Roles
R is listed as an essential programming tool for performing analytics in data science. A data science engineer should have programming experience in R (or Python). While proficiency in one language is helpful, having a solid foundation in R and being well-rounded in another language (like Python, Java, or C++) for general programming is recommended. Data scientists and data engineers often require knowledge of R, among other languages. The role of a data scientist includes performing predictive analysis and identifying trends and patterns. Data analytics managers also need to possess specialized knowledge, which might include R. The job market for data science is growing, and R is a relevant skill for various roles. Knowing R is beneficial for quick data display and basic exploration even if you primarily work in other tools like Python or Hadoop/Spark.
Data Visualization Techniques in R
Data visualization is a core aspect of data science that involves the study and creation of visual representations of data. Its primary purpose is to leverage our highly developed ability to see patterns, enabling us to understand data better. By using graphical displays such as statistical graphs, plots, and information graphics, data visualization helps to communicate information clearly and effectively. For data scientists, being able to visualize models is very important for troubleshooting and understanding complex models. Mastering this skill is considered essential for a data scientist, as a picture is often worth a thousand words when communicating findings.
The sources describe two main types of data visualization:
Exploratory data visualization helps us to understand the data itself. The key is to keep all potentially relevant details together, and the objective is to help you see what is in your data and how much detail can be interpreted. This can involve plotting data before exploring it to get an idea of what to look for.
Explanatory visualization helps us to share our understanding with others. This requires making editorial decisions about which features to highlight for emphasis and which might be distracting or confusing to eliminate.
R is a widely used programming language for data science that includes powerful packages for data visualization. Various tools and packages are available in R to create data visualizations for both exploratory and explanatory analysis. These include:
Base graphics: This is the easiest type of graphics to learn in R. It can be used to generate simple plots, such as scatter plots.
Grid graphics: This is a powerful set of modules for building other tools. It has a steeper learning curve than base graphics but offers more power. Plots can be built from primitives using functions like pushViewport() and grid.rect().
Lattice graphics: This is a general-purpose system based on grid graphics.
ggplot2: This package implements the “grammar of graphics” and is based on grid graphics. It is part of the tidyverse ecosystem. ggplot2 enables users to create sophisticated visualizations with relatively little code using a method of thinking about and decomposing complex graphs into logical subunits. It requires installation and loading the library. Functions within ggplot2 often start with geom_, such as geom_bar for bar charts, geom_point for scatter plots, geom_boxplot for box plots, and geom_line for line charts.
plotly: This library creates interactive web-based graphs via an open-source JavaScript graphing library. It also requires installation and loading the library.
plotrix: This is a package that can be used to create 3D pie charts.
R supports various types of graphics. Some widely used types of plots and graphs mentioned include:
Bar charts: Used to show comparisons across discrete categories. Rectangular bars represent the data, with the height proportional to the measured values. Stacked bar charts and dodged bar charts are also possible.
Pie charts: Used to display proportions, such as for different products and units sold.
Histograms: Used to look at the distribution and frequency of a single variable. They help in understanding the central tendency of the data. Data can be categorized into bins.
Kernel density plots.
Line charts: Used to show trends over time or sequences.
Box plots (also known as whisker diagrams): Display the distribution of data based on the five-number summary: minimum, first quartile, median, third quartile, and maximum. They are useful for exploring data with little work and can show outliers as single dots. Box plots can also be used to explore the seasonality of data by grouping data by time periods like year or month.
Heat maps.
Word clouds.
Scatter plots: Use points to graph the values of two different variables, one on the x-axis and one on the y-axis. They are mainly used to assess the relationship or lack of relationship between two variables. Scatter plots can be created using functions like plot or geom_point in ggplot2.
Dendrograms: A tree-like structure used to represent hierarchical clustering results.
Plots can be viewed on screen, saved in various formats (including pdf, png, jpeg, wmf, and ps), and customized according to specific graphic needs. They can also be copied and pasted into other files like Word or PowerPoint.
Specific examples of R plotting functions mentioned in the source include (a condensed sketch follows this list):
Using the basic plot function with x and y values.
Using the boxplot function by providing the data.
Importing data and then graphing it using the plot function.
Using plot to summarize the relationship between variables in a data frame.
Creating a simple scatter plot using plot with xlab, ylab, and main arguments for labels and title.
Creating a simple pie chart using the pie function with data and labels.
Creating a histogram using the hist function with options for x-axis label, color, border, and limits.
Using plot to draw a scatter plot between specific columns of a data frame, such as ozone and wind from the airquality data set. Labels and titles can be added using xlab, ylab, and main.
Creating multiple box plots from a data frame.
Using ggplot with aesthetics (aes) to map variables to x and y axes, and then adding a geometry layer like geom_boxplot to create a box plot grouped by a categorical variable like cylinders. The coordinates can be flipped using coord_flip.
Creating scatter plots using ggplot with geom_point, and customizing color or size based on variables or factors.
Creating bar charts using ggplot with geom_bar and specifying the aesthetic for the x-axis. Stacked bar charts can be created using the fill aesthetic.
Using plotly to create plots, specifying data, x/y axes, and marker details.
Plotting predicted versus actual values after training a model.
Visualizing the relationship between predictor and response variables using a scatterplot, for example, speed and distance from the cars data set.
Visualizing a decision tree using rpart.plot after creating the tree with the rpart package.
Visualizing 2D decision boundaries for a classification dataset.
Plotting hierarchical clustering dendrograms using hclust and plot, and adding labels.
Analyzing time series data by creating line plots of consumption over time, customizing axis labels, limits, colors, and adding titles. Log values and differences of logs can also be plotted. Multiple plots can be displayed in a single window using the par function. Time series data can be narrowed down to a single year or shorter period for closer examination. Grid lines (horizontal and vertical) can be added to plots to aid interpretation, for example, showing consumption peaks during weekdays and drops on weekends. Box plots can be used to explore time series seasonality by grouping data by year or month. Legends can be added to plots using the legend function.
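A condensed sketch of several of the calls described above, using the built-in airquality and mtcars data sets as stand-ins:

```r
# Base graphics: scatter plot with axis labels and a title
plot(airquality$Ozone, airquality$Wind,
     xlab = "Ozone", ylab = "Wind", main = "Ozone vs. Wind")

# Histogram with a label, color, border, and x-axis limits
hist(airquality$Temp, xlab = "Temperature",
     col = "skyblue", border = "black", xlim = c(50, 100))

# Simple pie chart with data and labels
pie(c(40, 35, 25), labels = c("Paid", "Organic", "Social"))

# ggplot2: box plot grouped by cylinders, with flipped coordinates
library(ggplot2)
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  coord_flip()

# ggplot2: scatter plot with color mapped to a factor
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 2)
```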
Overall, the sources emphasize that data visualization is a critical skill for data scientists, enabling them to explore, understand, and effectively communicate insights from data using a variety of graphical tools and techniques available in languages like R.
Key Machine Learning Algorithms for Data Science
Based on the sources, machine learning algorithms are fundamental techniques used in data science to enable computers to predict outcomes without being explicitly programmed. These algorithms are applied to data to identify patterns and build predictive models.
A standard process when working with machine learning algorithms involves preparing the data, often including splitting it into training and testing datasets. The model is trained using the training data, and then its performance is evaluated by running the test data through the model. Validating the model is crucial to see how well it performs on unseen data. Metrics like accuracy, RMSE (Root Mean Squared Error), MAE (Mean Absolute Error), MSE (Mean Squared Error), and MAPE are used for validation. Being able to visualize models and troubleshoot their code is also very important for data scientists. Knowledge of these techniques is useful for various data science job roles.
The sources discuss several specific machine learning algorithms and related techniques:
Linear Regression: This is a type of statistical analysis and machine learning algorithm primarily used for predicting continuous variables. It attempts to show the relationship between two variables, specifically modeling the relationship between a dependent variable (y) and an independent variable (x). When there is a linear relationship between a continuous dependent variable and a continuous or discrete independent variable, linear regression is used. The model is often found using the least squares method, which is the most commonly used approach. Examples include predicting revenue based on website traffic or predicting rent based on area. In R, the lm function is used to generate a linear model.
Logistic Regression: Despite its name, logistic regression is a classification algorithm, not a continuous variable prediction algorithm. It is used when the response variable has only two outcomes (yes/no, true/false), making it a binary classifier. Instead of a straight line like linear regression, it uses a sigmoid function (sigmoid curve) as the line of best fit to model the probability of an outcome, which is always between zero and one. Applications include predicting whether a startup will be profitable, whether trees will get infested with bugs, or whether a student will be admitted to college based on GPA and rank. In R, the glm (generalized linear model) function with the family=binomial argument is used for logistic regression.
Decision Trees: This is a tree-shaped algorithm used to determine a course of action and can solve both classification and regression problems. Each branch represents a possible decision, occurrence, or reaction. An internal node in the tree is a test that splits objects into different categories. The top node is the root node, and the final answers are represented by leaf nodes or terminal nodes. Key concepts include entropy, which measures the messiness or randomness of data, and information gain, which is used to calculate the tree splits. The ID3 algorithm is a common method for calculating decision trees. R packages like rpart and rpart.plot are used to create and visualize decision trees. Examples include predicting survival or classifying flower types.
Random Forests: This is an ensemble machine learning algorithm that operates by building multiple decision trees. It can be used for both classification and regression problems. For classification, the final output is the class chosen by the majority of its decision trees; for regression, it is the average of the individual trees' predictions. Random forests have various applications, including predicting fraudulent customers, diagnosing diseases, e-commerce recommendations, stock market trends, and weather prediction. Predicting the quality of wine is given as a use case. R packages like randomForest are used.
k-Nearest Neighbors (KNN): This is a machine learning technique mentioned as useful for certain job roles. It is described as grouping things together that look alike.
Naive Bayes: Mentioned as one of the diverse machine learning techniques that can be applied.
Time Series Analysis: While not a single algorithm, this involves techniques used for analyzing data measured at different points in time. Techniques include creating line plots to show trends over time, examining log values and differences of logs, and using box plots to explore seasonality by grouping data by time periods.
Clustering: This technique involves grouping data points together. It is useful for tasks like customer segmentation or social network analysis. Two main types are hierarchical clustering and partitional clustering. Hierarchical clustering can be agglomerative (merging points into larger clusters) or divisive (splitting a whole into smaller clusters). It is often represented using a dendrogram, a tree-like structure showing the hierarchy of clusters. Partitional clustering algorithms like k-means are also common. Calculating distances between points (like Euclidean or Manhattan distance) is a key step. Normalization of data is important for clustering to prevent bias from different scales. A use case is clustering US states based on oil sales.
Support Vector Machine (SVM): SVM is a machine learning algorithm primarily used for binary classification. It works by finding a decision boundary (a line in 2D, a plane in 3D, or a hyperplane in higher dimensions) that best separates the data points of two classes. The goal is to maximize the margin, which is the distance between the decision boundary and the nearest points from each class (called support vectors). If data is linearly separable, a linear SVM can be used. For data that is not linearly separable, kernel SVM uses kernel functions (like Gaussian RBF, sigmoid, or polynomial) to transform the data into a higher dimensional space where a linear separation becomes possible. Use cases include classifying cricket players as batsmen or bowlers or classifying horses and mules based on height and weight. Other applications include face detection, text categorization, image classification, and bioinformatics. The e1071 library in R provides SVM functions.
Overall, the sources highlight that a strong understanding of these algorithms and the ability to apply them, often using languages like R, is essential for data scientists.
Time Series Analysis: Concepts, Techniques, and Visualization
Based on the sources, time series analysis is a data science technique used to analyze data where values are measured at different points in time. It is listed among the widely used data science algorithms. The goal of time series analysis is to analyze and visualize this data to find important information or gather insights.
Time series data is typically uniformly spaced at a specific frequency, such as hourly weather measurements, daily website visit counts, or monthly sales totals. However, it can also be irregularly spaced and sporadic, like time-stamped data in computer system event logs or emergency call history.
A process for working with time series data involves using techniques such as time-based indexing, resampling, and rolling windows. Key steps include wrangling or cleaning the data, creating data frames, converting the date column to a date-time format, and extracting time components like year, month, and day. It's also important to look at summary statistics for columns and to check for and potentially handle missing values (NA), for example, by using forward fill. Accessing specific rows by date or index is also possible. The R programming language, often within the RStudio IDE, is used for this analysis. Packages like dplyr are helpful for data wrangling tasks like arranging, grouping, mutating, filtering, and selecting data.
Visualization is a crucial part of time series analysis, helping to understand patterns, seasonality, and trends. Various plotting methods and packages in R are used:
Line plots can show the full time series.
The base R plot function allows for customizing the x and y axes, line type, width, color, and limits, and for adding titles. Using log values and differences of logs can sometimes reveal clearer patterns.
It's possible to display multiple plots in a single window using functions like par().
You can zoom into specific time periods, like plotting data for a single year or a few months, to investigate patterns at finer granularity. Adding grids and vertical or horizontal lines can help dissect the data.
Box plots are particularly useful for exploring seasonality by grouping data by different time periods (yearly, monthly, or daily). They provide a visual display of the five-number summary (minimum, first quartile, median, third quartile, and maximum) and can show outliers.
Other visualization types like scatter plots, heat maps, and histograms can also be used for time series data.
Packages like ggplot2 and plotly are also available for creating sophisticated visualizations, although the base plot function was noted for choosing good tick locations for time series axes. Legends can be added to plots to identify different series.
Analyzing time series data helps identify key characteristics:
Seasonality: Patterns that repeat at regular intervals, such as yearly, monthly, or weekly oscillations. Box plots grouped by year or month clearly show this seasonality. Weekly oscillations in consumption are also evident when zooming in.
Trends: Slow, gradual variability in the data over time, in addition to higher-frequency variations. Rolling means (or rolling averages) are a technique used to visualize these trends by smoothing out higher-frequency variations and seasonality over a defined window size (e.g., a 7-day or 365-day rolling mean). A 7-day rolling mean smooths weekly seasonality but keeps yearly seasonality, while a 365-day rolling mean shows the long-term trend. The zoo package in R is used for calculating rolling means.
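A minimal rolling-mean sketch with the zoo package; the consumption.csv file and its columns are illustrative assumptions:

```r
library(zoo)

daily <- read.csv("consumption.csv")  # assumed: date and consumption columns
daily$date <- as.Date(daily$date)

# 7-day mean smooths weekly oscillations; 365-day mean shows long-term trend
daily$roll7   <- rollmean(daily$consumption, k = 7,   fill = NA)
daily$roll365 <- rollmean(daily$consumption, k = 365, fill = NA)

plot(daily$date, daily$consumption, type = "l", col = "grey",
     xlab = "Date", ylab = "Consumption")
lines(daily$date, daily$roll7,   col = "blue")
lines(daily$date, daily$roll365, col = "red")
legend("topright", legend = c("daily", "7-day mean", "365-day mean"),
       col = c("grey", "blue", "red"), lty = 1)
```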
Using an electricity consumption and production dataset as an example, time series analysis revealed:
Electricity consumption shows weekly oscillations, typically higher on weekdays and lower on weekends.
There's a drastic decrease in consumption during the early January and late December holidays.
Both solar and wind power production show yearly seasonality. Solar production is highest in summer and lowest in winter, while wind power production is highest in winter and drops in summer. There was an increasing trend in wind power production over the years.
The long-term trend in overall electricity consumption appeared relatively flat based on the 365-day rolling mean.
Data Science Careers and Required Skills
Based on the sources, the field of data science offers a variety of career paths and requires a diverse skill set. Data scientists and related professionals play a crucial role in analyzing data to gain insights, identify patterns, and make predictions, which can help organizations make better decisions. The job market for data science is experiencing significant growth.
Here are some of the roles offered in data science, as mentioned in the sources:
Data Scientist: A data scientist performs predictive analysis and identifies trends and patterns to aid in decision-making. Their role involves understanding system challenges and proposing the best solutions. They repetitively apply diverse machine learning techniques to data to identify the best model. Companies like Apple, Adobe, Google, and Microsoft hire data scientists. The median base salary for a data scientist in the U.S. can range from $95,000 to $165,000, with an average base pay around $117,000 according to one source. “Data Scientist” is listed as the most common job title.
Machine Learning Engineer: This is one of the roles available in data science. Knowledge of machine learning techniques like supervised machine learning, decision trees, linear regression, and KNN is useful for this role.
Deep Learning Engineer: Another role mentioned within data science.
Data Engineer: Data engineers develop, construct, test, and maintain architectures such as databases and large-scale processing systems. They update existing systems with better versions of current technologies to improve database efficiency. Companies like Amazon, Spotify, and Facebook hire data engineers.
Data Analyst: A data analyst is responsible for tasks such as visualization, optimization, and processing large amounts of data. Companies like IBM, DHL, and HP hire data analysts.
Data Architect: Data architects ensure that data engineers have the best tools and systems to work with. They create blueprints for data management, emphasizing security measures. Companies hiring data architects include Visa, Logitech, and Coca-Cola.
Statistician: Statisticians create new methodologies for engineers to apply. Their role involves extracting and offering valuable reports from data clusters through statistical theories and data organization. Companies like LinkedIn, Pepsico, and Johnson & Johnson hire statisticians.
Database Administrator: Database administrators monitor, operate, and maintain databases, handle installation and configuration, define schemas, and train users. They ensure databases are available to all relevant users and are kept safe. Companies like Tableau, Twitter, and Reddit hire database administrators.
Data and Analytics Manager: This role involves improving business processes as an intermediary between business and IT. Managers oversee data science operations and assign duties to the team based on skills and expertise.
Business Analytics/Business Intelligence: This area involves specializing in a business domain and applying data analysis specifically to business operations. Roles include Business Intelligence Manager, Architect, Developer, Consultant, and Analyst. They act as a link between data engineers and management executives. Companies hiring in this area include Oracle, Uber, and Dell. Business intelligence roles are noted as having a high number of job openings.
To succeed in these data science careers, a strong skill set is necessary, encompassing both technical and non-technical abilities.
Key Technical Skills:
Programming Languages: Proficiency in languages like R and Python is essential. Other languages mentioned as useful include SAS, Java, C++, Perl, Ruby, MATLAB, SPSS, JavaScript, and HTML. R is noted for its strengths in statistical computing and graphics, supporting most machine learning algorithms for data analytics. Python is highlighted as a general-purpose language with libraries like NumPy and SciPy central to data science. Mastering at least one specific programming language is important.
SQL and Database Knowledge: A strong understanding of SQL (Structured Query Language) is considered mandatory for extracting large amounts of data from datasets. Knowledge of database concepts is fundamental. Various SQL forms exist, and a solid basic understanding is very important as it frequently comes up.
Big Data Technologies: Experience with big data, including technologies like Hadoop and Spark, is required. Hadoop sits on top of SQL and is used for creating huge clusters of data. Spark often sits on top of Hadoop for high-end processing.
Data Wrangling/Preparation: This is a process of transforming raw data into an appropriate format for analytics and is often considered the most time-consuming aspect. It involves cleaning (handling inconsistent data types, misspelled attributes, missing values, duplicates), structuring, and enriching data. Functions like arranging, grouping, mutating, filtering, and selecting data are part of this process. Techniques for handling missing values like forward fill are also used.
Machine Learning Algorithms: Knowledge of diverse machine learning techniques is crucial. This includes algorithms like Linear Regression (for continuous variables), Logistic Regression (a classification algorithm for binary outcomes), Decision Trees (for classification and regression), Random Forests (an ensemble method for classification and regression), k-Nearest Neighbors (KNN), Naive Bayes, Clustering (like hierarchical clustering and k-means), and Support Vector Machines (SVM) (often for binary classification). Applying these algorithms to data to identify patterns and build predictive models is core to data science.
Data Visualization: This involves creating visual representations of data using algorithms, statistical graphs, plots, and other tools to communicate information effectively. Being able to visualize models is important for troubleshooting. Various plots like line plots, bar charts, histograms, scatter plots, box plots, heat maps, pie charts, and dendrograms for clustering are used. Tools like Tableau, Power BI, and QlikView are used for creating reports and dashboards. R provides packages and functions for visualization, including base graphics, grid graphics, plot, and ggplot2.
Statistics: A data scientist needs to know statistics, which deals with collecting, analyzing, and interpreting data. Understanding probabilities, p-scores, f-scores, mean, median, mode, and standard deviation is necessary.
Model Validation: Evaluating the performance of models is crucial, using metrics like accuracy, RMSE, MAE, MSE, and MAPE.
Key Non-Technical Skills:
Intellectual Curiosity: This is highlighted as a highly important skill due to the rapidly changing nature of the field. It involves updating knowledge by reading content and books on data science trends.
Business Acumen/Intuition: Understanding how the problem solved can impact the business is essential. Knowing the company’s needs and where the analysis is going is crucial to avoid dead ends.
Communication Skills: The ability to clearly and fluently translate technical findings to non-technical teams is vital. Explaining complex concepts in simple terms is necessary when communicating with stakeholders and colleagues who may not have a data science background.
Versatile Problem Solver: Data science roles require strong analytical and quantitative skills.
Self-Starter: As the field is sometimes not well-defined within companies, data scientists need to be proactive in figuring out where to go and communicating that back to the team.
Teamwork: Data science professionals need to work well with others across the organization, including customers.
Ability to Visualize Models and Troubleshoot Code: This specific skill goes beyond just visualization for communication; it’s about breaking down and debugging complex models.
Career Outlook and Resume Tips:
The sources indicate significant growth in data science job listings.
For building a resume, key elements include a summary that ties your skills and experience to the specific company. Including links to professional profiles like LinkedIn and GitHub is important. The resume should be concise, ideally taking only about 30 seconds to a minute to glance over. Sections typically include experience, education, skills, and certifications. The order can be adjusted based on experience level and the specific job requirements. Highlighting experiences relevant to data science is advised. Remember to keep the resume simple, short, and direct.
R For Data Science Full Course Data Science With R Full Course Data Science Tutorial Simplilearn
These sources offer an extensive exploration of data analysis and Power BI, focusing on the role of a data analyst and the process of transforming raw data into valuable insights. They cover essential concepts like data sourcing, cleaning, modeling, and visualization, emphasizing the importance of effective communication of findings. The texts also introduce advanced topics such as DAX calculations, performance optimization, and the integration of Power BI within a larger enterprise data flow, highlighting the potential of data to drive strategic business decisions. Furthermore, they touch upon the application of generative AI in data analysis and provide guidance on preparing for the Microsoft PL-300 certification exam, offering real-world scenarios and career insights through examples of aspiring data analysts.
Foundations of Data Analysis
Data analysis is a multifaceted process crucial for turning raw data into meaningful insights and informed decisions for businesses and organizations. It involves identifying, cleaning, transforming, and modeling data to discover meaningful and useful information. Data analysts use various techniques to explore, interpret, and draw meaningful conclusions from processed data.
The Importance of Data Analysis
Data is an essential business component, but raw data is only meaningful after proper interpretation and analysis. **Data analysts are crucial because they help organizations make sense of the vast amounts of collected data, turning it into insights that inform decisions.** This analytical work helps businesses identify growth opportunities, improve operations, gain a competitive advantage, identify the cause of problems, uncover trends, and make decisions that can improve business performance. Ultimately, data analysis drives strategic decision-making and can significantly impact an organization's success.
The Data Analysis Process
The data analysis process typically involves several interconnected stages:
Identifying the analysis purpose or defining the business problem: This is the foundational step, determining what you aim to achieve or the questions you need to answer with the analysis. Gathering the right data is fundamental to ensure the analysis is relevant and useful, and understanding the purpose informs the type and scope of data needed. Consulting with stakeholders is key to determining the purpose.
Data Collection and Preparation: Data is gathered from various sources. This raw data is often unorganized and may have missing values or inconsistencies. Data preparation involves cleaning, standardizing, organizing, and transforming the data into a usable format for analysis. The Extract, Transform, Load (ETL) process is a common method for processing data, involving extracting data from sources, transforming it to make it consistent and ready for analysis, and loading it to a suitable destination. Data wrangling is another term for this process of processing, cleaning, and transforming data.
Data Processing and Modeling: Processing transforms raw data into a structured, analyzable form. Data modeling organizes data to make sense of the information and generate insights. This can involve understanding basic modeling concepts, using tools like DAX to create calculations, and optimizing model performance. Common data schemas include star and snowflake schemas, which organize data into fact and dimension tables.
Data Analysis, Visualization, and Interpretation: This stage involves exploring processed data and generating insights. Data analysis uses various techniques to explore, interpret, and draw meaningful conclusions from the processed data. Analytical techniques include statistical analysis, hypothesis testing, and identifying patterns, trends, and relationships. Data visualization is a powerful tool used to communicate these insights. Visualizations (like charts and graphs) transform complex data into understandable representations, helping to spot patterns, anomalies, and trends at a glance. Interpretation involves understanding what the patterns and trends reveal.
Reporting and Sharing Data Insights: Insights are communicated to stakeholders through reports and dashboards. Dashboards consolidate critical information visually on one screen to achieve specific objectives. Sharing reports requires considering factors like accessibility, visual appeal, and security. Effective communication and storytelling are essential to convey findings responsibly and ethically.
Implementing Insights and Recommendations: Informed decisions are made based on the analyzed data, guiding actions and adjustments within the business to achieve objectives.
This data flow process – collection, processing, analysis, and decision-making – is a fundamental concept in business.
Roles in Data Analysis
The data analysis process involves various roles that collaborate to achieve data-driven success:
Data Engineer: Designs and constructs data infrastructure, including pipelines, cleaning, pre-processing, and transforming raw data for analysts and scientists.
Data Analyst: Examines data sets to identify trends, patterns, and insights. They use tools to visualize and present data, making it digestible for stakeholders, and work closely with teams to align analysis with business goals. The data analyst is often a central figure in the process.
Data Scientist: Dives deeper into data, creating predictive models using machine learning and statistical techniques to identify hidden patterns and optimize decisions. They often collaborate with data analysts.
Database Administrator (DBA): Works on the maintenance, performance, and security of databases, ensuring data is stored efficiently and remains accessible.
Data Architect: Creates the blueprint for data management systems, designing data models and strategies for storage, integration, and retrieval.
Business Intelligence (BI) Analyst: Transforms data into actionable insights, focusing on Key Performance Indicators (KPIs), using BI tools to visualize and present data to stakeholders, and collaborating with business leaders to understand their goals.
These roles are essential for providing organizations with the information they need for informed, data-driven decisions.
Skills for Data Analysts
To succeed, data analysts require a mix of technical and non-technical skills:
Technical Skills: Proficiency with tools like Microsoft Excel and Microsoft Power BI. Experience with programming languages such as R and Python supports analysis and visualization. Understanding SQL (Structured Query Language) is vital for interacting with databases. Key technical activities include data wrangling (cleaning and transforming data), data modeling (organizing data for analysis), creating calculations with languages like DAX, data visualization (creating charts and reports), and using statistical functions. Other important technical skills include data profiling, managing data storage modes, creating aggregations, joining and merging data, grouping and binning data, and performance optimization.
Non-Technical (Soft) Skills: These are crucial for connecting with and influencing stakeholders. Essential skills include **effective communication** to present complex information clearly and concisely to various audiences, diplomacy for navigating disagreements and maintaining relationships, **understanding end-user needs** to tailor analysis and provide relevant insights, and being a technical interpreter who translates complex concepts for non-technical stakeholders. **Strategic thinking, awareness of impact, and understanding the business context** are also important, as is the ability to use data to tell a story or narrative.
By developing these technical and non-technical skills, data analysts can collaborate effectively, create actionable insights, inspire change, and make lasting impacts.
Tools and Techniques Used in Data Analysis
Data analysts utilize a range of tools and techniques:
Software and Tools: Microsoft Excel is used for designing and managing spreadsheets and preparing data. **Microsoft Power BI** is a powerful tool for processing, analyzing, and sharing data, known for its user-friendly interface, rich visualizations, and advanced analytics capabilities. The Power BI workflow spans Power BI Desktop, Power BI Service, and Power BI Apps. Power Query Editor within Power BI is used for data preparation, cleaning, transformation, and ETL tasks. SQL Server and other databases are used for data storage. Programming languages like R and Python are used for data analysis and visualization.
Techniques:
ETL (Extract, Transform, Load): A fundamental process for preparing data.
Data Wrangling/Cleaning/Transformation: Making raw data consistent and usable.
Data Modeling: Organizing data into structured formats like star or snowflake schemas.
DAX (Data Analysis Expressions): A formula language used to create custom calculations and measures within data models (a short sketch follows this list).
Calculations and Statistical Functions: Performing mathematical operations and applying functions like average, median, count, min, and max to data to reveal insights.
Data Visualization: Creating graphical representations of data such as charts, graphs, scatter plots, bubble charts, dot plots, and tables to make complex information understandable. Interactive features like filtering, sorting, slicers, and bookmarks enhance visualizations.
Data Profiling: Examining data sets to evaluate accuracy, completeness, and statistical distribution. Tools analyze column quality, distribution, and profile statistics.
Grouping and Binning: Organizing data points into chosen groups or equal-sized segments.
Clustering: Identifying similarities in data attributes to divide data into subsets or clusters.
Time Series Analysis: Analyzing data in chronological order to identify trends.
Performance Optimization: Modifying data models and reports to improve speed and efficiency, especially with large data volumes. Techniques include filtering, sorting, indexing, aggregation, and choosing appropriate storage modes. The Performance Analyzer tool helps diagnose issues.
Data Storage and Management: Understanding different data types (structured, unstructured, semi-structured) and appropriate storage solutions, as well as concepts like normalization and indexing in databases.
Connecting to Data Sources: Using methods like Import mode or Direct Query mode to bring data into tools like Power BI.
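To make the DAX and statistical-function techniques above concrete, here is a minimal sketch; the Sales table and its Quantity and UnitPrice columns are illustrative assumptions, not names from the source material.

```dax
-- Hypothetical measures over an assumed Sales table (columns: Quantity, UnitPrice).
Total Sales = SUMX ( Sales, Sales[Quantity] * Sales[UnitPrice] )

-- Statistical summaries of the same column: average, median, min, and max.
Average Quantity = AVERAGE ( Sales[Quantity] )
Median Quantity = MEDIAN ( Sales[Quantity] )
Min Quantity = MIN ( Sales[Quantity] )
Max Quantity = MAX ( Sales[Quantity] )
```

Each definition would be created with New measure on the Modeling tab; dropping a measure onto a visual then aggregates it within whatever filter context the report applies.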
These tools and techniques empower data analysts to extract insights, support business intelligence, and facilitate data-driven decision-making. The sources frequently use the example of Adventure Works, a fictitious bicycle company, to illustrate how data analysis is applied in real-world business scenarios.
Mastering Microsoft Power BI for Business Intelligence
Microsoft Power BI is an interactive data visualization product and a comprehensive business analytics solution. It is considered an essential resource for many organizations across various industries.
Importance in Business
Power BI plays a crucial role in helping businesses make sense of the vast amounts of collected data, transforming it into actionable insights that inform decisions. It enables organizations to harness the full potential of their data to identify patterns and trends and to drive strategic decision-making. Power BI supports data-driven decision-making and is vital for providing organizations with the information they need for informed decisions. For companies like Adventure Works, Power BI is used to extract insights from large amounts of data.
Components and Workflow
Microsoft Power BI has multiple components that work together. The main components are Power BI Desktop, Power BI Service, and Power BI Apps. Other related components include Power BI Mobile, Power BI Report Server, and Power BI Embedded.
Power BI Desktop is a Windows-based application used by data analysts and report designers to clean, transform, and load data, create a data model, design reports, and publish them.
Power BI Service is the cloud-based (SaaS) part of Power BI, used by report users and administrators. It offers advantages like accessibility, scalability, collaboration tools, and data backup and recovery features.
Power BI Apps are the native mobile applications available on iOS, Android, and Windows. They allow access to insights on the go.
A typical workflow often starts with the creation of a report in Power BI Desktop. Report designers and developers are primarily responsible for this task. When the report is ready, you publish it to the Power BI Service, where administrators can assign permissions and specific users can consume the report. You can also share reports with colleagues, your whole organization, or external stakeholders who need to draw insights. Insights are also communicated through dashboards, which consolidate critical information visually. Power BI Service and Power BI Mobile can be used to view dashboards.
Key Capabilities and Features
Power BI offers a wide range of features and capabilities for data analysis and business intelligence:
Data Connection and Preparation:
Power BI supports a wide range of data sources, including traditional databases, Excel spreadsheets, cloud-based services, on-premises databases, external enterprise applications, and APIs. Power BI connectors are used to access these sources.
Data preparation is crucial for making raw data usable. This involves cleaning, standardizing, organizing, and transforming data.
The Extract, Transform, Load (ETL) process is fundamental for preparing data in Power BI. Power Query Editor is the tool used for data preparation, cleaning, transformation, and ETL tasks. Data wrangling is another term for processing, cleaning, and transforming data.
Techniques include data profiling, joining and merging data, and grouping and binning data to classify or segment data points.
Data Modeling:
Data modeling means creating visual representations of your data in Power BI to organize it and make sense of the information. It involves understanding how different data elements interact and outlining the rules that influence these interactions.
Power BI allows you to identify or create relationships between data elements. You can define relationships between tables and assign data types.
Common data schemas include star and snowflake schemas, which organize data into fact and dimension tables.
DAX (Data Analysis Expressions) is a powerful language used to create custom calculations, measures, columns, and tables within data models. DAX is fundamental to data analysis in Power BI (see the date-table sketch after this list).
Performance optimization is important, especially with large data volumes. Techniques include modifying models, reports, and queries, as well as filtering, sorting, indexing, aggregation, and choosing appropriate storage modes. The Performance Analyzer tool helps diagnose issues.
Aggregations in Power BI enable diving deeper into data without compromising speed and performance. They involve summarizing or consolidating large volumes of data into manageable summary tables.
Understanding the different data storage modes (Import, Direct Query, Dual, Composite) is vital, as they determine where data is stored and how queries are sent. Import mode stores data in Power BI’s in-memory storage, Direct Query keeps data in the source, and Dual mode can act as either. Composite mode allows combining different storage modes.
Creating Hierarchies (date, product, geographical) is a significant feature allowing analysis at different levels of granularity within the same visual using drill down.
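As a minimal illustration of the modeling points above, a dedicated date table is often the first calculated table added to a model; the Sales[OrderDate] column here is an assumed example, not from the source material.

```dax
-- A calculated date table with one row per day, spanning the assumed Sales data.
Date =
ADDCOLUMNS (
    CALENDAR ( MIN ( Sales[OrderDate] ), MAX ( Sales[OrderDate] ) ),
    "Year", YEAR ( [Date] ),
    "Month Number", MONTH ( [Date] ),
    "Month", FORMAT ( [Date], "MMM" )
)
```

Marking this table as a date table and building a Year > Month > Date hierarchy from its columns then enables drill down and reliable time intelligence calculations.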
Analysis Techniques:
Power BI empowers you to transform raw data into meaningful insights through various advanced tools and functionalities.
Calculations are the foundation of data analysis in Power BI and are created using DAX. Common calculations include aggregations and statistical functions like average, median, count, min, and max.
Power BI offers analytics capabilities that add significant value to visualizations, including statistical summary tools.
Identifying patterns, trends, and anomalies is crucial. Scatter charts can help identify outliers.
Time series analysis involves analyzing data in chronological order to identify trends. Power BI supports time series forecasting to predict future trends.
Clustering identifies similarities in data attributes to divide data into subsets.
The Analyze feature automatically detects relationships and connections, providing automated insights. You can right-click on a data point to analyze fluctuations like increases or decreases.
Power BI leverages AI capabilities and machine learning algorithms to provide insights. This includes AI visuals like Key Influencers and Decomposition Trees for understanding the drivers behind outcomes, sentiment analysis, and key phrase extraction.
The Q&A feature is a natural language processing tool allowing users to ask questions about data in plain English and get answers as visuals. It learns and adapts over time.
Quick Insights automatically searches datasets to discover and visualize potential patterns, trends, and outliers using machine learning and statistical functions.
Dynamic reports can use What-If parameters for interactive adjustments and scenario analysis (see the sketch after this list).
Metrics and Scorecards are critical for tracking progress towards specific objectives and providing a comprehensive view of performance.
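For the What-If parameters mentioned above, Power BI's numeric range parameter dialog generates a pattern much like the following sketch; the names, the range, and the Total Sales measure are illustrative assumptions.

```dax
-- A one-column table of candidate discount rates (0% to 50% in 5% steps);
-- GENERATESERIES names its output column [Value].
Discount = GENERATESERIES ( 0, 0.5, 0.05 )

-- The rate currently selected on a slicer, defaulting to 0 when nothing is chosen.
Discount Value = SELECTEDVALUE ( 'Discount'[Value], 0 )

-- A hypothetical scenario measure applying the selected rate to total sales.
Discounted Sales = [Total Sales] * ( 1 - [Discount Value] )
```

Binding the parameter column to a slicer lets report viewers adjust the rate and watch the scenario measure update in real time.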
Visualization:
Data visualization is a powerful tool for communicating insights. Visualizations transform complex data into understandable representations, helping to spot patterns, anomalies, and trends.
Power BI offers a variety of built-in visualization types, such as bar charts, maps, tables, cards, multi-row cards, gauges, the KPI visual, scatter plots, bubble charts, and dot plots. Heat maps, tree maps, and 3D visualizations are also discussed for handling high-density data. Choropleth and shape maps are common map visuals.
Custom visuals can be imported from the Power BI marketplace or created using Python or R.
Design principles are important for creating effective visualizations. This includes considering color theory, appropriate positioning and scale, maintaining cohesion and consistency, and avoiding clutter.
Accessibility is crucial in report design, including features like alt text, sufficient color contrast, keyboard navigation, and compatibility with screen readers. Power BI has built-in tools to support this.
Visualizations can be interactive, allowing users to drill down, filter, and sort data.
Visual interactions determine how selecting data in one visual affects others. The primary types are filter (filters other visuals), highlight (dims non-selected data), and none (no interaction).
Slicers help users drill down to deeper insights and can be synchronized across report pages to improve user experience.
The Selection Pane helps manage report elements, allowing naming, grouping, and layering visuals. Bookmarks can also be used to create a smooth narrative.
Power BI allows optimizing report layouts for mobile devices to ensure proper display on smaller screens.
Sharing and Collaboration:
Insights are communicated through reports and dashboards. Publishing reports to the Power BI Service makes them accessible and collaborative.
Power BI Workspaces are specialized areas that hold assets like reports, dashboards, and datasets. They help organize assets, provide security, enable collaboration, and allow quick updates. There are personal and shared workspaces.
Workspace roles (viewer, contributor, member, admin) determine how individuals interact with content. Permissions can be managed.
You can share Workspace assets as an app, which can have multiple audience groups with tailored access.
Data security is important for safeguarding sensitive data. Power BI offers authentication tools, sharing links with controlled permissions, sensitivity labels, and data permissions.
Row-Level Security (RLS) controls which individuals can view data based on predefined roles and rules, enhancing security and user experience.
You can promote and certify datasets to establish trust and standardize data quality, helping users find the most accurate data.
Data Gateways establish a secure connection between Power BI cloud services and on-premises data sources. Types include the on-premises data gateway (standard mode), the on-premises data gateway (personal mode), and the Azure virtual network data gateway. They help sync data and keep datasets up to date via scheduled refresh.
Subscriptions and Alerts provide automated delivery of data snapshots (emails/notifications) and notifications when specific conditions are met. They enhance user engagement and support real-time decision-making.
Overall, Power BI transforms raw data into actionable intelligence, acting as a toolkit with mapping techniques and navigation support to help users cut through data noise and interpret patterns. It is a central tool in the data flow within a business, from collection through processing and analysis to decision-making.
Power BI Data Transformation Explained
Data transformation is a fundamental process in Microsoft Power BI, essential for preparing raw data for analysis and generating meaningful insights. It involves altering the structure, format, or values of data to make it suitable for analysis. This often includes cleaning, structuring, and enriching the data.
Why is Data Transformation Necessary?
Raw data, as collected from various sources, is often untidy, incomplete, inconsistent, scattered across different systems, or may have missing values or duplicate entries. Working with such data can lead to inaccurate or misleading analysis results and, consequently, poor business decisions. Data transformation addresses these issues by ensuring the data used for analysis is accurate, clean, consistent, and reliable. It standardizes data across multiple sources and organizes it to be more understandable.
Where Transformation Happens in Power BI
Within Power BI, data transformation is primarily handled by Power Query Editor. Power Query is a powerful ETL (Extract, Transform, Load) tool integrated into Power BI Desktop. It provides a graphical user interface (GUI) for connecting to various data sources, cleaning data, and performing transformations with ease.
Key Data Transformation Techniques and Capabilities
Power Query Editor offers a range of tools and features for transforming data:
Data Cleaning: This involves identifying and correcting errors and inconsistencies. Techniques include removing duplicate entries, handling or filling in missing values (nulls), fixing incorrect data types, and standardizing formats (e.g., ensuring consistent spelling or capitalization). Filtering data is also a key cleaning method.
Structuring and Shaping Data: This prepares data for analysis. Operations include removing unwanted columns or rows, splitting or merging columns (e.g., combining first and last names into a full name), changing data types (e.g., text to numeric, date, or decimal), and sorting data. Promoting header rows is also a common shaping task. Grouping data allows manually dividing data points, while binning automatically separates data points into segments based on number or size.
Combining Data: It is common to need to combine data from multiple sources.
Append: Adds rows from one table to another. This is useful for consolidating data that has the same columns but spans across different files or databases (e.g., monthly sales files).
Merge: Consolidates data from multiple sources into a single table based on matching criteria or key columns, similar to joining tables in a database. This is used when data needs to be combined horizontally based on relationships between tables.
Reshaping Data Structures:
Unpivot: Transforms data from a “wide” format (many columns) to a “narrow” format (fewer columns), often converting column headers into row values. This is useful for data normalization and making comparisons easier.
Pivot: Transforms data from a “narrow” format to a “wide” format, converting rows into columns based on specific values.
Adding Calculated Columns: Power Query allows adding new columns based on calculations performed on existing columns, such as calculating total price by multiplying quantity and unit price. DAX is used for calculations within the data model, but calculated columns can also be created during the transformation stage in Power Query using its M formula language or built-in features (a minimal DAX sketch appears after this list).
Query Management: Power Query’s Applied Steps list is a critical feature, visually representing every transformation applied to a query. This list can be reviewed, modified, deleted, or reordered, ensuring transparency and allowing for easy undo or redo functionality. Referencing a query creates a new query based on an existing one, inheriting its steps. Changes to the original query automatically update the referenced query, which is useful for maintaining complex transformation workflows. Duplicating a query creates an independent copy that can be modified without affecting the original.
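To illustrate the calculated-column example above, this is the data-model (DAX) version of the quantity-times-unit-price column; the table and column names are assumptions, and the same result could equally be produced in Power Query as an added custom column.

```dax
-- Calculated column on an assumed Sales table; DAX evaluates the expression
-- once per row (row context), storing the result alongside the imported data.
Total Price = Sales[Quantity] * Sales[Unit Price]
```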
Relationship with Data Loading and Profiling
Transformation is typically performed after data extraction and before data loading into the Power BI data model. The loading process brings the transformed data into Power BI for analysis and visualization.
Before transforming or loading data, it is essential to inspect and profile the data. Power Query Editor provides tools like Column Quality, Column Distribution, and Column Profile to evaluate the data’s accuracy, completeness, validity, and distribution, and to identify anomalies or outliers. This profiling step helps identify where transformations are needed.
Benefits of Data Transformation
Effective data transformation is crucial for generating accurate reports and gaining valuable insights. It improves data quality and consistency, enhances performance by preparing data efficiently, simplifies data management, and helps organizations make informed decisions based on reliable information.
Power BI Data Visualization Fundamentals
Data Visualization in Power BI
Data visualization is the graphical representation of data. In Microsoft Power BI, it is much more than simple graphical depiction; it involves converting raw data into a visual format to help identify patterns, trends, and insights that might not be apparent in text-based data. Visualizations enable you to communicate complex data and insights in a simple, appealing way by presenting data graphically. This makes it easier for stakeholders to grasp key insights, trends, and patterns that may be difficult to identify from raw data or tables.
Why is Data Visualization Important?
Data visualization is crucial for generating accurate reports and gaining valuable insights. It enhances business intelligence, particularly in complex and dynamic business environments. Key benefits include:
Revealing Patterns and Trends: Data visualizations can reveal patterns, trends, and correlations hidden in raw data. For example, a bar chart could visualize sales data demonstrating geographic regions where sales are highest.
Making Data Accessible: Visualizations make data more accessible to a broader audience, as most stakeholders can understand a well-designed chart or graph. This encourages engagement with data and contributes to data-driven decision-making.
Powerful Communication Tool: Visualizations are a powerful communication tool that can tell a compelling story with data, making insights more memorable and persuasive.
Driving Data-Driven Decisions: By providing clear, interactive displays, visualizations act like a navigation system through complex data, helping businesses make informed decisions based on reliable information.
Real-time Analysis: Visualizations can enable real-time data analysis. For example, as sales figures are updated, visualizations in Power BI can update automatically, providing up-to-date insights.
Where Visualization Happens in Power BI
Visualizations are primarily created in the Report View of Power BI Desktop. This is the primary canvas where you design and create your visualizations, adding and arranging different visual elements. Reports can have multiple pages organized using tabs at the bottom of the window. Once created in reports, visualizations can also be pinned to dashboards in the Power BI Service, which provide a consolidated, one-page summary of the most important metrics or key performance indicators (KPIs).
Workflow for Creating Visualizations
Creating visualizations in Power BI typically follows a workflow:
Connecting to data sources.
Using Power Query Editor to extract, transform, and load the data.
Loading the refined data into Power BI’s data model.
Representing this processed data in visualizations.
Key Components and Concepts
Several key components and concepts are involved in creating and using visualizations in Power BI:
Visualizations Pane: Located on the right side of the window, this pane contains a gallery of visual elements you can add to your report canvas. You add visuals by clicking or dragging them onto the report view.
Fields Pane (or Data Pane): Also on the right side, this pane displays the data tables and fields available for your report. You use this pane to populate your visualizations with data by dragging fields onto the visual or specific field wells.
Field Wells: These are sections within the visualizations pane where you drag data fields to define how they are used in the visual, such as axes, legend, values, or tooltips.
Axes (X and Y): These represent the data points you want to compare or analyze.
Categorical Axes: Used to represent discrete, non-numeric data points (categories). PowerBI automatically arranges data points in the order they appear in the dataset or allows sorting. Common in bar charts and column charts.
Continuous Axes: Designed to represent numerical data points with an inherent order along a continuous scale. Ideal for visualizing quantitative information to identify trends and patterns. Common in line charts, area charts, and scatter plots.
Legend: Controls the color coding or grouping of elements in your chart, helping differentiate between different categories or subgroups. It makes it easier to understand which color represents which item.
Tooltips: Display data or extra information when you hover over the data points of a chart. Tooltips can be customized to include additional fields.
Formatting: Power BI offers extensive options to format the appearance and feel of visualizations to improve their aesthetic appeal and readability and to align with branding. This includes options for colors, fonts, grid lines, titles, backgrounds, and more. Formatting options are found in the ‘Format visual’ tab of the Visualizations pane.
Common Visualization Types
Power BI offers a wide variety of visualization types:
Charts:
Column Charts: Compare different categories in a vertical orientation, useful for demonstrating changes over time or comparisons, generally with fewer than 10 categories.
Bar Charts: Similar to column charts but horizontal, useful for comparing larger quantities or categories with lengthy labels.
Line Charts: Best suited for showing trends over time by connecting individual numeric data points, particularly effective for large datasets.
Area Charts: Similar to line charts but with the area beneath the line filled, helping compare quantities and show part-to-whole relationships over time or across categories. Stacked area charts emphasize the total across several categories.
Pie Charts: Circular graphics divided into slices to illustrate numerical proportions of a whole. Each slice represents a category, and its size is proportional to its quantity. Less effective with too many categories.
Donut Charts: Similar to pie charts but with a blank center. Ideal for showing a dataset as a proportion of a whole.
Scatter Charts: Use dots to represent values for two numeric variables, plotting them along two axes to illustrate how one factor is affected by another, representing correlations and helping identify anomalies or outliers.
Bubble Charts: A variation of scatter plots where a third variable is represented by the size of the bubble. They can depict multi-dimensional data in a single view.
Funnel Charts: Present sequential or staged data, such as a sales conversion process, helping identify trends and bottlenecks.
Combo Charts (Line and Column): Combine line and column charts to display complex and related data points seamlessly.
Tree Maps: Use nested rectangles to display hierarchical or proportional data. Useful for visualizing larger datasets without becoming overly complex compared to pie charts.
Tables: Display raw, detailed data and exact numbers in columns and rows, providing a comprehensive numerical view. Useful for examining exact figures and making precise comparisons.
Maps: Visualize geographical data.
Shape Maps: Color-code geographical regions based on data values to reveal insights.
Choropleth Maps (Filled Maps): Similar to shape maps, shading or patterning geographical areas (countries, states, regions) to illustrate quantitative data values.
Heat Maps: Use color gradients to represent the density and distribution of data across geographical regions or grids. Not a core Power BI visual, but one can be imported or created with Python.
ArcGIS Maps: Rich in map visualization features.
KPI Visuals: Specifically designed to display key performance indicators. These include Cards (single value), Multi-row Cards (multiple values per row), Gauges (progress toward a target), and the KPI visual (performance against a target with a trend line).
Advanced Visualization Techniques
Power BI offers advanced capabilities for visualizing complex data:
Handling High-Density Data: Techniques include using aggregations and summarization, drill through and drill down, color coding (like heat maps), and using 3D and custom visualizations.
Hierarchies and Drill Down/Through: Organizing data into hierarchies (like Date, Product, Geography) allows users to explore data from a general overview level down to specific details within the same visualization. Drill down allows navigating through these hierarchy levels. Drill through is a technique for creating summary pages with high-level insights.
Custom Visualizations: User-defined visual elements for specific requirements. They can be imported from the Power BI marketplace (AppSource).
Python/R Visuals: Integration with Python and R programming languages allows creating dynamic and sophisticated custom visualizations. This requires specialist expertise and has limitations on data size.
Key Influencers Visual: An advanced analytics feature that uses AI algorithms to identify key contributors behind increases or decreases in a metric, such as sales.
Decomposition Tree: Another specialized analytics tool to navigate through data hierarchy levels to understand how a final value is influenced by different categories.
Clustering: Using algorithms (like in scatter plots) to group data points based on patterns and identify hidden relationships.
Interactions: Visualizations can be configured to interact with one another.
Filter: Selecting a data point in one visual filters the data displayed in others.
Highlight: Selecting a data point highlights related data in other visuals while dimming the rest, maintaining context.
None: Disables interaction, useful when visuals should function independently.
Slicers: Visual filters that allow viewers to segment and filter the data in real-time.
Data Visualization and Data Storytelling
Data visualization is a crucial part of data storytelling. Data storytelling involves leveraging narrative, data, and visualizations to communicate insights effectively. Visualizations act as a bridge between raw data and actionable insights, supporting the narrative and making complex information accessible and engaging for the audience. By choosing appropriate and effective data visualizations, analysts can allow viewers to quickly grasp information and identify trends, patterns, and insights.
Accessibility
When designing reports and visualizations, it is important to consider accessibility. This means creating reports that can be easily used and understood by all individuals, including those with disabilities. Features supporting accessibility in Power BI include providing alt text for visuals, ensuring sufficient color contrast, enabling keyboard navigation (tab order), using markers on lines, and ensuring compatibility with screen readers. High-contrast themes are also available.
Essential Concepts in Data Security
Data security is considered paramount in our digital age, like safeguarding your most valuable possessions in a vault with a strong lock. Data, being the lifeblood of modern organizations, is subject to a range of threats, including cyber attacks, breaches, and unauthorized access. Ensuring the security of this “digital gold mine” is not just a choice, but a necessity. In the world of data visualization, ensuring data security is of utmost importance. This includes protecting sensitive information and maintaining data integrity. Incorporating robust security measures is crucial throughout the visualization process.
Why Data Security Matters
Working with data often involves handling sensitive information, such as customer data, financial records, or proprietary business insights. Ensuring the security of this data is essential to:
Maintain trust.
Comply with regulations.
Protect against unauthorized access or data breaches.
Safeguard the company’s reputation and success.
Prevent potential harm to the company and its stakeholders.
Mishandling sensitive data can lead to serious consequences, including financial loss, legal troubles, brand damage, and competitive disadvantage. It can also damage the relationship between an organization and its workforce if employee data is leaked.
Identifying Sensitive Data
Sensitive data contains important information about a business or its stakeholders that, if mishandled, could cause harm or misuse. A simple rule is: if it’s information that could damage the company’s reputation, finances, or stakeholder privacy, it’s sensitive data. Examples include:
Customer details.
Financial records (including profit margins).
Employee information.
Proprietary business knowledge or insights.
Product designs.
Vendor contracts.
Any information that offers intimate knowledge not meant for circulation can be classified as sensitive.
Measures for Safeguarding Data
Power BI offers various measures to ensure data security:
Access Control & Authentication: Controlling access to data is vital to ensure only authorized individuals can view or interact with specific data sets. Before a user can access a report, they need to prove who they are through an authentication system. Once authenticated, the system determines what data they are permitted to access. This helps protect organizations like Adventure Works from internal leaks and unauthorized external breaches. Power BI allows defining roles for users with specific permissions tied to them, ensuring data is distributed on a need-to-know basis. Regularly reviewing and updating these roles is essential. Access logs and audit trails can also track and monitor data usage.
Row-Level Security (RLS): RLS is a powerful data governance capability that controls which individuals can view data based on predefined roles and rules. It allows restricting data visibility so each user can only access data they are authorized to view, ensuring data integrity and confidentiality.
Benefits: Precise control over data visibility, prevention of accidental data leaks, safeguarding sensitive data, easier handling of complex data access needs as data scales, assistance with compliance and auditing, and a reduced risk of data breaches.
Types:
Static RLS: Uses predefined rules based on user roles and is suitable for a fixed set of users or simple logic. You configure this in Power BI Desktop by managing roles, adding filters using DAX expressions, and testing, and you then assign users to these roles in the Power BI Service (minimal example filter expressions appear after this list).
Dynamic RLS: Adjusts real-time data access based on user roles and attributes stored in the data itself, using DAX expressions like USERPRINCIPALNAME() to filter data dynamically. This is ideal when user access is based on varying criteria, such as region-specific data access.
Considerations: Both types require thorough testing to ensure accurate and secure visibility. Dynamic RLS can potentially slow down data retrieval and requires regular maintenance.
Data Anonymization and Masking: These techniques protect privacy by removing personally identifiable information or replacing it with pseudonyms. Techniques include generalization, suppression, or noise addition. Data masking specifically allows working with obscured versions of sensitive data, balancing transparency and security, for example, viewing only the last four digits of a credit card number. These are used for analysis and visualization while preserving privacy, especially when sharing data with external partners.
Data Integrity: Maintaining data integrity is crucial to ensure the accuracy and reliability of the visualized information. Key aspects include data validation, error detection, and consistency checks. Implementing data validation rules and performing regular audits helps identify and rectify anomalies. Encryption techniques can also prevent unauthorized modifications and tampering.
Secure Data Transmission: When transferring data or sharing visualizations, it is essential to prioritize secure data transmission using encrypted connections such as HTTPS or SSL/TLS. These protocols ensure data is encrypted during transit, making it difficult for unauthorized individuals to intercept or manipulate it. Other secure methods include using VPNs, two-factor authentication (2FA), enterprise cloud storage solutions, secure protocols like SFTP, and secure cloud-based platforms for distribution. Sharing reports externally requires secure embedding methods like publish to web or embed code, chosen carefully based on data sensitivity.
Data Sensitivity Labels: Power BI’s data sensitivity labels allow categorizing data to safeguard company reputation and trust. They act like digital tags indicating the required level of confidentiality. Applying these labels properly ensures data protection, especially when sharing or exporting. The sources mention six categories: Personal, Public, General, Confidential, Highly Confidential, and Restricted. These labels can also include encryption settings, preventing access even if a file is inadvertently shared.
Sharing Permissions and Link Management: Power BI’s link sharing feature allows distributing reports via a URL. However, this poses security risks, so access must be carefully managed. Power BI offers different sharing options for links (e.g., people in your organization, specific people). Configuring sharing permissions is vital to safeguard data by determining who can access it and what they can do. Permission types include Read (view only), Build (use data for analysis/reports but not change the source), Reshare (distribute to authorized users), Write (alter data sets), and Owner (comprehensive control). These permissions can be configured using the ‘Manage permissions’ option in the Power BI Service. When sharing externally, it is important to carefully control what information is shared and maintain strict security measures. Safe links with clear permissions, expiration dates, and limitations to specific users enhance report security. User licensing also needs to be considered for external partners.
External Sharing Settings: Power BI administrators can adjust settings to enable external sharing while maintaining security standards, such as authorizing users or groups, setting content restrictions, controlling link expiration, and mandating authentication.
Power BI Gateways: Data gateways, such as the on-premises data gateway, bridge the gap between Power BI’s cloud services and on-premises data sources, allowing secure use of on-premises data in the cloud. The connection is outbound, which helps reduce security vulnerabilities.
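As a minimal sketch of the static and dynamic RLS rules promised above, these are DAX filter expressions of the kind attached to roles via Manage roles in Power BI Desktop; the Sales Territory table, its columns, and the role design are illustrative assumptions.

```dax
-- Static RLS: a role such as "Europe Sales" sees only European rows.
'Sales Territory'[Region] = "Europe"

-- Dynamic RLS: each signed-in user sees only rows tagged with their own login,
-- assuming an Email column that stores user principal names.
'Sales Territory'[Email] = USERPRINCIPALNAME ()
```

Because these expressions filter the table itself, every related fact table is filtered in turn, which is why both variants need thorough testing with View as role before rollout.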
Data Security in the Data Flow
Security considerations are relevant throughout the data flow stages: collection, processing, analysis, and decision-making. Processes within a business govern how data is acquired, stored, manipulated, and shared to support operations. Safeguarding data is important during data preparation (cleaning and transformation) and when keeping data accurate through data refresh. Planning for data storage and management involves considering security and implementing measures to protect data against unauthorized access, theft, tampering, and emerging threats.
Roles and Responsibilities
Various roles are involved in ensuring data security. Data analysts often work with sensitive data and must handle it with care. Database administrators safeguard the security and overall health of an organization’s databases. Data architects design strategies for data storage, integration, and retrieval, collaborating with other data professionals to align designs with business needs and support security objectives. BI analysts transform data into actionable insights and must work closely with other data professionals, considering data security when presenting to stakeholders. Power BI administrators control organizational settings related to security, including external sharing. Workspace roles (viewer, contributor, member, admin) define levels of interaction and access to assets.
In conclusion, security is a fundamental aspect of data visualization in Power BI, crucial for protecting sensitive information, maintaining trust, ensuring data integrity, and complying with regulations. By implementing measures such as access control, RLS, data anonymization, secure transmission, sensitivity labels, and proper sharing permissions, organizations can deliver reliable insights to stakeholders while keeping data safe.
Microsoft Power BI: Data Analysis Study Guide
Quiz
What are the three key pieces of information required to construct an IF function formula in Excel? An IF function requires a logical test, a value to display or perform if the test is true, and a value to display or perform if the test is false.
Explain the primary difference between a nested IF function and an IFS function in Excel. A nested IF function involves placing one IF function inside another as an argument, typically in the “value if false” section. An IFS function is designed to handle multiple logical tests sequentially without requiring nesting.
According to the source material, why is gathering the right data crucial in the data analysis process? Gathering the right data is essential because it ensures the analysis is focused, relevant, and useful for the end user. Using irrelevant data will not provide insights needed for informed decisions.
What is the primary purpose of data profiling in Power BI, and what are two tools available in the Power Query editor for this? Data profiling identifies potential issues and anomalies within a dataset, enabling informed decisions about data cleaning and transformation. Column quality and column distribution are two tools in the Power Query editor for data profiling.
Define the terms “unique” and “distinct” as they are used in data profiling within Power BI, according to the source. “Unique” refers to the total number of values that appear only once in a column. “Distinct” refers to the total number of different values in a column, regardless of how many times each value appears.
What is DAX (Data Analysis Expressions) and what is its primary function in Power BI? DAX is a programming language used in Power BI (among other Microsoft tools) to create custom calculations on data models and generate additional information not present in the original data.
Explain the concept of “row context” in DAX calculations. Row context refers to the current row of a table being evaluated within a calculation. When a DAX expression is evaluated for a specific row, it considers the values in that row as the context for the calculation, allowing for row-level operations.
What are “calculated columns” in Power BI, and how do they differ from standard columns? Calculated columns are new columns added to an existing table in Power BI that display the results of a DAX formula. Unlike standard columns, which are populated by imported data, calculated columns are generated dynamically from existing data.
Describe the purpose of the CALCULATE function in DAX. The CALCULATE function in DAX evaluates an expression within a context that is modified by specified filters. It allows you to alter the filter context of a calculation, enabling more focused analysis.
What is the primary requirement for a table to be marked as a “date table” in Power BI for time intelligence calculations to function correctly? For a table to function correctly as a date table for time intelligence calculations, it must contain one record for each day, have no missing or blank dates, and span from the minimum to the maximum date present in the data.
Answer Key
Logical test, value if true, value if false.
Nested IF places IF functions inside each other as arguments; IFS handles multiple tests sequentially without nesting.
It ensures the analysis is focused, relevant, and useful for the end user and provides necessary insights for informed decisions.
To identify potential issues and anomalies within the dataset; Column quality and Column distribution.
Unique: Total number of values that appear only once. Distinct: Total number of different values regardless of frequency.
A programming language used for creating custom calculations and generating additional data not in the original model.
The current row being evaluated in a calculation, considering the values in that specific row.
New columns added using DAX formulas; they are calculated dynamically, while standard columns are from imported data.
To evaluate an expression in a filter context modified by specified filters.
One record per day, no missing or blank dates, and spans from minimum to maximum date.
Essay Format Questions
Compare and contrast the star schema and snowflake schema data models in Power BI. Discuss their key characteristics, advantages, disadvantages, and when you might choose one over the other.
Explain the concept of evaluation context in DAX. Discuss how row context and filter context interact and impact the results of DAX calculations, providing examples of each.
Describe the different types of measures in Power BI (additive, semi-additive, and non-additive). Provide examples of each and explain how the approach to aggregation differs for each type.
Discuss the importance of effective data visualization in Power BI for conveying insights to stakeholders. Describe at least three different visualization types mentioned in the source material and explain how they can be used to display key performance indicators (KPIs).
Explain the process of creating and utilizing data hierarchies in Power BI. Discuss why hierarchies are beneficial for data analysis and reporting, and describe how you can create your own custom hierarchies using different data fields.
Glossary of Key Terms
Autofill: A feature in Excel that allows you to quickly copy formulas or data down a column or across a row.
Logical Function: A function in Excel or Power BI that performs a calculation based on whether a condition is true or false.
IF Function: A logical function in Excel that returns one value if a condition is true and another value if it’s false.
Logical Operators: Symbols used in logical functions to compare values (e.g., =, >, <, >=, <=, <>).
Nested IF: An Excel formula where one IF function is placed inside another IF function’s arguments.
IFS Function: An Excel function that checks multiple conditions and returns a value corresponding to the first true condition.
Serial Numbers: How Excel interprets and stores dates for calculation purposes.
AutoFill Double-click Shortcut: A quick method in Excel to copy a formula down a column by double-clicking the fill handle.
DAX (Data Analysis Expressions): A programming language used in Power BI, Excel Power Pivot, and SQL Server Analysis Services for creating custom calculations and data analysis.
Data Modeling: The process of creating visual representations of data and defining relationships between data elements in Power BI.
Schemas: Structures used to organize data in a data model, such as star and snowflake schemas.
Relationships: Connections between tables in a data model, typically based on common key columns.
Cardinality: The nature of the relationship between two tables (e.g., one-to-one, one-to-many, many-to-many).
Cross-filter Direction: The direction in which filters propagate through relationships in a Power BI data model (e.g., single, bidirectional).
Calculated Tables: New tables created in a Power BI data model using DAX formulas based on existing data or combinations of data sources.
Cloned Tables: Exact copies of existing tables in a Power BI data model, often created to manipulate data without affecting the original table.
Calculated Columns: New columns added to an existing table in a Power BI data model that display the results of a DAX formula.
Measures: Dynamic calculations or metrics created in Power BI using DAX to summarize, analyze, and compare data across dimensions.
Additive Measures: Measures that can be meaningfully summed across any dimension (e.g., total sales quantity).
Semi-additive Measures: Measures that can be summed across some dimensions but not all, often problematic with the time dimension (e.g., inventory balance).
Non-additive Measures: Measures that cannot be meaningfully summed across any dimension (e.g., profit margin percentage).
Row Context: In DAX, the current row being evaluated within a calculation.
Filter Context: In DAX, the set of filter constraints applied to the data before it’s evaluated by an expression.
CALCULATE Function: A powerful DAX function that evaluates an expression in a context modified by specified filters.
Time Intelligence Functions: Specialized DAX functions designed to work with date and time data for temporal analysis (e.g., TOTALYTD, DATESBETWEEN, DATEADD).
Common Date Table (Date Dimension): A dedicated table in a data model containing a continuous list of dates, required for time intelligence calculations.
Data Granularity: The level of detail captured in a data set or data field (high granularity means more detail).
Data Profiling: The process of examining and summarizing data to understand its structure, content, and quality.
Column Quality: A data profiling feature in Power BI that categorizes values in a column as valid, error, or empty.
Column Distribution: A data profiling feature in Power BI that shows the frequency and distribution of values in a column.
Append Queries: A process in Power Query to combine rows from two or more tables with the same column structure into a single table.
Merge Queries: A process in Power Query to combine data from two or more tables based on matching values in common columns (similar to SQL joins).
Join Type: Determines how rows from two tables are combined during a merge query based on matching criteria (e.g., left outer, inner).
Primary Key: A column or set of columns in a table that uniquely identifies each row.
Foreign Key: A column or set of columns in one table that establishes a relationship to the primary key in another table.
Data Hierarchy: A structured way to organize data fields into levels, allowing for drill-down analysis in visualizations.
Drill Down/Up: Features in Power BI visualizations that allow users to navigate through different levels of a data hierarchy.
Bookmarks: A feature in Power BI reports that captures the current state (filters, slicers, visual state) and allows users to quickly return to that state.
Key Performance Indicators (KPIs): Measurable values that indicate the effectiveness of a company or department in achieving business objectives.
Card Visualization: A Power BI visual that displays a single data point or value.
Multi-row Card Visualization: A Power BI visual that displays one or more data points, with each data point on a separate row.
Radial Gauge: A Power BI visual that displays a single value measuring progress toward a goal or target.
KPI Visual: A Power BI visual specifically designed to track the performance of a metric against a target, often including a trend line.
Histogram: A type of bar chart used to visualize the frequency distribution of data, grouping values into ranges or bins.
Top N Analysis: A method to filter data to show only the top or bottom specified number of values based on a criterion.
Geo Hierarchy: A data hierarchy based on geographical locations (e.g., continent, country, state, city).
Custom Visualizations: Visualizations in Power BI created using programming languages like Python or R or developed to meet specific analytical or aesthetic needs.
Workspace Apps: A feature in Power BI Service that allows you to package and share an entire workspace (data sets, reports, dashboards) with specific users or teams.
Impact Analysis: A tool in Power BI Service to view which workspaces, reports, or dashboards are affected by a data set.
Lineage View: A view in Power BI Service that shows the connections and dependencies between different items in a workspace.
Permissions: Settings in Power BI Service that control who can access and interact with data sets, reports, dashboards, and workspace apps.
Use Relationship Function: A DAX function that allows you to activate an inactive relationship between tables for a specific calculation.
Role-Playing Dimension: A single dimension table in a data model that can play multiple roles in relationships with a fact table (e.g., a Date table related to both Order Date and Ship Date).
Briefing Document: Excel and Power BI Data Analysis Techniques
Summary:
This document summarizes the key concepts and techniques presented in the provided source material, focusing on fundamental data manipulation in Excel and various advanced data analysis and visualization capabilities in Microsoft Power BI. The sources cover Excel’s date/time and logical functions (IF, nested IFs, IFS), and delve into Power BI topics such as data modeling, DAX (Data Analysis Expressions), data preparation (profiling, cleaning, transforming, loading, merging, appending), visualization types, hierarchical data, bookmarks, and performance optimization. The importance of non-technical skills, data quality, and understanding analysis objectives is also highlighted.
Key Themes and Important Ideas:
1. Excel Fundamentals:
Working with Dates and Time: Excel interprets dates as serial numbers, allowing for calculations like subtraction. Functions like TODAY(), NOW(), DAY(), MONTH(), YEAR(), and DATE() are used to extract or combine date components and create dynamic date/time formulas.
“Excel interprets stored dates as serial numbers…”
“you can separate the date into its component parts so that you can focus on the year element type an equal sign the word year and an open parenthesis in cell H5…”
“…you also reviewed functions for creating dynamic formulas that calculate time and date values these include the today and now functions…”
“…you can also divide a date entry into its component parts using day month and year or return these components as a single date with the date function…”
Logical Functions (IF, Nested IFs, IFS): Logical functions allow Excel to perform actions based on conditions or logic, essentially asking “yes” or “no” questions about data.
“when working with Excel you might need to execute a function under certain conditions or logic in these instances you can use a logical function calculation like an if function…”
“You can use logical functions to ask yes or no questions about your data if the function returns yes as its answer then you can direct Excel to perform the required action however if the function returns an answer of no then Excel can be directed to perform a different action…”
Logical Operators: These operators are crucial for logical tests within formulas and compare values against specified criteria. Examples include =, >, <, >=, <=, and <>.
“for these tests to work the formula must contain logical operators the logical operators determine what kind of question the formula is asking and what value it needs for its answer these operators can be used to compare both text and numeric entries…”
“The equal sign is the first of the mathematical operators that Excel uses in logical functions excel uses this operator to check if the value of one item is equal to that of another item…”
“finally a very useful set of logical operators is not equal to this is when the less than and greater than symbols are typed back to back this combination of operators is interpreted by Excel as not equal to…”
IF Function Syntax: The IF function requires three arguments: a logical test, a value if true, and a value if false.
“when constructing the if function formula you need to give Excel three pieces of information the first piece of information is called the logical test… The next instruction tells Excel what to do or what to display if the test returns a result of true… The third and final argument is what Excel should do or display if the logical test returns the result of false…”
Nesting IF and IFS Functions: Nested IF functions allow for multiple conditions to be tested sequentially, with subsequent IF functions embedded within the value if false argument of the previous one. The IFS function provides an alternative, designed to run a series of tests without nesting, executing the action for the first test that returns true.
“what if you need to test for multiple conditions? You can use nested if and ifs functions…”
“nesting functions is the technique of adding another function to the formula as an argument for the original function in other words you can place one function inside another to expand its functionality…”
“One approach would be to create what is known as a nested if formula the formula begins with an if that performs an initial logic test if the test turns out to be true then the formula will simply process whatever action is specified in the value if true argument however the result of the logical test could also be false if so then another if function in the value of false argument could run another test and process different actions…”
“The second approach is to use a function called ifs an ifs function is designed to run a series of tests that don’t require you to nest other functions the ifs function steps through the tests checking each one if a test is false it continues to move through the tests until it finds one that is true when a logical test returns true as a result the formula performs or displays whatever is in the value if true for that test it then stops running tests…”
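The two approaches can be sketched side by side, again with a hypothetical sales figure in B2 and two thresholds:
=IF(B2>=2000, "High", IF(B2>=1000, "Medium", "Low"))
=IFS(B2>=2000, "High", B2>=1000, "Medium", TRUE, "Low")
In the nested version, the second IF sits inside the value-if-false argument of the first. IFS steps through its test/value pairs in order; the final TRUE pair acts as a catch-all, since IFS has no built-in value-if-false argument.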
2. Power BI – Data Modeling and DAX:
Data Modeling: Creating visual representations of data and defining relationships between data elements to generate insights. Power BI is a key tool for this.
“data modeling is creating visual representations of your data in PowerBI you can use these representations to identify or create relationships between data elements by exploring these relationships you can generate new insights into your data to improve your business…”
“microsoft PowerBI is a fantastic tool for creating data models and generating insights and you don’t need an IT related qualification to begin using it…”
Schemas (Flat, Star, Snowflake): Different ways to structure data models. Star and Snowflake schemas are common, organizing data into fact and dimension tables.
“you’ll learn to identify different types of data schemas like flat star and snowflake…”
“when deciding on the data schema you plan to use for your analysis the most common schema types are star and snowflake schemas you may recall that in these schemas data is broken down into fact and dimension tables…”
Relationships: Connecting tables based on common keys (primary and foreign keys). Cardinality (one-to-one, one-to-many, many-to-many) and cross-filter direction are important aspects of relationships.
“you’ll create and maintain relationships in a data model using cardinality and cross-filter direction…”
“a table relationship is how two tables are connected to each other…”
“in the products table the product ID column is what’s known as a primary key each value in the product ID column is unique… in the sales table the product ID column is what’s known as a foreign key it’s not the primary key of the table but instead it establishes a relationship to the products table…”
“Now that you know how to establish a relationship between two tables the next important aspect is the cardinality of the relationship in PowerBI there are three types of cardinality one-to-one many-to-one or one-to-many and many-to-many…”
DAX (Data Analysis Expressions): A programming language used in Power BI (and other Microsoft tools) to create custom calculations and generate information not present in the original data model. It uses functions, operators, and constants.
“if it’s possible to derive the data from the original model you can use DAX data analysis expressions to create custom calculations to generate the data…”
“dax is a programming language used in Microsoft SQL Server analysis services Power Pivot in Excel and PowerBI it is a library of functions operators and constants used in formulas or expressions to create additional information about the data not present in the original data model…”
“to master DAX you need to understand its syntax different data types the operators and how to refer to columns and tables using functions…”
DAX Syntax: Typically involves specifying the name of the new calculation, an equal sign, the DAX function name, and arguments within parentheses (often referencing table and column names).
“first write the name of your new calculation then add the equal sign operator next write the name of your DAX function then parentheses that contain the logic of your formula write a table name enclosed in single quotes followed by the column name enclosed in square brackets…”
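Following that syntax, a minimal measure sketch, assuming a hypothetical Sales table with a SalesAmount column:
Total Sales = SUM ( 'Sales'[SalesAmount] )
The new calculation's name comes first, then the equal sign, then the DAX function SUM, whose argument references the table in single quotes and the column in square brackets.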
Operators in DAX: Used for various calculations and comparisons, including arithmetic, comparison, logical, and concatenation.
“dax formulas rely on operators there are many different types of operators they can be used to perform arithmetic calculations compare values work with strings or test conditions…”
DAX Functions: Reusable pieces of logic for tasks like aggregations, conditional logic, and time intelligence calculations. Examples include SUM, AVERAGEX, and SUMMARIZE.
“functions are reusable pieces of logic that can be used in a DAX formula these functions can perform various tasks including aggregations conditional logic and time intelligence calculations…”
“commonly used DAX formulas and functions include calculate sum and average…”
Row Context and Filter Context: DAX formulas are evaluated within a context. Row context refers to the current row being evaluated in a calculation. Filter context refers to the constraints applied to the data before evaluation, determining the subset of data used for calculations.
“dax computes formulas within a context the evaluation context of a DAX formula is the surrounding area of the cell in which DAX evaluates and computes the formula this surrounding area is determined by the set of rows and filters to be evaluated in a DAX expression it determines which subset of data is used to perform calculations…”
“row context refers to the table’s current row being evaluated within a calculation…”
“filter context refers to the filter constraints applied to the data before it’s evaluated by the DAX expression…”
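A sketch showing both contexts at work, assuming hypothetical Quantity and UnitPrice columns in the Sales table:
Total Revenue = SUMX ( 'Sales', 'Sales'[Quantity] * 'Sales'[UnitPrice] )
SUMX creates a row context: the Quantity times UnitPrice expression is evaluated once for each row before the results are summed. When the measure appears in a visual, any slicers and filters supply the filter context, so only the filtered subset of Sales rows is iterated.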
CALCULATE Function: A powerful DAX function that can alter the filter context of a calculation. It evaluates an expression within a context modified by specified filters.
“calculate along with its companion calculate table is the only DAX function that can alter the filter context during a DAX calculation…”
“the calculate function evaluates an expression in a context modified by the specified filters…”
“from the examples you have learned the calculate only modifies the outer filter context by applying new filters this is done by either overriding the existing filter or by combining new filters with the existing ones…”
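For example, a hedged sketch reusing the hypothetical Total Sales measure and assuming a Product[Category] column:
Accessory Sales = CALCULATE ( [Total Sales], 'Product'[Category] = "Accessories" )
CALCULATE evaluates [Total Sales] in a filter context modified so that Product[Category] equals "Accessories", overriding any existing filter on that column.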
Measures: Calculations or metrics that generate meaningful insights from data, often using DAX. They are essential for quantitative analysis and can be categorized as additive, semi-additive, and non-additive.
“a measure is a calculation or metric that generates meaningful insights from data measures are an important aspect of data analysis and play a lead role in creating calculated tables and columns…”
“there are three different types of measures additive semi-additive and non-additive which type of measure is used depends on the needs of your data and its dimensions…”
Additive, Semi-Additive, and Non-Additive Measures:
Additive: Can be meaningfully aggregated across any dimension (e.g., total sales).
Semi-Additive: Can be aggregated over some dimensions but not all, often time (e.g., inventory balance).
Non-Additive: Cannot be meaningfully aggregated across any dimension (e.g., profit margin percentage).
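To make the semi-additive case concrete, a common sketch (table and column names hypothetical) takes the balance on the last date of the filtered period instead of summing across time:
Ending Inventory = CALCULATE ( SUM ( 'Inventory'[Units] ), LASTDATE ( 'Date'[Date] ) )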
Statistical Functions in Measures: Functions like AVERAGE, COUNT, DISTINCTCOUNT, MIN, and MAX are used in measures to calculate values related to statistical distributions and probability.
“a key element of measures is statistical functions statistical functions calculate values related to statistical distributions and probability to reveal information about your data several common statistical functions are used in measures like average median and count…”
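A few hedged measure sketches using these statistical functions, with hypothetical column names:
Customer Count = DISTINCTCOUNT ( 'Sales'[CustomerID] )
Average Order = AVERAGE ( 'Sales'[SalesAmount] )
Largest Order = MAX ( 'Sales'[SalesAmount] )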
Calculated and Cloned Tables/Columns: Calculated tables and columns are new elements created within a data model using DAX formulas. Calculated tables can combine data from multiple sources or normalize dimension tables. Cloned tables are exact copies used for manipulation without altering the original. Calculated columns add derived data to existing tables.
“you can use calculated and cloned tables to enhance your data sets and improve your analysis…”
“a calculated table is a new table created within a data model based on data from different sources a calculated column is a new column added to an existing table that presents the results of a calculation…”
“cloning a table can be extremely useful for manipulating or augmenting data without affecting the original table…”
“calculated columns are custom data columns that are created within a Microsoft PowerBI data model using data analysis expressions or DAX language…”
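As a sketch under the same hypothetical model (and assuming a relationship between Sales and Product), a calculated column, a summary calculated table, and a cloned table might look like:
Line Total = 'Sales'[Quantity] * 'Sales'[UnitPrice] -- calculated column added to the Sales table
Sales by Category = SUMMARIZE ( 'Sales', 'Product'[Category], "Category Sales", SUM ( 'Sales'[SalesAmount] ) )
Sales Clone = 'Sales' -- exact copy that can be manipulated without touching the original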
Time Intelligence Functions: Specialized DAX functions for working with date and time data to perform advanced temporal analysis, including period-to-date calculations, comparisons, and moving averages. A common date table is a prerequisite.
“time is the dimension that virtually underpins all data analysis and for this reason time intelligence functions hold a position of paramount importance time intelligence functions are specialized functions designed to work with date and time data enabling users to perform advanced temporal analysis and gain deeper insight into historical data…”
“a common date table or date dimension is a prerequisite for time intelligence calculations you can’t execute them without a date dimension…”
“important time intelligence DAX functions include total year-to-date… dates year-to-date function… dates between… same period last year… date add function…”
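Hedged sketches of two of these functions, reusing the hypothetical [Total Sales] measure and a marked Date table:
Sales YTD = TOTALYTD ( [Total Sales], 'Date'[Date] )
Sales Same Period LY = CALCULATE ( [Total Sales], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )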
Common Date Table: A critical dimension table for time intelligence calculations, requiring one record per day, no missing or blank dates, and covering the full date range of the data. Can be created in Power BI using Power Query or DAX (CALENDAR, CALENDARAUTO).
“a common date table or date dimension is a prerequisite for time intelligence calculations…”
“the date dimension must meet the following requirements there must be one record per day there must be no missing or blank dates and it must start from the minimum date and end at the maximum date corresponding to the fields in your parameters…”
“you can create a date dimension in PowerBI using either Power Query or DAX this is useful when working on large data sets with complex calculations you can create a date dimension with DAX using the calendar and calendar auto functions…”
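A date dimension sketch using either function (the date range here is an assumption):
Date = CALENDAR ( DATE ( 2020, 1, 1 ), DATE ( 2025, 12, 31 ) )
Date = CALENDARAUTO () -- scans the model and generates one row per day across the full range of dates it finds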
USERELATIONSHIP and CROSSFILTER Functions: Both are used within DAX functions that accept filter arguments (like CALCULATE). USERELATIONSHIP temporarily activates an inactive relationship between two tables for a specific calculation, while CROSSFILTER changes the cross-filter direction of a relationship for a specific measure without altering the model's original settings.
“with the cross filter function you can change the cross filter direction for a specific measure while maintaining the original settings… Fortunately Adventure Works can use the cross filter function to alter the direction while maintaining the original settings…”
“the cross filter function changes the cross filter direction between two tables for a specific measure while maintaining the original settings…”
“you can only use the use relationship function within DAX functions that take a filter as an argument for example calculate calculate table and total YTD…”
“the use relationship function in DAX overrides this relationship and establishes a temporary relationship between the date column of the date table and the shipping date column of the sales table this inactive relationship becomes active only during the current calculation when using the use relationship function there are some essential points to consider…”
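Two hedged sketches, assuming an inactive relationship between Date[Date] and a hypothetical Sales[ShippingDate] column, and a Sales-to-Product relationship on ProductID:
Sales by Ship Date = CALCULATE ( [Total Sales], USERELATIONSHIP ( 'Date'[Date], 'Sales'[ShippingDate] ) )
Sales Both Directions = CALCULATE ( [Total Sales], CROSSFILTER ( 'Sales'[ProductID], 'Product'[ProductID], BOTH ) )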
3. Power BI – Data Preparation and Transformation:
Importance of Gathering the Right Data: The objective or purpose of the analysis informs the data collection process, ensuring the data is focused, relevant, and useful for the end user.
“gathering the right data is crucial for conducting a successful analysis however before you can start collecting data it’s essential to determine and understand the purpose or goals of the analysis you can then collect the appropriate data to conduct an analysis that is focused relevant and useful for the end user of the analysis…”
“the purpose of your analysis will inform what is the right data to collect including the type and scope of the data to gather and use in the analysis…”
Data Profiling: Analyzing data to understand its structure, content, quality, and patterns. Helps identify potential issues and anomalies for cleaning and transformation. Power BI’s Power Query Editor offers Column Quality, Column Distribution, and Column Profile tools.
“data profiling is the process of examining and analyzing a data set to understand its structure content quality and patterns…”
“data profiling enables the identification of potential issues and anomalies within the data set this proactive approach allows you to make informed decisions about data cleaning transformation and enrichment ultimately leading to improved data quality…”
“microsoft PowerBI offers the following two profiling tools in the Power Query editor column quality and column distribution…”
“column quality focuses on valid error and empty rows on each column allowing you to validate your row values…”
“column distribution provides a set of visuals underneath the names of the columns that showcase the frequency and distribution of the values in each of the columns…”
“another type of profiling in PowerBI is column profile column profile provides column statistics such as minimum maximum average frequently occurring values and standard deviation…”
Unique vs. Distinct: In Power BI, “unique” refers to values that appear only once, while “distinct” refers to the total number of different values regardless of frequency.
“before delving into data profiling tools let’s first consider two important factors in data profiling unique and distinct in PowerBI unique is known as total number of values that only appear once distinct is known as total number of different values regardless of how many of each you have…”
Data Cleaning: Addressing inconsistencies, errors, and missing values identified during profiling.
“you explored evaluating data data statistics and column properties reviewing why data evaluation is crucial Power Query’s profiling capabilities and different evaluation methods through an interactive activity you practiced analyzing a data set for anomalies and statistical irregularities preparing you for real world scenarios as a PowerBI data analyst you also explore data inconsistencies unexpected or null values and data quality issues you may encounter as a PowerBI data analyst as well as resolving data import errors…”
Transforming and Loading Data: Shaping data into a usable format and loading it into the data model. Includes creating and transforming columns, changing data types, and applying query steps.
“next you explored transforming and loading data you reviewed creating and transforming columns understanding the importance of selecting appropriate column data types and how to transform columns and create calculated columns in Power Query you brushed up on shaping and transforming tables and applying query steps to shape the data exploring reference queries you recapped when to use reference or duplicate queries and also unpacked the differences between merge and append queries and explored the different types of joins…”
Merge vs. Append Queries:
Append: Combines rows from multiple tables into a single table (stacking data). Works best when tables have the same column structure.
Merge: Combines columns from multiple tables based on a common key (joining data). Requires selecting a join type (left outer, right outer, full outer, inner, left anti, right anti).
“Append queries are a great way to consolidate data from multiple sources into a single table… append queries works well when the columns in the data source are well aligned and the desired resulting table should match the format of the data sources however you may encounter more complex scenarios requiring the merging of data from different sources this is where merge queries comes in…”
“to merge two tables you need to tell the merge query which type of join you would like to use the join type informs PowerBI how to merge the two tables a join requires that there is a common column between the two tables… this is known as the join key…”
“powerbi supports the following join types left outer right outer full outer inner join left anti-join and right anti-join…”
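Append and merge are normally applied through the Power Query editor dialogs rather than code, but as a rough DAX analogue (table names hypothetical):
All Sales = UNION ( 'Sales2023', 'Sales2024' ) -- append-style: stacks rows from tables with matching column structures
Sales With Products = NATURALINNERJOIN ( 'Sales', 'Product' ) -- merge-style inner join on commonly named columns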
4. Power BI – Visualization and Presentation:
Visualizing KPIs: Displaying key performance indicators using Power BI visuals like Cards, Multi-row Cards, Radial Gauges, and the dedicated KPI visual. KPIs differ from regular charts by aligning with strategic business objectives.
“kpis differ from regular charts and metrics because they align directly with strategic business objectives instead of simply presenting raw data KPIs offer insight into how that data impacts overall business goals and progress…”
“microsoft PowerBI offers a range of visualizations to display KPIs including cards multirow cards gauges and the KPI visual…”
Card Visuals: Display a single value or data point, ideal for essential statistics.
“the card visualization displays one value or a single data point this type of visualization is ideal for representing essential statistics you want to track on your PowerBI dashboard or report…”
Multi-row Card Visuals: Display one or more data points, with one data point per row.
“next is the multirow card visualization that displays one or more data points with one data point for each row…”
Radial Gauge Visuals: Circular arcs displaying a single value, measuring progress toward a goal.
“another visualization you can use is the radial gauge this visual is a circular arc that displays a single value measuring progress toward a goal or target or indicates the health of a single measure…”
KPI Visual: Tracks a metric’s performance against a target and includes a trend line.
“lastly the KPI visual in PowerBI is a powerful tool for tracking the performance of a metric against a target the KPI visual also includes a trend line or chart to show the data’s trajectory over time…”
Data Granularity: Refers to the level of detail captured in a data set or field. High granularity provides deeper, more precise insights. The appropriate level of granularity depends on the analysis objectives.
“data granularity refers to the level of detail or depth captured in a certain data set or data field granular data provides deeper and more precise insights this delivers more nuanced and valuable findings…”
“data granularity isn’t about always having the highest level of detail it’s about having the appropriate level of detail before you begin your analysis ask yourself do you require high granularity or low granularity the decision should depend on the specific requirements and objectives of the analysis…”
Histograms: Visualizations illustrating the frequency distribution of data by grouping data points into ranges or bins. Often use bar or area charts.
“a histogram is a way to visualize a top N data query result while the top N function in PowerBI is a built-in DAX function that retrieves the top N records from a data set based on specific criteria it compares the parameters provided and returns the corresponding rows from the data source the n in top n refers to the number of values at the top or bottom data points are grouped into ranges or bins making the data more understandable a histogram is a great way to illustrate the frequency distribution of your data…”
Top N Analysis: Filtering data to display only the top or bottom ‘n’ values based on specific criteria, enabling quick identification of significant data points.
“the top N analysis prevents this by sorting the data to display according to a category’s best or worst data points this enables stakeholders to quickly identify the top or bottom values in the data and make data-driven decisions efficiently…”
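A hedged DAX sketch of a top N filter, reusing the hypothetical [Total Sales] measure and Product[ProductName] column:
Top 5 Product Sales = CALCULATE ( [Total Sales], TOPN ( 5, ALL ( 'Product'[ProductName] ), [Total Sales], DESC ) )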
Data Hierarchies: Structured ways to organize data (e.g., geographical, product categories) to allow users to drill down into data at different levels of detail. Can be created automatically by Power BI (for dates) or manually.
“PowerBI offers a way to unravel this mystery by creating a data hierarchy hierarchies provide a structured way to organize and visualize data allowing users to uncover hidden insights and tell a compelling story…”
“PowerBI has automatically created a hierarchy with all the date fields such as estimated delivery date and order date… How can you create a hierarchy of your own? Let’s create a hierarchy for product related data using the product category product subcategory color and product name fields…”
Map Visualizations: Used for visualizing geographical data. Requires correctly formatting geographical columns as data categories (Country, State/Province, City) and can benefit from using latitude and longitude coordinates for precision. Geo hierarchies enhance map visualizations.
“for map visualizations defining a precise location is especially important this is because some designations are ambiguous due to the presence of one location name in multiple regions for example there is a Southampton in England Pennsylvania and New York adding longitude and latitude coordinates solves this issue but if the data set does not have this information you will need to make sure to format the geographical columns as the appropriate data category…”
“adding depth to map visualizations leverages geo hierarchies you can drill down from country to state state to city and so on…”
Bookmarks: Capture and save the current state of a report (filters, slicers, display properties, current page, visual selection) to share specific views with others or for easy navigation.
“bookmarks in PowerBI are a way to capture the current state of the report you are viewing and share this state with other viewers…”
“when adding a bookmark there are four state options that you can save data properties such as filters and slicers display properties such as visualization highlighting and visibility current page changes which present the page that was visible when you added the bookmark and selecting if the bookmark applies to all visuals or selected visuals…”
Using Variables for Troubleshooting: Variables in DAX store values or tables temporarily, allowing for breaking down complex formulas into smaller, manageable parts. This aids in debugging and understanding the calculation process.
“maybe the weight of potential inaccuracies weighs on you mistakes mean mistrust in data and mistrust in data can lead to poor business decisions in this video you’ll learn how to use variables in DAX to troubleshoot issues like this one…”
“to recap a variable in DAX lets you store a value or a table to be used later in your formula think of them as placeholders or temporary storage units for your data by breaking down your DAX formula into smaller pieces and storing parts of the calculation in variables you can keep track of each step making the process more comprehensible and easier to debug…”
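A sketch of variables staging a calculation step by step (column names hypothetical):
Profit Margin =
VAR Revenue = SUM ( 'Sales'[SalesAmount] )
VAR Cost = SUM ( 'Sales'[TotalCost] )
VAR Profit = Revenue - Cost
RETURN
DIVIDE ( Profit, Revenue )
While debugging, each intermediate value can be returned on its own (for example, temporarily replacing the last line with RETURN Profit) to see which step introduces an error. DIVIDE also guards against division by zero.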
Power BI Service – Dashboards: Dashboards provide a single page view of key metrics and visuals from one or more reports. They are available in Power BI Service and mobile, but not Desktop. Tiles from reports or other dashboards can be pinned to dashboards.
“a PowerBI dashboard is a single page view of key metrics and visuals from one or more reports…”
“to create and copy dashboards you must use the Microsoft PowerBI service you can view dashboards in Microsoft PowerBI service and in Microsoft PowerBI mobile dashboards are not available in PowerBI desktop…”
Duplicating Dashboards and Pinning Tiles: Dashboards can be duplicated in Power BI Service. Tiles from reports or other dashboards can be pinned to existing or new dashboards to consolidate visuals.
“to create a copy of a dashboard you must be the creator of the dashboard… you cannot pin tiles from dashboards shared with you only from dashboards created by you…”
“to duplicate a dashboard log into your PowerBI service and open the workspace that contains your dashboard… to pin a tile from one dashboard to another open the product sales dashboard from my workspace and hover the cursor on the tile to pin then select more options and select pin tile from the dropdown…”
Custom Visualizations (Python/R): Power BI allows for creating custom visualizations using Python or R programming languages for more advanced or specific needs. Requires installing Python/R and enabling scripting in Power BI.
“you can create custom visualization in PowerBI using Python or R programming languages these visualizations are imported from a file on your local computer you can also develop PowerBI visuals to meet your analytical or aesthetic needs…”
“using R or Python to develop your own PowerBI visuals or to customize existing ones is an optional expertise you may wish to pursue it if you have a coding background a familiarity with Python or want to extend your skill set into this area…”
Data Access and Permissions in Power BI Service: Power BI Service allows for managing data access and permissions at the dataset level and through workspace apps. Lineage view helps understand the impact of a dataset on reports and dashboards.
“effective data access and permission management is crucial to ensure that the right individuals have the appropriate level of access to sensitive data and reports…”
“with data set level permissions PowerBI service enables you to assign specific permissions to data sets while sharing you can ensure that although colleagues can access and utilize the data they cannot make changes to it this ensures the sanctity of vital data sets…”
“workspace apps in PowerBI allow you to share entire workspaces including data sets dashboards and reports a workspace app is a full data package that can be shared with specific users or teams ensuring a comprehensive sharing experience…”
“to check how many workspaces reports or dashboards are affected by a data set you can perform what is known as impact analysis to do this you go to your workspace and hover on a data set then select the more options three dots next to it and select show lineage…”
Using Microsoft Copilot in Bing for DAX Assistance: Copilot can help troubleshoot DAX formulas, suggest corrections, and offer alternative approaches for complex calculations like nested IFs.
“Microsoft Copilot in Bing can also be a valuable companion in troubleshooting and improving your DAX formulas…”
“microsoft Copilot in Bing can help guide you through the correct structuring of calculate formulas suggest how to perform dynamic aggregations and even detect and suggest fixes to syntax errors…”
“Copilot can simplify this by suggesting straightforward alternatives or helping restructure these nested conditions into manageable components…”
5. General Concepts:
Importance of Non-Technical Skills: Developing non-technical skills like understanding end-user needs, relaying findings to stakeholders, collaboration, and creating actionable insights are crucial for data analysts.
“non-technical skills are equally vital these include a keen understanding of the needs of end users and the ability to relay findings and concepts to stakeholders of varying technical knowledge by developing these non-technical skills you can better collaborate with stakeholders create actionable insights inspire change and make lasting impacts enriching your own career and contributing to the growth and success of those around you…”
Data Quality: Emphasized throughout the data preparation process, focusing on completeness, accuracy, uniqueness, and consistency.
“data profiling enables the identification of potential issues and anomalies within the data set this proactive approach allows you to make informed decisions about data cleaning transformation and enrichment ultimately leading to improved data quality…”
This briefing document provides a high-level overview of the key topics and concepts covered in the provided source material, offering a foundation for understanding essential data analysis techniques in both Excel and Power BI.
Excel Functions and Power BI Data Modeling
How do Excel’s logical functions, such as the IF function, work and what are they used for?
Excel’s logical functions are used to ask yes or no questions about your data. Based on the answer to that question (true or false), Excel can be directed to perform different actions or display different values. The IF function is a common example, requiring three pieces of information: a logical test (a condition to check, often using logical operators), what to do if the test is true, and what to do if the test is false. For example, you could use an IF function to check if a sales figure is greater than or equal to a target; if true, award a bonus, and if false, award nothing. Logical operators like =, >, <, >=, <=, and <> (not equal to) are essential components of these tests.
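Instantiating that bonus example as a sketch, with the sales figure in C2, the target in D2, and a hypothetical bonus amount of 500:
=IF(C2>=D2, 500, 0)
If the logical test C2>=D2 returns true, Excel displays the 500 bonus; otherwise it displays 0.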
When might you need to use multiple conditions in Excel logical functions, and what are the approaches?
You might need to test for multiple conditions when a simple yes/no question isn’t sufficient. For instance, determining different bonus levels based on varying sales thresholds. There are two main approaches: using nested IF functions or using the IFS function. A nested IF involves placing an IF function within another IF function’s “value if false” argument to perform a subsequent test if the initial one is false. The IFS function is designed to run a series of tests without nesting, stepping through each condition until one is true and then performing the corresponding action.
What is Data Analysis Expressions (DAX) in Power BI and what are its key components?
DAX is a programming language used in Power BI, SQL Server Analysis Services, and Power Pivot in Excel. It’s a library of functions, operators, and constants used in formulas to create custom calculations and additional information not present in the original data model. Key components of DAX include syntax (defining calculations, often starting with a name, equals sign, and function), operators (for arithmetic, comparison, logic, and concatenation), functions (reusable logic for tasks like aggregation, conditional logic, and time intelligence), and understanding the data model (tables, relationships, and context).
How do row context and filter context influence DAX calculations in Power BI?
DAX formulas compute values within a context. Row context refers to the current row being evaluated within a calculation. This allows calculations to be performed row by row, which is useful for tasks like creating calculated columns where a calculation is applied to each row independently. Filter context refers to the filter constraints applied to the data before a DAX expression is evaluated. This determines which subset of data is used for calculations. Changes in filters (like selecting a specific product category or region) will alter the filter context, leading to different results for the same DAX measure.
What are measures in Power BI, what types exist, and why are they important for analysis?
Measures in Power BI are dynamic calculations or metrics used to generate insights from data. They are essential for quantitative analysis and summarizing, calculating, and comparing data. There are three main types: additive measures (which can be meaningfully summed across all dimensions, like total sales), semi-additive measures (which can be summed across some dimensions but not all, particularly time, like inventory balance), and non-additive measures (which cannot be meaningfully summed across any dimension, like percentages or ratios). Measures are important because they compute values on the fly based on the current filter context, allowing for dynamic analysis and reporting.
What are calculated and cloned tables in Power BI and when would you use them?
Calculated tables are new tables created within a Power BI data model using DAX expressions, often based on data from existing tables or even multiple sources. Cloned tables are exact copies of existing tables. You would use calculated tables to combine data from different sources, normalize dimension tables (like in a snowflake schema), create a common date dimension table, or generate summary tables from large datasets. Cloned tables are useful when you need to manipulate or augment data without affecting the original table, especially if the original data is refreshed periodically.
How do data granularity and geographical hierarchies contribute to data analysis in Power BI?
Data granularity refers to the level of detail captured in a dataset or data field. High granularity provides deeper and more precise insights, while low granularity offers a more summarized view. Choosing the appropriate level of granularity depends on the analysis objectives. Geographical hierarchies in Power BI (like Country > State > City) provide a structured way to organize and visualize data based on location. They allow users to drill down into data from a broad overview to a more detailed level, enabling the analysis of trends and performance at different geographical scales.
What is the significance of data modeling, schemas (Star and Snowflake), and table relationships in Power BI?
Data modeling in Power BI involves creating visual representations of your data and defining relationships between data elements to generate new insights. Schemas, such as the Star and Snowflake schemas, are common structures for organizing data into fact tables (containing measurements and metrics) and dimension tables (providing contextual attributes). Table relationships, established using primary and foreign keys, define how these tables are connected. Understanding and correctly configuring cardinality (one-to-one, one-to-many, many-to-many) and cross-filter direction in these relationships is crucial for accurate data analysis and filter propagation in Power BI calculations.
Power BI Tutorial For Beginners To Advanced | Master Power BI From Beginner to Expert, By Microsoft
The Original Text
data is an important part of your day-to-day existence think about how many times you collect and make use of data every day for example you may have recently compared the cost of flights to find the best value for your vacation or you might have asked your friends to let you know what dates they’re available to meet for a party so that you can find a day that suits everyone in the group so how do data analysts make use of information just like when you plan your vacation or party they identify and gather important data then study and analyze the data to generate the insights that they need data analysts carry out these tasks using a range of techniques tools and software like Microsoft Excel and Microsoft PowerBI these might sound like complicated technologies but it’s possible to approach them from an entry-level stage and develop competency and this high demand at an organizational level for individuals who can demonstrate proficiency with these tools the career opportunities available for data analysts include a range of roles from business analyst to data scientist to database administrator with increasing digitization of all aspects of life the demand for these roles across all business sectors is greater than ever with the right knowledge and skills you could be the next data analyst an organization is looking for you might be keen to pursue a career in data analytics but you might also be concerned that you don’t have a relevant university degree or prior experience or maybe the cost is just too high don’t let these concerns hold you back if you’re fascinated by the world of data and willing to join us then we’re offering you a chance to embark on a learning journey that prepares you for an exciting career in data analytics this Microsoft PowerBI analyst professional certificate consists of a series of courses that act as a solid foundation of fundamental knowledge that imparts the skill set required for an entry- level job in data analytics in addition finishing this program also prepares you for the exam PL300 Microsoft PowerBI data analyst earning a Microsoft certification provides industry endorsed evidence of your skills and demonstrates your willingness to stay on top of the latest trends and demands and stand out in a fast changing industry you’ll begin this program with an overview of how to design and manage spreadsheets using Microsoft Excel this overview begins with a guide to Excel elements and techniques along with guidance on how to organize data you’ll then learn how to prepare data for analysis using different functions this overview of Excel will help you to understand the importance of sourcing and organizing data so you’ll follow it with an exploration of the different stages and roles in the data analysis process you’ll begin by learning about essential data analysis concepts and the role of the data analyst you’ll then review the tools required to source gather transform and analyze data effectively sourcing data is important but so is preparing it for analysis that’s why you’ll also learn how to bring data into PowerBI and clean and transform it for analysis you’ll begin by learning about different data sources in PowerBI you’ll then learn techniques for importing the data lastly you’ll discover how to clean and transform data once you’ve imported your data you then need to organize it so that you can make sense of the information to generate insights so you’ll also review techniques for modeling data you’ll start by developing an understanding of basic data 
modeling concepts you’ll then learn how to use DAX in PowerBI to create calculations finally you’ll discover how to optimize the performance of a data model in PowerBI the ability to generate insights from your data is great but you also need to be able to communicate these insights that’s why you’ll also explore the techniques and tools used to create visual presentations of data you’ll begin by exploring visualization concepts and you’ll also learn how to create reports next you’ll learn how to ensure your reports contain navigation and accessibility elements you’ll then explore how to bring data to the user by managing access and creating dashboards finally you’ll review methods and techniques for identifying patterns and trends in your data another important skill you’ll require is the ability to make use of available PowerBI assets so you’ll also learn how to create use monitor and manage a workspace and you’ll discover how to manage share and secure data sets in PowerBI not only do you need to be able to visualize your data but it’s also important that you can use it to tell a story or narrative during this program you’ll explore how to design robust and compelling visualizations to communicate your data with stakeholders you’ll start by exploring key principles of design and the importance of narrative you’ll then learn techniques for designing report pages with powerful visuals and you’ll review design principles and techniques for dashboards you’ll complete a final capstone project where you’ll put your new skills to use by developing a PowerBI dashboard in the final course you’ll prepare for the PL300 exam by undertaking a practice exam this exam covers all the main topics of the Microsoft Certified Exam PL300 so it’ll also help you determine if you’re ready for the real thing once you complete the program it’s time to start exploring potential careers and don’t forget to share your Corsera Professional Certificate to get that extra advantage congratulations on your decision to become a data analyst and to help make sense of data for others now let’s get started have you ever faced the challenge of making decisions or providing insights based on large amounts of data this can be quite a daunting task especially if the data is difficult to read and understand fortunately you’ve come to the right place this course on preparing data for analysis in Microsoft Excel will equip you with the skills you need to work with large blocks of data and make it easier to read and understand data analysis is a process that involves defining the purpose of the data gathering cleaning and analyzing it to gain insights businesses often use data analysis to obtain usable relevant information that can assist them in making educated business decisions however this is usually done with large amounts of data that you need to cleanse transform and analyze you will often have to present this data in charts tables and graphs that provide relevant insights your data insights will help organizations to lessen the risks associated with making business decisions microsoft Excel can assist you in analyzing data for your business and you don’t need an IT related qualification to do this the preparing data for analysis with the Microsoft Excel course is designed for anybody that’s interested in learning about preparing data for analysis within a business context it also establishes a foundation for anyone striving to have a career in data analytics through data analytics in Excel you will be able to collect store 
and delve deeper into your business’s data you will also learn to harness the power of data using tools for sourcing gathering transforming and analyzing data now let’s go over a brief overview of what you will learn over the next few weeks to kickstart your learning journey you’ll discover the fundamental and essential Microsoft Excel elements and techniques for creating workbook content these techniques include entering formatting managing and adding data to worksheets you’ll then learn how to read large blocks of data and review the steps for sorting and filtering data in Excel next you’ll discover how to use formulas and functions to perform calculations in Excel then you’ll learn how to prepare data for analysis using functions you’ll explore functions that are used to clean or standardize text to prepare it for effective analysis you’ll then investigate the use of date and time functions in Excel so that you can complete actions like creating timeline information in a spreadsheet you’ll also review the logical functions like if and ifs and you’ll learn how to use these logical functions to generate content like data columns in the last module you’ll undertake a final project in this project you’ll create a worksheet with an executive summary of a business’s month-by-month profit margin performance compared to the previous year this project will help you prepare for the final capstone project at the end of this program finally you’ll have a chance to recap on what you’ve learned and focus on areas you feel you can improve on throughout the course you will encounter many videos that will gradually guide you toward a solid understanding of preparing data for analysis watch pause rewind and re-watch the videos until you are confident in your skills then consolidate your knowledge by consulting the course readings and measuring your understanding of key topics by completing the different knowledge checks and quizzes by the end of the course you’ll be equipped with the necessary skills to work effectively with data in Microsoft Excel good luck as you start this exciting learning journey the Microsoft PowerBI Analyst program is an excellent resource to start your career whether you’re a beginner or a seasoned professional looking to improve your skills data is the driving force behind this everchanging modern world shaping and developing industries and society it has transformed the way institutions operate from banks and hospitals to schools and supermarkets and for businesses data is everything it informs decisions and helps create value for customers content streaming services analyze data to decide what content to promote social media services analyze data to determine what products their customers are interested in and your local supermarket gathers and analyzes data to ensure the products you want are available the result of having all this data is that professional analysts are required to process and sort it to gain the insights that drive both the business and social worlds are you intrigued by this career field and wondering how to get started let’s meet two other students who have just begun their careers in entry- levelvel positions discover how and why they have chosen to embark upon career paths in this field with Microsoft and Corsera lucas a recent information technology graduate is currently searching for his first IT job he is eager to secure a position in the IT sector that offers good earning potential and a quick career progression he wants to work full-time in data 
analysis as he feels this career would offer both benefits during his degree he found working with and analyzing cloud-based data to be the most enjoyable element hence his focus on this career path lucas currently works shifts in a warehouse environment so he will need the flexibility of self-paced learning his earnings are low so he wants to achieve the qualification using the same basic laptop he relied upon as a student despite being a beginner Lucas has already mapped out his career and certification path and has enrolled in the Microsoft PowerBI analyst program he plans to apply for an entry- levelvel position as a data analyst once he has successfully completed the program and passed the PL300 exam as a data analyst he will inspect data identify key business insights for new business opportunities and help solve business problems amelia has been working as an administrative assistant in sales and marketing since leaving high school now that a few years have passed she is ready to embark upon a new career path in her current role Amelia has seen PowerBI reports and dashboards created by colleagues and shared with the team she was impressed at how the information was used to shape and focus the sales campaigns this sparked an interest in a career in data analysis amelia’s job requires her to work long hours so the ability to structure her own learning path is vital she also has a long commute so would like to access e-learning through her smartphone or tablet pursuing the PowerBI analyst qualification will showcase her dedication and help her apply for more senior roles in the department in the short term amelia doesn’t have a scientific background but she finds IT concepts logical and easy to understand so she’s embarking on the Microsoft PowerBI analyst program as it doesn’t assume a pre-existing high level of technical knowledge in the long term she hopes to secure an entry-level role as a PowerBI analyst as a PowerBI analyst she will be responsible for building data models creating data assets like reports and dashboards and ensuring data requirements are met you may be in a similar position to Lucas and Amelia and possess an interest in this exciting field of data analysis like them you can begin your career in this field by enrolling in the Microsoft PowerBI analyst program this will be the start of your new adventure good luck with your learning journey generative AI stands at the forefront of a transformative era reshaping our interaction with data and redefining the boundaries of creativity across diverse sectors this innovative tool utilizes sophisticated statistical techniques to generate content across text images and code empowering individuals and industries with remarkable capabilities in this video you’ll gain an understanding of the multifaceted landscape of generative AI exploring its vast capabilities industry implications and the career opportunities it presents before we get into more detail let’s answer the question what is generative AI examples of these models are generative adversarial networks or GANs and transformer models with these models generative AI can create outputs that closely mimic humanmade content using generative AI as an assistant can make a positive contribution across multiple industries for example imagine a trendy clothing store using generative AI to design unique patterns and styles based on customer preferences with GANs the AI could generate lifelike images of clothing designs enabling the store to offer personalized options to each 
customer this application not only enhances the shopping experience but also streamlines the design process illustrating how generative AI is reshaping industries through its creative capabilities now that you’re up to speed on what generative AI is let’s explore some of its capabilities across different functions firstly there’s text generation where generative AI models like generative pre-trained transformer or GPT can compose essays generate creative writing automate customer support and more imagine how generative AI can bring the store collection to life for shoppers effortlessly crafting engaging product descriptions captivating social media posts and personalized customer communication that mimics the tone and style of human interaction next there’s image creation generative AI can transform textual descriptions into stunning visual representations for the retail store this means converting text into realistic images of new apparel designs from elegant evening gowns to casual streetear providing the store’s creative team with endless inspiration and flexibility in bringing their vision to life this capability is revolutionizing fields such as graphic design video game development film production and marketing and branding where custom visuals can be created quickly and at scale with audio production the store’s marketing and branding department uses generative AI’s audio ability to synthesize speech compose music and create sound effects generative AI produces captivating audiovisisual content for advertising campaigns captivating audiences and enhancing brand visibility in addition to its applications in creative fields like fashion generative AI also showcases its capability in code generation imagine the retail store leveraging generative AI to optimize its online presents ai would aid the store’s programmers by suggesting improvements completing lines of code or even creating entire programs this would not only streamline website development but also enhance user experience ensuring seamless navigation and captivating visuals for online shoppers finally there is data synthesis in the fashion world staying ahead of the curve is crucial and generative AI aids the store in achieving just that it utilizes extensive data sets on fashion trends customer preferences and style influencers the store can conduct market research and analyze customer behavior ethically and responsibly by generating synthetic data sets that maintain statistical properties without compromising individual privacy this application is crucial for training more AI models where access to real data might be restricted or unethical so what are the industry implications of this emerging technology the deployment of generative AI across various industries indicates a major shift in operational dynamics in healthcare AI generated models can predict patient outcomes personalize treatment plans and automate administrative tasks in finance AI can manage risk assessment automate trading and personalize banking services the creative industry is seeing an explosion of innovation and inspiration as generative AI aided tools are contributing hugely to the fields of art music and literature pushing the boundaries of traditional creativity as AI evolves its impact on the workforce and industry standards will be significant the demand for AI knowledge is growing and learning to work with AI will be crucial for career advancement in all fields jobs that traditionally didn’t involve technology will start using AI tools more often 
this shift will require professionals in most fields to develop new skills and undergo additional training to effectively integrate generative AI into their work as a result educational programs and workshops focusing on generative AI and its applications are becoming increasingly important offering valuable resources for those looking to stay relevant and excel in their careers both businesses and individuals need to understand and adapt to generative AI’s capabilities to fully harness its potential generative AI is not just a tool for creating and automating it is a catalyst for innovation and transformation across all areas in this video you gained an understanding of the capabilities of generative AI and its implications for various industries you also explored some of the career opportunities it will create as we continue to explore and expand these technologies capabilities the opportunities for advancement and creativity are limitless welcome to the age of generative AI where everyone has the chance to redefine the boundaries of what is possible generative AI is transforming businesses today by gathering information and creating all kinds of content changing how businesses operate let’s imagine a renowned restaurant called Chef’s Table as chef Andre strives to innovate and delight his patrons with new dishes he turns to generative AI to enhance his culinary creations the technology behind this ability involves using models trained on huge sets of data to do tasks such as text generation image creation and even code synthesis in Chef Andre’s kitchen Generative AI acts as his trusty sue chef assisting him in developing innovative recipes crafting visually stunning presentations and even optimizing kitchen workflows just like Chef Andre relies on his sue chef to complement his skills and creativity generative AI compliments businesses by providing them with new insights ideas and efficiencies in this video you’ll explore the technical foundations and potential applications of generative AI in businesses like Chef’s Table you’ll also assess its limitations and examine the ethical considerations that arise when using it first let’s gain some insight into the technical foundations of generative AI it operates primarily through two types of models generative adversarial networks or GANs and transformer-based models guns involve two neural networks the generator and the discriminator working in tandem to produce highly realistic outputs these two components are known as the generator and the discriminator imagine the generator as a chef preparing a new dish and the discriminator as a food critic tasting it the chef the generator creates new dishes while the food critic the discriminator evaluates them if the critic cannot distinguish between the chef’s creations and dishes from renowned restaurants then the chef has succeeded this collaborative process results in the creation of highly realistic and refined outputs transformers used by models like generative pre-trained transformer or GPT and birectional encoder representations from transformers or BERT use attention mechanisms to create text that is contextually relevant and stylistically coherent attention mechanisms play a crucial role in the model’s functionality these mechanisms enable the model to focus selectively on various parts of the input data much like a chef carefully chooses the best ingredients for a dish this selective focus allows the model to highlight important information and maintain a clear grasp of the context 
imagine a chef who not only selects fresh ingredients but also keeps the recipe and cooking techniques in mind to craft a delicious and well- balanced meal similarly attention mechanisms ensure that the text generated by the model is coherent and contextually appropriate rather than a random assortment of words these technologies rely on deep learning needing a lot of computer power and data to train them how well a generative AI model works depends on the quality and variety of its training data which affects its ability to generalize new information without upholding biases so you’ve learned about the technical foundations of generative AI but what are its practical applications in various business functions in marketing and customer engagement generative AI can craft personalized content at scale from email marketing campaigns to dynamic web content think of this as a chef preparing a personalized menu for each diner based on their preferences creating unique and delightful dining experiences ai models can enhance engagement and conversion rates by analyzing existing customer data and tailor messages that resonate on an individual level additionally generative AI assists in optimizing operational efficiencies and logistics for instance AI can forecast demand trends simulate supply chain scenarios and recommend adjustments this is like a chef estimating the number of diners planning the menu and ordering ingredients to minimize waste and make customers happy this predictive capability enables Chef’s Table to make informed decisions reduce costs and improve service delivery in the area of human resources AIdriven analysis of job descriptions and applicant data helps streamline the recruitment process by generating and evaluating diverse job descriptions AI can attract a wide range of candidates potentially reducing biases often found in manual processes additionally generative AI can simulate training scenarios providing personalized learning experiences for employees think of this as a chef conducting cooking classes tailored to the skill levels and learning styles of each student ensuring everyone learns effectively another application of generative AI is document management and technical writing it can analyze extensive data sets of documents to learn and replicate the necessary formatting style and technical language specific to different business sectors for example AI models trained on legal documents can help to draft contracts that comply with current laws and regulations furthermore models trained on medical texts can help in preparing accurate clinical trial reports the technologies ability to understand and generate technical content is like Chef Andre mastering the preparation of complex dishes ensuring consistency and high standards without extensive manual effort one of the standout features of generative AI is its capacity to mimic specific writing styles this capability is particularly useful in marketing and customer communications where maintaining a consistent brand voice is crucial by training on a company’s historical communication data AI can generate content that aligns with the brand’s tone style and audience engagement strategies additionally it can adapt to different styles as needed much like a versatile chef who can cook various cuisines to cater to diverse tastes and cultural preferences finally the ability of generative AI to produce coherent and contextually relevant text has wide ranging application in business for instance it can generate product 
Finally, the ability of generative AI to produce coherent and contextually relevant text has wide-ranging applications in business. For instance, it can generate product descriptions, marketing copy, or news articles with little to no human input, significantly speeding up the content creation process. Moreover, in customer service, AI-driven chatbots can handle inquiries and provide responses in real time, improving customer experiences and operational efficiency. These applications demonstrate the potential of generative AI to take over repetitive and time-consuming tasks, enabling employees to focus on more strategic activities, much like a chef relying on a well-trained kitchen staff to handle routine tasks while focusing on creating innovative dishes.

Despite its capabilities, generative AI is not without limitations and may raise some ethical concerns. The quality of output can vary significantly depending on the model's training, and inaccuracies can emerge, especially when the AI encounters data or requests outside its training scope. Moreover, there's the potential for AI to reinforce or amplify biases present in the training data, leading to unfair outcomes or ethical dilemmas. This is similar to a chef needing to ensure their ingredients are fresh and free from contaminants, as any issue can affect the final dish. Ethical concerns that must be addressed include issues such as data privacy, intellectual property, and the potential for misuse. Therefore, businesses must establish clear guidelines and ethical frameworks to govern AI use, ensuring that AI-generated outputs align with legal and moral standards. Think of it as a chef adhering to food safety regulations and ethical sourcing practices to ensure every dish is not only delicious but also responsibly made. In this video, you learned how generative AI offers substantial benefits across various business functions, enhancing productivity, decision making, and customer engagement. However, to leverage this technology effectively, businesses must understand its technical foundations, potential applications, and limitations. You also gained insight into how responsible use of generative AI, guided by strong ethical principles, is essential to harness its full potential while reducing associated risks. As businesses continue to integrate AI into their operations, the focus must remain on creating value responsibly, ensuring that AI solutions are deployed in a manner that is both effective and ethical. Like a master chef, businesses must blend innovation with responsibility to create a successful and sustainable future.

Picture a future where machines not only grasp our language but also craft it with remarkable finesse, where creativity knows no bounds as artificial minds effortlessly generate images and ideas. This isn't the stuff of sci-fi dreams; it's the emergence of generative AI, a tool that will complement and benefit us in both our work and our everyday lives. To gain a better understanding of generative AI, it is crucial to dive into its foundational technologies, such as machine learning models and their architectural nuances. Let's get started by exploring the distinguishing features of generative AI. Unlike traditional AI, which typically focuses on analysis and classification, generative AI is proactive in creating new content. This shift from passive analysis to active creation is transformative, especially in handling complex tasks such as natural language processing, or NLP, and synthetic image generation. NLP enables machines to read, understand, and generate human language, while synthetic image generation involves creating fake images using computer programs and algorithms. It's like a digital artist creating a convincing picture of a landscape they've never seen before.
The introduction of transformers, a type of model architecture that relies on mechanisms called attention and self-attention, has revolutionized NLP. Models like Google's bidirectional encoder representations from transformers, or BERT, and OpenAI's GPT series use these transformers. They learn the relationships between words in a text, but not in the usual order from start to end; instead, they can understand different parts of the text at the same time. It's like reading a mystery novel and being able to pick up on clues scattered throughout the book all at once. This way of learning allows for more things to be processed at the same time, making the training quicker and more efficient.

So those are some of the distinguishing features, but what are the technical foundations of generative AI? It primarily operates through two types of machine learning: supervised and unsupervised. In supervised learning, models are trained on labeled data sets, allowing them to learn a function that can map input data to desired outputs. For example, a model might be trained to generate text summaries by learning from a data set of articles paired with their respective summaries. Unsupervised learning, on the other hand, involves training models on data without explicit labels. Here, the aim is for the models to discover inherent patterns and relationships in the data. This approach is particularly beneficial for generative AI, as it allows the model to learn to create content that is not bound by predefined labels, enabling more innovative and adaptive applications.

Next, let's take a closer look at some of the core technologies behind generative AI. At the heart of its capabilities are neural networks, particularly generative adversarial networks, or GANs, and variational autoencoders, or VAEs. VAEs encode input data into a compressed representation and then decode it back to reconstruct the input. The process involves optimizing the parameters of the encoder and decoder so that the output closely matches the input, allowing the model to generate new data samples from learned representations. Language models are constantly evolving, so it's important to keep up to date with these advancements. Language models such as GPT-3 and BERT demonstrate significant advancements in generative AI. These models use transformer architectures, which rely on self-attention mechanisms to process sequences of data, like sentences, in ways that consider the context provided by other parts of the sequence. This is crucial for generating coherent and contextually appropriate text. Word2vec, another critical technology, involves vectorizing words into a geometric space where words with similar meanings are located close to each other. This enables more nuanced understanding and generation of text based on semantic similarities rather than just syntactic rules.

Generative AI has many business applications and can revolutionize several key areas. Let's explore some in more detail. Firstly, there's content generation. GPT models excel in generating written content by leveraging the transformer architecture, which allows them to understand context and generate coherent and contextually appropriate text. These models are pre-trained on a wide variety of internet text and fine-tuned for specific applications, enabling them to create high-quality articles, blogs, and other written materials. Next is personalization. The process starts with collecting user data from sources like websites, apps, and social media. Integrated data pipelines using tools like Apache Kafka or Google Cloud Dataflow consolidate this data in real time. Real-time analytics platforms such as Apache Spark Streaming or AWS Kinesis process the data to extract insights, which feed into a personalization engine that generates tailored recommendations, content, and communications. These personalized interactions are delivered using APIs integrated with various platforms to ensure low-latency responses, and edge computing technologies like AWS Greengrass or Azure IoT Edge process data closer to the user.
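Stripped of the infrastructure, the shape of that personalization pipeline can be sketched in plain Python. The event data and the recommendation rule below are invented stand-ins: in a real deployment, the event list would be a streaming platform such as Kafka, the profile-building step a real-time analytics job, and the recommendation function a trained model served behind an API.

    from collections import Counter

    # Stand-in for a streaming topic: click events arriving from websites and apps
    events = [
        {"user": "dana", "item": "road-bike"},
        {"user": "dana", "item": "helmet"},
        {"user": "dana", "item": "road-bike"},
        {"user": "lee",  "item": "gloves"},
    ]

    def build_profile(stream, user):
        # Analytics step: consolidate one user's behaviour as events arrive
        return Counter(e["item"] for e in stream if e["user"] == user)

    def recommend(profile, catalog, k=3):
        # Personalization engine: a trivial rule standing in for a trained model
        already_favourite = profile.most_common(1)[0][0]
        return [item for item in catalog if item != already_favourite][:k]

    catalog = ["road-bike", "helmet", "gloves", "lights", "pump"]
    profile = build_profile(events, "dana")
    print(recommend(profile, catalog))  # tailored suggestions, delivered via an API in practice

The pattern, not the tooling, is the point: collect, consolidate, infer, and respond, fast enough that the user experiences it as personal.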
Additionally, there's automation. AI models trained on large data sets and using advanced algorithms automate these processes, improving efficiency and reducing costs. The technical backbone includes robotic process automation, or RPA, for executing repetitive tasks; AI-powered software tools for intelligent decision making; and cloud services that provide the necessary scalability and support continuous learning and adaptation of the models. This infrastructure ensures that AI systems remain up to date and can handle increasing volumes of work effectively. And finally, innovation. Generative AI fosters innovation by simulating and modeling various scenarios to predict outcomes, aiding businesses in developing new products and services with higher success rates. This involves using advanced AI models for predictive analytics, scenario planning, and risk assessment, including techniques like regression analysis, time series forecasting, Monte Carlo simulations, Bayesian networks, and stress testing. Large data sets from diverse sources are processed using tools like Apache Hadoop and Apache Spark, and simulation tools such as digital twins and optimization algorithms are used to predict performance and find optimal solutions.
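As a concrete taste of one of those techniques, the sketch below runs a tiny Monte Carlo simulation of next quarter's demand. Every figure in it is invented for illustration; the point is the pattern of sampling many uncertain scenarios and reading decisions off the resulting distribution.

    import random

    def simulate_quarter(base_demand=10_000, runs=10_000):
        outcomes = []
        for _ in range(runs):
            growth = random.gauss(0.03, 0.05)  # uncertain growth: mean 3%, sd 5%
            # 10% chance of a supply disruption that cuts fulfilment by 10%
            disruption = 0.9 if random.random() < 0.1 else 1.0
            outcomes.append(base_demand * (1 + growth) * disruption)
        outcomes.sort()
        return {
            "expected": sum(outcomes) / runs,
            "pessimistic_5pct": outcomes[int(0.05 * runs)],  # 5th percentile scenario
            "optimistic_95pct": outcomes[int(0.95 * runs)],  # 95th percentile scenario
        }

    print(simulate_quarter())

Instead of one forecast number, the business gets a range with probabilities attached, which is what makes the stress-testing and risk-assessment uses described above possible.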
From what you have learned in this video, it is clear that generative AI is a powerful tool that, when leveraged responsibly, can provide significant advantages to businesses by automating tasks, personalizing customer experiences, and driving innovation. You've gained an understanding of how generative AI continues to evolve, providing useful business applications. As the technology continues to evolve, it will likely become an even more integral part of the digital business landscape.

It's no secret that generative AI has significantly transformed various job functions in the workplace, from automating routine tasks to enhancing creative processes. These systems use vast amounts of data to create new content, make predictions, and even make decisions. Despite its revolutionary potential, generative AI is not without its pitfalls and shortcomings, which raise several risks, challenges, and ethical considerations that must be carefully managed. In this video, you will gain further insight into these challenges and limitations. But first, let's explore how generative AI can be integrated into different job functions. In many sectors, generative AI tools are employed to streamline operations and enhance productivity. For example, in roles such as content creation, AI can produce drafts, suggest edits, and generate creative ideas, which allows human workers to focus on more strategic aspects of their work. Similarly, in software development, AI can write code, debug, and even test software, streamlining the development process and reducing time to market. These capabilities not only optimize efficiency but also offer significant cost savings and scalability for growing businesses.

A significant shortcoming of generative AI was highlighted by the use of OpenAI's GPT-3 in generating medical advice. In one instance, GPT-3 was used to provide mental health support, and it suggested to a simulated user experiencing distress that they commit self-harm. This incident underscored the danger of relying on AI for sensitive tasks without robust safeguards. The model generated harmful advice because it lacked the nuanced understanding and ethical judgment required in mental healthcare, relying instead on patterns learned from its training data. This example demonstrates the potential risks and severe consequences of deploying AI without adequate human oversight and ethical considerations.

However, the integration of AI into these roles is not always seamless. The reliance on AI can lead to job displacement as roles traditionally filled by humans become automated. Furthermore, the quality of AI-generated outputs can be inconsistent: while AI excels in generating structured content, it struggles with tasks requiring deep understanding or emotional intelligence, often producing outputs that are awkward or contextually inappropriate. Earlier, you learned that businesses need to adopt ethical considerations given the potential for bias in AI-generated content. Since AI models learn from data, they inherently acquire the biases found in their training data sets. This can result in discriminatory practices, such as favoring one demographic group over another when AI is used in HR for resume screening or job recommendations. Maintaining the privacy of personal data is a primary objective for businesses. When using generative AI systems to interact with personal data, care must be taken to ensure confidentiality and user privacy. These systems can inadvertently expose sensitive information or even be used to generate deepfakes, contributing to misinformation and potentially harming individuals' reputations.

Next, let's examine some of the challenges of reliability and accountability when using generative AI. AI systems are notorious for their black-box nature, meaning the processes they use to reach conclusions are not always clear. This lack of transparency can lead to reliability issues, where businesses find it challenging to understand or predict the AI's behavior. This is particularly problematic in high-stakes environments like healthcare or finance, where unexpected AI decisions can have serious consequences. Accountability is another challenge: when errors occur, it's difficult to determine responsibility between the AI developers, the users, and the AI itself. This complicates legal and regulatory frameworks, which are often ill-equipped to handle the novel implications of AI technology.

Despite their advanced capabilities, generative AI systems often lack common sense reasoning, a basic human ability to make practical judgments about everyday situations. AI can generate plausible-sounding responses or content that, upon closer examination, is nonsensical or impractical. This limitation is due to the AI's reliance on pattern recognition instead of understanding underlying principles or contexts. Implementing generative AI in a workplace context involves various hurdles. These include the technical challenge of integrating AI with existing IT systems, the need for significant investment in technology and training, and the ongoing requirement to update and maintain AI systems to adapt to new data or changing conditions. Additionally, if an organization is resistant to change and its staff are doubtful about AI, this can also make it harder to implement effectively. To reduce potential harm and ensure ethical AI deployment, it is crucial to adhere to guidelines like those set by
major technology companies, including Microsoft. These guidelines emphasize fairness, reliability, privacy, inclusiveness, accountability, and transparency. Organizations must commit to rigorous testing and auditing of AI systems to identify and correct biases, protect data privacy, and ensure that AI systems perform as intended without infringing on ethical norms. In this video, you learned that while generative AI presents remarkable opportunities for transforming workplace operations and enhancing productivity, its implementation must be approached with a nuanced understanding of its limitations and potential risks. By prioritizing ethical considerations and responsible use, organizations can harness the benefits of generative AI while mitigating its shortcomings. This balanced approach is essential for realizing the full potential of AI technologies in a manner that respects human values and social standards.

At this point in the course, you might view Microsoft Excel as a complicated software application, or believe it's only used for working with financial data. However, Excel is designed to be very user-friendly and can assist with many different types of data and tasks. In this video, you'll discover Excel's primary purpose and use cases and explore key parts of the software's user interface, including the command tabs. Adventure Works, a multinational manufacturing company that produces and distributes bicycles and accessories globally, needs to input some data into Excel. To assist with this task, the company has recruited you and several new employees. However, before starting the task, the company has decided to train you to use the software so that you can improve your experience with Excel. This training will help you better manage and analyze the data required for the task at hand.

Let's begin by understanding what Excel can do for Adventure Works. Microsoft Excel is a software application that businesses use to store data, like financial figures, and create calculations based on this data. Users can interpret the data they store by creating visuals or using Excel's built-in analysis features. They can then use the insights derived from these interpretations to inform business strategies or influence decisions. With Adventure Works' vast product line and global presence, Excel's capabilities will be crucial in managing and analyzing its data efficiently. Before you can start using Excel, it's essential to understand how to navigate the software's user interface and locate the features you need. Excel's user interface is designed to be accessible and includes various elements that help you interact with the software effectively. The first of these elements is the title bar. It's located at the top of the Excel window and displays the name of your file, the search option, and other essential features. The worksheet is the primary area where you can input data into cells, using either the keyboard or other input devices. The command tabs are located below the title bar and provide quick access to Excel's hundreds of commands, which are organized in areas called tabs or ribbons. To find the command you need, click on the relevant tab to reveal the related commands.

Let's take a few moments to explore these features and discover how you can use them to input data. One of the main areas of Excel is the grid. This area contains the worksheet, which is where you enter data or information. It's divided into rows and columns, and you input information into cells, where a column and row intersect. Just above the worksheet is the formula bar.
When you type information into a cell in the spreadsheet, it appears in both the cell and the formula bar. When you create a calculation, the result appears in the cell, while the formula that drives the result appears in the formula bar. In other words, the formula bar always shows the actual contents of the cell. There is a green title bar at the top of the screen. On the left is the autosave button. In the browser version of Excel, you can find the app launcher button here, which you can use to access other Microsoft 365 programs. The title bar also contains a useful undo button. When autosave is turned on, creating a new Excel document automatically assigns the name Book to your new file. You can view the file name within the title bar. To rename a file, select the title bar and type an alternative name. File names can contain spaces and capital letters. You can also use punctuation marks; however, it is best to avoid them, as some characters are not permitted. Also, file names can contain a maximum of 255 characters, but it's recommended that you use 31 characters at most. You can select the same box to manage the location in which you store the file. To the right of the file name is the search feature. Select the search box and then select find to open a dialog box where you can search for content like text or figures in your files. You can use the options choice in the bottom right of the dialog box to refine and control Excel searches. You can also search for a recent action you've applied to a cell.

Next, let's explore the command tabs. Excel has hundreds of commands organized in storage areas called tabs or ribbons. You can select a tab heading to view its ribbon and related commands. Let's review the most frequently used tabs. The home ribbon is the first ribbon that appears when you open a file. It contains the most frequently used commands you'll rely on for standard everyday tasks, like formatting and sorting data. You can use the commands on the insert ribbon to add different elements to a file, like charts or comments. The draw ribbon offers you drawing tools for marking your worksheet, while the page layout ribbon lets you alter the appearance of a spreadsheet when printed. The formulas ribbon contains commands that you can use to manage more complex calculations. You can use the data ribbon to perform different actions with data, such as transform, query, sort, and filter operations. Adventure Works is expected to work with large blocks of information, and the data ribbon's sort and filter commands are useful for these tasks. You'll mostly use the commands on the review ribbon once you've created a spreadsheet; for example, you can use them to manage security settings or collaborate with colleagues. The view ribbon offers Excel users commands to make it easier to view large spreadsheets, such as freeze panes, which keeps titles visible when moving through data blocks. There are also extra tabs, called contextual tabs, that appear during specific actions or when certain items are selected. For example, if you add a bar chart to your worksheet, then the chart design and format tabs appear on screen. These extra tabs contain commands relevant to the tasks you're working on.

This demonstration provided only a brief overview of Excel's interface, and it's completely normal if you feel like you need more help with this information. Learning any new software requires time and practice, so don't worry if you don't fully understand everything just yet. As you continue through the course, you'll have more opportunities to explore these commands and features in greater depth, and you'll become more comfortable with Excel's interface.
By learning about its key elements, including the command tabs, you've built a solid foundation in Excel's primary purpose and use cases. Keep up the good work!

Excel is a powerful tool for organizing and analyzing data, but sometimes, when you're dealing with large amounts of information, it can be difficult to make sense of it all. That's where formatting comes in. In this video, you'll discover how to enter and format data in Excel to improve its readability. Adventure Works has created a list of its offices using Excel; however, important information is missing from these files, and it's also difficult to read the data because it's not correctly formatted. Let's help Adventure Works to add and format its data.

The green cursor box is in the top left-hand corner of the worksheet. You can move the cursor by pointing at and selecting a cell. The cell location indicator shows you where you are on the sheet. You can also use the arrow keys on the keyboard to move the cursor. As you type, the entry appears in the cell and on the formula bar, and you can use the backspace key to delete any typing errors. The office location is missing from cell C21. Select C21, type Delaware, and then press enter to confirm your entry. The entry appears in the cell and formula bar, and the data lines up to the left of the cell to indicate that it's text. Type the number 130422 and confirm it in cell E21. The entry sits to the right of the cell. In Excel, text aligns to the left of a cell and numbers to the right, and Excel treats an entry that contains both letters and numbers as text. You can also manually set the alignment with the alignment buttons on the home ribbon. Excel also offers an autocomplete feature as a shortcut for entering data. For example, column D already contains several instances of the word partner, so if you type the letter P in cell D21, then Excel suggests the word partner as a possibility. Press enter to accept the suggestion; you can also ignore it by continuing to type an alternative word. Next, New Jersey needs to be added. Type the word new in C16. This prompts an incorrect suggestion, so you must type New Jersey in full. Now, if you type new in C17, Excel waits to see what letter is typed next before suggesting a word, because there is more than one entry beginning with new. In the browser version of Excel, you'll be presented with a drop-down list of multiple suggestions, from which you could select New Jersey.

Column C contains state names. This results in a floating dialog called convert to geography appearing. Select it to instruct Excel to recognize the text entries as geographic locations. You can then select the card symbol to the left of an entry to interact with Bing and generate information about the location. Keep in mind that if you print your worksheets, the card symbols beside the entries will appear on the printout. Like other Microsoft 365 apps, Excel has an undo feature. In the desktop version, this feature is located on the title bar; in the browser version, it is located to the left of the home ribbon. Select the undo feature to reverse recent actions. In this case, you'll remove the geographic locations tag and return the entries to normal text. The next action is to type New York in full in C18. Autocomplete has no suggestions, as New York hasn't appeared in the column before. A different shortcut, called autofill, can be used to add New York to C19 and C20. With the cursor still on C18, position the mouse pointer over the bottom right-hand corner of the cell.
The pointer changes to a narrow black cross. Now hold down the mouse button and drag it down; this action autofills the entry into the cells underneath.

Now that you've entered the data in your spreadsheet, you need to format it. Formatting data makes it easier to read, and correct formatting on numeric entries prevents misunderstandings. Here, the numbers in E2 to H21 are financial data. To make this clear, highlight the numbers by selecting all the data from E2 to H21, then select the currency button in the number group. The currencies are available on the drop-down menu. Alternatively, you can use the comma format to display a comma separator and two decimal places, and you can use the increase or decrease decimal buttons to customize the number of decimal places. The percentage button is both a format and an action button: it adds the percentage symbol, and it also multiplies the cell content by 100. Select undo to reverse this. The drop-down above these buttons presents other number formats. These formats include dates, as dates are treated as numbers in Excel.

Your next task is to format the column titles so that they stand out. Type the heading state code in B1; the text overflows into the adjacent empty cell. Once you add state in C1, two characters of the B1 heading are masked. However, the formula bar confirms that the whole heading is still there; the column's title is only partially hidden. You need to make the full title visible, so from the home ribbon, choose wrap text to stack the words in the cell. You can also format a heading to stand out using font options. In this example, the size of the heading has been increased to Calibri 12 and a blue background color has been applied. You can also center the heading using the alignment section of the ribbon. Another Excel shortcut is the format painter, which is found on the left of the home ribbon. This shortcut copies format settings from one cell to another. Select the format painter to display a paintbrush and copy B1's style, then highlight A1 to H1 to paint those cells with the copied format. This action also copies the wrap text and center alignments. You should now be familiar with the different methods and shortcuts you can use to enter and format data in Excel. This video also demonstrated how this knowledge can be applied to help Adventure Works complete and format their Excel sheet. Great work!

Reading and editing the contents of a large spreadsheet with hundreds or even thousands of data entries can seem like a large task. Thankfully, Microsoft Excel offers several features and keyboard shortcuts that help you navigate and edit your spreadsheets. Over the next few minutes, you'll explore these features and shortcuts and learn how to use them. Adventure Works has sent you a large inventory file. They need you to check the current information in the file and add some new data. There are over 100 entries in the file to navigate through; however, you can quickly review these entries and add new ones using Excel's navigation features and keyboard shortcuts. There are several useful navigation and editing features available in Excel. The freeze panes feature, for example, keeps an area of the screen static. You could use it to freeze a specific row: the static area remains on screen while you scroll freely through the other content. You can use the new window option to open a second viewpoint of your file. With this feature, you can keep one part of the file within view as you work in another area. The name box is another useful Excel feature.
The name box is an area located between the ribbon and the worksheet, to the left of the formula bar. When you type a cell reference in this box and press enter, the cell cursor moves to that position on the sheet. The name box can also be used to assign a name to a cell. Finally, there are also several keyboard shortcuts that you can use to speed up the navigation and editing of a spreadsheet. Let's discover more about how these features and shortcuts operate by helping Adventure Works.

First, you need to freeze key rows to give yourself a more efficient view of the data. From the window group of the view ribbon, you can access several options, two of which are freeze panes and new window. Select the freeze panes drop-down to view three choices: freeze panes, freeze top row, and freeze first column. Select freeze top row to make the row currently visible at the top of the screen static. Be aware that row one isn't always the top visible row. A horizontal line appears under the top row to indicate the static area. The selected frozen row remains static while the other rows below it scroll off screen. You can also select freeze first column to make the first column currently visible on screen static; in this case, it's the category column. Again, the first column, column A, isn't always the one that becomes static. Selecting the freeze first column option automatically turns off the freeze top row option. Once you've frozen an area of the screen, the first choice in the freeze panes drop-down menu changes to unfreeze panes. Select unfreeze panes to release all static areas on screen.

What if you need to freeze the screen in two directions at the same time? For example, to help Adventure Works view its worksheet more clearly, you need to make sure that all row titles and the data in columns A and B are visible. To do this, first select C2 to move the cursor to that position. Then, in the freeze panes drop-down, select the freeze panes option. Once this option is selected, Excel identifies the cursor position and freezes everything above and to its left. Your cursor is currently on C2, so Excel freezes columns A and B along with row one. Again, you can use the unfreeze panes option on the freeze panes drop-down to release all areas of the screen. You must also have the totals in row 152 available on the screen while editing other areas of the spreadsheet. You can use the new window command to open another view of the file in a new window. This window isn't a separate copy of the file; it's just a different view of the same file. With both views visible, you can now review the totals data in row 152 while editing the cells in other areas of the spreadsheet. To close this second view, just select the X in the top right-hand corner of its window.

You can also move quickly around the worksheet using keyboard shortcuts. Let's take a moment to explore some keyboard shortcuts available to Windows users. Press control and home to jump to cell A1 at the top left of the worksheet. If the freeze top row choice is turned on, the cursor will instead jump to cell A2. But what if you need to move to the end of your work to continue data entry? Press control and end to move the cursor to the last cell in the worksheet that contains content. To select a block rather than simply moving the cursor, hold down the shift key while pressing either the control and home or the control and end combination; Excel selects the entire block as it moves the cursor. You can also use the name box to move quickly to specific cells. The name box is located to the left of the formula bar.
The box typically displays the cell reference for your cursor's current position; however, if you type a different cell reference and press enter, your cursor jumps to the specified cell. The name box is also a useful method for assigning names to cells. A cell name helps users to identify data content, since it's more descriptive than just a cell reference. Adventure Works needs you to rename cell 152 to units in stock, so position the cursor on the cell. Then, in the name box, type the text units underscore in underscore stock and press enter. Cell names must be unique and cannot contain spaces; you can use the underscore symbol to substitute for spaces. If the cell is referenced in a calculation, its name is visible in place of the reference. You can view the name from the drop-down list in the name box. You can check which cell the name is assigned to by selecting the name manager on the formulas ribbon. In the drop-down, select the cell name to move the cursor to the cell. You can use these same steps to view and access this cell from any sheet in the workbook. For example, from the products two sheet, selecting the units in stock cell name from the name box drop-down brings you back to that cell on the products one sheet. You should now know the Excel features and shortcuts that help you navigate and edit spreadsheets. You can use these tools to assist you in any Adventure Works Excel-based assignments. Well done!

Have you ever opened a Microsoft Excel worksheet only to find the content structure difficult to interpret? Perhaps it contains irrelevant entries or needs too much scrolling to navigate. In this video, you'll learn how to use Excel's sort and filter features to organize content so you can read and identify data quickly and efficiently. Over at Adventure Works, the company checked its inventory data for records related to a specific supplier; however, the Excel file that contains the data is poorly structured and difficult to navigate. Adventure Works needs your help to sort and filter the information so that only the supplier's data is visible. Before you begin helping Adventure Works, let's examine the concepts of sorting and filtering in Excel. Excel offers users a series of sort and filter commands. These commands change the position of data in the worksheet window so that it's easier to understand. In other words, they don't change the data; they change how it's displayed. It's also important to remember that the sort and filter commands are not the same: they work on data in different ways, and you need to understand these differences to prevent any misreading of the data.

Let's begin with the sort feature. The sort feature is found in the sort and filter group on the data ribbon. This feature reorders the worksheet by physically moving rows into new positions. To return the data to its original position, you must use the undo command; however, if a sort was not your last action, you may inadvertently reverse other steps. You should also be careful if saving your workbook after applying a sort. Once your changes are saved, the sort order applied to the data is permanent, and an undo is no longer possible. Now that you're familiar with the sort feature, let's focus on filtering. Filtering refines the data displayed based on the criteria of your choosing. However, unlike with sort, the rows are not repositioned. Instead, Excel hides all the rows that don't match your chosen criteria. This leaves a subset of rows visible, and this subset can be reduced further by applying more filters. Let's learn more about how these actions work by helping Adventure Works restructure its inventory Excel file.
The Adventure Works inventory Excel file is currently sorted by category. You need to restructure it using the sort and filter commands. Access these commands from the sort and filter group on the data ribbon. The sort ascending and sort descending commands are shortcut choices. When you select one, Excel checks the location of your cursor and then uses the column in which the cursor is located as the key for the sort. Place your cursor on column B, which is the date entered column, then select sort ascending, which is now called oldest to newest. The rows are now organized in date order. Excel interprets dates as numbers, so it has performed a numeric sort. Had you placed the cursor in the supplier column, Excel would have performed a text-based sort. You can select undo on the title bar to restore the previous row order.

Adventure Works has requested that the data be sorted by supplier, the data in column D, and that the most recent entry is visible first within each block of supplier data. Sorting by the supplier and then sorting by the date won't work here, because one sort would cancel out the other. Instead, you need to perform a multi-level sort. This technique lets you sort data in two ways simultaneously. First, from the sort and filter group of the data tab, select the sort button to open a sort dialog box. At the top right of the dialog box, you need to confirm that there's a tick in the my data has headers box; this instructs Excel to exclude the first row from the sort. Next, use the drop-down menu under column to instruct Excel to perform the first sort by supplier. You can retain the defaults of sort on cell values and order A to Z. Then select the add button to display additional sort fields. Use these fields to configure the second sort level by date entered. Again, retain the default of sort on cell values, but change the order to newest to oldest. Then select okay to exit the dialog box and sort the data as required. You have now sorted the data by supplier and date entered. Select undo on the title bar to reverse the sort.

Next, Adventure Works needs you to filter the records to view only the data related to the supplier called Cycles AS. The first step when filtering is to turn on the filtering feature. Select the filter button in the sort and filter group of the data tab to add filter arrows to each column heading. You can now filter the data using the arrows next to each heading to open drop-down lists. Each filter arrow also has an additional submenu to allow for more precise filtering. Excel recognizes the type of content in the column and generates context-sensitive choices, such as equals, does not equal, begins with, and more. Select the arrow next to the supplier column heading to display a list of suppliers. A tick mark beside an entry indicates that its rows are currently visible. Remove the tick marks next to all list entries except for Cycles AS, then select apply. Excel hides all other rows in the worksheet so that only the Cycles AS data is visible. There are now only 10 rows visible in the sheet, all of which relate to Cycles AS. You can confirm this by checking the bottom left of the Excel screen, where it states that 10 records were found. Select the arrow next to the unit price to apply another filter. From the drop-down, put a tick in the box to the left of item seven, then select the apply button. The filter only works on the 10 visible records, so you have now displayed only rows where Cycles AS is the supplier and seven is the unit price.
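The logic behind the multi-level sort and the stacked filters can also be written out in code, which some learners find clarifying. The Python sketch below uses invented inventory rows; the tuple in the sort key mirrors sorting by supplier first and by date entered, newest to oldest, second, and the chained conditions mirror one filter applied on top of another.

    from datetime import date

    rows = [  # invented sample of the inventory data
        {"supplier": "Cycles AS", "date_entered": date(2024, 5, 12), "unit_price": 9},
        {"supplier": "Bikewerk",  "date_entered": date(2024, 4, 8),  "unit_price": 7},
        {"supplier": "Cycles AS", "date_entered": date(2024, 3, 1),  "unit_price": 7},
    ]

    # Multi-level sort: supplier A to Z, then date entered newest to oldest.
    # Two separate sorts would cancel each other out; the key tuple makes the
    # second level break ties within the first, exactly like Excel's sort dialog.
    ordered = sorted(rows, key=lambda r: (r["supplier"], -r["date_entered"].toordinal()))

    # Stacked filters: rows failing the criteria are hidden, not deleted.
    visible = [r for r in ordered
               if r["supplier"] == "Cycles AS" and r["unit_price"] == 7]

    print(len(visible), "records found")

Notice that the filtering step produces a new, smaller view while the underlying list is untouched, which is the same distinction Excel makes between hiding rows and changing data.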
You might ask yourself: how do I know if data has been filtered in Excel? There are two ways to determine this. The first is to check the filter arrow to the right of the column heading: if there is a funnel symbol on the filter arrow, then your list is filtered. The other method is to check for breaks in the sequence of row numbers on the left-hand side of the display area. For example, a row sequence of 8, 9, 112 indicates that rows 10 to 111 have been filtered out. So how can you remove filtering to make other data visible again? In the column header, select the arrow, or arrow and funnel symbol, then select the clear filter option from the drop-down menu to clear a specific filter while retaining the others. You can also select the clear choice in the sort and filter group of the data tab to clear all filters. You've now removed all filters and restored the full data display. Thanks to your help, Adventure Works has the inventory data it needs, and you should now be familiar with using the sort and filter actions to organize and identify data quickly and efficiently. Well done!

Congratulations on reaching the end of the first week in this course on preparing data for analysis with Microsoft Excel. This week, you explored the fundamentals of Microsoft Excel by learning how to create workbook content and work with blocks of data in Excel. Let's take a few minutes to recap the key skills you gained during this week's lessons. You began with an introduction to the program, in which you discovered what topics you will learn about as you progress through the different courses. You were also given guidance on how to be successful in this course. This guidance included helpful tips on how to structure your study and ways in which you can approach the learning material. You were then introduced to other learners in a meet and greet session, during which you explained why you're taking this course and what you hope to achieve from it. Finally, you explored a list of valuable resources you can use to succeed in the course.

In the second lesson, you learned how to create workbook content. You began this module with an introduction to Microsoft Excel, developing an understanding of the importance and function of the application, including how it's used in everyday business to store, calculate, and gain insights from data. You then learned how to navigate Excel using its user interface, or UI. The UI comprises three key areas: the title bar, which contains the name of your file, the search option, and other primary features; the worksheet, which is the main area used to input data into cells; and the command tabs, which provide quick access to Excel's commands, organized in areas called tabs or ribbons. You then learned how to enter and format data in Excel. You explored the different ways data can be added to a worksheet, discovered how to use formatting to improve the readability of a spreadsheet, and reviewed keyboard shortcuts for data entry and formatting. Next, you learned how to manage worksheets. You then undertook an exercise where you demonstrated your new skills by adding data to a worksheet. This was followed by a knowledge check, which tested your understanding of the material. Finally, you explored additional resources to enhance your learning.

In the third and final lesson of this week, you focused on working with blocks of data in Excel. You began the lesson by learning how to read large data blocks in Excel, exploring Excel's navigation and editing features, such as the freeze panes feature, the new window feature, the name box feature, and keyboard shortcuts.
You then developed an understanding of the concepts of sort and filter. You learned how to identify the key differences between the two, and you learned how to sort and filter data in Excel so that you can organize and identify data quickly and efficiently. You then explored different methods for sorting data in a worksheet, including the alphanumeric sort and the multi-level sort feature, and you discovered how to use the filter feature to control data visibility in a worksheet. Next, you undertook an exercise in which you demonstrated your new skills by sorting and filtering data in a worksheet. This was followed by a knowledge check and a module quiz; both items tested your understanding of the material by presenting questions focused on the key concepts you explored. You should now be familiar with the fundamentals of Microsoft Excel, and you should be capable of creating workbook content and using different methods for working with blocks of data. Great work! I look forward to guiding you through the lessons next week, in which you'll learn how to use formulas and functions in Excel.

Analyzing data often involves making calculations; however, when working with large blocks of data, calculations can quickly become confusing. Luckily, Microsoft Excel can calculate numerical information using formulas, and you can solve real-life data analysis problems in Excel with a little bit of planning and some basic math. Over the next few minutes, you'll learn how Excel processes calculations and how to create a formula using the correct syntax.
Over at Adventure Works, the accounting staff are amending a spreadsheet that records orders placed with suppliers. Their first task is to update the prices and order amounts. They need to work out the purchasing cost by creating a calculation in the data, but first, they need to understand how Excel reads, interprets, and implements calculations. Let's take a few minutes to explore formulas and calculations, and then help Adventure Works.

A formula in Excel is a calculation performed on the values in a range of cells in your worksheets. Examples of these calculations include addition, subtraction, multiplication, and division. Once the calculation is completed, the formula returns a result, even if it's an error. Now that you're familiar with what a formula is, let's find out more about how they work. All formulas begin with an equal sign, which is then followed by a calculation or function. Formulas can contain numbers or cell references. For example, the formula =A1+B1 instructs Excel to add the values in cells A1 and B1. Excel usually reads a formula from left to right. Characters are used to indicate the type of calculation Excel should perform: the plus character is used for addition and the minus character for subtraction, while the asterisk is used for multiplication and the forward slash character for division. The formula bar shows the formula in the cell you are working in, while the worksheet shows the result of the formula. This is important to take note of when you are creating or working with calculations.

A formula can also be static or dynamic. A formula containing fixed numbers will be static and will always generate the same result. For example, the formula in E2 is static because it contains specific numerical values; it will not update if any of the monthly figures in cells A2, B2, or C2 change. On the other hand, a formula that contains cell references is dynamic, because Excel always uses the current value in the cell. The formula in E3 is dynamic because it includes cell references. A formula can also include a reference to a cell which itself contains a formula. This creates a chain of calculations. For example, the formula in E1 refers to cell C1, and cell C1 also contains a formula that calculates the data in cells A1 and B1. If the values in cells A1 and B1 change, then the formulas in cells C1 and E1 will both change. In other words, a change at one end affects all other formulas in the chain. A formula can also refer to a cell in another sheet. This reference must include the worksheet name followed by an exclamation mark. This other worksheet can be in the same workbook or in another Excel file; references to cells in other workbooks are called links or external references. The formula in this screenshot references the product sheet within the same workbook. For example, this formula states that what is in this cell is equal to the contents of H2 in the product sheet plus the contents of A1 in this sheet.
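The chain of calculations described a moment ago is worth pausing on, because it is what makes dynamic formulas powerful. The tiny Python sketch below imitates it with invented cell names: each "formula" stores references rather than fixed numbers, so changing A1 ripples through C1 and E1, just as it would in a worksheet.

    # A toy model of a worksheet: plain values and formulas that reference them
    cells = {"A1": 10, "B1": 5}
    formulas = {
        "C1": lambda: get("A1") + get("B1"),  # like =A1+B1
        "E1": lambda: get("C1") * 2,          # like =C1*2, a formula referring to a formula
    }

    def get(ref):
        # A cell is either a stored value or a formula evaluated on demand
        return formulas[ref]() if ref in formulas else cells[ref]

    print(get("E1"))   # 30
    cells["A1"] = 20   # change one value at the start of the chain...
    print(get("E1"))   # ...and every dependent formula updates: now 50

A static formula, by contrast, would be like storing the literal number 30 in E1: accurate today, silently wrong the moment A1 changes.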
Now that you're familiar with the basics of a formula, let's view it in action by helping Adventure Works determine the cost of the items it's ordering from its supplier. Begin by positioning the cursor on K3, which is in the cost column. This is the cell where the result should be displayed. Then type an equal sign. To determine the cost of the order, you need to multiply the contents of I3, the unit price, by the contents of J3, the number ordered. Select cell I3 to add that reference to the formula. The equal sign and the cell reference are displayed in both the result cell, K3, and the formula bar. Next, type an asterisk symbol to represent multiplication, then select cell J3. This reference is colored red on the formula bar, and the cell is highlighted in red. Press the enter key to complete the formula. This creates a result of 79,050, which is now visible in K3.

Adventure Works decides to make a change to its order: it wants to reduce the number of units that it ordered by 250. So how can Adventure Works update the formula with this new information? Amend the figure in J3 and press enter. This causes the formula in K3 to recalculate and generate a new result of $65,875. If you double-click on a cell such as K3, this opens edit mode. While you're in edit mode, Excel places colored highlights around the cells referenced in the calculation. It's easy to begin to edit a cell accidentally with a double-click, and if the cell contains a formula, particularly one you didn't create, this can be a little worrying. Pressing the escape key is a safe way to cancel an edit without amending any of the information within a cell. You have explored how calculations in Excel can be useful in data analysis. By now, you should know how Excel processes calculations and how to create a formula using the correct syntax. You will learn more about formulas as you progress in your learning journey. Well done!

Microsoft Excel doesn't just store data; it also assists with calculations, a fundamental component of Excel and data analysis. So it's important that your calculations are correct and reliable. In this video, you'll learn how Excel processes calculations, discover how to construct the syntax for calculations, and edit your syntax to avoid errors. Jamie at Adventure Works is working on a purchase sheet. It has been updated to include information on new orders placed with suppliers. She now needs to create calculations that correctly display the difference between purchasing costs and sales amounts. The formulas she creates will contain a mixture of multiplication and subtraction, and she needs to be confident that those operations are happening in the correct sequence. Let's take a few minutes to explore how these formulas work, beginning with operators.

The symbols that are used to indicate mathematical actions in Excel are known as operators. Operators are used for actions like addition, subtraction, multiplication, and division. For example, you can use operators to add the values of two cells together or divide the value of one cell by another. When working through a formula, Excel does not always calculate the expressions, or steps, in a formula from left to right. Excel handles the operators in a calculation according to a key mathematical principle called the order of precedence. The order of precedence assigns greater importance to some of the mathematical symbols over others. This means that Excel calculates formulas according to the hierarchical position of each symbol within the order of precedence. Don't worry if you don't fully understand what the order of precedence is; this is covered in a later reading. In terms of importance, Excel tries to process division and multiplication symbols before addition and subtraction. However, you can control how Excel executes calculations by using parentheses in your formulas. This is a key technique in creating formulas that generate reliable results. Parentheses instruct Excel as to which part of a calculation must be executed first, even if this would contradict the order of precedence. Let's explore the use of parentheses in formulas.
You want Excel to add the numbers two and three together and then multiply the subtotal result by four, so you type this formula as =2+3*4. However, Excel will not process this calculation left to right. Instead, Excel will first multiply 3 by 4, which gives a result of 12, and it will then add two, giving a formula result of 14. This is because the multiplication symbol has a higher priority in the order of precedence. Adding parentheses to the calculation allows you to instruct Excel to do this bit first, so you could rewrite your calculation by placing part of the formula, in this instance 2+3, in parentheses. Now you've directed Excel to add 2 and 3 as its first step and then multiply the result of that addition by four. The result of this calculation would be 20, and not 14 as it was previously.
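Python happens to follow the same order of precedence as Excel, so you can verify this example directly; a quick sketch:

    print(2 + 3 * 4)    # 14: multiplication is processed before addition
    print((2 + 3) * 4)  # 20: parentheses force the addition to happen first
    # The same rule applies in cell formulas: =2+3*4 returns 14, =(2+3)*4 returns 20

Whenever you are unsure how a mixed formula will be evaluated, adding parentheses around the step you want performed first is the safe habit: it costs nothing when it agrees with the order of precedence and corrects the result when it doesn't.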
It is important to have a clear understanding of where to put parentheses in a calculation. Placing parentheses in the wrong position in a formula, or not including them at all, could change how Excel understands and implements the calculation, and an incorrect calculation result may not always be obvious, as it may seem plausible.

There are also times when you may need to reproduce cell entries and formulas within a worksheet. When a formula is copied, it is important to consider the appearance of the cell references. There are two ways a cell reference can appear in a calculation: relative and absolute. A relative cell reference means that if you copy a formula to a new cell, Excel will adjust the row numbers or the column initials in the cell references to update the formula relative to its new location. This ensures that the formula is correct for the row or column it has been copied to. For example, when the formula in K3, which reads =I3*J3, is copied down using the autofill feature, Excel adjusts the cell references for each row. But what if a cell reference needs to stay the same when the formula is copied elsewhere? For this to happen, you must make the cell reference into an absolute reference. When Excel copies a formula, it keeps absolute references constant and does not adjust them. For example, if the formula in L3 is copied down through the column, then the reference for the cell that contains the exchange rate needs to stay the same. When the formula in L3 is copied down, the K3 reference in the formula will adjust to include a different row number; however, the N2 reference in the formula should not change, since the exchange rate is only mentioned in that one cell. To make a cell reference absolute, add a dollar sign before the column initial and before the row number. This instructs Excel to keep the cell reference constant during the copy operation, meaning that all copies of the formula will contain the original cell reference. Don't worry if you find these concepts difficult to follow; you'll explore how to control calculations in more detail in a later video, and there are also additional resources available at the end of the lesson.

Excel will also recalculate and update all formula results when a file is opened, so files that contain a lot of complex calculations will be slower to open fully on screen than ones that only contain data. Fortunately, you can turn the automatic recalculation feature off; just remember to switch it back on when you are done working with the file. To change the recalculation mode, select the calculation options drop-down on the formulas ribbon, then select the recalculation mode you need for your file. Well done! You now know how to control how Excel works through the steps in a formula. You're also able to identify the correct syntax to use if calculations are going to be copied elsewhere in the spreadsheet. Great work!

A Microsoft Excel formula can be complex and include many steps. In this video, you'll explore the correct syntax for Excel calculations that contain multiple steps and discover how to adjust a formula to ensure that it copies a calculation correctly. Amy at Adventure Works is preparing a price quote in a worksheet for the client Contoso Bikes. The client wants to order bicycle parts for their retail outlets. Let's find out more about how Amy can control her worksheet calculations to ensure that the prices are correct for the client. Amy has already listed the required items and their respective prices. Adventure Works is offering a 10% discount to the customer, and Adventure Works charges different prices for delivery based on the region that the customer outlet is in. Contoso Bikes has four retail outlets: two in region A and two in region B. The spreadsheet also shows data for region C; however, this region is not the focus of this video. Amy must ensure that two different delivery rates are used in her formulas. Let's help her create the calculations.

Firstly, cell G6 must show the result of the cost per unit multiplied by the quantity ordered. Position the cursor on cell G6 and type an equal sign to begin the first calculation. Select cell E6, type a star for multiplication, and then select F6. Press enter to complete the calculation and generate the subtotal. Next, Adventure Works needs to calculate the client's 10% discount. Select cell H6 and type an equal sign, then select the subtotal amount in G6. To work out the 10% amount, you need to divide by 100 and multiply by 10. Add the forward slash symbol for divide and type 100, then add the star symbol for multiply and type 10. Excel processes these calculations from left to right: it first divides the figure in G6 by 100 and then multiplies the result by 10. Press enter to get the discount figure.

Now you need to work out the total cost excluding delivery. Select cell I6 and type an equal sign, then select G6 to select the subtotal and type a minus symbol to subtract the discount. Select cell H6 to select the discount. However, before pressing enter to complete the calculation, there's another step to consider. This order needs to be duplicated for each of Contoso Bikes' four outlets, so the total cost excluding delivery needs to be multiplied by the value in cell I2. To calculate this, type a star, select cell I2, and press enter. But something has gone wrong with the result of this formula, because the total amount is less than the subtotal. Select I6 to return to edit mode in your formula. The multiplication operator has a higher priority, or precedence, than the minus operator; in other words, the multiplication operator is higher in the order of precedence. So Excel takes the discount in H6, multiplies it by the value in I2, and then subtracts that value from the total. To work around this, add an opening parenthesis before G6 and a closing parenthesis after H6. This ensures that Excel processes the subtraction operator before the multiplication operator. Press enter to execute the formula and generate the correct value.
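The same repair can be replayed with ordinary numbers. The figures below are invented stand-ins for Amy's worksheet, but the two expressions show exactly what the parentheses change.

    unit_cost, quantity, outlets = 50.0, 100, 4   # invented figures for illustration

    subtotal = unit_cost * quantity               # like G6, the cost-per-unit subtotal
    discount = subtotal / 100 * 10                # like H6 = G6/100*10, a 10% discount

    wrong = subtotal - discount * outlets         # like =G6-H6*I2: the discount is multiplied first
    right = (subtotal - discount) * outlets       # like =(G6-H6)*I2: the discount is subtracted first

    print(wrong, right)  # 3000.0 versus 18000.0

Both results look like plausible money amounts, which is exactly why the transcript warns that an incorrect calculation may not be obvious; only the parenthesised version reflects what Amy actually means.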
Select cell J6 and type an equal sign. Select I6 to include the total cost excluding delivery, then type a plus symbol. Type an opening parenthesis and the number 2, add a star symbol, and then select cell M2. Include a closing parenthesis and type another plus symbol. Add an opening parenthesis, a number 2, and a star, then select cell M3 and type the closing parenthesis. Press Enter to calculate the result: the total cost when delivery is included is $22,930. Amy now needs to calculate these same costs for all the remaining categories in the worksheet. You could help her by using the autofill feature to copy the formulas that you've created to save time. However, some cell references will need to be made absolute to prevent the autofill process from changing them. Select cell I6 and, in the I2 reference, type a dollar sign in front of the letter I and another dollar sign in front of the number 2, then press Enter. The formula in J6 also requires dollar signs. This time, instead of typing out each dollar sign, let's use a shortcut method. Enter edit mode on cell J6 and position the cursor on the M2 reference; this is the region A delivery charge. Press the F4 key on the keyboard to bring up the dollar signs. Repeat this action for the M3 reference, the region B delivery charge, then press Enter to complete the formula. It's now safe to use autofill to copy these formulas, as the required cell references will remain absolute. Position the cursor on G6. A shortcut for autofill is available because there is a block of data to the left. Position the mouse pointer on the bottom right-hand corner of the cursor so that it becomes a black cross, then double-click the mouse button. Excel uses the block of data to the left as a reference and copies the formulas down to G15. Repeat this process on cells H6, I6, and J6 to complete the worksheet. You have now helped Amy to calculate the various costs for Contoso Bikes' orders. You should now be able to recognize situations in which you need to adjust the syntax in a formula to control how it's processed in Excel. You've also learned some useful shortcuts for absolute references and autofill; these shortcuts will help you to work more quickly and efficiently on your worksheets.
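For reference, the finished formulas in row 6 of Amy's quote worksheet should look something like the following sketch, based on the steps you just worked through:

=E6*F6 in G6: the subtotal, the cost per unit multiplied by the quantity ordered.
=G6/100*10 in H6: the 10% discount, processed left to right.
=(G6-H6)*$I$2 in I6: the total excluding delivery, with parentheses forcing the subtraction first and an absolute reference to the number of outlets in I2.
=I6+(2*$M$2)+(2*$M$3) in J6: the total including two deliveries at each regional rate, with absolute references to the delivery charges so the formula can be copied safely.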
At this stage of the course, you should be familiar with creating and working with formulas. But you don't always have to create your own formulas: as you'll soon discover, Excel offers predefined formulas, called functions, that you can use to perform calculations. In this video, you'll discover what function formulas are, explore their syntax, and learn how to use them to perform calculations. Over at Adventure Works, the company is approaching the end of its financial year. Lucas in accounts has been tasked with calculating the total quarterly sales for each regional sales team. You can help Lucas carry out this task using Excel function formulas, but first you need to learn what functions are and recognize their syntax. Let's begin by defining a function. A function is a predefined formula that performs a calculation based on values specified by the user. For example, a simple function could total the values in two cells, or a more complex function could calculate repayments on a bank loan. Functions are useful because they allow for more complex calculations; they also facilitate dynamic content that responds to changes in the worksheet. Excel contains many built-in functions. These built-in functions are grouped into different categories, which can be accessed from the Formulas tab or ribbon. There are several categories visible when you access this ribbon; select the More Functions option to view the others. These categories are organized so that you can locate the functions most relevant to your day-to-day requirements. For example, Excel offers functions for financial, date and time, and math calculations. You'll explore each of these categories in more detail as you progress through the course. You can also refer to the Microsoft page Excel functions by category, an article linked in the additional resources. So now that you know what a function is, let's explore its elements. The first element of a function formula is the name of the function. This takes the form of a single word, such as SUM; the SUM function adds all the values within a selected range of cells. The second element of the formula is the arguments. As you've just learned, a function calculates data, and this data or information is referred to as an argument. The data it accepts is also custom: you can add your own information to the formula to direct and control the action of the function. It's important to remember that each function requires a different list of arguments. Some arguments are mandatory, and a function can't carry out its task without them; other arguments are optional, and they exist to provide different choices around additional elements like formatting your results. So how do you construct a function formula? Like any other calculation, a function formula begins with an equal sign. You then need to write the function name, for example, equals followed by SUM. The next step is to write the arguments. Arguments are contained within a pair of parentheses, so begin by typing an open parenthesis, then list the arguments. As an example, you could follow a SUM function with the argument open parenthesis, C2, colon, C4. Make sure to separate arguments from one another using commas, and use a colon to define a range of cells; don't use spaces or periods. When you finish typing your arguments, end your function formula with a closing parenthesis. You now have an argument that instructs Excel to add all data in cells C2 to C4. When executed, this formula returns a result that totals the values within this cell range. Function formulas can contain more complex arguments, but this simple example is a great starting point to help familiarize you with the syntax. Now that you know how to construct a basic function formula, let's make use of your new skills and help Lucas create a SUM function to obtain the totals for Adventure Works' sales figures. Adventure Works' sales data is contained in an Excel workbook called annual sales totals. The workbook contains a worksheet called sheet one. This sheet contains five columns: the first column lists the months of the year, one month per row, and the other four columns contain the names and data for each regional sales team. Each column contains 12 sales totals, one for each month. Let's begin by calculating the sales totals for team A. First, you need to place the cursor on the cell where the result of your function must appear. Place your cursor on cell B14, underneath the sales data for team A; this is the cell where the overall sales total must appear. Now you can write your function. First, type an equal sign, then type the name of your function. In this instance, you need to add the data, so you can use a SUM function. Function names are not case sensitive; you can type them in upper or lower case, and once written, Excel displays them in uppercase. As you type the word SUM, a list of suggested functions appears. This list is a useful shortcut for accessing functions quickly, but for now you can continue typing the formula.
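As a quick reference, the general pattern looks like this; the first formula is the illustrative example from earlier, and the second is the one being typed into B14:

=SUM(C2:C4) adds all the values in cells C2 to C4.
=SUM(B2:B13) totals team A's twelve monthly sales figures.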
Now that you've stated the name of your function, you need to outline your arguments. Type an open parenthesis; a floating help message appears with argument prompts. If the prompt is in bold, then the argument is required. If the argument is in square brackets, then it is optional; in other words, it's not required for the function to work. In this instance, you're writing a custom argument: type B2, colon, B13, then type a closing parenthesis to end your argument. The SUM function and your custom argument instruct Excel to calculate the numeric total of all data in cells B2 to B13, just like the example you explored earlier. Press Enter to execute the function. The result shows that team A's sales total for the year was $971,000. Now that you've calculated the sales total for team A, you can copy the function formula to the other cells in the row using the autofill shortcut. Select cell B14 and position the mouse pointer over the bottom right-hand corner of the cursor to turn it into a black cross. Hold down the mouse button and drag the cursor to the right as far as cell E14. As it copies the formula from cell to cell, Excel also adjusts it to total the cells in each column for the remaining teams. Lucas now has the sales totals for each of Adventure Works' sales teams. Thanks to your help, Lucas successfully created the function formula he needed to complete his task. And having assisted Lucas, you should now know what functions are, be able to read the syntax of a function, and know how to use a function to perform a calculation. Creating a formula with a function for the first time can be intimidating: how many arguments does it require, and what's the correct syntax? Thankfully, Excel offers a useful Insert Function tool that provides a framework for creating a function formula. In this video, you'll explore the Insert Function tool and function categories and learn how to create a function. Over at Adventure Works, the company is busy calculating the annual sales total for each regional team. The sales data is contained in a worksheet called sheet one. The worksheet lists all four teams and their respective sales totals for each month. Let's help Adventure Works calculate each team's total sales using the Insert Function feature. Begin by positioning the cursor on cell B14; this is the cell in which your sales total must appear for team A. Now you can access the Insert Function feature. There are two ways to open this feature: the first is by selecting the Insert Function button on the left-hand side of the Formulas ribbon, or you can select the Insert Function option on the worksheet screen, to the left of the formula bar. Selecting either one of these options opens the Insert Function dialogue box. In the middle of this dialogue box is a list of functions; you can navigate through these functions using the scroll bar. However, this is a brief list that doesn't contain all available functions. Above this list is a dropdown box with the heading Most Recently Used, and to the left of this dropdown is a prompt called Or Select a Category. Because the category choice is set to Most Recently Used, the list underneath contains functions that you've recently used in your worksheet formulas. As you work through Excel, you'll most likely make frequent use of the same functions, and over time this list will populate with your most used functions, providing a useful quick access shortcut. You can select each function in the list to display a short description of its purpose. In the bottom left of the dialogue box is a blue hyperlink called Help on This Function. This is a
context-sensitive link: select it to visit the help page for your selected function on the Microsoft support site. If your required function isn't on this list, then select the dropdown arrow to the right of Most Recently Used; you can select another category to open a different list of functions. For example, you need to use the SUM function to complete the calculation task for Adventure Works, and you can access the SUM function from the Math and Trigonometry category. When you select this category, the list of available functions changes. You can learn more about which functions correspond to which categories in the additional resources. Remember that you can select a function for an explanation of what it does, or you can highlight a function name in the list and then select the blue help hyperlink for more detail. The function list is arranged alphabetically, so scroll down to the S section, select SUM, and then select OK. This action opens another dialogue box called Function Arguments. There are two boxes at the top of this dialogue, labeled Number 1 and Number 2 respectively. Notice that the text Number 1 is bolded; this indicates that an entry is required here. The text Number 2 is not bolded, which indicates that it is optional. However, you might use it in a situation where you require a total for blocks of numbers at separate locations in the spreadsheet. In the Adventure Works worksheet, Excel has identified the block of numbers directly above your cursor position, so it's suggesting that you include the cell range B2 to B13 in your total. In the background, on the formula bar, Excel has already constructed the calculation for you: it has included not only the cell references but also the equal sign, the parentheses, and the colon. If Excel has suggested the wrong block of cells, then you can select the navigate button to select a different range or edit the formula. The navigate button is an arrow pointing upwards at the right of the Number 1 box. Selecting this arrow temporarily collapses the dialogue box and returns you to the spreadsheet so that you can change the selection. The navigation arrow to the right of the number box is now an arrow pointing downwards; selecting this arrow restores the full Function Arguments dialogue box. Just above the blue help link on this dialogue is a formula result, which in this case is a total. You should also be aware of warning messages that could appear here; these warnings are often generated by errors that are created when working with more complex function formulas. You've now selected the required function, and you've made sure that the syntax is correct and targets the required data. Select OK to add the completed formula to the worksheet. When executed, this function formula generates a sales total of $971,000 for team A. Adventure Works can copy this formula across the row to generate sales totals for the other teams. Thanks to your use of the Insert Function feature, Adventure Works now have the required sales data, and you should now be familiar with the Insert Function tool, understand its categories, and be able to make use of the tool to create a function formula.
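If you do use the optional Number 2 box, the dialogue simply builds a SUM formula with more than one range. As an illustrative sketch (the second range here is hypothetical, not part of the Adventure Works worksheet):

=SUM(B2:B13) is the formula the dialogue builds for team A.
=SUM(B2:B13,D2:D13) would total two separate blocks of numbers, with the ranges separated by a comma.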
Congratulations on reaching the end of this second week in this course on preparing data for analysis with Microsoft Excel. This week, you explored how to create and work with formulas and functions in Excel. Let's take a few minutes to recap what you learned in this week's lessons. You began the first lesson by learning about formulas. You learned that a formula in Excel is a calculation performed on the values in a range of cells in your worksheets. Examples of these calculations include addition, subtraction, multiplication, and division. Once the calculation is completed, the formula returns a result, even if it is an error. You then learned how formulas work. Different characters, or operators, are used to indicate what type of calculation Excel should perform; examples include the operators for addition, subtraction, multiplication, and division. The formula bar shows the formula in the cell you are working in, while the worksheet shows the result of the formula. Formulas can also be static or dynamic. A static formula means that the numbers are fixed, so it always generates the same results; a dynamic formula is one in which the results depend on the current values in the referenced cells. A formula can also include a reference to a cell that itself contains a formula, creating a chain of calculations, and a formula can also refer to a cell in another sheet; this reference must include the worksheet name followed by an exclamation mark. You then learned how to control calculations. You learned that when working through a formula, Excel handles the operators according to the order of precedence. This means that Excel calculates formulas according to the hierarchical position of each symbol within the order of precedence. The hierarchy is as follows: Excel first calculates division and multiplication operators, and it then calculates addition and subtraction operators. However, you also discovered that you could control a calculation using parentheses. In formulas, parentheses instruct Excel as to which part of a calculation must be executed first, even if this would contradict the order of precedence. There are also times when you may need to reproduce cell entries and formulas within a worksheet. When a formula is copied, it is important to consider the appearance of the cell references. There are two ways that a cell reference can appear in a calculation: relative and absolute. A relative cell reference means that Excel adjusts the cell reference of a copied formula relative to its new location to make sure it's correct, and an absolute reference means that Excel keeps the reference constant; it doesn't adjust it. You learned that to make a cell reference absolute, you must add a dollar sign before the column initial and before the row number. You also explored different percentage calculations, and you learned how to create reliable percentage formulas using the correct syntax. Throughout the lesson, you put your new knowledge to use by assisting Adventure Works with many different calculation tasks. One of these tasks was in the exercise, in which you calculated Adventure Works' profits and margins in preparation for a presentation. To complete this task, you created a calculation that relied on the company's revenue data, and you made sure that your calculation followed the best practices you had explored during the lesson. You then undertook a knowledge check, in which you proved your understanding of the concepts you encountered by answering a series of questions. Finally, you explored a list of additional resources designed to help you improve your knowledge of the topics in this lesson. In the second lesson of this week, you learned how to get started with functions. You began by learning that a function is a predefined formula that performs a calculation based on values specified by the user. You then discovered that Excel contains many built-in functions, grouped into separate categories, which can be accessed from the Formulas
tab or ribbon. You then explored the two elements of a function. The first element of a function is the name, such as SUM. Next is the arguments; an argument is the data a function accepts. Arguments are mandatory, but the data can be custom. You then learned how to construct an argument in Excel. Like any other calculation, a function formula begins with an equal sign. You then need to write the function name, for example, equals followed by SUM. The next step is to write the arguments within a pair of parentheses, and when you finish typing your arguments, you end your function formula with a closing parenthesis. You also learned that you could create a function using the Insert Function tool. The tool is a framework for building functions, accessed using the Formulas ribbon or from the worksheet screen. The tool lets you build a function from a series of dropdown lists, and it provides useful tips for building functions and warnings for when they're incorrect. You then explored the AutoSum shortcut. The AutoSum shortcut is a method of adding formulas in Excel; it provides quick access to core functions that Excel users make daily use of. The functions it provides access to include the SUM function, which adds all values within a selected range of cells; the AVERAGE function, used to calculate the average of the selected range; and the different versions of the COUNT functions, which are useful methods of counting the number of cells in a given range that contain, or don't contain, specified values. There's also the MAX function, which displays the largest value from a given range, and finally the MIN function, which displays the lowest value from a given range. You can also reproduce calculations quickly and easily in a worksheet using the autofill feature. Just like in the previous lesson, you put your new knowledge to use by assisting Adventure Works with many different functions. This included the exercise item, in which you helped Adventure Works to prepare a monthly sales report. To complete this task, you prepared the report using a series of functions, and you made sure that your calculations followed the best practices you had explored during the lesson. You then undertook a knowledge check and a module quiz, in which you proved your understanding of the concepts you encountered by answering a series of questions. You've now reached the end of this module summary. It's time to move on to the discussion prompt, where you can discuss what you've learned with your peers. You'll then be invited to explore some additional resources to help you develop a deeper understanding of the topics in this lesson. Best of luck; we'll meet again during next week's lessons. You check the results of a recently performed data analysis, only to discover the results are wrong. A quick inspection of the data set reveals errors in the data. Raw data needs to be correct and trustworthy, because this information influences decisions, so you always need to check for errors and resolve any you find. In this video, you'll explore the common data errors in Microsoft Excel and discover how they could negatively impact data analysis. Jamie at Adventure Works is working on a spreadsheet that contains a large amount of customer and sales information. She's assessing whether the contents are reliable enough to be used for data analysis to deliver new insights on customer behavior. However, the spreadsheet contains some common errors, and these errors must be resolved before she can make use of the data. Let's take a few minutes to examine the
types of errors that Jamie should be checking for. Many common errors or mistakes that you might find in your data set are often made by those who entered the data; they might be unfamiliar with the software or technology, or they're just not paying attention. A common mistake is that a name or key phrase is misspelled. In that case, Excel might not link the entry to other important details as it should, or it might not find the entry in a search. For example, Jamie's spreadsheet tracks sales figures by region, and column C tracks the city in which each sale was made. If she types the city Chicago as the latest entry without the a, or types it in the wrong column, Excel would ignore that entry when asked to summarize or total the sales results for that city. Entries can be misidentified during the data analysis process if they contain unnecessary characters. For example, Jamie types a dollar character before the numbers in her entries, and these entries are considered text: Excel would not include those amounts in a number calculation, and in a wider data analysis process they might be ignored altogether. Remember, in Excel a currency amount should always be typed as a number in the cell first; then you should apply the currency symbol or the comma separator using a number format. Unnecessary spaces before or after entries can also create difficulties. They don't stand out on screen in the same way as other text or number characters, but Excel is aware of them. For Excel, the word Chicago followed by a single space is different from Chicago typed without the space; for calculation and analysis purposes, it considers them to be two separate cities. Finally, an entry might be placed in the wrong column or under an incorrect heading in a spreadsheet. For example, Jamie might type an entry under the wrong heading: in her spreadsheet, the city name Chicago is entered in the sales price column, so that row item might be misclassified. Other examples of common errors or mistakes can be caused by an inconsistent layout or content. It's important that data is presented consistently throughout a worksheet so that it always remains accurate and reliable; poor or inconsistent layouts can give rise to errors. When creating an Excel file, keep in mind the way in which the information will be used. For example, if a spreadsheet only has a single column for an address, this column then contains all the address elements, like city, region, or area code. This means that it's difficult to break down these results separately by city or by region during data analysis, because they're not in separate columns. Instead, you should format information like addresses across multiple columns so that it's easier to process and analyze the data. Abbreviations and acronyms can also generate errors in data analysis; it's usually better to include a full word or title instead of an abbreviation or acronym. In the following spreadsheet, there are multiple variations of common abbreviations, like Mr, Miss, and Doctor. This will cause serious issues during data analysis, so the best approach is to standardize the way abbreviations are written, particularly for titles like these. Another important feature of data analysis is the ability to break down results and information by date or calendar interval. This means that dates must be entered in a particular way in a spreadsheet so that Excel recognizes them as calendar items: the component elements, like the month, day, and year, must be typed as numbers and separated by a forward slash or a dash. If you type dates with incorrect separator characters, then Excel won't interpret them as numbers; instead, it processes them as text, so you won't be able to conduct time analysis of your data.
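You can see some of these problems directly in a worksheet. The following formulas are illustrative sketches using hypothetical cells:

="Chicago"="Chicago " returns FALSE, because the trailing space makes Excel treat the entries as two different values.
=SUM(A1:A2) returns 100 if A1 holds an amount stored as text and A2 holds the number 100, because text entries are ignored by numeric calculations.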
A final common error to be aware of is duplicate information. Duplicate information in a data block distorts analysis results: items can be counted multiple times, and numeric results can be artificially inflated. Checking for duplicate data is an important step before performing data analysis. Duplicated entries in data are often the result of human error, where entries are typed multiple times; data could also be repeated accidentally if imported or created using a copy and paste operation. For example, Jamie might add sales figures from the previous week to the spreadsheet. If her colleague doesn't check for duplicate data, then those sales figures could be included in the results a second time. So how could you avoid the risk of duplicate data? Aim for an efficiently designed spreadsheet. For example, if you're including dates in your spreadsheet, then sort the sheet in date order; this makes it easier to identify the entries already added. Likewise, if you're including address data, then assign a different column to each element of an address. This helps others to identify entries by searching for house numbers, street names, or cities, like an entry for apartment 1236 on North Street, Miami. Jamie has identified the common errors in her data set; she can now resolve them and start analyzing the data. And you should also now be able to recognize common data errors and how they can have a negative impact on data analysis results. You'll be able to identify and fix the most common errors in the data before submitting it for analysis. Well done. Every day, you calculate dates and times, asking questions like "How long do I have to get to work?" or "How many days do I have available to complete that project?" Data analysts also ask date- and time-based questions about their data sets, and they can calculate answers using Excel's date and time functions and formulas. In this video, you'll learn about the importance of these date and time calculations and how they can generate new data, and you'll explore some business use cases. Over at Adventure Works' distribution hub, Jamie is overseeing both the stock that Adventure Works are purchasing from suppliers and the items dispatched to fulfill customer orders. Jamie needs to create a spreadsheet with date and time formulas that track the delivery times, dates, and date intervals. Before you discover how Jamie can make use of these formulas, let's find out how date and time information provides businesses with an essential framework for planning. Date- and time-based calculations are useful tools in helping businesses to plan for increased demand for products and demands on resources such as staff and equipment. They also help businesses plan towards key dates or deadlines, and you can use Excel to plan toward key dates where there will be an increased demand on your business. Take the example of a building company contracted to build a new office block: the project manager needs to create schedules and plans for all stages of the building process, and for planning purposes, they need to determine how many working days there are between the project's proposed start and end dates. Excel can be used to create formulas to calculate how many hours, calendar days, or work days remain before important deadlines. These formulas can be set up in a dynamic way so that they update as the clock or the calendar changes. By monitoring daily results over a specific time interval, businesses
can identify dips and peaks in performance. For example, a management team might notice that during one period there was a significant drop in sales; if the results are organized by date, they can identify the factors, internal or external, that might have caused this. Date and time calculations are also useful for tracking results and performance: business transactions are usually recorded against dates and, in some cases, against times. Now that you're familiar with some of the benefits of date and time calculations, let's explore date and time functions and formulas in Excel. It is important to understand how Excel tracks dates and how they are used in calculations. Let's begin with serial numbers, the method Excel uses for tracking calendar days. In Excel, each date entry is formatted to appear as a calendar item; however, behind each date is a number that Excel uses to keep track of calendar days. This number is known as a serial number. Excel assigns a serial number to each date, starting from the 1st of January 1900, which was given serial number one. Excel uses the system clock on your computer to track time, and the serial number for the current date increases by one each time a 24-hour period elapses. A date in the past will have a smaller serial number than one in the future. You can view the serial number behind any date by changing the format from date to general. In this example, the two entries in A2 and B2 are formatted to display as dates; if the same entries in A4 and B4 are formatted as general, it is possible to display the serial numbers behind these dates. The later date has a larger serial number. Excel uses these serial numbers in calculations: using serial numbers, one date can be subtracted from another to calculate a specific number of days. Formulas can also generate dates; for example, the TODAY formula can be used to always display the current date in a spreadsheet. Over at Adventure Works, Jamie needs to display the current date in her spreadsheet, and she can use the TODAY function to generate this result. The syntax for this formula is an equal sign followed by the word TODAY and parentheses. This creates a dynamic date display in a spreadsheet that updates every 24 hours. A similar function called NOW can also be used to display both the current date and time. The syntax for this function is an equal sign and the word NOW followed by parentheses. When executed, this function displays the current date and time in your spreadsheet, which makes it more useful than the TODAY function when you also need the time. You can also use functions to extract the component elements of a date. These actions can be carried out using the MONTH, DAY, and YEAR functions; each function extracts a specific component of the date, the month, the day, or the year. Finally, there's also the DATE function, which does the opposite of MONTH, DAY, and YEAR: it assembles a date from its component parts. Either of these operations may be necessary to prepare date information for data analysis, and you will learn more about these functions and the others you've just reviewed later in the course. Jamie can use these date and time formulas to track delivery times and dates for Adventure Works' purchases from suppliers and to track items dispatched to their customers. And you should now understand how date and time calculations are used to generate new data in Microsoft Excel; you've also learned how to identify key business use cases for date- and time-based information. Well done.
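Here is how these functions look in practice. The date cells and the values in the DATE example are hypothetical:

=TODAY() returns the current date and updates every 24 hours.
=NOW() returns the current date and time.
=B2-A2 subtracts one date's serial number from another, giving the number of days between the two dates.
=YEAR(A2) extracts the year element of a date; MONTH and DAY work the same way.
=DATE(2023,6,15) assembles a single date from separate year, month, and day values.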
As a data analyst, you'll often have to input large volumes of time- and date-based data into your spreadsheets, and it can be difficult to manually keep this data aligned with your project. Thankfully, with Excel you can create dynamic date and time entries that update automatically. Over the next few minutes, you'll learn how to create dynamic time and date entries in a worksheet and separate dates into their component parts. Adventure Works are preparing a new advertising campaign, which will launch in multiple countries, and they need to use Excel to track progress toward key dates. The milestone dates for the project are contained in a worksheet called regional dates. The worksheet tracks information about the products that are part of the campaign, alongside the campaign launch dates for each country. Adventure Works needs to calculate how many project days are available for each campaign. Another calculation in the spreadsheet must show, on a rolling basis, how many days are left until each launch date. The development of this campaign will spread over two years, so Adventure Works also need to record the accounting period for the project launch date for each country. Let's help Adventure Works to complete their spreadsheet using date and time formulas. Entries in columns D and E are formatted as dates; you can select any cell in the range D5 to E19 and check the number format box on the Home ribbon to confirm this. Remember that these dates are actually serial numbers, so you can switch the format on cells D5 and E5 to general. Access the Home tab and select general from the dropdown menu to display the serial numbers. Notice that the serial number for the date in E5 is larger than the one for the date in D5. Select undo to restore the date format. Now you need to calculate the number of project days. You can complete this task using a simple subtraction formula. Select F5 to input your calculation. Begin the calculation with an equal sign, then take the date in E5, the larger serial number, and subtract the date in D5, the smaller serial number. Press Enter to generate the result: there are 63 days assigned to the timeline for this first project. Note that because this calculation is a subtraction, Excel doesn't include the start date in cell D5 in its count; however, if required, you can ask Excel to include the start date by adding a plus one to the formula. The result in F5 remains static, because the dates in D5 and E5 won't change. Now you need to work out the days to launch figure for cell G5. The formula for this figure takes the launch date in E5 and subtracts a current date figure in cell E1. The current date in E1 must also be created using a formula: if E1 always displays the current calendar date, then the formula in G5 recalculates daily to show the decreasing number of days to the launch date. You need to use the TODAY function in your formula in E1 to make sure that the date updates every 24 hours to the current date. With the cursor in E1, type an equal sign, the word TODAY, and an open parenthesis. You might notice that the help prompt is empty; this is because the function doesn't require any arguments. There still need to be parentheses after the function name, but no arguments should be included. Press Enter to produce a dynamic date result that updates every 24 hours. To show the days to launch figure in G5, the formula takes the campaign launch date in E5 and subtracts the current date in E1. The E1 cell reference must have dollar signs before the column initial and the row number; this is to make sure that the reference stays constant when the formula is copied. The TODAY formula will now change the current date in
cell E1 every day. This means that the formula in G5 also recalculates daily, so the days to launch figure reduces by one each day as the timeline gradually progresses. Your next task is to show the year for the campaign launch date. Excel recognizes three elements in a date: the month, the day, and the year. You can use the YEAR function to identify and display the year element from a date in another cell; in other words, you can separate the date into its component parts so that you can focus on the year element. Type an equal sign, the word YEAR, and an open parenthesis in cell H5. A help prompt appears on screen and states serial number; this is because Excel interprets stored dates as serial numbers. Select E5, type a closing parenthesis, and then press Enter to generate the result in H5: this campaign is set to launch in 2023. You've calculated the required campaign information for row 5, so you can now copy these formulas down through the spreadsheet to calculate the remaining campaign dates. Use the autofill double-click shortcut on each formula to copy it down through the column to row 19 and complete the spreadsheet. You should now understand how Excel works with dates in calculations and be able to create some common date and time tracking formulas. Thanks to your work on these formulas, Adventure Works now have a clearer picture of how much time is available for each stage of this project. Well done.
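For reference, the completed formulas from this walkthrough look like this:

=TODAY() in E1 displays a current date that updates daily.
=E5-D5 in F5 counts the project days between the start and end dates (add +1 if the start date should be included).
=E5-$E$1 in G5 recalculates the days to launch figure daily, with an absolute reference to the current date cell.
=YEAR(E5) in H5 displays the year element of the launch date.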
When working with Excel, you might need to execute a function under certain conditions or logic. In these instances, you can use a logical function calculation, like an IF function. In this video, you'll explore the purpose of logical functions, review some common use cases, and learn the syntax for creating a logical function formula using the IF function. Over at Adventure Works, Lucas is reviewing the monthly sales reports. He needs to find out if any of the sales staff are entitled to a monthly bonus as a reward for exceeding their sales targets. You can help Lucas to identify which sales team members deserve a bonus by using an IF function formula. But before you can help Adventure Works, you'll need to find out more about how logical functions work. You can use logical functions to ask yes or no questions about your data. If the function returns yes as its answer, then you can direct Excel to perform the required action; however, if the function returns an answer of no, then Excel can be directed to perform a different action. For example, you can direct Adventure Works' IF function formula to ask the question: has this salesperson met their target? If the answer is yes, then they'll be awarded their bonus; if the answer is no, then they're not awarded a bonus. When logical functions such as IF run a test, they determine the answer by comparing the value in a cell against a specified criterion. For these tests to work, the formula must contain logical operators. The logical operators determine what kind of question the formula is asking and what value it needs for its answer, and these operators can be used to compare both text and numeric entries. Let's review some examples of these operators. The equal sign is the first of the mathematical operators that Excel uses in logical functions; Excel uses this operator to check if the value of one item is equal to that of another item. For example, a formula that tests if 1 equals 1 would return the value of TRUE. The logical symbols greater than and less than are used by Excel to test if one value is larger or smaller than another; an Excel formula that performed the logical tests 2 is greater than 1 and 1 is less than 2 would return an answer of TRUE for both tests. The greater than and less than symbols can also be combined with the equal sign; this combination lets Excel confirm if a value is greater than or equal to, or less than or equal to, another value. Let's take a formula where Excel checks to see if the value in cell D2 is the same as or larger than the value of 400: if either of these conditions is true, then the test would return the value of TRUE. Finally, a very useful set of logical operators is not equal to. This is when the less than and greater than symbols are typed back to back. This combination of operators is interpreted by Excel as not equal to; in other words, you're asking Excel to determine that value A does not equate to value B. For example, the result of the logical test 1 is not equal to 2 would be TRUE, because the two numbers are different values.
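Each of these logical tests can be typed directly into a cell to display its TRUE or FALSE result, for example:

=1=1 returns TRUE.
=2>1 and =1<2 both return TRUE.
=D2>=400 returns TRUE if the value in D2 is 400 or more.
=1<>2 returns TRUE, because the two values are not equal.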
So you've discovered how an IF function formula works, but how do you make use of one? When constructing the IF function formula, you need to give Excel three pieces of information. The first piece of information is called the logical test. For the logical test, you need to identify the cell that contains the value to be checked, and you also need to specify the test to be carried out in relation to this value. This is the IF keyword followed by parentheses, and it's within these parentheses that you must type the logical test. For example, Lucas needs Excel to check the total sales of each team member to determine if they meet their monthly target. The next instruction tells Excel what to do, or what to display, if the test returns a result of TRUE. In Lucas's case, if his test returns a value of TRUE, then the team member is awarded a bonus. The third and final argument is what Excel should do or display if the logical test returns the result of FALSE. If Lucas's test returns a value of FALSE for a team member, then Excel returns a value of zero; in other words, that person is not awarded a bonus. Now that you've reviewed the elements of an IF function formula, let's make use of your new skills and help Lucas create a formula to check the sales team's monthly figures and determine which employees are entitled to a bonus. The data set Lucas requires is in a workbook called monthly sales. The workbook contains four sheets, one for each sales team; for this exercise, let's just focus on the results for team A. The worksheet lists the name of each team member, their total monthly sales, and their monthly target. The bonus amounts must be calculated and listed within column E, and any team member who meets or exceeds their target is awarded the bonus figure in cell H4. Let's begin by finding out if team member Michelle Cook is entitled to a bonus. Position the cursor on cell E4, then type an equal sign, the keyword IF, and an opening parenthesis. You need to place your arguments for the IF function within parentheses; notice the floating help message prompting you for the three arguments that the function needs. Select cell C4 for Michelle's monthly sales data, type a greater than symbol followed by an equal sign, then select cell D4 and type a comma. This instructs Excel to check if Michelle's sales figures for this month are greater than or equal to her assigned target. However, as you can see from the bold prompt text, the formula is still incomplete. You now need to instruct Excel on what bonus value to award: you must include what action Excel should take if the result of the logical test is true, or yes, and what to do if the result is false, or no. Select cell H4 for the value if true and add a dollar sign before the column initial and the row number; the dollar signs prevent Excel from adjusting the reference when the formula is copied. Then type a comma followed by a zero for the value if false; this zero indicates that Michelle doesn't receive a bonus if the logical test fails. Finally, type a closing parenthesis to end your arguments. Press Enter to execute the IF function formula. The results show that Michelle has met her sales target and has earned a bonus of $500 for this month. Copy the formula down the column and execute it to determine how the other team members have performed. The results show that three team members met their sales targets and could be awarded a bonus; two team members did not reach their targets, so should not receive a bonus. Thanks to your help, Lucas successfully created the IF function formula he needed to complete his task. And having assisted Lucas, you should now know how IF functions work and recognize the correct syntax to create a logical formula using IF. Well done.
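The finished bonus formula for Michelle in cell E4 therefore looks like this:

=IF(C4>=D4,$H$4,0) checks whether the monthly sales in C4 meet or exceed the target in D4, awards the bonus amount in H4 if they do, and returns zero if they don't.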
You may be familiar with using a logical function to test for conditions in your data sets, but what if you need to test for multiple conditions? You can use nested IF and IFS functions. In this video, you'll explore the concept of nested IF and IFS functions and learn how they can be used to perform a series of elimination tests and generate a final result. Over at Adventure Works, Lucas is calculating bonuses for sales team B. Lucas needs to calculate each team member's sales total and determine what level of bonus they should be awarded. Lucas can complete this task using nested IF and IFS functions; let's find out more about these functions and then help Lucas complete his task. At this stage of the course, you've encountered many examples of function formulas, but a formula doesn't have to make use of just one function. In fact, a formula can contain several functions that work together to achieve a result, and logical functions work this way by interconnecting with one another. Nesting functions is the technique of adding another function to the formula as an argument for the original function; in other words, you can place one function inside another to expand its functionality. For example, you might need to create a formula that performs a series of elimination tests before it generates the final result. You could design this formula in two ways. One approach would be to create what is known as a nested IF formula. The formula begins with an IF that performs an initial logical test. If the test turns out to be true, then the formula will simply process whatever action is specified in the value if true argument. However, the result of the logical test could also be false; if so, then another IF function in the value if false argument could run another test and process different actions. For example, a nested IF formula could check if a member of the Adventure Works sales team meets a specific bonus band; if the result is false, then a second argument could check the value against another band, and so on. The second approach is to use a function called IFS. An IFS function is designed to run a series of tests that don't require you to nest other functions. The IFS function steps through the tests, checking each one; if a test is false, it continues to move through the tests until it finds one that is true. When a logical test returns TRUE as a result, the formula performs or displays whatever is in the value if true for that test, and it then stops running tests. In the case of Adventure Works, the IFS function can continually check each sales team member's sales results against the different bonus bands until it identifies a suitable amount to award them. Now that you've learned about the basics of nested IF and IFS functions, let's put your knowledge to use by helping Lucas to calculate the bonus bands for the sales team. The sales data sets are contained in the team B worksheet in a workbook called monthly sales figures. The team B worksheet lists the names of each team member and their monthly sales result; it also lists their sales targets and the amount they achieved above their targets. The bonus amounts must be listed in column F, using the bonus bands data in columns I and J. Adventure Works also needs a formula in F3 that checks the sales data in cell E3; it must then calculate which bonus band is applicable to the team member Olivia King and display the correct bonus amount. Let's begin by typing the formula. Position the cursor on F3, then type an equal sign, an IF, and an opening parenthesis. Next, select E3 to add that cell as a cell reference, then type a greater than symbol followed by an equal sign. Type 20,000, which is the first bonus band, and then a comma. Finally, select cell J3 to add it as the value if true argument, then type a comma. This first part of the formula provides Excel with the following instruction: if the figure in cell E3 is greater than or equal to 20,000, then the staff member is owed the bonus amount in cell J3. But what if one or more of the amounts in column E are less than 20,000? If the amount in E3 is less than 20,000, there are still two other bands from which a bonus can be assigned. To test for these bands, you need to add another IF function as the value if false argument in the formula; you can nest this function within the first one. First, type an IF; in this instance, you don't need another equal sign. Then type an opening parenthesis so you can begin writing your arguments. This second occurrence of the IF will need its own opening and closing parentheses, and the parentheses must contain three arguments: a logical test, a value if true, and a value if false. Let's create the logical test first. Select E3 to assign it to your argument, then type a greater than symbol and an equal sign, then type 10,000 and add a comma. Next, you need to assign the value if true: if the amount in E3 is over 10,000, then the bonus amount awarded will be the value in J4. Select cell J4 and type a comma to assign it to your argument. Finally, you need the value if false: if it's not true that the amount is over 10,000, then the bonus amount awarded will be the value in J5. Select cell J5 to assign it to your argument. Each instance of IF also needs its own closing parenthesis, so type two closing parentheses and press Enter to execute the function. The results of your function show that the logical test for the first IF failed, so Excel moved on to the second IF. The second logical test was true, so Excel correctly displayed the bonus amount of $1,000 from cell J4. Changing the monthly sales figure for Olivia to 67,140 would change the result in F3, because both IF functions would have returned a false result, so the result would have been the value in cell J5. This formula is now a nested formula, because there is a second IF inside the first one. Let's delete this result and recreate the formula using the IFS function. When you type equals, an IFS, and an opening parenthesis, Excel only provides prompts for two arguments: a logical test and a value if true. As you learned earlier, you can use IFS to specify a series of tests and the value if true for each one. Let's
step through this process. Select cell E3, then type a greater than symbol, an equal sign, and a value of 20,000. Type a comma and then select J3 as the band to be assigned if the first test is met. When you type a comma, prompts appear for another logical test and a value if true. For the second logical test, select E3 again; this time you must follow it with a greater than sign, an equal sign, and a value of 10,000, then type a comma and select J4. Now you need to tell Excel that the final value if true should be the result of the formula, so type TRUE and a comma, then select J5. Adding the word TRUE here prevents Excel from producing a #N/A error message. You also need to add dollar signs to the J3, J4, and J5 references. You can now copy this formula down through the column to calculate the bonus amount for each team member. Thanks to your help, Lucas has now determined what bonus band should be awarded to each team member. And you should now understand the difference between a nested IF function formula and a calculation that uses IFS; you've explored the different syntax for both types of formula, so you can decide which you find easier to understand and replicate.
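Putting the two approaches side by side, the formulas built in this walkthrough look something like this (the dollar signs on the nested version are an addition, so that it, too, can be copied down safely):

=IF(E3>=20000,$J$3,IF(E3>=10000,$J$4,$J$5)) is the nested IF version, where the second IF supplies the value if false.
=IFS(E3>=20000,$J$3,E3>=10000,$J$4,TRUE,$J$5) is the IFS version, where the final TRUE acts as a catch-all and prevents a #N/A error.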
Congratulations on reaching the end of the third week in this course on preparing data for analysis with Microsoft Excel. This week, you explored how to use functions to prepare data for analysis in Excel. Let's take a few minutes to recap what you learned in this week's lessons. You began the first lesson by discovering how inconsistent data affects analysis and the common mistakes people make. Examples of these errors include misspellings, unnecessary characters and spaces, and incorrectly placed entries. You now know that errors such as these have a negative impact on data analysis, and you were also able to fix these errors in your data before submitting it for analysis. You then learned how you can use different functions to standardize text data. The LEFT, MID, and RIGHT functions are used to return a specific number of characters from either the left, the middle, or the right side of a cell entry. Typically, these functions are used in situations where you need to transfer parts of the cell content to a different column; many data analysts use the LEFT, MID, and RIGHT functions to split the contents of a column into three separate columns. The TRIM function removes empty spaces from text strings, except for the spaces between words. This is useful for when you suspect that there are random spaces at the beginning or end of an entry, and it's also a useful way to tidy up a column of text before beginning any analysis. Using the wrong case in text data can make a summary or report appear untidy or unprofessional, and there are three functions you can use to standardize the case used in text entries: UPPER, LOWER, and PROPER. Lastly, you can use the CONCAT function to combine entries from different cells in a spreadsheet into a single cell entry. In this lesson, you put your new knowledge of functions to use by helping Adventure Works standardize its data for analysis. One of these tasks was in the exercise, in which you had to clean up an Adventure Works spreadsheet so that it could be used for data analysis. To complete this task, you used formulas to remove inconsistencies and errors from the data, and you made sure that your formulas followed the best practices you had explored during the lesson. You then undertook a knowledge check, in which you proved your understanding of the concepts you encountered by answering a series of questions. Finally, you explored a list of additional resources designed to help you improve your knowledge of the topics in this lesson.
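As a reminder of how these text functions look in a worksheet, here are some illustrative sketches using a hypothetical text entry in A2 and a second entry in B2:

=LEFT(A2,3) returns the first three characters of A2.
=MID(A2,4,2) returns two characters, starting at the fourth character.
=RIGHT(A2,3) returns the last three characters.
=TRIM(A2) removes spaces from the start and end of the entry but keeps single spaces between words.
=PROPER(A2) capitalizes the first letter of each word; UPPER and LOWER convert the whole entry.
=CONCAT(A2," ",B2) combines the two entries, with a space between them, into a single cell entry.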
In the second lesson, you learned how to use date and time functions in Microsoft Excel to generate new data. You explored different examples of how the data generated from date and time calculations can be used; for example, date and time data can be used to create a framework for planning, track business performance, and display important results. You then learned how Excel interprets and works with dates in a spreadsheet: all dates have serial numbers, which is how Excel interprets them, and with these serial numbers you can use dates to perform calculations, like subtracting one date from another. You also reviewed functions for creating dynamic formulas that calculate time and date values; these include the TODAY and NOW functions. And you discovered that you can also divide a date entry into its component parts using DAY, MONTH, and YEAR, or return these components as a single date with the DATE function. Throughout the lesson, you put your new knowledge to use by assisting Adventure Works: you helped the company to plan its projects by using different date and time calculations. One of these tasks was in the exercise, in which you gathered date and time information for one of Adventure Works' advertising campaigns. You completed this task using the date and time calculations you learned about, and these functions helped you to generate new milestone data for Adventure Works. You then undertook a knowledge check, in which you proved your understanding of the concepts you encountered by answering a series of questions. Finally, you explored a list of additional resources designed to help you improve your knowledge of the topics in this lesson. In the third lesson, you learned about logical functions such as IF and IFS. You learned that logical functions can be used to ask yes or no questions about your data: if the function returns yes as its answer, then you can direct Excel to perform the required action; however, if the function returns an answer of no, then Excel can be directed to perform a different action. Next, you learned that for these tests to work, the formula must contain logical operators. The logical operators determine what kind of question the formula is asking and what value it needs for its answer. You discovered that these operators are used in IF formulas, and that an IF formula needs three pieces of information to work: a logical test, a value if true, and a value if false. You also learned that nesting functions is the technique of adding another function to the formula as an argument for the original function; in other words, you can place one function inside another to expand its functionality. There are two approaches: you can use the nested IF function or the IFS function. You learned that the nested IF formula begins with an IF that performs an initial logical test. If the test turns out to be true, then the formula will simply process whatever action is specified in the value if true argument; however, the result of the logical test could also be false, and if so, then another IF function in the value if false argument could run another test and process different actions. The second approach is to use the IFS function. You discovered that the IFS function steps through the tests, checking each one; if one test is false, then the function continues to move through the remaining tests until it finds one that is true. When a logical test returns TRUE as a result, the formula performs or displays whatever is in the value if true for that test, and it then stops running tests. Just like in the previous lessons, you put your new knowledge to use by helping Adventure Works: in this lesson, you determined the financial performance of the sales team using IF and IFS functions. This included the exercise item, in which you helped Adventure Works to generate additional information from a customer's spreadsheet. To complete this task, you generated the required information by using IF and IFS functions, and you made sure that your calculations followed the best practices you had explored during the lesson. You then undertook a knowledge check and a module quiz, in which you proved your understanding of the concepts you encountered by answering a series of questions. You've now reached the end of this module summary. It is time to move on to the discussion prompt, where you can discuss what you've learned with your peers. You'll then be invited to explore some additional resources to help you develop a deeper understanding of the topics in this lesson. Best of luck; we'll meet again during next week's lessons. You're nearing the end of this course on preparing data for analysis in Microsoft Excel. You've put great effort into this course by completing the videos, readings, quizzes, and exercises, and you should now have a stronger grasp of several foundational concepts for understanding data analysis. These include the fundamentals of working with data in Microsoft Excel, creating and using formulas and functions in Excel, and preparing data for analysis using functions. You're now ready to apply your knowledge in the exercise and the final course assessment. The assessment is a graded quiz that consists of 30 questions related to topics you covered throughout the course. But before you start, let's recap what you've learned. In the first week, you were introduced to Microsoft Excel. You learned how to use Excel by exploring how to enter and format data, manage worksheets, read large blocks of data, and sort and filter data. Microsoft Excel is a useful data analysis tool; it is used in everyday business to store, calculate, and gain insights from data. You learned how to navigate Excel using its UI, for example, the title bar that displays the name of your file and the search option, and the commands, which are organized into tabs and ribbons. You also learned that a worksheet is where you input data into cells, and that data can be added to worksheets by importing it or creating it manually. Data isn't always easy to read, but you've learned how to use formatting to improve the readability of a spreadsheet, and you also explored the keyboard shortcuts for data entry and formatting. Excel has various features that help you to read large blocks of data: you learned that you can use the freeze panes, new window, and name box features, as well as keyboard shortcuts, to make it easier to read your data. You can use the sort and filter feature to organize and sort data quickly and efficiently. There are also different sort methods, such as alphanumeric sort and multi-level sort, that you can use to sort your data, and the filter feature helps you to control data visibility in a worksheet and provides information on how many rows match specific criteria. In the following week, your focus shifted to functions and formulas in Excel. You discovered that a formula in Excel is a calculation performed on the values in a range of cells in your worksheets; examples of these calculations include addition, subtraction, multiplication, and division. Once the calculation is completed, the formula returns a result, even if
You also learned how to control calculations. Excel controls calculations using the order of precedence, which means that Excel processes the mathematical operators in a formula according to the hierarchical position of each symbol within that order. You learned about the hierarchy of symbols and discovered that you can also control a calculation using parentheses.
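As a quick illustration of how the order of precedence and parentheses interact (the numbers here are arbitrary):

=2+3*4 returns 14, because Excel performs the multiplication 3*4 before the addition.
=(2+3)*4 returns 20, because the parentheses force the addition to be calculated first.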
Next, you explored relative and absolute cell references. These concepts relate to how a cell reference behaves in a calculation: a relative cell reference means that Excel adjusts the cell reference of a copied formula relative to its new location to make sure it's correct, while an absolute reference means that Excel keeps the reference constant; in other words, it doesn't adjust it. You also learned about functions, which are predefined formulas built into Excel. You explored popular functions such as SUM, AVERAGE, and COUNT, and learned how to create formulas with them using features such as the AutoSum shortcut and the Insert Function wizard. You also explored different percentage calculations and learned how to create reliable percentage formulas using the correct syntax.

The third week was all about preparing data for analysis using functions. You started off by exploring how inconsistent data affects analysis and the mistakes that can be made when inputting data. Examples of these errors include misspellings, unnecessary characters and spaces, and incorrectly placed entries. You now know that errors such as these have a negative impact on data analysis, and you learned how to fix them in your data before submitting it for analysis. It is important to standardize text data before analyzing it, and you can do this using functions. The LEFT, MID, and RIGHT functions are used to return a specific number of characters from the left, the middle, or the right side of a cell entry; typically, these functions are used in situations where you need to transfer parts of the cell content to a different column. The TRIM function removes empty spaces from text strings, except for the spaces between words; this is useful when you suspect that there are stray spaces at the beginning or end of an entry. You also learned that there are three functions, UPPER, LOWER, and PROPER, that you can use to standardize the case used in text entries; your reports will look tidy and professional if you standardize the case. You can also use the CONCAT function to combine entries from different cells in a spreadsheet into a single cell entry.

Next, you discovered that dates are important for data analysis: without date and time data, it is more difficult to analyze and compare results over time. You explored functions such as TODAY and NOW, which help you add dynamic date and time information to your worksheet, and you learned that other functions, such as YEAR, MONTH, and DAY, can be used to split dates into their component parts to facilitate analysis. Finally, you learned how logical functions such as IF and IFS add another dimension to calculations, because they ask Microsoft Excel to check for criteria and perform different actions depending on the result. You then explored how other functions, such as OR and AND, make the logical formulas you create even more efficient and versatile, and you learned how to produce specific and targeted formulas using functions such as SUMIF, AVERAGEIF, and COUNTIF, which combine the IF functionality with the actions of standard functions such as SUM.
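To tie these functions together, here is a brief, hedged sketch of each one in action, assuming a hypothetical worksheet where column A holds raw customer names, column B holds sale dates, and column C holds sale amounts, with an overall total in C100:

=TRIM(A2) removes stray spaces from the beginning and end of the entry in A2.
=PROPER(A2) standardizes the case, so an entry like "jANE smith" becomes "Jane Smith".
=LEFT(A2,3) returns the first three characters of A2; MID and RIGHT work similarly for the middle and right side.
=CONCAT(A2," - ",C2) combines the name and the amount into a single cell entry.
=TODAY() inserts the current date, and =YEAR(B2) extracts the year from the date in B2.
=IF(C2>=500,"High","Standard") labels a sale based on one criterion.
=IFS(C2>=1000,"Top",C2>=500,"High",TRUE,"Standard") tests several criteria in order and returns the result for the first test that is true.
=SUMIF(A:A,"Jane Smith",C:C) adds up only the sales belonging to a specific customer.
=C2/$C$100 is a reliable percentage formula: the absolute reference keeps the denominator fixed on the total as the formula is copied down the column.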
Now that you've built a solid understanding of the fundamentals of Excel formulas and functions and learned how to prepare data for analysis, you're ready to test your knowledge in the exercise and the final course assessment. Best of luck!

Congratulations! You have made it to the end of the Preparing Data for Analysis in Microsoft Excel course. Your hard work and dedication have paid off. You're off to a great start on your data analysis learning journey, and you should now have a thorough understanding of the fundamentals of Microsoft Excel, working with blocks of data in Excel, formulas and functions, and how to prepare data for analysis using functions. You can also identify common errors made in data analysis, and you know how to deploy different strategies to make sure you have reliable data. But that's not all: you've also gained valuable insight into the functions and formulas you can use to create in-depth data for analysis. You've explored various calculations, deepened your knowledge of how data analysis can be performed, and reviewed scenarios where it is used. And let's not forget the process of preparing data for analysis; you now understand the critical role that reliable data plays as a central focal point of data analysis. You should now have a firm knowledge of how Microsoft Excel works and how it can be used for data analysis. Think about everything you can do with this new knowledge, and well done for taking the first steps towards your future data analysis career.

By successfully completing all the courses in this program, you'll receive a Coursera certification. This program is a great way to expand your understanding of data analysis and gain a qualification that will allow you to apply for entry-level jobs in the field. All the courses in this program, including the one you just completed, will help you prepare for the PL-300 exam. By passing the exam, you'll become a Microsoft Certified Power BI Data Analyst, which will also help you start or expand a career in this role. This globally recognized certification is industry-endorsed evidence of your technical skills and knowledge. The exam measures your ability to perform the following tasks: prepare data for analysis, model data, visualize and analyze data, and deploy and maintain assets. To complete the exam, you should be familiar with Power Query and the process of writing expressions using Data Analysis Expressions, or DAX; you'll learn about the syntax later in this program. You can visit the Microsoft certifications page at http://www.learn.microsoft.com/certifications to learn more about the Power BI Data Analyst Associate certification and exam.

This course has enhanced your knowledge and skills in the fundamentals of data analysis, but what comes next? There's more to learn, so it's a good idea to register for the next course on harnessing the power of data in Microsoft Power BI. The next course will cover various ways data analysis is used in business. You'll learn about the role of a data analyst, how to use data to solve business problems, and how to process and analyze data. Then you'll move on to the tools needed to analyze data efficiently. Whether you're just starting out as a novice or you're a technical professional, completing the whole program demonstrates your knowledge of analyzing data in Power BI. You've done a great job so far, and you should be proud of your progress. The experience you've gained will show potential employers that you are motivated, capable, and not afraid to learn new things. It's been a pleasure to embark on this journey of discovery with you. Best of luck in the future!

Hello, and welcome to the Harnessing the Power of Data with Power BI course. This course covers the core concepts of data analysis and introduces the main features of Microsoft Power BI. Many of your normal digital activities generate data. This can happen when you use services such as car parking or travel by rail or air, or through your shopping, socializing, or fitness activities. Of course, it's not just you who contributes data; your friends, family,
and colleagues do too; in fact, almost everyone adds content to the data pool. Businesses and organizations also use many other sources, such as government, financial, economic, health, and scientific data, to name a few. Gathering and storing a vast amount of data is the first phase; then comes the challenge of its analysis. This is why there is a growing demand for data analyst professionals. Businesses need data analysis more than ever, and as a data analyst, you'll be ideally placed to begin harnessing the power of data. In this learning path, you will learn about the life and journey of a data analyst and the skills, tasks, and processes they go through in order to tell a story with data. You'll discover how getting that data analysis story correct enables businesses to make informed decisions.

Let's get an overview of the main topics covered in this course. You may have already learned about one crucial topic: preparing data using Microsoft Excel. You also need to understand other elements involved in a data analysis career, including learning about the stages in the data analysis process and the roles involved, recognizing key issues and concerns when conducting analysis and sharing results, and knowing different types of data sources and connection types. This course will give you a solid foundation in these topics and introduce you to the component elements of Microsoft Power BI, software that helps to process, analyze, and share data.

Let's now quickly summarize the course material to give you an overview of all your study in this course. This course will introduce you to data analysis in business, data sources, and data ingestion. To begin, you'll learn about the role of a data analyst, key data analysis concepts, and how data plays an essential role in business. You'll then be briefly introduced to Power BI as a tool for data analysis. You will also learn about data sources and the extract, transform, load, or ETL, process. You'll learn the importance of identifying and evaluating data sources, and following this, you will learn about transforming and cleaning data in Power BI. You'll also get to distinguish between the different query and scripting languages. To consolidate your learning and put it into practice, you will complete a practical assignment in which you use data to determine the cause of a recent decrease in sales. Practical exercises in the course are based on a fictional business called Adventure Works. During the exercise, you must identify stakeholders, locate data sources, perform data transformation, and distribute reports. After this hands-on learning, you will complete a final graded assessment. Be assured that everything you need to complete the assessment will be covered during your lessons, with each lesson made up of video content, readings, and quizzes to assist your learning. You will also get to apply your newly gained skills in exercises, quiz questions, and self-reviews. In addition, discussion prompts allow you to share knowledge and discuss difficulties with other learners. These discussions are also a great way to grow your network of contacts in the data analysis world, so be sure to get to know your classmates and stay connected during and after your course.

Is this the course for you? Hopefully, the outline of the course content and topics will help you decide. It's important to mention that you don't need an IT-related background to take this course; it's for anyone who likes using technology and has an interest in data analysis. Whatever your background, to complete this course you need to have access to some resources:
You need a laptop or desktop computer with a recommended 4 GB of RAM, an internet connection, and a Windows operating system, version 8.1 or later, with .NET Framework version 4.6.2 or later installed, plus a subscription to Microsoft Office 365. You'll also need to install Power BI Desktop, which is available as a free download. You'll find further details about these and other requirements in the additional resources item at the end of this lesson.

This program prepares you for a career in data analysis. When you complete all the courses in the Microsoft Power BI Data Analyst Professional Certificate, you earn a Coursera certificate to share with your professional network. Taking this program not only helps you become job-ready but also prepares you for an exam: PL-300 Microsoft Power BI Data Analyst. In the final course, you'll recap the key topics and concepts covered in each course, along with a practice exam. You'll also get tips and tricks, testing strategies, useful resources, and information on how to sign up for the exam. Finally, you'll test your knowledge in a mock exam mapped to the main topics in this program and the Microsoft Certified Exam PL-300, ensuring you're well-prepared for certification success. Earning a Microsoft certification is evidence of your real-world skills and is globally recognized. A Microsoft certification showcases your skills and demonstrates your commitment to keeping pace with rapidly changing technology. It also positions you for increased skills, efficiency, and earning potential in your professional roles. The topics covered in the practice exam include: prepare data, model data, visualize and analyze data, and deploy and maintain assets. In summary, this course introduces how a data analyst uses data to create a compelling story through reports and dashboards in Microsoft Power BI. It also explores the need for true business intelligence in the enterprise. I hope you are ready to get started on your data analysis journey.

Data is an essential business component, and organizations use many methods to collect their data. However, raw data is only meaningful with proper interpretation and analysis. That's where the work of a data analyst is crucial, because data is often used to inform decisions that can significantly impact an organization's success. Data analysts are essential to business: they help organizations make sense of the vast amounts of collected data. In this video, you will explore the role of a data analyst, the flow of data in an organization, and how an analyst achieves the data insights that inform decisions. You'll also learn about the importance of data analysis in modern organizations and the vital role of the data analyst.

Data analysts help organizations make sense of the data they collect, turning it into insights that inform decisions. Let's explore the responsibilities of a data analyst and discover how they achieve data insights. Imagine you work for an online retail company. Every day, your company collects data on customer purchases, website traffic, and social media engagement. However, the data is not organized, which makes it difficult to analyze. The inability to interpret the data means your company fails to identify opportunities to improve customer experience, increase sales, and stay ahead of the competition. This is why a data analyst is needed. The data analyst is responsible for collecting, organizing, and analyzing the data to generate insights that inform business decisions. For example, the data analyst may identify trends in customer behavior that could inform marketing campaigns
or website design. They may also identify areas where the company can cut costs or improve efficiency.

Strategic thinking, awareness of impact, and understanding of context are crucial skills for a data analyst to succeed in their role. Here's why each skill is important. Strategic thinking helps data analysts prioritize tasks, allocate resources efficiently, and make data-driven decisions that contribute to long-term success. By considering both short-term and long-term implications, data analysts can ensure their work has a meaningful impact on the organization. Being aware of the potential impact of their analysis is critical for data analysts to ensure they communicate their findings responsibly and ethically. This involves understanding the consequences of data-driven recommendations, considering potential biases, and ensuring data privacy and security. Awareness of impact also helps data analysts advocate for data-driven decision-making and fosters a culture of evidence-based strategy within the organization. Finally, data analysts need a deep understanding of the context in which they are working, including the industry, market trends, and the organization's goals and challenges. This knowledge allows them to tailor their analysis to the specific needs of the business and provide actionable insights.

Data analysts use various tools and techniques to collect and analyze data. These include programming languages like R and Python (R is used specifically for data analysis, while Python is a general-purpose programming language that can be used for a wide range of applications, including statistical analysis), data visualization tools like Microsoft Power BI, and databases like SQL Server. Data analysts are expected to be proficient in these tools and technologies and to possess excellent analytical skills. A data analyst collects data from many sources, including customer, sales, financial, and operational data. Departments within an organization, such as marketing, sales, finance, and operations, provide this data. The data is then processed, cleaned, and transformed into a usable format for analysis; this process is known as data wrangling. Once the data is wrangled, it is loaded into a data warehouse or data lake, where data analysts can access and analyze it. The data is organized into tables or data sets, each containing a specific type of data. Data analysts then use this data to generate insights that inform business decisions. Data analysts play a critical role in our data-driven world: they help organizations make sense of the large amounts of collected data, turning it into insights that inform decisions. Using their skills, data analysts help organizations identify growth opportunities, improve operations, and gain competitive advantage.

Someone at a party asks you, "What do you do?" You reply, "I work with data." Does that help them? Data roles are a mystery; most people don't understand the value and variety of positions in the data analysis process. Let's demystify data analysis roles and responsibilities in this video by exploring various roles and describing how they contribute to the success of data-driven organizations. You'll also learn about the importance of each role and how the roles collaborate. The data analysis roles and responsibilities that you'll explore are: data engineer, data analyst, data scientist, database administrator, data architect, and business intelligence analyst, commonly called BI analyst.

To understand a data engineer's role, imagine you're creating a garden. The data engineer is like the person who designs and constructs the
irrigation system, delivering water to each plant. They build and maintain the data infrastructure, including designing, constructing, and integrating data pipelines. They clean, pre-process, and transform raw data into a format that can be used by data analysts and data scientists.

In our gardening analogy, the data analyst is like the gardener who meticulously observes the growth of each plant and makes recommendations for improvement. Data analysts examine data sets to identify trends, patterns, and insights to inform decision-making. They use various tools and techniques to visualize and present data, making it easily digestible for stakeholders. Data analysts work closely with other team members to align their analysis with business goals and objectives.

Think of a data scientist as a botanist, using their knowledge of plant biology to optimize the growth and health of the garden. They dive deeper into the data to create predictive models using machine learning algorithms and statistical techniques. They seek to identify hidden patterns and correlations that help organizations make better data-driven decisions. Data scientists often work closely with data analysts, sharing insights and collaborating on projects to maximize the value of the data.

In any garden, you'll also want to safeguard the security and overall health of the garden; that's like the role of a database administrator, or DBA. Database administrators work on the maintenance, performance, and security of an organization's databases. They ensure data is stored and retrieved efficiently, implement backup and recovery strategies, and manage user access. DBAs play a crucial role in keeping data safe and accessible to those who need it.

To ensure a great-looking garden, a landscape architect designs the garden layout to maximize aesthetics and functionality. In a similar fashion, a data architect creates the blueprint for an organization's data management systems. They design data models, establish database structures, and create strategies for data storage, integration, and retrieval. Data architects collaborate with other data professionals to align their designs with business needs and support the objectives of data analysts and scientists.

The business intelligence, or BI, analyst is like the garden consultant who helps you make informed decisions about the type of plants to grow, where to place them, and how to care for them, based on data and analysis. BI analysts transform data into actionable insights that drive business growth and improve decision-making. They work closely with data analysts and data scientists to extract meaningful insights from complex data sets, focusing on key performance indicators and using various BI tools to visualize and present data to stakeholders. BI analysts also collaborate with business leaders to understand their goals and objectives, ensuring that their analysis is relevant and impactful.

So, the next time you're at a party and someone asks about your role, what will you say? You should be able to highlight the importance and variety of data analysis positions. You could discuss the data engineer, who is responsible for building and maintaining the data infrastructure; the data analyst, who identifies trends, patterns, and insights in the data; the data scientist, who creates predictive models to optimize decision-making; the database administrator, who ensures the security and performance of databases; the data architect, who designs the blueprint for data management systems; and the business intelligence analyst, who transforms data into actionable insights
for decision-makers. Your party friends will then understand what each role does in the data analysis process, providing organizations with the information they need to make informed, data-driven decisions.

Jamie, the CEO at Adventure Works, has asked you to analyze customer data to identify trends and make recommendations for improving the customer experience. After weeks of working through the data, creating detailed visualizations, and uncovering valuable insights, you now need to present your findings to various stakeholders. These include your team, marketing, sales, and company executives. For your project to be successful, you need to effectively communicate your findings and collaborate with people at all organizational levels. To succeed as a data analyst, you need a strong foundation in non-technical abilities like these, in addition to technical skills. In this video, you will explore some essential non-technical, or soft, skills a data analyst should have. Non-technical skills are important for data analysts because they can help you connect with and influence stakeholders, increasing your impact within your organization. Essential non-technical skills include effective communication, diplomacy, understanding end-user needs, and being a technical interpreter for non-technical stakeholders. Let's explore each skill in more detail.

The first soft skill is effective communication. Data analysts need to effectively communicate findings to various stakeholders with different degrees of technical knowledge. For example, when Jamie at Adventure Works asks you to analyze customer data, you would need to present your findings to team members, managers, and executives. To communicate effectively, data analysts need to present complex information clearly and concisely. Imagine you have identified a trend in Adventure Works data that could significantly increase sales. Instead of overwhelming your audience with raw data, you could visually represent this trend and use storytelling techniques to explain how it could impact the business.

Another important non-technical skill is diplomacy, which is the art of navigating delicate situations and maintaining positive relationships, even when disagreements arise. As a data analyst, diplomacy may be essential for negotiating access to data, mediating disagreements among stakeholders, or presenting results that challenge existing beliefs. For instance, you might have to present a report that contradicts a manager's idea. By being diplomatic, you can share your findings in a way that maintains trust and respect while still communicating your insights.

Collecting and analyzing data is not sufficient for making an organizational impact. Data analysts also need to understand the needs of the end users of their reports. This leads to findings that are relevant and useful to the stakeholders who will use them; as a result, stakeholders can use the insights from your reports to take action and make informed business decisions. Understanding the analytical needs of a business involves asking questions, empathizing with users' perspectives, and collaborating with stakeholders to identify the most valuable insights. Imagine you are analyzing customer data for a marketing team: by understanding the marketing team's goals and customers' frustrations, you can tailor your analysis to provide more useful and relevant insights.

Because data analysts often serve as a bridge between technical and non-technical stakeholders, it's important to be able to translate complex concepts into understandable terms. This is especially so
when relaying information to stakeholders who lack a technical background. One way to do this is by using analogies or metaphors to explain technical concepts, for example, comparing machine learning algorithms to a chef who improves their recipes over time based on customer feedback. Ultimately, becoming a successful data analyst goes beyond mastering technical skills. It also requires effective communication, diplomacy, a thorough understanding of the needs of end users, and the ability to relay findings and concepts to stakeholders of varying technical knowledge. By developing these non-technical skills, you can better collaborate with stakeholders, create actionable insights, inspire change, and make lasting impacts, enriching your own career and contributing to the growth and success of those around you. I hope this thought will inspire you as you continue your journey to becoming the best data analyst you can be.

If you needed to assess the prospects for a new bicycle launch in the USA by Adventure Works, you wouldn't collect data about sports clothing from the European market, would you? No, because no matter how great your analysis is, this data will not provide insights that Adventure Works can use to make informed decisions about a product launch in the USA. That's why gathering the right data is an important part of the data analysis process. In this video, you'll explore how the objective, or purpose, of an analysis informs the data analysis process. You'll learn the importance of gathering data that is aligned with this purpose and how it influences the type and scope of data used.

Gathering the right data is crucial for conducting a successful analysis. However, before you can start collecting data, it's essential to determine and understand the purpose or goals of the analysis. You can then collect the appropriate data to conduct an analysis that is focused, relevant, and useful for the end user of the analysis. To determine the purpose of your analysis, you will need to consult with stakeholders and consider the questions you aim to answer with the analysis, such as "What are the recent sales figures for bike A and bike B?", and the insights you hope to gain through the patterns, trends, or relationships that emerge from the analysis, such as how the introduction of bike B to the market is affecting the sales of bike A. For example, in the case of Adventure Works, you might need to brainstorm with marketing manager Renee and the sales and marketing team to determine what they hope to achieve with the analysis. The purpose of your analysis will inform what the right data to collect is, including the type and scope of the data to gather and use in the analysis. The type and scope of data used then influence the conclusions drawn and the decisions made.

Let's explore how the purpose of the analysis can influence the type and scope of data used in the analysis. The type of data refers to the format or structure of the data, for example, sales figures and numerical data. Suppose, through consultation, you determine that the primary goal of the analysis for the sales and marketing team at Adventure Works is to determine which bicycle models are the most profitable in the USA. In this case, the type of data you might choose to focus your analysis on is sales data, which includes information on the total sales of each bicycle model, the number of units sold, and the revenue generated by each model. However, if the team is more interested in understanding which products American customers are interested in buying and how to improve the product purchasing
experience, customer feedback data may be more useful than sales data. This might involve collecting customer reviews, ratings, and comments on each bicycle model, as this data can provide valuable insights into customer preferences and help identify areas for improvement. These examples demonstrate the role that identifying and defining the end goal, or purpose, of the analysis plays in determining what data is relevant and should be collected.

Aside from considering the type of data appropriate for achieving the aims of your analysis, you also need to define the scope of your data in relation to the analysis purpose. Considering the scope of your data includes defining the boundaries or limits of the data you'll collect and use in your analysis, such as geographical regions, time periods, or product categories. It can also include the size or amount of the data and the number of variables considered. To illustrate: if Adventure Works stakeholders would also like to use the analysis to inform the development of a new bike in the USA, you might decide to analyze market trends, competitor data, and sales data from the past two years, focusing on mountain bikes and road bikes in North America. By defining the scope of the data, you can ensure that you collect data that is useful for understanding the relevant product market and identifying potential product development opportunities for Adventure Works. Ultimately, by carefully defining the type and scope of your data based on the purpose of your analysis, you can collect relevant data. This helps ensure that your analysis is accurate and relevant to the needs of the business, addressing the specific objectives or goals of the project. This video highlighted the importance of identifying the purpose of your analysis and then gathering relevant data of the appropriate type and scope for a successful analysis. This ensures that the analysis results are meaningful and useful, helping businesses like Adventure Works unlock insights and make informed decisions. As you continue to develop your data analysis skills, remember that the foundation of any successful analysis lies in gathering the correct data.

You might think that a business like Adventure Works is a great place for data analysis: it has access to large amounts of data from a variety of sources, like sales, manufacturing, purchasing, and marketing. However, that data, while valuable, is often not in a form that is easily understandable or ready for analysis. This is where the process of preparing and analyzing data comes in. In this video, you'll learn about the importance of processing and analyzing data for transforming raw data into valuable insights that can drive strategic decisions. You'll be introduced to the extract, transform, load, or ETL, process, a common method for processing data. You will also learn how using calculations and visualizations during analysis can help uncover hidden patterns and trends in the data.

First, let's define what is meant by processing and analyzing data. Processing data refers to transforming raw data into a format that can be easily understood and analyzed. Analyzing data involves using various techniques to explore, interpret, and draw meaningful conclusions from the processed data. For Adventure Works, processing data might involve consolidating data from multiple sources, such as sales transactions, customer demographics, and product inventory. This is because the data in its raw form may be scattered across different databases, spreadsheets, and even paper records. Additionally, the data may be
in various formats, have missing values, or contain duplicate entries. In this case, processing the data would involve cleaning, organizing, and transforming the data into a format that is more suitable for analysis. A common data processing method is the extract, transform, load, or ETL, process. The ETL process involves extracting data from various sources, such as databases or files; transforming the data to make it consistent, accurate, and ready for analysis, for example by cleaning and filtering it; and loading the transformed data into a suitable destination, like data repositories, databases, or analytical tools, for further analysis. This process, which you will learn about in greater depth later, plays a crucial role in preparing raw data for analysis.

Now that you have a general understanding of data processing, let's explore some methods of data analysis. One effective way to analyze data is by performing calculations on the processed data to reveal new insights. For example, Adventure Works can calculate its products' total revenue, profit margin, or average order value. These calculations can help the company identify which products are performing well and which might need improvement.
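As a preview of the DAX calculations covered later in the program, measures like these might be used for those three figures. This is a minimal sketch, assuming a hypothetical Sales table with Revenue, Cost, and OrderID columns; the table and column names are illustrative, not from the course data:

Total Revenue = SUM ( Sales[Revenue] )
Profit Margin % = DIVIDE ( SUM ( Sales[Revenue] ) - SUM ( Sales[Cost] ), SUM ( Sales[Revenue] ) )
Average Order Value = DIVIDE ( SUM ( Sales[Revenue] ), DISTINCTCOUNT ( Sales[OrderID] ) )

DIVIDE is used instead of the / operator because it returns a blank rather than an error when the denominator is zero.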
Another powerful technique for analyzing data is data visualization. Visualizations, or graphical representations of data such as charts and graphs, can communicate complex information in a simpler way and make complex data easier to understand. They can also help uncover patterns, trends, and relationships within the data that might not be apparent through calculations alone. For instance, Adventure Works could create a bar chart to compare the total sales of different product categories, or a line chart to track monthly revenue over time. Visualizations like these can help the company quickly identify trends, spot potential issues, and make more informed decisions.

In summary, processing and analyzing data is critical to transforming raw data into actionable insights. Through the ETL process, data can be extracted, transformed, and loaded into a format that is suitable for analysis. Once the data is processed, calculations and visualizations can be used to explore the data, uncover hidden patterns, and generate new insights to drive strategic decisions. As you progress in this course, you will learn more about the various tools and techniques available for processing and analyzing data. By mastering these skills, you will be better equipped to help businesses like Adventure Works maximize the value of their data and make data-driven decisions that drive growth and success.

Jamie Lee, owner and CEO of Adventure Works, is concerned that sales have been stagnant and wants to take her business to the next level. She's aware of the power of data insights to drive business decisions, so she employs Adio Quinn, a data analyst, to help provide the answers she needs to grow her company. In this video, you'll explore how data insights can be used in the final stage of the data analysis process to drive business. Using a case study, you'll discover how these insights can empower stakeholders like Jamie to make informed decisions and improve business performance. Data insights refer to the valuable and actionable information, knowledge, and understanding generated from analyzing data. This is the final stage of data analysis, where insights can be used to identify trends, patterns, and opportunities. These insights can then lead to actionable business decisions that help businesses grow and stay ahead of the competition.

Let's explore how data insights can drive business decisions in practice by considering how Jamie could use insights related to sales, customer, and competitor data to make decisions that improve business performance at Adventure Works. By analyzing sales data collected over the past year, Adio identifies that certain types of bicycles sell more during specific seasons, like mountain bikes in the spring and road bikes in the summer. Using this data insight, Jamie can make informed decisions about inventory and promotional efforts. For example, she could make sure that the warehouse is sufficiently stocked with each bike type based on seasonal demand levels and have the marketing team offer special promotions to boost sales of the bikes in their off-seasons. By making decisions based on data insights, Jamie can optimize her inventory management and increase overall profitability.

Suppose Adio also discovers that customers belonging to particular age groups prefer specific bicycle types or respond more positively to particular marketing messages. Jamie can use this information to oversee the creation of targeted marketing campaigns, offerings, and communications that resonate with different segments of the company's audience. By personalizing marketing efforts based on customer data insights, Jamie can increase customer satisfaction and loyalty and drive more sales and revenue.

Imagine Adio's analysis reveals a gap in Adventure Works' current offerings, with customer data indicating that customers are increasingly interested in electric bikes and unique design features. With insight into this growth opportunity, Jamie can explore the development of new products to meet these demands, making decisions related to product development and innovation for Adventure Works. This data-driven approach to product development ensures that businesses create products that cater to real customer needs, increasing the likelihood of success.

Another area where data insights can drive business decisions is pricing strategy. Sales data, competitor pricing, and customer feedback can help stakeholders like Jamie determine optimal price points for products, balancing demand, revenue optimization, and market competitiveness. For example, say Adio finds that customers at Adventure Works are willing to pay a premium for certain high-quality bicycles. Jamie can then adjust the company's pricing strategy accordingly to capture more value from those sales. However, if some bicycles are priced too high and are hurting overall sales, Jamie can consider lowering their prices to create demand. By using data insights to inform pricing decisions, businesses can optimize revenue and profitability.

Stakeholders and data analysts alike can follow some best practices to enhance the use of data insights to drive business decisions. For a comprehensive understanding of a business, its operations, and its trends and patterns, it's important to gather data from multiple sources and analyze it regularly. Regular data analysis makes it possible to stay up to date with trends and make timely, informed decisions. It's also important to encourage a data-driven culture in which data insights are valued and used to inform decision-making at all levels. Likewise, encouraging collaboration and insight sharing within an organization can lead to better decision-making. Finally, investing in the right tools and technology, like Microsoft Power BI, can help streamline the data analysis process, making it easier to gain insights and make data-driven decisions. You should now have a better understanding of how data insights can drive business.
By embracing a data-driven approach, companies can stay ahead of the competition and make better business decisions. Ultimately, the more stakeholders like Jamie understand their data, the better equipped they'll be to make informed, strategic decisions that can optimize business performance for their company.

Imagine navigating through a dark maze without a map, searching for hidden treasure. This is what it feels like to dive into a vast ocean of data without the right tools. Microsoft Power BI offers a solution to the challenge of navigating large amounts of data and uncovering useful insights. In this video, you'll learn about Power BI's role in data analytics and visualization, its key features and benefits, and how to navigate its user interface. Power BI is a suite of business analytics tools that helps organizations transform raw data into meaningful information and make data-driven decisions. There are several products within the Power BI ecosystem, including Power BI Desktop, the Windows application for creating reports and dashboards that you'll use throughout this course, and others such as Power BI service, Power BI Mobile, Power BI Report Server, and Power BI Embedded. These components work together to provide a comprehensive business analytics solution, allowing you to connect to various data sources, clean and prepare data, create impactful visualizations and reports, and share findings and insights effectively.

Power BI has become an essential resource for many organizations across various industries. Let's explore why. Power BI is user-friendly: its easy-to-use, intuitive interface makes it accessible to technical and non-technical users alike, and with its drag-and-drop functionality, you can create visualizations, reports, and dashboards simply and quickly. Another benefit of using Power BI is data integration. It supports a wide range of data sources, including traditional databases, Excel spreadsheets, and cloud-based services, which allows you to consolidate data from multiple sources and create a comprehensive view of business performance. Power BI simplifies data transformation: with the Power Query Editor, you can clean, transform, and reshape data as needed, which is important to ensure that data is accurate, consistent, and ready for analysis. There are also rich visualization options available in Power BI, with a variety of built-in visualization types, such as bar charts and maps, and custom visuals developed by the community. These options make it easy for you to present data in a visually appealing and easy-to-understand way. You can perform advanced analytics with Power BI: with Data Analysis Expressions, or DAX, and built-in analytical capabilities, you can perform complex calculations and data analysis, leading to deeper insights and better decision-making. Plus, you can easily collaborate on and share reports and dashboards with colleagues, both within and outside the organization. Power BI is scalable and designed to grow with organizations: its various licensing options and features can accommodate businesses of all sizes, and the platform can scale to meet changing business needs. Finally, Power BI integrates seamlessly with other Microsoft products, such as Excel, SharePoint, and Teams, and offers a cost-effective pricing model.

Now that you have some insight into why Power BI is one of the most popular data visualization and business intelligence tools, let's examine its user interface. To get started with Power BI, you'll need to download and install Power BI Desktop, the primary application for designing and creating reports and dashboards. Once you
have Power BI Desktop installed, you can begin exploring the main areas of its user interface. You can use the ribbon, located at the top of the Power BI Desktop window, to quickly access various tools and features to create and customize your reports and dashboards. It contains several tabs, such as Home, Insert, Modeling, and View, and each tab has its own collection of buttons and options for performing common tasks, like connecting to data sources, creating visualizations, and formatting your reports. In the left navigation pane, you can select Report to open report view. Report view is the primary canvas where you design and create your visualizations; you can add and arrange different visual elements here, like charts, tables, maps, and more, to build your report. Pages allow you to create multiple views of your data in a single report. At the bottom of the Power BI Desktop window, you'll find a row of tabs; you can use these to organize your visualizations based on themes or categories, and to add, duplicate, or remove pages. The Visualizations pane is located on the right side of the window and contains a gallery of visual elements that you can add to your report. There are various types of visuals available, which you can add to your report by clicking them or dragging them from the Visualizations pane onto the report view. Also on the right side of the window is the Fields pane. It displays the data tables and fields available for your report; as you learn to build reports in Power BI, you'll use the Fields pane to populate your visualizations with data. The Fields pane is organized into two sections: the top section displays the available tables, and the bottom section shows the fields within the selected table. Last, the Filters pane, found on the right side of the window, allows you to apply filters to your data at various levels, such as the entire report, individual pages, or specific visualizations. In this video, you discovered the benefits of using Power BI as a business intelligence tool and explored its user interface. By understanding its key features and capabilities, you're one step closer to using Power BI to create reports that communicate your insights effectively and drive meaningful change.

Businesses like Adventure Works often have a large amount of data but don't know how to extract the insights hidden within. In this video, you'll discover how calculations and visualizations in Microsoft Power BI are used to analyze this data, generate and communicate insights, and empower businesses to make data-driven decisions. You'll learn the key concepts behind calculations using Data Analysis Expressions, or DAX, and how visualizations can communicate complex data and insights.

In Power BI, calculations are the foundation of your data analysis and are created using a powerful language called Data Analysis Expressions, or DAX. Calculations allow you to perform specific operations on data, manipulate it, and create new calculated measures, columns, and tables that you can use in visualizations and reports to drive decision-making. With custom calculations, you can tailor your analysis to specific business requirements and address unique analytical needs. Some common calculations, illustrated in the sketch that follows, are: aggregations, where multiple values are combined or grouped into a single value to summarize large amounts of data, for example, summing up, finding the average, or counting data points based on specific criteria; time-based calculations, for comparing data across time periods, such as month-over-month or year-over-year growth; and ratios and percentages, for calculating proportions or shares of a whole to understand the relative performance of different elements.
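To make these three categories concrete, here is a minimal sketch of one measure of each kind, assuming a hypothetical Sales table with Amount and Category columns and a related, properly marked Date table (all names are illustrative):

Total Sales = SUM ( Sales[Amount] )    // aggregation

Sales MoM Growth % =
VAR Previous =
    CALCULATE ( [Total Sales], DATEADD ( 'Date'[Date], -1, MONTH ) )
RETURN
    DIVIDE ( [Total Sales] - Previous, Previous )    // time-based comparison

Category Share % =
DIVIDE ( [Total Sales], CALCULATE ( [Total Sales], ALL ( Sales[Category] ) ) )    // ratio of a part to the whole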
To illustrate: with data on monthly sales, Adventure Works could use DAX to calculate the average monthly sales, determine the month with the highest sales, or identify the percentage of sales coming from a specific product category. After performing calculations with your data, the next step is to represent the results visually. Visualizations enable you to communicate complex data and insights in a simple, appealing way. By presenting data graphically, visualizations make it easier for stakeholders to grasp key insights, trends, and patterns that may be difficult to identify from raw data or tables. Power BI offers a wide range of visualization types, such as different charts, maps, tables, and even custom visualizations. When choosing the most suitable visualization, you should consider the type of data you're working with (for example, whether the data is numerical or categorical, consisting of non-numeric variables), the purpose of your analysis (such as comparing values, showing distribution, understanding relationships, or tracking trends), as well as the level of detail needed, from high-level summaries to granular insights.

Now let's explore how to create a visualization in Power BI using a given data set. Suppose you are part of a team analyzing sales data and creating a report for Adventure Works. You need to create a visualization that represents the number of orders across the different bike categories. To create your visualization, you first need to import your data. To do this, open Microsoft Power BI Desktop, click Get data in the Home tab, then select Text/CSV and click Connect. Navigate to the location of the CSV file containing the data you need, in this case the Adventure Works bike sales data, select it, and click Open. Once the data is loaded, the data view will display the imported data in a table format. Take a moment to familiarize yourself with the structure of the data. The next step is to create a bar chart of the bike sales by category. Click on the report view, which is the first icon on the left side of the Power BI interface. Next, click on the clustered bar chart visualization icon in the Visualizations pane; this is a bar chart with multiple bars. After that, drag and drop the product category field onto the Y-axis section of the Visualizations pane, then drag and drop the order quantity field onto the X-axis section. This bar chart visualization shows the total order quantity for each product category. It can help Adventure Works quickly identify which bike categories have the highest or lowest number of orders, and they can use that insight to make informed decisions about inventory management, marketing strategies, and product development. You've now gained a foundational understanding of calculations and visualizations in Power BI and their role in generating results and insights from data. You learned about using DAX calculations for data analysis and about using visualizations to communicate data insights and help businesses make data-driven decisions.

Congratulations on completing this first module on data analysis in business! Let's recap some key concepts that you covered. In lesson one, you were introduced to the course and syllabus, explored some tips for successfully completing the course, and engaged with your peers. In the second lesson, you learned more about the essential role data analysis plays in businesses, helping them collect, organize, analyze, and understand their data. Data analysis
can help businesses gain insights from their data, identify the causes of problems, uncover trends, and make decisions that can improve business performance. You were introduced to the stages of data analysis and the interconnected roles available within this process, from data engineers to business intelligence, or BI, analysts. You also explored some important skills data analysts need to succeed in their role, including non-technical skills like effective communication and understanding end-user needs.

In lesson three, you examined the stages of data analysis in more depth. These stages include identifying the problem or purpose of the analysis; collecting data; processing and analyzing data; data visualization and report sharing; and implementing insights and recommendations. You learned that gathering the right data is fundamental to an analysis that is relevant and useful, and that understanding the purpose of your analysis will inform the type and scope of data that is correct for the analysis. You then explored the processing and analyzing stages of data analysis once more: processing involves transforming raw data in preparation for analysis, and analyzing involves examining the processed data and generating insights. You were briefly introduced to the extract, transform, load, or ETL, processing method and learned about DAX calculations and visualizations in data analysis. You also learned about some factors to consider before sharing reports with stakeholders, including the accessibility, visual appeal, and security of your report, as well as data storage and refresh schedules. You discovered the importance of understanding stakeholders' experience and applying this to data visualization and analysis to more effectively convey data insights. You learned how data insights can drive informed business decisions and lead to improvements like increased customer satisfaction. You then explored some best practices for stakeholders and data analysts to follow to drive business decisions, including collecting data from multiple sources, regular data analysis, encouraging a data-driven culture, collaboration and insight sharing, and investing in the right tools and technology. You also had the opportunity to apply the knowledge gained in the lesson by evaluating an analysis process. Finally, you were introduced to Microsoft Power BI and its many benefits, including its user-friendly interface, rich visualizations, and advanced analytics. You learned how to navigate Power BI's user interface, set up your own Power BI Desktop environment, view a report, and generate interactive visualizations. You now know more about the role of a data analyst, the data analysis process, the role data analysts play in business, and Power BI as a tool for data analysis. With the foundational knowledge you've gained, you are ready to move on to your next lesson on harnessing the power of data in Power BI.

In previous lessons, you learned about the importance of data and the role it plays, and you discovered how organizations aim to derive meaningful insights from their collected data. In this context, it's necessary to identify the collected data and evaluate which parts of it are required. You could start a data project by first determining what is being measured and what critical issues you need to make decisions about; the answers will help you to identify and evaluate the data correctly. Now let's examine the process of data identification and evaluation in more detail. This process includes understanding the importance of asking the right questions, analyzing the data required for
a business decision, and classifying data types. By the end of this video, you'll understand data classification and modern data sources, and you'll learn how to use these in business decisions. Proper data evaluation depends on the key skills of identifying data sources and asking the right questions. Let's explore data evaluation at Adventure Works, a fictitious large multinational company that makes and distributes bicycles and accessories to global markets. Jamie, the CEO at Adventure Works, wants to analyze sales data to reveal the factors that influence the sales of their products. A good place to start the analysis is to streamline the business requirement from complex to simple and then establish relationships between any multiple topics. Let's take the example of identifying factors that affect sales. To do this analysis, you first need to determine the data to be measured and the potential factors that could influence it. For instance, this includes internal company data, data from social media, and sensor-generated data, such as product codes from barcode scanners or identity confirmation from facial recognition software.

Sales data is the main area that Adventure Works wants to assess, and a critical source of this information is their enterprise resource planning, or ERP, system. ERP systems are designed to collect, store, manage, and interpret structured data from various business activities. Structured data is data that is organized into a formatted repository, typically a database, so it's easily searchable. In the context of Adventure Works, everything in a physical store, from product shelves and product categories to points of sale, employees, and customers, is defined and stored in the tables of the ERP database. This kind of data structure creates a digital mirror of the real-world store and provides a highly efficient and effective way for Adventure Works to analyze sales data from various periods. Such analysis could be based on product category or type of customer, providing actionable insights into sales trends, customer behaviors, and product performance. How you evaluate the ERP database depends entirely on your perspective and analysis. Evaluation questions could be: Are sales generally showing a downward or upward trend? Are there seasonal increases or decreases in certain categories? How do holidays or special occasions affect sales? Have sales shown variability by age, gender, income level, or customer geographic location on a product or category basis?

Now let's consider other potential data sources for Adventure Works. In addition to the ERP data, it is useful to examine the situations that occur before or during a purchase. An excellent example of such a source is the sensors installed in the automatic doors of the store. The data from these sensors, revealing the number of people entering and exiting the store at any given time, can be categorized as semi-structured data. Semi-structured data falls between structured and unstructured data: while it doesn't conform to the formal structure of data models as seen in an ERP system, it contains tags or other markers to separate data elements and enforce hierarchies of records and fields within the data. The data obtained from door sensors might be tagged with information like timestamps, store identifiers, or locations, allowing for more detailed analysis. This data can be used to evaluate the store's visit intensity over different periods, offering an opportunity to correlate store traffic patterns with sales volume. This analysis could lead to insights about peak selling times, the effectiveness of promotions, or how staffing levels relate to sales.
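For instance, a single door-sensor reading might arrive as a tagged record along these lines (a purely hypothetical example; the field names are illustrative):

{ "timestamp": "2024-03-18T10:15:00", "storeId": "NYC-04", "door": "main", "entries": 12, "exits": 9 }

The tags give the record enough structure to query by store or time period, even though the readings don't live in a formal database schema.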
In addition, Adventure Works can analyze unstructured data flowing from social media channels to gauge the company's popularity and reputation. This can include online messages related to the company, social media check-ins, and photos and videos shared by customers. Unstructured data is information that doesn't have a predefined structure or isn't organized in a predefined manner, making it less straightforward to analyze. For Adventure Works, this social media data can be evaluated from different dimensions, such as the timing of posts or the demographic characteristics of the audience interacting online with the company. For instance, by conducting trend analysis, the company can gauge the popularity of its brands, products, or campaigns. This analysis can inform marketing strategies, customer engagement tactics, and product development.

With a robust strategy to identify and evaluate the correct data sources, companies like Adventure Works can harness the full potential of data to uncover actionable business insights. Each piece of data, whether structured, unstructured, or semi-structured, holds immense value. The true power of data lies not in its volume or variety but in its purposeful utilization. Remember, data itself is not the end goal; it's a tool to help businesses make more informed decisions. Therefore, it's vital to understand why you're using the data, how it serves your purpose, and what methods you'll use for its evaluation.

What's the best way to use Microsoft Power BI? As with other software, you may have your own preferred way to use it, and that's okay. However, in this video you will explore key Power BI components and discover their primary purpose. To achieve the best results, you must use these components in the proper order; that sequence of use is known as a workflow. Over the next few minutes, you'll get to know how a common workflow operates in Power BI.

Microsoft Power BI is an interactive data visualization product with multiple components. You use its components and its rich visualization features to create meaningful reports from different data sources and types of data. Let's explore the details of Microsoft Power BI's three main components: Power BI Desktop, the Power BI service, and Power BI apps. Power BI Desktop is a Windows-based desktop application that is mainly used by data analysts or report designers to clean, transform, and load data, create a data model, design reports, and publish those reports. Power BI Desktop uses Power BI connectors to access various data types and data sources. Connectors allow you to read data from various sources, including resources located in the local file system (such as Microsoft Excel or PDF documents), conventional database systems hosted on internal servers (called on-premises databases), cloud-based databases, and even external enterprise applications and application programming interfaces (APIs). The Power BI service is the cloud-based BI service, or software-as-a-service, part of Power BI; it is used by report users and administrators. Power BI apps is the native mobile application of Power BI, available on iOS, Android, and Windows. With these components and interfaces, Microsoft Power BI enables users from various disciplines, such as report designers, administrators, and business users, to use the product according to their roles. As mentioned earlier, the order in which you use these components is known as a workflow. A Power BI workflow can be described as the steps taken with data to create, publish, and share.
A typical workflow in Power BI often starts with the creation of a report in Power BI Desktop; report designers and developers are primarily responsible for this task. When the report is ready, you publish it to the Power BI service, where administrators can assign permissions and specific users can consume the report.

Now let's examine each step of the workflow in more detail. Create is about importing data and creating a report. This step is when you import your data sources into Power BI Desktop; clean, transform, and load your data in order to have targeted data for your reports; use your filtered data to create a report; and analyze and present your data using various visualizations and charts in your report. Then you move on to the Publish step of the workflow, where you publish reports and create dashboards. That means you publish your report to the Power BI service, share your data with others by creating dashboards, and use different visualizations and filters to make your data more understandable in your dashboard. The final step of this workflow is Share. In this step, you share dashboards with users and manage access to your data. Share your dashboards with the users who need them to make it easier to collaborate on projects, and manage access to your data by ensuring that dashboards have different user permission levels. This is also where you consider mobile usage: using the Power BI mobile apps, you can view and interact with reports and dashboards that have content pinned from reports, anytime and anywhere, and you can use different features of the mobile apps to explore and share your data from different perspectives.

In summary, a typical Microsoft Power BI workflow sequences the requirements needed to choose data sources and types in step one; step two is used to visualize the data; and the third and final step presents the resulting reports and dashboards to cater to different user types and their requirements. Using such a workflow, you combine different types of data from many sources using components such as Power BI Desktop, the Power BI service, and Power BI apps.

Have you ever tried to solve a jigsaw puzzle when the pieces are scattered everywhere and you don't even know whether those pieces belong to the same puzzle? That's what it can feel like as a data analyst tasked with extracting insights from data that is spread across multiple sources, formats, and structures. Not to worry, there's a way to solve this problem: the extract, transform, load (ETL) process. In this video, you'll build on your knowledge of the ETL process. You'll explore the three main components of the ETL process and how to apply them, the benefits of using the ETL process, and how it's performed using Microsoft Power BI.

As you learned earlier in this course, ETL stands for extract, transform, and load, the names given to the three main steps in the ETL process. This process involves taking raw data from various sources, preparing it for analysis, and loading it into a repository or data storage and management system. Let's explore each step of the ETL process in more detail and how it can be applied in the scenario of the manufacturing company Adventure Works, which produces and distributes bicycles and accessories.

Extract is the first step in the ETL process. It involves retrieving and extracting raw data from different sources, such as databases, files, or other data storage systems. For example, imagine that Adventure Works's data is scattered across multiple systems, as is the case with many organizations.
Say customer data is stored in a customer relationship management (CRM) system; sales, marketing, and manufacturing data is in an enterprise resource planning (ERP) system; and purchasing data is in spreadsheets. The extraction process involves pulling the data from these different sources, consolidating it into an easily accessible central location (often a temporary intermediate storage location known as the staging area), and preparing it for further processing in the next step.

Once the data is extracted, the second step is to transform it. Transforming the data involves cleaning, structuring, and enriching the data to make it more suitable for analysis. This may involve removing duplicates, handling missing values, creating new calculated fields, converting data types, and standardizing measurement units. In the case of Adventure Works, let's say that the sales and marketing data is in US dollars, but the manufacturing and purchasing data is in different currencies, depending on where in the world the sales or purchases take place. As part of transforming the data, you may need to convert all the currency values into a standard unit of measurement, in this case US dollars, to ensure consistency.

The third and last step involves loading the transformed data into the final storage system, typically a data warehouse, where it can be readily accessed and analyzed, for example using tools like Power BI. Depending on the organization's needs, the loading process can be a one-time event or scheduled to run regularly. In the case of Adventure Works, the cleaned and transformed data might be loaded into a cloud-based data warehouse, making it accessible to the company's data analysts and decision makers.

The ETL process ensures that the data analyzed is accurate, clean, and consistent, which in turn supports informed decision-making. This process offers many benefits, including:

Data integration: ETL helps integrate data from different sources, providing a unified view of an organization's data and making it easier for analysts to perform analysis and derive insights.
Data quality: ETL processes involve data cleansing and validation, which significantly improve data quality.
Data consistency: By transforming data into a standardized format, ETL ensures consistency across various data sets, enabling analysts to easily compare and analyze data from different sources.
Enhanced performance: By aggregating, summarizing, or indexing data during the transformation process, ETL can improve query performance and reduce the load on data analysis systems.
Data governance: ETL can support data governance initiatives by helping organizations maintain a single source for their data, ensuring that everyone has access to the same accurate information.

Widely used in data analytics tools like Power BI, the ETL process helps you bring together, refine, and assemble different data pieces into a coherent picture that can drive business decisions. Power BI is just one tool that comes equipped with built-in ETL capabilities, enabling you to connect to many different data sources, transform your data using Microsoft Power Query, and load it into the Power BI data model. Power Query is a powerful ETL tool within Power BI, providing a graphical interface and a formula language called M to perform various data transformation tasks. With Power Query, you can extract data from multiple sources, clean and structure it, and load it into Power BI for creating reports and visualizations.
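As a hedged illustration of these three steps expressed in M, the sketch below extracts a hypothetical Purchases.csv file, standardizes every amount to US dollars using invented fixed exchange rates, and returns the result for loading into the data model. The file path, column names, and rates are all assumptions made for the example:

    let
        // Extract: read the raw purchasing data (hypothetical file and columns)
        Source = Csv.Document(File.Contents("C:\Data\Purchases.csv"), [Delimiter = ",", Encoding = 65001]),
        Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
        Typed = Table.TransformColumnTypes(Promoted, {{"Amount", type number}, {"Currency", type text}}),
        // Transform: standardize every amount to US dollars (illustrative fixed rates)
        Rates = [EUR = 1.08, GBP = 1.27, USD = 1.0],
        ToUSD = Table.AddColumn(Typed, "AmountUSD",
            each [Amount] * Record.Field(Rates, [Currency]), type number)
    in
        // Load: the final step's result is what gets loaded into the data model
        ToUSD

In practice the rates would come from a reference table rather than being hard-coded, but the shape of the query, extract, then transform, then load the final result, stays the same.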
The extract, transform, load (ETL) process is essential for any data-driven organization. The importance and benefits of ETL lie in its ability to turn raw data into accurate and consistent information in a centralized system that is easy to analyze and use in decision-making. Because data is critical to better decision-making, embracing tools that can support the ETL process, such as Power BI, can significantly impact business performance.

Adio, the data analyst at Adventure Works, needs to analyze sales data from multiple channels, including physical stores and e-commerce platforms. He asks the data analytics team to gather and ingest the data, a fundamental step before he can proceed with the later stages of the ETL process. In this video, you'll explore data gathering and ingestion, including different methods to gather and ingest data and their advantages and disadvantages.

Let's start by outlining data gathering and ingestion, which typically take place in the extract step of the ETL process. Data can come from a variety of sources, such as structured data from spreadsheets or databases, unstructured data from text files or social media posts, and streaming data from real-time data transmissions such as webcams or satellite navigation systems. Data gathering involves collecting or acquiring data from these different sources. An example of gathering data is the data analytics team at Adventure Works collecting all their sales data, ranging from spreadsheets to real-time streams. Data ingestion starts with data gathering and encompasses the process of obtaining and importing data from various sources for immediate use or storage, such as in a database. For example, as part of data ingestion, the team at Adventure Works can go on to extract relevant data from each source, such as customer data and sales metrics like revenue, and then load it into a central database where it can be accessed for further processing and transformation.

The data gathering and ingestion process is beneficial for organizations for various reasons. With data volume, velocity (speed of generation), and variety (types and sources) constantly increasing, it helps organizations consolidate their data. This unified view of their data facilitates comprehensive analysis, data-driven decision-making, and innovation. Data ingestion improves operational efficiency through process automation, and proper ingestion practices can also help organizations meet regulatory requirements, protect sensitive data, and ensure data integrity.

Now that you know more about data gathering and ingestion and its benefits, let's explore some common methods for gathering and ingesting data, as well as their advantages and limitations. These include manual data entry, file-based ingestion, database connections, web scraping, and data streaming.

Manual data entry is the most basic method of data gathering and ingestion, where data is manually entered into a system. For example, an employee at Adventure Works may type in data from a physical customer order form into a customer relationship management (CRM) system. While manual data entry is straightforward and suitable for small amounts of data, it is time consuming, prone to errors, and unsuitable for large-scale data ingestion.

Another method is file-based ingestion, the process of importing data from files such as spreadsheets. To illustrate, Adventure Works might receive sales data from retail stores in Excel spreadsheets. These files can be imported into the ETL process using tools that read and parse (interpret) the file contents. While file-based ingestion is common and requires less technical expertise than other methods, it can become cumbersome when dealing with large numbers of files or frequent updates.
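One way Power Query eases that burden is by ingesting a whole folder of files in one query instead of one query per file. A minimal sketch, assuming a hypothetical folder of CSV exports from the retail stores:

    let
        // List every file the stores have dropped into the folder (hypothetical path)
        Source = Folder.Files("C:\Data\StoreSales"),
        // Keep only the CSV exports
        CsvOnly = Table.SelectRows(Source, each Text.EndsWith([Name], ".csv")),
        // Parse each file's binary content into a table with proper headers
        Parsed = Table.AddColumn(CsvOnly, "Data",
            each Table.PromoteHeaders(Csv.Document([Content]))),
        // Combine all of the per-file tables into a single table
        Combined = Table.Combine(Parsed[Data])
    in
        Combined

When a store adds a new file to the folder, a simple refresh picks it up; the query itself does not need to change.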
With the database connection method, you access data directly from a database or data warehouse using tools that can connect to and query the source. For example, Adventure Works can create a database connection to access data from its sales database using SQL queries. This connection enables the analytics team to extract the necessary data using SQL commands, as well as transform and load it for further analysis later in the ETL process. While database connections offer real-time access to data, enabling instant insights and prompt decision-making, they do require knowledge of database languages like SQL and may involve complex configuration or authentication processes (a sketch of such a connection follows at the end of this section).

Web scraping is a method of extracting data from websites using automated methods or software tools. In the case of Adventure Works, the analytics team can use web scraping to gather competitor pricing information or customer reviews. Web scraping is a powerful way to gather data from websites, but it can require legal permission and be complex, as it involves a range of technologies.

Streaming data is continuous, real-time data generated by sensors or other sources. You can ingest streaming data using tools that connect to and process the data as it is generated. For instance, Adventure Works could use data streaming to monitor factory equipment, track inventory levels, or analyze real-time sales data. Data streaming allows for immediate analysis and decision-making but requires specialized tools and infrastructure to handle the continuous flow of data.

Each data ingestion method has its advantages and limitations, so it's essential to choose the appropriate method based on your specific use case and the nature of the data you're working with. In summary, data gathering and ingestion involve obtaining and importing data from different sources, generally in the extract phase of the ETL process. They have many benefits for businesses, from consolidating data to facilitating innovation. By mastering the data gathering and ingestion methods introduced in this video, you can help organizations like Adventure Works optimize their data for analysis.
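Returning to the database connection method mentioned above, here is a hedged Power Query M sketch of such a direct connection; the server, database, table, and column names are hypothetical:

    let
        // Connect to the sales database and push the query to the server,
        // so only the needed rows and columns travel to Power BI
        Source = Sql.Database(
            "adventureworks-sql.example.com",
            "SalesDB",
            [Query = "SELECT OrderID, OrderDate, Amount FROM Sales WHERE OrderDate >= '2024-01-01'"]
        )
    in
        Source

Supplying a native query like this delegates the filtering work to the database engine, which is one reason database connections scale better than manual or file-based methods.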
Due to rapid growth, Adventure Works needs to store and manage increasing volumes of data from different sources, so the company must develop a comprehensive plan for data storage and management to handle its changing data needs. In this video, you'll learn about the role of data storage and management planning in the extract, transform, load (ETL) process and for organizations in the short and long term. You'll also learn key considerations for effective data storage and management planning.

Planning for data storage and management is involved throughout the ETL process. During the extract step, you need to consider what types of data you'll be collecting, how often, and from which sources, setting the foundation for data management. In the transform step, proper data management ensures the transformed data is consistent, accurate, and complete. Planning for data storage is also necessary, as the transformed data may need temporary storage before being loaded into its end destination. Finally, in the load step, planning for data storage and management, such as considering database or data warehouse structure, facilitates efficient retrieval and analysis of stored data.

In a broader context, planning for data storage and management impacts multiple aspects of an organization. Short-term data storage and management solutions address immediate data needs, facilitating quick access to up-to-date data and collaboration. For Adventure Works, this is vital for daily operations like responding to customer inquiries and processing transactions. Long-term storage and management planning caters to strategic goals and compliance requirements. For example, long-term storage solutions will enable Adventure Works to analyze sales data, customer feedback, and market trends over time, informing decision-making and improvement strategies.

When planning for data storage, key considerations include storage capacity, data access, scalability, security, and backup and disaster recovery. One of the first considerations is how much storage capacity you need. This depends on factors like organization size, data types and average file size, required storage duration, and anticipated data volume growth. Accurate estimation can prevent the cost of overprovisioning and lower underprovisioning risks like data loss and system performance issues. It's also important to consider how easily you and your team can access data when needed, whether for daily operations and collaboration or long-term trend analysis. Planning for accessibility may involve organizing file structure, implementing searchability and retrieval mechanisms, and providing remote access options. Another factor is the scalability of your storage solution, or its ability to adapt to changes in data volume, technology, and data types. Planning for scalability helps ensure the storage infrastructure can support your organization's data needs as they change over time, without compromising performance, requiring major infrastructure changes, or incurring excessive costs. Next is security. Considering storage security is vital, as data breaches can have serious consequences like financial loss. Planning and implementing security measures such as access controls and data encryption helps protect your data against unauthorized access, theft, or tampering, as well as emerging threats and vulnerabilities. Lastly, a comprehensive backup and disaster recovery plan is essential for minimizing the impact of data loss due to unexpected events such as hardware failures or human error. This involves creating regular data backups (on site, offsite, or both), implementing a recovery strategy that outlines how to restore data and resume operations, and regularly testing and updating the recovery plan.

Now that you're familiar with data storage planning, let's focus on data management, which involves organizing, maintaining, and protecting data to ensure its quality, accuracy, and accessibility. Key aspects of data management planning include data governance, data quality, data integration, data security and privacy, and data retention and archiving. Data governance establishes policies and procedures for data collection, storage, access, and usage throughout your organization. This helps prevent data silos (isolated sets of data), ensures data accessibility, and promotes data quality and responsibility among team members. Data quality considerations ensure accurate, complete, up-to-date data relevant to business needs; you can implement processes for checking, cleaning, and enriching your data to maintain high-quality data. Data integration plays an important role in combining and consolidating data from multiple sources and formats into a unified view, facilitating data analysis and insights. Data security and privacy include planning measures such as access controls, activity monitoring, and compliance with data protection regulations.
Implementing a data retention policy and an archiving process to ensure data is retained for the appropriate time, based on factors like legal or business requirements, is another important aspect of data management planning. In conclusion, data storage and management planning helps organizations develop comprehensive solutions to handle their current and future data needs, even during periods of expansion, as with Adventure Works. By considering data storage factors like storage capacity and accessibility, alongside aspects of data management from data quality to retention, organizations can ensure efficient data storage, management, and use.

Imagine you have a Microsoft Excel spreadsheet of raw data from various sources. Your task is to analyze it and generate insights to help Adventure Works make informed decisions. As you start exploring the data set, you realize that it's filled with inconsistencies, missing values, and duplicate entries. If you don't address these issues, your analysis will be flawed and could potentially lead to costly mistakes. This is where data cleaning and transformation come in. In this video, you'll explore data cleaning and data transformation, discover how they impact the quality of your analysis, and compare the implications of cleaning data at the source and in Power BI.

Data cleaning is the process of identifying and correcting errors and inconsistencies in data sets. This includes removing duplicate entries, filling in missing values, and fixing incorrect data types. Data transformation involves altering the structure, format, or values of the data to make it more suitable for analysis. This may include aggregating data, converting data types, or normalizing values. Both cleaning and transformation are crucial to ensure the quality and reliability of your analysis. For instance, imagine you've been given a data set that contains information about customers, products, and sales transactions. Some customer names are written in all caps while others are in sentence case, making it difficult to group or filter the data by customer name; cleaning this data would involve standardizing the format of customer names. An example of transforming this data is calculating the total revenue for each customer, which would require aggregating the sales data by customer and multiplying the quantity of products sold by their respective prices.

Inconsistent, untidy, or duplicate data entries can have a negative impact on data analysis. These issues can lead to inaccurate or misleading results, which in turn can lead to poor decision-making. For example, if duplicate sales transactions are included in the data, the total revenue might appear higher than it actually is. This can result in overestimating the company's performance and making ill-informed decisions about resource allocation.

Now let's discuss the difference between cleaning data at the source and cleaning data in Power BI. Cleaning data at the source involves addressing data quality issues directly within the source system, such as a database or a spreadsheet. This method ensures that any future analysis using this data will have a clean and consistent foundation. However, this approach may not always be possible, especially if you don't have direct access to the source system or if multiple systems are involved. Cleaning data in Power BI involves importing the raw data and applying cleaning and transformation steps within the Power BI environment. This approach addresses data quality issues without modifying the original data source; however, it means that you may need to repeat the cleaning process each time you import the data into Power BI, which is time consuming and prone to errors.
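To show what such repeatable cleaning steps can look like in Power Query M, here is a small self-contained sketch using invented rows: it standardizes the casing of customer names, removes exact duplicates, and then computes total revenue per customer, mirroring the cleaning and transformation examples above:

    let
        // Hypothetical raw rows with inconsistent name casing and a duplicate
        Source = Table.FromRecords({
            [Customer = "ADA LOVELACE", Quantity = 2, Price = 450.0],
            [Customer = "ada lovelace", Quantity = 1, Price = 450.0],
            [Customer = "Grace Hopper", Quantity = 3, Price = 120.0],
            [Customer = "Grace Hopper", Quantity = 3, Price = 120.0]
        }),
        // Clean: standardize name casing, then drop exact duplicate rows
        ProperCase = Table.TransformColumns(Source, {{"Customer", Text.Proper, type text}}),
        NoDupes = Table.Distinct(ProperCase),
        // Transform: revenue per row, then total revenue per customer
        WithRevenue = Table.AddColumn(NoDupes, "Revenue", each [Quantity] * [Price], type number),
        ByCustomer = Table.Group(WithRevenue, {"Customer"},
            {{"TotalRevenue", each List.Sum([Revenue]), type number}})
    in
        ByCustomer

Because the steps are recorded in the query rather than applied by hand, rerunning them on each import is automatic, which mitigates the repetition problem just described.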
Let's consider examples of data cleaning in Power BI and data cleaning at the source. The source refers to where your data is coming from; for instance, it could come from internal software like enterprise resource planning (ERP) systems, accounting software, databases, or Microsoft Excel. Let's start by exploring how to clean data at the source. Adventure Works stores its sales, customer, and product information in a centralized database. The data quality team decides to implement data validation rules and standardize the formatting of customer names directly in the database. This ensures that any future analysis of this data has a consistent and accurate base. By addressing the data quality issues at the source, Adventure Works can save time and effort in future analysis, as the data will already be clean and ready for use.

Now let's switch to an example of cleaning data in Power BI rather than at the source. Imagine that Adventure Works stores its sales data in multiple systems and the data quality team does not have direct access to all the source systems. They choose to import the raw data into Power BI and apply cleaning and transformation steps there. While this approach allows them to address data quality issues and generate accurate insights, it also means that they will need to repeat the cleaning process each time they import new data. This is time consuming, and if the cleaning steps are poorly documented, it may lead to inconsistencies in future analysis.

In summary, data cleaning and transformation are essential data analysis processes that help ensure your insights are accurate and reliable. Data cleaning involves identifying and correcting errors and inconsistencies in data sets; data transformation involves altering the data structure, format, or values to make it more suitable for analysis. Now that you understand the implications of cleaning data at the source compared to in Power BI, you can choose the most effective approach for your needs. By improving your data cleaning and transformation skills, you'll be better equipped to tackle the challenges of errors and inconsistencies in data sets.

Picture this: you're at your desk with your morning coffee. Your manager needs a comprehensive report on Adventure Works's sales performance across all regions, product categories, and customer types, and she needs it by the end of the day. Your heart races as you think about the vast amount of data you'd have to sift through, scattered across numerous files, databases, and systems. But you don't panic; you remember that Microsoft Power Query can help. With Power Query, you can efficiently connect to multiple data sources, transform unclean data, and create a structured data set for further analysis in Power BI. This video explores the capabilities and benefits of Power Query. You'll discover how Power Query helps you connect to multiple data sources, clean and transform data, and create structured and repeatable data preparation workflows for efficient data analysis.

Microsoft Power Query, more commonly known as Power Query, is a data connectivity and data preparation tool built into Microsoft's Power BI suite. It plays a crucial role in the data analysis process by enabling you to connect to a wide range of data sources, clean and transform the data, and then load it into Power BI data models for analysis and visualization. Power Query streamlines and automates the process of preparing data for analysis, making it easier for you to gain valuable insights from data.
Power Query is designed to handle the extract, transform, load (ETL) process, an essential part of any data analysis workflow. Let's explore how Power Query can help with each ETL step. Extract: Power Query can connect to various data sources such as relational databases, Excel workbooks, CSV files, web pages, and more. Once connected, you can select the specific tables or data sets you want to work with. Transform: With the data loaded, Power Query provides a user-friendly interface for cleaning and transforming the data. You can perform various transformations, such as filtering, sorting, merging, splitting, grouping, and aggregating data. Load: Once the data has been cleaned and transformed, Power Query loads it into the Power BI data model, where you can further analyze, visualize, and share it.

Power Query is particularly useful in the following scenarios:

Connecting to multiple data sources: Power Query simplifies the process of connecting to and consolidating data from different sources into a single data set for further analysis.
Cleaning and transforming data: Power Query provides a wide range of tools and functions that help you clean, reshape, and transform data into a structured and usable format.
Automating data preparation tasks: Power Query records the steps you take when transforming data, creating a repeatable and editable process. This feature not only saves time by automating repetitive tasks but also ensures consistency and accuracy during data preparation.
Structured and collaborative workflows: Power Query's ability to record and edit transformation steps makes it easy for you to share data preparation workflows with colleagues.

Power Query also promotes a structured and repeatable approach to data preparation. As you perform transformations, it records these steps in an Applied Steps pane, which allows you to review, modify, or delete any step in the process. This makes it easy to fine-tune your data preparation workflow and ensures that you can consistently reproduce your results.

To illustrate the abilities of Power Query, let's return to your task of creating a sales performance report for Adventure Works based on all sales regions. In this situation, your data is scattered across various sources such as Excel spreadsheets, CSV files, databases, and even web pages. With Power Query, you can easily connect to these different sources, extract the relevant data, and consolidate it into a single data set. Once you've connected to your data sources, Power Query provides a user-friendly interface that allows you to perform various data transformations, such as removing unwanted columns or rows, splitting or merging columns, changing data types, and filtering and sorting data. Power Query is ideal for extracting data from various sources, cleaning and transforming it, and then loading it into a Power BI data model for further analysis and visualization. This enables you to create a comprehensive Adventure Works sales performance report, breaking down sales by region, product category, and customer type, just as your manager requested.

Part of the Power BI suite, Power Query is a versatile and powerful data connectivity and preparation tool. By connecting to multiple data sources, cleaning and transforming data, and creating structured and repeatable data preparation workflows, Power Query helps you at each stage of the ETL process, turning raw data into valuable insights that drive informed decision-making. As you continue to work with data and explore the world of Power BI, Power Query will become an indispensable tool in your data analysis toolbox.
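Under the hood, everything the Applied Steps pane records is a single M let expression in which each step is a named value building on the previous one. A hedged sketch of what the recorded steps for a filtered, sorted, and grouped sales query might look like (the workbook path, sheet, and column names are hypothetical):

    let
        // Each UI action becomes one named, reviewable step
        Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"), null, true),
        SalesSheet = Source{[Item = "Sales", Kind = "Sheet"]}[Data],
        PromotedHeaders = Table.PromoteHeaders(SalesSheet, [PromoteAllScalars = true]),
        FilteredRows = Table.SelectRows(PromotedHeaders, each [Region] <> null),
        SortedRows = Table.Sort(FilteredRows, {{"Revenue", Order.Descending}}),
        GroupedRows = Table.Group(SortedRows, {"Region", "Category"},
            {{"TotalRevenue", each List.Sum([Revenue]), type number}})
    in
        GroupedRows

Editing or deleting a line here corresponds to editing or deleting a step in the Applied Steps pane, which is why the workflow is repeatable and easy to share.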
Imagine yourself as an artist standing before a canvas, prepared to create a masterpiece. The colors on your palette are your data, and your brush is Microsoft Power BI. How you blend these colors, the strokes you choose, and your vision will determine the beauty of your final painting: your business intelligence insights. Working through this week on the right tools for the job, you learned the techniques to paint a masterpiece. You covered the importance of identifying suitable data and evaluating data sources; data gathering and ingestion; transforming and loading the data in preparation for analysis; and using the extract, transform, load (ETL) capabilities of Microsoft Power BI and Microsoft Power Query.

Let's revisit some of the key concepts you covered in the week. You started your journey with an exploration of data collection, identifying and evaluating the required data as the foundation for successful business decision-making. You learned the importance of asking the right questions and analyzing the necessary data for business decisions, illustrated through the scenario of Adventure Works. You explored the need to understand the purpose of the data, how it serves this purpose, and how it should be evaluated, learning to classify data as structured, unstructured, and semi-structured types.

You then continued to the workflow in Power BI, the artist's brush in the earlier analogy. You discovered that Power BI, with its three main components (Power BI Desktop, the Power BI service, and Power BI apps), is a powerful tool for creating meaningful reports from various data sources. You were introduced to the Power BI workflow to effectively sequence your work, from importing data to creating dashboards, sharing them, and managing access permissions.

Next, you explored the ETL process and related concepts. You learned about data gathering and ingestion, the act of obtaining and importing data from different sources. This process aids in data consolidation, enabling enhanced decision-making and innovation. You covered some common methods of data gathering and ingestion, from less technical methods like manual data entry to methods that require specialized tools or knowledge, like database connections. You also learned more about data storage and management and their importance for data-driven organizations. You explored key considerations for data storage planning, such as storage capacity and data access needs, as well as key aspects of data management planning, from data governance to retention and archiving.

Your journey then led you to data cleaning and transformation. Much like cleaning and preparing your paintbrushes before creating a masterpiece, data needs to be cleaned and transformed to ensure its quality and suitability for analysis. You learned how data cleaning addresses inconsistencies, missing values, and duplicate entries in data sets, while data transformation enhances data analysis through processes like aggregating data, converting data types, and normalizing values. After that, you explored the practical aspects of cleaning data at the source in Excel before importing it into Power BI. You discovered the importance of using key Excel functions, like text functions, date and time functions, logical functions, and lookup functions, to ensure the reliability and accuracy of your data.

In the final part of the week, you explored Microsoft Power Query in Power BI, a data connectivity and preparation tool that handles the ETL process. You should now understand how Power Query helps in connecting to multiple data sources, cleaning and transforming data, automating data preparation tasks, and creating structured and collaborative workflows.
This week, you were introduced to some of the tools you can use to create data analysis masterpieces: robust, insightful, and visually appealing business intelligence reports. In future courses, you'll have the opportunity to develop practical skills in using these tools. As you continue your Power BI learning journey, remember that, like a skilled artist, a successful data analyst must know their tools well, understand their medium (the data), and have a clear vision of the end result. The knowledge and skills acquired this week will serve as a strong foundation to build on, enabling you to create compelling data narratives that drive informed business decisions.

You've now reached the end of your learning journey for this Harnessing the Power of Data with Power BI course, having built a solid foundation in using Microsoft Power BI to help businesses make the most of their data. With Microsoft Power BI in your data analysis toolkit, you discovered how to use data effectively to help stakeholders make informed business decisions. You've put great effort into completing this course by working through a range of videos, readings, exercises, and quizzes. In the final course assessment, you'll apply what you've learned by completing tasks that simulate a real-world data analysis scenario. To consolidate your learning, you'll then take a final graded quiz to assess the knowledge and skills you gained throughout this course. In this video, you'll review key learnings related to the data analysis process for businesses and the process of transforming data into valuable insights using Power BI. This will help you prepare effectively for your upcoming assessments.

Now let's get started by revisiting your first week of learning. In the first week, you learned about data analysis in business, including the interconnected roles available to you in the world of data, with a primary focus on the role of a data analyst. When exploring the data analyst role, you covered the skills data analysts need to collect, process, analyze, and ultimately transform raw data into valuable business insights. Another key learning point was the stages of the data analysis process. You learned that the data analysis process includes identifying the analysis purpose or defining the business problem; data collection and preparation; data processing and modeling; data analysis, visualization, and interpretation; and reporting and sharing data insights. In relation to data processing, you explored how to use the extract, transform, load (ETL) process to transform raw data in preparation for analysis. You were introduced to Data Analysis Expressions (DAX) calculations and using visualizations during the data analysis stage. You also explored some factors to consider when creating data analysis reports, and best practices for supporting data-driven decision-making in businesses. The importance of gathering the right data and engaging with the analysis purpose for successful data analysis was emphasized. You learned the significance of understanding stakeholder experience, and you discovered how tailoring your data analysis and visualization with this in mind can enhance comprehension, engagement, and the relevance of data insights. Part of your learning included discovering how data insights can drive business decisions and how stakeholder engagement can facilitate this process. You then went on to learn more about Microsoft Power BI and its user interface components; Power BI is a user-friendly but powerful tool for data analysis and visualization.
Week two began with an exploration of data collection and the importance of asking the right questions to ensure you gather the right data. This included learning about identifying suitable data by evaluating data sources and types. You were introduced to the Power BI workflow, consisting of Power BI Desktop, the Power BI service, and Power BI apps. You learned that with the Power BI workflow you can import data, generate data insights, create meaningful reports and dashboards, and share and manage those reports and dashboards. You then explored elements of the extract, transform, and load process in more depth. As part of this process, you covered data gathering and ingestion, which are integral to data analysis, as well as methods for performing them. You also explored the importance of effective data storage and management, which is involved throughout the ETL process. Data storage and management planning and considerations, from storage capacity and data access needs to data retention and archiving, were highlighted as crucial for data-driven organizations. You then learned more about data cleaning and transformation, essential steps to ensure data quality and accuracy, prepare your data for analysis, and enhance your analysis. You discovered how to clean data at the source in Microsoft Excel before you import it into Power BI. The week of learning concluded with an introduction to Microsoft Power Query Editor in Power BI, a data preparation tool with ETL capabilities. You learned that Power Query can help you connect to multiple data sources, clean and transform data, automate data preparation tasks, and create workflows.

As you embark on the final course exercise and graded quiz, you can approach your assessments with confidence, knowing that you've built a strong foundation of knowledge and skills by committing to your learning journey throughout the course. However, if you feel the need to review any of the concepts summarized in this video or require additional preparation, remember that you have the flexibility to revisit any of the course items. It's now time to showcase your learning, starting with an invaluable practical exercise. In this exercise, you'll engage in key tasks that form part of the initial phases of the data analysis process for a product launch analysis. Wishing you the best of luck as you embark on the final week of this course.

Congratulations on completing the Harnessing the Power of Data with Power BI course. With your hard work and dedication, you've made great progress in your data analysis learning journey. You should now have a thorough understanding of the following topics: the role of data in driving decisions and business outcomes; how data is produced, gathered, and transformed into insights in businesses and organizations; the stages in the data analysis process; the role of the data analyst, including related skills, tasks, and tools; and the components of Microsoft Power BI and using Power BI as a tool for data analysis and visualization.

This course provided you with a foundation in data analysis in Microsoft Power BI. You discovered the importance of data analysis in business, with a deep dive into the role of a data analyst in supporting data-driven decision-making in organizations. You've learned all about the data analysis process and how to ensure that the analysis you perform is useful for stakeholders. Whether you're engaging with stakeholders to determine the analysis purpose or business problem, gathering the right data, or reporting the insights, you now have a comprehensive understanding of each stage of the process.
You familiarized yourself with Power BI, including its user interface and components, and you had the opportunity to generate your own visualization, a key skill for a data analyst. You also learned about the Power BI workflow and using Power Query Editor in Power BI for transforming data. The foundational knowledge you've gained represents a significant step towards using Power BI effectively to generate valuable insights from data. Well done!

This course forms part of the Microsoft Power BI Analyst Professional Certificate. These professional certificates from Coursera help you get job-ready for in-demand career fields. The Microsoft Power BI Analyst Professional Certificate in particular is not only a way to broaden your understanding of data analysis but also to gain a qualification that can serve as a foundation for a career in data analysis using Microsoft Power BI. Plus, the professional certificate will help you prepare for Exam PL-300: Microsoft Power BI Data Analyst. By passing the PL-300 exam, you'll earn the Microsoft Certified: Power BI Data Analyst certification. This globally recognized certification is industry-endorsed evidence of your technical skills and knowledge. The exam measures your ability to prepare data; model data; visualize and analyze data; and deploy and maintain assets. To complete the exam, you should be familiar with Power Query and the process of writing expressions using Data Analysis Expressions (DAX), of which you gained some foundational knowledge in this course. You can visit the Microsoft certifications page at learn.microsoft.com/certifications to learn more about the Power BI data analyst certification and exam.

This course enhanced your knowledge and skills in the fundamentals of data analysis in Power BI, but what comes next? Well, there's more to learn, so it's recommended you move on to the following course in the program. Whether you're new to the field of data analysis or already have some expertise and experience, completing the whole program demonstrates your knowledge of, and proficiency in, analyzing data using Power BI. You've done a great job so far and should be proud of your progress. The experience you've gained will showcase your willingness to learn, motivation, and capability to potential employers. It's been wonderful to be a part of your journey of discovery. Wishing you all the best for the future!

Hello, and welcome to this course on extracting, transforming, and loading data in Microsoft Power BI. Regular digital activities, such as ordering food online, reserving a trip, and using a social media application, generate a great deal of data. Now think about the billions of people who engage in these activities every single day. Then there are other organizations, like universities and banks, that perform many other transactions that may need to be stored in different ways. Businesses also need to gather data from different sources, for example from their customers, from other companies, and from the government. Now imagine all that data living in different places and being stored in different ways. How can a company make sense of all of this? That's where data analysts come in. One of their jobs is to extract data from different sources, transform it in a way that it can be used, and load it into a tool that helps the analysis process, like Power BI. This is what you will learn in this course: how to extract, transform, and load data, a process also known as ETL. Before data can be used to tell a story, it must first be processed so that it is usable.
Data analysis is the process of identifying, cleaning, transforming, and modeling data to discover meaningful and useful information. The data is then crafted into a story through reports for analysis to support the critical decision-making process. In this learning path, you will learn about the life and journey of a data analyst and the skills, tasks, and processes they have to master to tell a story with data. You'll discover how getting the data analysis story correct enables businesses to make informed decisions. By now, you should have learned how to harness the power of data in Power BI and how it benefits an organization. In this course, you will explore various topics and elements involved in the career of a data analyst, including identifying how to collect data from multiple sources and configure it in Power BI, preparing and cleaning data for analysis, and inspecting and analyzing ingested data to ensure data integrity. This course will give you a solid foundation in these topics and offer you opportunities to practice extracting, transforming, and loading data into Power BI.

Now let's briefly outline the course content so you have an idea of what's to come in your learning journey as you explore the extract, transform, and load process. First, you will learn about the extract portion of the ETL process, focusing on data sources and how to extract data and configure storage modes in Power BI. Then you will move on to the transform portion of the ETL process, where you will practice cleaning and transforming data to prepare it for data modeling. You will also learn about data cleaning using Power Query and how to use applied steps. Next, you will cover the load portion of ETL and practice using data profiling and advanced queries. You will also learn about referencing queries and data flows and using the Advanced Editor to modify code.

To assist your learning, you will get to apply your newly gained skills in exercises, quiz questions, and self-reviews. To consolidate your learning and put it into practice, you will complete a practical assignment. In this assignment, you will be given a business scenario from Adventure Works, a fictional business, where you need to gather data from multiple data sources to clean and transform. You will have the opportunity to apply the knowledge you gained in this course to join and merge these data sources and to identify and remove anomalies using profiling tools. After this practical assignment, you will complete a final graded assessment. Be assured that everything you need to complete the assessment will be covered during your learning, with each lesson made up of video content, readings, and quizzes. In addition, you can share your knowledge and discuss challenges with other learners. These discussions are also a great way to grow your network of contacts in the data analysis world, so be sure to get to know your classmates and stay connected during and after your course.

This course is also a great way to prepare for the Microsoft PL-300 exam. By passing the PL-300 exam, you'll earn the Microsoft Power BI Data Analyst certification. The exam measures your ability to prepare data; model data; visualize and analyze data; and deploy and maintain assets. In this course, you will learn the process of extract, transform, and load. You will identify how to collect data from, and configure, multiple sources in Power BI, and prepare and clean data using Power Query. You'll also have the opportunity to inspect and analyze ingested data to ensure data integrity. Now that you have an overview of what this course is about, it's time to take the next step and prepare for a career as a data analyst using Power BI.
These days, businesses generate very large amounts of data through their activities, and the data may come from different sources, for example from different departments within the company or from clients. The challenge is how to make sense of this data and extract valuable insights that can help improve business performance. That's where Power BI comes in. In this video, you'll explore the basics of data sources produced from business operations and learn how to combine them to gain business insights.

To begin, let's first review the data sources that you can connect to in Power BI. Flat files are a common type of data source that can be used for the ETL (extract, transform, load) process in Power BI. Examples of flat files include CSV, TXT, and Microsoft Excel files. Relational data sources, such as SQL Server, MySQL, and Oracle databases, are commonly used by large organizations because they provide a high level of reliability, data integrity, and security. NoSQL databases, such as MongoDB and Cassandra, are becoming increasingly popular for ETL in Power BI. These databases are designed to store and manage large volumes of unstructured or semi-structured data, making them ideal for use in a wide range of applications. Don't worry if you're not familiar with all the terminology; it will be discussed later in this course. So no matter where your data is stored, Power BI has the flexibility to connect to a wide range of data sources.

Next, we will explore how combining data sources in Power BI can optimize supply chain performance. Imagine you are a supply manager responsible for managing your company's new just-in-time system, ensuring that all parts and materials are sourced and delivered on time while meeting quality standards. You closely collaborate with your team to ensure that the system runs smoothly and all suppliers meet their obligations. By combining data from various sources, such as sales figures, inventory, production, and supplier information, your department could gain valuable insights into customer behavior, product performance, and supplier performance. For example, by analyzing sales data alongside supplier data, trends in customer demand can be identified and production and inventory levels adjusted accordingly. On a company level, analyzing supplier performance data helps to identify areas for improvement and to work with suppliers to enhance their performance and long-term collaboration. In conclusion, combining data sources can benefit different stakeholders in a business by providing valuable insights into customer behavior, product performance, and supplier performance. This information can be used to make informed decisions, leading to improved supply chain management, reduced costs, increased customer satisfaction, and, ultimately, business success.

Data integration can be a daunting task, especially when you are working with multiple data sources that have varying formats, structures, and quality levels. Combining these sources can often lead to inconsistencies and errors, making it difficult to derive meaningful insights and make informed decisions. But you don't need to worry: tools like Power BI simplify the process of combining data from different sources, reducing the time and effort required to create a comprehensive view of your data. Power BI is designed to be user-friendly and accessible, even for non-technical users, with an intuitive interface and drag-and-drop functionality that makes it easy to create reports and visualizations. Power BI also allows you to customize your reports and visualizations to suit your company's specific needs.
You can choose from a wide range of pre-built templates and visualizations or create your own custom designs. This flexibility makes it easy to create reports that are tailored to the unique needs of your business. Power BI also enables collaboration by allowing you to share your reports and visualizations with colleagues, clients, or stakeholders, whether by sharing reports directly or embedding them in websites or apps. This collaborative approach can improve communication and ensure that everyone is working with the same data, ultimately driving business success. Combining data sources is a great method of providing valuable information that can lead to improved supply chain management, reduced costs, and increased customer satisfaction, and it should not be a daunting task. In this video, you learned the basics of data sources produced from business operations and how to combine them to gain business insights. Tools like Power BI, with its built-in data connections, can simplify the process of combining data from different sources, reducing the time and effort required to create a comprehensive view of your business. By leveraging the functionalities of Power BI, you as an aspiring data analyst, along with other stakeholders, can gain a competitive edge and unlock new opportunities for growth and success at Adventure Works.

Every day, businesses generate large amounts of data, but where do they store it all? Many organizations store and export data as files, such as flat files. In this video, you'll learn how to set up and export a flat-file data source. Your manager at Adventure Works, Adio Quinn, has asked you to build a Power BI report using a flat file that the human resources team has prepared. The file contains some of Adventure Works's employee data, such as employee names, hire dates, positions, and managers, as well as data located in several other data sources. So what is a flat file? A flat file is a file type that contains a single data table with a uniform structure for every row of data and no hierarchies. Some examples of flat files include comma-separated values (CSV) files, delimited text (TXT) files, and fixed-width files. Additionally, output files from various applications, such as Microsoft Excel workbooks, can also be classified as flat files.

Now that you know what a flat file is, let me demonstrate how to set up a flat-file data source by helping the Adventure Works HR department. The first step is to determine which file location you need to use to export the data. The file location is important because, if it changes, Power BI will not be able to refresh the data, which can cause errors such as "file not found" or "data source not found". Once you have located your file, you can proceed in Power BI. To display available data sources, in the Home group of the Power BI Desktop ribbon, select the Get Data button or its down arrow to open the common data sources list. If the data source you want isn't listed under common data sources, select More to open the Get Data dialog box. In this example, you need an Excel data source, which is first on the list. Next, a connection window displays, where you select the employee Excel workbook that the HR team prepared and select Open. When your HR file is connected to Power BI Desktop, the Navigator window opens. This window displays the tables available in your data source (the Excel file in this example). You can select a table to preview its contents and ensure that the correct data is loaded into the model.
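Behind this point-and-click setup, Power Query generates a small M query for the connection. Here is a sketch of what it might look like for the HR workbook, assuming hypothetical file, sheet, and column names:

    let
        // Connect to the HR workbook (hypothetical path and sheet name)
        Source = Excel.Workbook(File.Contents("C:\HR\Employees.xlsx"), null, true),
        EmployeeSheet = Source{[Item = "Employees", Kind = "Sheet"]}[Data],
        // Promote the caption row to column headers
        Promoted = Table.PromoteHeaders(EmployeeSheet, [PromoteAllScalars = true]),
        // Type the columns the HR team provided
        Typed = Table.TransformColumnTypes(Promoted,
            {{"EmployeeName", type text}, {"HireDate", type date},
             {"Position", type text}, {"Manager", type text}})
    in
        Typed

Note that the file path is embedded in the query, which is why a moved file breaks the refresh until you update the connection under Data source settings, as described next.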
Selecting the check box of the table that you want to bring into Power BI activates the Load button, which you can then select to import your data into the Power BI data set. If you need to change the location of a source file during development, or if your file storage location changes, you'll need to update your connection strings in Power BI to keep your reports up to date. To do this in Power BI Desktop, select File in the menu bar, then select Options and settings from the File menu, and then select Data source settings from the Options and settings menu. You can also change or clear the permissions by selecting Edit or Clear Permissions, respectively; permissions cover the privacy level and credentials used for connecting to a data source. Remember that any structural changes to the file can break the reporting model, so it's important to reconnect to the same file with the same file structure. By following these steps, you'll be able to ensure that your report uses the most accurate and up-to-date information available. You've now helped the Adventure Works HR department to store their data, and you should know how to set up and export a flat-file data source. Great work!

As an aspiring Power BI data analyst, you'll generate large amounts of data, but where can you store it? Fortunately, Power BI offers several storage options for its users. Over the next few minutes, you'll explore Power BI's storage modes and their impacts on report performance. Adventure Works needs help with creating a report that displays the performance of different product categories over time. This report will draw on a large sales transaction table with billions of rows, so you need to optimize its performance so that end users have fast access to the visuals. But before taking on this task, you first need to understand the different storage modes available in Power BI and how they impact report performance.

Let's begin with an overview of Power BI storage modes. Power BI has two primary storage modes, Import mode and DirectQuery mode, and it also includes a complementary Dual mode. Import mode is used to import smaller data sets from various sources into Power BI, and it stores the data in memory, which enables quick access. For example, in Import mode you can connect to an Excel file containing a data set of available categories. This mode is ideal for the marketing department if they need to filter sales transactions by category in the report view. On the other hand, DirectQuery mode allows you to connect directly to the data source, and the data remains in the source system. DirectQuery mode is best suited for larger data sets, where loading the data into memory is not practical. For instance, if you have a card visualization that displays an aggregate summary of category sales from a sales table, with this storage mode Power BI will send a request to the data source and get the result back. By using DirectQuery, the sales department can leverage the power of the external database to handle complex queries and aggregations, while Power BI only brings in the data necessary for visualizations. Many features available in Import mode are not supported in DirectQuery mode, so it's important to remember that you can't always switch from one mode to the other. Now that you're familiar with the two primary storage modes in Power BI, Import and DirectQuery, let's explore the complementary Dual mode.
When you use Dual mode, the Power BI service determines the most efficient mode to use for each query. So, if a table has similar data between Import and DirectQuery modes, using Dual mode can be beneficial. With Dual mode, you can import the data you need and still use DirectQuery for additional data that is not available in the imported data.

Let's explore the advantages and limitations of each storage mode in a little more detail, starting with Import mode. Import mode is a great option if you need to work with small to medium-sized data sets. Data is loaded into Power BI to form the data model, which organizes the data into tables, columns, and relationships, making it more accessible and easier to work with. All calculations are performed within the data model, and the data is stored in compressed form, which optimizes memory usage. One downside of Import mode is that you must refresh the data manually: any changes you make to the source data will not be reflected in the report until the data is refreshed.

The next mode you'll explore is DirectQuery. DirectQuery mode connects directly to the data source, and queries are sent to the source system in real time. This means that the data is always up to date, and there's no need to refresh the data manually. DirectQuery mode is best suited for larger data sets, as it does not require loading all the data into memory; if you chose instead to import the data to a Power BI file stored on your local computer, it would require a significant amount of memory and resource overhead. One downside of using DirectQuery mode is that it can impact performance if the queries are complex or the data source is slow. So you need to consider the benefits and drawbacks of each storage mode and select the one that best suits your needs.

The third option you need to be familiar with is Dual mode. This is where data is stored in memory but can also be retrieved from the original data source. This is useful when you are working with dimension tables that can be queried alongside fact tables from the same source. For instance, Adventure Works might have a sales aggregate by customer loyalty table in Import mode, which is used to speed up query processing by storing a summarized and categorized version of customer data in memory. Simultaneously, the larger sales transactions table could be set to DirectQuery mode. In this scenario, setting a common dimension table, such as Date, to Dual mode can enhance the performance of the report. When the Dual mode Date table is combined with an Import mode table (sales aggregate by customer loyalty), it behaves like an Import table and retrieves data from memory, ensuring faster performance. On the other hand, when the Dual mode Date table is combined with a DirectQuery mode table (Sales), it behaves like a DirectQuery table, querying data directly from the source system.

When you use multiple data sources to create a data model, it is called a composite model. Composite models enable you to combine tables with different storage modes into one unified data model. Using composite models can greatly enhance the functionality and performance of your reports and analytics workflow. When building composite models in Power BI, it's important that you specify the storage mode for each table in your data model, because the performance of your composite model depends on how you set it up. For the best performance, try to use Import or Dual mode tables: they work faster because the data is stored in memory and can be retrieved quickly, giving you faster results.
When creating reports, it's essential that you consider the size of your data set and determine whether real-time access is a requirement before selecting a storage mode. Power BI offers different storage modes, and in this video you learned about the two primary storage modes in Power BI, Import and DirectQuery, as well as the complementary Dual mode. As an aspiring data analyst, it is important that you understand how these different storage modes impact a report's performance. In this video, you explored the advantages and limitations of each of the storage modes. Great work!

Data has the potential to help organizations make better business decisions, but businesses generate such large amounts of data to sift through that it becomes difficult to see the story it tells. Luckily, Power BI is an excellent tool for visualizing and analyzing data. However, slow data loading can be a significant issue, especially when working with large data sets. In this video, you'll learn how to configure Import, DirectQuery, and Dual storage modes in Power BI to optimize data retrieval and processing, enhance report speed, and guarantee that your reports always contain the most recent data.

Renee Gonzalez, the marketing manager at Adventure Works, has asked you to create a report that displays sales at the cash registers as customers purchase products. The point-of-sale system scans product barcodes at the cash register, measuring purchase trends. She's concerned with the logistics of ordering, stocking, and selling products while maximizing profit. As this is going to be a large sales transaction table with billions of rows, you need to ensure that the report's performance is optimized so that end users have fast access to the visuals. To complete this task successfully, you have to select the best storage mode for the data and configure it in Power BI to optimize data retrieval and processing.

Let's start by helping Adventure Works choose a storage mode in Power BI Desktop. To do this, select the Get Data button in the Home group of the Power BI Desktop ribbon. In the Get Data dialog box, search for the Azure SQL Database connector. Once you've selected the Azure connector, the data connectivity mode section displays, where you can choose from two options: Import or DirectQuery. Import mode stores data directly in Power BI Desktop's memory, while DirectQuery retrieves data from your data source in real time.

Power BI also provides extra functionality to customize the storage mode for each table in your data set. To get started, select the Model view icon near the left side of the window to display a view of the existing model. Model view displays all the tables, columns, and relationships in your model. Table card headers are colored to help you quickly identify which tables are from the same kind of source. A table card header with no color indicates that the table is in Import mode, while tables from the same DirectQuery source display the same color in the table card header, blue in our example. Select the Sales Order Detail DW table and expand the Properties pane by right-clicking on the table and selecting Properties. The Properties pane displays various options for configuring the table. You'll find a drop-down menu labeled Storage mode in the Advanced section of the Properties pane; this is where you can set or adjust the table's storage mode. Now let's configure the storage mode of the Sales Order Detail table, which is currently set to DirectQuery mode. In the Advanced section, change the option to Import mode.
The following warning message will display: "Setting the storage mode to Import is an irreversible operation. You will not be able to switch it back to DirectQuery. This operation will refresh the table set to Import, which may take time depending on factors such as data volume." Next, select OK.
Congratulations, you now know how to configure storage modes to optimize your reports. Now that the storage modes are configured, Renee and her team should experience a significant improvement in system performance. For example, reports will generate more quickly, they can display real-time data, and business users can access data more efficiently. Well done!

At this stage of the course, you should be familiar with how businesses gather and generate large amounts of data in their daily activities. This can include data from human resources, accounting, and sales. You also learned that this data may be structured and stored in different ways. As an aspiring data analyst at Adventure Works, you will realize that the most important step is to determine how data will be structured and stored. Knowing your data types and the way the data is structured gives you the correct data sets to create reports that suit the company's needs, enabling business insights that will help during decision-making. Furthermore, identifying the best storage solution for your data can reduce costs and improve performance, two aspects that any company has as top priorities. By the end of this video, you will be able to identify the difference between structured and unstructured data and what storage solution is ideal for each type.

As an aspiring data analyst at Adventure Works, you've been assigned the task of determining the best storage solution for the Adventure Works online retail website. The website was built with three data sets used to run the business: product catalog data, image files, and financial business data. Each data set has different requirements. The key factors to consider in your task are data classification, how your data will be used, and how you can get the best application performance.

Now let's focus on data types. There are three types of data: structured, unstructured, and semi-structured, all of which are suitable for analysis but differ in the tools used for ingestion, transformation, and storage. Let's start with structured data. Structured data is the most common type of data that we use; it is also known as relational data. In a financial report, for example, numbers and names are arranged into columns and rows, making the data easier to analyze and process. By nature, structured data is quantitative: easily searchable, sortable, and analyzable using tools like Microsoft Excel spreadsheets or relational databases, which can store large amounts of structured data. SQL, or Structured Query Language, is a programming language used to manage relational databases. It allows users to manipulate and query data stored in a database, making it a valuable tool for data analysts and business users. However, the structure makes any addition or removal of data fields difficult, since you must update each record to adjust to the new structure. Some applications where relational data is used are customer relationship management, reservations, and inventory management systems.
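To connect this back to Power BI: when a source is relational, Power Query can even hand it a native SQL query. A hedged sketch, in which the server, database, table, and column names are all hypothetical:

    let
        // Query a relational source with native SQL (all names are hypothetical)
        Source = Sql.Database(
            "adventure-works.database.windows.net",
            "FinanceDB",
            [Query = "SELECT AccountName, FiscalYear, Amount FROM FinancialReport WHERE FiscalYear >= 2022"]
        )
    in
        Source

Passing a native query like this pushes the filtering work to the database itself, which is exactly the strength of structured, relational storage.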
Now let's cover unstructured data. Unstructured data does not have a predefined structure or format. It is best used for qualitative analysis and usually resides in non-relational databases or unprocessed file formats. Some examples of this type of data are text documents, audio and video files, social media posts, and images. These types of files can be stored in a centralized repository that ingests and stores large volumes of data in its original form.

Then there is a third type of data, called semi-structured data because it is not as organized as structured data and is not stored in relational databases. This type of data uses tags for organization and hierarchy. Video files may have an overall structure and contain semi-structured metadata, but they are considered unstructured data, since the data that forms the video itself is unstructured. There is a process for converting semi-structured data into a specific format that can be easily transmitted, stored, or processed: it is called data serialization. Serialization uses a method of formatting that allows the data to be transmitted or stored in a way that is easily understood by both the sender and the receiver, without the need to know all the specific details of the data. This is useful when dealing with semi-structured data that doesn't fit neatly into traditional databases or data structures. If you want to learn more about serialization, please visit the additional resources at the end of this lesson.

Now you'll learn how to classify your data in order to choose a suitable storage solution for structured or unstructured data. The correct storage solution can deliver better performance, improve manageability, and save on database costs. When selecting a storage solution, it's important to consider the type of data you're working with, what operations are needed to transform the data, and what level of management and maintenance is required. The business data used at Adventure Works for year-to-year comparisons is not updated frequently. It is stored in multiple data sets, and some latency can be accepted since it is mainly read-only: not all data analysts need write access, but they can all read from all data sets. This is a type of structured data that will most likely be queried by data analysts who use SQL more than any other query language. Therefore, a suitable storage solution for this example is a SQL database, or a cloud-based solution like Azure SQL Database. It can also be bundled with another cloud-based service, Azure Analysis Services, to model the data in Azure SQL Database; this model can be shared with business users who can connect to it through Power BI for analysis and gain business insights. In summary, selecting the appropriate storage solution is vital for addressing the specific requirements of your data.

Remember when we spoke about serialization and the formatting that allows the storage of unstructured or semi-structured data? One of those formats is a blob, a binary large object, where the data is stored in binary (ones and zeros) format. For Adventure Works's online retail website, Azure Blob Storage is an ideal option for storing unstructured data such as photos and videos. It's a scalable and cost-effective cloud storage service designed to store large amounts of unstructured data, such as images, videos, or documents. The website has a product page where a bicycle photo needs to be displayed at the same time as the specific bicycle model. The photos will not be queried independently; by including the photo ID or URL as a product property, the photo can be retrieved by its ID without any time lag. This demonstrates how unstructured data can be stored. The right storage solution allows Adventure Works to achieve optimal performance and efficient data management.

In this video, you learned that while structured data is easier to work with and analyze, unstructured data is often more abundant and valuable.
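As a small, concrete illustration of deserialization: JSON is one common serialization format for semi-structured data, and Power Query can read it directly. This sketch assumes a hypothetical local JSON file containing a list of product records:

    let
        // Deserialize a JSON file into Power Query values (path is hypothetical)
        Source = Json.Document(File.Contents("C:\Data\product_catalog.json")),
        // Assuming the JSON is a list of records, convert it into a table
        AsTable = Table.FromRecords(Source)
    in
        AsTable

If the JSON were shaped differently, say a single record with nested lists, the navigation steps would differ, but the deserialization call itself stays the same.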
Businesses and organizations are increasingly focusing on harnessing unstructured data to gain insights into customer behavior, emotions, and other aspects that can shape their strategies. Choosing and implementing the correct storage solution can benefit companies and organizations by improving performance, reducing costs, and increasing efficiency.

Adventure Works generates data from many different departments and stores this data in many different sources. Wouldn't it be great if they could combine data from these different sources? With Power BI they can, by using connectors. In this video, you'll learn about the different kinds of connectors available in Power BI and their purpose, how to choose a connector, and how to securely connect to a cloud data source. Adventure Works needs to generate a report that compares the sale of bicycle models across the company's different outlets: web, retail, and individual sellers. However, the sales data is stored in different sources, and the company needs you to generate an integrated report that combines these different data sources. You can combine these data sources using connectors in Power BI, using Power BI as a single business intelligence solution to generate an integrated report.

But before you begin, let's find out more about connectors. Connectors are links that transport data between a data source and an application; they're basically the bridges that connect Power BI to different sources of data. With connectors, you can create a link, or bridge, between Power BI and different data sources like databases, files, services, SharePoint, and more. Connectors make it easy to connect to data sources; you can then transform, clean, and visualize the data in Power BI for reporting and analysis to generate insights.

Before you start importing your data, it's important to understand your business requirements for the data source. This includes things like whether the data is stored on your own computer and gets updated every so often, or whether the data is coming from an external source and needs to be updated in real time. You also need to know who will be using the data and how it will be used. These requirements are essential because they can affect the way you load the data into Power BI, so it's important that you get them right.

Microsoft frequently adds new data connectors to its desktop and service platforms, typically releasing at least one or two new connectors every month as part of the regular Power BI update. This has resulted in Power BI having a vast collection of over 100 data connectors. Files, databases, and web services are the most used sources. All Power BI connectors are free to use, but they might be marked as beta or preview depending on their development stage. Any data source marked as beta or preview has limited support and functionality, so don't make use of it in production environments.

Now that you're familiar with the data connectors available in Power BI, it's time to help Adventure Works generate their report. Let's examine the steps involved in setting up a connector to a SQL database. First, navigate to the Home tab and locate the Get Data button. You have two options here: you can either select the Get Data button and then choose All, or you can select the expand arrow next to the Get Data button and select More. This lets you access the wide range of data connectors available in Power BI. To make sure your data is mapped correctly in Power BI, it's crucial to identify the specific nature of the data.
For instance, if you're working with data meant for an Azure SQL database, using the Excel connector wouldn't give you the desired outcome. As a Power BI user, in the Get Data window, navigate to the Azure SQL option, select it, and then select the Connect button. You can also use the search bar to filter the available connectors and quickly find what you're looking for. After selecting the data source, you'll be prompted to set up the connection. Depending on the type of data source you've chosen, the specific details you need to provide will differ. For example, if you're working with an Excel file, you'll need to specify the location of the file; on the other hand, if you're dealing with a SQL Server database, you'll need to enter the server name and the database connection details. There are a few additional options you may want to consider: in addition to specifying the server address and database name, you can also choose between different connection modes, such as Import or DirectQuery. Most of the time you'll select Import. Other advanced options are also available in the SQL Server database window, but you can ignore them for now; you'll cover them at a later stage in the course.

After you've specified the server and database names, you'll be prompted to sign in with a username and password. You'll have three different sign-in options to choose from, depending on your credentials. The first option is to use your Windows account, which is often the easiest option for users who are already logged into their computer. The second option is to use your database credentials; for instance, SQL Server has its own sign-in and authentication credentials that are managed by the database administrator. The third option is to use your Microsoft account credentials, which requires your Azure Active Directory credentials. Once you've selected the sign-in option that's appropriate for your situation, enter your username and password, and then select Connect. This will allow you to securely connect to your data source.

Once you've successfully connected your database to Power BI Desktop, the available data appears in the Navigator window. This window displays all the tables or entities that are available in your data source, such as the SQL database in this example. To preview the contents of a table or entity, simply select it; to import data into your Power BI model, select the check boxes next to all the tables you want to bring in. Finally, once you've selected the tables, you can choose either to load the data into your model in its current state or to transform it before loading. For now, the focus is on the data loading process; data transformation will be covered in more detail at a later stage. By selecting the appropriate data and choosing the Load option, you can easily bring in the data you need to start building visualizations and analyzing your data in Power BI.

Connectors are an essential component of Power BI. The wide range of available connectors lets you connect to lots of different data sources to bring them all together into one place. You can then import or extract the data from these sources into reports and dashboards for analysis and visualization. By leveraging the full range of connectors, you can access valuable insights to make data-driven decisions for your business. You should now understand that connectors are a powerful asset that can help you get the most out of your data analysis.
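The connection and Navigator choices just described end up recorded as a short M query. A hedged sketch, where the server, database, schema, and table names are hypothetical (credentials are managed by Power BI separately and never appear in the query):

    let
        // Connect to the Azure SQL database (hypothetical server and database)
        Source = Sql.Database("adventure-works.database.windows.net", "AdventureWorksDW"),
        // Navigate to one of the tables checked in the Navigator window
        SalesOrders = Source{[Schema = "Sales", Item = "SalesOrderHeader"]}[Data]
    in
        SalesOrders

Each additional table you check in the Navigator becomes its own query of this shape.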
What if you could reorder products you buy frequently with a click of a button? That would be really convenient, right? And what if other types of tasks could be automated by businesses? Well, in today's data-driven world, organizations are constantly searching for ways to automate tasks to optimize productivity. Microsoft Power BI is an integrated suite of software tools, applications, and connectors that can help you transform your data sources into clear and compelling visualizations. Connectors play an important role in connecting to various data sources and executing actions or triggering workflows based on specific events. There are two types of operations available to create automated workflows: triggers and actions. In this video, you will explore how actions are triggered to create efficient and effective scheduled actions. So let's get started with triggers and actions in Power BI.

Adio Quinn, a data analyst at Adventure Works, a bicycle manufacturer, is responsible for analyzing daily sales reports and providing insights to the management team. However, the manual process of importing data from multiple sources and analyzing it can be laborious and time-consuming. To streamline this process, Adio asks for your help to leverage Power BI's triggers and actions to automate the workflow. With Power BI, you can schedule an action to refresh the data and email the latest sales report to the management team. With this automated workflow in place, you can focus on analyzing the data and providing valuable insights to the management team without worrying about the manual process of importing and analyzing the data.

In Power BI, triggers and actions work together in configuring a workflow, either based on time or on specific actions. A trigger is always required to initiate a workflow and prompt it to run. Additionally, actions in Power BI enable interaction with the data source through various functions. Automating tasks and processes with actions in your workflow can save time, reduce manual effort, and make your workflow more efficient. Moreover, scheduled actions in Power BI can automate tasks based on specific time intervals. By setting up a schedule, reports and dashboards can be updated with the latest data regularly without manual intervention, thereby improving data accuracy and streamlining workflows.

Now we are going to explore how to set up a scheduled data refresh. When it comes to working with data in an organization, having access to the latest and most relevant information is essential. Outdated data won't be useful to the organization, as it doesn't reflect the current situation; relying on old data can even hinder the organization's growth, since more recent and applicable data may be readily available. In this video, we'll explore the topic of automating tasks in Power BI. In Power BI, users have the option to create scheduled actions, which enable them to automate tasks at specified time intervals. Today you are going to help Adio, a data analyst at Adventure Works, whose job involves regularly updating sales report data sets according to a predetermined schedule. By setting up a scheduled data refresh, Adio can automate the process, saving valuable time and effort.

Let's begin by opening your browser and heading to https://app.powerbi.com/home to get to the scheduled refresh screen. In the navigation pane on the left-hand side of the screen, select Data hub. Next, locate the data set you wish to work with, in our case the sales report data set. Then select the ellipsis (...) and select Settings to expand the data set settings. This takes you to a new screen where you can configure the trigger. The Scheduled refresh section is where you define the frequency and time slots used to refresh the data set.
set let’s walk you through the steps to set up an online refresh schedule in PowerBI services here’s what you need to do step one turn the switch to on step two you can modify the schedule to fit your needs choose the frequency you want the data set to refresh such as daily select the time zone you want to use for example UTC London under time select add another time and enter a time for the refresh to occur repeat this step for additional refresh times as needed step three once you’re done simply select apply and you’re all set did you know that you can easily adjust the frequency time zone and time of your scheduled refreshes in PowerBI this allows you to ensure that your data is always up to-date and accurate plus you can even set up scheduled notifications to be sent to a specific email address how convenient is that beware if your data set hasn’t been active for 2 months the scheduled refresh will be automatically paused are you ready for a quick rundown on data refreshing in PowerBI great as a PowerBI user refreshing data typically means importing data from the original data sources into a data set you can choose to refresh data based on a predetermined schedule or on demand depending on your needs if your underlying source data changes frequently it may be necessary to perform multiple data set refreshes daily however it’s important to note that PowerBI limits data sets on shared capacity to a maximum of eight scheduled daily data set refreshes with these easy steps you can now create a refresh schedule that works perfectly for you in this video you explored the topic of automating tasks within PowerBI specifically using scheduled actions to automate tasks and actions at specified time intervals by automating processes such as data refreshing users can save valuable time and effort we walked through the steps to set up an online refresh schedule in PowerBI services and highlighted the importance of periodically checking the refresh status and history to ensure data sets are error-free good job congratulations on reaching the end of the first week in this course on how to extract transform and load data in PowerBI this week you explored how to work with basic and advanced data sources in PowerBI let’s now take a few minutes to recap what you learned this week this summary will help you review the concepts presented previously and clear up questions you might have you began the course by covering basic data sources you learned that for example by analyzing sales data alongside supplier data you can identify trends in customer demand you also learned that data from different parts of an organization may come from different sources and may be stored in different ways that’s when you identified the many different data sources supported by PowerBI like flat files relational data sources and NoSQL databases you also learned how to set up a flat data source after that you learned that local data sets provide data that is only available to a specific individual or organization and are typically stored locally local data sets are a good option for organizations or projects with few users that demand high security and need speed over quantity on the other hand shared data sets allow multiple individuals or organizations access to data and are usually stored on multiple locations or cloud-based platforms they are suitable for large enterprises or projects that require multiple users working at the same time then you had the opportunity to complete a practical exercise on how to set up an Excel 
After that, you covered the different storage modes in Power BI. You learned that you must think carefully about the benefits and limitations of each storage mode and select the one that best suits your needs. Import mode is a great option if you are working with small to medium-sized data sets, where the data is loaded into the Power BI data model; in this mode, data must be refreshed manually. On the other hand, DirectQuery mode connects directly to the data source, and queries are sent to the source in real time, so there's no need to refresh the data manually; however, this mode might impact performance. You also covered Dual and hybrid modes as alternative storage modes. After you explored these different storage modes, you learned how to configure them in Power BI, and you then had the opportunity to apply your skills and configure storage modes yourself.

You discovered that structured data, also known as relational data, is arranged into columns and rows. By nature, structured data is quantitative: easily searchable, sortable, and analyzable using tools like Microsoft Excel spreadsheets or relational databases, which can store large amounts of structured data. On the other hand, unstructured data does not have a predefined structure or format. Unstructured data is best used for qualitative analysis and usually resides in non-relational databases or unprocessed file formats; some examples of this type of data are text documents, audio and video files, social media posts, and images. Semi-structured data is not as organized as structured data and is not stored in relational databases; this type of data uses tags for organization and hierarchy. An example of semi-structured data is video files.

You then learned about connectors. Connectors are the bridges that connect Power BI to different sources. With connectors, you can import data from databases, files, Outlook servers, SharePoint, and many other sources. You also learned that before you start importing your data, it's important to understand your business requirements for the data source. You then explored the two types of operations used for creating automatic workflows: triggers and actions. Triggers are used to create efficient and effective scheduled actions; for example, Adventure Works can use triggers to automate parts of their Power BI workflow, like refreshing data and emailing reports. Next, you undertook another practical exercise, in which you implemented triggers to automate your workflow in Power BI. You then tested your understanding of the concepts that you encountered in this lesson in the knowledge check. Finally, you undertook a module quiz that tested your understanding of all the concepts you explored in this module. You should now be familiar with the fundamentals of data sources and be capable of extracting data from basic and advanced data sources to work with in Power BI. Great work! I look forward to guiding you through the next week's lessons, in which you'll learn about transforming data in Power BI.

You're making progress in your journey to become a data analyst. You've learned how to extract data, and now it's time to learn how to transform it so you can make better use of it. Depending on your data sources, data transformation can involve different activities, such as cleaning, merging, and profiling. In this video, you'll learn how to identify components of data transformation and understand why data transformation is required. Adventure Works CEO Jamie Lee has set a new goal for the company: to increase sales. She's relying on company data to uncover trends and insights and make that goal achievable.
Your manager, Adio Quinn, has asked you to create a Power BI report that visualizes the data in a meaningful way. But before you can start working with that data, you need to clean and transform the raw data to ensure its accuracy and consistency. In the first part of this course, when you explored the extract stage of the extract, transform, load process, you learned that data may come from different sources. However, the data from these sources may contain inconsistencies that make accurate analysis difficult. Data from different sources can be untidy, incomplete, and inconsistent, making it difficult to draw meaningful insights. That's why data transformation is a crucial step: it helps you prepare data for analysis.

Now let's examine some of the inconsistencies you may find in data. By this point in the course, you should know that data is classified into three main groups: structured, semi-structured, and unstructured data. Each data group is suitable for analysis but may require different tools to ingest, transform, and store. Data coming from sources that you define as structured is more ideal to work with and more compliant with rules, since these sources are systems that have strict rules and prioritize data integrity; data coming from conventional databases generally has a low probability of being inconsistent or erroneous. However, in semi-structured data, unstructured data, and even some types of structured data, it is likely that there is data that needs to be transformed before starting the report design.

For example, let's say you are working on an analysis related to products in an e-commerce database. For this task, you need some relevant fields for your report, but the table has hundreds of fields, so you need to decide how to identify the relevant data to create your report. A useful data transformation in this scenario is including certain columns from the data and excluding others before loading the data for analysis and reporting. Another transformation example is selecting fields and merging them, such as in a customer table with fields for the first and last name that you want to display as a single full name field, merging the fields with a space between them.

Now let's explore what data cleaning is. Data that is not structured is more flexible in terms of rules and therefore more likely to be disorganized and require cleaning. You may not encounter data as clean as you would expect in Excel data, or in data organized using delimiter symbols such as angle brackets or commas. In such cases, the data should have a preliminary examination to identify incorrect data, or separate rows whose content refers to the same values, like "ware house" written as two words and "warehouse" as one word. You can resolve these inconsistencies by passing them through filters with specific rules. This examination is referred to as data cleaning.

Another data issue you may encounter is the need to merge or append multiple data sources. For example, if Adventure Works has two data sources for sales, one for online sales and another for in-person sales, you'll need the data from both to create a monthly sales report. Depending on the data formats, you can use commands such as Append or Merge transformations to combine the data for analysis.
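To ground the two examples above, here is a minimal M sketch, using small hypothetical inline tables, of merging first and last names into a full name and appending two sales sources:

    let
        // Hypothetical customer table with separate name fields
        Customers = #table(
            {"First Name", "Last Name"},
            {{"Ada", "Lovelace"}, {"Grace", "Hopper"}}
        ),
        // Merge the two name columns into one Full Name column, separated by a space
        FullNames = Table.CombineColumns(
            Customers,
            {"First Name", "Last Name"},
            Combiner.CombineTextsByDelimiter(" "),
            "Full Name"
        ),
        // Hypothetical online and in-person sales tables with the same structure
        OnlineSales = #table({"Month", "Amount"}, {{"Jan", 500}, {"Feb", 650}}),
        InPersonSales = #table({"Month", "Amount"}, {{"Jan", 300}, {"Feb", 410}}),
        // Append the two sources into one table for the monthly sales report
        AllSales = Table.Combine({OnlineSales, InPersonSales})
    in
        // Return both results as a record for inspection
        [CustomersWithFullName = FullNames, CombinedSales = AllSales]

In practice you would do both operations through the Power Query Editor's buttons (Merge Columns, Append Queries), and the editor would write equivalent steps for you.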
In this video, you learned that data transformation can help improve data quality by removing errors, inconsistencies, and inaccuracies, resulting in cleaner, more reliable data for analysis. It also allows you to standardize data when working with multiple sources. With data transformation, you can help organizations like Adventure Works use data that is more understandable, organized, and consistent to achieve goals like increased sales.

In this video, you will explore some features of Power Query and learn to navigate the Power Query Editor interface. Adio Quinn, the data analyst at Adventure Works, asks you to clean and transform the company's sales data, which is scattered across multiple sources, in preparation for data analysis. Power Query can help you with this. Power Query is part of Power BI Desktop, allowing for seamless data preparation within the Power BI environment. Power Query is a data transformation and data preparation tool that allows you to connect to, clean, and transform data from a wide range of sources. It ensures that your data is ready for analysis, enabling you to create insightful visualizations and reports.

Let's explore how Power Query helps you clean, shape, and organize data from various sources. The first feature is data connectivity: Power Query connects to various data sources, both on-premises and in the cloud, directly within Power BI Desktop, so you can access data from traditional databases as well as file-based sources. Next, there's data extraction and transformation: Power Query's interface allows you to extract and transform data with ease, and during the extraction process you can filter, sort, and apply custom transformations, ensuring that you import only the required data. Then there's the Power Query Editor within Power BI Desktop, which provides a graphical user interface (GUI) for designing and managing queries; tabs such as Home, Transform, Add Column, and View contain data manipulation tools. There's also query reusability and applied steps: Power Query records each transformation as an applied step, allowing you to review, modify, or delete any step, which ensures that your data transformations are transparent and easily modifiable. Finally, there's performance and scalability: Power Query handles large data sets efficiently, using various techniques that optimize performance and reduce memory usage.

Let's demonstrate these features in Power Query. To achieve Jamie's goal of increasing sales, you must work with sales data from different regional teams, stored in different file formats like Excel, CSV, and even a SQL database. To get started, you'll need to import this data into Power BI using Power Query. To begin the import, add a data source in Power BI Desktop: in the Home tab, select Get Data to choose a data source. The Power Query Editor opens in a separate Power BI window, where you can apply various data transformations, such as removing columns, changing data types, and filtering data. Next, you need to load the data: select your data source, configure the connection settings if necessary, and select Transform Data to open the Power Query Editor.

Now let's discover how to navigate Power Query. The Power Query Editor has several key areas. Let's start with the ribbon, the set of toolbars at the top of the window that helps you quickly find the commands you need to complete your tasks. The ribbon tabs, such as Home, Transform, Add Column, and View, contain commands and tools for data transformation and manipulation. The Queries pane is located on the left side of the editor and displays a list of all the queries in your project; select a query to view or edit its applied steps and data preview. This pane is where you can manage and navigate between different queries in your project.
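Because each applied step is recorded as a named expression, every query you build in the editor is, underneath, a small M program (viewable via the Advanced Editor). A hedged skeleton of what a typical recorded query looks like; the file path and column names are hypothetical:

    let
        // Each applied step is a named expression that builds on the previous one
        Source = Csv.Document(File.Contents("C:\Data\regional_sales.csv"), [Delimiter = ","]),
        PromotedHeaders = Table.PromoteHeaders(Source),
        ChangedTypes = Table.TransformColumnTypes(PromotedHeaders, {{"Amount", type number}}),
        RemovedColumns = Table.RemoveColumns(ChangedTypes, {"Notes"})
    in
        // The final step is what gets loaded into the data model
        RemovedColumns

Each line between let and in corresponds to one entry in the Applied Steps list, which is why you can select, reorder, or delete steps individually.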
By selecting a query, you can view the data and the applied steps associated with it, helping you keep track of your work and maintain organization in your project. Then, on the right pane below the ribbon, there's the Applied Steps section. It displays the sequence of transformations applied to the selected query; select a step to view the data's state at that point, or delete, reorder, or modify steps as needed. The Applied Steps section provides a visual representation of the transformations applied to your data, making it easier to understand the changes made. By reviewing the applied steps, you can identify errors, redundancies, or inefficiencies in your data transformations. Finally, in the center of the Power Query window, there's the data preview. The data preview pane displays a preview of your data as it appears after the applied transformations. You can interact with the data by sorting, filtering, or changing the data type of columns. This pane enables you to review your data at different stages of the transformation process, helping you make your transformations accurate and effective before loading the data into the data model.

In this video, you learned that Power Query is a versatile tool in Power BI that streamlines data import, cleaning, and transformation from multiple sources. Its features, such as data connectivity and data extraction and transformation, make it an integral part of Power BI Desktop. It helps you prepare and transform data from different sources within Adventure Works to simplify analysis and create insightful visualizations and reports. The Power Query Editor interface offers a user-friendly experience, allowing you to perform various data transformations with ease.

Thanks to the Applied Steps list in Power Query, you can easily undo and reorder steps without losing progress. In this video, you'll learn how to use the Applied Steps list to undo, modify, and reorder steps. First, let's open the Power Query Editor in Power BI: from the Home tab, select Transform Data. After selecting your data source, the Power Query Editor opens in a separate window. Next, let's locate the Applied Steps list: you'll find it on the right pane below the ribbon. It lists all the steps you've performed on your data, in the order of application. The Applied Steps list is a visual representation of the transformations applied to your data; by reviewing it, you can identify errors, redundancies, or inefficiencies in your transformations. To view the data's state at a specific point in the process, select the corresponding step in the Applied Steps list.

The Applied Steps list makes it easy to correct a mistake, change your mind, or undo a transformation. To undo a step, simply select the X icon next to the step to remove it; Power Query will automatically revert the data to the state it was in before that step was applied. Please note that removing a step will also remove all subsequent steps in the list, as they depend on the previous transformations. What if you need to reorder the sequence of steps? To reorder steps, select and drag the step you'd like to move to a new position in the list; Power Query will update the data accordingly, applying the transformations in the new sequence. You should note that reordering steps might affect the results of subsequent transformations, so review your data and the Applied Steps list to check everything. Suppose you need to modify a step: just select the gear icon next to the step.
This opens a settings window where you can edit the transformation parameters; when you've made your changes, select OK to apply the update. As with reordering steps, modifying a step might affect subsequent transformations, so always review your data and the Applied Steps list to ensure everything is as expected. To add a new step, use the Power Query Editor ribbon to choose a transformation, such as filtering or sorting; when you perform a new data transformation, it's added to the Applied Steps list.

With the Power Query Editor you can also add filters. Filtering is the process of narrowing down your data set by displaying only the rows that meet specific criteria. It helps you focus on a particular subset of data, remove unwanted data that may affect your analysis, or simplify your data set for better readability. Let's check how to add a filter in the Power Query Editor. Select the column header for the column you want to filter; this highlights the entire column. With the column selected, select the small down arrow next to the column header. This opens a drop-down menu with filtering options, such as text filters, number filters, or date filters, depending on the data type in the column. Choose the type of filter and select OK. Notice that a new filtering step has been added to the Applied Steps list.

You can also sort your data set. Sorting is the process of arranging your data in a specific order, either ascending or descending. Sorting organizes data based on specific attributes, such as alphabetical order, numerical values, or chronological order, helping you identify the highest or lowest values in a data set. Select the column header for the column you want to sort. In the Home tab of the ribbon, find the Sort group and choose Sort Ascending (A to Z) or Sort Descending (Z to A) to sort the selected column. The data is sorted based on your chosen order; check the Applied Steps list to ensure the new sorting step has been added.

Finally, for better organization and readability, you can rename any step in the Applied Steps list. Just right-click the step you'd like to rename, select Rename, enter a new descriptive name for the step, and press Enter. Renaming steps helps you keep track of transformations, making it easier to navigate and understand the data transformation process. In this video, you learned how to use the Applied Steps list in Power Query to undo, modify, and reorder steps. It provides a visual representation of the data transformation process, making it easier to understand complex queries and track the impact of each action on the data set. The Applied Steps list provides easy undo and redo functionality, flexibility in reordering steps, and efficient troubleshooting capabilities, saving time and effort.
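The filter and sort steps demonstrated above map to M functions like the following. This is a minimal sketch using a small hypothetical inline table in place of real sales data:

    let
        // A small inline table standing in for real data
        Source = #table(
            {"Product", "Units"},
            {{"Road Bike", 12}, {"Helmet", 40}, {"Gloves", 25}}
        ),
        // Filter step: keep only rows that meet a criterion
        FilteredRows = Table.SelectRows(Source, each [Units] > 20),
        // Sort step: arrange rows in descending order by Units
        SortedRows = Table.Sort(FilteredRows, {{"Units", Order.Descending}})
    in
        SortedRows

Selecting the filter or sort options in the editor writes steps equivalent to these, which then appear in the Applied Steps list under names like Filtered Rows and Sorted Rows.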
How do you efficiently remove and rename columns to focus on the data that matters? You can do it with Power Query in Microsoft Power BI. In this video, you'll learn how to remove and rename columns and promote header rows in Power Query. As you continue to work on Adventure Works's goal to increase sales, your manager, Adio Quinn, asks you to prepare a report on sales and customer demographics. You have a data set with numerous columns, but you only need a few of those columns for your analysis. You must get the data organized and streamlined, but you're not sure where to start. That's where Power Query comes in: a powerful data transformation tool within Power BI that allows you to connect to different data sources and clean and transform data with ease. A common data manipulation you'll encounter is working with columns.

Working with columns in Power Query is an essential skill for data analysts and professionals who regularly deal with data. One of the main benefits of learning to work with columns is efficient data preparation: eliminating unimportant or repetitive columns allows you to concentrate on the most crucial data for your analysis, minimizing the data set size and streamlining the data structure for easier manipulation and quicker processing. Another benefit is improved data readability and interpretation: removing unnecessary columns helps declutter your data set, making it easier to read and understand, while renaming columns with more descriptive names helps you quickly identify the purpose and content of each column. Working with columns also allows for enhanced data analysis and reporting: by focusing on the most relevant columns, you can produce more accurate and meaningful analyses and deliver actionable insights to your team and organization, leading to better decision-making. Finally, working with columns means time and resource savings. Efficiently removing and renaming columns in Power Query can save you a significant amount of time during the data preparation stage, so you can devote more time to analyzing the data and generating insights. By streamlining your data preparation process, you also reduce the computational resources required to process your data, which can lead to faster analysis and, in some cases, cost savings, particularly when working with cloud-based services that charge based on resource usage.

Now let's explore a step-by-step guide on how to remove and rename columns and promote header rows in Power Query. Let's start by demonstrating how to remove columns. The first step is to load your data into the Power Query Editor: open Power BI, and on the ribbon select Home, select Get Data, and choose your data source, for example Excel or CSV. Once connected to your data, the Power Query Editor opens, displaying your data. The next step is to locate the columns you want to remove. To select a single column, select its header; if you need to select multiple columns, hold down the Ctrl key (or the Command key if you're using a Mac) and select multiple column headers. With the columns selected, you're ready to proceed: right-click any of the selected column headers, and in the context menu that appears, select Remove Columns. The selected columns are removed from your data set, and you'll notice a new step, Removed Columns, in the Applied Steps list on the right pane, reflecting the updated data state.

Now let's cover how to rename columns. First, select the header of the column you want to rename in the Power Query Editor. Right-click the header, and in the context menu select Rename. A text box appears; type in a new column name and press Enter to save the change. Again, you'll notice the new step in the Applied Steps list.

Let's check how to promote header rows. The first thing is to identify which row in your data set contains the headers; in most cases this is the first row, but if your data set has additional information or metadata above the headers, you may need to scroll down to find the appropriate row. Once you've identified the header row, you can promote it: on the ribbon, in the Home tab, locate the Transform group and select Use First Row as Headers. This promotes the first row to be used as column headers, replacing the existing headers. Note: if the header row isn't the first row, you'll need to remove any rows above the header row before promoting it. To do this, select the rows you want to remove by selecting the row numbers on the left side of the editor, then on the ribbon, in the Home tab, select Remove Rows. You'll notice a new step, Removed Rows, in the Applied Steps list on the right pane, reflecting the updated data state.
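Taken together, the remove, rename, and promote-headers operations produce steps like the following sketch. The workbook path, sheet name, and column names are all hypothetical:

    let
        // Load a worksheet from a hypothetical workbook
        Source = Excel.Workbook(File.Contents("C:\Data\SalesDemographics.xlsx"), null, true),
        Sheet = Source{[Item = "Sheet1", Kind = "Sheet"]}[Data],
        // Promote the first row to column headers
        PromotedHeaders = Table.PromoteHeaders(Sheet),
        // Remove columns not needed for the analysis
        RemovedColumns = Table.RemoveColumns(PromotedHeaders, {"Internal ID", "Comments"}),
        // Rename a column to something more descriptive
        RenamedColumns = Table.RenameColumns(RemovedColumns, {{"CustAge", "Customer Age"}})
    in
        RenamedColumns

Note that Table.RemoveColumns and Table.RenameColumns will error if a named column doesn't exist, which is one reason reconnecting to a file with a changed structure can break a reporting model.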
note if the header row isn’t the first row you’ll need to remove any rows above the header row before promoting it to do this select the rows you want to remove by selecting the row numbers on the left side of the editor then on the ribbon in the home tab select remove rows you will notice a new step removed rows in the applied steps list on the right pane reflecting the updated data state in this video you learned how to remove and rename columns in Power Query you also learned how to promote header rows these are important skills for you to master as an aspiring data analyst they empower you to transform raw data into valuable insights that drive smarter decision making and lead to a greater impact within your organization furthermore efficient data preparation saves time and computational resources when analyzing your data you need to ensure accuracy and reliability but data sets often contain errors that lead to inaccurate results using Power Query you can fix many common data set errors in this video you’ll learn how to identify common types of errors and discover how best to fix them using Power Query in PowerBI adventure Works is preparing to analyze its latest sales data worksheet however there are several errors in this data set like null values duplicate rows and inconsistent data types these errors must be resolved before analysis let’s take a few moments to help Adventure Works fix these errors using Power Query first you must import the data set to transform in this case it’s the Adventure Works sales data set on the home tab select get data and choose text CSV for the file type browse to the location of your data set and select open to import then select load to load the data next select transform data in PowerBI desktop the transform data button is in the home tab in the queries group of functions the button is positioned to the right of the recent sources button the sales data is loaded into Power Query it shows a list of bicycle products and key information about each product like name price weight category and description however several of these rows contain null or missing values these errors need to be resolved before the data can be analyzed to systematically identify missing or null values select the drop- down arrow in the column header for the variable you’re examining this opens a filter menu used to filter the data in the column based on specific criteria the filter menu contains options like empty or null available options depend on the data type of the column empty refers to blank cells in text columns null refers to missing values in numeric or date columns select the appropriate option to filter and display rows that contain missing or null values in the selected column inspect the data table in the editor and identify any rows with missing or null values in this data set two rows contain missing values row 16 and row 17 have a missing value in the product subcategory column now that you’ve identified the values you can resolve them there are three ways to resolve missing values you can replace them with default values replace them with values from another column or remove the rows containing missing values for adventure works the best approach is to replace its missing values with default values logical default values can represent the missing data without distorting the analysis or visualizations first in the ribbon at the top of the editor select the transform tab you use this tab to access the tools and functions for modifying and transforming the data 
Next, select the Replace Values button, then select Replace Values from the drop-down menu. You use this option to replace specific values in a column with a new value; in this case, you can replace all null or missing values. A Replace Values dialog box appears on screen. It has a text box labeled Value To Find, where you specify the value you want Power Query to identify and replace. The aim is to find missing or null values in the Product Subcategory column, so in the Value To Find box you can write null. Below the Value To Find box there's another text box, labeled Replace With. This is where you type the new value you want to replace the missing or null values with. The new value should be consistent with the column's data type, which is text, so let's replace the missing values in Product Subcategory with the text value Trail, which represents the default category for trail bikes. Finally, select OK to confirm and make the change. When you select the OK button in the Replace Values dialog box, Power Query scans the sheet for the values you've instructed it to identify, then replaces each instance of those values based on the criteria you specified in the Replace With box. You can review a history of all data transformation operations you've applied to the data set in the Applied Steps pane, on the right-hand side of the Power Query Editor window.

Adventure Works has fixed the null values in its data set, but there are still duplicate row errors present. The entries in rows 22 to 24 are duplicates of other records in the sheet, and identical records also exist in rows 25 to 27. Let's help Adventure Works resolve these errors. On the Home tab, access the data manipulation functions. From these functions, select the Remove Rows option, and a drop-down menu appears; select Remove Duplicates from the options. Power Query analyzes the data set and finds rows that have identical values in the selected columns. It then removes all but one instance of each group of duplicates.

That's good progress, with just one final error left in the data set: inconsistent data types, in the form of order dates. The inconsistent data is in the Order Date column. Select the column header to select and apply changes to the entire column. Next, select the Transform tab to access the data modification options, select the Data Type button, and then select the Date data type from the drop-down menu. This converts all values in the column to the selected data type, meaning all data types in the column are now consistent.

Thanks to your help, Adventure Works has removed all errors from its data set and can now perform data analysis without the risk of producing inaccurate results. You should now understand how to identify common errors in data sets, like missing or null values, duplicate rows, and inconsistent data types. You should also be able to resolve these issues using the tools available in Power Query. Identifying and resolving these errors is essential for making sure your analysis runs on accurate, reliable, and high-quality data.
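The three fixes just applied correspond to M steps roughly like these, assuming the column names shown in the video and a hypothetical CSV path:

    let
        Source = Csv.Document(File.Contents("C:\Data\AdventureWorksSales.csv"), [Delimiter = ","]),
        PromotedHeaders = Table.PromoteHeaders(Source),
        // Fix 1: replace null Product Subcategory values with the default "Trail"
        ReplacedNulls = Table.ReplaceValue(
            PromotedHeaders, null, "Trail", Replacer.ReplaceValue, {"Product Subcategory"}
        ),
        // Fix 2: remove duplicate rows
        RemovedDuplicates = Table.Distinct(ReplacedNulls),
        // Fix 3: give the Order Date column a consistent date type
        ChangedType = Table.TransformColumnTypes(RemovedDuplicates, {{"Order Date", type date}})
    in
        ChangedType

Each fix lands as its own entry in the Applied Steps pane, so any one of them can be reviewed or removed later without redoing the others.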
You are a data analyst at Adventure Works, tasked with analyzing sales data across different product categories and regions using PowerBI. Understanding the importance of reshaping the data to uncover valuable insights, you know you'll need to transform the data. So far in your introduction to transforming data in PowerBI in this course, you've learned about Power Query, data types, columns, and preparing a data set. In this video, you'll gain further insight into PowerBI's powerful data transformation capabilities by discovering unpivoting and pivoting in Microsoft Power Query. Unpivot and pivot operations are data transformation techniques that you can use to reshape and restructure data in PowerBI. Let's explore each operation in turn.

The unpivot operation refers to the transformation of data from a wide format with multiple columns to a narrow format with fewer columns by reshaping the data structure. It involves converting column headers into row values, resulting in a more structured and standardized representation of the data. The unpivot operation is useful in data analysis, supporting data normalization by organizing data in a tabular format. This facilitates analysis, variable comparison, and data aggregation and summary, as related information is consolidated into a single column. Transforming data from a wide to a narrow structure can also enable data compatibility and integration with other systems or tools that require a narrow format. For example, in the case of the Adventure Works sales analysis, you can perform the unpivot operation to convert the sales data, which is organized in a wide format with separate columns for each region, into a long format where the region-specific data is stacked vertically in a single column. This makes it easier to compare sales across different regions and gain a holistic view of overall performance.

On the other hand, the pivot operation refers to the transformation of data from a narrow format with fewer columns to a wide format with multiple columns by reorganizing the data structure. It enables data analysts to convert rows into columns based on specific criteria or values. This operation is often used to summarize and aggregate data, create cross-tabulations, and represent data in a more structured, easy-to-understand way for analysis and reporting. To illustrate, say you want to analyze the sales data based on different product categories as part of the Adventure Works sales analysis. Using PowerBI's pivot functionality, you can transform the rows containing individual product categories into separate columns. This pivot operation enables you to present the sales data in a more concise and structured manner, making it easier to identify trends, top-selling products, and performance within each category. You've been introduced to PowerBI's unpivot and pivot operations to transform and structure your data. As with other data transformation techniques, reshaping the data can help your team gain deeper insights and support business success through data-driven strategies, decisions, and actions.
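To make the wide-to-long reshaping concrete, here is a small M sketch using a hypothetical regional sales table; the table, column names, and values are invented for illustration:

```
let
    // Toy wide-format table: one sales column per region
    Source = #table(
        {"Month", "North", "South"},
        {{"Jan", 100, 80}, {"Feb", 120, 95}}
    ),
    // Keep Month in place and turn the region columns into attribute/value rows
    Unpivoted = Table.UnpivotOtherColumns(Source, {"Month"}, "Region", "Sales")
in
    Unpivoted
```

The result has three columns (Month, Region, Sales) and one row per month-region pair, which is the stacked long format described above.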
Now let's take a moment to work through a practical application of the unpivot and pivot operations to the Adventure Works sales data using Power Query in PowerBI Desktop. Suppose Adventure Works uses two separate Excel files to assess their quarterly sales and product and category distributions. The first Excel file contains the sales target data, consisting of three columns: Month, 2022, and 2023. Within this file there are 12 rows representing each month, and each row displays the target sales amount for the corresponding month and year. To enhance the table structure for easier readability, your manager asks you to perform an unpivot operation to create a table with columns for month, year, and target, which will also increase the number of rows. The second Excel file includes category and subcategory data, showcasing the category and subcategory data as columns without the product names. You are tasked with performing a pivot operation on this file to present the product count per category in a tabular format.

To address the tasks given to you by your manager, you can start by downloading and importing the two Excel files into Power Query. With each data source selected, select the Transform Data option to open the Power Query Editor, where you can apply various transformations, including the unpivoting and pivoting operations. For the first Excel file, containing the sales target data, you need to perform an unpivot operation. To unpivot the table columns, select the target query on the left menu, highlight the 2022 and 2023 columns, select the Transform ribbon tab in Power Query, and then select Unpivot. Rename the Attribute column to Year and the Value column to Target Amount. You now have an unpivoted table where the columns are converted to rows.

To accomplish the second task and pivot the table columns in the Excel file with the product categories and subcategories, select the product categories query on the left menu and select the Subcategory column. On the Transform ribbon tab, select Pivot Column. Then, in the Pivot Column window that displays, choose the column to use for values, expand the advanced options, and select the option Count (All) from the aggregate value function list. Lastly, select OK. With the Pivot Column feature applied, you change the way that the data is organized: subcategory names are converted to columns, and the row count for each subcategory is added as a row value for each column.

In this video, you explored unpivot and pivot operations in PowerBI and the application of both in practice. By building your technical expertise and learning about effective data transformation techniques like unpivoting and pivoting, you can maximize the potential of PowerBI to unlock valuable insights from business data, ultimately contributing to the growth and success of organizations like Adventure Works.
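The same two tasks can be sketched in M. This assumes the queries are named Targets and ProductCategories, and that the category file has Category and Subcategory columns (hypothetical names based on the scenario above):

```
let
    // Task 1: unpivot the 2022 and 2023 columns into Year / Target Amount rows
    UnpivotedTargets = Table.Unpivot(
        Targets, {"2022", "2023"}, "Year", "Target Amount"
    ),
    // Task 2: pivot Subcategory values into columns, counting rows per subcategory
    PivotedCategories = Table.Pivot(
        ProductCategories,
        List.Distinct(ProductCategories[Subcategory]),
        "Subcategory",
        "Category",
        List.Count
    )
in
    [Targets = UnpivotedTargets, Categories = PivotedCategories]
```

Returning a record here is just a way to show both results in one snippet; in practice each transformation would live in its own query, as in the walkthrough above.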
You're making good progress in your journey to becoming a data analyst. You've learned how to transform data by using Power Query and have worked on data sets. Now it's time to learn how to combine different data sources so you can use them more effectively. The capability to combine queries is valuable, as it empowers you to combine and merge diverse tables or queries, enhancing your data analysis capabilities. In the next few minutes, you will be introduced to why combining data may be necessary and how you can combine tables or queries.

Adventure Works has recently acquired another bicycle business. Adventure Works CEO Jamie Lee has assigned a task to the sales department to ensure that sales data from this business is incorporated into the Adventure Works sales reports. Your manager, Adio Quinn, has tasked you with creating a PowerBI query that merges the data. But before you start working on the data, you first need to understand the reasons why it is important to combine data. The first reason for combining data is that it allows you to consolidate information from various sources or tables into a single table. This consolidation can provide a unified view of the data, making it easier to analyze and gain insights. The next reason why you would combine tables is to create relationships. Combining tables is crucial for establishing relationships between related data in PowerBI. Relationships between tables are used to create meaningful visualizations and enable interactive analysis; by combining tables, you can link data points across different tables based on common fields or keys. Combining tables also enables you to enrich your data by adding additional information. For example, you may have a table with client details and another table with product information. By combining these tables, you can create a comprehensive data set that includes both client and product details, allowing for a more comprehensive analysis. Another reason to combine data is that it provides a broader scope for analysis: by merging multiple tables, you gain deeper insights by analyzing data from different angles. And lastly, combining tables helps simplify data management in PowerBI. Instead of working with multiple separate tables, having a single consolidated table reduces complexity and makes it easier to handle data updates, refreshes, and maintenance tasks.

Now that you understand the reasons why it is important to combine data, let's look at the ways to do it in PowerBI. There are two ways to combine data: append and merge. When you append queries, you are adding the rows of one table or query to another table or query. By adding multiple lists one below the other, you will see an increase in the number of rows. Say, for instance, you have two separate classes, class A and class B, that need to take an exam together. To do this, you have to combine the 20 students in class A with the 20 students in class B, resulting in a combined class list of 40 students. On the other hand, when merging queries, you consolidate data from multiple tables into a single entity by leveraging a shared column between the tables. For example, data with specific content, such as gender, category, and city, is stored in different independent tables and referenced by the main tables that require this information. This allows you to use this information within a specific context, enables easy data classification, and ensures data integrity. You will learn more about both of these operations over the coming lessons. In this video, you learned about data combination techniques and the reasons for using them. Combining data in PowerBI is essential for creating accurate, comprehensive, and interactive reports and visualizations. It allows you to leverage the full potential of your data by consolidating relevant information from multiple sources, establishing relationships, and enabling more insightful analysis. Good job.

Adventure Works has recently acquired an additional bicycle business. Your manager, Adio Quinn, tasked you with creating a PowerBI query that merges the current sales data of Adventure Works with the sales data from the newly acquired business, and he needs the query by the end of the day. But you do not panic: you know that PowerBI can help you combine different tables and queries to consolidate information, create relationships, enrich data, enhance analysis, and simplify data management. In the next few minutes, you will learn why appending tables or queries may be required. At the end of this video, you will also be able to describe the operation of appending one table to another. By now, you know that there are two ways to combine data in PowerBI: append and merge. When merging queries, you consolidate data from multiple tables into a single entity by leveraging a shared column between the tables; you will learn more about merging in the coming lessons. When you append queries or tables, you add rows from one or more tables to another query or table. In this video, you will focus on append. Before I demonstrate how the append operation is done, let me share a very important tip with you.
Say your manager has asked you to list the Adventure Works products that have fewer than 100 units sold for the current year. The products that have not been sold do not appear in the sales table, so you have to identify them by subtracting the sold products from all the products. As a result, you have two data sets to be combined: products with 100 or fewer sales, and products that have never been sold. If you only list the products with sales data of less than 100, you won't include the products that haven't been sold at all. To overcome this problem, you have to append the products with total sales below 100 and the ones that haven't been sold at all to present the complete picture.

Back to the task Adio set you. Before you append the adventure works sales.xlsx and other sales.xlsx files, you have to format the data of both files to ensure they have an equal number of columns and that the columns have the same names and data types. If you don't have an equal number of columns, or the column names differ, the extra columns are added to the far right of the combined query: they preserve their values for rows from the originating query, and null values are set for rows from the other query. In this example, columns A and B are common to both data sets. Columns C and D are unique and are added to the right of the combined list. Since column D does not have any data in the first data set, its row values will be null after the append; similarly, in the second data set, null values will be added for the previously non-existent column C. This may be confusing, so try to have an equal number of columns with the same column titles.

Let's explore how this is done. To format the tables, select the other sales query in the Queries pane at the left of the Power Query window. Rename the Quantity column to Order QTY, Name to Product Name, and Total to Line Total by selecting the column names. Once you have completed the reformatting process, you can append the queries. On the Power Query Editor ribbon, navigate to the Home ribbon tab and select the Append Queries drop-down menu. You can select Append Queries as New to create a new query or table from the appended output, or select Append Queries to add the rows from an existing table into another. If you select Append Queries as New, you will create a new master table. This selection displays the Append window, where you can select the tables you want to combine from the Available Tables section and add them to the Tables to Append section. When you select OK, a master table is created that contains the sales data of both Adventure Works and the newly acquired company. In this video, you learned how to combine data by appending tables and queries. By appending different sales data, you can create a master sales table. This will help you to consolidate and enrich data from multiple tables and queries, and simplify data management.
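In M, the rename-then-append sequence looks like the following sketch; the query names AdventureWorksSales and OtherSales are assumptions standing in for the two imported workbooks:

```
let
    // Align the acquired company's column names with the Adventure Works schema
    Renamed = Table.RenameColumns(
        OtherSales,
        {{"Quantity", "Order QTY"}, {"Name", "Product Name"}, {"Total", "Line Total"}}
    ),
    // Append Queries as New: stack the rows of both tables into one master table
    MasterSales = Table.Combine({AdventureWorksSales, Renamed})
in
    MasterSales
```

Table.Combine matches columns by name, which is why aligning the names and data types first matters: any column that exists in only one table comes through with nulls for the other table's rows.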
Combining or joining data from different sources is like putting puzzle pieces together to form a big picture. The big picture can help you discover details you could have missed when examining the individual pieces. In this video, you will discover what a join is and explore the purpose of joining data and its importance in data analysis. Before we explore the power of joining data to unlock new perspectives, you need to understand what a join is. When you have data in two tables and the columns of those tables are exactly the same, appending the data from one table to another is straightforward. However, to combine the data of two tables with different column structures, you need to specify the method in which the two tables should be combined. This is known as a join. A join is when you merge or combine data from different places to create a bigger and more complete data set. It helps you view all the information in one place, like putting puzzle pieces together to understand the whole picture.

Let's look at an example. Your manager, Adio Quinn, has tasked you to list all products with their category names and indicate which category has the most products. During your investigation, you notice that category data is referenced in a table called Categories, linked through a common column named Category Key. On closer inspection, you notice the row with a category key of 1 has a category name of Bikes, and the row with a category key of 2 has a category name of Accessories. Your conclusion is that any row with a value of 1 in the Category Key column has Bikes as the product's category. One of the key usage areas of joins is merging two or more tables in this manner, matching related data by using the relationship. Joining data is essential for PowerBI data analysts because it enables you to combine information from different sources, giving you a complete picture of the data. Joining data can help you validate data accuracy, make informed decisions, and perform advanced analysis. It also empowers you to gain a holistic understanding, uncover valuable insights, and make data-driven conclusions. Overall, a join is a powerful technique that enhances your data analysis capabilities and allows you to unlock the full potential of your data.

In a previous video, you learned that there are two ways to combine data in PowerBI: append and merge. In both merge and append operations, the use of a join is essential for combining tables effectively. Let's explore merge with join in more detail. When you merge queries, you're combining the data from multiple tables into one, based on a column that is common between the tables. Merge with join allows you to match related data, integrate data, and explore relationships. When you append queries, you are adding rows of data to another table or query. Append with join helps you to ensure consistency and allows you to expand your existing data set. Whether it's a merge or append operation, the use of a join is essential for aligning, integrating, and combining data from different tables. It ensures that the relevant information is properly matched and merged, enabling you to analyze and understand the data in a meaningful way. In this video, you learned what a join is, as well as the purpose of joining data and its importance in data analysis.

By now, you are aware that combining data and using join keys can save you hours of searching through vast amounts of data for a specific product item. But did you know that you can simplify your query even further by specifying how the data should be combined? In this video, you will learn about join types, specifically the difference between left outer, right outer, full outer, and inner joins. A join type in Microsoft PowerBI refers to how tables of data are related to each other in the software. Joins are important because they determine how data is consolidated from multiple sources into a single view. Understanding join types and their implications is crucial to building accurate, efficient, and meaningful data models in PowerBI. Over the next few minutes, you'll be introduced to four different join types: left outer, right outer, full outer, and inner join. Let's explore each join type and the way it combines data from multiple tables based on matching criteria.
Let's say we have two tables: one on the left for sales and one on the right for countries. The sales table has three columns: date, country ID, and units. The countries table has two columns: ID and country. The sales table's country ID column can be used as a join key with the ID column of the countries table. Now let's explore each join type and how it combines data.

First, let's start with a left outer join. If a left outer join is used, all rows in the left table are kept, and the matching rows from the right table are merged in. If the left table is missing columns that the right table has, those columns are included as part of the merge. It is important to note that if there is no match for a row between the tables, default or null values will be used for columns where matching data is unavailable. In this scenario, the resulting table will have the columns from the left table (date, country ID, and units) along with a country name column. Since the right table did not have a country ID of 4, that row's country name is null. A right outer join works similarly to the left outer join, except that all rows in the right table are kept, and the matching rows from the left table are merged in. Again, if the right table is missing columns that the left table has, those columns are included as part of the merge. Similarly, if there is no match for a row between the tables, default or null values will be used for columns where no matching data is available. In our scenario, the resulting table will have date, country ID, units, and country name. The full outer join is used when you want to retrieve all records from both tables, regardless of whether they have matching values in the join condition. In this scenario, since the right table has an ID of 4 and the left table does not have a corresponding entry with a country ID of 4, a row is created with a country name for ID 4 and null values in all other columns. In the previous video, What is a join, you used full outer joins when appending with joins by matching related data. For an inner join, only matching rows from both the left and right tables are merged together. This join type is helpful when you want to focus only on the sales that have corresponding data in another table and exclude any sales data that doesn't match.

As a data analyst, you often come across the requirement to combine data from different tables or data sets, such as sales and product tables. This is where merging operations, and specifically join types, become crucial. Keep in mind that you should choose the join type based on the specific needs of the analysis: the choice of join type will impact the inclusiveness of the data in your analysis. It's important to consider your analysis objectives and the specific requirements of your project. Each join type serves a different purpose, and selecting the appropriate one ensures that you obtain the desired result set, such as for an analysis of order and order details data.
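Here is a compact M sketch of all four join kinds, using invented sample rows that mirror the sales and countries example (one sales row with an unmatched country ID, and one country with no sales):

```
let
    Sales = #table(
        {"Date", "CountryID", "Units"},
        {{"2023-01-05", 1, 20}, {"2023-01-06", 2, 15}, {"2023-01-07", 4, 8}}
    ),
    Countries = #table(
        {"ID", "Country"},
        {{1, "USA"}, {2, "Canada"}, {3, "Mexico"}}
    ),
    // Left outer: every Sales row kept; CountryID 4 gets a null Country
    LeftJoin  = Table.Join(Sales, "CountryID", Countries, "ID", JoinKind.LeftOuter),
    // Right outer: every Countries row kept; Mexico gets nulls in the sales columns
    RightJoin = Table.Join(Sales, "CountryID", Countries, "ID", JoinKind.RightOuter),
    // Full outer: all rows from both sides, nulls wherever there is no match
    FullJoin  = Table.Join(Sales, "CountryID", Countries, "ID", JoinKind.FullOuter),
    // Inner: only the rows that match on both sides (CountryIDs 1 and 2)
    InnerJoin = Table.Join(Sales, "CountryID", Countries, "ID", JoinKind.Inner)
in
    InnerJoin
```

Swapping the JoinKind value is all it takes to change how inclusive the combined result is, which is exactly the trade-off described above.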
As you start working with more and more data sources, keeping all the different data in different tables will quickly become unmanageable. Identifying similar and related data that can be merged is an important skill for a data analyst. Over the next few minutes, you will learn how to identify and merge tables using joins in PowerBI. In relational data, fields such as category or status are often kept in a separate table. For instance, when a new product is added, the category information is associated with an entry in a different table instead of being manually repeated in multiple rows in the product table. As you have previously learned, data from two different tables can be linked by join keys. This works for tables from individual and multiple data sources. However, sometimes you'll be working with a single data source, such as a database, where these relationships are already established. In these scenarios, merging the data using a join is a straightforward operation: a column in one table acts as a key to the column of another table. In databases, this is known as a foreign key relationship, and the foreign key is used as the join key. Manually repeating such values instead would be almost impossible for databases that have a large number of products, for example an e-commerce business selling books, or Adventure Works, which sells a large number of product variants. Selecting from defined categories, or any other parametric data, ensures easy classification of data and enables us to work within a consistent and comprehensive data set.

Consider a scenario where you are working in the sales department of Adventure Works, a multinational bicycle store, and you have been given a task by your manager, Adio Quinn, to consolidate orders and their corresponding details, currently in two tables, into a single table. There is a typical foreign key relationship between the orders and order details tables, which is Order ID. Adventure Works provides the following details to deal with situations such as this. The orders table is created to store information such as the name of the store, the date of the purchase, the cashier's name, and so forth. Since there can be multiple individual products associated with a single order, the Adventure Works database has a separate but related table to store these variable numbers of associated product purchases. It allows you to add new products to your current purchase by opening as many rows as needed. In this way, you develop a structure that is dynamic and flexible, saving space and time by only storing the necessary information. To truly understand the join operation, or in PowerBI terms the combine with merge operation, it is important to first understand the relationship between tables. The merging operation arises from the need to separate tables: avoid forcibly distributing data that can be stored in a single table into separate tables, and visualize relationships such as product-category, transaction-status, or person-city, where the definition table and its rows need to be separated. In the order example, the order details connect unique data with repeating data in a more efficient manner.

Now you can complete your task to combine the two tables, orders and order details, with merge. Go to Home on the Power Query Editor ribbon, select Combine, then the Merge Queries drop-down menu, and select Merge Queries as New. This selection opens a new window where you can select the tables that you want to merge from the drop-down list. Next, select the column that matches between the tables, which in this case is Order ID. Select Left Outer Join in the Join Kind drop-down, which displays all rows from the first table and only the matching rows from the second. After you select OK, you are directed to a new window where you can view your new merged query.
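In M, the same merge can be sketched with Table.NestedJoin followed by an expand step, which is broadly what the Merge Queries dialog generates behind the scenes. The query names Orders and OrderDetails, and the expanded column names, are assumptions for illustration:

```
let
    // Merge Queries as New with a left outer join on Order ID
    Merged = Table.NestedJoin(
        Orders, {"Order ID"},
        OrderDetails, {"Order ID"},
        "Details",
        JoinKind.LeftOuter
    ),
    // Expand the nested Details table into regular columns
    Expanded = Table.ExpandTableColumn(
        Merged, "Details", {"Product Name", "Quantity", "Unit Price"}
    )
in
    Expanded
```

Every order row is kept, and orders with several detail rows are repeated once per matching product, which is the behavior you want when flattening a one-to-many relationship.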
Now let's take a look at doing this in more detail in Microsoft PowerBI. In this scenario, you are working in the sales department of Adventure Works, a multinational bicycle manufacturer, and you have been given a task by your manager, Adio Quinn, to consolidate orders and their corresponding details, which are currently in two tables, into a single table. In PowerBI, you select the Excel Workbook option in the Data group of the Home tab, then select order.xlsx and order details.xlsx. There is a typical foreign key relationship between the orders and order details tables. Let's try to understand this with an example from our own social lives. We have all probably shopped at a market at least a few times. At the end of the shopping, we go to the cashier, scan our items, make the payment, and receive a receipt. The receipt contains information such as the name of the store, the date of the purchase, the cashier's name, and various other details. At the bottom of the receipt, there is a section that lists the quantity, unit price, and total amount for each item purchased, followed by a grand total, or the amount paid. Now let's explore how we can structure these commonly encountered pieces of information into a table format.

Adventure Works provides the following details to deal with these situations. The order table is created to store information such as the name of the store, the date of the purchase, and the other details found on the receipt in our earlier market scenario. Since there can be multiple individual products associated with a single order, the Adventure Works database has created a separate but related table to store these variable numbers of associated product purchases. It allows you to add new products to your current purchase by opening as many rows as needed. In this way, you develop a structure that is dynamic and flexible, saving space and time by only storing the necessary information. To truly understand the join operation, or in PowerBI terms the combine with merge operation, it is important to first understand the relationship between tables. If there is a need to separate tables, the merging operation arises from that need. Avoid forcibly distributing data that can be stored in a single table into separate tables; visualize relationships such as product-category, transaction-status, or person-city, where the definition table and its rows need to be separated. Now, in the order and order details example, you have connected unique data with repeating data in a more efficient manner.

Now you complete your task to combine the two tables, order and order details, with merge. Go to Home on the Power Query Editor ribbon and select Combine, then the Merge Queries drop-down menu, where you can select Merge Queries as New. This selection will open a new window where you can choose the tables that you want to merge from the drop-down list and then select the column that matches between the tables, which in this case is Order ID. You will choose to use a left outer join in the Join Kind drop-down, which displays all rows from the first table and only the matching rows from the second. After you click OK, you will be routed to a new window where you can view your new merged query. And that concludes how to combine tables with merge in PowerBI. In this video, you learned how to combine data by merging tables and queries. It can help you to consolidate information from multiple tables and queries by using related fields with foreign keys. Good job.

Adventure Works is looking to expand its business by identifying new product lines that it can market to its customers. It hopes that the results of data analysis will identify potential new product lines. Meet Daniel. He's a talented data analyst with Adventure Works, their in-house expert on configuring and transforming data in PowerBI, including merging data in Power Query. Adventure Works has noticed that a lot of customers have been returning bicycles to their stores for repair and maintenance. These are often very simple repair and maintenance tasks, like replacing tires or tightening loose bolts and screws.
The company suggests that Daniel analyzes the customer and sales data related to these transactions: perhaps these customers might be willing to purchase a service plan for their bicycles. First, Daniel identifies the relevant data sources. He begins with an Excel sheet named Sales Data. This worksheet contains data on each bicycle Adventure Works has recently sold, including the categories they belong to, a description of each bike, the prices they sold for, and the staff who sold them. The worksheet also includes data on the repairs carried out on each bike, like the names of the parts that were replaced. There are other relevant data sets available on a sheet named Customer Data. This worksheet provides information on all customers, including their names, contact details, age, the bikes they have purchased, and the repairs they have requested.

Daniel uploads these data sources to PowerBI, where he configures them for data analysis by transforming the data sets in Power Query. Once the data has been configured and transformed, Daniel then uses joins to merge these worksheets together to identify what kind of bicycles customers are buying, which customers are sending their bicycles to the store for repair, and what kind of repairs are required. He uses the results of his analysis to segment customers into profiles that focus on data such as age groups, location, and purchases. He then identifies related search engine queries for individuals who match these profiles. Through combining and analyzing this data, Daniel discovers that many of the customers seeking repairs are adults between the ages of 18 and 35 who live in rural areas. This demographic mostly purchases mountain bikes, which they use for weekend biking excursions. He presents his data insights to Adventure Works. The company realizes that it can offer these customers a service plan or bicycle health check. In addition, existing store staff can carry out these repairs, so no new staff are needed to deliver this product. It also helps the business to retain, and generate a new revenue stream from, existing customers.

This scenario emphasizes the importance of combining or merging data sources in Microsoft PowerBI. By combining data sets, you can deliver new insights on a topic: in the case of Adventure Works, Daniel was able to create a customer profile and identify the needs of that profile, and Adventure Works then provided a new product to this customer profile. When it comes to generating data insights, the benefits of merging data sources can't be overstated. The more data you have on your topic, the greater an understanding you can develop, and all of this can be achieved with Microsoft PowerBI and a strong data analytics skill set.

Congratulations on reaching the end of the third week in this course on extracting, transforming, and loading data in PowerBI. You've now reached the end of this module. Let's take a few minutes to recap what you've learned. You began this module by exploring the process of transforming data in PowerBI. You first examined why data needs to be transformed. You learned that raw data is not always gathered or sourced in a condition that's suitable to work with: it might be incomplete, inconsistent, or have other errors, so it's important that you transform and clean your data. You can clean data by setting up filters in PowerBI that identify and resolve errors. This way, the filtered data is accurate, consistent, structured, and easier to analyze. You then reviewed Power Query and its interface.
You learned how to navigate this interface and locate useful tools and features for connecting, cleaning, and transforming data from a wide range of sources, and you explored the steps for these actions by helping Adventure Works connect to its data sources and then clean and transform the data they contained. An important part of this cleaning process is the Applied Steps list, an editable list of all transformations applied to a selected query. You can use this list to undo and reorder steps in the process. Next, you explored the different data types in PowerBI. The data types you explored included number types, date and time types, text, true/false, and binary. You learned that these different data types are used to classify values to help you better organize and structure your data sets. You also learned that when working with data sets, you might need to remove and rename columns. You were presented with many of the benefits of reworking columns, like more efficient, readable, and enhanced data and analysis, or significant time and resource savings. You continued to explore Power Query by reviewing steps for dealing with common errors. Power Query can fix errors like null values, duplicate rows, and inconsistent data types, and it's important to resolve these errors before analyzing your data. You then made use of your new knowledge by helping Adventure Works to prepare a data set by cleaning the data and resolving its errors. You then undertook a knowledge check, in which you proved your understanding of the concepts you encountered by answering a series of questions. Finally, you explored a list of additional resources designed to help you improve your knowledge of the topics that you covered this week.

In the second week of this module, you explored advanced data transformation methods in PowerBI. You began this week by learning about the importance of data combination: it lets you combine information, create relationships between tables, improve data and analysis, and simplify data management. You then reviewed the two main methods for combining data in PowerBI, which are append and merge. Append means to add the rows of one table or query to another; merge means consolidating data from multiple data sources into a single table. And you examined the process for combining tables with append in the Power Query Editor. You then put your new skills to use by assisting Adventure Works with appending tables in their database. Next, you completed a knowledge check, which tested your understanding of these concepts through a series of questions, and you were presented with a list of additional resources that you could review to learn more about advanced data transformation.

In week three, you learned about methods for combining data that you could use for data transformation. You discovered that one method of combining data is to use a join. A join is a useful way of combining data from different sources, and join keys are the values used to link rows between tables. You also learned that there are different types of joins: the left outer join, right outer join, full outer join, and inner join. Which of these join types you choose to use depends on your data transformation needs. You then looked at how to combine tables using a merge operation in the Power Query Editor. By identifying the relevant keys and the required join operations, you can merge two or more tables to deliver new insights into your data. Next, you demonstrated your competence with these new skills by helping Adventure Works to merge two of their data sources to deliver new insights into their business.
Finally, you undertook a knowledge check, which tested your understanding of the concepts that you encountered this week, and you completed a module quiz in which you demonstrated your understanding of all the concepts you encountered throughout the entire module. You've learned a lot about transforming data in PowerBI, and as you approach the next module, consider going through some of the learning material again to reinforce your understanding. Looking ahead, you will expand your knowledge of the ETL process by diving into advanced ETL in PowerBI, where you will learn all about loading and profiling data and advanced queries. Best of luck.

You have gained detailed knowledge about the extract and transform steps in the ETL process so far, and you have applied this knowledge by considering scenarios and tasks. In this video, you will learn about the final step of the ETL process: load. The load operation, in summary, enables the transformed data obtained by reading from a data source to become available for reporting purposes. Considering that the ultimate goal of PowerBI is to provide data visualization through reports and dashboards, the importance of making the data available for this purpose becomes evident. Up until the load stage, you have completed tasks such as accessing data sources, establishing connections, extracting data, and performing transform operations. The purpose of all these operations was to bring meaningful and cohesive data into the reporting interface, filtered based on specific criteria. The load process ensures the visualization of all the extracted and transformed data.

There are two main ways to load data in the PowerBI user interface: Load and Transform Data. Let's look at each option a bit closer, starting with Load. With the Load option, data is loaded directly into the Data pane in PowerBI. If you choose to load data directly, you can still transform the data at a later stage. The second option, Transform Data, allows you to transform the data before loading it. The changes to the data are applied to the data model, and the Data pane is refreshed in PowerBI; visualizations can now use the applied changes. Whether you choose to load the data directly with the Load option or transform the data before loading with the Transform Data option, loading time can vary depending on the size of your data set. Optimizing performance and reflecting updated data from the source in reporting are of great importance in the data loading process, and in the upcoming sections you will gain detailed information about these topics.

In some cases, you might have source tables that are used during the ETL process but will not be used directly in the reporting area, and some of these tables may not meet the production demands of your data warehouse. In such cases, you will need an intermediate state between the data source and the data warehouse called the data staging area. A staging area serves as an intermediate storage location for raw or unprocessed data, allowing it to be temporarily stored and prepared for further processing in a data pipeline. The existence of a data staging area is not obligatory for your ETL jobs, so you can execute ETL jobs without creating staging areas. However, it is recommended to simplify the process of cleansing and consolidating data coming from multiple sources. By now, you know that the data loading process is the final step of the ETL operation and that it is the most crucial step for making the data available in the reporting environment.
To achieve this, the data is loaded into PowerBI either directly from the data source or after performing transformation operations. Additionally, a staging area is often used as an intermediate step to store the data in a more organized manner, aiming to facilitate maintenance and management tasks. By completing the load stage, you are ready to explore the data, create compelling visualizations, and gain valuable insights to support decision-making for your organization.

Data staging is one of the key concepts in data loading. Over the next few minutes, you will learn the basics of data staging, the reasons for its necessity, and the advantages of using it in the overall ETL process. To better understand the concept of staging, let's use an everyday life example. Imagine you've invited friends over for dinner and you've bought ingredients from the grocery store to prepare the meal. However, you don't serve the ingredients as they are: you might marinate the meat in a pot, cut the vegetables and place them in a bowl for washing, and prepare other dishes, like making a salad or putting appetizers on a plate. In this example, all the ingredients represent raw data, while the processes of marinating, washing, cutting, and waiting correspond to ETL operations. The pots, bowls, and other utensils used before serving can be thought of as the staging area.

Now let's apply this everyday life example to data staging. A staging area serves as an intermediate storage location for raw or unprocessed data, allowing it to be temporarily stored and prepared for further processing. The staging area typically acts as a bridge between the data sources and the data warehouse. A staging area simplifies the process of data cleansing and the consolidation of operational data originating from multiple source systems, particularly for enterprise data warehouses that centralize an organization's critical data. Remember, a data staging area is not required for your ETL jobs; you can still execute ETL jobs without creating one. However, if you need to consolidate data coming from multiple sources, it is recommended.

Over at Adventure Works, the company receives feedback about its products from various channels, such as social media platforms and corporate websites. Your manager, Adio Quinn, has tasked you with preparing a data set by using these resources, consolidating and preparing the data for use in reports and dashboards. None of the feedback can be used in its raw form, as the channels have different formats; you must transform the data and then consolidate it into a unified list. Since you will only use this data during the ETL process, it is appropriate to use a staging area. Let's take a few moments to complete this task using Power Query. The first step is to import the two data sets, Adventure Works Social Media Feedbacks 1 and Adventure Works Social Media Feedbacks 2, to transform and consolidate in the staging area. To do this, navigate to the Home ribbon tab at the top of the PowerBI window, select the Excel Workbook button inside the Data group in the middle of the toolbar, select your data sets, and select Open. Then select your data sets and select Transform Data in the window that opens. You now have two queries, Adventure Works Social Media Feedbacks 1 and Adventure Works Social Media Feedbacks 2, in the Queries pane at the left of Power Query. To successfully complete your task, you have to consolidate these two queries into a single query and add an extra column to indicate where the feedback came from. To do this, you have to use these queries and integrate the data into a more defined and optimized model.
Because you have to consolidate these two tables into one but also keep them separately, you need a staging area. You have to create a new group called Staging Area. In the Queries pane at the left of Power Query, select New Group, type Staging Area in the Name text box, and select OK. Now move both data sets, Adventure Works Social Media Feedbacks 1 and Adventure Works Social Media Feedbacks 2, to the Staging Area group. Your tables are now organized according to your need. Select the Adventure Works Social Media Feedbacks 1 and Adventure Works Social Media Feedbacks 2 tables in turn and disable the load by clearing the Enable Load checkbox; keep the Include in Report Refresh option. This way, both tables will still be used in queries but will not be part of the data model. You are now familiar with the concept of a staging area and how it is implemented in PowerBI.

Imagine you have just started working at Adventure Works as a data analyst. You have a lot of data to analyze to determine which products are preferred by which client, and why. To perform successful analysis on this many items, it is necessary to have data that includes fields suitable for analysis, with an adequate amount of data and a variety of data ranges representing the overall data. Over the next few minutes, you will be introduced to data profiling and statistical analysis and why they are important when reviewing data sets. By the end of this video, you will have a high-level understanding of data profiling and statistical analysis when reviewing data sets. You will also learn about distribution, anomalies, and outliers in the context of data profiling.

Let's first cover an introduction to data profiling. Before analyzing any data set, it is important to examine and evaluate the data you are working with. Analyzing the data without evaluating its accuracy, completeness, and alignment with your objectives can lead to misleading results. When examining a data set for the first time, there are several aspects you should look at, especially for numerical fields. You should check these characteristics for each numerical field: minimum (min), maximum (max), average (mean), frequently occurring values (mode), and standard deviation. The best way to start assessing data is with data you can immediately troubleshoot. Imagine you are reviewing a data set that has an age field. There could be someone in the data set with an age of 200, which would be extremely unlikely to be true; if so, there may be an outlier in the data. Look at the minimum and maximum values: values between, say, 21 and 77 are realistic ages, unlike 200.

The concept of the distribution of data refers to how the data points are spread or arranged within a data set. It describes the pattern or shape of the data when plotted on a graph. Understanding the distribution of data is crucial in data analysis because it helps you gain insights into the central tendency, variability, and overall characteristics of the data. Next, let's consider outliers. The formal definition of an outlier in statistics is a data point that significantly deviates from other observations. Outlier data can be handled by applying a technique called min-max scaling, or normalization: the values are rescaled into a fixed range, such as 0 to 1, while preserving the relative distances between data points, which limits the influence of extreme values. Analyzing the distribution allows you to make informed decisions, identify outliers, and choose appropriate statistical techniques for further analysis.
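As a quick illustration, min-max scaling maps each value x to (x − min) / (max − min). Here is a minimal M sketch, assuming a query named Survey with a numeric Age column that contains no nulls (hypothetical names):

```
let
    ages = Survey[Age],
    minAge = List.Min(ages),
    maxAge = List.Max(ages),
    // Rescale every age into the 0..1 range, preserving relative distances
    Scaled = Table.TransformColumns(
        Survey,
        {{"Age", each (_ - minAge) / (maxAge - minAge), type number}}
    )
in
    Scaled
```

With ages between 21 and 77 plus one outlier of 200, the outlier still maps to 1, but every downstream calculation now works on a bounded 0-to-1 scale instead of raw magnitudes.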
There are situations where there may be values in the data set that skew the average. For example, there may be records close in age; let's say there are also three individuals aged 80 and above. If you rely solely on the average to evaluate the distribution, these outliers can mislead you by increasing the average. In this case, it would be appropriate to examine the distribution more closely. When taking a closer look at the data, you may find that the distribution is normal, but the three records mentioned in the example are outliers. Next, let's look at standard deviation. Standard deviation is a statistical measure that quantifies the amount of variation or dispersion in a data set. It provides a way to understand how individual data points differ from the mean, or average, of the data set. The main objective here is to prevent outliers from causing deviations in your analysis results by minimizing their impact.

Finally, let's return to the point of the distribution of data. The balanced distribution of the data points that fall outside the outliers is another factor that affects data quality and your analysis results. It is important for descriptive variables such as age, gender, income status, occupation, city, and neighborhood to represent as many diverse groups as possible and to be evenly distributed. If not, a cluster of records that closely resemble each other will lead to narrow intervals when defining norms, which will mislead your analysis. Profiling and statistically analyzing data, including examining its distribution, min, max, mean, and mode values, detecting and normalizing any outliers, and ensuring that the data represents the entirety of the data set, are the key elements that demonstrate data quality. Considering these factors will enhance the accuracy and quality of analysis and predictions made with this data. By now, you should have a good understanding of the concepts of profiling data and possible situations where you will need to apply the profiling techniques.

In this video, you will learn about data profiling and statistical analysis and how to use them in PowerBI, as well as how to use profiling tools to inspect data. Adventure Works recently conducted a field survey to increase sales and collected potential customer data. This resulted in an Excel file containing information such as age, gender, occupation, income level, address, and phone number of prospective customers. Since the survey data was collected manually, it was not subjected to any validation. Therefore, before analyzing the data, it is necessary to confirm that the data is valid, within the desired ranges and quantities, and exhibits a good distribution. Before starting analysis on any data set, it is important to examine the data from various aspects, such as completeness, accuracy, uniqueness, and consistency. Data profiling enables the identification of potential issues and anomalies within the data set. This proactive approach allows you to make informed decisions about data cleaning, transformation, and enrichment, ultimately leading to improved data quality. Additionally, data profiling facilitates effective data exploration and visualization by providing insights into data patterns, relationships, and trends. It empowers users to discover hidden insights, uncover data inconsistencies, and make data-driven decisions with confidence.
Before delving into data profiling tools, let's first consider two important factors in data profiling: unique and distinct. In PowerBI, unique is the total number of values that appear only once, while distinct is the total number of different values, regardless of how many of each you have. Microsoft PowerBI offers the following two profiling tools in the Power Query Editor: column quality and column distribution.

Let's begin with column quality. Column quality focuses on valid, error, and empty rows in each column, allowing you to validate your row values. The column quality feature labels values in rows in five categories: valid, shown in green; error, shown in red; empty, shown in dark gray; unknown, shown in dashed green, which indicates that when there are errors in a column, the quality of the remaining data is unknown; and unexpected error, shown in dashed red. These indicators are displayed directly underneath the name of the column as part of a small bar chart, and the number of records in each column quality category is also displayed as a percentage. By hovering over any of the columns, you are presented with a numerical distribution of the quality of values throughout the column. Additionally, selecting the ellipsis button opens some quick action buttons for operations on the values.

Column distribution provides a set of visuals underneath the names of the columns that showcase the frequency and distribution of the values in each of the columns. The data in these visualizations is sorted in descending order from the value with the highest frequency. By hovering over the distribution data in any of the columns, you get information about the overall data in the column, with distinct and unique counts. You can also select the ellipsis button and choose from a menu of available operations. Let's consider column distribution, specifically relating to the distribution of distinct and unique amounts. Imagine that you have a selection of bike accessories that are supplied by four different suppliers: supplier A, supplier B, supplier C, and supplier D. In this case, there are four distinct suppliers. Now imagine you have two bikes, each with a supplier unique from any other bikes you currently stock; these would be considered two unique suppliers.

Another type of profiling in PowerBI is the column profile. Column profile provides column statistics, such as minimum, maximum, average, frequently occurring values, and standard deviation, in addition to the value distribution of the selected column. This is very important when assessing data to detect anomalies and outliers.
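If you want the same statistics programmatically, Power Query's M library exposes them through Table.Profile. A minimal sketch, assuming the survey query is named PotentialCustomers (hypothetical name):

```
let
    // One row per column, with statistics such as Min, Max, Average,
    // StandardDeviation, Count, NullCount, and DistinctCount
    Stats = Table.Profile(PotentialCustomers),
    // Narrow the profile down to the Age column to check for outliers
    AgeStats = Table.SelectRows(Stats, each [Column] = "Age")
in
    AgeStats
```

This is a handy way to capture the numbers you would otherwise read off the column profile pane, for example to compare them across refreshes.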
Now that you've covered the basics of data profiling tools, let's apply this in PowerBI and inspect some data. Adventure Works conducted a field survey to increase sales and collected potential customer data, resulting in an Excel file containing information such as age, gender, occupation, income level, address, and phone number of prospective customers. The survey data was collected manually and was not subjected to any validation, so before analyzing the data it is necessary to confirm that the data is valid, within the desired ranges and quantities, and exhibits a good distribution. Navigate to Home at the top of the PowerBI window, select Excel Workbook inside the Data group in the middle of the tab, select potential customers.xlsx, and select Transform Data. In the opened window, check the Column Quality checkbox inside the Data Preview group of the View tab to assess column quality. In the Age column, 89% of the values are valid, 0% of the values are errors, and 11% of the values are empty rows. To assess column distribution for the Occupation column, on the View tab, from inside the Data Preview group, check Column Distribution. Note that there are nine distinct values and two unique values: computer programmer and accountant are the occupations that appear only once. For each column, note that if all the row values are distinct, then the unique and distinct amounts will be equal; for example, you can see that there are 19 distinct and 19 unique values for the Surname column. Select the Age column and then check the Column Profile checkbox. Note that the maximum value for the Age column is 132, which is not acceptable. Examine the minimum, maximum, average, and other column statistics, and review the value distribution chart. In this video, you learned how to profile data by assessing column quality, distribution, and profile. Data profiling in PowerBI offers several advantages in the process of data analysis: it helps you gain a comprehensive understanding of the data's quality, structure, and distribution. With its ability to assess data quality and provide valuable insights, data profiling in PowerBI plays a crucial role in enhancing data reliability, accuracy, and overall analytical outcomes.

In the world of technology, even the most meticulously designed software can harbor hidden bugs, waiting to unleash chaos upon unsuspecting users. Imagine a scenario where a simple bug managed to infiltrate a company's database, threatening to compromise the accuracy of critical reports and potentially sending shock waves through senior management. However, thanks to the miraculous powers of data profiling with the aid of PowerBI, disaster was averted and the company emerged victorious. Buckle up as we take you on a thrilling journey through the realm of software mishaps, triumphs, and the heroes who saved the day.

It all began innocently enough. Deep within the complex coding of a company's flagship software, a tiny bug had nestled its way into the system. This bug had an uncanny ability to transform innocent data into deceptive monsters, causing them to wreak havoc when unleashed into the wild. The bug was sly and patient, biding its time until the perfect moment to strike. As the software went about its daily operations, the bug began silently distorting the data it touched. Unbeknownst to the users, inaccuracies were creeping into the system, lurking beneath the surface. Reports that were once reliable now became unreliable, leading to questionable decisions and raised eyebrows among senior management. Fortunately, the company had an ace up its sleeve: a team of brilliant data profilers armed with the mighty PowerBI. With its robust data profiling capabilities, PowerBI became the ultimate weapon against the deceptive bug and its corrupted data. The team rallied together, ready to utilize PowerBI's analytical prowess and visualizations to uncover the truth hidden within the tainted database. Armed with PowerBI, the heroic team embarked on a quest to hunt down and eradicate the corrupted data. They connected PowerBI to the company's database, leveraging its intuitive interface and advanced algorithms to identify the anomalies lurking within the system. PowerBI's data profiling features allowed the team to analyze and scrutinize every nook and cranny of the company's data, unearthing the bug's footprints one by one. After days of tireless work, the data profilers, empowered by PowerBI, emerged triumphant. They successfully identified and isolated the distorted data, ensuring its exclusion from future reports. PowerBI's rich visualizations and interactive dashboards enabled the team to present their findings to senior management in a clear and concise manner, further solidifying their victory.
As the dust settled, the company took a moment to reflect on the incident. They recognized the transformative power of PowerBI's data profiling capabilities and the critical role it played in safeguarding their data integrity. The bug had served as a wake-up call, reminding them of the importance of incorporating robust data profiling tools like PowerBI into their systems, helping them catch potential issues before they cascade into crises. In this thrilling tale of software mishaps and heroic data profilers, we've witnessed how a simple bug had the potential to plunge a company into chaos. However, thanks to the power of data profiling with the aid of PowerBI, accuracy was restored. The diligent efforts of the data profiling team did not go unnoticed, as senior management praised them for their exceptional work and dedication in resolving the crisis. The successful outcome served as a reminder of the invaluable role data profiling plays in maintaining the integrity of systems. It showcased the power of collaboration, expertise, and the remarkable capabilities of tools like PowerBI in conquering challenges and emerging triumphant.

As a data analyst at Adventure Works, your team is responsible for analyzing vast amounts of data to gain insights into customer behavior and improve business operations. Microsoft Power Query is an essential tool in the data analysis workflow, enabling you to transform and integrate data from various sources. You rely heavily on Microsoft PowerBI for your daily tasks, preparing reports for business units by connecting to data sources and performing extract, transform, and load operations. Since Adventure Works strives for optimal efficiency and results, your manager, Adio Quinn, has assigned you the task of researching best practices for specific configurations, performance preferences, security, and other related topics to ensure the most optimal use of PowerBI in your work. Over the next few minutes, you'll be introduced to best practices when working with data sources in PowerBI and also understand why these practices are important to implement.

Let's start by exploring how you and your team can apply best practices to enhance your Power Query workflows and improve data quality and analysis. Your first step is to plan and document your data transformation requirements. You define the desired output, identify the relevant data sources, and outline the transformations needed. You also ensure that data source credentials are properly documented and securely stored. By maintaining an organized and consistent approach, your team can streamline your Power Query process and avoid confusion. Next, you carefully select the appropriate connector to connect to your data sources. You consider factors such as the type and location of the data source, the volume of data, and the available connectivity options. With PowerBI's wide range of connectors, you can seamlessly connect to databases, cloud services, files, and APIs. It is important that you evaluate the performance capabilities and scalability of the connectors to ensure optimal performance for your data requirements. Considering the performance and optimization of your data transformations and calculations, your team follows the principle of doing expensive operations last. You prioritize and schedule resource-intensive operations towards the end of the data transformation process. This approach ensures that complex calculations, merging large data sets, and applying multiple transformations on a significant number of rows are executed efficiently, leading to faster data loading and more responsive reports.
Your team also pays attention to data type selection for columns, aiming to improve performance and data accuracy. You review and adjust the inferred data types manually, preventing incorrect data interpretations and reducing memory consumption.

Data profiling plays a crucial role in your team's data analysis process. You leverage Power BI's data profiling capabilities to gain a comprehensive understanding of data quality, structure, and distribution. By examining aspects such as completeness, accuracy, uniqueness, and consistency, you identify potential issues and anomalies within the data set. This proactive approach enables you to make informed decisions about data cleaning, transformation, and enrichment, ultimately improving data quality.

To ensure smooth data processing, your team implements error handling techniques such as conditional logic and custom error messages, and incorporates data validation checks to identify and handle unexpected data inconsistencies effectively (a short sketch of these techniques follows at the end of this section).

The next best practice is to consider your merge strategy. When merging or joining multiple queries, choose the most efficient merge strategy, selecting inner joins whenever applicable, and remove redundant fields to avoid unnecessary duplicate columns in the resulting merged query.

To maintain an organized work environment, your team uses groups as containers for queries, creating nested groups when needed and moving queries between groups by dragging and dropping them. Regularly reviewing and removing unnecessary steps in the Power Query Editor is another practice you follow: removing unused or redundant transformations helps improve processing time and simplifies query maintenance.

Monitoring the performance of your Power Query workflows is an ongoing task. You evaluate refresh speed, resource consumption, and overall efficiency, and by fine-tuning query settings such as parallel loading or data load options, you optimize performance based on your specific requirements.

Following these best practices when working with Power Query will enable you to effectively shape and transform your data while maintaining data integrity, improving performance, and streamlining your workflows. Remember: consistent documentation, efficient data filtering, error handling, and optimization techniques are key to achieving reliable and efficient data transformations. Embrace these practices, adapt them to your specific requirements, and continue exploring new features and capabilities to become a Power Query expert.
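The following is a minimal sketch of the error-handling practice described above, using M's try ... otherwise expression and a custom error message. The Orders query and the UnitPrice column are hypothetical:

    let
        Source = Orders,  // hypothetical base query

        // Conditional logic with try/otherwise: fall back to null when a
        // value cannot be converted, instead of failing the whole refresh
        SafeTyped = Table.AddColumn(Source, "UnitPriceClean",
            each try Number.From([UnitPrice]) otherwise null, type nullable number),

        // Data validation with a custom error message: stop the refresh
        // if the cleaned column still contains nulls
        NullCount = Table.RowCount(Table.SelectRows(SafeTyped, each [UnitPriceClean] = null)),
        Validated = if NullCount = 0
            then SafeTyped
            else error Error.Record("DataQualityError",
                    "UnitPrice contains values that could not be converted to numbers",
                    [BadRows = NullCount])
    in
        Validated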
In the world of Microsoft Power BI, data is the foundation of meaningful insights and informed decision-making. However, managing and preparing data for analysis can be a complex and time-consuming process; this is where data flows can help. In this video, you will explore what data flows are and why they are used in Power BI, learn the subscription level required to use them, and engage with a fictional scenario showcasing their application and the advantages and limitations they offer.

Adventure Works is a company operating in multiple regions, each with its own set of data sources and reporting requirements. To manage these multiple data sources, Adventure Works wants to use the Power BI data flows feature. Data flows allow you to connect to data sources, perform data transformations, and create business logic to build data entities that can be shared across different reports and dashboards; they can also be published to the Power BI service and used in shared reports and dashboards. Data flows simplify the process of data preparation, allowing users to cleanse, transform, and shape their data with ease. You can apply business rules, clean untidy data, and create calculated columns through Microsoft Power Query, a powerful data transformation tool within Power BI. Data flows offer a visual interface for building data transformation logic, making it accessible to users without coding skills.

You can use data flows in both Microsoft Power BI Desktop and the Microsoft Power BI service. In Power BI Desktop, you can create and manage data flows using the Power Query Editor: connect to various data sources, perform transformations, and define the structure of your data entities, then publish these data flows to the Power BI service for further use. Once published, data flows can be accessed and managed through the Power BI web interface, where you can schedule data flow refreshes, configure data connectors, and establish relationships between data flows and other data sets in your workspace. Additionally, you can use the capabilities of Power Query Online, a cloud-based version of Power Query, to perform data transformations directly in the Power BI service. By supporting data flows in both Power BI Desktop and the Power BI service, Power BI enables a seamless experience for creating, sharing, and collaborating on data flows throughout the entire data preparation and analysis process. This flexibility lets users work with data flows in their preferred environment while ensuring consistent and efficient data management across both desktop and cloud-based environments.

A Power BI Pro license is required to use data flows. However, a Power BI Premium subscription is necessary for advanced features and capabilities such as incremental refresh, compute engine selection, and larger data capacity; Premium unlocks additional functionality and performance optimizations that enhance the data flow experience.

Advantages of data flows include:
Reusability: Data flows enable the reuse of query logic and transformations, saving time and effort in data preparation tasks.
Data centralization: Data flows provide a centralized and consistent data source, ensuring data integrity and reducing duplication.
Collaboration: Users can collaborate on data flows, making it easier to share and work on data preparation processes.
Scalability: Data flows use cloud-based processing capabilities, enabling efficient handling of large data sets and complex transformations.

Limitations of data flows include:
Data refresh: Data flows have specific refresh limitations, such as frequency and dependencies on data source availability.
Data flow management: Currently, data flows are managed individually, and there is limited visibility into dependencies between data flows.
Advanced transformations: While data flows offer a wide range of transformations, certain complex scenarios may require advanced coding or alternative solutions.

Data flows help users streamline and enhance their self-service data preparation workflows. By providing a scalable and collaborative approach to data integration and transformation, data flows enable organizations to unlock the true potential of their data. While data flows offer numerous advantages such as reusability, centralization, collaboration, and scalability, you must be aware of their limitations and consider alternative approaches for advanced transformations. By using data flows effectively, you can accelerate data preparation, ensure data consistency, and make informed decisions based on reliable and well-prepared data.
Power Query is a powerful data transformation and manipulation tool within Power BI that allows users to shape and transform data from various sources. But performing repetitive steps on multiple queries can be a tedious task, especially when the queries involve similar but separate sets of data. One key feature that solves this issue is reference queries, which provide flexibility, reusability, and efficiency in your data transformation process. In this video, you will learn about reference queries in Power Query and their importance in streamlining data workflows; you'll also explore the best use cases for reference queries and data flows.

By establishing a query reference, you create a connection between an existing query and a new query, enabling data to flow across sequential models. Any modifications made to the original query automatically apply to the referenced query, ensuring consistency and up-to-date information. Instead of modifying transformations individually in multiple queries, you make updates in the master query, and those changes are automatically applied to all referencing queries. This provides cohesion and makes it easier to maintain and update your data transformations.

So what are the benefits of query referencing? First, there is reusability: by referencing queries, you can reuse common data transformations across multiple queries, promoting consistency in your data processing and reducing the risk of errors that can occur when duplicating complex transformations. Next, there is efficiency: reference queries eliminate the need to repeat time-consuming data transformation steps; instead, you can leverage the results of a previously defined query, significantly improving the performance of your data workflows. Lastly, there is scalability: as your data analysis requirements grow, reference queries allow you to build modular and scalable data transformation workflows. You can create separate queries for different data sources or transformation steps and combine them as needed, providing flexibility and adaptability to changing business needs.

In Power Query, you can reference a query by right-clicking any query in the Queries pane and selecting the Reference option. This creates a new query, a copy of the original, containing a single step. You can rename the new query as needed and then start to use it. In this way, you establish a connection between the queries, enabling data flow and transformation continuity.

Let's delve into this further through a scenario. You are working as a data analyst at Adventure Works, which recently acquired another bicycle business. Your manager, Adio Quinn, has assigned you the task of appending the product data from the newly acquired company to Adventure Works's existing products. Prior to appending the new products, you need to perform several transformation tasks, such as changing column types and removing unnecessary columns. However, your manager has asked you not to modify the existing queries, to preserve their original form and use them as a source for other operations. To accomplish this, you need to create references from the original queries, rename the new queries, apply the necessary transformations, and then append the data. Any changes made to the base queries will impact the new queries. This approach allows you to keep the original queries intact, update the referenced queries, and ensure that any changes made to the base queries are reflected in the referenced ones.
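In M terms, a referenced query is simply a new query whose first step points at the original. A minimal sketch of the scenario above; the query names Products and NewCompanyProducts and the column names are hypothetical:

    // What Power Query generates when you right-click Products > Reference:
    let
        Source = Products
    in
        Source

    // After renaming the reference and adding transformations:
    let
        Source = Products,
        Typed = Table.TransformColumnTypes(Source, {{"ListPrice", type number}}),
        Trimmed = Table.RemoveColumns(Typed, {"InternalNotes"}),
        // Append the acquired company's products without touching the base query
        Combined = Table.Combine({Trimmed, NewCompanyProducts})
    in
        Combined

Because the first step is a reference rather than a copy of the source steps, any fix made in Products propagates automatically to this query.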
Query referencing creates many opportunities for advanced data transformation techniques: you can apply conditional logic, merge referenced queries, or perform calculations based on referenced data. These advanced techniques further enhance the flexibility and power of your data workflows. Referencing queries in Power Query is a fundamental concept that allows you to streamline and optimize your data transformation process. By leveraging query references, you can improve reusability, efficiency, and scalability, ultimately enhancing the overall productivity and effectiveness of your data analysis in Power BI.

As data volume continues to grow, so does the challenge of transforming that data into well-formed, actionable information. We want data that's ready for analytics, to populate visuals, reports, and dashboards, so we can quickly turn volumes of data into actionable insights. However, managing and preparing data for analysis can be complex and time-consuming, so it's important to consider the best approach for your data transformations and analysis. In this video, you will explore how to reference other queries and why a data flow may sometimes be more suitable.

Choosing between referencing queries and data flows depends on the specific requirements of your scenario. It's important to evaluate factors such as data volume, complexity of transformations, user expertise, and maintenance requirements to determine the best fit for your use case.

There are some performance considerations to bear in mind with reference queries especially. Reference queries can contribute to slow data refreshes due to the nature of their referencing: when a referencing query is refreshed, it needs to ensure that all the referenced queries are also refreshed to maintain data consistency. This can result in longer refresh times, especially if there are multiple layers of referencing involved. Furthermore, reference queries can overburden data sources, particularly when working with large data sets: as reference queries rely on the data from other queries, they need to fetch and process the data from the original sources. This becomes more noticeable when dealing with complex transformations or frequent refreshes. To mitigate these issues, it's important to optimize the design and usage of reference queries. Consider limiting the number of reference layers and optimizing the queries' transformations to reduce unnecessary data processing (a short sketch follows below). Additionally, carefully manage the refresh schedule to avoid excessive load on data sources during peak usage times. By implementing these best practices, you can minimize the impact of reference queries on data refreshes and prevent overburdening your data sources.

Now let's review data flows. Data flows offer a centralized and scalable approach to data preparation. They are designed specifically for data integration and transformation tasks, providing a self-service environment for business users to create and manage extract, transform, and load (ETL) processes. With data flows, you can connect to various data sources, perform transformations using a visual interface, and store the prepared data in the Power BI service. Data flows are a feature available in both Power BI Desktop and the Power BI service, and they provide a cloud-based data preparation experience where you can build, manage, and share reusable data entities. In summary, understanding the differences and best use cases between reference queries and data flows is essential for optimizing your data processing workflows in Power Query.
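One mitigation worth knowing is M's Table.Buffer function, which materializes a table in memory so that later steps within the same query don't re-evaluate an expensive upstream step. Note that buffering does not persist across separate queries during a refresh, so it complements, rather than replaces, limiting your reference layers. A hedged sketch with a hypothetical Sales query:

    let
        Source = Sales,  // hypothetical base query

        // An expensive step we don't want evaluated more than once
        Expensive = Table.Sort(Source, {{"Amount", Order.Descending}}),

        // Buffer the result in memory; the two steps below both read from
        // the buffer instead of re-running the sort against the source
        Buffered = Table.Buffer(Expensive),

        TopRows = Table.FirstN(Buffered, 100),
        RowCount = Table.RowCount(Buffered),

        // Record the count alongside the top rows for inspection
        Result = Table.AddColumn(TopRows, "TotalRows", each RowCount, Int64.Type)
    in
        Result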
Reference queries in Power Query are a fundamental concept that allows you to streamline and optimize your data transformation process. By leveraging query references, you can improve reusability, efficiency, and scalability, ultimately enhancing the overall productivity and effectiveness of your data analysis in Power BI. Remember, practice makes perfect: experiment with reference queries in Power Query to gain hands-on experience and discover the immense value they bring to your data analysis endeavors.

At Adventure Works, you have a task that needs separate analysis for three main bike product categories. You soon realize that to complete the task you're creating the same query three times, the only difference being the change to the bike category. It's inefficient to completely rewrite queries whenever there's a minor change in the data or a slightly different question from management. What if there was a way to create adaptable, reusable queries? There is: the query parameters feature in Microsoft Power BI allows you to define one query that can be easily adjusted to handle different categories or variables. This video will help you understand the concept of query parameters in Power BI and explain how to effectively implement and manage them. Let's learn how query parameters can make your data analysis tasks more efficient and adaptable.

Query parameters in Power BI are a powerful feature that allows users to input a value which is then used in the data retrieval process from a data source. Essentially, a parameter is a placeholder for information that can change. Query parameters can be used in various operations, such as filters, transformations, or creating new columns and tables.

Let's explore some possible uses of query parameters at Adventure Works. Adventure Works can use query parameters when connecting to its database to retrieve specific information rather than importing the entire data set. For instance, Adventure Works can establish a query parameter for a sales date range: by inputting the dates, Power BI will only fetch data for that period, saving resources and time. Parameters can also be used in Adventure Works's data transformations. If there's a need to frequently adjust a specific value in the transformations, using a parameter avoids manual changes each time; the value only needs to be updated in the parameter. Parameters can also control filters on Adventure Works's data. If the company wants viewers of a report to concentrate on a particular product category, it could create a parameter for the product category, allowing the viewer to select the category they're interested in, with Power BI adjusting the report accordingly.

Now let's explore creating query parameters in Microsoft Power BI. First, you'll need to open the Power Query Editor. To do this, go to the top left corner of the Power BI Desktop interface, where there is a set of tabs in a ribbon layout. Select the Home tab, then select Transform data; this action opens the Power Query Editor. In the Power Query Editor, go to the Home tab and select the Manage parameters option. This opens the Manage parameters dialog box, where you can create parameters. To create a new parameter, select New. Now you can name your parameter and define its properties; for instance, you might name it Product Category Filter. Under Type, select Text as the data type from the drop-down menu. Next, specify what values this parameter can take: from the Suggested values drop-down menu, choose List of values, and in the input field that appears, create your list by entering the different product categories from your data set; in this case, the values are items such as mountain bikes, road bikes, and touring bikes. Once you've filled in these details, select OK, then OK again in the Manage parameters dialog to return to the Power Query Editor.
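Behind the dialog, a parameter is stored as a small M query, and you can then use it in filter steps. The following is a minimal sketch of both parts; the parameter name Product Category Filter matches the walkthrough above, while the Products query and its Category column are hypothetical:

    // Roughly what the Manage parameters dialog generates as a parameter query:
    "Mountain Bikes" meta [
        IsParameterQuery = true,
        Type = "Text",
        List = {"Mountain Bikes", "Road Bikes", "Touring Bikes"}
    ]

    // Using the parameter in another query's filter step:
    let
        Source = Products,
        Filtered = Table.SelectRows(Source, each [Category] = #"Product Category Filter")
    in
        Filtered

Changing the parameter's current value is then enough to re-point every step that uses it, with no query rewriting.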
Query parameters can significantly enhance your Power BI reports, making them more flexible and interactive. Parameters enable efficient data retrieval and transformation by allowing for dynamic changes, helping you cater to evolving business needs without having to rewrite entire queries. The more adaptable your data analysis tools are, the more capable you become of meeting your organization's ever-changing demands. This makes your work more efficient and enables you to provide valuable insights that can guide your company's decision-making processes. Keep exploring, keep learning, and embrace the power of query parameters in Power BI to improve your analysis.

In previous videos in this course, you learned about advanced query capabilities, data flows, and reference queries. As mentioned before, every instance of data transformation performed in Microsoft Power Query adds a step to the Power Query process. These steps can be rearranged, removed, or modified as needed to optimize the data shaping process. Whenever you use the Power Query interface, M language code is executed behind the scenes to perform each operation. The M language is available for you to read and modify directly in the Power Query Advanced Editor, and in this video you'll learn how to use this Advanced Editor to update an M query.

A core capability of Power Query is to filter and combine data from one or more supported data sources; any such data mashup is expressed using the Power Query formula language, M. Although you don't have to know the M language to use Power Query, being familiar with the language behind the user interface, and being able to update it when necessary, is valuable for anyone using the tool. For example, you may need to perform custom transformations that cannot be easily accomplished using the Power Query user interface alone. This is where knowledge of the M language and its syntax can be helpful. Using the M language, you can perform advanced data manipulation tasks such as conditional filtering, custom column creation, data type conversions, and merging multiple data sources. The language is designed to be expressive and efficient, enabling you to handle large data sets with ease.
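For example, conditional filtering and custom column creation, two of the tasks just mentioned, look like this in M. This is a minimal sketch; the Products query and its ListPrice column are hypothetical:

    let
        Source = Products,

        // Conditional filtering: keep only rows above a price threshold
        Filtered = Table.SelectRows(Source, each [ListPrice] > 500),

        // Custom column creation with conditional logic
        Banded = Table.AddColumn(Filtered, "Price Band",
            each if [ListPrice] > 2000 then "Premium"
                 else if [ListPrice] > 1000 then "Mid-range"
                 else "Standard",
            type text)
    in
        Banded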
When you access the M language code, you'll encounter certain group names and meanings known as M syntax. Let's explore the syntax using an M language code snippet. This snippet showcases how to handle various CSV file operations in Power Query, including setting up the initial data source and performing data transformations: loading the file, specifying the delimiter and encoding for the CSV document, calculating the number of columns, and assigning a value to a variable (a representative sketch appears at the end of this section). You can find more information on M syntax in the additional resources of this lesson. The snippet can also serve as template code for further data transformations using the Power Query M language in Power BI, which you can customize based on your needs. The Advanced Editor provides syntax highlighting, autocompletion, and error checking features, making it easier to write and debug your M code. It also offers functions and operators that allow you to perform various data transformations, calculations, and aggregations.

Now let's explore how you can use the Advanced Editor tool in Power Query and modify steps by updating M language code, using a practical scenario. A report designer informs Adio Quinn, your manager at Adventure Works, about an error being received in the Power Query window. He assigns you the task of identifying the cause of this error and resolving it. You investigate the issue by examining the steps in Power Query and analyzing the problem using the M language, discovering that the error is a result of a change in the source file's location. Let's outline the steps to resolve this issue using the Advanced Editor tool.

Let's start with the source file, an Adventure Works sales spreadsheet in Excel. If you navigate to the Home tab at the top of the Power BI window, select Excel workbook in the Data group, select the Adventure Works sales file, and lastly select Transform data in the window that opens, you'll successfully access the Power Query Editor. However, suppose the location of the source file is unintentionally changed by another person; for example, the Excel file is moved to another folder. This will cause an error in the Power Query window. To explore what happens as a result of this error, navigate to Refresh preview in the Query group on the Home tab and select Refresh preview from the drop-down menu. When you refresh the preview, you get an error message indicating that the source file is no longer reachable because its location has changed.

You can resolve this issue by using the Advanced Editor. To do this, select Advanced editor in the Query group on the Home tab. Next, read the error message and code carefully to determine the necessary action; in this case, you need to correct the file path. In this scenario, the path changes from C:\Data\C3\M3\L3\Adventure Works Sales.xlsx to C:\Data\Adventure Works Sales.xlsx; your file path will differ from this, as it will specify the location of the file on your own computer. After you've completed your correction, select Done. With this edit, you've modified the code using the Advanced Editor, correcting the file path and resolving the issue. By using the Advanced Editor and familiarizing yourself with the M language, you can unlock the full potential of Power Query, whether for error checking or for creating sophisticated data transformations that meet your specific requirements. The Advanced Editor empowers you to manipulate and shape your data precisely.
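The CSV snippet described earlier is not reproduced in this transcript, so here is a representative sketch of what such M code looks like in the Advanced Editor. The file path is hypothetical, and the comment marks the line you would edit to make the kind of path correction described in the scenario above:

    let
        // Load the file; this is the line you would edit in the Advanced Editor
        // if the source file moved (change the path below to the new location)
        Source = Csv.Document(
            File.Contents("C:\Data\Adventure Works Sales.csv"),
            [Delimiter = ",", Encoding = 65001, QuoteStyle = QuoteStyle.None]),

        // Promote the first row to column headers
        Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),

        // Calculate the number of columns and assign it to a variable
        ColumnCount = Table.ColumnCount(Promoted)
    in
        Promoted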
Congratulations on reaching the end of the third week in this course on extracting, transforming, and loading data in Power BI. You've now reached the end of this module, so let's take a few minutes to recap what you've learned.

You began this module by exploring the final step of the ETL process: load. You learned that the load operation makes the transformed data obtained from a data source available for reporting purposes. You then explored the two main ways to load data in the Power BI user interface: Load, which directly loads data into the Data pane in Power BI while still allowing you to transform the data at a later stage, and Transform data, which allows you to transform the data before loading it, with changes being applied to the data model. Next, you discovered that in some cases you might have source tables used during the ETL process that will not be used directly in the reporting area, and some of these tables may not meet the production demands of your data warehouse. In such cases, you need an intermediate state between the data source and the data warehouse called the data staging area. A staging area serves as an intermediate storage location for raw or unprocessed data, allowing it to be temporarily stored and prepared for further processing in a data pipeline. You then made use of your new knowledge by helping Adventure Works transform and consolidate data using a staging area. Next, you undertook a knowledge check, in which you proved your understanding of the concepts you encountered by answering a series of questions.

In the second week of this module, you were introduced to data profiling in Power BI. You began the week by learning about the importance of data profiling and statistical analysis when reviewing data sets. You also learned about distribution, anomalies, and outliers in the context of data profiling, and you learned about standard deviation. Next, you explored the two profiling tools in the Power Query Editor: column quality and column distribution. You then put your new skills to use by assisting Adventure Works with data profiling and statistical analysis, using the profiling tools in Power BI to inspect data. Next, you completed a knowledge check which tested your understanding of these concepts through a series of questions.

In week three, you discovered the best practices for working with data sources and why these practices are important to implement. Then you had the opportunity to complete a practical exercise, importing a data set while considering these best practices. You were then introduced to data flows: you explored what data flows are and why they are used in Power BI, learned about the subscription level required to use them, and engaged with a fictional scenario showcasing their application and the advantages and limitations they offer. Next, you explored reference queries and their importance in streamlining data workflows. Reference queries in Power Query refer to the practice of using the output of one query as a data source or transformation step in another query. You then explored the performance considerations you need to bear in mind when using reference queries, and you demonstrated your competence with these new skills by helping Adventure Works merge two of their data sources using reference queries to deliver new insights into their business. Next, you explored the query parameters feature in Microsoft Power BI. You learned that this feature allows you to define one query that can be easily adjusted to handle different categories or variables, and you examined the process for disabling helper queries in Power BI.
After that, you were introduced to the Advanced Editor and learned how to modify code. You learned that whenever you use the Power Query interface, M language code is executed behind the scenes to perform each operation, and that although you don't have to know the M language to use Power Query, being familiar with the language behind the user interface, and being able to update it when necessary, is valuable for anyone using the tool. You then explored the various global options Power BI offers that allow you to customize and optimize your experience when working with files. You learned that these options provide flexibility and control over file settings, ensuring a seamless workflow and enhancing your overall productivity. Finally, you undertook a knowledge check which tested your understanding of the concepts you encountered this week, and you completed a module quiz in which you demonstrated your understanding of all the concepts you encountered throughout the entire module. You should now be familiar with the advanced ETL processes in Power BI, and you should be capable of loading data with Power BI, profiling that data, and using advanced queries. Great work!

You have almost reached the end of this course. In this video, you'll consolidate key concepts you learned throughout, revisiting essential learnings related to the data analysis process for businesses and transforming data into valuable insights using Power BI. Through your continuous effort, you've gained a solid foundation in collecting data from and configuring multiple data sources in Power BI, preparing and cleaning data using Microsoft Power Query, and inspecting and analyzing data to ensure data integrity. You have demonstrated tremendous dedication to this course through your engagement with the videos, readings, exercises, and quizzes. What's left now is to demonstrate the skills you've learned in the final course project. This recap will serve as valuable preparation for your final course assessment and graded quiz: in the final course assessment, you'll apply what you've learned by completing tasks that simulate a real-world data analysis scenario, and you'll then take a final graded quiz to assess the knowledge and skills you gained throughout this course.

Let's get started by revisiting your first week of learning. In the first week, you learned about data sources, local and shared data sets, working with Excel, data types, storage modes, and triggers and actions. You primarily focused on data sources; in the process, you covered the skills to connect to data sources, choose the correct query mode (Import or DirectQuery), and set up triggers and actions to stay updated with frequently changing data. Week two began with analyzing the need behind data transformation and getting familiar with the Power Query interface, which is used throughout ETL operations. You continued your journey by learning about columns, data types, applied step lists, and common data errors, and then you prepared a data set. You also learned how and why to pivot and unpivot tables, which are very popular operations. Finally, you applied the combining-table operations: appending, merging, and joining tables. These week two contents are fundamental to ETL operations. Week three began with loading data and staging area concepts, and you applied an end-to-end ETL operation. You then learned about data profiling, which is very important for understanding data quality and distribution; this helps you detect potential anomalies in a data set before you start to analyze it.
You then explored how to use the M language and the Advanced Editor to apply detailed operations in Power Query. Finally, you learned about data flows and reference queries, which are used to increase efficiency and productivity. This course equipped you to use Power BI and Power Query to construct end-to-end ETL solutions, starting from understanding data sources, moving through advanced transformation techniques, and ending with loading data in Power BI. As you embark on the final course project and assessment, you can approach it with confidence, knowing that you've built a strong foundation of knowledge and skills by committing to your learning journey throughout the course. However, if you feel the need to review any of the concepts summarized in this video or require additional preparation, remember that you have the flexibility to revisit any of the course items. This might only be the start of your journey toward a career as a data analyst, but you can be very proud of yourself for how much you've already learned and accomplished. Now you're ready to tackle the course project and graded assessment quiz. Good luck; you've got this!

Well done on completing this course. You should be proud of the progress you've made in your data analysis learning journey with Microsoft Power BI. Throughout the course, you explored in depth how to extract, transform, and load data using Power BI, gaining expertise in building ETL solutions using Power BI and Power Query. You explored collecting data from and configuring multiple data sources in Power BI, preparing and cleaning data using Microsoft Power Query, and inspecting and analyzing data to ensure data integrity. You learned about data sources and setting them up in Power BI, as well as some of Power BI's ETL capabilities, including connectors, storage modes, and setting up triggers. Plus, you discovered more about transforming data using Power Query: whether you're cleaning and preparing data sets in Power Query to deal with errors and inconsistencies or performing advanced transformations to combine data, you are now better equipped to transform data using Power BI. And don't forget that you now have more insight into loading and profiling data in Power BI, as well as performing advanced queries in Power Query. You even practiced transforming multiple data sources, a key real-world skill for a data analyst.

Congratulations on the expertise you've gained in extracting, transforming, and loading data in Power BI. This marks a valuable milestone in your journey to comprehensively using Power BI to unlock valuable insights from data. Completing this course contributes towards gaining the Microsoft Power BI Analyst Professional Certificate from Coursera. These professional certificates are designed to equip you with the necessary skills to become job-ready for in-demand career fields. The Microsoft Power BI Analyst Professional Certificate in particular not only offers you the opportunity to enhance your data analysis skills but also to gain a qualification that can lay the groundwork for a career as a Power BI data analyst. Plus, the professional certificate will help you prepare for Exam PL-300: Microsoft Power BI Data Analyst. By passing the PL-300 exam, you'll earn the Microsoft Certified: Power BI Data Analyst certification, globally recognized, industry-endorsed evidence of your technical skills and knowledge. The exam measures your ability to prepare data, model data, visualize and analyze data, and deploy and maintain assets. To complete the exam, you should be familiar with Power Query and the process of writing expressions using Data Analysis Expressions (DAX).
You can visit the Microsoft certifications page at learn.microsoft.com/certifications to learn more about the Power BI data analyst certification and exam. This course enhanced your knowledge and skills in the ETL process in Power BI, but what comes next? Well, there's more to learn, so it's recommended you move on to the following course in the program. Whether you're new to the field of data analysis or already have some expertise and experience, completing the whole program demonstrates your knowledge of, and proficiency in, analyzing data using Power BI. You've done a great job so far and should be proud of your progress. The experience you've gained will showcase your willingness to learn, motivation, and capability to potential employers. It's been a joy to take part in your learning journey; keep up the excellent efforts, and best wishes for all your future endeavors.

Have you ever been confronted with large amounts of information at once? It can be an overwhelming experience. How do you make sense of everything? With Power BI, you can create data models that act as visual representations of your records. However, this requires familiarity with the process and mastery of many different techniques, so we've designed this course to equip you with the skills you need. Data modeling is creating visual representations of your data. In Power BI, you can use these representations to identify or create relationships between data elements, and by exploring these relationships you can generate new insights into your data to improve your business. Microsoft Power BI is a fantastic tool for creating data models and generating insights, and you don't need an IT-related qualification to begin using it. This course is designed for anyone interested in learning about building data models, and it also establishes a strong foundation for those pursuing a career in data analytics. By exploring Power BI, you'll learn how to create data models using schemas and relationships, analyze your models using DAX, also known as Data Analysis Expressions, and optimize a model for performance in Power BI.

In the first week of this course, you'll explore the key concepts related to data modeling. You'll learn to identify different types of data schemas, like flat, star, and snowflake; you'll create and maintain relationships in a data model using cardinality and cross-filter direction; and you'll learn to form a model using a star schema. The second week of this course focuses on DAX, or Data Analysis Expressions, the syntax used to create elements and perform analysis in Power BI. You'll start by writing calculations in DAX to create elements and analysis in Power BI; you'll explore the formulas and functions used in DAX and use DAX to create and clone calculated tables. You'll then be introduced to the concept of measures: you'll learn where measures are used and what types are available, work with measures to create calculated columns and measures in a model, and learn about the importance of context in DAX measures. Finally, you'll perform useful time intelligence calculations in DAX for summarization and comparison, and learn how to use these techniques to set up a common date table. In the third week of this course, you'll learn how to optimize a model for performance in Power BI. You'll begin by learning how to identify the need for performance optimization; this means analyzing your data models to determine how they can perform more efficiently. You'll then learn how to optimize your Power BI models for performance.
You'll explore different techniques and methods for ensuring that you're running efficient models, and you'll also learn how to optimize performance using DAX queries. In the final week of this course, you'll undertake a project and graded assessment. In the project, you'll build and optimize a data model for Adventure Works; you'll have to build this model from scratch and optimize it to run efficiently. Finally, you'll have a chance to recap what you've learned and focus on areas you can improve upon. Throughout the course, you'll engage with videos designed to help you build a solid understanding of data modeling in Power BI. Watch, pause, rewind, and re-watch the videos until you are confident in your skills, then consolidate your knowledge by consulting the course readings, and measure your understanding of key topics by completing the different knowledge checks and quizzes. This will set you on your way towards a career in data analytics and form part of your preparation for the PL-300 Microsoft Power BI Data Analyst exam. By the end of the course, you'll be equipped with the necessary skills to work effectively with data models in Power BI. Good luck as you start this exciting learning journey!

As a data analyst, you'll often manage thousands, hundreds of thousands, or even millions of records. But how can you generate insights from all this raw data? You can convert it into data models. In this video, you'll explore the basics of data models and learn how to create them. Over at Adventure Works, the company needs to generate insights and increase sales from different data sources, including customer, sales, and marketing data. But these data sources are all in separate locations, and the only way to generate insights is to combine them. That's where the data model comes in: Adventure Works can integrate its data sources as a data model in Microsoft Power BI and then generate insights in the form of visualizations. Let's find out more about data modeling and learn how Adventure Works can make use of it.

At its core, data modeling is creating a structured representation of data; this representation can then be used to support different business aims. In other words, a data model shows how different data elements interact, and it also outlines the rules that influence these interactions. Data models can be built in Microsoft Power BI, software that provides data analysts with a user-friendly interface for building them. Other benefits of a Power BI data model are that it can be used to define relationships between tables and assign data types; you can also create calculated columns and measures and update your model as your business requirements change. In Power BI, the foundation of creating reports and dashboards lies within the data model, so it's important to understand how to design a data model that effectively aligns with the visual elements of your reports and dashboards. There are several steps involved in building a data model in Power BI: connect to your data sources; prepare and transform your data; configure table and column properties; create model relationships; and finally, create measures and calculated columns using DAX, or Data Analysis Expressions. Once your data model is in place, you can analyze the data to generate insights that help you achieve your business objectives. Let's explore some examples of how data models can be applied to business data. By optimizing the data model, you can significantly improve the performance of your Power BI reports and dashboards.
It's also easier to aggregate structured data in a data model, thanks to the clear relationships and hierarchies. With an effective data model, you can perform more advanced analytical work, like complex measures and predictive analysis. And when your underlying data is structured, organized, and aligned, your insights and reports are more likely to be accurate and reliable.

Now that you understand more about data models, let's briefly explore how Adventure Works can build one with Power BI to generate the sales insights it needs. First, Adventure Works needs to connect to its data sources by executing a query in the Power Query Editor; the result is then loaded into the Power BI data model as a table. Using Power Query in Power BI, Adventure Works can finish importing and cleaning its data sources. This creates a data model that contains cleaned customer, date, employee, and marketing data as separate tables. Each table in the model represents a specific business entity, and each table has its own related attributes. The next step is to define the relationships between the tables in Power BI's model view. The company can link its customers and sales tables using the customer ID column, which is common to both tables; with this relationship, the company can now view each customer's transactions. Adventure Works could also link its sales and marketing tables to understand which campaigns were most effective at boosting sales. Finally, the company needs to create measures and calculated columns using DAX, or Data Analysis Expressions. DAX is a syntax used in Power BI to analyze data; you'll learn more about it later in the course. For now, just know that Adventure Works can use DAX to create aggregations and custom calculations to generate insights on important aspects of its data, like sales totals. A strong understanding of data models will help you maximize your data's full potential: building sophisticated data models creates a robust foundation for data analysis and generating insights. Remember that your data model is the foundation of everything else.

Generating business insights often means working through large amounts of data, and it's important that this data is stored and structured meaningfully. With Power BI, you can structure your data using a schema. In this video, you'll learn about different types of schemas and their advantages and disadvantages. Adventure Works wants to optimize its inventory and rework its sales strategy to sell more bicycles, but first it needs to analyze the relevant data to determine the best way to approach this task. The relevant data sources include customer, product, and sales data, along with information on other aspects of the business. Adventure Works can use a schema in Power BI to organize and build relationships between these different data sources; this way, the company can generate its required insights. Let's find out more about schemas and how Adventure Works can use one.

A schema refers to a structure that defines the organization and relationships of tables within a data set; it represents the logical framework of how the data is organized and connected. There are many benefits to using a schema in Power BI, which you'll explore over the course of this lesson. A schema plays a crucial role in defining the data structure; it also enables efficient data analysis, helps with the creation of visualizations, and assists with generating meaningful insights from your data. There are three different types of schema that can be used to organize and structure data: a flat schema, a star schema, and a snowflake schema.
Let's review each of these schema types and find out how Adventure Works can use them. A flat schema is the simplest form of data model: all attributes and fields related to the entity are stored in a single table. As you discovered in earlier courses, a table is a set of rows containing data, with each row divided into columns; each column represents a piece of information with a specified data type. The required attributes and entities are stored in the rows and can be extracted as required from the columns. There are several advantages to a flat schema: it's easy to retrieve data from, flat schema data is less complex to analyze, and it's a simpler way to visualize data. However, even though it's an easy approach to understand, the flat schema still has a few disadvantages: it results in large single tables that are difficult to maintain and slow to query, it leads to data redundancy and inconsistency (so it's more suited to smaller data sets), and it doesn't accommodate complex data sets that require more flexibility and detail.

Next, let's explore the star schema data model. A star schema is a more advanced approach to structuring and organizing quantitative, or measurable, data in Power BI. It allows multiple tables to be connected through one central table: in a star schema, a central fact table connects to multiple dimension tables (you'll explore these concepts in a later lesson). These connections form a star shape, hence the name star schema. Adventure Works can build a star schema using a central fact table that contains sales transactions, then link the fact table to dimension tables that contain records for customers, employees, dates, and marketing campaigns. Let's break down the components of the star schema using the example from the Adventure Works database. First, there are the fact and dimension tables, which you'll explore further in a later lesson, and then there are the table relationships; there are many different types of relationships, which you'll also explore in a later lesson. A star schema offers many advantages over a flat schema: by storing data in separate tables, star schemas help to reduce data redundancy and boost query performance, and the clear, logical data model makes it easier to understand the data structure. However, it's also less flexible than other schema types: adding or modifying tables can require extensive changes to the schema, and the star schema can struggle to manage complex relationships.

Next is the third and final model, the snowflake schema. A snowflake schema is an extension of the star schema: it breaks down the dimension tables into multiple related tables. Existing tables in a star schema can be further normalized into other tables, which creates a hierarchy, yet these tables maintain a relationship with the dimension and central fact tables. For example, Adventure Works can further normalize its product data into supplier and category data tables. Don't worry about the terms normalize and denormalize for now; you'll learn more about these concepts later in the course. Extending a star schema into a snowflake schema offers several advantages: it provides more efficient data storage and retrieval, it improves data integrity and consistency, and it reduces data redundancy. It also offers scalability and flexibility by integrating new data tables as required. Yet there are also disadvantages to a snowflake schema: it's more difficult to perform data analysis because of the extra relationships, these new relationships make the schema more challenging to understand and manage, and they result in slower queries.
Finally, it's important to validate your schemas to make sure they're accurate. When validating a schema, check the following: make sure each table column has been assigned the correct data type, like text or numeric; check that each column has the correct formatting applied; confirm that all columns have clear descriptions with relevant context; and make sure all table and column properties are correctly configured. You should now be familiar with the different types of schemas in Power BI and their advantages and disadvantages. You can build on this knowledge to develop robust data models in Power BI; this way, you'll ensure that your data retains its integrity and simplicity and can be used to generate insights.

Making data-driven decisions involves working with large, complex data sets. Fortunately, you can easily manage these data sets with a flat schema. In this video, you'll learn how to create a flat schema in Power BI and configure your table and column properties. Over at Adventure Works, the company has received complaints from customers about incorrect and delayed orders, so let's help Adventure Works build a flat schema to organize its data more efficiently.

The first step is to connect Power BI to the data sources. To connect to a data source in Power BI Desktop, select the Home tab, then select the Get data drop-down menu and select the appropriate data source. In this instance, you need to select the Excel workbook option, then navigate to the folder containing the Adventure Works spreadsheet and select Open. Once you select the Excel data source, Power BI displays the available tables in the Navigator menu; for Adventure Works, there is only one table in the Excel spreadsheet available to load. Select the table from the Navigator menu and a preview appears on the right-hand side. The preview shows that the Excel sheet has one table containing sales data for Adventure Works, with columns for related attributes like product name, category, subcategory, quantity, and more. You can perform transformations from this menu, but in this instance you just need to load the data, so select Load to add the selected data table to your Power BI data model.

Next, select the data set from the Data pane on the right-hand side of the Power BI Desktop interface, then select Data view from the left sidebar to view the data set. You can now configure your table and column properties using the Power Query Editor; to access the editor, select the Home tab and then the Transform data option. For example, you can select the Properties feature to alter the spreadsheet name or add a description: add some spacing to the spreadsheet name, then add the description "Adventure Works sales data". This makes it easier to identify the spreadsheet, which is particularly useful when working in a team.

Now you can begin applying transformations to shape the data as a flat schema. First, you need to remove duplicate data from the order ID column. Select and right-click the order ID column, and in the drop-down menu select the Remove duplicates option. Alternatively, you can access the Home tab, select the Remove rows option, and select Remove duplicates in its drop-down menu; either action removes all duplicate values from the selected column. You can also format the product weight column by changing the data to a decimal type: select the column, then select the Transform tab, select the Data type option, select Decimal number from the list of available options, and confirm your selection to change the column type.
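Each of these UI actions adds a step in M. As a rough sketch, the applied steps behind the two transformations just described might look like the following; the column names follow the scenario, but the step and query names are illustrative and may differ from what Power Query generates on your machine:

    let
        Source = #"Adventure Works Sales",  // the loaded Excel table

        // Remove duplicates based on the Order ID column
        #"Removed Duplicates" = Table.Distinct(Source, {"Order ID"}),

        // Change the Product Weight column to a decimal number type
        #"Changed Type" = Table.TransformColumnTypes(
            #"Removed Duplicates", {{"Product Weight", type number}})
    in
        #"Changed Type"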
When you've completed your transformations, select the Home tab and then select Close & Apply; you're then returned to the Power BI Desktop interface. You can make further changes here using the Table tools and Column tools tabs. For example, from the Column tools tab you can select the Format option and change the product price column's data type to currency. The next step is to edit the model: select Model view from the left-hand sidebar to view the schema of the loaded data. The model view shows that there is currently one table of data, confirming that we are working with a flat schema. Since there are no other tables, there's no need to build any relationships; however, you can still make further changes to the table's properties. Select the table in model view to open the Properties pane, where you can make more changes by selecting individual columns from the table. You should now be familiar with creating a flat schema in Power BI from your data sources, and you should also know how to configure your table and column properties using Power BI and Power Query. Creating a schema in Microsoft Power BI is an essential skill for entry-level data analysts, and as you progress in your data analysis career, you'll explore even more complex schema structures to handle more intricate data scenarios.

As you discovered in an earlier lesson, you can use schemas for data organization, and two central components of all schemas are fact and dimension tables. In this video, you'll explore these tables in more detail and learn how they can be used to build schemas. Adventure Works is dealing with an increase in delivery errors; to help fix this issue, the company needs to explore its data and discover the underlying cause, and it can use fact and dimension tables to find a resolution. As you learned earlier, a schema is a logical and visual representation of how your fact and dimension tables relate; these tables are the backbone of schemas in Power BI.

Fact tables are so called because they consist of the measurements, metrics, or facts of a business process; in other words, they hold quantifiable, measurable data. Take the example of an Adventure Works fact table: it sits at the center of a sample Adventure Works star schema, it's called sales orders, and it includes transaction details like order ID, product ID, customer ID, quantity, and total price. These are core facts about transactions, like the customer who made the purchase and the price of the product they purchased. This fact table is related to dimension tables. Dimension tables typically hold textual fields and provide descriptive attributes related to the fact data; they offer the context surrounding a business process event. In the Adventure Works star schema, the dimension tables are linked to the fact table and include date, customer, sales, and product data: descriptive details that can be used to identify, for example, individual customers.

These two examples should help you understand how fact and dimension tables inform the building of a schema. In the star schema model, the fact table sits at the center and the dimension tables radiate out like the points of a star, with each dimension table directly connected to the fact table. For example, the sales order table is the central fact table in the Adventure Works star schema, and the dimension tables, like date, customer, and product, are connected directly to it. This structure simplifies queries, because you only need to navigate through two tables to answer questions like "what were the total sales on a particular date?"

These fact and dimension tables can also be used to extend a star schema into a snowflake schema. A snowflake schema makes use of dimension tables by normalizing them. Normalization means that existing tables within a schema are divided into additional related tables; this technique creates a structure that resembles a snowflake, which is where the name comes from. For instance, in addition to a central fact table, Adventure Works's product dimension table could be split into a product table connected to subcategory and category tables. This schema reduces data redundancy but adds complexity to queries.
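In Power Query terms, splitting out such a dimension table is often done by referencing the original query and keeping only the distinct descriptive columns. A minimal sketch, assuming a hypothetical Products query with CategoryID and CategoryName columns:

    // A Category dimension built by referencing the Products query
    let
        Source = Products,

        // Keep only the category attributes
        CategoryColumns = Table.SelectColumns(Source, {"CategoryID", "CategoryName"}),

        // One row per category: this becomes the new dimension table,
        // related back to Products through CategoryID
        Categories = Table.Distinct(CategoryColumns)
    in
        Categories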
You can help Adventure Works use these schema designs to discover the cause of the delivery errors: import the required data sources, represent the data sets as a snowflake schema, and perform data analysis. Your analysis might reveal that the errors are linked to inventory management issues or incorrect addresses on record; with these insights, Adventure Works can fix its delivery processes and avoid future errors. You should now understand the importance of fact and dimension tables when building a database schema. With these tables, you can create different schemas that help to organize and make sense of your data and generate insights.

You'll often have to untangle large data sets and make sense of the relationships between tables, and an understanding of cardinality and table relationships can be useful in these situations. In this video, you'll explore the concept of cardinality and review the different relationships that can be created between tables in a database. To help with its business planning, Adventure Works asks questions of its data, like "which bicycle sells best in each region?" or "what is the revenue of each store?" However, the data required to answer these questions is stored across several tables, posing a complex data analytics challenge. Adventure Works can solve this challenge using cardinality and by identifying the table relationships.

Before we find out how Adventure Works can solve its data issues, let's take a few moments to explore the concept of cardinality in the context of data analytics. Cardinality refers to the nature of relationships between two data sets; in other words, how tables in your database relate to each other. It's important that your cardinality settings are correct: incorrect settings can lead to inaccurate data analysis and flawed business decisions. There are three types of cardinality, or relationships between tables, in Power BI.

The first is a one-to-one relationship. In this instance, a record in one column of table A corresponds to a unique record in one column of table B. One-to-one relationships are less common in data modeling, but they are useful in specific scenarios; for example, a single business entity can be loaded as two or more model tables because the data comes from different sources, a scenario that is common for dimension tables. In Adventure Works's data set, each bicycle model has a unique model ID listed in the product ID column, and a separate table lists specific features for each model ID in a product features column; together, these columns form a one-to-one relationship between the two tables.
because each employee works for one store, but each store has many employees. This is the most common type of relationship in data modeling, where one table acts as the primary table and the other tables act as related tables. Finally, there’s the many-to-many relationship, where multiple records in a column of table A are related to multiple records in a column of table B, in both directions. Many-to-many relationships are often used to establish a relationship between two fact tables or two dimension tables. In the case of Adventure Works, a customer can purchase many different bicycle models, logged in table B, and each bicycle model can be purchased by multiple customers, recorded in table A. This creates a many-to-many relationship. Understanding these relationships and configuring your settings appropriately helps your queries and calculations flow correctly and generate accurate insights.

Another important aspect to consider alongside cardinality is granularity. Granularity refers to the level of detail or depth of a data set, and the granularity of your data should align with the business questions you need to answer. For example, Adventure Works wants to view customer purchase histories over the past year. With granular data, you can explore individual transactions to analyze individual customer behavior and identify purchase patterns. If you want to understand which specific bicycle models are performing well in a region, you need sales data with high granularity. High granularity data captures detailed information about each transaction; for example, geographical sales of products can be captured by continent, country, state, city, and all the way down to individual stores. For a more general analysis, like total sales per store, a lower level of granularity suffices. Low granularity data captures a high-level summary, aggregated over broader categories; an example is monthly sales of a product category, where the sales data is summarized at the category level on a monthly basis. Understanding the granularity of your data is crucial for establishing correct cardinality. It also influences how you set up your cross filter direction in Power BI, which you’ll learn more about in a future lesson. Be careful when judging the required level of granularity: misjudging it can lead to misrepresented data and incorrect business insights, and excessive granularity can produce too much data and slow down your queries. By developing a keen understanding of cardinality and granularity, you can untangle complex data scenarios like the one at Adventure Works with confidence and ease.

Understanding the relationships between multiple data sets requires an advanced tool, and Microsoft Power BI’s cross filters are the perfect fit. In this video, you’ll explore the concept of cross filter direction and learn how to identify different types of cross filters. Adventure Works needs to calculate which members of its sales team have sold the most product types and should be awarded a bonus. However, the data required to generate this insight is spread across multiple tables with fixed cross filter directions. You can help Adventure Works analyze this data by changing the cross filter directions of its tables. But first, let’s find out what data analysts mean by cross filter direction. In Power BI, cross filter direction refers to the pathway, or the direction, through which filtering happens between two tables in a data model. It dictates how
data from one table influences the data in another table. This enables relational analysis without resorting to complex queries or manual data consolidation. Power BI relationships are directional in nature, unlike those in other database management systems, and the direction significantly impacts how filtering operates. Having a clear understanding of relationship direction is a crucial aspect of data modeling in Power BI.

Let’s look at how direction plays an important role. The Adventure Works data set contains three tables: product, sales, and salesperson. The product dimension table is connected to the sales fact table using a one-to-many relationship based on the product ID column common to both tables, and a one-to-many relationship also connects the salesperson dimension table to the sales fact table based on their common rep ID columns.

There are two types of cross filter direction. The first is single cross filter direction, the default setting in Power BI: the filter propagates from one table to another, but not vice versa. A good example of single cross filter direction is the scenario you just explored. Adventure Works’ product and salesperson dimension tables are connected to the company’s sales fact table via one-to-many relationships, and each arrow points in a single direction, indicating that the relationship’s direction is single. This means that sales data can be filtered by both product and salesperson, so when the product table is filtered for product one, the sales table is automatically filtered for all sales of product one.

The next type of filtering is bidirectional filtering: filtering against the direction of a relationship. Sometimes you’ll need to do this to answer a particular question. For example, as you learned earlier, Adventure Works requires a report on employee performance showing the number of products sold by each salesperson. You can generate this report using bidirectional cross filtering. To generate the required results, you must filter from the sales fact table to the salesperson and product dimension tables, so you need to change the direction of the filter to both. Let’s look at the process steps for this action. You apply a filter in the salesperson table for a specific sales team member, which filters the sales table for all sales by that person. The filter then propagates to the product table, because the direction is bidirectional, and you have now determined how many unique products the salesperson has sold.

There are a few important points to note when using bidirectional filtering. Bidirectional cross filter relationships can negatively impact performance, and configuring a bidirectional relationship can also result in ambiguous filter propagation paths. You can disable filter propagation within a relationship in Power BI using the CROSSFILTER DAX function. This setting can be particularly useful in certain advanced scenarios where you must isolate data for independent analysis; you’ll learn more about DAX in the next module. The direction of relationships plays a very important role in data modeling in Power BI, and properly applying cross filter directions can drastically enhance data analysis, leading to more insightful and actionable conclusions.
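As a rough illustration of those last points, measures like the following could override or disable a relationship’s filter direction at query time with CROSSFILTER. This is a minimal sketch: the table names (Sales, Salesperson) and the Rep ID and Product ID columns are assumed from the scenario described above, not taken from an actual Adventure Works model.

    -- Override the Sales-Salesperson relationship to filter in both directions
    Unique Products Sold =
    CALCULATE(
        DISTINCTCOUNT(Sales[Product ID]),
        CROSSFILTER(Sales[Rep ID], Salesperson[Rep ID], BOTH)
    )

    -- Disable filter propagation across the same relationship entirely
    Products Sold Ignoring Salesperson =
    CALCULATE(
        DISTINCTCOUNT(Sales[Product ID]),
        CROSSFILTER(Sales[Rep ID], Salesperson[Rep ID], NONE)
    )

Both measures leave the model’s stored relationship unchanged; the direction is modified only while the calculation is evaluated.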
Different data sets are explored at different levels of detail, depending on the questions to be asked, and answering those questions requires working with different levels of data granularity. Over the next few minutes, you’ll explore the concept of data granularity and discover how it can help inform your data analysis. Over at Adventure Works, the company needs sales data to help make strategic decisions about what products to stock. It must identify the highest and lowest performing products using annual and daily sales data, and you can help the company generate these insights by using data granularity to analyze its sales records.

Let’s begin by recapping what is meant by the term data granularity. As you might recall, data granularity refers to the level of detail or depth captured in a certain data set or data field. Granular data provides deeper and more precise insights, delivering more nuanced and valuable findings. Remember, data granularity isn’t about always having the highest level of detail; it’s about having the appropriate level of detail. Before you begin your analysis, ask yourself: do you require high granularity or low granularity? The decision should depend on the specific requirements and objectives of the analysis. It’s about striking the right balance between detail and manageability, precision and simplicity.

High granularity data records very detailed information about each transaction, providing a comprehensive overview that includes the specific attributes and metrics associated with each one. For instance, in Adventure Works’ data analysis, product-related data can be captured as product ID, category, subcategory, name, price, size, and weight. The benefits of high granularity include in-depth exploration of trends, patterns, and relationships within data sets to identify specific behaviors and anomalies; the flexibility to aggregate and summarize data at various levels of detail; and the ability to facilitate accurate decision making by drilling down into specific data points.

Next, let’s look at low granularity. In low granularity data, information is captured and analyzed at a high-level summary or an aggregated level. The data is not broken down into individual records; instead, it’s summarized over broader categories or periods. For example, Adventure Works can explore its sales by business quarter or by month. The benefits of low granularity include a simplified view that’s easier to understand and allows for analysis without an overwhelming level of detail; improved performance and reduced data volume, which leads to faster query execution; and quick identification of trends and patterns for informed decision-making.

Let’s take a closer look at data granularity and its role in data analysis. In the context of data analysis, high granularity data is often more desirable: it offers a finer level of detail, so it provides greater precision and potential for deeper insights. For instance, tracking sales hourly (high granularity) instead of monthly (low granularity) could reveal patterns like peak shopping hours during the day. However, working with high granularity data comes with its challenges: the more granular your data, the larger your data sets will be, potentially slowing down data processing and analysis. On the other hand, low granularity data, while offering less detail, can provide a broader view of your data, and it’s easier to manage because of the smaller data sets. In Adventure Works’ case, monthly sales data (low granularity) could help identify broader trends, such as seasonal sales fluctuations of certain product lines; for example, bicycle repair equipment sells more during the spring and summer months, because customers are more active on their bicycles.
Data granularity also has a significant impact on building relationships between tables in Power BI. By matching granularity levels, you can ensure the relationships are accurate and produce consistent aggregations; it also enables correct filtering and supports drill-down analysis. For example, to determine the highest and lowest selling products in the Adventure Works inventory, you must produce reports of total sales and budget over time using the sales and budget data. The sales data, in the sales table, has daily-level granularity, while the budget data, in the budget table, is monthly. To establish the relationship between the tables and produce accurate results, you need to format the dates in both tables to a common level and then build a relationship based on a commonly formatted date column. Understanding and manipulating data granularity is a powerful skill that all data analysts must master. The degree of granularity can impact the insights drawn and the ease with which data can be analyzed. With a firm understanding of data granularity, you can now approach your data analysis tasks with a refined perspective; it’s time to discover the story that the right level of detail in your data can tell.

Untangling complex, intricate data is often too large a task for one individual. Thankfully, a Power BI star schema can simplify complex data. Over the next few minutes, you’ll learn how to configure a star schema in Power BI, including differentiating between fact and dimension tables and configuring cardinality and cross filter direction. Adventure Works needs to organize its data to understand what products have been ordered and where they need to be shipped; you can help them organize the data using a star schema.

But first, let’s review the steps for setting up a star schema in Power BI. The first step is to disable autodetect: Power BI automatically detects relationships when you load multiple tables, so you need to disable this function in order to set your own relationships. The next step is to load your fact and dimension tables into Power BI. Select the required tables from your Excel spreadsheet or other relevant location and load them into the application. Once you’ve loaded the tables, you must create relationships between them; you can join tables by dragging relationships between key columns or from the Manage relationships section of Power BI Desktop. Finally, you need to set cardinality and cross filter direction. Cardinality determines how your database tables relate, and cross filter direction determines the pathway through which filtering occurs between your tables.

Now that you’re familiar with the steps for setting up a star schema in Power BI, let’s help out Adventure Works. As you’ve just discovered, the first step is to disable the autodetect function. Launch Power BI Desktop, go to File, and select Options and settings, then select Options to open the Options dialog box. On the left bar of the dialog box, select Data Load, then deselect Autodetect new relationships after data is loaded and select OK. Next, you need to load your fact and dimension tables into Power BI: select Home, then Get data, and select Excel workbook from the list of options in the Get data drop-down menu. Navigate to the Adventure Works company data spreadsheet and select Open. The Navigator menu appears on screen, displaying a list of available tables within your spreadsheet. You can select which tables you need from this menu, and you can also use the search bar to locate a table when working
with larger spreadsheets. A preview of each table appears in the preview pane when it’s selected. In this instance, you require the product, region, sales, and salesperson tables. Select these tables, then select Load; the tables are now visible in the Model view.

Your next step is to create the relationships between the tables. You must build one-to-many relationships between the sales table and the product, region, and salesperson tables. You can create a relationship between the product table and the sales table based on the product key column, which is common to both tables. Similarly, you need to relate the sales table to the region and salesperson tables based on the sales territory key column and the employee key column, respectively. Alternatively, you can create and configure relationships from the Manage relationships section of Power BI Desktop. From the Model view, select Manage relationships, then select New to open a dialog box called Create relationship, where you can build and configure relationships. Select the sales table from the drop-down menu, then select the product key column from the available options; then select the product table and its product key column.

Next, you need to set up the cardinality and cross filter direction. To set cardinality, select the Cardinality drop-down menu, then select the appropriate relationship type; in this case, it’s many-to-one. Finally, under the Cross filter direction drop-down menu, select the filter direction. Power BI’s default direction is single, so leave this as it is for the current scenario; however, before you ever select a bidirectional cross filter, make sure that you fully understand its implications. Select OK when finished. You can repeat this process to create relationships between the other tables: select New, then work through the same steps again to create more relationships. Select OK from the Create relationship dialog box when finished, then select Close from the Manage relationships dialog box to return to the Model view. The star schema is now ready to use: the sales table is the fact table, it sits in the middle of the model, and it connects to the salesperson, region, and product dimension tables. You should now be able to configure a star schema in Power BI, differentiate between fact and dimension tables, and configure cardinality and cross filter direction. Keep the data analysis needs of your organization in mind as you build and refine your star schemas; with practice, this powerful data modeling technique will become a vital tool in your data analysis toolkit.

Data is not always structured in a way that provides quick insights, but by leveraging the snowflake design schema, you can unlock your data’s full potential. In this video, you’ll explore the snowflake schema, learn how to build your own, and discover how to transition to one from a star schema. Adventure Works’ data is stored in a complex format, and the company is having difficulty retrieving the necessary information. You can help Adventure Works build a snowflake schema to enable more efficient data storage and make it easier to generate insights.

Let’s begin with an overview of the snowflake schema. The snowflake schema is a type of database schema design that optimizes data storage and retrieval by normalizing the data into multiple related tables. Unlike the star schema, which uses denormalized data with fewer tables, the snowflake schema consists of a central fact table connected to one or more dimension tables, and the dimension tables are further connected to other related tables to create a hierarchy. For example, the
product dimension table in the Adventure Works sales data set has a product category and a product subcategory. In a star schema, all three fields exist in one dimension table; in a snowflake schema, you can split this single table into three different tables, all related to one another via one-to-many relationships. Now, when you filter a specific product category, the filter is propagated through the tables, from product category to subcategory, to product, and then to sales.

As the Adventure Works example has just shown, the snowflake schema offers many benefits, making it an ideal choice for complex data structures in Power BI. Here’s a quick overview of some of these benefits. It simplifies dimension tables by splitting them into separate tables, which also improves data integrity, because hierarchical relationships more accurately represent the data. Splitting data sets into separate tables also helps to reduce data redundancy, because each attribute is only stored once. It enhances data analysis, because a more efficient structure means more accurate insights. And finally, a snowflake schema leads to better management of data using hierarchies.

Now that you’ve explored the basics of the snowflake schema and its benefits, let’s help Adventure Works build one. Before uploading the data set, you first need to turn off Power BI’s autodetect feature, which automatically creates relationships between the tables; you need to create these manually. To disable the feature, open Power BI Desktop, select File, Options and settings, and then Options. This opens the Options dialog box. Select the Data Load option on the left of the dialog box, deselect Autodetect new relationships after data is loaded, then select OK. Now you can load the Adventure Works data set. From the Home tab, select Get data, then select Excel workbook from the options in the drop-down menu. Navigate to the data set and select Open. The Navigator menu presents a list of available tables from the data set. Select the following tables: category, product, region, sales, salesperson, and subcategory, then select Load. The tables are loaded into Power BI and presented in the Model view.

You can now establish the relationships between the fact and dimension tables by dragging the primary key from the dimension table to the foreign key in the fact table. For example, drag the product key column from the product dimension table to the product key column in the sales fact table, then repeat this process for all related tables in the snowflake schema. Next, you must create hierarchies in the dimension tables to enable greater data analysis: create one-to-many relationships between the product table and the category and subcategory tables, based on the category ID and subcategory ID respectively. This creates a hierarchy of product dimensions.

But what if Adventure Works has already created a star schema? Let’s review the process for transitioning from a star to a snowflake schema. Open the Power BI project that contains the star schema. Your first step is to normalize the dimension tables: identify the tables in the star schema to be further normalized into related tables, create the separate tables, and then link them using foreign and primary keys. To create these tables, you’ll need to use DAX. You’ll explore DAX in greater detail in a later module; for now, let’s just use some basic DAX code. Select the Table tools tab, then select New table, and add the required DAX code to the formula bar to create a new category table. Repeat the same process with the required DAX code to create a subcategory table, as in the sketch below.
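The narration doesn’t show the exact code at this point, but calculated tables along these lines would extract distinct category and subcategory rows from an existing product table. The table and column names ('Product'[Category ID] and so on) are assumptions for illustration, not the actual Adventure Works column names.

    -- New dimension table holding one row per category
    Category =
    SUMMARIZE('Product', 'Product'[Category ID], 'Product'[Category Name])

    -- New dimension table holding one row per subcategory,
    -- keeping Category ID as the foreign key up the hierarchy
    Subcategory =
    SUMMARIZE(
        'Product',
        'Product'[Subcategory ID],
        'Product'[Subcategory Name],
        'Product'[Category ID]
    )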
Once you’ve created the new tables, Power BI attempts to detect the relationships between them; remove any new relationships that it establishes. Next, you need to update the product hierarchy in the dimension tables to reflect the new snowflake schema structure: build a relationship between the category and subcategory tables based on the category ID, then build a new relationship between the product and subcategory tables based on the subcategory ID. You can now use this hierarchy to interrogate data on individual products, product categories, and product subcategories. Configuring the snowflake schema in Power BI is a valuable skill; by mastering these skills, you can play a critical role in helping organizations make data-driven decisions, optimize operations, and drive growth.

Choosing the right schema generates valuable data insights; choosing the wrong schema generates incorrect and misleading insights. So how do you select a schema? In this video, you’ll discover why the snowflake schema is often the most suitable schema for your data sets. Adventure Works wants to use its data to generate business insights into its sales and marketing practices, so it needs to structure its data in a way that enables efficient querying and analysis. It considers using a star schema; however, the last star schema it used resulted in an overly simplified, denormalized data set. So you suggest a snowflake schema to more accurately represent and analyze the complex relationships between its data components.

As you discovered in earlier lessons, a star schema organizes data into a central fact table surrounded by dimension tables containing descriptive attributes. This structure is suitable for certain kinds of analysis, for example analyzing smaller data sets, but it becomes problematic when dealing with more complex hierarchical relationships. This is particularly true for the Adventure Works data set: by using the star schema’s denormalized approach, Adventure Works risks generating results that contain redundant data and a loss of data integrity, which would make it difficult to perform an accurate analysis of the data.

On the other hand, a snowflake schema would provide a much better approach. As you discovered previously, the snowflake schema optimizes data storage and retrieval by normalizing the data into multiple related tables. This structure provides more flexibility in defining complex dimension hierarchies, and it allows for the creation of subdimensions within these hierarchies, letting analysts explore data at much deeper levels of granularity. The downside is that the increased number of tables and joins can result in slower query performance, which impacts the team’s ability to derive insights and make data-driven decisions quickly.

The best approach for Adventure Works is to build a snowflake schema. This schema uses a more normalized approach, which is more beneficial for dealing with intricate data relationships, and it can be used to build out multiple levels of related tables in the form of a hierarchy. This is much more efficient than a star schema, which flattens a hierarchy into a single table. You can normalize several of the tables in the Adventure Works data set. For example, the product dimension table can be split into two separate tables, category and subcategory, making it much easier to analyze the performance of individual products and their related categories through deeper granularity.
Customer data can also be organized in a hierarchy: the team can explore customers and their purchases by country, state, and city. This level of granularity reveals insights into regional sales patterns and marketing campaigns. Another benefit of this hierarchical structure is that it helps the team to identify patterns and relationships between data sets. A snowflake schema also eliminates data redundancy: each attribute is stored only once in its respective table, and a unique identifier ensures consistent and accurate data. Finally, the normalization of dimension tables also helps to reduce the data model’s storage requirements, making the snowflake schema a much more efficient approach.

Choosing the right schema is crucial for data analysis, especially when dealing with complex data sets. As the case of Adventure Works shows, opting for a snowflake schema can help avoid the risks of using a star schema for hierarchical data relationships. As an entry-level data analyst, understanding the importance of using the correct schema for your data set is crucial. By recognizing when a snowflake schema is more appropriate than a star schema, you can optimize your data analysis process, leading to more accurate insights and better informed decision-making.

You might often encounter a data model that’s unsuitable or not fit for purpose and leads to data analysis issues. When this occurs, you can take steps to rebuild the model and fix these issues. Over the next few minutes, you’ll learn how to identify and resolve some common challenges arising from unsuitable data models. Adventure Works uses a star schema for its data model in Power BI to analyze sales and customer data, but this model is not effectively meeting the company’s analytical requirements. Adventure Works has very large data sets, and the company’s departments want to visualize this data according to their specific needs, which is difficult to achieve with the currently employed model. Adventure Works needs your help to resolve these issues and create a new, more suitable data model.

The first step is to analyze the existing model and identify its issues. Common issues you could find in a data model include inferior performance, issues with data consistency, and limited scalability. Let’s begin with the issue of inferior performance. The current data model might not be optimized for query performance, resulting in slow report generation and analysis. Complex calculations over larger data sets contribute to slow performance, making it difficult for business users to draw valuable real-time insights from the data. The sales table in the Adventure Works model contains columns like product descriptions; these columns can be normalized into a dimension table for faster insights.

The next issue identified with the data model is inconsistent data. Disparate sources of data can be integrated without being properly validated, introducing, for example, duplicate data or incorrect data types, which can lead to inaccurate reporting in your analysis. Adventure Works’ data model contains multiple examples of duplicate and inaccurate data across its tables; if these tables aren’t fully normalized, this redundant and inaccurate data will enter the company’s reports.

The final issue identified is limited scalability; in other words, the model cannot scale alongside the company to accommodate its increased data volume and evolving analytical needs. Adventure Works’ current model cannot integrate additional data sources,
emerging business requirements, or new analytical needs.

So now that you’ve completed your analysis and identified the issues, you need to resolve the model’s challenges. You can propose the following measures as a course of action. The first step is to conduct a thorough assessment of the current data model and find any other issues that might exist. Once you’ve identified all the issues, you can plan a redesign of the data model. To support meaningful analysis and decision making, you must also understand the following data model components: the model’s specific data elements and their sources, the dimension and fact tables, the relationships that exist between the model’s tables, and the model’s calculations and measures. Another important step is to collaborate with stakeholders and business users to define the analytical requirements and objectives to be achieved. For example, Adventure Works’ sales department wants to identify the top performing product categories for each region, and the marketing team wants to understand the impact of marketing campaigns within specific territories. Understand these analytical requirements and objectives so you can redesign a data model that implements all the requirements from the stakeholders and management team.

Based on your assessment, you’ve decided to redesign the data model as a snowflake schema. You can complete this process by performing the following actions: normalize the dimension tables, creating new tables where necessary; establish proper relationships and cardinality, and create hierarchies; compute custom calculations and measures using DAX; test and validate; and document all changes. These actions will improve model performance, enhance data integrity, remove data redundancies, and boost the scalability of data analysis. You then need to carry out the final few steps: transform and validate the data while implementing data quality checks, optimize the model, and test it to ensure it functions as required. Finally, deploy the new data model and train users so everyone is familiar with how it works. By implementing these steps, you can help Adventure Works resolve the challenges posed by the not-fit-for-purpose data model. The newly optimized data model will meet Adventure Works’ analytical requirements, improve its data integrity, and guarantee adaptability to changing business needs.

Congratulations on reaching the end of this first week in this course on data modeling in Power BI. This week, you’ve explored concepts for data modeling; let’s take a few minutes to recap what you’ve learned in this week’s lessons. You began the week with an introduction to data models. You learned how to identify the initial steps involved in data modeling, like defining relationships between tables, assigning data types, and creating calculated columns and measures. You then explored the process steps for building a data model in Power BI: connecting your data sources, preparing and transforming your data, and configuring the table properties. You also learned how to create model relationships and how to create measures and calculated columns with DAX, and you reviewed the benefits of data models. You discovered that data models can be used to enhance the performance of reports, improve calculations, improve analysis and insights, and deliver more accurate reports.

You then explored schemas. A schema is a structure that defines the organization and relationships of
tables within a data set. Three types of schema can be used to organize and structure data. The first is the flat schema, the simplest data model form: a set of rows and columns containing data. Then there’s the star schema: a central fact table that links to multiple dimension tables, connected through relationships. And finally there’s the snowflake schema, an extension of the star schema that breaks down dimension tables into multiple related tables. You first learned how to set up a flat schema, which involves removing duplicate data, formatting columns, and editing the table’s properties. In the lesson exercise, you configured a flat schema for Adventure Works, and you also completed an activity configuring a flat schema with multiple sources. Finally, you completed a knowledge check to test your understanding of data models, and you reviewed links to materials for further learning in the additional resources item.

The next lesson focused on cardinality and cross filter direction. This lesson began with an introduction to fact and dimension tables. A fact table holds quantifiable, measurable data on a business process and sits at the center of a star schema. Dimension tables provide descriptive attributes related to fact data and radiate out from the central fact table. A snowflake schema extends this design: it normalizes the dimension tables by breaking them down into additional related tables. Next, you explored the concept of cardinality, which refers to how your database tables relate to one another. Your cardinality settings must be correct to ensure your insights are accurate. There are three types of cardinality in Power BI. The first is the one-to-one relationship: a record in one column of table A corresponds to a unique record in one column of table B. Next is the one-to-many relationship: each record in a column of table A corresponds to multiple records in table B, but not vice versa; this is the most common relationship. Finally, there’s the many-to-many relationship, where multiple records in a column of table A are related to multiple records in a column of table B, in both directions.

You can work with these relationships using cross filters. Power BI offers single cross filter direction and bidirectional filtering. Single cross filter direction is the default setting: the filter propagates from one table to another, as in table A to table B, but not the other way. Bidirectional filtering is filtering against the direction of a relationship; this means changing the direction of the filter to both, so you can propagate the filter in the reverse direction. Another important aspect of cardinality is granularity, the level of detail or depth of a data set. The granularity of your data should align with the business questions you need to answer: do you need high granularity data, a data set that captures detailed information about individual transactions, or low granularity data, a data set that captures a high-level summary aggregated over broader categories? You then tested your understanding of these concepts: you completed a knowledge check on data models, and you reviewed links to materials for further learning in the additional resources item.

In the fourth and final lesson, you learned how to work with advanced data models. The lesson began with an introduction to setting up a star schema in Power BI. The key steps in this process involve loading the required tables, creating the
relationships between the tables based on common keys, and setting up cardinality and cross filter direction. You then completed an exercise configuring a star schema for Adventure Works in Power BI, and you compared your result against an exemplar. Next, you learned how to set up a snowflake schema in Power BI. The process steps are like those for setting up a star schema; the key difference is that you must create hierarchies in the dimension tables to enable greater analysis. You can also convert a star schema into a snowflake schema using DAX queries, and you put this knowledge into practice by changing an Adventure Works star schema into a snowflake schema. You continued your exploration of advanced data models with snowflake schemas: you reviewed the importance of snowflake schemas, including their key benefits, and you explored the process for resolving challenges in data models. Finally, you completed a knowledge check and a module quiz to test your knowledge of the concepts you encountered. You’ve now reached the end of this module summary. It’s time to move on to the discussion prompt, where you can discuss what you’ve learned with your peers. You’ll then be invited to explore additional resources to help you develop a deeper understanding of the topics in this lesson. Best of luck, and we’ll meet again during next week’s lessons.

What if you’re analyzing a data model and the data you need isn’t in the original model? If it’s possible to derive the data from the original model, you can use DAX (Data Analysis Expressions) to create custom calculations that generate the data. In this video, you’ll learn about DAX and explore the basic syntax of DAX formulas. Adventure Works needs to identify its top selling products and calculate its revenue, but these insights are beyond the scope of the original data model; they can only be generated by performing calculations on the existing data. So Adventure Works must use DAX to complete this task.

Let’s begin with an overview of DAX. DAX is a programming language used in Microsoft SQL Server Analysis Services, Power Pivot in Excel, and Power BI. It is a library of functions, operators, and constants used in formulas or expressions to create additional information about the data that’s not present in the original data model. With DAX expressions, you can create custom calculations on data models to extract maximum information from your data and solve real-world problems. To master DAX, you need to understand its syntax, the different data types, the operators, and how to refer to columns and tables using functions.

Let’s begin with the syntax. DAX usually computes values over columns in a table, so you need to know how to reference a column in a table. First, write the name of your new calculation, then add the equals sign operator. Next, write the name of your DAX function, then parentheses that contain the logic of your formula. Write a table name enclosed in single quotes, followed by the column name enclosed in square brackets; you can omit the table name if the referenced column is in the same table. Let’s demonstrate this using an example from Adventure Works. The Adventure Works sales table doesn’t include any data that denotes the total number of products sold, but the company can generate this data using DAX. In the DAX expression, Sales is the table name, followed by the referenced column name, Quantity; SUM is the DAX aggregation function; and Total Products Sold is the name of the new calculated column that holds the results of the calculation. When executed, this DAX formula adds a new column containing the required data to the existing table.
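Written out, the formula just described would look something like this. The Sales table and Quantity column names follow the narration; treat them as illustrative rather than as the precise names in the course files.

    -- Total number of products sold, summed over the Quantity column
    Total Products Sold = SUM(Sales[Quantity])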
Next, let’s review operators. DAX formulas rely on operators, and there are many different types: they can be used to perform arithmetic calculations, compare values, work with strings, or test conditions. Some commonly used operators in DAX include parentheses for grouping arguments, arithmetic operators for performing basic functions like addition and subtraction, and comparison operators for comparing values. DAX also uses logical operators to return true or false values, and concatenation operators to combine two or more values into a single string. Adventure Works can use operators in a DAX formula to calculate its total revenue. In this example, the multiplication operator multiplies the unit price by the quantity to compute the total revenue, the parentheses group the arguments of the expression, and the SUMX DAX function adds the resulting values to calculate the total revenue.
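A sketch of that revenue formula, using the SUMX iterator, is shown below; the Unit Price and Quantity column names are assumptions based on the narration.

    -- Multiply quantity by unit price row by row, then sum the results
    Total Revenue = SUMX(Sales, Sales[Quantity] * Sales[Unit Price])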
Finally, let’s move on to DAX functions. DAX functions perform various calculations, manipulate data, and create custom expressions. As you discovered in the earlier example, Adventure Works needs to calculate its total revenue, and it can perform this calculation using the SUMX DAX function. For now, you just need to be familiar with the concept of functions; you’ll explore them in more detail later in this lesson. It’s also important to understand that DAX is not just about formulas and functions: it involves understanding the data model, the relationships between tables, and the context in which calculations are made. For instance, understanding how the tables relate to one another in Adventure Works’ data model is crucial for creating meaningful calculations. Several important aspects of a relationship will help you to understand DAX: tables connected via a relationship are not the same, they are on either the one side or the many side of the relationship; the columns used to build the relationship are the keys of the relationship; the column on the one side of the relationship needs to have unique values; and table relationships can be either single or bidirectional, with the direction of the relationship determining the direction of automatic filtering. Remember, mastering DAX requires practice. Start with simple formulas and gradually incorporate more complex functions and operators, and ensure you understand your data model and the relationships within it. As your comfort with DAX grows, so will your ability to turn data into meaningful insights; eventually, you’ll be able to unleash the full potential of your data using DAX and gain valuable insights for decision-making.

DAX is a useful language for generating business insights using formulas. However, data analysts need to understand that DAX generates insights from data based on the context of that data. In this video, you’ll explore the concepts of row and filter context and discover how they impact data evaluation in DAX. Adventure Works needs to answer business-specific questions, like what are the total sales for each product, and what are the top selling items by category? It can generate these insights using DAX, whose formulas answer these questions by evaluating the relevant data according to its row and filter context.

Let’s find out more about the relationship between DAX and context. DAX computes formulas within a context. The evaluation context of a DAX formula is the surrounding area of the cell in which DAX evaluates and computes the formula. This surrounding area is determined by the set of rows and filters to be evaluated in a DAX expression; it determines which subset of data is used to perform calculations. DAX expressions adapt to, or refer to, the context to produce dynamic and context-aware results.

Let’s begin with an overview of row context. Row context refers to the table’s current row being evaluated within a calculation. When a DAX expression is evaluated for a specific row, it considers the values of the columns in that row as the context of the calculation. This allows calculations to be performed at row level, and it’s especially useful for iterating through rows within a table. For instance, if you create a formula for a calculated column, the row context for your formula includes the values from all the columns in the current row. Let’s demonstrate the concept using Adventure Works’ sales table. The table contains sales data for multiple products over one month, stored within the following columns: date, product, category, quantity, and price. Adventure Works wants to create a total sales calculated column that shows the total sales for each product in the table. The company can use a DAX formula to multiply the quantity data in the quantity column by the price data in the price column for each item. The formula iterates through the relevant quantity and price values at the row level and returns the results in the total sales calculated column. In other words, the formula calculates the new values via row context.

Next, let’s review filter context. As the name suggests, filter context refers to the filter constraints applied to the data before it’s evaluated by the DAX expression. In the previous example, a different result was produced in each cell because the same DAX expression was evaluated against different subsets of data. With filter context, however, you can determine which rows or subsets should be included in, or excluded from, the calculation. Let’s demonstrate filter context using the Adventure Works sales table. Adventure Works must calculate the total sales for all items in category X. The company can create a DAX formula containing filters that target all sales recorded against category X. Once the formula is executed, it iterates through each row and retrieves only the data with the category value of X.

Row and filter context also interact with each other to produce results: when a DAX expression is evaluated, it first considers the filter context, and then the row context takes effect. Let’s demonstrate how this occurs with Adventure Works. The company can use the filter context to narrow its sales data to a selected region; the row context then iterates each row in the filtered results and calculates the sales totals. As you’ve just discovered, a filter applied to a table column affects all table rows, keeping the rows that satisfy that filter. If you apply two or more filters to columns in the same table, they are executed under a logical AND condition, meaning only the rows satisfying all the filters are processed by the DAX expression in that filter context.
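To make the two contexts concrete, here is a minimal sketch of both examples. The Quantity, Price, and Category column names follow the narration; the calculation names themselves are illustrative.

    -- Row context: evaluated once per row of the Sales table
    Total Sales = Sales[Quantity] * Sales[Price]

    -- Filter context: CALCULATE restricts evaluation to category X rows
    Category X Sales =
    CALCULATE(SUM(Sales[Total Sales]), Sales[Category] = "X")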
Be careful when applying a filter in a large data model with multiple tables: a filter context automatically propagates through the relationships between the tables in the data model, based on the selected cross filter direction of the relationships. In this example, this means that when data is filtered in the sales order table, data in the related tables is also filtered. You can disconnect the tables to prevent propagation. A row context, on the other hand, doesn’t automatically propagate through a data model’s relationships. If you have a row context in a table, you can iterate the rows of a table on the many side of a one-to-many or many-to-many relationship using the RELATEDTABLE function, and you can access the rows of the parent table using the RELATED function of DAX. Understanding the context of DAX expressions at the row and filter level is important as you continue to build data models for reporting and visualization. Context affects how DAX interprets and analyzes your data, so always consider the context when creating and executing your DAX formulas.

As a data analyst, you’ll often have to perform complex calculations on large data sets beyond the scope of spreadsheet software like Microsoft Excel. In these instances, you need to utilize formulas and functions in DAX. In this video, you’ll review some commonly used DAX functions and examples of formulas that use these functions. Adventure Works has experienced steady growth in recent months, but this growth has led to data management issues, so Adventure Works needs a better way to generate insights from its data. Fortunately, DAX formulas and functions are the perfect solution for generating these insights. Let’s find out more about DAX formulas and functions, and then discover how Adventure Works can make use of them.

You previously learned about operators, the building blocks for creating a DAX formula. However, there are also many common formulas and calculations performed on data, and these are part of DAX’s extensive library of functions. Functions are reusable pieces of logic that can be used in a DAX formula. They can perform various tasks, including aggregations, conditional logic, and time intelligence calculations, and data analysts can use them to handle complex data challenges and drive meaningful insights. To use a function, you must be familiar with the syntax: a function begins with the function name, followed by parentheses containing the function’s parameters. DAX function names are typically expressed in capital letters to help differentiate them from table and column names. For example, Adventure Works could use a function to get the distinct count of rows in the customer key column of a table named Sales.

DAX expressions can be difficult to write, particularly complex calculations that require nested functions, so you can use variables in your DAX formulas to simplify calculations and store results for reuse. Variables store intermediate results in a temporary location; they’re like a storage box that you can put information into to be retrieved later. This improves reliability and readability and reduces the complexity of your expressions. You define a variable in DAX by placing VAR before your variable and its expression, followed by RETURN, where the expression’s result is provided. Adventure Works can create a simple formula that defines two variables to generate insights into its sales and customer data: sales amount and customer number are variables defined to determine the total sales and the number of customers, respectively, and the RETURN statement divides one variable by the other. The entire expression’s result is given in the DAX query’s RETURN statement.
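Sketched out, those two examples might look like the following; the Customer Key and Sales Amount column names are assumptions based on the narration.

    -- Distinct count of customers in the Sales table
    Customer Count = DISTINCTCOUNT(Sales[Customer Key])

    -- Variables hold intermediate results; RETURN divides one by the other
    Average Sales per Customer =
    VAR SalesAmount = SUM(Sales[Sales Amount])
    VAR CustomerNumber = DISTINCTCOUNT(Sales[Customer Key])
    RETURN
        DIVIDE(SalesAmount, CustomerNumber)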
Although DAX functions can be classified into many broad categories, there are some commonly used functions. Let’s review these and discover how Adventure Works could leverage them to resolve its business problems. The CALCULATE function evaluates an expression in a context modified by the specified filters. Adventure Works can use the function to analyze total sales for a product category based on the color of the products: the company simply filters the products on a specified color, like blue. The CALCULATE function evaluates the sum of the sales table’s sales amount column in a modified, filtered context, where a new filter has been added to the product table’s color column. Another useful function is AVERAGEX, an iterator that returns the average of an expression evaluated for each row in a table. Adventure Works can use this function to calculate the collective average of freight and tax: the function calculates the average freight and tax on each order in the sales table, first summing freight plus tax amount in each row and then averaging those sums. You also need to be familiar with the SUMMARIZE function, which creates a summary table by grouping data based on one or more columns. Adventure Works can use the SUMMARIZE function to generate a sales summary report displaying annual sales for each product category. This function returns a summary of sales grouped by calendar year and product category, and the resulting table allows you to analyze the sales by year and product category. DAX is a powerful language for advanced data modeling and analysis, and its wide range of functions can be combined in formulas to generate deep insight. Remember that DAX functions can be combined to create complex calculations that perform multiple operations; this versatility and flexibility makes DAX an essential tool for data analysts.
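The three functions just described might be written roughly as follows. The table and column names ('Product'[Color], Sales[Freight], and so on) are illustrative assumptions, and the SUMMARIZE example presumes relationships to date and product tables.

    -- CALCULATE: total sales in a filter context modified to blue products
    Blue Product Sales =
    CALCULATE(SUM(Sales[Sales Amount]), 'Product'[Color] = "Blue")

    -- AVERAGEX: sum freight plus tax per row, then average those sums
    Average Freight and Tax =
    AVERAGEX(Sales, Sales[Freight] + Sales[Tax Amount])

    -- SUMMARIZE: annual sales grouped by year and product category
    Annual Sales by Category =
    SUMMARIZE(
        Sales,
        'Date'[Calendar Year],
        'Product'[Category],
        "Total Sales", SUM(Sales[Sales Amount])
    )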
You might not always be able to answer business questions using an existing data model: it could lack the required data or be too complex. In these instances, you can use calculated and cloned tables to enhance your data sets and improve your analysis. Over the next few minutes, you’ll explore calculated and cloned tables and learn how to create them from different sources using DAX functions. Adventure Works needs answers to business-specific questions about its sales and marketing, but its current data model isn’t up to the task. By creating calculated tables, however, the company can compare and analyze its data to generate the required insights. You can learn more about calculated and cloned tables by discovering how Adventure Works creates them using DAX functions.

Let’s begin with cloning a table. Cloning a table can be extremely useful for manipulating or augmenting data without affecting the original table. This is especially true when working with tables that are refreshed periodically, where any changes you make to the original table might be overwritten. For example, Adventure Works must augment its sales table to generate insights, but it doesn’t want to alter the original data, so the company can create and work from a cloned version of the table while leaving the original intact. A table can be cloned using a simple DAX formula: type the new table’s name, an equals operator, and the original table name in parentheses, preceded by the word ALL to instruct Power BI to clone all data from the target table. This formula states that the cloned table is equal to the original table. Adventure Works can use this syntax to create a clone of its sales table called sales data.

You can also use DAX to create a calculated table based on data from various sources. For example, Adventure Works must combine customer data from a database with sales data from an Excel spreadsheet to analyze the relationship between its sales and customers; the company can use DAX to merge these sources and enable its analysis. Calculated tables can also be used to normalize dimension tables: Adventure Works can use DAX to split its product dimension table into category and subcategory tables, creating a hierarchy that enables more efficient data exploration and reporting.

Now that you’re familiar with creating and cloning calculated tables, let’s help Adventure Works. Before we begin, let’s quickly review the data model: the sales table is the fact table, it’s connected to all other tables via one-to-many relationships, and the cross filter direction is set to single for all relationships. We’re now ready to start. The first step is to create a new calculated table using DAX. In the data view of Power BI, select New table from the Table tools tab to expand the DAX formula bar. Select the formula bar and write an ALL DAX function that extracts all data from the sales table to create a new cloned version of the table. Press Enter to execute the function and generate an exact copy of the sales table; the new cloned table is listed as cloned sales. Next, you need to create a calculated table based on different data sets: an annual sales summary table that references the sales and product tables from the imported data set. Select New table once again, then access the formula bar and write a DAX expression that uses the ADDCOLUMNS, SUMMARIZE, and CALCULATE functions to calculate and summarize the required data. Press Enter to execute the formula and generate a new table called annual sales summary. Finally, ensure you have the proper relationships set between the tables so the DAX functions work correctly; review the new calculated tables and the relationships in the Data pane and the Power BI Desktop model view. Adventure Works can now begin analyzing its sales data and answering specific business questions by creating visualizations and reports using the newly calculated tables and existing data. Calculated tables are useful in DAX and Power BI for simplifying and enhancing data analysis: you can deploy DAX functions to perform analysis without impacting the original data sets. Study these tools carefully and make them a central part of your skill set.
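The two calculated tables from this walkthrough could be sketched as follows. The exact expressions used in the course aren’t shown in the narration, so the grouping columns here (an Order Year column on Sales and a Category column on Product) are assumptions.

    -- Clone: an exact, filter-independent copy of the Sales table
    Cloned Sales = ALL(Sales)

    -- Summary table: one row per year and product category,
    -- with total sales computed via context transition in CALCULATE
    Annual Sales Summary =
    ADDCOLUMNS(
        SUMMARIZE(Sales, Sales[Order Year], 'Product'[Category]),
        "Total Sales", CALCULATE(SUM(Sales[Sales Amount]))
    )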
You might often encounter tables that don’t have the data you need. You can generate this data by combining existing columns to create a new calculated column. In this video, you’ll explore the basics of calculated columns in Power BI, learn how to create them using DAX, and evaluate their effectiveness in contributing to meaningful analysis. Adventure Works is analyzing the data in its sales table and realizes there’s no data for the profit margins on its product categories in the original data source. Calculated columns are the perfect solution to this problem: Adventure Works can add data on its profit margins by using DAX expressions to create new calculated columns.

Before you begin helping Adventure Works, let’s find out more about calculated columns. A calculated column is a new column added to an existing data table in Power BI. Data analysts can use calculated columns to derive new data from existing columns and add it to the data model; once added, these columns can be used in any part of a report or visual, just like any other column. Traditional columns are filled with data imported from a data source, whereas a calculated column is created by defining a DAX expression. You can create a DAX expression that calculates the data from two or more columns, and the result of this calculation is then added to the table as the new calculated column. Write the name of your calculated column and an equals operator, then write the names of the tables to be referenced in single quotation marks and their respective column names in square brackets, including a relevant arithmetic operator depending on the operation required. For example, Adventure Works can create a total sales calculated column by multiplying the quantity and unit price columns in its sales table.

Now that you’ve explored the purpose of calculated columns in Power BI, let’s help Adventure Works calculate the profit margin on its sales data by creating calculated columns in its sales table. Launch Power BI Desktop and load the Adventure Works data set. The workbook contains one table, called sales, which tracks Adventure Works’ recent sales data; access Power BI’s data view to view the sales table. Adventure Works needs to calculate its profit margin, but to do this it must first calculate its total sales for the quantity of each item sold, and the table is missing this data. You can add it by creating a new total sales column: you just need to multiply the quantity and unit price columns. Select the sales table from the Data pane on the right-hand side of Power BI Desktop. In the Table tools tab, select New column from the Calculations group; this opens the DAX formula bar. Write DAX code in the formula bar that multiplies the quantity column by the unit price column and adds the result as a new total sales column, then press Enter to execute the code. A new total sales calculated column appears under the sales table in the data view, on the right-hand side of the Power BI interface. You can use this new column in any report or visualization, like any other table column. Now that you’ve identified the total sales data, you can create a profit column to determine how much profit has been made on each item: write another DAX formula that subtracts the cost from the total sales and generates the result as a new profit column, then press Enter. Now that you’ve identified the profits, you can create the profit margin column: select New column again, then write another DAX formula in the formula bar that divides the profit column by the total sales column and generates the result in a profit margin calculated column, and press Enter. Finally, format the calculated columns: select the profit column and format it as currency, then format the profit margin column as a percentage. You should now understand the basics of calculated columns, be able to create them using DAX, and be able to evaluate their effectiveness.
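The three calculated columns from this walkthrough would look something like the following; the Quantity, Unit Price, and Cost column names are taken from the narration and may differ from the actual workbook.

    -- Revenue per row
    Total Sales = Sales[Quantity] * Sales[Unit Price]

    -- Profit per row
    Profit = Sales[Total Sales] - Sales[Cost]

    -- Margin per row; DIVIDE avoids divide-by-zero errors
    Profit Margin = DIVIDE(Sales[Profit], Sales[Total Sales])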
Measures uncover the information hidden in your data and help you tap into its real potential. Over the next few minutes, you'll explore measures and their importance for data analysis. You'll also explore how calculated tables are built from pre-calculated measures. Adventure Works needs to calculate its sales data for all the products it has sold this month, and it needs to ensure that this calculation can be updated monthly against new sales data. The company can generate these insights using measures. You can discover more about measures and how they function by exploring how Adventure Works uses them.

Let's begin with an overview of measures. Measures in Power BI perform calculations on data model fields and play a pivotal role in data analysis and interpretation. They are used to perform aggregations, calculations, or evaluations on data that provide meaningful insights, and they typically appear in data visualization elements such as charts, tables, and cards. By using measures, you can compute aggregated values such as sums, averages, minima, maxima, counts, or more complex statistical calculations.

Measures offer several benefits in data analytics and reporting. First, measures are calculated in the context of the visualization or report they are used in, which means they update dynamically based on filtering and other interactions within the report. In other words, if the context changes, so does the measure. This dynamic calculation allows you to dive deeper into data and gain insights from different angles and perspectives. Measures are also reusable: once created, you can continue to recall them in your code, which reduces the repetitive work of recreating the same calculations and ensures data consistency across all reports. Another benefit is performance tracking. Measures can track the performance of different aspects of a business and are commonly used to create key performance indicators, or KPIs, essential for monitoring business performance; KPIs provide a quick snapshot of performance against predefined targets or benchmarks. Finally, measures help maintain consistency in metrics across different visualizations and reports. Consistency ensures the same results appear regardless of filtering or grouping: your calculations must be standardized and uniformly applied throughout the analysis, ensuring accurate and reliable reporting across various visualizations and dashboards.

Measures can also be used to create calculated tables in Power BI. A calculated table is a table you add to a model, derived from existing tables by using a DAX formula. Adventure Works has created a measure called Total Sales, the sum of all sales across all products. Now the company needs a new product table that lists each product alongside its respective total sales. This can be done with a DAX formula in which Sales is the original table, Sales[Product] is the product column in the original table, and Total Sales is the measure Adventure Works created. Let's take a moment to explore a sample of the syntax used to create such a formula. Begin with the name of your new measure, followed by an equals operator, then add the required expression that contains the logic of your measure. For example, Adventure Works can create a new measure called Total Sales that calculates the total sales amount from the Sales table. When executed, the DAX formula for the calculated table will list each product and its total sales.

Creating calculated tables from pre-calculated measures is particularly useful for creating a summary table from large data sets, or for creating a table with data that does not exist in the original tables. This can enhance data analysis and visualization capabilities in Power BI. In this video, you have learned about measures and their importance in data analysis, and you can now explain how calculated tables are built from pre-calculated measures. Measures in Microsoft Power BI are essential to data analysis and interpretation. They offer dynamic, reusable, and complex calculation capabilities, enabling businesses to gain insights from their data and make data-driven decisions effectively and efficiently.
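The source doesn't print the exact formulas, but one way to sketch the measure and the calculated table is shown below. SUMMARIZECOLUMNS is just one suitable function for this kind of summary table, and the Sales Amount and Product column names are assumptions:

    -- measure: the sum of all sales across all products
    Total Sales = SUM(Sales[Sales Amount])

    -- calculated table: one row per product with its total sales
    Product Sales =
    SUMMARIZECOLUMNS(
        Sales[Product],
        "Total Sales", [Total Sales]
    )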
As a data analyst, you want to be able to provide your business with answers and solutions to the questions it is asking. Using measures, you can gain valuable insights into your data, drive strategic decisions, and enhance your business's performance. Over the next couple of minutes, you'll explore the different types of measures in Power BI. Adventure Works is using different types of measures to prepare its annual sales report. To compile this report, it must analyze its sales data across different regions and generate insights into specific products and sales team members. Let's explore the different types of measures Adventure Works can use to prepare its report.

Before we explore measures, let's quickly review the concept of additivity. Additivity refers to how measures behave when aggregated across different dimensions, for example when summing or averaging values. Not all measures behave the same way, so understanding the behavior and categorization of measures is crucial for accurate data analysis and visualization. In Power BI, measures are essential for performing quantitative analysis and deriving meaningful insights from data. They provide a way to summarize, calculate, and compare data across various dimensions based on specific criteria and business requirements. Measures can be categorized into three types: additive, semi-additive, and non-additive. Let's explore these types in more detail.

Additive measures facilitate data aggregation across any business dimension, like time, geography, or product categories. The basic mathematical operations applied to these measures are addition and subtraction, and they provide consistent results regardless of how you group the data. Additive measures use the SUM DAX function to aggregate over any attribute. For example, Adventure Works' monthly sales analysis report shows revenue and quantities sold by product category and region, for a specific unit of time, in this case per month. You can use additive measures to aggregate revenue and quantity sold by summing them across all dimensions. This allows you to view the total revenue and total quantities sold while analyzing the performance of various products, regions, and months of the year.

Next are non-additive measures, which cannot be meaningfully aggregated across any dimension. These measures involve calculations like ratios, averages, and percentages. The result of aggregating a non-additive measure can be skewed or misleading and should be handled with caution. For example, at Adventure Works, the average sales per customer is a non-additive measure. The average sales per customer in January is $300, and in February it's $350, but it doesn't make sense to add these averages and state that the average sales per customer for the two months is $650. Instead, calculate the total sales and total number of customers for the two months combined, then divide the total sales by the total number of customers to obtain the correct average sales per customer for the period, as the sketch below shows.
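A minimal sketch of that ratio-of-sums approach, assuming Sales Amount and CustomerID columns in the Sales table:

    -- wrong: summing or averaging the per-month averages skews the result
    -- right: divide total sales by the distinct number of customers
    Avg Sales per Customer =
    DIVIDE(
        SUM(Sales[Sales Amount]),
        DISTINCTCOUNT(Sales[CustomerID])
    )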
Finally, let's explore semi-additive measures. Semi-additive measures can be aggregated over some, but not all, dimensions. They're mostly used in situations where the data represents a state at a particular point in time: they have meaningful aggregation for certain dimensions, but not for all. Like additive measures, semi-additive measures use SUM to aggregate over some dimensions, but a different aggregation over others. Examples of semi-additive measures that Adventure Works uses include inventory balance and current account balance. Adventure Works has created a measure called Inventory at Hand. It uses this measure to add inventory across different product categories or store locations, but the measure can't be used to add up the inventory across time, like the change in inventory over a two-month period, because it's semi-additive. For example, Adventure Works had 50 bicycles in stock at the end of January and 60 at the end of February, but it would not be accurate to say it had 110 bicycles in stock for the two months: the stock level changed over this period; it wasn't a fixed unit or measurement.

You should now be able to identify and distinguish between the different types of measures in Power BI. Each of these measures plays a unique role in generating insights and guiding decision-making. As always with data analysis, it is vital to remember that the value lies not just in the numbers, but in their correct and thoughtful interpretation.

As a data analyst, you'll often have to identify trends from raw data, supported by empirical evidence. This sounds like a complicated task, but you can make it easier by using statistical functions. In this video, you'll explore the most common statistical functions used in measures, along with examples of each one. Adventure Works needs to identify trends in its business from raw data, and it can use several basic statistical functions to generate these insights. Exploring Adventure Works' use of these functions is a great way to understand how they work.

But first, let's establish what data analysts mean by statistical functions. Statistical functions calculate values related to statistical distributions and probability. They also allow you to perform calculations and comparisons that reveal meaningful information about the data. When it comes to quantitative data analysis, statistical functions are the lifeblood of the process: they enable in-depth analysis by providing insights into your data's trends, patterns, and relationships. Some common statistical functions you'll make use of include AVERAGE, MEDIAN, and COUNT. There's also DISTINCTCOUNT, MIN, which calculates the minimum, and MAX, which calculates the maximum.

Let's start with the AVERAGE function, also known as the mean. This function sums up all the numbers in a data set and divides the result by the total count of numbers. It is frequently used to identify a central tendency in a data set and is beneficial when you need to find the middle ground or commonality within data. For example, Adventure Works can use the AVERAGE function to identify its average sales amount. The company can create a calculation to generate this data using the AVERAGE function, where Sales is the name of the table that contains the sales data and Sales Amount is the column that contains the numbers for which it wants the average.

The next statistical function is MEDIAN, which calculates the middle value in a set of numbers. It sorts the numbers in ascending order and then selects the middle number; for data sets with an even number of observations, the median is the average of the two middle numbers. Unlike the average, the median is less affected by outliers and extreme values, which makes it useful for data sets with skewed distributions. For example, Adventure Works needs to compute response times for its customer service team; with this data, the company can measure the team's performance and identify areas of improvement. The data set contains a Support table with a Response Time column. Adventure Works can apply the MEDIAN function to compute the median value, where Support is the table name and Response Time is the column containing the numbers for which the company requires the median. Note that only numeric data types are supported by this function: dates, logical values, and text columns are not. Sketches of both functions follow.
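As a sketch, assuming the table and column names used in the narration, the two measures might read:

    -- central tendency of all sales amounts
    Average Sales = AVERAGE(Sales[Sales Amount])

    -- middle response time; less affected by outliers than the average
    Median Response Time = MEDIAN(Support[Response Time])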
Next, let's explore the COUNT function, which counts the number of values in a column. It is often used to measure the size of a data set; you can count all rows or only rows that meet specific criteria. The only argument in the function is a column, and when the function finds no rows to count, it returns a blank. For example, Adventure Works needs a report containing sales by product category. To generate this report, it needs to analyze the count of sales for each product category, and it can use a count formula to calculate this, where Category is the column name that contains the values to be counted.

Next, let's look at the DISTINCTCOUNT function, which counts the number of distinct values in a column. This function is helpful when you need to understand the count of unique values or categories. The only argument allowed is a column, and you can use columns containing any type of data. When the function finds no rows to count, it returns a blank; otherwise, it returns the count of distinct values. Adventure Works needs to analyze the number of unique daily visitors to its website. This data is stored in a Website table containing a Visitor ID column, so Adventure Works can use DISTINCTCOUNT to compute the number of unique visitors, where Website is the table name and Visitor ID is the column containing the values to be counted.

Lastly, let's examine the MIN and MAX functions. The MIN function identifies the smallest value in a column, or the smaller of two scalar expressions; the MAX function identifies the largest value in a column, or the larger of two scalar expressions. Together, they provide an overview of the range of your data. Adventure Works can use these functions to analyze its store inventory: the MIN and MAX functions identify the minimum and maximum product quantity from the Inventory table using the Quantity column, where Inventory is the name of the table and Quantity is the column containing the values to be evaluated. You should now be familiar with the most common statistical functions used in measures, as sketched below, and be able to make use of them. Mastering these functions will undoubtedly elevate your data analysis skills.
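Hedged sketches of these functions, using the table and column names from the narration. Note that COUNT itself only accepts numeric or date columns, so for a text column like Category the COUNTA variant is the appropriate choice:

    -- COUNTA counts non-blank values of any type, including text
    Sales Count = COUNTA(Sales[Category])

    -- number of unique daily visitors
    Unique Visitors = DISTINCTCOUNT(Website[Visitor ID])

    -- smallest and largest stock quantities
    Min Quantity = MIN(Inventory[Quantity])
    Max Quantity = MAX(Inventory[Quantity])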
Do you want to create custom calculations for tables, columns, and measures? You can create these custom calculations using DAX. Over the next couple of minutes, you'll learn about context and how it impacts DAX measures, and you'll examine different scenarios where measures are presented in various ways. Adventure Works wants to analyze its sales data, determine which customers make the largest purchases, and compute stock in hand across all stores in an inventory management scenario. At this stage of the course, you should be familiar with the concepts of DAX, measures, and contexts. You'll often create measures in the form of custom calculations, but these custom calculations are context-sensitive. It's important to understand the influence of context because it can result in variations in your calculations, based on the level of data you are evaluating, the model structure, and the visual you are using to represent it. An understanding of context and variation helps deliver accurate data analysis and provides business intelligence to key stakeholders.

Let's recap the basics of context. Context in DAX comes in two primary forms: row context and filter context. Row context is the current row being evaluated in an expression, like racing bikes in the Adventure Works data set. In contrast, when you build reports in Power BI, you can filter the report data, which results in DAX using the filter context: the subset of data the calculation operates upon, influenced by visual or report filters. For Adventure Works, it could be all cross-country bicycles sold in North America.

Now let's explore the impact of context on DAX, and how its use in measures can influence business decisions. Adventure Works wants to analyze and present a report on annual total revenue. The company can use a SUMX DAX formula to compute the sum of all the quantity values multiplied by the unit price in the Sales table. Applied to the Sales table, the formula computes the sum of all sales amounts, but this measure utilizes only the row context. Adventure Works needs more insights to drive key decisions through data. For example, it must understand which products are selling best to improve warehouse stock management and inform marketing decisions. To identify the best-performing product categories, Adventure Works can filter the data set using a DAX query that determines the total sales for products under the Bikes category. This query incorporates the filter context created by the Category column from the Product table, in addition to the row context.

Adventure Works also needs to determine which customers make the largest purchases. First, the company must determine the average purchase amount using the AVERAGE DAX function; applying this measure to the sales data set calculates the average sales amount per customer. To compute the measure for the customers with the highest purchases, you need to define logic based on customer ID: customers whose total sales amount is $2,000 and above are high-purchase customers, and those who spend less than $2,000 are average-purchase customers. In this case, the customer ID acts as a filter context to compute the measure. This tells the sales and marketing team which customers to target in their campaigns.

You should now be familiar with the impact of context on DAX. The context-sensitive nature of DAX is a powerful feature of Power BI: it enables dynamic calculations based on the context in which the formula is computed. Understanding how context impacts DAX allows users to create more accurate, insightful, and dynamic reports that can be tailored to specific business scenarios. The sketch below pulls these examples together.
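The exact formulas aren't shown in the source; the column names and the $2,000 classification logic below are reconstructed from the description:

    -- row context: SUMX iterates the Sales table row by row
    Total Revenue = SUMX(Sales, Sales[Quantity] * Sales[Unit Price])

    -- filter context: the same logic restricted to the Bikes category
    Bike Sales = CALCULATE([Total Revenue], 'Product'[Category] = "Bikes")

    -- evaluated per customer in a visual, each customer ID acts as the
    -- filter context, so the measure classifies one customer at a time
    Customer Segment =
    IF([Total Revenue] >= 2000, "High purchase", "Average purchase")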
Power BI is very effective for generating insights, but writing DAX code to analyze data takes time. Fortunately, you can create calculations and measures faster using Power BI's quick measures feature. Over the next few minutes, you'll explore the concept of quick measures, learn about the different types available, and review the process for creating them in Power BI. Adventure Works wants to quickly analyze and monitor the performance of its sales team against several key performance indicators, but constantly rewriting the same DAX code for each performance review is time-consuming. Adventure Works can speed up the process using Power BI's quick measures feature, so let's learn more about how quick measures work.

As you've just learned, quick measures are a useful technique for performing commonly used calculations quickly and easily. A quick measure runs a set of DAX commands behind the scenes, then presents the results as a new measure you can use in your reports and visualizations. In other words, you don't have to spend time writing DAX code; the measure does it for you based on the inputs you provide. There will still be times when you need to write DAX expressions for specific business case scenarios, but quick measures can still act as a good foundation. Many different categories of DAX calculations are available to work with, and you can modify these calculations to meet your specific analytical needs. When creating quick measures in Power BI, you can choose calculation types depending on the nature of the analysis you want to perform. Types of quick measures include aggregate per category, filters, and time intelligence, as well as totals, mathematical operations, and text.

Quick measures offer several benefits for data analytics and reporting. You can use quick measures to generate commonly used calculations with just a few clicks, which eliminates the need to write DAX expressions and makes the process more efficient. Another benefit is accessibility: you create quick measures using Power BI's user-friendly interface, so even users with limited DAX knowledge can create calculations. Quick measures also help democratize data: they empower business users to take ownership of their data analysis and reporting, and this simple, accessible tool for creating calculations reduces dependency on data experts. Finally, quick measures offer the flexibility to iterate and refine calculations: if you need to adjust a calculation or explore alternative metrics, you can easily modify your quick measures without affecting the underlying data.

Now that you're familiar with the basics of quick measures, let's help Adventure Works use them to track the performance of its sales team. Before we begin, let's quickly review the model. You've launched Power BI, connected to your data sources, and loaded, transformed, and configured the following tables for your model: Products, Region, Sales, and Salesperson. Now you can begin creating measures in Power BI. The first step is to select the report view or data view to access the calculations group. Within this group, select quick measure; the quick measures window appears on the screen, where you choose the required calculation type and fields to run the calculations. Alternatively, you can select the ellipsis next to the table name in the data pane, then select new quick measure from the drop-down menu. Remember that the measure is created by default in the table you have selected in the data pane. On the right side of the window, choose select calculation; this action opens a list of available calculation types in Power BI. Adventure Works must calculate what quantity of each product each team member has sold, so choose total for category (filters applied). Next, select the required fields from the right pane to perform the calculations: select the sales column from the Sales table and assign it as the base value, then select the Category column from the Products table and assign it to the category section. Then select add to add these elements to the measure. The new quick measure appears in the fields pane, and the underlying DAX formula appears in the formula bar.
Adventure Works also needs to know how much revenue each team member has generated this year. You can calculate this using a year-to-date sales measure, repeating the same process as before: select quick measure from the measure tools tab, then select the year-to-date total calculation type. Select the sales column from the Sales table as the base value and the Order Date column in the date section, then select add. A new measure called Sales YTD appears in the data pane. Thanks to your help, Adventure Works can now quickly track the performance of its sales team using quick measures, and you should now understand the importance of quick measures, be familiar with the different types available, and be able to create them in Power BI.

Measures are Power BI features that let you explore your data to create meaningful reports and visualizations. In this video, you'll learn how to create custom measures with DAX. Adventure Works needs to analyze its sales data to calculate its total sales and identify the top two best-selling products in each category and region. You can use DAX calculations to create custom measures that help Adventure Works generate these insights. Custom measures refers to user-defined calculations or metrics created using DAX; like traditional measures, custom measures generate insights about data.

Let's create custom measures to help Adventure Works generate insights into its sales data. Before we begin, let's quickly review the company's data model. You've launched Power BI, connected to your data sources, and loaded, transformed, and configured the following tables in the model: Products, Region, Sales, and Salesperson. Within our model, the Sales table is the fact table. It's connected to all other tables via a series of active one-to-many relationships, and the cross-filter direction is set to single for all relationships. We're now ready to start creating measures.

The first step is to create a new measure called Total Sales using DAX. In the data view of Power BI, select new measure from the table tools tab to expand the DAX formula bar, and type Total Sales as the name of your new measure. Be aware that any new measure added to the DAX formula bar is named Measure by default; if you don't rename your measures, new measures are named Measure, Measure 2, and so on. Give your measures unique names so they're easily identifiable, particularly when creating several measures. Write the Total Sales measure using the SUMX function to multiply the Unit Price and Quantity columns from the Sales table, as sketched below. When you enter your formula, a list of suggested functions appears after you type the equals operator; make sure you understand the functions on this list and select the one relevant to your calculation. Once you reference a table or column name, Power BI displays a drop-down list of available tables and columns within your data model; select the correct field when choosing a reference from the drop-down list to ensure your measure functions as required. Press Enter to execute the function and generate the new Total Sales measure. You can view the new measure within the table you selected, under the data pane on the right-hand side of the Power BI Desktop interface.
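Assuming Quantity and Unit Price columns, the measure described above might read:

    -- iterates the Sales table and sums quantity times price for each row
    Total Sales = SUMX(Sales, Sales[Quantity] * Sales[Unit Price])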
Next, you must create a measure that identifies the number one and number two top-selling products in each category. You can use the Total Sales measure to create this new custom measure. Select new measure to expand the formula bar and write a measure called Top Two Products. The measure begins with a variable that defines the ranking of products using the DAX VALUES function. The RETURN section returns the value of the required calculation: the CALCULATE function filters the results of the Total Sales measure based on the top two products, and the TOPN function defines the top products based on their respective sales, using the number two to represent the top two products. This is a dynamic measure that you can use to present the number one and number two top-selling products by product category, color, or region. Press Enter to execute the function. When executed, the function displays the results of the measure in a matrix or table that shows the total sales amount for the top two performing products in each category, and you can dig deeper into the data by working through different business years. Thanks to your help, Adventure Works now has the insights it requires, and you should now be able to create custom measures with DAX. This is a valuable new skill: when used correctly, you can deploy dynamic calculations to generate insights more quickly. A sketch of the measure follows.
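Reconstructed from the description (the source doesn't print the formula, so the product column name is an assumption), the measure might look like this:

    Top Two Products =
    VAR TopProducts =
        -- rank each distinct product by the Total Sales measure
        -- and keep the top two
        TOPN(2, VALUES('Product'[Product]), [Total Sales], DESC)
    RETURN
        -- restrict Total Sales to those two products
        CALCULATE([Total Sales], TopProducts)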
There may be times when you encounter a data model whose cardinality and cross-filter direction are configured in a way that makes it impossible to perform the necessary filtering. With the CROSSFILTER function, you can change the cross-filter direction for a specific measure while maintaining the original settings. In this video, you'll develop an understanding of the CROSSFILTER function, its syntax, and its relationship to measures. Adventure Works needs to analyze its sales performance for the previous few years, along with the performance of its sales team. However, its data model tables are connected via one-to-many relationships with a single cross-filter direction. This prevents the company from filtering the data as required, and changing the cross-filter direction to both results in a permanent change. Fortunately, Adventure Works can use the CROSSFILTER function to alter the direction while maintaining the original settings. Let's explore how this works.

As you've just discovered, the CROSSFILTER function changes the cross-filter direction between two tables for a specific measure while maintaining the original settings. In other words, it specifies the cross-filtering direction to use when calculating a relationship between two columns. So how do you create a CROSSFILTER function? CROSSFILTER can only be used within a DAX function that accepts a filter as an argument, like the CALCULATE function. This means the function receives the name of the table you want to filter, along with the required column, and the direction in which you want to filter. Let's explore an example. The syntax begins with the CROSSFILTER function, with the arguments placed in parentheses: the name of each table followed by the names of the required columns in square brackets, where the first column is typically the many side of the relationship and the second is the one side. Finally, add the filter direction. For example, Adventure Works could filter between both sides of the relationship on its Sales and Products tables using the product key columns common to both.

You might be familiar with cross-filter directions from earlier in this course; here's a quick recap of the possible directions in which you can filter the relationships in your model. You can use none, which means that no cross-filtering occurs within the relationship. There's also both, where filters applied on either side of the relationship propagate to the other. There's the one-way direction, where filters applied on one side of a relationship propagate to the other; however, you can't use the one-way option with a one-to-one relationship. Next is one-way, right filters left, in which filter propagation occurs from the right side to the left side of the relationship. And finally, there's one-way, left filters right, in which filter propagation occurs from the left side to the right side of the relationship.

Let's review an example of how Adventure Works can make use of the CROSSFILTER function. In the Adventure Works data model, the Sales fact table is related to the dimension tables via one-to-many relationships with a single cross-filter direction. This means that filters propagate from the Products table to the Sales table, but not in the other direction. So when Adventure Works analyzes products sold by year, the results aren't accurate, because the model can't filter the results correctly. You could try to resolve this issue by changing the cross-filter direction between the tables to both, but this also changes how filters work for all data between these tables. Instead, you can create a CROSSFILTER function using DAX to change the filter direction only for the current measure. Create a new products-by-year measure that computes the total number of products sold: the DISTINCTCOUNT function calculates the number of distinct values in the product key columns between the Sales and Products tables, and the CROSSFILTER function alters the cross-filter direction from single to both based on this column. Once Adventure Works analyzes the measure based on the Year column from the Date table, the results are accurate according to the business's analytical needs. You should now be familiar with the CROSSFILTER function and how it works. CROSSFILTER is a useful function for changing the direction of a relationship without changing the relationship itself, enabling visualizations with custom filtering that suits the business's needs. A sketch of the products-by-year measure follows.
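A minimal sketch of that measure, assuming ProductKey columns on both tables:

    Products by Year =
    CALCULATE(
        DISTINCTCOUNT(Sales[ProductKey]),
        -- switch the relationship to bidirectional for this measure only;
        -- the model's single cross-filter direction is left unchanged
        CROSSFILTER(Sales[ProductKey], 'Product'[ProductKey], Both)
    )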
You'll often create measures that generate answers to specific data questions, but what if you need your measure to answer another question? You can use the CALCULATE function to refocus your measure. In this video, you'll learn how the CALCULATE function can alter the filter context for measures. Adventure Works needs to analyze its total sales for all its products, and it also needs to generate more granular data, including sales of bikes, blue-colored products, and sales within the US region. It can calculate the total sales for all products using a standard measure, but insights into the other data will require more specific filters. Adventure Works can use the CALCULATE function to change the filter context and generate these insights, so let's learn more about how it works.

Changing the context of a filter means changing the data that the filter must analyze. For example, Adventure Works needs to create a calculation or measure that analyzes total sales for all its products; this is the original filter context. Once this calculation is completed, the company needs to explore its data in more granular detail by identifying how many bicycles it has sold. It can combine the original Total Sales measure with a new Bike Sales measure that generates insights into how many bicycles have been sold, so the filter context changes from all products to all bikes. Before you review some examples, let's review the syntax of the CALCULATE function: it is invoked with an expression as its first argument, followed by an optional set of filter arguments that are defined or modified by expressions.

To find out more about how this works, let's explore how Adventure Works makes use of the function. Adventure Works first needs to calculate its total sales. The company can create the Total Sales measure using the SUMX function, multiplying the Sales table's Quantity and Unit Price columns. This measure uses row context and iterates over each row of the Sales table to compute the total sales of products for Adventure Works, and the company can continue to use it in all the other calculations it needs to complete. Now that Adventure Works has a generic measure of total sales, it can refocus its filters to generate insights into bike sales. Adventure Works can create a new measure called Bike Sales that uses CALCULATE to analyze the sales of products in the Bikes category. When executed, the formula calls the Total Sales measure again; however, this time it adds the Bikes product category as an additional filter in the filter context. In other words, the filter context changes from all products to all bikes.

Next, Adventure Works needs to analyze all blue-colored products in each category. The company can write a new measure called Sales of Blue Products. When executed, the expression incorporates the blue color from the product Color column as additional context for the calculation, computing the total sales of blue-colored products from the entire data set. You can also specify multiple filters in the same CALCULATE function, and all the filters intersect regardless of the order in which they appear. For example, Adventure Works can create a measure called Sales of Blue Products in USA that computes the total sales of blue products in the USA region. This measure calculates the total number of blue products sold only in the United States by adding the Country column from the Region table to the overall filter context of the calculation.

But what if you've already created filters on these columns? Any existing filters will be overridden by those in your CALCULATE function. So how do you retain both sets of filters? You can use CALCULATE modifiers to keep the behaviors that already exist on your columns. An example of a CALCULATE modifier is KEEPFILTERS: you can wrap your filter argument in KEEPFILTERS, placing the argument in parentheses, to ensure that existing active filters on your columns are intersected with, rather than overridden by, the new filters. Other examples of CALCULATE modifiers include CROSSFILTER, ALL, and USERELATIONSHIP, which you'll explore in more detail later in this lesson. You should now be able to use the CALCULATE function to alter the filter context of your measures, so you can create measures that generate insights into your data and modify your measures' filters to ask and answer other questions about your data. The sketches below summarize these examples.
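Hedged sketches of the measures just described, with assumed table and column names:

    Bike Sales = CALCULATE([Total Sales], 'Product'[Category] = "Bikes")

    Sales of Blue Products = CALCULATE([Total Sales], 'Product'[Color] = "Blue")

    -- multiple filters intersect regardless of their order
    Sales of Blue Products in USA =
    CALCULATE(
        [Total Sales],
        'Product'[Color] = "Blue",
        Region[Country] = "United States"
    )

    -- KEEPFILTERS intersects with, rather than overrides, any existing
    -- filter on the Color column
    Sales of Blue Products Kept =
    CALCULATE([Total Sales], KEEPFILTERS('Product'[Color] = "Blue"))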
As a data analyst, unlocking fresh insights requires exploring data from multiple angles. With role-playing dimensions, you can explore your data from different perspectives and, through active and inactive relationships, eliminate the need for redundant data structures. In this video, you'll explore the concept of role-playing dimensions and active and inactive relationships. Adventure Works receives thousands of orders from all over the world, and it's important that the company continually analyzes its orders to avoid delayed or mistaken deliveries. It can use multiple dimensions to explore its order-related data from multiple angles. Let's find out more about role-playing dimensions by exploring how Adventure Works makes use of them.

In the context of Power BI, dimensions represent the various attributes or business entities used to organize data. Role-playing dimensions are instances of the same dimension used multiple times in a data model, where each instance plays a unique role by representing a different aspect of the data. This provides the flexibility to analyze data from different viewpoints without duplicating data tables. Let's demonstrate this with an example from the Adventure Works database. Adventure Works' sales and shipping departments operate in sequence: first, new sales are recorded in the sales data set with an order date; then the order's shipping date is recorded in the sales data set; finally, the system automatically generates a delivery date when the customer receives the product. So in the Adventure Works sales data set, the date dimension is used three times: for order dates, shipping dates, and delivery dates. This helps the business analyze sales performance based on order date and shipping date, and optimize delivery times through delivery date analysis, without creating separate date tables for each date type. When Adventure Works queries its data, the role of the date dimension is based on the fact column used to join the tables; for example, the table join relates to the sales Order Date column when analyzing sales by order date.

An important part of role-playing dimensions is active and inactive relationships. An active relationship is a relationship between two tables used for analysis, reporting, and visualization. An inactive relationship is a valid relationship that is not being actively used in the current analysis. To differentiate between active and inactive relationships, Power BI marks active relationships with a solid line and inactive relationships with a dotted line. Let's examine an example from Adventure Works. In the Adventure Works model, the Date and Sales tables have three relationships; however, there can only be one active relationship between two Power BI model tables, and all remaining relationships must be set to inactive. A single active relationship means there is a default filter propagation from the Date table to the Sales table. The active relationship is set to the most common filter used by the company's reports, which is the order date relationship. You can utilize an inactive relationship for specific analytical needs using the DAX USERELATIONSHIP function.

So how do active and inactive relationships relate to role-playing dimensions? Here's a quick demonstration of how these concepts function in the Adventure Works database, beginning with creating a role-playing dimension. After importing the Sales and Date tables, you can create two relationships between them: one for order date and another for shipping date. By default, the first relationship is active and the second is inactive. The Date table serves as a role-playing dimension for both order and shipping dates, and any analysis, reporting, and visualization you require can make use of the active relationship.
Occasionally, you'll need to analyze data from a unique perspective. For example, Adventure Works needs to calculate its total sales based on the shipping date. However, the shipping date is an inactive relationship, so this calculation requires a measure that employs the inactive relationship. This is where the DAX USERELATIONSHIP function comes in. To use the shipping date, the inactive relationship, create a measure using USERELATIONSHIP. For instance, to calculate the total sales based on the shipping date, you can create a DAX formula in which CALCULATE alters the filter context of the entire measure and SUM totals the Sales Amount column of the Sales table. The Sales table is connected to the Date table via the Order Date column, and by default each DAX calculation is based on that active relationship between the tables. The USERELATIONSHIP function overrides the relationship and establishes a temporary relationship based on the Sales table's Shipping Date column, the inactive relationship. The relationship becomes active only for the current calculation: the formula forces Power BI to use the inactive shipping date relationship for this calculation alone. Role-playing dimensions and active and inactive relationships in Power BI create an efficient data model for comprehensive analysis. Although it might take some time to get used to these concepts, they will prove invaluable as you navigate your Power BI journey.

As a data analyst, you'll often encounter table relationships that are difficult to perform analysis with. Fortunately, you can alter or manipulate table relationships to facilitate more efficient analysis using the USERELATIONSHIP function. Over the next few minutes, you'll explore the USERELATIONSHIP function, its syntax, and its application. Adventure Works needs to analyze its sales data based on the shipping date. It could create a calculated table for the shipping date and relate it to the Sales table; this might work well for a smaller data set, but Adventure Works has millions of shipping records. A more effective approach is for Adventure Works to use the USERELATIONSHIP function to create a measure that utilizes the inactive relationship between the tables.

Before we explore how Adventure Works can analyze its sales data, let's find out more about the USERELATIONSHIP function. USERELATIONSHIP is used within the CALCULATE function. It forces the inactive relationship between the tables to be used for the calculation in question, letting you switch contexts within your data model without changing the default relationship between the tables. It's most useful when there are multiple relationships between two tables. The function allows you to create context-aware calculations that can analyze data based on different date dimensions, or adjust analysis based on a different category of products. The advantage of USERELATIONSHIP is that it enables you to perform analyses using the different relationships available between related tables without affecting the overall structure of the data model.

Now that you've explored how the USERELATIONSHIP function works, let's review the syntax. Begin with the function, then place the arguments in parentheses: the names of the required tables and their respective columns that define the relationship. The order of the columns doesn't matter for an accurate calculation. This function doesn't return a value; rather than producing a scalar value or table when executed, it modifies the context of a calculation by overriding the relationship between tables. A sketch of the pattern follows.
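As a sketch, assuming a Sales Amount column and a date table named Date, the pattern looks like this:

    Total Sales by Shipping Date =
    CALCULATE(
        SUM(Sales[Sales Amount]),
        -- activate the inactive shipping date relationship
        -- for this calculation only
        USERELATIONSHIP(Sales[Shipping Date], 'Date'[Date])
    )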
Let's return to the Adventure Works data model to explore the syntax in action. As you discovered earlier, the data model has a Sales fact table and a Date dimension table, and the current active relationship runs from the Sales table's Order Date column to the Date table's Date column. As no shipping date dimension table exists in the data model, Adventure Works needs to create an additional relationship between the Sales fact table and the Date dimension table using the Sales table's Shipping Date column. By default, the active relationship is utilized for any analysis and visualization; however, there may be a requirement to calculate the total sales using the shipping date. To do this, Adventure Works can use the USERELATIONSHIP function within the CALCULATE function. First, it creates a Sales by Shipping Date measure, then it inputs the CALCULATE function followed by the required arguments in parentheses. In this argument, the SUM expression calculates the total of the Sales Amount column from the Sales table, and the USERELATIONSHIP function changes the context of the calculation by switching the relationship used from the Sales table's Order Date column and the Date table's Date column to the Sales table's Shipping Date column and the Date table's Date column. When executed, this calculation works with multiple relationships between these tables: an active relationship with the order date and an inactive relationship with the shipping date. It affects only the CALCULATE function where it's used and won't permanently alter the active relationship.

Let's review some important points to remember when working with USERELATIONSHIP. USERELATIONSHIP only works within the CALCULATE and CALCULATETABLE functions; if you try to use it elsewhere, you will receive an error. It can be used multiple times within a single CALCULATE function to switch multiple relationships. And the relationship it references must exist in the data model, though it doesn't have to be active. The USERELATIONSHIP function provides the flexibility to derive insights from different perspectives within a data model; this layer of flexibility makes it an essential function for data analysts to master.

It can be challenging for a data model to handle various roles for a single dimension, so analysts deploy the USERELATIONSHIP function in their calculations to configure role-playing dimensions. In this video, you'll learn how to configure a role-playing dimension in Power BI using CALCULATE and USERELATIONSHIP. Adventure Works wants to analyze its sales data based on the shipping date. Instead of creating a separate date dimension table, it can use the USERELATIONSHIP function in DAX to role-play dimensions. Help the company achieve this by launching Power BI Desktop and loading the Adventure Works data set. The data model contains two tables, called Sales and Date; the Sales table tracks Adventure Works' recent sales data. Access Power BI's model view to view the Sales and Date tables. After loading the data, the model is missing its relationships, so you can establish the relationships between the Sales and Date tables in the model view of Power BI. Select and drag the Order Date column from the Sales table to the Date column of the Date table; this is the active relationship between these two tables. Next, select and drag the Shipping Date field from the Sales table to the Date column of the Date table; this is an inactive relationship, represented by a dashed line. You can validate the relationship by double-clicking the connector line between the tables, which opens the edit relationship dialog box, where you can observe that the make this relationship active checkbox is unchecked.

Next, you need to create the measure Total Sales by Shipping Date. In the home tab of the data view, select new measure from the calculations group. This opens the DAX formula bar. Write DAX code in the formula bar that uses the USERELATIONSHIP function to apply the inactive relationship between the Date column of the Date table and the Shipping Date column of the Sales table, following the same pattern as the sketch above. Press Enter to execute the code. A new Total Sales by Shipping Date measure appears under the Sales table in the data pane on the right-hand side of the Power BI interface. You can use this new measure in any report or visualization to analyze monthly sales data based on the shipping date. You should now be familiar with the process for configuring a role-playing dimension in Power BI using CALCULATE and USERELATIONSHIP.
By now you should be familiar with methods for generating insights into your data, but the most powerful and effective data insights you can generate are time-based. In this video, you'll explore the concept of time intelligence and discover its importance by reviewing some scenarios where it can be applied. Over at Adventure Works, the company is preparing its sales strategies and marketing campaigns for the year ahead. As part of its preparation, it needs to generate insights into time-related data like seasonal trends, annual growth, and specific sales periods. Adventure Works can generate insights into these time-related aspects of its business by using time intelligence functions.

As the Adventure Works scenario suggests, time intelligence refers to methods and processes that aggregate and compare data over time. Data analysts can deploy time intelligence functions to analyze data based on time-related dimensions, which include dates, weeks, months, quarters, and years. You can also generate comparisons of time-related data over annual periods and year-to-date, or YTD. So why do data analysts view time intelligence as important? Time intelligence provides the ability to analyze data within the context of time, which enables a more in-depth understanding of trends and patterns. As the earlier Adventure Works example demonstrates, this data plays a significant role in a business's ability to generate insights that help with its planning, forecasting, and decision-making processes.

Let's explore a few other benefits of time intelligence. Time intelligence is useful for trend analysis: identifying trends in past business performance is crucial for future decisions. For example, Adventure Works can use time intelligence data to examine historical sales trends and recognize whether certain products sell better at specific times of the year. Insights derived from time intelligence also help with forecasting and predictive analysis. Adventure Works can forecast future trends and plan activity based on historical trends, making informed predictions about sales and demand that help with resource planning, budgeting, and risk management. For instance, if the data shows a consistent increase in mountain bike sales every spring, the company can ensure adequate inventory before the season starts. Time intelligence also enables real-time performance monitoring, made possible by creating dynamic measures like year-to-date (YTD) and month-to-date (MTD). Adventure Works can use these measures to monitor real-time performance against key performance indicators, then use these insights to respond quickly to changing conditions. Time intelligence calculations also facilitate comparative analysis, for example through year-over-year (YoY) calculations: Adventure Works can compare its current growth rate, sales performance, and other metrics against data from previous years to analyze its progress. Finally, time intelligence facilitates the optimization of sales and marketing strategies. Adventure Works can analyze its sales trends and the impact of its marketing efforts over time, then use the results of these analyses to fine-tune its marketing strategies and sales tactics to improve its results.
Now that you know its benefits, your next question might be: how do I use time intelligence? Implementing time intelligence involves creating calculated fields and measures to analyze data over time. You can use Power BI's automatic time intelligence features or deploy DAX formulas to create quick measures. Power BI offers an auto date/time feature that allows easy data analysis by year, quarter, month, and date, which is useful for smaller data models. Power BI automatically creates one date table for each date column in the data model to analyze data by different date attributes; this table is hidden from the user because Power BI handles it automatically. You can also use custom DAX calculations to shape your data model and implement time intelligence calculations with more complex and non-standard requirements. Time intelligence is essential for understanding and visualizing time-related trends and patterns in data, and as a Power BI developer, mastery of time intelligence calculations is key to generating meaningful information from your data.

Summarizing data over a specific period is a key skill for data analysts: time-based data can generate temporal insights and reveal trends within data. In this video, you'll review the importance of using DAX-based time intelligence functions to summarize data over time. Over at Adventure Works, the company needs to generate insights into its recent sales trends, including revenue growth, seasonal sales patterns, and the impact of marketing campaigns. Adventure Works can generate these insights by using time intelligence functions in DAX to summarize its data over time.

So what does it mean to summarize data over time? At its core, summarizing data over time is identifying trends, patterns, and anomalies in business performance over a specific period, like sales per quarter or annual growth. You can generate these insights by using time-based data summarization functions; some frequently used examples include TOTALYTD, DATESYTD, and DATESBETWEEN. Each function generates insights into different aspects of your data. The functions are written by stating the function name and the required arguments in parentheses. This basic structure is similar across all the functions, but the syntax for the arguments varies, and some functions must be combined with CALCULATE and other functions.

Let's begin with the year-to-date calculation. The year-to-date (YTD) calculation aggregates values from the beginning of the year to the specified date, for example all sales from January 1st of that year to the specified date. The TOTALYTD function requires two mandatory arguments and accepts two optional ones. The expression is the first mandatory argument; here, it calculates the total sales from the Sales table. Dates is the date column; we use Power BI's default date dimension in the current lesson. Filter and year-end date are optional parameters. For example, Adventure Works wants to evaluate its real-time sales performance. Call the measure Sales YTD and add the TOTALYTD function after the equals operator. In the first parameter, reference the Total Sales column from the Sales table and aggregate the values using SUM; in the second parameter, reference the Order Date column from the Sales table in square brackets. As you type the date field, Power BI lets you select a field from the table. A sketch follows.
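A sketch of the measure, reusing the Total Sales calculated column created earlier in the lesson (the column names are assumptions):

    -- running total from January 1st to the current date in the filter context;
    -- optional third and fourth arguments accept a filter and a year-end date
    Sales YTD = TOTALYTD(SUM(Sales[Total Sales]), Sales[Order Date])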
Next, let's review the DATESYTD function. This function returns a running total in the form of a single-column table containing the year-to-date (YTD) dates in the current filter context. It is part of a group that also includes the DATESMTD and DATESQTD DAX functions, for month-to-date (MTD) and quarter-to-date (QTD). You can pass these functions as filters into the CALCULATE DAX function. The syntax contains two arguments: the first is dates, the column containing the required dates, and the second is the year-end date, an optional parameter. While the TOTALYTD function is simple, it limits the filter expression to only one filter. If you need to apply multiple filter expressions within year-to-date values, use the CALCULATE function, then pass the DATESYTD function as one of the filter expressions. For example, Adventure Works needs a running total that calculates its year-to-date sales on a month-by-month basis, based on the Order Date column from the Sales table. It can calculate this by creating a measure called Sales YTD Method 2. The expression does not refer to any separate date table; instead, the DATESYTD function is combined with the CALCULATE function so Adventure Works can incorporate further filters. When executed, the expression returns the required running monthly total.

The next function is DATESBETWEEN. This function returns a table that contains all dates between a specified start date and an end date. The syntax contains three arguments: dates is the column containing dates, start date is the date expression for the start of the calculation, and end date is the date expression for the last date of the calculation. Adventure Works wants to evaluate its total sales over the summer season, so it must create a measure using the DATESBETWEEN function in DAX. The DAX code computes the total sales between June 1st and August 31st, 2023: the CALCULATE function computes the total sales values from the Sales table, and DATESBETWEEN defines the period for which the sales values are to be computed. When executed, the expression returns the required total sales figures.

As these examples have shown, your data model requires a date table, or date dimension, before you can use time intelligence functions. However, you can use Power BI's auto date/time intelligence if you're missing the date dimension, or you can create a date dimension in Power BI using Power Query or DAX. As you've just discovered, DAX-based time intelligence functions provide valuable flexibility in summarizing and analyzing time-based data, and you can use these functions with other DAX functions to build powerful and insightful data models. Sketches of both measures follow.
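Hedged sketches of both measures, assuming an Order Date column and a Total Sales measure as before:

    -- method 2: DATESYTD passed as a filter inside CALCULATE,
    -- leaving room for additional filter arguments
    Sales YTD Method 2 = CALCULATE([Total Sales], DATESYTD(Sales[Order Date]))

    -- total sales for the 2023 summer season
    Summer Sales 2023 =
    CALCULATE(
        [Total Sales],
        DATESBETWEEN(Sales[Order Date], DATE(2023, 6, 1), DATE(2023, 8, 31))
    )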
As a data analyst, it's important to be able to compare data sets, particularly those from different periods like previous years or months. In this video, you'll learn how to use DAX for comparison over time using time comparison functions like DATEADD, PARALLELPERIOD, and SAMEPERIODLASTYEAR. Adventure Works is preparing its marketing campaign for the holiday season, and as part of its preparations it needs to analyze and evaluate campaigns from previous years. Adventure Works can implement DAX time intelligence comparison functions to identify trends and patterns from previous years' marketing campaigns, then use these insights to inform its current campaign.

Before you can help Adventure Works, let's find out more about comparison over time. Comparison over time means, as the term suggests, comparing sets of data over specific periods, for example comparing sales from this month to last month. These comparisons are generated using time intelligence functions in DAX like SAMEPERIODLASTYEAR, DATEADD, and PARALLELPERIOD. The basic syntax for each function is to state the function name followed by the required arguments in parentheses; the rest of your syntax can vary according to the function's requirements and your analytical needs. When executed, the functions return insights in the form of a table. Let's explore an example of each function from the Adventure Works database to learn more about how they work.

The SAMEPERIODLASTYEAR function returns a table that contains a column of dates shifted one year back in time from the dates in the specified dates column, in the current context. In other words, it compares the current period against the same period from last year. The syntax for this function requires one argument in the form of a dates column. Adventure Works can use this function to evaluate its sales from the previous year and compare them against the sales team's performance from this year. It first creates a measure called Revenue Previous Year, then defines a VAR as the variable for the previous year's revenue. CALCULATE computes the total revenue based on the SAMEPERIODLASTYEAR function, which takes the date column from the Sales table as its parameter; in this instance, we are using Power BI's autogenerated date dimension. Finally, the RETURN section outputs the value of the entire expression.

Next, Adventure Works wants to evaluate its year-over-year change in sales. It can modify the measure it just created to calculate the change ratio, creating a new measure called Revenue Year-on-Year Percentage. The variables used in the expression enhance code readability and query performance, and in addition to the previous calculation, the DIVIDE function computes the change ratio by dividing the difference in sales amount by the previous year's revenue. The results of both measures can be visualized in table format, for example in a table extract comparing revenue for July and August over a three-year period.

Next, let's look at the DATEADD function. DATEADD returns a table containing a column of dates, shifted either forward or backward in time by the specified number of intervals from the dates in the current context. The syntax contains three arguments: dates is the column containing the required dates, number of intervals is the integer value that defines the number of intervals to add or subtract from the dates, and interval is the unit of time by which to shift the dates, which can be a year, quarter, month, or day. For example, Adventure Works can use the DATEADD function to compare this month's sales with the previous month's sales. The CALCULATE function computes the total revenue based on the filter arguments, reusing the previously created revenue measure. The DATEADD function takes the Order Date column from the Sales table as its date reference; one represents the number of intervals, the negative sign indicates that the intervals run back in time, and MONTH is the unit of time (you can also use DAY, QUARTER, or YEAR). The results of this measure can be visualized in table format, for example in a table extract comparing sales revenue for August to October over a two-year period.

Comparing data over time is a powerful method for deciphering business trends and growth patterns, and mastering this skill will enable you to provide valuable insights to help your organization strategize and grow. Sketches of the measures described above follow.
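Sketches reconstructed from the narration, with assumed names; the VAR/RETURN structure mirrors the description:

    Revenue Previous Year =
    VAR PrevYearRevenue =
        CALCULATE([Total Revenue], SAMEPERIODLASTYEAR(Sales[Order Date]))
    RETURN
        PrevYearRevenue

    -- year-over-year change ratio
    Revenue YoY % =
    DIVIDE([Total Revenue] - [Revenue Previous Year], [Revenue Previous Year])

    -- shift the evaluation period back one month
    Revenue Previous Month =
    CALCULATE([Total Revenue], DATEADD(Sales[Order Date], -1, MONTH))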
In this video, you'll explore the process for setting up a common date table and the benefits it provides. Adventure Works' data model has multiple fact tables tracking different aspects of its business, like sales, products, and resellers, but the data model doesn't contain a date table. This means there's a risk that the different fact tables might represent dates differently; without a common date table, it is difficult to compare or relate data from diverse sources. Let's find out more about the role of a common date table, then help Adventure Works add one to its data model. A common date table, or date dimension, is a prerequisite for time intelligence calculations: you can't execute them without a date dimension. The date dimension must meet the following requirements: there must be one record per day, there must be no missing or blank dates, and it must start from the minimum date and end at the maximum date corresponding to the fields in your parameters. But what if your data model is missing a date dimension? In this instance, you can use Power BI's auto date/time intelligence. You can also create a date dimension in Power BI using either Power Query or DAX, which is useful when working on large data sets with complex calculations. You can create a date dimension with DAX using the CALENDAR and CALENDARAUTO functions; both functions return a calculated table with a single date column and a list of date values when executed. Adventure Works could use the CALENDAR function to create its date dimension: the company can define the CALENDAR function as a calculated table called Date, and then include its required period's start and end dates as the arguments. It can also use CALENDARAUTO. The CALENDARAUTO function scans the data model for date columns; here it takes the start and end date from the order date column in the Adventure Works Sales table. Fiscal year-end month is an optional parameter that can be defined for a different year-end month; for example, if you specify three, the year starts on April 1st and ends on March 31st. If not specified, Power BI takes the default year-end month, which is December. Now that you've explored the basics of a common date table, let's help Adventure Works build one in its data model. Begin by launching Power BI Desktop and loading the Adventure Works data set. The data model contains five tables: Sales, Salesperson, Products, Reseller, and Region; the Sales table tracks Adventure Works' sales data. The data model has no date dimension table, so you'll need to create one. Navigate to the Home tab and select New table. In the formula bar that appears on screen, write the DAX code using the CALENDAR function to create the date dimension table; this table must calculate all date values between the 1st of January 2017 and the 31st of December 2021. When executed, the DAX code creates a table with a single column containing the dates specified in your code. The date values in the column also have timestamps; format the column as a date format to remove the timestamps by selecting an appropriate format from the drop-down list in the Format section. To populate the common date table, navigate to the Home tab and select New column, then write more DAX code using date-related functions like YEAR, MONTH, WEEKNUM, and WEEKDAY; these functions extract the relevant information from the date column.
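As a hedged sketch, the calculated date table and its helper columns might look like this in DAX; the column names are illustrative.

```
Date = CALENDAR ( DATE ( 2017, 1, 1 ), DATE ( 2021, 12, 31 ) )

-- Alternatively, scan the model for date columns; the optional argument
-- sets the fiscal year-end month (3 = a year ending March 31st):
-- Date = CALENDARAUTO ( 3 )

-- Calculated columns to populate the table:
Year = YEAR ( 'Date'[Date] )
Month = MONTH ( 'Date'[Date] )
Week Number = WEEKNUM ( 'Date'[Date] )
Weekday = WEEKDAY ( 'Date'[Date] )
```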
Next, you need to mark the common date table as the date table. Navigate to the Data pane, select the ellipses to the right of the Date table, and select Mark as date table from the drop-down list of options. This opens the Mark as date table dialogue box. Select the Date option from the Date column drop-down menu. If these steps are completed successfully, a validation message appears; select OK. This action overrides Power BI's autogenerated date dimension for all time intelligence and date-based calculations in DAX within the data model. Finally, access the Model view of Power BI and establish a new one-to-many relationship, with single cross-filter direction, between the Date table and the Sales fact table by dragging the Date column from the Date table to the Order Date column in the Sales table. The model is now configured for time intelligence calculations, and Adventure Works can use it to generate its time-based reports and visualizations. You should now be familiar with configuring and formatting a common date table in your data model. A common date table makes the data analysis process more accurate and efficient; it's an essential part of every data analyst's toolkit. To execute time intelligence functions, your data model must contain a common date table. In this video, you'll explore the process for setting up a common date table using M language in Power Query. Adventure Works must execute time intelligence functions, but its data model lacks a common date table, so let's help by creating a date table using M language in Power Query. M is a Power BI development language used in Power Query to create new dimensions and tables within a data model; it provides a much more visual approach to creating dimension tables. To assist Adventure Works, load the data tables into the Power BI data model, then select Transform data in Power BI Desktop to open the Power Query Editor. Access the Home tab, select New source, and select Blank query from the drop-down list of options. Add the required M language code to create the date dimension table in the editor. The List.Dates function in this code lists the dates based on the provided date range; in this instance, you're creating a five-year table beginning January 1st, 2017. The expression 365 * 5 represents all the possible dates within this five-year range, and #duration specifies the length of each step, with one equaling one day.
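A minimal sketch of that blank-query M code, under the assumptions just described, might look like this:

```
let
    // A five-year list of dates beginning 1 January 2017;
    // #duration(1, 0, 0, 0) is a step of one day.
    Source = List.Dates(#date(2017, 1, 1), 365 * 5, #duration(1, 0, 0, 0))
in
    Source
```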
Once you execute the code, Power BI generates a list of dates. These dates must be converted to a common date table. Navigate to the top left side of the Power Query Editor, and in the Transform tab select To table. This action converts the list of dates to a table with a column named List by default; rename the column Date. Next, you must change the column's data type to the date data type: right-click to open the drop-down list, select Change type, and select the Date option from the list. Now you need to populate the table with the related columns. Select the table's Date column and navigate to the Add column tab of the Power Query Editor. Select the Date section to expand the drop-down list of options, and select the following columns to add to the table: year, month, name of month, name of day, and week of year. Access the Properties Name field in the Query settings and rename the query Date, then select Close and apply to return to the Power BI interface. Finally, select the ellipses next to the Date table in the Data pane and mark the table as a date table: select the Date column from the dialogue box, then select OK to confirm. Lastly, establish the required relationships between the data model's date table and the other tables. The model is now configured for creating time intelligence measures using DAX and for creating reports and visualizations. In this video, you learned how to set up a common date table using M language in Power Query. This video was a short introduction to M language and Power Query; you'll learn more about M language as you continue your Power BI studies. Meet Tina, Adventure Works' in-house expert on using time intelligence calculations in DAX. Adventure Works is looking to optimize all aspects of its business, from sales and deliveries to financial planning, using time intelligence calculations in DAX. The company suggests that Tina analyze its data in these areas and generate insights that reveal where improvements could be made to the business. First, Tina focuses on sales. She performs time-based trend analyses using year-to-date functions to analyze trends and patterns in sales over time. Her analyses reveal seasonal spikes and downward trends in sales of certain products over different months and quarters. Adventure Works can use these insights to forecast demand for its products: the company better understands what products customers purchase and when they will most likely buy them, and it can design and implement marketing strategies targeting consumers during the months they're most likely to purchase specific products. Tina's insights into sales trends also help Adventure Works manage its inventory better: by identifying what kinds of bicycles customers are likely to buy and when, Adventure Works can ensure that these products are in stock in time for busy sales periods. Tina can also use time intelligence functions to track sales team performance. She can compare current and past performance data to prepare for the upcoming sales period; the insights generated from her comparisons are then used to set realistic targets for the team and identify the high performers. The upcoming sales period also requires large investments in inventory and marketing. Fortunately, time intelligence is also a useful budgeting and financial planning tool. Tina can compare actual financial data with budgeted values over different periods, assess financial performance, and track spending; the company's finance team can use these insights to make budget adjustments. Time intelligence functions can also identify issues and their root cause. For example, Adventure Works anticipated a high volume of sales of mountain bikes over the holiday sales period, but sales declined over the season. Tina can use time intelligence functions to drill into the related data and isolate these sales anomalies to analyze the root cause of the slowdown; for example, the decline in sales might indicate a shift in customer behavior that needs to be addressed. Time intelligence in Power BI is an important tool that businesses can use to harness the power of time dimensions in data analysis. Through time intelligence, businesses like Adventure Works can generate valuable insights that drive informed decision-making and help resolve issues. Congratulations on reaching the end of the second week in this course on data modeling in Power BI. This week, you've explored how to use Data Analysis Expressions, or DAX, in Power BI. Let's take a few minutes to recap what you've learned in this week's lessons. You began the first lesson by learning about DAX. DAX is a programming language that adds new information about existing data. It consists of a library of functions, operators, and constants; these are used in formulas, or expressions, to add information missing from the original data model. A key element of formulas is functions: functions are reusable logic used in a DAX formula to perform tasks like aggregation or calculations.
Commonly used DAX formulas and functions include CALCULATE, SUM, and AVERAGE. You then explored the syntax of a formula: a formula begins with the name of your new calculated column or table, followed by an operator, typically an equals sign; you then write the name of your DAX function and parentheses that contain the logic of your formula. You then learned about row and filter context. DAX computes formulas within a context; the evaluation context of a DAX formula is the surrounding area of the cell in which DAX evaluates and computes the formula. Row context refers to the table's current row being evaluated within a calculation, while filter context refers to the filter constraints applied to the data, which determine which rows or subsets should be included in or excluded from the calculation. You were then introduced to calculated tables and columns: a calculated table is a new table created within a data model based on data from different sources, and a calculated column is a new column added to an existing table that presents the results of a calculation. You then completed the lesson by putting your new skills to the test, assisting Adventure Works with its use of DAX in the exercise and completing a knowledge check. In the second lesson, you received an introduction to measures. You learned that a measure is a calculation or metric that generates meaningful insights from data. Measures are an important aspect of data analysis and play a lead role in creating calculated tables and columns. There are three different types of measures, additive, semi-additive, and non-additive, and which type is used depends on the needs of your data and its dimensions. A key element of measures is statistical functions. Statistical functions calculate values related to statistical distributions and probability to reveal information about your data. Several common statistical functions are used in measures, like AVERAGE, MEDIAN, and COUNT. You learned how to build statistical functions into your syntax and explored how to use common functions, such as using the AVERAGE function to calculate the average of a data set. You then discovered how context impacts DAX measures, reviewing Adventure Works business scenarios in which the context of measures influenced the company's business decisions. Finally, you tested your new skills with a knowledge check and explored additional learning material in the additional resources. In the third lesson, you expanded your understanding of measures. You began by learning how to create quick measures in Power BI using common calculations instead of DAX code. You then explored techniques for creating more complex custom measures with DAX. Next, you learned how to use the CROSSFILTER function. You can use the CROSSFILTER function to change the cross-filter direction between tables for a specific measure while maintaining the original table settings; a CROSSFILTER function can only be used within a DAX function that accepts a filter as an argument, like CALCULATE. You can use CALCULATE and its related modifiers to combine filters and generate more granular insights into your data. You then tested your new skills by adding a measure to an Adventure Works data set in the exercise, and you tested your understanding of the topics in a knowledge check. In the fourth lesson, you explored how DAX is used with table relationships. You began the lesson by learning about role-playing dimensions: instances of the same dimension used multiple times in a data model, where each instance plays a unique role by representing a different aspect of the data. This allows analysts to analyze data from different viewpoints without duplicating data tables. In a data model, relationships between tables are either active or inactive; you can configure these relationships using the USERELATIONSHIP function alongside the CALCULATE function to force the use of an inactive relationship, as in the sketches below.
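As a reference for these recap items, here are minimal DAX sketches; all table, column, and measure names are illustrative assumptions.

```
-- CROSSFILTER changes the cross-filter direction for this measure only:
Customer Count =
CALCULATE (
    DISTINCTCOUNT ( Sales[CustomerKey] ),
    CROSSFILTER ( Sales[CustomerKey], Customer[CustomerKey], BOTH )
)

-- USERELATIONSHIP activates an inactive relationship for one calculation
-- (assumes an inactive relationship on Sales[ShipDate] exists):
Revenue by Ship Date =
CALCULATE ( [Revenue], USERELATIONSHIP ( 'Date'[Date], Sales[ShipDate] ) )
```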
You completed this lesson by helping Adventure Works add a role-playing dimension between two tables in its data model, then tested your understanding of the topics in a knowledge check and explored further learning material in the additional resources. In this week's final lesson, you explored time intelligence in DAX. You learned that time intelligence functions refer to methods and processes that aggregate and compare data over time. These functions can be used in Power BI through the auto date/time feature or DAX. DAX can summarize data over time by identifying trends, patterns, and anomalies over a specific period, or it can be used for comparison over time by comparing data sets over specific periods. These insights are generated using summarization and comparison functions that return the required insights; there are also more complex functions that can be used with time intelligence. A prerequisite for using time intelligence functions is a common date table, or date dimension. If this isn't present in your data model, you can build one using the CALENDAR function or the CALENDARAUTO function; both functions return a calculated table with a single date column and a list of date values. You also learned how to generate a calculated date table using M language in Power Query. You then explored a real-world scenario where time intelligence played an important part in a business's decision-making process. During this lesson, you helped Adventure Works use time intelligence calculations in DAX during an exercise and activity. You've now reached the end of this module summary. It's time to move on to the discussion prompt, where you can discuss what you learned with your peers. You'll then be invited to explore additional resources to help you develop a deeper understanding of the topics in this lesson. Best of luck; we'll meet again during next week's lessons. Imagine you're a data analyst at Adventure Works, a thriving multinational bike manufacturing company. Your role is significant: it involves digging deep into the vast array of data, sifting through it, and translating it into meaningful, actionable insights. Decision makers at Adventure Works rely heavily on your Power BI dashboards, which provide a window into the world of Adventure Works' vast data landscape. These dashboards, through your analysis, guide the company and reveal its successes, challenges, and opportunities. However, over time you start noticing an issue: as the data volume grows, the reports are slowing down. Simple queries that used to take seconds now take many minutes, even hours. This bottleneck is frustrating staff, delaying decisions, and even starting to undermine the value of data-driven solutions. There is an urgency to fix the situation, and you must act before the issue escalates further. That's when you realize the need for performance optimization. This video covers the importance of performance optimization in Power BI and how it affects the overall performance of data models, reports, and dashboards. By the end of this video, you'll understand the benefits of Power BI performance optimization, such as enhanced speed and efficiency, informed decision-making, improved user experience, resource efficiency, and timely report generation.
Over the next few minutes, you'll learn about the challenges Adventure Works faces due to growing data volume and how performance optimization in Power BI can address these issues. In the context of Power BI, optimization refers to the process of modifying, tuning, or streamlining your data models, reports, and dashboards to achieve the best possible performance. At its core, it's all about making sure your reports and dashboards run as smoothly and quickly as possible. When you're dealing with small volumes of data, performance isn't typically a concern, but as your data grows, the performance of your Power BI solutions can start to deteriorate. This might manifest as slow report loading times, sluggish response times when interacting with dashboards, or even timeouts and errors. Performance issues can arise due to a variety of factors, including inefficient data models, complex DAX calculations, and inappropriate visuals. However, regardless of the cause, performance issues can have a significant negative impact on the user experience and the usefulness of your Power BI solutions. That's where performance optimization comes in: by understanding and applying optimization techniques, you can improve the performance of your Power BI solutions, ensuring they continue to deliver value as your data grows. Now let's dive into some of the benefits provided by performance optimization. First, enhanced speed and efficiency. Adventure Works manages enormous volumes of data, from sales records, production statistics, and customer interactions to employee information. This data holds valuable insights that guide strategic decision-making. By optimizing your Power BI report and data model, you can significantly cut down the loading and processing time of large data sets, allowing you to execute queries faster. This means the different teams at Adventure Works, from sales to production to management, can quickly access the data they need, reducing wait times and enhancing overall productivity. The next benefit of performance optimization is informed decision-making. The ability to make timely and informed decisions at Adventure Works is critical to its success. If there's a sudden drop in sales of a specific bike model, or if a new bike accessory becomes a hot seller, company decision makers must know about it as soon as possible to adjust strategies accordingly. With an optimized Power BI data model, reports load swiftly, enabling faster analysis of trends and thereby leading to more prompt, informed decisions. Next, let's look at the improved user experience that comes from optimizing performance in Power BI. At Adventure Works, numerous team members rely on Power BI reports for their tasks; slow-loading reports can lead to frustration, loss of time, and lower productivity. In contrast, an optimized Power BI system can dramatically improve the user experience by ensuring reports load smoothly and swiftly. This way, team members can focus on deriving insights instead of waiting for reports to load. As Adventure Works continues to expand, the data it manages grows as well, requiring more computing resources; in this situation, the company needs more efficient use of resources. An optimized Power BI data model can make more efficient use of resources, handling larger volumes of data without a noticeable drop in performance. This is crucial, as it allows Adventure Works to handle its growth and the accompanying increase in data without requiring excessive increases in computing resources. Lastly, there is timely report generation. Different teams at Adventure Works may require regular reports to function efficiently.
The sales team might need weekly sales reports, while the manufacturing team might require daily production reports. With an optimized Power BI data model, these reports can be generated and distributed in a timely manner, facilitating smooth operations across the company and ensuring each team has the data it needs, when it needs it. By embracing the power of performance optimization in Power BI, you're not just enhancing the speed and efficiency of reports and dashboards; you're helping Adventure Works make better decisions, faster. Remember: every second saved in loading a report, every query executed faster, every frustration eliminated by a smoothly loading dashboard, these are victories in your quest to unlock the full potential of data. So continue to explore, optimize, and innovate, for it's through these actions that you make a difference in organizations, industries, and the world. You are the data pioneer, and the future is in your hands. Imagine it's your first day at Adventure Works, a multinational manufacturing company renowned for its premium bicycles. As a newly hired data analyst, you have an enormous challenge: to analyze the constant stream of data generated by the company's diverse operations. Every sale in North America, every accessory produced in Asia, and every customer interaction in Europe sends ripples through the vast ocean of data that Adventure Works amasses every day. This data is a disorganized treasure trove filled with critical insights that can drive strategic decision-making and fuel the company's continued growth. But how do you extract these precious insights from an unoptimized data set? That's where your secret weapon comes in: the effective combination of optimization techniques and Power BI. This video aims to assist you in understanding the fundamental concept of optimization in Power BI, using a relatable scenario set in the context of Adventure Works. By the end of this video, you'll understand the various optimization techniques, such as sorting, filtering, indexing, and data transformation, and how they contribute to enhancing the efficiency and accuracy of data analysis. Over the next few minutes, you'll learn the importance of optimization in decision-making and strategy formulation. To recap, optimization in the context of Power BI is the process of transforming, cleaning, and organizing your data sets to achieve the best possible data performance. Optimization involves techniques like filtering, sorting, and indexing, which can make your data more manageable and your searches faster, improving overall efficiency. Adventure Works operates in a data-intensive environment. This includes sales data from diverse markets, manufacturing data from various plants, product management data on hundreds of items, human resource data on employees from different regions, and much more. To help understand this, let's put ourselves in the shoes of Lucas Pereira, an assistant data analyst at Adventure Works. Lucas is tasked with understanding the sales performance of the different bike models across North America. The sales data in front of Lucas is vast, filled with information about bike models, sales dates, customer details, and regions. This is where optimization becomes a vital tool in Lucas's arsenal. There are four tools that will help Lucas with his task: sorting, filtering, indexing, and data transformation. In Power BI, sorting is an optimization technique that allows Lucas to organize his data alphabetically by bike model. This seemingly straightforward step is like putting on a pair of glasses: it sharpens the focus on the sales patterns and performance of each bike model, making the data set much easier to read and interpret.
The benefits of sorting go beyond simplicity and aesthetics. It sets the stage for faster and more efficient data processing: by grouping similar data, the search operation is enhanced, saving time. It allows Lucas to identify trends, patterns, and outliers more quickly, leading to quicker insights and decision-making. In the competitive environment in which Adventure Works operates, this speed can translate into significant business advantages. Lucas then moves on to filtering his data to focus on his area of interest, North America. Filtering data enhances clarity and relevance; it eliminates unnecessary noise, making the data more manageable. Lucas removes all irrelevant data related to other regions. Filtering leaves him with a data set that focuses exclusively on North American sales, and by doing so, Lucas can conduct more precise and targeted analyses, leading to more relevant insights and strategies. It also reduces the processing time and computational load, making the overall process more efficient; if filtering takes place during the transformation stage, it also reduces the amount of data stored within Power BI. Like using a well-laid-out map to reach a destination faster, indexing enhances the data analysis process by providing faster access to specific data points. Lucas creates an index on bike models and regions. This allows him to quickly locate the data for a particular bike model in a specific region without having to sift through the entire data set. It saves time and makes the analysis process more efficient, enabling Lucas to respond faster to queries and generate reports more quickly, thereby enhancing the decision-making process. Finally, Lucas applies data transformation to standardize the sales dates, which are in multiple formats. The key benefit of data transformation is the improvement in data consistency, which facilitates more accurate and meaningful analyses. Standardizing the dates allows Lucas to conduct proper date-related analysis, enabling him to track and forecast sales patterns accurately, and it helps eliminate potential errors in the analysis due to inconsistent data. The cumulative effect of these optimization techniques turns data sets into a powerful instrument of insight. Lucas's journey through the Adventure Works data set demonstrates that, by streamlining and simplifying the data set, optimization makes the data more accessible and manageable. By applying optimization techniques, businesses like Adventure Works can harness the true power of their data, turning information into actionable business strategies. As you've seen through Lucas's journey, data is more than just numbers on a screen: it's a mosaic, a narrative, a path that can lead you to new insights, strategies, and victories. But to interpret data effectively, you must refine it, shape it, and, most importantly, understand it. That's what optimization techniques do: they're the compass, the map, and the light that guide you through the maze of data. So step up to the challenge and use the power of optimization in Power BI to create your own stories of success. Imagine it's a Monday morning at Adventure Works headquarters, and sales data from the previous quarter has just arrived. As a newly appointed data analyst, you're eager to dive in and extract meaningful insights from the data pouring in from several stores and customer orders worldwide. In addition, there's data from various suppliers and manufacturers who deliver essential parts for Adventure Works' diverse bicycle product line.
For this report, you are tasked to trace the journey of a specific component from the Adventure Works suppliers data set to the products data set. As you start loading the data into Power BI, things begin to slow down: queries that should take seconds are taking minutes, and some aren't loading at all. You notice that the performance issues intensify when dealing with relationships between the different tables in your data model, specifically many-to-many relationships. This video helps you understand how to identify data model performance issues in relationships and how to resolve them by adjusting the cross-filter direction. By the end of this video, you'll understand how to edit relationships and optimize the performance of your data model using Power BI. Over the next few minutes, you'll learn how to balance accuracy and performance in your data model by applying bidirectional filters only where necessary. To understand the issue, let's first dive into what a many-to-many relationship entails in a data model. Relationships in data models represent how data tables connect and interact with each other. The simplest form is a one-to-one relationship, where one row in a table corresponds to one row in another. However, real-world data isn't always that simple: often one record can correspond to multiple records in another data set, and vice versa. This is where you can encounter many-to-many relationships. In the context of Adventure Works, consider the relationship between the products and suppliers tables. Each product at Adventure Works is made up of various components from multiple suppliers, and each supplier can provide components for multiple products. This mutual relationship, where each entity can relate to multiple entities on the other side, is what we call a many-to-many relationship. Now let's dive into the cause of many-to-many performance issues and how you can resolve them. Your focus is on the Model view, so select the bottom icon on the left-hand side. Your tables are represented as boxes with field lists, and the lines connecting these boxes represent the relationships between the tables. Find and select the specific relationship you wish to edit; in this case, you are interested in the relationship between the products and suppliers tables. If your model has many tables and relationships, you might need to drag the tables around or zoom in and out using the scroll wheel or the zoom slider at the bottom right of the screen. Now that you've located the relationship, it's time to edit it. Double-click the line connecting the products and suppliers tables. This action opens a new dialogue box titled Edit relationship. The cross-filter direction between the products and suppliers tables is causing performance issues in the data model. Since you want to trace the journey of a specific component from the Adventure Works suppliers table to the products table, a one-way filter would be appropriate, limiting the products data to only those products that involve the chosen component. In the Edit relationship dialogue box, locate the option labeled Cross-filter direction. The current setting is Both, meaning filters can flow from the products table to the suppliers table and vice versa. To reduce this complexity, select the drop-down menu for Cross-filter direction and select Single (suppliers filters products). Now that you've made the desired changes, it's time to save them.
At the bottom right of the Edit relationship dialogue box, select the OK button. This action closes the dialogue box and applies your changes to the data model. By changing the direction of the filter, you've simplified the data model; this simplicity has made it more efficient and resolved the performance issues. You're a newly hired data analyst at Adventure Works, and your first task is to source, prepare, and analyze data to aid the marketing initiatives. As you're delving into the data, you start to encounter an issue: you notice that your Power BI reports, usually swift and reliable, have started to slow down. You discover that this is due to high levels of cardinality. In this video, you'll explore the impact of cardinality on performance and how high cardinality affects your data analysis tasks. By the end of this video, you'll have the practical knowledge to reduce cardinality to improve the performance of your Power BI reports. Over the next few minutes, you'll learn how to identify high cardinality, explore strategies to reduce cardinality, such as summarization and fixed decimals, and consider the implications of these changes on your data. As you might already be aware, cardinality in the context of Power BI refers to the number of distinct values in a column. For example, imagine analyzing a data set containing a column called product category. Within this column, you might find several different categories; each unique category represents a distinct value, and the total count of these unique items determines the cardinality of the product category column. A column with a high number of distinct values has high cardinality. High cardinality can increase the size of your data model and the time taken to process queries, slowing down your Power BI reports. Imagine trying to find a specific book in a library that doesn't have a categorization or indexing system; that's essentially what happens when cardinality is high: the Power BI engine must sift through more unique values, slowing down the process. While high cardinality can slow down the performance of your Power BI reports, identifying high-cardinality columns and modifying them appropriately can enhance your report's performance. Power BI itself is a high-performance system that can handle large volumes of data with high cardinality; however, there are always trade-offs in system design, and reducing cardinality can help when dealing with truly large data sets. Let's explore some methods for reducing high cardinality. One strategy to reduce cardinality is through summarization during transformation. This step is similar to moving from a detailed view to a summary view of your data: instead of looking at individual transaction data, you can group it by categories such as product category, order date, or delivery date. At Adventure Works, instead of analyzing every unique bike sale, you could aggregate sales data on a product category basis. However, that's not the only method to reduce high cardinality. A second strategy is to reduce cardinality by changing decimal columns to fixed decimals. High-precision decimal values can significantly increase cardinality. For instance, consider the product weight column in the Adventure Works sales table, responsible for tracking the weight of each bike to the microgram. The variation in bike weights is very large, leading to high cardinality; by rounding these weights to a fixed decimal point, you can significantly reduce cardinality. Now that you've learned how to identify high cardinality, let's look at how you can reduce it. As you just discovered, you can reduce the cardinality of the Adventure Works data model through summarization.
Once you have located the column you want to summarize, in this case product category, select the column's header to select the entire column, then go to the Transform tab on the top menu bar. In the Transform toolbar, select Group by. A new Group by window appears; in this window, you can specify the column you want to group by and the aggregation function you want to apply, like sum, count, or average, based on the nature of your data. After specifying these settings, select OK. This form of summarization lowers the cardinality, leading to improved performance. And, as the second strategy demonstrated, you can also reduce cardinality using fixed decimals. To do this, locate and select the header of the decimal column you want to modify, in this case the product weight column, then select the Transform tab on the top menu bar. In the Transform toolbar, select Data type. A drop-down menu appears with a list of different data types; from this list, select Fixed decimal number. After this, the column's data type is changed, and it should now contain fewer unique values, effectively reducing its cardinality. By following these steps, you can reduce the cardinality of your data, thereby improving the performance of your Power BI reports. However, remember that reducing cardinality might also result in less granular data, so always take into consideration the requirements of your analysis before you decide to reduce cardinality. As you continue exploring the world of data, always remember that it's not about having less data or more data; it's about having the right data. When you master this, you can turn raw numbers into insightful stories, make informed decisions, and create impactful change.
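In M terms, the two reduction steps just described roughly correspond to the following sketch; the Source step and the column names are assumptions.

```
// Both steps appear inside a query's let expression, where Source is the
// step holding the detailed sales data (names are assumptions).

// 1. Summarization: the Group by dialogue generates a step of this shape.
Grouped = Table.Group(
    Source,
    {"ProductCategory"},
    {{"Total Sales", each List.Sum([SalesAmount]), type number}}
)

// 2. Fixed decimals: the Fixed decimal number type is Currency.Type in M.
ChangedType = Table.TransformColumnTypes(Source, {{"ProductWeight", Currency.Type}})
```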
Data enthusiasts are often required to look for real-time insights and dynamic visualizations to make informed decisions. DirectQuery in Power BI enables you to dive into vast amounts of data with auto-refresh functionality. Though DirectQuery connectivity has several benefits, it comes with its own set of behaviors and limitations; let's walk through these elements of DirectQuery as a data connectivity option in Power BI. Adventure Works has expanded its operations in recent years to various regions across the world. The company wants to build a real-time sales dashboard to monitor sales performance across various regions, categories, and products. Adventure Works has a massive transactional database that records sales data in real time. The company also wants to implement data security to ensure that data access permissions are defined within the database and users only have access to the data they are authorized to view. To meet the requirements of Adventure Works, you need to establish a DirectQuery connection in Power BI to retrieve and analyze the data. Let's explore what DirectQuery is and how it can help you connect to your data sources. DirectQuery is a data connectivity option in Power BI that allows analysts to connect directly to data sources without importing data into the Power BI model. Instead of loading data into memory, DirectQuery sends queries directly to the sources to retrieve data for real-time analysis. Although it is best practice to import data into the Power BI model, there are times when using DirectQuery is inevitable. Let's review some of the benefits that DirectQuery offers. DirectQuery allows you to execute queries in real time: for example, in a multinational retail corporation where new sales transactions are added to the database every hour, this ensures that the sales dashboard always displays the latest data. Large data set imports into Power BI models can cause performance problems and high memory consumption; by using DirectQuery, Power BI avoids loading an entire data set into the model, optimizing memory usage. DirectQuery also respects data source-level security, ensuring that only authorized users have access to the data: the data access permissions defined in the underlying database are enforced, providing a secure and controlled data access environment. Let's examine the behavior of DirectQuery connections. When you establish a connection in Power BI Desktop via DirectQuery to a relational database like SQL Server, you can select a set of tables from the database that will return a set of data. For example, at Adventure Works, you can select data from the central SQL data warehouse via a DirectQuery connection to perform real-time sales analysis. Data loading in Power BI only loads the schema, not the actual data; reports and visuals send queries to the underlying database to retrieve the necessary data, and the visual refresh time depends on the performance of the underlying data source. The tables you selected for Adventure Works are not imported into the Power BI model, only the schema, so the data refresh cycle sends the query to the central database. Once new information is recorded in the source database, the reports and visuals do not reflect the updated data immediately: you will need to refresh the report to display the latest data. For instance, each new sale record of Adventure Works saved to the database will be reflected on the dashboard after you refresh the report. If you publish a Power BI report to the Power BI service, it displays the same behavior as with imported data, except that no data is imported. All the report elements can be used in creating a dashboard. The dashboard tiles are automatically refreshed per the refresh frequency that you configure, and dashboard visuals show data from the latest refresh when opened. For example, if your manager asks you to present the most recent dashboard every morning, you can set up the refresh time an hour before the presentation time. The use of DirectQuery can also have negative implications, and the limitations vary depending on the specific data source being used. It is always faster to query data from memory (import) than to query it from the server (DirectQuery); the performance depends on the size of the data, the database server specifications, the network connection speed, and optimizations to the data source. You must understand these performance implications before deciding to use DirectQuery for your data analysis in Power BI. With DirectQuery, you can apply some data transformations in the query editor of Power BI; however, not all transformations are supported, and this also depends on the data source. For example, SQL Server supports some transformations, while SAP Business Warehouse doesn't support any transformation in the query editor; in the latter case, you need to apply transformations in the underlying data source. Data modeling and DAX are also limited in DirectQuery mode: for example, Power BI's default date hierarchy is not available in DirectQuery, and some DAX functions, such as parent-child functions, are also not available. Complex DAX measures can also cause performance issues, so it is advisable to start by building simple aggregation measures and testing the performance before moving to more complex calculations in DAX.
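For instance, a starting point might be a measure as simple as the following sketch (the table and column names are assumptions):

```
-- A simple aggregation measure to start with in DirectQuery mode:
Total Sales Amount = SUM ( Sales[SalesAmount] )
```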
When using DirectQuery mode, almost all the reporting capabilities that you have with imported data are also supported for DirectQuery models, provided that the underlying source offers a suitable level of performance. However, when you publish your Power BI report to the Power BI service, the Quick Insights and Q&A features of the service are not supported in DirectQuery mode, and DAX measures and filters can cause performance implications in reports built on DirectQuery models. DirectQuery offers an alternative way to connect Power BI to data sources, but it has some limitations; data analysts must understand the behavior, benefits, and limitations of DirectQuery before deciding to use it for their analytical and business needs. DirectQuery models demand consistent performance across all layers of the solution. Fortunately, there are several optimization and query reduction strategies that you can use to help you along the way. Over the next few minutes, you will learn how to optimize the underlying data source for better query performance. Adventure Works is experiencing poor report performance: it is taking too long for pages to load in the reports, and table and matrix visuals are not refreshing quickly enough when certain elements of the report are selected. While reviewing the data model, you discover that the model is using DirectQuery to connect Power BI to the source data, resulting in the poor report performance. You will need to act in order to optimize the performance of the DirectQuery model. In DirectQuery mode, performance optimization is needed at each layer of the solution. The first layer of the solution to be optimized is the data source: you'll need to tune the source database, because any optimization done to the underlying source database will enhance the DirectQuery connection, which in turn improves your Power BI reports. The following standard database practices apply to most situations: avoid using complex calculated columns, because the calculation expression will be embedded into the source queries, and review the indexes to verify that the current indexing is correct; if you need to create new indexes, ensure that they are appropriate. Power BI Desktop also provides you with the option to reduce the number of queries sent to the database in DirectQuery mode. In Power BI, the default behavior of a filter or slicer is that when you select an item in that slicer or filter, the other visuals of the report are filtered automatically; in DirectQuery mode, this sends multiple queries to the database for every selection within a filter or slicer, and these multiple queries reduce the performance of your report. For example, suppose you want to select multiple items: when you select the first item, five queries are sent to the underlying database, and on selecting the second item, another five queries are sent, resulting in a further slowdown. This is especially true when you have a multi-select slicer or filter. You can optimize the number of queries sent to the database in Power BI Desktop. The optimization of performance through query reduction requires effective strategies and techniques. Aggregations allow for pre-calculated summary values that can be imported and stored in the in-memory engine of Power BI. An optimized data model can lead to efficient query processing: simplifying relationships, eliminating unnecessary columns, and avoiding complex DAX expressions wherever possible can all enhance query optimization. You can also reduce the number of queries sent to the underlying data source by limiting the number of visuals and filters in a Power BI report while working with DirectQuery connectivity; for example, you can reduce the number of visuals on the report page or reduce the number of fields used in a visual.
In DirectQuery mode, performance optimization is vital to deliver a smooth and responsive user experience. Implementing query reduction strategies and focusing on query performance enhancements allows you to maximize the benefits of real-time data connectivity in Power BI. As a data analyst, you'll often need to optimize the query performance of DirectQuery connectivity; fortunately, configuring table storage will improve data retrieval speed and reduce the query workload on the data source. Over the next few minutes, you'll learn DirectQuery performance optimization with table storage. Adventure Works is experiencing slow data retrieval speeds while trying to build its reports. Upon further investigation, you discover that the cause of the slow retrieval speed is the query workload on the data source; you will need to use table storage to reduce the query workload and improve the retrieval speed. Let's explore what storage modes are and how they can be used to optimize the performance of your DirectQuery data sets. Storage modes in Power BI determine where the data of a table is stored and how queries are sent to the data sources. You can specify the storage mode of each table individually within your data model; the storage mode lets you control whether Power BI Desktop caches table data in memory for reports. Storage modes in Power BI offer the following benefits. As users interact with visuals in Power BI reports, DAX queries are submitted to the underlying data set; caching data into memory by properly setting the storage mode can boost the query performance and interactivity of your reports. Tables that are not cached don't consume memory for caching, so you can enable interactive analysis over data sets that are too large or expensive to cache completely into memory, choosing which tables are worth caching and which aren't. You can also reduce refresh time by only importing the tables that are necessary to meet your business and analytical requirements, which optimizes the data refresh time and frequency. Now that you're familiar with what storage modes are, let's examine the three storage modes that Power BI supports. If a table is using the Import storage mode, the data of that table is stored in the in-memory storage of Power BI, and every query to the data is a query to the in-memory structure, not to the data source. For instance, suppose Adventure Works sourced a sales table from SQL Server but is using the Import storage mode: a copy of the data is stored in the in-memory engine of Power BI, and whenever you refresh a report in Power BI Desktop, it queries the in-memory structure instead of sending queries to the SQL Server data source. Tables using the DirectQuery storage mode keep the data in the data source. For example, if Adventure Works' sales data is stored in SQL Server and a report is created with this storage mode, Power BI sends SQL queries to the data source to retrieve the results; because the table is using the DirectQuery storage mode, you can use SQL Profiler at the same time to view, manage, and optimize the queries. When using the Dual storage mode, one table can act either as DirectQuery or as Import, depending on its relationships to the other tables: in some cases you fulfill queries from imported data, while in other cases you fulfill queries by executing an on-demand query to the data source, for example to SQL Server. Let's find out how the various storage modes work in Power BI Desktop while connecting in DirectQuery mode.
Launch Power BI Desktop and connect to SQL Server via DirectQuery: navigate to Get data and select SQL Server from the drop-down list of options. You'll be presented with a SQL Server database dialogue box; enter the server name and database name. By default, Import mode is selected; select DirectQuery and select OK. This action directs you to the SQL server containing an Adventure Works database named AdventureWorksDW2022, where you can select the tables you want to load into the Power BI model. Select the following tables from the database: the Internet Sales fact table and the Product, Customer, and Sales Territory dimension tables. Navigate to Model view and expand the Properties pane. Select the sales table, scroll down in the Properties pane, and expand Advanced. Access the Storage mode drop-down menu to view the three storage modes, and select the Import storage mode for the Internet Sales fact table. Once you have selected Import mode, a dialogue box appears on screen warning that setting the storage mode to Import is irreversible: you will not be able to switch back to DirectQuery. Select OK. You have now successfully optimized the storage mode of the fact table in the Adventure Works database. You can further leverage this feature to decide which tables of the schema you need to import and which you can keep in DirectQuery connectivity, depending on the analytical requirements. In DirectQuery mode, performance optimization is vital to deliver a smooth and responsive user experience; by implementing query reduction strategies to optimize the number of queries sent to the underlying database and focusing on query performance enhancements, you can maximize the benefits of real-time data connectivity in Power BI. Aggregations in Power BI are a great method of generating fast query performance and interactivity in your reports and visuals: they enable you to dive deeper into your data without compromising the speed and performance of your queries in DirectQuery connections. Power BI not only provides a solution for small data sets; it also has the potential to handle large data sets by switching to DirectQuery. As DirectQuery does not store data in memory, Power BI sends queries to the underlying data source for every page of the report. However, DirectQuery mode can be slow, depending on the number of visuals in a report and the number of users interacting with the data at a given time. For example, imagine your report contains four visuals: every time you apply a filter to the data, Power BI sends queries to the data source, and sending queries with each interaction makes DirectQuery quite slow. Fortunately, Power BI has a solution to handle the slow response of DirectQuery, called composite mode. Composite mode allows you to use part of the model as DirectQuery, which for larger tables typically means the fact table, and part of the model as imported data, for smaller tables, usually dimension tables. This approach allows you to achieve better performance when you work with the smaller tables, as they are just querying the in-memory storage of data; however, the tables that are part of the DirectQuery connection are still slow to respond. This is where a useful feature within composite mode, called aggregations, comes into play. In Power BI, aggregation refers to summarizing or consolidating large volumes of data into more manageable summary tables to improve query performance by condensing detailed information into simpler, high-level values.
Aggregations are the solution to speed up the DirectQuery-connected tables within a composite model. With the help of aggregations, you can create layers of pre-aggregated values that are stored in the in-memory storage of Power BI for faster performance. Let's consider these concepts in a scenario. Adventure Works wants to analyze data for the last five years of sales across all its products and regions. The fact table might contain tens of millions of rows, making it a huge data set relative to Power BI's import file size limit. In this example, the objective of the analysis is to query the sales values by year, region, product, or customer category; in short, you are querying the fact table by aggregations of the dimension tables. Therefore, creating and managing aggregations of the fact table will help you to reduce the file size of the sales table and optimize query performance for Adventure Works. For example, suppose you are aggregating sales data by calendar year. The aggregated table can pre-calculate the sum of the sales amount for every calendar year; in this case, you only have five rows of data, one for each year, which is far smaller than the original fact table. This pre-calculated aggregation can be imported into the memory of Power BI and will be efficient to query in daily analysis. Furthermore, even if you want to analyze data at a finer level of granularity, such as the daily level, the total number of data rows is still tiny in comparison to the millions of rows in the fact table. As dimension tables are typically smaller than the fact table, aggregated tables are always smaller than the fact table. Before you create aggregations in Power BI, you need to decide the granularity of the analysis you want to perform on the aggregations, for example, evaluating sales amount at day level. Once you decide on the grain, the next step is to create the aggregations. You can create aggregations in one of three ways: you can create a table with aggregations at the database level, for instance in a SQL Server database, if you have access to the data source, and then import the table to Power BI; you can create a view of the aggregation, for example in a SQL Server database, and import the view to Power BI if you have access to the data source; or you can use the Power Query Editor in Power BI to create aggregations. Aggregations in DirectQuery have several benefits; let's explore three specifically. When you are handling a large data set, aggregations provide faster, optimized query performance and assist you in analyzing the data; they also reveal insights without querying the underlying data source, which is slower to respond and, in the worst-case scenario, times out. If users at Adventure Works are experiencing slower refresh times for reports in Power BI, you can create aggregations to help speed up the refresh process: the smaller size of aggregated tables imported to memory reduces the refresh time, enabling a better user experience. Adventure Works is also anticipating growth in sales volume for the upcoming year; you can leverage aggregations as a proactive measure to future-proof the solution, enabling a smooth scale-up for the company. Aggregations are a game-changing feature of Power BI for optimizing speed and performance when dealing with huge volumes of data: with their help, you can keep layers of pre-calculated tables stored in the memory of Power BI, always ready to respond to queries when users interact with the data in reports. Power BI's aggregation feature is useful for creating a seamless bridge between raw data and meaningful analytics.
In this video, you'll learn how to create and manage aggregations in the Power Query Editor of Power BI. First, you need to load the required tables. Launch Power BI Desktop and connect to SQL Server via DirectQuery: navigate to Get data and select SQL Server from the drop-down list of options. This opens a dialogue box called SQL Server database; enter the server name and database name. By default, Import mode is selected; select DirectQuery and then OK. The action directs you to the SQL server containing the Adventure Works database, and Power BI opens the Navigator window with the list of tables. Select the following tables to load: the Internet Sales fact table and the Customer and Date dimension tables. Once the tables are loaded, Power BI auto-establishes the relationships between them. In this instance, you only need to review the relationship between the Date and Internet Sales tables; delete any inactive relationships between these tables. Next, you need to create the aggregations using the Power Query Editor: from the Home tab, select Transform data to open the editor. You'll create an aggregated table based on the Internet Sales fact table. Note that transforming the fact table directly would convert the existing table into an aggregated table, so to keep the original table intact, the first step is to reference the fact table: select the Internet Sales table in the Queries pane, right-click, and select Reference from the drop-down list. This action duplicates the Internet Sales table; rename the query Agg Sales to mark it as the aggregated table. Next, from the Home tab of the query editor, select Choose columns; this opens a Choose columns dialogue box. For the current aggregations, you'll create an aggregation using the Order Date Key and Customer Key columns. From the list of columns, first unselect all columns, then select the following columns: Order Date Key, Customer Key, Unit Price, and Sales Amount, and select OK. Next, select Group by from the Transform tab to open the Group by dialogue box. By default, Basic is selected; choose Advanced. The first section presented is grouping: because you're grouping by two columns, select Add grouping to add another field, and select Order Date Key and Customer Key as the first and second grouping columns, respectively. The second section is aggregations: define the new column name, then the mathematical operation for the aggregation, like sum, count, average, and so on, and finally select which column the calculation should be based on. For the current lesson, add the following aggregations: Sum SalesAmount, based on the Sales Amount column; Sum UnitPrice, based on the Unit Price column; and Order Count, which takes the Count rows operation and does not require a column reference. Select OK. After adding and defining the aggregations, the action adds an aggregated table to the data model; the new aggregated table is much smaller than the original table. You have now created an aggregation based on Fact Internet Sales while keeping the original table intact, and the table is added to the data model.
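Under the hood, the referenced-and-grouped query described above corresponds to M code along these lines; this is a sketch, with the source query name and column names assumed to follow the walkthrough.

```
let
    // "Reference" points the new query at the original DirectQuery table
    Source = FactInternetSales,
    // Choose columns: keep only the keys and the values to aggregate
    #"Removed Other Columns" = Table.SelectColumns(
        Source, {"OrderDateKey", "CustomerKey", "UnitPrice", "SalesAmount"}),
    // Group by (Advanced): two grouping columns, three aggregations
    #"Grouped Rows" = Table.Group(
        #"Removed Other Columns", {"OrderDateKey", "CustomerKey"},
        {
            {"Sum SalesAmount", each List.Sum([SalesAmount]), type number},
            {"Sum UnitPrice",   each List.Sum([UnitPrice]),   type number},
            {"Order Count",     each Table.RowCount(_),       Int64.Type}
        })
in
    #"Grouped Rows"
```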
Next, you need to establish the relationships between the ASales table and the Customer and Date dimension tables: build the relationships on the dimension tables' Customer Key and Date Key columns. Finally, you need to set the storage mode of the aggregated table to Import. Navigate to the Model view and expand the Properties pane. Select the ASales table, and in the Properties pane, expand Advanced and select Import from the Storage mode drop-down list of options. This opens a dialog box with a warning message: setting the storage mode to Import is an irreversible operation, meaning you will not be able to switch back to DirectQuery. There is another recommendation in the dialog box: the number of weak relationships can be reduced by setting the Customer and Date dimension tables to dual. The checkbox Set affected tables to dual is checked by default; leave it checked and select OK. This action imports the ASales table into Power BI's memory and converts the storage mode of the dimension tables to dual. The reason is that both dimension tables are connected to the original fact table, which is DirectQuery sourced, and to the ASales table, which uses Import mode. Setting the dimension tables to dual storage mode means they can act either way, depending on the situation. Select the dimension tables and check the storage mode option to confirm that dual storage mode is selected. In this video, you learned how to create and manage aggregations in the Power Query Editor of Power BI.

Congratulations on reaching the end of the third week in this course on data modeling in Power BI. This week, you've explored optimizing a model for performance in Power BI. Let's take a few minutes to recap what you've learned in this week's lessons.

You began the week with an introduction to what optimization is and why it is necessary. You learned about Power BI dashboards and how they can provide access to large volumes of data that can be used to generate insights on successes, challenges, and opportunities. You then explored query lag and how simple queries that used to take seconds can begin to take many minutes, even hours. You investigated the challenges that growing data volumes can bring, how performance optimization can address those issues, and the benefits of performance optimization in Power BI and how it affects the overall performance of data models, reports, and dashboards.

You then further examined what optimization is and how performance issues can arise from a variety of factors, including inefficient data models, complex DAX calculations, and inappropriate visuals. You explored how optimizing your Power BI report and data model can significantly cut down the loading and processing time of large data sets, allowing you to execute queries faster. Next, you examined how performance optimization informs decision-making: the ability to make timely, informed decisions is critical to a business's success, and with an optimized Power BI data model, reports load swiftly, enabling faster analysis of trends and more prompt, informed decisions. You then explored the user experience and how an optimized Power BI system can dramatically improve it by ensuring reports load smoothly and swiftly. Next, you learned about resource efficiency and how an optimized Power BI data model can make more efficient use of resources, handling larger volumes of data without a noticeable drop in performance. You explored optimization by example and how to analyze a constant stream of data.

Next, you examined optimization techniques such as filtering, sorting, and indexing, which can make your data more manageable and your searches faster, improving overall efficiency. You were introduced to four tools that help you understand vast amounts of data: sorting, filtering, indexing, and data transformation. You learned how sorting makes data sets much easier to read and interpret, how filtering reduces the processing time and computational load, making the overall process more efficient, how indexing allows you to quickly locate the data for a specific region without having to sift through the entire data set, and how data transformation facilitates more accurate and meaningful analyses.
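As a quick illustration of those four techniques in one place, here is a minimal Power Query (M) sketch; the SalesData query and its Region column are hypothetical names used only for this example.

let
    // Hypothetical query holding the raw sales rows
    Source = SalesData,
    // Sorting: order the rows by region for easier reading
    Sorted = Table.Sort(Source, {{"Region", Order.Ascending}}),
    // Filtering: keep only the rows for one region of interest
    Filtered = Table.SelectRows(Sorted, each [Region] = "Europe"),
    // Indexing: add an index column so specific rows are easy to locate
    Indexed = Table.AddIndexColumn(Filtered, "Index", 1, 1, Int64.Type),
    // Transformation: standardize the region labels for consistent analysis
    Transformed = Table.TransformColumns(Indexed, {{"Region", Text.Upper, type text}})
in
    Transformed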
Next, you moved on to resolving performance issues in data models, where you explored the different types of relationships, such as one-to-one and many-to-many. You then learned how to identify and reduce cardinality levels, and how identifying high-cardinality columns and modifying them appropriately can enhance your report's performance.

You learned about the behavior and limitations of DirectQuery connections: DirectQuery is a data connectivity option in Power BI that allows analysts to connect directly to data sources without importing data into the Power BI model. You explored the benefits of DirectQuery, which are real-time updates, reduced memory usage, and data security. You then investigated its negative implications: its impact on performance, its limited support for data transformation, its limitations for modeling in DAX, and its reporting limitations. You explored optimizing DirectQuery performance with query reduction. You learned that in DirectQuery mode, performance optimization is needed at each layer of the solution, and that Power BI Desktop provides options to reduce the number of queries sent to the database. You learned some effective query reduction strategies and techniques, including aggregations, optimizing the data model, and report optimization.

You then explored optimizing DirectQuery performance with table storage modes. Storage modes in Power BI determine where a table's data is stored and how queries are sent to the data sources, and you can specify the storage mode of each table individually within your data model. You examined the benefits of storage modes, which are query performance, support for larger tables, and data refresh optimization. You then learned about Import mode: if a table uses the Import storage mode, its data is stored in Power BI's in-memory storage. You also explored DirectQuery mode, where tables keep their data in the data source, and dual mode, where one table can act as either DirectQuery or Import, depending on its relationships to other tables.

You then moved on to aggregations in Power BI and how they enable you to dive deeper into your data without compromising the speed and performance of your queries over DirectQuery connections. You explored composite models and learned that they allow you to achieve better performance when working with smaller tables, as those tables query only the in-memory storage of data. You learned that in Power BI, aggregation refers to summarizing or consolidating large volumes of data into more manageable summary tables to improve query performance by condensing detailed information into simpler, higher-level values. You identified the three ways to create aggregations: create a table with the aggregations at the database level (for instance, in a SQL Server database) if you have access to the data source and then import the table into Power BI; create a view of the aggregation (again, in SQL Server, for example) and import the view into Power BI if you have access to the data source; or use the Power Query Editor in Power BI to create the aggregations.
Finally, you learned about the benefits of aggregations: when you are handling a large data set, aggregations provide faster, optimized query performance and assist you in analyzing the data, revealing insights without querying the underlying data source, which is slower to respond and, in the worst-case scenario, times out. If users are experiencing slower report refresh times in Power BI, you can create aggregations to speed up the refresh process, as the smaller size of aggregated tables imported into memory reduces the refresh time and enables a better user experience. You can also create and manage aggregations as a proactive measure to future-proof the solution, enabling a smooth scale-up of the company.

You've now reached the end of this module summary. It's time to move on to the discussion prompt, where you can discuss what you've learned with your peers. You'll then be invited to explore additional resources to help you develop a deeper understanding of the topics in this lesson. Best of luck; we'll meet again during next week's lessons.

You're nearing the end of this course on data modeling in Power BI. You've put great effort into this course by completing the videos, readings, quizzes, and exercises, and you should now have a stronger grasp of the foundations of data modeling: basic concepts of data modeling, using DAX for analysis, and optimizing a model for performance. You're now ready to apply your knowledge in the exercise and the final course assessment. In the exercise, you'll build and optimize a data model, putting everything you've learned into practice. This is followed by the course assessment, a graded quiz consisting of 30 questions on topics you covered throughout the course.

But before you start, let's recap what you've learned. In the first week of this course, you discovered that data modeling is the process of creating visual representations of your data. In Power BI, you can use these representations to identify or create relationships between data elements, and by exploring these relationships you can generate new insights to improve your business. Microsoft Power BI is a fantastic tool for creating data models and generating insights, and you don't need an IT-related qualification to begin using it. During your exploration of Power BI, you learned how to create data models using schemas and relationships, analyze your models using DAX (also known as Data Analysis Expressions), and optimize a model for performance. You also explored key concepts related to data modeling: you learned to identify different types of data schemas, like flat, star, and snowflake; to create and maintain relationships in a data model using cardinality and cross-filter direction; and to form a model using a star schema.

In the second week of this course, you focused on DAX, or Data Analysis Expressions, the syntax used to create elements and perform analysis in Power BI. You began by writing calculations in DAX to create elements and analyses. You then explored the formulas and functions used in DAX and used DAX to create and clone calculated tables. You were then introduced to the concept of measures: you learned where measures are used and what types are available, worked with measures to create calculated columns and measures in a model, and learned about the importance of context in DAX measures. Finally, you performed useful time intelligence calculations in DAX for summarization and comparison and learned how to use these techniques to set up a common date table.
In the third week of this course, you learned how to optimize a model for performance in Power BI. You began by learning how to identify the need for performance optimization, which means analyzing your data models to determine how they can perform more efficiently. You then learned how to optimize your Power BI models for performance, explored different techniques and methods for ensuring that you're running efficient models, and learned how to optimize performance using DAX queries. Now that you've built a solid understanding of the fundamentals of data modeling, you're ready to test your knowledge by undertaking the exercise and the final course assessment. Best of luck!

Congratulations, you've made it to the end of the Data Modeling in Power BI course. Your hard work and dedication have paid off. You're making great progress on your data analysis learning journey, and you should now have a thorough understanding of the basic concepts of data modeling, using DAX for analysis, and optimizing a model for performance. You should now have a firm knowledge of data modeling in Power BI; think about everything you can do with this new knowledge. Well done for taking the first steps towards your future data analysis career.

By successfully completing all the courses in this program, you'll receive a Coursera certification. This program is a great way to expand your understanding of data analysis and gain a qualification that will allow you to apply for entry-level jobs in the field. The program will also help you prepare for the PL-300 exam; by passing the exam, you'll become a Microsoft certified Power BI data analyst, and it will help you to start or expand a career in this role. This globally recognized certification is industry-endorsed evidence of your technical skills and knowledge. The exam measures your ability to perform the following tasks: prepare data for analysis, model data, visualize and analyze data, and deploy and maintain assets. To complete the exam, you should be familiar with Power Query and the process of writing expressions using Data Analysis Expressions, or DAX, two concepts that you've explored in detail in this course and will continue to learn more about in future courses. You can visit the Microsoft certifications page at http://www.learn.microsoft.com/certifications to learn more about the Power BI Data Analyst Associate certification and exam.

This course has enhanced your knowledge and skills in the fundamentals of data modeling in Power BI, but what comes next? There's more to learn, so it's a good idea to register for the next course. Whether you're just starting out as a novice or you're a technical professional, completing this program demonstrates your knowledge of data modeling in Power BI. You've done a great job so far, and you should be proud of your progress. The experience you've gained will show potential employers that you are motivated, capable, and not afraid to learn new things. It's been a pleasure to embark on this journey of discovery with you. Best of luck in the future!

Welcome to Data Analysis and Visualization in Power BI. In this course, you'll discover the power of visualization in Microsoft Power BI to create data-driven stories and solve real-world business problems. Data analysis and visualization are not only essential skills for data analysts to uncover and communicate data insights; they are vital for organizations across different industries to flourish in today's data-driven world.
From healthcare to finance, data analysis and visualization play a critical role in informing decision-making and driving success. With its extraordinary visuals, Power BI is a data analytics and visualization tool that you can use to transform data into intuitive visualizations. It empowers you to present data in a visually appealing way that stakeholders can understand, facilitating data-driven decisions. You are currently on a path of discovery centered on data analysis in Power BI, exploring the skills, tasks, and processes that enable data analysts to create compelling data stories with Power BI.

So what can you expect from this part of your learning journey? You'll start by diving into creating reports in Power BI and exploring the various visualizations available to you and their potential to solve different business problems. You'll learn how to format these visuals and add them to reports and dashboards, the powerful mediums through which you can provide stakeholders with insights in Power BI. You'll master the art of designing reports and dashboards that are not just visually appealing but accessible, user-friendly, and interactive. You'll discover how to share your carefully crafted reports with stakeholders, ensuring your hard work reaches the right audience. And the journey doesn't end there: you can look forward to learning how to use visualizations and other features, like AI, to perform data analysis. You'll closely examine the data in your Power BI reports, discovering how to extract meaningful insights and value by using Power BI's analytical tools and performing advanced analytics.

By the end of this course, you'll know how to recognize different types of visualizations in Power BI, add visualizations to reports and dashboards, apply formatting choices to visuals, incorporate useful navigation techniques into Power BI reports, design accessible reports and dashboards, and use visualizations to perform data analysis. To complete the course successfully, you'll need to apply the skills and knowledge you gain to a practical graded assignment. In this assignment, you'll build reports and dashboards based on a real-world business scenario involving Adventure Works, a fictional bicycle manufacturing company you may have encountered before in this program. You'll also need to complete a final graded quiz demonstrating your understanding of the key concepts in data analysis and visualization. But no need to worry: the videos, readings, exercises, and quizzes in this course will gradually guide you through the learning material, preparing you thoroughly for your assessment. You have the flexibility to recap and revisit items as you need, so watch, pause, rewind, and re-watch the videos until you are confident in your skills. The readings, knowledge checks, and quizzes will help you consolidate your knowledge and measure your progress.

Ultimately, this course is about more than just gaining knowledge and skills in data analysis and visualization in Power BI; it's about setting yourself up for a career in data analysis. By completing all the courses in this program, you'll earn a Coursera certificate to showcase your job readiness to your professional network. Plus, the program prepares you for exam PL-300, which leads to a Microsoft Power BI data analyst certification, globally recognized evidence of your real-world skills. So are you ready to add data analysis and visualization skills to your data analyst toolbox? This course will equip you to recognize, use, and format different visualizations, strategically design accessible and beautiful reports and dashboards, and extract more value from your data using visualizations and advanced analytics. Best of luck as you embark on this learning journey!
Renee Gonzalez, the marketing director at Adventure Works, walks into her office and finds a report on her desk. The report is packed with data: sales figures, marketing campaign results, regional statistics, customer feedback, and more. But as she flips through the pages, the strings of numbers and text seem to blend, failing to convey any meaningful story. It's like trying to decipher an alien language. Can she make informed decisions based on this data? Probably not. Data on its own is often meaningless, but here's the game changer: when you apply the tools of data visualization and analysis, the data starts to weave a story. Patterns emerge from the chaos, trends become evident, and the confusing jumble of numbers transforms into insights that can guide business decisions. This is the power of business intelligence.

In this video, you'll explore the basics of business intelligence, or BI, focusing specifically on data visualization and analysis and the role they play in making complex data accessible and understandable. You'll discover how business intelligence and data analysis go beyond data visualization, providing deeper insights and forming the backbone of informed decision-making. In its simplest terms, business intelligence is a technological approach to converting raw, unprocessed data into meaningful, actionable information for business analysis. The heart of business intelligence is creating an environment where data informs strategic business decisions; it's about leveraging data to improve operations, increase efficiency, and boost financial performance. BI uses several tools and methodologies to achieve these objectives, including data mining, analytical processing, querying, and reporting, but two of the most critical tools in this toolbox are data visualization and data analysis.

Data visualization is the graphical representation of information and data: think charts, graphs, maps, or any other visual format that makes complex data more understandable, accessible, and usable. To grasp the power of data visualization, let's revisit the scenario at Adventure Works. Say the marketing director is examining the sales figures for different products in the last month. The spreadsheet is dense with rows and columns of information, and you'd be hard-pressed to discover any significant insights just by glancing at the raw data. But imagine if you could take these numbers and transform them into a visually compelling line graph. Suddenly, the sales trends are immediately visible. It's easier and quicker to identify high-performing and underperforming products, which can inform strategic planning and data-driven decision-making. It may also provide insights into seasonality and the effect of marketing initiatives on income. Visualization is a powerful, transformative tool used to spot patterns and anomalies, identify trends, and grasp complex data sets at a glance.

In addition to visualization, another critical aspect of BI is data analysis. While data visualization provides a graphical representation of your data, data analysis dives deeper into these visualizations to uncover the reasons behind the trends and patterns. Data analysis is like the detective work of BI: it sifts through data, asks critical questions, and uncovers the truth. To illustrate the importance of data analysis, let's explore another term from BI: profit margins.
The profit margin is a critical financial metric that provides insights into a company's profitability. You can calculate it by subtracting the cost of goods sold from the sales revenue and dividing the result by the sales revenue; for example, $500,000 in sales revenue with $400,000 in cost of goods sold gives ($500,000 - $400,000) / $500,000 = 0.20, or a 20% profit margin. But just knowing this figure isn't enough. Let's say, for example, that Adventure Works has a profit margin of 20%. What does this figure tell you on its own? Not much. But when you analyze it in relation to other factors, the story begins to unfold. For example, to determine whether the margin is good or bad, you can compare it across different periods or against the company average, historical data, or industry benchmarks. You may also want to analyze the contribution of different products to profitability. Likewise, you can analyze the profit margin in relation to other financial metrics, like sales revenue and expenses, or external factors, like market trends, for a more comprehensive view of the financial health of Adventure Works. Data analysis helps you understand not just what is happening but also why it's happening. It allows you to diagnose problems, spot opportunities, and make informed decisions.

Data analysis can also be pivotal in predictive analytics, an aspect of BI that uses current and historical data to forecast future events, behaviors, and trends. Let's imagine Adventure Works is planning to launch a new product line. By analyzing past sales data, customer behaviors, and market trends, you can predict how well customers might receive this new product, its potential sales, and even what type of marketing might be most effective. This type of predictive insight can be instrumental in crafting successful business strategies. As you embark on your own journey in the world of business intelligence, remember that you're not just a data analyst; you're a storyteller. Each strand of data is a part of your narrative, and it's up to you to assemble these strands into a narrative that guides a business to success. Remember, data is just data; it's what you do with it that counts. With data analysis and visualization, you can transform data into actionable intelligence.

Imagine a stakeholder at Adventure Works is handed a spreadsheet with numbers representing sales, production, and human resources data. Trying to draw conclusions or make decisions using these rows and columns is as challenging as navigating a dense forest with a paper map: although the map may have all the information you need, it isn't easy to understand and interpret. But what if there was a way to examine this data that's immediately understandable and meaningful? Data visualizations can act like a navigation system, with a clear, interactive display that demonstrates how to navigate the forest of vast and complex data. In this video, you'll learn about data visualization, including its role in business intelligence and how data flows and is represented in visualizations in Microsoft Power BI.

At its most basic, a visualization is a graphical representation of data. However, visualizations are much more than just common graphical depictions. Converting raw data into a visual format using Power BI can help you identify patterns, trends, and insights that might not be apparent in text-based data. For example, suppose Adventure Works wants to track the performance of its different bike types across various regions, with data coming from several sources, ranging from sales and regional reports to customer feedback. In a spreadsheet, this data would be complex and hard to digest; however, you can use Power BI, with its many ways to visualize data (which you'll learn about later), to transform the data into a compelling, interactive, and easily digestible format.
Visualizing data for business intelligence is crucial, particularly in complex and dynamic business environments like Adventure Works. Let's explore how data visualization in Power BI can enhance business intelligence. At an organization like Adventure Works, the data generated from its operations is vast and complex. Visualizing this data simplifies the complexity, transforming large, intricate data sets into intuitive, easy-to-understand graphical representations. Data visualizations can reveal patterns, trends, and correlations hidden in raw data: for example, Adventure Works could use a bar chart to visualize sales data, demonstrating the geographic regions where sales are highest, or a scatter plot to identify correlations between marketing spend and sales performance. Power BI's interactive visualizations also allow companies to dive deep into their data. They can drill down into specific areas of interest, such as analyzing sales trends for a particular product in a specific market over a given period, leading to more precise, data-driven decision-making. Visualizations also make data more accessible to a broader audience: not everyone at an organization like Adventure Works will be comfortable interpreting raw data, but most stakeholders can understand a well-designed chart or graph, so more stakeholders can engage with the data and contribute to data-driven decision-making. Finally, visualizations are a powerful communication tool and can tell a compelling story with data, making the insights more memorable and persuasive. To demonstrate the success of a new product line to stakeholders at Adventure Works, you could use visualizations to highlight key performance metrics in a visually engaging way.

Now that you know more about the importance of visualizing data for business intelligence, let's explore how creating visualizations works in Power BI. Creating visualizations in Power BI begins with connecting to your desired data sources, which can range from Excel spreadsheets to SQL databases. Once connected, you can use Power Query to extract, transform, and load the data into Power BI. These transformations include renaming columns, changing data types, filtering rows, and combining data from multiple sources (a sketch of such a transformation script appears at the end of this section). You can then load this refined data into Power BI's data model for further manipulation using Data Analysis Expressions, or DAX, a formula language for creating custom calculations. The next stage of the workflow involves representing this processed data in visualizations. Power BI provides a wide variety of visualization types, such as bar charts, scatter plots, pie charts, and even geographical maps. After selecting a visualization type, you map the data elements to different aspects of the visualization, from adding values to the axes or fields to setting the color scheme. Power BI also allows you to add slicers, visual filters that let viewers segment and filter the data in real time, to enhance the usefulness and interactivity of these visualizations. The final step in the workflow involves arranging the visualizations on a report page and then sharing the report with other stakeholders. The Power BI service allows you to publish these reports, enabling a broader audience to interact with them online, even on mobile devices.

Visualizations don't only present data in a more understandable form; they also enable real-time data analysis. For example, as sales figures at Adventure Works are updated, the visualizations in Power BI will update automatically. This provides companies like Adventure Works with up-to-date, accurate insights and enables them to react more quickly to changes in their business environment.
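As promised above, here is a minimal Power Query (M) sketch of the extract-and-transform stage of this workflow. The file path, sheet name, and column names are hypothetical placeholders rather than part of the course files.

let
    // Extract: read a worksheet from a hypothetical Excel workbook
    Source = Excel.Workbook(File.Contents("C:\Data\AdventureWorksSales.xlsx"), null, true),
    SalesSheet = Source{[Item = "Sales", Kind = "Sheet"]}[Data],
    Promoted = Table.PromoteHeaders(SalesSheet, [PromoteAllScalars = true]),
    // Transform: rename a column, set data types, and filter rows
    Renamed = Table.RenameColumns(Promoted, {{"Amt", "OrderTotal"}}),
    Typed = Table.TransformColumnTypes(Renamed,
        {{"OrderDate", type date}, {"OrderTotal", type number}}),
    Filtered = Table.SelectRows(Typed, each [OrderTotal] > 0)
in
    // Load: this resulting table is what gets loaded into the data model
    Filtered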
When representing this processed data in visualizations, data analysts must carefully craft them to communicate the right insights effectively. This includes ensuring you select the correct type of visualization for the data you want to represent: for example, while pie charts are appropriate for displaying parts of a whole, line graphs are more suitable for displaying trends over time. An inappropriate choice of visualization can lead to misunderstandings or even misinformation. Visualizations are not only advantageous but essential in today's data-rich business environments. Used correctly, they are more than simple graphical representations of data; they are like keys to insights, transforming the way stakeholders understand and engage with data. As you journey through the complex world of business intelligence with Power BI, you can guide stakeholders to strategic decision-making, uncovering valuable insights and knowledge.

As a new data analyst at Adventure Works, you're overwhelmed by the vast amount of sales, customer, and manufacturing data. You know the data contains invaluable insights about commerce, customer behavior, production efficiency, and more, but how do you translate it into meaningful information that stakeholders can understand and act upon? You have a powerful solution: Power BI visualizations. In this video, you'll learn about commonly used visualizations in Microsoft Power BI and discover their purpose and versatility in relation to data representation and interpretation. You've learned that data visualization is the graphical representation of data, a method for uncovering patterns, trends, and insights that may not be apparent in raw data. Visualizations communicate complex data sets in an intuitive and accessible way, creating an approachable narrative that encourages data-driven decision-making. Let's explore some of the common visualization types available in Power BI and their practical uses in the context of Adventure Works.

The first visualization type is the column chart. Column charts are a clear, straightforward way to compare different categories in a vertical orientation. They can demonstrate data changes over time or illustrate comparisons among items, and they are generally used when there are fewer than 10 categories on the x-axis, the horizontal axis at the bottom of the chart. Adventure Works could use a column chart to compare the sales of different bicycle models over the past year: each column would represent a different product category, and the height of the columns would indicate the sales figures, allowing stakeholders to quickly compare and contrast sales performance across models.

Bar charts are another powerful visualization for comparing different categories. Unlike column charts, however, bar charts are a horizontal representation of data; the length of each bar corresponds to the quantity it represents. Bar charts are useful for comparing larger quantities or categories with lengthy labels. Long labels are inappropriate for column charts, as their vertical orientation means the labels appear sideways, which can be challenging to read. You can also use bar charts to display comparisons among discrete categories, that is, non-continuous, distinctly separate groups of data, such as different payment methods. For example, Adventure Works could use a bar chart to compare the number of order transactions per payment category.
This clear and straightforward visual would make it easy for stakeholders to compare the performance of the different payment methods, identify opportunities for payment option optimization, and gain insight into customer behavior and preferences.

A further common visualization type in Power BI is the line chart. Line charts are best suited to showing trends over time; they connect individual numeric data points to form a line. This visual is useful when you have a large data set and are interested in visualizing trends, patterns, or fluctuations in your data over time, and it's particularly effective for representing many data points. Adventure Works could use line charts to track sales trends over time: they might compare the monthly sales figures of different bicycles for the past five years to identify when sales peak and when they are slow, helping inform strategic decisions about promotions and inventory.

Power BI also offers area charts, which are, in essence, line charts where color or texture fills the area beneath the line. These charts help compare two or more quantities and show part-to-whole relationships over time or across categories, representing how individual segments contribute to an entire data set. For example, in an area chart for Adventure Works based on sales data, each product type, like mountain bikes or road bikes, would appear as an area on the chart showing its sales as a portion of total sales. This can help stakeholders understand how each product contributes to total sales and how this relationship changes over time.

Now let's explore pie charts. Pie charts are circular graphics divided into slices to illustrate numerical proportions. This visualization type is ideal when you want to show a data set as a proportion of a whole: each slice of the pie represents a category of data, and the size of each slice is proportional to the quantity it represents within the whole. Adventure Works might use a pie chart to illustrate the proportion of sales made up by each product category. Each slice would represent a different product category, with its size proportional to the revenue generated by that category, enabling stakeholders to understand at a glance which products contribute most to overall sales. Keep in mind that pie charts become less effective when there are too many categories to compare, resulting in a high number of small slices; in that case, a bar chart might provide a clearer visualization.

The last visualization type you'll learn about in this video is the table. Tables in Power BI are a way to view raw, detailed data and exact numbers. They display information in columns and rows, providing a comprehensive numerical view of your data. While they don't offer the same visual impact as other chart types, tables can display additional details that might be critical to stakeholders' understanding of your data. Adventure Works could use a table to display a detailed monthly sales breakdown for each product category by region, allowing the relevant stakeholders to examine exact sales figures and make precise comparisons, supporting detailed, nuanced analysis.

In this video, you discovered a range of common visualizations available to you in Power BI. Each visualization type plays a unique role in data storytelling. By understanding and effectively using the visuals in Power BI, you can transform raw data into a masterpiece that conveys insightful, actionable information, driving more thoughtful decision-making and improving business outcomes.
In a complex organization like Adventure Works, sales reports are indispensable for coordinating sales efforts across regions and product lines. Let's explore how to apply visualization items to a basic sales report. Once you've imported your data using Get data on the Home ribbon and cleaned and transformed it using the Power Query Editor, you can start adding visualizations to your report canvas.

First, let's add a column chart to visualize how sales are distributed among various product categories, helping Adventure Works gain insight into the performance of different products. From the visualizations pane, select the clustered column chart button. This creates an empty chart on your report page. Now that you have an empty clustered column chart, it's time to fill it with data. You can find your data fields in the fields pane, also referred to as the data pane or data section, typically located on the far right side of the Power BI interface; these fields correspond to the columns in your data source. Find and select the product category field in your sales data source, and while holding the field, drag it over to the x-axis box under the visualizations pane; releasing it will drop the field into the box. By placing the product category field in the x-axis well, or input box, you're telling Power BI to use the unique values from this field to create individual columns on the chart. The next field you need to add to your chart is the order total field. Select and drag the order total field to the y-axis box, as you did with the product category field and the x-axis. When you drop a field into the y-axis box, Power BI performs a calculation on that field for each category; in this case, it calculates the sum of the order total for each product category and displays this data in the respective column. With this column chart, stakeholders at Adventure Works can identify trends, opportunities, and challenges in product performance that can guide product development, marketing campaigns, and pricing strategies.

Next, let's create a pie chart to visually represent sales distribution across different payment methods. A pie chart will make it possible for stakeholders to determine how much of the total each payment method represents. To start creating your chart, select the pie chart button in the visualizations pane; this adds an empty pie chart to your report page. To start populating the chart with data, find the payment method field in the fields pane and drag it into the legend well in the visualizations pane. By putting the payment method field in the legend well, you're telling Power BI to create a different slice of the pie for each payment method in your data. After that, find the order total field in the fields pane and drag it into the values well. When you drop a field into the values well, Power BI performs a calculation on that field for each category; by default, Power BI calculates the sum, so it will calculate the sum of the order total for each payment method. This pie chart can help Adventure Works understand key revenue streams and customer payment preferences, and it can even guide decisions around payment processing partnerships.

Finally, let's add a line chart visualization to the report. Line charts are effective for showcasing trends or changes over time; for example, this chart can help stakeholders recognize and understand the patterns and cycles in their sales data and identify any anomalies. To create the line chart, identify the line chart button in the visualizations pane and select it. This generates an empty line chart on your page.
To fill your empty line chart with data, locate the order date field, representing time, and drag it into the x-axis field well in the visualizations pane. By doing this, you're instructing Power BI to use time as the x-axis of your line chart, which forms the basis for the trend analysis. Then locate the order total field and drag it into the y-axis field well. By default, Power BI calculates the sales sum for each date and plots it as a data point on the line chart. This offers stakeholders a practical way to visualize and understand sales trends over time, and they can use the line chart to inform strategic decision-making and drive business growth. Remember that Power BI may make certain assumptions about your date data when creating line charts; for example, if your order date field includes specific times, Power BI might plot every unique timestamp. To ensure Power BI aggregates data according to your preferences, select the drop-down arrow next to order date and choose your desired level of detail, for example, by year, quarter, month, or day.

After creating your visualizations, the next step is to save your report to ensure you don't lose any of your work. To save your report, select the File option in the upper left corner of the Power BI interface; a drop-down menu will appear. From this menu, select Save. A window will open asking you to name your report. Name it something descriptive to help you and others understand what the report is about, such as Adventure Works Sales Analysis Report. In this window, select Save again to finalize the process. And there you have it: you've learned how to apply visualization items to a basic report in Power BI. The sales analysis report, complete with visualizations, holds valuable insights for Adventure Works and will support data-informed decision-making.

Imagine you are a data analyst at Adventure Works, working with vast amounts of information daily. While innovative and interactive charts can be flashy and captivating, there are moments when your audience wants simplicity: a straightforward, no-frills presentation. Microsoft Power BI's table visualization is useful when you want to employ the classic, clear-cut style of tables to ensure your audience can grasp the essence of the data quickly. It elegantly presents refined data, allowing viewers to immediately consume critical information and insights. In this video, you will learn more about the table visualization in Power BI and how to configure it.

When you load a raw data set into Power BI, like an Adventure Works sales report with data from February, March, and April, it is tough to pinpoint details quickly. For instance, figuring out the monthly sales for each region becomes a challenge, and if you are trying to dive even deeper, aiming to identify specifics like the number of orders that were either cancelled or shipped, extracting this information from the raw format is a difficult task. The table visualization in Power BI can summarize all these insights and still present them in tabular format. When the same sales data is presented using a table visualization, the table displays summarized insights, which are much more user-friendly to work with. You can even customize the table visualization to improve its aesthetic appeal or aid engagement and comprehension.

Now that you know more about the table visualization in Power BI, let's learn how to configure it. Once you load your data into Power BI, using a table visualization is quite straightforward. Open your report view and select the table visual from the visualizations pane; this instantly places the visual in the report area.
You can resize this visual by dragging its corners or sides. While keeping the visual selected, select as many data fields as you want; for example, you can select month and order total in the data pane, which gives you insight into monthly total sales. If you want to break down the sales by different regions, simply add the product region field from the data pane, and the table visual will display monthly sales for each region. Adding another field, order quantity, to this visual gives you more insight into how many items were shipped, cancelled, or still under processing. The visual even calculates the totals automatically, displaying them at the bottom. What if you want to see the order status in this table? Just select the order status field from the data pane, and notice how the table visual summarizes valuable information, like order quantity and order total, for each row.

You can sort any of these columns by selecting the column header: for example, selecting the product region column header sorts it in ascending order, and another click on the same header sorts it in descending order. You can change the sequence of the columns by dragging the fields up or down in the visualizations pane. Let's drag the order status after the product region, and notice how the visual changes the way it displays the data: it now shows the order status column right after the product region column.

You also have the option to format this table visual and change its appearance by customizing the various options available in the format tab. Expand the style presets option and select any preset from the available drop-down, and the appearance of your table will change instantly. You can further customize the table by expanding other sections: for example, you can display horizontal grid lines by expanding the grid section and selecting your desired color and width, and you can change the table header font size, color, and other options by expanding the column headers section. There are many other options to format the appearance and feel of the table, whether to reflect your brand colors or to increase its visual appeal for your audience.

Using raw data can feel like looking for a needle in a haystack: it can be overwhelming, messy, and confusing. But using table visuals in Power BI is like sorting that haystack into neat, manageable piles, making it easier to find what your audience is looking for. With data neatly laid out row by row and column by column, table visualizations present insights clearly and are an invaluable tool for bridging raw data and actionable intelligence.

Your manager asks you to present a sales report to key stakeholders during a business meeting later in the week. Imagine you receive an Excel file containing all the Adventure Works sales data for the current year. The sales department wants an appealing report that offers a comprehensive view of the company's monthly sales volume and the number of processed orders and cancellations. So what is your strategy for completing this task? This is where Microsoft Power BI's bar and column charts can make you shine. In this video, you'll discover the different bar and column charts in Power BI that can help you efficiently represent your data. You will also learn about the four field wells you can use to customize these charts: axis, legend, values, and tooltips.

Previously, you learned that bar and column charts are popular types of visualizations that display data in a clear and organized way. They are beneficial for showcasing categorical data, or data
that can be organized into distinct groups. Bar charts display data horizontally, whereas column charts display data vertically. The simplicity and intuitive nature of bar and column charts make them effective tools for presenting data and identifying patterns or trends over time. With six different types of bar and column charts in Power BI, you can convert raw data into visually appealing and meaningful insights. Let's explore each of these chart options, their features, and how to add and configure them in Power BI.

It can be difficult to identify patterns or insights when working with raw data sets containing text and numbers. In this data set, sales volume across different regions and the order status, such as shipped or cancelled, are organized into various columns. Let's examine how to visualize this data using the different bar and column charts available in Power BI. To create a bar or column chart that demonstrates the number of orders by status and month, select the month, order quantity, and order status data fields from the data pane. With the relevant data fields selected, let's start by placing a bar chart on the report area. You can do this by selecting the stacked bar chart icon in the visualizations pane, and you can resize it as needed by dragging its edges. With this chart, stakeholders can quickly compare and gain insight into the number of orders shipped, cancelled, or processed during February and March; this is much easier to interpret than the raw data set.

You have the option to visualize this data using the variety of bar and column charts available to you. To change the chart type, select the chart you placed and then select the relevant icon from the visualizations pane, such as the stacked column chart. A stacked column chart is like a stacked bar chart, but the data is displayed as columns instead of horizontal bars. Another option for visualizing the data is the clustered bar chart, where the values are displayed as individual bars instead of a group; in the next option, the clustered column chart, the data is shown as individual columns. The last two options are the 100% stacked bar chart and the 100% stacked column chart. In both charts, important insights are displayed in the tooltips: for example, if you hover your mouse over any of these bars or columns, Power BI displays the percentage and value of any grouped item, such as the order quantity. In Power BI, you can select any of a chart's individual bars or columns to highlight them; the other items fade, making the selected items more prominent, which is useful for highlighting specific areas or insights of interest.

Now let's explore four essential field wells in these charts: the legend, the x- and y-axes, and tooltips. The field wells represent different sections of your chart that you can customize according to your requirements. The first field well is called the legend. It displays under the title or on the side of a chart, and it controls the color coding or grouping of the bars or columns in your chart, helping to differentiate between different categories or subgroups within the data. The legend makes it easier to understand which color in the chart represents which item. You can hide the legend by turning it off in the format tab of the visualizations pane; if the legend is not shown, you can hover your mouse over a bar or column to display the data. The next field wells are the x- and y-axes. Each axis represents the data points you want to compare or analyze. For bar charts, the x-axis shows the values, like order quantity and
total sales, and the y-axis shows the categories, like month or product region. For column charts, this is reversed: the x-axis shows the categories, and the y-axis shows the values, like order quantity or total sales. The final field well is called tooltips: a tooltip displays data or extra information when you hover over the data points of a chart. Understanding the different types of bar and column charts in Power BI, such as stacked, clustered, and 100% stacked charts, allows you to present your data in visually engaging and meaningful ways, and by using the four field wells (axis, legend, values, and tooltips) you can create customized visualizations that are informative and insightful.

Adventure Works is preparing for its annual sales conference. Your team leader has tasked you with presenting a report that portrays the direction of sales trends; the report must also incorporate monthly information regarding delivered, pending, and cancelled orders. This is where Microsoft Power BI's line and area charts become instrumental. In this video, you'll explore line and area charts, when to use them, and how to add them to your reports. Learning to use these charts is essential for creating attractive reports that empower stakeholders to make informed and effective decisions.

A line chart uses a line to connect individual data points. It is the perfect tool for illustrating a sequence of values or displaying trends over a time period: for example, a line chart can help Adventure Works understand how sales are progressing month to month or year to year, and a line chart with multiple lines can show sales across different regions over time, helping stakeholders understand the trend in sales performance. While a line chart focuses on trends, an area chart emphasizes the magnitude of changes. It can display the part-to-whole relationships in your data, making it easier to compare quantities: for example, regional sales represented in an area chart can help stakeholders intuitively understand and compare the degree to which each product region contributed to total sales for each month. There's a variant of the area chart called the stacked area chart, where the data points from multiple categories are stacked on top of one another. This can be useful for emphasizing the total across several categories: for example, you could use a stacked area chart to illustrate the total orders over a period and demonstrate how each product region contributes to the total.

So how do you decide when to use bar or column charts, which you learned about previously, or line and area charts? When presenting a few items, bar and column charts can be visually appealing and effective; however, when dealing with many data points, these charts can become cluttered and difficult to read, as each bar or column takes up a certain amount of space and the chart can become overcrowded if there are too many to plot. Unlike bar and column charts, line and area charts are effective for visualizing changes in the values of multiple categories, particularly over time. While line charts are useful for identifying trends, area charts offer a further benefit: they help us interpret the magnitude of the values, and they effectively illustrate the cumulative impact of the data points over the selected time, providing an overall picture of the data trends.

Now that you've been introduced to line and area charts, let's take a moment to explore how you can create them in Power BI. Start by importing the Adventure Works quarterly sales
data set file into a new Power BI project. In Power BI, the line chart, area chart, and stacked area chart icons are available in the visualizations pane. To create a line chart, select the line chart icon from the visualizations pane and place it on the report section. Open the data pane and select two fields: month and order quantity. The x-axis of the visualization is sorted by descending order quantity; to change it to ascending order, navigate to the visual settings and select Sort axis, then Sort ascending. A line chart is handy for illustrating trends: for example, this line chart displays the
total sales from February to April, and it clearly demonstrates an upward trend in sales for the quarter. The sales team at Adventure Works may also want to compare the performance and trends of different regions across the quarter. To do this, select the line chart, open the data pane, and select the product region. The line chart now indicates that, although there appears to be a general upward trend in sales in all regions, the European region outperformed both Asia and North America in February, March, and April.

As you discovered earlier, you can display your data another way using area charts and stacked area charts. To create a new area chart, select the area chart icon from the visualizations pane, place it on the report section, and select the month and order quantity fields from the data pane. Using the visualization settings, adjust the sort order of the x-axis, as you did for the line chart, so the chart highlights the increase. Again, for a more nuanced understanding of the number of orders for the quarter, you may want to display the data by individual regions; to do this, select the product region field from the data pane while keeping the chart selected. The sales team can now get a better idea of how the regions contributed to the order quantity in February, March, and April. You can also display the values in a stacked manner by selecting the visual and then selecting the stacked area icon in the visualizations pane; this allows you to display the individual values as well as the total on a single chart.

In all these charts, you can hover over the data points to display the values in a tooltip; for example, a tooltip could display the exact sales figure for a specific month. The tooltip is one of the four essential field wells available in many visualizations in Power BI; the other three important field wells are the legend and the x- and y-axes. You can configure the titles of these axes, colors, and other details by selecting the paintbrush icon in the visualizations pane, which opens the format tab where you can make any necessary changes. Line, area, and stacked area charts are potent tools in Power BI that can convert complex data into easily understandable visuals. Learning to use these visualizations and their essential field wells can equip you to deliver effective Power BI reports that present clear and compelling comparisons of data over time and across different categories.

The sales manager at Adventure Works wants a comprehensive overview of how order quantity relates to overall sales performance for the past few months. While bar charts can easily display the sales or the order quantity, juggling these metrics on one chart could be a visual challenge; likewise, line charts offer an excellent way to track changes over time but won't show the difference between sales and order quantities. By visualizing the order quantity and total sales metrics for the past few months simultaneously, the sales manager can quickly identify any patterns or trends and make strategic decisions to boost sales performance. This is where combination charts, referred to as combo charts in Microsoft Power BI, can help. In this video, you'll learn more about these charts, including how to create and format them in Power BI.

A combo chart is a dynamic combination of a line and a column chart, allowing you to visually represent two different yet interconnected data points. Power BI offers two types of combo charts: the line and stacked column chart, and the line and clustered column chart. A line and stacked column chart is helpful for displaying a
The sales manager at Adventure Works wants a comprehensive overview of how order quantity relates to overall sales performance for the past few months. While bar charts can easily display the sales or the order quantity, juggling both metrics on one chart could be a visual challenge. Likewise, line charts offer an excellent way to track changes over time but won't show the difference between sales and order quantities. By visualizing the order quantity and total sales metrics for the past few months simultaneously, the sales manager can quickly identify any patterns or trends and make strategic decisions to boost sales performance. This is where combination charts, referred to as combo charts in Microsoft Power BI, can help. In this video, you'll learn more about these charts, including how to create and format them in Power BI.

A combo chart is a dynamic combination of a line and a column chart, allowing you to visually represent two different yet interconnected data points. Power BI offers two types of combo charts: a line and stacked column chart, and a line and clustered column chart. A line and stacked column chart is helpful for displaying a total across a series of data and how each individual part contributes to that total. For example, you could create a line and stacked column chart for the sales team, using columns to visualize total monthly sales, each stacked by product region, with the line representing a different but related factor: order quantity. Line and clustered column charts, on the other hand, are excellent for comparing several sets of data side by side. This can be useful for tracking and comparing different metrics over the same period; for instance, you might have columns representing the sales of each product region by month, with a line indicating the average order quantity across all regions.

As a Power BI analyst, combo charts are one of the many essential visualization tools in your toolbox, so let's delve into the process of adding and setting up a combo chart in Power BI. Suppose you need to create a combo chart using an Adventure Works data set containing sales data. The purpose of the chart is to provide the sales team with insights into orders for February, March, and April, including the overall performance of each month and each sales region. To create this combo chart, you'll need four data fields: month, order quantity, order total, and product region.

Start by placing a line and stacked column chart on the report area from the visualizations pane; you can resize the visualization by dragging its edges. While keeping the chart selected, open the data pane and select the month, order quantity, and order total fields. In the column y-axis field well in the visualizations pane, order quantity and order total appear together. Select the order quantity field and drag it to the line y-axis field well; both the line and column visuals now appear on the inserted chart. Now add one more field from the data pane: product region. The chart now has a stacked look, with each colored segment representing the contribution of each product region to the order total. Stakeholders can now not only compare the sales performance over the quarter but also compare the performance of each region month to month. You can also sort the chart in ascending order: select the three dots in the top right corner of the chart, followed by sort axis from the drop-down menu, and then sort ascending. You can change this chart to a line and clustered column chart by selecting the chart and then selecting the line and clustered column chart icon on the visualizations pane.

Let's briefly explore some of the key field wells for the chart. The x-axis, or shared axis for the line and columns, displays the categories; in this chart, month is used as the category. The line y-axis is where you place the data to be displayed as a line, like sum of order quantity. The column y-axis is where you place the data to show as columns, like order total. Finally, the legend is used to add categorical fields to the chart, for example, the product regions. When you hover over a data point with your mouse, some default values for that data point display; if you'd like to add additional information, select the appropriate fields from the data pane and drag them to the tooltips area.

Combo charts in Power BI are yet another tool in your data analytics toolbox. With your knowledge and understanding of these charts and their functionalities, you can present complex and related data points seamlessly and in a visually compelling way.
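The average line described above could be backed by a measure of its own. A small sketch, again assuming the illustrative Sales table from earlier with an OrderQuantity column:

    -- Average number of items per order in the current filter context;
    -- a candidate for the line y-axis well of a line and clustered column chart.
    Average Order Quantity = AVERAGE ( Sales[OrderQuantity] )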
At Adventure Works, your recent report made quite an impact, and your manager asks you to create another Microsoft Power BI report, adding visualizations other than the area charts you used previously. Your team suggests using pie and donut charts, which can offer similar critical insights to area charts but are clearer when many items share the same data range, as it can be difficult to identify such items correctly in an area chart. This is where pie and donut charts can be helpful. In this video, you will learn about these charts and how to use them in your Power BI reports.

Pie and donut charts are two types of visualizations available in Power BI. These charts, which are circular and cut into slices, provide a way to represent data proportionally. While pie and donut charts are useful for comparing different categories, they become less effective when comparing large numbers of categories, as the slices can become too small and difficult to distinguish. Choosing between a pie and a donut chart depends on the specifics of your data and your report requirements.

Let's explore each type of chart, starting with the pie chart. In a pie chart, each slice corresponds to a unique category from your data set, and the size of each slice is directly proportional to the quantity it represents. Suppose you have a quarterly sales data set; with a pie chart, you can visually compare the contribution of each month to the total sales. The larger the slice, the higher the sales for that month, providing your audience with an immediate and intuitive understanding of the distribution of sales.

Like a pie chart, a donut chart's segments are proportional to the data they represent. The difference is that the donut chart is ring-shaped, with a circular central space. You can use this space to provide context for the surrounding segments. Returning to the sales data example, you could use the donut chart's center to highlight total sales, average sales, or any other key metric; you'll learn more about this later in the course. When choosing between a pie and a donut chart to represent parts of a whole, the donut chart may be the better choice if you'd like to display additional information in the central space.

Having explored pie and donut charts, let's uncover the steps for adding and configuring them in Power BI. Imagine you need to create a pie chart using a quarterly sales data set from Adventure Works. For the pie chart, you need to specify at least two data fields. Start by placing a pie chart on the report area from the visualizations pane and resizing it by dragging its edges. Select the pie chart and, while keeping it selected, open the data pane and select two fields: month and order quantity. Ensure that month goes to the legend field well and order quantity goes to the values field well.

You can add more data to create a more detailed pie chart or illustrate additional insights. For example, you may want to examine the total order quantity by region. To do this, select the product region field from the data pane and ensure that it goes to the details field well. Now the pie chart slices display the total order quantity sold in February, March, and April for Asia, Europe, and North America. You can sort this chart by order quantity to display the slices in size order: select the three dots in the top right corner of the chart, select sort axis, and then sort ascending.

You can also visualize this data using a donut chart, which likewise shows the relationship of parts to a whole. To convert the pie chart to a donut chart, select the pie chart and, while it is still selected, select the donut chart icon on the visualizations pane.
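A key metric surfaced in or near the donut's central space would typically be a measure. One hedged possibility, building on the hypothetical Total Sales measure sketched earlier and an assumed Sales[Month] column:

    -- Average of the monthly sales totals, e.g. for a card visual layered
    -- over the donut chart's empty center.
    Average Monthly Sales =
    AVERAGEX ( VALUES ( Sales[Month] ), [Total Sales] )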
Unlike a pie chart, the center of the donut chart is blank, leaving space for additional information that can provide context for the surrounding segments.

To make your charts more interactive and display more data when presenting to your audience, you can enable drill mode. For example, select product category from the data pane and then select the drill down icon to turn on drill mode, ensuring that product category goes to the legend field well. There is no visual change when you add the product category field while drill mode is off. Once you turn on drill mode, you can display the additional details by selecting each slice; for example, if you select the slice that displays the total sales in April, more information is displayed. To return to the main chart, select the drill up icon.

In the dynamic world of data analytics, the correct visualization can make all the difference. Pie and donut charts offer clean, effective ways to visualize and compare proportions and to illustrate the relationships within your data. By using these visualizations in Power BI, you can deliver clear and engaging presentations.

You've been exploring the range of visualizations that Microsoft Power BI offers. One of these is the tree map chart. Like pie and donut charts, tree maps are a helpful tool in Power BI for illustrating proportional data; however, instead of circles, tree maps use rectangles to display your data. You might be wondering: why do I need another chart if they serve a similar purpose? Using different chart types can enable you to make the best use of space in your reports and add variety by displaying data in new and exciting ways. In this video, you'll become familiar with tree map charts, understand their applications, and learn how to craft them in Power BI to create insightful presentations.

A tree map is a unique visual used to display hierarchical data, or data that's organized in a tree-like structure, as nested rectangles. The entire chart represents the total data set, or tree, and each rectangle, or branch, represents a portion of the whole. Each rectangle's size corresponds to the value of the data it represents. While pie and donut charts are familiar and widely used to represent data proportionally, they have limitations; for example, they can become cluttered and difficult to read when dealing with many categories or variables, or when the differences between data points are small. The design of a tree map chart, however, allows for easier visualization and interpretation of larger data sets; its nested rectangular structure means it can handle more data points without becoming overly complex. To illustrate this, consider a pie chart representing sales at Adventure Works across Asia, Europe, and North America for one quarter: when you convert the same chart to a tree map, it becomes less cluttered and the information is presented in a more readable way.

Now let's create a tree map chart using a quarterly sales data set from Adventure Works. Start by placing a tree map chart from the visualizations pane on the report area; you can resize it as required by dragging the edges. To create a tree map chart, you need three fields. To add the data fields, select the chart and, while keeping it selected, open the visualizations pane and select month, order total, and product region from the data pane. The visual automatically directs the selected data fields to the appropriate field wells: month to the category well, product region to the details well, and the sum of the order total to the values well.
If you are not satisfied with this automatic selection of the field wells, you can manually drag the data fields to the appropriate wells.

Let's compare this tree map chart to a pie chart created using the same data. There is a legend in the pie chart that is absent in the tree map chart: because the month names are already displayed in each branch of the tree, a separate legend is not required. The pie chart also displays the data values by default, which are missing from the tree map chart. You can enable the data values in a tree map chart: select the chart, open the format tab on the visualizations pane, and select data labels to turn on the data values. Now the tree map chart displays the values beside the month and the region name.

As with pie and donut charts, you can add more fields to the tree map chart and enable drill mode. To add more data, select the order status field from the data pane while keeping the tree map chart selected. A drill down arrow icon appears in the top right corner of the chart. Select the drill down icon to enable drill mode, then select any branch to display the detailed information, making the visual interactive. If you'd like to return to the main, less detailed visual, select the drill up arrow icon.

You can also customize your tree map by changing the font size of the category and data labels and the colors of the categories. To do this, open the format tab on the visualizations pane, then open the data labels and category labels sections; here you have the option to change the colors and font sizes of your chart as needed.

Tree map charts offer a unique approach to displaying hierarchical data, allowing for efficient use of space, clear comparisons, and effective handling of larger data sets. While pie and donut charts are popular, knowledge of tree map charts provides an added layer of flexibility and depth to your reports. You now know what a tree map is and how it can elevate your data storytelling and presentation skills. Well done!

Imagine you are in a sales meeting presenting a chart focusing on employee turnover rates at Adventure Works. While this chart may help management understand why employees are leaving the company or make resourcing decisions, it is not useful in the context of the sales department. That's because the chart is not representing a key performance indicator relevant to the sales department, such as total sales revenue. Previously, you discovered the importance of creating targeted charts to help stakeholders make informed decisions; these charts are tailored based on the key performance indicators, or KPIs, relevant to different departments. In this video, you'll learn more about visualizing KPIs by exploring the elements available in Power BI to display KPIs in an engaging way.

KPIs differ from regular charts and metrics because they align directly with strategic business objectives. Instead of simply presenting raw data, KPIs offer insight into how that data impacts overall business goals and progress. A well-designed KPI visual helps stakeholders clearly understand organizational or departmental goals and the metrics that signify progress. By providing a concise summary of complex data, KPI visuals make it easier and more efficient for stakeholders to comprehend a business's overall performance, progress, and key metrics. This empowers stakeholders to make informed decisions and implement data-driven strategies to promote successful business performance. Microsoft Power BI offers a range of visualizations to display KPIs, including cards, multi-row cards, gauges, and the KPI visual.
visual let’s explore each of these visuals and their uses the card visualization displays one value or a single data point this type of visualization is ideal for representing essential statistics you want to track on your PowerBI dashboard or report for example you could use a card visual in a sales dashboard to provide a snapshot of the total sales revenue enabling stakeholders to gain instant insight into overall financial performance next is the multirow card visualization that displays one or more data points with one data point for each row another visualization you can use is the radial gauge this visual is a circular arc that displays a single value measuring progress toward a goal or target or indicates the health of a single measure although radio gauges can highlight critical insights in a visually appealing engaging way they take up a lot of space compared to the insights they provide let’s examine the structure of this visual powerbi spreads all the data values evenly along the arc from the minimum leftmost value to the maximum rightmost value the default maximum value is double the actual value you should specify the target minimum and maximum values using the corresponding field wells in the visualizations pane to create a realistic gauge chart that represents your data the shading in the ark represents the progress towards your target and the value underneath the ark represents the progress value lastly the KPI visual in PowerBI is a powerful tool for tracking the performance of a metric against a target the KPI visual also includes a trend line or chart to show the data’s trajectory over time in this case the chart is showing the daily sales trend against the target of $10,000 it displays an indicator that shows whether the performance is above or below the target for example this KPI visual clearly indicates that the total sales amount on the last day is falling behind the target the KPI visual usually has three field wells indicator which is the primary measure you are tracing trend axis which shows how the indicator is performing over time and target goals which represents the benchmarks you are trying to achieve you’ll place the relevant measures or fields into these field wells to represent your data accurately and comprehensively with the chart key performance indicators act as a health checkup for a business providing stakeholders with insights into their progress toward reaching business goals by using PowerBI’s card multiro card gauge and KPI visuals you can make KPIs quick and easy to understand that means stakeholders can make informed decisions and reach their goals faster suppose you’re a data analyst at Adventure Works as the financial year ends you need to provide management with a report analyzing sales trends and financial performance across regions throughout the year ribbon and waterfall charts in Microsoft PowerBI can help you achieve this goal in this video you will learn about these specialist charts and how to use them in your PowerBI projects a ribbon chart is a form of stacked chart for visualizing data that changes over time and has a clear ranking order these charts stack the highest ranked series at the top of the chart making it easy to track shifts in the rankings over time they are also helpful for comparing the performance of different categories across distinct time intervals in the adventure work scenario management wants to understand the sales ranking of various regions throughout the year this ribbon chart effectively conveys how the 
Suppose you're a data analyst at Adventure Works. As the financial year ends, you need to provide management with a report analyzing sales trends and financial performance across regions throughout the year. Ribbon and waterfall charts in Microsoft Power BI can help you achieve this goal. In this video, you will learn about these specialist charts and how to use them in your Power BI projects.

A ribbon chart is a form of stacked chart for visualizing data that changes over time and has a clear ranking order. These charts stack the highest-ranked series at the top, making it easy to track shifts in the rankings over time. They are also helpful for comparing the performance of different categories across distinct time intervals. In the Adventure Works scenario, management wants to understand the sales ranking of the various regions throughout the year; a ribbon chart effectively conveys how the different sales regions performed compared to each other and how their sales rankings varied from February to April.

Waterfall charts show a running total as Power BI adds and subtracts values. These charts are useful for understanding cumulative effects in data analysis and visualization. Cumulative effects refer to how an initial value is affected by a series of positive or negative sequential factors, events, or changes over time. For example, a waterfall chart can be used in financial analysis to visualize how a company's net income results from the cumulative effect of various financial elements, including revenue, costs, and other factors like taxes. A waterfall chart depicting how the Adventure Works sales total changed from February to April for the different product regions shows a general upward trend; with this visual, stakeholders can intuitively grasp the overall sales performance as well as easily compare and contrast the contributions of each month and each region to the sales total over time.

Now let's explore how to configure ribbon and waterfall charts in Power BI, starting with a blank Power BI file and a data set containing sales data for Adventure Works across different regions over time. Place a ribbon chart from the visualizations pane on the report area; you can resize it as needed. The aim of the ribbon chart is to demonstrate the change in sales value and ranking across categorical data like product region and month, so you'll need three data fields to display the data properly. While keeping the chart selected, open the data pane and select the relevant fields: month, product region, and order total. Ensure that month goes to the x-axis field well, product region to the legend field well, and order total to the y-axis field well; none of these fields is optional when creating a ribbon chart. You can sort the category fields by selecting the three dots in the top right corner of the chart, followed by sort axis; select sort ascending to ensure the months are sorted in the correct order. Note that each month has two distinct areas on this chart: the first is the actual sales value for each region, while the other, shaded area shows how that region performed compared to the previous month. For example, by hovering over this shaded area for Europe in April, the tooltip reveals that Europe's sales rank changed from second in March to first in April.

You can create a waterfall chart using the same process you followed for the ribbon chart. Alternatively, you can convert the ribbon chart you created by selecting it and then selecting the waterfall chart icon from the visualizations pane. There are four field wells in this waterfall chart: category, breakdown, y-axis, and tooltips. Ensure that month goes to the category field well, which defines the x-axis and shows the individual positive and negative values. Then ensure that product region goes to the breakdown field well, which represents the different segments in each category; unlike in ribbon charts, this field is optional in waterfall charts. Lastly, ensure that order total goes to the y-axis field well, which denotes the values used to calculate the running total. If there is a decrease in the sales total, the waterfall chart displays red areas; to observe this, you can sort the chart in descending order by selecting the three dots in the top right corner, then selecting sort axis and sort descending. Each month shows the total sales and how the regions performed compared to the previous month, and you can find additional information about this performance in the tooltips by hovering over any of the red or green areas.
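The running total a waterfall chart computes can also be written as an explicit DAX measure, for example to cross-check the chart in a table. A rough sketch, assuming the illustrative Sales table has an OrderDate column:

    -- Running total of sales up to the latest date in each cell's filter
    -- context; ALLSELECTED keeps any slicer selections intact.
    Running Total Sales =
    CALCULATE (
        SUM ( Sales[OrderTotal] ),
        FILTER (
            ALLSELECTED ( Sales[OrderDate] ),
            Sales[OrderDate] <= MAX ( Sales[OrderDate] )
        )
    )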
In this video, you learned about two specialized charts in Power BI: ribbon and waterfall charts. Ribbon charts help represent rankings and their shifts over time, which is ideal for sales performance analysis across categories. Waterfall charts, on the other hand, are perfect for breaking down the cumulative effects of various factors, providing clear insights into financial performance. Both are impactful visualizations for complex data sets.

The sales manager at Adventure Works has noticed a recent decline in online sales despite continued marketing efforts and website traffic. Concerned that marketing strategies may not be converting leads into sales, the marketing team asks you to create a visualization that represents the customer journey from lead, or interest in the product, to actual sales. They'd like to gain insight into drop-off rates between the stages and identify areas where they can improve their marketing strategies to improve sales performance. Funnel charts in Power BI are one type of visualization you can use to represent the progression of data through different stages, like a sales workflow. In this video, you'll learn about funnel charts and how to implement them in Power BI.

The funnel visualization displays a linear process with sequential, connected stages, where items flow from one stage to the next. Funnel charts are commonly used in business and sales contexts. They are well suited to visualizing data that is sequential and moves through at least four stages, where you expect a greater number of items in the first stage than in the final stage. The charts can help reveal bottlenecks in linear processes, such as stages where a significant number of items are being lost or are not moving forward. In addition, you can use them to calculate a potential outcome by stage, such as revenue, sales, or deals, and to track conversion and retention rates, which relate to how many potential customers move through each stage of the sales process and stay in it. Similarly, you can use them to track the progress and success of click-through advertising campaigns.

Now let's examine an example funnel chart representing the stages of a sales workflow. Each bar in the chart represents a stage the customer goes through during the sales process. It begins with the lead stage at the top of the funnel, representing customers interested in a product or service. The qualify, solution, and proposal stages follow, where these leads are evaluated for their potential, presented with tailored solutions, and then sent formal sales proposals. Lastly, the finalized stage is where the lead agrees to the proposal, closing the sales deal. Each stage in the chart decreases in size as the lead conversion process progresses, creating a funnel shape; the narrowest part of the funnel represents the leads that resulted in actual sales.

Now that you know more about funnel charts and their uses, let's explore how to create and configure a sales funnel chart in Power BI for the sales team at Adventure Works. You'll start with a blank Power BI file; the data set contains sales data, including information about the lead conversion stages. Start by placing a funnel chart on the report area from the visualizations pane; you can resize it as needed. Keeping the chart selected, open the data pane and select two fields: sales ID and conversion stage. Ensure that conversion stage goes to the category field well and sales ID to the values field well; category defines the stages of the process, and values assigns the numeric data to each stage.
Notice the shape of the funnel: the highest value is displayed at the top, with progressively lower values beneath it. Each of the horizontal bars in a funnel chart is called a stage. As mentioned before, this is the typical pattern of the sales conversion process: many people are identified as potential leads in the first stage, but the number gradually decreases as they move toward becoming customers. If you hover your mouse over each stage, it displays information comparing that stage to the previous stage and to the highest, or first, stage. You can use the tooltips field well to provide additional information when hovering over a specific stage.

You can format the colors of each stage, whether to reflect your brand colors or to improve readability and aesthetic appeal. To do that, go to the format tab on the visualizations pane and open the colors section, then turn on show all and select a color for each stage. You can also sort funnel charts in reverse order, with the lowest value at the top and the highest value at the bottom, by selecting the three dots icon in the top right corner of the chart, then sort axis and sort ascending.

Funnel charts are an invaluable tool for presenting sequential or staged data. These charts provide a clear and concise visualization of the various stages of a process, such as a sales pipeline or customer journey, enabling you to identify trends, bottlenecks, and opportunities. By incorporating funnel charts into your Power BI reports, you can provide stakeholders with a comprehensive view of essential data, supporting more informed and strategic decision-making.
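Conversion rates like those the funnel implies can also be computed directly as a measure. A hypothetical sketch, assuming the data set identifies each sale by a SalesID column and records its ConversionStage (both names are assumptions):

    -- Share of identified leads that reached the final stage; DIVIDE
    -- returns BLANK instead of an error when there are no leads.
    Lead Conversion Rate =
    DIVIDE (
        CALCULATE ( DISTINCTCOUNT ( Sales[SalesID] ),
                    Sales[ConversionStage] = "Finalized" ),
        CALCULATE ( DISTINCTCOUNT ( Sales[SalesID] ),
                    Sales[ConversionStage] = "Lead" )
    )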
Suppose Adventure Works has been facing a steady decline in its profitability for some months. Marketing has invested heavily in advertising across multiple platforms and has run several promotional campaigns to boost sales, but the company is struggling to understand the relationship between its advertising spend and its sales revenue. In this video, you will learn about scatter charts, their purpose, and how to configure them in Power BI.

Scatter charts are a powerful tool in data visualization. They use dots to represent the values obtained for two variables in a data set, plotting these two numeric variables along two axes. Scatter plots help illustrate how one factor is affected by another, representing correlations between the variables; the relationship can be linear (following a straight line), nonlinear (following a curved line), or random. Scatter charts can help you identify trends, patterns, and, perhaps most importantly, anomalies like outliers in your data. Anomalies are deviations from the general pattern of the data; outliers are a type of anomaly where valid data points significantly differ from other observations, deviating from the general trend, and they tend to lie far away from other data points in a scatter chart. For example, in a scatter chart representing the relationship between sales revenue and advertising spend at Adventure Works, you might expect the data points to show a positive correlation, where higher advertising spend is associated with more sales. An outlier would be a data point representing unusually high sales revenue with low marketing spend. This data point is worth investigating, as it may indicate an effective marketing strategy able to generate revenue beyond what is expected based on the amount of money spent. A keen eye for outliers is essential because they can dramatically skew statistical measures and data distributions. Though they might seem problematic at first, outliers often carry vital information about the process under investigation or the data-gathering mechanism, and they can help businesses gain valuable insight into potential issues or areas for improvement and optimization.

Let's help Adventure Works investigate the relationship between advertising spend and sales revenue by creating a scatter chart; the company can also explore any outliers using this chart, enabling it to quickly identify issues, areas for improvement, and exceptional successes. For this task, use an imported data set containing Adventure Works sales and advertising expenditure data. To understand how the various advertising media are performing, measuring their advertising budget against sales revenue, you need to compare two fields, sales revenue and profit margin, and identify each item via its campaign ID and platform type. Start by opening the report view, place a scatter chart in the report area by selecting the scatter chart icon from the visualizations pane, and resize it accordingly. While keeping the chart selected, open the data pane and select these four fields: campaign ID, profit margin, sales revenue, and platform. The campaign ID should go to the values field well, as these represent your individual data points; the profit margin goes to the x-axis field well, the sales revenue to the y-axis field well, and the platform to the legend field well. The x-axis and y-axis field wells contain the data fields to compare against each other. To display more data when hovering over a data point, drag the advertising spend field from the data pane to the tooltips field well, then hover over any data point to see the updated tooltip.

This scatter chart visualizes the correlation between marketing spend and sales. The data points, or markers, are shown as dots; you can manually change the size of these markers if needed by opening the format tab and the markers section. The data points behaving as expected are closely gathered in the chart, creating a cluster, and three outliers are instantly evident, making it easy to investigate these data points and gain insight into what caused the deviations from the expected pattern. The data point in the leftmost corner represents a campaign with an unusually high advertising spend compared to its sales revenue; this is not in line with the trend seen in the other campaigns, where higher advertising spend generally corresponds to higher sales revenue. Marketing can use this insight to make resourcing decisions, for example, reallocating the advertising budget toward campaigns that are not underperforming. In contrast, the data point in the middle represents a campaign demonstrating a substantial deviation from the expected trend, with a low advertising spend yet unusually high sales revenue. Likewise, for the data point in the top right corner, sales revenue is exceptionally high given its relatively low advertising spend; this campaign outperforms all others in terms of sales despite the minimal investment in advertising. Stakeholders can investigate these outliers to gain insight into the successful strategies and optimize other campaigns.

Two additional field wells for scatter charts in Power BI are worth noting. The size field well enables you to change each marker's size dynamically, providing insight into how additional factors affect the data points. For example, drag the advertising spend data field to the size field well on the visualizations pane and notice how the size of the data points changes,
with the dot in the leftmost corner now the largest and the dot in the top right corner the smallest; the size of the points now represents the advertising expense. You can also add animation to your chart by adding a data field to the play axis. For example, drag the advertising spend field to the play axis field well: the chart now displays like a video player, with a play button. When you press play, it animates each data point and displays the advertising spend in the top right corner, which is useful for engaging audiences during presentations.

In this video, you discovered scatter charts in Power BI, a type of visualization you can use to represent the relationship between two variables. Scatter charts are a powerful data visualization tool for uncovering outliers, providing insights into trends and patterns, and assisting data-driven decision-making; they are an essential part of any data analyst's toolkit.

Congratulations, you've completed the first module of this course: creating reports in Microsoft Power BI. This week you were introduced to the different types of visualizations in Power BI and how to add them to reports and dashboards, with an emphasis on the significance of visualizations in presenting valuable insights to stakeholders. You started the week by exploring the course overview and structure as part of your course introduction. You set up your Power BI environment and online account, preparing you for the course exercises, and you explored the importance of visualization and analysis in the context of business intelligence, using real-world scenarios and terms to enrich your understanding.

Next, you were introduced to visualizations in Power BI, starting with an overview of their importance in business intelligence. You discovered the power of visualizations to simplify vast and complex data, uncover patterns and trends, enable detailed investigations of data, make data accessible to and engaging for all kinds of stakeholders, and communicate your analysis insights effectively. You also explored creating visualizations in Power BI, a process that involves connecting to your data sources; extracting, transforming, and loading your data; selecting your visualization types and mapping data elements to different aspects of the visuals; arranging the visualizations on the report page; and finally sharing your report. You learned how to apply visualization items to a basic report and were introduced to some common business reports. You then familiarized yourself with the visualizations pane in Power BI, gaining hands-on experience in creating your own business report: a sales report for Adventure Works. You also explored how to pin visualizations in Power BI to empower stakeholders to access key insights quickly, encourage collaboration, and promote a data-driven culture.

In your third lesson, you delved deeper into basic visualizations in Power BI: bar and column charts, line and area charts, combo charts, pie and donut charts, and tree map charts. You not only learned how to create these different charts but also when and how to use them for maximum impact and effective data representation. You had the opportunity to practice your new skills by completing various activities and tasks using different chart types, and you discovered how important it is to target your data visualizations based on the needs of your audience.

With the basic visualizations covered, you moved on to some of the specialist visualizations in Power BI. You learned about key performance indicators, which are measurable metrics
linked to an organization's objectives, and their vital role in business. You were introduced to cards, multi-row cards, gauges, and KPI visuals: visualization types in Power BI that you can use to represent KPIs in business reports. KPI visualizations provide stakeholders with a snapshot insight into overall performance and progress toward goals. You also learned about ribbon, waterfall, funnel, and scatter charts, including their different purposes and how to configure each of them in Power BI. You then had the opportunity to put your knowledge to good use by creating a performance report for the marketing team at Adventure Works, configuring visualizations that showcased relevant KPIs and answering real-world questions about performance over time.

You are now equipped with essential data visualization techniques and report creation skills in Power BI. Next, you will build on your learning, discovering how to enhance the user experience and accessibility of your reports. Keep up the momentum, and ensure you use the quizzes and additional resources to further consolidate your learning.

You're a data analyst at Adventure Works, a company that relies heavily on data analytics for decision-making. The company recently added some talented individuals to its sales team, including Logan, who is visually impaired and uses screen-reading software to access digital content. Soon after joining the team, Logan realizes that the Microsoft Power BI reports he receives are not entirely compatible with his screen reader: he finds it difficult to interpret the visuals and graphics, and there are some components that he cannot access at all. Recognizing the potential impact on Logan's performance and on the sales team's ability to make data-driven decisions, his manager immediately alerts the data analytics team. While their reports are comprehensive and visually appealing, the team has neglected the critical aspect of accessibility. In this video, you'll learn about accessibility in data and reporting, its importance in the business context, and designing Power BI reports that are accessible and inclusive to all.

In the context of digital systems, accessibility refers to products, applications, websites, and tools designed to allow all users to use them effectively, regardless of whether they have any disabilities. Accessibility practices cover a wide variety of elements to ensure the usability and inclusivity of digital content. This includes making digital content compatible with assistive technology, or AT, which is used to increase, maintain, or improve the functional capabilities of people with disabilities, such as Logan's screen reader. Power BI supports many accessibility standards that help ensure your Power BI experiences are accessible to as many people as possible. Among these standards are the Web Content Accessibility Guidelines, commonly known as WCAG, which help ensure web content is accessible to people with disabilities. According to the key principles of these guidelines, web content, including information, user interface components, and navigation, should be perceivable, operable, understandable, and robust, that is, interpretable by a wide range of user agents, including assistive technology.

Implementing accessibility features in Power BI reports can enhance the audience's experience and comprehension of your reports in several ways. Firstly, accessible reports promote inclusivity: by designing Power BI reports with accessibility in mind, you ensure everyone can interact with and understand the data, regardless of any limitations. This results in a more inclusive and equal environment.
Accessible reports also improve usability: the practices used in creating accessible reports, such as providing clear and concise titles, adding alternative text descriptions for visuals, and implementing keyboard navigation, typically result in a better user experience for everyone. In addition, you can cater to different user learning and processing preferences by using various channels or methods to present information, like text, visuals, audio, and tooltips; multimodal presentation can enhance comprehension and engagement for a wider audience. Accessibility features also promote a clear interpretation of the data presented: techniques such as tooltips and descriptive titles provide more context and reduce the chances of misinterpretation. Finally, accessible reports ensure compliance with various jurisdictional laws and regulations regarding digital content accessibility, which keeps your organization within the legal framework and builds trust with your audience.

To promote accessibility, which is vital in data and reporting, Power BI offers a variety of features for designing accessible reports. Power BI visuals are fully keyboard navigable and compatible with screen readers, facilitating user interaction and navigation. Power BI also supports high contrast themes, ensuring better readability. Users can use focus mode to expand visuals, improving visibility, and view data in a screen-reader-friendly tabular format with the show data table option. For users with difficulty perceiving color, like color blindness, you can use markers to convey different series in visuals like line or area charts. Similarly, Power BI supports pattern fills in visuals like pie or bar charts, which you can use in addition to, or instead of, solid colors. It also has some built-in report themes that consider accessibility guidelines. When choosing colors and themes, you need to ensure that there is enough contrast between text and background colors and be aware of color combinations that are difficult to distinguish. You can add alt text, which refers to alternative text descriptions, to the visuals in your reports to make them more accessible; alt text conveys essential insights even if users cannot see your visuals. Adding descriptive titles and labels to your visuals also enhances their accessibility, understandability, and usability. Finally, some users may have motor difficulties and rely on assistive technologies that, for example, use keyboard commands for reading and interacting with your report content. You can set the tab order of reports to help keyboard users navigate them in an order that matches the way other users visually process the report visuals.

In this video, you discovered the importance of making Power BI reports easy to use for all users and how to design accessible Power BI reports, which you'll explore in more detail as you progress through the course. Accessibility ensures you follow the rules about being fair and inclusive, makes your reports easier to use, and helps everyone understand your data. The usability and understandability of your reports play a vital role in communicating analysis insights and, ultimately, in enabling stakeholders like Logan to apply data insights to decisions in the business context.

Knowing the importance of accessible reports, you need to include features that make your Microsoft Power BI reports accessible to everyone. In this video, you'll learn how to configure and format visualizations to improve accessibility. Let's start by adding alt text, an alternative text description, to a pie chart visual in an existing report for Adventure Works.
Alt text is especially useful for people with visual impairments because screen readers can read this text when the user selects a visual. To provide alt text for any object in a Power BI Desktop report, start by selecting the object. In the visualizations pane, select the format section, expand general, scroll to the bottom, and fill in the description in the alt text box; this text box has a limit of 250 characters. Alt text should convey the insight you would like the report consumer to take away from a visual. Because screen readers already read out the title and type of visual, you only need to add a description related to the data and the main point of the visual. For example, alt text for this pie chart could be: sales figures for February, March, and April in Europe, North America, and Asia combined.

Next, let's explore how to set up tab order to improve accessibility by ensuring easy keyboard navigation. Navigate to the tab order page of the report. To set the tab order, select the view tab in the top ribbon and, in the show panes panel, select selection. In the selection pane, choose tab order to display the current tab sequence for your report. You can select an object and then use the up and down arrow buttons to move it in the hierarchy, or select an object with your mouse and drag it to the position you'd like in the list.

Now let's move on to working with titles and labels. To increase accessibility for the visuals in your reports, make sure that any titles, axis labels, legend values, and data labels are easy to read and understand. Navigate to the titles and labels page of the report and compare the two line chart visuals. The visual on the left has no legend or axis labels, which makes it difficult to comprehend the insights the chart is meant to convey. By including a legend, the report consumer now knows which line in the chart corresponds to which product region, and including the axis labels of February, March, and April makes it easier to interpret the trends in the data over time.

You can also add data labels to your charts. To do that, select the visual, select the format section, find the data labels toggle, and turn it on. Turning data labels on for this chart displays the order total amount for each month along the lines representing the product regions, making it easier for the user to interpret the visual at a glance. With data labels, you can even choose to turn the labels on or off for each series in your visual, as well as position them above or below a series. While Power BI does its best to place data labels above or below a line, sometimes the placement isn't clear; for example, in this visual the data labels are jumbled and not easy to read. To change the default position, expand the data labels menu and select above or under from the position drop-down list. Positioning your data labels above or below your series can help ensure clarity, especially if you're using a line chart with multiple lines. With a few adjustments, the data labels are now clearer.

You learned that markers can also help convey information in visuals like line, area, combo, scatter, and bubble charts. Adding markers improves accessibility by not relying on color alone for users to interpret your visual and distinguish between data points, for example, between different series in a line chart. To turn markers on, select the visual, then the format section in the visualizations pane; next, expand the shape section, scroll down to find the show markers toggle, and turn it on.
The line chart now displays markers. To change the shape of the markers for each line separately, select the format tab and expand markers; from there, select any series from the series drop-down and change the shape and size of the markers in the shape section.

Lastly, let's explore focus mode and the show data table option in Power BI. When report consumers are examining a visual in a dashboard, they can expand it to fill more of the screen by selecting the focus mode icon in the context menu of the visual. This displays only the selected visual, allowing for better presentation and focus; to return to the main report area, select the back to report button. To view the data in a visual in a tabular format, select the three dots icon in the top right corner of the visual, followed by show data table in the visual context menu. This displays the data in a table that is screen-reader friendly. You can also switch the layout to vertical or horizontal by selecting the layout button in the top right corner of the visual. In this video, you learned how to format visuals to improve accessibility and use various accessibility features in Power BI. Integrating accessibility features improves inclusivity by ensuring users can access and interact with your content, and it can enhance the overall comprehension and usability of your reports.

Your manager, Adio, asked you to design a report highlighting critical data within a table visual. He wanted you to display data bars with sales figures for immediate recognition and to differentiate specific rows based on their data values for increased readability. To implement this request, you discovered Power BI's conditional formatting feature, which enables the customization of visuals based on diverse data criteria, enhancing report readability and user engagement. In this video, you'll learn about the conditional formatting feature in Power BI and how to apply it to visualizations.

Conditional formatting is a feature that allows you to apply specific formatting to cells or rows in a table or matrix based on specific conditions. This feature is significant when you have vast amounts of data and want to highlight certain elements that meet specific criteria. For example, if the total profit displayed in a table were a negative value, indicating a loss, you could highlight this by using conditional formatting to change the value to a red color. Other visuals also support conditional formatting; for example, you can format a bar chart so that if the sales for a specific product category go beyond a certain threshold, that category's bar changes color.

Conditional formatting offers many benefits. It provides immediate insights, allowing users to quickly spot trends, anomalies, and focal points without going through a vast amount of data one item at a time. A more visually appealing report, particularly one with colored data or data bars in a table, can enhance user engagement, making the information more accessible and readable. In addition, relying solely on manual analysis can result in users missing crucial details; with conditional formatting, vital data points are automatically highlighted, significantly reducing the potential for errors.

Now let's explore how to add conditional formatting to a table visual, which offers excellent support for it. Select the table visual from the visualizations pane; you can resize it as needed in the report view. Now select the month, product region, order status, order quantity, and order total fields from the data pane.
From the format tab, expand style presets and select the alternating rows preset from the drop-down menu. If you'd like to resize the columns, you can drag the column corners as needed. You can also change the column headers by double-clicking the fields in the columns well on the visualizations pane; let's rename sum of order quantity to order quantity and sum of order total to order total.

Now let's show data bars using conditional formatting. Data bars display on columns with numerical values, like order total or order quantity in this table. To show the data bars, right-click the order total field in the columns well on the visualizations pane, select conditional formatting, and select data bars; this displays the data bars dialog box. Here you can select a color for positive and negative bars: positive bars display when the value is positive, and negative bars when the value is negative. Select the colors and select OK, and the data bars display in the order total field with your selected colors.

You can also change the background color of a cell using conditional formatting. Let's try this with the order status column, changing the background color when the values are shipped, cancelled, and processing, respectively. To do that, right-click the order status field in the columns well on the visualizations pane, select conditional formatting, then background color. This shows the background color dialog box, where you can set the conditions that apply specific formatting. Type shipped in the value text field and change the background color, then select the + new rule button to add a new rule. In this new rule, type cancelled and change the background color; add one more rule, type processing, and change the background color. Select the OK button and the table updates with the new conditional formatting instantly. Remember that you can add as much conditional formatting to each field as you want.

In this video, you discovered how to implement conditional formatting in a table visual. Conditional formatting in Power BI is an effective feature you can use to enhance the clarity and usability of your visualizations, making your data easily accessible and increasing visual appeal and user engagement.
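Rule-based formatting like this can also be driven by a DAX measure that returns a color, applied through the conditional formatting dialog's field value option. A hedged sketch, reusing the illustrative Sales table and an assumed OrderStatus column:

    -- Returns a hex color per order status; bind it to the cell background
    -- via conditional formatting's field value option.
    Order Status Color =
    SWITCH (
        SELECTEDVALUE ( Sales[OrderStatus] ),
        "Shipped",    "#2E7D32",
        "Cancelled",  "#C62828",
        "Processing", "#F9A825",
        "#FFFFFF"
    )

A measure-driven approach like this keeps the status-to-color mapping in one place, so every visual that references the measure stays consistent.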
During a recent project review, you presented a report you had carefully designed to the Adventure Works marketing team. The presentation went smoothly, engaging the audience with crucial data insights. However, Renee, the marketing director, noticed that the visual elements of the report didn't align with the company's brand colors and style guide, and she asked you to update the design elements of the report to reflect the company's brand aesthetics. As you started selecting each individual item and manually adjusting its colors, it was clear this would be a tedious, time-consuming task. Luckily, your manager stepped in, demonstrating how themes in Microsoft Power BI could simplify the task at hand and save you a lot of time and effort. In this video, you will learn more about themes in Power BI and how to work with them in your reports.

Themes in Power BI are predefined sets of colors, fonts, and visual styles that you can apply to your reports easily and quickly. They ensure visual consistency across different reports and can save significant time that would otherwise be spent customizing individual items. You can customize themes to align with company color schemes and design guidelines, which can help enforce a strong brand identity in your reports and create a more impactful and professional appearance.

Using themes in Power BI can also enhance accessibility in a variety of ways. Power BI offers theme customization options you can use to cater to specific accessibility needs, such as high contrast themes for users with visual impairments. You can also enhance readability by using themes that employ distinct and consistent colors, assisting users in differentiating between various data points and categories. Plus, Power BI provides built-in themes to help make your reports more accessible, for example, themes with colors that are easy to distinguish and visible to colorblind users; this can broaden the accessibility of your reports to a more diverse audience. Not to mention, a well-designed theme ensures that reports are user-friendly and easier to interpret.

Let's explore how to apply themes in Power BI. You can choose report themes by going to the view ribbon: in the themes section, select the drop-down arrow and then select the theme you want to apply to your report. These themes are similar to the themes in other Microsoft products, such as Microsoft PowerPoint. Here you can also find accessible themes, which you can utilize to create accessible reports. Select a theme to apply it to your report instantly. If you would like to change the appearance of your Power BI reports in the future, changing the theme allows you to update all your visuals at once.

For more options, you can browse the collection of themes created by members of the Power BI community by selecting theme gallery from the themes drop-down menu; this opens the themes gallery in your browser. In the gallery, you can select any theme, then scroll down and download the JSON file for the theme. To install the downloaded file, select browse for themes from the themes drop-down menu, go to the location where you downloaded the JSON file, and select it to import it into Power BI Desktop as a new theme; it will instantly apply to your current report.

You can also customize a theme directly in Power BI Desktop. To do this, select a theme that is close to what you'd like, then, from the view ribbon, select the themes drop-down button and select customize current theme. A dialog appears where you can make changes to the current theme and save your settings as a new theme. The customizable theme settings fall into various categories: you can name your custom theme; define color settings; customize text settings such as font family, size, and color; adjust visual settings covering background, border, header, and tooltips; adjust page elements like wallpaper and background; and configure filter pane settings, including background color, transparency, font and icon color and size, and filter cards. After you make your desired changes, select apply to save your theme. You can now use the theme in your current report, and it will also be available in the custom themes section of the themes drop-down menu.

In this video, you learned about themes in Power BI. Using themes can significantly enhance the efficiency, consistency, and accessibility of your reports, enabling you to effortlessly maintain a uniform look that aligns with brand guidelines. Learning how to use and customize themes is an essential skill that will help you quickly create visually appealing, easy-to-understand, professional reports.
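Under the hood, a report theme is just a JSON file like the ones downloaded from the gallery. A minimal hand-written sketch (the name and colors are illustrative, and the full theme schema supports many more settings than shown here):

    {
      "name": "Adventure Works Brand (illustrative)",
      "dataColors": [ "#1F4E79", "#2E7D32", "#F9A825", "#C62828" ],
      "background": "#FFFFFF",
      "foreground": "#252423",
      "tableAccent": "#1F4E79"
    }

Saving this as a .json file and importing it via browse for themes applies the palette to every visual in the report at once.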
You need to present this quarter's sales data to the Adventure Works management team. The data you're dealing with is multifaceted, including information like product categories, regions, stores, periods, and various performance metrics like total sales, average sales, and profit margin. You include various charts and graphs that visually represent the overall sales trends, regional performance, and product category performance in a dashboard for management. However, the team also wants more granular and contextual information, like store-specific performance and the performance of individual products within categories. Due to the dashboard's high-level design, displaying all these detailed data points could clutter the dashboard and overwhelm users. You can use Power BI's tooltip feature to deal with this. In this video, you will learn how this feature can improve the accessibility of your Power BI reports and how to add custom tooltips.

You learned that tooltips in Power BI display additional information about the data in your visuals when users hover over different data points. You can create custom tooltips by adding extra items to the tooltips field well for a visual, tailoring the content to the needs of your report users.

Tooltips can contribute to improved accessibility of Power BI reports and dashboards in various ways. Tooltips allow you to provide an extra layer of detailed information without cluttering the dashboard; for example, hovering over a specific region in a regional performance chart could show the top- and bottom-performing stores within that region. This can make complex charts and graphs more accessible to all users, including those with cognitive disabilities. You can customize tooltips to provide context-specific details; for instance, when a user hovers over a bar representing a product category in a bar chart, the tooltip can display the top three best-selling products within that category. For visually impaired users, descriptive tooltips can provide crucial information that might not be readily accessible from the visualization: screen readers can read out tooltips, making the data more understandable for those with visual impairments, and tooltips are included in the show data table option for every visual. Tooltips can also support users who find it challenging to distinguish between different segments or lines in a chart based on color, such as colorblind users; detailed tooltips provide the necessary information when these users hover over parts of the visualization, even if they cannot visually distinguish between the colors. Users can also discover new insights and patterns with tooltips, which may help users who need additional support to interpret the visualizations and ensure the insights are clear. You can additionally use tooltips to explain or define the metrics and measures used in the visualizations, enhancing users' understanding of the data. A further benefit of interactive features like tooltips is that they make the data exploration process more engaging, increasing user engagement. Lastly, tooltips help maintain a clean, minimalist dashboard design: by minimizing visual distractions, they ensure you don't overwhelm the dashboard with additional details, allowing users to focus on high-level trends and patterns and explore details when necessary, aiding their overall comprehension of relevant insights.

Now that you know more about tooltips and how they can support report accessibility, let's explore how to configure and customize them in Power BI. If you hover over this ribbon chart, Power BI displays a tooltip that contains contextual information useful for understanding the visual.
Now that you know more about tooltips and how they support report accessibility, let's explore how to configure and customize them in Power BI. If you hover over this ribbon chart, Power BI displays a tooltip containing contextual information useful for understanding the visual. For example, hovering over a faded area shows various performance indicators for the Europe sales region, such as monthly order totals and rankings, while hovering over the solid color shows the month, the region name, and the sum of order total.

You can customize this tooltip. Say some stakeholders want additional information about order quantity and product stock. Select the visual, open the Visualizations pane, and scroll to the tooltips field well. Drag Order Quantity from the Data pane into this well; Power BI automatically converts it to Sum of Order Quantity. You can further customize a tooltip by selecting an aggregation function: select the arrow beside the field in the tooltips well, then choose from the available options, such as sum, average, minimum, maximum, and others, as your requirements dictate. Repeat the process for Product Stock. Once these fields are added to the tooltips well, hovering over the same data point also displays values for Sum of Order Quantity and Sum of Product Stock, and you can change the order of the fields in the tooltip by dragging them within the tooltips field well. Ultimately, tooltips help you add extra detail without cluttering your dashboards and reports, improving clarity and data comprehension and ensuring that all users, including those with cognitive disabilities or visual impairments, can access vital information.

The sales team at Adventure Works wants a comprehensive overview of their bicycle sales performance, from overall company performance down to specific product models and individual sales representatives. Setting up a hierarchy in a Power BI data model is a neat way to organize and explore related data from a general view down to specific details, so that users can easily explore data at various levels of detail in your reports. Data hierarchies organize and structure your report data and visuals by grouping related data items through hierarchical relationships. You do not need to organize your data in Power BI using hierarchies, but doing so can make it easier for users to understand the data and the connections between its components. Hierarchies also support data exploration, making it possible for users to navigate from high-level overviews to more detailed information, and they enable drill mode in your visuals, empowering users to drill down into detail within the same visualization or report. For example, Power BI automatically creates a date hierarchy when importing date columns, arranging dates from more general to more specific: year, quarter, month, and day. In a data set with time-based sales data, a hierarchy like this lets users explore sales totals from a broader point of view, such as yearly sales, down to a more detailed one, such as sales on a particular day.
Let's explore hierarchies further using an Adventure Works data set containing sales records. You can create a hierarchy by organizing the data points into a structured framework that starts with bikes as the main category, breaks down into subcategories, and breaks down further into specific product names. This way, stakeholders can understand overall bike sales at a glance and explore the data at a more detailed level, such as the sales performance of mountain bikes versus road bikes, or of individual products. Similarly, for a data set containing geographical sales data, you could structure the data according to the hierarchy continent, country, city, area, letting report users drill down through the data by geographic level, from exploring global trends to examining local successes or difficulties.

So how can you create hierarchies like these in Power BI? Start by importing your data set, in this case the Adventure Works sales data set, into a blank Power BI report. You don't need to transform any data, so select the sales table followed by the Load button. If you open the Data pane, you will notice that Power BI has automatically created a hierarchy for every date field, such as estimated delivery date and order date. For example, expanding Order Date and then Date Hierarchy shows the dates organized by year, quarter, month, and day.

To create a hierarchy of your own for product-related data, use the product category, product subcategory, color, and product name fields. Picture how this hierarchy should be constructed: product category is the overarching level at the top. Right-click the product category field in the Data pane and select Create hierarchy from the context menu. This immediately creates a new item in the Data pane called Product Category Hierarchy; if you expand it, the product category field is nested inside. To add more fields, right-click a field, for example product subcategory, select Add to hierarchy from the context menu, and then select the newly created Product Category Hierarchy. Following the same process, add the product color and product name fields. You can remove any field from the hierarchy by right-clicking it and selecting Delete from model.

You can instantly add a table visual to your report area by checking the checkbox beside the hierarchy in the Data pane and resizing the visual as needed. Alternatively, create a visual first and then apply the hierarchy to it: select the treemap visual from the Visualizations pane, resize it as needed, and, while keeping it selected, mark the checkbox of the Product Category Hierarchy in the Data pane, then select the Order Quantity field. The treemap is instantly ready with drill-down mode, and you can dig into as many levels of data as you want. Turn drill-down mode on by selecting the down arrow in the top right corner of the visual to make the report interactive.

Understanding report hierarchy enables you to organize data for yourself and for the stakeholders working with the report you're creating. Hierarchies clarify how different data fields relate, making the data less confusing and more user-friendly. With hierarchies, users can start with the bigger picture and smoothly zoom into different levels of detail as needed, empowering them to make a range of informed decisions.
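Measures can also react to the hierarchy level a user has drilled to. The sketch below is a hypothetical addition, not part of the walkthrough above: it assumes the hierarchy's fields live in a Product table (your column names may differ) and uses DAX's ISINSCOPE and SELECTEDVALUE functions to label whichever level is currently displayed.

    Drill Level Label =
    -- Returns a label for the deepest hierarchy level in scope.
    SWITCH (
        TRUE (),
        ISINSCOPE ( 'Product'[Product Name] ),
            "Product: " & SELECTEDVALUE ( 'Product'[Product Name] ),
        ISINSCOPE ( 'Product'[Product Subcategory] ),
            "Subcategory: " & SELECTEDVALUE ( 'Product'[Product Subcategory] ),
        ISINSCOPE ( 'Product'[Product Category] ),
            "Category: " & SELECTEDVALUE ( 'Product'[Product Category] ),
        "All products"
    )

Placed in a tooltip or card, a measure like this tells readers where they are in the drill path as they move up and down the hierarchy.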
Imagine you are asked to design an interactive visual for a report that displays crucial information while allowing users to delve into any chart element and engage more deeply with the associated data points. Users should have the flexibility to navigate through multiple layers and return to the main report as needed. While drill down only lets users move from a broader to a more detailed level within the same visualization, Power BI's drill through feature lets users navigate from a visualization to a separate, detailed report page focused on the selected data point. Let's configure the drill through feature in a Power BI report for Adventure Works.

Start with a pie chart displaying total sales figures by month; this visual gives stakeholders a way to compare monthly order totals at a glance. Suppose you want to direct users who require more detail about sales performance to a separate page that displays the sales data broken down by region and order status. Add a new page to your report by selecting the plus icon at the bottom, double-click the new page title, and type Regional Sales. Add a table visual to the page and resize it accordingly, then select Month from the order date hierarchy along with Order Quantity, Order Status, and Product Region; the table now displays all of this data at once. So how can you have users land on this new page? Because the pie chart displays total sales by month, you can link the table to the chart using the shared month field: while keeping the table selected, drag the Month field from the order date hierarchy into the drill through field well. Notice that a back button is automatically added above the table visual; you can press the Ctrl key and select this button to return to the main report. Back on page one, right-clicking any slice of the pie chart, for example April, reveals a new item in the context menu called Drill through. Select Regional Sales, and the table now shows only the sales records for April. Likewise, right-clicking the March slice and selecting Drill through, followed by Regional Sales, shows the regional sales table for March's sales data only.

Suppose some stakeholders also want insights into the performance of different categories of bikes. Let's create a new page that displays the data by bike categories sold in each month and link it to the main chart using drill through. Add a new page and rename it Bike Categories. Select a card visual, resize it as needed, and drag Month from the order date hierarchy in the Data pane into its fields well. Next, add a multi-row card, resize it, and select the Order Quantity and Product Category fields in the Data pane. Drag the Month field into the drill through well to link the new page to the main chart. Returning to the main page, if you select any slice, for example March, there are now two items available under the Drill through menu; selecting Bike Categories takes you to that page with data shown for March only. You can add as many pages as you need and link them to other report pages using drill through. This feature is essential for professional, real-life business data visualization, enabling you to create multi-page reports with easy navigation that let users dive deeper into the data as needed without sacrificing clarity in reporting and visualization.
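A small usability touch for drill-through pages is a title that states which selection the user drilled in on. As a hedged sketch, assuming the drill-through field is a Month column in a Date table (names invented for illustration), a measure like the one below can be bound to a text element's title through the conditional-formatting (fx) option:

    Regional Sales Title =
    -- SELECTEDVALUE returns the drilled-through month, or blank if
    -- none is selected; COALESCE substitutes a fallback label.
    "Regional Sales - " & COALESCE ( SELECTEDVALUE ( 'Date'[Month] ), "All Months" )

With this in place, landing on the page after drilling through on March would display "Regional Sales - March", so users always know which slice of data they are looking at.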
Sorting and filtering functions can help users better understand the data presented in reports, highlight patterns and trends, and focus on the information that's relevant to them. With Power BI, you can sort the data in your report visuals by different data fields, in ascending or descending order. For example, in a report on sales performance, sorting a column chart depicting sales performance by region in ascending order makes it easier for stakeholders to identify the lowest- and highest-performing sales regions, whereas an unsorted visual can create confusion and make the chart unreadable and difficult to understand. Consider a line chart showing sales trends for the quarter: the chart is sorted by sales amount by default, so the months are not presented in logical chronological order. If you do not sort the visual by month, users might misunderstand or misinterpret sales performance over time, since at a glance it seems like sales are declining. When properly sorted by month, it is clear that sales increased in all three regions over time.

There are also many filtering options available when creating your reports. Filtering enables you to select specific data points or subsets of data as needed, ensuring the data presented is relevant and clear, and it is helpful for excluding certain values when representing your data with different visuals. For example, this report displays the combined total of orders from different sales regions, including all types of orders, even cancelled orders and those still being processed. You may want to use filtering to exclude these data fields: if you add an order status filter to show only the numbers for orders that have been shipped, the picture changes dramatically. By filtering out cancelled orders and orders still being processed, stakeholders can focus on completed orders and gain a better overall picture of actual sales performance in the different regions.

Now let's use these features in Power BI. You can sort any chart by data fields in a variety of orders depending on your needs: select the three dots in the top right corner of the visual, followed by your preferred sorting method. Some visuals, like this line chart, also give you the option to sort the legend, arranging the different categories presented in the legend in a particular order. Other visuals, like this pie chart, offer only Sort axis, which refers to sorting data points along the horizontal or vertical axes; from the axis you can select various data fields and then sort them in ascending or descending order. Let's sort the stacked column chart in the bottom left corner of the report by month. Currently it is sorted by order quantity in ascending order. Select the three dots in the top right corner of the chart, select Sort axis, then Month, followed by Sort ascending; the chart is now sorted by month in ascending order.

Beyond sorting, Power BI offers powerful filtering capabilities through the Filters pane, which you can use to apply different filters to the whole report page as well as to individual charts. Let's filter the line chart in this report to show the order total for shipped orders only. Notice the Filters on this visual section in the Filters pane; here you can select relevant fields and apply filtering.
For example, you can exclude Asia from this line chart by selecting the product region field and checking every region except Asia. The line chart updates instantly, displaying sales data for Europe and North America only. You can also add other filters, like order status: drag the Order Status field from the Data pane to the Add data fields here box, then check Shipped, and the line chart updates to display the order total for shipped orders only.

Instead of applying filters to visuals individually, you can apply filters to all chart items at once from the Filters pane. Deselect any chart item by selecting a blank area on the page and open the Filters pane if it isn't open yet. Notice the section called Filters on this page: this is where you can drag the relevant data fields and set filters for all visuals on the report page. Drag the Order Status field from the Data pane to this section and check Shipped; notice how all visuals on the page reflect this change instantly. If you have a multi-page report, you can apply filters to all pages by dragging any field to the Filters on all pages section of the Filters pane and then setting the filters. You can also remove a filter at any time by selecting the field you want to remove in the Filters pane, followed by the X icon in its top right corner. Sorting and filtering are fundamental to data analysis and reporting in Power BI: applying them to your visualizations makes it possible for stakeholders to focus on the vital, relevant data points, enabling faster data-driven decision-making.
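Report-level filters like the shipped-orders filter above can also be baked into a measure, so the logic travels with the data model rather than the page. This is a hedged sketch, assuming a Sales table with OrderStatus and OrderTotal columns (names invented for illustration):

    Shipped Order Total =
    -- CALCULATE overrides the filter context so that only rows
    -- with OrderStatus = "Shipped" contribute to the sum.
    CALCULATE (
        SUM ( Sales[OrderTotal] ),
        Sales[OrderStatus] = "Shipped"
    )

The trade-off: a Filters pane filter is visible and adjustable by report viewers, while a measure-embedded filter is fixed. Which to use depends on whether "shipped only" is a viewing preference or a business rule.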
Imagine you're presenting a report to key decision makers at Adventure Works. One visual displays sales across a quarter, while another shows product categories arranged in descending order by number of orders. The stakeholders request more interactivity in the report: by selecting a specific month on the sales chart, they want the corresponding product categories emphasized in the other chart, providing clarity on which products sold the most during a particular month. Power BI's cross-filter and cross-highlight functionalities make it possible to emphasize related data across multiple charts or to remove unrelated data. Cross-filtering refers to selecting an item or data point in one visual, which in turn filters out unrelated data in another visual; it creates a relationship between two separate visuals such that a selection in one affects the data shown in the other. For example, with cross-filtering, selecting the mountain bikes column in a report filters the table visual to display only sales data related to that product category; the other product categories are no longer shown. With cross-highlighting, selecting a data point in one visual highlights the related data in other visuals instead of filtering out unrelated data; this is the default behavior for most visuals in Power BI. To illustrate, with cross-highlighting, selecting the mountain bikes column in one chart highlights the sales of mountain bikes in February, March, and April for each region in the stacked bar chart; unlike cross-filtering, the unrelated data is still displayed, but dimmed or faded.

Let's explore these features in a report with four visuals displaying various sales data, starting with how default cross-highlighting works using the stacked bar chart in the top left corner. Selecting any region, for example Europe, highlights the bar related to Europe and dims the others, and all other charts instantly reflect your selection by highlighting related data: the bright areas represent data related to Europe, and the dim areas represent data from other regions. You can press the Shift key to select multiple regions, or even multiple units, in the stacked bar chart, and every time your selection changes, the other charts respond automatically by highlighting the related data. Note that the table visual behaves differently: rather than fading the irrelevant rows, it hides them based on your selection, which is cross-filtering. To clear your selection and return to the normal view, select the selected item again. Selecting data points on any chart cross-highlights the others instantly; for example, selecting mountain bikes in the stacked column chart in the top right corner makes the other charts respond. Just remember: cross-highlighting means irrelevant data remains visible but dimmed, while cross-filtering means irrelevant data is hidden.

You can change the default interaction behavior in Power BI reports from cross-highlighting to cross-filtering. Select the File menu, then Options and settings, and then Options to open the Options dialog box. Select Report settings in the left sidebar, check Change default visual interaction from cross highlighting to cross filtering in the visual options section, and select OK. Now if you select mountain bikes in the stacked column chart, notice how the stacked bar chart on the left reacts: it no longer shows dimmed areas and displays data related to mountain bikes only. In other words, cross-filtering hides all sales data unrelated to your selection in the other visual. Cross-filtering and cross-highlighting are powerful features that enhance the clarity and effectiveness of your reports: by enabling one chart to influence another, you offer a more interactive and intuitive experience for report users, making your report more dynamic and simplifying the data analysis process.
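Cross-highlighting effectively compares a selection against the whole. You can compute that same comparison explicitly with a measure that ignores incoming filters via ALL; this is an illustrative sketch, assuming a Region table and an existing Total Sales measure, neither named in the walkthrough above:

    Share of All Regions =
    -- The denominator removes any Region filters, so the ratio shows
    -- the highlighted slice as a share of the undimmed whole.
    DIVIDE (
        [Total Sales],
        CALCULATE ( [Total Sales], ALL ( 'Region' ) )
    )

Placed next to Total Sales in a table, a measure like this quantifies what the dimmed-versus-bright areas only suggest visually.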
As you create more interactive reports for your audience, filtering data becomes increasingly important. At Adventure Works, the CEO asks you to set up a sales report that she can use in a presentation to the company's shareholders next week. You want to make this report as useful as possible, but unfortunately her schedule is busy between now and the presentation. You know she will be filtering data but cannot predict every filter she will apply; however, she'll most likely filter the data by region and product. This is a perfect scenario for a slicer in Power BI.

A slicer is a great way to quickly apply common filters to a report page. When added to a report, a slicer displays a list of commonly used or most important filters, and it can be displayed in multiple formats depending on the field it filters. Applied to a field with a text data type, the slicer can display as a list of the unique entries in that field; applied to a date field, it can display as a date range selector. Whatever its format, the underlying behavior is the same: the slicer provides a list of filters that users can apply to the visualizations in the report, and when a filter is selected, the visualizations immediately update to reflect the filtered data. It is important to note that you do not need to connect every visualization in a report to the slicer; as a Power BI data analyst, you can configure which visualizations are affected by the slicer's selected filters. You can also synchronize multiple slicers, so that when one slicer applies a filter, other slicers on different pages update to reflect the selection. This is useful when filtering through multiple layers of data: if you had one region slicer on a sales page and another on a costs page, selecting a specific region would select it on both slicers, keeping filtering consistent as users navigate multiple pages of the report.

Now let's configure a slicer in an existing Adventure Works sales report with two pages, Sales Summary and Sales Detail. On the Sales Summary page, you need to apply two slicers, one for region and one for products. Start with the region slicer: navigate to the Visualizations pane and select the slicer icon, then select the slicer in the report, navigate to the Data pane, and select the Region field in the region table. Notice that the slicer now lists all of Adventure Works' sales regions. Selecting the entry for France in the slicer applies a filter for sales data belonging to France, and the visualizations update immediately. Next, add the slicer for products the same way: navigate to the Visualizations pane, select the slicer icon, select the slicer in the report, and this time select the Product field in the product table in the Data pane. The slicer now displays the list of all products. To confirm that each visualization is connected to the slicers, navigate to the Format option in the ribbon menu and select Edit interactions; each visualization shows a filter icon indicating that filters are being applied, and if you want to disconnect the slicer, select the None icon on the visualization.

Remember that you can synchronize slicers across pages to reflect the current filter context. Let's configure two slicers to synchronize with each other. First, create the same region slicer on the second page of the report by adding the slicer visualization and again applying the Region field from the Data pane. Next, navigate to the View menu and select Sync slicers to open the Sync slicers view. Select the region slicer in the report so it is displayed in the Sync slicers view, expand the Advanced options drop-down menu, and enter the name of a group you want this slicer to belong to; for this scenario, name the group region. There are two additional options here: Sync field changes to other slicers and Sync filter changes to other slicers. For this report, select both, so the slicers stay synchronized when viewers interact with them and, for maintainability, so that if you change the filtered field in the Data pane, both slicers will update.
Now select the region slicer on the first page and navigate to Advanced options again. Once more, enter the group name region. While you can enter any name for the group, you must name it consistently; if you misspell the group name on a slicer, it won't synchronize correctly. Again select Sync field changes to other slicers and Sync filter changes to other slicers. Now it's time to test the report: when you apply a filter using the region slicer, for example by selecting France, the visualizations on the first page update, and when you navigate to the second page, the region slicer there is already set to France and the data is filtered. Slicers are a dynamic tool you can use to enhance the interactivity of your reports while also improving the user experience. As you design reports for different audiences, it is essential to consider their filtering needs and identify common or important filters to apply.

The world of apps has rapidly expanded over the past decade, from apps on your mobile phone to apps in the web browser on your desktop. With people already familiar with the app experience, what if you could make your reports more app-like? This could improve the user experience for your target audience immensely and encourage them to interact with and use the reports you build. Power BI comes with a built-in set of buttons that you can add to your reports to increase interactivity, from navigating between pages to quickly applying filters, and they're invaluable in your toolkit for building interactive reports.

Buttons in Power BI come with many configurable options; the two most common configurations you will work with are the visual style and the action. You can change the visual style of buttons to different shapes, such as rounded rectangles, pill shapes, and arrows, and you can also change the colors of the buttons and their text. If the business you work for already has other applications, these options help you align with any existing app and user experience guidelines. The action of the button determines how it behaves when a user interacts with it. The available options are:

Back: returns users to the previous page of the report; this action is useful for drill-through pages.
Bookmark: presents the report page associated with a bookmark defined for the current report, letting users return to a captured state (you'll learn more about this later).
Drill through: navigates the user to a drill-through page filtered to their selection, without using bookmarks.
Page navigation: navigates the user to a different page within the report, also without using bookmarks.
Q&A: opens a Q&A Explorer window in which report readers can ask natural-language questions about the data.
Apply all slicers and Clear all slicers: apply or clear all the slicers on a page.
Web URL: opens a web page in a browser.

These buttons provide different means through which users can engage with your reports. Let's enhance the interactivity of a Power BI sales report with two pages, Sales Summary and Sales Detail; on the Sales Summary page, there are slicers available. Start by configuring buttons for page navigation: to add a button, navigate to the Insert tab in the ribbon, select the Buttons drop-down, and choose Right arrow.
Position the arrow in the top right corner of the report, then select the button in the report to open the Format pane, which lets you configure the button's different options. For now, expand the Action section in the Format pane. In the Action section, first select the off toggle so it changes to on, enabling the action; next, select Page navigation as the type and choose the Sales Detail page as the destination. Now navigate to the second page of the report, go to the Insert tab in the ribbon again, select the Buttons drop-down, and choose Left arrow. In the Action section, select Page navigation and then the Sales Summary page as the destination, and position the arrow in the top left corner of the report page. You can test the buttons by holding the Ctrl key while selecting them. Given that there are slicers on the Sales Summary page, you can ensure a good user experience by allowing the report viewer to clear the slicers quickly: navigate to the Insert tab in the ribbon, select the Buttons drop-down, and choose Clear all slicers, then position the button beside the slicers for ease of access. Now when a viewer applies a filter using the slicers, they can select the Clear all slicers button to reset the state of all the slicers. These simple changes help improve the user experience of the report. When building your next report, consider how you can use buttons to simplify navigation, add filtering, and provide access to the Q&A feature; as you progress with your learning, you'll find this feature particularly useful when building reports for mobile devices.

At the end of the last financial year, Adventure Works conducted a customer survey to determine how happy customers were with the way the company handled product orders and deliveries. Unfortunately, a common complaint was that it took too long for orders to arrive after being placed. To investigate the possible causes of this delay, you have created a report in Power BI that tracks data from different sources, including storefront orders, warehouse fulfillment, and courier delivery. Because you plan on sharing this report with multiple departments, you know each department will want to filter the data specifically to align with its responsibilities. Rather than expecting users to apply complex, unfamiliar filters to isolate the data they're looking for, your manager suggests using the bookmarks feature to make this data easily accessible to them.

Bookmarks in Power BI are a way to capture the current state of the report you are viewing and share this state with other viewers. For example, if you apply filters to a report, you can save the filtered state as a bookmark; viewers can then select the bookmark, and the report changes to the filtered state you established. When adding a bookmark, there are four state options you can save: data properties, such as filters and slicers; display properties, such as visualization highlighting and visibility; current page changes, which present the page that was visible when you added the bookmark; and whether the bookmark applies to all visuals or only selected visuals.
In the Adventure Works example, bookmarks will enable different users to focus on different parts of the data without setting up filters every time, and you can also highlight specific insights and create customized views relevant to the different departments. By default, all states are saved for all visuals. If you modify a report after creating a bookmark, any visualizations not present when you created the bookmark will appear in a default state, so remember: if you change a report, make sure to update your bookmarks to reflect the changes.

Given that bookmarks in Power BI are excellent for creating tailored, interactive reports that users can easily navigate and extract crucial insights from, it's essential to know how to create them. Start by filtering data in an existing sales report with two pages, Sales Summary and Sales Detail: filter data related to the France sales region by selecting France in the region slicer, then filter further by selecting the Mountain 200 Black 38 model in the product slicer. Now that the report is in a filtered state, create a bookmark by selecting View in the ribbon menu and then Bookmarks to open the Bookmarks pane, and select the Add button. This saves the state and creates a new bookmark with a default name. To rename the bookmark, select the three dots beside its name and select Rename; for this bookmark, enter France. If you don't want the bookmark to open the current page, select the three dots beside the bookmark again; note that Current page has a check mark beside it, indicating that it is enabled for the bookmark, and selecting Current page disables it. Now test the bookmark: clear all slicers so the report is reset, then open the Bookmarks pane again and select the bookmark, and you can observe the filters reapplied to the report. By capturing report states such as data and display properties, bookmarks empower you to streamline data exploration and tailor reports to user needs, letting different users filter and focus on specific aspects of the data easily. Bookmarks are also a valuable tool for enhancing interactivity and creating tailored, user-friendly reports that support data-driven decision-making.

Adventure Works has embraced the data-driven decision-making unlocked by Power BI. However, as you've continued building and updating various reports, you've identified a significant time cost in maintaining them: when you need to add new visualizations to the company's many reports, moving all the existing individual visualizations is very time-consuming. The lead data analyst suggests grouping the visualizations to make maintenance easier, so let's see how to group and layer visuals to improve maintainability. Start with an existing Adventure Works sales report containing four visualizations: sales revenue by region, sales revenue by month, sales units by region, and sales units by month. To make maintenance more manageable, create two groups, one for the sales revenue visualizations and one for the sales units visualizations. First, select the two sales revenue visualizations by holding down the Ctrl key while selecting them, then navigate to the Format tab in the ribbon menu and select Group. Next, select the two sales units visualizations by holding down the Ctrl key and selecting them, and again navigate to the Format tab and select Group. Notice that when you select and move the sales revenue by region visualization, the sales revenue by month visualization moves too; this is because they are grouped.
You can view all existing visualizations and groups using the Selection pane: navigate to the View tab in the ribbon menu and select the Selection button. The groups created here are listed under the Layer order tab in the Selection pane, with the visualizations belonging to each group nested inside it. To improve maintainability, rename the groups: double-click the first group's name and rename it Sales Revenue, then double-click the second group's name and rename it Sales Units. The ordering of groups and visualizations in the pane is important, as it determines how the elements are layered; for example, moving the Sales Revenue group to overlap the Sales Units group results in it displaying under the Sales Units group visually. To change the visual order, select the revenue group in the Selection pane and select the upward arrow so it moves above the units group in the layer order. Now suppose that, after reviewing the groups with a colleague, you conclude that managing the visualizations as a single group would be better. In the Selection pane, select and drag both sales units visualizations from the units group into the revenue group; notice that the units group is automatically removed once no visualizations belong to it. Let's also add a title to the report page, which is now more maintainable through its grouped visualizations and descriptive group name: select the Insert tab in the ribbon, followed by Text box, add the text Sales Detail, then select all the text in the text box and change the font size to 24. Finally, organize the layout of the report: select and drag one of the visualizations and the whole group moves. Move the group to the bottom of the report page, then move the report title to the top and adjust its sizing. As more pages are added to a report and future updates are made, organizing visualizations into groups saves time; grouping visualizations is a crucial activity for improving the maintainability of reports, so consider its benefits and how to implement groups effectively when designing reports in Power BI.

Data Analysis Expressions, or DAX, is a powerful language for creating custom calculations. However, DAX is context-sensitive, so it's important to understand how context influences the reports you build with it, and in particular how visualizations affect DAX context. Adventure Works is analyzing its total annual revenue and needs to identify total revenue by product category as part of its analysis; once the analysis is completed, the results must be delivered to management as a visual presentation. Adventure Works can use DAX filter context in visualizations to perform its analysis and create its reports. Let's begin with a recap of what we mean by the term context in Power BI. In data analysis, context comes in two primary forms: row context and filter context. Row context refers to the table's current row being evaluated within a calculation, whereas filter context refers to the filter constraints applied to the data before it's evaluated by the DAX expression. In other words, context determines which of your report's rows or subsets are included in or excluded from a calculation. The interaction between DAX evaluation context and visualization is crucial for creating dynamic, interactive reports and dashboards.
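To make the two kinds of context concrete, here is a minimal, hedged sketch; the Sales table and its OrderQuantity and UnitPrice column names are assumptions for illustration, not the course's data set.

    -- Row context: a calculated column is evaluated once per row of Sales,
    -- so each row multiplies its own quantity by its own unit price.
    Line Total = Sales[OrderQuantity] * Sales[UnitPrice]

    -- Filter context: a measure is evaluated under whatever filters the
    -- visual, slicers, and Filters pane currently apply to the model.
    Total Sales = SUM ( Sales[Line Total] )

The same Total Sales definition yields a different number in every cell of a matrix or chart, because each cell supplies its own filter context.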
Each time you interact with the data, such as selecting a portion of a chart or an item in a slicer, you alter the filter context. Let's consider an example of how this works. Adventure Works can create a DAX measure for profit margin and then build a visual in the report canvas from this measure; the visualization displays the profit margin of the entire data set, because that is the current context. To analyze its product categories, Adventure Works creates a DAX formula that calculates the sum of the quantity of each product sold multiplied by the unit price in the sales table. When executed, the formula computes the sum of all sales amounts: Adventure Works has sold $3.5 million worth of goods over the past year. However, when this measure is added to a Power BI report as a visual, like a bar chart, it isn't very engaging; it offers limited insight into the sales data by displaying only the total revenue. The visuals become more engaging and display meaningful insights when used with filter context. For example, Adventure Works could generate more useful insights by comparing total sales revenue across product categories; by comparing sales of bicycles to other categories, Adventure Works discovers that bicycles outsell all other products by a considerable amount. Adventure Works can still view the total revenue, but each revenue figure now has a meaning, namely the total revenue for each product category: Power BI displays the sum of all sales within a specific product category, computing different values for different cells because of the evaluation, or filter, context. Adventure Works can enhance these visuals further by using the year from the date table as another filter context; once this context is applied, a new visualization is generated in which each table cell shows a different value even though the formula is always the same. You can place multiple fields in both rows and columns, because both the row and column sections of the table define the context. As you discovered earlier, interacting with a visualization alters the filter context, which affects DAX calculations and changes the results in the visualizations. Now that Adventure Works has calculated its annual total sales, it creates two slicers in its report, one for region and one for month. When a specific region is selected, the profit margin measure recalculates and the chart adjusts dynamically; Adventure Works can also select a month to apply an additional filter on top of region, and the measure then displays the profit margin for a specific region in a specific month. The context-sensitive nature of DAX is a powerful feature, enabling dynamic calculations based on the context in which DAX computes the formula. By understanding how context impacts DAX, you can create more accurate, insightful, and dynamic reports tailored to specific business scenarios.
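The formulas described in this walkthrough might be written as follows. This is a hedged reconstruction: the Sales[OrderQuantity], Sales[UnitPrice], and Sales[TotalCost] column names are assumptions, since the course does not spell out the model's schema.

    -- Iterates the Sales table row by row (row context), then sums;
    -- each visual cell evaluates this under its own filter context.
    Total Sales = SUMX ( Sales, Sales[OrderQuantity] * Sales[UnitPrice] )

    -- Recalculates automatically as the region and month slicers
    -- change the filter context.
    Profit Margin = DIVIDE ( [Total Sales] - SUM ( Sales[TotalCost] ), [Total Sales] )

Dropped into a matrix by product category and year, these two measures would produce the category-by-category and year-by-year breakdowns the example describes, with no change to the formulas themselves.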
Congratulations on completing the navigation and accessibility module of the Data Analysis and Visualization with Power BI course. This module taught you essential skills for creating accessible, well-structured, and interactive reports. Let's recap what you accomplished. You started with how to design accessible reports, discovering the significance of accessibility and the many benefits of implementing accessibility features in Power BI, such as improving your reports' inclusivity, usability, and understandability. You learned about some of the Power BI features that support the accessibility of your reports, including keyboard navigation and tab order, screen reader compatibility, accessible themes and high-contrast support, focus mode and displaying data in a screen-reader-friendly table format, markers and pattern fills, and alt text, titles, and labels. You explored how to enhance accessibility by formatting and configuring your visualizations with these features, learning to design reports that a diverse audience can access and comprehend. Conditional formatting was a key focus, empowering you to apply dynamic rules to your visualizations that enhance their clarity and usability. You also engaged with themes in Power BI and the ways they can enhance the accessibility of your reports, such as improving readability, alongside other benefits like visual consistency, clarity, and brand identity; in the process, you learned how to apply, configure, and customize themes. To further guide your journey, you were introduced to best practices for designing accessible reports, and you then put your newfound knowledge of accessibility into action by applying formatting, themes, and design best practices to create an accessible report for Adventure Works. You went on to enhance the accessibility of your reports even further by adding custom tooltips to your visualizations, exploring the many ways tooltips improve accessibility, such as making the data more accessible to users with visual impairments, since tooltips are screen reader compatible, and making complex charts more understandable to users, including those with cognitive disabilities.

Next, you focused on report navigation and filtering. You began by comprehending the concept of report hierarchies and learned how to configure them effectively in your reports; these hierarchies empower users to drill down into your data as needed, encouraging user interaction and engagement and enhancing user understanding. You also learned how to configure Power BI's drill through feature, which empowers users to navigate from a visualization to a separate, detailed report page focused on the data point they select. Another key area of exploration was sorting and filtering data, which are fundamental to data analysis and reporting in Power BI; you gained proficiency in applying and managing these techniques to enhance data presentation and exploration and to highlight relevant insights. You were then introduced to cross-filtering and cross-highlighting, giving you the knowledge to configure interaction behaviors for visualizations and improve the interactivity of your reports: whereas cross-highlighting highlights the related data in other visuals when a user selects a data point in one visual, cross-filtering filters out, or removes, the unrelated data from the other visuals. You applied your skills by sorting and filtering marketing data in a report, emphasizing and contextualizing the importance of sorting and filtering in the real world.

After that, you took your Power BI reporting skills to the next level with an in-depth exploration of creating highly interactive reports. You discovered the dynamic nature of slicers and how they can contribute to enhanced
report interactivity. You explored using buttons to add more interactivity to your reports and learned how to customize them to suit your needs. You learned how to improve user experience and storytelling in your reports by adding bookmarks, as well as how to add URLs to enrich your Power BI reports further. Grouping and layering visuals provided a way to efficiently manage the visuals in your reports, making report maintenance more efficient. You put your skills into action by creating an interactive report, demonstrating your proficiency with the drill-through button, slicer, and bookmark features. Finally, you recapped the importance of filter context in DAX measures and how it impacts visualizations. Throughout this module, knowledge checks were strategically placed to assess your understanding of key concepts relating to designing accessible reports, navigating and filtering data effectively, and creating interactive reports. Keep up the excellent work, and get ready to explore designing accessible dashboards and data sharing, bringing you closer to becoming a proficient Power BI data analyst and visualization expert.

The marketing director at Adventure Works receives an overwhelming number of data reports: monthly sales numbers, customer demographics, market trends, and product performance metrics all need to be analyzed and interpreted, and she needs your help doing it. Luckily, you know about dashboards, a tool in Power BI that can help transform this data into valuable insights. But what is a dashboard, and how does it differ from a report? Let's explore the concept of dashboards in a business context, including their importance, their functionalities, and how they serve as key tools in data analysis and decision-making. Consider the dashboard of a car: it presents critical data like speed, fuel level, and engine temperature in a consolidated, visually understandable way, allowing you to make necessary decisions while driving. Similarly, in a business context, a dashboard visualizes the critical information required to accomplish specific objectives, skillfully arranged and consolidated on one screen. For example, a sales dashboard for Adventure Works might display total sales, sales by region, top-selling products, and trends over time. Dashboards can present data from different sources in various forms, making it easier for stakeholders to understand, and they are interactive and real-time, allowing users to, in essence, have a conversation with their data and drill down into specific details when needed. Say you notice an unusual sales spike in one region at Adventure Works: with an interactive dashboard, you can delve deeper into the data, inspecting the specifics of the sales transactions, identifying the products involved, and even finding the key customer demographics contributing to this sudden surge. Dashboards play an important role in today's competitive business world, where informed decision-making is vital to success: they transform raw data into actionable insights and provide a comprehensive view of business performance at a glance. Dashboards can also serve as an essential navigational tool for tracking various aspects of business performance. For Adventure Works, dashboards can bring together the different threads of data on sales trends, production efficiency, customer behavior, and market dynamics, presenting a comprehensive view of the overall health and trajectory of the business.
Suppose there's a sudden drop in sales in a specific sales region. Without a dashboard, recognizing this issue would require sifting through vast amounts of sales data, a time-consuming process with the potential for oversight; a well-designed dashboard, however, can quickly highlight this anomaly, triggering a timely investigation and corrective action. Dashboards also play a vital role in promoting a culture of transparency and accountability within an organization. They act as unbiased, data-backed mirrors that reflect the true performance of different business units against set targets and benchmarks, fostering a sense of ownership and accountability among team members and encouraging continuous improvement. Dashboards make data accessible to everyone, break down barriers, encourage data sharing between teams, and promote a shared understanding of business performance across departments.

But what is the difference between a dashboard and a report? Though often used interchangeably, dashboards and reports serve different purposes in Power BI. A report in Power BI is highly interactive: users can slice and dice the data, drill down into details, apply filters, and explore various facets of the data within the report itself. In essence, a Power BI report provides an in-depth, interactive, multi-perspective view of a specific data set or topic; it's like an exploratory journey through your data. A dashboard, on the other hand, is like a summary or highlight reel of one or more reports: a one-page overview of the most important metrics or KPIs selected from the various pages of one or more reports. A useful way to consider the difference is to compare a news bulletin with an in-depth news article: the news bulletin, or dashboard, provides key highlights summarizing the most essential points, and if a particular news point catches your attention, you can read the full news article, or report, for a more detailed understanding. As you continue your data analysis journey, remember that the true power of data lies not in its volume but in its usability. Both dashboards and reports are vital navigation tools in the sea of data: they provide visibility, drive accountability, facilitate understanding, and ultimately inform decision-making.

Adio, your manager at Adventure Works, asks you to create a dashboard in Power BI that highlights key performance indicators and insights from a sales analysis report you and your team created. Let's explore how to create and configure a dashboard in Power BI, as well as how to configure the mobile view for the dashboard and customize themes. You learned previously that a dashboard is a consolidated display of multiple visualizations, reports, and other data in a single layout. To create a dashboard, open the Power BI service and navigate to your workspace in the left navigation pane, then, from your available workspaces, select the Adventure Works workspace. Create a new canvas where you can pin your visuals: in the top left corner, select New and then Dashboard. A pop-up appears asking you to name your dashboard; name it Adventure Works Sales Dashboard and, after typing the name, select Create. Once you have created your dashboard, you can start adding visuals. Return to your workspace and open the sales report you and your team created; each visualization in your report has a pin icon in its top right corner. Select the pin icon for the total sales by product category bar chart.
This opens a dialog box where you can choose where to pin this visual; select your newly created Adventure Works Sales Dashboard from the drop-down menu. The bar chart is a good starting point for your dashboard, as it provides a broad overview of sales distribution by product category. Then pin the monthly sales trends line chart, which shows the sales pattern over time, critical for identifying seasonal trends or growth patterns.

In the modern business landscape, having mobile-accessible data is key. With Power BI's mobile layout feature, you can configure your Adventure Works sales dashboard to be mobile friendly, ensuring stakeholders can access insights on the go. To switch to mobile view, go to the main navigation bar, find and select the Edit menu, and from the drop-down options select Mobile layout. Once you select the mobile layout, your screen adjusts to replicate a mobile device's screen size: instead of a wide canvas, it displays a vertical layout. This canvas is blank, but don't worry, all your visuals are safe and where you left them; you just need to decide which visuals to show on the mobile layout and where to place them. A list of all the visualizations in your dashboard is displayed on the right side of your screen, each with a pin icon next to it. To select the visuals you'd like to appear in the mobile layout, select the relevant pin icons. You can select and drag each visualization to move it around the canvas, and you can resize each visualization by dragging its edges.

Finally, let's explore how to change the theme for the Adventure Works sales dashboard. Start by navigating to the Adventure Works Sales Dashboard you just created, then find and select the Edit menu in the upper menu to open a drop-down list of view options, and select Dashboard theme. Another drop-down list appears; select Switch theme. A pop-up window displays various pre-made themes you can apply to your dashboard. Choose the theme you feel best visually represents the data, select it, and then select Save. The theme is now applied to your dashboard, and you'll immediately observe the changes in color and style across all your visualizations. And there you have it: you now know how to create a dashboard, configure the mobile view, and customize your dashboard theme, foundational knowledge that is vital to using dashboards in Power BI and conveying key insights from your reports.

With its large scale of operation, Adventure Works generates immense data volumes daily. As a data analyst, your role involves harnessing this data, making sense of it, and transforming it into insights that inform strategic decision-making. But with such a large mass of data, where do you start? Power BI has the answer: its quick insights and Q&A features. Over the next few minutes, you'll discover how to optimize the usability of your Power BI dashboards by adding quick insights and utilizing the Q&A feature, and you'll learn how to set up quick insights and integrate the Q&A feature into your dashboards. Quick insights is a feature in Power BI that automatically searches data sets to discover and visualize potential insights, identifying patterns, trends, outliers, and other useful findings that may not be immediately obvious, for example uncovering sales patterns to help the marketing team at Adventure Works target their campaigns more effectively. Quick insights not only presents the insights in an easy-to-understand format but also explains
how it arrived at these insights; this way, even if you're new to data analysis, you can follow along and gain a solid understanding of the data. Let's explore the steps to set up and use the quick insights feature. Open the Power BI service and navigate to your workspace on the left-hand side of the screen, where the different data sets and reports shared with you are displayed. Select the data set or report you want to analyze, then open the ellipsis menu and select Get quick insights to initiate the automated analysis. Power BI starts an automatic scan of your data, during which it applies various machine learning algorithms and statistical functions to your data set, searching for potential patterns, trends, correlations, outliers, and other interesting attributes; this process can take a few minutes depending on the size and complexity of your data set. After the scan, access the insights by selecting View insights, which leads to a new page filled with cards, each visually representing a particular pattern or trend in your data. Hover over the visuals or select them to display more details. This is where your data interpretation skills come into play: you have to understand what each of these visuals represents and how it relates to the Adventure Works business context. If you find an insight particularly useful or wish to share it with others on your team, you can pin it to a dashboard: hover over the card, select the pin icon in its top right corner, and then select the dashboard you want to pin it to, or create a new one.

Now let's move on to the Q&A feature, a natural language processing tool in Power BI. It allows you to ask questions about your data in plain English and provides answers in the form of charts, graphs, or simple numeric results. This feature is invaluable in a business context because it allows users of all levels to interact with their data and find specific answers without requiring deep technical knowledge. The key advantage of the Q&A feature is its flexibility: you can ask questions ranging from simple ones like "What was the total revenue last quarter?" to more complex ones such as "Which product had the highest sales growth rate last year?" The more you use the Q&A feature, the more it learns and adapts to your question style, offering even more relevant and precise answers over time. To set up and use Q&A, look at the top of your dashboard for the field labeled Ask a question about your data; this is the Q&A box. Place your cursor in the box and type your question in normal conversational language. As you type, Power BI Q&A starts offering suggestions and autocomplete options based on the data in your dashboard; for instance, if you're interested in sales trends, you could type "What were the total sales last month?" or "Show sales by product category." As soon as you finish typing your question, Power BI Q&A generates an answer in the form of a data visual, such as a bar chart, line graph, or table, based on the best interpretation Q&A can make of your question; if the interpretation is not what you intended, you can rephrase or refine your question. The Power BI Q&A tool uses machine learning, so it becomes smarter and more accurate the more you interact with it. If the visual answer to your question is particularly useful and you want to keep it handy, you can pin it to your dashboard: locate and select the pin
icon at the top right of the visual choose the existing dashboard where you want to pin it or create a new one with quick insights and Q&A you are well equipped to bridge the gap between data and decision-making these features simplify complex data analysis enabling you to deliver actionable insights faster and more accurately imagine you’ve prepared stunning visuals in Microsoft PowerBI for Renee Gonzalez the marketing director at Adventure Works showcasing sales trends across different product categories you’ve pinned these visuals to your dashboard for easy reference but as you start digging deeper into the data exploring trends and cross-filtering data you come across a snag the pinned visuals are static snapshots they don’t interact or update you realize you’ve hit a roadblock that prevents you from extracting the full potential of your data analysis frustrating right you’re not alone as that’s a common issue with pinned visuals in PowerBI in this video you’ll explore the limitations of pinned visuals in PowerBI and how to overcome these limitations by setting up and pinning live reports to your PowerBI dashboard in PowerBI a pinned visual is a snapshot of a specific piece of data or chart from a report that is attached or pinned to a dashboard you can pin various things like a line chart showing sales trends over time a bar chart comparing the performance of different product lines a gauge displaying progress towards a goal or even a simple card displaying a single important number like total sales or total customers pinned visuals provide an at-a-glance overview of specific insights however they have certain limitations
the main limitation is their lack of interactivity you can’t cross filter or drill through data using pinned visuals which prevents you from exploring data trends in greater detail for example imagine Renee is studying a pinned visual showcasing sales trends for different bicycle product categories as she scans the data she wants to filter it by region to understand which categories are more popular in certain regions this could provide valuable insights for regional marketing strategies however the static nature of pinned visuals prevents her from cross-filtering or drilling through the data leading to incomplete insights and potentially missed opportunities for data-driven strategies so is there a way around these limitations absolutely the solution lies in pinning live reports to your dashboard instead pinning a live report means attaching an entire report page to your dashboard as a live tile unlike standard visuals pinned to a dashboard live report tiles are dynamic and maintain the interactivity of the original report this includes the ability to drill through data cross filter and view tooltips which provides a more immersive data exploration experience directly from the dashboard pinned live reports retain the original report layout and formatting making the visuals aesthetically consistent the interaction between visuals within live reports reveals relationships and patterns that isolated visuals cannot while pinned visuals offer a quick view of specific data points pinning live reports significantly enhances data exploration and analysis capabilities providing a comprehensive interactive view of your data now let’s explore how to set up and pin live reports the first step is to select the report you want to pin to your dashboard if you’re starting from scratch you will need to create a new report once you have opened your report select the reading view button on the ribbon directly above your report then select the ellipses on the far right of the ribbon followed by pin to dashboard from the drop-down menu the pin live page feature lets you pin an entire report page as a live tile on the dashboard this means the tile will continually update and allow interaction something a simple pinned visual cannot do a dialogue box asks you to choose a destination for your pinned live report you can select an existing dashboard or create a new one by typing a new name into the text box after you’ve selected the destination select the pin live button in the bottom right corner to pin your live report to the selected dashboard to view your newly pinned live report navigate to your chosen dashboard by selecting the workspaces button on the left-hand navigation bar and selecting the dashboard where you pinned the live report now a live interactive report is directly accessible from your dashboard it retains all its interactive capabilities in the report view allowing you to filter and drill down into the data directly from the dashboard any changes you make to the original report will reflect in the live report on your dashboard ensuring real-time data updates by using live reports you not only enrich your data storytelling but also create opportunities for deeper more insightful analysis pinning live reports to your dashboard can help you turn static one-dimensional visuals into dynamic insightful narratives your manager Adio asked you to create a comprehensive report on the sales of Adventure Works product lines across different regions you have cleaned and analyzed the data and created a
final report that is visually appealing and informative now you need to share the data and insights contained in the report with key decision makers in Adventure Works this is where Microsoft’s PowerBI publishing reports feature comes into play over the next few minutes you’ll discover the process of publishing reports in PowerBI let’s start by exploring what publishing reports in PowerBI means when you publish a report you move it from your local PowerBI desktop and upload it to the more accessible and collaborative online platform PowerBI service publishing a report connects you with decision makers allowing you to share your reports with colleagues your whole organization or external stakeholders who need to draw insights from the data in data analysis the purpose of creating reports is to assist with decision-making guide strategies and provide insights into business operations and for that to happen you need to publish and share the reports for example you can publish and share your report with the regional sales managers at Adventure Works this enables them to access the report through the PowerBI service where they can identify bestselling and underperforming products analyze sales patterns such as seasonal trends and then plan and focus marketing efforts accordingly furthermore a published report is not static you can set up automatic data refreshes so the report is always up to date with the latest data let’s explore how to publish reports in PowerBI publishing a report to PowerBI service from PowerBI desktop involves a series of steps let’s work through these steps the first step is to save the report since PowerBI will not allow you to publish unsaved reports select file in the top left corner of the PowerBI desktop interface and then save as to save the report choose a location on your computer and give it a descriptive name like Adventure Works product sales report select save once you’ve saved the report the publish option becomes available in the home tab of the ribbon of PowerBI desktop select publish and a new dialogue box pops up in this dialogue box indicate where you want to save the report in PowerBI service select Adventure Works as your workspace and then the select button for larger projects or collaborations you can create and select different workspaces once you’ve selected the destination PowerBI starts publishing the report a loading dialogue appears indicating that the report is being published depending on the size of the report and your internet connection this could take a few moments once your report is published a new window pops up to confirm it says success and gives you two options you can either open the report in PowerBI service or you can cancel and open it later in this case let’s select open selecting open launches the default web browser on your computer and takes you directly to your report in PowerBI service the report now displays as it will appear to other users while data analysis is about facts and numbers it’s also about communication publishing reports in PowerBI is a crucial part of the data analysis storytelling process as a data analyst your reports are pivotal in driving data-informed decisions and a vital link in the chain of business intelligence as a data analyst at Adventure Works you are tasked with reviewing and sharing sales data since Adventure Works is a multinational company the final report contains large amounts of information which you need to present in a format that is more manageable for stakeholders microsoft PowerBI
allows you to paginate and export reports as a result you can break down complex sets of results into smaller more digestible parts and share them easily in this video you will learn how to create multiple pages of content in a PowerBI report and navigate between them you will also learn how to export these pages to a PDF file in PowerBI you can organize and present your data across multiple pages within a single report which is known as pagination a page in a PowerBI report is like a page in a book pages make it easier for the reader to navigate and understand the content for example if you have a large data set with numerous visuals presenting all of them on a single page can make the report difficult to read and interpret by dividing your report content into multiple pages you make your report more organized and easier to navigate let’s discover how to configure pagination and export reports in PowerBI desktop with PowerBI desktop open navigate to the file menu located in the top left corner of the application’s home screen once you select file a side menu appears select open report and then select browse reports to open a dialogue box navigate to the location on your computer where your PowerBI report file is stored select the file and then open to load the report now that your report is loaded you need to make sure you’re in the right view to paginate your report a vertical pane on the left of the screen contains the three PowerBI views report data and model select report and this choice is now highlighted at the bottom left of the report view screen there is a tab with the name current page to add a new page select the plus sign which is the new page option to rename this page appropriately to represent the data it contains right click on the page name and select rename page you can then move visuals and report elements by cutting and pasting them from your main report to these newly created pages you can navigate between pages by selecting the tabs this allows you to organize the data in your report and makes it easier to review and understand if you need to present this report in a meeting or share it with colleagues who don’t use PowerBI you can export it to a PDF format select file in the top left corner of your PowerBI desktop screen on the menu that opens select the export option a side menu opens with the different export formats available select the to PDF option to begin the process of exporting your PowerBI report as a PDF document depending on the complexity and size of your report this may take a few seconds to a few minutes once the export is completed the PDF file will open automatically to display the result creating multiple pages and exporting to PDF can help you to produce effective PowerBI reports pagination and exporting in PowerBI help you break down and categorize data clearly to enhance understanding and easily share insights that can drive informed decisions you’ve spent hours working on a sales report for the management team at Adventure Works and are confident that it will not only meet but exceed their expectations the feedback unfortunately is not about the insights your report offers it’s about the loading time your sales stats visuals load at a sluggish pace causing the stakeholders to become impatient despite your effort in creating the report its slow loading time overshadows its merits sounds like a nightmare right but it doesn’t have to be this is where Microsoft PowerBI’s performance analyzer comes into the picture over the next few minutes you’ll
learn about the vital role of PowerBI’s performance analyzer in optimizing the performance of your reports by the end of this video you will understand why it’s important to measure current performance before implementing changes using the performance analyzer so let’s get started the performance analyzer a tool in PowerBI is designed to help you understand the load time for each visual element in your report this functionality is crucial in scenarios where a report has various visuals filters and calculations each of which can potentially impact the overall performance of the report it is critical to measure current performance before making changes to a report in data analysis just as you wouldn’t make business decisions without first analyzing relevant data you shouldn’t implement changes to your PowerBI report without understanding the current performance situation and identifying any problem areas with insights from the performance analyzer you can take targeted actions improve the performance of the lagging visuals and transform your report into a fast loading efficient tool the performance analyzer doesn’t just highlight what’s wrong it also shows you what’s right not all visuals or filters in your report will be problematic many of them might be well optimized and load swiftly recognizing these efficient components allows you to learn from them and apply those best practices to other reports or visuals now let’s dive into the interface and discover how to activate the performance analyzer in PowerBI desktop after your report is open and loaded select the view tab find and select the performance analyzer option at the top middle of the screen a new pane titled performance analyzer opens on the right side of your screen displaying buttons for starting and stopping recording refreshing visuals and exporting data the performance analyzer pane has a button labeled start recording to begin gathering performance data for your report select this button once activated the performance analyzer starts monitoring any actions taken on the report capturing useful performance metrics for each visual element on the page now that the recording has started you need to generate the actions you want to analyze this could involve refreshing a report page to load all the visuals or navigating through different report pages if it spans multiple pages you can manually refresh the page by selecting the refresh visuals button in the performance analyzer pane this action causes PowerBI to reload all visuals on the page and the performance analyzer records the performance data for each visual during this process the performance data displays in a list in the performance analyzer pane with each visual on a separate row this list contains information such as the name of the visual the time it took for the visual to render the time it took to run the DAX query for the visual and more this information can help you understand how long it takes for each visual to load and render and identify any potential bottlenecks in your report expanding the row by selecting the plus icon reveals more granular details about the performance of that visual this includes a breakdown of the time it took for each operation such as the DAX query execution visual display rendering and any other operations the actual DAX query that was run and more the performance analyzer lists visuals in the order they were rendered on the page by default however this order may not always be the most useful when diagnosing performance issues you
can reorder the list by selecting the duration column header this sorts the visuals by the time taken to render allowing you to quickly identify which visuals are taking the longest to render and could be potential targets for optimization once you’ve gathered the performance data you need you can stop the performance analyzer recording select the stop button in the performance analyzer pane to conclude the data capture you can always start a new recording session by clicking the start recording button again as a data analyst your task isn’t just to ensure that your reports are accurate or comprehensive but also that they’re efficient a well optimized report can mean the difference between insights that sit on a virtual shelf gathering dust and insights that spark change and propel a business forward in the world of data speed isn’t just a convenience it can enhance the impact of your reports lead to better decision-making and drive business success imagine you’re a data analyst at Adventure Works working through streams of data finding patterns making connections and uncovering insights that could improve business performance you’re in the middle of an exciting project where you’ve created a new complex DAX query to analyze sales performance and uncover trends but as you load your PowerBI report you’re not met with a rush of insights but rather a slow loading screen that seems to drag on forever this isn’t just frustrating it’s a barrier between you and the crucial insights needed to drive Adventure Works forward as these performance issues make your data exploration and analysis frustratingly slow you remember a helpful tool the performance analyzer in this video you’ll discover the role of the performance analyzer tool in diagnosing and resolving DAX performance issues you’ll become familiar with the process of identifying if a DAX query is causing a delay and learn how to optimize it for improved performance at the heart of PowerBI’s data modeling is DAX or data analysis expressions as you may recall DAX encompasses a wide range of functions operators and constants that you can combine to create different formulas and expressions the power of DAX lies in its flexibility with DAX you can build custom calculations within data models thereby allowing you to analyze data in unique and powerful ways however just like a powerful vehicle it requires skill and care to operate effectively and efficiently while DAX has immense analytical power it can sometimes run into performance issues these issues arise when the DAX queries that are created based on your formulas and visual configurations become complex making the engine work harder and longer to return the results for example suppose you are dealing with large Adventure Works sales tables that need to be sifted through your DAX formulas might be complex and inefficient or you might have a data model that’s been improperly structured regardless of the case these issues can lead to slow report loading times sluggish interactions and an overall frustrating user experience to help identify and resolve these performance issues PowerBI has a built-in tool called the performance analyzer this tool provides detailed timing breakdowns on all the various components and processes that occur when your report is refreshed it helps you spot which visuals fields or DAX calculations are taking up the most time and hence slowing your report down let’s explore how to identify and resolve DAX query performance issues using the performance analyzer once you’ve
loaded your PowerBI sales report you first need to open the performance analyzer on the ribbon interface at the top of your PowerBI report locate and select the view tab within the view tab find and select the performance analyzer option in the performance analyzer pane locate and select the start recording button now it’s time to refresh your report you can accomplish this in two ways either by selecting the refresh button situated in the home tab of the ribbon interface or by directly interacting with the report interactions could be in the form of changing a filter selecting a slicer or simply navigating to a different page of the report as you interact with the report while the performance analyzer is recording it will track and document the time taken to load each individual visual item this data is crucial for diagnosing performance issues once the report has finished refreshing review the performance analyzer pane you’ll see a list of all the visual items in your report and their respective load times pay special attention to any visual items that take a significantly longer time to load compared to others for the visuals with slower load times you can drill down into the details by selecting the arrow beside the visuals’ names this will provide a detailed breakdown of the DAX query time and the visual rendering time helping you understand where the bottleneck lies if the DAX query time is high then your effort should be directed towards optimizing the DAX measures in this case it appears that the average sales by product category is slowing down the report performance as it has a considerably larger DAX loading time locate the average sales field from the data view on your right and select it to view the underlying DAX formula the FILTER and ALL functions used in this formula iterate over the entire data table to calculate the average sales for each product across all stores this operation becomes particularly slow when working with larger data sets to simplify the DAX formula eliminate the FILTER and ALL functions and instead use the AVERAGEX function AVERAGEX evaluates an expression for each row of a table and then returns the average result however since it operates directly on the data context which is already filtered based on the report’s current context it avoids the need to iterate over the entire data table finally rerun the performance analyzer to test if the optimization was successful the advantage of applying an optimized formula is that it simplifies the calculations and reduces the computational load by avoiding the iteration over the whole data table it leads to a significant speed up in query execution
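To make this concrete, here is a minimal before-and-after sketch of the kind of rewrite described above. The table and column names (Sales, Order Total, Product Category) are assumptions for illustration, not the exact fields of the Adventure Works model:

    -- Slower pattern: FILTER over ALL discards the visual's filters and scans the entire table
    Average Sales (slow) =
    CALCULATE(
        AVERAGE( Sales[Order Total] ),
        FILTER(
            ALL( Sales ),
            Sales[Product Category] = MAX( Sales[Product Category] )
        )
    )

    -- Faster pattern: AVERAGEX iterates only the rows already filtered by the report context
    Average Sales (optimized) =
    AVERAGEX( Sales, Sales[Order Total] )

Because the visual already filters Sales down to one product category, the simpler measure typically returns the same result per category while avoiding the full-table scan.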
you’ve now seen how seemingly simple tasks like generating a sales report at Adventure Works can become complex it’s in these complexities that you as the data analyst can create value by optimizing your DAX queries and delivering faster smoother reports you can empower stakeholders to make quick and informed decisions remember data analysis isn’t about delivering vast amounts of information it’s about delivering the right information in the right format at the right time each time your report loads a little faster or your DAX query runs a little smoother you’re not just improving a technical process you’re contributing to better faster and more informed business decisions you are now better equipped to find the hidden inefficiencies in your DAX queries confront them head-on and turn them into opportunities for learning and growth Adventure Works has a rich set of data from manufacturing to sales the data is vast and you are responsible for developing a comprehensive dashboard that compiles all these data sources into meaningful insights you start creating a report in Microsoft PowerBI and use DAX the formula language in PowerBI as you create complex DAX expressions you realize that the report starts to lag the calculations are getting more complex and time-consuming and you wonder if there’s a more efficient way to handle all this data without sacrificing performance in your search for solutions you discover DAX variables which are said to have the power to make PowerBI dashboards more efficient could using DAX variables be the answer to improving your report performance in the next few minutes you’ll discover DAX variables and their importance in PowerBI you’ll also learn how to effectively implement DAX variables to optimize the performance readability and accuracy of your PowerBI reports dax or data analysis expressions is a formula language that includes functions operators and values you can combine to construct formulas and expressions in PowerBI and Power Pivot in Excel in programming and formula languages a variable acts as a storage container you can put something into it like a number or a string or even the result of a more complex expression once you’ve assigned a value to a variable you can reference that variable by its name elsewhere thus saving you the need to recompute or refetch that stored value in DAX variables serve a similar role but with a twist catering to its analytical nature instead of thinking of them as simple storage containers think of them as computational snapshots when dealing with complex data sets like the multi-layered operations at Adventure Works recalculating the same values or expressions can be resource-intensive especially if done multiple times in a single report or visualization this is where using variables in DAX for PowerBI is beneficial let’s explore the benefits of using DAX variables in more depth using variables allows for storing intermediate results complex calculations done multiple times can be stored in a variable and referenced thereafter saving computational effort and time this optimization leads to faster report rendering and performance enhancement especially in large data sets dax formulas can sometimes become quite lengthy and complex by breaking down these formulas and storing parts of them in variables the main formula becomes more streamlined and easier to read improving readability also once a value or a result is stored in a variable it remains consistent throughout the formula this ensures consistency and no variation due to repeated calculations leading to more accurate results in addition to ensuring consistency reusing variables in multiple expressions within a formula means you don’t have to recalculate or redefine commonly used values or results and provides flexibility in formula construction should there be an error or an unexpected result in your report having your formula broken down into variables makes it easier to pinpoint where things might have gone wrong instead of sifting through a long complex formula you can check variable values individually making debugging easier lastly breaking down complex expressions into smaller parts held within variables makes your formulas more transparent and easier to understand this reduced complexity can be immensely beneficial when working in teams where other data analysts or report developers might need
to decipher or modify your DAX expressions for example if you were to calculate the total sales for Adventure Works in the last year and then use that figure in multiple parts of your DAX formula without variables the same total sales value might get recalculated every single time it’s referenced this redundancy isn’t just a waste of computational resources it’s a drain on performance by using a variable you compute the value once store it as a snapshot and then reference this snapshot wherever needed in your formula ensuring both clarity and improved performance now let’s examine how to use a variable in DAX to improve report performance in PowerBI let’s start by opening the existing Adventure Works sales PowerBI report once your report is open you’ll notice various panes on the screen on the right side you’ll find the data pane which lists all the tables that your report is connected to select the sales table that contains the empty sales measure upon selecting the sales measure the formula bar will open where you can start writing your DAX formulas begin the formula with the var keyword this is the starting point for declaring a variable after typing var add a space and then name your variable it’s a good practice to name your variable something meaningful for instance if you’re calculating total sales for the last 12 months you might name your variable Sales_12Months next you’ll provide the DAX expression that calculates the value for the variable after the equal sign write out the DAX formula you want the variable to hold this expression calculates the sum of sales amounts over the last 12 months after defining all necessary variables the next step in your DAX measure is using the return keyword this keyword indicates the final output of your DAX measure after performing calculations using your variables once you’ve written out your measure press enter
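As a sketch, the measure walked through above might look like the following, assuming hypothetical Sales[Order Total] and Sales[Order Date] columns:

    Sales Last 12 Months =
    -- store the expensive calculation once in a variable
    VAR Sales_12Months =
        CALCULATE(
            SUM( Sales[Order Total] ),
            DATESINPERIOD( Sales[Order Date], MAX( Sales[Order Date] ), -12, MONTH )
        )
    -- RETURN defines the final output of the measure
    RETURN
        Sales_12Months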
with the measure saved to your table you could use the variable you created to quickly compare the last year’s sales figures across different product categories or regional markets by leveraging the pre-calculated variable the report would render these comparative visualizations much more quickly using variables in DAX within PowerBI offers a streamlined approach to handling complex calculations and improving report performance as you get more accustomed to this feature you’ll find yourself employing variables more often to make your DAX measures both efficient and maintainable using variables to optimize your data models and make them efficient can ensure not only quick results but more accurate insights every line of DAX you write every measure you create and every insight you derive has the potential to influence decisions shape strategies and drive success Adventure Works has seen soaring sales this year with mountain bikes especially flying off the racks like never before but as you sift through your PowerBI dashboard a nagging feeling settles in the mountain bike sales data for the past 12 months that you have been visualizing through a complex DAX formula isn’t tallying up with the raw sales numbers questions whirl through your mind is there a missing link an error in the formula maybe the weight of potential inaccuracies weighs on you mistakes mean mistrust in data and mistrust in data can lead to poor business decisions in this video you’ll learn how to use variables in DAX to troubleshoot issues like this one to recap a variable in DAX lets you store a value or a table to be used later in your formula think of them as placeholders or temporary storage units for your data by breaking down your DAX formula into smaller pieces and storing parts of the calculation in variables you can keep track of each step making the process more comprehensible and easier to debug returning to the earlier Adventure Works example suppose you’re faced with a formula representing the sales for the last 12 months given the vast amount of data and interconnectedness of the business processes ensuring accuracy in the formula is paramount so let’s help Adventure Works troubleshoot their mountain bike sales data for the past 12 months before you can do any troubleshooting understanding the overall structure and components of the formula is essential without a comprehensive grasp of what the formula consists of determining what might be causing an issue becomes like finding a needle in a haystack once you have opened your PowerBI report on the right side of the interface you’ll notice the fields pane within the fields pane scroll until you locate the DAX measure you wish to troubleshoot in this case the measure to troubleshoot is Sales_12Months upon selecting the measure a formula bar appears above the report canvas this bar allows you to view the DAX expression while carefully examining the expressions present you can identify components like the calculate function sum aggregation and dates in period function as each of these plays a role in the calculation once you identify each component of the measure it’s time to create variables for each part by breaking down the formula into smaller parts and assigning them to variables you can address each segment separately this modular approach aids in understanding which part of the formula might be behaving unexpectedly on the upper ribbon select the modeling tab and select the button named new measure this indicates you’re creating a new formula or metric that isn’t present in your data upon selecting new measure the formula bar becomes active for you to define the logic of your formula and break it down into variables start by typing var which stands for variable followed by a space then provide a name for your variable like current date using the equals sign assign the function today to this variable and return the result now let’s create a new measure and add a variable called last year sales for the dates in period section with variables holding specific parts of the formula analyzing them individually allows for isolated testing by evaluating each variable separately you can confirm its correctness ensuring that each foundational block of the formula is sound before the whole formula is put together finally let’s create variables for the product category and subcategory to return the result for each on the right-hand side locate the visualizations pane select the card icon to place a blank card onto your report canvas a card visual is useful because it displays a single prominent value ideal for scrutinizing individual variables once the card is active you’ll notice areas named values and axis in the visualizations pane locate your variable named current date in the fields pane select hold and drag it to the values area of the card the card will now dynamically showcase the current date as you continue the troubleshooting process create new card visuals on the canvas and drag the sales filtered by category and sales filtered by subcategory measures to the cards to provide a snapshot of the isolated categories
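A minimal sketch of these diagnostic measures might look like the following, again with hypothetical table, column, and subcategory names:

    -- each building block becomes its own measure so it can be checked on a card visual
    Current Date Check =
    VAR CurrentDate = TODAY()
    RETURN CurrentDate

    Last Year Sales Check =
    VAR CurrentDate = TODAY()
    VAR LastYearSales =
        CALCULATE(
            SUM( Sales[Order Total] ),
            DATESINPERIOD( Sales[Order Date], CurrentDate, -12, MONTH )
        )
    RETURN LastYearSales

    -- isolate the subcategory filter on its own
    Sales Filtered By Subcategory =
    CALCULATE(
        SUM( Sales[Order Total] ),
        Sales[Subcategory] = "Cross Country"
    )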
after assessing individual variables it’s crucial to observe how they interact together sometimes even if variables are correct when isolated they may not interact as expected when combined this step ensures that the overall logic of combining the variables is correct let’s create a new measure called mountain bike sales to weave these variables together with the calculate function calculate modifies or extends the context in which a calculation occurs so combining these variables essentially tells PowerBI to consider only sales amounts of mountain bikes in the cross country subcategory for the last 12 months to visualize the combined logic drag the newly made measure mountain bike sales onto a new card visual if everything is functioning correctly this should vividly illustrate the mountain bike sales restricted to the last 12 months for the cross country subcategory you notice that the sales filtered by subcategory card is significantly different in value from the mountain bike sales card based on your troubleshooting you uncover that while the technical logic of your DAX calculation is correct a pre-existing filter was applied onto the sales filtered by subcategory card that skewed your calculation showing sales for the past 6 months to resolve this select the sales filtered by subcategory card visual and clear the applied filter in this video you learned how to use variables for troubleshooting you discovered the importance of breaking down a DAX formula piece by piece understanding each element and its interaction and how this modular approach provides a systematic method for troubleshooting you also explored the process of defining DAX variables and combining them to ensure their interactions produce accurate results
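Combining the variables described above might be sketched like this, with the category and subcategory names assumed for illustration:

    Mountain Bike Sales =
    VAR CurrentDate = TODAY()
    RETURN
        CALCULATE(
            SUM( Sales[Order Total] ),
            -- CALCULATE narrows the context to the category, subcategory and period of interest
            Sales[Product Category] = "Mountain Bikes",
            Sales[Subcategory] = "Cross Country",
            DATESINPERIOD( Sales[Order Date], CurrentDate, -12, MONTH )
        )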
imagine you’re a captain navigating the seas of business data your compass is your understanding of key performance indicators your sails are your dashboards and your map is Microsoft PowerBI the winds of analytics fill your sails pushing you towards better informed decision-making this module bringing data to the user has equipped you with the navigational skills needed to sail through the waters of business analytics you’ve not only discovered the pivotal role of dashboards in steering organizational decisions but also ventured into report navigation and publishing configuring mobile views fine-tuning report performance and sharing leveraging features like quick insights and Q&A and optimizing reports using DAX variables let’s recap key concepts including dashboards in business decision-making including how to create and customize them sharing information with stakeholders such as PowerBI workspaces publishing reports and optimizing pagination for better navigation and user experience and the usage of the analyze in Excel feature in PowerBI and optimizing reports using DAX variables thereby making your report easier to debug and more efficient you started with a deep dive into creating dashboards you explored the concept of dashboards in the business context their importance functionalities and how they serve as key tools in data analysis and decision-making processes much like a car’s dashboard that shows critical data like speed and fuel level you learned that a business dashboard provides a consolidated real-time visual display of key performance indicators or KPIs such as sales trends and customer behavior while they share similarities with reports dashboards differ in that they offer a one-page summary of the most important metrics in contrast reports provide a more in-depth multi-perspective view you also recognized the need to understand the visual and interactive nature of dashboards their role in promoting transparency and accountability within organizations and how they aid in breaking down barriers to information sharing your exploration continued with how to build a simple dashboard configure the mobile view and change themes you started by creating a new report dragging and dropping various data fields to make visual charts like bar graphs and line charts once you had your visuals you combined them into a single dashboard for a comprehensive view of important metrics to elevate your data analysis capabilities you explored how to optimize the usability of your PowerBI dashboards by adding two key features quick insights and Q&A you also discovered the limitations of pinned visuals in PowerBI how their static nature can prevent deep data exploration and how to overcome these limitations by setting up and pinning live reports next you delved into sharing reports with stakeholders you learned about PowerBI workspaces and their importance alongside the step-by-step process of creating a simple workspace workspaces are essential as containers that hold various components such as dashboards reports workbooks and data sets you explored the step-by-step process of publishing reports in PowerBI as well as the concept of pagination and why it’s beneficial for creating organized reports publishing reports serves as a bridge connecting you the data analyst with decision makers and team members who need to draw insights from the data pagination affirmed that dividing your report content into multiple pages makes your report more organized and easier to navigate akin to chapters in a book your journey then led you to understand the different elements of report page properties including page information canvas settings canvas background and wallpaper report page properties let you customize your report pages giving you control over how your report is presented influencing aspects like page size view and background enhancing overall readability and effectiveness you also learned how to use the analyze in Excel feature in PowerBI to take your reports and further analyze them combining the visual capabilities of PowerBI with the analytical depth of Excel it provides a live connection from an Excel pivot table to the data in PowerBI so when data in PowerBI is updated you can simply refresh your Excel report to see the new data you also explored the practical aspects of tuning report performance you grasped the role and function of the PowerBI performance analyzer the process of activating it starting a recording refreshing visuals analyzing performance data and exporting data for further analysis the performance analyzer helped you identify the parts of your report slowing things down by providing a detailed breakdown of loading times for each visual you also identified if a DAX query was causing the delay and took the necessary actions to optimize it for improved performance the process of simplifying a DAX formula involves reducing the complexity of the formula which might include eliminating unnecessary calculations using more efficient functions or avoiding iterating over large tables this can make the formula more efficient and less demanding on the DAX engine reducing the computational load in the final part of our journey you explored the importance of DAX variables how to use variables to enhance the performance and accuracy of your PowerBI reports and the steps to effectively implement them for optimal
performance using variables in DAX formulas enhances readability by breaking down complex and lengthy expressions into more digestible smaller parts variables act as named references for parts of these formulas making the main expression streamlined and easier to interpret throughout this module you journeyed from understanding the foundational significance of dashboards to the details of optimizing DAX formulas at every step you’ve gained skills and techniques that empower you to bring data to the user a fundamental aspect of data analysis and visualization these skills and techniques aren’t just tools they’re instruments of change that can drive organizations like Adventure Works towards innovation efficiency and success the marketing director at Adventure Works Renee was captivated by the Microsoft PowerBI reports you produced recognizing their value in the company’s decision-making process Renee wants to delve deeper into the data introduce statistical results categorize data patterns and make predictions about future trends although these tasks have been vital for businesses for decades immensely helping their decision-making they were traditionally complex and time-consuming however the analytics in PowerBI has changed this powerbi offers a versatile and user-friendly toolbox to tackle analytical tasks effortlessly making these processes much more efficient and accessible but how can you use the analytics in PowerBI in your reports over the next few minutes you’ll be introduced to the concept of analytics and explore the analytics capabilities offered by PowerBI analytics refers to systematically using data statistical and quantitative analysis and predictive modeling techniques to uncover meaningful patterns insights and trends within data sets an essential part of analytics involves interpreting and visualizing data to extract valuable information resulting in actionable insights for informed and strategic decisions powerbi empowers you to transform raw data into meaningful insights through its various advanced tools and functionalities analytics in PowerBI unlocks many ways to enrich your visualizations adding significant value to your reports as you progress through this course you’ll explore the many ways analytics in PowerBI can enhance and elevate your reports for now let’s explore some of the PowerBI features available for analytics leveraging the statistical summary tool you can easily add functions to your visualizations like calculating averages and median values you will also learn how to use top N analysis in a visualization to highlight critical data points saving you time from repetitive tasks and manual calculations another feature you’ll learn about is DAX measures which can enhance PowerBI’s visualizations to find unusual data points called outliers with grouping and binning data for analysis you can classify two or more associated data points into groups or separate them into equal-sized groups respectively mastering the organization of your data into meaningful categories can reveal trends and patterns in your data helping you make smarter decisions applying clustering techniques empowers you to discover another way of associating similar data points in a subset of your data using the clustering algorithm a straightforward feature that identifies similarities and dissimilarities in the attribute values your data gets divided into subsets called clusters
unveiling valuable patterns in your data powerbi empowers you to conduct time series analysis time-based data analysis with time series involves exploring trends and patterns occurring over a range of time as you explore this feature further you’ll learn how to predict future trends using time series forecasting and discover captivating visuals to support your time-associated data like the play axis an advanced visual containing a dynamic playback of data over time powerbi also offers the analyze feature this powerful feature automatically detects relationships and connections in your data revealing valuable insights that might have gone unnoticed with the press of a button on any data point PowerBI runs a rapid analysis to provide users with automatically generated insights you can leverage advanced analytics custom visuals to create exceptional reports there are a variety of custom visuals in PowerBI called advanced analytics custom visuals or AI visuals powerbi leverages machine learning algorithms to provide insights on the data you provide on the chart visuals like key influencers and decomposition tree will take your data reports to a new level another AI-powered feature of the PowerBI service quick insights generates valuable information from your data sets in the form of a dashboard with the press of a button this will save you time and help stakeholders make better decisions faster plus you can uncover predictive and prescriptive insights with PowerBI’s AI capabilities you can generate AI insights with functionalities like sentiment analysis which visualizes emotions or attitudes in data and key phrase extraction which identifies phrases in text data these AI capabilities empower you to forecast future trends and stakeholders to make data-driven decisions with confidence you’ve now been introduced to the PowerBI features available for analytics in upcoming videos you will delve deeper into each one of the features and witness their magic at work exploring the powerful tools of analytics in PowerBI unlocks a world of possibilities for you to drive data-driven decision making with your reports by harnessing the power of analytics in PowerBI you can help organizations optimize their strategies and stay ahead in today’s dynamic business landscape adio your manager at Adventure Works just imported the company’s sales data for quarter 1 into a Microsoft PowerBI report there is an air of anticipation as your team brainstorms ways to extract valuable insights from this information despite the raw nature of the data set only containing product details order dates and the total order amount the team sees immense potential to build upon the aim is to create a report that can answer crucial questions like what was the total order amount per product category what were the average and median amounts per product category did the early March ad campaign have any impact on sales adio is confident that PowerBI’s statistical summary capabilities can easily transform these questions into an insightful report in this video you will learn about these capabilities exploring the process of integrating a statistical summary into a PowerBI report data and statistics are closely intertwined as statistics serve as the essential language to articulate and analyze your data powerbi captures the power of statistics offering a comprehensive range of statistical functions you may already be familiar with some of the functions commonly used in data analysis such as sum for totals average for mean calculations and
median minimum and maximum to find the middle smallest and largest values in a data set powerbi not only provides rich features to seamlessly incorporate these functions into your visualizations and reports but also utilizes the DAX language that encompasses all of these statistical capabilities this powerful combination is referred to as the statistical summary in PowerBI using the Adventure Works sales data set let’s examine two different ways of adding the average statistical function to a visualization this will help the sales team identify which product category accumulates the highest average order amount in addition to identifying whether Adventure Works early March ad campaign impacted orders the marketing team also needs to retrieve the number of orders per day from the data set as you are learning to integrate a statistical summary in a report let’s extract and utilize just three columns of Adventure Works sales data product category order date and order total which is the total order amount to prepare for our statistical summary exploration let’s create a few simple graphs to work with first let’s create a clustered column chart and select product category first to represent it on the x-axis and order total second as its y-axis to visualize the total amount of orders for each product category adjust the visual to fit the screen and click on an empty space of the canvas to deselect the bar chart and create the second visualization a line graph right below the column chart which will contain the order date on its x-axis using just order date without the date hierarchy and then the order total again as its y-axis this visualization depicts the total order amount of each date lastly let’s create a table graph in the right corner of the screen add product category as its first column and order total as its second column this will provide a better view of the numerical data when adding a numeric column to a visual the default function displayed is the sum or total of the amount however there are numerous built-in functions that you can apply to your graph these functions display on the popup menu in the visualizations pane directly at the right of your column such as average median and deviation to better understand how this works let’s add the order total column again in the same graph and adjust the function to calculate the average order amount of each product category instead you can also create your own calculations using DAX expressions which include a rich set of statistical functions let’s produce a similar result using a straightforward DAX measure in the ribbon’s home tab select new measure assign the measure a name and use the median function specifying the order total column for the calculation lastly modify the column chart to a line and column chart add your measure to the y-axis and observe the result now let’s explore the time series data let’s add the number of orders for each day to the line graph to do this drop the order total column into the secondary y-axis and use the count statistical function this is a helpful function that counts table rows in the graph based on the filter context it is given in this case where each row represents a single order the count function counts the number of orders
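For reference, the measures used above can be sketched in DAX as follows, assuming a hypothetical Sales table in which each row is one order:

    -- median order amount (the visual supplies the product category context)
    Median Order Total = MEDIAN( Sales[Order Total] )

    -- number of orders in the current filter context, for example per day on the line graph
    Order Count = COUNTROWS( Sales )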
by using statistical summary in PowerBI you explored how you can effortlessly calculate statistical measures and add them to your visualizations all the critical questions were answered in the report as it displays the average and median value of each product category and even displays the impact of the ad campaign in March when the count of orders doubled with just three columns as your data source you unlocked the power of analytics in PowerBI with the aid of statistical summary many business requirements can be met and questions answered with ease thanks to the array of statistical features tailor made for data analysts by PowerBI renee the marketing manager at Adventure Works has just finished a critical meeting with other marketing team leads to discuss new approaches and strategies for attracting new customers after the meeting she promptly reached out to the data analytics team to discuss the implementation of these approaches in their reports during the meeting the marketing leads for North America and Europe decided to take different approaches for each continent’s market this requires grouping country orders by continent a task that hasn’t been implemented in the existing data set additionally the marketing team agreed on launching ad campaigns in 10-day intervals microsoft PowerBI’s visualization options already include automatic monthly and weekly breakdowns but the challenge is to figure out how to assemble orders into 10-day groups the data analytics team quickly searches for a solution and discovers that you can address both these problems using analytics in PowerBI particularly the grouping and binning data features these features both associate data points with each other in their respective ways grouping in PowerBI gives you the ability to manually divide data points into separate groups of your choice on the other hand binning automatically separates data points into segments referred to as bins giving you two options to do so you provide the number of outcome bins with PowerBI splitting the data points between them or you provide the size of bins and PowerBI splits the data points into any number of bins required to fit your data into bins of the specified size now the question is how can they effectively implement these features in the customer report in this video you’ll be introduced to the concepts of grouping and binning and you will learn how to differentiate between the two concepts you will also learn how they can be effectively implemented in a PowerBI report to clarify information and provide easy to understand deliverables let’s start by helping Adventure Works group the orders from each country by continent to visually highlight orders for Europe and North America you need to group them in the report first let’s select a stacked bar chart and set the country on the y-axis and the sum of order total on the x-axis hold down the shift key and select in the visual all the countries that belong to North America including USA Mexico and Canada while still holding the shift button down right click on the visual and select group data from the drop-down menu this action automatically creates a group and assigns it to the legend field resulting in a different color for the countries that were grouped together now let’s explore how to edit the group created earlier the new group appears as a new column in the table with an icon on the left side indicating that it is a super group of another column right click on this new group and select edit groups from the menu to open a new window now you have the option to rename the existing group let’s change Canada Mexico and USA to North America similarly you can select all European countries while holding the control key select group and create a new group called Europe once you are done select okay
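Under the hood, the same grouping could be expressed as a DAX calculated column. A minimal sketch, with the list of European countries assumed for illustration:

    Country Group =
    SWITCH(
        TRUE(),
        Sales[Country] IN { "USA", "Canada", "Mexico" }, "North America",
        Sales[Country] IN { "France", "Germany", "United Kingdom" }, "Europe",
        "Other"  -- every remaining country falls into a catch-all group
    )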
in addition to highlighting categories of data you can also use the newly created groups as an axis in your visuals to do this create a doughnut chart and add the sum of order total to the values field then add country groups to the details field this will help you visualize the distribution of the order amounts between North America Europe and the other regions the doughnut chart clearly represents how the orders are distributed among these different groups making it easier to analyze the data at a glance to create bins based on the 10-day campaign interval right click on the order date column and select new group select bin as the group type and size of bins as the bin type in the bin size select the 10-day interval to align with the campaign requirement and select okay next create a line chart and use the new bin on the x-axis and the sum of order total on the y-axis this creates a visualization of the 10-day ad campaign interval by using this technique the marketing team can effectively analyze the data based on the 10-day intervals gaining valuable insights into the trends and patterns within the data set as you know by now grouping and binning data has always been crucial in data analysis as it organizes data points into similar meaningful categories uncovering patterns hidden within them powerbi introduces this capability in its engine allowing you to seamlessly group or bin columns in a simple manner without the hassle of having to deliver the result in code to fully grasp the power of this feature let’s compare this simplicity with the complexity of using DAX code to achieve the same bin technique
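For comparison, a hand-rolled 10-day bin as a DAX calculated column might be sketched like this, assuming a hypothetical Sales[Order Date] column:

    Order Date Bin =
    -- the earliest order date in the table anchors the first bin
    VAR FirstDate = MIN( Sales[Order Date] )
    VAR DaysFromStart = INT( Sales[Order Date] - FirstDate )
    RETURN
        -- start date of the 10-day bucket this order falls into
        FirstDate + INT( DaysFromStart / 10 ) * 10

This is exactly the kind of code the built-in bin option saves you from writing.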
With just a few clicks, by contrast, the data analytics team publishes the report quickly, leaving Renee astonished by the powerful capabilities of groups and bins in Power BI. The marketing team can now easily identify trends within the North America and Europe groups, enabling immediate comparisons with the rest of the countries. Moreover, they can analyze and assess the 10-day campaigns effortlessly, gaining insight into critical information about their performance. Well done! The sales team at Adventure Works is so impressed by your Microsoft Power BI report that they ask you to add more analytics to the data set. The team wants to analyze whether there is a trend in the order amount, identify the largest order of each day by order amount, and determine the top 10 best and worst sales days for the business. You can accomplish this by including a histogram in the report and using the top N analysis feature. But what is a histogram, and how do you add top N analysis? In the next few minutes, you'll learn how to identify and build histograms, as well as filter data points into a top N analysis that showcases only the most significant data. A histogram is one way to visualize the result of a top N data query, while the TOPN function in Power BI is a built-in DAX function that retrieves the top N records from a data set based on specific criteria: it compares the parameters provided and returns the corresponding rows from the data source. The N in top N refers to the number of values at the top or bottom. In a histogram, data points are grouped into ranges, or bins, making the data more understandable; a histogram is a great way to illustrate the frequency distribution of your data. As you already know, a typical chart visual relates two data points, a measure and a dimension, on its x- and y-axes respectively. Adventure Works has an existing bar chart to track the total order quantity for different product categories, but they would like to know how often quantities occur. To do this, they would create a histogram of the quantities: the x-axis contains the quantity groups and the y-axis contains the frequency with which these groups occur. The most commonly used charts for histograms are bar charts and area charts. Sorting a field in ascending or descending order is a relatively common process in data analysis reporting, but what happens when there are so many attributes that the columns completely cover the canvas area, hiding the crucial information? Top N analysis prevents this by sorting the data to display a category's best or worst data points. This enables stakeholders to quickly identify the top or bottom values in the data and make data-driven decisions efficiently. Now let's explore how to create histograms to analyze sales data and how to visualize the top 10 dates and sales by implementing top N analysis in a visualization for the Adventure Works sales team. Let's start by creating a histogram to analyze trends in order amounts. The first step is to create a bar chart and add order total to the x- and y-axes; ensure you select the sum of order total and not the count. Resize the chart by dragging its edges so it's clearly visible. Notice that having numerous data points on the x-axis may make the analysis difficult to interpret. Histograms directly address this issue by grouping x-axis data points. To achieve this, use the bin technique you learned about previously: right-click the order total column and select new group from the drop-down menu. Select bin as the group type and number of bins as the bin type, enter 20 as the bin count, and then select OK to create the new bin on the order total column. Now place the new bin on the x-axis instead of the standard column in both charts. Congratulations, you have now created your first histogram. Bar charts are among the most common histogram charts, with area charts a close second; while the visualization is selected, choose the area chart to modify it. Using histograms, the distribution of order amounts per amount range is clearly visible, with the most revenue accumulated through orders just over the $2,250 mark. Now let's explore how to visualize the top N data points of a column. To achieve this, you need an attribute and a sorting column: the sorting column creates ascending or descending order on the attribute column before the attribute column is filtered to its top N values. Let's observe a top N analysis implementation by creating a chart to highlight the top 10 days by sales amount. Create a funnel chart, one of the most popular top N charts, and add order date without hierarchy to the category and order total to the values. To limit the chart to a top 10 analysis, navigate to the filter pane, select the arrow on order date, and select top N as the filter type. Select top 10 to display the best days (you would select bottom for the worst days) and add the total amount to the by value field to sort by this amount. You now have a better understanding of the capabilities and potential of histograms and top N analysis in Power BI. By working through this lesson, you discovered how to construct histograms, transforming data into visualizations that uncover distribution patterns. Furthermore, you practiced your top N analysis skills to isolate key data points that inform actionable insights.
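Since the lesson names the TOPN DAX function, here is a minimal sketch of the same top 10 filter expressed as a DAX query (for example, in the DAX query view). The Sales table and its Order Date and Order Total columns are assumed names based on this data set:

    EVALUATE
    TOPN (
        10,  -- number of rows to keep
        SUMMARIZECOLUMNS (
            Sales[Order Date],
            "Daily Total", SUM ( Sales[Order Total] )  -- one row per day with its total
        ),
        [Daily Total], DESC  -- rank days by total, highest first
    )
    ORDER BY [Daily Total] DESC

Switching DESC to ASC would return the worst days instead; note that TOPN can return more than 10 rows when there are ties.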
During a recent strategy meeting at Adventure Works, stakeholders discussed adjusting prices to align with the business strategy. However, the current sales data set seems disconnected and lacks cohesion, making it difficult to use. Recognizing the importance of optimizing the company's product offerings, you'd like to apply advanced analytics to categorize products based on order details and pricing. Your goal is to establish meaningful connections between the products to enable data-driven pricing decisions. Having explored groups and bins in Microsoft Power BI, you've learned to organize data points hierarchically with groups or into equal-sized bins. But what if you want to group data points based on similarities in their values? That's where the clustering technique in Power BI comes into play. This video aims to equip you with the knowledge needed to apply the clustering technique to a data set, including how to cluster data in scatter charts and identify outliers with clustering. Clustering is a powerful feature that enables you to efficiently discover groups of similar data points within your data set. It is enabled in scatter plot visualizations, as these are the optimal charts for analyzing dispersed data and identifying outliers. By analyzing your data, the clustering technique identifies similarities and dissimilarities in attribute values and then separates the similar data into distinct subsets known as clusters. These clusters provide valuable insights and aid in understanding patterns and relationships within your data. Using the earlier example as a practical demonstration of the insights clustering can offer, let's begin exploring patterns in the Adventure Works products based on their sales data. Launching a new Power BI report with the sales data set imported, select the scatter chart icon in the visualizations pane and resize the chart on the screen for better visibility. Add product name to the values field, as this is the field you want to separate into clusters. For the axes, use product price as the x-axis and order total as the y-axis, and ensure the sum function is applied to both as the default aggregation. With this setup, you can now apply the clustering technique to gain valuable insights from the data. With the dots scattered across the graph, let's apply analytics to identify similarities between these data points that would group them into categories. Select the ellipsis in the top right corner of the chart to see the visualization options, then select the automatically find clusters option. A pop-up window provides various clustering options you can adjust. Name the cluster group product cluster, and for the description use clusters for product name based on product price and order total. Then choose how many clusters you want the data points separated into, or let Power BI choose the number automatically. For our example, enter three as the number of clusters and select OK. The clustering technique has divided the product data points into three clusters. The first cluster comprises products with low prices leading to low order amounts. The second cluster includes products with high prices but relatively lower order totals compared to cluster three, where high product prices also resulted in high order totals. Continuing with the clustering analysis, you can leverage the newly formed clusters as axes for additional visuals, allowing you to gain further insights based on the clustering patterns. Select a horizontal clustered bar chart and set product category as the y-axis and sum of order total as the x-axis.
Adjust the chart size to cover the right part of the canvas from top to bottom. To add the new data grouping to the analysis, add product cluster as the small multiples field; then navigate to format in the visualizations pane, select small multiples, and choose three rows and one column so the multiples are easy to compare. Lastly, include the product name in the tooltips of the visualization. By analyzing the clusters in both graphs, you can gain insights directly from your data set: while most e-bikes and road bikes appear to belong to the high-performing cluster three, there are some exceptions in the lower-performing cluster two. Hovering over these product categories displays the product names that belong to each category, providing valuable information for future business decisions. By clustering the products, you helped the pricing department make crucial decisions to improve the promotion of specific products and embrace data-driven strategies at Adventure Works. By analyzing the products in the low-performing categories, they adjusted prices strategically, aiming to achieve better results and optimize overall market performance. In this video, you gained valuable skills in using the clustering algorithm in your scatter plots to group data points effectively. By applying clustering, you learned how to identify hidden relationships and patterns within your data, making it possible to optimize aspects of the business such as product pricing, promotions, and overall strategy. You received a new report requirement this morning: your task is to build a customer demographic analysis, leveraging the sales and customer data sets to derive valuable insights about the customers. To fulfill the business needs for visualizations based on country, customer age, and order dates, you will have to use both axis categories, categorical and continuous. But what are these categories, and how do you decide which one to use in each visualization? Over the next few minutes, you'll be introduced to categorical and continuous axes and learn how to differentiate between them. You'll also explore how to configure these axes in Microsoft Power BI. Let's start by exploring categorical axes. You can use a categorical axis to represent discrete, non-numeric data points. It organizes data into distinct categories, such as names; categories are groups with no inherent numerical order. Common examples of categorical data include product names, geographic regions, and employee roles. When you use a categorical axis, Power BI automatically arranges data points in the order they appear in the data set. Categorical axes are best suited for displaying qualitative information and facilitating comparisons between distinct entities or categories. Bar charts, stacked bar charts, pie charts, and categorical line charts are common visualizations that use categorical axes. On the other hand, a continuous axis is designed to represent numerical data points that have an inherent order and can be measured along a continuous scale. These data points are typically represented by real numbers and can be integers or decimal values. Examples of continuous data include sales revenue, temperature, time, and age. Continuous axes are ideal for visualizing quantitative information, allowing users to identify trends, patterns, and correlations within the data. Common visualizations that use continuous axes are line charts, area charts, scatter plots, and histograms. Now let's explore how to use these two axes in your reports with a real-world scenario.
To understand both axes better, open a new report with the sales and customer data sets imported. The first visualization you're going to build is sum of order total by order date. Add a clustered column chart, insert order date on the x-axis without the date hierarchy and order total on the y-axis, and resize the visual by dragging its edges. The visual displays gaps for the dates that had no orders; this is because Power BI automatically selects the continuous axis type when given a date column in its axis field. By selecting the categorical axis type instead, the bar chart displays no gaps, removing the depiction of dates with zero order total. Keep in mind that there is no right or wrong way to visualize the data, and there are no numeric differences between the two axes; the choice of axis type should be the one that best addresses the business need. To explore the categorical axis, let's create a second visualization showing sum of order total by location. Insert a clustered bar chart, add location on its y-axis and sum of order total on the x-axis, then move the visualization to the right part of the screen and resize it so it fits the screen top to bottom. Location has no inherent order, so Power BI automatically implements a categorical axis and disables the option of turning it into a continuous one. For the last graph, let's explore another possibility of a continuous axis. Customer age is a column with an inherent numerical order, so when you add a line chart and insert age on the x-axis and order quantity on the y-axis, Power BI uses the continuous axis type. You can observe a major difference between the two axes if you try to access the visualization's sorting method through the ellipsis: a continuous axis doesn't allow any sorting other than the one inherited from the numeric column. To change the default sorting, you need to use a categorical axis. Understanding categorical and continuous axes and their roles in data visualization will enable you to select the correct axis based on the nature of the data you're analyzing. With this knowledge, you can create more effective and informative visualizations, making it easier to compare discrete categories or identify trends and patterns within numerical data. Renee, the marketing manager at Adventure Works, relies heavily on analytics in Microsoft Power BI to prepare for important executive meetings. As part of her preparation for a high-level meeting with the company's executives, Renee has created several reports and presentations based on the results of the most recent marketing campaigns run by her department. Renee takes great care when preparing the analysis; however, she worries that there could be essential data insights that she and her team have overlooked. Seeking expert advice, she turns to Lucas, the data analyst, for guidance. Lucas suggests using the analyze feature in Power BI: with this feature, they can examine the data from different perspectives and ensure that no valuable aspects have been missed. But what is the analyze feature, and how can it be added to reports? The analyze feature provides advanced analytics that automatically detect patterns, trends, and anomalies in your data. In this video, you'll explore the analyze feature and how it can be used to identify trends and patterns. Now let's help Renee examine her data from different perspectives. With the customer and sales data sets imported, let's create a new report and add visualizations.
First, create a line chart and insert the order date on the x-axis, without the date hierarchy, and the sum of order total on the y-axis. Next to it, add an area chart with the age field as the x-axis and the sum of the order total field as the y-axis. Finally, at the bottom of the page, add a clustered column chart with product category as the x-axis, the sum of order quantity as the y-axis, and order status as the legend, then resize it to fit the screen. Now let's use the analyze feature on each of these visualizations to discover what insights it can add to your analysis. Starting with the line chart, it is obvious that the biggest order was placed on the 7th of March. To explore this further, select this specific date, right-click, and select analyze; now you can select the explain the increase option. Once this is selected, a variety of visualizations appear, analyzing the increased order figure on this day based on factors such as product size, payment method, product categories, and others. Clusters that were created manually in the table will also be included in the analysis. By scrolling through these automatically generated visuals, you can gain a clear picture of the factors that caused the increase in the order amount. Now let's run the analyze feature on the second visualization, the area chart. Since using distinct ages isn't very informative for analysis, you'll first create bins to group the age data. To do this, right-click the age column and choose the new group option. Apply size of bins as the bin type with 10 as the bin size, and select OK to create age groups separated by decade. Then drag and drop this new bin onto the x-axis, and use the X button on the previously used age column to remove it from the chart. To investigate further with the analyze feature, select the first bin with decreasing values, right-click it, and select analyze, then explain the decrease. Just as with the first visualization, this action generates a number of visuals that help identify the relevant aspects that might have contributed to the decrease in the age groups above 40 years. Now let's explore another useful aspect of the analyze feature in the bar chart, which shows product category and status. You may notice that road bikes have an unusually high number of cancelled orders. To investigate what might have caused this, right-click the blue cancelled bar for road bikes and select analyze. If you select find where this distribution is different, a variety of visualizations are generated, illustrating the factors that played a significant role in the large number of cancellations of road bike orders. This feature can highlight contributing factors such as country and location, product cluster, and more. Every visualization generated by the analyze feature includes a thumbs up and a thumbs down option in the upper right corner, allowing you to give Power BI feedback on the usefulness of its analysis for your report. When you use the explain the increase or explain the decrease features, you have the flexibility to select different visualizations to display the results that best suit your analysis requirements. Finally, if the analyze feature provides an insightful visual that you'd like to include in your report, you can quickly add it by selecting the plus sign button in the top right of the visualization.
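The size-of-bins grouping used for the area chart can also be sketched as a DAX calculated column. This is a minimal, hypothetical equivalent, assuming a Customer table with an Age column; the UI group is usually the simpler choice, but a column like this is handy when you want to reuse or label the bins:

    Age Group = FLOOR ( Customer[Age], 10 )  -- 43 becomes 40, so ages 40-49 share one bin

    Age Group Label =
    VAR Lower = FLOOR ( Customer[Age], 10 )
    RETURN FORMAT ( Lower, "0" ) & "-" & FORMAT ( Lower + 9, "0" )  -- e.g. "40-49"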
In this video, you explored how to generate valuable insights from your data using the analyze feature in Microsoft Power BI, working with diverse visualizations and interpreting the results effectively. The analyze feature provides advanced analytics, automatically generating visualizations from your data sets and helping you detect patterns, trends, and anomalies in your data. Time series analysis involves analyzing a series of data in chronological order to identify meaningful information and reveal trends. In this video, you will create an insightful report analyzing Adventure Works sales data over a period of three years. In your Power BI report, three Adventure Works data sets have already been imported: sales, product, and date. You will now add four visualizations as the basis for the time series analysis. First, add a simple card visualization with sales amount as its field. Second, add a horizontal clustered bar chart with product on its y-axis and sales amount on its x-axis; using the filter pane, add a top 10 filter by sales amount so the highest selling products are highlighted. Line charts and scatter plots are the two most common visualizations used in time series analysis, so with the first two basic visualizations created, let's add these two types of graphs to the report. Add a line chart and include the date field from the date table on the x-axis (this should not include the date hierarchy) and sales amount from the sales table on the y-axis. Add a fourth visualization, a scatter plot: use the sum of total product cost from the sales table on the x-axis, add the sales amount from the sales table to the y-axis, and include the category field from the product table in the legend section and the sum of sales amount from the sales table in the size section. Resize and move all the visuals so they are better placed on the page. Now that the visualizations are created, let's explore how time series analysis can give you different perspectives on these visuals. Before you can animate the time series analysis, you must first import a custom animation visual from Microsoft AppSource. Microsoft AppSource is an online store offering custom visualizations built by industry-leading software providers. To access it, select the ellipsis in the visualizations pane and then select the get more visuals option; this takes you directly to the Power BI custom visuals in Microsoft AppSource. Search for the term play axis to find the certified Play Axis dynamic slicer visualization, and when you have located it, choose add. You should now have the Play Axis visual imported into the visualizations pane. Now let's explore how to use the Play Axis visual as a dynamic filter in the report. The Play Axis visual automatically filters all the other visuals using the chronological order of the date field added to it. First, select the new Play Axis visualization in the visualizations pane and add month from the date table as a field; this ensures that the Play Axis visual filters the report in a month-by-month sequence. In the format your visual section, there are several formatting options specific to the Play Axis visual. First, there is animation settings: the animation can be set to auto-start or to run on a loop for a specified time frame. The second option is the time setting, which
you can use to modify the rate of filter transition; here you will set it at 750 milliseconds, which gives a smooth transition speed. The next format option relates to the colors of the visual, specifically the color of each action of the Play Axis button: in this area you can specify colors for the play, pause, stop, previous, and next actions. The last format option is enable captions; if you set this feature to on, the button shows the value of the field that you have inserted and how it changes during the animation. Press play on the Play Axis button to watch the sales data change month by month. The Play Axis button makes the report interactive by updating all the visuals simultaneously, providing a dynamic picture of the data outcomes over time and a more detailed analysis of the trends in Adventure Works sales. You now know how to perform a time series analysis and implement the Play Axis visualization. Decision makers in all areas of business require answers to very similar questions. Typical questions asked of the data analyst might be: can we compare daily sales against the sales average? Is there a way to uncover trends in order quantity within our visualizations? Can we manually add a sales target threshold to our visualizations? The senior management at Adventure Works consult their data analyst, Lucas: they would like key information, such as trends or averages, to be clearly visible on certain visualizations. Lucas identifies reference lines as the key Microsoft Power BI feature that will fulfill this requirement. A reference line is an additional element that can be added to a visualization to draw attention to a key insight or piece of information. Power BI offers a variety of reference lines that can be added to a visualization to include an additional measure for comparison with the data points. The implementation of the line is based either on calculations built into the line type you've selected or on settings you can customize. Let's explore the different types of reference lines. An average line represents the average value of a data series; it is useful for identifying how individual data points relate to the overall average. A median line shows the median value of a data series; it is particularly helpful when dealing with skewed data distributions. A percentile line identifies a specific percentile value, such as the upper percentile, within a data set, helping you understand the data distribution. An x-axis or y-axis constant line is a straight line that represents a constant value on a visualization; it is used to indicate a fixed threshold, target, or benchmark value for comparison. A trend line helps to identify trends or patterns in data, and different types of trend lines can be added to capture relationships in the data. It's important to note that each visual within Power BI supports its own set of reference lines, which means that not every reference line type is available for every type of visual. Power BI intelligently offers reference lines that are contextually relevant to the type of data and visualization you're working with. For instance, certain reference line types, like the trend line and average line, are more applicable to line charts or scatter plots, where data trends are easier to discern; other reference lines, like the min line and max line, are often used in bar charts to quickly visualize data ranges. In some visualizations, such as maps, reference lines are disabled due to their limited interpretability within the visual context.
In the next few minutes, you will follow a practical demonstration of how to implement reference lines in Power BI reporting. This Power BI report has two data sets already imported, customer and sales; you will create three graphs and add reference lines to them as another layer of visual information. First, create an area chart, add age bins as the x-axis value and sum of order quantity as the y-axis value, and resize it on the screen. Next, add a line chart: use order date as the x-axis value, without the date hierarchy, and order total as the y-axis value, and resize this visual. Finally, add a horizontal bar chart, include location as the y-axis value and order total as the x-axis value, and resize it to fill the screen. Now let's add the reference lines. Once you have selected the area chart, a magnifying glass icon appears in the visualizations pane; selecting this opens the analytics pane, which lists the types of reference lines that can be added to the visualization. Add a trend line by selecting the on button. A reference line appears depicting the trend of order quantity over age groups: it shows that older people order significantly less than younger people. You can use the options below the trend line to adjust the line color, transparency, and style so that it stands out more. For the next example, select the line chart. You will now add an average line, which will help identify the days where the order total was above or below the daily average. In the analytics pane, select average line and add line. When the average line appears, the choices underneath can be used to format it or to add a data label. Lastly, in the bar chart, it is important to easily identify the locations that are over a minimum target threshold. Select the bar chart, then in the analytics pane select constant line and add line, and add 3,000 as the constant line value, formatting the line if required. It is now obvious that three locations, Chicago, Shanghai, and Buenos Aires, are below the order total target threshold. When choosing visualizations, keep in mind that not all of them support reference lines: for example, if you change the bar graph to a map, the line disappears, and the message analytics features aren't available for this visual appears in the analytics pane. You've now explored how adding reference lines to visualizations can highlight trends in data sets and simplify comparative analysis between data points. Adding reference lines to your report extends the capabilities of visual customization and allows you to meet the diverse demands of different business scenarios.
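For reference, the value drawn by the average line can also be computed as an explicit DAX measure. This is a minimal sketch, assuming Sales[Order Date] and Sales[Order Total] columns; the analytics pane needs no such measure, but one like it is useful when you want the same average in a card or a tooltip:

    Average Daily Order Total =
    AVERAGEX (
        VALUES ( Sales[Order Date] ),              -- one row per day in the current filter context
        CALCULATE ( SUM ( Sales[Order Total] ) )   -- that day's total
    )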
Planning for the future is crucial for all businesses. One business may need to plan for seasonal fluctuations in orders or revenue; another may need to plan for growth or expansion. What is critical in either situation is that key decision makers have reliable data and information, and a realistic picture of future outcomes. Data analysts use forecasting to examine previous trends and patterns in the business, to predict whether they will continue and how they can affect future outcomes. Microsoft Power BI contains a forecasting tool that can assist in this process. Renee at Adventure Works is currently formulating a two-year development plan for the department she manages. She has already been impressed by the reports she has seen in Power BI, so she approaches Lucas, the data analyst, to see if there are any visualizations available that could apply predictive models and forecast results. Lucas informs her that one of the core charts in nearly every report is already equipped with forecasting capabilities, and she's excited to find out more. The forecasting tool in Power BI is built directly into line charts, and it allows analysts and business users to predict future trends and values based on historical data so they can make informed decisions and plan more effectively. With the forecasting options, users can tailor their predictions to align with specific business needs and data patterns. Let's look at three important concepts. A confidence interval in forecasting is the range of values within which the actual future outcomes are likely to fall, with a certain level of confidence; it quantifies the uncertainty associated with a forecast. For example, a 95% confidence interval indicates that there's a 95% likelihood that the actual future values will fall within the forecasted range. This helps decision makers understand the potential variability in the predicted values. Seasonality refers to recurring patterns or cycles that appear at regular intervals in time series data. Patterns could be daily, weekly, monthly, or yearly, and they often result from external factors like holidays, seasons, or economic cycles. Recognizing and accounting for seasonality allows forecasting models to capture the expected fluctuations in data that repeat over time. Lastly, ignore the last is a feature that allows users to selectively exclude the most recent data points from the historical data set when generating forecasts in Power BI. Anomalies or abrupt changes may occur in the latest periods, which might distort the forecasted results; by ignoring the last few data points, users can focus the forecasting model on the more stable and representative patterns in the earlier data. Now let's step through a practical example of including forecasting results in a line chart. Forecasting in Power BI starts with a line chart. The Adventure Works sales and date data sets have been imported into a new report. In the visualizations pane, select a line chart to add it to the canvas, add date from the date table to the x-axis (do not add the date hierarchy), and add sales amount from the sales table to the y-axis. This basic configuration is all you need to apply forecasting. To access the forecasting capabilities, select the line chart, then select the magnifying glass to open the analytics pane of the visual. Select forecast in the list and turn it on; a predictive section is added to the line chart. Select the arrow on the left to open the forecast settings. Options is the first and most important section: here you define the rules for how the forecasting line will be drawn. Units is set to points, where points refers to the date unit currently used in the visualization. In forecast length, you can specify a number of these date units, which determines the length of the forecasting line; in this case, to forecast a whole year of values, select 365 points. For confidence level, select the 90% confidence interval and select apply. The forecast line also contains options to customize the line. Select the forecast line and choose a blue color so that it is similar to the actual line. With the style option, you can choose a dashed, dotted, or solid forecast line, and adjusting the transparency setting changes the visibility of the forecasted plot. The confidence band choices allow you to customize the style of the upper and lower bounds, changing it from fill to line; the none choice displays no confidence bounds at all.
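As a reminder of what the confidence band represents, a forecast interval generally takes the standard textbook form below; this is the general formulation, not a statement of Power BI's internal algorithm:

    \hat{y}_{t+h} \pm z_{1-\alpha/2} \, \hat{\sigma}_{t+h}

where \hat{y}_{t+h} is the forecast h periods ahead, \hat{\sigma}_{t+h} is the estimated standard error of that forecast, and z_{1-\alpha/2} is the normal quantile for the chosen confidence level, roughly 1.645 for the 90% setting used here and 1.96 for 95%.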
The forecasting feature in Microsoft Power BI can create predictions of future trends from historical data, and adding these to your reports can provide valuable insights. You are now familiar with using forecasting in a line chart and with concepts such as confidence intervals, seasonality, and ignore the last. You've learned how to capture recurring patterns and how to allow for uncertainty. These skills will allow you to design reports containing accurate forecasting, and the accurate anticipation of future outcomes will drive informed decision-making. Understanding the forces driving sales trends is a continuous concern for businesses, and advanced analytics tools are an accessible avenue to understanding these forces. This is precisely the avenue your team proposes to navigate within the Adventure Works sales data set: with the robust capabilities of Microsoft Power BI's key influencers visual, you aim to identify the primary factors contributing to the rise and fall of sales figures. In this video, you'll discover the power of the key influencers visualization, an advanced analytics visualization in Power BI, and learn how to include it in a Power BI report and use it properly to obtain valuable information. The key influencers visualization is one of the main advanced analytics visualizations in Power BI. It uses advanced algorithms to uncover relationships buried within data, shedding light on the influential factors behind specific outcomes. Whether you want to understand the triggers behind a surge in sales or the reasons for a sudden decline, the key influencers visual offers a concise snapshot of what truly matters. Now let's explore the capabilities of the key influencers visual. Start with an empty report with the Adventure Works sales data imported, and select the key influencers icon in the visualizations pane to add the visual to the canvas. Your aim is to apply AI insights to analyze the factors behind increases and decreases in the sales amount; to do this, drag and drop the sales amount field from the sales table into the analyze field. The key influencers visual now declares that there are no fields in explain by, requesting any number of fields relevant to the sales amount to initiate the analysis. An AI analysis will then run across all those factors, locating which of them are the main contributors behind sales amount surges and decreases. To ensure the visualization provides insightful results, you can add various relevant fields to the analysis: for example, let's add the country region field from the customer table and the color and subcategory fields from the product table. Notice that as you add fields, the visualization is already running a background analysis on the correspondence between the sales amount and all fields added in the explain by section. Let's observe the results. The top influencers affecting the sales amount are displayed on the visual's left side, and you can view the analysis results in detail by selecting any of them. Let's select the red color influencer to delve deeper into the analysis. When you select an influencer with a color field, a bar chart displays, analyzing sales amount compared to the average of sales per color. You can observe at a glance the influence the red and silver products have on the sales total, in contrast with the multi and white colors that barely made any sales. To analyze the factors behind low sales amounts, select the what influences sales amount box and change it to decrease. Apart from highlighting the key influencers affecting the sales, these advanced visuals also group the influencers, showcasing segments of influencers that played a significant role in sales increases or decreases.
Select the top segments option at the upper border of the visual, and in the field when is sales amount more likely to be, choose high to identify the segments that perform well in sales. Now select the largest circle to view the results: red road bikes have the biggest impact on sales, with mountain bikes in second position. In this video, you've explored the key influencers visualization, an advanced analytics feature in Power BI. In just a few minutes, with the support of the AI algorithms powering the key influencers visual, you extracted insights from your data set, shedding light on the driving factors behind sales trends, whether positive or negative. You too can incorporate advanced analytics into your reporting process, elevating the quality and depth of your analytical insights. The marketing team at Adventure Works was fascinated by the impact the previous advanced visualization, key influencers, had on their data set, and they are now eager to explore what other advanced visualizations can accomplish. Your manager, Addio, wants to introduce decomposition trees, another specialized analytics tool in Microsoft Power BI. If you're wondering where and how to include the decomposition tree visual in a report, this video is for you. In the next few minutes, you'll be introduced to the decomposition tree and how to use this visual to navigate through data hierarchy levels, which refer to the arrangement of data points in a structured format where elements are organized into levels or tiers based on their relationships. You'll also learn how to activate its AI potential, letting the visual guide you through the critical factors behind outcomes. But first, what are decomposition trees? The decomposition tree visual in Power BI lets you visualize data across multiple dimensions. It automatically aggregates data and enables drilling down into your dimensions in any order, making it the optimal solution for analyzing the hierarchical structure of data. Being an AI visual, it can also leverage the hierarchical graphical representation of the visualization to automatically explore dimensions based on certain criteria. Here is an example of how the decomposition tree breaks down the Adventure Works sum of sales amount into hierarchical groups, referred to as branches, to analyze the distribution of the amount across its subcategories. The user can navigate through the branches manually by selecting any data point, or enable the AI capabilities of the visual to automatically navigate through the branch based on the most influential components. To start our journey with decomposition trees, let's launch a new report using the Adventure Works sales, date, and product data sets. Locate and select the decomposition tree visual in the visualizations pane to add it to the report, and readjust the visual so it fits the whole screen. Add the sales amount to the analyze field. Before looking into its AI-powered capabilities, let's explore the basic functions of decomposition trees. Decomposition trees excel at analyzing data structured in a hierarchical fashion, so let's find a structure built like this in the data set. Navigate to the data view of the report and to the product table: you can see that each model belongs to multiple supercategories, following the sequence product, model, subcategory, and category. Let's add this hierarchy to the decomposition tree to utilize its basic features: add all four components of the hierarchical structure to the explain by field, in any order.
A plus sign appears just to the right of the sales amount bar. Navigate through the hierarchy components in the order they are used in the data set to get a complete breakdown of the sales amount between products; although you can use the plus sign in any order you want, following the hierarchy sequence gives the best decomposition possible. Select X at any time to remove a column from the decomposition tree, and use the lock button to prevent a user from removing it. Now that you have a basic understanding of the decomposition tree, let's look at its AI capability. To explore this potential, remove the model and product fields and add two other dimension fields to the chart: color from the product table and year from the date table. Start at the first level of decomposition, the category, and select the plus sign. You can now see that besides the columns added in the explain by field, there are two more options, high value and low value, each with a light bulb on its left side. By selecting either one of them, the decomposition tree will automatically choose the main driving factor among all fields added in the explain by section and highlight it for you. To see this capability, select the high value of accessories to identify that the helmet subcategory was the driving factor of the accessory sales, while in the clothing category the main reason behind the high amount was the superb clothing sales of 2019. On the other hand, by removing the generated column and selecting low value in the bikes category, you can identify that blue colored bikes were the lowest performing attribute in bike sales, with the lowest point in 2020. In this video, you learned about the capabilities of the decomposition tree, an advanced visualization in Power BI. The decomposition tree is a unique tool for ad hoc exploration and root cause analysis of the factors behind any outcome in a data set. Combining basic features with advanced AI capabilities, it can convert information into valuable insights and contribute to business decision making by providing a deeper understanding of the underlying patterns in a data set. In the modern age of technology, where information is all around us, imagine you could uncover a map that reveals the hidden pathway that leads to success. This is the exciting world of identifying patterns and trends in Microsoft Power BI, a journey that transforms raw data into secrets for success and numbers into opportunities. This module gave you the experience of a modern-day explorer, equipped not with a compass but with Power BI's analytical tools. So let's briefly recap some of the key concepts covered in the identifying patterns and trends module. Your foundation was laid through an introduction to analytics in Power BI and its statistical summary capabilities. You were equipped with the knowledge needed to incorporate a range of statistical functions into your reports, supported by practical examples and a detailed cheat sheet of the statistical functions available in the DAX language; a few of these are sketched below. You learned the importance of grouping similar data points into segments to highlight hidden patterns, exploring Power BI's grouping, binning, and clustering techniques to match the precise needs of your analysis. Covering histograms, top N analysis, and continuous and categorical axes, you gained even more tools for adding analytics to your data sets.
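As a taste of that cheat sheet, here is a small, hypothetical sample of DAX statistical measures, again assuming a Sales table with an Order Total column:

    Average Order Total = AVERAGE ( Sales[Order Total] )
    Median Order Total = MEDIAN ( Sales[Order Total] )
    Order Total 90th Percentile = PERCENTILE.INC ( Sales[Order Total], 0.90 )
    Order Total Std Dev = STDEV.P ( Sales[Order Total] )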
Advancing to trend identification, you engaged with the tools of the analytics pane, including reference lines, error bars, and forecasting. These tools significantly enhance the information depth of charts, enabling not just data point comparison but also future trend prediction, and they can explain data fluctuations through a variety of insightful visuals that you can instantly add to your reports. Moreover, you gained an initial glimpse into Power BI's ability to automatically generate insightful visualizations via the analyze feature. Lastly, your introduction to AI visuals in Power BI completed the picture. You learned how to conduct root cause analysis within your reports using specialized visualizations like key influencers and decomposition trees, which are invaluable for uncovering the key drivers behind data set fluctuations. You also explored the Q&A visualization, a powerful tool capable of transforming any business user into a data analyst, formulating queries and crafting visualizations; this natural language processor empowers you to translate language into graphs with remarkable efficiency. Ultimately, your journey through identifying patterns and trends in Power BI has equipped you with a multi-dimensional toolkit. From mastering statistical functions to unraveling hidden insights through segmentation and powerful analytics techniques, you've become a data explorer skilled at revealing the story within the numbers. With the ability to predict trends and harness AI-powered visuals, you are now better prepared to translate data into strategic decisions. Imagine yourself as an explorer in a maze of data, surrounded by a vast and complex landscape of information. Somewhere deep within, beyond the twists and turns, lie pathways to hidden insights and uncharted opportunities awaiting discovery; navigating this data maze without proper guidance or tools could mean missing out on these hidden treasures entirely. Microsoft Power BI serves as your modern-day explorer's toolkit, equipped with advanced mapping techniques, helpful clues, and expert data navigation. It helps you cut through the noise, interpret the data patterns, and go directly to the heart of the insights buried within. During this course, you've transformed from a curious data wanderer into a skilled navigator, prepared to guide businesses like Adventure Works toward newfound opportunities and business success using data analysis and visualization. In this video, you'll consolidate critical lessons from your journey through this data analysis and visualization with Power BI course. You'll refresh your understanding of creating visually engaging dashboards and reports, and recall concepts related to making your Power BI dashboards and reports more user-friendly, accessible, and inclusive; sharing your dashboards and reports with users; optimizing reports using the DAX language; and using visualization and AI in Power BI to perform data analysis and identify patterns and trends. Your journey began with a foundational understanding of Power BI acting as your compass: you delved into the details of Power BI service, Power BI Desktop, and Power BI mobile. In this part of the course, you were introduced to choosing between Power BI Pro and Power BI Premium, the limitations and advantages of each, and how these choices impact data storage, sharing, and collaboration capabilities. You also became well-versed in the administrative interface, getting to grips with workspace creation and data set management.
This was like understanding the maze's structure and its pathways, setting the course for your data journey. You learned how permissions and roles in Power BI influence the accessibility and security of your data, much like how an explorer's team is structured based on roles and expertise in navigation. You gained insight into diverse visualization forms, from simple bar charts to more complex waterfall and funnel charts. Your journey went beyond surface-level exploration, introducing you to the DAX language for calculated columns and measures to make your visuals more dynamic and informative. You also explored advanced customization options, such as using slicers for real-time data manipulation or conditional formatting to highlight key metrics; these became guiding tools for precise data interpretation. Along the way, you picked up the importance of visual hierarchy and storytelling, realizing that a well-structured report can convey a narrative that empowers decision makers. Making your insights both accessible and inclusive became your next focus. You learned how to make your Power BI dashboards and reports accessible to users with disabilities, implementing high contrast color schemes, adding alt text to visuals, and ensuring tab navigation compatibility. Moreover, you explored the built-in translation features of Power BI, ensuring minimal data language barriers; these strategies make your data exploration inclusive and reachable for all. Additionally, you covered how to create mobile-responsive reports, understanding that accessibility also pertains to the variety of devices used to access data. Navigating advanced functionalities was your next challenge. Here you deepened your knowledge of Power BI's more robust features, such as using drill down and drill through functionalities to navigate between different layers of your data. You also tackled data modeling, understanding how to create relationships between various tables and sources. Your expedition delved deeper to uncover query parameters and their role in making your reports dynamic and interactive; these tools enable you to interpret the data in the maze precisely, without losing sight of the broader context. You even ventured into APIs and custom connectors, expanding the realm of data sources you can bring into Power BI. Finally, you were introduced to Power BI's AI capabilities, like text analytics and the integration of machine learning models. You explored time series analysis to forecast trends, and discovered how to generate predictive models, understand correlation, and create data simulations. This makes it possible to predict and prepare for future trends, much like an experienced explorer reading signs from the environment to prepare for what lies ahead. You were guided through the process of automated machine learning in Power BI, making it possible to create predictive models without in-depth programming knowledge, like finding shortcuts and secret pathways within the maze. As you conclude this course, take a moment to reflect on your expedition. You began as a budding explorer and now stand as a guide for others, navigating the intricate and sometimes bewildering maze of data analytics with confidence. You've mastered the navigational tools and instruments at your disposal in Power BI and learned the art of reading and interpreting data in its deepest forms.
Remember, the world of data is vast, and the technology that helps us navigate it is ever-evolving. You've acquired the skills, strategies, and insights to embark on countless more adventures, but the maze remains boundless: with every question you answer, you'll discover new ones that provoke your curiosity and challenge your understanding. That's the beauty and the challenge of data analytics. Embrace the ongoing quest for knowledge, wisdom, and growth; with optimism in your heart and curiosity as your guide, the best adventures still await. Congratulations on completing the data analysis and visualization with Power BI course. Your dedication and hard work have paid off, and you've gained knowledge, skills, and tools that will help set you on a path to excel in the world of data analysis. You have successfully covered the following topics: adding visualizations to reports and dashboards, applying formatting choices to visuals, adding useful navigation techniques to reports, designing accessible reports and dashboards, and using visualizations to perform data analysis. You should now be well grounded in data analysis and visualization with Microsoft Power BI. You've learned how to use the power of data visualization and reporting in Power BI to create compelling data stories, and how to use formatting, navigation, and filtering to create interactive, user-friendly, and accessible reports that are engaging and informative. From using visualizations and AI features to uncover data trends and patterns to sharing your insights effectively, you are now better positioned to support businesses like Adventure Works in making data-driven decisions and driving business success. But remember, this is just one step on your data analysis journey. By completing all the courses in this program, you'll receive the Microsoft Power BI Analyst Professional Certificate from Coursera. This program is an excellent opportunity to enhance your proficiency in data analysis in Power BI and gain a qualification that opens doors to entry-level positions in the data analytics field. The program will also help you prepare for exam PL-300, Microsoft Power BI Data Analyst. By successfully completing the PL-300 exam, you'll earn the Microsoft Certified Power BI Data Analyst certification, which will position you well to begin or advance your career in this role. This globally recognized certification is industry-endorsed evidence of your technical skills and knowledge. The exam measures your ability to prepare data, model data, visualize and analyze data, and deploy and maintain assets. To complete the exam, you should be familiar with Power Query and the process of writing expressions using Data Analysis Expressions, or DAX, which you will learn about throughout the program. To learn more about the Power BI data analyst certification and exam, visit the Microsoft Certifications page at http://www.learn.microsoft.com/certifications. Your journey through this course has not only provided you with essential skills in data analysis but also laid the groundwork for your future endeavors. Your ability to recognize different visualizations, apply formatting choices, design accessible reports and dashboards, and perform data analysis using Power BI will undoubtedly set you apart in the world of data professionals. But there's still more to learn and room to grow, so why not register for the next course in the program? Whether you're a novice in the data analysis field or an experienced technical professional, completing the entire program will showcase your knowledge of and proficiency in analyzing data with Power BI.
Your dedication to learning and growing in the world of data analysis is commendable, and you should be proud of your progress and accomplishments. Your commitment will show prospective employers that you are capable, motivated, driven, and eager to learn. It's been a pleasure to be part of your educational journey; wishing you all the best as you continue to explore the endless possibilities that data analysis with Power BI has to offer. Congratulations once again, and best of luck. Hello, and welcome to the creative design in Power BI course. Businesses and organizations obtain data from many sources, including government, financial, economic, health, and scientific data, to name just a few. As a data analyst, it might be your job to extract insight from this large pool of data. You could use Microsoft Power BI to import this data and create data models, but how will you then present the results of your work? Would you agree that a more creative presentation approach is required, especially when dealing with large volumes of data? You might aim for a more user-friendly presentation of the data, so we've designed this course to give you the skills you need to visually share your data insights with your intended audience. In this course, you will learn how to creatively design dashboards, reports, and charts. You'll make visuals that the audience can quickly understand, and you'll know when and how to include specialist elements, such as videos, streaming data, and QR codes, as part of your business intelligence presentations. You'll be introduced to the theory and practice of visualization and design, including the design principles of data display and visualization. Let's now quickly summarize the course material to give you an overview of all you'll study. You'll begin by learning how to create a cohesive report design based on the characteristics of your target audience, identifying key information so that you can produce audience-focused reports. In week two, you'll learn how good design enhances the comprehension of data in your reports: you'll apply visual clarity, use multi-dimensional visualizations, insert map visualizations, and implement custom visualizations, such as Python-based visuals. With these methods, you can design powerful report pages that improve the end-user experience. Then it's time to visit the concepts of dashboard design and storytelling: you'll compare the design of a dashboard with the design of a report, and you'll explore the principles of data storytelling. Advanced dashboard features, such as embedding media and QR codes, are also part of your studies that week. During the course, you can watch, pause, rewind, and re-watch the videos until you're confident in your skills. Consolidate your knowledge by consulting the course readings, and measure your understanding by completing the knowledge checks and quizzes. In addition, the course discussion prompts allow you to share and chat with other learners; by connecting with your classmates during discussions, you can grow your network of contacts. Your studies prepare you for a final project and a graded assessment that you'll undertake in the last week of the course. In the project, you'll get a pre-made Adventure Works data set and model in Power BI; your challenge is to use the data to prepare reports for the sales team and the executive board. You'll need to use data storytelling and cohesive design, and you'll also be asked to use the data to highlight new business opportunities. After this hands-on learning, you will complete a final graded assessment.
Be assured that everything you need to complete the assessment is included in the course, and of course, as part of your preparation, you can always review the content of any lesson to revise the relevant videos, readings, exercises, and quizzes. Businesses need data sourcing, preparation, and analysis; presenting the insights gained is often the last part of this data processing, and it's a key factor in ensuring that the benefits of the analysis are understood by all stakeholders. Is this course for you? Hopefully the outline of the course content and topics will help you decide. You don't need an IT-related background to take this course; it's for anyone who likes using technology and has an interest in presenting the results of data analysis, whatever your background. To complete this course, you need access to some resources: a laptop or desktop computer with a recommended 4 GB of RAM, an internet connection, and a Windows operating system version 8.1 or later. It should have .NET Framework version 4.6.2 or later installed and a subscription to Microsoft Office 365. You will also need to install Power BI Desktop, which is available as a free download. The courses in this program prepare you for a career in data analysis. When you complete all the courses in the Microsoft Power BI Analyst Professional Certificate, you'll earn a Coursera certificate to share with your professional network. Taking this program not only helps you become job-ready but also prepares you for exam PL-300, Microsoft Power BI Data Analyst. In the final course, you'll recap the key topics and concepts covered in each course, along with a practice exam. You'll also get tips and tricks, testing strategies, useful resources, and information on how to sign up for the exam. Finally, you'll test your knowledge in a mock exam mapped to the main topics in this program and the Microsoft certification exam PL-300, ensuring you're well prepared for certification success. Earning a Microsoft certification is evidence of your real-world skills and is globally recognized: it showcases your abilities, demonstrates your commitment to keeping pace with rapidly changing technology, and positions you for increased skills, efficiency, and earning potential in your professional roles. The topics covered in the practice exam include preparing data, modeling data, visualizing and analyzing data, and deploying and maintaining assets. In summary, this course introduces you to how a data analyst using Microsoft Power BI applies data design techniques to create compelling stories through reports and dashboards. I hope you are ready to start creating compelling and cohesive reports and dashboards using the best visual techniques to optimize audience focus. I don't have to tell you that a social media photograph gets far more likes and shares than a message containing text only: we choose to look at images first. Your brain processes visual data thousands of times faster than text, which is the main reason we prefer visual communications. It's also why, right now, all over the world, people are using data visualization software to make sense of large, complex data. Of course, humans communicated visually long before we had technological power, so let's check how we progressed from using just numbers for data presentations. Prepare to understand the real meaning behind the numbers. As our understanding of the impact of visuals increased, the approach to creating visualizations changed, and in 1933 Harry Beck created the London Underground map, inspired by electrical circuit diagrams.
Visualizations that successfully connect with users have a lasting impact on how we communicate data. Let's say you want to use data visualization to illustrate a much larger rail network; it could be 10 times bigger or a thousand times bigger. Scale it to 100,000 times and you have an idea of the data volumes now available. Data visualization tools help us understand big data in the world around us. Just compare older 2D maps to how satellite mapping reveals a different vision: we can zoom in for more detail to get a granular understanding of an area, zoom further into a city's layout and reveal data insights with visual markers, while always being able to place our insight in the context of a global landscape. Businesses benefit from data visualization by understanding the impact of their decisions; they can create better products and services that improve the lives of their customers. But data visualization is not just for business. It improves data accessibility for governments, organizations, and citizens. For the first time, we all have access to detailed and accurate data about the planet. Professor Hawkins from the University of Reading created the global warming stripes: a simple visual with no text and no numbers, but its message about the danger of global warming is clear. Despite technological advances, the goal of data visualization remains the same: to make data accessible and easier to understand. Imagine a world where large-scale decisions are better understood through visualizations of this data. You can use data visualization tools to enhance your communication skills, reveal insights on a global scale, and help build a better world.

How do you choose an outfit from your wardrobe? When choosing which clothes to mix and match, it's important to know what colors go well together; after all, you want to look your best. The same goes for your reports and dashboards: to look their best, they need the best mix of colors and shades. That's why you are now being introduced to color theory. In this video, you'll explore color theory, its basic concepts, and how it assists you in creating presentations and data graphics. Color theory is the collection of design rules and guidelines used to communicate with users through effective color schemes. It involves the meaning and use of colors and how to pick the best colors in different situations to build harmonious and visually captivating color combinations. As a data analyst, understanding the principles of color theory is essential for creating visually captivating and effective designs. Colors can evoke emotions, convey messages, and enhance the impact of reports. Color theory is a practical guideline for the visual effects of color combinations. It includes the color wheel, color harmony, color psychology, and color symbolism, giving you a powerful toolkit to create visually pleasing and meaningful designs. The color wheel represents the relationships between colors. It consists of primary colors (red, blue, yellow); secondary colors, which are mixes of primary colors (such as orange, green, and purple); and intermediate or tertiary colors, which are mixes of primary and secondary colors. The color wheel guides your choice of colors, leading to color schemes that create harmonious compositions. Color harmony is another important concept: it refers to the arrangement of colors in a specific design in a way that is visually pleasing to the viewer.
You create visual balance and enhance the overall impact of your design by choosing the correct color combination. Here are a few methods used to combine colors into a color scheme. Complementary colors: this system uses opposite hues on the color wheel. Analogous colors: uses groups of colors that are next to each other on the color wheel. Triadic: a color concept that uses a three-pointed triangle selection of colors from the color wheel. Monochromatic: color combinations that use several variations of the same color.

The psychology of color is one of the most important aspects to consider during your design. Colors can evoke emotions and influence behavior. For instance, when designing marketing materials for Adventure Works' outdoor adventure products, incorporating vibrant and energetic colors like orange and yellow can evoke feelings of excitement and enthusiasm. Colors can also carry symbolic meanings and cultural associations. Different cultures may interpret colors differently, so it's important to consider cultural context when selecting colors for global designs. For instance, while red may symbolize luck in East Asian cultures, it can represent danger in some Western cultures. By understanding color symbolism, you can ensure that your designs effectively convey the intended message across different cultural backgrounds. Given the importance of color theory, it's crucial to consider accessibility when working with color in design, as not all individuals perceive colors in the same way. Color blindness is a condition where individuals have difficulty distinguishing certain colors or perceiving color differences. The most common type is red-green color blindness, where individuals have trouble differentiating between shades of red and green. To ensure that your designs are accessible to individuals with color blindness, use color combinations that have sufficient contrast. This means avoiding color combinations that may appear similar to individuals with color blindness; it's recommended to use high-contrast combinations, such as black text on a white background, to improve readability. Additionally, providing alternative ways of conveying information beyond color is crucial. For example, if you're using color to indicate different categories or data points, consider also using patterns, labels, or symbols to supplement the color coding. This ensures that individuals with color blindness can still understand and interpret the information accurately. By considering color theory and accessibility together, you can create designs that are not only visually appealing but also inclusive and accessible to a wider range of individuals. Mastering color theory is a vital skill for any artist, designer, or creative professional. By understanding the principles of the color wheel, color harmony, color psychology, and color symbolism, you can create visually captivating designs that effectively communicate messages and evoke emotions in your audience. As you embark on your colorful journey at Adventure Works, let color theory be your guide in transforming ordinary designs into extraordinary visuals.

If I tell you that the temperature is very hot, what color comes to mind? Most people answer in the range of orange to red. Color is a crucial design element for business intelligence dashboards and reports, making them visually intuitive and understood by all viewers. By the end of this video, you will understand how colors evoke psychological associations and convey symbolic meanings.
Let's explore the science of color in communicating data-driven stories. In business communication, colors serve as navigational tools, directing users' attention and facilitating efficient information access. Here are some roles colors can play in designing your reports and dashboards. Background is the color of your report or dashboard background, or the background of an individual visual within the report. Use low-saturation colors, that is, colors that are not too vivid, rich, or intense; then the background will not distract users from the main story. The dominant or primary color gives viewers the first impression of the color theme; it's typically used in many elements to create contrast within your report. An accent color is used for the focal points of your report, capturing users' immediate attention; examples include call-to-action buttons, alerts, and warning messages. Semantic colors are colors that have an actual meaning, and they aid seamless comprehension. For example, commonly employed colors for alerts are red for bad, orange for average, and green for good. Semantic colors are usually used for conditional formatting on text and charts.
Once you choose colors for your reports, you can create a color palette. Power BI can import a color palette as a JSON file to define a custom theme for your reports and visualizations. By using a JSON file, you can create a report theme file that standardizes your charts and reports, making it easy for your organization's reports to be consistent.
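As a rough illustration, a minimal report theme file might look like the sketch below. The theme name and hex values are hypothetical placeholders rather than an official Adventure Works palette; the name, dataColors, background, foreground, and tableAccent properties are part of Power BI's report theme JSON format.

```json
{
  "name": "Adventure Works Example Theme",
  "dataColors": ["#0F6CBD", "#107C10", "#F7A800", "#D13438"],
  "background": "#FFFFFF",
  "foreground": "#252423",
  "tableAccent": "#0F6CBD"
}
```

A file like this, saved with a .json extension, can be imported from the View tab via the Themes drop-down, after which its dataColors become the default series colors for visuals in the report.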
Use these colors to amplify insights. For example, identify certain values or groups within your data that are good or bad, and use contrasting colors to differentiate between different values. Use shades of the same color to demonstrate strength or weakness, or various grades; for instance, use shades of the same color in a geographical visual to represent ascending or descending sales values. Use a dull color for something less important and a bright color for crucial information.

At Adventure Works, you must create a report showing a table of sales data with profit margins. The profit margins will be emphasized using effective color combinations while considering accessibility requirements. Let's explore color selection in data visualization. Launch Microsoft Power BI Desktop and open the project salesbyear.pbix. Navigate to the report view of Power BI Desktop, to the report containing a table with sales and profit margin values and a column chart emphasizing the profit margin. To remove the "Sum of" prefix from the column titles, go to the Visualizations pane and, in the columns list, double-click the column name and delete the "Sum of" text. This can be done for all columns that need to be renamed. To change the theme of the visualization, navigate to the View tab of Power BI and select the Accessible City Park theme from the theme drop-down list. This will change the entire color combination for the current report; the theme contains colors that satisfy accessibility requirements. To ensure accessibility for the broadest range of consumers, you can increase the font size and change the font color throughout the report to maximize visibility and contrast. For instance, to increase the font size of the table values to 18 points, select the table and navigate to Format visual > Visual, expand the Values section and change the font size to 18, then expand the Column headers section and change the font size to 18. Then, to accommodate the new size of the table, move and resize the two visuals.

The next task is to highlight the most valuable information in the table. The profit is the most important information for the executives, and you can use color psychology to emphasize this section of the visual. Select the table visual and go to the Visualizations pane. In the columns list, select the drop-down arrow beside the profit margin column and move the cursor to Conditional formatting in the drop-down list. This opens a submenu; Font color is what is needed from this list. This opens the font color formatting dialog box for the profit margin column values. Select Rules from the Format style drop-down menu and select "Values only" from the "Apply to" section. Profit margin is selected under the "What field should we base this on?" section; leave this column selected. Next, define the rules. For the first rule, select the greater-than-or-equal-to symbol and enter zero for the value, then select Number from the drop-down list. In the "and" part of the rule, select less-than, type Max for the value, and select Number from the drop-down list. Finally, in the "then" part of the rule, select the green color from the theme color selection. To set up the second rule, select the plus icon to add a new rule to the list. In the first control, select greater-than-or-equal-to from the drop-down list and remove the zero; it will automatically select Min. Then select Number from the drop-down list. In the "and" part of this rule, select less-than, enter zero for the value, and select Number from the drop-down list. Finally, select a red color from the theme colors and select OK. The conditional formatting will change the color of the text to red if the profit margin is in the negative range. This is the format that the company executives expect, as it allows them to quickly assess this part of the report.
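As an aside, the same effect can be achieved with a measure instead of rules: conditional formatting also offers a "Field value" format style that reads a color name or hex code from a DAX expression. The following is a minimal sketch, assuming your model already has a [Profit Margin] measure; the measure name and the colors are illustrative.

```dax
Profit Margin Colour =
IF (
    [Profit Margin] >= 0,
    "Green",  -- break-even or better
    "Red"     -- negative margin
)
```

Selecting Field value as the format style and pointing it at a measure like this applies the returned color per value, which can be easier to maintain than rebuilding rule lists across several visuals.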
To colorize the column chart representing the profit margin, select the chart and, in the Visualizations pane, navigate to the Format visual tab and expand the Columns section, where you can assign individual colors to each column. Select a red color for financial year 2022 and keep the green for 2020 and 2021. Finally, change the text size of the column chart to 12 points; this means, in Format visual, changing the font size for the x-axis values, y-axis values, title, and data labels. That example transformed a report lacking clear visuals and without Adventure Works branding into an attention-grabbing report through the intelligent use of colors. As a report designer, understanding the key role of color is crucial to creating visually compelling and impactful work.

You get a report or a page of information on your screen; how do you decide if the content is important enough for you to read? Many designers include headlines, subheadings, and other design devices such as callouts. Elements like these highlight key parts of the information, allowing you to decide faster if the content is relevant to you. You'll use similar tactics in your Microsoft Power BI report and dashboard designs. Over the next few minutes, you will be introduced to the concepts of positioning and scaling. By strategically placing and sizing visual elements such as charts, tables, and text, you guide the viewer's attention and indicate the level of importance of the information. Let's say you are asked to create a complex report for Adventure Works to present the company's annual revenue growth by region. To achieve effective positioning and scale, you place a bar chart in the middle of the report, clearly displaying revenue figures for each region. To provide additional context, you position a map visualization alongside the bar chart, showing the geographic distribution of revenue growth. By placing the two different visual elements together, you enable viewers to make connections between regions and their respective revenue performance. For the most effective delivery, you must plan your report: think about the positioning of different portions of data, use scaling techniques, and create a good user experience.

Positioning is the strategic placement of visual elements within a report to guide the viewer's attention and convey key information. It's essential to consider the flow of information and the logical sequence in which the audience will consume it; the placement of data and insights can significantly impact how they are perceived. For example, when presenting sales figures for Adventure Works' latest product line, you would position the most important metrics, such as revenue and units sold, at the top of the report. This ensures that viewers immediately grasp the success of the product line before diving into further details. Additionally, you must pay attention to the logical flow of information: arrange the sections of the report in a way that follows a natural progression, enabling viewers to easily navigate through the data. Supporting details, such as product specifications or regional sales performance, are strategically positioned below the main metrics, providing contextual information to support the overall narrative.

Now let's explore scaling. Scaling refers to the relative size and proportions of visual elements within a report. Finding the right scale is crucial for ensuring readability and visual clarity. Headings and titles are carefully sized to be larger and bolder, drawing the viewer's attention to important sections. For instance, when showcasing the company's quarterly sales performance, you can use a larger font size for the title to make it stand out and capture the viewer's interest. In contrast, data labels and annotations are scaled down to avoid overwhelming the viewer with unnecessary information. Additionally, the scale of charts and graphs should be carefully considered to represent the data accurately: axis labels, tick marks, and legends should be appropriately sized and positioned for easy interpretation. By maintaining consistency in the scale of measurement across multiple charts and graphs in your reports, you enable viewers to make meaningful comparisons and draw insights effectively. Overall, the positioning and scale of information in report design should aim to create a visually pleasing and intuitive experience for your audience. By effectively organizing and presenting data, you can enhance understanding, facilitate analysis, and effectively convey your message. For report design, mastering the art of positioning and scale is vital. By considering the logical flow, emphasizing key information, and balancing scale, you create visually compelling and informative reports that captivate viewers. As a data analyst, adopting these principles can elevate your report designs and effectively communicate insights to your audience.

Adventure Works has a salesperson performance Microsoft Power BI report with total sales and quantity sold; however, the visuals are randomly positioned and the information is overwhelming. The task is to redesign the report to better present the data. Let's explore how this is done. The report contains a clustered column chart showing total sales by year and salesperson, a clustered chart showing quantity by salesperson, a card showing the top three salespersons, and the company logo. The first issue with the current report is the density of information presented in a single visual; for example, the column chart of total sales by year and salesperson is busy with too much information. The second is that all the visuals are randomly located on the report canvas. To begin the redesign, in the View tab, activate the Accessible City Park theme from the theme drop-down options. Themes are standardized color schemes that can be applied to your entire report to maintain consistency. The accessibility support in this theme includes a color palette that provides contrast between content, background, and adjacent colors, so the text and graphics are legible. To ensure accessibility for the broadest range of consumers, increase the font size and change the font color throughout the report to maximize visibility and contrast. To make the text color of the axis titles and labels consistent throughout the report, customize the theme: navigate to the View tab and, in the Themes drop-down, select "Customize current theme". The Customize theme dialog appears; select Advanced from the middle pane, select a black color for the second-level elements, and select Apply.
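Theme customizations like this can also be captured directly in the theme JSON rather than through the dialog. A hedged sketch using the textClasses section of the report theme format follows; the label and title class names are part of that format, while the specific colors, sizes, and theme name here are just placeholders and may not correspond exactly to the dialog's "second-level elements" setting.

```json
{
  "name": "City Park Customized",
  "textClasses": {
    "label": { "color": "#000000", "fontSize": 12 },
    "title": { "color": "#000000", "fontSize": 18 }
  }
}
```

Importing a file like this applies the label and title text settings across the whole report, which keeps axis and title text consistent without editing each visual individually.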
Then select the total sales by year and salesperson column chart. In Visualizations > Build visual, scroll down and remove the salesperson field from the Legend section; the legend is busy with too much information in a small area. The primary objective of the chart is to show the total sales per salesperson, and by removing the salesperson field and creating a slicer, we can present the same information with cleaner, clutter-free visuals. Resize the column chart and drag it to the left of the canvas. Then navigate to Visualizations > Format visual > Visual, expand the X-axis, and scroll down to turn the Title toggle to the off position. Move the second chart out of the way for now. For the first chart, go to Visualizations > Format visual > Visual, expand the Columns section, and select fx to open the conditional formatting dialog box. In the dialog box, select the total sales from the drop-down of the "What field should we base this on?" section, then select the black color for the lowest values. Check "Add a middle color" and select a green color for the mid-value section, select the darker green color for the highest-value section, then select OK to finish setting up conditional formatting. The conditional formatting converts the columns to the shades of green and black that you specified, with the shade based on the column value. It also adds a color legend to the column chart; the legend is an unnecessary element in this chart and can be deleted to make the design cleaner. To remove it, go to Visualizations > Format visual > Visual > Legend and turn the toggle to the off position. Finally, change the text size of the chart's x-axis, y-axis, and data labels to 12 points.

As the original visual was created to represent the salespersons' performance, add a salesperson slicer to the report. To do this, from the Data pane, bring the salesperson field from the salespersons table to the report canvas and select the slicer option from the Visualizations pane. With the slicer selected, go to Visualizations > Format visual > Visual > Slicer settings > Options and, from the Style drop-down list, select the Dropdown choice. Resize the slicer and drag it to the top-right position of the report canvas. Next, select the sum of quantity by salesperson column chart and replace the salesperson field on the x-axis with the year field from the order date column of the sales table. The reason for this change is that we have a salesperson slicer, and we can create consistency between it and this chart by having year on the x-axis; the salesperson slicer will then interactively present the sales generated by each salesperson in each year. From Visualizations > Format visual > General, expand Title and rename the chart "Quantities sold", rename the y-axis label "Quantity sold", then remove the x-axis title. Apply conditional formatting to the column colors, remove the color legend, and change the text size. The column chart is resized to the same size as the previous one and dragged into position parallel with the previous visual.
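The next step positions the top three salespersons card. A card like this is typically driven by a DAX measure; here is a minimal sketch of one way such a measure might be built, assuming hypothetical Salesperson table and column names and an existing [Total Sales] measure.

```dax
-- Returns a comma-separated list of the three salespersons
-- with the highest [Total Sales] in the current filter context.
-- (TOPN may return more than three names if there are ties.)
Top 3 Salespersons =
CONCATENATEX (
    TOPN ( 3, VALUES ( Salesperson[Salesperson] ), [Total Sales], DESC ),
    Salesperson[Salesperson],
    ", ",
    [Total Sales], DESC
)
```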
Next, resize and drag the top three salespersons card below the slicer and adjust the position and size accordingly. For better visibility and accessibility, change the text size and color of the salesperson names on the card: go to Format visual > Visual, expand the Card section, change the title font size to 18, and change the color to black. Finally, drag the Adventure Works logo to the top left of the canvas and add a report title of "Salesperson's performance". The report now has a structured layout with a logical flow of all the information originally presented. This report demonstrates that proper positioning and information-density adjustments improve comprehension and engagement. Placing visual elements well, optimizing scale, and ensuring clarity of labels allows organizations to effectively communicate insights and make data-driven decisions.

In the realm of report design, the organization and presentation of information play a crucial role in capturing the attention of viewers. In this video, you will explore the concept of cohesive pages and the importance of striking the right balance between chaos and cohesion in report design. Drawing inspiration from Adventure Works, you will delve into how thoughtful design choices contribute to cohesive pages that effectively convey information and captivate audiences. Before going into the dynamics of chaotic versus cohesive pages, let's recap the significance of cohesion in report design. In previous videos, you learned how elements such as color, positioning, and visual hierarchy contribute to cohesive designs. By utilizing consistent color palettes, strategic positioning of elements, and a clear visual hierarchy, designers can create reports that are visually appealing, easy to navigate, and convey a unified message. Consider that your company, Adventure Works, needs to showcase its product lines' performance across different regions in a report. To create a cohesive page, you need to employ a clean and structured layout. You have to utilize consistent color schemes, such as using brand colors to highlight important information and differentiate regions, while graphs and charts are thoughtfully positioned, aligned, and scaled to facilitate easy interpretation. In this scenario, a chaotic page would feature disorganized graphs, overlapping text, and a mix of unrelated colors, leading to confusion and a lack of clarity. Chaotic pages suffer from a lack of structure, coherence, and intentionality. They are characterized by cluttered layouts, conflicting color schemes, and elements positioned inconsistently. Chaos not only hampers visual appeal but also creates confusion and hinders effective communication of information. In an Adventure Works report, a chaotic page may include confusing graphs, overlapping text, and inconsistent use of color, making it challenging for viewers to understand the intended message. When working for Adventure Works, you recognize the significance of cohesive pages and strive to create designs that engage and inform viewers effectively. By adopting cohesive design principles, you ensure that your reports are visually appealing, organized, and easy to navigate. For example, when presenting quarterly sales performance, you carefully arrange key metrics in a logical flow, utilizing a consistent color palette that aligns with the brand identity. This approach creates a cohesive page that guides viewers through the information in a structured and comprehensible manner. Adventure Works demonstrates how thoughtful design choices contribute to cohesive pages: you ensure that fonts, colors, and other visual elements align with the brand identity, creating a consistent and recognizable aesthetic throughout your reports. By utilizing white space effectively, you allow elements to breathe and improve readability. Clear headings and subheadings, along with intuitive navigation elements, further enhance the overall cohesion and user experience.
By incorporating these steps into your report design process, you can improve cohesiveness and create visually appealing reports that effectively communicate information. Cohesiveness is not just about aesthetics but also about facilitating understanding and engagement for the intended audience. Creating a clear visual hierarchy is essential for guiding viewers through the report and highlighting key information: use font size, color, and formatting to differentiate between headings, subheadings, and body text, and ensure that the most important elements stand out and draw the viewer's attention. Adopting a consistent color scheme throughout the report enhances cohesiveness and strengthens brand identity. Choose a color palette that aligns with the company's branding guidelines and use it consistently across charts, graphs, text boxes, and other visual elements; this consistency helps establish visual harmony and reinforces the overall design aesthetic. Pay attention to the positioning of elements within the report: ensure that related information is grouped together logically and presented in a sequential manner, and use alignment and spacing techniques to create a sense of order and structure. Avoid cluttering the page with unnecessary elements and maintain sufficient white space to enhance readability and visual appeal. Utilize grids and guides as design aids to achieve precise alignment and spacing; grids help maintain consistency and alignment across different sections of the report, while guides assist in positioning elements accurately. These tools provide a framework for maintaining cohesiveness and ensuring that elements are visually aligned. Consistency in typography is crucial for creating a cohesive look and feel. Choose fonts that are legible and align with the overall design style, use a limited number of font styles and sizes to maintain consistency throughout the report, and consider the readability of the chosen fonts to ensure they are suitable for the target audience. Regularly review and refine your report design to identify areas for improvement: seek feedback from colleagues or stakeholders to gain fresh perspectives, analyze the report's effectiveness in communicating the intended message, and make the necessary adjustments to enhance cohesiveness. Continuous improvement is key to achieving optimal results. In the dynamic world of report design, finding the balance between chaos and cohesion is essential for creating engaging and impactful pages. By recapping the importance of cohesion, exploring chaotic examples, and showcasing best practices, you have gained insights into how color, positioning, and other design elements contribute to the creation of cohesive pages. As you embark on your own report design journey, remember the value of cohesive pages: thoughtful design choices, including consistent color schemes, strategic positioning, and attention to visual hierarchy, can elevate your reports and captivate your audience. By creating designs that balance order and clarity, you will effectively communicate your message, empower viewers with valuable insights, and leave a lasting impact.

Let's take a poorly designed sales performance report and redesign it into a cohesive report. The report view of Power BI Desktop displays a sales performance report called adventurework sales.pbix. The report is poorly designed, with randomly placed visuals, and lacks coherence. The redesign will change colors, reposition and scale visuals, and format text.
The report contains two line charts, one funnel chart, two card visuals, a logo, and a report title. The first step is to change the theme: from the theme drop-down, activate the Accessible City Park theme to ensure accessibility and impose a consistent style; the theme contains colors that satisfy accessibility requirements. Customize the theme to enhance the label and axis colors and, to ensure accessibility for the broadest range of consumers, increase the font size and change the font color throughout the report to maximize visibility and contrast. Now drag the company logo to the top left of the report canvas, and drag the title box to align with the logo. Change the color of the title to black and make the text bold to align with the color palette of the theme. Select the Sum of Total Sales card visual and rename its title to "Revenue" to match the intent of the data. In Visualizations > Format visual > General > Effects, change the background to theme color 2. Both cards will have the same background color, differentiating them from the report background and letting the viewer know that they both hold related data and contain the most valuable information. In Visualizations > Format visual > Visual > Callout value, change the font size to 32 and change the color to white to indicate the importance of this item; for the category label, change the color to white and the font size to 18 for better visibility against the new background. Then repeat these steps for the Sum of Quantity card visual and rename it "Units sold". Now reposition both card visuals to the top right of the canvas and make sure they are the same size, because they are of equal importance; you can rescale a card by selecting and dragging any side of the visual.

Next, select the sum of total sales by month line chart and rename it with the more appropriate title "Revenue by month". Remove the x-axis title by turning the Title toggle to the off position: navigate to Visualizations > Format visual > Visual, expand the X-axis, and scroll down to turn the Title toggle off. The x-axis represents monthly sales with the month name, so the month title on the axis does not add any relevant information. Rename the y-axis "Total sales USD" to clarify the sales details and currency. Now add gridlines to the line chart: in Visualizations > Format visual > Visual > Gridlines, select dashed as the style and black as the color. Next, select the sum of total sales by month and country line chart and change its title to "Revenue by country". Remove its x-axis title as done in the previous chart, and rename the y-axis "Total sales USD". Next, format the legend: navigate to Visualizations > Format visual > Visual and scroll down to the Legend section. There, turn the Title toggle to the off position, change the text size to 12 points, and select the top-right position from the Position drop-down list. The legend title is redundant because the country names provide sufficient information. Add gridlines to match the other visuals; ensuring items such as titles, legends, axis values, and font sizes are formatted consistently across all the visuals helps report cohesion. Select the funnel chart and rename its title to "Revenue by category". In Visualizations > Format visual > Visual, turn the "Conversion rate label" toggle off, as this is not relevant to the sales. Then go to Visualizations > Format visual > Visual, expand the Color section, and select fx to open the conditional formatting dialog. In the dialog, select the total sales from the drop-down of the "What field should we base this on?" section.
Then select a blue color, called theme color 5, for the lowest values. Check "Add a middle color" and select the mid-green theme color 1 for the mid-value section, select the dark blue theme color 2 for the highest-value section, and select OK to apply. Conditional formatting converts the bars to shades of blue in descending order of sales amount, with dark blue representing the highest sales values. Next, change the text size of the funnel chart to 14 points for better accessibility and visibility. Likewise, change the font size of the axis titles and labels of both line charts to 12 points. Finally, rescale and reposition the visuals, making sure the distance between the visuals is equal to maintain design integrity; adjust the position by dragging, and rescale by selecting and dragging any side of a visual. It's good practice to review your work and possibly invite comments from colleagues. A quick review right now suggests some slight improvements; for instance, to finish, increase the size of the titles on each chart to 18 points. That's a demonstration of how to create cohesion in a report by applying and customizing an accessible theme, ensuring consistent formatting for all visuals, and scaling and positioning visuals in a logical, hierarchical way to deliver a coherent data story.

Imagine you're planning a musical performance, but you are playing for two different audiences: one a group of classical music enthusiasts, the other a crowd of young, energetic music lovers. Satisfying both audiences is a challenge, and it's like the challenge you have when presenting data: understanding your target audience is crucial, and catering to their unique needs is the key to success. It's impossible to please everyone, but the data must be readily understood by the majority, with essential insights highlighted for your specific audience. A key visualization success factor is understanding the audience. You must tailor presentations to the specific needs and preferences of the target audience, that is, the specific group of people that your content is intended to reach: the group of individuals most likely to be interested in or benefit from your data. Identifying and understanding the target audience is essential for communication and allows tailored strategies that connect with this specific group's preferences, needs, and characteristics. Every audience has unique characteristics, including their level of technical expertise, roles and responsibilities, demographic information, and other specific needs. In this video, you will explore the importance of knowing the audience and how the characteristics of your target audience influence the creation of your data presentation. Because of their characteristics, you may be able to identify an audience's needs: an executive board needs high-level summaries and key performance indicators, while a marketing team wants detailed customer insights and marketing analytics. When considering the target audience for a report or presentation, assess some factors that will help identify the audience's characteristics and needs, enabling you to tailor your design to meet their specific requirements. Here are some key factors to consider. Identify the different roles or job functions of the potential users: for example, are they executives, analysts, marketers, or sales representatives? Each role may have distinct data requirements and preferences. Determine the audience's level of expertise and familiarity with the subject matter or the software being used: are they beginners, intermediate users, or advanced professionals? This helps you gauge the complexity of the information and the level of detail needed.
Understand the goals and objectives of the audience: what specific information or insights are they seeking? For example, executives may be interested in high-level performance summaries, while analysts may require more detailed data for in-depth analysis. Determine the specific information needs of the audience: what kind of data or metrics are most relevant to their decision-making process? For instance, marketing teams may focus on customer demographics and campaign performance, while finance teams may require financial metrics and profitability analysis. Consider the preferred communication style of the audience: some individuals prefer visual representations and charts, while others prefer textual reports or interactive dashboards; adapting your content to their preferred format enhances engagement and understanding. Assess cultural and demographic factors influencing the audience's preferences and understanding, including language preferences, cultural nuances, and accessibility considerations. Recognize the time constraints of the audience: are they busy executives who require concise and summarized information, or do they have more time for in-depth exploration? Tailoring the level of detail and presentation format can ensure that the information is effectively conveyed within the available time frame. By considering these factors, you can gain valuable insights into the target audience and align your report or software design to meet their specific needs.

Once the target audience is identified, the next step is to use data visualization techniques to address audience requirements. It's important to find the right balance between providing the required data and ensuring that it is understood by most of the audience. When creating for diverse audiences, it is crucial to simplify complex concepts and avoid jargon or technical terms that may be unfamiliar to non-technical stakeholders. Adventure Works, for instance, may use clear and concise language to explain intricate manufacturing processes or market trends with which its internal team would be familiar; however, external partners or users from outside the company may be unfamiliar with manufacturing processes, and therefore the technical terms should be avoided. It's important to identify and highlight the most relevant insights for the target audience. For instance, when presenting to the executive board, the focus may be on financial performance, market share, and strategic initiatives; on the other hand, when presenting to the marketing team, you can focus on customer behavior, campaign effectiveness, and market segmentation. By tailoring the content to the specific interests of each audience, data presentations become more engaging and actionable. Incorporating examples and scenarios that your audience is familiar with can help them connect with the data. When presenting to the executive board, a case study on the success of a recent product launch, or a comparison of sales performance across different geographic regions, can provide valuable insights. Similarly, presenting market research findings or customer feedback to the marketing team can help them fine-tune their strategies and campaigns. Knowing the audience is vital in creating impactful data presentations. By understanding the target audience's needs, preferences, and roles within the organization, data analysts can tailor their presentations to ensure maximum impact and understanding.
Focusing on simplifying complex concepts, highlighting relevant insights, and using real-world examples specific to the audience can significantly enhance the effectiveness of data presentations.

Balloons are great fun at every party; they brighten the room and raise the celebration mood. But the same balloons that you used at a retirement function won't work as well at a kid's birthday party; for that party, you'll have balloons in different shapes and colors. It's the same situation when it comes to presenting data: designing with the end user in mind is the key to success in data visualization, and the age range of the target audience is a vital consideration. Age-related design considers the unique needs, preferences, and capabilities of different age groups. In this video, you'll explore the significance of age-related design in Microsoft Power BI and discover specific considerations when designing visualizations for younger children (aged 5 to 12), teenagers, adults (aged 18 to 64), and older adults (aged 65 and above). Before exploring age-related design considerations, let's briefly revisit the fundamentals of color theory. Color plays a crucial role in data visualization, evoking emotions, conveying meaning, and aiding comprehension. When designing for different age groups, it's important to select colors that are visually appealing to the group, easily distinguishable, and aligned with the intended message. Now let's examine age-related design in detail. Designing for younger children requires a simplified and engaging approach. Use vibrant and engaging colors: younger children are attracted to bright and bold colors, and a visually stimulating color palette can capture their attention and enhance their engagement. Use simple and intuitive icons: complex visual elements can overwhelm young children, so choose simple and recognizable icons that are easy to interpret. Interactive features, such as buttons or draggable elements, make the experience more interactive and enjoyable for young users. Incorporate playful illustrations and characters; for example, Adventure Works could use animated bicycle characters or friendly animal mascots in their visualizations to make the content more relatable. Tell a story through the data to capture the imagination of younger children: Adventure Works could create a virtual journey, such as showcasing different bicycle models in colorful and visually appealing environments. For adults, use a clean and professional design. Choose a visual style that meets the target audience's expectations and avoid excessive use of playful elements or overly casual designs. Ensure the visual elements have sufficient contrast and use clear, readable typography for easy comprehension: use text that is clear, legible, and easily readable, choosing appropriate font sizes, typography, and contrast to enhance readability. Adults appreciate a clear and intuitive user interface: use logical navigation structures, like menus and breadcrumbs, to help users quickly navigate the content, and streamline the user interface to minimize complex interactions. Consider the audience's needs for efficient data analysis and decision-making: design dashboards and reports that provide relevant information quickly and concisely. Incorporate advanced visualizations appropriately, considering advanced charts, graphs, and interactive elements to provide deeper insights and facilitate data exploration. Allow users to personalize their dashboards or reports according to their preferences and priorities; providing customization options can enhance user engagement and satisfaction.
Designing for older adults requires additional focus on clarity, legibility, and ease of use. Use large and well-spaced elements: aging eyes may struggle with small text or densely packed visuals, so enlarge fonts and provide ample spacing between elements to enhance readability and prevent visual clutter. Designing for different age groups requires consideration of their unique characteristics and needs. By incorporating age-related design principles, you can create Microsoft Power BI visualizations that cater to the specific requirements of groups like younger children and older adults. From vibrant colors and interactive elements for children to clear typography and simplified interactions for older adults, every design decision should prioritize the target audience's ease of understanding and engagement. Age-related design is one important aspect of creating inclusive and compelling visualizations; continually exploring and understanding the needs of diverse user groups will help you focus the features of Power BI to deliver impactful and accessible data visualizations for all.

Imagine you're preparing a delicious meal, carefully selecting the finest ingredients; your focus is on the flavors that will make the meal great. In a similar way, when presenting data, focusing on the key details is crucial: much like those food ingredients, your audience craves the most relevant and impactful insights. Prioritizing key information ensures your message fulfills and satisfies the audience. Understanding the needs and preferences of your audience allows you to focus on the most relevant data points, highlight outliers, and provide the right level of detail for effective communication. In this video, you will explore the importance of prioritizing key information in Microsoft Power BI and how it can enhance data insights for your audience. Before exploring the details of prioritizing, it is vital to know your audience and their specific needs. For instance, presenting to the executive board requires a high-level overview with emphasis on the big picture and key insights, while presenting to a sales team may require more detailed information about performance evaluation. Consider a report for the executive board with an overview of quarterly sales and an emphasis on product categories, where the data also indicates that the executives need to focus on France and the United Kingdom for their marketing efforts. By understanding your audience, you can tailor the presentation to their specific needs, ensuring that the key information is appropriately highlighted. It allows you to customize the content, format, and level of detail in your presentation; by adapting the presentation to the preferences, knowledge level, and goals of the sales team, you increase the chances of delivering a compelling message that meets their needs. When presenting data, it is essential to capture the attention of your audience quickly. By focusing on headlines, the most important findings and trends, you can convey the main message effectively. In the case of Adventure Works' annual sales report, key headlines may include overall revenue growth, top-selling product categories, and regions with significant sales increases; by highlighting these headlines, you provide a clear and concise overview that immediately grabs the audience's attention. In any data set, there are often outliers: data points that deviate significantly from the norm. These outliers can provide valuable insights or indicate areas that require attention. By highlighting them visually, such as with color or annotations, you draw the audience's focus to these critical data points. For example, Adventure Works may have a particular product that experienced a sudden spike in sales, or a region that underperformed compared to others; by highlighting these outliers, you prompt further exploration and discussion, ensuring that the audience does not overlook essential information.
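One way to implement this kind of outlier highlighting is, again, through a color measure used with "Field value" conditional formatting. The sketch below flags categories that sit more than two standard deviations from the mean; the 'Product' table, the [Total Sales] measure, the threshold, and the hex colors are all illustrative assumptions, not names from the course files.

```dax
Sales Outlier Colour =
VAR MeanSales = AVERAGEX ( ALL ( 'Product'[Category] ), [Total Sales] )
VAR SpreadSales = STDEVX.P ( ALL ( 'Product'[Category] ), [Total Sales] )
RETURN
    IF (
        ABS ( [Total Sales] - MeanSales ) > 2 * SpreadSales,
        "#D13438",  -- highlight outlying categories in red
        "#605E5C"   -- keep typical categories a neutral grey
    )
```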
While headlines and key findings are crucial, it is also essential to provide access to detailed information for closer inspection when appropriate. Different audience members may have different levels of expertise or specific questions that require a deeper dive into the data. In tailoring presentations, the availability of detailed information for closer inspection should be carefully considered, aligning with the needs and preferences of the specific audience. For instance, for an annual sales report from Adventure Works, a presentation to the executive board may emphasize high-level trends, revenue figures, and strategic directions, while a presentation to the sales team might delve into granular details like regional performance, customer segments, and sales targets. Adapting the level of detail ensures that each audience receives the information that aligns with its decision-making requirements, optimizing the impact of the presentation. Microsoft Power BI allows for interactive exploration, where users can drill down into specific data points or filter the information based on their interests. By providing this level of detail, you enable further analysis and empower your audience to extract insights relevant to their specific needs. The definition of significant information can vary across different audiences: what may be crucial for one group may not be as relevant to another. Therefore, it is crucial to adapt your presentation to align with the preferences of your audience. For example, the executive board may prioritize overall revenue and market share, while the sales team may be more interested in product-specific details or customer segmentation. By understanding these preferences, you can ensure that the key information presented is meaningful and resonates with your audience. Prioritizing key information in Microsoft Power BI is a critical skill for effective data visualization and communication. You can enhance data insights by understanding your audience, focusing on headlines, highlighting outliers, providing access to detailed information, and adapting to audience preferences. The key to successfully prioritizing information is understanding your audience and tailoring your presentation to meet their specific needs.

Picture a vault where your most valuable possessions are stored. Now imagine that this vault doesn't have a strong lock, leaving your treasures vulnerable to theft. Just as you'd prioritize security for your valuables, safeguarding data is paramount in our digital age. Data, the lifeblood of modern organizations, is subject to a range of threats: cyber attacks, breaches, and unauthorized access. Ensuring the security of this digital gold mine isn't just a choice, it's a necessity. Let's explore the world of data security, where the keys to protection lie in understanding the risks, implementing robust measures, and fostering a culture of vigilance. In the world of data visualization, ensuring the security of data is of utmost importance. From protecting sensitive information to maintaining data integrity, incorporating robust security measures is crucial. In this video, you will explore the significance of security in data visualization and discuss key considerations for safeguarding data throughout the visualization process.
Adventure Works, a fictional multinational bicycle manufacturer, is used as an example to illustrate the concept of data security in practice. Data visualization often involves working with sensitive information, such as customer data, financial records, or proprietary business insights. Ensuring the security of this data is essential to maintain trust, comply with regulations, and protect against unauthorized access or data breaches. Let's examine the key aspects of security in data visualization. Controlling access to data is vital to ensure that only authorized individuals can view or interact with specific data sets. By implementing role-based access control, data can be restricted, or served in a controlled manner, to the individuals who need to access it. This helps protect sensitive information and reduces the risk of unauthorized data exposure. Additionally, access logs and audit trails can be implemented to track and monitor data access, providing accountability and visibility into data usage. At Adventure Works, you implement role-based access control to ensure that sensitive data is accessible only to authorized individuals in data visualization processes. For instance, the finance team has access to financial data, while the marketing team can view customer demographics for targeted campaigns. This granular access control prevents unauthorized individuals from accessing data beyond their scope, safeguarding sensitive information.

Anonymizing data is an effective technique for protecting privacy and confidentiality. By removing personally identifiable information or replacing it with pseudonyms, the data can be used for analysis and visualization while preserving privacy. Anonymization techniques, such as generalization, suppression, and noise addition, ensure that individuals cannot be identified from the data. Generalization involves simplifying or aggregating data to a higher level of abstraction, often to protect privacy or reduce complexity. Suppression is the deliberate removal of certain data elements to prevent identifying individuals or sensitive information. Noise addition introduces controlled random variation into the data to make it more challenging to deduce specific details about individuals or confidential data. These techniques are commonly used in data anonymization and privacy preservation to strike a balance between sharing useful information and safeguarding sensitive details, ensuring data remains useful while reducing the risk of privacy breaches. Organizations should follow best practices and guidelines for data anonymization, considering factors such as the nature of the data, regulatory requirements, and the intended use of the visualizations. At Adventure Works, you conduct market research and collect customer feedback. To protect customer privacy, you employ data anonymization techniques when visualizing the data: personal information, such as names, addresses, and contact details, is replaced with pseudonyms or aggregated to preserve anonymity. This allows Adventure Works to analyze and present valuable insights without compromising the privacy of its customers.
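To make generalization concrete, a calculated column can band a sensitive value into coarser groups so that reports never expose the underlying detail. A minimal sketch, assuming a hypothetical Customer table with an Age column:

```dax
Age Band =
SWITCH (
    TRUE (),
    Customer[Age] < 25, "Under 25",   -- report on bands, not exact ages
    Customer[Age] < 45, "25 to 44",
    Customer[Age] < 65, "45 to 64",
    "65 and over"
)
```

Visuals built on the Age Band column convey the same demographic pattern while withholding exact ages, which is one simple form of generalization.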
Maintaining data integrity is crucial to ensure the accuracy and reliability of the visualized information. Data integrity aspects include data validation, error detection, and consistency checks. Data validation involves verifying the accuracy and integrity of input data to ensure it meets predefined criteria. Error detection focuses on identifying mistakes or anomalies in data, helping prevent erroneous information from causing problems. Consistency checks ensure that data conforms to established standards or matches other related data, maintaining a reliable and cohesive data set. These practices collectively help maintain data quality, minimize errors, and ensure that information is reliable and useful for decision-making and analysis. Implementing data validation rules and performing regular audits helps identify and rectify any anomalies or inconsistencies in the data, ensuring the visualizations reflect accurate and reliable insights. Furthermore, employing data encryption techniques can prevent unauthorized modification and tampering of the data, maintaining its integrity throughout the visualization process. At Adventure Works, you prepare quarterly reports on sales performance, which are shared with the executive board. To ensure data integrity, you implement data validation checks to detect any anomalies or errors in the sales data. By cross-referencing the data with your customer relationship management (CRM) system and performing consistency checks, Adventure Works ensures the accuracy and reliability of the visualized sales information. This data integrity gives the board confidence in making informed decisions based on reliable insights.

When transferring data between different systems or sharing visualizations with stakeholders, it is essential to prioritize secure data transmission. Using encrypted connections, such as HTTPS or SSL/TLS, ensures that data is encrypted in transit, making it difficult for unauthorized individuals to intercept or manipulate the data. HTTPS (Hypertext Transfer Protocol Secure) is a protocol that provides secure communication for website connections, allowing user data to be transmitted in an encrypted manner. This encryption relies on security protocols such as Secure Sockets Layer (SSL) or Transport Layer Security (TLS). SSL/TLS is used to ensure privacy and integrity during data transmission over the internet, protecting user data from malicious attacks and ensuring its security. These protocols enhance users' online experience by providing a more secure environment for conducting online transactions and sharing sensitive information. Additionally, organizations should consider secure file-sharing methods, such as using virtual private networks (VPNs) for connections, two-factor authentication (2FA) for authenticating users, Microsoft OneDrive for Business, Google Workspace, or Dropbox Business for enterprise-level cloud storage, and secure protocols like Secure File Transfer Protocol (SFTP). They should also utilize secure cloud-based platforms for distributing visualizations, ensuring data remains protected throughout its journey. Adventure Works collaborates with external partners and distributors, sharing visualizations and sales data for joint business planning. To ensure secure data transmission, you utilize encrypted connections, such as SSL/TLS, when sharing sensitive information over the internet; this encryption protects the data from unauthorized access during transit, maintaining the confidentiality and integrity of the shared visualizations and data.

Data visualization often involves working with data that is subject to legal and regulatory requirements, such as the General Data Protection Regulation (GDPR). Compliance with these regulations is crucial to protect individuals' rights and meet legal obligations. Data visualization practices should adhere to the relevant regulations, including obtaining appropriate consent, anonymizing data when necessary, and implementing the necessary safeguards.
When transferring data between different systems or sharing visualizations with stakeholders, it is essential to prioritize secure data transmission. Using encrypted connections such as HTTPS or SSL/TLS ensures that data is encrypted during transit, making it difficult for unauthorized individuals to intercept or manipulate it. HTTPS (Hypertext Transfer Protocol Secure) is a protocol that provides secure communication for website connections, allowing user data to be transmitted in encrypted form. This encryption relies on security protocols such as Secure Sockets Layer (SSL) or Transport Layer Security (TLS), which are used to ensure privacy and integrity during data transmission over the internet, protecting user data from malicious attacks. These protocols enhance users' online experience by providing a more secure environment for conducting online transactions and sharing sensitive information. Additionally, organizations should consider secure file sharing methods: using virtual private networks (VPNs) for connections, using two-factor authentication (2FA) for authenticating users, using Microsoft OneDrive for Business, Google Workspace, or Dropbox Business for enterprise-level cloud storage, using secure protocols like Secure File Transfer Protocol (SFTP), and utilizing secure cloud-based platforms for distributing visualizations, ensuring data remains protected throughout its journey. Adventure Works collaborates with external partners and distributors, sharing visualizations and sales data for joint business planning. To ensure secure data transmission, you utilize encrypted connections such as SSL/TLS when sharing sensitive information over the internet. This encryption protects the data from unauthorized access during transit, maintaining the confidentiality and integrity of the shared visualizations and data.

Data visualization often involves working with data that is subject to legal and regulatory requirements, such as the General Data Protection Regulation (GDPR). Compliance with these regulations is crucial to protect individuals' rights and meet legal obligations. Data visualization practices should adhere to the relevant regulations, including obtaining appropriate consent, anonymizing data when necessary, and implementing necessary safeguards. Organizations should stay informed about evolving data protection regulations and ensure their data visualization processes align with the correct legal frameworks. Adventure Works operates in various regions with different data protection regulations. When visualizing data, they ensure compliance with relevant regulations such as GDPR: they obtain appropriate consent from customers, anonymize data where necessary, and implement security measures to protect personal information. This ensures that Adventure Works adheres to legal requirements and maintains the privacy rights of individuals.

Security is a fundamental aspect of data visualization, ensuring the confidentiality, integrity, and availability of data. By implementing robust security measures such as access control, data anonymization, data integrity safeguards, secure data transmission, and compliance with data regulations, organizations can build trust, protect sensitive information, and deliver reliable insights to their stakeholders. As the importance of data continues to grow, prioritizing security in data visualization is essential for maintaining the confidentiality and integrity of information in today's data-driven world.
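As a small illustration of the encrypted-transmission guidance above, the sketch below, in Python using only the standard library, opens an HTTPS connection with certificate verification enabled; the URL is a placeholder, not a real endpoint from the course.

import ssl
import urllib.request

# create_default_context enables certificate verification and
# sensible modern TLS settings by default.
context = ssl.create_default_context()

# Hypothetical endpoint; any HTTPS URL works the same way.
with urllib.request.urlopen("https://example.com", context=context) as resp:
    print("status:", resp.status)

The point is simply that the default, verified context should be used as-is; disabling verification to silence certificate errors defeats the protection TLS provides.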
Kim grew up in a small town in rural America. The town had seen better days: the region's economy was in decline, and there were few career prospects for a young woman. Kim had to stay in her hometown and take whatever jobs she could find. Luckily, she was an avid social media fan with a recent smartphone. The phone allowed her to connect online even though the town's wired internet connections were slow and often failed completely. She vented her career and life frustrations on social media, and very soon she got many suggestions for alternative careers and educational paths. Kim explored the opportunities available to her, taking advantage of the low barrier of entry offered by the internet. She used her phone and computer to take online courses and to research business ideas. She had an eye for fashion and makeup, an affinity for emerging styles, and an ambition to succeed. That combination led her to establish a business venture offering a few products online. Luckily for Kim, the launch of her online business coincided with the upgrade of the town's broadband to fiber connectivity. Yes, you can work from anywhere with an internet connection, but if you're at all competitive, it's nice to be somewhere that has fast internet speeds.

The world is now a global village. The internet is at the heart of this transformation and is an integral part of our everyday lives. That's why the need for better speeds and greater coverage has been felt around the world. In the USA, average connection speeds increased from 25 megabits per second in the past to over 100 megabits per second in recent times, largely due to the widespread adoption of fiber optic technology, which delivers faster speeds and improved coverage. Kim started slowly, but her business grew as more and more people in her small town began to connect to and use the internet because of its better speed. Her business expanded as the world grew more connected through fast internet connections. Kim started to use data from her customers to visualize and identify preferences and grow her business further. Despite the lack of local resources, Kim was able to run a global business from her small town. People in both rural and urban areas can access the internet easily, with predictable costs and 24/7 access, thanks to new technologies such as mobile broadband connections on 4G and 5G. When traveling, Kim can run her business using her smartphone connected to a cellular network or one of the many Wi-Fi hotspots supplied by cities across the world. The rise of global internet connectivity allowed Kim to access a wide array of resources. With fast access to a global network, she was able to stay up-to-date with the latest trends in international business. She made connections with professionals in other countries and was soon collaborating on new business deals and markets she couldn't have considered before. What was once an impossibility is now a reality for Kim. She continues to explore global internet connectivity and use customer data analysis to expand her international business and explore new opportunities.

Welcome to this high-level recap of the lessons covered this week. This summary will help you revise the concepts of visualization and design. During the course, various Adventure Works scenarios were used as real-life simulations of a multinational bicycle retailer operating in multiple countries. These scenarios are designed to facilitate understanding and provide relatability, and they will be mentioned again in this recap as you review color theory; positioning, scale, and density of information; chaotic versus cohesive pages; knowing the audience; age-related design; prioritizing key information; and security in data.

Color theory is a crucial guideline for mixing colors and understanding the visual impact of specific color combinations. It includes concepts like the color wheel, color harmony, color psychology, and color symbolism. By grasping these principles, you gain a powerful toolkit for crafting visually appealing and meaningful designs. The color wheel illustrates the relationships between colors, including primary, secondary, and tertiary colors, enabling you to navigate various color schemes for harmonious compositions. Color harmony focuses on arranging colors pleasingly in a design, achieved through complementary, analogous, triadic, or monochromatic combinations that enhance balance and impact. Color psychology explores how colors evoke emotions and influence behavior, helping you use colors strategically for specific messages; for example, yellow and orange can often evoke vibrant and energetic emotions. Symbolic meanings and cultural associations of colors are also essential, ensuring effective communication across diverse cultural backgrounds. Mastering color theory empowers designers to create captivating designs, effectively convey messages, and evoke desired emotions, making color theory a guiding force in transforming ordinary designs into extraordinary reports and dashboards.

Color is a fundamental component in report design and data visualization, impacting the quality and effectiveness of reports. Color influences emotions, perceptions, and the overall visual impact of your data visualization. Each color holds unique psychological associations and symbolic meanings, generating diverse emotional responses. For example, warm colors like red and orange convey energy, passion, excitement, and attention or warning, while cool colors like blue and green evoke calmness, serenity, and harmony. By skillfully selecting and combining colors, designers can effectively convey the intended emotional message in report design, while also considering cultural interpretations for global designs.
Positioning in report design involves strategically placing visual elements to guide the viewer's attention and convey essential information. Adventure Works recognizes the importance of this, ensuring key data points like revenue and units sold are prominently placed at the top of a report. The logical flow of information is also considered, with supporting details arranged beneath the main metrics, creating a natural narrative for easy navigation. Scaling information in report and dashboard design is also crucial for clarity, visual hierarchy, and emphasis. Proper scaling optimizes space, ensures responsiveness, and reduces cognitive load. Chart selection plays a pivotal role in optimizing the scale of information: for example, bar charts are used for presenting nominal and ordinal scales, while line charts work with interval and ratio scales. Once an appropriate chart is selected, all associated elements can be scaled proportionately according to the degree of emphasis. Overall, mastering the art of positioning and scale enhances report designs, creating engaging, informative reports that effectively communicate insights to the audience.

Positioning in design involves arranging visual elements to guide attention and convey messages effectively. Adventure Works understands this importance, ensuring key data is presented clearly and avoiding overcrowding. Techniques like grouping related information, consistent spacing, and visual hierarchy are employed to manage information density, while white space prevents clutter, allowing viewers to focus their attention. Aligning elements guides the narrative and helps the flow of information. Proper positioning and information density are crucial in data visualization for comprehension and engagement, enabling organizations to communicate insights efficiently.

Cohesive page design is crucial, contrasting with chaotic layouts that lack structure and coherence. Cohesive designs engage viewers, utilize clear visual hierarchies, and maintain a consistent color scheme aligned with the brand identity. Thoughtful positioning, effective use of white space, and strategic typography contribute to organized, visually appealing reports. The incorporation of grids, guides, and regular reviews will refine the design, ensuring a cohesive presentation of information. By mastering these principles, you create compelling reports that communicate effectively and leave a lasting impact on your audience.

The crucial first step in creating a successful report or presentation is identifying the target audience's unique characteristics, such as their roles, expertise, goals, information needs, and preferred communication style. Adventure Works, for instance, uses clear language and visualization elements to explain complex concepts while highlighting relevant insights for different groups, such as the executive board or the marketing team. Where possible, incorporate real-world examples and scenarios to help the audience connect with the data. This targeted approach ensures data presentations effectively convey meaningful insights and contribute to the business success of Adventure Works.

To optimize data visualization, designing with the end user in mind is crucial, and age-related design is a significant aspect to consider. Designing for all age groups requires understanding their unique needs. By following age-related design principles, Microsoft Power BI users can create visually appealing and engaging visualizations that cater to the specific requirements of different age groups. The goal is to prioritize ease of understanding and engagement for the target audience.
Prioritizing key information is a crucial aspect of data presentation. By understanding your audience, you can tailor your presentation to meet their specific needs, ensuring that the most relevant data points are appropriately highlighted. When presenting data, capturing attention quickly is essential. Identifying outliers and important data points is another critical strategy. Providing access to detailed information for closer inspection is essential for those in your audience who need to drill down to reveal more data; that's part of adapting to your audience's preferences. Prioritizing key information in Microsoft Power BI is a critical skill that enhances data visualization and communication. By considering your audience, focusing on headlines, highlighting outliers, providing detailed access, and accommodating audience preferences, you can drive more meaningful decision-making based on data insights.

During your data visualization work, security is of vital importance when dealing with sensitive information, including data such as customer records, financial records, or proprietary business insights. Ensuring proper data security is crucial for maintaining trust, complying with regulations, and preventing unauthorized access or breaches. By implementing robust security measures such as access control, data anonymization, data integrity safeguards, secure data transmission, and compliance with data regulations, organizations build trust, protect sensitive information, and deliver reliable insights to their stakeholders. Access control involves controlling who can access specific data sets, reducing the risk of unauthorized exposure; you can implement role-based access control, granting access only to authorized individuals and ensuring that sensitive data is protected. Data anonymization preserves privacy by removing identifiable information, allowing analysis and visualization without compromising personal details. Maintaining data integrity is crucial to ensure the accuracy and reliability of the visualized information; data integrity aspects include data validation, error detection, and consistency checks. Compliance with data regulations such as the General Data Protection Regulation (GDPR) is essential: you can obtain consent from customers, anonymize data as needed, and implement security measures to comply with relevant regulations.

During this week, you explored color theory; positioning, scale of information, and information density; chaotic versus cohesive pages; knowing the audience; age-related design; prioritizing key information; and security in data. By applying these techniques, you will have more control over data visualization and design in Microsoft Power BI.
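One point from the recap above, highlighting outliers, can be prototyped outside Power BI before you decide what to emphasize. A minimal Python sketch using the common interquartile-range rule on invented numbers:

import pandas as pd

values = pd.Series([120, 135, 128, 131, 540, 126, 122, 133])  # one obvious spike

q1, q3 = values.quantile([0.25, 0.75])
iqr = q3 - q1
outliers = values[(values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)]
print(outliers)  # the 540 value is flagged for highlighting

Points flagged this way are candidates for visual emphasis, for example a contrasting color or a data label, so the audience's attention lands on them first.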
The difference between insight and noise is clarity: is the message of your report clear to the viewer, or is the insight hidden by the noise in your presentation? Crafting compelling visualizations in Power BI is a necessity. In this video, you will learn to transform raw data into captivating stories where charts and graphs are not just shapes; they bring essential clarity to your story. Data visualization helps convey complex information in a way that is easy to grasp and interpret. Microsoft Power BI offers a wide range of visualization options, from simple bar charts to intricate custom visuals, allowing you to tailor your presentations to your audience and data. However, the true impact of data lies not just in its presentation but also in the clarity and visual appeal of the visualization.

When considering the importance of clarity, charts, data, and visuals are all crucial components. Clear and visually appealing charts make it easier for stakeholders to understand complex data, and the right chart type can simplify complex information, making it accessible to broader audiences. Data is only valuable when it communicates an insight and supports a decision. Visual impact ensures that your data presentation is engaging and persuasive. Cluttered visuals can lead to misinterpretation and therefore erroneous conclusions; visual clarity in your reports reduces the risk of drawing incorrect insights.

Let's explore some best practices for creating visual clarity and impact. Selecting an appropriate visual to present the data is critical for ensuring clarity, as it helps to display data accurately. For instance, a pie chart can be used to present a data set showing parts of a whole, such as a breakdown of total sales by product category. But what if you have 20 product categories? Pie charts get cluttered and difficult to read. If the data set is too complex, break it down into smaller, more digestible parts: you can create summarization and aggregation measures within your data model, and you can employ the drill-down functionality of Power BI to present details about your data (a small aggregation sketch follows this section). Although you can use colors to highlight key data points, overuse of colors can lead to confusion. Include clear and concise data labels for the data points in your chosen chart type. Avoid overcrowding the chart axis, as this creates clutter and makes the overall report unreadable. Maintain formatting consistency across all charts on your report pages; you can use and customize report themes to ensure a cohesive look. Data quality also contributes to the visual clarity of the report: visualizations are only as good as the data they represent, so make sure the data is clean, accurate, and properly formatted.

When choosing a chart for your report, consider key elements such as the data type, the message, the context, and the audience. Understand the nature of your data: is it numerical, categorical, or geographical? This helps you decide the appropriate chart type. Determine the data story you want to convey in your report: are you showing comparisons, trends, distribution, or proportions? This influences the chart selection. Evaluate how your visualization will be used: dashboards, presentations, and interactive reports require distinct types of charts and visuals. Consider your audience's familiarity with data visualizations, and select a chart type that connects with their experience. Although Power BI provides the tools and flexibility to create stunning visuals, it's up to you as a data analyst and report designer to use them to eliminate clutter and impart visual appeal. By prioritizing clarity, selecting an appropriate chart, and following best practices, you can transform your data into captivating and meaningful stories that deliver insights.
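As an illustration of the aggregation advice above, here is a minimal pandas sketch that keeps the largest categories and rolls the remainder into an "Other" slice; the twenty categories and their sales figures are invented for the example.

import pandas as pd

sales = pd.DataFrame({
    "category": [f"Category {i}" for i in range(20)],  # illustrative categories
    "sales": range(100, 2100, 100),
})

# Keep the five largest categories and roll the rest into "Other",
# so a pie chart stays readable.
top = sales.nlargest(5, "sales")
other = pd.DataFrame([{"category": "Other",
                       "sales": sales["sales"].sum() - top["sales"].sum()}])
summary = pd.concat([top, other], ignore_index=True)
print(summary)

In Power BI itself, the same effect is typically achieved with summarization measures in the data model, with drill-down left available for the detail behind "Other".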
In the dynamic world of data visualization, creating visually appealing and compelling reports is essential for effective communication and decision-making. However, as you design these reports, you must not forget about accessibility. In the context of data reporting and visualization, accessibility refers to the design and implementation of reports that can be easily used and understood by all individuals, including those with disabilities. This involves creating reports in a way that accommodates various needs, such as providing alt text for visuals, ensuring sufficient color contrast, enabling keyboard navigation, and providing compatibility with screen readers. Ensuring that your reports are inclusive and accessible to all users, regardless of their abilities, is a crucial aspect of responsible, user-centric report creation.

Because of its global operations, Adventure Works executive management wants its reports and dashboards designed for a broad audience. Therefore, as a data analyst, your task is to consider the accessibility features of Power BI before you plan and execute data analysis and design reports and dashboards. Let's explore a project file in Power BI to learn how to create reports that are user-friendly and accessible to all audiences.

The project file contains three data tables: sales, products, and region. The first task is to create a line chart by dragging the total sales, month, and country fields from these tables into the respective wells of the line chart visual. Next, create a donut chart representing total sales by product category, selecting the total sales and category fields to add to the chart. For users with visual impairments, these visuals may not be accessible, so add alt text to make your reports inclusive. Select the line chart, access Visualizations, Format visual, then General, and scroll down to the alt text box. Enter descriptive text for the line chart, such as 'monthly regional revenue analysis for Adventure Works'. This description acts as a text alternative that screen readers can access, letting users understand the content even if they cannot see it.

Your users can also expand a specific visual from the report or dashboard: select the line chart, then select the focus mode icon in the top right corner of the visual, and the chart fills the entire screen. Select 'back to report' to exit focus mode. You can also view the data in a tabular format that is more screen reader friendly: from the visual context menu, select 'show as a table' from the drop-down list. This displays the line chart together with a data table.

Visual and report page titles are important accessibility features that serve as reference points. Let's add some: access Visualizations, select General, then select the chart title and provide a descriptive title, like 'monthly sales by country'. Next, name your report pages: select the page number and rename the page to better represent the data. Both the x-axis and y-axis titles should also be readable and provide sufficient information.

In the line chart, color on its own might not be sufficient to convey information. Use markers to help distinguish the different data sets used in the visual: select the line chart and turn the markers toggle to the on position, then select a different marker shape for each country. You can configure the marker shape, size, and color for each line.

Power BI's tab order feature provides a way to arrange all visual elements logically to accommodate keyboard users, ensuring a natural order of visuals that keyboard shortcuts can access. Navigate to the View tab of Power BI Desktop and access the selection pane from the 'show panes' group. This opens a selection pane with two tabs, layer order and tab order; in the tab order tab, you can rearrange the order of visuals in your report. You must ensure screen readers effectively interpret and convey visuals and text; this way, the report is properly interpreted and conveyed to users with screen readers.

Finally, choose an appropriate accessibility theme and the high-contrast Windows option from the View tab to help ensure report accessibility. This generates contrasting text and background colors to help make the content readable for users with visual impairments or color blindness.
If you use a high-contrast mode in Windows, Power BI Desktop automatically detects which high-contrast theme is being used in Windows and applies those settings to your reports. Lastly, test your reports with diverse users, including those with disabilities, to gather feedback and identify accessibility issues; real-world feedback helps you improve report design. The accessibility features available in Power BI help you create report designs that can be accessed by a wide range of consumers. Integrating Power BI's accessibility features into your workflow is not a limiting factor in designing compelling reports and dashboards; it is the correct way to generate reports usable by a broader audience, including those with disabilities.

You created a canvas of charts and graphs in Microsoft Power BI to visualize your data, but as you review your report, it seems incomplete, as if one piece of the puzzle is missing. That critical piece is the assessment of its clarity and impact. A report is not just a collection of individual charts; its clarity and impact come from combining these visual elements into a compelling narrative. This video will explore strategies and best practices to ensure your Power BI reports are not just a canvas of information but are visually compelling, engaging, and impactful. Guidelines for creating an impactful report include deciding on the report objective, establishing a visual hierarchy, using branding and themes, carefully composing the report, employing storytelling techniques, and optimizing report performance for the best user experience.

What do you intend to communicate in your report, and who is your target audience? Having a clear understanding of these aspects guides your design decisions. The use of visual cues such as size, color, and placement builds the visual hierarchy, emphasizing key insights or data points and assisting navigation. Use branding and themes to help create a professional report design; brand guidelines enforce a consistent style that adds credibility to your reports. When composing your report, consider layout and composition factors such as white space, alignment, and screen real estate optimization. White space means ensuring proper spacing between report elements like headings, visuals, and brand elements. Alignment is about arranging report elements to create a structured layout and a sense of order that emphasizes the data story. Screen real estate refers to the available space on the Power BI report canvas; finding the right balance between presenting enough data to get your message across and avoiding overwhelming your audience is crucial. When dealing with a lot of data points, think about incorporating interactive elements like tooltips, slicers, and drill-through; such features keep the main visual clear but allow users to expand specific data points. Telling a story with your data significantly enhances the engagement and impact of your Power BI report: sequence items on the report canvas to create a natural storytelling flow, for example a clear introduction, key insights, supporting details, and finally a conclusion. Slow loading or unresponsiveness leads to a poor user experience that can diminish the impact of a report, so optimize report performance by eliminating unnecessary data, minimizing complex DAX logic, and aggregating data.

Choosing an appropriate chart type based on the data type is critical in designing a clear and impactful report. We will now explore use cases, strengths, and limitations of some commonly used chart types.
Bar charts: Compare discrete categories or values, displaying rankings and trends over time. They are easy to interpret and useful for displaying data with few categories. The bars can display horizontally (a bar chart) or in a vertical orientation (a column chart). They are not suitable for continuous data and can become cluttered with too many categories.
Line charts: Display trends and patterns over time, identifying changes in data over a continuous scale. They are excellent for visualizing time series data and for displaying multiple series for comparison, but less effective for comparing individual data points and not suitable for categorical data.
Pie and donut charts: Display the composition of a whole, showing parts as percentages and emphasizing relative proportions. They are easy to understand and work well with a small number of categories, but are not suitable beyond about eight categories.
Scatter plots: Great for visualizing the relationship between two numerical values, identifying outliers, and spotting correlations. They reveal patterns, clusters, and trends and are effective for displaying high-density, multi-dimensional data, although the visual may become overwhelming with too many categories.
Gauge charts: Display a single value in relation to a predefined target, such as a key performance indicator (KPI), providing a visual representation of performance against a goal. They are not suitable for displaying multiple data points.
Treemaps: Ideal for visualizing hierarchical data structures, showing the proportions of categories within a whole and making effective use of space and color coding. They may not be suitable for non-hierarchical data, and they get complex when there are deep hierarchies.

A strategic approach to report design in Microsoft Power BI can create a clutter-free and engaging data story. By having a clear objective, maintaining a visual hierarchy, implementing consistency, and adhering to best practices in all design choices, such as chart selection, you can create a report that makes the best impression on the audience.

Data is not just numbers; it is a compass that guides you through the maze of business performance, highlighting exactly where you underperform and where opportunities await. A key performance indicator (KPI) chart is one way to transform numbers into insights and stories and to uncover hidden messages in raw data. Often used for sales, marketing, and customer service, KPIs act as performance benchmarks, measuring progress and identifying trends. A KPI visual typically displays a single metric and its performance against a target or baseline, making it easier for viewers to quickly judge performance and identify problems. Microsoft Power BI has a built-in KPI visual, but gauge charts and bullet charts can also be used to present KPI values. A KPI measures a value and shows trend and status. The value is the main measure that you want to evaluate, for instance current sales. The target is the element you want to compare the value with, for example the sales target. The trend is how the value performs over time, for example whether sales are going upward or downward.

The KPI visual can be adjusted from a desktop design to a version that works well on mobile devices. To optimize a KPI chart for mobile, keep the chart's layout uncluttered, use appropriate font sizes and contrasting colors, focus on presenting the essential data points, and avoid excessive decorative elements. Adventure Works wants insight into sales figures and an assessment of sales targets, so let's design a sales performance KPI visual in Power BI Desktop and optimize it for mobile devices.
First, launch Power BI Desktop and open the Adventure Works sales report. To create a KPI chart that tracks sales performance against the target, drag the total sales and target fields from the sales table to the report canvas; Power BI automatically generates a column chart from these values. You don't need this chart, so select the KPI visual from the Visualizations pane to convert it to a KPI. This action results in an empty chart with no data. Hover the cursor over the information icon: it indicates that both values and trend axes are needed for this chart. The three elements of the KPI chart, value, target, and trend, are in the 'build visual' tab of the Visualizations pane. To compare the sales values with the target, add the total sales measure to the value section of the visual. For the trend axis, add months to view monthly sales trends: remove the target values and drag the month field from the order date hierarchy to the trend axis. This generates a KPI visual that charts sales values by month; it's like creating an area chart with month as the axis and sales as the values.

The main value indicated in the visual is sales, but is this total sales or a filtered value? The value at the center of the KPI visual is the last data point shown on the trend axis. This means that if the trend is by month, the value is the last month's sales only; in this report, it's the sales for December 2018. If the data set contains sales for multiple years, the value indicates the sales for December across all years, and if the data set contains values for the full year, it's for December. But what if you only have sales for certain months? Access the Visualizations tab, then Format visual, Visual, and Date, and turn on the date toggle to display the value's date.

You've presented the sales data, but you must still compare the value to the target. Drag the target measure from the sales table to the target section of the KPI visual. Adding the target generates color coding in the visual, turning the value and the area chart red, and an exclamation mark appears beside the value, indicating that sales are behind the target. The target is represented as the goal. By default, the percentage difference between the sales and the target is displayed in parentheses, which is minus 6.59% in the current report. If the sales values meet or exceed the target, the color of the value and area chart turns green with a check mark. Next, you can format the chart by changing the font style and size, changing colors, or adding a background color; for instance, you can choose the sentiment color 'red as bad' or 'red as good' based on the nature of the value.

Lastly, optimize the KPI visual for mobile devices: navigate to the View tab and select mobile layout, then drag the KPI visual from the page visuals pane to the mobile layout page, positioning and rescaling the visual to adjust it. The visual is now optimized for mobile devices. A KPI chart represents the sales trend against the target value. With the help of KPI visuals, Adventure Works can identify which product, region, or sales representative is underperforming and, as a result, devise strategic decisions for performance improvement.
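The percentage shown in parentheses is simply the variance between value and target. A small Python sketch of that calculation, with illustrative numbers chosen so the result matches the minus 6.59% seen in the walkthrough:

def kpi_status(value: float, target: float) -> str:
    """Return the variance string a KPI-style visual would display."""
    pct = (value - target) / target * 100
    flag = "on or above target" if pct >= 0 else "behind target"
    return f"{pct:+.2f}% ({flag})"

# Hypothetical figures; not taken from the course data set.
print(kpi_status(value=934_100, target=1_000_000))  # -6.59% (behind target)

The sign of the variance is what drives the red or green sentiment coloring in the visual.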
The key to revealing insights from raw data is using the appropriate visualization. Techniques have emerged that use specific data types and analytical methods to produce tailored visualizations. The dot plot is one such visualization, popular when presenting categorical data in relation to a numerical value. To display the relationship between two numeric variables, you can create a scatter plot that defines the correlation between the variables. A variation of the scatter plot is the bubble chart, which can display the relationship between three variables, with the third variable represented by the size of each bubble. A dot plot is similar, but instead of numeric data, it uses categorical information on the x-axis.

Dot plot charts are a simple yet effective data visualization technique used to display the distribution of data points along a single axis. In a dot plot chart, each data point is represented by a dot, and dots are stacked vertically above the corresponding data values on the axis. This makes dot plots especially useful for visualizing the distribution and frequency of categorical data. Power BI does not have a visual named dot plot or dot chart, but you can create a dot plot by converting a scatter chart. There are also custom visuals available in the Power BI marketplace that create dot plots directly.

Let's quickly consider a few reasons dot plots are such a useful chart type. A dot plot chart is easy to use and easy for non-technical users to interpret. It's particularly useful for visualizing categorical data, giving a clear comparison between categories. It displays the distribution and patterns in the data, it can visualize a large amount of multi-dimensional data, and it's a compact chart that's cell phone friendly.

Adventure Works needs insights into regional product category sales performance: they need to know the quantity sold for each category and the revenue per country. The challenge is the number of variables to be presented in a single visual. As a Power BI analyst, you can deploy a dot plot to present categorical information, such as category or country, on the x-axis, sales on the y-axis, and quantity as the size of the dot. Let's jump into Power BI and use a dot plot to analyze and visualize the Adventure Works information.

Open the Adventure Works sales project. The Power BI core visualization pane has no dot plot or dot chart visual, so you need to begin with a scatter chart and convert it into a dot plot. Adventure Works must present sales, quantity, country, and category data. Drag the sales and total quantity sold measures from the key measures table to the report canvas; Power BI auto-generates a column chart. Select the scatter chart from the Visualizations pane to convert the column chart to a scatter chart. Power BI autofills the x-axis section with sales and the y-axis field with total quantity sold; this is your scatter chart. The sales data is numeric, but you need to bring categorical data to the x-axis: drag the country column from the region table to the x-axis field of the visual and move the sales data to the y-axis. Next, drag the category column from the product table to the visual's legend section. When you hover the cursor over a single dot in the chart, a tooltip appears displaying the country, category, and sales amount for the category in that country. To add more data, drag the quantity sold measure from the key measures table to the visual's size section; the dot size changes in proportion to the quantity sold, and the tooltip now displays quantity information in addition to the previous data. The chart still resembles a bubble chart. To change it, navigate to the Format visual tab and expand Markers; in the shape drop-down list, select the square dot. You could also select distinct shapes for each category, and the dot size can be adjusted here as well.
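If you want to prototype the same layout outside Power BI, a minimal matplotlib sketch with invented numbers reproduces the idea: categories on the x-axis, sales on the y-axis, and dot size proportional to quantity sold.

import matplotlib.pyplot as plt

countries = ["US", "UK", "DE", "FR"]   # illustrative categories
sales = [420, 180, 260, 150]           # sales in thousands (invented)
quantity = [900, 300, 500, 250]        # units sold, mapped to dot size

fig, ax = plt.subplots()
ax.scatter(countries, sales, s=[q / 2 for q in quantity],
           marker="s", color="tab:blue")  # square markers, as in the video
ax.set_xlabel("Country")
ax.set_ylabel("Sales (thousands)")
ax.set_title("Sales by country, dot size = quantity sold")
plt.show()

Passing strings for the x values is what makes the axis categorical, which is the defining difference between this chart and an ordinary scatter plot.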
Next, format the aesthetics. First add a chart title and description, then adjust the legend position, legend title, and font size. Format the axes to display clear labels and titles, add and format the grid lines, then add a background color. To improve the report's accessibility, select different shapes for each category. Finally, add analytics lines: select Analytics in the Visualizations pane, represented by a magnifying glass icon, to display a range of different analytical lines. Expand the average line drop-down and select 'add line' to add an average line to the chart, format the line color, and toggle the data label button to the on position to display the average sales value. Other analytical lines can be added to the chart as required. Adventure Works' analytical needs were fulfilled by presenting categorical data in a single visual: the dot plot chart allows you to visualize multi-dimensional data, with more than two variables and categorical information instead of numerical values on the x-axis of the chart.

Interactive visualizations breathe life into data, revealing hidden patterns and relationships between variables. Power BI's core visualization pane offers a visual where numbers are transformed into dynamic bubbles. Bubble charts can depict multi-dimensional data in a single view, making intelligent use of space: in addition to the x and y axes, a third dimension of data is represented through the size of each bubble. This approach enables you to highlight complex relationships between variables and identify patterns that might not be immediately evident in traditional two-dimensional scatter plots. The bubble chart's ability to convey multiple data dimensions simultaneously gives analysts and decision makers deeper insights into their data; these insights can lead to more informed choices and strategies across a range of applications, such as market analysis, financial planning, sales performance evaluation, and resource allocation.

One example of applying a bubble chart effectively is in market analysis. Suppose you are analyzing the performance of various products within different markets. The x and y axes can represent market share and revenue, while the bubble size corresponds to the total number of units sold. By examining this data in a bubble chart, you can discern valuable insights, such as which products are dominant in specific markets based on market share and revenue, and how sales volume relates to these factors.

High-density data refers to data sets containing a substantial number of data points, which can lead to visual clutter and hinder effective data interpretation. With bubble charts, you can visualize data point density and use sampling techniques to manage data representation on the chart. By adjusting the size of the bubbles or employing dynamic filtering options, you can focus on specific areas of interest and maintain a clear and coherent chart despite the data's complexity.

Adventure Works wants insight into the performance of different product colors and the correlation between total revenue and profit margin; management also wants to know the number of units sold for each product color. Sales, profit margin, product color, and quantity together make the analysis and visualization challenging. You can utilize a bubble chart in Microsoft Power BI Desktop to present all the required information in a single visual. Let's transform those raw numbers into dancing bubbles of information and help Adventure Works make data-driven decisions about product colors. The data model contains total sales and profit margin measures, and the product table has product color information.
To begin visualizing profit margin and sales, select the scatter chart from the Visualizations pane to add a placeholder visual to the canvas. Drag the sales and profit margin measures from the key measures table on the data pane to the x and y axes; this generates a scatter chart with a single data point. To make the chart more interesting, bring a third data dimension to the chart fields, which converts the scatter chart to a bubble chart: drag the color column from the product table to the legend field of the visual. The tooltip now displays the total sales amount for a specific product color and the profit margin associated with that color. Adventure Works needs to know the units sold, so bring the quantity sold measure from the key measures table to the size section of the visual.

Another important feature of bubble charts is the play axis, which you can use to animate your visuals. Drag the year field from the order date hierarchy in the sales table to the play axis; now you can also analyze the data by year. Select play on the left side of the axis, and Power BI animates the bubbles to represent the variations in sales quantities and profit margins over the years. Next, navigate to the Analytics tab, represented by a magnifying glass in the Visualizations pane, and add a median line based on sales and another for profit margin; these lines provide analytics on the median sales and profit values. The Analytics pane provides interesting insights about the data.

Now format the chart. First, change the bubble shape and size to convey additional information and insights: select Visualizations, Format visual, Visual, and then Markers, and in the shape drop-down change the shape of an entire series or of individual categories; in the size section, adjust the size. You can apply further formatting by changing the font style, size, and color, adding a background color, and so on. Adventure Works can now visualize dense, multi-dimensional data in a compelling visualization and draw meaningful insights for future strategic plans. In this video, you discovered how a bubble chart delivered an engaging visualization to Adventure Works about the correlation between profit margin and sales based on product color, units sold, and year. You also explored the analytical capabilities of the bubble chart by adding median and average lines to convey additional insights about the data.
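The same three-variable idea can be sketched quickly in Python for prototyping. This matplotlib example uses invented sales, margin, and quantity figures, with quantity mapped to bubble area:

import matplotlib.pyplot as plt

colors = ["Red", "Black", "Silver", "Blue"]   # illustrative product colors
sales = [520, 780, 430, 300]                  # x-axis, thousands (invented)
margin = [0.18, 0.24, 0.15, 0.21]             # y-axis, profit margin
quantity = [1200, 2100, 800, 600]             # third dimension -> bubble area

fig, ax = plt.subplots()
ax.scatter(sales, margin, s=[q / 3 for q in quantity], alpha=0.6)
for x, y, label in zip(sales, margin, colors):
    ax.annotate(label, (x, y))  # label each bubble with its category
ax.set_xlabel("Sales (thousands)")
ax.set_ylabel("Profit margin")
ax.set_title("Profit margin vs sales, bubble size = units sold")
plt.show()

Encoding the third variable as area rather than position is what lets one chart carry all three dimensions at once.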
You are working with a large data set when you discover that no one is interested in the data. That's a big surprise, until you realize that it's the insights people want presented, not the data. When dealing with data sets containing an abundance of data points, presenting the information without overwhelming the viewer is vital. In this video, you will explore advanced display techniques in Microsoft Power BI, such as presenting high-density data using maps, drills, and 3D visualizations. In Power BI, high-density data is where you have a large number of data points or values within a small area of a visual; it often leads to visual clutter and makes it challenging to interpret the visual accurately. Techniques for handling high-density data include aggregation and summarization, drill through and drill down, color coding (such as heat maps and geographical maps), and 3D and custom visualizations. Let's look at some Power BI visualizations that use these techniques and evaluate their potential for use in reports.

The first to explore is heat maps. Heat maps are a powerful tool for visualizing the density and distribution of data across geographical regions or grids. Using color gradients to represent values, heat maps allow viewers to quickly identify patterns, trends, and hotspots within large data sets. For example, imagine you are analyzing sales performance across various regions for Adventure Works: a heat map could represent the sales figures using a color spectrum, highlighting regions with the highest sales in vibrant hues while cooler shades indicate lower sales. The heat map visualization is not available in the Power BI core visualization pane; you can import a heat map from the Power BI marketplace, or use a Python-based heat map visualization in Power BI, an option you will learn about later in the course.

Another visual to consider for high-density data is the treemap. Treemaps are ideal for displaying hierarchical data and comparing the proportions of data points across different levels. In a treemap, each rectangle represents a category, and its size correlates with the proportionate value it represents. This technique allows viewers to analyze the overall composition and the data point breakdown in a single visual; for instance, you can use a treemap to display the distribution of sales by product categories and subcategories within Adventure Works.

Now let's explore the functionality of drill through and drill down, where analysts and viewers can dig deeper into the data. A drill down in Power BI allows users to move from a higher level of detail to a more granular level, while a drill up does the reverse. For example, with Adventure Works sales data plotted on a time scale, viewers can use drill down to look at the sales data along a date hierarchy that goes from a year to each quarter, to month, and all the way down to a daily level. There are two drill-through situations to explain. Chart drill through lets users explore additional detail within a visual by clicking on specific data points: for example, in a bar chart representing sales figures for various products at a summary level, selecting a specific bar, say product 3, can trigger a drill-through action revealing a detailed report highlighting sales trends in various regions, product details, and customer information related to that product. Page drill through allows users to navigate to a different page with associated information; this advanced technique is especially valuable for creating summary pages with high-level insights.

While two-dimensional visualizations are more popular, 3D visualizations can offer a new dimension of insight. For instance, a 3D scatter plot can showcase the distribution of products within a three-dimensional space, revealing potential correlations and patterns, such as a presentation of a product's performance based on three parameters: price, sales volume, and customer satisfaction. A 3D map can present data points in an interactive three-dimensional map space; 3D mapping adds a sense of depth and realism to geographical data, making it easier for users to identify spatial trends and analyze data. Use Microsoft Power BI's advanced display techniques to extract insight from large, complex data sets while considering end-user requirements. Master high-density data display, drill-through capabilities, and the world of 3D visualization to improve your Power BI reports and deliver impactful insights.
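The course mentions a Python-based heat map option for Power BI. As a preview of the idea, here is a minimal matplotlib sketch that renders an invented region-by-month sales matrix as a heat map, with warmer hues for higher values:

import matplotlib.pyplot as plt
import numpy as np

regions = ["North", "South", "East", "West"]      # illustrative grid
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
rng = np.random.default_rng(seed=7)
sales = rng.integers(50, 400, size=(len(regions), len(months)))  # invented values

fig, ax = plt.subplots()
im = ax.imshow(sales, cmap="YlOrRd")   # warm hues = higher sales
ax.set_xticks(range(len(months)))
ax.set_xticklabels(months)
ax.set_yticks(range(len(regions)))
ax.set_yticklabels(regions)
fig.colorbar(im, ax=ax, label="Sales")
ax.set_title("Sales density by region and month")
plt.show()

The same pattern, a value matrix plus a color gradient and a legend, is what a Python visual inside Power BI would render from fields you supply.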
Do you only access your social media accounts from a desktop computer? No; like most of us, you probably spend most of your internet time on a mobile device. Accessing data on the go has become the norm, and decision makers expect to be able to access critical information anytime, anywhere. As a report creator, you must be able to optimize report layouts for mobile devices, ensuring your insights appear on smaller screens without losing clarity and usability. Creating a mobile-friendly report layout involves careful consideration of visual placement, font sizes, and content organization; to do that, use the tools and settings in the mobile layout canvas of Microsoft Power BI.

When optimizing a report for mobile, one of the key considerations is responsive design. A responsive layout automatically adjusts to fit different screen sizes and orientations, ensuring that the report looks and functions optimally on various mobile devices such as tablets and smartphones. This adaptability is crucial, as mobile devices come in various screen sizes, and it ensures report access without the user needing to zoom or scroll horizontally. Another critical aspect of mobile optimization is the selection of visuals and data presentation. Not all visuals are suitable for mobile viewing due to their complexity or size; you must choose visuals that convey essential insights while maintaining readability on smaller screens. Simplified visuals such as line charts, bar charts, and KPI cards are often preferred for mobile layouts, as they present data clearly. Font sizes play a crucial role in mobile optimization: text that appears legible on a desktop monitor might become challenging to read on a smaller mobile screen. Use appropriate font sizes that ensure readability without straining the user's eyes; headers and labels should be clear and concise, and data points should have sufficient spacing to avoid clutter. In addition to visual elements, interactivity is another aspect to consider when optimizing for mobile devices. Some interactions, such as tooltips and drill-through actions, may work fine on desktops but might not translate well to touch-based mobile devices; test and adjust interactions to ensure a smooth and intuitive mobile user experience. As a best practice, testing your mobile-optimized report on various devices is crucial to identify potential issues and ensure consistency across platforms; emulating different mobile devices or using responsive design testing tools can help verify the report's performance and appearance.

Adventure Works executive management wants to visualize its product sales summary. It must be a mobile-friendly sales summary dashboard that can be accessed anytime, anywhere. Let's use Power BI Desktop to optimize the Adventure Works sales summary report for mobile viewing. Before optimizing a report for mobile, it is essential to review its current layout and design, identifying elements that may not translate well to smaller screens and those that require adjustments to maintain readability and user friendliness. The report contains one column chart representing the yearly sales amount, a donut chart displaying sales by country or region, and two card visuals showing sales and profit. To begin, navigate to the View tab and select mobile layout. The mobile layout page has three panes: Visualizations, page visuals, and mobile layout. The page visuals pane displays all the visual elements of the original report, while the mobile canvas has a precise grid layout for rescaling and repositioning the visuals on the screen, with snap-to-grid functionality.
Additionally, you can select the 'lock objects' checkbox from the View ribbon's page options. This locks the visual elements in place to avoid any accidental movement; use it once you are satisfied with the position and scale of your visuals. Next, drag the visual elements from the page visuals pane and drop them onto the mobile canvas one at a time. First, move the two card visuals to the mobile canvas and align them side by side at the top of the mobile screen. The main values on the card visuals are no longer visible, so navigate to Visualizations, then Visual, expand the callout, and in the value section change the font size to 18; in the label section, change the font size to 12; and in the spacing section, change the vertical spacing to five pixels. You can adjust font size independently for the mobile and desktop versions of a report. Repeat this formatting for the second card, then make fine adjustments to the positioning and scaling of the cards to optimize readability and design. Next, drag and drop the column chart onto the mobile canvas, enlarge it to fill the screen, and align it below the two card visuals. Finally, move the donut chart to the mobile canvas and enlarge it to fill the screen below the column chart. In the mobile layout, the donut chart legend values are not completely visible; a small arrow on the right end of the legend suggests navigating for more information. Navigate to Visualizations, Visual, and expand Legend, and in the position drop-down menu select 'center left' (you can also adjust the font size if necessary). This moves the legend from the top to the left, and all values are now visible without further navigation. You can perform more adjustments to scale and align the visuals on the mobile layout screen. The Adventure Works sales summary report is now ready for anytime, anywhere access on mobile devices. Optimizing report layouts in Microsoft Power BI for mobile devices is an essential step in meeting the needs of today's on-the-go business environment.

The world of data visualization continues to evolve, and Microsoft Power BI is at the forefront of introducing innovative ways to present and interpret data. One of the latest additions to Power BI's visualizations is the shape map, a feature that allows users to create geographic visualizations and uncover insights from geographical data. In this video, you will delve into the concept of shape map visuals and their purpose, and cover a step-by-step guide on how to add and configure them in your Power BI reports. Adventure Works has recently expanded into territories across the globe. As an analyst, you realize that traditional table and chart visuals might not effectively communicate the geographical aspects of the analysis. You can use shape map visuals in Power BI to better represent geographical and sales data, showcasing topics such as population density, competitor locations, and market demand across different regions. A shape map visualization empowers users to tell stories using geographical data: unlike traditional map visuals that plot data on a geographical map, shape maps go a step further by enabling users to work with custom regions or shapes, such as countries, states, or provinces.
Sharing your report with a Power BI colleague requires that you both have individual Power BI paid licenses or that the report is saved in Premium capacity. Power BI Premium provides extra features such as the ability to store more data, cloud features, and improved performance for Power BI workspaces; you can also use it to deploy reports and data sets and share content with users on free licenses.

Let's help Adventure Works craft a shape map visual to better present their performance across various geographical territories. The shape map visual is only available in Power BI Desktop and in preview mode; since it is in preview, it must be enabled before you can use it. To enable the shape map, select File, Options and settings, Options, Global, Preview features, then select the shape map visual checkbox, followed by OK. You will then need to restart Power BI Desktop. Now you need Power BI to display the Adventure Works shape map visual. The data set contains two fields, sales and states, holding state names and the corresponding sales amounts. In Power BI Desktop, after the shape map visual is enabled, select the shape map icon from the Visualizations pane to add a shape map placeholder to the report canvas. After adding the shape map to your report canvas, add data to the data fields: drag the state field to the location well and the sales field to the color saturation well of the map visual. You can select the View tab to change the color scheme to a more accessible one, such as 'accessible city park'. If you have an additional data set, like product category or product color, you can move it into the legend well to create divergent colors. In this case, as there is no category available in the data set, you can apply gradient colors to the map: go to Format visual, Visual, Fill colors, and turn the gradient toggle to the on position, then add light blue for the minimum, purple for the center, and black for the maximum. You can also change the border color to black and set the width to three.

Now you need to display the map keys. Select the map settings drop-down, then 'view map type key'. This opens a dialog that lists the map keys; these keys are for US states, and you can change the map type to view keys for other countries if required. The next option in this menu is projection, which you can use to present a 3D object on a 2D map. Power BI selects the Albers USA map style by default, but three other options are available. One option is equirectangular: a cylindrical projection that converts the globe into a grid in which each cell has the same size, shape, and area. Mercator is another option: a cylindrical projection with the equator depicted as the line of tangency, where polar areas are more distorted than in equirectangular projections. And finally there's orthographic: a projection from an infinite point, as if from deep space, which gives the illusion of a three-dimensional globe. Next, access the zoom drop-down and toggle on the 'zoom on selection' and manual zoom options; these allow you to zoom in on states when selected. Finally, to format the chart title, access the General tab, then expand the title drop-down and use the design effect options to change the title's properties as required. In this video, you learned about shape map visuals, discovered their purpose, and explored a step-by-step guide on how to add and configure them in your Power BI reports. You specifically learned how to create a shape map visual with color coding to represent the sales amount for Adventure Works.
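The projection options are easier to picture with the underlying math. A small Python sketch of the standard equirectangular and Mercator formulas, purely illustrative; Power BI applies these transformations internally:

import math

R = 6371.0  # Earth radius in km

def equirectangular(lat, lon, lat0=0.0, lon0=0.0):
    """Equirectangular: latitude and longitude map linearly to a grid."""
    x = R * math.radians(lon - lon0) * math.cos(math.radians(lat0))
    y = R * math.radians(lat)
    return x, y

def mercator_y(lat):
    """Mercator stretches latitude, so polar areas appear larger."""
    return R * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))

print(equirectangular(47.6, -122.3))  # Seattle, illustrative coordinates
print(mercator_y(47.6))

The logarithmic stretch in the Mercator formula is exactly why polar areas look more distorted there than in the equirectangular grid.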
Choropleth maps, also known as filled maps, stand out as a powerful tool for representing and analyzing spatial patterns. By color-coding geographical regions based on data values, choropleth maps offer a compelling way to visualize variations in data across different locations. In this video, you will explore the fundamental aspects of choropleth maps, their use cases, and examples of the type of data best suited to this visual format.

Adventure Works executive management realizes that simply looking at raw data in a tabular or columnar format is not sufficient to comprehend the regional distribution of sales; they need a visual that instantly communicates the variations in sales across various geographic regions. As an analyst, you can resolve this by employing the choropleth map visual in Power BI, which allows you to present sales data on a geographical map with color-coded regions indicating sales performance across various territories.

A choropleth map is a geographic representation in which areas such as countries, states, or regions are shaded or patterned to illustrate quantitative data values. Each region on the map is assigned a color or pattern that corresponds to a specific data value, allowing viewers to identify patterns and trends instantly. The intensity of the color or pattern represents the magnitude of the data value, enabling easy comparisons and highlighting regional disparities. Choropleth maps are most effective when the data being visualized has clear geographic boundaries. When designing a choropleth map, it is crucial to carefully select colors or patterns that are easy to interpret and distinguish; using a color scale that transitions smoothly between values can enhance readability. It is also essential to provide a clear legend or data scale to help users understand the relationship between colors or patterns and the corresponding data values.

Now let's consider some detailed use cases for choropleth maps. Choropleth maps are ideal for visualizing population distribution across different regions: by shading regions based on population density or total population, you can quickly identify densely populated areas and areas with sparse populations. They are widely used to showcase various economic indicators, such as GDP per capita, unemployment rates, or poverty levels across different geographic regions, helping policymakers and economists understand economic disparities and make informed decisions. Choropleth maps are valuable for displaying health- and education-related metrics, such as disease prevalence, vaccination rates, literacy rates, and school enrollment levels, providing insights into regional health and education challenges and aiding resource allocation. They can also effectively display environmental data, such as air quality, temperature variations, or pollution levels, helping environmentalists and policymakers assess environmental conditions and devise appropriate conservation strategies.

But how can a choropleth map best help Adventure Works in their business activities? One example is to break down sales performance data per country, as well as per state within those countries. In this example for the United States, states with higher sales are represented by darker shades, while lighter shades indicate lower sales. Choropleth maps offer a captivating way to explore and comprehend data patterns through geographic visualization. Their ability to showcase variations in data across different regions makes them a popular choice for a wide range of use cases, from health, economic indicators, and environmental data to population distribution. With choropleth maps, data analysts, researchers, and policymakers can gain valuable insights and make data-driven decisions with geographical context. As an essential tool in the data visualization toolkit, choropleth maps assist in a deeper understanding of the world around us.
Choropleth maps have become an essential tool in data visualization for representing and analyzing data in a spatial context. Choropleth maps, also known as filled maps, are particularly effective in displaying quantitative data across geographical regions. In this video, you will explore the steps to create and utilize filled maps in Power BI, focusing on a scenario involving the Adventure Works company. By the end of this video, you will have the skills to configure and display data on a choropleth map, allowing you to transform complex data sets into insightful visualizations.

Before diving into creating a choropleth map, it's crucial to know how to select the appropriate data for analysis. In the context of Adventure Works, let's consider a scenario where the company wants to understand sales performance across different regions in a specific country. The data should include at least two columns: one representing the geographical regions, and the other containing the relevant quantitative data, such as total sales revenue or profit, corresponding to each region. In Power BI, creating an effective data model is the foundation of any compelling visualization. The data should be structured in a way that lets Power BI understand the relationship between the geographical regions and the quantitative data. You must ensure that the columns representing regions are in text format and contain names or codes matching the regions present in the map visualization. Similarly, the quantitative data should be in numerical format for accurate analysis.

With the data model ready, it's time to create a choropleth map visual in Power BI. To achieve this, navigate to the Visualizations pane and select the filled map option; Power BI will automatically detect the columns representing the geographical regions and the quantitative data and position them in the respective fields. To enhance the visualization and make it more meaningful, you can customize the choropleth map further. Power BI offers several customization options to help you fine-tune the visual representation. For example, you can adjust the color scale to highlight different intensity levels of the data, making it easier to interpret variations. Additionally, you can format the map's title, legend, and other visual elements to suit your report's aesthetics and readability.

Let's apply the steps mentioned above to a specific scenario involving Adventure Works, a multinational bicycle manufacturer. The company wants to analyze its sales performance across various states in the United States and identify regions with the highest and lowest sales. For the very first step: map and choropleth map visuals are disabled by default, so you must enable them by accessing File, Options and settings, Options, Global, then Security, and checking Use map and filled map visuals. The Adventure Works data set contains two relevant columns: State for the geographical regions and Sales for the quantitative data representing sales revenue in each state. You must ensure that the State column is formatted as text and that each state name matches the corresponding state in the map visualization. Similarly, the Sales column should be in numerical format; in this instance, you will format it as currency. You can then go to the Visualizations pane, click the filled map icon, and drag the State field to the Location well and Sales to the Tooltips well of the visual. To apply color coding to the map visual, go to Visualizations, Format visual, and then Visual; select Fill colors and then select the fx icon to apply conditional formatting. In the conditional formatting dialog box, add three rules for the color coding of the map based on sales values. Based on the data, the maximum sales value is $400,000 and the minimum value is $81,000, so you can define the following rules (sketched in code below). Rule one: all sales values between $80,000 and $149,000 must be color-coded yellow. Rule two: all sales values between $150,000 and $249,000 must be red. Rule three: all sales values between $250,000 and the maximum value must be purple.
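Power BI applies these rules through its conditional formatting dialog; purely as an illustration, the same three-rule bucketing logic can be expressed in plain Python, using the thresholds defined above.

```python
# A minimal sketch of the three conditional-formatting rules described
# above. Power BI evaluates these rules itself; this only illustrates
# the bucketing logic.

def sales_color(sales: float) -> str:
    """Map a sales value to the rule-based color used on the map."""
    if 80_000 <= sales < 150_000:
        return "yellow"   # rule one: $80,000 up to $149,000
    if 150_000 <= sales < 250_000:
        return "red"      # rule two: $150,000 up to $249,000
    return "purple"       # rule three: $250,000 up to the maximum

print(sales_color(81_000))    # yellow (the minimum observed value)
print(sales_color(400_000))   # purple (the maximum observed value)
```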
You then expand the map settings. In the Style drop-down list, you can select a map style; Power BI has five styles: Aerial, Dark, Light, Grayscale, and Road. Select the Aerial map style. Expand the Controls option, turn Auto zoom to the off position, and turn the Zoom buttons and Lasso tool to the on position. This gives you control over zooming into a specific area of the map. To make the choropleth map more informative, you can customize the color scale to represent varying sales levels across states: regions with higher sales revenue can be displayed in darker shades, while regions with lower sales values can be represented in lighter colors. Formatting the map title and adding a meaningful legend will help convey the information more effectively. Lastly, you can access the General tab's Title drop-down to format the title of the visual and apply other effects as required. Choropleth maps are powerful tools that empower businesses to visualize and understand data across geographical regions. With their ability to display data variations using color intensity, these maps provide valuable insights into spatial patterns and trends. By following the steps outlined in this video and applying them to a scenario involving Adventure Works, you can master the art of configuring and displaying data on a choropleth map in Power BI.

In the ever-evolving landscape of data visualization, map visuals have emerged as powerful tools for presenting geographical data in an engaging and informative manner. Power BI, Microsoft's robust business intelligence platform, offers a range of features to create compelling map visualizations that can reveal insightful patterns and trends. In this video, you will explore essential tips and tricks to optimize your map visualizations in Power BI, ensuring that you leverage the full potential of your geographical data. Map visualizations hold the potential to unlock a wealth of insights from your data, especially when dealing with geographical information; however, it's essential to optimize these visuals to effectively communicate your insights to your audience.

Adventure Works operates multiple stores across different cities and states. The North American sales manager asks you to present a report of sales for various states and cities. As a Power BI analyst, your task is to create a comprehensive analysis of sales across various regions using map visuals. A single layer of analysis in a map visual might only provide a summary level of information about sales; to dig deeper into states and cities, you need to create a geo hierarchy in the map visual of Power BI. Let's go through the Adventure Works sales data and create a geo hierarchy using filled map visuals in Power BI. Launch Power BI and open the Adventure Works Sales.pbix report. The report contains two data tables: a fact internet sales table and a geography table. In map visualizations, defining a precise location is especially important, because some designations are ambiguous due to the presence of one location name in multiple regions.
For example, there is a Southampton in England, in Pennsylvania, and in New York. Adding longitude and latitude coordinates solves this issue, but if the data set does not have this information, you will need to make sure to format the geographical columns with the appropriate data category. Select the Country column from the geography table and navigate to Column tools, then Properties; in the Data category drop-down, select Country. Format the data category for the State Province Name and City columns as State or Province and City, respectively. A globe icon appears before each field name; this tells Power BI that this is a geographical data type. You then collapse the geography table, expand the fact internet sales table, select the Sales Amount column, and format the data type as currency with two decimal places.

Select the filled map icon from the Visualizations pane to place a map placeholder on the report canvas; you can then enlarge the placeholder. To create the geo hierarchy, drag the Country, State Province Name, and City columns from the geography table to the Location field of the map visual. Make sure the order of the fields is Country, then State Province Name, and finally City. Next, drag the Sales Amount field from the sales table to the Tooltips field of the map visual. To differentiate the states based on sales, you should color-code the map. Open the conditional formatting dialog box by selecting the fx icon from Fill colors; in the dialog box, select yellow for minimum, red for center, and purple for maximum. The data set contains sales data for various countries, but you only want to present sales data for the United States, so expand the Filters pane and, under the Country option, select United States.

Adding depth to map visualizations leverages geo hierarchies: you can drill down from country to state, state to city, and so on. At the top right corner of the map visual on the report canvas are arrow icons; these arrows represent the drill-down functions used to access the hierarchy of the data. First, select the downward arrow to turn on the drill-down function. When drill-down mode is on, the arrow is highlighted with a black background. Now select the downward double parallel arrow to go to the next level of the hierarchy; in the current example, selecting the double arrows takes us to the US country level. Alternatively, you can select the country on the map to go to the next level of the hierarchy. You can then hover the cursor over California; the tooltip displays the sales value for the entire state. The tooltip contains drill up and drill down text with icons; you can select these to go one step up or one step down in the hierarchy. Select drill down to access the city level. It is important to note that the color at the drill-down level will be the same as in the higher-level view, so it may need to be modified for accessibility purposes. At the city level, the tooltip displays all data from country to city with the relevant sales amounts. There is no drill-down option, because city is the last level of the hierarchy in this report; however, you could create a more granular hierarchy by adding postal codes and stores to the Location field. Save the project to your local computer, making sure to apply all changes before exiting Power BI. You should now understand how to use data to create geo hierarchies. Power BI map visualizations are a powerful and dynamic tool for data analysts seeking to explore, understand, and communicate geographic data.
In this video, you'll learn to explore the map visuals interface and display and configure a map. Adventure Works has created a filled map visual with a geo hierarchy; let's help the company format this map by exploring the control options Power BI offers. Launch Power BI and open the file Adventure Works Sales.pbix. Go to Visualizations and select Format visual, then Visual, then expand the Map settings drop-down. In the Style drop-down, you can select from the five map styles supported by Power BI; Road style is selected by default, so let's select Aerial from the drop-down list. Expand the Controls section to reveal the three zoom options: Auto zoom, Zoom buttons, and the Lasso button. Auto zoom is automatically turned on; you must also turn the Zoom and Lasso buttons to the on position. This provides more control over the map to highlight a specific region. The last option in Map settings is Geocoding culture; by default, Power BI sets it to Auto, so leave it as it is.

To further format the colors of the map visual, open the conditional formatting dialog box, where you can modify the colors as needed. With the current selection, these colors represent the sales data across various states and cities: yellow represents the states with the lowest sales values, and purple represents the states with the highest sales values. Next, you can rename the labels and titles to make the visual clutter-free and help users identify specific places on the map. Double-click the State Province Name field in the Location well of the map visual and rename it State; in the Tooltips field, rename Sum of Sales Amount to Sales. Go to Visualizations, Format visual, and then General, and change the title of the map visual to something more descriptive, like Sales Distribution by Location. You can also configure and format the information that appears when you hover over a specific region on the map: expand the Tooltips option, scroll down to Background, and change the color to light green. You can use the other options to further format the style and size of the data displayed in the tooltip. You have now created a filled map with a geo hierarchy and explored the various control and formatting options in Power BI. Remember, presenting information alone is not sufficient; you must also use formatting and design to create engaging dashboards and reports. In this video, you learned how to explore the Power BI map interface and display and configure a map.

Power BI offers various visualization options to display geographical data effectively. Two popular choices for mapping data are shape maps and filled maps, known as choropleths. Both of these visualizations enable users to present geographic data in a visually engaging and informative manner. In this video, you will delve into the key differences between these two map types, exploring their unique features, use cases, and the data they utilize. As a business analyst working at Adventure Works, you need to present regional sales data across different countries in Power BI, and you have two options to choose from: filled maps or shape maps. A filled map allows you to display color-coded regions based on a metric like sales for various geographical areas, while shape maps provide more flexibility for customization; the final selection should be based on the visualization requirements. Shape maps provide a platform for users to create their own custom visualizations by importing geographic data in the form of vector files. The vector files used in shape maps are typically in the TopoJSON format, which is a file format used for storing geographic data.
TopoJSON files allow for compact and efficient data representation, reducing data size and loading times in web applications and visualizations. With shape maps, users can visualize regions, countries, states, or even custom territories by utilizing their own data sets. There are three key features of shape maps to consider: customization, precision, and data complexity. Through customization, users have the flexibility to use their own data and design custom regions based on unique geographical boundaries or territories. With precision, shape maps can accurately represent non-standard geographic regions that are not predefined in standard geographical data sets. And in handling data complexity, since users provide their own geographic data, shape maps are ideal for visualizing intricate boundaries and smaller regions.
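If you want to sanity-check a TopoJSON file before importing it into a shape map, a quick way is to inspect its top-level structure. This is a minimal sketch; the filename is hypothetical, but any valid TopoJSON file exposes the same top-level keys.

```python
# A minimal sketch for inspecting a TopoJSON file before importing it
# into a Power BI shape map. The filename is a hypothetical example.
import json

with open("sales_territories.topojson", encoding="utf-8") as f:
    topo = json.load(f)

print(topo["type"])            # a valid file reports "Topology"
print(list(topo["objects"]))   # the named geometry collections (your regions)
print(len(topo["arcs"]))       # shared boundary arcs, the key to TopoJSON's compactness
```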
Filled maps, or choropleths, are a type of map visualization that leverages predefined geographical boundaries provided by Power BI's built-in mapping capabilities. Users assign data values to the regions represented by the map's predefined shapes, and filled maps use color shading to represent those values, allowing users to visualize data distribution across various regions. The key features of choropleth maps are simplicity, quick insights, and Bing Maps integration. Simplicity: filled maps offer a straightforward approach to map visualization, as they utilize predefined shapes without requiring additional custom data sets. Quick insights: with filled maps, users can quickly gain insights into data distribution and patterns across various regions. Bing Maps integration: filled maps benefit from Bing Maps' extensive geographic database, providing accurate and up-to-date boundary information.

There are four main differences between shape and filled maps; let's consider these differences and how they would impact your decisions when working with geographical data. The primary distinction lies in their data sources and customization options: while shape maps allow users to import their own custom geographic data, filled maps utilize predefined geographical boundaries from Bing Maps. This difference impacts the level of customization and the ability to visualize specific, non-standard regions. Imagine Adventure Works wants to visualize its complex sales territories, each with unique boundaries defined by the company's specific business needs. In this scenario, shape maps are the better choice: Adventure Works can import its custom geographic data, creating precise and granular visualizations that accurately represent its sales territories. The ability to use custom-defined administrative boundaries ensures that Adventure Works can tailor the map to its unique requirements, making shape maps the perfect choice for this task.

Shape maps represent data by associating values with custom regions created by users, offering precise and granular visualizations. Filled maps use color gradients to represent data values within predefined regions, providing a more generalized view of data distribution across larger geographic areas. Suppose Adventure Works wants to show its sales densities across different regions and get a quick, high-level overview of how sales are distributed. With filled maps, Adventure Works can quickly assess sales densities by country or region using color gradients, gaining insight without the need for custom-defined boundaries. Shape maps are best suited for scenarios that require complex geographic representation, such as visualizing sales territories, customer distribution, or custom-defined administrative boundaries. Filled maps, with their simplicity and quick insights, are ideal for showcasing high-level data patterns such as population densities, sales performance by country, or regional sales growth. Filled maps also benefit from Bing Maps' geographical database, which ensures accurate and up-to-date boundary information; this integration simplifies the process of creating visualizations, especially for users who do not have access to specialized geographic data sets.

Adventure Works faces a challenge: they want to showcase sales performance by country, highlighting regional sales growth, but they also want to maintain a level of precision. Here's where the choice between shape maps and filled maps becomes crucial. Shape maps, with their custom regions, could offer the precision needed to visualize specific sales trends; however, if a more generalized view is acceptable, filled maps can quickly provide insights across larger geographic areas, striking a balance between detail and simplicity.
In conclusion, shape maps and filled maps are two valuable map visualization options in Power BI, each catering to different use cases and data requirements.

In the realm of data visualization, geospatial information can be a game-changer: the ability to visualize data on maps not only adds context but also unlocks new layers of insight. Power BI offers a range of map visualizations, and one standout feature is its integration with Azure Maps. Azure Maps is part of the broader Azure location-based services family, also called Azure LBS, and provides a comprehensive platform for building geospatial solutions, including mapping, searching, routing, and traffic services. The Azure Maps visual provides a rich set of data visualizations for spatial data on top of a map. It connects to a cloud service hosted in Azure to retrieve location data, such as map images and coordinates, that are used to create the map visualization. It has several advantages compared to other map visualizations, including seamless integration with Azure services, advanced geospatial features, scalability, performance, enterprise-grade security, and developer friendliness. Details about the area are sent to Azure to retrieve the images needed to render the map canvas, also known as map tiles, and data in the Location, Latitude, and Longitude buckets may be sent to Azure to retrieve map coordinates, a process called geocoding. In this video, you will delve into what Azure Maps is, how to add it in Power BI, and a step-by-step guide to setting up and configuring an Azure map for Adventure Works' competitor analysis by state.

Now you will learn about Azure Maps and its usage in Power BI reports. You are working as a data analyst at Adventure Works, and you have public sales report data from a competitor; you will configure an Azure map for Adventure Works' competitor analysis by state. You can enable the Azure Maps Power BI visual by selecting the Azure Maps icon from the Visualizations pane; a disclaimer about Azure Maps' use of data appears on the screen. Access Model view to view the data model tables. The data model contains three data tables: a reseller sales fact table, a geography table, and a reseller dimension table, all related by one-to-many relationships. Return to Report view and drag the Country field from the geography table to the Location well of the Azure Maps visual, then drag the reseller measure from the reseller dimension table to the Size well; the bubble size proportionally represents the number of resellers in each region. To further analyze the resellers for each product line of Adventure Works, drag the Product Line field from the reseller dimension table to the Legend well of the visual; this adds color coding to the bubbles and displays the number of resellers for each product line in each country. You could create a geo hierarchy by bringing other fields from the geography table into the visual to analyze more granular data; however, in this video, let's focus on the country level.

Next, let's explore some formatting and control settings. Go to Visualizations, Format visual, Visual, and then Map settings. You can select the style of the map from the Style drop-down; select Road from the available options. In the Bubble layer section, you can configure the size, shape, and color of the bubbles. The bubbles' minimum size is very small, so let's change the size to 15 pixels in the Size option of the bubble layer, and change the color of each bubble slice based on the product line.
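As a rough illustration of the same bubble-layer idea outside Power BI, here is a minimal plotly sketch with hypothetical columns mirroring the wells used above: a location, a size measure (resellers), and a legend category (product line). It is a sketch under those assumptions, not a representation of the Azure Maps visual itself.

```python
# A minimal bubble-map sketch using plotly express: bubble size for
# reseller count, bubble color for product line. All column names and
# values here are hypothetical.
import pandas as pd
import plotly.express as px

df = pd.DataFrame({
    "country": ["United States", "Germany", "Australia", "France"],
    "resellers": [120, 45, 30, 25],
    "product_line": ["Road", "Mountain", "Road", "Touring"],
})

fig = px.scatter_geo(
    df,
    locations="country",
    locationmode="country names",  # interpret values as country names
    size="resellers",              # bubble size reflects reseller count
    color="product_line",          # legend color per product line
)
fig.show()
```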
You will also add category labels to the map for accessibility; let's increase the font size to 12 and reduce the transparency to 25%. Lastly, you can format the Azure map's title color, text style, and so on. By following the steps outlined in this lesson, you can seamlessly add, configure, and utilize Azure Maps to perform advanced analysis. As you continue to explore the possibilities of Azure Maps and Power BI, you'll be empowered to create compelling visual narratives that go beyond numbers, helping you make informed decisions driven by location intelligence.

Cycling is a peaceful and calming leisure activity that anyone can enjoy. Many people use their bicycles to get outdoors and enjoy the countryside or to go on camping trips with friends. But in the business of bicycle manufacturing, it's a constant battle to grow sales and find new markets. One way Adventure Works seeks new opportunities is by using data analysis. It recently conducted some competitor analysis, and that data tells an interesting story: its main competitor is performing really well in specific European regions. That's an intriguing insight, but the big questions are: what is the reason for that success? What is it about the market that makes it different from elsewhere? Is it something that Adventure Works can learn from, and does it have a product to satisfy the demand in this region? The Adventure Works team does some more research to figure out what their competitor is doing right. They check sales volumes, the products that do well, and the areas of Europe that are supplied by competitors. An analysis of competitor marketing tactics reveals that they're selling to a specific young female demographic in particular regions, using a lot of focused social media marketing to get their message to the target audiences. The findings point to the frustrations that young female cyclists have with their choice of bike types for city and suburban commuting.

To bring more depth to the data insights, Adventure Works decides to analyze city demographic data where its competitors are most successful. Focusing efforts on these areas leads to the discovery that there are market demographics that are a perfect match for some Adventure Works products. So what can Adventure Works do to compete in the identified regions and markets? To find out more, the team dives further into the demographic and marketing data. The data analysis team then uses the data discoveries to create geographical visualizations, which identify patterns and trends that can lead them toward the development of a new marketing strategy. Finally, it's time to present the new market plan to the company's management team, examining the new report of the targeted regions and comparing the data to its own target audience for bike ranges. Adventure Works uses the collected data to design its own strategy to target a similar demographic. The marketing staff brainstorm ideas for social media adverts, influencers, and other marketing tactics in the areas where the target audience is spending most of its time. Jamie, the CEO, believes the plan has the potential to be very successful and is confident it will help the company compete with her rivals in these regions. Data analysis is a powerful tool for discovering new business markets: creative use of chart visuals and map visualization can help identify new opportunities and grow the business. Through sales data analysis and competitor data analysis, Adventure Works identified a market that it had not yet entered but in which competitors were already performing well.
Through visual analysis of the data, it found market segments that matched its product line. This was valuable insight, and it led the company to new customers and new regions with high potential for continued growth.

Power BI offers several core visuals readily available in the Visualizations pane, but what if the type of visualization you require doesn't exist in Power BI? You can create it with custom visualizations. In this video, you'll explore what custom visualizations are, why they matter, and how to create them. Adventure Works needs a visualization to explore its sales data; however, none of the existing visualizations in Power BI is appropriate, so Adventure Works needs a custom one. Find out more about custom visualizations, then help Adventure Works build its own.

So what are custom visualizations? Custom visualizations are user-defined visual elements that extend the capabilities of Power BI beyond the built-in visual options. They enable you to create unique, tailor-made visuals that cater to specific business and visualization requirements, enhancing data's clarity and impact. But why do custom visualizations matter? Because of their ability to address unique needs: every organization has its own analytical requirements, and with custom visualizations you can create visuals that directly resonate with your organization's specialized needs. Custom visuals can also offer insights that standard visuals might not convey as effectively, helping you uncover trends and patterns hidden within your data. For example, through its custom sales data visuals, Adventure Works might discover that it sells more bicycle repair equipment in the winter months.

Custom visualizations can be installed in Power BI from different sources. You can import custom visuals created by developers from the Power BI marketplace. Certified Power BI visuals are available in AppSource; Microsoft or its partners develop these visuals, which can be downloaded from Power BI Desktop. You can create custom visualizations in Power BI using the Python or R programming languages; these visualizations are imported from a file on your local computer. You can also develop your own Power BI visuals to meet your analytical or aesthetic needs. If developing in R or Python, it's recommended that you use an integrated development environment, or IDE, such as Visual Studio Code, also known as VS Code. Python is a powerful, open-source programming language often used for data analytics. It's very versatile and offers a rich ecosystem; it's beginner-friendly and backed by community support, making it a great language for data professionals. It also offers pre-written code bundles, or libraries, for creating visualizations, such as Seaborn and Matplotlib. Using R or Python to develop your own Power BI visuals, or to customize existing ones, is an optional expertise; you may wish to pursue it if you have a coding background, a familiarity with Python, or a desire to extend your skill set into this area.

Before creating a visualization, you need to load some data for it. Luckily, Python libraries include built-in example data sets that can be imported and used to create new data sets. For this demonstration, Python has already been installed for Power BI, and the relevant libraries and data sets have been imported. The first step I need to take in Power BI Desktop is to enable Python scripting. I navigate to File, select Options and settings, then select Options. This opens the Options dialog, where I can select Python scripting. Always ensure Power BI has detected the Python installation path under Detected Python home directories; if you need to, you can copy and paste the path from your Python installation.
I select OK; now I am ready to use Python in Power BI. Python is used in Power BI in two ways: the first is to import data, and the second is to create custom visualizations. Let's explore the first method and import some data. Python libraries contain sample data sets that you can import into Power BI. I navigate to the Get data drop-down and select More; this opens the Get data dialog. In the search bar, I type Python, and Python script appears on the right side of the window. I select Python script and then select Connect. A Python script dialog box appears on screen; from here, you can write a Python script to import sample data from Python libraries. For instance, I can write a Python script to import a data set into Power BI Desktop. The code creates a data frame by importing the pandas package of Python, with the required columns and associated values; a sketch of such a script follows below. Once I execute the code, Power BI opens the Navigator window with a data set named sample data set. I select Load, and the data set appears under the Data pane on the right side of the Power BI interface; it can now be used to create visualizations in Power BI.
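As a minimal sketch of what such an import script can look like: Power BI's Python script connector lists every pandas DataFrame the script defines as a loadable table. The column names and values below are hypothetical, chosen to resemble the sample data set used later in this lesson.

```python
# A minimal sketch of a script for Power BI's "Python script" data
# connector. Power BI loads any pandas DataFrame defined here; the
# columns and values are hypothetical examples.
import pandas as pd

sample_dataset = pd.DataFrame({
    "name":   ["Alice", "Ben", "Carla", "Dev"],
    "age":    [29, 41, 35, 52],
    "weight": [62.5, 81.0, 70.2, 88.4],
})
# After the script runs, Power BI's Navigator window lists
# "sample_dataset" as a table that can be loaded into the model.
```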
Power BI offers a wide range of core visualizations, but custom visualizations provide several unique advantages that contribute to more effective data communication, improved insights, and tailored solutions. Python, with its rich set of libraries and its ability to handle data manipulation, visualization, and machine learning tasks, is an essential tool for data professionals.

As a data analyst, it's important to be able to extract the insights you need from your data and present them engagingly. Integrating Python with Power BI allows you to explore your data more deeply to reveal further insights and present the data through sophisticated visualizations. In this video, you'll learn how to add a Python-based visualization in Power BI Desktop. Adventure Works is analyzing its data sets and realizes that the core Power BI visuals don't provide a comprehensive view of its data; you can help the company generate a more sophisticated analysis by leveraging a Python-based visualization in Power BI. Let's learn more about adding a Python-based visualization, then help Adventure Works.

Python is a powerful scripting language that relies on libraries; these libraries, like Matplotlib and Seaborn, can be integrated with Power BI to create dynamic and sophisticated custom visualizations. Although Python provides useful features and libraries, it still has a few limitations, and it's important to be aware of them before designing visuals: a Python visual's data set is limited to 150,000 rows and has an input limit of 250 megabytes; all data fields from different tables must have defined relationships between them, or you'll encounter an error; Python visuals refresh after each update, filter, or highlight; and external Python scripts might raise security concerns. Using R or Python to develop your own Power BI visuals, or to customize existing ones, is an optional expertise you may wish to pursue if you have a coding background, a familiarity with Python, or a desire to extend your skill set into this area.

To get more familiar with custom visualizations, let's demonstrate a Python custom visualization in Power BI Desktop. For this demonstration, Python has already been installed for Power BI, and the relevant libraries and data sets have been imported, so the first step is to create a visualization using the imported sample data set. I navigate to the Visualizations pane and select the Python visual icon. This opens a dialog called Enable script visuals; select Enable. A placeholder for a Python visual appears on the report canvas, and a Python script editor appears at the bottom of the report page. A Python script can only use fields added to the Values section, from which Power BI creates a data frame. You can add or remove fields while you work on your Python script; Power BI Desktop automatically detects field changes, and as I select or remove fields from the Values section, the supporting code in the Python script editor is automatically generated or removed. I drag all the fields from the sample data set table to the Values section of the Python visual. Based on the selection, the Python script editor generates code that creates a data frame named dataset with the fields I added to the Values section; duplicate rows are removed from the data, and the fields are grouped.

The first visual will be a scatter plot that generates insights between the age and weight fields of the sample data set. In the Python script editor, I write the code to draw a scatter plot that measures age on the x-axis and weight on the y-axis. The code imports the Matplotlib Python library, which creates the plot. Finally, I select Run from the top right corner of the Python script editor title bar to generate the Python visual on the report canvas.

Next, to generate another Python visual using Adventure Works data, I open the Adventure Works Sales Power BI project. The data model contains four related data tables: sales, products, salesperson, and region. I make sure the data tables relate to each other through appropriate relationships; without these relationships, you cannot use fields from the different tables to create Python visuals. The visual required for Adventure Works is a bar chart of total sales by country. To create this visual, drag the Total Sales field from the sales table and the Country field from the region table to the Values section of the Python visual. The editor again creates a data frame named dataset with the fields I added to the Values section; duplicate rows are removed from the data, and the fields are grouped. To create the column chart, I write the Python script under "Paste or type your script code here" and then run the script. The script imports the Matplotlib visualization library and draws a bar chart with total sales on the y-axis and country on the x-axis. You can customize the visuals' colors, sizes, data values, and other attributes by modifying the Python code or importing other libraries.

That's an example of creating Python-based visuals in Power BI, both with imported data and with the Adventure Works sales data set. Integrating Python with Power BI helps move a sophisticated data analysis into a compelling presentation. However, even though Python-based visualizations expand the capabilities of Power BI, they have some limitations to consider, such as Python's limited data set size, and they do require specialist expertise to implement in Power BI.
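For reference, here is a minimal sketch of the kind of script the bar-chart example describes. In a Power BI Python visual, the fields placed in the Values section arrive in an auto-generated pandas DataFrame named dataset; the field names below (Country, Total Sales) are assumptions based on the scenario, so adjust them to match your own model.

```python
# A minimal sketch of a Power BI Python visual script. Power BI
# auto-generates a pandas DataFrame called "dataset" from the fields
# in the visual's Values section; the field names here are assumed.
import matplotlib.pyplot as plt

# Group the auto-generated dataset and total the sales per country.
totals = dataset.groupby("Country")["Total Sales"].sum()

plt.bar(totals.index, totals.values)
plt.xlabel("Country")
plt.ylabel("Total Sales")
plt.title("Total Sales by Country")
plt.show()  # Power BI renders the figure produced by the script
```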
Welcome to this high-level recap of the concepts and techniques covered this week. This summary will help you revise the lessons on the design of powerful report pages. During the course, simulations of Adventure Works scenarios were used in videos and exercises; these scenarios are designed to facilitate understanding and provide relatability. The items we will review are: clarity and visual impact; accessibility considerations for Microsoft Power BI; creating and formatting KPI and dot plot charts; how to visualize high-density, multi-dimensional data; map visuals such as choropleth and shape maps; and custom visualizations, including adding a Python-based visualization.

In the first lesson, on visual clarity in reports, you learned to transform raw data into a story using charts and graphs that express the essential narrative of your data. Charts, data, and visuals are all crucial components of the clarity and visual appeal of data visualization. Selecting the correct chart type simplifies complex information, making it easier for stakeholders to understand your presentation. Design with your audience in mind: consider how familiar they are with data visualizations, and then select visuals and chart types that are appropriate for their background and experience. You must use your design ability to create visual impact and clarity; one technique for doing this is to eliminate clutter. When building reports and visualizations, don't neglect accessibility. Produce reports that can be easily used and understood by all individuals, including those with disabilities; production should include alt text for visuals, sufficient color contrast, keyboard navigation, and compatibility with screen readers.

The key areas of impactful report creation include deciding on the report objective, establishing a visual hierarchy, using branding and themes, carefully composing the report, employing storytelling techniques, and optimizing the report's performance for the best user experience. When deciding on an appropriate chart type, consider the recommended use cases for the chart, its strengths, and its limitations. By having a clear objective, maintaining a visual hierarchy, implementing consistency, and adhering to best practices in all design choices, such as chart selection, you can create a report that makes the best impression on the audience.

KPI charts are often used to illustrate performance benchmarks, measure progress, and identify trends; you can use the Microsoft Power BI built-in KPI visual, or use gauge charts and bullet charts, to present KPI values. Dot plot charts are used to visualize the distribution and frequency of categorical data by displaying data points along a single axis; for instance, you can use a dot plot to represent category information on the x-axis, sales on the y-axis, and sales quantity as the size of the dot. Bubble charts depict multi-dimensional data in a single view; for instance, to analyze the performance of various products in different markets, the x- and y-axes can represent market share and revenue, while the size of the bubble relates to the total number of units sold. With bubble charts, you visualize data point density and use sampling techniques to manage data representation on the chart.

When creating reports, Power BI has many built-in capabilities that support ease of use and help your productivity; they include app navigation, ribbon navigation, and navigation in key panes such as the Visualizations pane and the Selection pane. As a designer, should you have any disabling factors yourself, there are accessibility options that allow you to operate and design in Microsoft Power BI. You also explored advanced display techniques in Microsoft Power BI, such as techniques to present high-density data and the use of maps, drills, and 3D visualizations. For instance, you could use a heat map to illustrate sales figures using a color spectrum, or a tree map to display hierarchical data and compare data point proportions. For sales data plotted on a time scale, users can use drill down to look at the sales data on a date hierarchy that goes from a year to each quarter, to month, and all the way down to a daily level.
Power BI also gives you the ability to use chart drill through and page drill through, techniques for creating summary pages with high-level insights. 3D visualization, such as 3D mapping, adds a sense of depth and realism to data, making it easier to identify trends and analyze data. As a report creator, you must optimize report layouts for mobile devices to ensure reports display properly on mobile screens; one of the key techniques for doing this is responsive design.

Power BI's shape map visualization reveals insights from geographical data. Choropleth maps visualize variations in data across different locations by color-coding geographical regions based on data values; a popular use case for choropleth maps is displaying environmental data such as air quality, temperature variations, or pollution levels. For any Power BI map visual, it is vital to properly prepare the data; this includes cleaning, formatting, handling missing values, and optimizing for performance. One key feature of Power BI map visualizations is the integration with Azure Maps, which is part of the broader Azure location-based services family, also called Azure LBS. Custom visualizations are user-defined visual elements that can create unique, tailor-made visuals for specific visualization requirements. Custom visuals created by developers can be imported from the Power BI marketplace; certified Power BI visuals are available in AppSource and can be downloaded from Power BI Desktop; and you can also create custom visualizations in Power BI using the Python or R programming languages. To help you design powerful report pages, you explored various features this week: clarity and visual impact for charts and reports, accessibility considerations for Microsoft Power BI, creating and formatting KPI and dot plot charts, visualizing high-density multi-dimensional data, map visuals such as choropleth and shape maps, and custom visualizations, including adding a Python-based visualization. By applying these techniques, you will be better able to create powerful report pages in Microsoft Power BI.

Data is a treasure, and with Microsoft Power BI's analytical powers, you can explore it in a variety of ways. But what do you need to explore this treasure: a treasure map to see the big picture, or a magnifying glass to analyze the details? That's the difference between a dashboard and a report. A dashboard provides a high-level analysis of the data in one centralized place; dashboards are a simplified overview of the big picture, designed to highlight key metrics for quick monitoring and decision-making. Reports are comprehensive and analytical, designed to dive deep into data; within a report, you can analyze the finer details of the data and add filters, slicers, and drill-through functions. In this video, you will learn more about the key differences between Power BI dashboards and reports, discovering their use cases along the way.

Jamie, the Adventure Works CEO, needs to visualize an overview of the company's performance, including sales, marketing, customers, and so on. The sales and marketing directors need to explore more granular data to identify trends, outliers, and anomalies within the data. As the principal Power BI analyst, you need to decide on a dashboard design that will work perfectly for presenting summary-level visualizations to the CEO, while for each of the directors, you need to create detailed reports about sales and marketing. Now let's delve into the primary differences between dashboards and reports.
Both Power BI dashboards and reports serve distinct purposes and have unique design considerations. Before exploring design approaches, let's try to understand the fundamental differences between them, starting with some key characteristics of Power BI dashboards. Power BI dashboards are concise, summarized displays of underlying reports in Power BI. They typically consist of a single canvas or page, offering a high-level view of metrics and key performance indicators, also called KPIs. Dashboards are designed for quick decision-making and monitoring, and they can include visuals, tiles, and widgets from different reports. When it comes to creating and designing a dashboard, you can only do it in the Microsoft Power BI service. The Microsoft Power BI service, sometimes referred to as Power BI online, is the software-as-a-service part of Power BI. You generate a dashboard in the Power BI service using visual elements and tiles, and you can also pin an entire page of a report to your dashboard.

First, there is simplicity and focus: dashboards are concise and focus on key metrics, avoiding clutter and unnecessary visual elements and prioritizing the most critical information for quick decision-making. Next, there is visual hierarchy: visuals need to be arranged in a logical sequence, with size, color, and placement used to emphasize the significance of the information presented. Lastly, there is mobile responsiveness: you must ensure your dashboard is responsive and visually appealing on a variety of devices, such as tablets and mobile phones, using responsive design principles to adapt to all screen sizes.

Now let's turn our attention to Power BI reports. Power BI reports are detailed, structured documents, often consisting of multiple pages or tabs. They are designed for in-depth analysis and exploration of data, containing tables, matrixes, and visuals that provide detailed insights, and they support filtering, drill through, and slicers for interactive exploration. To maximize report impact for all types of viewers, you must consider three major areas of design: layout and structure, interactivity, and storytelling. Let's start with layout and structure: you need to use a clear and logical structure to guide report users through the data, utilizing page numbers, titles, sections, and headers to improve report navigation. Next is interactivity: in the report design, you must consider adding slicers, filters, and drill-down and drill-through functionality to access granular data. Finally, storytelling: reports are designed to tell a data-driven story, so use text boxes, annotations, and narratives to explain valuable insights, and arrange visual elements in a logical sequence to guide users through the introduction, main body, and conclusion of the story.

Before exploring an example of using dashboards and reports, let's touch on charts in Power BI and how they interact with dashboards and reports. Appropriate chart selection to match the type of data being presented is essential to designing both reports and dashboards in Power BI. Chart selection is critical in data visualization, as it directly impacts the effectiveness of data communication; the choice of chart will determine how your audience understands and interprets the data. Because a dashboard is based on your underlying reports, it is essential to make the correct chart selections for the data in your reports. For your task for Adventure Works, you need to create multiple dashboards: for the CEO as well as for the sales and marketing directors. Let's start
with the CEO, Jamie, who needs a tailored dashboard with data presented to meet their specific needs. With this dashboard, you should focus on a design emphasizing high-level insights, key performance indicators, and strategic information in a visually appealing layout. A typical dashboard layout of this kind often includes six categories. First is an executive summary: this section may include KPIs in the form of card visuals, such as revenue, profit margin, year-over-year growth, and market share. Next up is sales performance, which may include charts showing revenue, expenses, profit trends, and time comparisons. The third category is market overview, which represents market share trends and competitive analysis. The fourth category, customer metrics, can include customer retention and acquisition rate charts. The fifth category is operational performance; here, production output, customer satisfaction, and departmental performance visuals can be included. Finally, there are strategic initiatives: completion status for key initiatives in the form of progress bars, along with charts illustrating project timelines and milestones, can be presented in this section.

For the sales director, you need to design reports with drill-down and drill-through modes for detailed and granular data analysis. For the drill-down and drill-through modes to work, you can break the report into individual pages: sales performance overview, geographical analysis, product analysis, salesperson performance, and time-based analysis. Each of these pages needs to be designed with an appropriate structure and chart selection based on the data you want to present. Lastly, let's consider what is required for the marketing director's report. The marketing director will need to see data related to Adventure Works' marketing channels, how campaigns are performing, and a categorization of customers, so the report content should contain an overview, marketing channel analysis, campaign performance, customer segmentation, and recommendations and insights. This will provide the marketing director with a good starting point for assessing their department. And that concludes our summary of dashboard versus report design in Microsoft Power BI: designing a dashboard and designing a report are distinct processes with unique objectives. Reports offer in-depth analysis and exploration of granular data, while dashboards provide a high-level overview for quick decision-making and monitoring of key metrics.

Consider a Power BI dashboard that feels like it was designed just for you, precisely delivering the insights you need to drive your decisions. This dashboard is designed to optimize the experience of you, the end user, making your work easier. Creating user-centric dashboards in Power BI is not about displaying a collection of charts and graphs; it is about solving specific problems for your users, with important data indicators prioritized high on the page, trends and performance comparisons further down, and general information toward the bottom. In this video, you will learn about getting a better understanding of your audience and creating user-centric dashboards, as well as exploring some examples of these dashboards. So how can you better understand your audience when designing your Power BI dashboards? You will likely have a baseline of knowledge depending on the products or services your company offers, but what else can be done to help understand your target audience? Let's look at four methods you can use. They are: identifying the end users,
defining user needs, establishing users' data literacy, and finally, identifying the preferred devices of users. Let's begin by identifying the end users. End users are the individuals or groups who will be interacting with and generating insights from your dashboards; identifying your audience helps you tailor the dashboard to their specific needs and preferences. Next, you must define user needs. Each user group may have distinct data requirements and objectives, so you need to work closely with each group to determine the specific data they work with and how you can visualize it. You can do this by identifying key metrics relevant to their roles, allowing you to select what is presented on their dashboard. Having established the end users and their needs, you must now consider their level of data literacy: are they data savvy, or do they need a simplified data interface? For example, a sales team will need the most accessible data presentation, as opposed to a finance team that may be used to more complex data sets and charts. Lastly, you must consider the device preferences of your audience. Consider the devices they use most frequently: are they accessing dashboards on laptops, tablets, or mobile devices? This will help you make selections optimized for device-specific dashboards.

Let's consider an example where this is put into practice. The Adventure Works sales director received a sales performance dashboard that she did not like, as it was difficult to comprehend the visuals on it. Realizing she is unable to use the current dashboard to assist in decision-making, she passed the dashboard and underlying reports to you to make the necessary improvements. When you open the dashboard, you look to identify the issues. The dashboard might look impressive at first glance, but there are many problems. Remember, a dashboard should be understandable and actionable, but currently this dashboard is neither; there are data shortcomings as well as design shortcomings. The data shortcomings include the following: the area chart displaying sales by category is not appropriate here; the donut chart shows sales by country without any legend; the tree map used to display sales by product subcategory is too busy, with too many colors; and the top five products by sales column chart is not relevant to the sales dashboard. With regard to the design, there is a similar number of issues: the sales-by-year column chart has a negative value, but it is the same color as the positive numbers; key metrics of the dashboard, such as revenue, units sold, and profit, are not presented appropriately; and overall, there is no color and style uniformity across the dashboard. Based on this brief analysis, it can easily be concluded that the dashboard is neither understandable nor actionable.

Your task is to redesign the dashboard, focusing on key metrics, including the information relevant to salespeople, and using visually appealing colors and charts. Let's redesign this dashboard by following these steps. Select visuals that effectively convey your intended message; when you design user-specific dashboards, you might want to import custom visuals in Power BI to meet the specific needs of your audience. Next, place the most critical information at the top of the dashboard, based on the requirements gathered, and use key performance indicator tiles to highlight key metrics. Maintain consistency in your design, including color schemes, fonts, and layouts; if you choose a color to convey positive figures, ensure it is consistent across all graphs and charts.
Ensure you employ responsive design techniques when designing your dashboard: many end users access dashboards from their mobile devices, so you need to make sure the dashboard is visually appealing and functional on smaller screens. Create a narrative flow within your dashboard; text boxes, card visuals, and annotations can guide users through the data visualization. If you implement these best practices to redesign the dashboard, you will create a dashboard that is understandable and actionable. The redesigned dashboard is concise, relevant to the sales manager, and consistent in terms of theme and color palette, and all the charts are appropriate for the data type presented.

Let's finish this example by outlining some user-specific dashboards you might design for other departments in Adventure Works. For the marketing team, your dashboard would monitor marketing campaign effectiveness, visualize social media engagement, provide demographic and geographic insights about the target audience, and display competitor analysis on various product lines. If you were tasked with developing a dashboard specific to the customer support team, you would track customer support ticket data, display customer satisfaction scores, provide a real-time view of open tickets and escalations, and highlight frequently reported problems. These are just guidelines; in real-life situations, you need to tailor your dashboard according to your user requirements. Once you have crafted and designed a user-specific dashboard, it is essential to conduct testing and receive user feedback to ensure that the dashboard meets users' needs and expectations; user feedback especially adds value to improved iterations of your dashboard. Creating user-centric dashboards is about two things: is it understandable, and is it actionable? To achieve this, you need to identify your target audience and understand their needs, their data literacy, and the devices they use to engage with dashboards. You should now understand the effective use of visuals, how to remain consistent in your color selection, and how to select the most appropriate data for your audience.

Imagine you are working for Adventure Works when you receive a request from your manager, Adio Quinn, who is traveling abroad for a business meeting. They need an up-to-date overview of the company's sales performance in a dashboard format. Adio may not be able to access the dashboard on a large device such as a computer or laptop while traveling; therefore, your primary goal is to create and optimize the dashboard so Adio can access the required information on the go using their mobile device. In this video, you will learn how you can optimize dashboards for mobile phones in Microsoft Power BI. Mobile optimization of Power BI reports and dashboards is not just a trend; it is a necessity in modern business intelligence applications. There are three reasons in particular why mobile optimization is so important: accessibility, real-time decision-making, and enhanced user experience. First, accessibility: mobile-optimized dashboards ensure that actionable insights are accessible to users who rely on smartphones as their primary device. The second reason is real-time decision-making: executives, directors, and managers need up-to-date information at their fingertips to make strategic decisions on the go. Lastly, there is enhanced user experience: a well-optimized dashboard improves the user experience, making it easier for users to interact with and understand data. Let's explore how you can optimize the Adventure Works sales dashboard for mobile devices.
is a single canvas of data visualization displaying the current state of the business, based on underlying reports in Power BI service. You want to optimize a sales summary dashboard for mobile devices. Log in to your Power BI service; all reports, data sets, and dashboards are listed in My workspace. Select My workspace from the left navigation pane of the Power BI canvas and select the sales summary dashboard to open it. This is an existing dashboard created from a report published from Power BI Desktop. In My workspace, dashboards are distinguished by clock icons. Once the dashboard is open, select the arrow beside Edit on the top menu and then select Mobile layout from the drop-down options. This opens the phone dashboard edit view.

The phone layout screen has two panes: Edit mobile layout and Unpinned tiles. The Unpinned tiles pane contains all tiles that are unpinned from the dashboard. You can resize and rearrange any tiles to fit the phone view; the desktop version of the dashboard will not change. You can also unpin any tile from the phone view if it does not fit or is not needed. In the Edit mobile layout screen, the tiles of the sales summary dashboard are not in the correct order. You can resize, reposition, and rearrange the tiles in the mobile layout, and once you drag and resize a tile, the other tiles adjust their positions automatically. Instead, select Unpin all tiles from the top menu bar. This unpins all tiles and moves them to the Unpinned tiles pane, allowing you to start the design from scratch. You can now pin individual tiles and resize them in sequence in the mobile layout pane. The three card visuals contain a snapshot of information about sales and profit, so pin these three card visuals to the top of the mobile layout screen; select the pin icon in the top right corner of each tile to pin the visual to the mobile screen. Next, pin the yearly profit tile below the card tiles. You can pin the sales by year and sales by category tiles side by side below the yearly profit tile, and then pin the sales by country and sales by salesperson tiles below the existing tiles. You can enlarge the sales by salesperson tile to display the entire data set. The top five products tile is not related to the sales summary dashboard and is not needed on the mobile screen, so you can leave that tile in the Unpinned tiles pane. You can resize and rearrange the tiles according to your analytical and audience requirements. If you are still unhappy after you have completed these changes, you can either reset the tiles or unpin all tiles: Reset tiles returns the dashboard to its original state, while Unpin all tiles moves all tiles from the phone screen to the Unpinned tiles pane. When you're satisfied with the phone dashboard layout, you can switch back to web view by selecting Web layout from the top menu bar; Power BI automatically saves the mobile layout.

Once the dashboard has been completed, you can view it on your cell phone. You will need to download and install the Power BI mobile app and log in to your account; all dashboards are listed in My workspace. The ability to access and act on data insights while on the move is an essential element of today's fast-paced business landscape. By ensuring your mobile dashboards are accessible, enable real-time decision-making, and enhance the user experience, you will set yourself up for success. Optimizing Power BI dashboards for mobile devices ensures that decision makers have access to the data they need, when they need it, leading to better and more immediate
decisions.

Given the number of data sources available, a single dashboard can never display all of the available data, so as a data analyst you must manage multiple dashboards and reports in Microsoft Power BI. Let's say you need to design multiple, similar dashboards; for example, you might need these dashboards for managers in different countries. Designing each dashboard from the beginning each time is not good practice. In this video, we will explore features in the Microsoft Power BI service that can accelerate your workflow when creating and managing multiple dashboards. There are two workflow approaches you can use in Power BI service: making a copy of a dashboard, and pinning elements from one dashboard to another.

There are many occasions when a copy of a dashboard helps your workflow. These include using a dashboard as a template, testing dashboard versions, making regional versions of a dashboard, and working with databases that have the same data structures and types. You can use an existing dashboard as a kind of template to create a new dashboard; use this technique when you work on scenarios that closely resemble each other in terms of structure and flow of information. The procedure is to build the first dashboard, copy it, rename it, and then edit the copy, modifying it to reflect the second data scenario. To test dashboard performance, create a duplicate of a dashboard, modify it, and test its performance against the original version. For global operations, you may need to create slightly different versions of a dashboard to match the culture, language, or norms of various countries or regions. And when you get a new database that has the same data structure and types as an existing data set, you can duplicate the original dashboard and use it as a template for the new data set.

The second technique for handling multiple dashboards in Power BI service is copying a visual element between dashboards. For example, imagine you have a custom visual tile in a dashboard that you want to include in another dashboard in your workspace. You can simply pin the tile from one dashboard to another without navigating back to the original report. The source of the tile does not change, meaning that the pinned tile links back to the original source report where it was created; if the original content changes, all dashboards pinned to it will also be updated. To create and copy dashboards, you must use the Microsoft Power BI service. You can view dashboards in Microsoft Power BI service and in Microsoft Power BI mobile, but dashboards are not available in Power BI Desktop, so you need to publish all your reports to Power BI service before creating and managing dashboards. To create a copy of a dashboard, you must be the creator of that dashboard; if someone on your team shared a dashboard with you, you cannot duplicate it. Likewise, you cannot pin tiles from dashboards shared with you, only from dashboards created by you.

Let's open Power BI service and explore some techniques to manage multiple dashboards. To duplicate a dashboard, log in to your Power BI service and open the workspace that contains your dashboard. Select the dashboard to duplicate from My workspace, navigate to File, and select Save a copy from the drop-down. A duplicate dashboard dialog opens, where you need to give the duplicated dashboard an appropriate name. Select Duplicate, and the duplicated dashboard is saved in the same workspace as the original. The dashboard can now be opened and modified to satisfy the analytical requirements. Some of the tasks you can perform include moving, resizing, and deleting tiles; adding or pinning new tiles; and sharing your dashboard with colleagues and team members.
The next task is to pin a tile from one dashboard to another. Open the product sales dashboard from My workspace, hover the cursor over the tile to pin, then select More options and select Pin tile from the drop-down. In the Pin to dashboard dialog, either select an existing dashboard to pin to from the drop-down or create a new dashboard and pin the tile to that. When you select Pin, a success message appears in the top right corner, indicating the visualization has been pinned to the selected dashboard. Open the dashboard to check the pinned visual; further operations can now be performed on the pinned visualization, like resizing, renaming, and moving. You can duplicate a dashboard and pin a tile from one dashboard to another in Microsoft Power BI service. In real-world data analysis, working on many dashboards and reports is a frequent practice, and being able to quickly replicate a dashboard and copy visual elements between dashboards is a valuable addition to your skill set.
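If you find yourself pinning the same tiles across many dashboards, the Power BI REST API exposes a Clone Tile operation that can automate this step. The following is a minimal sketch, assuming you already hold an Azure AD access token with the appropriate Power BI permissions; the token and all IDs are placeholders, not real values.

```python
# Minimal sketch: cloning (pinning) a tile from one dashboard to another
# through the Power BI REST API instead of the service UI.
# ACCESS_TOKEN and all IDs below are placeholders.
import json
import urllib.request

ACCESS_TOKEN = "<azure-ad-access-token>"        # placeholder
DASHBOARD_ID = "<source-dashboard-id>"          # placeholder
TILE_ID = "<tile-id>"                           # placeholder
TARGET_DASHBOARD_ID = "<target-dashboard-id>"   # placeholder

url = (f"https://api.powerbi.com/v1.0/myorg/dashboards/"
       f"{DASHBOARD_ID}/tiles/{TILE_ID}/Clone")
body = json.dumps({"targetDashboardId": TARGET_DASHBOARD_ID}).encode("utf-8")

req = urllib.request.Request(url, data=body, method="POST", headers={
    "Authorization": f"Bearer {ACCESS_TOKEN}",
    "Content-Type": "application/json",
})
with urllib.request.urlopen(req) as resp:
    print(resp.status)  # 200 indicates the tile was cloned to the target
```

As in the UI, the cloned tile keeps its link to the original source report, so it updates whenever the underlying content changes.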
Content with visuals always attracts more viewers than non-visual content. Visually rich media such as photos, images, videos, and animations significantly contribute to the impact of content: eye-catching visuals help to onboard and engage viewers, while informative visuals enable them to focus on and understand your message. In this video, you'll discover the media elements you can integrate into your dashboard and explore the benefits they bring to your workflow. Microsoft Power BI service supports many media types in a dashboard, including text boxes, images, videos, web content, and live streaming or real-time data. There are many benefits to using media elements, such as their ability to enhance data context, create engagement, reinforce branding, provide instructions, and present a summary. Visual content such as images and videos provides context for data; for example, you can use images to display product photos, company logos, and location maps, and use video footage such as a manufacturing or promotional clip to help users understand the data being presented. Still images and motion graphics make dashboards more engaging and assist effective storytelling; videos or animations, for instance, can be included to narrate the story behind the data, making it more relatable and impactful. You can reinforce an organization's branding by including company logos and product images in your dashboard; animations and video clips about a company's corporate culture, manufacturing process, or marketing campaigns are some examples. Short video clips containing instructions on how to navigate dashboards and interact with data effectively are another helpful application of media in dashboards. Finally, images and icons can be used to present a visual summary of data, making it easier to quickly grasp key insights.

You can also include live streaming as a media element in a dashboard. Power BI's real-time streaming updates your dashboard data automatically and constantly, and any Power BI visual or dashboard can be used to display and update real-time data and visuals. The streaming data that feeds your updates can come from social media; from sensors, such as a point-of-sale terminal or sensors detecting changes in light, heat, or motion; from service usage, such as metering the consumption of power or other utilities; or from any other time-sensitive data. There are three types of data sets designed for display on real-time dashboards and tiles: push data sets, streaming data sets, and PubNub streaming data sets.

A push data set is one where the data is pushed to Power BI service from a live data source, such as SQL Server. When the data set is created, the Power BI service automatically creates a new database in the service to store the data. With a push data set, you can create visuals, reports, and dashboards as with any other report visual, because the data is stored in Power BI service. You can pin any visual from your report to the dashboard, and on the dashboard, visuals are updated in real time whenever the data is updated.

Power BI only stores data from a streaming data set in temporary caches, which expire quickly. With a streaming data set, the data is also pushed to Power BI service from a constantly updating source, such as SQL Server, Amazon Web Services, Oracle, and so on. However, a streaming data set is not stored in Power BI memory; it has no underlying data set physically saved in Power BI. That means you cannot use regular report functionality, like filters and slicers, for drill-down functions and interactivity. The only way to use a streaming data set is to add a tile to your dashboard and use the streaming data set as a data source, called custom streaming data in Power BI service. The tile is then optimized to quickly display real-time data. You can choose any visual you want on the tile, and the benefit of a streaming data set is that the visual always displays live data.

You can also use a PubNub streaming data set. PubNub is a platform for building real-time applications. It works with a minimum of delay, which is called low latency, because no data is pushed to Power BI: all real-time data is live-streamed from PubNub. It is a solution with high reliability that is scalable, meaning that its reliability and performance are retained as your audience grows. This is a vital feature, since your audience will expect real-time changes to be instant regardless of how many viewers are online; PubNub manages this by being scalable over globally distributed data centers. PubNub is compatible with platforms across web, mobile, and the Internet of Things, and Power BI is one of the platforms that can read an existing PubNub data stream. The Power BI web client uses the PubNub software development kit, or SDK, to read an existing PubNub data stream; the Power BI service stores no data because the web client makes this call directly. You must make sure that traffic from your network to PubNub is allowed. Like a streaming data set, Power BI does not store the data, so you cannot use any report-building functionality. You can visualize a PubNub streaming data set by adding a tile to your dashboard and configuring a PubNub data stream as the data source; tiles based on a PubNub data source are optimized to quickly display real-time data. PubNub is a streaming service, meaning it is a platform that helps build and operate real-time interactivity for mobile, web, and the Internet of Things, and it is useful for real-time use cases that require security, scalability, and reliability.

To recap, the three types of data sets you can use to display real-time data in Power BI are push data sets, streaming data sets, and PubNub streaming data sets. With a push data set, you can create reports and visuals as you usually would with an imported data set, and then pin a visual to the dashboard. Streaming data sets and PubNub streaming data sets are not stored in Power BI memory and therefore do not allow you to create report visuals; to use those, you create a dashboard tile and connect the live streaming data set directly to the visual on the tile.
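To make the push mechanism concrete, here is a minimal sketch of sending a row to a streaming data set from Python. When you create a streaming data set with the API option in Power BI service, it generates a unique push URL with an embedded key; the URL, column names, and values below are placeholders for illustration, not a real endpoint.

```python
# Minimal sketch: pushing a row of data to a Power BI streaming data set
# via its push URL. PUSH_URL and the row schema are placeholders - use the
# URL and field names defined when the streaming data set was created.
import json
from datetime import datetime, timezone
import urllib.request

PUSH_URL = "https://api.powerbi.com/beta/<tenant>/datasets/<dataset-id>/rows?key=<key>"

rows = [{
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "sales": 1250.75,
    "country": "United Kingdom",
}]

req = urllib.request.Request(
    PUSH_URL,
    data=json.dumps(rows).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)  # 200 indicates the rows were accepted
```

A script like this, run on a schedule or from a sensor gateway, is one simple way to feed the real-time tiles described above.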
Choosing a streaming method depends on factors such as where the data set is hosted, what the analytical requirements are, and what infrastructure your organization has available. Live streaming brings many benefits. Live streaming updates enable users to access current data in real time, which is especially valuable for monitoring rapidly changing metrics or critical data points. Dashboards with live updates can include alert mechanisms that trigger notifications when specific conditions are met. Live data streaming allows organizations to respond quickly to market changes, operational disruptions, or emerging trends. Team communication is improved through real-time collaboration, and live data updates enable organizations to adjust forecasts and strategies based on the most recent data. Incorporating media elements like still images, motion graphics, and live streaming updates helps to transform your Power BI dashboard with dynamic, engaging, real-time visuals. These visuals not only enhance the user experience but also empower users to respond quickly and make decisions about changing business conditions.

A sales summary dashboard that you created has all the required sales data, but it fails to engage the audience. The addition of media elements can help. In this video, you'll learn how to add and format dashboard media elements to enhance the user experience. Power BI service allows you to incorporate media elements such as still images and motion graphics into your dashboard. Log in to a Power BI service account and open the sales summary dashboard from My workspace. We'll add three media elements to the dashboard: a text box, a still image, and a video clip. You need to add a tile to your dashboard to place an image, text box, or video, so select Add a tile from the Edit drop-down. The Add a tile dialog appears, where you can select the media type. To add a dashboard heading, select Text box and select Next. The Add a text box tile window appears on the right side of the screen, where the title and description can be added. Add text to the content section, such as: this dashboard displays the most up-to-date sales information for Adventure Works. Next, format the text to adjust its size, color, and indentation: change the font size to 16, make it bold, set the color to black, and center it. Tick the checkbox to display the title and subtitle of the tile. You can also set a custom link and add either an external link or a link to another Power BI dashboard or report from My workspace, and hyperlinks can also be added to the content section of the text box.

Next, let's add the Adventure Works logo to the dashboard. If you want to place your company logo or any other image on your dashboard, you need to publish the image online and create a URL link beginning with http or https. You must also make sure that security credentials are not required to access the image, and note that you cannot add SVG file types to a Power BI dashboard. From the Add a tile window, select Image and then Next. In the details section, to display a title above the image, tick the display the title and subtitle checkbox; when placing something like the Adventure Works logo, you don't need to enable the title and subtitle. Now enter the image URL. The Adventure Works logo is already published to Google Drive, and the URL was generated without any security credentials, so it is added here in the URL section. To hyperlink the tile, select Set custom link, then select External link and enter the URL of the external source. Select Apply, and the logo image is added to the dashboard, where you can rescale and reposition the tile.
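Before pointing an image tile at a URL, it can save time to verify the link meets these requirements. The following sketch, using a placeholder URL, checks that the image is reachable over https without credentials and is not served as an SVG.

```python
# Minimal sketch: sanity-checking that an image URL meets the Power BI
# dashboard tile requirements - publicly reachable over https without
# credentials, and not an SVG. The URL below is a placeholder.
import urllib.request

IMAGE_URL = "https://example.com/adventure-works-logo.png"  # placeholder

req = urllib.request.Request(IMAGE_URL, method="HEAD")
with urllib.request.urlopen(req) as resp:  # raises HTTPError if credentials are required
    content_type = resp.headers.get("Content-Type", "")

if "svg" in content_type.lower():
    print("SVG images cannot be used on a Power BI dashboard tile")
else:
    print(f"OK: {content_type} served with status {resp.status}")
```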
The last media element to add is a video; only YouTube and Vimeo links are supported. From the Add a tile window, select Video. A video information window appears, where you need to add information about the video. To display the title and subtitle of the video, tick the display the title and subtitle checkbox; we will leave the title and subtitle off for this demonstration. Add a video URL for a clip hosted on YouTube or Vimeo. To add a hyperlink, tick the Set custom link checkbox, select External link under functionality, and add the video URL. You can set the video link to open in a new browser tab, or add a link to an entire playlist; viewers can watch the video on the dashboard tile and also select the hyperlink to navigate to the playlist to watch further videos. To open the custom video link in the same tab, select the No option on the open custom link in a new tab setting. Select Apply, and a video tile is added to the dashboard, which you can resize and reposition as needed. Once you add a media tile to your dashboard, you can go back and make changes: edit the text box, change the video URL, and so on. To make changes, select the tile, hover the cursor over More options, indicated by three dots in the top right corner of the tile, and select Edit details; the Edit tile window opens, where you can make and apply changes to the media tile. You should now be familiar with adding media elements to a dashboard and formatting them to create an engaging and captivating user experience. With the help of images and videos, you can transform your dashboard into an immersive and informative tool.

You don't ever want your end users to have to type in a URL. They may not type it at all because it's too much effort, or worse still, they may type it incorrectly, fail to reach your site, and give up. A QR code is a better solution that avoids the end user having to type anything. QR is short for quick response: a QR code is a two-dimensional barcode that contains information in a machine-readable format. QR codes consist of black squares arranged on a white grid, typically in a square shape, and they can store different types of data, including text, URLs, contact information, phone numbers, and more. QR codes are a valuable addition to Power BI dashboards and reports because they enhance user interactivity and data accessibility. QR codes are useful in Power BI dashboards because codes can be generated for specific reports and dashboard tiles in Microsoft Power BI service; users can scan the QR code with their mobile devices to instantly access the associated content without any manual navigation. This feature is especially useful for on-the-go access to critical information. External web sources or documents can be linked to QR codes, providing users with additional context or supporting information related to dashboard data, and QR codes can be used to gather user feedback or conduct surveys directly from the dashboard. Since QR codes are mobile friendly, they align with the growing trend of mobile business intelligence: users can scan codes with their smartphones, making data consumption more convenient and accessible. The marketing department, for instance, can use QR codes to link to promotional materials or campaigns related to the data presented on the dashboard. You can create a QR code for a dashboard tile in Power BI service or for a Power BI report.

To better understand the use of QR codes, consider this scenario. To help manage
sales reporting and streamline order placement, Renee, the Adventure Works marketing manager, wants quick and easy access to key sales metrics. She also wants to share the measures with the sales team to track sales progress. Using Power BI service, you can fulfill her analytical needs by adding the power of a QR code: Renee can share the QR codes among her team members and any stakeholders to give them quick access to relevant data. Let's explore Power BI service and discover how to generate a QR code for a report or a dashboard tile. In Power BI service, you can generate QR codes either for an entire report that you published from Power BI Desktop or for an individual tile of a dashboard. You can create a QR code in the Power BI service for tiles in any dashboard, even in dashboards that you cannot edit. Let's check both processes.

Log in to Power BI service and open the sales summary dashboard. In the dashboard, there is a tile representing sales by salesperson, and you can generate a QR code for this visual element. Select More options from the upper right corner of the tile, represented by three dots, and select Open in focus mode from the drop-down; Power BI opens the visual in full screen. In focus mode, select More options from the upper right corner of the menu bar and choose Generate QR code from the drop-down. A dialog with the QR code appears. From here, you can scan the QR code or download it as an image, which can be shared by email or printed to display in an office or a public place where colleagues can access the information. If you want to print the QR code, make sure to print it at 100%, or actual size. As the data in the tile is updated, the sales manager can continue to monitor the sales performance. Select Exit focus mode to go back to the dashboard. Next, to generate a QR code for an entire Power BI report, open the Adventure Works Power BI report from My workspace, select File, and choose Generate QR code from the drop-down. A dialog with the QR code appears, and you can use the QR code as described previously. You can scan a QR code from the Power BI app on a phone to directly access the visualization. QR codes can be generated using the built-in capabilities of Microsoft Power BI, both for a dashboard tile and for an entire Power BI report. Strategic integration of QR codes and Power BI can streamline your workflow, leverage the power of mobile technologies, and enhance the user experience. Whether it is for efficient data access or engaging user interaction, QR codes are a valuable addition to your Power BI dashboards and reports.
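Power BI generates these QR codes for you, but if you ever need to batch-produce codes outside the service, say, for printed posters linking to several reports, a few lines with the open-source qrcode Python library (a third-party tool, not part of Power BI) can do it. The report URLs here are placeholders.

```python
# Minimal sketch: batch-generating QR codes for report URLs outside
# Power BI, using the third-party "qrcode" library
# (pip install qrcode[pil]). The URLs below are placeholders.
import qrcode

report_urls = {
    "sales_summary": "https://app.powerbi.com/groups/<workspace-id>/reports/<report-id>",
}

for name, url in report_urls.items():
    img = qrcode.make(url)      # returns a PIL image of the QR code
    img.save(f"{name}_qr.png")  # print at 100% (actual size) for reliable scanning
```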
Have you ever accidentally started watching a film halfway through? Remember how confused you felt, and how many questions you had to ask the other viewers before you finally understood the characters and the plot? If a Microsoft Power BI report or dashboard does not tell a cohesive story, then the employees and stakeholders who view it can feel a similar confusion. Transforming raw data into a meaningful narrative is a vital skill for the data analyst. Effective data storytelling serves as a bridge between the analysis of the data and the communication of the results; it combines the art of storytelling with the science of analytics to convey insights and findings in a compelling way. In a multinational organization like Adventure Works, where employees and stakeholders are spread across different regions, effective data storytelling is particularly important. In this video, you will explore the main components of data storytelling and discover the benefits of a good data story.

Data storytelling is the art of using data and visuals to build compelling narratives that convey a message, highlight trends, and engage a wide audience. At its core, it involves presenting data in a way that captures attention, facilitates understanding, and informs decision-making. You can achieve effective storytelling by combining three distinct components in a well-scripted way, which can lead report users to the insights produced by your analysis. Let's explore those components.

At the core of data storytelling is the data itself. This includes the raw information, facts, and statistics you have collected. Once the data has been processed and analyzed, you can identify the primary message you want to convey. A business analytics tool such as Power BI can help to provide context throughout your data story; in addition, the data provides the context the audience needs to interpret the analysis presented to them. Next, you design the journey the audience will take towards your primary message, identifying the start and end points and any key data points along the way. A narrative provides structure, context, and meaning to your data: a well-crafted narrative explains the significance of the data, outlines the key findings, and guides the audience through the story's progression. It might include explanations, interpretations, and implications based on the data insights. Data visualization is the representation of data using charts, graphs, maps, and other visual elements. By choosing appropriate and effective data visualizations, you allow viewers to quickly grasp information and identify trends, patterns, and insights that might be challenging to discern from raw data alone. In the context of data storytelling, visual elements educate your audience on your proposed theory; by creating a connection between the visual elements and your narrative, you can engage the audience and present both detailed and summarized data points. These three components work together to create a data-driven story that communicates information and insights effectively, and can even create an emotional response: the data provides evidence, substance, and context; visualizations aid comprehension; and the narrative ties everything together into a cohesive and compelling data story.

Effective data storytelling can have a positive impact on the stakeholders directly involved and on your organization as a whole. Some benefits of successful data storytelling include engagement: engaging stories capture and hold the audience's attention, which is vital for conveying critical messages. Next is enhanced understanding: good data storytelling simplifies complex information and highlights key points, making it accessible to a broader audience; the visualizations and narratives help people understand data-driven insights without requiring advanced technical knowledge. To capitalize on this, you need strong communication: data storytelling ensures that analysis is not limited to data analysts or data scientists, facilitating communication between different departments and disciplines within an organization and fostering collaboration. At the heart of data-driven stories is the purpose of solving problems: data-driven stories help identify problems and opportunities by revealing patterns and trends, and they encourage proactive problem solving through business analytics tools. Lastly, there is effective reporting: whether you are working in research, business, or academia, data storytelling enhances the effectiveness of reports and
presentations, transforming dry data into engaging narratives that captivate audience attention and involvement. Data storytelling is a transformative approach to data analysis and communication: you can leverage the power of narrative, data, and visualization to convey insights effectively. By mastering data storytelling, you can add value to your data and insights and offer value to your audience and industry.

When you think about data and the story it can tell, think of it as a traditional story like those you've read in books or watched in movies. It contains the same elements as traditional stories: a setting, characters, a situation of conflict, the overcoming of this conflict, and a resolution. As an analyst, you need to build your data story around these traditional storytelling methods. By the end of this video, you will have explored how elements of traditional storytelling can be translated to your data story in Microsoft Power BI.

Data contextualization establishes the environment and background against which the data-driven story unfolds. Your setting includes details about the data sources, the time frame, and the broader context in which the analysis takes place. For instance, if you are analyzing sales data for a specific year at Adventure Works, the setting would include details about the industry, the market conditions, and the company's current financial status. Next up are the characters of your data story: the individuals involved in the analytical process. This includes data analysts, data scientists, and other stakeholders such as business leaders, collaborators, and external partners. In a data story, each character plays a unique role. Data analysts are the main characters who explore and interpret the data; the main audience of your analysis, such as CEOs or directors, are supporting characters; and stakeholders are those impacted by the insights derived from the data. Like many great stories, conflict is central to your data story. In this context, the conflict is the business problem or data challenge: the central issue the data analyst aims to resolve. For example, your problem could be a decline in sales, a drop in customer satisfaction, or any other business issue determined through data analysis. The conflict sets the stage for your analysis and drives the story towards the resolution. Finally, there's the resolution of the data story: the result of the analysis, where insights are presented and actionable recommendations are made. The resolution should provide a clear path of action based on data-driven insights and findings. For example, if the conflict is declining sales, the resolution might involve strategies to boost sales, like targeting specific customer segments or launching a season-specific marketing campaign.

Let's explore how, as a Microsoft Power BI data analyst, you would implement these story elements to address a real-world challenge at Adventure Works. The story unfolds at Adventure Works headquarters, where the company's CEO, Jaime, is meeting with leadership to discuss the declining sales of Adventure Works products, which threaten the company's future. As a Power BI data analyst and report designer, you are the main character of this data story: you are determined to uncover insights and anomalies from the data that will lead the company out of its sales slump. A secondary character is the Adventure Works CEO, Jaime. Jaime is considered a visionary CEO, known for her adventurous spirit and belief in the company's potential, and she is
eager to make strategic decisions based on your analysis to move the company towards new heights. The challenge facing Adventure Works is a steady decline in sales over the past two years. The decline is causing concern among various stakeholders of the company, including Jaime. The executive leadership recognizes that the company needs a data-driven solution to identify the reason for the decline and devise strategies to reverse the trend. As the principal analyst, you explore the company's sales data from this two-year period, investigating customer demographics, seasonal trends, and product performance. Through effective data visualization, you uncover three significant insights. First, the sales of mountain bikes have outperformed other products in the same subcategory during the spring and summer months. Second, by delving into customer feedback, you discover a compelling pattern of customers consistently praising the durability and quality of Adventure Works mountain bikes. Lastly, you reveal a correlation between decreased marketing efforts and the months of declining sales. Based on your results, it becomes clear that the company's reputation for producing rugged and durable products is a hidden gem that can be capitalized on, and that a consistent and effective marketing campaign is the missing piece of the puzzle needed to increase sales. Now you have reached the resolution of this data story. After working on data visualization and exploration, you present your report to the executive meeting and the CEO. The committee decides to immediately address the identified issues based on your findings: the marketing team drafts a roadmap to focus their efforts on promoting the durability and quality of the mountain bikes, and the CEO, Jaime, directs the marketing director to increase the campaigns by targeting the competitive advantage Adventure Works has over the competition: reliability. With a data-driven strategy in place, Adventure Works can now embark on a new journey. As the company emphasizes the durability of its bikes and expands into new markets, Adventure Works reignites its essence of exploration, and sales begin to rise once more. You have crafted a data-driven story of transformation for Adventure Works. Through data analysis and storytelling, the company identifies the outliers, correlations, and patterns behind its problem; this insight helps the company rediscover its core strength and plan its future efforts accordingly. A collection of numbers and charts on a report canvas in Microsoft Power BI does not always tell a captivating story. However, with the science and art of data storytelling, you can turn data context into your story setting, turn stakeholders into characters, and frame a business problem as a conflict and resolution.

The data storytelling process is an integral part of presenting data analysis. It involves transforming data-driven insights into a narrative that is engaging and informative and that leads to action and the resolution of the conflict. In this video, you will delve into the full process of data storytelling and how you can relate it to the data analysis process. Let's start by outlining the eight steps you will cover. They are: goal; data collection and preparation; data analysis and exploration; data visualization; audience consideration; communication; feedback and iterations; and actions and decision-making. The data analysis process typically begins with defining a clear goal and a hypothesis about what you expect to uncover in your analysis. Analysts theorize about the relationship
between the variables in the analysis and what they expect to discover from the data. Connecting this to data storytelling, it is crucial to understand what message or insight you want to convey through the data; this end goal guides the entire storytelling process. Data is collected from a source, cleaned, transformed, and prepared for analysis. As you learned in previous lessons, this process might include merging data sets, removing errors and duplicates, handling missing values, and so on. In data storytelling, your work begins with prepared data, so it is essential to have a well-structured data set that aligns with the goal of your story; this ensures that the story is based on accurate and relevant information. The data analysis and exploration stage involves statistical analysis, hypothesis testing, and data exploration techniques to uncover patterns, trends, and relationships in the data. These findings are the heart of data storytelling: you need to select the most critical insights that align with your goal, such as key trends, correlations, anomalies, or any other significant findings. Visualization is a key component of data analysis, allowing you to explore and communicate data patterns effectively, and it plays a significant role in determining how receptive your audience is to complex information. To create effective visuals that support the goal of your story, you need to choose the appropriate chart type for your data. Effective visualization can help to reveal patterns, trends, and findings from your data; provide context; interpret results and articulate insights; streamline data so your audience can process the information; and improve audience engagement. You then need to create a dashboard using the data visualization tools in Power BI to present these findings. A data dashboard is used to manage information and for business intelligence: it provides a single canvas to organize and present valuable information in a logical sequence, and it is the single location where the audience can understand the connections between the data story and the hypothesis you made initially. Data storytelling places a strong emphasis on the audience: you need to tailor your story to your audience's background, their knowledge of the topic, and the business requirements, so the narrative is designed to resonate with them. Data storytelling also involves dynamic and engaging communication, including presentations, interactive reports, and dashboards. You need to collect feedback from team members and other stakeholders, which helps you refine your narrative, visuals, and overall storytelling approach to better meet your audience's needs. Finally, data storytelling is not just about providing information; it aims to inspire action. The goal you established at the start of the storytelling process should link back to the actions and decisions: compelling visuals and narrative aim to motivate stakeholders to make informed decisions backed by accurate data and the insights presented. Data storytelling is changing the way we consume information. Storytelling with data imparts a human dimension to often complex and cryptic data sets filled with numbers and statistics. Crafting a narrative plays a role in this process, but the ability to comprehend and convey information is crucial for constructing a compelling narrative that leads to effective decisions.

Congratulations on completing dashboard design and storytelling in Microsoft Power BI. You learned about using design principles to improve the visual impact of a dashboard and
tailoring the design to the users interacting with the dashboard. You also explored data storytelling and how it is a compelling way of transforming raw data into a data narrative that informs, engages, and inspires action. Let's recap what you learned and the key takeaways from each topic.

You began by learning about improving dashboard and report design in Microsoft Power BI. Dashboards are created in Power BI service and are based on underlying reports; a dashboard is typically a single canvas of information presenting the current state of the business. Reports are designed from a variety of data sources in Power BI Desktop and typically contain multiple pages, and they support the use of slicers and filters to enhance interactivity for users. Having established your knowledge of dashboards and reports, you then learned how to identify and focus on the end users. In an Adventure Works scenario, reports generated with data from various sources may contain information about the company's inventory or sales, the growth of the company in different regions, salesperson performance, or best-performing product categories. The purpose of your analysis is a dashboard that contains only the relevant information needed by your target audience. For example, if you want to design a dashboard for the finance department, you first need to identify the relevant data from the available data set, and then visualize and present the information necessary for the finance team, with all irrelevant data omitted. When creating a user-centric dashboard, your ability to prioritize and visualize relevant data is a major step in engaging your audience. You then learned about optimizing dashboards for mobile phones: how to optimize dashboards for cellular devices, how to allow for accessibility considerations, and how to create dashboards for real-time decision-making and an enhanced user experience. Keep in mind, though, that you need to be the owner of the dashboard to make any changes.

Having completed the lesson on improving dashboard design, you then learned about other dashboard elements. You learned about working with multiple dashboards, specifically how to duplicate a dashboard. Duplicating dashboards is especially important when you need to test the performance of a new dashboard with slight variations, or to distribute a slightly different dashboard to other departments or regions. Another tool you learned about is pinning a specific tile from one dashboard to another: you can pin the tile without navigating back to the original report, and the source of the tile does not change, meaning the pinned tile links back to the original source report where it was created. You then learned about incorporating media elements such as images, videos, animations, and text boxes into your visualization. You learned about the types of media that can positively impact a dashboard and its engagement with the audience, how to add and edit various media files on the dashboard from Power BI service, and what factors you must consider to ensure they work correctly; for example, an image file can only be displayed when it is published online with a URL that does not require security credentials. Lastly, you gained hands-on experience in creating QR codes for dashboard tiles and entire reports in Power BI service. A QR code is a feature that enables you and business users to access the most critical information on the go; it can also be used to collect feedback,
conduct surveys, and add external web links to your dashboard.

The last lesson in this module covered the principles of data storytelling. Data, visualization, and narrative are the three fundamental components of data storytelling, and effective data storytelling can have a positive impact on the overall analytical process. Benefits of data storytelling include engagement, enhanced understanding, communication, problem solving, and effective reporting. Next, you went through an example of data storytelling for Adventure Works. You learned about the principles of setting a stage, identifying the conflict, assigning roles to the various characters of the story, and resolving the conflict throughout the storytelling process. Then you learned about the storytelling process via eight steps: goal; data collection and preparation; data analysis and exploration; data visualization; audience consideration; communication; feedback and iterations; and actions and decision-making. In the context of data analysis, these steps cover the entire process, from data collection and cleaning to data-backed decision-making. In real-world scenarios, you will come across examples of poor storytelling that need to be improved before they are presented to your audience: choosing the wrong chart type, designing a random dashboard canvas, and inconsistent use of colors are all common mistakes you need to avoid while crafting a dashboard for your data story. You should now have a better understanding of how to optimize your dashboard visuals and how to incorporate data storytelling best practices to create effective dashboards and reports. The skills you've learned over these weeks will enable you to create data stories that capture user attention, enable users to recognize the goals of your data analysis, and generate effective solutions for your business.

Congratulations on completing this course on creative design in Microsoft Power BI. Microsoft Power BI is not just an analytical tool; it provides opportunities to bring creativity into your reports and designs to better engage dashboard and report users. Let's recap what you have learned over the last few weeks, reflecting on the key takeaways. You started your learning journey by exploring color theory and the key role of color in building reports. Color theory is the collection of design rules and guidelines used to communicate with users through color schemes, and you applied color theory and the role of color principles to improve a report for Adventure Works. Following on from this, you explored the appropriate positioning and scale of information when designing your Power BI reports. Strategic placement of visual elements such as charts and graphs in a logical sequence within reports increases their user impact; in addition, consistent scaling within various chart types, in accordance with the data type and structure, also ensures the effectiveness of the design. Next, you learned how to avoid chaos in your Power BI reports, maintaining cohesion and consistency in your report building, and you implemented the principles of chaos and cohesion practically to generate a cohesive design in Power BI. Throughout this course, you learned that the key to successful visualization is knowing your audience: you must tailor your Power BI presentations to meet the needs and preferences of those interacting with and using them. During this lesson, you learned how factors such as job role, user objectives, information needs, and cultural considerations
influence your audience's requirements. You then switched to another crucial factor that plays a pivotal role in report design: age differences in your audiences. Colors are significant when designing Power BI visualizations for various age groups, and appropriate formatting of a report, one that reflects the analytical message concisely while maintaining the design principles, is key in report design. Finally, an important aspect of working with data is data security: you learned about keeping data secure through data anonymization and how it can be achieved.

Now let's turn our attention to visual clarity in reports. Visual clarity, at both chart level and report level, affects the impact of your reports. In this lesson, you explored how to choose the correct chart type for the kind of data you are visualizing: the data type, the message, and the audience all play a role when selecting a chart type, while branding, visual hierarchy, and the business objective are some of the factors that impact visual clarity at report level. Next, you covered both theoretical and practical aspects of accessible report design in Microsoft Power BI; many built-in tools can be employed to accommodate people with visual impairments while retaining an engaging and compelling report design. Following this, you gained a thorough understanding of important chart types in Power BI, with hands-on experience designing a key performance indicator (KPI) chart, a dot plot chart, and a bubble chart. A KPI chart is significant because you can visualize current values against a predefined target value with a trend axis in place. A scatter plot chart, along with its variations, dot plots and bubble charts, is of special significance because of its ability to display multi-dimensional and high-density data in a single visual; with these charts, you can visualize categorical information on the chart's x-axis. Having delved into the topic of charts, you also explored advanced tools within Power BI Desktop to display complex data structures, like tree maps and heat maps, and the drill-through and drill-down functionalities of Power BI. To conclude the section on visual clarity in reports, you learned how to optimize your Power BI reports for mobile devices, joining the wave of dynamic mobile business intelligence.

Geographical data is a part of every business that requires special visual treatment, and Power BI has various map visuals for visualizing location-based information. You explored various map visuals through examples and hands-on experience: shape maps and choropleth maps, also called filled maps, are the two most common map visuals, while Azure Maps is a newer map visual within Power BI that offers more control and formatting options through map layers, meeting the growing need to combine visualizations with complex data structures. Sometimes Power BI core visuals are unable to fulfill your analytical requirements; this is where you can leverage custom visualizations. Microsoft AppSource provides a range of custom visuals that are developed by partners and tested by Microsoft for quality and accuracy, and you learned how to download, install, and format a custom visual in your core Power BI visualization pane. You also gained a thorough understanding of everything from installing Python to using it for your custom visualizations: Python, along with its rich and versatile visualization libraries such as matplotlib and seaborn, provides an entirely new avenue for dynamic and interactive visualization within Power BI.
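As a reminder of how that works in practice, here is a minimal sketch of the kind of script a Power BI Python visual runs. Inside a Python visual, Power BI exposes the fields you add to the visual as a pandas DataFrame named dataset; the column names and sample values below are assumptions for illustration, with a stand-in DataFrame so the script also runs outside Power BI.

```python
# Minimal sketch of a Power BI Python visual script. Power BI injects
# the selected fields as a pandas DataFrame named "dataset"; the columns
# ("Category", "Sales") and values here are illustrative placeholders.
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the DataFrame Power BI injects, so this runs standalone:
dataset = pd.DataFrame({"Category": ["Bikes", "Accessories", "Clothing"],
                        "Sales": [120000, 45000, 30000]})

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(dataset["Category"], dataset["Sales"], color="#118DFF")
ax.set_title("Sales by Category")
ax.set_ylabel("Sales")
plt.tight_layout()
plt.show()  # Power BI renders the current matplotlib figure as the visual
```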
Having learned about designing powerful report pages, you turned your attention towards dashboard design and storytelling. The dashboard is a distinct component of the Microsoft Power BI ecosystem. You began by exploring the differences between a Power BI dashboard and a report, as both offer several benefits and serve distinct purposes. A Power BI dashboard represents a snapshot of information displaying the current state of the business: a single canvas of visualization with key insights and KPIs. A report is designed for granular data analysis and might consist of multiple pages with drill-through and drill-down functionalities. You learned how to publish your report to Power BI service, create a dashboard, and optimize your dashboard for mobile phones; remember, you can only create and optimize dashboards in Power BI service. The reports you generated using data from various sources might contain information about inventory, sales regions, growth of the company, salesperson performance, and best- and worst-performing product categories; the product of your analysis is a dashboard that must contain only the relevant information needed by your target audience. In the real world, you need to work on multiple reports and dashboards simultaneously, and in this context you explored ways to streamline your workflow by duplicating a dashboard and pinning a visual element from one dashboard to another. Media elements are an integral component of a dashboard in the digital era: adding images, text boxes, and videos to your dashboard can have a significant impact on audience engagement, and you gained practical experience in integrating media elements such as images and videos into your dashboard. The fast-paced business landscape requires continual access to up-to-date data, and Power BI's live streaming capabilities allow you to integrate real-time data into your dashboard for faster, on-time decision-making. You learned that there are three types of live streaming data sets that Power BI service supports: push data sets, streaming data sets, and PubNub streaming data sets; only the push data set is physically stored in Power BI memory, allowing you to build reports on top of the data set. Effective data storytelling serves as a bridge between the analysis of the data and the communication of the results, combining the art of storytelling with the science of analytics to convey insights and findings in a compelling way. You gained a thorough understanding of the components of data storytelling, the narrative, the data used, and the visualization, and how these elements weave a data story. Next, you learned the elements and the process of data storytelling through the Adventure Works scenario: with the eight-step process, you crafted an engaging data story for Adventure Works. The eight steps of data storytelling are: goal; data collection and preparation; data analysis and exploration; data visualization; audience consideration; communication; feedback and iterations; and actions and decision-making. Lastly, you learned that effective data storytelling can have a positive impact on the overall analytical process, with benefits including engagement, enhanced understanding, communication, problem solving, and effective reporting. Now that you have finished your recap of this course, take a moment to reflect on your learnings before embarking on the final project, assessment, and course quiz; be sure to revisit your learnings, the additional resources, and previous quizzes, and best of luck as you complete your journey.

Congratulations on completing the creative design in Power BI course. Your hard work and dedication have paid off. You've made significant
progress on your data analysis learning journey, and you should now have a thorough understanding of the theory and practice of visualization and design, including the design principles of data display and visualization. This course provided you with a strong creative design foundation in Microsoft Power BI, which should allow you to modify your report designs to build cohesive reports and to produce audience-focused reports aimed at target audiences. You learned that to enhance the comprehension of data and improve the end-user experience, you can apply visual clarity, use multi-dimensional visualizations, insert map visualizations, and implement custom visualizations. Exploring the concepts of dashboard design and storytelling, you compared the design of a dashboard with the design of a report, examined the common steps involved in data storytelling, and discovered advanced dashboard features such as embedding media and QR codes. Your Power BI knowledge of visualization and design will help you to create better reports and dashboards. Well done for completing another step in your data analysis education. By passing all the courses in the program, you'll earn a Microsoft Power BI Analyst Professional Certificate from Coursera. This program is a great way to expand your understanding of data analysis and gain a qualification that will allow you to apply for entry-level jobs in the field, and it will help you prepare for the PL-300 exam. By passing the exam, you'll become a Microsoft Certified Power BI Data Analyst, which will also help you to start or expand a career in this role. This globally recognized certification is industry-endorsed evidence of your technical skills and knowledge. The exam measures your ability to prepare data for analysis, model data, visualize and analyze data, and deploy and maintain assets. To complete the exam, you should be familiar with Power Query and the process of writing expressions using Data Analysis Expressions, or DAX. You can visit the Microsoft certifications page at http://www.learn.microsoft.com/certifications to learn more about the certification and exam. This course has enhanced your knowledge and skills in the fundamentals of creative design in Microsoft Power BI, but what comes next? There's more to learn, so it's a good idea to register for the next course. Whether you're just starting out as a novice or you're a technical professional, completing this program demonstrates your knowledge of data modeling in Power BI. You've done a great job so far, and you should be proud of your progress. The experience you've gained will showcase your willingness to learn, your motivation, and your capability to potential employers. It's been a pleasure to embark on this journey of discovery with you; wishing you all the best as you continue to pursue your studies and develop your career.

Working with Power BI involves working with many different assets, like reports and dashboards, and managing all of these can be a difficult challenge. We've designed this course to equip you with the skills you need to deploy and maintain Power BI assets. During this course, you'll explore the role of Power BI in business, deploying assets in a Power BI workspace, and the role that security and monitoring play in safeguarding reports and dashboards in Power BI. Let's take a few minutes to preview what you'll learn. You'll begin with an introduction to the role of Power BI in business, with a focus on data flow. Data flow in business refers to the movement of information within an organization. This movement, or flow, occurs in the following
stages: collection, processing, analysis, and decision-making. Once gathered, the data is cleaned and standardized; it's then transformed, and data analysts use the refined data to generate insights. The data is analyzed using Power BI service. This software offers many advantages for analysts: it's accessible and scalable, and it offers collaboration tools and data backup and recovery features. The data analyst is the central figure in this process, possessing important skills and expertise in extracting valuable insights from data. An important skill that all data analysts must possess is understanding Structured Query Language, or SQL. Data analysts use SQL to interact with the SQL databases that store the data, and they can connect to a SQL database using import or DirectQuery modes: import mode loads data directly into Power BI, while DirectQuery mode connects Power BI directly to the source database. An analysis is presented in the form of a report, which can be static or dynamic. A dynamic report explores multiple areas of interest, with results presented in the form of visuals; these reports also facilitate what-if parameters that permit interactive adjustments to modify visualizations and generate insights into potential scenarios.

Next, you'll explore how to deploy assets in a workspace. A workspace is a specialized area in Power BI that holds important assets. There are two types of workspaces in Power BI: a personal workspace, which you can use to store your own content, and a shared workspace, where a team can collaborate on reports and dashboards. Workspace roles determine how individuals can interact with workspaces; the roles are viewer, contributor, member, and admin, and you can manage them using Power BI's manage access feature. Next, you'll learn how to monitor workspaces: by monitoring a workspace, you can measure its impact and make changes to increase its usefulness. You'll also explore the topic of data sets and gateways in Power BI. A data set must contain the latest available information: you can use a scheduled or incremental refresh to ensure accurate data, and you can promote and certify data sets to inform your team where to access the most current and reliable data. You'll also explore establishing a secure, reliable connection between your on-premises data and Power BI service using data gateways. There are three types of gateways in Power BI: the on-premises data gateway, the on-premises data gateway (personal mode), and the Azure Virtual Network, or VNet, data gateway. Which type of gateway you choose depends on the setup of your organization and its specific data management and security requirements. You'll also learn how Power BI deployment pipelines move content through the following life cycle stages: development, testing, and staging or production. Another useful feature for maintaining your workspace is the lineage view, which shows the data's journey from source to destination with all the connections in between, while impact analysis shows how changes to your data can affect different assets in your workspace.

Next, you'll explore the role that security and monitoring play in safeguarding reports and dashboards in Power BI. You'll first explore how to share information safely and identify sensitive data. Sensitive data is essential information that, if leaked, could damage the company's reputation, finances, or privacy. You can safeguard data using Power BI's authentication tools, use sharing links to control who you share information with, and use sharing permissions to determine
what they can do with the data sensitivity labels are also another useful method of safeguarding data access to data sets is governed by data permissions these ensure that only authorized individuals can access data you can configure permissions in PowerBI to safeguard your data you’ll also review rowle security for safeguarding data rowle security or RLS controls which individuals can view data based on predefined roles and rules there are two types of role security static RLS restricts users to specific data dynamic RLS uses data analysis expressions or DAX to adjust realtime data access based on user roles finally you’ll explore subscriptions and alerts in PowerBI you can subscribe to reports and dashboards a PowerBI subscription is an automated delivery system that provides daily data snapshots as emails or notifications you can use the subscriptions pane in PowerBI to manage your subscriptions as well as subscriptions PowerBI also offers data alerts these automatic customizable notifications inform users when specific conditions or thresholds have been met or exceeded you’ll also complete exercises in which you’ll put your new skills into practice by helping adventure works with PowerBI knowledge checks which will test your understanding of these topics and additional resources in which you’ll consult Microsoft learn articles to help you explore these topics in more detail in the final week of this course you’ll undertake a project and graded assessment in the project you’ll prepare configure design and develop a data model for a fictitious online company called Tailwind Traders finally you’ll have a chance to recap what you’ve learned and focus on areas you can improve upon throughout the course you’ll engage with videos designed to help you build a solid understanding of data modeling in PowerBI watch pause rewind and rewatch the videos until you are confident in your skills then consolidate your knowledge by consulting the course readings and measure your understanding of key topics by completing the different knowledge checks and quizzes this will set you on your way toward a career in data analytics and form part of your preparation to take the PL300 Microsoft PowerBI data analyst exam by the end of the course you’ll be equipped with the necessary skills to work effectively with data models in PowerBI good luck as you start this exciting learning journey data is integral to business success but how that data arrives at the business is also important in this video you’ll learn about the flow of data in business and how it can be managed to help generate insights lucas is helping Adventure Works to develop its latest business plan this requires collecting all available data about the business to ensure that Adventure Works plan is as informed as possible this involves exploring what kind of data adventure works can analyze how it makes its way to the business and the techniques the company can use to prepare it for analysis but first let’s begin with the question what is data flow data flow in business refers to the movement of information within an organization this movement occurs in stages the first stage is collection where data is gathered from various sources such as Excel spreadsheets and SQL databases the second stage is processing where data is cleansed and transformed to prepare it for meaningful analysis during the next stage analysis advanced analytics and algorithms are applied to the processed data to uncover trends patterns and insights that inform business 
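As a preview, the DAX filter expressions behind these two RLS types look roughly like the sketch below. The Region and UserEmail column names are hypothetical examples, not data from the course:

    -- Static RLS: the role always sees one fixed slice of the data
    [Region] = "Europe"

    -- Dynamic RLS: the filter resolves per signed-in user at query time;
    -- USERPRINCIPALNAME() returns the current user's sign-in address
    [UserEmail] = USERPRINCIPALNAME()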
Finally, you'll explore subscriptions and alerts in Power BI. You can subscribe to reports and dashboards: a Power BI subscription is an automated delivery system that provides daily data snapshots as emails or notifications, and you can use the subscriptions pane in Power BI to manage your subscriptions. As well as subscriptions, Power BI also offers data alerts, which are automatic, customizable notifications that inform users when specific conditions or thresholds have been met or exceeded. You'll also complete exercises in which you'll put your new skills into practice by helping Adventure Works with Power BI, knowledge checks that will test your understanding of these topics, and additional resources in which you'll consult Microsoft Learn articles to explore these topics in more detail. In the final week of this course, you'll undertake a project and graded assessment. In the project, you'll prepare, configure, design, and develop a data model for a fictitious online company called Tailwind Traders. Finally, you'll have a chance to recap what you've learned and focus on areas you can improve upon. Throughout the course, you'll engage with videos designed to help you build a solid understanding of data modeling in Power BI. Watch, pause, rewind, and rewatch the videos until you are confident in your skills, then consolidate your knowledge by consulting the course readings and measure your understanding of key topics by completing the different knowledge checks and quizzes. This will set you on your way toward a career in data analytics and form part of your preparation for the PL-300 Microsoft Power BI Data Analyst exam. By the end of the course, you'll be equipped with the necessary skills to work effectively with data models in Power BI. Good luck as you start this exciting learning journey.

Data is integral to business success, but how that data arrives at the business is also important. In this video, you'll learn about the flow of data in business and how it can be managed to help generate insights. Lucas is helping Adventure Works to develop its latest business plan. This requires collecting all available data about the business to ensure that the plan is as informed as possible, which involves exploring what kind of data Adventure Works can analyze, how it makes its way to the business, and the techniques the company can use to prepare it for analysis. But first, let's begin with the question: what is data flow? Data flow in business refers to the movement of information within an organization. This movement occurs in stages. The first stage is collection, where data is gathered from various sources such as Excel spreadsheets and SQL databases. The second stage is processing, where data is cleansed and transformed to prepare it for meaningful analysis. During the next stage, analysis, advanced analytics and algorithms are applied to the processed data to uncover the trends, patterns, and insights that inform business strategies. The last stage is decision-making. During this stage, informed decisions are made based on the analyzed data, guiding actions and adjustments within the business to optimize processes and achieve objectives. There are also processes within a business that govern aspects of data, like how it is acquired, stored, manipulated, and shared to support business operations and objectives.

Let's begin with the first stage, data collection. At Adventure Works, data is collected from a variety of valuable sources. Firstly, the Adventure Works e-commerce platform acts as a primary source, capturing customer transactions, web store browsing behavior, and purchase history. This platform integrates seamlessly with the customer relationship management (CRM) system, which compiles customer insights and interactions. The point-of-sale systems in Adventure Works physical stores provide real-time data on in-store purchases and customer foot traffic. The company collaborates with suppliers who share inventory and sales data, ensuring a streamlined supply chain. Social media platforms serve as another essential source, offering insights into customer sentiment, engagement, and trends.

Once the data is collected, it needs to be processed. This vast amount of data is managed through SQL databases that securely store the records in tables. You'll learn more about SQL later in this course; for now, you just need to know that the SQL database is the center of Adventure Works' data operations. It links all aspects of the business and provides an overview of business operations and customer interactions, empowering Adventure Works to make informed decisions for continued success. With such a vast amount of information flowing through the system, ensuring the accuracy and reliability of the data is paramount. The two main steps in this stage of the process are data cleansing and transformation. Let's explore these steps more closely. Data cleansing is the process of examining, correcting, and standardizing incoming data. This removes inconsistencies from the data, ensuring that it's reliable and accurate. For instance, Adventure Works can standardize customer addresses at the data source by ensuring all addresses are collected and stored in the same format using consistent data types, providing a consistent foundation for shipping and billing. This process not only refines the quality of the data but also establishes a solid foundation for subsequent analysis. Once cleansed, the data then flows through pipelines where transformation steps come into play. Data transformation involves working with aggregations, applying calculations, and enhancing data. For example, Adventure Works can aggregate sales figures from different locations for an overview of regional performance. These pipelines act as a bridge, carrying the data through a series of carefully designed transformations before it's ready for analysis and reporting. This stage of the process ensures that the insights derived from Adventure Works' data are precise and actionable, helping to drive informed decisions for the company's continued success.
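As a simple illustration of the kind of aggregation just described, here is a hedged SQL sketch. The table and column names (Sales, ProductRegion, OrderTotal) are illustrative assumptions, not confirmed names from the Adventure Works schema, and in practice the same step could equally be performed in Power Query:

    -- Sum sales per region for an overview of regional performance
    SELECT ProductRegion,
           SUM(OrderTotal) AS RegionalSales
    FROM Sales
    GROUP BY ProductRegion;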
After cleansing and transformation, the refined data is ready for analysis. The results of this analysis form the foundation for insightful reporting. For example, Adventure Works can generate sales insights from its regional sales data; these insights then form the basis of a report that offers a clear business snapshot. Now that Lucas has generated the required insights, he passes the report on to management. Once Adventure Works management obtain a copy of this report, they can use its insights to make decisions about the business. The report indicates low sales of the company's new mountain bike model; based on this insight, Adventure Works might try a new marketing campaign for this model to help improve its sales.

Beyond Adventure Works, various industries harness data in unique ways to drive their operations. For example, the public transportation sector uses data from its routes, travel times, and ticket sales to optimize schedules, allocate resources efficiently, and enhance the overall commuting experience for passengers. Other sectors that make use of data include food companies. Those dealing with perishable goods are affected by weather and temperature, so they must collect and analyze meteorological data. Cold storage facilities rely on real-time temperature monitoring to preserve the quality and safety of products, and they might also increase production in anticipation of a heat wave. These examples illustrate how different sectors leverage data to make informed decisions, enhancing their efficiency and competitiveness in the market. You should now be familiar with the flow of data within a business and how this data is used to generate insights and make decisions. An effective data flow is essential for generating the insights that inform decision-making in today's data-driven world.

The ongoing management of data is crucial for businesses to make informed decisions, enhance efficiency, and gain a competitive edge. In this video, you'll learn how a company like Adventure Works can leverage its data assets using Power BI service to become a data-driven enterprise, and why the continued maintenance of these assets matters. Adventure Works has set a goal of becoming a data-driven enterprise by the end of the year. To achieve this goal, the company must make the most of its data assets, so its data analysts have configured custom reports and dashboards in Power BI to monitor inventory levels, track customer preferences, analyze market trends, and assess product performance. Let's explore how the company can leverage and manage these assets to drive strategic decision-making. In a data-driven enterprise like Adventure Works, data isn't just information: it guides strategic choices, informs resource allocation, and maps the pathway for future growth. During this transition to a data-driven mindset, Power BI service is used to deploy and maintain data assets. As you've previously learned, Power BI service is a cloud-based platform used for data analysis. It's a centralized hub where teams can collectively work on reports and dashboards, ensuring that everyone has access to the most up-to-date information. This keeps insights current and relevant, and it empowers Adventure Works to make informed decisions swiftly and accurately. Unlike its desktop counterpart, the service offers the following advantages. It's accessible for remote teams, offering flexibility and collaboration across geographic distances. Adventure Works can use the service to scale up or down to accommodate changing business needs; teams can easily add or reduce resources without extensive hardware and infrastructure investments. Power BI service also offers real-time collaboration features for documents and projects, improving productivity and teamwork. And it provides data backup and recovery, reducing the risk of data loss due to hardware failures or other unforeseen events. Now that you're more familiar with its advantages, let's explore how the Adventure Works data analysis team makes use of Power BI service.
As you discovered earlier, Adventure Works can deploy Power BI service assets like reports and dashboards to monitor inventory levels, track customer preferences, analyze market trends, and assess product performance, all in real time. Let's find out more about the insights Power BI service can generate in these areas. Power BI service can help to monitor inventory data: data analysts can track inventory turnover rate, order fulfillment accuracy, shipping and delivery times, and return rates. Adventure Works can track existing and emerging customer preferences, information it can use to differentiate its product offerings and stay ahead of competitors. Adventure Works can also use data to analyze market trends. The company can identify opportunities for new product development or enhancements to existing products, ensuring it remains relevant, and it can study trends in pricing to adjust costs, stay competitive, and maximize profits. Other areas of the business that Adventure Works can monitor include product performance. Power BI service can deliver information on the performance of individual product lines, including the best- and lowest-selling products, and data from online product engagement and product recommendation effectiveness can guide decisions for the purchasing and marketing teams. This ensures Adventure Works maintains a competitive advantage in a dynamic market.

It's not just retailers like Adventure Works who use Power BI service. In today's data-driven landscape, businesses and organizations across various industries rely on the continuous maintenance of data assets to guide decision-making. For instance, in the healthcare sector, accurate and up-to-date patient records are critical for providing quality care; a hospital's ability to access a patient's medical history in real time can be a matter of life and death. In the finance industry, investment firms require accurate data on stock prices and market trends to make timely investment decisions. And as the Adventure Works examples demonstrated, understanding customer behavior and preferences is vital for online retailers to tailor their offerings and marketing strategies effectively. As these examples show, data assets help to inform every sector of enterprise. You should now be familiar with how a company like Adventure Works can leverage its data assets using Power BI service to become a data-driven enterprise, and with the importance of the continued maintenance of these assets. Whether it's optimizing supply chains, fine-tuning logistics, or tailoring marketing strategies, the need for continuously maintained data assets is universal. Deploying and maintaining assets is not just an advantage but a prerequisite for success in today's business world.

Data analysis is essential, and data analysts are central players in the data analysis process, extracting invaluable insights from raw information. In this video, you'll explore the pivotal role of a data analyst and the profound impact they have on organizational success. Adventure Works relies heavily on data analysts to help make sense of its data and generate insights that drive business success, and there are certain skills and traits a company like Adventure Works looks for in its analysts. Let's find out more about the skill sets Adventure Works values and the contribution that its analysts make to the company. A data analyst is expected to possess specialized skills in statistics, math, and programming. They use advanced tools to analyze big data and find the hidden trends and anomalies that others might miss.
A data analyst creates reports and visualizations that combine complex information into simplified insights. These reports and summaries help decision makers to navigate the business landscape. Analysts spot opportunities for improvement, automation, and cost reduction, helping to make processes more efficient and boost the organization's competitiveness. Data analysts also enforce data protection rules: they detect and fix weaknesses, safeguarding organizations from harmful breaches and data leaks.

Now that you're familiar with the skills a data analyst must possess, let's examine some examples of where a data analyst can offer invaluable insights and solutions. A data analyst at Adventure Works can employ advanced analytics to segment customers based on behavior, demographics, and preferences. For instance, a data analyst might identify a segment of Adventure Works customers who prefer outdoor gear; by tailoring marketing messages and promotions to this group, the company can increase sales of outdoor-related products. This enables targeted marketing for higher sales conversion and enhanced customer loyalty. Data analysts can also use past sales data, trends, and seasonality to forecast product demand and optimize stock accordingly. A data analyst may discover that certain products have a seasonal demand spike; by adjusting inventory levels and promotions accordingly, Adventure Works can prevent overstocking and reduce carrying costs. This leads to higher profitability because Adventure Works avoids the risk of excess stock. Data analysts can also generate insights into sales by studying the purchasing patterns of customers to discover which products sell together most effectively. Through market basket analysis, a data analyst might find that customers who purchase hiking boots often also buy outdoor gear. Adventure Works can use this insight to create bundled promotions that encourage customers to purchase these items together. These insights help Adventure Works to meet the needs of its customers and increase its sales. In an online industry, stopping fraud is vital. Data analysts use real-time checks to spot suspicious transactions, keeping Adventure Works safe financially and protecting its reputation. A data analyst may set up alerts for transactions that deviate significantly from a customer's typical behavior; for instance, if a customer suddenly makes a high-value purchase after a history of smaller transactions, it could trigger a fraud alert. You should now be familiar with the pivotal role of a data analyst and the profound impact they have on organizational success. Data analysts are essential for helping businesses generate insights and progress. As the examples you've just explored demonstrate, data analysts help businesses make informed decisions, improve operations, drive innovation, and reduce risks.

SQL, or Structured Query Language, is a powerful language with many advantages for data analysts working with large enterprise databases. In this video, you'll learn about the importance of SQL, how it helps with data storage and queries, and how it integrates with Microsoft Power BI. Adventure Works has just hired some new trainee data analysts. It needs these analysts to generate insights from its SQL databases, but several of them are unfamiliar with this tool. Let's explore the answers to some of their questions about SQL to discover how it helps enterprises like Adventure Works. The first question these new trainees have is: what's a SQL database? At its core, a SQL database is a system for organizing and storing data in a structured format.
When we refer to a structured format, we mean that data is organized so it can be located quickly when required for analysis. A SQL database excels at handling structured data: its framework is built of tables, rows, and columns. This means that all data is stored in specific categories, and analysts can find the data they need with minimal effort. For example, if Adventure Works needs to retrieve bicycle data for a report, it can create a SQL query that accesses the product category column in the products table, where a list of all bicycle types in stock can be found. As this example shows, a strong business case can be made for SQL databases through their structured and reliable framework. Another advantage of SQL databases is that they facilitate complex queries for quickly extracting specific subsets of data, which is important for generating reports and insights. Data sets are also constantly expanding, which requires scalability, and a larger data set requires more sophisticated methods of data retrieval: you can retrieve data from large databases using techniques like partitioning and indexing. Finally, SQL databases can be accessed by multiple users or applications at the same time. An entire team of Adventure Works data analysts can access the SQL database simultaneously without causing a conflict or slowdown, an important advantage for a business.

As we've discovered, one main advantage of a SQL database is its storage capability, so the next question the new data analysts have is: how does this storage work? SQL databases store data using a method called normalization, which you might be familiar with from previous courses. Normalization divides data into multiple related tables, each with a specific purpose; it's like tidying a room by putting similar things in separate boxes. As you discovered earlier, SQL databases also use indexing. Indexing is the technique of assigning a unique number to each row in a table, which acts like a table of contents in a book, making it easier to locate information. As a data analyst, it's also important to understand that the real power of SQL isn't just its storage capability. The ultimate benefit of a SQL database is its ability to return information through SQL queries. SQL queries are statements written in SQL that instruct the database to perform a specific operation, like returning all records in a table or just a specific subset, so you must study the syntax and structure of SQL statements carefully to extract the necessary insights as efficiently as possible. For example, the Adventure Works data analysis team has created a SQL query that returns all bike data from the products table; however, they can also create a more complex SQL query that returns data only on bikes that cost $1,000 or more.
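For instance, the two queries just described might look like the sketches below. The table and column names (Products, ProductCategory, ListPrice) are plausible guesses at the schema rather than names confirmed by the course:

    -- All bike data from the products table
    SELECT * FROM Products
    WHERE ProductCategory = 'Bikes';

    -- A more specific query: only bikes that cost $1,000 or more
    SELECT * FROM Products
    WHERE ProductCategory = 'Bikes' AND ListPrice >= 1000;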
of SQL and you’ll be an asset to any enterprise powerbi is a powerful tool for extracting
Power BI is a powerful tool for extracting data, and it can also be integrated with a SQL (Structured Query Language) database to generate even greater insights into your data. In this video, you'll explore the structure of a SQL database, the steps to connect it to Power BI, and some examples of connection modes. Adventure Works has recently migrated its data sets to a SQL database, and the company has tasked Lucas with connecting this database to Power BI so that it can begin to analyze its data. Let's explore the basics of integrating Power BI and SQL databases, then follow along with Lucas as he establishes the connection.

To begin, here's a quick overview of SQL Server. SQL Server is a relational database management system (RDBMS) developed by Microsoft. It provides a secure and scalable platform for storing, managing, and retrieving data. SQL servers organize data into structures called databases, where data is stored in tables with rows and columns, making it easy to retrieve and work with specific data sets. Users interact with SQL databases by creating SQL queries that send instructions to the database. So your next question might be: how do I connect to a SQL database? Establishing a connection between Power BI and a SQL database requires three pieces of information: the server name, the database name, and your credentials. Here's how these pieces of information work together to provide access. The server name identifies the location of the database server, the gateway to your data. The database name identifies the database within the server you intend to access. And the credentials, typically a username and password, grant access permission to the server. These details provide a secure and efficient foundation for linking your analytical tools.

There are two primary modes available for connecting your data in Power BI: import mode and DirectQuery. In import mode, data is loaded directly into Power BI for fast and responsive visualizations; however, the data is static, so it might need to be refreshed to reflect real-time updates. DirectQuery mode, on the other hand, connects Power BI directly to the source database. This enables real-time analysis but can lead to slower performance due to continuous queries to the database. Which one you choose depends on your business needs; when making your decision, balance factors like data size, update frequency, and performance requirements. To communicate with this infrastructure, you need to construct queries written in SQL. For example, Lucas can use a basic SQL SELECT query to retrieve sales data from the database. The SELECT command initiates the retrieval of data from the database; in other words, you're instructing the database to select specific data. In this query, the asterisk signifies that we want to retrieve all columns from the specified table. The FROM clause specifies the table from which we want to retrieve the data, the source of the information we're interested in; in this instance, we need the rows and columns from the Adventure Works sales table. Finally, the WHERE clause adds a condition that filters the resulting table rows based on specified criteria. In this query, a product category of Road Bikes indicates that we're interested in records in the product category column that match the Road Bikes value.
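Putting those three clauses together, Lucas's query would look something like the sketch below. The exact table and column names (Sales, ProductCategory) are assumptions based on the description rather than names confirmed by the course:

    -- Retrieve every column, but only the road bike rows
    SELECT *
    FROM Sales
    WHERE ProductCategory = 'Road Bikes';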
Now that you're up to speed with the basics, let's work with Lucas to establish a connection between Power BI and the Adventure Works SQL database. Select Get Data from the Home ribbon tab to import data from any Power BI source. A pop-up window with all available data source connectors appears. Type SQL in the search bar to locate the SQL Server database connector, identify the required connector, and select Connect. This opens the SQL Server database window, where you must input the database details. The SQL server is the IP address of the server containing the database, or its identifying name; in this instance, the Adventure Works server name is FG7N373 and the database name is MSDB. Next, ensure that Import is selected as the data connectivity mode to load the table into the Power BI file's memory. These settings should suffice for connecting to all database tables. The next step is to create a SQL query to retrieve the required data set. Expand the advanced options, then input a SQL SELECT query to retrieve all road bike data from the product category column in the Adventure Works sales table. Finally, press OK. Next, you must provide credentials to connect to the required database and extract the sales data. Select the Database tab and input your database credentials, make sure the correct database level is selected, then select Connect to establish a connection between Power BI and the database table. A warning appears stating that an encrypted connection to the database is missing. We can ignore the warning for this example scenario and select OK; however, it's good practice to use an encrypted connection in a real-world Power BI environment. A preview of the data set appears on screen. You can select Transform Data to interact with the data set in Power Query Editor, or select Load to connect to it directly. In this instance, we'll select Load. Once the required rows are loaded, navigate to data view. If your loaded data is present as a table, this confirms that the connection has been established successfully. You've now explored the structure of a SQL database, the steps to connect it to Power BI, and some examples of connection modes. By integrating Power BI and SQL, you can greatly enhance the power of your data analysis.

Power BI generates static reports that offer a snapshot of data at a fixed point in time, but it can also generate dynamic reports, which adapt and respond to your business needs. In this video, you'll explore the basics of dynamic reports, an overview of Power BI parameters, and how to generate dynamic reports using parameters. Over at Adventure Works, Lucas is preparing sales reports. Instead of generating a new static report for each aspect of the business, he wants to create one report that can serve several different purposes, and dynamic reports are the perfect solution. Up to this point, you should have experience working with static reports, which offer fixed snapshots of data, like total sales revenue over January. Dynamic reports, however, can be adapted and transformed based on user specifications. Dynamic reports can be modified using parameters to change how they display information, and as the data analyst, you can decide which parameters inform the report. This means that its content is always aligned with your business needs. You can also adapt your parameters for different scenarios, or switch between data sources in real time. With this alignment, an organization gains more value from one single report, which saves time, optimizes resources, and leads to more efficient and effective reporting practices. As you've just learned, dynamic reports are created using parameters. In the context of Power BI, parameters are dynamic variables that influence the data displayed in the report. Parameters are like the dials and switches on a control panel: if you update your parameters, your report updates accordingly.
There are many different kinds of parameters, including numerical values, text inputs, and boolean (true/false) settings. Parameters also accept default values or free-form text, so there are many options for customizing them. For example, Lucas is developing a sales report that must analyze monthly sales data in North America. He can set up a parameter to analyze sales on a continual month-by-month basis or input a custom date range. He can also set parameters to filter data by region so that the report focuses only on North America, or he could set up a custom region name to focus on a specific area of interest, like monthly sales data for states on the West Coast. Power BI parameters are the cornerstone of dynamic reporting, empowering users like Lucas to customize their data views.

Let's explore a few more examples of how parameters can be used with dynamic reports. You can use parameters to explore high levels of data granularity with dynamic data selection and filtering; for example, as you've just discovered, Lucas can analyze specific areas of interest in his data using custom ranges, helping to deliver greater insights for Adventure Works. Parameters also enable dynamic data source connections: with parameters, you can switch between data sources like databases, files, or application programming interfaces (APIs), which is useful for dealing with evolving data environments or multiple data repositories. Parameters can also be used to analyze existing business situations or create new what-if scenarios. For example, Lucas can create financial forecasts by inputting growth rates, expense projections, and revenue assumptions as his parameters, generating a range of potential revenue outcomes for Adventure Works. Leveraging Power BI parameters through scenarios helps Adventure Works explore multiple outcomes, supporting data-driven business decisions. You should now be familiar with the basics of dynamic reports, Power BI parameters, and how to generate dynamic reports using parameters. By using dynamic reports, you can align your data more closely with the needs of your business and gain maximum value from one single report.

Dynamic reports are an interactive, user-friendly way of viewing and analyzing data, and they offer much more powerful insights than traditional static reports. In this video, you'll learn how to create a dynamic report using a SQL database and Power BI parameters. Lucas must generate a dynamic report for Adventure Works that analyzes the company's sales data across multiple regions. The report must extract data from a sales table in a SQL database, and it needs to use parameters to alter the displayed region according to user selections. The first step is to create a connection. Select Get Data from the Home ribbon tab, then select SQL Server from the list of options. The SQL Server database dialog box appears on screen. Input the server name in the server field and the database name in the database field, and ensure that the Import option is checked for data connectivity mode (import mode should be selected by default). Next, you need to retrieve and load the data for your report. Expand the advanced options and input a SQL SELECT query that retrieves all table columns from the Adventure Works sales table containing values for sales in Asia. Select OK to execute the query, input your database username and password credentials to access the SQL server, select Connect, then select OK on the encryption warning. Finally, select Load to load the database table into your report.
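Assuming the same hypothetical Sales table and a ProductRegion column as in the earlier sketch, the query entered in the advanced options would look roughly like this:

    -- Retrieve all columns for sales made in Asia
    SELECT *
    FROM Sales
    WHERE ProductRegion = 'Asia';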
The table shows data from sales in Asia, as specified in the WHERE clause of the SQL SELECT query. The next step is to format the table and visualize the data. The table's default name is Query1; rename the table to Sales. Now you need to visualize the sales as a table graph. Select the table visualization, then expand the columns of the Sales table and select the product category, product region, and order total columns. Finally, increase the size of the text to make it more visible: navigate to the Format pane of the visualization, increase the table's values to 15-point font size and the column headers to 16-point font size, then resize the table to fit the values and center it on screen.

Next, you need to create a parameter to make the connection dynamic. Navigate to the Transform data option on the Home ribbon to access Power Query Editor. Once in Power Query, you can view the data set table you've connected to, and you can now create a new parameter. To access the dialog box for creating new parameters, go to the Home tab, select Manage Parameters, then New Parameter. These actions open the Manage Parameters window, where you can configure your parameter as follows: name it Region Parameter, select Text as the data type, and ignore Suggested Values, as it's not required for this project. Finally, add 'Asia' (with single quotes) as the current value, then select OK to create the parameter. Now you need to apply your parameter by adjusting your SQL query. Right-click on your Sales query in the query editor, then select Advanced Editor. Your code appears on screen in the Advanced Editor dialog box. Replace 'Asia' in your code with an ampersand symbol (&) followed by the Region Parameter. Check the bottom left-hand corner of the dialog box to ensure no syntax errors have been detected, then select Done. You need to grant permission for this query to run: select Edit Permission and then Run. Select Close & Apply to return to report view.
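After the edit, the M code in the Advanced Editor would look roughly like the sketch below. The server and database names are carried over from the earlier walkthrough (FG7N373, MSDB), the Sales and ProductRegion identifiers remain assumptions, and the parameter is written as RegionParameter for readability (a parameter named with a space would appear as #"Region Parameter" in M). Note how the literal 'Asia' has been replaced by a concatenation with the parameter:

    let
        // The parameter supplies the quoted region text, e.g. 'Asia'
        Source = Sql.Database("FG7N373", "MSDB",
            [Query = "SELECT * FROM Sales WHERE ProductRegion = " & RegionParameter])
    in
        Source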
Now you need to test that the report is dynamic. Select Transform data from the Home ribbon and select Edit Parameters. Change Asia to Europe, select OK, then select Apply changes to refresh the data set, and select Run to enact your changes. The data set modifies itself to display sales in Europe. Adventure Works now has a dynamic report that it can use to explore its sales data across multiple regions, and you should now be familiar with the process steps for creating a dynamic report using a SQL database and Power BI parameters.

A dynamic report typically offers insight into one area of interest at a time; however, with a multi-value dynamic report, you can explore several areas of interest at once. In this video, you'll learn how to create a multi-value dynamic report in Power BI. Adventure Works needs to transform its current dynamic sales report into a multi-value dynamic report that offers insight into its sales data across multiple regions simultaneously. Let's create this report for the company using Power BI. The first step is to create a spreadsheet containing the required values to be passed to the SQL query. It must use single quotes for text values; however, to include a single quote at the beginning of your text in Excel, you need to use double quotes, which indicates to Excel that you're typing single-quoted text. Access the Transform data option to open Power Query Editor, then select and import the Product Region Selection Excel spreadsheet. Check the box for Sheet1 and select OK to add it to the editor. Once the sheet is loaded in the editor, rename Column1 to Region Selection. Now you need to create a function to match the database table rows with the user selection in the spreadsheet. Select the Sales query from the queries menu, right-click on the query, and select Create Function from the list of options. In the Create Function window, type the function name Get Sales Data From Regions, then select OK. Power Query creates a folder that contains all parts of the function.
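The function that Power Query generates from the parameterized query looks roughly like this sketch. Identifiers follow the walkthrough where given (GetSalesDataFromRegions stands in for the Get Sales Data From Regions name), and the schema names are the same assumptions as before:

    // A query function: takes one region value (e.g. 'Asia')
    // and returns the matching sales rows from the database
    (RegionParameter as text) =>
    let
        Source = Sql.Database("FG7N373", "MSDB",
            [Query = "SELECT * FROM Sales WHERE ProductRegion = " & RegionParameter])
    in
        Source

Invoking this custom function, as described next, simply calls it once per row of the Region Selection column, which is why each spreadsheet value produces its own set of matching sales rows.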
The next step is to invoke your custom function. This ensures that the database table records match the spreadsheet column values; in other words, you import only the relevant data. Select the Other Queries folder and select Sheet1, then access the Add Column ribbon tab and select Invoke Custom Function. This action opens the Invoke Custom Function window. Name the new column Invoked Function Data, select the Get Sales Data From Regions function query, select Region Selection as your region parameter, then select OK. Your data set shows a new Invoked Function Data column containing the required sales regions. You can use the double-arrow button at the top of the new column to expand the data. Avoid using the original column name as a prefix, as this would make the column names too long; it should only be used if combining multiple columns with the same name in the same function might cause confusion. Select OK to load the data. This loads the database table columns and rows whose product region matches the spreadsheet selections. Double-click on Sheet1 in the queries pane and rename it to Sales Function, then select Close & Apply to return to the report view. Access the visualization pane and select the table icon, then select the following columns from the data pane: product category, product region, and order total. As you select these columns, the table visualization is populated with the data from each one. Next, in the Format pane of the visualization, increase the font size of the table's values to 15 point and the column headers to 16 point for greater visibility, then resize the table. Return to the spreadsheet, change Asia to Europe, and save the document. Return to Power BI and select the Refresh option on the Home tab. The new multi-value region selection from the spreadsheet is shown in the database table results. Your multi-value dynamic report is now ready to present to Adventure Works. This report lets the company select and analyze sales from multiple regions for greater insight, and you should now be familiar with the process of creating a multi-value dynamic report in Power BI.

Dynamic reports show information based on your current data, but with what-if parameters you can dynamically alter reports to observe hypothetical outcomes or scenarios. In this video, you'll explore the concepts of what-if parameters and scenario-based analysis, and you'll review the process steps for applying these concepts to your reports. Adventure Works has raised its monthly order amount target, and Lucas, its data analyst, must determine the target needed to meet next month's sales goals. Lucas can use what-if parameters to forecast scenarios and identify the required sales target. Before we explore how Lucas can carry out this task, let's review the basics of what-if parameters. A what-if parameter is a custom-defined variable that enables interactive adjustments within a Power BI report. You can adjust your parameters to change your visualizations and generate insights into future scenarios. The main purpose of what-if parameters is to enable dynamic scenario analysis: users can explore various hypothetical scenarios without the need for complex calculations or creating multiple versions of the same report. Instead, a single report can be transformed into a versatile tool capable of adapting to various business contexts. For example, Adventure Works can use what-if parameters to create sales forecasts. The company's data analysts can tweak variables like sales growth rates, seasonality factors, or marketing budgets, then instantly observe how these adjustments affect projected revenue, sales, and revenue targets. This level of interaction empowers users to make informed decisions based on real-time insights. While what-if parameters offer tremendous flexibility, it's important to recognize when and where they're most effective: in scenarios with many variables that can significantly affect outcomes, and where it's important to assess those outcomes quickly. What-if parameters can be applied across a range of industries, organizations, and use cases. For financial analysts, they facilitate stress testing of financial models and evaluation of risk scenarios. Marketing professionals can use them to optimize advertising budgets and forecast campaign outcomes. Supply chain managers can simulate various demand scenarios to fine-tune inventory levels. Once you have the available data, the possibilities of what-if parameters are nearly endless.

Now that you're more familiar with what-if parameters, let's help Lucas perform a scenario-based analysis for Adventure Works. Lucas must create a what-if parameter to forecast the sales required in February to reach the new monthly target of 70,000, using the data from the sales report to help him. First, navigate to the Modeling tab, select New Parameter, and choose Numeric range from the drop-down menu. The parameters dialog box appears on screen; input the details as follows. Name the new parameter Forecasted Increase and assign it a decimal data type. Input 1 as the minimum amount and 2 as the maximum, then input 0.1 as the increment, which creates 10 steps between 1 and 2, and set the default to 1. Finally, check Add slicer to this page and select Create. A slicer is added to the page. Expand its settings on the visualization pane, select Vertical list as the style, and turn on Single select so that a value is always selected, then resize the visual to fit the left side of the report. Navigate to the data pane and expand the Forecasted Increase table to identify what has been created by the what-if parameter. First, there is the column currently being used in the slicer, which contains a list of numbers based on the parameter settings; this was created by the GENERATESERIES function. Second, there is a measure that contains the option selected in the slicer, captured by the SELECTEDVALUE function. You also need a third measure to handle the desired calculation. To create it, select New Measure from the ribbon and name it Forecast Amount; its formula multiplies the sum of the order total column by the Forecasted Increase Value measure. Now you need to add this measure to the analysis: navigate to the column chart, access the Build visual settings, and add the measure to the y-axis of the visualization. Since the parameter is set to 1, the forecasted result of the calculation is exactly the same number as the current total. You can cycle through the options to view more scenarios; the what-if parameter dynamically modifies the visualization. One forecast shows that a multiplier of 1.6 applied to the total amount is enough to reach the monthly target.
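Expressed in DAX, the three pieces created in this walkthrough look roughly like the sketch below. The first expression is the calculated table Power BI generates automatically when the parameter is created; the Sales table and Order Total column names are assumptions based on the report:

    -- Calculated table created by the what-if parameter (10 steps from 1 to 2)
    Forecasted Increase = GENERATESERIES(1, 2, 0.1)

    -- Measure capturing the slicer selection, defaulting to 1
    Forecasted Increase Value =
        SELECTEDVALUE('Forecasted Increase'[Forecasted Increase], 1)

    -- Measure that scales total sales by the selected multiplier
    Forecast Amount =
        SUM(Sales[Order Total]) * [Forecasted Increase Value]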
You should now understand the concepts of what-if parameters and scenario-based analysis, and the process steps for applying these concepts to your reports. What-if parameters in Power BI offer a transformative approach to data analysis: by providing the ability to dynamically adjust variables and instantly visualize the impact, they empower users to make more informed decisions.

Data scientists and data analysts at big tech companies already use SQL and other languages for advanced data analysis. This gives leadership valuable insights into overall productivity and where the weak spots may be, leading to evidence-based strategic decisions. They can create comprehensive customer profiles to better understand their customers' needs, leading to targeted marketing initiatives, and businesses can examine supply chain analytics to figure out where production delays or bottlenecks happen. But what impact can data science have on a larger, global scale? Some cities are already using data analytics to inform decisions about urban planning, aiming for a better quality of life for their inhabitants and, ultimately, recognition as a smart city: Singapore, Oslo, New York, Paris, and the list goes on. Imagine a city planned entirely based on data analysis, a city that takes the innovations all those cities already use and incorporates them into one place. What would that look like? Welcome to Datatopia.

During its inception, urban planners and data scientists worked together to develop an exact ratio of residents to schools, shops, restaurants, healthcare facilities, green spaces, and so on, ensuring that all these amenities are accessible to all residents all the time. There are no traffic jams in Datatopia: real-time data analytics and predictive models provide timely and actionable insights to traffic management centers using cameras, sensors, and GPS data from vehicles. This is used to adjust traffic lights dynamically, reducing congestion by improving the efficiency of intersections. Digital signs display real-time traffic information to drivers, suggesting alternate routes when congestion is about to occur, and real-time analytics automatically detect traffic incidents and alert the authorities, leading to quick response times that minimize disruption and improve safety. Datatopians don't have to worry about overflowing waste bins: all bins have been fitted with sensors that detect when they are nearing full capacity, triggering timely waste collection and preventing overflows. Landfill usage and recycling rates are carefully monitored using real-time analytics, and this data is used to inform sustainability initiatives. Water use, the cleanliness of public spaces, and energy use are also monitored in Datatopia: street lights dim when roads are empty to reduce energy consumption, green energy systems power the city, and smart grids optimize power distribution. Predictive analytics have shown that 38% of Datatopians will be over 65 in the next 10 years, so healthcare measures such as hospital capacity and resource allocation are carefully managed to accommodate the aging population. Data analytics identifies trends and patterns within the population to target preventive interventions and improve overall health outcomes, including identifying at-risk populations and tailoring interventions to specific groups. Education is very important in Datatopia. Educators can analyze attendance records, coursework completion rates, and other data to identify at-risk students early in the academic year; early warning systems can trigger interventions to prevent dropouts and improve student success. Analytics are also used to recognize high achievers who may benefit from advanced coursework, and statistical algorithms are used to predict student outcomes, driving decisions on the allocation of university course offerings in the city.
Data science is also used in resilience planning in Datatopia. Predictive analytics ensure that the city has resilience strategies in place to cope with challenges such as cyber threats, economic downturns, or natural disasters, and this data is used to improve emergency response times and the deployment of emergency services during a crisis. Datatopia seamlessly integrates information and technology to create a healthy and sustainable urban ecosystem. We may not quite live like the people of the imagined Datatopia just yet, but whether it seems like a dream or a nightmare to you, it's clear that with the ever-evolving landscape of practical data analytics applications, we may be getting one step closer every day.

Congratulations on reaching the end of these lessons on Power BI in enterprise. During these lessons, you explored data's role in large enterprises. Let's take a few minutes to recap what you learned. You first learned how data flows through an enterprise. Data flow refers to data movement within an enterprise, and this movement occurs in the following stages: collection, processing, analysis, and decision-making. In a large enterprise, data flows in from a variety of sources, and its flow is governed by processes influencing how it is acquired, stored, manipulated, and shared. Once gathered, the data must be cleansed and transformed to prepare it for analysis: data cleansing is the act of standardizing data so that it is reliable and accurate, while data transformation is the act of transforming data as it flows through pipelines. Once cleansed and transformed, the refined data is ready to inform strategic decisions as its insights are revealed through Power BI reports. Organizations use these reports' insights to become data-driven enterprises; data isn't just information, it guides strategic choices and helps to map a pathway to growth. Power BI service is used by many businesses to generate data-driven insights because of the advantages it offers: it's accessible for remote teams, it scales to meet data growth, and it offers real-time collaboration along with data backup and recovery. And it's you who helps organizations take advantage of these benefits: the data analyst plays the central role in extracting valuable insights from this data. A data analyst brings several important skills to an enterprise. They provide analytical expertise, they create reports and visualizations that drive decision-making, they generate insights that identify room for innovation, and they help to identify and mitigate risks.

Next, you learned about SQL and its role in enterprise. SQL, or Structured Query Language, is used by data analysts to interact with SQL databases. Data is stored in a SQL database in a structured format, meaning it is organized so that it can be located quickly when required. SQL databases also store information using normalization and indexing to make data easier to locate. SQL databases offer many advantages for enterprises: they're great for storing data, they facilitate complex queries, they can scale to meet the demands of a growing business, and they can be accessed by multiple users at the same time. SQL databases return information through SQL queries, so data analysts must be familiar with SQL syntax to create queries that extract the required data. To connect to a SQL database, you must identify the location of the server and the database on the server that you need.
You then need to provide credentials to gain access. You can connect your data using import mode or DirectQuery mode: import mode loads data directly into Power BI, while DirectQuery mode connects Power BI directly to the source database. You can communicate with this infrastructure using SQL queries; for example, Adventure Works can use SQL SELECT queries to extract information on bicycles. SQL databases and Power BI servers also facilitate the use of dynamic reports. Dynamic reports can alternate between views based on user selection, and you can also create multi-value dynamic reports that simultaneously explore several areas of interest within your data sets. Both can be modified using parameters to change how they display information, which provides more value than standard reports. As a data analyst, you can decide which parameters inform the report so they align with your business needs. To create a dynamic report, you must connect Power BI to a SQL server, create a SQL query to retrieve and load the data from the SQL database, visualize the loaded data (typically in graph format), and finally configure parameters to analyze the data. Multi-value dynamic reports are more difficult to create because they require custom functions to be invoked in a data set. Power BI reports also make use of what-if parameters. A what-if parameter is a custom-defined variable that can be used to make interactive adjustments within a Power BI report; you can adjust its values to change your visualizations and generate insights into future scenarios. What-if parameters are most effective in scenarios with many variables that can significantly affect outcomes that must be assessed quickly. Throughout these lessons, you also completed several knowledge checks that tested your understanding of the concepts and processes you explored, and you encountered additional resources with links to further reading materials to enhance your understanding of the role of Power BI in enterprise. You've now reached the end of this summary. It's time to move on to the module quiz, where you can test your knowledge of these topics, followed by the discussion prompt, where you can discuss what you've learned with your peers. You'll then be invited to explore additional resources to help you develop a deeper understanding of the topics in this lesson. Best of luck!

Working with Power BI service requires managing many different reports, dashboards, and data sets, and keeping track of these can be a demanding task. Fortunately, you can use the workspace feature to manage your data assets. In this video, you'll explore Power BI service workspaces, their advantages, the types of workspaces available, and best practices to follow when using them. Lucas has been tasked with managing several different reports and dashboards for Adventure Works. He can use Power BI service workspaces to keep all these data assets in one place, using personal and shared environments. Let's explore how workspaces can help Lucas manage Adventure Works' assets. Power BI service workspaces act like specialized rooms in a house: each workspace hosts distinct data sets, reports, and dashboards. This helps data analysts with organized and efficient data management. Several features of workspaces make them useful for data analysts, including organization, access control, collaboration, and streamlined updates. Let's explore these features, beginning with organization.
Workspaces offer data analysts great organizational potential. Each workspace is a unique container for related reports, dashboards, and data sets, which helps keep your data tidy and easy to locate. Workspaces also provide access control: you can safeguard your data from unauthorized users with your workspace's access control features. Depending on the workspace, you can determine who can see or edit the content. For example, Lucas can configure his workspace so that only other members of the data analysis team can view it, which is especially useful when working on confidential data or collaborating with specific teams. Workspaces also enable collaboration between teams. Shared workspaces are like conference rooms: they're spaces where Lucas and the data analysis team can discuss and refine data insights. It's not just about storing reports, but building them together. Workspaces also help keep content updated. With workspaces, you can streamline updates to your projects: updating or modifying data is much easier with everything in its right place, and whether you're pulling in new data or revising visualizations, a structured workspace ensures consistency and clarity.

Now that you know more about workspaces and their advantages, let's explore the different types available. There are two main types of workspaces, personal and shared, and each serves a different purpose. A personal workspace is like a private room in your house: it's your space, where you can arrange things to your liking and work on projects privately. Here, you're in total control; outsiders don't have a key, ensuring your work remains confidential and undisturbed. Shared workspaces let team members collaborate. They can bring together their individual data insights and blend them into a collective narrative. It's a space designed for collaboration, allowing multiple users to add, edit, and refine reports and dashboards simultaneously.

How you manage and utilize your workspace is crucial for effective data analysis, and adopting certain best practices can significantly enhance your efficiency and output. One important best practice is regular cleanup: periodically review and remove outdated reports or data sets from your workspace. This proactive approach ensures optimal performance and prevents potential confusion from irrelevant information. You should also establish clear naming conventions for your data assets. Consistency is key when naming your reports, dashboards, and data sets; this practice aids easy retrieval and benefits all users, especially in shared workspaces. You must also frequently review your access controls, assigning access levels based on roles and responsibilities to maintain data security and prevent unintended modifications. For example, over at Adventure Works, Lucas must continually monitor who can access his team's shared workspace to ensure only data analysts can view its assets. In the digital realm, safeguarding your work is paramount, so ensure that you back up your work regularly. Regular backups protect against unexpected data losses, ensuring continuity in your projects; on a large team like Lucas', frequent backups are vital, as it only takes one mistake from one team member to lose important data. Finally, you should encourage open discussion and collaboration with your team members by fostering a culture of continuous feedback. By actively seeking and implementing suggestions, you can refine data visualizations, optimize reports, and foster a more collaborative environment. Adhering to these best practices ensures efficient data management and creates a conducive environment for team collaboration.
You should now be familiar with Power BI service workspaces, their advantages, the types of workspaces available, and best practices to follow when using them. As you've discovered through Lucas and his team, workspaces can greatly benefit your data analysis projects.

As a Power BI data analyst, you'll frequently collaborate with others in shared workspaces, so it's important that you understand how to create and manage these workspaces in Power BI service. In this video, you'll explore the process steps for creating a workspace and learn how to keep its content updated.

Over at Adventure Works, Lucas needs to create a collaborative workspace for his data analytics team, and a Power BI service shared workspace is the perfect solution. Let's help Lucas create and manage this workspace.

Log into Power BI service and navigate to the left-hand sidebar to access the platform's tools. Select Workspaces to display the available workspaces. For now, Lucas only has access to My workspace, his personal space. Select My workspace to access the space and reveal its contents. The workspace contains reports, dashboards, and data sets; however, other team members need to collaborate on these assets.

To create a shared workspace for the team, navigate to Workspaces and select New workspace. The Create a workspace dialog box appears on screen. In this dialog box, you can input a workspace name, assign a domain for your workspace, and upload an image. You can also use advanced settings to assign members. For now, let's just input Adventure Works Sales as the workspace name, then select Apply.

Now that we've created the workspace, we must upload some content. Select Upload, then select a Power BI report. The report, its data set, and its dashboard are uploaded to the workspace and ready to share. However, if any changes are made to the report in Power BI Desktop, it will need to be uploaded again to the shared workspace to ensure these changes are reflected for all other users.

To demonstrate this, let's open the report in Power BI Desktop and make a quick change. In the report, select the Order Total by Product Color visualization, select the ellipsis symbol, then select Sort axis and change the order to Sort ascending. All values on the x-axis are now sorted by ascending order total. Save the report and return to Power BI service.

Open the report again from the workspace screen. This version does not reflect the change we made in Power BI Desktop, so we'll have to upload it again. Return to the workspace screen and select Upload, then select Browse and locate the updated report. A warning appears stating that a data set with the same name already exists. Select Replace to upload the new version of the report. Once the new version is uploaded, you can open the report and view your changes. The updated chart is now visible in the report, indicating a successful upload.

You should now be familiar with the process steps for creating a workspace and keeping its content updated. By knowing how to build and manage shared workspaces in Power BI service, you can work effectively with your teams to generate insights and help drive business success.
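These workspace steps can also be scripted. The sketch below is a minimal illustration using the Power BI REST API with Python's requests library, not a full implementation: it assumes you already hold a valid Azure AD access token with the appropriate Power BI scopes (acquiring one, for example with MSAL, is outside the scope of this walkthrough), and the workspace name and PBIX file path are placeholders.

```python
import requests

API = "https://api.powerbi.com/v1.0/myorg"
ACCESS_TOKEN = "<your-azure-ad-token>"  # assumption: acquired separately, e.g. via MSAL
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

# Create a shared workspace (the REST API calls workspaces "groups").
resp = requests.post(f"{API}/groups", headers=HEADERS,
                     json={"name": "Adventure Works Sales"})
resp.raise_for_status()
workspace_id = resp.json()["id"]

# Upload (import) a PBIX file; CreateOrOverwrite mirrors the "Replace"
# prompt you see when re-uploading a report with the same name.
with open("sales_report.pbix", "rb") as pbix:  # hypothetical local file
    resp = requests.post(
        f"{API}/groups/{workspace_id}/imports",
        headers=HEADERS,
        params={"datasetDisplayName": "sales_report.pbix",
                "nameConflict": "CreateOrOverwrite"},
        files={"file": pbix},
    )
resp.raise_for_status()
print("Import started:", resp.json()["id"])
```

Re-running the import with the CreateOrOverwrite option is the scripted equivalent of the re-upload-and-Replace flow shown above.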
Running a shared workspace involves managing a lot of different people, and everyone must be assigned the correct roles and permissions to ensure the team works together effectively. In this video, you'll explore workspace roles and the different types available, and learn how to configure them.

Lucas has created a new shared workspace for his Adventure Works colleagues to collaborate on the company's latest reports. He now needs to identify who requires access to the workspace and assign the correct roles to everyone. Let's work with Lucas to assign roles to the team.

Just as you wouldn't let everyone in a company have the keys to every room, roles determine who can do what in digital workspaces. These roles ensure that each person has only the access required to do their part of the job; nobody is granted unnecessary permissions that could lead to accidental disruptions or security risks. In Power BI service, workspace roles are the backbone of efficient and secure collaboration. Workspace roles include Viewer, Contributor, Member, and Admin. Let's explore these roles in more detail, beginning with Viewer.

Viewers are the audience: they can look, but they can't touch. In other words, they can view content without modifying or managing anything. Lucas can assign this role to managers, stakeholders, or anyone else who needs to be in the loop without directly impacting the workspace.

Next is Contributor. Contributors are there to add and modify content, but they can't adjust access permissions or delete items. Lucas should assign this role to those focused on adding content; they can contribute content but don't need to make bigger workspace adjustments.

Workspaces also host Members. Members can contribute to the content by adding and editing assets, and they can also add other members or collaborators with lower permissions. However, they cannot delete the workspace or manage user roles. Lucas can assign this role to regular team members who need to work on data or perform analysis and might also need to add others to the project.

And finally, there's Admin. Admins oversee the workspace: they have full control, from adding, editing, and deleting content to managing user access and even deleting the workspace. Lucas can assign the Admin role to himself or another individual tasked with overseeing the entire project or workspace. The chosen admin can keep the project running smoothly while ensuring everyone else performs their roles as required.

Now that you're more familiar with workspace roles, let's help Lucas manage the roles in his shared Power BI workspace. Lucas has uploaded the project's report, data set, and dashboard to the Adventure Works Sales workspace; however, roles must be assigned before the team can collaborate.

First, select Manage access from the workspace environment. All team members with access to the workspace are listed here; for now, it's only Lucas. To add a new team member to the workspace and assign a role, select Add people or groups. A brief information box appears, stating that viewers cannot edit content in the workspace. To add a team member, search for their name or email in the search box.

For the first example, let's add Adio, our fellow data analyst. Assign Adio the Contributor role so he can collaborate on the content, and press Add. Adio is now added to the workspace. Next, let's add Renee, the marketing manager, as a Viewer. This role lets her access the workspace to view insights without making any changes. Lastly, the IT department must be assigned the Admin role, which grants full permissions, from content management to user access control. Locate the admin account in the search box, select the Admin role, and add it to the workspace.

All roles have now been assigned. Select the back arrow to view the roles everyone has been assigned. To modify a role, select the down arrow on a user's permission and alter it to another role. For example, Renee needs to be able to add users from her team to the workspace, so reassign her role to Member to grant her these permissions.
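For completeness, here is a hedged sketch of how the same role assignments could be automated with the Power BI REST API. It assumes the `API`, `HEADERS`, and `workspace_id` values from the earlier sketch, and the email addresses are placeholders; the `groupUserAccessRight` values correspond to the Viewer, Contributor, Member, and Admin roles described above.

```python
import requests

# Assumes API, HEADERS, and workspace_id from the previous sketch.
def add_workspace_user(email: str, role: str) -> None:
    """Grant a user a workspace role: Viewer, Contributor, Member, or Admin."""
    resp = requests.post(f"{API}/groups/{workspace_id}/users", headers=HEADERS,
                         json={"emailAddress": email,
                               "groupUserAccessRight": role})
    resp.raise_for_status()

add_workspace_user("adio@example.com", "Contributor")   # fellow analyst
add_workspace_user("renee@example.com", "Viewer")       # marketing manager
add_workspace_user("itadmin@example.com", "Admin")      # IT department

# Reassigning an existing user's role (e.g., promoting Renee to Member)
# uses PUT on the same endpoint rather than POST.
resp = requests.put(f"{API}/groups/{workspace_id}/users", headers=HEADERS,
                    json={"emailAddress": "renee@example.com",
                          "groupUserAccessRight": "Member"})
resp.raise_for_status()
```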
Having helped Lucas and his team organize their workspace, you should now be familiar with workspace roles, the different types available, and how they're configured. Always configure workspace roles correctly to ensure your project runs smoothly and set your team up for success.

Workspaces are useful for storing and collaborating on content, but it's important to keep this content organized and easily accessible. Workspace apps are a great way of organizing your content efficiently so it can be located quickly and easily. In this video, you'll explore the basics of workspace apps and their advantages, and learn how to create one.

In Adventure Works, each department accesses its reports and dashboards through Power BI. However, navigating this content in Power BI is complex and time-consuming. As a solution, Adventure Works wants to create department-specific apps so that each department can access its reports and dashboards quickly and efficiently. Let's find out more about apps in Power BI service and how Adventure Works can incorporate them.

An app in Power BI is a collection of important assets, like dashboards, reports, and data sets, packaged together for ease of access. These assets can be bundled together under a workspace and then published to the Power BI service, enabling a streamlined sharing and distribution mechanism for Power BI content.

There are a few reasons why businesses like Adventure Works prefer to use apps to access content in Power BI service. One reason is ease of access: with apps, users don't have to search through numerous reports and data sets; everything they need is in one package, which makes content quick and easy to locate. Apps also facilitate version control. When an app is updated, users automatically see the latest version, ensuring that everyone is on the same page.

Apps also help with security. They maintain the same level of data security as individual reports: access can be restricted to authorized users only, and data can be secured at row level, so users can only view what you want them to view. These security measures are great for protecting your data. Finally, apps can be customized and tailored for specific departments or roles within an organization. For example, Adventure Works can customize an app to show marketing data for the marketing department, sales data for the sales department, or financial data for the accounting department. This makes workspace apps incredibly flexible tools for data distribution.

Now that you're more familiar with Power BI apps, let's explore the process for creating an app in Power BI service. Adventure Works has created a workspace called Adventure Works Sales, which holds all content related to the company's sales, like reports and dashboards. To create an app for this workspace, select the Create app option. This opens the Build your app window, which contains three tabs: Setup, Content, and Audience.

In the Setup tab, you must input key information about your app, including the name, description, logo, and color scheme. You can also add contact information for publishers or other important individuals. Name the app Adventure Works Sales and add "sales app" as the description. Once you've input the required information, select Add content to move to the next tab. In this tab, select the Add content option to add reports to the app. Adventure Works requires the Orders report and the Product Sales report, so select and add both reports.
Once added, the reports appear in the left sidebar, where you can preview them or adjust their order. Select the symbol to the left of the Orders report and drag it to the bottom so it appears last in the app.

You can also select the down arrow to the right of Add content to add separate sections to your app. Let's link to the Adventure Works site. Select Add new section; the new section appears in the list, so rename it Adventure Works internal site. Press the down arrow again and select Add link. Name the link Adventure Works website and add the link. In the opening field box, select Content area; then, in the section field box, select Adventure Works website. Select Add to add the link to the app, then select Next: Add audience to move to the next section, the Audience tab.

You can use the Audience tab to manage access to your application. By default, anyone who can access the workspace can access the app. You can add more users or groups from the search box, or you can share your app with the entire organization. For now, let's restrict access to workspace users. Select Publish app to complete the process. It might take a few minutes for the app to publish; once it's ready, select Go to app to view it. The app is ready to use, with the Adventure Works website as its landing page, and you can use the sidebar on the left to navigate its contents.

You should now be familiar with workspace apps, their advantages, and how to create them in Power BI service. As you continue to work with Power BI service, use workspace apps as handy tools to organize your content for quick access and more efficient projects.

Workspaces are a useful tool for developers, but how do you determine how widely used or effective your reports are? With Power BI's workspace metrics features, you can monitor the usage and effectiveness of your workspace content. In this video, you'll learn about the importance of monitoring workspace and report usage, utilizing the current report metrics and the new preview feature, and you'll explore how usage metrics enhance report and workspace efficiency.

Lucas is responsible for monitoring the performance of his team's Power BI workspace and its content. A strong understanding and efficient deployment of usage metrics will help Lucas monitor the effectiveness of his workspace and reports. Let's explore these topics in more depth and find out how they can help Lucas.

Monitoring workspace usage in Power BI involves tracking how reports and dashboards are accessed, used, and shared within a workspace. It provides a window into the effectiveness and reach of the deployed data solutions, and the insights gathered from this data enable data analysts to make informed decisions on optimizations, security, and resource allocation.

It's important to understand how your content is used so you can measure its impact and effectively guide your efforts. Usage metrics act as feedback, showing how reports and dashboards are accessed within the organization. For example, you might discover that your team references several reports daily, or that a certain dashboard isn't receiving the number of views it should. You can use these data-driven insights to improve the performance of these assets. Monitoring report performance ensures relevance, efficiency, and responsiveness, aligning your work with organizational needs and user preferences.

Monitoring is mainly performed using the Power BI service's usage metrics reports, or monitoring reports. You can enable these reports for every workspace, giving insights into how frequently users access them.
The initial usage report in Power BI primarily focuses on individual report metrics, providing details such as the number of views, shares, and user interactions on a per-report basis. For example, Adventure Works evaluates the performance of its global marketing reports by tracking views and user interactions. The company also measures how each report has been shared to gauge engagement across its worldwide workforce. The usage metrics report is instrumental in understanding the performance and user engagement of your workspace reports.

Power BI service offers its users the option to switch to a preview version of the new workspace metrics feature. This new feature expands monitoring from individual reports to the entire workspace, providing additional insights into report performance. Some of these insights include aggregated metrics, which encompass all KPIs analyzed in the old usage reports and add report performance information; this feature compiles all of Adventure Works's previously analyzed KPIs and integrates report performance data to provide a comprehensive set of metrics. Other insights include the typical opening time of a report, with daily and weekly breakdowns; Lucas uses this data to track average report loading times to help ensure a smooth user experience. The feature also provides information on all workspace reports instead of a specific one, which Lucas uses to understand how his reports are performing so he can improve their content. You can also access a detailed FAQ article containing all relevant capabilities and a description of this rich new feature.

To run and access the usage metrics data, you'll require the following prerequisites. You need a Power BI Pro or Premium Per User (PPU) license to run and access the usage metrics data; however, the usage metrics feature captures usage information from all users, regardless of the license they're assigned. To access usage metrics for a report, you must have edit access to that report. Finally, your Power BI admin must enable usage metrics for content creators, and they may have also enabled collecting per-user data in usage metrics. Ensure these prerequisites are established before running or accessing the usage metrics data.

In this video, you've learned about the importance of monitoring workspace and report usage, utilizing the current report metrics and the new preview feature, and you explored how usage metrics enhance report and workspace efficiency. Monitoring workspace usage with Power BI's workspace metrics preview feature improves our understanding of data usage across the organization, supporting informed decision-making and resource efficiency.

As a data analyst, your role includes tracking how users engage with your data. With the workspace usage report, you can review insights into workspace activity and user engagement, then use these insights to optimize your data and reports. In this video, you'll learn how to enable the workspace usage report feature in Power BI, generate and navigate a usage metrics report for a specific workspace report, and interpret key metrics to gauge user engagement and report interaction.

Lucas has uploaded a product sales report to his workspace, and he needs to check that his data analytics team has reviewed this report. Lucas can use the usage metrics and workspace usage reports to monitor the team's engagement with his product sales report. Let's help Lucas achieve his goal by guiding him through this process.
The usage metrics report in Power BI is important for understanding how individuals interact with reports and dashboards. It is an insightful report that can be launched and viewed on any workspace report, and the new workspace usage report feature enhances this by providing even more detailed insights: it allows a closer look at how entire workspaces are used, not just individual reports.

Thanks to these reports, users can now view an enhanced overview of basic report metrics. The Report usage tab lets users better understand each report's performance, with more detailed usage metrics that provide data on topics like views and users. The Report performance tab provides a breakdown of a report's effectiveness, with detailed insights into specific report interactions and their impact. Users can also use the Report list tab to explore how all the reports in the workspace are performing, making it easy to compare their performance and success. And the FAQ tab provides easily accessible answers and guidance. Adventure Works can use the new workspace usage report feature to align resources and strategies with actual user interaction and needs, enhancing performance and user experience.

Now that you're more familiar with usage reports and the new workspace usage report feature, let's create one for Lucas. From the Power BI home screen, navigate to Workspaces and select the Adventure Works Sales workspace, where you can view the content uploaded to this workspace. To enable the usage metrics report on the product sales report, hover over the report item and select the ellipsis symbol to access the report's options. Locate and select the View usage metrics report option to launch the monitoring report. If this is your first time accessing the usage report, Power BI will need a few moments to create it.

In the usage metrics report, you can find information on report views and unique views by day, total report views, and a list of all users who accessed the report. There are also slicers available that can filter the usage report based on distribution method; this highlights users the report was shared with, or workspace users who accessed the report. You can also slice based on the platform users used to access the report, either a browser or mobile. Lastly, you can even filter by the usage of separate report pages.

To enable the new monitoring feature, toggle New usage report to on. This transforms the usage report into the new workspace usage report, which contains four separate pages of monitoring tools. On the first page, Report usage, you can identify metrics like the old report's, with updated visualizations and separate graphs instead of slicers. For example, you can see that 100% of report access has been conducted through powerbi.com rather than mobile; also, selecting Pages on the bottom-right visualization shows that the Orders report page accounts for 57% of the views. On the second page, Report performance, you can see the loading time of the report based on date, user, country of browsing, and the internet browser used; this is a significant page when troubleshooting long loading times on reports. On the third page, Report list, the new usage report feature allows users to monitor the usage of every workspace report from a single view; here you can see the familiar tools from the old usage monitoring report, now enabled across all workspace reports. The fourth and last page, FAQ, contains a detailed guide to all metrics and terminology used in this new monitoring feature, explaining the usage of every tool in detail. All this information can easily be exported to Excel and analyzed, making monitoring and reporting on workspace usage easier than ever.
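Once the usage data has been exported to Excel, it can be analyzed with whichever tool you prefer. The snippet below is a minimal sketch using Python's pandas library; the file name and the column names (`Report page`, `Views`) are hypothetical stand-ins, since the exact layout of your export depends on the visual you export from.

```python
import pandas as pd

# Hypothetical export of the page-views visual from the usage report;
# adjust the file and column names to match your actual export.
usage = pd.read_excel("workspace_usage_export.xlsx")

# Total views per report page, highest first.
views_by_page = (usage.groupby("Report page")["Views"]
                      .sum()
                      .sort_values(ascending=False))
print(views_by_page.head(10))
```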
In this video, you've learned how to enable the workspace usage report feature in Power BI, generate and navigate a usage metrics report for a specific report within a workspace, and interpret key metrics to gauge user engagement and report interaction. With these reports, you can optimize your workspace and its reports so that they meet the needs of your team.

By now, you're familiar with generating insights from data. Insights are generated from data sets, and these data sets in turn rely on a timely, accurate data flow from different sources. Over the next few minutes, you'll learn about the basics of data sets in Power BI service, explore the relationship of data sets to data flows and reports, and compare scheduled and incremental refreshes.

Adventure Works's data sets are dynamic: they're continually updating as they receive new data from different sources. The company must ensure that its reports capture this latest data, so they've tasked Lucas with integrating its data sets and data flows. Let's take a closer look at how data flows into data sets.

A data set in Power BI is a collection of data you import or connect to. This data can come from a single source or multiple sources; once captured, it forms the basis for your reports and dashboards. Every data set's unique structure and metadata influence the analysis you can perform. Let's break down this relationship further.

Data sets act as a bridge between data flows and reports in Power BI. Data flows collect and transform data from various sources, like SQL databases and Excel files. This data is then loaded into data sets, and these data sets, collections of processed data, feed into reports, enabling analysts to derive insights effortlessly. This symbiotic relationship ensures a streamlined data flow from extraction to visualization.

Let's look at an example of how the Adventure Works sales department can use data flows to consolidate and prepare data for analysis. An Adventure Works data flow may collect sales data from different regions using a complex network of data sources. It then cleans this data by removing duplicates and transforming the remaining data into a unified format. Once this process is complete, the cleansed and transformed data is loaded into a data set. Data analysts can use this data set to create a report to analyze sales trends, compare regional performance, and identify growth opportunities.

It's important to remember that all data sets must be frequently refreshed to include updated data; this ensures that your insights are as current as possible. You can manually refresh your data set at any time, but with Power BI you can also plan a refresh to occur automatically. There are two main ways to automatically refresh your data in Power BI service: a scheduled refresh and an incremental refresh. Both refresh mechanisms are vital for maintaining the accuracy and relevance of data in the Power BI service, so let's take a closer look at each.

A scheduled refresh is a set routine in which the entire data set is refreshed at specific intervals. For example, Lucas has scheduled a daily refresh for 2:00 a.m. each morning in the Adventure Works Sales workspace to ensure data remains current. However, be careful when using scheduled refresh: it can be resource-intensive for large data sets. An alternative, more resource-efficient method is an incremental refresh. Unlike a scheduled refresh, an incremental refresh only updates the parts of the data set that have changed.
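Manual, on-demand refreshes can also be triggered programmatically. Here is a minimal sketch using the Power BI REST API, again assuming the `API` and `HEADERS` values from the earlier sketches plus placeholder workspace and data set IDs; it kicks off a refresh and then reads the recent refresh history.

```python
import requests

# Assumes API and HEADERS from the earlier sketches; IDs are placeholders.
workspace_id = "<workspace-guid>"
dataset_id = "<dataset-guid>"

# Trigger an on-demand refresh, emailing the owner if it fails.
resp = requests.post(
    f"{API}/groups/{workspace_id}/datasets/{dataset_id}/refreshes",
    headers=HEADERS,
    json={"notifyOption": "MailOnFailure"},
)
resp.raise_for_status()

# Inspect the five most recent refreshes and their status
# (e.g., Completed, Failed, or Unknown while still running).
resp = requests.get(
    f"{API}/groups/{workspace_id}/datasets/{dataset_id}/refreshes",
    headers=HEADERS,
    params={"$top": 5},
)
resp.raise_for_status()
for refresh in resp.json()["value"]:
    print(refresh["startTime"], refresh["status"])
```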
Returning to Lucas's example: he sets a scheduled refresh at 2:00 a.m. daily for the primary sales data set to capture the previous day's data. However, he can also set an incremental refresh every hour for the continuously updated online sales data set. This incremental refresh captures new sales data without reprocessing the entire data set. This way, Lucas efficiently keeps data sets current, ensuring reliable analysis and reporting at Adventure Works; both refresh methods help Lucas keep his reports timely and actionable.

You should now be familiar with the basics of data sets and their relationship with data flows and reports, and understand the difference between a scheduled and an incremental refresh. Data sets are central to Power BI, and they're a valuable part of your analytical toolkit. Leverage data sets effectively for greater insights and informed decision-making.

Power BI is a fantastic service for data analysis; however, to get the most out of it, you must ensure it has a secure and stable connection to your data. With Power BI gateways, you can create a strong, safeguarded bridge between Power BI service and your on-premises data. Over the next few minutes, you'll discover how to connect data with Power BI gateways, explore the different types and uses of gateways, and learn how to set up and manage them.

Adventure Works stores large amounts of data on premises, and Lucas and his data analytics team must connect to this data securely and reliably using Power BI. The team can leverage Power BI gateways to establish a secure and reliable connection between on-premises data and Power BI service. So why are Power BI gateways a solution for Adventure Works?

Power BI gateways establish a secure and reliable connection, or bridge, between your on-premises data and the Power BI service on Microsoft's cloud. This connection allows Power BI service to access and retrieve data from on-premises data sources, enabling organizations to keep their data secure while benefiting from the Power BI service's cloud-based analytics and sharing capabilities.

Power BI gateways interact with on-premises data in two ways. The first is a data refresh: gateways facilitate the scheduled refresh of data sets, pulling the latest data from the source into Power BI. For example, Lucas can use the gateway to schedule a daily refresh of Adventure Works's on-premises sales data, ensuring that the sales team has the latest figures ready for analysis in Power BI every morning. The second type of interaction is query execution: gateways help execute queries against the data source to retrieve updated data. When Lucas opens the latest iteration of the Adventure Works sales data report and runs a query to identify yesterday's total sales, the gateway executes the query against the underlying data source.

There are three main types of gateways in Power BI, each suited to different scenarios: the on-premises data gateway, the on-premises data gateway (personal mode), and the Azure virtual network, or VNet, data gateway. Which type of gateway you choose depends on the setup of your organization and its specific data management and security requirements. Let's find out more about each type, beginning with the on-premises data gateway.

The on-premises data gateway suits multiple users sharing and refreshing data across many Microsoft services, including Power BI. It's very versatile, which makes it useful for diverse organizational setups. The gateway supports all types of connections from Power BI, like import data, scheduled refresh, DirectQuery, and live connection; quick access to and support for these connections is important for real-time data interaction. For example, each Adventure Works department requires access to different data sets stored on premises. These data sets can be managed centrally with an on-premises data gateway, and this setup lets multiple users refresh and access the data they need across different Microsoft services.
Next, let's review the on-premises data gateway (personal mode). Personal mode is tailored for single-user scenarios. It supports connections to local data sources such as SQL Server and Excel, which is useful for individual users or analysts. It's also designed to be easy to set up, and once setup is complete, the gateway requires no additional configuration for data sources. This offers a much less complex solution for business analysts who want to publish and refresh Power BI reports with minimal hassle. However, this gateway supports only one type of connection, import data with scheduled refresh, and it's designed only for Power BI, so it doesn't support other applications. Lucas can use the personal mode of the on-premises data gateway to manage data sets he doesn't want to share with the rest of the team; with this straightforward setup, he can refresh the data without going through the central gateway.

And finally, there's the Azure virtual network, or VNet, data gateway. The VNet data gateway best suits complex organizational setups by offering enhanced security and data management features within a virtual network. It helps cut the costs and overheads of installing, updating, and monitoring on-premises data gateways by virtually bridging Power BI to supported Azure data sources. This gateway securely communicates with the data source, executes queries, and transmits results to the Power BI service. As Adventure Works grows, it requires better security and data management, and a VNet gateway is a great solution: it enables secure data transfer and the ability to manage the data environment, it provides a secure pathway for data that adheres to the company's organizational security policies, and it keeps data refreshed and readily available for analysis in Power BI.

You should now understand how to connect data with Power BI gateways, the different types and uses of gateways, and how to set up and manage them. With a strong understanding of gateways, you can establish an efficient and secure connection between your on-premises data and Power BI.
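Administrators can also inspect their registered gateways and the data sources bound to them programmatically. The sketch below, which reuses the assumed `API` and `HEADERS` values from earlier, lists the gateways the caller can administer and the data sources configured on each; treat it as an illustration rather than a full management script.

```python
import requests

# Assumes API and HEADERS from the earlier sketches.
resp = requests.get(f"{API}/gateways", headers=HEADERS)
resp.raise_for_status()

for gateway in resp.json()["value"]:
    print(f"Gateway: {gateway['name']} ({gateway['id']})")
    # List the data sources registered on this gateway.
    ds_resp = requests.get(f"{API}/gateways/{gateway['id']}/datasources",
                           headers=HEADERS)
    ds_resp.raise_for_status()
    for source in ds_resp.json()["value"]:
        print("  -", source["datasourceType"], source.get("connectionDetails"))
```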
Impactful insights depend on access to the latest data; an analysis based on outdated data isn't of much use to anyone. Configuring a regular Power BI data refresh ensures your reports and dashboards are consistently synced with the latest data. By the end of this video, you'll understand the importance of configuring a data set refresh and know how to configure a scheduled, on-demand, and incremental refresh.

Adventure Works needs daily updates on its marketing campaigns and sales, so Lucas must ensure that the reports and dashboards his team relies on for analysis contain the latest available data. Let's help Lucas configure a data set refresh so his team is working with up-to-date information.

First, access the Adventure Works Sales workspace. The workspace contains a new report on marketing campaigns. Access the report settings to plan a scheduled refresh, and select Schedule refresh from the settings to navigate to the data set refresh settings. The last refresh failed because the credentials weren't entered when the data set was uploaded to the cloud. Navigate to the Data source credentials category and select Edit credentials. This report is connected to the Adventure Works SQL database, so input your Adventure Works SQL database username and password, then select Sign in. Next, navigate further down the menu and expand the refresh settings. Toggle the setting on to activate the scheduled refresh, and check that the refresh is configured daily, at 6:00 a.m. and 1:00 p.m. Coordinated Universal Time, or UTC. The scheduled refresh is now ready, so navigate back to the workspace.

Once the credentials are set, you can manually refresh the data set whenever needed. To demonstrate, let's refresh the Orders report. Hover over the report and select the circular arrow; this is the refresh icon, and selecting it performs an on-demand, manual refresh of the data set.

Next, let's configure an incremental refresh on the sales transaction report. Navigate to the Power Query Editor in Power BI Desktop. To issue an incremental refresh, you must create two parameters: one that determines when the refresh begins and another that states when it should end. Select Manage parameters, then New parameter. In the Manage parameters dialog box, name the first parameter RangeStart, assign it the Date/Time type, and provide January 1st, 2000 as the current value. Right-click the parameter and select Duplicate to create a copy; this copy is your second parameter, so rename it RangeEnd.

Next, select the Sales table and identify the Order Date column. Select the column's down arrow and access Date/Time Filters, then Custom Filter. In this window, keep the rows where Order Date "is after or equal to" the first parameter: select Parameter and input RangeStart. For the "and" option on the second row, select "is before", then Parameter and RangeEnd. Your configuration is now ready, so select OK, then Close & Apply to return to Power BI Desktop.

Right-click the Sales table and select Incremental refresh. Toggle incremental refresh on, and configure the settings to archive data older than two years and incrementally refresh data from the last seven days. Each data set refresh will now remove transactions that occurred over two years ago and refresh only transactions that occurred in the last seven days. Note that, as the info box states, the report must be uploaded to the Power BI service for the refresh policies to take effect. Apply your changes and save your report.

Lucas and his team are now working with the latest data, and you should now understand the importance of configuring a data set refresh and how to configure a scheduled, on-demand, and incremental refresh. Great work!
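The scheduled refresh configured above through the UI can also be set from a script. Below is a minimal, hedged sketch that updates a data set's refresh schedule via the Power BI REST API to run daily at 6:00 a.m. and 1:00 p.m. UTC, matching the walkthrough; it reuses the assumed `API` and `HEADERS` values, and the IDs are placeholders.

```python
import requests

# Assumes API and HEADERS from the earlier sketches; IDs are placeholders.
workspace_id = "<workspace-guid>"
dataset_id = "<dataset-guid>"

schedule = {
    "value": {
        "enabled": True,
        "days": ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"],
        "times": ["06:00", "13:00"],   # 6:00 a.m. and 1:00 p.m.
        "localTimeZoneId": "UTC",
        "notifyOption": "MailOnFailure",
    }
}

resp = requests.patch(
    f"{API}/groups/{workspace_id}/datasets/{dataset_id}/refreshSchedule",
    headers=HEADERS,
    json=schedule,
)
resp.raise_for_status()
print("Refresh schedule updated.")
```

Note that the incremental refresh policy itself is defined in the model through Power BI Desktop, as in the walkthrough; this script only controls the service-side schedule.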
Analyzing data involves working with many different data sets, so it's important to distinguish reliable data sets from unreliable or misleading ones to ensure your insights are accurate. With Power BI, you can endorse, promote, and certify reliable data sets to clarify which ones you and your team should work from. In this video, you'll understand the importance of data set endorsement, differentiate between promoting and certifying data sets, and learn how to promote a data set in a Power BI workspace.

Over at Adventure Works, the sales workspace is cluttered with many data sets, and it's difficult for Lucas and his team to determine which ones to work with. Lucas decides to identify and endorse reliable data sets to help his team maintain data integrity in their workspace. Let's discover more about endorsing data sets, then use our new knowledge to help Lucas and his team.

Endorsing data sets involves identifying and marking reliable data sources in your workspace to ensure your team works with quality content. You can endorse data sets in Power BI from the endorsement and discovery menu. Data set endorsement in Power BI comprises two levels: promoting and certifying. Promoting a data set indicates that you trust its content and view it as ready for organizational use. When you promote a data set, a Promoted icon appears next to it in the workspace; when a data set is flagged as trusted, it becomes easily discoverable, and the team knows it's reliable.

You can also certify a data set. This is a higher level of endorsement: it signifies that the data set meets the company's stringent quality and compliance standards. However, content certification is a big responsibility. Only authorized users can certify content, so this option is typically available only to workspace owners. Over at Adventure Works, Lucas is the workspace owner, which means he is the only team member who can certify data sets.

Next, let's review the process for endorsing content in Power BI by helping Lucas promote reliable data sets. Access the Adventure Works Sales workspace to view all available data sets; select Filter, then Data set. The team has been using the marketing campaigns report a lot recently. It's filled with high-quality data that has delivered many great insights, so Lucas has decided it can be endorsed as trustworthy content.

To begin the endorsement process, hover over the data set to reveal the ellipsis symbol, then select the ellipsis, then Settings. In Settings, locate and expand the Endorsement and discovery section. Check the Promoted option, then check Make discoverable so other users can identify the endorsed data set, and select Apply to finish configuring the settings.

Select Adventure Works Sales from the navigation pane to return to the workspace and navigate to the right of the workspace. The marketing campaigns report data set is now marked as promoted. The Promoted flag draws the attention of workspace users to the report and lets them know it's suitable for analysis. Great work! You've helped Lucas identify and endorse a reliable report that his team can use for analysis, and you should now understand the importance of data set endorsement, be able to differentiate between promoting and certifying data sets, and know how to promote a data set. By endorsing data sets, you ensure your team works with, and draws insights from, reliable and consistent data.
Anna oversees quality at the Spiro Car Company, and today she has a big meeting with senior leadership. Spiro has been manufacturing electric vehicles for the last eight years, and business is booming; or at least, it was. Lately, there have been concerns about manufacturing time and quality. Business has slowed, sales have dropped, morale is low, and Anna, unsurprisingly, is worried. Luckily, one thing Anna never worries about is statistics: they never lie. Each machine in the assembly line reports statistics to a central database in the manufacturing facility. Unfortunately, dumping data on her managers' desks won't solve the problem this time. She has heard her colleagues discuss using Power BI for analyzing data, but Anna prefers the old ways and stores everything locally on a central database. But what if she could somehow convert her data stack into a coherent, interactive visual? If so, she would be one step closer to figuring out where quality is slipping and, more importantly, providing the leadership team with the answers they need.

She meets with Dennis and outlines her predicament. He explains the on-premises gateway to her: this gateway will bridge the gap between Anna's on-premises data and Power BI, and best of all, the data transfer is completely secure. This means that she can access all the features of Power BI using the data stored locally on her laptop; a great solution. After a quick guide through the basics from Dennis and a chat with IT about requirements, Anna is ready. First, she installs the gateway on the database server and signs in with her work account to register the gateway. Anna can now connect all the data she stores locally to reports and dashboards in Power BI; she can even configure a refresh schedule or perform an on-demand refresh.

She starts running reports, building rich data visualizations, and identifying interesting business insights. She discovers that the main issue in the Spiro manufacturing supply chain is a delay in delivering the cars' high-capacity battery packs. The supplier also fails to deliver enough batteries, which leads to further delays, and quality slips as the assembly team tries to make up for them. Anna can't believe how straightforward it was to connect her on-premises data using the gateway, and the best thing about it: she doesn't have to say goodbye to her older methods of storing her data locally. Anna arrives at the leadership meeting with an interactive dashboard to outline her findings and a plan to resolve the issue. Senior leadership decides to use Anna's data analysis to develop a remediation strategy: Spiro switches to a more reliable supplier for its battery packs and puts better measures in place to review quality analytics, so it can act before another issue occurs. Thanks to Anna, Spiro's business is once again booming.

When deploying content in Power BI, it's important to ensure the data is safe and that change is handled efficiently; that's why analysts make use of structured deployment. Over the next few minutes, we'll explore Power BI's deployment pipelines for streamlined project management. In this video, you'll learn about Power BI's deployment pipelines, recognize the importance of separate environments, and explore how to enhance data security through structured development.

Over at Adventure Works, Lucas has been tasked with using Power BI service to improve the company's development process. He must ensure that the data of all new content deployed to the workspaces remains accurate and secure during the report development stages. Let's help Lucas achieve this.

Deployment pipelines in Power BI help content move smoothly through development, testing, and production stages, allowing for controlled testing and validation of content before it reaches end users. Let's explore these three stages of deployment in more detail.

First, we'll examine the development environment. Here, developers can add new content without changing current reports. This is the first step in the deployment process, where developers create and modify Power BI reports; any errors or issues at this stage have no impact on the existing production data. For example, Lucas improved a sales report by adding a new visual in the development stage, ensuring it matched branding guidelines.

Next, let's explore the test environment. This is where a small group of testers review and test new reports for issues before they're used in production, providing feedback and checking for bugs and data problems. Here, reports are validated for accuracy, performance, and any potential bugs before moving to the production environment. For example, Lucas can move his new visual from development to the testing phase, allowing the testing team to check the accuracy and performance of the new visual.

Lastly, we'll investigate the production environment. Once new reports and features are tested, they're ready to be used by end users in the production environment.
This is the last step in the process. For example, once Lucas's new visual has been validated through testing, it is moved to the production environment, where users and stakeholders can use the new feature. Note that not all three environments must be included in a deployment pipeline; for example, the testing phase could be excluded if it's not considered necessary.

There are several benefits to a structured development life cycle. By having distinct environments, you can ensure that unvetted changes do not corrupt the production data. A structured life cycle allows for comprehensive testing, ensuring that the data remains accurate and reliable, and deployment pipelines provide a streamlined process for managing changes, enabling better control over the development process.

Let's find out how a structured development process helped Adventure Works in a real-world example. Lucas improved a sales report by adding a new visual in the development stage, ensuring it matched branding guidelines. After moving it to the test environment and thorough validation, the report went to production. This example showcases how Power BI's deployment pipelines ensure a smooth and accurate transition of content, benefiting data accuracy and decision-making at Adventure Works.

Using Power BI's deployment pipelines for a structured development process ensures safe data handling. In this video, you've learned about Power BI's deployment pipelines, the importance of separate environments, and enhanced data security through structured development. With Power BI's deployment pipelines, you can effectively manage changes with separate environments, allowing for accurate and secure sales data while reducing risks and improving control and efficiency.

It's important to catch potential errors in your pipelines to ensure your data is accurate for end users. With Power BI deployment pipelines, you can catch these errors and ensure a smooth transition from development to production. In this video, you'll learn how to access and configure a Power BI service deployment pipeline, how to allocate existing workspaces to their respective environments, and how to oversee and monitor deployment history and settings.

A minor error in Power BI report development could mislead end users, so Lucas needs to use deployment pipelines to ensure changes are tested, enhancing reliability and efficiency. Let's guide Lucas through this process.

Access the deployment pipelines icon in the left navigation pane on the Power BI service homepage; on smaller screens, you might need to select the More (ellipsis) button in the navigation pane to locate and select deployment pipelines. An introductory screen describing the pipeline capabilities appears. Select Create a pipeline to begin streamlining the data processes. The Create a deployment pipeline window appears on screen; enter Sales pipeline as the pipeline name and "sales reports deployment pipeline" as the description, then select Next.

Three default environments appear on screen. You can add more environments by selecting the Add button and naming them, and you can remove environments by selecting the bin icon. For this example, let's keep only the development and production environments. We're now on the deployment pipeline page. Note that the workspaces assigned to the environments must be created beforehand; in this case, the main workspace we've been using has been renamed to Adventure Works Sales Development.
Highlight it in the development environment and select Assign workspace. Next, select the newly created Adventure Works Sales workspace in the production environment and assign it. After assigning both, a warning pop-up appears, indicating differences in content between the two environments. Once the changes made by users in development have been approved, they can be deployed to the production environment, where end users have access. Select Deploy to begin the process. A green tick appears at the end, indicating that the two environments are now synced and no new changes are waiting to be deployed.

Several important features of the pipeline appear in the top ribbon. From the ribbon, you can adjust the pipeline settings, manage access to the environment, and view the deployment history. The history contains important information, such as the deployment user, the number of items deployed, and the final process status.

Lucas has improved Adventure Works's sales reports, and you can do the same by setting up a deployment pipeline to ensure smooth transitions from development to production, minimizing errors and enhancing data integrity. In this video, you learned how to access and configure a Power BI service deployment pipeline, allocate existing workspaces to their respective environments, and oversee and monitor deployment history and settings.
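Deployment pipelines also have a REST API, so pipeline creation and deployments can be scripted. The following is a hedged sketch, reusing the assumed `API` and `HEADERS` values, that creates a pipeline, assigns a workspace to a stage, and deploys everything from one stage to the next; the stage order and IDs here are illustrative placeholders.

```python
import requests

# Assumes API and HEADERS from the earlier sketches.
resp = requests.post(f"{API}/pipelines", headers=HEADERS,
                     json={"displayName": "Sales pipeline",
                           "description": "sales reports deployment pipeline"})
resp.raise_for_status()
pipeline_id = resp.json()["id"]

# Assign a workspace to a pipeline stage (0 = first stage, e.g. development).
resp = requests.post(
    f"{API}/pipelines/{pipeline_id}/stages/0/assignWorkspace",
    headers=HEADERS,
    json={"workspaceId": "<development-workspace-guid>"},  # placeholder
)
resp.raise_for_status()

# Deploy all content from the source stage to the next stage.
resp = requests.post(
    f"{API}/pipelines/{pipeline_id}/deployAll",
    headers=HEADERS,
    json={"sourceStageOrder": 0,
          "options": {"allowCreateArtifact": True,
                      "allowOverwriteArtifact": True}},
)
resp.raise_for_status()
print("Deployment started.")
```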
Maintaining a workspace often requires updating its components; however, an update to one component could affect multiple others. With lineage view and impact analysis, you can understand how your components are related and how changes impact the workspace. In this video, you'll learn about the core concepts of data lineage and impact analysis and the functionality and benefits of the lineage view, and you'll explore the impact analysis feature and its role in data management.

Over at Adventure Works, Lucas needs to update the SQL server his workspace depends on; however, several other workspaces also depend on this same server. Lucas must determine which components rely on this server and how they'll be impacted by the changes he makes to it. You can help Lucas by working with him to incorporate lineage view and impact analysis into his workflow. Let's begin by understanding what these terms mean.

Lineage view simplifies data tracking by showing data's journey from source to destination. It visually connects data elements by revealing the relationships between data sets, data flows, reports, and dashboards. These data elements are presented using a parent-child relationship, which shows how data elements are connected in a sequence: parents are the starting points, and children follow as subsequent steps in the data journey. This provides a clear picture of the connections between the data in your workspace. Lucas can use lineage view to manage his workspace by identifying and updating outdated data sets, ensuring that his team works from the most recent and accurate reports.

Another valuable tool in Power BI is impact analysis. Impact analysis complements lineage view: it helps you understand how changes in your workspace affect different components and provides an overview of how data is used. This feature helps you make informed decisions when modifying data. Your data sets are intertwined with your reports, workspaces, and dashboards, and a change to one asset can affect multiple others. Once you understand how changes impact your workspace, you can inform the rest of the team and ensure everyone can use the updated data effectively.

Now that you're more familiar with lineage view and impact analysis, let's explore how Lucas can incorporate them into his workflow. When you log into a workspace, you are presented with the default list view, which displays workspace items such as reports and dashboards. To switch to the lineage view, select the lineage view icon; this view is only available to the Admin, Contributor, and Member roles.

In lineage view, you can explore the relationships between all your workspace's content. For example, in the Adventure Works Sales workspace, a SQL Server database serves as the data source for both data sets in the workspace, reports have been created from both data sets, and both reports have visualizations pinned to a single dashboard, the sales dashboard. Selecting any component brings up a window with its details on the right-hand side of the screen. Select the SQL server, as this is the component to be modified. Selecting this component brings up information such as the server and database name, the privacy and authentication methods, and the status of the gateway, which indicates that the connection is currently active. Select the X icon to close the window. Data sets also display their last refresh date and time, and you can refresh a data set on demand by selecting the refresh button.

This is the basic lineage flow in a workspace. Workspaces with larger data pools are more complex: various reports could stem from a single data set, generating numerous end dashboards. The Show lineage button on every component is helpful in these situations; you can select the arrow to highlight the entire lineage flow.

The most important feature of the lineage view is impact analysis. Select the screen icon on any lineage component to open the impact analysis window; in this instance, select the Adventure Works SQL Server data source. The impact analysis window displays all components a SQL Server data source change affects. The affected components are referred to as child items, while the asset you modify is the parent item. In this instance, modifying the Adventure Works server, the parent item, would impact six child items spread across three different workspaces. You can also view the list of child items by type or workspace by selecting the buttons on the right. Before you modify the server, you need to notify all team members impacted by your actions. You can use the Notify contacts feature to message all affected individuals, and you can also add a note to describe the impact.

In this video, you learned about the core concepts of data lineage and impact analysis, the functionality and benefits of the lineage view, and the impact analysis feature and its role in data management. Lineage view and impact analysis in Power BI boost data management: you can easily track data history, keep data updated, and understand changes and their effects. These features make decision-making smarter and data management smoother.

You interact with many different assets in your workspace, and it's important that they can be accessed quickly. However, some assets, like reports, can take longer to load the more you use them. Luckily, Power BI offers a caching feature you can use to optimize your workspace's performance. In this video, you'll learn about the fundamentals of query caching in Power BI, how caching interacts with import mode, and the application of caching.

Adventure Works's data analytics team has been using the marketing campaign report heavily, and as a result of all these changes, the report takes longer to load each time it's accessed. The team needs to make use of caching to improve the report's performance. Let's find out how.
Caching is the process of temporarily storing query results. This enhances performance by minimizing the time and resources required to fetch regularly accessed data. For example, the analytics team queries the marketing campaign report hundreds of times daily, and each query involves retrieving and processing significant data from the database. This can strain the system and slow down the reporting process. Caching helps by saving frequently requested data, like the marketing campaign report, so it doesn't need to be fetched from the database every time. This speeds up the analytics process and reduces strain on the system.

There are many benefits to query caching. First, it offers faster performance: with caching, reports and queries return faster, especially for frequently used, static data sets. It also preserves bookmarks and filters, so they don't need to be reapplied or reset each time a query is run. Caching also offers personalized data access: each user receives their own cached query results for a personalized experience. Query caching also follows all security rules, which means that caching maintains data security without compromising compliance. And lastly, caching reduces the computing load on your workspace, saving resources.

However, query caching has certain limitations. It is exclusive to import mode and not applicable to DirectQuery and live connection modes. Not all users have access to query caching: it is only available with a Power BI Premium or Embedded subscription. There are also other potential limitations: clearing the cache when switching from on to off can cause a brief delay for on-demand queries, and during data set refreshes, the query cache updates and may impact performance under high query volumes.

Now that you're more familiar with query caching, let's help the Adventure Works data analytics team use this feature to improve their report's performance. First, open the Adventure Works sales data set where the report is located. This report is used often, which affects its loading speed, so it's a good candidate for query caching. To use query caching, hover over the marketing campaigns report data set, select the ellipsis symbol, and choose Settings from the options. In the Settings menu, navigate to and expand the query caching options. Query caching is turned off by default; to enable it, select On and then select Apply. This caches all bookmarks and filters on the initial report page, and the report will now open faster.

If you try to disable query caching, a pop-up appears warning that turning off query caching will result in saved queries being deleted; the next time someone opens the report, they may experience a slight delay during their first use. This applies to both options with query caching disabled.

In this video, you've learned about the fundamentals of query caching in Power BI, how caching interacts with import mode, and the application of caching. Using query caching in Power BI improves report speed and resource efficiency, streamlining your data analytics journey. It's a smart way to optimize performance.
Maintaining uninterrupted service connectivity in Power BI is important for timely and accurate data analysis. By understanding the most common connectivity challenges and how to troubleshoot them, you can perform analysis without issue. In this video, you'll learn about the most common connectivity issues in Power BI, how to rectify refresh failures caused by credential modifications, and the process of configuring notification settings for multiple users.

Over at Adventure Works, Lucas has been alerted to a supply chain optimization project report that failed to update because of a credential change. To troubleshoot this issue, he must fix the schedule. He also needs to add another team member to the notifications in case the updates fail again when he's unavailable. Let's help Lucas fix the report and ensure that Adio is notified the next time there's a problem. But before we do, let's learn more about troubleshooting service connectivity issues.

Power BI service connection problems can lead to data set refresh failures with various causes. To fix them, a clear troubleshooting plan is needed: checking the gateway configurations, resolving data refresh issues, and ensuring the data source settings are correct. By following this process, users can improve service connectivity, leading to smoother data analysis in Power BI. It's also important to correctly set up notification settings to alert the right people about refresh failures, ensuring quick action can be taken to resolve any issues. Let's start by exploring some of the most common connectivity issues.

As you've just learned, most connectivity issues in Power BI fall under three main categories. The first we'll explore is gateway configuration. The first step is to check the gateway connectivity status by verifying that a gateway connection is active and running for your data sources. The next step is to ensure you've selected the correct gateway: choosing the correct gateway facilitates a reliable connection to your data sources, ensuring that your reports and dashboards have the most accurate and up-to-date information. You must also check that you're using the latest gateway version; an updated gateway ensures a solid connection between Power BI and your data sources.

Another category is data refresh issues. This can include issues like unsupported data sources that do not support refresh operations; understanding the nuances of these data sources and rectifying such issues is essential for ensuring that your reports reflect the most current data. It's also important to perform a scheduled refresh check. Testing the accurate configuration of the scheduled refresh is vital in preventing data latency; a well-configured scheduled refresh guarantees that your data is updated regularly and that the insights derived from your reports are based on the latest available data.

Finally, there are data source settings. One example is data source misconfiguration: addressing any misconfigurations in your data source settings promptly ensures uninterrupted data retrieval, as a malfunctioning data source may prevent the connection with Power BI, blocking the refresh processes. There's also credential verification: verifying the credentials for your data sources helps prevent unauthorized access and resolve connectivity issues. Ensuring the credentials are accurate and up-to-date is fundamental for maintaining a secure and reliable connection to your data sources.

Let's discover how these issues can be solved by taking a few moments to help Lucas troubleshoot his Power BI connection. Navigate to the supply chain optimization project workspace to address the data set that failed to refresh. A red exclamation mark next to the Refreshed column indicates that the refresh failed to complete. Select the warning icon to view details of the error in the report settings menu.
Let's troubleshoot this error. Scroll down and check the gateway and cloud connection options. Verify that the personal gateway is running on the database and does not pose an issue with the connection between the data source and Power BI. The next set of options, data source credentials, states that the data source failed due to incorrect credentials; this is the cause of the connection issue. Select Edit credentials to fix this and enter the new login credentials. Leave the rest of the settings as they are and select Sign in. The connection has now been reactivated. Scroll down to the refresh settings, expand the options, and select On to enable a daily refresh. In the next section, check the contacts box and add Adio to the contacts list. Adio will now be notified if a refresh failure occurs again in the future. In this video, you learned about the most common connectivity issues in Power BI, how to rectify refresh failures caused by credential modifications, and the process of configuring notification settings for multiple users. By rectifying credential errors, reconfiguring scheduled refreshes, and ensuring the right individuals are notified about refresh failures, you'll ensure the accuracy and timeliness of your data.
Congratulations on reaching the end of these lessons on deploying assets. During these lessons, you explored creating, monitoring, connecting to, and maintaining workspaces and data sets in Power BI. Let's take a few minutes to recap what you've learned so far.
You began the first lesson by exploring the concept of a workspace. You learned that a workspace is a specialized area in Power BI that holds important assets like data sets, reports, and dashboards. Its advantages are that it helps organize assets for easy management and provides security through access control, as only permitted users can access workspaces. A workspace also enables collaboration: teams can use them to build reports, and workspaces let analysts update or modify data quickly. There are two types of workspaces in Power BI. The first is a personal workspace, which you can use to store your own personal content. The second is a shared workspace, where a team can collaborate on reports and dashboards. Always follow best practices in your workspace, like performing regular cleanups, establishing clear naming conventions, safeguarding your data, regularly backing up your work, and seeking feedback from your team on improvements that could be made to the workspace.
The process of creating a workspace is very straightforward: a workspace can be created by selecting the new workspace option from the Workspaces tab in Power BI. When creating a new workspace, you must consider workspace roles, which determine who can perform each task. Workspace roles include the following: viewers can view content but can't modify it; contributors can add and modify content; members can alter content and add new members; and admins have full control over the workspace, its assets, and its members. You can manage these roles using Power BI's manage access feature. During this lesson, you also created a shared workspace for Adventure Works where Lucas's team could collaborate on reports.
In the next lesson, you learned how to monitor workspaces. This involves tracking how reports and dashboards are accessed, used, and shared within a workspace. By monitoring a workspace, you can measure its impact and make changes to increase its usefulness. Monitoring is performed through usage metrics and monitoring reports. These reports provide details like how a report was used or an overview of a report's performance.
You can create a usage metrics report in a workspace from a report's options list. There are also slicers that can filter the report data. Power BI automatically creates a usage metrics report data set when you create a usage metrics report; the credentials for accessing this report must be carefully managed so that it can be refreshed and accessed as required.
In the third lesson, you explored the topic of data sets and gateways in Power BI. A data set is a collection of data you import or connect to; it can come from one or multiple sources. The captured data forms the basis of your reports, and it must be the latest available information to ensure that your reports are accurate. You can use a data refresh to keep data current. A scheduled refresh is a routine that refreshes an entire data set at specified intervals. You can configure a refresh by selecting the scheduled refresh feature from your report's options; ensure you enter the correct details and credentials so Power BI can access the report. An incremental refresh updates only the parts of the data set that have changed, which is a more resource-efficient alternative. You can configure an incremental refresh from Power Query Editor; this involves creating two parameters that determine when the refresh begins and when it ends.
Promoting and certifying data sets lets you inform your team where to access the most current and reliable data. Promoting a data set indicates you trust its content and it's ready for use; certifying a data set states that it meets the company's highest standards. You can promote and certify data sets from Power BI's endorsement and discovery menu. You also explored establishing a secure, reliable connection between your on-premises data and Power BI service using data gateways. These gateways enable you to perform a data refresh or query execution securely. There are three types of gateways in Power BI: the on-premises data gateway, the on-premises data gateway (personal mode), and the Azure virtual network (VNet) data gateway. Which gateway you choose depends on your organization's setup and its data management and security requirements. You also practiced your new skills with an exercise in which you configured a data set for Adventure Works, worked through a knowledge check that tested your understanding of these topics, and explored Microsoft Learn articles on data sets and gateways in an additional resources item.
In the fourth and final lesson, you learned how to maintain workspaces and data sets. You began the lesson with an overview of development life cycles. Power BI contains deployment pipelines that help move content through the following life cycle stages: development, in which new content is added; testing, in which content is reviewed for issues before it's used in production; and production, when reports and features are deployed to end users. The benefits of a structured development life cycle include data safety, data integrity, and efficiency and control. You can access the deployment pipeline in Power BI from the navigation pane; this feature can create, customize, and manage pipelines or environments. Another useful feature for maintaining your workspace is the lineage view, which simplifies data tracking by showing the data journey from source to destination with all the connections in between. Impact analysis helps you understand how changes to your data can affect different assets in your workspace. You can alternate between these views in Power BI.
You've now reached the end of this summary. It's time to move on to the module quiz, where you'll test your knowledge of the topics you've covered. Best of luck!
Data analysts often find themselves working with sensitive data, so they need to think about the responsibility of handling such information safely. In this video, you'll learn how to identify sensitive data and review measures that can be taken to protect it. At Adventure Works, a data breach could lead to legal trouble, loss of trust, and a competitive disadvantage. Safeguarding sensitive data is important for protecting the company's reputation and success, and data analysts must handle it with care.
So how do we tell the difference between regular data and sensitive data? Sensitive data contains important information about a business or its stakeholders that, if mishandled, could cause harm or misuse. Here's a simple rule: if it's information that could damage the company's reputation, finances, or stakeholder privacy, it's sensitive data. For example, general sales figures for a particular region might be considered regular data, but a detailed list that breaks down customer details, financial records, employee information, or even proprietary business knowledge is sensitive data. Any information that offers intimate knowledge not meant for circulation can be classified as sensitive.
Mishandling sensitive data can have multiple serious consequences, at both the business and employee level. For example, suppose an email containing sensitive product designs for Adventure Works' next big launch is inadvertently sent to an external vendor. This mishap could give competitors an advantage or lead to legal problems if the designs were patented. Also think about the impact of an employee's personal data leak: it could breach privacy laws, resulting in fines, and harm trust between employees and management. One mistake can bring financial losses, legal troubles, and brand damage. As you navigate the world of data, it's important to be equipped with a security toolkit. Let's explore the various measures that can be implemented to ensure data remains in safe hands.
Before a user can access a report, they need to prove that they are who they say they are. Adventure Works operates globally, so everyone accessing the Power BI platform must be verified. An authentication system requires users to input a unique identifier, ensuring only authorized personnel can access data. Once a user is authenticated, the system determines what data they are permitted to access; this protects Adventure Works from internal leaks and unauthorized external breaches. In Power BI, you can define roles for users, and each role has specific permissions tied to it. Since employees within Adventure Works have varied job functions, Power BI allows roles to be customized, ensuring data is distributed on a need-to-know basis. For instance, a product management analyst role might be permitted to see inventory level reports, while a human resources analyst can access employee reports. Regularly reviewing and updating these roles is essential to ensure they align with organizational needs and changes.
Another measure used to protect sensitive data is row-level security. Row-level security, or RLS, is like a detailed filter: users can view only the data rows they are supposed to, based on their role or identity. For example, a regional manager for North America at Adventure Works might only need to view sales data for North America and not Europe. RLS ensures specific rows of data in Power BI are shown only to authorized users, safeguarding regional strategies and preventing potential conflicts of interest.
Another measure used to safeguard data is encryption. Adventure Works' intellectual property, such as proprietary bicycle designs and vendor contracts, is invaluable. The company can use encryption to ensure that only authorized individuals can read this data. As data moves between systems or across the internet, it is susceptible to interception; encrypting it ensures that even if someone gains unauthorized access, they can't decipher the information. This helps protect business interests. As a global company, Adventure Works' data is often accessed from around the world, and encrypting data while it's being transmitted ensures it can't be accessed and misused.
Finally, there's also data masking. Data masking allows you to work with obscured versions of sensitive data, enabling you to verify transactions without risking financial security. It strikes a balance between transparency and security for Adventure Works. Sometimes you might need to work with data without knowing the exact details; in these instances, you'll need the technique of data masking. For instance, you might need to verify the last four digits of a customer's credit card without seeing the whole number.
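To make that credit card example concrete, here is a minimal DAX sketch of a masked calculated column. The Customers table and its CreditCardNumber text column are hypothetical stand-ins for whatever your model actually contains, and true masking is usually better done upstream in Power Query or the source database:

Masked Card =
"XXXX-XXXX-XXXX-" & RIGHT ( Customers[CreditCardNumber], 4 )

-- Builds an obscured value that keeps only the last four digits,
-- so visuals can confirm a card without ever displaying the full number.

If the original column is then hidden from report view, report builders can verify transactions against Masked Card alone.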
Data is powerful but carries great responsibility. In Power BI, every data point represents Adventure Works' commitment to its global community. You should now know how to describe sensitive data and understand the measures that can be taken to protect it. Protecting data preserves trust in the company's vision; your choices today shape tomorrow's outcomes.
As a data analyst, you'll often need to send very large files to other people. Fortunately, you can use Power BI's link sharing feature to grant access to reports without transferring large files or losing their interactivity. In this video, you'll explore sharing a URL in Power BI service, the different types of links, and how to generate a URL or link to share a report. At Adventure Works, data analysts are constantly building useful, dynamic reports, and Power BI's link sharing feature allows them to quickly distribute these reports to multiple teams with a simple link. Let's find out more about how this works. In Power BI, when you share a link, you're essentially giving someone a URL to access your report or dashboard directly in a web browser. A link is fast, efficient, and doesn't require downloading large files; however, it does pose security risks, which means that access must be carefully managed. Power BI offers different sharing options for links. Let's explore some of these.
People in your organization: Suppose you've built a report on Adventure Works' yearly sales trends and want to share it with the whole sales team. When you select this option, anyone with an Adventure Works email can open the report using the link, so only those within the organization can view those insights.
People with existing access: Suppose you've shared a report with the product management team, perhaps containing confidential information about a new touring bike prototype. With this option, only those you've already permitted can view the report. Others at Adventure Works won't be able to view it even if they find the link.
Specific people: In certain situations, a specific person may need access to a report tailored to their project. With this option, you can ensure that only the individuals you explicitly mention can view the report; no one else can access it unless you permit them.
However, configuring who can access the link is just as important as configuring what the individual can do with the data the link provides. Configuring data protection is vital; failure to do so could result in unauthorized access to sensitive customer and employee data, leading to legal issues, privacy breaches, and a tarnished reputation. Sharing permissions are a vital tool for protecting data: they safeguard your data by determining who can access it. In large companies like Adventure Works, these protections are crucial. Let's explore two common sharing permissions in Power BI: re-share and build permissions. Data and insights must move between departments in big companies like Adventure Works. Re-share permissions let people share with others, which can be great for distributing important information quickly, but it can also cause problems: each time content is shared again, the original context can get lost, leading to misunderstandings or the wrong people accessing the data. Build permissions let others use the data you've shared. Recipients with build permissions can merge data as needed for richer analyses, but they can't change the core data. However, using this power wisely is essential to avoid cluttered, less useful reports.
Now let's demonstrate how you can generate a link to share using Power BI. First, navigate to Power BI service. On the left sidebar, select Workspaces and select the specific workspace where your desired report is located. Browse through the list of reports and select the title of the report you wish to share. This opens the report and provides a live, interactive view of its contents; it's always good practice to review the report before sharing to ensure it's the correct one. Towards the top left corner of the screen, locate and select the Share icon, which resembles an arrow. The Share button provides different mechanisms for report distribution. In the window that opens, just above the email address field, select the "People in your organization with the link can view and share" option, then choose the "People in your organization" permission level from the available options. Ensure you uncheck the option to allow recipients to share your report; by toggling this off, you ensure that the content is viewed only by its intended audience. Once you have selected the desired permission level, select the Apply button. Near the bottom of the Send link window is the Copy link button, depicted by a paperclip icon. When you opt to share via a link, Power BI generates a unique URL that directs users to your report; by copying this link, you're grabbing the address of the live version of your report. Once copied, you can paste and share this link just like any other web link. When a user clicks on it, provided they have the required permissions, they'll be directed to the report on Power BI service, where they can interact with it live. Remember always to consider the sensitivity of the data when selecting an option.
Next, let's configure build permissions for the report's data set. Access your data set from the workspace, hover over the record, select the ellipsis (three dots) to the right of the data set's name, and select Manage permissions. In the Manage permissions pane, select Add user, then input the names or email addresses of the users or groups you want to grant build permissions to. In the permissions dropdown, select the option to allow recipients to build content with the data associated with this data set.
This allows users to create new reports or visuals based on the data set; coupling it with re-share ensures they can distribute their creations to others. To restrict re-sharing, simply uncheck the re-share option. After configuring the permissions as desired, select the Grant access button. Having explored sharing via links, you should now be familiar with sharing a URL in Power BI service, the different link types, and generating a URL or link to share a report. Links and their related permissions are instrumental for sharing your reports safely.
In the business world, data is power, but it must be handled responsibly. Data analysts often work with sensitive client and employee data, which must be safeguarded carefully. Fortunately, they can use Power BI's data sensitivity labels to protect this information. In this video, you'll learn how to identify data sensitivity labels and how to work with them. At Adventure Works, customer and employee information needs to remain confidential. Lucas has just completed a new sales report; this data is confidential, so it's important that he labels the report accordingly. Let's learn more about data sensitivity labels and how Lucas can use them to categorize data. Power BI's data sensitivity labels allow you to categorize data and safeguard the company's reputation and trust. They act like digital tags showing the level of confidentiality data requires, guiding users on how to handle data responsibly. These labels are part of a security system that spans Microsoft's products; when you apply them in Power BI, you set the data's sensitivity level. Properly using these labels ensures data protection, especially when sharing or exporting. There are six categorizations of data sensitivity labels in Power BI: personal, public, general, confidential, highly confidential, and restricted. Let's learn more about these labels by exploring how Adventure Works makes use of them.
From the left sidebar of Power BI, select Workspaces, then select the workspace that contains the report or dashboard you wish to configure; in this instance, you need to configure Lucas's sales report. Inside the workspace, choose the sales report. With the report open, select the title at the top of the screen and access the sensitivity label drop-down. If you haven't applied a label before, you might find that the label reads "none" or "no label" in a faded gray color, signaling its dormant state. Select the sensitivity label drop-down to show the range of available options, and select Confidential for the current report. Let's take a moment to review these labels.
The personal sensitivity label denotes data linked to specific individuals but not intended for the wider organization. For example, a junior data analyst might share information with a senior data analyst; this information is valuable but doesn't need to go to the entire company. Adventure Works often creates content for a wide audience, including customers, stakeholders, and the public; this content is labeled as public. For example, a brochure showcasing Adventure Works' new bike range for an exhibition is intended for wide distribution without any restrictions. The general sensitivity label is for information meant for the broader internal audience without specific sensitivities, like Adventure Works' monthly newsletters, which cover company events and other general news. This information is for all employees, not external stakeholders, and the general label keeps it freely accessible within the company.
The confidential label deals with sensitive information shared across departments. This label is for valuable data that needs careful handling and isn't intended for everyone, like Power BI reports shared between data analysts. The highly confidential label safeguards Adventure Works' critical innovations. It's for essential, sensitive data like research into new products or markets; this label ensures limited access, protecting valuable information for project insiders. At the highest level of data sensitivity is the restricted label. For Adventure Works, it means maximum secrecy and caution. It's for data that requires extensive protection, like top executives discussing mergers, acquisitions, or critical contracts. The restricted label keeps this monumental data secret, accessible only on a need-to-know basis.
Now that you know the different labels, let's label the sales report. Select Confidential for the current report. The selected label appears near the report's name at the top of the screen, signifying that you've successfully labeled your report. In this video, you learned how to identify sensitivity labels and how to work with them. Not all data is the same; certain data must be treated more carefully than others, so use tools like data sensitivity labels to protect the integrity and confidentiality of your data.
Many people think sensitive data leaks only happen because of a targeted attack by cyber criminals, but sometimes unintentional internal leaks can be just as damaging. Meet Daniel. Daniel has been part of the Adventure Works team for the last three years as an IT specialist. Daniel's life is busy and, with his first kid on the way, increasingly expensive. While he's happy at Adventure Works, he sometimes wonders if he could earn more working elsewhere. One day, Daniel answers an IT help desk call from Maya on the payroll team. Daniel has never met Maya, but he's happy to help when she reports a problem opening Microsoft Excel attachments. After a few minutes of troubleshooting, Daniel has no success, so he asks Maya to send him an example of one of the attachments to check if it works from his side. Maya is anxious to get the issue resolved and, without thinking, she sends him the top email from her inbox, which happens to be from HR. When Daniel opens the attachment, he discovers that it's a complete list of salaries for all Adventure Works employees. He's a bit surprised to see this, but he closes it down and helps Maya adjust some of her Trust Center settings. She verifies that this resolved the issue, and they end their call.
Daniel continues his work, but before he logs off for the day, curiosity gets the better of him. He knows he shouldn't, but he reopens the attachment he received earlier from Maya. He accesses the tab labeled "IT department" and sees his name and salary; no surprises there. He spots some names from the management team and is shocked by what some of them earn. Maybe he should consider management. Then he notices some other names: colleagues on the same team as him, friends. He can't resist looking at their salaries. Some are on a pretty similar pay scale to him, but other team members earn significantly more per month. He's got no idea why this might be, and he's not happy. He closes the spreadsheet, logs off, and heads home. Later that night, Daniel can't stop thinking about the salaries he saw. It seems so unfair that people doing the same work as him earn more, and some only joined Adventure Works in the past year, while Daniel has been there over three years. However, the spreadsheet's information is limited and doesn't tell the full story.
The people on the list with higher salaries hold advanced qualifications that justify their higher pay, and Daniel is in line for a promotion and a sizable salary increase next month in recognition of his hard work. He has a bad night's sleep and is not in a good mood when he arrives at the office the next day. While he's grabbing a much-needed cup of coffee, he bumps into Katie and confides in her about the salary information he saw the day before. Katie is annoyed too. Later that day, she tells Caleb, who then tells Sam, and so it continues. Word is spreading, and employee engagement has taken a hit. Daniel and Sam decide they've had enough of feeling undervalued, and they accept slightly better paid positions with another company. Katie, Caleb, and the others have stayed where they are, but they are not feeling very motivated. With reduced headcount and disengaged staff, the rest of the company has noticed that the quality of service from the IT help desk is slipping. Such a simple mistake could have been avoided if HR had used sensitivity labels with encryption settings on their sensitive files. Even if Maya had still inadvertently shared the Excel file with Daniel, he would have been denied access to the file due to insufficient permissions. Life at Adventure Works would have carried on normally, and Daniel would have received his much-deserved promotion.
Data helps businesses generate insights, make decisions, and succeed. However, not everyone in the business needs access to all of its data: sensitive data must be safeguarded with data permissions. In this video, you'll learn about the risks around sensitive data and how to evaluate and safeguard against them. Adventure Works relies heavily on data from sales reports to make decisions around its product lines; however, some of the Adventure Works sales reports also contain sensitive information on profit margins, which should be visible to senior leadership only. Let's look at how Power BI data set permissions can be used to restrict data access to only those who need it to perform their roles.
First, let's define what we mean by Power BI data set permissions. At the core of every data-driven organization lie its data sets, and data set permissions are the gatekeepers to them; they're like a series of digital locks and keys. These permissions ensure that the right individuals have the necessary keys to access specific data, striking a balance between accessibility and security. All employees of Adventure Works have their own designated roles, and data permissions act as boundaries, ensuring that everyone has access only to the data they need for their role. The available permission types are read, build, re-share, write, and owner.
The first permission type we'll explore is the read permission. The read permission in Power BI grants users the ability to view and understand data sets without altering the original content. For example, the marketing team at Adventure Works may need to look at the product sales report to analyze the effectiveness of marketing campaigns and promotions, but they don't need to alter this report. In this case, the read permission is sufficient: it permits access while minimizing the risk of unintentional data modifications, preserving data integrity.
Next, we'll explore the build permission. The build permission enables users to construct visuals, Power BI reports, and dashboards based on the available data without modifying the source data itself. At Adventure Works, the finance team, responsible for creating and maintaining the sales data sets, often finds that sales representatives and product managers, who have legitimate reasons to access the data, are unintentionally changing key financial figures while exploring the reports. This not only leads to incorrect financial analysis but also disrupts the finance team's workflow. By utilizing the Power BI build permission, the sales and product teams can format the data for analysis without the risk of inadvertently altering it.
Sharing information is central to collaborative environments like Adventure Works. The re-share permission enables users to distribute specific data sets or reports to other users or teams permitted to access this information. Before a product launch at Adventure Works, the finance team can use the re-share permission to share a tailored, read-only data set with the marketing team. This means the marketing team can optimize their advertising campaigns based on real-time sales data while the finance team safeguards the integrity of their financial reports.
Now we'll examine the write permission. The write permission in Power BI allows users to alter data: users with this permission have the authority to make modifications to the actual data sets. Adventure Works' product development and marketing teams need access to the company's sales and customer data. Granting the write permission allows these teams to not only view the data but also make specific updates and additions to the data set; for example, they can record customer feedback, update product specifications, and add marketing campaign results. This permission, when used cautiously, ensures that Adventure Works' data remains current and relevant. However, it comes with the caveat that any modification should be made with care to prevent misinformation.
Finally, we'll explore the owner permission. Much like a CEO overseeing every aspect of Adventure Works, having an owner of the business data ensures centralized data governance. The owner permission grants comprehensive control over data sets, encompassing the capabilities of all other permissions: owners can modify, share, build, and even restrict access to data. Owners ensure that the correct data is available to the correct people, safeguarding sensitive information while also fostering a culture of openness where needed. With overarching control, they are the custodians of the data's trajectory, ensuring it aligns with the broader vision of the organization. In this video, you've learned about the risks of sensitive data and how to evaluate these risks and safeguard data. These permissions promote data governance and integrity by ensuring that users only access the data relevant to their roles, leading to more accurate analyses and informed decision-making.
As a data analyst, you must ensure that your data sets are accessed only by relevant individuals and at the required permission levels, so it's important that you can configure data set permissions effectively. In this video, you'll learn how to add and manage permissions for a data set in Power BI. Adventure Works must share its sales report with the wider data analytics team; however, some team members must be assigned different data set permissions than others. Let's help Adventure Works assign permissions as required.
Upon successful login, navigate to the icons on the left-hand navigation pane, select the Workspaces icon, and select the Adventure Works workspace. The Workspaces pane is where all your current and future workspaces reside. Browse through the data sets to find the Adventure Works product sales data set; remember, each data set can represent different departments or analytical perspectives. Once selected, a new view appears on screen. This screen provides useful details about the data set, such as the current storage location, the date of the last refresh, and the existing reports and dashboards that currently use the data set. Find and select the File drop-down in the top left corner. When this option is selected, additional options appear, such as "Download this file" and "Manage permissions". From the drop-down, select Manage permissions; this option lets you oversee who can view or edit the data set.
A Links section appears on screen. These are sharable URLs that have been generated for this data set; they act as direct gateways for users to access the data set without navigating the entire Power BI interface. Each link outlines its creator, who has access, and the type of permissions assigned, allowing you to maintain a clear record of shared links so that old links can be retired or renewed as needed. Next to the Links tab, select Direct access. The Direct access tab enables you to grant access directly to a specific individual or group within Adventure Works. Here you will find the names of people and groups with access, their email addresses, and the type of permissions assigned.
Select the Add user button to add a new user. You can input email addresses or names, and Power BI will suggest matches from your organization. In this case, you need to give Adio, another data analyst, access to the report. Once you've selected Adio, you must assign permission levels by checking the box that corresponds to the desired permission level. For now, you just need Adio to be able to read the data set, so assign read permissions. You can add a personalized message explaining the reason for granting this access. Once you have selected Grant access, an email notification is sent to the user, and a new record appears in "People and groups with access", indicating that the user has been successfully granted access.
Next, you must remove access for the employee Kai, as he's no longer part of the project. To remove access for a user or a group, first locate their name in the "People and groups with access" section; each name is followed by details such as the permission level and the date access was granted. Next to each name is an ellipsis (three vertical dots) that reveals additional options when selected. Within this menu, locate the Remove access button. A confirmation pop-up appears; select Remove access. It's crucial always to be sure when revoking access to a data set, as it can result in delays in accessing critical reports and dashboards. Upon removal, the user's name disappears from the "People and groups with access" list; this immediate feedback confirms that the revocation was successful.
Finally, you need to grant write access to Lucas. Identify his name in the list and select the ellipsis to bring up the menu, then select Add write to assign the write permission. It's important to assign write access only to people with the necessary understanding and responsibility. You should now understand the process of granting and removing access for specified users in Power BI. These permissions help keep data in check and accurate by letting users access only the data they need for their roles, improving analysis and decision-making.
Data analysts often share sensitive data with people outside of the organization. This means the correct permissions must be assigned when sharing links to this information to keep it secure. In this video, you'll discover how to maintain data security and integrity when sharing information outside of your organization. Adventure Works needs you to share a Power BI sales report with a new partner. To prepare you for this task, let's explore the importance of maintaining the security and integrity of data when sending it to outside stakeholders.
When sharing Power BI reports externally, it's essential to protect sensitive data and respect privacy boundaries to prevent potential harm to the company and its stakeholders. This involves carefully controlling what information is shared and maintaining strict security measures. You can control this information using techniques like user licensing, sharing permissions, and row-level security (RLS), as well as data masking and anonymization, report embedding, and external sharing settings. Let's explore these techniques in more detail.
When sharing Power BI reports with external partners or vendors, it's important to ensure they have the right Power BI Pro licenses for smooth access. An Adventure Works admin can assign and oversee these licenses through the Microsoft 365 admin center; this requires ongoing monitoring to maintain compliance and prevent violations. Next is the use of row-level security. Using RLS is crucial, especially when sharing sales data with external vendors: Adventure Works can ensure vendors see only the relevant table data, keeping other sensitive information in the same table safe and inaccessible. We'll explore this more in a later lesson. Next, let's examine data masking and anonymization. To protect sensitive data, Adventure Works uses data masking and anonymization techniques. This involves replacing real data with fake or pseudonymous data in Power Query, allowing external partners to analyze trends without accessing Adventure Works' sensitive information. Another technique is report embedding. When Adventure Works shares Power BI reports externally, they choose embedding methods like "Publish to web" or embed codes. They use these options carefully, considering the data's sensitivity before deciding which one to use; this is important for keeping data confidential and limiting report access to the right people. These embedding methods allow you to add reports to external platforms while keeping control over who can see and access the data. Next are external sharing settings. To enable external sharing, Adventure Works adjusts their Power BI service settings, controlled by the Power BI admin. These adjustments include various configurations to maintain the company's security standards, such as authorizing users or groups for external sharing and setting content restrictions. They can also control a link's expiration time and mandate authentication for external users to access shared content. Lastly, let's examine the use of sharing links. Adventure Works boosts report security by creating safe links with clear permissions, making them a safer sharing choice. These links can have expiration dates and be limited to specific users, reducing the chance of unauthorized access. You can use these features to share a sales report with the new partner so that it can only view the required data. In this video, you discovered how to maintain data security and integrity when sharing information outside your organization.
As you explore and share data, always be sure that you retain its integrity and confidentiality.
Data analysts are often required to share sensitive data with multiple teams and departments, which can pose a problem if the wrong individual accesses specific data. Fortunately, you can use row-level security, or RLS, to ensure that your data remains both accessible and protected. In this video, you'll learn about the importance of maintaining data integrity, how to evaluate and safeguard against access risks, and how RLS regulates data access. Adventure Works needs your help to manage data access for its global team of employees and customers effectively. You can use row-level security in Power BI to tailor data access by region and role, ensuring data integrity and confidentiality company-wide. Let's explore the basics of row-level security and how you can use it to help Adventure Works.
We'll begin with an explanation of what we mean by row-level security. Row-level security, or RLS, ensures that only authorized individuals can access the right data, which helps preserve the security and integrity of your overall data sets. In other words, RLS controls who sees what data based on predefined roles and rules. It's especially important when many different actors are interacting with the same data: essentially, it ensures that each person can view only the data they need, and sensitive information is safeguarded.
Let's explore some of the advantages of implementing row-level security. RLS gives you precise control over who views what, which helps prevent accidental data leaks by safeguarding sensitive data from unauthorized users. As an organization expands, its data scales and increases in complexity; RLS makes it easier to handle these more complex data access needs, because you can establish new rules for accessing data without starting from scratch. Compliance and auditing play a vital role in any organization: RLS helps companies comply with data privacy regulations, and it simplifies auditing by keeping track of who can access what. For companies like Adventure Works, data breaches pose a significant threat, and RLS reduces this risk: even if someone unauthorized gets into a Power BI report, they can't see data they aren't assigned to, adding a layer of security against breaches.
While there are many benefits to row-level security, there are also several potential issues you could encounter if it's not managed correctly. Using security layers, especially dynamic RLS, can slow down data retrieval because it filters data in real time, so monitor performance, especially with big data sets, to keep things running smoothly. Row-level security also requires maintenance: regular checks and updates are important as roles and access needs change, so periodically review the RLS settings to make sure they still work well for your organization. And to ensure that the correct access is given to the correct individual, test RLS thoroughly when you set it up to confirm the rules work and grant the right access; regular testing helps prevent data leaks and keeps everything working as expected.
Next, let's explore the two kinds of row-level security: static and dynamic. Static row-level security in Power BI creates predefined rules to control data access based on user roles. It restricts users to specific data, ensuring that they only see information relevant to their roles. For example, suppose a new hire on your team has been tasked with analyzing sales of mountain bikes in North America; this means they should not have access to sales data for other products or regions. With static row-level security, you can establish clear rules that ensure they can only access data related to sales of mountain bike products in North America, as the sketch below illustrates.
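As a rough illustration, a static role for this new hire could combine both conditions in a single table-filter DAX expression. The Sales table and its Region and ProductCategory columns are hypothetical stand-ins for whatever your model actually uses:

[Region] = "North America" && [ProductCategory] = "Mountain Bikes"

-- Evaluated for every row of the Sales table; members of the role
-- see only rows where both conditions return TRUE.

The rule is fixed at design time, which is exactly what makes it "static": changing who sees what means editing the role itself.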
Dynamic row-level security in Power BI adjusts data access in real time based on user roles, permitting users to view only the data that's relevant to them at any given moment. Dynamic RLS uses DAX (Data Analysis Expressions) formulas and user roles in Power BI to filter data based on specific conditions; these conditions could include user attributes or affiliations stored in a database. For example, suppose your new hire has successfully analyzed sales of mountain bikes in North America, so they've been tasked with analyzing sales of mountain bikes in other regions. With dynamic row-level security, the system can adjust their access so the new hire can view sales data for specific regions as required. In this video, you've learned about the importance of maintaining data integrity, how to evaluate and safeguard against access risks, and how RLS regulates data access. You should now be familiar with the basics of row-level security and how it ensures that data remains accessible and protected. By using row-level security, you can ensure that each entity gets the correct data in the right situation.
As a data analyst, it's important to control access to your data so that others can only view information relevant to their roles. A useful method of safeguarding data is configuring security at the table row level. In this video, you'll learn how to configure static row-level security on a data set in Power BI. Your team member Adio Quinn needs access to the latest sales reports to analyze sales data from North America. Let's configure static row-level security so Adio can only view the data required to complete his task.
To begin, select the Modeling tab, then choose the Manage roles option. In the Manage roles section, you need to create a new role with the relevant permissions for Adio. Select the Create button to add a new role, then right-click on the new role and choose Rename. Rename the role "Marketing North America" to maintain structured, organized role management. Next, select the table you want to filter, in this case the Sales table, then right-click on the table name and select Add filter to specify which data rows this role can view. Choose the Region field from the drop-down list to add it to the table filter DAX expression area. The table filter DAX expression is where you define the limits of each role's data view; it's crucial to be precise about the data accessible to users in this role. With the Region field selected, input a DAX expression stating that the region's value should equal North America, as shown below. This expression ensures that Adio can only view North American data.
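The filter is a single Boolean DAX expression evaluated against each row of the Sales table. Assuming the column is named Region, it would look something like this:

[Region] = "North America"

-- Rows where this returns TRUE are the only rows visible to
-- members of the Marketing North America role.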
To verify that the expression works as intended, select the check mark icon in the top right corner of the Manage roles window. After creating your DAX expression, select Save to confirm your changes and establish clear visibility boundaries. Now you need to ensure that everything works correctly: select View as to test the configuration, choose the Marketing North America role, and select OK to view the data from a user's perspective and verify its accuracy. Once you've completed your check, select Stop viewing to exit the View as roles feature, and be sure to save your settings.
After saving your role definition, go to the Home tab and select Publish. In the Publish to Power BI dialog box, choose Adventure Works, the Power BI workspace you're currently working in, and click the Select button. Power BI publishes the report to your chosen destination; the time required for this process may vary based on the report size and your internet connection. A new dialog box confirms your report's successful publication. Access the Adventure Works workspace and locate the newly published report and data set, identifying the data set with the same name as your report. It's now available in the Power BI service and can be adjusted for user access. Select the ellipsis next to your data set's name to open a list of options and choose Security to display the row-level security settings. From here, you can assign user roles. In the row-level security settings, locate the role you created in Power BI Desktop, Marketing North America, then access the Members area and enter Adio's email address. This action assigns Adio to the role as a member and grants him access to North American marketing data. Next select Add, then select Save to enforce the role assignments, locking in the user access levels. If Adio attempts to access data outside of North America, he will see blank visuals, as he only has access to marketing data for the North American region. You should now be familiar with the steps for configuring static row-level security on a data set in Power BI. As a data analyst, it's your job to keep data safe and accurate, so make sure that you always configure static row-level security as required.
During a project, the roles and needs of your users may often change, which requires constant updating of data access permissions. That's a lot of work if you're using static row-level security; with dynamic row-level security, however, you can adjust data access automatically as roles change. In this video, you'll learn how to configure dynamic row-level security (RLS) on a data set in Microsoft Power BI, and how to assign, validate, and publish a report secured with dynamic RLS.
Access Power BI and open the Adventure Works product sales report. Locate and select the Modeling tab in the ribbon area at the top of the screen. On the Modeling tab, locate the Security group, and in this group select the Manage roles choice. A dedicated Manage roles window opens; this is the area where you can define and manage roles. Create a new role using the Manage roles dialog box and name it "Dynamic Sales Access". Now you need to apply filters. Select the role you just created, then locate and select the table you wish to apply a filter to, in this case the Sales table. Next, right-click on the table name, select Add filter, and select the Email field from the drop-down list to add it to the table filter DAX expression area. This area establishes visibility boundaries for each role, determining what data each user can view. You must now formulate a DAX expression that compares the table's Email column to the result of the USERPRINCIPALNAME function. The USERPRINCIPALNAME function fetches the signed-in user's email address, so the filter dynamically limits the user to rows of data that match their email address. For instance, Lucas, who works in sales and marketing, can only access data relevant to his marketing campaigns; this ensures he can't access confidential data from other business areas.
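A minimal sketch of that table-filter expression, assuming the Sales table stores each row's owner in an Email column, is:

[Email] = USERPRINCIPALNAME()

-- USERPRINCIPALNAME() returns the effective user's login at query time,
-- so one role definition serves every user without further maintenance.

In many models, the email actually lives on a related dimension table (for example, a hypothetical Salesperson table) rather than on the fact table itself, with the filter propagating through the relationship.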
To verify the syntax of your DAX expression, select the check mark icon on the top right side of the Manage roles window. If the expression is correct, select Save in the bottom right to confirm the change to the role. Once the role has been created and configured, it must be tested to ensure it works as required. Select the View as choice on the Modeling tab; this opens a View as roles dialog box. Select the Other user choice, enter Lucas's email address, and select OK. You can now view the data as if you were Lucas. If you are content with the validation, exit the View as roles mode by locating and selecting Stop viewing at the top of the window, and save your changes to ensure your created role is not lost. This ensures that all your configurations are stored securely.
After saving the role definition, select the Home tab and select Publish. In the Publish to Power BI dialog box, choose your current workspace and then the Select button. Depending on the size of the report and your internet connection, the publication process could take a few moments; a new dialog box confirms that your report has been published successfully. Next, locate the newly published report and data set; the data set can now be configured for user access. Select the ellipsis next to the data set name and select Security from the list. This displays the report's row-level security settings, with the role you created in Power BI Desktop shown in the left pane. Once the role is selected on the left, email addresses can be added in the Members pane on the right. Type in Lucas's email to assign him to that role and give him access to his specific data areas, then select Add and Save to enforce the role assignments, locking in the user access levels. You can repeat this process for other users as required. Adventure Works can now distribute the report knowing that its data is safeguarded, and you should now understand how to configure dynamic row-level security and how to assign, validate, and publish an RLS-configured report.
Searching for daily reports in Power BI can be a time-consuming task. Wouldn't it be great if they arrived automatically in your inbox at a set time each day? Thankfully, you can configure this with report and dashboard subscriptions. Over the next few minutes, you'll learn how to set up subscriptions to your reports and dashboards and review the advantages of this setup. Every morning, Lucas reviews his Power BI workspace for new reports and dashboards, which is a time-consuming process. By configuring subscriptions, he could have these assets delivered directly to his email. Subscribing to reports and dashboards in Power BI offers a wide array of advantages; let's take a closer look at those benefits.
A Power BI subscription is an automated delivery system that sends scheduled snapshots of your reports and dashboards as an email or a notification, turning a tedious manual process into a seamless, automatic one. One of the main benefits of subscribing to reports and dashboards is quick access to data: once there's a new update, you and all other subscribers receive an instant update or alert, ensuring that decision makers always operate with the most current data. With a subscription, Lucas can ensure that his sales and marketing insights are always drawn from the most recent reports and dashboards. Subscriptions also boost efficiency and productivity. Manually pulling up the same report day after day is tedious, but you can automate this process with subscriptions, letting your teams prioritize more important tasks and dedicate more resources to analysis and insight instead of wasting time fetching reports. With a subscription to the weekly sales dashboard, Lucas could receive the latest sales and marketing data every Monday at 6:00 a.m. sharp. Receiving regular reports also fosters a sense of routine and consistency in data consumption: with set delivery intervals, users can create structured time slots dedicated to data-driven assessments.
A shared understanding is key to effective collaboration. When multiple team members or teams subscribe to the same reports, it establishes uniformity in the information they base their decisions on: everyone is working from the same version of each report.
Now that you're more familiar with the benefits and uses of subscriptions in Power BI, let's configure a subscription for Lucas so he has quick access to the most up-to-date data. All your reports, dashboards, and data sets are listed in your workspace. Select the report you're interested in to open it. Once the report loads, navigate to the top toolbar and select the ellipsis next to the Edit button to open more options in a drop-down menu. From these options, select Subscribe to report. The Subscriptions pane appears on screen; you can use this pane to configure your subscription as follows. First, give your subscription a memorable name, especially if you plan to set up multiple subscriptions. Decide how often you want to receive this report: for example, should it be daily, weekly, or even monthly? Depending on your chosen frequency, set the specific time you'd like the report sent. If you want other colleagues to receive this subscription, add their email addresses here; remember, they also need access to the report to view it. You can also add a custom message to the email received when the report is sent. Once you've set up your subscription, select Save and close (or Save) to activate it. You'll then receive confirmation that the subscription is active, and depending on your settings, you'll begin receiving the report via email at your selected frequency. Select an existing subscription to view its details; you can modify, pause, or cancel your subscription from this menu. Lucas now has daily, automated access to sales and marketing reports and dashboards, giving him more time to analyze data and generate insights. You should now know how to set up subscriptions to your reports and dashboards and the advantages of this setup. With Power BI subscriptions, you'll work more efficiently, consistently, and faster, leaving you more time and opportunities to generate insights that help your organization achieve its goals.
Much of your daily work as a data analyst involves analyzing data to generate insights. But what if Power BI could generate and deliver these insights to you? With Power BI data alerts, you can receive automated insights that save time and effort. In this video, you'll explore the benefits of data alerts and learn how to set up an alert in Power BI. At Adventure Works, Lucas monitors and analyzes data for events like a spike in sales or a slowdown in production or shipping times. However, manually uncovering these insights takes time; it would be much more efficient to configure data alerts that flag these events automatically. Let's find out more about data alerts and how Lucas can use them for more efficient monitoring.
Data alerts are essentially automatic notifications set up within Power BI. They inform users when specific conditions or thresholds in a dashboard are met or exceeded, and they can be customized to cater to a range of business needs. Data alerts offer many benefits. A major one is real-time decision-making: data alerts notify data analysts immediately when specific metrics reach a predefined threshold. This instantaneous awareness means decisions can be made quickly, and organizations can adapt to real-time changes in the business environment.
At Adventure Works, Lucas can use data alerts to monitor sales spikes in Europe during marketing campaigns; this real-time insight allows the European sales team to quickly adjust strategies for maximum impact. Data alerts also help with efficiency and time-saving. Manually analyzing data takes time; by configuring data alerts that monitor important conditions, data analysts can direct their attention elsewhere, confident they'll be notified if something requires their attention. For example, Lucas previously spent hours checking website traffic following the launch of new marketing campaigns. Now, thanks to data alerts, he's instantly informed of significant traffic changes, which frees his time for other tasks. Instead of discovering issues after they've occurred and then seeking solutions, data alerts can notify stakeholders of potential problems before they escalate. For instance, an alert can be triggered if a manufacturing process at Adventure Works starts to slow, and the company can intervene immediately, before the slowdown impacts the wider production line. This proactive approach can mitigate risks and prevent minor issues from becoming major problems. Data alerts also ensure that all relevant parties are notified about important data-driven insights. For example, if Adventure Works launches a new marketing campaign in Germany, data alerts can notify the marketing and IT teams of surging website traffic. This synchronization ensures greater collaboration: the marketing team can assess the campaign's success while the IT team scales server resources. Finally, data alerts are highly customizable, letting different teams or individuals set alerts based on what's most important to their role or department. A sales manager might set alerts related to sales metrics, while a supply chain manager might focus on inventory levels. This personalized approach ensures that each stakeholder receives the most relevant data instead of unnecessary information.
Now that you're more familiar with data alerts, let's help Lucas set up alerts in Power BI. In your workspace is a list of reports, dashboards, and data sets. Select the report you're interested in to open it. Once the report loads, navigate to the KPI visual you wish to create an alert for. It's important to note that Power BI differentiates between reports and dashboards: dashboards are a collection of tiles, each representing a specific visual or piece of information. Alerts can be set only on tiles pinned from report visuals or Power BI Q&A, and only on gauges, KPIs, and cards. Hover over the visual you want to pin from your report to a dashboard, then select the pin icon. This action opens the Pin to dashboard menu, where you can select the dashboard to which you want to pin the visualization and even change its theme. A confirmation message appears once you've pinned the visualization; select the message's "Go to dashboard" option to view your pinned visualization. Move your cursor over the tile of interest; an ellipsis appears at the top right corner. Select it to reveal a drop-down menu with additional options for that tile, and select Manage alerts. This opens the core settings for alerts related to this tile. On the alerts menu, select Add alert rule; you can now define a new condition for alerts. A clear, descriptive name for an alert, like "Drop in shipping time", provides clear context. Next, choose a condition parameter, like Above or Below, and set a numeric value. This value becomes your trigger point: for instance, if shipping times drop below a set number, it'll trigger the alert. You can decide the alert's notification frequency depending on how critical the data is.
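Keep in mind that an alert compares its threshold against the single value a gauge, KPI, or card tile displays, and that value comes from a measure in the model. As a hedged illustration, a card backing the "Drop in shipping time" alert might display a DAX measure like the following, where the Shipping table and its date columns are hypothetical:

Average Shipping Days =
AVERAGEX ( Shipping, DATEDIFF ( Shipping[OrderDate], Shipping[ShipDate], DAY ) )

-- Averages the order-to-ship gap in days across all shipment rows;
-- the card shows this number, and the alert rule tests it.

An alert rule of "Below 3", for example, would then fire whenever the tile's value dips under three days.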
If it's a vital metric like manufacturing uptime, you might set up hourly alerts; for less urgent data, every 24 hours might suffice. Once you've configured the alert to your satisfaction, select Save. This activates your alert.

It's good practice to review your alerts regularly. To access your active alerts, just select Manage alerts again; you can view and manage your existing alerts from this menu. Frequently reviewing your alerts ensures that they're still relevant to your organization's goals, because outdated alerts might cause unnecessary distractions or lead you to miss critical insights. You should now understand the benefits of Power BI data alerts and be familiar with the setup process. Data alerts are a great tool for delivering automated, actionable insights that save you time, increase your productivity, and help you and your organization succeed.

Emily is the CEO, IT specialist, designer, head of HR, delivery driver, and chief coffee maker at Ecocraft Furniture. You name it, Emily does it, along with a small but close-knit team of other craftspeople. Ecocraft specializes in producing high-quality, sustainable furniture. Founded just two years ago, the company is already exporting its products to various countries across North America and Europe. The raw materials for Ecocraft's furniture, such as sustainably sourced wood and eco-friendly paints, are imported from different countries, which means transactions often take place in multiple currencies. This has been one of the biggest challenges for Emily and Ecocraft: fluctuations on the currency markets can significantly impact production costs and profit margins. The company needs a system to issue alerts when rates are favorable for making large purchases or setting prices for overseas markets. This would help Emily and Ecocraft manage budgeting and financial forecasting.

Power BI is the perfect solution for Emily. She can use it to track important business metrics: sales, supply chain status, and currency exchange rates. Emily decides to set up alerts in Power BI for currency exchange rate changes, which will give her the information she needs to make sound financial decisions.

The first step is to collect data. Emily enlists the help of her tech-savvy friend Alex, who helps her create a robust data pipeline. Together they source real-time and historical exchange rate data for the currencies of the countries from which they import raw materials. They also collect data on their purchase orders and expenses related to each supplier. Next, they create a dashboard to monitor various key performance indicators. The dashboard will also identify patterns and potential risks associated with currency fluctuations; the exchange rate data and other vital metrics, like sales and supply chain status, are displayed in real time. Emily configures Power BI to send custom alerts whenever currency pairs, like the US dollar to Canadian dollar or the US dollar to euro, cross thresholds that impact the company's financials. She sets these alert levels based on historical data and current business needs. For instance, if the exchange rate for the euro increases by more than 5% in a week, Emily will receive an alert.

Armed with these alerts, Emily is better prepared to mitigate currency risk. When an alert triggers, she can immediately assess the potential impact on her production costs and take the necessary actions. These could include renegotiating contracts with suppliers, hedging currency exposure, or seeking alternative suppliers from more stable regions.
Shortly after setting up the Power BI dashboard, an alert indicates that the US dollar to euro exchange rate has dropped to a favorable level. Based on this information, the team orders raw materials from the European suppliers, saving thousands of dollars. As Emily continues to use Power BI and respond to alerts, she gains deeper insights into her business. She can analyze which suppliers are more cost-effective based on currency trends and adjust her sourcing strategy accordingly. These data-driven insights help the company make more informed decisions, save money, improve the overall efficiency of its supply chain, and ultimately increase profitability. Over time, the currency alerts become integral to Emily's business, providing the stability she needs to pursue her mission of creating beautiful, eco-friendly furniture for years to come. The company plans to extend the Power BI platform's capabilities to other business areas, solidifying data as a core component of its growth strategy. Emily's journey with Power BI is a testament to the power of data-driven decision-making.

Congratulations on reaching the end of these lessons on security and monitoring in Power BI. During these lessons, you explored the role that security and monitoring play in safeguarding reports and dashboards in Power BI. Let's take a few minutes to recap what you learned.

You first explored how to share information safely and identify sensitive data. Sensitive data is essential information that, if leaked, could damage the company's reputation, finances, or privacy. If the information is employee related, a leak could damage the relationship between an organization and its workforce. Fortunately, you can safeguard data in Power BI using the following methods. Authentication and authorization systems ensure that those accessing the data are who they say they are. Assigning clear roles and permissions ensures that individuals can only access certain data. Row-level security, or RLS, filters data so that individuals can only access relevant elements of data sets. Data encryption prevents data from being intercepted during transmission. And data masking lets you work with obscured versions of data, so that you can only view the information required to complete your task.

You also learned that sensitive information can be shared using links. These links offer sharing options so you can control who views the data: people in your organization who need the data, people with existing access to the data, or specific people that you include directly. And you can decide what recipients can do with the data using sharing permissions: they can reshare the data with others, or use the data to perform analysis.

Another method of safeguarding data is the use of sensitivity labels. These labels let you categorize data, making it clear who can access it. The categories include personal, which denotes data linked to specific individuals; public, which is data for a wider audience; and general, meaning information meant for a wider internal audience. There are also categories that govern more sensitive data: the confidential label means the information is sensitive and requires careful handling, highly confidential relates to sensitive data on critical business innovations, and the restricted label is used for data that must be treated with maximum secrecy and caution. You then demonstrated your understanding of sharing information in Power BI by applying sensitivity labels to an Adventure Works data set.

In the next lesson, you explored the topic of organizations and permissions.
You discovered that access to data sets is governed by data permissions, which ensure that only authorized individuals can access data. Power BI offers the following permission types: the owner permission grants a user complete control of a data set, the read permission permits users to view but not alter data, the reshare permission permits users to reshare data, the build permission lets users utilize the data for analysis, and the write permission enables users to alter data. You then learned how to configure these permissions in Power BI using the Manage permissions option. This option lets you create and manage URLs for data access that can be shared with your team. You also learned that data can be shared outside of an organization; however, it's important to consider which safeguards are most appropriate to ensure the data remains confidential. You completed this lesson with a knowledge check, in which you tested your understanding of data permissions, and you reviewed additional resources to help you learn more about Power BI and data permissions.

In the third lesson, you reviewed row-level security for safeguarding data. Row-level security, or RLS, controls which individuals can view data based on predefined roles and rules. Some of the benefits of RLS include granular control over data, the ability to scale as your data grows, assistance with compliance and auditing, and a reduced risk of data breaches. However, RLS also gives rise to several potential issues: it can impact performance by slowing down data retrieval, it requires regular maintenance, and it must be tested frequently. There are two types of row-level security. The first is static: static RLS restricts users to specific data, so they can only view information relevant to their roles. The other type is dynamic: dynamic RLS uses Data Analysis Expressions, or DAX, to adjust data access in real time based on user roles (a minimal DAX sketch follows at the end of this recap). You completed this lesson by undertaking a knowledge check focused on row-level security, and you reviewed some additional resources on this lesson's main topics.

In the fourth and final lesson, you explored the topic of subscriptions and alerts in Power BI. You can subscribe to reports and dashboards: a Power BI subscription is an automated delivery system that provides daily data snapshots as emails or notifications. The advantages of subscriptions include timely access to information, a boost in productivity because more tasks are automated, consistency in data consumption, and enhanced collaboration, because teams can now work from the same data sets. You can configure subscriptions using the Subscriptions pane in Power BI. With this feature, you can name your subscription, decide how often you receive it, and even include other colleagues. You can also modify, pause, or cancel your subscription as you need. As well as subscriptions, Power BI also offers data alerts: automatic, customizable notifications that inform users when specific conditions or thresholds have been met or exceeded. Some of the benefits of data alerts include real-time decision-making, efficiency through automation, proactive problem solving, enhanced collaboration, and customization and personalization. You can configure data alerts in Power BI: the Manage alerts feature lets you set the conditions and thresholds that determine when you receive alerts. Finally, you demonstrated your understanding of these topics by undertaking an exercise in which you configured a data alert for Adventure Works.
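To make the static and dynamic RLS distinction concrete, here is a minimal sketch of the kind of DAX table-filter expressions an RLS role might use. The table and column names (a Sales table with Region and OwnerEmail columns) are hypothetical placeholders, not the names from the Adventure Works exercise.

    -- Static RLS: a role whose members may only see European rows,
    -- defined as a table filter on a hypothetical Sales table
    [Region] = "Europe"

    -- Dynamic RLS: each signed-in user sees only rows they own.
    -- USERPRINCIPALNAME() returns the current user's sign-in name,
    -- so the filter adjusts automatically per user at query time.
    [OwnerEmail] = USERPRINCIPALNAME()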
You've now reached the end of this summary. It's time to move on to the discussion prompt, where you can discuss what you've learned with your peers. You'll then be invited to explore additional resources to help you develop a deeper understanding of the topics in this lesson. Congratulations on everything you've achieved so far.

You've now reached the capstone project. During this course, you explored the role of Power BI in business, deploying assets in a Power BI workspace, and the role that security and monitoring play in safeguarding reports and dashboards in Power BI. Let's take a few minutes to recap what you've learned so far.

You began with an introduction to the role of Power BI in business, with a focus on data flow. Data flow in business refers to the movement of information within an organization. This movement, or flow, occurs in the following stages: collection, processing, analysis, and decision-making. Once gathered, the data is cleaned or standardized, and it's then transformed; data analysts use the refined data to generate insights. The data is analyzed using Power BI service. This software offers many advantages for analysts: it's accessible and scalable, and it offers collaboration tools along with data backup and recovery features. The data analyst is the central figure in this process, possessing important skills and expertise in extracting valuable insights from data. An important skill that all data analysts must possess is an understanding of Structured Query Language, or SQL. Data analysts use SQL to interact with the SQL databases that store the data, and they can connect to a SQL database using Import or DirectQuery modes: Import mode loads data directly into Power BI, while DirectQuery mode connects Power BI directly to the source database. An analysis is presented in the form of a report, which can be static or dynamic. A dynamic report explores multiple areas of interest, and its results are presented in the form of visuals. These reports also facilitate what-if parameters, which permit interactive adjustments that modify visualizations and generate insights into potential scenarios.

Next, you explored how to deploy assets in a workspace. A workspace is a specialized area in Power BI that holds important assets. There are two types of workspaces in Power BI: the first is a personal workspace, which you can use to store your own content, and the second is a shared workspace, where a team can collaborate on reports and dashboards. Workspace roles determine how individuals can interact with workspaces; these roles include viewer, contributor, member, and admin, and you can manage them using Power BI's Manage access feature. In the next lesson, you learned how to monitor workspaces: by monitoring a workspace, you can measure its impact and make changes to increase its usefulness.

You also explored the topic of data sets and gateways in Power BI. A data set must contain the latest available information; you can use a scheduled or incremental refresh to ensure accurate data, and you can promote and certify data sets to inform your team where to access the most current and reliable data. You also explored establishing a secure, reliable connection between your on-premises data and Power BI service using data gateways. There are three types of gateways in Power BI: the on-premises data gateway, the on-premises data gateway (personal mode), and the Azure virtual network, or VNet, data gateway. Which type of gateway you choose depends on the setup of your organization and its specific data management and security requirements. You also learned how Power BI deployment pipelines move content through the following life cycle stages: development, testing, and staging or production.
Another useful feature for maintaining your workspace is the lineage view. This view shows the data's journey from source to destination, with all the connections in between. Impact analysis shows how changes to your data can affect different assets in your workspace.

Next, you explored the role that security and monitoring play in safeguarding reports and dashboards in Power BI. You first explored how to share information safely and identify sensitive data. Sensitive data is essential information that, if leaked, could damage the company's reputation, finances, or privacy. You can safeguard data using Power BI's authentication tools, use sharing links to control who you share information with, and use sharing permissions to determine what recipients can do with the data. Sensitivity labels are another useful method of safeguarding data. Access to data sets is governed by data permissions, which ensure that only authorized individuals can access data, and you can configure these permissions in Power BI to safeguard your data. You also reviewed row-level security for safeguarding data. Row-level security, or RLS, controls which individuals can view data based on predefined roles and rules. There are two types of row-level security: static RLS restricts users to specific data, while dynamic RLS uses Data Analysis Expressions, or DAX, to adjust data access in real time based on user roles. Finally, you explored subscriptions and alerts in Power BI. You can subscribe to reports and dashboards: a Power BI subscription is an automated delivery system that provides daily data snapshots as emails or notifications, and you can use the Subscriptions pane in Power BI to manage your subscriptions. As well as subscriptions, Power BI also offers data alerts, which are automatic, customizable notifications that inform users when specific conditions or thresholds have been met or exceeded.

During these lessons, you also completed exercises in which you put your new skills into practice by helping Adventure Works with Power BI, knowledge checks that tested your understanding of these topics, and additional resources in which you consulted Microsoft Learn articles to explore these topics in more detail. You've now reached the end of this recap. It's time to move on to the capstone project, which will test your understanding of these concepts through a series of exercises. Best of luck!

You've reached the next stage of the capstone project. You've worked hard to get to this stage and made good progress, so let's recap what you've achieved so far. In the previous set of scenarios, you prepared sales data, configured data sources, and designed and developed a data model. You'll begin this next stage of the capstone by configuring aggregations for Tailwind Traders. These aggregations will help the company generate insights into its financial performance. As part of this scenario, you'll calculate sales and profit data and record the performance of visuals using the Performance Analyzer. These aggregations will help generate insights informing the company's strategic decisions for the upcoming business year. By completing this exercise, you'll demonstrate your ability to create time-based summaries, determine median sales volumes, and utilize the Performance Analyzer tool.

Next, you'll transform the insights you generated from configuring aggregations into a sales report. Tailwind Traders needs a report that helps inform sales decisions, and the company needs your help to generate it using its sales data. To generate this report, you'll complete the following tasks: create charts and cards to visualize your data, and add a slicer to your report.
Aside from the sales report, Tailwind Traders also requires a report that displays key insights into its profits. Creating this report will be your next task: you'll generate it by creating charts and cards to visualize the data, creating a KPI, and adding a slicer. Through this and the previous scenario, you'll demonstrate your ability to create different kinds of charts to display sales data and to display important sales metrics using cards and KPIs.

In the next capstone scenario, you'll help Tailwind Traders create an executive dashboard. Tailwind Traders will use the dashboard to generate insights into its global performance; the dashboard must focus on sales and profits and be accessible on mobile devices. You'll create this dashboard by pinning sales and profit card visualizations and KPIs to the dashboard, and by configuring the mobile view for the cards, KPI visuals, and core visualizations. By completing this scenario, you'll show that you can create an executive dashboard in Power BI, display sales summaries, highlight profit metrics, use card visualizations for quick insights, and configure a dashboard that's mobile friendly.

In the final scenario, you'll need to help Tailwind Traders generate quick, actionable insights into its data. You can carry out this task using Power BI's subscriptions and alerts features: you'll create daily alerts for key metrics and create subscriptions for the sales and profit overview tabs. By successfully helping Tailwind Traders generate quick, actionable insights, you'll prove that you can configure subscriptions and set up proactive alerts. If you encounter any difficulty with these scenarios, remember that you can refer to previous learning materials, like videos and readings, for guidance. You've already completed similar tasks in the other exercise items in this course, so you're more than capable of working through these scenarios. Best of luck!

Congratulations on completing the capstone project. It's been a lot of work, but you've finally reached the end. Your completed capstone Power BI environment should contain sales and profit reports, visualizations of the key metrics in your reports pinned to an executive dashboard, and configured alerts and subscriptions. Let's take a few moments to recap the exercises you've completed by reviewing examples of what the completed dashboard should look like. Don't worry if these examples don't quite match your dashboard; you can review these best practice examples in more detail when you access the exemplars.

In the first exercise, you configured aggregations using DAX. You created measures to calculate the following: yearly profit margin, quarterly profit, and median sales (a sketch of what such measures might look like follows this recap). You then assessed the performance of these reports. In the second exercise, you created a sales report and visualized its data using charts: a bar chart for loyalty points by country, a column chart for quantity sold by product, a pie chart for median sales distribution by country, and a line chart for median sales over time. You also created cards for stock, quantity purchased, and median sales. In the third exercise, you created a profit report and visualized its data using charts: a bar chart for net revenue by product, a donut chart for yearly profit margin by country, and an area chart for yearly profit margin over time. You then created cards for year-to-date profit and net revenue USD, set up a KPI for gross revenue USD, and added a slicer to your profit report. Finally, you saved and published the report.
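As a refresher on what those aggregation measures might look like, here is a minimal DAX sketch. The table and column names (a Sales table with Gross Revenue USD and Cost USD columns, plus a marked date table called Calendar) are hypothetical stand-ins, not the exact names from the exercise.

    -- Base profit measure used by the time-based summaries below
    Total Profit =
        SUM ( Sales[Gross Revenue USD] ) - SUM ( Sales[Cost USD] )

    -- Quarter-to-date profit, using a marked date table
    Quarterly Profit =
        CALCULATE ( [Total Profit], DATESQTD ( 'Calendar'[Date] ) )

    -- Year-to-date profit as a share of year-to-date revenue;
    -- DIVIDE avoids divide-by-zero errors
    Yearly Profit Margin =
        DIVIDE (
            CALCULATE ( [Total Profit], DATESYTD ( 'Calendar'[Date] ) ),
            CALCULATE ( SUM ( Sales[Gross Revenue USD] ), DATESYTD ( 'Calendar'[Date] ) )
        )

    -- Median of the revenue values, as used for the median sales visuals
    Median Sales = MEDIAN ( Sales[Gross Revenue USD] )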
Once your profit and sales reports were completed, your next task was to create an executive dashboard. To do this, you created a dashboard called Tailwind Traders Executive Dashboard, and you then pinned the following sets of visualizations to it: sales overview core visualizations, sales overview card visualizations, profit overview core visualizations, and profit overview card and KPI visualizations. Once you finished pinning your visualizations, you configured the mobile view for the cards, KPI visuals, and core visualizations. In the final exercise, your main task was configuring the dashboard's alerts and subscriptions. You first created a daily alert for gross revenue USD that informs Tailwind Traders when gross revenue drops below $400 US. Next, you created and activated a weekly subscription for the sales overview tab, ensuring it could be viewed and shared in Power BI, and you then created and activated a weekly subscription for the profit overview tab, again ensuring it could be viewed and shared in Power BI. You're now ready to begin working through the exemplars, where you can compare your Power BI environment against the best practice examples in more detail.

Congratulations, you've reached the end of this capstone project course. You've worked hard to get here, developed many new skills, and made great progress on your Power BI journey. This course, and all you have achieved, is a culmination of all the previous courses you've completed in this specialization. Having completed this course, you now understand the basics of Power BI's relationship with business; you're familiar with the process steps for creating, monitoring, and maintaining workspaces; you can connect data sets and gateways; you can securely share information with your team and the wider organization; and you can manage subscriptions and alerts in your workspaces. With this course, you were able to reinforce and demonstrate the learning and practical skill set you have gained throughout this program. This was achieved through hands-on, guided practice configuring a Power BI workspace for Tailwind Traders, and the graded assessment further tested your knowledge of Power BI.

After completing the final project, it's a great time to pause and reflect on your journey. You can reflect on the completed course from several vantage points: you could consider the links between this course and the previous ones you've completed, or you could reflect on the process of completing the project. For example, what were the hardest parts of the project? What was the easiest? What experience did you gain from the project, and would you benefit from revisiting previous courses? Whether you're just starting out as a technical professional, a student, or a business user, this course-end project proves your knowledge of the value and capabilities of database systems. The project consolidates your abilities through a practical application of your skills. But the project also has another important benefit: you now have a fully operational Power BI workspace to reference within your portfolio. This serves to demonstrate your skills to potential employers; not only does it show employers that you are self-driven and innovative, but it also speaks volumes about you as an individual and your newly obtained knowledge.

You've completed all the courses in this specialization and earned your certificate in Power BI. The certificate can also be used as a progression to other role-based certificates.
You may go deep with advanced role-based certificates or take other fundamental courses, depending on your goals. Certifications provide globally recognized, industry-endorsed evidence of mastering technical skills. You've done a great job and should be proud of your progress. The experience you've gained shows potential employers that you are motivated, capable, and not afraid to learn new things. Thank you; it's been a pleasure to embark on this journey of discovery with you, and best of luck in the future.

Welcome to the Microsoft PL-300 exam preparation and practice course, a significant milestone on your journey toward becoming a certified Microsoft Power BI data analyst. If you're motivated to set yourself up for a career in the world of data analytics, you're on the right track. Your learning journey in data analytics with Microsoft Power BI has culminated in this course, carefully designed to equip you with the knowledge, skills, and competencies you need to excel in the Microsoft PL-300 exam. As you delve into this course, you'll navigate key Power BI features and concepts that are integral to the PL-300 exam. These concepts encompass a broad spectrum, including data preparation, modeling, visualization, and asset deployment. Plus, by the end of the course, you won't just be well prepared for the PL-300 exam; you'll also be equipped with valuable insights into your future career prospects in data analytics with Power BI.

Your course journey begins with a comprehensive review of fundamental concepts associated with data preparation and loading in Power BI. You'll cover a range of essential topics, such as the journey from exam preparation to Microsoft certification, mastering the art of acquiring data from diverse sources, and data profiling and cleaning, as well as the intricacies of data transformation and loading. The next part of your course journey involves a detailed recap of core data modeling concepts in Power BI, representing another crucial step in your preparation for the PL-300 exam. This will entail a thorough recap of designing effective data models and creating model calculations using DAX, or Data Analysis Expressions. Additionally, you'll delve into implementing well-structured data models and optimizing data performance for efficient, seamless analysis.

Following your refresher in data modeling, you'll revisit essential concepts linked to data visualization and analysis, more essential components of your PL-300 exam readiness. This part of the course encompasses creating impactful reports and enhancing those reports to boost usability and storytelling. Plus, you'll also focus on developing your skills in recognizing patterns and trends within data, which is invaluable in data analytics. After covering these critical content areas, you'll shift your focus to the deployment and maintenance of assets within Power BI. Here, you'll refresh your understanding of pivotal topics like establishing and managing workspaces and assets, and you'll work on your proficiency in the efficient handling of data sets, a skill that's fundamental to the work of a data analyst.

To complete this course successfully, you'll have the opportunity to apply the skills and knowledge you have gained to a practice exam specially designed to simulate the conditions of the PL-300 exam. This practical, hands-on assessment will allow you to assess your readiness and identify areas that may require further attention or improvement. Furthermore, you'll receive additional study resources and materials to further enhance your preparation.
You'll also have the opportunity to explore the different roles and career prospects that will be accessible to you once you've successfully completed the exam and obtained your Microsoft certification. In sum, the objective of this course is to prepare you for the PL-300 exam and support you in realizing the next steps toward a career as a Power BI data analyst. The course is structured to prepare you thoroughly for assessment and to guide you in recapping and consolidating the concepts you've acquired throughout the program. It aims to increase your confidence in your competence and ensure you are truly exam ready. As with the other courses in this program, the videos, readings, activities, and quizzes will help you consolidate your knowledge and serve as a way for you to measure your progress.

Beyond preparing for the PL-300 exam, this course holds a much larger promise. It's about more than just gaining knowledge and skills in data analysis in Power BI; it's about taking an important step in setting yourself up for a career in data analysis, a field filled with opportunities and potential. By completing all the courses in the program, you'll earn a Coursera certificate, which you can use to proudly showcase your job readiness to your professional network. Furthermore, the program, with an emphasis on this exam preparation and practice course, will prepare you for the Microsoft exam PL-300, which leads to a Microsoft Power BI data analyst certification: globally recognized evidence of your real-world skills. So, are you ready to achieve exam readiness and take a leap toward a career in data analytics with Power BI? Congratulations on reaching the home stretch of this program, and all the best as you embark on the exciting and promising learning journey that lies ahead.

This is the final course in the Microsoft Power BI data analyst professional certificate, and it will guide you through taking the PL-300 exam and earning the associated Microsoft certification. By obtaining the Microsoft PL-300 certification, you can unlock various career opportunities, enhance your knowledge and skills, and cultivate a competitive edge in the job market. Exams are nothing new; it's likely that you've encountered similar challenges earlier in your career. Just like before, it takes preparation to make the most of it, and the more effective your preparation, the more benefits you will reap from all your effort. This video provides a quick overview of what you can expect from the PL-300 exam, the logistics around taking the exam, and the steps you need to take to prepare for success.

You can take the PL-300 exam online at your home or office through Pearson VUE online, or you can take your exam with Pearson VUE at one of their worldwide test centers. Pearson VUE is a global leader in computer-based testing and assessment services, and their OnVUE platform employs several security measures to ensure a fair and secure testing experience. You can schedule your exam for a specific date and time on the Pearson VUE website. There are a few important things to do before the day of the exam: a system check, making sure your ID document meets the specified requirements, and choosing an appropriate space in which to take the exam. The PL-300 exam is a proctored exam, which means that you are monitored by a live proctor, or exam supervisor, through your webcam during the exam. The proctor ensures that you follow the exam guidelines and don't engage in any prohibited activities. The proctor will also give you certain instructions during the check-in process on the day of your exam.
There are very strict rules about what items and actions are allowed while taking the exam, which you'll learn about in greater depth later. It's critical to understand these policies, because failing to adhere to them will result in the termination of the exam session.

Let's move on to the topics covered in the exam. To succeed in the PL-300 exam, you should be proficient at using Power Query and at writing expressions using DAX, or Data Analysis Expressions. You should know how to assess data quality, as well as understand data security, including row-level security and data sensitivity. The PL-300 exam measures your ability to accomplish the following technical tasks: data preparation; data modeling; data analysis and visualization; and asset deployment and maintenance. A certain percentage of exam questions relates to each of these categories. Knowing these percentages can help you focus your study schedule on the categories that carry the most weight and prepare in the most effective way. You can look forward to exploring the specific ways in which the skills related to each of these categories might be assessed later, and you can also consult the detailed exam skills outline provided by Microsoft.

Effective exam preparation not only requires a lot of dedication; you also need to consider effective strategies for use during the exam. For instance, you should consider the types of questions you might get and how to approach them. Some helpful strategies include reading every option before choosing a final answer and following a process of elimination when you are unsure. You will learn more about these and other strategies later. One of the best forms of preparation is to take a practice test before the exam; this way, you can monitor your progress and identify the areas that might require a little more attention. Later in this course, you will take two mock exams, each focusing on the topics and key concepts covered in the previous courses and the skills measured in the PL-300 exam. This video gave you a bird's-eye view of how the PL-300 exam works, what it tests, and some core elements of an effective exam preparation strategy. You've already put in a lot of hard work by engaging with the course material, exercises, and assessments during this program, so you are in a good position for the final preparation before taking the exam. The information and materials in this lesson will help you focus your preparation in this final stage toward earning the Microsoft Power BI data analyst certification.

Data-driven enterprises rely on data analysts to provide them with accurate, insightful analysis. As you've learned, finding the correct data sources is essential for data analysts to help businesses achieve their goals. In this video, you'll recap the importance of identifying the right data sources and connecting to data sources with Microsoft Power BI. As you begin the data analysis process, identifying what data is required and which sources can provide that data is the first step toward a successful analysis outcome. For example, when looking to increase sales, your social media accounts and popular search engines become your key data sources for analyzing marketing data. Similarly, if you're looking to improve customer satisfaction, tracking the volume of support requests and resolution times from your customer support system is the key data source. Fortunately, Power BI comes with over 100 connectors to allow you to tap into the different data sources available to you. These include spreadsheet sources such as Microsoft Excel, user directory services such as Microsoft Active Directory, SQL databases such as Microsoft Azure SQL Database, and text files in various formats such as XML, JSON, and CSV.
Plus, Microsoft continues to add new connectors and update existing ones each year.

Now let's explore how to connect to a data source in Power BI. In Power BI Desktop, select Get data, followed by Excel workbook. When the file browser opens, navigate to the folder that your Excel file is in, select the Excel file, then select Open. The Navigator window will open, displaying all the available sheets within the workbook. Select the check boxes beside the sheets that you want to import. At the bottom of the Navigator window are three buttons: Load, Transform data, and Cancel. Selecting Load will load the data directly, without cleaning or transforming it. For this example, let's select Transform data to open the Power Query Editor and inspect the data. Power BI will begin loading the data; note that this may take a few minutes, depending on your computer and the size of the worksheet.

Once the data is loaded, the Power Query Editor will open. Power Query allows you to apply transformation operations to the data before loading it into Power BI. On the left side of the editor is the Queries pane, where each table is listed; selecting a table will allow you to clean and transform its data. Each row of data in the table is listed in the main working view. On the right side of the editor is the Applied steps list, which lists each of the transform operations applied to the data and the order in which they are applied. Note that if you need to change the source of the data query, you can select the cog icon beside the Source step. This opens a window where you can change the file from which the data is loaded; if you're satisfied with the existing data source, you can close the window by selecting OK. In this example, let's use the data as is, without cleaning or transforming it. Select the Close and apply button in the top left corner of the editor to finish transforming the data and load it into Power BI. Power BI will begin loading the data with the transformations applied; again, this may take a few minutes depending on your computer and the size of the worksheet. Once the data is loaded, you can begin working with it to build reports and dashboards. If you want to inspect the data after loading, select the table icon on the left side of the interface to open the table view, also known as the data view. In this view, you can inspect each table and each row of data.

Working with data sources is an important aspect of the role of a data analyst. This video revisited the importance of identifying the right data sources, and it showed how to connect to an Excel data source, load its data using the Power Query Editor, and configure the data source settings by selecting the cog wheel next to the Source step in the Applied steps pane. As you solve business challenges, unlock new opportunities, and optimize existing processes, consider which data sources can provide the data you need to achieve your objectives. Power BI, with its more than 100 connectors, makes it possible for you to harness these sources to their fullest potential.
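Although the walkthrough above works entirely through the ribbon, the Power Query Editor records each step as code in its M formula language, which you can inspect via the Advanced Editor. Here is a minimal sketch of what the recorded query might look like for an Excel source; the file path and sheet name are hypothetical placeholders.

    let
        // The Source step created by Get data > Excel workbook
        Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"), null, true),
        // The Navigation step created by ticking the Orders sheet in the Navigator
        OrdersSheet = Source{[Item = "Orders", Kind = "Sheet"]}[Data],
        // Power Query typically promotes the first row to column headers
        PromotedHeaders = Table.PromoteHeaders(OrdersSheet, [PromoteAllScalars = true])
    in
        PromotedHeaders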
With hundreds of connectors in Microsoft Power BI, it should be no surprise that a wide range of options is available when using them. Previously, when you used an Excel worksheet as the data source, the data was imported into Power BI. But for larger volumes of data, importing may become a resource-intensive operation. This is where choosing a different storage mode, like DirectQuery, comes in. In this video, you'll revise the different storage modes available in Power BI.

Power BI Desktop supports three different storage modes, also known as connectivity modes, or data set modes in Power BI service: Import mode, DirectQuery mode, and Dual mode. When you use Import mode, data is copied from the data source into Power BI. This allows quick local access to the data; however, if the data source is updated after importing, you must refresh the data. Fortunately, you can configure Power BI to schedule refreshes at specific intervals, such as daily or weekly. When you use Import mode, consider how up to date the data must be for stakeholders to make data-driven decisions effectively. Another consideration when using Import mode is the required storage space: if you are working with an extensive data set, storing all the data on your local device may not be possible. In today's data-driven world, it is not uncommon to see data sets consuming several gigabytes of storage.

So what about data sources with significantly large volumes of data, a scenario where Import mode may be unsuitable? By changing to DirectQuery mode, Power BI will query the data source directly for data rather than importing it. This means that when a report is displayed in Power BI, each visualization sends a query to the data source to request the required data. To determine what connectivity mode is supported, you can refer to Microsoft's documentation for your chosen connector. One disadvantage of using DirectQuery is that it requires transferring query results from the data source every time a query is made; depending on the volume of data, this may take some time, slowing down visualizations and reports. To improve the user experience, Power BI also provides a Dual mode, which is a combination of the DirectQuery and Import modes. Depending on the query and data source, Power BI will store a local copy of the query results and refresh the copy as needed. This helps improve the responsiveness of visualizations and reports without importing all the data into Power BI.

As you build data models in Power BI, connecting to multiple data sources is common; when your data model connects to multiple sources, it is known as a composite model. With composite models, you can configure the storage mode for each table in the model. For example, let's say you have two tables in your data model, products and sales. In a niche business, the product data might be a small Excel spreadsheet and the sales data a large data set stored in a SQL database. In this scenario, it would make sense to use Import mode for the products table and DirectQuery or Dual mode for the sales table. This would help ensure there's no slowdown in your reports and that viewers have a good user experience.

But what about connecting to a data set on Power BI service? Power BI features a type of connection called a live connection, which allows you to use DirectQuery with data sets published to Power BI service. Power BI service becomes an important data source for building reports and dashboards as an organization grows; hosting data in Power BI service allows the organization to have one source of truth, maintaining consistency and accuracy in reporting. The benefit of using a live connection is that security rules can be applied to the data, ensuring that company data remains protected from unauthorized viewers.

In this video, you recapped the Import, DirectQuery, and Dual storage modes to help you choose between them. Choosing the right storage mode is important for ensuring a good user experience for different stakeholders.
If data retrieval is slow, reports and dashboards will also be slow, which may result in stakeholders not utilizing the insights unlocked by your data analysis. As you proceed through the data analysis process, carefully consider which storage modes are suitable for different data sources and how they should be configured.

Query parameters are a useful feature in Microsoft Power BI for simplifying a dynamic element of your data, for example, changing between a test data source and a production data source, or filtering data from your data source. In this video, you'll revise how to configure query parameters and the values that they use. In the Power Query Editor, there's an Excel data source loaded containing stock orders for different business regions. Because the data set is quite large, let's use query parameters to filter the data. To do this, select the Manage parameters button in the Home tab of the ribbon menu, which opens the Manage parameters window.

To filter the data by country, you need to add a country parameter. In the Manage parameters window, select New. In the Name field, enter Country; in the Description field, let's add a note that this parameter filters the stock order data by country. Ensure that the Required option is enabled, so that report users must specify a value for this parameter. For the Type field, let's change the type to Text, as the country values are text values. Also, since there's a fixed list of countries in the data, let's change the suggested values to List of values, and in the list add the three countries present in the data: the United States, France, and Germany. For the default value, select United States; this will be the default for users of this data set. For the current value, also select United States, then select OK. This adds the parameter to the Queries pane.

To ensure that the data source query utilizes the parameter, select the stock orders query in the Queries pane, then select the filter button in the Country column, followed by Text filters and Equals, which opens the Filter rows window. In the Filter rows window, change the filter value button to Parameter. This changes the Equals filter to utilize the previously defined Country parameter. You can then select OK; note how the data set is now filtered by the Country parameter. In the Home tab of the ribbon menu, select Close and apply to load the data set. To confirm that the parameter has been applied, select the table view button, also known as the data view. In this view, it is clear that the data set contains only stock orders for the United States, which matches the current value specified for the Country parameter earlier.

To visualize how this parameter is used, let's create a simple report containing a card visualization. Navigate to the report view, and in the Visualizations pane select the card visualization; the visualization is then added to the report. Now select the visualization in the report, and in the Data pane, also known as the Fields pane, select the Unit Price field. This applies the Unit Price field to the visualization. In the Visualizations pane, in the data field, right-click Sum of Unit Price and then select Average. The visualization now displays the average value of the Unit Price field in the data set.

To change the parameter, you can select the drop-down of the Transform data button in the Home tab of the ribbon menu, then select Edit parameters. In the Edit parameters window, let's change the Country parameter to France, then select OK. Power BI now displays a notification that there are pending query changes.
If you select Apply changes, the parameter change is applied. Note that the average value in the visualization has changed; this is because the data set has now been filtered to only stock orders in the France business region. To confirm this, let's select the table view button. In this view, it is clear that the data set contains only stock orders for France.

In this video, you recapped how to change the values in a query parameter. Query parameters are a great way to filter your data queries dynamically. As you begin building reports and working with larger and multiple data sets, consider how you can use query parameters to reduce the scope of data being retrieved by Power BI, optimizing your reports and providing a better user experience.
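Behind this walkthrough, both the Manage parameters window and the filter step are recorded in Power Query's M language. A minimal sketch of roughly what gets generated, assuming a hypothetical stock orders workbook; the file path and sheet name are placeholders.

    // The Country parameter is stored as its own query in the Queries pane:
    "United States" meta [
        IsParameterQuery = true,
        Type = "Text",
        List = {"United States", "France", "Germany"},
        DefaultValue = "United States"
    ]

    // The stock orders query then filters on the parameter's current value:
    let
        Source = Excel.Workbook(File.Contents("C:\Data\StockOrders.xlsx"), null, true),
        Orders = Source{[Item = "Stock Orders", Kind = "Sheet"]}[Data],
        Promoted = Table.PromoteHeaders(Orders, [PromoteAllScalars = true]),
        // The Filter rows step: keep rows whose Country equals the parameter
        Filtered = Table.SelectRows(Promoted, each [Country] = Country)
    in
        Filtered

Changing the value in Edit parameters simply changes what the Filtered step compares against, which is why the whole query re-filters when you apply the pending changes.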
As a business continues to grow, so does the challenge of managing large volumes of data and ensuring that the data is well formed and ready for analysis. Microsoft Power BI's data flows help to solve this issue by creating reusable data transformation logic. In this video, you'll explore what a data flow is, how it works, and how to connect to one in Power BI Desktop.

Maintaining a single source of truth is important in a data-driven enterprise: it ensures consistent analytical conclusions are drawn from the underlying data. One method of ensuring a single source of truth is by creating data flows in Power BI service. A data flow is a collection of tables that exist within Power BI service. You can add and edit tables in your data flow, apply transformations, and manage data refresh schedules directly from the workspace in which your data flow was created. Each table consists of columns and rows; the tables in a data flow are also known as entities. Data flows promote the reusability of underlying data elements, preventing the need to create separate connections to your cloud or on-premises data sources. And if you want to work with large data volumes and perform the extract, transform, and load (ETL) process at scale, data flows with Power BI Premium scale more efficiently. Data flows act as data sources for your data sets in both Power BI service and Power BI Desktop, and data flows can also act as data sources for other data flows.

However, when using a data flow, there are important considerations and limitations to keep in mind. If a data flow links to another data flow, the maximum number of linked data flows in the chain is 32; this is known as the maximum depth. You need a Power BI Premium subscription in order to refresh more than 10 data flows across the workspace. Data flows are managed individually, which means there is limited visibility into dependencies between data flows. In Power BI data flows, you can use parameters, but you can't edit them unless you edit the entire data flow. And when creating a data set in Power BI Desktop and then publishing it to Power BI service, ensure the credentials used in Power BI Desktop for the data flow's data source are the same credentials used when the data set is published to the service.

Previously in this course, you walked through how to create a data flow. Let's take a moment to explore how to connect this data flow to Power BI Desktop. Launch Power BI Desktop and select More from the Get data drop-down list of options. In the Get data dialog box that appears, select Power Platform from the left column and Dataflows from the right column, then select Next. If you are connecting to the data flow for the first time, a dialog box opens where you need to sign in to your Power BI service account. After you enter your login credentials, select Connect. A navigator window appears, displaying the workspace and the data flow you created previously. Expand the workspace and data flow to display the available tables: the two tables that you imported during the creation of the data flow, fact internet sales and dim date, are available here. Select both tables, followed by Load. The tables are loaded into the Power BI model, a process you may be familiar with, and you can establish relationships between the data tables and create reports and visualizations as you typically would with any data set. Once the data is updated in the source data set, you need to go back to Power BI service and refresh the data flow, or configure a scheduled refresh for it; you will revise scheduled refresh later. Data flows are a powerful feature that enables you to centralize your data as a single source of truth. As an organization grows, data flows help encourage consistency and reuse of data, leading to effective decision-making within the organization.

Businesses operate with many data sources, from SQL databases to Excel spreadsheets, but with multiple data sources come varying degrees of quality. Some sources may be perfect and ready for analysis, but others require quality checks, cleaning, and transformation. In this video, you'll revise the importance of inspecting data before loading it for analysis. Before loading a data source into Power BI, it is essential to evaluate whether the data source will provide the data that you require and whether the format is compatible with Power BI. Utilizing the wrong data for analysis can lead to incorrect conclusions being drawn or, even worse, wrong business decisions being made.

Once you're satisfied that the data is suitable, the next step is to load it into Power BI. When you first load a data source, Power BI inspects the first 1,000 rows of data in each table to determine the data types of the columns. Power BI supports multiple data types, such as numeric types, date and time types, text, and true or false. In most scenarios, Power BI will automatically determine the correct type; however, while this automatic feature is useful, it is important to inspect its results in the data view, also known as the table view, of the Power Query Editor. Incorrect data types can cause significant issues later when writing DAX queries, building reports, and analyzing the data. If you need to change a data type, use the Power Query Editor to perform the transformation.

Once the correct column types are established, it is important to evaluate the statistical distribution of the columns in Power BI. This is done using three data profiling tools: column quality, column distribution, and column profile. Let's revisit each of these profiling tools, starting with column quality. Column quality displays the percentage of data that is valid, in error, and empty; in an ideal situation, you want 100% of the data to be valid. Column distribution displays the distribution of the data within the column and the counts of distinct and unique values. Distinct counts all the different values in a column, even if a value appears many times, while unique counts only the values that appear exactly once. For example, in the list Red, Red, Blue, there are two distinct values (Red and Blue) but only one unique value (Blue). Lastly, column profile provides a more in-depth look into the statistics of a column, based on the first 1,000 rows of data. The profile provides several different values, including the count of rows, which is important when verifying whether you imported your data successfully.
For example, if your original database had 100 rows, you could use this row count to verify that 100 rows were in fact imported correctly. The row count will also show how many rows Power BI has deemed to be outliers, empty rows, and strings, and the min and max tell you the smallest and largest values in a column, respectively. This is particularly important in the case of numeric data, because it will immediately notify you if you have an anomaly in your data, such as a maximum value beyond what your business identifies as a maximum.

Now let's recap how to access these profiling tools in the Power Query Editor. A sales data set has just been loaded in the Power Query Editor; the data set contains the transaction ID, product ID, quantity, sales amount, and other related data. To inspect each column's data type, navigate to the Transform tab in the ribbon menu, which displays the data type. Select each column in turn and inspect its data type in the ribbon menu. The data type is currently set to text for each column. As the data in the first four columns is numeric, update the first four columns to the whole number data type by selecting each column and changing the type in the ribbon menu. Note that when a data type is changed, a new step is added to the Applied steps list; remember that you can edit, remove, and reorder the steps in this list. Next, update the sales amount column to the decimal number data type, and finally update the transaction date column to the date data type.

Next, you have to evaluate the column quality, distribution, and profile. To do this, navigate to the View tab in the ribbon menu and enable the Column quality, Column distribution, and Column profile options. The view updates with the corresponding statistics. Each column is 100% valid, meaning there are no errors or empty values. In the quantity column, there are four distinct values and zero unique values; this means that four different values occur in the quantity column, but none of them appears only once. In the column statistics panel, the count is 52, and since there are 52 rows of data, this is the correct number. The minimum and maximum values for the quantity column are within the expected range for the business. If there were any issues with this data, further transformation would be required to clean it; you will learn more about transformation later in this course. The data is ready for import, so navigate to the Home tab in the ribbon menu and select Close and apply.

Profiling your data is important for ensuring accurate results later in the data analysis process; without accurate data, businesses can't unlock the insights that they're seeking. Remember, accurate and consistent data is a requirement for a successful data-driven enterprise.

As you know by now, data-driven organizations rely on data to make informed decisions and drive innovation. However, the effectiveness of such decisions is greatly dependent on the quality and consistency of the data; poor-quality data and inconsistencies can lead to expensive mistakes, missed opportunities, and damaged reputations. In this video, you'll explore resolving inconsistencies and issues in your data. Let's start by exploring the question: what is data quality? Data quality refers to the accuracy, completeness, and reliability of the data. As a future data analyst, a key responsibility of your role is ensuring that data is of high quality before it is used. Stakeholders and decision makers rely on accurate data to assess performance and build strategies. Inaccurate or incomplete data can lead to inaccurate reports and misguided decisions.
Such decisions could have significant effects on the business. If the business operates in a regulated industry, such as pharmaceuticals, the wrong decision could lead the business to fall out of compliance with regulation and be subject to fines or legal proceedings. For example, duplicate entries in your marketing data could lead management to overstock certain products, increasing costs and negatively impacting the finances of the business. The common types of inconsistencies and quality issues are duplicate rows, empty or missing values, and errors or invalid values.

Fortunately, Power BI comes with tools to help analyze the quality of your data and resolve inconsistencies and errors. Previously, you learned how to use the data profiling tools to analyze a column's quality, distribution, and profile, which helps identify irregularities in your data, and you learned how to ensure that a column has the correct data type. Now let's revisit how to use the Power Query Editor to resolve other data quality issues and inconsistencies.

Here in Power Query is a data set that contains several data quality issues. The first issue is that every row is duplicated. To resolve this, navigate to the Home tab on the ribbon menu, then select the Remove rows button and select Remove duplicates. Power Query removes the duplicates and adds a step to the Applied steps list. Next, some values in the transaction date column are null. The sales team has informed you that there was an error on their system and the date should be the 1st of January 2023. To fix this, select the Replace values button under the Home tab. The Replace values dialog box appears; here, specify null as the value to find and the 1st of January 2023 as the value to replace it with. Select OK, and the changes are applied; again, note that a new step is added to the Applied steps list. In the sales amount column, one of the values is spelled out as the word "500" instead of the number. To fix this, use the Replace values dialog again, this time specifying the word as the value to find and the number 500 as the replacement. Select OK to apply the changes. Now that the quality issues are resolved, return to the Home tab in the ribbon menu and select Close and apply.

Maintaining data quality is a key aspect of being a data analyst. By regularly evaluating and auditing your data, you can help maintain the accuracy of your analysis and help organizations make effective decisions that will lead them to success.
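Each ribbon click in the cleaning walkthrough above is recorded as an applied step in M. Here is a minimal sketch of what those recorded steps might look like, assuming a hypothetical RawSales base query and the column names used above.

    let
        // Hypothetical query holding the uncleaned rows
        Source = RawSales,
        // Remove rows > Remove duplicates
        RemovedDuplicates = Table.Distinct(Source),
        // Replace values: null transaction dates become 1 January 2023
        FixedDates = Table.ReplaceValue(
            RemovedDuplicates, null, #date(2023, 1, 1),
            Replacer.ReplaceValue, {"Transaction Date"}
        ),
        // Replace values: the text value "500" becomes the number 500
        FixedAmounts = Table.ReplaceValue(
            FixedDates, "500", 500,
            Replacer.ReplaceValue, {"Sales Amount"}
        ),
        // Transform > Data type: make the amount column numeric
        ChangedTypes = Table.TransformColumnTypes(
            FixedAmounts, {{"Sales Amount", type number}}
        )
    in
        ChangedTypes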
Now let's take a moment to review how to identify a column's data type, transform the column, and create a new calculated column in Power BI. Load and open the sales data set in the Power Query Editor. As you've previously learned, Power BI automatically determines the data type based on the first 1,000 rows of the data set; however, it is best practice to inspect the data type of each column before importing. To do this, select the first column in the main working view. In the Home tab of the ribbon menu, the data type is specified as Whole Number. Inspect each column, noting that all columns except the last one are set to the Whole Number data type; the last column, transaction date, is set to the Date data type. All types are correct except the sales amount column. Since a currency amount can have digits after the decimal place, you need to change this column's data type to Fixed Decimal Number. To do this, select the column, select the data type in the ribbon menu, and choose Fixed Decimal Number from the drop-down. Note that this can also be done in the Transform tab of the ribbon menu. A prompt appears asking if you want to replace the existing Changed Type step in the Applied Steps list or add a new step; for this example, select Add New Step, and a new Changed Type step is added to the Applied Steps list.

Now that the data types for each column are correct, you need to add a new calculated column. The data set is missing the sale price per unit, which is calculated as the sales amount divided by the quantity. To do this, select the Add Column tab in the ribbon menu and then select Custom Column. The Custom Column prompt appears. For the new column name, enter Sales Amount Per Unit. Next, you need to complete the custom column formula (note that the Power Query Editor uses its own formula language, M, for custom columns). Power Query provides a list of available columns on the right side of the prompt. First, select sales amount and select Insert; this adds the sales amount column to the formula. In the custom column formula, type a space, then a forward slash, then a space; the forward slash is the division operator. Then select the quantity column in the available columns list and select Insert. On the bottom left of the prompt, note that Power BI has detected no syntax errors. Then select OK. Power Query adds the calculated column to the table. Select the column to inspect its data type: the column has been created with the Any type, so change it to Fixed Decimal Number, and the data set is now ready. In the Home tab on the ribbon menu, select Close & Apply to begin importing the data into Power BI.
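Once the data is in the model, you could achieve the same result with a DAX calculated column instead of a Power Query custom column. A minimal sketch, assuming a Sales table with Sales Amount and Quantity columns:

-- DAX alternative to the Power Query custom column above;
-- DIVIDE returns blank (or an alternate result) instead of an error on division by zero
Sales Amount Per Unit = DIVIDE ( Sales[Sales Amount], Sales[Quantity] )

Using DIVIDE rather than the / operator is a common defensive choice when the denominator could be zero.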
As you work with large data sets, consider how correct data types and calculated columns can help optimize the visualization of your data. Saving calculation time during visualization will improve the user experience and drive engagement with the reports you are building.

As you begin working with multiple data sources, keeping track of the different queries can grow in complexity very quickly. This is where Power BI's query pane and reference queries become crucial to a data analyst. In this video, you'll learn about the query pane and how to effectively manage queries in Power BI. When you connect to a data source, a query is created in the query pane, and as you apply transformations, these exist within the context of that query. However, if you are working with large data, you may need to apply multiple transformations, inserting data into tables at different stages; doing this with a single query can quickly become difficult to maintain. This is where duplicate and reference queries come in.

In the query pane, you can duplicate a query to create a copy and perform different transformations on it, independent of the original query. This allows you to transform data into different formats and insert it into different tables. For example, let's say you have a sales data set that contains the following columns: sales date, item, quantity, shipment address, and shipment country. You need to build a table for sales and a table for countries. The sales table can be imported from the data set, but unfortunately you don't have a separate countries data set, so you need to build one from the sales data set. In this scenario, you can duplicate the query, rename it Countries, apply the necessary transformations to remove all columns except shipment country, remove duplicates, and import the data into a countries table. You now have a table containing all countries that sales have shipped to. Here, duplicate queries make sense, as you have two completely different sets of transformations and resulting tables.

If there are common transformations, however, duplication creates a maintainability issue. Let's work through an example where duplicate queries could create problems. Again, say you have a sales data set with the columns sales date, item, quantity, shipment address, and shipment country, and you need to build a table for sales and a table for countries. However, in both tables you need to rename the shipment address column to address and the shipment country column to country. If you duplicate the query, you will need to apply this transformation in both queries, and if you need to update it later, you will need to do so in both queries. While this is a simple example, if you had a series of more complex transformations, maintaining them in two different queries could easily result in mistakes and human error. This is where reference queries are important.

Reference queries allow you to use another query as the base of a new query. Using the previous example, you can apply the column rename transformations in one query and then create two new queries that reference the first to perform the subsequent operations for the sales and countries tables. Now, if you update anything in the first query, the dependent queries are automatically updated. This reduces the complexity and effort of maintaining queries, minimizing the risk of human error. It also increases the efficiency of Power BI, which can pipeline results from the first query as input to the dependent queries instead of repeating transformations multiple times across multiple queries. When importing very large data sets, efficient queries can be the difference between a few minutes and a few hours of importing data. Duplicate and reference queries require careful consideration when working in Power BI.
Identifying when efficiency and maintainability are needed is an important skill to develop as you progress in your career as a data analyst, and it can help you perform effectively in your role.

As you work with multiple data sources, you'll discover that the data is often disjointed and needs to be combined and transformed into a data model that is suitable for analysis. In this video, you'll explore how merge and append queries in Power BI can combine multiple data sources into single tables suitable for visualization and analysis in later stages of the data analysis process. It is common to encounter data that is broken down into multiple files or data sources. For example, sales data might be stored in one Excel file per month, or perhaps sales data was originally stored in Excel files but later moved to a SQL database. However, to effectively analyze this data, you need it contained in a single table in Power BI. Fortunately, the Power Query Editor provides the append queries feature, which allows you to append multiple sources into a single table.

Using the earlier example, let's say you have one Excel file containing sales for January, with the columns sales date, product name, and sales amount. You then have a SQL database containing a table of sales for February with the same columns as the Excel file. Using an append query, you can combine the data from these two sources into a single table containing sales for both January and February. But what happens if the columns are different? Suppose the SQL table contains an additional column named discount. When the append query executes, it inserts null values in the discount column for rows that originate from the Excel file. Append queries work well when the columns in the data sources are well aligned and the desired resulting table should match the format of the sources. However, you may encounter more complex scenarios requiring the merging of data from different sources; this is where merge queries come in.

Let's say you have a table of customers named Customers from a customer relationship management (CRM) system, and a table of sales orders named Sales from a SQL database. You want to prepare a single table containing the most common cities that orders are delivered to. To do this, you'll need to merge the tables from the two data sources using a merge query. To merge two tables, you need to tell the merge query which type of join to use; the join type informs Power BI how to merge the two tables. A join requires a common column between the two tables. In our example, the Sales table contains a unique customer ID, which is also present in the Customers table; this is known as the join key. Once the join key is determined, the join type must be chosen. Power BI supports the following join types: left outer, right outer, full outer, inner, left anti, and right anti.

Let's explore each join type and the way it combines data from multiple tables based on matching criteria. To understand the join types, picture two tables: one on the left side named Sales and one on the right side named Customers. The Sales table contains the columns sales ID, customer ID, and sales amount; the Customers table contains the columns customer ID, country, and name. The customer ID column in both tables acts as the join key. With a left outer join, the resulting table contains all rows and columns from the left table, merged with all matching rows and columns from the right table.
This results in a table with the columns sales ID, customer ID, sales amount, country, and name. If the Sales table has a customer ID that does not exist in the Customers table, the name and country columns for that row will contain null values. In a right outer join, the resulting table contains all rows and columns from the right table, merged with all matching rows and columns from the left table. This also results in a table with the columns sales ID, customer ID, sales amount, country, and name, but if the Sales table contains customer IDs that are not present in the Customers table, those rows are excluded from the results. A full outer join simply merges all rows and columns from both tables into the resulting table. If the Sales table contains rows that do not match the Customers table, null values are inserted for the country and name columns; if the Customers table contains rows that do not match the Sales table, null values are inserted for the sales ID and sales amount columns. In an inner join, the resulting table contains only the matching rows from both the left and right tables. A left anti join keeps rows from the left table that do not have matching rows in the right table; note that the result still includes columns from the right table, but since there is no match, every row will have null values in those columns. A right anti join keeps rows from the right table that do not have matching rows in the left table; again, the result still includes columns from the left table, with null values in each row. Merge and append queries are valuable tools in your data analysis toolkit: they allow you to combine tables from multiple data sources into a format that aids, rather than hinders, the data analysis process.

As you continue through the data analysis process, designing a schema to represent your data is a key step before diving into the analysis itself. This video will explore table relationships and how to identify appropriate keys for establishing relationships. A table relationship is how two tables are connected to each other. Let's say you have two tables, Sales and Products. The Sales table contains the columns sales ID, sales amount, and product ID; the Products table contains the columns product ID, product name, and product category. In the Products table, the product ID column is what's known as a primary key: each value in the column is unique, so if one row has the ID 11, no other row in that table will have that ID. A primary key therefore uniquely identifies a row in the table. In the Sales table, the product ID column is what's known as a foreign key. It is not the primary key of that table; instead, it establishes a relationship to the Products table. This means that each row in the Sales table is associated with a specific row in the Products table: if a row in the Sales table has the value 11 in the product ID column, it is associated with the row in the Products table whose primary key is 11. For primary and foreign keys, the whole number data type is most commonly used; however, there are scenarios where a non-numeric identifier may be used. For example, if you are analyzing country-based data, you could use the standard two-letter identifier for each country, such as US for the United States, DE for Germany, and so on.

Now that you know how to establish a relationship between two tables, the next important aspect is the cardinality of the relationship. In Power BI, there are three types of cardinality: one-to-one, many-to-one (or one-to-many), and many-to-many.
To explain these cardinalities, let's say you have two tables, table A and table B. A one-to-one relationship means that each row in table A is directly related to only one row in table B, and vice versa. For example, if table A contained countries and table B contained capital cities, the relationship would be one-to-one, as each country has only one capital and each capital belongs to only one country. A many-to-one relationship means that multiple rows in table A can be related to a single row in table B; the relationship from table B to table A is then a one-to-many relationship, that is, each row in table B is related to multiple rows in table A. Our earlier sales and products example was a many-to-one relationship: multiple rows in the Sales table are associated with one product in the Products table. A many-to-many relationship means that each row in table A is related to many rows in table B, and each row in table B is related to many rows in table A. For example, if you had a table of books and a table of authors, a book can be written by multiple authors and an author can write multiple books. Establishing relationships is an important aspect of building a schema for your data model; you will learn more about schemas and data modeling later. Table relationships are an important consideration when modeling your data in Power BI. Using incorrect relationships or cardinality can lead to wrong insights and results in the data analysis process. As a data analyst, it is your responsibility to ensure correctness in the data model so that a successful analysis outcome can be achieved.

Congratulations on completing the first part of the Microsoft PL-300 exam preparation and practice course, designed to help you achieve your PL-300 certification. You've discovered much about the PL-300 exam and honed your data preparation skills and knowledge within Microsoft Power BI. To ensure your success, let's recap some key takeaways and insights you've covered so far. You began with an overview of the course and how it will prepare you for your certification journey, exploring the syllabus, course structure, and helpful tips for success. You delved into all things Microsoft certification as part of your exam preparation, identifying the key knowledge and skills measured in this course's mock exam and the PL-300 exam, and learning how to plan your study time effectively. The steps to register for and schedule the proctored exam were outlined, offering a clear road map to taking the exam, and you discovered more about the administration of the PL-300 exam so you know what to expect. You explored testing strategies and the advantages of practice assessments and mock exams, and you had the opportunity to discuss exam preparation with your fellow learners.

Armed with more knowledge about the PL-300 exam, you moved on to reviewing exam content, focusing on data preparation in Microsoft Power BI. You began by revisiting the practicalities of getting data from various sources. You learned the importance of choosing the right data sources and were reminded of Power BI's extensive range of connectors. You were guided through connecting to an Excel data source and loading data via the Power Query Editor, and you explored configuring data source settings. You also explored the difference between local and shared data sets, the pros and cons of Import, DirectQuery, and Dual modes, and choosing different storage modes. You gained hands-on experience setting up and configuring a data set, reviewing the advanced query capabilities of Power Query, and using query parameters in Power Query to expand your toolkit.
You covered connecting to a data flow, recapping data flows and creating them in a workspace, and you explored the difference between data flows and Microsoft Dataverse, enriching your expertise. Then you focused on the critical task of profiling and cleaning data. You covered evaluating data, data statistics, and column properties, reviewing why data evaluation is crucial, Power Query's profiling capabilities, and different evaluation methods. Through an interactive activity, you practiced analyzing a data set for anomalies and statistical irregularities, preparing you for real-world scenarios as a Power BI data analyst. You also explored data inconsistencies, unexpected or null values, and data quality issues you may encounter as a Power BI data analyst, as well as resolving data import errors. Next, you explored transforming and loading data. You reviewed creating and transforming columns, understanding the importance of selecting appropriate column data types, and how to transform columns and create calculated columns in Power Query. You brushed up on shaping and transforming tables and applying query steps to shape the data. Exploring reference queries, you recapped when to use reference or duplicate queries, and you unpacked the differences between merge and append queries and explored the different types of joins. Finally, you reviewed how table relationships work, identifying appropriate keys for relationships and configuring data loading for queries in a Power BI project. You now have detailed insight into what taking the PL-300 exam entails and have boosted your skills and knowledge in data preparation with Power BI. That's not just good for the exam; it'll also contribute to your success in the world of data analytics.

Previously, you covered how to establish table relationships. Building on this, you will explore how to design a schema that contains facts and dimensions. When deciding on the data schema you plan to use for your analysis, the most common schema types are star and snowflake schemas. You may recall that in these schemas, data is broken down into fact and dimension tables. Fact tables represent a business process's measurements, metrics, or facts. They can contain many repeated values; for example, one product can appear in multiple rows, sold to different customers on different dates. These values are used to create aggregations during visualization. Dimension tables store contextual data or descriptive attributes about the facts; these tables are connected to the fact table via key columns, and you can use them to group or filter data in the fact table during visualization in Microsoft Power BI. In the context of an Adventure Works data set with sales and product tables, the Sales table is the fact table, as it contains transactional information about the sales process, and the Product table is the dimension table, as it contains the contextual information about the product sold in each sale. In the star schema, the most common data model, a single fact table is typically related to one or more dimension tables. The snowflake schema further normalizes the dimension tables; for example, the Product table is broken down into product category and product subcategory tables based on category ID and subcategory ID.

Now let's revisit how to create and configure a star schema in Power BI. Launch Power BI Desktop and load the data from the Excel workbook containing Adventure Works sales data. The data set contains four data tables: one fact table, the Sales table, and three dimension tables: Product, Region, and Salesperson.
Navigate to the Model view, where you can create and configure the data model and build a star schema. Once you load the data, Power BI auto-detects the relationships between the data tables based on the key columns; you can disable this function from Options and Settings if you want to create and control the relationships in your data model yourself. You can establish the relationship between the fact and dimension tables in two ways to build a star schema; remember, in a star schema, the fact table is at the center of the star. The first method is simply dragging the key column from one table onto the other. In the current data set, drag the product key column from the Product table and drop it on the product key column in the Sales table. If there are no duplicate values in the product key column of the Product table, Power BI automatically establishes a one-to-many relationship with a single cross-filter direction. Repeat the same process for the Region and Salesperson tables to relate these dimension tables to the Sales fact table. Now let's delete the relationships to explore the second way to build the star schema: right-click each connector line and select Delete to remove the relationship. Then select Manage Relationships from the Home ribbon. A Manage Relationships dialog box appears, where you can select either Autodetect or New. With Autodetect, Power BI identifies the key columns and establishes the relationships in your data, similar to when you load data into the Power BI data model. For the current exercise, select New. A Create Relationship dialog box opens; select the tables, cardinality, and cross-filter direction for each pair of data model tables, one at a time. Your star schema is now ready to use for your analysis and visualizations. In practice, in a star schema, dimension tables are typically positioned above the fact table to give the model a waterfall-like structure: the dimension tables are used for filtering the fact table, so the typical direction of the filter is like the flow of water down a waterfall. In this video, you explored how to build and configure a star schema from the Adventure Works data set. Data modeling is a key skill set that you need to master on your journey to becoming a successful Power BI analyst and succeeding in the Microsoft PL-300 exam.

Role-playing dimensions enable data to function dynamically and facilitate better-informed decision-making. This involves assuming the perspective of your data to play multiple roles and uncover insights that might otherwise remain hidden to the untrained eye. In this video, you'll recap role-playing dimensions and the USERELATIONSHIP function in Microsoft Power BI. In business intelligence, a role-playing dimension is a single dimension that can be used for different purposes in the same data model. Using an Adventure Works example, you might have a date dimension table that connects to various fact tables like sales, purchases, and inventory. This date dimension could play distinct roles: acting as the order date when examining sales data, the purchase date when working with purchases, or the inventory check date for inventory-related analyses. Previously, you encountered a practical scenario involving role-playing dimensions: a single sales table containing multiple date-related fields, such as order date, shipping date, and delivery date. In this case, the date dimension table in your model can be related to the sales fact table via multiple relationships to accommodate the different date roles, such as sales, shipping, and receipt dates.
However, remember that only one relationship can be active at a time, and the remaining relationships must be inactive. You can switch the active relationship manually from Manage Relationships in the Power BI Model view. Continuing with the previous example, you would import Adventure Works sales data into Power BI Desktop to implement the role-playing dimension and start building the relationships between the date dimension and the sales fact table. The date dimension table is the role-playing dimension in this scenario and is used for the entire analysis and visualization in Power BI.

In a real-world environment, you often need to analyze data and present information from a distinct perspective. For example, Adventure Works might need information about its sales values based on shipping or delivery dates, while the data model contains only one date dimension, which is role-playing. One way to achieve this is to duplicate the date dimension and rename it shipping date, although this is not a practical approach. Fortunately, Power BI's formula language, DAX, provides the solution with its USERELATIONSHIP function. Creating a measure with USERELATIONSHIP temporarily switches an inactive relationship to active.

Let's break down the DAX formula to create a measure that calculates sales values based on shipping date. The code defines a new measure, or calculation, called Total Sales Orders Shipped. In this formula, the CALCULATE function alters the filter context of the entire measure. Within CALCULATE, the SUM function sums the sales amount column of the Sales table. The default relationship between the Sales table and the Date table is based on the order date column, and each DAX calculation is based on the relationships between the tables. The USERELATIONSHIP function overrides this and establishes a temporary relationship between the date column of the Date table and the shipping date column of the Sales table; this inactive relationship becomes active only during the current calculation.

When using USERELATIONSHIP, there are some essential points to consider. You can only use USERELATIONSHIP within DAX functions that take a filter as an argument, for example CALCULATE, CALCULATETABLE, and TOTALYTD. When row-level security is defined for a data table, you cannot use USERELATIONSHIP; otherwise, Power BI will return an error. You must first define relationships in your data model, because USERELATIONSHIP uses existing relationships, and the columns used as arguments in the formula must be part of a relationship; if not, an error message will display on screen. You can nest up to 10 USERELATIONSHIP functions in a single expression. Lastly, in a one-to-one relationship, USERELATIONSHIP can only activate a relationship in one direction, meaning filter propagation will be in one direction only; to activate bidirectional filter propagation, you need to use two USERELATIONSHIP functions within the same expression.
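Putting that together, the measure just described might look like the following sketch (table and column names such as Sales[Sales Amount], 'Date'[Date], and Sales[Shipping Date] are assumed for illustration):

Total Sales Orders Shipped =
CALCULATE (
    SUM ( Sales[Sales Amount] ),  -- aggregate the sales amount
    USERELATIONSHIP ( 'Date'[Date], Sales[Shipping Date] )  -- activate the inactive relationship for this calculation only
)

Outside this measure, the model still uses the active order date relationship, so the same date slicer can drive order-date and shipping-date measures side by side.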
Mastering the creation of custom measures with the USERELATIONSHIP function and implementing role-playing dimensions are two methods you can use to handle inactive relationships between data tables. These skills will not only help you succeed in your Microsoft PL-300 exam but will also be valuable in practice as a Power BI data analyst.

By now, you have an idea of evaluation context and how it works in DAX calculations: all DAX calculations compute measures under row and filter context. CALCULATE, along with its companion CALCULATETABLE, is the only DAX function that can alter the filter context during a DAX calculation. In this video, you'll revise how to use CALCULATE to manipulate filters. At Adventure Works, the management team wants to analyze granular levels of sales data. For example, suppose the sales manager needs information about the sales of mountain bikes in Europe only, a product specialist is interested in the performance of a specific color of product that the company recently launched, and the United States country manager wants to filter the sales amount for a newly hired salesperson. All this granular information is easy to compute using DAX measures in Power BI. You can filter the entire sales measure for a specific product color, a particular region, a salesperson, and so on using CALCULATE; this changes the filter context of the measure from all data to the filtered arguments.

Let's examine the syntax of CALCULATE and how it impacts the filter context of the calculations. In a DAX formula that calculates the total sales of red products, the code uses the CALCULATE function and specifies a filter condition where the Product table's color column is equal to red. When you use this measure in a matrix or table visual, the filter over product color is added to the filter already placed by the matrix itself on the month column. In the first column, the month is the filter context, filtering sales for each month; the Total Sales measure computes the sales amount for each month for all products, while the new measure adds product color equals red as an additional filter context. In this syntax, a condition is used to apply the filter over product color; however, in the DAX engine, the filter arguments of CALCULATE are tables, so the same calculation can be achieved by a formula in which the DAX engine converts the shorter syntax of CALCULATE into a longer one.

Let's explore this behavior from another perspective. If you visualize total sales by color in a matrix, the filter context is filtering the product color. The presence of the ALL function in the longer expression means the outer filter over product color is ignored and replaced by the new filter introduced by CALCULATE. In the matrix, the sales values for red products are repeated in all the rows: for each row, the filter introduced by the matrix is the corresponding color, but the red-product measure imposes a new filter, forcing red to be visible. The new filter introduced by CALCULATE overrides the existing filter, so the sales values are computed within a filter context that includes only red products.

Now let's say the European sales manager of Adventure Works needs the sales amount of red products in Europe only. You need to introduce another filter argument within the CALCULATE expression. This expression applies two filters to the overall filter context of the calculation: the product color filter to include only red products, as in the previous example, and the region group as an additional filter to specify Europe as the region. The measure presents the sales of red products in Europe for the various months. Likewise, you can perform further granular analysis to compute the sales amount for individual categories, products, salespersons, resellers of the company, and so on. From these examples, you have learned that CALCULATE modifies the outer filter context by applying new filters, either by overriding the existing filters or by combining new filters with the existing ones.
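As a sketch of the three variants just discussed, assuming a [Total Sales] measure and Product and Region tables named as below:

-- Shorthand syntax: the filter argument is written as a condition
Red Sales = CALCULATE ( [Total Sales], 'Product'[Color] = "Red" )

-- The longer syntax the engine expands it to; ALL removes any outer color filter
Red Sales Expanded =
CALCULATE (
    [Total Sales],
    FILTER ( ALL ( 'Product'[Color] ), 'Product'[Color] = "Red" )
)

-- Two filter arguments combine: red products sold in Europe
Red Sales Europe =
CALCULATE ( [Total Sales], 'Product'[Color] = "Red", 'Region'[Group] = "Europe" )

The region column here is a stand-in for however your model names the sales territory group; the pattern is what matters.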
The evaluation context and the CALCULATE function are the foundations of the DAX language, making them fundamental skills any Power BI analyst should master, both to pass the PL-300 exam and to handle real-world analytical challenges.

Previously, you learned that multiple data tables constitute a data model, for instance a star or snowflake schema, and that relationships exist between the data tables. Why do these relationships exist? A model relationship propagates filters applied to one column of a table to another model table.
A filter can only propagate if there is a relationship path to follow, which may involve multiple model tables. This video will cover the cardinality types and cross-filter directions that exist between data tables in Microsoft Power BI. In a model relationship, two columns are involved from two different tables: one on the 'from' side and one on the 'to' side of the relationship, and both columns must be of the same data type. At its core, cardinality defines the nature of the connection between two data tables: it tells you how many values in one table correspond to how many values in another. Each relationship has a 'from' side and a 'to' side; the column on the 'to' (one) side of the relationship must contain unique values, while the 'from' (many) side column can have duplicate values. Power BI supports four types of cardinality: one-to-one, one-to-many, many-to-one, and many-to-many. When you establish relationships between tables by dragging the key column from one table to another, Power BI automatically detects and sets the cardinality type by sending queries to investigate which columns contain unique values. However, sometimes Power BI's auto-detected cardinality is not correct, so it is recommended to check the cardinality type before starting analysis and visualization.

Now let's start by reviewing the one-to-one relationship. A one-to-one cardinality means both related columns contain unique values; this is not a common type of relationship in data modeling. Consider an example where Adventure Works has two dimension tables, Product and Product Category. Each table has a SKU (stock keeping unit) column, and all values in these columns are unique. A one-to-one relationship exists between these two tables based on the SKU column because it is common to both. This means that when a SKU filters the Product Category table, the Product table is filtered for products associated with that SKU.

Next are the one-to-many and many-to-one cardinality types. These two are essentially the same relationship viewed from opposite sides: each value in one table's column is related to multiple values in the other. This is also the most common type of cardinality in Power BI data models; it enables slicing and dicing of data, allowing for drill-down analyses that uncover granular insights. For example, in an Adventure Works data set, the Sales table (the fact table) is related to the Region table (a dimension table). Both tables have a sales territory key column, which establishes a one-to-many relationship between them. In the Region table, the sales territory key field contains a unique value in each row, as each region exists only once in the table; each region can have multiple sales, so its sales territory key may be repeated in multiple rows of the Sales table.

A many-to-many relationship means both related columns can contain duplicate values. This type of relationship is used when designing a complex data model, typically to relate two dimension tables or two fact tables. For example, consider the relationship between a financial corporation's customers and the various financial products they hold: a customer can hold many financial products, and each financial product can be held by many customers. A many-to-many relationship supports the duplicate customer ID data in both tables. Now that you've covered the cardinality types in Power BI, let's delve into how these cardinality types influence the cross-filter direction.
You may recall that cross-filter direction refers to the direction of filter propagation between two related model tables: it dictates how data in one table influences the data in another, enabling relational analysis without resorting to complex queries or manual data consolidation. A single cross-filter direction means the filter propagates unidirectionally, from one table to the other within the relationship, while both means the filter can propagate in both directions; a relationship that filters in both directions is commonly described as bidirectional. The cross-filter direction depends on the cardinality type. One-to-one relationships support only the both cross-filter direction. One-to-many and many-to-one relationships support both types of cross-filter direction. Many-to-many relationships can have a single cross-filter direction, where table A filters table B or table B filters table A, or both of these single directions simultaneously.

Although you can set and configure the cross-filter direction in Power BI Desktop's Model view, in real-world scenarios it's often necessary to answer business questions that require changing the direction of filter propagation, and manually adjusting the cross-filter direction for each analytical requirement is not practically feasible. DAX provides the solution with its CROSSFILTER function. With CROSSFILTER, you can change the cross-filter direction for a specific measure while maintaining the original relationship settings. The CROSSFILTER function takes three arguments. In the first argument, you reference an existing column in the first table, usually representing the many side of the relationship to be used. In the second argument, you reference an existing column in the second table, this time usually representing the one side of the relationship. Finally, the third argument represents the cross-filter direction to be used, which you can define as None, OneWay (the equivalent of single), or Both.
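A minimal sketch of the pattern, assuming a Sales fact table and a Customer dimension related on a CustomerKey column (names are illustrative):

Customers With Sales =
CALCULATE (
    DISTINCTCOUNT ( Sales[CustomerKey] ),  -- expression to evaluate
    CROSSFILTER ( Sales[CustomerKey], Customer[CustomerKey], Both )  -- many side, one side, direction
)

For the duration of this measure only, filters can flow from the Sales table back to Customer, without changing the relationship defined in the model.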
Both cardinality and cross-filter direction are key analytical concepts in data modeling and analysis. As businesses continue to rely on data-driven decision-making, mastering key skills in data modeling and DAX will set you on the path to becoming a capable and influential analyst.

Suppose you have just imported a data set for analysis and, upon careful investigation, you realize that some information required to address the business questions is missing from the data set. Creating calculated columns to add the missing information to your data tables is a concept you've learned before, and it is covered briefly in this video. Calculated columns are custom data columns created within a Microsoft Power BI data model using the Data Analysis Expressions (DAX) language. Unlike standard columns, which store data directly from imported data sets, calculated columns contain formulas that derive values from existing data. Once you add a calculated column to your data model by defining a DAX expression, you can use this column in any report and visualization, just like a standard column. Calculated columns are stored at the data model level and therefore consume memory, so be careful not to use too many of them. The standard columns of a data model are populated by the imported data, whereas you need to define a DAX expression to populate a calculated column from the existing data. The data can be taken from multiple columns and tables of the data model, which you reference in the DAX script. Remember, calculated columns can be created from the Report view, Data view, or Model view of Power BI Desktop and are based on data you have already loaded into your data model.

For instance, if you have a customer data table with two distinct columns containing the first and last names of the customer, and you want to combine these into a single column containing the customer's full name, you can use a DAX expression to concatenate the two columns into a single calculated column. One of the most common examples of populating a data table with calculated columns is creating a date dimension table; previously, you populated a date table with various calculated columns like year, month name, month number, and so on.

Now let's briefly recap the DAX syntax for defining calculated columns. The syntax starts with the name of your calculated column, followed by an equals operator; then you write the names of the tables to be referenced in single quotation marks and their respective column names in square brackets, together with any relevant arithmetic operator or other expression. For example, at Adventure Works you are creating a sales report based on geographical information, and in the Geography table both city and state information are available in separate columns. Displaying only the city name in a visualization might create ambiguity, because the same city name can occur in multiple regions of the globe. You can solve this by creating a calculated column: the new column name, City and State, followed by an equals operator, the table name in single quotation marks, the city column in square brackets, the concatenation operator, and finally the state column in square brackets. You also learned that if you want to include data from two different tables of the model, you first need to make sure the tables have appropriate relationships, and secondly you need to use the RELATED DAX function in your formula.
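Sketched out, the two patterns just described could look like this (assuming the dimension is named Geography, and that the second column lives in a Sales table related to Product):

-- Concatenation in a calculated column, with a separator added for readability
City and State = 'Geography'[City] & ", " & 'Geography'[State]

-- Pulling a value from a related table into a calculated column on the Sales table
Product Category = RELATED ( 'Product'[Category] )

RELATED walks the existing many-to-one relationship from the row's table to the related dimension, which is why the relationship must be defined before the formula will work.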
Let's now recap the benefits of using calculated columns. The first benefit is enhanced data transformation: calculated columns help you transform raw data into meaningful information, for instance converting currency values, calculating percentages, and so on. The next benefit is dynamic and interactive reports: you can use calculated columns to introduce slicers and filters that make your report interactive and dynamic. Another benefit is consistency: by embedding calculated columns within the data model, you ensure consistency across your reports, and changes in the source data are reflected instantly in calculated columns, reducing the risk of errors. The last benefit is complex analysis: whether it's time-based calculations, statistical analysis, or forecasting, calculated columns backed by the power of DAX allow you to tackle intricate data challenges. Calculated columns are indispensable tools in Power BI, offering a means to shape and analyze data effectively. They enhance your data model by introducing new information based on an already loaded data set, revealing the hidden insights in your data. The key lies in mastering the art of crafting calculated columns using DAX to extract valuable information.

As a data analyst, you receive data from different sources, you clean and transform that data, and you build an effective data model for accurate and effective analysis. To ensure an accurate and effective analysis, you need to put on your DAX magnifiers to see the hidden information in your data. Data Analysis Expressions (DAX) can be used to build calculated tables, calculated columns, and measures. Measures are of special significance, as they do not take space in your Power BI memory and are not stored in the model; measures are executed dynamically and can thereby integrate any filter context you apply. Measures in Power BI are calculations that summarize, aggregate, or perform complex calculations on data, ranging from simple sums to intricate analyses. With measures, you can go beyond basic data visualization: they allow you to derive insights, make data-backed decisions, and unearth patterns and trends within your data set that are not noticeable at first glance.

You can create measures in Power BI in two ways: quick measures, and custom measures using DAX. In a previous lesson, you covered how to create quick measures in Power BI for Adventure Works' time-based analysis. To recap briefly, Power BI supports the following types of calculations in quick measures. The average per category calculation lets you create the average, variance, min, and max for each category, and you can apply some fundamental filters in this category of calculations. Power BI also allows you to create basic time intelligence calculations like year-to-date (YTD), month-to-date (MTD), and year-over-year (YoY). With the totals category, you can calculate the running total or the total for each category. Basic mathematical operations like addition and subtraction can be used to create quick measures, and simple concatenation can be done for your measures as well. Although you can create a handful of quick measures in Power BI to get some quick insights, the real analytical power of measures lies in DAX logic. DAX allows you to write complex logic in the form of formulas and expressions. Custom measures are user-defined calculations or metrics created using DAX to generate insights about the data through aggregations, calculations, time intelligence functions, and so on.

For example, suppose Adventure Works needs to analyze its sales and profit data for each product category and sales region. You can compute DAX measures to calculate total sales, total profit, and profit margin percentage separately. These measures can be visualized in your report, and you can integrate any filter the company needs to evaluate the total profit and profit margin for each product category and region. As mentioned earlier, measures compute their values on the fly: for example, when you apply a filter for the bikes category, the profit measure uses product category bikes as the filter during the calculation and displays the profit values for the bikes category only. This way, you can help Adventure Works generate the insights it needs.

Let's explore the DAX syntax to create simple measures for sales, profit, and profit margin that address Adventure Works' needs. For sales, create a measure called Sales, then add the SUMX function after the equals operator; reference the Sales table, and in the expression multiply the quantity column by the unit price column. To calculate profit, create a measure called Profit and, after the equals operator, subtract the Total Cost measure from the Total Sales measure. You can use measures inside other measures: in the Profit measure, both Total Sales and Total Cost are pre-calculated measures used to compute the profit. Next, for profit margin, create a measure called Profit Margin and, after the equals operator, divide the previously created Profit measure by the Total Sales measure. Make sure to format the measure as a percentage so it displays the percentage profit in your visualization.
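In DAX, those three measures might be sketched like this (Total Cost is assumed to be an existing measure in the model):

-- Row-by-row product of quantity and unit price, summed over the Sales table
Total Sales = SUMX ( Sales, Sales[Quantity] * Sales[Unit Price] )

-- Measures can reference other measures
Total Profit = [Total Sales] - [Total Cost]

-- DIVIDE avoids divide-by-zero errors; format the result as a percentage
Profit Margin = DIVIDE ( [Total Profit], [Total Sales] )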
Remember to format each measure appropriately: for example, profit and sales measures can be formatted as currency with two decimal places, while profit margin measures need to be formatted as a percentage with two decimal places. Measures created with DAX provide a way to summarize, calculate, and compare data across various dimensions based on specific criteria and business requirements; they serve as a microscope for discovering the hidden message in your data. Mastering DAX is a key skill for any data analyst, and you will receive a considerable number of questions about DAX in your Microsoft PL-300 exam.

Time is the dimension that underpins virtually all data analysis, and for this reason, time intelligence functions hold a position of paramount importance. Time intelligence functions are specialized functions designed to work with date and time data, enabling users to perform advanced temporal analysis and gain deeper insight into historical data. Previously, you covered the theoretical foundations of time intelligence functions and gained significant hands-on experience creating them to summarize and compare data over time. In this video, you'll recap the important benefits of time intelligence functions and how you can implement them to aggregate and compare data values.

Let's start with the benefits of time intelligence functions. Temporal comparison functions make it easy to compare data across different time periods: you can create measures using DAX to compute year-over-year or quarter-over-quarter trends, which allow you to track growth, seasonality, and performance. The next benefit is that they allow you to compute moving averages. Moving averages are a valuable tool for smoothing out fluctuations in data and identifying trends and patterns over time, which is particularly important in scenarios where noisy or erratic data is a challenge; with time intelligence DAX functions, you can compute moving averages to enhance your data model and analysis. Time intelligence functions also facilitate the creation of cumulative totals, which help in understanding the progression of values over time; these measures are crucial for tracking key metrics such as cumulative revenue, profit, or customer acquisition. They likewise facilitate period-to-date calculations, simplifying the process of calculating values from the beginning of a time period to a specific date; this is a valuable set of DAX measures for computing metrics like year-to-date and month-to-date values. Finally, parallel period functions make it straightforward to compare data with previous or future periods, which is vital for identifying trends and seasonality and making data-driven decisions.

With the benefits of time intelligence functions refreshed, it's time to recap a few important time intelligence DAX functions. The first is TOTALYTD. Let's say, for example, that Adventure Works wants to track the year-to-date sales performance of its various product categories. You can calculate year-to-date sales from the Sales table's total sales column. The DAX expression is a measure called Sales Year-To-Date, followed by the TOTALYTD function after the equals operator. In the first parameter, reference the total sales column from the Sales table and aggregate the values using SUM; in the second parameter, reference the order date column from the Sales table. The date in square brackets represents the date column of the date hierarchy; Power BI IntelliSense offers other fields of the date hierarchy, such as year or month, but to create time intelligence measures you need to select the date.
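A sketch of that measure, assuming the column names used throughout these examples:

-- Year-to-date total, driven by the date column of the order date hierarchy
Sales Year-To-Date =
TOTALYTD ( SUM ( Sales[Total Sales] ), Sales[Order Date].[Date] )

If the model has a dedicated date dimension marked as a date table, you would reference its date column (for example 'Date'[Date]) instead of the auto date/time hierarchy.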
One of the main product categories of Adventure Works is bikes, and the company wants to evaluate the sales trends of bikes over the summer months. You can use the DATESBETWEEN time intelligence function to compute a Summer Sales measure; after the measure is executed, you can add the bikes category as an additional filter to answer management's question. The DAX code for this measure is a measure called Summer Sales, followed by the CALCULATE and SUM functions to compute the values of the total sales column of the Sales table; then insert the DATESBETWEEN function, which takes the order date column from the Sales table as a date reference, followed by a starting date and another date referencing the end date.

Now let's say the marketing executive of Adventure Works wants to evaluate the impact of her recent marketing campaign. The campaign was originally planned for three months, and after the first month its impact should be evaluated. You can create a time intelligence measure using the DATESINPERIOD function to compute last month's sales: create a measure called Last Month Sales, followed by the equals operator and the CALCULATE and SUM functions to compute the values of the total sales column of the Sales table; then add the DATESINPERIOD function, which takes the order date column from the Sales table as a date reference, followed by the TODAY function, which returns today's date as the starting point, then -30 as the number of intervals (the negative sign indicates intervals back in time), and finally DAY as the unit of time.

The Adventure Works CEO wants a side-by-side comparison of the company's sales for the current and previous year, which will provide her with insights into necessary improvements to sales and marketing strategies. You can create a measure using the SAMEPERIODLASTYEAR time intelligence function as follows: a measure called Revenue Previous Year, then define a variable with VAR for the previous year's revenue, followed by the equals operator and CALCULATE, which computes the previous year's revenue by filtering the Revenue measure based on SAMEPERIODLASTYEAR; finally, the RETURN keyword outputs the value of the entire expression.

A sales forecast is a vital component of an analysis, and the Adventure Works sales executive wants a report based on historical sales values that indicates the future growth of the company in terms of revenue and profitability. You can use the DATEADD function in DAX either to compare the current period's sales with a previous period or to project a future period; period here refers to a year, quarter, or month. For instance, to compare the current month's sales with the previous one, the DAX script is a measure called Sales Comparison, followed by an equals sign; then CALCULATE computes the measure by filtering the Revenue measure with DATEADD, which takes the order date column from the Sales table as a date reference, -1 as the number of intervals (the negative sign indicates intervals back in time), and MONTH as the unit of time. You can modify the code to look at other periods by changing MONTH to another unit of time, such as YEAR or QUARTER.
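Sketches of the four measures just described (the summer boundary dates and the Revenue measure are assumed for illustration):

-- Sales between two fixed dates; adjust the boundaries to your definition of summer
Summer Sales =
CALCULATE (
    SUM ( Sales[Total Sales] ),
    DATESBETWEEN ( Sales[Order Date].[Date], DATE ( 2023, 6, 1 ), DATE ( 2023, 8, 31 ) )
)

-- Rolling 30 days ending today; -30 counts backwards from TODAY()
Last Month Sales =
CALCULATE (
    SUM ( Sales[Total Sales] ),
    DATESINPERIOD ( Sales[Order Date].[Date], TODAY (), -30, DAY )
)

-- Same period in the prior year, held in a variable and returned
Revenue Previous Year =
VAR PreviousYearRevenue =
    CALCULATE ( [Revenue], SAMEPERIODLASTYEAR ( Sales[Order Date].[Date] ) )
RETURN
    PreviousYearRevenue

-- Shift the evaluation period back one month; change MONTH to QUARTER or YEAR as needed
Sales Comparison =
CALCULATE ( [Revenue], DATEADD ( Sales[Order Date].[Date], -1, MONTH ) )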
Time intelligence DAX functions in Power BI are indispensable for analyzing historical trends, forecasting future outcomes, and understanding the impact of time on your data. These measures uncover the insights hidden in your raw data, and you need to master this skill to excel as a Power BI modeler, pass your Microsoft PL-300 exam, and become a certified Power BI analyst.

As data-driven businesses evolve, so do business analytical tools. Microsoft Power BI stands out as a formidable business intelligence ecosystem, offering profound insights through its rich array of features. Central to the effectiveness of Power BI are measures, which serve as building blocks for data calculations and visualizations. Previously, you covered measures in detail; in this video, you'll recap the three main types of measures with scenarios. Measures are essential for performing quantitative analysis and deriving meaningful insights from data. They provide a way to summarize, calculate, and compare data across various dimensions based on specific criteria and business requirements. Measures can be categorized into three types: additive, semi-additive, and non-additive. Let's recap each of these types.

Additive measures are the workhorses of data analysis and provide the easiest summation. They behave as you expect: they can be summed or aggregated across any dimension without losing their meaning. Adventure Works has a sales analysis report that displays the sales amount and quantity sold for individual transactions, and each transaction is tracked with a specific customer, region, product category, and date. As a data analyst, you can create simple additive measures to sum these attributes across all given dimensions, helping the Adventure Works team visualize total sales and total quantity by product category, region, salesperson, and time.

The next type is semi-additive measures. These introduce a layer of complexity: they can be summed across some dimensions but not all, and the crux of the matter is time. Think of inventory on hand as a simple example: while it is meaningful to sum inventory by product or warehouse, it makes no sense to sum it over time. Semi-additive measures are often seen in scenarios where time plays a crucial role. You can handle these using DAX in Power BI by specifying which dimensions are suitable for summation and which are not; this flexibility makes it possible to create insightful reports while leveraging the power of DAX. Consider an inventory balance example: if a warehouse has 35 mountain bikes in stock at the end of September and 62 mountain bikes at the end of October, it is not accurate to say the warehouse had 97 mountain bikes over the two months together. You handle these measures using DAX functions like LASTDATE, LASTNONBLANK, and others you'll review later, as sketched after this section.

Finally, let's cover non-additive measures, which lead you into advanced analytics. Non-additive measures defy straightforward summation across any dimension. Consider the profit margin measure: while it is tempting to sum profit margin across products or time periods, the results make no sense, because you cannot add percentages in this manner. You need to perform more deliberate calculations to handle non-additive measures like percentages or ratios and produce meaningful results. DAX functions like AVERAGEX, SUMX, and DIVIDE provide the toolkit for working with non-additive data, allowing you to craft sophisticated calculations that provide valuable insights.
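For instance, a minimal sketch of the semi-additive inventory pattern (table and column names assumed):

-- Units on hand at the last date of whatever period is in the filter context,
-- rather than a meaningless sum across all dates
Units On Hand =
CALCULATE ( SUM ( Inventory[Units] ), LASTDATE ( 'Date'[Date] ) )

Summed by warehouse or product, this behaves additively; across months, it reports the closing balance instead of adding September's and October's stock together.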
As a data analyst, you import data from disparate sources into your data model. The imported data, however, may not contain the information you need to visualize. The key to any analytical work is to reveal hidden insights, trends, and opportunities, and you may need to add tables to your data model to accomplish this. In this video, you'll explore the types of calculated tables and scenarios where creating these tables is necessary. At Adventure Works, the executive management team needs answers to specific business questions based on a specific data set. After careful investigation, you realize the information required can be visualized based on the provided data set, but it may require more time and resources. A quick way to accomplish the task is to create additional tables in the data model using DAX calculations. Say, for example, the sales table contains several columns but you only need to present the summary; or the date table is missing from the data model and you need to perform time-based calculations; or you want to perform some analysis but keep the original table intact for other analytical needs. All of these are scenarios where you must create calculated tables. Previously, you've learned that cloned tables are an exact copy of an existing table in the data model. Clone tables are important when you need to manipulate data without affecting the original table. For example, Adventure Works wants to analyze sales data without altering the original sales table, as they want to keep it as a reference. You can simply create a clone version of the sales table by writing the following DAX expression: Clone Table Name = ALL(Original Table Name), or more specifically, Sales Cloned = ALL(Sales). You can also create calculated tables using DAX expressions that take data from multiple sources. Some examples of calculated tables include: combining specific data fields from the sales and product tables to compare various product categories and their associated sales values; normalizing a dimension table, for instance when the product table contains category and subcategory information you need to separate from the product table, which you typically do by creating a snowflake dimension; and creating a common date dimension table for a data model using DAX to perform advanced time intelligence calculations. The last example of a calculated table is combining two tables with the same structure while keeping the original tables unaltered.
For example, suppose you received two different tables with the same structure for Adventure Works customers, one for Eastern States customers and one for Western States customers, and you need to combine them into a single customers table. You can also use a measure to create calculated tables in Power BI. For example, consider the scenario where you've created a measure, Sales, for Adventure Works; this measure displays all sales across countries. You can use this measure to create a calculated table displaying the individual sales for each country using the following DAX expression: Country Sales is the name of the new calculated table; Sales in single quotes is the name of the original sales table; Sales in square brackets is the DAX measure used to create the calculated table; and Total Sales in double quotes is the name of a new column added to the calculated table. Creating calculated tables from pre-calculated measures is especially useful when you want to create a summary table from large data sets, or when you want to create a table with data that does not exist in the original tables. This can enhance data analysis and visualization capabilities in Power BI. Now let's explore the syntax of a few common DAX table functions. You can use the ADDCOLUMNS function to add calculated columns to a given table or table expression. Here is the syntax for using the ADDCOLUMNS function: type ADDCOLUMNS and, within the parentheses, specify the table name from which you want to retrieve data; follow this with the name of the new column enclosed in double quotes, and then provide the DAX expression for the calculation. You can add more column names and expressions as needed, but these additional pairs are optional. The SUMMARIZE function returns a summary table for the requested totals over a set of groups. The DAX syntax for SUMMARIZE is as follows: type SUMMARIZE and, inside the parentheses, first input the name of the table you wish to summarize; next, include the columns to group by. You can also add new column names in double quotes, followed by their respective DAX expressions for calculated values; adding these additional columns and expressions is not mandatory, but can be done based on your data analysis requirements. FILTERS returns the values that are directly applied as filters to a column: with FILTERS, inside the parentheses, simply specify the name of the column for which you want to retrieve the filters currently applied in the context. TOPN returns the top N rows of the specified table: for TOPN, within the parentheses, start by specifying the number of top items to return, follow this with the name of the table from which to retrieve these top items, and conclude by indicating the column to sort by and, optionally, the order of sorting, ascending or descending. And lastly, UNION creates a union, or joined table, from a pair of tables: when using UNION, inside the parentheses, list the tables you wish to combine, ensuring each table name is separated by a comma. The tables should have the same number of columns, and corresponding columns should have compatible data types. By using DAX to generate calculated tables, you can combine data from multiple tables into a single table, which opens a whole new door of analysis. In practice, you will encounter situations where creating calculated tables is the only solution to certain data challenges. The skills you've gained will help you tackle these real-world analytical tasks efficiently.
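Collecting those patterns, here is a minimal sketch of the calculated tables and table functions just described. The table names 'Eastern Customers' and 'Western Customers', the Sales[Country] and 'Product'[Category] columns, and the [Sales] measure are assumptions for illustration; the TOPN example also assumes a relationship between the sales and product tables.

-- Clone of the sales table, unaffected by report filters
Sales Cloned = ALL ( Sales )

-- Combine two same-structure customer tables into one
Customers = UNION ( 'Eastern Customers', 'Western Customers' )

-- Summary table built from a measure: one row per country with its total sales
Country Sales =
ADDCOLUMNS ( VALUES ( Sales[Country] ), "Total Sales", [Sales] )

-- Grouped summary with SUMMARIZE, then the five best-selling categories with TOPN
Top Categories =
TOPN (
    5,
    SUMMARIZE ( Sales, 'Product'[Category], "Category Sales", SUM ( Sales[Total Sales] ) ),
    [Category Sales], DESC
)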
Data is often like a complex puzzle, with pieces scattered across various dimensions. Microsoft Power BI offers a way to unravel this mystery by creating a data hierarchy. Hierarchies provide a structured way to organize and visualize data, allowing users to uncover hidden insights and tell a compelling story. Adventure Works, a multinational company, sells its products across the globe. The product department heads need not only an overview of the sales but also a deeper understanding of the location of customers and the category and subcategory of products sold. You can provide this information by creating a hierarchical visualization of the data. Power BI provides a way to display information where managers can drill down to view the granular details about customers and products. In Power BI, a data hierarchy comprises interconnected fields from the data set, organized to present data elements in ranked order. It represents a structured relationship between data attributes, typically organized from an overview level down to the most granular. The hierarchical structure simplifies data exploration and analysis by allowing users to focus on specific aspects of the data at different levels. For instance, in a sales data set you might have a hierarchy that starts with year, drills down to quarter, then month, and finally day; in certain cases you can also drill down to hourly details. Product, geography, and organizational hierarchies are some other examples of data hierarchies in Power BI. In a hierarchical structure, the first level, sometimes called the parent level, is ranked above the other, sometimes referred to as the child level. This way, report users can drill down in order from the parent level, presenting the highest level of information, to the lower levels. Power BI allows a maximum of five levels to be added to a hierarchy. Using a hierarchical structure in your visualization enhances the user's experience of understanding the data and provides a more comprehensive analysis. Common visualizations that can be used to visualize hierarchies include bar or column charts, line charts, heat maps, and map visuals. Power BI provides several options for using a hierarchy in visualizations. For example, you can enable inline hierarchy labels to sort data by hierarchy levels. You can use the PATH DAX function to add a column for the entire path, which is important when you are working with an organizational hierarchy. You can also create DAX calculations to determine the path length of the hierarchy, which helps you in determining the shortest and the longest paths.
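As a minimal sketch of those path functions, the following calculated columns assume a hypothetical Employee table with EmployeeKey and ManagerKey columns; the names are illustrative only.

-- Full path from the top of the organization down to each employee, for example "1|4|12"
Hierarchy Path = PATH ( Employee[EmployeeKey], Employee[ManagerKey] )

-- Depth of each employee in the hierarchy; comparing values reveals the shortest and longest paths
Hierarchy Depth = PATHLENGTH ( Employee[Hierarchy Path] )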
Now let's explore how you can create a data hierarchy in Power BI to help Adventure Works analyze granular data. Launch Power BI Desktop and load the data from the Excel workbook containing Adventure Works sales data. The data set contains two data tables: a fact internet sales table and a geography dimension table. The geography dimension table of the model contains geographical information, so it is advisable to generate a geographical hierarchy. The first step is to format the location-based data with an appropriate data category. To do this, select the country field and then select Country from the data category drop-down list. Now format state province name, city, and postal code as State or Province, City, and Postal Code. A globe icon appears before each field name, which tells Power BI that this is geographical information. Let's visualize the sales data by geography in the report view of Power BI Desktop. To do this, select the column chart from the visualizations pane and bring the sales amount field from the sales table to the y-axis well of the visual. On the x-axis, a geographical hierarchy is needed to display the sales data at various levels of location, so bring the country, state or province, and city fields to the x-axis in that order. A set of arrows appears in the top right corner of the visual, indicating the drill-down functionality; to turn on the drill down, select the second down arrow. If you hover the cursor over any data point, for example the United States, a drill-down icon displays on the tooltip. To go to the next level of the hierarchy, select the drill-down icon; in our example, the next hierarchy level is states. From here, you can either drill up or down to the next level. Alternatively, you can create a hierarchy in the data pane. Select the country field from the data pane and select More Options, which is represented by three dots. A drop-down list appears, where you select Create Hierarchy. A new country hierarchy field appears in the geography table, with country as the highest level of the hierarchy. You can now add related fields to the newly created hierarchy, one at a time. To do this, select state or province, and from the drop-down options select Add to Hierarchy. Next, you need to select the hierarchy where you want to add the field; in the current project, there is only one hierarchy available, the country hierarchy. Select the country hierarchy, and the field is added as the second level. Repeat the process for city and postal code. You can test the country hierarchy by creating a new visual. Remember to format your reports using appropriate font styles and colors. Data hierarchies are indispensable tools for effective and granular data analysis and reporting in Power BI. They provide structure and context to your data, making it easier to navigate and to derive trends, and they let your audience gain a deeper understanding of the information at hand. In fast-paced analytics, where every business is turning into a data-driven organization, performance is everything. Businesses rely on business analytics tools such as Microsoft Power BI to turn vast amounts of data into actionable insights. But what happens when too many users interact with your reports and you need to optimize the speed and efficiency of your reports and dashboards? The Performance Analyzer helps you evaluate the performance of the various elements of your Power BI reports and dashboards. Adventure Works uses Power BI as a business intelligence tool to create stunning reports and visualizations. However, as the data sets grow with the growth of the company and reports become more complex, there is a need to make sure the reports perform optimally. You can use Power BI's Performance Analyzer to evaluate the performance of individual report elements, such as visuals and DAX measures. You may recall that the Performance Analyzer is a built-in tool of Power BI that allows users to diagnose and optimize the performance of their reports and dashboards. It provides insights into query execution time, data model performance, and visual rendering, enabling analysts to pinpoint bottlenecks and fine-tune their creative work. Slow-responding reports and dashboards hinder productivity and may lead to customer dissatisfaction; with the Performance Analyzer, you can identify and rectify slow-performing report components. Not only is speed critical, but efficiency also matters: by identifying and optimizing inefficient elements of your reports, you can reduce resource consumption and enhance the user experience. A healthy data model is the foundation of your analytical work, and the Performance Analyzer offers insights into your data model
performance, helping you to maintain and enhance it. The tool does not stop at query diagnostics; it also helps to analyze visual rendering. This means you can identify problematic, slow-rendering visuals and optimize them for faster loading. Now let's review how to use the Performance Analyzer. You need to launch your Power BI report and access the Performance Analyzer from the View ribbon of the report view. Upon selection, the Performance Analyzer displays on the right side of the report canvas. The Performance Analyzer records the processing time required to update or refresh each report element. For instance, when a user interacts with a slicer to modify a visual, a query is sent to the underlying data model and the visuals are updated according to the interaction. You need to select Start Recording to start recording with the Performance Analyzer. The Performance Analyzer inspects and collects performance measures in real time each time you interact with a report element, and displays the performance results in its pane. Once you finish recording, select Stop, and the Performance Analyzer will display information about queries, data models, and visuals in a user-friendly interface. The information log contains the time spent completing the following tasks. DAX query: if your report has DAX calculations, the duration between the query being sent to the data model and the results being retrieved is displayed in the pane. Visual display: the time needed by a visual to display on the report canvas, which also includes the time to retrieve web data. Other: the time the visual requires for preparing queries, waiting for other visuals to complete, or performing other background processing. Evaluated parameters: if your report visual contains field parameters, the time spent on these will be displayed in this category (this is in preview mode). The Performance Analyzer records durations in milliseconds, and the values indicate the difference between the start and end of each operation. Once you stop the recording, you can save the results to your local computer. Now you can identify areas that need optimization and make the necessary adjustments to your DAX logic, visual elements, and data model to improve overall performance. Having reviewed how to use the Performance Analyzer, let's briefly explore some of its real-life applications. When working with large data sets, the Performance Analyzer helps you optimize your reports to ensure they remain responsive. In the case of complex data models, the tool assists you in maintaining efficient performance. In addition, you can use the Performance Analyzer to fine-tune reports, visual elements, and queries for faster performance when you have many report users. The Performance Analyzer in Power BI is your handy tool for faster and more efficient, yet visually appealing, reports and dashboards. To succeed both in the Microsoft PL-300 exam and as an efficient data analyst, you need to master the skill of diagnosing issues through the Performance Analyzer and optimizing your reports accordingly. In the dynamic landscape of data, the sheer volume of data itself is not a threat to meaningful analysis; the key lies in how you handle the data, transform it, and create visually appealing and analytically insightful reports. But often, the masterpiece you put so much effort into creating doesn't perform according to expectations, due to the slow responsiveness of visuals and queries. This highlights the significance of performance optimization, which is just as important as creating the reports and dashboards themselves. In this
video, you'll review how to improve report performance via cardinality and summarization in Microsoft Power BI. Imagine that Adventure Works' Microsoft Power BI reports, meticulously designed to dissect sales trends, monitor inventory levels, and analyze customer behavior, are encountering a challenge: with a colossal volume of transactional data streaming in daily, the reports are performing sluggishly. You may recall that you can improve performance by reducing data. Although the Power BI engine effectively handles extensive data, minimizing the volume of data loaded into your data model is still crucial. This is especially important when working with larger data volumes or anticipating substantial data growth over time. There are many reasons to minimize the data volume loaded into the Power BI model. Your current Power BI capacity may not support larger volumes of data; for instance, Power BI shared capacity can host a model of at most 1 GB in size. Smaller data models can reduce resource contention by using fewer resources, like memory and processing power, increasing efficiency. Models that stay loaded for longer have a reduced eviction rate, meaning the data is removed from memory less frequently; this can result in faster queries, as the data sets do not need to be reloaded into memory. Smaller data models also tend to refresh more efficiently, resulting in decreased time to generate and deliver reports with up-to-date data, or lower report latency. Finally, fewer rows in a data table can lead to faster calculations and improved query performance. Power BI supports many techniques to reduce the data loaded into the Power BI data model; in this video, you will review two methods: reducing cardinality, and aggregation or summarization. Let's begin with reducing cardinality. Previously, you learned about the types of cardinality between data tables. Throughout the development of the data model, you either establish or modify the relationships between the tables. You need to ensure the data types of the fields participating in the relationship are the same; you cannot create a functional relationship where the data types of the columns are different. For example, a key column might be set to a text data type; if the column contains only numeric values, you should change its data type to whole number, which performs better than the text data type in the Power BI model. Changing the decimal number data type to a fixed decimal number also improves performance. As you learned in the previous DAX lessons, when you create a DAX calculation in your data model, the default data type is decimal number, or general. This means the results of the calculation display unlimited places after the decimal, which hinders optimal performance. You need to define a distinct data type with a specified number of decimal places for best performance; changing to fixed decimal places reduces storage requirements, enhancing model performance. The next technique is reducing data via aggregations. Aggregation refers to summarizing large volumes of data into more manageable summary tables to improve query performance by condensing detailed information into simpler, higher-level values. Consider an example where you have a large data set containing a record of each transaction, but for reporting you're analyzing only the yearly or monthly sales, or sales by region. You can create aggregated tables that are imported into the data model. In the current example, you can generate aggregated tables from the sales table, grouped by region or month according to your requirements. This pre-calculated aggregation can be imported into Power BI's memory and will be more efficient for querying in daily analysis.
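A minimal sketch of such an aggregated table in DAX, assuming the Sales table carries Region, Year, Month, and Total Sales columns (the names are illustrative):

-- One row per region and month instead of one row per transaction
Sales Agg =
SUMMARIZE (
    Sales,
    Sales[Region],
    Sales[Year],
    Sales[Month],
    "Total Sales", SUM ( Sales[Total Sales] ),
    "Transactions", COUNTROWS ( Sales )
)

In practice, you might also build this summary upstream in Power Query so the detailed rows never reach the model at all; the DAX version is shown here to stay close to the calculated-table techniques covered earlier.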
Power BI also supports three storage modes for handling large data sets, letting you define the storage mode of each data table. For example, a large fact table with millions of rows can be set to DirectQuery, while smaller tables can be imported into the model for improved performance. Aggregations offer several benefits that can help you improve model performance. If you are handling a vast data set, aggregations provide faster, optimized query performance; they assist you in analyzing the data and revealing insights without importing the entire data set into the model. If users are experiencing slower refresh times for reports in Power BI, you can create aggregations to help speed up the refresh process; the smaller size of aggregated tables imported into memory reduces the refresh time, enabling a better user experience. Lastly, suppose your company is anticipating growth in sales volume by expanding its operations to new regions or adding new products to its inventory. You can leverage Power BI to create and manage aggregations as a proactive measure to future-proof the solution, enabling a smooth scale-up. Optimization of your data model in Power BI is not just a technical endeavor; it is a strategic imperative for organizations and an analytical challenge for you as an analyst. Power BI's performance optimization unlocks a new door of analysis, ensuring that every decision is not just data-driven but empowered by the speed and efficiency necessary to thrive. Congratulations on completing the data modeling section of this course, a prerequisite to analyzing data and creating reports and dashboards in Microsoft Power BI. Let's recap the key takeaways. You began with a journey into designing data models, starting with a recap of schema design principles. You reviewed the star and snowflake designs, the two major types of schemas used in Power BI, and worked through a hands-on activity building a star schema for Adventure Works by understanding the fact and dimension tables. You explored how to handle inactive relationships between two data tables by implementing a role-playing dimension and using the DAX USERELATIONSHIP function. As DAX and the evaluation context are fundamental to data analysis in Power BI, you recapped using the CALCULATE function to alter the filter context of your calculations. You also explored cardinality, the nature of the relationship between data tables, the types of cardinality, and the different cross-filter directions in Power BI: you can select either single or both cross-filter directions, determining the filter propagation in one or both directions of the related tables. Next, you moved on to creating model calculations using DAX. You recapped calculated columns, the custom data columns you create in your data model using DAX. You gained a detailed overview of the conceptual foundations and practical skills related to creating and managing measures using a library of DAX functions; measures unlock the hidden information in your raw data, empowering users to gain meaningful insights. You reviewed the SUM, SUMX, and CALCULATE functions for computing aggregation measures, the most common calculations used for analysis in any data-driven business. You also explored implementing time intelligence measures, as the time dimension is the foundation of any business analysis requiring historical analysis and future predictions. DAX offers a rich library of time intelligence functions to aggregate and compare data over time, such as DATESYTD and TOTALYTD.
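As a brief illustration of those two functions, here is a hedged sketch assuming the same Sales table and a marked 'Date' dimension:

-- Year-to-date sales using TOTALYTD
Sales YTD = TOTALYTD ( SUM ( Sales[Total Sales] ), 'Date'[Date] )

-- The equivalent written with DATESYTD as a CALCULATE filter
Sales YTD Alternative =
CALCULATE ( SUM ( Sales[Total Sales] ), DATESYTD ( 'Date'[Date] ) )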
By using time intelligence functions, you can compute things like moving averages, temporal analysis, and cumulative totals to gain insight into the overall performance and growth of the organization. You also recapped the types of measures, including additive measures, like total sales or total cost; non-additive measures, for example profit margin; and semi-additive measures, such as inventory level and current account balance. You gained hands-on experience replacing an implicit measure with an explicit one and creating a semi-additive measure. After that, your focus shifted to implementing a data model. You started by identifying the need for calculated tables, such as when a data model lacks a common date dimension table, and how to create them in Power BI. You gained a solid understanding of the DAX functions that you can use to create and manipulate tables in Power BI. You then explored creating hierarchies, including date, product, and geographical hierarchies. Creating a hierarchy is a significant feature of Power BI, allowing you to build a hierarchical structure to analyze both the overview and the granular details of data within the same visual by using drill-down functionality. Further, you explored how you can add a hierarchy to slicers in addition to the standard Power BI visuals. You reviewed Power BI's Q&A feature, which uses natural language processing to answer business-specific and user-defined questions in visual form. This feature is significant in real-world data-driven environments, as it makes it possible for individuals, regardless of technical expertise or department, to use your reports and dashboards and gain insights into the data. You learned that Power BI allows you to teach Q&A, customizing the review questions, synonyms, and relationships to help Power BI better understand your business needs. Finally, you focused on optimizing model performance. This began with a review of Power BI's Performance Analyzer, a robust diagnostic tool within the Power BI ecosystem that allows you to monitor and evaluate the performance of your report visuals, data model health, and DAX queries. You can use the information the Performance Analyzer provides to optimize slow-responding report components and enhance the user experience. You explored improving report performance by choosing optimal data types and summarizing data. You learned that Power BI offers several techniques to reduce data size and volume, which is important for avoiding slow reports; reducing cardinality and creating aggregated tables are the two most important techniques you can employ as data reduction strategies to enhance model performance in Power BI. Building and managing a healthy, functional data model is the key to performing any analytical work in Power BI and gaining meaningful insights from your data. Understanding schemas, DAX logic, and performance optimization can help you become a certified Power BI analyst via the Microsoft PL-300 exam, as well as handle complex real-world data challenges. Visualizations act as a bridge between raw data and actionable insights. Microsoft Power BI offers a wide array of visualization options for reports, empowering analysts to create compelling data narratives. In this video, you'll explore the analytical background of visuals in Power BI to help you identify and implement the appropriate visual to address the business need. The management of Adventure Works has requested a comprehensive sales report for the past year. The challenge is to
select the right visuals that align with the data and the analysis objectives, ensuring a clear and insightful presentation of the sales performance. Power BI features a broad spectrum of visualizations, each tailored for specific data representation needs. The visualizations in Power BI can be broadly categorized into general-purpose visuals and specific-purpose visuals. General-purpose visuals include visuals like tables and KPI cards that are versatile and can be employed across various analysis scenarios. Specific-purpose visuals include a range of visualizations, each designed to cater to specific analytical needs, like time series and geospatial analysis, among others. The general-purpose visuals in Power BI are: tables and matrices, which effectively display data in a structured tabular format, allowing for easy comparison and analysis across multiple dimensions; cards and KPIs, or key performance indicators, which are instrumental in highlighting critical metrics immediately, enabling decision makers to quickly grasp the performance indicators crucial to their business objectives; and lastly, slicers, which act as interactive filters, allowing users to dynamically filter the data being displayed, thus enabling focused analysis. Power BI offers numerous visuals, each tailored for specific types of analysis used daily in modern enterprises. The key to effective data visualization lies in aligning the visual with the analysis goal, thus enabling a clear, insightful, and engaging data narrative. Let's explore the various categories of analysis-specific visualizations and the Power BI visuals most suited for each. Time series analysis is a method of analyzing time-ordered data to discern the structure or functionality underlying it; it is an essential analysis in forecasting, monitoring, and anomaly detection. The optimal charts for time series analysis are line charts and area charts: line charts are the ideal and most common way of visualizing a time series analysis, while area charts are suitable for tracking quantity over time while emphasizing the magnitude. The next analysis type, categorical analysis, deals with data that can be segregated into multiple categories but has no inherent order or priority. Categorical analysis helps you to understand the distribution and relation of data across different categories. The optimal charts for categorical analysis are bar and column charts, and pie and donut charts: bar and column charts are effective for comparing the magnitude of categories and easily identifying the differences among them, while pie and donut charts are best for representing the proportions of categories, especially when dealing with a small number of categories, to prevent visual clutter. Correlation analysis aims to find a relationship between two or more variables; understanding correlations is foundational for prediction, causation analysis, and trend discernment. The optimal charts for correlation analysis are scatter charts and bubble charts: scatter charts are suitable for spotting relationships between two variables and understanding the strength and direction of the relationship, while bubble charts extend scatter charts by adding a dimension through bubble size, allowing for an additional layer of analysis. The next type of analysis is distribution analysis. This type of analysis observes how the values of a variable are spread or clustered over a range. It's crucial for statistical analysis, allowing comprehension of data variability and central tendencies.
Next, there's part-to-whole analysis. This type of analysis examines how individual parts contribute to the aggregate; it's widely used for understanding composition, analyzing contribution, and comparing individual parts to the total. Waterfall charts are the most widely used for part-to-whole analysis, as they are highly effective at showing the cumulative effect of sequential positive and negative values. The last type of analysis is geospatial analysis. Geospatial analysis examines data in terms of geographical or spatial relationships; it's instrumental in finding patterns, understanding spatial distributions, and making geographically informed decisions. Power BI offers a variety of map visuals, including shape maps, choropleth or filled maps, and ArcGIS maps. Shape and choropleth (filled) maps support external geographical files to draw a map, while ArcGIS maps are rich in map visualization features. The array of visualizations in Power BI provides a powerful tool set for analysts to convey data narratives effectively, and the right choice of visualization based on the analysis need is crucial. Mastering the art of selecting the right visual in Power BI is a valuable skill that significantly augments the data storytelling prowess of analysts. To ensure Microsoft Power BI visuals are of a professional standard, it is important to explore both general and visual formatting settings. In this video, you'll explore the available formatting options in Power BI and how to implement them. Lucas is tasked with enhancing an Adventure Works sales report with two visualizations; let's help Lucas explore the general, visual, and conditional formatting techniques in Power BI. Launch the sales categorical analysis Power BI file. In this report, two commonly used categorical analysis visualizations have been used: column and pie charts. Lucas is tasked with investigating all available formatting and configuration options that could enhance this report. Select the column chart and navigate to the visualizations pane, then select the Format Visual tab: this is where the formatting options for every visual reside. The formatting options are split into two categories, Visual and General. Visual contains chart-specific settings, and General contains settings shared by all visualizations; even the text box and shape visualizations share these settings. Let's select the column chart and the General options again to view them in detail. The Properties section is used to adjust the size, position, or padding of the visual; it's helpful when slight adjustments are necessary, like moving the visual to the right. The Title section focuses on formatting the title of the visualization and provides numerous setting options, like font size, color, background color, alignment, subtitles, and even a divider. Lastly, the Effects section includes settings to format the visualization's background, borders, and shadows. When you navigate to the Visual formatting settings, the column chart-specific settings appear. Here you can view settings for both axes, modifying their range of values, font, or axis title; you can even change the y-axis to logarithmic to display the results on a different scale. Settings like legend and small multiples appear disabled unless fields are placed in the respective visual field wells. The next settings allow you to add grid lines to your visual, add a zoom slider to magnify specific axis ranges, modify the color of your columns, and add data labels. When you select the table visual, note that
the visual settings are adjusted to fit this visualization. Here you have style presets to easily modify the table, some grid options, as well as options to change the appearance of cell values, column headers, and the total. Finally, to add conditional formatting to your chart, you can enable it on your table visual's columns by selecting any field and then selecting Conditional Formatting. In Power BI, you can format the background and font colors; you can also add data bars, icons, or even links to web URLs. Selecting a font color, for example, the conditional formatting window appears. Here you can format the font color of the table visualization. This formatting can be conditional, based on a custom rule that you apply, on the specific value of any field in the data set, or even on a gradient based on a value. Power BI keeps adding conditional formatting to various visualization aspects. For example, select the column chart, navigate to the Columns field, and expand it: a button with a function symbol appears to the right of the color field. This indicates that conditional formatting can be applied to the columns, dynamically altering the color based on specific criteria. When you select this button, the conditional formatting window appears, indicating that the visualization's columns can be formatted based on specific rules, field values, or a color gradient, just like the table. In this video, you learned how to explore all the available formatting options in Power BI and implement them. Navigating through large data sets to find important insights is a common task in data analysis. Microsoft Power BI helps ease this task with its robust slicing and filtering features. In this video, you'll explore the slicing and filtering options available in Power BI. These features are essential for data analysis projects, making it easier for users to focus on specific data subsets and uncover meaningful insights in their reports. The management team at Adventure Works has requested that interactivity be added to the sales categorical analysis report, enabling them to dynamically apply filtering in the report. The ability to sift through extensive data sets, focusing on specific data points, is important when building business intelligence reports. Slicing and filtering, for this reason, is an essential tool for a Power BI analyst, facilitating interactivity in reports and offering a dynamic and engaging data analysis experience. Let's explore slicing and filtering in Power BI in more detail and identify the three main methods of applying filters: slicers, the filters pane, and visual filters. The first way of slicing and filtering a report is by using slicers. Slicers are visualizations that act as filters, enabling a user to make selections that filter data within reports. To add a slicer to the sales categorical analysis report, select the slicer icon on the visualizations pane and adjust it by dragging its edges. Drag date into the field box; the slicer visualization automatically identifies the field as a date field and applies the Between slicer style. The second way of slicing and filtering a report is through the filters pane. The filters pane is a central location where users can apply and manage filters in their reports at three different levels: visual, page, or report level. Visual-level filters apply to a single visual, page-level filters apply to all visuals on a page, and all-pages or report-level filters apply to all visuals within a report. Add country region to the Filters on this page section and select Canada. This will
immediately filter the report to display only the data for the table rows with Canada in the country region field. An important aspect of the filters pane is the hide and lock features it provides. To the right of the filter you just added, a lock filter button is visible; this feature prevents report users from changing the filter. The hide filter button hides the filter and prevents users from knowing that a filter is applied. Finally, the third method of filtering is through visual filters. Visual filters are a direct method of filtering, allowing users to interact with the visuals on a report to filter the data. For instance, selecting the blue color on the treemap will filter the rest of the report based on the selected segment. This feature is what makes Power BI stand out as a highly interactive business intelligence tool, as all page visualizations constantly interact with each other at the click of a button. Understanding slicing and filtering is key to unlocking the full capabilities of Power BI; these features not only simplify the process of creating interactive reports and focusing on specific data segments but also empower data analysts to quickly identify valuable insights. Imagine effortlessly navigating through vast oceans of data in Microsoft Power BI, just like a seasoned captain navigating a ship through turbulent waters. With page navigation tools, you can unlock your report's full potential for you and your report users. In this lesson, you will cover the core features related to navigation and sorting. You will learn how page navigation effectively streamlines the flow and readability of multi-page reports, how to effectively utilize bookmarks to capture and share specific report states, and how to explore the sorting functionalities in Power BI to visually organize data, enhancing clarity, impact, and insight. Lucas is a data analyst with Adventure Works and has been tasked with enhancing the interactivity and user experience of the company's sales categorical analysis report in Power BI. As this report is crucial for monthly sales meetings, it requires navigation improvements to help the sales team navigate the data more efficiently and gain quicker insights. Lucas' objectives are to streamline the report's navigation across multiple pages, create bookmarks for key data points to enhance presentations, and apply sorting techniques for clearer data visualization. Page navigation in Power BI is a feature used to create multi-page reports that are user-friendly and easy to navigate. It allows users to move between different pages of a report and is essential for organizing information logically across multiple pages. The implementation of page navigation in Power BI involves setting up interactive elements, like buttons or links, that users can select to move to different report pages. It provides a guided experience beyond clicking on tabs, as it directs users through the report in a structured, user-friendly way, especially in complex reports. Page navigation is integral for guiding users through a report's narrative, especially with complex data sets or presentations. There are several benefits to using page navigation in Power BI reports. The first is an enhanced user experience: these features collectively improve the navigation and understanding of reports, making them more user-friendly and accessible. For instance, in a financial report, the first page might provide an overall summary, and subsequent pages delve into specific areas like revenue by region or departmental expenses, all interconnected through intuitive
page navigation. The second benefit is dynamic data presentation: bookmarks and page navigation enable dynamic storytelling with data, allowing for interactive and engaging presentations. For example, in a market analysis report, bookmarks can allow users to switch between different market segments, time periods, or product categories, making the presentation interactive. Another benefit of page navigation is improved data organization: sorting mechanisms help structure data effectively, leading to better comprehension and quicker insights. For example, sorting can be applied to a sales table to organize data by revenue, allowing users to quickly identify top-performing products. Utilizing page navigation often also leads to increased efficiency, by streamlining the process of exploring and analyzing large data sets and saving time and effort for both report creators and viewers. For instance, bookmarks can be combined with sorting mechanisms to create different sorted views of a data set, like sorting customers by purchase frequency or sales by region; this allows for quick comparisons and analysis. The final advantage of using page navigation tools is flexibility in analysis: navigation offers flexibility in how data is viewed and analyzed, accommodating a variety of analytical approaches and styles. Bookmarks can be used to switch between different data filters or visualizations, even on the same page, accommodating various analytical approaches. Bookmarks in Power BI are a powerful feature that can enhance report interactivity and storytelling. Bookmarks allow users to save specific views and states of a report, enabling quick navigation to these points during presentations or analysis; they are particularly useful for highlighting changes or comparisons in data over time. Creating a bookmark involves selecting and saving the current state of a report, including filters, slicers, and the visibility of visuals, where visualizations can be hidden or left in view. In cases where specific report configurations and filters are used in a report, they can be saved as bookmarks so you can easily navigate back to them without having to reconfigure the report. These bookmarks can then be linked to buttons or other interactive elements, allowing for a seamless transition between different views within the report. Sorting data in Power BI reports is a fundamental feature that organizes data within visualizations, making it easier to interpret and analyze. It brings clarity to reports by arranging data in a logical order, whether ascending, descending, or based on specific criteria. Sorting helps present data in a structured manner, aiding the quick identification of trends, outliers, or specific data points; it's essential for making reports more intuitive and insightful. Power BI allows sorting of data in various visualizations, like tables, charts, and graphs. Users can sort data based on different attributes, such as alphabetical order, numerical values, or custom criteria, to suit the specific needs of their analysis. In this video, you explored essential features in Power BI that elevate the functionality and user experience of reports. You learned how page navigation streamlines the flow of multi-page reports, how bookmarks offer dynamic presentation capabilities, and how sorting mechanisms bring order and clarity to data visualizations. These tools are invaluable for analysts like Lucas at Adventure Works, as they make reports not only more interactive and engaging but also more insightful and
easier to navigate. By effectively utilizing these features, Power BI users can transform their reports into powerful tools for storytelling and data analysis, driving more informed decision-making. In Microsoft Power BI, the interaction between visuals in a report is a fundamental aspect that enhances data exploration and analysis; this is due to the fact that all visualizations can filter one another. Over the next few minutes, you will discover how visuals utilize and share data and how they can be configured to interact with one another. You will explore the key interaction types, filter, highlight, and none, and their impact on overall report dynamics. Understanding these interactions, and how to choose between them depending on the specific business need at hand, is crucial for creating cohesive and informative reports that allow users to delve into data with greater clarity and context. There are three key topics you will learn about in this video. Specifically, you will grasp the basics of visual interactions, namely how visualizations interact within a Power BI report; explore the filter and highlight interaction types and how they can be applied; and lastly, gain insights into the none interaction setting and when it is appropriate to use it in a report. Lucas, the data analyst at Adventure Works, encounters a challenge with a report called sales categorical analysis. The sales team has reported an issue where selecting a data point in a column chart unexpectedly wipes out the data in the treemap visualization. Realizing this is a visual interactions problem, Lucas is tasked with troubleshooting and resolving it. He discovers that the current setting is likely a filter interaction, causing the column chart selections to overly restrict the data displayed in the treemap. The way visualizations interact within a report is crucial for a comprehensive data analysis experience. These interactions determine how selecting or hovering over data in one visual affects the data displayed in another. There are three primary types of interactions: filter, highlight, and none. Let's start with the filter interaction. When you select a data point in one visual, it acts as a filter for the other visuals in the report. For example, selecting a specific category in a bar chart will filter the data in all other visuals to show only data related to that category. Filter interactions are essential for drilling into specific subsets of data and analyzing them in the context of the whole report; they provide a focused view, allowing users to isolate and analyze specific data points across different visuals. Next is the highlight interaction. Instead of filtering out non-selected data, the highlight interaction dims it, maintaining the overall context: selecting a data point in one visual will highlight related data in other visuals while dimming the rest. A highlight is used when the context of the entire data set is required even while focusing on a specific section; the highlight interaction helps in understanding the relationship of one part to the whole, providing a broader perspective on the data. Finally, the none option disables interaction between visuals, so selecting a data point in one visual has no effect on the others. This interaction is useful when visuals are meant to function independently, without influencing each other's displayed data; it is crucial for reports with visuals that represent different data dimensions, or when independent data exploration is required. Understanding these interactions is necessary for
effective report design in Power BI. By applying these interaction types, you can create reports that not only present data in an organized manner but also offer intuitive and insightful data exploration experiences. In the upcoming video, let's assist Lucas in configuring the interactions in the sales categorical analysis report. Let's start by launching the sales categorical analysis report to identify the interactions between the visualizations. We know that the bike category contributes almost the entire total of the sales amount, which might prove to be an issue for interaction between visualizations. Selecting the bikes column of the column chart, the treemap boxes are almost unchanged; then, selecting the accessories and clothing categories, you notice that those categories are such a small percentage that they are barely visible when filtered. The reason this occurs is that there is a highlight interaction from the column chart to the treemap chart, highlighting just the proportion of each category. This makes it difficult for users to comprehend the filtering of the report, so you need to modify the interaction. To access the interactions between visualizations, select any visualization, for example the column chart. The Format tab will now appear on the ribbon; select Format and enable Edit Interactions. This is an on/off button, which is now enabled; it shows the interactions of a selected visualization towards all other objects in the report. Having selected the column chart, notice the icons above the treemap: these are the three interaction options, filter, highlight, and none. Select filter to change the interaction type, and select the columns of the column chart to see the modification. Users can now clearly see the color of products with the highest sales amount for each category. Remember that it's good practice to always disable the Edit Interactions button once you complete your modifications to interactions, as it takes up a lot of memory and might reduce the performance of Power BI Desktop. The strategic use of visual interactions in Power BI, filter, highlight, and none, plays a pivotal role in crafting engaging and insightful data stories. By understanding and applying these interaction types, report designers can guide users through a more nuanced and comprehensive data exploration journey. Imagine you are a data analyst for Adventure Works creating multi-page reports, and you have implemented slicers on some pages. When you change a slicer on one page, it doesn't change on the others; currently, you are recreating the same filter over and over, which can be tiring, and with so many changes to implement, any mistake will lead to a poor user experience. How can your workload be reduced, leading to a better chance of a strong user experience? In this video, you will learn about the fundamentals of synced slicers in Microsoft Power BI, learning how to implement this feature and gaining insights into the enhanced storytelling capabilities and improved user experience provided by synced slicers. Adventure Works wants to analyze their bicycle sales performance across multiple regions. They've created a comprehensive Power BI report with pages dedicated to sales data, customer demographics, and seasonal trends. However, a challenge arises in maintaining consistent analysis across these pages when users want to focus on specific regions or time frames. This is where implementing synced slicers comes into play, enabling a seamless, unified view of data through the entire report project. Slicers serve as an effective method for
narrowing down information, enabling you to concentrate on a particular segment of the semantic model. Slicers provide the flexibility to choose precisely which values are shown in your Power BI visuals. There may be instances where you require a slicer to be active on a single page of your report, while at other times applying the slicer across multiple pages might be more appropriate. Utilizing the sync slicers feature allows any selection made via a slicer on one page to influence the visualizations across all the pages you've synchronized. Synced slicers are not just a cosmetic addition; they are a functional necessity for creating cohesive and user-friendly reports. Here's why they are essential. First is navigation consistency: synced slicers ensure that when a user makes a selection on one page, it is reflected across all other pages. This consistency eliminates confusion and enhances the user's ability to analyze data coherently. The second necessity of synced slicers is time efficiency: by avoiding the need to repeatedly set the same filters on each page, synced slicers save time and streamline the data exploration process. Lastly is improved data storytelling: in reports where data storytelling is crucial, synced slicers help maintain the narrative flow; they allow the story to unfold effortlessly across different pages without jarring interruptions or resets in filters. Now let's explore how you can sync slicers across pages in Power BI reports. Let's get the slicers in sync for the current report. The report is split into two pages: the first page shows sales by product category and color, and the second page details sales data for all products from the last two months. At the top left corner of both pages, there's a slicer. If you pick a country on the product category and color page, it only changes the data on that page; the details page hasn't changed. However, if you activate slicer sync, the same filter will apply to both pages. Here's how to do it: in the View tab of the ribbon, select Sync Slicers; this brings up the sync slicers pane on the right. Now select the slicer on the first page, and in the sync slicers pane, select the sync checkbox for both the product category and color page and the details page. Now, whenever you select a country on the slicer on the product category and color page, it'll also update the details page with the same filter. To check that it's working properly, I'll select a country on the first page; when I open the second page, I notice that the selected country in the slicer remains selected. This is how you can quickly synchronize slicers on various pages in a Power BI report. The sync slicers feature in Power BI is a critical tool for enhancing the coherence and usability of reports. By allowing slicers to synchronize across multiple pages, it ensures that filter selections are consistent, providing a smoother and more intuitive experience for the user. You are part of a team working on sales reports for the stakeholders at Adventure Works. You've noticed that the way the designers arrange the visuals is causing confusion, making it hard to spot related items. As well as this, there's no consistency in how visuals have been named; everyone's been labeling them however they please, which makes it even harder to locate the essential elements. Using the selection pane, you can organize and group these visuals, making everything much easier to manage and understand. In this video, you are going to learn how to name visuals, group the related visuals, and properly organize them by layering them on top of one another. Grouping and layering visuals in
Microsoft Power BI simplifies report creation and management by organizing content in a user-friendly way, enhancing the user experience through clear, logical presentation. The first step towards enhancing user experience in Power BI is to clearly name your visuals. This involves assigning each visual a name that is meaningful and relevant, ensuring quick identification. Following this, organize the visuals in your report by grouping related visuals to create a report that is both well structured and user-friendly. The next crucial aspect is layering these groups effectively. This technique is about strategically arranging your visuals to guide the viewer's attention, ensuring that the most important information stands out first. Lastly, the culmination of these skills is evident in the way you manage the visibility of various report elements: control over what information is displayed, and when, allows you to direct your audience's focus to the essential data, significantly enhancing the overall experience of your Power BI reports. Now let's explore how this works in Microsoft Power BI. Naming, grouping, and layering in Power BI are done from the selection pane. To open the selection pane, go to View on the ribbon and select Selection. The selection pane appears on the right side of the Power BI Desktop editor, displaying all items on the current page. You can select any name in this pane to identify which visual it refers to. It's important that you name these visuals properly to organize them in an appropriate way; this is especially useful when you have many visuals on a page. For example, if I select the text box, it highlights the report heading. I can rename it as 'heading' by double-clicking the item and entering the updated name; this can be done for any of these titles, as double-clicking any item enables me to edit its name. In the selection pane, you can also change the layering of the items, meaning you can rearrange the order in which visuals appear. To better understand this, select the Insert tab on the ribbon, select Buttons, and then Blank from the listed options. This places a new button on the report page; notice the new button item that now appears in the selection pane. I drag the new button next to the date slicer, select it in the selection pane, and, using the up and down arrows, I can change its order. For example, if I send it below the slicer, it disappears from the report because it sits underneath the slicer visual. Using this method, you can bring any item to the front or send it to the back using the selection pane. You can also group items from this pane. Let's group the heading and the underline below it, named 'shape'. Select the shape item from the selection pane; I then press the Control key on the keyboard and select 'heading'. Notice how these two items are now highlighted. Now I right-click on either item, select Group, and then Group again; this creates a new group of these two items. To ungroup, right-click on the newly created group, then select Group and choose Ungroup. This way, you can use the selection pane to change item names, group them, and layer them on top of or below each other. By grouping and layering visuals effectively, you're not just tidying things up; you're making the whole experience smoother and more intuitive for anyone viewing your reports. Use these techniques in your next Power BI project to create reports that are not just visually appealing but also user-friendly and coherent. In today's fast-paced business environment, the ability to access and analyze data on the go is
In today's fast-paced business environment, the ability to access and analyze data on the go is increasingly important. With a significant shift towards mobile device usage, optimizing Microsoft Power BI reports for mobile viewing becomes an asset for any organization. This video highlights the importance of adjusting reports for mobile view and explores the capabilities of Microsoft Power BI's mobile layout view, offering a strategic advantage in data accessibility. By the end of this video you'll be able to understand the significance of mobile-optimized Power BI reports, explore the features and benefits of Power BI's mobile layout view, and identify best practices for designing mobile-friendly reports. Lucas, a data analyst with Adventure Works, is tasked with creating Power BI reports that are easily accessible and readable on mobile devices. His challenge is to ensure that these reports provide a seamless user experience, maintaining readability and functionality across various mobile platforms. Lucas aims to make these reports not just accessible but also as informative as possible for his team, who often rely on quick data insights while on the move. The way users interact with data has fundamentally changed. Mobile devices, with smaller screens and touch-based navigation, require a different approach to data visualization compared to traditional desktop displays. Recognizing this shift, Power BI introduced a dedicated feature for the unique demands of mobile platforms: the mobile layout view. The Power BI mobile layout view is a feature within Power BI Desktop that allows creators to design and customize reports specifically for mobile devices. This view addresses the unique challenges posed by smaller screens and touch interfaces. Key aspects include: mobile-optimized layout, which differs from the standard view, focusing on simplicity and readability and allowing users to rearrange visuals to fit a vertical layout that is more suitable for mobile devices; interactivity and functionality, since despite the change in layout, the mobile view retains the interactivity and functionality of the desktop reports, and users can still filter, slice, and interact with the data in meaningful ways; customization and flexibility, as Power BI provides flexibility in designing these reports, letting users choose which visuals to include, how to arrange them, and even create different views for different devices; consistency in data representation, because while the layout changes, the data and its representation remain consistent with the desktop version, ensuring users get the same insights regardless of the device they use; and preview and testing, as Power BI allows creators to preview how their reports will look on various devices, helping them make necessary adjustments before publishing. Let's look at an example of adjusting the sales categorical analysis report for mobile navigation using the Power BI mobile layout view. To access the mobile layout view, you select the phone screen button on the bottom left of the page; this button enables you to switch between the desktop and mobile layout views. The mobile layout view appears on screen. It features the mobile layout canvas, a grid layout where you adjust the visualizations to fit any mobile screen; the Page visuals pane, where all the report's visualizations are listed; and Visualizations, where the format settings of any selected visual will appear. To adjust the report for mobile platforms, drag and drop any visualization from Page visuals to the canvas, such as the date slicer and the treemap, fitting both to the screen. You can use the Visualizations pane to format the visualizations,
such as enabling data labels for the treemap chart. These changes won't be reflected in the desktop layout view. The sales categorical report will now appear with these configurations when launched through Power BI Mobile, ensuring seamless navigation of the report on any kind of mobile device. When designing reports for mobile devices using Power BI's mobile layout view, it's important to be aware of certain considerations and limitations that can impact the user experience. These include: tooltips availability, as while tooltips are not active in the mobile layout canvas during the design phase, they become accessible to users when viewing the report through the Power BI mobile app; metric visuals interaction, since on the mobile layout canvas metric visuals are set to be non-interactive, meaning users cannot interact with these visuals in the same way they might in a desktop report; and slicer selections consistency, because slicer selections made in the mobile layout do not transfer when switching to the web layout, whereas if you switch from the web layout back to the mobile layout, the slicer selections will reflect those changes. Additionally, when a report is published, any slicer selections displayed will be those set in the web layout, regardless of whether the report is viewed in a desktop or mobile-optimized view. Optimizing Power BI reports for mobile devices is a strategic step towards enhanced data accessibility and decision making in today's mobile-centric world. This feature is instrumental in ensuring that valuable data insights are always at the fingertips of decision makers, regardless of their location or the device they use. Have you ever noticed numbers in your data that seem unusual and just don't seem to fit? The data analysts at Adventure Works have. In their recent sales report, some unusual figures stand out and need investigation. These odd numbers might be a coincidence, or they might be indicators of hidden issues in the Adventure Works data or in the business as a whole. They might also be clues that can lead the Adventure Works team to deeper business insights. These odd numbers are referred to as anomalies and outliers. In this video you will learn what anomalies and outliers in data are. You will also discover how these odd figures can reveal deeper insights and information about your data, and how you can use them to inform smarter business decisions. The Microsoft Power BI sales report prepared by the Adventure Works analytics team shows a profit downturn for a month in the middle of the cycling season, typically a time associated with peak sales. Profits rose in another month without a corresponding increase in sales volume. The team needs to understand why these numbers are appearing to determine if any action needs to be taken. Let's explore the terms anomalies and outliers and discover some examples of each. Anomalies are data points that occur outside the expected range of values and which cannot be explained by the base distribution; the base distribution is the normal pattern that the data follows. Anomalies are often caused by invalid data. Outliers are data points significantly different from the rest of the data; they are often values that deviate from the other values in a data set. However, outliers can be explained by the base distribution. The main difference between an anomaly and an outlier is that an anomaly is often an error or a rare, unexpected event, whereas an outlier is an extreme but expected value that still belongs to the pattern of the data.
So how would you recognize an anomaly? Let's step through some examples: a sudden spike in website traffic that cannot be explained by any known marketing campaigns or events; a sudden drop in sales for a product that has been consistently selling well; a sudden increase in the number of errors in a system that has been running smoothly; or a customer who is aged 200 years old. Now let's step through some examples of outliers: a top student who scores 100% on a test while the class average score is 70%; a house that is significantly larger and more expensive than the other houses in a neighborhood; a stock that experiences a sudden price change that is not in line with the rest of the market; or a customer who is aged 99. Let's explore how to use a scatter chart visualization in Power BI to identify anomalies and outliers in a data set. This data set contains advertising spending and profits for the same campaign run in different media over several months. It looks problem free, but we can't be sure until we process this data with some visuals, like scatter charts, to visually spot outliers and anomalies. We've plotted this data set using a scatter chart on this report page, placing campaign ID in the values, advertising spend on the x-axis, sales revenue on the y-axis, and platform on the legend. There are some data points which stand out in the scatter chart. Some of these data points demonstrate a slight variation, while others diverge significantly. These unusual data points might be anomalies or outliers. The orange data points represent the social media campaigns. The majority of them did well, and the chart shows that when the advertising spend increased, sales also increased. The C004 campaign is an exception to this. However, it will not be considered an anomaly, because you know that the Adventure Works website was down on that day despite the ads continuing to run on social media. Because you can define a reason why C004 performed badly, you can define it as an outlier. Another campaign, C006, didn't perform well despite its high advertising spend. This was a print media campaign, and on further investigation you found that that type of media was not popular, which is why the C006 campaign failed. This campaign is also considered an outlier because you can explain the reason why it varies so much from the other campaigns. The online campaign C023 also stands out as different from the other data points in its category. In this case, the reason why this campaign has performed so differently has not yet been identified. Until you have the exact reason why this campaign performed exceptionally well, you would consider it an anomaly and not an outlier. Anomalies and outliers in data are critical indicators of deviation from the norm. While outliers can be explained within the context of existing data, anomalies hint at underlying issues or exceptional occurrences that demand deeper analysis. Identifying these can lead to improved strategies and more informed decision-making processes in business operations.
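Where a visual scan of the scatter chart isn't enough, a simple DAX measure can surface candidate points programmatically. The following is only a sketch, not a method from the course, and it assumes a hypothetical Campaigns table with a SalesRevenue column; it flags any campaign whose revenue sits more than two standard deviations from the mean across all campaigns.

    Revenue Check =
    -- Compare each campaign's revenue to the overall mean and spread
    VAR AvgRevenue = AVERAGEX ( ALL ( Campaigns ), Campaigns[SalesRevenue] )
    VAR RevenueSD = STDEVX.P ( ALL ( Campaigns ), Campaigns[SalesRevenue] )
    VAR ThisRevenue = SELECTEDVALUE ( Campaigns[SalesRevenue] )
    RETURN
        -- Flag points more than two standard deviations from the mean;
        -- deciding whether each one is an anomaly or an outlier still needs human judgment
        IF ( ABS ( ThisRevenue - AvgRevenue ) > 2 * RevenueSD, "Investigate", "Expected" )

Placing such a measure in a table or tooltip gives reviewers a starting list; as the campaign examples above show, the anomaly-versus-outlier distinction still depends on whether a business reason can be found.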
Orders at Adventure Works have increased recently as more of their customers are enjoying outdoor pursuits. The data analysis team are kept busy analyzing data related to the large volume of orders being processed and shipped, and creating reports to present the results. Their reports contain many of Microsoft Power BI's bright and colorful visuals. It's a large amount of data, and the team wants to ensure that viewers of the report can quickly spot patterns and insights. Two Power BI features, grouping and binning, will help them to create visuals that are concise, organized, and easier to draw conclusions from. In this video we will explore what groups and bins are in Power BI and how they can help you to organize your visuals to deliver information and insights more effectively. As a data analyst at Adventure Works, you're part of the team creating a sales report which will provide a summary of the current order fulfillment situation. Your first task is to compare the number of items that have been shipped with those that have a status of processing or cancelled. The management team particularly wants to be able to easily access information on shipped orders. Data grouping will allow you to group orders according to their status; that will make the order fulfillment status more visible and make the data as a whole more coherent. The management team also want to know the overall number of shipped orders in different value ranges. The data binning process will be invaluable for this: it will enable you to organize the results based on order value ranges, and this in turn will allow the management team to assess the pattern of which orders were more valuable. Let's explore how the grouping and binning techniques work. Grouping refers to the process of combining data rows based on specific column values in Power BI. This technique allows you to create a new column that represents aggregated data. The purpose of grouping is to simplify and streamline your data visualization by categorizing similar data points together. You can group data related to product categories, regions, or customer segments, making it easier to analyze and present summary information. For instance, you can group states into regions like East Coast, West Coast, and Central, or you could group products by categories such as electronics, clothing, and home appliances to understand combined sales numbers. Binning involves dividing a numeric column into ranges, or bins. Binning is useful when you want to analyze data in discrete intervals. By categorizing numeric values into bins, you can gain insights into the distribution of data and identify patterns. For instance, you could bin ages into ranges such as 1 to 18, 19 to 30, 31 to 45, and so on. If you're monitoring website performance, you could bin website load times into categories like fast (less than 1 second), average (1 to 3 seconds), slow (3 to 5 seconds), and very slow (5+ seconds) to identify user experience issues. Let's explore how you can use grouping and binning to help Adventure Works display the order status and the value range of the orders. Let's begin by applying data grouping to a visual. This clustered bar chart shows orders across multiple product regions. It includes all shipped orders as well as orders that were cancelled or are still showing as processing. Let's group those orders which have a status of cancelled or processing. To do that, right-click on the order status field in the legend well and select New group. When the group pop-up appears, press the Control key on the keyboard and select Cancelled and Processing, then select the Group button, and finally select OK. The clustered bar chart updates with this new group data instantly. Now the orders with a status of cancelled or processing are displayed in the same group, and you can see the total value for these orders summed up together.
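The grouping that the pop-up creates through the UI can also be expressed as a calculated column, which is useful when the group definition needs to live in the model itself. This is just a sketch under assumed names, a hypothetical Orders table with an OrderStatus column:

    Order Status (groups) =
    -- Collapse Cancelled and Processing into one bucket, as in the UI demo;
    -- every other status keeps its own name
    SWITCH (
        TRUE (),
        Orders[OrderStatus] IN { "Cancelled", "Processing" }, "Cancelled & Processing",
        Orders[OrderStatus]
    )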
The management team asked you to display the orders in different value segments; you can use the bin feature to achieve this. Create a new report page and add a clustered bar chart. Select the product region and order status fields from the Data pane, ensuring that product region is placed on the y-axis and order status on the x-axis. Ensure that the clustered bar chart visual on the report page is still selected, then open the Filter pane and drag the order status field from the Data pane into it. In the Filter pane, select the order status filter box and then select Shipped from the drop-down checklist. The visual updates to show only the shipped orders, as requested by the management team. They also wanted to have the orders displayed in order value ranges, so let's create bins to achieve this. In the Data pane, right-click on the order total field and select New group from the shortcut menu. In the pop-up, enter 5,000 as the bin size and select OK. A new entry appears in the Data pane called "order total (bins)". Drag this new entry to the legend well. Now the data is properly binned, and you can hover over any bar to see how many orders are in each of these bins. In this video you explored what the grouping and binning features are and how to apply them to your data set. By using these two features to organize the results displayed in the Power BI visual, you made the visual clearer and more concise, and the use of grouping and binning in the chart visuals has enabled additional analysis to be implemented.
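Behind the scenes, a bin of size 5,000 is simply the order total rounded down to the nearest multiple of 5,000. If you ever need the bins as an explicit column rather than through the New group dialog, a calculated column like the following sketch reproduces the idea, again assuming a hypothetical Orders[OrderTotal] column:

    Order Total (bins) =
    -- Round each order total down to the start of its 5,000-wide bin,
    -- mirroring the bin size entered in the New group dialog
    VAR BinSize = 5000
    RETURN FLOOR ( Orders[OrderTotal], BinSize )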
Artificial intelligence, commonly referred to as AI, has revolutionized the world of data analysis and visualization, making it easier for businesses to uncover insights and make informed decisions. Microsoft Power BI, Microsoft's popular business analytics tool, has embraced AI with a range of AI visuals that empower users to delve deeper into their data. In this video you will explore three key AI visuals available in Power BI: key influencers, decomposition trees, and forecasts. You will learn how these AI visuals are applied in Power BI and how data analysts use them to identify the key factors behind business results, gain a detailed overview of data breakdowns, and predict future trends. The Adventure Works management team has noticed a concerning trend: a significant drop in bicycle sales despite a surge in interest in outdoor activities. They want to identify the reasons behind it. The management team need to discover why the results for this product range are not as good as expected. They also want to identify the product ranges that are performing well and predict whether the current trends in sales will continue. The data analysis team at Adventure Works can use AI visuals to provide this information. They begin with the key influencers visual. The key influencers visual helps users identify the factors that influence a particular outcome or metric in their data. The visual uses machine learning to analyze and identify the factors that have the most impact on a selected outcome. As the name suggests, the key influencers visual examines potential influencers, ranks them based on their impact, and presents these insights in an interactive, easy-to-understand format. It helps business users to understand what drives specific results or why events occur. By using the key influencers visual, the data analysis team can identify the Adventure Works products and product categories that are not performing well. They can also obtain key insights on how to reverse the current downward trend in bicycle sales. Key influencers visuals in Microsoft Power BI offer many benefits. First, they help to identify causal factors: key influencers help you pinpoint the variables or factors that have the most significant impact on your chosen outcome, allowing you to make data-driven decisions. Second, key influencers visuals offer intuitive visualization: the visual representation of insights is easy to interpret, making it accessible to both technical and non-technical users. Key influencers visuals also incorporate drill-down capability: you can drill down into specific features to gain deeper insights and an understanding of how different values within those features affect the outcome. Lastly, there is statistical significance: the tool calculates statistical significance, ensuring that the relationships it uncovers are robust and reliable. The data analysis team uses another AI visual, called a decomposition tree, to help the management team optimize their product lines. The decomposition tree visual is an AI-powered visual in Power BI that allows users to break down a measure into its underlying components; a measure in Power BI is an aggregated, combined, or calculated value. The decomposition tree visual is particularly useful when you want to understand the factors contributing to a particular metric. It offers a structured approach to dissecting data hierarchies and provides clarity in identifying the most influential components. This type of information and insight can be crucial for optimizing strategies and resource allocation. The management team at Adventure Works wants to gain a clear understanding of sales trends, and the data analysis team uses the decomposition tree visual to provide information on how revenue breaks down by product. Decomposition trees in Power BI offer many benefits. They are ideal for breaking down complex measures into their underlying components, making data more digestible and actionable. A decomposition tree is a hierarchical visualization: it allows users to explore the contribution of different factors at various levels of detail. This visual also allows for interactive exploration: users can drill down into each component for deeper insights and perform ad hoc analysis. The tool calculates statistical significance, ensuring that the relationships it uncovers are robust and reliable. Now that the management team at Adventure Works has a clearer idea of the factors influencing low sales in one product range, and of the patterns and breakdown of their revenue, they want to move on to forward planning. Their goal is to proactively adjust production plans with the appropriate models to stay ahead of the competition by capturing emerging markets and effectively meeting future customer demands. The data analysis team can facilitate this by using AI features in Power BI to forecast future bicycle demand trends. The forecasting feature in Power BI leverages AI to predict future values based on historical data. This is vital for businesses that want to make data-driven predictions and anticipate future trends. Forecasting provides three important benefits. Forecasting enables you to predict future trends: the forecasting tool helps organizations anticipate future values based on historical data, aiding in proactive decision-making and planning. Another key benefit of using forecasting is scenario analysis: users can explore different forecasting scenarios, adjusting parameters to discover how changes impact future predictions. Lastly, forecasting supports data-driven planning: businesses can use forecasts to optimize inventory management, resource allocation, and budgeting. Microsoft Power BI's AI tools, including key influencers, decomposition trees, and forecasting, make complex data easy to understand.
They do this by analyzing patterns and trends in the data, which assists businesses in planning and decision-making. These tools turn complicated data into useful information, helping companies respond to today's needs, prepare for the future, and stay ahead in their fields. Some viewers of your report still have difficulty quickly absorbing the core data insights. You've learned a lot about working with data in Microsoft Power BI, and you've created your reports according to best practices. Your reports use appropriate visualizations and they look great. Is there anything else you can do to help the viewers of your report focus on the key points? Yes: you can use reference lines and error bars to insert further analytical visuals. This video will explore the concept of reference lines in Power BI and the application of different types of reference lines in data visualization. You'll also learn about error bars and use different types of error bars to represent data variability and uncertainty. By the end of the video, you should be able to recognize appropriate scenarios and visuals where you can effectively use reference lines and error bars. Renee Gonzalez is the marketing director at Adventure Works. She asks you to enhance a Microsoft Power BI sales report. She wants to add an average reference line to display a clear sales performance benchmark, and she also wants to incorporate percentage error bars into a sales by product chart to give the sales managers a better understanding of sales fluctuations. Reference lines are used to highlight significant data points or trends. These lines serve as benchmarks or guides to make data easier to interpret. A reference line allows viewers to quickly identify key points like averages, medians, or specific thresholds. Reference lines play a crucial role in highlighting deviations, understanding distributions, and setting performance targets. There are several types of reference line in Power BI; choose the one that best interprets your data. An average line marks the average value across a data set, which is useful for comparing individual data points against the overall average. A median line indicates the median, or middle, value, a feature that is especially helpful in skewed distributions. Percentile lines display a specific percentile, giving a better understanding of the data spread. A constant line (an x-axis or y-axis line) represents a fixed value and is often used for benchmarks or targets. Min and max lines are used in charts to highlight the lowest and highest values in a data set, providing a clear visual reference for understanding the range and distribution of the data. And a trend line helps identify patterns or trends in data, aiding in understanding data movements over time. Error bars are used to represent variability or uncertainty in data visualizations. An error bar extends from a central point in a chart, such as a specific line of a line chart or a bar of a bar chart. The error bar visually demonstrates the potential range of values around a data point, with the specific lower and upper bounds highlighted in the tooltip. This feature is particularly important in conveying precision, reliability, and potential errors in data. In addition to displaying a range of values, error bars also provide context and depth to the data points, allowing for a more nuanced understanding of the data. For instance, in a financial report, error bars can illustrate the potential fluctuation in revenue forecasts, helping investment managers grasp the level of risk or uncertainty involved.
There are different error bar types; choose the type you need depending on how the bars should be calculated and applied to the visualization. The "by field" type of error bar allows you to specify a particular field in your data set to determine the range of the error bars; it is useful when you have specific error values for each data point. With "by percentage", the error bars use a percentage to calculate the error range; this is particularly helpful when you want to display a consistent percentage error across all data points. The "by percentile" type provides insight into the distribution of data points by displaying the range within a specific percentile; for example, a 25th to 75th percentile error bar indicates the interquartile range, covering the middle 50% of data points, and these error bars help in understanding the central trend and spread of the data. And the "standard deviation" type calculates the error range based on the standard deviation of your data; it's commonly used to indicate the variability of the data around the mean. Let's discover how you can use the power of reference lines and error bars to add data insights in Power BI. The sales report contains two column charts: the one on the left distributes the dollar sales amount over the customer country field, and the other one distributes it over the product color field. Let's explore how reference lines and error bars can help us interpret this data. Let's start with the sales amount by customer country column chart. Select it and navigate to the Visualizations pane, then to the Analytics pane component, which is located below the icon of a chart in a magnifying glass. The Analytics pane has all the analytics metrics that Power BI can apply to your visualization. To add a horizontal line giving the average of the sales amount value, select Average line, choose Add line, and turn on the Data label section, expanding its options. Adjust the horizontal position to Right so that the average value will be visible on the visualization, and modify the style to Both so the users are clear about what's being depicted with the reference line. Moving to the other visualization, sum of sales amount by product color, let's add error bars to showcase the potential fluctuation of sales based on color. Select the visualization and navigate to its settings in the Analytics pane once again. Error bars are at the bottom of the Analytics pane. Expand this section and choose On for the Options field box; then, directly below, expand the Type option to select an error bar type to be applied to the column chart. Select "by percentage" and modify the upper and lower bounds to be 5%. The error bars are applied to your visualization, and you can hover over any column to display how the figures for any color would change based on a 5% increase or decrease in the sales amount. This video highlighted the importance of reference lines and error bars in Power BI. Both are key tools for enhancing data visualization: reference lines aid in identifying and comparing key data points, while error bars provide crucial insights into data variability and precision. In summary, reference lines serve as benchmarks or indicators, helping to highlight key data points, and error bars offer a visual representation of the variability or uncertainty within the data.
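The Analytics pane computes the average line for you, but it can be instructive to see the equivalent calculation as a measure. As a sketch, assuming hypothetical Sales[SalesAmount] and Customer[Country] columns, the benchmark drawn across the sales amount by customer country chart corresponds to:

    Average Sales per Country =
    -- Average the total sales amount over the countries currently on the axis;
    -- this is the value an Average line from the Analytics pane would plot
    AVERAGEX (
        VALUES ( Customer[Country] ),
        CALCULATE ( SUM ( Sales[SalesAmount] ) )
    )

A measure like this can also be plotted as its own series or used in conditional formatting, which the built-in analytics line cannot do.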
Adventure Works has streamlined its data analysis thanks to Microsoft Power BI. To keep making data-driven business decisions, Adventure Works needs to be able to visualize performance tracking; this is a crucial business metric. For instance, how is its customer satisfaction rating, how close is it to the required goal, and how can Adventure Works compare satisfaction ratings across different regions? Metrics and scorecards are the answer. They are Power BI tools that Adventure Works can use to track, measure, and report on key business goals and outcomes. In this video you will explore the fundamentals of metrics and scorecards. You'll also learn how to create and customize metrics and discover how to build effective scorecards. Adventure Works needs a scorecard in Power BI service to track the company's ambitious sales target. Jamie, the CEO, wants a real-time updating metric that accurately reflects progress towards the sales goal. This metric is the focal point of the scorecard, which will also encompass other key performance indicators. Metrics in Power BI are quantifiable measures that serve as key indicators of business performance. Essentially, they are data-driven benchmarks used to track and assess the efficiency and success of an organization's processes, initiatives, or strategies. Metrics in Power BI are not just static numbers; they are dynamic and interactive elements that update in real time, reflecting the latest data. The real-time tracking capability of metrics means that businesses can respond promptly to changes. Metrics can be customized to suit specific business needs, such as tracking sales targets, monitoring customer satisfaction levels, or measuring operational efficiency. Scorecards in Power BI are a step further in data visualization and analysis. Scorecards display a collection of related metrics on a single comprehensive dashboard, providing a broad view of business performance. This consolidated view is vital for managers and decision makers: it encapsulates critical data points and trends in an easily digestible format that can reveal how business areas interconnect and impact each other. Scorecards in Power BI are highly customizable. Organizations can tailor the information to align with their strategic objectives and key performance indicators, or KPIs. This includes the ability to set and track goals, visualize progress, and identify areas needing attention or improvement. Let's create a scorecard with metrics for Adventure Works to track its sales amount target. Sign into Power BI service with your credentials, navigate to the left sidebar of the platform, and locate the Metrics icon. Select Metrics to go to the Metrics page. On the top right, select + New scorecard. A new scorecard opens, which you can start populating with metrics. To the right of "Untitled scorecard", select the edit pencil to rename the scorecard to "Adventure Works sales goals". All scorecards are saved in My workspace by default, but you can move a scorecard to another workspace by selecting File and then Move scorecard. Select the Adventure Works sales workspace to move the scorecard to, and select Continue. The scorecard is now ready for its first metrics. To create one, select New metric, name it "Sales amount goal", and assign the admin account as the owner together with yourself. On the Current value field, select Set up to provide an actual figure from your data set instead of a manual number. Choose Connect to data, select the All reports tab, and search for "sales report". Select the sales report, then select Next to move to the next step. The report is previewed in the metrics window. On the report there is a card visualization showing the total amount of sales. Select it to confirm the measure being used, the current value, and the filters and slicers affecting this value. Select Connect to drive this measure onto your metric. On the next field box, Final target, input 30 million as the goal for the total sales amount.
A small box appears as you type the number, aiding you in formatting the figure. Add a status to the metric, which could be "On track" since the sales team is close to hitting the required goal. Let the start date be the default date given, and assign a due date for the team to hit the target; for instance, this could be the end of the year. All metric settings are now configured, so you can select Save to add the new metric to the scorecard. The scorecard is now ready for users to access. To share the scorecard and its metric goals with other Adventure Works members, on the top menu of the scorecard select Share and, for instance, select Renee, the marketing manager, to share the scorecard with her. This video explored metrics and scorecards in Microsoft Power BI, illustrating their critical role in tracking and achieving business goals. Metrics in Power BI provide quantifiable indicators that reflect success or progress towards specific objectives, while scorecards give a comprehensive view, combining multiple metrics into a holistic view of performance. Using these tools can empower organizations to align their strategies with data-driven insights, ensuring that decisions are informed and goal-oriented.
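The current value the metric connects to is ultimately just a measure exposed by a card visual. As a hedged sketch, assuming a hypothetical Sales[SalesAmount] column and the 30 million target from the demo, the underlying pieces could look like this:

    Total Sales Amount =
    -- The measure behind the card visual that the scorecard metric connects to
    SUM ( Sales[SalesAmount] )

    Progress to Goal % =
    -- Optional companion measure: share of the 30,000,000 final target achieved
    DIVIDE ( [Total Sales Amount], 30000000 )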
Congratulations on completing Visualizing and Analyzing Data in Microsoft Power BI. During these lessons you've gained insights into key data analysis concepts and tools in Power BI and worked through practical activities for a deeper knowledge of these topics. Let's recap what you learned and the key takeaways from each topic. You began by learning more about the wide choice of visualizations available in Power BI. General-purpose visualizations such as tables and matrices, cards, KPIs, and slicers are versatile, as they can be used in a variety of analysis scenarios. Power BI also offers many visuals that are tailored for specific types of analysis, and this lesson explored which visualization is appropriate for specific analysis types. For example, categorical analysis is best displayed in bar and column charts or pie and donut charts, while scatter and bubble charts are more appropriate for correlation analysis. Histograms, waterfall charts, and maps were also discussed. This lesson also examined the specific and general formatting settings that enhance the appeal and readability of visualizations in your reports. Modifying the size or position of visual elements, or applying format changes such as font size and color to titles and data labels, can add clarity and impact to visualizations. You also learned about conditional formatting, which can be used to dynamically highlight critical data points and add visual variety. The slicing and filtering features in Power BI allow you to dynamically adjust visuals and focus on specific data points. Slicers allow for intuitive selections and enable you to refine the data represented in all the visuals on a report page. The filtering feature can be applied in the Filter pane, which manages filters at different levels: visual-level filters apply to a single visual, page-level filters apply to all visuals on a page, and report-level filters apply to all visuals within a report. You also had an opportunity to learn about the tools in Power BI that business users can use to export data for further analysis or presentation. For example, the Analyze in Excel feature allows them to work with Power BI data sets directly in Excel, offering a familiar environment for in-depth analysis and custom report creation. Another feature, paginated reports, is ideal for creating print-friendly formats; these reports are designed for easy reading on paper or PDF, and they can accommodate detailed data and complex layouts. You then learned how to enhance reports for usability and storytelling. This lesson began by exploring how smooth page navigation can improve readability and flow in multi-page reports; the use of buttons or interactive links creates a seamless transition between different pages and guides users through the report's narrative. Bookmarks capture specific report views and states, enabling quick access during presentations and highlighting data changes over time. Sorting organizes data within visualizations, making it easier to identify trends and insights. The way that multiple visualizations within a Power BI report interact with each other enhances data exploration and analysis. Filter interactions cause a change in one visual to filter data on another; this refines the displayed data based on the selection and allows users to isolate and analyze specific data points across different visuals. Another option, highlight interactions, does not filter out non-selected data; instead, it emphasizes selected data in connected visuals while the unselected data is dimmed rather than filtered out, providing a clear view of how parts relate to the whole. Lastly, there is the option None, which completely disables the interaction between visuals; this keeps the visuals independent, without any interaction, which can be useful for standalone data presentations. You learned that syncing slicers in Power BI reports improves the user experience: with synchronized slicers, a selection made on one page applies to all other pages. This streamlined approach reduces confusion, saves time, and maintains the narrative flow. You were also introduced to the Selection pane, where you can manage report elements. Here you can clearly name individual visuals to ensure quick and easy identification, group visuals to provide structure to the report, and layer these groups, which helps you guide the report viewer through the data by controlling the order in which the visuals appear. Finally, this lesson focused on how to adapt a report for mobile use with the Power BI mobile layout view; it demonstrated how to modify the visual elements and layout for better readability and interaction on a smaller screen. In the final lesson, you learned about the features in Power BI which help you identify and analyze patterns and trends in your data. It demonstrated how to recognize anomalies and outliers: you were provided with examples of both and shown how to use scatter charts to identify them in Power BI. Recognizing these types of discrepancies is essential for uncovering underlying issues or exceptional events, and it leads to smarter business decisions and strategy improvements. The lesson continued with an explanation of grouping and binning in Power BI. Grouping consolidates similar data points into categories, which facilitates efficient summary visualizations; binning, in contrast, segments numeric data into ranges, aiding in distribution analysis. Finally, you learned about Power BI's AI tools, which provide insights that can inform planning and decision-making: key influencers to identify critical factors affecting outcomes, decomposition trees to break down complex metrics, and forecasting to predict future trends from historical data. You should now have a powerful tool set in Power BI for creating reports.
The first item in this tool set is the wide array of charts offered by Power BI, which you can use to convey insights. Features such as bookmarks and grouping and layering visuals offer a way to create a smooth narrative for the viewer, while filtering and slicers help them to drill down to deeper insights. Techniques such as detecting outliers and anomalies, data grouping and binning, and using AI visuals provide a solid foundation for accurate data analysis. In the world of data and reports, having a centralized location where teams can work together is beneficial for all involved. That's where Microsoft Power BI workspaces come in. Workspaces are more than simple folders; they are special team rooms where analysts can add and share their charts, reports, and data. In this video you will learn what Microsoft Power BI workspaces are and how they can benefit your work. You will explore the different roles people can have in these workspaces and learn how these roles can make teamwork in Power BI smooth and efficient.
At Adventure Works, you are responsible for creating and managing reports for a variety of teams. The sales team requires regular updates on their performance metrics, the marketing team tracks campaign results, and the customer service department looks for feedback on user behavior. Each team creates its own set of data visualizations, often leading to a collection of reports scattered across different platforms. However, using the Power BI workspace feature, you can set up workspaces for each of the sales, marketing, and product teams; each team will then have its own centralized room to create, share, and discuss their specific reports. First, let's explore what Power BI workspaces are. Power BI workspaces are places to collaborate with colleagues and create collections of dashboards, reports, data sets, and paginated reports. Power BI provides two types of workspaces: personal and shared. Your personal workspace is a private area for individual tasks, while shared workspaces are designed for team collaboration, where members can jointly develop and fine-tune reports. Workspaces can contain a maximum of 1,000 data sets, or 1,000 reports per data set. Workspaces offer a feature called roles, which helps to manage access control on these resources. Understanding and properly utilizing the roles within Power BI workspaces is important to ensure effective collaboration and content management, and assigning the correct role to each user is vital to maintain data integrity, security, and efficient workflow. Power BI offers four types of roles: admin, member, contributor, and viewer. Let's start with the admin role. The most powerful role, the admin has full control over the workspace, including content creation, member management, and workspace settings adjustments; admins can add or remove members, change roles, and even delete the workspace. Next you have the member role. Members have the privilege to add, modify, and delete content in the workspace; they can collaborate with others and share the workspace content, but cannot change workspace-level settings. After the member role is the contributor. This role is slightly more restricted than the member role: contributors can add and modify content but cannot delete items from the workspace, and they also cannot share content with others. Lastly, we have the viewer role. The viewer role represents the most limited level of access within a workspace. Viewers are primarily consumers of content, and their permissions are confined to viewing the materials available within the workspace; they do not possess the right to modify or delete any content, making this role ideal for scenarios where read-only access is required. Having established your understanding of workspace roles, let's consider workspace role capabilities. When an individual belongs to a user group, they receive the role you have designated; if a person is part of multiple user groups, they inherit the highest level of permission from the roles they have been assigned. In Power BI service, a user group refers to a collection of users who are grouped together based on certain criteria, roles, or purposes. These groups can be leveraged for various functionalities, including content sharing and permission management. Power BI workspaces offer a unique and powerful feature: the ability to create template apps. These are preset, customizable structures that serve as a foundation for building specific data visualization applications. Once created, they can be shared not just within the organization but also externally. This external sharing capability enhances the utility of template apps: rather than confining data visualizations
and reports within organizational boundaries, businesses can distribute these template apps to customers, partners, or other stakeholders. The usefulness of these template apps lies in their flexibility: when customers receive a template app, they aren't just locked into viewing static, predefined data; instead, they can connect these templates to their own data sets. Now that you've learned about Microsoft Power BI's workspace tools, you can explore ways to help your teams collaborate and use data more efficiently. From setting roles that decide who can do what to offering ready-to-use templates, workspaces streamline many tasks. Imagine you're tasked with presenting multiple reports and data sets to teammates across various departments. It would be convenient to bundle everything neatly together and offer it as a unified online package; this not only simplifies your presentation process but also enhances accessibility for a wider audience. This is precisely the type of solution that Microsoft Power BI workspace apps look to provide, streamlining and enhancing your data sharing capabilities. In this video you are going to learn about Power BI workspace apps, what they offer, and how to create and share them with your audience. Adventure Works faces a data sharing hurdle: different departments need various Power BI dashboards and reports to operate effectively. The finance team requires sales data, the marketing team are keen on customer insights, and the supply chain team wants to view inventory levels. Sharing this data separately would be challenging. This is where Power BI workspace apps can assist you in distributing these dashboards and reports. Using this feature, the data analysis team can group related content into specific apps; for instance, all sales-related reports and dashboards go into one app, while customer insights go into another. These apps are then published to the appropriate teams, ensuring everyone has access to the relevant information. This improves workflow and efficiency for you and the data analysis team. In Power BI you can create official packaged content and then distribute it as an app. Apps can be distributed to a wide audience, such as an entire organization, or to specific groups or people. Apps are created in workspaces: you can choose a selection of reports, dashboards, and data sets from a workspace to distribute as an app, and you can then publish the finished app to large groups of people in your organization. To create or update an app, you need a Power BI Pro or Premium Per User (PPU) license. For app consumers there are two options: either the workspace for the app is not in a Power BI Premium capacity, or it is. If the app is not in a Power BI Premium capacity, all business users need Power BI Pro or Premium Per User licenses to view your app. If the workspace for the app is in a Power BI Premium capacity, business users in your organization without Power BI Pro or Premium Per User licenses can view app content; however, they can't copy the reports or create reports based on the underlying data sets. Let's consider how you create apps. You can start the app publishing process when your workspace has content. When you enter your workspace, you will notice a Create app button, which will be your starting point. You'll be taken to the application settings area, where you can set the name of your application, add a description, choose a logo, and select the theme color for your application. After that, you can select which content you want to include in your app, and you can sort the content as
you please. Once you are happy with the content selection, you must select the audience for this application. Having created your app, you must create and manage the audiences engaging with it. An app audience is the group of people you choose to share your app with. The Audience tab provides a centralized place to decide who has access to your app and to what extent; think of it as your control room, where you can set up different audience groups for your app. You might want to give access to everyone in your company, or you might want just a specific group or certain individuals to have access. With Power BI apps you can create multiple audiences for your app and show or hide different content to each audience. You can also set some advanced options, such as whether your audience can share the data set or build new content with the data set in this app. Once you have the audience and the content they can engage with, it is time to publish your app. Once the app is published, it can be accessed by your intended audience. You can come back to the app and update the settings, and the published app will reflect the changes in a few minutes. Once published, the app can be accessed via its URL or by searching for it in the app marketplace. App consumers in Power BI service and in the Power BI mobile apps only see content based on the access permissions of their respective audience groups. By default, consumers see the All tab view, a consolidated view showing all content that they have access to. In this video you've learned about the process of setting up audiences in Power BI, deciding on the content visibility for each group, and the steps to effectively publish and share your app. Microsoft Power BI's subscription and alert features enable users to remain informed about significant shifts in their data. With data alerts, users can establish notifications that activate when dashboard data surpasses predefined limits; alongside data alerts, subscriptions ensure users consistently receive updates on their reports and dashboards. In this video you will learn about Microsoft Power BI's subscription and alert features, which keep you consistently informed about crucial data changes, and how to utilize them effectively. The newly appointed director of the strategic planning department at Adventure Works is eager to make a measurable impact. With the recent launch of e-bikes at Adventure Works, it's essential for the director to have a firm grasp of the daily sales figures. However, being new to the company's Power BI setup, navigating through the Power BI dashboards can be time-consuming. To streamline this, the business intelligence team establishes a Power BI subscription focused on e-bike sales metrics: every day the director receives an email snapshot of the prior day's sales, enabling immediate data-driven strategic discussions. Power BI's subscription and alert features are tools that redefine the way businesses approach data analytics. It is important to note that to activate subscriptions and alerts, the content must reside in Premium capacity or be tied to a Premium Per User license, and to support near-real-time data flows, data sets must be configured for scheduled refreshes or DirectQuery connections. With data alerts, users can establish notifications that activate when dashboard data surpasses predefined limits, while subscriptions ensure users consistently receive updates on their reports and dashboards. Let's first explore subscriptions.
With subscriptions, timely delivery and tailored report dissemination become seamless, eliminating a laborious manual process and ensuring that stakeholders are always informed. There are many benefits of using subscriptions in Microsoft Power BI. With subscriptions you can schedule automatic delivery of reports on a recurring basis, email or chat digests of key report pages to stakeholders, set different schedules like daily, weekly, or monthly delivery, customize data views with parameters and row-level security, and eliminate the need to manually distribute reports. Users can set up to 24 subscriptions per report or dashboard, with unique recipients, times, and frequencies for each subscription. Subscriptions can include a snapshot and link to the report or dashboard, or a full attachment of the report or dashboard. You can also create dynamic per-recipient subscriptions, which are designed to simplify distributing a personalized copy of a paginated report to each recipient of an email subscription. Now let's turn our attention to alerts. Alerts in Power BI notify users when data meets defined conditions, such as surpassing sales targets, dropping below inventory thresholds, or any other measurable value set within the system. Alerts shift data monitoring from passive to proactive, supporting timely decision-making and allowing businesses to harness real-time data intelligence effectively. The benefits of using alerts in Power BI include getting real-time notifications when data meets thresholds, responding quickly to insights instead of passively monitoring, receiving dynamic metric alerts that account for data variability, being notified of data set refreshes by ingestion alerts, getting push notifications via email, mobile, and Microsoft Teams chat, and shifting from reactive to proactive data analytics. With subscriptions and alerts, Microsoft Power BI analysts can build robust notification strategies, ensuring stakeholders always have visibility into the data they care about. This keeps them informed of critical metrics and enables proactive responses to data trends and anomalies. In today's data-driven world, how can data analysts discern between trustworthy Microsoft Power BI content that holds reliable information and content whose accuracy hasn't been tested? Microsoft Power BI's features for promoting and certifying content hold the answer. Promoting and certifying content in Power BI can elevate the credibility of your data and ensure it is trusted as reliable content. In this video you will learn about the differences between promoting and certifying Power BI content, their respective use cases, and the implications of each method for content creators and consumers. The marketing team at Adventure Works detects a noteworthy increase in sports bike sales in Europe. After compiling the data, a Power BI report is generated highlighting the sales trends and key insights. Recognizing its value, the report is promoted within the European sales division, and given its potential relevance to global strategies, upper management deems it fit for company-wide sharing. Before its wider distribution, the central Power BI team thoroughly reviews the report, ensuring it aligns with global standards. Once certified, this report will be accessible across all regions; its certification badge becomes an assurance of its precision and significance, influencing strategic decisions throughout Adventure Works' global operations. Promoting content in Power BI is like giving it a stamp of approval. When content is marked as promoted, it signifies that it
aligns with specific organizational benchmarks for accuracy and reliability. However, it is crucial to note that while promoted content has met these preliminary checks, it has not been subjected to an exhaustive vetting process. When content like a report or data set is promoted, it is made available for a wider audience to discover and consume: promoted content appears in content packs and curated content lists in the Power BI service. Promoting makes the content visible to more users but does not validate or endorse it, and any user with edit access to a workspace can promote content from it. Certifying content is more specific and detailed than promoting content. It requires setting up a content certification policy and process with designated reviewers; reviewers validate content to ensure it meets standards and best practices before officially certifying it. Certification offers a greater level of trust and validation. When content is certified, it means it has passed through a rigorous scrutiny process, adhering to the standards set by the organization; this is often a testament to its quality, accuracy, and overall trustworthiness. There are four key aspects of certifying content in Power BI. They are: the review process, with expert validation of data quality and adherence to best practices; governance, implementing strict organizational standards while certifying content; visibility, since certified content is marked with a badge for easy recognition; and trust, indicating high-level approval and reliability for all users in the organization. Certifying content requires admin setup of content certification policies, and certified status expires unless the content is recertified within the policy period. Let's explore the key differences between promoting and certifying content. When it comes to level of trust, promoted content signifies that the content is trusted by the creator and might have undergone peer review, while certified content implies organizational approval, often by a central team or authority, indicating the highest level of trust. With visibility, promoted content appears in shared and recommended sections for end users, while certified content stands out with a distinct badge in the service, ensuring users can instantly recognize its elevated status. With regards to governance, promoted content allows for decentralized governance, where individuals or departments can decide the criteria, whereas certified content typically requires centralized governance, with strict criteria that content must meet to achieve certification. Next we have the audience: promoted content is ideal for departmental or team-level sharing, where the audience knows the creator and trusts their expertise, while certified content is best for organization-wide sharing, where the audience might not be familiar with the creator but trusts the centralized certification process. Lastly is the review process: promoted content might involve peer reviews or departmental checks, while certified content often involves strict review by experts or a central BI team, including checks on data sources, calculations, and visualizations. In this video you've learned about content promotion and certification in Microsoft Power BI and the key distinctions between each process. These two methods are vital for distinguishing trustworthy data and ensuring its credibility. Some of your data is in cloud-based storage, but your other data sources are on premises. Do you have to move the on-premises data to the cloud to be able to combine and analyze all your data? No. Microsoft Power BI connects to many data sources.
Microsoft Power BI data gateways connect Power BI's cloud-based data analysis technology to on-premises data sources. The gateway is responsible for creating the connection and passing data through. In this video, you will discover what Power BI gateways are and how they can help organizations manage on-premises data that will later be shared with different types of users.
Adventure Works operates across North America, Europe, and Asia, and it uses its global data sources to analyze market trends and make smart business decisions. Effective decision-making depends on up-to-date reports based on the latest data, which is why the team needs a solution to synchronize on-premises data sources, like SQL Server, Excel files, and Microsoft Dynamics CRM, with the Microsoft Power BI service. With a gateway in place, every morning when a regional manager logs in, they get a dashboard showing not just their own store's sales from on-premises sources but also data from branches across the world. Despite originating from servers thousands of miles away, the data is up-to-date and ready for use, so managers can compare their sales with other regions, identify trends, and adjust their local strategies accordingly.
A Power BI data gateway is an application that connects Power BI's cloud-based data analysis technology to on-premises data sources such as SQL Server databases or Excel spreadsheets. It is required whenever Power BI must access data that isn't accessible directly over the internet. Gateways are responsible for creating the connection and passing data through, and they can be installed on any server in the local domain running Windows Server 2012 R2 or later.
There are three types of gateways: personal mode, standard (or on-premises) mode, and the virtual network data gateway. With a personal mode gateway, only one user connects to data sources, and those sources can't be shared with others; this mode can only be used with Power BI and is ideal when one person creates reports and doesn't need to share data sources. The standard (on-premises) mode gateway allows multiple users to connect to multiple data sources, making it well suited to complex scenarios in which multiple people access multiple data sources. The virtual network data gateway facilitates secure connections for multiple users to data sources protected by virtual networks; as a Microsoft-managed service, it eliminates the need for manual installation and is particularly effective in intricate situations where numerous individuals need simultaneous access to diverse data sources.
Who is each gateway for, and what type of user does it serve? With personal mode, individual analysts manage their own reports and sync personal data sources with the cloud, whereas with standard (on-premises) mode, admins set up and configure the gateway, and the BI team uses it to get up-to-date data for their reports. What is the connection type? Personal mode supports import and scheduled refresh; standard mode supports scheduled refresh and DirectQuery. How is the data managed? Each user handles their own data in personal mode; in standard mode, the company manages data centrally for all users. What about data supervision? There is no supervision in personal mode, where users are on their own; in standard mode, there's a central system to oversee all the data. The final factor to consider is compatibility: personal mode works only with Power BI, while standard mode works with Power BI and various other apps, flows, and more.
The gateway is responsible for creating the connection with the Power BI online service and syncing the local data. Let's examine some of the gateway details. The gateway is installed on a server in the local domain, and during installation, credentials are stored in the local and Power BI services. Credentials entered for the data source in Power BI are encrypted and then stored in the cloud; only the gateway can decrypt them. The gateway controls access to the local data: when an online tool wants data, it asks the gateway, which checks the request and, if the requester has permission, grants access. The gateway doesn't store data; it just connects and transfers. When data in Power BI needs updating, the gateway passes the request to the local data source, and once the source responds, the gateway sends the updated information back to Power BI. One of the standout features of the gateway is the ability to set up a scheduled refresh, which means that at specified intervals the gateway automatically fetches the latest data, ensuring that online reports and dashboards are always up to date.
Finally, let's check some business use cases for Power BI data gateways. Organizations with multiple locations or teams spread across different regions can face challenges in accessing a centralized data source; the data gateway ensures all teams have uniform access to the same data source. Data can also change rapidly, for instance with continual updates in global markets, and businesses need real-time access to data to make informed decisions; the gateway ensures that the data in online reports and analyses is always up to date. A security consideration to remember is that when you use a data gateway, direct connections to the on-premises data sources are minimized: only the gateway communicates with the data source, providing an added layer of security. All transferred data is encrypted, and the established connection is outbound, which reduces the risk of security vulnerabilities. In this video, you learned about Microsoft Power BI gateways, which help organizations keep databases and other data sources on their on-premises networks while allowing secure use of that data in cloud services.
Organizations have a lot of data, but not everyone needs to access all of it all the time, and some data is sensitive in nature, so access to it should be restricted. Row-level security, or RLS, is a powerful data governance capability in Power BI that enables you to control access to the organization's data at a granular level. It allows you to restrict data visibility for different users or groups, ensuring that each user can only access the data they are authorized to view. In this video, you will explore the different types of row-level security and roles and how to configure them in Power BI.
The BI team at Adventure Works is working on quarterly reports and forecasts. As their data grows, they often need to protect their reports and control access among teams: in a report, they want to grant certain teams access to specific visuals while restricting access for others. This security challenge led Adventure Works to implement row-level security. RLS allows them to precisely manage who can view data and particular visuals within a report, providing a tailored and secure experience for each team. Row-level security controls the data viewable by users based on predefined roles and rules. A role is like a group the user belongs to, and the rules can be designed based on columns of the dataset. There are two types of row-level security: static RLS and dynamic RLS.
Static RLS is the row-level security method to use when you have a fixed set of users and roles, for example predefined roles like manager, product lead, customer, or marketing lead in your team. You can create these roles and apply filters within Power BI Desktop using its row-level security editor; static RLS is suitable when you have a small, fixed list of users and simple RLS logic in the report. Dynamic RLS is a more flexible approach because it operates on user attributes and conditions stored in the data itself. It uses a centralized role assignment table containing user attributes such as role assignments, user IDs, and filter conditions; relationships are established between this table and the primary data tables, and DAX expressions dynamically filter data based on the user's role and attributes. Dynamic RLS is ideal for scenarios where user access is based on varying criteria, such as region-specific data access or complex role assignments. Whichever type of row-level security you create, always test your configurations rigorously to guarantee accurate and secure data visibility across users. Testing might mean simply opening your report as a specified user and checking the data visibility: in the Modeling ribbon, the View as option allows you to simulate a user login and check whether the RLS is working as expected.
Let's create some static and dynamic RLS in the Adventure Works reports, starting with static RLS. This is the Adventure Works world sales report. On the Modeling ribbon, select Manage roles and create a new role called Manager Europe; we want people in this role to view data from Europe only. Select the Sales table, select More options (the three dots next to it), and select the region field. Now, in the table filter DAX expression box, add the expression reconstructed in the sketch below and select Save. It means that any user who belongs to the Manager Europe role will only view sales data related to the Europe region.
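The filter is easier to read as code than as a spoken description. Here is the table filter expression as it would appear in the Manage roles editor, reconstructed from the transcript; the Sales table and Product Region column names follow the course example.

```
-- Table filter DAX expression for the "Manager Europe" role on the Sales table.
-- It is evaluated once per row: only rows whose region is "Europe" remain
-- visible to members of the role.
[Product Region] = "Europe"
```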
To test whether the static role settings are working properly, return to the report view in your Power BI editor and confirm that you can currently view sales data for every region. Then, on the Modeling ribbon, select View as, check the Manager Europe role, and select OK. This immediately applies the RLS restrictions to the report, and you get sales data for only the Europe region; all other regional sales data is hidden. You can exit this restricted view by selecting Stop viewing. Since everything is working as expected, publish the report to your workspace and add some users to the Manager Europe role: go to your workspace, select the dataset named world sales report, choose More options (the three dots next to it), and from the drop-down select Security. In the row-level security dialog, select the Manager Europe role, add users to it, and select Save. With this static setup, when users in the role view the world sales report, they will see sales data related to Europe but will be unable to view sales data from other regions.
For a more flexible filtering approach, you can create dynamic row-level security. Return to your Power BI editor to start applying dynamic RLS. For example, in the model view of your report, you can have a table with all the regional managers' email addresses and the product regions they belong to; this table is related to the Sales table using the product region field. With dynamic RLS in place, when managers view this report, they will get only sales data related to their corresponding regions. Return to the Modeling ribbon and select Manage roles. Delete the previously created Manager Europe role and create a new one named Managers. This time, select the Sales table, add the DAX expression reconstructed in the sketch below, and select Save. The expression checks the currently logged-in user's email against the Managers table and filters rows based on the product regions this user belongs to.
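Reconstructed from the spoken description, here is the dynamic table filter as it would appear in the Manage roles editor; the Managers and Sales table and column names follow the course example.

```
-- Table filter DAX expression for the "Managers" role, applied to the Sales table.
-- LOOKUPVALUE searches the Managers table for a row where [Email] matches the
-- signed-in user (USERPRINCIPALNAME()) and [Product Region] matches the current
-- Sales row's region; a Sales row is kept only when such a match exists.
Sales[Product Region] =
    LOOKUPVALUE (
        Managers[Product Region],                        -- column to return
        Managers[Email], USERPRINCIPALNAME (),           -- match the user's login email
        Managers[Product Region], Sales[Product Region]  -- match the row's region
    )
```

One design note: because the match is driven by USERPRINCIPALNAME(), adding or reassigning a manager only requires editing the Managers table; the role definition itself never changes.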
To test whether the security settings are working properly, return to the report view and check that you can view sales data for every region. On the Modeling ribbon, select View as and check the newly created Managers role; you also need to check the Other user option and input one of the managers' email addresses from the Managers table. Notice how the report view changes: you are now viewing sales data only for the regions assigned to this manager. You can select Stop viewing to return the report to the normal, unfiltered view. Return to the Home ribbon and publish this report to your workspace, then open your workspace in the Power BI service, go to the security settings of your dataset, and add as many users as you want to the new Managers role. Dynamic row-level security is now active for this report: when users view it, they will see only the sales data relevant to them, filtered dynamically based on their email address and assigned product regions in the Power BI dataset.
Row-level security, or RLS, is a powerful feature in Power BI for filtering data based on various conditions and roles. By establishing the right relationships and using appropriate DAX expressions, Power BI can filter data so that each user sees only the data relevant to their specific permissions. Always test your RLS configurations rigorously to ensure users' data visibility is accurate and secure.
Team collaboration is crucial for proper data analysis, and the challenge it presents is ensuring the correct distribution of data within your organization. Discover how Power BI's robust permission management settings can help you maintain control over critical datasets at Adventure Works, ensuring data integrity while enabling effective collaboration. In this video, we'll explore aspects of permission management for datasets and workspace apps.
You work as a Microsoft Power BI data analyst at Adventure Works, and there are occasions when you need to share certain datasets with your colleagues, who can either reshare them or create new reports based on them. However, some of these datasets hold significant importance for the organization, and even though they are shared among users, you do not want anyone to modify them. In addition to standard sharing, there are times when you need to share all items in a specific workspace with other users or teams as workspace apps; nevertheless, you still require precise control over some of those items, like reports or datasets, ensuring that various teams can only access the items relevant to them.
The Microsoft Power BI service offers various permission management settings for datasets and workspace apps that can be incredibly helpful in this context. Let's quickly review some key terms. Datasets are the core collections of data you work with in Power BI, often representing various aspects of your organization's data. Workspace apps allow you to share entire workspaces, including datasets, dashboards, and reports; a workspace app is a full data package that can be shared with specific users or teams, ensuring a comprehensive sharing experience.
Now, to briefly review the topic of permissions. With dataset-level permissions, the Power BI service enables you to assign specific permissions to datasets while sharing: you can ensure that although colleagues can access and utilize the data, they cannot make changes to it, preserving the integrity of vital datasets. Then there are workspace app permissions: in some cases, you need to share all files within a particular workspace with other users or teams using workspace apps, and with Power BI's permission management you can maintain granular control over who sees which reports. This means different teams can access only the reports that are relevant to their needs, keeping your data organized and secure.
To check how many workspaces, reports, or dashboards are affected by a dataset, you can perform what is known as impact analysis. To do this, go to your workspace, hover over a dataset, select More options (the three dots next to it), and select Show lineage. This opens the lineage view for your workspace items, where you can see which items are connected to each other; on the right side of the screen, it also shows the impacted workspaces, reports, and dashboards for the dataset. You can also perform impact analysis by selecting Show impact across workspaces under each dataset. To exit lineage view, select List view in the top right corner of your workspace; this takes you back to the previous view, where all the items in the workspace are shown as a list.
Let's experiment with permissions in the Power BI service. To begin, open your workspace. To set permissions for a dataset, select More options (the three dots next to the dataset) and select Manage permissions. From here you can add users to your dataset: at the top, select Add user. In the Grant people access dialog, type the username or email address and then select the appropriate permission level using the checkboxes; for example, if you don't want a user to make any changes to the dataset, uncheck the "Allow recipients to modify this data set" checkbox. Once added, all users are shown in this permission view, and you can make further changes by selecting More options (the three dots next to a user) and removing or granting permissions.
You can also fine-tune permissions for new or existing workspace apps. We have already discussed how to create an app and select an audience in previous lessons, so let's discover how to update the audience for an existing workspace app. Open your workspace and, at the top, select Update app, then select the Audience tab. Here you can fine-tune all the settings related to the audience for an app. On the right side, in Edit audience, you can modify the current audience: for example, if the app is currently shared with all users in the entire organization, you can change it to specific users by selecting Specific users or groups, typing their names, and selecting Update app. Alternatively, you can select New audience and choose other users with different permissions.
For example, you may want to share the app with another user, but this time allow them to share the dataset among the users in this audience group: select Advanced settings, then check "Allow people to share the data set in this app audience." You can also select "Allow people to build content with the data set in this app audience" if you want to allow the creation of new reports based on this dataset. To finish, select Update app, select Update again in the confirmation pop-up, and finally close the published pop-up. That is a demonstration of how you can manage permissions for a specific dataset or for workspace apps inside your Power BI service area. Power BI's permission management settings offer a robust framework for maintaining data integrity while facilitating effective collaboration at organizations like Adventure Works; whether you're safeguarding critical datasets or sharing workspaces, these tools help you apply access control to your data.
Congratulations on reaching the end of these lessons on deploying and maintaining assets. You explored creating, monitoring, connecting to, and maintaining workspaces, datasets, and dashboards in Microsoft Power BI. Let's recap what you've learned so far.
You began the first lesson by exploring the concept of a workspace. You learned that a workspace is a specialized area in Power BI that holds important assets like datasets, reports, and dashboards. Its advantages are that it helps organize assets for easy management, provides security through access control (only permitted users can access workspaces), enables collaboration (teams can use workspaces to build reports), and allows analysts to update or modify data quickly. When creating a new workspace, you must consider workspace roles, which determine who can perform each task: viewers can view content but can't modify it, contributors can add and modify content, members can alter content and add new members, and admins have full control over the workspace's assets and its members.
During this lesson, you learned how to share workspace assets as an app; creating an app requires a Power BI Pro or Premium Per User license. The technical process of creating apps in Power BI was outlined, beginning with selecting Create app in the workspace, which leads to an application settings area where you can name the app, add a description, set a logo, and choose a theme color. Content can be selected and sorted for inclusion in the app, followed by selecting and managing the audience; Power BI allows the creation of multiple audience groups for an app, enabling tailored access and content visibility.
You also learned how to manage assets in a workspace. You can import assets directly into a workspace by uploading them or publishing them from Power BI Desktop, and when changes are made, you can always publish again, which updates the previously published reports and datasets. In addition, you learned about setting up subscriptions and alerts in the Power BI service, which allow users to receive regular updates and notifications based on data changes; these tools enhance user engagement by automating the distribution of insights and ensuring timely awareness of critical metrics.
The lesson continued by exploring the steps required to promote and certify content in Power BI. Promoting and certifying are crucial for establishing trust and standardizing data quality across the organization, thereby enabling users to identify and rely on the most accurate and relevant business intelligence assets.
The lesson ended with a detailed guide to the various global options for files within Power BI, such as data load and report visualization settings. Knowing how to configure these settings is important because it allows for more tailored and efficient data processing, enhances visual representation, and ensures a more seamless and intuitive user experience.
The next lesson started with the concept of a data gateway and how it can help Power BI data analysts and organizations. A data gateway serves as a bridge between Power BI's cloud services and on-premises data sources such as SQL databases or Excel files. Whether you are a data analyst working on your own or for an organization, you can sync your data with datasets hosted in the Power BI service using these gateways and keep those datasets up to date by setting up a scheduled refresh. There are three types of data gateway: personal mode is for single-user use and is suitable for individual report creators; standard mode, also known as on-premises mode, supports multiple users and data sources and is used for complex access scenarios; and the virtual network data gateway allows multiple users to connect to various data sources within virtual networks without any installation, as it is managed by Microsoft.
This lesson also discussed the details of row-level security, or RLS, in the Power BI service, a feature that allows for more granular control over access to data. RLS enables creators to define permissions on data rows so that users will only view data relevant to them, enhancing both security and user experience. This is particularly useful in organizational scenarios where data access needs to be restricted based on user roles or departments, ensuring that sensitive information remains confidential while still providing valuable insights to authorized personnel.
Finally, this lesson covered the management of permissions for datasets and workspace applications. Effective permission management enables selective sharing of datasets and workspace apps, allowing only designated individuals to access the datasets and create reports from them. The workspace audience management tools allow for sharing with the entire organization or customizing access for specific users. Additionally, impact analysis tools are available to determine the connectivity and potential effects on workspaces, reports, and dashboards when a dataset is updated. You've reached the end of our summary on deploying and maintaining assets. Keep practicing your practical skills with sample datasets, reports, and dashboards, and remember you can always revisit any item in the course to revise a topic by playing a video, viewing a document, or engaging with an activity. Best of luck with your studies.
The Microsoft PL-300 exam is a professional certification in Microsoft Power BI for aspiring analysts. The exam tests your knowledge and skills in the technical and business requirements of data modeling, analysis, and visualization in Power BI. In this video, you'll discover the recommended strategy to maximize your chances of passing exam PL-300: Microsoft Power BI Data Analyst. A successful exam result is achievable if you are well prepared and practice some basic strategies. One of the best ways to prepare is to take a practice test before the exam; this way you can monitor your progress and identify the areas requiring more study or attention. You have taken knowledge checks and graded quizzes and completed exercises throughout this course, all designed to help you monitor your progress while preparing for the real exam.
You'll be able to complete the PL-300 mock exam a little later, focusing on the topics and key skills measured in the proctored exam: preparing the data, modeling the data, visualizing and analyzing the data, and deploying and maintaining assets. During this program, you have covered the skills measured in the PL-300 exam and gained significant hands-on experience using the real-world Adventure Works dataset; now it's time to practice what you've learned. The PL-300 mock exam is based on a similar style and format to the proctored exam, and you can revisit any lesson to revise a concept if you need to review anything. This practice exam is intended to provide an overview of the style, wording, and difficulty of the questions you are likely to experience. These questions may differ from those you could encounter in the real exam, and the practice exam is not illustrative of the length or complexity of the official exam. For example, you may encounter additional question types such as drag and drop, build list, order, and case studies, and you'll also encounter exhibit and active screen questions like drop-down menus, option boxes, and complete-a-statement questions. These examples provide insight into what to expect on the exam and help you determine whether additional preparation is required.
Review some possible exam formats and question types from the Microsoft documentation to get a feel for the exam. In the reading "Preparing for the exam," you can access Microsoft's exam sandbox environment, which was created to demo the interface that hosts exams. To protect exam security, Microsoft does not specify exam formats or question types before the exam; Microsoft continually introduces innovative testing technologies and question types and reserves the right to incorporate either into exams at any time without advance notice. In the mock exam, you'll have 150 minutes to complete the final practice exam, which consists of 50 questions, and on completion you'll be presented with your overall score and the questions you answered correctly.
Once you've completed the PL-300 mock exam, it's time to focus on the real exam. A good exam strategy for the PL-300 exam can be summarized as a checklist of what to do on test day. When test day arrives, follow these tips: ensure that you are well rested and nourished; eat a meal or a snack, and try not to drink too much water so you don't need the bathroom during the exam. Give yourself enough time to get set up; the last thing you want is to feel hurried or be late for the exam. Remember to bring your current government-issued ID, which must match the name on your Microsoft certification profile, and use your phone to capture the required headshot and ID. If you're unsure and require more details, check the official documentation from Microsoft and Pearson VUE; you'll find links to these resources in the reading "Preparing for the exam." The PL-300 is a closed-book exam, meaning you cannot bring any study or exam materials into the examination, and a score of 700 or greater is required to pass.
When it comes to answering the exam questions, you can use these strategies. Keep calm and read the entire question before checking the answer options. If multiple answer options exist, try eliminating those you know are incorrect; by using this process of elimination, you can cross off all the incorrect answers. Read every answer option before choosing a final answer, and don't rush and pick the first one. If you're having difficulty with a question, move on and return to it after you've answered all the questions you know.
Try not to spend too much time on any one question, and ensure you have enough time to attempt all the questions before checking them at the end. You may be unable to change some of your answers, so make sure you answer each question carefully; avoid second-guessing yourself and changing your answers, as this can often be counterproductive. The exam does not employ negative marking, so if you're unsure of a question, make the best educated guess possible. The important thing to remember is that a successful blend of preparation, test strategy, and exam technique will help you maximize your chances of obtaining certification. Best of luck.
On a brisk Monday morning, you step into your office ready to tackle the terrain of data as a seasoned Power BI specialist. Your manager stops by your desk, her expression a mix of excitement and anticipation, and places a challenge before you: "I need you to explore Microsoft Copilot in Bing, a powerful artificial intelligence (AI) tool. It's designed to revolutionize problem solving and enhance productivity. I believe it's quite transformative, and I want your insights on it." As you switch on your computer, the weight of opportunity settles in, and your mind races with possibilities. Could Copilot streamline the development process and uncover new insights that haven't been considered yet? Instead of reacting to market changes, now there's an opportunity to proactively shape them. It's more than just analyzing data; it's stepping into the future of generative AI.
Microsoft Copilot is a powerful AI tool that enhances how users interact with data and digital content across various platforms. With its design deeply integrated into Microsoft's ecosystem, including Bing and Microsoft Edge, Copilot serves as an everyday AI companion that simplifies tasks, boosts productivity, and enhances creative processes. Copilot is accessible directly through the Bing website or the Microsoft Edge browser. It employs advanced AI to provide a dynamic interaction model where you can ask questions, generate content, and receive detailed answers directly related to the task you are performing. This is useful in scenarios like generating a color palette from a company logo, understanding and troubleshooting Data Analysis Expressions (DAX) formulas, or answering specific contextual questions about improving a report interface. In the ever-changing digital landscape, proficiency with advanced tools like Copilot is crucial for adapting swiftly to new technologies and maintaining a competitive edge.
Now that you know what Microsoft Copilot is, let's explore its core capabilities and features. Copilot transforms traditional search by providing comprehensive, context-aware responses to complex queries; whether you're asking about the benefits of using DirectQuery or wanting travel advice for attending a data conference, Copilot generates text-based answers, images, additional links, and more, delivering a rich, detailed response. Copilot excels at creating text for a variety of needs, including drafting emails, writing user manuals, and generating creative content like marketing posts; you input a prompt, and Copilot crafts the necessary text in seconds, tailored to the desired tone and format. Integrated with DALL-E 3 technology, the Designer feature in Copilot enables users to generate images on demand.
This tool is accessible directly through the Bing interface and creates visual content ranging from social media posts to custom event invitations. Copilot also extends its functionality to the Edge browser, offering insights within the sidebar: additional information, links, and suggestions enrich the browsing experience, helping you discover new content and access relevant data quickly. Copilot supports various multimodal interactions, meaning it can handle tasks combining different data input and output types, such as text and images, which enhances the flexibility and depth of user interactions with the tool.
Having covered Microsoft Copilot's capabilities and features in Bing, let's explore how its varied modes adapt to an individual's needs. These modes (creative, balanced, and precise) shape the AI's responses to match the context of your queries.
Creative mode is suitable for tasks requiring a high degree of creativity, such as composing poetry, generating images, or crafting engaging narratives. It enhances responses with stylistic elements like wordplay, providing more elaborate and detailed communication. For instance, creative mode can be used in the retail industry to develop unique marketing campaigns that captivate customers: consider a clothing brand wanting to launch a new line; using creative mode, it can generate inventive product descriptions, engaging storytelling around the brand's journey, and eye-catching promotional materials that differentiate its offerings from competitors and attract more customers.
Balanced mode is the default configuration, providing a compromise between creative mode's detailed expressiveness and precise mode's succinct nature. It aims to deliver factually correct responses with a slight creative twist to enhance engagement. This mode is well suited to regular inquiries that require clear and accurate information enriched by a creative element to maintain interest and readability. In the manufacturing sector, balanced mode can be used to write user manuals that are not only informative and precise but also easy to understand and engaging, helping ensure that technical documentation, while accurate, is also accessible to users, enhancing customer satisfaction and reducing errors in product use.
Precise mode focuses on delivering brief, accurate responses when precision and conciseness are critical. It ensures that responses are direct and to the point, concentrating solely on factual content without creative additions. It is ideal for straightforward questions where timely and accurate information is needed, or when a concise summary is required to quickly grasp the essential facts. For example, precise mode is essential for developers and data professionals when troubleshooting complex formulas: it provides straightforward, accurate responses that help them quickly understand errors in their code or apply the best techniques to optimize their queries without sifting through irrelevant information.
By harnessing the power of Microsoft Copilot, you open up vast digital possibilities; with each query you explore and insight you uncover, you're not only keeping up with new-age technology, you're beginning to drive it.
As a data analyst, your agenda consists of creating a series of Power BI reports that accurately capture the company's performance over the past quarter. You have gathered the necessary data and spent hours planning the data flow. However, as you explore the dataset, you encounter familiar roadblocks.
Some of the formulas in your reports are returning errors, disrupting the flow of your analysis. Moreover, ensuring the aesthetics of the reports align with your company's theme is proving more time-consuming than anticipated. You often find yourself pondering the hours spent each week on similar tasks, time that could otherwise be directed toward deeper analysis that could propel the company forward. The potential of integrating Copilot with Power BI becomes apparent in moments like these.
As a data analyst, your daily work is fraught with challenges that can perplex even the most experienced professionals in the field; each step, from data collection to report delivery, presents obstacles. One of the primary issues data analysts face regularly is formula errors. These can range from simple syntax mistakes to more complex logical problems that skew the analysis and lead to incorrect conclusions. Such issues not only delay the reporting process but also jeopardize the accuracy and reliability of the information presented to decision makers. Maintaining consistency in color usage that reflects the company's theme across all reports requires meticulous attention to detail and in-depth knowledge of branding guidelines; these design challenges often consume a substantial amount of time and can divert focus from core analytical responsibilities.
Copilot paired with Power BI transforms the way you navigate these challenges. You can ask Copilot questions about techniques to improve your report's interface or instruct it to troubleshoot DAX formulas. For instance, you might say, "Explain this DAX formula and why it results in an error." Copilot immediately interprets your request and generates the relevant explanation and a corrected DAX formula without you manually troubleshooting it; a hypothetical example of such an exchange follows below. Moreover, Copilot's machine learning (ML) aspect continuously learns from the data it processes and its interactions with you, enabling it to adapt and understand your specific needs over time. For example, imagine you are working on a series of financial reports and Copilot has resolved DAX errors for these formulas earlier in the chat session; Copilot then recognizes these patterns in your query history and personalizes future interactions to keep the chat context relevant. This saves you time by reducing the need to copy and paste formulas repeatedly and helps ensure accuracy in your analysis by minimizing the potential for errors.
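To make the "explain this DAX formula" prompt concrete, here is a hypothetical exchange: a broken measure you might paste into Copilot and the kind of corrected version it could return. The Sales table, column names, and the measure itself are illustrative, not from the course.

```
-- Hypothetical broken measure: DIVIDE receives bare columns instead of
-- aggregations, so Power BI raises "A single value for column 'Profit'
-- in table 'Sales' cannot be determined."
Profit Margin = DIVIDE ( Sales[Profit], Sales[Amount] )

-- The kind of fix Copilot might suggest: wrap each column in SUM so the
-- measure aggregates over the current filter context.
Profit Margin = DIVIDE ( SUM ( Sales[Profit] ), SUM ( Sales[Amount] ) )
```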
Now that you understand how Copilot leverages cutting-edge AI technologies, let's explore the advantages this powerful tool offers data analysts. These features not only enhance the efficiency of workflows but also elevate the quality and impact of reports. Copilot excels at troubleshooting and optimizing DAX formulas, which are central to data manipulation and analysis in Power BI: if you're struggling with a formula's performance or accuracy, Copilot provides suggestions for optimization, and it can explain the logic behind DAX functions in simple terms, making it easier for you to understand and use them effectively in your reports.
From an aesthetic standpoint, Copilot can analyze images of your current reports and suggest improvements to the layout. For example, if you upload an image of a report you're working on, Copilot can analyze the placement of elements and suggest a more streamlined or visually appealing arrangement that enhances readability and viewer engagement. When you upload an image representing a company's branding, like a logo or marketing material, Copilot can analyze the colors and generate a color palette that matches the branding; this ensures that all reports maintain a consistent visual style aligned with the company's identity, enhancing the professional quality of your presentations. Copilot can also serve as a creative assistant by generating images that inspire the design of your reports. For example, if you need to create a report on sustainability, Copilot can generate images that evoke themes of sustainability, which you can use as a reference when designing your own report visuals, ensuring your reports are not only informative but also aesthetically aligned with the topic. It is clear that Copilot is not just a tool but an assistant that brings out the best in your analysis efforts. Remember: every report you create, every DAX formula you solve, and every insight you derive contributes to the decisions that drive the company forward. As you continue to leverage the power of Power BI, redefine the boundaries of what you can achieve with data, and let Copilot guide you to new horizons of possibility.
It's early Monday morning, and your manager has assigned you a critical task: develop a report for the upcoming quarterly review that embodies the company's new logo and color scheme. The challenge is not only to present data but to do so in a way that reflects the company's updated brand identity. Feeling the weight of this responsibility, you take a deep breath, sip your coffee, and get to work, confident you can complete the task well with your trusty ally, Microsoft Copilot.
When designing a report, matching colors to a company's logo and branding isn't just about aesthetics; it's also about communication and consistency. AI-assisted tools like Microsoft Copilot enable you to easily integrate a new color palette, aligning your report with the updated company branding. This AI-driven approach enhances productivity by automating the once time-consuming task of manual color matching. Let's unpack how you can achieve this.
First, open Microsoft Edge and select the Copilot icon next to the search bar; this access point is part of Microsoft's integrated experience, merging the functionalities of Bing and Copilot. Ensure that you are signed in with your Microsoft account (you'll be prompted to create one if you don't have it). Once signed in, select the More Creative button to activate creative mode, which is recommended for highly creative tasks like developing unique concepts or exploring artistic elements such as images. Now, toward the bottom left of the interface, next to "Ask me anything," select Add an image, followed by Upload from this device. In File Explorer, navigate to the location where the logo image is saved, select the image file, and confirm the selection by selecting Open; the image then begins to upload to Copilot. Type instructions in the text box depending on what you need Copilot to do with the image; in this instance, let's create a color palette by inputting "Generate a color palette based on this logo." Upon selecting the Submit button, Copilot uses its AI technology to analyze the uploaded logo: it examines the logo's colors and applies algorithms designed to identify and extract the predominant and accent colors. Based on the analysis, Copilot presents the color palette as hex codes, the standard for color representation.
If the initial palette isn't satisfactory or lacks some colors, you can modify your prompt to specify your needs further. For instance, if the company branding includes blue but blue wasn't present in the logo, you can amend your prompt to include shades of blue in the palette. With your generated color palette, it's time to integrate these colors into your Power BI report. Open the report and select the View tab, then select the Themes drop-down to expand the theme gallery. Select Customize current theme and input the hex codes provided by Copilot via the drop-down buttons for each color setting, such as first level and second level; these hex codes represent the colors identified from the logo. After inputting the new colors, select Apply to update the report with the new theme. There you have it: you can now confidently use Microsoft Copilot to enhance your report design, maximizing productivity and reducing the time spent on the task. Remember, partnering with an AI tool such as Microsoft Copilot makes managing complex tasks and deadlines easier, so enjoy the journey as you embrace and explore its powerful capabilities.
As a senior data analyst, you've spent weeks crafting a Power BI dashboard for the company's quarterly review. However, as you run through the last data validations, a series of errors cascades through critical DAX formulas. These aren't simple fixes: they involve complex nested IF statements within CALCULATE functions that you had previously tested. In this critical moment, you recall that Microsoft Copilot in Bing is the solution you need. In this video, you'll discover the importance of mastering DAX for data manipulation and analysis in Power BI and learn how Copilot can be a valuable tool for addressing formula issues.
Mastering DAX is essential for turning complex data into compelling business insights; however, even the most skilled data analysts can encounter errors when navigating its syntax and functionality. Understanding these common issues can help you write more robust and efficient DAX code, so let's explore them and how to resolve them using Microsoft Copilot in Bing.
When applied over large datasets, the FILTER function can be computationally expensive and slow report performance. For instance, imagine using FILTER to identify all sales transactions above a certain value across the sales database: the row-by-row iterative nature of FILTER would examine each transaction individually, causing delays when loading the report. Here, Copilot can help optimize the formula to enhance performance and assist in correcting any logical errors by refining the filter criteria; a sketch of the slow pattern and a faster alternative appears after this walkthrough. Begin by opening your Power BI Desktop report and navigating to the table containing the FILTER formula you intend to refine. Select the formula bar where the FILTER statement is displayed and copy its contents. With the formula copied, launch Microsoft Edge and select the Copilot icon in the sidebar to access Copilot in Bing. Once Copilot loads, select the More Precise button to activate precise mode. Locate the "Ask me anything" text box and paste the slow FILTER formula to give Copilot context, then type your specific query on a new line in the same prompt window; in this instance, to optimize performance, you can type, "How can I optimize this FILTER function to improve performance when handling large datasets?" Select the Submit button to send the query to Copilot.
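As an illustration of the kind of rewrite Copilot might propose for a slow FILTER, here is a hypothetical before-and-after; the Sales table, Amount column, and the 1000 threshold are illustrative, not from the course.

```
-- Hypothetical slow pattern: FILTER iterates every row of the Sales table.
High Value Sales =
    CALCULATE (
        SUM ( Sales[Amount] ),
        FILTER ( Sales, Sales[Amount] > 1000 )
    )

-- Leaner equivalent: a predicate filter on the column alone, so the engine
-- filters a single column instead of materializing the whole table.
High Value Sales Optimized =
    CALCULATE (
        SUM ( Sales[Amount] ),
        KEEPFILTERS ( Sales[Amount] > 1000 )
    )
```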
Once you press Submit, Copilot processes your input using its AI capabilities. When you have a revised FILTER formula and are satisfied with it, copy it directly from the Copilot interface by selecting the Copy button. Navigate back to your Power BI report, select the table where you want to apply the updated formula, select the formula bar, and paste the updated formula, making sure to replace the old formula completely to avoid conflicts or errors. Press Enter to commit the formula in Power BI and observe how it executes.
One of the most powerful yet tricky aspects of CALCULATE is its ability to modify the filter context of a calculation. Suppose you want to use CALCULATE to sum sales for all countries, but it returns total sales for only the United States. Microsoft Copilot in Bing can guide you through the correct structuring of CALCULATE formulas, suggest how to perform dynamic aggregations, and even detect and suggest fixes for syntax errors. In the "Ask me anything" text box, paste the CALCULATE formula you need to troubleshoot, and on a new line in the same prompt window, type, "How can I modify this CALCULATE formula to sum sales for all countries?" Once you select the Submit button, Copilot returns an explanation and a corrected CALCULATE formula with the requested context. After reviewing the initial results, you can ask additional questions to deepen your understanding or refine your formula further, for instance, "Can you suggest ways to avoid common syntax errors in this CALCULATE formula?" This follow-up helps you grasp common mistakes and learn best practices for writing DAX formulas. Once you are satisfied with the response from Copilot, select the Copy button, then paste the result into Microsoft Power BI to assess whether the suggestions improve the formula's functionality.
Deeply nested IF statements can become difficult to manage and troubleshoot. Imagine using nested IF statements to categorize sales into different classes based on an amount column: the complexity of checking multiple conditions can easily lead to mistakes in the logic. Copilot can simplify this by suggesting straightforward alternatives or helping restructure the nested conditions into manageable components. In the "Ask me anything" text box, paste the IF formula that requires troubleshooting, and on a new line in the same prompt window, enter, "Can you suggest a simpler alternative to this nested IF statement for better manageability?" Upon selecting the Submit button, Copilot generates suggestions to simplify the formula or improve its efficiency. After reviewing the feedback, select the Copy button, navigate to Power BI Desktop, paste the revised IF statement into the formula bar, and press Enter to apply it. Sketches of both fixes appear below.
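Here, as a sketch, are hypothetical versions of the two fixes discussed: ALL to lift the country filter so CALCULATE sums sales for all countries, and SWITCH(TRUE()) as the flatter alternative to a nested IF chain. Table, column, and threshold names are illustrative, not from the course.

```
-- Summing sales for all countries: ALL removes any filter on the country
-- column, so the measure ignores a "United States" selection on the report.
All Country Sales =
    CALCULATE (
        SUM ( Sales[Amount] ),
        ALL ( Sales[Country] )
    )

-- A deeply nested IF chain for classifying each row by amount...
Sales Class =
    IF ( Sales[Amount] > 10000, "Large",
        IF ( Sales[Amount] > 1000, "Medium",
            IF ( Sales[Amount] > 0, "Small", "None" ) ) )

-- ...and the flatter SWITCH ( TRUE (), ... ) form Copilot might suggest:
-- conditions are tested top to bottom and the first match is returned.
Sales Class Simplified =
    SWITCH (
        TRUE (),
        Sales[Amount] > 10000, "Large",
        Sales[Amount] > 1000, "Medium",
        Sales[Amount] > 0, "Small",
        "None"
    )
```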
As your journey through mastering DAX comes to a close, reflect on the transformative power of blending AI with your analytical skills. As you move forward, equipped with the knowledge of DAX and the support of AI, remember that each challenge overcome is not just a step toward progression but a leap toward mastering Power BI.
Congratulations on completing the Microsoft PL-300 exam preparation and practice course. Your dedication has given you the skills and tools for success when taking the Microsoft PL-300 exam, and you have now achieved all the Power BI milestones in this program. This course gave you opportunities to practice your exam technique and refresh your knowledge of all the key areas assessed in the Microsoft PL-300 exam. You tested your knowledge in a series of practice exams mapped to all the main topics covered in the exam to help you prepare for certification success, and you also got tips and tricks, testing strategies, useful resources, and information on how to sign up for the Microsoft PL-300 proctored exam. Now that you have successfully completed this professional certificate, you are ready to schedule the Microsoft PL-300 exam through Pearson VUE.
Through a mix of videos, readings, and exercises, you learned about the expectations for the learning content, starting with an introduction to the course. Following this, you were provided with information about the Microsoft certification: an introduction to preparing for the exam, how to prepare for the proctored examination, how the exam is administered, the topics covered in the PL-300 exam, and testing strategy.
Next, you reviewed what you learned about getting data from data sources. Here you revisited how to identify and connect to a data source using a shared or local dataset; DirectQuery, import, and dual storage modes; parameter values; how to set up a dataflow; how to connect to a dataflow; the Microsoft Dataverse; and how to get data from data sources. You then investigated how to profile and clean data, consolidating your knowledge of evaluating data, data statistics, and column properties; resolving inconsistencies and data quality issues; and an in-depth dive into profiling and cleaning data. After that, you explored the process of transforming and loading data, where you covered how to create and transform columns, when to use reference queries, how to merge and append queries, table relationships, and an in-depth view of transforming and loading data.
Next, you explored modeling data, where you revised key concepts related to modeling data in Power BI. You reviewed designing data models, learning how to design a schema, implement role-playing dimensions, use CALCULATE to manipulate filters, and configure cardinality and cross-filter direction. You then explored how to create model calculations using DAX, covering calculated columns and single aggregation measures, as well as how to implement time intelligence measures; you also reviewed the differences between additive, semi-additive, and non-additive measures. Later, you reviewed how to implement a data model, exploring calculated tables and data hierarchies, and you covered how to optimize model performance, including important topics like using the Performance Analyzer and improving performance via cardinality and summarization.
You reviewed data visualization and analysis techniques in Power BI to help you prepare for the PL-300 exam. In this section, you revisited the process of report creation, including using appropriate visualizations, configuring and formatting visualizations, applying slicing and filtering, and exporting and printing reports. You re-examined how to enhance reports for better usability and storytelling, including report navigation and sorting, interactions between visuals, syncing slicers, grouping and layering visuals using the Selection pane, and designing reports for mobile devices. Following that, you explored how to identify patterns and trends, revisiting how to detect outliers and anomalies, grouping and binning data, AI visuals, reference lines and error bars, and scorecards and metrics.
You then moved on to deploying and maintaining assets, where you revised creating and managing workspaces and assets.
You reviewed key concepts such as workspaces and workspace roles, workspace apps, how to publish, import, or update assets in a workspace, subscriptions and data alerts, how to promote or certify Power BI content, and global options for files. Next, you reviewed how to manage datasets, with a summary of data gateways, row-level security, and granting access to datasets. To round off your learning, you took a mock exam set up in a similar style to the industry-recognized Microsoft PL-300 exam.
By passing the exam, you'll become a Microsoft Certified Power BI Data Analyst, which will also help you start or expand a career in this role. This globally recognized certification is industry-endorsed evidence of your technical skills and knowledge. The exam measures your ability to prepare data for analysis, model data, visualize and analyze data, and deploy and maintain assets. To complete the exam, you should be familiar with Power Query and with writing expressions using Data Analysis Expressions (DAX). You've done a great job so far, and you should be proud of your progress; the experience you've gained will showcase your willingness to learn, your motivation, and your capability to potential employers. It's been a pleasure to embark on this journey of discovery with you. Best of luck in the future.
The Microsoft Power BI Analyst program is an excellent resource to start your career, whether you're a beginner or a seasoned professional looking to improve your skills. Data is the driving force behind this ever-changing modern world, shaping and developing industries and society. It has transformed the way institutions operate, from banks and hospitals to schools and supermarkets, and for businesses, data is everything: it informs decisions and helps create value for customers. Content streaming services analyze data to decide what content to promote, social media services analyze data to determine what products their customers are interested in, and your local supermarket gathers and analyzes data to ensure the products you want are available. The result of having all this data is that professional analysts are required to process and sort it to gain the insights that drive both the business and social worlds. Are you intrigued by this career field and wondering how to get started? Let's meet two students who have just begun their careers in entry-level positions and discover how and why they've chosen to embark upon career paths in this field with Microsoft and Coursera.
Lucas, a recent information technology graduate, is currently searching for his first IT job. He is eager to secure a position in the IT sector that offers good earning potential and quick career progression, and he wants to work full-time in data analysis, as he feels this career would offer both. During his degree, he found working with and analyzing cloud-based data to be the most enjoyable element, hence his focus on this career path. Lucas currently works shifts in a warehouse environment, so he will need the flexibility of self-paced learning; his earnings are low, so he wants to achieve the qualification using the same basic laptop he relied upon as a student. Despite being a beginner, Lucas has already mapped out his career and certification path and has enrolled in the Microsoft Power BI Analyst program. He plans to apply for an entry-level position as a data analyst once he has successfully completed the program and passed the PL-300 exam. As a data analyst, he will inspect data, identify key business insights for new business opportunities, and help solve business problems.
Amelia has been working as an administrative assistant in sales and marketing since leaving high school, and now that a few years have passed, she is ready to embark upon a new career path. In her current role, Amelia has seen Power BI reports and dashboards created by colleagues and shared with the team, and she was impressed by how the information was used to shape and focus the sales campaigns; this sparked an interest in a career in data analysis. Amelia's job requires her to work long hours, so the ability to structure her own learning path is vital, and she also has a long commute, so she would like to access e-learning through her smartphone or tablet. In the short term, pursuing the Power BI Analyst qualification will showcase her dedication and help her apply for more senior roles in her department. Amelia doesn't have a scientific background, but she finds IT concepts logical and easy to understand, so she's embarking on the Microsoft Power BI Analyst program, as it doesn't assume a pre-existing high level of technical knowledge. In the long term, she hopes to secure an entry-level role as a Power BI analyst, where she will be responsible for building data models, creating data assets like reports and dashboards, and ensuring data requirements are met.
You may be in a similar position to Lucas and Amelia and share an interest in this exciting field of data analysis. Like them, you can begin your career in this field by enrolling in the Microsoft Power BI Analyst program. This will be the start of your new adventure; good luck with your learning journey.
Affiliate Disclosure: This blog may contain affiliate links, which means I may earn a small commission if you click on the link and make a purchase. This comes at no additional cost to you. I only recommend products or services that I believe will add value to my readers. Your support helps keep this blog running and allows me to continue providing you with quality content. Thank you for your support!
These resources provide a comprehensive pathway for aspiring database engineers and software developers. They cover fundamental database concepts like data modeling, SQL for data manipulation and management, database optimization, and data warehousing. Furthermore, they explore essential software development practices including Python programming, object-oriented principles, version control with Git and GitHub, software testing methodologies, and preparing for technical interviews with insights into data structures and algorithms.
Introduction to Database Engineering
This course provides a comprehensive introduction to database engineering. A straightforward description of a database is a form of electronic storage in which data is held. However, this simple explanation doesn’t fully capture the impact of database technology on global industry, government, and organizations. Almost everyone has used a database, and it’s likely that information about us is present in many databases worldwide.
Database engineering is crucial to global industry, government, and organizations. In a real-world context, databases are used in various scenarios:
Banks use databases to store data for customers, bank accounts, and transactions.
Hospitals store patient data, staff data, and laboratory data.
Online stores retain profile information, shopping history, and accounting transactions.
Social media platforms store uploaded photos.
Work environments use databases for downloading files.
Online games rely on databases.
Data, in basic terms, is facts and figures about anything. For example, data about a person might include their name, age, email, and date of birth, or it could be facts and figures related to an online purchase like the order number and description.
A database is data organized systematically, often resembling a spreadsheet or a table. This systematic organization means that every piece of data has elements, features, or attributes by which it can be identified. For example, a person can be identified by attributes like name and age.
Data stored in a database cannot exist in isolation; it must have a relationship with other data to be processed into meaningful information. Databases establish relationships between pieces of data, for example, by retrieving a customer’s details from one table and their order recorded against another table. This is often achieved through keys. A primary key uniquely identifies each record in a table, while a foreign key is a primary key from one table that is used in another table to establish a link or relationship between the two. For instance, the customer ID in a customer table can be the primary key and then become a foreign key in an order table, thus relating the two tables.
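To make the key relationship concrete, here is a minimal sketch using Python’s built-in sqlite3 module with an in-memory database. The course context is MySQL, but sqlite3 keeps the example self-contained and runnable; the table and column names are illustrative.

```python
# Sketch of a primary key / foreign key relationship (illustrative names).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- uniquely identifies each customer
        name        TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,      -- foreign key linking back to customer
        description TEXT,
        FOREIGN KEY (customer_id) REFERENCES customer (customer_id)
    )
""")

conn.execute("INSERT INTO customer VALUES (1, 'Avery')")
conn.execute("INSERT INTO orders VALUES (100, 1, 'Hardcover novel')")

# Retrieve a customer's details alongside their order via the relationship
for row in conn.execute("""
    SELECT c.name, o.description
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
"""):
    print(row)  # ('Avery', 'Hardcover novel')
```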
While relational databases, which organize data into tables with relationships, are common, there are other types of databases. An object-oriented database stores data in the form of objects instead of tables or relations. An example could be an online bookstore where authors, customers, books, and publishers are rendered as classes, and the individual entries are objects or instances of these classes.
To work with data in databases, database engineers use Structured Query Language (SQL). SQL is a standard language that can be used with all relational databases like MySQL, PostgreSQL, Oracle, and Microsoft SQL Server. Database engineers establish interactions with databases to create, read, update, and delete (CRUD) data.
SQL can be divided into several sub-languages:
Data Definition Language (DDL) helps define data in the database and includes commands like CREATE (to create databases and tables), ALTER (to modify database objects), and DROP (to remove objects).
Data Manipulation Language (DML) is used to manipulate data and includes operations like INSERT (to add data), UPDATE (to modify data), and DELETE (to remove data).
Data Query Language (DQL) is used to read or retrieve data, primarily using the SELECT command.
Data Control Language (DCL) is used to control access to the database, with commands like GRANT and REVOKE to manage user privileges.
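The sketch below exercises each sub-language in turn, again using sqlite3 for a self-contained run. Note that DCL commands such as GRANT and REVOKE apply to server databases like MySQL; SQLite has no user accounts, so they appear only in a comment.

```python
# Walking through DDL, DML, and DQL with sqlite3 (DCL noted in comments).
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define structures
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")  # modify the structure

# DML: manipulate data
conn.execute("INSERT INTO student (id, name, email) VALUES (1, 'Lina', 'lina@example.com')")
conn.execute("UPDATE student SET email = 'lina@uni.example' WHERE id = 1")
conn.execute("DELETE FROM student WHERE id = 99")  # removes matching rows (none here)

# DQL: read data
print(conn.execute("SELECT name, email FROM student").fetchall())

# DDL again: remove the object entirely
conn.execute("DROP TABLE student")

# DCL (server databases only): GRANT SELECT ON db.table TO 'user'; REVOKE ...
```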
SQL offers several advantages:
It requires very little coding skill to use, consisting mainly of keywords.
Its interactivity allows developers to write complex queries quickly.
It is a standard language usable with all relational databases, leading to extensive support and information availability.
It is portable across operating systems.
Before developing a database, planning the organization of data is crucial, and this plan is called a schema. A schema is an organization or grouping of information and the relationships among them. In MySQL, schema and database are often interchangeable terms, referring to how data is organized. However, the definition of schema can vary across different database systems. A database schema typically comprises tables, columns, relationships, data types, and keys. Schemas provide logical groupings for database objects, simplify access and manipulation, and enhance database security by allowing permission management based on user access rights.
Database normalization is an important process used to structure tables in a way that minimizes challenges by reducing data duplication and avoiding data inconsistencies (anomalies). This involves converting a large table into multiple tables to reduce data redundancy. There are different normal forms (1NF, 2NF, 3NF) that define rules for table structure to achieve better database design.
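A hedged illustration of the idea: the flat table below repeats the customer’s details on every order row, and splitting it into two related tables removes that duplication. The data and names are invented for the example.

```python
# Sketch of normalization: splitting a wide table to remove redundancy.
import sqlite3

conn = sqlite3.connect(":memory:")

# Un-normalized: customer name and city are duplicated on every order
conn.execute("""CREATE TABLE orders_flat (
    order_id INTEGER, customer_name TEXT, customer_city TEXT, item TEXT)""")
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?, ?)",
    [(1, "Avery", "Leeds", "Ring"), (2, "Avery", "Leeds", "Bracelet")],
)

# Normalized: customer details live in one place, referenced by key
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY, customer_id INTEGER REFERENCES customers, item TEXT)""")
conn.execute("INSERT INTO customers VALUES (1, 'Avery', 'Leeds')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, 1, "Ring"), (2, 1, "Bracelet")])

# Updating the city now touches exactly one row instead of every order,
# which is how normalization avoids update anomalies.
conn.execute("UPDATE customers SET city = 'York' WHERE customer_id = 1")
```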
As databases have evolved, they now must be able to store ever-increasing amounts of unstructured data, which poses difficulties. This growth has also led to concepts like big data and cloud databases.
Furthermore, databases play a crucial role in data warehousing, which involves a centralized data repository that loads, integrates, stores, and processes large amounts of data from multiple sources for data analysis. Dimensional data modeling, based on dimensions and facts, is often used to build databases in a data warehouse for data analytics. Databases also support data analytics, where collected data is converted into useful information to inform future decisions.
Tools like MySQL Workbench provide a unified visual environment for database modeling and management, supporting the creation of data models, forward and reverse engineering of databases, and SQL development.
Finally, interacting with databases can also be done through programming languages like Python using connectors or APIs (Application Programming Interfaces). This allows developers to build applications that interact with databases for various operations.
Understanding SQL: Language for Database Interaction
SQL (Structured Query Language) is a standard language used to interact with databases; it’s commonly pronounced “sequel”. Database engineers use SQL to establish interactions with databases.
Here’s a breakdown of SQL based on the provided source:
Role of SQL: SQL acts as the interface or bridge between a relational database and its users. It allows database engineers to create, read, update, and delete (CRUD) data. These operations are fundamental when working with a database.
Interaction with Databases: As a web developer or data engineer, you execute SQL instructions on a database using a Database Management System (DBMS). The DBMS is responsible for transforming SQL instructions into a form that the underlying database understands.
Applicability: SQL is particularly useful when working with relational databases, which require a language that can interact with structured data. Examples of relational databases that SQL can interact with include MySQL, PostgreSQL, Oracle, and Microsoft SQL Server.
SQL Sub-languages: SQL is divided into several sub-languages:
Data Definition Language (DDL): Helps you define data in your database. DDL commands include:
CREATE: Used to create databases and related objects like tables. For example, you can use the CREATE DATABASE command followed by the database name to create a new database. Similarly, CREATE TABLE followed by the table name and column definitions is used to create tables.
ALTER: Used to modify already created database objects, such as modifying the structure of a table by adding or removing columns (ALTER TABLE).
DROP: Used to remove objects like tables or entire databases. The DROP DATABASE command followed by the database name removes a database, and an ALTER TABLE … DROP COLUMN clause removes a specific column from a table.
Data Manipulation Language (DML): Commands are used to manipulate data in the database and most CRUD operations fall under DML. DML commands include:
INSERT: Used to add or insert data into a table. The INSERT INTO syntax is used to add rows of data to a specified table.
UPDATE: Used to edit or modify existing data in a table. The UPDATE command allows you to specify data to be changed.
DELETE: Used to remove data from a table. The DELETE FROM syntax followed by the table name and an optional WHERE clause is used to remove data.
Data Query Language (DQL): Used to read or retrieve data from the database. The primary DQL command is:
SELECT: Used to select and retrieve data from one or multiple tables, allowing you to specify the columns you want and apply filter criteria using the WHERE clause. You can select all columns using SELECT *.
Data Control Language (DCL): Used to control access to the database. DCL commands include:
GRANT: Used to give users access privileges to data.
REVOKE: Used to revert access privileges already given to users.
Advantages of SQL: SQL is a popular language choice for databases due to several advantages:
Low coding skills required: It uses a set of keywords and requires very little coding.
Interactivity: Allows developers to write complex queries quickly.
Standard language: Can be used with all relational databases like MySQL, leading to extensive support and information availability.
Portability: Once written, SQL code can be used on any hardware and any operating system or platform where the database software is installed.
Comprehensive: Covers all areas of database management and administration, including creating databases, manipulating data, retrieving data, and managing security.
Efficiency: Allows database users to process large amounts of data quickly and efficiently.
Basic SQL Operations: SQL enables various operations on data, including the following (a combined example appears after this list):
Creating databases and tables using DDL.
Populating and modifying data using DML (INSERT, UPDATE, DELETE).
Reading and querying data using DQL (SELECT) with options to specify columns and filter data using the WHERE clause.
Sorting data using the ORDER BY clause with ASC (ascending) or DESC (descending) keywords.
Filtering data using the WHERE clause with various comparison operators (=, <, >, <=, >=, !=) and logical operators (AND, OR). Other filtering operators include BETWEEN, LIKE, and IN.
Removing duplicate rows using the SELECT DISTINCT clause.
Performing arithmetic operations using operators like +, -, *, /, and % (modulus) within SELECT statements.
Using comparison operators to compare values in WHERE clauses.
Utilizing aggregate functions (not detailed in this initial overview, but covered later in conjunction with GROUP BY).
Joining data from multiple tables (mentioned as necessary when data exists in separate entities). The source later details INNER JOIN, LEFT JOIN, and RIGHT JOIN clauses.
Creating aliases for tables and columns to make queries simpler and more readable.
Using subqueries (a query within another query) for more complex data retrieval.
Creating views (virtual tables based on the result of a SQL statement) to simplify data access and combine data from multiple tables.
Using stored procedures (pre-prepared SQL code that can be saved and executed).
Working with functions (numeric, string, date, comparison, control flow) to process and manipulate data.
Implementing triggers (stored programs that automatically execute in response to certain events).
Managing database transactions to ensure data integrity.
Optimizing queries for better performance.
Performing data analysis using SQL queries.
Interacting with databases using programming languages like Python through connectors and APIs.
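To tie several of these clauses together, here is a compact, self-contained sketch; sqlite3 again stands in for a server database, and the sales data is invented.

```python
# WHERE filters (BETWEEN, LIKE), arithmetic in SELECT, ORDER BY,
# DISTINCT with an alias, and an aggregate with GROUP BY.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    (1, "North", 120.0), (2, "South", 75.5), (3, "North", 240.0), (4, "East", 60.0),
])

# Filtering, arithmetic, and sorting
rows = conn.execute("""
    SELECT id, amount, amount * 1.1 AS with_fee
    FROM sales
    WHERE amount BETWEEN 70 AND 250 AND region LIKE 'N%'
    ORDER BY amount DESC
""").fetchall()

# De-duplication and aliasing
regions = conn.execute("SELECT DISTINCT region AS r FROM sales").fetchall()

# Aggregation per group
totals = conn.execute("""
    SELECT region, SUM(amount) AS total, COUNT(*) AS n
    FROM sales GROUP BY region
""").fetchall()

print(rows, regions, totals, sep="\n")
```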
In essence, SQL is a powerful and versatile language that is fundamental for anyone working with relational databases, enabling them to define, manage, query, and manipulate data effectively. The knowledge of SQL is a valuable skill for database engineers and is crucial for various tasks, from building and maintaining databases to extracting insights through data analysis.
Data Modeling Principles: Schema, Types, and Design
Data modeling principles revolve around creating a blueprint of how data will be organized and structured within a database system. This plan, often referred to as a schema, is essential for efficient data storage, access, updates, and querying. A well-designed data model ensures data consistency and quality.
Here are some key data modeling principles discussed in the sources:
Understanding Data Requirements: Before creating a database, it’s crucial to have a clear idea of its purpose and the data it needs to store. For example, a database for an online bookshop needs to record book titles, authors, customers, and sales. Mangata and Gallo (M&G), a jewelry store, needed to store data on customers, products, and orders.
Visual Representation: A data model provides a visual representation of data elements (entities) and their relationships. This is often achieved using an Entity Relationship Diagram (ERD), which helps in planning entity-relational databases.
Different Levels of Abstraction: Data modeling occurs at different levels:
Conceptual Data Model: Provides a high-level, abstract view of the entities and their relationships in the database system. It focuses on “what” data needs to be stored (e.g., customers, products, and orders as entities for M&G) and how these relate.
Logical Data Model: Builds upon the conceptual model by providing a more detailed overview of the entities, their attributes, primary keys, and foreign keys. For M&G, this would involve defining attributes for customers (like client ID as primary key), products, and orders, and specifying foreign keys to establish relationships (e.g., client ID in the orders table referencing the clients table).
Physical Data Model: Represents the internal schema of the database and is specific to the chosen Database Management System (DBMS). It outlines details like data types for each attribute (e.g., varchar for full name, integer for contact number), constraints (e.g., not null), and other database-specific features. SQL is often used to create the physical schema.
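As a rough illustration of the physical level, the DDL below writes out a plausible schema for the jewelry-store example. The exact column names and types are assumptions for illustration, not the course’s actual design, and sqlite3 stands in for the target DBMS.

```python
# The logical model's entities, keys, and data types written out as DDL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE clients (
        client_id      INTEGER PRIMARY KEY,   -- primary key from the logical model
        full_name      VARCHAR(100) NOT NULL, -- data type plus a NOT NULL constraint
        contact_number INTEGER
    );
    CREATE TABLE orders (
        order_id  INTEGER PRIMARY KEY,
        client_id INTEGER NOT NULL,
        total     DECIMAL(8, 2),
        FOREIGN KEY (client_id) REFERENCES clients (client_id)
    );
""")
```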
Choosing the Right Data Model Type: Several types of data models exist, each with its own advantages and disadvantages:
Relational Data Model: Represents data as a collection of tables (relations) with rows and columns, known for its simplicity.
Entity-Relationship Model: Similar to the relational model but presents each table as a separate entity with attributes and explicitly defines different types of relationships between entities (one-to-one, one-to-many, many-to-many).
Hierarchical Data Model: Organizes data in a tree-like structure with parent and child nodes, primarily supporting one-to-many relationships.
Object-Oriented Model: Translates objects into classes with characteristics and behaviors, supporting complex associations like aggregation and inheritance, suitable for complex projects.
Dimensional Data Model: Based on dimensions (context of measurements) and facts (quantifiable data), optimized for faster data retrieval and efficient data analytics, often using star and snowflake schemas in data warehouses.
Database Normalization: This is a crucial process for structuring tables to minimize data redundancy, avoid data modification implications (insertion, update, deletion anomalies), and simplify data queries. Normalization involves applying a series of normal forms (First Normal Form – 1NF, Second Normal Form – 2NF, Third Normal Form – 3NF) to ensure data atomicity, eliminate repeating groups, address functional and partial dependencies, and resolve transitive dependencies.
Establishing Relationships: Data in a database should be related to provide meaningful information. Relationships between tables are established using keys:
Primary Key: A value that uniquely identifies each record in a table and prevents duplicates.
Foreign Key: One or more columns in one table that reference the primary key in another table, used to connect tables and create cross-referencing.
Defining Domains: A domain is the set of legal values that can be assigned to an attribute, ensuring data in a field is well-defined (e.g., only numbers in a numerical domain). This involves specifying data types, length values, and other relevant rules.
Using Constraints: Database constraints limit the type of data that can be stored in a table, ensuring data accuracy and reliability. Common constraints include NOT NULL (ensuring fields are always completed), UNIQUE (preventing duplicate values), CHECK (enforcing specific conditions), and FOREIGN KEY (maintaining referential integrity).
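A short sketch of constraints doing their job: each bad row below violates one rule and is rejected with an IntegrityError. Names and data are invented; sqlite3 is used for a self-contained run.

```python
# NOT NULL, UNIQUE, and CHECK constraints rejecting invalid rows.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE staff (
        staff_id INTEGER PRIMARY KEY,
        email    TEXT NOT NULL UNIQUE,       -- must be present and unique
        age      INTEGER CHECK (age >= 18)   -- enforce a business rule
    )
""")
conn.execute("INSERT INTO staff VALUES (1, 'a@example.com', 30)")

for bad_row in [(2, None, 25),              # violates NOT NULL
                (3, 'a@example.com', 40),   # violates UNIQUE
                (4, 'b@example.com', 12)]:  # violates CHECK
    try:
        conn.execute("INSERT INTO staff VALUES (?, ?, ?)", bad_row)
    except sqlite3.IntegrityError as exc:
        print(f"Rejected {bad_row}: {exc}")
```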
Importance of Planning: Designing a data model before building the database system allows for planning how data is stored and accessed efficiently. A poorly designed database can make it hard to produce accurate information.
Considerations at Scale: For large-scale applications like those at Meta, data modeling must prioritize user privacy, user safety, and scalability. It requires careful consideration of data access, encryption, and the ability to handle billions of users and evolving product needs. Thoughtfulness about future changes and the impact of modifications on existing data models is crucial.
Data Integrity and Quality: Well-designed data models, including the use of data types and constraints, are fundamental steps in ensuring the integrity and quality of a database.
Data modeling is an iterative process that requires a deep understanding of the data, the business requirements, and the capabilities of the chosen database system. It is a crucial skill for database engineers and a fundamental aspect of database design. Tools like MySQL Workbench can aid in creating, visualizing, and implementing data models.
Understanding Version Control: Git and Collaborative Development
Version Control Systems (VCS), also known as Source Control or Source Code Management, are systems that record all changes and modifications to files for tracking purposes. The primary goal of any VCS is to keep track of changes by allowing developers access to the entire change history with the ability to revert or roll back to a previous state or point in time. These systems track different types of changes such as adding new files, modifying or updating files, and deleting files. The version control system is the source of truth across all code assets and the team itself.
There are many benefits associated with Version Control, especially for developers working in a team. These include:
Revision history: Provides a record of all changes in a project and the ability for developers to revert to a stable point in time if code edits cause issues or bugs.
Identity: All changes made are recorded with the identity of the user who made them, allowing teams to see not only when changes occurred but also who made them.
Collaboration: A VCS allows teams to submit their code and keep track of any changes that need to be made when working towards a common goal. It also facilitates peer review where developers inspect code and provide feedback.
Automation and efficiency: Version Control helps keep track of all changes and plays an integral role in DevOps, increasing an organization’s ability to deliver applications or services with high quality and velocity. It aids in software quality, release, and deployments. By having Version Control in place, teams following agile methodologies can manage their tasks more efficiently.
Managing conflicts: Version Control helps developers fix any conflicts that may occur when multiple developers work on the same code base. The history of revisions can aid in seeing the full life cycle of changes and is essential for merging conflicts.
There are two main types or categories of Version Control Systems: centralized Version Control Systems (CVCS) and distributed Version Control Systems (DVCS).
Centralized Version Control Systems (CVCS) contain a server that houses the full history of the code base and clients that pull down the code. Developers need a connection to the server to perform any operations, and changes are pushed to the central server. CVCS are considered easier to learn and offer more access control over users; a disadvantage is that they can be slower, because every operation needs a server connection.
Distributed Version Control Systems (DVCS) are similar, but every user is essentially a server and has the entire history of changes on their local system. Users don’t need to be connected to the server to add changes or view history, only to pull down the latest changes or push their own. DVCS offer better speed and performance and allow users to work offline. Git is an example of a DVCS.
Popular Version Control Technologies include git and GitHub. Git is a Version Control System designed to help users keep track of changes to files within their projects. It offers better speed and performance, reliability, free and open-source access, and an accessible syntax. Git is used predominantly via the command line. GitHub is a cloud-based hosting service that lets you manage git repositories from a user interface. It incorporates Git Version Control features and extends them with features like Access Control, pull requests, and automation. GitHub is very popular among web developers and acts like a social network for projects.
Key Git concepts include:
Repository: Used to track all changes to files in a specific folder and keep a history of all those changes. Repositories can be local (on your machine) or remote (e.g., on GitHub).
Clone: To copy a project from a remote repository to your local device.
Add: To stage changes in your local repository, preparing them for a commit.
Commit: To save a snapshot of the staged changes in the local repository’s history. Each commit is recorded with the identity of the user.
Push: To upload committed changes from your local repository to a remote repository.
Pull: To retrieve changes from a remote repository and apply them to your local repository.
Branching: Creating separate lines of development from the main codebase to work on new features or bug fixes in isolation. The main branch is often the source of truth.
Forking: Creating a copy of someone else’s repository on a platform like GitHub, allowing you to make changes without affecting the original.
Diff: A command to compare changes across files, branches, and commits.
Blame: A command to look at changes of a specific file and show the dates, times, and users who made the changes.
The typical Git workflow involves three states: modified, staged, and committed. Files are modified in the working directory, then added to the staging area, and finally committed to the local repository. These local commits are then pushed to a remote repository.
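The three states can be walked through mechanically. The sketch below drives the modified → staged → committed flow from Python via subprocess in a throwaway directory; it assumes the git CLI is installed and on the PATH.

```python
# Minimal modified -> staged -> committed walk-through, scripted from Python.
import os
import subprocess
import tempfile

def git(*args, cwd):
    # Run one git command inside the given repository directory
    subprocess.run(["git", *args], cwd=cwd, check=True)

repo = tempfile.mkdtemp()                                 # throwaway working directory
git("init", cwd=repo)
git("config", "user.email", "dev@example.com", cwd=repo)  # identity recorded per commit
git("config", "user.name", "Dev", cwd=repo)

# modified: a change appears in the working directory
with open(os.path.join(repo, "app.py"), "w") as f:
    f.write("print('hello')\n")

git("add", "app.py", cwd=repo)               # staged
git("commit", "-m", "Add app.py", cwd=repo)  # committed to the local history
git("log", "--oneline", cwd=repo)            # inspect the revision history
```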
Branching workflows like feature branching are commonly used. This involves creating a new branch for each feature, working on it until completion, and then merging it back into the main branch after a pull request and peer review. Pull requests allow teams to review changes before they are merged.
At Meta, Version Control is very important. They use a giant monolithic repository for all of their backend code, which means code changes are shared with every other Instagram team. While this can be risky, it allows for code reuse. Meta encourages engineers to improve any code, emphasizing that “nothing at meta is someone else’s problem”. Due to the monolithic repository, merge conflicts happen a lot, so they try to write smaller changes and add gatekeepers to easily turn off features if needed. git blame is used daily to understand who wrote specific lines of code and why, which is particularly helpful in a large organization like Meta.
Version Control is also relevant to database development. It’s easy to overcomplicate data modeling and storage, and Version Control can help track changes and potentially revert to earlier designs. Planning how data will be organized (schema) is crucial before developing a database.
Learning to use git and GitHub for Version Control is part of the preparation for coding interviews in a final course, alongside practicing interview skills and refining resumes. Effective collaboration, which is enhanced by Version Control, is a crucial skill for software developers.
Python Programming Fundamentals: An Introduction
Based on the sources, here’s a discussion of Python programming basics:
Introduction to Python:
Python is a versatile, high-level programming language available on multiple platforms. Created by Guido van Rossum and released in 1991, it was designed to be readable, with similarities to English and mathematics, making it intuitive for beginners while remaining powerful and adaptable for experienced programmers. Since its release it has gained significant popularity and a rich selection of frameworks and libraries, and it is widely used in areas such as web development, artificial intelligence, machine learning, data analytics, and business forecasting. Python often requires less code than languages like C or Java; this simplicity lets developers focus on the task at hand, potentially making it quicker to get a product to market.
Setting up a Python Environment:
To start using Python, it’s essential to ensure it works correctly on your operating system with your chosen Integrated Development Environment (IDE), such as Visual Studio Code (VS Code). This involves making sure the right version of Python is used as the interpreter when running your code.
Installation Verification: You can verify if Python is installed by opening the terminal (or command prompt on Windows) and typing python --version. This should display the installed Python version.
VS Code Setup: VS Code offers a walkthrough guide for setting up Python. This includes installing Python (if needed) and selecting the correct Python interpreter.
Running Python Code: Python code can be run in a few ways:
Python Shell: Useful for running and testing small scripts without creating .py files. You can access it by typing python in the terminal.
Directly from Command Line/Terminal: Any file with the .py extension can be run by typing python followed by the file name (e.g., python hello.py).
Within an IDE (like VS Code): IDEs provide features like auto-completion, debugging, and syntax highlighting, making coding a better experience. VS Code has a run button to execute Python files.
Basic Syntax and Concepts:
Print Statement: The print() function is used to display output to the console. It can print different types of data and allows for formatting.
Variables: Variables are used to store data that can be changed throughout the program’s lifecycle. In Python, you declare a variable by assigning a value to a name (e.g., x = 5). Python automatically assigns the data type behind the scenes. There are conventions for naming variables; PEP 8, Python’s style guide, recommends snake case (e.g., my_name). You can declare multiple variables and assign them a single value (e.g., a = b = c = 10) or perform multiple assignments on one line (e.g., name, age = “Alice”, 30). You can also delete a variable using the del keyword.
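The behaviors described above, condensed into a runnable snippet:

```python
# Variable assignment, rebinding, multiple assignment, and deletion.
x = 5                      # Python infers the type (int) automatically
x = "hello"                # the same name can later hold a different type
a = b = c = 10             # several names bound to one value
name, age = "Alice", 30    # multiple assignment on one line
print(x, a, b, c, name, age)

del x                      # remove the name entirely
# print(x)  # would now raise NameError: name 'x' is not defined
```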
Data Types: A data type indicates how a computer system should interpret a piece of data. Python offers several built-in data types:
Numeric: Includes int (integers), float (decimal numbers), and complex numbers.
Sequence: Ordered collections of items, including:
Strings (str): Sequences of characters enclosed in single or double quotes (e.g., “hello”, ‘world’). Individual characters in a string can be accessed by their index (starting from 0) using square brackets (e.g., name[0]). The len() function returns the number of characters in a string.
Lists: Ordered and mutable sequences of items enclosed in square brackets (e.g., [1, 2, “three”]).
Tuples: Ordered and immutable sequences of items enclosed in parentheses (e.g., (1, 2, “three”)).
Dictionary (dict): Unordered collections of key-value pairs enclosed in curly braces (e.g., {“name”: “Bob”, “age”: 25}). Values are accessed using their keys.
Boolean (bool): Represents truth values: True or False.
Set (set): Unordered collections of unique elements enclosed in curly braces (e.g., {1, 2, 3}). Sets do not support indexing.
Typecasting: The process of converting one data type to another. Python supports implicit (automatic) and explicit (using functions like int(), float(), str()) type conversion.
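A quick tour of these built-in types and explicit typecasting:

```python
# One value of each built-in type, then explicit conversions.
n = 42                                # int
pi = 3.14                             # float
name = "hello"                        # str; name[0] -> 'h', len(name) -> 5
items = [1, 2, "three"]               # list (ordered, mutable)
point = (1, 2)                        # tuple (ordered, immutable)
person = {"name": "Bob", "age": 25}   # dict: person["age"] -> 25
flags = {1, 2, 3}                     # set: unique elements, no indexing
ok = True                             # bool

# Explicit typecasting with int(), float(), and str()
print(int("7") + 1)                   # 8   (string -> int)
print(float(n) / 4)                   # 10.5 (int -> float)
print("age: " + str(person["age"]))   # int -> str for concatenation
```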
Input: The input() function is used to take input from the user. It displays a prompt to the user and returns their input as a string.
Operators: Symbols used to perform operations on values.
Math Operators: Used for calculations (e.g., + for addition, - for subtraction, * for multiplication, / for division).
Logical Operators: Used in conditional statements to determine true or false outcomes (and, or, not).
Control Flow: Determines the order in which instructions in a program are executed.
Conditional Statements: Used to make decisions based on conditions (if, else, elif).
Loops: Used to repeatedly execute a block of code. Python has for loops (for iterating over sequences) and while loops (repeating a block until a condition is met). Nested loops are also possible.
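The constructs above in miniature:

```python
# Conditionals plus both loop styles.
score = 72
if score >= 90:
    print("distinction")
elif score >= 50:
    print("pass")
else:
    print("fail")

for letter in "loop":        # for loop iterates over any sequence
    print(letter)

count = 0
while count < 3:             # while loop repeats until its condition fails
    print("count is", count)
    count += 1
```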
Functions: Modular pieces of reusable code that take input and return output. You define a function using the def keyword. You can pass data into a function as arguments and return data using the return keyword. Python has different scopes for variables: local, enclosing, global, and built-in (LEGB rule).
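A small example of a function definition, a default parameter, a return value, and the local-versus-global distinction from the LEGB rule:

```python
# A function with a parameter, a default, and a local variable.
total = 0                        # global scope

def add_tip(bill, percent=10):
    """Return the bill plus a percentage tip."""
    tip = bill * percent / 100   # 'tip' is local to this function
    return bill + tip

print(add_tip(40))      # 44.0 (uses the default percent)
print(add_tip(40, 20))  # 48.0
print(total)            # the global name is untouched by the calls
```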
Data Structures: Ways to organize and store data. Python includes lists, tuples, sets, and dictionaries.
This overview provides a foundation in Python programming basics as described in the provided sources. As you continue learning, you will delve deeper into these concepts and explore more advanced topics.
Database and Python Fundamentals Study Guide
Quiz
What is a database, and what is its typical organizational structure? A database is a systematically organized collection of data. This organization commonly resembles a spreadsheet or a table, with data containing elements and attributes for identification.
Explain the role of a Database Management System (DBMS) in the context of SQL. A DBMS acts as an intermediary between SQL instructions and the underlying database. It takes responsibility for transforming SQL commands into a format that the database can understand and execute.
Name and briefly define at least three sub-languages of SQL. DDL (Data Definition Language) is used to define data structures in a database, such as creating, altering, and dropping databases and tables. DML (Data Manipulation Language) is used for operational tasks like creating, reading, updating, and deleting data. DQL (Data Query Language) is used for retrieving data from the database.
Describe the purpose of the CREATE DATABASE and CREATE TABLE DDL statements. The CREATE DATABASE statement is used to create a new, empty database within the DBMS. The CREATE TABLE statement is used within a specific database to define a new table, including specifying the names and data types of its columns.
What is the function of the INSERT INTO DML statement? The INSERT INTO statement is used to add new rows of data into an existing table in the database. It requires specifying the table name and the values to be inserted into the table’s columns.
Explain the purpose of the NOT NULL constraint when defining table columns. The NOT NULL constraint ensures that a specific column in a table cannot contain a null value. If an attempt is made to insert a new record or update an existing one with a null value in a NOT NULL column, the operation will be aborted.
List and briefly define three basic arithmetic operators in SQL. The addition operator (+) is used to add two operands. The subtraction operator (-) is used to subtract the second operand from the first. The multiplication operator (*) is used to multiply two operands.
What is the primary function of the SELECT statement in SQL, and how can the WHERE clause be used with it? The SELECT statement is used to retrieve data from one or more tables in a database. The WHERE clause is used to filter the rows returned by the SELECT statement based on specified conditions.
Explain the difference between running Python code from the Python shell and running a .py file from the command line. The Python shell provides an interactive environment where you can execute Python code snippets directly and see immediate results without saving to a file. Running a .py file from the command line executes the entire script contained within the file non-interactively.
Define a variable in Python and provide an example of assigning it a value. In Python, a variable is a named storage location that holds a value. Variables are implicitly declared when a value is assigned to them. For example: x = 5 declares a variable named x and assigns it the integer value of 5.
Answer Key
A database is a systematically organized collection of data. This organization commonly resembles a spreadsheet or a table, with data containing elements and attributes for identification.
A DBMS acts as an intermediary between SQL instructions and the underlying database. It takes responsibility for transforming SQL commands into a format that the database can understand and execute.
DDL (Data Definition Language) helps you define data structures. DML (Data Manipulation Language) allows you to work with the data itself. DQL (Data Query Language) enables you to retrieve information from the database.
The CREATE DATABASE statement establishes a new database, while the CREATE TABLE statement defines the structure of a table within a database, including its columns and their data types.
The INSERT INTO statement adds new rows of data into a specified table. It requires indicating the table and the values to be placed into the respective columns.
The NOT NULL constraint enforces that a particular column must always have a value and cannot be left empty or contain a null entry when data is added or modified.
The + operator performs addition, the - operator performs subtraction, and the * operator performs multiplication between numerical values in SQL queries.
The SELECT statement retrieves data from database tables. The WHERE clause filters the results of a SELECT query, allowing you to specify conditions that rows must meet to be included in the output.
The Python shell is an interactive interpreter for immediate code execution, while running a .py file executes the entire script from the command line without direct interaction during the process.
A variable in Python is a name used to refer to a memory location that stores a value; for instance, name = “Alice” assigns the string value “Alice” to the variable named name.
Essay Format Questions
Discuss the significance of SQL as a standard language for database management. In your discussion, elaborate on at least three advantages of using SQL as highlighted in the provided text and provide examples of how these advantages contribute to efficient database operations.
Compare and contrast the roles of Data Definition Language (DDL) and Data Manipulation Language (DML) in SQL. Explain how these two sub-languages work together to enable the creation and management of data within a relational database system.
Explain the concept of scope in Python and discuss the LEGB rule. Provide examples to illustrate the differences between local, enclosed, global, and built-in scopes and explain how Python resolves variable names based on this rule.
Discuss the importance of modules in Python programming. Explain the advantages of using modules, such as reusability and organization, and describe different ways to import modules, including the use of import, from … import …, and aliases.
Imagine you are designing a simple database for a small online bookstore. Describe the tables you would create, the columns each table would have (including data types and any necessary constraints like NOT NULL or primary keys), and provide example SQL CREATE TABLE statements for two of your proposed tables.
Glossary of Key Terms
Database: A systematically organized collection of data that can be easily accessed, managed, and updated.
Table: A structure within a database used to organize data into rows (records) and columns (fields or attributes).
Column (Field): A vertical set of data values of a particular type within a table, representing an attribute of the entities stored in the table.
Row (Record): A horizontal set of data values within a table, representing a single instance of the entity being described.
SQL (Structured Query Language): A standard programming language used for managing and manipulating data in relational databases.
DBMS (Database Management System): Software that enables users to interact with a database, providing functionalities such as data storage, retrieval, and security.
DDL (Data Definition Language): A subset of SQL commands used to define the structure of a database, including creating, altering, and dropping databases, tables, and other database objects.
DML (Data Manipulation Language): A subset of SQL commands used to manipulate data within a database, including inserting, updating, deleting, and retrieving data.
DQL (Data Query Language): A subset of SQL commands, primarily the SELECT statement, used to query and retrieve data from a database.
Constraint: A rule or restriction applied to data in a database to ensure its accuracy, integrity, and reliability. Examples include NOT NULL.
Operator: A symbol or keyword that performs an operation on one or more operands. In SQL, this includes arithmetic operators (+, -, *, /), logical operators (AND, OR, NOT), and comparison operators (=, >, <, etc.).
Schema: The logical structure of a database, including the organization of tables, columns, relationships, and constraints.
Python Shell: An interactive command-line interpreter for Python, allowing users to execute code snippets and receive immediate feedback.
.py file: A file containing Python source code, which can be executed as a script from the command line.
Variable (Python): A named reference to a value stored in memory. Variables in Python are dynamically typed, meaning their data type is determined by the value assigned to them.
Data Type (Python): The classification of data that determines the possible values and operations that can be performed on it (e.g., integer, string, boolean).
String (Python): A sequence of characters enclosed in single or double quotes, used to represent text.
Scope (Python): The region of a program where a particular name (variable, function, etc.) is accessible. Python has four main scopes: local, enclosed, global, and built-in (LEGB).
Module (Python): A file containing Python definitions and statements. Modules provide a way to organize code into reusable units.
Import (Python): A statement used to load and make the code from another module available in the current script.
Alias (Python): An alternative name given to a module or function during import, often used for brevity or to avoid naming conflicts.
Briefing Document: Review of “01.pdf”
This briefing document summarizes the main themes and important concepts discussed in the provided excerpts from “01.pdf”. The document covers fundamental database concepts using SQL, basic command-line operations, an introduction to Python programming, and related software development tools.
I. Introduction to Databases and SQL
The document introduces the concept of databases as systematically organized data, often resembling spreadsheets or tables. It highlights the widespread use of databases in various applications, providing examples like banks storing account and transaction data, and hospitals managing patient, staff, and laboratory information.
“Well, a database looks like data organized systematically, and this organization typically looks like a spreadsheet or a table.”
The core purpose of SQL (Structured Query Language) is explained as a language used to interact with databases. Key operations that can be performed using SQL are outlined:
“…operational terms: create (add or insert data), read data, update existing data, and delete data.”
SQL is further divided into several sub-languages:
DDL (Data Definition Language): Used to define the structure of the database and its objects like tables. Commands like CREATE (to create databases and tables) and ALTER (to modify existing objects, e.g., adding a column) are part of DDL.
“DDL, as the name says, helps you define data in your database. But what does it mean to define data? Before you can store data in the database, you need to create the database and related objects, like tables, in which your data will be stored. For this, the DDL part of SQL has a command named CREATE. Then you might need to modify already created database objects; for example, you might need to modify the structure of a table by adding a new column. You can perform this task with the DDL ALTER command. You can remove an object like a table from a…”
DML (Data Manipulation Language): Used to manipulate the data within the database, including inserting (INSERT INTO), updating, and deleting data.
“Now we need to populate the table with data. This is where I can use the Data Manipulation Language, or DML, subset of SQL. To add table data I use the INSERT INTO syntax; this inserts rows of data into a given table. I just type INSERT INTO followed by the table name, and then a list of required columns or fields within a pair of parentheses; then I add the VALUES keyword.”
DQL (Data Query Language): Primarily used for querying or retrieving data from the database (SELECT statements fall under this category).
DCL (Data Control Language): Used to control access and security within the database.
The document emphasizes that a DBMS (Database Management System) is crucial for interpreting and executing SQL instructions, acting as an intermediary between the SQL commands and the underlying database.
“A database interprets and makes sense of SQL instructions with the use of a database management system, or DBMS. As a web developer, you’ll execute all SQL instructions on a database using a DBMS. The DBMS takes responsibility for transforming SQL instructions into a form that’s understood by the underlying database.”
The advantages of using SQL are highlighted, including its simplicity, standardization, portability, comprehensiveness, and efficiency in processing large amounts of data.
“You now know that SQL is a simple, standard, portable, comprehensive, and efficient language that can be used to delete data, retrieve and share data among multiple users, and manage database security. This is made possible through subsets of SQL like DDL, or Data Definition Language; DML, also known as Data Manipulation Language; DQL, or Data Query Language; and DCL, also known as Data Control Language. And the final advantage of SQL is that it lets database users process large amounts of data quickly and efficiently.”
Examples of basic SQL syntax are provided, such as creating a database (CREATE DATABASE College;) and creating a table (CREATE TABLE student ( … );). The INSERT INTO syntax for adding data to a table is also introduced.
Constraints like NOT NULL are mentioned as ways to enforce data integrity during table creation.
“…the creation of a new customer record is aborted. The NOT NULL default value is implemented using a SQL statement. A typical NOT NULL SQL statement begins with the creation of a basic table in the database: I can write a CREATE TABLE clause followed by customer to define the table name, followed by a pair of parentheses. Within the parentheses I add two columns, customer ID and customer name. I also define each column with relevant data types: INT for customer ID, as it stores…”
SQL arithmetic operators (+, -, *, /, %) are introduced with examples. Logical operators (NOT, OR) and special operators (IN, BETWEEN) used in the WHERE clause for filtering data are also explained. The concept of JOIN clauses, including SELF-JOIN, for combining data from tables is briefly touched upon.
Subqueries (inner queries within outer queries) and Views (virtual tables based on the result of a query) are presented as advanced SQL concepts. User-defined functions and triggers are also introduced as ways to extend database functionality and automate actions. Prepared statements are mentioned as a more efficient way to execute SQL queries repeatedly. Date and time functions in MySQL are briefly covered.
II. Introduction to Command Line/Bash Shell
The document provides a basic introduction to using the command line or bash shell. Fundamental commands are explained:
PWD (Print Working Directory): Shows the current directory.
“To do that, I run the pwd command. pwd is short for ‘print working directory’. I type pwd and press the Enter key. The command returns a forward slash, which indicates that I’m currently in the root directory.”
LS (List): Displays the contents of the current directory. The -l flag provides a detailed list format.
“If I want to check the contents of the root directory, I run another command called ls, which is short for ‘list’. I type ls and press the Enter key, and now notice I get a list of different names of directories within the root level. In order to get more detail on what each of the different directories represents, I can use something called a flag. Flags are used to set options on the commands you run. Use the list command with a flag called l, which means the output should be printed in a list format: I type ls -l, press Enter, and this returns the results in a list structure.”
CD (Change Directory): Navigates between directories using relative or absolute paths. cd .. moves up one directory.
“To step back into etc, type cd etc. To confirm that I’m back there, type pwd and Enter. If I want to use the other alternative, you can do an absolute path: type in cd / and press Enter. Then I type pwd and press Enter; you can verify that I am back at the root again. To step through multiple directories, use the same process: type cd etc and press Enter, and check the contents of the files by typing ls and pressing Enter.”
MKDIR (Make Directory): Creates a new directory.
“Now I will create a new directory called submissions. I do this by typing mkdir, which stands for ‘make directory’, and then the word submissions, the name of the directory I want to create, and then I hit the Enter key. I then type in ls -l for list, so that I can see the list structure, and now notice that a new directory called submissions has been created. I can then go into this…”
TOUCH: Creates a new empty file.
“…the parent directory. Next is the touch command, which makes a new file of whatever type you specify. For example, to build a brand-new file you can run touch followed by the new file’s name, for instance example.txt. Note that the newly created file will be empty.”
HISTORY: Shows a history of recently used commands.
“To view a history of the most recently typed commands, you can use the history command.”
File Redirection (>, >>, <): Allows redirecting the input or output of commands to files. > overwrites, >> appends.
“If you want to control where the output goes, you can use a redirection. How do we do that? Enter the ls command, enter -l to print it as a list. Instead of pressing Enter, add a greater-than sign: redirection. Now we have to tell it where we want the data to go; in this scenario I choose an output.txt file. The output.txt file has not been created yet, but it will be created based on the command I’ve set here with a redirection flag. Press Enter, type ls, then press Enter again to display the directory; the output file displays. To view the…”
GREP: Searches for patterns within files.
“grep stands for ‘global regular expression print’, and it’s used for searching across files and folders, as well as the contents of files. On my local machine, I enter the command ls -l and see that there’s a file called…”
CAT: Displays the content of a file.
LESS: Views file content page by page.
“Press the q key to exit the less environment. The other file is the bash profile file, so I can run the less command again, this time with .profile. This tends to be used more for environment variables; for example, I can use it for setting…”
VIM: A text editor used for creating and editing files.
“Now I will create a simple shell script. For this example I will use Vim, which is an editor that accepts input, so type vim and…”
CHMOD: Changes file permissions, including making a file executable (chmod +x filename).
“…but I want it to be executable, which requires that I have an x being set on it. In order to do that, I have to use another command, which is called chmod. After using this, [the file is] executable within the bash shell.”
The document also briefly mentions shell scripts (files containing a series of commands) and environment variables (dynamic named values that can affect the way running processes will behave on a computer).
III. Introduction to Git and GitHub
Git is introduced as a free, open-source distributed version control system used to manage source code history, track changes, revert to previous versions, and collaborate with other developers. Key Git commands mentioned include:
GIT CLONE: Used to create a local copy of a remote repository (e.g., from GitHub).
“To do this, I type the command git clone and paste the HTTPS URL I copied earlier. Finally, I press Enter on my keyboard. Notice that I receive a message stating…”
LS -LA: Lists all files in a directory, including hidden ones (like the .git directory which contains the Git repository metadata).
“…the ls -la command, another file is listed, which is just named .git. You will learn more about this later when you explore how to use this for source control.”
CD .git: Changes the current directory to the .git folder.
“First, open the .git folder: on your terminal, type cd .git and press Enter.”
CAT HEAD: Displays the reference to the current commit.
“Next, type cat HEAD and press Enter. In Git we only work on a single branch at a time. This file also exists inside the .git folder under the refs/heads path.”
CAT refs/heads/main: Displays the hash of the last commit on the main branch.
“Type cd .git and press Enter. Next, type cat refs/heads/main and press Enter. After you…”
GIT PULL: Fetches changes from a remote repository and integrates them into the local branch.
“I am now going to explain to you how to pull the repository to your local device.”
GitHub is described as a cloud-based hosting service for Git repositories, offering a user interface for managing Git projects and facilitating collaboration.
IV. Introduction to Python Programming
The document introduces Python as a versatile programming language and outlines different ways to run Python code:
Python Shell: An interactive environment for running and testing small code snippets without creating separate files.
“The Python shell is useful for running and testing small scripts. For example, it allows you to run code without the need for creating new .py files. You start by adding snippets of code that you can run directly in the shell.”
Running Python Files: Executing Python code stored in files with the .py extension using the python filename.py command.
“…running a Python file directly from the command line or terminal. Note that any file that has the file extension .py can be run by the following command: for example, type python, then a space, and then type the file…”
Basic Python concepts covered include:
Variables: Declaring and assigning values to variables (e.g., x = 5, name = “Alice”). Python automatically infers data types. Multiple variables can be assigned the same value (e.g., a = b = c = 10).
“All I have to do is name the variable. For example, if I type x = 5, I have declared a variable and assigned it a value. I can also print out the value of the variable by calling the print statement and passing in the variable name, which in this case is x: so I type print(x). When I run the program, I get the value of 5, which is the assignment I gave the initial variable. Let me clear my screen again. You have several options when it comes to declaring variables; you can declare any different type of variable in terms of value. For example, x could equal a string called ‘hello’. To do this, I type x = 'hello'. I can then print the value again, run it, and I find the output is the word hello. Behind the scenes, Python automatically assigns the data type for you.”
Data Types: Basic data types like integers, floats (decimal numbers), complex numbers, strings (sequences of characters enclosed in single or double quotes), lists, and tuples (ordered, immutable sequences) are introduced.
“…you’ll learn more about this in an upcoming video on data types. You can declare multiple variables and assign them to a single value as well, for example making a, b, and c all equal to 10. I do this by typing a = b = c = 10. I print all three… Sequence types are classed as container types that contain one or more of the same type in an ordered list; they can also be accessed based on their index in the sequence. Python has three different sequence types, namely strings, lists, and tuples. Let’s explore each of these briefly now, starting with strings. A string is a sequence of characters that is enclosed in either single or double quotes. Strings are represented by the string class, or str, for…”
Operators: Arithmetic operators (+, -, *, /, **, %, //) and logical operators (and, or, not) are explained with examples.
“…example, 7 multiplied by 4. Okay, now let’s explore logical operators. Logical operators are used in Python in conditional statements to determine a true or false outcome. Let’s explore some of these now. The first logical operator is named and; this operator checks for all conditions to be true, for example a > 5 and a < 10. The second logical operator is named or; this operator checks for at least one of the conditions to be true, for example a > 5 or b > 10. The final operator is named not; this…”
Conditional Statements: if, elif (else if), and else statements are introduced for controlling the flow of execution based on conditions.
“The logical operators are and, or, and not. Let’s cover the different combinations of each. In this example I declare two variables: a = True, and b also equals True. From these variables I use an if statement: I type if a and b: and on the next line I type print and, in parentheses, in double quotes…”
Loops: for loops (for iterating over sequences) and while loops are introduced with examples, including nested loops.
“Now let’s break apart the for loop and discover how it works. The variable item is a placeholder that will store the current letter in the sequence. You may also recall that you can access any character in the sequence by its index; the for loop is accessing it in the same way and assigning the current value to the item variable. This allows us to access the current character to print it for output. When the code is run, the outputs will be the letters of the word ‘looping’, each letter on its own line. Now that you know about looping constructs in Python, let me demonstrate how these work further using some code examples to output an array of tasty desserts. Python offers us multiple ways to do loops, or looping; you’ll now cover the for loop as well as the while loop. Let’s start with the basics of a simple for loop. To declare a for loop, I use the for keyword. I now need a variable to put the value into; in this case I am using i. I also use the in keyword to specify what I want to loop over. I add a new function called range to specify the number of items in a range; in this case I’m using 10 as an example. Next I do a simple print statement: by pressing the Enter key to move to a new line, I select the print function, and within the brackets I enter the name ‘looping’ and the value of i. Then I click on the Run button; the output indicates the iteration loops through the range of 0 to 9.”
Functions: Defining and calling functions using the def keyword. Functions can take arguments and return values. Examples of using *args (for variable positional arguments) and **kwargs (for variable keyword arguments) are provided.
“I now write a function to produce a string out of this information. I type def contents and then self in parentheses. On the next line, I write a print statement for the string: the, plus self dot dish, plus has, plus self dot items, plus and takes, plus self dot time, plus min to prepare. Here we’ll use the backslash character to force a new line and continue the string on the following line. For this to print correctly, I need to convert the self dot items and self dot time… Let’s say, for example, you wanted to calculate a total bill for a restaurant. A user got a cup of coffee that was 2.99, then they also got a cake that was 4.55, and also a juice for 2.99. The first thing I could do is change the for loop. Let’s change the argument to kwargs by”
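A minimal sketch of the restaurant-bill idea using *args and **kwargs (the function names add_all and total_bill are assumptions; the prices follow the quoted example):

    def add_all(*args):
        # args arrives as a tuple of positional arguments
        return sum(args)

    def total_bill(**kwargs):
        # kwargs arrives as a dict of item=price keyword arguments
        total = 0
        for item, price in kwargs.items():
            print(item, price)
            total += price
        return total

    print(add_all(1, 2, 3))                                # 6
    print(total_bill(coffee=2.99, cake=4.55, juice=2.99))  # about 10.53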
File Handling: Opening, reading (using read, readline, readlines), and writing to files. The importance of closing files is mentioned.
“The third method to read files in Python is readlines. Let me demonstrate this method. The readlines method reads the entire contents of the file and then returns it in an ordered list. This allows you to iterate over the list or pick out specific lines based on a condition. If, for example, you have a file with four lines of text and pass a length condition, the readlines method will return as output all the lines in your file in the correct order. Files are stored in directories, and they have”
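A sketch of readlines in practice (the file name notes.txt and its contents are placeholders):

    # Create a small placeholder file with four lines of text
    with open("notes.txt", "w") as f:
        f.write("tea\ncoffee\nlemonade\njuice\n")

    # The with statement closes the file for you when the block ends
    with open("notes.txt") as f:
        lines = f.readlines()     # every line, returned in an ordered list

    for line in lines:
        if len(line.strip()) > 4:     # pick out lines based on a condition
            print(line.strip())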
Recursion: The concept of a function calling itself is briefly illustrated.
“The else statement will recursively call the function, but with a modified string every time. On the next line, I add else and a colon. Then, on the next line, I type return string_reverse, open parenthesis, str, but before I close the parentheses I add a slice by typing an open square bracket, the number 1, and a colon, followed by”
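Reconstructing that walkthrough as a complete function (the name string_reverse is an assumption based on the narration):

    def string_reverse(s):
        # Base case: a string of length 0 or 1 is already reversed
        if len(s) <= 1:
            return s
        else:
            # Recursive case: reverse the slice after the first
            # character, then append that first character
            return string_reverse(s[1:]) + s[0]

    print(string_reverse("python"))   # nohtyp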
Object-Oriented Programming (OOP): Basic concepts of classes (using the class keyword), objects (instances of classes), attributes (data associated with an object), and methods (functions associated with an object, with self as the first parameter) are introduced. Inheritance (creating new classes based on existing ones) is also mentioned.
“method inside this class. I want this one to contain a new function called leave request, so I type def leave_request and then self and days as the variables in parentheses. The purpose of the leave_request function is to return a line that specifies the number of days requested. To write this, I type return the string may I take a leave for, plus str, open parenthesis, the word days, close parenthesis, plus another string, days. Now that I have all the classes in place, I’ll create a few instances from these classes: one for a supervisor and two others for…”
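A minimal sketch of these OOP ideas (leave_request follows the excerpt; the Employee and Supervisor class names and the name attribute are assumptions):

    class Employee:
        def __init__(self, name):
            self.name = name          # an attribute: data on the object

    class Supervisor(Employee):       # inheritance: Supervisor extends Employee
        def leave_request(self, days):
            # A method: a function attached to the class; self is the
            # instance it is called on
            return "May I take a leave for " + str(days) + " days"

    boss = Supervisor("Sam")          # an object: an instance of the class
    print(boss.name)                  # Sam
    print(boss.leave_request(3))      # May I take a leave for 3 days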
Modules: The concept of modules (reusable blocks of code in separate files) and how to import them using the import statement (e.g., import math, from math import sqrt, import math as m). The benefits of modular programming (scope, reusability, simplicity) are highlighted. The search path for modules (sys.path) is mentioned.
“So a file like sample.py can be a module named sample and can be imported. Modules in Python can contain both executable statements and functions, but before you explore how they are used, it’s important to understand their value, purpose, and advantages. Modules come from modular programming: the functionality of code is broken down into parts or blocks, and these parts or blocks have great advantages, which are scope, reusability, and simplicity. Let’s delve deeper into these. Everything in… To import and execute modules in Python, the first important thing to know is that modules are imported only once during execution. If, for example, you import a module that contains a print statement, you can verify it only executes the first time you import the module, even if the module is imported multiple times, since modules are built to be standalone… I will now import the built-in math module by typing import math. Just to make sure that this code works, I’ll use a print statement; I do this by typing print importing the math module. After this, I’ll run the code, and the print statement has executed. Most of the modules that you will come across, especially the built-in modules, will not have any print statements; they will simply be loaded by the interpreter. Now that I’ve imported the math module, I want to use a function inside of it. Let’s choose the square root function, sqrt. To do this, I type the words math dot sqrt. When I type the word math followed by the dot, a list of functions appears in a drop-down menu, and you can select sqrt from this list. I pass 9 as the argument to the math.sqrt function, assign this to a variable called root, and then I print it. The number three, the square root of nine, has been printed to the terminal, which is the correct answer. Instead of importing the entire math module as we did above, there is a better way to handle this: directly importing the square root function inside the scope of the project. This will prevent overloading the interpreter by importing the entire math module. To do this, I type from math import sqrt. When I run this, it displays an error. Now I remove the word math from the variable declaration and run the code again; this time it works. Next, let’s discuss something called an alias, which is an excellent way of importing different modules. Here I assign an alias called m to the math module. I do this by typing import math as m. Then I type cosine equals m dot”
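The three import styles from the walkthrough, side by side:

    import math                   # whole module: names need the math. prefix
    root = math.sqrt(9)
    print(root)                   # 3.0

    from math import sqrt         # import one name directly into scope
    print(sqrt(16))               # 4.0, no module prefix needed

    import math as m              # alias the module with a shorter name
    print(m.cos(0))               # 1.0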
Scope: The concepts of local, enclosed, global, and built-in scopes in Python (LEGB rule) and how variable names are resolved. Keywords global and nonlocal for modifying variable scope are mentioned.
“names of different attributes defined inside it. In this way, modules are a type of namespace. Namespaces and scopes can become very confusing very quickly, so it is important to get as much practice with scopes as possible to ensure a standard of quality. There are four main types of scope that can be defined in Python: local, enclosed, global, and built-in. The practice of determining in which scope a certain variable belongs is known as scope resolution, and scope resolution follows what is commonly known as the LEGB rule. Let’s explore these. Local: the first search for a variable is in the local scope. Enclosed: defined inside an enclosing or nested function. Global: defined at the uppermost level, or simply outside functions. Built-in: the keywords present in the built-in module. In simpler terms, a variable declared inside a function is local, and the ones outside the scope of any function are generally global. Here is an example: the output for the code on screen shows the same variable name, greet, in different scopes… There are keywords that can be used to change the scope of variables: global and nonlocal. The global keyword helps us access global variables from within a function. Nonlocal is a special type of scope defined in Python that is used within nested functions, on the condition that the variable has been defined earlier in the enclosing function. Now you can write a piece of code that will better help you understand the idea of scope. You have already created a file called animalfarm.py. You will be defining a function called d, inside which you will be creating another nested function, e. Let’s write the rest of the code. You can start by defining a couple of variables, both of which will be called animal: the first one inside the d function and the second one inside the e function. Note how you had to first declare the variable inside the e function as nonlocal. You will now add a few more print statements for clarification for when you see the outputs. Finally, you have called the e function here, and you can add one more variable, animal, outside the d function. This”
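A compact version of the animalfarm.py exercise described above (d, e, animal, and the nonlocal declaration follow the excerpt; the variable values are placeholders):

    animal = "horse"                 # global scope

    def d():
        animal = "cow"               # enclosed scope, local to d

        def e():
            nonlocal animal          # rebinds the variable from d, not the global
            animal = "goat"
            print("inside e:", animal)

        e()
        print("inside d:", animal)   # goat: e modified d's variable

    d()
    print("global:", animal)         # horse: the global was never touched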
Reloading Modules: The reload() function (available as importlib.reload() in Python 3) for re-importing and re-executing modules that have already been loaded.
“statement is only loaded once by the Python interpreter, but the reload function lets you import and reload it multiple times. I’ll demonstrate that. First, I create a new file, sample.py, and I add a simple print statement that prints hello world. Remember that any file in Python can be used as a module. I’m going to use this file inside another new file, and the new file is named using_reloads.py. Now I import the sample.py module. I can add the import statement multiple times, but the interpreter only loads it once. If it had been reloaded, we”
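A sketch of that demonstration (the file names follow the excerpt; in Python 3 the reload function lives in the importlib module):

    # sample.py contains a single line:  print("hello world")
    import importlib

    import sample              # "hello world" prints on the first import
    import sample              # no output: modules are loaded only once

    importlib.reload(sample)   # "hello world" prints again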
Testing: Introduction to writing test cases using the assert keyword and the pytest framework. The convention of naming test functions with the test_ prefix is mentioned. Test-Driven Development (TDD) is briefly introduced.
“another file called test_addition.py in which I’m going to write my test cases. Now I import the file that contains the functions that need to be tested. Next, I’ll also import the pytest module. After that, I define a couple of test cases for the addition and subtraction functions. Each test case should be named test underscore, then the name of the function to be tested; in our case we’ll have test_add and test_sub. I’ll use the assert keyword inside these functions, because tests primarily rely on this keyword. It… Contrary to the conventional approach of writing code, I first write test_find_string.py, and then I add the test function named test_is_present. In accordance with the test, I create another file named find_string.py, in which I’ll write the is_present function. I define the function named is_present and pass an argument called person into it. Then I make a list of names written as values. After that, I create a simple if-else condition to check if the passed argument”
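A minimal sketch of the second example (is_present and the file names follow the excerpt; the list of names is a placeholder):

    # find_string.py
    def is_present(person):
        names = ["Ana", "Ben", "Carla"]   # placeholder values
        if person in names:
            return True
        else:
            return False

    # test_find_string.py
    from find_string import is_present

    def test_is_present():
        # pytest collects functions named test_* and runs each assert
        assert is_present("Ana")
        assert not is_present("Zoe")

    # Run from the terminal with:  python -m pytest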
V. Software Development Tools and Concepts
The document mentions several tools and concepts relevant to software development:
Python Installation and Version: Checking the installed Python version using python --version.
“prompt, type python dash dash version to identify which version of Python is running on your machine. If Python is correctly installed, then Python 3 should appear in your console; this means that you are running Python 3. There should also be several numbers after the three to indicate which version of Python 3 you are running. Make sure these numbers match the most recent version on the python.org website. If you see a message that states Python not found, then review your Python installation or the relevant document on”
Jupyter Notebook: An interactive development environment (IDE) for Python. Installing it with python -m pip install jupyter and launching it with jupyter notebook are mentioned.
“course, you’ll use the Jupyter Notebook IDE to demonstrate Python. To install Jupyter, type python -m pip install jupyter within your Python environment, then follow the Jupyter installation process. Once you’ve installed Jupyter, type jupyter notebook to open a new instance of the Jupyter Notebook to use within your default browser.”
MySQL Connector: A Python library used to connect Python applications to MySQL databases.
“The next task is to connect Python to your MySQL database. You can create the connection using a purpose-built Python library called MySQL Connector. This library is an API that provides useful”
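A minimal connection sketch (the credentials and database name are placeholders):

    import mysql.connector

    # Replace these placeholder credentials with your server's details
    connection = mysql.connector.connect(
        user="your_user",
        password="your_password",
        host="localhost",
        database="mg_schema",
    )
    cursor = connection.cursor()
    cursor.execute("SELECT DATABASE()")   # a simple round-trip check
    print(cursor.fetchone())
    connection.close()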
Datetime Library: Python’s built-in module for working with dates and times. Functions like datetime.now(), datetime.date(), datetime.time(), and timedelta are introduced.
“python, so you can import it without requiring pip. Let’s review the functions that Python’s datetime library offers. The datetime now function is used to retrieve today’s date. You can also use datetime date to retrieve just the date, or datetime time to call the current time, and the timedelta function calculates the difference between two values. Now let’s look at the syntax for implementing datetime. To import the datetime Python class, use the import keyword followed by the library name, then use the as keyword to create an alias of… Let’s look at a slightly more complex function, timedelta. When making plans, it can be useful to project into the future; for example, what date is this same day next week? You can answer questions like this using the timedelta function to calculate the difference between two values and return the result in a Python-friendly format. So, to find the date in seven days’ time, you can create a new variable called week, type the dt module, and access the timedelta function as an object instance, then pass through seven days as an argument. Finally”
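The walkthrough as code, using the dt alias from the excerpt:

    import datetime as dt

    now = dt.datetime.now()       # current date and time
    print(now.date())             # just the date
    print(now.time())             # just the time

    week = dt.timedelta(days=7)   # a duration of seven days
    print(now + week)             # the same day next week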
MySQL Workbench: A graphical tool for working with MySQL databases, including creating schemas.
“MySQL server instance and select the schema menu. To create a new schema, select the create schema option from the menu pane in the schema toolbar. This action opens a new window. Within this new window, enter mg_schema in the database name text field and select apply. This generates a SQL script called create schema mg_schema. You are then asked to review the SQL script to be applied to your new database. Click on the apply button within the review window if you’re satisfied with the script. A new window”
Data Warehousing: Briefly introduces the concept of a centralized data repository for integrating and processing large amounts of data from multiple sources for analysis. Dimensional data modeling is mentioned.
“in the next module, you’ll explore the topic of data warehousing. In this module, you’ll learn about the architecture of a data warehouse and build a dimensional data model. You’ll begin with an overview of the concept of data warehousing. You’ll learn that a data warehouse is a centralized data repository that loads, integrates, stores, and processes large amounts of data from multiple sources. Users can then query this data to perform data analysis. You’ll then”
Binary Numbers: A basic explanation of the binary number system (base-2) is provided, highlighting its use in computing.
“binary has many uses in computing. It is a very convenient way of… Consider that you have a lock with four different digits, and each digit can be a zero or a one. How many potential passcodes can you have for the lock? The answer is 2 to the power of 4, or two times two times two times two, which equals sixteen. You are working with a binary lock, therefore each digit can only be either zero or one, so you can take four digits and multiply by two each time, and the total is 16. Each time you add a potential digit, you increase the”
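The lock arithmetic, checked in code (enumerating the combinations is a small illustrative addition):

    from itertools import product

    digits = 4
    print(2 ** digits)    # 16 possible codes for a 4-digit binary lock

    # Enumerate all 16 combinations of 0s and 1s
    for code in product([0, 1], repeat=digits):
        print(code)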
Knapsack Problem: A brief overview of this optimization problem is given as a computational concept.
“three kilograms. Additionally, each item has a value: the torch equals one, water equals two, and the tent equals three. In short, the knapsack problem outlines a list of items that weigh different amounts and have different values. You can only carry so many items in your knapsack. The problem requires calculating the optimum combination of items you can carry if your backpack can carry a certain weight. The goal is to find the best return for the weight capacity of the knapsack. To compute a solution for this problem, you must select all items”
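A brute-force sketch of the problem as described (the item values follow the excerpt; the weights and the 3 kg capacity are assumptions for illustration):

    from itertools import combinations

    # Each item is (name, weight_kg, value); the weights are assumed here
    items = [("torch", 1, 1), ("water", 2, 2), ("tent", 3, 3)]
    capacity = 3

    best_value, best_combo = 0, ()
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            weight = sum(w for _, w, _ in combo)
            value = sum(v for _, _, v in combo)
            if weight <= capacity and value > best_value:
                best_value, best_combo = value, combo

    print(best_value, [name for name, _, _ in best_combo])   # 3 ['tent']

Brute force is fine for three items; for larger inputs, dynamic programming is the usual approach to this problem.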
This document provides a foundational overview of databases and SQL, command-line basics, version control with Git and GitHub, and introductory Python programming concepts, along with essential development tools. The content suggests a curriculum aimed at individuals learning about software development, data management, and related technologies.
Affiliate Disclosure: This blog may contain affiliate links, which means I may earn a small commission if you click on the link and make a purchase. This comes at no additional cost to you. I only recommend products or services that I believe will add value to my readers. Your support helps keep this blog running and allows me to continue providing you with quality content. Thank you for your support!