Following will be the steps we need to follow for designing a Data Warehouse:
Creating External Tables and Loading Data from Flat files
To transform data from source flat files to relational or multidimensional targets, we can use either a flat file operator or an external table to load data from source.
External tables allow us to represent data from flat file sources in relational table format. We can directly execute operations like query, join and transformation on flat files using the Oracle database.
Note: When we use external tables, we can query on and transform data present in the flat files directly, for which relevant SQL code is generated by OWB. Additionally, heterogeneous joins between flat files and relational tables are enabled and are maintained by warehouse builder.
In Warehouse Builder, we can define external tables on top of flat-file structures, deploy external tables to the source database, and use these external tables in mappings instead of flat files. For example we have created two external tables (EXPENSE_CATEGORIES and EXPENSE_DATA) which will point to the sampled flat files.
The categories of the subjects are called dimensions. Tables that describe the dimension are called Dimension Table. Dividing a Data warehouse project into dimension provides structured information for reporting purpose.
Dimensions are the primary organizational unit of data in a star schema. Examples of some commonly used dimensions are Customer, Product, and Time.
A Dimension consists of a set of levels and hierarchies defined using these levels. Users often analyze data by understanding the structure of the dimension hierarchies. Query performance is improved because the query optimizer makes good decisions about choosing what data to use.
To create a dimension, we must define the following:
1) Dimension Attributes
3) Level Attributes
Below picture depicts the dimension created REL_TIME using TIME Dimension wizard and REL_CATEGORY Dimension Using simple Wizard. With this dimension, Warehouse Builder also creates tables (REL_TIME and REL_CATEGORY) that implements the dimension in a star schema and also creates a sequence to load the ID column of the dimension.
Cubes contain measures and links to one or more dimensions. Body of the cube contains measure values and Axes of a cube contains dimension values. Most of the measures are numeric and additive.
Sales data can be organized into a cube whose axes contain values such as Time, Customer and Product dimensions and whose body contains values from the measures value like sales.
In a relational implementation, a cube is connected to dimension tables by foreign key and consists of a set of measures.
We must define Cube Measures and Cube Dimension value to create a cube.
Now, we will design a simple relational cube REL_EXPENSE using the cube wizard in OWB. It refers two dimensions ‘REL_CATEGORY’ and ‘REL_TIME’. Wizard helps us to create cube in graphical manner.
Go to the ‘EXPENSE_WH’ module, right-click on Cubes, and select New > Using Wizard.
- Enter ‘REL_EXPENSE’ as the name of the cube on the Name and Description page. Click Next.
- On the Storage Type page, select ROLAP: Relational Storage to specify relational implementation of the cube. Click Next.
Note: We can implement a cube in a relational or a dimensional form in the database. In relational implementation, cube data is stored in relational tables. When a cube is implemented in a dimensional environment, analytic workspace stores cube data.
- We can now select the dimensions that the cube will refer to on the Dimensions page. In the Available Dimensions list, expand Dimensions, and then expand ‘EXPENSE_WH’. Select two dimensions ‘REL_CATEGORY’ and ‘REL_TIME’ and move to selected Dimensions list. Press Next.
- Enter ‘EXPENSE’ in the Name field on the Measures page. Accept the default data type. Press Next. We can add number of measures in cube.
- Analyze the ‘REL_EXPENSE’ cube that is connected to the two dimensions.
Note: We have designed the dimensional objects, their implementing objects and related sequences, and their mappings, but these things do not physically exist in the database as yet. This is a major benefit that Warehouse Builder provides by separating the design and logical model from the actual physical implementation.
We need to deploy the objects (dimensions, cubes, sequence, mappings etc…) to actually create or build the target warehouse. We also need to design, deploy, and execute the ETL mappings that will extract data from the relational or flat file data sources, transform the data in a suitable manner, and finally load the data in the dimension and cube tables.
Designing ETL Data Flow Mappings
Now, we will create mappings to extract data from sources, transform it, and load it into target warehouse.
We use the Mapping Editor interface to design a mapping in Warehouse Builder. There is a large set of mapping operators available for use within the Mapping Editor. Using Oracle Warehouse Builder, we can use activities external to Warehouse Builder (like e-mail, FTP commands).
Now as the source and target modules are already defined, we will check about create the data movement and transformation logic. ETL stands for Extract, Transform, and Load. ETL involves the movement and transformation of data from your sources data to your targets. The majority of the design what we have done yet in Oracle Warehouse Builder is determine the ETL logic.
Design a Mapping between an External Table and a Cube
Now, we will perform an operation to create the REL_EXPENSE_MAP mapping that maps an external table called EXPORT_CSV to a cube called REL_EXPENSE.
- Expand ‘MY_PROJECT’ in the Project Explorer. Expand Databases > Oracle > EXPENSE_WH > Mappings. Right-click the Mappings node and select New.
- Enter ‘REL_EXPENSE_MAP’ as mapping name in the Mapping window. Press OK.
- In the Explorer panel, on the left of the Mapping Editor, check that the Available Objects tab is selected. Expand Oracle > EXPENSE_WH > External Tables. Drag the ‘EXPENSE_DATA’ external table to the canvas.
- Now, drag the ‘REL_EXPENSE’ cube to the canvas. Now, there are two objects on the canvas, the ‘EXPENSE_DATA’ external table and the ‘REL_EXPENSE’ cube operators.
- From the Palette panel, drag the Expression operator to the canvas.Note: The Expression operator is used to transform the ‘EXPENSE_DATE’ attribute in the EXPENSE_DATA external table from a date data type to a number data type. The target attribute in the cube is a number data type, therefore, this data type conversion is required to match the data type.
- Drag a connection line from EXPENSE_DATA.OUTGRP1.EXPENSE_DATE to EXPRESSION.INGRP1.
- Right-click the expression operator ‘EXPRESSION’ header and select Details.
- Press the Output Attributes tab in the Expression Editor. Add ‘EXPENSE_DATE’ attribute and ensure that the data type is NUMBER. Press OK.
- Enter the following statement in Expression for EXPENSE_DATE :TO_NUMBER (TO_CHAR (INGRP1.EXPENSE_DATE, ‘YYYYMMDD’), ‘99999999’)
- Connect the source operator External table to the target operator Cube as follows :
- Drag a connection line from EXPRESSION.OUTGRP1.EXPENSE_DATE to REL_EXPENSE.REL_EXPENSE.REL_TIME_DAY_CODE
- Drag a connection line from EXPENSE_DATA.OUTGRP1.EXPENSE_TYPE to REL_EXPENSE.REL_EXPENSE.REL_CATEGORY_NAME
- Drag a connection line from EXPENSE_DATA.OUTGRP1.REIMBURSABLE_AMOUNT to REL_EXPENSE.REL_EXPENSE.EXPENSE
- We have now generated the code that loads the cube. Oracle Warehouse Builder generates the code required to implement our design based on the ETL logic that we design in a mapping,
Build your Team
Want to Hire Skilled Developers