Case Study

In this case study, we will use the Sales, Customer and Stores tables of the Mondrian Database in conjunction with a variety of processors and generate the output into different file formats. The data manipulations we will illustrate are extraction, merging, filtering, derivation, caching and transformation.

Before we begin, you should ensure the Mondrian datasource is configured as described in the section called “Using the JDBC/ODBC bridge driver”.

Adding the DataSources

Launch Elixir Repertoire. Choose or create a new file system or folder for this case study and from the popup menu choose Add->Datasource. In the DataSource Wizard that appears select the JDBC DataSource. Click Next.

In the Define JDBC DataSource screen, enter the DataSource name as Store. Select the JDBC/ODBC bridge(Sun JVM) as the driver specification. Enter the URL as jdbc:odbc:MondrianFoodMart. Click Next.

In the SQL window, add the following query:

Select * from Store

Click Next and then click Infer Schema. Similarly, add the Sales data source with SQL:

Select * from Sales_Fact_1997

Again, infer the schema. Similarly, add a Customer data source. In the SQL window enter:

Select * from Customer

and again infer the schema.

Add a Composite DataSource named Case Study.

Creating a Composite DataSource

After adding the Composite DataSource, it will open automatically. We are going to create the diagram as shown in Figure 4.72, “Case Study Composite Diagram”. Select the Customer DataSource and drag and place it on the diagram. Repeat the process for the Sales DataSource and then the Store DataSource.

Figure 4.72. Case Study Composite Diagram

Case Study Composite Diagram

Add the additional processors and connections as shown in Figure 4.72, “Case Study Composite Diagram”. In your version, you will notice the link from Cube to DataStore is a dashed line because the tool cannot identify the schema for this flow until the cache has a schema inferred. With the diagram created, we can walk through the flow and set the various processor properties.

Join 1

This configuration corresponds to an inner join, so records from the primary are only retained if a matching secondary record exists. In this case, only customers that have made purchases (have sales records) will be retained.

Filter

The output from this part of the flow will only contain records where the store_country field has the value USA.

Derivative

We have defined a new column grocery_meat_sum which contains the sum of grocery and meat fields.

Join 2

Again we have used an inner join so records are only fetched based on a match with the secondary input. As the secondary input filters out all countries except USA, the Join will discard all non-US customer sales from the primary datasource.

Cube

Notice that after you inferred the schema, the flow connector from the Cube to the DataStore has changed from a dashed line to a solid line, indicating that a schema is now defined for this link. You can test the process by selecting the Cube processor and choosing View Cube from the popup menu. The output is displayed as shown in Figure 4.73, “View Cube Output”.

The sum and average of sales for the male and female customers for the different cities belonging to specific States of USA are displayed.

Figure 4.73. View Cube Output

View Cube Output

DataStore

You can choose various kinds of output for the records that have been processed. For this walkthrough, we have chosen XML, MySQL and Oracle.

XML: Invoke the DataStore Wizard and set the following properties:

Select the DataStore, and select Generate from the popup menu. The XML file will be generated and saved in the specified location.

MySQL: Before generating the MySQL JDBC DataStore, the MySQL driver file must be copied to the /Public/lib folder in the Repository.

Invoke the DataStore Wizard and set the following properties:

Select the DataStore and choose Generate from the popup menu. The MySQL JDBC DataStore is generated and saved in the specified location.

Oracle: Before generating the Oracle JDBC DataStore, the Oracle driver file must be copied to the /Public/lib/ folder in the Repository. This ensures that the Oracle driver is loaded into the class path when the tool is launched.

Invoke the DataStore Wizard and set the following properties:

Select the DataStore and choose Generate from the popup menu. The Oracle JDBC DataStore is generated and saved in the specific location.