Join Processor

The Join processor is used to aggregate two data sources using one of several join types (e.g. Inner, outer, etc). The join operator links the records from two data sources by matching values in specified columns.

Elixir Data Designer supports all the three types of Joins: inner join, outer join and cross join.

The Join processor can be selected from the Designer menu bar and then placed on the diagram workspace. The Join processor must get input data from two DataSources.

Properties

The Join Processor properties are shown in Figure 4.5, “Join Wizard”.

Figure 4.5. Join Wizard

Join Wizard

The name of the Join can be entered in the text box provided. The Choose Colour dialog box is invoked on clicking the Colour button from which the colour of the Join processor can be selected.

The dialog consists of three tabs: Options, Primary and Secondary.

Working with Joins

Here's how to join the Sales and Customer data sources:

  1. Add the JDBC data sources, Sales and Customer using the procedure given in the previous chapter.

  2. Add a Composite DataSource with the name Join.

  3. Select the Sales data source, drag and drop it in the designer window. Similarly drag and drop the Customer data source onto the designer.

  4. Create a Join processor on the diagram.

  5. Connect the Customer data source as primary and Sales data source as secondary to the Join processor and connect the output to the Result. The diagram flow is shown in Figure 4.6, “Sample Join Flow”.

    Figure 4.6. Sample Join Flow

    Sample Join Flow

Inner Join

Inner join combines column values from one record of a data source with column values of a record from another (or the same) data source to form a single, merged record of data.

Using an Inner join on the Sales and Customer data sources you can explore the result of setting different options for the "If multiple matching secondaries".

a) If multiple matching secondaries - Repeat

  1. Using the flow described above, open the Join properties and select the Discard option from the "If no matching secondary" Combo box.

  2. Select the Repeat option from the "If multiple matching secondaries" combo box.

  3. Select the secondary tab. Select the customer_id field in the primary column against the customer_id field of the secondary data source as shown in Figure 4.7, “Join Wizard”.

    Figure 4.7. Join Wizard

    Join Wizard
  4. Click the Finish button to close the Join Wizard and return back to the Designer window.

  5. Select the Result, and choose View from the popup menu. The output is shown in the Figure 4.8, “Inner Join Result”.

    Figure 4.8. Inner Join Result

    Inner Join Result

This query fetches all primary records which have one or more matching secondary records. The primary record is repeated once for each secondary match. If there is no secondary match, the primary record is discarded.

b) If multiple matching secondaries - Keep

This query fetches all primary records which have one or more matching secondary records. The primary record is only output once, merged with the first secondary match. Subsequent secondary matches are ignored. If there is no secondary match, the primary record is discarded.

c) If multiple matching secondaries - Discard

This query fetches all primary records which have exactly one matching secondary record. The primary record is output once merged with the secondary match. If there is no secondary match, or multiple secondary matches, the primary record is discarded.

Outer Join

Outer join is a type of join in which both matching and non matching rows are returned. The values of all columns from the unmatched table in non-matching rows are set to NULL.

Using an Outer join on the Sales and Customer data sources you can explore the result of setting different options for the "If multiple matching secondaries".

a) If multiple matching secondaries - Repeat

  1. Return to the same Join diagram and edit the Join properties by selecting the option Keep from the "If no matching secondary" and Repeat from the "If multiple matching secondaries" Combo box.

  2. Enter "pri" in the Primary Prefix text box and "sec" in the Secondary Prefix text box.

  3. We will leave the secondary tab as before, linking the customer_id fields between primary and secondary.

  4. Click the Finish button and view the Result output. The output is shown in the Figure 4.9, “Outer Join Result”.

    Figure 4.9. Outer Join Result

    Outer Join Result

This query fetches all primary records. Each primary will be output as many times as there are matching secondary records. However, if there is no matching secondary record, the primary record will still be output once with the secondary fields set to NULL.

b) If multiple matching secondaries - Keep

This query fetches all records from the primary datasource and connects them to the first matching record from the secondary datasource. If no secondary record matches, the output record contains NULL for those values.

c) If multiple matching secondaries - Discard

This query fetches all records from the primary datasource. If there is exactly one matching secondary, it is merged. If there is zero or more than one matching secondary then NULL is used for each secondary field in the output.

Cross Join

A Cross Join merges each possible combination of records from the primary and secondary datasources. The output from a cross join can be very large!

  1. Reusing the same Join composite, open the Properties and select the Cross Join check box. You will notice the combo boxes are now disabled as all combinations are generated.

  2. Click the Finish button and view the Result output. The output is shown in the Figure 4.10, “Cross Join Result”.

    Figure 4.10. Cross Join Result

    Cross Join Result

The combination of all the records from both data sources are displayed. For instance, if there are 25 records in the Customer table and 40 records in the Sales table a total of 1000 (25*40) records are fetched by using Cross Join.