Scenario
Combining a large stream with data from a small stream, where values from the small stream are repeated throughout the result. Each pair of matching records from the two data sets produces a single record in the output.
Example
For each record in a stream of thousands, find the description for its code from a stream containing mapping data. There are only ~100 possible codes.
Solution
- To do this, use a calculate stream with an order/index lookup pipe.
- In the screenshot below, 'Source Stream 1' contains about 2000 records, and we want to enrich this data with data from 'Source Stream 2', which contains about 50 records.
- The result stream type is set to 'Calculate'.
- The pipe from 'Source Stream 1' is a pull pipe with no grouping.
- The pipe from 'Source Stream 2' is a lookup pipe. An Order/Index entry should be added to define the joining key between the two streams.
- This indexes all the records from 'Source Stream 2' by the index attribute, so they can be searched quickly. The data is queried once and the result is held in memory.
- All stream attributes use the attribute name, prefixed by the pipe name. For example, in1.Attribute1.
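The effect of the indexed lookup pipe described above can be sketched outside the product. The following Python is only an analogy, not PhixFlow configuration or syntax: the function names `build_index` and `enrich`, the field names `Code` and `Description`, and the sample records are all hypothetical. It shows the small stream being read once into an in-memory index, and one output record being produced for each matching pair.

```python
from collections import defaultdict


def build_index(lookup_records, key):
    """Read the small stream once and index its records by the join key."""
    index = defaultdict(list)
    for record in lookup_records:
        index[record[key]].append(record)
    return index


def enrich(large_records, index, key, field):
    """Yield one output record per matching pair of records."""
    for record in large_records:
        # .get() avoids creating empty entries for unmatched keys
        for match in index.get(record[key], []):
            yield {**record, field: match[field]}


# Hypothetical stand-in for 'Source Stream 2' (the small mapping stream)
source_stream_2 = [
    {"Code": "A", "Description": "Alpha"},
    {"Code": "B", "Description": "Beta"},
]

# Hypothetical stand-in for 'Source Stream 1' (the large stream)
source_stream_1 = [
    {"Id": 1, "Code": "A"},
    {"Id": 2, "Code": "B"},
    {"Id": 3, "Code": "A"},
]

index = build_index(source_stream_2, "Code")
result = list(enrich(source_stream_1, index, "Code", "Description"))
```

Because the index is built once, each record in the large stream costs a single dictionary lookup rather than a scan of the small stream.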
Make sure that every attribute referenced with the _out prefix in the joining key has a lower order number than the attributes that use the lookup pipe prefix. For example, in the above screenshot, Attribute1 must have a lower order number than Attribute3. If the order of the attributes were switched around, Attribute3 would not return a value: Attribute1 would not yet be calculated, so the order/index would look for records where Attribute1 is null.
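The ordering rule can be illustrated with a small Python analogy. This is a sketch under assumptions, not how PhixFlow is implemented: `calculate_record`, the `(order, name, expression)` tuples, and the sample lookup data are hypothetical. Attributes are evaluated in ascending order number, and the expression for Attribute3 reads the already-calculated output value of Attribute1 as its lookup key.

```python
def calculate_record(in1, attribute_exprs, lookup_index):
    """Evaluate output attributes in ascending order number."""
    out = {}
    for _order, name, expr in sorted(attribute_exprs, key=lambda a: a[0]):
        out[name] = expr(in1, out, lookup_index)
    return out


# Hypothetical in-memory index built from the lookup pipe
lookup_index = {"A": "Alpha", "B": "Beta"}


def attribute1(in1, out, index):
    # Copies the code from the pull pipe record
    return in1["Code"]


def attribute3(in1, out, index):
    # The joining key uses the output value of Attribute1;
    # if it has not been calculated yet, the key is None
    return index.get(out.get("Attribute1"))


good = [(1, "Attribute1", attribute1), (2, "Attribute3", attribute3)]
bad = [(2, "Attribute1", attribute1), (1, "Attribute3", attribute3)]

good_result = calculate_record({"Code": "A"}, good, lookup_index)
bad_result = calculate_record({"Code": "A"}, bad, lookup_index)
```

With the `good` ordering the lookup finds a description. With the `bad` ordering, Attribute3 is evaluated while Attribute1 is still empty, so the lookup finds nothing.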
Info |
---|
Watch out for multiple records being returned by your lookup pipe. You will need to do one of the following:
|