On a recent trip to a new city, someone said that the easiest way from the airport to the hotel was to use the Metro. I could speak the language, but reading it was another matter. I was surprised by how quickly I navigated to the hotel by following the Metro map. The Metro map is based on the successful design of the London Underground map.
Harry Beck was not a cartographer. He was an engineering draftsman who started drawing a different type of map in his spare time. Beck believed that passengers cared less about accurate distances than about how to get from one station to another. He reduced the map to straight lines and sharp angles, which produced something closer to an electrical schematic diagram than to a conventional geographic map. The company that ran the London Underground was skeptical of Beck’s map because it was radically different and because they had not commissioned the project.
In PowerCenter, we use mappings to design the steps between the source and the target that meet the business requirements of our users. A mapping can present both the logical design and the physical design of the ETL process.
Our PowerCenter Level One Developer course provides information to entry-level developers. One subject is the file list. You can watch a preview of the PowerCenter Level One Developer course.
Suppose that you have multiple flat files that must be processed. All of these files have the same layout: the fields are in the same order, with the same data types, precisions, and other properties. You can use a file list to process all of these similar files with only one mapping and one session.
You must create a source definition using the file layout. The file list is a file on the server that references each one of the data files. You configure the session to run the file list by naming the list as the source. This functionality is detailed in a presentation with multiple hands-on labs in the PowerCenter Level One Developer course.
However, the course does not discuss the property called ‘Add Currently Processed Flat File Name Port’. This property exists on the source definition. When it is configured, PowerCenter creates another port on the source that stores the name of the flat file where each record originated.
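To make the idea concrete outside of PowerCenter, here is a minimal Python sketch of what a file list plus a file name port gives you. It is only an illustration of the concept, not how the Integration Service is implemented. It assumes that the data files are comma-delimited with a header row, and the function name `read_file_list` and the field name `source_file_name` are hypothetical names chosen for this example.

```python
import csv
from pathlib import Path


def read_file_list(list_path):
    """Yield every row from every data file named in the file list.

    Each row gets an extra 'source_file_name' field. That name is a
    hypothetical stand-in for the port that PowerCenter adds to the source.
    """
    list_path = Path(list_path)
    for line in list_path.read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines in the list
        # An entry without a directory is looked up next to the file list;
        # an absolute path in the list is used as written.
        data_path = list_path.parent / entry
        with data_path.open(newline="") as handle:
            # Assumption: each data file is comma-delimited with a header row.
            for row in csv.DictReader(handle):
                row["source_file_name"] = entry
                yield row
```

Iterating over `read_file_list("customer_list.dat")` would return the rows of every listed file in order, each one tagged with the list entry it came from, which is the same information that the new port carries through the mapping.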
The ‘Add Currently Processed Flat File Name Port’ property is on the Properties tab of the Source Definition.
This new port should be propagated through the mapping. To do this, you add a new port with the correct data type and precision to each object in the mapping, including the target. Here is an example of a mapping after the new port has been added.
Each red arrow indicates that the new port has been added in the mapping. If the session has already been created, refresh the mapping.
Here are the session properties:
The Source FileType has been set to ‘Indirect’, which is necessary for file lists. The Source FileName is set to ‘customer_list.dat’. This is not the name of a specific data file, but the name of the file that references the paths and names of each data file to be processed. Notice that the ‘Add Currently Processed Flat File Name Port’ property on the session is configured.
This is the file list. No file paths are indicated because the files exist in the same directory as the file list.
Now let’s execute the session.
This is the target table. We have loaded customer records from the flat files that are referenced in the file list, and each record carries the name of the file it came from. This metadata can be crucial when you must track the source of data or research a processing error.
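As a rough illustration of how the file name helps with tracking, the Python sketch below counts the loaded rows per source file. The rows, the file names, and the `source_file_name` key are all made up for this example; in practice you would run an equivalent query against your own target table.

```python
from collections import Counter


def rows_per_source_file(rows, file_name_key="source_file_name"):
    """Count how many loaded rows came from each flat file."""
    return Counter(row[file_name_key] for row in rows)


# Hypothetical target rows; the file names and customer IDs are invented.
loaded_rows = [
    {"customer_id": 1, "source_file_name": "customers_east.dat"},
    {"customer_id": 2, "source_file_name": "customers_east.dat"},
    {"customer_id": 3, "source_file_name": "customers_west.dat"},
]

counts = rows_per_source_file(loaded_rows)
print(counts["customers_east.dat"])  # prints 2
```

A file that shows fewer rows than expected, or none at all, is a quick pointer to where a processing error occurred.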
Using a file list is a simple way to process multiple data files and capture metadata about them. The list provides all of the information that is needed, without creating multiple mappings and sessions. Like a Metro map, it puts all of the information that you need in one place.