T) Explain the runtime behaviour of the DEDUP SORTED component?

 Step-by-Step Process:

  1. Reads Grouped Data

    • Takes in records that are already sorted by a specific key (like customer ID or date)

  2. Optional Filter Check

    • If you have a filter expression:

      • Filter returns FALSE (0): Record is skipped

      • Filter returns NULL: Record goes to the reject port, and a descriptive error message is written to the error port

      • Filter returns TRUE (any value other than 0 or NULL): Record continues processing

    • If no filter: All records are processed

  3. Handles Record Groups

    • Groups consecutive records with the same key value together

    • For single-record groups: Record goes directly to output

    • For multi-record groups (duplicates):

      • Uses your "keep" setting to decide which record to keep

      • If keep = "first" or "last": Keeps one record per group (sends it to the out port) and sends the others to the dup port

      • If keep = "unique-only": Sends ALL records from multi-record groups to the dup port (the out port gets nothing from those groups; single-record groups still go to out)
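
Step 3 can be sketched in plain Python. This is only a toy model of the grouping and "keep" logic, not Ab Initio code: the function name dedup_sorted, the tuple records, and the key function are made up for illustration, and the filter step is left out.

```python
from itertools import groupby

def dedup_sorted(records, key, keep="first"):
    """Toy model of the keep logic; returns (out, dup) lists."""
    out, dup = [], []
    # groupby only merges CONSECUTIVE records with the same key,
    # which mirrors how groups are formed on a sorted flow.
    for _, grp in groupby(records, key=key):
        group = list(grp)
        if len(group) == 1:
            out.append(group[0])          # single-record group
        elif keep == "first":
            out.append(group[0])
            dup.extend(group[1:])
        elif keep == "last":
            out.append(group[-1])
            dup.extend(group[:-1])
        elif keep == "unique-only":
            dup.extend(group)             # whole group goes to dup
        else:
            raise ValueError(f"unknown keep setting: {keep}")
    return out, dup

# Hypothetical input, already sorted on the first field (the key).
rows = [(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e"), (3, "f")]

out_first, dup_first = dedup_sorted(rows, key=lambda r: r[0], keep="first")
out_last, dup_last = dedup_sorted(rows, key=lambda r: r[0], keep="last")
out_uniq, dup_uniq = dedup_sorted(rows, key=lambda r: r[0], keep="unique-only")

print(out_first)  # [(1, 'a'), (2, 'c'), (3, 'd')]
print(out_last)   # [(1, 'b'), (2, 'c'), (3, 'f')]
print(out_uniq)   # [(2, 'c')]
```

Note that with "unique-only" the single-record group for key 2 still reaches out, while every record for keys 1 and 3 lands in dup.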

Key Points:

  • Data MUST be sorted first for this to work correctly

  • Only consecutive records with the same key value are treated as duplicates

  • You control which record of a duplicate group to keep (first, last, or none with unique-only)

  • Filter is optional - use it to skip certain records before deduplication
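
Why the sort matters can be shown with a small Python sketch. It assumes nothing about Ab Initio internals; it only mimics "group consecutive equal keys, keep the first", and the name keep_first and the sample keys are hypothetical.

```python
from itertools import groupby

def keep_first(keys):
    """Keep one key per run of consecutive equal keys."""
    return [k for k, _ in groupby(keys)]

# Hypothetical customer IDs arriving unsorted.
unsorted_keys = [101, 102, 101, 103, 102]

without_sort = keep_first(unsorted_keys)
with_sort = keep_first(sorted(unsorted_keys))

print(without_sort)  # [101, 102, 101, 103, 102]
print(with_sort)     # [101, 102, 103]
```

On the unsorted input no two equal keys are adjacent, so nothing is recognised as a duplicate. After sorting, equal keys sit next to each other and the duplicates collapse as expected.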

Output Results:

  • out port: Contains one record per key value for keep = "first" or "last"; for "unique-only", contains only records whose key occurs once

  • dup port: Contains the duplicate records that were removed

  • reject port: Gets records that caused filter errors
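
To make the reject port's role concrete, here is a minimal Python sketch of the three filter outcomes. It is an illustration under assumptions, not the component's implementation: Python's None stands in for NULL, and route_by_filter, amount_positive, and the sample rows are invented names and data.

```python
def route_by_filter(records, select):
    """Split records into (passed, reject) by the filter's result."""
    passed, reject = [], []
    for rec in records:
        verdict = select(rec)
        if verdict is None:
            reject.append(rec)    # NULL result: goes to reject
        elif verdict:
            passed.append(rec)    # TRUE: continues to deduplication
        # FALSE (0): record is skipped and appears on no port
    return passed, reject

def amount_positive(rec):
    """Hypothetical filter: NULL in, NULL out; otherwise amount > 0."""
    amount = rec["amount"]
    if amount is None:
        return None
    return amount > 0

rows = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 0},
    {"id": 3, "amount": None},
]

passed, reject = route_by_filter(rows, amount_positive)
print([r["id"] for r in passed])  # [1]
print([r["id"] for r in reject])  # [3]
```

Record 2 fails the filter and is simply dropped, which is different from record 3, whose NULL result sends it to reject.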
