Class ParallelBatch

java.lang.Object
com.atlan.util.ParallelBatch
All Implemented Interfaces:
Closeable, AutoCloseable

public class ParallelBatch extends Object implements Closeable
Utility class for managing bulk updates across multiple parallel-running batches.
  • Constructor Details

    • ParallelBatch

      public ParallelBatch(AtlanClient client, int maxSize)
      Create a new batch of assets to be bulk-saved, in parallel (across threads).
      Parameters:
      client - connectivity to Atlan
      maxSize - maximum size of each batch that should be processed (per API call)
    • ParallelBatch

      public ParallelBatch(AtlanClient client, int maxSize, boolean replaceAtlanTags, AssetBatch.CustomMetadataHandling customMetadataHandling)
      Create a new batch of assets to be bulk-saved, in parallel (across threads).
      Parameters:
      client - connectivity to Atlan
      maxSize - maximum size of each batch that should be processed (per API call)
      replaceAtlanTags - if true, all Atlan tags on an existing asset will be overwritten; if false, all Atlan tags will be ignored
      customMetadataHandling - how to handle custom metadata (ignore it, replace it (wiping out anything pre-existing), or merge it)
    • ParallelBatch

      public ParallelBatch(AtlanClient client, int maxSize, boolean replaceAtlanTags, AssetBatch.CustomMetadataHandling customMetadataHandling, boolean captureFailures)
      Create a new batch of assets to be bulk-saved, in parallel (across threads).
      Parameters:
      client - connectivity to Atlan
      maxSize - maximum size of each batch that should be processed (per API call)
      replaceAtlanTags - if true, all Atlan tags on an existing asset will be overwritten; if false, all Atlan tags will be ignored
      customMetadataHandling - how to handle custom metadata (ignore it, replace it (wiping out anything pre-existing), or merge it)
      captureFailures - when true, any failed batches will be captured and retained rather than exceptions being raised (for large amounts of processing this could cause memory issues!)
    • ParallelBatch

      public ParallelBatch(AtlanClient client, int maxSize, boolean replaceAtlanTags, AssetBatch.CustomMetadataHandling customMetadataHandling, boolean captureFailures, boolean updateOnly)
      Create a new batch of assets to be bulk-saved, in parallel (across threads).
      Parameters:
      client - connectivity to Atlan
      maxSize - maximum size of each batch that should be processed (per API call)
      replaceAtlanTags - if true, all Atlan tags on an existing asset will be overwritten; if false, all Atlan tags will be ignored
      customMetadataHandling - how to handle custom metadata (ignore it, replace it (wiping out anything pre-existing), or merge it)
      captureFailures - when true, any failed batches will be captured and retained rather than exceptions being raised (for large amounts of processing this could cause memory issues!)
      updateOnly - when true, only attempt to update existing assets and do not create any assets (note: this will incur a performance penalty)
    • ParallelBatch

      public ParallelBatch(AtlanClient client, int maxSize, boolean replaceAtlanTags, AssetBatch.CustomMetadataHandling customMetadataHandling, boolean captureFailures, boolean updateOnly, boolean track)
      Create a new batch of assets to be bulk-saved, in parallel (across threads).
      Parameters:
      client - connectivity to Atlan
      maxSize - maximum size of each batch that should be processed (per API call)
      replaceAtlanTags - if true, all Atlan tags on an existing asset will be overwritten; if false, all Atlan tags will be ignored
      customMetadataHandling - how to handle custom metadata (ignore it, replace it (wiping out anything pre-existing), or merge it)
      captureFailures - when true, any failed batches will be captured and retained rather than exceptions being raised (for large amounts of processing this could cause memory issues!)
      updateOnly - when true, only attempt to update existing assets and do not create any assets (note: this will incur a performance penalty)
      track - when false, details about each created and updated asset will no longer be tracked (only an overall count of each) -- useful if you intend to send close to (or more than) 1 million assets through a batch
    • ParallelBatch

      public ParallelBatch(AtlanClient client, int maxSize, boolean replaceAtlanTags, AssetBatch.CustomMetadataHandling customMetadataHandling, boolean captureFailures, boolean updateOnly, boolean track, boolean caseSensitive)
      Create a new batch of assets to be bulk-saved, in parallel (across threads).
      Parameters:
      client - connectivity to Atlan
      maxSize - maximum size of each batch that should be processed (per API call)
      replaceAtlanTags - if true, all Atlan tags on an existing asset will be overwritten; if false, all Atlan tags will be ignored
      customMetadataHandling - how to handle custom metadata (ignore it, replace it (wiping out anything pre-existing), or merge it)
      captureFailures - when true, any failed batches will be captured and retained rather than exceptions being raised (for large amounts of processing this could cause memory issues!)
      updateOnly - when true, only attempt to update existing assets and do not create any assets (note: this will incur a performance penalty)
      track - when false, details about each created and updated asset will no longer be tracked (only an overall count of each) -- useful if you intend to send close to (or more than) 1 million assets through a batch
      caseSensitive - (only applies when updateOnly is true) attempt to match assets case-sensitively (true) or case-insensitively (false)
    • ParallelBatch

      public ParallelBatch(AtlanClient client, int maxSize, boolean replaceAtlanTags, AssetBatch.CustomMetadataHandling customMetadataHandling, boolean captureFailures, boolean updateOnly, boolean track, boolean caseSensitive, AssetCreationHandling creationHandling)
      Create a new batch of assets to be bulk-saved, in parallel (across threads).
      Parameters:
      client - connectivity to Atlan
      maxSize - maximum size of each batch that should be processed (per API call)
      replaceAtlanTags - if true, all Atlan tags on an existing asset will be overwritten; if false, all Atlan tags will be ignored
      customMetadataHandling - how to handle custom metadata (ignore it, replace it (wiping out anything pre-existing), or merge it)
      captureFailures - when true, any failed batches will be captured and retained rather than exceptions being raised (for large amounts of processing this could cause memory issues!)
      updateOnly - when true, only attempt to update existing assets and do not create any assets (note: this will incur a performance penalty)
      track - when false, details about each created and updated asset will no longer be tracked (only an overall count of each) -- useful if you intend to send close to (or more than) 1 million assets through a batch
      caseSensitive - (only applies when updateOnly is true) attempt to match assets case-sensitively (true) or case-insensitively (false)
      creationHandling - if assets are to be created, how they should be created (as full assets or only partial assets)
    • ParallelBatch

      public ParallelBatch(AtlanClient client, int maxSize, boolean replaceAtlanTags, AssetBatch.CustomMetadataHandling customMetadataHandling, boolean captureFailures, boolean updateOnly, boolean track, boolean caseSensitive, AssetCreationHandling creationHandling, boolean tableViewAgnostic)
      Create a new batch of assets to be bulk-saved, in parallel (across threads).
      Parameters:
      client - connectivity to Atlan
      maxSize - maximum size of each batch that should be processed (per API call)
      replaceAtlanTags - if true, all Atlan tags on an existing asset will be overwritten; if false, all Atlan tags will be ignored
      customMetadataHandling - how to handle custom metadata (ignore it, replace it (wiping out anything pre-existing), or merge it)
      captureFailures - when true, any failed batches will be captured and retained rather than exceptions being raised (for large amounts of processing this could cause memory issues!)
      updateOnly - when true, only attempt to update existing assets and do not create any assets (note: this will incur a performance penalty)
      track - when false, details about each created and updated asset will no longer be tracked (only an overall count of each) -- useful if you intend to send close to (or more than) 1 million assets through a batch
      caseSensitive - (only applies when updateOnly is true) attempt to match assets case-sensitively (true) or case-insensitively (false)
      creationHandling - if assets are to be created, how they should be created (as full assets or only partial assets)
      tableViewAgnostic - if true, tables and views will be treated interchangeably (an asset in the batch marked as a table will attempt to match a view if not found as a table, and vice versa)
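The tableViewAgnostic fallback described above can be pictured with a small lookup sketch. This is a stand-in written against the JDK only, not the SDK's implementation: the class name `TableViewMatchSketch`, the `matchType` method, and the `typeName::qualifiedName` key format are all hypothetical.

```java
import java.util.Optional;
import java.util.Set;

/** Hypothetical stand-in illustrating table/view-agnostic matching; not SDK code. */
final class TableViewMatchSketch {

    /**
     * Returns the type under which the asset was found, if any.
     * The existing assets are modelled as a set of "typeName::qualifiedName" keys (an assumption).
     */
    static Optional<String> matchType(
            Set<String> existing, String typeName, String qualifiedName, boolean tableViewAgnostic) {
        // First try the type the asset was marked with in the batch.
        if (existing.contains(key(typeName, qualifiedName))) {
            return Optional.of(typeName);
        }
        // Only when agnostic: a Table may match a View, and vice versa.
        if (tableViewAgnostic) {
            String alternate = "Table".equals(typeName) ? "View"
                    : "View".equals(typeName) ? "Table" : null;
            if (alternate != null && existing.contains(key(alternate, qualifiedName))) {
                return Optional.of(alternate);
            }
        }
        return Optional.empty();
    }

    private static String key(String typeName, String qualifiedName) {
        return typeName + "::" + qualifiedName;
    }
}
```

With tableViewAgnostic set to false, the same lookup finds nothing for a table that exists only as a view. With updateOnly enabled, that is the situation in which the asset would be skipped.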
  • Method Details

    • add

      public AssetMutationResponse add(Asset single) throws AtlanException
      Add an asset to the batch to be processed.
      Parameters:
      single - the asset to add to a batch
      Returns:
      the assets that were created or updated in this batch, or null if the batch is still queued
      Throws:
      AtlanException - on any problems adding the asset to or processing the batch
    • flush

      public void flush() throws AtlanException
      Flush any remaining assets in the parallel batches.
      Throws:
      IllegalStateException - on any problems flushing (submitting) any of the parallel batches
      AtlanException
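The contract of add and flush (a null return while the batch is still queued, a result once it has been submitted, and a final flush for any remainder) can be sketched with a single-threaded stand-in. This is a simplified model under stated assumptions, not the SDK's code: `QueueingBatchSketch` and `getNumSubmitted` are hypothetical names, strings stand in for assets, and the distribution of work across threads is ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one batch; it models the queue-then-submit contract only. */
final class QueueingBatchSketch {
    private final int maxSize;
    private final List<String> queued = new ArrayList<>();
    private long numSubmitted = 0;

    QueueingBatchSketch(int maxSize) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be at least 1");
        }
        this.maxSize = maxSize;
    }

    /** Queues one item; returns the submitted items once the batch is full, otherwise null. */
    List<String> add(String single) {
        queued.add(single);
        return queued.size() >= maxSize ? submit() : null;
    }

    /** Submits whatever is still queued, even if the batch is not yet full. */
    void flush() {
        if (!queued.isEmpty()) {
            submit();
        }
    }

    long getNumSubmitted() {
        return numSubmitted;
    }

    private List<String> submit() {
        List<String> sent = new ArrayList<>(queued); // stands in for one API call
        queued.clear();
        numSubmitted += sent.size();
        return sent;
    }
}
```

With a maxSize of 2, the first add returns null, the second returns both items, and a third item stays queued until flush is called. Skipping the final flush would leave that last partial batch unsent.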
    • getNumCreated

      public long getNumCreated()
      Number of assets that were created (no details, only a count).
      Returns:
      a count of the number of created assets, across all parallel batches
    • getNumUpdated

      public long getNumUpdated()
      Number of assets that were updated (no details, only a count).
      Returns:
      a count of the number of updated assets, across all parallel batches
    • getNumRestored

      public long getNumRestored()
      Number of assets that were potentially restored from being archived, or otherwise touched without actually being updated (no details, just a count).
      Returns:
      a count of the number of potentially restored assets, across all parallel batches
    • getNumSkipped

      public long getNumSkipped()
      Number of assets that were skipped during processing (no details, just a count).
      Returns:
      a count of the number of skipped assets, across all parallel batches
    • getCreated

      public OffHeapAssetCache getCreated()
      Assets that were created (minimal info only).
      Returns:
      all created assets, across all parallel batches
    • getUpdated

      public OffHeapAssetCache getUpdated()
      Assets that were updated (minimal info only).
      Returns:
      all updated assets, across all parallel batches
    • getRestored

      public OffHeapAssetCache getRestored()
      Assets that were potentially restored from being archived, or otherwise touched without actually being updated (minimal info only).
      Returns:
      all potentially restored assets, across all parallel batches
    • getFailures

      public List<AssetBatch.FailedBatch> getFailures()
      Batches that failed to be committed (only populated when captureFailures is set to true).
      Returns:
      all batches that failed, across all parallel batches
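The difference between raising and capturing failures can be sketched as follows. This is an illustrative model only: `FailureCaptureSketch`, its nested `FailedBatch` class, and the `submit` method taking a sender callback are hypothetical and do not mirror `AssetBatch.FailedBatch` or the SDK's internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in showing capture-versus-raise handling of failed batches. */
final class FailureCaptureSketch {

    /** Retains the items of a failed batch together with the cause of the failure. */
    static final class FailedBatch {
        final List<String> items;
        final RuntimeException cause;

        FailedBatch(List<String> items, RuntimeException cause) {
            this.items = items;
            this.cause = cause;
        }
    }

    private final boolean captureFailures;
    private final List<FailedBatch> failures = new ArrayList<>();

    FailureCaptureSketch(boolean captureFailures) {
        this.captureFailures = captureFailures;
    }

    /** Sends one batch; on failure either rethrows or retains the batch, depending on the flag. */
    void submit(List<String> items, Consumer<List<String>> sender) {
        try {
            sender.accept(items);
        } catch (RuntimeException e) {
            if (!captureFailures) {
                throw e;
            }
            // Every retained batch keeps its items in memory until it is inspected.
            failures.add(new FailedBatch(List.copyOf(items), e));
        }
    }

    List<FailedBatch> getFailures() {
        return List.copyOf(failures);
    }
}
```

Because each captured failure holds on to the full contents of its batch, the retained list grows with every failed submission. That growth is the memory concern noted for captureFailures.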
    • getSkipped

      public OffHeapAssetCache getSkipped()
      Assets that were skipped because updateOnly was requested and they do not exist in Atlan.
      Returns:
      all assets that were skipped, across all parallel batches
    • getResolvedGuids

      public Map<String,String> getResolvedGuids()
      Map from placeholder GUID to resolved (actual) GUID, for all assets that were processed through the batch.
      Returns:
      all resolved GUIDs, across all parallel batches
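A map like the one returned by getResolvedGuids is typically used to translate a placeholder GUID assigned before saving into the GUID the asset actually received. The sketch below assumes only a plain `Map<String,String>`. The class name `GuidResolutionSketch`, the `resolve` helper, and the GUID values in the usage note are invented for illustration.

```java
import java.util.Map;

/** Hypothetical helper for looking up resolved GUIDs; not part of the SDK. */
final class GuidResolutionSketch {

    /**
     * Returns the resolved (actual) GUID for a placeholder, or the input unchanged
     * when the map holds no entry for it (for example, when it is already a real GUID).
     */
    static String resolve(Map<String, String> resolvedGuids, String guid) {
        return resolvedGuids.getOrDefault(guid, guid);
    }
}
```

For example, with a map containing the made-up entry "-1" to "a1b2c3", resolving "-1" yields "a1b2c3", while resolving a GUID that is not in the map returns it unchanged.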
    • getResolvedQualifiedNames

      public Map<AssetBatch.AssetIdentity,String> getResolvedQualifiedNames()
      Map from case-insensitive qualifiedName to resolved (actual) qualifiedName, for all assets that were processed through the batch. Note: this is only populated when caseSensitive is false, and is otherwise empty.
      Returns:
      all resolved qualifiedNames, across all parallel batches
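The idea of resolving a qualifiedName case-insensitively can be sketched with a lookup keyed on a normalized identity. How the SDK builds `AssetBatch.AssetIdentity` keys is not stated above, so the lower-casing and the `typeName::qualifiedName` key format here are assumptions, and `QualifiedNameResolutionSketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for case-insensitive qualifiedName resolution; not SDK code. */
final class QualifiedNameResolutionSketch {
    private final Map<String, String> resolved = new HashMap<>();

    /** Records the actual qualifiedName of an asset found in the target system. */
    void record(String typeName, String actualQualifiedName) {
        resolved.put(key(typeName, actualQualifiedName), actualQualifiedName);
    }

    /** Returns the actual qualifiedName for any casing of the name, or null if unknown. */
    String resolve(String typeName, String qualifiedName) {
        return resolved.get(key(typeName, qualifiedName));
    }

    // Assumption: identity is the type name plus the lower-cased qualifiedName.
    private static String key(String typeName, String qualifiedName) {
        return typeName + "::" + qualifiedName.toLowerCase(Locale.ROOT);
    }
}
```

The type name is kept in the key so that a table and a view sharing a qualifiedName do not collide.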
    • close

      public void close() throws IOException
      Close the batch by freeing up any resources it has used. Note: this will clear any internal caches of results, so only call this after you have processed those!
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException - on any problems freeing up resources
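Because the class implements Closeable, it fits a try-with-resources block, as long as results are read inside the block before close clears them. The sketch below models only that ordering with a JDK-only stand-in: `ClosableResultsSketch`, `recordCreated`, and `countBeforeClose` are hypothetical names, and a plain list stands in for the result cache.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in whose cached results are cleared on close. */
final class ClosableResultsSketch implements Closeable {
    private final List<String> created = new ArrayList<>();

    void recordCreated(String name) {
        created.add(name);
    }

    List<String> getCreated() {
        return created;
    }

    /** Frees resources; any cached results are gone afterwards. */
    @Override
    public void close() {
        created.clear();
    }

    /** Reads the result inside the try block, so the value is taken before close runs. */
    static int countBeforeClose(List<String> names) {
        try (ClosableResultsSketch batch = new ClosableResultsSketch()) {
            names.forEach(batch::recordCreated);
            return batch.getCreated().size();
        }
    }
}
```

Reading `getCreated()` after the try block has exited would see an empty cache in this model, which is the pitfall the note on close warns about.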