How to retrieve blobs

Blobs can be retrieved through the API either all at once or in chunks.

Introduction

The following two methods can be used to retrieve blobs from the Content Manager repository:
  • Retrieve complete blobs using one API call
  • Retrieve blobs in chunks when you are expecting large blobs

Retrieving complete blobs using one API call

Methods in this category return blobs as base64-encoded XML snippets inside an ObjectList XML Structure. A single call returns multiple objects together with their metadata. Because the blobs are base64-encoded, you must decode them before they can be used. An example of such a method is DocumentObj.RetrieveObjects.
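
As a minimal sketch of the decoding step, the following Python snippet extracts and decodes base64 blobs from a returned XML string. The ishdata element name, the function name, and the sample response are assumptions made for illustration only; check the ObjectList XML Structure reference for the actual layout.

```python
import base64
import xml.etree.ElementTree as ET


def decode_blobs(object_list_xml: str) -> dict[str, bytes]:
    """Map each object's ishlngref to its decoded blob bytes."""
    root = ET.fromstring(object_list_xml)
    blobs = {}
    for obj in root.iter("ishobject"):
        # "ishdata" is an assumed name for the element that holds the blob.
        data = obj.find("ishdata")
        if data is None or not (data.text or "").strip():
            continue  # metadata only, no embedded blob
        blobs[obj.get("ishlngref")] = base64.b64decode(data.text)
    return blobs


# Hypothetical response, shortened for illustration.
sample = (
    "<ishobjects>"
    '<ishobject ishlngref="101"><ishdata>SGVsbG8=</ishdata></ishobject>'
    '<ishobject ishlngref="102"/>'
    "</ishobjects>"
)
print(decode_blobs(sample))
```

Objects without an embedded blob are skipped, so the same function can be run on a metadata-only response.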

Although this approach is fairly easy, it has one major disadvantage: because all blobs are embedded in the returned XML structure, the response can become very large, which can cause out-of-memory errors or performance problems.

If you decide to use this method, we advise splitting the retrieval over multiple API calls:
  1. Retrieve the metadata of the objects that you want to download.
  2. Use the ishlngref attribute in the returned ObjectList XML Structure to create a list of the language objects.
  3. For each group of x language objects, retrieve the metadata and the blobs using DocumentObj.RetrieveObjectsByIshLngRefs. The number of objects in a group depends on the expected maximum size of the blobs.
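
The grouping in steps 2 and 3 could look like the sketch below. It collects the ishlngref values from a metadata response and splits them into fixed-size batches; each batch would then be passed to DocumentObj.RetrieveObjectsByIshLngRefs. The function names, the batch size, and the sample XML are hypothetical, and the API call itself is left out because its signature depends on how you access the API.

```python
import xml.etree.ElementTree as ET
from typing import Iterator


def collect_lng_refs(object_list_xml: str) -> list[str]:
    """Return the ishlngref of every ishobject element in the response."""
    root = ET.fromstring(object_list_xml)
    return [
        obj.get("ishlngref")
        for obj in root.iter("ishobject")
        if obj.get("ishlngref")
    ]


def batches(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive groups of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical metadata response with five language objects.
metadata_xml = (
    "<ishobjects>"
    + "".join(f'<ishobject ishlngref="{n}"/>' for n in range(1, 6))
    + "</ishobjects>"
)
lng_refs = collect_lng_refs(metadata_xml)
for group in batches(lng_refs, batch_size=2):
    # Here you would call DocumentObj.RetrieveObjectsByIshLngRefs for this group.
    print(group)
```

Choose a smaller batch size when the blobs are expected to be large, so that each response stays within a size you can comfortably hold in memory.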

Retrieving blobs in chunks

Blobs can also be retrieved one by one and in chunks, using a "GetNextChunk" method where one is available (e.g. PublicationOutput.GetNextDataObjectChunkByIshLngRef).

Execute the following steps to get the blobs:
  1. Retrieve the objects from which you would like to download the blobs.
  2. Get the unique identifier (e.g. the ishlngref attribute) of each object from the returned XML structure.
  3. For each object, get the information about the actual content (e.g. size, edt).
  4. Call the "GetNextChunk" method in a loop until the full size of the object has been retrieved.

Example for the output of a publication

Because the output of a publication can be very large, you must retrieve its content with a "GetNextChunk" method, namely PublicationOutput.GetNextDataObjectChunkByIshLngRef.

The steps below (with some additional information) show how you could download published output from the Content Manager application:
  1. Find the language objects of the publication (= publication output objects) from which you would like to download the output(s).
  2. From the returned ObjectList XML Structure (each ishobject element represents a publication output object), retrieve all publication output references using the ishlngref attribute.
  3. For each ishlngref, execute the following routine to download the content:
    1. Use PublicationOutput.GetDataObjectInfoByIshLngRef to retrieve all information about the actual content of the publication output.
    2. From the returned Data Objects XML Structure retrieve the following values:
      • The ed attribute with the GUID of the actual content blob.
      • The size attribute with the number of bytes consumed by the content.
    3. Call PublicationOutput.GetNextDataObjectChunkByIshLngRef in a loop until the full size of the object has been retrieved:
      • The parameter plLngRef should be filled with the value of the ishlngref that you are currently processing.
      • The parameter psEdGUId should be filled with the value of the ed attribute.
      • The parameter plOffSet, which is the current position in the content, should start at 0 and be increased on each iteration in order to retrieve the next chunk of content.
      • The parameter piSize should contain the chunk size that you want to retrieve. The advised chunk size is around 200 KB, although a larger chunk size such as 2 MB could perform better on high-latency connections.
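
The chunk loop in step 3 can be sketched as follows. The get_chunk callable stands in for PublicationOutput.GetNextDataObjectChunkByIshLngRef; its Python signature and its return value (the raw bytes of the chunk) are assumptions, since the real call's return format is not described here. If your API binding returns the chunk in an encoded form, decode it before writing. The in-memory fake exists only to make the sketch runnable.

```python
import io
from typing import BinaryIO, Callable

# Hypothetical wrapper signature: (lng_ref, ed_guid, offset, size) -> chunk bytes
GetChunk = Callable[[int, str, int, int], bytes]

CHUNK_SIZE = 200 * 1024  # roughly the advised 200 KB


def download_blob(
    get_chunk: GetChunk,
    lng_ref: int,
    ed_guid: str,
    total_size: int,
    out: BinaryIO,
    chunk_size: int = CHUNK_SIZE,
) -> int:
    """Write the blob to out chunk by chunk and return the bytes written."""
    offset = 0  # current position in the content, starts at 0
    while offset < total_size:
        wanted = min(chunk_size, total_size - offset)
        chunk = get_chunk(lng_ref, ed_guid, offset, wanted)
        if not chunk:
            raise IOError(f"no data returned at offset {offset}")
        out.write(chunk)
        offset += len(chunk)  # advance by what was actually received
    return offset


# In-memory stand-in for the real API call, for illustration only.
content = bytes(range(256)) * 2000  # 512,000 bytes of sample content


def fake_get_chunk(lng_ref: int, ed_guid: str, offset: int, size: int) -> bytes:
    return content[offset:offset + size]


buffer = io.BytesIO()
written = download_blob(fake_get_chunk, 101, "HYPOTHETICAL-GUID", len(content), buffer)
print(written, buffer.getvalue() == content)
```

Writing each chunk straight to a file or stream, rather than collecting chunks in a list, keeps memory use bounded by the chunk size regardless of the size of the publication output.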