How to retrieve blobs
Blobs can be retrieved through the API either all at once or in chunks.
Introduction
- Retrieve complete blobs using one API call
- Retrieve blobs in chunks when you expect large blobs
Retrieving complete blobs using one API call
This category of API methods returns blobs as base64-encoded XML snippets in an ObjectList XML Structure. One call returns multiple objects together with their metadata. Because the blobs are base64-encoded, you must decode them before you can use them. An example of such a method is DocumentObj.RetrieveObjects.
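The decoding step can be sketched as follows. This is a minimal illustration, not the product's client code: it assumes that each ishobject element holds its blob as base64 text in a child element named ishdata. That element name, the function name decode_blobs, and the sample XML are assumptions made for the example, so check them against the ObjectList XML Structure your server actually returns.

```python
import base64
import xml.etree.ElementTree as ET


def decode_blobs(object_list_xml: str) -> dict:
    """Map each object's ishlngref to its decoded blob bytes."""
    root = ET.fromstring(object_list_xml)
    blobs = {}
    # Assumption: the blob is the base64 text of a child element called
    # "ishdata". Adjust the name if the real response differs.
    for obj in root.iter("ishobject"):
        data = obj.find("ishdata")
        if data is None or not (data.text or "").strip():
            continue  # metadata-only object, nothing to decode
        blobs[obj.get("ishlngref")] = base64.b64decode(data.text.strip())
    return blobs


# Hypothetical, heavily simplified response used only for illustration.
sample = (
    "<ishobjects>"
    '<ishobject ishlngref="101"><ishdata>SGVsbG8=</ishdata></ishobject>'
    '<ishobject ishlngref="102"/>'
    "</ishobjects>"
)
decoded = decode_blobs(sample)
```

The whole response is parsed in memory here, which is exactly why this style of retrieval becomes costly once the embedded blobs are large.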
Although this approach is fairly easy, it has one major disadvantage: because all blobs are embedded in the returned XML structure, the response can become very large, which can cause out-of-memory errors, performance issues, or both. To limit the size of each response, retrieve the blobs in groups:
- Retrieve the metadata of the objects that you want to download.
- Use the ishlngref attribute in the returned ObjectList XML Structure to create a list of the language objects.
- For every group of x language objects, retrieve the metadata and the blobs using DocumentObj.RetrieveObjectsByIshLngRefs. The number of objects in a group depends on the expected maximum size of the blobs.
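The grouping described in these steps could look like the sketch below. The helper names batches and retrieve_in_batches are made up for this example, and the retrieve argument stands in for whatever wrapper you have around DocumentObj.RetrieveObjectsByIshLngRefs; its signature is an assumption, not the documented one.

```python
from typing import Callable, Iterator, List, Sequence


def batches(lng_refs: Sequence[int], batch_size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive groups of at most batch_size language references."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(lng_refs), batch_size):
        yield lng_refs[start:start + batch_size]


def retrieve_in_batches(
    lng_refs: Sequence[int],
    batch_size: int,
    retrieve: Callable[[List[int]], object],
) -> list:
    """Call retrieve once per group and collect the responses in order."""
    # Assumption: retrieve wraps DocumentObj.RetrieveObjectsByIshLngRefs and
    # accepts a plain list of ishlngref values.
    return [retrieve(list(group)) for group in batches(lng_refs, batch_size)]


# Stand-in for the real call: it only reports how many objects were requested.
responses = retrieve_in_batches([11, 12, 13, 14, 15], 2, lambda refs: len(refs))
```

A smaller batch_size keeps each response small at the cost of more round trips, so pick it from the largest blob you expect rather than from the number of objects.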
Retrieving blobs in chunks
Blobs can also be retrieved one by one and in chunks, using a "GetNextChunk" method where one is available (e.g. PublicationOutput.GetNextDataObjectChunkByIshLngRef).
- Retrieve the objects from which you would like to download the blobs.
- Get the unique identifier (e.g. the ishlngref attribute) of each object from the returned XML structure.
- For each object, get the information about the actual content (e.g. size, edt).
- Call the "GetNextChunk" method in a loop until the full size of the object has been retrieved.
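The loop in the last step can be sketched generically. The function download_in_chunks and the get_next_chunk callable are hypothetical: the callable stands in for a wrapper around a "GetNextChunk" method and is assumed to take an offset and a requested size and to return the raw bytes of that chunk. How the real method returns its data is not covered here.

```python
from typing import Callable


def download_in_chunks(
    total_size: int,
    get_next_chunk: Callable[[int, int], bytes],
    chunk_size: int = 200 * 1024,
) -> bytes:
    """Collect a blob of total_size bytes by requesting it chunk by chunk."""
    buffer = bytearray()
    offset = 0  # current position in the content, starts at 0
    while offset < total_size:
        # Assumption: the wrapper returns at most chunk_size bytes from offset.
        chunk = get_next_chunk(offset, chunk_size)
        if not chunk:
            # Guard against an endless loop if the server stops sending data.
            raise IOError(f"no data returned at offset {offset}")
        buffer.extend(chunk)
        offset += len(chunk)  # advance by what was actually received
    return bytes(buffer)


# In-memory stand-in for the server, used only to exercise the loop.
content = bytes(range(256)) * 10
result = download_in_chunks(
    len(content),
    lambda offset, size: content[offset:offset + size],
    chunk_size=1000,
)
```

Advancing the offset by the number of bytes received, rather than by the requested size, keeps the loop correct when the final chunk is shorter than the others. For very large outputs, writing each chunk to a file instead of a buffer avoids holding the whole blob in memory.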
Example for the output of a publication
Because the output of a publication can be very large, you have to retrieve its content with a "GetNextChunk" method. For publication output, that method is PublicationOutput.GetNextDataObjectChunkByIshLngRef.
Below are the steps, with some additional information, for downloading a published output from the Content Manager application:
- Find the language objects of the publication (= publication output objects) from which you would like to download the output(s).
- From the returned ObjectList XML Structure (each ishobject element represents a publication output object), retrieve all publication output references using the ishlngref attribute.
- For each ishlngref, execute the following routine to download the content:
  - Use PublicationOutput.GetDataObjectInfoByIshLngRef to retrieve all information about the actual content of the publication output.
  - From the returned Data Objects XML Structure, retrieve the following values:
    - The ed attribute with the GUID of the actual content blob.
    - The size attribute with the number of bytes consumed by the content.
  - Call PublicationOutput.GetNextDataObjectChunkByIshLngRef in a loop until the full size of the object has been retrieved:
    - The parameter plLngRef should be filled with the value of the ishlngref that you are currently processing.
    - The parameter psEdGUId should be filled with the value of the ed attribute.
    - The parameter plOffSet, which is the current position in the content, should start at 0 and be increased on each iteration in order to retrieve the next chunk of content.
    - The parameter piSize should contain the chunk size that you want to retrieve. The advised chunk size is around 200 KB, although a bigger chunk size such as 2 MB could perform better on high-latency connections.
- The parameter