...

Used to get all item metadata. This method returns a list of MetaDataValue objects (see Document Object JSON Schema). MetaDataValue objects are unicode values (see Python Unicode).

...

Code Block
languagepy
document.get_meta_data_value('name', 'origin', reverse)

Parameters

Parameter | Type | Description
name | Required: string | The name of the metadata
origin | [Optional] string | The metadata value set by one of the following components:

Name | Description
crawler | The metadata value set during the Crawling stage
converter | The metadata value set during the Processing stage
mapping | The metadata value set during the Mapping stage

If no value is supplied and the reverse value is True, the most recent origin is considered, i.e. crawler in preconversion and mapping in postconversion.

reverse | [Optional] boolean | Whether to scan the metadata origin in reverse order or not. The default value is True, meaning that the value is fetched from the latest indexing pipeline stage with a non-empty value.
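
The origin and reverse behavior described above can be pictured with a small stand-alone model. This is only a sketch of the documented lookup, not the extension runner's implementation: the resolve_value helper, the ORIGINS list, and the nested dictionary layout are hypothetical, and the forward scan used when reverse is False is an assumption.

```python
# Hypothetical model of the lookup described above; not the actual runner code.
# Origins are listed in pipeline order: Crawling, Processing, Mapping.
ORIGINS = ['crawler', 'converter', 'mapping']


def resolve_value(values_by_origin, name, origin=None, reverse=True):
    """Return the list of values for `name`, or [] when nothing is set.

    values_by_origin maps an origin to a {metadata name: [values]} dict.
    """
    if origin is not None:
        # An explicit origin restricts the lookup to that single component.
        return values_by_origin.get(origin, {}).get(name, [])
    # reverse=True scans from the latest stage back to the earliest one.
    # Assumption: reverse=False scans from the earliest stage forward.
    scan_order = list(reversed(ORIGINS)) if reverse else list(ORIGINS)
    for candidate in scan_order:
        values = values_by_origin.get(candidate, {}).get(name, [])
        if values:  # skip stages where the value is empty
            return values
    return []


# Made-up sample data: the mapping stage left 'title' empty.
sample = {
    'crawler': {'title': ['Raw title'], 'author': ['jdoe']},
    'converter': {'title': ['Converted title']},
    'mapping': {'title': []},
}

latest_title = resolve_value(sample, 'title')
crawler_title = resolve_value(sample, 'title', origin='crawler')
earliest_title = resolve_value(sample, 'title', reverse=False)
```

In this model, the default call skips the empty mapping value and falls back to the converter value, which matches the "latest indexing pipeline stage with a non-empty value" wording.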

...

Used to add item metadata. This method can also be used to unset or override existing item metadata.

Code Block
languagepy
document.add_meta_data({'myMetadataName':'myMetadataValue'})

...

Use this method in your extension script to tag source items with a relevant indexing message that is sent to the Coveo Cloud V2 source logs. Log messages are useful when you want to edit, debug, or troubleshoot an extension script. For instance, it is common practice to use a try/except block to log an error as a string in the source logs. It is recommended to use the Log method because outputting text to a field as a form of logging can seriously bloat the index. For instance, using the Get Metadata method to output the metadata content to a field is bad practice.
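
The try/except logging pattern mentioned above might look like the following sketch. The FakeDocument class and the log function defined here are hypothetical stand-ins so that the snippet runs outside the extension runner (in an actual extension, document and log are provided by the platform), and the pagecount metadata name and raw value are made up for illustration.

```python
# Hypothetical stand-ins so the sketch runs on its own. They only record
# what was called and do not reflect the platform implementation.
logged_messages = []


def log(message):
    logged_messages.append(message)


class FakeDocument(object):
    def __init__(self):
        self.meta_data = {}

    def add_meta_data(self, values):
        self.meta_data.update(values)


document = FakeDocument()

# Made-up raw value; in practice it might come from item metadata.
raw_page_count = 'twelve'

try:
    # int() raises ValueError here, so no metadata is added.
    document.add_meta_data({'pagecount': int(raw_page_count)})
except ValueError as error:
    # Log the error as a string instead of writing it to a field.
    log('Could not parse page count: ' + str(error))
```

Logging the error keeps the diagnostic in the source logs and leaves the index fields untouched.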

...

Code Block
languagejs
{
	"PermissionSets": 
	[
		{
			"AllowAnonymous": false, 
			"DeniedPermissions": [], 
			"Name": "", 
			"AllowedPermissions": []
		}
	], 
	"Name": ""
}, 
{
	"PermissionSets": 
	[
		{
			"AllowAnonymous": false, 
			"DeniedPermissions": [], 
			"Name": "View All Data Members", 
			"AllowedPermissions": 
			[
				{
					"SecurityProvider": "SALESFORCE-00Df40000000SAbEAM", 
					"IdentityType": "virtualgroup", 
					"Identity": "ViewAll:Irrelevant:", 
					"AdditionalInfo": {}
				}, 
				{
					"SecurityProvider": "SALESFORCE-00Df40000000SAbEAM", 
					"IdentityType": "virtualgroup", 
					"Identity": "ObjectAccess:ViewAllRecordsProfiles:Solution", 
					"AdditionalInfo": {}
				}, 
				{
					"SecurityProvider": "SALESFORCE-00Df40000000SAbEAM", 
					"IdentityType": "virtualgroup", 
					"Identity": "ObjectAccess:ViewAllRecordsPermissionSets:Solution", 
					"AdditionalInfo": {}
				}
			]
		}
	], 
	"Name": "View All Data"
}, 
{
	"PermissionSets":
	[
		{
			"AllowAnonymous": false,
			"DeniedPermissions": [],
			"Name": "Read access members",
			"AllowedPermissions":
			[
				{
					"SecurityProvider": "SALESFORCE-00Df40000000SAbEAM",
					"IdentityType": "virtualgroup",
					"Identity": "ObjectAccess:ReadRecordsProfiles:Solution",
					"AdditionalInfo": {}
				},
				{
					"SecurityProvider": "SALESFORCE-00Df40000000SAbEAM",
					"IdentityType": "virtualgroup",
					"Identity": "ObjectAccess:ReadRecordsPermissionSets:Solution",
					"AdditionalInfo": {}
				}
			]
		}
	],
	"Name": "Read Access & Sharing"
}

get_permissions() returns a list of extension_runner.PermissionLevel objects:

get_permissions() = [<extension_runner.PermissionLevel object at 0x7efe2c093390>]

type(get_permissions()) = <type 'list'>

type(get_permissions()[0]) = <class 'extension_runner.PermissionLevel'>

Clear All Permissions

Used to clear all item permissions.

Code Block
languagepy
document.clear_permissions()
Warning

Using this method can have major implications because, once the permissions are cleared, everybody gets access to the item.

Add Allowed Permission

Used to add an allowed security identity.

Code Block
languagepy
document.add_allowed(identity, identity_type, security_provider, {additional_info})

Parameters

Parameter | Type | Description
identity | Required: string | The name of the allowed security identity to add
identity_type | Required: string | The security identity type can be:

  • user
  • group
  • virtualgroup
  • unknown

security_provider | Required: string | The name of the security identity provider.

Sample value: 'Email Security Provider'

additional_info | dictionary of string | A collection of key value pairs that can be used to uniquely identify the security identity.
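
As an illustration of the parameters above, the following sketch adds one allowed user. The FakeDocument class is a hypothetical stand-in that only records the call so that the snippet runs on its own, the identity value is made up, and the check against the listed identity types is an optional safeguard rather than part of the API.

```python
# Hypothetical stand-in for the `document` object provided to extensions.
class FakeDocument(object):
    def __init__(self):
        self.allowed = []

    def add_allowed(self, identity, identity_type, security_provider, additional_info):
        self.allowed.append({
            'Identity': identity,
            'IdentityType': identity_type,
            'SecurityProvider': security_provider,
            'AdditionalInfo': additional_info,
        })


# The identity types listed in the parameter table above.
VALID_IDENTITY_TYPES = ('user', 'group', 'virtualgroup', 'unknown')

document = FakeDocument()

# Illustrative values only; the identity is made up.
identity = 'jdoe@example.com'
identity_type = 'user'

# Optional safeguard: fail early on a mistyped identity type.
if identity_type not in VALID_IDENTITY_TYPES:
    raise ValueError('Unsupported identity type: ' + identity_type)

document.add_allowed(identity, identity_type, 'Email Security Provider', {})
```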

...

Used to add a denied security identity.

Code Block
languagepy
document.add_denied(identity, identity_type, security_provider, {additional_info})

Parameters

Parameter | Type | Description
identity | Required: string | The name of the denied security identity to add
identity_type | Required: string | The security identity type can be:

  • user
  • group
  • virtualgroup
  • unknown

security_provider | Required: string | The name of the security identity provider.

Sample value: 'Email Security Provider'

additional_info | dictionary of string | A collection of key value pairs that can be used to uniquely identify the security identity.

...

Used to set item permissions. To set permissions, you must define a permission level, then a permission set, and then a permission.

Code Block
languagepy
document.set_permissions([PermissionLevelObjects])

Permission Level

Parameter | Type | Description
level_name | String | The name of the permission level.
permission_sets | Array of PermissionSet objects | Array of permission sets

...

Parameter | Type | Description
set_name | String | The name of the permission set.
allow_anonymous | Required: boolean | Whether to allow anonymous access.
allowed_permissions | Array of Permission objects | Array of allowed permissions
denied_permissions | Array of Permission objects | Array of denied permissions
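
The nesting of levels, sets, and permissions can be sketched with plain Python structures. The namedtuple classes below are hypothetical stand-ins whose fields mirror the tables above. The actual objects come from the extension runner and may be constructed differently, and the Permission fields are assumed to match the add_allowed parameters.

```python
from collections import namedtuple

# Hypothetical stand-ins; field names mirror the parameter tables.
Permission = namedtuple(
    'Permission',
    ['identity', 'identity_type', 'security_provider', 'additional_info'])
PermissionSet = namedtuple(
    'PermissionSet',
    ['set_name', 'allow_anonymous', 'allowed_permissions', 'denied_permissions'])
PermissionLevel = namedtuple('PermissionLevel', ['level_name', 'permission_sets'])

# 1. A permission describes one security identity (made-up values).
reader = Permission('jdoe@example.com', 'user', 'Email Security Provider', {})

# 2. A permission set groups allowed and denied permissions.
readers = PermissionSet(
    set_name='Readers',
    allow_anonymous=False,
    allowed_permissions=[reader],
    denied_permissions=[],
)

# 3. A permission level groups permission sets.
level = PermissionLevel(level_name='Read Access', permission_sets=[readers])

# A list of levels, mirroring the [PermissionLevelObjects] argument above.
permission_levels = [level]
```

The same three layers appear in the JSON representation of item permissions, where each level holds its PermissionSets and each set holds its AllowedPermissions and DeniedPermissions.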

...

Code Block
languagepy
document.get_data_stream('name', 'origin', reverse)
Example
Code Block
languagepy
# Read the item body text data stream and write it to a log message.
# While editing the IPE, select the Body Text checkbox because the IPE needs access to it.
myDataStream = document.get_data_stream('body_text').read()
log(str(myDataStream))
Tip:

For Web and Sitemap type sources, it is recommended to use the web scraping feature rather than extensions to do common HTML content processing such as excluding sections and extracting metadata (see Web Scraping Configuration).

Parameters

Parameter | Type | Description
name | Required: string | The available item data streams are:

  • documentdata
    The complete item binary content extracted by the Crawling stage of the indexing pipeline (see Coveo Cloud V2 Indexing Pipeline).

    Example

    The documentdata of a PDF file is the actual PDF file.

    The documentdata of a web page is the page HTML markup.

    You may want to retrieve the documentdata of an item in a Preconversion extension in rare cases where you want to modify the original item content.

    Example

    You indexed scanned items that are saved as image files. You want to index the text content of the images. You use a preconversion extension to read the documentdata of each image, send it to a third-party optical character recognition (OCR) service, and save the returned text back in the documentdata so that the Processing stage can prepare the text content for the Indexing stage.

    Getting the documentdata can significantly degrade indexing performance because the binary data of each item has to be fetched, decompressed, and decrypted.
    There is generally no point in getting and modifying the documentdata in a postconversion extension because the Indexing stage does not process it.

    Info

    In the Coveo Cloud administration console Add/Edit an Extension panel, the documentdata is referred to as the Original file.

  • body_text
    The complete textual content of an item extracted by the converter in the Processing stage of the indexing pipeline (see Coveo Cloud V2 Indexing Pipeline).
    You can get the body_text of each item in a postconversion extension for rare cases where you want to access and possibly modify the item text content.
    There is no point in getting and modifying the body_text in a preconversion extension because the Processing stage would overwrite it.

    Note:

    For index size and performance optimization, the body_text is limited in size to 50 MB. This means that for rare items with a larger body_text, the excess text is not indexed and is therefore not searchable.


  • body_html
    The complete HTML representation of an item created by the converter in the Processing stage of the indexing pipeline (see Coveo Cloud V2 Indexing Pipeline). The body_html appears in the Quick View of a search result item.
    You can get the body_html of each item in a postconversion extension for cases where you want to access and possibly modify the item text content.

    Example

    Your source indexes a question and answer website. Each question and each answer is indexed as a separate item even if they can come from the same HTML page. Your indexed items do not have the <head> elements from the original HTML page and therefore are missing resources such as CSS. Consequently, the Quick View for these items does not look good.

    You get the body_html in an extension and inject the appropriate <head> elements.

    There is no point in getting and modifying the body_html in a preconversion extension because the Processing stage would overwrite it.

    Notes:

    When you can define your desired body_html content as a static HTML markup containing metadata placeholders, it is generally simpler to use a mapping on the body field (see Add/Edit Mapping).

    For index size and performance optimization, the body_html is limited in size to 10 MB. This means that the Quick View of items with larger body_html will be truncated.

  • $thumbnail$
    The thumbnail image created by the converter in the Processing stage of the indexing pipeline for specific file types (Microsoft Word, Excel, PowerPoint, and Visio as well as many image file types such as JPG, BMP, GIF, TIF, PSD, PNG...).
    You can get the $thumbnail$ in a postconversion extension in the rare cases where you want to modify the thumbnail or extract information from the thumbnail image.
    Your thumbnail image can have any size, resolution, or format (as long as a browser can display it), but it is good practice to stick to a normalized image size and resolution.

    Info

    If you want to overwrite the thumbnail (or create one) you do not need to get the $thumbnail$.


origin | [Optional] string | The stream value set by one of the following components:

Name | Description
crawler | The stream value set during the Crawling stage
converter | The stream value set during the Processing stage
mapping | The stream value set during the Mapping stage

If no value is supplied and the reverse value is True, the most recent origin is considered, i.e. crawler in preconversion and mapping in postconversion.

reverse | [Optional] boolean | Whether to scan the stream origin in reverse order or not. The default value is True, meaning that the stream value is fetched from the latest indexing pipeline stage with a non-empty stream value.
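
The body_html example above (injecting the missing <head> elements) could rely on a small string helper like the one sketched here. The inject_head function name, its insertion logic, and the sample markup are hypothetical; the sketch only covers the string manipulation and assumes the HTML was already read with document.get_data_stream('body_html').read().

```python
# Hypothetical helper; the function name and the sample markup are made up.
def inject_head(body_html, head_markup):
    """Return body_html with head_markup added as a <head> element."""
    head = '<head>' + head_markup + '</head>'
    html_open = body_html.lower().find('<html')
    if html_open == -1:
        # No <html> wrapper: simply prepend the head.
        return head + body_html
    # Insert right after the end of the opening <html ...> tag.
    tag_end = body_html.find('>', html_open) + 1
    return body_html[:tag_end] + head + body_html[tag_end:]


# In an extension, this string would come from the body_html data stream.
answer_html = '<html><body><p>Answer text</p></body></html>'
styled_html = inject_head(answer_html, '<link rel="stylesheet" href="site.css">')
```

Keeping the helper free of any platform calls makes it easy to test outside the extension runner before the result is saved back to the item.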

Add Data Stream

Used to add or override an item data stream.

...