Yogesh Bahuguna

Created

27 September 2010

Requirements

Prerequisite knowledge

Some experience with using LiveCycle Contentspace ES2 and LiveCycle Workbench ES2 will be helpful.

User level

Beginning

In a few simple steps you can easily index the content you store in LiveCycle Contentspace ES2 on Google Search Appliance (GSA) so that people can search the content from outside the LiveCycle Contentspace ES2 interface. You just need to create a content rule and a LiveCycle process; no outside batch jobs or programs are needed.

Step 1: Create a rule

In LiveCycle Contentspace ES2, content rules can be created on documents or spaces to invoke any action (from a predefined set of actions) whenever a new document is created, an existing document is updated, or a document is downloaded. Since the goal of this article is to index the document's content and metadata whenever a document is created or updated, you'll need a rule on the Inbound condition and one on the Update condition.

LiveCycle Contentspace ES2 provides built-in actions, such as executing a script or invoking a LiveCycle process, that you can use to replicate the behavior of a batch process. For this article, you will use the Invoke A LiveCycle Process action, which lets you extract the content and metadata of the document to be indexed in GSA. The creation of the required process is covered in Step 2.

Note: When you create a rule on a parent space, you can choose to have it applied to documents inside sub spaces as well.

Follow these steps to create a new content rule in LiveCycle Contentspace ES2:

  1. Log in to LiveCycle Contentspace ES2 and open the space you want to work with.
  2. Click More Actions > Manage Content Rules, and then click Create Rule to open the Create Rule Wizard.
  3. In the Select Conditions section, specify the content that the rule will affect. If you want all content types to be indexed, select All Items; otherwise, select conditions based on type, name, and so on. You can define multiple conditions. Click Next.
  4. In the Select Actions section, select Invoke a LiveCycle Process. When you select this option all processes that have a single input parameter of type document will be shown. You need to select the process that you will be creating in Step 2. For example, if you create a process named KnowledgeCenter/KC_Gsa_Feed, you would select that here. Click Next.
  5. In the Enter Details section, select Inbound as the Type so that the rule will be invoked on inbound content. Type a title and description for the rule; for example, type GSA RULE for both. You can select the option to apply the rule to sub spaces if that is appropriate. Click Next.
  6. Verify the summary of the new rule (see Figure 1) and click Finish.

Step 2: Create a LiveCycle process

Next, you need to create the process invoked by the content rule. The process will extract the metadata as well as the content of the documents that are to be indexed. Since the process will only be invoked when a document is uploaded or updated, it requires only a single input of type document.

A document object contains the content as well as other document properties, which can be easily extracted in the process using getAttribute and other document object functions. When the process is invoked, the entire document that triggered the rule will be passed as a variable; in this case the variable is contentData.

The process shown in Figure 2 creates an XML feed and sends that feed to GSA by invoking an HTTP POST in a script (Java code). The XML feed can be created as required, based only on metadata or with the entire content as well. The process for this article sends the complete content via the XML feed in BASE64-encoded form.
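As a rough illustration of what such a feed might look like, the sketch below BASE64-encodes a document's bytes and wraps them in a single-record feed. The class and method names are hypothetical, and the element layout (header, group, record, content with encoding="base64binary") reflects the GSA feed format as commonly documented; check it against the feed DTD for your appliance version. The DOCTYPE declaration is left out here, and java.util.Base64 assumes Java 8 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of LiveCycle or GSA: builds a one-record feed.
final class GsaFeedSketch {

    /** Escapes characters that are unsafe inside XML text and attribute values. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Builds a feed with a single record whose content is BASE64 encoded. */
    static String buildFeed(String dataSource, String recordUrl,
                            String mimeType, byte[] content) {
        String encoded = Base64.getEncoder().encodeToString(content);
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<gsafeed>\n"
             + "  <header>\n"
             + "    <datasource>" + escapeXml(dataSource) + "</datasource>\n"
             + "    <feedtype>full</feedtype>\n"
             + "  </header>\n"
             + "  <group>\n"
             + "    <record url=\"" + escapeXml(recordUrl)
             + "\" mimetype=\"" + escapeXml(mimeType) + "\">\n"
             + "      <content encoding=\"base64binary\">" + encoded + "</content>\n"
             + "    </record>\n"
             + "  </group>\n"
             + "</gsafeed>\n";
    }
}
```

Calling GsaFeedSketch.buildFeed("ITKnowledgeCenter", recordUrl, "application/pdf", bytes) returns the feed as a string. Escaping the URL and data source matters because document URLs often contain ampersands, which would otherwise make the feed invalid XML.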

The first activity in the process establishes the GSA feed URL from the global settings (the gsafeedurl variable). The second activity reads document properties such as nodeId (see Figure 3) and encodes the content of the document in a BASE64-encoded string.

If there are any aspects applied to the documents that you want to include in the XML feed, use getDocAttribute() to retrieve the nodeId from the document (see Figure 3). You can then use the nodeId to retrieve the other content properties stored as different aspects using getContentAttributes in the next activity (the third activity in Figure 2). It will return a Map of all the properties, which you will need to parse to extract the required information.
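Parsing the returned Map could look something like the following sketch, which picks out a few properties and renders them as metadata elements for the feed. The class name, the method names, and the property keys used in the usage note ("title", "size") are hypothetical; the actual keys depend on the aspects applied to your documents, and the meta element layout is an assumption to verify against the GSA feed DTD.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: filters a property map and renders it as feed metadata.
final class ContentPropertiesSketch {

    /** Returns the requested properties as strings, skipping missing or null values. */
    static Map<String, String> select(Map<String, Object> props, String... wantedKeys) {
        Map<String, String> selected = new LinkedHashMap<>();
        for (String key : wantedKeys) {
            Object value = props.get(key);
            if (value != null) {
                selected.put(key, String.valueOf(value));
            }
        }
        return selected;
    }

    /** Escapes characters that are unsafe inside XML attribute values. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Renders the selected properties as meta elements inside a metadata element. */
    static String toMetaElements(Map<String, String> selected) {
        StringBuilder sb = new StringBuilder("<metadata>");
        for (Map.Entry<String, String> e : selected.entrySet()) {
            sb.append("<meta name=\"").append(escapeXml(e.getKey()))
              .append("\" content=\"").append(escapeXml(e.getValue()))
              .append("\"/>");
        }
        return sb.append("</metadata>").toString();
    }
}
```

For example, toMetaElements(select(props, "title", "size")) yields one meta element per property that is present. Properties that are absent from the map are skipped rather than emitted with empty values.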

Once you have all the information to create the XML feed, invoke the HTTP POST to the GSA URL using the script. Here is an example script:

    import java.net.*;
    import java.io.*;

    try {
        // Read the values prepared by the earlier activities from process data
        String docContent = patExecContext.getProcessDataStringValue("/process_data/@contentbase64");
        String appurl = patExecContext.getProcessDataStringValue("/process_data/@appurl");
        String docName = patExecContext.getProcessDataStringValue("/process_data/@contentName");
        String gsaFeedUrl = patExecContext.getProcessDataStringValue("/process_data/@gsafeedurl");

        // Create XML feed
        String xmlText = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?> "
            + "<!DOCTYPE gsafeed PUBLIC \"-//Google//DTD GSA Feeds//EN\" \"\"> "
            + "<gsafeed> " +
            . . .
            "</gsafeed>";

        // Build the request body from the feed type, data source, and feed XML
        String body = "feedtype=" + URLEncoder.encode("full", "UTF-8")
            + "&datasource=" + URLEncoder.encode("ITKnowledgeCenter", "UTF-8")
            + "&data=" + URLEncoder.encode(xmlText, "UTF-8");

        // Create connection
        URL url = new URL(gsaFeedUrl);
        HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
        urlConnection.setRequestMethod("POST");
        urlConnection.setDoInput(true);
        urlConnection.setDoOutput(true);
        urlConnection.setUseCaches(false);
        // The body above is URL-encoded; confirm that your GSA accepts it with this Content-Type
        urlConnection.setRequestProperty("Content-Type", "multipart/form-data");
        urlConnection.setRequestProperty("Content-Length", "" + body.length());
        urlConnection.connect();

        // Send the feed
        DataOutputStream outStream = new DataOutputStream(urlConnection.getOutputStream());
        outStream.writeBytes(body);
        outStream.flush();
        outStream.close();

        // Read the response so the request completes
        BufferedReader rd = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));
        String line;
        while ((line = rd.readLine()) != null) {
            // System.out.println(line);
        }
        rd.close();
    } catch (Exception ex) {
        System.out.println("Exception caught:\n" + ex.toString());
    }

Note: The script above is only sample code to illustrate the topics covered in this article. It may not follow all best practices.
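If you want to test the body-building part of the script outside LiveCycle, one option is to pull it into a small helper like the sketch below. The class and method names are hypothetical. It builds the same feedtype/datasource/data body as the script and measures Content-Length in bytes rather than characters, which only differs when the body contains non-ASCII characters. A body in this form is conventionally labeled application/x-www-form-urlencoded; which content type your appliance expects is something to verify against its feed documentation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the URL-encoded request body used by the script.
final class FeedFormBodySketch {

    /** URL-encodes a value as UTF-8; UTF-8 is always available, so failure is unexpected. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    /** Builds the body with the three fields sent by the script: feedtype, datasource, data. */
    static String buildBody(String feedType, String dataSource, String xml) {
        return "feedtype=" + encode(feedType)
             + "&datasource=" + encode(dataSource)
             + "&data=" + encode(xml);
    }

    /** Returns the body length in bytes, the unit that Content-Length is measured in. */
    static int contentLength(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

With this in place, the script could call buildBody("full", "ITKnowledgeCenter", xmlText) and pass contentLength(body) to the Content-Length header.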

Where to go from here

The content rule you created handles GSA indexing only on upload; you can create a new rule that invokes the process on update as well.

For more details on content rules, see Getting Started with LiveCycle Contentspace ES2.

For more details on managing processes, see Application Development Using LiveCycle Workbench ES2.