27 September 2010
Some experience with using LiveCycle Contentspace ES2 and LiveCycle Workbench ES2 will be helpful.
Beginning
In a few simple steps you can easily index the content you store in LiveCycle Contentspace ES2 on Google Search Appliance (GSA) so that people can search the content from outside the LiveCycle Contentspace ES2 interface. You just need to create a content rule and a LiveCycle process; no outside batch jobs or programs are needed.
In LiveCycle Contentspace ES2, content rules can be created on documents or spaces to invoke any action (from a predefined set of actions) whenever a new document is created, an existing document is updated, or a document is downloaded. Since the goal of this article is to index the document's content and metadata whenever a document is created or updated, you'll need a rule on the Inbound condition and one on the Update condition.
LiveCycle Contentspace ES2 provides built-in actions—such as executing a script or invoking a LiveCycle process—that you can use to replicate the behavior of any batch process. For this article, you will use the Invoke A LiveCycle Process action, which will enable you to easily extract the content and metadata of the article or video to be indexed in GSA. The creation of the required process is covered in Step 2.
Note: When you create a rule on a parent space, you can choose to have it applied to documents inside sub spaces as well.
Follow these steps to create a new content rule in LiveCycle Contentspace ES2:
Next, you need to create the process invoked by the content rule. The process will extract the metadata as well as the content of the documents that are to be indexed. Since the process will only be invoked when a document is uploaded or updated, it requires only a single input of type document.
A document object contains the content as well as other document properties, which can be easily extracted in the process using getAttribute and other document object functions. When the process is invoked, the entire document that triggered the rule will be passed as a variable; in this case the variable is contentData.
The process shown in Figure 2 creates an XML feed and sends that feed to GSA by invoking an HTTP POST in a script (Java code). The XML feed can be created as required, based only on metadata or with the entire content as well. The process for this article sends the complete content via the XML feed in BASE64-encoded form.
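To make the shape of such a feed concrete, here is a minimal sketch of a feed with a single record that carries one metadata field and the BASE64-encoded content. The class name, method names, parameters, and the "title" metadata field are assumptions for illustration, and the element layout reflects the commonly documented gsafeed structure, so check it against the feed DTD for your appliance version.

```java
public class GsaFeedSketch {

    /** Escapes the five XML special characters for text and attribute values. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /**
     * Builds a feed containing one record. All parameter names are
     * hypothetical; contentBase64 must already be BASE64 encoded.
     */
    static String buildFeed(String datasource, String feedType, String recordUrl,
                            String mimeType, String title, String contentBase64) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<!DOCTYPE gsafeed PUBLIC \"-//Google//DTD GSA Feeds//EN\" \"\">\n");
        xml.append("<gsafeed>\n");
        xml.append("  <header>\n");
        xml.append("    <datasource>").append(escapeXml(datasource)).append("</datasource>\n");
        xml.append("    <feedtype>").append(escapeXml(feedType)).append("</feedtype>\n");
        xml.append("  </header>\n");
        xml.append("  <group>\n");
        xml.append("    <record url=\"").append(escapeXml(recordUrl))
           .append("\" mimetype=\"").append(escapeXml(mimeType)).append("\">\n");
        xml.append("      <metadata>\n");
        xml.append("        <meta name=\"title\" content=\"")
           .append(escapeXml(title)).append("\"/>\n");
        xml.append("      </metadata>\n");
        xml.append("      <content encoding=\"base64binary\">")
           .append(contentBase64).append("</content>\n");
        xml.append("    </record>\n");
        xml.append("  </group>\n");
        xml.append("</gsafeed>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // Example values only; the URL and title are placeholders.
        System.out.println(buildFeed("ITKnowledgeCenter", "full",
                "http://example.com/doc?id=1", "application/pdf",
                "Sample document", "QUJD"));
    }
}
```

Escaping the metadata values matters because document names and titles can contain characters such as & or quotation marks that would otherwise make the feed invalid XML.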
The first activity in the process establishes the GSA feed URL from the global settings (the gsafeedurl variable). The second activity reads document properties such as nodeId (see Figure 3) and encodes the content of the document in a BASE64-encoded string.
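Outside of a LiveCycle process, the encoding step can be illustrated with plain Java. The following is only a sketch: the class and method names are hypothetical, and java.util.Base64 requires Java 8 or later, so inside the process you would rely on whatever encoding facility your LiveCycle environment provides.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64ContentSketch {

    /** Encodes raw document bytes as a single-line BASE64 string. */
    static String encodeContent(byte[] content) {
        return Base64.getEncoder().encodeToString(content);
    }

    public static void main(String[] args) {
        // Stand-in for the bytes of the document that triggered the rule.
        byte[] sample = "Hello, GSA".getBytes(StandardCharsets.UTF_8);
        System.out.println(encodeContent(sample));
    }
}
```

Encoding the bytes rather than a decoded string keeps binary formats such as PDF intact when they are embedded in the XML feed.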
If there are any aspects applied to the documents that you want to include in the XML feed, use getDocAttribute() to retrieve the nodeId from the document (see Figure 3). You can then use the nodeId to retrieve the other content properties stored as different aspects using getContentAttributes in the next activity (the third activity in Figure 2). It will return a Map of all the properties, which you will need to parse to extract the required information.
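Parsing the returned Map usually amounts to picking out the keys you care about and converting their values to strings for the feed. Here is a minimal sketch in plain Java; the class name, the method name, and the "title" and "author" keys are assumptions for illustration, since the actual key names depend on the aspects applied to your documents.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ContentAttributesSketch {

    /**
     * Copies the requested keys from the attribute map, skipping any
     * that are missing or null, and converts the values to strings.
     */
    static Map<String, String> pick(Map<String, Object> attributes, String... keys) {
        Map<String, String> picked = new LinkedHashMap<>();
        for (String key : keys) {
            Object value = attributes.get(key);
            if (value != null) {
                picked.put(key, String.valueOf(value));
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        // Stand-in for the Map returned by the content attributes activity.
        Map<String, Object> attributes = new LinkedHashMap<>();
        attributes.put("title", "Sample document");
        attributes.put("author", "jdoe");
        attributes.put("size", 1024);

        System.out.println(pick(attributes, "title", "author", "description"));
    }
}
```

Skipping missing keys avoids writing empty metadata entries into the feed for documents that do not carry a given aspect.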
Once you have all the information to create the XML feed, invoke the HTTP POST to the GSA URL using the script. Here is an example script:
import java.net.*;
import java.io.*;

try
{
    // Read the values prepared by the earlier activities from the process data
    String docContent = patExecContext.getProcessDataStringValue("/process_data/@contentbase64");
    String appurl = patExecContext.getProcessDataStringValue("/process_data/@appurl");
    String docName = patExecContext.getProcessDataStringValue("/process_data/@contentName");
    String gsaFeedUrl = patExecContext.getProcessDataStringValue("/process_data/@gsafeedurl");

    // Create XML feed
    String xmlText = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?> " +
        "<!DOCTYPE gsafeed PUBLIC \"-//Google//DTD GSA Feeds//EN\" \"\"> " +
        "<gsafeed> " +
        .
        .
        .
        "</gsafeed>";

    // Build the form fields sent to the feed URL
    String body = "feedtype=" + URLEncoder.encode("full", "UTF-8") +
        "&datasource=" + URLEncoder.encode("ITKnowledgeCenter", "UTF-8") +
        "&data=" + URLEncoder.encode(xmlText, "UTF-8");
    byte[] bodyBytes = body.getBytes("UTF-8");

    // Create connection
    URL url = new URL(gsaFeedUrl);
    HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
    urlConnection.setRequestMethod("POST");
    urlConnection.setDoInput(true);
    urlConnection.setDoOutput(true);
    urlConnection.setUseCaches(false);
    // Note: the body built above is URL-encoded form data; confirm that this
    // content type matches what your appliance expects
    urlConnection.setRequestProperty("Content-Type", "multipart/form-data");
    // Content-Length is counted in bytes, not characters
    urlConnection.setRequestProperty("Content-Length", String.valueOf(bodyBytes.length));
    urlConnection.connect();

    // Send the request body
    DataOutputStream outStream = new DataOutputStream(urlConnection.getOutputStream());
    outStream.write(bodyBytes);
    outStream.flush();
    outStream.close();

    // Get the response
    BufferedReader rd = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));
    String line;
    while ((line = rd.readLine()) != null)
    {
        //System.out.println(line);
    }
    rd.close();
}
catch (Exception ex)
{
    System.out.println("Exception caught:\n" + ex.toString());
}
Note: The script above is only sample code to illustrate the topics covered in this article. It may not follow all best practices.
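The sample script sets the Content-Type header to multipart/form-data but builds a URL-encoded body. If your appliance expects a true multipart body, the three form fields could be assembled along the lines of the following sketch. The class name, method name, and boundary value are hypothetical, and you should verify the expected request format against the feed documentation for your appliance.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class MultipartBodySketch {

    /**
     * Builds a multipart/form-data body from simple text fields.
     * The boundary must not occur inside any field value.
     */
    static byte[] build(String boundary, Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.append("--").append(boundary).append("\r\n");
            body.append("Content-Disposition: form-data; name=\"")
                .append(field.getKey()).append("\"\r\n");
            body.append("\r\n");
            body.append(field.getValue()).append("\r\n");
        }
        // Closing delimiter marks the end of the last part
        body.append("--").append(boundary).append("--\r\n");
        return body.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String boundary = "----gsaFeedBoundary"; // hypothetical boundary value
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("feedtype", "full");
        fields.put("datasource", "ITKnowledgeCenter");
        fields.put("data", "<gsafeed>...</gsafeed>"); // placeholder for the XML feed

        byte[] body = build(boundary, fields);
        // The matching header would be:
        //   Content-Type: multipart/form-data; boundary=----gsaFeedBoundary
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

With this approach the field values are sent as-is rather than URL-encoded, and the Content-Type header must carry the same boundary string that is used in the body.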
The content rule you created handles GSA indexing only on upload; you can create a new rule that invokes the process on update as well.
For more details on content rules, see Getting Started with LiveCycle Contentspace ES2.
For more details on managing processes, see Application Development Using LiveCycle Workbench ES2.

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License