HP Structured Records Management Solution Tutorial Document release date: August 2011 Software release date: August 2011
Legal notices Warranty The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. The information contained herein is subject to change without notice. Restricted rights legend Confidential computer software.
Contents
About this document (page 5)
Intended audience (page 5)
Prerequisites (page 5)
New and revised information (page 6)
Related documentation
6 Deploying and running a business flow (page 37)
Deployment prerequisites (page 37)
Deploying the business flow (page 37)
Running the business flow (page 41)
Summary and next steps
About this document Using HP Database Archiving and HP TRIM in conjunction with one another provides a powerful means of safely retiring old, legacy data and applications within the bounds of your existing records management policies. This tutorial is designed to help you get started using HP Database Archiving to move eligible data from your production database into a structured records management system, HP TRIM.
New and revised information This document includes the following new and revised features in the HP SRMS software: • HP TRIM is now supported on 64-bit machines • For SRMS, you must apply hot fix 16 with HP Database Archiving 6.30 For more information about the hot fix, refer to http://quixy.deu.hp.
Convention Element You must supply a value for a variable parameter. ... • Indicates a repetition of the preceding parameter. • Example continues after omitted lines. Medium blue text: Figure 1 Cross-reference links and e-mail addresses Medium blue, underlined text (http://www.hp.
• Subscribing to this service provides you with e-mail updates on the latest product enhancements, versions of drivers, and firmware documentation updates as well as instant access to numerous other product resources. • After signing up, you can quickly locate your products under Product Category. Support You can visit the HP Software Support web site at: http://www.hp.com/go/hpsoftwaresupport HP Software Support Online provides an efficient way to access interactive technical support tools.
1 Structured records management concepts This chapter provides you with a conceptual overview of the archive building process and the tutorial itself.
• HP TRIM 7.10, HP Database Archiving software 6.30 with hot fix 7, and RQS 6.30 with hot fix 11 have been installed and configured in your environment by HP Enterprise Services for structured records management (HP TRIM Enabler Pack). • You have created a classification for the sales orders from DEMARC similar to the following: • You have installed a database that is supported by HP Database Archiving software 6.30. For details, see the HP Database Archiving software Installation manual.
Structured records management Records management traditionally concerned itself with information printed on paper. These records included: • narrative papers such as correspondence, memos, and policies. • non-narrative papers, such as inventories, general ledgers, and customer registers. When records management moved into the digital age, it took control of the electronic equivalents of narrative papers, namely unstructured information.
Used together, HP Database Archiving and HP TRIM provide just such a solution, with its many associated benefits.
Figure 1-1 Structured records management workflow
Define
This step defines the data model and the rules for the data to be extracted. Unlike unstructured information, which is stored in relatively well-defined containers in the shape of files, structured data is stored in a set of tables, some of which are active data tables and some of which serve as lookup tables.
NOTE The extraction process also can create an MD5 hash of each exported file and includes this in the summary file. These hashes can be used to validate that the files loaded into HP TRIM are identical to the files generated by HP Database Archiving. TIP You have the option to remove the data from the source system immediately upon its extraction or at some later time (deferred deletion).
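The hash comparison described in the note above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the check that HP TRIM or the SRMS Loader performs: the function names and the `expected_hashes` mapping (file name to MD5 hex string) are assumptions, and how you obtain the recorded hashes from the summary file depends on your installation.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected_hashes, directory):
    """Return the file names whose recomputed MD5 differs from the recorded one.

    expected_hashes is a hypothetical mapping of file name to the MD5 hex
    string recorded at extraction time. Files that are missing from the
    directory are reported as mismatches too.
    """
    mismatches = []
    for name, expected in expected_hashes.items():
        candidate = Path(directory) / name
        if not candidate.is_file() or md5_of_file(candidate) != expected.lower():
            mismatches.append(name)
    return mismatches
```

An empty list from `find_mismatches` would mean that every listed file is present and byte-for-byte identical to what was hashed at extraction time.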
Scalability
To achieve optimum performance and scalability with SRMS, you may spread the configuration across multiple machines where necessary. HP TRIM runs only on MS Windows, but HP Database Archiving can run on UNIX or MS Windows. The following SRMS configurations are possible:
• MS Windows only. In this case, all of the machines in the configuration are MS Windows systems. Note that one of the machines in this configuration must have HP TRIM and HP Database Archiving installed on it.
2 Configuring the Demarc data To follow the instructions in this tutorial, you must have the sample Demarc data set loaded in your database. This chapter explains how to obtain and load the Demarc data. This chapter includes: • Loading the Demarc data (page 15) • Summary and next steps (page 16) Loading the Demarc data The example in this tutorial is based upon the Demarc data set. You must install this schema and populate it before you can start the tutorial.
On UNIX:
./load_demo.sh oracle
./load_demo.sh sqlserver
./load_demo.sh sybase
./load_demo.sh db2
./load_demo.sh generic
NOTE The generic option is for JDBC/ODBC data sources.
5 Respond to the prompts. Default values are displayed next to the prompts inside square brackets [ ]. It may take a few minutes for the scripts to finish running.
TIP If you want to use a schema name other than DEMARC, enter the desired name when prompted for demo schema/username.
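If you start the loader from a wrapper rather than typing the command by hand, a small guard can catch a mistyped database option before the script runs. The following is a minimal sketch under stated assumptions: the helper names and the `working_dir` argument are hypothetical, and the only facts taken from this tutorial are the script name and its five options.

```python
import subprocess

# Database options listed for load_demo.sh in this tutorial.
SUPPORTED_DATABASES = ("oracle", "sqlserver", "sybase", "db2", "generic")


def build_load_command(database, script="./load_demo.sh"):
    """Return the argument list for the loader script, rejecting unknown options."""
    option = database.strip().lower()
    if option not in SUPPORTED_DATABASES:
        raise ValueError(f"unsupported database option: {database!r}")
    return [script, option]


def run_loader(database, working_dir):
    """Run the loader from the directory that contains the script.

    Hypothetical helper: the script still prompts interactively, because
    the child process inherits this process's standard input.
    """
    return subprocess.run(build_load_command(database), cwd=working_dir, check=True)
```

For example, `build_load_command("oracle")` yields the same command line you would type on UNIX, while an option such as `mysql` raises `ValueError` instead of reaching the script.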
3 Creating an archive definition This structured records management tutorial is based upon the same data model used in the general HP Database Archiving tutorial in Tutorial: Designing and deploying archive modules. This chapter walks you through the process of importing the solution project for the general tutorial. In subsequent chapters, you will modify this project for the structured records management case.
In the Name field, type DEMARC Orders App v1 SRMS as the name of your new project. 2 For Database, if you already created a connection to the database with DEMARC, choose that connection from the pull-down list. Otherwise, click New to set up a database connection for DEMARC. 3 Once the New Project dialog box is filled out, click OK. 4 Select File > Import. The Import dialog box displays. 5 Choose Existing Designer project from the list. 6 Click Next. The Import Existing Project dialog box appears.
2 Click New.
3 Click Next.
4 Enter a name for the HP TRIM connection, for example, trim_repos. The name you choose must match the value specified in srmsLoader_config.xml, which resides in the location where the HP TRIM Enabler Pack was installed and configured. For example: trim_repos ... If you are uncertain of the connection name or details, consult your DBA or HP TRIM administrator.
TIP Typically, a special user will have been created for this purpose. This user must have SELECT privileges on the TSFILEPLAN table. If you are not sure what user name and password to use for the HP TRIM database, contact your DBA or HP TRIM administrator. 8 Click Finish. 9 Click Close. Summary and next steps In this chapter you learned to: • Import a project and create a connection to the HP TRIM database The next step is to create a cartridge to actually archive and classify the data.
4 Creating a cartridge Once you have a working data model and a connection to HP TRIM, you can begin to create, convert, and classify cartridges.
5 If it is not already selected, select Orders as the model. 6 Optionally, click Annotation to add a comment. Click OK to exit the Annotation dialog box when you are done. 7 Click OK. 8 Click OK. The Database to File Cartridge editor appears.
Navigating in the cartridge editor If you look carefully at the bottom of the editor, you see a number of tabs, which correspond to the different parts of the cartridge you can edit. The first tab, Overview, is an overview of the cartridge. Each section on the page has a title that acts as a hyperlink to the corresponding tab. At the top of each page, you will find a link called Back to Overview, which returns you to the Overview page.
3 Review the data in the Preview tab. — The top part of the window shows the rows of the driving table. Select a row or range of rows in the top part of the window to filter the rows displayed in the bottom part. Use Ctl-click to select more than one row or clear the rows selected. — The Excluded By column displays the rule that caused a row to be excluded. All rows that are excluded are displayed in red. — Click on column headers to sort the rows by that value.
Classifying extracted data Once the cartridge is working satisfactorily, you can use the SRMS menu to select HP TRIM classifications for the extracted data. For example, in this case, your HP TRIM instance could have an Accounts Payables/Receivables classification called Customer Sales Orders. Given that the cartridge is archiving sales order data, you could classify all of the extracted data under Customer Sales Orders.
7 Choose SRMS > Classify. NOTE The names and hierarchy of your classifications may vary depending upon your HP TRIM instance. If necessary, you can substitute your own classification selection here. The key is to apply a valid HP TRIM classification to the cartridge. 8 Advanced concept Choose an appropriate classification for the data you plan to extract. For example, you could select Accounting > Accounts Payable /Receivable > Customer Sales Orders.
— Assign a default owner for records Once you know for what purposes you need the classification, you can design it to the detail required. Note that you can also import specific metadata values during the load process, or default the values of the HP TRIM record type. It is not mandatory to have a classification at all.
• Preview the data for your cartridge • Convert it to an SRMS cartridge • Apply a classification to the cartridge Once you are satisfied with your cartridge, you could deploy and run it by itself, but, for structured records management, you must perform some additional processing. To do that, you must create a business flow that runs the cartridge along with a Groovy script. When you deploy and run the business flow, the cartridge will be deployed and run with it.
5 Creating a business flow You can run your cartridge separately or as part of a larger workflow. For example, in the case of structured records management, you run your cartridge and then run a script to load the archive files into HP TRIM. This chapter includes: • Creating a business flow (page 29) • Adding a Groovy script (page 31) • Summary and next steps (page 36) Creating a business flow 1 Select File > New Business Flow. 2 Type Orders_D2F_BF_SRMS for the Name.
5 Click Archive and then click under the Start activity to place it.
6 In the Archive dialog, select Orders_D2F_SRMS in the Cartridge field, if not already selected.
7 Standard should be selected by default in the Selection field and Copy in the Data Movement field. If not, select those values.
NOTE You can select Archive instead of Copy, if required.
CAUTION For Deferred Deletes, you must select Copy only.
8 Click OK. You now have a business flow with one cartridge.
Adding a Groovy script
As it stands now, your business flow extracts the data from the database and puts it in a file, but you still need to ingest the file into HP TRIM. To do so, you add another activity that loads the data file into HP TRIM.
To add a Groovy script that loads the archived data file into HP TRIM:
1 Open the Orders_D2F_BF_SRMS business flow.
TIP Alternatively, you could add the necessary Groovy script just as you would any other Groovy script. Select the Groovy Script tool, click underneath the Orders_D2F_SRMS box, and select the SRMS > Call TRIM Loader template. 3 Double-click Call_SRMS_Loader to review the Groovy script. For the purposes of this tutorial, you can accept the default behavior of the basic template.
Calling the SRMS Loader Your configuration determines how you need to invoke the SRMS Loader to ingest your archive files into HP TRIM: NOTE In most cases, HP Database Archiving resides on a different machine than HP TRIM. HP Database Archiving can reside on MS Windows or UNIX, whereas HP TRIM can reside only on MS Windows.
• When HP Database Archiving and HP TRIM reside on heterogeneous operating systems with unshared file systems, you must provide a path specification, insert an ftp command for moving the archive files, and manually call the SRMS Loader. To achieve this result, you need to insert a Groovy script in the business flow editor and choose the SRMS > Call TRIM Loader (Advanced) template.
Substitution path specifications for archive files In the SRMS Loader call in your Groovy script, a path specification indicates a substitution path to apply to the HP Database Archiving metadata files. • DEFAULT_PATH indicates that the existing paths in the metadata files can be used as is. This setting is best when both HP Database Archiving and HP TRIM are on MS Windows systems with shared file systems and identical path specifications.
srmsMetadata.getdefaultConfigFile(), "ASYNCHRONOUS", "c:/temp/mydir/", "LOCAL")

This example call loads the output generated by all cartridges within the current business flow into HP TRIM:

import groovy.sql.Sql
import com.hp.ilm.db.extensions.srms.infrastructure.*

SrmsMetadata.sendToSRMS("TRIM", CURRENT_GROUP_RUN_ID, ENVIRONMENT_NAME, REPOS_DB, SrmsMetadata.
6 Deploying and running a business flow When the business flow definition is complete, you are ready to deploy it to the local or remote system where you plan to execute it. Alternatively, you could also generate it on the file system for future deployment on another system by you or someone else. This chapter describes how to set up the deployment environment, deploy and run a business flow in the environment, and monitor the business flow while it is running.
To deploy your business flow: 1 Return to Designer or restart if it is not currently open. 2 In the Project Navigator, right-click the Orders_D2F_BF_SRMS and select Deploy from the pop-up menu. TIP In the Deployment Assistant on the Deployment Type page, you can select Deploy Locally, if you installed the repository on the same database server where you are currently running Designer.
7 Click Next. The Deploy Environment page displays. 8 Choose the environment to which you want to deploy this business flow, for example, Oracle_OLTP. 9 Click Next. NOTE If you deployed database to database archiving as part of your environment setup, you are prompted for topology. If database to database archiving is not present, the only option is to archive from the active database and Deployment Assistant need not prompt you for topology.
12 Click Next. The Summary page shows a summary of the options you have selected. 13 Click Finish. You may have to wait a few minutes before the Deployment Finished dialog appears. 14 When the Deployment Finished dialog appears, click Show Log to show the log file. Review the log and ensure there are no errors or problems.
15 If you discovered errors in the previous step, click OK and step back through the Deployment Assistant to correct the problems. If there were no errors, click OK to close the log file. 16 Click OK to close the Deployment Finished dialog. 17 In the Deployment Assistant, if you specified Include Documentation, you should find a PDF file with your business flow’s documentation located in install_dir\obt\businessflow\environment_name.
4 Under Core parameters, notice that Extract file format is set to XML denormalized by default, which means that the archive file will be XML rather than comma separated values (CSV). 5 If the SRMS Loader was configured to perform hash checks, you need to perform the following steps to generate MD5 files: TIP If the ExecuteHashCheck parameter in srmsLoader_config.xml is set to true, the loader performs a hash check before loading the XML files into HP TRIM. This hash check requires an MD5 file.
For the purposes of this tutorial, the remainder of the parameters can use the default values. 6 Click Apply to accept any parameter changes. 7 (Optional) Since the business flow will create files on the file system, you might also want to confirm the exact location where HP Database Archiving will create the files. To perform this procedure, you need to be the admin user or another user with Manage Environment privileges. a Click Environment from the menu at the top of the page.
8 Click Launch from the menu at the top of the page. 9 Click Orders_D2F_BF_SRMS. The Launch page for that business flow appears. 10 Click Run. TIP For the purposes of this tutorial, you run the business flow manually. For your production systems, you would typically click Schedule to automate the running of the business flow. 11 Click Confirm when prompted. The business flow is launched and you are taken to a monitoring page that will periodically refresh with the latest status.
NOTE You may recall that, in Designer, we chose to run the loader with the ASYNCHRONOUS parameter. Hence, the business flow will not go into completed status until the files are actually loaded into HP TRIM.
7 Querying the archive After your data has been archived to file and ingested by HP TRIM, you can see the files as a record in HP TRIM. You can also query the data in the files by using the Record Query Server. This chapter describes how you can use the query server on MS Windows to look at the archived data in MS Excel.
5 Select the next document, which is an MD5 summary. This is created by HP Database Archiving as part of the extraction. It contains the MD5 hash for the summary.xml file and is used to check that the summary file has not been tampered with. After all the files have been imported into HP TRIM, it performs a validation of their contents against the MD5 files created by HP Database Archiving during extraction.
Searching the extract in HP TRIM HP TRIM also allows you to search the structured records. As with any other record in HP TRIM, you may need to query it for business reasons, a Freedom of Information request or an e-discovery exercise for litigation. All of the HP TRIM search methods are available, in the same way that they would be for other types of records. For example, you might search by Date Created.
Record Query Server (RQS) in HP TRIM provides you with direct query access to your data without reloading it into a relational database. Using RQS, you can access your data using standard SQL reporting and development tools on Windows and UNIX. You can even join your archived data with existing data in a database.
5 Click OK. The collection is saved to the server. This process may take a few moments.
6 You can now import the data into Microsoft Office Excel or another ODBC/JDBC client of your choice. Refer to Querying your collection in Microsoft Office Excel (page 51).
See also HP Database Archiving software Runtime guide
Querying your collection in Microsoft Office Excel
To quickly test your collection:
1 Open Microsoft Office Excel.
NOTE The steps in this section are based upon Microsoft Office 2003.
3 Select xmlArchive_SRMS*. 4 Click OK. 5 In the OpenAccess Login dialog, type your User Name and Password. The default user name is install and the default password is OA. 6 Click OK. The Query Wizard - Choose Columns page appears. 7 In the Available tables and columns list, expand the ORDER_HEADER node. 8 Select ORDERID. 9 Click the shuttle (>) to move ORDERID to the Columns in your query list. 10 Repeat step 8 and step 9 for CUSTOMERID, ORDERDATE, TOTAL, and STATUSID. 11 Click Next.
12 Click Next. No changes are necessary on the Query Wizard - Filter Data page. 13 Click Next. No changes are necessary on the Query Wizard - Sort Order page. 14 Select Return Data to Microsoft Office Excel in the Query Wizard - Finish page. 15 Click Finish. The Import Data dialog box appears. 16 Select New Worksheet. 17 Click OK. Your data is loaded into the spreadsheet and you can manipulate it as you would any other data in an Excel spreadsheet.
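The column selection made in the Query Wizard corresponds to a plain SQL SELECT against ORDER_HEADER, which is also what other ODBC/JDBC clients would send. The sketch below shows that query in Python. It is an illustration only: it runs against an in-memory SQLite table used as a stand-in for the archived collection, the rows and column types are made up, and connecting to the real query server would require its ODBC or JDBC driver, which is not shown here.

```python
import sqlite3

# Columns selected in the Query Wizard steps above.
COLUMNS = ["ORDERID", "CUSTOMERID", "ORDERDATE", "TOTAL", "STATUSID"]
QUERY = f"SELECT {', '.join(COLUMNS)} FROM ORDER_HEADER"


def run_demo_query():
    """Run QUERY against an in-memory stand-in table with made-up rows."""
    connection = sqlite3.connect(":memory:")
    try:
        # Hypothetical column types; the real archive defines its own.
        connection.execute(
            "CREATE TABLE ORDER_HEADER ("
            "ORDERID INTEGER, CUSTOMERID INTEGER, ORDERDATE TEXT, "
            "TOTAL REAL, STATUSID INTEGER)"
        )
        connection.executemany(
            "INSERT INTO ORDER_HEADER VALUES (?, ?, ?, ?, ?)",
            [(1, 100, "2001-03-15", 250.0, 3), (2, 101, "2001-04-02", 75.5, 3)],
        )
        return connection.execute(QUERY).fetchall()
    finally:
        connection.close()


if __name__ == "__main__":
    for row in run_demo_query():
        print(row)
```

Keeping the statement in a single `QUERY` string makes it easy to paste the same text into any SQL reporting tool that can reach the collection.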
— import your data into MS Excel You have now completed the basic HP Database Archiving structured records management tutorial.
A Troubleshooting This appendix describes some of the common issues and solutions that you may encounter when configuring and running HP Database Archiving and HP TRIM together for structured records management. • Summary of issues (page 55) • Design time issues (page 56) • Runtime issues (page 57) • Archive access issues (page 60) WARNING! This appendix only covers those issues specific to HP Database Archiving when integrated with HP TRIM for structured records management.
Design time issues
This section describes issues that might arise when you are designing an SRMS cartridge.
Unable to convert cartridge to SRMS
Symptom When you select SRMS > Convert to SRMS Cartridge, you receive an error similar to the following: HP Structured Records Management Solution is not configured. Please contact your HP Account Representative.
Cause You either do not have SRMS configured at all for this instance of HP Database Archiving or SRMS is misconfigured.
Solution
Runtime issues ... If you are uncertain of the connection name or details, consult your DBA or HP TRIM administrator. Warning when adding Groovy script Symptom When you choose SRMS > Add SRMS activity to Business Flow, you get a warning indicating that you have two instances of the Groovy script and you only need one: Business Flow already contains an SRMS Loader activity. Only one call per business flow is needed.
Exception while executing a Groovy script
Cause The archive data files were not found at the location specified in the Groovy script that calls the SRMS Loader. Hence, the SRMS Loader failed.
Solution Ensure that your path specification in the Groovy script matches the location of the archive data files.
Solution Upload the data to a different database.
Archive files not loaded into HP TRIM
Symptom Your cartridge successfully ran, and the XML files were generated, but the files were not loaded into HP TRIM.
Cause In order for your archive files to be loaded into HP TRIM, the business flow must successfully call the HP TRIM Loader to ingest the files into HP TRIM. If the loader is not called or fails for some reason, the files are not ingested into HP TRIM.
Solution
Archive access issues Solution • In the Web Console, under database to file parameters, ensure that the Checksum algorithm parameter is set to MD5 checksum. • If there truly is a difference between the loaded files and the generated files, you may need to discard the loaded files and regenerate new archive files in order to be certain that the files were not corrupted in some way.
Solution You can either avoid the .gz format for your extractions or use the RQS to query the records instead.
Sending to RQS fails
Symptom When you right-click the folder containing your XML documents in HP TRIM and select Send To > Prepare for RQS, you receive an error: Registration not complete for RunID x. Register with RQS process reported failure for RunID: x Detail: java.sql.SQLException: [DataDirect][OpenAccess SDK JDBC Driver]TCP/IP error, connection refused.
Glossary active database The database from which you plan to archive data. Typically, this database is your online transaction processing (OLTP) or production database. In a two or three tiered configuration, the active database resides on tier one and is the source for data movement operations. active environment The Web Console views and acts upon only one environment at a time, the active environment. To switch the active environment, you use the Change Active option in the Web Console.
business flow status The Web Console shows the last run of each business flow. The states are Complete/Error/Running. cartridge An instance of model- or schema-based eligibility criteria used to move or copy data from one location to another. Cartridges capture the application and business rules to ensure referential integrity of the data. For any one model in your project, you may have many cartridges that use it.
customization mode A Designer mode that provides visual cues to indicate customizations in the model. In a project with locked files, customization mode is on by default, but you can toggle it on and off from the toolbar in the model editor. data masking The process of replacing private or confidential data during movement with a specified mask. You can choose from pre-defined masks that are part of HP Database Archiving or create your own mask.
the type or version of a database or application, which can be obtained programmatically at deployment time. embedded repository A Java database, installed with HP Database Archiving, that can act as your repository database, where you store your HP Database Archiving metadata. Alternatively, your source database or another database can act as the repository database. environment The source and (optional) target credentials against which you plan to run commands.
lookup table A table that contains helpful non-transactional information. For example, non-transactional information could be status definitions, or the name of the sales representative. managed table A table in the model that is copied and then purged from the active database by a cartridge. Transactional, chaining, and driving tables in a model are all typically managed tables. model A model identifies the tables and table relationships representing a business entity or related business entities.
rule Qualifications added to the model in order to include or exclude data based on certain criteria. For example, you might add a rule to exclude from archiving any orders that are not yet closed. runtime parameter A type of parameter that has its values set by the operator executing the job in Console or on the command line. Typically, this type of parameter represents operational values that tend to change frequently and therefore need to be set each time the job is run.
transactional table A table that contains information about the business transaction. For example, a transactional table might contain detailed tax or payment information related to each business transaction. unique identifiers (UIDs) A 16-digit hexadecimal identifier calculated based on the content of a Designer file. This value is used to determine if the user has customized key pieces of a project. unmanaged table A table in a model that is copied but not purged from the active database by a cartridge.
Index A adding Groovy scripts 31 archiving data collections 50 data movement 30 query server 50 querying data from 51 audience intended 5 B business flows creating 29 deploying 37 running 41 start activity 29 C cartridges converting to SRMS 25 creating 21 editing 23 editor 23 failing to launch 58 model-based 21 previewing 23 schema-based 21 classifying cartridges, troubleshooting 56 concepts 12 HP TRIM 26 in Designer 25 menu item 26 collections archive data 50 connections to HP TRIM database 18 convention
documentation 38 Groovy scripts adding 31 adding SRMS activity 31 example loader calls 35 troubleshooting 57 H HP Subscriber’s choice web site 7 HP TRIM calling SRMS Loader 31 classifications 26 searching records 49 troubleshooting 59, 60, 61 viewing extracted data from 47 I ingesting concepts 13 L licensing, HP end user license agreement 2 logs showing deployment 40 M managing concepts 13 Microsoft Excel querying 51 models cartridges 21 P prerequisites product 5 software 9 previewing cartridges 23 projects
Groovy scripts 57 Groovy scripts, SRMS 57 HP TRIM 59, 60, 61 W web sites HP Subscriber’s choice 7 support 8