OpenERP Management System: OpenERP steps for Pentaho Kettle/PDI

Tuesday, December 11, 2012

OpenERP steps for Pentaho Kettle/PDI

Pentaho Kettle / PDI is a powerful tool to transfer data between systems.

If you're running OpenERP in together with other applications, and need them to share data, this should be the first place to look at. Data is extracted using "input steps", transformed and then loaded using "output steps".

Kettle has specialized steps for OpenERP, that interact with the OpenERP server in the proper way - through the XML-RPC API. Unfortunately these steps are not yet available in the stable version.

You try them out using the latest development version. The development versions, however, are not well suited for usage in a production environment. Wouldn't it be nice if you could install these OpenERP steps in a Kettle stable version?

Well, I gave a try at this, and got them working with the 4.3.0-stable version. Here's the recipe:

Download the trunk version and copy the following files to the stable version:

from \data-integration\plugins\kettle-openerp-plugin\*, and
from \data-integration\libext\webservices, the files:

ws-commons-util-1.0.2.jar
xmlrpc-client-3.1.3.jar, and
xmlrpc-common-3.1.3.jar.

Now (re)start Spoon and a "!BaseStep.Category.OpenERP!" will be available containing three new steps: Delete, Input and Output.

Note that for some reason the "Test" button in the "Database Connection" dialog doesn't work correctly: it reports an error when testing the OpenERP server connection, even though the connection is correct. I believe that this is a bug that has been fixed in later development branches.

A big advantage for using the OpenERP API instead of the direct access to the database tables is the requests go through the ORM, so you get improved data consistency and take advantage of the ORM's create() and write() logic, and model default values.

On the downside, these steps don't handle XML Ids, only database (integer) ids, so you need to set your own strategy to identify records, such as using "reference" or "code" fields. If your record identification relies on XML Ids, then import_data() base tools, such as import_sqlpush, may still be a good option.

11 comments:

M H J BramwellMarch 13, 2013 at 11:46 AM
I have had to alter your procedure considerably in order to achieve the same result with PDI 4.4.0-stable under Ubuntu 12.04 LTS.

Like you, I get the latest snapshot from here: http://ci.pentaho.com/view/Data%20Integration/job/Kettle/

From it I was able to obtain only the directory "kettle-openerp-plugin" and the jar "xmlrpc-common-3.1.3.jar". The latter was in a directory "lib", rather than "libext", as you indicate.

In order to get the others I had to use http://www.findjar.com/. I got:
http://mirrors.ibiblio.org/maven2/org/apache/ws/commons/util/ws-commons-util/1.0.2/ws-commons-util-1.0.2.jar
http://mirrors.ibiblio.org/maven2/org/apache/xmlrpc/xmlrpc-common/3.1.3/xmlrpc-common-3.1.3.jar

With those in place I get as far as defining a connection to my server and specifying a model name. At the moment of exiting the "Model Name" field I am hit with:
NoClassDefFoundError: com/debortoliwines/openerp/api/RowCollection

It turns out that the corresponding jar is no longer in the directory "kettle-openerp-plugin". I found a jar "/lib/openerp-java-api-1.0.0.jar" and put that in my "libext" directory.

With that done I am at last able to attempt to work with plugin without it failing.

Unfortunately, I still have not been able to do anything useful, due to a reported error in my data.

Unexpected error
org.pentaho.di.core.exception.KettleException:
Failed to commit batch:
Unknown database identifier '0'
Row :{}

I have no idea what that could mean.
ReplyDelete
Replies
M H J BramwellMarch 13, 2013 at 1:56 PM
I also experienced a failure when trying to test the connection OpenERP. I fixed it by copying (not moving) the jar "kettle-openerp-plugin-TRUNK-SNAPSHOT.jar" from the directory "plugins/kettle-openerp-plugin/" to the directory "libext". Apparently the class loader for connections testing doesn't look in the classpath of the corresponding plugin.
ReplyDelete
Replies
Daniel ReisMarch 13, 2013 at 3:07 PM
@MHJBramwell Thanks for sharing.
ReplyDelete
Replies
M H J BramwellMarch 13, 2013 at 4:46 PM
Do you have any idea how I can get past this . . . ?

Unknown database identifier '0'
Row :{}

I happens after launching the transformation.
ReplyDelete
Replies
M H J BramwellMarch 14, 2013 at 1:08 PM
It seems to be that the plugin is for OpenERP V6.1 and earlier.
ReplyDelete
Replies
Pieter van der MerweMarch 20, 2013 at 9:57 PM
You can implement the following fix to orm.py in V7 for the plugin to work:

--- openerp/osv/orm.py.orig 2013-03-20 22:21:17.892445229 +1100
+++ openerp/osv/orm.py 2013-03-20 22:05:12.474594534 +1100
@@ -1499,7 +1499,9 @@ class BaseModel(object):
except ValueError:
# in case of overridden id column
dbid = record['.id']
- if not self.search(cr, uid, [('id', '=', dbid)], context=context):
+ if dbid == 0:
+ dbid = False
+ elif not self.search(cr, uid, [('id', '=', dbid)], context=context):
log(dict(extras,
type='error',
record=stream.index,

I couldn't get the indentation to work in this comment. You can download the same diff file with better indentation from here
ReplyDelete
Replies
M H J BramwellMarch 26, 2013 at 10:49 AM
Thank you! I will look into it when I can. Lacking a way forward with Pentaho I've instead made my own explicit Google Spreadsheet to OpenERP V7 data pump.

https://github.com/martinhbramwell/GData_OpenERP_Data_Pump

Any comments greatly appreciated.
ReplyDelete
Replies
UnknownApril 24, 2013 at 8:46 PM
Hy,

Thanks for your work. I would like to add some little comments :

First, I read carefully the link here : http://wiki.pentaho.com/display/EAI/OpenERP+Object+Output
It is written : If no key value fields are specified, all incoming records are treated as new.
Unfortunately, if I don't use any key values fields, I have this issue : http://pastebin.com/k29yzgkZ

On a second part, if I want to precise a company_id it's not possible. So for multiple_company environment, how will it work?

ReplyDelete
Replies
UnknownApril 24, 2013 at 9:48 PM
I made some little proposals for improvements :
* http://jira.pentaho.com/browse/PDI-9816
* http://jira.pentaho.com/browse/PDI-9814
* http://jira.pentaho.com/browse/PDI-9813

Hope it could help
ReplyDelete
Replies
UnknownJune 10, 2013 at 11:27 AM
As can download these steps. thanks
ReplyDelete
Replies
Daniel ReisJune 11, 2013 at 9:59 AM
Joaquin: you can find in the latest development build, at http://ci.pentaho.com/view/Data%20Integration/job/Kettle
ReplyDelete
Replies

Add comment