You can’t Refactor your way out of every Problem

Refactoring is a disciplined way to clarify, retain or restore the design of a system as you make changes; to help clean up and correct the mistakes and mess that we all make as we work; to clear away the evidence of false starts, changes in direction and backtracking; and to help fill in gaps and misunderstandings. As a colleague of mine has pointed out, you can get a lot out of even the simplest and most obvious refactoring changes: eliminating duplication, changing variable and method names to be more meaningful, extracting methods, simplifying conditional logic, replacing a magic number with a named constant. These are easy things to do, and they give you a big return in understandability and maintainability. But refactoring has limitations – there are some problems that refactoring won't solve.

Refactoring can't help you if the design is fundamentally wrong

Some people naively believe that you can refactor your way out of any design mistake or misunderstanding – and that you can use refactoring as a substitute for upfront design. This assumes that you will be able to immediately recognize mistakes and gaps from customer feedback and correct the design as you are developing. But it can take a long time – usually only once the system is being used in the real world by real customers to do real things – before you learn how wrong you actually were and how much you missed and misunderstood. Exceptions and edge cases and defects pile up before you finally understand (or accept) that no, the design doesn't hold up; you can't just keep on extending it and patching what you have – you need a different set of abstractions or a different architecture entirely. Refactoring helps you make course corrections. But what if you find out that you've been driving the entire time in the wrong direction, or in circles?
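To make the "simple and obvious" refactorings mentioned above concrete, here is a small before-and-after sketch in Java. The class, constant and method names are hypothetical illustrations, not from the original post; it shows two of the listed refactorings: replacing a magic number with a named constant, and extracting a method.

```java
// Hypothetical example: before the refactoring, total() contained the
// inline expression "subtotal + subtotal * 0.13" with the magic number 0.13.
public class Invoice {

    // Replace magic number with named constant.
    private static final double TAX_RATE = 0.13;

    private final double subtotal;

    public Invoice(double subtotal) {
        this.subtotal = subtotal;
    }

    public double total() {
        // Extract method: the tax calculation now has an intention-revealing name.
        return subtotal + taxOn(subtotal);
    }

    private double taxOn(double amount) {
        return amount * TAX_RATE;
    }
}
```

The behavior is unchanged – total() still returns the subtotal plus 13% tax – but the magic number now has a name and the tax rule lives in exactly one place, which is the whole point of these low-cost refactorings.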
Barry Boehm, in Balancing Agility and Discipline, explains that starting simple and refactoring your way to the right answer sometimes falls down: "Experience to date also indicates that low-cost refactoring cannot be depended upon as projects scale up. The most serious problems that arise with simple design are problems known as 'architecture breakers'. These highly expensive problems can occur when early, simple design decisions result in foreseeable changes that cause breakage in design beyond the ability of refactoring to handle." This is another argument in the "Refactor or Design" holy war over how much design should be (or needs to be) done upfront, and how much can be filled in as you go through incremental change and refactoring.

Deep Decisions

Many design ideas can be refined, elaborated, iterated on and improved over time, and refactoring will help you with this. But some early decisions on approach, packaging, architecture and technology platform are too fundamental and too deep to change or correct with refactoring. You can use refactoring to replace in-house code with standard library calls, or to swap one library for another – doing the same thing in a different way. Making small design changes and cleaning things up as you go with refactoring can extend or fill in gaps in the design and implement cross-cutting features like logging and auditing, even access control and internationalization – this is what the XP approach to incremental design is all about.
But making small-scale design changes and improvements to code structure – extracting and moving methods, simplifying conditional logic, getting rid of case statements – isn't going to help you if your architecture won't scale, or if you chose the wrong approach (like SOA) or the wrong application framework (J2EE with Enterprise Java Beans, any multi-platform UI framework, any of the early O/R mapping frameworks – remember the first release of TopLink? – or something that you rolled yourself before you understood how the language actually worked), or the wrong language (if you found out that Ruby or PHP won't scale), or a core platform middleware technology that proves to be unreliable, that doesn't hold up under load, or that has been abandoned, or if you designed the system for the wrong kind of customer and need to change pretty much everything.

Refactoring to Patterns and Large Refactorings

Joshua Kerievsky's work on Refactoring to Patterns provides higher-level composite refactorings to improve – or introduce – structure in a system, by properly implementing well-understood design patterns such as factories, composites and observers, replacing conditional logic with strategies, and so on. Refactoring to Patterns helps with cleaning up and correcting problems like "duplicated code, long methods, conditional complexity, primitive obsession, indecent exposure, solution sprawl, alternative classes with different interfaces, lazy classes, large classes, combinatorial explosions and oddball solutions". Lippert and Roock's work on Large Refactorings explains how to take care of common architectural problems in and between classes, packages, subsystems and layers: doing makeovers of ugly inheritance hierarchies, reducing coupling between modules, cleaning up dependency tangles and correcting violations between architectural layers – the kind of things that tools like Structure 101 help you to see and understand.
They have identified a set of architectural smells and refactorings to correct them:

Smells in dependency graphs: visible dependency graphs, tree-like dependency graphs, cycles between classes, unused classes
Smells in inheritance hierarchies: parallel inheritance hierarchies, list-like inheritance hierarchy, inheritance hierarchy without polymorphic assignments, inheritance hierarchy too deep, subclasses without redefinitions
Smells in packages: unused packages, cycles between packages, too small/large packages, packages unclearly named, packages too deep or nesting unbalanced
Smells in subsystems: subsystem overgeneralized, subsystem API bypassed, subsystem too small/large, too many subsystems, no subsystems, subsystem API too large
Smells in layers: too many layers, no layers, strict layers violated, references between vertically separate layers, upward references in layers, inheritance between protocol-oriented layers (coupling)

Composite refactorings and large refactorings raise refactoring to higher levels of abstraction and usefulness, and show you how to identify problems on your own and how to come up with your own refactoring patterns and strategies. But refactoring to patterns, or even large-scale refactoring, still isn't enough to unmake or remake deep decisions or change the assumptions underlying the design and architecture of the system. Or to salvage code that isn't safe to refactor, or worth refactoring.

Sometimes you need to rewrite, not refactor

There is no end of argument over how bad code has to be before you should give up and rewrite it rather than trying to refactor your way through it. The best answer seems to be that refactoring should always be your first choice, even for legacy code that you didn't write and don't understand and can't test (there is an entire book written on how and where to start refactoring legacy apps).
But if the code isn't working, or is so unstable and so dangerous that trying to refactor it only introduces more problems; if you can't refactor or even patch it without creating new bugs; or if you need to refactor too much of the code to get it into acceptable shape (I've read somewhere that 20% is a good cut-off, but I can't find the reference), then it's time to declare technical bankruptcy and start again. Rewriting the code from scratch is sometimes your only choice. Some code shouldn't be – or can't be – saved.

'Sometimes code doesn't need small changes—it needs to be tossed out so that you can start over. If you find yourself in a major refactoring session, ask yourself whether instead you should be redesigning and reimplementing that section of code from the ground up.' – Steve McConnell, Code Complete

You can use refactoring to restore, repair, clean up or adapt the design or even the architecture of a system. Refactoring can help you go back and make corrections, reduce complexity, and fill in gaps. It will pay dividends in reducing the cost and risk of ongoing development and support. But refactoring isn't enough if you have to reframe the system – if you need to do something fundamentally different, or in a fundamentally different way – or if the code isn't worth salvaging. Don't get stuck believing that refactoring is always the right thing to do, or that you can refactor yourself out of every problem.

Reference: You can't Refactor your way out of every Problem from our JCG partner Jim Bird at the Building Real Software blog.
JavaEE Revisits Design Patterns: Decorator

This time last year I wrote a series of blog posts on the Java EE implementation of design patterns. Roughly a year later, I realized I had missed my favorite pattern: the decorator. The decorator pattern is basically a way to extend the functionality of an object by decorating it with other objects which can wrap the target object and add their own behavior to it. If you have never used or heard of decorators, I highly recommend reading chapter 3 of Head First Design Patterns. Much like the other patterns mentioned in my earlier posts, Java EE has an easy and elegant way to use the decorator pattern. Let's start with a simple stateless session bean.

package com.devchronicles.decorator;

import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;

/**
 * @author murat
 */
@Stateless
@TransactionAttribute(TransactionAttributeType.REQUIRED)
public class EventService {

    public void startService() {
        System.out.println("do something important here...");
    }
}

To start implementing the decorator pattern, we need an interface so we can bind the decorators and the object to be decorated together.

package com.devchronicles.decorator;

/**
 * @author murat
 */
public interface ServiceInterface {
    public void startService();
}

The interface has the method to which the decorators will add functionality. Next we need some changes to our existing EventService bean to make it decoratable.

package com.devchronicles.decorator;

import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;

/**
 * @author murat
 */
@Stateless
@TransactionAttribute(TransactionAttributeType.REQUIRED)
public class EventService implements ServiceInterface {

    public void startService() {
        System.out.println("do something important here...");
    }
}

Now we are ready to add as many decorators as we need. All we need to do is annotate our class, implement the ServiceInterface, and inject our service delegate.
package com.devchronicles.decorator;

import javax.decorator.Decorator;
import javax.decorator.Delegate;
import javax.inject.Inject;

/**
 * @author murat
 */
@Decorator // declares this class as a decorator
public class DecoratorService implements ServiceInterface { // must implement the service interface

    @Inject   // inject the service
    @Delegate // and annotate it as the delegate
    ServiceInterface service;

    @Override
    public void startService() { // implement the startService method to add functionality
        System.out.println("decorating the existing service!");
        service.startService(); // let the execution chain continue
    }
}

Several decorators can use the service interface.

package com.devchronicles.decorator;

import javax.decorator.Decorator;
import javax.decorator.Delegate;
import javax.inject.Inject;

/**
 * @author murat
 */
@Decorator
public class Decorator2Service implements ServiceInterface {

    @Inject
    @Delegate
    ServiceInterface service;

    @Override
    public void startService() {
        System.out.println("decorating the service even further!!!");
        service.startService();
    }
}

Most of the configuration can be done via annotations in Java EE 6. However, we still need some XML configuration to make decorators work. It might seem disappointing since we already annotated our decorators, but the configuration is pretty simple and is needed in order to declare the order of execution. Add the following lines to the empty beans.xml:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://java.sun.com/xml/ns/javaee"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/beans_1_0.xsd">
    <decorators>
        <class>com.devchronicles.decorator.DecoratorService</class>
        <class>com.devchronicles.decorator.Decorator2Service</class>
    </decorators>
</beans>

When the startService method of our EventService is executed, the decorators will decorate the EJB and add their own behavior to the execution.
...
INFO: WEB0671: Loading application [Decorator] at [/Decorator]
INFO: Decorator was successfully deployed in 2,534 milliseconds.
INFO: decorating the existing service!
INFO: decorating the service even further!!!
INFO: do something important here...
...

Reference: JavaEE Revisits Design Patterns: Decorator from our JCG partner Murat Yener at the Developer Chronicles blog.
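For comparison, the same decoration chain that the container assembles from beans.xml can be sketched in plain Java without CDI. The class names below are hypothetical and the wiring is done by hand; the methods return strings instead of printing, purely so the result is easy to inspect.

```java
// A minimal, container-free sketch of the decorator pattern.
public class PlainDecoratorDemo {

    interface ServiceInterface {
        String startService();
    }

    // The object being decorated (plays the role of the EventService EJB).
    static class EventService implements ServiceInterface {
        public String startService() {
            return "core";
        }
    }

    // A decorator wraps the delegate and adds behavior around the call,
    // just as the CDI @Decorator does via its @Inject @Delegate field.
    static class AuditDecorator implements ServiceInterface {
        private final ServiceInterface delegate;

        AuditDecorator(ServiceInterface delegate) {
            this.delegate = delegate;
        }

        public String startService() {
            return "audit->" + delegate.startService();
        }
    }

    public static void main(String[] args) {
        // Manual wiring replaces the <decorators> order in beans.xml.
        ServiceInterface service = new AuditDecorator(new EventService());
        System.out.println(service.startService()); // audit->core
    }
}
```

The hand-built chain makes the execution order explicit; in CDI, the order of the `<class>` entries in beans.xml plays the same role.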
Android – Select multiple photos from Gallery

Today I am going to discuss the implementation of selecting multiple photos from the photo gallery. I searched quite a bit to find a solution for selecting multiple images from Android's native gallery. There are two ways to implement multi-selection of images in the gallery: 1) an Intent for getting multiple images, and 2) a custom gallery that fetches and loads photos from the native gallery.

1. Intent for getting multiple images:

Being a fan of native solutions, I first tried the same approach we use to select a single image from the gallery:

Intent intent = new Intent();
intent.setType("image/*");
intent.setAction(Intent.ACTION_GET_CONTENT);
startActivityForResult(Intent.createChooser(intent, "Select Picture"), PICK_IMAGE);

But I was unable to select multiple images this way. I came to know that we can use the SEND_MULTIPLE intent, but I am not sure how to implement it to select multiple images. I will try to implement it and share the result if I succeed.

2. Define a custom gallery, fetching and loading photos from the native gallery:

Since we don't have an Intent-based solution, this is a good way to select multiple photos. A friend of mine, Vikas Kanani, already implemented this solution earlier. I did thorough testing and found the following issues:

- Images load very slowly if there is a large number of images, say 2000-5000
- The app crashes if we load more images

To resolve these issues, I implemented asynchronous image loading so that every image gets loaded asynchronously.

Now, let's implement the improved solution.

Step 1: Download the image loader library from Here.
Step 2: Add the library inside the libs folder, right-click the jar file -> select Add to Build Path.

Step 3: Define the row layout for an image item, row_multiphoto_item.xml:

<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="fill_parent"
    android:layout_height="fill_parent" >

    <ImageView
        android:id="@+id/imageView1"
        android:layout_width="100dp"
        android:layout_height="100dp"
        android:src="@drawable/ic_launcher" />

    <CheckBox
        android:id="@+id/checkBox1"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_alignRight="@+id/imageView1"
        android:layout_centerVertical="true" />

</RelativeLayout>

Step 4: Define the activity layout with a GridView, ac_image_grid.xml:

<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="fill_parent"
    android:layout_height="fill_parent" >

    <GridView
        android:id="@+id/gridview"
        android:layout_width="fill_parent"
        android:layout_height="fill_parent"
        android:columnWidth="100dip"
        android:gravity="center"
        android:horizontalSpacing="4dip"
        android:numColumns="auto_fit"
        android:stretchMode="columnWidth"
        android:layout_above="@+id/button1"
        android:verticalSpacing="2dip" />

    <Button
        android:id="@+id/button1"
        android:layout_alignParentBottom="true"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_centerHorizontal="true"
        android:onClick="btnChoosePhotosClick"
        android:text="Select Photos" />

</RelativeLayout>

Step 5: Define a UILApplication to declare application-level configuration settings, UILApplication.java:

package com.technotalkative.multiphotoselect;

import android.app.Application;

import com.nostra13.universalimageloader.cache.disc.naming.Md5FileNameGenerator;
import com.nostra13.universalimageloader.core.ImageLoader;
import com.nostra13.universalimageloader.core.ImageLoaderConfiguration;

/**
 * @author Paresh Mayani (@pareshmayani)
 */
public class UILApplication extends Application {

    @Override
    public void onCreate() {
        super.onCreate();

        // This configuration tuning is custom. You can tune every option, you may tune some of them,
        // or you can create a default configuration with ImageLoaderConfiguration.createDefault(this).
        ImageLoaderConfiguration config = new ImageLoaderConfiguration.Builder(getApplicationContext())
            .threadPoolSize(3)
            .threadPriority(Thread.NORM_PRIORITY - 2)
            .memoryCacheSize(1500000) // 1.5 MB
            .denyCacheImageMultipleSizesInMemory()
            .discCacheFileNameGenerator(new Md5FileNameGenerator())
            .enableLogging() // Not necessary in common use
            .build();
        // Initialize ImageLoader with the configuration.
        ImageLoader.getInstance().init(config);
    }
}

Step 6: Define a base activity to hold the singleton instance of the ImageLoader class, BaseActivity.java:

package com.technotalkative.multiphotoselect;

import android.app.Activity;

import com.nostra13.universalimageloader.core.ImageLoader;

/**
 * @author Paresh Mayani (@pareshmayani)
 */
public abstract class BaseActivity extends Activity {

    protected ImageLoader imageLoader = ImageLoader.getInstance();
}

Step 7: Now it's time to define the main activity, where we write the logic for fetching photos from the native gallery. Here I have also defined an ImageAdapter for the GridView.
MultiPhotoSelectActivity.java:

package com.technotalkative.multiphotoselect;

import java.util.ArrayList;

import android.content.Context;
import android.database.Cursor;
import android.graphics.Bitmap;
import android.os.Bundle;
import android.provider.MediaStore;
import android.util.Log;
import android.util.SparseBooleanArray;
import android.view.LayoutInflater;
import android.view.View;
import android.view.ViewGroup;
import android.view.animation.Animation;
import android.view.animation.AnimationUtils;
import android.widget.BaseAdapter;
import android.widget.CheckBox;
import android.widget.CompoundButton;
import android.widget.CompoundButton.OnCheckedChangeListener;
import android.widget.GridView;
import android.widget.ImageView;
import android.widget.Toast;

import com.nostra13.universalimageloader.core.DisplayImageOptions;
import com.nostra13.universalimageloader.core.assist.SimpleImageLoadingListener;

/**
 * @author Paresh Mayani (@pareshmayani)
 */
public class MultiPhotoSelectActivity extends BaseActivity {

    private ArrayList<String> imageUrls;
    private DisplayImageOptions options;
    private ImageAdapter imageAdapter;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.ac_image_grid);

        final String[] columns = { MediaStore.Images.Media.DATA, MediaStore.Images.Media._ID };
        final String orderBy = MediaStore.Images.Media.DATE_TAKEN;
        Cursor imagecursor = managedQuery(
                MediaStore.Images.Media.EXTERNAL_CONTENT_URI, columns, null, null, orderBy + " DESC");

        this.imageUrls = new ArrayList<String>();

        for (int i = 0; i < imagecursor.getCount(); i++) {
            imagecursor.moveToPosition(i);
            int dataColumnIndex = imagecursor.getColumnIndex(MediaStore.Images.Media.DATA);
            imageUrls.add(imagecursor.getString(dataColumnIndex));

            System.out.println("=====> Array path => " + imageUrls.get(i));
        }

        options = new DisplayImageOptions.Builder()
            .showStubImage(R.drawable.stub_image)
            .showImageForEmptyUri(R.drawable.image_for_empty_url)
            .cacheInMemory()
            .cacheOnDisc()
            .build();

        imageAdapter = new ImageAdapter(this, imageUrls);

        GridView gridView = (GridView) findViewById(R.id.gridview);
        gridView.setAdapter(imageAdapter);
        //gridView.setOnItemClickListener(new OnItemClickListener() {
        //    @Override
        //    public void onItemClick(AdapterView<?> parent, View view, int position, long id) {
        //        startImageGalleryActivity(position);
        //    }
        //});
    }

    @Override
    protected void onStop() {
        imageLoader.stop();
        super.onStop();
    }

    public void btnChoosePhotosClick(View v) {
        ArrayList<String> selectedItems = imageAdapter.getCheckedItems();
        Toast.makeText(MultiPhotoSelectActivity.this,
                "Total photos selected: " + selectedItems.size(), Toast.LENGTH_SHORT).show();
        Log.d(MultiPhotoSelectActivity.class.getSimpleName(),
                "Selected Items: " + selectedItems.toString());
    }

    /*private void startImageGalleryActivity(int position) {
        Intent intent = new Intent(this, ImagePagerActivity.class);
        intent.putExtra(Extra.IMAGES, imageUrls);
        intent.putExtra(Extra.IMAGE_POSITION, position);
        startActivity(intent);
    }*/

    public class ImageAdapter extends BaseAdapter {

        ArrayList<String> mList;
        LayoutInflater mInflater;
        Context mContext;
        SparseBooleanArray mSparseBooleanArray;

        public ImageAdapter(Context context, ArrayList<String> imageList) {
            mContext = context;
            mInflater = LayoutInflater.from(mContext);
            mSparseBooleanArray = new SparseBooleanArray();
            mList = new ArrayList<String>();
            this.mList = imageList;
        }

        public ArrayList<String> getCheckedItems() {
            ArrayList<String> mTempArry = new ArrayList<String>();

            for (int i = 0; i < mList.size(); i++) {
                if (mSparseBooleanArray.get(i)) {
                    mTempArry.add(mList.get(i));
                }
            }

            return mTempArry;
        }

        @Override
        public int getCount() {
            return imageUrls.size();
        }

        @Override
        public Object getItem(int position) {
            return null;
        }

        @Override
        public long getItemId(int position) {
            return position;
        }

        @Override
        public View getView(int position, View convertView, ViewGroup parent) {

            if (convertView == null) {
                convertView = mInflater.inflate(R.layout.row_multiphoto_item, null);
            }

            CheckBox mCheckBox = (CheckBox) convertView.findViewById(R.id.checkBox1);
            final ImageView imageView = (ImageView) convertView.findViewById(R.id.imageView1);

            imageLoader.displayImage("file://" + imageUrls.get(position), imageView, options,
                    new SimpleImageLoadingListener() {
                @Override
                public void onLoadingComplete(Bitmap loadedImage) {
                    Animation anim = AnimationUtils.loadAnimation(
                            MultiPhotoSelectActivity.this, R.anim.fade_in);
                    imageView.setAnimation(anim);
                    anim.start();
                }
            });

            mCheckBox.setTag(position);
            mCheckBox.setChecked(mSparseBooleanArray.get(position));
            mCheckBox.setOnCheckedChangeListener(mCheckedChangeListener);

            return convertView;
        }

        OnCheckedChangeListener mCheckedChangeListener = new OnCheckedChangeListener() {

            @Override
            public void onCheckedChanged(CompoundButton buttonView, boolean isChecked) {
                mSparseBooleanArray.put((Integer) buttonView.getTag(), isChecked);
            }
        };
    }
}

Download this example from here: Android – Select multiple photos from Gallery

Reference: Android – Select multiple photos from Gallery from our JCG partner Paresh N. Mayani at the TechnoTalkative blog.
Logging Hibernate SQL

There are two well-known ways to log Hibernate SQL in Grails; one is to set logSql = true in DataSource.groovy (either in the top-level block for all environments, or per-environment):

dataSource {
    dbCreate = ...
    url = ...
    ...
    logSql = true
}

and the other is to use a Log4j logging configuration:

log4j = {
    ...
    debug 'org.hibernate.SQL'
}

The problem with logSql is that it's too simple – it just dumps the SQL to stdout and there is no option to see the values that are being set for the positional ? parameters. The logging approach is far more configurable, since you can log to the console if you want, but you can also configure logging to a file, to a file just for these messages, or to any destination of your choice by using an Appender. But the logging approach is problematic too – by enabling a second Log4j category

log4j = {
    ...
    debug 'org.hibernate.SQL'
    trace 'org.hibernate.type'
}

we can see variable values, but you see them both for PreparedStatement sets and for ResultSet gets, and the gets can result in massive log files full of useless statements. This works because the "Type" classes that Hibernate uses to store and load Java class values to database columns (for example LongType, StringType, etc.) are in the org.hibernate.type package and extend (indirectly) org.hibernate.type.NullableType, which does the logging in its nullSafeSet and nullSafeGet methods. So if you have a GORM domain class

class Person {
    String name
}

and you save an instance

new Person(name: 'me').save()

you'll see output like this:

DEBUG hibernate.SQL - insert into person (id, version, name) values (null, ?, ?)
TRACE type.LongType - binding '0' to parameter: 1
TRACE type.StringType - binding 'me' to parameter: 2
DEBUG hibernate.SQL - call identity()

When you later run a query to get one or more instances

def allPeople = Person.list()

you'll see output like this:

DEBUG hibernate.SQL - select this_.id as id0_0_, this_.version as version0_0_, this_.name as name0_0_ from person this_
TRACE type.LongType - returning '1' as column: id0_0_
TRACE type.LongType - returning '0' as column: version0_0_
TRACE type.StringType - returning 'me' as column: name0_0_

This isn't bad for one instance, but if there were multiple results then you'd have a block for each result, containing a line for each column. I was talking about this yesterday at my Hibernate talk at SpringOne 2GX and realized that it should be possible to create a custom Appender that inspects log statements for these classes and ignores the statements resulting from ResultSet gets. To my surprise, it turns out that everything has changed in Grails 2.x because we upgraded from Hibernate 3.3 to 3.6, and this problem has already been addressed in Hibernate. The output above is actually from a 1.3.9 project that I created after I got unexpected output in a 2.1.1 application.
Here's what I saw in 2.1.1:

DEBUG hibernate.SQL - /* insert Person */ insert into person (id, version, name) values (null, ?, ?)
TRACE sql.BasicBinder - binding parameter [1] as [BIGINT] - 0
TRACE sql.BasicBinder - binding parameter [2] as [VARCHAR] - asd

and

DEBUG hibernate.SQL - /* load Author */ select author0_.id as id1_0_, author0_.version as version1_0_, author0_.name as name1_0_ from author author0_ where author0_.id=?
TRACE sql.BasicBinder - binding parameter [1] as [BIGINT] - 1
TRACE sql.BasicExtractor - found [0] as column [version1_0_]
TRACE sql.BasicExtractor - found [asd] as column [name1_0_]

So now, instead of doing all of the logging from the types' base class, it has been reworked to delegate to org.hibernate.type.descriptor.sql.BasicBinder and org.hibernate.type.descriptor.sql.BasicExtractor. This is great because now we can change the Log4j configuration to

log4j = {
    ...
    debug 'org.hibernate.SQL'
    trace 'org.hibernate.type.descriptor.sql.BasicBinder'
}

and have our cake and eat it too: the SQL is logged to a configurable Log4j destination, and only the PreparedStatement sets are logged. Note that the SQL looks different in the second examples not because of a change in Grails or Hibernate, but because I always enable SQL formatting (with format_sql) and comments (with use_sql_comments) in test apps, so when I do enable logging it ends up being more readable – and I forgot to do that for the 1.3 app:

hibernate {
    cache.use_second_level_cache = true
    cache.use_query_cache = false
    cache.region.factory_class = 'net.sf.ehcache.hibernate.EhCacheRegionFactory'
    format_sql = true
    use_sql_comments = true
}

Reference: Logging Hibernate SQL from our JCG partner Burt Beckwith at the An Army of Solipsists blog.
Introduction to PostgreSQL PL/java

Modern databases allow stored procedures to be written in a variety of languages. One commonly implemented language is Java.

N.B.: this article discusses the PostgreSQL-specific Java implementation. The details will vary with other databases, but the concepts will be the same.

Installation of PL/Java

Installation of PL/Java on an Ubuntu system is straightforward. I will first create a new template, template_java, so I can still create databases without the PL/Java extensions. At the command line, assuming you are a database superuser, enter:

# apt-get install postgresql-9.1
# apt-get install postgresql-9.1-pljava-gcj

$ createdb template_java
$ psql -d template_java -c "update pg_database set datistemplate='t' where datname='template_java'"
$ psql -d template_java -f /usr/share/postgresql-9.1-pljava/install.sql

Limitations

The prepackaged Ubuntu package uses the GNU GCJ Java implementation, not a standard OpenJDK or Sun implementation. GCJ compiles Java source files to native object code instead of byte code. The most recent versions of PL/Java are "trusted" – they can be relied upon to stay within their sandbox. Among other things this means that you can't access the filesystem on the server. If you must break the trust, there is a second language, 'javaU', that can be used. Untrusted functions can only be created by the database superuser. More importantly, this implementation is single-threaded. This is critical to keep in mind if you need to communicate with other servers.

Something to consider is whether you want to compile your own commonly used libraries with GCJ and load them into the PostgreSQL server as shared libraries. Shared libraries go in /usr/lib/postgresql/9.1/lib, and I may have more to say about this later.

Quick verification

We can easily check our installation by writing a quick test function.
Create a scratch database using template_java and enter the following SQL:

CREATE FUNCTION getsysprop(VARCHAR) RETURNS VARCHAR
AS 'java.lang.System.getProperty'
LANGUAGE java;

SELECT getsysprop('user.home');

You should get "/var/lib/postgresql" as a result.

Installing Our Own Methods

This is a nice start, but we don't really gain much if we can't call our own methods. Fortunately it isn't hard to add them. A simple PL/Java procedure is:

package sandbox;

public class PLJava {

    public static String hello(String name) {
        if (name == null) {
            return null;
        }

        return "Hello, " + name + "!";
    }
}

There are two simple rules for methods implementing PL/Java procedures:

- they must be public static
- they must return null if any parameter is null

That's it. Importing the Java class into the PostgreSQL server is simple. Let's assume that the package classes are in /tmp/sandbox.jar and our Java-enabled database is mydb. Our commands are then:

--
-- load java library
--
-- parameters:
--   url_path - where the library is located
--   url_name - how the library is referred to later
--   deploy   - should the deployment descriptor be used?
--
select sqlj.install_jar('file:///tmp/sandbox.jar', 'sandbox', true);

--
-- set classpath to include new library.
--
-- parameters:
--   schema    - schema (or database) name
--   classpath - colon-separated list of url_names.
--
select sqlj.set_classpath('mydb', 'sandbox');

-- -------------------
-- other procedures
-- -------------------

--
-- reload java library
--
select sqlj.replace_jar('file:///tmp/sandbox.jar', 'sandbox', true);

--
-- remove java library
--
-- parameters:
--   url_name - how the library is referred to later
--   undeploy - should the deployment descriptor be used?
--
select sqlj.remove_jar('sandbox', true);

--
-- list classpath
--
select sqlj.get_classpath('mydb');

It is important to remember to set the classpath.
Libraries are automatically removed from the classpath when they're unloaded, but they are NOT automatically added to the classpath when they're installed. We aren't quite finished – we still need to tell the system about our new function:

--
-- create function
--
CREATE FUNCTION mydb.hello(varchar) RETURNS varchar
AS 'sandbox.PLJava.hello'
LANGUAGE java;

--
-- drop this function
--
DROP FUNCTION mydb.hello(varchar);

We can now call our Java method in the same manner as any other stored procedure.

Deployment Descriptor

There's a headache here – it's necessary to explicitly create the functions when installing a library and drop them when removing a library. This is time-consuming and error-prone in all but the simplest cases. Fortunately there's a solution to this problem: deployment descriptors. The precise format is defined by ISO/IEC 9075-13:2003, but a simple example should suffice.

SQLActions[] = {
    "BEGIN INSTALL
        CREATE FUNCTION javatest.hello(varchar)
        RETURNS varchar
        AS 'sandbox.PLJava.hello'
        LANGUAGE java;
    END INSTALL",
    "BEGIN REMOVE
        DROP FUNCTION javatest.hello(varchar);
    END REMOVE"
}

You must tell the deployer about the deployment descriptor in the jar's MANIFEST.MF file. A sample Maven plugin configuration is:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-jar-plugin</artifactId>
    <version>2.3.1</version>
    <configuration>
        <archive>
            <manifestSections>
                <manifestSection>
                    <name>postgresql.ddr</name> <!-- filename -->
                    <manifestEntries>
                        <SQLJDeploymentDescriptor>TRUE</SQLJDeploymentDescriptor>
                    </manifestEntries>
                </manifestSection>
            </manifestSections>
        </archive>
    </configuration>
</plugin>

The database will now know about our methods as they are installed and removed.

Internal Queries

One of the 'big wins' with stored procedures is that queries are executed on the server itself and are MUCH faster than running them through the programmatic interface.
I’ve seen a process that required over 30 minutes via Java knocked down to a fraction of a second by simply moving the queried loop from the client to the server. The JDBC URL for the internal connection is “jdbc:default:connection”. You cannot use transactions (since you’re within the caller’s transaction) but you can use savepoints as long as you stay within a single call. I don’t know if you can use CallableStatements (other stored procedures) yet – you couldn’t in version 1.2, but the Ubuntu 11.10 package uses version 1.4.2.

Lists of scalar values are returned as Iterators in the java world and SETOF in the SQL world.

public static Iterator<String> colors() {
    List<String> colors = Arrays.asList("red", "green", "blue");
    return colors.iterator();
}

and

CREATE FUNCTION javatest.colors() RETURNS SETOF varchar
  AS 'sandbox.PLJava.colors'
  IMMUTABLE LANGUAGE java;

I’ve added the IMMUTABLE keyword since this function will always return the same values. This allows the database to perform caching and query optimization.

You don’t need to know the results, or even the size of the results, before you start. Following is a sequence that’s believed to always terminate, although this hasn’t been proven (the Collatz, or “hailstone”, sequence). As a sidenote this isn’t a complete solution since it doesn’t check for overflows – a correct implementation should either check for them or use BigInteger.

public static Iterator<Integer> seq(int start) {
    Iterator<Integer> iter = null;
    try {
        iter = new SeqIterator(start);
    } catch (IllegalArgumentException e) {
        // should log error...
    }
    return iter;
}

public static class SeqIterator implements Iterator<Integer> {
    private int next;
    private boolean done = false;

    public SeqIterator(int start) {
        if (start <= 0) {
            throw new IllegalArgumentException();
        }
        this.next = start;
    }

    @Override
    public boolean hasNext() {
        return !done;
    }

    @Override
    public Integer next() {
        int value = next;
        next = (next % 2 == 0) ? next / 2 : 3 * next + 1;
        done = (value == 1);
        return value;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}

CREATE FUNCTION javatest.seq(int) RETURNS SETOF int
  AS 'sandbox.PLJava.seq'
  IMMUTABLE LANGUAGE java;

All things being equal it is better to create each result as needed. This usually reduces the memory footprint and avoids unnecessary work if the query has a LIMIT clause.

Single Tuples

A single tuple is returned in a ResultSet.

public static boolean singleWord(ResultSet receiver) throws SQLException {
    receiver.updateString("English", "hello");
    receiver.updateString("Spanish", "hola");
    return true;
}

and

CREATE TYPE word AS (
    English varchar,
    Spanish varchar
);

CREATE FUNCTION javatest.single_word() RETURNS word
  AS 'sandbox.PLJava.singleWord'
  IMMUTABLE LANGUAGE java;

A valid result is indicated by returning true; a null result is indicated by returning false. A complex type can be passed into a java method in the same manner – it is a read-only ResultSet containing a single row.

Lists of Tuples

Returning lists of complex values requires a class implementing one of two interfaces.

org.postgresql.pljava.ResultSetProvider

A ResultSetProvider is used when the results can be created programmatically or on an as-needed basis.
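As an aside, the one-value-at-a-time iteration used by seq() can be exercised outside the database. The following standalone sketch (the class and method names are illustrative, not part of PL/Java) mirrors the arithmetic of SeqIterator:

```java
import java.util.ArrayList;
import java.util.List;

public class CollatzDemo {
    // One step of the iteration: halve even numbers, otherwise 3n + 1.
    static int step(int n) {
        return (n % 2 == 0) ? n / 2 : 3 * n + 1;
    }

    // Collect the full sequence from start down to 1 (inclusive).
    static List<Integer> sequence(int start) {
        List<Integer> values = new ArrayList<>();
        int n = start;
        values.add(n);
        while (n != 1) {
            n = step(n);
            values.add(n);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(sequence(6)); // [6, 3, 10, 5, 16, 8, 4, 2, 1]
    }
}
```

Each value is produced on demand, which is exactly the property that lets the database stop early on a LIMITed query.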
public static ResultSetProvider listWords() {
    return new WordProvider();
}

public static class WordProvider implements ResultSetProvider {
    private final Map<String, String> words = new HashMap<String, String>();
    private final Iterator<String> keys;

    public WordProvider() {
        words.put("one", "uno");
        words.put("two", "dos");
        words.put("three", "tres");
        words.put("four", "cuatro");
        keys = words.keySet().iterator();
    }

    @Override
    public boolean assignRowValues(ResultSet receiver, int currentRow)
            throws SQLException {
        if (!keys.hasNext()) {
            return false;
        }
        String key = keys.next();
        receiver.updateString("English", key);
        receiver.updateString("Spanish", words.get(key));
        return true;
    }

    @Override
    public void close() throws SQLException {
    }
}

and

CREATE FUNCTION javatest.list_words() RETURNS SETOF word
  AS 'sandbox.PLJava.listWords'
  IMMUTABLE LANGUAGE java;

org.postgresql.pljava.ResultSetHandle

A ResultSetHandle is typically used when the method uses an internal query.

public static ResultSetHandle listUsers() {
    return new UsersHandle();
}

public static class UsersHandle implements ResultSetHandle {
    private Statement stmt;

    @Override
    public ResultSet getResultSet() throws SQLException {
        stmt = DriverManager.getConnection("jdbc:default:connection")
                .createStatement();
        return stmt.executeQuery("SELECT * FROM pg_user");
    }

    @Override
    public void close() throws SQLException {
        stmt.close();
    }
}

and

CREATE FUNCTION javatest.list_users() RETURNS SETOF pg_user
  AS 'sandbox.PLJava.listUsers'
  LANGUAGE java;

The Interfaces

I have been unable to find a recent copy of the pljava jar in a standard maven repository. My solution was to extract the interfaces from the PL/Java source tarball. They are provided here for your convenience.
ResultSetProvider

// Copyright (c) 2004, 2005, 2006 TADA AB - Taby Sweden
// Distributed under the terms shown in the file COPYRIGHT
// found in the root folder of this project or at
// http://eng.tada.se/osprojects/COPYRIGHT.html
package org.postgresql.pljava;

import java.sql.ResultSet;
import java.sql.SQLException;

/**
 * An implementation of this interface is returned from functions and procedures
 * that are declared to return <code>SET OF</code> a complex type. Functions that
 * return <code>SET OF</code> a simple type should simply return an
 * {@link java.util.Iterator Iterator}.
 *
 * @author Thomas Hallgren
 */
public interface ResultSetProvider {

    /**
     * This method is called once for each row that should be returned from
     * a procedure that returns a set of rows. The receiver is a
     * {@link org.postgresql.pljava.jdbc.SingleRowWriter SingleRowWriter}
     * instance that is used for capturing the data for the row.
     *
     * @param receiver Receiver of values for the given row.
     * @param currentRow Row number. First call will have row number 0.
     * @return <code>true</code> if a new row was provided, <code>false</code>
     *         if not (end of data).
     * @throws SQLException
     */
    boolean assignRowValues(ResultSet receiver, int currentRow)
        throws SQLException;

    /**
     * Called after the last row has been returned or when the query evaluator
     * decides that it does not need any more rows.
     */
    void close() throws SQLException;
}

ResultSetHandle

// Copyright (c) 2004, 2005, 2006 TADA AB - Taby Sweden
// Distributed under the terms shown in the file COPYRIGHT
// found in the root directory of this distribution or at
// http://eng.tada.se/osprojects/COPYRIGHT.html
package org.postgresql.pljava;

import java.sql.ResultSet;
import java.sql.SQLException;

/**
 * An implementation of this interface is returned from functions and procedures
 * that are declared to return <code>SET OF</code> a complex type in the form
 * of a {@link java.sql.ResultSet}. The primary motivation for this interface is
 * that an implementation that returns a ResultSet must be able to close the
 * connection and statement when no more rows are requested.
 *
 * @author Thomas Hallgren
 */
public interface ResultSetHandle {

    /**
     * An implementation of this method will probably execute a query
     * and return the result of that query.
     *
     * @return The ResultSet that represents the rows to be returned.
     * @throws SQLException
     */
    ResultSet getResultSet() throws SQLException;

    /**
     * Called after the last row has been returned or when the query evaluator
     * decides that it does not need any more rows.
     */
    void close() throws SQLException;
}

Triggers

A database trigger is a stored procedure that is automatically run during three of the four CRUD (create-read-update-delete) operations:

- insertion – the trigger is provided the new values and is able to modify them or prohibit the operation outright.
- update – the trigger is provided both old and new values. Again it is able to modify the values or prohibit the operation.
- deletion – the trigger is provided the old values. It is not able to modify them but can prohibit the operation.

A trigger can be run before or after the operation. You would execute a trigger before an operation if you want to modify the values; you would execute it after an operation if you want to log the results.

Typical Usage

Insertion and Update: Data Validation

A pre-trigger on insert and update operations can be used to enforce data integrity and consistency. In this case the results are either accepted or the operation is prohibited.

Insertion and Update: Data Normalization and Sanitization

Sometimes values can have multiple representations or be potentially dangerous. A pre-trigger is a chance to clean up the data, e.g., to tidy up XML or replace < with &lt; and > with &gt;.

All Operations: Audit Logging

A post-trigger on all operations can be used to enforce audit logging.
Applications can log their own actions but can’t log direct access to the database. This is a solution to that problem.

A trigger can be run for each row or after completion of an entire statement. Update triggers can also be conditional. Triggers can be used to create ‘updateable views’.

PL/Java Implementation

Any java method can be used in a trigger provided it is a public static method returning void that takes a single argument, a TriggerData object. Triggers can be called “ON EACH ROW” or “ON STATEMENT”. TriggerDatas that are “ON EACH ROW” contain a single-row, read-only ResultSet as the ‘old’ value on updates and deletions, and a single-row, updatable ResultSet as the ‘new’ value on insertions and updates. This can be used to modify content, log actions, etc.

public class AuditTrigger {

    public static void auditFoobar(TriggerData td) throws SQLException {
        Connection conn = DriverManager
                .getConnection("jdbc:default:connection");
        PreparedStatement ps = conn.prepareStatement(
                "insert into javatest.foobar_audit(what, whenn, data) values (?, ?, ?::xml)");

        if (td.isFiredByInsert()) {
            ps.setString(1, "INSERT");
        } else if (td.isFiredByUpdate()) {
            ps.setString(1, "UPDATE");
        } else if (td.isFiredByDelete()) {
            ps.setString(1, "DELETE");
        }
        ps.setTimestamp(2, new Timestamp(System.currentTimeMillis()));

        ResultSet rs = td.getNew();
        if (rs != null) {
            ps.setString(3, toXml(rs));
        } else {
            ps.setNull(3, Types.VARCHAR);
        }

        ps.execute();
        ps.close();
    }

    // simple marshaler. We could use JAXB or a similar library.
    static String toXml(ResultSet rs) throws SQLException {
        String foo = rs.getString(1);
        if (rs.wasNull()) {
            foo = "";
        }
        String bar = rs.getString(2);
        if (rs.wasNull()) {
            bar = "";
        }
        return String.format("<my-class><foo>%s</foo><bar>%s</bar></my-class>",
                foo, bar);
    }
}

CREATE TABLE javatest.foobar (
    foo varchar(10),
    bar varchar(10)
);

CREATE TABLE javatest.foobar_audit (
    what varchar(10) not null,
    whenn timestamp not null,
    data xml
);

CREATE FUNCTION javatest.audit_foobar() RETURNS trigger
  AS 'sandbox.AuditTrigger.auditFoobar'
  LANGUAGE 'java';

CREATE TRIGGER foobar_audit
  AFTER INSERT OR UPDATE OR DELETE ON javatest.foobar
  FOR EACH ROW EXECUTE PROCEDURE javatest.audit_foobar();

Rules

Rules are a PostgreSQL extension. They are similar to triggers but a bit more flexible. One important difference is that Rules can be triggered on a SELECT statement, not just INSERT, UPDATE and DELETE. Rules, unlike triggers, use standard functions.

The Interface

As before, I have not been able to find a maven repository with a recent version and am including the files for your convenience.

TriggerData

// Copyright (c) 2004, 2005, 2006 TADA AB - Taby Sweden
// Distributed under the terms shown in the file COPYRIGHT
// found in the root folder of this project or at
// http://eng.tada.se/osprojects/COPYRIGHT.html
package org.postgresql.pljava;

import java.sql.ResultSet;
import java.sql.SQLException;

/**
 * The SQL 2003 spec. does not stipulate a standard way of mapping
 * triggers to functions. The PLJava mapping uses this interface. All
 * functions that are intended to be triggers must be public, static,
 * return void, and take a <code>TriggerData</code> as their argument.
 *
 * @author Thomas Hallgren
 */
public interface TriggerData {

    /**
     * Returns the ResultSet that represents the new row. This ResultSet will
     * be null for delete triggers and for triggers that were fired for
     * statement. The returned set will be updateable and positioned on a
     * valid row. When the trigger call returns, the trigger manager will see
     * the changes that have been made to this row and construct a new tuple
     * which will become the new or updated row.
     *
     * @return An updateable <code>ResultSet</code> containing one row, or null.
     * @throws SQLException if the contained native buffer has gone stale.
     */
    ResultSet getNew() throws SQLException;

    /**
     * Returns the ResultSet that represents the old row. This ResultSet will
     * be null for insert triggers and for triggers that were fired for
     * statement. The returned set will be read-only and positioned on a
     * valid row.
     *
     * @return A read-only ResultSet containing one row, or null.
     * @throws SQLException if the contained native buffer has gone stale.
     */
    ResultSet getOld() throws SQLException;

    /**
     * Returns the arguments for this trigger (as declared in the
     * <code>CREATE TRIGGER</code> statement). If the trigger has no arguments,
     * this method will return an array with size 0.
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    String[] getArguments() throws SQLException;

    /**
     * Returns the name of the trigger (as declared in the
     * <code>CREATE TRIGGER</code> statement).
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    String getName() throws SQLException;

    /**
     * Returns the name of the table for which this trigger was created (as
     * declared in the <code>CREATE TRIGGER</code> statement).
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    String getTableName() throws SQLException;

    /**
     * Returns the name of the schema of the table for which this trigger was
     * created (as declared in the <code>CREATE TRIGGER</code> statement).
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    String getSchemaName() throws SQLException;

    /**
     * Returns <code>true</code> if the trigger was fired after the statement
     * or row action that it is associated with.
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    boolean isFiredAfter() throws SQLException;

    /**
     * Returns <code>true</code> if the trigger was fired before the statement
     * or row action that it is associated with.
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    boolean isFiredBefore() throws SQLException;

    /**
     * Returns <code>true</code> if this trigger is fired once for each row
     * (as opposed to once for the entire statement).
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    boolean isFiredForEachRow() throws SQLException;

    /**
     * Returns <code>true</code> if this trigger is fired once for the entire
     * statement (as opposed to once for each row).
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    boolean isFiredForStatement() throws SQLException;

    /**
     * Returns <code>true</code> if this trigger was fired by a
     * <code>DELETE</code>.
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    boolean isFiredByDelete() throws SQLException;

    /**
     * Returns <code>true</code> if this trigger was fired by an
     * <code>INSERT</code>.
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    boolean isFiredByInsert() throws SQLException;

    /**
     * Returns <code>true</code> if this trigger was fired by an
     * <code>UPDATE</code>.
     *
     * @throws SQLException if the contained native buffer has gone stale.
     */
    boolean isFiredByUpdate() throws SQLException;
}

TriggerException

// Copyright (c) 2004, 2005, 2006 TADA AB - Taby Sweden
// Distributed under the terms shown in the file COPYRIGHT
// found in the root folder of this project or at
// http://eng.tada.se/osprojects/COPYRIGHT.html
package org.postgresql.pljava;

import java.sql.SQLException;

/**
 * An exception specially suited to be thrown from within a method
 * designated to be a trigger function. The message generated by
 * this exception will contain information on what trigger and
 * what relation it was that caused the exception.
 *
 * @author Thomas Hallgren
 */
public class TriggerException extends SQLException {
    private static final long serialVersionUID = 5543711707414329116L;

    private static boolean s_recursionLock = false;

    public static final String TRIGGER_ACTION_EXCEPTION = "09000";

    private static final String makeMessage(TriggerData td, String message) {
        StringBuffer bld = new StringBuffer();
        bld.append("In Trigger ");
        if (!s_recursionLock) {
            s_recursionLock = true;
            try {
                bld.append(td.getName());
                bld.append(" on relation ");
                bld.append(td.getTableName());
            } catch (SQLException e) {
                bld.append("(exception while generating exception message)");
            } finally {
                s_recursionLock = false;
            }
        }
        if (message != null) {
            bld.append(": ");
            bld.append(message);
        }
        return bld.toString();
    }

    /**
     * Create an exception based on the <code>TriggerData</code> that was
     * passed to the trigger method.
     *
     * @param td The <code>TriggerData</code> that was passed to the trigger method.
     */
    public TriggerException(TriggerData td) {
        super(makeMessage(td, null), TRIGGER_ACTION_EXCEPTION);
    }

    /**
     * Create an exception based on the <code>TriggerData</code> that was
     * passed to the trigger method and an additional message.
     *
     * @param td The <code>TriggerData</code> that was passed to the trigger method.
     * @param reason An additional message with info about the exception.
     */
    public TriggerException(TriggerData td, String reason) {
        super(makeMessage(td, reason), TRIGGER_ACTION_EXCEPTION);
    }
}

User-defined types in the database are controversial. They’re not standard – at some point the DBA has to create them – and this introduces portability issues. Standard tools won’t know about them. You must access them via the ‘struct’ methods in ResultSets and PreparedStatements. On the other hand there are a LOT of things that are otherwise only supported as byte[]. This prevents database functions and stored procedures from easily manipulating them.

What would make a good user-defined type? It must be atomic, and it must be possible to do meaningful work with it via stored procedures. N.B., a database user-defined type is not the same thing as a java class. Nearly all java classes should be stored as standard tuples, and you should only use database UDTs if there’s a compelling reason. A touchstone I like is asking whether you’re ever tempted to cache immutable information about the type, as opposed to the tuple, in addition to the object itself. E.g., an X.509 digital certificate has a number of immutable fields that would be valid search terms, but it’s expensive to extract that information from every row. (Sidenote: you can use triggers to extract the information when the record is inserted and updated. This ensures the cached values are always accurate.)

Examples:

- complex numbers (stored procedures: arithmetic)
- rational numbers (stored procedures: arithmetic)
- galois field numbers (stored procedures: arithmetic modulo a fixed value)
- images (stored procedures: get dimensions)
- PDF documents (stored procedures: extract elements)
- digital certificates and private keys (stored procedures: crypto)

Something that should also be addressed is the proper language for implementation.
It’s easy to prototype in PL/Java, but you can make a strong argument that types should ultimately be implemented as standard PostgreSQL extensions since they’re more likely to be available in the future when you’re looking at a 20-year-old dump. In some important ways this is just a small part of the problem – the issue isn’t whether the actual storage and function implementation is written in C or java, it’s how it’s tied into the rest of the system.

PL/Java Implementation

A PL/Java user-defined type must implement the java.sql.SQLData interface, a static method that creates the object from a String, and an instance method that creates a String from the object. These methods must be complementary – it must be possible to run a value through a full cycle in either direction and get the original value back. N.B., this is often impossible with doubles – this is why you get numbers like 4.000000001 or 2.999999999. In these cases you have to do the best you can and warn the user.

In many cases an object can be stored more efficiently in a binary format. In PostgreSQL terms these are TOAST types. This is handled by implementing two additional methods that work with SQLInput and SQLOutput streams.

A simple implementation of a rational type follows.
public class Rational implements SQLData {
    private long numerator;
    private long denominator;
    private String typeName;

    public static Rational parse(String input, String typeName)
            throws SQLException {
        Pattern pattern = Pattern.compile("(-?[0-9]+)( */ *(-?[0-9]+))?");
        Matcher matcher = pattern.matcher(input);
        if (!matcher.matches()) {
            throw new SQLException(
                    "Unable to parse rational from string '" + input + "'");
        }
        if (matcher.groupCount() == 3) {
            if (matcher.group(3) == null) {
                return new Rational(Long.parseLong(matcher.group(1)));
            }
            return new Rational(Long.parseLong(matcher.group(1)),
                    Long.parseLong(matcher.group(3)));
        }
        throw new SQLException("invalid format: '" + input + "'");
    }

    public Rational(long numerator) throws SQLException {
        this(numerator, 1);
    }

    public Rational(long numerator, long denominator) throws SQLException {
        if (denominator == 0) {
            throw new SQLException("denominator must be non-zero");
        }

        // do a little bit of normalization
        if (denominator < 0) {
            numerator = -numerator;
            denominator = -denominator;
        }

        this.numerator = numerator;
        this.denominator = denominator;
    }

    public Rational(int numerator, int denominator, String typeName)
            throws SQLException {
        this(numerator, denominator);
        this.typeName = typeName;
    }

    public String getSQLTypeName() {
        return typeName;
    }

    public void readSQL(SQLInput stream, String typeName) throws SQLException {
        this.numerator = stream.readLong();
        this.denominator = stream.readLong();
        this.typeName = typeName;
    }

    public void writeSQL(SQLOutput stream) throws SQLException {
        stream.writeLong(numerator);
        stream.writeLong(denominator);
    }

    public String toString() {
        String value = null;
        if (denominator == 1) {
            value = String.valueOf(numerator);
        } else {
            value = String.format("%d/%d", numerator, denominator);
        }
        return value;
    }

    /*
     * Meaningful code that actually does something with this type was
     * intentionally left out.
     */
}

and

/* The shell type */
CREATE TYPE javatest.rational;

/* The scalar input function */
CREATE FUNCTION javatest.rational_in(cstring) RETURNS javatest.rational
  AS 'UDT[sandbox.Rational] input'
  LANGUAGE java IMMUTABLE STRICT;

/* The scalar output function */
CREATE FUNCTION javatest.rational_out(javatest.rational) RETURNS cstring
  AS 'UDT[sandbox.Rational] output'
  LANGUAGE java IMMUTABLE STRICT;

/* The scalar receive function */
CREATE FUNCTION javatest.rational_recv(internal) RETURNS javatest.rational
  AS 'UDT[sandbox.Rational] receive'
  LANGUAGE java IMMUTABLE STRICT;

/* The scalar send function */
CREATE FUNCTION javatest.rational_send(javatest.rational) RETURNS bytea
  AS 'UDT[sandbox.Rational] send'
  LANGUAGE java IMMUTABLE STRICT;

CREATE TYPE javatest.rational (
    internallength = 16,
    input = javatest.rational_in,
    output = javatest.rational_out,
    receive = javatest.rational_recv,
    send = javatest.rational_send,
    alignment = int
);

Type modifiers

PostgreSQL allows types to have modifiers, e.g., ‘varchar(200)’ or ‘numeric(8,2)’. PL/Java does not currently support this functionality (via the ‘typmod_in’ and ‘typmod_out’ methods) but I have submitted a request for it.

Casts

Custom types aren’t particularly useful if all you can do is store and retrieve the values as opaque objects. Why not use bytea and be done with it? In fact there are many UDTs where it makes sense to be able to cast a UDT to a different type. Numeric types, like complex or rational numbers, should be convertible to and from the standard integer and floating point types (albeit with limitations). This should be done with restraint.

Casts are implemented as single-argument static methods. In the java world these methods are often named newInstance so I’m doing the same here.
public static Rational newInstance(String input) throws SQLException {
    if (input == null) {
        return null;
    }
    return parse(input, "javatest.rational");
}

public static Rational newInstance(int value) throws SQLException {
    return new Rational(value);
}

public static Rational newInstance(Integer value) throws SQLException {
    if (value == null) {
        return null;
    }
    return new Rational(value.intValue());
}

public static Rational newInstance(long value) throws SQLException {
    return new Rational(value);
}

public static Rational newInstance(Long value) throws SQLException {
    if (value == null) {
        return null;
    }
    return new Rational(value.longValue());
}

public static Double value(Rational value) throws SQLException {
    if (value == null) {
        return null;
    }
    return value.doubleValue();
}

and

CREATE FUNCTION javatest.rational_string_as_rational(varchar)
  RETURNS javatest.rational
  AS 'sandbox.Rational.newInstance'
  LANGUAGE JAVA IMMUTABLE STRICT;

CREATE FUNCTION javatest.rational_int_as_rational(int4)
  RETURNS javatest.rational
  AS 'sandbox.Rational.newInstance'
  LANGUAGE JAVA IMMUTABLE STRICT;

CREATE FUNCTION javatest.rational_long_as_rational(int8)
  RETURNS javatest.rational
  AS 'sandbox.Rational.newInstance'
  LANGUAGE JAVA IMMUTABLE STRICT;

CREATE FUNCTION javatest.rational_as_double(javatest.rational)
  RETURNS float8
  AS 'sandbox.Rational.value'
  LANGUAGE JAVA IMMUTABLE STRICT;

CREATE CAST (varchar AS javatest.rational)
  WITH FUNCTION javatest.rational_string_as_rational(varchar)
  AS ASSIGNMENT;

CREATE CAST (int4 AS javatest.rational)
  WITH FUNCTION javatest.rational_int_as_rational(int4)
  AS ASSIGNMENT;

CREATE CAST (int8 AS javatest.rational)
  WITH FUNCTION javatest.rational_long_as_rational(int8)
  AS ASSIGNMENT;

CREATE CAST (javatest.rational AS float8)
  WITH FUNCTION javatest.rational_as_double(javatest.rational)
  AS ASSIGNMENT;

(Sidenote: STRICT means that the function will return NULL if any argument is NULL. This allows the database to make some optimizations.)
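The complementary parse/toString requirement mentioned earlier can be checked in plain Java. The following standalone sketch uses a hypothetical, stripped-down stand-in for the Rational text format (plain helper methods rather than the full class above); only the "n" / "n/d" grammar mirrors the real code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RoundTripDemo {
    static final Pattern FORMAT = Pattern.compile("(-?[0-9]+)(?: */ *(-?[0-9]+))?");

    // Parse "n" or "n/d" into a two-element array {numerator, denominator}.
    static long[] parse(String input) {
        Matcher m = FORMAT.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a rational: " + input);
        }
        long n = Long.parseLong(m.group(1));
        long d = (m.group(2) == null) ? 1 : Long.parseLong(m.group(2));
        return new long[] { n, d };
    }

    // Format the pair back to text, eliding a denominator of 1.
    static String format(long[] r) {
        return (r[1] == 1) ? String.valueOf(r[0]) : r[0] + "/" + r[1];
    }

    public static void main(String[] args) {
        // text -> value -> text must reproduce the original text
        System.out.println(format(parse("3/4"))); // 3/4
        System.out.println(format(parse("7")));   // 7
    }
}
```

If the round trip does not reproduce the original text, the database's input and output functions will disagree and dumps will not restore cleanly.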
(Sidenote: we may only be able to use the IMMUTABLE flag if the Java objects are also immutable. We should probably make our Rational objects immutable, since the other numeric types are immutable.)

Aggregate Functions

What about min()? Rational numbers are a numeric type, so shouldn’t they support all of the standard aggregate functions? Defining new aggregate functions is straightforward. Simple aggregate functions only need a static member function that takes two UDT values and returns one. This is easy to see with maximums, minimums, sums, products, etc. More complex aggregates require an ancillary UDT that contains state information, a static method that takes one state UDT and one UDT and returns a state UDT, and a finalization method that takes the final state UDT and produces the result. This is easy to see with averages – you need a state type that contains a counter and a running sum. Several examples of the former type of aggregate function follow.

// compare two Rational objects. We use BigInteger to avoid overflow.
public static int compare(Rational p, Rational q) {
    if (p == null) {
        return 1;
    } else if (q == null) {
        return -1;
    }
    BigInteger l = BigInteger.valueOf(p.getNumerator()).multiply(BigInteger.valueOf(q.getDenominator()));
    BigInteger r = BigInteger.valueOf(q.getNumerator()).multiply(BigInteger.valueOf(p.getDenominator()));
    return l.compareTo(r);
}

public static Rational min(Rational p, Rational q) {
    if ((p == null) || (q == null)) {
        return null;
    }
    return (p.compareTo(q) <= 0) ? p : q;
}

public static Rational max(Rational p, Rational q) {
    if ((p == null) || (q == null)) {
        return null;
    }
    return (q.compareTo(p) < 0) ? p : q;
}

public static Rational add(Rational p, Rational q) throws SQLException {
    if ((p == null) || (q == null)) {
        return null;
    }
    BigInteger n = BigInteger.valueOf(p.getNumerator()).multiply(BigInteger.valueOf(q.getDenominator())).add(
            BigInteger.valueOf(q.getNumerator()).multiply(BigInteger.valueOf(p.getDenominator())));
    BigInteger d = BigInteger.valueOf(p.getDenominator()).multiply(BigInteger.valueOf(q.getDenominator()));
    BigInteger gcd = n.gcd(d);
    n = n.divide(gcd);
    d = d.divide(gcd);
    return new Rational(n.longValue(), d.longValue());
}

and

CREATE FUNCTION javatest.min(javatest.rational, javatest.rational) RETURNS javatest.rational AS 'sandbox.Rational.min' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.max(javatest.rational, javatest.rational) RETURNS javatest.rational AS 'sandbox.Rational.max' LANGUAGE JAVA IMMUTABLE STRICT;

CREATE AGGREGATE min(javatest.rational) (
    sfunc = javatest.min,
    stype = javatest.rational
);

CREATE AGGREGATE max(javatest.rational) (
    sfunc = javatest.max,
    stype = javatest.rational
);

CREATE AGGREGATE sum(javatest.rational) (
    sfunc = javatest.add,
    stype = javatest.rational
);

Integration with Hibernate

It is possible to link PL/Java user-defined types and Hibernate user-defined types. Warning: the Hibernate code is highly database-specific. This is the Hibernate user-defined type. PostgreSQL 9.1 does not support the STRUCT type and uses strings instead. We don’t have to use the PL/Java user-defined data type to perform the marshaling, but it ensures consistency. TheDbRationalType is the Rational class above. The same class could be used in both places, but that would introduce a dependency on a Hibernate interface into the PL/Java class. This may be acceptable if you extract that single interface from the Hibernate source code.
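The text above notes that a more complex aggregate like avg() needs an ancillary state type carrying a counter and a running sum, a transition function, and a finalization function. The sketch below shows what those pieces might look like; the class and method names are illustrative (not from the article), and the running sum is kept as a double for brevity rather than as an exact rational.

```java
// Hypothetical state type for an avg() aggregate over rationals.
// accumulate() plays the role of the transition function (sfunc),
// finish() plays the role of the finalization function (finalfunc).
public class RationalAvgState {

    private long count;
    private double sum; // running sum, simplified to a double

    // sfunc: fold one value into the running state
    public static RationalAvgState accumulate(RationalAvgState state, double value) {
        if (state == null) {
            state = new RationalAvgState(); // first row: create the state
        }
        state.count++;
        state.sum += value;
        return state;
    }

    // finalfunc: turn the final state into the aggregate result
    public static Double finish(RationalAvgState state) {
        if ((state == null) || (state.count == 0)) {
            return null; // avg of an empty set is NULL
        }
        return state.sum / state.count;
    }
}
```

The corresponding CREATE AGGREGATE would then name the state type via stype and the finalization function via finalfunc, alongside the sfunc shown in the simpler aggregates above.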
public class Rational implements UserType, Serializable {

    private final int[] sqlTypesSupported = new int[] { Types.OTHER };
    private long numerator;
    private long denominator;

    public Rational() {
        numerator = 0;
        denominator = 1;
    }

    public Rational(long numerator, long denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    public long getNumerator() {
        return numerator;
    }

    public long getDenominator() {
        return denominator;
    }

    @Override
    public Object assemble(Serializable cached, Object owner) throws HibernateException {
        if (!(cached instanceof Rational)) {
            throw new HibernateException("invalid argument");
        }
        Rational r = (Rational) cached;
        return new Rational(r.getNumerator(), r.getDenominator());
    }

    @Override
    public Serializable disassemble(Object value) throws HibernateException {
        if (!(value instanceof Rational)) {
            throw new HibernateException("invalid argument");
        }
        return (Rational) value;
    }

    @Override
    public Object deepCopy(Object value) throws HibernateException {
        if (value == null) {
            return null;
        }
        if (!(value instanceof Rational)) {
            throw new HibernateException("invalid argument");
        }
        Rational v = (Rational) value;
        return new Rational(v.getNumerator(), v.getDenominator());
    }

    @Override
    public boolean isMutable() {
        return true;
    }

    //
    // important: PGobject is postgresql-specific
    //
    @Override
    public Object nullSafeGet(ResultSet rs, String[] names, Object owner) throws HibernateException, SQLException {
        PGobject pgo = (PGobject) rs.getObject(names[0]);
        if (rs.wasNull()) {
            return null;
        }
        TheDbRationalType r = TheDbRationalType.parse(pgo.getValue(), "rational");
        return new Rational(r.getNumerator(), r.getDenominator());
    }

    //
    // important: using Types.OTHER may be postgresql-specific
    //
    @Override
    public void nullSafeSet(PreparedStatement ps, Object value, int index) throws HibernateException, SQLException {
        if (value == null) {
            ps.setNull(index, Types.OTHER);
        } else if (!(value instanceof Rational)) {
            throw new HibernateException("invalid argument");
        } else {
            Rational t = (Rational) value;
            ps.setObject(index, new TheDbRationalType(t.getNumerator(), t.getDenominator()), Types.OTHER);
        }
    }

    @Override
    public Object replace(Object original, Object target, Object owner) throws HibernateException {
        if (!(original instanceof Rational) || !(target instanceof Rational)) {
            throw new HibernateException("invalid argument");
        }
        Rational r = (Rational) original;
        return new Rational(r.getNumerator(), r.getDenominator());
    }

    @Override
    public Class returnedClass() {
        return Rational.class;
    }

    @Override
    public int[] sqlTypes() {
        return sqlTypesSupported;
    }

    @Override
    public String toString() {
        if (denominator == 1) {
            return String.valueOf(numerator);
        }
        return String.format("%d/%d", numerator, denominator);
    }

    // for UserType
    @Override
    public int hashCode(Object value) {
        Rational r = (Rational) value;
        return (int) (31 * r.getNumerator() + r.getDenominator());
    }

    @Override
    public int hashCode() {
        return hashCode(this);
    }

    // for UserType
    @Override
    public boolean equals(Object left, Object right) {
        if (left == right) {
            return true;
        }
        if ((left == null) || (right == null)) {
            return false;
        }
        if (!(left instanceof Rational) || !(right instanceof Rational)) {
            return false;
        }
        Rational l = (Rational) left;
        Rational r = (Rational) right;
        return (l.getNumerator() == r.getNumerator()) && (l.getDenominator() == r.getDenominator());
    }

    @Override
    public boolean equals(Object value) {
        return equals(this, value);
    }
}

CustomTypes.hbm.xml

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE hibernate-mapping PUBLIC '-//Hibernate/Hibernate Mapping DTD 3.0//EN' 'http://www.hibernate.org/dtd/hibernate-mapping-3.0.dtd'>
<hibernate-mapping>
    <typedef name='javatest.rational' class='sandbox.RationalType'/>
</hibernate-mapping>

TestTable.hbm.xml

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE hibernate-mapping PUBLIC '-//Hibernate/Hibernate Mapping DTD 3.0//EN' 'http://www.hibernate.org/dtd/hibernate-mapping-3.0.dtd'>
<hibernate-mapping>
    <class
        name='sandbox.TestTable' table='test_table'>
        <id name='id'/>
        <property name='value' type='javatest.rational'/>
    </class>
</hibernate-mapping>

Operators

Operators are normal PL/Java methods that are also registered as operators via the CREATE OPERATOR statement. Basic arithmetic for rational numbers is supported as

public static Rational negate(Rational p) throws SQLException {
    if (p == null) {
        return null;
    }
    return new Rational(-p.getNumerator(), p.getDenominator());
}

public static Rational add(Rational p, Rational q) throws SQLException {
    if ((p == null) || (q == null)) {
        return null;
    }
    BigInteger n = BigInteger.valueOf(p.getNumerator()).multiply(BigInteger.valueOf(q.getDenominator())).add(
            BigInteger.valueOf(q.getNumerator()).multiply(BigInteger.valueOf(p.getDenominator())));
    BigInteger d = BigInteger.valueOf(p.getDenominator()).multiply(BigInteger.valueOf(q.getDenominator()));
    BigInteger gcd = n.gcd(d);
    n = n.divide(gcd);
    d = d.divide(gcd);
    return new Rational(n.longValue(), d.longValue());
}

public static Rational subtract(Rational p, Rational q) throws SQLException {
    if ((p == null) || (q == null)) {
        return null;
    }
    BigInteger n = BigInteger.valueOf(p.getNumerator()).multiply(BigInteger.valueOf(q.getDenominator())).subtract(
            BigInteger.valueOf(q.getNumerator()).multiply(BigInteger.valueOf(p.getDenominator())));
    BigInteger d = BigInteger.valueOf(p.getDenominator()).multiply(BigInteger.valueOf(q.getDenominator()));
    BigInteger gcd = n.gcd(d);
    n = n.divide(gcd);
    d = d.divide(gcd);
    return new Rational(n.longValue(), d.longValue());
}

public static Rational multiply(Rational p, Rational q) throws SQLException {
    if ((p == null) || (q == null)) {
        return null;
    }
    BigInteger n = BigInteger.valueOf(p.getNumerator()).multiply(BigInteger.valueOf(q.getNumerator()));
    BigInteger d = BigInteger.valueOf(p.getDenominator()).multiply(BigInteger.valueOf(q.getDenominator()));
    BigInteger gcd = n.gcd(d);
    n = n.divide(gcd);
    d = d.divide(gcd);
    return new Rational(n.longValue(), d.longValue());
}

and

CREATE FUNCTION javatest.rational_negate(javatest.rational) RETURNS javatest.rational AS 'sandbox.Rational.negate' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_add(javatest.rational, javatest.rational) RETURNS javatest.rational AS 'sandbox.Rational.add' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_subtract(javatest.rational, javatest.rational) RETURNS javatest.rational AS 'sandbox.Rational.subtract' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_multiply(javatest.rational, javatest.rational) RETURNS javatest.rational AS 'sandbox.Rational.multiply' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_divide(javatest.rational, javatest.rational) RETURNS javatest.rational AS 'sandbox.Rational.divide' LANGUAGE JAVA IMMUTABLE STRICT;

CREATE OPERATOR - (
    rightarg = javatest.rational,
    procedure = javatest.rational_negate
);

CREATE OPERATOR + (
    leftarg = javatest.rational, rightarg = javatest.rational,
    procedure = javatest.rational_add,
    commutator = +
);

CREATE OPERATOR - (
    leftarg = javatest.rational, rightarg = javatest.rational,
    procedure = javatest.rational_subtract
);

CREATE OPERATOR * (
    leftarg = javatest.rational, rightarg = javatest.rational,
    procedure = javatest.rational_multiply,
    commutator = *
);

CREATE OPERATOR / (
    leftarg = javatest.rational, rightarg = javatest.rational,
    procedure = javatest.rational_divide
);

Operator names are one to 63 characters from the set “+ - * / < > = ~ ! @ # % ^ & | ` ?”, with a few restrictions to avoid confusion with the start of SQL comments. The commutator operator is a second operator (possibly the same one) that produces the same result if the left and right values are swapped. This is used by the optimizer. The negator operator is one that produces the opposite result if the left and right values are swapped. It is only valid on procedures that return a boolean value. Again, this is used by the optimizer.

Ordering Operators

Many UDTs can be ordered in some manner.
This may be something obvious, e.g., ordering rational numbers, or something a bit more arbitrary, e.g., ordering complex numbers. We can define ordering operators in the same manner as above. N.B., there is no longer anything special about these operators – with an unfamiliar UDT you can’t assume that < really means “less than”. The sole exception is “!=”, which is always rewritten as “<>” by the parser.

public static int compare(Rational p, Rational q) {
    if (p == null) {
        return 1;
    } else if (q == null) {
        return -1;
    }
    BigInteger l = BigInteger.valueOf(p.getNumerator()).multiply(BigInteger.valueOf(q.getDenominator()));
    BigInteger r = BigInteger.valueOf(q.getNumerator()).multiply(BigInteger.valueOf(p.getDenominator()));
    return l.compareTo(r);
}

public int compareTo(Rational p) {
    return compare(this, p);
}

public static int compare(Rational p, double q) {
    if (p == null) {
        return 1;
    }
    double d = p.doubleValue();
    return (d < q) ? -1 : ((d == q) ? 0 : 1);
}

public int compareTo(double q) {
    return compare(this, q);
}

public static boolean lessThan(Rational p, Rational q) {
    return compare(p, q) < 0;
}

public static boolean lessThanOrEquals(Rational p, Rational q) {
    return compare(p, q) <= 0;
}

public static boolean equals(Rational p, Rational q) {
    return compare(p, q) == 0;
}

public static boolean greaterThan(Rational p, Rational q) {
    return compare(p, q) > 0;
}

public static boolean lessThan(Rational p, double q) {
    if (p == null) {
        return false;
    }
    return p.compareTo(q) < 0;
}

public static boolean lessThanOrEquals(Rational p, double q) {
    if (p == null) {
        return false;
    }
    return p.compareTo(q) <= 0;
}

public static boolean greaterThan(Rational p, double q) {
    if (p == null) {
        return true;
    }
    return p.compareTo(q) > 0;
}

Note that I’ve defined methods to compare either two rational numbers, or one rational number and one double.
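The comment on compare() says BigInteger is used "to avoid overflow". The sketch below demonstrates the failure mode it guards against: cross-multiplying two rationals whose terms are large longs silently wraps around in long arithmetic, while BigInteger gets the exact product. The numeric values are made up for illustration.

```java
import java.math.BigInteger;

public class OverflowDemo {

    // Naive cross-multiplication in long arithmetic: may silently wrap.
    public static long crossMultiplyAsLong(long a, long b) {
        return a * b;
    }

    // The BigInteger route taken by compare(): always exact.
    public static BigInteger crossMultiplyAsBigInteger(long a, long b) {
        return BigInteger.valueOf(a).multiply(BigInteger.valueOf(b));
    }

    public static void main(String[] args) {
        long a = 4_000_000_000L; // e.g. a large numerator
        long b = 3_000_000_000L; // e.g. a large denominator

        // 12e18 exceeds Long.MAX_VALUE (~9.22e18), so the long product wraps negative
        System.out.println(crossMultiplyAsLong(a, b));       // negative, wrapped value
        System.out.println(crossMultiplyAsBigInteger(a, b)); // 12000000000000000000
    }
}
```

A comparison computed from the wrapped value would give the wrong sign, which is why the UDT's compare(), add(), and friends all route through BigInteger before reducing back to longs.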
CREATE FUNCTION javatest.rational_lt(javatest.rational, javatest.rational) RETURNS bool AS 'sandbox.Rational.lessThan' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_le(javatest.rational, javatest.rational) RETURNS bool AS 'sandbox.Rational.lessThanOrEquals' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_eq(javatest.rational, javatest.rational) RETURNS bool AS 'sandbox.Rational.equals' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_ge(javatest.rational, javatest.rational) RETURNS bool AS 'sandbox.Rational.greaterThanOrEquals' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_gt(javatest.rational, javatest.rational) RETURNS bool AS 'sandbox.Rational.greaterThan' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_cmp(javatest.rational, javatest.rational) RETURNS int AS 'sandbox.Rational.compare' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_lt(javatest.rational, float8) RETURNS bool AS 'sandbox.Rational.lessThan' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_le(javatest.rational, float8) RETURNS bool AS 'sandbox.Rational.lessThanOrEquals' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_eq(javatest.rational, float8) RETURNS bool AS 'sandbox.Rational.equals' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_ge(javatest.rational, float8) RETURNS bool AS 'sandbox.Rational.greaterThanOrEquals' LANGUAGE JAVA IMMUTABLE STRICT;
CREATE FUNCTION javatest.rational_gt(javatest.rational, float8) RETURNS bool AS 'sandbox.Rational.greaterThan' LANGUAGE JAVA IMMUTABLE STRICT;

CREATE OPERATOR < (
    leftarg = javatest.rational, rightarg = javatest.rational,
    procedure = javatest.rational_lt,
    commutator = > , negator = >= ,
    restrict = scalarltsel, join = scalarltjoinsel,
    merges
);

CREATE OPERATOR <= (
    leftarg = javatest.rational, rightarg = javatest.rational,
    procedure = javatest.rational_le,
    commutator = >= , negator = > ,
    restrict = scalarltsel, join = scalarltjoinsel,
    merges
);

CREATE OPERATOR = (
    leftarg = javatest.rational, rightarg = javatest.rational,
    procedure = javatest.rational_eq,
    commutator = = , negator = <> ,
    hashes, merges
);

CREATE OPERATOR >= (
    leftarg = javatest.rational, rightarg = javatest.rational,
    procedure = javatest.rational_ge,
    commutator = <= , negator = < ,
    restrict = scalargtsel, join = scalargtjoinsel,
    merges
);

CREATE OPERATOR > (
    leftarg = javatest.rational, rightarg = javatest.rational,
    procedure = javatest.rational_gt,
    commutator = < , negator = <= ,
    restrict = scalargtsel, join = scalargtjoinsel,
    merges
);

CREATE OPERATOR < (
    leftarg = javatest.rational, rightarg = float8,
    procedure = javatest.rational_lt,
    commutator = > , negator = >=
);

CREATE OPERATOR <= (
    leftarg = javatest.rational, rightarg = float8,
    procedure = javatest.rational_le,
    commutator = >= , negator = >
);

CREATE OPERATOR = (
    leftarg = javatest.rational, rightarg = float8,
    procedure = javatest.rational_eq,
    commutator = = , negator = <>
);

CREATE OPERATOR >= (
    leftarg = javatest.rational, rightarg = float8,
    procedure = javatest.rational_ge,
    commutator = <= , negator = <
);

CREATE OPERATOR > (
    leftarg = javatest.rational, rightarg = float8,
    procedure = javatest.rational_gt,
    commutator = < , negator = <=
);

Restrict is an optimization estimator procedure; it’s usually safe to use the appropriate standard procedure. Join is also an optimization estimator procedure; again, it’s usually safe to use the appropriate standard procedure. Hashes indicates that the operator can be used in hash joins. Merges indicates that the operator can be used in merge joins.

Indexes

Indexes are used in three places: to enforce uniqueness constraints, and to speed up WHERE and JOIN clauses.
-- btree join
CREATE OPERATOR CLASS rational_ops
DEFAULT FOR TYPE javatest.rational USING btree AS
    OPERATOR 1 < ,
    OPERATOR 2 <= ,
    OPERATOR 3 = ,
    OPERATOR 4 >= ,
    OPERATOR 5 > ,
    FUNCTION 1 javatest.rational_cmp(javatest.rational, javatest.rational);

-- hash join
CREATE OPERATOR CLASS rational_ops
DEFAULT FOR TYPE javatest.rational USING hash AS
    OPERATOR 1 = ,
    FUNCTION 1 javatest.rational_hashCode(javatest.rational);

Operator Families

Finally, PostgreSQL has the concept of “operator families” that group related operator classes under a single umbrella. For instance, you might have one family that supports cross-comparison between int2, int4 and int8 values. Each operator class can be specified individually, but by creating an operator family you give a few more hints to the PostgreSQL optimizer.

More Information

CREATE TYPE (PostgreSQL)
PostgreSQL ‘create trigger’ documentation
PostgreSQL ‘create rule’ documentation
CREATE OPERATOR (PostgreSQL)
CREATE OPERATOR CLASS (PostgreSQL)
CREATE OPERATOR FAMILY (PostgreSQL)
Operator Optimization (PostgreSQL)
Interfacing Extensions To Indexes (PostgreSQL)
Creating a Scalar UDT in Java (user guide)
CREATE AGGREGATE documentation (PostgreSQL)
CREATE CAST documentation (PostgreSQL)
CREATE TYPE documentation (PostgreSQL)
CREATE OPERATOR documentation (PostgreSQL)
CREATE OPERATOR CLASS documentation (PostgreSQL)
Interfacing user-defined types to indexes (PostgreSQL)

Reference: Introduction To PostgreSQL PL/Java, Part 1, Introduction To PostgreSQL PL/Java, Part 2: Working With Lists, Introduction To PostgreSQL PL/Java, Part 3: Triggers, Introduction To PostgreSQL PL/Java, Part 4: User Defined Types, Introduction To PostgreSQL PL/Java, Part 5: Operations And Indexes from our JCG partner Bear Giles at the Invariant Properties blog....

What is HMAC Authentication and why is it useful?

To start with a little background; then I will outline the options for authentication of HTTP-based server APIs, with a focus on HMAC; and lastly I will provide some tips for developers building and using HMAC-based authentication. Recently I have been doing quite a bit of research and hacking in and around server APIs. Authentication for these types of APIs really depends on the type of service, and falls into a couple of general categories:

Consumer or personal applications typically use a simple username and password. OAuth is used in some cases, however this is more for identity of an individual’s authorisation session within a trusted third party.

Infrastructure applications typically use a set of credentials which are different to the owner’s/admin’s credentials, and provide some sort of automation API for businesses or devices to enhance the function of, or control, something.

For infrastructure APIs I have had a look at a few options; these are explained in some detail below.

Basic Authentication

This is the simplest to implement and for some implementations can work well; however, it requires transport-level encryption as the username and password are presented with every request. For more information see the Wikipedia article.

Digest Authentication

This is actually quite a bit closer to HMAC than basic. It uses MD5 to hash the authentication attributes in a way which makes it much more difficult to intercept and compromise the username and password. I recommend reading over the Wikipedia page on the subject; in short it is more secure than basic auth, however it is entirely dependent on how many of the safeguards are implemented in the client software, and the complexity of the password is a factor.
Note that unlike basic authentication, this does not require an SSL connection; that said, make sure you read the Wikipedia article as there are some issues with man-in-the-middle attacks.

HMAC Authentication

Unlike the previous authentication methods, there isn’t, as far as I can tell, a standard way to do this. That said, as this is the main authentication method used by Amazon Web Services it is very well understood, and there are a number of libraries which implement it. To use this form of authentication you utilise a key identifier and a secret key, with both of these typically generated in an admin interface (more details below). It is very important to note that one of the BIG differences with this type of authentication is that it signs the entire request; if the content-md5 is included, this basically guarantees the authenticity of the action. If a party in the middle fiddles with the API call, either for malicious reasons or because of a bug in an intermediary proxy that drops some important headers, the signature will not match. To use HMAC authentication, a digest is computed using a composite of the URI, the request timestamp and some other headers (depending on the implementation), using the supplied secret key. The key identifier, along with the digest (which is encoded using Base64), is combined and added to the authorisation header. The following example is from the Amazon S3 documentation.

'Authorization: AWS ' + AWSAccessKeyId + ':'
    + base64(hmac-sha1(VERB + '\n'
        + CONTENT-MD5 + '\n'
        + CONTENT-TYPE + '\n'
        + DATE + '\n'
        + CanonicalizedAmzHeaders + '\n'
        + CanonicalizedResource))

This results in an HTTP request with headers which look like this.
PUT /quotes/nelson HTTP/1.0
Authorization: AWS 44CF9590006BF252F707:jZNOcbfWmD/A/f3hSvVzXZjM2HU=
Content-Md5: c8fdb181845a4ca6b8fec737b3581d76
Content-Type: text/html
Date: Thu, 17 Nov 2005 18:49:58 GMT
X-Amz-Meta-Author: foo@bar.com
X-Amz-Magic: abracadabra

Note the AWS after the colon is sometimes known as the service label; most services I have seen follow the convention of changing this to an abbreviation of their name, or just HMAC. If we examine the Amazon implementation closely, a few advantages over normal usernames and passwords become obvious:

As mentioned, HMAC authentication guarantees the authenticity of the request by signing the headers, especially if content-md5 is signed and checked by the server AND the client.

An admin can generate any number of key pairs and utilise them independently of their Amazon credentials.

As noted before, these are computed values and can be made as large as necessary depending on the hash algorithm used; Amazon is using 40-character secrets for SHA-1.

This form of authentication can be used without the need for SSL, as the secret is never actually transmitted, just the MAC.

As the key pairs are independent of admin credentials, they can be deleted or disabled when systems are compromised, thereby disabling their use.

As far as disadvantages, there are indeed some:

Not a lot of consistency in the implementations outside of the ones which interface with Amazon.

Server-side implementations are few in number, and also very inconsistent.

If you do decide to build your own, be advised that cryptographic APIs like OpenSSL can be hard for those who haven’t used them directly before; a single character difference will result in a completely different value.
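The base64(hmac-sha1(...)) step above can be sketched in Java with the standard javax.crypto API. The secret key and string-to-sign below are made-up placeholders, not real AWS credentials, and a real client must follow the provider's exact canonicalization rules (header ordering, lower-casing, etc.) or the signatures will not match.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class HmacSigner {

    // base64(hmac-sha1(secretKey, stringToSign)) as in the scheme described above
    public static String sign(String secretKey, String stringToSign) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
        byte[] digest = mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(digest);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical canonical string: VERB, content-md5, content-type, date, resource
        String stringToSign = "PUT\n"
                + "c8fdb181845a4ca6b8fec737b3581d76\n"
                + "text/html\n"
                + "Thu, 17 Nov 2005 18:49:58 GMT\n"
                + "/quotes/nelson";
        // "my-secret-key" and the key id are placeholders for illustration only
        String signature = sign("my-secret-key", stringToSign);
        System.out.println("Authorization: HMAC EXAMPLEKEYID:" + signature);
    }
}
```

Note that only the signature travels over the wire; the secret never does, which is what allows this scheme to work without SSL.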
In cases where all headers within a request are signed, you need to be VERY careful on the server and client side to avoid headers being injected or modified by your libraries (more details below).

As I am currently developing, and indeed rewriting, some of my existing implementations, I thought I would put together a list of tips for library authors:

When writing the API, check your request on the wire to ensure nothing has been changed or “tweaked” by the HTTP library you’re using; mine added a character encoding attribute to the Content-Type.

Test that the order of your headers is correct on dispatch of the request as well; libraries may use a hash map (naturally ordered), which may break your signature depending on the implementation. In the case of Amazon, they require you to sort your “extra” headers alphabetically and lower-case the header names before computing the signature.

Be careful of crazy Ruby libraries that snake-case your header names (yes, this is bad form) before presenting them to your code as the list of header names.

When debugging, print the canonical string used to generate the signature, preferably using something like Ruby’s inspect, which shows ALL characters. This helps both while developing and when comparing against what the server side actually receives.

Observe how various client or server APIs introduce, or indeed remove, headers.

From a security standpoint, a couple of basic recommendations:

Use content MD5 at both ends of the conversation.

Sign all headers which could influence the result of the operation, as a minimum.

Record the headers of every API call that may have side effects; on most web servers this can be enabled and added to the web logs (again, ideally encoded like Ruby’s inspect does).

So in closing, I certainly recommend using HMAC authentication, but be prepared to learn a lot about how HTTP works and a little cryptography; this, in my view, can’t hurt either way if you’re building server-side APIs.
       Reference: What is HMAC Authentication and why is it useful? from our JCG partner Mark Wolfe at the Mark Wolfe’s Blog blog....

Investigating Deadlocks – Part 1

I’m sure we’ve all been there: it’s late, you’re hungry, your server has hung or your application’s running at a snail’s pace, and there’s someone breathing down your neck wanting you to fix the problem before you go. One of the possible causes of your application hanging unexpectedly is a threading issue known as a deadlock. Without going into too much detail, threads can be in one of a number of states, as shown by the UML state diagram below… …and a deadlock is all to do with the BLOCKED state, which the API documentation defines as “a thread that is blocked waiting for a monitor lock”. So, what is a deadlock? Simply put, given two threads A and B, a deadlock occurs when thread A blocks because it’s waiting for a monitor lock held by thread B, while thread B blocks because it’s waiting for a monitor lock held by thread A. However, things can be more complex than this, in that the deadlock can involve a whole chain of threads. For example, thread A blocks because it’s waiting for thread B, thread B blocks because it’s waiting for thread C, thread C blocks because it’s waiting for thread D, D blocks because it’s waiting for E, E blocks because it’s waiting for F, and F blocks because it’s waiting for A. The trick is finding out which threads are blocked and why, and that’s done by taking a thread dump of your application. A thread dump is simply a snapshot report showing the status of all your application’s threads at a given point in time. There are several tools and techniques available to help you get hold of a thread dump, including jVisualVM, jstack and the Unix kill command; however, before obtaining and interpreting a thread dump, I’ll need some code that will create a deadlock. The scenario I’ve chosen for this is a simple bank account transfer: a balance transfer program is running that randomly transfers various amounts between different accounts using a bunch of threads.
In this program, a bank account is represented using the following, very simplistic, Account class:

public class Account {

    private final int number;
    private int balance;

    public Account(int number, int openingBalance) {
        this.number = number;
        this.balance = openingBalance;
    }

    public void withdraw(int amount) throws OverdrawnException {
        if (amount > balance) {
            throw new OverdrawnException();
        }
        balance -= amount;
    }

    public void deposit(int amount) {
        balance += amount;
    }

    public int getNumber() {
        return number;
    }

    public int getBalance() {
        return balance;
    }
}

The above class models a bank account with attributes of account number and balance, and operations such as deposit(...) and withdraw(...). withdraw(...) will throw a simple checked exception, OverdrawnException, if the amount to withdraw is greater than the available balance. The remaining classes in the example code are DeadlockDemo and its nested class BadTransferOperation.

public class DeadlockDemo {

    private static final int NUM_ACCOUNTS = 10;
    private static final int NUM_THREADS = 20;
    private static final int NUM_ITERATIONS = 100000;
    private static final int MAX_COLUMNS = 60;

    static final Random rnd = new Random();

    List<Account> accounts = new ArrayList<Account>();

    public static void main(String args[]) {
        DeadlockDemo demo = new DeadlockDemo();
        demo.setUp();
        demo.run();
    }

    void setUp() {
        for (int i = 0; i < NUM_ACCOUNTS; i++) {
            Account account = new Account(i, rnd.nextInt(1000));
            accounts.add(account);
        }
    }

    void run() {
        for (int i = 0; i < NUM_THREADS; i++) {
            new BadTransferOperation(i).start();
        }
    }

    class BadTransferOperation extends Thread {

        int threadNum;

        BadTransferOperation(int threadNum) {
            this.threadNum = threadNum;
        }

        @Override
        public void run() {
            for (int i = 0; i < NUM_ITERATIONS; i++) {
                Account toAccount = accounts.get(rnd.nextInt(NUM_ACCOUNTS));
                Account fromAccount = accounts.get(rnd.nextInt(NUM_ACCOUNTS));
                int amount = rnd.nextInt(1000);
                if (!toAccount.equals(fromAccount)) {
                    try {
                        transfer(fromAccount, toAccount, amount);
                        System.out.print(".");
                    } catch (OverdrawnException e) {
                        System.out.print("-");
                    }
                    printNewLine(i);
                }
            }
            // This will never get to here...
            System.out.println("Thread Complete: " + threadNum);
        }

        private void printNewLine(int columnNumber) {
            if (columnNumber % MAX_COLUMNS == 0) {
                System.out.print("\n");
            }
        }

        /**
         * The clue to spotting deadlocks is in the nested locking - synchronized keywords. Note that the locks DON'T
         * have to be next to each other to be nested.
         */
        private void transfer(Account fromAccount, Account toAccount, int transferAmount) throws OverdrawnException {
            synchronized (fromAccount) {
                synchronized (toAccount) {
                    fromAccount.withdraw(transferAmount);
                    toAccount.deposit(transferAmount);
                }
            }
        }
    }
}

DeadlockDemo provides the application framework that creates a deadlock. It has two simple tasks: setUp() and run(). setUp() creates 10 accounts, initializing them with an account number and a random opening balance. run() creates 20 instances of the nested class BadTransferOperation, which simply extends Thread, and starts them running. Note that the values used for the number of threads and accounts are totally arbitrary. BadTransferOperation is where all the action takes place. Its run() method loops 100,000 times, randomly choosing two accounts from the accounts list and transferring a random amount of between 0 and 1000 from one to the other. If the fromAccount contains insufficient funds, then an exception is thrown and a ‘-’ printed on the screen. If all goes well and the transfer succeeds, then a ‘.’ is printed on the screen. The heart of the matter is the method transfer(Account fromAccount, Account toAccount, int transferAmount), containing the FAULTY synchronization code:

synchronized (fromAccount) {
    synchronized (toAccount) {
        fromAccount.withdraw(transferAmount);
        toAccount.deposit(transferAmount);
    }
}

This code first locks the fromAccount, then the toAccount, before transferring the cash and subsequently releasing both locks.
Given two threads A and B and accounts 1 and 2, problems will arise when thread A locks its fromAccount, number 1, and tries to lock its toAccount, which is account number 2. Simultaneously, thread B locks its fromAccount, number 2, and tries to lock its toAccount, which is account number 1. Hence thread A is BLOCKED on thread B and thread B is BLOCKED on thread A – a deadlock. If you run this application, then you’ll get some output that looks something like this:…as the program comes to an abrupt halt. Now that I have a deadlocked application, my next blog will actually get hold of a thread dump and take a look at what it all means. Reference: Investigating Deadlocks – Part 1 from our JCG partner Roger Hughes at the Captain Debug’s Blog blog....
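For a taste of where this is heading, one standard cure for this class of bug (a sketch of the general technique, not necessarily the fix presented later in this series) is to impose a global lock order, for instance by account number, so that no two threads can ever hold the two locks in opposite orders:

```java
public class OrderedTransfer {

    // Minimal stand-in for the article's Account class
    static class Account {
        final int number;
        int balance;

        Account(int number, int openingBalance) {
            this.number = number;
            this.balance = openingBalance;
        }
    }

    // Always acquire the lower-numbered account's lock first. With a fixed
    // global order, the circular wait that caused the deadlock cannot form.
    static void transfer(Account fromAccount, Account toAccount, int transferAmount) {
        Account first = fromAccount.number < toAccount.number ? fromAccount : toAccount;
        Account second = (first == fromAccount) ? toAccount : fromAccount;
        synchronized (first) {
            synchronized (second) {
                fromAccount.balance -= transferAmount;
                toAccount.balance += transferAmount;
            }
        }
    }
}
```

Whichever direction the transfer goes, both threads now race for the same lock first, so one of them always wins both locks and completes.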

Locking with a semaphore : An example

Concurrency is one aspect that brings interesting challenges along with it. If not handled correctly, it brings about race conditions that baffle people, because the issues pop up only from time to time and everything works flawlessly the rest of the time. The Java language gives you many ways of handling race conditions when concurrent threads access a common resource. Some include:

Using the volatile keyword
Using classes available in java.util.concurrent and java.util.concurrent.atomic
Synchronized blocks
Using a Semaphore

Of course there might be many more that I am not aware of. For today, the example I want to show you is the one using a Semaphore. This was introduced in JDK 1.5 and provides the developer with the ability to acquire and release locks in a seamless way. The example I will be showing is a hypothetical scenario which I used just to depict what can be achieved using a semaphore, so please do not look too closely at the intrinsic details of the code :). The scenario: there is an in-memory cache holding objects of type ‘Person’. Users can insert and retrieve records using the cache. We are going to control concurrent access to our in-memory cache using semaphores.
Now I do not want to bore you with more text, so let’s get down to business and show some code:

import java.util.concurrent.Semaphore;

/**
 * This class will allow threads to acquire and release locks as required.
 *
 * @author dinuka.arseculeratne
 */
public class PersonLock {

    /**
     * We do not want multiple lock objects lying around, so we make this
     * class a singleton.
     */
    private PersonLock() {
    }

    /**
     * Bill Pugh's way of lazily initializing the singleton instance.
     */
    private static class SingletonHolder {
        public static final PersonLock INSTANCE = new PersonLock();
    }

    /**
     * Use this method to get a reference to the singleton instance of
     * {@link PersonLock}.
     *
     * @return the singleton instance
     */
    public static PersonLock getInstance() {
        return SingletonHolder.INSTANCE;
    }

    /**
     * In this sample we allow only one thread at a time to update the cache
     * in order to maintain consistency.
     */
    private Semaphore writeLock = new Semaphore(1);

    /**
     * We allow 10 concurrent threads to access the cache at any given time.
     */
    private Semaphore readLock = new Semaphore(10);

    public void getWriteLock() throws InterruptedException {
        writeLock.acquire();
    }

    public void releaseWriteLock() {
        writeLock.release();
    }

    public void getReadLock() throws InterruptedException {
        readLock.acquire();
    }

    public void releaseReadLock() {
        readLock.release();
    }
}

This class handles obtaining and releasing the locks required to make our cache thread safe. I have used two separate locks here for reading and writing. The rationale is to allow users to read data even though it might be stale at the time of reading. Note that I have used ten for the read lock, which means ten threads can simultaneously obtain permits and access the cache for read purposes. For the write lock I have used one, which means only one thread can access the cache at a time to put items into it. This is important in order to maintain consistency within the cache. 
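As an aside (not part of the original example): a Semaphore of ten permits merely caps the number of concurrent readers; it does not stop a reader and the writer from running at the same time. The JDK's java.util.concurrent.locks package has a true read/write lock that allows any number of readers but excludes readers while a writer holds the lock. A minimal drop-in sketch for comparison:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical alternative to PersonLock using the JDK's built-in
// read/write lock: unlimited concurrent readers, and mutual exclusion
// between readers and the single writer.
public class PersonRwLock {

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    public void getWriteLock()     { lock.writeLock().lock(); }
    public void releaseWriteLock() { lock.writeLock().unlock(); }
    public void getReadLock()      { lock.readLock().lock(); }
    public void releaseReadLock()  { lock.readLock().unlock(); }
}
```

Which of the two fits better depends on whether you actually want to serve slightly stale reads while a write is in progress (the semaphore approach above) or want readers to always see a fully consistent cache (the read/write lock).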
That is, I do not want multiple threads trying to insert items into the map, which would result in unpredictable behaviour (at least in some instances). There are two main ways to acquire a lock using a semaphore:

1. acquire(): a blocking call which waits until a permit becomes available or the thread is interrupted.
2. tryAcquire(): a non-blocking call which returns immediately with true or false, signifying whether the lock was obtained or not.

Here I have used the blocking acquire() call because I want the thread to wait until the lock is available. Of course this will depend on your use case. You can also define a timeout period in the tryAcquire() method so that the thread will not wait indefinitely for a lock. The storage class below shows how I have used the lock class to insert and read data within the cache.

import java.util.HashMap;
import java.util.Map;

/**
 * A mock storage to hold the person objects in a map.
 *
 * @author dinuka.arseculeratne
 */
public class PersonStorage {

    private Map<Integer, Person> personCache = new HashMap<Integer, Person>();

    private int counter = 0;

    /**
     * This class is made singleton and hence the constructor is made private.
     */
    private PersonStorage() {
    }

    /**
     * Bill Pugh's way of lazily initializing the singleton instance.
     */
    private static final class SingletonHolder {
        public static final PersonStorage INSTANCE = new PersonStorage();
    }

    /**
     * Use this method to get a reference to the singleton instance of
     * {@link PersonStorage}.
     *
     * @return the singleton instance
     */
    public static PersonStorage getInstance() {
        return SingletonHolder.INSTANCE;
    }

    /**
     * Inserts the person into the map. Note that we use defensive copying so
     * that even if the client changes the object later on, those changes will
     * not be reflected in the object within the map.
     *
     * @param person the instance of {@link Person} to be inserted
     * @return the key which signifies the location of the person object
     * @throws InterruptedException
     */
    public int putPerson(Person person) throws InterruptedException {
        Person copyPerson = person.copyPerson();
        personCache.put(++counter, copyPerson);
        return counter;
    }

    /**
     * Here as well we use defensive copying so that the object reference
     * within the map is not passed to the calling party.
     *
     * @param id the id representing the location of the object within the map
     * @return the instance of the {@link Person} represented by the key passed in
     * @throws InterruptedException
     */
    public Person retrievePerson(int id) throws InterruptedException {
        PersonLock.getInstance().getReadLock();
        try {
            if (!personCache.containsKey(id)) {
                throw new RuntimeException("Key is not found");
            }
            return personCache.get(id).copyPerson();
        } finally {
            // Release in a finally block so the permit is not leaked
            // when the key is missing.
            PersonLock.getInstance().releaseReadLock();
        }
    }
}

Obviously the code will work without the locks as well, but the application would then be inconsistent and could produce different results on each run. This is not something you want your application to do, and with the locks you guarantee that your application works consistently. Lastly, a small test class shows the locking at work; note that we obtain the lock before calling the putPerson() method and release it within the finally block in order to guarantee the release of the lock.             
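Before moving on to the test class, here is a small hypothetical sketch of the tryAcquire() variant with a timeout mentioned above: the writer gives up if the permit does not become available in time, instead of blocking indefinitely like acquire() does (class and method names are mine, not from the post):

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class TimedWriteDemo {

    // One permit: only a single writer at a time, as in PersonLock.
    static final Semaphore writeLock = new Semaphore(1);

    /**
     * Tries to take the write permit for up to 500 ms; returns false
     * if the permit could not be obtained within the timeout.
     */
    public static boolean updateWithTimeout() throws InterruptedException {
        if (!writeLock.tryAcquire(500, TimeUnit.MILLISECONDS)) {
            return false; // give up rather than wait forever
        }
        try {
            // ... mutate the shared cache here ...
            return true;
        } finally {
            writeLock.release();
        }
    }
}
```

This pattern is useful when a caller can do something sensible on failure, such as retrying later or reporting that the system is busy.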
/**
 * A test class to demonstrate the locking at work.
 *
 * @author dinuka.arseculeratne
 */
public class TestLock {

    public static void main(String[] args) throws InterruptedException {

        Thread t1 = new Thread(new Runnable() {
            @Override
            public void run() {
                Person p1 = new Person(1L, "Test1", "XYZ");
                try {
                    PersonLock.getInstance().getWriteLock();
                    PersonStorage.getInstance().putPerson(p1);
                } catch (InterruptedException e) {
                    // Exception handling needs to be done
                    e.printStackTrace();
                } finally {
                    PersonLock.getInstance().releaseWriteLock();
                }
            }
        });

        Thread t2 = new Thread(new Runnable() {
            @Override
            public void run() {
                Person p2 = new Person(2L, "Test123", "ABC");
                try {
                    PersonLock.getInstance().getWriteLock();
                    PersonStorage.getInstance().putPerson(p2);
                } catch (InterruptedException e) {
                    // Exception handling needs to be done
                } finally {
                    PersonLock.getInstance().releaseWriteLock();
                }
            }
        });

        t1.start();
        t2.start();

        // Wait for both writers to finish before reading.
        t1.join();
        t2.join();

        System.out.println(PersonStorage.getInstance().retrievePerson(2));
    }
}

That concludes my short introduction to using semaphores to make your code thread safe. For anyone who wants to play around with the code, you can obtain it from here. Try removing the locks in the storage class and see how it behaves on each run; you will see possible race conditions taking place. Reference: Locking with a semaphore : An example from our JCG partner Dinuka Arseculeratne at the My Journey Through IT blog....

Sending Event Invitations With Seam

These days one of my colleagues had problems sending event invitations using mail templates with Seam (version 2.x). Basically this should not be a hard task, so I will briefly explain what needs to be done to send an event invitation using Seam mail templates. When you send a mail invitation, you need to send an email with an attachment which contains information about the particular event. I will create a simple template and a sender class which will be used for sending the invitation. Seam 2.x includes additional components which are responsible for sending mails and creating templates. To use these features we need to include the Seam mail components in the application; with Maven we can do it like this:

<dependency>
    <groupId>org.jboss.seam</groupId>
    <artifactId>jboss-seam-mail</artifactId>
</dependency>

The Seam templating mechanism allows us to create a mail template the same way we create standard JSP pages. It is easy and simple to learn, and you can also use standard JSP tags, or JSF if you use it. In this example I will not go deeper into the usage of the Seam mail templating mechanism. Below you can find a simple example of a template used for sending an invitation. 
<!DOCTYPE composition PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<m:message xmlns="http://www.w3.org/1999/xhtml"
           xmlns:m="http://jboss.com/products/seam/mail"
           xmlns:h="http://java.sun.com/jsf/html">

    <m:header name="Content-Class" value="urn:content-classes:calendarmessage"/>
    <m:from name="Test Mail" address="no-reply-mail@invitation.example" />
    <m:to name="Igor Madjeric">#{eventInvitation.recipient}</m:to>

    <m:subject>
        <h:outputText value="Test invitation" />
    </m:subject>

    <m:body>
        <m:attachment contentType="text/calendar;method=CANCEL" fileName="invitation.ics">
BEGIN:VCALENDAR
METHOD:REQUEST
PRODID:-//Direct Scouts GmbH//INA//DE
VERSION:2.0
CALSCALE:GREGORIAN
BEGIN:VEVENT
DTSTAMP:#{eventInvitation.currentDateAsString}
DTSTART:#{eventInvitation.startAsString}
DTEND:#{eventInvitation.endAsString}
SUMMARY;CHARSET=UTF-8:Test invitation
UID:de827ded-5fc8-4ceb-af1b-b8d9cfbcbca8
ATTENDEE;ROLE=OWNER;PARTSTAT=NEEDS-ACTION;RSVP=FALSE:MAILTO:#{eventInvitation.recipient}
ORGANIZER:MAILTO:xxx@gmail.com
LOCATION;CHARSET=UTF-8:#{eventInvitation.location}
DESCRIPTION;CHARSET=UTF-8:#{eventInvitation.description}
SEQUENCE:0
PRIORITY:5
CLASS:PUBLIC
STATUS:CONFIRMED
TRANSP:OPAQUE
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:REMINDER
TRIGGER;RELATED=START:-PT00H15M00S
END:VALARM
END:VEVENT
END:VCALENDAR
        </m:attachment>
    </m:body>
</m:message>

As you can see, it is not complicated; it is like making a JSP page. When you create an invitation, you need to pay attention to the UID: it is the unique identifier for the event for which you create the invitation, so if you later need to change something about that event you just use the same UID. For this example I’ve created an EventInvitation model class which contains the data needed for the event. It does not contain a lot of data, but you can extend it if you need more. 
package ba.codecentric.mail.sender.model;

import java.text.SimpleDateFormat;
import java.util.Date;

import org.jboss.seam.ScopeType;
import org.jboss.seam.annotations.Name;
import org.jboss.seam.annotations.Scope;

@Name("eventInvitation")
@Scope(ScopeType.PAGE)
public class EventInvitation {

    SimpleDateFormat iCalendarDateFormat = new SimpleDateFormat("yyyyMMdd'T'HHmm'00'");

    private String recipient;
    private String location;
    private String description;

    /* Start and end dates */
    private Date start;
    private Date end;

    public String getRecipient() { return recipient; }
    public void setRecipient(String recipient) { this.recipient = recipient; }
    public String getLocation() { return location; }
    public void setLocation(String location) { this.location = location; }
    public String getDescription() { return description; }
    public void setDescription(String description) { this.description = description; }
    public String getStartAsString() { return iCalendarDateFormat.format(start); }
    public String getEndAsString() { return iCalendarDateFormat.format(end); }
    public Date getStart() { return start; }
    public void setStart(Date start) { this.start = start; }
    public Date getEnd() { return end; }
    public void setEnd(Date end) { this.end = end; }
    public String getCurrentDateAsString() { return iCalendarDateFormat.format(new Date()); }

    @Override
    public String toString() {
        return "EventInvitation [recipient=" + recipient + ", location=" + location
            + ", description=" + description + ", start=" + start + ", end=" + end + "]";
    }
}

It is a simple Seam component with page scope, so it lives as long as the page. From the template you can see that we use the ...AsString methods for the date values. That is because we cannot simply use a raw Date to represent a date in the invitation; instead we format the dates using the pattern "yyyyMMdd'T'HHmm'00'". 
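To make the date handling concrete, here is a tiny standalone illustration (mine, not from the post) of what the "yyyyMMdd'T'HHmm'00'" pattern produces for a fixed calendar date:

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;

public class ICalDateFormatDemo {

    // Formats a date the way the EventInvitation getters above do:
    // year-month-day, a literal 'T', hour and minute, and a literal "00"
    // standing in for the seconds.
    public static String toICalDate(Calendar cal) {
        return new SimpleDateFormat("yyyyMMdd'T'HHmm'00'").format(cal.getTime());
    }

    public static void main(String[] args) {
        Calendar cal = Calendar.getInstance();
        cal.set(2012, Calendar.OCTOBER, 5, 9, 30, 0);
        System.out.println(toICalDate(cal)); // prints 20121005T093000
    }
}
```

That matches the DTSTART/DTEND/DTSTAMP syntax expected inside the VCALENDAR attachment shown earlier.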
For filling in the dates I’ve used the following simple form:

<!DOCTYPE composition PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<ui:composition xmlns="http://www.w3.org/1999/xhtml"
                xmlns:ui="http://java.sun.com/jsf/facelets"
                xmlns:h="http://java.sun.com/jsf/html"
                xmlns:s="http://jboss.com/products/seam/taglib"
                xmlns:rich="http://richfaces.org/rich"
                xmlns:a4j="http://richfaces.org/a4j"
                xmlns:f="http://java.sun.com/jsf/core"
                template="/includes/template.xhtml">

    <!-- main content -->
    <ui:define name="MainContent">
        <div class="WelcomeContent">
            <a4j:form>
                <rich:panel header="Welcome To Seam Mail Invitation Sender" style="width: 315px">
                    Start:<rich:calendar value="#{eventInvitation.start}" popup="true"
                        datePattern="dd/M/yy hh:mm a" showApplyButton="true"
                        cellWidth="24px" cellHeight="22px" style="width:200px"/>
                    <br />
                    End:<rich:calendar value="#{eventInvitation.end}" popup="true"
                        datePattern="dd/M/yy hh:mm a" showApplyButton="true"
                        cellWidth="24px" cellHeight="22px" style="width:200px"/>
                    <br />
                    Location:<h:inputText value="#{eventInvitation.location}" id="location"/>
                    <br />
                    Description:<h:inputText value="#{eventInvitation.description}" id="description"/>
                    <br />
                    Recipient:<h:inputText value="#{eventInvitation.recipient}" id="recipient"/>
                    <a4j:commandButton value="Send Invitation"
                        action="#{mailInvitationSender.sendInvitation}" reRender="info" />
                    <h:panelGroup id="info">
                        <h:outputText value="Status: #{mailInvitationSender.status} "
                            rendered="#{not empty mailInvitationSender.status}" />
                    </h:panelGroup>
                </rich:panel>
            </a4j:form>
        </div>
    </ui:define>
</ui:composition>

Nothing complicated, just a simple page for filling in the data. And at the end we will take a look at the sender class. 
package ba.codecentric.mail.sender.controller.impl;

import javax.ejb.Remove;
import javax.ejb.Stateful;

import org.jboss.seam.ScopeType;
import org.jboss.seam.annotations.In;
import org.jboss.seam.annotations.Logger;
import org.jboss.seam.annotations.Name;
import org.jboss.seam.annotations.Scope;
import org.jboss.seam.faces.Renderer;
import org.jboss.seam.log.Log;

import ba.codecentric.mail.sender.controller.LocalMailInvitationSender;
import ba.codecentric.mail.sender.model.EventInvitation;

@Name("mailInvitationSender")
@Scope(ScopeType.CONVERSATION)
@Stateful
public class StandardMailInvitationSender implements LocalMailInvitationSender {

    private static final String STATUS_SUCCESS = "SUCCESS";
    private static final String STATUS_FAIL = "FAIL";

    private static String INVITATION_TEMPLATE = "/invitation.xhtml";

    @Logger
    private static Log LOG;

    // Component used for rendering the template.
    @In(create = true)
    private Renderer renderer;

    @In
    private EventInvitation eventInvitation;

    private String status;

    public String getStatus() { return status; }

    public void setStatus(String status) { this.status = status; }

    @Override
    public void sendInvitation() {
        LOG.info("Send invitation method is called!");
        try {
            LOG.debug(eventInvitation);
            renderer.render(INVITATION_TEMPLATE);
            status = STATUS_SUCCESS;
        } catch (Exception e) {
            LOG.error(e);
            status = STATUS_FAIL;
        }
        LOG.info("Invitation sending:" + status);
    }

    @Remove
    public void done() {
        LOG.debug("Bean removed!");
    }
}

This is a simple class which uses the renderer to create the mail based on the template, so there is nothing special about it. Of course, you need to configure a mail session in components.xml, but that is a simple configuration. You need to add the following line to components.xml:

<mail:mail-session session-jndi-name="java:/Mail" />

And that’s all. Your application is ready for sending invitations :). Note: the line above in components.xml creates a mail session component which will be used by Seam for sending mails. 
If you use JBoss 4.x, for example, you may edit the configuration in the "mail-service.xml" file. But how to configure the mail session is out of the scope of this post; if you need more information about this topic you can check one of my older posts, Configure Seam Mail. Reference: Sending Event Invitations With Seam from our JCG partner Igor Madjeric at the Igor Madjeric blog....

Compact Off-Heap Structures/Tuples In Java

In my last post I detailed the implications of the access patterns your code takes to main memory. Since then I’ve had a lot of questions about what can be done in Java to enable more predictable memory layout. There are patterns that can be applied using array backed structures, which I will discuss in another post. This post will explore how to simulate a feature sorely missing in Java – arrays of structures similar to what C has to offer. Structures are very useful, both on the stack and the heap. To my knowledge it is not possible to simulate this feature on the Java stack, which is such a shame because it greatly limits the performance of some parallel algorithms; however, that is a rant for another day. In Java, all user defined types have to exist on the heap. The Java heap is managed by the garbage collector in the general case; however, there is more to the wider heap in a Java process. With the introduction of direct ByteBuffer, memory can be allocated which is not tracked by the garbage collector, because it can be made available to native code for tasks like avoiding the copying of data to and from the kernel for IO. So one reasonable approach to managing structures is to fake them within a ByteBuffer. This allows compact data representations, but has performance and size limitations. For example, it is not possible to have a ByteBuffer greater than 2GB, and all access is bounds checked, which impacts performance. An alternative exists using Unsafe that is both faster and not size constrained like ByteBuffer. The approach I’m about to detail is not traditional Java. If your problem space is dealing with big data, or extreme performance, then there are benefits to be had. If your data sets are small, and performance is not an issue, then run away now to avoid getting sucked into the dark arts of native memory management. 
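The ByteBuffer route mentioned above can be sketched roughly like this (a deliberately tiny two-field record of my own invention, not the trade structure used later in this post): fixed-size records are laid out at computed byte offsets inside a single direct buffer, so the data lives outside the GC-managed heap.

```java
import java.nio.ByteBuffer;

public class ByteBufferRecords {

    // Two long fields per record: an id and a price. 8 + 8 bytes.
    private static final int RECORD_SIZE = 8 + 8;

    private final ByteBuffer buffer;

    public ByteBufferRecords(int numRecords) {
        // Direct buffer: allocated outside the GC-tracked heap,
        // zero-initialised, capped at 2GB by the int capacity.
        buffer = ByteBuffer.allocateDirect(numRecords * RECORD_SIZE);
    }

    public void setId(int index, long id)       { buffer.putLong(index * RECORD_SIZE, id); }
    public long getId(int index)                { return buffer.getLong(index * RECORD_SIZE); }
    public void setPrice(int index, long price) { buffer.putLong(index * RECORD_SIZE + 8, price); }
    public long getPrice(int index)             { return buffer.getLong(index * RECORD_SIZE + 8); }
}
```

Every absolute get/put above is bounds checked by the buffer, which is exactly the overhead the Unsafe approach below trades away.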
The benefits of the approach I’m about to detail are:

- significantly improved performance;
- a more compact data representation;
- the ability to work with very large data sets while avoiding nasty GC pauses [1].

With all choices there are consequences. By taking the approach detailed below you take responsibility for some of the memory management yourself. Getting it wrong can lead to memory leaks or, worse, you can crash the JVM! Proceed with caution…

Suitable Example – Trade Data

A common challenge faced in finance applications is capturing and working with very large volumes of order and trade data. For the example I will create a large table of in-memory trade data that can have analysis queries run against it. This table will be built using two contrasting approaches. Firstly, I’ll take the traditional Java approach of creating a large array referencing individual Trade objects. Secondly, I’ll keep the usage code identical but replace the large array and Trade objects with an off-heap array of structures that can be manipulated via a Flyweight pattern. If for the traditional Java approach I used some other data structure, such as a Map or Tree, the memory footprint would be even greater and the performance lower. 
Traditional Java Approach

public class TestJavaMemoryLayout {

    private static final int NUM_RECORDS = 50 * 1000 * 1000;

    private static JavaMemoryTrade[] trades;

    public static void main(final String[] args) {
        for (int i = 0; i < 5; i++) {
            System.gc();
            perfRun(i);
        }
    }

    private static void perfRun(final int runNum) {
        long start = System.currentTimeMillis();

        init();

        System.out.format("Memory %,d total, %,d free\n",
                          Runtime.getRuntime().totalMemory(),
                          Runtime.getRuntime().freeMemory());

        long buyCost = 0;
        long sellCost = 0;

        for (int i = 0; i < NUM_RECORDS; i++) {
            final JavaMemoryTrade trade = get(i);

            if (trade.getSide() == 'B') {
                buyCost += (trade.getPrice() * trade.getQuantity());
            } else {
                sellCost += (trade.getPrice() * trade.getQuantity());
            }
        }

        long duration = System.currentTimeMillis() - start;
        System.out.println(runNum + " - duration " + duration + "ms");
        System.out.println("buyCost = " + buyCost + " sellCost = " + sellCost);
    }

    private static JavaMemoryTrade get(final int index) {
        return trades[index];
    }

    public static void init() {
        trades = new JavaMemoryTrade[NUM_RECORDS];

        final byte[] londonStockExchange = {'X', 'L', 'O', 'N'};
        final int venueCode = pack(londonStockExchange);

        final byte[] billiton = {'B', 'H', 'P'};
        final int instrumentCode = pack(billiton);

        for (int i = 0; i < NUM_RECORDS; i++) {
            JavaMemoryTrade trade = new JavaMemoryTrade();
            trades[i] = trade;

            trade.setTradeId(i);
            trade.setClientId(1);
            trade.setVenueCode(venueCode);
            trade.setInstrumentCode(instrumentCode);

            trade.setPrice(i);
            trade.setQuantity(i);

            trade.setSide((i & 1) == 0 ? 'B' : 'S');
        }
    }

    private static int pack(final byte[] value) {
        int result = 0;
        switch (value.length) {
            case 4: result = (value[3]);
            case 3: result |= ((int)value[2] << 8);
            case 2: result |= ((int)value[1] << 16);
            case 1: result |= ((int)value[0] << 24);
                break;

            default:
                throw new IllegalArgumentException("Invalid array size");
        }

        return result;
    }

    private static class JavaMemoryTrade {
        private long tradeId;
        private long clientId;
        private int venueCode;
        private int instrumentCode;
        private long price;
        private long quantity;
        private char side;

        public long getTradeId() { return tradeId; }
        public void setTradeId(final long tradeId) { this.tradeId = tradeId; }
        public long getClientId() { return clientId; }
        public void setClientId(final long clientId) { this.clientId = clientId; }
        public int getVenueCode() { return venueCode; }
        public void setVenueCode(final int venueCode) { this.venueCode = venueCode; }
        public int getInstrumentCode() { return instrumentCode; }
        public void setInstrumentCode(final int instrumentCode) { this.instrumentCode = instrumentCode; }
        public long getPrice() { return price; }
        public void setPrice(final long price) { this.price = price; }
        public long getQuantity() { return quantity; }
        public void setQuantity(final long quantity) { this.quantity = quantity; }
        public char getSide() { return side; }
        public void setSide(final char side) { this.side = side; }
    }
}

Compact Off-Heap Structures

import sun.misc.Unsafe;

import java.lang.reflect.Field;

public class TestDirectMemoryLayout {

    private static final Unsafe unsafe;
    static {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            unsafe = (Unsafe)field.get(null);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private static final int NUM_RECORDS = 50 * 1000 * 1000;

    private static long address;
    private static final DirectMemoryTrade flyweight = new DirectMemoryTrade();

    public static void main(final String[] args) {
        for (int i = 0; i < 5; i++) {
            System.gc();
            perfRun(i);
        }
    }

    private static void perfRun(final int runNum) {
        long start = System.currentTimeMillis();

        init();

        System.out.format("Memory %,d total, %,d free\n",
                          Runtime.getRuntime().totalMemory(),
                          Runtime.getRuntime().freeMemory());

        long buyCost = 0;
        long sellCost = 0;

        for (int i = 0; i < NUM_RECORDS; i++) {
            final DirectMemoryTrade trade = get(i);

            if (trade.getSide() == 'B') {
                buyCost += (trade.getPrice() * trade.getQuantity());
            } else {
                sellCost += (trade.getPrice() * trade.getQuantity());
            }
        }

        long duration = System.currentTimeMillis() - start;
        System.out.println(runNum + " - duration " + duration + "ms");
        System.out.println("buyCost = " + buyCost + " sellCost = " + sellCost);

        destroy();
    }

    private static DirectMemoryTrade get(final int index) {
        final long offset = address + (index * DirectMemoryTrade.getObjectSize());
        flyweight.setObjectOffset(offset);
        return flyweight;
    }

    public static void init() {
        final long requiredHeap = NUM_RECORDS * DirectMemoryTrade.getObjectSize();
        address = unsafe.allocateMemory(requiredHeap);

        final byte[] londonStockExchange = {'X', 'L', 'O', 'N'};
        final int venueCode = pack(londonStockExchange);

        final byte[] billiton = {'B', 'H', 'P'};
        final int instrumentCode = pack(billiton);

        for (int i = 0; i < NUM_RECORDS; i++) {
            DirectMemoryTrade trade = get(i);

            trade.setTradeId(i);
            trade.setClientId(1);
            trade.setVenueCode(venueCode);
            trade.setInstrumentCode(instrumentCode);

            trade.setPrice(i);
            trade.setQuantity(i);

            trade.setSide((i & 1) == 0 ? 'B' : 'S');
        }
    }

    private static void destroy() {
        unsafe.freeMemory(address);
    }

    private static int pack(final byte[] value) {
        int result = 0;
        switch (value.length) {
            case 4: result |= (value[3]);
            case 3: result |= ((int)value[2] << 8);
            case 2: result |= ((int)value[1] << 16);
            case 1: result |= ((int)value[0] << 24);
                break;

            default:
                throw new IllegalArgumentException("Invalid array size");
        }

        return result;
    }

    private static class DirectMemoryTrade {
        private static long offset = 0;

        private static final long tradeIdOffset = offset += 0;
        private static final long clientIdOffset = offset += 8;
        private static final long venueCodeOffset = offset += 8;
        private static final long instrumentCodeOffset = offset += 4;
        private static final long priceOffset = offset += 4;
        private static final long quantityOffset = offset += 8;
        private static final long sideOffset = offset += 8;

        private static final long objectSize = offset += 2;

        private long objectOffset;

        public static long getObjectSize() { return objectSize; }

        void setObjectOffset(final long objectOffset) { this.objectOffset = objectOffset; }

        public long getTradeId() { return unsafe.getLong(objectOffset + tradeIdOffset); }
        public void setTradeId(final long tradeId) { unsafe.putLong(objectOffset + tradeIdOffset, tradeId); }
        public long getClientId() { return unsafe.getLong(objectOffset + clientIdOffset); }
        public void setClientId(final long clientId) { unsafe.putLong(objectOffset + clientIdOffset, clientId); }
        public int getVenueCode() { return unsafe.getInt(objectOffset + venueCodeOffset); }
        public void setVenueCode(final int venueCode) { unsafe.putInt(objectOffset + venueCodeOffset, venueCode); }
        public int getInstrumentCode() { return unsafe.getInt(objectOffset + instrumentCodeOffset); }
        public void setInstrumentCode(final int instrumentCode) { unsafe.putInt(objectOffset + instrumentCodeOffset, instrumentCode); }
        public long getPrice() { return unsafe.getLong(objectOffset + priceOffset); }
        public void setPrice(final long price) { unsafe.putLong(objectOffset + priceOffset, price); }
        public long getQuantity() { return unsafe.getLong(objectOffset + quantityOffset); }
        public void setQuantity(final long quantity) { unsafe.putLong(objectOffset + quantityOffset, quantity); }
        public char getSide() { return unsafe.getChar(objectOffset + sideOffset); }
        public void setSide(final char side) { unsafe.putChar(objectOffset + sideOffset, side); }
    }
}

Results

Intel i7-860 @ 2.8GHz, 8GB RAM DDR3 1333MHz, Windows 7 64-bit, Java 1.7.0_07
=============================================
java -server -Xms4g -Xmx4g TestJavaMemoryLayout
Memory 4,116,054,016 total, 1,108,901,104 free
0 - duration 19334ms
Memory 4,116,054,016 total, 1,109,964,752 free
1 - duration 14295ms
Memory 4,116,054,016 total, 1,108,455,504 free
2 - duration 14272ms
Memory 3,817,799,680 total, 815,308,600 free
3 - duration 28358ms
Memory 3,817,799,680 total, 810,552,816 free
4 - duration 32487ms

java -server TestDirectMemoryLayout
Memory 128,647,168 total, 126,391,384 free
0 - duration 983ms
Memory 128,647,168 total, 126,992,160 free
1 - duration 958ms
Memory 128,647,168 total, 127,663,408 free
2 - duration 873ms
Memory 128,647,168 total, 127,663,408 free
3 - duration 886ms
Memory 128,647,168 total, 127,663,408 free
4 - duration 884ms

Intel i7-2760QM @ 2.40GHz, 8GB RAM DDR3 1600MHz, Linux 3.4.11 kernel 64-bit, Java 1.7.0_07
=================================================
java -server -Xms4g -Xmx4g TestJavaMemoryLayout
Memory 4,116,054,016 total, 1,108,912,960 free
0 - duration 12262ms
Memory 4,116,054,016 total, 1,109,962,832 free
1 - duration 9822ms
Memory 4,116,054,016 total, 1,108,458,720 free
2 - duration 10239ms
Memory 3,817,799,680 total, 815,307,640 free
3 - duration 21558ms
Memory 3,817,799,680 total, 810,551,856 free
4 - duration 23074ms

java -server TestDirectMemoryLayout
Memory 123,994,112 total, 121,818,528 free
0 - duration 634ms
Memory 123,994,112 total, 122,455,944 free
1 - duration 619ms
Memory 123,994,112 total, 123,103,320 free
2 -
duration 546ms
Memory 123,994,112 total, 123,103,320 free
3 - duration 547ms
Memory 123,994,112 total, 123,103,320 free
4 - duration 534ms

Analysis

Let’s compare the results to the three benefits promised above.

1. Significantly improved performance

The evidence here is pretty clear cut. Using the off-heap structures approach is more than an order of magnitude faster. At the most extreme, look at the 5th run on a Sandy Bridge processor: we have a 43.2 times difference in duration to complete the task. It is also a nice illustration of how well Sandy Bridge does with predictable access patterns to data. Not only is the performance significantly better, it is also more consistent. As the heap becomes fragmented, and thus access patterns become more random, the performance degrades, as can be seen in the later runs with the standard Java approach.

2. More compact data representation

For our off-heap representation each object requires 42 bytes. To store 50 million of these, as in the example, we require 2,100,000,000 bytes. The memory required by the JVM heap is:

memory required = total memory - free memory - base JVM needs
2,883,248,712 = 3,817,799,680 - 810,551,856 - 123,999,112

This implies the JVM needs ~40% more memory to represent the same data. The reason for this overhead is the array of references to the Java objects plus the object headers. In a previous post I discussed object layout in Java. When working with very large data sets this overhead can become a significant limiting factor.

3. Ability to work with very large data sets while avoiding nasty GC pauses

The sample code above forces a GC cycle before each run, which can improve the consistency of the results in some cases. Feel free to remove the call to System.gc() and observe the implications for yourself. If you run the tests with the following command line arguments then the garbage collector will output in painful detail what happened. 
-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintHeapAtGC -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime -XX:+PrintSafepointStatistics

From analysing the output I can see that the application underwent a total of 29 GC cycles. The pause times are listed below, extracted from the lines of the output indicating when the application threads were stopped.

With System.gc() before each run
================================
Total time for which application threads were stopped: 0.0085280 seconds
Total time for which application threads were stopped: 0.7280530 seconds
Total time for which application threads were stopped: 8.1703460 seconds
Total time for which application threads were stopped: 5.6112210 seconds
Total time for which application threads were stopped: 1.2531370 seconds
Total time for which application threads were stopped: 7.6392250 seconds
Total time for which application threads were stopped: 5.7847050 seconds
Total time for which application threads were stopped: 1.3070470 seconds
Total time for which application threads were stopped: 8.2520880 seconds
Total time for which application threads were stopped: 6.0949910 seconds
Total time for which application threads were stopped: 1.3988480 seconds
Total time for which application threads were stopped: 8.1793240 seconds
Total time for which application threads were stopped: 6.4138720 seconds
Total time for which application threads were stopped: 4.4991670 seconds
Total time for which application threads were stopped: 4.5612290 seconds
Total time for which application threads were stopped: 0.3598490 seconds
Total time for which application threads were stopped: 0.7111000 seconds
Total time for which application threads were stopped: 1.4426750 seconds
Total time for which application threads were stopped: 1.5931500 seconds
Total time for which application threads were stopped: 10.9484920 seconds
Total time for which application threads were stopped: 7.0707230 seconds

Without System.gc() before each run
===================================
Test run times
0 - duration 12120ms
1 - duration 9439ms
2 - duration 9844ms
3 - duration 20933ms
4 - duration 23041ms

Total time for which application threads were stopped: 0.0170860 seconds
Total time for which application threads were stopped: 0.7915350 seconds
Total time for which application threads were stopped: 10.7153320 seconds
Total time for which application threads were stopped: 5.6234650 seconds
Total time for which application threads were stopped: 1.2689950 seconds
Total time for which application threads were stopped: 7.6238170 seconds
Total time for which application threads were stopped: 6.0114540 seconds
Total time for which application threads were stopped: 1.2990070 seconds
Total time for which application threads were stopped: 7.9918480 seconds
Total time for which application threads were stopped: 5.9997920 seconds
Total time for which application threads were stopped: 1.3430040 seconds
Total time for which application threads were stopped: 8.0759940 seconds
Total time for which application threads were stopped: 6.3980610 seconds
Total time for which application threads were stopped: 4.5572100 seconds
Total time for which application threads were stopped: 4.6193830 seconds
Total time for which application threads were stopped: 0.3877930 seconds
Total time for which application threads were stopped: 0.7429270 seconds
Total time for which application threads were stopped: 1.5248070 seconds
Total time for which application threads were stopped: 1.5312130 seconds
Total time for which application threads were stopped: 10.9120250 seconds
Total time for which application threads were stopped: 7.3528590 seconds

It can be seen from the output that a significant proportion of the time is spent in the garbage collector. When your threads are stopped, your application is not responsive. These tests were done with the default GC settings. 
It is possible to tune the GC for better results, but this can require significant skill and effort. The only JVM I know of that copes well, by not imposing long pause times even under high-throughput conditions, is Azul's concurrent compacting collector [1].

When profiling this application, I can see that the majority of the time is spent allocating the objects and promoting them to the old generation because they do not fit in the young generation. The initialisation costs could be excluded from the timing, but that would not be realistic: with the traditional Java approach the state needs to be built up before the query can take place, so the end user of an application has to wait for the state to be built up and the query executed. This test is really quite trivial. Imagine working with similar data sets, but at the 100GB scale.

Note: when the garbage collector compacts a region, objects that were next to each other can be moved far apart. This can result in TLB and other cache misses.

Side Note On Serialization

A huge benefit of using off-heap structures in this manner is how easily they can be serialised to network or storage by a simple memory copy, as I have shown in the previous post. This way we can completely bypass intermediate buffer and object allocation.

Conclusion

If you are willing to do some C-style programming for large datasets, it is possible to control the memory layout in Java by going off-heap. If you do, the benefits in performance, compactness, and avoiding GC issues are significant. However, this is an approach that should not be used for all applications. Its benefits are only noticeable for very large datasets, or at the extremes of performance in throughput and/or latency.

I hope the Java community can collectively realise the importance of supporting structures both on the heap and the stack. John Rose has done some excellent work in this area, defining how tuples could be added to the JVM.
His talk on Arrays 2.0 from the JVM Language Summit this year is well worth a watch; in it, John discusses options for arrays of structures and structures of arrays. If tuples, as proposed by John, were available, then the test described here could achieve comparable performance with a more pleasant programming style: the whole array of structures could be allocated in a single action, bypassing the copy of individual objects across generations, and it would be stored in a compact, contiguous fashion. This would remove the significant GC issues for this class of problem.

Lately, I have been comparing standard data structures between Java and .Net. In some cases I observed a 6-10X performance advantage to .Net for things like maps and dictionaries when .Net used native structure support. Let's get this into Java as soon as possible!

It is also pretty obvious from the results that if we are to use Java for real-time analysis on big data, then our standard garbage collectors need to improve significantly and support truly concurrent operations.

[1] – To my knowledge, the only JVM that deals well with very large heaps is Azul Zing.

Happy coding and don't forget to share!

Reference: Compact Off-Heap Structures/Tuples In Java from our JCG partner Martin Thompson at the Mechanical Sympathy blog.
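As an illustration of the off-heap technique discussed above, here is a minimal flyweight sketch. It is not the code from the original post (which lays records out with sun.misc.Unsafe); a direct ByteBuffer is used here as a portable approximation, and the record fields and offsets are hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// One flyweight is reused over a contiguous off-heap block of fixed-size
// records, so the GC never sees a per-record object.
public final class TradeFlyweight
{
    private static final int ID_OFFSET = 0;        // long
    private static final int PRICE_OFFSET = 8;     // long
    private static final int QUANTITY_OFFSET = 16; // int
    public static final int RECORD_SIZE = 24;      // padded to an 8-byte multiple

    private final ByteBuffer buffer;
    private int recordBase;

    public TradeFlyweight(final int numRecords)
    {
        buffer = ByteBuffer.allocateDirect(numRecords * RECORD_SIZE)
                           .order(ByteOrder.nativeOrder());
    }

    // Point the flyweight at record i; no allocation occurs per record.
    public TradeFlyweight index(final int i)
    {
        recordBase = i * RECORD_SIZE;
        return this;
    }

    public long getId()                   { return buffer.getLong(recordBase + ID_OFFSET); }
    public void setId(final long v)       { buffer.putLong(recordBase + ID_OFFSET, v); }

    public long getPrice()                { return buffer.getLong(recordBase + PRICE_OFFSET); }
    public void setPrice(final long v)    { buffer.putLong(recordBase + PRICE_OFFSET, v); }

    public int getQuantity()              { return buffer.getInt(recordBase + QUANTITY_OFFSET); }
    public void setQuantity(final int v)  { buffer.putInt(recordBase + QUANTITY_OFFSET, v); }
}
```

Iterating the whole data set reuses the single flyweight and walks memory sequentially, which is also what makes the simple memory-copy serialisation mentioned in the side note possible.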