About Adrian Matei

Adrian Matei (ama [AT] codingpedia DOT org) is the founder of Podcastpedia.org and Codingpedia.org, computer science engineer, husband, father, curious and passionate about science, computers, software, education, economics, social equity, philosophy.

How to use the new Apache Http Client to make a HEAD request

If you’ve updated your Apache HTTP Client code to use the newest library (at the time of this writing it is version 4.3.5 for the httpclient and version 4.3.2 for httpcore) from the version 4.2.x you’ll notice that some classes, like org.apache.http.impl.client.DefaultHttpClient or org.apache.http.params.HttpParams have become deprecated. Well, I’ve been there, so in this post I’ll present how to get rid of the warnings by using the new classes.
 
 
 
 

1. Use case from Podcastpedia.org

The use case I will use for demonstration is simple: I have a batch job to check if there are new episodes are available for podcasts. To avoid having to get and parse the feed if there are no new episodes, I verify before if the eTag or the last-modified headers of the feed resource have changed since the last call. This will work if the feed publisher supports these headers, which I highly recommend as it spares bandwidth and processing power on the consumers.

So how it works? Initially, when a new podcast is added to the Podcastpedia.org directory I check if the headers are present for the feed resource and if so I store them in the database. To do that, I execute an HTTP HEAD request against the URL of the feed with the help of Apache Http Client. According to the Hypertext Transfer Protocol — HTTP/1.1 rfc2616, the meta-information contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request).

In the following sections I will present how the code actually looks in the Java, before and after the upgrade to the 4.3.x version of the Apache Http Client.

2. Migration to the 4.3.x version

2.1. Software dependencies

To build my project, which by the way is now available on GitHub – Podcastpedia-batch, I am using maven, so I listed below the dependencies required for the Apache Http Client:

2.1.1. Before

Apache Http Client dependencies 4.2.x

 <!-- Apache Http client -->
<dependency>
	<groupId>org.apache.httpcomponents</groupId>
	<artifactId>httpclient</artifactId>
	<version>4.2.5</version>
</dependency>
<dependency>
	<groupId>org.apache.httpcomponents</groupId>
	<artifactId>httpcore</artifactId>
	<version>4.2.4</version>
</dependency>

2.1.2. After

Apache Http Client dependencies

<!-- Apache Http client -->
<dependency>
	<groupId>org.apache.httpcomponents</groupId>
	<artifactId>httpclient</artifactId>
	<version>4.3.5</version>
</dependency>		
<dependency>
	<groupId>org.apache.httpcomponents</groupId>
	<artifactId>httpcore</artifactId>
	<version>4.3.2</version>
</dependency>

2.2. HEAD request with Apache Http Client

2.2.1. Before v4.2.x

Example of executing a HEAD request with Apache HttpClient

private void setHeaderFieldAttributes(Podcast podcast) throws ClientProtocolException, IOException, DateParseException{
	
	HttpHead headMethod = null;					
	headMethod = new HttpHead(podcast.getUrl());
	
	org.apache.http.client.HttpClient httpClient = new DefaultHttpClient(poolingClientConnectionManager);
	
	HttpParams params = httpClient.getParams();
	org.apache.http.params.HttpConnectionParams.setConnectionTimeout(params, 10000);
	org.apache.http.params.HttpConnectionParams.setSoTimeout(params, 10000);
	HttpResponse httpResponse = httpClient.execute(headMethod);
	int statusCode = httpResponse.getStatusLine().getStatusCode();
  
	if (statusCode != HttpStatus.SC_OK) {
		LOG.error("The introduced URL is not valid " + podcast.getUrl()  + " : " + statusCode);
	}
  
	//set the new etag if existent
	org.apache.http.Header eTagHeader = httpResponse.getLastHeader("etag");
	if(eTagHeader != null){
		podcast.setEtagHeaderField(eTagHeader.getValue());
	}
  
	//set the new "last modified" header field if existent 
	org.apache.http.Header lastModifiedHeader= httpResponse.getLastHeader("last-modified");
	if(lastModifiedHeader != null) {
		podcast.setLastModifiedHeaderField(DateUtil.parseDate(lastModifiedHeader.getValue()));
		podcast.setLastModifiedHeaderFieldStr(lastModifiedHeader.getValue());
	}	   	      	   	      	   	        	         	      

	// Release the connection.
	headMethod.releaseConnection();	   	       	  		
}

If you are using a smart IDE, it will tell you that DefaultHttpClient, HttpParams and HttpConnectionParams are deprecated. If you look now in their java docs, you’ll get a suggestion for their replacement, namely to use the HttpClientBuilder and classes provided by org.apache.http.config instead.

So, as you’ll see in the coming section, that’s exactly what I did.

 2.2.2. After v 4.3.x

HEAD request example with Apache Http Client v 4.3.x

private void setHeaderFieldAttributes(Podcast podcast) throws ClientProtocolException, IOException, DateParseException{
	
	HttpHead headMethod = null;					
	headMethod = new HttpHead(podcast.getUrl());
					
	RequestConfig requestConfig = RequestConfig.custom()
			.setSocketTimeout(TIMEOUT * 1000)
			.setConnectTimeout(TIMEOUT * 1000)
			.build();
	
	CloseableHttpClient httpClient = HttpClientBuilder
								.create()
								.setDefaultRequestConfig(requestConfig)
								.setConnectionManager(poolingHttpClientConnectionManager)
								.build();

	HttpResponse httpResponse = httpClient.execute(headMethod);
	int statusCode = httpResponse.getStatusLine().getStatusCode();
  
	if (statusCode != HttpStatus.SC_OK) {
		LOG.error("The introduced URL is not valid " + podcast.getUrl()  + " : " + statusCode);
	}
  
	//set the new etag if existent
	Header eTagHeader = httpResponse.getLastHeader("etag");
	if(eTagHeader != null){
		podcast.setEtagHeaderField(eTagHeader.getValue());
	}
  
	//set the new "last modified" header field if existent 
	Header lastModifiedHeader= httpResponse.getLastHeader("last-modified");
	if(lastModifiedHeader != null) {
		podcast.setLastModifiedHeaderField(DateUtil.parseDate(lastModifiedHeader.getValue()));
		podcast.setLastModifiedHeaderFieldStr(lastModifiedHeader.getValue());
	}	   	      	   	      	   	        	         	      

	// Release the connection.
	headMethod.releaseConnection();	   	       	  		
}

Notice:

  • how the HttpClientBuilder has been used to build a ClosableHttpClient [lines 11-15], which is a base implementation of HttpClient that also implements Closeable
  • the HttpParams from the previous version have been replaced by org.apache.http.client.config.RequestConfig [lines 6-9] where I can set the socket and connection timeouts. This configuration is later used (line 13) when building the HttpClient

The remaining of the code is quite simple:

  • the HEAD request is executed (line 17)
  • if existant, the eTag and last-modified headers are persisted.
  • in the end the internal state of the request is reset, making it reusable – headMethod.releaseConnection()

2.2.3. Make the http call from behind a proxy

If you are behind a proxy you can easily configure the HTTP call by setting a org.apache.http.HttpHost proxy host on the RequestConfig:

HTTP call behind a proxy

 HttpHost proxy = new HttpHost("xx.xx.xx.xx", 8080, "http"); 		
RequestConfig requestConfig = RequestConfig.custom()
		.setSocketTimeout(TIMEOUT * 1000)
		.setConnectTimeout(TIMEOUT * 1000)
		.setProxy(proxy)
		.build();

Resources

Source Code – GitHub

  • podcastpedia-batch - the job for adding new podcasts from a file to the podcast directory, uses the code presented in the post to persist the eTag and lastModified headers; it is still work in progress. Please make a pull request if you have any improvement proposals

Web

Do you want to know how to develop your skillset to become a Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

JPA Mini Book

Learn how to leverage the power of JPA in order to create robust and flexible Java applications. With this Mini Book, you will get introduced to JPA and smoothly transition to more advanced concepts.

JVM Troubleshooting Guide

The Java virtual machine is really the foundation of any Java EE platform. Learn how to master it with this advanced guide!

Given email address is already subscribed, thank you!
Oops. Something went wrong. Please try again later.
Please provide a valid email address.
Thank you, your sign-up request was successful! Please check your e-mail inbox.
Please complete the CAPTCHA.
Please fill in the required fields.

Leave a Reply


− 2 = three



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
Do you want to know how to develop your skillset and become a ...
Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

Get ready to Rock!
You can download the complementary eBooks using the links below:
Close