How to use the new Apache Http Client to make a HEAD request

If you’ve updated your Apache HTTP Client code from version 4.2.x to the newest library (at the time of this writing, version 4.3.5 for httpclient and version 4.3.2 for httpcore), you’ll notice that some classes, like org.apache.http.impl.client.DefaultHttpClient or org.apache.http.params.HttpParams, have been deprecated. I’ve been there, so in this post I’ll show how to get rid of the warnings by using the new classes.

1. Use case from Podcastpedia.org

The use case I will use for demonstration is simple: I have a batch job that checks whether new episodes are available for podcasts. To avoid fetching and parsing the feed when there are no new episodes, I first check whether the ETag or Last-Modified headers of the feed resource have changed since the last call. This works only if the feed publisher supports these headers, which I highly recommend, as it saves bandwidth and processing power for consumers.

So how does it work? Initially, when a new podcast is added to the Podcastpedia.org directory, I check whether the headers are present for the feed resource and, if so, store them in the database. To do that, I execute an HTTP HEAD request against the URL of the feed with the help of Apache Http Client. According to the Hypertext Transfer Protocol -- HTTP/1.1 specification (RFC 2616), the meta-information contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request.
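
The comparison step described above can be sketched in plain Java. This is only an illustration: the class FeedChangeCheck and its method hasChanged are hypothetical names that do not appear in Podcastpedia-batch, and treating a feed as changed when no comparable header is available is an assumption made for this sketch.

```java
/** Hypothetical helper (not part of Podcastpedia-batch): decides whether a feed must be fetched again. */
final class FeedChangeCheck {

    private FeedChangeCheck() {
    }

    /**
     * Returns true when the feed should be downloaded and parsed again.
     * A null argument means the header was never stored or was not sent by the server.
     */
    static boolean hasChanged(String storedEtag, String storedLastModified,
                              String currentEtag, String currentLastModified) {
        // Prefer the ETag when both the stored and the current value are available.
        if (storedEtag != null && currentEtag != null) {
            return !storedEtag.equals(currentEtag);
        }
        // Otherwise fall back to comparing the raw Last-Modified strings.
        if (storedLastModified != null && currentLastModified != null) {
            return !storedLastModified.equals(currentLastModified);
        }
        // Assumption: with nothing to compare, play it safe and fetch the feed.
        return true;
    }
}
```

A batch job could call hasChanged with the values stored in the database and the values returned by the HEAD request, and skip the GET when it returns false.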

In the following sections I will show what the code looks like in Java, before and after the upgrade to version 4.3.x of the Apache Http Client.

2. Migration to the 4.3.x version

2.1. Software dependencies

To build my project, which is now available on GitHub as Podcastpedia-batch, I use Maven. The dependencies required for the Apache Http Client are listed below:

2.1.1. Before

Apache Http Client dependencies 4.2.x

<!-- Apache Http client -->
<dependency>
	<groupId>org.apache.httpcomponents</groupId>
	<artifactId>httpclient</artifactId>
	<version>4.2.5</version>
</dependency>
<dependency>
	<groupId>org.apache.httpcomponents</groupId>
	<artifactId>httpcore</artifactId>
	<version>4.2.4</version>
</dependency>

2.1.2. After

Apache Http Client dependencies 4.3.x

<!-- Apache Http client -->
<dependency>
	<groupId>org.apache.httpcomponents</groupId>
	<artifactId>httpclient</artifactId>
	<version>4.3.5</version>
</dependency>
<dependency>
	<groupId>org.apache.httpcomponents</groupId>
	<artifactId>httpcore</artifactId>
	<version>4.3.2</version>
</dependency>

2.2. HEAD request with Apache Http Client

2.2.1. Before (v4.2.x)

Example of executing a HEAD request with Apache HttpClient

private void setHeaderFieldAttributes(Podcast podcast) throws ClientProtocolException, IOException, DateParseException{
	
	HttpHead headMethod = null;					
	headMethod = new HttpHead(podcast.getUrl());
	
	org.apache.http.client.HttpClient httpClient = new DefaultHttpClient(poolingClientConnectionManager);
	
	HttpParams params = httpClient.getParams();
	org.apache.http.params.HttpConnectionParams.setConnectionTimeout(params, 10000);
	org.apache.http.params.HttpConnectionParams.setSoTimeout(params, 10000);
	HttpResponse httpResponse = httpClient.execute(headMethod);
	int statusCode = httpResponse.getStatusLine().getStatusCode();
  
	if (statusCode != HttpStatus.SC_OK) {
		LOG.error("The introduced URL is not valid " + podcast.getUrl()  + " : " + statusCode);
	}
  
	//set the new etag if existent
	org.apache.http.Header eTagHeader = httpResponse.getLastHeader("etag");
	if(eTagHeader != null){
		podcast.setEtagHeaderField(eTagHeader.getValue());
	}
  
	//set the new "last modified" header field if existent 
	org.apache.http.Header lastModifiedHeader= httpResponse.getLastHeader("last-modified");
	if(lastModifiedHeader != null) {
		podcast.setLastModifiedHeaderField(DateUtil.parseDate(lastModifiedHeader.getValue()));
		podcast.setLastModifiedHeaderFieldStr(lastModifiedHeader.getValue());
	}

	// Release the connection.
	headMethod.releaseConnection();
}

If you are using a smart IDE, it will tell you that DefaultHttpClient, HttpParams and HttpConnectionParams are deprecated. If you look at their Javadocs, you’ll find the suggested replacements: the HttpClientBuilder and the configuration classes provided by org.apache.http.config and org.apache.http.client.config.

So, as you’ll see in the coming section, that’s exactly what I did.

2.2.2. After (v4.3.x)

HEAD request example with Apache Http Client v4.3.x

private void setHeaderFieldAttributes(Podcast podcast) throws ClientProtocolException, IOException, DateParseException{
	
	HttpHead headMethod = null;					
	headMethod = new HttpHead(podcast.getUrl());
					
	RequestConfig requestConfig = RequestConfig.custom()
			.setSocketTimeout(TIMEOUT * 1000)
			.setConnectTimeout(TIMEOUT * 1000)
			.build();
	
	CloseableHttpClient httpClient = HttpClientBuilder
								.create()
								.setDefaultRequestConfig(requestConfig)
								.setConnectionManager(poolingHttpClientConnectionManager)
								.build();

	HttpResponse httpResponse = httpClient.execute(headMethod);
	int statusCode = httpResponse.getStatusLine().getStatusCode();
  
	if (statusCode != HttpStatus.SC_OK) {
		LOG.error("The introduced URL is not valid " + podcast.getUrl()  + " : " + statusCode);
	}
  
	//set the new etag if existent
	Header eTagHeader = httpResponse.getLastHeader("etag");
	if(eTagHeader != null){
		podcast.setEtagHeaderField(eTagHeader.getValue());
	}
  
	//set the new "last modified" header field if existent 
	Header lastModifiedHeader= httpResponse.getLastHeader("last-modified");
	if(lastModifiedHeader != null) {
		podcast.setLastModifiedHeaderField(DateUtil.parseDate(lastModifiedHeader.getValue()));
		podcast.setLastModifiedHeaderFieldStr(lastModifiedHeader.getValue());
	}

	// Release the connection.
	headMethod.releaseConnection();
}

Notice:

  • how the HttpClientBuilder is used to build a CloseableHttpClient [lines 11-15], a base implementation of HttpClient that also implements Closeable
  • the HttpParams from the previous version have been replaced by org.apache.http.client.config.RequestConfig [lines 6-9], where I set the socket and connection timeouts. This configuration is later used (line 13) when building the HttpClient

The rest of the code is quite simple:

  • the HEAD request is executed (line 17)
  • if present, the ETag and Last-Modified headers are persisted
  • finally, the internal state of the request is reset, making it reusable – headMethod.releaseConnection()
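
The Last-Modified value is persisted both as a parsed date and as the raw string. As a library-independent illustration of the parsing step, here is a minimal sketch that uses java.time, so it assumes Java 8 or later. LastModifiedParser is a hypothetical name that does not appear in podcastpedia-batch, and the sketch is not a drop-in replacement for the DateUtil.parseDate call above: it handles only the RFC 1123 date format and returns null instead of throwing.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

/** Hypothetical helper (not part of podcastpedia-batch) for parsing a Last-Modified header value. */
final class LastModifiedParser {

    private LastModifiedParser() {
    }

    /**
     * Parses an RFC 1123 date such as "Wed, 21 Oct 2015 07:28:00 GMT".
     * Returns null when the value is missing or is not in that format.
     */
    static Instant parse(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        try {
            return ZonedDateTime
                    .parse(headerValue.trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
        } catch (DateTimeParseException e) {
            // Older formats (RFC 850, asctime) are deliberately not handled in this sketch.
            return null;
        }
    }
}
```

Storing the raw header string next to the parsed value, as the method above does, keeps a usable value for comparison even when parsing fails.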

2.2.3. Make the HTTP call from behind a proxy

If you are behind a proxy, you can easily configure the HTTP call by setting an org.apache.http.HttpHost proxy host on the RequestConfig:

HTTP call behind a proxy

HttpHost proxy = new HttpHost("xx.xx.xx.xx", 8080, "http");
RequestConfig requestConfig = RequestConfig.custom()
		.setSocketTimeout(TIMEOUT * 1000)
		.setConnectTimeout(TIMEOUT * 1000)
		.setProxy(proxy)
		.build();

Resources

Source Code – GitHub

  • podcastpedia-batch – the job for adding new podcasts from a file to the podcast directory; it uses the code presented in this post to persist the ETag and Last-Modified headers. It is still a work in progress, so please open a pull request if you have any improvements to propose

Adrian Matei

Adrian Matei (ama [AT] codingpedia DOT org) is the founder of Podcastpedia.org and Codingpedia.org, computer science engineer, husband, father, curious and passionate about science, computers, software, education, economics, social equity, philosophy.