3 Examples of Parsing HTML File in Java using Jsoup

HTML is the core of the web: all the pages you see on the internet are based on HTML, whether they are dynamically generated by JavaScript, JSP, PHP, ASP or any other web technology. Your browser actually parses HTML and renders it for you. But what do you do if you need to parse an HTML document from a Java program and find elements, tags and attributes, or check whether a particular element exists? If you have been doing Java programming for some years, I am sure you have done some XML parsing work using parsers like DOM and SAX. Ironically, there are quite a few occasions when you need to parse an HTML document from a core Java application that doesn't include Servlets or other Java web technologies. To make things worse, there is no HTTP or HTML library in the core JDK either. That's why, when it comes to parsing an HTML file, many Java programmers had to ask Google how to get the value of an HTML tag in Java. When I needed that, I was sure there would be an open source library implementing that functionality for me, but I didn't know it would be as wonderful and feature rich as Jsoup. It not only supports reading and parsing an HTML document, it also allows you to extract any element from the HTML file, along with its attributes and CSS classes, in jQuery style, and at the same time it allows you to modify them. You can probably do anything with an HTML document using Jsoup. In this article, we will parse an HTML file and find the values of the title and heading tags. We will also see examples of downloading and parsing HTML from a file as well as from a URL on the internet, by parsing Google's home page in Java.

What is the JSoup library?

Jsoup is an open source Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods. Jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers like Chrome and Firefox do. Here are some of the useful features of the jsoup library:

- Jsoup can scrape and parse HTML from a URL, file, or string
- Jsoup can find and extract data, using DOM traversal or CSS selectors
- Jsoup allows you to manipulate HTML elements, attributes, and text
- Jsoup can clean user-submitted content against a safe white-list, to prevent XSS attacks
- Jsoup can also output tidy HTML

Jsoup is designed to deal with all kinds of HTML found in the real world, from properly validated documents to incomplete, non-validating tag collections. One of the core strengths of Jsoup is that it's very robust.

HTML parsing in Java using JSoup

In this Java HTML parsing tutorial, we will see three different examples of parsing and traversing HTML documents in Java using jsoup. In the first example, we will parse an HTML String whose contents are all tags, in the form of a String literal in Java. In the second example, we will download our HTML document from the web, and in the third example, we will load our own sample HTML file login.html for parsing. This file is a sample HTML document that contains a title tag and a div in the body section which contains an HTML form. It has input tags to capture username and password, and submit and reset buttons for further action. It's proper HTML that can be validated, i.e. all tags and attributes are properly closed.
Here is how our sample HTML file looks:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Login Page</title>
</head>
<body>
<div id="login" class="simple" >
<form action="login.do">
Username : <input id="username" type="text" /><br>
Password : <input id="password" type="password" /><br>
<input id="submit" type="submit" />
<input id="reset" type="reset" />
</form>
</div>
</body>
</html>

HTML parsing is very simple with Jsoup: all you need to do is call the static method Jsoup.parse() and pass your HTML String to it. JSoup provides several overloaded parse() methods to read HTML from a String, a File, a base URI, a URL, or an InputStream. You can also specify the character encoding to read HTML files correctly in case they are not in UTF-8 format. The parse(String html) method parses the input HTML into a new Document. In Jsoup, Document extends Element, which extends Node; TextNode also extends Node. As long as you pass in a non-null string, you're guaranteed to have a successful, sensible parse, with a Document containing (at least) a head and a body element. Once you have a Document, you can get the data you want by calling the appropriate methods on Document and its parent classes Element and Node.
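Besides getElementById() and getElementsByTag(), Jsoup also supports jQuery-style CSS selectors through the select() method. The following snippet is a small illustrative sketch (it is not part of the original program shown below) that runs against the login.html sample above:

import java.io.File;
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupSelectorExample {

    public static void main(String[] args) throws IOException {
        Document doc = Jsoup.parse(new File("login.html"), "ISO-8859-1");

        // select the form inside the div with id "login"
        Element form = doc.select("div#login form").first();
        System.out.println("Form action : " + form.attr("action"));

        // select all text and password input fields
        for (Element input : doc.select("input[type=text], input[type=password]")) {
            System.out.println("Input field id : " + input.id());
        }
    }
}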
Java program to parse an HTML document

Here is our complete Java program to parse an HTML String, an HTML file downloaded from the internet and an HTML file from the local file system. In order to run this program, you can either use the Eclipse IDE, any other IDE, or the command prompt. In Eclipse it's very easy: just copy this code, create a new Java project, right click on the src folder and paste it. Eclipse will take care of creating the proper package and a Java source file with the same name, so there is very little work. If you already have a sample Java project, then it's just one step. The following Java program shows three examples of parsing and traversing an HTML file. In the first example we directly parse a String with HTML content, in the second example we parse an HTML file downloaded from a URL, and in the third example we load and parse an HTML document from the local file system.

import java.io.File;
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

/**
 * Java program to parse/read HTML documents from a File using the Jsoup library.
 * Jsoup is an open source library which allows Java developers to parse HTML
 * files and extract elements, manipulate data and change style using DOM, CSS and
 * jQuery-like methods.
 *
 * @author Javin Paul
 */
public class HTMLParser {

    public static void main(String args[]) {
        // Parse HTML String using JSoup library
        String htmlString = "<!DOCTYPE html>"
                + "<html>"
                + "<head>"
                + "<title>JSoup Example</title>"
                + "</head>"
                + "<body>"
                + "<table><tr><td><h1>HelloWorld</h1></tr>"
                + "</table>"
                + "</body>"
                + "</html>";

        Document html = Jsoup.parse(htmlString);
        String title = html.title();
        String h1 = html.body().getElementsByTag("h1").text();

        System.out.println("Input HTML String to JSoup :" + htmlString);
        System.out.println("After parsing, Title : " + title);
        System.out.println("After parsing, Heading : " + h1);

        // JSoup Example 2 - Reading an HTML page from a URL
        Document doc;
        try {
            doc = Jsoup.connect("http://google.com/").get();
            title = doc.title();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("Jsoup Can read HTML page from URL, title : " + title);

        // JSoup Example 3 - Parsing an HTML file in Java
        // Document htmlFile = Jsoup.parse("login.html", "ISO-8859-1"); // wrong
        Document htmlFile = null;
        try {
            htmlFile = Jsoup.parse(new File("login.html"), "ISO-8859-1"); // right
        } catch (IOException e) {
            e.printStackTrace();
        }
        title = htmlFile.title();
        Element div = htmlFile.getElementById("login");
        String cssClass = div.className(); // getting the class from an HTML element

        System.out.println("Jsoup can also parse HTML file directly");
        System.out.println("title : " + title);
        System.out.println("class of div tag : " + cssClass);
    }
}

Output:

Input HTML String to JSoup :<!DOCTYPE html><html><head><title>JSoup Example</title></head><body><table><tr><td><h1>HelloWorld</h1></tr></table></body></html>
After parsing, Title : JSoup Example
After parsing, Heading : HelloWorld
Jsoup Can read HTML page from URL, title : Google
Jsoup can also parse HTML file directly
title : Login Page
class of div tag : simple

The Jsoup HTML parser will make every attempt to create a clean parse from the HTML you provide, regardless of whether the HTML is well-formed or not. It can handle the following mistakes:

- unclosed tags (e.g. <p>Java <p>Scala to <p>Java</p> <p>Scala</p>)
- implicit tags (e.g. a naked <td>Java is Great</td> is wrapped into a <table><tr><td>)
- reliably creating the document structure (html containing a head and body, and only appropriate elements within the head)

Jsoup is an excellent and robust open source library which makes reading HTML documents, body fragments and HTML strings, and directly parsing HTML content from the web, extremely easy.

Reference: 3 Examples of Parsing HTML File in Java using Jsoup from our JCG partner Javin Paul at the Javarevisited blog....

Solving ORM – Keep the O, Drop the R, no need for the M

ORM has a simple, production-ready solution hiding in plain sight in the Java world. Let's go through it in this post, along with the following topics:

- ORM / Hibernate in 2014 – the word on the street
- ORM is still the Vietnam of Computer Science
- ORM has 2 main goals only
- When does ORM make sense?
- A simple solution for the ORM problem
- A production-ready ORM Java-based alternative

ORM / Hibernate in 2014 – the word on the street

It's been almost 20 years since ORM has been around, and soon we will reach the 15th birthday of the creation of the de-facto and likely best ORM implementation in the Java world: Hibernate. We would expect that this is by now a well understood problem. But what are developers saying these days about Hibernate and ORM? Let's take some quotes from two recent posts on this topic, Thoughts on Hibernate and JPA and Hibernate Alternatives:

There are performance problems related to using Hibernate. A lot of business operations and reports involve writing complex queries. Writing them in terms of objects and maintaining them seems to be difficult. We shouldn't be needing a 900 page book to learn a new framework.

As Java developers we can easily relate to that: ORM frameworks tend to give cryptic error messages, the mapping is hard to do and the runtime behavior, namely with lazy initialization exceptions, can be surprising when first encountered. Who hasn't had to maintain that application using the Open Session In View pattern that generated a flood of SQL requests that took weeks to optimize? I believe it literally can take a couple of years to really understand Hibernate: lots of practice and several readings of the Java Persistence with Hibernate book (still 600 pages in its upcoming second edition). Are the criticisms of Hibernate warranted? I personally don't think so; in fact most developers really criticize the complexity of the object-relational mapping approach itself, and not a concrete ORM implementation of it in a given language. This sentiment seems to come and go in periodic waves, maybe when a newer generation of developers hits the labor force. After hours and days trying to do what feels like it should be much simpler, it's only a natural feeling. The fact is that there is a problem: why do many projects still spend 30% of their time developing the persistence layer today?

ORM is the Vietnam of Computer Science

The problem is that the ORM problem is complex, and there are no good solutions. Any solution to it is a huge compromise. ORM was famously named, almost 10 years ago, the Vietnam of Computer Science, in a blog post from one of the creators of Stack Overflow, Jeff Atwood. The problems of ORM are well known and we won't go through them in detail here; here is a summary from Martin Fowler on why ORM is hard:

- object identity vs database identity
- how to map object-oriented inheritance in the relational world
- unidirectional associations in the database vs bi-directional associations in the OO world
- data navigation – lazy loading, eager fetching
- database transactions vs no rollbacks in the OO world

This is just to name the main obstacles. The problem is also that it's easy to forget what we are trying to achieve in the first place.

ORM has 2 main goals only

ORM has two main goals, clearly defined:

- map objects from the OO world onto tables in a relational database
- provide a runtime mechanism for keeping an in-memory graph of objects and a set of database tables in sync

Given this, when should we use Hibernate and ORM in general?

When does ORM make sense?
ORM makes sense when the project at hand is being done using a Domain Driven Development approach, where the whole program is built around a set of core classes called the domain model, which represent concepts in the real world such as Customer, Invoice, etc. If the project does not have a minimum threshold of complexity that needs DDD, then an ORM can likely be overkill. The problem is that even the most simple of enterprise applications are well above this threshold, so ORM really pulls its weight most of the time. It's just that ORM is hard to learn and full of pitfalls. So how can we tackle this problem?

A simple solution for the ORM problem

Someone once said something like this: a smart man solves a problem, but a wise man avoids it. As often happens in programming, we can find the solution by going back to the beginning and seeing what we are trying to solve: we are trying to synchronize an in-memory graph of objects with a set of tables. But these are two completely different types of data structures!

Which data structure is the most generic? It turns out that the graph is the more generic of the two: a set of linked database tables is really just a special type of graph. The same can be said of basically almost any other data structure. Graphs and their traversal are very well understood and have decades of accumulated knowledge available, similar to the theory on which relational databases are built: relational algebra.

Solving the impedance mismatch

The logical conclusion is that the solution for the ORM impedance mismatch is to remove the mismatch itself: let's store the graph of in-memory domain objects in a transaction-capable graph database! This solves the mapping problem by removing the need for mapping in the first place.

A production-ready solution for the ORM problem

This is easier said than done – or is it? It turns out that graph databases have been around for years, and the prime example in the Java community is Neo4j. Neo4j is a stable and mature product that is well understood and documented, see the Neo4j in Action book. It can be used as an external server or in embedded mode inside the Java process itself. But its core API is all about graphs and nodes, something like this:

GraphDatabaseService gds = new EmbeddedGraphDatabase("/path/to/store");

Node forrest = gds.createNode();
forrest.setProperty("title", "Forrest Gump");
forrest.setProperty("year", 1994);
gds.index().forNodes("movies").add(forrest, "id", 1);

Node tom = gds.createNode();

The problem is that this is too far from domain driven development; writing against this API would be like coding JDBC by hand. This is the typical task of a framework like Hibernate, with the big difference that, because the impedance mismatch is minimal, such a framework can operate in a much more transparent and less intrusive way. It turns out that such a framework has already been written.

Spring support for Neo4j

One of the creators of the Spring framework, Rod Johnson, took on the task of implementing the initial version of the Neo4j integration himself: the Spring Data Neo4j project. This is an important extract from Rod Johnson's foreword in the documentation concerning the design of the framework:

Its use of AspectJ to eliminate persistence code from your domain model is truly innovative, and on the cutting edge of today's Java technologies.
So Spring Data Neo4j is an AOP-based framework that wraps domain objects in a relatively transparent way, and synchronizes an in-memory graph of objects with a Neo4j transactional data store. It aims to let you write the persistence layer of the application in a simplified way, similar to Spring Data JPA.

What does the mapping to a graph database look like?

It turns out that there is limited mapping needed (see the tutorial). For one, we need to mark which classes we want to make persistent, and define a field that will act as an Id:

@NodeEntity
class Movie {
    @GraphId Long nodeId;
    String id;
    String title;
    int year;
    Set cast;
}

There are other annotations (5 more per the docs), for example for defining indexing and relationships with properties, etc. Compared with Hibernate, only a fraction of the annotations are needed for the same domain model.

What does the query language look like?

The recommended query language is Cypher, an ASCII-art based language. A query can look, for example, like this:

// returns users who rated a movie based on movie title (movieTitle parameter) higher than rating (rating parameter)
@Query("start movie=node:Movie(title={0}) " +
       "match (movie)<-[r:RATED]-(user) " +
       "where r.stars > {1} " +
       "return user")
Iterable getUsersWhoRatedMovieFromTitle(String movieTitle, Integer rating);

The query language is very different from JPQL or SQL and implies a learning curve. Still, after the learning curve this language allows you to write performant queries that are usually problematic in relational databases.
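Such annotated queries are declared in Spring-Data-style repository interfaces. As a purely illustrative sketch (the interface name and the derived finder below are assumptions for this example, not taken from the tutorial), a repository for the Movie entity shown above could look like this:

import org.springframework.data.neo4j.repository.GraphRepository;

// Illustrative sketch of a Spring Data Neo4j repository interface.
// Assumption: Movie is the @NodeEntity shown above; the method name is made up.
public interface MovieRepository extends GraphRepository<Movie> {

    // derived finder: Spring Data Neo4j builds the Cypher query from the method name
    Movie findByTitle(String title);

    // hand-written Cypher queries, like the @Query example above, are declared here as well
}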
Performance of queries in graph vs relational databases

Let's compare some frequent query types and how they should perform in a graph vs a relational database:

- Lookup by id: this is implemented, for example, by doing a binary search on an index tree, finding a match and following a 'pointer' to the result. This is a (very) simplified description, but it's likely identical for both databases. There is no apparent reason why such a query would take more time in a graph database than in a relational DB.
- Lookup of parent relations: this is the type of query that relational databases struggle with. Self-joins might result in cartesian products of huge tables, bringing the database to a halt. A graph database can perform those queries in a fraction of that time.
- Lookup by non-indexed column: here the relational database can scan tables faster due to the physical structure of the table and the fact that one read usually brings along multiple rows. But this type of query (a table scan) is to be avoided in relational databases anyway.

There is more to say here, but there is no indication (no readily-available DDD-related public benchmarks) that a graph-based data store would not be appropriate for doing DDD due to query performance.

Conclusions

I personally cannot find any (conceptual) reasons why a transaction-capable graph database would not be an ideal fit for doing Domain Driven Development, as an alternative to a relational database and ORM. No data store will ever fit every use case perfectly, but we can ask whether graph databases shouldn't become the default for DDD, and relational the exception. The disappearance of ORM would imply a great reduction in the complexity and the time that it takes to implement a project.

The future of DDD in the enterprise

The removal of the impedance mismatch and the improved performance of certain query types could be the killer features that drive the adoption of a graph-based DDD solution. We can see practical obstacles: operations teams prefer relational databases, vendor contract lock-in, having to learn a new query language, limited expertise in the labor market, etc. But the economic advantage is there, and the technology is there too. And when that is the case, it's usually only a matter of time. What about you – can you think of any reason why graph-based DDD would not work? Feel free to chime in in the comments below.

Reference: Solving ORM – Keep the O, Drop the R, no need for the M from our JCG partner Aleksey Novik at the The JHades Blog blog....

WildFly 9 – Don’t cha wish your console was hawt like this!

Everybody has probably heard the news: the first WildFly 9.0.0.Alpha1 release came out Monday. You can download it from the wildfly.org website. The biggest changes are that it is built by a new feature provisioning tool which is layered on the now separate core distribution, and that it also contains a new servlet distribution (only a 25 MB ZIP) which is based on it. It is called "web lite" until there is a better name. The architecture now supports a server suspend mode, also known as graceful shutdown. For now only Undertow and EJB3 use this; additional subsystems still need to be updated. The management APIs also got notification support. Overall, 256 fixes and improvements were included in this release. But let's put all the awesomeness aside for a second and talk about what this post should be about.

Administration Console

WildFly 9 got a brushed-up admin console. After you have downloaded, unzipped and started the server, you only need to add a user (bin/add-user.sh/.bat) and point your browser to http://localhost:9990/ to see it. With some minor UI tweaks this is looking pretty hot already. BUT there's another console out there called hawtio! And what is extremely hot is that it already has some very first support for WildFly and EAP – here are the steps to make it work.

Get Hawtio!

You can use hawtio from a Chrome extension or in many different containers – or outside a container as a stand-alone executable jar. If you want to deploy hawtio as a console on WildFly, make sure to look at the complete how-to written by Christian Posta. The easiest way is to just download the latest executable 1.4.19 jar and start it on the command line:

java -jar hawtio-app-1.4.19.jar --port 8090

The port parameter lets you specify on which port you want the console to run. As I'm going to use it with WildFly, which also uses hawtio's default port, this simply picks another free port. The next thing to do is to install the JMX-to-JSON bridge on which hawtio relies to connect to remote processes. Instead of using JMX directly, which is blocked on most networks anyway, the Jolokia project bridges JMX MBeans to JSON, and hawtio operates on them. Download the latest Jolokia WAR agent and deploy it to WildFly. Now you're almost ready to go. Point your browser to the hawtio console (http://localhost:8090/hawtio/), switch to the connect tab, enter the connection settings for the Jolokia agent deployed on WildFly and press the "Connect to remote server" button.

As of today there is not much to see here. Besides some very basic server information you have the deployment overview and the connector status page. But the good news is: hawtio is open source and you can fork it from GitHub and add some more features to it. The WildFly/EAP console is in the hawtio-web subproject. Make sure to check out the contributor guidelines.

Reference: WildFly 9 – Don't cha wish your console was hawt like this! from our JCG partner Markus Eisele at the Enterprise Software Development with Java blog....

lambdas and side effects

Overview

Java 8 has added features such as lambdas and type inference. This makes the language less verbose and cleaner, however it comes with more side effects, as you don't have to be as explicit in what you are doing.

The return type of a lambda matters

Java 8 infers the type of a closure. One way it does this is to look at the return type (or whether anything is returned). This can have a surprising side effect. Consider this code:

es.submit(() -> {
    try (Scanner scanner = new Scanner(new FileReader("file.txt"))) {
        String line = scanner.nextLine();
        process(line);
    }
    return null;
});

This code compiles fine. However, the line return null; appears redundant and you might be tempted to remove it. But if you remove the line, you get an error:

Error:(12, 39) java: unreported exception java.io.FileNotFoundException; must be caught or declared to be thrown

This is complaining about the use of FileReader. What has the return null got to do with catching an uncaught exception!?

Type inference

ExecutorService.submit() is an overloaded method. It has two overloads which take one argument:

ExecutorService.submit(Runnable runnable);
ExecutorService.submit(Callable callable);

The lambda itself takes no arguments, so the compiler cannot tell the overloads apart from the parameter list; it looks at the return type of the lambda instead. If you return null; it is a Callable<Void>, however if nothing is returned, not even null, it is a Runnable. Callable and Runnable have another important difference: Callable throws checked exceptions, however Runnable doesn't allow checked exceptions to be thrown. The side effect of returning null is that you don't have to handle checked exceptions; these will be stored in the Future<Void> that submit() returns. If you don't return anything, you have to handle checked exceptions.

Conclusion

While lambdas and type inference remove significant amounts of boilerplate code, you can find more edge cases where the hidden details of what the compiler infers can be slightly confusing.

Footnote

You can be explicit about type inference with a cast. Consider this:

Callable<Integer> calls = (Callable<Integer> & Serializable) () -> { return null; };

if (calls instanceof Serializable) // is true

This cast has a number of side effects. Not only does the call() method return an Integer and a marker interface get added, the code generated for the lambda changes, i.e. it adds writeObject() and readObject() methods to support serialization of the lambda. Note: each call site creates a new class, meaning the details of this cast are visible at runtime via reflection.

Reference: lambdas and side effects from our JCG partner Peter Lawrey at the Vanilla Java blog....

How to Safely Use SWT’s Display asyncExec

Most user interface (UI) toolkits are single-threaded and SWT is no exception. This means that UI objects must be accessed exclusively from a single thread, the so-called UI thread. On the other hand, long-running tasks should be executed in background threads in order to keep the UI responsive. This makes it necessary for the background threads to enqueue updates to be executed on the UI thread instead of accessing UI objects directly. To schedule code for execution on the UI thread, SWT offers the Display asyncExec() and syncExec() methods.

Display asyncExec vs syncExec

While both methods enqueue the argument for execution on the UI thread, they differ in what they do afterwards (or don't). As the name suggests, asyncExec() works asynchronously: it returns right after the runnable is enqueued and does not wait for its execution. syncExec(), in contrast, is blocking and thus does wait until the code has been executed. As a rule of thumb, use asyncExec() as long as you don't depend on the result of the scheduled code, e.g. when just updating widgets to report progress. If the scheduled code returns something relevant for the further control flow – e.g. it prompts for input in a blocking dialog – then I would opt for syncExec().

If, for example, a background thread wants to report progress about the work done, the simplest form might look like this:

progressBar.getDisplay().asyncExec( new Runnable() {
  public void run() {
    progressBar.setSelection( ticksWorked );
  }
} );

asyncExec() schedules the runnable to be executed on the UI thread 'at the next reasonable opportunity' (as the JavaDoc puts it). Unfortunately, the above code will likely fail now and then with a widget disposed exception, or more precisely with an SWTException with code == SWT.ERROR_WIDGET_DISPOSED. The reason is that the progress bar might not exist any more when it is accessed (i.e. when setSelection() is called). Though we still hold a reference to the widget, it isn't of much use since the widget itself is disposed. The solution is obvious: the code must first test whether the widget still exists before operating on it:

progressBar.getDisplay().asyncExec( new Runnable() {
  public void run() {
    if( !progressBar.isDisposed() ) {
      progressBar.setSelection( workDone );
    }
  }
} );

As obvious as it may seem, it is just as tedious to implement such a check again and again. You may want to search the Eclipse bugzilla for 'widget disposed' to get an idea of how frequent this issue is. Therefore we extracted a helper class that encapsulates the check:

new UIThreadSynchronizer().asyncExec( progressBar, new Runnable() {
  public void run() {
    progressBar.setSelection( workDone );
  }
} );

The UIThreadSynchronizer's asyncExec() method expects a widget as its first parameter that serves as a context. The context widget is meant to be the widget that would be affected by the runnable, or a suitable parent widget if more than one widget is affected. Right before the runnable is executed, the context widget is checked. If it is still alive (i.e. not disposed), the code will be executed; otherwise, the code will be silently dropped. Though the behavior of ignoring code for disposed widgets may appear careless, it worked for all situations we have encountered so far.
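To illustrate the idea, here is a minimal sketch of what such a helper could look like. Note that this is only an illustration based on the behavior described above, not the actual implementation from the gist linked below:

import org.eclipse.swt.widgets.Widget;

// Minimal sketch of the helper described above, not the real implementation.
public class UIThreadSynchronizer {

  public void asyncExec( final Widget context, final Runnable runnable ) {
    if( context.isDisposed() ) {
      return; // the context widget is already gone, silently drop the runnable
    }
    context.getDisplay().asyncExec( new Runnable() {
      public void run() {
        // re-check on the UI thread; the widget may have been disposed in the meantime
        if( !context.isDisposed() ) {
          runnable.run();
        }
      }
    } );
  }
}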
Code that does inter-thread communication is particularly hard to unit test. Therefore the UIThreadSynchronizer – though it is stateless – must be instantiated so that it can be replaced by a test double. The source code with corresponding tests can be found here: https://gist.github.com/rherrmann/7324823630a089217f46

While the examples use asyncExec(), the UIThreadSynchronizer also supports syncExec(). And, of course, the helper class is also compatible with RAP/RWT. If you read the source code carefully you might have noticed that there is a possible race condition. Because none of the methods of class Widget is meant to be thread-safe, the value returned by isDisposed() or getDisplay() may be stale (see line 51 and line 60). This is deliberately ignored at this point in time – read: I haven't found a better solution. Though the runnable could be enqueued mistakenly, the isDisposed() check (which is executed on the UI thread) would eventually prevent the code from being executed. And there is another (admittedly small) chance for a threading issue left: right before (a)syncExec() is called, the display is checked for disposal in order not to run into a widget disposed exception. But exactly that may happen if the display gets disposed between the check and the invocation of (a)syncExec(). While this could be solved for asyncExec() by wrapping the call in a try-catch block that ignores widget disposed exceptions, the same approach fails for syncExec(): the SWTExceptions thrown by the runnable cannot be distinguished from those thrown by syncExec() with reasonable effort.

Reference: How to Safely Use SWT's Display asyncExec from our JCG partner Rudiger Herrmann at the Code Affine blog....

Java Code Geeks and Genuitec are giving away FREE MyEclipse Pro Licenses (worth over $600)!

Ready to take your IDE to the next level? We are partnering with Genuitec, creator of cool Java tools, and we are running a contest giving away FREE licenses for the MyEclipse IDE. MyEclipse is a robust suite of tools for Java EE, Web and Mobile development. Get the best balance of popular technologies from all vendors. From Spring to Maven to REST web services, unify your development under a single stack that supports everything you need.

MyEclipse offers unified development under one platform:

- Enterprise: Eliminate engineering overhead by providing a MyEclipse IDE that meets Enterprise team requirements, including development for IBM WebSphere and other popular Java EE technologies. Save weeks normally lost to project on-ramping, keeping in sync, and releasing software.
- Mobile: With the evolving world of enterprise mobility, you need an IDE flexible enough for mobile applications. Get your mobile applications off the ground with the PhoneGap mobile project and build capabilities in MyEclipse.
- Web: With MyEclipse, quickly add technology capabilities to web projects, use visual editors for easier coding and configuration, and test your work on a variety of app servers.
- Cloud: Leave your silo and get into the cloud with built-in capabilities for exploring and connecting to cloud services.

Enter the contest now to win your very own FREE MyEclipse Pro License. There will be a total of 10 winners! In addition, we will send you free tips and the latest news from the Java community to master your technical knowledge (you can unsubscribe at any time). In order to increase your chances of winning, don't forget to refer as many of your friends as possible! You will get 3 more entries for every friend you refer, that is 3 times more chances! Make sure to use your lucky URL to spread the word! You can share it on your social media channels, or even mention it in a blog post if you are a blogger! Good luck and may the force be with you!

UPDATE: The giveaway has ended! Here is the list of the lucky winners (emails hidden for privacy):

ja…on@gmail.com
ma…am@absa.co.za
sh…77@gmail.com
br…20@gmail.com
de…a3@gmail.com
em…an@gmail.com
an…pu@gmail.com
iv…ia@icontainers.com
ra…la@gmail.com
iu…na@gmail.com

We would like to thank you all for participating in this giveaway. Till next time, keep up the good work!...

This is the Final Discussion!

Pun intended… Let's discuss Java final.

Recently, our popular blog post "10 Subtle Best Practices when Coding Java" had a significant revival and a new set of comments as it was summarised and linked from JavaWorld. In particular, the JavaWorld editors challenged our opinion about the Java keyword "final":

More controversially, Eder takes on the question of whether it's ever safe to make methods final by default: "If you're in full control of all source code, there's absolutely nothing wrong with making methods final by default, because:"

- "If you do need to override a method (do you really?), you can still remove the final keyword"
- "You will never accidentally override any method anymore"

Yes, indeed. All classes, methods, fields and local variables should be final by default and mutable via keyword. Here are fields and local variables:

int finalInt = 1;
val int finalInt = 2;
var int mutableInt = 3;

Whether the Scala/C#-style val keyword is really necessary is debatable. But clearly, in order to modify a field / variable ever again, we should have a keyword explicitly allowing for it. The same goes for methods – and I'm using Java 8's default keyword for improved consistency and regularity:

class FinalClass {
    void finalMethod() {}
}

default class ExtendableClass {
    void finalMethod() {}
    default void overridableMethod() {}
}

That would be the perfect world in our opinion, but Java goes the other way round, making default (overridable, mutable) the default and final (non-overridable, immutable) the explicit option. Fair enough, we'll live with that … and as API designers (of the jOOQ API, of course), we'll just happily put final all over the place to at least pretend that Java had the more sensible defaults mentioned above.

But many people disagree with this assessment, mostly for the same reason:

As someone who works mostly in osgi environments, I could not agree more, but can you guarantee that another api designer felt the same way? I think it's better to preempt the mistakes of api designers rather than preempt the mistakes of users by putting limits on what they can extend by default. – eliasv on reddit

Or…

Strongly disagree. I would much rather ban final and private from public libraries. Such a pain when I really need to extend something and it cannot be done. Intentionally locking the code can mean two things, it either sucks, or it is perfect. But if it is perfect, then nobody needs to extend it, so why do you care about that. Of course there exists valid reasons to use final, but fear of breaking someone with a new version of a library is not one of them. – meotau on reddit

Or also…

I know we've had a very useful conversation about this already, but just to remind other folks on this thread: much of the debate around 'final' depends on the context: is this a public API, or is this internal code? In the former context, I agree there are some good arguments for final. In the latter case, final is almost always a BAD idea. – Charles Roth on our blog

All of these arguments tend to go in one direction: "We're working on crappy code so we need at least some workaround to ease the pain." But why not think about it this way: the API designers that all of the above people have in mind will create precisely that horrible API that you'd like to patch through extension. Coincidentally, the same API designer will not reflect on the usefulness and communicativeness of the keyword final, and thus will never use it, unless required by the Java language.
Win-win (albeit a crappy API, shaky workarounds and patches). The API designers that want to use final for their API will reflect a lot on how to properly design APIs (and well-defined extension points / SPIs), such that you will never worry about something being final. Again, win-win (and an awesome API). Plus, in the latter case, the odd hacker will be kept from hacking and breaking your API in a way that will only lead to pain and suffering, but that's not really a loss.

Final interface methods

For the aforementioned reasons, I still deeply regret that final is not possible in Java 8 interfaces. Brian Goetz has given an excellent explanation of why it was decided this way. In fact, the usual explanation: the one about this not being the main design goal for the change! But think about the consistency, the regularity of the language if we had:

default interface ImplementableInterface {
    void abstractMethod();
    void finalMethod() {}
    default void overridableMethod() {}
}

(Ducks and runs…)

Or, more realistically with our status quo of defaulting to default:

interface ImplementableInterface {
    void abstractMethod();
    final void finalMethod() {}
    void overridableMethod() {}
}

Finally

So again, what are your (final) thoughts on this discussion? If you haven't heard enough, consider also reading this excellent post by Dr. David Pearce, author of the Whiley programming language.

Reference: This is the Final Discussion! from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog....

Use Cases for Elasticsearch: Index and Search Log Files

In the last posts we have seen some of the properties of using Elasticsearch as a document store, for searching text content and for geospatial search. In this post we will look at how it can be used to index and store log files, a very useful application that can help developers and operations in maintaining applications.

Logging

When maintaining larger applications that are either distributed across several nodes or consist of several smaller applications, searching for events in log files can become tedious. You might already have been in the situation where you have to find an error and need to log in to several machines and look at several log files. Using Linux tools like grep can be fun sometimes, but there are more convenient ways. Elasticsearch and the projects Logstash and Kibana, commonly known as the ELK stack, can help you with this.

With the ELK stack you can centralize your logs by indexing them in Elasticsearch. This way you can use Kibana to look at all the data without having to log in on the machines. This can also make operations happy, as they don't have to grant access to every developer who needs to have access to the logs. As there is one central place for all the logs, you can even see different applications in context. For example, you can see the logs of your Apache web server combined with the log files of your application server, e.g. Tomcat. As search is core to what Elasticsearch does, you should be able to find what you are looking for even more quickly.

Finally, Kibana can also help you become more proactive. As all the information is available in real time, you also have a visual representation of what is happening in your system in real time. This can help you find problems more quickly, e.g. you can see that some resource starts throwing exceptions without having your customers report it to you.

The ELK stack

For log file analytics you can use all three applications of the ELK stack: Elasticsearch, Logstash and Kibana. Logstash is used to read and enrich the information from log files. Elasticsearch is used to store all the data and Kibana is the frontend that provides dashboards to look at the data. The logs are fed into Elasticsearch using Logstash, which combines the different sources, and Kibana is used to look at the data in Elasticsearch. This setup has the advantage that different parts of the log file processing system can be scaled differently. If you need more storage for the data, you can add more nodes to the Elasticsearch cluster. If you need more processing power for the log files, you can add more nodes for Logstash.

Logstash

Logstash is a JRuby application that can read input from several sources, modify it and push it to a multitude of outputs. For running Logstash you need to pass it a configuration file that determines where the data is and what should be done with it. The configuration normally consists of an input and an output section and an optional filter section. This example takes the Apache access logs, does some predefined processing and stores them in Elasticsearch:

input {
  file {
    path => "/var/log/apache2/access.log"
  }
}

filter {
  grok {
    match => { message => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  elasticsearch_http {
    host => "localhost"
  }
}

The file input reads the log files from the path that is supplied. In the filter section we have defined the grok filter that parses unstructured data and structures it. It comes with lots of predefined patterns for different systems. In this case we are using the complete Apache log pattern, but there are also more basic building blocks like patterns for parsing email and IP addresses and dates (which can be lots of fun with all the different formats). In the output section we are telling Logstash to push the data to Elasticsearch using HTTP. We are using a server on localhost; for most real-world setups this would be a cluster on separate machines.
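Note that once the events are indexed, they can be queried by any Elasticsearch client, not only by Kibana. As a purely illustrative sketch (assuming the Elasticsearch 1.x Java TransportClient API and the default logstash-* index names that Logstash creates), a Java program could search the centralized logs like this:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;

public class LogSearchExample {

    public static void main(String[] args) {
        // connect to the Elasticsearch node that Logstash writes to (assumed to run on localhost)
        Client client = new TransportClient()
                .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
        try {
            // search all daily logstash-* indices for log lines containing "error"
            SearchResponse response = client.prepareSearch("logstash-*")
                    .setQuery(QueryBuilders.matchQuery("message", "error"))
                    .setSize(10)
                    .execute()
                    .actionGet();
            for (SearchHit hit : response.getHits()) {
                System.out.println(hit.getSourceAsString());
            }
        } finally {
            client.close();
        }
    }
}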
Kibana

Now that we have the data in Elasticsearch, we want to look at it. Kibana is a JavaScript application that can be used to build dashboards. It accesses Elasticsearch from the browser, so whoever uses Kibana needs to have access to Elasticsearch. When using it with Logstash you can open a predefined dashboard that will pull some information from your index. You can then display charts, maps and tables for the data you have indexed. The screenshot displays a histogram and a table of log events, but there are more widgets available like maps and pie and bar charts. As you can see, you can extract a lot of data visually that would otherwise be buried in several log files.

Conclusion

The ELK stack can be a great tool to read, modify and store log events. Dashboards help with visualizing what is happening. There are lots of inputs in Logstash, and the grok filter supports lots of different formats. Using those tools you can consolidate and centralize all your log files. Lots of people are using the stack for analyzing their log file data. One of the articles that is available is by Mailgun, who are using it to store billions of events. And if that's not enough, read this post on how CERN uses the ELK stack to help run the Large Hadron Collider. In the next post we will look at the final use case for Elasticsearch: Analytics.

Reference: Use Cases for Elasticsearch: Index and Search Log Files from our JCG partner Florian Hopf at the Dev Time blog....

Gradle Goodness: Adding Dependencies Only for Packaging to War

My colleague, Tom Wetjens, wrote the blog post Package-only dependencies in Maven. He showed a Maven solution for when we want to include dependencies in the WAR file which are not used in any other scope. In this blog post we will see how we can solve this in Gradle.

Suppose we use the SLF4J logging API in our project. We use the API as a compile dependency, because our code uses this API. But in our test runtime we want to use the SLF4J Simple implementation of this API. And in our WAR file we want to include the Logback implementation of the API. The Logback dependency only needs to be included in the WAR file and shouldn't exist in any other dependency configuration.

We first add the War plugin to our project. The war task uses the runtime dependency configuration to determine which files are added to the WEB-INF/lib directory in our WAR file. We add a new dependency configuration warLib that extends the runtime configuration in our project.

apply plugin: 'war'

repositories.jcenter()

configurations {
    // Create new dependency configuration
    // for dependencies to be added in
    // WAR file.
    warLib.extendsFrom runtime
}

dependencies {
    // API dependency for Slf4j.
    compile 'org.slf4j:slf4j-api:1.7.7'

    testCompile 'junit:junit:4.11'

    // Slf4j implementation used for tests.
    testRuntime 'org.slf4j:slf4j-simple:1.7.7'

    // Slf4j implementation to be packaged
    // in WAR file.
    warLib 'ch.qos.logback:logback-classic:1.1.2'
}

war {
    // Add warLib dependency configuration
    classpath configurations.warLib

    // We remove all duplicate files
    // with this assignment.
    // The getFiles() method returns a unique
    // set of File objects, removing
    // any duplicates from configurations
    // added by the classpath() method.
    classpath = classpath.files
}

We can now run the build task and we get a WAR file with the following contents:

$ gradle build
:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:war
:assemble
:compileTestJava
:processTestResources UP-TO-DATE
:testClasses
:test
:check
:build

BUILD SUCCESSFUL

Total time: 6.18 secs

$ jar tvf build/libs/package-only-dep-example.war
     0 Fri Sep 19 05:59:54 CEST 2014 META-INF/
    25 Fri Sep 19 05:59:54 CEST 2014 META-INF/MANIFEST.MF
     0 Fri Sep 19 05:59:54 CEST 2014 WEB-INF/
     0 Fri Sep 19 05:59:54 CEST 2014 WEB-INF/lib/
 29257 Thu Sep 18 14:36:24 CEST 2014 WEB-INF/lib/slf4j-api-1.7.7.jar
270750 Thu Sep 18 14:36:24 CEST 2014 WEB-INF/lib/logback-classic-1.1.2.jar
427729 Thu Sep 18 14:36:26 CEST 2014 WEB-INF/lib/logback-core-1.1.2.jar
   115 Wed Sep 03 09:24:40 CEST 2014 WEB-INF/web.xml

Also, when we run the dependencies task we can see how the implementations of the SLF4J API relate to the dependency configurations:

$ gradle dependencies
:dependencies

------------------------------------------------------------
Root project
------------------------------------------------------------

archives - Configuration for archive artifacts.
No dependencies

compile - Compile classpath for source set 'main'.
\--- org.slf4j:slf4j-api:1.7.7

default - Configuration for default artifacts.
\--- org.slf4j:slf4j-api:1.7.7

providedCompile - Additional compile classpath for libraries that should not be part of the WAR archive.
No dependencies

providedRuntime - Additional runtime classpath for libraries that should not be part of the WAR archive.
No dependencies

runtime - Runtime classpath for source set 'main'.
\--- org.slf4j:slf4j-api:1.7.7

testCompile - Compile classpath for source set 'test'.
+--- org.slf4j:slf4j-api:1.7.7
\--- junit:junit:4.11
     \--- org.hamcrest:hamcrest-core:1.3

testRuntime - Runtime classpath for source set 'test'.
+--- org.slf4j:slf4j-api:1.7.7
+--- junit:junit:4.11
|    \--- org.hamcrest:hamcrest-core:1.3
\--- org.slf4j:slf4j-simple:1.7.7
     \--- org.slf4j:slf4j-api:1.7.7

warLib
+--- org.slf4j:slf4j-api:1.7.7
\--- ch.qos.logback:logback-classic:1.1.2
     +--- ch.qos.logback:logback-core:1.1.2
     \--- org.slf4j:slf4j-api:1.7.6 -> 1.7.7

(*) - dependencies omitted (listed previously)

BUILD SUCCESSFUL

Total time: 6.274 secs

Code written with Gradle 2.1.

Reference: Gradle Goodness: Adding Dependencies Only for Packaging to War from our JCG partner Hubert Ikkink at the JDriven blog....

Five Reasons Why High Performance Computing (HPC) startups will explode in 2015

1. The size of the social networks grew beyond any rational expectations

Facebook (FB) official stats state that FB has 1.32 billion monthly active users and 1.07 billion mobile monthly active users. Approximately 81.7% are outside the US and Canada. FB manages a combined 2.4 billion users, including mobile, with 7,185 employees. The world population was estimated by the United Nations as of 1 July 2014 at 7.243 billion. Therefore 33% of the world population is on FB. This includes every infant and person alive, regardless of whether they are literate or not.

Google reports 540 million users per month plus 1.5 billion photos uploaded per week. Add Twitter, Quora, Yahoo and a few more and we reach 3 billion plus people who write emails, chat, tweet, write answers to questions and ask questions, read books, see movies and TV, and so on. Now we have the de-facto measurable collective unconscious of this world, ready to be analyzed. It contains information about something inside us that we are not aware we have. This rather extravagant idea comes from Carl Jung, about 70 years ago. We should take him seriously, as his teachings led to the development of Myers-Briggs and a myriad of other personality and vocational tests that proved amazingly accurate.

Social media's life support – its profits – depends on meaningful information. FB reports revenues of $2.91 billion for Q2 2014, and only $0.23 billion comes from user payments or fees. 77% of all revenues are processed information monetized through advertising and other related services. The tools of traditional Big Data (the only data there is, is big data) are no longer sufficient. A few years ago we were talking in the 100 million users range; now the data sets have exabyte and zettabyte dimensions.

1 EB = 1000^6 bytes = 10^18 bytes = 1 000 000 000 000 000 000 B = 1 000 petabytes = 1 million terabytes = 1 billion gigabytes
1 ZB = 1,000 EB

I compiled a chart from published information. It shows the growth of the world's storage capacity, assuming optimal compression, over the years. The 2015 data point is extrapolated from Cisco and crosses one zettabyte of capacity.

2. The breakthrough in high throughput and high performance computing

The successful search for the Higgs particle exceeds anything in terms of data size analyzed. The amount of data collected at the ATLAS detector of the Large Hadron Collider (LHC) at CERN, Geneva, is described like this: if all the data from ATLAS were recorded, this would fill 100,000 CDs per second. This would create a stack of CDs 450 feet high every second, which would reach to the moon and back twice each year. The data rate is also equivalent to 50 billion simultaneous telephone calls. ATLAS actually only records a fraction of the data (that which may show signs of new physics) and that rate is equivalent to 27 CDs per minute. It took 20 years and 6,000 scientists. They created a grid which has a capacity of 200 PB of disk and 300,000 cores, with most of the 150 computing centers connected via 10 Gbps links.

A new idea, the Dynamic Data Center Concept, has not yet made it mainstream, but it would be great if it did. This concept is described in a different blog entry. Imagine every computer and laptop in this world plugged into a worldwide cloud when not in use and withdrawn just as easily as a USB storage card. Mind boggling, but this will one day be reality.
3. The explosion of HPC startups in San Francisco, California

There is a new generation of performance computing physicists who sense the affinity of social networks with supercomputing. All are around 30 years old and you can meet some of them attending this meetup. Many come from Stanford and Berkeley, and have previously worked in the Open Science Grid (OSG) or at Fermilab but went to settle on the West Coast. Others are talented Russians – with the same talent as Sergey Brin of Google – who are now happily American. Some extraordinary faces are from China and India. San Francisco is a place where being crazy is being normal. Actually, for me all are "normal" in San Francisco. HPC needs a city like this, to rejuvenate HPC thinkers and break away from the mentality where big bucks are spent on gargantuan infrastructures, similar to the palace of Ceausescu in Romania. The dictator had some 19 churches, six synagogues and 30,000 homes demolished. No one knew what to do with the palace; the dilemma was whether to make it a shopping center or the Romanian Parliament. Traditional HPC has similar stories, like Waxahachie.

Watch what I say about user experience in this video. 95% of scientists do not have access to supercomputing marvels. I say we must make high performance computing accessible to every scientist. In its ultimate incarnation, any scientist can do Higgs-like-event searches on smaller data sets and be successful most of the time. See for example PiCloud. See clearly how it works – all written in Python. See clearly how much it costs. They still have serious solutions for academia and HPC. For comparison, look at the HTCondor documentation, see the installation, or try to learn something called dagman. Features were simply added; no one paid attention to making them easy to learn and use. I did work with HTCondor engineers and, let me say it, they are among the finest I have ever met. All they need is consistent exposure to San Francisco.

4. Can social network giants acquire HPC / HTC competency using HR?

No. They can't. Individual HPC employees recruited through HR will not create a new culture. They will mimic the dominant thinking inside groups and lose their original identity and creativity. As Dropbox wisely discovered, the secret is to acquihire, and create an internal core competency with a startup that delivers something they don't have yet.

5. The strategy to make HPC / HTC startups successful

Yes, it is hard for an HPC startup like PiCloud to have 1 million users. Actually, it is impossible. But PiCloud technology can literally deliver hundreds of millions of dollars via golden discoveries using HPC / HTC in a social company that already has 100 million users or more. The lesson we learn is this: HPC / HTC cannot parrot the social media business model of accumulating millions – never mind billions – of users. Success is not made up of features. Success is about making someone happy. You have to know that someone. Social networks are experts in making it easy for people to use everything they offer. And HPC / HTC should make the social media companies happy. It is only through this symbiosis – HPC/HTC on one side, and Social Media plus Predictive Analytics everywhere on the other side – that high performance computing will be financially successful as a minimum viable product (MVP).

Reference: Five Reasons Why High Performance Computing (HPC) startups will explode in 2015 from our JCG partner Miha Ahronovitz at the The memories of a Product Manager blog....