

Saving data to a file in your Android application

This is the second post in my series about storage in Android applications. The other post is available here: http://www.javacodegeeks.com/2014/06/introduction-how-to-save-data-in-your-android-application.html

This post is about saving to a file from an Android application, which is the easiest way to store data. There are many situations where you may need to save a file: you may want to use an existing file format to create files that can be opened by the user in another application, or the data may be simple enough to be represented by a text file or a format like XML or YAML. For complex data, a database may be a better option, since accessing and parsing a large file can be slow and there are no integrity checks unless you code them by hand. On the other hand, files have less overhead and it is easier to work with them than to debug data in a database. Depending on how the user will interact (or not) with your files, you will need to decide first which kind of storage to use.

Internal storage

Each application has its own private internal storage for saving files. This is the kind of storage to use if the user shouldn’t be able to modify the file from outside your application, and if other applications shouldn’t be able to access those files. Since the internal storage is private to your application, the files will be deleted if your application is uninstalled. The internal storage is also where your application is installed by default, so your files will always be available. On some older or cheaper devices the internal storage is quite limited, so you need to be careful about the size of the data you save if you need to support those devices. You should never hardcode the path to the storage directories, since the directory may change depending on the version of the Android OS used.
Also, Android 4.4 introduced the concept of multiple users: in that case, the internal and external storage depend on the user logged in, and the files of the other users are invisible. Here are some of the methods used to get the paths to the internal storage:

- android.content.Context.getFilesDir(): returns a java.io.File object representing the root directory of the internal storage for your application from the current context.
- android.content.Context.getDir(String name, Context.MODE_PRIVATE): returns a java.io.File object representing the directory name in the internal storage, creating the directory if it does not exist. The second parameter can also be used to set the directory to MODE_WORLD_READABLE or MODE_WORLD_WRITEABLE so it is visible to all the other applications, but this is risky security-wise and was deprecated in API level 17 (Android 4.2).
- android.content.Context.getCacheDir(): returns a java.io.File object representing the internal cache directory for the application. This is meant for small files (the documentation suggests no more than 1MB total) that can be deleted at any time when the system needs more storage. There is no guarantee that the cache will be cleared, so you must also clear those files manually when they are not needed anymore.

As you can see, the files are represented by the File object from the java.io namespace: there is no file object specific to the Android SDK, and the standard Java APIs for reading and writing files are used. Also, there is no specific application permission to set in the Android manifest to use the internal storage, since it is already private to the application.

External storage

In addition to the internal storage, there is an external storage space shared by all the applications that is kept when your application is uninstalled. This is the storage that is shown when using a file explorer application and when the device is plugged into your computer.
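Since the internal storage methods above all hand back a plain java.io.File, writing goes through the standard Java I/O classes. Here is a minimal sketch of that (the dir parameter stands in for the result of getFilesDir() or getDir(); the helper class and file names are hypothetical, not part of the Android SDK):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class InternalStorageSketch {

    // Appends a line of text to a file inside the given directory.
    // In an Android application, dir would come from context.getFilesDir().
    public static File writeLine(File dir, String fileName, String line) throws IOException {
        File file = new File(dir, fileName);
        BufferedWriter writer = new BufferedWriter(new FileWriter(file, true /* append */));
        try {
            writer.write(line);
            writer.newLine();
        } finally {
            writer.close();
        }
        return file;
    }

    // Reads the first line back, to verify what was written.
    public static String readFirstLine(File file) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(file));
        try {
            return reader.readLine();
        } finally {
            reader.close();
        }
    }
}
```

Exactly the same code works against a directory of the external storage, which is why the choice between the two storages is mostly about visibility and lifetime rather than API.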
It may be implemented as an SD card that can be removed or as a partition of the built-in storage of the device, so your application should be able to work even if the card is removed or changed. To check the current state of the external storage, you can call the getExternalStorageState() method. On devices with multiple users (starting with Android 4.4), the external storage is specific to the current user and the files of other users can’t be accessed. Also, there may be more than one external storage if the device has both a built-in external storage, which is a partition of the internal memory, and an SD card: in that case, the built-in storage is the primary external storage. Reading files from the external storage requires the READ_EXTERNAL_STORAGE permission, and writing or reading files requires the WRITE_EXTERNAL_STORAGE permission. Here are the methods to call to get the directories of the primary external storage:

- android.os.Environment.getExternalStorageDirectory(): returns a java.io.File object representing the root directory of the primary external storage of the device that is shared by all applications.
- android.os.Environment.getExternalStoragePublicDirectory(): returns a java.io.File object representing a public directory for files of a particular type on the primary external storage of the device. For example, you can get the path to the public music directory by calling Environment.getExternalStoragePublicDirectory(Environment.DIRECTORY_MUSIC) or the public pictures directory by calling Environment.getExternalStoragePublicDirectory(Environment.DIRECTORY_PICTURES).
- android.content.Context.getExternalFilesDir(): returns a java.io.File representing the root directory of the primary external storage specific to your application, which is under the directory returned by getExternalStorageDirectory(). Unlike the other directories of the external storage, the files you store in that folder will be deleted when your application is uninstalled.
So, if you need to store files that are only needed by your application, you should use this folder. Also, starting with Android 4.4 there is no specific permission needed for the application to read or write its own external storage, but with older versions your application needs the READ_EXTERNAL_STORAGE or WRITE_EXTERNAL_STORAGE permission.

- android.content.Context.getExternalFilesDirs(): returns an array of java.io.File representing the root directories of all the external storage directories that can be used by your application, with the primary external storage as the first directory in the array. All those directories work the same way as the primary storage returned by the getExternalFilesDir() method. If the device has a built-in storage as the primary external storage and an SD card as a secondary external storage, this is the only way to get the path to the SD card. This method was introduced in Android 4.4; before that, it was impossible to get the path to the secondary storage.
- android.content.Context.getExternalCacheDir(): returns a java.io.File object representing the cache of the application on the primary external storage. This cache is not visible to the user and is deleted when the application is uninstalled. There is no mechanism in the Android SDK to delete files in the cache directory, so you need to manage your cache to keep it at a reasonable maximum size. Starting with Android 4.4, the application does not need a permission to access its own cache, but with older versions your application needs the READ_EXTERNAL_STORAGE or WRITE_EXTERNAL_STORAGE permission.

Example code to save to a file

To save a file, you need to get the path to the storage you want to use, which is used the same way regardless of the type of storage, since all the methods return a java.io.File object representing the directory to use.
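Before touching the external storage, the getExternalStorageState() check mentioned earlier boils down to comparing a state string against two constants. The following sketch mirrors that decision logic with plain strings matching the values of Environment.MEDIA_MOUNTED and Environment.MEDIA_MOUNTED_READ_ONLY (the helper class itself is hypothetical):

```java
public class StorageStateSketch {

    // Mirrors the values of Environment.MEDIA_MOUNTED and
    // Environment.MEDIA_MOUNTED_READ_ONLY from the Android SDK.
    static final String MEDIA_MOUNTED = "mounted";
    static final String MEDIA_MOUNTED_READ_ONLY = "mounted_ro";

    // The external storage is writable only when it is mounted read/write.
    public static boolean isWritable(String state) {
        return MEDIA_MOUNTED.equals(state);
    }

    // It is readable when mounted, even if only read-only.
    public static boolean isReadable(String state) {
        return MEDIA_MOUNTED.equals(state) || MEDIA_MOUNTED_READ_ONLY.equals(state);
    }
}
```

In an actual application you would pass the result of Environment.getExternalStorageState() to helpers like these before attempting to read or write.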
Here is an example of using the external storage to save a text file from an Activity:

try {
   // Creates a trace file in the primary external storage space of the
   // current application.
   // If the file does not exist, it is created.
   File traceFile = new File(((Context)this).getExternalFilesDir(null), "TraceFile.txt");
   if (!traceFile.exists())
      traceFile.createNewFile();
   // Adds a line to the trace file
   BufferedWriter writer = new BufferedWriter(new FileWriter(traceFile, true /*append*/));
   writer.write("This is a test trace file.");
   writer.close();
   // Refresh the data so it can be seen when the device is plugged in a
   // computer. You may have to unplug and replug the device to see the
   // latest changes. This is not necessary if the user should not modify
   // the files.
   MediaScannerConnection.scanFile((Context)(this),
         new String[] { traceFile.toString() },
         null,
         null);
} catch (IOException e) {
   Log.e("com.cindypotvin.FileTest", "Unable to write to the TraceFile.txt file.");
}

Reference: Saving data to a file in your Android application from our JCG partner Cindy Potvin at the Web, Mobile and Android Programming blog.

On Graph Computing

The concept of a graph has been around since the dawn of mechanical computing and for many decades prior in the domain of pure mathematics. Due in large part to this golden age of databases, graphs are becoming increasingly popular in software engineering. Graph databases provide a way to persist and process graph data. However, the graph database is not the only way in which graphs can be stored and analyzed. Graph computing has a history prior to the use of graph databases and has a future that is not necessarily entangled with typical database concerns. There are numerous graph technologies that each have their respective benefits and drawbacks. Leveraging the right technology at the right time is required for effective graph computing.

Structure: Modeling Real-World Scenarios with Graphs

A graph (or network) is a data structure. It is composed of vertices (dots) and edges (lines). Many real-world scenarios can be modeled as a graph. This is not necessarily inherent to some objective nature of reality, but primarily predicated on the fact that humans subjectively interpret the world in terms of objects (vertices) and their respective relationships to one another (edges) (an argument against this idea). The popular data model used in graph computing is the property graph. The following examples demonstrate graph modeling via three different scenarios.

A Software Graph

Stephen is a member of a graph-oriented engineering group called TinkerPop. Stephen contributes to Rexster. Rexster is related to other projects via software dependencies. When a user finds a bug in Rexster, they issue a ticket. This description of a collaborative coding environment can be conveniently captured by a graph. The vertices (or things) are people, organizations, projects, and tickets. The edges (or relationships) are, for example, memberships, dependencies, and issues.
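A property graph like the one just described can be sketched with a couple of plain Java classes: vertices carrying key/value properties, and edges carrying a label. All names here are illustrative, not any particular graph API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PropertyGraphSketch {

    static class Vertex {
        final String id;
        final Map<String, Object> properties = new HashMap<String, Object>();
        Vertex(String id) { this.id = id; }
    }

    static class Edge {
        final Vertex out, in;
        final String label; // e.g. "member", "dependsOn", "issue"
        Edge(Vertex out, String label, Vertex in) {
            this.out = out; this.label = label; this.in = in;
        }
    }

    final Map<String, Vertex> vertices = new HashMap<String, Vertex>();
    final List<Edge> edges = new ArrayList<Edge>();

    public Vertex addVertex(String id) {
        Vertex v = new Vertex(id);
        vertices.put(id, v);
        return v;
    }

    public Edge addEdge(String outId, String label, String inId) {
        Edge e = new Edge(vertices.get(outId), label, vertices.get(inId));
        edges.add(e);
        return e;
    }

    // Counts the edges with a given label leaving a vertex
    // (the vertex's out-degree for that label).
    public int outDegree(String id, String label) {
        int degree = 0;
        for (Edge e : edges)
            if (e.out.id.equals(id) && e.label.equals(label)) degree++;
        return degree;
    }
}
```

The software scenario then becomes a handful of addVertex/addEdge calls: Stephen, TinkerPop, and Rexster as vertices; "member" and "contributesTo" edges between them.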
A graph can be visualized using dots and lines, and the scenario described above is diagrammed below.

A Discussion Graph

Matthias is interested in graphs. He is the CTO of Aurelius and the project lead for the graph database Titan. Aurelius has a mailing list. On this mailing list, people discuss graph theory and technology. Matthias contributes to a discussion. His contributions beget more contributions. In a recursive manner, the mailing list manifests itself as a tree. Moreover, the unstructured text of the messages makes reference to shared concepts.

A Concept Graph

A graph can be used to denote the relationships between arbitrary concepts, even the concepts related to graph. For example, note how concepts (in italics) are related in the sentences to follow. A graph can be represented as an adjacency list. The general way in which graphs are processed is via graph traversals. There are two general types of graph traversals: depth-first and breadth-first. Graphs can be persisted in a software system known as a graph database. Graph databases organize information in a manner different from the relational databases of common software knowledge. In the diagram below, the concepts related to graph are linked to one another, demonstrating that concept relationships form a graph.

A Multi-Domain Graph

The three previous scenarios (software, discussion, and concept) are representations of real-world systems (e.g. GitHub, Google Groups, and Wikipedia). These seemingly disparate models can be seamlessly integrated into a single atomic graph structure by means of shared vertices. For instance, in the associated diagram, Gremlin is a Titan dependency, Titan is developed by Matthias, and Matthias writes messages on Aurelius’ mailing list (software merges with discussion). Next, Blueprints is a Titan dependency and Titan is tagged graph (software merges with concept).
The dotted lines identify other such cross-domain linkages that demonstrate how a universal model is created when vertices are shared across domains. The integrated, universal model can be subjected to processes that provide richer (perhaps, more intelligent) services than what any individual model could provide alone.

Process: Solving Real-World Problems with Traversals

What has been presented thus far is a single graph model of a set of interrelated domains. A model is only useful if there are processes that can leverage it to solve problems. Much like data needs algorithms, a graph needs a traversal. A traversal is an algorithmic/directed walk over the graph such that paths are determined (called derivations) or information is gleaned (called statistics). Even the human visual system viewing a graph visualization is a traversal engine leveraging saccadic movements to identify patterns. However, as graphs grow large and problems demand precise logic, visualizations and the human’s internal calculator break down. A collection of traversal examples is presented next that solve typical problems in the previously discussed domains.

Determining Circular Dependencies

With the growth of open source software and the ease with which modules can be incorporated into projects, circular dependencies abound and can lead to problems in software engineering. A circular dependency occurs when project A depends on project B and, through some dependency path, project B depends on project A. When dependencies are represented graphically, a traversal can easily identify such circularities (e.g. in the diagram below, A->B->D->G->A is a cycle).

Ranking Discussion Contributors

Mailing lists are composed of individuals with varying levels of participation and competence. When a mailing list is focused on learning through discussion, simply writing a message is not necessarily a sign of positive contribution.
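The circular-dependency traversal described above is, in essence, a depth-first search that remembers the vertices on the current path. A toy sketch over an adjacency list (the class and the project names are made up to mirror the A->B->D->G->A example):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CycleCheckSketch {

    // Returns true if a cycle is reachable from the start vertex.
    public static boolean hasCycle(Map<String, List<String>> dependencies, String start) {
        return visit(dependencies, start, new HashSet<String>());
    }

    private static boolean visit(Map<String, List<String>> deps, String current, Set<String> onPath) {
        if (!onPath.add(current)) return true; // already on the current path: cycle found
        for (String next : deps.getOrDefault(current, Collections.<String>emptyList()))
            if (visit(deps, next, onPath)) return true;
        onPath.remove(current); // backtrack before exploring sibling branches
        return false;
    }
}
```

Real dependency graphs are large enough that a graph database or batch framework would run this traversal, but the logic is the same.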
If an author’s messages spawn replies, then it can be interpreted that the author is contributing discussion-worthy material. However, if an author’s messages end the conversation, then they may be contributing non-sequiturs or information that is not allowing the discussion to flourish. In the associated diagram, the beige vertices are authors and their respective number is a unique author id. One way to rank contributors on a mailing list is to count the number of messages they have posted (the author’s out-degree to messages in the mailing list). However, if the ranking must account for fruitful contributions, then authors can be ranked by the depth of the discussion their messages spawn (the tree depth of the author’s messages). Finally, note that other techniques such as sentiment and concept analysis can be included in order to understand the intention and meaning of a message.

Finding Related Concepts

Stephen’s understanding of graphs was developed while working on TinkerPop’s graph technology stack. Nowadays he is interested in learning more about the theoretical aspects of graphs. Via his web browser, he visits the graph Wikipedia page. In a manual fashion, Stephen clicks links and reads articles — depth-first, graph traversals, adjacency lists, etc. He realizes that pages reference each other and that some concepts are more related to others due to Wikipedia’s link structure. The manual process of walking links can be automated using a graph traversal. Instead of clicking, a traversal can start at the graph vertex, emanate outwards, and report which concepts have been touched the most. The concept that has seen the most flow is a concept that has many ties (i.e. paths) to graph (see priors algorithms). With such a traversal, Stephen can be provided a ranked list of graph-related concepts.
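That diffusion-style traversal can be sketched as a breadth-first expansion from a start vertex up to a fixed depth, counting how many paths touch each concept along the way. The concept graph below is hypothetical, and a real implementation would weight the counts by depth and path probability:

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class ConceptDiffusionSketch {

    // Expands outward from the start vertex up to maxDepth,
    // counting how many paths touch each vertex.
    public static Map<String, Integer> touchCounts(
            Map<String, List<String>> links, String start, int maxDepth) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        Queue<String[]> frontier = new ArrayDeque<String[]>(); // entries are {vertex, depth}
        frontier.add(new String[] { start, "0" });
        while (!frontier.isEmpty()) {
            String[] entry = frontier.poll();
            String vertex = entry[0];
            int depth = Integer.parseInt(entry[1]);
            counts.merge(vertex, 1, Integer::sum);
            if (depth == maxDepth) continue;
            for (String next : links.getOrDefault(vertex, Collections.<String>emptyList()))
                frontier.add(new String[] { next, Integer.toString(depth + 1) });
        }
        counts.remove(start); // the start vertex itself is not a recommendation
        return counts;
    }
}
```

Concepts reachable through many short paths accumulate the highest counts, which is the "most flow" intuition from the text.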
This traversal is analogous to a wave diffusing over a body of water — albeit real-world graph topologies are rarely as simple as a two-dimensional plane (see lattice).

A Multi-Domain Traversal

The different graph models discussed previously (i.e. software, discussion, and concept) were integrated into a single world model via shared vertices. Analogously, the aforementioned graph traversals can be composed to yield a solution to a cross-domain problem. For example: “Recommend me projects to participate in that maintain a proper dependency structure, have engaging contributors promoting the space, and are conceptually related to technologies I’ve worked on previously.” This type of problem solving is possible when a heterogeneous network of things is linked together and effectively moved within. The means of linking and moving are the graph and the traversal, respectively. To conclude this section, other useful traversal examples are provided.

- “Compute a ‘stability rank’ for a project based on the number of issues it has and the number of issues its dependencies have, so forth and so on in a recursive manner.”
- “Cluster projects according to shared (or similar) concepts between them.”
- “Recommend a team of developers for an upcoming project that will use X dependencies and is related to Y concepts.”
- “Rank issues by the number of projects that each issue’s submitter has contributed to.”

Graph Computing Technologies

The practice of computing is about riding the fine line between two entangled quantities: space and time. In the world of graph computing, the same tradeoffs exist. This section will discuss various graph technologies in order to identify what is gained and sacrificed with each choice. Moreover, a few example technologies are presented. Note that many more technologies exist and the mentioned examples are by no means exhaustive.
In-Memory Graph Toolkits

In-memory graph toolkits are single-user systems that are oriented towards graph analysis and visualization. They usually provide implementations of the numerous graph algorithms defined in the graph theory and network science literature (see Wikipedia’s list of graph algorithms). The limiting factor of these tools is that they can only operate on graphs that can be stored in local, main memory. While this can be large (millions of edges), it is not always sufficient. If the source graph data set is too large to fit into main memory, then subsets are typically isolated and processed using such in-memory graph toolkits. Examples: JUNG, NetworkX, iGraph, Fulgora (coming soon)

[+] Rich graph algorithm libraries
[+] Rich graph visualization libraries
[+] Different memory representations for different space/time tradeoffs
[-] Constrained to graphs that can fit into main memory
[-] Interaction is normally very code heavy

Real-Time Graph Databases

Graph databases are perhaps the most popular incarnation of a graph computing technology. They provide transactional semantics such as ACID (typical of local databases) and eventual consistency (typical of distributed databases). Unlike in-memory graph toolkits, graph databases make use of the disk to persist the graph. On reasonable machines, local graph databases can support a couple billion edges, while distributed systems can handle hundreds of billions of edges. At this scale and with multi-user concurrency, where random access to disk and memory are at play, global graph algorithms are not feasible. What is feasible are local graph algorithms/traversals. Instead of traversing the entire graph, some set of vertices serve as the source (or root) of the traversal.
Examples: Neo4j, OrientDB, InfiniteGraph, DEX, Titan

[+] Optimized for local neighborhood analyses (“ego-centric” traversals)
[+] Optimized for handling numerous concurrent users
[+] Interactions are via graph-oriented query/traversal languages
[-] Global graph analytics are inefficient due to random disk interactions
[-] Large computational overhead due to database functionality (e.g. transactional semantics)

Batch Processing Graph Frameworks

Batch processing graph frameworks make use of a compute cluster. Most of the popular frameworks in this space leverage Hadoop for storage (HDFS) and processing (MapReduce). These systems are oriented towards global analytics. That is, computations that touch the entire graph dataset and, in many instances, touch the entire graph many times over (iterative algorithms). Such analyses do not run in real-time. However, because they perform global scans of the data, they can leverage sequential reads from disk (see The Pathology of Big Data). Finally, like the in-memory systems, they are oriented towards the data scientist or, in a production setting, towards feeding results back into a real-time graph database. Examples: Hama, Giraph, GraphLab, Faunus

[+] Optimized for global graph analytics
[+] Process graphs represented across a machine cluster
[+] Leverages sequential access to disk for fast read times
[-] Does not support multiple concurrent users
[-] Are not real-time graph computing systems

This section presented different graph computing solutions. It is important to note that there also exist hardware solutions like Convey’s MX Series and Cray’s YARC graph engines. The technologies discussed all share one important theme — they are focused on processing graph data. The tradeoffs of each category are determined by the limits set forth by modern hardware/software and, ultimately, theoretical computer science.
Conclusion

To the adept, graph computing is not only a set of technologies, but a way of thinking about the world in terms of graphs and the processes therein in terms of traversals. As data becomes more accessible, it is easier to build richer models of the environment. What is becoming more difficult is storing that data in a form that can be conveniently and efficiently processed by different computing systems. There are many situations in which graphs are a natural foundation for modeling. When a model is a graph, then the numerous graph computing technologies can be applied to it.

Acknowledgement

Mike Loukides of O’Reilly was kind enough to review multiple versions of this article and, in doing so, made the article all the better.

Reference: On Graph Computing from our JCG partner Marko Rodriguez at the Marko A. Rodriguez blog.

Understanding the World using Tables and Graphs

Organizations make use of data to drive their decision making, enhance their product features, and increase the efficiency of their everyday operations. Data by itself is not useful. However, with data analysis, patterns such as trends, clusters, predictions, etc. can be distilled. The way in which data is analyzed is predicated on the way in which data is structured. The table format popularized by spreadsheets and relational databases is useful for particular types of processing. However, the primary purpose of this post is to examine a relatively less exploited structure that can be leveraged when analyzing an organization’s data — the graph/network.

The Table Perspective

Before discussing graphs, a short review of the table data structure is presented using a toy example population. For each individual person, their name, age, and total spending for the year is gathered. The R code snippet below loads the population data into a table.

> people <- read.table(file='people.txt', sep='\t', header=TRUE)
> people
   id    name age spending
1   0 francis  57   100000
2   1   johan  37   150000
3   2 herbert  56   150000
4   3    mike  34    30000
5   4 richard  47    35000
6   5 alberto  31    70000
7   6 stephan  36    90000
8   7     dan  52    40000
9   8     jen  28    90000
10  9    john  53   120000
11 10    matt  34    90000
12 11    lisa  48   100000
13 12  ariana  34   110000

Each row represents the information of a particular individual. Each column represents the values of a property of all individuals. Finally, each entry represents a single value for a single property for a single individual. Given the person table above, various descriptive statistics can be calculated. Simple examples include:

- the average, median, and standard deviation of age (line 1),
- the average, median, and standard deviation of spending (line 3),
- the correlation between age and spending (i.e. do older people tend to spend more? — line 5),
- the distribution of spending (i.e. a histogram of spending — line 8).
> c(mean(people$age), median(people$age), sd(people$age))
[1] 42.07692 37.00000 10.29937
> c(mean(people$spending), median(people$spending), sd(people$spending))
[1] 90384.62 90000.00 38969.09
> cor.test(people$spending, people$age)$e
      cor
0.1753667
> hist(people$spending, xlab='spending', ylab='frequency', cex.axis=0.5, cex.lab=0.75, main=NA)

In general, a table representation is useful for aggregate statistics such as those used when analyzing data cubes. However, when the relationships between modeled entities are complex/recursive, then graph analysis techniques can be leveraged.

The Graph Perspective

A graph (or network) is a structure composed of vertices (i.e. nodes, dots) and edges (i.e. links, lines). Assume that along with the people data presented previously, there exists a dataset which includes the friendship patterns between the people. In this way, people are vertices and friendship relationships are edges. Moreover, the features of a person (e.g. their name, age, and spending) are properties on the vertices. This structure is commonly known as a property graph. Using iGraph in R, it is possible to represent and process this graph data.

- Load the friendship relationships as a two column numeric table (lines 1-2).
- Generate an undirected graph from the two column table (line 3).
- Attach the person properties as metadata on the vertices (lines 4-6).

> friendships <- read.table(file='friendships.txt', sep='\t')
> friendships <- cbind(lapply(friendships, as.numeric)$V1, lapply(friendships, as.numeric)$V2)
> g <- graph.edgelist(as.matrix(friendships), directed=FALSE)
> V(g)$name <- as.character(people$name)
> V(g)$spending <- people$spending
> V(g)$age <- people$age
> g
Vertices: 13
Edges: 25
Directed: FALSE
Edges:
[0] 'francis' -- 'johan'
[1] 'francis' -- 'jen'
[2] 'johan' -- 'herbert'
[3] 'johan' -- 'alberto'
[4] 'johan' -- 'stephan'
[5] 'johan' -- 'jen'
[6] 'johan' -- 'lisa'
[7] 'herbert' -- 'alberto'
[8] 'herbert' -- 'stephan'
[9] 'herbert' -- 'jen'
[10] 'herbert' -- 'lisa'
...

One simple technique for analyzing graph data is to visualize it so as to take advantage of the human visual processing system. Interestingly enough, the human eye is excellent at finding patterns. The code example below makes use of the Fruchterman-Reingold layout algorithm to display the graph on a 2D plane.

> layout <- layout.fruchterman.reingold(g)
> plot(g, vertex.color='red', layout=layout, vertex.size=10, edge.arrow.size=0.5, edge.width=0.75, vertex.label.cex=0.75, vertex.label=V(g)$name, vertex.label.cex=0.5, vertex.label.dist=0.7, vertex.label.color='black')

For large graphs (beyond the toy example presented), the human eye can become lost in the mass of edges between vertices. Fortunately, there exist numerous community detection algorithms. These algorithms leverage the connectivity patterns in a graph in order to identify structural subgroups. The edge betweenness community detection algorithm used below identifies two structural communities in the toy graph (one colored orange and one colored blue — lines 1-2). With this derived community information, it is possible to extract one of the communities and analyze it in isolation (line 19).
> V(g)$community = community.to.membership(g, edge.betweenness.community(g)$merges, steps=11)$membership+1
> data.frame(name=V(g)$name, community=V(g)$community)
      name community
1  francis         1
2    johan         1
3  herbert         1
4     mike         2
5  richard         2
6  alberto         1
7  stephan         1
8      dan         2
9      jen         1
10    john         2
11    matt         2
12    lisa         1
13  ariana         1
> color <- c(colors()[631], colors()[498])
> plot(g, vertex.color=color[V(g)$community], layout=layout, vertex.size=10, edge.arrow.size=0.5, edge.width=0.75, vertex.label.cex=0.75, vertex.label=V(g)$name, vertex.label.cex=0.5, vertex.label.dist=0.7, vertex.label.color='black')
> h <- delete.vertices(g, V(g)[V(g)$community == 2])
> plot(h, vertex.color="red", layout=layout.fruchterman.reingold, vertex.size=10, edge.arrow.size=0.5, edge.width=0.75, vertex.label.cex=0.75, vertex.label=V(h)$name, vertex.label.cex=0.5, vertex.label.dist=0.7, vertex.label.color='black')

The isolated subgraph can be subjected to a centrality algorithm in order to determine the most central/important/influential people in the community. With centrality algorithms, importance is defined by a person’s connectivity in the graph, and in this example the popular PageRank algorithm is used (line 1). The algorithm outputs a score for each vertex, where the higher the score, the more central the vertex. The vertices can then be sorted (lines 2-3). In practice, such techniques may be used for designing a marketing campaign. For example, as seen below, it is possible to ask questions such as “which person is both influential in their community and a high spender?” In general, the graphical perspective on data lends itself to novel statistical techniques that, when combined with table techniques, provide the analyst a rich toolkit for exploring and exploiting an organization’s data.
> V(h)$page.rank <- page.rank(h)$vector
> scores <- data.frame(name=V(h)$name, centrality=V(h)$page.rank, spending=V(h)$spending)
> scores[order(-scores$centrality, scores$spending),]
     name centrality spending
6     jen 0.19269343    90000
2   johan 0.19241727   150000
3 herbert 0.16112886   150000
7    lisa 0.13220997   100000
4 alberto 0.10069925    70000
8  ariana 0.07414285   110000
5 stephan 0.07340102    90000
1 francis 0.07330735   100000

It is important to realize that for large-scale graph analysis there exist various technologies. Many of these technologies are found in the graph database space. Examples include transactional persistence engines such as Neo4j and Hadoop-based batch processing engines such as Giraph and Pegasus. Finally, exploratory analysis with the R language can be used for in-memory, single-machine graph analysis as well as in cluster-based environments using technologies such as RHadoop and RHIPE. All these technologies can be brought together (along with table-based technologies) to aid an organization in understanding the patterns that exist in their data.

Resources

- Newman, M.E.J., “The Structure and Function of Complex Networks”, SIAM Review, 45, 167–256, 2003.
- Rodriguez, M.A., Pepe, A., “On the Relationship Between the Structural and Socioacademic Communities of a Coauthorship Network”, Journal of Informetrics, 2(3), 195–201, July 2008.

Reference: Understanding the World using Tables and Graphs from our JCG partner Marko Rodriguez at the AURELIUS blog.

Serialization Proxy Pattern example

There are books which change your life immensely. One such book is “Effective Java” by Joshua Bloch. Below you may find a small experiment which was inspired by Chapter 11 of this book – “Serialization”.

Suppose that we have a class designed for inheritance, which is not Serializable itself, and has no parameterless constructor, like in this example:

public class CumbersomePoint {

   private String name;

   private double x;

   private double y;

   protected CumbersomePoint(double x, double y, String name) {
      this.x = x;
      this.y = y;
      this.name = name;
   }

   public String getName() {
      return name;
   }

   public double getX() {
      return x;
   }

   public double getY() {
      return y;
   }

   ...
}

Now when we extend this class, for example in the following way:

public class ConvenientPoint extends CumbersomePoint implements Serializable {

   public ConvenientPoint(double x, double y, String name) {
      super(x, y, name);
   }

   ...
}

and try to serialize and then deserialize any ConvenientPoint instance, we’ll quickly encounter a beautiful InvalidClassException, complaining that there is no valid constructor. The situation looks kind of hopeless, until you apply the technique known as the Serialization Proxy Pattern.

We will start by adding the following inner class to the ConvenientPoint class:

private static class SerializationProxy implements Serializable {

   private String name;

   private double x;

   private double y;

   public SerializationProxy(ConvenientPoint point) {
      this.name = point.getName();
      this.x = point.getX();
      this.y = point.getY();
   }

   private Object readResolve() {
      return new ConvenientPoint(x, y, name);
   }

}

The SerializationProxy class will represent the logical state of the enclosing class instance. We will also have to add the following method to the ConvenientPoint class:

private Object writeReplace() {
   return new SerializationProxy(this);
}

Now when a ConvenientPoint instance is serialized, it will nominate its replacement, thanks to the writeReplace method – a SerializationProxy instance will be serialized instead of the ConvenientPoint.
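To see the whole mechanism at work, here is a self-contained round trip of the same pattern. The class names here are simplified stand-ins for the article's classes, and the proxy carries the state while readResolve rebuilds the real object through its constructor:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ProxyRoundTrip {

    // A non-serializable parent without a parameterless constructor.
    public static class Parent {
        public final String name;
        protected Parent(String name) { this.name = name; }
    }

    public static class Child extends Parent implements Serializable {
        public Child(String name) { super(name); }

        // Nominate the proxy as the object actually written to the stream.
        private Object writeReplace() { return new Proxy(name); }

        private static class Proxy implements Serializable {
            private final String name;
            Proxy(String name) { this.name = name; }
            // On deserialization, rebuild a real Child through its constructor.
            private Object readResolve() { return new Child(name); }
        }
    }

    // Serializes the object to bytes and reads it back.
    public static Child roundTrip(Child original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(original);
        out.close();
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        return (Child) in.readObject();
    }
}
```

Without the proxy, writeObject on a Child would fail at deserialization time exactly as described above, because Parent has no parameterless constructor.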
On the other side, when the SerializationProxy is deserialized, its readResolve method nominates a ConvenientPoint instance as its replacement. As you can see, we have made ConvenientPoint serializable despite the missing parameterless constructor of its non-serializable parent class. One more remark at the end of this post – if you want to protect against breaking the class invariants enforced by the constructor, you may add the following method to the class using the Serialization Proxy Pattern (ConvenientPoint in our example):

private void readObject(ObjectInputStream stream) throws InvalidObjectException {
    throw new InvalidObjectException("Use Serialization Proxy instead.");
}

It will prevent direct deserialization of the enclosing class.

Reference: Serialization Proxy Pattern example from our JCG partner Michal Jastak at the Warlock’s Thoughts blog....
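Putting the pieces together, here is a self-contained sketch of the round trip. The class and method names follow the post; the driver in main is my own addition:

```java
import java.io.*;

// Non-serializable parent with no parameterless constructor, as in the post
class CumbersomePoint {
    private final String name;
    private final double x;
    private final double y;

    protected CumbersomePoint(double x, double y, String name) {
        this.x = x;
        this.y = y;
        this.name = name;
    }

    public String getName() { return name; }
    public double getX() { return x; }
    public double getY() { return y; }
}

class ConvenientPoint extends CumbersomePoint implements Serializable {
    public ConvenientPoint(double x, double y, String name) {
        super(x, y, name);
    }

    // Nominate the proxy as this object's serialized form
    private Object writeReplace() {
        return new SerializationProxy(this);
    }

    // Guard against direct deserialization of the enclosing class
    private void readObject(ObjectInputStream stream) throws InvalidObjectException {
        throw new InvalidObjectException("Use Serialization Proxy instead.");
    }

    private static class SerializationProxy implements Serializable {
        private final String name;
        private final double x;
        private final double y;

        SerializationProxy(ConvenientPoint point) {
            this.name = point.getName();
            this.x = point.getX();
            this.y = point.getY();
        }

        // On deserialization, rebuild a real ConvenientPoint via its constructor
        private Object readResolve() {
            return new ConvenientPoint(x, y, name);
        }
    }
}

public class Main {
    public static void main(String[] args) throws Exception {
        ConvenientPoint original = new ConvenientPoint(1.5, 2.5, "p1");

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original); // actually writes the SerializationProxy
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            ConvenientPoint copy = (ConvenientPoint) in.readObject();
            System.out.println(copy.getName() + " " + copy.getX() + " " + copy.getY());
        }
    }
}
```

Without the proxy, the writeObject call would still succeed but readObject would throw the InvalidClassException described above, since CumbersomePoint has no accessible parameterless constructor.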

Using Android Traceview in Eclipse

The best way to solve a performance problem in an Android application is to profile the application by tracing its execution. This way, you can make decisions on what to improve based on real data and detect bottlenecks in the application. In the Android SDK, the Debug class handles the profiling, and the Traceview utility is used to view the generated trace file. I will describe how to profile using the tools included in the Eclipse Android SDK plugin, but you can also use the traceview command. Keep in mind that a .trace file can get pretty big, so be careful to trace only the parts of the application that are problematic. If you have no idea what to profile, you can trace at large the first time and drill down to specific sections later on. Also, execution is slowed down when methods are traced, so you should only compare a traced execution time with another traced time.

Profiling from the code

Profiling from the code allows you to analyse a section which you already suspect is problematic. To profile from the code:

Add calls to the Debug class methods where you want to start and stop profiling. To start profiling in any method, add the following call:

Debug.startMethodTracing("logname");

And to stop profiling, add the following call:

Debug.stopMethodTracing();

Those calls don't need to be in the same method, but make sure stopMethodTracing() will be reached. Also, your application must have permission to write to the SD card, or else you will get the exception Unable to open trace file "/sdcard/logcat.trace": Permission denied.

Start the application and execute the problematic operation to trace it. A message is displayed in the Logcat output when the profiling starts and stops, and a logname.trace file will be created at the root of the file system of the device.

Navigate to the root of the SD card of the device, and copy the .trace file to the desktop.

Open the .trace file with Eclipse and it will be shown in the trace tool.
See the Analyzing the .trace file section below to check the result.

Profiling from Eclipse

Profiling while debugging allows you to pinpoint a problem that appears at random times during the execution, or one for which you are unsure which lines of code are causing the problem. In the DDMS perspective, click the Start Method Profiling button. Press the same button again (Stop Method Profiling) to stop profiling. The result will appear immediately in Eclipse.

Analyzing the .trace file

When opening a .trace file, the following window is shown. At the top, each thread has its own timeline. In this application, there is only a main thread; the other threads are system threads. In my opinion, the most interesting columns are:

Incl CPU %: percentage of the total CPU time used by this method, including the execution time of all the child methods.
Incl CPU Time: CPU time in milliseconds used by this method, including the execution time of all the child methods.
Calls + RecurCall/Total: the number of calls to the child method by the parent method (and recursive calls) / total calls to the method for the whole .trace file.

You can drill down on any method to see the child methods and analyse them. In this case, drilling down on the generateRandomNumber method shows that the child method saveRandomNumber makes up the bulk of the execution time. In fact, if you look at the Calls column you can see that it was called 1000000 times. The method was called multiple times for the test, but in a real profiling case this is where you should pay attention: the number of calls may be reduced, or the execution of the method optimized.

Reference: Using Android Traceview in Eclipse from our JCG partner Cindy Potvin at the Web, Mobile and Android Programming blog....

Master your IDE logging with Grep console

One of the many daily activities every programmer performs in order to do their work is controlling the logging output of their application. Logging, when done properly and correctly, provides great insight into the inner workings of the application and can be a great resource for analyzing and optimizing your code's behavior. Whether during the development or the maintenance/support phase of the product life-cycle, this task is often considered unpleasant by many programmers. But since log analysis is so important and so often required, there usually isn't a simple way around it. In this article I will present an elegant solution to reviewing logs during the development stage of the application, within the IDE. If you happen to be a programmer involved in the development of an enterprise application, you are well aware of the sizes your application's logs can grow to. Given the size and complexity of these logs, it is often quite a tricky task to find certain events of interest. In our team, we identified three basic events of interest common to all the applications in our portfolio:

Execution of an SQL expression
Beginning and end of a transaction
Web service request & response

Grep console

Given our development practices and the fact that the Spring framework backs most of our products, our favorite IDE is Spring Tool Suite (STS for short). And since STS is built on top of the Eclipse IDE, it comes with the vast Eclipse Marketplace. If you look deep enough you just might find everything there. Among many useful tools and plugins, the Marketplace also holds one small but powerful plugin called Grep console. This little project is developed and maintained by Marian Schedenig. What it does is allow you to define a set of styles that are applied to your logging console based on assigned regular expressions. If you are like me and the default Eclipse theme (fonts, colors, font sizes and such) is just not working for you, then you are probably using some plugins to tweak your IDE's look.
I personally prefer the Eclipse Color Themes plugin and the Gedit Original Oblivion theme. A great thing about this project is that the whole theme settings are available online in a nicely presented way. And this is where things get interesting: this allows the programmer to create a consistent style across code and logs, providing a really easy and intuitive way of organizing your thoughts, and it improves your overall orientation.

Grep console in action

The following paragraphs showcase the applied styles for the three mentioned events of interest.

Execution of an SQL expression

One of the most basic tasks when analyzing logs is to locate the execution of a certain SQL expression. We use datasource-proxy for SQL logging purposes in our project, which provides proxy classes for the JDBC API to intercept executed queries. Because of this we were able to easily map Grep console's regular expression to the CommonsQueryLoggingListener class that logs any data-source interaction. Given our local style settings, this is what one sees in the console view within the Eclipse IDE.
Color: #1eff67
Expression: .*net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener.*

2013-11-18 09:00:08,893 [systemMonitoringScheduler-1] DEBUG net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener – Name:dmDataSource, Time:1, Num:1, Query:{[select SQ_S_SYS_MON.nextval from dual][]}
2013-11-18 09:00:08,896 [systemMonitoringScheduler-1] DEBUG net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener – Name:dmDataSource, Time:2, Num:1, Query:{[insert into S_SYS_MON (COL_A, COL_B, COL_C) values (?, ?, ?)][1, 2013-11-18 08:59:07.872, A]}
2013-11-18 09:01:10,920 [systemMonitoringScheduler-1] DEBUG net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener – Name:dmDataSource, Time:1, Num:1, Query:{[select SQ_S_SYS_MON.nextval from dual][]}
2013-11-18 09:01:10,924 [systemMonitoringScheduler-1] DEBUG net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener – Name:dmDataSource, Time:3, Num:1, Query:{[insert into S_SYS_MON (COL_A, COL_B, COL_C) values (?, ?, ?)][2, 2013-11-18 09:59:07.872, B]}
2013-11-18 09:02:12,946 [systemMonitoringScheduler-1] DEBUG net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener – Name:dmDataSource, Time:1, Num:1, Query:{[select SQ_S_SYS_MON.nextval from dual][]}
2013-11-18 09:02:12,949 [systemMonitoringScheduler-1] DEBUG net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener – Name:dmDataSource, Time:2, Num:1, Query:{[insert into S_SYS_MON (COL_A, COL_B, COL_C) values (?, ?, ?)][3, 2013-11-18 10:59:07.872, C]}
2013-11-18 09:03:14,971 [systemMonitoringScheduler-1] DEBUG net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener – Name:dmDataSource, Time:1, Num:1, Query:{[select SQ_S_SYS_MON.nextval from dual][]}
2013-11-18 09:03:14,974 [systemMonitoringScheduler-1] DEBUG net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener – Name:dmDataSource, Time:2, Num:1, Query:{[insert into S_SYS_MON (COL_A, COL_B, COL_C) values (?, ?, ?)][4, 2013-11-18 11:59:07.872, D]}
2013-11-18 09:04:16,999 [systemMonitoringScheduler-1] DEBUG
net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener – Name:dmDataSource, Time:2, Num:1, Query:{[select SQ_S_SYS_MON.nextval from dual][]}

Beginning and end of a transaction

It is always a good idea to mind the transactions, even more so while debugging. With Grep console it's really easy to find the relevant portion of the log for a given transaction. After applying the proper expression and color style, the whole transaction management becomes nicely visible and won't be buried in lines of log. As you can see, the whole situation becomes instantly recognizable solely on the basis of color assignment.

Color: #ffff00
Expression: .*org.hibernate.transaction.JDBCTransaction.*

2013-11-18 08:36:51,211 [http-bio-8080-exec-6] DEBUG org.hibernate.transaction.JDBCTransaction – begin
2013-11-18 08:36:51,211 [http-bio-8080-exec-6] DEBUG org.hibernate.transaction.JDBCTransaction – current autocommit status: true
2013-11-18 08:36:51,212 [http-bio-8080-exec-6] DEBUG org.hibernate.transaction.JDBCTransaction – disabling autocommit
2013-11-18 08:36:51,223 [http-bio-8080-exec-6] DEBUG net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener – Name:dmDataSource, Time:8, Num:1, Query:{[select count(*) from ABC where a like ? order by b asc][AI%]}
2013-11-18 08:36:51,226 [http-bio-8080-exec-6] DEBUG org.hibernate.transaction.JDBCTransaction – commit
2013-11-18 08:36:51,229 [http-bio-8080-exec-6] DEBUG org.hibernate.transaction.JDBCTransaction – re-enabling autocommit
2013-11-18 08:36:51,230 [http-bio-8080-exec-6] DEBUG org.hibernate.transaction.JDBCTransaction – committed JDBC Connection

Web service request & response

The last event of interest for me is the receiving of a SOAP request or response, due to the degree of integration required/provided by our application. By specifying the following two expressions, we are now able to track exactly the points where a message enters/leaves the realm of our application.
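Under the hood these expressions are ordinary Java regular expressions matched line by line; a quick sketch in plain Java (sample lines abbreviated from the logs above, dots escaped for precision) shows how such patterns select the interesting lines:

```java
import java.util.List;
import java.util.regex.Pattern;

public class GrepDemo {
    public static void main(String[] args) {
        // Same style of expressions as configured in Grep console
        Pattern sql = Pattern.compile(".*net\\.ttddyy\\.dsproxy\\.listener\\.CommonsQueryLoggingListener.*");
        Pattern tx  = Pattern.compile(".*org\\.hibernate\\.transaction\\.JDBCTransaction.*");

        // Abbreviated sample log lines
        List<String> log = List.of(
            "DEBUG org.hibernate.transaction.JDBCTransaction - begin",
            "DEBUG net.ttddyy.dsproxy.listener.CommonsQueryLoggingListener - Name:dmDataSource, Query:{[select 1][]}",
            "DEBUG org.hibernate.transaction.JDBCTransaction - commit");

        long sqlLines = log.stream().filter(l -> sql.matcher(l).matches()).count();
        long txLines  = log.stream().filter(l -> tx.matcher(l).matches()).count();
        System.out.println(sqlLines + " " + txLines); // 1 SQL line, 2 transaction lines
    }
}
```

Each pattern that matches a console line triggers the style assigned to it, which is all the plugin needs to colorize the three event types.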
Color: #03abff
Expression: .*org.springframework.ws.client.MessageTracing.*
Expression: .*org.springframework.ws.server.MessageTracing.*

2013-11-18 10:31:48,288 [http-bio-8080-exec-3] TRACE org.springframework.ws.soap.saaj.support.SaajUtils – SOAPElement [com.sun.xml.internal.messaging.saaj.soap.ver1_1.Envelope1_1Impl] implements SAAJ 1.3
2013-11-18 10:31:48,315 [http-bio-8080-exec-3] TRACE org.springframework.ws.server.MessageTracing.received – Received request [<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://a.b.c.com/product/v1.0"> <soapenv:Header/> <soapenv:Body> <xsd:getItem> <xsd:itemId>1</xsd:itemId> </xsd:getItem> </soapenv:Body> </soapenv:Envelope>]
2013-11-18 10:31:48,319 [http-bio-8080-exec-3] TRACE org.springframework.ws.soap.saaj.support.SaajUtils – SOAPElement [com.sun.xml.internal.messaging.saaj.soap.ver1_1.Body1_1Impl] implements SAAJ 1.3
2013-11-18 10:31:48,321 [http-bio-8080-exec-3] DEBUG org.springframework.ws.server.endpoint.mapping.PayloadRootAnnotationMethodEndpointMapping – Looking up endpoint for [{http://a.b.c.com/product/v1.0}getItemRequest]
2013-11-18 10:31:48,321 [http-bio-8080-exec-3] DEBUG org.springframework.ws.soap.server.SoapMessageDispatcher – Endpoint mapping [org.springframework.ws.server.endpoint.mapping.PayloadRootAnnotationMethodEndpointMapping@7115660] maps request to endpoint [public com.c.b.a.v10.getItem com.c.b.a.ws.ws.endpoint.ItemQuery.getItem(com.c.b.a.v10.getItemRequestDocument)]
2013-11-18 10:31:48,324 [http-bio-8080-exec-3] TRACE org.springframework.ws.soap.saaj.support.SaajUtils – SOAPElement [com.sun.xml.internal.messaging.saaj.soap.ver1_1.Header1_1Impl] implements SAAJ 1.3
2013-11-18 10:31:48,325 [http-bio-8080-exec-3] DEBUG org.springframework.ws.soap.server.SoapMessageDispatcher – Testing endpoint adapter [org.springframework.ws.server.endpoint.adapter.GenericMarshallingMethodEndpointAdapter@16a77b1c]

Conclusion

Using this plugin really simplifies the development
process and frees your mind to worry about more important issues than browsing meters of logs. The setup is fairly simple, and the only limit is your ability to locate the events of interest and create matching rules for them. Another great benefit is the ability to customize the console logs to match the way your IDE looks, providing a uniform development environment. It is also great for team cooperation, since by defining a common set of styles you will always encounter the same logging output even on your colleagues' computers. I contacted the developer of Grep console and asked about the possibility of opening external files in Grep console so the styles would be applied to them as well, and he said it is an interesting idea, so we just might get this functionality in future releases of the plugin. All this being said, I find Grep console to be one of the essential tools enhancing my development process and can honestly recommend using it.

Reference: Master your IDE logging with Grep console from our JCG partner Jakub Stas at the Jakub Stas blog....

When writing too much code can kill you

So now that I've lured you in with that provocative title, I suppose I need to clarify. Well, it's true: too much coding can kill you. The real question is "what is the reason?" and the answer to that is chronic stress. So why write about this? Well, it's personal. You see, it happened to me, and I'm hoping to warn other people that it could happen to them too.

"Holy Smokes Batman, do you mean it takes years?"

So what is chronic stress? To quote Wikipedia: chronic stress is the response to emotional pressure suffered for a prolonged period over which an individual perceives he or she has no control. A few key observations:

Chronic stress is prolonged stress experienced over a long time period, often years.
There is a distinct physical effect on the body, resulting in the continuous release of hormones meant to be released only for a temporary period of time.
When you are in a state of prolonged stress, you may not recognise it, or if you do, you may feel that you cannot do anything about it.

"Cortico...what!!??"

The human body is designed to deal with stress, provided it's temporary. When you are under stress your body releases hormones, most notably adrenalin and cortisol. Adrenalin boosts your heart rate and energy supplies. Cortisol increases glucose in the bloodstream and increases the body's ability to produce glucose and repair tissue. If you were to compare it to a car, cortisol is like nitrous and adrenalin is your turbo. Now, nitrous is meant for a short boost; running it permanently is not great for your engine. Adrenalin and cortisol are pretty much the same: run them permanently and it's not great for your body.
Here are a few highlights of what you might expect when you suffer from chronic stress:

Anxiety
Irritability
Recurring insomnia
Headaches
Digestive disorders
Depression
Reliance on boosters such as caffeine
Spikes and drops in energy at odd times
Unexplained sudden weight gain
Increase in illness and difficulty in recovering from illness

And (the ones that can kill you):

Hypertension
Cardiovascular disease

It's also worth noting that many of the symptoms have knock-on symptoms. For me in particular, I started suffering from insomnia almost a year and a half before I was diagnosed with chronic stress. This had quite a negative effect on my mood and concentration.

Developers are like Soylent Green

So developers – contrary to popular belief – are people. In other words, they are fallible, limited creatures who have limits on what they can do. The problem is that we often subject ourselves to increased stress because of crazy timelines, unrealistic expectations (both our employers' and our own) and no small amount of ambition. This is complicated if you are a bit of an A-type personality who has a tendency to take more and more things on. The sad part is that the more you take on and the higher your stress levels become, the less productive you become.

The writing is on the wall

First, chronic stress is not like a common cold: the symptoms develop over a long period of time, and furthermore it's not something that is really well understood. The result is that it's the type of thing that creeps up on you, and before you know it you are chronically stressed out and need treatment. So here are some early warning signs that you can learn to identify:

Recurrent episodes of insomnia.
Multiple people repeatedly tell you "You look tired" or something similar.
Your buttons suddenly start being pushed often and you find yourself getting more easily aggravated at things.
You start becoming very negative, and people tell you as much.
You suddenly realise you haven't engaged in a personal hobby or done something you like in quite some time.
You often have a headache and/or heartburn.
You struggle to switch off from work and/or you're always anxious about work.
You often work overtime because it's the only time you can get "flow", due to the multitude of distractions in your environment.

Deal with it!

There is quite a lot of help out there with regard to reducing stress, but from a personal view there are a few things worth mentioning:

Ultimately no one actually forces you to be stressed out; if you don't find a way to deal with it, no one else will.
Other people won't get it or recognise it unless they have first-hand experience; you have to accept and deal with it yourself.
Acknowledge your limits.
Take stress seriously; it can damage your relationships, your career and your health.
Don't be afraid to get professional help.
Take that vacation.
Be careful when you blur work and pleasure, especially if one of your hobbies is writing code.

So I hope this was informative, and here's hoping for no more stress and – of course – staying alive.

Reference: When writing too much code can kill you from our JCG partner Julian Exenberger at the Dot Neverland blog....

A Few Thoughts on Code Completion

Before launching into some philosophical musings about what programmers' crack addiction to code completion means, a few observations about my path. In 2003, I started using Eclipse. Before that I had been using JBuilder a lot. Eclipse launched claiming that incremental compilation was the future and anyone not using it was lost. You would see your errors as you typed. I remember not being that thrilled at the prospect, but I did find, over time, that with Eclipse I was able to get an environment going that was very fast. The wheels started coming off my love affair with it when first the WTP project turned up lame release after release, and then, after adopting Maven, the plugin situation there just produced stunted, epileptic madness. Also, I got to try Xcode, and wow, I thought 'this thing is so much faster and cleaner, if only it had a few more things, particularly TDD stuff.' Well, it has that stuff now and it is, no doubt, IMHO, the best IDE there is. Of course, I cannot write big data programs with it, or web stuff, but for mobile and applications, it's nuts. Lately, as per my ranting 'documentation' on here, I have been using IntelliJ, first as part of Android Studio, then for a project using Play and Akka. It drives me bananas because it's so slow. I can't stand to work in it. I prefer to just tab over to the console and have Play run my tests. It's really stupid too; it gets crossed up easily, and if I change things in the project, I have to regenerate the project files, which seems so many pales past a diaper I can't even really say the feeling it evokes. Anyway, by every other measure, I think that IntelliJ has Eclipse handily beaten. Its 'Live Templates' are much better than Eclipse's templates, which evolved not one iota in the 10 years since I started using them (note: evolved meaning more than some minute crumb toss of silly improvements). Recently, I stuck my head back into Sublime Text.
The reason is that I was using Bootstrap and ran into it somewhere, then I saw this tutorial and was kind of ________ (insert your preferred phrase along the lines of blown away). First off, this thing is stupid fast. Secondly, the tutorial shows you things happening that are very compelling. I started to imagine: if I used this for Play, or Akka, what would I miss? Of course, the answer comes back pretty quickly: code completion. Code completion is pretty great and I have showered as much love and praise on it since it crept into development as anyone. However, today, I was thinking about it, and I got to thinking it's pretty baby-like. Like the IDE is offstage cooing for baby to open his mouth and have each cut piece put in, then to ask for more once that one's been chewed up. Now, the counterargument to this is that since the C++/Java cataclysm, we live in a world where we are programming with hundreds of APIs all the time. I have been using Hadoop a bit lately. It's a perfect example of what OSS engenders: a big, rambling barnyard of crap, another project for every single little crack that needs some caulk. One of the reasons that Xcode and Apple are such a pleasure is that you can learn a framework and then use it for most of your coding, rather than constantly going in and out of tons of little halfwit pseudo-frameworks that practice different conventions and have no concept of binding logic (they don't know who else you are working with, after all). So I started thinking: could I do some coding in Sublime? And what would that be like? My highest priority in a development environment is speed of TDD, period, above all else; after that, it's probably code completion. But I think there might be ways to live without it.
The Staging of Construction

If you think about it, code completion is really needed for two reasons:

a. you are not familiar with the API of the thing you are using
b. you are not familiar because it was badly designed and failed its heuristic challenge

What would you think the spread of this would be? I would say at least 80% of cases contain some of b. For example, no sane person would debate the fact that Joda is better than the JDK's time/date functions. However, I don't think Joda is a good API from the perspective of heuristics, because I've used it countless times and I never remember all its quirky silliness. I can't even remember its concepts, yeah ok, Instant, Duration, etc., but how they map those concepts, to me, is not great, and does not leave a lasting impression. So now we're saying baby has to have his/her food cut up, because the wild west of everyone getting API mindshare means you're never going to know hardly anything of what you are using. Let me pause for a moment to note the vast tragedy of this situation. The main reason it's cosmically tragic, AFAIC, is that this is perhaps the main thing that objects were supposed to do that structured programming (methods) could not: as functions get more complex, their signatures grow beyond the 7 +/- 2 horizon (how many concurrent concepts a human being can keep suspended), and the ability to achieve any kind of fluency vanishes. I have made this argument before, but the real genius of the modern version of Objective-C is that through categories, and its newer notation (esp. literals), it's kind of saying 'let the user negotiate the last mile of making things more usable.' And there's a ton of wisdom in that, because heuristics can often be impacted by need! I don't want to carry a ton of baggage when, for what I am doing, I only need a small fraction of it. Another possible solution to this is my favorite obsessive pattern: Builder.
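Since the Builder pattern is named here as a way around clumsy construction, a minimal sketch may help. This is the shape of Bloch's static builder from Effective Java; the NutritionFacts example is the canonical one, and this condensed version (fewer fields, my own main driver) is a sketch, not a quote:

```java
// A minimal static builder in the style of Effective Java, Item 2
public class NutritionFacts {
    private final int servingSize; // required
    private final int calories;    // optional
    private final int sodium;      // optional

    public static class Builder {
        private final int servingSize;
        private int calories = 0;
        private int sodium = 0;

        public Builder(int servingSize) { this.servingSize = servingSize; }

        // Each setter returns the builder itself, so typing '.' after any
        // call lets code completion offer the next step in the chain
        public Builder calories(int val) { calories = val; return this; }
        public Builder sodium(int val)   { sodium = val; return this; }
        public NutritionFacts build()    { return new NutritionFacts(this); }
    }

    private NutritionFacts(Builder b) {
        servingSize = b.servingSize;
        calories = b.calories;
        sodium = b.sodium;
    }

    public static void main(String[] args) {
        NutritionFacts cola = new NutritionFacts.Builder(240)
                .calories(100)
                .sodium(35)
                .build();
        System.out.println(cola.servingSize + " " + cola.calories + " " + cola.sodium);
    }
}
```

The chaining is exactly what makes the pattern pleasant both with and without an IDE: the builder's small method surface is easy to hold in your head.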
Bloch's static Builder pattern largely makes complex construction much more fluid and easier to navigate. Now, Builder is also PERFECT for code completion, because the chained this reference allows you to just keep going down, hitting '.' and getting another method to pick. One possible solution in ST would be to program a builder user plugin that would read in the classes, grab the builders, parse them, and then suggest the methods. Or it might make sense to just have a window open in a kind of sidebar configuration, and as you start the process of constructing or calling something, it would appear in that window, or its documentation, or maybe both. Cocoa is one of the oldest and most mature frameworks, and by and large, code completion does not really make that much difference when using it. Partially because the methods are easy to remember, like viewDidLoad or viewWillAppear. But also because you use them so many times that you learn them and then don't really need much help. Clearly, any kind of Lean consciousness about these matters has to see that the best of situations is but a tradeoff at this point: the programmer spent a few less seconds looking for a method call, but then, as tests were run and rerun in IDEs that are ludicrously slow, those savings were/will be bled back into the pool 100x over. People used to joke that C++ was the wrapper language. Wrappers are not bad things if they maintain fluency, especially on joints that are going to see a lot of traffic in the course of a long project. Code completion is to IDEs as Outlook is to the PC: a boat anchor, encouraging people to just stay in their cabin, even as the ship takes on water from countless other directions. Until a native, blazing IDE is available for web development, finding alternatives to inline code completion might be the best escape hatch.

Reference: A Few Thoughts on Code Completion from our JCG partner Rob Williams at the Rob Williams' Blog blog....

Knowledge Representation and Reasoning with Graph Databases

A graph database and its ecosystem of technologies can yield elegant, efficient solutions to problems in knowledge representation and reasoning. To get a taste of this argument, we must first understand what a graph is. A graph is a data structure. There are numerous types of graph data structures, but for the purpose of this post, we will focus on a type that has come to be known as a property graph. A property graph denotes vertices (nodes, dots) and edges (arcs, lines). Edges in a property graph are directed and labeled/typed (e.g. "marko knows peter"). Both vertices and edges (known generally as elements) can have any number of key/value pairs associated with them. These key/value pairs are called properties. From this foundational structure, a suite of questions can be answered and problems solved.

Object Modeling

The property graph data structure is nearly identical in form to the object graphs of object-oriented programming. Take a collection of objects, remove their methods, and you are left with a property graph. An object's fields are either primitive, in which case they serve as properties, or complex, in which case they serve as references to other objects. For example, in Java:

class Person {
    String name;
    Integer age;
    Collection<Person> knows;
}

The name and age fields are vertex properties of the particular person instance, and the knows field refers to knows-labeled edges to other people. Emil Eifrem of Neo Technology espouses the view that property graphs are "whiteboard friendly", as they are aligned with the semantics of modern object-oriented languages and the diagramming techniques used by developers. A testament to this idea is the jo4neo project by Taylor Cowan. With jo4neo, Java annotations are elegantly used to allow the backing of a Java object graph by the Neo4j graph database. Beyond the technological benefits, the human mind tends to think in terms of objects and their relations.
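To make the object-graph correspondence concrete, here is a toy sketch in plain Java: a Person-like vertex carrying properties and labeled outgoing edges, queried without any database. The Vertex plumbing and the sample data are my own invention for illustration, not part of any graph-database API:

```java
import java.util.*;
import java.util.stream.Collectors;

// A toy property-graph vertex: key/value properties plus labeled outgoing edges
class Vertex {
    final Map<String, Object> props = new HashMap<>();
    final Map<String, List<Vertex>> out = new HashMap<>();

    Vertex(String name, int age) {
        props.put("name", name);
        props.put("age", age);
    }

    void addEdge(String label, Vertex to) {
        out.computeIfAbsent(label, k -> new ArrayList<>()).add(to);
    }

    List<Vertex> outE(String label) {
        return out.getOrDefault(label, List.of());
    }
}

public class GraphDemo {
    public static void main(String[] args) {
        Vertex marko = new Vertex("marko", 29);
        Vertex peter = new Vertex("peter", 35);
        Vertex josh  = new Vertex("josh", 27);
        marko.addEdge("knows", peter);
        marko.addEdge("knows", josh);

        // Who does Marko know?
        List<String> names = marko.outE("knows").stream()
                .map(v -> (String) v.props.get("name"))
                .collect(Collectors.toList());

        // Who does Marko know that are 30+ years old?
        List<String> over30 = marko.outE("knows").stream()
                .filter(v -> (int) v.props.get("age") > 30)
                .map(v -> (String) v.props.get("name"))
                .collect(Collectors.toList());

        System.out.println(names + " " + over30);
    }
}
```

A graph database generalizes exactly this shape (elements with properties, labeled directed edges) while adding persistence, indexing, and a traversal language.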
Thus, graphs may be considered "human brain friendly" as well. Given an object graph, questions can be answered about the domain. In the graph traversal DSL known as Gremlin, we can ask questions of the object graph:

// Who does Marko know?
marko.outE('knows').inV

// What are the names of the people that Marko knows?
marko.outE('knows').inV.name

// What are the names and ages of the people that Marko knows?
marko.outE('knows').inV.emit{[it.name, it.age]}

// Who does Marko know that are 30+ years old?
marko.outE('knows').inV{it.age > 30}

Concept Modeling

Beyond the instances that compose a model, there may exist abstract concepts. For example, while there may be book instances, there may also be categories into which those books fall – e.g. science fiction, technical, romance, etc. The graph is a flexible structure in that it allows one to express that something is related to something else in some way. These somethings may be real or ethereal. As such, ontological concepts can be represented along with their instances and queried appropriately to answer questions.

// What are the parent categories of history?
x = []; history.inE('subCategory').outV.aggregate(x).loop(3){!it.equals(literature)}; x

// How many descendant categories does fiction have?
c = 0; fiction.outE('subCategory').inV.foreach{c++}.loop(3){true}; c

// Is romance at the same depth as history?
c = 0; romance.inE('subCategory').outV.loop(2){c++; !it.equals(literature)}.outE('subCategory').inV.loop(2){c--; !it.equals(history)}; c == 0

Automated Reasoning

From the explicit objects, their relationships, and their abstract categories, reasoning processes can be enacted. A tension that exists in graph modeling is what to make explicit (structure) and what to infer through traversal (process). The trade-off is, like much of computing, between space and time. If there exists an edge from a person to their coauthors, then it is a single hop to get from that person to his or her coauthors.
If, on the other hand, coauthors must be inferred through shared writings, then a multi-hop traversal is computed to determine coauthors. Reasoning is the process of making what is implicit explicit. A couple of simple reasoning examples are presented below using Gremlin.

// Two people who wrote the same book/article/etc. are coauthors
g.V{x = it}.outE('wrote').inV.inE('wrote').outV.except([x])[0].foreach{g.addEdge(null, x, it, 'hasCoauthor')}

// People who write literature are authors
author = g.addVertex(); author.type='role'; author.name='author'
g.V.foreach{it.outE('wrote').inV[0].foreach{g.addEdge(null, it, author, 'hasRole')} >> -1}

In the examples above, a full graph analysis is computed to determine all coauthors and author roles. However, nothing prevents the evaluation of local inference algorithms.

// Marko's coauthors are those people who wrote the same books/articles/etc. as him
marko.outE('wrote').inV.inE('wrote').outV.except([marko])[0].foreach{g.addEdge(null, marko, it, 'hasCoauthor')}

Conclusion

Graphs are useful for modeling objects, their relationships to each other, and the conceptual structures in which they lie. From this explicit information, graph query and inference algorithms can be evaluated to answer questions on the graph and to increase the density of the explicit knowledge contained within it (i.e. increase the number of vertices and edges). This particular graph usage pattern has been exploited to a great extent in the world of RDF (knowledge representation) and RDFS/OWL (reasoning). The world of RDF/RDFS/OWL is primarily constrained to description logics (see an argument to the contrary here). Description logics are but one piece of the larger field of knowledge representation and reasoning. There are numerous logics that can be taken advantage of. In the emerging space of graph databases, the necessary building blocks exist to support the exploitation of other logics.
Moreover, these logics, in some instances, may be used concurrently within the same graphical structure. To this point, the reading list below provides a collection of books that explicate different logics and ideas regarding heterogeneous reasoning. Graph databases provide a green field in which these ideas can be realized.

Further Reading

  • Brachman, R., Levesque, H., “Knowledge Representation and Reasoning,” Morgan Kaufmann, 2004.
  • Wang, P., “Rigid Flexibility: The Logic of Intelligence,” Springer, 2006.
  • Mueller, E.T., “Commonsense Reasoning,” Morgan Kaufmann, 2006.
  • Minsky, M., “The Society of Mind,” Simon & Schuster, 1988.

Reference: Knowledge Representation and Reasoning with Graph Databases from our JCG partner Marko Rodriguez at the Marko A. Rodriguez blog....

Spring Security Misconfiguration

I recently saw Mike Wiesner’s SpringOne2GX talk about application security pitfalls. It is very informative and worth watching if you are using Spring’s stack on a servlet container. It reminded me of one serious Spring Security misconfiguration I faced once. I am going to explain it using Spring’s guide project called Securing a Web Application. This project uses Spring Boot, Spring Integration and Spring MVC. The project registers these views:

@Configuration
public class MvcConfig extends WebMvcConfigurerAdapter {
    @Override
    public void addViewControllers(ViewControllerRegistry registry) {
        registry.addViewController("/home").setViewName("home");
        registry.addViewController("/").setViewName("home");
        registry.addViewController("/hello").setViewName("hello");
        registry.addViewController("/login").setViewName("login");
    }
}

Here the “/home”, “/” and “/login” URLs should be publicly accessible and “/hello” should be accessible only to an authenticated user. This is the original Spring Security configuration from the guide:

@Configuration
@EnableWebMvcSecurity
public class WebSecurityConfig extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http
            .authorizeRequests()
                .antMatchers("/", "/home").permitAll()
                .anyRequest().authenticated();
        http
            .formLogin()
                .loginPage("/login")
                .permitAll()
                .and()
            .logout()
                .permitAll();
    }

    @Override
    protected void configure(AuthenticationManagerBuilder auth) throws Exception {
        auth
            .inMemoryAuthentication()
                .withUser("user").password("password").roles("USER");
    }
}

Nice and explanatory, as all Spring’s guides are. The first configure method registers “/” and “/home” as public and specifies that everything else should be authenticated. It also registers the login URL. The second configure method specifies the authentication for the role “USER”. Of course you don’t want to use an in-memory user with a plain-text password like this in production! Now I am going to slightly amend this code.
@Override
protected void configure(HttpSecurity http) throws Exception {
    //!!! Don't use this example !!!
    http
        .authorizeRequests()
            .antMatchers("/hello").hasRole("USER");
    //... same as above ...
}

Now everything is public by default and private endpoints have to be listed explicitly. You can see that my amended code has the same behavior as the original. In fact it saved one line of code. But there is a serious problem with this. What if I need to introduce a new private endpoint? Let’s say I am not aware of the fact that it needs to be registered in the Spring Security configuration. My new endpoint would be public. Such a misconfiguration is really hard to catch and can lead to unwanted exposure of URLs. So the conclusion is: always authenticate all endpoints by default.

Reference: Spring Security Misconfiguration from our JCG partner Lubos Krnac at the Lubos Krnac Java blog....
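The difference between the two configurations boils down to a whitelist versus a blacklist of URLs. A plain-Java sketch (no Spring involved; the `PathPolicy` class and the `/admin` path are invented for illustration) shows why the blacklist silently leaks an endpoint that nobody remembered to register:

```java
import java.util.*;

// Models the two security defaults from the article. This is NOT
// Spring Security code; it only shows how each default treats an
// endpoint that was never registered in the configuration.
public class PathPolicy {
    static Set<String> publicPaths  = Set.of("/", "/home", "/login");
    static Set<String> privatePaths = Set.of("/hello");

    // Secure default (like anyRequest().authenticated()): only listed
    // paths are public, everything else requires authentication.
    static boolean needsAuthWhitelist(String path) {
        return !publicPaths.contains(path);
    }

    // Insecure default (the amended code): only listed paths require
    // authentication, everything else is public.
    static boolean needsAuthBlacklist(String path) {
        return privatePaths.contains(path);
    }

    public static void main(String[] args) {
        // A hypothetical endpoint added later, missing from both lists:
        String newEndpoint = "/admin";
        System.out.println(needsAuthWhitelist(newEndpoint)); // true  -> protected
        System.out.println(needsAuthBlacklist(newEndpoint)); // false -> leaked
    }
}
```

Under the whitelist the forgotten endpoint fails safe (users see a login page and someone notices); under the blacklist it fails open, which is exactly the hard-to-catch exposure described above.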
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
