Home » Java » Core Java » Turning recursive file system traversal into Stream

About Tomasz Nurkiewicz

Tomasz Nurkiewicz
Java EE developer, Scala enthusiast. Enjoying data analysis and visualization. Strongly believes in the power of testing and automation.

Turning recursive file system traversal into Stream

When I was learning programming, back in the days of Turbo Pascal, I managed to list files in directory usingFindFirstFindNext and FindClose functions. First I came up with a procedure printing contents of a given directory. You can imagine how proud I was to discover I can actually call that procedure from itself to traverse file system recursively. Well, I didn’t know the term recursion back then, but it worked. Similar code in Java would look something like this:
 
 
 
 
 

public void printFilesRecursively(final File folder) {
    for (final File entry : listFilesIn(folder)) {
        if (entry.isDirectory()) {
            printFilesRecursively(entry);
        } else {
            System.out.println(entry.getAbsolutePath());
        }
    }
}

private File[] listFilesIn(File folder) {
    final File[] files = folder.listFiles();
    return files != null ? files : new File[]{};
}

Didn’t know File.listFiles() can return null, did ya? That’s how it signals I/O errors, like if IOException never existed. But that’s not the point. System.out.println() is rarely what we need, thus this method is neither reusable nor composable. It is probably the best counterexample of Open/Closed principle. I can imagine several use cases for recursive traversal of file system:

  1. Getting a complete list of all files for display purposes
  2. Looking for all files matching given pattern/property (also check out File.list(FilenameFilter))
  3. Searching for one particular file
  4. Processing every single file, e.g. sending it over network

Every use case above has a unique set of challenges. For example we don’t want to build a list of all files because it will take a significant amount of time and memory before we can start processing it. We would like to process files as they are discovered and lazily – by pipe-lining computation (but without clumsy visitor pattern). Also we want to short-circuit searching to avoid unnecessary I/O. Luckily in Java 8 some of these issues can be addressed with streams:

final File home = new File(FileUtils.getUserDirectoryPath());
final Stream<Path> files = Files.list(home.toPath());
files.forEach(System.out::println);

Remember that Files.list(Path) (new in Java 8) does not look into subdirectories – we’ll fix that later. The most important lesson here is: Files.list() returns a Stream<Path> – a value that we can pass around, compose, map, filter, etc. It’s extremely flexible, e.g. it’s fairly simple to count how many files I have in a directory per extension:

import org.apache.commons.io.FilenameUtils;

//...

final File home = new File(FileUtils.getUserDirectoryPath());
final Stream<Path> files = Files.list(home.toPath());
final Map<String, List<Path>> byExtension = files
        .filter(path -> !path.toFile().isDirectory())
        .collect(groupingBy(path -> getExt(path)));

byExtension.
        forEach((extension, matchingFiles) ->
                System.out.println(
                        extension + "\t" + matchingFiles.size()));

//...

private String getExt(Path path) {
    return FilenameUtils.getExtension(path.toString()).toLowerCase();
}

OK, just another API, you might say. But it becomes really interesting once we need to go deeper, recursively traversing subdirectories. One amazing feature of streams is that you can combine them with each other in various ways. Old Scala saying “flatMap that shit” is applicable here as well, check out this recursive Java 8 code:

//WARNING: doesn't compile, yet:

private static Stream<Path> filesInDir(Path dir) {
    return Files.list(dir)
            .flatMap(path ->
                    path.toFile().isDirectory() ?
                            filesInDir(path) :
                            singletonList(path).stream());
}

Stream<Path> lazily produced by filesInDir() contains all files within directory including subdirectories. You can use it as any other stream by calling map()filter()anyMatch()findFirst(), etc. But how does it really work?flatMap() is similar to map() but while map() is a straightforward 1:1 transformation, flatMap() allows replacing single entry in input Stream with multiple entries. If we had used map(), we would have end up withStream<Stream<Path>> (or maybe Stream<List<Path>>). But flatMap() flattens this structure, in a way exploding inner entries. Let’s see a simple example. Imagine Files.list() returned two files and one directory. For filesflatMap() receives a one-element stream with that file. We can’t simply return that file, we have to wrap it, but essentially this is no-operation. It gets way more interesting for a directory. In that case we call filesInDir()recursively. As a result we get a stream of contents of that directory, which we inject into our outer stream.

Code above is short, sweet and… doesn’t compile. These pesky checked exceptions again. Here is a fixed code, wrapping checked exceptions for sanity:

public static Stream<Path> filesInDir(Path dir) {
    return listFiles(dir)
            .flatMap(path ->
                    path.toFile().isDirectory() ?
                            filesInDir(path) :
                            singletonList(path).stream());
}

private static Stream<Path> listFiles(Path dir) {
    try {
        return Files.list(dir);
    } catch (IOException e) {
        throw Throwables.propagate(e);
    }
}

Unfortunately this quite elegant code is not lazy enough. flatMap() evaluates eagerly, thus it always traverses all subdirectories, even if we barely ask for first file. You can try with my tiny LazySeq library that tries to provide even lazier abstraction, similar to streams in Scala or lazy-seq in Clojure. But even standard JDK 8 solution might be really helpful and simplify your code significantly.

Do you want to know how to develop your skillset to become a Java Rockstar?
Subscribe to our newsletter to start Rocking right now!
To get you started we give you our best selling eBooks for FREE!
1. JPA Mini Book
2. JVM Troubleshooting Guide
3. JUnit Tutorial for Unit Testing
4. Java Annotations Tutorial
5. Java Interview Questions
6. Spring Interview Questions
7. Android UI Design
and many more ....
Email address:

Leave a Reply

Be the First to Comment!

avatar
  Subscribe  
Notify of
Want to take your Java skills to the next level?
Grab our programming books for FREE!
Here are some of the eBooks you will get:
  • Spring Interview QnA
  • Multithreading & Concurrency QnA
  • JPA Minibook
  • JVM Troubleshooting Guide
  • Advanced Java
  • Java Interview QnA
  • Java Design Patterns
The Spring Framework Cookbook
  • Learn the best practices of the Spring Framework
  • Build simple, portable, fast and flexible JVM-based systems and applications
  • Explore specific projects like Boot and Batch
The Spring Data Programming Cookbook
  • Learn how to use data access technologies and cloud-based data services
  • Set up the environment and create a basic project
  • Learn how to handle the various modules (e.g. JPA, MongoDB, Redis etc.)
The Selenium Programming Cookbook
  • Kick-start your own projects using this testing framework for web applications
  • Learn JUnit integration and Standalone Server functionality
  • Find out the most popular Interview Questions about the Selenium Framework
The Mockito Programming Cookbook
  • Kick-start your own web projects using this open source testing framework
  • Write simple test cases using the Mockito Framework
  • Learn how to integrate with JUnit, Maven and other frameworks
The JUnit Programming Cookbook
  • Learn basic usage and configuration of JUnit
  • Create multithreaded tests
  • Learn how to integrate with other testing frameworks
The JSF 2.0 Programming Cookbook
  • Build component-based user interfaces for web applications
  • Set up the environment and create a basic project
  • Learn Internationalization and Facelets Templates
The Amazon S3 Tutorial
  • Develop your own Amazon S3 based applications
  • Learn API usage and pricing
  • Get your own projects up and running in minimum time
Java Design Patterns
  • Learn how Design Patterns are implemented and utilized in Java
  • Understand the reasons why patterns are so important
  • Learn when and how to apply each one of them
Java Concurrency Essentials
  • Dive into the magic of concurrency
  • Learn about concepts like atomicity, synchronization and thread safety
  • Learn about testing concurrent applications
The IntelliJ IDEA Handbook
  • Kick-start your own programming projects using IntelliJ IDEA
  • Learn how to setup and install plugins
  • Create UIs with this Java integrated development environment
The Git Tutorial
  • Learn why Git differs from other version control systems
  • Explore Git's usage and best practises
  • Learn branching strategies
The Eclipse IDE Handbook
  • Explore the most widely used Java IDE
  • Learn how to setup and install plugins
  • Built your own projects up and running in minimum time
The Docker Containerization Cookbook
  • Explore the world’s leading software containerization platform
  • Learn how to wrap a piece of software in a complete filesystem
  • Learn how to use DNS and various commands
Developing Modern Applications With Scala
  • Develop modern Scala applications
  • Build SBT and reactive applications
  • Learn about testing and database access
The Apache Tomcat Cookbook
  • Explore Apache Tomcat open-source web server
  • Learn about installation, configuration, logging and clustering
  • Kick-start your own web projects using Apache Tomcat
The Apache Maven Cookbook
  • Explore the Apache Maven build automation tool
  • Learn about Maven's project structure and configuration
  • Learn about Maven's dependency management and plug-ins
The Apache Hadoop Cookbook
  • Explore the Apache Hadoop open-source software framework
  • Learn distributed caching and streaming
  • Kick-start your own web projects using Apache Hadoop
The Android Programming Cookbook
  • Explore the Android mobile operating system
  • Learn about services and page views
  • Learn about Google Maps and Bluetooth functionality
The Elasticsearch Tutorial
  • Explore the Elasticsearch search engine
  • Develop your own Elasticsearch based applications
  • Learn operations, Java API Integration and reporting
Amazon DynamoDB Tutorial
  • Develop your own Amazon DynamoDB based applications
  • Learn DynamoDB Concepts and Best Practices
  • Get your own projects up and running in minimum time
Java NIO Programming Cookbook
  • Learn features for intensive I/O operations
  • Follow a series of tutorials on Java NIO examples
  • Get knowledge on Java Nio Socket and Asynchronous Channels
JBoss Drools Cookbook
  • Explore Drools business rule management system
  • Follow a series of tutorials on Drools examples
  • Get knowledge on business rules for a shopping domain model
Vaadin Programming Cookbook
  • Explore Vaadin web framework for rich Internet applications
  • Learn the Architecture and Best Practices
  • Get knowledge on Data Binding and Custom Components
Groovy Programming Cookbook
  • Explore Apache Groovy object-oriented programming language
  • Create sample applications and explore interview questions
  • Get knowledge on Callback functionality and various widgets
GWT Programming Cookbook
  • Explore the open source Google Web Toolkit
  • Create sample applications and explore interview questions
  • Create and maintain complex JavaScript front-end applications in Java
Do you want to know how to develop your skillset to become a Java Rockstar?
Subscribe to our newsletter to start Rocking right now!
To get you started we give you our best selling eBooks for FREE!
1. JPA Mini Book
2. JVM Troubleshooting Guide
3. JUnit Tutorial for Unit Testing
4. Java Annotations Tutorial
5. Java Interview Questions
and many more ....
Email address:
Do you want to know how to develop your skillset to become a Java Rockstar?
Subscribe to our newsletter to start Rocking right now!
To get you started we give you our best selling eBooks for FREE!
1. JPA Mini Book
2. JVM Troubleshooting Guide
3. JUnit Tutorial for Unit Testing
4. Java Annotations Tutorial
5. Java Interview Questions
and many more ....
Email address:
Do you want to know how to develop your skillset to become a Java Rockstar?
Subscribe to our newsletter to start Rocking right now!
To get you started we give you our best selling eBooks for FREE!
1. JPA Mini Book
2. JVM Troubleshooting Guide
3. JUnit Tutorial for Unit Testing
4. Java Annotations Tutorial
5. Java Interview Questions
6. Spring Interview Questions
7. Android UI Design
and many more ....
Email address:
Want to be a DynamoDB Master ?
Subscribe to our newsletter and download the Amazon DynamoDB Tutorial right now!
In order to help you master this Amazon NoSQL database service, we have compiled a kick-ass guide with all the major DynamoDB features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Programming Interview Coming Up?
Subscribe to our newsletter and download the Ultimate Multithreading and Concurrency interview questions and answers collection right now!
In order to get you prepared for your next Programming Interview, we have compiled a huge list of relevant Questions and their respective Answers. Besides studying them online you may download the eBook in PDF format!
Email address:
Java Interview Coming Up?
Subscribe to our newsletter and download the Ultimate Spring interview questions and answers collection right now!
In order to get you prepared for your next Java Interview, we have compiled a huge list of relevant Questions and their respective Answers. Besides studying them online you may download the eBook in PDF format!
Email address:
Java Interview Coming Up?
Subscribe to our newsletter and download the Ultimate Java interview questions and answers collection right now!
In order to get you prepared for your next Java Interview, we have compiled a huge list of relevant Questions and their respective Answers. Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a Java NIO Master ?
Subscribe to our newsletter and download the Java NIO Programming Cookbook right now!
In order to help you master Java NIO Library, we have compiled a kick-ass guide with all the major Java NIO features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a Drools Master ?
Subscribe to our newsletter and download the JBoss Drools Cookbook right now!
In order to help you master Drools Business Rule Management System, we have compiled a kick-ass guide with all the major Drools features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be an iText Master ?
Subscribe to our newsletter and download the iText Tutorial right now!
In order to help you master iText Library, we have compiled a kick-ass guide with all the major iText features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be an Elasticsearch Master ?
Subscribe to our newsletter and download the Elasticsearch Tutorial right now!
In order to help you master Elasticsearch search engine, we have compiled a kick-ass guide with all the major Elasticsearch features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a Scala Master ?
Subscribe to our newsletter and download the Scala Cookbook right now!
In order to help you master Scala, we have compiled a kick-ass guide with all the basic concepts! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a JUnit Master ?
Subscribe to our newsletter and download the JUnit Programming Cookbook right now!
In order to help you master unit testing with JUnit, we have compiled a kick-ass guide with all the major JUnit features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to master Amazon Web Services ?
Subscribe to our newsletter and download the Amazon S3 Tutorial right now!
In order to help you master the leading Web Services platform, we have compiled a kick-ass guide with all its major features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to master Spring Framework ?
Subscribe to our newsletter and download the Spring Framework Cookbook right now!
In order to help you master the leading and innovative Java framework, we have compiled a kick-ass guide with all its major features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to master Eclipse IDE ?
Subscribe to our newsletter and download the Eclipse IDE Handbook right now!
In order to help you master Eclipse, we have compiled a kick-ass guide with all the basic features of the popular IDE! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to master IntelliJ IDEA ?
Subscribe to our newsletter and download the IntelliJ IDEA Handbook right now!
In order to help you master IntelliJ IDEA, we have compiled a kick-ass guide with all the basic features of the popular IDE! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to master Docker ?
Subscribe to our newsletter and download the Docker Containerization Cookbook right now!
In order to help you master Docker, we have compiled a kick-ass guide with all the basic concepts of the Docker container system! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to create a kick-ass Android App ?
Subscribe to our newsletter and download the Android Programming Cookbook right now!
With this book, you will delve into the fundamentals of Android programming. You will understand user input, views and layouts. Furthermore, you will learn how to communicate over Bluetooth and also leverage Google Maps into your application!
Email address:
Want to be a GIT Master ?
Subscribe to our newsletter and download the GIT Tutorial eBook right now!
In order to help you master GIT, we have compiled a kick-ass guide with all the basic concepts of the GIT version control system! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a Hadoop Master ?
Subscribe to our newsletter and download the Apache Hadoop Cookbook right now!
In order to help you master Apache Hadoop, we have compiled a kick-ass guide with all the basic concepts of a Hadoop cluster! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to master Spring Data ?
Subscribe to our newsletter and download the Spring Data Ultimate Guide right now!
In order to help you master Spring Data, we have compiled a kick-ass guide with all the major features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to create a kick-ass Android App ?
Subscribe to our newsletter and download the Android UI Design mini-book right now!
With this book, you will delve into the fundamentals of Android UI design. You will understand user input, views and layouts, as well as adapters and fragments. Furthermore, you will learn how to add multimedia to an app and also leverage themes and styles!
Email address:
Want to be a Java 8 Ninja ?
Subscribe to our newsletter and download the Java 8 Features Ultimate Guide right now!
In order to get you up to speed with the major Java 8 release, we have compiled a kick-ass guide with all the new features and goodies! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to master Java Annotations ?
Subscribe to our newsletter and download the Java Annotations Ultimate Guide right now!
In order to help you master the topic of Annotations, we have compiled a kick-ass guide with all the major features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a JUnit Master ?
Subscribe to our newsletter and download the JUnit Ultimate Guide right now!
In order to help you master unit testing with JUnit, we have compiled a kick-ass guide with all the major JUnit features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to master Java Abstraction ?
Subscribe to our newsletter and download the Abstraction in Java Ultimate Guide right now!
In order to help you master the topic of Abstraction, we have compiled a kick-ass guide with all the major features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to master Java Reflection ?
Subscribe to our newsletter and download the Java Reflection Ultimate Guide right now!
In order to help you master the topic of Reflection, we have compiled a kick-ass guide with all the major features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a JMeter Master ?
Subscribe to our newsletter and download the JMeter Ultimate Guide right now!
In order to help you master load testing with JMeter, we have compiled a kick-ass guide with all the major JMeter features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a Servlets Master ?
Subscribe to our newsletter and download the Java Servlet Ultimate Guide right now!
In order to help you master programming with Java Servlets, we have compiled a kick-ass guide with all the major servlet API uses and showcases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a JAXB Master ?
Subscribe to our newsletter and download the JAXB Ultimate Guide right now!
In order to help you master XML Binding with JAXB, we have compiled a kick-ass guide with all the major JAXB features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a JDBC Master ?
Subscribe to our newsletter and download the JDBC Ultimate Guide right now!
In order to help you master database programming with JDBC, we have compiled a kick-ass guide with all the major JDBC features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a JPA Master ?
Subscribe to our newsletter and download the JPA Ultimate Guide right now!
In order to help you master programming with JPA, we have compiled a kick-ass guide with all the major JPA features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a Hibernate Master ?
Subscribe to our newsletter and download the Hibernate Ultimate Guide right now!
In order to help you master JPA and database programming with Hibernate, we have compiled a kick-ass guide with all the major Hibernate features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a JSF Master ?
Subscribe to our newsletter and download the JSF 2.0 Programming Cookbook right now!
In order to get you prepared for your JSF development needs, we have compiled numerous recipes to help you kick-start your projects. Besides reading them online you may download the eBook in PDF format!
Email address:
Want to be a Java Master ?
Subscribe to our newsletter and download the Advanced Java Guide right now!
In order to help you master the Java programming language, we have compiled a kick-ass guide with all the must-know advanced Java features! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a Java Master ?
Subscribe to our newsletter and download the Java Design Patterns right now!
In order to help you master the Java programming language, we have compiled a kick-ass guide with all the must-know Design Patterns for Java! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be a Hadoop Master ?
Subscribe to our newsletter and download the Hadoop Tutorial right now!
In order to help you master Apache Hadoop, we have compiled a kick-ass guide with all the basic concepts of a Hadoop cluster! Besides studying them online you may download the eBook in PDF format!
Email address:
Want to be an Elastic Beanstalk Master ?
Subscribe to our newsletter and download the Amazon Elastic Beanstalk Tutorial right now!
In order to help you master AWS Elastic Beanstalk, we have compiled a kick-ass guide with all the major Elastic Beanstalk features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Amazon Elastic Beanstalk Tutorial
  • Develop your own Amazon Elastic Beanstalk based applications
  • Learn Java Integration and Command Line Interfacing
  • Get your own projects up and running in minimum time
Want to take your Java skills to the next level?
Grab our programming books for FREE!
Here are some of the eBooks you will get:
  • Spring Interview QnA
  • Multithreading & Concurrency QnA
  • JPA Minibook
  • JVM Troubleshooting Guide
  • Advanced Java
  • Java Interview QnA
  • Java Design Patterns
Insiders are already enjoying weekly updates and complimentary whitepapers!
Join them now to gain exclusive access to the latest news in the Java world, as well as insights about Android, Scala, Groovy and other related technologies.
Email address:
Want to be an ActiveMQ Master ?
Subscribe to our newsletter and download the Apache ActiveMQ Cookbook right now!
In order to help you master Apache ActiveMQ JMS, we have compiled a kick-ass guide with all the major ActiveMQ features and use cases! Besides studying them online you may download the eBook in PDF format!
Email address:
Apache ActiveMQ Cookbook
  • Explore Apache ActiveMQ Best Practices
  • Learn ActiveMQ Load Balancing and File Transfer
  • Develop your own Apache ActiveMQ projects