Efficient Techniques for Eliminating Empty Values and Key-Value Pairs in Your Data

When transforming data in MuleSoft’s Anypoint Platform, JSON payloads often need to be stripped of empty values before they reach downstream systems. This post shows how DataWeave, MuleSoft’s data transformation language, can remove nulls, empty strings, empty arrays, and empty objects from JSON data. It presents two alternative scripts, compares them, and applies the transformation to a sample payload.

Introduction

Removing empty values from JSON data is a specific and frequent task in integration and data transformation work. This post focuses on how to do it with DataWeave, the data transformation language within MuleSoft’s Anypoint Platform.

To tackle this, we’ll build a custom DataWeave function named FilterTree. It works recursively, filtering empty values out of whatever data structure it is given. Two helper functions support it: filterArrayItems, which handles arrays, and filterObjectDetails, which handles objects.

The rest of this post walks through these functions and a more explicit alternative, then applies the transformation to a sample JSON payload.
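Both scripts in this post rely on the core isEmpty function to decide what counts as empty. As a quick, illustrative check (the field names are made up for this example), the following script evaluates isEmpty against a few literal values:

```dataweave
%dw 2.0
output application/json
---
{
  // null, empty strings, empty arrays, and empty objects all count as empty
  nullValue: isEmpty(null),
  emptyString: isEmpty(""),
  emptyArray: isEmpty([]),
  emptyObject: isEmpty({}),
  // a string with content is not empty, and neither is a whitespace-only one
  text: isEmpty("John Doe"),
  whitespace: isEmpty(" ")
}
```

The first four fields evaluate to true and the last two to false. Whitespace-only strings are therefore kept by the scripts below; the core isBlank function is an option if those should be treated as empty too.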

Two versions of the script are presented below.

CODE-1:

%dw 2.0
output application/json skipNullOn="everywhere"

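// Entry point: dispatches on the type of the input.
fun ExtractRequiredFields(inBound) =
  inBound match {
    case is Array -> removeEmptyDataFromArray(inBound)
    case is Object -> removeEmptyDataFromObject(inBound)
    // Scalars pass through unless they are empty.
    else -> if (!isEmpty(inBound)) inBound else null
  }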

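// Cleans an array: drops empty items, then recurses into nested arrays and objects.
// map is used rather than flatMap so that nested arrays keep their nesting.
fun removeEmptyDataFromArray(arr: Array) =
  if (!isEmpty(arr))
    arr filter (!isEmpty($)) map (
      $ match {
        case is Array -> removeEmptyDataFromArray($)
        case is Object -> removeEmptyDataFromObject($)
        else -> if (!isEmpty($)) $ else null
      }
    )
  else null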

fun removeEmptyDataFromObject(inBound: Object) =
  if (!isEmpty(inBound))
    inBound filterObject (!isEmpty($)) mapObject ((value, key, index) ->
      value match {
        case is Array -> (key): removeEmptyDataFromArray(value)
        case is Object -> (key): removeEmptyDataFromObject(value)
        else -> if (!isEmpty(value)) (key): value else null
      }
    ) filterObject (!isEmpty($))
  else null

---
ExtractRequiredFields(payload)

CODE-2:

%dw 2.0
import * from dw::util::Tree
output application/json skipNullOn="everywhere"

fun filterArrayItems(arr: Array) =
  (arr filterTree ((value, path) -> !isEmpty(FilterTree(value)))) filterArrayLeafs ((value, path) ->
    value match {
      case a is Array -> !isEmpty(filterArrayItems(a))
      case a is Object -> !isEmpty(filterObjectDetails(a))
      else -> !isEmpty($)
    }
  )

fun filterObjectDetails(obj: Object) =
  (obj filterTree ((value, path) -> !isEmpty(FilterTree(value)))) filterObjectLeafs ((value, path) ->
    value match {
      case o is Array -> !isEmpty(filterArrayItems(o))
      case o is Object -> !isEmpty(filterObjectDetails(o))
      else -> !isEmpty($)
    }
  )

fun FilterTree(inBound) =
  inBound filterTree ((value, path) ->
    value match {
      case s is Array -> !isEmpty(filterArrayItems(s))
      case s is Object -> !isEmpty(filterObjectDetails(s))
      else -> !isEmpty($)
    }
  )

---
FilterTree(payload)

Both scripts aim at the same result but differ in structure and style. The factors below may influence which version you prefer:

CODE-1 Considerations:

Pros:

  1. Modularity and Readability:
    • Functions like removeEmptyDataFromArray and removeEmptyDataFromObject contribute to a modular structure, enhancing code readability and maintainability.
    • Each function has a clear responsibility, making it easier to understand and modify.
  2. Consistency:
    • Consistent naming conventions make the code more predictable and understandable.
  3. Explicit Handling of null:
    • Checks the input with isEmpty in removeEmptyDataFromArray and removeEmptyDataFromObject before processing it, and explicitly returns null when there is nothing to keep.

Cons:

  1. Complexity:
    • The structure can be considered more complex due to the nesting of functions and conditions.

CODE-2 Considerations:

Pros:

  1. Simplified Structure:
    • Code is structured more linearly, potentially making it easier to follow for some developers.
  2. Utilization of filterTree:
    • Delegates the recursive traversal to filterTree from the dw::util::Tree module instead of walking arrays and objects by hand.

Cons:

  1. Reduced Modularity:
    • Functions are more tightly integrated, potentially making it harder to isolate and modify specific functionalities.
  2. Implicit Handling of null:
    • In some places, the code implicitly assumes that certain values won’t be null. This may be less explicit compared to CODE-1.

The preference between CODE-1 and CODE-2 often depends on the development team’s coding standards, the project’s existing codebase, and individual preferences. CODE-1 prioritizes modularity and explicit handling of null, which can be beneficial for larger projects or teams where readability and maintainability are crucial. On the other hand, CODE-2 simplifies the structure, potentially making it more approachable for developers who prefer a more straightforward and less modular approach.

The FilterTree function operates on an input parameter named InBound, utilizing the filterTree function from the dw::util::Tree module. This function employs a lambda expression, creating a new data structure containing values that meet the specified conditions. The lambda expression assesses whether the value is an array, object, or another type, prompting the invocation of an appropriate helper function designed to eliminate empty values.
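To see filterTree on its own, here is a minimal sketch that removes only empty strings. It assumes a DataWeave version that ships filterTree in dw::util::Tree (2.4.0 or later, to the best of my knowledge), and the inline sample data is invented for illustration.

```dataweave
%dw 2.0
import * from dw::util::Tree
output application/json
---
// Illustrative inline input; not taken from the scripts in this post.
{
  name: "",
  city: "Example City",
  tags: ["a", ""]
} filterTree ((value, path) ->
  value match {
    // keep a string only when it has content
    case s is String -> !isEmpty(s)
    // keep every other node (objects, arrays, numbers, and so on)
    else -> true
  }
)
```

With this predicate, the empty name and the empty entry in tags are filtered out, while city and the "a" tag remain. The FilterTree function in this post follows the same pattern but also inspects arrays and objects through its helpers.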

The helper functions, filterArrayItems and filterObjectDetails, mirror the logic of FilterTree. They also use two further functions from the dw::util::Tree module, filterArrayLeafs and filterObjectLeafs. Each of these takes a lambda expression and applies it to leaf values (those without children): filterArrayLeafs to leaf items inside arrays, and filterObjectLeafs to leaf values of object keys. Leaves that fail the test are dropped, and the rest of the structure is returned as it was. The lambda expression within these functions checks whether the value is an array, object, or another type, and calls the relevant helper function to filter out empty values.
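The difference between the two leaf functions is easier to see side by side. The sketch below is only an illustration: the sample variable and its fields are hypothetical, and the predicate is reduced to a plain isEmpty check.

```dataweave
%dw 2.0
import * from dw::util::Tree
output application/json

// Hypothetical sample data, used only for this illustration.
var sample = {
  name: "John Doe",
  nickname: "",
  phones: ["555-0100", ""]
}
---
{
  // tests leaf values that belong to object keys (name, nickname)
  objectLeafs: sample filterObjectLeafs ((value, path) -> !isEmpty(value)),
  // tests leaf values that are array items (the entries of phones)
  arrayLeafs: sample filterArrayLeafs ((value, path) -> !isEmpty(value))
}
```

Going by the module documentation, the first expression should drop nickname and leave phones untouched, while the second should drop the empty entry in phones and leave nickname in place. Chaining one of them after filterTree, as the helpers do, covers the leaves of the corresponding container type.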

The output directive sets the output format to JSON. Its skipNullOn="everywhere" writer property additionally tells the JSON writer to skip null values in both objects and arrays, at every level of the resulting data structure.
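The effect of skipNullOn can be checked in isolation. This small sketch uses invented field names and no custom functions, so whatever is missing from the output was removed by the writer property alone:

```dataweave
%dw 2.0
output application/json skipNullOn="everywhere"
---
{
  name: "John Doe",
  age: null,          // dropped: null value of an object key
  nickname: "",       // kept: an empty string is not null
  tags: ["a", null]   // the null item is dropped from the array
}
```

The writer property only handles nulls. Empty strings, empty arrays, and empty objects still have to be removed by the transformation logic itself, which is what the custom functions are for.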

%dw 2.0
import * from dw::util::Tree
output application/json skipNullOn="everywhere"

// The FilterTree function takes an input parameter 'InBound'
// and applies filterTree to create a new data structure
// containing values satisfying the lambda expression.
fun FilterTree(InBound) =
  InBound filterTree ((value, path) ->
    // The lambda expression checks if the value is an array, object, or anything else
    // and calls the appropriate helper function to filter out empty values.
    value match {
      case s is Array -> !isEmpty(filterArrayItems(s))
      case s is Object -> !isEmpty(filterObjectDetails(s))
      else -> !isEmpty($)
    }
  )

// Helper function for filtering empty values under arrays
fun filterArrayItems(arr: Array) =
  (arr filterTree ((value, path) ->
    // The lambda expression checks if the value is an array, object, or anything else
    // and calls the appropriate helper function to filter out empty values.
    value match {
      case a is Array -> !isEmpty(filterArrayItems(a))
      case a is Object -> !isEmpty(filterObjectDetails(a))
      else -> !isEmpty($)
    }
  )) filterArrayLeafs ((value, path) ->
    // The lambda expression checks if the leaf node is an array, object, or anything else
    // and calls the appropriate helper function to filter out empty values.
    value match {
      case a is Array -> !isEmpty(filterArrayItems(a))
      case a is Object -> !isEmpty(filterObjectDetails(a))
      else -> !isEmpty($)
    }
  )

// Helper function for filtering empty values under objects
fun filterObjectDetails(obj: Object) =
  (obj filterTree ((value, path) ->
    // The lambda expression checks if the value is an array, object, or anything else
    // and calls the appropriate helper function to filter out empty values.
    value match {
      case o is Array -> !isEmpty(filterArrayItems(o))
      case o is Object -> !isEmpty(filterObjectDetails(o))
      else -> !isEmpty($)
    }
  )) filterObjectLeafs ((value, path) ->
    // The lambda expression checks if the leaf node is an array, object, or anything else
    // and calls the appropriate helper function to filter out empty values.
    value match {
      case o is Array -> !isEmpty(filterArrayItems(o))
      case o is Object -> !isEmpty(filterObjectDetails(o))
      else -> !isEmpty($)
    }
  )

---
FilterTree(payload)

The FilterTree function, using the filterTree function from the dw::util::Tree module, processes the input structure, applying a lambda expression. The lambda checks whether the value is an array, object, or other types and calls the appropriate helper function to filter out empty values.

The helper functions, filterArrayItems and filterObjectDetails, use similar logic but utilize filterArrayLeafs and filterObjectLeafs to specifically target leaf nodes.

Consider the following sample input:

{
  "name": "John Doe",
  "age": null,
  "address": {
    "street": "",
    "city": "Example City",
    "zipcode": null
  },
  "contacts": [
    {
      "type": "email",
      "value": "john@example.com"
    },
    {
      "type": "phone",
      "value": ""
    },
    null
  ]
}

Output after applying the provided DataWeave transformation:

{
  "name": "John Doe",
  "address": {
    "city": "Example City"
  },
  "contacts": [
    {
      "type": "email",
      "value": "john@example.com"
    }
  ]
}

In this example, the FilterTree function efficiently removes empty values and null entries, retaining only the non-empty and meaningful data. The resulting JSON structure maintains the original structure while excluding unnecessary or empty elements, demonstrating the effectiveness of the provided DataWeave transformation.

Wrapping Up

In conclusion, the presented DataWeave transformation, centered around the FilterTree function and its associated helpers, showcases a robust methodology for efficiently removing empty values from complex JSON structures within MuleSoft’s Anypoint Platform. By leveraging the power of filterTree and adeptly crafted lambda expressions, the transformation ensures the precision and cleanliness of the output data.

The modular design, exemplified by FilterTree, filterArrayItems, and filterObjectDetails, contributes to code clarity and maintainability. These functions, along with their judicious use of filterArrayLeafs and filterObjectLeafs, demonstrate a thoughtful approach to handling array and object structures, specifically targeting leaf nodes.

The flexibility of the solution is evident in its adaptability to various data structures, making it a versatile tool for developers working with diverse JSON inputs. The output directive further enhances the utility by ensuring the resulting data is presented in a JSON format, with the exclusion of null values at every level.

Java Code Geeks
