Word Break Problem Java

 

Given a string and set of dictionary words, determine if string can be segmented into a space-separated sequence of one or more dictionary words.

 

Example:

Input String : “ILOVEANDROIDSRC”

Dictionary Words: {“A”, “AN”, “AND”, “DROID”, “ANDROID”,”I”, “LOVE”, “SRC”}

Output: true (As it can be separated as “I LOVE ANDROID SRC” or “I LOVE AN DROID SRC”)

For finding if it is possible to separate given string, we look at each prefix and search it in dictionary . If current prefix is found in dictionary, we repeat for rest of the substring. If we get true for rest string as well, answer is yes, or else we check for next prefix. If we have finished checking all prefixes and none of them resulted in a solution, result is  false.

Let’s consider above example “ILOVEANDROIDSRC”

We will first check for “I”, it is present in dictionary. We will check for “LOVEANDROIDSRC”. While processing once we get that “LOVE” is  a dictionary word. We process for rest of it “ANDROISRC” and so on and finally return result.

Above process is of checking dictionary word is repeated again. It has overlapping sub-problems. Lets understand with example of “android”.

word_break

From above partially completed recursion tree for checking prefixes, it can be observed that it is checking again for same yellow marked words in dictionary. We can improve it using memoization, we will store result in an array after one lookup in dictionary and will avoid repetition in checking.

Complete Source Code for Word Break Problem in Java

import java.util.Arrays;
import java.util.HashSet;

public class WordBreakProblem {

    public static void main(String[] args) {
        // TODO Auto-generated method stub
        String[] dictionary = {"A", "AN", "AND", "DROID", "ANDROID", "I", "LOVE", "SRC"};
        String inputString = "ILOVEANDROIDSRC";
        System.out.println(isWordBreakPossible(inputString, new HashSet<>(Arrays.asList(dictionary))));

    }

    private static boolean isWordBreakPossible(String inputString, HashSet<String> dict) {

        int inputLength = inputString.length();

        //memoization table where value at index i represents
        //that if string[0..i-1] can be segmented or not
        boolean[] isBreakPossible = new boolean[inputLength + 1];
        //Possible to break with blank "" will always be true, it will serve as base case
        isBreakPossible[0] = true;

        for (int i = 1; i <= inputLength; i++) {

            if (isBreakPossible[i] == false && dict.contains(inputString.substring(0, i)))
                isBreakPossible[i] = true;
            //We have detected subString[0..i] to be in dictionary. Now we will check for remaining string	

            if (isBreakPossible[i]) {
                //if subString detected in dictionary is of length equal to inputString. return true
                if (i == inputLength)
                    return true;
                else {
                    //process for substring[i+1..inputLength]
                    for (int j = i + 1; j <= inputLength; j++) {
                        if (isBreakPossible[j] == false && dict.contains(inputString.substring(i, j)))
                            isBreakPossible[j] = true;

                        //if we are done processing for rest string i.e. j == inputLength
                        //and there exist a prefix in dictionary ending at j
                        if (j == inputLength && isBreakPossible[j])
                            return true;
                    }
                }
            }

        }
        //we have looked into all the prefixes, no match return false
        return false;
    }

}

You may also like...