Problem 2416 Sum of Prefix Scores of Strings
Table of Contents
Problem Statement
Link - Problem 2416
Question
You are given an array words of size n consisting of non-empty strings.
We define the score of a string term as the number of strings words[i] such that term is a prefix of words[i].
- For example, if words = ["a", "ab", "abc", "cab"], then the score of"ab"is2, since"ab"is a prefix of both"ab"and"abc".
Return an array answer of size n where answer[i] is the sum of scores of every non-empty prefix of words[i].
Note that a string is considered as a prefix of itself.
Example 1:
Input: words = ["abc","ab","bc","b"] Output: [5,4,3,2] Explanation: The answer for each string is the following: - "abc" has 3 prefixes: "a", "ab", and "abc". - There are 2 strings with the prefix "a", 2 strings with the prefix "ab", and 1 string with the prefix "abc". The total is answer[0] = 2 + 2 + 1 = 5. - "ab" has 2 prefixes: "a" and "ab". - There are 2 strings with the prefix "a", and 2 strings with the prefix "ab". The total is answer[1] = 2 + 2 = 4. - "bc" has 2 prefixes: "b" and "bc". - There are 2 strings with the prefix "b", and 1 string with the prefix "bc". The total is answer[2] = 2 + 1 = 3. - "b" has 1 prefix: "b". - There are 2 strings with the prefix "b". The total is answer[3] = 2.
Example 2:
Input: words = ["abcd"] Output: [4] Explanation: "abcd" has 4 prefixes: "a", "ab", "abc", and "abcd". Each prefix has a score of one, so the total is answer[0] = 1 + 1 + 1 + 1 = 4.
Constraints:
- 1 <= words.length <= 1000
- 1 <= words[i].length <= 1000
- words[i]consists of lowercase English letters.
Solution
class TrieNode{
public:
    TrieNode* child[26];
    int countChild;
    TrieNode(){
        for(int i=0;i<26;i++){
            child[i]=nullptr;
        }
        countChild=0;
    }
};
class Trie{
public:
    TrieNode* root;
    Trie(){
        root=new TrieNode();
    }
    void insert(string word){
        TrieNode* node=root;
        for(auto c:word){
            int index=c-'a';
            if(node->child[index]==nullptr){
                node->child[index]=new TrieNode();
            }
            node=node->child[index];
            node->countChild++;
        }
    }
    int getScore(string word){
        int count=0;
        TrieNode* node=root;
        for(auto c:word){
            int index=c-'a';
            node=node->child[index];
            count+=node->countChild;
        }
        return count;
    }
};
class Solution {
public:
    vector<int> sumPrefixScores(vector<string>& words) {
        vector<int> ans;
        Trie trie;
        for(auto word:words){
            trie.insert(word);
        }
        for(auto word:words){
           int count=trie.getScore(word);
           ans.push_back(count);
        }
        return ans;
    }
};
Complexity Analysis
| Algorithm                       | Time Complexity | Space Complexity |
| ------------------------------- | --------------- | ---------------- |
| Trie construction and traversal | O(nm)           | O(nm)            |
Explanation
Intial Thoughts
To approach this problem, we should think of how strings can be interconnected, such as a tree-like structure where each node represents a character. We should also consider how to efficiently count the occurrences of each prefix. The given words can be seen as paths in this tree, and the score of a string can be determined by how many paths pass through its prefixes. Another key point is that a string is also a prefix of itself, which needs to be accounted for in our solution. We can use a data structure like a Trie to model this structure and calculate the scores. We can iterate over each word, and for each word, we can iterate over its prefixes to calculate the score.
Intuitive Analysis
Intuitively, solving this problem involves understanding the relationship between the given words and their prefixes. We can imagine a library where each word is a book, and the prefixes are like catalog numbers that help us find the books. We need to create a system that can efficiently count how many books (words) each catalog number (prefix) points to. A Trie data structure is perfect for this, as it allows us to visualize the interconnectedness of the words and their prefixes. By inserting each word into the Trie and then traversing the Trie for each word’s prefixes, we can intuitively understand how to calculate the scores. The key is to recognize that the score of a string is essentially the sum of the counts of its prefixes in the Trie.
1. Intuition
- The problem requires calculating the sum of scores of every non-empty prefix of each word in the given array of words.
- A score of a string term is defined as the number of strings that have term as a prefix.
- To efficiently calculate the scores, a Trie data structure can be utilized to store the words and their prefixes.
- The Trie allows for fast lookup and insertion of words, making it suitable for this problem.
- By iterating through each word and its prefixes, the score of each prefix can be calculated using the Trie.
- The sum of scores of all prefixes for each word can then be computed and returned as the result.
- This approach ensures that the solution is both efficient and accurate.
2. Implementation
- The TrieNodeclass is defined to represent each node in the Trie, containing an arraychildto store child nodes and acountChildvariable to keep track of the number of child nodes.
- The Trieclass is implemented with methodsinsertandgetScoreto insert words into the Trie and calculate the score of a given word, respectively.
- In the insertmethod, each character of the word is iterated through and the corresponding child node is created if it does not exist, with thecountChildvariable incremented accordingly.
- The getScoremethod traverses the Trie for a given word, summing up thecountChildvalues of each node to calculate the score.
- The Solutionclass utilizes theTrieclass to solve the problem, first inserting all words into the Trie and then calculating the sum of scores for each word using thegetScoremethod.
- The result is stored in the ansvector and returned as the final output.
- The use of a Trie data structure enables an efficient solution with a time complexity of O(n*m), where n is the number of words and m is the maximum length of a word.
Complexity Analysis
Time Complexity:
- The time complexity is determined by the two nested loops in the sumPrefixScoresmethod: one for inserting words into the Trie and another for calculating the prefix scores. Each word is iterated over once for insertion and once for scoring, resulting in a total of 2n*m operations, where n is the number of words and m is the average length of a word.
- The dominant operation is the traversal of each word’s characters, which occurs in both the insertandgetScoremethods. Theinsertmethod creates new nodes if necessary, and thegetScoremethod accumulates the count of child nodes. Both operations have a time complexity of O(m), and since they are performed for each word, the overall time complexity becomes O(n*m).
- The Big O classification of O(n*m) is justified because the algorithm’s running time grows linearly with the product of the number of words and the average length of a word. This is due to the fact that each word is processed twice (once for insertion and once for scoring), and each character in a word is processed once.
Space Complexity:
- The space complexity is driven by the storage required for the Trie data structure. Each node in the Trie has an array of 26 possible child nodes, which corresponds to the 26 letters of the alphabet.
- The impact of the Trie data structure on space complexity is significant because each unique prefix of a word requires a separate node in the Trie. The number of nodes in the Trie grows with the number of unique prefixes, which can be as high as nmin the worst case (when all words are unique and have different lengths).
- The justification of the space complexity being O(nm)lies in the fact that in the worst-case scenario, every character in every word requires a new node in the Trie, resulting in a total ofnmnodes. This occurs when all words are distinct and have distinct prefixes, maximizing the number of nodes needed to store all words.
Footnote
This question is rated as Hard difficulty.
Hints
What data structure will allow you to efficiently keep track of the score of each prefix?
Use a Trie. Insert all the words into it, and keep a counter at each node that will tell you how many times we have visited each prefix.
Similar Questions:
| Title | URL | Difficulty | 
|---|---|---|
| Design Add and Search Words Data Structure | https://leetcode.com/problems/design-add-and-search-words-data-structure | Medium | 
| Maximum XOR of Two Numbers in an Array | https://leetcode.com/problems/maximum-xor-of-two-numbers-in-an-array | Medium | 
| Map Sum Pairs | https://leetcode.com/problems/map-sum-pairs | Medium |