Problem 2416 Sum of Prefix Scores of Strings
Table of Contents
Problem Statement
Link - Problem 2416
Question
You are given an array words
of size n
consisting of non-empty strings.
We define the score of a string term
as the number of strings words[i]
such that term
is a prefix of words[i]
.
- For example, if
words = ["a", "ab", "abc", "cab"]
, then the score of"ab"
is2
, since"ab"
is a prefix of both"ab"
and"abc"
.
Return an array answer
of size n
where answer[i]
is the sum of scores of every non-empty prefix of words[i]
.
Note that a string is considered as a prefix of itself.
Example 1:
Input: words = ["abc","ab","bc","b"] Output: [5,4,3,2] Explanation: The answer for each string is the following: - "abc" has 3 prefixes: "a", "ab", and "abc". - There are 2 strings with the prefix "a", 2 strings with the prefix "ab", and 1 string with the prefix "abc". The total is answer[0] = 2 + 2 + 1 = 5. - "ab" has 2 prefixes: "a" and "ab". - There are 2 strings with the prefix "a", and 2 strings with the prefix "ab". The total is answer[1] = 2 + 2 = 4. - "bc" has 2 prefixes: "b" and "bc". - There are 2 strings with the prefix "b", and 1 string with the prefix "bc". The total is answer[2] = 2 + 1 = 3. - "b" has 1 prefix: "b". - There are 2 strings with the prefix "b". The total is answer[3] = 2.
Example 2:
Input: words = ["abcd"] Output: [4] Explanation: "abcd" has 4 prefixes: "a", "ab", "abc", and "abcd". Each prefix has a score of one, so the total is answer[0] = 1 + 1 + 1 + 1 = 4.
Constraints:
1 <= words.length <= 1000
1 <= words[i].length <= 1000
words[i]
consists of lowercase English letters.
Solution
class TrieNode{
public:
TrieNode* child[26];
int countChild;
TrieNode(){
for(int i=0;i<26;i++){
child[i]=nullptr;
}
countChild=0;
}
};
class Trie{
public:
TrieNode* root;
Trie(){
root=new TrieNode();
}
void insert(string word){
TrieNode* node=root;
for(auto c:word){
int index=c-'a';
if(node->child[index]==nullptr){
node->child[index]=new TrieNode();
}
node=node->child[index];
node->countChild++;
}
}
int getScore(string word){
int count=0;
TrieNode* node=root;
for(auto c:word){
int index=c-'a';
node=node->child[index];
count+=node->countChild;
}
return count;
}
};
class Solution {
public:
vector<int> sumPrefixScores(vector<string>& words) {
vector<int> ans;
Trie trie;
for(auto word:words){
trie.insert(word);
}
for(auto word:words){
int count=trie.getScore(word);
ans.push_back(count);
}
return ans;
}
};
Complexity Analysis
| Algorithm | Time Complexity | Space Complexity |
| ------------------------------- | --------------- | ---------------- |
| Trie construction and traversal | O(nm) | O(nm) |
Explanation
Intial Thoughts
To approach this problem, we should think of how strings can be interconnected, such as a tree-like structure where each node represents a character. We should also consider how to efficiently count the occurrences of each prefix. The given words can be seen as paths in this tree, and the score of a string can be determined by how many paths pass through its prefixes. Another key point is that a string is also a prefix of itself, which needs to be accounted for in our solution. We can use a data structure like a Trie to model this structure and calculate the scores. We can iterate over each word, and for each word, we can iterate over its prefixes to calculate the score.
Intuitive Analysis
Intuitively, solving this problem involves understanding the relationship between the given words and their prefixes. We can imagine a library where each word is a book, and the prefixes are like catalog numbers that help us find the books. We need to create a system that can efficiently count how many books (words) each catalog number (prefix) points to. A Trie data structure is perfect for this, as it allows us to visualize the interconnectedness of the words and their prefixes. By inserting each word into the Trie and then traversing the Trie for each word’s prefixes, we can intuitively understand how to calculate the scores. The key is to recognize that the score of a string is essentially the sum of the counts of its prefixes in the Trie.
1. Intuition
- The problem requires calculating the sum of scores of every non-empty prefix of each word in the given array of words.
- A score of a string term is defined as the number of strings that have term as a prefix.
- To efficiently calculate the scores, a Trie data structure can be utilized to store the words and their prefixes.
- The Trie allows for fast lookup and insertion of words, making it suitable for this problem.
- By iterating through each word and its prefixes, the score of each prefix can be calculated using the Trie.
- The sum of scores of all prefixes for each word can then be computed and returned as the result.
- This approach ensures that the solution is both efficient and accurate.
2. Implementation
- The
TrieNode
class is defined to represent each node in the Trie, containing an arraychild
to store child nodes and acountChild
variable to keep track of the number of child nodes. - The
Trie
class is implemented with methodsinsert
andgetScore
to insert words into the Trie and calculate the score of a given word, respectively. - In the
insert
method, each character of the word is iterated through and the corresponding child node is created if it does not exist, with thecountChild
variable incremented accordingly. - The
getScore
method traverses the Trie for a given word, summing up thecountChild
values of each node to calculate the score. - The
Solution
class utilizes theTrie
class to solve the problem, first inserting all words into the Trie and then calculating the sum of scores for each word using thegetScore
method. - The result is stored in the
ans
vector and returned as the final output. - The use of a Trie data structure enables an efficient solution with a time complexity of O(n*m), where n is the number of words and m is the maximum length of a word.
Complexity Analysis
Time Complexity:
- The time complexity is determined by the two nested loops in the
sumPrefixScores
method: one for inserting words into the Trie and another for calculating the prefix scores. Each word is iterated over once for insertion and once for scoring, resulting in a total of 2n*m operations, where n is the number of words and m is the average length of a word. - The dominant operation is the traversal of each word’s characters, which occurs in both the
insert
andgetScore
methods. Theinsert
method creates new nodes if necessary, and thegetScore
method accumulates the count of child nodes. Both operations have a time complexity of O(m), and since they are performed for each word, the overall time complexity becomes O(n*m). - The Big O classification of O(n*m) is justified because the algorithm’s running time grows linearly with the product of the number of words and the average length of a word. This is due to the fact that each word is processed twice (once for insertion and once for scoring), and each character in a word is processed once.
Space Complexity:
- The space complexity is driven by the storage required for the Trie data structure. Each node in the Trie has an array of 26 possible child nodes, which corresponds to the 26 letters of the alphabet.
- The impact of the Trie data structure on space complexity is significant because each unique prefix of a word requires a separate node in the Trie. The number of nodes in the Trie grows with the number of unique prefixes, which can be as high as
nm
in the worst case (when all words are unique and have different lengths). - The justification of the space complexity being
O(nm)
lies in the fact that in the worst-case scenario, every character in every word requires a new node in the Trie, resulting in a total ofnm
nodes. This occurs when all words are distinct and have distinct prefixes, maximizing the number of nodes needed to store all words.
Footnote
This question is rated as Hard difficulty.
Hints
What data structure will allow you to efficiently keep track of the score of each prefix?
Use a Trie. Insert all the words into it, and keep a counter at each node that will tell you how many times we have visited each prefix.
Similar Questions:
Title | URL | Difficulty |
---|---|---|
Design Add and Search Words Data Structure | https://leetcode.com/problems/design-add-and-search-words-data-structure | Medium |
Maximum XOR of Two Numbers in an Array | https://leetcode.com/problems/maximum-xor-of-two-numbers-in-an-array | Medium |
Map Sum Pairs | https://leetcode.com/problems/map-sum-pairs | Medium |