139 Word Break

https://leetcode.com/problems/word-break/

solution

An n-character string has 2^(n–1) different segmentations (there are n–1 positions between characters, each of which can either be or not be a word boundary).

  • 动态规划

# - 完全背包: dp[i]: s[i-1]是否可被word切割
# - 1维dp: 容量为j的背包,所背的物品价值可以最大为dp[j]
# - 先物品还是先背包:因为不完全背包1维需要从后往前,
# - 背后从前到后还是从后到前

class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        n = len(s)
        dp = [False] * (n + 1)
        dp[0] = True

        for i in range(1, n + 1):
            for j in range(i):
                # 注意是 s[j:i]
                if dp[j] and s[j:i] in wordDict:
                    dp[i] = True
        return dp[-1]

时间复杂度:O(n^2 * k) 空间复杂度:O(n)

  • dfs with memorization

"""
更初始的思路:
def wordBreak(s, wordDict):
    def dfs(start):
        # 如果已经到达字符串的末尾,说明所有字符都被成功拆分
        if start == len(s):
            return True
        
        # 尝试从当前位置开始,找到一个字典中的单词
        for end in range(start + 1, len(s) + 1):
            if s[start:end] in wordDict:
                # 如果找到一个单词,继续递归处理剩余的字符串
                if dfs(end):
                    return True
        
        # 如果没有找到任何有效的拆分,返回 False
        return False
    
    # 从字符串的第一个字符开始递归
    return dfs(0)
"""

class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        memo = {}
        wordSet = set(wordDict)
        return self.dfs(s, wordSet, memo)

    def dfs(self, s, wordSet, memo):
        if s in memo:
            return memo[s]
        if s in wordSet:
            return True
        for i in range(1, len(s)):
            prefix = s[:i]
            if prefix in wordSet and self.dfs(s[i:], wordSet, memo):
                memo[s] = True
                return True
        memo[s] = False
        return False

时间复杂度:O() 空间复杂度:O()

follow up

140. Word Break II

# https://leetcode.com/problems/word-break-ii/solutions/44311/python-easy-to-understand-solution/

class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> List[str]:
        wordDict = set(wordDict)
        path = []
        res  = []
        self.backtrack(s, wordDict, start=0, path=path, res=res)
        return res
		
    def backtrack(self, s, wordDict, start, path, res):
        if start == len(s):
            res.append(' '.join(path))
            return
        
        for end in range(start, len(s)):
            if not s[start:end+1] in wordDict:
                continue

            path.append(s[start:end+1])
            self.backtrack(s, wordDict, end+1, path, res)
            path.pop()
  • DFS枚举每次切词的位置,使用Memorization优化(回溯+动态规划思想),每个字符串只用求一遍

"""
catsanddog,ans 依次变化:先是dog满足s[i:] in wordDict 出现append给ans, 继续出栈ans变为 'sand dog', 继续出栈'cat sand dog'
['sand dog']
['cat sand dog']
['and dog'] # 另一条s[:i]的支线
['cat sand dog', 'cats and dog']  # 支线完成,汇到之前已有结果的总线
"""

class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> List[str]:
        memory = {}
        return self.dfs(s, wordDict, memory)

    def dfs(self, s, wordDict, memory):
        if not s:
            return []

        if s in memory:
            return memory[s]

        ans = []
        if s in wordDict:  # 如果整个s在字符串中, 最后一个传递给ans并返回
            ans.append(s)

        for i in range(1, len(s)):
            if s[:i] in wordDict:
                sub_ans = self.dfs(s[i:], wordDict, memory)
                for item in sub_ans:
                    ans.append(s[:i] + ' ' + item)

        memory[s] = ans
        return ans

时间复杂度:O() 空间复杂度:O()

  • Trie

# https://algo.monster/liteproblems/140

472 Concatenated Words

word segmentation by n-gram

Last updated