Minimum Window Substring
Ximin Lin Architect

Minimum Window Substring #hard

Given two strings s and t of lengths m and n respectively, return the minimum window substring of s such that every character in t (including duplicates) is included in the window. If there is no such substring, return the empty string "".

The testcases will be generated such that the answer is unique.

A substring is a contiguous sequence of characters within the string.

Example 1:

Input: s = “ADOBECODEBANC”, t = “ABC”
Output: “BANC”
Explanation: The minimum window substring “BANC” includes ‘A’, ‘B’, and ‘C’ from string t.

Example 2:

Input: s = “a”, t = “a”
Output: “a”
Explanation: The entire string s is the minimum window.

Example 3:

Input: s = “a”, t = “aa”
Output: “”
Explanation: Both ‘a’s from t must be included in the window.
Since the largest window of s only has one ‘a’, return empty string.

Constraints:

  • m == s.length
  • n == t.length
  • 1 <= m, n <= 105
  • s and t consist of uppercase and lowercase English letters.

Follow up: Could you find an algorithm that runs in O(m + n) time?

Solution:

  1. 首先是对比 s 是不是包含 t 的方式,我们用 hashmap 里面存着 wordcount。另外有 total counter 来记录是不是都 include 了
  2. ==sliding window 的办法来选取最小的 window,two pointers i, j:==
    1. s[i:j] 不满足条件时,j++;
    2. ==当 s[i:j]满足条件时,i++ 直到刚刚不满足条件;==

Regular Sliding Window

O(S+T)/O(S+T)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
def minWindow(self, s, t):
"""
:type s: str
:type t: str
:rtype: str
"""

if not t or not s:
return ""

# Dictionary which keeps a count of all the unique characters in t.
dict_t = Counter(t)

# Number of unique characters in t, which need to be present in the desired window.
required = len(dict_t)

# left and right pointer
l, r = 0, 0

# formed is used to keep track of how many unique characters in t are present in the current window in its desired frequency.
# e.g. if t is "AABC" then the window must have two A's, one B and one C. Thus formed would be = 3 when all these conditions are met.
formed = 0

# Dictionary which keeps a count of all the unique characters in the current window.
window_counts = {}

# ans tuple of the form (window length, left, right)
ans = float("inf"), None, None

while r < len(s):

# Add one character from the right to the window
character = s[r]
window_counts[character] = window_counts.get(character, 0) + 1

# If the frequency of the current character added equals to the desired count in t then increment the formed count by 1.
if character in dict_t and window_counts[character] == dict_t[character]:
formed += 1

# Try and contract the window till the point where it ceases to be 'desirable'.
while l <= r and formed == required:
character = s[l]

# Save the smallest window until now.
if r - l + 1 < ans[0]:
ans = (r - l + 1, l, r)

# The character at the position pointed by the `left` pointer is no longer a part of the window.
window_counts[character] -= 1
if character in dict_t and window_counts[character] < dict_t[character]:
formed -= 1

# Move the left pointer ahead, this would help to look for a new window.
l += 1

# Keep expanding the window once we are done contracting.
r += 1
return "" if ans[0] == float("inf") else s[ans[1] : ans[2] + 1]

==Optimized Sliding Window (build char2idx map)==

O(2*filtered_S + S +T)/O(S + T)

==对于 s 很长且很多 char 不在 t 中,构建 filtered_S, filtered_S 里面只包含了 t 中出现的 chars,之后都一样了,之后 iterate filtered_S 就更快了==

1
2
3
S = "ABCDDDDDDEEAFFBC" T = "ABC"
filtered_S = [(0, 'A'), (1, 'B'), (2, 'C'), (11, 'A'), (14, 'B'), (15, 'C')]
Here (0, 'A') means in string S character A is at index 0.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
def minWindow(self, s, t):
"""
:type s: str
:type t: str
:rtype: str
"""
if not t or not s:
return ""

dict_t = Counter(t)

required = len(dict_t)

# Filter all the characters from s into a new list along with their index.
# The filtering criteria is that the character should be present in t.
filtered_s = []
for i, char in enumerate(s):
if char in dict_t:
filtered_s.append((i, char))

l, r = 0, 0
formed = 0
window_counts = {}

ans = float("inf"), None, None

# Look for the characters only in the filtered list instead of entire s. This helps to reduce our search.
# Hence, we follow the sliding window approach on as small list.
while r < len(filtered_s):
character = filtered_s[r][1]
window_counts[character] = window_counts.get(character, 0) + 1

if window_counts[character] == dict_t[character]:
formed += 1

# If the current window has all the characters in desired frequencies i.e. t is present in the window
while l <= r and formed == required:
character = filtered_s[l][1]

# Save the smallest window until now.
end = filtered_s[r][0]
start = filtered_s[l][0]
if end - start + 1 < ans[0]:
ans = (end - start + 1, start, end)

window_counts[character] -= 1
if window_counts[character] < dict_t[character]:
formed -= 1
l += 1

r += 1
return "" if ans[0] == float("inf") else s[ans[1] : ans[2] + 1]
 Comments