Commit 0f1b20c

solutions and notes for problem 2976
1 parent 52edb23 commit 0f1b20c

problems/2976/paxtonfitzpatrick.md

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
# [Problem 2976: Minimum Cost to Convert String I](https://leetcode.com/problems/minimum-cost-to-convert-string-i/description/?envType=daily-question)

## Initial thoughts (stream-of-consciousness)

- okay, so this is going to be another shortest path problem. Letters are nodes, corresponding indices in `original` and `changed` are directed edges, and those same indices in `cost` give their weights.
- I was originally thinking I'd want to find all min distances between letters using something similar to yesterday's problem (Floyd-Warshall algorithm), but I actually think it'll be more efficient to figure out which letters we need to convert first and then search just for those. So I think this is calling for Dijkstra's algorithm.
- so I'll loop through `source` and `target`, identify differences, and store source-letter + target-letter pairs.
- if a source letter isn't in `original` or a target letter isn't in `changed`, I can immediately `return -1`
- actually, I think I'll store the source and target letters as a dict where keys are source letters and values are lists (probably actually sets?) of target letters for that source letter. That way if I need to convert some "a" to a "b" and some other "a" to a "c", I can save time by combining those into a single Dijkstra run.
- then I'll run Dijkstra's algorithm starting from each source letter and terminate when I've found paths to all target letters for it.
- I'll write a helper function for Dijkstra's algorithm that takes a source letter and a set of target letters, and returns a list (or some sort of container) of minimum costs to convert that source letter to each of the target letters.

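A rough sketch of what that Dijkstra helper could look like, for illustration only (it is not the approach the solutions below end up using). The name `dijkstra_costs`, the adjacency-dict `graph` layout, and the choice to return a dict are all assumptions made for this example; the edge data is the first example from the problem statement.

```python
import heapq
from typing import Dict, List, Set, Tuple


def dijkstra_costs(graph: Dict[str, List[Tuple[str, int]]], start: str, targets: Set[str]) -> Dict[str, float]:
    """Min cost from `start` to each letter in `targets` (inf if unreachable)."""
    best = {start: 0}
    remaining = set(targets)
    remaining.discard(start)
    heap = [(0, start)]
    done = set()
    # stop early once every target letter has a finalized cost
    while heap and remaining:
        dist, letter = heapq.heappop(heap)
        if letter in done:
            continue  # stale heap entry
        done.add(letter)
        remaining.discard(letter)
        for nxt, weight in graph.get(letter, []):
            new_dist = dist + weight
            if new_dist < best.get(nxt, float('inf')):
                best[nxt] = new_dist
                heapq.heappush(heap, (new_dist, nxt))
    return {t: best.get(t, float('inf')) for t in targets}


# build the graph from the problem's parallel lists
original = ["a", "b", "c", "c", "e", "d"]
changed = ["b", "c", "b", "e", "b", "e"]
cost = [2, 5, 5, 1, 2, 20]
graph: Dict[str, List[Tuple[str, int]]] = {}
for o, ch, c in zip(original, changed, cost):
    graph.setdefault(o, []).append((ch, c))

print(dijkstra_costs(graph, "c", {"b", "e"}))  # 'b' costs 3 (c -> e -> b), 'e' costs 1
```

Grouping targets per source letter means one run from "c" answers both the c-to-b and c-to-e questions.
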
---

- after thinking through how to implement Dijkstra here a bit, I wonder if Floyd-Warshall might actually be more efficient... Floyd-Warshall's runtime scales with the number of nodes, but since nodes here are letters, we know there will always be 26 of them. So that's essentially fixed. Meanwhile Dijkstra's runtime scales with the number of nodes *and* edges, and since the constraints say there can be up to 2,000 edges, we're likely to have a large number of edges relative to the number of nodes. That also means we're much more likely to duplicate operations during different runs of Dijkstra than we would be if the graph were large and sparse. So I think I'll actually try Floyd-Warshall first.
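
A back-of-the-envelope count to go with that reasoning. It assumes the worst case where every one of the 26 letters needs its own Dijkstra run and each run relaxes every edge once; heap overhead is ignored, so the Dijkstra figure is, if anything, an undercount. These are rough step counts, not measured runtimes.

```python
NUM_LETTERS = 26
MAX_EDGES = 2_000  # upper bound on len(original) from the problem constraints

# Floyd-Warshall: three nested loops over the nodes, regardless of edge count
floyd_warshall_steps = NUM_LETTERS ** 3

# Dijkstra from every letter (assumed worst case): each run may relax every edge once
dijkstra_relaxations = NUM_LETTERS * MAX_EDGES

print(floyd_warshall_steps)  # 17576
print(dijkstra_relaxations)  # 52000
```
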

## Refining the problem, round 2 thoughts

- we could reduce the size of the distance matrix for the Floyd-Warshall algorithm by including only the letters in `original` and `changed` instead of all 26. But I doubt this would be worth it on average, since it'd only sometimes reduce the number of nodes in the graph and always incur overhead costs of converting `original` and `changed` to sets, looping over letters and converting them to indices instead of looping over indices directly, etc.
- speaking of which, I'll still have to loop over letters and convert them to indices in order to extract the conversion costs for mismatched letters, and I can think of two ways to do this:
  - store a letters/indices mapping in a `dict`, i.e. `{let: i for i, let in enumerate('abcdefghijklmnopqrstuvwxyz')}` and index it with each letter
  - use `ord(letter)` to get the letter's ASCII value and subtract 97 (ASCII value of "a") to get its index in the alphabet

  Both operations would take constant time, but constructing the `dict` will use a little bit of additional memory so I think I'll go with the latter.
- hmmm actually, if I can just use a dict as the letter/index mapping, that might make reducing the size of the distance matrix worth it. Maybe I'll try that if my first attempt is slow.
- hmmm the problem notes that "*there may exist indices `i`, `j` such that `original[j] == original[i]` and `changed[j] == changed[i]`*". But it's not totally clear to me whether they're (A) simply saying that nodes may appear in both the `original` and `changed` lists multiple times because they can have multiple edges, or (B) saying that ***edges*** may be duplicated, potentially with different `cost` values -- i.e., `(original[j], changed[j]) == (original[i], changed[i])` but `cost[j] != cost[i]`. My guess is that it's the latter because the former seems like a sort of trivial point to make note of, so I'll want to account for this when I initialize the distance matrix.
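
Two of the points above can be checked in a few lines: the `dict` and `ord` mappings give the same indices, and keeping the minimum while filling the matrix handles duplicated edges under interpretation (B). The input lists here are made up for illustration and are not from the problem statement.

```python
import string

# Two equivalent ways to turn a letter into a matrix index
letter_ixs = {let: i for i, let in enumerate(string.ascii_lowercase)}  # dict lookup
assert all(letter_ixs[let] == ord(let) - 97 for let in string.ascii_lowercase)  # ord arithmetic

# Hypothetical input where the edge a -> b appears twice with different costs
original = ['a', 'a', 'b']
changed = ['b', 'b', 'c']
cost = [7, 3, 4]

INF = float('inf')
min_costs = [[INF] * 26 for _ in range(26)]
for orig_let, changed_let, c in zip(original, changed, cost):
    orig_ix, changed_ix = ord(orig_let) - 97, ord(changed_let) - 97
    # keep the cheapest cost seen for each (from, to) pair
    min_costs[orig_ix][changed_ix] = min(min_costs[orig_ix][changed_ix], c)

print(min_costs[0][1])  # 3 (the cheaper of the two a -> b edges)
print(min_costs[1][2])  # 4
```
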

## Attempted solution(s)

```python
from typing import List


class Solution:
    def minimumCost(self, source: str, target: str, original: List[str], changed: List[str], cost: List[int]) -> int:
        # setup min distance/cost matrix
        INF = float('inf')
        min_costs = [[INF] * 26 for _ in range(26)]
        for orig_let, changed_let, c in zip(original, changed, cost):
            orig_ix, changed_ix = ord(orig_let) - 97, ord(changed_let) - 97
            # edges may be duplicated with different costs; keep the cheapest
            if c < min_costs[orig_ix][changed_ix]:
                min_costs[orig_ix][changed_ix] = c
        # run Floyd-Warshall
        for via_ix in range(26):
            for from_ix in range(26):
                for to_ix in range(26):
                    if min_costs[from_ix][via_ix] + min_costs[via_ix][to_ix] < min_costs[from_ix][to_ix]:
                        min_costs[from_ix][to_ix] = min_costs[from_ix][via_ix] + min_costs[via_ix][to_ix]
        # compute total cost to convert source to target
        total_cost = 0
        for src_let, tgt_let in zip(source, target):
            if src_let != tgt_let:
                src_ix, tgt_ix = ord(src_let) - 97, ord(tgt_let) - 97
                if min_costs[src_ix][tgt_ix] == INF:
                    return -1
                total_cost += min_costs[src_ix][tgt_ix]
        return total_cost
```

![](https://github.com/user-attachments/assets/2df1bdf7-8f66-4d28-90f8-12998425b3ba)

Not bad. But I'm curious whether creating a graph from only the letters in `original` and `changed` would be faster. It's a quick edit, so I'll try it. Biggest change will be an additional `return -1` condition in the last loop to handle letters in `source` and `target` that can't be mapped to/from anything.

```python
from typing import List


class Solution:
    def minimumCost(self, source: str, target: str, original: List[str], changed: List[str], cost: List[int]) -> int:
        # setup min distance/cost matrix
        INF = float('inf')
        letters = set(original) | set(changed)
        letters_ixs = {let: i for i, let in enumerate(letters)}
        len_letters = len(letters)
        # square matrix sized to the letters that actually appear in the graph
        min_costs = [[INF] * len_letters for _ in range(len_letters)]
        for orig_let, changed_let, c in zip(original, changed, cost):
            if c < min_costs[letters_ixs[orig_let]][letters_ixs[changed_let]]:
                min_costs[letters_ixs[orig_let]][letters_ixs[changed_let]] = c
        # run Floyd-Warshall
        for via_ix in range(len_letters):
            for from_ix in range(len_letters):
                for to_ix in range(len_letters):
                    if min_costs[from_ix][via_ix] + min_costs[via_ix][to_ix] < min_costs[from_ix][to_ix]:
                        min_costs[from_ix][to_ix] = min_costs[from_ix][via_ix] + min_costs[via_ix][to_ix]
        # compute total cost to convert source to target
        total_cost = 0
        try:
            for src_let, tgt_let in zip(source, target):
                if src_let != tgt_let:
                    if (change_cost := min_costs[letters_ixs[src_let]][letters_ixs[tgt_let]]) == INF:
                        return -1
                    total_cost += change_cost
        except KeyError:
            # a mismatched letter never appears in `original` or `changed`
            return -1
        return total_cost
```

![](https://github.com/user-attachments/assets/263ad81c-900d-40d1-8602-ee5012e4b47e)

Wow, that made a much bigger difference than I expected!
