0187.Repeated-DNA-Sequences

Description

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. Example 1:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]

Tags: Math, String

题意

所有DNA序列都可以用 A，C，G，T 四个字母表示，比如 "ACGAATTCCG"，研究DNA序列时，有时识别重复子串是很有意义的。

请编写一个程序，找到所有长度为10的且出现次数多于1的子串。

题解

思路1

用哈希表记录所有长度是10的子串的个数。从前往后扫描，当子串出现第二次时，将其记录在答案中。

时间复杂度分析：总共约 nn 个长度是10的子串，所以总共有 10n10n 个字符。计算量与字符数量成正比，所以时间复杂度是 O(n)O(n)。

结语

如果你同我一样热爱数据结构、算法、LeetCode，可以关注我 GitHub 上的 LeetCode 题解：awesome-golang-algorithm

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

0187.repeated-dna-sequences.md

0187.repeated-dna-sequences.md

0187.Repeated-DNA-Sequences

Description

题意

题解

思路1

结语

Files

0187.repeated-dna-sequences.md

Latest commit

History

0187.repeated-dna-sequences.md

File metadata and controls

0187.Repeated-DNA-Sequences

Description

题意

题解

思路1

结语