Skip to content

Latest commit

 

History

History
38 lines (20 loc) · 1.27 KB

0187.repeated-dna-sequences.md

File metadata and controls

38 lines (20 loc) · 1.27 KB

0187.Repeated-DNA-Sequences

Description

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. Example 1:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]

Tags: Math, String

题意

所有DNA序列都可以用 A,C,G,T 四个字母表示,比如 "ACGAATTCCG",研究DNA序列时,有时识别重复子串是很有意义的。

请编写一个程序,找到所有长度为10的且出现次数多于1的子串。

题解

思路1

用哈希表记录所有长度是10的子串的个数。 从前往后扫描,当子串出现第二次时,将其记录在答案中。

时间复杂度分析:总共约 nn 个长度是10的子串,所以总共有 10n10n 个字符。计算量与字符数量成正比,所以时间复杂度是 O(n)O(n)。

结语

如果你同我一样热爱数据结构、算法、LeetCode,可以关注我 GitHub 上的 LeetCode 题解:awesome-golang-algorithm