Commit e52ef9b

chore: fix get text with tesseract 4.1
With Tesseract 4.1, get_text returns the text without spaces. Fix: add `string.whitespace` to `char_whitelist`.
1 parent daf7d10 commit e52ef9b
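
The effect described in the commit message can be illustrated with the standard library alone. The sketch below is not the project's code: `build_char_whitelist` and `keep_whitelisted` are hypothetical helpers, and the filtering only simulates a strict whitelist. How the whitelist is actually handed to Tesseract is not shown in this commit.

```python
import string


def build_char_whitelist(include_whitespace=True):
    """Build the whitelist in the same order as the patched get_text (hypothetical helper)."""
    whitelist = string.whitespace if include_whitespace else ""
    whitelist += string.digits
    whitelist += string.ascii_lowercase
    whitelist += string.ascii_uppercase
    return whitelist


def keep_whitelisted(text, whitelist):
    """Simulate a strict whitelist: drop every character that is not listed."""
    return "".join(ch for ch in text if ch in whitelist)


old_whitelist = build_char_whitelist(include_whitespace=False)  # before the commit
new_whitelist = build_char_whitelist(include_whitespace=True)   # after the commit

print(repr(keep_whitelisted("Build 42 ready", old_whitelist)))  # 'Build42ready'
print(repr(keep_whitelisted("Build 42 ready", new_whitelist)))  # 'Build 42 ready'
```

Note that `string.whitespace` contains tab, newline, carriage return, vertical tab and form feed in addition to the plain space, so the new whitelist admits all six of those characters.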

1 file changed: +3 -2 lines changed

core/utils/image_utils.py

Lines changed: 3 additions & 2 deletions
@@ -105,7 +105,8 @@ def get_main_color(image_path):
 
     @staticmethod
     def get_text(image_path, use_cv2=True):
-        char_whitelist = string.digits
+        char_whitelist = string.whitespace
+        char_whitelist += string.digits
         char_whitelist += string.ascii_lowercase
         char_whitelist += string.ascii_uppercase
 
@@ -126,7 +127,7 @@ def get_text(image_path, use_cv2=True):
         thresh = cv2.adaptiveThreshold(gray, 255, 1, 1, 11, 2)
 
         # apply some dilation and erosion to join the gaps - change iteration to detect more or less area's
-        thresh = cv2.dilate(thresh, None, iterations=10)
+        thresh = cv2.dilate(thresh, None, iterations=5)
         thresh = cv2.erode(thresh, None, iterations=3)
 
         # Find the contours
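
The commit message does not explain why the dilation count drops from 10 to 5. A rough way to see what the parameter controls: with `None` as the kernel, `cv2.dilate` uses a 3x3 rectangular structuring element, so each iteration grows a blob by about one pixel on every side, and a gap of `g` pixels closes after roughly `g / 2` iterations. The sketch below models this on a single row of pixels in pure Python. `dilate_row`, `count_blobs` and the pixel layout are hypothetical and are not taken from the project, and the following erosion step is not modelled.

```python
def dilate_row(row, iterations):
    """Binary dilation of a 1-D row with a width-3 window, repeated `iterations` times."""
    for _ in range(iterations):
        n = len(row)
        row = [
            1 if any(row[j] for j in range(max(0, i - 1), min(n, i + 2))) else 0
            for i in range(n)
        ]
    return row


def count_blobs(row):
    """Count runs of consecutive 1s, a 1-D stand-in for counting contours."""
    blobs, prev = 0, 0
    for value in row:
        if value and not prev:
            blobs += 1
        prev = value
    return blobs


# Assumed layout: two "words", each made of two strokes 4 px apart,
# separated by a 14 px gap, with 12 px of padding on both ends.
word = [1, 1] + [0] * 4 + [1, 1]
row = [0] * 12 + word + [0] * 14 + word + [0] * 12

print(count_blobs(row))                  # 4 separate strokes
print(count_blobs(dilate_row(row, 5)))   # 2: strokes merge into words, words stay apart
print(count_blobs(dilate_row(row, 10)))  # 1: both words merge into one region
```

Under these assumptions, fewer iterations keep neighbouring regions separate that ten iterations would fuse, which is what the code comment means by detecting "more or less" areas.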
