Skip to content

avoid allocations by avoiding **kwargs #43

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 12, 2020

Conversation

dhalbert
Copy link
Contributor

@dhalbert dhalbert commented Feb 11, 2020

  • Avoid using *args and **kwargs, which allocate storage.
  • Do hasattr() once and remember its value.

There is a remarkable timing peculiarity when doing this right now, using this test program, running on a CPB:

import time
import adafruit_lis3dh
from adafruit_circuitplayground import cp

def time_it(f, n):
    heap_before = gc.mem_free()
    s = time.monotonic()
    for _ in range(n):
        f()
    heap_after = gc.mem_free()
    return "secs: {}, heap: {}".format(time.monotonic() - s, heap_before - heap_after)

def read_reg():
    cp._lis3dh._read_register(0x28 | 0x80, 6)

def read_reg_then_unpack():
    struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6))

for n in (1,40,50,400,1000):
    gc.collect()
    print(n, "cp._lis3dh._read_register:",
          time_it(read_reg, n))
    gc.collect()
    print(n, "struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)):",
          time_it(read_reg_then_unpack, n))
    print()

Test program output _ before_ applying this PR is below. @tannewt Notice the bizarre result for the 400 and 1000 cases above, where simply reading the register is slower than reading the register and then unpacking the result. This may be ameliorated by your proposed heap allocation improvements.

1 cp._lis3dh._read_register: secs: 0.00976563, heap: 96
1 struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)): secs: 0.00976563, heap: 128

40 cp._lis3dh._read_register: secs: 0.0664063, heap: 3840
40 struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)): secs: 0.0644531, heap: 5120

50 cp._lis3dh._read_register: secs: 0.0800781, heap: 4800
50 struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)): secs: 0.0820313, heap: 6400

400 cp._lis3dh._read_register: secs: 0.763672, heap: 38400
400 struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)): secs: 0.587891, heap: 51200

1000 cp._lis3dh._read_register: secs: 2.54102, heap: 96000
1000 struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)): secs: 1.45508, heap: 128000

Test program output with this PR:

1 cp._lis3dh._read_register: secs: 0.0117188, heap: 0
1 struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)): secs: 0.0078125, heap: 32

40 cp._lis3dh._read_register: secs: 0.078125, heap: 0
40 struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)): secs: 0.0898438, heap: 1280

50 cp._lis3dh._read_register: secs: 0.0820313, heap: 0
50 struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)): secs: 0.109375, heap: 1600

400 cp._lis3dh._read_register: secs: 0.621094, heap: 0
400 struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)): secs: 1.04688, heap: 12800

1000 cp._lis3dh._read_register: secs: 1.53516, heap: 0
1000 struct.unpack('<hhh', cp._lis3dh._read_register(0x28 | 0x80, 6)): secs: 3.47656, heap: 32000

@dhalbert dhalbert requested a review from tannewt February 11, 2020 18:01
@tannewt
Copy link
Member

tannewt commented Feb 11, 2020

Why is the struct version slower with this change?

@dhalbert
Copy link
Contributor Author

Why is the struct version slower with this change?

That is the question!

Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Definitely reduces the number of allocations in the test.

@tannewt tannewt merged commit 2d90646 into adafruit:master Feb 12, 2020
@tannewt
Copy link
Member

tannewt commented Feb 12, 2020

For those only following this conversation. The struct version is slower because there no longer single block allocations between tuple allocations, which are 2+ blocks. 2+ block allocations can get very expensive when no single block allocations occur because only single block allocations advance the start location of the allocation scan.

@dhalbert dhalbert deleted the avoid-alloc branch February 12, 2020 19:35
adafruit-adabot added a commit to adafruit/Adafruit_CircuitPython_Bundle that referenced this pull request Feb 18, 2020
Updating https://github.com/adafruit/Adafruit_CircuitPython_ADS1x15 to 2.1.3 from 2.1.2:
  > Merge pull request adafruit/Adafruit_CircuitPython_ADS1x15#52 from ares-est/master

Updating https://github.com/adafruit/Adafruit_CircuitPython_CLUE to 2.0.3 from 2.0.1:
  > Merge pull request adafruit/Adafruit_CircuitPython_CLUE#17 from kattni/slideshow-fix
  > Merge pull request adafruit/Adafruit_CircuitPython_CLUE#14 from dherrada/master
  > Merge pull request adafruit/Adafruit_CircuitPython_CLUE#16 from kattni/slideshow-example
  > Merge pull request adafruit/Adafruit_CircuitPython_CLUE#15 from kattni/add-display-object
  > Merge pull request adafruit/Adafruit_CircuitPython_CLUE#12 from kattni/add-space
  > Merge pull request adafruit/Adafruit_CircuitPython_CLUE#11 from kattni/variable-change
  > Merge pull request adafruit/Adafruit_CircuitPython_CLUE#10 from kattni/level-bubble-fix
  > Merge pull request adafruit/Adafruit_CircuitPython_CLUE#9 from kattni/color-fix
  > Merge pull request adafruit/Adafruit_CircuitPython_CLUE#8 from kattni/example-update

Updating https://github.com/adafruit/Adafruit_CircuitPython_Fingerprint to 1.1.6 from 1.1.5:
  > Merge pull request adafruit/Adafruit_CircuitPython_Fingerprint#11 from stitchesnburns/stitchesnburns-patch-1

Updating https://github.com/adafruit/Adafruit_CircuitPython_LIS2MDL to 2.0.2 from 2.0.1:
  > Update README.rst

Updating https://github.com/adafruit/Adafruit_CircuitPython_LSM303_Accel to 1.0.3 from 1.0.2:
  > Merge pull request adafruit/Adafruit_CircuitPython_LSM303_Accel#5 from BiffoBear/rename-example-file
  > Merge pull request adafruit/Adafruit_CircuitPython_LSM303_Accel#4 from FoamyGuy/master

Updating https://github.com/adafruit/Adafruit_CircuitPython_LSM303DLH_Mag to 1.0.4 from 1.0.3:
  > Merge pull request adafruit/Adafruit_CircuitPython_LSM303DLH_Mag#5 from FoamyGuy/master

Updating https://github.com/adafruit/Adafruit_CircuitPython_MCP230xx to 2.2.3 from 2.2.2:
  > Merge pull request adafruit/Adafruit_CircuitPython_MCP230xx#21 from foozmeat/master

Updating https://github.com/adafruit/Adafruit_CircuitPython_MPU6050 to 1.0.4 from 1.0.3:
  > Merge pull request adafruit/Adafruit_CircuitPython_MPU6050#5 from FoamyGuy/master

Updating https://github.com/adafruit/Adafruit_CircuitPython_PyPortal to 3.1.10 from 3.1.9:
  > Merge pull request adafruit/Adafruit_CircuitPython_PyPortal#65 from cogliano/master

Updating https://github.com/adafruit/Adafruit_CircuitPython_RGB_Display to 3.8.9 from 3.8.8:
  > Merge pull request adafruit/Adafruit_CircuitPython_RGB_Display#70 from makermelissa/master

Updating https://github.com/adafruit/Adafruit_CircuitPython_BluefruitConnect to 1.0.11 from 1.0.10:
  > Merge pull request adafruit/Adafruit_CircuitPython_BluefruitConnect#17 from caternuson/controller_example

Updating https://github.com/adafruit/Adafruit_CircuitPython_BusDevice to 4.2.0 from 4.1.4:
  > Merge pull request adafruit/Adafruit_CircuitPython_BusDevice#43 from dhalbert/avoid-alloc

Updating https://github.com/adafruit/Adafruit_CircuitPython_Gizmo to 1.1.3 from 1.1.2:
  > Merge pull request adafruit/Adafruit_CircuitPython_Gizmo#10 from FoamyGuy/master

Updating https://github.com/adafruit/Adafruit_CircuitPython_Bundle/circuitpython_library_list.md to NA from NA:
  > Added the following libraries: Adafruit_CircuitPython_LPS2X
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants