This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Commit a2bc2d2

committed Oct 25, 2016
feat(mman): BREAKING API mman as context-manager to release regions
+ Add PY3 compat utilities + doc(changes, tutorial): update on mman usage
1 parent bba086a commit a2bc2d2

7 files changed: +519 -414 lines changed


‎doc/source/changes.rst

Lines changed: 24 additions & 12 deletions
@@ -2,34 +2,46 @@
22
Changelog
33
#########
44

5-
**********
5+
2.1.0
6+
======
7+
8+
* **BREAKING API:** Retrofit ``git.util.mman`` as context-manager,
9+
to release memory-mapped regions held.
10+
11+
The *mmap-manager(s)* are re-entrant, but not thread-safe **context-manager(s)**,
12+
to be used within a ``with ...:`` block, ensuring any left-over cursors are cleaned up.
13+
If not entered, :meth:`StaticWindowMapManager.make_cursor()` and/or
14+
:meth:`WindowCursor.use_region()` will raise ``ValueError``.
15+
16+
Get them from ``smmap.managed_mmaps()``.
17+
618
v0.9.0
7-
**********
19+
========
820
- Fixed issue with resources never being freed as mmaps were never closed.
921
- Client counting is now done manually, instead of relying on Python's reference count
1022

11-
**********
23+
1224
v0.8.5
13-
**********
25+
========
1426
- Fixed Python 3.0-3.3 regression, which also causes smmap to become about 3 times slower depending on the code path. It's related to this bug (http://bugs.python.org/issue15958), which was fixed in python 3.4
1527

16-
**********
28+
1729
v0.8.4
18-
**********
30+
========
1931
- Fixed Python 3 performance regression
2032

21-
**********
33+
2234
v0.8.3
23-
**********
35+
========
2436
- Cleaned up code and assured it works sufficiently well with python 3
2537

26-
**********
38+
2739
v0.8.1
28-
**********
40+
========
2941
- A single bugfix
3042

31-
**********
43+
3244
v0.8.0
33-
**********
45+
========
3446

3547
- Initial Release

‎doc/source/tutorial.rst

Lines changed: 85 additions & 63 deletions
@@ -5,91 +5,111 @@ Usage Guide
55
###########
66
This text briefly introduces you to the basic design decisions and accompanying classes.
77

8-
******
98
Design
10-
******
11-
Per application, there is *MemoryManager* which is held as static instance and used throughout the application. It can be configured to keep your resources within certain limits.
9+
======
10+
Per application, there must be a *MemoryManager* to be used throughout the application.
11+
It can be configured to keep your resources within certain limits.
1212

13-
To access mapped regions, you require a cursor. Cursors point to exactly one file and serve as handles into it. As long as it exists, the respective memory region will remain available.
13+
To access mapped regions, you require a cursor. Cursors point to exactly one file and serve as handles into it.
14+
As long as it exists, the respective memory region will remain available.
15+
16+
For convenience, a buffer implementation is provided which handles cursors and resource allocation
17+
behind its simple buffer like interface.
1418

15-
For convenience, a buffer implementation is provided which handles cursors and resource allocation behind its simple buffer like interface.
1619

17-
***************
1820
Memory Managers
19-
***************
20-
There are two types of memory managers, one uses *static* windows, the other one uses *sliding* windows. A window is a region of a file mapped into memory. Although the names might be somewhat misleading as technically windows are always static, the *sliding* version will allocate relatively small windows whereas the *static* version will always map the whole file.
21+
================
22+
There are two types of memory managers, one uses *static* windows, the other one uses *sliding* windows.
23+
A window is a region of a file mapped into memory. Although the names might be somewhat misleading,
24+
as technically windows are always static, the *sliding* version will allocate relatively small windows
25+
whereas the *static* version will always map the whole file.
26+
27+
The *static* memory-manager does nothing more than keep a client count on the respective memory maps,
28+
which always map the whole file. This permits assumptions that can simplify data access
29+
and increase performance, but reduces compatibility with 32-bit systems and giant files.
30+
31+
The *sliding* memory-manager therefore should be the default manager when preparing an application
32+
for handling huge amounts of data on 32-bit and 64-bit platforms.
2133

22-
The *static* manager does nothing more than keeping a client count on the respective memory maps which always map the whole file, which allows to make some assumptions that can lead to simplified data access and increased performance, but reduces the compatibility to 32 bit systems or giant files.
34+
.. Note::
35+
The *mmap-manager(s)* are re-entrant, but not thread-safe **context-manager(s)**,
36+
to be used within a ``with ...:`` block, ensuring any left-over cursors are cleaned up.
37+
If not entered, :meth:`StaticWindowMapManager.make_cursor()` and/or
38+
:meth:`WindowCursor.use_region()` will raise ``ValueError``.
2339

24-
The *sliding* memory manager therefore should be the default manager when preparing an application for handling huge amounts of data on 32 bit and 64 bit platforms::
40+
41+
Use :func:`smmap.managed_mmaps()` to take care of all this::
2542

2643
import smmap
2744
# This instance should be globally available in your application
2845
# It is configured to be well suited for 32-bit or 64-bit applications.
29-
mman = smmap.SlidingWindowMapManager()
46+
with smmap.managed_mmaps() as mman:
3047
31-
# the manager provides much useful information about its current state
32-
# like the amount of open file handles or the amount of mapped memory
33-
mman.num_file_handles()
34-
mman.mapped_memory_size()
35-
# and many more ...
48+
# the manager provides much useful information about its current state
49+
# like the amount of open file handles or the amount of mapped memory
50+
mman.num_file_handles()
51+
mman.mapped_memory_size()
52+
# and many more ...
3653

3754

3855
Cursors
39-
*******
56+
========
4057
*Cursors* are handles that point onto a window, i.e. a region of a file mapped into memory. From them you may obtain a buffer through which the data of that window can actually be accessed::
4158

4259
import smmap.test.lib
43-
fc = smmap.test.lib.FileCreator(1024*1024*8, "test_file")
44-
45-
# obtain a cursor to access some file.
46-
c = mman.make_cursor(fc.path)
47-
48-
# the cursor is now associated with the file, but not yet usable
49-
assert c.is_associated()
50-
assert not c.is_valid()
51-
52-
# before you can use the cursor, you have to specify a window you want to
53-
# access. The following just says you want as much data as possible starting
54-
# from offset 0.
55-
# To be sure your region could be mapped, query for validity
56-
assert c.use_region().is_valid() # use_region returns self
57-
58-
# once a region was mapped, you must query its dimension regularly
59-
# to assure you don't try to access its buffer out of its bounds
60-
assert c.size()
61-
c.buffer()[0] # first byte
62-
c.buffer()[1:10] # first 9 bytes
63-
c.buffer()[c.size()-1] # last byte
64-
65-
# its recommended not to create big slices when feeding the buffer
66-
# into consumers (e.g. struct or zlib).
67-
# Instead, either give the buffer directly, or use pythons buffer command.
68-
buffer(c.buffer(), 1, 9) # first 9 bytes without copying them
69-
70-
# you can query absolute offsets, and check whether an offset is included
71-
# in the cursor's data.
72-
assert c.ofs_begin() < c.ofs_end()
73-
assert c.includes_ofs(100)
74-
75-
# If you are over out of bounds with one of your region requests, the
76-
# cursor will be come invalid. It cannot be used in that state
77-
assert not c.use_region(fc.size, 100).is_valid()
78-
# map as much as possible after skipping the first 100 bytes
79-
assert c.use_region(100).is_valid()
80-
81-
# You can explicitly free cursor resources by unusing the cursor's region
82-
c.unuse_region()
83-
assert not c.is_valid()
60+
61+
with smmap.managed_mmaps() as mman:
62+
fc = smmap.test.lib.FileCreator(1024*1024*8, "test_file")
63+
64+
# obtain a cursor to access some file.
65+
c = mman.make_cursor(fc.path)
66+
67+
# the cursor is now associated with the file, but not yet usable
68+
assert c.is_associated()
69+
assert not c.is_valid()
70+
71+
# before you can use the cursor, you have to specify a window you want to
72+
# access. The following just says you want as much data as possible starting
73+
# from offset 0.
74+
# To be sure your region could be mapped, query for validity
75+
assert c.use_region().is_valid() # use_region returns self
76+
77+
# once a region was mapped, you must query its dimension regularly
78+
# to assure you don't try to access its buffer out of its bounds
79+
assert c.size()
80+
c.buffer()[0] # first byte
81+
c.buffer()[1:10] # first 9 bytes
82+
c.buffer()[c.size()-1] # last byte
83+
84+
# it's recommended not to create big slices when feeding the buffer
85+
# into consumers (e.g. struct or zlib).
86+
# Instead, either give the buffer directly, or use Python's buffer command.
87+
buffer(c.buffer(), 1, 9) # first 9 bytes without copying them
88+
89+
# you can query absolute offsets, and check whether an offset is included
90+
# in the cursor's data.
91+
assert c.ofs_begin() < c.ofs_end()
92+
assert c.includes_ofs(100)
93+
94+
# If one of your region requests is out of bounds, the
95+
# cursor will become invalid. It cannot be used in that state
96+
assert not c.use_region(fc.size, 100).is_valid()
97+
# map as much as possible after skipping the first 100 bytes
98+
assert c.use_region(100).is_valid()
99+
100+
# You can explicitly free cursor resources by unusing the cursor's region
101+
c.unuse_region()
102+
assert not c.is_valid()
84103
85104

86105
Now you would have to write your algorithms around this interface to properly slide through huge amounts of data.
87106

88107
Alternatively you can use a convenience interface.
89108

90-
*******
109+
110+
========
91111
Buffers
92-
*******
112+
========
93113
To make first use easier, at the expense of performance, there is a Buffer implementation which uses a cursor underneath.
94114

95115
With it, you can access all data in a possibly huge file without having to take care of setting the cursor to different regions yourself::
@@ -112,7 +132,9 @@ With it, you can access all data in a possibly huge file without having to take
112132
113133
# it will stop using resources automatically once it goes out of scope
114134
115-
Disadvantages
116-
*************
117-
Buffers cannot be used in place of strings or maps, hence you have to slice them to have valid input for the sorts of struct and zlib. A slice means a lot of data handling overhead which makes buffers slower compared to using cursors directly.
135+
Disadvantages
136+
--------------
137+
Buffers cannot be used in place of strings or maps, hence you have to slice them to have valid
138+
input for the likes of struct and zlib.
139+
A slice means a lot of data handling overhead which makes buffers slower compared to using cursors directly.
118140
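
The tutorial recommends handing consumers a non-copying view (``buffer(c.buffer(), 1, 9)``) rather than a slice. Python 3 has no ``buffer`` builtin; ``memoryview`` plays that role. The sketch below is a minimal, standard-library-only illustration of the idea using ``mmap`` directly. It does not use smmap, and the helper names ``zero_copy_slice`` and ``demo_slice`` are hypothetical.

```python
import mmap
import os
import tempfile


def zero_copy_slice(path, offset, size):
    """Read file[offset:offset + size] through a memoryview over an mmap."""
    with open(path, 'rb') as fp:
        # length 0 maps the whole file, like the *static* manager does
        with mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            with memoryview(mapped) as whole:
                # slicing a memoryview creates another view, not a copy
                with whole[offset:offset + size] as part:
                    # the only copy happens here, for the requested bytes
                    return part.tobytes()


def demo_slice():
    """Write 256 known bytes to a temp file and read bytes 1..9 back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as fp:
            fp.write(bytes(range(256)))
        return zero_copy_slice(path, 1, 9)
    finally:
        os.remove(path)
```

The views are released before the mapping is closed, because closing an ``mmap`` that still has exported buffers raises ``BufferError``.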

‎smmap/mman.py

Lines changed: 71 additions & 13 deletions
@@ -1,5 +1,10 @@
11
"""Module containing a memory manager which provides a sliding window on a number of memory mapped files"""
2+
from functools import reduce
3+
import logging
4+
import sys
5+
26
from .util import (
7+
PY3,
38
MapWindow,
49
MapRegion,
510
MapRegionList,
@@ -8,15 +13,29 @@
813
buffer,
914
)
1015

11-
import sys
12-
from functools import reduce
1316

14-
__all__ = ["StaticWindowMapManager", "SlidingWindowMapManager", "WindowCursor"]
17+
__all__ = ['managed_mmaps', "StaticWindowMapManager", "SlidingWindowMapManager", "WindowCursor"]
1518
#{ Utilities
16-
19+
log = logging.getLogger(__name__)
1720
#}END utilities
1821

1922

23+
def managed_mmaps():
24+
"""Makes a memory-map context-manager instance for the correct python-version.
25+
26+
:return: either :class:`SlidingWindowMapManager` or :class:`StaticWindowMapManager` (if PY2)
27+
28+
If you want to change the default parameters of these classes, use them directly.
29+
30+
.. Tip::
31+
Use it in a ``with ...:`` block, to free cached (and unused) resources.
32+
33+
"""
34+
mman = SlidingWindowMapManager if PY3 else StaticWindowMapManager
35+
36+
return mman()
37+
38+
2039
class WindowCursor(object):
2140

2241
"""
@@ -25,9 +44,15 @@ class WindowCursor(object):
2544
2645
Cursors should not be created manually, but are instead returned by the SlidingWindowMapManager
2746
28-
**Note:**: The current implementation is suited for static and sliding window managers, but it also means
29-
that it must be suited for the somewhat quite different sliding manager. It could be improved, but
30-
I see no real need to do so."""
47+
.. Tip::
48+
This is a re-entrant, but not thread-safe context-manager, to be used within a ``with ...:`` block,
49+
to ensure any left-over cursors are cleaned up. If not entered, :meth:`use_region()`
50+
will raise ``ValueError``.
51+
52+
.. Note::
53+
The current implementation is suited for static and sliding window managers,
54+
but it also means that it must be suited for the somewhat quite different sliding manager.
55+
It could be improved, but I see no real need to do so."""
3156
__slots__ = (
3257
'_manager', # the manager keeping all file regions
3358
'_rlist', # a regions list with regions for our file
@@ -110,6 +135,10 @@ def use_region(self, offset=0, size=0, flags=0):
110135
111136
**Note:**: The size actually mapped may be smaller than the given size. If that is the case,
112137
either the file has reached its end, or the map was created between two existing regions"""
138+
if self._manager._entered <= 0:
139+
raise ValueError('Context-manager %s not entered for %s!' %
140+
(self._manager, self))
141+
113142
need_region = True
114143
man = self._manager
115144
fsize = self._rlist.file_size()
@@ -243,15 +272,23 @@ class StaticWindowMapManager(object):
243272
These clients would have to use a SlidingWindowMapBuffer to hide this fact.
244273
245274
This type will always use a maximum window size, and optimize certain methods to
246-
accommodate this fact"""
275+
accommodate this fact
276+
277+
.. Tip::
278+
The *memory-managers* are re-entrant, but not thread-safe context-manager(s),
279+
to be used within a ``with ...:`` block, ensuring any left-over cursors are cleaned up.
280+
If not entered, :meth:`make_cursor()` and/or :meth:`WindowCursor.use_region()` will raise ``ValueError``.
281+
282+
"""
247283

248284
__slots__ = [
249-
'_fdict', # mapping of path -> StorageHelper (of some kind
250-
'_window_size', # maximum size of a window
251-
'_max_memory_size', # maximum amount of memory we may allocate
252-
'_max_handle_count', # maximum amount of handles to keep open
253-
'_memory_size', # currently allocated memory size
285+
'_fdict', # mapping of path -> StorageHelper (of some kind)
286+
'_window_size', # maximum size of a window
287+
'_max_memory_size', # maximum amount of memory we may allocate
288+
'_max_handle_count', # maximum amount of handles to keep open
289+
'_memory_size', # currently allocated memory size
254290
'_handle_count', # amount of currently allocated file handles
291+
'_entered', # updated on enter/exit, when 0, `close()`
255292
]
256293

257294
#{ Configuration
@@ -280,6 +317,7 @@ def __init__(self, window_size=0, max_memory_size=0, max_open_handles=sys.maxsiz
280317
self._max_handle_count = max_open_handles
281318
self._memory_size = 0
282319
self._handle_count = 0
320+
self._entered = 0
283321

284322
if window_size < 0:
285323
coeff = 64
@@ -297,6 +335,23 @@ def __init__(self, window_size=0, max_memory_size=0, max_open_handles=sys.maxsiz
297335
self._max_memory_size = coeff * self._MB_in_bytes
298336
# END handle max memory size
299337

338+
def __enter__(self):
339+
assert self._entered >= 0, self._entered
340+
self._entered += 1
341+
342+
return self
343+
344+
def __exit__(self, exc_type, exc_value, traceback):
345+
assert self._entered > 0, self._entered
346+
self._entered -= 1
347+
if self._entered == 0:
348+
left_overs = self.collect()
349+
if left_overs:
350+
log.warning("Cleaned up %s left-over mmap-regions.", left_overs)
351+
352+
def close(self):
353+
self.collect()
354+
300355
#{ Internal Methods
301356

302357
def _collect_lru_region(self, size):
@@ -399,6 +454,9 @@ def make_cursor(self, path_or_fd):
399454
400455
**Note:** Using file descriptors directly is faster once new windows are mapped as it
401456
prevents the file from being opened again just for the purpose of mapping it."""
457+
if self._entered <= 0:
458+
raise ValueError('Context-manager %s not entered!' % self)
459+
402460
regions = self._fdict.get(path_or_fd)
403461
if regions:
404462
assert not regions.collect_closed_regions(), regions.collect_closed_regions()
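
The entry-counting pattern introduced in this file can be sketched in isolation. The class below is a hypothetical stand-in, not smmap's ``StaticWindowMapManager``. It only mirrors the ``_entered`` counter, the ``ValueError`` guard in ``make_cursor()``, and the collect-on-outermost-exit behaviour visible in the diff, and it assumes ``collect()`` returns the number of freed items.

```python
import logging

log = logging.getLogger(__name__)


class CountingManager(object):
    """Hypothetical stand-in for a map manager (not smmap's real class)."""

    def __init__(self):
        self._entered = 0     # nesting depth of active ``with`` blocks
        self._regions = []    # pretend resources handed out to clients

    def __enter__(self):
        self._entered += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._entered -= 1
        if self._entered == 0:
            left_overs = self.collect()
            if left_overs:
                # the count is passed as a logging argument, so the
                # "%s" placeholder is actually filled in
                log.warning("Cleaned up %s left-over regions.", left_overs)

    def collect(self):
        """Release everything still held and return how many were freed."""
        freed = len(self._regions)
        del self._regions[:]
        return freed

    def make_cursor(self, path):
        if self._entered <= 0:
            raise ValueError('Context-manager %s not entered!' % self)
        self._regions.append(path)
        return path


def demo_reentrancy():
    man = CountingManager()
    try:
        man.make_cursor("some_file")
        refused = False
    except ValueError:
        refused = True                        # not entered yet
    with man:
        with man:                             # re-entering is allowed
            man.make_cursor("some_file")
        held_after_inner = len(man._regions)  # inner exit frees nothing
    held_after_outer = len(man._regions)      # outermost exit collects
    return refused, held_after_inner, held_after_outer
```

Because the counter is a plain integer with no locking, this sketch is re-entrant but not thread-safe, as the docstrings in the diff describe.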

‎smmap/test/test_buf.py

Lines changed: 98 additions & 92 deletions
@@ -25,106 +25,112 @@
2525
class TestBuf(TestBase):
2626

2727
def test_basics(self):
28+
# invalid paths fail upon construction
29+
with FileCreator(self.k_window_test_size, "buffer_test") as fc:
30+
with man_optimal:
31+
c = man_optimal.make_cursor(fc.path)
32+
self.assertRaises(ValueError, SlidingWindowMapBuffer, type(c)()) # invalid cursor
33+
self.assertRaises(ValueError, SlidingWindowMapBuffer, c, fc.size) # offset too large
34+
35+
buf = SlidingWindowMapBuffer() # can create uninitialized buffers
36+
assert buf.cursor() is None
37+
38+
# can call end access any time
39+
buf.end_access()
40+
buf.end_access()
41+
assert len(buf) == 0
42+
43+
# begin access can revive it, if the offset is suitable
44+
offset = 100
45+
assert buf.begin_access(c, fc.size) == False
46+
assert buf.begin_access(c, offset) == True
47+
assert len(buf) == fc.size - offset
48+
assert buf.cursor().is_valid()
49+
50+
# empty begin access keeps it valid on the same path, but alters the offset
51+
assert buf.begin_access() == True
52+
assert len(buf) == fc.size
53+
assert buf.cursor().is_valid()
54+
55+
# simple access
56+
with open(fc.path, 'rb') as fp:
57+
data = fp.read()
58+
assert data[offset] == buf[0]
59+
assert data[offset:offset * 2] == buf[0:offset]
60+
61+
# negative indices, partial slices
62+
assert buf[-1] == buf[len(buf) - 1]
63+
assert buf[-10:] == buf[len(buf) - 10:len(buf)]
64+
65+
# end access makes its cursor invalid
66+
buf.end_access()
67+
assert not buf.cursor().is_valid()
68+
assert buf.cursor().is_associated() # but it remains associated
69+
70+
# an empty begin access fixes it up again
71+
assert buf.begin_access() == True and buf.cursor().is_valid()
72+
del(buf) # ends access automatically
73+
del(c)
74+
75+
assert man_optimal.num_file_handles() == 1
76+
77+
def test_performance(self):
78+
# PERFORMANCE
79+
# blast away with random access and a full mapping - we don't want to
80+
# exaggerate the manager's overhead, but measure the buffer overhead
81+
# We do it once with an optimal setting, and with a worse manager which
82+
# will produce small mappings only !
2883
with FileCreator(self.k_window_test_size, "buffer_test") as fc:
29-
30-
# invalid paths fail upon construction
31-
c = man_optimal.make_cursor(fc.path)
32-
self.assertRaises(ValueError, SlidingWindowMapBuffer, type(c)()) # invalid cursor
33-
self.assertRaises(ValueError, SlidingWindowMapBuffer, c, fc.size) # offset too large
34-
35-
buf = SlidingWindowMapBuffer() # can create uninitailized buffers
36-
assert buf.cursor() is None
37-
38-
# can call end access any time
39-
buf.end_access()
40-
buf.end_access()
41-
assert len(buf) == 0
42-
43-
# begin access can revive it, if the offset is suitable
44-
offset = 100
45-
assert buf.begin_access(c, fc.size) == False
46-
assert buf.begin_access(c, offset) == True
47-
assert len(buf) == fc.size - offset
48-
assert buf.cursor().is_valid()
49-
50-
# empty begin access keeps it valid on the same path, but alters the offset
51-
assert buf.begin_access() == True
52-
assert len(buf) == fc.size
53-
assert buf.cursor().is_valid()
54-
55-
# simple access
5684
with open(fc.path, 'rb') as fp:
5785
data = fp.read()
58-
assert data[offset] == buf[0]
59-
assert data[offset:offset * 2] == buf[0:offset]
60-
61-
# negative indices, partial slices
62-
assert buf[-1] == buf[len(buf) - 1]
63-
assert buf[-10:] == buf[len(buf) - 10:len(buf)]
64-
65-
# end access makes its cursor invalid
66-
buf.end_access()
67-
assert not buf.cursor().is_valid()
68-
assert buf.cursor().is_associated() # but it remains associated
69-
70-
# an empty begin access fixes it up again
71-
assert buf.begin_access() == True and buf.cursor().is_valid()
72-
del(buf) # ends access automatically
73-
del(c)
74-
75-
assert man_optimal.num_file_handles() == 1
76-
77-
# PERFORMANCE
78-
# blast away with random access and a full mapping - we don't want to
79-
# exaggerate the manager's overhead, but measure the buffer overhead
80-
# We do it once with an optimal setting, and with a worse manager which
81-
# will produce small mappings only !
86+
8287
max_num_accesses = 100
8388
fd = os.open(fc.path, os.O_RDONLY)
8489
for item in (fc.path, fd):
8590
for manager, man_id in ((man_optimal, 'optimal'),
8691
(man_worst_case, 'worst case'),
8792
(static_man, 'static optimal')):
88-
buf = SlidingWindowMapBuffer(manager.make_cursor(item))
89-
assert manager.num_file_handles() == 1
90-
for access_mode in range(2): # single, multi
91-
num_accesses_left = max_num_accesses
92-
num_bytes = 0
93-
fsize = fc.size
94-
95-
st = time()
96-
buf.begin_access()
97-
while num_accesses_left:
98-
num_accesses_left -= 1
99-
if access_mode: # multi
100-
ofs_start = randint(0, fsize)
101-
ofs_end = randint(ofs_start, fsize)
102-
d = buf[ofs_start:ofs_end]
103-
assert len(d) == ofs_end - ofs_start
104-
assert d == data[ofs_start:ofs_end]
105-
num_bytes += len(d)
106-
del d
107-
else:
108-
pos = randint(0, fsize)
109-
assert buf[pos] == data[pos]
110-
num_bytes += 1
111-
# END handle mode
112-
# END handle num accesses
113-
114-
buf.end_access()
115-
assert manager.num_file_handles()
116-
assert manager.collect()
117-
assert manager.num_file_handles() == 0
118-
elapsed = max(time() - st, 0.001) # prevent zero division errors on windows
119-
mb = float(1000 * 1000)
120-
mode_str = (access_mode and "slice") or "single byte"
121-
print("%s: Made %i random %s accesses to buffer created from %s "
122-
"reading a total of %f mb in %f s (%f mb/s)"
123-
% (man_id, max_num_accesses, mode_str, type(item),
124-
num_bytes / mb, elapsed, (num_bytes / mb) / elapsed),
125-
file=sys.stderr)
126-
# END handle access mode
127-
del buf
128-
# END for each manager
93+
with manager:
94+
buf = SlidingWindowMapBuffer(manager.make_cursor(item))
95+
assert manager.num_file_handles() == 1
96+
for access_mode in range(2): # single, multi
97+
num_accesses_left = max_num_accesses
98+
num_bytes = 0
99+
fsize = fc.size
100+
101+
st = time()
102+
buf.begin_access()
103+
while num_accesses_left:
104+
num_accesses_left -= 1
105+
if access_mode: # multi
106+
ofs_start = randint(0, fsize)
107+
ofs_end = randint(ofs_start, fsize)
108+
d = buf[ofs_start:ofs_end]
109+
assert len(d) == ofs_end - ofs_start
110+
assert d == data[ofs_start:ofs_end]
111+
num_bytes += len(d)
112+
del d
113+
else:
114+
pos = randint(0, fsize)
115+
assert buf[pos] == data[pos]
116+
num_bytes += 1
117+
# END handle mode
118+
# END handle num accesses
119+
120+
buf.end_access()
121+
assert manager.num_file_handles()
122+
assert manager.collect()
123+
assert manager.num_file_handles() == 0
124+
elapsed = max(time() - st, 0.001) # prevent zero division errors on windows
125+
mb = float(1000 * 1000)
126+
mode_str = (access_mode and "slice") or "single byte"
127+
print("%s: Made %i random %s accesses to buffer created from %s "
128+
"reading a total of %f mb in %f s (%f mb/s)"
129+
% (man_id, max_num_accesses, mode_str, type(item),
130+
num_bytes / mb, elapsed, (num_bytes / mb) / elapsed),
131+
file=sys.stderr)
132+
# END handle access mode
133+
del buf
134+
# END for each manager
129135
# END for each input
130136
os.close(fd)
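
The throughput measurement used in ``test_performance`` can be shown on its own. The following is a simplified sketch that times random slice reads on any sliceable object and clamps the elapsed time to avoid division by zero. ``measure_random_slices`` is a hypothetical helper, and plain ``bytes`` stand in for the smmap buffer.

```python
from random import randint
from time import time


def measure_random_slices(data, num_accesses=100):
    """Time random slice reads on ``data``; return (bytes_read, mb_per_s)."""
    fsize = len(data)
    num_bytes = 0
    st = time()
    for _ in range(num_accesses):
        ofs_start = randint(0, fsize)
        ofs_end = randint(ofs_start, fsize)
        d = data[ofs_start:ofs_end]
        assert len(d) == ofs_end - ofs_start
        num_bytes += len(d)
    # a coarse clock can report zero elapsed time, so clamp it
    elapsed = max(time() - st, 0.001)
    mb = float(1000 * 1000)
    return num_bytes, (num_bytes / mb) / elapsed
```

The returned rate depends on the machine and on the random offsets, so it is only useful for comparing managers within a single run.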

‎smmap/test/test_mman.py

Lines changed: 179 additions & 176 deletions
Large diffs are not rendered by default.

‎smmap/test/test_tutorial.py

Lines changed: 58 additions & 57 deletions
@@ -22,60 +22,61 @@ def test_example(self):
2222
import smmap.test.lib
2323
with smmap.test.lib.FileCreator(1024 * 1024 * 8, "test_file") as fc:
2424
# obtain a cursor to access some file.
25-
c = mman.make_cursor(fc.path)
26-
27-
# the cursor is now associated with the file, but not yet usable
28-
assert c.is_associated()
29-
assert not c.is_valid()
30-
31-
# before you can use the cursor, you have to specify a window you want to
32-
# access. The following just says you want as much data as possible starting
33-
# from offset 0.
34-
# To be sure your region could be mapped, query for validity
35-
assert c.use_region().is_valid() # use_region returns self
36-
37-
# once a region was mapped, you must query its dimension regularly
38-
# to assure you don't try to access its buffer out of its bounds
39-
assert c.size()
40-
c.buffer()[0] # first byte
41-
c.buffer()[1:10] # first 9 bytes
42-
c.buffer()[c.size() - 1] # last byte
43-
44-
# its recommended not to create big slices when feeding the buffer
45-
# into consumers (e.g. struct or zlib).
46-
# Instead, either give the buffer directly, or use pythons buffer command.
47-
from smmap.util import buffer
48-
buffer(c.buffer(), 1, 9) # first 9 bytes without copying them
49-
50-
# you can query absolute offsets, and check whether an offset is included
51-
# in the cursor's data.
52-
assert c.ofs_begin() < c.ofs_end()
53-
assert c.includes_ofs(100)
54-
55-
# If you are over out of bounds with one of your region requests, the
56-
# cursor will be come invalid. It cannot be used in that state
57-
assert not c.use_region(fc.size, 100).is_valid()
58-
# map as much as possible after skipping the first 100 bytes
59-
assert c.use_region(100).is_valid()
60-
61-
# You can explicitly free cursor resources by unusing the cursor's region
62-
c.unuse_region()
63-
assert not c.is_valid()
64-
65-
# Buffers
66-
#########
67-
# Create a default buffer which can operate on the whole file
68-
buf = smmap.SlidingWindowMapBuffer(mman.make_cursor(fc.path))
69-
70-
# you can use it right away
71-
assert buf.cursor().is_valid()
72-
73-
buf[0] # access the first byte
74-
buf[-1] # access the last ten bytes on the file
75-
buf[-10:] # access the last ten bytes
76-
77-
# If you want to keep the instance between different accesses, use the
78-
# dedicated methods
79-
buf.end_access()
80-
assert not buf.cursor().is_valid() # you cannot use the buffer anymore
81-
assert buf.begin_access(offset=10) # start using the buffer at an offset
25+
with mman:
26+
c = mman.make_cursor(fc.path)
27+
28+
# the cursor is now associated with the file, but not yet usable
29+
assert c.is_associated()
30+
assert not c.is_valid()
31+
32+
# before you can use the cursor, you have to specify a window you want to
33+
# access. The following just says you want as much data as possible starting
34+
# from offset 0.
35+
# To be sure your region could be mapped, query for validity
36+
assert c.use_region().is_valid() # use_region returns self
37+
38+
# once a region was mapped, you must query its dimension regularly
39+
# to assure you don't try to access its buffer out of its bounds
40+
assert c.size()
41+
c.buffer()[0] # first byte
42+
c.buffer()[1:10] # first 9 bytes
43+
c.buffer()[c.size() - 1] # last byte
44+
45+
# it's recommended not to create big slices when feeding the buffer
45+
# into consumers (e.g. struct or zlib).
46+
# Instead, either give the buffer directly, or use Python's buffer command.
48+
from smmap.util import buffer
49+
buffer(c.buffer(), 1, 9) # first 9 bytes without copying them
50+
51+
# you can query absolute offsets, and check whether an offset is included
52+
# in the cursor's data.
53+
assert c.ofs_begin() < c.ofs_end()
54+
assert c.includes_ofs(100)
55+
56+
# If one of your region requests is out of bounds, the
56+
# cursor will become invalid. It cannot be used in that state
58+
assert not c.use_region(fc.size, 100).is_valid()
59+
# map as much as possible after skipping the first 100 bytes
60+
assert c.use_region(100).is_valid()
61+
62+
# You can explicitly free cursor resources by unusing the cursor's region
63+
c.unuse_region()
64+
assert not c.is_valid()
65+
66+
# Buffers
67+
#########
68+
# Create a default buffer which can operate on the whole file
69+
buf = smmap.SlidingWindowMapBuffer(mman.make_cursor(fc.path))
70+
71+
# you can use it right away
72+
assert buf.cursor().is_valid()
73+
74+
buf[0] # access the first byte
75+
buf[-1] # access the last byte of the file
76+
buf[-10:] # access the last ten bytes
77+
78+
# If you want to keep the instance between different accesses, use the
79+
# dedicated methods
80+
buf.end_access()
81+
assert not buf.cursor().is_valid() # you cannot use the buffer anymore
82+
assert buf.begin_access(offset=10) # start using the buffer at an offset

‎smmap/util.py

Lines changed: 4 additions & 1 deletion
@@ -28,8 +28,11 @@ def buffer(obj, offset, size):
2828
# return obj[offset:offset + size]
2929

3030

31+
PY3 = sys.version_info[0] >= 3
32+
33+
3134
def string_types():
32-
if sys.version_info[0] >= 3:
35+
if PY3:
3336
return str
3437
else:
3538
return basestring # @UndefinedVariable
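
As a rough illustration of how the ``PY3`` flag and ``string_types()`` fit together, the sketch below reproduces both and adds one hypothetical helper, ``is_path_like``, which is not part of smmap. It assumes the intent is to tell a path string apart from a file descriptor, as ``make_cursor(path_or_fd)`` accepts either.

```python
import sys

# computed once at import time instead of on every call
PY3 = sys.version_info[0] >= 3


def string_types():
    """Return the base type that covers text strings on this interpreter."""
    if PY3:
        return str
    return basestring  # noqa: F821 - only defined on Python 2


def is_path_like(value):
    """Hypothetical helper: True for a path string, False for e.g. an fd."""
    return isinstance(value, string_types())
```

On Python 3 the ``basestring`` branch is never reached, so the undefined name is harmless there.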
