mapply package
mapply.init(*, n_workers=-1, chunk_size=100, max_chunks_per_worker=20, progressbar=True, apply_name='mapply', map_name='mmap', applymap_name='mapplymap')
Initialize and patch PandasObject.
- Parameters
n_workers (int) – Number of workers (processes) to spawn.
chunk_size (int) – Minimum number of items per chunk. Determines the upper limit for n_chunks.
max_chunks_per_worker (int) – Upper limit on the number of chunks per worker. Lowers the n_chunks derived from chunk_size if necessary. Set to 0 to skip this check.
progressbar (bool) – Whether to wrap the chunks in a tqdm.auto.tqdm progress bar.
apply_name (str) – Attribute name for the patched apply function.
map_name (str) – Attribute name for the patched map function.
applymap_name (str) – Attribute name for the patched applymap function.
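The patching step described above can be pictured with a small standard-library sketch. It is not the library's implementation: `Frame`, `parallel_apply`, and this local `init` are hypothetical stand-ins, and the "parallel" routine runs serially so the example stays self-contained. It only shows how a function could be attached to a class under a configurable attribute name with its defaults pre-bound.

```python
from functools import partialmethod


class Frame:
    """Hypothetical stand-in for a pandas object (not the real class)."""

    def __init__(self, values):
        self.values = list(values)


def parallel_apply(self, function, *, n_workers=-1, progressbar=True):
    """Placeholder for a parallel routine; runs serially in this sketch."""
    return [function(value) for value in self.values]


def init(cls, *, n_workers=-1, progressbar=True, apply_name="mapply"):
    """Attach parallel_apply to cls under apply_name with defaults pre-bound."""
    patched = partialmethod(
        parallel_apply, n_workers=n_workers, progressbar=progressbar
    )
    setattr(cls, apply_name, patched)


# Patch once, then call the new attribute like any other method.
init(Frame, n_workers=2, apply_name="mapply")
doubled = Frame([1, 2, 3]).mapply(lambda value: value * 2)
```

With the library itself, the documented signature suggests calling `mapply.init(...)` once and then using the attribute named by `apply_name` on pandas objects. The exact behaviour may differ between versions.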
Submodules
mapply.mapply module
mapply.mapply.mapply(df_or_series, function, axis=0, *, n_workers=-1, chunk_size=100, max_chunks_per_worker=20, progressbar=True, args=(), **kwargs)
Run apply on n_workers: split the input into chunks, gather the results, and concatenate them.
- Parameters
df_or_series (Any) – Argument reserved to the class instance, a.k.a. ‘self’.
function (Callable) – Function to apply to each column or row.
axis (Union[int, str]) – Axis along which the function is applied.
n_workers (int) – Number of workers (processes) to spawn.
chunk_size (int) – Minimum number of items per chunk. Determines the upper limit for n_chunks.
max_chunks_per_worker (int) – Upper limit on the number of chunks per worker. Lowers the n_chunks derived from chunk_size if necessary. Set to 0 to skip this check.
progressbar (bool) – Whether to wrap the chunks in a tqdm.auto.tqdm progress bar.
args – Additional positional arguments to pass to function.
kwargs – Additional keyword arguments to pass to function.
- Returns
Series or DataFrame resulting from applying function along given axis.
- Return type
Any
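How chunk_size and max_chunks_per_worker interact is easier to see with numbers. The sketch below is one possible reading of the parameter descriptions above, not the library's code: `estimate_n_chunks`, `split_into_chunks`, and `chunked_apply` are hypothetical helpers, and plain lists stand in for a Series or DataFrame.

```python
from typing import Any, Callable, List, Sequence


def estimate_n_chunks(n_items: int, n_workers: int, chunk_size: int = 100,
                      max_chunks_per_worker: int = 20) -> int:
    """Assumed rule: chunk_size caps n_chunks, max_chunks_per_worker may lower it."""
    # chunk_size is a minimum per chunk, so it bounds the chunk count from above.
    n_chunks = max(1, n_items // chunk_size)
    # A value of 0 skips the per-worker cap, as the parameter description states.
    if max_chunks_per_worker > 0:
        n_chunks = min(n_chunks, n_workers * max_chunks_per_worker)
    return n_chunks


def split_into_chunks(items: Sequence[Any], n_chunks: int) -> List[Sequence[Any]]:
    """Split items into n_chunks contiguous slices of near-equal length."""
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for index in range(n_chunks):
        stop = start + base + (1 if index < extra else 0)
        chunks.append(items[start:stop])
        start = stop
    return chunks


def chunked_apply(items: Sequence[Any], function: Callable[[Any], Any],
                  n_chunks: int) -> List[Any]:
    """Apply function chunk by chunk, then concatenate results in input order."""
    per_chunk = [[function(item) for item in chunk]
                 for chunk in split_into_chunks(items, n_chunks)]
    return [result for chunk in per_chunk for result in chunk]
```

Under these assumptions, 10,000 items on 4 workers with the default settings would give min(10000 // 100, 4 * 20) = 80 chunks, while setting max_chunks_per_worker to 0 would leave it at 100.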
mapply.parallel module
mapply.parallel.multiprocessing_imap(function, iterable, *, n_workers=-1, progressbar=True, args=(), **kwargs)
Execute function on each element in iterable on n_workers, ensuring order.
- Parameters
function (Callable) – Function to apply to each element in iterable.
iterable (Iterable[Any]) – Input iterable on which to execute function.
n_workers (int) – Number of workers (processes) to spawn.
progressbar (bool) – Whether to wrap the chunks in a tqdm.auto.tqdm progress bar.
args – Additional positional arguments to pass to function.
kwargs – Additional keyword arguments to pass to function.
- Returns
Results in same order as input iterable.
- Return type
List[Any]
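An order-preserving parallel map of this kind can be sketched with the standard library's `multiprocessing.Pool.imap`, which yields results in input order. This is an illustration, not the library's implementation: `ordered_imap` and `_call` are hypothetical names, the progress bar is omitted, each element is assumed to be passed as the first positional argument, and a non-positive n_workers simply falls back to the pool's default size.

```python
from functools import partial
from multiprocessing import Pool
from typing import Any, Callable, Iterable, List


def _call(function: Callable, args: tuple, kwargs: dict, element: Any) -> Any:
    """Top-level helper (picklable): element first, then the extra arguments."""
    return function(element, *args, **kwargs)


def ordered_imap(function: Callable, iterable: Iterable[Any], *,
                 n_workers: int = -1, args: tuple = (), **kwargs) -> List[Any]:
    """Run function on every element and return results in input order."""
    bound = partial(_call, function, args, kwargs)
    if n_workers == 1:
        # Design choice of this sketch: skip the pool for a single worker.
        return [bound(element) for element in iterable]
    # Assumption: a non-positive value lets Pool choose its default size.
    with Pool(n_workers if n_workers > 0 else None) as pool:
        # imap yields results in the same order as the input iterable.
        return list(pool.imap(bound, iterable))
```

When more than one worker is used, the function must be picklable (for example, defined at module level), and on platforms that spawn worker processes the call should sit under an `if __name__ == "__main__":` guard.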
mapply.parallel.sensible_cpu_count()
Count the number of physical CPUs, plus 1 on hyperthreading systems to prioritize.
- Return type
int
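The counting rule can be written as a small pure function. This is a sketch under stated assumptions: `sensible_count` is a hypothetical helper that takes the physical and logical core counts as arguments, whereas the documented function takes none, and how the library obtains those counts is not stated here. Hyperthreading is assumed to show up as more logical than physical cores.

```python
def sensible_count(physical: int, logical: int) -> int:
    """Return physical cores, plus 1 when hyperthreading appears to be active."""
    # Assumption: logical > physical indicates hyperthreading (SMT).
    if logical > physical:
        return physical + 1
    return physical


# A 4-core machine with 8 logical cores versus one without hyperthreading.
with_smt = sensible_count(physical=4, logical=8)
without_smt = sensible_count(physical=4, logical=4)
```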