mapply package

mapply.init(*, n_workers=-1, chunk_size=100, max_chunks_per_worker=20, progressbar=True, apply_name='mapply', map_name='mmap', applymap_name='mapplymap')

Initialize and patch PandasObject.

Parameters
  • n_workers (int) – Number of workers (processes) to spawn.

  • chunk_size (int) – Minimum number of items per chunk. Determines the upper limit for n_chunks.

  • max_chunks_per_worker (int) – Upper limit on the number of chunks per worker. Lowers the n_chunks determined by chunk_size if necessary. Set to 0 to skip this check.

  • progressbar (bool) – Whether to wrap the chunks in a tqdm.auto.tqdm progress bar.

  • apply_name (str) – Attribute name for the patched apply function.

  • map_name (str) – Attribute name for the patched map function.

  • applymap_name (str) – Attribute name for the patched applymap function.
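The interaction between chunk_size and max_chunks_per_worker can be sketched in plain Python. This is a minimal illustration of the two limits as the parameter descriptions state them, not the library's implementation: the helper name estimate_n_chunks is hypothetical, and the floor of one chunk is an assumption.

```python
def estimate_n_chunks(n_items, n_workers, chunk_size=100, max_chunks_per_worker=20):
    """Hypothetical helper: bound the chunk count as the parameter docs describe."""
    # chunk_size is a minimum per chunk, so it caps how many chunks can exist.
    # Assumption: always produce at least one chunk.
    n_chunks = max(1, n_items // chunk_size)
    # max_chunks_per_worker caps the total across workers; 0 skips this check.
    if max_chunks_per_worker > 0:
        n_chunks = min(n_chunks, n_workers * max_chunks_per_worker)
    return n_chunks


print(estimate_n_chunks(1_000, n_workers=4))    # 10: limited by chunk_size
print(estimate_n_chunks(100_000, n_workers=4))  # 80: limited by max_chunks_per_worker
print(estimate_n_chunks(100_000, n_workers=4, max_chunks_per_worker=0))  # 1000
```

Under these assumptions, small inputs are limited by chunk_size, while large inputs are limited by the per-worker cap unless it is disabled with 0.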

Submodules

mapply.mapply module

mapply.mapply.mapply(df_or_series, function, axis=0, *, n_workers=-1, chunk_size=100, max_chunks_per_worker=20, progressbar=True, args=(), **kwargs)

Run apply on n_workers: split the input into chunks, gather the results, and concatenate them.

Parameters
  • df_or_series (Any) – Argument reserved for the class instance, a.k.a. ‘self’.

  • function (Callable) – Function to apply to each column or row.

  • axis (Union[int, str]) – Axis along which the function is applied.

  • n_workers (int) – Number of workers (processes) to spawn.

  • chunk_size (int) – Minimum number of items per chunk. Determines the upper limit for n_chunks.

  • max_chunks_per_worker (int) – Upper limit on the number of chunks per worker. Lowers the n_chunks determined by chunk_size if necessary. Set to 0 to skip this check.

  • progressbar (bool) – Whether to wrap the chunks in a tqdm.auto.tqdm progress bar.

  • args – Additional positional arguments to pass to function.

  • kwargs – Additional keyword arguments to pass to function.

Returns

Series or DataFrame resulting from applying function along given axis.

Return type

Any
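The split, gather, and concatenate steps can be sketched with the standard library alone. This is an illustration of the pattern, not the library's code: it works on plain lists instead of pandas objects, substitutes a thread pool for worker processes so that it runs anywhere without pickling concerns, and the names split_into_chunks and chunked_apply are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous parts of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        stop = start + size + (1 if i < extra else 0)
        chunks.append(items[start:stop])
        start = stop
    return chunks


def chunked_apply(function, items, n_workers=2, n_chunks=4):
    """Apply function per chunk on a pool, then concatenate in input order."""
    chunks = split_into_chunks(items, n_chunks)

    def apply_to_chunk(chunk):
        return [function(x) for x in chunk]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map yields results in the order the chunks were submitted.
        results = list(pool.map(apply_to_chunk, chunks))
    # Concatenate the per-chunk results back into one flat list.
    return [value for part in results for value in part]


print(chunked_apply(lambda x: x * x, list(range(10))))
# [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Because the chunks are contiguous and the results are gathered in submission order, the concatenated output lines up with the input.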

mapply.parallel module

mapply.parallel.multiprocessing_imap(function, iterable, *, n_workers=-1, progressbar=True, args=(), **kwargs)

Execute function on each element in iterable on n_workers, preserving input order.

Parameters
  • function (Callable) – Function to apply to each element in iterable.

  • iterable (Iterable[Any]) – Input iterable on which to execute function.

  • n_workers (int) – Number of workers (processes) to spawn.

  • progressbar (bool) – Whether to wrap the chunks in a tqdm.auto.tqdm progress bar.

  • args – Additional positional arguments to pass to function.

  • kwargs – Additional keyword arguments to pass to function.

Returns

Results in same order as input iterable.

Return type

List[Any]
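An order-preserving parallel map of this kind can be sketched with the standard library's pool imap. This is a rough analogue, not the library's implementation: ordered_imap and scale are hypothetical names, multiprocessing.pool.ThreadPool stands in for a process pool to keep the sketch portable, and passing args and kwargs after the element is an assumption about the calling convention.

```python
from multiprocessing.pool import ThreadPool


def ordered_imap(function, iterable, *, n_workers=2, args=(), **kwargs):
    """Call function(item, *args, **kwargs) for each item, keeping input order."""

    def call(item):
        # Assumption: extra arguments follow the element itself.
        return function(item, *args, **kwargs)

    with ThreadPool(processes=n_workers) as pool:
        # imap yields results in input order, even if later items finish first.
        return list(pool.imap(call, iterable))


def scale(x, factor, offset=0):
    return x * factor + offset


print(ordered_imap(scale, range(5), n_workers=3, args=(10,), offset=1))
# [1, 11, 21, 31, 41]
```

Replacing ThreadPool with multiprocessing.Pool would use processes instead, but then the function must be picklable, so the closure above would need to become a module-level callable.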

mapply.parallel.sensible_cpu_count()

Count the number of physical CPUs, plus 1 on hyperthreading systems to prioritize.

Return type

int