utils.mpi

Module for parallelization using mpi4py.

Allows for easy parallelization in master/slaves mode with one master submitting function or method calls to slaves. Uses mpi4py if available, otherwise processes calls sequentially in one process.
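
A minimal sketch of how a script might check up front which of the two modes to expect. It only tests whether the mpi4py package can be found, using the standard library, and says nothing about whether a working MPI installation is present. The helper name mpi4py_installed is hypothetical and not part of pyunicorn.

```python
import importlib.util


def mpi4py_installed():
    """Return True if the mpi4py package can be found (hypothetical helper)."""
    # find_spec only locates the package; it does not import or initialize MPI.
    return importlib.util.find_spec("mpi4py") is not None


mode = "parallel" if mpi4py_installed() else "sequential"
print("expected mode:", mode)
```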

Examples:

Save the following lines in demo_mpi.py and run:

> mpirun -n 10 python demo_mpi.py

  1. Use master/slaves parallelization with the Network class:

    from pyunicorn import Network, mpi
    
    
    def master():
        net = Network.BarabasiAlbert(n_nodes=1000, n_links_each=10)
        print(net.newman_betweenness())
        mpi.info()
    
    mpi.run()
    
  2. Do a Monte Carlo simulation as master/slaves:

    from pyunicorn import Network, mpi
    
    
    def do_one():
        net = Network.BarabasiAlbert(n_nodes=100, n_links_each=10)
        return net.global_clustering()
    
    
    def master():
        n = 1000
        for i in range(0, n):
            mpi.submit_call("do_one", ())
        s = 0
        for i in range(0, n):
            s += mpi.get_next_result()
        print(s/n)
        mpi.info()
    
    mpi.run()
    
  3. Do a parameter scan without communication with a master, and just save the results in files:

    import numpy
    from pyunicorn import Network, mpi
    
    offset = 10
    n_max = 1000
    s = 0
    n = mpi.rank + offset
    while n <= n_max + offset:
        s += Network.BarabasiAlbert(n_nodes=n).global_clustering()
        n += mpi.size
    
    numpy.save("s"+str(mpi.rank), s)
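
The loop in example 3 splits the parameter range over the nodes by striding: each node starts at rank + offset and advances by size. The following sketch reproduces only that index arithmetic in plain Python, without MPI, to show that every value is handled by exactly one node. The function name values_for_rank and the node count of 4 are assumptions made for illustration.

```python
def values_for_rank(rank, size, offset=10, n_max=1000):
    """Return the values of n that the node with this rank would process."""
    # Same sequence as: n = rank + offset; while n <= n_max + offset: n += size
    return list(range(rank + offset, n_max + offset + 1, size))


size = 4  # hypothetical number of MPI nodes
chunks = [values_for_rank(rank, size) for rank in range(size)]

# Every n from offset to n_max + offset is covered exactly once.
covered = sorted(n for chunk in chunks for n in chunk)
assert covered == list(range(10, 1011))
print([len(chunk) for chunk in chunks])
```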
    
exception pyunicorn.utils.mpi.MPIException(value)

Bases: Exception

__init__(value)
__str__()

Return str(self).

__weakref__

list of weak references to the object

pyunicorn.utils.mpi.abort()

Abort execution on all MPI nodes immediately.

Can be called by master and slaves.

pyunicorn.utils.mpi.am_master = True

(bool) indicates that this MPI node is the master.

pyunicorn.utils.mpi.am_slave = False

(bool) indicates that this MPI node is a slave.

pyunicorn.utils.mpi.assigned = {}

(dictionary) assigned[id] is the slave assigned to the call with that id.

pyunicorn.utils.mpi.available = False

(bool) indicates that slaves are available.

pyunicorn.utils.mpi.get_next_result()

Return the result of the earliest submitted call whose result has not yet been retrieved.

Can only be called by the master.

If the call is not yet finished, waits for it to finish.

Return type:

object

Returns:

return value of call, or None if there are no more calls in the queue.
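
The retrieval order described above can be sketched with a toy queue. This is not pyunicorn code: the class CallQueue is hypothetical, it takes a callable instead of a name, and it runs each call immediately rather than handing it to a slave. It only illustrates the first-in, first-out order and the None returned once nothing is left.

```python
from collections import deque


class CallQueue:
    """Toy stand-in for the master's bookkeeping; not pyunicorn code."""

    def __init__(self):
        self._pending = deque()  # results in submission order

    def submit_call(self, func, args=()):
        # Executed immediately here; a real slave would run it asynchronously.
        self._pending.append(func(*args))

    def get_next_result(self):
        # Oldest uncollected result first; None once the queue is empty.
        return self._pending.popleft() if self._pending else None


q = CallQueue()
for n in range(3):
    q.submit_call(pow, (n, 2))
print([q.get_next_result() for _ in range(4)])  # [0, 1, 4, None]
```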

pyunicorn.utils.mpi.get_result(id)

Return result of earlier submitted call.

Can only be called by the master.

If the call is not yet finished, waits for it to finish. Results should be collected in the same order as calls were submitted. For each slave, the results of calls assigned to that slave must be collected in the same order as those calls were submitted. Can only be called once per call.

Parameters:

id (object) – id of an earlier submitted call, as provided to or returned by submit_call().

Return type:

object

Returns:

return value of call.

pyunicorn.utils.mpi.info()

Print processing statistics.

Can only be called by the master.

pyunicorn.utils.mpi.n_processed = array([0])

(list of ints) n_processed[rank] is the total number of calls processed by MPI node rank. On slave i, only n_processed[i] is available.

pyunicorn.utils.mpi.n_slaves = 0

(int) no. of slaves available.

pyunicorn.utils.mpi.queue = []

(list) ids of submitted calls

pyunicorn.utils.mpi.rank = 0

(int) rank of this MPI node (0 is the master).

pyunicorn.utils.mpi.results = {}

(dictionary) if mpi is not available, the result of submit_call(…, id=a) will be cached in results[a] until get_result(a).
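
A minimal sketch of the caching behaviour described here, assuming the sequential case in which each call is executed at once. The functions below are toy versions written for illustration: they accept a callable rather than a name, and they use a counter where the real module assigns a random id.

```python
import itertools

results = {}                  # id -> cached return value
_counter = itertools.count()  # stand-in for the random ids of the real module


def submit_call(func, args=(), kwargs=None, id=None):
    """Run the call at once and cache its result (toy version)."""
    if id is None:
        id = next(_counter)
    results[id] = func(*args, **(kwargs or {}))
    return id


def get_result(id):
    """Return the cached result and forget it, so the id can be re-used."""
    return results.pop(id)


a = submit_call(max, (3, 7), id="a")
print(get_result(a))   # 7
print("a" in results)  # False
```

Because the entry is removed on retrieval, asking for the same id a second time fails in this sketch, which mirrors the rule that get_result() can only be called once per call.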

pyunicorn.utils.mpi.run(verbose=False)

Run in master/slaves mode until master() finishes.

Must be called on all MPI nodes after function master() was defined.

On the master, run() calls master() and returns when master() returns.

On each slave, run() calls slave() if that is defined, or serve() otherwise. It returns when slave() returns, when master() returns on the master, or when the master calls terminate().

Parameters:

verbose (bool) – whether processing information should be printed.
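
The dispatch rule above can be sketched as follows. This toy run() is not the library function: it receives the rank and the script's namespace as explicit arguments (both assumptions made so that the example runs without MPI) and only shows which function each node would call.

```python
def run(rank, namespace):
    """Dispatch as described above (toy version; rank is passed in explicitly)."""
    if rank == 0:
        return namespace["master"]()
    # Slaves prefer a user-defined slave(); otherwise they serve calls.
    handler = namespace.get("slave", namespace["serve"])
    return handler()


def master():
    return "master() ran"


def serve():
    return "serve() ran"


script = {"master": master, "serve": serve}
print(run(0, script))  # master() ran
print(run(1, script))  # serve() ran

script["slave"] = lambda: "slave() ran"
print(run(1, script))  # slave() ran
```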

pyunicorn.utils.mpi.size = 1

(int) number of MPI nodes (master and slaves).

pyunicorn.utils.mpi.slave_queue = [[]]

(list of lists) slave_queue[i] contains the ids of calls assigned to slave i.

pyunicorn.utils.mpi.start_time = 1708946763.526738

(float) starting time of this MPI node.

pyunicorn.utils.mpi.stats = []

(list of dictionaries) stats[id] contains processing statistics for the last call with this id. Keys:

  • "id": id of the call

  • "rank": MPI node that processed the call

  • "this_time": wall time for processing the call

  • "time_over_est": quotient of actual over estimated wall time

  • "n_processed": no. of calls processed so far by that slave, including this one

  • "total_time": total wall time until this call was finished

pyunicorn.utils.mpi.submit_call(name_to_call, args=(), kwargs={}, module='__main__', time_est=1, id=None, slave=None)

Submit a call for parallel execution.

If called by the master and slaves are available, the call is submitted to a slave for asynchronous execution.

If called by a slave or if no slaves are available, the call is instead executed synchronously on this MPI node.

Examples:

  1. Provide ids and time estimate explicitly:

    for n in range(0,10):
        mpi.submit_call("doit", (n,A[n]), id=n, time_est=n**2)
    
    for n in range(0,10):
        result[n] = mpi.get_result(n)
    
  2. Use generated ids stored in a list:

    for n in range(0,10):
        ids.append(mpi.submit_call("doit", (n,A[n])))
    
    for n in range(0,10):
        # collect in submission order (oldest id first)
        results.append(mpi.get_result(ids.pop(0)))
    
  3. Ignore ids altogether:

    for n in range(0,10):
        mpi.submit_call("doit", (n,A[n]))
    
    for n in range(0,10):
        results.append(mpi.get_next_result())
    
  4. Call a module function and use keyword arguments:

    mpi.submit_call("solve", (), {"a":a, "b":b},
        module="numpy.linalg")
    
  5. Call a static class method from a package:

    mpi.submit_call("Network._get_histogram", (values, n_bins),
        module="pyunicorn")
    

Note that it is module="pyunicorn" and not module="pyunicorn.network" here.
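
To make the role of the module argument concrete, here is a sketch of one way a possibly dotted name could be looked up in the namespace of an already imported module. How pyunicorn performs the lookup internally is not stated here, so the helper resolve is an assumption. The sketch uses standard-library modules in place of pyunicorn.

```python
import math  # imported so that "math" is a key of sys.modules
import os    # imported so that "os" is a key of sys.modules
import sys
from functools import reduce


def resolve(name_to_call, module="__main__"):
    """Look up a possibly dotted name in the namespace of an imported module."""
    namespace = sys.modules[module]  # KeyError if the module was not imported
    return reduce(getattr, name_to_call.split("."), namespace)


print(resolve("sqrt", module="math")(9.0))                # 3.0
print(resolve("path.basename", module="os")("/a/b.txt"))  # b.txt
```

The second call mirrors example 5: the name is given relative to the package namespace in which it is visible, just as "Network._get_histogram" is given relative to "pyunicorn".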

Parameters:
  • name_to_call (str) – name of callable object (usually a function or static method of a class) as contained in the namespace specified by module.

  • args (tuple) – the positional arguments to provide to the callable object. Tuples of length 1 must be written (arg,). Default: ()

  • kwargs (dict) – the keyword arguments to provide to the callable object. Default: {}

  • module (str) – optional name of the imported module or submodule in whose namespace the callable object is contained. For objects defined on the script level, this is "__main__"; for objects defined in an imported package, this is the package name. Must be a key of the dictionary sys.modules (check there after import if in doubt). Default: "__main__"

  • time_est (float) – estimated relative completion time for this call; used to find a suitable slave. Default: 1

  • id (object or None) – unique id for this call. Must be a possible dictionary key. If None, a random id is assigned and returned. Can be re-used after get_result() for this id. Default: None

  • slave (int > 0 and < mpi.size, or None) – optional no. of slave to assign the call to. If None, the call is assigned to the slave with the smallest current total time estimate. Default: None

Return type:

object

Returns:

id of call, to be used in get_result().
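
The role of time_est and slave in choosing a node can be sketched as below. This is a toy version under stated assumptions: the function pick_slave is hypothetical, the estimates are kept in a plain list, and the master's entry at index 0 is assumed to be held at inf so that it is never selected.

```python
import math


def pick_slave(total_time_est, time_est=1, slave=None):
    """Choose a slave for a call and update the estimates (toy version)."""
    if slave is None:
        # Least-loaded node; the master's inf entry is never the minimum
        # as long as at least one slave exists.
        slave = min(range(len(total_time_est)), key=total_time_est.__getitem__)
    total_time_est[slave] += time_est
    return slave


# Index 0 is the master (assumed to be held at inf so it is never chosen).
estimates = [math.inf, 0.0, 0.0, 0.0]
assigned = [pick_slave(estimates, time_est=t) for t in (5, 1, 1, 1, 1)]
print(assigned)   # [1, 2, 3, 2, 3]
print(estimates)  # [inf, 5.0, 2.0, 2.0]
```

The one expensive call occupies slave 1, so the four cheap calls are spread over slaves 2 and 3. Passing slave explicitly bypasses the selection but still adds to that slave's estimate.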

pyunicorn.utils.mpi.terminate()

Tell all slaves to terminate.

Can only be called by the master.

pyunicorn.utils.mpi.total_time = array([0.])

(list of floats) total_time[rank] is the total wall time until that node finished its last call. On slave i, only total_time[i] is available.

pyunicorn.utils.mpi.total_time_est = array([inf])

(numpy array of ints) total_time_est[i] is the current estimate of the total time MPI slave i will work on already submitted calls. On slave i, only total_time_est[i] is available.