allpairs_master(1) | Cooperative Computing Tools | allpairs_master(1) |
allpairs_master - executes All-Pairs workflow in parallel on distributed systems
allpairs_master [options] <set A> <set B> <compare function>
allpairs_master computes the Cartesian product of two sets (<set A> and <set B>), generating a matrix where each cell M[i,j] contains the output of the function F (<compare function>) on objects A[i] (an item in <set A>) and B[j] (an item in <set B>). The resulting matrix is displayed on the standard output, one comparison result per line along with the associated X and Y indices.
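To make the shape of the result concrete, the following is a minimal sequential sketch of the same computation in plain shell. It is only an illustration of the output described above, not how allpairs_master is implemented. The function name all_pairs and the 1-based indices are assumptions chosen to match the example output shown later in this page.

```shell
#!/bin/sh
# all_pairs SET_A SET_B FUNC  (hypothetical helper, for illustration only)
# Reads one item per line from SET_A and SET_B and prints, for every pair,
# "i j <output of FUNC A[i] B[j]>" with 1-based indices.
all_pairs() {
    set_a=$1
    set_b=$2
    func=$3
    i=0
    while IFS= read -r a || [ -n "$a" ]; do
        i=$((i + 1))
        j=0
        while IFS= read -r b || [ -n "$b" ]; do
            j=$((j + 1))
            # </dev/null keeps FUNC from consuming the list being read.
            printf '%s %s %s\n' "$i" "$j" "$("$func" "$a" "$b" </dev/null)"
        done < "$set_b"
    done < "$set_a"
}
```

Calling all_pairs set.list set.list compareit would run every comparison one after another on a single core. allpairs_master exists to avoid exactly that by spreading the same comparisons across many workers.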
allpairs_master uses the Work Queue system to distribute tasks among processors. Each processor utilizes the allpairs_multicore(1) program to execute the tasks in parallel if multiple cores are present. After starting allpairs_master, you must start a number of work_queue_worker(1) processes on remote machines. The workers will then connect back to the master process and begin executing tasks.
On success, returns zero. On failure, returns non-zero.
Suppose you have a large number of files, named a, b, c, and so on, that you want to compare against one another. Suppose also that you have a program named compareit that, when invoked as compareit a b, compares files a and b and prints a summary of the difference between the two, like this:
a b are 45 percent similar
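If you do not yet have such a program, a stand-in along the following lines may be enough for experimenting. This is a hypothetical sketch, not the compareit referred to in this page: it assumes text files and defines similarity as the share of distinct lines the two files have in common, which is an arbitrary choice made for illustration.

```shell
#!/bin/sh
# compareit FILE_A FILE_B  (hypothetical stand-in, for illustration only)
# Prints "FILE_A FILE_B are N percent similar", where N is the number of
# distinct lines found in both files divided by the number of distinct
# lines found in either file, as an integer percentage.
compareit() {
    a=$1
    b=$2
    sa=$(mktemp) || return 1
    sb=$(mktemp) || return 1
    sort -u "$a" > "$sa"
    sort -u "$b" > "$sb"
    common=$(comm -12 "$sa" "$sb" | wc -l | tr -d ' ')
    total=$(sort -u "$sa" "$sb" | wc -l | tr -d ' ')
    rm -f "$sa" "$sb"
    if [ "$total" -eq 0 ]; then
        pct=100   # two empty files are treated as identical
    else
        pct=$((100 * common / total))
    fi
    echo "$a $b are $pct percent similar"
}
```

To use it as a standalone program, the function body could be saved in an executable file named compareit, with a final line that calls compareit "$1" "$2".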
To use the allpairs framework, create a file called set.list that lists each of your files, one per line:
a
b
c
...
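For a large collection, the list can be generated rather than typed. The sketch below assumes that the current directory holds only the files to be compared. The helper name make_set_list is hypothetical.

```shell
#!/bin/sh
# make_set_list OUT  (hypothetical helper, for illustration only)
# Writes the name of every regular file in the current directory to OUT,
# one per line, skipping OUT itself.
make_set_list() {
    out=$1
    tmp_list=$(mktemp) || return 1
    for f in *; do
        if [ -f "$f" ] && [ "$f" != "$out" ]; then
            printf '%s\n' "$f"
        fi
    done > "$tmp_list"
    mv "$tmp_list" "$out"
}
```

Running make_set_list set.list in the data directory produces the file described above. Any other files in that directory, such as the comparison program itself, would also be listed, so keep them elsewhere or remove them from the list by hand.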
Because allpairs_master relies on allpairs_multicore(1), make sure allpairs_multicore(1) is in your PATH before you proceed. To run an All-Pairs workflow sequentially, start a single work_queue_worker(1) process in the background. Then, invoke allpairs_master:
% work_queue_worker localhost 9123 &
% allpairs_master set.list set.list compareit
The framework will carry out all possible comparisons of the objects, and print the results one by one (note that the first two columns are X and Y indices in the resulting matrix):
1 1 a a are 100 percent similar
1 2 a b are 45 percent similar
1 3 a c are 37 percent similar
...
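Because each result is a single line that starts with the two indices, the output is easy to post-process with standard tools. The following sketch ranks the pairs by similarity. It assumes the exact line format shown above, where the percentage is the sixth whitespace-separated field, and file names that contain no spaces. The name top_pairs is hypothetical.

```shell
#!/bin/sh
# top_pairs  (hypothetical helper, for illustration only)
# Reads "X Y <name> <name> are N percent similar" lines on standard input,
# drops the diagonal entries (X equal to Y), and prints the remaining lines
# sorted by N, highest first.
top_pairs() {
    awk '$1 != $2' | sort -k6,6nr
}
```

For example, allpairs_master set.list set.list compareit | top_pairs would list the most similar distinct pairs first. A comparison program with a different output format would need a different sort key.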
To speed up the process, run more work_queue_worker(1) processes on other machines, or use condor_submit_workers(1) or uge_submit_workers(1) to start hundreds of workers in your local batch system.
The following is an example of adding more workers to execute an All-Pairs workflow. Suppose your allpairs_master is running on a machine named barney.nd.edu. If you are able to log in to other machines, you could simply start a worker process on each one, like this:
% work_queue_worker barney.nd.edu 9123
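With more than a handful of machines, the same command can be issued in a loop. The sketch below assumes that ssh(1) access to each machine is already set up and that work_queue_worker is in the PATH on the remote side. The helper name start_workers, the SSH_CMD variable, and any host names passed to it are hypothetical.

```shell
#!/bin/sh
# start_workers MASTER PORT HOST...  (hypothetical helper, for illustration only)
# Starts one work_queue_worker on each HOST, pointed at MASTER:PORT.
# SSH_CMD defaults to ssh; setting SSH_CMD=echo gives a dry run that only
# prints what would be executed on each host.
start_workers() {
    master=$1
    port=$2
    shift 2
    for host in "$@"; do
        ${SSH_CMD:-ssh} "$host" work_queue_worker "$master" "$port" &
    done
    # Block until every remote worker has exited.
    wait
}
```

A dry run such as SSH_CMD=echo start_workers barney.nd.edu 9123 host1 host2 is a cheap way to check the command lines before starting real workers.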
If you have access to a batch system like Condor, you can submit multiple workers at once:
% condor_submit_workers barney.nd.edu 9123 10
Submitting job(s)..........
Logging submit event(s)..........
10 job(s) submitted to cluster 298.
The Cooperative Computing Tools are Copyright (C) 2022 The University of Notre Dame. This software is distributed under the GNU General Public License. See the file COPYING for details.
CCTools 7.13.1 FINAL |