Advanced Topics

About Programming Notes

This section presents guidelines and restrictions in shaded boxes like the one shown below. Those labeled as Required result in an error if your parfor code does not adhere to them. MATLAB® software catches some of these errors at the time it reads the code, and others when it executes the code. These are referred to here as static and dynamic errors, respectively, and are labeled as Required (static) or Required (dynamic). Guidelines that do not cause errors are labeled as Recommended. You can use MATLAB Code Analyzer to help make your parfor-loops comply with these guidelines.

Required (static): Description of the guideline or restriction

Classification of Variables

Overview

When a name in a parfor-loop is recognized as referring to a variable, it is classified into one of the following categories. A parfor-loop generates an error if it contains any variables that cannot be uniquely categorized or if any variables violate their category restrictions.

Classification   Description
Loop             Serves as a loop index for arrays
Sliced           An array whose segments are operated on by different iterations of the loop
Broadcast        A variable defined before the loop whose value is used inside the loop, but never assigned inside the loop
Reduction        Accumulates a value across iterations of the loop, regardless of iteration order
Temporary        Variable created inside the loop, but unlike sliced or reduction variables, not available outside the loop

Each of these variable classifications appears in this code fragment:
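
A minimal sketch of such a fragment, assembled from the examples later in this section, is shown here. The variable names and values are illustrative assumptions; the comments give the classification that the rules in this section assign to each variable.

```matlab
a = 0;              % a is assigned inside the loop, so it is temporary there
c = pi;             % broadcast: defined before the loop, only read inside it
z = 0;              % reduction: accumulated across iterations
r = rand(1,10);     % sliced input: iteration i reads only r(i)
b = zeros(1,10);    % sliced output: iteration i writes only b(i)
parfor i = 1:10     % i is the loop variable
    a = i;          % temporary variable
    z = z + i;      % reduction assignment
    b(i) = r(i);    % sliced output b, sliced input r
    if i <= c       % c is a broadcast variable
        d = 2*a;    % temporary variable
    end
end
```

After the loop, z and b are available on the client, while the values assigned to a and d inside the loop are not.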

Loop Variable

The following restriction is required, because changing i in the parfor body invalidates the assumptions MATLAB makes about communication between the client and workers.

Required (static): Assignments to the loop variable are not allowed.

This example attempts to modify the value of the loop variable i in the body of the loop, and thus is invalid:

parfor i = 1:n
   i = i + 1;
   a(i) = i;
end

Sliced Variables

A sliced variable is one whose value can be broken up into segments, or slices, which are then operated on separately by workers and by the MATLAB client. Each iteration of the loop works on a different slice of the array. Using sliced variables is important because this type of variable can reduce communication between the client and workers. Only those slices needed by a worker are sent to it, and only when it starts working on a particular range of indices.

In the next example, a slice of A consists of a single element of that array:

parfor i = 1:length(A)
   B(i) = f(A(i));
end

Characteristics of a Sliced Variable.  A variable in a parfor-loop is sliced if it has all of the following characteristics. A description of each characteristic follows the list:

  • Type of First-Level Indexing — The first level of indexing is either parentheses, (), or braces, {}.

  • Fixed Index Listing — Within the first-level parentheses or braces, the list of indices is the same for all occurrences of a given variable.

  • Form of Indexing — Within the list of indices for the variable, exactly one index involves the loop variable.

  • Shape of Array — In assigning to a sliced variable, the right-hand side of the assignment is not [] or '' (these operators indicate deletion of elements).

Type of First-Level Indexing. For a sliced variable, the first level of indexing is enclosed in either parentheses, (), or braces, {}.

This table lists the forms for the first level of indexing for arrays sliced and not sliced.

Reference for Variable Not Sliced   Reference for Sliced Variable
A.x                                 A(...)
A.(...)                             A{...}

After the first level, you can use any type of valid MATLAB indexing in the second and further levels.

The variable A shown here on the left is not sliced; that shown on the right is sliced:

A.q{i,12}                         A{i,12}.q

Fixed Index Listing. Within the first-level parentheses or braces of a sliced variable's indexing, the list of indices is the same for all occurrences of a given variable.

The variable A shown here on the left is not sliced because A is indexed by i and i+1 in different places; that shown on the right is sliced:

Not sliced                       Sliced
parfor i = 1:k                   parfor i = 1:k
   B(:) = h(A(i), A(i+1));          B(:) = f(A(i));
end                                 C(:) = g(A{i});
                                 end

The example above on the right shows some occurrences of a sliced variable with first-level parenthesis indexing and with first-level brace indexing in the same loop. This is acceptable.

Form of Indexing. Within the list of indices for a sliced variable, one of these indices is of the form i, i+k, i-k, k+i, or k-i, where i is the loop variable and k is a constant or a simple (nonindexed) broadcast variable; and every other index is a scalar constant, a simple broadcast variable, colon, or end.

With i as the loop variable, the A variables shown here on the left are not sliced; those on the right are sliced:

Not sliced           Sliced
A(i+f(k),j,:,3)      A(i+k,j,:,3)
A(i,20:30,end)       A(i,:,end)
A(i,:,s.field1)      A(i,:,k)

When you use other variables along with the loop variable to index an array, you cannot set these variables inside the loop. In effect, such variables are constant over the execution of the entire parfor statement. You cannot combine the loop variable with itself to form an index expression.
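
As a minimal sketch of a valid index form, the loop below slices A with an index of the form i+k, where k is a simple broadcast variable that is set before the loop and never assigned inside it. The array sizes and values are assumptions chosen for illustration.

```matlab
n = 5;
k = 2;                    % broadcast variable; not assigned inside the loop
A = 1:(n + k);            % long enough that A(i+k) exists for every i
B = zeros(1, n);
parfor i = 1:n
    B(i) = 10 * A(i+k);   % the only index of A has the form i+k
end
```

Replacing A(i+k) with an index such as A(i+f(k)) or A(2*i) would take A outside the forms listed above.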

Shape of Array. A sliced variable must maintain a constant shape. The variable A shown here on either line is not sliced:

A(i,:) = [];
A(end + 1) = i;

A is not sliced in either case because changing the shape of a sliced array would violate assumptions governing communication between the client and workers.

Sliced Input and Output Variables.  All sliced variables have the characteristics of being input or output. A sliced variable can sometimes have both characteristics. MATLAB transmits sliced input variables from the client to the workers, and sliced output variables from workers back to the client. If a variable is both input and output, it is transmitted in both directions.

In this parfor-loop, r is a sliced input variable and b is a sliced output variable:

a = 0;
z = 0;
r = rand(1,10);
parfor ii = 1:10
   a = ii;
   z = z + ii;
   b(ii) = r(ii);
end

However, if it is clear that in every iteration, every reference to an array element is set before it is used, the variable is not a sliced input variable. In this example, all the elements of A are set, and then only those fixed values are used:

parfor ii = 1:n
   if someCondition
      A(ii) = 32;
   else
      A(ii) = 17;
   end
   % loop code that uses A(ii)
end

Even if a sliced variable is not explicitly referenced as an input, implicit usage might make it so. In the following example, not all elements of A are necessarily set inside the parfor-loop, so the original values of the array are received, held, and then returned from the loop, making A both a sliced input and output variable.

A = 1:10;
parfor ii = 1:10
    if rand < 0.5
        A(ii) = 0;
    end
end

Broadcast Variables

A broadcast variable is any variable other than the loop variable or a sliced variable that is not affected by an assignment inside the loop. At the start of a parfor-loop, the values of any broadcast variables are sent to all workers. Although this type of variable can be useful or even essential, broadcast variables that are large can cause a lot of communication between client and workers. In some cases it might be more efficient to use temporary variables for this purpose, creating and assigning them inside the loop.

Reduction Variables

MATLAB supports an important exception, called reductions, to the rule that loop iterations must be independent. A reduction variable accumulates a value that depends on all the iterations together, but is independent of the iteration order. MATLAB allows reduction variables in parfor-loops.

Reduction variables appear on both sides of an assignment statement, such as any of the following, where expr is a MATLAB expression.

X = X + expr              X = expr + X
X = X - expr              See Associativity in Reduction Assignments in Further Considerations with Reduction Variables
X = X .* expr             X = expr .* X
X = X * expr              X = expr * X
X = X & expr              X = expr & X
X = X | expr              X = expr | X
X = [X, expr]             X = [expr, X]
X = [X; expr]             X = [expr; X]
X = {X, expr}             X = {expr, X}
X = {X; expr}             X = {expr; X}
X = min(X, expr)          X = min(expr, X)
X = max(X, expr)          X = max(expr, X)
X = union(X, expr)        X = union(expr, X)
X = intersect(X, expr)    X = intersect(expr, X)

Each of the allowed statements listed in this table is referred to as a reduction assignment, and, by definition, a reduction variable can appear only in assignments of this type.

The following example shows a typical usage of a reduction variable X:

X = ...;            % Do some initialization of X
parfor i = 1:n
    X = X + d(i);
end

This loop is equivalent to the following, where each d(i) is calculated by a different iteration:

X = X + d(1) + ... + d(n)

If the loop were a regular for-loop, the variable X in each iteration would get its value either before entering the loop or from the previous iteration of the loop. However, this concept does not apply to parfor-loops:

In a parfor-loop, the value of X is never transmitted from client to workers or from worker to worker. Rather, additions of d(i) are done in each worker, with i ranging over the subset of 1:n being performed on that worker. The results are then transmitted back to the client, which adds the workers' partial sums into X. Thus, workers do some of the additions, and the client does the rest.
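
The division of work described above can be imitated in ordinary serial code. The sketch below is only a model of the idea: the split into three chunks and the names chunks, partial, X0, and Xserial are assumptions for illustration, and the actual assignment of iterations to workers is decided by MATLAB.

```matlab
n = 12;
d = (1:n).^2;              % stand-in for the per-iteration values d(i)
X0 = 5;                    % value of X before the loop

% Hypothetical division of 1:n among three workers.
chunks = {1:4, 5:8, 9:12};

% Each "worker" adds up only the d(i) for its own subset of iterations.
partial = zeros(1, numel(chunks));
for w = 1:numel(chunks)
    partial(w) = sum(d(chunks{w}));
end

% The "client" adds the workers' partial sums into X.
X = X0 + sum(partial);

% An ordinary for-loop over the same values, for comparison.
Xserial = X0;
for i = 1:n
    Xserial = Xserial + d(i);
end
```

For these integer values the two totals agree exactly. With floating-point data, the grouping of the additions can change the round-off error in the result.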

Basic Rules for Reduction Variables.  The following requirements further define the reduction assignments associated with a given variable.

Required (static): For any reduction variable, the same reduction function or operation must be used in all reduction assignments for that variable.

The parfor-loop on the left is not valid because the reduction assignment uses + in one instance, and [,] in another. The parfor-loop on the right is valid:

Invalid                          Valid
parfor i = 1:n                   parfor i = 1:n
   if testLevel(k)                  if testLevel(k)
      A = A + i;                       A = A + i;
   else                             else
      A = [A, 4+i];                    A = A + i + 5*k;
   end                              end
   % loop body continued            % loop body continued
end                              end

Required (static): If the reduction assignment uses * or [,], then in every reduction assignment for X, X must be consistently specified as the first argument or consistently specified as the second.

The parfor-loop on the left below is not valid because the order of items in the concatenation is not consistent throughout the loop. The parfor-loop on the right is valid:

Invalid                          Valid
parfor i = 1:n                   parfor i = 1:n
   if testLevel(k)                  if testLevel(k)
      A = [A, 4+i];                    A = [A, 4+i];
   else                             else
      A = [r(i), A];                   A = [A, r(i)];
   end                              end
   % loop body continued            % loop body continued
end                              end

Further Considerations with Reduction Variables.  This section provides more detail about reduction assignments, associativity, commutativity, and overloading of reduction functions.

Reduction Assignments. In addition to the specific forms of reduction assignment listed in the table in Reduction Variables, the only other (and more general) form of a reduction assignment is

X = f(X, expr)            X = f(expr, X)

Required (static): f can be a function or a variable. If it is a variable, it must not be affected by the parfor body (in other words, it is a broadcast variable).

If f is a variable, then for all practical purposes its value at run time is a function handle. However, this is not strictly required; as long as the right-hand side can be evaluated, the resulting value is stored in X.

The parfor-loop below on the left will not execute correctly because the statement f = @times causes f to be classified as a temporary variable, so it is cleared at the beginning of each iteration. The parfor-loop on the right is correct, because it does not assign f inside the loop:

Invalid                          Valid
f = @(x,k)x * k;                 f = @(x,k)x * k;
parfor i = 1:n                   parfor i = 1:n
   a = f(a,i);                      a = f(a,i);
   % loop body continued            % loop body continued
   f = @times;  % Affects f      end
end

Note that the operators && and || are not listed in the table in Reduction Variables. Except for && and ||, all the matrix operations of MATLAB have a corresponding function f, such that u op v is equivalent to f(u,v). For && and ||, such a function cannot be written because u&&v and u||v might or might not evaluate v, but f(u,v) always evaluates v before calling f. This is why && and || are excluded from the table of allowed reduction assignments for a parfor-loop.

Every reduction assignment has an associated function f. The properties of f that ensure deterministic behavior of a parfor statement are discussed in the following sections.

Associativity in Reduction Assignments. Concerning the function f as used in the definition of a reduction variable, the following practice is recommended, but does not generate an error if not adhered to. Therefore, it is up to you to ensure that your code meets this recommendation.

Recommended: To get deterministic behavior of parfor-loops, the reduction function f must be associative.

To be associative, the function f must satisfy the following for all a, b, and c:

f(a,f(b,c)) = f(f(a,b),c)

The classification rules for variables, including reduction variables, are purely syntactic. They cannot determine whether the f you have supplied is truly associative or not. Associativity is assumed, but if you violate this, different executions of the loop might result in different answers.

    Note:   While the addition of mathematical real numbers is associative, addition of floating-point numbers is only approximately associative, and different executions of this parfor statement might produce values of X with different round-off errors. This is an unavoidable cost of parallelism.

For example, the statement on the left yields 1, while the statement on the right returns 1 + eps:

(1 + eps/2) + eps/2           1 + (eps/2 + eps/2)
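
The two groupings can be checked directly. This short sketch assumes IEEE double-precision arithmetic with the default rounding mode; the names left and right are illustrative.

```matlab
left  = (1 + eps/2) + eps/2;    % each eps/2 is rounded away when added to 1
right = 1 + (eps/2 + eps/2);    % the two halves first combine to eps

fprintf('left  - 1 = %g\n', left - 1);
fprintf('right - 1 = %g\n', right - 1);
fprintf('left == right: %d\n', left == right);
```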

With the exception of the minus operator (-), all the special cases listed in the table in Reduction Variables have a corresponding (perhaps approximately) associative function. MATLAB calculates the assignment X = X - expr by using X = X + (-expr). (So, technically, the function for calculating this reduction assignment is plus, not minus.) However, the assignment X = expr - X cannot be written using an associative function, which explains its exclusion from the table.

Commutativity in Reduction Assignments. Some associative functions, including +, .*, min, max, intersect, and union, are also commutative. That is, they satisfy the following for all a and b:

f(a,b) = f(b,a)

Examples of noncommutative functions are * (because matrix multiplication is not commutative for matrices in which both dimensions have size greater than one), [,], [;], {,}, and {;}. Noncommutativity is the reason that consistency in the order of arguments to these functions is required. As a practical matter, a more efficient algorithm is possible when a function is commutative as well as associative, and parfor is optimized to exploit commutativity.
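
The difference between the two groups of functions can be seen with small inputs. In this sketch, the matrices P and Q and the vectors u and v are arbitrary values chosen for illustration.

```matlab
P = [1 2; 3 4];
Q = [0 1; 1 0];
u = [1, 2];
v = [3, 4];

% Noncommutative: swapping the arguments changes the result.
mtimesSame = isequal(P * Q, Q * P);          % P*Q swaps columns, Q*P swaps rows
hcatSame   = isequal([u, v], [v, u]);        % element order follows argument order

% Commutative: swapping the arguments leaves the result unchanged.
plusSame   = isequal(P + Q, Q + P);
minSame    = isequal(min(u, v), min(v, u));

fprintf('mtimes: %d, hcat: %d, plus: %d, min: %d\n', ...
        mtimesSame, hcatSame, plusSame, minSame);
```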

Recommended: Except in the cases of *, [,], [;], {,}, and {;}, the function f of a reduction assignment should be commutative. If f is not commutative, different executions of the loop might result in different answers.

Violating the restriction on commutativity in a function used for reduction could result in unexpected behavior, even if it does not generate an error.

Unless f is a known noncommutative built-in, it is assumed to be commutative. There is currently no way to specify a user-defined, noncommutative function in parfor.

Overloading in Reduction Assignments. Most associative functions f have an identity element e, so that for any a, the following holds true:

f(e,a) = a = f(a,e)

Examples of identity elements for some functions are listed in this table.

Function       Identity Element
+              0
* and .*       1
[,] and [;]    []
&              true
|              false

MATLAB uses the identity elements of reduction functions when it knows them. So, in addition to associativity and commutativity, you should also keep identity elements in mind when overloading these functions.
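
The relation f(e,a) = a = f(a,e) can be checked for the built-in versions of these functions. The sketch below uses arbitrary test values a and t. An overload used in a reduction assignment would be expected to pass the same kind of check for its own class.

```matlab
a = [2 5; 7 3];                 % arbitrary numeric test value
t = [true false true];          % arbitrary logical test value

plusOk  = isequal(0 + a, a)     && isequal(a + 0, a);
timesOk = isequal(1 .* a, a)    && isequal(a .* 1, a);
mtimeOk = isequal(1 * a, a)     && isequal(a * 1, a);
hcatOk  = isequal([[], a], a)   && isequal([a, []], a);
vcatOk  = isequal([[]; a], a)   && isequal([a; []], a);
andOk   = isequal(true & t, t)  && isequal(t & true, t);
orOk    = isequal(false | t, t) && isequal(t | false, t);

allOk = plusOk && timesOk && mtimeOk && hcatOk && vcatOk && andOk && orOk;
fprintf('all identity checks pass: %d\n', allOk);
```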

Recommended: An overload of +, *, .*, [,], or [;] should be associative if it is used in a reduction assignment in a parfor. The overload must treat the respective identity element given above (all with class double) as an identity element.

Recommended: An overload of +, .*, union, or intersect should be commutative.

There is no way to specify the identity element for a function whose identity element MATLAB does not know. In these cases, the behavior of parfor is a little less efficient than it is for functions with a known identity element, but the results are correct.

Similarly, because of the special treatment of X = X - expr, the following is recommended.

Recommended: An overload of the minus operator (-) should obey the mathematical law that X - (y + z) is equivalent to (X - y) - z.

Example: Using a Custom Reduction Function.  Suppose each iteration of a loop performs some calculation, and you are interested in finding which iteration of a loop produces the maximum value. This is a reduction exercise that makes an accumulation across multiple iterations of a loop. Your reduction function must compare iteration results, until finally the maximum value can be determined after all iterations are compared.

First consider the reduction function itself. To compare an iteration's result against another's, the function requires as input the current iteration's result and the known maximum result from other iterations so far. Each of the two inputs is a vector containing an iteration's result data and iteration number.

function mc = comparemax(A, B)
% Custom reduction function for 2-element vector input

if A(1) >= B(1) % Compare the two input data values
    mc = A;     % Return the vector with the larger result
else
    mc = B;
end

Inside the loop, each iteration calls the reduction function (comparemax), passing in a pair of 2-element vectors:

  • The accumulated maximum and its iteration index (this is the reduction variable, cummax)

  • The iteration's own calculation value and index

If the data value of the current iteration is greater than the maximum in cummax, the function returns a vector of the new value and its iteration number. Otherwise, the function returns the existing maximum and its iteration number.

The code for the loop looks like the following, with each iteration calling the reduction function comparemax to compare its own data [dat i] to that already accumulated in cummax.

% First element of cummax is maximum data value
% Second element of cummax is where (iteration) maximum occurs
cummax = [0 0];  % Initialize reduction variable
parfor ii = 1:100
    dat = rand(); % Simulate some actual computation
    cummax = comparemax(cummax, [dat ii]);
end
disp(cummax);
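
Because a reduction must not depend on iteration order, it can be useful to check a custom reduction function serially in more than one order before using it in a parfor-loop. The sketch below is one way to do that under stated assumptions: pick is an inline restatement of the comparemax rule so that the code runs as a script, vals is stand-in data with no repeated values, and ordinary for-loops replace the parfor-loop.

```matlab
% Inline restatement of comparemax: keep A when A(1) >= B(1), otherwise B.
pick = @(A, B) A * (A(1) >= B(1)) + B * (A(1) < B(1));

vals = mod(37 * (1:100), 101);   % stand-in data; all 100 values are distinct

forward = [0 0];
for ii = 1:100                   % iterations in increasing order
    forward = pick(forward, [vals(ii) ii]);
end

backward = [0 0];
for ii = 100:-1:1                % the same iterations in reverse order
    backward = pick(backward, [vals(ii) ii]);
end

[m, idx] = max(vals);            % reference answer from the built-in max
disp([forward; backward; m idx]);
```

With distinct data values, both orders agree with the built-in max. If two iterations produced the same maximum value, the >= comparison would keep whichever was combined first, so the reported iteration number could depend on the order of execution.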

Temporary Variables

A temporary variable is any variable that is the target of a direct, nonindexed assignment, but is not a reduction variable. In the following parfor-loop, a and d are temporary variables:

a = 0;
z = 0;
r = rand(1,10);
parfor i = 1:10
   a = i;          % Variable a is temporary
   z = z + i;
   if i <= 5
      d = 2*a;     % Variable d is temporary
   end
end

In contrast to the behavior of a for-loop, MATLAB effectively clears any temporary variables before each iteration of a parfor-loop. To help ensure the independence of iterations, the values of temporary variables cannot be passed from one iteration of the loop to another. Therefore, temporary variables must be set inside the body of a parfor-loop, so that their values are defined separately for each iteration.

MATLAB does not send temporary variables back to the client. A temporary variable in the context of the parfor statement has no effect on a variable with the same name that exists outside the loop, again in contrast to ordinary for-loops.

Uninitialized Temporaries.  Because temporary variables are cleared at the beginning of every iteration, MATLAB can detect certain cases in which any iteration through the loop uses the temporary variable before it is set in that iteration. In this case, MATLAB issues a static error rather than a run-time error, because there is little point in allowing execution to proceed if a run-time error is guaranteed to occur. This kind of error often arises because of confusion between for and parfor, especially regarding the rules of classification of variables. For example, suppose you write

  b = true;
  parfor i = 1:n
     if b && some_condition(i)
        do_something(i);
        b = false;
     end
     ...
  end

This loop is acceptable as an ordinary for-loop, but as a parfor-loop, b is a temporary variable because it occurs directly as the target of an assignment inside the loop. Therefore it is cleared at the start of each iteration, so its use in the condition of the if is guaranteed to be uninitialized. (If you change parfor to for, the value of b assumes sequential execution of the loop, so that do_something(i) is executed for only the lower values of i until b is set false.)

Temporary Variables Intended as Reduction Variables.  Another common cause of uninitialized temporaries can arise when you have a variable that you intended to be a reduction variable, but you use it elsewhere in the loop, causing it technically to be classified as a temporary variable. For example:

s = 0;
parfor i = 1:n
   s = s + f(i);
   ...
   if (s > whatever)
      ...
   end
end

If the only occurrences of s were the two in the first statement of the body, it would be classified as a reduction variable. But in this example, s is not a reduction variable because it has a use outside of reduction assignments in the line s > whatever. Because s is the target of an assignment (in the first statement), it is a temporary, so MATLAB issues an error about this fact, but points out the possible connection with reduction.

Note that if you change parfor to for, the use of s outside the reduction assignment relies on the iterations being performed in a particular order. The point here is that in a parfor-loop, it matters that the loop "does not care" about the value of a reduction variable as it goes along. It is only after the loop that the reduction value becomes usable.
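
A minimal sketch of the pattern that does work: keep s only in its reduction assignment inside the loop, and test its value after the loop has finished. The per-iteration term i^2 and the name threshold are hypothetical stand-ins for f(i) and whatever in the example above.

```matlab
n = 10;
threshold = 100;        % hypothetical limit, standing in for "whatever"

s = 0;
parfor i = 1:n
    s = s + i^2;        % s appears only in this reduction assignment
end

% The accumulated value is usable once the loop has finished.
exceeded = s > threshold;
fprintf('s = %d, exceeded = %d\n', s, exceeded);
```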

Improving Performance

Where to Create Arrays

With a parfor-loop, it might be faster to have each MATLAB worker create its own arrays or portions of them in parallel, rather than to create a large array in the client before the loop and send it out to all the workers separately. Having each worker create its own copy of these arrays inside the loop saves the time of transferring the data from client to workers, because all the workers can be creating it at the same time. This might challenge your usual practice to do as much variable initialization before a for-loop as possible, so that you do not needlessly repeat it inside the loop.

Whether to create arrays before the parfor-loop or inside the parfor-loop depends on the size of the arrays, the time needed to create them, whether the workers need all or part of the arrays, the number of loop iterations that each worker performs, and other factors. While many for-loops can be directly converted to parfor-loops, even in these cases there might be other issues involved in optimizing your code.

Optimizing on Local vs. Cluster Workers

With local workers, because all the MATLAB worker sessions are running on the same machine, you might not see any performance improvement from a parfor-loop regarding execution time. This can depend on many factors, including how many processors and cores your machine has. You might experiment to see if it is faster to create the arrays before the loop (as shown on the left below), rather than have each worker create its own arrays inside the loop (as shown on the right).

Try the following examples running a parallel pool locally, and notice the difference in execution time for each loop. First open a local parallel pool:

parpool('local')

Then enter the following examples. (If you are viewing this documentation in the MATLAB help browser, highlight each segment of code below, right-click, and select Evaluate Selection in the context menu to execute the block in MATLAB. That way the time measurement will not include the time required to paste or type.)

tic;                                    tic;
n = 200;                                n = 200;
M = magic(n);                           parfor i = 1:n
R = rand(n);                               M = magic(n);
parfor i = 1:n                             R = rand(n);
   A(i) = sum(M(i,:).*R(n+1-i,:));         A(i) = sum(M(i,:).*R(n+1-i,:));
end                                     end
toc                                     toc

Running on a remote cluster, you might find different behavior as workers can simultaneously create their arrays, saving transfer time. Therefore, code that is optimized for local workers might not be optimized for cluster workers, and vice versa.
