Path: news.mathworks.com!newsfeed-00.mathworks.com!newsfeed2.dallas1.level3.net!news.level3.com!postnews.google.com!z66g2000hsc.googlegroups.com!not-for-mail
From: Greg Heath <heath@alumni.brown.edu>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Vectorization for Quadratic Polynomial Regression
Date: Fri, 11 Jul 2008 21:31:10 -0700 (PDT)
Organization: http://groups.google.com
Lines: 151
Message-ID: <46e8a0ff-be1b-43e1-b958-acf5400c4523@z66g2000hsc.googlegroups.com>
References: <cddc4bd3-1f78-4fad-90d2-a7cde7ec5189@e39g2000hsf.googlegroups.com>
NNTP-Posting-Host: 69.141.173.117
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1215837070 22533 127.0.0.1 (12 Jul 2008 04:31:10 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Sat, 12 Jul 2008 04:31:10 +0000 (UTC)
Complaints-To: groups-abuse@google.com
Injection-Info: z66g2000hsc.googlegroups.com; posting-host=69.141.173.117; 
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 
Xref: news.mathworks.com comp.soft-sys.matlab:478957


On Jul 1, 3:05=A0pm, Greg Heath <he...@alumni.brown.edu> wrote:
> I have an original predictor matrix X0 with
>
> size(X0) =3D [ m n0]
>
> I'd like to perform the
>
> linear-coefficient/quadratic-variable regression
>
> y =3D X*W + e
>
> size(X) =3D [ m n]
>
> n =3D n0*(n0+3)/2
>
> and X contains the columns of X0,
> their squares, and their cross products.
>
> Currently, X is created by the following double loop
>
> X =3D X0;
> for i =3D 1:n0-1
> =A0 =A0 for j=3D i+1:n0
> =A0 =A0 =A0 =A0 X =3D [ X X0(:,i).*X0(:,j)];
> =A0 =A0 end
> end
> X =3D [X X0.^2];
>
> Example:
>
> X0 =3D [ =A01 2 3; 4 5 6]
>
> X =3D
> =A0 =A0 =A01 =A0 =A0 2 =A0 =A0 3 =A0 =A0 2 =A0 =A0 3 =A0 =A0 6 =A0 =A0 1 =
=A0 =A0 4 =A0 =A0 9
> =A0 =A0 =A04 =A0 =A0 5 =A0 =A0 6 =A0 =A020 =A0 =A024 =A0 =A030 =A0 =A016 =
=A0 =A025 =A0 =A036
>
> Can the double loop be vectorized?

Many thanks to Tom, Bruno, Per and Pierre.
After some minor modifications, I ran the 4 algorithms
below for my simple 2 X 3 illustrative example and
Pierre's more realistic 1030 X 8 example (the size of
my current regression problem).

Notice how the rankings of the times change for the
two cases. I have noticed that, for the simple example,
the double loop times ranged between 1.8 and 4.2
sec. I can only attribute the spread to whatever
the computer was doing in the backgound without
my permission.

I wouldn't be surprised if the rankings change
when others run the script below.

The following results are for just two runs

run =3D1: 1e5 repetitions for size(X0) =3D [2 3]

 time(sec)    Nerd(s)
 2.2970       Greg
 8.1560       Tom
 5.6100       Bruno/Per
 4.1720       Pierre

2. 1e3 repetitions for size(X0) =3D [1030 1]

 time(sec)    Nerd(s)
 6.2040       Greg
1.2960        Tom
1.3750        Bruno/Per
1.2040        Pierre       % ....The winner by a nose!


%%%%%%%%%%%%%%%%%%%%%%%%%%%

clear all, clc

run =3D 2
run =3D 1
if run =3D=3D 1
    X0 =3D [1 2 3; 4 5 6]                  % [2 3]
    reps =3D 1e5
else
    X0 =3D repmat(1:8,[1030 1]);          % [ 1030 8]
    reps =3D 1e3
end
[m n0] =3D  size(X0)

tic % Greg
for k =3D 1:reps
    X     =3D  X0;
    for i =3D 1:n0-1
        for j=3D i+1:n0
            X =3D [ X X0(:,i).*X0(:,j)];
        end
    end
    X =3D [ ones(m,1) X X0.^2 ];
end
t1 =3D toc                      % 2.3
if run =3D=3D 1, X  =3D X, end

% X =3D
%   1   1   2   3    2    3    6    1    4    9
%   1  4   5   6   20   24   30   16   25   36


tic % Tom
for k =3D 1:reps
    col =3D repmat(1:n0,n0,1);
    row =3D col';
    upper =3D ~tril(ones(n0));    % get only (i,j) with j > i
    I =3D row(upper);
    J =3D col(upper);
    X =3D [ones(m,1), X0, X0(:,I).*X0(:,J) X0.^2];
end
t2 =3D toc                      % ~ 8.2 sec
if run =3D=3D 1, X  =3D X, end

tic % Brun0 & Per
for k =3D 1:reps
    [J,I]=3Dmeshgrid(1:n0);           % Bruno
    K =3D J>=3DI;                       % Per
    X =3D [ones(m,1) X0 X0(:,I(K)).*X0(:,J(K))];
end
t3 =3D toc                     % ~ 5.6 sec
if run =3D=3D 1, X  =3D X, end

tic % Pierre
for k =3D 1:reps
  changeIdx      =3D [1, cumsum(n0-1:-1:2)+1];
  ff1(changeIdx) =3D 1;
  sfc            =3D ones(1, changeIdx(end));
  sfc(changeIdx) =3D [2, (3:n0)-n0];
  X =3D [ones(m,1),X0,X0(:,cumsum(ff1)).*X0(:,cumsum(sfc)),X0.^2];
end
t4=3Dtoc                     % ~ 4.2 sec
if run =3D=3D 1, X  =3D X, end

times =3D [t1 t2 t3 t4]'

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Hope this helps.

Greg