fixunknowns

Process data by marking rows with unknown values

Syntax

[y,ps] = fixunknowns(X)
[y,ps] = fixunknowns(X,FP)
Y = fixunknowns('apply',X,PS)
X = fixunknowns('reverse',Y,PS)
name = fixunknowns('name')
fp = fixunknowns('pdefaults')
pd = fixunknowns('pdesc')
fixunknowns('pcheck',fp)

Description

fixunknowns processes matrices by replacing each row containing unknown values (represented by NaN) with two rows of information.

The first row contains the original row, with NaN values replaced by the row’s mean. The second row contains 1 and 0 values, indicating which values in the first row were known or unknown, respectively.
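The recoding described above can be reproduced by hand in plain MATLAB. The following is a minimal sketch of the idea, not the toolbox implementation; the variable names (`x`, `y`, `known`, `filled`) are illustrative, and the mean is taken over the known values of each row.

```matlab
% Illustrative sketch only: recode rows containing NaN without calling fixunknowns.
x = [1 2 3 4; 4 NaN 6 5];          % second row has one unknown value

y = [];                            % output is built row by row
for i = 1:size(x,1)
    row = x(i,:);
    known = ~isnan(row);           % logical mask: true where the value is known
    if all(known)
        y = [y; row];              % a row without NaN passes through unchanged
    else
        filled = row;
        filled(~known) = mean(row(known));   % replace NaN with the mean of the known values
        y = [y; filled; double(known)];      % value row followed by its 1/0 indicator row
    end
end
disp(y)
```

Here the second row becomes `[4 5 6 5]` (the mean of 4, 6, and 5 is 5), followed by the indicator row `[1 0 1 1]`.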

[y,ps] = fixunknowns(X) takes these inputs,

`X`	`N`-by-`Q` matrix

and returns

`Y`	`M`-by-`Q` matrix with `M - N` rows added
`PS`	Process settings that allow consistent processing of values

[y,ps] = fixunknowns(X,FP) takes an empty struct FP of parameters.

Y = fixunknowns('apply',X,PS) returns Y, given X and settings PS.

X = fixunknowns('reverse',Y,PS) returns X, given Y and settings PS.

name = fixunknowns('name') returns the name of this process method.

fp = fixunknowns('pdefaults') returns the default process parameter structure.

pd = fixunknowns('pdesc') returns the process parameter descriptions.

fixunknowns('pcheck',fp) throws an error if any parameter is illegal.

Examples

Here is how to format a matrix with a mixture of known and unknown values in its second and third rows:

x1 = [1 2 3 4; 4 NaN 6 5; NaN 2 3 NaN]
[y1,ps] = fixunknowns(x1)

Next, apply the same processing settings to new values:

x2 = [4 5 3 2; NaN 9 NaN 2; 4 9 5 2]
y2 = fixunknowns('apply',x2,ps)

Reverse the processing of y1 to get x1 again.

x1_again = fixunknowns('reverse',y1,ps)
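Reversal is possible because the indicator row records where the unknown values were. The following is a minimal sketch of that idea in plain MATLAB, on a smaller matrix than `y1`. It assumes you already know which output rows form a value/indicator pair; `valueRow` and `flagRow` are hypothetical names hard-coded for illustration, whereas the toolbox keeps this bookkeeping in `ps`.

```matlab
% Illustrative sketch only: undo the recoding by hand for one value/indicator pair.
y = [1 2 3 4; 4 5 6 5; 1 0 1 1];   % rows 2 and 3 are a value row and its indicator row
valueRow = 2;                      % assumed position of the filled-in values
flagRow  = 3;                      % assumed position of the 1/0 indicators

restored = y(valueRow,:);
restored(y(flagRow,:) == 0) = NaN; % put NaN back wherever the indicator is 0

x = y;
x(valueRow,:) = restored;
x(flagRow,:) = [];                 % drop the indicator row
disp(x)
```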

More About

Recode Data with `NaN` Values Using `fixunknowns`

If you have input data with unknown values, you can represent them with NaN values. For example, here are five 2-element vectors with unknown values in the first element of two of the vectors:

p1 = [1 NaN 3 2 NaN; 3 1 -1 2 4];

The network will not be able to process the NaN values properly. Use the function fixunknowns to transform each row with NaN values (in this case only the first row) into two rows that encode that same information numerically.

[p2,ps] = fixunknowns(p1);

Here is how the first row of values was recoded as two rows.

p2 =
   1  2  3  2  2
   1  0  1  1  0
   3  1 -1  2  4

The first new row is the original first row, but with the mean of that row's known values (in this case 2) replacing all NaN values. Each element of the second new row is either 1, indicating that the original element was a known value, or 0, indicating that it was unknown. The original second row is now the new third row. In this way, both known and unknown values are encoded numerically so that the network can be trained and simulated.

Whenever supplying new data to the network, you should transform the inputs in the same way, using the settings ps returned by fixunknowns when it was used to transform the training input data.

p2new = fixunknowns('apply',p1new,ps);
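To make the point about consistent processing concrete, here is a minimal plain-MATLAB sketch of what reusing the training settings could look like. It assumes the settings carry the row mean computed from the training data, so new unknown values are filled with that mean rather than one recomputed from the new data; the exact contents of `ps` are not described here, and `p1new`, `trainMean`, and `filled` are hypothetical names.

```matlab
% Illustrative sketch only: reuse training-time statistics on new data.
p1 = [1 NaN 3 2 NaN; 3 1 -1 2 4];           % training inputs from the example above
trainMean = mean(p1(1, ~isnan(p1(1,:))));   % mean of the known values in row 1

p1new = [NaN 5 7; 0 1 2];          % hypothetical new inputs with the same two rows
known = ~isnan(p1new(1,:));
filled = p1new(1,:);
filled(~known) = trainMean;        % fill with the training mean, not the new data's mean

% The indicator row is always emitted so the row count matches the training data.
p2new = [filled; double(known); p1new(2,:)];
disp(p2new)
```

Keeping the same row layout for every batch matters because the network's input size is fixed by the transformed training data.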

The function fixunknowns is recommended only for input processing. Unknown targets represented by NaN values can be handled directly by the toolbox learning algorithms. For instance, performance functions used by backpropagation algorithms recognize NaN values as unknown or unimportant values.
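As a rough illustration of how a performance measure can treat NaN targets as unknown, the following sketch computes a mean squared error over the known targets only. This is not the toolbox's implementation; `t`, `yhat`, and `perf` are hypothetical names.

```matlab
% Illustrative sketch only: mean squared error that skips unknown (NaN) targets.
t    = [1 NaN 3; 0 2 NaN];         % targets, NaN marks unknown values
yhat = [1.5 4 3; 0 1 7];           % hypothetical network outputs

known = ~isnan(t);                 % only these entries contribute to the error
err = t(known) - yhat(known);
perf = mean(err.^2);
disp(perf)
```

The outputs at the two NaN positions (4 and 7) have no effect on `perf`, which is the behavior you want for targets whose true value is not known.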

Version History

Introduced in R2006a