SAS Tip for Categorising Calculated Values

SAS Programming Tip For Categorising Calculated Values

We derived a % result based on 2 values, using the simple formula based on a post and pre assessment:

AVAL=(postf-pref)/pref*100;

For 2 patients, we ran into issues when the following values were used:

subject 001: AVAL = (0.84-0.75)/0.75*100 = 12

subject 002: AVAL = (1.38-1.2)/1.2*100 = 15

Using a simple categorisation for AVALCAT1:

      if .<AVAL<12 then AVALCAT1="<12";

      else if 12<=AVAL<15 then AVALCAT1=">=12 to <15";

      else if AVAL>=15 then AVALCAT1=">=15";

Gave values of:

subject 001: AVAL = 12   /  AVALCAT1="<12"

subject 002: AVAL = 15   /  AVALCAT1=">=12 to <15"

Both production and QC datasets matched so the issue didn't flag up in the PROC COMPARE and looking at these isolated cases it's clear that they are wrong; however, we can't manually check a study that has 4000 patients!  Thankfully the Pinnacle 21 Validator found the issues when we validated the ADaM dataset stating there was an inconsistent value of AVALCAT1 for AVAL which prompted us to investigate.  Investigation into both prod and QC showed the syntax was programmed correctly and the numbers appeared as integers in the SAS dataset, but clearly must be stored with more precision than a user would see so they were being categorised incorrectly. We updated the code to apply the INT function to take only the integer part of the AVAL categorisation:

      if .<int(AVAL)<12 then AVALCAT1="<12";

      else if 12<=int(AVAL)<15 then AVALCAT1=">=12 to <15";

      else if int(AVAL)>=15 then AVALCAT1=">=15";

Giving both prod and QC values of, and leaving other values correctly categorised:

subject 1: AVAL = 12   AVALCAT1=">=12 to <15"

subject 2: AVAL = 15   AVALCAT1=">=15"

The INT function returns the integer portion of the argument (truncating the decimal portion) and if the argument's value is within 1E-12 of an integer, the function results in that integer so was suitable for use in our case as the number was stored as value extremely close to 12, represented by 12, but not exactly 12 given the calculation we did and how the result is stored using floating point precision.  The suggestion would be to apply the INT function or use the ROUND function, and not assuming if the value looks like an integer on screen then this is necessarily how its stored as it's the stored value that will be used in the calculation.

If you need proof that SAS stores numbers slightly differently based on how they were created, you can run this yourself and test what the answer of the variable equal would be!

data test;

    if 0.3=3*0.1 then equal='Y';

    else equal='N';

run;

For further reading:  http://analytics.ncsu.edu/sesug/2008/PO-082.pdf