SAS® and R

Best of Both Worlds

Posts Tagged ‘SAS’

proc sort nodup

proc sort noduprecs
by id
rid the pesky duplicates
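
For readers who want to try this out, here is a minimal sketch. The data sets HAVE, WANT and WANT_KEY and their variables are hypothetical names used only for illustration. It contrasts NODUPRECS (drops a record that is identical to the one just written) with NODUPKEY (keeps one record per BY value).

```sas
/* Hypothetical input with repeated rows; all names are illustrative */
data have;
  input id score;
  datalines;
1 10
2 20
1 10
3 30
2 20
;
run;

/* NODUPRECS: drop a record when every variable matches the previous
   record written to the output data set */
proc sort data=have out=want noduprecs;
  by id;
run;

/* NODUPKEY: keep only the first record for each value of the BY variable */
proc sort data=have out=want_key nodupkey;
  by id;
run;
```

Because NODUPRECS compares each record only with the one written just before it, identical records that do not end up next to each other after the sort can survive. Sorting with `by _all_;` is one way to guard against that.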

Written by sasandr

April 1, 2014 at 9:46 pm

Posted in SAS

Floating point arithmetic

One time I was trying different cut-off points for classification to define a dichotomous variable for logistic regression, and I kept getting erroneous results when I looked at the data print-out. Values that should have been set to “Yes” according to my algorithm fell into the “No” column, and I just couldn’t figure out what went wrong.

Here is a simple example to illustrate my problem. We start with a SOURCE data set with four variables: X, Y, the difference X-Y (referred to as Z in the code), and a constant that serves as the cut-off value.

Floating-Point Arithmetic

Obs    x     y     x - y    cutoff
 1     2    1.1      0.9      0.9
 2     2    1.2      0.8      0.8
 3     2    1.3      0.7      0.7
 4     2    1.4      0.6      0.6
 5     2    1.5      0.5      0.5
 6     2    1.6      0.4      0.4
 7     2    1.7      0.3      0.3
 8     2    1.8      0.2      0.2
 9     2    1.9      0.1      0.1

Now a flag variable is created to indicate whether (x-y) is equal to the cut-off (1 for Yes, 0 for No).

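data FLOAT;
  set source;
  /* z holds the difference x - y from SOURCE */
  /* No rounding: exact comparison of two floating-point values */
  flag1 = (z = cutoff);
  /* round() to the rescue: round to the nearest 0.1 before comparing */
  flag2 = (round(z,0.1) = cutoff);
run;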

The print-out of data set FLOAT shows that with the round() function the flag variable is set correctly, but we get erratic results without rounding. This is because SAS represents numeric values as 64-bit floating-point numbers, and most decimal fractions (0.1, for example) have no exact binary representation, so the usual rules of algebra may not hold exactly for floating-point numbers. A paper from the SAS® Institute explains this phenomenon in great detail. You can check it out yourself.

                                       Without      With
Obs    x     y     x - y    cutoff    rounding    rounding

 1     2    1.1      0.9      0.9         0           1
 2     2    1.2      0.8      0.8         1           1
 3     2    1.3      0.7      0.7         1           1
 4     2    1.4      0.6      0.6         0           1
 5     2    1.5      0.5      0.5         1           1
 6     2    1.6      0.4      0.4         0           1
 7     2    1.7      0.3      0.3         0           1
 8     2    1.8      0.2      0.2         0           1
 9     2    1.9      0.1      0.1         0           1

As the example above shows, the quick and dirty way to circumvent this problem is to use the round() function to set the precision before the comparison.
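
Rounding is not the only workaround. Another common approach is to treat two values as equal when they differ by less than a small tolerance. The sketch below assumes the same SOURCE data set as above, with z holding x - y. The data set name FLOAT_TOL, the variable flag3 and the tolerance of 1e-9 are illustrative choices, not part of the original example.

```sas
data FLOAT_TOL;
  set source;
  /* Assumed tolerance: pick a value suited to the scale of your data */
  tol = 1e-9;
  /* Equal "within tolerance" instead of exactly equal */
  flag3 = (abs(z - cutoff) < tol);
  drop tol;
run;
```

A tolerance far smaller than the spacing of the values being compared (here 0.1) but far larger than the floating-point representation error should behave like the rounded comparison for this kind of data.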

Upon further checking, this is also an issue in R. You can see that, without rounding, some of the comparisons do not return “TRUE” even though on paper (x-y) and z look the same.

> x <- rep(2, 9)
> y <- seq(1.1, 1.9, by=0.1)
> z <- seq(0.9, 0.1, by=-0.1)
> x-y
[1] 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1
> z
[1] 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1
> # Without rounding
> x-y == z
[1] FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE
>
> # With rounding
> round(x-y, 1) == round(z, 1)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

So the take-home lesson here is to always check your calculations when you are dealing with values with decimal points, and to use rounding and truncation functions to ensure consistent decimal precision before comparing. And do remember, even though your data may be continuous, the computer only recognizes 0 and 1.

Written by sasandr

July 6, 2012 at 2:26 pm

Posted in R, SAS

SAS tip: Save and load system options

It is good statistical programming practice to delete all temporary data sets at the end of a macro run, not just to save work space and memory, but also to reduce the chance of errors, such as reusing a data set from a previous run, and of warnings, such as naming conflicts.

But how about system options? Inside your macro, you might need to change several system option settings, and it would be a hassle to reset them to their original state one by one. Thankfully we have the OPTSAVE and OPTLOAD procedures at our disposal, which save SAS option values and restore them at a later time. In the following example the OPTSAVE procedure writes the values of all the SAS options that can be altered from within a SAS session to a SAS data set [your_saved_options]. The OPTLOAD procedure later restores the SAS session option values from the [your_saved_options] data set.

/* Save your SAS system options */
proc optsave out=[your_saved_options];
run;

/* Reload your SAS system options */
proc optload data=[your_saved_options];
run;

This way, SAS options can be reset to their original values before exiting the macro.
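
Put together inside a macro, the pattern might look like the sketch below. The macro name demo_with_options, the data set work._saved_opts and the particular options changed are all hypothetical, chosen only to illustrate the save, change, restore sequence.

```sas
%macro demo_with_options;
  /* Save the current option values to a temporary data set (illustrative name) */
  proc optsave out=work._saved_opts;
  run;

  /* Temporary changes needed only while the macro runs */
  options nonotes nosource;

  /* ... the real work of the macro goes here ... */

  /* Restore the option values saved at the top of the macro */
  proc optload data=work._saved_opts;
  run;

  /* Clean up the temporary data set as well */
  proc datasets library=work nolist;
    delete _saved_opts;
  quit;
%mend demo_with_options;

%demo_with_options
```

Deleting the saved-options data set at the end keeps the macro in line with the habit of leaving no temporary data sets behind.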

Written by sasandr

June 6, 2012 at 4:08 pm

Posted in SAS

Abbr. [Update]

Have you been using the same SAS procedures or SAS functions over and over again, but for some reason have never been able to remember the correct syntax? Or do you wish you could cut down on repetitive typing and focus more on thinking? The SAS Enhanced Editor actually has a little tool built in to help you achieve just that.

A SAS abbreviation is a character string defined by you so that when you type the string in the Enhanced Editor window, the string is automatically substituted with a longer text string. Abbreviations are actually keyboard macros that insert one or more lines of text.

1. Create Abbreviation
Just press “Ctrl + Shift + A”, or use the menu “Tools –> Add Abbreviation”. In the Abbreviation field, type the name of the abbreviation. In the Text to insert for abbreviation field, type the text that the abbreviation will expand into. Then click OK. For example, here we type in the syntax for the function IFN. We name this abbreviation ifn, and that is the “code word” you type in the Enhanced Editor to invoke this keyboard macro.

Whenever you want to use an abbreviation, simply type its name while in the Enhanced Editor. As soon as the last letter of the abbreviation has been entered, a small pop-up ‘tip’ text box containing the first few words of the abbreviation is displayed. If you press the Tab or Enter key at that point, the name of the abbreviation is replaced by the text that you stored.
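
As an illustration, the text stored under the ifn abbreviation could look something like the sketch below: a syntax reminder in a comment, followed by a small usage example. The variable names and values are made up for this example. The original text used in this post is not reproduced here.

```sas
/* IFN(logical-expression, value-if-true, value-if-false <, value-if-missing>) */
data _null_;
  score = 72;                      /* hypothetical input value */
  pass  = ifn(score >= 60, 1, 0);  /* 1 when the condition is true, else 0 */
  put pass=;
run;
```

After the abbreviation expands, you would replace the placeholder logic with your own expression and values.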

2. Export/import abbreviation
You can also export or import your abbreviations so you have access to them on multiple machines. Go to “Tools –> Keyboard Macros –> Macros”.

A window opens where you select the abbreviations to export. If you want to select more than one, hold down the CTRL key on your keyboard as you click on each abbreviation. Click “Export”, and SAS will automatically select the necessary file type, Keyboard Macro Files (*.kmf). Name the file anything you want, and SAS will then create the export file for you. If you want to import a KMF file, click “Import”.

Voila! After you set up the abbreviation once, you can recall it again and again. This is especially useful for adding a program header block, and it also saves you from looking up the syntax of functions, statements, or procedures you are prone to forget, so your programming flow is not interrupted. Hope this trick is useful to you.

[Update] Here is a video from SAS® software solution consulting firm Amadeus Software about Using Abbreviations.

Written by sasandr

May 16, 2012 at 2:37 pm

Posted in SAS
