Recently I noticed that an error in SAS does not automatically lead to wroung outcomes. Look what happens.
I created two tables that are sorted by a field (nummer).
proc import datafile = 'C:\Users\tomva\SynologyDrive\SAS\input.xlsx'
dbms=xlsx out=input replace;
proc means data=input sum noprint;
output out=sumtot sum=;
data sumtot(drop=_type_ _freq_); set sumtot(rename=(aantal=sum_aantal)); run;
proc sort data=sumtot;by nummer;run;
proc sort data=input;by nummer;run;
I then decided to join the tables on [nummer], the very field that is used to sort the records upon.
I did this in two steps.
proc datasets library=work noprint;
index create nummer;
set sumtot key=nummer;
aandeel = aantal/sum_aantal;
The output looked as I expected. But then I saw in the logfile something strange: a warning saying that several records were not matched due to absence of data:
nummer=1 aantal=15 teller=2 sum_aantal=25 aandeel=0.6 ERROR=1 IORC=1230015 N=2
nummer=2 aantal=13 teller=4 sum_aantal=43 aandeel=0.3023255814 ERROR=1 IORC=1230015 N=4
nummer=2 aantal=18 teller=5 sum_aantal=43 aandeel=0.4186046512 ERROR=1 IORC=1230015 N=5
This can be explained by the fact that we have an n to 1 join, whereby in the first match a record from the n side is read along with a record from the 1 side. In the next round, another record from the n-side is read but no other record in the 1 side is found that matches the record. Therefore the old values are retained and subsequently written as the run statement is encountered.