(HELP) Problem to solve a problem

#1
Good day lads:
First of all, thank you for taking the time to help a random person from the internet. I will try to be short and concise.
I have been given two datasets: one has qualitative data about students (student name and surname, date of birth, etc.) and the other has information about how each student performed on exam days (student surname, subject, mark, time taken to finish the test, etc.). At this point, I should note that not all students have taken tests yet, and some have taken more than one.
I have been asked to merge both datasets. I could do that with the SPSS merge files function, since I have a key variable (student name and surname in one dataset, surname in the other). The problem is that the key variable, even though it refers to the same person, is not written the same way in both datasets. That is, a case in the first dataset would be John Rimsey Volgan, and the same case in the second dataset would appear as John R.V. This makes the merge impossible for SPSS, as it cannot recognize both "names" as belonging to the same person. I hope I have explained this clearly.
What can I do?
I have thought about creating a new variable (ID_STUDENT, for example), but I do not know how to fill it in other than manually (since the cases are not ordered). Bear in mind that there are more than 1000 students and 3000 exams.
I would really appreciate any kind of help, since I am completely frustrated and have no one else to ask.
 

Karabiner

TS Contributor
#2
Are you asking how to create a variable in SPSS where John Rimsey Volgan
is transformed to John R. V., or Will Smith is transformed to Will S.?

With kind regards

Karabiner
 
#3
Not exactly.
I need SPSS to understand that the John Rimsey Volgan from one dataset is the John R.V. from the other dataset, so it can successfully merge both datasets.
Your proposal could make SPSS understand that, but how could I change more than 1000 student names other than manually?
Thank you very much for your reply; I really appreciate it.
With kind regards,
Kaltxetine.
 

Karabiner

TS Contributor
#4
I still don't know whether it is a general feature of the second Id that only the first part of a name
is kept and the other parts are abbreviated. If so, and John Rimsey Volgan = John R. V.
or Will Smith = Will S. are the only types to consider, then the following (admittedly crude)
syntax might work.

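* Assumes the full name is stored in the string variable id_1.
* Part1 is the first name, i.e. everything before the first space.
STRING Part1 (A32).
COMPUTE Part1=CHAR.SUBSTR(id_1,1,CHAR.INDEX(id_1,' ')-1).

* Rest is everything after the first space.
STRING Rest (A32).
COMPUTE Rest=CHAR.SUBSTR(id_1,CHAR.INDEX(id_1,' ')+1,30).

* Part2 is the first surname. If Rest is a single word, Part2 is all of Rest.
STRING Part2 (A32).
COMPUTE Part2=CHAR.SUBSTR(Rest,1,CHAR.INDEX(Rest,' ')-1).
IF (Part2='') Part2=Rest.

* Part3 is the second surname, or empty if there is none.
STRING Part3 (A32).
COMPUTE Part3=CHAR.SUBSTR(Rest,CHAR.INDEX(Rest,' ')+1,30).
IF (Part3=Rest) Part3=''.

* Build the abbreviated Id and append the second initial only if Part3 exists.
* This yields John R. V. with a space between the initials.
* Change the separators if the other file writes John R.V. instead.
STRING Id_new (A40).
COMPUTE Id_new=CONCAT(RTRIM(Part1)," ",CHAR.SUBSTR(Part2,1,1),".").
IF (Part3<>'') Id_new=CONCAT(RTRIM(Id_new)," ",CHAR.SUBSTR(Part3,1,1),".").
EXECUTE.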
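
Once Id_new exists, the merge itself could look roughly like the sketch below. This is only a sketch under assumptions: the dataset names students and exams and the variable name id_2 (the abbreviated name in the exam file) are made up for illustration, and the abbreviated names are assumed to be written exactly the same way in both files.

```spss
* Assumption: the student file is open as dataset students and contains Id_new.
* Assumption: the exam file is open as dataset exams.
* Assumption: the exam file holds the abbreviated name in id_2 (hypothetical).

* Give the key the same name and width in both files, then sort by it.
DATASET ACTIVATE exams.
RENAME VARIABLES (id_2 = Id_new).
ALTER TYPE Id_new (A40).
SORT CASES BY Id_new.

DATASET ACTIVATE students.
ALTER TYPE Id_new (A40).
SORT CASES BY Id_new.

* One row per exam. Student details are looked up from the student table.
DATASET ACTIVATE exams.
MATCH FILES /FILE=* /TABLE='students' /BY Id_new.
EXECUTE.
```

With /TABLE, the result keeps one row per exam, so students who have not taken any test yet will not appear in the merged file. It is also worth checking the student file for duplicate Id_new values first, since two different students can end up with the same abbreviation.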

With kind regards

Karabiner