Dealing with missing values in a dataset

nlper

My dataset contains a 1k x 1k matrix, and each point has a value ranging from 1 to 100. A value of 1 means that the point's value is lost.

To recover the lost values I have another dataset file built from the history of previous datasets. This file maps the values 1 to 100 to a state value.
It contains 4k rows with two columns: the first holds a value from 1 to 100 and the second holds the state value. How can this second dataset be used to restore the missing values in the first dataset?

My suggested approach:

From dataset 2, compute the average state value for each of the values 1 to 100, and replace every value in dataset 1 with its averaged state value. Then take the average of all values in dataset 1 and put it in place of each missing point (a point with value 1). Is this a correct approach?
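To make the steps above concrete, here is a minimal sketch of what I have in mind. The function names and the `missing_marker` parameter are my own placeholders, and the tiny matrix at the bottom is made up for illustration (its pairs are taken from the dataset 2 sample). I pass the missing marker in as an argument because I describe it as 1 above while the sample row shows 0 for the empty points.

```python
from collections import defaultdict
from statistics import mean


def mean_state_per_code(pairs):
    """Average the state values seen for each code in dataset 2."""
    states = defaultdict(list)
    for code, state in pairs:
        states[code].append(state)
    return {code: mean(vals) for code, vals in states.items()}


def impute(matrix, code_to_state, missing_marker):
    """Map codes to averaged states, then fill the gaps with the overall mean.

    Assumption: codes that never appear in dataset 2 are treated as missing too.
    Raises statistics.StatisticsError if no point at all can be mapped.
    """
    mapped = [
        [None if v == missing_marker else code_to_state.get(v) for v in row]
        for row in matrix
    ]
    known = [s for row in mapped for s in row if s is not None]
    fill = mean(known)  # one global average for every missing point
    return [[fill if s is None else s for s in row] for row in mapped]


# Toy data for illustration only, not my real files.
pairs = [(13, 1), (14, 1), (16, 1), (42, 2), (63, 3)]
matrix = [[0, 13, 42], [63, 0, 16]]

lookup = mean_state_per_code(pairs)
restored = impute(matrix, lookup, missing_marker=0)
print(restored)
```

Note that every missing point gets the same value this way, which is part of why I am unsure the approach is correct.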

Sample:
One of the rows in dataset 1
Code:

0 0 13 14 0 13 0 0 16 0 14 13 0 0 0 14 15 14 13 13 13 16 0 0 13 16 15 0 13 0 0 13 14 15 16 16 15 0 13 13 14 14 14 15 13 14 0 15 16 15 0 16 14 15 13 16 0 0 0 0 0 0 0 16 14 15 14 14 0 15 16 0 14 0 13 16 0 0 0 14 16 13 0 15 0 13 15 16 0 15 15 0 15 14 16 0 15 16 13 0 0 16 0 0 13 0 14 14 0 13 0 0 13 0 15 14 13 14 0 16 0 0 0 0 13 15 15 0 13 0 0 0 15 0 16 0 13 0 16 14 13 15 13 0 0 16 0 0 15 0 15 16 13 13 13 14 15 16 16 14 13 14 0 0 15 14 15 0 0 0 14 15 0 0 16 13 16 15 14 16 0 0 13 0 16 14 0 0 0 0 13 16 16 0 15 16 14 0 15 16 0 0 14 14 0 0 16 13 13 15 0 0 0 13 15 0 0 0 13 16 0 15 0 0 14 13 0 14 13 0 0 13 0 13 16 0 16 15 13 14 14 16 0 0 0 0 0 16 0 15 13 15 0 16 0 0 15 15 0 16 0 13 0 13 0 15 14 0 16 16 14 0 0 16 16 15 14 13 0 0 16 14 15 0 13 14 16 15 13 16 15 16 0 16 13 0 13 15 14 0 16 13 0 0 0 15 0 15 0 13 13 0 14 0 0 13 14 15 15 14 15 0 0 0 14 0 0 14 16 15 13 0 15 0 0 0 15 0 0 0 0 0 13 0 16 14 0 0 0 13 0 0 15 0 16 13 15 14 15 15 0 15 13 0 15 13 16 14 16 0 0 13 13 16 0 15 16 14 13 13 13 0 15 0 14 13 0 0 14 0 0 15 0 16 14 15 0 0 13 0 0 0 16 16 0 13 16 15 0 0 16 15 15 16 14 16 14 16 16 14 15 15 14 14 16 15 16 13 0 0 0 14 16 0 15 16 13 14 13 0 0 16 14 15 0 14 15 14 16 13 13 0 14 15 0 0 16 13 0 14 16 16 15 15 13 15 13 15 0 0 0 14 0 0 0 0 13 13 0 0 14 14 13 0 16 13 0 13 16 13 14 14 0 16 0 13 15 15 15 15 15 0 13 16 14 0 16 15 15 13 15 15 13 13 0 14 15 16 13 16 0 16 0 13 13 16 0 0 14 0 16 15 13 14 13 0 13 16 0 0 15 0 13 15 14 0 0 13 0 0 16 16 0 15 15 16 13 0 13 13 16 14 15 0 13 13 0 16 15 0 14 16 16 15 16 0 14 15 15 14 14 0 15 0 15 14 0 15 16 16 13 13 0 16 0 13 13 14 0 0 0 13 0 15 14 13 13 14 15 15 0 0 0 0 13 0 16 16 0 15 0 15 0 0 15 13 16 14 0 14 0 15 13 14 13 16 15 16 14 14 13 15 16 0 0 0 0 14 0 15 16 0 0 14 0 13 15 13 0 14 0 16 0 0 0 0 0 15 13 15 0 14 0 14 0 16 0 14 13 15 0 15 15 14 15 16 15 14 14 14 0 15 0 0 16 0 13 13 14 15 14 14 0 13 13 0 15 0 15 14 0 0 15 15 0 14 15 0 16 0 14 0 13 15 0 0 15 14 0 15 0 14 14 16 16 0 15 0 13 0 16 15 0 0 16 14 0 15 0 0 0 16 0 16 13 16 15 15 13 0 
16 16 13 0 14 14 16 15 13 14 16 0 13 0 0 0 0 13 16 0 0 0 13 13 15 0 0 0 0 16 0 16 16 14 0 13 16 0 16 0 15 16 0 16 13 0 15 0 13 0 0 16 16 0 16 14 14 0 16 14 16 0 0 14 0 0 15 13 0 14 13 0 15 0 15 0 16 15 15 0 0 13 0 0 13 14 15 15 0 14 13 16 15 15 13 0 13 0 0 15 16 15 0 13 0 0 13 13 0 14 14 0 0 16 0 0 16 16 13 15 0 0 14 14 0 15 15 0 15 15 0 0 0 0 0 16 14 14 14 0 0 13 0 13 0 0 13 15 13 0 0 13 16 0 0 15 0 13 16 0 0 16 16 15 16 0 0 16 0 14 0 16 14 0 16 13 0 0 14 15 16 13 14 16 14 14 13 0 16 15 0 0 16 14 14 13 0 15 0 14 13 15 0 0 14 16 15 0 14 0 0 15 0 15 16 16 13 13 13 16 15 16 15 0 16 0 0 14 13 16 16 16 14 0 16 15 0 0 13 14
Sample points in dataset 2:
Code:
11 1
19 1
42 2
16 1
63 3
14 1
11 1
83 4
63 3
11 1
13 1
17 1
92 4
86 2
61 3
74 3
17 1
60 3
75 3
43 2
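
Averaging in dataset 2 only changes anything if the same value maps to more than one state. A rough way to check this is to group the pairs by their first column. The sketch below runs on the sample points above only, not on the full 4k rows, and `states_by_code` is just a name I made up.

```python
from collections import defaultdict

# The sample points from dataset 2 shown above.
SAMPLE = """\
11 1
19 1
42 2
16 1
63 3
14 1
11 1
83 4
63 3
11 1
13 1
17 1
92 4
86 2
61 3
74 3
17 1
60 3
75 3
43 2
"""


def states_by_code(text):
    """Collect the set of distinct states observed for each code."""
    seen = defaultdict(set)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        code, state = int(parts[0]), int(parts[1])
        seen[code].add(state)
    return dict(seen)


by_code = states_by_code(SAMPLE)
# Codes that map to more than one state are the ones where averaging matters.
ambiguous = {code: states for code, states in by_code.items() if len(states) > 1}
print(ambiguous)
```

If the result is empty on the full file as well, the averaging step reduces to a plain lookup from value to state.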