Problem with estimating PDF (ksdensity)
Attached are two data sets, and I need to estimate the probability density function (PDF) of each.
The attached variable detection has 32 elements, given as percentages (between 0 and 100 %), and the variable in_process has 96 elements, given as a number of days (between 0 and 212 days).
I want to estimate the PDF of both variables. For that I am using ksdensity with the 'support' option, because I don’t want the values on the x-axis to be negative or above the upper bound (100 % for detection, 212 days for in_process).
To estimate the PDF of detection, I use the following code:
detection(detection==0)=0.0001; %data must be between the support boundaries
detection(detection==100)=99.9999;
pts=0:0.1:100;
[f,x]=ksdensity(detection,pts,'support',[0,100]);
plot(x,f);
To estimate the PDF of in_process, I use analogous code:
in_process(in_process==0)=1;
in_process(in_process==212)=211;
pts=0:0.1:212;
[f,x]=ksdensity(in_process,pts,'support',[0 212]);
plot(x,f);
My problem is that the first estimate looks good (it has a shape similar to the histogram of detection and to the PDF produced without the 'support' option), while the second one looks bad (it shows artificial bumps at the beginning and at the end of the interval).
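For reference, the comparison described above can be sketched roughly as follows for in_process. This is only a sketch: the stand-in data that is generated when in_process is not in the workspace is synthetic (an assumption, not the attached data), and the names f_unb and f_sup are just illustrative.

```matlab
% Stand-in data, used ONLY if the attached variable is not loaded.
% This is synthetic (an assumption), not the attached in_process data.
if ~exist('in_process','var')
    rng(0);                                   % reproducible stand-in
    in_process = round(-40*log(rand(96,1)));  % 96 skewed, non-negative values
    in_process = min(in_process,212);         % keep them within 0..212 days
end

% Same boundary adjustment as in the code above
in_process(in_process==0)   = 1;
in_process(in_process==212) = 211;

pts   = 0:0.1:212;
f_unb = ksdensity(in_process,pts);                     % no support restriction
f_sup = ksdensity(in_process,pts,'support',[0 212]);   % bounded support

% Histogram scaled as a density, with both estimates on top
histogram(in_process,'Normalization','pdf');
hold on
plot(pts,f_unb,pts,f_sup);
hold off
legend('histogram','no support','support [0 212]');
xlabel('days');
ylabel('estimated density');
```

Plotting both curves over the density-scaled histogram makes it easier to see whether the bumps near 0 and 212 come from the data or only appear once the 'support' option is set.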
I don’t understand why this is happening. Why does the first one look good and the second one doesn’t?
Is this even a good approach, and does it make sense to estimate the PDF of these variables?
Thank you for your help.