scale/normalize parameter vector for optimization
In my optimization problem, the parameters naturally vary by several orders of magnitude because they represent interpolation values of a B-spline function F = spapi(k,x,y). For instance, one interpolation value may be close to zero while another is several orders of magnitude larger. The step tolerance is a single constant and does not account for these differences in scale between the parameters. In my case, the objective function responds differently to changes in the "smaller" y's than to changes in the larger y's, so parameter scaling or normalization could be beneficial.
1. In the machine learning literature, I often encounter [0,1] scaling as a common technique. Would this approach be suitable for my problem as well? Or can you suggest more appropriate scaling techniques, given that the parameters represent interpolation values?
2. This might be a separate question, but it also relates to parameter scaling/transformation. My Jacobian matrix J (and hence the Hessian approximation J^T*J) tends to be poorly conditioned. I have considered switching to a different basis for the parameter vector. So far, my parameters are multipliers for the unit vectors $e_i$: $y = \sum_i y_i e_i$. I vaguely recall a discussion where a teacher suggested using the normalized eigenvectors of the Hessian in place of the unit vectors: $y = \sum_i \alpha_i v_i$, where $v_i$ are the (normalized) eigenvectors of the Hessian and $\alpha_i$ are the new parameters.
My questions are: In theory, would parameterization in terms of the eigenvectors be effective in improving the conditioning of the problem? If so, is this approach compatible with the presence of bounds and linear constraints on the parameters?
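To make the two ideas concrete, here is a short sketch of how each transformation acts on the Hessian approximation and on the constraints. The notation is my own and not from any toolbox: D is a diagonal matrix of assumed typical magnitudes d_i > 0, z are the scaled parameters, V = [v_1, ..., v_n] collects the eigenvectors, Λ holds the eigenvalues, and lb, ub, A, b stand for generic bounds and linear inequality constraints. It assumes H ≈ J^T J is symmetric and, for the last scaling step, that all eigenvalues are positive.

```latex
\begin{align*}
% Diagonal scaling: y = D z, with D = diag(d_1, ..., d_n), d_i > 0
J_z &= J D, & H_z &\approx J_z^{T} J_z = D\, J^{T} J\, D, \\
lb \le y \le ub \;&\Longleftrightarrow\; D^{-1} lb \le z \le D^{-1} ub, &
A y \le b \;&\Longleftrightarrow\; (A D)\, z \le b. \\[1ex]
% Eigenvector basis: H = V \Lambda V^T, V^T V = I, y = V \alpha
H_\alpha &= V^{T} H V = \Lambda, &
\kappa_2(H_\alpha) &= \frac{\lambda_{\max}}{\lambda_{\min}} = \kappa_2(H), \\
% Rotation plus scaling: y = V \Lambda^{-1/2} \beta (requires \lambda_i > 0)
H_\beta &= \Lambda^{-1/2} V^{T} H V \Lambda^{-1/2} = I, &
lb \le V \alpha \le ub,\quad & (A V)\, \alpha \le b.
\end{align*}
```

Read this way, diagonal scaling keeps simple bounds as simple bounds and only rescales the columns of A. A pure rotation into the eigenvector basis leaves the 2-norm condition number unchanged, so any improvement would have to come from the additional Λ^{-1/2} scaling, and that holds only at the point where H was evaluated. In the rotated basis the original bounds turn into general linear inequalities in α, which a solver would have to treat as linear constraints rather than bounds.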
Thank you!
optimization, interpolation, scaling, fmincon MATLAB Answers — New Questions