Memory optimization and code speed up

2025-01-10
Matlab

Hi! I have a function that computes what is called the Fuzzy Rand Index. It is a way to compare a fuzzy (soft) or probabilistic data partition (a clustering result) with a reference hard partition. The math behind it is basically an algebra of t-norms and t-conorms (s-norms). Among several choices, we can use min() as the t-norm and max() as the s-norm, or the algebraic product and probabilistic OR (probor) as the t-norm and s-norm. In the present case I'm using min() and max().
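Just to make the two norm pairs concrete, here is a tiny illustrative snippet (the numbers are arbitrary, and probor is written out explicitly rather than relying on the Fuzzy Logic Toolbox):
a = 0.3; b = 0.6;
t_min    = min(a,b);      % 0.3  -- min t-norm
s_max    = max(a,b);      % 0.6  -- max s-norm
t_prod   = a*b;           % 0.18 -- algebraic product t-norm
s_probor = a + b - a*b;   % 0.72 -- probabilistic OR s-norm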
The input data are 2 partition matrices (N x d), where N is the number of data points and d the number of clusters. Each element h_ji of a partition matrix is the probability that data point j belongs to cluster i. In general, N >> d (e.g. N = 20000 to 50000 points, while d is typically 2 to 5 clusters).
The math behind the index is such that I have to compute a sort of "pairwise min" between each data point and all the others within the same column (min(h_ji, h_ki)), so I get N*(N-1)/2 values for each of the d columns. To these d arrays of length N*(N-1)/2 I must then apply the max s-norm element-wise, ending up with a quantity V, which is an array of length N*(N-1)/2.
I must also compute the same "pairwise min" t-norm between every data point and every other one, but mixing the columns, i.e. min(h_ji, h_kp) with i ~= p. This gives factorial(d)/factorial(d-2) arrays of length N*(N-1)/2, and to these I must again apply the max s-norm element-wise to end up with a quantity X, also an array of length N*(N-1)/2.
What I call "pairwise min" works much like pdist does: it compares each point with the following ones, but not with the previous ones.
Later, this V and X from one partition matrix are combined with V0 and X0 from the reference partition matrix using the min t-norm.
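For reference, the definitions above can be written as a brute-force double loop. This is only a small sketch for checking correctness on a tiny U, not meant to be fast; the pairs are taken in the plain (j,k), j < k order, which differs from the ordering produced by the circshift version further below:
U = rand(6,3);                    % tiny example partition matrix
[N,d] = size(U);
npairs = N*(N-1)/2;
V = zeros(npairs,1);
X = zeros(npairs,1);
idx = 0;
for j = 1:N-1
    for k = j+1:N
        idx = idx + 1;
        V(idx) = max(min(U(j,:),U(k,:)));   % same-column t-norm, then max s-norm
        M = min(U(j,:).',U(k,:));           % M(i,p) = min(h_ji, h_kp) for all column pairs
        M(1:d+1:end) = -Inf;                % discard the diagonal (i == p)
        X(idx) = max(M(:));                 % mixed-column s-norm
    end
end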
I have implemented this in 3 different ways.
The slowest one, and the one with the highest memory consumption, is the most compact form:
% you can use as an example U = rand(20000,4);
U = rand(30000,4); % just for test
[N,d] = size(U); % U is a partition matrix
comb = ones(d,2).*(1:d)'; % same-column pairs (i,i)
V = max(cell2mat(arrayfun(@(k) cell2mat(arrayfun(@(j) pdist2(U((j+1):N,comb(k,1)),U(j,comb(k,2)),@min),(1:(N-1)).','UniformOutput',false)),1:size(comb,1),'UniformOutput',false)).');
comb = nchoosek(1:d,2);
comb = [comb;fliplr(comb)]; % all ordered column pairs (i,p), i ~= p
X = max(cell2mat(arrayfun(@(k) cell2mat(arrayfun(@(j) pdist2(U((j+1):N,comb(k,1)),U(j,comb(k,2)),@min),(1:(N-1)).','UniformOutput',false)),1:size(comb,1),'UniformOutput',false)).');
In an intermediate form, I have tried to use parallel code like this:
% you can use as an example U = rand(20000,4);
U = rand(30000,4); % just for test
[N,d] = size(U); % U is a partition matrix
comb = nchoosek(1:d,2);
ncomb = size(comb,1);

X = cell(N-1,1);
V = cell(N-1,1);
parfor j = 1:N-1
    tmp = zeros(N-j,1);
    for k = 1:ncomb % mixed-column pairs, both orders
        tmp = max(tmp, max(min(U(j,comb(k,1)), U((j+1):end,comb(k,2))), min(U(j,comb(k,2)), U((j+1):end,comb(k,1)))));
    end
    X{j} = tmp;
    tmp = zeros(N-j,1);
    for k = 1:d % same-column pairs
        tmp = max(tmp, min(U(j,k), U((j+1):end,k)));
    end
    V{j} = tmp;
end
X2 = cell2mat(X);
V2 = cell2mat(V);
Finally, the version that gave me the best results in terms of speed and memory usage is the following, where I used a trick with circshift and switched to the single type, since double precision is not important to me:
% you can use as an example A = rand(20000,4);
A = rand(30000,4); % just for test
[N,d] = size(A); % A is a partition matrix
NC = fix(N/2);
numelements = N*(N-1)/2;
V = zeros(numelements,1,'single');
X = zeros(numelements,1,'single');
temp = zeros(N,1,'single');
for nc1 = 1:NC-1
    Ac = circshift(A,-nc1,1); % pair each row j with row j+nc1 (wrapping around)
    V((nc1-1)*N + 1 : nc1*N) = max(min(A,Ac),[],2);
    X((nc1-1)*N + 1 : nc1*N) = func1(A,Ac,temp,d);
    temp(:) = 0;
end

if N/2 - NC == 0 % N even: the last shift would visit each pair twice, so keep only half the rows
    V((NC-1)*N + 1 : end) = max(min(A(1:N/2,:),A(N/2+1:end,:)),[],2);
    X((NC-1)*N + 1 : end) = func2(A,N,d,temp(1:N/2));
else % N odd: the last shift still yields N distinct pairs
    Ac = circshift(A,-NC,1);
    V((NC-1)*N + 1 : end) = max(min(A,Ac),[],2);
    X((NC-1)*N + 1 : end) = func1(A,Ac,temp,d);
end

function temp = func1(A,Ac,temp,d)
% mixed-column part: shift the columns of Ac so that column i meets column i+nc2
for nc2 = 1:d-1
    temp = max(temp,max(min(A,circshift(Ac,-nc2,2)),[],2));
end
end
function temp = func2(A,N,d,temp)
% same as func1, restricted to the first half of the rows (even-N case)
for nc2 = 1:d-1
    temp = max(temp,max(min(A(1:N/2,:),circshift(A(N/2+1:end,:),-nc2,2)),[],2));
end
end
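To make the circshift bookkeeping explicit, here is a small self-check (added purely for illustration): shifting by s pairs row j with row mod(j-1+s,N)+1, so the shifts s = 1..fix(N/2) enumerate every unordered pair exactly once, except that for even N the last shift visits each pair twice, which is why only the first N/2 rows are kept in that branch.
N = 6; % small even example
pairs = [];
for s = 1:fix(N/2)
    rows = (1:N).';
    partner = mod(rows-1+s, N) + 1;  % row met by circshift(A,-s,1)
    if s == N/2 && mod(N,2) == 0
        rows = rows(1:N/2); partner = partner(1:N/2); % avoid double counting
    end
    pairs = [pairs; sort([rows partner],2)]; %#ok<AGROW>
end
assert(size(unique(pairs,'rows'),1) == N*(N-1)/2) % all N*(N-1)/2 pairs, each exactly once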
The problem is that this calculation has to be done many times, to compare a lot of different clustering results against the reference one, and for datasets with N > 20000 it takes a long time. Do you think there is room for more optimization, to make this faster and more memory efficient than the 3rd version with the circshifts?
Also, since this is done many times (2300 times for each dataset), this function is called inside a parfor, so each worker executes the function. But min and max are themselves multi-threaded functions. My approach was to set up the parpool with 6 workers (12 hardware threads) and leave 2 cores (4 hardware threads) out of the pool, hoping that those 4 threads would be used for the min()/max() multi-threading. Is that right? I thought that using all cores for the parpool would disable the multi-threading of min() and max() and make things slower.
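For reference, this is roughly the setup I mean (a minimal sketch only; computeFuzzyRandIndex stands for the circshift version above wrapped in a function, and Ulist, V0, X0 are placeholders for the candidate partitions and the reference quantities):
pool = parpool(6);        % 6 process-based workers, leaving 2 physical cores free
nRuns = 2300;
scores = zeros(nRuns,1);
parfor r = 1:nRuns
    % each worker compares one clustering result against the reference partition
    scores(r) = computeFuzzyRandIndex(Ulist{r}, V0, X0);
end
delete(pool);
Part of my doubt is that process-based workers default to one computational thread each, so I am not sure whether min()/max() called inside a worker can actually make use of the leftover threads.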


Tags: matlab, multi-thread, parfor, speed up, memory usage

