MATLAB's inefficient copy-on-write implementation

MATLAB’s copy-on-write memory management seems to have a serious defect, which I think is the reason behind the abysmal performance of subsasgn overloading. (The same problem probably occurs with parenAssign in the new R2021b RedefinesParen class; I haven’t experimented with it yet.) Normally, an array assignment like b = a simply does a pointer copy; the array data is not copied until b is modified (e.g. b(1) = 1). After that, subsequent modifications of b (e.g. b(2) = 1) do not copy the full array; they modify it in place as long as the reference count is 1. For example,

clear, a = zeros(1e8,1);
memory % 2764 MB used by MATLAB
b = a;
memory % 2764 MB
tic, b(1) = 1; toc, memory % 0.329099 seconds, 3540 MB
tic, b(2) = 1; toc, memory % 0.000123 seconds, 3541 MB

However, the benefit of copy-on-write is lost when the variable is changed in a function, e.g.

% test.m
function x = test(x)
x(1) = 1;

In this case, the x reference count is apparently incremented in test before the assignment is made, so this will always result in a full array copy. For example,

clear, a = zeros(1e8,1);
tic, a = test(a); toc % 0.337475 seconds
tic, a = test(a); toc % 0.310373 seconds

To see what’s happening with copy-on-write, test.m is modified as follows:

function x = test(x)
memory
x(1) = 1;
memory
return

The array modification inside the function forces a full array copy, even though the original array is immediately discarded:

clear, a = zeros(1e8,1);
memory % 2748 MB
a = test(a); % test reports 2748 MB before the assignment, 3503 MB after
memory % 2740 MB
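
One thing worth checking before concluding that the copy is unavoidable: as far as I know, MATLAB has an in-place optimization for functions declared as function x = f(x) and called as x = f(x), but it is reported to apply only when the call itself is made from inside another function, not from the command prompt as in the timings above. The sketch below is a way to test that on your own release. The file name inplace_demo.m and the helper bump are hypothetical names, not from the original post, and no particular timing is claimed.

```matlab
% inplace_demo.m  (hypothetical file name, not from the original post)
function inplace_demo()
    a = zeros(1e8, 1);               % same 800 MB array as in the examples above

    % Same variable on both sides of the call, and the call is made from
    % inside a function rather than from the command prompt.
    tic; a = bump(a); t1 = toc;
    tic; a = bump(a); t2 = toc;

    % Compare these with the ~0.3 s per call measured at the prompt.
    fprintf('call 1: %.6f s, call 2: %.6f s\n', t1, t2);
end

function x = bump(x)                 % hypothetical helper, same shape as test.m
    x(1) = x(1) + 1;
end
```

If both calls are fast, the copy seen above comes from calling at the prompt rather than from the function boundary itself. Whether the same optimization reaches an overloaded subsasgn is a separate question that this sketch does not answer.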

I would think this problem could be easily avoided by treating any variable that appears as both an input and output argument in a function (e.g. function x = test(x)) as a reference variable, i.e. its reference count is not incremented on entering the function and is not decremented upon exiting. If the function is called with different input and output arguments, e.g. y = test(x), then the interpreter would implement this as y = x; y = test(y).
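
To make the proposed rewrite concrete, here is a small sketch showing that the two forms already give the same result under today's value semantics, so the change would affect only how many copies are made. The names rewrite_demo and test_local are hypothetical; test_local stands in for test.m.

```matlab
% rewrite_demo.m  (hypothetical file name, not from the original post)
function rewrite_demo()
    x = zeros(1e6, 1);

    % What the caller writes today:
    y1 = test_local(x);

    % What the proposal would have the interpreter do instead:
    y2 = x;                    % pointer copy only (copy-on-write)
    y2 = test_local(y2);       % same variable as input and output

    assert(isequal(y1, y2));   % both forms produce the same array
    assert(x(1) == 0);         % the caller's x is unchanged in both forms
end

function x = test_local(x)     % stand-in for test.m from the post
    x(1) = 1;
end
```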

Is there any particular reason why MATLAB does not or cannot do this? Many applications, such as subsasgn overloading, could see a big performance boost if this problem were fixed.
