Vertically Concatenating Timetables of Different Sizes
As the title states, I wish to do just that. More specifically, I need to combine n timetables, each with one row, but varying column counts (Dimension counts). There is a nonzero intersection between each timetable, and so there are data being shared between each. However, some may also have data that other’s do not.
The goal here is to vertically concatenate each timetable so that all the data within the intersection are aligned by column/dimension name type. If there are data that are missing and so one timetable is smaller than the other, then that dimension/data type will be populated as a ‘NaN’ within the smaller timetable. In the example I give below, I say there are three timetables of lengths 1×40, 1×36 and 1×30. I wish to get a resulting timetable that is then 3×40.
Method 1:
Obviously doesn’t work as the # of columns are different.
Method 2:
I tried using a solution on another post that does exactly what I want, but with tables instead of timetables. This did not give the correct result as the output is a 3×215, which I don’t understand why. It also combines the dimension labels with “All” which doesn’t make sense to me as I specify “MergeKeys” as “True” which should leave the dimension labels alone according to the documentation. It comes close though in that it does concatenate the datetimes correctly and populates empty data/units with ‘NaN’s. I tried specifying the “Key” variables to align to, but because some timetables do not contain these variables and the Matlab function outerjoin wasn’t built to create missing variable columns that auto populate with NaNs when it encounters this, this does not work.
Method 3:
I tried using the synchronize function as this is specifically geared towards timetables. I get the exact same resulting timetable as Method 2. The synchronize documentation and "Combine Timetables and Synchronize Their Data" documentation give examples that do the exact same thing (Just that I have more columns and different dimension names) as my example from what I can see, so I am even more confused by why this does not produce the same result that I would like.
So I think I am just not employing Methods 2,3 correctly or I may have to create another for-loop prior to concatenation that creates the new variables in the smaller timetables and populates them with NaNs using indexing of the largest timetable.
%Create 3 timetables of different sizes and random dimension names with nonzero intersections
varTypes = {‘string’};
doub = ‘double’;
str = ‘string’;
szList = [40,36,30];
ind = {‘A’,’B’,’C’};
TimeDim = [datetime(‘2023-01-01 20:14:58’),datetime(‘2023-02-21 22:13:04’),datetime(‘2023-03-11 10:12:58’)];
for i = 1:length(szList)
for j = 1:szList(i)
varTypes{end+1} = doub;
varTypes{end+1} = str;
end
ArrayList.ind{i} = varTypes;
varTypes = {‘string’};
end
clear i j
for i = 1:length(szList)
ttSaved.ind{i} = timetable(‘Size’,[1 2*szList(i)+1],’VariableTypes’,ArrayList.ind{1,i},’RowTimes’,TimeDim(i));
ttSaved.ind{i} = renamevars(ttSaved.ind{i},’Var1′,’Exp_ID’);
ttSaved.ind{i}.Exp_ID = i;
end
clear i
%Label each dimenion and populate tables with data
for i = 1:length(ttSaved.ind)
choice = randperm(szList(i));
for j = 1:szList(i)
ttSaved.ind{i} = renamevars(ttSaved.ind{i},sprintf(‘Var%d’,2*j),sprintf(‘Data_%d’,choice(j)));
ttSaved.ind{i} = renamevars(ttSaved.ind{i},sprintf(‘Var%d’,2*j+1),sprintf(‘Units_%d’,choice(j)));
end
Names = ttSaved.ind{i}.Properties.VariableNames;
for k = 1:szList(i)
ttSaved.ind{i}.(Names{2*k}) = rand*100; %Add data
ttSaved.ind{i}.(Names{2*k+1}) = char(choice(k)); %Add Units
end
end
clear i j k
%Method 1
ttAll = [];
for i = 1:length(ttSaved.ind)
ttAll = [ttAll ; ttSaved.ind{i}];
end
clear i
%Method 2
ttAll = ttSaved.ind{1};
AllMeas = ttSaved.ind{1}.Properties.VariableNames;
for i = 2:length(ttSaved.ind)
ttAll = outerjoin(ttAll,ttSaved.ind{i},’Keys’,AllMeas,’MergeKeys’, true);
end
clear i
%Method 3
ttAll2 = ttSaved.ind{1};
for i = 2:length(ttSaved.ind)
ttAll2 = synchronize(ttAll2,ttSaved.ind{i});
end
clear i
Note: This post is somewhat of a continuation of a previous post, but since I marked the last as answered and this one is different enough and contains the same data structures to my actual data, I needed create a new question.As the title states, I wish to do just that. More specifically, I need to combine n timetables, each with one row, but varying column counts (Dimension counts). There is a nonzero intersection between each timetable, and so there are data being shared between each. However, some may also have data that other’s do not.
The goal here is to vertically concatenate each timetable so that all the data within the intersection are aligned by column/dimension name type. If there are data that are missing and so one timetable is smaller than the other, then that dimension/data type will be populated as a ‘NaN’ within the smaller timetable. In the example I give below, I say there are three timetables of lengths 1×40, 1×36 and 1×30. I wish to get a resulting timetable that is then 3×40.
Method 1:
Obviously doesn’t work as the # of columns are different.
Method 2:
I tried using a solution on another post that does exactly what I want, but with tables instead of timetables. This did not give the correct result as the output is a 3×215, which I don’t understand why. It also combines the dimension labels with “All” which doesn’t make sense to me as I specify “MergeKeys” as “True” which should leave the dimension labels alone according to the documentation. It comes close though in that it does concatenate the datetimes correctly and populates empty data/units with ‘NaN’s. I tried specifying the “Key” variables to align to, but because some timetables do not contain these variables and the Matlab function outerjoin wasn’t built to create missing variable columns that auto populate with NaNs when it encounters this, this does not work.
Method 3:
I tried using the synchronize function as this is specifically geared towards timetables. I get the exact same resulting timetable as Method 2. The synchronize documentation and "Combine Timetables and Synchronize Their Data" documentation give examples that do the exact same thing (Just that I have more columns and different dimension names) as my example from what I can see, so I am even more confused by why this does not produce the same result that I would like.
So I think I am just not employing Methods 2,3 correctly or I may have to create another for-loop prior to concatenation that creates the new variables in the smaller timetables and populates them with NaNs using indexing of the largest timetable.
%Create 3 timetables of different sizes and random dimension names with nonzero intersections
varTypes = {‘string’};
doub = ‘double’;
str = ‘string’;
szList = [40,36,30];
ind = {‘A’,’B’,’C’};
TimeDim = [datetime(‘2023-01-01 20:14:58’),datetime(‘2023-02-21 22:13:04’),datetime(‘2023-03-11 10:12:58’)];
for i = 1:length(szList)
for j = 1:szList(i)
varTypes{end+1} = doub;
varTypes{end+1} = str;
end
ArrayList.ind{i} = varTypes;
varTypes = {‘string’};
end
clear i j
for i = 1:length(szList)
ttSaved.ind{i} = timetable(‘Size’,[1 2*szList(i)+1],’VariableTypes’,ArrayList.ind{1,i},’RowTimes’,TimeDim(i));
ttSaved.ind{i} = renamevars(ttSaved.ind{i},’Var1′,’Exp_ID’);
ttSaved.ind{i}.Exp_ID = i;
end
clear i
%Label each dimenion and populate tables with data
for i = 1:length(ttSaved.ind)
choice = randperm(szList(i));
for j = 1:szList(i)
ttSaved.ind{i} = renamevars(ttSaved.ind{i},sprintf(‘Var%d’,2*j),sprintf(‘Data_%d’,choice(j)));
ttSaved.ind{i} = renamevars(ttSaved.ind{i},sprintf(‘Var%d’,2*j+1),sprintf(‘Units_%d’,choice(j)));
end
Names = ttSaved.ind{i}.Properties.VariableNames;
for k = 1:szList(i)
ttSaved.ind{i}.(Names{2*k}) = rand*100; %Add data
ttSaved.ind{i}.(Names{2*k+1}) = char(choice(k)); %Add Units
end
end
clear i j k
%Method 1
ttAll = [];
for i = 1:length(ttSaved.ind)
ttAll = [ttAll ; ttSaved.ind{i}];
end
clear i
%Method 2
ttAll = ttSaved.ind{1};
AllMeas = ttSaved.ind{1}.Properties.VariableNames;
for i = 2:length(ttSaved.ind)
ttAll = outerjoin(ttAll,ttSaved.ind{i},’Keys’,AllMeas,’MergeKeys’, true);
end
clear i
%Method 3
ttAll2 = ttSaved.ind{1};
for i = 2:length(ttSaved.ind)
ttAll2 = synchronize(ttAll2,ttSaved.ind{i});
end
clear i
Note: This post is somewhat of a continuation of a previous post, but since I marked the last as answered and this one is different enough and contains the same data structures to my actual data, I needed create a new question. As the title states, I wish to do just that. More specifically, I need to combine n timetables, each with one row, but varying column counts (Dimension counts). There is a nonzero intersection between each timetable, and so there are data being shared between each. However, some may also have data that other’s do not.
The goal here is to vertically concatenate each timetable so that all the data within the intersection are aligned by column/dimension name type. If there are data that are missing and so one timetable is smaller than the other, then that dimension/data type will be populated as a ‘NaN’ within the smaller timetable. In the example I give below, I say there are three timetables of lengths 1×40, 1×36 and 1×30. I wish to get a resulting timetable that is then 3×40.
Method 1:
Obviously doesn’t work as the # of columns are different.
Method 2:
I tried using a solution on another post that does exactly what I want, but with tables instead of timetables. This did not give the correct result as the output is a 3×215, which I don’t understand why. It also combines the dimension labels with “All” which doesn’t make sense to me as I specify “MergeKeys” as “True” which should leave the dimension labels alone according to the documentation. It comes close though in that it does concatenate the datetimes correctly and populates empty data/units with ‘NaN’s. I tried specifying the “Key” variables to align to, but because some timetables do not contain these variables and the Matlab function outerjoin wasn’t built to create missing variable columns that auto populate with NaNs when it encounters this, this does not work.
Method 3:
I tried using the synchronize function as this is specifically geared towards timetables. I get the exact same resulting timetable as Method 2. The synchronize documentation and "Combine Timetables and Synchronize Their Data" documentation give examples that do the exact same thing (Just that I have more columns and different dimension names) as my example from what I can see, so I am even more confused by why this does not produce the same result that I would like.
So I think I am just not employing Methods 2,3 correctly or I may have to create another for-loop prior to concatenation that creates the new variables in the smaller timetables and populates them with NaNs using indexing of the largest timetable.
%Create 3 timetables of different sizes and random dimension names with nonzero intersections
varTypes = {‘string’};
doub = ‘double’;
str = ‘string’;
szList = [40,36,30];
ind = {‘A’,’B’,’C’};
TimeDim = [datetime(‘2023-01-01 20:14:58’),datetime(‘2023-02-21 22:13:04’),datetime(‘2023-03-11 10:12:58’)];
for i = 1:length(szList)
for j = 1:szList(i)
varTypes{end+1} = doub;
varTypes{end+1} = str;
end
ArrayList.ind{i} = varTypes;
varTypes = {‘string’};
end
clear i j
for i = 1:length(szList)
ttSaved.ind{i} = timetable(‘Size’,[1 2*szList(i)+1],’VariableTypes’,ArrayList.ind{1,i},’RowTimes’,TimeDim(i));
ttSaved.ind{i} = renamevars(ttSaved.ind{i},’Var1′,’Exp_ID’);
ttSaved.ind{i}.Exp_ID = i;
end
clear i
%Label each dimenion and populate tables with data
for i = 1:length(ttSaved.ind)
choice = randperm(szList(i));
for j = 1:szList(i)
ttSaved.ind{i} = renamevars(ttSaved.ind{i},sprintf(‘Var%d’,2*j),sprintf(‘Data_%d’,choice(j)));
ttSaved.ind{i} = renamevars(ttSaved.ind{i},sprintf(‘Var%d’,2*j+1),sprintf(‘Units_%d’,choice(j)));
end
Names = ttSaved.ind{i}.Properties.VariableNames;
for k = 1:szList(i)
ttSaved.ind{i}.(Names{2*k}) = rand*100; %Add data
ttSaved.ind{i}.(Names{2*k+1}) = char(choice(k)); %Add Units
end
end
clear i j k
%Method 1
ttAll = [];
for i = 1:length(ttSaved.ind)
ttAll = [ttAll ; ttSaved.ind{i}];
end
clear i
%Method 2
ttAll = ttSaved.ind{1};
AllMeas = ttSaved.ind{1}.Properties.VariableNames;
for i = 2:length(ttSaved.ind)
ttAll = outerjoin(ttAll,ttSaved.ind{i},’Keys’,AllMeas,’MergeKeys’, true);
end
clear i
%Method 3
ttAll2 = ttSaved.ind{1};
for i = 2:length(ttSaved.ind)
ttAll2 = synchronize(ttAll2,ttSaved.ind{i});
end
clear i
Note: This post is somewhat of a continuation of a previous post, but since I marked the last as answered and this one is different enough and contains the same data structures to my actual data, I needed create a new question. timetables, concatenating timetables, concatenation, synchronize MATLAB Answers — New Questions