Your thoughts serializing arrays to manage state in excel (by way of inventory tracking)
I’m curious what you all think of this approach for managing state in excel with minimal computational impact. I’m sure after a few thousand different items to track, performance may be implicated, but the formulas initially feel like they will create static values that don’t have to be recalculated with any frequency. Skip towards the end if you just want to see the formulas.
I think one of the important ideas here is that this formula can be used within a table column the same as any other formula that doesn’t return an array, the difference being that calculations are not lost between steps of the row and the only information a calculation needs is the limited state info in the prior state.
Any comments welcome. If you’ve found a better way to do this, I’d be curious to know. I am limiting myself to using only native excel formulas/functionality plus excel labs.
P.S. The formulas have one or two undeveloped ideas because they are not essential. For instance, there is one inline lambda that uses a switch to return a comparison function for data masking that has only implemented the equals and does not equal operators.
**********Description********************
General idea is such – write a series of formulas that serialize and deserialize arrays and combine it with an efficient mechanism to access the most recent state. The example uses the idea of lots of items where each lot has a quantity and cost – transactions can either increase the quantity (and cost) or decrease it. The state being managed is the existent lots, their quantities, and their costs.
columns are simple:
control; date; lot; qty; cost; state
So here is the overall structure:
table formula checks control column to see if the current row impacts the state –
if it doesn’t, return nothing,
if it does, move on
if it is the first control number, just serialize the row data
if it isn’t, find the prior state and update it
The way it updates the state is by
1) deserializing the prior state
2) determining whether the prior state already included the current row’s item
3) updating that value if so or just returning the current rows values,
4) filtering the prior state to remove the current row’s item
5) vstacking the filtered array onto the current row’s entry (potentially updated), and
6) serializing the current state.
Using 40 rows as an example, I might end with a serialized state that looks like this:
lot_H,65,650;lot_I,100,1000;lot_J,45,450;lot_K,70,700;lot_L,30,300;lot_A,135,1350;lot_B,150,1500;lot_C,145,1450;lot_D,90,900;lot_E,120,1200;lot_F,80,800;lot_G,170,1700
and deserializes to this:
lot_A 135 1350
lot_B 150 1500
lot_C 145 1450
lot_D 90 900
lot_E 120 1200
lot_F 80 800
lot_G 170 1700
lot_H 65 650
lot_I 100 1000
lot_J 45 450
lot_K 70 700
lot_L 30 300
using this function: SORT(array.deserialize_string(I57))
table formula:
=IF(
TRIM([@lot]) = “”,
“”,
IF(
[@control] = 1,
array.serialize_array(Table68[@[lot]:[cost]]),
LET(
current_trans, Table68[@[lot]:[cost]],
last_state_ctrl, MAX(
FILTER([control], ([control] < [@control]) * ([lot] <> “”) * (2025 <= [@date])),
1
),
last_state, XLOOKUP(last_state_ctrl, [control], [state], “”, 0),
array.serialize_array(array.track_state(array.deserialize_string(last_state), current_trans))
)
)
)
module formulas:
deserialize_string = LAMBDA(state_string, TEXTSPLIT(state_string, “,”, “;”, FALSE));
serialize_row = LAMBDA(row, TEXTJOIN(“,”, FALSE, row));
serialize_array = LAMBDA(state_array,
REDUCE(
,
BYROW(state_array, LAMBDA(r, serialize_row(r))),
LAMBDA(acc, current_row, acc & “;” & current_row)
)
);
serialized_wrapper = lambda(state_string, func_to_apply, serialize_array(func_to_apply(deserialize_string(state_string))));
replace_row =
lambda(
new_row,
original_array,
[replacement_col_idx],
let(
idx_col, if(isomitted(replacement_col_idx), 1, replacement_col_idx),
data_mask, choosecols(new_row, idx_col) <> choosecols(original_array,idx_col),
filtered_array, filter(original_array, data_mask),
hstack(filtered_array, replace_row)
)
);
columnMask =
lambda(
value,
comp_to_array,
[comparison_operator],
[column_idx],
[value_idx],
let(
operator, if(isomitted(comparison_operator),”=”,comparison_operator),
comp_func, lambda(operator, lambda(value, comp_array, switch(operator, “=”, value=comp_array, “<>”, value<>comp_array,”Not yet implimented”))),
col_idx, if(isomitted(column_idx), 1, column_idx),
val_idx, if(isomitted(value_idx), 1, value_idx),
adj_value, if(COLUMNS(value)>1, CHOOSECOLS(value, val_idx), value),
comp_col, choosecols(comp_to_array, col_idx),
comp_func(operator)(adj_value, comp_col)
)
);
track_state =
lambda(
last_state_string, current_trans,
LET(
lots, array.deserialize_string(last_state_string),
lot_to_add, INDEX(current_trans, 1, 1),
prior_lot_values, IFERROR(
FILTER(lots, array.columnMask(lot_to_add, lots), ERROR.TYPE(3)),
ERROR.TYPE(3)
),
filtered_array, FILTER(lots, array.columnMask(lot_to_add, lots, “<>”), ERROR.TYPE(3)),
replacement_line, IF(
ISNA(prior_lot_values),
current_trans,
HSTACK(
lot_to_add,
(INDEX(prior_lot_values, 1, 2) + INDEX(current_trans, 1, 2)),
(INDEX(prior_lot_values, 1, 3) + INDEX(current_trans, 1, 3))
)
),
final_array, IF(
OR(TRIM(last_state_string) = “”),
replacement_line,
IF(OR(ISNA(filtered_array)), replacement_line, VSTACK(filtered_array, replacement_line))
),
final_array
)
);
I’m curious what you all think of this approach for managing state in excel with minimal computational impact. I’m sure after a few thousand different items to track, performance may be implicated, but the formulas initially feel like they will create static values that don’t have to be recalculated with any frequency. Skip towards the end if you just want to see the formulas. I think one of the important ideas here is that this formula can be used within a table column the same as any other formula that doesn’t return an array, the difference being that calculations are not lost between steps of the row and the only information a calculation needs is the limited state info in the prior state.Any comments welcome. If you’ve found a better way to do this, I’d be curious to know. I am limiting myself to using only native excel formulas/functionality plus excel labs. P.S. The formulas have one or two undeveloped ideas because they are not essential. For instance, there is one inline lambda that uses a switch to return a comparison function for data masking that has only implemented the equals and does not equal operators. **********Description******************** General idea is such – write a series of formulas that serialize and deserialize arrays and combine it with an efficient mechanism to access the most recent state. The example uses the idea of lots of items where each lot has a quantity and cost – transactions can either increase the quantity (and cost) or decrease it. The state being managed is the existent lots, their quantities, and their costs. columns are simple:control; date; lot; qty; cost; state So here is the overall structure:table formula checks control column to see if the current row impacts the state – if it doesn’t, return nothing, if it does, move on if it is the first control number, just serialize the row data if it isn’t, find the prior state and update it The way it updates the state is by1) deserializing the prior state2) determining whether the prior state already included the current row’s item3) updating that value if so or just returning the current rows values,4) filtering the prior state to remove the current row’s item5) vstacking the filtered array onto the current row’s entry (potentially updated), and6) serializing the current state. Using 40 rows as an example, I might end with a serialized state that looks like this:lot_H,65,650;lot_I,100,1000;lot_J,45,450;lot_K,70,700;lot_L,30,300;lot_A,135,1350;lot_B,150,1500;lot_C,145,1450;lot_D,90,900;lot_E,120,1200;lot_F,80,800;lot_G,170,1700and deserializes to this: lot_A 135 1350lot_B 150 1500lot_C 145 1450lot_D 90 900lot_E 120 1200lot_F 80 800lot_G 170 1700lot_H 65 650lot_I 100 1000lot_J 45 450lot_K 70 700lot_L 30 300 using this function: SORT(array.deserialize_string(I57)) table formula:
=IF(
TRIM([@lot]) = “”,
“”,
IF(
[@control] = 1,
array.serialize_array(Table68[@[lot]:[cost]]),
LET(
current_trans, Table68[@[lot]:[cost]],
last_state_ctrl, MAX(
FILTER([control], ([control] < [@control]) * ([lot] <> “”) * (2025 <= [@date])),
1
),
last_state, XLOOKUP(last_state_ctrl, [control], [state], “”, 0),
array.serialize_array(array.track_state(array.deserialize_string(last_state), current_trans))
)
)
)
module formulas:
deserialize_string = LAMBDA(state_string, TEXTSPLIT(state_string, “,”, “;”, FALSE));
serialize_row = LAMBDA(row, TEXTJOIN(“,”, FALSE, row));
serialize_array = LAMBDA(state_array,
REDUCE(
,
BYROW(state_array, LAMBDA(r, serialize_row(r))),
LAMBDA(acc, current_row, acc & “;” & current_row)
)
);
serialized_wrapper = lambda(state_string, func_to_apply, serialize_array(func_to_apply(deserialize_string(state_string))));
replace_row =
lambda(
new_row,
original_array,
[replacement_col_idx],
let(
idx_col, if(isomitted(replacement_col_idx), 1, replacement_col_idx),
data_mask, choosecols(new_row, idx_col) <> choosecols(original_array,idx_col),
filtered_array, filter(original_array, data_mask),
hstack(filtered_array, replace_row)
)
);
columnMask =
lambda(
value,
comp_to_array,
[comparison_operator],
[column_idx],
[value_idx],
let(
operator, if(isomitted(comparison_operator),”=”,comparison_operator),
comp_func, lambda(operator, lambda(value, comp_array, switch(operator, “=”, value=comp_array, “<>”, value<>comp_array,”Not yet implimented”))),
col_idx, if(isomitted(column_idx), 1, column_idx),
val_idx, if(isomitted(value_idx), 1, value_idx),
adj_value, if(COLUMNS(value)>1, CHOOSECOLS(value, val_idx), value),
comp_col, choosecols(comp_to_array, col_idx),
comp_func(operator)(adj_value, comp_col)
)
);
track_state =
lambda(
last_state_string, current_trans,
LET(
lots, array.deserialize_string(last_state_string),
lot_to_add, INDEX(current_trans, 1, 1),
prior_lot_values, IFERROR(
FILTER(lots, array.columnMask(lot_to_add, lots), ERROR.TYPE(3)),
ERROR.TYPE(3)
),
filtered_array, FILTER(lots, array.columnMask(lot_to_add, lots, “<>”), ERROR.TYPE(3)),
replacement_line, IF(
ISNA(prior_lot_values),
current_trans,
HSTACK(
lot_to_add,
(INDEX(prior_lot_values, 1, 2) + INDEX(current_trans, 1, 2)),
(INDEX(prior_lot_values, 1, 3) + INDEX(current_trans, 1, 3))
)
),
final_array, IF(
OR(TRIM(last_state_string) = “”),
replacement_line,
IF(OR(ISNA(filtered_array)), replacement_line, VSTACK(filtered_array, replacement_line))
),
final_array
)
); Read More