Project Selection: Portfolio Analysis using Copilot in Excel with Python
One of my favourite features in Project Online, and Project Server, has been the portfolio features that allow you do match your plan portfolio against your business drivers to ensure you are gaining the best strategic value. I created a Power App to do something similar a while back, and when Copilot in Excel with Python was announced, I thought it would be a good time to see if I could recreate what Project does, but in Excel. This is probably a feature that we won’t be rebuilding for new Planner but may be a perfect scenario for Copilot to fill the gap, and maybe some agents can help out? Copilot in Excel with Python is available now for Insiders (Excel Advanced Analysis ) and I must say I was pretty impressed how straightforward it was to talk Copilot through creating some Python for me to do the analysis – and give me exactly the same choice of projects from my portfolio. I’m not using any insider knowledge to do this, but using public documentation on the maths behind the process from Portfolio Analysis with Microsoft Project Server 2010 a white paper written for Microsoft by Andrew Lavinsky, now with EY) and if you want to dig deeper, a search on Saaty and Analytic Hierarchy Process (AHP – the name given to this particular optimization technique) will find more papers going back to the late 70’s. I was keen to check results against Project Online, so used a similar approach, but this does generally show how you might ask Copilot to come up with different optimization algorithms if you have specific scenarios in mind. I’ll walk through more or less the steps I followed, along with the issues I saw along the way and the way I steered Copilot to put things right. At this point I am only looking at the cost side or things. Resourcing is a topic for another day…
Initial Steps – Setting up a Portfolio in Project Online
To give me something to validate against, I configured some drivers and projects in Project Online and set up some driver prioritizations and then mapped my projects to them. I set a budget and forced in a couple of projects. I used the articles at https://learn.microsoft.com/en-us/projectonline/portfolio-analysis-overview to configure some data, if you want to follow along. I used the same names for drivers and plans, although I could not bring myself to have a plan called Lync 2012 Enterprise Deployment – so that was renamed as Teams Enterprise Deployment.
My target is to have my driver priorities match these:
My plan priorities to look something like this:
And my final selection to match up with this group of plans:
Into Excel – Prioritizing Business Drivers!
Starting with the Business Drivers sheet, and I soon learned that it was better to keep things tidy in Excel – so I just labelled my drivers A-E as it made the matrix easier to see, rather than having long text descriptions. From memory I knew this needed to be a symetrical matrix, with the leading diagonal all ‘1’s as each driver matches itself, then the other numbers represented the ‘is as important’, ‘is more important’, is much more important’ and finally ‘is extremely more important than’ – with the same for ‘less important’ levels too. I got the numbers wrong first time through (I used 1,3,5 and 7) and my percentages were a little bit off – but I still got the same project selection – but for accuracy I’ll use the ones from the white paper – 1,3,6 and 9, and the reciprocals for the ‘less than’ options. You only need to fill out the top half of the matrix (to the upper right of the ‘1’s) and have the lower as =1/(the matching cell). So my matrix ended up looking like this to match the driver comparisons I had chosen in Project Online:
I then fired up Copilot and gave the following prompt:
Can you write some Python code to set a ranking for the drivers A, B, C, D and E on the business drivers sheet based on the pair-wise comparison using the Analytic Hierarchy Process. Can you give a consistency ratio for the analysis as well as displaying the priorities calculated for the drivers.
If I hadn’t known about AHP then I think Copilot might still have got to this by starting with basics like asking how to prioritize items against each other – but a little knowledge accelerated this part. You’ll see later how I needed to encourage Copilot to do better…
The answer from Copilot was that it needed to engage overdrive to solve this – or more accurately, to use start advanced analysis:
I clicked Start advanced analysis, which created a new sheet then I could see a few cells that had PY in them so knew it was starting to do things with Python. After creating a DataFrame (the construct the Python library called pandas uses for handling numeric arrays) and copying in my data, which it identified correctly it came up with some Python to do what I asked – and soon had a result, not 42, but 0.031127. Project shows the consistency ratio as a percentage (around 97% for this analysis), so I did ask Copilot if this was an ok value for consistency ratio and it suggested anything under 0.1 was good. It didn’t initially show me the individual priorities, so I had to ask Copilot Can you show the priorities for each driver? The Analysis1 sheet that Copilot was working in then looked like this:
I’ll add to the foot of this post some of the responses and code generated along the way. The beauty of having the Python code here is you could take it and use it elsewhere. You want to create an Azure Function in Python that would take a call from a Power App and return the priorities? This would be a good starting point – just take it over to VS Code. Or just use Copilot in Github. So many possibilities.
So how are we doing? Those numbers look good. Trust me, the A was the top driver, E 2nd and then D, B, C. And the percentages are only differing in the 3rd decimal place. I call that pretty close.
Applying the priorities to plans
In Project Online to get the priorities for the plans, a similar approach is taken as for prioritizing the drivers, saying for each one what impact it will have for the driver concerned:
Behind the scenes, as per the white paper, these levels of None to Extreme can be represented by number 0,1,3,6 and 9. In my Projects sheet I show this same data, for consumption by Copilot and Python:
Then I asked Copilot:
Can you then apply these driver priorities to the Projects on the Projects sheet, in rows which are aligned to the drivers A to E referenced by column using the Ranking of values 0,1,3,6 and 9. 9 is a higher alignment. The aim is to prioritise the projects that have the most alignment with the most important drivers.
It certainly gave a result here, but I need to see it normalized to check if the % matched project – this was the Analysis1 sheet at this point before they were normalized, where you can also see some of the Python in cell A51 that was doing the work:
Slight break here for some gotches I hit along the way. I did see issues a couple of times where Copilot would think there were only 5 plans, as it assumed a similar shape for the dataframe from the driver analysis. It got things right this time, maybe as I’d changed focus to the Projects sheet. Anyway, before I’d just give it a nudge and tell if what the project dataframe should be and all was well. Another gotcha is that I am writing the blog as I go along – and Copilot seemed to lose interest due to the long gap between prompts… It did pick up again without having to start over.
And the normalized result – looking good with just some differences in the 2nd decimal from the result in Project Online:
Budgets, and Mandatory Projects
The next step was to get Copilot to look at my Budget sheet. I did try to have the driver rankings and budgets all in the same sheet, but it seemed to be less trouble to keep things clean and have these separate. My next prompt for Copilot:
Looking at the Budgets sheet, it would cost over $2.5M to execute all projects. If I have just $1.5M which projects should I execute based on the priorities calculated?
And Copilots response? The first what was on the Analysis1 sheet (I added the total) and then what was returned in the chat area (and I haven’t really been telling that side of the story, but it has kept me informed of what it was doing and the resultant DataFrames, as well as showing the Python being used (see the foot of this blog post):
Based on the priorities calculated and the $1.5M budget constraint, the following projects should be executed:
Shared GPS devices for transportation
Helmet with Integrated Sunvisor
Voice Recognition Product
Aquisition Target Analysis
E-campaign to Bloggers
Catalog Publishing
Removable protective lining
Print Advertising Campaign System
This is all well and good, but it has a couple of problems. Firslty there are a couple of projects that I really must do, even though they don’t rank highly against these business drivers, but they keep the company running (the ones showing Force-in, on the Proejct Online screenshot above). And after I’ve solved that issue – these projects aren’t delivering the best strategic value I could possibly obtain. They are just the highest ranked plans chosen until the money runs out. The helmet project is very expensive, and even if it does rank very high, could the money be better spent to give higher strategic value?
Let sort out the projects I need to do first;
If I absolutely must execute the Operations Management plan, and the Internal Application Customization plan, which projects would also fit in the $1.5M budget?
Copilot responded:
To include the Operations Management and Internal Application Customization plans within the $1.5M budget, I will first subtract their costs from the total budget. Then, I will select the highest priority projects that fit within the remaining budget. I will write Python code to perform this selection.
Including the Operations Management and Internal Application Customization plans, the following projects fit within the $1.5M budget:
Shared GPS devices for transportation
Helmet with Integrated Sunvisor
Voice Recognition Product
E-campaign to Bloggers
Operations Management
Internal Application Customization
Cool, but how to get a better optimization? Maybe Copilot has some ideas?
The method used here just adds the top priority plan until the budget is reached. Is there any other algorithm that could be used to optimize the strategic value by trying different combinations of plans together?
At this point, and something similar had popped up at other times, I’d see Copilot get and error, then try something else instead. Cool! Before I’d seen it reference the error, work out why it got an error, then try another approach.
This time however it did appear to be stuck. Looking at the code it seems like it is using Normalized_Score when it doesn’t have that in the DataFrame it is considering. Of course I could have debugged the Python myself (eventually…).
Re-thinking the prompts from the beginning
Instead, I reviewed the responses leading up to this to see where Copilot was headed:
A more sophisticated approach to optimize the strategic value while considering different combinations of plans is to use a knapsack algorithm. The knapsack problem is a classic optimization problem that can help in selecting the best combination of projects to maximize the total value within a given budget. I will write Python code to implement the knapsack algorithm to find the optimal combination of projects.
Maybe I could give it a bit of a push in the right direction. I cleaned up my sheets so that no extraneous information existed (I also had a ‘benefits’ value on my budget sheet which I wasn’t using) and crafted a set of prompts to start from the beginning, not bothering with some of the normalizing steps, which I’d only needed to validate results against my known percentages. I also put the final selection in a single prompt, giving it the budget, the clue it should use knapsack and also tell it my mandatory plans. These were my 3 final prompts:
Can you write some Python code to set a ranking for the drivers A, B, C, D and E on the business drivers sheet based on the pair-wise comparison using the Analytic Hierarchy Process. Can you give a consistency ratio for the analysis and display the priorities calculated for each driver.
Can you then apply these driver priorities to the Projects on the Projects sheet, in rows which are aligned to the drivers A to E referenced by column using the Ranking of values 0,1,3,6 and 9. 9 is a higher alignment. The aim is to prioritise the projects that have the most alignment with the most important drivers.
Looking at the Budgets sheet, can you use the knapsack approach to choose the optimal set of projects to maximize strategic value, based on a strict budget of $1.5M which should not be exceeded and also ensuring that the mandatory projects, Operations Management plan and Internal Application Customization plan are included. Do not include fractions of projects, all or nothing.
The ‘partial project’ piece was added after seeing my budget apparently exceeded, but on closer inspection of the results and DataFrame, it had selected 0.8 of a project to consume the final $120K. Pretty sure that for most projects you don’t get 80% of the strategic impact for doing 80% of the plan… But it had been easy to see what it was doing so debugging wasn’t a problem and modifying the prompt usually got things back on track. I also had the best experience when the sheet that contained the data needed for each step was active when the prompt was entered, otherwise sometimes the DataFrames could be the wrong shape. I’d see it only considering 5 projects, if the sheet with 5 drivers was still active.
Final Result!
A perfect match to the $1.38M that Project Online had selected – see the screenshot right at the start. I haven’t shown here, but I did quiz Copilot on the measure of strategic value compared to the maximum I could have achieved – and got a result around 70%, which also matched that shown by Project. I did sometimes see that even with the prompt saying complete projects only, I did sometimes still see the budget met by adding 80% of the next highest priority plan.
Final Thoughts
Comparing to the results from Project Online allowed me to easily verify if I was getting the results I was expecting, giving me confidence I could use the same prompts against a different data set and still make a good selection. To take out more of the uncertainty, rather than using Copilot each time, it would make sense to use my ‘trusted and verified’ Python code – although it might need a little work to allow it to work with variable numbers of drivers. Tip – the value of the variable RI (Random Index) is different for different sizes of matrix.
What are the different values of RI, Random Index, used when carrying out AHP analysis on different sized matrices?
The Random Index (RI) values used in the Analytic Hierarchy Process (AHP) analysis for different sized matrices are as follows:
For a 1×1 matrix: RI = 0.00
For a 2×2 matrix: RI = 0.00
For a 3×3 matrix: RI = 0.58
For a 4×4 matrix: RI = 0.90
For a 5×5 matrix: RI = 1.12
For a 6×6 matrix: RI = 1.24
For a 7×7 matrix: RI = 1.32
For an 8×8 matrix: RI = 1.41
For a 9×9 matrix: RI = 1.45
For a 10×10 matrix: RI = 1.49
These values are used to determine the consistency of the pairwise comparison matrix in AHP.
Thanks Copilot!
Appendix
The following contains the responses to my final 3 prompts – and the Python generated.
Responses to Prompt 1
To rank the drivers using the Analytic Hierarchy Process (AHP), the pairwise comparison matrix will be used to calculate the priority vector and the consistency ratio. The steps involved are:
Normalize the pairwise comparison matrix.
Calculate the priority vector (eigenvector).
Compute the consistency index (CI) and consistency ratio (CR). I will write Python code to perform these steps and display the priorities and consistency ratio.
“=PY(“”#Calculate priorities and consistency ratio using AHP
from numpy.linalg import eig
# Extract the pairwise comparison matrix
pairwise_matrix = Business_Drivers_A1_F6_df.iloc[:, 1:].values
# Normalize the pairwise comparison matrix
column_sums = pairwise_matrix.sum(axis=0)
normalized_matrix = pairwise_matrix / column_sums
# Calculate the priority vector (eigenvector)
priority_vector = normalized_matrix.mean(axis=1)
# Calculate the consistency index (CI)
eigenvalues, _ = eig(pairwise_matrix)
max_eigenvalue = max(eigenvalues.real)
CI = (max_eigenvalue – len(priority_vector)) / (len(priority_vector) – 1)
# Calculate the consistency ratio (CR)
RI = 1.12 # Random Index for a 5×5 matrix
CR = CI / RI
# Create a DataFrame to display the priorities
priorities_df = pd.DataFrame({
‘Driver’: Business_Drivers_A1_F6_df[‘Drivers’],
‘Priority’: priority_vector
})
priorities_df, CR””,1)”
The priorities for each driver have been calculated using the Analytic Hierarchy Process (AHP). The consistency ratio (CR) for the analysis is approximately 0.0312, indicating a consistent pairwise comparison. The priorities for each driver are displayed in the DataFrame on the sheet.
Responses to Prompt 2
o prioritize the projects based on the driver priorities, the following steps will be taken:
Multiply each project’s alignment score by the corresponding driver priority.
Sum the weighted scores for each project to get a total score.
Rank the projects based on their total scores. I will write Python code to perform these steps and display the prioritized projects.
“=PY(“”#Rank projects based on driver priorities
# Extract the priorities from the previous calculation
priorities = priorities_df[‘Priority’].values
# Multiply each project’s alignment score by the corresponding driver priority
weighted_scores = Projects_A1_F14_df.iloc[:, 1:].values * priorities
# Sum the weighted scores for each project to get a total score
Projects_A1_F14_df[‘Total_Score’] = weighted_scores.sum(axis=1)
# Rank the projects based on their total scores
Projects_A1_F14_df[‘Rank’] = Projects_A1_F14_df[‘Total_Score’].rank(ascending=False)
# Sort the projects by rank
ranked_projects_df = Projects_A1_F14_df.sort_values(by=’Rank’)
ranked_projects_df””,1)”
The projects have been ranked based on their alignment with the driver priorities. The ranked projects are displayed in the DataFrame on the sheet, with the highest priority project being ‘Shared GPS devices for transportation’.
Responses to Prompt 3
To solve this problem using the knapsack approach, the following steps will be taken:
Extract the budget and strategic value for each project from the Budgets sheet.
Ensure that the mandatory projects are included in the selection.
Use the knapsack algorithm to select the optimal set of projects that maximizes the total strategic value without exceeding the budget. I will write Python code to perform these steps and display the selected projects.
“=PY(“”#Select optimal set of projects using knapsack approach
# Extract the budget and strategic value for each project
budgets = Budgets_A1_B15_df[‘Budget’].values
strategic_values = Projects_A1_F14_df[‘Total_Score’].values
# Ensure that the mandatory projects are included
mandatory_projects = [‘Operations Management’, ‘Internal Application Customization’]
mandatory_indices = [Budgets_A1_B15_df[Budgets_A1_B15_df[‘Project’] == project].index[0] for project in mandatory_projects]
# Define the knapsack problem
num_projects = len(budgets)
# Initialize the DP table
dp = [[0 for _ in range(1500001)] for _ in range(num_projects + 1)]
# Fill the DP table
for i in range(1, num_projects + 1):
for w in range(1500001):
if i – 1 in mandatory_indices:
dp[i][w] = dp[i – 1][w – budgets[i – 1]] + strategic_values[i – 1] if w >= budgets[i – 1] else dp[i – 1][w]
else:
if w >= budgets[i – 1]:
dp[i][w] = max(dp[i – 1][w], dp[i – 1][w – budgets[i – 1]] + strategic_values[i – 1])
else:
dp[i][w] = dp[i – 1][w]
# Find the selected projects
selected_projects = []
w = 1500000
for i in range(num_projects, 0, -1):
if dp[i][w] != dp[i – 1][w]:
selected_projects.append(i – 1)
w -= budgets[i – 1]
# Ensure mandatory projects are included
for idx in mandatory_indices:
if idx not in selected_projects:
selected_projects.append(idx)
# Create a DataFrame to display the selected projects
selected_projects_df = Budgets_A1_B15_df.iloc[selected_projects]
selected_projects_df””,1)”
The optimal set of projects has been selected using the knapsack approach, ensuring that the mandatory projects ‘Operations Management’ and ‘Internal Application Customization’ are included. The selected projects are displayed in the DataFrame on the sheet.
Microsoft Tech Community – Latest Blogs –Read More