Using the Azure Search data source in the Azure OpenAI completions API generates many prompt tokens
Hi,
I have written an Azure OpenAI implementation that integrates with Azure Cognitive Search, as shown below.
{
  "data_sources": [
    {
      "type": "AzureCognitiveSearch",
      "parameters": {
        "endpoint": "AZURE_SEARCH_END_POINT",
        "key": "AZURE_SEARCH_KEY",
        "index_name": "INDEX_NAME",
        "scope": {
          "in_scope": true,
          "top_n_documents": 1
        },
        "queryType": "vectorSimpleHybrid"
      }
    }
  ],
Then I make an Azure OpenAI chat completions call using this Azure Cognitive Search service. I use the GPT-3.5 model.
Everything is working fine as expected.
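For concreteness, the call I make looks roughly like the Python sketch below (untested; the Azure OpenAI endpoint, deployment name, and api-version are placeholders, and the data-source block mirrors the configuration above):

```python
# Rough sketch of my chat-completions call with the search data source attached.
# AZURE_OPENAI_ENDPOINT, DEPLOYMENT, and the api-version are placeholders.
import json

AZURE_OPENAI_ENDPOINT = "https://YOUR_RESOURCE.openai.azure.com"  # placeholder
DEPLOYMENT = "gpt-35-turbo"                                       # placeholder


def build_request_body(question: str) -> dict:
    """Build the chat-completions body with the Azure Search data source attached."""
    return {
        "messages": [{"role": "user", "content": question}],
        "data_sources": [
            {
                "type": "AzureCognitiveSearch",
                "parameters": {
                    "endpoint": "AZURE_SEARCH_END_POINT",  # placeholder
                    "key": "AZURE_SEARCH_KEY",             # placeholder
                    "index_name": "INDEX_NAME",            # placeholder
                    "scope": {
                        "in_scope": True,
                        "top_n_documents": 1,
                    },
                    "queryType": "vectorSimpleHybrid",
                },
            }
        ],
    }


def call_completion(question: str, api_key: str) -> dict:
    """POST the body to the chat-completions endpoint (network call, not run here)."""
    import urllib.request

    url = (f"{AZURE_OPENAI_ENDPOINT}/openai/deployments/{DEPLOYMENT}"
           f"/chat/completions?api-version=2024-02-01")
    req = urllib.request.Request(
        url,
        data=json.dumps(build_request_body(question)).encode(),
        headers={"Content-Type": "application/json", "api-key": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```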
The challenge is that the number of tokens it generates is huge. For a simple prompt like "Who is Some XYZ", it reports 4000+ prompt tokens.
How do I avoid this? I am not adding any context of my own. When I remove the data source configuration above, the prompt token count is around 5.
Can you suggest ways to solve this?
I tried adding a filter to restrict the search to a single chunkId, but even then the prompt token count does not come down.
I was told to run a semantic search against the search index myself and then pass the results as input to the chat completions API, but I am not sure how to implement this. Can you suggest an approach or point me in the right direction?
A REST API call would be preferred, but even Python code will help.
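For reference, my current understanding of that two-step flow (search first, then prompt with only the top hit's text) as an untested Python sketch; all endpoint, index, and field names are placeholders, and the `content` field name is an assumption about the index schema:

```python
# Untested sketch of the "search first, then prompt" flow: query the search
# index directly, then pass only a trimmed slice of the top hit as context to
# the plain chat completions API, which should keep the prompt token count low.
import json


def build_grounded_messages(question: str, documents: list, max_chars: int = 2000) -> list:
    """Trim retrieved text so the context (and prompt token count) stays small."""
    context = "\n---\n".join(documents)[:max_chars]
    return [
        {"role": "system",
         "content": "Answer only from the context below.\n\nContext:\n" + context},
        {"role": "user", "content": question},
    ]


def search_index(question: str, search_endpoint: str, index: str, key: str) -> list:
    """Run a semantic search via the Search REST API (network call, not run here)."""
    import urllib.request

    url = f"{search_endpoint}/indexes/{index}/docs/search?api-version=2023-11-01"
    body = {
        "search": question,
        "queryType": "semantic",
        "semanticConfiguration": "SEMANTIC_CONFIG",  # placeholder config name
        "top": 1,
    }
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json", "api-key": key},
    )
    with urllib.request.urlopen(req) as resp:
        hits = json.load(resp)["value"]
    # "content" is an assumed field name; it depends on the index schema.
    return [h.get("content", "") for h in hits]
```

The returned messages would then go into a plain chat completions call with no `data_sources` block attached.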
Thank you