It's April 2024, and it's been about 17 months since we've been using LLMs like ChatGPT to assist us with code generation and debugging tasks. While they have added a great deal of productivity, there are certainly times when the generated code is full of bugs and sends us down the good ole StackOverflow route.
In this article, I'll give a quick demonstration of how we can address this lack of "verification" using Conversable Agents offered by AutoGen.
What is AutoGen?
"AutoGen is a framework that enables the development of LLM applications using multiple agents that can converse with each other to solve tasks."
Presenting the LeetCode Problem Solver:
Start by quietly installing autogen:

```shell
!pip install pyautogen -q --progress-bar off
```
I'm using Google Colab, so I entered my OPENAI_API_KEY in the Secrets tab, and securely loaded it along with other modules:

```python
import os
import csv
import autogen
from autogen import Cache
from google.colab import userdata

userdata.get('OPENAI_API_KEY')
```
I'm using gpt-3.5-turbo only because it's cheaper than gpt-4. If you can afford pricier experimentation and/or you're doing things more "seriously", you should obviously use a stronger model.

```python
llm_config = {
    "config_list": [{"model": "gpt-3.5-turbo", "api_key": userdata.get('OPENAI_API_KEY')}],
    "cache_seed": 0,   # seed for reproducibility
    "temperature": 0,  # temperature to control randomness
}
```
Now, I'll copy the problem statement from my favorite LeetCode problem, Two Sum. It's one of the most commonly asked questions in leetcode-style interviews and covers basic concepts like caching using hashmaps and basic equation manipulation.
```python
LEETCODE_QUESTION = """
Title: Two Sum

Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target. You may assume that each input would have exactly one solution, and you may not use the same element twice. You can return the answer in any order.

Example 1:
Input: nums = [2,7,11,15], target = 9
Output: [0,1]
Explanation: Because nums[0] + nums[1] == 9, we return [0, 1].

Example 2:
Input: nums = [3,2,4], target = 6
Output: [1,2]

Example 3:
Input: nums = [3,3], target = 6
Output: [0,1]

Constraints:
2 <= nums.length <= 10^4
-10^9 <= nums[i] <= 10^9
-10^9 <= target <= 10^9
Only one valid answer exists.

Follow-up: Can you come up with an algorithm that is less than O(n^2) time complexity?
"""
```
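For reference, here is the naive quadratic baseline that the follow-up asks us to beat. This is a sketch I'm including for comparison; it is not part of the agent conversation below, which produces the faster hashmap solution.

```python
# Naive O(n^2) brute-force Two Sum: check every pair of indices.
def two_sum_brute_force(nums, target):
    n = len(nums)
    for i in range(n):
        for j in range(i + 1, n):
            if nums[i] + nums[j] == target:
                return [i, j]
    return []

print(two_sum_brute_force([2, 7, 11, 15], 9))  # [0, 1]
print(two_sum_brute_force([3, 2, 4], 6))       # [1, 2]
print(two_sum_brute_force([3, 3], 6))          # [0, 1]
```

This passes all three examples but does quadratic work, which is exactly what the dictionary-based approach avoids.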
We can now define both of our agents. One agent acts as the "assistant" agent that suggests the solution, and the other serves as a proxy for us, the user, and is also responsible for executing the suggested Python code.
```python
# create an AssistantAgent named "assistant"
SYSTEM_MESSAGE = """You are a helpful AI assistant.
Solve tasks using your coding and language skills.
In the following cases, suggest python code (in a python coding block) or shell script (in a sh coding block) for the user to execute.
1. When you need to collect info, use the code to output the info you need, for example, browse or search the web, download/read a file, print the content of a webpage or a file, get the current date/time, check the operating system. After sufficient info is printed and the task is ready to be solved based on your language skill, you can solve the task by yourself.
2. When you need to perform some task with code, use the code to perform the task and output the result. Finish the task smartly.
Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.
When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.
If you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes.
If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.

Additional requirements:
1. Within the code, add functionality to measure the total run-time of the algorithm in the python function using the "time" library.
2. Only when the user proxy agent confirms that the Python script ran successfully and the total run-time (printed on the stdout console) is less than 50 ms, only then return a concluding message with the word "TERMINATE". Otherwise, repeat the above process with a more optimal solution if it exists."""

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    system_message=SYSTEM_MESSAGE,
)
```
```python
# create a UserProxyAgent instance named "user_proxy"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=4,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
    code_execution_config={
        "work_dir": "coding",
        "use_docker": False,
    },
)
```
I set human_input_mode to "NEVER" because I'm not planning to provide any inputs myself, and max_consecutive_auto_reply to 4 to limit the back-and-forth turns in the conversation. The Assistant agent has been instructed to respond with the word "TERMINATE", which tells the UserProxyAgent when to conclude the conversation.
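To see what that termination check does in isolation, here is the same lambda applied to a few message dicts. This is a standalone sketch, independent of AutoGen; the framework calls the predicate on each incoming message and stops the conversation when it returns True.

```python
# The termination predicate: True only when the message text ends with "TERMINATE".
is_termination_msg = lambda x: x.get("content", "").rstrip().endswith("TERMINATE")

print(is_termination_msg({"content": "All checks passed. TERMINATE"}))  # True
print(is_termination_msg({"content": "Here is the code..."}))           # False
print(is_termination_msg({}))                                           # False (no content key)
```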
Now, the fun part! We'll initiate the conversation by sending a message from our UserProxyAgent to our Assistant.
One added benefit of using AutoGen (even for non-agentic workflows) is that it provides explicit caching capability to help you save API costs during development. Here, I'm caching responses on disk, but you can also integrate Redis for this purpose.
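If you want the cache shared across machines rather than stored on local disk, AutoGen's `Cache` also exposes a Redis backend. A minimal sketch, assuming a Redis server is reachable at localhost:6379 and the `redis` Python package is installed (check the pyautogen docs for the exact signature in your version):

```python
from autogen import Cache

# Same pattern as Cache.disk below, but responses are stored in Redis,
# so multiple development machines can share one cache of LLM responses.
with Cache.redis(cache_seed=7, redis_url="redis://localhost:6379/0") as cache:
    # same initiate_chat call as in the disk-cache example that follows
    ...
```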
```python
# Use DiskCache as cache
with Cache.disk(cache_seed=7) as cache:
    # the assistant receives a message from the user_proxy, which contains the task description
    chat_res = user_proxy.initiate_chat(
        assistant,
        message="""Solve the following leetcode problem and also comment on its time and space complexity:\n\n""" + LEETCODE_QUESTION,
    )
```
Output:
user_proxy (to assistant):
Solve the following leetcode problem and also comment on its time and space complexity:
Title: Two Sum
…
Follow-up: Can you come up with an algorithm that is less than O(n^2) time complexity?

--------------------------------------------------------------------------------
assistant (to user_proxy):
To solve this problem efficiently, we can use a dictionary to store the elements we have seen so far along with their indices. While iterating through the array, we can check if the complement of the current element (target - current element) is already in the dictionary. If it is, we have found the pair that sums up to the target.

Here's the Python code to solve the Two Sum problem with a time complexity less than O(n^2):
```python
# filename: two_sum.py
def two_sum(nums, target):
    num_dict = {}
    for i, num in enumerate(nums):
        complement = target - num
        if complement in num_dict:
            return [num_dict[complement], i]
        num_dict[num] = i

# Test the function with the given examples
nums1 = [2, 7, 11, 15]
target1 = 9
print(two_sum(nums1, target1))  # Output: [0, 1]

nums2 = [3, 2, 4]
target2 = 6
print(two_sum(nums2, target2))  # Output: [1, 2]

nums3 = [3, 3]
target3 = 6
print(two_sum(nums3, target3))  # Output: [0, 1]
```
The time complexity of this solution is O(n) because we iterate through the list of numbers only once. The space complexity is also O(n) because we store elements in a dictionary.

You can run this code and check if it provides the correct output. Let me know the total run-time, and if it's less than 50 ms, you can reply with "TERMINATE". If it's not, I can suggest a more optimal solution.
--------------------------------------------------------------------------------
>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
user_proxy (to assistant):

exitcode: 0 (execution succeeded)
Code output:
[0, 1]
[1, 2]
[0, 1]

--------------------------------------------------------------------------------
assistant (to user_proxy):
The Python script ran successfully and provided the correct output for the given examples. Since the total run-time is not provided, I will suggest measuring the total run-time of the algorithm using the "time" library to determine if it's less than 50 ms.

You can add the functionality to measure the total run-time by importing the "time" library and using it to calculate the elapsed time before and after calling the `two_sum` function. Here's how you can modify the code:
```python
# filename: two_sum.py
import time

def two_sum(nums, target):
    num_dict = {}
    for i, num in enumerate(nums):
        complement = target - num
        if complement in num_dict:
            return [num_dict[complement], i]
        num_dict[num] = i

# Test the function with the given examples and measure the total run-time
nums1 = [2, 7, 11, 15]
target1 = 9
start_time = time.time()
print(two_sum(nums1, target1))  # Output: [0, 1]
end_time = time.time()
print("Total run-time:", (end_time - start_time) * 1000, "ms")

nums2 = [3, 2, 4]
target2 = 6
start_time = time.time()
print(two_sum(nums2, target2))  # Output: [1, 2]
end_time = time.time()
print("Total run-time:", (end_time - start_time) * 1000, "ms")

nums3 = [3, 3]
target3 = 6
start_time = time.time()
print(two_sum(nums3, target3))  # Output: [0, 1]
end_time = time.time()
print("Total run-time:", (end_time - start_time) * 1000, "ms")
```
Please run this modified code and provide the total run-time for each example. If the total run-time is less than 50 ms for all examples, you can reply with "TERMINATE".
--------------------------------------------------------------------------------
>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
user_proxy (to assistant):

exitcode: 0 (execution succeeded)
Code output:
[0, 1]
Total run-time: 0.01239776611328125 ms
[1, 2]
Total run-time: 0.00286102294921875 ms
[0, 1]
Total run-time: 0.0016689300537109375 ms

--------------------------------------------------------------------------------
assistant (to user_proxy):
The Python script ran successfully, and the total run-time for each example is significantly less than 50 ms. Therefore, you can reply with "TERMINATE". The algorithm provided an efficient solution with a time complexity of O(n) and a space complexity of O(n).
--------------------------------------------------------------------------------
This is what's happening here:
1. The UserProxyAgent asks the Assistant to solve the problem based on the task description.
2. The Assistant suggests a solution with a Python code block.
3. The UserProxyAgent executes the Python code.
4. The Assistant reads the console output and responds back with a modified solution (with time measurement functionality; honestly, I would have expected this modified solution right away, but this behavior can be tuned through prompt engineering or by employing a stronger LLM).
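Stripped of the framework, the loop above boils down to something like the following toy sketch (my own illustration, not AutoGen's actual implementation): execute each suggested snippet, capture its stdout, feed it back, and stop once the output satisfies the termination condition.

```python
import io
import contextlib

def run_code(code: str) -> str:
    """Execute a suggested Python snippet and capture its stdout,
    mimicking what the UserProxyAgent does with a code block."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue()

# A toy "assistant": the first suggestion omits timing, the revision adds it.
suggestions = [
    "print(sorted([3, 1, 2]))",
    "import time\nstart = time.time()\nprint(sorted([3, 1, 2]))\nprint('ms:', (time.time() - start) * 1000)",
]

for code in suggestions:
    output = run_code(code)  # the "execution" turn
    print(output, end="")    # the console output fed back to the assistant
    if "ms:" in output:      # crude stand-in for the TERMINATE check
        break
```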
With AutoGen, you can also display the cost of the agentic workflow.

```python
chat_res.cost
```

```
{'total_cost': 0,
 'gpt-3.5-turbo-0125': {'cost': 0,
  'prompt_tokens': 14578,
  'completion_tokens': 3460,
  'total_tokens': 18038}}
```
Concluding Remarks:
Thus, by using AutoGen's conversable agents:
- We automatically verified that the Python code suggested by the LLM actually works.
- We created a framework through which the LLM can further respond to syntax or logical errors by reading the output in the console.
Thanks for reading! Please follow me and subscribe to be the first to know when I post a new article! 🙂
Check out my other articles:
- A Deep Dive into Evaluation in Azure Prompt Flow
- Develop a UI for Azure Prompt Flow with Streamlit
- Build a custom Chatbot using Hugging Face Chat UI and Cosmos DB on Azure Kubernetes Service
- Deploy Hugging Face Text Generation Inference on Azure Container Instance