Rethinking LLMs

I think I'm now willing to ignore my inner sceptical technologist and predict that lots of people are going to be using LLMs to automate away the cost of many voice communications.

Picture of a coffee cup with several doughnuts
Buying and selling Doughnuts.

I'm one of those weird people with unbounded technical curiosity, but the attention span of a gnat. Once I've painstakingly figured out how something works, I make snap judgements about its applications (or lack of them) and move on to the next thing I haven't figured out yet.

Being a technologist and understanding things under the hood should be a big advantage in evaluating technology. Sometimes I come unstuck because of it though. I've used Machine Learning on many different projects for years, and my distilled position on the technology can be characterised as: "It is supercharged pattern matching and pattern generation, nothing more". This is derived from my understanding of the simplicity of (and inherent difficulty in scaling) the 30-year-old computer science techniques used to build models. It led me to write this article earlier in the year: What can GPT do for your business right now? Don't read it now. It isn't outright wrong, but it does focus in a very limited way on content generation. I missed by a mile the transformative nature of the instruction following and summarisation capabilities of recent LLMs for conversational dialogue.

Since I wrote that, I've spent a bit of time connecting generative AI to telephone conversations. Part of that has been developing an open source toolkit to evaluate ML prompting in this context. Here is an example of that tool in use: two ML agents, each prompted to negotiate a contract for doughnut supply, talking to each other over the phone. No humans involved...

AI agents buying and selling doughnuts

Whilst the tech to make all the moving pieces work together is a bit involved, the actual scenario-specific "programming" for this exchange is minimal and could be done by anyone. It is just two prompts. First, the Doughnut Wholesaler...

Regular customers will call you up for pricing and to place orders.

Our accountants have worked out that the cost of sales on each order for standard donuts is as follows but you must not disclose these costs to the customer or this will weaken your negotiating position:
Order transaction processing (per order): £14.00
Donut production cost (per donut): 13.5 pence

Instead of discussing the above cost prices, discuss prices with a substantial markup.
The objective for this conversation is to use your knowledge of our cost price to achieve the best possible selling margin on the price that you quote our customers.
You have absolute freedom to set the price but we really need to make the best profit possible on each order.
You must never sell for less than or equal to the cost price.
I would suggest a strategy of starting with a 3x multiple of the unit cost and then adjust downwards based on quantity only if the customer asks you to.

You should quote customers only a calculated inclusive numeric price against a specific quantity they request.
When talking about unit prices less than a pound, use phrases like "thirty pence" rather than £0.30. 

Do not generate pro-forma or placeholder prices like £XX.XX.
You must calculate and negotiate an actual numeric price with the customer during the conversation.
Do not prevaricate by offering to email prices later.

Pause your initial response at the greeting and await further user input in the chat.
Produce short informal and concise responses.
When the conversation has ended naturally, please send a line of output which just says "@HANGUP" on a line by itself.
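
To make the arithmetic that prompt asks for concrete, here is a minimal Python sketch of the wholesaler's numbers. The costs and the 3x multiple come straight from the prompt; the function names are hypothetical, and applying the 3x to the per-donut cost only (with no separate markup on the order charge) is just one reading of the prompt, not something the toolkit enforces.

```python
from decimal import Decimal

ORDER_COST = Decimal("14.00")   # £ per order, from the prompt
UNIT_COST = Decimal("0.135")    # £ per donut (13.5 pence), from the prompt
OPENING_MULTIPLE = 3            # "3x multiple of the unit cost", from the prompt


def cost_price(quantity: int) -> Decimal:
    """Total cost of sales for one order: the floor the agent must stay above."""
    return ORDER_COST + UNIT_COST * quantity


def opening_quote(quantity: int) -> Decimal:
    """Assumed reading of the strategy: 3x the per-donut cost, times quantity."""
    return UNIT_COST * OPENING_MULTIPLE * quantity


def is_acceptable(total_price: Decimal, quantity: int) -> bool:
    """The prompt forbids selling at or below cost, so this is a strict comparison."""
    return total_price > cost_price(quantity)


def smallest_quantity_where_opening_clears_cost(limit: int = 1000) -> int:
    """First order size at which the 3x opening quote is above the cost floor."""
    for quantity in range(1, limit + 1):
        if is_acceptable(opening_quote(quantity), quantity):
            return quantity
    raise ValueError("opening quote never clears cost within the limit")


if __name__ == "__main__":
    for quantity in (45, 90):
        print(quantity, cost_price(quantity), opening_quote(quantity))
    print(smallest_quantity_where_opening_clears_cost())
```

On that reading, the 3x opening quote only clears the cost floor from 52 donuts upwards, because the £14 per-order charge dominates small orders. An agent following the prompt literally would have to notice that for itself.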

And the Cafe Owner...

You are Ian, a small coffee shop owner who needs to buy enough donuts to sell to your customers for the next few days.

You sell on average 45 donuts a day and after extensive testing you have determined that your customers mostly favour raspberry jam donuts.

Donuts taste best on the day they are delivered, but can be sold just fine on the following day.
If you overbuy then you can sell the donuts off cheaply up to 4 days after you buy them but will then need to discount them to £1.
You should order a specific optimum quantity at the optimum price.

You charge your customers £2 a donut but need a margin of 75% on your purchase price in order to make an overall profit.
You must not disclose any of this commercial information to anyone, use it only in your own calculations about whether a price is acceptable.

A sales person from one of your donut suppliers, will call you.
You must order the right quantity of donuts at the best possible price from them that achieves a workable margin.
You must negotiate a specific total numeric price for the order.
If the sales person gives you placeholder numbers like £XX.XX then you must keep pushing the sales person to disclose and agree actual numbers that are mutually acceptable.

Interact with the sales person turn by turn. Start by just saying "hello I need to order some donuts".
Generate terse, clear, businesslike replies without verbosity or platitudes.
When the conversation has ended, please send a line of output which just says "@HANGUP" on a line by itself.
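
The buyer's side can be sketched the same way. The £2 retail price, the 75% figure and the 45 donuts a day come from the prompt above. Everything else is my assumption: the prompt's "margin of 75% on your purchase price" can be read either as a markup on cost or as a margin on the retail price, so this Python sketch computes both, and it treats two days of sales as the full-price window because donuts "can be sold just fine on the following day". All names are hypothetical.

```python
from decimal import Decimal

RETAIL_PRICE = Decimal("2.00")  # £ per donut charged to customers, from the prompt
TARGET = Decimal("0.75")        # the "75%" figure, from the prompt
DAILY_SALES = 45                # average donuts sold per day, from the prompt
FULL_PRICE_DAYS = 2             # assumption: delivery day plus the following day


def ceiling_if_markup_on_cost() -> Decimal:
    """Reading 1: retail = cost * (1 + 0.75), so cost = retail / 1.75."""
    return RETAIL_PRICE / (1 + TARGET)


def ceiling_if_margin_on_retail() -> Decimal:
    """Reading 2: (retail - cost) / retail = 0.75, so cost = retail * 0.25."""
    return RETAIL_PRICE * (1 - TARGET)


def full_price_quantity() -> int:
    """Donuts that can plausibly be sold at full price before discounting."""
    return DAILY_SALES * FULL_PRICE_DAYS


def max_order_total(quantity: int, unit_ceiling: Decimal) -> Decimal:
    """Highest total the buyer could accept for an order of this size."""
    return unit_ceiling * quantity


if __name__ == "__main__":
    quantity = full_price_quantity()
    for ceiling in (ceiling_if_markup_on_cost(), ceiling_if_margin_on_retail()):
        unit = ceiling.quantize(Decimal("0.01"))
        total = max_order_total(quantity, ceiling).quantize(Decimal("0.01"))
        print(quantity, unit, total)
```

The two readings give very different ceilings, roughly £1.14 or 50p a donut, so the prompt leaves the model to decide what "margin" means before it can judge whether a quoted price is acceptable.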

Incidentally if you don't believe it is this simple, and that there may be some trickery in the video above, just go to https://llm.aplisay.com/ and have a play yourself. Start two agents in two tabs with the above instructions, put them both on speakerphones or forward a call from one to the other and just listen in!
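
Stripped of the telephony, what happens when the two agents talk can be sketched as a turn-by-turn relay that stops when either side emits the "@HANGUP" line both prompts ask for. This is not the toolkit's actual code: speech-to-text and text-to-speech are left out entirely, each agent is assumed to be a plain function from what it heard to a completion, and `Agent`, `split_hangup` and `relay` are hypothetical names.

```python
from typing import Callable, List, Tuple

HANGUP = "@HANGUP"  # sentinel both prompts ask the model to emit on its own line

# Hypothetical agent interface: takes the text it just heard, returns a completion.
Agent = Callable[[str], str]


def split_hangup(completion: str) -> Tuple[str, bool]:
    """Separate the speakable text from a hangup sentinel on a line by itself."""
    lines = [line.strip() for line in completion.splitlines()]
    wants_hangup = any(line == HANGUP for line in lines)
    speech = " ".join(line for line in lines if line and line != HANGUP)
    return speech, wants_hangup


def relay(caller: Agent, callee: Agent, opening: str,
          max_turns: int = 20) -> List[Tuple[str, str]]:
    """Pass each agent's speech to the other until one of them hangs up."""
    transcript = [("caller", opening)]
    heard = opening
    speakers = [("callee", callee), ("caller", caller)]
    for turn in range(max_turns):
        name, agent = speakers[turn % 2]
        speech, done = split_hangup(agent(heard))
        if speech:
            transcript.append((name, speech))
        if done:
            break
        heard = speech
    return transcript
```

The `max_turns` cap is there because nothing guarantees that either model ever sends the sentinel. Matching it only when it sits on a line by itself means an agent that merely mentions the word mid-sentence does not drop the call.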

I'm not going to pretend that this is how we will construct complex agents that interact reliably, ethically and safely with humans in future. It is however relatively astonishing to me how well these two agents did at arriving at an optimum aurally conducted transaction from a dozen sentences of written English prompt instructions that took me about 5 minutes to write for each agent. The speech to text on 8kHz phone conversations still isn't perfect, and probably never will be. Look at the differences between the spoken audio and the speech transcription: even so, the agents understood each other impeccably in that lossy environment. The exchange wasn't even GPT-4, as the technology we used was developed on top of GPT-3.5 for now.

It is pretty certain now that completions generated by LLMs can appear to reason, and they certainly have enough utility to automate lower-value communications.

I think I'm now willing to ignore my inner sceptical technologist and predict that lots of people are going to be using LLMs as a tool to automate away the cost of many routine communications that we currently use humans for. It isn't just copywriters who will face threats and opportunities. Doughnut sellers are going to need to embrace it too!

The societal implications of this are going to be immense. So is the economic and human value, which means we aren't, even now, going to get the genie back in the bottle. I think it is going to be incumbent on those of us harnessing the technology to focus on the true value proposition to society whilst guarding against negative externalities.
