The newer models like o1 and o3-mini-high have come on in leaps and bounds. Here’s a non-exhaustive list of the kinds of problems I hit, and how much the models help with each:
Integrating new libraries or tools
Example: Yesterday I needed to get a list of all the variables present in a Twig template. Twig doesn’t do this out of the box, so you need to reverse-engineer it. I googled a bit and found some old forum posts that were in the right ballpark, but I’d still need to get into the weeds of the Twig codebase to really get to grips with it.
Result: 9/10. I gave the forum post to o1-pro, and explained the issue. It very nearly one-shotted a solution. There was one issue with Twig’s class names[1], but zero logic problems. It worked fine.
I then needed to extend this to find variables within logic statements like {% if contact.first_name == bob %} and it merrily adapted it after a few minutes of thinking. It also explained to me what it had done, how it worked, and why. I could have probably figured this out myself, but grokking the Twig codebase would have taken…a day? Optimistically?
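To give a feel for the problem, here is a minimal sketch that approximates it with plain regexes. It is not the parser-based approach described above, and it is not what o1-pro produced: it scans `{{ ... }}` and `{% ... %}` blocks for dotted names and skips a short, incomplete keyword list. The function name `listTwigVariables` and the keyword list are my own assumptions, and function calls or loop variables would be misreported.

```php
<?php
// Rough regex approximation only; a real solution would walk Twig's parsed node tree.
function listTwigVariables(string $template): array {
    // Partial keyword list (assumption): extend it for the tags your templates use.
    $keywords = ['if', 'elseif', 'else', 'endif', 'for', 'endfor', 'in', 'and', 'or',
        'not', 'is', 'set', 'endset', 'true', 'false', 'null', 'none', 'defined'];
    $found = [];
    // Capture the inside of every {{ ... }} and {% ... %} block.
    preg_match_all('/\{\{(.*?)\}\}|\{%(.*?)%\}/s', $template, $blocks, PREG_SET_ORDER);
    foreach ($blocks as $block) {
        $code = $block[1] !== '' ? $block[1] : ($block[2] ?? '');
        // Drop quoted strings so their contents are not mistaken for names.
        $code = preg_replace('/"[^"]*"|\'[^\']*\'/', '', $code);
        // Drop filter names, e.g. the "upper" in "name|upper".
        $code = preg_replace('/\|\s*\w+/', '', $code);
        // Match identifiers, including dotted paths such as contact.first_name.
        preg_match_all('/(?<![\w.])[A-Za-z_]\w*(?:\.\w+)*/', $code, $names);
        foreach ($names[0] as $name) {
            if (!in_array(strtolower($name), $keywords, true)) {
                $found[$name] = true;
            }
        }
    }
    return array_keys($found);
}

$vars = listTwigVariables('Hi {{ contact.first_name|upper }}{% if contact.first_name == bob %}!{% endif %}');
// $vars is ['contact.first_name', 'bob']
```

The gaps in this version (functions, macros, loop variables, nested scopes) are exactly why the real fix had to go through Twig’s own parser.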
Explaining complex code
Example: as part of the same problem I came across this:
/**
 * Examine a token string and filter each token expression.
 *
 * @internal
 *   This function is only intended for use within civicrm-core. The name/location/callback-signature may change.
 * @param string $expression
 *   Ex: 'Hello {foo.bar} and {whiz.bang|filter:"arg"}!'
 * @param callable $callback
 *   A function which visits (and substitutes) each token.
 *   function(?string $fullToken, ?string $entity, ?string $field, ?array $modifier)
 * @param string|null $format
 *
 * @return string
 */
public function visitTokens(string $expression, callable $callback, ?string $format = 'text/html'): string {
  // Regex examples: '{foo.bar}', '{foo.bar|whiz}', '{foo.bar|whiz:"bang"}', '{foo.bar|whiz:"bang":"bang"}'
  // Regex counter-examples: '{foobar}', '{foo bar}', '{$foo.bar}', '{$foo.bar|whiz}', '{foo.bar|whiz{bang}}'
  // Key observations: Civi tokens MUST have a `.` and MUST NOT have a `$`. Civi filters MUST NOT have `{}`s or `$`s.
  $quoteStrings = $format === 'text/html' ? [
    // Note we just treat left & right quotes as quotes. Our brains are not big enough to enforce them
    // & maybe user brains are not big enough to use them correctly anyway.
    '"',
    '&lquote\;',
    '&rquote\;',
    '"\;',
    '”\;',
    '“\;',
    '"\;',
  ] : ['"'];
  // The regex is a bit complicated, we so break it down into fragments.
  // Consider the example '{foo.bar|whiz:"bang":"bang"}'. Each fragment matches the following:
  $tokenRegex = '([\w]+)\.([\w:\.]+)';
  $quoteRegex = '(?:' . implode('|', $quoteStrings) . ')';
  /* MATCHES: 'foo.bar' */
  $filterArgRegex = ':[\w' . $quoteRegex . ': %\-_()\[\]\+/#@!,\.\?]*'; /* MATCHES: ':"bang":"bang"' */
  // Key rule of filterArgRegex is to prohibit '{}'s because they may parse ambiguously. So you *might* relax it to:
  // $filterArgRegex = ':[^{}\n]*'; /* MATCHES: ':"bang":"bang"' */
  $filterNameRegex = "\w+"; /* MATCHES: 'whiz' */
  $filterRegex = "\|($filterNameRegex(?:$filterArgRegex)?)"; /* MATCHES: '|whiz:"bang":"bang"' */
  $fullRegex = ";\{$tokenRegex(?:$filterRegex)?\};";
  return preg_replace_callback($fullRegex, function($m) use ($callback, $quoteStrings) {
    $filterParts = NULL;
    if (isset($m[3])) {
      $filterParts = [];
      $enqueue = function($m) use (&$filterParts) {
        $filterParts[] = $m[1];
        return '';
      };
      $quoteOptions = implode('|', $quoteStrings);
      $quotedRegex = ':' . '(?:' . $quoteOptions . ')' . '(.+?(?=' . $quoteOptions . ')+)' . '(?:' . $quoteOptions . ')';
      $unmatched = preg_replace_callback_array([
        '/^(\w+)/' => $enqueue,
        ';' . $quotedRegex . ';' => $enqueue,
      ], $m[3]);
      if ($unmatched) {
        throw new \CRM_Core_Exception('Malformed token parameters (' . $m[0] . ')');
      }
    }
    return $callback($m[0] ?? NULL, $m[1] ?? NULL, $m[2] ?? NULL, $filterParts);
  }, $expression);
}
which was being called like so:
$e->getTokenProcessor()->visitTokens($e->string, function($token = NULL, $entity = NULL, $field = NULL, $filterParams = NULL) {
I very much needed some help here. Passing a function in the parameters is not something I encounter very often. Plus the callbacks in the regex. I had more than two problems.
Result: 10/10. I used o1 for this, as I wanted a bit of reasoned care. o1-pro would be the fallback if o1 struggled, but it’s slower. o1 broke down the function signature, the regex patterns, the regex construction, and the callbacks. I didn’t spot anything it had missed – although to be fair I probably wouldn’t in this case.
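For anyone else who rarely passes functions around, here is a stripped-down sketch of the same pattern: a function that takes a callback and uses `preg_replace_callback` to hand it each token. `visitSimpleTokens` and the `$values` array are made up for illustration; it ignores filters and the quote handling entirely, so treat it as a toy rather than a stand-in for the CiviCRM method.

```php
<?php
// Simplified, hypothetical cousin of visitTokens(): no filters, no quote handling.
function visitSimpleTokens(string $expression, callable $callback): string {
    // Matches '{entity.field}'; group 1 is the entity, group 2 is the field.
    $result = preg_replace_callback('/\{(\w+)\.(\w+)\}/', function (array $m) use ($callback) {
        // $m[0] is the full token. Whatever the callback returns replaces it in the string.
        return $callback($m[0], $m[1], $m[2]);
    }, $expression);
    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $expression;
}

// Illustrative data only.
$values = ['contact' => ['first_name' => 'Alice']];

$out = visitSimpleTokens(
    'Hello {contact.first_name}, your {order.total} is ready',
    function (string $token, string $entity, string $field) use ($values) {
        // Substitute known tokens; leave unknown ones untouched.
        return $values[$entity][$field] ?? $token;
    }
);
// $out is 'Hello Alice, your {order.total} is ready'
```

The outer function owns the regex and the looping; the caller only decides what each token turns into. That split is the whole point of taking a callable.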
Reworking deprecated templates
Example: we had an old Mailchimp template that just stopped working one day. Something was obviously deprecated. Mailchimp, having long abandoned all pretense of caring about anything but money, were little use. And it was a gigantic template. Could o1-pro just…fix it?
Result: 5/10. Not really. o1-pro knew what the problems were, and explained them in detail. It tried very hard to rewrite it, but the result didn’t quite work. I think the template was just too big. It seemed close, but it felt like it would take longer to tease apart the underlying issues from o1-pro’s fixes – so we just rebuilt it. I’ve saved it to try again on o3-pro.
Dumb mistakes
The code obviously should work, and there’s obviously some stupid problem somewhere because I am stupid.
Example: Oh god why doesn’t this work, please fix it, I’m very tired
'groups' => [
'include' => $this->dto->includeGroups,
'exclude' => $this->dto->excludeGroups,
],
'from_name' => $this->dto->fromName,
from_email' => $this->dto->fromEmail,
'scheduled_date' = $this->dto->scheduleDate->format('Y-m-d H:i:s'),
'approval_date' => date('Y-m-d H:i:s'),
'created_id' => $this->dto->createdID,
Result: 10/10. Claude or o3-mini spot this instantly, are very nice about it, and suggest I get some sleep.
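For the record, the snippet above has two typos: `from_email'` is missing its opening quote, and `'scheduled_date'` uses `=` where it needs `=>`. A corrected sketch of those lines follows; the `$dto` object and its values are stand-ins I have invented so the snippet runs on its own, in place of the real `$this->dto`.

```php
<?php
// Hypothetical stand-in for $this->dto, just so the snippet is self-contained.
$dto = (object) [
    'fromName' => 'Example Org',
    'fromEmail' => 'hello@example.org',
    'scheduleDate' => new DateTimeImmutable('2025-03-01 09:30:00'),
];

$params = [
    'from_name' => $dto->fromName,
    // Was: from_email' (missing opening quote).
    'from_email' => $dto->fromEmail,
    // Was: 'scheduled_date' = ... (assignment instead of the => array arrow).
    'scheduled_date' => $dto->scheduleDate->format('Y-m-d H:i:s'),
    'approval_date' => date('Y-m-d H:i:s'),
];
```

Running `php -l` on the file would also have flagged the parse error, but it is less nice about it.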
How could this be better?
Sometimes you know there’s a better way. No particular example here, but I do this pretty regularly with functions I feel aren’t very elegant.
Result: 8/10. Works very well on normal code. It regularly gives you new approaches, or reminds you of things you’d forgotten. So you’re learning too.
Occasionally it’ll optimize beyond the point of readability, and on fancier stuff it sometimes recommends overcomplex patterns: set up factories and repositories etc. But you can just say ‘I don’t need it that complex’ and it’ll usually back off.
Drudge work
I have done this a million times, please do it for me and save me 10 minutes.
Example: I need a function to extract x data from this embed, parse it, and fire a callback with the results. Here’s the example given in the documentation, and here’s the output I need.
Result: 10/10. It used to be that Claude and 4o would regularly make one or two mistakes. But o1 and o3-mini reliably get the standard stuff right first time.
Complexities in the full codebase
Example: somewhere in the process of renewing a membership, but only in unknown circumstances, an end date is being added to the payment object. There are a bunch of Symfony events being fired, and there’s my own custom code on top of the default handling, and I don’t know where it’s happening or why.
Result: 0/10. The full codebase doesn’t fit into the context window, the latest code is way outside the training window, and I can’t practically fine-tune on it[2]. I have to handle this the old-fashioned way, like it’s 2024 or something.
I cannot wait until this is a thing. My life will change. Probably the world too.
Integrating the API
Result: 9/10. I won’t go into much detail here, but integrating with OpenAI and Anthropic is trivial: “here’s an API key and an endpoint, have at it”. Google, not so much – projects and credentials and libraries and blah.
The only catch is that it feels slower than you expect: by default an API call waits for all the tokens to generate before it returns anything, while the regular chat UI shows them as they appear in real time. Both APIs can stream tokens too, but you have to ask for it.
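The difference is easy to see in a toy simulation. The sketch below makes no real API calls: `generateTokens` is a made-up generator that sleeps between tokens to mimic a model, and the two `complete*` functions are hypothetical helpers that consume it in blocking and streaming style.

```php
<?php
// Stand-in for a model producing tokens one at a time (no real API involved).
function generateTokens(array $tokens, int $delayMicros): Generator {
    foreach ($tokens as $token) {
        usleep($delayMicros);
        yield $token;
    }
}

// Blocking style: nothing is available until every token has been generated.
function completeBlocking(Generator $tokens): string {
    return implode('', iterator_to_array($tokens, false));
}

// Streaming style: hand each token to the caller as soon as it exists.
function completeStreaming(Generator $tokens, callable $onToken): string {
    $text = '';
    foreach ($tokens as $token) {
        $onToken($token);
        $text .= $token;
    }
    return $text;
}

$tokens = ['Hello', ', ', 'world', '!'];

// Blocking: the first output only arrives once the last token is done.
$start = microtime(true);
$blockingText = completeBlocking(generateTokens($tokens, 20000));
$blockingFirstOutput = microtime(true) - $start;

// Streaming: record how long the first token took to arrive.
$start = microtime(true);
$streamFirstOutput = null;
$streamText = completeStreaming(
    generateTokens($tokens, 20000),
    function (string $token) use (&$streamFirstOutput, $start) {
        $streamFirstOutput ??= microtime(true) - $start;
    }
);
```

Both calls take the same total time and produce the same text. The only thing that changes is how long you stare at nothing before the first word shows up.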
How can I achieve this thing?
OpenAI’s new Deep Research tool is a wonder. You can send it off to do 5–30 minutes of research (usually about 10) and it’ll go google a bunch of things, read a bunch of pages, and generate a report for you.
Example: what options are there for open-source email builders? I need something I can embed into my website. It needs to be able to do x, y and z. I’ve tried a and b, and I had these problems. Please let me know the advantages and disadvantages of each, as well as any user reviews.
Result: 11/10. Amazing, job-changing stuff. I’ve had it make me reports of 11,000 words. These have had some misunderstandings, but if a human had written me one of them I’d consider it a good job. No hallucinations so far. The reports are just so good at starting you off on thorny problems. And they save you hours.
Conclusions
o1-pro and Deep Research are only available on OpenAI’s Pro tier, which is £200/month. This is a lot of money. We decided to try it at work for a month, and it’s been an obvious win. My dev time is worth £100+/hr, and the Twig work alone has covered that. It’s a no-brainer if you can afford it.
I haven’t even mentioned the AI completion built into PhpStorm, which is so very pleasing when it does something clever.
There’s obviously the tedious shibboleth: it’s not perfect[3]. You have to have your wits about you. But while there is plenty in AI that’s genuinely difficult – copyright, for example – “this tool is not perfect therefore I won’t use it” is throwing the baby out with the bathwater. Just look at what you can do now.