13 things we learned from two AI Developer Tools workshops using just GitHub Copilot Agent Mode

Over the past few weeks, we ran two hands-on workshops with teams across Rightmove to explore how GitHub Copilot, specifically its Agent Mode, could support day-to-day engineering work.

The idea was simple:

take real Jira tickets and solve them using only prompts; no keyboard coding allowed

From back-end engineers and front-end developers to product managers, QAs, BAs and app analysts, everyone got involved. Each team had an hour to tackle a ticket, then 20 minutes to present what worked, what didn’t, and what they learned.

Why we did it

With AI tooling becoming increasingly embedded in the developer experience, we wanted to give our teams a chance to experiment in a structured but low-pressure environment.

These workshops were designed to help engineers and their cross-functional teammates build practical prompting skills, explore what GitHub Copilot Agent Mode can (and can’t) do, and see how it fits into their everyday workflows.

We were also keen to gather honest, real-world feedback to help us shape how we support AI adoption across different domains, especially as one of our key strategic pillars at Rightmove is to make AI part of our culture.

How it was organised

For both workshops, we asked teams and their managers to identify Jira work items suitable for putting Agent Mode through its paces. We made sure there was a variety of work types: upgrades, bug fixes, tech debt and feature development, across a mix of legacy and modern solutions.

We compiled these onto a Miro board for each session so that individuals, pairs or small groups could claim an item and record notes against it. This let everyone pick something they were either comfortable with or wanted to learn about, minimised duplication, and gave us a written record of how each exercise went.

What went well

Copilot proved genuinely helpful in a number of ways:

  • Speeding up small changes. Things like renaming constants, removing dead code, or generating unit tests were done quickly and cleanly (there’s an illustrative sketch after this list).
  • Boilerplate generation. It handled repetitive patterns well, especially in newer projects with clear conventions.
  • Codebase exploration. Many participants found it useful to query Copilot about code they weren’t familiar with or ask for “explain it like I’m a five-year-old” breakdowns.
  • Non-coding tasks. Writing acceptance criteria, structuring Jira stories, and generating documentation all came up as valuable use cases.
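To give a flavour of those small-change wins, here is a minimal, hypothetical sketch of the kind of unit test Agent Mode produced from a one-line prompt such as “write JUnit 5 tests for PriceFormatter.format, covering zero and typical values; change nothing else”. The PriceFormatter class, its formatting rules and the prompt wording are all invented for illustration; none of it comes from our codebase.

    // Hypothetical sketch: the sort of small, self-contained test Agent Mode
    // generated quickly. All names and rules here are illustrative.
    import static org.junit.jupiter.api.Assertions.assertEquals;

    import java.text.NumberFormat;
    import java.util.Locale;
    import org.junit.jupiter.api.Test;

    class PriceFormatterTest {

        // Minimal helper under test, inlined so the sketch is self-contained.
        static class PriceFormatter {
            static String format(int price) {
                if (price == 0) {
                    return "POA"; // illustrative rule: zero means "price on application"
                }
                return NumberFormat.getCurrencyInstance(Locale.UK)
                        .format(price)
                        .replace(".00", "");
            }
        }

        @Test
        void formatsTypicalPriceWithPoundSignAndCommas() {
            assertEquals("£350,000", PriceFormatter.format(350_000));
        }

        @Test
        void formatsZeroAsPriceOnApplication() {
            assertEquals("POA", PriceFormatter.format(0));
        }
    }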

Where it struggled

Naturally, it wasn’t perfect. Some common pain points emerged:

  • Lack of context. Copilot often struggled with project-specific conventions, outdated codebases, or legacy patterns.
  • Incorrect assumptions. It occasionally generated the wrong logic, placed code in the wrong location, or misunderstood what the user was asking for.
  • Tests and formatting. While it could generate tests, they weren’t always correct or aligned with our standards. Formatting could also be hit and miss.
  • Overcorrections. In some cases, Copilot made changes beyond the scope of the request, which reinforced the importance of clear, constrained prompts.

Lessons we’re taking forward

Across both workshops, one with our Customer domain and one with our Internal Product teams, some key themes emerged:

  • Prompt engineering is a real skill. Copilot works best when you guide it firmly. Being explicit, giving it the right context, and even using tricks like “change nothing else” made a big difference (see the example prompt after this list).
  • Codebase health matters. Clean, modern, well-structured projects gave much better results than legacy code with unclear patterns.
  • It’s not just for engineers. Product managers and analysts also found value, particularly in generating user stories and understanding technical dependencies.
  • Adoption needs trust. People need to see the tool in action, warts and all, to build confidence in using it responsibly.
  • Choose the right model. Some models handled certain prompts noticeably better than others; many engineers found Claude 3.5 Sonnet gave better first attempts than GPT-4.1.
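To make the first of those lessons concrete, here is the shape of prompt that tended to work, sketched against a made-up ticket (the file and constant names are invented for illustration):

    Vague:       “Fix the results limit.”
    Constrained: “In PropertySearchService.java, rename the constant
                 MAX_RESULTS to MAX_SEARCH_RESULTS and update every
                 reference to it. Change nothing else: no reformatting,
                 no refactoring, and do not touch the tests.”

The vague version leaves Agent Mode guessing at scope; the constrained one names the file, spells out the exact change and, crucially, says what not to do.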
Post written by

Joe Stickings is the Head of Engineering at Rightmove. He leads our talented engineering team across our offices in London, Milton Keynes and Newcastle. Outside of work Joe is a big football fan and a competitive tennis player. He is also passionate about aviation and is a qualified pilot. In his spare time you'll find him flying light aircraft at his local airfield.
