Cabinet Office / AI search

The volume of unstructured information held by government is vast: imagine a stack of paper 20,000 miles high in the centre of Whitehall. Working with Cabinet Office we’ve prototyped machine learning and natural language processing tools to make the knowledge in these documents transparent and accessible.


The Better Information for Better Government programme (BI4BG) seeks to bring a 21st-century approach to knowledge management, so that policymakers can learn from previous work and knowledge and information managers (KIMs) can meet their legislative obligations around archiving and Freedom of Information access.

But this is an immense task – billions of documents are held in ‘digital heaps’ across government. Our team calculated that it would take 100 people 100 years to manually categorise all that information: a human-only solution is clearly not going to work.

What we did

During Discovery we engaged with more than 40 stakeholders across 13 government departments to surface user needs and develop value propositions, along with a technology strategy for making better use of knowledge across government.

In the subsequent Alpha, our team rapidly prototyped a product in one-week sprints, refining the design after 4-5 usability testing sessions in each sprint. In a parallel stream of work, our data scientists ran three separate experiments to test the fundamental ML and NLP approaches that would underpin the product.

Any use of machine learning within the public sector must be interpretable and explainable, so during Discovery we identified appropriate algorithms for prototyping experimental ML/NLP approaches. 
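To illustrate what "interpretable and explainable" can mean in practice, here is a minimal sketch of a linear term-weight scorer that reports each term's contribution alongside its decision. This is not the approach used in the project: the term weights, the threshold and the function name are all hypothetical, chosen only to show the idea.

```python
# Hypothetical term weights; in a real system these would be learned
# from labelled documents rather than written by hand.
WEIGHTS = {"minister": 1.5, "policy": 1.0, "decision": 1.2, "lunch": -1.0}
THRESHOLD = 2.0  # assumed cut-off for flagging a document as high-value


def explain(text: str) -> tuple[bool, dict[str, float]]:
    """Score a document and return each known term's contribution."""
    contributions: dict[str, float] = {}
    for token in text.lower().split():
        if token in WEIGHTS:
            contributions[token] = contributions.get(token, 0.0) + WEIGHTS[token]
    # The decision is a plain sum of visible contributions, so a reviewer
    # can see exactly which terms pushed the document over the threshold.
    return sum(contributions.values()) >= THRESHOLD, contributions


flagged, why = explain("Policy decision for the minister")
print(flagged, why)
```

Because the score is a simple sum, every flagged document comes with a list of the terms responsible, which is the kind of property that makes a model easy to audit.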

Within very tight timeframes we demonstrated the value of three different core technologies that would underpin the beta product, co-designed and prototyped an MVP beta product, and validated our work with a wide cross-government audience.


Using NLP we successfully identified high-value documents, improving on the accuracy of the current processes with a reduction of more than 80% in incorrectly identified documents (i.e. false positives). By using appropriate algorithms we ensured that our solution remained interpretable and explainable. The prototype service successfully passed its alpha assessment.
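For readers unfamiliar with the metric, a relative reduction in false positives is the drop in incorrectly flagged documents as a fraction of the baseline count. The sketch below shows the arithmetic only. The counts are hypothetical and are not figures from the project.

```python
def false_positive_reduction(baseline_fp: int, new_fp: int) -> float:
    """Relative reduction in false positives, as a fraction of the baseline."""
    if baseline_fp <= 0:
        raise ValueError("baseline_fp must be positive")
    return (baseline_fp - new_fp) / baseline_fp


# Hypothetical counts for illustration only.
baseline_fp = 500  # documents wrongly flagged by the existing process
new_fp = 90        # documents wrongly flagged by the new approach
reduction = false_positive_reduction(baseline_fp, new_fp)
print(f"{reduction:.0%} fewer false positives")
```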

“The best application of AI I’ve seen in government”

Luke Sands, Cabinet Office Head of Digital