With Apple's Siri and other voice-recognition software becoming commonplace, you might take it for granted that we can now talk to computers. But as dependable as these systems have become, you will not get them to do anything that they are not already programmed to do. Now Regina Barzilay and colleagues at the Massachusetts Institute of Technology have talked a computer into writing new software.
Their system takes a task described in natural language and automatically generates the computer code to carry it out – an important first step toward allowing people who are not familiar with computer code to program computers. "It won't replace the need for programmers, but it can help with specific programming tasks," says Barzilay.
The team focused on a common problem – writing software that reads the input given to a computer. Generating this code automatically frees programmers to write the parts of software that require more creativity.
Code that checks input is at the heart of web forms, spreadsheets and databases. The challenge is to specify what kind of input is allowed. When you log in to a website, for example, software checks that what you type matches the required format for a password or email address. In simplified terms, an email address must consist of letters and/or digits, then an @ symbol, more letters and/or digits, and end with ".com" or ".co.uk" or similar.
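To make the idea concrete, here is a minimal sketch in Python of the kind of input check described above. It follows only the simplified email format given in this article; real email addresses allow many more forms, and the names `EMAIL_PATTERN` and `looks_like_email` are made up for illustration.

```python
import re

# Simplified format from the article: letters and/or digits, an @ symbol,
# more letters and/or digits, then an ending such as ".com" or ".co.uk".
# This is an illustration, not a complete email validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9]+@[A-Za-z0-9]+(\.[A-Za-z]+)+")


def looks_like_email(text: str) -> bool:
    """Return True if the whole string matches the simplified format."""
    return EMAIL_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["alice@example.com", "bob42@example.co.uk", "no-at-symbol.com"]:
        print(candidate, looks_like_email(candidate))
```

Even a check this small has to be written and tested by hand, which is the routine work the researchers want to automate.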
Barzilay's team developed a technique that takes a natural-language description of the required input and automatically generates the code to check for that input. The system works by extracting noun phrases – such as "one or more letters and digits" and "an 'at' symbol" – and building code accordingly.
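As a rough illustration of turning phrases into checking code, the toy Python sketch below maps a handful of noun phrases to pattern fragments and joins them into one check. It is not the MIT team's method: the lookup table is written by hand, the phrases "a dot" and "one or more letters" are added here as assumptions, and the names `PHRASE_TO_REGEX` and `build_checker` are hypothetical.

```python
import re

# Hand-written table (an assumption for this sketch): each noun phrase
# is paired with the regular-expression fragment it stands for.
PHRASE_TO_REGEX = {
    "one or more letters and digits": r"[A-Za-z0-9]+",
    "an 'at' symbol": r"@",
    "a dot": r"\.",
    "one or more letters": r"[A-Za-z]+",
}


def build_checker(phrases):
    """Join the fragments for the given phrases into one input check.

    Raises KeyError if a phrase is not in the table.
    """
    pattern = re.compile("".join(PHRASE_TO_REGEX[p] for p in phrases))
    return lambda text: pattern.fullmatch(text) is not None


# A description of a simple email-like format, phrase by phrase.
check = build_checker([
    "one or more letters and digits",
    "an 'at' symbol",
    "one or more letters and digits",
    "a dot",
    "one or more letters",
])

if __name__ == "__main__":
    print(check("alice@example.com"))
    print(check("alice.example.com"))
```

A fixed table like this breaks as soon as a description uses wording it has not seen, which hints at why free-form natural language is the hard part of the problem.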
Imprecise and ambiguous
They tested it with 106 natural-language descriptions of different input formats taken from ACM International Collegiate Programming Contests – input formats designed to challenge coders. They found that their system could automatically generate correct software for over 70 per cent of these descriptions.
The work will be presented in August at the annual meeting of the Association for Computational Linguistics in Sofia, Bulgaria.
Robert Chatley at software development consultancy Develogical in London agrees that generating code from natural language can be useful. But he notes that we are a long way from doing this for more complex tasks. "Natural language can often be imprecise, ambiguous and idiomatic, none of which are handled well by computers," he says.