Assistant Professor David Lo from SMU School of Information Systems is working to push the boundaries of what can be done by analysing huge amounts of software data.
By Nadia El-Awady
SMU Office of Research – Software analytics is a relatively new field of research, having developed more formally near the start of the 21st century. It involves analysing the large amount of data produced during the software lifecycle, including source code, bug reports, and user feedback. By analysing this data, software developers are able to improve software development and performance.
David Lo, Assistant Professor at the Singapore Management University School of Information Systems (SIS), has published many research papers on the topic in the past six years.
“My work is motivated by the high cost involved in developing and maintaining software systems and the importance of delivering systems of high quality,” says Professor Lo. “New innovations are needed to design tools and techniques that can help keep software development and maintenance costs low, while keeping the quality of software systems high.”
The surge of software data that has recently become publicly available online provides excellent opportunities to create customised solutions that can be used to automate software engineering tasks, he explains.
“Being able to create new solutions to tackle concrete problems excites me the most,” he says.
Even though software engineering has been a part of information systems for some time, it still faces a wide range of problems that require solutions. The field has been developing rapidly in recent years with the introduction of new platforms, processes and programming tools to create software products. This not only creates new challenges but also new opportunities, Professor Lo explains.
“Being able to understand and work with those challenges and design solutions to address them, not alone but with students and colleagues from academia and industry across the globe, makes my job an interesting and satisfying one,” he says.
In 2014, Professor Lo published a study, conducted with two SMU colleagues, in which they developed an algorithm to create a search engine for source code (the commands that are assembled into a software programme). Many code search techniques had been proposed previously, but they depended on searching through text only. Source code, however, is not mere text: it contains elements that depend on one another in order for the programme to execute. Professor Lo and his colleagues developed a technique called AutoQuery, which allowed programmers to search through code using dependency queries made out of small snippets of code. The technique took the structure of the code into consideration rather than simply looking at its text.
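To illustrate the difference between searching text and searching structure, here is a minimal sketch in Python. It is not AutoQuery itself, whose algorithm the article does not describe; the sample code and the helper names text_search and dependent_calls are invented for this example. The text search matches any line containing a keyword, including comments and strings, while the structural search only reports functions in which a value produced by one call is actually used by another.

```python
import ast

# Hypothetical sample code to search through (an assumption for illustration).
SOURCE = '''
def load(path):
    # remember to close the handle
    handle = open(path)
    data = handle.read()
    handle.close()
    return data

def describe():
    return "call open, then read, then close"
'''


def text_search(source, keyword):
    """Return the 1-based numbers of lines whose text contains the keyword."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if keyword in line
    ]


def dependent_calls(source, producer, consumer_method):
    """Return names of functions in which a variable assigned from
    producer(...) also has .consumer_method(...) called on it.
    This is a very simple data dependency check, not a full analysis."""
    matches = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        # Step 1: collect variables assigned directly from producer(...).
        produced = set()
        for node in ast.walk(func):
            if (
                isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Call)
                and isinstance(node.value.func, ast.Name)
                and node.value.func.id == producer
            ):
                produced.update(
                    target.id for target in node.targets
                    if isinstance(target, ast.Name)
                )
        # Step 2: look for variable.consumer_method(...) on those variables.
        for node in ast.walk(func):
            if (
                isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == consumer_method
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id in produced
            ):
                matches.append(func.name)
                break
    return matches


if __name__ == "__main__":
    print(text_search(SOURCE, "close"))
    print(dependent_calls(SOURCE, "open", "close"))
```

In this sample, the text search also flags the comment and the string that merely mention "close", whereas the structural query reports only the function that really opens something and then closes it.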
Better ways to debug
Software programmes often contain defects or bugs that need to be detected and repaired. Doing this “debugging” manually usually takes considerable time and resources. To help developers debug more efficiently, automated debugging solutions have been proposed. One family of solutions goes through information available in bug reports. Another goes through information collected by running a set of test cases. Professor Lo notes that until now, there has been a “missing link” that prevents these two threads of work from being combined.
Together with colleagues from SMU, Professor Lo has developed an automated debugging approach called Adaptive Multimodal Bug Localisation (AML). AML gleans debugging hints from both bug reports and test cases, and it performs a statistical analysis to pinpoint programme elements that are likely to contain bugs. Moreover, AML adapts itself for different kinds of bugs.
“AML can reduce the manual process of finding where a bug resides in a big programme,” he explains. “While most past studies only demonstrate the applicability of similar solutions for small programmes and artificial bugs, our approach can automate the debugging process for many real bugs that impact large programmes.”
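The general idea of drawing hints from both bug reports and test cases can be sketched in a few lines of Python. This is not AML's actual statistical model, which the article does not detail, and it does not adapt itself to different kinds of bugs. As an assumption for illustration, it scores each method by a word-overlap (cosine) similarity with the bug report and by the Ochiai formula, a standard way of ranking code by how often failing tests execute it, then mixes the two with a fixed weight. All function names and parameters here are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word counts of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def ochiai(failed_cover, passed_cover, total_failed):
    """Ochiai suspiciousness: high when a method is run mostly by failing tests."""
    denominator = math.sqrt(total_failed * (failed_cover + passed_cover))
    return failed_cover / denominator if denominator else 0.0


def rank_methods(bug_report, methods, coverage, total_failed, weight=0.5):
    """Rank method names from most to least suspicious.

    methods:  {name: descriptive text or source of the method}
    coverage: {name: (failing tests that run it, passing tests that run it)}
    weight:   share given to the bug-report score (fixed here by assumption)
    """
    scores = {}
    for name, text in methods.items():
        failed_cover, passed_cover = coverage.get(name, (0, 0))
        text_score = cosine_similarity(bug_report, text)
        spectrum_score = ochiai(failed_cover, passed_cover, total_failed)
        scores[name] = weight * text_score + (1 - weight) * spectrum_score
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Invented data for illustration only.
    report = "crash when parsing empty date string"
    methods = {
        "parse_date": "parse date string into calendar fields",
        "format_date": "format calendar fields as date string",
        "send_mail": "send mail to recipient",
    }
    coverage = {"parse_date": (2, 1), "format_date": (0, 3), "send_mail": (1, 3)}
    print(rank_methods(report, methods, coverage, total_failed=2))
```

With this invented data, the method that both shares words with the report and is run by every failing test comes out on top, which neither source of hints would show as clearly on its own.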
Professor Lo and his colleagues presented AML at the 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering in Italy. They now plan to contact several industry partners to take AML one step closer to being integrated as a software development tool.
Taking a multidisciplinary approach
Professor Lo is enthusiastic about multidisciplinary work with his SMU colleagues. “Besides colleagues who specialise in similar research areas, I collaborate with many other colleagues across the five research areas at the School of Information Systems,” he says. “I have benefitted from their diverse expertise to solve challenges that I otherwise could not have solved alone, and to spot opportunities that I otherwise would not have noticed. These collaborations have resulted in many pieces of work that have been published in various international conferences and journals.”
Professor Lo is also hoping to be involved in future collaborations with colleagues from other schools at SMU. “I strongly believe a multidisciplinary approach will result in holistic research works that expand frontiers of research in new and interesting directions,” he says.
For example, he is currently looking at ways to optimise cooperative workflows in software organisations and in open source teams. A project of this kind would require expertise from diverse fields such as organisational behaviour, psychology and group behaviour, empirical analysis, applied statistics, and game theory. Professor Lo also plans to study the problem-solving and mental task processes that software developers undergo. This project would benefit from the expertise of his colleagues from the School of Social Sciences in psychology, he says.
Aside from his research projects, Professor Lo enjoys teaching a variety of undergraduate and postgraduate software engineering courses at SMU. He supervises undergraduate projects that require teams of students to develop software solutions for real clients, and also works closely with SMU PhD candidates to bring his research ideas to fruition.
“SMU provides a lot of support for faculty members to do research: travel grants to present papers at conferences, visiting professors, and hardware support are some of the things that SMU provides to facilitate research activities,” he says. “Also, the Office of Research has provided much support for research grant submissions, and the SMU library has provided much support in securing greater visibility for my work.”
One of Professor Lo’s research ambitions is to develop an Internet-scale software analytics solution. With Internet-scale software analytics, the massive amounts of passive software data buried in a myriad of diverse online repositories can be analysed to transform manual, painstaking and error-prone software engineering tasks into automated activities that can be performed efficiently and with high quality. This is done by harvesting the wisdom of the masses, accumulated through years of development effort by thousands of developers and hidden in these passive, distributed and diverse data sources. “I strongly believe this will be ground-breaking because no existing software analysis technique has come close to making sense of software engineering data at this scale and diversity in a holistic way,” says Professor Lo.