Sun.Star Cebu

Cebuano student leads team behind voice-controlled photo editing app

BY MAX T. LIMPAG of Sun.Star Cebu

CHANGE the color of the shirt, the voice on the video said. Like magic, the shirt of the woman in the photo took on a bluish hue and, with a swipe on a slider, turned orange.

The video is a demonstration of PixelTone, a prototype iPad app that allows users to edit images using voice commands and touch gestures.

The app was created by a team from the University of Michigan School of Information working with Adobe Research. That team is led by graduate student research assistant and master's student Gierad Laput, a Cebuano.

Laput is from Barangay Guizo, Mandaue City. He went to Colegio de la Inmaculada Concepcion – Mandaue for elementary school before attending Cebu City National Science High School. He studied engineering at the University of San Carlos for a year before moving to the United States in 2004.

Academic offers

Laput said the training he got from the schools in Cebu “really prepared me for the academic work and rigor in the US.” He got his undergraduate degree at the University of Michigan, where he is working on a master's degree. He will be pursuing a PhD in computer science this September.

Although he has not yet decided where to attend, Laput said he got full offers from 10 schools, including Massachusetts Institute of Technology, University of Washington, University of California, Berkeley, Carnegie Mellon University and Stanford University.

During his undergraduate studies, Laput and a colleague submitted CrowdConnect, a platform for internal crowdsourcing, to the Ford IT Innovation Contest. It was voted one of the top 10 entries out of more than 200 submissions.

“We received great feedback from top-level managers within the company, but unfortunately, I had to leave Ford to study for my master's, so I was not able to fully push through with the idea,” he said.

For a research project in the summer of 2012, Laput and six research scientists collaborated on PixelTone. He was an intern with Adobe Research in San Francisco and the only student on the team.

He said the idea was inspired by Siri, Instagram and Photoshop. In their paper “PixelTone: A Multimodal Interface for Image Editing,” the team said, “photo editing can be a challenging task, and it becomes even more difficult on the small, portable screens of mobile devices that are now frequently used to capture and edit images.”

“To address this problem we present PixelTone, a multimodal photo editing interface that combines speech and direct manipulation.”

Laput said, “the idea was also inspired by folks like my dad and my sister, who have less experience or are sometimes intimidated by monolithic applications such as Photoshop. They often turn to tools like Instagram or Microsoft Paint for their photo editing needs.”

“In essence, we tried to answer the question: ‘how can we fuse the richness of Photoshop and the simplicity of Instagram?’ This question was the main motivation behind PixelTone.”

Understand commands

Laput said the app can understand spoken commands and users do not need to memorize phrases.

“For example, you can say ‘make the image spicy’ and PixelTone will try to interpret what ‘spicy’ means. It uses grammar technology to find a command that is a synonym of an unknown word, in this case ‘spicy.’ In this particular example, it will increase the ‘warmth’ of the image since ‘spicy’ is related to ‘warm,’” he said.

He said that even if the user jumbles the words, the app will still try to understand the command. If PixelTone cannot understand the request, it will offer the user options and “has the potential to learn new commands that way.”
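The behavior Laput describes can be pictured with a small sketch. This is only an illustration under stated assumptions, not PixelTone's actual code: the command list, the hand-written synonym table and the function names `resolve` and `interpret` are all hypothetical. The sketch looks at each spoken word in any order, follows synonym links until it reaches a command it knows, and falls back to offering options when nothing matches.

```python
# Hypothetical illustration only; not PixelTone's actual code or data.

# Commands the sketch "knows": word -> (parameter, direction of change).
KNOWN_COMMANDS = {
    "warm": ("warmth", +1),
    "cool": ("warmth", -1),
    "bright": ("brightness", +1),
    "dark": ("brightness", -1),
}

# Hand-written synonym table standing in for a real lexical resource.
SYNONYMS = {
    "spicy": ["hot", "warm"],
    "hot": ["warm"],
    "chilly": ["cold", "cool"],
    "cold": ["cool"],
    "sunny": ["bright"],
}


def resolve(word, max_hops=2):
    """Breadth-first search through synonym links for a known command word."""
    frontier = [word.lower()]
    seen = set(frontier)
    for _ in range(max_hops + 1):
        next_frontier = []
        for candidate in frontier:
            if candidate in KNOWN_COMMANDS:
                return candidate
            for synonym in SYNONYMS.get(candidate, []):
                if synonym not in seen:
                    seen.add(synonym)
                    next_frontier.append(synonym)
        frontier = next_frontier
    return None


def interpret(utterance):
    """Return ("apply", (parameter, direction)) or ("ask", list_of_options)."""
    # Word order is ignored: each token is checked on its own.
    for token in utterance.lower().split():
        match = resolve(token.strip(".,!?'\""))
        if match is not None:
            return ("apply", KNOWN_COMMANDS[match])
    # Nothing understood: offer the known commands as options.
    return ("ask", sorted(KNOWN_COMMANDS))


print(interpret("make the image spicy"))
print(interpret("spicy image the make"))
print(interpret("ipa-gwapa"))
```

In this sketch, the first two calls lead to the same warmth adjustment even though the words are jumbled, while the third returns the list of options because no word can be linked to a known command.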

Game-changing

Laput said they showed the app to people inside Adobe, including product managers of Photoshop.

“Since the idea behind PixelTone is particularly new and potentially game-changing, Adobe will have to invest time to make sure the technology is ready for a wider audience,” he said.

While the current prototype will not be able to understand “ipa-gwapa (make her beautiful),” Laput said they are working on another idea that will allow users to teach PixelTone to understand words and phrases like “ipa-gwapa.”

Laput said voice interfaces will, in the future, become the main driver for interacting with computers.

“But the best experience is the one that gives users multiple options (i.e., voice with gestures, but not voice or gestures alone), since this brings greater flexibility in helping them accomplish their goals,” he said.

(CONTRIBUTED FOTO) GIERAD LAPUT shown above in a recent presentation. Laput, a Cebuano, is a master's student at the University of Michigan School of Information and is the main contributor to a research project on photo editing using voice control and touch gestures.
