Cheap, color, holographic video
Today in the journal Nature, researchers at MIT's Media Lab report a new approach to generating holograms that could lead to color holographic-video displays that are much cheaper to manufacture than today's experimental, monochromatic displays. The same technique could also increase the resolution of conventional 2-D displays.
Using the new technique, Daniel Smalley, a graduate student in the Media Lab and first author on the new paper, is building a prototype color holographic-video display whose resolution is roughly that of a standard-definition TV and which can update video images 30 times a second, fast enough to produce the illusion of motion. The heart of the display is an optical chip, resembling a microscope slide, that Smalley built, using only MIT facilities, for about $10.
"Everything else in there costs more than the chip," says Smalley's thesis advisor, Michael Bove, a principal research scientist at the Media Lab and head of its Object-Based Media Group. "The power supplies in there cost more than the chip. The plastic costs more than the chip."
Joining Bove and Smalley on the Nature paper are two other graduate students in Bove's group, James Barabas and Sundeep Jolly, and Quinn Smithwick, who was a postdoc at MIT at the time but is now a research scientist at Disney Research.
When light strikes an object with an irregular surface, it bounces off at a huge variety of angles, so that different aspects of the object are disclosed when it's viewed from different perspectives. In a hologram, a beam of light passes through a so-called diffraction fringe, which bends the light so that it, too, emerges at a host of different angles.
One way to produce holographic video is to create diffraction fringes from patterns displayed on an otherwise transparent screen. The problem with that approach, Bove explains, is that the pixels of the diffraction pattern have to be as small as the wavelength of the light they're bending, and "most display technologies don't happily shrink down that much."
Stephen Benton, a Media Lab professor who died in 2003, created one of the first holographic-video displays by adopting a different technique, called acousto-optic modulation, in which precisely engineered sound waves are sent through a piece of transparent material. "The waves basically squeeze and stretch the material, and they change its index of refraction," Bove says. "So if you shine a laser through it, [the waves] diffract it."
Benton's most sophisticated display, the Mark-II, which was built with the help of Bove's group, applied acousto-optic modulation to a crystal of an expensive material called tellurium dioxide. "That was the biggest piece of tellurium dioxide crystal that had ever been grown," Bove says. "And that wasn't TV resolution. So there was a definite scaling problem going on there."
Smalley instead uses a much smaller crystal of a material called lithium niobate. Just beneath the surface of the crystal he creates microscopic channels known as waveguides, which confine the light traveling through them. Onto each waveguide, he also deposits a metal electrode, which can produce an acoustic wave.
Each waveguide corresponds to one row of pixels in the final image. In the Mark-II, the tellurium dioxide crystal had to be big enough that the acoustic waves producing the separate lines of the hologram were insulated from each other. In Smalley's chip, on the other hand, the waveguides with their individual electrodes can be packed mere micrometers apart from each other.
Beams of red, green and blue light are sent down each waveguide, and the frequencies of the acoustic wave passing through the crystal determine which colors pass through and which are filtered out. Combining, say, red and blue to produce purple doesn't require a separate waveguide for each color; it just requires a different acoustic-wave pattern.
Bove considers that the most exciting aspect of the new chip. "Until now, if you wanted to make a light modulator for a video projector, or an LCD panel for a TV, or something like that, you had to deal with the red light, the green light and the blue light separately," he says. "If you look closely at an LCD panel, each pixel actually has three little color filters in it. There's a red subpixel, a green subpixel and a blue subpixel."
"First of all," he continues, "that's inefficient, because the filters, even if they were perfect, would throw away two-thirds of the light. But second, it reduces either the resolution or the speed at which the modulator can operate."
According to Smalley, on the other hand, "What's most exciting about [the new chip] is that it's a waveguide-based platform, which is a major departure from every other type of spatial light modulator used for holographic video right now." Waveguides are already a common feature in commercial optoelectronics, Smalley explains, and techniques for manufacturing them are well established. "One of the big advantages here is that you get to use all the tools and techniques of integrated optics," he says. "Any problem we're going to meet now in holographic video displays, we can feel confidence that there's a suite of tools to attack it, relatively simply."
"This has the potential to be a game-changer, and I'm really serious about that," says Pierre Blanche, an assistant research professor at the University of Arizona who is also researching holographic video. "It's a huge achievement."
The capacity of holographic-video displays is measured according to something called the space-bandwidth (SB) product, Blanche explains. "The SB product is the product between the number of pixels and their spatial frequency, the inverse of their size," he says. "So we are looking for a large number of very small pixels. To have a large number of pixels is not really helpful if they are large."
"Already with the acousto-optic modulator that Professor Benton used in the past, you had a good space-bandwidth product, but the pixels were large and they couldn't fit a lot together," Blanche says. "[Smalley and Bove] improve the space-bandwidth product by [a] factor of 500, which is really, wow."
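Blanche's definition can be written out as a one-line calculation. The sketch below is only an illustration of that definition: the function name and the pixel counts and sizes are hypothetical, chosen to show why shrinking the pixels matters, and are not measurements of the MIT or Arizona displays.

```python
def space_bandwidth_product(num_pixels: int, pixel_size_m: float) -> float:
    """Number of pixels times their spatial frequency (the inverse of their size)."""
    if pixel_size_m <= 0:
        raise ValueError("pixel size must be positive")
    spatial_frequency = 1.0 / pixel_size_m  # cycles per meter
    return num_pixels * spatial_frequency


# Hypothetical values for illustration only (not taken from either display).
large_pixels = space_bandwidth_product(num_pixels=1_000_000, pixel_size_m=10e-6)
small_pixels = space_bandwidth_product(num_pixels=1_000_000, pixel_size_m=1e-6)

# Same pixel count, but pixels ten times smaller give ten times the SB product.
ratio = small_pixels / large_pixels
print(f"SB product ratio: {ratio:.1f}")
```

With the pixel count held fixed, the product scales with the inverse of the pixel size, which is why a large number of large pixels "is not really helpful."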
Blanche says that the experimental holographic-video system that he and Nasser Peyghambarian, chair of photonics and lasers at Arizona, are developing has some advantages over the MIT system. "Our images still have better quality," he says. "But they achieve video rate and we haven't. This is a very exciting time for us."
Multiview 3-D photography made simple
Computational photography is the use of clever light-gathering tricks and sophisticated algorithms to extract more information from the visual environment than traditional cameras can.
The first commercial application of computational photography is the so-called light-field camera, which can measure not only the intensity of incoming light but also its angle. That information can be used to produce multiperspective 3-D images, or to refocus a shot even after it's been captured.
Existing light-field cameras, however, trade a good deal of resolution for that extra angle information: A camera with a 20-megapixel sensor, for instance, will yield a refocused image of only one megapixel.
Researchers in the Camera Culture Group at MIT's Media Lab aim to change that with a system they're calling Focii. At this summer's Siggraph, the major computer graphics conference, they'll present a paper demonstrating that Focii can produce a full, 20-megapixel multiperspective 3-D image from a single exposure of a 20-megapixel sensor.
Because a light-field camera captures information about not only the intensity of light rays but also their angle of arrival, the images it produces can be refocused later. (Photos: Kshitij Marwah)
Moreover, while a commercial light-field camera is a $400 piece of hardware, Focii relies only on a small rectangle of plastic film, printed with a unique checkerboard pattern, that can be inserted beneath the lens of an ordinary digital single-lens-reflex camera. Software does the rest.
Gordon Wetzstein, a postdoc at the Media Lab and one of the paper's co-authors, says that the new work complements the Camera Culture Group's ongoing research on glasses-free 3-D displays. "Generating live-action content for these types of displays is very difficult," Wetzstein says. "The future vision would be to have a completely integrated pipeline from live-action shooting to editing to display. We're developing core technologies for that pipeline."
In 2007, Ramesh Raskar, the NEC Career Development Associate Professor of Media Arts and Sciences and head of the Camera Culture Group, and colleagues at Mitsubishi Electric Research showed that a plastic film with a pattern printed on it (a "mask") and some algorithmic wizardry could produce a light-field camera whose resolution matched that of cameras that used arrays of tiny lenses, the approach adopted in today's commercial devices. "It has taken almost six years now to show that we can actually do significantly better in resolution, not just equal," Raskar says.
Focii represents a light field as a grid of square patches; each patch, in turn, consists of a five-by-five grid of blocks. Each block represents a different perspective on a 121-pixel patch of the light field, so Focii captures 25 perspectives in all. (A conventional 3-D system, such as those used to produce 3-D movies, captures only two perspectives; with multiperspective systems, a change in viewing angle reveals new features of an object, as it does in real life.)
The key to the system is a novel way to represent the grid of patches corresponding to any given light field. In particular, Focii describes each patch as the weighted sum of a number of reference patches, or "atoms," stored in a dictionary of roughly 5,000 patches. So instead of describing the upper left corner of a light field by specifying the individual values of all 121 pixels in each of 25 blocks, Focii simply describes it as some weighted combination of, say, atoms 796, 23 and 4,231.
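The weighted-sum idea can be sketched in a few lines of Python. This is a toy illustration, not the researchers' code: the dictionary here holds 50 random placeholder atoms rather than roughly 5,000 learned ones, and the atom indices and weights are hypothetical. Only the patch dimensions (25 blocks of 121 pixels) come from the description above.

```python
import random
from typing import Dict, List

PIXELS_PER_BLOCK = 121   # pixels in one block (from the article)
BLOCKS_PER_PATCH = 25    # five-by-five grid of perspectives (from the article)
PATCH_LEN = PIXELS_PER_BLOCK * BLOCKS_PER_PATCH  # 3,025 values per patch
NUM_ATOMS = 50           # assumption: small stand-in for the ~5,000-atom dictionary

rng = random.Random(0)
# Placeholder dictionary of random atoms; the real atoms are learned from light fields.
dictionary: List[List[float]] = [
    [rng.uniform(-1.0, 1.0) for _ in range(PATCH_LEN)] for _ in range(NUM_ATOMS)
]


def reconstruct(weights: Dict[int, float]) -> List[float]:
    """Rebuild a patch as the weighted sum of the listed dictionary atoms."""
    patch = [0.0] * PATCH_LEN
    for atom_index, weight in weights.items():
        atom = dictionary[atom_index]
        for i in range(PATCH_LEN):
            patch[i] += weight * atom[i]
    return patch


# Hypothetical sparse description: three atom indices with their weights.
sparse_code = {7: 0.5, 23: -1.2, 41: 0.8}
patch = reconstruct(sparse_code)
print(f"{len(sparse_code)} weights describe a patch of {len(patch)} values")
```

The point of the representation is the size difference: three index-weight pairs stand in for 3,025 individual pixel values.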
According to Kshitij Marwah, a graduate student in the Camera Culture Group and lead author on the new paper, the best way to understand the dictionary of atoms is through the analogy of the Fourier transform, a widely used technique for decomposing a signal into its constituent frequencies.
In fact, visual images can be, and frequently are, interpreted as signals and represented as sums of frequencies. In such cases, the different frequencies can also be represented as atoms in a dictionary. Each atom simply consists of alternating bars of light and dark, with the distance between the bars representing frequency.
The atoms in the Camera Culture Group researchers' dictionary are similar but much more complex. Each atom is itself a five-by-five grid of 121-pixel blocks. Each block consists of arbitrary-seeming combinations of color: The blocks in one atom might all be green in the upper left corner and red in the lower right, with lines at slightly different angles separating the regions of color; the blocks of another atom might all feature slightly different-size blobs of yellow invading a region of blue.
Behind the mask
In building their dictionary of atoms, the Camera Culture Group researchers (Marwah, Wetzstein, Raskar, and Yosuke Bando, a visiting scientist at the Media Lab) had two tools at their disposal that Joseph Fourier, working in the early 19th century, lacked: computers, and lots of real-world examples of light fields.
To build their dictionary, they turned a computer loose to try out lots of different combinations of colored blobs and determine which, empirically, enabled the most efficient representation of actual light fields.
Once the dictionary was built, however, they still had to calculate the optimal design of the mask they use to record light-field information: the patterned plastic film that they slip beneath the camera lens. Bando explains the principle behind mask design using, again, the analogy of the Fourier transform.
"If a mask has a particular frequency in the vertical direction" (say, a regular pattern of light and dark bars), "you only capture that frequency component of the image," Bando says. "So you have no way of recovering the other frequencies. If you use frequency domain reconstruction, the mask should contain every frequency in a systematic manner."
"Think of atoms as the new frequency," Marwah says. "In our case, we need a mask pattern that can effectively cover as many atoms as possible."
"It's cool work," says Kari Pulli, senior director of research at graphics-chip company Nvidia. "Especially the idea that you can take video at fairly high resolution, that's kind of exciting."
Pulli points out, however, that assembling an image from the information captured by the mask is currently computationally intensive. Moreover, he says, the examples of light fields used to assemble the dictionary may have omitted some types of features common in the real world. "There's still work to be done for this to be actually something that consumers would embrace," Pulli says.
Diversifying your online world
The Internet promises a seemingly frictionless way of connecting individuals from around the globe. But in reality, that's not what happens online: Instead, we clump together with people similar to ourselves, and have those affinities reinforced by tools that guide us to other people or products that resemble those we already know.
Perhaps we can change that, though, and better incorporate new, international perspectives and knowledge into our everyday lives. That, at least, is the thesis of "Rewire: Digital Cosmopolitans in the Age of Connection," a new book by MIT's Ethan Zuckerman, published this month by W.W. Norton.
"There was this early promise on the Internet that no one cares if you're coming from Japan or Jordan or Jamaica, as long as you have something to add to the conversation," says Zuckerman, director of MIT's Center for Civic Media and principal research scientist at the MIT Media Lab. "But it seemed to me that we've been getting narrower and narrower views of the world [online]. I wasn't even getting the perspective I'd get from a good newspaper."
As Zuckerman details in the book, this is not just his impression. Many studies have shown that social, political and cultural filtering occurs routinely on the Internet, not to mention filtering by gender and language.
Zuckerman's aim, in the book and in his research group at MIT, is to encourage researchers to build tools that prompt people to explore the world, engage others and move beyond their normal social circuits.
"We're still well below what a really level world would look like," Zuckerman says. "The good news is, we can still get it right. There is time to jump in and try to make it better."
Ghana, gone from the news
Zuckerman's interest in diversifying online culture arose, in part, from his experiences working at a nonprofit organization in Ghana about a dozen years ago.
"Ghana had a remarkable election in 2000, free and fair," Zuckerman says, referring to the country's first-ever democratic transfer of power. "From the perspective of people following Africa, we thought this was amazing news, people should be celebrating. But no one really noticed in the U.S., [apart from] The New York Times. I got very interested in what we do, and don't, pay attention to."
Among other efforts, Zuckerman co-founded the Global Voices project, an online citizen-media site relaying news and information from around the world.
"We're not just filtering politically and culturally, we're filtering on a national basis," Zuckerman says.
To be sure, one might ask: Why is having a global perspective desirable? Zuckerman offers a few answers, including basic civic engagement at a time of, for instance, globalized supply chains that undergird the products we buy.
"If we're going to depend on stuff built by people from all over the globe, there's a point at which we might have to pay attention to the issues and politics," Zuckerman says. "Suddenly people have a lot of questions about buying clothes [made] in Bangladesh. These are the sorts of issues that make you realize that if you don't have more of a global perspective, you're missing opportunities to improve things, you're not anticipating dangers."
Another point Zuckerman emphasizes is that cognitive diversity is useful for both creativity and problem-solving, and that kind of diversity is more readily available to people who step outside their cognitive comfort zones.
"Historically a great deal of creative thought has come from engagement with people in other cultures," Zuckerman says; his book cites examples from music, politics and corporate life.
The initial response to "Rewire" has been positive; a review in Bookforum called it a "patient and thoughtful" assessment of the Internet's realities and potential.
What is to be done?
But if we're missing an opportunity to become better global citizens, how can we change that? The current approach of Zuckerman and his graduate students is to create online tools that nudge Internet users toward new perspectives.
"Do you really want to use Facebook to help you track down every elementary school friend you ever had?" Zuckerman asks. "Or can we push you in new directions and introduce you to, say, people from other parts of the world who have things in common with you?"
One tool Zuckerman's group is working on does this with Twitter, by analyzing the composition of the feeds people follow, and then recommending more feeds, mostly ones only slightly similar to your own user profile.
"A conventional recommendation system would say, 'Let me find the people who recommend the same things, whatever they found that you didn't find, you're going to love,'" Zuckerman says. "What you probably want to do is build a recommendation system that's about 30 degrees different. If you're a secular liberal and you get links to a religious conservative, there's a pretty good chance your response will be, 'Why do I want to pay attention to this?' But handing me links from a religious progressive might push me in an interesting direction."
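One way to picture a recommender that is "about 30 degrees different" is to take the figure literally: represent each feed as a vector of interests and rank candidates by how close their angle to the user's vector is to 30 degrees. The sketch below does that. It is a hypothetical reading of Zuckerman's metaphor, not a description of his group's tool, and the two-dimensional vectors and feed names are invented for illustration.

```python
import math
from typing import Dict, List, Sequence


def angle_degrees(a: Sequence[float], b: Sequence[float]) -> float:
    """Angle between two nonzero vectors, in degrees."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # clamp rounding error
    return math.degrees(math.acos(cosine))


def recommend(user: Sequence[float],
              candidates: Dict[str, Sequence[float]],
              target_deg: float = 30.0) -> List[str]:
    """Rank candidates by how close their angle to the user is to target_deg."""
    return sorted(
        candidates,
        key=lambda name: abs(angle_degrees(user, candidates[name]) - target_deg),
    )


# Hypothetical interest vectors (invented for illustration).
user_profile = [1.0, 0.0]
feeds = {
    "same_view": [1.0, 0.0],                                          # 0 degrees away
    "adjacent_view": [math.cos(math.radians(30)), math.sin(math.radians(30))],
    "opposite_view": [-1.0, 0.0],                                     # 180 degrees away
}
ranked = recommend(user_profile, feeds)
print(ranked)
```

A conventional similarity-based recommender would put "same_view" first; targeting a moderate angle instead favors the feed that overlaps with the user's interests without duplicating them.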
Other tools, Zuckerman suggests, will help identify key links in social networks that might diversify one's contacts. Many social networking sites presume that weak ties (people we don't know particularly well) can be highly valuable in areas like job searches. But Zuckerman believes that certain people who constitute a set of unique connections for us, and so represent "bridge" ties, are the most valuable of all. Identifying and emphasizing these "bridge" people for all of us, Zuckerman thinks, could improve social networking sites.
In this view, being a "digital cosmopolitan" is everyone's responsibility, and software engineers, among others, should keep thinking about ways to encourage that practice.
"I'm hoping the book will inspire other people to start building this stuff," Zuckerman says.
Automated "coach" could help with social interactions
Social phobias affect about 15 million adults in the United States, according to the National Institute of Mental Health, and surveys show that public speaking is high on the list of such phobias. For some people, these fears of social situations can be especially acute: For example, individuals with Asperger's syndrome often have difficulty making eye contact and reacting appropriately to social cues. But with appropriate training, such difficulties can often be overcome.
Now, new software developed at MIT can be used to help people practice their interpersonal skills until they feel more comfortable with situations such as a job interview or a first date. The software, called MACH (short for My Automated Conversation coacH), uses a computer-generated onscreen face, along with facial, speech, and behavior analysis and synthesis software, to simulate face-to-face conversations. It then provides users with feedback on their interactions.
The research was led by MIT Media Lab doctoral student M. Ehsan Hoque, who says the work could be helpful to a wide range of people. A paper documenting the software's development and testing has been accepted for presentation at the 2013 International Joint Conference on Pervasive and Ubiquitous Computing, known as UbiComp, to be held in September.
"Interpersonal skills are the key to being successful at work and at home," Hoque says. "How we appear and how we convey our feelings to others define us. But there isn't much help out there to improve on that segment of interaction."
Many people with social phobias, Hoque says, want "the possibility of having some kind of automated system so that they can practice social interactions in their own environment. ... They desire to control the pace of the interaction, practice as many times as they wish, and own their data."
The MACH software offers all those features, Hoque says. In fact, in randomized tests with 90 MIT juniors who volunteered for the research, the software showed its value.
First, the test subjects, all of whom were native speakers of English, were randomly divided into three groups. Each group participated in two simulated job interviews, a week apart, with MIT career counselors.
But between the two interviews, unbeknownst to the counselors, the students received help: One group watched videos of interview advice, while a second group had a practice session with the MACH simulated interviewer, but received no feedback other than a video of their own performance. Finally, a third group used MACH and then saw videos of themselves accompanied by an analysis of such measures as how much they smiled, how well they maintained eye contact, how well they modulated their voices, and how often they used filler words such as "like," "basically" and "umm."
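Of the measures listed, filler-word frequency is the simplest to illustrate. The sketch below counts the three filler words named above in a written transcript. It is a toy example under stated assumptions, not MACH's method: MACH works from recorded speech, whereas this assumes a transcript already exists, and the function name and sample sentence are invented.

```python
import re
from collections import Counter
from typing import Dict

# The three filler words named in the article.
FILLERS = ("like", "basically", "umm")


def count_fillers(transcript: str) -> Dict[str, int]:
    """Count occurrences of each filler word in a transcript, ignoring case."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(word for word in words if word in FILLERS)
    return {filler: counts[filler] for filler in FILLERS}


# Invented sample sentence for illustration.
sample = "Umm, I basically, like, led the project, and, like, it shipped."
result = count_fillers(sample)
print(result)
```

Note that a plain word count like this also flags legitimate uses of "like" (as in "I like teamwork"), so a real system would need more context than this sketch uses.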
Evaluations by another group of career counselors showed statistically significant improvement by members of the third group on measures including "appears excited about the job," "overall performance," and "would you recommend hiring this person?" In all of these categories, by comparison, there was no significant change for the other two groups.
The software behind these improvements was developed over two years as part of Hoque's doctoral thesis work with help from his advisor, professor of media arts and sciences Rosalind Picard, as well as Matthieu Courgeon and Jean-Claude Martin from LIMSI-CNRS in France, Bilge Mutlu from the University of Wisconsin, and MIT undergraduate Sumit Gogia.
Designed to run on an ordinary laptop, the system uses the computer's webcam to monitor a user's facial expressions and movements, and its microphone to capture the subject's speech. The MACH system then analyzes the user's smiles, head gestures, speech volume and speed, and use of filler words, among other things. The automated interviewer (a life-size, three-dimensional simulated face) can smile and nod in response to the subject's speech and motions, ask questions and give responses.
"While it may seem odd to use computers to teach us how to better talk to people, such software plays an important [role] in more comprehensive programs for teaching social skills [and] may eventually play an essential step in developing key interpersonal skills," says Jonathan Gratch, a research associate professor of computer science and psychology at the University of Southern California who was not involved in this research. "Such programs also offer important advantages over the human role-players often used to teach such skills. They can faithfully embody a specific theory of pedagogy, and thus can be more consistent than human role-players."
One reason the automated system's feedback is effective, Hoque believes, is precisely because it's not human: "It's easier to tell the brutal truth through the [software]," he says, "because it's objective."
While this initial implementation was focused on helping job candidates, Hoque says training with the software could be helpful in many kinds of social interactions.
After finishing his doctorate in media arts and sciences this summer, Hoque will become an assistant professor of computer science at the University of Rochester in the fall.
President Ollanta Humala of Peru visits MIT
Ollanta Humala, the president of Peru, visited the MIT campus on Wednesday, meeting with MIT President L. Rafael Reif, faculty members and students. Humala was accompanied by a delegation that included Peruvian ministers of education, defense, foreign relations, and foreign commerce and tourism, as well as the U.S. ambassador to Peru.
The visit to MIT rounded out a three-day tour for Humala that also included meetings with President Barack Obama in Washington and Massachusetts Gov. Deval Patrick in Boston. The visit was Humala's first official trip to the United States since his election as Peru's president in 2011.
During the 90-minute visit, held at the Media Lab, Reif warmly welcomed Humala, speaking to the delegation in his native Spanish.
"We have followed your commitment to science and technology," Reif told Humala. "Hearing that another country is like-minded, this is music to my ears."
"Working on the world's biggest problems entails working with others," Reif added. "We would like to seize this opportunity with Peru."
Representatives of Peru and MIT then assembled for a ceremony in which a letter of intent was signed by María Gisella Orjeda Fernández, president of the Science, Technology and Technological Innovation National Council of Peru (CONCYTEC), and MIT Vice President Claude Canizares, the Bruno B. Rossi Distinguished Professor of Experimental Physics. The letter establishes a "mutually beneficial collaboration" in the areas of education and research.
"We want to bet on education"
"This is an issue that will change our countries," Humala observed after the signing. "Peru has bet on gold, and it has bet on oil. ... Today, we want to bet on education."
Humala stressed the need for more educational opportunities for Peruvian students as a means of addressing poverty in his nation. "It is the moral obligation of any government to provide opportunities to our youth," Humala said through a translator. "We want to bequeath [opportunities] to our youth."
For Latin Americans, Humala observed, learning a second language is essential to educational advancement. Compared to many other nations, "We have a language barrier, because when we cross borders, we still speak Spanish."
Several faculty members gave Humala brief presentations on their research. Daniela Rus, professor of computer science and engineering and director of MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), spoke of a common interest among the lab's students, many of whom come from various backgrounds.
"At CSAIL, we all speak the language of computers, so there is no language barrier," Rus said. "Our lab is really a melting pot; there are many Spanish-speaking students. [Peruvian students] will feel very comfortable here."
Tyler Jacks, the David H. Koch Professor of Biology and director of MIT's Koch Institute for Integrative Cancer Research, spoke of his lab's work to improve cancer treatment. Among other projects, Jacks and his colleagues are developing new methods to target drugs directly to cancer cells, in an attempt to avoid unwanted side effects. Humala, in turn, spoke of the research efforts by Peru's national cancer institute, adding, "We would like to extend a bridge to you."
President meets Peruvians at MIT
Barton Zwiebach, professor of physics, spoke of his work in superstring theory. "Every particle is an infinitesimal string, vibrating," he explained to the delegation.
Zwiebach, who was born and raised in Peru, obtained a degree in electrical engineering from Peru's Universidad Nacional de Ingeniería. "Every time I'm identified as Peruvian, I speak with pride," Zwiebach said.
During his visit, Humala also met briefly with Peruvian students and alumni at MIT, offering them congratulations.
"He's very proud of us because we are here," said Sandra Torres, who last week received an MBA from the MIT Sloan School of Management. "There are many projects in my country, so it would be a great opportunity to collaborate. There are many things that connect me with my country."
Claudio Di Leo, a graduate student in mechanical engineering, said Humala's visit is a promising step toward improving his native country's economy.
"I think the way to bring the country forward is to educate the people and raise the level of education, and bringing people to MIT is a great way to do that," Di Leo said. "I think it would be interesting to go back and bring some of these ideas of technology and learning back to our country."
Sampling some MIT innovations firsthand, Humala and his delegation toured the Media Lab, and explored the Tangible Media Group. There, Hiroshi Ishii, the Jerome B. Wiesner Professor of Media Arts and Sciences, demonstrated a few interactive projects, including a display of shape-shifting "digital" sand.
To commemorate the visit, Humala presented Reif with a silver frame, a token of Peru's long history in silverwork. In return, Reif presented Humala with a gift of two books: a signed copy of Institute Professor and Professor of Linguistics Emeritus Noam Chomsky's "Interventions," in Spanish, and "Countless Connecting Threads: MIT's History Revealed through Its Most Evocative Objects."
The visit was coordinated by MIT's Global Initiatives.