a PDF of this document can be downloaded for printing

Art as research is largely undeveloped and ultimately the most crucial to culture. The implications of scientific and technological research are so far reaching in their effects on both the practical and the philosophical planes, that it is an error to conceive of them as narrow technical enterprises. The full flowering of research requires a much wider participation in the definition of research agendas and in the pursuit of research questions than is provided from those in technical fields alone. It needs the benefit of the perspectives from many disciplines including the humanities and the arts, not just in commentary but in actual research.

adapted from: artificial intelligence research as art
by Stephen Wilson, Professor, Conceptual Information Arts, SFSU
[ The full document can be viewed here ]

 

Dream Illustrator


INTRODUCTION


"What did you dream about last night?"
"I can't really remember it. You were in it. We were going somewhere. It was a strange dream,
I wish I could remember it."

What are dreams? Fragments of memories, distorted by unresolved emotion... The subconscious sorting out what gets stored in memory, and what gets dumped... An alternate realm of existence... A form of subconscious communication, or a random assortment of chemicals and electrical impulses...

It is clear there is a great deal of mystery in understanding the nature of dreams. Humans have pondered dreams since the first dream of the first human.

Egyptians believed dreams held power and were often prophetic. Egyptians developed and practiced "dream incubation", a process where one would sleep in the temple, and the next morning a "priest" would interpret their dream. Greeks and Babylonians also believed dreams carried divine messages. In the 5th century BC, Greek philosophers began to suggest dreams were merely created in the mind of the dreamer. Aristotle believed that dreams were a recollection of the day's events, analogous to viewing forms reflected in water. He also suggested that a doctor could diagnose a person's illness by interpreting a dream they had. Hippocrates, often called the father of medicine, supported this theory. Galen of Pergamum, a Greco-Roman physician, documented the success of this practice.

Christianity revived the idea that dreams were in the realm of the supernatural. The Bible has numerous stories about prophetic dreams, the interpretations of which changed the course of the lives of the dreamers. Mohammed received much of the text of the Koran from a dream. As well, he interpreted the dreams of his disciples. Martin Luther, the founder of Protestantism, explained the origin of bad dreams by stating sin was "the confederate and father of foul dreams."

In the early Middle East, the Zoroastrians followed set rules for the interpretation of dreams, specific for each day of the month on which the dream occurred. Gabdorrhachamn was an early Arabic dream interpreter who believed dreams were prophetic and could only be interpreted by a person with "a clean spirit, chaste morals, and the Word of Truth."

Alfred Maury, a French doctor, believed external stimuli to be the catalyst of all dreams. This developed into the modern attitude on dream interpretation. And of course there is Freud, whose theory was that although dreams may be prompted by external stimuli, they were a reflection of our deepest desires going back to childhood. For Freud, dreams were not for entertainment; they all held important meaning. Freud did not think we could accurately interpret our own dreams. Carl Jung took a different angle. He believed that dreams remind us of our wishes, which enables us to realize the things we unconsciously yearn for. Therefore dreams are messages to ourselves, from ourselves, to help us fulfill our own wishes. Jung believed dreams were meant to be understood, and we should pay attention to our dreams for our own benefit.

Today there are those who believe dreams are a meaningless fact of life, a biological process yielding insignificant results. However, there are others who believe that dreams carry great significance and have the potential to unlock mysteries about human existence and purpose.

"We all dream every night, but we forget about 90%. The dream life is just as real as the waking life. If we can't crack the code of our dreams,
we're really only half here."

–Tom Robbins on the importance of dreams

 

The year is 2081.

Jacob receives a gift from his wife and children. It is something he has wanted a long time. He is anxious to use it the first night, but he must program it first. It can take a few days to get it right. He sets the palm sized black cube near his bed and settles back comfortably. He reads aloud and looks at the included pictures as they appear in the text. Varied pulses of sound come from the cube. As he reacts, the tuned receptors of the Dream Illustrator map the locations and intensities of the electroencephalogram impulses coming from the speech, image, and auditory processing centers of Jacob's brain.

He goes to sleep.

The device maps and records impulses in the brain, and the body's muscular tissue. More is captured during nights with greater EEG activity during REM sleep cycles. For the first few days fragmented images are played back. As Jacob recognizes symbols and images the Dream Illustrator cube records a positive match in the brain map. With more use the device adapts, and increases accuracy.

The Dream Illustrator translates his brain waves to images and sends them to a full-spectrum holographic motion picture generator, common in most homes at the time. The projections fill the space of a room, allowing the viewer to move around and in the images created, viewing them from any angle. Over time the dream recorder creates an environment nearly as perfect as in the mind of the dreamer. Jacob's entire dream can be seen, heard, rotated, repeated, slowed down and paused by voice command.

What will we discover about ourselves by making tangible the world of dreams? We will discover what synchronistic purpose links all humans through our dreams.

The rendering of unconscious dreaming will be a new medium for us to explore as artists.
The artist makes tangible the intangible. The artist interprets the surrounding world, changes it, and gives it back with meaning. We reveal the elements of human life (emotion and imagination) by touching and tasting the world around us and transforming that experience into a tangible thought. We will all be able to explore the interpretation of life through our subconscious.

Surrealism as a literary and art movement was started in France in the 1920s. The essence of surrealism as a form of expression started much earlier. Simply put, surrealism attempts to express the workings of the subconscious. The real show is on the inside. To raise to the surface all that the soul is in the habit of keeping concealed is surrealism. It is the soul's expression. The inside is turned outside. Spiritual lucidity.

Jacob has something in common with all humans—we all feel a need to find our soul, and give it purpose.

A personal introspection of the subconscious is a gateway to the truth. A collective observation of dreams might influence the course of future actions. We will inspire new ideas and share amazing visions from our subconscious imaginations that will change the course of mankind.


How did we get there?

 

Speech Illustrator


The year is 2067.

A father reads his daughter a bedtime story. A classic piece of literature from the 20th century, ink words on yellow brittle paper. In the space above the foot of her bed a lifelike holographic image appears. The characters from the book act out the adventure as the words of the book are read.

After the book is finished, the father is reminded of a similar adventure from his childhood. As he tells his story, full of new embellishments, the girl sees her father at a young age moving through the scenes described in his story.

In a meeting with her new client, a landscape designer describes her vision for the custom home that is being built. In the space above a conference table, a 3-dimensional projected image of the client's building plans is seen. The designer begins to describe a landscape. The trees, foliage, and sculpted earth formations take shape as they are spoken. The client reacts. Alterations are made in real time. The session is recorded and the results are integrated into the architect's virtual model.


How did we get there?

 

In the year 2043 a doctor from Bulgaria travels to a remote village east of Kanye, Botswana to teach new medical procedures. He brings a small device that he sets near the outdated flat video monitor in the village hospital. As he speaks Bulgarian into the device his words are translated into audible Setswana. The procedures he describes are illustrated on the monitor as his words are spoken. The local doctors then ask questions, and offer alternative suggestions, their speech translated to Bulgarian, and their ideas illustrated as they are spoken.

The Bulgarian doctor leaves the device behind. Later in an international telephone conference they are able to share ideas and new techniques while their speech is translated and illustrated, both seeing the same images in their respective locations.

In New Zealand, a stunt coordinator for filmmaking works out an action sequence in his studio by verbally describing the scene. The computer program illustrates his description in high-resolution detailed computer animation. He saves the sequence. He presents the sequence to the director and cast, viewing the action sequence from any angle with the ability to zoom in and slow down critical moves. Each actor gets a copy to study from.


How did we get there?

 

In the year 2028 a 7-year-old in Beijing picks a tablet-sized object off a shelf in his English class. The screen lights up and asks, "What is your name?" The child answers and the tablet begins with a quick review of the previous lesson. It asks, "Where is the bear?" The child replies, "The bear is in the river." The tablet replies, "Yes. That is correct. The bear is in the river."
The child wonders aloud "What does a bear sound like?" The tablet replies with a growl.

A high school student in Alaska receives in the mail her new tablet encyclopedia. Because of her father's job, her family moves often, and sometimes to towns without a school. The screen lights up and she asks it a question. The tablet displays several topics. She begins with a lesson in art. The student asks a question regarding the history of Picasso. The tablet offers to show a video. After viewing, the student tells the tablet to show more about Jacqueline. Several paintings are returned. The tablet conducts tests and grades the student. Missed answers are worked into the next lessons. When the tablet is brought near a viable wireless internet connection it begins to update and expand its stored information and backs up the student's records to a central database.


How did we get there?

 

In 2007 a research project was launched at the University of Washington to develop Speech Illustrator, a device that turned human speech into images. The rough prototype was a program that was built to run on almost any computer equipped with a microphone.

It used speech recognition software to turn each individual word of speech into computer code. The code was linked to a database of images, a picture dictionary. As the word "Ocelot" was spoken, an image of the nocturnal feline would appear on the screen.

 

BASICS OF OPERATION


A spoken word enters the computer with Speech Recognition software. The word links to a specific value in the program that retrieves an image. The image appears on the video screen.
Sequences can be recorded by storing them as text documents. Image compositions can be played back from these text documents. Image compositions may also be created by keying in words, or translating previously existing written documents such as poetry.
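
The lookup and playback described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's software: the `PICTURE_DICTIONARY` entries, the file names, and the `illustrate` function are hypothetical, and the output of the speech recognizer is assumed to arrive as plain text.

```python
import re

# Hypothetical picture dictionary: each known word maps to one image file.
PICTURE_DICTIONARY = {
    "ocelot": "images/ocelot.jpg",
    "forest": "images/forest.jpg",
    "backpack": "images/backpack.jpg",
}

def illustrate(text):
    """Return (word, image) pairs for every word found in the dictionary."""
    words = re.findall(r"[a-z']+", text.lower())
    return [(w, PICTURE_DICTIONARY[w]) for w in words if w in PICTURE_DICTIONARY]

# A recorded sequence is just text, so playback is a second lookup pass
# over the stored words. Typed text or an existing poem works the same way.
recorded = "An ocelot walks through the forest."
for word, image in illustrate(recorded):
    print(word, "->", image)
```

Because the recording is ordinary text, the same function serves live speech, keyed-in words, and previously written documents.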

Nouns will be the first words to be illustrated. They are broken down into categories. The noun's category is determined by where the image is most likely to appear in the depth of field of a picture. Words like "forest" or "city" are assigned the value of "scenic". They will appear in the background of the image. Single object nouns are placed in the foreground.
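
One way to picture the category scheme is as a small table that decides drawing order. The sketch below assumes a hypothetical `NOUN_CATEGORY` table and a default of "object" for any noun not marked "scenic"; neither the names nor the default comes from the project itself.

```python
# Hypothetical category table: "scenic" nouns belong in the background,
# and any noun not listed is treated as a single foreground object.
NOUN_CATEGORY = {"forest": "scenic", "city": "scenic"}
LAYER_ORDER = {"scenic": 0, "object": 1}  # lower numbers are drawn first

def category_of(noun):
    return NOUN_CATEGORY.get(noun, "object")

def draw_order(nouns):
    """Sort nouns so background images are drawn before foreground ones."""
    return sorted(nouns, key=lambda n: LAYER_ORDER[category_of(n)])

print(draw_order(["ocelot", "forest", "backpack"]))
```

Python's sort is stable, so foreground objects keep the order in which they were spoken.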

After a basic database of nouns is established, simple adjectives will be defined. Adjectives will modify which version of the noun is selected. For example, the image for "backpack" can be modified by saying "small blue backpack" or "large green backpack". Most adjectives added to the lexicon will require new images to be added to the image library. Example: "orangutan" is different from "old orangutan" and is different from "happy old orangutan".
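
A rough sketch of adjective-driven selection follows. The `IMAGE_LIBRARY` keys, the file names, and the rule that the last word of a phrase is the noun are all assumptions made for illustration. The fallback to the plain noun is also an assumption; it shows what might happen before a new variant image has been added to the library.

```python
# Hypothetical image library keyed by (set of adjectives, noun).
IMAGE_LIBRARY = {
    (frozenset(), "backpack"): "backpack.jpg",
    (frozenset({"small", "blue"}), "backpack"): "backpack_small_blue.jpg",
    (frozenset({"large", "green"}), "backpack"): "backpack_large_green.jpg",
    (frozenset(), "orangutan"): "orangutan.jpg",
    (frozenset({"old"}), "orangutan"): "orangutan_old.jpg",
}

def select_image(phrase):
    """Treat the last word as the noun and the words before it as adjectives."""
    *adjectives, noun = phrase.lower().split()
    exact = IMAGE_LIBRARY.get((frozenset(adjectives), noun))
    # Fall back to the unmodified noun when no matching variant exists yet.
    return exact or IMAGE_LIBRARY.get((frozenset(), noun))

print(select_image("small blue backpack"))
print(select_image("happy old orangutan"))
```

Here "happy old orangutan" falls back to the plain orangutan because no image has been defined for that combination, which is why each new adjective tends to demand new artwork.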

Prepositions will link to vector coordinates in relationship to the nouns. Their usage can affect positions on an x, y axis, or perceived depth by layers and size adjustment. "The elephant is behind the tree. A bird is above the elephant." Eventually images may also be arranged by using touch sensitive screens.
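
The elephant, tree, and bird example can be worked through with a small offset table. This is a sketch under assumptions: the `PREPOSITION_OFFSET` values, the `(x, y, layer)` convention, and the `place` function are hypothetical, and parsing the sentence into subject, preposition, and reference is left out.

```python
# Hypothetical offsets (dx, dy, dlayer) relative to the reference noun.
# y grows upward; a lower layer number is drawn farther back.
PREPOSITION_OFFSET = {
    "above": (0, 1, 0),
    "below": (0, -1, 0),
    "behind": (0, 0, -1),
    "in front of": (0, 0, 1),
}

def place(scene, subject, preposition, reference):
    """Position `subject` relative to `reference`, which is already placed."""
    x, y, layer = scene[reference]
    dx, dy, dl = PREPOSITION_OFFSET[preposition]
    scene[subject] = (x + dx, y + dy, layer + dl)
    return scene

scene = {"tree": (0, 0, 0)}
place(scene, "elephant", "behind", "tree")   # "The elephant is behind the tree."
place(scene, "bird", "above", "elephant")    # "A bird is above the elephant."
print(scene)
```

A renderer could then scale images down on lower layers to suggest depth, as the paragraph above proposes.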

Verbs will also influence how nouns are displayed. Verbs will eventually trigger motion graphics.

New picture definition libraries will be available for upgrades as the vocabulary and ability of the device advances. Users will be able to define their own words with a drawing program, loading an image or a built-in camera. This will be especially useful with names.

 

RESEARCH & DEVELOPMENT CHALLENGES


There must eventually be an enormous catalog of images to draw upon. This will take a lot of time to collect, categorize, and label for recall use.

Syntax, inflection, and tone can greatly alter the meaning of a group of words. Human brains have the gift of abstract thought to derive meaning. Computers, so far, do not. Words can have multiple meanings depending on their arrangement with surrounding words. This could present a difficult programming hurdle. Fortunately there is a great head-start with WordNet® (see below). Artificial intelligence improvements will also play a role in syntax comprehension.

 

EXISTING TECHNOLOGY & CURRENT RESEARCH


The following technologies can be integrated to form the Speech Illustrator.

Dragon NaturallySpeaking®
Software by ScanSoft. Brings Speech-to-Text ability to PC format computers. The company claims up to 99% accuracy. It uses natural language commands and can process up to 160 words per minute. APIs for developers to "voice-enable" any workflow application.
[ read more about it here ]

ViaVoice®
Speech-recognition software from IBM for both Mac and PC operating systems allows for natural, continuous speech voice dictation. Currently available in 5 language options: German, Italian, Japanese, British English, American English. US Mac version can be purchased for $125, or downloaded for $80.
[ read more about it here ]

Giant Picture Dictionary
A database of images still in development. It will allow a user to type in virtually any word to instantly retrieve pictures or symbols for that word. Pictures and symbols will be mapped to and from multiple languages. The database can be increased by approved users. Development is being led by Wally Flint.
[ read more about it here ]
This style of program currently exists in very basic forms aimed at educating children or ESL students.

WordNet®
WordNet is a lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. It includes more than 42,000 links between morphologically related nouns and verbs, and it classifies synonym sets by category, region, usage, glossary, or new terminology. WordNet was developed by the Cognitive Science Laboratory at Princeton University.
[ read more about it here ]

Virtual Reality environments have been created that can be manipulated by voice, gesture and eye tracking. The Human Interface Technology Lab on the campus of the University of Washington is a research and development lab in virtual interface technology. HITL was established to transform virtual environment concepts and early research into practical, market-driven products and processes. HITL research strengths include interface hardware, virtual environments software, and human factors. The Lab hopes to develop a new generation of human-machine interfaces to provide solutions to challenges in a variety of domains.
[ read more about it here ]

Three-dimensional holographic video images will be generated by a computer. They will be shown in full-motion color and, with input from a user, changed on the fly. Viewers who move around a holographic video image will be able to see it moving from every side. Research and development is currently being done by the Spatial Imaging Group at MIT.
[ read more about it here ]

NEURALAB – A PC based brain wave analyzer capable of measuring neural efficiency (IQ). It measures the brain’s response time to both auditory and visual stimuli. The Neuralab provides industry recognized EEG strip chart printouts. This product is wholly owned by Advanced design & Development and is available for market exploitation.
[ read more about it here ]

 

FUTURE ADVANCES


Later versions of this device will incorporate libraries of sound and video clips that will work in conjunction with the image dictionary. Still images, sound, and video will come together to illustrate the scenes described.

Advances in artificial intelligence will allow Speech Illustrator to "adapt" to its user, allowing them to grow and learn together resulting in increased efficiency and accuracy when creating images.

Various user profiles can be made for one device. Speech Illustrator will recognize the user by their name being spoken, then verify the user by their unique voice spectrum. Custom databanks can be created for users who want to define their own graphics. The user will also be able to create a composite graphic with verbal commands, save the composite image, and create a vocal shortcut to recall that image as a single graphic in a new scene. Operations will become like many layout and design programs, allowing users to group, un-group, and move images after they are placed.
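
Saving a composite and recalling it as a single graphic might look like the sketch below. The `UserProfile` class, its method names, and the `(image, x, y)` placement format are hypothetical, and voice verification is not modeled.

```python
# Hypothetical per-user store of composites saved under a spoken shortcut.
class UserProfile:
    def __init__(self, name):
        self.name = name
        self.shortcuts = {}

    def save_composite(self, shortcut, placed_images):
        """Store a group of (image, x, y) placements under one shortcut."""
        self.shortcuts[shortcut] = list(placed_images)

    def recall(self, shortcut, dx=0, dy=0):
        """Return the group as a unit, shifted by (dx, dy) for a new scene."""
        return [(img, x + dx, y + dy) for img, x, y in self.shortcuts[shortcut]]

profile = UserProfile("example user")
profile.save_composite("campsite", [("tent.jpg", 0, 0), ("fire.jpg", 2, 0)])
print(profile.recall("campsite", dx=5, dy=1))
```

Keeping each user's shortcuts in a separate profile means two people can give the same spoken name to different composites on one device.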

Brain wave analyzing technology could be adapted to make maps of where image memories, sound memories, and vocabulary are stored in the brain of a user. When these maps get more accurate in the future, thoughts will turn into images. Think of a red ball, and see a red ball on the screen.

 

WHY


Humans create from ideas.

Speech has been a basic part of human communication since the earliest humans. Pictures, icons and images have been another form of language since the earliest humans. There are great speakers. There are great visual communicators. There are creative visionaries among them.

Spoken words create images in the minds of others. When telling an idea, you must rely on the accuracy of your own words and the visual imaginations of those hearing the idea. Most people do not have the talent or time to quickly draw or create an image to reinforce their speech. It is a challenge to be accurately understood when communicating one's vision of an idea.

In the long-term, a device that accurately connects images to spoken words, and allows a user to modify those images with natural speech, would open a floodgate of new and creative ideas that can be expressed and understood. The Speech Illustrator can increase the power of an individual's ability to communicate an idea.

 

FIRST FORM OF THE IDEA


Speech Illustrator as an art gallery installation

"Painting is silent poetry, and poetry is painting with the gift of speech."

– Simonides (556 BC - 468 BC)


A sectioned off room in a gallery contains a hidden video projector and a microphone on a stand in the center. When viewers speak into the microphone, any words spoken that exist in the Speech Illustrator's image bank appear on the projector screen. When possible, multiple pictures will be available for the same word. When a word with multiple image definitions is spoken, an image is chosen at random, thus reducing repeated images when words are repeated.
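
The random selection for the installation can be sketched as follows. The `IMAGE_BANK` contents and the `pick_image` function are hypothetical. The source only says an image is chosen at random; skipping the image shown last for the same word is an added assumption that makes back-to-back repeats impossible whenever a word has more than one picture.

```python
import random

# Hypothetical image bank: some words have several picture definitions.
IMAGE_BANK = {
    "love": ["love_photo.jpg", "love_collage.jpg", "love_sketch.jpg"],
    "ocelot": ["ocelot.jpg"],
}
_last_shown = {}

def pick_image(word, rng=random):
    """Pick an image at random, avoiding an immediate repeat when possible."""
    options = IMAGE_BANK.get(word)
    if not options:
        return None  # word is not in the image bank, so nothing is projected
    candidates = [o for o in options if o != _last_shown.get(word)] or options
    choice = rng.choice(candidates)
    _last_shown[word] = choice
    return choice

print(pick_image("love"), pick_image("love"), pick_image("ocelot"))
```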

Full-screen graphic images would be created for concrete nouns as well as more abstract words like "love" and a variety of adjectives such as "dangerous" or "exciting". Picture definitions can be abstract or literal, and be in the form of photography, illustration, collage or video.

Various artists could be summoned to take on the challenge of defining an assigned group of words. Images would be reviewed, and those that are accepted become part of the image dictionary.

This idea encourages a variety of individuals to explore the medium and would undoubtedly produce uniquely poetic results. Live storytelling amplifies the power of the images. The Surrealists said that when two things come together, a third thing is created. The surrealist movement started with poets. Words will take on a unique and temporary existence beyond the power of naked words. The images will elicit a different emotion than they do in silence.

This early version of Speech Illustrator will call upon the viewer/listener to interpret the relationship between the common meaning of words and the illustrated representation that appears on the screen. Because of this, the visual interpretation can in turn alter the meaning of the spoken word in ways unexpected by the speaker. The ability to connect meaning and influence meaning in both directions will be imperative to the function of the advanced Speech Illustrator. When the technology is developed to map and record the brain's activity with precision, cross-directional interpretation of meaning will be imperative to creating a useful Dream Illustrator.

 

BUDGET AND TIMELINE


The itemization and timeline are for the gallery installation of Speech Illustrator described above.

A dedicated computer to store the image definitions, run and test the program: $4,000

Software for image manipulation, programming, and speech recognition: $1,800

High resolution scanner for digitizing found imagery and created art: $1,000

Digital camera for capturing imagery: $1,500

Freelance computer programmer to write the software and adapt speech recognition: approximately 6 weeks at $1,500/week = $9,000

My personal time to define, research, record, and produce visuals for the image database, then design and build the gallery installation: 6 months at $4,000/month = $24,000

For gallery presentation of the project; video projector, microphone with stand, amplifier, speakers, translucent screen, and building materials to erect screen: $8,000 ($2,500 for each additional projector and screen used)

Advertising budget for the gallery installation: $3,000


Total estimated cost: $52,300
Total estimated time from project start to gallery unveiling: 6 months
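
The total can be checked by summing the line items. The short sketch below copies the figures from the list above; the dictionary keys and the `total_cost` function are labels chosen for illustration.

```python
# Line items copied from the budget above (US dollars).
BUDGET = {
    "computer": 4_000,
    "software": 1_800,
    "scanner": 1_000,
    "digital camera": 1_500,
    "programmer (6 weeks x $1,500)": 6 * 1_500,
    "artist's time (6 months x $4,000)": 6 * 4_000,
    "gallery presentation": 8_000,
    "advertising": 3_000,
}

def total_cost(extra_projectors=0):
    """Base total plus $2,500 for each additional projector and screen."""
    return sum(BUDGET.values()) + 2_500 * extra_projectors

print(total_cost())   # 52300
print(total_cost(2))  # 57300
```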


[Time and budget estimates for the Speech Illustrator portable tablet, with accurate picture generating ability, would be determined after creating the gallery installation. A successful gallery installation will help generate funding and support for undertaking the creation of the portable tablet, which could take years to develop.]

Any of the equipment listed above that is already owned by the school, or that can be borrowed or donated, will reduce the total budget requirement.

 

 
