Summary: This project is an exploration of what e-books, reading, and augmented media could be in the near future. I propose a type of augmented media that aims to add to, not change, the reading experience we already know. The goals of the project are not only to portray scenarios of use and possible features/interfaces, but also – by taking elements from design fiction – to speculate about what issues might arise around book soundtracks and how the reading experience could be affected.

Soundtrack to a Book is a project exploring what e-books, reading, and augmented media could be in the near future, and was partly inspired by a lecture by James Bridle, at the 2013 Creating Minds Conference run by the Berkeley Center for New Media.

Many augmented media projects, instead of enhancing the original experience, create a new one (for example, “Sensory Fiction,” a project by MIT Media Lab students). These proposed new experiences often meet criticism because people cannot see themselves ever using the object, or cannot imagine how it could be incorporated into their lives (“Why would you want to put on a harness every time you wanted to read?”). Furthermore, these concepts are often presented without any context, or in a sterile sci-fi environment completely unrelated to the environments we live in today.

I propose a type of augmented media that aims to add to, not change, the reading experience we already know – the soundtracks are meant to be ambient, not the main focus. A study by researchers at the University of Waterloo determined that when listening to audiobooks, the mind actually wanders more, retains less of the content, and is less interested in general. If, instead of replacing the experience of reading, audio supplemented it, maybe we could see the reverse effects: more focused attention, closer reading of texts, higher retention of content from multi-sensory information, and more interest and engagement.

The goals of the project are not only to portray scenarios of use and possible features/interfaces, but also – by taking elements from design fiction – speculate what issues might arise around book soundtracks, and how the reading experience could be affected.


I. Explanation of Features/Interface

II. 1st World Culture/Society in the Near Future – the questions that drove this project and the existing tech that influenced the fictional elements in the video.

III. Extensions – Ideas not represented in the video + if I were to continue this project…

IV. References – the books and sounds I used

V. Reflection – on criticism of Design Fiction

VI. Acknowledgments

VII. Further Readings/References

Explanation of Features/Interface

Auto Play

(1:10) – The device automatically plays sounds when your eyes pass over sound-related words. Potential scenario of use: it could allow for closer reading. If you started skimming unintentionally, a sound could notify you of something you missed.

Tap to Play

(1:57) – The device only plays a sound when you tap on the word. If a sound hasn’t been tagged, an error sound will play. Potential scenario of use: just as reading devices currently allow you to look up words and translate text easily, this feature could let you learn about musical references more easily.
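The Tap to Play behavior above could be sketched as a simple lookup. This is a minimal illustration, assuming a hypothetical tag table mapping words to sound files; the file names and the `error_beep.wav` fallback are placeholders, not part of any real system.

```python
# Minimal sketch of the Tap to Play lookup. All file names are
# hypothetical placeholders.

ERROR_SOUND = "error_beep.wav"  # played when a tapped word has no tag

def sound_for_tap(word, tags):
    """Return the sound file to play when the reader taps `word`.

    Tagged words map to their linked sound; untagged words fall back
    to the error sound, as described above.
    """
    return tags.get(word.lower(), ERROR_SOUND)

tags = {"door": "door_slam.wav", "wings": "wings.wav"}
print(sound_for_tap("Door", tags))   # a tagged word plays its sound
print(sound_for_tap("table", tags))  # an untagged word plays the error sound
```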


Ambient Mode

(4:30) – The device plays an ambient soundtrack based on the moods that have been tagged in the chapter. The soundtrack stops automatically if you look away from the page and starts again when you look back. The system turns off automatically if you unplug the headphones. Potential scenario of use: the ambient soundtrack could be used to block out noise if you wanted to read in a public space, such as a cafe, on the train during a commute, at a park, etc.


Eye-Tracking

(1:02) – This would allow the device to know where you are on the page and play a sound when your eyes pass over it. I based the one in the video on optical, video-based tracking, where infrared light is reflected off of the eye and read by a sensor, such as a video camera. The eye-tracking equipment I have seen for user research is pretty sizable, but in the future, I imagine hardware (along with improved software) could be developed to fit in a hand-held device. Other questions I did not address: Will the device collect eye-tracking data? If so, what kind? What could we do with that data?

Calibrate Reading Speed

If the eye-tracker were constantly running, it would probably require a lot of energy. If the device accurately calibrated your reading speed, it could play the Ambient mode soundtrack by estimating where you are instead of tracking where you are. I am less sure this could work with Auto Play.
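The estimation idea is simple arithmetic: a calibrated words-per-minute rate times elapsed time gives an approximate position on the page. A rough sketch, with illustrative numbers only:

```python
# Sketch of position estimation from a calibrated reading speed,
# as an alternative to continuous eye-tracking. Purely illustrative.

def estimated_word_index(wpm, seconds_elapsed):
    """Estimate which word the reader has reached since the page opened."""
    return int(wpm / 60.0 * seconds_elapsed)

# At the average 275 wpm, after 30 seconds the reader is near word 137:
print(estimated_word_index(275, 30))  # 137
```

The estimate drifts if the reader pauses or skims, which is why this seems plausible for the loose timing of Ambient mode but not for the word-precise timing Auto Play would need.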

The reading speed shown in the video is 275 wpm (words per minute), which is about the average reading speed for adults. Some sources say the average is 250-300 wpm; this reading speed test by Staples (try it out! it’s kind of fun) says the average is 300 wpm. For reference, the average for 8th graders is 250 wpm; college students, 450 wpm; college professors, 675 wpm; speed readers, 1,500 wpm. The reading rate also changes depending on whether you are reading to learn, reading for comprehension, or skimming.

Tagging + Rate Sound Accuracy

(3:15) – These features were based on the following questions: Who will create these soundtracks? And how will they be created?

  • Users could tag sound words or moods while they are reading.
  • Online or within an app, there could be a database of sounds. Users could also contribute to the database by uploading sounds and tagging them with appropriate sound words or moods.
  • Users could link tagged words to sounds from the database, or the system could link them automatically – for Auto Play and Tap to Play sounds.
  • An Ambient soundtrack for a chapter could be generated based on the moods and sounds tagged in that chapter.
  • (1:31) Rating sound accuracy would allow us to check linked sounds and change inaccurate ones.
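The records such a system would store can be sketched as a small data model. This is a hypothetical illustration, not a design the project specifies: the `SoundTag` class, its fields, and the sample values are all assumptions.

```python
# Hypothetical sketch of a tag record linking a word (or mood) in a
# chapter to a sound in the shared database, with reader accuracy
# ratings attached, as described in the bullets above.

from dataclasses import dataclass, field

@dataclass
class SoundTag:
    chapter: int
    word: str      # the tagged sound word or mood, e.g. "rain"
    sound_id: str  # key into the shared sound database
    ratings: list = field(default_factory=list)  # reader ratings, 1-5

    def accuracy(self):
        """Average rating; low-scoring tags could be re-linked."""
        return sum(self.ratings) / len(self.ratings) if self.ratings else None

tag = SoundTag(chapter=1, word="rain", sound_id="rain_on_roof_03")
tag.ratings += [5, 4, 4]
print(tag.accuracy())  # average rating for this tag
```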

Eventually the system could cater to you based on your past tagging – maybe you generally associate certain words with certain moods and sounds. Each person’s associations are deeply influenced by individual experiences as well as the environment in which they live. If the system collected data on how people tag, would we see trends based on location, age, language, or gender? An easy example is that different cultures have different onomatopoeia for the same sound. It would be interesting to find out trends or associations that are not immediately apparent.

Alternatively, the system could present soundtracks that are as true as possible to the culture, time period, and perception of the author.

All content must be rated G

(2:59) – Movies have MPAA ratings (rated G, PG, PG-13, R, NC-17), songs have radio edits and explicit versions. Though book-banning exists in extreme situations, books do not have ratings based on whether content is ‘age-appropriate’. I wondered if this would change if soundtracks were introduced. In the video, I present a possible scenario in which a BSAA (Book Soundtrack Association of America) would regulate soundtracks, demanding that all auditory content be suitable for all ages, so books themselves aren’t withheld, regulated, or banned.

1st World Culture/Society in the Near Future

What will Amazon be trying to sell us in the future?

Currently, Kindles display “Special Offers” (basically ads) as screensavers. The “Special Offers” I have seen include new book releases, deals on books, Kindle accessories, and magazine/newspaper trials.

In the video (0:15), the ad features a Qi Adaptor. Qi is the current inductive power standard, developed by the Wireless Power Consortium, which currently consists of 204 members (including Asus, HTC, LG Electronics, Microsoft, Motorola Mobility, Nokia, Qualcomm, Samsung, and Sony). The logo in the ad is the actual logo of Qi. In the future, instead of having chargers for all our devices, we could have a single Qi system, which allows wireless charging through resonant inductive coupling. And for older devices that don’t have Qi-compatible hardware built in, there could be adaptors.

What other augmented media will exist?

Most of our media is visual, auditory, or a combination of the two. I wondered about engaging other senses and creating media with haptic or olfactory components.

TeslaTouch Shopping (2:52) – TeslaTouch is a technology developed by Disney Research Pittsburgh that creates tactile sensations through touch screens using electrovibration. The company Senseg is another example. When you move your finger across the screen, it actually feels like fabric or sand or whatever it is programmed to be. Though tech appropriated by companies for commerce might not be the most meaningful application, it seemed the most realistic.

Scratch and Sniff Magazine (2:50) – Just for fun, a reference to scratch and sniff stickers, which smelled nothing like they were supposed to.

How will tech affect relationships?


(0:11) – There are two opposite ideas represented by this scene.

1. With smartphones, in general we interact less with our environment. Could there be ways of receiving information that still allow us to be fully present in the environment we’re in? Current digital picture frames aren’t the most useful, but if incorporated with social media, they could have interesting uses (for example, a picture-only newsfeed that shows posts from close friends and family).

2. Will the combination of tech, convenience, and an all-fun-no-responsibility mindset affect our relationship with pets? What if we opt for cute animal videos displayed in our home, instead of caring for actual pets? I began thinking about this concept after watching the movie Her, which seemed to me a commentary on what our relationships are beginning to look like now and a projection of what they might become. I wrote a reaction to the movie here.

Bitcoin and digital transactions?

(2:50) – I would argue that we are slowly moving toward a digital-transaction-only economy. Most transactions are made via card, especially now that third-party payment services like Venmo, Square, and PayPal make it easier to go without hard cash. And most of our wealth (or lack thereof) we just see as a number on a bank account statement anyway.

There is a difference between digital transactions and a digital currency, and I thought it would be interesting to think about a scenario in which a digital currency became dominant. Right now Bitcoin is more a commodity than a currency, and the value of one Bitcoin is around $500. If it were used as a currency and its value became $10,000, for example, price labels would become ridiculously long. Will we need a better way to represent the value? How do countries with large-denomination base currencies represent prices? Do problems arise – miscounting the zeros, for example?

edit: Maxie Sievert, someone I know from Germany, commented that a simple solution could be to use Metric System notation (for example, 1 mB = 1 milli-Bitcoin).
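The metric-notation suggestion above can be illustrated with a few lines of arithmetic. The function name and the exchange rate are hypothetical; the point is just that fractional Bitcoin prices become short, readable labels in milli-Bitcoin.

```python
# Sketch of the milli-Bitcoin (mB) price label idea suggested above.
# The exchange rate is a hypothetical $10,000 per Bitcoin, not a real rate.

def price_in_mb(usd_price, usd_per_btc):
    """Convert a USD price into a short milli-Bitcoin label (1 B = 1000 mB)."""
    mb = usd_price / usd_per_btc * 1000
    return f"{mb:.2f} mB"

# A $25 book at $10,000 per Bitcoin:
print(price_in_mb(25, 10_000))  # 2.50 mB
```

Instead of a label like “0.0025 B,” with its easy-to-miscount zeros, the shopper sees “2.50 mB.”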

What cultural things will remain constant?

(0:25) – Christmas will probably still exist.

(disclaimer: It’s not my Christmas tree. Please do not be offended if the holiday of your culture was not represented.)

What objects will remain constant?

With tech, all things eventually become obsolete, and when speculating about the future, it is easiest to speculate about future tech. Because of this, speculative work often portrays the future as something independent of the present. For example, sci-fi films often present the future as a sleek, sterile environment uncluttered by any objects of the past. In this article, Nick Foster talks about the “future mundane” and how the future is actually an “accretive space.” I set the video in “accretive spaces” to show that in the future, we will still have objects and mementos, which retain their value not because of usefulness but sentiment.


Extensions

For children, schoolwork, learning – book soundtracks could allow for closer reading of texts and increase engagement, especially for auditory learners.

Haptic experience in reading – what if we introduced electrovibration technology to reading devices, so that when you rub your finger across a texture word, it feels like the actual texture?

Ideas for implementation – A next step could be to actually prototype a web-based tagging system for creating book soundtracks. I also wonder if there could be a system that wouldn’t require eye-tracking to match the soundtracks to your place on the page.

Soundtrack for The Prince of Tides by Pat Conroy (Prologue)

I created an ambient soundtrack for the Prologue. It may be too fast or too slow depending on your reading speed, but should give a general idea of what the experience would be like.

Authors begin to write with soundtracks in mind – what would a resulting piece look like? Would the piece still be able to stand alone without the soundtrack? Would reliance on the soundtrack mean less compelling writing? Could reading/hearing the piece become such an integrated experience, you don’t even think about the soundtrack?

Google Reading Glass (or Contact Lenses) – You could read on a blank sketchbook/notebook, or even any surface. The Reading Glass or Contacts would project the text onto the surface. With a blank book, you could turn pages as if you were reading a real book, but you could also look up words and highlight text as you can with e-readers. It would save your page automatically. Soundtracks could also be incorporated with Google Reading Glass.



References

The Crying of Lot 49 by Thomas Pynchon (1:10 to 3:08)

I actually first thought of the concept for book soundtracks while reading the first chapter of this book. There were many references to songs and music that I didn’t know about. I looked them up and played them while continuing to read, and thought, “I wish when a book referenced a song, you could hear the music playing, or even ambient sounds described, like traffic or wind or nature.”

The Wise Man’s Fear by Patrick Rothfuss (3:14 to 4:04)

The Prince of Tides by Pat Conroy (4:30 to 5:05)


(1:12) Door slam, from MixOnline

(1:14) Wings

(1:22) Concerto for Orchestra, IV. Intermezzo Interrotto: Allegretto – Bartók

(1:33) Paper Rip, from SoundJay

(2:18) Interview on NBC’s “Huntley-Brinkley Report” 9 September 1963, JFK Library and Museum

(2:30) Slavic tones – recorded by Critter Taylor for this project. Pynchon references “Margo” on the next page.

(2:35) Ebonics Language Lesson

(2:39) Beep, from SoundJay

(2:44) Look Down – Lamont Cranston Band

(4:42) Low Tide – Echoes of Nature


Reflection

Design fiction has met a lot of criticism for its limited cultural scope (it is targeted at the privileged upper-middle class and doesn’t address issues of class, race, gender, etc.). It has also been criticized for fetishizing tech while ignoring technical feasibility or the “complex causes that drive real world technology development and uptake.”

To the first criticism: yes, unfortunately this project also falls into that category. I started this project before encountering those critiques, and since then I have been struggling to understand where I stand with design fiction and to form a solid opinion. Though I agree with these arguments, I do see value in certain aspects of design fiction – mainly its potential to address cultural issues that the design industry doesn’t have time for or won’t make money off of. For future projects, I plan on developing concepts based on ethnographic research, focusing not on tech but specifically on those issues that design fiction doesn’t currently address.

To the second criticism: the fictional elements in this project aren’t purely imagination, but extensions of existing technology (e.g. Kindle, Qi, TeslaTouch, Bitcoin) or cultural references not to be taken seriously (e.g. the scratch and sniff). Hopefully, by thinking through how the system might actually work, with users contributing to the creation of soundtracks, I contributed some thought to feasibility. And hopefully, by thinking about cultural implications, secondary applications, and side effects – not just what the interface would be – I was able to move the project beyond tech fetishizing.


Acknowledgments

Thank you to:

Patrick Harazin, for suggesting The Crying of Lot 49 to me and talking with me about ideas for a Google Reading Glass.

Julian Bleecker, for giving me feedback in the earliest stage of this project, about the cultural implications and the ambient mode.

Jason DePerro, for giving me feedback about how the system could actually work, the ambient mode, and different associations across cultures.

Rain Chan-Kalin, for acting.

Critter Taylor, for recording the “Slavic tones” sound clip.

Further Readings/References

My original storyboard for the video

Stories from the New Aesthetic – another great lecture by James Bridle

A Design Fiction Evening – a lecture event hosted by the Near Future Laboratory and IDEO, where Julian Bleecker talks about design fiction and Nick Foster explains the future mundane

Towards Fantastic Ethnography and Speculative Design – by Anne Galloway

Making Music with a Bike – just a cool sound-related project. It’s a track by Johnnyrandom, recorded entirely by sampling sounds from a bike