Understanding The8bitera
Fabio Zünd∗, Pascal Bérard†∗, Alexandre Chapiro†∗, Stefan Schmid†∗, Mattia Ryffel†, Markus Gross†∗, Amit H. Bermano†∗, Robert W. Sumner†∗
ABSTRACT
We propose a hardware and software system that transforms 8-bit
side-scrolling console video games into immersive multiplayer ex-
periences. We enhance a classic video game console with custom
hardware that time-multiplexes eight gamepad inputs to automati-
cally hand off control from one gamepad to the next. Because con-
trol transfers quickly, people at a large event can frequently step in
and out of a game and naturally call to their peers to join any time
a gamepad is vacant. Video from the game console is captured and
processed by a vision algorithm that stitches it into a continuous,
expanding panoramic texture, which is displayed in real time on
a 360 degree projection system at a large event space. With this
system, side-scrolling games unfold across the walls of the room to
encircle a large party, giving the feeling that the entire party is tak-
ing place inside of the game’s world. When such a display system
is not available, we also provide a virtual reality recreation of the
experience. We show results of our system for a number of classic
console games tested at a large live event. Results indicate that our
work provides a successful recipe to create immersive, multiplayer,
interactive experiences that leverage the nostalgic appeal of 8-bit
games.
CCS Concepts
• Computing methodologies → Image and video acquisition; Image processing; Virtual reality;

Keywords
games, panoramic stitching, virtual reality

∗ETH Zurich, Universitätsstrasse 8, 8092 Zürich, Switzerland
†Disney Research Zurich, Stampfenbachstrasse 48, 8006 Zürich, Switzerland

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
CVMP 2015, November 24-25, 2015, London, United Kingdom
© 2015 ACM. ISBN 978-1-4503-3560-7/15/11...$15.00
DOI: http://dx.doi.org/10.1145/2824840.2824848

Figure 1: Side-scrolling games unfold across the walls of the room to encircle the players, immersing them into the game’s world. The games are played using the console, enhanced to allow up to eight players to participate, in the center of the room.

1. INTRODUCTION
Video games are a cultural phenomenon. Through their unique combination of visual, narrative, auditory, and interactive elements, video games provide an engaging medium of expression within our society. The 8-bit era, dominated by the Nintendo Entertainment System (NES) [3], included a focus on side-scrolling graphics with pivotal leaps in game design, mechanics, and story that deeply influenced nearly every game that followed [23]. As a case in point, the original Super Mario Bros. series largely defined the platforming genre and pioneered a new level of game feel characterized by loose and fluid movement through an expansive world [20]. These games touched the lives of a huge number of gamers and their gameplay still holds up today.
Although 8-bit games have had a dramatic collective cultural impact, the actual experience of playing them is largely an individual one. With few exceptions, hardware and design limitations restrict
gameplay to one or two players in front of a low-resolution display. This setup confines gameplay to a small region, restricts social interaction, and limits the number of players that can enjoy a classic gaming experience. Even with modern game hardware, party games rarely extend beyond a small number of people playing in front of a display.
In this paper, we propose a custom hardware and software setup to transform classic side-scrolling games into collective experiences in which the games become immersive group activities. We take advantage of the nostalgic appeal of 8-bit games and their compelling yet accessible gameplay by using a classic NES console as our game hardware. Our system enhances the NES in two dramatic ways. First, we add a custom piece of hardware that takes as input eight NES gamepads and time multiplexes the output so that the real-time control is handed automatically from one gamepad to the next either every five seconds or based on progress through a game. Second, we capture the NES video output and direct it to a computer vision system that stitches video frames into a continuous, expanding texture similar to a panoramic photo. This texture is displayed live on a 360 degree projection system that allows the game to unfold on the walls of a large event space. When such a display system is not available, we also provide a Virtual Reality (VR) recreation of the experience. A conceptual illustration of our system is included in Figure 1, along with a photo of the system in action.
Taken together, our system transforms classic gaming into an immersive, cooperative multiplayer experience designed to enhance large parties and other social events. The eight-way multiplexing hardware encourages multiple people to play and adds a new level of social interaction on top of existing gameplay. Because control transfers quickly, people at a large event can frequently step in and out of the game. Whenever a gamepad is unused, others playing naturally call to their peers to join in before the control reaches the vacant gamepad. The panoramic stitching and 360 degree projection allow a side-scrolling game to encircle a large party, giving the feeling that the entire party is taking place inside of the game’s world. By using a real NES console, as opposed to an emulator, we maintain the tangible connection to the physical Nintendo hardware and the nostalgic appeal of loading real game cartridges into the system. Successful operation sometimes even requires blowing on the cartridge’s connectors to clear away dust, as many gamers fondly (or not so fondly) remember from their childhood.
Our core contributions include the conceptual design of our system that unfolds 8-bit side-scrolling games around a large party as well as the technical design of the time-multiplexing hardware and computer vision algorithms for panoramic stitching of video game input. We show results of our system on a number of NES games at a live event with over four-hundred participants as well as in a VR scene that recreates the feeling of the large event space.

2. RELATED WORK
Retro gaming is a general term referring to a modern community where old games, mostly those produced in the 1980s and early 1990s, are played or collected. Many people find the video games they played as children to have a nostalgic allure [19], resulting in a significant cultural impact of old school gaming. This connection has been acknowledged and leveraged by researchers for various purposes. Areas such as psychotherapy [8] and speech therapy [21] benefit from the engaging aspects of retro games to motivate patients. Generations impacted by classic games are subjected to targeted teaching methods that take their interests into account [9]. Others appreciate the elegant designs of early gaming systems and strive to preserve the characteristic visual look of pixel art when adapting content to fit modern architectures [13, 14]. In our work, we try to preserve the feeling of retro gaming as much as possible, while simultaneously adapting the content to be displayed in a modern immersive setting.
Immersive display has been an active topic for both research and industry in the past years. While pioneering efforts such as the CAVE automatic virtual environment [6] are complex to set up and may accommodate no more than a single user, modern systems often try to enhance the viewers’ experience by augmenting standard display technologies. Commercially available systems such as IMAX [10] provide viewers with a wider field of view than standard cinemas. Projection [11] and additional illumination [22] can be employed in tandem with standard displays in order to present content to the peripheral vision of observers. Finally, modern VR prototypes shift the viewing experience to a wearable VR headset. In this work, we test our system on two different immersive display setups: a custom commercial 360 degree projection system and the popular Oculus Rift DK2 [18].
Image tracking is used in our system to correctly place the current video frame in the global context of an expanding panoramic texture. Due to our target video setup, we consider only side-scrolling games. As a result, the tracked frame can only move horizontally. The speed of this movement, however, is determined by the player. It is not uniform and can include standing still or even backtracking. Camera movement in side-scrolling games was analyzed in depth by Keren [12].
To track the movement, corresponding points in the input frame and the output buffer must be found. Many algorithms to find scene correspondences exist, ranging from dense correspondence algorithms like optical flow [4] and stereo reconstruction [7] to sparse algorithms based on feature detection like SIFT [16]. Robust matching based on RANSAC [24] is used in applications such as panorama stitching. All of these methods deal with complex situations with many degrees of freedom. While these methods solve a wide array of problems, the tracking required for our application is limited to 1D shifts and requires real-time performance. For these reasons we choose a straightforward confidence weighted model, described in Section 4.2.

3. OVERVIEW
We present a system that bridges the gap between classic 8-bit side-scrolling console video games and state-of-the-art media display systems to deliver compelling, multiplayer, immersive game experiences. Figure 2 depicts an overview of our system’s architecture. While our system is not limited to a specific console, we optimized it for the NES. The video signal generated by the console is captured by a tracking PC, with careful attention to quality and latency. The tracking PC identifies background motion in order to stitch the video frames together into an expanding panoramic texture image. We deployed and tested our system at the conference banquet during the Eurographics 2015 conference, which was held in a large event space that contains an integrated state-of-the-art 360 degree projection system. Our system used this display system to seamlessly wrap the game texture around the event room as players played. We also recreated the feeling of this live event in VR using the Oculus Rift DK2.
Our system is tailored for this party scenario, where several players engage in the game together. This cooperation is enabled through multiplexing several NES gamepads, activating only one of them at each point in time. The gamepads are switched between players based either on a fixed time interval or on the current position of the game in the 360 degree projection. The dynamic and automatic transition between active gamepad control encourages interaction
[Figure 2 diagram. Areas: Projection Room, Server Room, Game Area. Components: Gamepads, Multiplexer, NES, Up-Scaler, Capture Card, Tracking PC, VR Display, Media Server, Projectors, Speakers. Link labels: NES Controller Cable, Audio, RGB, HDMI, USB 3, USB 2, 4x HDMI, 8x HDMI.]
Figure 2: System architecture overview. The NES output video signal is first captured and sent for analysis. The tracking PC
processes the video stream, tracks the background, and creates a wide, panoramic image. This image is either incorporated into a
VR environment or sent to a 360 degree projection system. The projection system receives the video streams, processes, and outputs
them to the projectors. For the projection configuration, the division between the projection hall and server room is indicated by
blue and green backgrounds, respectively.
between the players for successful gameplay and increases the social aspect of the system.
In our VR demo, a simple scene is created based on the original 360 degree projection event space. The scene contains four walls which are textured with the tracked image. The player is located in the center of the room and can follow the progress of the game by turning his or her head.

[…] system that wraps horizontally around the walls, we consider only horizontal side-scroller games. Thus, our tracking algorithm must consider only one degree of freedom of camera movement. We propose several strategies to perform this calculation and compose the input frames with the existing output buffer pixels. Figure 3 depicts an overview of our approach.
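The two gamepad hand-off policies named in the overview (a fixed time interval, or the game's current position in the 360 degree projection) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the equal-width mapping from buffer position to gamepad, and the least-significant-bit-first encoding of the three selection wires are hypothetical and are not taken from the system's actual software.

```python
from typing import Tuple


def active_gamepad_by_time(elapsed_s: float, n_pads: int = 8,
                           interval_s: float = 5.0) -> int:
    """Round-robin hand-off: the next gamepad takes over every interval_s seconds."""
    return int(elapsed_s // interval_s) % n_pads


def active_gamepad_by_position(x_px: float, buffer_width_px: int,
                               n_pads: int = 8) -> int:
    """Position-based hand-off (assumed policy): split the panoramic buffer
    into n_pads equal sectors and activate the gamepad of the current sector."""
    sector_width = buffer_width_px / n_pads
    index = int((x_px % buffer_width_px) // sector_width)
    return min(index, n_pads - 1)  # guard against floating-point edge cases


def select_lines(pad: int) -> Tuple[int, int, int]:
    """Levels for a three-wire selection bus, least significant bit first
    (the bit order is an assumption)."""
    return (pad & 1, (pad >> 1) & 1, (pad >> 2) & 1)
```

With the five-second interval mentioned in the introduction, `active_gamepad_by_time(12.0)` selects gamepad 2, and `select_lines(2)` gives the wire levels a microcontroller would have to drive under these assumptions.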
[Figure 3 labels: Refinement Neighborhood, Filled Background, Masked Region.]
Figure 3: Camera tracking. Top left: the current input frame with a manually labeled region excluded from the tracking. Top right:
Illustration of the matching error for camera movement estimation of the current input frame. Bottom: the output buffer with the
current input frame position (red). A strip of pixels corresponding to the camera movement is filled in (green).
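Because the camera has a single degree of freedom, the position search illustrated in Figure 3 reduces to scanning candidate horizontal offsets and keeping the one with the lowest matching error. The sketch below is one possible reading, not the authors' implementation: the mean absolute difference used as the error measure, the grayscale list-of-rows image layout, the use of `None` for unfilled buffer pixels, and the default search radius of 15 pixels are all assumptions.

```python
from typing import Dict, List, Optional, Tuple

Image = List[List[Optional[int]]]  # grayscale rows; None marks unfilled buffer pixels


def matching_errors(frame: Image, buffer: Image, prev_x: int, max_shift: int,
                    mask: Optional[List[List[bool]]] = None) -> Dict[int, float]:
    """Mean absolute difference for every candidate horizontal frame position."""
    height, width = len(frame), len(frame[0])
    errors: Dict[int, float] = {}
    for x0 in range(prev_x - max_shift, prev_x + max_shift + 1):
        if x0 < 0 or x0 + width > len(buffer[0]):
            continue  # candidate would fall outside the buffer
        total, count = 0.0, 0
        for y in range(height):
            for x in range(width):
                target = buffer[y][x0 + x]
                if target is None or (mask is not None and mask[y][x]):
                    continue  # skip unfilled buffer pixels and masked regions
                total += abs(frame[y][x] - target)
                count += 1
        if count:
            errors[x0] = total / count
    return errors


def track_offset(frame: Image, buffer: Image, prev_x: int, max_shift: int = 15,
                 mask: Optional[List[List[bool]]] = None
                 ) -> Tuple[int, Dict[int, float]]:
    """Return the best-matching position and the full error curve."""
    errors = matching_errors(frame, buffer, prev_x, max_shift, mask)
    if not errors:
        raise ValueError("no candidate position overlaps filled buffer pixels")
    best_x = min(errors, key=errors.get)
    return best_x, errors
```

The `mask` argument plays the role of the manually labeled region in Figure 3 (for example a score board). A real-time implementation would run this search on the GPU or with vectorized array operations; the nested loops are kept only for readability.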
4.2.2 Tracking Confidence
During gameplay, the scene may change completely, such as when switching between levels or showing a game over screen. Since the camera cannot be tracked in this situation, our system keeps track of the current camera position and overwrites the current frame within the buffer. To detect such a screen refresh, a confidence value for the tracked position is computed. A non-minimum suppression is applied to the matching errors by labeling all of the offset positions that have two adjacent neighbors with higher matching errors. The minimum error e1 is used together with the second lowest labeled matching error e2 to compute the confidence:

$$\text{confidence} = 1 - \frac{e_1}{e_2}. \qquad (2)$$

Finally, a threshold is applied to determine if the tracking succeeded. In our examples, we use a threshold value of 0.1.

4.2.3 Refinement
Since the camera can move by subpixel values due to the scaling and jitter in the input signal, we refine the current position at subpixel levels by computing the matching scores for N positions per pixel in a 2-pixel neighborhood. The position with the minimum error is the final frame position. We used N = 10.

4.2.4 Frame Composition
There are different strategies to composite the input frame into the output buffer. The pixels at the current position should always be taken from the input frame since this is where the actual game action takes place. When the camera moves, on the other hand, a strip of pixels can be filled in with the information from one or more previous frames. We propose two different strategies.

• Direct: In the simplest case we use only the previous frame to fill in the missing background pixels. Since the previous frame contains both foreground and background objects, both will be copied. Foreground objects will sometimes remain frozen in the buffer texture.

• Median: By taking into account multiple frames, we can estimate the background pixel colors more robustly and treat foreground objects as outliers. This estimation is done by storing the last F frames from which we compute the median color for each pixel. We found a value of F = 20 to be a good compromise. On one hand, more frames provide a better background color estimation. But, on the other hand, static foreground elements like score boards will produce smeared artifacts when additional frames are used.

Both modes have their advantages and disadvantages. Figure 4 depicts a side-by-side comparison. Since the direct mode is not able to suppress foreground elements, they will remain in the final output. Depending on their movement during processing, they can be squeezed, stretched, or torn apart. The median mode, on the other hand, can suppress foreground elements, but it bears two disadvantages. First, if the tracking is inaccurate the generated image is blurry. Second, foreground objects might suddenly appear or disappear as soon as they exit the active frame, depending on whether they are contained in the majority of the previous frames or not. The most appropriate method depends on the game and on which artifacts are preferred by the user. We found the sharper results of the direct mode to be most appealing and used this mode during deployment of our system.
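The confidence test of Section 4.2.2 can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' code: the function names are hypothetical, the minimum error e1 is assumed to be the lowest labeled local minimum, and the handling of degenerate error curves (no local minimum, a single local minimum, or e2 = 0) is an assumption, since the text does not specify those cases.

```python
from typing import Sequence


def tracking_confidence(errors: Sequence[float]) -> float:
    """Confidence = 1 - e1 / e2 over the labeled local minima of the error curve."""
    # Non-minimum suppression: keep offsets whose two neighbors have higher errors.
    labeled = sorted(
        errors[i]
        for i in range(1, len(errors) - 1)
        if errors[i - 1] > errors[i] < errors[i + 1]
    )
    if not labeled:
        return 0.0  # assumption: a flat or monotone curve carries no evidence
    if len(labeled) == 1:
        return 1.0  # assumption: a single minimum has no competitor
    e1, e2 = labeled[0], labeled[1]
    if e2 == 0:
        return 0.0  # assumption: two perfect matches are fully ambiguous
    return 1.0 - e1 / e2


def tracking_succeeded(confidence: float, threshold: float = 0.1) -> bool:
    """Apply the threshold from the text (0.1); the strict '>' is an assumption."""
    return confidence > threshold
```

For an error curve such as `[9, 2, 8, 7, 4, 6]`, the labeled minima are 2 and 4, so the confidence is 0.5 and tracking is accepted. When two minima are almost equal, as with repetitive backgrounds, the ratio approaches 1 and the confidence falls below the threshold, which is the cue to overwrite the frame in place.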
[Figure 4: side-by-side comparison of selected input frames with the Direct and Median composition results.]
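The median composition mode of Section 4.2.4 can be illustrated as follows. The sketch is an assumption-laden simplification: the class name and interface are hypothetical, pixels are single grayscale values (a color version would take the median per channel), frame positions are whole pixels, and `median_low` is used so that the result is always a value that actually occurred in one of the stored frames.

```python
from collections import deque
from statistics import median_low
from typing import Deque, List, Optional, Tuple

Frame = List[List[int]]  # grayscale rows


class MedianBackground:
    """Keeps the last F tracked frames and estimates background columns from them."""

    def __init__(self, max_frames: int = 20) -> None:
        # F = 20 is the value reported in the text; deque drops the oldest frame.
        self.history: Deque[Tuple[int, Frame]] = deque(maxlen=max_frames)

    def push(self, x_offset: int, frame: Frame) -> None:
        """Store a frame together with its tracked position in the buffer."""
        self.history.append((x_offset, frame))

    def column(self, buffer_x: int) -> Optional[List[int]]:
        """Median value per row for one buffer column, or None if uncovered."""
        samples = [
            [row[buffer_x - x0] for row in frame]
            for x0, frame in self.history
            if 0 <= buffer_x - x0 < len(frame[0])
        ]
        if not samples:
            return None
        # zip(*samples) groups the values seen at each row across all frames.
        return [median_low(values) for values in zip(*samples)]
```

A sprite that covers a pixel in only a minority of the stored frames is treated as an outlier and does not reach the buffer, whereas the direct mode would copy it from the previous frame. A static overlay such as a score board appears in every frame at a different buffer position, which is consistent with the smearing described above.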
4.4.3 Gamepad Multiplexing
The NES console supports up to two gamepads simultaneously. Since our design requires eight gamepads with control hand-over, we built a custom piece of electronics to selectively multiplex the eight gamepads. The system must be able to give control to any of the gamepads during a running game experience. Figure 8 depicts the schematic of the Arduino-based [1] multiplexing device and Figure 9 shows the actual hardware implementation. The bus width is given as the number in square brackets. Voltage source and ground (omitted in the scheme) for all gamepad connectors, including the one connecting to the console, are connected in parallel. Gamepad ground and Arduino ground are hooked up together to have the same reference potential for the multiplexer/demultiplexer components. All latch lines and clock lines coming from the eight gamepad connectors are connected to an 8-bit demultiplexer module. The eight data lines are connected to an 8-bit multiplexer unit. The single demultiplexer and multiplexer output line for latch, clock, and data is connected to the console gamepad plug. The arrow heads indicate the input/output behavior for all wires and buses. A three-wire selection bus is connected in parallel to all multiplexer/demultiplexer components, making it possible to select one of the eight gamepads and connect its wires through to the console. The selection wires are operated by an Arduino. The system software determines the currently active gamepad and sends the […]

Figure 9: Custom-designed multiplexing hardware for eight NES gamepads. (Photo labels: Arduino, Connectors.)

5. RESULTS
The tracking PC, performing the tracking and producing the output, is an i7 3.2 GHz machine with a GeForce GTX 770 graphics card. It generates twenty 12 M-pixel frames per second, with approximately 120 ms latency, most of which originates from the capture hardware. For the 360 degree projection configuration, the projection system introduces additional delay, but does not affect the frame rate. Tracking takes 3 ms to 4 ms on average and thus does not significantly contribute to the overall latency.
We tested our system on seven classic games, both in the VR configuration, as shown in Figure 5, and the 360 degree projection configuration, summarized in Figure 10. Figures 11 and 12 depict the unfolded output buffer of our tracking process using the direct method. For context, selected input frames from the console are shown above the buffer, in their respective positions. The game world is clearly and continuously captured in the unfolded buffer.
Figure 11 depicts the well known Super Mario Bros. trilogy. These games include many foreground characters that pass by while
Figure 10: The system as it was deployed during the conference banquet of Eurographics 2015. Different games provide varying
ambiance, and immerse players and spectators in the 8-bit realm.
side-scrolling, and are therefore sometimes visible in the tracked buffer. Figure 12 depicts the other games that were tested. Excitebike is a fast-paced game in which the whole buffer is traversed very quickly, implying quick camera motion per frame of up to 15 pixels. By contrast, Castlevania is slower in this aspect. Note how the buffer is overwritten when the scene changes from a forest setting to the castle interior. In Life Force, the input frame includes a black bar on the right hand side, and therefore the current position within the buffer is clearly noticeable, as well as a level change. Probotector is a game in which the map evolves very clearly and cleanly.
Unfortunately, ground truth data is not available. However, one can still visually validate the results by comparing a segment of the output buffer with a single input frame as shown in Figure 11 and Figure 12. Mismatches produced by foreground elements, drift, or artifacts are directly visible.

6. CONCLUSION
We have presented a hardware and software system to transform classic side-scrolling games into immersive, multiplayer experiences. By using a real NES console and games, we take advantage of the nostalgic appeal of the 8-bit game era. We tested our system live at a large event with over four hundred people and observed strong engagement. Throughout the evening, people continually played the system, actively cooperating with one another to advance in the game levels. We observed a new range of social dynamics, with strangers talking and laughing with one another, warning each other to be ready when control was passed on to the next player, and calling other participants to join when a gamepad was vacant. The 360 degree projection contributed a special ambiance to the evening, and provided entertainment for those not playing. At a few instances during the evening, a collective cheer from the audience erupted when the players beat a difficult level.
Limitations in our system direct us to opportunities for future research. Games with highly repetitive features can lead to tracking failures, as can be seen in the Excitebike result in Figure 12. Our system does not explicitly distinguish foreground sprites from background elements, leading to frozen sprite characters in the stitched textures. Tracking improvements as future work could address these issues and also offer new ways to enhance the game experience. Explicitly distinguishing between background and foreground elements would allow us to add depth perception to the VR version of our system so that the background is offset in space from the foreground.
Latency is a critical issue in any interactive system. Although our capture and processing algorithms are designed to execute as fast as possible, some latency is unavoidable. In the VR setting, our test players did not notice latency that would hinder them from successfully playing the game. However, some latency was noticeable during the Eurographics 2015 event, as the 360 degree projection system added latency on top of our capture and processing.
As our target was 360 degree projection, we focused on side-
scrolling games and developed a tracking algorithm designed for this use case. Our method does not currently support vertical scrolling or more complicated camera movement. Future work could consider alternate display geometries beyond circular projection as well as vertical scrolling. Our VR system provides an ideal tool to test and debug various setups in preparation for real-world deployment. Likewise, we demonstrated two control switching methodologies: temporal switching after a fixed number of seconds or switching based on the physical position of the game projection. Exploring other control switching modes is an area of future work.
Perhaps the most interesting opportunity for future work entails accommodating more advanced game consoles. Although our system was designed for the NES console, we are confident it would work on other classic 2D consoles such as the 16-bit Super Nintendo Entertainment System (SNES) [3]. However, future console generations focus on 3D graphics and violate the assumptions of our tracking algorithm. Dynamic scene reconstruction for 3D games combined with VR could provide a novel, compelling way to experience such games.

7. ACKNOWLEDGEMENTS
The copyright to all imagery from Super Mario Bros., Super Mario Bros. 2, Super Mario Bros. 3, and Excitebike lies with Nintendo Co., Ltd. The copyright to all imagery from Castlevania, Life Force, and Probotector lies with Konami Corporation. We would like to thank Alessia Marra and Maurizio Nitti for their artistic support and Jan Wezel for his engineering work.

8. REFERENCES
[1] Arduino Uno Board Overview. Website, February 2014. http://arduino.cc/en/Main/ArduinoBoardUno.
[2] Blackmagic design intensity shuttle. Website, July 2015. https://www.blackmagicdesign.com/products/intensity.
[3] Nintendo. Website, July 2015. https://www.nintendo.com.
[4] T. Brox, C. Bregler, and J. Malik. Large displacement optical flow. In Conference on Computer Vision and Pattern Recognition, pages 41–48. IEEE, 2009.
[5] Coolux. Pandoras box server. Website. http://www.coolux.de/products/pandorasboxserver/.
[6] C. Cruz-Neira, D. J. Sandin, and T. A. DeFanti. Surround-screen projection-based virtual reality: the design and implementation of the CAVE. In Proceedings of the 20th annual conference on Computer graphics and interactive techniques, pages 135–142. ACM, 1993.
[7] Y. Furukawa and J. Ponce. Accurate, dense, and robust multiview stereopsis. Transactions on Pattern Analysis and Machine Intelligence, 32(8):1362–1376, 2010.
[8] J. E. Gardner. Can the Mario Bros. help? Nintendo games as an adjunct in psychotherapy with children. Psychotherapy: Theory, Research, Practice, Training, 28(4):667, 1991.
[9] M. Guzdial and E. Soloway. Teaching the Nintendo generation to program. Communications of the ACM, 45(4):17–21, 2002.
[10] IMAX Corporation. IMAX: a motion picture film format and a set of cinema projection standards, 2010.
[11] B. R. Jones, H. Benko, E. Ofek, and A. D. Wilson. IllumiRoom: peripheral projected illusions for interactive experiences. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 869–878. ACM, 2013.
[12] I. Keren. The theory and practice of cameras in side-scrollers. Website, March 2015. http://www.gdcvault.com/play/1022243/Scroll-Back-The-Theory-and.
[13] J. Kopf and D. Lischinski. Depixelizing pixel art. In Transactions on graphics, volume 30, page 99. ACM, 2011.
[14] F. Kreuzer, J. Kopf, and M. Wimmer. Depixelizing pixel art in real-time. In Proceedings of the 19th Symposium on Interactive 3D Graphics and Games, pages 130–130. ACM, 2015.
[15] A. LaMothe. Game programming for the propeller powered hydra. Parallax, Inc, S.l, 2006.
[16] C. Liu, J. Yuen, and A. Torralba. Sift flow: Dense correspondence across scenes and its applications. Transactions on Pattern Analysis and Machine Intelligence, 33(5):978–994, 2011.
[17] Micomsoft. DP3913515 XRGB-mini framemeister compact up scaler unit. Website. http://www.micomsoft.co.jp/xrgb-mini.htm.
[18] Oculus VR. Oculus Rift development kit 2. Website, 2014. https://www.oculus.com/dk2/.
[19] J. Suominen. The past as the future? nostalgia and retrogaming in digital culture. Fibreculture, 11, 2008.
[20] S. Swink. Game Feel: A Game Designer’s Guide to Virtual Sensation. Morgan Kaufmann Game Design Books. Taylor & Francis, 2009.
[21] C. T. Tan, A. Johnston, A. Bluff, S. Ferguson, and K. J. Ballard. Retrogaming as visual feedback for speech therapy. In SIGGRAPH Asia 2014 Mobile Graphics and Interactive Applications, page 4. ACM, 2014.
[22] A. Weffers-Albu, S. de Waele, W. Hoogenstraaten, and C. Kwisthout. Immersive TV viewing with advanced Ambilight. In International Conference on Consumer Electronics, pages 753–754. IEEE, 2011.
[23] M. J. P. Wolf, editor. The Video Game Explosion: A History from PONG to PlayStation and Beyond. Greenwood Press, 2008.
[24] W. Zhang and J. Košecká. Generalized ransac framework for relaxed correspondence problems. In Third International Symposium on 3D Data Processing, Visualization, and Transmission, pages 854–860. IEEE, 2006.
Super Mario Bros.
Figure 11: The unfolded output buffer of our tracking process, for the Super Mario Bros. trilogy. For every game, the first row
depicts selected input frames coming from the console, in their respective positions within the output buffer. The second row depicts
the output tracking buffer. The combination of input frames continuously maps out the game realm in the buffer.
Excitebike
Castlevania
Life Force
Probotector
Figure 12: The unfolded output buffer of our tracking process, for four of the tested games. For each game, the first row depicts
selected input frames coming from the console, in their respective positions within the output buffer. Scene changes overwrite the
current position in the buffer.